U.S. patent application number 10/581354 was filed with the patent office on 2007-11-29 for use of proteins and peptides encoded by the genome of a novel sars-associated coronavirus strain.
Invention is credited to Saliha Azebi, Jean-Michel Betton, Ana Maria Burguiere, Pierre Charneua, Chantal Combredet, Bernadette Crescenzo-Chaigne, Jean-Francois Delagneau, Nicolas Escriou, Sylvie Gerbaud, Frederick Kunst, Jean-Claude Manuguerra, Monique Martin, Frederic Tangy, Sylvie Van Der Werf.
Application Number | 20070275002 10/581354 |
Document ID | / |
Family ID | 34680402 |
Filed Date | 2007-11-29 |
United States Patent
Application |
20070275002 |
Kind Code |
A1 |
Van Der Werf; Sylvie ; et
al. |
November 29, 2007 |
Use Of Proteins And Peptides Encoded By The Genome Of A Novel
Sars-Associated Coronavirus Strain
Abstract
The invention relates to the use of proteins and peptides coded
by the genome of the isolated or purified strain of severe acute
respiratory syndrome (SARS)-associated coronavirus, resulting from
sample reference number 031589 and, in particular, to the use of
protein S and the derivative antibodies thereof as diagnostic
reagents and as a vaccine.
Inventors: |
Van Der Werf; Sylvie;
(Gif-Sur-Yvette, FR) ; Escriou; Nicolas; (Paris,
FR) ; Crescenzo-Chaigne; Bernadette;
(Neuilly-Sur-Seine, FR) ; Manuguerra; Jean-Claude;
(Paris, FR) ; Kunst; Frederick; (Cedex, FR)
; Betton; Jean-Michel; (Paris, FR) ; Gerbaud;
Sylvie; (Saint-Maur-Des-Fosses, FR) ; Burguiere; Ana
Maria; (Clamart, FR) ; Azebi; Saliha;
(Vitry-Sur-Seine, FR) ; Charneua; Pierre; (Paris,
FR) ; Tangy; Frederic; (Les Lilas, FR) ;
Combredet; Chantal; (Villiers, FR) ; Delagneau;
Jean-Francois; (La Celle Saint Cloud, FR) ; Martin;
Monique; (Chatenay Malabry, FR) |
Correspondence
Address: |
FINNEGAN, HENDERSON, FARABOW, GARRETT & DUNNER;LLP
901 NEW YORK AVENUE, NW
WASHINGTON
DC
20001-4413
US
|
Family ID: |
34680402 |
Appl. No.: |
10/581354 |
Filed: |
December 2, 2004 |
PCT Filed: |
December 2, 2004 |
PCT NO: |
PCT/FR04/03105 |
371 Date: |
April 12, 2007 |
Current U.S.
Class: |
424/186.1 ;
424/93.2; 435/243; 435/320.1; 435/5; 514/44R; 530/350; 530/388.3;
530/391.1; 536/23.72 |
Current CPC
Class: |
G01N 2333/165 20130101;
A61K 39/12 20130101; C07K 14/005 20130101; A61P 11/06 20180101;
C12N 2770/20034 20130101; A61P 31/14 20180101; C12N 2770/20022
20130101; G01N 33/56983 20130101; C12N 7/00 20130101; A61P 31/12
20180101; A61K 2039/53 20130101; A61P 11/08 20180101; G01N 33/569
20130101; A61K 38/00 20130101; A61P 11/00 20180101; A61P 37/02
20180101; A61P 37/04 20180101; C12N 2710/24143 20130101 |
Class at
Publication: |
424/186.1 ;
424/093.2; 435/243; 435/320.1; 435/005; 514/044; 530/350;
530/388.3; 530/391.1; 536/023.72 |
International
Class: |
A61K 39/215 20060101
A61K039/215; A61K 31/7088 20060101 A61K031/7088; A61P 31/12
20060101 A61P031/12; C07K 14/505 20060101 C07K014/505; C07K 16/46
20060101 C07K016/46; C12Q 1/70 20060101 C12Q001/70; C12N 15/63
20060101 C12N015/63; C07K 16/08 20060101 C07K016/08; C07H 21/04
20060101 C07H021/04; A61K 35/76 20060101 A61K035/76 |
Foreign Application Data
Date |
Code |
Application Number |
Dec 2, 2003 |
FR |
0314152 |
Dec 2, 2003 |
FR |
0314151 |
Claims
1. An isolated and purified protein or polypeptide, characterized
in that it is the S protein having the sequence SEQ ID No: 3, its
ectodomain or a fragment of its ectodomain.
2. The protein or polypeptide as claimed in claim 1, characterized
in that it consists of the amino acids corresponding to positions 1
to 1193 of the amino acid sequence of the S protein.
3. The protein or polypeptide as claimed in claim 1, characterized
in that it consists of the amino acids corresponding to positions
14 to 1193 of the amino acid sequence of the S protein.
4. The isolated protein or polypeptide as claimed in claim 1,
characterized in that it consists of the amino acids corresponding
to positions 475 to 1193 of the amino acid sequence of the S
protein.
5. A nucleic acid encoding a protein or a polypeptide as claimed in
any one of claims 1 to 4.
6. The nucleic acid as claimed in claim 5, characterized in that it
comprises the sequence encoding SEQ ID No: 5 or the sequence
encoding SEQ ID No: 6.
7. A recombinant expression vector, characterized in that it
encodes a protein or a polypeptide as claimed in any one of claims
1 to 4.
8. The recombinant expression vector as claimed in claim 7,
characterized in that it is chosen from the vectors contained in
the following bacterial strains, deposited at the Collection
Nationale de Cultures de Microorganismes (CNCM), 25 rue du Docteur
Roux, 75724 Paris Cedex 15: a) strain No. I-3118, deposited on Oct.
23, 2003, b) strain No. I-3019, deposited on May 12, 2003, c)
strain No. I-3020, deposited on May 12, 2003, d) strain No. I-3059,
deposited on Jun. 20, 2003, e) strain No. I-3323, deposited on Nov.
22, 2004, f) strain No. I-3324, deposited on Nov. 22, 2004, g)
strain No. I-3326, deposited on Dec. 1, 2004, h) strain No. I-3327,
deposited on Dec. 1, 2004, i) strain No. I-3332, deposited on Dec.
1, 2004, j) strain No. I-3333, deposited on Dec. 1, 2004, k) strain
No. I-3334, deposited on Dec. 1, 2004, l) strain No. I-3335,
deposited on Dec. 1, 2004, m) strain No. I-3336, deposited on Dec.
1, 2004, n) strain No. I-3337, deposited on Dec. 1, 2004, o) strain
No. I-3338, deposited on Dec. 2, 2004, p) strain No. I-3339,
deposited on Dec. 2, 2004, q) strain No. I-3340, deposited on Dec.
2, 2004, and r) strain No. I-3341, deposited on Dec. 2, 2004.
9. A nucleic acid containing a synthetic gene allowing optimized
expression of the S protein in eukaryotic cells, characterized in
that it possesses the sequence SEQ ID No: 140.
10. An expression vector containing a nucleic acid as claimed in
claim 9, characterized in that it is contained in the bacterial
strain deposited at the CNCM, on Dec. 1, 2004, under the No.
I-3333.
11. The expression vector as claimed in claim 7 or claim 9,
characterized in that it is a viral vector, in the form of a viral
particle or in the form of a recombinant genome.
12. The vector as claimed in claim 11, characterized in that it is
a recombinant viral particle or a recombinant viral genome capable
of being obtained by transfecting a plasmid according to paragraphs
g), h) or k) to r) of claim 8, into an appropriate cellular
system.
13. A lentiviral vector encoding a polypeptide as claimed in any
one of claims 1 to 4.
14. A recombinant measles virus encoding a polypeptide as claimed
in any one of claims 1 to 4.
15. A recombinant vaccinia virus encoding a polypeptide as claimed
in any one of claims 1 to 4.
16. The use of a vector according to paragraphs d) to p) of claim
8, or of a vector as claimed in claim 10, for the production, in a
eukaryotic system, of the SARS-associated coronavirus S protein or
of a fragment of this protein.
17. A method for producing the S protein in a eukaryotic system,
comprising a step of transfecting eukaryotic cells in culture with
a vector chosen from the vectors contained in the bacterial strains
mentioned in paragraphs d) to p) of claim 8, or in claim 10.
18. A genetically modified eukaryotic cell expressing a protein or
a polypeptide as claimed in any one of claims 1 to 4.
19. The cell as claimed in claim 18, capable of being obtained by
transfection with any one of the vectors mentioned in paragraphs k)
to n) of claim 8.
20. The cell as claimed in claim 19, characterized in that it is
the cell FRhK4-Ssol-30, deposited at the CNCM on Nov. 22, 2004,
under the No. I-3325.
21. A monoclonal antibody recognizing the native S protein of a
SARS-associated coronavirus.
22. The use of a protein or a polypeptide as claimed in any one of
claims 1 to 4, or of an antibody as claimed in claim 21, for
detecting a SARS-associated coronavirus infection, from a
biological sample.
23. A method for detecting a SARS-associated coronavirus, from a
biological sample, characterized in that the detection is carried
out by ELISA using the recombinant S protein or its ectodomain, or
a fragment of its ectodomain, expressed in a eukaryotic system.
24. The method of detection as claimed in claim 23, additionally
comprising a step of detection by ELISA using the recombinant N
protein.
25. The method as claimed in claim 23 or 24, characterized in that
it is a double epitope ELISA method, and in that the serum to be
tested is mixed with the visualizing antigen, said mixture then
being brought into contact with the antigen attached to a solid
support.
26. An immune complex formed of a monoclonal antibody or antibody
fragment as claimed in claim 21, and of a SARS-associated
coronavirus protein or peptide.
27. An immune complex formed of a protein or a polypeptide as
claimed in any one of claims 1 to 4, and of an antibody directed
specifically against an epitope of the SARS-associated
coronavirus.
28. A SARS-associated coronavirus detection kit or box,
characterized in that it comprises at least one reagent selected
from the group consisting of: a protein or polypeptide as claimed
in any one of claims 1 to 4, a nucleic acid as claimed in either of
claims 5 and 6, a cell as claimed in any one of claims 18 to 20, or
an antibody as claimed in claim 21.
29. An immunogenic and/or vaccine composition, characterized in
that it comprises a recombinant protein or polypeptide as claimed
in any one of claims 1 to 4, obtained in a eukaryotic expression
system.
30. An immunogenic and/or vaccine composition, characterized in
that it comprises a recombinant vector or virus as claimed in any
one of claims 7, 8, and 10 to 15.
31. A nucleic acid insert of viral origin, characterized in that it
is contained in any one of the strains mentioned in paragraphs a)
to h) and k) to r) of claim 8.
Description
[0001] The present invention relates to a novel strain of severe
acute respiratory syndrome (SARS)-associated coronavirus derived
from a sample recorded under No. 031589 and collected in Hanoi
(Vietnam), to nucleic acid molecules derived from its genome, to
the proteins and peptides encoded by said nucleic acid molecules
and to their applications, in particular as diagnostic reagents
and/or as vaccine.
[0002] Coronavirus is a virus containing single-stranded RNA, of
positive polarity, of approximately 30 kilobases which replicates
in the cytoplasm of the host cells; the 5' end of the genome has a
capped structure and the 3' end contains a polyA tail. This virus
is enveloped and comprises, at its surface, peplomeric structures
called spicules.
[0003] The genome comprises the following open reading frames or
ORFs, from its 5' end to its 3' end: ORF1a and ORF1b corresponding
to the proteins of the transcription-replication complex, and
ORF-S, ORF-E, ORF-M and ORF-N corresponding to the structural
proteins S, E, M and N. It also comprises ORFs corresponding to
proteins of unknown function encoded by: the region situated
between ORF-S and ORF-E and overlapping the latter, the region
situated between ORF-M and ORF-N, and the region included in
ORF-N.
[0004] The S protein is a membrane glycoprotein (200-220 kDa) which
exists in the form of spicules or spikes emerging from the surface
of the viral envelope. It is responsible for the attachment of the
virus to the receptors of the host cell and for inducing the fusion
of the viral envelope with the cell membrane.
[0005] The small envelope protein (E), also called sM (small
membrane), which is a nonglycosylated transmembrane protein of
about 10 kDa, is the protein present in the smallest quantity in
the virion. It plays a major role in the coronavirus budding
process which occurs at the level of the intermediate compartment
in the endoplasmic reticulum and the Golgi apparatus.
[0006] The M protein or matrix protein (25-30 kDa) is a more
abundant membrane glycoprotein which is integrated into the viral
particle by an M/E interaction, whereas the incorporation of S into
the particles is directed by an S/M interaction. It appears to be
important for the viral maturation of coronaviruses and for the
determination of the site where the viral particles are
assembled.
[0007] The N protein or nucleocapsid protein (45-50 kDa) which is
the most conserved among the coronavirus structural proteins is
necessary for encapsidating the genomic RNA and then for directing
its incorporation into the virion. This protein is probably also
involved in the replication of the RNA.
[0008] When the host cell is infected, the reading frame (ORF)
situated in 5' of the viral genome is translated into a polyprotein
which is cleaved by the viral proteases and then releases several
nonstructural proteins such as the RNA-dependent RNA polymerase
(Rep) and the ATPase helicase (Hel). These two proteins are
involved in the replication of the viral genome and in the
generation of transcripts which are used in the synthesis of the
viral proteins. The mechanisms by which these subgenomic mRNAs are
produced are not completely understood; however, recent evidence
indicates that the sequences for regulation of transcription at the
5' end of each gene represent signals which regulate the
discontinuous transcription of the subgenomic mRNAs.
[0009] The proteins of the viral membrane (S, E and M proteins) are
inserted into the intermediate compartment, whereas the replicated
RNA (+ strand) is assembled with the N (nucleocapsid) protein. This
protein-RNA complex then combines with the M protein contained in
the membranes of the endoplasmic reticulum and the viral particles
form when the nucleocapsid complex buds into the endoplasmic
reticulum. The virus then migrates across the Golgi complex and
eventually leaves the cell, for example by exocytosis. The site of
attachment of the virus to the host cell is at the level of the S
protein.
[0010] Coronaviruses are responsible for 15 to 30% of colds in
humans and for respiratory and digestive infections in animals,
especially cats (FIPV: Feline infectious peritonitis virus),
poultry (IBV: Avian infectious bronchitis virus), mice (MHV: Mouse
hepatitis virus), pigs (TGEV: Transmissible gastroenteritis
virus, PEDV: Porcine Epidemic diarrhea virus, PRCoV: Porcine
Respiratory Coronavirus, HEV: Hemagglutinating encephalomyelitis
Virus) and bovines (BCoV: Bovine coronavirus).
[0011] In general, each coronavirus affects only one species; in
immunocompetent individuals, the infection induces optionally
neutralizing antibodies and cell immunity, capable of destroying
the infected cells.
[0012] An epidemic of atypical pneumonia, called severe acute
respiratory syndrome (SARS), has spread in various countries
(Vietnam, Hong Kong, Singapore, Thailand and Canada) during the
first quarter of 2003, from an initial focus which appeared in
China in the last quarter of 2002. The severity of this disease is
such that its mortality rate is about 3 to 6%. Numerous
laboratories worldwide are working to determine the causative agent
of this disease.
[0013] In March 2003, a new coronavirus (SARS-CoV or SARS virus)
was isolated, in association with cases of severe acute respiratory
syndrome (T. G. KSIAZEK et al., The New England Journal of
Medicine, 2003, 348, 1319-1330; C. DROSTEN et al., The New England
Journal of Medicine, 2003, 348, 1967-1976; Peiris et al., Lancet,
2003, 361, 1319).
[0014] Genomic sequences of this new coronavirus have thus been
obtained, in particular those of the Toronto isolate (Tor2, Genbank
accession No. AY274119.3 and A. MARRA et al., Science, May 1, 2003,
300, 1399-1404) and the Urbani isolate (Genbank accession
No. AY278741 and A. ROTA et al., Science, 2003, 300,
1394-1399).
[0015] The organization of the genome is comparable with that of
other known coronaviruses, thus making it possible to confirm that
SARS-CoV belongs to the Coronaviridae family; open reading frames
ORF1a and 1b and open reading frames corresponding to the S, E, M
and N proteins, and to proteins encoded by: the region situated
between ORF-S and ORF-E (ORF3), the region situated between ORF-S
and ORF-E and overlapping ORF-E (ORF4), the region situated between
ORF-M and ORF-N (ORF7 to ORF11) and the region corresponding to
ORF-N (ORF13 and ORF14), have in particular been identified.
[0016] Seven differences have been identified between the sequences
of the Tor2 and Urbani isolates; 3 correspond to silent mutations
(c/t at position 16622 and a/g at position 19064 of ORF1b, t/c at
position 24872 of ORF-S) and 4 modify the amino acid sequence of
respectively: the proteins encoded by ORF1a (c/t at position 7919
corresponding to the A/V mutation), the S protein (g/t at position
23220 corresponding to the A/S mutation), the protein encoded by
ORF3 (a/g at position 25298 corresponding to the R/G mutation) and
the M protein (t/c at position 26857 corresponding to the S/P
mutation).
[0017] In addition, phylogenetic analysis shows that SARS-CoV is
distant from other coronaviruses and that it did not appear by
mutation of human respiratory coronaviruses nor by recombination
between known coronaviruses (for a review, see Holmes, J. C. I.,
2003, 111, 1605-1609).
[0018] The determination and the taking into account of new
variants are important for the development of reagents for the
detection and diagnosis of SARS which are sufficiently sensitive
and specific, and immunogenic compositions capable of protecting
populations against epidemics of SARS.
[0019] The inventors have now identified another strain of
SARS-associated coronavirus which is distinguishable from the Tor2
and Urbani isolates.
[0020] The subject of the present invention is therefore an
isolated or purified strain of severe acute respiratory
syndrome-associated human coronavirus, characterized in that its
genome has, in the form of complementary DNA, a serine codon at
position 23220-23222 of the gene for the S protein or a glycine
codon at position 25298-25300 of the gene for ORF3, and an alanine
codon at position 7918-7920 of ORF1a or a serine codon at position
26857-26859 of the gene for the M protein, said positions being
indicated in terms of reference to the Genbank sequence
AY274119.3.
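The definition in paragraph [0020] can be read as a simple logical test on four codons. The sketch below is only an illustration of that reading: it assumes the grouping "(serine in S or glycine in ORF3) and (alanine in ORF1a or serine in M)", and the function and parameter names are hypothetical. A value of None stands for a residue that this text does not state.

```python
def matches_strain_definition(s_codon, orf3_codon, orf1a_codon, m_codon):
    """Apply one reading of paragraph [0020] to four translated codons.

    s_codon     -- amino acid at positions 23220-23222 (S gene)
    orf3_codon  -- amino acid at positions 25298-25300 (ORF3)
    orf1a_codon -- amino acid at positions 7918-7920 (ORF1a)
    m_codon     -- amino acid at positions 26857-26859 (M gene)
    Positions refer to Genbank sequence AY274119.3, as in the text.
    """
    # Assumed grouping: first pair joined by "or", second pair joined by "or",
    # and the two pairs joined by "and".
    first_pair = s_codon == "Ser" or orf3_codon == "Gly"
    second_pair = orf1a_codon == "Ala" or m_codon == "Ser"
    return first_pair and second_pair


# Residues stated in paragraphs [0024] to [0030] for the strain of sample 031589.
strain_031589 = matches_strain_definition("Ser", "Gly", "Ala", "Ser")

# Residues stated for the Tor2 S and ORF3 codons; the other two are not stated.
tor2_stated = matches_strain_definition("Ala", "Arg", None, None)

print(strain_031589, tor2_stated)
```

Under this reading, the residues the text gives for Tor2 in the S and ORF3 codons are enough on their own to make the test fail.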
[0021] According to an advantageous embodiment of said strain, the
DNA equivalent of its genome has a sequence corresponding to the
sequence SEQ ID No: 1; this coronavirus strain is derived from the
sample collected from the bronchoalveolar washings from a patient
suffering from SARS, recorded under the No. 031589 and collected at
the Hanoi (Vietnam) French hospital.
[0022] In accordance with the invention, said sequence SEQ ID No: 1
is that of the deoxyribonucleic acid corresponding to the
ribonucleic acid molecule of the genome of the isolated coronavirus
strain as defined above.
[0023] The sequence SEQ ID No: 1 is distinguishable from the
Genbank sequence AY274119.3 (Tor2 isolate) in that it possesses the
following mutations:
[0024] g/t at position 23220; the alanine codon (gct) at position
577 of the amino acid sequence of the Tor2 S protein is replaced by
a serine codon (tct),
[0025] a/g at position 25298; the arginine codon (aga) at position
11 of the amino acid sequence of the protein encoded by the Tor2
ORF3 is replaced by a glycine codon (gga).
[0026] In addition, the sequence SEQ ID No: 1 is distinguishable
from the Genbank sequence AY278741 (Urbani isolate) in that it
possesses the following mutations:
[0027] t/c at position 7919; the valine codon (gtt) in position
2552 of the amino acid sequence of the protein encoded by ORF1a is
replaced by an alanine codon (gct),
[0028] t/c at position 16622: this mutation does not modify the
amino acid sequence of the proteins encoded by ORF1b (silent
mutation),
[0029] g/a at position 19064: this mutation does not modify the
amino acid sequence of the proteins encoded by ORF1b (silent
mutation),
[0030] c/t at position 24872: this mutation does not modify the
amino acid sequence of the S protein, and c/t at position 26857:
the proline codon (ccc) at position 154 of the amino acid sequence
of the M protein is replaced by a serine codon (tcc).
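The four coding changes listed above can be checked against the standard genetic code. The following is a minimal sketch, not part of the original disclosure: the codon table contains only the codons named in the text, and the names CODON_TO_AA, SUBSTITUTIONS and describe are hypothetical.

```python
# Standard genetic code, restricted to the codons quoted in the text.
CODON_TO_AA = {
    "gct": "Ala", "tct": "Ser",
    "aga": "Arg", "gga": "Gly",
    "gtt": "Val",
    "ccc": "Pro", "tcc": "Ser",
}

# (genome position, reference codon, codon in SEQ ID No: 1) as stated in the text.
SUBSTITUTIONS = [
    (23220, "gct", "tct"),  # S protein, relative to Tor2
    (25298, "aga", "gga"),  # ORF3 protein, relative to Tor2
    (7919, "gtt", "gct"),   # ORF1a protein, relative to Urbani
    (26857, "ccc", "tcc"),  # M protein, relative to Urbani
]


def describe(position, ref_codon, var_codon):
    """Return the single base change and the amino acid change for one codon."""
    diffs = [(r, v) for r, v in zip(ref_codon, var_codon) if r != v]
    if len(diffs) != 1:
        raise ValueError("expected exactly one base change per codon")
    ref_base, var_base = diffs[0]
    return {
        "position": position,
        "base_change": f"{ref_base}/{var_base}",
        "aa_change": f"{CODON_TO_AA[ref_codon]}/{CODON_TO_AA[var_codon]}",
    }


RESULTS = [describe(*entry) for entry in SUBSTITUTIONS]
for row in RESULTS:
    print(row["position"], row["base_change"], row["aa_change"])
```

Each printed base change and amino acid change can be compared directly with the corresponding item in the two lists.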
[0031] Unless otherwise stated, the positions of the nucleotide and
peptide sequences are indicated with reference to the Genbank
sequence AY274119.3.
[0032] The subject of the present invention is also an isolated or
purified polynucleotide, characterized in that its sequence is that
of the genome of the isolated coronavirus strain as defined
above.
[0033] According to an advantageous embodiment of said
polynucleotide, it has the sequence SEQ ID No: 1.
[0034] The subject of the present invention is also an isolated or
purified polynucleotide, characterized in that its sequence
hybridizes under high stringency conditions with the sequence of
the polynucleotide as defined above.
[0035] The terms "isolated or purified" mean modified "by the hand
of humans" from the natural state; in other words if an object
exists in nature, it is said to be isolated or purified if it is
modified or extracted from its natural environment or both. For
example, a polynucleotide or a protein/peptide naturally present in
a living organism is neither isolated nor purified; on the other
hand, the same polynucleotide or protein/peptide separated from
coexisting molecules in its natural environment, obtained by
cloning, amplification and/or chemical synthesis is isolated for
the purposes of the present invention. Furthermore, a
polynucleotide or a protein/peptide which is introduced into an
organism by transformation, genetic manipulation or by any other
method, is "isolated" even if it is present in said organism. The
term purified as used in the present invention means that the
proteins/peptides according to the invention are essentially free
of association with the other proteins or polypeptides, as is for
example the product purified from the culture of recombinant host
cells or the product purified from a nonrecombinant source.
[0036] For the purposes of the present invention, high stringency
hybridization conditions are understood to mean temperature and
ionic strength conditions chosen such that they make it possible to
maintain the specific and selective hybridization between
complementary polynucleotides.
[0037] By way of illustration, high stringency conditions for the
purposes of defining the above polynucleotides are advantageously
the following: the DNA-DNA or DNA-RNA hybridization is performed in
two steps: (1) prehybridization at 42 °C for 3 hours in
phosphate buffer (20 mM, pH 7.5) containing 5× SSC
(1× SSC corresponds to a 0.15 M NaCl+0.015 M sodium citrate
solution), 50% formamide, 7% sodium dodecyl sulfate (SDS),
10× Denhardt's, 5% dextran sulfate and 1% salmon sperm DNA;
(2) hybridization for 20 hours at 42 °C followed by 2
washings of 20 minutes at 20 °C in 2× SSC+2% SDS, 1
washing of 20 minutes at 20 °C in 0.1× SSC+0.1% SDS.
The final washing is performed in 0.1× SSC+0.1% SDS for 30
minutes at 60 °C.
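As a worked illustration of the SSC strengths named in this protocol, the sketch below scales the 1x SSC definition given in the text (0.15 M NaCl + 0.015 M sodium citrate) to the 5x, 2x and 0.1x solutions. The names SSC_1X and ssc are hypothetical, and the rounding is only there to keep the printed values tidy.

```python
# 1x SSC as defined in the text: 0.15 M NaCl + 0.015 M sodium citrate.
SSC_1X = {"NaCl_M": 0.15, "sodium_citrate_M": 0.015}


def ssc(strength):
    """Return the molar concentrations of an SSC solution of the given strength."""
    return {name: round(molar * strength, 6) for name, molar in SSC_1X.items()}


# Strengths used in the protocol above.
for label, strength in [("prehybridization", 5), ("first washings", 2), ("final washing", 0.1)]:
    print(label, f"{strength}x SSC", ssc(strength))
```

The final washing therefore combines the lowest salt concentration with the highest temperature of the protocol (60 °C).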
[0038] The subject of the present invention is also a
representative fragment of the polynucleotide as defined above,
characterized in that it is capable of being obtained either by the
use of restriction enzymes whose recognition and cleavage sites are
present in said polynucleotide as defined above, or by
amplification with the aid of oligonucleotide primers specific for
said polynucleotide as defined above, or by transcription in vitro,
or by chemical synthesis.
[0039] According to an advantageous embodiment of said fragment, it
is selected from the group consisting of: the cDNA corresponding to
at least one open reading frame (ORF) chosen from: ORF1a, ORF1b,
ORF-S, ORF-E, ORF-M, ORF-N, ORF3, ORF4, ORF7 to ORF11, ORF13 and
ORF14 and the cDNA corresponding to the noncoding 5' or 3' ends of
said polynucleotide.
[0040] According to an advantageous feature of this embodiment,
said fragment has a sequence selected from the group consisting of:
[0041] the sequences SEQ ID NO: 2 and 4 representing the cDNA
corresponding to the ORF-S which encodes the S protein,
[0042] the sequences SEQ ID NO: 13 and 15 representing the cDNA
corresponding to the ORF-E which encodes the E protein,
[0043] the sequences SEQ ID NO: 16 and 18 representing the cDNA
corresponding to the ORF-M which encodes the M protein,
[0044] the sequences SEQ ID NO: 36 and 38 representing the cDNA
corresponding to the ORF-N which encodes the N protein,
[0045] the sequences representing the cDNA corresponding
respectively: to ORF1a and ORF1b (ORF1ab, SEQ ID NO: 31), to ORF3
and ORF4 (SEQ ID NO: 7, 8), to ORF7 to 11 (SEQ ID NO: 19, 20), to
ORF13 (SEQ ID NO: 32) and to ORF14 (SEQ ID NO: 34), and
[0046] the sequences representing the cDNAs corresponding
respectively to the noncoding 5' (SEQ ID NO: 39 and 72) and 3' (SEQ
ID NO: 40, 73) ends of said polynucleotide.
[0047] The subject of the present invention is also a cDNA fragment
encoding the S protein, as defined above, characterized in that it
has a sequence selected from the group consisting of the sequences
SEQ ID NO: 5 and 6 (Sa and Sb fragments).
[0048] The subject of the present invention is also a cDNA fragment
corresponding to ORF1a and ORF1b as defined above, characterized in
that it has a sequence selected from the group consisting of the
sequences SEQ ID NO: 41 to 54 (L0 to L12 fragments).
[0049] The subject of the present invention is also a
polynucleotide fragment as defined above, characterized in that it
has at least 15 consecutive bases or base pairs of the sequence of
the genome of said strain including at least one of those situated
in position 7919, 16622, 19064, 23220, 24872, 25298 and 26857.
Preferably this is a fragment of 20 to 2500 bases or base pairs,
preferably from 20 to 400.
[0050] According to an advantageous embodiment of said fragment, it
includes at least one pair of bases or base pairs corresponding to
the following positions: 7919 and 23220, 7919 and 25298, 16622 and
23220, 19064 and 23220, 16622 and 25298, 19064 and 25298, 23220 and
24872, 23220 and 26857, 24872 and 25298, 25298 and 26857.
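The fragment criterion of paragraphs [0049] and [0050] (at least 15 consecutive bases that include at least one of the variant positions) can be expressed as a short check on genome coordinates. This is a sketch under stated assumptions: coordinates are 1-based and inclusive, the first variant position is taken as 7919 as in paragraphs [0016], [0027] and [0050], and the function names are hypothetical.

```python
# Variant positions, with reference to Genbank sequence AY274119.3.
VARIANT_POSITIONS = (7919, 16622, 19064, 23220, 24872, 25298, 26857)


def covered_positions(start, end):
    """List the variant positions lying inside the fragment [start, end]."""
    return [p for p in VARIANT_POSITIONS if start <= p <= end]


def qualifies(start, end, min_length=15):
    """True if the fragment is long enough and covers at least one variant position."""
    length = end - start + 1
    return length >= min_length and bool(covered_positions(start, end))


print(qualifies(23210, 23229))          # 20-base fragment around position 23220
print(covered_positions(23220, 24872))  # fragment covering the pair 23220 and 24872
print(qualifies(100, 120))              # long enough, but covers no variant position
```

A fragment that covers one of the position pairs listed in [0050] is necessarily longer than the 15-base minimum; for example, covering both 23220 and 24872 requires at least 1653 bases.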
[0051] The subject of the present invention is also primers of at
least 18 bases capable of amplifying a fragment of the genome of a
SARS-associated coronavirus or of the DNA equivalent thereof.
[0052] According to an embodiment of said primers, they are
selected from the group consisting of:
[0053] the pair of primers No. 1 corresponding respectively to
positions 28507 to 28522 (sense primer, SEQ ID NO: 60) and 28774 to
28759 (antisense primer, SEQ ID NO: 61) of the sequence of the
polynucleotide as defined above,
[0054] the pair of primers No. 2 corresponding respectively to
positions 28375 to 28390 (sense primer, SEQ ID NO: 62) and 28702 to
28687 (antisense primer, SEQ ID NO: 63) of the sequence of the
polynucleotide as defined above, and
[0055] the pair of primers consisting of the primers SEQ ID Nos: 55
and 56.
[0056] The subject of the present invention is also a probe capable
of detecting the presence of the genome of a SARS-associated
coronavirus or of a fragment thereof, characterized in that it is
selected from the group consisting of: the fragments as defined
above and the fragments corresponding to the following positions of
the polynucleotide sequence as defined above: 28561 to 28586, 28588
to 28608, 28541 to 28563 and 28565 to 28589 (SEQ ID NO: 64 to
67).
[0057] The probes and primers according to the invention may be
labeled directly or indirectly with a radioactive or nonradioactive
compound by methods well known to persons skilled in the art so as
to obtain a detectable and/or quantifiable signal. Among the
radioactive isotopes used, there may be mentioned ³²P,
³³P, ³⁵S, ³H or ¹²⁵I. The nonradioactive
entities are selected from ligands such as biotin, avidin,
streptavidin, digoxigenin, haptens, dyes, luminescent agents such
as radioluminescent, chemiluminescent, bioluminescent, fluorescent
and phosphorescent agents.
[0058] The invention encompasses the labeled probes and primers
derived from the preceding sequences.
[0059] Such probes and primers are useful for the diagnosis of
infection by a SARS-associated coronavirus.
[0060] The subject of the present invention is also a method for
the detection of a SARS-associated coronavirus, from a biological
sample, which method is characterized in that it comprises at
least:
[0061] (a) the extraction of nucleic acids present in said
biological sample,
[0062] (b) the amplification of a fragment of ORF-N by RT-PCR with
the aid of a pair of primers as defined above, and
[0063] (c) the detection, by any appropriate means, of the
amplification products obtained in (b).
[0064] The amplification products (amplicons) in (b) are 268 bp for
the pair of primers No. 1 and 328 bp for the pair of primers No.
2.
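The amplicon sizes quoted above follow from the primer coordinates given in paragraphs [0053] and [0054]. The sketch below recomputes them and checks that the probe positions of paragraph [0056] fall inside each amplicon. It assumes 1-based inclusive coordinates; the names PRIMER_PAIRS, PROBES, amplicon_length and probe_inside are hypothetical.

```python
# Primer coordinates from the text; the antisense primer is given 5' to 3',
# so its first coordinate is the far end of the amplicon.
PRIMER_PAIRS = {
    1: {"sense": (28507, 28522), "antisense": (28774, 28759)},
    2: {"sense": (28375, 28390), "antisense": (28702, 28687)},
}

# Probe positions from the text (SEQ ID NO: 64 to 67).
PROBES = [(28561, 28586), (28588, 28608), (28541, 28563), (28565, 28589)]


def amplicon_span(pair):
    """Return (first base, last base) of the amplified region."""
    return pair["sense"][0], pair["antisense"][0]


def amplicon_length(pair):
    """Length in base pairs, counting both end positions."""
    start, end = amplicon_span(pair)
    return end - start + 1


def probe_inside(probe, pair):
    """True if the probe lies entirely within the amplicon of the primer pair."""
    start, end = amplicon_span(pair)
    return start <= probe[0] and probe[1] <= end


for number, pair in PRIMER_PAIRS.items():
    inside = all(probe_inside(probe, pair) for probe in PROBES)
    print(f"pair No. {number}: {amplicon_length(pair)} bp, all probes inside: {inside}")
```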
[0065] According to an advantageous embodiment of said method, the
step (c) of detection is carried out with the aid of at least one
probe corresponding to positions 28561 to 28586, 28588 to 28608,
28541 to 28563 and 28565 to 28589 of the sequence of the
polynucleotide as defined above.
[0066] Preferably, the SARS-associated coronavirus genome is
detected and optionally quantified by PCR in real time with the aid
of the pair of primers No. 2 and probes corresponding to positions
28541 to 28563 and 28565 to 28589 labeled with different compounds,
in particular different fluorescent agents.
[0067] The real time RT-PCR which uses this pair of primers and
this probe is very sensitive since it makes it possible to detect
10² copies of RNA and as few as 10 copies of RNA; it is in addition
reliable and reproducible.
[0068] The invention encompasses the single-stranded,
double-stranded and triple-stranded polydeoxyribonucleotides and
polyribonucleotides corresponding to the sequence of the genome of
the isolated strain of coronavirus and its fragments as defined
above, and to their sense or antisense complementary sequences, in
particular the RNAs and cDNAs corresponding to the sequence of the
genome and of its fragments as defined above.
[0069] The present invention also encompasses the amplification
fragments obtained with the aid of primers specific for the genome
of the purified or isolated strain as defined above, in particular
with the aid of primers or pairs of primers as defined above, the
restriction fragments formed by or comprising the sequence of
fragments as defined above, the fragments obtained by transcription
in vitro from a vector containing the sequence SEQ ID NO: 1 or a
fragment as defined above, and fragments obtained by chemical
synthesis. Examples of restriction fragments are deduced from the
restriction map of the sequence SEQ ID NO: 1 illustrated by FIG.
13. In accordance with the invention, said fragments are either in
the form of isolated fragments, or in the form of mixtures of
fragments. The invention also encompasses fragments modified, in
relation to the preceding ones, by removal or addition of
nucleotides in a proportion of about 15%, relative to the length of
the above fragments and/or modified in terms of the nature of the
nucleotides, as long as the modified nucleotide fragments retain a
capacity for hybridization with the genomic or antigenomic RNA
sequences of the isolate as defined above.
[0070] The nucleic acid molecules according to the invention are
obtained by conventional methods, known per se, following standard
protocols such as those described in Current Protocols in Molecular
Biology (Frederick M. AUSUBEL, 2000, Wiley and Sons Inc., Library of
Congress, USA). For example, they may be obtained by amplification
of a nucleic sequence by PCR or RT-PCR or alternatively by total or
partial chemical synthesis.
[0071] The subject of the present invention is also a DNA or RNA
chip or filter, characterized in that it comprises at least one
polynucleotide or one of its fragments as defined above.
[0072] The DNA or RNA chips or filters according to the invention
are prepared by conventional methods, known per se, such as for
example chemical or electrochemical grafting of oligonucleotides on
a glass or nylon support.
[0073] The subject of the present invention is also a recombinant
cloning and/or expression vector, in particular a plasmid, a virus,
a viral vector or a phage comprising a nucleic acid fragment as
defined above. Preferably, said recombinant vector is an expression
vector in which said nucleic acid fragment is placed under the
control of appropriate elements for regulating transcription and
translation. In addition, said vector may comprise sequences (tags)
fused in phase with the 5' and/or 3' end of said insert, which are
useful for the immobilization and/or detection and/or purification
of the protein expressed from said vector.
[0074] These vectors are constructed and introduced into host cells
by conventional recombinant DNA and genetic engineering methods
which are known per se. Numerous vectors into which a nucleic acid
molecule of interest may be inserted in order to introduce it and
to maintain it in a host cell are known per se; the choice of an
appropriate vector depends on the use envisaged for this vector
(for example replication of the sequence of interest, expression of
this sequence, maintenance of the sequence in extrachromosomal form
or alternatively integration into the chromosomal material of the
host), and on the nature of the host cell.
[0075] In accordance with the invention, said plasmid is selected
in particular from the following plasmids: [0076] the plasmid,
called SARS-S, contained in the bacterial strain deposited under
the No. I-3059, on Jun. 20, 2003, at the Collection Nationale de
Cultures de Microorganismes, 25 rue du Docteur Roux, 75724 Paris
Cedex 15; it contains the cDNA sequence encoding the S protein of
the SARS-CoV strain derived from the sample recorded under the No.
031589, said sequence corresponding to the nucleotides at positions
21406 to 25348 (SEQ ID NO: 4), with reference to the Genbank
sequence AY274119.3, [0077] the plasmid, called SARS-S1, contained
in the bacterial strain deposited under the No. I-3020, on May 12,
2003, at the Collection Nationale de Cultures de Microorganismes,
25 rue du Docteur Roux, 75724 Paris Cedex 15; it contains a 5'
fragment of the cDNA sequence encoding the S protein of the
SARS-CoV strain derived from the sample recorded under the No.
031589, as defined above, said fragment corresponding to the
nucleotides at positions 21406 to 23454 (SEQ ID NO: 5), with
reference to the Genbank sequence AY274119.3 Tor2, [0078] the
plasmid, called SARS-S2, contained in the bacterial strain
deposited under the No. I-3019, on May 12, 2003, at the Collection
Nationale de Cultures de Microorganismes, 25 rue du Docteur Roux,
75724 Paris Cedex 15; it contains a 3' fragment of the cDNA
sequence encoding the S protein of the SARS-CoV strain derived from
the sample recorded under the No. 031589, as defined above,
said fragment corresponding to the nucleotides at positions 23322
to 25348 (SEQ ID NO: 6), with reference to the Genbank sequence
accession No. AY274119.3, [0079] the plasmid, called SARS-SE,
contained in the bacterial strain deposited under the No. I-3126,
on Nov. 13, 2003, at the Collection Nationale de Cultures de
Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15; it
contains the cDNA corresponding to the region situated between
ORF-S and ORF-E and overlapping ORF-E of the SARS-CoV strain
derived from the sample recorded under the No. 031589, as defined
above, said region corresponding to the nucleotides at positions
25110 to 26244 (SEQ ID NO: 8), with reference to the Genbank
sequence accession No. AY274119.3, [0080] the plasmid, called
SARS-E, contained in the bacterial strain deposited under the No.
I-3046, on May 28, 2003, at the Collection Nationale de Cultures de
Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15; it
contains the cDNA sequence encoding the E protein of the SARS-CoV
strain derived from the sample recorded under the No. 031589, as
defined above, said sequence corresponding to the nucleotides at
positions 26082 to 26413 (SEQ ID NO: 15), with reference to the
Genbank sequence accession No. AY274119.3, [0081] the plasmid,
called SARS-M, contained in the bacterial strain deposited under
the No. I-3047, on May 28, 2003, at the Collection Nationale de
Cultures de Microorganismes, 25 rue du Docteur Roux, 75724 Paris
Cedex 15; it contains the cDNA sequence encoding the M protein of
the SARS-CoV strain derived from the sample recorded under the No.
031589, as defined above; said sequence corresponding to the
nucleotides at positions 26330 to 27098 (SEQ ID NO: 18), with
reference to the Genbank sequence accession No. AY274119.3, [0082]
the plasmid, called SARS-MN, contained in the bacterial strain
deposited under the No. I-3125, on Nov. 13, 2003, at the Collection
Nationale de Cultures de Microorganismes, 25 rue du Docteur Roux,
75724 Paris Cedex 15; it contains the cDNA sequence corresponding
to the region situated between ORF-M and ORF-N of the SARS-CoV
strain derived from the sample recorded under the No. 031589 and
collected in Hanoi, as defined above, said sequence corresponding
to the nucleotides at positions 26977 to 28218 (SEQ ID NO: 20),
with reference to the Genbank accession No. AY274119.3, [0083] the
plasmid, called SARS-N, contained in the bacterial strain deposited
under the No. I-3048, on Jun. 5, 2003, at the Collection Nationale
de Cultures de Microorganismes, 25 rue du Docteur Roux, 75724 Paris
Cedex 15; it contains the cDNA encoding the N protein of the
SARS-CoV strain derived from the sample recorded under the No.
031589, as defined above, said sequence corresponding to the
nucleotides at positions 28054 to 29430 (SEQ ID NO: 38), with
reference to the Genbank sequence accession No. AY274119.3; thus,
this plasmid comprises an insert of sequence SEQ ID NO: 38 and is
contained in a bacterial strain which was deposited under the No.
I-3048, on Jun. 5, 2003, at the Collection Nationale de Cultures de
Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15,
[0084] the plasmid, called SARS-5'NC, contained in the bacterial
strain deposited under the No. I-3124, on Nov. 7, 2003, at the
Collection Nationale de Cultures de Microorganismes, 25 rue du
Docteur Roux, 75724 Paris Cedex 15; it contains the cDNA
corresponding to the noncoding 5' end of the genome of the SARS-CoV
strain derived from the sample recorded under the No. 031589, as
defined above, said sequence corresponding to the nucleotides at
positions 1 to 204 (SEQ ID NO: 39), with reference to the Genbank
sequence accession No. AY274119.3, [0085] the plasmid called
SARS-3'NC, contained in the bacterial strain deposited under the
No. I-3123 on Nov. 7, 2003, at the Collection Nationale de Cultures
de Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15;
it contains the cDNA sequence corresponding to the noncoding 3' end
of the genome of the SARS-CoV strain derived from the sample
recorded under the No. 031589, as defined above, said sequence
corresponding to the nucleotides at positions 28933 to 29727 (SEQ ID
NO: 40), with reference to the Genbank sequence accession No.
AY274119.3, and ending with a series of A nucleotides, [0086] the
expression plasmid, called pIV2.3N,
containing a cDNA fragment encoding a C-terminal fusion of the N
protein (SEQ ID NO: 37) with a polyhistidine tag, [0087] the
expression plasmid, called pIV2.3S.sub.C, containing a cDNA
fragment encoding a C-terminal fusion of the fragment corresponding
to positions 475 to 1193 of the amino acid sequence of the S
protein (SEQ ID NO: 3) with a polyhistidine tag, [0088] the
expression plasmid, pIV2.3S.sub.L, containing a cDNA fragment
encoding a C-terminal fusion of the fragment corresponding to
positions 14 to 1193 of the amino acid sequence of the S protein
(SEQ ID NO: 3) with a polyhistidine tag, [0089] the expression
plasmid, called pIV2.4N, containing a cDNA fragment encoding an
N-terminal fusion of the N protein (SEQ ID NO: 37) with a
polyhistidine tag, [0090] the expression plasmid, called
pIV2.4S.sub.C or pIV2.4S.sub.1, containing an insert encoding an
N-terminal fusion of the fragment corresponding to positions 475 to
1193 of the amino acid sequence of the S protein (SEQ ID NO: 3)
with a polyhistidine tag, and [0091] the expression plasmid, called
pIV2.4S.sub.L, containing a cDNA fragment encoding an N-terminal
fusion of the fragment corresponding to positions 14 to 1193 of the
amino acid sequence of the S protein (SEQ ID NO: 3) with a
polyhistidine tag.
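By way of a hedged illustration, the insert coordinates listed above for the deposited plasmids may be collected in a small lookup table in order to compute insert lengths and overlaps. The coordinates are taken from the text (1-based, inclusive, with reference to Genbank AY274119.3); the names INSERTS, insert_length and overlap are hypothetical helpers and not part of the invention.

```python
# Insert coordinates (1-based, inclusive) on Genbank AY274119.3, as listed in the text.
INSERTS = {
    "SARS-S": (21406, 25348),
    "SARS-S1": (21406, 23454),
    "SARS-S2": (23322, 25348),
    "SARS-SE": (25110, 26244),
    "SARS-E": (26082, 26413),
    "SARS-M": (26330, 27098),
    "SARS-MN": (26977, 28218),
    "SARS-N": (28054, 29430),
    "SARS-5'NC": (1, 204),
    "SARS-3'NC": (28933, 29727),
}


def insert_length(name: str) -> int:
    """Length in nucleotides of the named insert."""
    start, end = INSERTS[name]
    return end - start + 1


def overlap(a: str, b: str) -> int:
    """Number of reference positions shared by two inserts (0 if disjoint)."""
    (s1, e1), (s2, e2) = INSERTS[a], INSERTS[b]
    return max(0, min(e1, e2) - max(s1, s2) + 1)


if __name__ == "__main__":
    for name in INSERTS:
        print(name, insert_length(name))
    # The 5' and 3' fragments of the S cDNA share a region around position 23400.
    print("S1/S2 overlap:", overlap("SARS-S1", "SARS-S2"))
```

Under these coordinates, the SARS-S1 and SARS-S2 inserts together span exactly the region carried by SARS-S, the two fragments sharing the positions 23322 to 23454.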
[0092] According to an advantageous feature of the expression
plasmid as defined above, it is contained in a bacterial strain
which was deposited under the No. I-3117, on Oct. 23, 2003, at the
Collection Nationale de Cultures de Microorganismes, 25 rue du
Docteur Roux, 75724 Paris Cedex 15.
[0093] According to another advantageous feature of the expression
plasmid as defined above, it is contained in a bacterial strain
which was deposited under the No. I-3118, on Oct. 23, 2003, at the
Collection Nationale de Cultures de Microorganismes, 25 rue du
Docteur Roux, 75724 Paris Cedex 15.
[0094] According to another feature of the expression plasmid as
defined above, it is contained in a bacterial strain which was
deposited at the CNCM, 25 rue du Docteur Roux, 75724 Paris Cedex 15
under the following numbers:
[0095] a) strain No. I-3118, deposited on Oct. 23, 2003,
[0096] b) strain No. I-3019, deposited on May 12, 2003,
[0097] c) strain No. I-3020, deposited on May 12, 2003,
[0098] d) strain No. I-3059, deposited on Jun. 20, 2003,
[0099] e) strain No. I-3323, deposited on Nov. 22, 2004,
[0100] f) strain No. I-3324, deposited on Nov. 22, 2004,
[0101] g) strain No. I-3326, deposited on Dec. 1, 2004,
[0102] h) strain No. I-3327, deposited on Dec. 1, 2004,
[0103] i) strain No. I-3332, deposited on Dec. 1, 2004,
[0104] j) strain No. I-3333, deposited on Dec. 1, 2004,
[0105] k) strain No. I-3334, deposited on Dec. 1, 2004,
[0106] l) strain No. I-3335, deposited on Dec. 1, 2004,
[0107] m) strain No. I-3336, deposited on Dec. 1, 2004,
[0108] n) strain No. I-3337, deposited on Dec. 1, 2004,
[0109] o) strain No. I-3338, deposited on Dec. 2, 2004,
[0110] p) strain No. I-3339, deposited on Dec. 2, 2004,
[0111] q) strain No. I-3340, deposited on Dec. 2, 2004,
[0112] r) strain No. I-3341, deposited on Dec. 2, 2004.
[0113] The subject of the present invention is also a nucleic acid
insert of viral origin, characterized in that it is contained in
any of the strains as defined above in a)-r).
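Since several embodiments refer to the strains by letter ranges of the list a) to r), the following sketch, given purely as a reading aid, resolves such a range to the corresponding CNCM deposit numbers. The deposit numbers are those listed above; DEPOSITS and strains are hypothetical names.

```python
from string import ascii_lowercase

# CNCM deposit numbers in the order a) to r), as listed in the text.
DEPOSITS = dict(zip(ascii_lowercase, [
    "I-3118", "I-3019", "I-3020", "I-3059", "I-3323", "I-3324",
    "I-3326", "I-3327", "I-3332", "I-3333", "I-3334", "I-3335",
    "I-3336", "I-3337", "I-3338", "I-3339", "I-3340", "I-3341",
]))


def strains(first: str, last: str) -> list[str]:
    """Deposit numbers for the paragraphs first) to last), inclusive."""
    i, j = ascii_lowercase.index(first), ascii_lowercase.index(last)
    if i > j:
        raise ValueError("first must not come after last")
    return [DEPOSITS[letter] for letter in ascii_lowercase[i:j + 1]]


if __name__ == "__main__":
    print(strains("k", "n"))  # the strains of paragraphs k) to n)
```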
[0114] The subject of the present invention is also a nucleic acid
containing a synthetic gene allowing optimized expression of the S
protein in eukaryotic cells, characterized in that it possesses the
sequence SEQ ID NO: 140.
[0115] The subject of the present invention is also an expression
vector containing a nucleic acid containing a synthetic gene
allowing optimized expression of the S protein, which vector is
contained in the bacterial strain deposited at the CNCM, on Dec. 1,
2004, under the No. I-3333.
[0116] According to one embodiment of said expression vector, it is
a viral vector, in the form of a viral particle or in the form of a
recombinant genome.
[0117] According to an advantageous feature of this embodiment,
this is a recombinant viral particle or a recombinant viral genome
capable of being obtained by transfection of a plasmid according to
paragraphs g), h) and k) to r) as defined above, in an appropriate
cellular system, that is to say, for example, cells transfected
with one or more other plasmids intended to transcomplement certain
functions of the virus that are deleted in the vector and that are
necessary for the formation of the viral particles.
[0118] The expression "S protein family" is understood here to mean
the complete S protein, its ectodomain and fragments of this
ectodomain which are preferably produced in a eukaryotic
system.
[0119] The subject of the present invention is also a lentiviral
vector encoding a polypeptide of the S protein family, as defined
above.
[0120] The subject of the present invention is also a recombinant
measles virus encoding a polypeptide of the S protein family, as
defined above.
[0121] The subject of the present invention is also a recombinant
vaccinia virus encoding a polypeptide of the S protein family, as
defined above.
[0122] The subject of the present invention is also the use of a
vector according to paragraphs e) to r) as defined above, or of a
vector containing a synthetic gene for the S protein, as defined
above, for the production, in a eukaryotic system, of the
SARS-associated coronavirus S protein or of a fragment of this
protein.
[0123] The subject of the present invention is also a method for
producing the S protein in a eukaryotic system, comprising a step
of transfecting eukaryotic cells in culture with a vector chosen
from the vectors contained in the bacterial strains mentioned in
paragraphs e) to r) above or a vector containing a synthetic gene
allowing optimized expression of the S protein.
[0124] The subject of the present invention is also a cDNA library
characterized in that it comprises fragments as defined above, in
particular amplification fragments or restriction fragments, cloned
into a recombinant vector, in particular an expression vector
(expression library).
[0125] The subject of the present invention is also cells, in
particular prokaryotic cells, modified by a recombinant vector as
defined above.
[0126] The subject of the present invention is also a genetically
modified eukaryotic cell expressing a protein or a polypeptide as
defined above. The term "genetically modified eukaryotic cell"
does not, of course, denote a cell modified with a wild-type
virus.
[0127] According to an advantageous embodiment of said cell, it is
capable of being obtained by transfection with any of the vectors
mentioned in paragraphs k) to n) above.
[0128] According to an advantageous feature of this embodiment,
this is the cell FRhK4-Ssol-30, deposited at the CNCM on Nov. 22,
2004, under the No. I-3325.
[0129] The recombinant vectors as defined above and the cells
transformed with said expression vectors are advantageously used
for the production of the corresponding proteins and peptides. The
expression libraries derived from said vectors, and the cells
transformed with said expression libraries are advantageously used
to identify the immunogenic epitopes (B and T epitopes) of the
SARS-associated coronavirus proteins.
[0130] The subject of the present invention is also the purified or
isolated proteins and peptides, characterized in that they are
encoded by the polynucleotide or one of its fragments as defined
above.
[0131] According to an advantageous embodiment of the invention,
said protein is selected from the group consisting of: [0132] the S
protein having the sequence SEQ ID NO: 3 or its ectodomain [0133]
the E protein having the sequence SEQ ID NO: 14 [0134] the M
protein having the sequence SEQ ID NO: 17 [0135] the N protein
having the sequence SEQ ID NO: 37 [0136] the proteins encoded by
the ORFs: ORF1a, ORF1b, ORF3, ORF4 and ORF7 to ORF11, ORF13 and
ORF14 and having the respective sequence, SEQ ID NO: 74, 75, 10,
12, 22, 24, 26, 28, 30, 33 and 35.
[0137] The terms "ectodomain of the S protein" and "soluble form of
the S protein" will be used interchangeably below.
[0138] According to an advantageous embodiment of the invention,
said polypeptide consists of the amino acids corresponding to
positions 1 to 1193 of the amino acid sequence of the S
protein.
[0139] According to another advantageous embodiment of the
invention, said peptide is selected from the group consisting
of:
[0140] a) the peptides corresponding to positions 14 to 1193 and
475 to 1193 of the amino acid sequence of the S protein,
[0141] b) the peptides corresponding to positions 2 to 14 (SEQ ID
NO: 69) and 100 to 221 of the amino acid sequence of the M protein;
these peptides correspond respectively to the ectodomain and to the
endodomain of the M protein, and
[0142] c) the peptides corresponding to positions 1 to 12 (SEQ ID
NO: 70) and 53 to 76 (SEQ ID NO: 71) of the amino acid sequence of
the E protein; these peptides correspond respectively to the
ectodomain and to the C-terminal end of the E protein, and
[0143] d) the peptides of 5 to 50 consecutive amino acids,
preferably of 10 to 30 amino acids, inclusive or partially or
completely overlapping the sequence of the peptides as defined in
a), b) or c).
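The overlapping peptides of item d) may, for example, be enumerated with a sliding window, as in the following sketch. This is one possible way of tiling a region and is not a method prescribed by the invention; the function name, the window length and the step are assumptions, and the sequence used in the usage example is an arbitrary placeholder, not the sequence of the S, M or E protein.

```python
def tile_peptides(sequence, start, end, length=15, step=5):
    """Yield (first_pos, last_pos, peptide) for windows of `length` residues
    tiling positions start..end (1-based, inclusive) of `sequence`.

    Consecutive windows overlap by `length - step` residues when step < length.
    """
    if not 5 <= length <= 50:
        raise ValueError("length must be between 5 and 50 residues")
    if step < 1:
        raise ValueError("step must be at least 1")
    region = sequence[start - 1:end]
    for offset in range(0, len(region) - length + 1, step):
        yield start + offset, start + offset + length - 1, region[offset:offset + length]


if __name__ == "__main__":
    placeholder = "ACDEFGHIKLMNPQRSTVWY" * 2  # hypothetical 40-residue sequence
    for first, last, peptide in tile_peptides(placeholder, 1, 40):
        print(first, last, peptide)
```

A shorter step gives a denser set of peptides for epitope mapping at the cost of a larger number of peptides to synthesize.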
[0144] The subject of the present invention is also a peptide,
characterized in that it has a sequence of 7 to 50 amino acids
including an amino acid residue selected from the group consisting
of: [0145] the alanine situated at position 2552 of the amino acid
sequence of the protein encoded by ORF1a, [0146] the serine
situated at position 577 of the amino acid sequence of the S
protein of the SARS-CoV strain as defined above, [0147] the glycine
at position 11 of the amino acid sequence of the protein encoded by
ORF3 of the SARS-CoV strain as defined above, [0148] the serine at
position 154 of the amino acid sequence of the M protein of the
SARS-CoV strain as defined above.
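For a peptide of 7 to 50 amino acids that must include a given residue, the possible positions of the peptide within the protein follow from simple arithmetic, sketched below. The function name is hypothetical, and the protein length used in the usage example is an arbitrary value chosen for illustration, not a length stated in the text.

```python
def window_starts(position, length, protein_length):
    """Return (first_start, last_start), 1-based, of all peptides of `length`
    residues that contain `position` and lie entirely within the protein."""
    if not 7 <= length <= 50:
        raise ValueError("length must be between 7 and 50 residues")
    if length > protein_length:
        raise ValueError("peptide longer than protein")
    if not 1 <= position <= protein_length:
        raise ValueError("position outside the protein")
    first = max(1, position - length + 1)
    last = min(position, protein_length - length + 1)
    return first, last


if __name__ == "__main__":
    # Peptides of 7 residues containing position 577 of a hypothetical
    # 1000-residue protein start anywhere from position 571 to position 577.
    print(window_starts(577, 7, 1000))
```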
[0149] The subject of the present invention is also a polyclonal or
monoclonal antibody or antibody fragment which can be obtained
by immunization of an animal with a recombinant vector as defined
above, a cDNA library as defined above or alternatively a protein
or a peptide as defined above, characterized in that it binds to at
least one of the proteins encoded by SARS-CoV as defined above.
[0150] The invention encompasses the polyclonal antibodies, the
monoclonal antibodies, the chimeric antibodies such as the
humanized antibodies, and fragments thereof (Fab, Fv, scFv).
[0151] A subject of the present invention is also a hybridoma
producing a monoclonal antibody against the N protein,
characterized in that it is chosen from the following hybridomas:
[0152] the hybridoma producing the monoclonal antibody 87,
deposited at the CNCM on Dec. 1, 2004 under the number I-3328,
[0153] the hybridoma producing the monoclonal antibody 86,
deposited at the CNCM on Dec. 1, 2004 under the number I-3329,
[0154] the hybridoma producing the monoclonal antibody 57,
deposited at the CNCM on Dec. 1, 2004 under the number I-3330, and
[0155] the hybridoma producing the monoclonal antibody 156,
deposited at the CNCM on Dec. 1, 2004 under the number I-3331.
[0156] The subject of the present invention is also a polyclonal or
monoclonal antibody or antibody fragment directed against the N
protein, characterized in that it is produced by a hybridoma as
defined above.
[0157] For the purposes of the present invention, the expression
chimeric antibody is understood to mean, in relation to an antibody
of a particular animal species or of a particular class of
antibody, an antibody comprising all or part of a heavy chain
and/or of a light chain of an antibody of another animal species or
of another class of antibody.
[0158] For the purposes of the present invention, the expression
humanized antibody is understood to mean a human immunoglobulin in
which the residues of the CDRs (Complementarity Determining Regions)
which form the antigen-binding site are replaced by those of a
nonhuman monoclonal antibody possessing the desired specificity,
affinity or activity. Compared with the nonhuman antibodies, the
humanized antibodies are less immunogenic and possess a prolonged
half-life in humans because they possess only a small proportion of
nonhuman sequences given that practically all the residues of the
FR (Framework) regions and of the constant (Fc) region of these
antibodies are those of a consensus sequence of human
immunoglobulins.
[0159] A subject of the present invention is also a protein chip or
filter, characterized in that it comprises a protein, a peptide or
alternatively an antibody as defined above.
[0160] The protein chips according to the invention are prepared by
conventional methods known per se. Among the appropriate supports
on which proteins may be immobilized, there may be mentioned those
made of plastic or glass, in particular in the form of
microplates.
[0161] The subject of the present invention is also reagents
derived from the isolated strain of SARS-associated coronavirus,
derived from the sample recorded under the No. 031589, which are
useful for the study and diagnosis of the infection caused by a
SARS-associated coronavirus, said reagents being selected from the
group consisting of: [0162] (a) a pair of primers, a probe or a DNA
chip as defined above, [0163] (b) a recombinant vector or a
modified cell as defined above, [0164] (c) an isolated coronavirus
strain or a polynucleotide as defined above, [0165] (d) a protein
or a peptide as defined above, [0166] (e) an antibody or an
antibody fragment as defined above, and [0167] (f) a protein chip
as defined above.
[0168] These various reagents are prepared and used according to
conventional molecular biology and immunology techniques following
standard protocols such as those described in Current Protocols in
Molecular Biology (Frederick M. AUSUBEL, 2000, Wiley and Son Inc.,
Library of Congress, USA), in Current Protocols in Immunology (John
E. Coligan, 2000, Wiley and Son Inc., Library of Congress, USA) and
in Antibodies: A Laboratory Manual (E. Harlow and D. Lane, Cold
Spring Harbor Laboratory, 1988).
[0169] The nucleic acid fragments according to the invention are
prepared and used according to conventional techniques as defined
above. The peptides and proteins according to the invention are
prepared by recombinant DNA techniques, known to persons skilled in
the art, in particular with the aid of the recombinant vectors as
defined above. Alternatively, the peptides according to the
invention may be prepared by conventional techniques of solid or
liquid phase synthesis, known to persons skilled in the art.
[0170] The polyclonal antibodies are prepared by immunizing an
appropriate animal with a protein or a peptide as defined above,
optionally coupled to KLH or to albumin and/or combined with an
appropriate adjuvant such as (complete or incomplete) Freund's
adjuvant or aluminum hydroxide; after obtaining a satisfactory
antibody titer, the antibodies are harvested by collecting serum
from the immunized animals and enriched with IgG by precipitation,
according to conventional techniques, and then the IgGs specific
for the SARS-CoV proteins are optionally purified by affinity
chromatography on an appropriate column to which said peptide or
said protein is attached, as defined above, so as to obtain a
monospecific IgG preparation.
[0171] The monoclonal antibodies are produced from hybridomas
obtained by fusion of B lymphocytes from an animal immunized with a
protein or a peptide as defined above with myelomas, according to
the Kohler and Milstein technique (Nature, 1975, 256, 495-497); the
hybridomas are cultured in vitro, in particular in fermenters or
produced in vivo, in the form of ascites; alternatively, said
monoclonal antibodies are produced by genetic engineering as
described in American patent U.S. Pat. No. 4,816,567.
[0172] The humanized antibodies are produced by general methods
such as those described in International application WO
98/45332.
[0173] The antibody fragments are produced from the cloned V.sub.H
and V.sub.L regions, from the mRNAs of hybridomas or splenic
lymphocytes of an immunized mouse; for example, the Fv, scFv or Fab
fragments are expressed at the surface of filamentous phages
according to the Winter and Milstein technique (Nature, 1991, 349,
293-299); after several selection steps, the antibody fragments
specific for the antigen are isolated and expressed in an
appropriate expression system, by conventional techniques for
cloning and expression of recombinant DNA.
[0174] The antibodies or fragments thereof as defined above are
purified by conventional techniques known to persons skilled in the
art, such as affinity chromatography.
[0175] The subject of the present invention is additionally the use
of a product selected from the group consisting of: a pair of
primers, a probe, a DNA chip, a recombinant vector, a modified
cell, an isolated coronavirus strain, a polynucleotide, a protein
or a peptide, an antibody or an antibody fragment and a protein
chip as defined above, for the preparation of a reagent for the
detection and optionally genotyping/serotyping of a SARS-associated
coronavirus.
[0176] The proteins and peptides according to the invention, which
are capable of being recognized and/or of inducing the production
of antibodies specific for the SARS-associated coronavirus, are
useful for the diagnosis of infection with such a coronavirus; the
infection is detected by an appropriate technique, in particular
EIA, ELISA, RIA or immunofluorescence, in a biological sample
collected from an individual capable of being infected.
[0177] According to an advantageous feature of said use, said
proteins are selected from the group consisting of the S, E, M
and/or N proteins and the peptides as defined above.
[0178] The S, E, M and/or N proteins and the peptides derived from
these proteins as defined above, for example the N protein, are
used for the indirect diagnosis of a SARS-associated coronavirus
infection (serological diagnosis; detection of an antibody specific
for SARS-CoV), in particular by an immunoenzymatic method
(ELISA).
[0179] The antibodies and antibody fragments according to the
invention, in particular those directed against the S, E, M and/or
N proteins and the derived peptides as defined above, are useful
for the direct diagnosis of a SARS-associated coronavirus
infection; the detection of the protein(s) of SARS-CoV is carried
out by an appropriate technique, in particular EIA, ELISA, RIA,
immunofluorescence, in a biological sample collected from an
individual capable of being infected.
[0180] The subject of the present invention is also a method for
the detection of a SARS-associated coronavirus, from a biological
sample, which method is characterized in that it comprises at
least: [0181] (a) bringing said biological sample into contact with
at least one antibody or one antibody fragment, one protein, one
peptide or alternatively one protein or peptide chip or filter as
defined above, and [0182] (b) visualizing by any appropriate means
antigen-antibody complexes formed in (a), for example by EIA,
ELISA, RIA, or by immunofluorescence.
[0183] According to one advantageous embodiment of said process,
step (a) comprises: [0184] (a.sub.1) bringing said biological
sample into contact with at least a first antibody or an antibody
fragment which is attached to an appropriate support, in particular
a microplate, [0185] (a.sub.2) washing the solid phase, and [0186]
(a.sub.3) adding at least a second antibody or an antibody
fragment, different from the first, said antibody or antibody
fragment being optionally appropriately labeled.
[0187] This method, which makes it possible to capture the viral
particles present in the biological sample, is also called the
immunocapture method.
[0188] For example: [0189] step (a.sub.1) is carried out with at
least a first monoclonal or polyclonal antibody or a fragment
thereof, directed against the S, M and/or E protein, and/or a
peptide corresponding to the ectodomain of one of these proteins
(M2-14 or E1-12 peptides) [0190] step (a.sub.3) is carried out with
at least one antibody or an antibody fragment directed against
another epitope of the same protein or preferably against another
protein, preferably against an inner protein such as the N
nucleoprotein or the endodomain of the E or M protein, more
preferably still these are antibodies or antibody fragments
directed against the N protein which is very abundant in the viral
particle; when an antibody or an antibody fragment directed against
an inner protein (N) or against the endodomain of the E or M
proteins is used, said antibody is incubated in the presence of
detergent, such as Tween 20 for example, at concentrations of the
order of 0.1%. [0191] step (b) for visualizing the antigen-antibody
complexes formed is carried out, either directly with the aid of a
second antibody labeled for example with biotin or an appropriate
enzyme such as peroxidase or alkaline phosphatase, or indirectly
with the aid of an anti-immunoglobulin serum labeled as above. The
complexes thus formed are visualized with the aid of an appropriate
substrate.
[0192] According to a preferred embodiment of this aspect of the
invention, the biological sample is mixed with the visualizing
monoclonal antibody prior to its being brought into contact with
the capture monoclonal antibodies. Where appropriate, the
serum-visualizing antibody mixture is incubated for at least 10
minutes at room temperature before being applied to the plate.
[0193] The subject of the present invention is also an
immunocapture test intended to detect an infection by the
SARS-associated coronavirus by detecting the native nucleoprotein
(N protein), in particular characterized in that the antibody used
for the capture of the native viral nucleoprotein is a monoclonal
antibody specific for the central region and/or for a
conformational epitope.
[0194] According to one embodiment of said test, the antibody used
for the capture of the N protein is the monoclonal antibody mAb87,
produced by the hybridoma deposited at the CNCM on Dec. 1, 2004
under the number I-3328.
[0195] According to another embodiment of said immunocapture test,
the antibody used for the capture of the N protein is the
monoclonal antibody mAb86, produced by the hybridoma deposited at
the CNCM on Dec. 1, 2004 under the number I-3329.
[0196] According to another embodiment of said immunocapture test,
the monoclonal antibodies mAb86 and mAb87 are used for the capture
of the N protein.
[0197] In the immunocapture tests according to the invention, it is
possible to use, for visualizing the N protein, the monoclonal
antibody mAb57, produced by the hybridoma deposited at the CNCM on
Dec. 1, 2004 under the number I-3330, said antibody being
conjugated with a visualizing molecule or particle.
[0198] In accordance with said immunocapture test, a combination of
the antibodies mAb57 and mAb87, conjugated with a visualizing
molecule or particle, is used for the visualization of the N
protein.
[0199] A visualizing molecule may be a radioactive atom, a dye, a
fluorescent molecule, a fluorophore, an enzyme; a visualizing
particle may be for example: colloidal gold, a magnetic particle or
a latex bead.
[0200] The subject of the present invention is also a reagent for
detecting a SARS-associated coronavirus, characterized in that it
is selected from the group consisting of: [0201] (a) a pair of
primers or a probe as defined above, [0202] (b) a recombinant
vector as defined above or a modified cell as defined above, [0203]
(c) an isolated coronavirus strain as defined above or a
polynucleotide as defined above, [0204] (d) an antibody or an
antibody fragment as defined above, [0205] (e) a combination of
antibodies comprising the monoclonal antibodies mAb86 and/or mAb87,
and the monoclonal antibody mAb57, as defined above, [0206] (f) a
chip or a filter as defined above.
[0207] The subject of the present invention is also a method for
the detection of a SARS-associated coronavirus infection, from a
biological sample, by indirect IgG ELISA using the N protein, which
method is characterized in that the plates are sensitized with an N
protein solution at a concentration of between 0.5 and 4 µg/ml,
preferably 2 µg/ml, in a 10 mM PBS buffer, pH 7.2, with phenol red
at 0.25 ml/l.
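The preparation of the sensitizing solution reduces to a dilution calculation (C1 x V1 = C2 x V2), sketched below under stated assumptions. The concentration range of 0.5 to 4 µg/ml is taken from the text; the stock concentration and the final volume in the usage example are hypothetical values, as is the function name.

```python
def stock_volume_ul(target_ug_per_ml, final_volume_ml, stock_ug_per_ml):
    """Volume of N protein stock (in µl) to dilute in buffer so as to obtain
    `final_volume_ml` of solution at `target_ug_per_ml` (C1 * V1 = C2 * V2)."""
    if not 0.5 <= target_ug_per_ml <= 4:
        raise ValueError("target concentration outside the 0.5-4 µg/ml range")
    if stock_ug_per_ml < target_ug_per_ml:
        raise ValueError("stock is less concentrated than the target")
    return target_ug_per_ml * final_volume_ml * 1000 / stock_ug_per_ml


if __name__ == "__main__":
    # Hypothetical example: 10 ml of solution at the preferred 2 µg/ml,
    # prepared from a stock assumed to be at 1 mg/ml (1000 µg/ml).
    print(stock_volume_ul(2, 10, 1000), "µl of stock")
```

The remainder of the final volume is made up with the PBS buffer described above.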
[0208] The subject of the present invention is additionally a
method for the detection of a SARS-associated coronavirus
infection, from a biological sample, by double epitope ELISA,
characterized in that the serum to be tested is mixed with the
visualizing antigen, said mixture then being brought into contact
with the antigen attached to a solid support.
[0209] According to one variant of the tests for detecting
SARS-associated coronaviruses, these tests combine an ELISA using
the N protein, and another ELISA using the S protein, as described
below.
[0210] The subject of the present invention is also an immune
complex formed of a polyclonal or monoclonal antibody or antibody
fragment as defined above, and of a SARS-associated coronavirus
protein or peptide.
[0211] The subject of the present invention is additionally a
SARS-associated coronavirus detection kit, characterized in that it
comprises at least one reagent selected from the group consisting
of: a pair of primers, a probe, a DNA or RNA chip, a recombinant
vector, a modified cell, an isolated coronavirus strain, a
polynucleotide, a protein or a peptide, an antibody, and a protein
chip as defined above.
[0212] The subject of the present invention is additionally an
immunogenic composition, characterized in that it comprises at
least one product selected from the group consisting of: [0213] a)
a protein or a peptide as defined above, [0214] b) a polynucleotide
of the DNA or RNA type or one of its representative fragments as
defined above, having a sequence chosen from: [0215] (i) the
sequence SEQ ID NO: 1 or its RNA equivalent [0216] (ii) the
sequence hybridizing under high stringency conditions with the
sequence SEQ ID NO: 1, [0217] (iii) the sequence complementary to
the sequence SEQ ID NO: 1 or to the sequence hybridizing under high
stringency conditions with the sequence SEQ ID NO: 1, [0218] (iv)
the nucleotide sequence of a representative fragment of the
polynucleotide as defined in (i), (ii) or (iii), [0219] (v) the
sequence as defined in (i), (ii), (iii) or (iv), modified, and
[0220] c) a recombinant expression vector comprising a
polynucleotide as defined in b), and [0221] d) a cDNA library as
defined above, said immunogenic composition being capable of
inducing protective humoral or cellular immunity specific for the
SARS-associated coronavirus, in particular the production of an
antibody directed against a specific epitope of the SARS-associated
coronavirus.
[0222] The proteins and peptides as defined above, in particular
the S, M, E and/or N proteins and the derived peptides, and the
nucleic acid (DNA or RNA) molecules encoding said proteins or said
peptides are good candidate vaccines and may be used in immunogenic
compositions for the production of a vaccine against the
SARS-associated coronavirus.
[0223] According to an advantageous embodiment of the compositions
according to the invention, they additionally contain at least one
pharmaceutically acceptable vehicle and optionally carrier
substances and/or adjuvants.
[0224] The pharmaceutically acceptable vehicles, the carrier
substances and the adjuvants are those conventionally used.
[0225] The adjuvants are advantageously chosen from the group
consisting of oily emulsions, saponin, mineral substances,
bacterial extracts, aluminum hydroxide and squalene.
[0226] The carrier substances are advantageously selected from the
group consisting of unilamellar liposomes, multilamellar liposomes,
micelles of saponin or solid microspheres of a saccharide or
auriferous nature.
[0227] The compositions according to the invention are administered
by the systemic route, in particular by the intramuscular or
subcutaneous route, or alternatively by the local route, in
particular the nasal (aerosol) route.
[0228] The subject of the present invention is also the use of an
isolated or purified protein or peptide having a sequence selected
from the group consisting of the sequences SEQ ID NO: 3, 10, 12,
14, 17, 22, 24, 26, 28, 30, 33, 35, 37, 69, 70, 71, 74 and 75 to
form an immune complex with an antibody specifically directed
against an epitope of the SARS-associated coronavirus.
[0229] The subject of the present invention is also an immune
complex consisting of an isolated or purified protein or peptide
having a sequence selected from the group consisting of the
sequences SEQ ID NO: 3, 10, 12, 14, 17, 22, 24, 26, 28, 30, 33, 35,
37, 69, 70, 71, 74 and 75, and of an antibody specifically directed
against an epitope of the SARS-associated coronavirus.
[0230] The subject of the present invention is also the use of an
isolated or purified protein or peptide having a sequence selected
from the group consisting of the sequences SEQ ID NO: 3, 10, 12,
14, 17, 22, 24, 26, 28, 30, 33, 35, 37, 69, 70, 71, 74 and 75 to
induce the production of an antibody capable of specifically
recognizing an epitope of the SARS-associated coronavirus.
[0231] The subject of the present invention is also the use of an
isolated or purified polynucleotide having a sequence selected from
the group consisting of the sequences SEQ ID NO: 1, 2, 4, 7, 8, 13,
15, 16, 18, 19, 20, 31, 36 and 38 to induce the production of an
antibody directed against the protein encoded by said
polynucleotide and capable of specifically recognizing an epitope
of the SARS-associated coronavirus.
[0232] The subject of the present invention is also monoclonal
antibodies recognizing the native S protein of a SARS-associated
coronavirus.
[0233] The subject of the present invention is also the use of a
protein or a polypeptide of the S protein family, as defined above,
or of an antibody recognizing the native S protein, as defined
above, to detect an infection by a SARS-associated coronavirus, in
a biological sample.
[0234] The subject of the present invention is also a method for
detecting an infection by a SARS-associated coronavirus, in a
biological sample, characterized in that the detection is carried
out by ELISA using the recombinant S protein, expressed in a
eukaryotic system.
[0235] According to an advantageous embodiment of said method, it
is a double epitope ELISA method, and the serum to be tested is
mixed with the visualizing antigen, said mixture then being brought
into contact with the antigen attached to a solid support.
[0236] The subject of the present invention is also an immune
complex consisting of a monoclonal antibody or antibody fragment
recognizing the native S protein, and of a protein or a peptide of
the SARS-associated coronavirus.
[0237] The subject of the present invention is also an immune
complex consisting of a protein or a polypeptide of the S protein
family, as defined above, and of an antibody specifically directed
against an epitope of the SARS-associated coronavirus.
[0238] The subject of the present invention is additionally a
SARS-associated coronavirus detection kit or box, characterized in
that it comprises at least one reagent selected from the group
consisting of: a protein or polypeptide of the S protein family, as
defined above, a nucleic acid encoding a protein or peptide of the
S protein family, as defined above, a cell expressing a protein or
polypeptide of the S protein family, as defined above, or an
antibody recognizing the native S protein of a SARS-associated
coronavirus.
[0239] The subject of the present invention is an immunogenic
and/or vaccine composition, characterized in that it comprises a
polypeptide or a recombinant protein of the S protein family, as
defined above, obtained in a eukaryotic expression system.
[0240] The subject of the present invention is also an immunogenic
and/or vaccine composition, characterized in that it comprises a
vector or recombinant virus, expressing a protein or a polypeptide
of the S protein family, as defined above.
[0241] In addition to the preceding features, the invention further
comprises other features, which will emerge from the description
which follows, which refers to examples of use of the
polynucleotide representing the genome of the SARS-CoV strain
derived from the sample recorded under the number 031589, and
derived cDNA fragments which are the subject of the present
invention, and to Table I presenting the sequence listing:
TABLE-US-00001 TABLE I Sequence listing
Identification number | Sequence | Position of the cDNA with reference to Genbank AY274119.3 | Deposit number at the CNCM of the corresponding plasmid
SEQ ID NO: 1 | genome of the strain derived from the sample 031589 | -- | --
SEQ ID NO: 2 | ORF-S* | 21406-25348 | --
SEQ ID NO: 3 | S protein | -- | --
SEQ ID NO: 4 | ORF-S** | 21406-25348 | I-3059
SEQ ID NO: 5 | Sa fragment | 21406-23454 | I-3020
SEQ ID NO: 6 | Sb fragment | 23322-25348 | I-3019
SEQ ID NO: 7 | ORF-3 + ORF-4* | 25110-26244 | --
SEQ ID NO: 8 | ORF-3 + ORF-4** | 25110-26244 | I-3126
SEQ ID NO: 9 | ORF3 | -- | --
SEQ ID NO: 10 | ORF-3 protein | -- | --
SEQ ID NO: 11 | ORF4 | -- | --
SEQ ID NO: 12 | ORF-4 protein | -- | --
SEQ ID NO: 13 | ORF-E* | 26082-26413 | --
SEQ ID NO: 14 | E protein | -- | --
SEQ ID NO: 15 | ORF-E** | 26082-26413 | I-3046
SEQ ID NO: 16 | ORF-M* | 26330-27098 | --
SEQ ID NO: 17 | M protein | -- | --
SEQ ID NO: 18 | ORF-M** | 26330-27098 | I-3047
SEQ ID NO: 19 | ORF7 to 11* | 26977-28218 | --
SEQ ID NO: 20 | ORF7 to 11** | 26977-28218 | I-3125
SEQ ID NO: 21 | ORF7 | -- | --
SEQ ID NO: 22 | ORF7 protein | -- | --
SEQ ID NO: 23 | ORF8 | -- | --
SEQ ID NO: 24 | ORF8 protein | -- | --
SEQ ID NO: 25 | ORF9 | -- | --
SEQ ID NO: 26 | ORF9 protein | -- | --
SEQ ID NO: 27 | ORF10 | -- | --
SEQ ID NO: 28 | ORF10 protein | -- | --
SEQ ID NO: 29 | ORF11 | -- | --
SEQ ID NO: 30 | ORF11 protein | -- | --
SEQ ID NO: 31 | ORF1ab | 265-21485 | --
SEQ ID NO: 32 | ORF13 | 28130-28426 | --
SEQ ID NO: 33 | ORF13 protein | -- | --
SEQ ID NO: 34 | ORF14 | -- | --
SEQ ID NO: 35 | ORF14 protein | 28583-28795 | --
SEQ ID NO: 36 | ORF-N* | 28054-29430 |
SEQ ID NO: 37 | N protein | -- | --
SEQ ID NO: 38 | ORF-N** | 28054-29430 | I-3048
SEQ ID NO: 39 | noncoding 5'** | 1-204 | I-3124
SEQ ID NO: 40 | noncoding 3'** | 28933-29727 | I-3123
SEQ ID NO: 41 | ORF1ab Fragment L0 | 30-500 | --
SEQ ID NO: 42 | Fragment L1 | 211-2260 | --
SEQ ID NO: 43 | Fragment L2 | 2136-4187 | --
SEQ ID NO: 44 | Fragment L3 | 3892-5344 | --
SEQ ID NO: 45 | Fragment L4b | 4932-6043 | --
SEQ ID NO: 46 | Fragment L4 | 5305-7318 | --
SEQ ID NO: 47 | Fragment L5 | 7275-9176 | --
SEQ ID NO: 48 | Fragment L6 | 9032-11086 | --
SEQ ID NO: 49 | Fragment L7 | 10298-10982 | --
SEQ ID NO: 50 | Fragment L8 | 12815-14854 | --
SEQ ID NO: 51 | Fragment L9 | 14745-16646 | --
SEQ ID NO: 52 | Fragment L10 | 16514-18590 | --
SEQ ID NO: 53 | Fragment L11 | 18500-20602 | --
SEQ ID NO: 54 | Fragment L12 | 20319-22224 | --
SEQ ID NO: 55 | Sense N primer | -- | --
SEQ ID NO: 56 | Antisense N primer | -- | --
SEQ ID NO: 57 | Sense S.sub.C primer | -- | --
SEQ ID NO: 58 | Sense S.sub.L primer | -- | --
SEQ ID NO: 59 | Antisense S.sub.C and S.sub.L primer | -- | --
SEQ ID NO: 60 | Sense primer series 1 | 28507-28522 | --
SEQ ID NO: 61 | Antisense primer series 1 | 28774-28759 |
SEQ ID NO: 62 | Sense primer series 2 | 28375-28390 | --
SEQ ID NO: 63 | Antisense primer series 2 | 28702-28687 | --
SEQ ID NO: 64 | Probe 1/series 1 | 28561-28586 | --
SEQ ID NO: 65 | Probe 2/series 1 | 28588-28608 | --
SEQ ID NO: 66 | Probe 1/series 2 | 28541-28563 | --
SEQ ID NO: 67 | Probe 2/series 2 | 28565-28589 | --
SEQ ID NO: 68 | Anchor primer 14T | |
SEQ ID NO: 69 | Peptide M2-14 | -- | --
SEQ ID NO: 70 | Peptide E1-12 | -- | --
SEQ ID NO: 71 | Peptide E53-76 | -- | --
SEQ ID NO: 72 | Noncoding 5'* | 1-204 | --
SEQ ID NO: 73 | Noncoding 3'* | 28933-29727 | --
SEQ ID NO: 74 | ORF1a protein | -- | --
SEQ ID NO: 75 | ORF1b protein | -- | --
SEQ ID NO: 76-139 | Primers | |
SEQ ID NO: 140 | Pseudogene of S | |
SEQ ID NO: 141-148 | Primers | |
SEQ ID NO: 149 | Aa1-13 of S | |
SEQ ID NO: 150 | Polypeptide | |
SEQ ID NO: 151-158 | Primers | |
*PCR amplification product (amplicon)
**Insert cloned into the plasmid deposited at the CNCM
and to the appended drawings in which:
[0242] FIG. 1 illustrates Western-blot analysis of the expression
in vitro of the recombinant proteins N, S.sub.C and S.sub.L from
the expression vectors pIVEX. Lane 1: pIV2.3N. Lane 2:
pIV2.3S.sub.C. Lane 3: pIV2.3S.sub.L. Lane 4: pIV2.4N. Lane 5:
pIV2.4S.sub.1 or pIV2.4S.sub.C. Lane 6: pIV2.4S.sub.L. The
expression of the GFP protein expressed from the same vector is
used as a control.
[0243] FIG. 2 illustrates the analysis, by polyacrylamide gel
electrophoresis under denaturing conditions (SDS-PAGE) and staining
with Coomassie blue, of the expression in vivo of the N protein
from the expression vectors pIVEX. The E. coli BL21(DE3)pDIA17
strain transformed with the recombinant vectors pIVEX is cultured
at 30.degree. C. in LB medium, in the presence or in the absence of
inducer (IPTG 1 mM). Lane 1: pIV2.3N. Lane 2: pIV2.4N.
[0244] FIG. 3 illustrates the analysis, by polyacrylamide gel
electrophoresis under denaturing conditions (SDS-PAGE) and staining
with Coomassie blue, of the expression in vivo of the S.sub.L and
S.sub.C polypeptides from the expression vectors pIVEX. The E. coli
BL21(DE3)pDIA17 strain transformed with the recombinant vectors
pIVEX is cultured at 30.degree. C. in LB medium, in the presence or
in the absence of inducer (IPTG 1 mM). Lane 1: pIV2.3S.sub.C. Lane
2: pIV2.3S.sub.L. Lane 3: pIV2.4S.sub.1. Lane 4: pIV2.4S.sub.L.
[0245] FIG. 4 illustrates the antigenic activity of the recombinant
N, S.sub.L and S.sub.C proteins produced in the E. coli
BL21(DE3)pDIA17 strain transformed with the recombinant vectors
pIVEX. A: electrophoresis (SDS-PAGE) of the bacterial lysates. B
and C: Western-blot with the sera, obtained from the same patient
infected with SARS-CoV, collected 8 days (B: serum M12) and 29 days
(C: serum M13) respectively after the onset of the SARS symptoms.
Lane 1: pIV2.3N. Lane 2: pIV2.4N. Lane 3: pIV2.3S.sub.C. Lane 4:
pIV2.4S.sub.1. Lane 5: pIV2.3S.sub.L. Lane 6: pIV2.4S.sub.L.
[0246] FIG. 5 illustrates the purification on an Ni-NTA agarose
column of the recombinant N protein produced in the E. coli
BL21(DE3)pDIA17 strain from the vector pIV2.3N. Lane 1: total
bacterial extract. Lane 2: soluble extract. Lane 3: insoluble
extract. Lane 4: extract deposited on the Ni-NTA column. Lane 5:
unbound proteins. Lane 6: fractions of peak 1. Lane 7: fractions of
peak 2.
[0247] FIG. 6 illustrates the purification of the recombinant
S.sub.C protein from the inclusion bodies produced in the E. coli
BL21(DE3)pDIA17 strain transformed with pIV2.4S.sub.1. A. Treatment
with Triton X-100 (2%): Lane 1: total bacterial extract. Lane 2:
soluble extract. Lane 3: insoluble extract. Lane 4: supernatant
after treatment with Triton X-100 (2%). Lanes 5 and 6: pellet after
treatment with Triton X-100 (2%). B: Treatment with 4 M, 5 M, 6 M
and 7 M urea of the soluble and insoluble extracts.
[0248] FIG. 7 represents the immunoblot produced with the aid of a
lysate of cells infected with SARS-CoV and a serum from a patient
suffering from atypical pneumopathy.
[0249] FIG. 8 represents immunoblots produced with the aid of a
lysate of cells infected with SARS-CoV and rabbit immunosera
specific for the nucleoprotein N (A) and for the spike protein S
(B). I.S.: immune serum. p.i.: preimmune serum. The anti-N immune
serum was used at 1/50 000 and the anti-S immune serum at 1/10
000.
[0250] FIG. 9 illustrates the ELISA reactivity of the rabbit
monospecific polyclonal sera directed against the N protein or the
short fragment of the S protein (S.sub.C), toward the corresponding
recombinant proteins used for immunization. A: rabbits P13097,
P13081 and P13031 immunized with the purified recombinant N
protein. B: rabbits P11135, P13042 and P14001 immunized with a
preparation of inclusion bodies corresponding to the short fragment
of the S protein (S.sub.C). I.S.: immune serum. p.i.: preimmune
serum.
[0251] FIG. 10 illustrates the ELISA reactivity of the purified
recombinant N protein, toward sera from patients suffering from
atypical pneumonia caused by SARS-CoV. FIG. 10A: ELISA plates
prepared with the N protein at the concentration of 4 .mu.g/ml and
2 .mu.g/ml. FIG. 10B: ELISA plate prepared with the N protein at
the concentration of 1 .mu.g/ml. The sera designated A, B, D, E, F,
G, H correspond to those of Table IV.
[0252] FIG. 11 illustrates the amplification by RT-PCR of
decreasing quantities of synthetic RNA of the SARS-CoV N gene
(10.sup.7 to 1 copy), with the aid of pairs of primers No. 1
(N/+/28507, N/-/28774) (A) and No. 2 (N/+/28375, N/-/28702) (B). T:
amplification performed in the absence of RNA. MW: DNA marker.
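As a quick consistency check on the primer coordinates quoted for FIG. 11, the expected size of each RT-PCR product can be derived from the positions in the primer names. The sketch below assumes that the numbers in N/+/28507, N/-/28774, N/+/28375 and N/-/28702 are the 5' ends of the primers on the Genbank AY274119.3 numbering, as the positions listed in Table I suggest; the function and dictionary names are illustrative only.

```python
def amplicon_length(sense_5prime: int, antisense_5prime: int) -> int:
    """Length (nt) of the product spanning both primers, ends included.

    Both arguments are genome positions of the primers' 5' ends; the
    antisense primer anneals downstream, so its 5' end is the larger number.
    """
    if antisense_5prime < sense_5prime:
        raise ValueError("antisense primer must lie downstream of the sense primer")
    return antisense_5prime - sense_5prime + 1


# Primer pairs named in the FIG. 11 description (5' positions taken from the names).
PRIMER_PAIRS = {
    "No. 1": (28507, 28774),
    "No. 2": (28375, 28702),
}

for name, (sense, antisense) in PRIMER_PAIRS.items():
    print(f"pair {name}: expected product of {amplicon_length(sense, antisense)} nt")
```

Under these assumptions pair No. 1 gives a 268-nucleotide product and pair No. 2 a 328-nucleotide product.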
[0253] FIG. 12 illustrates the amplification by RT-PCR in real time
of synthetic RNA for the SARS-CoV N gene: decreasing quantities of
synthetic RNA in replicate (repli.; lanes 16 to 29) and of viral RNA
diluted 1/20.times.10.sup.-4 (lane 32) were amplified by RT-PCR in
real time with the aid of the kit "Light Cycler RNA Amplification
Kit Hybridization Probes" and pairs of primers and probes of the
No. 2 series, under the conditions described in Example 8.
[0254] FIG. 13 (FIGS. 13.1 to 13.7) represents the restriction map
of the sequence SEQ ID NO: 1 corresponding to the DNA equivalent of
the genome of the SARS-CoV strain derived from the sample recorded
under the number 031589.
[0255] FIG. 14 shows the result of the SARS serology test by
indirect N ELISA (1.sup.st series of sera tested).
[0256] FIG. 15 shows the result of the SARS serology test by
indirect N ELISA (2.sup.nd series of sera tested).
[0257] FIG. 16 presents the result of the SARS serology test by
double epitope N ELISA (1.sup.st series of sera tested).
[0258] FIG. 17 shows the result of the SARS serology test by double
epitope N ELISA (2.sup.nd series of sera tested).
[0259] FIG. 18 illustrates the test of reactivity of the anti-N
monoclonal antibodies by ELISA on the native nucleoprotein N of
SARS-CoV. The antibodies were tested in the form of hybridoma
culture supernatants by indirect ELISA using an irradiated lysate
of VeroE6 cells infected with SARS-CoV as antigen (SARS lysate
curves). A negative control for reactivity is performed for each
antibody on a lysate of uninfected VeroE6 cells (negative lysate
curves). Several monoclonal antibodies of known specificity were
used as negative control antibodies: para1-3 directed against the
antigens of the parainfluenza viruses type 1-3 (Bio-Rad) and
influenza B directed against the antigens of the influenza virus
type B (Bio-Rad).
[0260] FIG. 19 illustrates the test of reactivity of the monoclonal
antibodies against SARS-CoV N by ELISA on the native antigens of
the human coronavirus 229E (HCoV-229E). The antibodies were tested
in the form of hybridoma culture supernatants by an indirect ELISA
test using a lysate of MRC-5 cells infected with the human
coronavirus 229E as antigen (229E lysate curves). A negative
control for immunoreactivity was performed for each antibody on a
lysate of noninfected MRC-5 cells (negative lysate curves). The
monoclonal antibody 5-11H.6 directed against the S protein of the
human coronavirus 229E (Sizun et al. 1998, J. Virol. Met. 72:
145-152) is used as positive control antibody. The antibodies
para1-3 directed against the antigens of the parainfluenza virus
type 1-3 (Bio-Rad) and influenza B directed against the antigens of
the influenza virus type B (Bio-Rad) were added to the panel of
monoclonal antibodies tested.
[0261] FIG. 20 shows a test of reactivity of the monoclonal
antibodies against SARS-CoV N by Western blotting on the denatured
native nucleoprotein N of SARS-CoV. A lysate of VeroE6 cells infected with
SARS-CoV was prepared in the loading buffer according to Laemmli
and caused to migrate in a 12% SDS polyacrylamide gel and then the
proteins were transferred onto PVDF membrane. The anti-N monoclonal
antibodies tested were used for the immunoassay at the
concentration of 0.05 .mu.g/ml. The visualization is carried out
with anti-mouse IgG(H+L) antibodies coupled to peroxidase (NA931V,
Amersham) and the ECL+ system. Two monoclonal antibodies were used
as negative controls for reactivity: influenza B directed against
the antigens of the influenza virus type B (Bio-Rad) and para1-3
directed against the antigens of the parainfluenza virus type 1-3
(Bio-Rad).
[0262] FIG. 21 presents the plasmids for expression in mammalian
cells of the SARS-CoV S protein. The cDNA for the SARS-CoV S was
inserted between the BamHI and Xho1 sites of the expression plasmid
pcDNA3.1(+) (Clontech) in order to obtain the plasmid pcDNA-S and
between the Nhe1 and Xho1 sites of the expression plasmid pCI
(Promega) in order to obtain the plasmid pCI-S. The WPRE and CTE
sequences were inserted into each of the two plasmids pcDNA-S
and pCI-S, between the Xho1 and Xba1 sites, in order to obtain the
plasmids pcDNA-S-CTE, pcDNA-S-WPRE, pCI-S-CTE and pCI-S-WPRE,
respectively. [0263] SP: signal peptide predicted (aa 1-13) with
the software signalP v2.0 (Nielsen et al., 1997, Protein
Engineering, 10:1-6) [0264] TM: transmembrane region predicted (aa
1196-1218) with the software TMHMM v2.0 (Sonnhammer et al., 1998,
Proc. of Sixth Int. Conf. on Intelligent Systems for Molecular
Biology, pp. 175-182, AAAI Press). It should be noted that the
amino acids W1194 and P1195 are possibly part of the transmembrane
region with the respective probabilities of 0.13 and 0.42 [0265]
P-CMV: cytomegalovirus immediate/early promoter. BGH pA:
polyadenylation signal of the bovine growth hormone gene [0266]
SV40 late pA: SV40 virus late polyadenylation signal [0267] SD/SA:
splice donor and acceptor sites [0268] WPRE: sequences of the
"Woodchuck Hepatitis Virus posttranscriptional regulatory element"
of the woodchuck hepatitis virus [0269] CTE: sequences of the
"constitutive transport element" of the Mason-Pfizer simian
retrovirus
[0270] FIG. 22 illustrates the expression of the S protein after
transfection of VeroE6 cells. Cellular extracts were prepared 48
hours after transfection of VeroE6 cells with the plasmids pcDNA,
pcDNA-S, pCI and pCI-S. Cellular extracts were also prepared 18
hours after infection with the recombinant vaccinia virus VV-TF7.3
and transfection with the plasmids pcDNA or pcDNA-S. As a control,
extracts of VeroE6 cells were prepared 8 hours after infection with
SARS-CoV at a multiplicity of infection of 3. They were separated
on an 8% SDS acrylamide gel and analyzed by Western blotting with
the aid of an anti-S rabbit polyclonal antibody and an anti-rabbit
IgG(H+L) polyclonal antibody coupled to peroxidase (NA934V,
Amersham). A molecular mass ladder (kDa) is presented in the
figure. [0271] SARS-CoV: extract of VeroE6 cells infected with
SARS-CoV [0272] Mock: control extract of noninfected cells
[0273] FIG. 23 illustrates the effect of the CTE and WPRE sequences
on the expression of the S protein after transfection of VeroE6 and
293T cells. Cellular extracts were prepared 48 hours after
transfection of VeroE6 cells (A) or 293T cells (B) with the
plasmids pcDNA, pcDNA-S, pcDNA-S-CTE, pcDNA-S-WPRE, pCI-S,
pCI-S-CTE and pCI-S-WPRE, separated on 8% SDS polyacrylamide gel and
analyzed by Western blotting with the aid of an anti-S rabbit
polyclonal antibody and an anti-rabbit IgG(H+L) polyclonal antibody
coupled to peroxidase (NA934V, Amersham). A molecular mass ladder
(kDa) is presented in the figure. [0274] SARS-CoV: extract of
VeroE6 cells prepared 8 hours after infection with SARS-CoV at a
multiplicity of infection of 3. [0275] Mock: control extract of
noninfected VeroE6 cells
[0276] FIG. 24 presents defective lentiviral vectors with central
DNA flap for the expression of SARS-CoV S. The cDNA for the
SARS-CoV S protein was cloned in the form of a BamH1-Xho1 fragment
into the plasmid pTRIP.DELTA.U3-CMV containing a defective
lentiviral vector TRIP with central DNA flap (Sirven et al., 2001,
Mol. Ther., 3: 438-448) in order to obtain the plasmid pTRIP-S. The
optimum expression cassettes consisting of the CMV virus
immediate/early promoter, a splice signal, cDNA for S and either of
the posttranscriptional signals CTE or WPRE were substituted for
the cassette EF1.alpha.-EGFP of the defective lentiviral expression
vector with central DNA flap TRIP.DELTA.U3-EF1.alpha. (Sirven et
al., 2001, Mol. Ther., 3: 438-448) in order to obtain the plasmids
pTRIP-SD/SA-S-CTE and pTRIP-SD/SA-S-WPRE. [0277] SP: signal peptide
[0278] TM: transmembrane region [0279] P-CMV: cytomegalovirus
immediate/early promoter [0280] P-EF1.alpha.: EF1.alpha. gene
promoter [0281] SD/SA: splice donor and acceptor sites [0282] WPRE:
sequences of the "Woodchuck Hepatitis Virus posttranscriptional
regulatory element" of the woodchuck hepatitis virus [0283] CTE:
sequences of the "constitutive transport element" of the
Mason-Pfizer simian retrovirus [0284] LTR: long terminal repeat
[0285] .DELTA.U3: LTR deleted for the "promoter/enhancer" sequences
[0286] cPPT: "polypurine tract cis-active sequence" [0287] CTS:
"central termination sequence"
[0288] FIG. 25 shows the Western-blot analysis of the expression of
the SARS-CoV S by cell lines transduced with the lentiviral vectors
TRIP-SD/SA-S-WPRE and TRIP-SD/SA-S-CTE. Cellular extracts were
prepared from established lines FrhK4-S-CTE and FrhK4-S-WPRE after
transduction with the lentiviral vectors TRIP-SD/SA-S-CTE and
TRIP-SD/SA-S-WPRE respectively. They were separated on an 8% SDS
acrylamide gel and analyzed by Western blotting with the aid of an
anti-S rabbit polyclonal antibody and an anti-rabbit IgG(H+L)
conjugate coupled to peroxidase. A molecular mass ladder (kDa) is
presented in the figure. [0289] T-: control extract of FRhK-4 cells
[0290] T+: extract of FRhK-4 cells prepared 24 hours after
infection with SARS-CoV at a multiplicity of infection of 3.
[0291] FIG. 26 relates to the analysis of the expression of Ssol
polypeptide by cell lines transduced with the lentiviral vectors
TRIP-SD/SA-Ssol-WPRE and TRIP-SD/SA-Ssol-CTE. The secretion of the
Ssol polypeptide was determined in the supernatant of a series of
cell clones isolated after transduction of FRhK-4 cells with the
lentiviral vectors TRIP-SD/SA-Ssol-WPRE and TRIP-SD/SA-Ssol-CTE. 5
.mu.l of supernatant, diluted 1/2 in loading buffer according to
Laemmli, were analyzed by Western blotting, visualized with an
anti-FLAG monoclonal antibody (M2, Sigma) and an anti-mouse
IgG(H+L) conjugate coupled to peroxidase. T-: supernatant of the
parental FRhK-4 line. T+: supernatant of BHK cells infected with a
recombinant vaccinia virus expressing the Ssol polypeptide. The
solid arrow indicates the Ssol polypeptide, while the empty arrow
indicates a cross reaction with a protein of cellular origin.
[0292] FIG. 27 shows the results relating to the analysis of the
purified Ssol polypeptide
[0293] A. 8, 2, 0.5 and 0.125 .mu.g of recombinant Ssol polypeptide
purified by anti-FLAG affinity chromatography and gel filtration
(G75) were separated on 8% SDS polyacrylamide gel. The Ssol
polypeptide and variable quantities of molecular mass markers (MM)
were visualized by staining with silver nitrate (Gelcode SilverSNAP
stain kit II, Pierce).
B. Standard markers for analysis by SELDI-TOF mass spectrometry
[0294] IgG: bovine IgG of MM 147300 [0295] ConA: conalbumin of MM
77490 [0296] HRP: horseradish peroxidase analyzed as a control and
of MM 43240 C. Analysis by mass spectrometry (SELDI-TOF) of the
recombinant Ssol polypeptide.
[0297] The peaks A and B correspond to the singly and doubly
charged Ssol polypeptide.
D. Sequencing of the N-terminal end of the recombinant Ssol
polypeptide. 5 Edman degradation cycles in liquid phase were
carried out on an ABI494 sequencer (Applied Biosystems).
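The relationship between singly and doubly charged peaks in panel C of FIG. 27 can be illustrated with the usual mass-to-charge relation for protonated ions, m/z = (M + z x 1.007276)/z. The sketch below is only an illustration: it assumes positive-ion detection of protonated species, which the text does not state, and it uses the masses of the standard markers listed in panel B because the measured mass of the Ssol polypeptide is not given in this passage. The function name is hypothetical.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """m/z of a species carrying `charge` protons (assumed ionization mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


# Standard markers quoted in panel B (molecular masses in Da).
STANDARDS = {"IgG": 147300.0, "ConA": 77490.0, "HRP": 43240.0}

for name, mass in STANDARDS.items():
    single = mass_to_charge(mass, 1)
    double = mass_to_charge(mass, 2)
    print(f"{name}: singly charged m/z ~ {single:.0f}, doubly charged m/z ~ {double:.0f}")
```

The doubly charged species appears at roughly half the m/z of the singly charged one, which is why two peaks can be assigned to the same polypeptide.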
[0298] FIG. 28 illustrates the influence of a splicing signal and
of the CTE and WPRE sequences on the efficacy of the gene
immunization with the aid of plasmid DNA encoding the SARS-CoV
S
A. Groups of 7 BALB/c mice were immunized twice at 4 weeks'
interval with the aid of 50 .mu.g of plasmid DNA of pCI, pcDNA-S,
pCI-S, pcDNA-N and pCI-HA.
B. Groups of 6 BALB/c mice were immunized twice at 4 weeks'
interval with the aid of 2 .mu.g, 10 .mu.g or 50 .mu.g of plasmid
DNA of pCI, pCI-S, pCI-S-CTE and pCI-S-WPRE.
[0299] The immune sera collected 3 weeks after the second
immunization were analyzed by indirect ELISA using a lysate of
VeroE6 cells infected with SARS-CoV as antigen. The anti-SARS-CoV
antibody titers are calculated as the reciprocal of the dilution
producing a specific OD of 0.5 after visualization with an
anti-mouse IgG polyclonal antibody coupled to peroxidase (NA931V,
Amersham) and TMB (KPL).
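The titer definition used for FIG. 28 (reciprocal of the dilution producing a specific OD of 0.5) can be sketched as follows. The text does not say how the dilution is located between two tested points; the example assumes linear interpolation of the OD against the log of the dilution, and the dilution series and OD readings shown are hypothetical values chosen for illustration.

```python
import math


def endpoint_titer(recip_dilutions, specific_ods, cutoff=0.5):
    """Reciprocal dilution at which the specific OD falls to `cutoff`.

    recip_dilutions: reciprocal dilutions in increasing order (e.g. 100, 200).
    specific_ods: matching specific OD values (signal minus background).
    Interpolates on log10(dilution) between the two bracketing points
    (an assumption, not stated in the source); returns None if the curve
    never crosses the cutoff within the tested range.
    """
    points = list(zip(recip_dilutions, specific_ods))
    for (d_lo, od_lo), (d_hi, od_hi) in zip(points, points[1:]):
        if od_lo >= cutoff >= od_hi and od_lo != od_hi:
            fraction = (od_lo - cutoff) / (od_lo - od_hi)
            log_titer = math.log10(d_lo) + fraction * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_titer
    return None


# Hypothetical 2-fold dilution series and readings, for illustration only.
dilutions = [100, 200, 400, 800, 1600]
ods = [1.80, 1.20, 0.70, 0.30, 0.10]
print(endpoint_titer(dilutions, ods))
```

With these made-up readings the cutoff is crossed between 1/400 and 1/800, so the reported titer falls between 400 and 800.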
[0300] FIG. 29 shows the seroneutralization of the infectivity of
SARS-CoV with the antibodies induced in mice after gene
immunization with the aid of plasmid DNA encoding SARS-CoV S. Pools
of immune sera collected 3 weeks after the second immunization were
prepared for each of the groups of experiments described in FIG. 28
and evaluated for their capacity to seroneutralize the infectivity
of 100 TCID50 of SARS-CoV on FRhK-4 cells. 4 points are produced
for each of the 2-fold dilutions tested from 1/20. The
seroneutralizing titer is calculated according to the Reed and
Muench method as the reciprocal of the dilution neutralizing the
infectivity of 2 wells out of 4.
A. Groups of BALB/c mice immunized twice at 4 weeks' interval with
the aid of 50 .mu.g of plasmid DNA of pCI, pcDNA-S, pCI-S, pcDNA-N
and pCI-HA. .quadrature.: preimmune serum. .box-solid.: immune
serum.
B. Groups of BALB/c mice immunized twice at 4 weeks' interval with
the aid of 2 .mu.g, 10 .mu.g or 50 .mu.g of plasmid DNA of pCI,
pCI-S, pCI-S-CTE and pCI-S-WPRE.
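The 50% endpoint calculation named in the FIG. 29 description (4 points per 2-fold dilution, starting at 1/20) can be sketched with the standard cumulative form of the Reed and Muench method. This is a minimal illustration, not the authors' worksheet: the well counts in the example are hypothetical, and the function name is an assumption.

```python
import math


def reed_muench_titer(recip_dilutions, protected_wells, wells_per_dilution=4):
    """50% neutralization titer (reciprocal dilution) by Reed and Muench.

    recip_dilutions: reciprocal serum dilutions in increasing order.
    protected_wells: number of wells with infectivity neutralized at each dilution.
    Returns None if the 50% point is not bracketed by the tested dilutions.
    """
    n = len(recip_dilutions)
    unprotected = [wells_per_dilution - p for p in protected_wells]
    # A well protected at a high dilution is counted as protected at all lower ones;
    # a well unprotected at a low dilution is counted as unprotected at all higher ones.
    cum_protected = [sum(protected_wells[i:]) for i in range(n)]
    cum_unprotected = [sum(unprotected[: i + 1]) for i in range(n)]
    percent = [100.0 * p / (p + u) for p, u in zip(cum_protected, cum_unprotected)]
    for i in range(n - 1):
        if percent[i] >= 50.0 > percent[i + 1]:
            distance = (percent[i] - 50.0) / (percent[i] - percent[i + 1])
            step = math.log10(recip_dilutions[i + 1]) - math.log10(recip_dilutions[i])
            return 10 ** (math.log10(recip_dilutions[i]) + distance * step)
    return None


# Hypothetical read-out: 2-fold dilutions from 1/20, 4 wells each.
print(reed_muench_titer([20, 40, 80, 160], [4, 4, 0, 0]))
```

When exactly 2 wells out of 4 are neutralized at a dilution and none at the next, the function returns that dilution, which matches the definition given in the text.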
[0301] FIG. 30 illustrates the immunoreactivity of the recombinant
Ssol polypeptide toward sera from patients suffering from SARS. The
reactivity of sera from patients was analyzed by indirect ELISA
test against solid phases prepared with the aid of the purified
recombinant Ssol polypeptide. The antibodies from patients reacting
with the solid phase at a dilution of 1/400 are visualized with an
anti-human IgG(H+L) polyclonal antibody coupled to peroxidase
(Amersham NA933V) and TMB plus H.sub.2O.sub.2 (KPL). The sera of probable
SARS cases are identified by a National Reference Center for
Influenza Viruses serial number and by the initials of the patient
and the number of days elapsed since the onset of symptoms, where
appropriate. The TV sera are control sera from subjects which were
collected in France before the SARS epidemic which occurred in
2003.
[0302] FIG. 31 shows the induction of antibodies directed against
SARS-CoV after immunization with the recombinant Ssol polypeptide.
Two groups of 6 mice were immunized at 3 weeks' interval with 10
.mu.g of recombinant Ssol polypeptide (Ssol group) adjuvanted with
aluminum hydroxide or, as a control, with adjuvant alone (mock
group). Three successive immunizations were performed and the
immune sera were collected 3 weeks after each of the three
immunizations (IS1, IS2, IS3). The immune sera were analyzed per
pool for each of the 2 groups by indirect ELISA using a lysate of
VeroE6 cells infected with SARS-CoV as antigen. The anti-SARS-CoV
antibody titers are calculated as the reciprocal of the dilution
producing a specific OD of 0.5 after visualization with an
anti-mouse IgG polyclonal antibody coupled to peroxidase (Amersham)
and TMB (KPL).
[0303] FIG. 32 presents the nucleotide alignment of the sequences
of the synthetic gene 040530 with the sequence of the wild-type
gene of the SARS-CoV isolate 031589. I-3059 corresponds to
nucleotides 21406-25348 of the SARS-CoV isolate 031589 deposited at
the C.N.C.M. under the number I-3059 (SEQ ID NO: 4, plasmid
pSARS-S). S-040530 is the sequence of the synthetic gene 040530.
[0304] FIG. 33 illustrates the use of a synthetic gene for the
expression of the SARS-CoV S. Cellular extracts prepared 48 hours
after transfection of VeroE6 cells (A) or 293T cells (B) with the
plasmids pCI, pCI-S, pCI-S-CTE, pCI-S-WPRE and pCI-Ssynth were
separated on 8% SDS acrylamide gel and analyzed by Western blotting
with the aid of an anti-S rabbit polyclonal antibody and an
anti-rabbit IgG(H+L) polyclonal antibody coupled to peroxidase
(NA934V, Amersham). The Western blot is visualized by luminescence
(ECL+, Amersham) and acquisition on a digital imaging device (Fluor
S, BioRad). The levels of expression of the S protein were measured
by quantifying the 2 predominant bands identified on the image.
[0305] FIG. 34 presents a diagram for the construction of
recombinant vaccinia viruses VV-TG-S, VV-TG-Ssol, VV-TN-S and
VV-TN-Ssol
A. The cDNAs for the S protein and the Ssol polypeptide of SARS-CoV
were inserted between the BamH1 and Sma1 sites of the transfer
plasmid pTG186 in order to obtain the plasmids pTG-S and
pTG-Ssol.
[0306] B. The sequences of the synthetic promoter 480 were then
substituted for those of the 7.5 promoter by exchange of the
Nde1-Pst1 fragments of the plasmids pTG186poly, pTG-S and pTG-Ssol
in order to obtain the transfer plasmids pTN480, pTN-S and
pTN-Ssol.
[0307] C. Sequence of the synthetic promoter 480 as contained
between the Nde1 and Pst1 sites of the transfer plasmids of the pTN
series. An Asc1 site was inserted in order to facilitate subsequent
handling. The restriction sites and the promoter sequence are
underlined.
D. The recombinant vaccinia viruses are obtained by double
homologous recombination in vivo between the TK cassette of the
transfer plasmids of the pTG and pTN series and the TK gene of the
Copenhagen strain of the vaccinia virus.
[0308] SP: signal peptide predicted (aa 1-13) with the software
signalP v2.0 (Nielsen et al., 1997, Protein Engineering, 10:1-6)
[0309] TM: transmembrane region predicted (aa 1196-1218) with the
software TMHMM v2.0 (Sonnhammer et al., 1998, Proc. of Sixth Int.
Conf. on Intelligent Systems for Molecular Biology, pp. 175-182,
AAAI Press). It should be noted that the amino acids W1194 and
P1195 possibly form part of the transmembrane region with
respective probabilities of 0.13 and 0.42.
[0310] TK-L, TK-R: left- and right-hand parts of the vaccinia virus
thymidine kinase gene
[0311] MCS: multiple cloning site
[0312] PE: early promoter
[0313] PL: late promoter
[0314] PL synth: synthetic late promoter 480
[0315] FIG. 35 illustrates the expression of the S protein by
recombinant vaccinia viruses, analyzed by Western blotting.
Cellular extracts were prepared 18 hours after infection of CV1
cells with the recombinant vaccinia viruses VV-TG, VV-TG-S and
VV-TN-S at an M.O.I. of 2 (A). As a control, extracts of VeroE6
cells were prepared 8 hours after infection with SARS-CoV at a
multiplicity of infection of 2. Cellular extracts were also
prepared 18 hours after infection of CV1 cells with the recombinant
vaccinia viruses VV-TG-S, VV-TG-Ssol, VV-TN, VV-TN-S and VV-TN-Ssol
(B). They were separated on 8% SDS acrylamide gels and analyzed by
Western blotting with the aid of an anti-S rabbit polyclonal
antibody and an anti-rabbit IgG(H+L) polyclonal antibody coupled to
peroxidase (NA934V, Amersham). "1 .mu.l" and "10 .mu.l" indicate
the quantities of cellular extracts deposited on the gel. A
molecular mass ladder (kDa) is presented in the figure.
[0316] SARS-CoV: extract of VeroE6 cells infected with SARS-CoV
[0317] Mock: control extract of noninfected cells
[0318] FIG. 36 shows the result of a Western-blot analysis of the
secretion of the Ssol polypeptide by the recombinant vaccinia
viruses.
A. Supernatants of CV1 cells infected with the recombinant vaccinia
virus VV-TN, various clones of the VV-TN-Ssol virus and with the
viruses VV-TG-Ssol or VV-TN-Sflag were harvested 18 hours after
infection of CV1 cells at an M.O.I. of 2.
[0319] B. Supernatants of 293T, FRhK-4, BHK-21 and CV1 cells
infected in duplicate (1, 2) with the recombinant vaccinia virus
VV-TN-Ssol at an M.O.I. of 2 were harvested 18 hours after
infection. The supernatant of CV1 cells infected with the virus
VV-TN was also harvested as a control (M).
[0320] All the supernatants were separated on 8% SDS acrylamide gel
according to Laemmli and analyzed by Western blotting with the aid
of an anti-FLAG mouse monoclonal antibody and an anti-mouse
IgG(H+L) polyclonal antibody coupled to peroxidase (NA931V,
Amersham) (A) or with the aid of an anti-S rabbit polyclonal
antibody and an anti-rabbit IgG(H+L) polyclonal antibody coupled to
peroxidase (NA934V, Amersham) (B).
[0321] A molecular mass ladder (kDa) is presented in the
figure.
[0322] FIG. 37 shows the analysis of the Ssol polypeptide, purified
on SDS polyacrylamide gel
[0323] 10, 5 and 2 .mu.l of recombinant Ssol polypeptide purified by
anti-FLAG affinity chromatography were separated on 4 to 15%
gradient SDS polyacrylamide gel. The Ssol polypeptide and variable
quantities of molecular mass markers (MM) were visualized by
staining with silver nitrate (Gelcode SilverSNAP stain kit II,
Pierce).
[0324] FIG. 38 illustrates the immunoreactivity of the recombinant
Ssol polypeptide produced by the recombinant vaccinia virus
VV-TN-Ssol toward sera of patients suffering from SARS. The
reactivity of sera from patients was analyzed by indirect ELISA
test against solid phases prepared with the aid of the purified
recombinant Ssol polypeptide. The antibodies from patients reacting
with the solid phase at a dilution of 1/100 and 1/400 are
visualized with an anti-human IgG(H+L) polyclonal antibody coupled
to peroxidase (Amersham NA933V) and TMB plus H.sub.2O.sub.2 (KPL). The sera
of probable SARS cases are identified by a National Reference
Center for Influenza Virus serial number and by the initials of the
patient and the number of days elapsed since the onset of symptoms,
where appropriate. The TV sera are control sera from subjects which
were collected in France before the SARS epidemic which occurred in
2003.
[0325] FIG. 39 shows the anti-SARS-CoV antibody response in mice
after immunization with the recombinant vaccinia viruses. Groups of
7 BALB/c mice were immunized by the i.v. route twice at 4 weeks'
interval with 10.sup.6 pfu of recombinant vaccinia viruses VV-TG,
VV-TG-HA, VV-TG-S, VV-TG-Ssol, VV-TN, VV-TN-S, VV-TN-Ssol.
[0326] A. Pools of immune sera collected 3 weeks after each of the
two immunizations were prepared for each of the groups and were
analyzed by indirect ELISA using a lysate of VeroE6 cells infected
with SARS-CoV as antigen. The anti-SARS-CoV antibody titers are
calculated as the reciprocal of the dilution producing a specific
OD of 0.5 after visualization with an anti-mouse IgG polyclonal
antibody coupled to peroxidase (NA931V, Amersham) and TMB
(KPL).
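As an illustration only, the titer definition used here (reciprocal of the dilution producing a specific OD of 0.5) can be sketched in Python as follows. The function name, the dilution series and the OD readings are hypothetical, and the log-linear interpolation between the two bracketing dilutions is an assumption of this sketch; the text does not state how intermediate values were interpolated.

```python
import math

def endpoint_titer(reciprocal_dilutions, specific_ods, cutoff=0.5):
    """Return the reciprocal dilution at which the specific OD equals `cutoff`.

    Assumes the ODs decrease as the serum is diluted and interpolates
    linearly in log10(dilution) between the two readings that bracket the
    cutoff (an assumption of this sketch). Returns None when the cutoff is
    not bracketed by the series.
    """
    points = list(zip(reciprocal_dilutions, specific_ods))
    for (d_lo, od_lo), (d_hi, od_hi) in zip(points, points[1:]):
        if od_lo >= cutoff >= od_hi and od_lo != od_hi:
            frac = (od_lo - cutoff) / (od_lo - od_hi)
            log_titer = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_titer
    return None

# Hypothetical readings for one serum pool (not data from the study).
dilutions = [100, 400, 1600, 6400]
ods = [1.8, 1.1, 0.4, 0.1]
titer = endpoint_titer(dilutions, ods)
```

With these made-up readings the cutoff is crossed between the 1/400 and 1/1600 dilutions, so the returned titer falls between 400 and 1600.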
[0327] B. The pools of immune sera were evaluated for their
capacity to seroneutralize the infectivity of 100 TCID50 of
SARS-CoV on FRhK-4 cells. 4 points are produced for each of the
2-fold dilutions tested from 1/20. The seroneutralizing titer is
calculated according to the Reed and Muench method as the
reciprocal of the dilution neutralizing the infectivity of 2 wells
out of 4.
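The endpoint calculation described above can be sketched as follows. This is a minimal Python illustration of the cumulative Reed and Muench interpolation, assuming 4 wells per dilution and an increasing series of reciprocal dilutions; the function name and the well counts used in the example are hypothetical and are not results from the study.

```python
import math

def reed_muench_titer(reciprocal_dilutions, protected_wells, wells_per_dilution=4):
    """50% neutralization endpoint by cumulative (Reed and Muench) interpolation.

    `reciprocal_dilutions` must be in increasing order (e.g. 20, 40, 80, ...);
    `protected_wells[i]` is the number of wells protected at dilution i.
    Returns the reciprocal dilution protecting 50% of the wells, or None if
    the series does not bracket 50%.
    """
    n = len(reciprocal_dilutions)
    unprotected = [wells_per_dilution - p for p in protected_wells]
    # A well protected at a given dilution is counted as protected at every
    # lower dilution: sum protected wells from dilution i to the highest one.
    cum_protected = [sum(protected_wells[i:]) for i in range(n)]
    # An unprotected well is counted as unprotected at every higher dilution:
    # sum unprotected wells from the lowest dilution up to dilution i.
    cum_unprotected = [sum(unprotected[: i + 1]) for i in range(n)]
    fraction = [
        cp / (cp + cu) if (cp + cu) else 0.0
        for cp, cu in zip(cum_protected, cum_unprotected)
    ]
    for i in range(n - 1):
        above, below = fraction[i], fraction[i + 1]
        if above >= 0.5 > below:
            distance = (above - 0.5) / (above - below)
            step = math.log10(reciprocal_dilutions[i + 1] / reciprocal_dilutions[i])
            return 10 ** (math.log10(reciprocal_dilutions[i]) + distance * step)
    return None

# Hypothetical plate: 2-fold dilutions from 1/20, 4 wells per dilution.
example_titer = reed_muench_titer([20, 40, 80, 160], [4, 3, 1, 0])
```

In this hypothetical plate the 50% point lies halfway (on a log scale) between the 1/40 and 1/80 dilutions.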
[0328] FIG. 40 describes the construction of the recombinant
viruses MVSchw2-SARS-S and MVSchw2-SARS-Ssol.
[0329] A. The measles vector is a complete genome of the Schwarz
vaccine strain of the measles virus (MV) into which an additional
transcription unit has been introduced (Combredet, 2003, Journal of
Virology, 77: 11546-11554). The expression of the additional open
reading frames (ORF) is controlled by cis-acting elements necessary
for the transcription, for the formation of the cap and for the
polyadenylation of the transgene which were copied from the
elements present at the N/P junction. 2 different vectors allow the
insertion between the P (phosphoprotein) and M (matrix) genes on
the one hand and the H (hemagglutinin) and L (polymerase) genes on
the other hand.
[0330] B. The recombinant genomes MVSchw2-SARS-S and
MVSchw2-SARS-Ssol of the measles virus were constructed by
inserting the ORFs of the S protein and of the Ssol polypeptide
into an additional transcription unit located between the P and M
genes of the vector.
[0331] The various genes of the measles virus (MV) are indicated: N
(nucleoprotein), P/V/C (phosphoprotein and V/C proteins), M (matrix),
F (fusion), H (hemagglutinin), L (polymerase). T7=T7 RNA polymerase
promoter, hh=hammerhead ribozyme, T7t=T7 phage RNA polymerase
terminator sequence, .delta.=ribozyme of the hepatitis .delta. virus,
(2), (3)=additional transcription units (ATU).
[0332] Size of the MV genome: 15 894 nt.
[0333] SP: signal peptide
[0334] TM: transmembrane region
[0335] FLAG: FLAG tag
[0336] FIG. 41 illustrates the expression of the S protein by the
recombinant measles viruses, analyzed by Western blotting.
[0337] Cytoplasmic extracts were prepared after infection of Vero
cells by different passages of the viruses MVSchw2-SARS-S and
MVSchw2-SARS-Ssol and the wild-type virus MVSchw as control.
Cellular extracts in loading buffer according to Laemmli were also
prepared 8 hours after infection of VeroE6 cells with SARS-CoV at a
multiplicity of infection of 3. They were separated on 8% SDS
acrylamide gel and analyzed by Western blotting with the aid of an
anti-S rabbit polyclonal antibody and an anti-rabbit IgG(H+L)
polyclonal antibody coupled to peroxidase (NA934V, Amersham).
[0338] A molecular mass ladder (kDa) is presented in the figure.
[0339] Pn: nth passage of the virus after coculture of 293-3-46 and
Vero cells
[0340] SARS-CoV: extract of VeroE6 cells infected with SARS-CoV
[0341] Mock: control extract of noninfected VeroE6 cells
[0342] FIG. 42 shows the expression of the S protein by the
recombinant measles viruses, analyzed by immunofluorescence
[0343] Vero cells in monolayers on glass slides were infected with
the wild-type virus MVSchw (A) or the viruses MVSchw2-SARS-S (B)
and MVSchw2-SARS-Ssol (C). When the syncytia had reached 30 to 40%
confluence (A, B) or 90-100% (C), the cells were fixed,
permeabilized and labeled with anti-SARS-CoV rabbit polyclonal
antibodies and an anti-rabbit IgG(H+L) conjugate coupled to FITC
(Jackson).
[0344] FIG. 43 illustrates the Western-blot analysis of the
immunoreactivity of rabbit sera directed against the peptides
E1-12, E53-76 and M2-14. The rabbit 20047 was immunized with the
peptide E1-12 coupled to KLH. The rabbits 22234 and 22240 were
immunized with the peptide E53-76 coupled to KLH. The rabbits 20013
and 20080 were immunized with the peptide M2-14 coupled to KLH. The
immune sera were analyzed by Western blotting with the aid of
extracts of cells infected with SARS-CoV (B) or with the aid of
extracts of cells infected with a recombinant vaccinia virus
expressing the protein E (A) or M (C) of the SARS-CoV 031589
isolate. The immunoblots were visualized with the aid of an
anti-rabbit IgG(H+L) conjugate coupled to peroxidase (NA934V,
Amersham).
[0345] The position of the E and M proteins is indicated by an
arrow.
[0346] A molecular mass ladder (kDa) is presented in the
figure.
[0347] It should be understood, however, that these examples are
given solely by way of illustration of the subject of the
invention, and do not constitute in any manner a limitation
thereto.
EXAMPLE 1
Cloning and Sequencing of the Genome of the SARS-CoV Strain Derived
from the Sample Recorded Under the Number 031589
[0348] The RNA of the SARS-CoV strain was extracted from the
bronchoalveolar washing sample recorded under the number 031589,
taken from a patient suffering from SARS at the French Hospital of
Hanoi (Vietnam).
[0349] The isolated RNA was used as template to amplify the cDNAs
corresponding to the various open reading frames of the genome
(ORF1a, ORF1b, ORF-S, ORF-E, ORF-M, ORF-N (including ORF-13 and
ORF-14), ORF3, ORF4, ORF7 to ORF11), and to the noncoding 5' and 3'
ends. The sequences of the primers and of the probes used for the
amplification/detection were defined based on the available
SARS-CoV nucleotide sequence.
[0350] In the text which follows, the primers and the probes are
identified by: the letter S, followed by a letter which indicates
the corresponding region of the genome (L for the 5' end including
ORF1a and ORF1b; S, M and N for ORF-S, ORF-M, ORF-N, SE and MN for
the corresponding intergene regions), and then optionally by Fn,
Rn, with n between 1 and 6 corresponding to the primers used for
the nested PCR (F1+R1 pair for the first amplification, F2+R2 pair
for the second amplification, and the like), and then by /+/ or /-/
corresponding to a sense or antisense primer and finally by the
positions of the primers with reference to the Genbank sequence
AY274119.3; for the sense and antisense S and N primers and the
other sense primers only, when a single position is indicated, it
corresponds to that of the 5' end of a probe or of a primer of
about 20 bases; for the antisense primers other than the S and N
primers, when a single position is indicated, it corresponds to
that of the 3' end of a probe or of a primer of about 20 bases.
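The naming convention described above can be illustrated with a small parser. This is a sketch only: `PrimerName` and `parse_primer_name` are hypothetical names introduced here, and the handling of names that carry no separate region field (for example S/F1/+/21350-21372, where the region is left as None) is an assumption of this sketch rather than a rule stated in the text.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class PrimerName:
    region: Optional[str]      # e.g. "S", "E", "M", "N", "SE", "MN", "L"
    nested_tag: Optional[str]  # e.g. "F1", "R2" for nested-PCR primers
    sense: bool                # True for /+/, False for /-/
    start: int                 # first position given in the name
    end: Optional[int]         # second position, if a range is given

def parse_primer_name(name: str) -> PrimerName:
    """Split a primer name such as 'S/E/F1/+/26051-26070' into its fields."""
    fields = name.split("/")
    if len(fields) < 3 or fields[0] != "S" or fields[-2] not in ("+", "-"):
        raise ValueError(f"unrecognized primer name: {name!r}")
    region, nested_tag = None, None
    # Fields between the leading "S" and the strand sign are optional.
    for field in fields[1:-2]:
        if re.fullmatch(r"[FR]\d+", field):
            nested_tag = field
        else:
            region = field
    positions = [int(p) for p in fields[-1].split("-")]
    end = positions[1] if len(positions) > 1 else None
    return PrimerName(region, nested_tag, fields[-2] == "+", positions[0], end)

nested_e = parse_primer_name("S/E/F1/+/26051-26070")
sequencing_s = parse_primer_name("S/S/-/24209")
```

For antisense primers given as a range, the first position is the larger one (for example 23518-23498), so `start` and `end` are kept in the order written rather than sorted.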
[0351] The amplification products thus generated were sequenced
with the aid of specific primers in order to determine the complete
sequence of the genome of the SARS-CoV strain derived from the
sample recorded under the number 031589. These amplification
products, with the exception of those corresponding to ORF1a and
ORF1b, were then cloned into expression vectors in order to produce
the corresponding viral proteins and the antibodies directed
against these proteins, in particular by DNA-based
immunization.
1. Extraction of the RNAs
[0352] The RNAs were extracted with the aid of the QIAamp viral RNA
extraction mini kit (QIAGEN) according to the manufacturer's
recommendations. More specifically: 140 .mu.l of the sample and 560
.mu.l of AVL buffer were vigorously mixed for 15 seconds, incubated
for 10 minutes at room temperature and then briefly centrifuged at
maximum speed. 560 .mu.l of 100% ethanol were added to the
supernatant and the mixture thus obtained was very vigorously
stirred for 15 sec. 630 .mu.l of the mixture were then deposited on
the column.
[0353] The column was placed on a 2 ml tube, centrifuged for 1 min
at 8000 rpm, and then the remainder of the preceding mixture was
deposited on the same column, centrifuged again, for 1 min at 8000
rpm, and the column was transferred over a clean 2 ml tube. Next,
500 .mu.l of AW1 buffer were added to the column, and then the
column was centrifuged for 1 min at 8000 rpm and the eluate was
discarded. 500 .mu.l of AW2 buffer were added to the column which
was then centrifuged for 3 min at 14 000 rpm and transferred onto a
1.5 ml tube. Finally, 60 .mu.l of AVE buffer were added to the
column which was incubated for 1 to 2 min at room temperature and
then centrifuged for 1 min at 8000 rpm. The eluate corresponding to
the purified RNA was recovered and frozen at -20.degree. C.
2. Amplification, Sequencing and Cloning of the cDNAs
2.1) cDNA Encoding the S Protein
[0354] The RNAs extracted from the sample were subjected to reverse
transcription with the aid of random sequence hexameric
oligonucleotides (pdN6), so as to produce cDNA fragments.
[0355] The sequence encoding the SARS-CoV S glycoprotein was
amplified in the form of two overlapping DNA fragments: 5' fragment
(SARS-Sa, SEQ ID NO: 5) and 3' fragment (SARS-Sb, SEQ ID NO: 6), by
carrying out two successive amplifications with the aid of nested
primers. The amplicons thus obtained were sequenced, cloned into
the plasmid vector PCR2.1-TOPO.TM. (Invitrogen), and then the
sequence of the cloned cDNAs was determined.
a) Cloning and Sequencing of the Sa and Sb Fragments
a.1) Synthesis of the cDNA
[0356] The reaction mixture containing: RNA (5 .mu.l), H.sub.2O for
injection (3.5 .mu.l), 5.times. reverse transcriptase buffer (4
.mu.l), 5 mM dNTP (2 .mu.l), pdN6 100 .mu.g/ml (4 .mu.l), RNasin 40
IU/.mu.l (0.5 .mu.l) and reverse transcriptase AMV-RT, 10 IU/.mu.l,
PROMEGA (1 .mu.l) was incubated in a thermocycler under the
following conditions: 45 min at 42.degree. C., 15 min at 55.degree.
C., 5 min at 95.degree. C., and then the cDNA obtained was kept at
+4.degree. C.
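As a quick consistency check on the reaction mixture above, the volumes and stock concentrations can be tabulated and the final concentrations derived by simple dilution. The sketch below takes the volumes and stock concentrations from the paragraph; the dictionary layout and the function name are illustrative assumptions.

```python
# Components of the reverse transcription mix:
# name -> (volume in microliters, stock concentration, unit of the stock)
rt_mix = {
    "RNA": (5.0, None, None),
    "H2O": (3.5, None, None),
    "5x RT buffer": (4.0, 5.0, "x"),
    "dNTP": (2.0, 5.0, "mM"),
    "pdN6": (4.0, 100.0, "ug/ml"),
    "RNasin": (0.5, 40.0, "IU/ul"),
    "AMV-RT": (1.0, 10.0, "IU/ul"),
}

total_volume = sum(volume for volume, _, _ in rt_mix.values())

def final_concentration(component: str) -> float:
    """Stock concentration diluted into the full reaction volume (C1*V1 = C2*V2)."""
    volume, stock, _unit = rt_mix[component]
    return stock * volume / total_volume
```

Under these figures the components add up to a 20 .mu.l reaction, in which the buffer is at 1.times., the dNTPs at 0.5 mM and the random hexamers at 20 .mu.g/ml.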
a.2) First PCR Amplification
[0357] The 5' and 3' ends of the S gene were respectively amplified
with the pairs of primers S/F1/+/21350-21372 and
S/R1/-/23518-23498, S/F3/+/23258-23277 and S/R3/-/25382-25363. The
50 .mu.l reaction mixture containing: cDNA (2 .mu.l), 50 .mu.M
primers (0.5 .mu.l), 10.times. buffer (5 .mu.l), 5 mM dNTP (2
.mu.l), Taq Expand High Fidelity, Roche (0.75 .mu.l) and H.sub.2O
(39.75 .mu.l) was amplified in a thermocycler, under the following
conditions: an initial step of denaturation at 94.degree. C. for 2
min was followed by 40 cycles comprising: a step of denaturation at
94.degree. C. for 30 sec, a step of annealing at 55.degree. C. for
30 sec and then a step of extension at 72.degree. C. for 2 min 30
sec, with 10 sec of additional extension at each cycle, and then a
final step of extension at 72.degree. C. for 5 min.
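The cycling conditions above, including the extension step that lengthens at each cycle, can be written down as a small data structure. This is a sketch under one stated assumption: the 10 sec increment is taken to apply from the second cycle onward (the first cycle uses the base extension time), which the text does not specify. The class and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CyclingProgram:
    initial_denaturation_s: int
    cycles: int
    denaturation_s: int
    annealing_s: int
    extension_s: int
    extension_increment_s: int  # added to the extension step at each successive cycle
    final_extension_s: int

    def extension_time(self, cycle_index: int) -> int:
        """Extension time for a 0-based cycle index (first cycle has no increment)."""
        return self.extension_s + self.extension_increment_s * cycle_index

    def total_hold_time_s(self) -> int:
        """Sum of programmed hold times; ramping between temperatures is ignored."""
        cycling = sum(
            self.denaturation_s + self.annealing_s + self.extension_time(i)
            for i in range(self.cycles)
        )
        return self.initial_denaturation_s + cycling + self.final_extension_s

# First PCR amplification of the S gene fragments, times in seconds.
first_s_pcr = CyclingProgram(
    initial_denaturation_s=120,
    cycles=40,
    denaturation_s=30,
    annealing_s=30,
    extension_s=150,
    extension_increment_s=10,
    final_extension_s=300,
)
```

Under this reading the last of the 40 cycles has an extension step of 540 sec, and the programmed hold times add up to 16620 sec (about 4 h 37 min), not counting temperature ramps.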
a.3) Second PCR Amplification
[0358] The products of the first PCR amplification (5' and 3'
amplicons) were subjected to a second PCR amplification step
(nested PCR) under conditions identical to those of the first
amplification, with the pairs of primers S/F2/+/21406-21426 and
S/R2/-/23454-23435 and S/F4/+/23322-23341 and S/R4/-/25348-25329,
respectively for the 5' amplicon and the 3' amplicon.
a.4) Cloning and Sequencing of the Sa and Sb Fragments
[0359] The Sa (5' end) and Sb (3' end) amplicons thus obtained were
purified with the aid of the QIAquick PCR purification kit
(QIAGEN), following the manufacturer's instructions, and then they
were cloned into the vector PCR2.1-TOPO (Invitrogen kit), to give
the plasmids called SARS-S1 and SARS-S2.
[0360] The DNA of the Sa and Sb clones was isolated and then the
corresponding insert was sequenced with the aid of the Big Dye kit,
Applied Biosystem.RTM. and universal primers M13 forward and M13
reverse, and primers: S/S/+/21867, S/S/+/22353, S/S/+/22811,
S/S/+/23754, S/S/+/24207, S/S/+/24699, S/S/+/24348, S/S/-/24209,
S/S/-/23630, S/S/-/23038, S/S/-/22454, S/S/-/21815, S/S/-/24784,
S/S/+/21556, S/S/+/23130 and S/S/+/24465 following the
manufacturer's instructions; the sequences of the Sa and Sb
fragments thus obtained correspond to the sequences SEQ ID NO: 5
and SEQ ID NO: 6 in the sequence listing appended as an annex.
[0361] The plasmid, called SARS-S1, was deposited under the No.
I-3020, on May 12, 2003, at the Collection Nationale de Cultures de
Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15; it
contains a 5' fragment of the sequence of the S gene of the
SARS-CoV strain derived from the sample recorded under the No.
031589, as defined above, said fragment called Sa corresponding to
the nucleotides at positions 21406 to 23454 (SEQ ID NO: 5), with
reference to the Genbank sequence AY274119.3 Tor2.
[0362] The plasmid, called TOP10F'-SARS-S2, was deposited under the
No. I-3019, on May 12, 2003, at the Collection Nationale de
Cultures de Microorganismes, 25 rue du Docteur Roux, 75724 Paris
Cedex 15; it contains a 3' fragment of the sequence of the S gene
of the SARS-CoV strain derived from the sample recorded under the
No. 031589, as defined above, said fragment called Sb corresponding
to the nucleotides at positions 23322 to 25348 (SEQ ID NO: 6), with
reference to the Genbank sequence accession No. AY274119.3.
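The relationship between the two deposited fragments can be checked from their coordinates alone. The sketch below treats the positions given for Sa and Sb as 1-based inclusive intervals on AY274119.3 (an assumption of this sketch; sizes quoted elsewhere in the text may follow a different counting convention) and computes the fragment lengths, their overlap and the span they cover together. The helper names are hypothetical.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive genome coordinates

def length(interval: Interval) -> int:
    """Number of nucleotides in an inclusive interval."""
    start, end = interval
    return end - start + 1

def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Shared region of two intervals, or None if they do not overlap."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None

sa = (21406, 23454)  # Sa fragment, positions relative to AY274119.3
sb = (23322, 25348)  # Sb fragment, positions relative to AY274119.3

shared = overlap(sa, sb)
combined = (min(sa[0], sb[0]), max(sa[1], sb[1]))
```

Counted this way, Sa and Sb are 2049 and 2027 nucleotides long, share a 133-nucleotide overlap (23322 to 23454) and together span 3943 nucleotides, in line with the description of the complete S cDNA as a clone of about 4 kb.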
b) Cloning and Sequencing of the Complete cDNA (SARS-S Clone of 4
kb)
[0363] The complete S cDNA was obtained from the abovementioned
clones SARS-S1 and SARS-S2, in the following manner:
[0364] 1) A PCR amplification reaction was carried out on a SARS-S2
clone in the presence of the above-mentioned primer
S/R4/-/25348-25329 and of the primer S/S/+/24696-24715: an amplicon
of 633 bp was obtained,
[0365] 2) Another PCR amplification reaction was carried out on
another SARS-S2 clone, in the presence of the primers
S/F4/+/23322-23341 mentioned above and S/S/-/24803-24784: an
amplicon of 1481 bp was obtained.
[0366] The amplification reaction was carried out under the
conditions as defined above for the amplification of the Sa and Sb
fragments, with the exception that 30 amplification cycles
comprising a step of denaturation at 94.degree. C. for 20 sec and a
step of extension at 72.degree. C. for 2 min 30 sec were carried
out.
[0367] 3) The 2 amplicons (633 bp and 1481 bp) were purified under
the conditions as defined above for the Sa and Sb fragments.
[0368] 4) Another PCR amplification reaction with the aid of the
abovementioned primers S/F4/+/23322-23341 and S/R4/-/25348-25329
was carried out on the purified amplicons obtained in 3). The
amplification reaction was carried out under the conditions as
defined above for the amplification of the Sa and Sb fragments,
except that 30 amplification cycles were performed.
[0369] The 2026 bp amplicon thus obtained was purified, cloned into
the vector PCR2.1-TOPO and then sequenced as above, with the aid of
the primers as defined above for the Sa and Sb fragments. The clone
thus obtained was called clone 3'.
[0370] 5) The clone SARS-S1 obtained above and the clone 3' were
digested with EcoR I, the bands of about 2 kb thus obtained were
gel purified and then amplified by PCR with the abovementioned
primers S/F2/+/21406-21426 and S/R4/-/25348-25329. The
amplification reaction was carried out under the conditions as
defined above for the amplification of the Sa and Sb fragments,
except that 30 amplification cycles were performed. The amplicon of
about 4 kb was purified and sequenced. It was then cloned into the
vector PCR2.1-TOPO in order to give the plasmid, called SARS-S, and
the insert obtained in this plasmid was sequenced as above, with
the aid of the primers as defined above for the Sa and Sb
fragments. The cDNA sequences of the insert and of the amplicon
encoding the S protein correspond respectively to the sequences SEQ
ID NO: 4 and SEQ ID NO: 2 in the sequence listing appended as an
annex, they encode the S protein (SEQ ID NO: 3).
[0371] The sequence of the amplicon corresponding to the cDNA
encoding the S protein of the SARS-CoV strain derived from the
sample No. 031589 has the following two mutations compared with the
corresponding sequences of respectively the Tor2 and Urbani
isolates, the positions of the mutations being indicated with
reference to the complete sequence of the genome of the Tor2
isolate (Genbank AY274119.3):
[0372] g/t in position 23220: the alanine codon (gct) in position 577
of the amino acid sequence of the S protein of Tor2 is replaced with
a serine codon (tct),
[0373] c/t in position 24872: this mutation does not modify the amino
acid sequence of the S protein.
The plasmid, called SARS-S, was deposited under the No. I-3059, on
Jun. 20, 2003, at the Collection Nationale de Cultures de
Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15; it
contains the cDNA sequence encoding the S protein of the SARS-CoV
strain derived from the sample recorded under the No. 031589, said
sequence corresponding to the nucleotides at positions 21406 to
25348 (SEQ ID NO: 4), with reference to the Genbank sequence
AY274119.3.
2.2) cDNA Encoding the M and E Proteins
[0374] The RNAs derived from the sample 031589, extracted as above,
were subjected to a reverse transcription, combined, during the
same step (Titan One Step RT-PCR.RTM. kit, Roche), with a PCR
amplification reaction, with the aid of the pairs of primers:
[0375] S/E/F1/+/26051-26070 and S/E/R1/-/26455-26436 in order to
amplify ORF-E, and
[0376] S/M/F1/+/26225-26244 and S/M/R1/-/27148-27129 in order to
amplify ORF-M.
[0377] A first reaction mixture containing: 8.6 .mu.l of H.sub.2O
for injection, 1 .mu.l of dNTP (5 mM), 0.2 .mu.l of each of the
primers (50 .mu.M), 1.25 .mu.l of DTT (100 mM) and 0.25 .mu.l of
RNAsin (40 IU/.mu.l) was combined with a second reaction mixture
containing: 1 .mu.l of RNA, 7 .mu.l of H.sub.2O for injection, 5
.mu.l of 5.times.RT-PCR buffer and 0.5 .mu.l of enzyme mixture and
the combined mixtures were incubated in a thermocycler under the
following conditions: 30 min at 42.degree. C., 10 min at 55.degree.
C., 2 min at 94.degree. C. followed by 40 cycles comprising a step
of denaturation at 94.degree. C. for 10 sec, a step of annealing at
55.degree. C. for 30 sec and a step of extension at 68.degree. C.
for 45 sec, with 3 sec increment per cycle and finally a step of
terminal extension at 68.degree. C. for 7 min.
[0378] The amplification products thus obtained (M and E amplicons)
were subjected to a second PCR amplification (nested PCR) using the
Expand High-Fi.RTM. kit (Roche), with the aid of the pairs of
primers:
[0379] S/E/F2/+/26082-26101 and S/E/R2/-/26413-26394 for the
amplicon E, and
[0380] S/M/F2/+/26330-26350 and S/M/R2/-/27098-27078 for the
amplicon M.
[0381] The reaction mixture containing: 2 .mu.l of the product of
the first PCR, 39.25 .mu.l of H.sub.2O for injection, 5 .mu.l of
10.times. buffer containing MgCl.sub.2, 2 .mu.l of dNTP (5 mM), 0.5
.mu.l of each of the primers (50 .mu.M) and 0.75 .mu.l of enzyme
mixture was incubated in a thermocycler under the following
conditions: a step of denaturation at 94.degree. C. for 2 min was
followed by 30 cycles comprising a step of denaturation at
94.degree. C. for 15 sec, a step of annealing at 60.degree. C. for
30 sec and a step of extension at 72.degree. C. for 45 sec, with 3
sec increment per cycle, and finally a step of terminal extension
at 72.degree. C. for 7 min. The amplification products obtained
corresponding to the cDNAs encoding the E and M proteins were
sequenced as above, with the aid of the primers: S/E/F2/+/26082 and
S/E/R2/-/26394, S/M/F2/+/26330, S/M/R2/-/27078 cited above and the
primers S/M/+/26636-26655 and S/M/-/26567-26548. They were then
cloned, as above, in order to give the plasmids called SARS-E and
SARS-M. The DNA of these clones was then isolated and sequenced
with the aid of the universal primers M13 forward and M13 reverse
and the primers S/M/+/26636 and S/M/-/26548 mentioned above.
[0382] The sequence of the amplicon representing the cDNA encoding
the E protein (SEQ ID NO: 13) of the SARS-CoV strain derived from
the sample No. 031589 does not contain differences in relation to
the corresponding sequences of the isolates AY274119.3-Tor2 and
AY278741-Urbani. The sequence of the E protein of the SARS-CoV
031589 strain corresponds to the sequence SEQ ID NO: 14 in the
sequence listing appended as an annex.
[0383] The plasmid, called SARS-E, was deposited under the No.
I-3046, on May 28, 2003, at the Collection Nationale de Cultures de
Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15; it
contains the cDNA sequence encoding the E protein of the SARS-CoV
strain derived from the sample recorded under the No. 031589, as
defined above, said sequence corresponding to the nucleotides at
positions 26082 to 26413 (SEQ ID NO: 15), with reference to the
Genbank sequence accession No. AY274119.3.
[0384] The sequence of the amplicon representing the cDNA encoding
M (SEQ ID NO: 16) from the SARS-CoV strain derived from the sample
No. 031589 does not contain differences in relation to the
corresponding sequence of the isolate AY274119.3-Tor2. By contrast,
at position 26857, the isolate AY278741-Urbani contains a c and the
sequence of the SARS-CoV strain derived from the sample recorded
under the No. 031589 contains a t. This mutation results in a
modification of the amino acid sequence of the corresponding
protein: at position 154, a proline (AY278741-Urbani) is changed to
serine in the SARS-CoV strain derived from the sample recorded
under the No. 031589. The sequence of the M protein of the SARS-CoV
strain derived from the sample recorded under the No. 031589
corresponds to the sequence SEQ ID NO: 17 in the sequence listing
appended as an annex.
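The link between a nucleotide position in the genome and the affected codon can be illustrated with a short calculation. In the sketch below, the standard genetic code is built from its usual compact form, and the coding-sequence start positions of the M and S genes in AY274119.3 (26398 and 21492) are assumptions introduced for the illustration; they are not stated in the text. The function and constant names are hypothetical.

```python
from itertools import product

BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, one-letter amino acids; "*" marks a stop codon.
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def codon_position(genome_pos: int, cds_start: int):
    """Return (1-based codon number, 0-based offset within the codon)."""
    offset = genome_pos - cds_start
    return offset // 3 + 1, offset % 3

# ASSUMPTION: CDS start coordinates in AY274119.3, not given in the text.
M_CDS_START = 26398
S_CDS_START = 21492

m_codon = codon_position(26857, M_CDS_START)
s_codon = codon_position(23220, S_CDS_START)

# c -> t at the first base of any proline codon (ccn) gives a serine codon (tcn).
pro_to_ser = all(
    CODON_TABLE["c" + rest] == "P" and CODON_TABLE["t" + rest] == "S"
    for rest in ("ct", "cc", "ca", "cg")
)
```

With the assumed start positions, position 26857 falls on the first base of codon 154 of the M gene, where a c to t change converts any proline codon into a serine codon, and position 23220 falls on the first base of codon 577 of the S gene, where gct (alanine) becomes tct (serine).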
[0385] The plasmid, called SARS-M, was deposited under the No.
I-3047, on May 28, 2003, at the Collection Nationale de Cultures de
Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15; it
contains the cDNA sequence encoding the M protein of the SARS-CoV
strain derived from the sample recorded under the No. 031589, as
defined above; said sequence corresponding to the nucleotides at
positions 26330 to 27098 (SEQ ID NO: 18), with reference to the
Genbank sequence accession No. AY274119.3.
2.3) cDNA Corresponding to ORF3, ORF4, ORF7 to ORF11
[0386] The same amplification, cloning and sequencing strategy was
used to obtain the cDNA fragments corresponding respectively to the
following ORFs: ORF3, ORF4, ORF7, ORF8, ORF9, ORF10 and ORF11. The
pairs of primers used for the first amplification are:
[0387] ORF3 and ORF4: S/SE/F1/+/25069-25088 and S/SE/R1/-/26300-26281
[0388] ORF7 to ORF11: S/MN/F1/+/26898-26917 and S/MN/R1/-/28287-28266
[0389] The pairs of primers used for the second amplification are:
[0390] ORF3 and ORF4: S/SE/F2/+/25110-25129 and S/SE/R2/-/26244-26225
[0391] ORF7 to ORF11: S/MN/F2/+/26977-26996 and S/MN/R2/-/28218-28199
[0392] The conditions for the first amplification (RT-PCR) are the
following: 45 min at 42.degree. C., 10 min at 55.degree. C., 2 min
at 94.degree. C. followed by 40 cycles comprising a step of
denaturation at 94.degree. C. for 15 sec, a step of annealing at
58.degree. C. for 30 sec and a step of extension at 68.degree. C.
for 1 min, with 5 sec increment per cycle and finally a step of
terminal extension at 68.degree. C. for 7 min.
[0393] The conditions for the nested PCR are the following: a step
of denaturation at 94.degree. C. for 2 min was followed by 40
cycles comprising a step of denaturation at 94.degree. C. for 20
sec, a step of annealing at 58.degree. C. for 30 sec and a step of
extension at 72.degree. C. for 50 sec, with 4 sec increment per
cycle and finally a step of terminal extension at 72.degree. C. for
7 min.
[0394] The amplification products obtained corresponding to the
cDNAs containing respectively ORF3 and 4 and ORF7 to 11 were
sequenced with the aid of the primers: S/SE/+/25363, S/SE/+/25835,
S/SE/-/25494, S/SE/-/25875, S/MN/+/27839, S/MN/+/27409,
S/MN/-/27836, S/MN/-/27799 and cloned as above for the other ORFs,
to give the plasmids called SARS-SE and SARS-MN. The DNA of these
clones was isolated and sequenced with the aid of these same
primers and of the universal primers M13 sense and M13
antisense.
[0395] The sequence of the amplicon representing the cDNA of the
region containing ORF3 and ORF4 (SEQ ID NO: 7) of the SARS-CoV
strain derived from the sample No. 031589 contains a nucleotide
difference in relation to the corresponding sequence of the isolate
AY274119-Tor2. This mutation at position 25298 results in a
modification of the amino acid sequence of the corresponding
protein (ORF3): at position 11, an arginine (AY274119-Tor2) is
changed to glycine in the SARS-CoV strain derived from the sample
No. 031589. By contrast, no mutation was identified in relation to
the corresponding sequence of the isolate AY278741-Urbani. The
sequences of ORF3 and 4 of the SARS-CoV strain derived from the
sample No. 031589 correspond respectively to the sequences SEQ ID
NO: 10 and 12 in the sequence listing appended as an annex.
[0396] The plasmid, called SARS-SE, was deposited under the No.
I-3126, on Nov. 13, 2003, at the Collection Nationale de Cultures
de Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15;
it contains the cDNA corresponding to the region situated between
ORF-S and ORF-E and overlapping ORF-E of the SARS-CoV strain
derived from the sample recorded under the No. 031589, as defined
above, said region corresponding to the nucleotides at positions
25110 to 26244 (SEQ ID NO: 8), with reference to the Genbank
sequence accession No. AY274119.3.
[0397] The sequence of the amplicon representing the cDNA
corresponding to the region containing ORF7 to ORF11 (SEQ ID NO:
19) of the SARS-CoV strain derived from the sample No. 031589 does
not contain differences in relation to the corresponding sequences
of the isolates AY274119-Tor2 and AY278741-Urbani. The sequences of
ORF7 to 11 of the SARS-CoV strain derived from the sample No.
031589 correspond respectively to the sequences SEQ ID NO: 22, 24,
26, 28 and 30 in the sequence listing appended as an annex.
[0398] The plasmid, called SARS-MN, was deposited under the No.
I-3125, on Nov. 13, 2003, at the Collection Nationale de Cultures
de Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15;
it contains the cDNA sequence corresponding to the region situated
between ORF-M and ORF-N of the SARS-CoV strain derived from the
sample recorded under the No. 031589 and collected in Hanoi, as
defined above, said sequence corresponding to the nucleotides at
positions 26977 to 28218 (SEQ ID NO: 20), with reference to the
Genbank sequence accession No. AY274119.3.
2.4) cDNA Encoding the N Protein and Including ORF13 and ORF14
[0400] The cDNA was synthesized and amplified as described above
for the fragments Sa and Sb. More specifically, the reaction
mixture containing: 5 .mu.l of RNA, 5 .mu.l of H.sub.2O for
injection, 4 .mu.l of 5.times. reverse transcriptase buffer, 2
.mu.l of dNTP (5 mM), 2 .mu.l of oligo 20T (5 .mu.M), 0.5 .mu.l of
RNasin (40 IU/.mu.l) and 1.5 .mu.l of AMV-RT (10 IU/.mu.l Promega)
was incubated in a thermocycler under the following conditions: 45
min at 42.degree. C., 15 min at 55.degree. C., 5 min at 95.degree.
C., and it was then kept at +4.degree. C.
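The composition of the reverse transcription mixture above can be checked with a short calculation. The following is only a minimal sketch: the volumes and stock concentrations are those quoted in the text, while the helper name `final_concentration` and the dictionary keys are illustrative assumptions.

```python
# Volumes in microliters, as listed for the reverse transcription mixture.
components_ul = {
    "RNA": 5.0,
    "H2O": 5.0,
    "5x RT buffer": 4.0,
    "dNTP (5 mM stock)": 2.0,
    "oligo 20T (5 uM stock)": 2.0,
    "RNasin (40 IU/ul)": 0.5,
    "AMV-RT (10 IU/ul)": 1.5,
}


def final_concentration(stock, volume_ul, total_ul):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock * volume_ul / total_ul


total_ul = sum(components_ul.values())

# Final concentrations in the assembled reaction.
buffer_x = final_concentration(5.0, components_ul["5x RT buffer"], total_ul)
dntp_mM = final_concentration(5.0, components_ul["dNTP (5 mM stock)"], total_ul)
oligo_uM = final_concentration(5.0, components_ul["oligo 20T (5 uM stock)"], total_ul)

# Total enzyme units added (units per microliter times volume).
rnasin_units = 40.0 * components_ul["RNasin (40 IU/ul)"]
rt_units = 10.0 * components_ul["AMV-RT (10 IU/ul)"]

print(f"total volume: {total_ul} ul")
print(f"buffer: {buffer_x}x, dNTP: {dntp_mM} mM, oligo 20T: {oligo_uM} uM")
print(f"RNasin: {rnasin_units} IU, AMV-RT: {rt_units} IU")
```

Under these figures the buffer is brought to its 1.times. working strength in the assembled reaction.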
[0401] A first PCR amplification was performed with the pair of
primers S/N/F3/+/28023 and S/N/R3/-/29480.
[0402] The reaction mixture as above for the amplification of the
S1 and S2 fragments was incubated in a thermo-cycler, under the
following conditions: an initial step of denaturation at 94.degree.
C. for 2 min was followed by 40 cycles comprising a step of
denaturation at 94.degree. C. for 20 sec, a step of annealing at
55.degree. C. for 30 sec and then a step of extension at 72.degree.
C. for 1 min 30 sec with 10 sec of additional extension at each
cycle, and then a final step of extension at 72.degree. C. for 5
min.
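The effect of the per-cycle extension increment on the length of this program can be estimated as follows. This is a sketch under stated assumptions: the increment is taken to start at zero in the first cycle and to grow by 10 sec per cycle, temperature ramping is ignored, and the function name `cycling_seconds` is hypothetical.

```python
def cycling_seconds(initial, cycles, denature, anneal, extend, increment, final):
    """Sum of the programmed hold times of a PCR run, in seconds.

    The extension step of cycle k (k = 0, 1, ...) is assumed to last
    extend + increment * k seconds. Ramp times are not included.
    """
    total = initial + final
    for cycle in range(cycles):
        total += denature + anneal + extend + increment * cycle
    return total


# First PCR amplification of the N region, values taken from the text:
# 94 C for 2 min, then 40 cycles of 20 sec / 30 sec / 1 min 30 sec
# (+10 sec per cycle), then 72 C for 5 min.
first_pcr = cycling_seconds(
    initial=120, cycles=40, denature=20, anneal=30,
    extend=90, increment=10, final=300,
)
last_extension = 90 + 10 * (40 - 1)

print(f"programmed hold time: {first_pcr} s ({first_pcr / 3600:.2f} h)")
print(f"extension step in the last cycle: {last_extension} s")
```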
[0403] The amplicon obtained at the first PCR amplification was
subjected to a second PCR amplification step (nested PCR) with the
pair of primers S/N/F4/+/28054 and S/N/R4/-/29430 under conditions
identical to those of the first amplification.
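The relationship between the outer and nested primer pairs can be illustrated from the primer names alone. The sketch below assumes that the number at the end of each name is the outermost genome coordinate of the primer (with reference to AY274119.3), so that the amplicon spans the two numbers inclusively; the functions `parse_primer` and `amplicon` are illustrative and not part of the described method.

```python
import re


def parse_primer(name):
    """Split a primer name such as 'S/N/F3/+/28023' into (strand, position)."""
    match = re.search(r"/([+-])/(\d+)$", name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name}")
    return match.group(1), int(match.group(2))


def amplicon(sense_name, antisense_name):
    """Return (start, end, length) of the amplicon bounded by two primers."""
    sense_strand, start = parse_primer(sense_name)
    antisense_strand, end = parse_primer(antisense_name)
    if (sense_strand, antisense_strand) != ("+", "-") or start > end:
        raise ValueError("expected a '+' primer upstream of a '-' primer")
    return start, end, end - start + 1


outer = amplicon("S/N/F3/+/28023", "S/N/R3/-/29480")
inner = amplicon("S/N/F4/+/28054", "S/N/R4/-/29430")

# A nested amplicon must lie entirely within the first-round amplicon.
nested_ok = outer[0] <= inner[0] and inner[1] <= outer[1]

print(f"first PCR:  {outer[0]}-{outer[1]} ({outer[2]} bp)")
print(f"nested PCR: {inner[0]}-{inner[1]} ({inner[2]} bp)")
print(f"nested amplicon inside outer amplicon: {nested_ok}")
```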
[0404] The amplification product obtained, corresponding to the
cDNA encoding the N protein of the SARS-CoV strain derived from the
sample No. 031589, was sequenced with the aid of the primers:
S/N/F4/+/28054, S/N/R4/-/29430, S/N/+/28468, S/N/+/28918 and
S/N/-/28607 and cloned as above for the other ORFs, to give the
plasmid called SARS-N. The DNA of these clones was isolated and
sequenced with the aid of the universal primers M13 sense and M13
antisense, and the primers S/N/+/28468, S/N/+/28918 and
S/N/-/28607.
[0405] The sequence of the amplicon representing the cDNA
corresponding to ORF-N and including ORF13 and ORF14 (SEQ ID NO:
36) of the SARS-CoV strain derived from the sample No. 031589 does
not contain differences in relation to the corresponding sequences
of the isolates AY274119.3-Tor2 and AY278741-Urbani. The sequence
of the N protein of the SARS-CoV strain derived from the sample No.
031589 corresponds to the sequence SEQ ID NO: 37 in the sequence
listing appended as an annex.
[0406] The sequences of ORF13 and 14 of the SARS-CoV strain derived
from the sample No. 031589 correspond respectively to the sequences
SEQ ID NO: 32 and 34 in the sequence listing appended as an
annex.
[0407] The plasmid, called SARS-N, was deposited under the No.
I-3048, on Jun. 5, 2003, at the Collection Nationale de Cultures de
Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15; it
contains the cDNA encoding the N protein of the SARS-CoV strain
derived from the sample recorded under the No. 031589, as defined
above, said sequence corresponding to the nucleotides at positions
28054 to 29430 (SEQ ID NO: 38), with reference to the Genbank
sequence accession No. AY274119.3.
2.5) Noncoding 5' and 3' Ends
a) Noncoding 5' end (5'NC)
a.sub.1) Synthesis of the cDNA
[0408] The RNAs derived from the sample 031589, extracted as above,
were subjected to reverse transcription under the following
conditions:
[0409] The RNA (15 .mu.l) and the primer S/L/-/443 (3 .mu.l at the
concentration of 5 .mu.M) were incubated for 10 min at 75.degree.
C.
[0410] Next, the 5.times. reverse transcriptase buffer (6 .mu.l,
INVITROGEN), 10 mM dNTP (1 .mu.l), 0.1 M DTT (3 .mu.l) were added
and the mixture was incubated at 50.degree. C. for 3 min.
[0411] Finally, the reverse transcriptase (3 .mu.l of
Superscript.RTM., INVITROGEN) was added to the preceding mixture
which was incubated at 50.degree. C. for 1 h 30 min and then at
90.degree. C. for 2 min.
[0412] The cDNA thus obtained was purified with the aid of the
QIAquick PCR purification kit (QIAGEN), according to the
manufacturer's recommendations.
b.sub.1) Terminal Transferase Reaction (TdT)
[0413] The cDNA (10 .mu.l) is incubated for 2 min at 100.degree.
C., stored in ice, and the following are then added: H.sub.2O (2.5
.mu.l), 5.times.TdT buffer (4 .mu.l, AMERSHAM), 5 mM dATP (2 .mu.l)
and TdT (1.5 .mu.l, AMERSHAM). The mixture thus obtained is
incubated for 45 min at 37.degree. C. and then for 2 min at
65.degree. C.
[0414] The product obtained is amplified by a first PCR reaction
with the aid of the primers: S/L/-/225-206 and anchor 14T:
5'-AGATGAATTCGGTACCTTTTTTTTTTTTTT-3' (SEQ ID NO: 68). The
amplification conditions are the following: an initial step of
denaturation at 94.degree. C. for 2 min is followed by 10 cycles
comprising a step of denaturation at 94.degree. C. for 10 sec, a
step of annealing at 45.degree. C. for 30 sec and then a step of
extension at 72.degree. C. for 30 sec and then by 30 cycles
comprising a step of denaturation at 94.degree. C. for 10 sec, a
step of annealing at 50.degree. C. for 30 sec and then a step of
extension at 72.degree. C. for 30 sec, and then a final step of
extension at 72.degree. C. for 5 min.
[0415] The product of the first PCR amplification was subjected to
a second amplification step with the aid of the primers:
S/L/-/204-185 and anchor 14T mentioned above under conditions
identical to those of the first amplification. The amplicon thus
obtained was purified, sequenced with the aid of the primer
S/L/-/182-163 and it was then cloned as above for the different
ORFs, to give the plasmid called SARS-5'NC. The DNA of this clone
was isolated and sequenced with the aid of the universal primers
M13 sense and M13 antisense and the primer S/L/-/182-163 mentioned
above.
[0416] The amplicon representing the cDNA corresponding to the 5'NC
end of the SARS-CoV strain derived from the sample recorded under
the No. 031589 corresponds to the sequence SEQ ID NO: 72 in the
sequence listing appended as an annex; this sequence does not
contain differences in relation to the corresponding sequences of
the isolates AY274119.3-Tor2 and AY278741-Urbani.
[0417] The plasmid, called SARS-5'NC, was deposited under the No.
I-3124, on Nov. 7, 2003, at the Collection Nationale de Cultures de
Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15; it
contains the cDNA corresponding to the noncoding 5' end of the
genome of the SARS-CoV strain derived from the sample recorded
under the No. 031589, as defined above, said sequence corresponding
to the nucleotides at positions 1 to 204 (SEQ ID NO: 39), with
reference to the Genbank sequence accession No. AY274119.3.
b) Noncoding 3' End (3'NC)
a.sub.1) Synthesis of the cDNA
[0418] The RNAs derived from the sample 031589, extracted as above,
were subjected to reverse transcription, according to the following
protocol: the reaction mixture containing: RNA (5 .mu.l), H.sub.2O
(5 .mu.l), 5.times. reverse transcriptase buffer (4 .mu.l), 5 mM
dNTP (2 .mu.l), 5 .mu.M Oligo 20T (2 .mu.l), 40 U/.mu.l RNasin (0.5
.mu.l) and 10 IU/.mu.l RT-AMV (1.5 .mu.l, PROMEGA) was incubated in
a thermo-cycler, under the following conditions: 45 min at
42.degree. C., 15 min at 55.degree. C., 5 min at 95.degree. C., and
it was then kept at +4.degree. C.
[0419] The cDNA obtained was amplified by a first PCR reaction with
the aid of the primers S/N/+/28468-28487 and anchor 14T mentioned
above. The amplification conditions are the following: an initial
step of denaturation at 94.degree. C. for 2 min is followed by 10
cycles comprising a step of denaturation at 94.degree. C. for 20
sec, a step of annealing at 45.degree. C. for 30 sec and then a
step of extension at 72.degree. C. for 50 sec and then 30 cycles
comprising a step of denaturation at 94.degree. C. for 20 sec, a
step of annealing at 50.degree. C. for 30 sec and then a step of
extension at 72.degree. C. for 50 sec, and then a final step of
extension at 72.degree. C. for 5 min.
[0420] The product of the first PCR amplification was subjected to
a second amplification step with the aid of the primers
S/N/+/28933-28952 and anchor 14T mentioned above, under conditions
identical to those of the first amplification. The amplicon thus
obtained was purified, sequenced with the aid of the primer
S/N/+/29257-29278 and cloned as above for the different ORFs, to
give the plasmid called SARS-3'NC. The DNA of this clone was
isolated and sequenced with the aid of the universal primers M13
sense and M13 antisense and the primer S/N/+/29257-29278 mentioned
above.
[0421] The amplicon representing the cDNA corresponding to the 3'NC
end of the SARS-CoV strain derived from the sample recorded under
the No. 031589 corresponds to the sequence SEQ ID NO: 73 in the
sequence listing appended as an annex; this sequence does not
contain differences in relation to the corresponding sequences of
the isolates AY274119.3-Tor2 and AY278741-Urbani.
[0422] The plasmid called SARS-3'NC was deposited under the No.
I-3123 on Nov. 7, 2003, at the Collection Nationale de Cultures de
Microorganismes, 25 rue du Docteur Roux, 75724 Paris Cedex 15; it
contains the cDNA sequence corresponding to the noncoding 3' end of
the genome of the SARS-CoV strain derived from the sample recorded
under the No. 031589, as defined above; said sequence, which corresponds
to the nucleotides at positions 28933 to 29727 (SEQ ID NO: 40), with
reference to the Genbank sequence accession No. AY274119.3, ends with
a series of a nucleotides.
2.6) ORF1a and ORF1b
[0423] The amplification of the 5' region containing ORF1a and
ORF1b of the SARS-CoV genome derived from the sample 031589 was
performed by carrying out RT-PCR reactions followed by nested PCRs
according to the same principles as those described above for the
other ORFs. The amplified fragments overlap over several tens of
bases, thus allowing computer reconstruction of the complete
sequence of this part of the genome. On average, the amplified
fragments are of two kilobases.
[0424] 14 overlapping fragments, called L0 to L12, were thus
amplified with the aid of the following primers:
TABLE II Primers used for the amplification of the 5' region (ORF1a and ORF1b)
Fragment | RT-PCR sense primer | RT-PCR antisense primer | Nested PCR sense primer | Nested PCR antisense primer | Region amplified and sequenced (does not include the primers)
L0 | S/L0/F1/+30 | S/L0/R1/-481 | | | 50-480
L1 | S/L1/F1/+147 | S/L1/R1/-2338 | S/L1/F2/+211 | S/L1/R2/-2241 | 231-2240
L2 | S/L2/F1/+2033 | S/L2/R1/-4192 | S/L2/F2/+2136 | S/L2/R2/-4168 | 2156-4167
L3 | S/L3bis/F1/+3850 | S/L3bis/R1/-5365 | S/L3bis/F2/+3892 | S/L3bis/R2/-5325 | 3913-5324
L4b | S/L4b/F1/+4878 | S/L4b/R1/-6061 | S/L4b/F2/+4932 | S/L4b/R2/-6024 | 4952-6023
L4 | S/L4/F1/+5272 | S/L4/R1/-7392 | S/L4/F2/+5305 | S/L4/R2/-7323 | 5325-7318
L5 | S/L5/F1/+7111 | S/L5/R1/-9253 | S/L5/F2/+7275 | S/L5/R2/-9157 | 7296-9156
L6 | S/L6/F1/+8975 | S/L6/R1/-11151 | S/L6/F2/+9032 | S/L6/R2/-11067 | 9053-11066
L7 | S/L7/F1/+10883 | S/L7/R1/-13050 | S/L7/F2/+10928 | S/L7/R2/-12963 | 10928-12962
L8 | S/L8/F1/+12690 | S/L8/R1/-14857 | S/L8/F2/+12815 | S/L8/R2/-14835 | 12835-14834
L9 | S/L9/F1/+14688 | S/L9/R1/-16678 | S/L9/F2/+14745 | S/L9/R2/-16625 | 14765-16624
L10 | S/L10/F1/+16451 | S/L10/R1/-18594 | S/L10/F2/+16514 | S/L10/R2/-18571 | 16534-18570
L11 | S/L11/F1/+18441 | S/L11/R1/-20612 | S/L11/F2/+18500 | S/L11/R2/-20583 | 18521-20582
L12 | S/L12/F1/+20279 | S/L12/R1/-22229 | S/L12/F2/+20319 | S/L12/R2/-22206 | 20338-22205
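The tiling of the 5' region by the fragments of Table II can be checked numerically. The sketch below uses only the "region amplified and sequenced" coordinates given in the table and computes the length of each fragment and its overlap with the next one; the variable names are illustrative, and inclusive coordinates are assumed.

```python
# (fragment, first nucleotide, last nucleotide), inclusive, from Table II.
regions = [
    ("L0", 50, 480), ("L1", 231, 2240), ("L2", 2156, 4167),
    ("L3", 3913, 5324), ("L4b", 4952, 6023), ("L4", 5325, 7318),
    ("L5", 7296, 9156), ("L6", 9053, 11066), ("L7", 10928, 12962),
    ("L8", 12835, 14834), ("L9", 14765, 16624), ("L10", 16534, 18570),
    ("L11", 18521, 20582), ("L12", 20338, 22205),
]

# Length of each sequenced region.
lengths = {name: end - start + 1 for name, start, end in regions}

# Overlap between each fragment and the following one; a value of zero or
# less would indicate a gap in the reconstructed sequence.
overlaps = [
    (name_a, name_b, end_a - start_b + 1)
    for (name_a, _, end_a), (name_b, start_b, _) in zip(regions, regions[1:])
]
contiguous = all(size > 0 for _, _, size in overlaps)

for name_a, name_b, size in overlaps:
    print(f"{name_a}/{name_b}: {size} nt overlap")
print(f"mean fragment length: {sum(lengths.values()) / len(lengths):.0f} nt")
print(f"contiguous coverage of {regions[0][1]}-{regions[-1][2]}: {contiguous}")
```

A check of this kind only confirms that the listed coordinates are mutually consistent; it says nothing about the sequences themselves.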
[0425] All the fragments were amplified under the following
conditions, except fragment L0 which was amplified as described
above for ORF-M:
[0426] RT-PCR: 30 min at 42.degree. C., 15 min at
55.degree. C., 2 min at 94.degree. C., and then the cDNA obtained
is amplified under the following conditions: 40 cycles comprising:
a step of denaturation at 94.degree. C. for 15 sec, a step of
annealing at 58.degree. C. for 30 sec and then a step of extension
at 68.degree. C. for 1 min 30 sec, with 5 sec additional extension
at each cycle, and then a final step of extension at 68.degree. C.
for 7 min.
[0427] Nested PCR: An initial step of denaturation at
94.degree. C. for 2 min is followed by 35 cycles comprising: a step
of denaturation at 94.degree. C. for 15 sec, a step of annealing at
60.degree. C. for 30 sec and then a step of extension at 72.degree.
C. for 1 min 30 sec, with 5 sec of additional extension at each
cycle, and then a final step of extension at 72.degree. C. for 7
min.
[0428] The amplification products were sequenced with the aid of
the primers defined in table III below:
TABLE III Primers used for the sequencing of the 5' region (ORF1a and ORF1b)
Name | Sequence (SEQ ID NO: 76 to 139)
S/L3/+/4932 | 5'-CCACACACAGCTTGTGGATA-3'
S/L4/+/6401 | 5'-CCGAAGTTGTAGGCAATGTC-3'
S/L4/+/6964 | 5'-TTTGGTGCTCCTTCTTATTG-3'
S/L4/-/6817 | 5'-CCGGCATCCAAACATAATTT-3'
S/L5/-/7633 | 5'-TGGTCAGTAGGGTTGATTGG-3'
S/L5/-/8127 | 5'-CATCCTTTGTGTCAACATCG-3'
S/L5/-/8633 | 5'-GTCACGAGTGACACCATCCT-3'
S/L5/+/7839 | 5'-ATGCGACGAGTCTGCTTCTA-3'
S/L5/+/8785 | 5'-TTCATAGTGCCTGGCTTACC-3'
S/L5/+/8255 | 5'-ATCTTGGCGCATGTATTGAC-3'
S/L6/-/9422 | 5'-TGCATTAGCAGCAACAACAT-3'
S/L6/-/9966 | 5'-TCTGCAGAACAGCAGAAGTG-3'
S/L6/-/10542 | 5'-CCTGTGCAGTTTGTCTGTCA-3'
S/L6/+/10677 | 5'-CCTTGTGGCAATGAAGTACA-3'
S/L6/+/10106 | 5'-ATGTCATTTGCACAGCAGAA-3'
S/L6/+/9571 | 5'-CTTCAATGGTTTGCCATGTT-3'
S/L7/-/11271 | 5'-TGCGAGCTGTCATGAGAATA-3'
S/L7/-/11801 | 5'-AACCGAGAGCAGTACCACAG-3'
S/L7/-/12383 | 5'-TTTGGCTGCTGTAGTCAATG-3'
S/L7/+/12640 | 5'-CTACGACAGATGTCCTGTGC-3'
S/L7/+/12088 | 5'-GAGCAGGCTGTAGCTAATGG-3'
S/L7/+/11551 | 5'-TTAGGCTATTGTTGCTGCTG-3'
S/L8/-13160 | 5'-CAGACAACATGAAGCACCAC-3'
S/L8/-/13704 | 5'-CGCTGACGTGATATATGTGG-3'
S/L8/-14284 | 5'-TGCACAATGAAGGATACACC-3'
S/L8/+/14453 | 5'-ACATAGCTCGCGTCTCAGTT-3'
S/L8/+/13968 | 5'-GGCATTGTAGGCGTACTGAC-3'
S/L8/+/13401 | 5'-GTTTGCGGTGTAAGTGCAG-3'
S/L9/-15098 | 5'-TAGTGGCGGCTATTGACTTC-3'
S/L9/-15677 | 5'-CTAAACCTTGAGCCGCATAG-3'
S/L9/-16247 | 5'-CATGGTCATAGCAGCACTTG-3'
S/L9/+16323 | 5'-CCAGGTTGTGATGTCACTGAT-3'
S/L9/+15858 | 5'-CCTTACCCAGATCCATCAAG-3'
S/L9/+15288 | 5'-CGCAAACATAACACTTGCTG-3'
S/L10/-16914 | 5'-AGTGTTGGGTACAAGCCAGT-3'
S/L10/-17466 | 5'-GTTCCAAGGAACATGTCTGG-3'
S/L10/-18022 | 5'-AGGTGCCTGTGTAGGATGAA-3'
S/L10/+18245 | 5'-GGGCTGTCATGCAACTAGAG-3'
S/L10/+17663 | 5'-TCTTACACGCAATCCTGCTT-3'
S/L10/+17061 | 5'-TACCCATCTGCTCGCATAGT-3'
S/L11/-/18877 | 5'-GCAAGCAGAATTAACCCTCA-3'
S/L11/-19396 | 5'-AGCACCACCTAAATTGCATC-3'
S/L11/-20002 | 5'-TGGTCCCTTTGAAGGTGTTA-3'
S/L11/+20245 | 5'-TCGAACACATCGTTTATGGA-3'
S/L11/+/19611 | 5'-GAAGCACCTGTTTCCATCAT-3'
S/L11/+/19021 | 5'-ACGATGCTCAGCCATGTAGT-3'
SARS/L1/F3/+800 | 5'-GAGGTGCAGTCACTCGCTAT-3'
SARS/L1/F4/+1391 | 5'-CAGAGATTGGACCTGAGCAT-3'
SARS/L1/F5/+1925 | 5'-CAGCAAACCACTCAATTCCT-3'
SARS/L1/R3/-1674 | 5'-AAATGATGGCAACCTCTTCA-3'
SARS/L1/R4/-1107 | 5'-CACGTGGTTGAATGACTTTG-3'
SARS/L1/R5/-520 | 5'-ATTTCTGCAACCAGCTCAAC-3'
SARS/L2/F3/+2664 | 5'-CGCATTGTCTCCTGGTTTAC-3'
SARS/L2/F4/+3232 | 5'-GAGATTGAGCCAGAACCAGA-3'
SARS/L2/F5/+3746 | 5'-ATGAGCAGGTTGTCATGGAT-3'
SARS/L2/R3/-3579 | 5'-CTGCCTTAAGAAGCTGGATG-3'
SARS/L2/R4/-2991 | 5'-TTTCTTCACCAGCATCATCA-3'
SARS/L2/R5/-2529 | 5'-CACCGTTCTTGAGAACAACC-3'
SARS/L3/F3/+4708 | 5'-TCTTTGGCTGGCTCTTACAG-3'
SARS/L3/F4/+5305 | 5'-GCTGGTGATGCTGCTAACTT-3'
SARS/L3/F5/+5822 | 5'-CCATCAAGCCTGTGTCGTAT-3'
SARS/L3/R3/-5610 | 5'-CAGGTGGTGCAGACATCATA-3'
SARS/L3/R4/-4988 | 5'-AACATCAGCACCATCCAAGT-3'
SARS/L3/R5/-4437 | 5'-ATCGGACACCATAGTCAACG-3'
[0429] The sequences of the fragments L0 to L12 of the SARS-CoV
strain derived from the sample recorded under the No. 031589
correspond respectively to the sequences SEQ ID NO: 41 to SEQ ID
NO: 54 in the sequence listing appended as an annex. Among these
sequences, only that corresponding to the fragment L5 contains a
nucleotide difference in relation to the corresponding sequence of
the isolate AY278741-Urbani. This t/c mutation at position 7919
results in a modification of the amino acid sequence of the
corresponding protein, encoded by ORF1a: at position 2552, a valine
(gtt codon; AY278741) is changed to alanine (gct codon) in the
SARS-CoV strain 031589. By contrast, no mutation was identified in
relation to the corresponding sequence of the isolate
AY274119.3-Tor2. The other fragments do not exhibit differences
in relation to the corresponding sequences of the isolates Tor2 and
Urbani.
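The correspondence between nucleotide position 7919 and codon 2552 of ORF1a can be verified with a short calculation. This sketch rests on one assumption not stated in the text: that ORF1a begins at nucleotide 265 of the reference sequence AY274119.3. The function `locate_in_orf` and the two-entry codon table are illustrative only.

```python
# Assumption (not given in the text): first nucleotide of the ORF1a start
# codon in the numbering of AY274119.3.
ORF1A_START = 265

# Only the two codons discussed in the text are needed here.
CODON_TO_AMINO_ACID = {"gtt": "V", "gct": "A"}  # valine, alanine


def locate_in_orf(nt_position, orf_start):
    """Return (codon number, position within codon), both 1-based."""
    offset = nt_position - orf_start
    return offset // 3 + 1, offset % 3 + 1


codon_number, position_in_codon = locate_in_orf(7919, ORF1A_START)

# Apply the t -> c change at that position of the Urbani codon.
urbani_codon = "gtt"
index = position_in_codon - 1
strain_codon = urbani_codon[:index] + "c" + urbani_codon[index + 1:]

print(f"nt 7919 -> codon {codon_number}, position {position_in_codon}")
print(
    f"{urbani_codon} ({CODON_TO_AMINO_ACID[urbani_codon]}) -> "
    f"{strain_codon} ({CODON_TO_AMINO_ACID[strain_codon]})"
)
```

With this start coordinate, the computed codon number and the resulting codon agree with the valine to alanine change described above.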
EXAMPLE 2
Production and Purification of the Recombinant N and S Proteins of
the SARS-CoV Strain Derived from the Sample Recorded Under the
Number 031589
[0430] The entire N protein and two polypeptide fragments of the S
protein of the SARS-CoV strain derived from the sample recorded
under the number 031589 were produced in E. coli, in the form of
fusion proteins comprising an N- or C-terminal polyhistidine tag.
In the two S polypeptides, the N- and C-terminal hydrophobic
sequences of the S protein (signal peptide: positions 1 to 13 and
transmembrane helix: positions 1196 to 1218) were deleted whereas
the .beta. helix (positions 565 to 687) and the two motifs of the
coiled-coil type (positions 895 to 980 and 1155 to 1186) of the S
protein were preserved. These two polypeptides consist of: a long
fragment (S.sub.L) corresponding to positions 14 to 1193 of the
amino acid sequence of the S protein and a short fragment (S.sub.C)
corresponding to positions 475 to 1193 of the amino acid sequence
of the S protein.
1) Cloning of the cDNAs N, S.sub.L and S.sub.C into the Expression
Vectors pIVEX2.3 and pIVEX2.4
[0431] The cDNAs corresponding to the N protein and to the S.sub.L
and S.sub.C fragments were amplified by PCR under standard
conditions, with the aid of the DNA polymerase Platinum Pfx.RTM.
(INVITROGEN). The plasmids SARS-N and SARS-S were used as template
and the following oligonucleotides as primers:
5'-CCCATATGTCTGATAATGGACCCCAATCAAAC-3' (N sense, SEQ ID NO: 55)
5'-CCCCCGGGTGCCTGAGTTGAATCAGCAGAAGC-3' (N antisense, SEQ ID NO: 56)
5'-CCCATATGAGTGACCTTGACCGGTGCACCAC-3' (S.sub.C sense, SEQ ID NO: 57)
5'-CCCATATGAAACCTTGCACCCCACCTGCTC-3' (S.sub.L sense, SEQ ID NO: 58)
5'-CCCCCGGGTTTAATATATTGCTCATATTTTCCC-3' (S.sub.C and S.sub.L antisense, SEQ ID NO: 29).
[0432] The sense primers introduce an NdeI site (CATATG) while
the antisense primers introduce an XmaI or SmaI site (CCCGGG).
The 3 amplification products were column purified (QIAquick PCR
Purification kit, QIAGEN) and cloned into an appropriate vector.
The plasmid DNA purified from the 3 constructs (QIAFilter Midi
Plasmid kit, QIAGEN) was verified by sequencing and digested with
the enzymes NdeI and XmaI. The 3 fragments corresponding to the
cDNAs N, S.sub.L and S.sub.C were purified on agarose gel and then
inserted into the plasmids pIVEX2.3MCS (C-terminal polyhistidine
tag) and pIVEX2.4d (N-terminal polyhistidine tag) digested
beforehand with the same enzymes. After verification of the
constructs, the 6 expression vectors thus obtained (pIV2.3N,
pIV2.3S.sub.C, pIV2.3S.sub.L, pIV2.4N, pIV2.4S.sub.C also called
pIV2.4S.sub.1, pIV2.4S.sub.L) were then used, on the one hand to
test the expression of the proteins in vitro, and on the other hand
to transform the bacterial strain BL21(DE3)pDIA17 (NOVAGEN). These
constructs encode proteins whose expected molecular mass is the
following: pIV2.3N (47174 Da), pIV2.3S.sub.C (82897 Da),
pIV2.3S.sub.L (132056 Da), pIV2.4N (48996 Da), pIV2.4S.sub.1 (81076
Da) and pIV2.4S.sub.L (133877 Da). Bacteria transformed with
pIV2.3N were deposited at the CNCM on Oct. 23, 2003, under the
number I-3117, and bacteria transformed with pIV2.4S.sub.1 were
deposited at the CNCM on Oct. 23, 2003, under the number
I-3118.
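The presence of the cloning sites in the amplification primers listed above can be confirmed by a simple string search. The sketch below assumes the standard recognition sequences CATATG for NdeI and CCCGGG for XmaI and SmaI; the primer sequences and labels are those given in the text, and the function name `find_sites` is hypothetical.

```python
# Recognition sequences (XmaI and SmaI recognize the same site).
SITES = {"NdeI": "CATATG", "XmaI/SmaI": "CCCGGG"}

# Primers as listed in the text, written 5' to 3'.
primers = {
    "N sense": "CCCATATGTCTGATAATGGACCCCAATCAAAC",
    "N antisense": "CCCCCGGGTGCCTGAGTTGAATCAGCAGAAGC",
    "S_C sense": "CCCATATGAGTGACCTTGACCGGTGCACCAC",
    "S_L sense": "CCCATATGAAACCTTGCACCCCACCTGCTC",
    "S_C and S_L antisense": "CCCCCGGGTTTAATATATTGCTCATATTTTCCC",
}


def find_sites(sequence):
    """Map each enzyme whose site occurs in the primer to its 0-based offset."""
    return {
        enzyme: sequence.find(site)
        for enzyme, site in SITES.items()
        if site in sequence
    }


site_map = {label: find_sites(sequence) for label, sequence in primers.items()}

for label, found in site_map.items():
    print(f"{label}: {found}")
```

In each primer the site sits close to the 5' end, preceded by a few extra nucleotides.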
2) Analysis of the Expression of the Recombinant Proteins In Vitro
and In Vivo
[0433] The expression of recombinant proteins from the 6
recombinant vectors was first tested in an in vitro system
(RTS100, Roche). The proteins produced in vitro, after
incubation of the recombinant vectors pIVEX for 4 h at 30.degree.
C., in the RTS100 system, were analyzed by Western blotting with
the aid of an anti-(his).sub.6 antibody coupled to peroxidase. The
result of expression in vitro (FIG. 1) shows that only the N
protein is expressed in large quantities, regardless of the
position, N- or C-terminal, of the polyhistidine tag. In a second
step, the expression of the N and S proteins was tested in vivo at
30.degree. C. in LB medium in the presence or in the absence of
inducer (1 mM IPTG). The N protein is very well produced in this
bacterial system (FIG. 2) and is found mainly in a soluble fraction
after lysis of the bacteria. By contrast, the long version of S
(S.sub.L) is very weakly produced and is completely insoluble (FIG.
3). The short version (S.sub.C) also exhibits a very weak
solubility, but an expression level that is much higher than that
of the long version. Moreover, the construct S.sub.C fused with a
polyhistidine tag at the C-terminal position has a smaller size
than that expected. An immunodetection experiment with an
anti-polyhistidine antibody has shown that this construct was
incomplete. In conclusion, the two constructs, pIV2.3N and
pIV2.4S.sub.1, which express respectively the entire N protein
fused with the C-terminal polyhistidine tag and the short S protein
fused with the N-terminal polyhistidine tag, were selected in order
to produce the two proteins in a large quantity so as to purify
them. The plasmids pIV2.3N and pIV2.4S.sub.1 were deposited
respectively under the No. I-3117 and I-3118 at the CNCM, 25 rue du
Docteur Roux, 75724 PARIS 15, on Oct. 23, 2003.
3) Analysis of the Antigenic Activity of the Recombinant
Proteins
[0438] The antigenic activity of the N, S.sub.L and S.sub.C
proteins was tested by Western blotting with the aid of two serum
samples, obtained from the same patient infected with SARS-CoV,
collected 8 days (M12) and 29 days (M13) after the onset of the
SARS symptoms. The experimental protocol is as described in example
3. The results illustrated by FIG. 4 show (i) the seroconversion of
the patient, and (ii) that the N protein possesses a higher
antigenic reactivity than the short S protein.
4) Purification of the N Protein from pIV2.3N
[0439] Several experiments for purifying the N protein, produced
from the vector pIV2.3N, were carried out according to the
following protocol. The bacteria BL21(DE3)pDIA17, transformed with
the expression vector pIV2.3N, were cultured at 30.degree. C. in 1
liter of culture medium containing 0.1 mg/ml of ampicillin, and
induced with 1 mM IPTG when the cell density equivalent to
A.sub.600=0.8 is reached (about 3 hours). After 2 hours of culture
in the presence of inducer, the cells were recovered by
centrifugation (10 min at 5000 rpm), resuspended in the lysis
buffer (50 mM NaH.sub.2PO.sub.4, 0.3 M NaCl, 20 mM imidazole, pH 8,
containing the mixture of protease inhibitors Complete.RTM.,
Roche), and lysed with the French press (12 000 psi). After
centrifugation of the bacterial lysate (15 min at 12 000 rpm), the
supernatant (50 ml) was loaded at a flow rate of 1 ml/min onto a
metal chelation column (15 ml) (Ni-NTA superflow, Qiagen),
equilibrated with the lysis buffer. After washing the column with
200 ml of lysis buffer, the N protein was eluted with an imidazole
gradient (20 to 250 mM) in 10 column volumes. The fractions
containing the N protein were pooled and analyzed by
polyacrylamide gel electrophoresis under denaturing conditions
followed by staining with Coomassie blue. The results illustrated
by FIG. 5 show that the protocol used makes it possible to purify
the N protein with a very satisfactory homogeneity (95%) and a mean
yield of 15 mg of protein per liter of culture.
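The imidazole concentration reached at a given elution volume can be estimated as follows. This is a sketch only: it assumes that the gradient from 20 to 250 mM is linear over the 10 column volumes of the 15 ml column, which the text does not state explicitly, and the function name `imidazole_mM` is illustrative.

```python
def imidazole_mM(volume_ml, column_ml=15.0, start_mM=20.0, end_mM=250.0,
                 gradient_cv=10.0):
    """Imidazole concentration after volume_ml of a linear gradient.

    The gradient is assumed to run linearly from start_mM to end_mM over
    gradient_cv column volumes; values outside that range are clamped.
    """
    gradient_ml = column_ml * gradient_cv
    fraction = min(max(volume_ml / gradient_ml, 0.0), 1.0)
    return start_mM + (end_mM - start_mM) * fraction


for volume in (0, 30, 75, 150):
    print(f"{volume:>4} ml eluted -> {imidazole_mM(volume):.0f} mM imidazole")
```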
5) Purification of the S.sub.C Protein from pIV2.4S.sub.C
(pIV2.4S.sub.1)
[0440] The protocol followed for purifying the short S protein is
very different from that described above because the protein is
highly aggregated in the bacterial system (inclusion bodies). The
bacteria BL21(DE3)pDIA17, transformed with the expression vector
pIV2.4S.sub.1, were cultured at 30.degree. C. in 1 liter of culture
medium containing 0.1 mg/ml of ampicillin, and induced with 1 mM
IPTG when the cell density equivalent to A.sub.600=0.8 is reached
(about 3 hours). After 2 hours of culture in the presence of
inducer, the cells were recovered by centrifugation (10 min at 5000
rpm), resuspended in the lysis buffer (0.1 M Tris-HCl, 1 mM EDTA,
pH 7.5), and lysed with the French press (1200 psi). After
centrifugation of the bacterial lysate (15 min at 12 000 rpm), the
pellet was resuspended in 25 ml of lysis buffer containing 2%
Triton X100 and 10 mM .beta.-mercaptoethanol, and then centrifuged
for 20 min at 12 000 rpm. The pellet was resuspended in 10 mM
Tris-HCl buffer containing 7 M urea, and gently stirred for 30 min
at room temperature. This final washing of the inclusion bodies
with 7 M urea is necessary in order to remove most of the E. coli
membrane proteins which co-sediment with the aggregated S.sub.C
protein. After a final centrifugation for 20 min at 12 000 rpm, the
final pellet is resuspended in the 10 mM Tris-HCl buffer. The
electrophoretic analysis of this preparation (FIG. 6) shows that
the short S protein may be purified with a satisfactory homogeneity
(about 90%) from the inclusion bodies (insoluble extract).
EXAMPLE 3
Immunodominance of the N Protein
[0441] The reactivity of the antibodies present in the serum of
patients suffering from atypical pneumopathy caused by the
SARS-associated coronavirus (SARS-CoV), toward the various proteins
of this virus, was analyzed by Western blotting under the
conditions described below.
1) Materials
a) Lysate of Cells Infected with SARS-CoV
[0442] Vero E6 cells (2.times.10.sup.6) were infected with SARS-CoV
(isolate recorded under the number FFM/MA104) at a multiplicity of
infection (M.O.I.) of 10.sup.-1 or 10.sup.-2 and then incubated in
DMEM medium containing 2% FCS, at 35.degree. C. in an atmosphere
containing 5% CO.sub.2. 48 hours later, the cell monolayer was
washed with PBS and then lysed with 500 .mu.l of loading buffer
prepared according to Laemmli and containing
.beta.-mercaptoethanol. The samples were then boiled for 10 minutes
and then sonicated for 3 times 20 seconds.
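The inoculum implied by the multiplicities of infection above can be worked out numerically. The sketch below computes the number of infectious units needed for 2.times.10.sup.6 cells and, as an additional assumption not made in the text, uses the usual Poisson model to estimate the fraction of cells initially infected; the function names are illustrative.

```python
import math


def infectious_units(cell_count, moi):
    """Infectious units required to reach a given M.O.I."""
    return cell_count * moi


def expected_infected_fraction(moi):
    """Poisson estimate of the fraction of cells receiving at least one
    infectious unit: 1 - exp(-M.O.I.)."""
    return 1.0 - math.exp(-moi)


CELLS = 2e6  # Vero E6 cells per infection, from the text

for moi in (1e-1, 1e-2):
    units = infectious_units(CELLS, moi)
    fraction = expected_infected_fraction(moi)
    print(f"M.O.I. {moi}: {units:.0f} infectious units, "
          f"about {fraction:.1%} of cells infected initially")
```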
b) Antibodies
b.sub.1) Serum from a Patient Suffering from Atypical
Pneumopathy
[0443] The serum, referenced under the No. 20033168 at the National
Reference Center for Influenza Viruses (Northern region), is from a
French patient suffering from atypical pneumopathy caused by
SARS-CoV and was collected on day 38 after the onset of the
symptoms; the diagnosis of SARS-CoV infection was made by nested
RT-PCR and quantitative PCR.
b.sub.2) Monospecific Rabbit Polyclonal Sera Directed Against the N
Protein or the S Protein
[0444] The sera are those produced from the recombinant N and
S.sub.C proteins (example 2), according to the immunization
protocol described in example 4; they are the rabbit P13097 serum
(anti-N serum) and the rabbit P11135 serum (anti-S serum).
2) Method
[0445] 20 .mu.l of lysate of cells infected with SARS-CoV at M.O.I.
values of 10.sup.-1 and 10.sup.-2 and, as a control, 20 .mu.l of a
lysate of noninfected cells (mock) were separated on 10% SDS
polyacrylamide gel and then transferred onto a nitrocellulose
membrane. After blocking in a solution of PBS/5% milk/0.1% Tween
and washing in PBS/0.1% Tween, this membrane was hybridized
overnight at 4.degree. C. with: (i) the immune serum No. 20033168
diluted 1/300, 1/1000 and 1/3000 in the buffer PBS/1% BSA/0.1%
Tween, (ii) the rabbit P13097 serum (anti-N serum) diluted 1/50 000
in the same buffer and (iii) the rabbit P11135 serum (anti-S serum)
diluted 1/10 000 in the same buffer. After washing in PBS/Tween, a
secondary hybridization was performed with the aid of either sheep
polyclonal antibodies directed against the heavy and light chains
of human G immunoglobulins and coupled with peroxidase (NA933V,
Amersham), or of donkey polyclonal antibodies directed against the
heavy and light chains of the rabbit G immunoglobulins and coupled
with peroxidase (NA934V, Amersham). The bound antibodies were
visualized with the aid of the ECL+ kit (Amersham) and of Hyperfilm
MP autoradiography films (Amersham). A molecular mass ladder (kDa)
is presented in the figure.
3) Results
[0446] FIG. 7 shows that three polypeptides of apparent molecular
mass 35, 55 and 200 kDa are specifically detected in the extracts
of cells infected with SARS-CoV.
[0447] In order to identify these polypeptides, two other
immunoblots (FIG. 8) were prepared on the same samples and under
the same conditions with rabbit polyclonal antibodies specific for
the nucleoprotein N (rabbit P13097, FIG. 8A) and for the spicule
protein S (rabbit P11135, FIG. 8B). This experiment shows that the
200 kDa polypeptide corresponds to the SARS-CoV spicule
glycoprotein S, that the 55 kDa polypeptide corresponds to the
nucleoprotein N while the 35 kDa polypeptide probably represents a
truncated or degraded form of N.
[0448] The data presented in FIG. 7 therefore show that the serum
20033168 strongly reacts with N and a lot more weakly with the
SARS-CoV S since the 35 and 55 kDa polypeptides are visualized in
the form of intense bands for 1/300, 1/1000 and 1/3000 dilutions of
the immunoserum whereas the 200 kDa polypeptide is only weakly
visualized for a dilution of 1/300. It is also possible to note
that no other SARS-CoV polypeptide is detected for dilutions
greater than 1/300 of the serum 20033168.
[0449] This experiment indicates that the antibody response
specific for the SARS-CoV N dominates the antibody responses
specific for the other SARS-CoV polypeptides and in particular the
antibody response directed against the S glycoprotein. It indicates
an immuno-dominance of the nucleoprotein N during human infections
with SARS-CoV.
EXAMPLE 4
Preparation of Monospecific Polyclonal Antibodies Directed Against
the SARS-Associated Coronavirus (SARS-CoV) N and S Proteins
1) Materials and Method
[0450] Three rabbits (P13097, P13081, P13031) were immunized with
the purified recombinant polypeptide corresponding to the entire
nucleoprotein (N), prepared according to the protocol described in
example 2. After a first injection of 0.35 mg per rabbit of protein
emulsified in complete Freund's adjuvant (intradermal route), the
animals received 3 booster injections at 3 and then 4 weeks'
interval, of 0.35 mg of recombinant protein emulsified in
incomplete Freund's adjuvant.
[0451] Three rabbits (P11135, P13042, P14001) were immunized with
the recombinant polypeptide corresponding to the short fragment of
the S protein (S.sub.C) produced as described in example 2. As this
polypeptide is found mainly in the form of inclusion bodies in the
bacterial cytoplasm, the animals received 4 intradermal injections
at 3-4 weeks' interval of a preparation of inclusion bodies
corresponding to 0.5 mg of recombinant protein emulsified in
incomplete Freund's adjuvant. The first 3 injections were made with
a preparation of inclusion bodies prepared according to the
protocol described in example 2, while the fourth injection was
made with a preparation of inclusion bodies which were prepared
according to the protocol described in example 2 and then purified
on sucrose gradient and washed in 2% Triton X100.
[0452] For each rabbit, a preimmune (p.i.) serum was prepared
before the first immunization and an immune serum (I.S.) 5 weeks
after the fourth immunization.
[0453] In a first instance, the reactivity of the sera was analyzed
by ELISA test on preparations of recombinant proteins similar to
those used for the immunizations; the ELISA tests were carried out
according to the protocol and with the reagents as described in
example 6.
[0454] In a second instance, the reactivity of the sera was
analyzed by preparing an immunoblot (Western blot) of a lysate of
cells infected with SARS-CoV, according to the protocol as
described in example 3.
2) Results
[0455] The ELISA tests (FIG. 9) demonstrate that the preparations
of recombinant N protein and of inclusion bodies of the short
fragment of the S protein (S.sub.C) are immunogenic in animals and
that the titer of the immune sera is high (more than 1/25 000).
[0456] The immunoblot (FIG. 8) shows that the rabbit P13097 immune
serum recognizes two polypeptides present in the lysates of cells
infected with SARS-CoV: a polypeptide whose apparent molecular mass
(50-55 kDa based on experiments) is compatible with that of the
nucleoprotein N (422 residues, predicted molecular mass of 46 kDa)
and a polypeptide of 35 kDa, which probably represents a truncated
or degraded form of N.
[0457] This experiment also shows that the rabbit P11135 serum
mainly recognizes a polypeptide whose apparent molecular mass
(180-220 kDa based on experiments) is compatible with a
glycosylated form of S (1255 residues, nonglycosylated polypeptide
chain of 139 kDa), as well as lighter polypeptides, which probably
represent truncated and/or nonglycosylated forms of S.
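The predicted masses quoted above for the unmodified N and S chains can be approximated from the chain lengths alone. The following is a minimal sketch, assuming an average residue mass of 110 Da (a common rule of thumb that is not stated in the text); the function name is illustrative.

```python
# Rough estimate of the mass of an unmodified polypeptide from its length.
# ASSUMPTION: 110 Da average residue mass (rule of thumb, not from the text).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Return an approximate polypeptide mass in kDa."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Chain lengths as given in the text for N and S.
chain_lengths = {"N": 422, "S": 1255}
estimates = {name: approx_mass_kda(n) for name, n in chain_lengths.items()}

for name, mass in estimates.items():
    print(f"{name}: about {mass:.0f} kDa for the naked polypeptide chain")
```

The estimates are close to the 46 kDa and 139 kDa values given in the text. The higher apparent masses observed on the gel are attributed in the text to glycosylation in the case of S.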
[0458] In conclusion, all these experiments demonstrate that the
recombinant polypeptides expressed in E. coli and corresponding to
the SARS-CoV N and S proteins make it possible to induce, in
animals, polyclonal antibodies capable of recognizing the native
forms of these proteins.
EXAMPLE 5
Preparation of Monospecific Polyclonal Antibodies Directed Against
the SARS-Associated Coronavirus (SARS-CoV) M and E Proteins
1) Analysis of the Structure of the M and E Proteins
a) E Protein
[0459] The structure of the SARS-CoV E protein (76 amino acids) was
analyzed in silico, with the aid of various software packages such
as SignalP v1.1, NetNGlyc 1.0, TMHMM 1.0 and 2.0 (Krogh et al.,
2001, J. Mol. Biol., 305(3):567-580) or alternatively TOPPRED (von
Heijne, 1992, J. Mol. Biol. 225, 487-494). The analysis shows that
this nonglycosylated polypeptide is a type 1 membrane protein,
containing a single transmembrane helix (aa 12-34 according to
TMHMM), and in which the majority of the hydrophilic domain (42
residues) is located at the C-terminal end and probably inside the
viral particle (endodomain). Versions 1.0 (N-ter is external) and
2.0 (N-ter is internal) of the TMHMM software predict opposite
topologies, but other algorithms, in particular TOPPRED and THUMBUP
(Zhou and Zhou, 2003, Protein Science 12:1547-1555), confirm an
external location of the N-terminal end of E.
b) M Protein
[0460] A similar analysis carried out on the SARS-CoV M protein
(221 amino acids) shows that this polypeptide does not possess a
signal peptide (according to the software SignalP v1.1) but has
three transmembrane domains (residues 15-37, 50-72, 77-99 according
to TMHMM 2.0) and a large hydrophilic domain (aa 100-221) located
inside the viral particle (endodomain). It is probably glycosylated
on the asparagine at position 4 (according to NetNGlyc 1.0).
[0461] Thus, in agreement with the experimental data known for the
other coronaviruses, it is notable that both the M and E proteins
exhibit endodomains that account for the majority of the
polypeptide and ectodomains that are very small in size.
[0462] The ectodomain of E probably corresponds to residues 1 to 11
or 1 to 12 of the protein: MYSFVSEETGT(L), SEQ ID NO: 70. Indeed,
the probability associated with the transmembrane location of
residue 12 is intermediate (0.56 according to TMHMM 2.0).
[0463] The ectodomain of M probably corresponds to residues 2 to 14
of the protein: ADNGTITVEELKQ, SEQ ID NO: 69. Indeed, the N-terminal
methionine of M is very probably cleaved from the mature
polypeptide because the residue at position 2 is an alanine
(Varshavsky, 1996, Proc. Natl. Acad. Sci. USA, 93:12142-12149).
[0464] Moreover, the analysis of the hydrophobicity (Kyte &
Doolittle, Hopp & Woods) of the E protein demonstrates that the
C-terminal end of the endodomain of E is hydrophilic and therefore
probably exposed at the surface of this domain. Thus, a synthetic
peptide corresponding to this end is a good immunogenic candidate
for inducing, in animals, antibodies directed against the
endodomain of E. Consequently, a peptide corresponding to 24
C-terminal residues of E was synthesized.
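A hydropathy analysis of the kind mentioned above can be sketched as follows for the 24 C-terminal residues of E (peptide E53-76, SEQ ID NO: 71). This is only an illustration, assuming the standard Kyte & Doolittle scale and a 7-residue sliding window; the window size and the function names are assumptions and are not taken from the text.

```python
# Kyte & Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Peptide E53-76 as given in the document (SEQ ID NO: 71).
E53_76 = "KPTVYVYSRVKNLNSSEGVPDLLV"


def mean_hydropathy(seq: str) -> float:
    """Average Kyte & Doolittle value over a sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def hydropathy_profile(seq: str, window: int = 7) -> list:
    """Sliding-window average; one value per window position."""
    return [
        mean_hydropathy(seq[i:i + window])
        for i in range(len(seq) - window + 1)
    ]


profile = hydropathy_profile(E53_76)
print(f"mean hydropathy of E53-76: {mean_hydropathy(E53_76):.3f}")
for start, value in enumerate(profile, start=53):
    print(f"window starting at residue {start}: {value:+.2f}")
```

Negative window values mark the more hydrophilic stretches of the peptide. The Hopp & Woods scale mentioned in the text would be handled the same way with a different lookup table.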
2) Preparation of Antibodies Directed Against the Ectodomain of the
M and E Proteins and the Endodomain of the E Protein
[0465] The peptides M2-14 (ADNGTITVEELKQ, SEQ ID NO: 69), E1-12
(MYSFVSEETGTL, SEQ ID NO: 70) and E53-76 (KPTVYVYSRV KNLNSSEGVP
DLLV, SEQ ID NO: 71) were synthesized by Neosystem. They were
coupled with KLH (Keyhole Limpet Hemocyanin) with the aid of MBS
(m-maleimido-benzoyl-N-hydroxysuccinimide ester) via a cysteine
added during the synthesis either at the N-terminus of the peptide
(case for E53-76) or at the C-terminus (case of M2-14 and
E1-12).
[0466] Two rabbits were immunized with each of the conjugates,
according to the following immunization protocol: after a first
injection of 0.5 mg of peptide coupled with KLH and emulsified in
complete Freund's adjuvant (intradermal route), the animals
received 2 to 4 booster injections at 3 or 4 weeks' interval of
0.25 mg of peptide coupled with KLH and emulsified in incomplete
Freund's adjuvant.
[0467] For each rabbit, a preimmune (p.i.) serum was prepared
before the first immunization and an immune serum (I.S.) was
prepared 3 to 5 weeks after the booster injections.
[0468] The reactivity of the sera was analyzed by Western blotting
with the aid of extracts of cells infected with SARS-CoV (FIG. 43B)
or with the aid of extracts of cells infected with a recombinant
vaccinia virus expressing the protein E (VV-TG-E, FIG. 43A) or M
(VV-TN-M, FIG. 43C) of the SARS-CoV 031589 isolate.
[0469] The immune sera of the rabbits 22234 and 22240, immunized
with the conjugate KLH-E53-76, recognize a polypeptide of about 9
to 10 kD, which is present in the extracts of cells infected with
SARS-CoV but absent from the extracts of noninfected cells (FIG.
43B). The apparent mass of this polypeptide is compatible with the
predicted mass of the E protein, which is 8.4 kD. Similarly, the
immune serum of the rabbit 20047, immunized with the conjugate
KLH-E1-12, recognizes a polypeptide present in the extracts of
cells infected with the VV-TG-E virus, whose apparent molecular mass is
compatible with that of the E protein (FIG. 43A).
[0470] The immune sera of the rabbits 20013 and 20080, immunized
with the conjugate KLH-M2-14, recognize a polypeptide present in
the extracts of cells infected with the VV-TN-M virus (FIG. 43C),
whose apparent molecular mass (about 18 kD) is compatible with that
of the glycoprotein M, which is 25.1 kD and has a high isoelectric
point (9.1 for the naked polypeptide).
[0471] These results demonstrate that the peptides E1-12 and
E53-76, on the one hand, and the peptide M2-14, on the other hand,
make it possible to induce, in animals, polyclonal antibodies
capable of recognizing the native forms of the SARS-CoV E and M
proteins, respectively.
EXAMPLE 6
Analysis of the ELISA Reactivity of the Recombinant N Protein
Toward Sera from Patients Suffering from SARS
1) Materials
[0472] The antigen used to prepare the solid phases is the purified
recombinant nucleoprotein N prepared according to the protocol
described in example 2.
[0473] The sera to be tested (table IV) were chosen on the basis of
the results of analysis of their reactivity by immunofluorescence
(IF-SARS titer), toward cells infected with SARS-CoV.
TABLE-US-00006 TABLE IV Sera tested by ELISA
Serum Reference No.   Type of serum    Date of the serum***     IF-SARS titer
3050 A                Control          na*                      nt**
3048 B                Control          na                       nt
033168 D              Patient-1 SARS   Apr. 27, 2003 (D38)      320
033397 E              Patient-1 SARS   May 11, 2003 (D52)       320
032632 F              Patient-2 SARS   Mar. 21, 2003 (D17)      2500
032791 G              Patient-3 SARS   Apr. 04, 2003 (D3)       <40
033258 H              Patient-3 SARS   Apr. 28, 2003 (D27)      160
*na: not applicable. **nt: not tested. ***the day numbers in
parentheses correspond to the number of days after the onset of the
SARS symptoms.
2) Method
[0474] The N protein (100 .mu.l) diluted at various concentrations
in 0.1 M carbonate buffer, pH 9.6 (1, 2 or 4 .mu.g/ml) is
distributed into the wells of ELISA plates, and then the plates are
incubated overnight at laboratory temperature. The plates are
washed with PBS-Tween buffer and then saturated with PBS-skimmed
milk-sucrose (5%) buffer. The test sera (100 .mu.l), diluted
beforehand (1/50, 1/100, 1/200, 1/400, 1/800, 1/1600 and 1/3200)
are added and then the plates are incubated for 1 h at 37.degree.
C. After 3 washings, the peroxidase-labeled anti-human IgG
conjugate (reference 209-035-098, JACKSON) diluted 1/18 000 is
added and then the plates are incubated for 1 h at 37.degree. C.
After 4 washings, the chromogen (TMB) and the substrate
(H.sub.2O.sub.2) are added and the plates are incubated for 30 min
at room temperature, protected from light. The reaction is then
stopped and then the absorbance at 450 nm is measured with the aid
of an automated reader.
3) Results
[0475] The ELISA tests (FIG. 10) demonstrate that the recombinant N
protein preparation is specifically recognized by the antibodies of
sera from patients suffering from SARS collected in the late phase
of the infection (.gtoreq.17 days after the onset of the symptoms)
whereas it is not significantly recognized by the antibodies of a
patient's serum collected in the early phase of the infection (3
days after the onset of the symptoms) or by control sera from
subjects not suffering from SARS.
EXAMPLE 7
ELISA Tests Prepared for a Very Specific and Sensitive Detection of
a SARS-Associated Coronavirus Infection, from Sera of Patients
1) Indirect ELISA IgG Test
a) Reagents
Preparation of the Plates
[0476] The plates are sensitized with a solution of N protein at 2
.mu.g/ml in a 10 mM PBS buffer, pH 7.2, phenol red at 0.25 ml/l.
100 .mu.l of solution are deposited in the wells and left to
incubate at room temperature overnight. Saturation is obtained by
prewashing in 10 mM PBS/0.1% Tween buffer, followed by washing with
a saturation solution PBS, 25% milk/sucrose.
Diluent Sera
[0477] Buffer 0.48 g/l TRIS, 10 mM PBS, 3.7 g/l EDTA, 15% v/v milk,
pH 6.7
Diluent Conjugate
[0478] Citrate buffer (15 g/l), 0.5% Tween, 25% bovine serum, 12%
NaCl, 6% v/v skimmed milk pH 6.5
Conjugate
[0479] 50.times. anti-human IgG conjugate, marketed by Bio-Rad:
Platelia H. pylori kit ref 72778
Other Solutions:
[0480] Washing solution R2, solutions for visualizing with TMB R8
diluent, R9 chromogen, R10 stopping solution: reagents marketed by
Bio-Rad (e.g.: Platelia pylori kit, ref 72778)
b) Procedure
[0481] Dilute the sera 1/200 in the sample diluent
[0482] Distribute 100 .mu.l/well
[0483] Incubation 1 h at 37.degree. C.
[0484] 3 washings in 10.times. WASHING solution R2 diluted
beforehand 10-fold in demineralized water (i.e., 1.times. washing
solution)
[0485] Distribute 100 .mu.l of conjugate (50.times. conjugate to be
diluted immediately before use in the diluent conjugate
provided)
[0486] Incubation 1 h at 37.degree. C.
[0487] 4 washings in 1.times. washing solution
[0488] Distribute 200 .mu.l/well of visualization solution (to be
diluted immediately before use e.g.: 1 ml of R9 in 10 ml of R8)
[0489] Incubation for 30 min at room temperature in the dark
[0490] Stop the reaction with 100 .mu.l/well of R10
[0491] READING at 450/620 nm
[0492] The results can be interpreted by taking a THRESHOLD serum
giving a response above which the sera tested would be considered
as positive. This serum is chosen and diluted so as to give a
significantly higher signal than the background noise.
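The threshold-based interpretation described above can be sketched as follows. This is a hedged illustration only: it assumes that the 450/620 nm reading is used as a dual-wavelength measurement (absorbance at 450 nm minus the 620 nm reference), and all absorbance values and function names are hypothetical.

```python
def corrected_od(a450: float, a620: float) -> float:
    """ASSUMPTION: dual-wavelength reading, 620 nm used as reference."""
    return a450 - a620


def classify(sample_od: float, threshold_od: float) -> str:
    """A serum is positive if its signal exceeds that of the THRESHOLD serum."""
    return "positive" if sample_od > threshold_od else "negative"


# Hypothetical plate readings (A450, A620); not data from the document.
threshold_serum = corrected_od(0.300, 0.040)
readings = {
    "serum 1 (hypothetical)": (1.250, 0.050),
    "serum 2 (hypothetical)": (0.120, 0.045),
}

results = {
    name: classify(corrected_od(a450, a620), threshold_serum)
    for name, (a450, a620) in readings.items()
}
for name, call in results.items():
    print(f"{name}: {call}")
```

In practice the threshold serum is run on every plate, so that each plate carries its own cut-off.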
2) Double Epitope ELISA Test
a) Reagents
Preparation of the Plates
[0493] The plates are sensitized with a solution of N protein at 1
.mu.g/ml in a 10 mM PBS buffer, pH 7.2, phenol red at 0.25 ml/l. 100
.mu.l of solution are deposited in the wells and left to incubate
at room temperature overnight. Saturation is obtained by prewashing
in 10 mM PBS/0.1% Tween buffer, followed by washing with a
saturation solution 10 mM PBS, 25% (V/V) milk.
Diluent Sera and Conjugate
[0494] Buffer 50 mM TRIS saline, pH 8, 2% milk
Conjugate
[0495] This is the purified recombinant N protein coupled with
peroxidase according to the Nakane protocol (Nakane P. K. and
Kawaoi A.; (1974): Peroxidase-labeled antibody, a new method of
conjugation. The Journal of Histochemistry and Cytochemistry Vol.
22, No. 23, pp. 1084-1091), in respective molar ratios 1/2. This
ProtN POD conjugate is used at a concentration of 2 .mu.g/ml in
serum/conjugate diluent.
Other Solutions:
[0496] Washing solution R2, solutions for visualization with TMB
R8, diluent, R9 chromogen, R10 stopping solution: reagents marketed
by Bio-Rad (e.g. Platelia pylori kit, ref 72778).
b) Procedure
[0497] 1st step in "predilution" plate
[0498] Dilute each serum 1/5 in the predilution plate
[0499] (48 .mu.l of diluent+12 .mu.l of serum).
[0500] After having diluted all the sera, distribute 60 .mu.l of
conjugate.
[0501] Where appropriate, the serum+conjugate mix is left to
incubate.
[0502] 2nd step in "reaction" plate
[0503] Transfer 100 .mu.l of mixture/well into the reaction plate
[0504] Incubation 1 h 37.degree. C.
[0505] 5 washings in 10.times. WASHING solution R2 diluted 10-fold
beforehand in demineralized water (.fwdarw.1.times. washing
solution)
[0506] Distribute 200 .mu.l/well of visualization solution (to be
diluted immediately before use e.g.: 1 ml of R9 in 10 ml of R8)
[0507] Incubation 30 min at room temperature and protected from
light
[0508] Stop the reaction with 100 .mu.l/well of R10
[0509] READING at 450/620 nm
[0510] Likewise as for the indirect ELISA test, the results can be
interpreted using a "threshold value" serum. Any serum having a
response greater than the threshold value serum will be considered
as positive.
3) Results
[0511] The sera of patients classified as probable cases of SARS
from the French hospital of Hanoi, Vietnam or in relation with the
French hospital of Hanoi (JYK) were analyzed using the indirect
IgG-N test and the double epitope N test.
[0512] The results of the indirect IgG-N test (FIGS. 14 and 15) and
double epitope N test (FIGS. 16 and 17) show an excellent
correlation between them and with an indirect ELISA test comparing
the reactivity of the sera toward a lysate of VeroE6 cells infected
or not infected with SARS-CoV (ELISA-SARS-CoV lysate; see table V
below). All the sera collected 12 days or more after the onset of
the symptoms were found to be positive, including in patients for
whom it had not been possible to document the SARS-CoV virus
infection by analyzing respiratory samples by RT-PCR, probably
because of a sample being collected too late during the infection
(.gtoreq.D12). In the case of the patient TTH for whom a nasal
sample collected on D7 was found to be negative by RT-PCR, the
quality of the sample may be in question.
[0513] Some sera were found to be negative whereas the presence of
SARS-CoV was detected by RT-PCR. They are in all cases early sera
collected less than 10 days after the onset of the symptoms (e.g.:
serum # 032637). In the case of a patient PTTH (serum # 032673),
only a suspicion of SARS was raised at the time the samples were
collected.
[0514] In conclusion, the indirect IgG-N and N-double epitope
serological tests make it possible to document the SARS-CoV
infection in all the patients for the sera collected 12 days or
more after the infection.
TABLE-US-00007 TABLE V Results of the ELISA tests
Sample                                     ELISA SARS-CoV   IgG-N            2Xepitope
Num     Patient  Day  PCR-SARS (1)         lysate (2)       (2nd series)     (2nd series)
033168  JYK      38   POS                  +++              >5000            NT
033597  JYK      74   POS                  NT               .apprxeq.5000    NT
032552  VTT      8    NEG-D3&D8&D12        NEG              <200             <5
032544  CTP      16   NEG D16&D20          ++               >5000            >>20
032546  CJF      15   NEG D15&D19          ++               >5000            >>20
032548  PTL      17   NEG D17&D21          ++               >5000            >>20
032550  NTH      17   NEG-D17&D21          ++               >5000            >>20
032553  VTT      8    NEG-D3&D8&D12        NEG              <200             <5
032554  NTBV     4    POS                  NEG              <200             <5
032555  NTBV     4    POS                  NEG              <200
032564  NTP      15   POS                  ++               >5000            >>20
032629  NVH      4    POS                  NEG              <200             <5
032631  BTTX     9    POS                  NEG              <200             <5
032635  NHH      4    POS                  NEG              <200             <5
032637  NHB      10   POS                  NEG              <200             <5
032642  BTTX     9    POS                  NEG              <200             <5
032643  LTDH     1    POS                  NEG              <200             <5
032644  NTBV     4    POS                  NEG              <200             <5
032646  TTH      12   NEG D7&D12&D16       ++               >5000            >>20
032647  DTH      17   NEG D17&D21          ++               >5000            >>20
032648  NNT      15   NEG D15&D19          ++               >5000            >>20
032649  PTH      17   NEG D17&D21          ++               >5000            >>20
032672  LVV      16   NEG D16&D20          +                >5000            >>20
032673  PTTH     NA   NEG                  NEG              <200             <5
032674  PNB      17   NEG D17&D21          ++               >5000            >>20
032682  VTH      12   NEG D12&D16          ++               >5000            >>20
032683  DTV      17   NEG D17&D21          +                >1000            >>20
Remarks:
[0515] (1): The RT-PCR analyses were carried out by nested RT-PCR
BNI, LC Artus and LC-N on nasal or pharyngeal swabs; POS means that
at least one sample was found to be positive in this patient.
[0516] (2): The reactivity of the sera in the ELISA test using a
lysate of cells infected with SARS-CoV was classified as very
highly reactive (+++), highly reactive (++), reactive (+) and
negative according to the OD value obtained at the dilutions
tested.
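The relationship between sampling day and IgG-N result stated in the conclusion can be checked against the rows of table V. The sketch below transcribes the day and IgG-N columns from the table; treating any titer not reported as "<200" as positive is an assumption made for the illustration, as are the variable and function names.

```python
# (sample number, day after onset or None if NA, IgG-N result as reported)
TABLE_V = [
    ("033168", 38, ">5000"), ("033597", 74, "~5000"), ("032552", 8, "<200"),
    ("032544", 16, ">5000"), ("032546", 15, ">5000"), ("032548", 17, ">5000"),
    ("032550", 17, ">5000"), ("032553", 8, "<200"), ("032554", 4, "<200"),
    ("032555", 4, "<200"), ("032564", 15, ">5000"), ("032629", 4, "<200"),
    ("032631", 9, "<200"), ("032635", 4, "<200"), ("032637", 10, "<200"),
    ("032642", 9, "<200"), ("032643", 1, "<200"), ("032644", 4, "<200"),
    ("032646", 12, ">5000"), ("032647", 17, ">5000"), ("032648", 15, ">5000"),
    ("032649", 17, ">5000"), ("032672", 16, ">5000"), ("032673", None, "<200"),
    ("032674", 17, ">5000"), ("032682", 12, ">5000"), ("032683", 17, ">1000"),
]


def is_positive(titer: str) -> bool:
    """ASSUMPTION: a result reported as '<200' is negative, anything else positive."""
    return not titer.startswith("<")


late = [row for row in TABLE_V if row[1] is not None and row[1] >= 12]
early = [row for row in TABLE_V if row[1] is not None and row[1] < 12]

late_positive = sum(is_positive(row[2]) for row in late)
early_positive = sum(is_positive(row[2]) for row in early)

print(f"day >= 12: {late_positive}/{len(late)} sera positive for IgG-N")
print(f"day < 12:  {early_positive}/{len(early)} sera positive for IgG-N")
```

The serum of patient PTTH, for which no day is given, is left out of both groups.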
EXAMPLE 8
Detection of SARS-Associated Coronavirus (SARS-CoV) by RT-PCR
1) Development of Real Time RT-PCR Conditions with the Aid of
Primers Specific for the Gene for the Nucleocapsid Protein--"Light
Cycler N" Test
a) Design of the Primers and Probes
[0517] The primers and probes were designed from the sequence of
the genome of the SARS-CoV strain derived from the sample recorded
under the number 031589, with the aid of the programme "Light
Cycler Probe Design (Roche)". Thus, the following two series of
primers and probes were selected:
TABLE-US-00008
series 1 (SEQ ID NO: 60, 61, 64, 65):
sense primer: N/+/28507: 5'-GGC ATC GTA TGG GTT G-3' [28507-28522]
antisense primer: N/-/28774: 5'-CAG TTT CAC CAC CTC C-3' [28774-28759]
probe 1: 5'-GGC ACC CGC AAT CCT AAT AAC AAT GC-fluorescein 3' [28561-28586]
probe 2: 5' Red705-GCC ACC GTG CTA CAA CTT CCT-phosphate [28588-28608]
series 2 (SEQ ID NO: 62, 63, 66, 67):
sense primer: N/+/28375: 5'-GGC TAC TAC CGA AGA G-3' [28375-28390]
antisense primer: N/-/28702: 5'-AAT TAC CGC GAC TAC G-3' [28702-28687]
probe 1: SARS/N/FL: 5'-ATA CAC CCA AAG ACC ACA TTG GC-fluorescein 3' [28541-28563]
probe 2: SARS/N/LC705: 5' Red705-CCC GCA ATC CTA ATA ACA ATG CTG C-phosphate 3' [28565-28589]
b) Analysis of the Efficacy of the Two Primer Pairs
[0518] In order to test the respective efficacy of the two pairs of
primers, an RT-PCR amplification was carried out on a synthetic RNA
corresponding to nucleotides 28054-29430 of the genome of the
SARS-CoV strain derived from the sample recorded under the number
031589 and containing the sequence of the N gene.
[0519] More specifically:
[0520] This synthetic RNA was prepared by in vitro transcription
with the aid of the T7 phage RNA polymerase, of a DNA template
obtained by linearization of the plasmid SRAS-N with the enzyme
BamHI. After eliminating the DNA template by digestion with
DNase I, the synthetic RNAs are purified by a phenol-chloroform
extraction, followed by two successive precipitations in ammonium
acetate and isopropanol. They are then quantified by measuring the
absorbance at 260 nm and their quality is checked by the ratio of
the absorbances at 260 and 280 nm and by agarose gel
electrophoresis. Thus, the concentration of the synthetic RNA
preparation used for these studies is 1.6 mg/ml, which corresponds
to 2.1.times.10.sup.15 copies/ml of RNA.
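The conversion from mass concentration to copy number quoted above can be reproduced approximately as follows. This is a sketch under stated assumptions: an average mass of 340 g/mol per ribonucleotide, and a transcript length equal to the 28054-29430 interval (any additional plasmid-derived nucleotides in the transcript are ignored). The function and constant names are illustrative.

```python
AVOGADRO = 6.02214076e23          # copies per mole
AVG_NUCLEOTIDE_MASS = 340.0       # g/mol per nucleotide; ASSUMPTION


def rna_copies_per_ml(conc_mg_per_ml: float, length_nt: int) -> float:
    """Convert an RNA mass concentration into copies per ml."""
    molar_mass = length_nt * AVG_NUCLEOTIDE_MASS        # g/mol
    moles_per_ml = conc_mg_per_ml * 1e-3 / molar_mass   # mol/ml
    return moles_per_ml * AVOGADRO


# Transcript covering nucleotides 28054-29430 of the genome.
length_nt = 29430 - 28054 + 1
copies = rna_copies_per_ml(1.6, length_nt)
print(f"{length_nt} nt, about {copies:.1e} copies/ml")
```

Under these assumptions the result is close to the 2.1.times.10.sup.15 copies/ml stated in the text.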
[0521] Decreasing quantities of synthetic RNA were amplified by
RT-PCR with the aid of the "Superscript.TM. One-Step RT-PCR with
Platinum.RTM. Taq" kit and the pairs of primers No. 1 (N/+/28507,
N/-/28774) (FIG. 1A) and No. 2 (N/+/28375, N/-/28702) (FIG. 1B),
according to the supplier's instructions. The amplification
conditions used are the following: the cDNA was synthesized by
incubation for 30 min at 45.degree. C., 15 min at 55.degree. C. and
then 2 min at 94.degree. C. and it was then amplified by 5 cycles
comprising: a step of denaturation at 94.degree. C. for 15 sec, a
step of annealing at 45.degree. C. for 30 sec and then a step of
extension at 72.degree. C. for 30 sec, followed by 35 cycles
comprising: a step of denaturation at 94.degree. C. for 15 sec, a
step of annealing at 55.degree. C. for 30 sec and then a step of
extension at 72.degree. C. for 30 sec, with 2 sec of additional
extension at each cycle, and a final step of extension at
72.degree. C. for 5 min. The amplification products obtained were
then kept at 10.degree. C.
[0522] The results presented in FIG. 11 show that the pair of
primers No. 2 (N/+/28375, N/-/28702) makes it possible to detect up
to 10 copies of RNA (band of weak intensity) or 10.sup.2 copies
(band of good intensity) against 10.sup.4 copies for the pair of
primers No. 1 (N/+/28507, N/-/28774). The amplicons are
respectively 268 bp (pair 1) and 328 bp (pair 2).
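The amplicon sizes follow directly from the genome coordinates of the 5' ends of the primers listed for the two series. The short sketch below recomputes them and also checks that the two series 2 probes lie within the series 2 amplicon; the helper names are illustrative.

```python
def amplicon_length(sense_5prime: int, antisense_5prime: int) -> int:
    """Length of the amplified fragment, both end positions included."""
    return antisense_5prime - sense_5prime + 1


# 5' positions of the sense and antisense primers, as listed in the text.
primer_pairs = {"pair 1": (28507, 28774), "pair 2": (28375, 28702)}
sizes = {name: amplicon_length(s, a) for name, (s, a) in primer_pairs.items()}

# Series 2 hybridization probes (start, end) on the genome.
series2_probes = [(28541, 28563), (28565, 28589)]
start, end = primer_pairs["pair 2"]
probes_inside = all(start <= a and b <= end for a, b in series2_probes)

# Number of nucleotides separating the two adjacent probes.
probe_gap = series2_probes[1][0] - series2_probes[0][1] - 1

print(sizes, probes_inside, probe_gap)
```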
c) Development of Real Time RT-PCR
[0523] A real time RT-PCR was developed with the aid of the pair of
primers No. 2 and of the pair of probes consisting of SRAS/N/FL and
SRAS/N/LC705 (FIG. 2).
[0524] The amplification was carried out on a LightCycler.TM.
(Roche) with the aid of the "Light Cycler RNA Amplification Kit
Hybridization Probes" kit (reference 2 015 145, Roche) under the
following optimized conditions. A reaction mixture containing:
H.sub.2O (6.8 .mu.l), 25 mM MgCl.sub.2 (0.8 .mu.l, 4 mM Mg.sup.2+
final), 5.times. reaction mixture (4 .mu.l), 3 .mu.M probe
SRAS/N/FL (0.5 .mu.l, 0.075 .mu.M final), 3 .mu.M probe
SRAS/N/LC705 (0.5 .mu.l, 0.075 .mu.M final), 10 .mu.M primer
N/+/28375 (1 .mu.l, 0.5 .mu.M final), 10 .mu.M primer N/-/28702 (1
.mu.l, 0.5 .mu.M final), enzyme mixture (0.4 .mu.l) and sample
(viral RNA, 5 .mu.l) was amplified according to the following
program:
TABLE-US-00009
Reverse transcription: 50.degree. C. 10:00 min; analysis mode: none
Denaturation: 95.degree. C. 30 sec, .times.1; analysis mode: none
Amplification (.times.45 cycles):
  95.degree. C. 2 sec
  50.degree. C. 15 sec; analysis mode: quantification*
  72.degree. C. 13 sec; thermal ramp 2.0.degree. C./sec
Annealing: 40.degree. C. 30 sec, .times.1; analysis mode: none
*The fluorescence is measured at the end of the annealing and at
each cycle (in SINGLE mode).
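The volumes and final concentrations of the reaction mixture can be cross-checked with the usual dilution relation (final = stock x volume / total volume). The sketch below uses only the volumes and stock concentrations listed above; the dictionary keys and variable names are illustrative.

```python
# Volumes of each component in microliters, as listed in the text.
volumes_ul = {
    "H2O": 6.8,
    "MgCl2 25 mM": 0.8,
    "5x reaction mixture": 4.0,
    "probe SRAS/N/FL 3 uM": 0.5,
    "probe SRAS/N/LC705 3 uM": 0.5,
    "primer N/+/28375 10 uM": 1.0,
    "primer N/-/28702 10 uM": 1.0,
    "enzyme mixture": 0.4,
    "sample (viral RNA)": 5.0,
}
total_ul = sum(volumes_ul.values())


def final_concentration(stock: float, volume_ul: float) -> float:
    """Dilution of a stock into the full reaction volume (same unit as stock)."""
    return stock * volume_ul / total_ul


probe_final_um = final_concentration(3.0, volumes_ul["probe SRAS/N/FL 3 uM"])
primer_final_um = final_concentration(10.0, volumes_ul["primer N/+/28375 10 uM"])
# Contribution of the added MgCl2 alone, in mM (kit reagents not counted).
mg_added_mm = final_concentration(25.0, volumes_ul["MgCl2 25 mM"])

print(f"total volume: {total_ul:.1f} ul")
print(f"each probe: {probe_final_um:.3f} uM, each primer: {primer_final_um:.2f} uM")
print(f"MgCl2 added: {mg_added_mm:.1f} mM")
```

The probe and primer values agree with the final concentrations given in the text. The added MgCl.sub.2 by itself contributes 1 mM, so the stated final magnesium value presumably also counts magnesium supplied by the kit reagents, which is an assumption here.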
[0525] The results presented in FIG. 12 show that this real time
RT-PCR is very sensitive since it makes it possible to detect 10.sup.2
copies of synthetic RNA in 100% of the 5 samples analyzed (29/29
samples in 8 experiments) and up to 10 copies of RNA in 100% of the
5 samples analyzed (40/45 samples in 8 experiments). It also shows
that this RT-PCR makes it possible to detect the presence of the
SARS-CoV genome in a sample and to quantify the number of genomes
present. By way of example, the viral RNA of a SARS-CoV stock
cultured on Vero E6 cells was extracted with the aid of the "QIAamp
viral RNA extraction" kit (Qiagen), diluted to
0.05.times.10.sup.-14 and analyzed by real time RT-PCR according to
the protocol described above; the analysis presented in FIG. 12
shows that this virus stock contains 6.5.times.10.sup.9
genome-equivalents/ml (geq/ml), which is of the same order as the
1.0.times.10.sup.10 geq/ml value measured with the aid of the
"RealArt.TM. HPA-Coronavirus LC RT PCR Reagents" kit marketed by
Artus.
2) Development of Nested RT-PCR Conditions Targeting the Gene for
RNA Polymerase--"CDC (Centers for Disease Control and
Prevention)/IP Nested RT-PCR" Test
a) Extraction of the Viral RNA
[0526] Clinical sample: QIAamp Viral RNA Mini Kit (QIAGEN) according
to the manufacturer's instructions, or an equivalent technique. The
RNA is eluted in a volume of 60 .mu.l.
b) "SNE/SAR" Nested RT-PCR
First Step: "SNE" Coupled RT-PCR
[0527] The Invitrogen "Superscript.TM. One-Step RT-PCR with
Platinum.RTM. Taq" kit was used, but the "Titan" kit from Roche
Boehringer can be used in its place with similar results.
TABLE-US-00010 Oligonucleotides:
SNE-S1   5' GGT TGG GAT TAT CCA AAA TGT GA 3'
SNE-AS1  5' GCA TCA TCA GAA AGA ATC ATC ATG 3'
.fwdarw. Expected size: 440 bp
[0528] 1. Prepare a mix:
TABLE-US-00011
H2O                      6.5 .mu.l
Reaction mix 2X          12.5 .mu.l
Oligo SNE-S1 50 .mu.M    0.2 .mu.l
Oligo SNE-AS1 50 .mu.M   0.2 .mu.l
RNAsin 40 U/.mu.l        0.12 .mu.l
RT/Platinum Taq mix      0.5 .mu.l
[0529] 2. To 20 .mu.l of the mix, add 5 .mu.l of RNA and carry out
the amplification on a thermocycler (ABI 9600 conditions):
TABLE-US-00012
2.1. 45.degree. C. 30 min.
     55.degree. C. 15 min.
     94.degree. C. 2 min.
2.2. 94.degree. C. 15 sec.
     45.degree. C. 30 sec.
     72.degree. C. 30 sec.                 .times.5 cycles
2.3. 94.degree. C. 15 sec.
     55.degree. C. 30 sec.
     72.degree. C. 30 sec. + 2 sec./cycle  .times.35 cycles
2.4. 72.degree. C. 5 min.
2.5. 10.degree. C. .infin.
Storage at +4.degree. C.
[0530] The RNAsin (N2511/N2515) from Promega was used as RNase
inhibitor.
[0531] Synthetic RNAs served as positive control. As the control,
10.sup.3, 10.sup.2 and 10 copies of synthetic RNA R.sub.SNE were
amplified in each experiment.
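When several samples and the three synthetic RNA controls are processed together, the per-reaction volumes of the first-step mix are usually scaled up into a single master mix. The sketch below is only an illustration of that arithmetic: the 10% pipetting overage, the number of samples and the function name are assumptions that do not come from the text.

```python
# Per-reaction volumes (microliters) of the "SNE" coupled RT-PCR mix.
PER_REACTION_UL = {
    "H2O": 6.5,
    "Reaction mix 2X": 12.5,
    "Oligo SNE-S1 50 uM": 0.2,
    "Oligo SNE-AS1 50 uM": 0.2,
    "RNAsin 40 U/ul": 0.12,
    "RT/Platinum Taq mix": 0.5,
}


def scale_master_mix(per_reaction: dict, n_reactions: int,
                     overage: float = 0.10) -> dict:
    """Scale volumes for n reactions; overage (ASSUMPTION) covers pipetting loss."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_reaction.items()}


# Hypothetical run: 8 clinical samples plus the 3 synthetic RNA controls.
mix = scale_master_mix(PER_REACTION_UL, n_reactions=8 + 3)
for name, vol in mix.items():
    print(f"{name}: {vol} ul")
```

After mixing, 20 .mu.l of master mix is dispensed per tube before the 5 .mu.l of RNA is added, as in step 2 of the protocol.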
[0532] Second Step: "SAR" Nested PCR
TABLE-US-00013 Oligonucleotides:
SAR1-S   5' CCT CTC TTG TTC TTG CTC GCA 3'
SAR1-AS  5' TAT AGT GAG CCG CCA CAC ATG 3'
.fwdarw. Expected size: 121 bp
[0533] 1. Prepare a mix:
TABLE-US-00014
H2O                      35.8 .mu.l
Taq buffer 10X           5 .mu.l
MgCl.sub.2 25 mM         4 .mu.l
Mix dNTPs 5 mM           2 .mu.l
Oligo SAR1-S 50 .mu.M    0.5 .mu.l
Oligo SAR1-AS 50 .mu.M   0.5 .mu.l
Taq DNA pol 5 U/.mu.l    0.25 .mu.l
[0534] AmpliTaq DNA Pol from Applied Biosystems was used (10.times.
buffer without MgCl.sub.2, ref 27216601).
[0535] 2. To 48 .mu.l of the mix, add 2 .mu.l of the product from
the first PCR and carry out the amplification (ABI 9600
conditions):
TABLE-US-00015
2.1. 94.degree. C. 2 min.
2.2. 94.degree. C. 30 sec.
     45.degree. C. 45 sec.
     72.degree. C. 30 sec.                 .times.5 cycles
2.3. 94.degree. C. 30 sec.
     55.degree. C. 30 sec.
     72.degree. C. 30 sec. + 1 sec./cycle  .times.35 cycles
2.4. 72.degree. C. 5 min.
2.5. 10.degree. C. .infin.
[0536] 3. Analyze 10 .mu.l of the reaction product on "low-melting"
gel (Seakem GTG type) containing 3% agarose.
[0537] The sensitivity of the nested test is routinely, under the
conditions described, 10 copies of RNA.
[0538] 4. The fragments can then be purified on QIAquick PCR kit
(QIAGEN) and sequenced with the oligos SAR1-S and SAR1-AS.
3) Detection of the SARS-CoV RNA by PCR from Respiratory
Samples
a) First Comparative Study
[0539] A comparative study was carried out on a series of
respiratory samples received by the National Reference Center for
the Influenza Virus (Northern region) and likely to contain
SARS-CoV. To do this, the RNA was extracted from the samples with
the aid of the "Qiamp viral RNA extraction" kit (Qiagen) and
analyzed by real time RT-PCR, on the one hand with the aid of the
pairs of primers and probes of the No. 2 series under the
conditions described above on the one hand, and on the other hand
with the aid of the kit "LightCycler SARS-CoV quantification kit"
marketed by Roche (reference 03 604 438). The results are
summarized in table VI below. They show that 18 of the 26 samples
are negative and 5 of the 26 samples are positive for the two kits,
while one sample is positive for the Roche kit alone and two for
the "series 2" N reagents alone. Additionally, for 3 samples
(20032701, 20032712, 20032714) the quantities of RNA detected are
markedly higher with the reagents (probes and primers) of the No. 2
series. These results indicate that the "series 2" N primers and
probes are more sensitive for the detection of the SARS-CoV genome
in biological samples than those of the kit currently available.
TABLE-US-00016 TABLE VI Real time RT-PCR analysis of the RNAs
extracted from a series of samples from 5 patients with the aid of
the pairs of primers and probes of the No. 2 series ("series 2" N)
or of the kit "LightCycler SARS-CoV quantification kit" (Roche).
The type of sample is indicated as well as the number of copies of
viral genome measured in each of the two tests. NEG: negative
RT-PCR.

Sample No.    Patient  Type of sample       ROCHE KIT  "Series 2" N
20033082      K        nasal                NEG        NEG
20033083      K        pharyngeal           NEG        NEG
20033086      K        nasal                NEG        NEG
20033087      K        pharyngeal           NEG        NEG
20032802      M        nasal                NEG        NEG
20032803      M        expectoration        NEG        NEG
20032806      M        nasal or pharyngeal  NEG        NEG
20031746ARN2  C        pharyngeal           NEG        NEG
20032711      C        nasal or pharyngeal  39         NEG
20032910      B        nasal                NEG        NEG
20032911      B        pharyngeal           NEG        NEG
20033356      V        expectoration        NEG        NEG
20033357      V        expectoration        NEG        NEG
20031725      K        endotracheal asp.    NEG        150
20032657      K        endotracheal asp.    NEG        NEG
20032698      K        endotracheal asp.    NEG        NEG
20032720      K        endotracheal asp.    3          5
20033074      K        stools               115        257
20032701      M        pharyngeal           443        1676
20032702      M        expectoration        NEG        249
20031747ARN2  C        pharyngeal           NEG        NEG
20032712      C        unknown              634        6914
20032714      C        pharyngeal           17         223
20032800      B        nasal                NEG        NEG
20033353      V        nasal                NEG        NEG
20033384      V        nasal                NEG        NEG
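The concordance figures quoted for this first comparative study can be recomputed from table VI. The sketch below is only an illustration: the values are transcribed from the table, NEG is represented by None, and the dictionary and function names are hypothetical.

```python
# Table VI transcribed as sample -> (Roche kit, "series 2" N); None means NEG.
TABLE_VI = {
    "20033082": (None, None), "20033083": (None, None),
    "20033086": (None, None), "20033087": (None, None),
    "20032802": (None, None), "20032803": (None, None),
    "20032806": (None, None), "20031746ARN2": (None, None),
    "20032711": (39, None), "20032910": (None, None),
    "20032911": (None, None), "20033356": (None, None),
    "20033357": (None, None), "20031725": (None, 150),
    "20032657": (None, None), "20032698": (None, None),
    "20032720": (3, 5), "20033074": (115, 257),
    "20032701": (443, 1676), "20032702": (None, 249),
    "20031747ARN2": (None, None), "20032712": (634, 6914),
    "20032714": (17, 223), "20032800": (None, None),
    "20033353": (None, None), "20033384": (None, None),
}


def tally(results):
    """Count samples by agreement between the two real time RT-PCR tests."""
    counts = {"both_neg": 0, "both_pos": 0, "roche_only": 0, "series2_only": 0}
    for roche, series2 in results.values():
        if roche is None and series2 is None:
            counts["both_neg"] += 1
        elif roche is not None and series2 is not None:
            counts["both_pos"] += 1
        elif roche is not None:
            counts["roche_only"] += 1
        else:
            counts["series2_only"] += 1
    return counts
```

Run on the transcribed table, the tally gives 18 doubly negative samples, 5 doubly positive samples, 1 sample positive with the Roche kit alone and 2 positive with the "series 2" N reagents alone, which matches the figures stated in the text.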
b) Second Comparative Study
[0540] The performance of various nested RT-PCR and real time
RT-PCR methods was then compared for 121 respiratory samples from
possible cases of SARS at the French hospital in Hanoi, Vietnam,
taken between the 4th and the 17th day after the onset of the
symptoms. Among these samples, 14 were found to be positive during
a first test using the nested RT-PCR method targeting ORF1b
(encoding replicase) as described initially by the Bernhard Nocht
Institute (BNI nested RT-PCR). Information relating to this test is
available on the internet, at the address
http://www15.bni-hamburg.de/bni2/neu2/getfile.acgi?area_eng1=diagnostics&-
pid=4112.
[0541] The various tests compared in this study are:
[0542] the quantitative RT-PCR method according to the invention,
with the "series 2" N primers and probes described above
(LightCycler N column),
[0543] the nested RT-PCR test targeting the RNA polymerase gene
described above, developed by the CDC, BNI and Institut Pasteur
(CDC/IP nested RT-PCR),
[0544] the ARTUS kit with the reference "HPA Corona LC RT-PCR Kit
# 5601-02", which is a real time RT-PCR test targeting the ORF1b
gene,
[0545] the BNI nested RT-PCR test, also targeting the RNA
polymerase gene mentioned above.
[0546] The inventors observed:
[0547] 1) an inter-test variability for the same technique, linked
to the degradation of the RNA preparation during repeated thawing,
in particular for the samples containing the lowest quantities of
RNA,
[0548] 2) a reduced sensitivity of the CDC/IP nested RT-PCR
compared with the BNI nested RT-PCR, and
[0549] 3) a comparable sensitivity of the quantitative RT-PCR test
according to the invention (LightCycler N) compared with the Artus
LightCycler (LC) test.
[0550] These results, which are presented in table VII below, show
that the quantitative RT-PCR test according to the invention
constitutes an excellent addition--or an alternative--to the tests
currently available. Indeed, the SARS-linked coronavirus is an
emergent virus which is capable of changing rapidly. In particular,
the gene for the RNA polymerase of the SARS-linked coronavirus,
which is targeted in most of the tests currently available, can
recombine with that of other coronaviruses not linked to SARS. The
use of a test targeting this gene exclusively could then lead to
the production of false-negatives.
[0551] The quantitative RT-PCR test according to the invention does
not target the same genomic region as the ARTUS kit since it
targets the gene encoding the N protein. By carrying out a
diagnostic test targeting two different genes of the SARS-linked
coronavirus, it can therefore be hoped to avoid false-negative type
results which could be due to the genetic evolution of the
virus.
[0552] Furthermore, it appears particularly advantageous to target
the gene for the nucleocapsid protein because it is very stable
because of the high selection pressure linked to the high
structural constraints regarding this protein.

TABLE-US-00017 TABLE VII Comparison of various methods of analysis
by gene amplification, from 121 samples of probable cases of SARS
at the French hospital in Hanoi, Vietnam (epidemic 2003)

         Sample  Sample               CDC/IP    BNI       Artus
         type    collection           nested    nested    LightCycler  LightCycler
NRC No.  (1)     day         Patient  RT-PCR    RT-PCR    kit          N (IP)
107 N and P samples                   Negative  Negative  Negative     Negative
032529   P       10          NHB      Negative  Positive  Negative     Negative
032530   N       10          NHB      Positive  Positive  3.10E+01     4.20E+01
032531   P       7           LP       Positive  Positive  7.70E+00     3.10E+00
032534   N       15          BND      Positive  Positive  1.60E+00     Negative
032600   P       4           NHH      Negative  Positive  Negative     0.30E+02
032612   P       17          NTS      Negative  Positive  Negative     Negative
032688   P       9           BTX      Positive  Positive  Negative     Negative
032689   N       4           NVH      Positive  Positive  1.20E+01     2.30E+02
032690   P       4           NVH      Negative  Positive  1.60E+00     Negative
032727   P       8           NVH      Positive  Positive  2.30E+02     4.00E+02
032728   N       8           NVH      Positive  Positive  1.10E+03     1.60E+04
032729   P       14          NHB      Positive  Positive  5.90E+00     3.40E+01
032730   N       14          NHB      Positive  Positive  1.30E+02     4.80E+02
032741   P       8           NHH      Positive  Positive  2.10E+02     1.30E+02
positives                             10        14        10           9
fraction detected from the 14
positives                             71.4%     100.0%    71.4%        64.3%

(1) P = pharyngeal swab; N = nasal swab
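The summary rows of table VII (number of positives and fraction detected among the 14 BNI-positive samples) can be recomputed as sketched below. The rows are transcribed from the table; a nested RT-PCR result is stored as True/False and a real time result as a number or None for "Negative". The variable and function names are hypothetical and serve only to illustrate the arithmetic.

```python
# (NRC No., CDC/IP nested, BNI nested, Artus LightCycler kit, LightCycler N)
TABLE_VII = [
    ("032529", False, True, None, None),
    ("032530", True, True, 3.10e1, 4.20e1),
    ("032531", True, True, 7.70e0, 3.10e0),
    ("032534", True, True, 1.60e0, None),
    ("032600", False, True, None, 0.30e2),
    ("032612", False, True, None, None),
    ("032688", True, True, None, None),
    ("032689", True, True, 1.20e1, 2.30e2),
    ("032690", False, True, 1.60e0, None),
    ("032727", True, True, 2.30e2, 4.00e2),
    ("032728", True, True, 1.10e3, 1.60e4),
    ("032729", True, True, 5.90e0, 3.40e1),
    ("032730", True, True, 1.30e2, 4.80e2),
    ("032741", True, True, 2.10e2, 1.30e2),
]

TESTS = ("CDC/IP nested", "BNI nested", "Artus LightCycler", "LightCycler N")


def count_positive(rows):
    """Number of positive samples for each of the four tests."""
    counts = dict.fromkeys(TESTS, 0)
    for _nrc, cdc_ip, bni, artus, lc_n in rows:
        flags = (cdc_ip, bni, artus is not None, lc_n is not None)
        for name, flag in zip(TESTS, flags):
            counts[name] += bool(flag)
    return counts


def fraction_detected(counts, total):
    """Percentage of the reference positives detected by each test."""
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}
```

Applied to the 14 rows, this reproduces the counts 10, 14, 10 and 9 and the fractions 71.4%, 100.0%, 71.4% and 64.3% given in the last two rows of the table.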
EXAMPLE 9
Production and Characterization of Monoclonal Antibodies Directed
Against the N Protein
[0553] BALB/c mice were immunized with the purified recombinant N
protein and their spleen cells fused with an appropriate murine
myeloma according to the Kohler and Milstein technique.
[0554] Nineteen anti-N antibody secreting hybridomas were
preselected and their immunoreactivities determined. These
antibodies do indeed recognize the recombinant N protein (in ELISA)
with variable intensities, and the natural viral N protein in ELISA
and/or in Western blotting. FIGS. 18 to 20 show the results of
these tests for 15 of these 19 monoclonal antibodies.
[0555] The highly reactive clones 12, 17, 28, 57, 72, 76, 86, 87,
98, 103, 146, 156, 166, 170, 199, 212, 218, 219 and 222 were
subcloned. Specificity studies were carried out with the
appropriate tools in order to determine the epitopes recognized and
verify the absence of reactivity toward other human coronaviruses
and certain respiratory viruses.
[0556] Epitope mapping studies (performed on spot membrane with the
aid of overlapping peptides of 15 aa) and additional studies
performed on the natural N protein in Western blotting revealed the
existence of 4 groups of monoclonal antibodies:
[0557] 1. Monoclonal antibodies specific for a major linear epitope
at the N-ter position (75-81, sequence: INTNSVP).
[0558] The representative of this group is antibody 156. The
hybridoma producing this antibody was deposited at the Collection
Nationale de Cultures de Microorganismes (CNCM) of the Institut
Pasteur (Paris, France) on Dec. 1, 2004, under the number I-3331.
This same epitope is also recognized by a rabbit serum (anti-N
polyclonal) obtained by conventional immunization with the aid of
this same N protein.
[0559] 2. Monoclonal antibodies specific for a major linear epitope
located in a central position (position 217-224, sequence:
ETALALL); the representatives of this group are the monoclonal
antibodies 87 and 166. The hybridoma producing antibody 87 was
deposited at the CNCM on Dec. 1, 2004, under the number I-3328.
[0560] 3. Monoclonal antibodies specific for a major linear epitope
located at the C-terminal position (position 403-408, sequence:
DFSRQL); the representatives of this group are the antibodies 28,
57 and 143. The hybridoma producing antibody 57 was deposited at
the CNCM on Dec. 1, 2004, under the number I-3330.
[0561] 4. Monoclonal antibodies specific for a discontinuous
conformational epitope. This group of antibodies does not recognize
any of the peptides spanning the sequence of the N protein, but
reacts strongly with the non-denatured natural protein. The
representative of this final group is the antibody 86. The
hybridoma producing this antibody was deposited at the CNCM on Dec.
1, 2004, under the number I-3329.
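The principle of mapping a linear epitope with overlapping 15 aa peptides can be sketched as follows. This is an illustration only: the offset between consecutive peptides is not specified above, so a step of 3 residues is assumed; the toy sequence is a made-up placeholder and not the N protein sequence; and a peptide is simply taken to be "reactive" when it contains the whole epitope. All names are hypothetical.

```python
def overlapping_peptides(sequence, length=15, step=3):
    """Return (start, peptide) pairs covering the sequence, 1-based starts."""
    last_start = max(len(sequence) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the C-terminal end is covered
    return [(s + 1, sequence[s:s + length]) for s in starts]


def shared_core(reactive):
    """Segment (1-based, inclusive) common to all reactive peptides."""
    start = max(s for s, _ in reactive)
    end = min(s + len(p) - 1 for s, p in reactive)
    return (start, end) if start <= end else None


# Toy 40-residue sequence (placeholder) with a 6-residue motif at 21-26.
TOY_SEQUENCE = "G" * 20 + "DFSRQL" + "A" * 14
EPITOPE = "DFSRQL"

peptides = overlapping_peptides(TOY_SEQUENCE)
reactive = [(s, p) for s, p in peptides if EPITOPE in p]
core = shared_core(reactive)
```

In this toy case the reactive peptides start at positions 13, 16 and 19, and their common segment (19-27) brackets the motif at 21-26; the resolution of the scan is limited by the step between peptides. A conformational epitope, by contrast, would leave the list of reactive peptides empty.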
[0562] Table VIII below summarizes the epitope mapping results
obtained: TABLE-US-00018 TABLE VIII Epitope mapping of the
monoclonal antibodies Antibody Epitope Position Region 28 DFSRQL Q
403 . . . 408 C-Ter. 143 DFSRQL Q 76 DFSRQL Q 57 DFSRQL Q 315 . . .
319 146 LPQRQ 383 . . . 387 166 ETALALLLL 217 . . . 224 central 87
ETALALL 217 . . . 224 156 75 . . . 81 N-Ter. 86 Conformational 212
Conformational 170 Conformational
[0563] In addition, as illustrated in particular in FIGS. 18 and
19, these antibodies exhibit no reactivity in ELISA and/or in WB
toward the N protein of the human coronavirus 229E.
EXAMPLE 10
Combinations of the Monoclonal Antibodies for the Development of a
Sensitive Immunocapture Test Specific for the Viral N Antigen in
the Serum or Biological Fluids of Patients Infected with the
SARS-CoV Virus
[0564] The antibodies listed below were selected because of their
very specific properties for an additional capture and detection
study of the viral N protein, in the serum of the subjects or
patients.
[0565] These antibodies were produced in ascites on mice, purified
by affinity chromatography and used alone or in combination, as
capture antibodies and as signal antibodies.
[0566] List of the antibodies selected:
[0567] Ab anti-C-ter region (No. 28, 57, 143)
[0568] Ab anti-central region (No. 87, 166)
[0569] Ab anti-N-ter region (No. 156)
[0570] Ab anti-discontinuous conformational epitope (No. 86)
1) Preparation of the Reagents:
a) Immunocapture ELISA Plates
[0571] The plates are sensitized with the antibody solutions at 5
.mu.g/ml in 0.1 M carbonate buffer, pH 9.6. The (monovalent or
plurivalent) solutions are deposited in a volume of 100 .mu.l in
the wells and incubated overnight at room temperature. These plates
are then washed with PBS buffer (10 mM, pH 7.4, supplemented with
0.1% Tween 20) and then saturated with a PBS solution supplemented
with 0.3% BSA and 5% sucrose. The plates are then dried and then
packaged in a bag in the presence of a desiccant. They are ready to
use.
b) Conjugates
[0572] The purified antibodies were coupled with peroxidase
according to the Nakane protocol (Nakane et al., 1974, J. Histochem.
Cytochem., vol. 22, pp. 1084-1091) in a ratio of one
molecule of IgG per 3 molecules of peroxidase. These conjugates
were purified by exclusion chromatography and stored concentrated
(concentration between 1 and 2 mg/ml) in the presence of 50%
glycerol and at -20.degree. C. They are diluted for their use in
the assays at the final concentration of 1 or 2 .mu.g/ml in PBS
buffer (pH 7.4) supplemented with 1% BSA.
c) Other Reagents
[0573] Human sera negative for all the serum markers for the HIV,
HBV, HCV and HTLV viruses
Pool of negative human sera supplemented with 0.5% Triton X 100
[0574] Inactivated viral Ag: viral culture supernatant inactivated
by irradiation, with inactivation verified after placing in culture
on sensitive cells; titer of the suspension before inactivation
about 10^7 infectious particles per ml, or alternatively about
5×10^9 physical viral particles per ml of antigen
[0575] The Ag samples diluted in negative human serum: these
samples were prepared by diluting 1:100 and then by 5-fold serial
dilution.
[0576] These noninfectious samples mimic human samples thought to
contain low to very low concentrations of viral nucleoprotein N.
Such samples are not available for routine work.
[0577] Washing solution R2, solution for visualization TMB R8,
chromogen R9 and stop solution R10 are the generic reagents
marketed by Bio-Rad in its ELISA kits (e.g.: Platelia pylori kit
ref. 72778).
2) Procedure
[0578] The samples of human sera spiked with inactivated viral
Ag are distributed in an amount of 100 .mu.l per well, directly in
the ready-to-use sensitized plates, and then incubated for 1 hour
at 37.degree. C. (Bio-Rad IPS incubation).
[0579] The material not bound to the solid phase is removed by 3
washings (washing with dilute R2 solution, automatic LP 35
washer).
[0580] The appropriate conjugates, diluted to the final
concentration of 1 or 2 .mu.g/ml, are distributed in an amount of
100 .mu.l per well and the plates are again incubated for one hour
at 37.degree. C. (IPS incubation).
[0581] The excess conjugate is removed by 4 successive washings
(dilute R2 solution--LP 35 washer).
[0582] The presence of conjugate attached to the plates is
visualized after adding 100 .mu.l of visualization solution
prepared before use (1 ml of R9 and 10 ml of R8) and after
incubation for 30 minutes, at room temperature and protected from
light.
[0583] The enzymatic reaction is finally blocked by adding 100
.mu.l of R10 reagent (1 N H.sub.2SO.sub.4) to all the wells.
[0584] The reading is carried out with the aid of an appropriate
microplate reader at double wavelength (450/620 nm).
[0585] The results can be interpreted by using, as provisional
threshold value, the mean of at least two negative controls
multiplied by a factor of 2 or alternatively the mean of 100
negative sera supplemented with an increment corresponding to 6 SD
(standard deviation calculated on the 100 individual
measurements).
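The two provisional threshold rules described above can be written out as follows. This is a minimal sketch: the function names are hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption, since the text does not specify which is meant.

```python
from statistics import mean, stdev


def cutoff_from_controls(negative_controls, factor=2.0):
    """Rule 1: mean of at least two negative controls multiplied by 2."""
    if len(negative_controls) < 2:
        raise ValueError("at least two negative controls are required")
    return factor * mean(negative_controls)


def cutoff_from_panel(negative_sera, n_sd=6.0):
    """Rule 2: mean of a panel of negative sera plus 6 standard deviations.

    Assumption: sample standard deviation (statistics.stdev).
    """
    return mean(negative_sera) + n_sd * stdev(negative_sera)


def is_positive(optical_density, cutoff):
    """A well is scored positive when its OD exceeds the threshold."""
    return optical_density > cutoff
```

For example, two hypothetical negative controls reading 0.10 and 0.14 give a threshold of 0.24 under the first rule. The two rules need not give the same threshold, so a borderline sample may be classified differently depending on which one is applied.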
3) Results
[0586] Various capture antibody and signal antibody combinations
were tested based on the properties of the antibodies selected, and
avoiding the combinations of antibodies specific for the same
epitopes in solid phase and as conjugates.
[0587] The best results were obtained with the first 4 combinations
listed below; combination G/87 is shown for comparison. These results
are reproduced in table IX below.
1. Combination F/28
[0588] Solid phase (Ab 166+87 central region): conjugate antibody
28 (C-ter)
2. Combination G/28
[0589] Solid phase (Ab 86--conformational epitope): conjugate
antibody 28 (C-ter)
3. Combination H/28
[0590] Solid phase (Ab 86, 166 and 87 central region and
conformational epitope): conjugate antibody 28 (C-ter)
4. Combination H/28+87
[0591] Solid phase (Ab 86, 166 and 87 central region and
conformational epitope): mixed conjugate antibodies 28 (C-ter) and
87 (central)
5. Combination G/87
[0592] Solid phase (Ab 86--conformational epitope): conjugate
antibody 87 (central region)
[0593] The first 4 combinations exhibit equivalent and reproducible
performance levels, greater than the other combinations used (such
as for example the combination G/87). Of course, in these
combinations, a monoclonal antibody may be replaced with another
antibody recognizing the same epitope. Thus, the following variants
may be mentioned:
6. Variant of the Combination F/28
[0594] Solid phase (Ab 87 only): conjugate antibody 57 (C-ter)
7. Variant of the Combination G/28
[0595] Solid phase (Ab 86--conformational epitope): conjugate
antibody 57 (C-ter)
8. Variant of the Combination H/28
[0596] Solid phase (Ab 86 and 87 central region and conformational
epitope): conjugate antibody 57 (C-ter)
9. Variant of the Combination H/28+87
[0597] Solid phase (Ab 86 and 87 central region and conformational
epitope): mixed conjugate antibodies 57 (C-ter) and 87 (central)
TABLE-US-00019 TABLE IX Test of immunoreactivity of the
anti-SARS-CoV nucleoprotein Abs: optical densities measured with
each combination of antibodies according to the dilutions of the
inactivated viral antigen.

No.  Dilution   F/28   G/28   G/87   H/28   H/28 + 87
0    1/100      5      5      3.495  3.900  5
1    1/500      3.795  3.814  1.379  3.702  3.804
2    1/2 500    2.815  2.950  0.275  3.268  2.680
3    1/12 500   0.987  1.038  0.135  1.374  0.865
4    1/62 500   0.404  0.348  0.125  0.480  0.328
5    1/312 500  0.285  0.211  0.123  0.240  0.215
6    Control    0.210  0.200  0.098  0.186  0.156
7    Control    0.269  0.153  0.104  0.193  0.202
[0598] The detection limit for these 4 experimental trials
corresponds to the antigen dilution in negative serum of 1:62 500. A
rapid extrapolation suggests the detection of less than 10^3
infectious particles per ml of serum.
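The extrapolation can be made explicit with a short calculation. The sketch below assumes the stock titers quoted for the inactivated viral antigen (about 10^7 infectious particles per ml, or about 5×10^9 physical particles per ml) and the dilution scheme described (1:100, then 5-fold steps); the function names are hypothetical.

```python
def dilution_series(first=100, fold=5, steps=6):
    """Dilution factors for a 1:first dilution followed by serial fold steps."""
    return [first * fold ** i for i in range(steps)]


def particles_per_ml(dilution, stock_per_ml):
    """Nominal particle concentration at a given dilution of the stock."""
    return stock_per_ml / dilution


INFECTIOUS_STOCK = 1e7   # infectious particles per ml before inactivation
PHYSICAL_STOCK = 5e9     # physical viral particles per ml

series = dilution_series()
limit = series[4]        # 1:62 500, the stated detection limit
```

At the 1:62 500 dilution the nominal concentration is 160 infectious particles per ml (about 8×10^4 physical particles per ml), which is consistent with the statement that fewer than 10^3 infectious particles per ml are detected.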
[0599] From this study, it is evident that the most appropriate
antibodies for the capture of the native viral nucleoprotein are
the antibodies specific for the central region and/or for a
conformational epitope, both being antibodies also selected for
their high affinity for the native antigen.
[0600] Having determined the best antibodies for the composition of
the solid phase, the antibodies to be selected as a priority for
the detection of the antigens attached to the solid phase are the
complementary antibodies specific for a dominant epitope in the
C-ter region. The use of any other complementary antibody specific
for epitopes located in the N-ter region of the protein leads to
average or poor results.
EXAMPLE 11
Eukaryotic Expression Systems for the SARS-Associated Coronavirus
(SARS-CoV) spicule (S) Protein
1) Optimization of the Conditions for Expression of the SARS-CoV S
in Mammalian Cells
[0601] The conditions for transient expression of the SARS-CoV
spicule (S) protein were optimized in mammalian cells (293T,
VeroE6).
[0602] For that, a DNA fragment containing the cDNA for SARS-CoV S
was amplified by PCR with the aid of the oligonucleotides
5'-ATAGGATCCA CCATGTTTAT TTTCTTATTA TTTCTTACTC TCACT-3' and
5'-ATACTCGAGTT ATGTGTAATG TAATTTGACA CCCTTG-3' from the plasmid
pSARS-S (C.N.C.M. No. I-3059) and then inserted between the BamH1
and Xho1 sites of the plasmid pTRIP.DELTA.U3-CMV containing a
lentiviral vector TRIP (Sirven, 2001, Mol. Ther., 3, 438-448) in
order to obtain the plasmid pTRIP-S. The BamH1 and Xho1 fragment
containing the cDNA for S was then subcloned between BamH1 and Xho1
of the eukaryotic expression plasmid pcDNA3.1(+) (Clontech) in
order to obtain the plasmid pcDNA-S. The Nhe1 and Xho1 fragment
containing the cDNA for S was then subcloned between the
corresponding sites of the expression plasmid pCI (Promega) in
order to obtain the plasmid pCI-S. The WPRE sequences of the
woodchuck hepatitis virus ("Woodchuck Hepatitis Virus
posttranscriptional regulatory element") and the CTE sequences
("constitutive transport element") of the Mason-Pfizer simian
retrovirus were inserted into each of the two plasmids pcDNA-S
and pCI-S between the Xho1 and Xba1 sites in order to obtain
respectively the plasmids pcDNA-S-CTE, pcDNA-S-WPRE, pCI-S-CTE and
pCI-S-WPRE (FIG. 21). The plasmid pCI-S-WPRE was deposited at the
CNCM, on Nov. 22, 2004, under the number I-3323. All the inserts
were sequenced with the aid of a BigDye Terminator v1.1 kit
(Applied Biosystems) and an automated sequencer ABI377.
[0603] The capacity of the plasmid constructs to direct the
expression of SARS-CoV S in mammalian cells was assessed after
transfection of VeroE6 cells (FIG. 22). In this experiment,
monolayers of 5.times.10.sup.5 VeroE6 cells in 35 mm Petri dishes
were transfected with 2 .mu.g of plasmids pcDNA (as control),
pcDNA-S, pCI and pCI-S and 6 .mu.l of Fugene6 reagent according to
the manufacturer's instructions (Roche). After 48 hours of
incubation at 37.degree. C. and under 5% CO.sub.2, cellular
extracts were prepared in loading buffer according to Laemmli,
separated on 8% SDS polyacrylamide gel, and then transferred onto a
PVDF membrane (BioRad). The detection of this immunoblot (Western
blot) was carried out with the aid of an anti-S rabbit polyclonal
serum (immune serum from the rabbit P11135: cf. example 4 above)
and donkey polyclonal antibodies directed against rabbit IgGs and
coupled with peroxidase (NA934V, Amersham). The bound antibodies
were visualized by luminescence with the aid of the ECL+ kit
(Amersham) and autoradiography films Hyperfilm MP (Amersham).
[0604] This experiment (FIG. 22) shows that the plasmid pcDNA-S
does not make it possible to direct the expression of SARS-CoV S at
detectable levels whereas the plasmid pCI-S allows a weak
expression, close to the limit of detection, which may be detected
when the film is overexposed. Similar results were obtained when
the expression of S was sought by immunofluorescence (data not
shown). This failure to detect effective expression of S
cannot be attributed to the detection techniques used since the S
protein can be detected at the expected size (180 kDa) in an
extract of cells infected with SARS-CoV or in an extract of VeroE6
cells infected with the recombinant vaccinia virus VV-TF7.3 and
transfected with the plasmid pcDNA-S. In this latter experiment,
the virus VV-TF7.3 expresses the RNA polymerase of the T7 phage and
allows the cytoplasmic transcription of an uncapped RNA capable of
being efficiently translated. This experiment suggests that the
expression defects described above are due to an intrinsic
inability of the cDNA for S to be efficiently expressed when the
step for transcription to messenger RNA is carried out at the
nuclear level.
[0605] In a second experiment, the effect of the CTE and WPRE
signals on the expression of S was assessed after transfection of
VeroE6 (FIG. 23A) and 293T (FIG. 23B) cells and according to a
protocol similar to that described above. Whereas the expression of
S cannot be detected after transfection of the plasmids pcDNA-S-CTE
and pcDNA-S-WPRE derived from pcDNA-S, the insertion of the WPRE
and CTE signals greatly improves the expression of S in the context
of the expression plasmid pCI-S.
[0606] To quantify this result, a second series of experiments was
carried out in which the immunoblot was quantitatively visualized by
luminescence and acquisition on a digital imaging device (Fluor S,
BioRad). The analysis of the results obtained with the QuantityOne
v4.2.3 software (BioRad) shows that the WPRE and CTE sequences
increase respectively the expression of S by a factor of 20 to 42
and 10 to 26 in Vero E6 cells (table X). In 293T cells (table X),
the effect of the CTE sequence is more moderate (4 to 5 times)
whereas that of the WPRE sequence remains high (13 to 28 times).
TABLE-US-00020 TABLE X Quantitative analysis of the effect of the
CTE and WPRE signals on the expression of SARS-CoV S: Cellular
extracts were prepared 48 hours after transfection of VeroE6 or
293T cells with the plasmids pCI, pCI-S, pCI-S-CTE and pCI-S-WPRE
and analyzed by Western blotting as described in the legend to FIG.
22. The Western blot is visualized by luminescence (ECL+, Amersham)
and acquisition on a digital imaging device (FluorS, BioRad). The
expression levels are indicated according to an arbitrary scale
where the value of 1 represents the level measured after
transfection of the plasmid pCI-S. Two independent experiments were
carried out for each of the two cell types. In experiment 1 on
VeroE6 cells, the transfections were carried out in duplicate and
the results are indicated in the form of the mean and standard
deviation values for the expression levels measured.

Plasmid     Cell    exp. 1      exp. 2
pCI         VeroE6  0.0         0.0
pCI-S       VeroE6  1.0 ± 0.1   1.0
pCI-S-CTE   VeroE6  9.8 ± 0.9   26.4
pCI-S-WPRE  VeroE6  20.1 ± 2.0  42.3
pCI         293T    0.0         0.0
pCI-S       293T    1.0         1.0
pCI-S-CTE   293T    4.6         4.0
pCI-S-WPRE  293T    27.6        12.8
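The fold-increase ranges quoted in the text can be read directly from table X, since the values are already normalized to pCI-S = 1. The sketch below simply takes the minimum and maximum over the two experiments and rounds them; it uses the mean values only (the standard deviations of experiment 1 are ignored), and the names are hypothetical.

```python
# (cell type, plasmid) -> expression relative to pCI-S in exp. 1 and exp. 2
TABLE_X = {
    ("VeroE6", "pCI-S-CTE"): (9.8, 26.4),
    ("VeroE6", "pCI-S-WPRE"): (20.1, 42.3),
    ("293T", "pCI-S-CTE"): (4.6, 4.0),
    ("293T", "pCI-S-WPRE"): (27.6, 12.8),
}


def fold_range(values):
    """Rounded (lowest, highest) fold increase over the experiments."""
    return (round(min(values)), round(max(values)))


FOLD_RANGES = {key: fold_range(values) for key, values in TABLE_X.items()}
```

This gives 10 to 26 (CTE) and 20 to 42 (WPRE) in VeroE6 cells, and 4 to 5 (CTE) and 13 to 28 (WPRE) in 293T cells, in agreement with the ranges stated above.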
[0607] In summary, all these results show that the expression, in
mammalian cells, of the cDNA for the SARS-CoV S under the control
of the RNA polymerase II promoter sequences requires, to be
efficient, the presence of a splice signal and of either of the
sequences WPRE and CTE.
2) Production of Stable Lines Allowing the Expression of SARS-CoV
S
[0608] The cDNA for the SARS-CoV S protein was cloned in the form
of a BamH1-Xho1 fragment into the plasmid pTRIP.DELTA.U3-CMV
containing a defective lentiviral vector TRIP with central DNA flap
(Sirven et al., 2001, Mol. Ther., 3: 438-448) in order to obtain
the plasmid pTRIP-S (FIG. 24). Transient cotransfection according
to Zennou et al. (2000, Cell, 101: 173-185) of this plasmid, of an
encapsidation plasmid (p8.2) and of a plasmid for expression of the
VSV envelope glycoprotein G (pHCMV-G) in 293T cells allowed the
preparation of retroviral pseudoparticles containing the vector
TRIP-S and pseudotyped with the envelope protein G. These
pseudotyped TRIP-S vectors were used to transduce 293T and FRhK-4
cells: no expression of the S protein could be detected by Western
blotting and immunofluorescence in the transduced cells (data not
presented).
[0609] The optimum expression cassettes consisting of the CMV virus
immediate/early promoter, a splice signal, cDNA for S and either of
the posttranscriptional signals WPRE or CTE described above were
then substituted for the EF1.alpha.-EGFP cassette of the defective
lentiviral expression vector with central DNA flap
TRIP.DELTA.U3-EF1.alpha. (Sirven et al., 2001, Mol. Ther., 3:
438-448) (FIG. 25). These substitutions were carried out by a
series of successive subclonings of the S expression cassettes
which were excised from the plasmids pCI-S-CTE (BglII-Apa1) or
pCI-S-WPRE (BglII-Sal1), respectively, and then inserted between the
Mlu1 and Kpn1 sites or the Mlu1 and Xho1 sites, respectively, of the
plasmid TRIP.DELTA.U3-EF1.alpha. in order to obtain the plasmids
pTRIP-SD/SA-S-CTE and pTRIP-SD/SA-S-WPRE, deposited at the CNCM, on
Dec. 1, 2004, under the numbers I-3336 and I-3334, respectively.
Pseudotyped vectors were produced according to Zennou et al. (2000,
Cell, 101: 173-185) and used to transduce 293T cells (10 000 cells)
and FRhK-4 cells (15 000 cells) according to a series of 5
successive transduction cycles with a quantity of vectors
corresponding to 25 ng (TRIP-SD/SA-S-CTE) or 22 ng
(TRIP-SD/SA-S-WPRE) of p24 per cycle.
[0610] The transduced cells were cloned by limiting dilution and a
series of clones were qualitatively analyzed for the expression of
SARS-CoV S by immunofluorescence (data not shown), and then
quantitatively by Western blotting (FIG. 25) with the aid of an
anti-S rabbit polyclonal serum. The results presented in FIG. 25
show that clones 2 and 15 of FRhK4-S-CTE cells transduced with
TRIP-SD/SA-S-CTE and clones 4, 9 and 12 of FRhK4-S-WPRE cells
transduced with TRIP-SD/SA-S-WPRE allow the expression of the
SARS-CoV S at respectively low or moderate levels if they are
compared to those which can be observed during infection with
SARS-CoV.
[0611] In summary, the vectors TRIP-SD/SA-S-CTE and
TRIP-SD/SA-S-WPRE allow the production of stable clones of FRhK-4
cells and similarly 293T cells expressing SARS-CoV S, whereas the
assays carried out with the "parent" vector TRIP-S remained
unsuccessful, which demonstrates the need for a splice signal and
for either of the sequences CTE and WPRE for the production of
stable cell clones expressing the S protein.
[0612] In addition, these modifications of the vector TRIP
(insertion of a splice signal and of a post-transcriptional signal
like CTE and WPRE) could prove advantageous for improving the
expression of other cDNAs than that for S.
[0613] 3) Production of stable lines allowing the expression of a
soluble form of SARS-CoV S. Purification of this recombinant
antigen.
[0614] A cDNA encoding a soluble form of the S protein (Ssol) was
obtained by fusing the sequences encoding the ectodomain of the
protein (amino acids 1 to 1193) with those of a tag (FLAG: DYKDDDDK)
via a BspE1 linker encoding the SG dipeptide. Practically, in order
to obtain the plasmid pcDNA-Ssol, a DNA fragment encoding the
ectodomain of SARS-CoV S was amplified by PCR with the aid of the
oligonucleotides 5'-ATAGGATCCA CCATGTTTAT TTTCTTATTA TTTCTTACTC
TCACT-3' and 5'-ACCTCCGGAT TTAATATATT GCTCATATTT TCCCAA-3' from the
plasmid pcDNA-S, and then inserted between the unique BamH1 and
BspE1 sites of a modified eukaryotic expression plasmid pcDNA3.1(+)
(Clontech) containing the tag sequence FLAG between its BamH1 and
Xho1 sites: TABLE-US-00021
// GGATCC ...nnn... TCC GGA GAT TAT AAA GAT GAC GAC GAT AAA TAA CTCGAG //
   BamH1           S   G   D   Y   K   D   D   D   D   K   ter Xho1
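The reading frame of the linker and tag cassette shown above can be verified by translating its codons. The sketch below is purely illustrative: it uses only the subset of the standard genetic code needed for this cassette, and the names are hypothetical.

```python
# Subset of the standard genetic code sufficient for the linker/tag cassette.
CODONS = {
    "TCC": "S", "GGA": "G", "GAT": "D", "GAC": "D",
    "TAT": "Y", "AAA": "K", "TAA": "*",  # "*" marks the stop codon (ter)
}

# BspE1 linker (TCC GGA) followed by the FLAG coding sequence and stop codon.
CASSETTE = "TCC GGA GAT TAT AAA GAT GAC GAC GAT AAA TAA"


def translate(dna):
    """Translate an in-frame DNA string, ignoring spaces between codons."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


PEPTIDE = translate(CASSETTE)
```

The cassette translates to SGDYKDDDDK followed by a stop codon, i.e. the SG dipeptide encoded by the BspE1 site (TCCGGA) followed by the FLAG tag DYKDDDDK.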
[0615] The Nhe1-Xho1 and BamH1-Xho1 fragments, containing the cDNA
for S, were then excised from the plasmid pcDNA-Ssol, and subcloned
between the corresponding sites of the plasmid pTRIP-SD/SA-S-CTE
and of the plasmid pTRIP-SD/SA-S-WPRE, respectively, in order to
obtain the plasmids pTRIP-SD/SA-Ssol-CTE and pTRIP-SD/SA-Ssol-WPRE,
deposited at the CNCM, on Dec. 1, 2004, under the numbers I-3337
and I-3335, respectively.
[0616] Pseudotyped vectors were produced according to Zennou et al.
(2000, Cell, 101:173-185) and used to transduce FRhK-4 cells (15
000 cells) according to a series of 5 successive transduction
cycles with a quantity of vector corresponding to 24
ng (TRIP-SD/SA-Ssol-CTE) or 40 ng (TRIP-SD/SA-Ssol-WPRE) of p24 per
cycle. The transduced cells were cloned by limiting dilution and a
series of 16 clones transduced with TRIP-SD/SA-Ssol-CTE and of 15
clones with TRIP-SD/SA-Ssol-WPRE were analyzed for the expression
of the Ssol polypeptide by Western blotting visualized with an
anti-FLAG monoclonal antibody (FIG. 26 and data not presented), and
by capture ELISA specific for the Ssol polypeptide which was
developed for this purpose (table XI and data not presented). Part
of the process for selecting the best secretory clones is shown in
FIG. 26. Capture ELISA is based on the use of solid phases coated
with polyclonal antibodies of rabbits immunized with purified and
inactivated SARS-CoV. These solid phases allow the capture of the
Ssol polypeptide secreted into the cellular supernatants, whose
presence is then visualized with a series of steps successively
involving the attachment of an anti-FLAG monoclonal antibody (M2,
SIGMA), of anti-mouse IgG(H+L) biotinylated rabbit polyclonal
antibodies (Jackson) and of a streptavidin-peroxidase conjugate
(Amersham) and then the addition of chromogen and substrate
(TMB+H2O2, KPL).

TABLE-US-00022 TABLE XI Analysis of the expression of the Ssol
polypeptide by cell lines transduced with the lentiviral vectors
TRIP-SD/SA-Ssol-WPRE and TRIP-SD/SA-Ssol-CTE. The secretion of the
Ssol polypeptide was assessed in the supernatant of a series of
cell clones isolated after transduction of FRhK-4 cells with the
lentiviral vectors TRIP-SD/SA-Ssol-WPRE and TRIP-SD/SA-Ssol-CTE.
The supernatants diluted 1/50 were analyzed by a capture ELISA test
specific for SARS-CoV S.

Vector                Clone   OD (450 nm)
Control               --      0.031
TRIP-SD/SA-Ssol-CTE   CTE2    0.547
                      CTE3    0.668
                      CTE9    0.171
                      CTE12   0.208
                      CTE13   0.133
TRIP-SD/SA-Ssol-WPRE  WPRE1   0.061
                      WPRE10  0.134
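The selection of the best secretory clone from table XI can be illustrated by ranking the clones on their signal over the control. The sketch below uses the optical densities from the table; ranking by background-subtracted OD is an assumption made for illustration, since the selection described also relied on Western blotting, and the names are hypothetical.

```python
CONTROL_OD = 0.031

# Clone -> OD (450 nm) of the supernatant diluted 1/50 (values from table XI)
CLONE_OD = {
    "CTE2": 0.547, "CTE3": 0.668, "CTE9": 0.171, "CTE12": 0.208,
    "CTE13": 0.133, "WPRE1": 0.061, "WPRE10": 0.134,
}


def rank_clones(clone_od, background):
    """Sort clones by background-subtracted OD, highest first.

    Returns (clone, net OD, signal-to-background ratio) tuples.
    """
    ranked = sorted(clone_od.items(), key=lambda item: item[1], reverse=True)
    return [
        (name, round(od - background, 3), round(od / background, 1))
        for name, od in ranked
    ]


RANKING = rank_clones(CLONE_OD, CONTROL_OD)
```

On these values the clone CTE3 ranks first, with a signal roughly 20 times that of the control, and WPRE1 ranks last.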
[0617] The cell line secreting the highest quantities of Ssol
polypeptide in the culture supernatant is the FRhK4-Ssol-CTE3 line.
It was subjected to a second series of 5 cycles of transduction
with the vector TRIP-SD/SA-Ssol-CTE under conditions similar to
those described above and then cloned. The subclone secreting the
highest quantities of Ssol was selected by a combination of Western
blot and capture ELISA analysis: it is the subclone FRhK4-Ssol-30,
which was deposited at the CNCM, on Nov. 22, 2004, under the number
I-3325.
[0618] The FRhK4-Ssol-30 line allows the quantitative production
and purification of the recombinant Ssol polypeptide. In a typical
experiment where the experimental conditions for growth, production
and purification were optimized, the cells of the FRhK4-Ssol-30
line are inoculated in standard culture medium (pyruvate-free DMEM
containing 4.5 g/l of glucose and supplemented with 5% FCS, 100
U/ml of penicillin and 100 .mu.g/ml of streptomycin) in the form of
a subconfluent monolayer (1 million cells per each 100 cm.sup.2 in
20 ml of medium). At confluence, the standard medium is replaced
with the secretion medium where the quantity of FCS is reduced to
0.5% and the quantity of medium reduced to 16 ml per each 100
cm.sup.2. The culture supernatant is removed after 4 to 5 days of
incubation at 35.degree. C. and under 5% CO.sub.2. The recombinant
polypeptide Ssol is purified from the supernatant by the succession
of steps of filtration on 0.1 .mu.m polyethersulfone (PES)
membrane, concentration by ultrafiltration on a PES membrane with a
50 kD cut-off, affinity chromatography on anti-FLAG matrix with
elution with a solution of FLAG peptide (DYKDDDDK) at 100 .mu.g/ml
in TBS (50 mM tris, pH 7.4, 150 mM NaCl) and then gel filtration
chromatography in TBS on Sephadex G-75 beads (Pharmacia). The
concentration of the purified recombinant Ssol polypeptide was
determined by micro-BCA test (Pierce) and then its biochemical
characteristics analyzed.
[0619] Analysis by 8% SDS acrylamide gel stained with silver
nitrate demonstrates a predominant polypeptide whose molecular mass
is about 180 kD and whose degree of purity may be evaluated at 98%
(FIG. 27A). Two main peaks are detected by SELDI-TOF mass
spectrometry (Ciphergen): they correspond to singly and doubly
charged forms of a predominant polypeptide whose molecular mass is
thus determined at 182.6.+-.3.7 kD (FIGS. 27B and C). After
transfer onto Prosorb membrane and rinsing in 0.1% TFA, the
N-terminal end of the Ssol polypeptide was sequenced in liquid
phase by Edman degradation on 5 residues (ABI494, Applied
Biosystems) and determined as being SDLDR (FIG. 27D). This
demonstrates that the signal peptide located at the N-terminal end
of the SARS-CoV S protein, composed of aa 1 to 13 (MFIFLLFLTLTSG)
according to an analysis carried out with the software signalP v2.0
(Nielsen et al., 1997, Protein Engineering, 10:1-6), is cleaved
from the mature Ssol polypeptide. The recombinant Ssol polypeptide
therefore consists of amino acids 14 to 1193 of the SARS-CoV S
protein fused at the C-terminal end with a sequence SGDYKDDDDK
containing the sequence of the FLAG tag (underlined). The
difference between the theoretical molar mass of the naked Ssol
polypeptide (132.0 kD) and the real molar mass of the mature
polypeptide (182.6 kD) suggests that the Ssol polypeptide is
glycosylated.
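The mass comparison above can be made explicit. The following is a minimal sketch that uses only the two masses quoted in the text; the function names and the proton-mass constant are illustrative assumptions and are not part of the described method.

```python
PROTON_MASS_DA = 1.00728  # proton mass in daltons (assumption used for the m/z estimate)

def non_protein_fraction(theoretical_kd, measured_kd):
    """Return (extra mass in kD, extra mass as a fraction of the measured mass)."""
    extra = measured_kd - theoretical_kd
    return extra, extra / measured_kd

def expected_mz(mass_da, charge):
    """m/z expected for an ion carrying `charge` added protons."""
    return (mass_da + charge * PROTON_MASS_DA) / charge

# 132.0 kD (theoretical, naked polypeptide) versus 182.6 kD (measured)
extra_kd, fraction = non_protein_fraction(132.0, 182.6)
print(f"extra mass: {extra_kd:.1f} kD ({fraction:.1%} of the mature polypeptide)")

# Singly and doubly charged forms of a 182.6 kD species
for z in (1, 2):
    print(f"z={z}: m/z ~ {expected_mz(182_600, z):,.0f}")
```

Under these assumptions, roughly a quarter of the measured mass is not accounted for by the amino acid sequence, which is the basis for the glycosylation hypothesis stated above.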
[0620] A preparation of purified Ssol polypeptide, whose protein
concentration was determined by micro-BCA test, makes it possible
to prepare a calibration series in order to measure, with the aid
of the capture ELISA test described above, the concentrations of
Ssol present in the culture supernatants and to review the
characteristics of the secretory lines. According to this test, the
FRhK4-Ssol-CTE3 line secretes 4 to 6 .mu.g/ml of polypeptide Ssol while
the FRhK4-Ssol-30 line secretes 9 to 13 .mu.g/ml of Ssol after 4 to 5
days of culture at confluence. In addition, the purification scheme
presented above makes it possible routinely to purify from 1 to 2
mg of Ssol polypeptide per liter of culture supernatant.
EXAMPLE 12
Gene Immunization Involving the SARS-Associated Coronavirus
(SARS-CoV) Spicule (S) Protein
[0621] The effect of a splice signal and of the posttranscriptional
signals WPRE and CTE was analyzed after gene immunization of BALB/c
mice (FIG. 28).
[0622] For that, BALB/c mice were immunized at intervals of 4 weeks
by injecting into the tibialis anterior a saline solution of 50
.mu.g of plasmid DNA of pcDNA-S and pCI-S and, as a control, 50
.mu.g of plasmid DNA of pcDNA-N (directing the expression of
SARS-CoV N) or of pCI-HA (directing the expression of the HA of the
influenza virus A/PR/8/34), and the immune sera were collected 3 weeks
after the 2.sup.nd injection. The presence of antibodies directed
against the SARS-CoV S was assessed by indirect ELISA using as
antigen a lysate of VeroE6 cells infected with SARS-CoV and, as a
control, a lysate of noninfected VeroE6 cells. The anti-SARS-CoV
antibody titers (TI) are calculated as the reciprocal of the
dilution producing a specific OD of 0.5 (difference between OD
measured on a lysate of infected cells and OD measured on a lysate
of noninfected cells) after visualization with an anti-mouse IgG
polyclonal antibody coupled with peroxidase (NA931V, Amersham) and
TMB supplemented with H.sub.2O.sub.2 (KPL) (FIG. 28A).
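The titer definition above (reciprocal of the dilution giving a specific OD of 0.5) can be sketched as follows. The text does not say how the value between two tested dilutions is obtained, so the linear interpolation on log10(dilution) used here is an assumption, as are the function name and the example readings.

```python
import math

def endpoint_titer(dilutions, od_infected, od_noninfected, cutoff=0.5):
    """Reciprocal dilution at which the specific OD falls to `cutoff`.

    dilutions: reciprocal dilutions in increasing order (e.g. 100, 400, ...).
    Specific OD = OD on infected lysate - OD on noninfected lysate.
    Interpolation is linear in log10(dilution), an assumption of this sketch.
    Returns None when the curve never crosses the cutoff in the tested range.
    """
    specific = [i - n for i, n in zip(od_infected, od_noninfected)]
    for k in range(len(dilutions) - 1):
        above, below = specific[k], specific[k + 1]
        if above >= cutoff > below:
            frac = (above - cutoff) / (above - below)
            log_t = math.log10(dilutions[k]) + frac * (
                math.log10(dilutions[k + 1]) - math.log10(dilutions[k]))
            return 10 ** log_t
    return None

# Hypothetical readings for one serum (not data from the study)
dil = [100, 400, 1600, 6400]
titer = endpoint_titer(dil, [1.90, 1.10, 0.45, 0.15], [0.10, 0.10, 0.05, 0.05])
print(f"TI = {titer:.0f}, LOG10(TI) = {math.log10(titer):.2f}")
```

Returning None for sera that never reach the cutoff mirrors the way undetectable responses are reported only as an upper bound on the titer.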
[0623] Under these conditions, the expression plasmid pcDNA-S only
allows the induction of low antibody titers directed against
SARS-CoV S in 3 mice out of 6 (LOG.sub.10(TI)=1.9.+-.0.6) whereas
the plasmid pcDNA-N allows the induction of anti-N antibodies at
high titers (LOG.sub.10(TI)=3.9.+-.0.3) in all the animals, and the
control plasmids (pCI, pCI-HA) do not result in any detectable
antibody (LOG.sub.10(TI)<1.7). The plasmid pCI-S equipped with a
splice signal allows the induction of antibodies at high titers
(LOG.sub.10(TI)=3.7.+-.0.2), which are approximately 60 times
higher than those observed after injection of the plasmid pcDNA-S
(p<10.sup.-5).
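The "approximately 60 times" figure follows from the difference between the two mean log titers. A minimal check, assuming the quoted means are compared directly (that is, a ratio of geometric mean titers); the function name is illustrative:

```python
def fold_difference(log10_titer_a, log10_titer_b):
    """Ratio of two titers given as log10 values."""
    return 10 ** (log10_titer_a - log10_titer_b)

# Mean log10 titers quoted in the text: pCI-S (3.7) versus pcDNA-S (1.9)
print(f"{fold_difference(3.7, 1.9):.0f}-fold")
```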
[0624] The efficiency of the posttranscriptional signals was
studied by carrying out a dose-response study of the anti-S
antibody titers induced in the BALB/c mouse as a function of the
quantity of plasmid DNA used as immunogen (2 .mu.g, 10 .mu.g and 50
.mu.g). This study (FIG. 28B) demonstrates that the
posttranscriptional signal WPRE greatly improves the efficiency of
gene immunization when small doses of DNA are used (p<10.sup.-5
for a dose of 2 .mu.g of DNA and p<10.sup.-2 for a dose of 10
.mu.g), whereas the effect of the CTE signal remains marginal
(p=0.34 for a dose of 2 .mu.g of DNA).
[0625] Finally, the antibodies induced in mice after gene
immunization neutralize the infectivity of SARS-CoV in vitro (FIGS.
29A and 29B) at titers which are consistent with the titers
measured by ELISA.
[0626] In summary, the use of a splice signal and of the
posttranscriptional signal WPRE of the woodchuck hepatitis virus
considerably improves the induction of neutralizing antibodies
directed against SARS-CoV after gene immunization with the aid of
plasmid DNA directing the expression of the cDNA for SARS-CoV
S.
EXAMPLE 13
Diagnostic Applications of the S Protein
[0627] The ELISA reactivity of the recombinant Ssol polypeptide was
analyzed with respect to sera from patients suffering from
SARS.
[0628] The sera from probable cases of SARS tested were chosen on
the basis of the results (positive or negative) of analysis of
their specific reactivity toward the native antigens of SARS-CoV by
immunofluorescence test on VeroE6 cells infected with SARS-CoV
and/or by indirect ELISA test using as antigen a lysate of VeroE6
cells infected with SARS-CoV. The sera of these patients are
identified by a serial number of the National Reference Center for
Influenza Viruses and by the initials of the patient and the number
of days elapsed since the onset of the symptoms. All the sera of
probable cases (cf. Table XII) recognize the native antigens of
SARS-CoV, with the exception of the serum 032552 of the patient VTT
for whom infection with SARS-CoV could not be confirmed by RT-PCR
performed on respiratory samples of days 3, 8 and 12. A panel of
control sera (TV sera) was also tested: these sera were collected
in France before the SARS epidemic that occurred in 2003.

TABLE XII
Sera of probable cases of SARS

Serum     Patient   Sample collection day
031724    JYK        7
033168    JYK       38
033597    JYK       74
032632    NTM       17
032634    THA       15
032541    PHV       10
032542    NIH       17
032552    VTT        8
032633    PTU       16
032791    JLB        3
033258    JLB       27
032703    JCM        8
033153    JCM       29

[0629] Solid phases sensitized with the recombinant Ssol
polypeptide were prepared by adsorption of a solution of purified
Ssol polypeptide at 2 .mu.g/ml in PBS in the wells of an ELISA
plate, and then the plates are incubated overnight at 4.degree. C.
and washed with PBS-Tween buffer (PBS, 0.1% Tween 20). After
saturating the ELISA plates with a solution of PBS-10% skimmed milk
(weight/volume) and washing in PBS-Tween, the sera to be tested
(100 .mu.l) are diluted 1/400 in PBS skimmed milk-Tween buffer
(PBS, 3% skimmed milk, 0.1% Tween) and then added to the wells of
the sensitized ELISA plate. The plates are incubated for 1 h at
37.degree. C. After 3 washings with PBS-Tween buffer, the
anti-human IgG conjugate labeled with peroxidase (ref. NA933V,
Amersham) diluted 1/4000 in PBS-skimmed milk-Tween buffer is added,
and then the plates are incubated for 1 hour at 37.degree. C. After
6 washings with PBS-Tween buffer, the chromogen (TMB) and the
substrate (H.sub.2O.sub.2) are added and the plates are incubated
for 10 minutes protected from light. The reaction is stopped by
adding a 1 N H.sub.3PO.sub.4 solution, and then the absorbance is
measured at 450 nm with a reference at 620 nm.
[0630] The ELISA tests (FIG. 30) demonstrate that the recombinant
Ssol polypeptide is specifically recognized by the serum antibodies
of patients suffering from SARS collected at the medium or late
phase of infection (.gtoreq.10 days after the onset of the
symptoms) whereas it is not significantly recognized by the serum
antibodies of 2 patients (JLB and JCM) collected in the early phase
of infection (3 to 8 days after the onset of the symptoms) or by
control sera of subjects not suffering from SARS. The serum
antibodies of patients JLB and JCM show a seroconversion between
days 3 and 27 for the first and 8 and 29 for the second after the
onset of the symptoms, which confirms the specificity of the
reactivity of these sera toward the Ssol polypeptide.
[0631] In conclusion, these results demonstrate that the
recombinant Ssol polypeptide may be used as an antigen for the
development of an ELISA test for serological diagnosis of infection
with SARS-CoV.
EXAMPLE 14
Vaccine Applications of the Recombinant Soluble S Protein
[0632] The immunogenicity of the recombinant Ssol polypeptide was
studied in mice.
[0633] For that, a group of 6 mice was immunized at 3-week
intervals with 10 .mu.g of recombinant Ssol polypeptide adjuvanted
with 1 mg of aluminum hydroxide (Alu-gel-S, Serva) diluted in PBS.
Three successive immunizations were performed and the immune sera
were collected 3 weeks after each of the immunizations (IS1, IS2,
IS3). As a control, a group of mice (mock group) received aluminum
hydroxide alone according to the same protocol.
[0634] The immune sera were analyzed per pool for each of the 2
groups by indirect ELISA using a lysate of VeroE6 cells infected
with SARS-CoV as antigen and as a control a lysate of noninfected
VeroE6 cells. The anti-SARS-CoV antibody titers are calculated as
the reciprocal of the dilution producing a specific OD of 0.5 after
visualization with an anti-mouse IgG(H+L) polyclonal antibody
coupled with peroxidase (NA931V, Amersham) and TMB supplemented
with H.sub.2O.sub.2 (KPL). This analysis (FIG. 31) shows that the
immunization with the Ssol polypeptide induces in mice, from the
first immunization, antibodies directed against the native form of
the SARS-CoV spicule protein present in the lysate of infected
VeroE6 cells. After 2 then 3 immunizations, the anti-S antibody
titers become very high.
[0635] The immune sera were analyzed per pool for each of the two
groups for their capacity to seroneutralize the infectivity of
SARS-CoV. 4 points of seroneutralization on FRhK-4 cells (100
TCID50 of SARS-CoV) are produced for each of the 2-fold dilutions
tested from 1/20. The seroneutralizing titer is calculated
according to the Reed and Muench method as the reciprocal of the
dilution neutralizing the infectivity of 2 wells out of 4. This
analysis shows that the antibodies induced in mice by the Ssol
polypeptide are neutralizing: the titers observed are very high
after 2 and then 3 immunizations (greater than 2560 and 5120
respectively, table XIII).

TABLE XIII
Induction of antibodies directed against SARS-CoV after immunization
with the recombinant Ssol polypeptide. The immune sera were analyzed
per pool for each of the two groups for their capacity to
seroneutralize the infectivity of 100 TCID50 of SARS-CoV on FRhK-4
cells. 4 points are produced for each of the 2-fold dilutions tested
from 1/20. The seroneutralizing titer is calculated according to the
Reed and Muench method as the reciprocal of the dilution neutralizing
the infectivity of 2 wells out of 4.

Group    Sera    Neutralizing Ab
Mock     pi      <20
         IS1     <20
         IS2     <20
         IS3     <20
Ssol     pi      <20
         IS1     57
         IS2     >2560
         IS3     >5120

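The endpoint calculation described for the seroneutralization test can be sketched as follows. This is an illustrative implementation of the cumulative Reed and Muench method for 4 wells per 2-fold dilution; the function name and the well counts in the example are hypothetical and are not data from the study.

```python
import math

def reed_muench_titer(dilutions, protected, wells_per_dilution=4):
    """50% neutralization endpoint by the Reed and Muench cumulative method.

    dilutions: reciprocal serum dilutions in increasing order (20, 40, 80, ...).
    protected: number of wells without infection at each dilution.
    Returns the reciprocal dilution protecting 50% of wells, or None when the
    50% point lies outside the tested range.
    """
    unprotected = [wells_per_dilution - p for p in protected]
    n = len(dilutions)
    # A well protected at a high dilution would also be protected at lower ones.
    cum_prot = [sum(protected[i:]) for i in range(n)]
    # A well infected at a low dilution would also be infected at higher ones.
    cum_unprot = [sum(unprotected[:i + 1]) for i in range(n)]
    pct = [100.0 * p / (p + u) for p, u in zip(cum_prot, cum_unprot)]
    for i in range(n - 1):
        if pct[i] >= 50.0 > pct[i + 1]:
            distance = (pct[i] - 50.0) / (pct[i] - pct[i + 1])
            step = math.log10(dilutions[i + 1] / dilutions[i])
            return 10 ** (math.log10(dilutions[i]) + distance * step)
    return None

# Hypothetical plate: 4, 3, 1 and 0 protected wells at 1/20 to 1/160
print(reed_muench_titer([20, 40, 80, 160], [4, 3, 1, 0]))
```

A serum that protects all wells at every tested dilution returns None here, which corresponds to the "greater than" entries of the table: the endpoint lies beyond the last dilution tested.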
[0636] The neutralizing titers observed in mice immunized with the
Ssol polypeptide reach levels far greater than the titers observed
by Yang et al. in mice (2004, Nature, 428:561-564) and those
observed by Buchholz in the hamster (2004, PNAS 101:9804-9809)
which protect respectively mice and hamsters from infection with
SARS-CoV. It is therefore probable that the neutralizing antibodies
induced in mice after immunization with the Ssol polypeptide
protect these animals against infection with SARS-CoV.
EXAMPLE 15
Optimized Synthetic Gene for the Expression in Mammalian Cells of
the SARS-Associated Coronavirus (SARS-CoV) Spicule (S) Protein
1) Design of the Synthetic Gene
[0637] A synthetic gene encoding the SARS-CoV spicule protein was
designed from the gene of the isolate 031589 (plasmid pSARS-S,
C.N.C.M. No. I-3059) so as to allow high levels of expression in
mammalian cells and in particular in cells of human origin.
[0638] For that:
[0639] the codon usage of the wild-type gene of the isolate 031589
was modified so as to approach the bias observed in humans and to
improve the efficiency of translation of the corresponding mRNA;
[0640] the overall GC content of the gene was increased so as to
extend the half-life of the corresponding mRNA;
[0641] the possibly cryptic motifs capable of interfering with
efficient expression of the gene were deleted (splice donor and
acceptor sites, polyadenylation signals, sequences very rich (>80%)
or very poor (<30%) in GC, repeat sequences, sequences involved in
the formation of secondary RNA structures, TATA boxes);
[0642] a second STOP codon was added to allow efficient termination
of translation.
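One of the design criteria listed above, the removal of stretches that are very rich (>80%) or very poor (<30%) in GC, lends itself to a simple scan. The sketch below is not the tool used to design gene 040530; the window length, the function names and the toy sequence are assumptions, and only the two thresholds come from the text.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def extreme_gc_windows(seq, window=50, low=0.30, high=0.80):
    """Yield (start, gc) for each window whose GC fraction is < low or > high.

    The 30% and 80% thresholds come from the text; the window length is an
    assumption of this sketch.
    """
    for start in range(0, len(seq) - window + 1):
        gc = gc_fraction(seq[start:start + window])
        if gc < low or gc > high:
            yield start, gc

# Toy sequence (hypothetical): an AT-rich stretch followed by a GC-rich one
toy = "AT" * 30 + "GC" * 30
flagged = list(extreme_gc_windows(toy, window=20))
print(f"overall GC = {gc_fraction(toy):.2f}, flagged windows = {len(flagged)}")
```

The toy sequence has an unremarkable overall GC content but many extreme local windows, which is why a sliding window rather than a global figure is used in this sketch.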
[0643] In addition, CpG motifs were introduced into the gene so as
to increase its immunogenicity as DNA vaccine. In order to
facilitate the manipulation of the synthetic gene, two BamH1 and
Xho1 restriction sites were placed on either side of the open
reading frame of the S protein, and the BamH1, Xho1, Nhe1, Kpn1,
BspE1 and Sal1 restriction sites were avoided in the synthetic
gene.
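The requirement that the six named restriction sites be absent from the coding sequence, with BamH1 and Xho1 present only at the flanks, can be checked mechanically. In this sketch the recognition sequences are the standard ones for these enzymes; the function name and the short placeholder insert are hypothetical and do not reproduce SEQ ID No: 140.

```python
# Recognition sequences of the six enzymes named in the text (all palindromic,
# so scanning one strand is sufficient).
SITES = {
    "BamH1": "GGATCC",
    "Xho1": "CTCGAG",
    "Nhe1": "GCTAGC",
    "Kpn1": "GGTACC",
    "BspE1": "TCCGGA",
    "Sal1": "GTCGAC",
}

def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based positions]} for every site present in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        pos = seq.find(motif)
        while pos != -1:
            positions.append(pos)
            pos = seq.find(motif, pos + 1)
        if positions:
            hits[enzyme] = positions
    return hits

# Hypothetical insert: flanking BamH1/Xho1 sites around a short placeholder ORF
# that ends with two stop codons
orf = "ATGTTCATCTTCCTGTAATGA"
construct = "GGATCC" + "ACC" + orf + "CTCGAG"
print("internal sites:", find_sites(orf))
print("construct sites:", find_sites(construct))
```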
[0644] The sequence of the synthetic gene designed (gene 040530) is
given in SEQ ID No: 140.
[0645] An alignment of the synthetic gene 040530 with the sequence
of the wild-type gene of the isolate 031589 of SARS-CoV deposited
at the C.N.C.M. under the number I-3059 (SEQ ID No: 4, plasmid
pSARS-S) is presented in FIG. 32.
2) Plasmid Constructs
[0646] The synthetic gene SEQ ID No: 140 was assembled from
synthetic oligonucleotides and cloned between the Kpn1 and Sac1
sites of the plasmid pUC-Kana in order to give the plasmid
040530pUC-Kana. The nucleotide sequence of the insert of the
plasmid 040530pUC-Kana was verified by automated sequencing
(Applied).
[0647] A Kpn1-Xho1 fragment containing the synthetic gene 040530
was excised from the plasmid 040530pUC-Kana and subcloned between
the Nhe1 and Xho1 sites of the expression plasmid pCI (Promega) in
order to obtain the plasmid pCI-SSYNTH, deposited at the CNCM on
Dec. 1, 2004, under the number I-3333.
[0648] A synthetic gene encoding the soluble form of the S protein
was then obtained by fusing the synthetic sequences encoding the
ectodomain of the S protein (amino acids 1 to 1193) with those of
the tag (FLAG:DYKDDDDK) via a linker BspE1 encoding the dipeptide
SG. Practically, a DNA fragment encoding the ectodomain of the
SARS-CoV S was amplified by PCR with the aid of the
oligonucleotides 5'-ACTAGCTAGCGGATCCACCATGTTCATCTTCCTG-3' and
5'-AGTATCCGGACTTGATGTACTGCTCGTACTTGC-3' from the plasmid
040530pUC-Kana, digested with Nhe1 and BspE1 and then inserted
between the unique Nhe1 and BspE1 sites of the plasmid pCI-Ssol, to
give the plasmid pCI-SCUBE, deposited at the CNCM on Dec. 1, 2004,
under the number I-3332. The plasmids pCI-Ssol, pCI-Ssol-CTE, and
pCI-Ssol-WPRE (deposited at the CNCM, on Nov. 22, 2004, under the
number I-3324) had been previously obtained by subcloning the
Kpn1-Xho1 fragment excised from the plasmid pcDNA-Ssol (see
technical note of DI 2004-106) between the Nhe1 and Xho1 sites of
the plasmids pCI, pCI-S-CTE and pCI-S-WPRE respectively.
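The two PCR primers quoted above can be inspected to see how they implement the construction: the forward primer carries Nhe1 and BamH1 sites ahead of the start codon, and the reverse primer places a BspE1 site (encoding the SG dipeptide) in frame after the last ectodomain codons. The sketch below uses the standard genetic code; the helper names are illustrative, and the reading frame of the reverse primer is inferred from the position of the BspE1 site, which is an assumption of this example.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons of seq, starting at its first base."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

forward = "ACTAGCTAGCGGATCCACCATGTTCATCTTCCTG"
reverse = "AGTATCCGGACTTGATGTACTGCTCGTACTTGC"

# Forward primer: Nhe1 and BamH1 sites, then the start of the open reading frame
start = forward.find("ATG")
n_term = translate(forward[start:])
print("Nhe1 site:", "GCTAGC" in forward, "| BamH1 site:", "GGATCC" in forward)
print("N-terminal codons:", n_term)

# Reverse primer: read the coding strand up to the end of the BspE1 site
coding = revcomp(reverse)
end = coding.find("TCCGGA") + 6
c_term = translate(coding[end % 3:end])
print("C-terminal codons:", c_term)
```

The N-terminal residues obtained this way agree with the first residues of the signal peptide quoted earlier in the text (MFIFLLFLTLTSG), and the last two codons of the reverse primer region encode the SG linker.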
[0649] The plasmids pCI-Scube and pCI-Ssol encode the same
recombinant Ssol polypeptide.
3) Results
[0650] The capacity of the synthetic gene encoding the S protein to
efficiently direct the expression of the SARS-CoV S in mammalian
cells was compared with that of the wild-type gene after transient
transfection of primate cells (VeroE6) and of human cells
(293T).
[0651] In the experiment presented in FIG. 33 and in table XIV,
monolayers of 5.times.10.sup.5 VeroE6 cells or 7.times.10.sup.5
293T cells in 35 mm Petri dishes were transfected with 2 .mu.g of
plasmids pCI (as control), pCI-S, pCI-S-CTE, pCI-S-WPRE and
pCI-Ssynth and 6 .mu.l of Fugene6 reagent according to the
manufacturer's instructions (Roche). After 48 hours of incubation
at 37.degree. C. and under 5% CO.sub.2, cell extracts were prepared
in loading buffer according to Laemmli, separated on 8% SDS
polyacrylamide gel and then transferred onto a PVDF membrane
(BioRad). The detection of this immunoblot (Western blot) was
carried out with the aid of an anti-S rabbit polyclonal serum
(immune serum of the rabbit P11135: cf example 4 above) and of
donkey polyclonal antibodies directed against rabbit IgGs and
coupled with peroxidase (NA934V, Amersham). The immunoblot was
quantitatively visualized by luminescence with the aid of the ECL+
kit (Amersham) and acquisition on a digital imaging device (Fluor
S, BioRad).
[0652] The analysis of the results obtained with the software
QuantityOne v4.2.3 (BioRad) shows that in this experiment, the
plasmid pCI-Ssynth allows the transient expression of the S protein
at high levels in the VeroE6 and 293T cells, whereas the plasmid
pCI-S does not make it possible to induce expression at sufficient
levels to be detected. The expression levels observed are of the
order of twice as high as those observed with the plasmid
pCI-S-WPRE.

TABLE XIV
Use of a synthetic gene for the expression of the SARS-CoV S. Cell
extracts prepared 48 hours after transfection of VeroE6 or 293T cells
with the plasmids pCI, pCI-S, pCI-S-CTE, pCI-S-WPRE and pCI-Ssynth
were separated on 8% SDS acrylamide gel and analyzed by Western
blotting with the aid of an anti-S rabbit polyclonal antibody and an
anti-rabbit IgG (H + L) polyclonal antibody coupled with peroxidase
(NA934V, Amersham). The Western blot is visualized by luminescence
(ECL+, Amersham) and acquisition on a digital imaging device (FluorS,
BioRad). The expression levels of the S protein were measured by
quantifying the two predominant bands identified on the image (see
FIG. 33) and are indicated according to an arbitrary scale where the
value 1 represents the level measured after transfection of the
plasmid pCI-S-WPRE.

Plasmid        VeroE6         293T
pCI            0.0            0.0
pCI-S          .ltoreq.0.1    .ltoreq.0.1
pCI-S-CTE      0.5            .ltoreq.0.1
pCI-S-WPRE     1.0            1.0
pCI-Ssynth     1.8            1.9

[0653] In a second instance, the capacity of the synthetic gene
Scube to efficiently direct the synthesis and the secretion of the
Ssol polypeptide by mammalian cells was compared with that of the
wild-type gene after transient transfection of hamster cells
(BHK-21) and of human cells (293T).
[0654] In the experiment presented in table XV, monolayers of
6.times.10.sup.5 BHK-21 cells and 7.times.10.sup.5 293T cells in 35
mm Petri dishes were transfected with 2 .mu.g of plasmids pCI (as
control), pCI-Ssol, pCI-Ssol-CTE, pCI-Ssol-WPRE and pCI-Scube and 6
.mu.l of Fugene6 reagent according to the manufacturer's
instructions (Roche). After 48 hours of incubation at 37.degree. C.
and under 5% CO.sub.2, the cellular supernatants were collected and
quantitatively analyzed for the secretion of the Ssol polypeptide
by a capture ELISA test specific for the Ssol polypeptide.
[0655] Analysis of the results shows that, in this experiment, the
plasmid pCI-Scube allows the expression of the Ssol polypeptide at
levels 8 times (BHK-21 cells) to 20 times (293T cells) higher than
the plasmid pCI-Ssol.
[0656] The levels of expression observed are of the order of twice
(293T cells) to 5 times (BHK-21 cells) as high as those observed
with the plasmid pCI-Ssol-WPRE.

TABLE XV
Use of a synthetic gene for the expression of the Ssol polypeptide.
The supernatants were harvested 48 hours after transfection of BHK or
293T cells with the plasmids pCI, pCI-Ssol, pCI-Ssol-CTE,
pCI-Ssol-WPRE and pCI-Scube and quantitatively analyzed for the
secretion of the Ssol polypeptide by an ELISA test specific for the
Ssol polypeptide. The transfections were carried out in duplicate and
the results are presented in the form of means and standard
deviations of the concentrations of Ssol polypeptide (ng/ml) measured
in the supernatants.

Plasmid          BHK             293T
pCI              <20             <20
pCI-Ssol         <20             56 .+-. 10
pCI-Ssol-CTE     <20             63 .+-. 8
pCI-Ssol-WPRE    28 .+-. 1       531 .+-. 15
pCI-Scube        152 .+-. 6      1140 .+-. 20

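The fold increases quoted for pCI-Scube can be recomputed from the means in Table XV. In this sketch the dictionary layout and function name are illustrative; values reported only as <20 ng/ml are treated as at most 20 ng/ml, so the corresponding ratio is a lower bound rather than an exact figure.

```python
# Mean Ssol concentrations (ng/ml) from Table XV; None marks values reported
# only as below the 20 ng/ml figure shown in the table.
TABLE_XV = {
    "pCI-Ssol":      {"BHK": None,  "293T": 56.0},
    "pCI-Ssol-WPRE": {"BHK": 28.0,  "293T": 531.0},
    "pCI-Scube":     {"BHK": 152.0, "293T": 1140.0},
}
LIMIT = 20.0

def fold(numerator, denominator, cell):
    """Return (fold increase, is_lower_bound) for two plasmids in one cell type."""
    num = TABLE_XV[numerator][cell]
    den = TABLE_XV[denominator][cell]
    if den is None:
        return num / LIMIT, True
    return num / den, False

for ref in ("pCI-Ssol", "pCI-Ssol-WPRE"):
    for cell in ("BHK", "293T"):
        value, bound = fold("pCI-Scube", ref, cell)
        print(f"pCI-Scube vs {ref} in {cell}: {'>' if bound else ''}{value:.1f}x")
```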
[0657] In summary, these results show that the expression, in
mammalian cells, of the synthetic gene 040530 encoding SARS-CoV S
under the control of RNA polymerase II promoter sequences is much
more efficient than that of the wild-type gene of the 031589
isolate. This expression is even more efficient than that directed
by the wild-type gene in the presence of the WPRE sequences of the
woodchuck hepatitis virus.
4) Applications
[0658] The use of the synthetic gene 040530 encoding SARS-CoV S or
its Scube variant encoding the polypeptide Ssol is capable of
advantageously replacing the wild-type gene in numerous
applications where the expression of S is necessary at high levels.
In particular in order to:
[0659] improve the efficiency of gene immunization with plasmids of
the pCI-Ssynth or even pCI-Ssynth-CTE or pCI-Ssynth-WPRE type
[0660] establish novel cell lines expressing higher quantities of the
S protein or of the Ssol polypeptide with the aid of recombinant
lentiviral vectors carrying the Ssynth gene or the Scube gene
respectively
[0661] improve the immunogenicity of the recombinant lentiviral
vectors allowing the expression of the S protein or of the Ssol
polypeptide
[0662] improve the immunogenicity of live vectors allowing the
expression of the S protein or of the Ssol polypeptide like
recombinant vaccinia viruses or recombinant measles viruses (see
examples 16 and 17 below)
EXAMPLE 16
Expression of the SARS-Associated Coronavirus (SARS-CoV) Spicule
(S) Protein with the Aid of Recombinant Vaccinia Viruses
[0662] Vaccine Application
Application to the Production of a Soluble form of the Spicule (S)
Protein and Design of a Serological Test for SARS
1) Introduction
[0663] The aim of this example is to evaluate the capacity of
recombinant vaccinia viruses (VV) expressing various
SARS-associated coronavirus (SARS-CoV) antigens to constitute novel
vaccine candidates against SARS and a means of producing
recombinant antigens in mammalian cells.
[0664] For that, the inventors focused on the SARS-CoV spicule (S)
protein which makes it possible to induce, after gene immunization
in animals, antibodies neutralizing the infectivity of SARS-CoV,
and a soluble and secreted form of this protein, the Ssol
polypeptide, which is composed of the ectodomain (aa 1-1193) of S
fused at its C-ter end with a tag FLAG (DYKDDDDK) via a BspE1
linker encoding the SG dipeptide. This Ssol polypeptide exhibits an
antigenicity similar to that of the S protein and allows, after
injection into mice in the form of a purified protein adjuvanted
with aluminum hydroxide, the induction of high neutralizing
antibody titers against SARS-CoV.
[0665] The various forms of the S gene were placed under the
control of the promoter of the 7.5K gene and then introduced into
the thymidine kinase (TK) locus of the Copenhagen strain of the
vaccinia virus by double homologous recombination in vivo. In order
to improve the immunogenicity of the recombinant vaccinia viruses,
a synthetic late promoter was chosen in place of the 7.5K promoter,
in order to increase the production of S and Ssol during the late
phases of the viral cycle.
[0666] After having isolated the recombinant vaccinia viruses and
verified their capacity to express the SARS-CoV S antigen, their
capacity to induce in mice an immune response against SARS was
tested. After having purified the Ssol antigen from the supernatant
of infected cells, an ELISA test for serodiagnosis of SARS was
designed, and its efficiency was evaluated with the aid of sera
from probable cases of SARS.
2) Construction of the Recombinant Viruses
[0667] Recombinant vaccinia viruses directing the expression of the
S glycoprotein of the 031589 isolate of SARS-CoV and of a soluble
and secreted form of this protein, the Ssol polypeptide, under the
control of the 7.5K promoter were obtained. With the aim of
increasing the levels of expression of S and Ssol, recombinant
viruses in which the cDNAs for S and for Ssol are placed under the
control of a late synthetic promoter were also obtained.
[0668] The plasmid pTG186poly is a transfer plasmid for the
construction of recombinant vaccinia viruses (Kieny, 1986,
Biotechnology, 4:790-795). As such, it contains the VV thymidine
kinase gene into which the promoter of the 7.5K gene has been
inserted followed by a multiple cloning site allowing the insertion
of heterologous genes (FIG. 34A). The promoter of the 7.5K gene in
fact contains a tandem of two promoter sequences that are
respectively active during the early (P.sub.E) and late (P.sub.L)
phases of the vaccinia virus replication cycle. The BamH1-Xho1
fragments were excised from the plasmids pTRIP-S and pcDNA-Ssol
respectively and inserted between the BamH1 and Sma1 sites of the
plasmid pTG186poly in order to give the plasmids pTG-S and pTG-Ssol
(FIG. 34A). The plasmids pTG-S and pTG-Ssol were deposited at the
CNCM, on Dec. 2, 2004, under the numbers I-3338 and I-3339,
respectively.
[0669] The plasmids pTN480, pTN-S and pTN-Ssol were obtained from
the plasmids pTG186poly, pTG-S and pTG-Ssol respectively, by
substituting the Nde1-Pst1 fragment containing the 7.5K promoter by
a DNA fragment containing the synthetic late promoter 480, which
was obtained by hybridization of the oligonucleotides 5'-TATGAGCTTT
TTTTTTTTTT TTTTTTTGGC ATATAAATAG ACTCGGCGCG CCATCTGCA-3' and
5'-GATGGCGCGCCGAGTCTATT TATATGCCAA AAAAAAAAAA AAAAAAAAGC TCA-3'
(FIG. 34B). The insert was sequenced with the aid of a BigDye
Terminator v1.1 kit (Applied Biosystems) and an automated sequencer
ABI377. The sequence of the late synthetic promoter 480 as cloned
into the transfer plasmids of the pTN series is indicated in FIG.
34C. The plasmids pTN-S and pTN-Ssol were deposited at the CNCM, on
Dec. 2, 2004, under the numbers I-3340 and I-3341,
respectively.
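The two oligonucleotides hybridized to form the promoter 480 fragment can be checked for complementarity and for the single-stranded ends they leave. The sketch below uses the sequences exactly as printed above with the spaces removed; the variable and function names are illustrative. The interpretation of the ends as compatible with Nde1-cut and Pst1-cut DNA rests on the standard cleavage patterns of these enzymes and is stated here as an assumption rather than taken from the text.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

# The two oligonucleotides as printed in the text (spaces removed)
top = ("TATGAGCTTT" "TTTTTTTTTT" "TTTTTTTGGC"
       "ATATAAATAG" "ACTCGGCGCG" "CCATCTGCA")
bottom = ("GATGGCGCGCCGAGTCTATT" "TATATGCCAA"
          "AAAAAAAAAA" "AAAAAAAAGC" "TCA")

duplex = revcomp(bottom)          # region of `top` paired with `bottom`
offset = top.find(duplex)         # -1 would mean the oligos do not anneal fully
five_prime_overhang = top[:offset]
three_prime_overhang = top[offset + len(duplex):]

print("paired length:", len(duplex))
print("5' extension of top strand:", five_prime_overhang)
print("3' extension of top strand:", three_prime_overhang)
```

The bottom oligonucleotide pairs over its whole length, and the top strand extends beyond it at both ends, which is the arrangement expected for a fragment designed to be ligated directly in place of the Nde1-Pst1 fragment.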
[0670] The recombinant vaccinia viruses were obtained by double
homologous recombination in vivo between the TK cassette of the
transfer plasmids of the series pTG and pTN and the TK gene of the
Copenhagen strain of the vaccinia virus according to a procedure
described by Kieny et al. (1984, Nature, 312:163-166). Briefly,
CV-1 cells are transfected with the aid of DOTAP (Roche) with
genomic DNA of the Copenhagen strain of the vaccinia virus and each
of the transfer plasmids of the pTG and pTN series described above,
and then superinfected with the helper vaccinia virus VV-ts7 for 24
hours at 33.degree. C. The helper virus is counter-selected by
incubation at 40.degree. C. for 2 days and then the recombinant
viruses (TK.sup.- phenotype) are selected by two cloning cycles under
agar medium on 143B tk.sup.- cells in the presence of BUdR (25
.mu.g/ml). The 6
viruses VV-TG, VV-TG-S, VV-TG-Ssol, VV-TN, VV-TN-S, and VV-TN-Ssol
are respectively obtained with the aid of the transfer plasmids
pTG186poly, pTG-S, pTG-Ssol, pTN480, pTN-S, pTN-Ssol. The viruses
VV-TG and VV-TN do not express any heterologous gene and were used
as TK.sup.- controls in the experiments. The preparations of
recombinant viruses were performed on monolayers of CV-1 or BHK-21
cells and the titer in plaque forming units (p.f.u.) was determined
on CV-1 cells
according to Earl and Moss (1998, Current Protocols in Molecular
Biology, 16.16.1-16.16.13).
3) Characterization of the Recombinant Viruses
[0671] The expression of the transgenes encoding the S protein and
the Ssol polypeptide was assessed by Western blotting.
[0672] Monolayers of CV-1 cells were infected at a multiplicity of
2 with various recombinant vaccinia viruses VV-TG, VV-TG-S,
VV-TG-Ssol, VV-TN, VV-TN-S and VV-TN-Ssol. After 18 hours of
incubation at 37.degree. C. and under 5% CO.sub.2, cellular extracts
were prepared in loading buffer according to Laemmli, separated on
8% SDS polyacrylamide gel and then transferred onto a PVDF membrane
(BioRad). The detection of this immunoblot (Western blot) was
performed with the aid of an anti-S rabbit polyclonal serum (immune
serum from the rabbit P11135: cf. example 4) and donkey polyclonal
antibodies directed against rabbit IgGs and coupled with peroxidase
(NA934V, Amersham). The bound antibodies were visualized by
luminescence with the aid of the ECL+ kit (Amersham) and
autoradiography films Hyperfilm MP (Amersham).
[0673] As shown in FIG. 35A, the recombinant virus VV-TN-S directs
the expression of the S protein at levels which are comparable to
those which can be observed 8 h after infection with SARS-CoV but
which are much higher than those which can be observed after
infection with VV-TG-S. In a second experiment (FIG. 35B), the
analysis of variable quantities of cellular extracts shows that the
levels of expression observed after infection with viruses of the
TN series (VV-TN-S and VV-TN-Ssol) are about 10 times as high as
those observed with the viruses of the TG series (VV-TG-S and
VV-TG-Ssol, respectively). In addition, the Ssol polypeptide is
secreted into the supernatant of CV-1 cells infected with the
VV-TN-Ssol virus more efficiently than in the supernatant of cells
infected with VV-TG-Ssol (FIG. 36A). In this experiment, the
VV-TN-Sflag virus was used as a control because it expresses the
membrane form of the S protein fused at its C-ter end with the FLAG
tag. The Sflag protein is not detected in the supernatant of cells
infected with VV-TN-Sflag, demonstrating that the Ssol polypeptide
is indeed actively secreted after infection with VV-TN-Ssol.
[0674] These results demonstrate that the recombinant vaccinia
viruses are indeed carriers of the transgenes and allow the
expression of the SARS-CoV glycoprotein in its membrane form (S) or
in a soluble, secreted form (Ssol). The vaccinia viruses carrying
the synthetic promoter 480 allow the expression of S and the
secretion of Ssol at levels much higher than the viruses carrying
the promoter of the 7.5K gene.
4) Application to the Production of a Soluble Form of SARS-CoV S.
Purification of this Recombinant Antigen and Diagnostic
Applications
[0675] The BHK-21 line is the cell line which secretes the highest
quantities of Ssol polypeptide after infection with the VV-TN-Ssol
virus among the lines tested (BHK-21, CV-1, 293T and FRhK-4, FIG.
36B); it allows the quantitative production and purification of the
recombinant Ssol polypeptide. In a typical experiment where the
experimental conditions for infection, production and purification
were optimized, the BHK-21 cells are inoculated in standard culture
medium (pyruvate-free DMEM containing 4.5 g/l of glucose and
supplemented with 5% TPB, 5% FCS, 100 U/ml of penicillin and 100
.mu.g/ml of streptomycin) in the form of a subconfluent monolayer
(10 million cells for each 100 cm.sup.2 in 25 ml of medium). After
24 h of incubation at 37.degree. C. under 5% CO.sub.2, the cells
are infected at an M.O.I. of 0.03 and the standard medium replaced
with the secretion medium where the quantity of FCS is reduced to
0.5% and the TPB eliminated. The culture supernatant is removed
after 2.5 days of incubation at 35.degree. C. and under 5% CO.sub.2
and the vaccinia virus inactivated by addition of Triton X-100
(0.1%). After filtration on 0.1 .mu.m polyethersulfone (PES)
membrane, the recombinant Ssol polypeptide is purified by affinity
chromatography on an anti-FLAG matrix with elution with a solution
of FLAG peptide (DYKDDDDK) at 100 .mu.g/ml in TBS (50 mM Tris, pH
7.4, 150 mM NaCl).
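As a rough guide to the inoculum implied by these conditions, the following minimal sketch computes the number of infectious units needed for a given cell count and M.O.I. The function names and the stock titer are illustrative assumptions; only the cell density (10 million cells per 100 cm.sup.2) and the M.O.I. (0.03) come from the text.

```python
def pfu_required(cell_count, moi):
    """Infectious units (p.f.u.) needed to infect `cell_count` cells at `moi`."""
    if cell_count < 0 or moi < 0:
        raise ValueError("cell_count and moi must be non-negative")
    return cell_count * moi


def inoculum_volume_ml(pfu, stock_titer_pfu_per_ml):
    """Volume of virus stock (ml) that contains `pfu` infectious units."""
    if stock_titer_pfu_per_ml <= 0:
        raise ValueError("stock titer must be positive")
    return pfu / stock_titer_pfu_per_ml


# Values from the text: 10 million BHK-21 cells per 100 cm^2, M.O.I. of 0.03.
pfu = pfu_required(cell_count=1e7, moi=0.03)

# Hypothetical stock titer (an assumption; no stock titer is given in the text).
ASSUMED_STOCK_TITER = 1e8  # p.f.u./ml
volume_ml = inoculum_volume_ml(pfu, ASSUMED_STOCK_TITER)

print(f"{pfu:.0f} p.f.u. per 100 cm^2, i.e. {volume_ml * 1000:.1f} ul of the assumed stock")
```

At such a low M.O.I. only a small fraction of the cells is infected initially, which is consistent with the 2.5-day incubation before the supernatant is harvested.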
[0676] The analysis by 8% SDS acrylamide gel stained with silver
nitrate identified a predominant polypeptide whose molecular mass
is about 180 kD and whose degree of purity is greater than 90%
(FIG. 37). The concentration of the purified Ssol recombinant
polypeptide was determined by comparison with molecular mass
markers and estimated at 24 ng/.mu.l.
[0677] This purified Ssol polypeptide preparation makes it possible
to produce a calibration series in order to measure, with the aid
of a capture ELISA test, the Ssol concentrations present in the
culture supernatants. According to this test, the BHK-21 line
secretes about 1 .mu.g/ml of Ssol polypeptide under the production
conditions described above. In addition, the purification scheme
presented makes it possible to purify on the order of 160 .mu.g of
Ssol polypeptide per liter of culture supernatant.
[0678] The ELISA reactivity of the recombinant Ssol polypeptide was
analyzed toward sera from patients suffering from SARS.
[0679] The sera of probable cases of SARS tested were chosen on the
basis of the results (positive or negative) of analysis of their
specific reactivity toward the native antigens of SARS-CoV by
immunofluorescence test on VeroE6 cells infected with SARS-CoV
and/or by indirect ELISA test using, as antigen, a lysate of VeroE6
cells infected with SARS-CoV. The sera of these patients are
identified by a serial number of the National Reference Center for
Influenza Viruses and by the patient's initials and the number of
days elapsed since the onset of the symptoms. All the sera of
probable cases (cf. table XVI) recognize the native antigens of
SARS-CoV with the exception of the serum 032552 of the patient VTT,
for which infection with SARS-CoV could not be confirmed by RT-PCR
performed on respiratory samples of days 3, 8 and 12. A panel of
control sera (TV sera) was also tested: these are sera collected
in France before the SARS epidemic which occurred in 2003.
TABLE-US-00027 TABLE XVI Sera of probable cases of SARS
Serum    Patient   Sample collection day
033168   JYK       38
033597   JYK       74
032632   NTM       17
032634   THA       15
032541   PHV       10
032542   NIH       17
032552   VTT       8
032633   PTU       16
[0680] Solid phases sensitized with the recombinant Ssol
polypeptide were prepared by adsorption of a solution of purified
Ssol polypeptide at 4 .mu.g/ml in PBS in the wells of an ELISA
plate. The plates are incubated overnight at 4.degree. C. and then
washed with PBS-Tween buffer (PBS, 0.1% Tween 20). After washing
with PBS-Tween, the sera to be tested (100 .mu.l) are diluted 1/100
and 1/400 in PBS-skimmed milk-Tween buffer (PBS, 3% skimmed milk,
0.1% Tween) and then added to the wells of the sensitized ELISA
plate. The plates are then incubated for 1 h at 37.degree. C. After
3 washings with PBS-Tween buffer, the anti-human IgG conjugate
labeled with peroxidase (ref. NA933V, Amersham) diluted 1/4000 in
PBS-skimmed milk-Tween buffer is added and then the plates are
incubated for one hour at 37.degree. C. After 6 washings with
PBS-Tween buffer, the chromogen (TMB) and the substrate
(H.sub.2O.sub.2) are added and the plates are incubated for 10
minutes protected from light. The reaction is stopped by adding a
1M solution of H.sub.3PO.sub.4 and then the absorbance is measured
at 450 nm with a reference at 620 nm.
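The dilutions and the dual-wavelength reading described above can be sketched as follows. This is only an illustration: the helper names are hypothetical, the 100 .mu.l figure is taken here as the final volume per well, and the plate-reader values in the example are invented for demonstration.

```python
def dilution_volumes(final_volume_ul, dilution_factor):
    """Return (sample_ul, diluent_ul) for a 1/dilution_factor dilution."""
    if dilution_factor < 1:
        raise ValueError("dilution_factor must be >= 1")
    sample_ul = final_volume_ul / dilution_factor
    return sample_ul, final_volume_ul - sample_ul


def corrected_absorbance(a450, a620):
    """Signal at 450 nm minus the 620 nm reference reading."""
    return a450 - a620


# Serum dilutions from the text (1/100 and 1/400), assuming 100 ul per well.
for factor in (100, 400):
    serum_ul, buffer_ul = dilution_volumes(100.0, factor)
    print(f"1/{factor}: {serum_ul:.2f} ul serum + {buffer_ul:.2f} ul buffer")

# Coating solution: the purified stock was estimated at 24 ng/ul (= 24 ug/ml)
# and is adsorbed at 4 ug/ml, i.e. a 6-fold dilution in PBS.
coating_factor = 24.0 / 4.0

# Hypothetical plate-reader values, for illustration only.
signal = corrected_absorbance(a450=1.25, a620=0.05)
print(f"coating dilution 1/{coating_factor:.0f}, corrected absorbance {signal:.2f}")
```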
[0681] The ELISA tests (FIG. 38) demonstrate that the recombinant
Ssol polypeptide is specifically recognized by the serum antibodies
of patients suffering from SARS, collected at the middle or late
phase of infection (.gtoreq.10 days after the onset of the
symptoms), whereas it is not significantly recognized by the serum
antibodies of the control sera of subjects not suffering from
SARS.
[0682] In conclusion, these results demonstrate that the
recombinant Ssol polypeptide can be purified from the supernatant
of mammalian cells infected with the recombinant vaccinia virus
VV-TN-Ssol and can be used as antigen for developing an ELISA test
for serological diagnosis of infection with SARS-CoV.
5) Vaccine Applications
[0683] The immunogenicity of the recombinant vaccinia viruses was
studied in mice.
[0684] For that, groups of 7 BALB/c mice were immunized by the i.v.
route twice at 4 weeks' interval with 10.sup.6 p.f.u. of
recombinant vaccinia viruses VV-TG, VV-TG-S, VV-TG-Ssol, VV-TN,
VV-TN-S and VV-TN-Ssol and, as a control, VV-TG-HA which directs the
expression of hemagglutinin of the A/PR/8/34 strain of the
influenza virus. The immune sera were collected 3 weeks after each
of the immunizations (IS1, IS2).
[0685] The immune sera were analyzed per pool for each of the
groups by indirect ELISA using a lysate of VeroE6 cells infected
with SARS-CoV as antigen and, as control, a lysate of noninfected
VeroE6 cells. The anti-SARS-CoV antibody titers (TI) are calculated
as the reciprocal of the dilution producing a specific OD of 0.5
after visualization with an anti-mouse IgG(H+L) polyclonal antibody
coupled with peroxidase (NA931V, Amersham) and TMB supplemented
with H.sub.2O.sub.2 (KPL). This analysis (FIG. 39A) shows that
immunization with the viruses VV-TG-S and VV-TN-S induces in mice,
from the first immunization, antibodies directed against the native
form of the SARS-CoV spicule protein present in the lysate of
infected VeroE6 cells. The responses induced by the VV-TN-S virus
are higher than those induced by the VV-TG-S virus after the first
(TI=740 and TI=270 respectively) and the second (TI=3230 and TI=600
respectively) immunization. The VV-TN-Ssol virus induces high
anti-SARS-CoV antibody titers after two immunizations (TI=640),
whereas the virus VV-TG-Ssol induces a response at the detection
limit (TI=40).
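The titer definition used above (reciprocal of the dilution giving a specific OD of 0.5) can be illustrated by a short interpolation sketch. It assumes that the specific OD is the reading on infected-cell lysate minus the reading on noninfected-cell lysate, and that the curve is interpolated linearly in log10(dilution); neither detail is stated in the text, and the function names are hypothetical.

```python
import math


def specific_od(od_infected, od_uninfected):
    """Assumed definition: signal on infected lysate minus signal on control lysate."""
    return od_infected - od_uninfected


def endpoint_titer(reciprocal_dilutions, specific_ods, cutoff=0.5):
    """Reciprocal dilution at which the specific OD falls to `cutoff`.

    Interpolates linearly in log10(dilution) between the two points that
    bracket the cutoff. Returns None if the curve never crosses the cutoff.
    """
    points = sorted(zip(reciprocal_dilutions, specific_ods))
    for (d_lo, od_lo), (d_hi, od_hi) in zip(points, points[1:]):
        if od_lo >= cutoff > od_hi:
            fraction = (od_lo - cutoff) / (od_lo - od_hi)
            log_titer = math.log10(d_lo) + fraction * (
                math.log10(d_hi) - math.log10(d_lo)
            )
            return 10 ** log_titer
    return None


# Invented readings for illustration only (not data from the experiments).
dilutions = [100, 400, 1600, 6400]
infected = [1.60, 1.10, 0.50, 0.20]
uninfected = [0.10, 0.10, 0.10, 0.10]
ods = [specific_od(i, u) for i, u in zip(infected, uninfected)]
print(endpoint_titer(dilutions, ods))
```

A serum whose specific OD never reaches 0.5 at the first dilution tested returns None here, which corresponds to a response below the detection limit.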
[0686] The immune sera were analyzed per pool for each of the
groups for their capacity to seroneutralize the infectivity of
SARS-CoV. 4 seroneutralization points on FRhK-4 cells (100 TCID50
of SARS-CoV) are produced for each of the 2-fold dilutions tested
from 1/20. The seroneutralizing titer is calculated according to
the Reed and Muench method as the reciprocal of the dilution
neutralizing the infectivity of 2 wells out of 4. This analysis
shows that the antibodies induced in mice by the vaccinia viruses
expressing the S protein or the Ssol polypeptide are neutralizing
and that the viruses with synthetic promoters are more efficient
immunogens than the viruses carrying the 7.5K promoter: the highest
titers (640) are observed after 2 immunizations with the virus
VV-TN-S (FIG. 39B).
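A minimal sketch of a Reed and Muench 50% endpoint calculation for such a series (4 wells per 2-fold dilution, starting at 1/20) is given below. It follows the usual cumulative formulation of the method; the function name is hypothetical and the well counts in the example are invented for illustration, not taken from FIG. 39B.

```python
def reed_muench_titer(reciprocal_dilutions, protected_wells, wells_per_dilution=4):
    """50% neutralizing titer (reciprocal dilution) by the Reed and Muench method.

    `reciprocal_dilutions` must be in increasing order (e.g. 20, 40, 80, ...),
    and `protected_wells` gives the number of wells without infection at each.
    Returns None if the series does not bracket the 50% endpoint.
    """
    n = len(reciprocal_dilutions)
    unprotected = [wells_per_dilution - p for p in protected_wells]
    # A well protected at a high dilution is assumed protected at all lower
    # dilutions, so protected counts accumulate from the most dilute point.
    cum_protected = [sum(protected_wells[i:]) for i in range(n)]
    # Conversely, unprotected counts accumulate from the least dilute point.
    cum_unprotected = [sum(unprotected[: i + 1]) for i in range(n)]
    percent = [
        100.0 * cp / (cp + cu) for cp, cu in zip(cum_protected, cum_unprotected)
    ]
    for i in range(n - 1):
        if percent[i] >= 50.0 > percent[i + 1]:
            # Proportionate distance between the two bracketing dilutions.
            distance = (percent[i] - 50.0) / (percent[i] - percent[i + 1])
            step = reciprocal_dilutions[i + 1] / reciprocal_dilutions[i]
            return reciprocal_dilutions[i] * step ** distance
    return None


# Invented example: 2-fold dilutions from 1/20, 4 wells each.
dilutions = [20, 40, 80, 160, 320]
print(reed_muench_titer(dilutions, [4, 4, 2, 0, 0]))
print(reed_muench_titer(dilutions, [4, 4, 3, 1, 0]))
```

When exactly 2 wells out of 4 are protected at one dilution and the series is otherwise clean, the computed titer coincides with that dilution, matching the definition given in the text.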
[0687] The protective power of the neutralizing antibodies induced
in mice after immunization with the recombinant vaccinia viruses is
evaluated with the aid of a challenge infection with SARS-CoV.
6) Other Applications
[0688] Third generation recombinant vaccinia viruses are
constructed by substituting the wild-type sequences of the S and
Ssol genes by synthetic genes optimized for the expression in
mammalian cells, described above. These recombinant vaccinia
viruses are capable of expressing larger quantities of S and Ssol
antigens and therefore of exhibiting increased immunogenicity.
[0689] The recombinant vaccinia virus VV-TN-Ssol can be used for
the quantitative production and purification of the Ssol antigen
for diagnostic (serology by ELISA) and vaccine (subunit vaccine)
applications.
EXAMPLE 17
Recombinant Measles Virus Expressing the SARS-Associated
Coronavirus (SARS-CoV) Spicule (S) Protein. Vaccine
Applications
1) Introduction
[0690] The measles vaccine (MV) induces a lasting protective
immunity in humans after a single injection (Hilleman, 2002,
Vaccine, 20: 651-665). The protection conferred is very robust and
is based on the induction of an antibody response and of a CD4 and
CD8 cell response. The MV genome is very stable and no reversion of
the vaccine strains to virulence has ever been observed. The
measles virus belongs to the genus Morbillivirus of the
Paramyxoviridae family; it is an enveloped virus whose genome is a
16 kb single-stranded RNA of negative polarity (FIG. 40A) and whose
exclusively cytoplasmic replication cycle excludes any possibility
of integration into the genome of the host. The measles vaccine is
thus one of the most effective and one of the safest live vaccines
used in the human population. Frederic Tangy's team recently
developed an expression vector on the basis of the Schwarz strain
of the measles virus, which is the safest attenuated strain and the
most widely used in humans as vaccine against measles. This vaccine
strain may be isolated from an infectious molecular clone while
preserving its immunogenicity in primates and in mice that are
sensitive to the infection. It constitutes, after insertion of
additional transcription units, a vector for the expression of
heterologous sequences (Combredet, 2003, J. Virol. 77:
11546-11554). In addition, a recombinant MV Schwarz expressing the
envelope glycoprotein of the West Nile virus (WNV) induces an
effective and lasting antibody response which protects mice from a
lethal challenge infection with WNV (Despres et al., 2004, J.
Infect. Dis., in press). All these characteristics make the
attenuated Schwarz strain of the measles virus an extremely
promising candidate vector for the construction of novel
recombinant live vaccines.
[0691] The aim of this example is to evaluate the capacity of
recombinant measles viruses (MV) expressing various SARS-associated
coronavirus (SARS-CoV) antigens to constitute novel candidate
vaccines against SARS.
[0692] The inventors focused on the SARS-CoV spicule (S) protein,
which makes it possible to induce, after gene immunization in
animals, antibodies neutralizing the infectivity of SARS-CoV, and
on a soluble and secreted form of this protein, the Ssol
polypeptide, which is composed of the ectodomain (aa 1-1193) of S
fused at its C-ter end with a FLAG tag (DYKDDDDK) via a BspE1
linker encoding the SG dipeptide. This Ssol polypeptide exhibits a
similar antigenicity to that of the S protein and allows, after
injection into mice in the form of a purified protein adjuvanted
with aluminum hydroxide, the induction of high neutralizing
antibody titers against SARS-CoV.
[0693] The various forms of the S gene were introduced in the form
of an additional transcription unit between the P (phosphoprotein)
and M (matrix) genes into the cDNA of the Schwarz strain of MV
previously described (Combredet, 2003, J. Virol. 77: 11546-11554;
EP application No. 02291551.6 of Jun. 20, 2002, and EP application
No. 02291550.8 of Jun. 20, 2002). After having isolated the
recombinant viruses MVSchw2-SARS-S and MVSchw2-SARS-Ssol and
checked their capacity to express the SARS-CoV S antigen, their
capacity to induce a protective immune response against SARS in
mice and then in monkeys was tested.
2) Construction of the Recombinant Viruses
[0694] The plasmid pTM-MVSchw-ATU2 (FIG. 40B) contains an
infectious cDNA corresponding to the antigenome of the Schwarz
vaccine strain of the measles virus (MV) into which an additional
transcription unit (ATU) has been introduced between the P
(phosphoprotein) and M (matrix) genes (Combredet, 2003, Journal of
Virology, 77: 11546-11554). Recombinant genomes MVSchw2-SARS-S and
MVSchw2-SARS-Ssol of the measles virus were constructed by
inserting ORFs of the S protein and of the Ssol polypeptide into
the additional transcription unit of the MVSchw-ATU2 vector.
[0695] For that, a DNA fragment containing the SARS-CoV S cDNA was
amplified by PCR with the aid of the oligonucleotides
5'-ATACGTACGA CCATGTTTAT TTTCTTATTA TTTCTTACTC TCACT-3' and
5'-ATAGCGCGCT CATTATGTGT AATGTAATTT GACACCCTTG-3' using the plasmid
pcDNA-S as template and then inserted into the plasmid
pCR.RTM.2.1-TOPO (Invitrogen) in order to obtain the plasmid
pTOPO-S-MV. The two oligonucleotides used contain restriction sites
BsiW1 and BssHII, so as to allow subsequent insertion into the
measles vector, and were designed so as to generate a sequence of
3774 nt including the codons for initiation and termination, so as
to observe the rule of 6 which stipulates that the length of the
genome of a measles virus must be divisible by 6 (Calain &
Roux, 1993, J. Virol., 67: 4822-4830; Schneider et al., 1997,
Virology, 227: 314-322). The insert was sequenced with the aid of a
BigDye Terminator v1.1 kit (Applied Biosystems) and an automated
sequencer ABI377.
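The two design constraints mentioned above (presence of the cloning sites in the primers and an insert length divisible by 6) can be checked with a few lines of code. In this sketch the recognition sequences CGTACG (BsiWI) and GCGCGC (BssHII) are the standard ones for these enzymes and are not given in the text; the function names are hypothetical. It also assumes that the vector genome itself already satisfies the rule of 6, so that an insert of a multiple of 6 nt preserves it.

```python
# Standard recognition sequences (assumption: not stated in the text).
SITES = {"BsiWI": "CGTACG", "BssHII": "GCGCGC"}

# Primers as printed in the text; spaces are only typographic grouping.
FORWARD = "ATACGTACGA CCATGTTTAT TTTCTTATTA TTTCTTACTC TCACT".replace(" ", "")
REVERSE = "ATAGCGCGCT CATTATGTGT AATGTAATTT GACACCCTTG".replace(" ", "")


def has_site(primer, site):
    """True if the recognition sequence occurs in the primer."""
    return site in primer.upper()


def obeys_rule_of_six(length_nt):
    """True if a length in nucleotides is an exact multiple of 6."""
    return length_nt % 6 == 0


print("BsiWI in forward primer:", has_site(FORWARD, SITES["BsiWI"]))
print("BssHII in reverse primer:", has_site(REVERSE, SITES["BssHII"]))
print("3774 nt insert obeys rule of 6:", obeys_rule_of_six(3774))
```

Both recognition sequences are palindromic, so finding them on the printed strand of a primer is sufficient.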
[0696] To express a soluble and secreted form of SARS-CoV S, a
plasmid containing the cDNA of the Ssol polypeptide corresponding
to the ectodomain (aa 1-1193) of SARS-CoV S fused at its C-ter end
with the sequence of a FLAG tag (DYKDDDDK) via a BspE1 linker
encoding the SG dipeptide was then obtained. For that, a DNA
fragment was amplified with the aid of the oligonucleotides
5'-CCATTTCAAC AATTTGGCCG-3' and 5'-ATAGGATCCG CGCGCTCATT ATTTATCGTC
GTCATCTTTA TAATC-3' from the plasmid pcDNA-Ssol and then inserted
into the plasmid pTOPO-S-MV between the Sal1 and BamH1 sites in
order to obtain the plasmid pTOPO-S-MV-SF. The sequence generated
is 3618 nt long between the BsiW1 and BssHII sites and observes the
rule of 6. The insert was sequenced as indicated above.
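As a consistency check, the reverse primer above can be read on the coding strand to confirm that it carries the FLAG tag, the stop codons and the cloning sites. The sketch below uses the standard genetic code; the function names are hypothetical, and the final length decomposition is only one accounting that happens to reproduce the stated 3618 nt (an assumption about how the length is counted, not a statement from the text).

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate in frame 1, ignoring any incomplete trailing codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Reverse primer as printed in the text.
REVERSE_PRIMER = "ATAGGATCCG CGCGCTCATT ATTTATCGTC GTCATCTTTA TAATC".replace(" ", "")

sense = reverse_complement(REVERSE_PRIMER)
tail = translate(sense[:30])  # 8 FLAG codons followed by 2 stop codons
print(tail)

# Cloning sites downstream of the stop codons (standard recognition sequences).
print("BssHII:", "GCGCGC" in sense, "BamHI:", "GGATCC" in sense)

# One possible accounting of the 3618 nt (assumption): 3 nt (ACC) before the
# ATG, as in the forward primer, plus 1193 + 2 + 8 codons, plus 2 stop codons.
print(3 + 3 * (1193 + 2 + 8) + 2 * 3)
```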
[0697] The BsiW1-BssHII fragments containing the cDNAs for the S
protein and the Ssol polypeptide were then excised by digestion of
the plasmids pTOPO-S-MV and pTOPO-S-MV-SF and then subcloned
between the corresponding sites of the plasmid pTM-MVSchw-ATU2 in
order to give the plasmids pTM-MVSchw2-SARS-S and
pTM-MVSchw2-SARS-Ssol (FIG. 40B). These two plasmids were deposited
at the C.N.C.M. on Dec. 1, 2004, under the numbers I-3326 and
I-3327, respectively.
[0698] The recombinant measles viruses corresponding to the
plasmids pTM-MVSchw2-SARS-S and pTM-MVSchw2-SARS-Ssol were obtained
by reverse genetics according to the system based on the use of a
helper cell line, described by Radecke et al. (1995, Embo J., 14:
5773-5784) and modified by Parks et al. (1999, J. Virol., 73:
3560-3566). Briefly, the helper cells 293-3-46 are transfected
according to the calcium phosphate method with 5 .mu.g of the
plasmids pTM-MVSchw2-SARS-S or pTM-MVSchw2-SARS-Ssol and 0.02 .mu.g
of the plasmid pEMC-La directing the expression of the MV L
polymerase (gift from M. A. Billeter). After incubating overnight
at 37.degree. C., a heat shock is produced for 2 hours at
43.degree. C. and the transfected cells are transferred onto a
monolayer of Vero cells. For each of the two plasmids, syncytia
appeared after 2 to 3 days of coculture and were transferred
successively onto monolayers of Vero cells at 70% confluence in 35
mm Petri dishes and then in 25 and 75 cm.sup.2 flasks. When the
syncytia have reached 80-90% confluence, the cells are recovered
with the aid of a scraper and then frozen and thawed once. After
low-speed centrifugation, the supernatant containing the virus is
stored in aliquots at -80.degree. C. The titers of the recombinant
viruses MVSchw2-SARS-S and MVSchw2-SARS-Ssol were determined by
limiting dilution on Vero cells and the titer as dose infecting 50%
of the wells (TCID.sub.50) calculated according to the Karber
method.
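For illustration, a Karber (Spearman-Karber) endpoint calculation of the kind referred to above can be sketched as follows. The function name, the 10-fold dilution scheme and the well counts are assumptions made for the example; the text does not give the dilution series or the number of wells used.

```python
def karber_log10_endpoint(log10_first_dilution, log10_step, positive_fractions):
    """log10 of the dilution infecting 50% of the wells (Spearman-Karber).

    log10_first_dilution: log10 of the most concentrated dilution tested
        (e.g. -1 for a 1/10 dilution).
    log10_step: log10 of the dilution factor (1 for 10-fold steps).
    positive_fractions: fraction of infected wells at each dilution, ordered
        from most concentrated to most dilute; the series is expected to
        start at 1.0 (all wells infected) and end at 0.0 (none infected).
    """
    if not positive_fractions:
        raise ValueError("at least one dilution is required")
    total = sum(positive_fractions)
    return log10_first_dilution - log10_step * (total - 0.5)


# Invented example: 10-fold dilutions from 1/10 to 1/100000, 8 wells each.
infected_wells = [8, 8, 4, 0, 0]
fractions = [w / 8 for w in infected_wells]
log10_endpoint = karber_log10_endpoint(-1, 1, fractions)

# The titer is the reciprocal of the endpoint dilution, per inoculum volume.
titer_per_inoculum = 10 ** (-log10_endpoint)
print(log10_endpoint, titer_per_inoculum)
```

Dividing the result by the inoculum volume per well gives the titer in TCID.sub.50 per ml.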
3) Characterization of the Recombinant Viruses
[0699] The expression of the transgenes encoding the S protein and
the Ssol polypeptide was assessed by Western blotting and
immunofluorescence.
[0700] Monolayers of Vero cells in T-25 flasks were infected at a
multiplicity of 0.05 by various passages of the two viruses
MVSchw2-SARS-S and MVSchw2-SARS-Ssol and the wild-type virus MVSchw
as a control. When the syncytia had reached 80 to 90% confluence,
cytoplasmic extracts were prepared in an extraction buffer (150 mM
NaCl, 50 mM Tris-HCl, pH 7.2, 1% Triton X-100, 0.1% SDS, 1% DOC)
and then diluted in loading buffer according to Laemmli, separated
on 8% SDS polyacrylamide gel and transferred onto a PVDF membrane
(BioRad). The detection of this immunoblot (Western blot) was
carried out with the aid of an anti-S rabbit polyclonal serum
(immune serum of the rabbit P11135: cf. example 4 above) and donkey
polyclonal antibodies directed against rabbit IgGs and coupled with
peroxidase (NA934V, Amersham). The bound antibodies were visualized
by luminescence with the aid of the ECL+ kit (Amersham) and
Hyperfilm MP autoradiography films (Amersham).
[0701] Vero cells in monolayers on glass slides were infected with
the two viruses MVSchw2-SARS-S and MVSchw2-SARS-Ssol and the
wild-type virus MVSchw as a control at a multiplicity of infection
of 0.05. When the syncytia had reached 90 to 100%
(MVSchw2-SARS-Ssol virus) or 30 to 40% (MVSchw2-SARS-S, MVSchw)
confluence, the cells were fixed in a 4% PBS-PFA solution,
permeabilized with a PBS solution containing 0.2% Triton and then
labeled with rabbit polyclonal antibodies hyperimmunized with
purified and inactivated SARS-CoV virions and with an anti-rabbit
IgG(H+L) goat antibody conjugate coupled with FITC (Jackson).
[0702] As shown in FIGS. 41 and 42, the recombinant viruses
MVSchw2-SARS-S and MVSchw2-SARS-Ssol direct the expression of the S
protein and the Ssol polypeptide respectively at levels comparable
to those which can be observed 8 h after infection with SARS-CoV.
The expression of these polypeptides is stable after 3 passages of
the recombinant viruses in cell culture. These results demonstrate
that the recombinant measles viruses are indeed carriers of the
transgenes and allow the expression of the SARS glycoprotein in its
membrane form (S) or in a soluble form (Ssol). The Ssol polypeptide
is expected to be secreted by cells infected with the
MVSchw2-SARS-Ssol virus as is the case when this same polypeptide
is expressed in mammalian cells after transient transfection of the
corresponding sequences (cf. example 11 above).
4) Applications
[0703] Having shown that the viruses MVSchw2-SARS-S and
MVSchw2-SARS-Ssol allow the expression of the SARS-CoV S, their
capacity to induce a protective immune response against SARS-CoV in
CD46.sup.+/- IFN-.alpha..beta.R.sup.-/- mice, which are sensitive
to infection by MV, is evaluated. The antibody response of the
immunized mice is evaluated by ELISA test against the native
antigens of SARS-CoV and for their capacity to neutralize the
infectivity of SARS-CoV in vitro, using the methodologies described
above. The protective power of the response will be evaluated by
measuring the reduction in the pulmonary viral load 2 days after a
nonlethal challenge infection with SARS-CoV.
[0704] Second generation recombinant measles viruses are
constructed by substituting the wild-type sequences of the S and
Ssol genes by synthetic genes optimized for expression in mammalian
cells, described in example 15 above. These recombinant measles
viruses are capable of expressing larger quantities of the S and
Ssol antigens and therefore of exhibiting increased
immunogenicity.
[0705] Alternatively, the wild-type or synthetic genes encoding the
S protein or the Ssol polypeptide may be inserted into the measles
vector MVSchw-ATU3 in the form of an additional transcription unit
located between the H and L genes, and then the recombinant viruses
produced and characterized in a similar manner. This insertion is
capable of generating recombinant viruses possessing different
characteristics (multiplication of the virus, level of expression
of the transgene) and possibly an improved immunogenicity compared
with those obtained after insertion of the transgenes between the P
and M genes.
[0706] The recombinant measles virus MVSchw2-SARS-Ssol may be used
for the quantitative production and the purification of the Ssol
antigen for diagnostic and vaccine applications.
EXAMPLE 18
Other Applications Linked to the S Protein
[0707] a) The lentiviral vectors allowing the expression of S or
Ssol (or even of fragments of S) can constitute a recombinant
vaccine against SARS-CoV, to be used in human or veterinary
prophylaxis. In order to demonstrate the feasibility of such a
vaccine, the immunogenicity of the recombinant lentiviral vectors
TRIP-SD/SA-S-WPRE and TRIP-SD/SA-Ssol-WPRE is studied in mice.
[0708] b) Monoclonal antibodies are produced with the aid of the
recombinant Ssol polypeptide. According to the results presented in
example 14 above, these antibodies or at least the majority of them
will recognize the native form of the SARS-CoV S and will be
capable of diagnostic and/or prophylactic applications.
[0709] c) A serological test for SARS is developed with the Ssol
polypeptide used as antigen and the double epitope methodology.
Sequence CWU 1
1
158 1 29746 DNA CORONAVIRUS 1 atattaggtt tttacctacc caggaaaagc
caaccaacct cgatctcttg tagatctgtt 60 ctctaaacga actttaaaat
ctgtgtagct gtcgctcggc tgcatgccta gtgcacctac 120 gcagtataaa
caataataaa ttttactgtc gttgacaaga aacgagtaac tcgtccctct 180
tctgcagact gcttacggtt tcgtccgtgt tgcagtcgat catcagcata cctaggtttc
240 gtccgggtgt gaccgaaagg taagatggag agccttgttc ttggtgtcaa
cgagaaaaca 300 cacgtccaac tcagtttgcc tgtccttcag gttagagacg
tgctagtgcg tggcttcggg 360 gactctgtgg aagaggccct atcggaggca
cgtgaacacc tcaaaaatgg cacttgtggt 420 ctagtagagc tggaaaaagg
cgtactgccc cagcttgaac agccctatgt gttcattaaa 480 cgttctgatg
ccttaagcac caatcacggc cacaaggtcg ttgagctggt tgcagaaatg 540
gacggcattc agtacggtcg tagcggtata acactgggag tactcgtgcc acatgtgggc
600 gaaaccccaa ttgcataccg caatgttctt cttcgtaaga acggtaataa
gggagccggt 660 ggtcatagct atggcatcga tctaaagtct tatgacttag
gtgacgagct tggcactgat 720 cccattgaag attatgaaca aaactggaac
actaagcatg gcagtggtgc actccgtgaa 780 ctcactcgtg agctcaatgg
aggtgcagtc actcgctatg tcgacaacaa tttctgtggc 840 ccagatgggt
accctcttga ttgcatcaaa gattttctcg cacgcgcggg caagtcaatg 900
tgcactcttt ccgaacaact tgattacatc gagtcgaaga gaggtgtcta ctgctgccgt
960 gaccatgagc atgaaattgc ctggttcact gagcgctctg ataagagcta
cgagcaccag 1020 acacccttcg aaattaagag tgccaagaaa tttgacactt
tcaaagggga atgcccaaag 1080 tttgtgtttc ctcttaactc aaaagtcaaa
gtcattcaac cacgtgttga aaagaaaaag 1140 actgagggtt tcatggggcg
tatacgctct gtgtaccctg ttgcatctcc acaggagtgt 1200 aacaatatgc
acttgtctac cttgatgaaa tgtaatcatt gcgatgaagt ttcatggcag 1260
acgtgcgact ttctgaaagc cacttgtgaa cattgtggca ctgaaaattt agttattgaa
1320 ggacctacta catgtgggta cctacctact aatgctgtag tgaaaatgcc
atgtcctgcc 1380 tgtcaagacc cagagattgg acctgagcat agtgttgcag
attatcacaa ccactcaaac 1440 attgaaactc gactccgcaa gggaggtagg
actagatgtt ttggaggctg tgtgtttgcc 1500 tatgttggct gctataataa
gcgtgcctac tgggttcctc gtgctagtgc tgatattggc 1560 tcaggccata
ctggcattac tggtgacaat gtggagacct tgaatgagga tctccttgag 1620
atactgagtc gtgaacgtgt taacattaac attgttggcg attttcattt gaatgaagag
1680 gttgccatca ttttggcatc tttctctgct tctacaagtg cctttattga
cactataaag 1740 agtcttgatt acaagtcttt caaaaccatt gttgagtcct
gcggtaacta taaagttacc 1800 aagggaaagc ccgtaaaagg tgcttggaac
attggacaac agagatcagt tttaacacca 1860 ctgtgtggtt ttccctcaca
ggctgctggt gttatcagat caatttttgc gcgcacactt 1920 gatgcagcaa
accactcaat tcctgatttg caaagagcag ctgtcaccat acttgatggt 1980
atttctgaac agtcattacg tcttgtcgac gccatggttt atacttcaga cctgctcacc
2040 aacagtgtca ttattatggc atatgtaact ggtggtcttg tacaacagac
ttctcagtgg 2100 ttgtctaatc ttttgggcac tactgttgaa aaactcaggc
ctatctttga atggattgag 2160 gcgaaactta gtgcaggagt tgaatttctc
aaggatgctt gggagattct caaatttctc 2220 attacaggtg tttttgacat
cgtcaagggt caaatacagg ttgcttcaga taacatcaag 2280 gattgtgtaa
aatgcttcat tgatgttgtt aacaaggcac tcgaaatgtg cattgatcaa 2340
gtcactatcg ctggcgcaaa gttgcgatca ctcaacttag gtgaagtctt catcgctcaa
2400 agcaagggac tttaccgtca gtgtatacgt ggcaaggagc agctgcaact
actcatgcct 2460 cttaaggcac caaaagaagt aacctttctt gaaggtgatt
cacatgacac agtacttacc 2520 tctgaggagg ttgttctcaa gaacggtgaa
ctcgaagcac tcgagacgcc cgttgatagc 2580 ttcacaaatg gagctatcgt
tggcacacca gtctgtgtaa atggcctcat gctcttagag 2640 attaaggaca
aagaacaata ctgcgcattg tctcctggtt tactggctac aaacaatgtc 2700
tttcgcttaa aagggggtgc accaattaaa ggtgtaacct ttggagaaga tactgtttgg
2760 gaagttcaag gttacaagaa tgtgagaatc acatttgagc ttgatgaacg
tgttgacaaa 2820 gtgcttaatg aaaagtgctc tgtctacact gttgaatccg
gtaccgaagt tactgagttt 2880 gcatgtgttg tagcagaggc tgttgtgaag
actttacaac cagtttctga tctccttacc 2940 aacatgggta ttgatcttga
tgagtggagt gtagctacat tctacttatt tgatgatgct 3000 ggtgaagaaa
acttttcatc acgtatgtat tgttcctttt accctccaga tgaggaagaa 3060
gaggacgatg cagagtgtga ggaagaagaa attgatgaaa cctgtgaaca tgagtacggt
3120 acagaggatg attatcaagg tctccctctg gaatttggtg cctcagctga
aacagttcga 3180 gttgaggaag aagaagagga agactggctg gatgatacta
ctgagcaatc agagattgag 3240 ccagaaccag aacctacacc tgaagaacca
gttaatcagt ttactggtta tttaaaactt 3300 actgacaatg ttgccattaa
atgtgttgac atcgttaagg aggcacaaag tgctaatcct 3360 atggtgattg
taaatgctgc taacatacac ctgaaacatg gtggtggtgt agcaggtgca 3420
ctcaacaagg caaccaatgg tgccatgcaa aaggagagtg atgattacat taagctaaat
3480 ggccctctta cagtaggagg gtcttgtttg ctttctggac ataatcttgc
taagaagtgt 3540 ctgcatgttg ttggacctaa cctaaatgca ggtgaggaca
tccagcttct taaggcagca 3600 tatgaaaatt tcaattcaca ggacatctta
cttgcaccat tgttgtcagc aggcatattt 3660 ggtgctaaac cacttcagtc
tttacaagtg tgcgtgcaga cggttcgtac acaggtttat 3720 attgcagtca
atgacaaagc tctttatgag caggttgtca tggattatct tgataacctg 3780
aagcctagag tggaagcacc taaacaagag gagccaccaa acacagaaga ttccaaaact
3840 gaggagaaat ctgtcgtaca gaagcctgtc gatgtgaagc caaaaattaa
ggcctgcatt 3900 gatgaggtta ccacaacact ggaagaaact aagtttctta
ccaataagtt actcttgttt 3960 gctgatatca atggtaagct ttaccatgat
tctcagaaca tgcttagagg tgaagatatg 4020 tctttccttg agaaggatgc
accttacatg gtaggtgatg ttatcactag tggtgatatc 4080 acttgtgttg
taataccctc caaaaaggct ggtggcacta ctgagatgct ctcaagagct 4140
ttgaagaaag tgccagttga tgagtatata accacgtacc ctggacaagg atgtgctggt
4200 tatacacttg aggaagctaa gactgctctt aagaaatgca aatctgcatt
ttatgtacta 4260 ccttcagaag cacctaatgc taaggaagag attctaggaa
ctgtatcctg gaatttgaga 4320 gaaatgcttg ctcatgctga agagacaaga
aaattaatgc ctatatgcat ggatgttaga 4380 gccataatgg caaccatcca
acgtaagtat aaaggaatta aaattcaaga gggcatcgtt 4440 gactatggtg
tccgattctt cttttatact agtaaagagc ctgtagcttc tattattacg 4500
aagctgaact ctctaaatga gccgcttgtc acaatgccaa ttggttatgt gacacatggt
4560 tttaatcttg aagaggctgc gcgctgtatg cgttctctta aagctcctgc
cgtagtgtca 4620 gtatcatcac cagatgctgt tactacatat aatggatacc
tcacttcgtc atcaaagaca 4680 tctgaggagc actttgtaga aacagtttct
ttggctggct cttacagaga ttggtcctat 4740 tcaggacagc gtacagagtt
aggtgttgaa tttcttaagc gtggtgacaa aattgtgtac 4800 cacactctgg
agagccccgt cgagtttcat cttgacggtg aggttctttc acttgacaaa 4860
ctaaagagtc tcttatccct gcgggaggtt aagactataa aagtgttcac aactgtggac
4920 aacactaatc tccacacaca gcttgtggat atgtctatga catatggaca
gcagtttggt 4980 ccaacatact tggatggtgc tgatgttaca aaaattaaac
ctcatgtaaa tcatgagggt 5040 aagactttct ttgtactacc tagtgatgac
acactacgta gtgaagcttt cgagtactac 5100 catactcttg atgagagttt
tcttggtagg tacatgtctg ctttaaacca cacaaagaaa 5160 tggaaatttc
ctcaagttgg tggtttaact tcaattaaat gggctgataa caattgttat 5220
ttgtctagtg ttttattagc acttcaacag cttgaagtca aattcaatgc accagcactt
5280 caagaggctt attatagagc ccgtgctggt gatgctgcta acttttgtgc
actcatactc 5340 gcttacagta ataaaactgt tggcgagctt ggtgatgtca
gagaaactat gacccatctt 5400 ctacagcatg ctaatttgga atctgcaaag
cgagttctta atgtggtgtg taaacattgt 5460 ggtcagaaaa ctactacctt
aacgggtgta gaagctgtga tgtatatggg tactctatct 5520 tatgataatc
ttaagacagg tgtttccatt ccatgtgtgt gtggtcgtga tgctacacaa 5580
tatctagtac aacaagagtc ttcttttgtt atgatgtctg caccacctgc tgagtataaa
5640 ttacagcaag gtacattctt atgtgcgaat gagtacactg gtaactatca
gtgtggtcat 5700 tacactcata taactgctaa ggagaccctc tatcgtattg
acggagctca ccttacaaag 5760 atgtcagagt acaaaggacc agtgactgat
gttttctaca aggaaacatc ttacactaca 5820 accatcaagc ctgtgtcgta
taaactcgat ggagttactt acacagagat tgaaccaaaa 5880 ttggatgggt
attataaaaa ggataatgct tactatacag agcagcctat agaccttgta 5940
ccaactcaac cattaccaaa tgcgagtttt gataatttca aactcacatg ttctaacaca
6000 aaatttgctg atgatttaaa tcaaatgaca ggcttcacaa agccagcttc
acgagagcta 6060 tctgtcacat tcttcccaga cttgaatggc gatgtagtgg
ctattgacta tagacactat 6120 tcagcgagtt tcaagaaagg tgctaaatta
ctgcataagc caattgtttg gcacattaac 6180 caggctacaa ccaagacaac
gttcaaacca aacacttggt gtttacgttg tctttggagt 6240 acaaagccag
tagatacttc aaattcattt gaagttctgg cagtagaaga cacacaagga 6300
atggacaatc ttgcttgtga aagtcaacaa cccacctctg aagaagtagt ggaaaatcct
6360 accatacaga aggaagtcat agagtgtgac gtgaaaacta ccgaagttgt
aggcaatgtc 6420 atacttaaac catcagatga aggtgttaaa gtaacacaag
agttaggtca tgaggatctt 6480 atggctgctt atgtggaaaa cacaagcatt
accattaaga aacctaatga gctttcacta 6540 gccttaggtt taaaaacaat
tgccactcat ggtattgctg caattaatag tgttccttgg 6600 agtaaaattt
tggcttatgt caaaccattc ttaggacaag cagcaattac aacatcaaat 6660
tgcgctaaga gattagcaca acgtgtgttt aacaattata tgccttatgt gtttacatta
6720 ttgttccaat tgtgtacttt tactaaaagt accaattcta gaattagagc
ttcactacct 6780 acaactattg ctaaaaatag tgttaagagt gttgctaaat
tatgtttgga tgccggcatt 6840 aattatgtga agtcacccaa attttctaaa
ttgttcacaa tcgctatgtg gctattgttg 6900 ttaagtattt gcttaggttc
tctaatctgt gtaactgctg cttttggtgt actcttatct 6960 aattttggtg
ctccttctta ttgtaatggc gttagagaat tgtatcttaa ttcgtctaac 7020
gttactacta tggatttctg tgaaggttct tttccttgca gcatttgttt aagtggatta
7080 gactcccttg attcttatcc agctcttgaa accattcagg tgacgatttc
atcgtacaag 7140 ctagacttga caattttagg tctggccgct gagtgggttt
tggcatatat gttgttcaca 7200 aaattctttt atttattagg tctttcagct
ataatgcagg tgttctttgg ctattttgct 7260 agtcatttca tcagcaattc
ttggctcatg tggtttatca ttagtattgt acaaatggca 7320 cccgtttctg
caatggttag gatgtacatc ttctttgctt ctttctacta catatggaag 7380
agctatgttc atatcatgga tggttgcacc tcttcgactt gcatgatgtg ctataagcgc
7440 aatcgtgcca cacgcgttga gtgtacaact attgttaatg gcatgaagag
atctttctat 7500 gtctatgcaa atggaggccg tggcttctgc aagactcaca
attggaattg tctcaattgt 7560 gacacatttt gcactggtag tacattcatt
agtgatgaag ttgctcgtga tttgtcactc 7620 cagtttaaaa gaccaatcaa
ccctactgac cagtcatcgt atattgttga tagtgttgct 7680 gtgaaaaatg
gcgcgcttca cctctacttt gacaaggctg gtcaaaagac ctatgagaga 7740
catccgctct cccattttgt caatttagac aatttgagag ctaacaacac taaaggttca
7800 ctgcctatta atgtcatagt ttttgatggc aagtccaaat gcgacgagtc
tgcttctaag 7860 tctgcttctg tgtactacag tcagctgatg tgccaaccta
ttctgttgct tgaccaagct 7920 cttgtatcag acgttggaga tagtactgaa
gtttccgtta agatgtttga tgcttatgtc 7980 gacacctttt cagcaacttt
tagtgttcct atggaaaaac ttaaggcact tgttgctaca 8040 gctcacagcg
agttagcaaa gggtgtagct ttagatggtg tcctttctac attcgtgtca 8100
gctgcccgac aaggtgttgt tgataccgat gttgacacaa aggatgttat tgaatgtctc
8160 aaactttcac atcactctga cttagaagtg acaggtgaca gttgtaacaa
tttcatgctc 8220 acctataata aggttgaaaa catgacgccc agagatcttg
gcgcatgtat tgactgtaat 8280 gcaaggcata tcaatgccca agtagcaaaa
agtcacaatg tttcactcat ctggaatgta 8340 aaagactaca tgtctttatc
tgaacagctg cgtaaacaaa ttcgtagtgc tgccaagaag 8400 aacaacatac
cttttagact aacttgtgct acaactagac aggttgtcaa tgtcataact 8460
actaaaatct cactcaaggg tggtaagatt gttagtactt gttttaaact tatgcttaag
8520 gccacattat tgtgcgttct tgctgcattg gtttgttata tcgttatgcc
agtacataca 8580 ttgtcaatcc atgatggtta cacaaatgaa atcattggtt
acaaagccat tcaggatggt 8640 gtcactcgtg acatcatttc tactgatgat
tgttttgcaa ataaacatgc tggttttgac 8700 gcatggttta gccagcgtgg
tggttcatac aaaaatgaca aaagctgccc tgtagtagct 8760 gctatcatta
caagagagat tggtttcata gtgcctggct taccgggtac tgtgctgaga 8820
gcaatcaatg gtgacttctt gcattttcta cctcgtgttt ttagtgctgt tggcaacatt
8880 tgctacacac cttccaaact cattgagtat agtgattttg ctacctctgc
ttgcgttctt 8940 gctgctgagt gtacaatttt taaggatgct atgggcaaac
ctgtgccata ttgttatgac 9000 actaatttgc tagagggttc tatttcttat
agtgagcttc gtccagacac tcgttatgtg 9060 cttatggatg gttccatcat
acagtttcct aacacttacc tggagggttc tgttagagta 9120 gtaacaactt
ttgatgctga gtactgtaga catggtacat gcgaaaggtc agaagtaggt 9180
atttgcctat ctaccagtgg tagatgggtt cttaataatg agcattacag agctctatca
9240 ggagttttct gtggtgttga tgcgatgaat ctcatagcta acatctttac
tcctcttgtg 9300 caacctgtgg gtgctttaga tgtgtctgct tcagtagtgg
ctggtggtat tattgccata 9360 ttggtgactt gtgctgccta ctactttatg
aaattcagac gtgtttttgg tgagtacaac 9420 catgttgttg ctgctaatgc
acttttgttt ttgatgtctt tcactatact ctgtctggta 9480 ccagcttaca
gctttctgcc gggagtctac tcagtctttt acttgtactt gacattctat 9540
ttcaccaatg atgtttcatt cttggctcac cttcaatggt ttgccatgtt ttctcctatt
9600 gtgccttttt ggataacagc aatctatgta ttctgtattt ctctgaagca
ctgccattgg 9660 ttctttaaca actatcttag gaaaagagtc atgtttaatg
gagttacatt tagtaccttc 9720 gaggaggctg ctttgtgtac ctttttgctc
aacaaggaaa tgtacctaaa attgcgtagc 9780 gagacactgt tgccacttac
acagtataac aggtatcttg ctctatataa caagtacaag 9840 tatttcagtg
gagccttaga tactaccagc tatcgtgaag cagcttgctg ccacttagca 9900
aaggctctaa atgactttag caactcaggt gctgatgttc tctaccaacc accacagaca
9960 tcaatcactt ctgctgttct gcagagtggt tttaggaaaa tggcattccc
gtcaggcaaa 10020 gttgaagggt gcatggtaca agtaacctgt ggaactacaa
ctcttaatgg attgtggttg 10080 gatgacacag tatactgtcc aagacatgtc
atttgcacag cagaagacat gcttaatcct 10140 aactatgaag atctgctcat
tcgcaaatcc aaccatagct ttcttgttca ggctggcaat 10200 gttcaacttc
gtgttattgg ccattctatg caaaattgtc tgcttaggct taaagttgat 10260
acttctaacc ctaagacacc caagtataaa tttgtccgta tccaacctgg tcaaacattt
10320 tcagttctag catgctacaa tggttcacca tctggtgttt atcagtgtgc
catgagacct 10380 aatcatacca ttaaaggttc tttccttaat ggatcatgtg
gtagtgttgg ttttaacatt 10440 gattatgatt gcgtgtcttt ctgctatatg
catcatatgg agcttccaac aggagtacac 10500 gctggtactg acttagaagg
taaattctat ggtccatttg ttgacagaca aactgcacag 10560 gctgcaggta
cagacacaac cataacatta aatgttttgg catggctgta tgctgctgtt 10620
atcaatggtg ataggtggtt tcttaataga ttcaccacta ctttgaatga ctttaacctt
10680 gtggcaatga agtacaacta tgaacctttg acacaagatc atgttgacat
attgggacct 10740 ctttctgctc aaacaggaat tgccgtctta gatatgtgtg
ctgctttgaa agagctgctg 10800 cagaatggta tgaatggtcg tactatcctt
ggtagcacta ttttagaaga tgagtttaca 10860 ccatttgatg ttgttagaca
atgctctggt gttaccttcc aaggtaagtt caagaaaatt 10920 gttaagggca
ctcatcattg gatgctttta actttcttga catcactatt gattcttgtt 10980
caaagtacac agtggtcact gtttttcttt gtttacgaga atgctttctt gccatttact
11040 cttggtatta tggcaattgc tgcatgtgct atgctgcttg ttaagcataa
gcacgcattc 11100 ttgtgcttgt ttctgttacc ttctcttgca acagttgctt
actttaatat ggtctacatg 11160 cctgctagct gggtgatgcg tatcatgaca
tggcttgaat tggctgacac tagcttgtct 11220 ggttataggc ttaaggattg
tgttatgtat gcttcagctt tagttttgct tattctcatg 11280 acagctcgca
ctgtttatga tgatgctgct agacgtgttt ggacactgat gaatgtcatt 11340
acacttgttt acaaagtcta ctatggtaat gctttagatc aagctatttc catgtgggcc
11400 ttagttattt ctgtaacctc taactattct ggtgtcgtta cgactatcat
gtttttagct 11460 agagctatag tgtttgtgtg tgttgagtat tacccattgt
tatttattac tggcaacacc 11520 ttacagtgta tcatgcttgt ttattgtttc
ttaggctatt gttgctgctg ctactttggc 11580 cttttctgtt tactcaaccg
ttacttcagg cttactcttg gtgtttatga ctacttggtc 11640 tctacacaag
aatttaggta tatgaactcc caggggcttt tgcctcctaa gagtagtatt 11700
gatgctttca agcttaacat taagttgttg ggtattggag gtaaaccatg tatcaaggtt
11760 gctactgtac agtctaaaat gtctgacgta aagtgcacat ctgtggtact
gctctcggtt 11820 cttcaacaac ttagagtaga gtcatcttct aaattgtggg
cacaatgtgt acaactccac 11880 aatgatattc ttcttgcaaa agacacaact
gaagctttcg agaagatggt ttctcttttg 11940 tctgttttgc tatccatgca
gggtgctgta gacattaata ggttgtgcga ggaaatgctc 12000 gataaccgtg
ctactcttca ggctattgct tcagaattta gttctttacc atcatatgcc 12060
gcttatgcca ctgcccagga ggcctatgag caggctgtag ctaatggtga ttctgaagtc
12120 gttctcaaaa agttaaagaa atctttgaat gtggctaaat ctgagtttga
ccgtgatgct 12180 gccatgcaac gcaagttgga aaagatggca gatcaggcta
tgacccaaat gtacaaacag 12240 gcaagatctg aggacaagag ggcaaaagta
actagtgcta tgcaaacaat gctcttcact 12300 atgcttagga agcttgataa
tgatgcactt aacaacatta tcaacaatgc gcgtgatggt 12360 tgtgttccac
tcaacatcat accattgact acagcagcca aactcatggt tgttgtccct 12420
gattatggta cctacaagaa cacttgtgat ggtaacacct ttacatatgc atctgcactc
12480 tgggaaatcc agcaagttgt tgatgcggat agcaagattg ttcaacttag
tgaaattaac 12540 atggacaatt caccaaattt ggcttggcct cttattgtta
cagctctaag agccaactca 12600 gctgttaaac tacagaataa tgaactgagt
ccagtagcac tacgacagat gtcctgtgcg 12660 gctggtacca cacaaacagc
ttgtactgat gacaatgcac ttgcctacta taacaattcg 12720 aagggaggta
ggtttgtgct ggcattacta tcagaccacc aagatctcaa atgggctaga 12780
ttccctaaga gtgatggtac aggtacaatt tacacagaac tggaaccacc ttgtaggttt
12840 gttacagaca caccaaaagg gcctaaagtg aaatacttgt acttcatcaa
aggcttaaac 12900 aacctaaata gaggtatggt gctgggcagt ttagctgcta
cagtacgtct tcaggctgga 12960 aatgctacag aagtacctgc caattcaact
gtgctttcct tctgtgcttt tgcagtagac 13020 cctgctaaag catataagga
ttacctagca agtggaggac aaccaatcac caactgtgtg 13080 aagatgttgt
gtacacacac tggtacagga caggcaatta ctgtaacacc agaagctaac 13140
atggaccaag agtcctttgg tggtgcttca tgttgtctgt attgtagatg ccacattgac
13200 catccaaatc ctaaaggatt ctgtgacttg aaaggtaagt acgtccaaat
acctaccact 13260 tgtgctaatg acccagtggg ttttacactt agaaacacag
tctgtaccgt ctgcggaatg 13320 tggaaaggtt atggctgtag ttgtgaccaa
ctccgcgaac ccttgatgca gtctgcggat 13380 gcatcaacgt ttttaaacgg
gtttgcggtg taagtgcagc ccgtcttaca ccgtgcggca 13440 caggcactag
tactgatgtc gtctacaggg cttttgatat ttacaacgaa aaagttgctg 13500
gttttgcaaa gttcctaaaa actaattgct gtcgcttcca ggagaaggat gaggaaggca
13560 atttattaga ctcttacttt gtagttaaga ggcatactat gtctaactac
caacatgaag 13620 agactattta taacttggtt aaagattgtc cagcggttgc
tgtccatgac tttttcaagt 13680 ttagagtaga tggtgacatg gtaccacata
tatcacgtca gcgtctaact aaatacacaa 13740 tggctgattt agtctatgct
ctacgtcatt ttgatgaggg taattgtgat acattaaaag 13800 aaatactcgt
cacatacaat tgctgtgatg atgattattt caataagaag gattggtatg 13860
acttcgtaga gaatcctgac atcttacgcg tatatgctaa cttaggtgag cgtgtacgcc
13920 aatcattatt aaagactgta caattctgcg atgctatgcg tgatgcaggc
attgtaggcg 13980 tactgacatt agataatcag gatcttaatg ggaactggta
cgatttcggt gatttcgtac 14040 aagtagcacc aggctgcgga gttcctattg
tggattcata ttactcattg ctgatgccca 14100 tcctcacttt gactagggca
ttggctgctg agtcccatat ggatgctgat ctcgcaaaac 14160 cacttattaa
gtgggatttg ctgaaatatg attttacgga agagagactt tgtctcttcg 14220
accgttattt taaatattgg gaccagacat accatcccaa ttgtattaac tgtttggatg
14280 ataggtgtat ccttcattgt gcaaacttta atgtgttatt ttctactgtg
tttccaccta 14340 caagttttgg accactagta agaaaaatat ttgtagatgg
tgttcctttt gttgtttcaa 14400 ctggatacca ttttcgtgag ttaggagtcg
tacataatca ggatgtaaac ttacatagct 14460 cgcgtctcag tttcaaggaa
cttttagtgt atgctgctga tccagctatg catgcagctt 14520 ctggcaattt
attgctagat aaacgcacta catgcttttc agtagctgca ctaacaaaca 14580
atgttgcttt tcaaactgtc aaacccggta attttaataa agacttttat gactttgctg
14640 tgtctaaagg tttctttaag gaaggaagtt ctgttgaact aaaacacttc
ttctttgctc 14700 aggatggcaa cgctgctatc agtgattatg actattatcg
ttataatctg ccaacaatgt 14760 gtgatatcag acaactccta ttcgtagttg
aagttgttga taaatacttt gattgttacg 14820 atggtggctg tattaatgcc
aaccaagtaa tcgttaacaa tctggataaa tcagctggtt 14880 tcccatttaa
taaatggggt aaggctagac tttattatga ctcaatgagt tatgaggatc 14940
aagatgcact tttcgcgtat actaagcgta atgtcatccc tactataact caaatgaatc
15000 ttaagtatgc cattagtgca aagaatagag ctcgcaccgt
agctggtgtc tctatctgta 15060 gtactatgac aaatagacag tttcatcaga
aattattgaa gtcaatagcc gccactagag 15120 gagctactgt ggtaattgga
acaagcaagt tttacggtgg ctggcataat atgttaaaaa 15180 ctgtttacag
tgatgtagaa actccacacc ttatgggttg ggattatcca aaatgtgaca 15240
gagccatgcc taacatgctt aggataatgg cctctcttgt tcttgctcgc aaacataaca
15300 cttgctgtaa cttatcacac cgtttctaca ggttagctaa cgagtgtgcg
caagtattaa 15360 gtgagatggt catgtgtggc ggctcactat atgttaaacc
aggtggaaca tcatccggtg 15420 atgctacaac tgcttatgct aatagtgtct
ttaacatttg tcaagctgtt acagccaatg 15480 taaatgcact tctttcaact
gatggtaata agatagctga caagtatgtc cgcaatctac 15540 aacacaggct
ctatgagtgt ctctatagaa atagggatgt tgatcatgaa ttcgtggatg 15600
agttttacgc ttacctgcgt aaacatttct ccatgatgat tctttctgat gatgccgttg
15660 tgtgctataa cagtaactat gcggctcaag gtttagtagc tagcattaag
aactttaagg 15720 cagttcttta ttatcaaaat aatgtgttca tgtctgaggc
aaaatgttgg actgagactg 15780 accttactaa aggacctcac gaattttgct
cacagcatac aatgctagtt aaacaaggag 15840 atgattacgt gtacctgcct
tacccagatc catcaagaat attaggcgca ggctgttttg 15900 tcgatgatat
tgtcaaaaca gatggtacac ttatgattga aaggttcgtg tcactggcta 15960
ttgatgctta cccacttaca aaacatccta atcaggagta tgctgatgtc tttcacttgt
16020 atttacaata cattagaaag ttacatgatg agcttactgg ccacatgttg
gacatgtatt 16080 ccgtaatgct aactaatgat aacacctcac ggtactggga
acctgagttt tatgaggcta 16140 tgtacacacc acatacagtc ttgcaggctg
taggtgcttg tgtattgtgc aattcacaga 16200 cttcacttcg ttgcggtgcc
tgtattagga gaccattcct atgttgcaag tgctgctatg 16260 accatgtcat
ttcaacatca cacaaattag tgttgtctgt taatccctat gtttgcaatg 16320
ccccaggttg tgatgtcact gatgtgacac aactgtatct aggaggtatg agctattatt
16380 gcaagtcaca taagcctccc attagttttc cattatgtgc taatggtcag
gtttttggtt 16440 tatacaaaaa cacatgtgta ggcagtgaca atgtcactga
cttcaatgcg atagcaacat 16500 gtgattggac taatgctggc gattacatac
ttgccaacac ttgtactgag agactcaagc 16560 ttttcgcagc agaaacgctc
aaagccactg aggaaacatt taagctgtca tatggtattg 16620 ccactgtacg
cgaagtactc tctgacagag aattgcatct ttcatgggag gttggaaaac 16680
ctagaccacc attgaacaga aactatgtct ttactggtta ccgtgtaact aaaaatagta
16740 aagtacagat tggagagtac acctttgaaa aaggtgacta tggtgatgct
gttgtgtaca 16800 gaggtactac gacatacaag ttgaatgttg gtgattactt
tgtgttgaca tctcacactg 16860 taatgccact tagtgcacct actctagtgc
cacaagagca ctatgtgaga attactggct 16920 tgtacccaac actcaacatc
tcagatgagt tttctagcaa tgttgcaaat tatcaaaagg 16980 tcggcatgca
aaagtactct acactccaag gaccacctgg tactggtaag agtcattttg 17040
ccatcggact tgctctctat tacccatctg ctcgcatagt gtatacggca tgctctcatg
17100 cagctgttga tgccctatgt gaaaaggcat taaaatattt gcccatagat
aaatgtagta 17160 gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa
attcaaagtg aattcaacac 17220 tagaacagta tgttttctgc actgtaaatg
cattgccaga aacaactgct gacattgtag 17280 tctttgatga aatctctatg
gctactaatt atgacttgag tgttgtcaat gctagacttc 17340 gtgcaaaaca
ctacgtctat attggcgatc ctgctcaatt accagccccc cgcacattgc 17400
tgactaaagg cacactagaa ccagaatatt ttaattcagt gtgcagactt atgaaaacaa
17460 taggtccaga catgttcctt ggaacttgtc gccgttgtcc tgctgaaatt
gttgacactg 17520 tgagtgcttt agtttatgac aataagctaa aagcacacaa
ggataagtca gctcaatgct 17580 tcaaaatgtt ctacaaaggt gttattacac
atgatgtttc atctgcaatc aacagacctc 17640 aaataggcgt tgtaagagaa
tttcttacac gcaatcctgc ttggagaaaa gctgttttta 17700 tctcacctta
taattcacag aacgctgtag cttcaaaaat cttaggattg cctacgcaga 17760
ctgttgattc atcacagggt tctgaatatg actatgtcat attcacacaa actactgaaa
17820 cagcacactc ttgtaatgtc aaccgcttca atgtggctat cacaagggca
aaaattggca 17880 ttttgtgcat aatgtctgat agagatcttt atgacaaact
gcaatttaca agtctagaaa 17940 taccacgtcg caatgtggct acattacaag
cagaaaatgt aactggactt tttaaggact 18000 gtagtaagat cattactggt
cttcatccta cacaggcacc tacacacctc agcgttgata 18060 taaagttcaa
gactgaagga ttatgtgttg acataccagg cataccaaag gacatgacct 18120
accgtagact catctctatg atgggtttca aaatgaatta ccaagtcaat ggttacccta
18180 atatgtttat cacccgcgaa gaagctattc gtcacgttcg tgcgtggatt
ggctttgatg 18240 tagagggctg tcatgcaact agagatgctg tgggtactaa
cctacctctc cagctaggat 18300 tttctacagg tgttaactta gtagctgtac
cgactggtta tgttgacact gaaaataaca 18360 cagaattcac cagagttaat
gcaaaacctc caccaggtga ccagtttaaa catcttatac 18420 cactcatgta
taaaggcttg ccctggaatg tagtgcgtat taagatagta caaatgctca 18480
gtgatacact gaaaggattg tcagacagag tcgtgttcgt cctttgggcg catggctttg
18540 agcttacatc aatgaagtac tttgtcaaga ttggacctga aagaacgtgt
tgtctgtgtg 18600 acaaacgtgc aacttgcttt tctacttcat cagatactta
tgcctgctgg aatcattctg 18660 tgggttttga ctatgtctat aacccattta
tgattgatgt tcagcagtgg ggctttacgg 18720 gtaaccttca gagtaaccat
gaccaacatt gccaggtaca tggaaatgca catgtggcta 18780 gttgtgatgc
tatcatgact agatgtttag cagtccatga gtgctttgtt aagcgcgttg 18840
attggtctgt tgaataccct attataggag atgaactgag ggttaattct gcttgcagaa
18900 aagtacaaca catggttgtg aagtctgcat tgcttgctga taagtttcca
gttcttcatg 18960 acattggaaa tccaaaggct atcaagtgtg tgcctcaggc
tgaagtagaa tggaagttct 19020 acgatgctca gccatgtagt gacaaagctt
acaaaataga ggaactcttc tattcttatg 19080 ctacacatca cgataaattc
actgatggtg tttgtttgtt ttggaattgt aacgttgatc 19140 gttacccagc
caatgcaatt gtgtgtaggt ttgacacaag agtcttgtca aacttgaact 19200
taccaggctg tgatggtggt agtttgtatg tgaataagca tgcattccac actccagctt
19260 tcgataaaag tgcatttact aatttaaagc aattgccttt cttttactat
tctgatagtc 19320 cttgtgagtc tcatggcaaa caagtagtgt cggatattga
ttatgttcca ctcaaatctg 19380 ctacgtgtat tacacgatgc aatttaggtg
gtgctgtttg cagacaccat gcaaatgagt 19440 accgacagta cttggatgca
tataatatga tgatttctgc tggatttagc ctatggattt 19500 acaaacaatt
tgatacttat aacctgtgga atacatttac caggttacag agtttagaaa 19560
atgtggctta taatgttgtt aataaaggac actttgatgg acacgccggc gaagcacctg
19620 tttccatcat taataatgct gtttacacaa aggtagatgg tattgatgtg
gagatctttg 19680 aaaataagac aacacttcct gttaatgttg catttgagct
ttgggctaag cgtaacatta 19740 aaccagtgcc agagattaag atactcaata
atttgggtgt tgatatcgct gctaatactg 19800 taatctggga ctacaaaaga
gaagccccag cacatgtatc tacaataggt gtctgcacaa 19860 tgactgacat
tgccaagaaa cctactgaga gtgcttgttc ttcacttact gtcttgtttg 19920
atggtagagt ggaaggacag gtagaccttt ttagaaacgc ccgtaatggt gttttaataa
19980 cagaaggttc agtcaaaggt ctaacacctt caaagggacc agcacaagct
agcgtcaatg 20040 gagtcacatt aattggagaa tcagtaaaaa cacagtttaa
ctactttaag aaagtagacg 20100 gcattattca acagttgcct gaaacctact
ttactcagag cagagactta gaggatttta 20160 agcccagatc acaaatggaa
actgactttc tcgagctcgc tatggatgaa ttcatacagc 20220 gatataagct
cgagggctat gccttcgaac acatcgttta tggagatttc agtcatggac 20280
aacttggcgg tcttcattta atgataggct tagccaagcg ctcacaagat tcaccactta
20340 aattagagga ttttatccct atggacagca cagtgaaaaa ttacttcata
acagatgcgc 20400 aaacaggttc atcaaaatgt gtgtgttctg tgattgatct
tttacttgat gactttgtcg 20460 agataataaa gtcacaagat ttgtcagtga
tttcaaaagt ggtcaaggtt acaattgact 20520 atgctgaaat ttcattcatg
ctttggtgta aggatggaca tgttgaaacc ttctacccaa 20580 aactacaagc
aagtcaagcg tggcaaccag gtgttgcgat gcctaacttg tacaagatgc 20640
aaagaatgct tcttgaaaag tgtgaccttc agaattatgg tgaaaatgct gttataccaa
20700 aaggaataat gatgaatgtc gcaaagtata ctcaactgtg tcaatactta
aatacactta 20760 ctttagctgt accctacaac atgagagtta ttcactttgg
tgctggctct gataaaggag 20820 ttgcaccagg tacagctgtg ctcagacaat
ggttgccaac tggcacacta cttgtcgatt 20880 cagatcttaa tgacttcgtc
tccgacgcag attctacttt aattggagac tgtgcaacag 20940 tacatacggc
taataaatgg gaccttatta ttagcgatat gtatgaccct aggaccaaac 21000
atgtgacaaa agagaatgac tctaaagaag ggtttttcac ttatctgtgt ggatttataa
21060 agcaaaaact agccctgggt ggttctatag ctgtaaagat aacagagcat
tcttggaatg 21120 ctgaccttta caagcttatg ggccatttct catggtggac
agcttttgtt acaaatgtaa 21180 atgcatcatc atcggaagca tttttaattg
gggctaacta tcttggcaag ccgaaggaac 21240 aaattgatgg ctataccatg
catgctaact acattttctg gaggaacaca aatcctatcc 21300 agttgtcttc
ctattcactc tttgacatga gcaaatttcc tcttaaatta agaggaactg 21360
ctgtaatgtc tcttaaggag aatcaaatca atgatatgat ttattctctt ctggaaaaag
21420 gtaggcttat cattagagaa aacaacagag ttgtggtttc aagtgatatt
cttgttaaca 21480 actaaacgaa catgtttatt ttcttattat ttcttactct
cactagtggt agtgaccttg 21540 accggtgcac cacttttgat gatgttcaag
ctcctaatta cactcaacat acttcatcta 21600 tgaggggggt ttactatcct
gatgaaattt ttagatcaga cactctttat ttaactcagg 21660 atttatttct
tccattttat tctaatgtta cagggtttca tactattaat catacgtttg 21720
gcaaccctgt catacctttt aaggatggta tttattttgc tgccacagag aaatcaaatg
21780 ttgtccgtgg ttgggttttt ggttctacca tgaacaacaa gtcacagtcg
gtgattatta 21840 ttaacaattc tactaatgtt gttatacgag catgtaactt
tgaattgtgt gacaaccctt 21900 tctttgctgt ttctaaaccc atgggtacac
agacacatac tatgatattc gataatgcat 21960 ttaattgcac tttcgagtac
atatctgatg ccttttcgct tgatgtttca gaaaagtcag 22020 gtaattttaa
acacttacga gagtttgtgt ttaaaaataa agatgggttt ctctatgttt 22080
ataagggcta tcaacctata gatgtagttc gtgatctacc ttctggtttt aacactttga
22140 aacctatttt taagttgcct cttggtatta acattacaaa ttttagagcc
attcttacag 22200 ccttttcacc tgctcaagac atttggggca cgtcagctgc
agcctatttt gttggctatt 22260 taaagccaac tacatttatg ctcaagtatg
atgaaaatgg tacaatcaca gatgctgttg 22320 attgttctca aaatccactt
gctgaactca aatgctctgt taagagcttt gagattgaca 22380 aaggaattta
ccagacctct aatttcaggg ttgttccctc aggagatgtt gtgagattcc 22440
ctaatattac aaacttgtgt ccttttggag aggtttttaa tgctactaaa ttcccttctg
22500 tctatgcatg ggagagaaaa aaaatttcta attgtgttgc tgattactct
gtgctctaca 22560 actcaacatt tttttcaacc tttaagtgct atggcgtttc
tgccactaag ttgaatgatc 22620 tttgcttctc caatgtctat gcagattctt
ttgtagtcaa gggagatgat gtaagacaaa 22680 tagcgccagg acaaactggt
gttattgctg attataatta taaattgcca gatgatttca 22740 tgggttgtgt
ccttgcttgg aatactagga acattgatgc tacttcaact ggtaattata 22800
attataaata taggtatctt agacatggca agcttaggcc ctttgagaga gacatatcta
22860 atgtgccttt ctcccctgat ggcaaacctt gcaccccacc tgctcttaat
tgttattggc 22920 cattaaatga ttatggtttt tacaccacta ctggcattgg
ctaccaacct tacagagttg 22980 tagtactttc ttttgaactt ttaaatgcac
cggccacggt ttgtggacca aaattatcca 23040 ctgaccttat taagaaccag
tgtgtcaatt ttaattttaa tggactcact ggtactggtg 23100 tgttaactcc
ttcttcaaag agatttcaac catttcaaca atttggccgt gatgtttctg 23160
atttcactga ttccgttcga gatcctaaaa catctgaaat attagacatt tcaccttgct
23220 cttttggggg tgtaagtgta attacacctg gaacaaatgc ttcatctgaa
gttgctgttc 23280 tatatcaaga tgttaactgc actgatgttt ctacagcaat
tcatgcagat caactcacac 23340 cagcttggcg catatattct actggaaaca
atgtattcca gactcaagca ggctgtctta 23400 taggagctga gcatgtcgac
acttcttatg agtgcgacat tcctattgga gctggcattt 23460 gtgctagtta
ccatacagtt tctttattac gtagtactag ccaaaaatct attgtggctt 23520
atactatgtc tttaggtgct gatagttcaa ttgcttactc taataacacc attgctatac
23580 ctactaactt ttcaattagc attactacag aagtaatgcc tgtttctatg
gctaaaacct 23640 ccgtagattg taatatgtac atctgcggag attctactga
atgtgctaat ttgcttctcc 23700 aatatggtag cttttgcaca caactaaatc
gtgcactctc aggtattgct gctgaacagg 23760 atcgcaacac acgtgaagtg
ttcgctcaag tcaaacaaat gtacaaaacc ccaactttga 23820 aatattttgg
tggttttaat ttttcacaaa tattacctga ccctctaaag ccaactaaga 23880
ggtcttttat tgaggacttg ctctttaata aggtgacact cgctgatgct ggcttcatga
23940 agcaatatgg cgaatgccta ggtgatatta atgctagaga tctcatttgt
gcgcagaagt 24000 tcaatggact tacagtgttg ccacctctgc tcactgatga
tatgattgct gcctacactg 24060 ctgctctagt tagtggtact gccactgctg
gatggacatt tggtgctggc gctgctcttc 24120 aaataccttt tgctatgcaa
atggcatata ggttcaatgg cattggagtt acccaaaatg 24180 ttctctatga
gaaccaaaaa caaatcgcca accaatttaa caaggcgatt agtcaaattc 24240
aagaatcact tacaacaaca tcaactgcat tgggcaagct gcaagacgtt gttaaccaga
24300 atgctcaagc attaaacaca cttgttaaac aacttagctc taattttggt
gcaatttcaa 24360 gtgtgctaaa tgatatcctt tcgcgacttg ataaagtcga
ggcggaggta caaattgaca 24420 ggttaattac aggcagactt caaagccttc
aaacctatgt aacacaacaa ctaatcaggg 24480 ctgctgaaat cagggcttct
gctaatcttg ctgctactaa aatgtctgag tgtgttcttg 24540 gacaatcaaa
aagagttgac ttttgtggaa agggctacca ccttatgtcc ttcccacaag 24600
cagccccgca tggtgttgtc ttcctacatg tcacgtatgt gccatcccag gagaggaact
24660 tcaccacagc gccagcaatt tgtcatgaag gcaaagcata cttccctcgt
gaaggtgttt 24720 ttgtgtttaa tggcacttct tggtttatta cacagaggaa
cttcttttct ccacaaataa 24780 ttactacaga caatacattt gtctcaggaa
attgtgatgt cgttattggc atcattaaca 24840 acacagttta tgatcctctg
caacctgagc ttgactcatt caaagaagag ctggacaagt 24900 acttcaaaaa
tcatacatca ccagatgttg atcttggcga catttcaggc attaacgctt 24960
ctgtcgtcaa cattcaaaaa gaaattgacc gcctcaatga ggtcgctaaa aatttaaatg
25020 aatcactcat tgaccttcaa gaattgggaa aatatgagca atatattaaa
tggccttggt 25080 atgtttggct cggcttcatt gctggactaa ttgccatcgt
catggttaca atcttgcttt 25140 gttgcatgac tagttgttgc agttgcctca
agggtgcatg ctcttgtggt tcttgctgca 25200 agtttgatga ggatgactct
gagccagttc tcaagggtgt caaattacat tacacataaa 25260 cgaacttatg
gatttgttta tgagattttt tactcttgga tcaattactg cacagccagt 25320
aaaaattgac aatgcttctc ctgcaagtac tgttcatgct acagcaacga taccgctaca
25380 agcctcactc cctttcggat ggcttgttat tggcgttgca tttcttgctg
tttttcagag 25440 cgctaccaaa ataattgcgc tcaataaaag atggcagcta
gccctttata agggcttcca 25500 gttcatttgc aatttactgc tgctatttgt
taccatctat tcacatcttt tgcttgtcgc 25560 tgcaggtatg gaggcgcaat
ttttgtacct ctatgccttg atatattttc tacaatgcat 25620 caacgcatgt
agaattatta tgagatgttg gctttgttgg aagtgcaaat ccaagaaccc 25680
attactttat gatgccaact actttgtttg ctggcacaca cataactatg actactgtat
25740 accatataac agtgtcacag atacaattgt cgttactgaa ggtgacggca
tttcaacacc 25800 aaaactcaaa gaagactacc aaattggtgg ttattctgag
gataggcact caggtgttaa 25860 agactatgtc gttgtacatg gctatttcac
cgaagtttac taccagcttg agtctacaca 25920 aattactaca gacactggta
ttgaaaatgc tacattcttc atctttaaca agcttgttaa 25980 agacccaccg
aatgtgcaaa tacacacaat cgacggctct tcaggagttg ctaatccagc 26040
aatggatcca atttatgatg agccgacgac gactactagc gtgcctttgt aagcacaaga
26100 aagtgagtac gaacttatgt actcattcgt ttcggaagaa acaggtacgt
taatagttaa 26160 tagcgtactt ctttttcttg ctttcgtggt attcttgcta
gtcacactag ccatccttac 26220 tgcgcttcga ttgtgtgcgt actgctgcaa
tattgttaac gtgagtttag taaaaccaac 26280 ggtttacgtc tactcgcgtg
ttaaaaatct gaactcttct gaaggagttc ctgatcttct 26340 ggtctaaacg
aactaactat tattattatt ctgtttggaa ctttaacatt gcttatcatg 26400
gcagacaacg gtactattac cgttgaggag cttaaacaac tcctggaaca atggaaccta
26460 gtaataggtt tcctattcct agcctggatt atgttactac aatttgccta
ttctaatcgg 26520 aacaggtttt tgtacataat aaagcttgtt ttcctctggc
tcttgtggcc agtaacactt 26580 gcttgttttg tgcttgctgc tgtctacaga
attaattggg tgactggcgg gattgcgatt 26640 gcaatggctt gtattgtagg
cttgatgtgg cttagctact tcgttgcttc cttcaggctg 26700 tttgctcgta
cccgctcaat gtggtcattc aacccagaaa caaacattct tctcaatgtg 26760
cctctccggg ggacaattgt gaccagaccg ctcatggaaa gtgaacttgt cattggtgct
26820 gtgatcattc gtggtcactt gcgaatggcc ggacactccc tagggcgctg
tgacattaag 26880 gacctgccaa aagagatcac tgtggctaca tcacgaacgc
tttcttatta caaattagga 26940 gcgtcgcagc gtgtaggcac tgattcaggt
tttgctgcat acaaccgcta ccgtattgga 27000 aactataaat taaatacaga
ccacgccggt agcaacgaca atattgcttt gctagtacag 27060 taagtgacaa
cagatgtttc atcttgttga cttccaggtt acaatagcag agatattgat 27120
tatcattatg aggactttca ggattgctat ttggaatctt gacgttataa taagttcaat
27180 agtgagacaa ttatttaagc ctctaactaa gaagaattat tcggagttag
atgatgaaga 27240 acctatggag ttagattatc cataaaacga acatgaaaat
tattctcttc ctgacattga 27300 ttgtatttac atcttgcgag ctatatcact
atcaggagtg tgttagaggt acgactgtac 27360 tactaaaaga accttgccca
tcaggaacat acgagggcaa ttcaccattt caccctcttg 27420 ctgacaataa
atttgcacta acttgcacta gcacacactt tgcttttgct tgtgctgacg 27480
gtactcgaca tacctatcag ctgcgtgcaa gatcagtttc accaaaactt ttcatcagac
27540 aagaggaggt tcaacaagag ctctactcgc cactttttct cattgttgct
gctctagtat 27600 ttttaatact ttgcttcacc attaagagaa agacagaatg
aatgagctca ctttaattga 27660 cttctatttg tgctttttag cctttctgct
attccttgtt ttaataatgc ttattatatt 27720 ttggttttca ctcgaaatcc
aggatctaga agaaccttgt accaaagtct aaacgaacat 27780 gaaacttctc
attgttttga cttgtatttc tctatgcagt tgcatatgca ctgtagtaca 27840
gcgctgtgca tctaataaac ctcatgtgct tgaagatcct tgtaaggtac aacactaggg
27900 gtaatactta tagcactgct tggctttgtg ctctaggaaa ggttttacct
tttcatagat 27960 ggcacactat ggttcaaaca tgcacaccta atgttactat
caactgtcaa gatccagctg 28020 gtggtgcgct tatagctagg tgttggtacc
ttcatgaagg tcaccaaact gctgcattta 28080 gagacgtact tgttgtttta
aataaacgaa caaattaaaa tgtctgataa tggaccccaa 28140 tcaaaccaac
gtagtgcccc ccgcattaca tttggtggac ccacagattc aactgacaat 28200
aaccagaatg gaggacgcaa tggggcaagg ccaaaacagc gccgacccca aggtttaccc
28260 aataatactg cgtcttggtt cacagctctc actcagcatg gcaaggagga
acttagattc 28320 cctcgaggcc agggcgttcc aatcaacacc aatagtggtc
cagatgacca aattggctac 28380 taccgaagag ctacccgacg agttcgtggt
ggtgacggca aaatgaaaga gctcagcccc 28440 agatggtact tctattacct
aggaactggc ccagaagctt cacttcccta cggcgctaac 28500 aaagaaggca
tcgtatgggt tgcaactgag ggagccttga atacacccaa agaccacatt 28560
ggcacccgca atcctaataa caatgctgcc accgtgctac aacttcctca aggaacaaca
28620 ttgccaaaag gcttctacgc agagggaagc agaggcggca gtcaagcctc
ttctcgctcc 28680 tcatcacgta gtcgcggtaa ttcaagaaat tcaactcctg
gcagcagtag gggaaattct 28740 cctgctcgaa tggctagcgg aggtggtgaa
actgccctcg cgctattgct gctagacaga 28800 ttgaaccagc ttgagagcaa
agtttctggt aaaggccaac aacaacaagg ccaaactgtc 28860 actaagaaat
ctgctgctga ggcatctaaa aagcctcgcc aaaaacgtac tgccacaaaa 28920
cagtacaacg tcactcaagc atttgggaga cgtggtccag aacaaaccca aggaaatttc
28980 ggggaccaag acctaatcag acaaggaact gattacaaac attggccgca
aattgcacaa 29040 tttgctccaa gtgcctctgc attctttgga atgtcacgca
ttggcatgga agtcacacct 29100 tcgggaacat ggctgactta tcatggagcc
attaaattgg atgacaaaga tccacaattc 29160 aaagacaacg tcatactgct
gaacaagcac attgacgcat acaaaacatt cccaccaaca 29220 gagcctaaaa
aggacaaaaa gaaaaagact gatgaagctc agcctttgcc gcagagacaa 29280
aagaagcagc ccactgtgac tcttcttcct gcggctgaca tggatgattt ctccagacaa
29340 cttcaaaatt ccatgagtgg agcttctgct gattcaactc aggcataaac
actcatgatg 29400 accacacaag gcagatgggc tatgtaaacg ttttcgcaat
tccgtttacg atacatagtc 29460 tactcttgtg cagaatgaat tctcgtaact
aaacagcaca agtaggttta gttaacttta 29520 atctcacata gcaatcttta
atcaatgtgt aacattaggg aggacttgaa agagccacca 29580 cattttcatc
gaggccacgc ggagtacgat cgagggtaca gtgaataatg ctagggagag 29640
ctgcctatat ggaagagccc taatgtgtaa aattaatttt agtagtgcta tccccatgtg
29700 attttaatag cttcttagga gaatgacaaa aaaaaaaaaa aaaaaa 29746
2 3945 DNA CORONAVIRUS CDS (89)..(3853) 2
ttctcttctg gaaaaaggta
ggcttatcat tagagaaaac aacagagttg tggtttcaag 60 tgatattctt
gttaacaact aaacgaac atg ttt att ttc tta tta ttt ctt 112 Met Phe Ile
Phe Leu Leu Phe Leu 1 5 act ctc act agt ggt agt gac ctt gac cgg tgc
acc act ttt gat gat 160 Thr Leu Thr Ser Gly Ser Asp Leu Asp
Arg Cys Thr Thr Phe Asp Asp 10 15 20 gtt caa gct cct aat tac act
caa cat act tca tct atg agg ggg gtt 208 Val Gln Ala Pro Asn Tyr Thr
Gln His Thr Ser Ser Met Arg Gly Val 25 30 35 40 tac tat cct gat gaa
att ttt aga tca gac act ctt tat tta act cag 256 Tyr Tyr Pro Asp Glu
Ile Phe Arg Ser Asp Thr Leu Tyr Leu Thr Gln 45 50 55 gat tta ttt
ctt cca ttt tat tct aat gtt aca ggg ttt cat act att 304 Asp Leu Phe
Leu Pro Phe Tyr Ser Asn Val Thr Gly Phe His Thr Ile 60 65 70 aat
cat acg ttt ggc aac cct gtc ata cct ttt aag gat ggt att tat 352 Asn
His Thr Phe Gly Asn Pro Val Ile Pro Phe Lys Asp Gly Ile Tyr 75 80
85 ttt gct gcc aca gag aaa tca aat gtt gtc cgt ggt tgg gtt ttt ggt
400 Phe Ala Ala Thr Glu Lys Ser Asn Val Val Arg Gly Trp Val Phe Gly
90 95 100 tct acc atg aac aac aag tca cag tcg gtg att att att aac
aat tct 448 Ser Thr Met Asn Asn Lys Ser Gln Ser Val Ile Ile Ile Asn
Asn Ser 105 110 115 120 act aat gtt gtt ata cga gca tgt aac ttt gaa
ttg tgt gac aac cct 496 Thr Asn Val Val Ile Arg Ala Cys Asn Phe Glu
Leu Cys Asp Asn Pro 125 130 135 ttc ttt gct gtt tct aaa ccc atg ggt
aca cag aca cat act atg ata 544 Phe Phe Ala Val Ser Lys Pro Met Gly
Thr Gln Thr His Thr Met Ile 140 145 150 ttc gat aat gca ttt aat tgc
act ttc gag tac ata tct gat gcc ttt 592 Phe Asp Asn Ala Phe Asn Cys
Thr Phe Glu Tyr Ile Ser Asp Ala Phe 155 160 165 tcg ctt gat gtt tca
gaa aag tca ggt aat ttt aaa cac tta cga gag 640 Ser Leu Asp Val Ser
Glu Lys Ser Gly Asn Phe Lys His Leu Arg Glu 170 175 180 ttt gtg ttt
aaa aat aaa gat ggg ttt ctc tat gtt tat aag ggc tat 688 Phe Val Phe
Lys Asn Lys Asp Gly Phe Leu Tyr Val Tyr Lys Gly Tyr 185 190 195 200
caa cct ata gat gta gtt cgt gat cta cct tct ggt ttt aac act ttg 736
Gln Pro Ile Asp Val Val Arg Asp Leu Pro Ser Gly Phe Asn Thr Leu 205
210 215 aaa cct att ttt aag ttg cct ctt ggt att aac att aca aat ttt
aga 784 Lys Pro Ile Phe Lys Leu Pro Leu Gly Ile Asn Ile Thr Asn Phe
Arg 220 225 230 gcc att ctt aca gcc ttt tca cct gct caa gac att tgg
ggc acg tca 832 Ala Ile Leu Thr Ala Phe Ser Pro Ala Gln Asp Ile Trp
Gly Thr Ser 235 240 245 gct gca gcc tat ttt gtt ggc tat tta aag cca
act aca ttt atg ctc 880 Ala Ala Ala Tyr Phe Val Gly Tyr Leu Lys Pro
Thr Thr Phe Met Leu 250 255 260 aag tat gat gaa aat ggt aca atc aca
gat gct gtt gat tgt tct caa 928 Lys Tyr Asp Glu Asn Gly Thr Ile Thr
Asp Ala Val Asp Cys Ser Gln 265 270 275 280 aat cca ctt gct gaa ctc
aaa tgc tct gtt aag agc ttt gag att gac 976 Asn Pro Leu Ala Glu Leu
Lys Cys Ser Val Lys Ser Phe Glu Ile Asp 285 290 295 aaa gga att tac
cag acc tct aat ttc agg gtt gtt ccc tca gga gat 1024 Lys Gly Ile
Tyr Gln Thr Ser Asn Phe Arg Val Val Pro Ser Gly Asp 300 305 310 gtt
gtg aga ttc cct aat att aca aac ttg tgt cct ttt gga gag gtt 1072
Val Val Arg Phe Pro Asn Ile Thr Asn Leu Cys Pro Phe Gly Glu Val 315
320 325 ttt aat gct act aaa ttc cct tct gtc tat gca tgg gag aga aaa
aaa 1120 Phe Asn Ala Thr Lys Phe Pro Ser Val Tyr Ala Trp Glu Arg
Lys Lys 330 335 340 att tct aat tgt gtt gct gat tac tct gtg ctc tac
aac tca aca ttt 1168 Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu
Tyr Asn Ser Thr Phe 345 350 355 360 ttt tca acc ttt aag tgc tat ggc
gtt tct gcc act aag ttg aat gat 1216 Phe Ser Thr Phe Lys Cys Tyr
Gly Val Ser Ala Thr Lys Leu Asn Asp 365 370 375 ctt tgc ttc tcc aat
gtc tat gca gat tct ttt gta gtc aag gga gat 1264 Leu Cys Phe Ser
Asn Val Tyr Ala Asp Ser Phe Val Val Lys Gly Asp 380 385 390 gat gta
aga caa ata gcg cca gga caa act ggt gtt att gct gat tat 1312 Asp
Val Arg Gln Ile Ala Pro Gly Gln Thr Gly Val Ile Ala Asp Tyr 395 400
405 aat tat aaa ttg cca gat gat ttc atg ggt tgt gtc ctt gct tgg aat
1360 Asn Tyr Lys Leu Pro Asp Asp Phe Met Gly Cys Val Leu Ala Trp
Asn 410 415 420 act agg aac att gat gct act tca act ggt aat tat aat
tat aaa tat 1408 Thr Arg Asn Ile Asp Ala Thr Ser Thr Gly Asn Tyr
Asn Tyr Lys Tyr 425 430 435 440 agg tat ctt aga cat ggc aag ctt agg
ccc ttt gag aga gac ata tct 1456 Arg Tyr Leu Arg His Gly Lys Leu
Arg Pro Phe Glu Arg Asp Ile Ser 445 450 455 aat gtg cct ttc tcc cct
gat ggc aaa cct tgc acc cca cct gct ctt 1504 Asn Val Pro Phe Ser
Pro Asp Gly Lys Pro Cys Thr Pro Pro Ala Leu 460 465 470 aat tgt tat
tgg cca tta aat gat tat ggt ttt tac acc act act ggc 1552 Asn Cys
Tyr Trp Pro Leu Asn Asp Tyr Gly Phe Tyr Thr Thr Thr Gly 475 480 485
att ggc tac caa cct tac aga gtt gta gta ctt tct ttt gaa ctt tta
1600 Ile Gly Tyr Gln Pro Tyr Arg Val Val Val Leu Ser Phe Glu Leu
Leu 490 495 500 aat gca ccg gcc acg gtt tgt gga cca aaa tta tcc act
gac ctt att 1648 Asn Ala Pro Ala Thr Val Cys Gly Pro Lys Leu Ser
Thr Asp Leu Ile 505 510 515 520 aag aac cag tgt gtc aat ttt aat ttt
aat gga ctc act ggt act ggt 1696 Lys Asn Gln Cys Val Asn Phe Asn
Phe Asn Gly Leu Thr Gly Thr Gly 525 530 535 gtg tta act cct tct tca
aag aga ttt caa cca ttt caa caa ttt ggc 1744 Val Leu Thr Pro Ser
Ser Lys Arg Phe Gln Pro Phe Gln Gln Phe Gly 540 545 550 cgt gat gtt
tct gat ttc act gat tcc gtt cga gat cct aaa aca tct 1792 Arg Asp
Val Ser Asp Phe Thr Asp Ser Val Arg Asp Pro Lys Thr Ser 555 560 565
gaa ata tta gac att tca cct tgc tct ttt ggg ggt gta agt gta att
1840 Glu Ile Leu Asp Ile Ser Pro Cys Ser Phe Gly Gly Val Ser Val
Ile 570 575 580 aca cct gga aca aat gct tca tct gaa gtt gct gtt cta
tat caa gat 1888 Thr Pro Gly Thr Asn Ala Ser Ser Glu Val Ala Val
Leu Tyr Gln Asp 585 590 595 600 gtt aac tgc act gat gtt tct aca gca
att cat gca gat caa ctc aca 1936 Val Asn Cys Thr Asp Val Ser Thr
Ala Ile His Ala Asp Gln Leu Thr 605 610 615 cca gct tgg cgc ata tat
tct act gga aac aat gta ttc cag act caa 1984 Pro Ala Trp Arg Ile
Tyr Ser Thr Gly Asn Asn Val Phe Gln Thr Gln 620 625 630 gca ggc tgt
ctt ata gga gct gag cat gtc gac act tct tat gag tgc 2032 Ala Gly
Cys Leu Ile Gly Ala Glu His Val Asp Thr Ser Tyr Glu Cys 635 640 645
gac att cct att gga gct ggc att tgt gct agt tac cat aca gtt tct
2080 Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala Ser Tyr His Thr Val
Ser 650 655 660 tta tta cgt agt act agc caa aaa tct att gtg gct tat
act atg tct 2128 Leu Leu Arg Ser Thr Ser Gln Lys Ser Ile Val Ala
Tyr Thr Met Ser 665 670 675 680 tta ggt gct gat agt tca att gct tac
tct aat aac acc att gct ata 2176 Leu Gly Ala Asp Ser Ser Ile Ala
Tyr Ser Asn Asn Thr Ile Ala Ile 685 690 695 cct act aac ttt tca att
agc att act aca gaa gta atg cct gtt tct 2224 Pro Thr Asn Phe Ser
Ile Ser Ile Thr Thr Glu Val Met Pro Val Ser 700 705 710 atg gct aaa
acc tcc gta gat tgt aat atg tac atc tgc gga gat tct 2272 Met Ala
Lys Thr Ser Val Asp Cys Asn Met Tyr Ile Cys Gly Asp Ser 715 720 725
act gaa tgt gct aat ttg ctt ctc caa tat ggt agc ttt tgc aca caa
2320 Thr Glu Cys Ala Asn Leu Leu Leu Gln Tyr Gly Ser Phe Cys Thr
Gln 730 735 740 cta aat cgt gca ctc tca ggt att gct gct gaa cag gat
cgc aac aca 2368 Leu Asn Arg Ala Leu Ser Gly Ile Ala Ala Glu Gln
Asp Arg Asn Thr 745 750 755 760 cgt gaa gtg ttc gct caa gtc aaa caa
atg tac aaa acc cca act ttg 2416 Arg Glu Val Phe Ala Gln Val Lys
Gln Met Tyr Lys Thr Pro Thr Leu 765 770 775 aaa tat ttt ggt ggt ttt
aat ttt tca caa ata tta cct gac cct cta 2464 Lys Tyr Phe Gly Gly
Phe Asn Phe Ser Gln Ile Leu Pro Asp Pro Leu 780 785 790 aag cca act
aag agg tct ttt att gag gac ttg ctc ttt aat aag gtg 2512 Lys Pro
Thr Lys Arg Ser Phe Ile Glu Asp Leu Leu Phe Asn Lys Val 795 800 805
aca ctc gct gat gct ggc ttc atg aag caa tat ggc gaa tgc cta ggt
2560 Thr Leu Ala Asp Ala Gly Phe Met Lys Gln Tyr Gly Glu Cys Leu
Gly 810 815 820 gat att aat gct aga gat ctc att tgt gcg cag aag ttc
aat gga ctt 2608 Asp Ile Asn Ala Arg Asp Leu Ile Cys Ala Gln Lys
Phe Asn Gly Leu 825 830 835 840 aca gtg ttg cca cct ctg ctc act gat
gat atg att gct gcc tac act 2656 Thr Val Leu Pro Pro Leu Leu Thr
Asp Asp Met Ile Ala Ala Tyr Thr 845 850 855 gct gct cta gtt agt ggt
act gcc act gct gga tgg aca ttt ggt gct 2704 Ala Ala Leu Val Ser
Gly Thr Ala Thr Ala Gly Trp Thr Phe Gly Ala 860 865 870 ggc gct gct
ctt caa ata cct ttt gct atg caa atg gca tat agg ttc 2752 Gly Ala
Ala Leu Gln Ile Pro Phe Ala Met Gln Met Ala Tyr Arg Phe 875 880 885
aat ggc att gga gtt acc caa aat gtt ctc tat gag aac caa aaa caa
2800 Asn Gly Ile Gly Val Thr Gln Asn Val Leu Tyr Glu Asn Gln Lys
Gln 890 895 900 atc gcc aac caa ttt aac aag gcg att agt caa att caa
gaa tca ctt 2848 Ile Ala Asn Gln Phe Asn Lys Ala Ile Ser Gln Ile
Gln Glu Ser Leu 905 910 915 920 aca aca aca tca act gca ttg ggc aag
ctg caa gac gtt gtt aac cag 2896 Thr Thr Thr Ser Thr Ala Leu Gly
Lys Leu Gln Asp Val Val Asn Gln 925 930 935 aat gct caa gca tta aac
aca ctt gtt aaa caa ctt agc tct aat ttt 2944 Asn Ala Gln Ala Leu
Asn Thr Leu Val Lys Gln Leu Ser Ser Asn Phe 940 945 950 ggt gca att
tca agt gtg cta aat gat atc ctt tcg cga ctt gat aaa 2992 Gly Ala
Ile Ser Ser Val Leu Asn Asp Ile Leu Ser Arg Leu Asp Lys 955 960 965
gtc gag gcg gag gta caa att gac agg tta att aca ggc aga ctt caa
3040 Val Glu Ala Glu Val Gln Ile Asp Arg Leu Ile Thr Gly Arg Leu
Gln 970 975 980 agc ctt caa acc tat gta aca caa caa cta atc agg gct
gct gaa atc 3088 Ser Leu Gln Thr Tyr Val Thr Gln Gln Leu Ile Arg
Ala Ala Glu Ile 985 990 995 1000 agg gct tct gct aat ctt gct gct
act aaa atg tct gag tgt gtt 3133 Arg Ala Ser Ala Asn Leu Ala Ala
Thr Lys Met Ser Glu Cys Val 1005 1010 1015 ctt gga caa tca aaa aga
gtt gac ttt tgt gga aag ggc tac cac 3178 Leu Gly Gln Ser Lys Arg
Val Asp Phe Cys Gly Lys Gly Tyr His 1020 1025 1030 ctt atg tcc ttc
cca caa gca gcc ccg cat ggt gtt gtc ttc cta 3223 Leu Met Ser Phe
Pro Gln Ala Ala Pro His Gly Val Val Phe Leu 1035 1040 1045 cat gtc
acg tat gtg cca tcc cag gag agg aac ttc acc aca gcg 3268 His Val
Thr Tyr Val Pro Ser Gln Glu Arg Asn Phe Thr Thr Ala 1050 1055 1060
cca gca att tgt cat gaa ggc aaa gca tac ttc cct cgt gaa ggt 3313
Pro Ala Ile Cys His Glu Gly Lys Ala Tyr Phe Pro Arg Glu Gly 1065
1070 1075 gtt ttt gtg ttt aat ggc act tct tgg ttt att aca cag agg
aac 3358 Val Phe Val Phe Asn Gly Thr Ser Trp Phe Ile Thr Gln Arg
Asn 1080 1085 1090 ttc ttt tct cca caa ata att act aca gac aat aca
ttt gtc tca 3403 Phe Phe Ser Pro Gln Ile Ile Thr Thr Asp Asn Thr
Phe Val Ser 1095 1100 1105 gga aat tgt gat gtc gtt att ggc atc att
aac aac aca gtt tat 3448 Gly Asn Cys Asp Val Val Ile Gly Ile Ile
Asn Asn Thr Val Tyr 1110 1115 1120 gat cct ctg caa cct gag ctt gac
tca ttc aaa gaa gag ctg gac 3493 Asp Pro Leu Gln Pro Glu Leu Asp
Ser Phe Lys Glu Glu Leu Asp 1125 1130 1135 aag tac ttc aaa aat cat
aca tca cca gat gtt gat ctt ggc gac 3538 Lys Tyr Phe Lys Asn His
Thr Ser Pro Asp Val Asp Leu Gly Asp 1140 1145 1150 att tca ggc att
aac gct tct gtc gtc aac att caa aaa gaa att 3583 Ile Ser Gly Ile
Asn Ala Ser Val Val Asn Ile Gln Lys Glu Ile 1155 1160 1165 gac cgc
ctc aat gag gtc gct aaa aat tta aat gaa tca ctc att 3628 Asp Arg
Leu Asn Glu Val Ala Lys Asn Leu Asn Glu Ser Leu Ile 1170 1175 1180
gac ctt caa gaa ttg gga aaa tat gag caa tat att aaa tgg cct 3673
Asp Leu Gln Glu Leu Gly Lys Tyr Glu Gln Tyr Ile Lys Trp Pro 1185
1190 1195 tgg tat gtt tgg ctc ggc ttc att gct gga cta att gcc atc
gtc 3718 Trp Tyr Val Trp Leu Gly Phe Ile Ala Gly Leu Ile Ala Ile
Val 1200 1205 1210 atg gtt aca atc ttg ctt tgt tgc atg act agt tgt
tgc agt tgc 3763 Met Val Thr Ile Leu Leu Cys Cys Met Thr Ser Cys
Cys Ser Cys 1215 1220 1225 ctc aag ggt gca tgc tct tgt ggt tct tgc
tgc aag ttt gat gag 3808 Leu Lys Gly Ala Cys Ser Cys Gly Ser Cys
Cys Lys Phe Asp Glu 1230 1235 1240 gat gac tct gag cca gtt ctc aag
ggt gtc aaa tta cat tac aca 3853 Asp Asp Ser Glu Pro Val Leu Lys
Gly Val Lys Leu His Tyr Thr 1245 1250 1255 taaacgaact tatggatttg
tttatgagat tttttactct tggatcaatt actgcacagc 3913 cagtaaaaat
tgacaatgct tctcctgcaa gt 3945
3 1255 PRT CORONAVIRUS 3
Met Phe Ile
Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly Ser Asp Leu 1 5 10 15 Asp
Arg Cys Thr Thr Phe Asp Asp Val Gln Ala Pro Asn Tyr Thr Gln 20 25
30 His Thr Ser Ser Met Arg Gly Val Tyr Tyr Pro Asp Glu Ile Phe Arg
35 40 45 Ser Asp Thr Leu Tyr Leu Thr Gln Asp Leu Phe Leu Pro Phe
Tyr Ser 50 55 60 Asn Val Thr Gly Phe His Thr Ile Asn His Thr Phe
Gly Asn Pro Val 65 70 75 80 Ile Pro Phe Lys Asp Gly Ile Tyr Phe Ala
Ala Thr Glu Lys Ser Asn 85 90 95 Val Val Arg Gly Trp Val Phe Gly
Ser Thr Met Asn Asn Lys Ser Gln 100 105 110 Ser Val Ile Ile Ile Asn
Asn Ser Thr Asn Val Val Ile Arg Ala Cys 115 120 125 Asn Phe Glu Leu
Cys Asp Asn Pro Phe Phe Ala Val Ser Lys Pro Met 130 135 140 Gly Thr
Gln Thr His Thr Met Ile Phe Asp Asn Ala Phe Asn Cys Thr 145 150 155
160 Phe Glu Tyr Ile Ser Asp Ala Phe Ser Leu Asp Val Ser Glu Lys Ser
165 170 175 Gly Asn Phe Lys His Leu Arg Glu Phe Val Phe Lys Asn Lys
Asp Gly 180 185 190 Phe Leu Tyr Val Tyr Lys Gly Tyr Gln Pro Ile Asp
Val Val Arg Asp 195 200 205 Leu Pro Ser Gly Phe Asn Thr Leu Lys Pro
Ile Phe Lys Leu Pro Leu 210 215 220 Gly Ile Asn Ile Thr Asn Phe Arg
Ala Ile Leu Thr Ala Phe Ser Pro 225 230 235 240 Ala Gln Asp Ile Trp
Gly Thr Ser Ala Ala Ala Tyr Phe Val Gly Tyr 245 250 255 Leu Lys Pro
Thr Thr Phe Met Leu Lys Tyr Asp Glu Asn Gly Thr Ile 260 265 270 Thr
Asp Ala Val Asp Cys Ser Gln Asn Pro Leu Ala Glu Leu Lys Cys 275 280
285 Ser Val Lys Ser Phe Glu Ile Asp Lys Gly Ile Tyr Gln Thr Ser Asn
290 295 300 Phe Arg Val Val Pro Ser Gly Asp Val Val Arg Phe Pro Asn
Ile Thr 305 310 315 320 Asn Leu Cys Pro Phe Gly Glu Val Phe Asn Ala
Thr Lys Phe Pro Ser 325 330 335 Val Tyr Ala Trp Glu Arg Lys Lys Ile
Ser Asn Cys Val Ala Asp Tyr 340 345 350 Ser Val Leu Tyr Asn Ser Thr
Phe Phe Ser Thr Phe Lys Cys Tyr Gly 355 360 365 Val Ser Ala Thr Lys
Leu Asn Asp Leu Cys Phe Ser Asn Val Tyr Ala 370 375 380 Asp Ser Phe
Val Val Lys Gly Asp Asp Val Arg Gln Ile Ala Pro Gly 385
390 395 400 Gln Thr Gly Val Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp
Asp Phe 405 410 415 Met Gly Cys Val Leu Ala Trp Asn Thr Arg Asn Ile
Asp Ala Thr Ser 420 425 430 Thr Gly Asn Tyr Asn Tyr Lys Tyr Arg Tyr
Leu Arg His Gly Lys Leu 435 440 445 Arg Pro Phe Glu Arg Asp Ile Ser
Asn Val Pro Phe Ser Pro Asp Gly 450 455 460 Lys Pro Cys Thr Pro Pro
Ala Leu Asn Cys Tyr Trp Pro Leu Asn Asp 465 470 475 480 Tyr Gly Phe
Tyr Thr Thr Thr Gly Ile Gly Tyr Gln Pro Tyr Arg Val 485 490 495 Val
Val Leu Ser Phe Glu Leu Leu Asn Ala Pro Ala Thr Val Cys Gly 500 505
510 Pro Lys Leu Ser Thr Asp Leu Ile Lys Asn Gln Cys Val Asn Phe Asn
515 520 525 Phe Asn Gly Leu Thr Gly Thr Gly Val Leu Thr Pro Ser Ser
Lys Arg 530 535 540 Phe Gln Pro Phe Gln Gln Phe Gly Arg Asp Val Ser
Asp Phe Thr Asp 545 550 555 560 Ser Val Arg Asp Pro Lys Thr Ser Glu
Ile Leu Asp Ile Ser Pro Cys 565 570 575 Ser Phe Gly Gly Val Ser Val
Ile Thr Pro Gly Thr Asn Ala Ser Ser 580 585 590 Glu Val Ala Val Leu
Tyr Gln Asp Val Asn Cys Thr Asp Val Ser Thr 595 600 605 Ala Ile His
Ala Asp Gln Leu Thr Pro Ala Trp Arg Ile Tyr Ser Thr 610 615 620 Gly
Asn Asn Val Phe Gln Thr Gln Ala Gly Cys Leu Ile Gly Ala Glu 625 630
635 640 His Val Asp Thr Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly
Ile 645 650 655 Cys Ala Ser Tyr His Thr Val Ser Leu Leu Arg Ser Thr
Ser Gln Lys 660 665 670 Ser Ile Val Ala Tyr Thr Met Ser Leu Gly Ala
Asp Ser Ser Ile Ala 675 680 685 Tyr Ser Asn Asn Thr Ile Ala Ile Pro
Thr Asn Phe Ser Ile Ser Ile 690 695 700 Thr Thr Glu Val Met Pro Val
Ser Met Ala Lys Thr Ser Val Asp Cys 705 710 715 720 Asn Met Tyr Ile
Cys Gly Asp Ser Thr Glu Cys Ala Asn Leu Leu Leu 725 730 735 Gln Tyr
Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Ser Gly Ile 740 745 750
Ala Ala Glu Gln Asp Arg Asn Thr Arg Glu Val Phe Ala Gln Val Lys 755
760 765 Gln Met Tyr Lys Thr Pro Thr Leu Lys Tyr Phe Gly Gly Phe Asn
Phe 770 775 780 Ser Gln Ile Leu Pro Asp Pro Leu Lys Pro Thr Lys Arg
Ser Phe Ile 785 790 795 800 Glu Asp Leu Leu Phe Asn Lys Val Thr Leu
Ala Asp Ala Gly Phe Met 805 810 815 Lys Gln Tyr Gly Glu Cys Leu Gly
Asp Ile Asn Ala Arg Asp Leu Ile 820 825 830 Cys Ala Gln Lys Phe Asn
Gly Leu Thr Val Leu Pro Pro Leu Leu Thr 835 840 845 Asp Asp Met Ile
Ala Ala Tyr Thr Ala Ala Leu Val Ser Gly Thr Ala 850 855 860 Thr Ala
Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe 865 870 875
880 Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr Gln Asn
885 890 895 Val Leu Tyr Glu Asn Gln Lys Gln Ile Ala Asn Gln Phe Asn
Lys Ala 900 905 910 Ile Ser Gln Ile Gln Glu Ser Leu Thr Thr Thr Ser
Thr Ala Leu Gly 915 920 925 Lys Leu Gln Asp Val Val Asn Gln Asn Ala
Gln Ala Leu Asn Thr Leu 930 935 940 Val Lys Gln Leu Ser Ser Asn Phe
Gly Ala Ile Ser Ser Val Leu Asn 945 950 955 960 Asp Ile Leu Ser Arg
Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp 965 970 975 Arg Leu Ile
Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln 980 985 990 Gln
Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala 995
1000 1005 Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val
Asp 1010 1015 1020 Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro
Gln Ala Ala 1025 1030 1035 Pro His Gly Val Val Phe Leu His Val Thr
Tyr Val Pro Ser Gln 1040 1045 1050 Glu Arg Asn Phe Thr Thr Ala Pro
Ala Ile Cys His Glu Gly Lys 1055 1060 1065 Ala Tyr Phe Pro Arg Glu
Gly Val Phe Val Phe Asn Gly Thr Ser 1070 1075 1080 Trp Phe Ile Thr
Gln Arg Asn Phe Phe Ser Pro Gln Ile Ile Thr 1085 1090 1095 Thr Asp
Asn Thr Phe Val Ser Gly Asn Cys Asp Val Val Ile Gly 1100 1105 1110
Ile Ile Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro Glu Leu Asp 1115
1120 1125 Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn His Thr
Ser 1130 1135 1140 Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn
Ala Ser Val 1145 1150 1155 Val Asn Ile Gln Lys Glu Ile Asp Arg Leu
Asn Glu Val Ala Lys 1160 1165 1170 Asn Leu Asn Glu Ser Leu Ile Asp
Leu Gln Glu Leu Gly Lys Tyr 1175 1180 1185 Glu Gln Tyr Ile Lys Trp
Pro Trp Tyr Val Trp Leu Gly Phe Ile 1190 1195 1200 Ala Gly Leu Ile
Ala Ile Val Met Val Thr Ile Leu Leu Cys Cys 1205 1210 1215 Met Thr
Ser Cys Cys Ser Cys Leu Lys Gly Ala Cys Ser Cys Gly 1220 1225 1230
Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro Val Leu Lys 1235
1240 1245 Gly Val Lys Leu His Tyr Thr 1250 1255
4 3943 DNA CORONAVIRUS 4
ctcttctgga aaaaggtagg cttatcatta gagaaaacaa
cagagttgtg gtttcaagtg 60 atattcttgt taacaactaa acgaacatgt
ttattttctt attatttctt actctcacta 120 gtggtagtga ccttgaccgg
tgcaccactt ttgatgatgt tcaagctcct aattacactc 180 aacatacttc
atctatgagg ggggtttact atcctgatga aatttttaga tcagacactc 240
tttatttaac tcaggattta tttcttccat tttattctaa tgttacaggg tttcatacta
300 ttaatcatac gtttggcaac cctgtcatac cttttaagga tggtatttat
tttgctgcca 360 cagagaaatc aaatgttgtc cgtggttggg tttttggttc
taccatgaac aacaagtcac 420 agtcggtgat tattattaac aattctacta
atgttgttat acgagcatgt aactttgaat 480 tgtgtgacaa ccctttcttt
gctgtttcta aacccatggg tacacagaca catactatga 540 tattcgataa
tgcatttaat tgcactttcg agtacatatc tgatgccttt tcgcttgatg 600
tttcagaaaa gtcaggtaat tttaaacact tacgagagtt tgtgtttaaa aataaagatg
660 ggtttctcta tgtttataag ggctatcaac ctatagatgt agttcgtgat
ctaccttctg 720 gttttaacac tttgaaacct atttttaagt tgcctcttgg
tattaacatt acaaatttta 780 gagccattct tacagccttt tcacctgctc
aagacatttg gggcacgtca gctgcagcct 840 attttgttgg ctatttaaag
ccaactacat ttatgctcaa gtatgatgaa aatggtacaa 900 tcacagatgc
tgttgattgt tctcaaaatc cacttgctga actcaaatgc tctgttaaga 960
gctttgagat tgacaaagga atttaccaga cctctaattt cagggttgtt ccctcaggag
1020 atgttgtgag attccctaat attacaaact tgtgtccttt tggagaggtt
tttaatgcta 1080 ctaaattccc ttctgtctat gcatgggaga gaaaaaaaat
ttctaattgt gttgctgatt 1140 actctgtgct ctacaactca acattttttt
caacctttaa gtgctatggc gtttctgcca 1200 ctaagttgaa tgatctttgc
ttctccaatg tctatgcaga ttcttttgta gtcaagggag 1260 atgatgtaag
acaaatagcg ccaggacaaa ctggtgttat tgctgattat aattataaat 1320
tgccagatga tttcatgggt tgtgtccttg cttggaatac taggaacatt gatgctactt
1380 caactggtaa ttataattat aaatataggt atcttagaca tggcaagctt
aggccctttg 1440 agagagacat atctaatgtg cctttctccc ctgatggcaa
accttgcacc ccacctgctc 1500 ttaattgtta ttggccatta aatgattatg
gtttttacac cactactggc attggctacc 1560 aaccttacag agttgtagta
ctttcttttg aacttttaaa tgcaccggcc acggtttgtg 1620 gaccaaaatt
atccactgac cttattaaga accagtgtgt caattttaat tttaatggac 1680
tcactggtac tggtgtgtta actccttctt caaagagatt tcaaccattt caacaatttg
1740 gccgtgatgt ctctgatttc actgattccg ttcgagatcc taaaacatct
gaaatattag 1800 acatttcacc ttgctctttt gggggtgtaa gtgtaattac
acctggaaca aatgcttcat 1860 ctgaagttgc tgttctatat caagatgtta
actgcactga tgtttctaca gcaatccatg 1920 cagatcaact cacaccagct
tggcgcatat attctactgg aaacaatgta ttccagactc 1980 aagcaggctg
tcttatagga gctgagcatg tcgacacttc ttatgagtgc gacattccta 2040
ttggagctgg catttgtgct agttaccata cagtttcttt attacgtagt actagccaaa
2100 aatctattgt ggcttatact atgtctttag gtgctgatag ttcaattgct
tactctaata 2160 acaccattgc tatacctact aacttttcaa ttagcattac
tacagaagta atgcctgttt 2220 ctatggctaa aacctccgta gattgtaata
tgtacatctg cggagattct actgaatgtg 2280 ctaatttgct tctccaatat
ggtagctttt gcacacaact aaatcgtgca ctctcaggta 2340 ttgctgctga
acaggatcgc aacacacgtg aagtgttcgc tcaagtcaaa caaatgtaca 2400
aaaccccaac tttgaaatat tttggtggtt ttaatttttc acaaatatta cctgaccctc
2460 taaagccaac taagaggtct tttattgagg acttgctctt taataaggtg
acactcgctg 2520 atgctggctt catgaagcaa tatggcgaat gcctaggtga
tattaatgct agagatctca 2580 tttgtgcgca gaagttcaat gggcttacag
tgttgccacc tctgctcact gatgatatga 2640 ttgctgccta cactgctgct
ctagttagtg gtactgccac tgctggatgg acatttggtg 2700 ctggcgctgc
tcttcaaata ccttttgcta tgcaaatggc atataggttc aatggcattg 2760
gagttaccca aaatgttctc tatgagaacc aaaaacaaat cgccaaccaa tttaacaagg
2820 cgattagtca aattcaagaa tcacttacaa caacatcaac tgcattgggc
aagctgcaag 2880 acgttgttaa ccagaatgct caagcattaa acacacttgt
taaacaactt agctctaatt 2940 ttggtgcaat ttcaagtgtg ctaaatgata
tcctttcgcg acttgataaa gtcgaggcgg 3000 aggtacaaat tgacaggcta
attacaggca gacttcaaag ccttcaaacc tatgtaacac 3060 aacaactaat
cagggctgct gaaatcaggg cttctgctaa tcttgctgct actaaaatgt 3120
ctgagtgtgt tcttggacaa tcaaaaagag ttgacttttg tggaaagggc taccacctta
3180 tgtccttccc acaagcagcc ccgcatggtg ttgtcttcct acatgtcacg
tatgtgccat 3240 cccaggagag gaacttcacc acagcgccag caatttgtca
tgaaggcaaa gcatacttcc 3300 ctcgtgaagg tgtttttgtg tttaatggca
cttcttggtt tattacacag aggaacttct 3360 tttctccaca aataattact
acagacaata catttgtctc aggaaattgt gatgtcgtta 3420 ttggcatcat
taacaacaca gtttatgatc ctctgcaacc tgagcttgac tcattcaaag 3480
aagagctgga caagtacttc aaaaatcata catcaccaga tgttgatctt ggcgacattt
3540 caggcattaa cgcttctgtc gtcaacattc aaaaagaaat tgaccgcctc
aatgaggtcg 3600 ctaaaaattt aaatgaatca ctcattgacc ttcaagaatt
gggaaaatat gagcaatata 3660 ttaaatggcc ttggtatgtt tggctcggct
tcattgctgg actaattgcc atcgtcatgg 3720 ttacaatctt gctttgttgc
atgactagtt gttgcagttg cctcaagggt gcatgctctt 3780 gtggttcttg
ctgcaagttt gatgaggatg actctgagcc agttctcaag ggtgtcaaat 3840
tacattacac ataaacgaac ttatggattt gtttatgaga ttttttactc ttggatcaat
3900 tactgcacag ccagtaaaaa ttgacaatgc ttctcctgca agt 3943
5 2049 DNA CORONAVIRUS 5
ctcttctgga aaaaggtagg cttatcatta gagaaaacaa
cagagttgtg gtttcaagtg 60 atattcttgt taacaactaa acgaacatgt
ttattttctt attatttctt actctcacta 120 gtggtagtga ccttgaccgg
tgcaccactt ttgatgatgt tcaagctcct aattacactc 180 aacatacttc
atctatgagg ggggtttact atcctgatga aatttttaga tcagacactc 240
tttatttaac tcaggattta tttcttccat tttattctaa tgttacaggg tttcatacta
300 ttaatcatac gtttggcaac cctgtcatac cttttaagga tggtatttat
tttgctgcca 360 cagagaaatc aaatgttgtc cgtggttggg tttttggttc
taccatgaac aacaagtcac 420 agtcggtgat tattattaac aattctacta
atgttgttat acgagcatgt aactttgaat 480 tgtgtgacaa ccctttcttt
gctgtttcta aacccatggg tacacagaca catactatga 540 tattcgataa
tgcatttaat tgcactttcg agtacatatc tgatgccttt tcgcttgatg 600
tttcagaaaa gtcaggtaat tttaaacact tacgagagtt tgtgtttaaa aataaagatg
660 ggtttctcta tgtttataag ggctatcaac ctatagatgt agttcgtgat
ctaccttctg 720 gttttaacac tttgaaacct atttttaagt tgcctcttgg
tattaacatt acaaatttta 780 gagccattct tacagccttt tcacctgctc
aagacatttg gggcacgtca gctgcagcct 840 attttgttgg ctatttaaag
ccaactacat ttatgctcaa gtatgatgaa aatggtacaa 900 tcacagatgc
tgttgattgt tctcaaaatc cacttgctga actcaaatgc tctgttaaga 960
gctttgagat tgacaaagga atttaccaga cctctaattt cagggttgtt ccctcaggag
1020 atgttgtgag attccctaat attacaaact tgtgtccttt tggagaggtt
tttaatgcta 1080 ctaaattccc ttctgtctat gcatgggaga gaaaaaaaat
ttctaattgt gttgctgatt 1140 actctgtgct ctacaactca acattttttt
caacctttaa gtgctatggc gtttctgcca 1200 ctaagttgaa tgatctttgc
ttctccaatg tctatgcaga ttcttttgta gtcaagggag 1260 atgatgtaag
acaaatagcg ccaggacaaa ctggtgttat tgctgattat aattataaat 1320
tgccagatga tttcatgggt tgtgtccttg cttggaatac taggaacatt gatgctactt
1380 caactggtaa ttataattat aaatataggt atcttagaca tggcaagctt
aggccctttg 1440 agagagacat atctaatgtg cctttctccc ctgatggcaa
accttgcacc ccacctgctc 1500 ttaattgtta ttggccatta aatgattatg
gtttttacac cactactggc attggctacc 1560 aaccttacag agttgtagta
ctttcttttg aacttttaaa tgcaccggcc acggtttgtg 1620 gaccaaaatt
atccactgac cttattaaga accagtgtgt caattttaat tttaatggac 1680
tcactggtac tggtgtgtta actccttctt caaagagatt tcaaccattt caacaatttg
1740 gccgtgatgt ctctgatttc actgattccg ttcgagatcc taaaacatct
gaaatattag 1800 acatttcacc ttgctctttt gggggtgtaa gtgtaattac
acctggaaca aatgcttcat 1860 ctgaagttgc tgttctatat caagatgtta
actgcactga tgtttctaca gcaatccatg 1920 cagatcaact cacaccagct
tggcgcatat attctactgg aaacaatgta ttccagactc 1980 aagcaggctg
tcttatagga gctgagcatg tcgacacttc ttatgagtgc gacattccta 2040
ttggagctg 2049
6 2027 DNA CORONAVIRUS 6
catgcagatc aactcacacc
agcttggcgc atatattcta ctggaaacaa tgtattccag 60 actcaagcag
gctgtcttat aggagctgag catgtcgaca cttcttatga gtgcgacatt 120
cctattggag ctggcatttg tgctagttac catacagttt ctttattacg tagtactagc
180 caaaaatcta ttgtggctta tactatgtct ttaggtgctg atagttcaat
tgcttactct 240 aataacacca ttgctatacc tactaacttt tcaattagca
ttactacaga agtaatgcct 300 gtttctatgg ctaaaacctc cgtagattgt
aatatgtaca tctgcggaga ttctactgaa 360 tgtgctaatt tgcttctcca
atatggtagc ttttgcacac aactaaatcg tgcactctca 420 ggtattgctg
ctgaacagga tcgcaacaca cgtgaagtgt tcgctcaagt caaacaaatg 480
tacaaaaccc caactttgaa atattttggt ggttttaatt tttcacaaat attacctgac
540 cctctaaagc caactaagag gtcttttatt gaggacttgc tctttaataa
ggtgacactc 600 gctgatgctg gcttcatgaa gcaatatggc gaatgcctag
gtgatattaa tgctagagat 660 ctcatttgtg cgcagaagtt caatgggctt
acagtgttgc cacctctgct cactgatgat 720 atgattgctg cctacactgc
tgctctagtt agtggtactg ccactgctgg atggacattt 780 ggtgctggcg
ctgctcttca aatacctttt gctatgcaaa tggcatatag gttcaatggc 840
attggagtta cccaaaatgt tctctatgag aaccaaaaac aaatcgccaa ccaatttaac
900 aaggcgatta gtcaaattca agaatcactt acaacaacat caactgcatt
gggcaagctg 960 caagacgttg ttaaccagaa tgctcaagca ttaaacacac
ttgttaaaca acttagctct 1020 aattttggtg caatttcaag tgtgctaaat
gatatccttt cgcgacttga taaagtcgag 1080 gcggaggtac aaattgacag
gttaattaca ggcagacttc aaagccttca aacctatgta 1140 acacaacaac
taatcagggc tgctgaaatc agggcttctg ctaatcttgc tgctactaaa 1200
atgtctgagt gtgttcttgg acaatcaaaa agagttgact tttgtggaaa gggctaccac
1260 cttatgtcct tcccacaagc agccccgcat ggtgttgtct tcctacatgt
cacgtatgtg 1320 ccatcccagg agaggaactt caccacagcg ccagcaattt
gtcatgaagg caaagcatac 1380 ttccctcgtg aaggtgtttt tgtgtttaat
ggcacttctt ggtttattac acagaggaac 1440 ttcttttctc cacaaataat
tactacagac aatacatttg tctcaggaaa ttgtgatgtc 1500 gttattggcg
tcattaacaa cacagtttat gatcctctgc aacctgagct tgactcattc 1560
aaagaagagc tggacaagta cttcaaaaat catacatcac cagatgttga tcttggcgac
1620 atttcaggca ttaacgcttc tgtcgtcaac attcaaaaag aaattgaccg
cctcaatgag 1680 gtcgctaaaa atttaaatga atcactcatt gaccttcaag
aattgggaaa atatgagcaa 1740 tatattaaat ggccttggta tgtttggctc
ggcttcattg ctggactaat tgccatcgtc 1800 atggttacaa tcttgctttg
ttgcatgact agttgttgca gttgcctcaa gggtgcatgc 1860 tcttgtggtt
cttgctgcaa gtttgatgag gatgactctg agccagttct caagggtgtc 1920
aaattacatt acacataaac gaacttatgg atttgtttat gagatttttt actcttggat
1980 caattactgc acagccagta aaaattgaca atgcttctcc tgcaagt 2027
7 1096 DNA CORONAVIRUS 7
tcttgctttg ttgcatgact agttgttgca gttgcctcaa
gggtgcatgc tcttgtggtt 60 cttgctgcaa gtttgatgag gatgactctg
agccagttct caagggtgtc aaattacatt 120 acacataaac gaacttatgg
atttgtttat gagatttttt actcttggat caattactgc 180 acagccagta
aaaattgaca atgcttctcc tgcaagtact gttcatgcta cagcaacgat 240
accgctacaa gcctcactcc ctttcggatg gcttgttatt ggcgttgcat ttcttgctgt
300 ttttcagagc gctaccaaaa taattgcgct caataaaaga tggcagctag
ccctttataa 360 gggcttccag ttcatttgca atttactgct gctatttgtt
accatctatt cacatctttt 420 gcttgtcgct gcaggtatgg aggcgcaatt
tttgtacctc tatgccttga tatattttct 480 acaatgcatc aacgcatgta
gaattattat gagatgttgg ctttgttgga agtgcaaatc 540 caagaaccca
ttactttatg atgccaacta ctttgtttgc tggcacacac ataactatga 600
ctactgtata ccatataaca gtgtcacaga tacaattgtc gttactgaag gtgacggcat
660 ttcaacacca aaactcaaag aagactacca aattggtggt tattctgagg
ataggcactc 720 aggtgttaaa gactatgtcg ttgtacatgg ctatttcacc
gaagtttact accagcttga 780 gtctacacaa attactacag acactggtat
tgaaaatgct acattcttca tctttaacaa 840 gcttgttaaa gacccaccga
atgtgcaaat acacacaatc gacggctctt caggagttgc 900 taatccagca
atggatccaa tttatgatga gccgacgacg actactagcg tgcctttgta 960
agcacaagaa agtgagtacg aacttatgta ctcattcgtt tcggaagaaa caggtacgtt
1020 aatagttaat agcgtacttc tttttcttgc tttcgtggta ttcttgctag
tcacactagc 1080 catccttact gcgctt 1096 8 1135 DNA CORONAVIRUS 8
attgccatcg tcatggttac aatcttgctt tgttgcatga ctagttgttg cagttgcctc
60 aagggtgcat gctcttgtgg ttcttgctgc aagtttgatg aggatgactc
tgagccagtt 120 ctcaagggtg tcaaattaca ttacacataa acgaacttat
ggatttgttt atgagatttt 180 ttactcttgg atcaattact gcacagccag
taaaaattga caatgcttct cctgcaagta 240 ctgttcatgc tacagcaacg
ataccgctac aagcctcact ccctttcgga tggcttgtta 300 ttggcgttgc
atttcttgct gtttttcaga gcgctaccaa aataattgcg ctcaataaaa 360
gatggcagct agccctttat aagggcttcc agttcatttg caatttactg ctgctatttg
420 ttaccatcta ttcacatctt ttgcttgtcg ctgcaggtat ggaggcgcaa
tttttgtacc 480 tctatgcctt gatatatttt ctacaatgca tcaacgcatg
tagaattatt atgagatgtt 540 ggctttgttg gaagtgcaaa tccaagaacc
cattacttta tgatgccaac tactttgttt 600 gctggcacac acataactat
gactactgta taccatataa cagtgtcaca gatacaattg 660 tcgttactga
aggtgacggc atttcaacac caaaactcaa agaagactac caaattggtg 720
gttattctga ggataggcac tcaggtgtta aagactatgt cgttgtacat ggctatttca
780 ccgaagttta ctaccagctt gagtctacac aaattactac agacactggt
attgaaaatg 840 ctacattctt catctttaac aagcttgtta aagacccacc
gaatgtgcaa atacacacaa 900 tcgacggctc ttcaggagtt gctaatccag
caatggatcc aatttatgat gagccgacga 960 cgactactag cgtgcctttg
taagcacaag aaagtgagta cgaacttatg tactcattcg 1020 tttcggaaga
aacaggtacg ttaatagtta atagcgtact tctttttctt gctttcgtgg 1080
tattcttgct agtcacacta gccatcctta ctgcgcttcg attgtgtgcg tactg 1135
9 1096 DNA CORONAVIRUS CDS (137)..(958) 9
tcttgctttg ttgcatgact
agttgttgca gttgcctcaa gggtgcatgc tcttgtggtt 60 cttgctgcaa
gtttgatgag gatgactctg agccagttct caagggtgtc aaattacatt 120
acacataaac gaactt atg gat ttg ttt atg aga ttt ttt act ctt gga tca
172 Met Asp Leu Phe Met Arg Phe Phe Thr Leu Gly Ser 1 5 10 att act
gca cag cca gta aaa att gac aat gct tct cct gca agt act 220 Ile Thr
Ala Gln Pro Val Lys Ile Asp Asn Ala Ser Pro Ala Ser Thr 15 20 25
gtt cat gct aca gca acg ata ccg cta caa gcc tca ctc cct ttc gga 268
Val His Ala Thr Ala Thr Ile Pro Leu Gln Ala Ser Leu Pro Phe Gly 30
35 40 tgg ctt gtt att ggc gtt gca ttt ctt gct gtt ttt cag agc gct
acc 316 Trp Leu Val Ile Gly Val Ala Phe Leu Ala Val Phe Gln Ser Ala
Thr 45 50 55 60 aaa ata att gcg ctc aat aaa aga tgg cag cta gcc ctt
tat aag ggc 364 Lys Ile Ile Ala Leu Asn Lys Arg Trp Gln Leu Ala Leu
Tyr Lys Gly 65 70 75 ttc cag ttc att tgc aat tta ctg ctg cta ttt
gtt acc atc tat tca 412 Phe Gln Phe Ile Cys Asn Leu Leu Leu Leu Phe
Val Thr Ile Tyr Ser 80 85 90 cat ctt ttg ctt gtc gct gca ggt atg
gag gcg caa ttt ttg tac ctc 460 His Leu Leu Leu Val Ala Ala Gly Met
Glu Ala Gln Phe Leu Tyr Leu 95 100 105 tat gcc ttg ata tat ttt cta
caa tgc atc aac gca tgt aga att att 508 Tyr Ala Leu Ile Tyr Phe Leu
Gln Cys Ile Asn Ala Cys Arg Ile Ile 110 115 120 atg aga tgt tgg ctt
tgt tgg aag tgc aaa tcc aag aac cca tta ctt 556 Met Arg Cys Trp Leu
Cys Trp Lys Cys Lys Ser Lys Asn Pro Leu Leu 125 130 135 140 tat gat
gcc aac tac ttt gtt tgc tgg cac aca cat aac tat gac tac 604 Tyr Asp
Ala Asn Tyr Phe Val Cys Trp His Thr His Asn Tyr Asp Tyr 145 150 155
tgt ata cca tat aac agt gtc aca gat aca att gtc gtt act gaa ggt 652
Cys Ile Pro Tyr Asn Ser Val Thr Asp Thr Ile Val Val Thr Glu Gly 160
165 170 gac ggc att tca aca cca aaa ctc aaa gaa gac tac caa att ggt
ggt 700 Asp Gly Ile Ser Thr Pro Lys Leu Lys Glu Asp Tyr Gln Ile Gly
Gly 175 180 185 tat tct gag gat agg cac tca ggt gtt aaa gac tat gtc
gtt gta cat 748 Tyr Ser Glu Asp Arg His Ser Gly Val Lys Asp Tyr Val
Val Val His 190 195 200 ggc tat ttc acc gaa gtt tac tac cag ctt gag
tct aca caa att act 796 Gly Tyr Phe Thr Glu Val Tyr Tyr Gln Leu Glu
Ser Thr Gln Ile Thr 205 210 215 220 aca gac act ggt att gaa aat gct
aca ttc ttc atc ttt aac aag ctt 844 Thr Asp Thr Gly Ile Glu Asn Ala
Thr Phe Phe Ile Phe Asn Lys Leu 225 230 235 gtt aaa gac cca ccg aat
gtg caa ata cac aca atc gac ggc tct tca 892 Val Lys Asp Pro Pro Asn
Val Gln Ile His Thr Ile Asp Gly Ser Ser 240 245 250 gga gtt gct aat
cca gca atg gat cca att tat gat gag ccg acg acg 940 Gly Val Ala Asn
Pro Ala Met Asp Pro Ile Tyr Asp Glu Pro Thr Thr 255 260 265 act act
agc gtg cct ttg taagcacaag aaagtgagta cgaacttatg 988 Thr Thr Ser
Val Pro Leu 270 tactcattcg tttcggaaga aacaggtacg ttaatagtta
atagcgtact tctttttctt 1048 gctttcgtgg tattcttgct agtcacacta
gccatcctta ctgcgctt 1096
10 274 PRT CORONAVIRUS 10
Met Asp Leu Phe
Met Arg Phe Phe Thr Leu Gly Ser Ile Thr Ala Gln 1 5 10 15 Pro Val
Lys Ile Asp Asn Ala Ser Pro Ala Ser Thr Val His Ala Thr 20 25 30
Ala Thr Ile Pro Leu Gln Ala Ser Leu Pro Phe Gly Trp Leu Val Ile 35
40 45 Gly Val Ala Phe Leu Ala Val Phe Gln Ser Ala Thr Lys Ile Ile
Ala 50 55 60 Leu Asn Lys Arg Trp Gln Leu Ala Leu Tyr Lys Gly Phe
Gln Phe Ile 65 70 75 80 Cys Asn Leu Leu Leu Leu Phe Val Thr Ile Tyr
Ser His Leu Leu Leu 85 90 95 Val Ala Ala Gly Met Glu Ala Gln Phe
Leu Tyr Leu Tyr Ala Leu Ile 100 105 110 Tyr Phe Leu Gln Cys Ile Asn
Ala Cys Arg Ile Ile Met Arg Cys Trp 115 120 125 Leu Cys Trp Lys Cys
Lys Ser Lys Asn Pro Leu Leu Tyr Asp Ala Asn 130 135 140 Tyr Phe Val
Cys Trp His Thr His Asn Tyr Asp Tyr Cys Ile Pro Tyr 145 150 155 160
Asn Ser Val Thr Asp Thr Ile Val Val Thr Glu Gly Asp Gly Ile Ser 165
170 175 Thr Pro Lys Leu Lys Glu Asp Tyr Gln Ile Gly Gly Tyr Ser Glu
Asp 180 185 190 Arg His Ser Gly Val Lys Asp Tyr Val Val Val His Gly
Tyr Phe Thr 195 200 205 Glu Val Tyr Tyr Gln Leu Glu Ser Thr Gln Ile
Thr Thr Asp Thr Gly 210 215 220 Ile Glu Asn Ala Thr Phe Phe Ile Phe
Asn Lys Leu Val Lys Asp Pro 225 230 235 240 Pro Asn Val Gln Ile His
Thr Ile Asp Gly Ser Ser Gly Val Ala Asn 245 250 255 Pro Ala Met Asp
Pro Ile Tyr Asp Glu Pro Thr Thr Thr Thr Ser Val 260 265 270 Pro Leu
11 1096 DNA CORONAVIRUS CDS (558)..(1019) 11
tcttgctttg ttgcatgact
agttgttgca gttgcctcaa gggtgcatgc tcttgtggtt 60 cttgctgcaa
gtttgatgag gatgactctg agccagttct caagggtgtc aaattacatt 120
acacataaac gaacttatgg atttgtttat gagatttttt actcttggat caattactgc
180 acagccagta aaaattgaca atgcttctcc tgcaagtact gttcatgcta
cagcaacgat 240 accgctacaa gcctcactcc ctttcggatg gcttgttatt
ggcgttgcat ttcttgctgt 300 ttttcagagc gctaccaaaa taattgcgct
caataaaaga tggcagctag ccctttataa 360 gggcttccag ttcatttgca
atttactgct gctatttgtt accatctatt cacatctttt 420 gcttgtcgct
gcaggtatgg aggcgcaatt tttgtacctc tatgccttga tatattttct 480
acaatgcatc aacgcatgta gaattattat gagatgttgg ctttgttgga agtgcaaatc
540 caagaaccca ttacttt atg atg cca act act ttg ttt gct ggc aca cac
590 Met Met Pro Thr Thr Leu Phe Ala Gly Thr His 1 5 10 ata act atg
act act gta tac cat ata aca gtg tca cag ata caa ttg 638 Ile Thr Met
Thr Thr Val Tyr His Ile Thr Val Ser Gln Ile Gln Leu 15 20 25 tcg
tta ctg aag gtg acg gca ttt caa cac caa aac tca aag aag act 686 Ser
Leu Leu Lys Val Thr Ala Phe Gln His Gln Asn Ser Lys Lys Thr 30 35
40 acc aaa ttg gtg gtt att ctg agg ata ggc act cag gtg tta aag act
734 Thr Lys Leu Val Val Ile Leu Arg Ile Gly Thr Gln Val Leu Lys Thr
45 50 55 atg tcg ttg tac atg gct att tca ccg aag ttt act acc agc
ttg agt 782 Met Ser Leu Tyr Met Ala Ile Ser Pro Lys Phe Thr Thr Ser
Leu Ser 60 65 70 75 cta cac aaa tta cta cag aca ctg gta ttg aaa atg
cta cat tct tca 830 Leu His Lys Leu Leu Gln Thr Leu Val Leu Lys Met
Leu His Ser Ser 80 85 90 tct tta aca agc ttg tta aag acc cac cga
atg tgc aaa tac aca caa 878 Ser Leu Thr Ser Leu Leu Lys Thr His Arg
Met Cys Lys Tyr Thr Gln 95 100 105 tcg acg gct ctt cag gag ttg cta
atc cag caa tgg atc caa ttt atg 926 Ser Thr Ala Leu Gln Glu Leu Leu
Ile Gln Gln Trp Ile Gln Phe Met 110 115 120 atg agc cga cga cga cta
cta gcg tgc ctt tgt aag cac aag aaa gtg 974 Met Ser Arg Arg Arg Leu
Leu Ala Cys Leu Cys Lys His Lys Lys Val 125 130 135 agt acg aac tta
tgt act cat tcg ttt cgg aag aaa cag gta cgt 1019 Ser Thr Asn Leu
Cys Thr His Ser Phe Arg Lys Lys Gln Val Arg 140 145 150 taatagttaa
tagcgtactt ctttttcttg ctttcgtggt attcttgcta gtcacactag 1079
ccatccttac tgcgctt 1096
12 154 PRT CORONAVIRUS 12
Met Met Pro Thr
Thr Leu Phe Ala Gly Thr His Ile Thr Met Thr Thr 1 5 10 15 Val Tyr
His Ile Thr Val Ser Gln Ile Gln Leu Ser Leu Leu Lys Val 20 25 30
Thr Ala Phe Gln His Gln Asn Ser Lys Lys Thr Thr Lys Leu Val Val 35
40 45 Ile Leu Arg Ile Gly Thr Gln Val Leu Lys Thr Met Ser Leu Tyr
Met 50 55 60 Ala Ile Ser Pro Lys Phe Thr Thr Ser Leu Ser Leu His
Lys Leu Leu 65 70 75 80 Gln Thr Leu Val Leu Lys Met Leu His Ser Ser
Ser Leu Thr Ser Leu 85 90 95 Leu Lys Thr His Arg Met Cys Lys Tyr
Thr Gln Ser Thr Ala Leu Gln 100 105 110 Glu Leu Leu Ile Gln Gln Trp
Ile Gln Phe Met Met Ser Arg Arg Arg 115 120 125 Leu Leu Ala Cys Leu
Cys Lys His Lys Lys Val Ser Thr Asn Leu Cys 130 135 140 Thr His Ser
Phe Arg Lys Lys Gln Val Arg 145 150
13 332 DNA CORONAVIRUS CDS (36)..(263) 13
tgcctttgta agcacaagaa agtgagtacg aactt atg tac tca
ttc gtt tcg 53 Met Tyr Ser Phe Val Ser 1 5 gaa gaa aca ggt acg tta
ata gtt aat agc gta ctt ctt ttt ctt gct 101 Glu Glu Thr Gly Thr Leu
Ile Val Asn Ser Val Leu Leu Phe Leu Ala 10 15 20 ttc gtg gta ttc
ttg cta gtc aca cta gcc atc ctt act gcg ctt cga 149 Phe Val Val Phe
Leu Leu Val Thr Leu Ala Ile Leu Thr Ala Leu Arg 25 30 35 ttg tgt
gcg tac tgc tgc aat att gtt aac gtg agt tta gta aaa cca 197 Leu Cys
Ala Tyr Cys Cys Asn Ile Val Asn Val Ser Leu Val Lys Pro 40 45 50
acg gtt tac gtc tac tcg cgt gtt aaa aat ctg aac tct tct gaa gga 245
Thr Val Tyr Val Tyr Ser Arg Val Lys Asn Leu Asn Ser Ser Glu Gly 55
60 65 70 gtt cct gat ctt ctg gtc taaacgaact aactattatt attattctgt
293 Val Pro Asp Leu Leu Val 75 ttggaacttt aacattgctt atcatggcag
acaacggta 332
14 76 PRT CORONAVIRUS 14
Met Tyr Ser Phe Val Ser Glu
Glu Thr Gly Thr Leu Ile Val Asn Ser 1 5 10 15 Val Leu Leu Phe Leu
Ala Phe Val Val Phe Leu Leu Val Thr Leu Ala 20 25 30 Ile Leu Thr
Ala Leu Arg Leu Cys Ala Tyr Cys Cys Asn Ile Val Asn 35 40 45 Val
Ser Leu Val Lys Pro Thr Val Tyr Val Tyr Ser Arg Val Lys Asn 50 55
60 Leu Asn Ser Ser Glu Gly Val Pro Asp Leu Leu Val 65 70 75
15 332 DNA CORONAVIRUS 15
tgcctttgta agcacaagaa agtgagtacg aacttatgta
ctcattcgtt tcggaagaaa 60 caggtacgtt aatagttaat agcgtacttc
tttttcttgc tttcgtggta ttcttgctag 120 tcacactagc catccttact
gcgcttcgat tgtgtgcgta ctgctgcaat attgttaacg 180 tgagtttagt
aaaaccaacg gtttacgtct actcgcgtgt taaaaatctg aactcttctg 240
aaggagttcc tgatcttctg gtctaaacga actaactatt attattattc tgtttggaac
300 tttaacattg cttatcatgg cagacaacgg ta 332
16 708 DNA CORONAVIRUS CDS (41)..(703) 16
tattattatt attctgtttg gaactttaac attgcttatc atg
gca gac aac ggt 55 Met Ala Asp Asn Gly 1 5 act att acc gtt gag gag
ctt aaa caa ctc ctg gaa caa tgg aac cta 103 Thr Ile Thr Val Glu Glu
Leu Lys Gln Leu Leu Glu Gln Trp Asn Leu 10 15 20 gta ata ggt ttc
cta ttc cta gcc tgg att atg tta cta caa ttt gcc 151 Val Ile Gly Phe
Leu Phe Leu Ala Trp Ile Met Leu Leu Gln Phe Ala 25 30 35 tat tct
aat cgg aac agg ttt ttg tac ata ata aag ctt gtt ttc ctc 199 Tyr Ser
Asn Arg Asn Arg Phe Leu Tyr Ile Ile Lys Leu Val Phe Leu 40 45 50
tgg ctc ttg tgg cca gta aca ctt gct tgt ttt gtg ctt gct gct gtc 247
Trp Leu Leu Trp Pro Val Thr Leu Ala Cys Phe Val Leu Ala Ala Val 55
60 65 tac aga att aat tgg gtg act ggc ggg att gcg att gca atg gct
tgt 295 Tyr Arg Ile Asn Trp Val Thr Gly Gly Ile Ala Ile Ala Met Ala
Cys 70 75 80 85 att gta ggc ttg atg tgg ctt agc tac ttc gtt gct tcc
ttc agg ctg 343 Ile Val Gly Leu Met Trp Leu Ser Tyr Phe Val Ala Ser
Phe Arg Leu 90 95 100 ttt gct cgt acc cgc tca atg tgg tca ttc aac
cca gaa aca aac att 391 Phe Ala Arg Thr Arg Ser Met Trp Ser Phe Asn
Pro Glu Thr Asn Ile 105 110 115 ctt ctc aat gtg cct ctc cgg ggg aca
att gtg acc aga ccg ctc atg 439 Leu Leu Asn Val Pro Leu Arg Gly Thr
Ile Val Thr Arg Pro Leu Met 120 125 130 gaa agt gaa ctt gtc att ggt
gct gtg atc att cgt ggt cac ttg cga 487 Glu Ser Glu Leu Val Ile Gly
Ala Val Ile Ile Arg Gly His Leu Arg 135 140 145 atg gcc gga cac tcc
cta ggg cgc tgt gac att aag gac ctg cca aaa 535 Met Ala Gly His Ser
Leu Gly Arg Cys Asp Ile Lys Asp Leu Pro Lys 150 155 160 165 gag atc
act gtg gct aca tca cga acg ctt tct tat tac aaa tta gga 583 Glu Ile
Thr Val Ala Thr Ser Arg Thr Leu Ser Tyr Tyr Lys Leu Gly 170 175 180
gcg tcg cag cgt gta ggc act gat tca ggt ttt gct gca tac aac cgc 631
Ala Ser Gln Arg Val Gly Thr Asp Ser Gly Phe Ala Ala Tyr Asn Arg 185
190 195 tac cgt att gga aac tat aaa tta aat aca gac cac gcc ggt agc
aac 679 Tyr Arg Ile Gly Asn Tyr Lys Leu Asn Thr Asp His Ala Gly Ser
Asn 200 205 210 gac aat att gct ttg cta gta cag taagt 708 Asp Asn
Ile Ala Leu Leu Val Gln 215 220 17 221 PRT CORONAVIRUS 17 Met Ala
Asp Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Gln Leu Leu 1 5 10 15
Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Ala Trp Ile Met 20
25 30 Leu Leu Gln Phe Ala Tyr Ser Asn Arg Asn Arg Phe Leu Tyr Ile
Ile 35 40 45 Lys Leu Val Phe Leu Trp Leu Leu Trp Pro Val Thr Leu
Ala Cys Phe 50 55 60 Val Leu Ala Ala Val Tyr Arg Ile Asn Trp Val
Thr Gly Gly Ile Ala 65 70 75 80 Ile Ala Met Ala Cys Ile Val Gly Leu
Met Trp Leu Ser Tyr Phe Val 85 90 95 Ala Ser Phe Arg Leu Phe Ala
Arg Thr Arg Ser Met Trp Ser Phe Asn 100 105 110 Pro Glu Thr Asn Ile
Leu Leu Asn Val Pro Leu Arg Gly Thr Ile Val 115 120 125 Thr Arg Pro
Leu Met Glu Ser Glu Leu Val Ile Gly Ala Val Ile Ile 130 135 140 Arg
Gly His Leu Arg Met Ala Gly His Ser Leu Gly Arg Cys Asp Ile 145 150
155 160 Lys Asp Leu Pro Lys Glu Ile Thr Val Ala Thr Ser Arg Thr Leu
Ser 165 170 175 Tyr Tyr Lys Leu Gly Ala Ser Gln Arg Val Gly Thr Asp
Ser Gly Phe 180 185 190 Ala Ala Tyr Asn Arg Tyr Arg Ile Gly Asn Tyr
Lys Leu Asn Thr Asp 195 200 205 His Ala Gly Ser Asn Asp Asn Ile Ala
Leu Leu Val Gln 210 215 220 18 769 DNA CORONAVIRUS 18 cctgatcttc
tggtctaaac gaactaacta ttattattat tctgtttgga actttaacat 60
tgcttatcat ggcagacaac ggtactatta ccgttgagga gcttaaacaa ctcctggaac
120 aatggaacct agtaataggt ttcctattcc tagcctggat tatgttacta
caatttgcct 180 attctaatcg gaacaggttt ttgtacataa taaagcttgt
tttcctctgg ctcttgtggc 240 cagtaacact tgcttgtttt gtgcttgctg
ctgtctacag aattaattgg gtgactggcg 300 ggattgcgat tgcaatggct
tgtattgtag gcttgatgtg gcttagctac ttcgttgctt 360 ccttcaggct
gtttgctcgt acccgctcaa tgtggtcatt caacccagaa acaaacattc 420
ttctcaatgt gcctctccgg gggacaattg tgaccagacc gctcatggaa agtgaacttg
480 tcattggtgc tgtgatcatt cgtggtcact tgcgaatggc cggacactcc
ctagggcgct 540 gtgacattaa ggacctgcca aaagagatca ctgtggctac
atcacgaacg ctttcttatt 600 acaaattagg agcgtcgcag cgtgtaggca
ctgattcagg ttttgctgca tacaaccgct 660 accgtattgg aaactataaa ttaaatacag
accacgccgg tagcaacgac aatattgctt 720
tgctagtaca gtaagtgaca acagatgttt catcttgttg acttccagg 769 19 1231
DNA CORONAVIRUS 19 taccgtattg gaaactataa attaaataca gaccacgccg
gtagcaacga caatattgct 60 ttgctagtac agtaagtgac aacagatgtt
tcatcttgtt gacttccagg ttacaatagc 120 agagatattg attatcatta
tgaggacttt caggattgct atttggaatc ttgacgttat 180 aataagttca
atagtgagac aattatttaa gcctctaact aagaagaatt attcggagtt 240
agatgatgaa gaacctatgg agttagatta tccataaaac gaacatgaaa attattctct
300 tcctgacatt gattgtattt acatcttgcg agctatatca ctatcaggag
tgtgttagag 360 gtacgactgt actactaaaa gaaccttgcc catcaggaac
atacgagggc aattcaccat 420 ttcaccctct tgctgacaat aaatttgcac
taacttgcac tagcacacac tttgcttttg 480 cttgtgctga cggtactcga
catacctatc agctgcgtgc aagatcagtt tcaccaaaac 540 ttttcatcag
acaagaggag gttcaacaag agctctactc gccacttttt ctcattgttg 600
ctgctctagt atttttaata ctttgcttca ccattaagag aaagacagaa tgaatgagct
660 cactttaatt gacttctatt tgtgcttttt agcctttctg ctattccttg
ttttaataat 720 gcttattata ttttggtttt cactcgaaat ccaggatcta
gaagaacctt gtaccaaagt 780 ctaaacgaac atgaaacttc tcattgtttt
gacttgtatt tctctatgca gttgcatatg 840 cactgtagta cagcgctgtg
catctaataa acctcatgtg cttgaagatc cttgtaaggt 900 acaacactag
gggtaatact tatagcactg cttggctttg tgctctagga aaggttttac 960
cttttcatag atggcacact atggttcaaa catgcacacc taatgttact atcaactgtc
1020 aagatccagc tggtggtgcg cttatagcta ggtgttggta ccttcatgaa
ggtcaccaaa 1080 ctgctgcatt tagagacgta cttgttgttt taaataaacg
aacaaattaa aatgtctgat 1140 aatggacccc aatcaaacca acgtagtgcc
ccccgcatta catttggtgg acccacagat 1200 tcaactgaca ataaccagaa
tggaggacgc a 1231 20 1242 DNA CORONAVIRUS 20 gcatacaacc gctaccgtat
tggaaactat aaattaaata cagaccacgc cggtagcaac 60 gacaatattg
ctttgctagt acagtaagtg acaacagatg tttcatcttg ttgacttcca 120
ggttacaata gcagagatat tgattatcat tatgaggact ttcaggattg ctatttggaa
180 tcttgacgtt ataataagtt caatagtgag acagttattt aagcctctaa
ctaagaagaa 240 ttattcggag ttagatgatg aagaacctat ggagttagat
tatccataaa acgaacatga 300 aaattattct cttcctgaca ttgattgtat
ttacatcttg cgagctatat cactatcagg 360 agtgtgttag aggtacgact
gtactactaa aagaaccttg cccatcagga acatacgagg 420 gcaattcacc
atttcaccct cttgctgaca ataaatttgc actaacttgc actagcacac 480
actttgcttt tgcttgtgct gacggtactc gacataccta tcagctgcgt gcaagatcag
540 tttcaccaaa acttttcatc agacaagagg aggttcaaca agagctctac
tcgccacttt 600 ttctcattgt tgctgctcta gtatttttaa tactttgctt
caccattaag agaaagacag 660 aatgaatgag ctcactttaa ttgacttcta
tttgtgcttt ttagcctttc tgctattcct 720 tgttttaata atgcttatta
tattttggtt ttcactcgaa atccaggatc tagaagaacc 780 ttgtaccaaa
gtctaaacga acatgaaact tctcattgtt ttgacttgta tttctctatg 840
cagttgcata tgcactgtag tacagcgctg tgcatctaat aaacctcatg tgcttgaaga
900 tccttgtaag gtacaacact aggggtaata cttatagcac tgcttggctt
tgtgctctag 960 gaaaggtttt accttttcat agatggcaca ctatggttca
aacatgcaca cctaatgtta 1020 ctatcaactg tcaagatcca gctggtggtg
cgcttatagc taggtgttgg taccttcatg 1080 aaggtcacca aactgctgca
tttagagacg tacttgttgt tttaaataaa cgaacgaatt 1140 aaaatgtctg
ataatggacc ccaatcaaac caacgtagtg ccccccgcat tacatttggt 1200
ggacccacag attcaactga caataaccag aatggaggac gc 1242 21 1231 DNA
CORONAVIRUS CDS (86)..(274) 21 taccgtattg gaaactataa attaaataca
gaccacgccg gtagcaacga caatattgct 60 ttgctagtac agtaagtgac aacag atg
ttt cat ctt gtt gac ttc cag gtt 112 Met Phe His Leu Val Asp Phe Gln
Val 1 5 aca ata gca gag ata ttg att atc att atg agg act ttc agg att
gct 160 Thr Ile Ala Glu Ile Leu Ile Ile Ile Met Arg Thr Phe Arg Ile
Ala 10 15 20 25 att tgg aat ctt gac gtt ata ata agt tca ata gtg aga
caa tta ttt 208 Ile Trp Asn Leu Asp Val Ile Ile Ser Ser Ile Val Arg
Gln Leu Phe 30 35 40 aag cct cta act aag aag aat tat tcg gag tta
gat gat gaa gaa cct 256 Lys Pro Leu Thr Lys Lys Asn Tyr Ser Glu Leu
Asp Asp Glu Glu Pro 45 50 55 atg gag tta gat tat cca taaaacgaac
atgaaaatta ttctcttcct 304 Met Glu Leu Asp Tyr Pro 60 gacattgatt
gtatttacat cttgcgagct atatcactat caggagtgtg ttagaggtac 364
gactgtacta ctaaaagaac cttgcccatc aggaacatac gagggcaatt caccatttca
424 ccctcttgct gacaataaat ttgcactaac ttgcactagc acacactttg
cttttgcttg 484 tgctgacggt actcgacata cctatcagct gcgtgcaaga
tcagtttcac caaaactttt 544 catcagacaa gaggaggttc aacaagagct
ctactcgcca ctttttctca ttgttgctgc 604 tctagtattt ttaatacttt
gcttcaccat taagagaaag acagaatgaa tgagctcact 664 ttaattgact
tctatttgtg ctttttagcc tttctgctat tccttgtttt aataatgctt 724
attatatttt ggttttcact cgaaatccag gatctagaag aaccttgtac caaagtctaa
784 acgaacatga aacttctcat tgttttgact tgtatttctc tatgcagttg
catatgcact 844 gtagtacagc gctgtgcatc taataaacct catgtgcttg
aagatccttg taaggtacaa 904 cactaggggt aatacttata gcactgcttg
gctttgtgct ctaggaaagg ttttaccttt 964 tcatagatgg cacactatgg
ttcaaacatg cacacctaat gttactatca actgtcaaga 1024 tccagctggt
ggtgcgctta tagctaggtg ttggtacctt catgaaggtc accaaactgc 1084
tgcatttaga gacgtacttg ttgttttaaa taaacgaaca aattaaaatg tctgataatg
1144 gaccccaatc aaaccaacgt agtgcccccc gcattacatt tggtggaccc
acagattcaa 1204 ctgacaataa ccagaatgga ggacgca 1231 22 63 PRT
CORONAVIRUS 22 Met Phe His Leu Val Asp Phe Gln Val Thr Ile Ala Glu
Ile Leu Ile 1 5 10 15 Ile Ile Met Arg Thr Phe Arg Ile Ala Ile Trp
Asn Leu Asp Val Ile 20 25 30 Ile Ser Ser Ile Val Arg Gln Leu Phe
Lys Pro Leu Thr Lys Lys Asn 35 40 45 Tyr Ser Glu Leu Asp Asp Glu
Glu Pro Met Glu Leu Asp Tyr Pro 50 55 60 23 1231 DNA CORONAVIRUS
CDS (285)..(650) 23 taccgtattg gaaactataa attaaataca gaccacgccg
gtagcaacga caatattgct 60 ttgctagtac agtaagtgac aacagatgtt
tcatcttgtt gacttccagg ttacaatagc 120 agagatattg attatcatta
tgaggacttt caggattgct atttggaatc ttgacgttat 180 aataagttca
atagtgagac aattatttaa gcctctaact aagaagaatt attcggagtt 240
agatgatgaa gaacctatgg agttagatta tccataaaac gaac atg aaa att att
296 Met Lys Ile Ile 1 ctc ttc ctg aca ttg att gta ttt aca tct tgc
gag cta tat cac tat 344 Leu Phe Leu Thr Leu Ile Val Phe Thr Ser Cys
Glu Leu Tyr His Tyr 5 10 15 20 cag gag tgt gtt aga ggt acg act gta
cta cta aaa gaa cct tgc cca 392 Gln Glu Cys Val Arg Gly Thr Thr Val
Leu Leu Lys Glu Pro Cys Pro 25 30 35 tca gga aca tac gag ggc aat
tca cca ttt cac cct ctt gct gac aat 440 Ser Gly Thr Tyr Glu Gly Asn
Ser Pro Phe His Pro Leu Ala Asp Asn 40 45 50 aaa ttt gca cta act
tgc act agc aca cac ttt gct ttt gct tgt gct 488 Lys Phe Ala Leu Thr
Cys Thr Ser Thr His Phe Ala Phe Ala Cys Ala 55 60 65 gac ggt act
cga cat acc tat cag ctg cgt gca aga tca gtt tca cca 536 Asp Gly Thr
Arg His Thr Tyr Gln Leu Arg Ala Arg Ser Val Ser Pro 70 75 80 aaa
ctt ttc atc aga caa gag gag gtt caa caa gag ctc tac tcg cca 584 Lys
Leu Phe Ile Arg Gln Glu Glu Val Gln Gln Glu Leu Tyr Ser Pro 85 90
95 100 ctt ttt ctc att gtt gct gct cta gta ttt tta ata ctt tgc ttc
acc 632 Leu Phe Leu Ile Val Ala Ala Leu Val Phe Leu Ile Leu Cys Phe
Thr 105 110 115 att aag aga aag aca gaa tgaatgagct cactttaatt
gacttctatt 680 Ile Lys Arg Lys Thr Glu 120 tgtgcttttt agcctttctg
ctattccttg ttttaataat gcttattata ttttggtttt 740 cactcgaaat
ccaggatcta gaagaacctt gtaccaaagt ctaaacgaac atgaaacttc 800
tcattgtttt gacttgtatt tctctatgca gttgcatatg cactgtagta cagcgctgtg
860 catctaataa acctcatgtg cttgaagatc cttgtaaggt acaacactag
gggtaatact 920 tatagcactg cttggctttg tgctctagga aaggttttac
cttttcatag atggcacact 980 atggttcaaa catgcacacc taatgttact
atcaactgtc aagatccagc tggtggtgcg 1040 cttatagcta ggtgttggta
ccttcatgaa ggtcaccaaa ctgctgcatt tagagacgta 1100 cttgttgttt
taaataaacg aacaaattaa aatgtctgat aatggacccc aatcaaacca 1160
acgtagtgcc ccccgcatta catttggtgg acccacagat tcaactgaca ataaccagaa
1220 tggaggacgc a 1231 24 122 PRT CORONAVIRUS 24 Met Lys Ile Ile
Leu Phe Leu Thr Leu Ile Val Phe Thr Ser Cys Glu 1 5 10 15 Leu Tyr
His Tyr Gln Glu Cys Val Arg Gly Thr Thr Val Leu Leu Lys 20 25 30
Glu Pro Cys Pro Ser Gly Thr Tyr Glu Gly Asn Ser Pro Phe His Pro 35
40 45 Leu Ala Asp Asn Lys Phe Ala Leu Thr Cys Thr Ser Thr His Phe
Ala 50 55 60 Phe Ala Cys Ala Asp Gly Thr Arg His Thr Tyr Gln Leu
Arg Ala Arg 65 70 75 80 Ser Val Ser Pro Lys Leu Phe Ile Arg Gln Glu
Glu Val Gln Gln Glu 85 90 95 Leu Tyr Ser Pro Leu Phe Leu Ile Val
Ala Ala Leu Val Phe Leu Ile 100 105 110 Leu Cys Phe Thr Ile Lys Arg
Lys Thr Glu 115 120 25 1231 DNA CORONAVIRUS CDS (650)..(781) 25
taccgtattg gaaactataa attaaataca gaccacgccg gtagcaacga caatattgct
60 ttgctagtac agtaagtgac aacagatgtt tcatcttgtt gacttccagg
ttacaatagc 120 agagatattg attatcatta tgaggacttt caggattgct
atttggaatc ttgacgttat 180 aataagttca atagtgagac aattatttaa
gcctctaact aagaagaatt attcggagtt 240 agatgatgaa gaacctatgg
agttagatta tccataaaac gaacatgaaa attattctct 300 tcctgacatt
gattgtattt acatcttgcg agctatatca ctatcaggag tgtgttagag 360
gtacgactgt actactaaaa gaaccttgcc catcaggaac atacgagggc aattcaccat
420 ttcaccctct tgctgacaat aaatttgcac taacttgcac tagcacacac
tttgcttttg 480 cttgtgctga cggtactcga catacctatc agctgcgtgc
aagatcagtt tcaccaaaac 540 ttttcatcag acaagaggag gttcaacaag
agctctactc gccacttttt ctcattgttg 600 ctgctctagt atttttaata
ctttgcttca ccattaagag aaagacaga atg aat gag 658 Met Asn Glu 1 ctc
act tta att gac ttc tat ttg tgc ttt tta gcc ttt ctg cta ttc 706 Leu
Thr Leu Ile Asp Phe Tyr Leu Cys Phe Leu Ala Phe Leu Leu Phe 5 10 15
ctt gtt tta ata atg ctt att ata ttt tgg ttt tca ctc gaa atc cag 754
Leu Val Leu Ile Met Leu Ile Ile Phe Trp Phe Ser Leu Glu Ile Gln 20
25 30 35 gat cta gaa gaa cct tgt acc aaa gtc taaacgaaca tgaaacttct
801 Asp Leu Glu Glu Pro Cys Thr Lys Val 40 cattgttttg acttgtattt
ctctatgcag ttgcatatgc actgtagtac agcgctgtgc 861 atctaataaa
cctcatgtgc ttgaagatcc ttgtaaggta caacactagg ggtaatactt 921
atagcactgc ttggctttgt gctctaggaa aggttttacc ttttcataga tggcacacta
981 tggttcaaac atgcacacct aatgttacta tcaactgtca agatccagct
ggtggtgcgc 1041 ttatagctag gtgttggtac cttcatgaag gtcaccaaac
tgctgcattt agagacgtac 1101 ttgttgtttt aaataaacga acaaattaaa
atgtctgata atggacccca atcaaaccaa 1161 cgtagtgccc cccgcattac
atttggtgga cccacagatt caactgacaa taaccagaat 1221 ggaggacgca 1231 26
44 PRT CORONAVIRUS 26 Met Asn Glu Leu Thr Leu Ile Asp Phe Tyr Leu
Cys Phe Leu Ala Phe 1 5 10 15 Leu Leu Phe Leu Val Leu Ile Met Leu
Ile Ile Phe Trp Phe Ser Leu 20 25 30 Glu Ile Gln Asp Leu Glu Glu
Pro Cys Thr Lys Val 35 40 27 1231 DNA CORONAVIRUS CDS (791)..(907)
27 taccgtattg gaaactataa attaaataca gaccacgccg gtagcaacga
caatattgct 60 ttgctagtac agtaagtgac aacagatgtt tcatcttgtt
gacttccagg ttacaatagc 120 agagatattg attatcatta tgaggacttt
caggattgct atttggaatc ttgacgttat 180 aataagttca atagtgagac
aattatttaa gcctctaact aagaagaatt attcggagtt 240 agatgatgaa
gaacctatgg agttagatta tccataaaac gaacatgaaa attattctct 300
tcctgacatt gattgtattt acatcttgcg agctatatca ctatcaggag tgtgttagag
360 gtacgactgt actactaaaa gaaccttgcc catcaggaac atacgagggc
aattcaccat 420 ttcaccctct tgctgacaat aaatttgcac taacttgcac
tagcacacac tttgcttttg 480 cttgtgctga cggtactcga catacctatc
agctgcgtgc aagatcagtt tcaccaaaac 540 ttttcatcag acaagaggag
gttcaacaag agctctactc gccacttttt ctcattgttg 600 ctgctctagt
atttttaata ctttgcttca ccattaagag aaagacagaa tgaatgagct 660
cactttaatt gacttctatt tgtgcttttt agcctttctg ctattccttg ttttaataat
720 gcttattata ttttggtttt cactcgaaat ccaggatcta gaagaacctt
gtaccaaagt 780 ctaaacgaac atg aaa ctt ctc att gtt ttg act tgt att
tct cta tgc 829 Met Lys Leu Leu Ile Val Leu Thr Cys Ile Ser Leu Cys
1 5 10 agt tgc ata tgc act gta gta cag cgc tgt gca tct aat aaa cct
cat 877 Ser Cys Ile Cys Thr Val Val Gln Arg Cys Ala Ser Asn Lys Pro
His 15 20 25 gtg ctt gaa gat cct tgt aag gta caa cac taggggtaat
acttatagca 927 Val Leu Glu Asp Pro Cys Lys Val Gln His 30 35
ctgcttggct ttgtgctcta ggaaaggttt taccttttca tagatggcac actatggttc
987 aaacatgcac acctaatgtt actatcaact gtcaagatcc agctggtggt
gcgcttatag 1047 ctaggtgttg gtaccttcat gaaggtcacc aaactgctgc
atttagagac gtacttgttg 1107 ttttaaataa acgaacaaat taaaatgtct
gataatggac cccaatcaaa ccaacgtagt 1167 gccccccgca ttacatttgg
tggacccaca gattcaactg acaataacca gaatggagga 1227 cgca 1231 28 39
PRT CORONAVIRUS 28 Met Lys Leu Leu Ile Val Leu Thr Cys Ile Ser Leu
Cys Ser Cys Ile 1 5 10 15 Cys Thr Val Val Gln Arg Cys Ala Ser Asn
Lys Pro His Val Leu Glu 20 25 30 Asp Pro Cys Lys Val Gln His 35 29
1231 DNA CORONAVIRUS CDS (876)..(1127) 29 taccgtattg gaaactataa
attaaataca gaccacgccg gtagcaacga caatattgct 60 ttgctagtac
agtaagtgac aacagatgtt tcatcttgtt gacttccagg ttacaatagc 120
agagatattg attatcatta tgaggacttt caggattgct atttggaatc ttgacgttat
180 aataagttca atagtgagac aattatttaa gcctctaact aagaagaatt
attcggagtt 240 agatgatgaa gaacctatgg agttagatta tccataaaac
gaacatgaaa attattctct 300 tcctgacatt gattgtattt acatcttgcg
agctatatca ctatcaggag tgtgttagag 360 gtacgactgt actactaaaa
gaaccttgcc catcaggaac atacgagggc aattcaccat 420 ttcaccctct
tgctgacaat aaatttgcac taacttgcac tagcacacac tttgcttttg 480
cttgtgctga cggtactcga catacctatc agctgcgtgc aagatcagtt tcaccaaaac
540 ttttcatcag acaagaggag gttcaacaag agctctactc gccacttttt
ctcattgttg 600 ctgctctagt atttttaata ctttgcttca ccattaagag
aaagacagaa tgaatgagct 660 cactttaatt gacttctatt tgtgcttttt
agcctttctg ctattccttg ttttaataat 720 gcttattata ttttggtttt
cactcgaaat ccaggatcta gaagaacctt gtaccaaagt 780 ctaaacgaac
atgaaacttc tcattgtttt gacttgtatt tctctatgca gttgcatatg 840
cactgtagta cagcgctgtg catctaataa acctc atg tgc ttg aag atc ctt 893
Met Cys Leu Lys Ile Leu 1 5 gta agg tac aac act agg ggt aat act tat
agc act gct tgg ctt tgt 941 Val Arg Tyr Asn Thr Arg Gly Asn Thr Tyr
Ser Thr Ala Trp Leu Cys 10 15 20 gct cta gga aag gtt tta cct ttt
cat aga tgg cac act atg gtt caa 989 Ala Leu Gly Lys Val Leu Pro Phe
His Arg Trp His Thr Met Val Gln 25 30 35 aca tgc aca cct aat gtt
act atc aac tgt caa gat cca gct ggt ggt 1037 Thr Cys Thr Pro Asn
Val Thr Ile Asn Cys Gln Asp Pro Ala Gly Gly 40 45 50 gcg ctt ata
gct agg tgt tgg tac ctt cat gaa ggt cac caa act gct 1085 Ala Leu
Ile Ala Arg Cys Trp Tyr Leu His Glu Gly His Gln Thr Ala 55 60 65 70
gca ttt aga gac gta ctt gtt gtt tta aat aaa cga aca aat 1127 Ala
Phe Arg Asp Val Leu Val Val Leu Asn Lys Arg Thr Asn 75 80
taaaatgtct gataatggac cccaatcaaa ccaacgtagt gccccccgca ttacatttgg
1187 tggacccaca gattcaactg acaataacca gaatggagga cgca 1231 30 84
PRT CORONAVIRUS 30 Met Cys Leu Lys Ile Leu Val Arg Tyr Asn Thr Arg
Gly Asn Thr Tyr 1 5 10 15 Ser Thr Ala Trp Leu Cys Ala Leu Gly Lys
Val Leu Pro Phe His Arg 20 25 30 Trp His Thr Met Val Gln Thr Cys
Thr Pro Asn Val Thr Ile Asn Cys 35 40 45 Gln Asp Pro Ala Gly Gly
Ala Leu Ile Ala Arg Cys Trp Tyr Leu His 50 55 60 Glu Gly His Gln
Thr Ala Ala Phe Arg Asp Val Leu Val Val Leu Asn 65 70 75 80 Lys Arg
Thr Asn 31 21221 DNA CORONAVIRUS 31 atggagagcc ttgttcttgg
tgtcaacgag aaaacacacg tccaactcag tttgcctgtc 60 cttcaggtta
gagacgtgct agtgcgtggc ttcggggact ctgtggaaga ggccctatcg 120
gaggcacgtg aacacctcaa aaatggcact tgtggtctag tagagctgga aaaaggcgta
180 ctgccccagc ttgaacagcc ctatgtgttc attaaacgtt ctgatgcctt
aagcaccaat 240 cacggccaca aggtcgttga gctggttgca gaaatggacg
gcattcagta cggtcgtagc 300 ggtataacac tgggagtact cgtgccacat
gtgggcgaaa ccccaattgc ataccgcaat 360 gttcttcttc gtaagaacgg
taataaggga gccggtggtc atagctatgg catcgatcta 420 aagtcttatg
acttaggtga cgagcttggc actgatccca ttgaagatta tgaacaaaac 480
tggaacacta agcatggcag tggtgcactc cgtgaactca ctcgtgagct caatggaggt
540 gcagtcactc gctatgtcga caacaatttc tgtggcccag atgggtaccc
tcttgattgc 600 atcaaagatt ttctcgcacg cgcgggcaag tcaatgtgca
ctctttccga acaacttgat 660 tacatcgagt cgaagagagg tgtctactgc
tgccgtgacc atgagcatga aattgcctgg 720 ttcactgagc gctctgataa
gagctacgag caccagacac ccttcgaaat taagagtgcc 780 aagaaatttg
acactttcaa aggggaatgc ccaaagtttg tgtttcctct taactcaaaa 840
gtcaaagtca ttcaaccacg tgttgaaaag aaaaagactg agggtttcat ggggcgtata 900
cgctctgtgt accctgttgc
atctccacag gagtgtaaca atatgcactt gtctaccttg 960 atgaaatgta
atcattgcga tgaagtttca tggcagacgt gcgactttct gaaagccact 1020
tgtgaacatt gtggcactga aaatttagtt attgaaggac ctactacatg tgggtaccta
1080 cctactaatg ctgtagtgaa aatgccatgt cctgcctgtc aagacccaga
gattggacct 1140 gagcatagtg ttgcagatta tcacaaccac tcaaacattg
aaactcgact ccgcaaggga 1200 ggtaggacta gatgttttgg aggctgtgtg
tttgcctatg ttggctgcta taataagcgt 1260 gcctactggg ttcctcgtgc
tagtgctgat attggctcag gccatactgg cattactggt 1320 gacaatgtgg
agaccttgaa tgaggatctc cttgagatac tgagtcgtga acgtgttaac 1380
attaacattg ttggcgattt tcatttgaat gaagaggttg ccatcatttt ggcatctttc
1440 tctgcttcta caagtgcctt tattgacact ataaagagtc ttgattacaa
gtctttcaaa 1500 accattgttg agtcctgcgg taactataaa gttaccaagg
gaaagcccgt aaaaggtgct 1560 tggaacattg gacaacagag atcagtttta
acaccactgt gtggttttcc ctcacaggct 1620 gctggtgtta tcagatcaat
ttttgcgcgc acacttgatg cagcaaacca ctcaattcct 1680 gatttgcaaa
gagcagctgt caccatactt gatggtattt ctgaacagtc attacgtctt 1740
gtcgacgcca tggtttatac ttcagacctg ctcaccaaca gtgtcattat tatggcatat
1800 gtaactggtg gtcttgtaca acagacttct cagtggttgt ctaatctttt
gggcactact 1860 gttgaaaaac tcaggcctat ctttgaatgg attgaggcga
aacttagtgc aggagttgaa 1920 tttctcaagg atgcttggga gattctcaaa
tttctcatta caggtgtttt tgacatcgtc 1980 aagggtcaaa tacaggttgc
ttcagataac atcaaggatt gtgtaaaatg cttcattgat 2040 gttgttaaca
aggcactcga aatgtgcatt gatcaagtca ctatcgctgg cgcaaagttg 2100
cgatcactca acttaggtga agtcttcatc gctcaaagca agggacttta ccgtcagtgt
2160 atacgtggca aggagcagct gcaactactc atgcctctta aggcaccaaa
agaagtaacc 2220 tttcttgaag gtgattcaca tgacacagta cttacctctg
aggaggttgt tctcaagaac 2280 ggtgaactcg aagcactcga gacgcccgtt
gatagcttca caaatggagc tatcgttggc 2340 acaccagtct gtgtaaatgg
cctcatgctc ttagagatta aggacaaaga acaatactgc 2400 gcattgtctc
ctggtttact ggctacaaac aatgtctttc gcttaaaagg gggtgcacca 2460
attaaaggtg taacctttgg agaagatact gtttgggaag ttcaaggtta caagaatgtg
2520 agaatcacat ttgagcttga tgaacgtgtt gacaaagtgc ttaatgaaaa
gtgctctgtc 2580 tacactgttg aatccggtac cgaagttact gagtttgcat
gtgttgtagc agaggctgtt 2640 gtgaagactt tacaaccagt ttctgatctc
cttaccaaca tgggtattga tcttgatgag 2700 tggagtgtag ctacattcta
cttatttgat gatgctggtg aagaaaactt ttcatcacgt 2760 atgtattgtt
ccttttaccc tccagatgag gaagaagagg acgatgcaga gtgtgaggaa 2820
gaagaaattg atgaaacctg tgaacatgag tacggtacag aggatgatta tcaaggtctc
2880 cctctggaat ttggtgcctc agctgaaaca gttcgagttg aggaagaaga
agaggaagac 2940 tggctggatg atactactga gcaatcagag attgagccag
aaccagaacc tacacctgaa 3000 gaaccagtta atcagtttac tggttattta
aaacttactg acaatgttgc cattaaatgt 3060 gttgacatcg ttaaggaggc
acaaagtgct aatcctatgg tgattgtaaa tgctgctaac 3120 atacacctga
aacatggtgg tggtgtagca ggtgcactca acaaggcaac caatggtgcc 3180
atgcaaaagg agagtgatga ttacattaag ctaaatggcc ctcttacagt aggagggtct
3240 tgtttgcttt ctggacataa tcttgctaag aagtgtctgc atgttgttgg
acctaaccta 3300 aatgcaggtg aggacatcca gcttcttaag gcagcatatg
aaaatttcaa ttcacaggac 3360 atcttacttg caccattgtt gtcagcaggc
atatttggtg ctaaaccact tcagtcttta 3420 caagtgtgcg tgcagacggt
tcgtacacag gtttatattg cagtcaatga caaagctctt 3480 tatgagcagg
ttgtcatgga ttatcttgat aacctgaagc ctagagtgga agcacctaaa 3540
caagaggagc caccaaacac agaagattcc aaaactgagg agaaatctgt cgtacagaag
3600 cctgtcgatg tgaagccaaa aattaaggcc tgcattgatg aggttaccac
aacactggaa 3660 gaaactaagt ttcttaccaa taagttactc ttgtttgctg
atatcaatgg taagctttac 3720 catgattctc agaacatgct tagaggtgaa
gatatgtctt tccttgagaa ggatgcacct 3780 tacatggtag gtgatgttat
cactagtggt gatatcactt gtgttgtaat accctccaaa 3840 aaggctggtg
gcactactga gatgctctca agagctttga agaaagtgcc agttgatgag 3900
tatataacca cgtaccctgg acaaggatgt gctggttata cacttgagga agctaagact
3960 gctcttaaga aatgcaaatc tgcattttat gtactacctt cagaagcacc
taatgctaag 4020 gaagagattc taggaactgt atcctggaat ttgagagaaa
tgcttgctca tgctgaagag 4080 acaagaaaat taatgcctat atgcatggat
gttagagcca taatggcaac catccaacgt 4140 aagtataaag gaattaaaat
tcaagagggc atcgttgact atggtgtccg attcttcttt 4200 tatactagta
aagagcctgt agcttctatt attacgaagc tgaactctct aaatgagccg 4260
cttgtcacaa tgccaattgg ttatgtgaca catggtttta atcttgaaga ggctgcgcgc
4320 tgtatgcgtt ctcttaaagc tcctgccgta gtgtcagtat catcaccaga
tgctgttact 4380 acatataatg gatacctcac ttcgtcatca aagacatctg
aggagcactt tgtagaaaca 4440 gtttctttgg ctggctctta cagagattgg
tcctattcag gacagcgtac agagttaggt 4500 gttgaatttc ttaagcgtgg
tgacaaaatt gtgtaccaca ctctggagag ccccgtcgag 4560 tttcatcttg
acggtgaggt tctttcactt gacaaactaa agagtctctt atccctgcgg 4620
gaggttaaga ctataaaagt gttcacaact gtggacaaca ctaatctcca cacacagctt
4680 gtggatatgt ctatgacata tggacagcag tttggtccaa catacttgga
tggtgctgat 4740 gttacaaaaa ttaaacctca tgtaaatcat gagggtaaga
ctttctttgt actacctagt 4800 gatgacacac tacgtagtga agctttcgag
tactaccata ctcttgatga gagttttctt 4860 ggtaggtaca tgtctgcttt
aaaccacaca aagaaatgga aatttcctca agttggtggt 4920 ttaacttcaa
ttaaatgggc tgataacaat tgttatttgt ctagtgtttt attagcactt 4980
caacagcttg aagtcaaatt caatgcacca gcacttcaag aggcttatta tagagcccgt
5040 gctggtgatg ctgctaactt ttgtgcactc atactcgctt acagtaataa
aactgttggc 5100 gagcttggtg atgtcagaga aactatgacc catcttctac
agcatgctaa tttggaatct 5160 gcaaagcgag ttcttaatgt ggtgtgtaaa
cattgtggtc agaaaactac taccttaacg 5220 ggtgtagaag ctgtgatgta
tatgggtact ctatcttatg ataatcttaa gacaggtgtt 5280 tccattccat
gtgtgtgtgg tcgtgatgct acacaatatc tagtacaaca agagtcttct 5340
tttgttatga tgtctgcacc acctgctgag tataaattac agcaaggtac attcttatgt
5400 gcgaatgagt acactggtaa ctatcagtgt ggtcattaca ctcatataac
tgctaaggag 5460 accctctatc gtattgacgg agctcacctt acaaagatgt
cagagtacaa aggaccagtg 5520 actgatgttt tctacaagga aacatcttac
actacaacca tcaagcctgt gtcgtataaa 5580 ctcgatggag ttacttacac
agagattgaa ccaaaattgg atgggtatta taaaaaggat 5640 aatgcttact
atacagagca gcctatagac cttgtaccaa ctcaaccatt accaaatgcg 5700
agttttgata atttcaaact cacatgttct aacacaaaat ttgctgatga tttaaatcaa
5760 atgacaggct tcacaaagcc agcttcacga gagctatctg tcacattctt
cccagacttg 5820 aatggcgatg tagtggctat tgactataga cactattcag
cgagtttcaa gaaaggtgct 5880 aaattactgc ataagccaat tgtttggcac
attaaccagg ctacaaccaa gacaacgttc 5940 aaaccaaaca cttggtgttt
acgttgtctt tggagtacaa agccagtaga tacttcaaat 6000 tcatttgaag
ttctggcagt agaagacaca caaggaatgg acaatcttgc ttgtgaaagt 6060
caacaaccca cctctgaaga agtagtggaa aatcctacca tacagaagga agtcatagag
6120 tgtgacgtga aaactaccga agttgtaggc aatgtcatac ttaaaccatc
agatgaaggt 6180 gttaaagtaa cacaagagtt aggtcatgag gatcttatgg
ctgcttatgt ggaaaacaca 6240 agcattacca ttaagaaacc taatgagctt
tcactagcct taggtttaaa aacaattgcc 6300 actcatggta ttgctgcaat
taatagtgtt ccttggagta aaattttggc ttatgtcaaa 6360 ccattcttag
gacaagcagc aattacaaca tcaaattgcg ctaagagatt agcacaacgt 6420
gtgtttaaca attatatgcc ttatgtgttt acattattgt tccaattgtg tacttttact
6480 aaaagtacca attctagaat tagagcttca ctacctacaa ctattgctaa
aaatagtgtt 6540 aagagtgttg ctaaattatg tttggatgcc ggcattaatt
atgtgaagtc acccaaattt 6600 tctaaattgt tcacaatcgc tatgtggcta
ttgttgttaa gtatttgctt aggttctcta 6660 atctgtgtaa ctgctgcttt
tggtgtactc ttatctaatt ttggtgctcc ttcttattgt 6720 aatggcgtta
gagaattgta tcttaattcg tctaacgtta ctactatgga tttctgtgaa 6780
ggttcttttc cttgcagcat ttgtttaagt ggattagact cccttgattc ttatccagct
6840 cttgaaacca ttcaggtgac gatttcatcg tacaagctag acttgacaat
tttaggtctg 6900 gccgctgagt gggttttggc atatatgttg ttcacaaaat
tcttttattt attaggtctt 6960 tcagctataa tgcaggtgtt ctttggctat
tttgctagtc atttcatcag caattcttgg 7020 ctcatgtggt ttatcattag
tattgtacaa atggcacccg tttctgcaat ggttaggatg 7080 tacatcttct
ttgcttcttt ctactacata tggaagagct atgttcatat catggatggt 7140
tgcacctctt cgacttgcat gatgtgctat aagcgcaatc gtgccacacg cgttgagtgt
7200 acaactattg ttaatggcat gaagagatct ttctatgtct atgcaaatgg
aggccgtggc 7260 ttctgcaaga ctcacaattg gaattgtctc aattgtgaca
cattttgcac tggtagtaca 7320 ttcattagtg atgaagttgc tcgtgatttg
tcactccagt ttaaaagacc aatcaaccct 7380 actgaccagt catcgtatat
tgttgatagt gttgctgtga aaaatggcgc gcttcacctc 7440 tactttgaca
aggctggtca aaagacctat gagagacatc cgctctccca ttttgtcaat 7500
ttagacaatt tgagagctaa caacactaaa ggttcactgc ctattaatgt catagttttt
7560 gatggcaagt ccaaatgcga cgagtctgct tctaagtctg cttctgtgta
ctacagtcag 7620 ctgatgtgcc aacctattct gttgcttgac caagctcttg
tatcagacgt tggagatagt 7680 actgaagttt ccgttaagat gtttgatgct
tatgtcgaca ccttttcagc aacttttagt 7740 gttcctatgg aaaaacttaa
ggcacttgtt gctacagctc acagcgagtt agcaaagggt 7800 gtagctttag
atggtgtcct ttctacattc gtgtcagctg cccgacaagg tgttgttgat 7860
accgatgttg acacaaagga tgttattgaa tgtctcaaac tttcacatca ctctgactta
7920 gaagtgacag gtgacagttg taacaatttc atgctcacct ataataaggt
tgaaaacatg 7980 acgcccagag atcttggcgc atgtattgac tgtaatgcaa
ggcatatcaa tgcccaagta 8040 gcaaaaagtc acaatgtttc actcatctgg
aatgtaaaag actacatgtc tttatctgaa 8100 cagctgcgta aacaaattcg
tagtgctgcc aagaagaaca acataccttt tagactaact 8160 tgtgctacaa
ctagacaggt tgtcaatgtc ataactacta aaatctcact caagggtggt 8220
aagattgtta gtacttgttt taaacttatg cttaaggcca cattattgtg cgttcttgct
8280 gcattggttt gttatatcgt tatgccagta catacattgt caatccatga
tggttacaca 8340 aatgaaatca ttggttacaa agccattcag gatggtgtca
ctcgtgacat catttctact 8400 gatgattgtt ttgcaaataa acatgctggt
tttgacgcat ggtttagcca gcgtggtggt 8460 tcatacaaaa atgacaaaag
ctgccctgta gtagctgcta tcattacaag agagattggt 8520 ttcatagtgc
ctggcttacc gggtactgtg ctgagagcaa tcaatggtga cttcttgcat 8580
tttctacctc gtgtttttag tgctgttggc aacatttgct acacaccttc caaactcatt
8640 gagtatagtg attttgctac ctctgcttgc gttcttgctg ctgagtgtac
aatttttaag 8700 gatgctatgg gcaaacctgt gccatattgt tatgacacta
atttgctaga gggttctatt 8760 tcttatagtg agcttcgtcc agacactcgt
tatgtgctta tggatggttc catcatacag 8820 tttcctaaca cttacctgga
gggttctgtt agagtagtaa caacttttga tgctgagtac 8880 tgtagacatg
gtacatgcga aaggtcagaa gtaggtattt gcctatctac cagtggtaga 8940
tgggttctta ataatgagca ttacagagct ctatcaggag ttttctgtgg tgttgatgcg
9000 atgaatctca tagctaacat ctttactcct cttgtgcaac ctgtgggtgc
tttagatgtg 9060 tctgcttcag tagtggctgg tggtattatt gccatattgg
tgacttgtgc tgcctactac 9120 tttatgaaat tcagacgtgt ttttggtgag
tacaaccatg ttgttgctgc taatgcactt 9180 ttgtttttga tgtctttcac
tatactctgt ctggtaccag cttacagctt tctgccggga 9240 gtctactcag
tcttttactt gtacttgaca ttctatttca ccaatgatgt ttcattcttg 9300
gctcaccttc aatggtttgc catgttttct cctattgtgc ctttttggat aacagcaatc
9360 tatgtattct gtatttctct gaagcactgc cattggttct ttaacaacta
tcttaggaaa 9420 agagtcatgt ttaatggagt tacatttagt accttcgagg
aggctgcttt gtgtaccttt 9480 ttgctcaaca aggaaatgta cctaaaattg
cgtagcgaga cactgttgcc acttacacag 9540 tataacaggt atcttgctct
atataacaag tacaagtatt tcagtggagc cttagatact 9600 accagctatc
gtgaagcagc ttgctgccac ttagcaaagg ctctaaatga ctttagcaac 9660
tcaggtgctg atgttctcta ccaaccacca cagacatcaa tcacttctgc tgttctgcag
9720 agtggtttta ggaaaatggc attcccgtca ggcaaagttg aagggtgcat
ggtacaagta 9780 acctgtggaa ctacaactct taatggattg tggttggatg
acacagtata ctgtccaaga 9840 catgtcattt gcacagcaga agacatgctt
aatcctaact atgaagatct gctcattcgc 9900 aaatccaacc atagctttct
tgttcaggct ggcaatgttc aacttcgtgt tattggccat 9960 tctatgcaaa
attgtctgct taggcttaaa gttgatactt ctaaccctaa gacacccaag 10020
tataaatttg tccgtatcca acctggtcaa acattttcag ttctagcatg ctacaatggt
10080 tcaccatctg gtgtttatca gtgtgccatg agacctaatc ataccattaa
aggttctttc 10140 cttaatggat catgtggtag tgttggtttt aacattgatt
atgattgcgt gtctttctgc 10200 tatatgcatc atatggagct tccaacagga
gtacacgctg gtactgactt agaaggtaaa 10260 ttctatggtc catttgttga
cagacaaact gcacaggctg caggtacaga cacaaccata 10320 acattaaatg
ttttggcatg gctgtatgct gctgttatca atggtgatag gtggtttctt 10380
aatagattca ccactacttt gaatgacttt aaccttgtgg caatgaagta caactatgaa
10440 cctttgacac aagatcatgt tgacatattg ggacctcttt ctgctcaaac
aggaattgcc 10500 gtcttagata tgtgtgctgc tttgaaagag ctgctgcaga
atggtatgaa tggtcgtact 10560 atccttggta gcactatttt agaagatgag
tttacaccat ttgatgttgt tagacaatgc 10620 tctggtgtta ccttccaagg
taagttcaag aaaattgtta agggcactca tcattggatg 10680 cttttaactt
tcttgacatc actattgatt cttgttcaaa gtacacagtg gtcactgttt 10740
ttctttgttt acgagaatgc tttcttgcca tttactcttg gtattatggc aattgctgca
10800 tgtgctatgc tgcttgttaa gcataagcac gcattcttgt gcttgtttct
gttaccttct 10860 cttgcaacag ttgcttactt taatatggtc tacatgcctg
ctagctgggt gatgcgtatc 10920 atgacatggc ttgaattggc tgacactagc
ttgtctggtt ataggcttaa ggattgtgtt 10980 atgtatgctt cagctttagt
tttgcttatt ctcatgacag ctcgcactgt ttatgatgat 11040 gctgctagac
gtgtttggac actgatgaat gtcattacac ttgtttacaa agtctactat 11100
ggtaatgctt tagatcaagc tatttccatg tgggccttag ttatttctgt aacctctaac
11160 tattctggtg tcgttacgac tatcatgttt ttagctagag ctatagtgtt
tgtgtgtgtt 11220 gagtattacc cattgttatt tattactggc aacaccttac
agtgtatcat gcttgtttat 11280 tgtttcttag gctattgttg ctgctgctac
tttggccttt tctgtttact caaccgttac 11340 ttcaggctta ctcttggtgt
ttatgactac ttggtctcta cacaagaatt taggtatatg 11400 aactcccagg
ggcttttgcc tcctaagagt agtattgatg ctttcaagct taacattaag 11460
ttgttgggta ttggaggtaa accatgtatc aaggttgcta ctgtacagtc taaaatgtct
11520 gacgtaaagt gcacatctgt ggtactgctc tcggttcttc aacaacttag
agtagagtca 11580 tcttctaaat tgtgggcaca atgtgtacaa ctccacaatg
atattcttct tgcaaaagac 11640 acaactgaag ctttcgagaa gatggtttct
cttttgtctg ttttgctatc catgcagggt 11700 gctgtagaca ttaataggtt
gtgcgaggaa atgctcgata accgtgctac tcttcaggct 11760 attgcttcag
aatttagttc tttaccatca tatgccgctt atgccactgc ccaggaggcc 11820
tatgagcagg ctgtagctaa tggtgattct gaagtcgttc tcaaaaagtt aaagaaatct
11880 ttgaatgtgg ctaaatctga gtttgaccgt gatgctgcca tgcaacgcaa
gttggaaaag 11940 atggcagatc aggctatgac ccaaatgtac aaacaggcaa
gatctgagga caagagggca 12000 aaagtaacta gtgctatgca aacaatgctc
ttcactatgc ttaggaagct tgataatgat 12060 gcacttaaca acattatcaa
caatgcgcgt gatggttgtg ttccactcaa catcatacca 12120 ttgactacag
cagccaaact catggttgtt gtccctgatt atggtaccta caagaacact 12180
tgtgatggta acacctttac atatgcatct gcactctggg aaatccagca agttgttgat
12240 gcggatagca agattgttca acttagtgaa attaacatgg acaattcacc
aaatttggct 12300 tggcctctta ttgttacagc tctaagagcc aactcagctg
ttaaactaca gaataatgaa 12360 ctgagtccag tagcactacg acagatgtcc
tgtgcggctg gtaccacaca aacagcttgt 12420 actgatgaca atgcacttgc
ctactataac aattcgaagg gaggtaggtt tgtgctggca 12480 ttactatcag
accaccaaga tctcaaatgg gctagattcc ctaagagtga tggtacaggt 12540
acaatttaca cagaactgga accaccttgt aggtttgtta cagacacacc aaaagggcct
12600 aaagtgaaat acttgtactt catcaaaggc ttaaacaacc taaatagagg
tatggtgctg 12660 ggcagtttag ctgctacagt acgtcttcag gctggaaatg
ctacagaagt acctgccaat 12720 tcaactgtgc tttccttctg tgcttttgca
gtagaccctg ctaaagcata taaggattac 12780 ctagcaagtg gaggacaacc
aatcaccaac tgtgtgaaga tgttgtgtac acacactggt 12840 acaggacagg
caattactgt aacaccagaa gctaacatgg accaagagtc ctttggtggt 12900
gcttcatgtt gtctgtattg tagatgccac attgaccatc caaatcctaa aggattctgt
12960 gacttgaaag gtaagtacgt ccaaatacct accacttgtg ctaatgaccc
agtgggtttt 13020 acacttagaa acacagtctg taccgtctgc ggaatgtgga
aaggttatgg ctgtagttgt 13080 gaccaactcc gcgaaccctt gatgcagtct
gcggatgcat caacgttttt aaacgggttt 13140 gcggtgtaag tgcagcccgt
cttacaccgt gcggcacagg cactagtact gatgtcgtct 13200 acagggcttt
tgatatttac aacgaaaaag ttgctggttt tgcaaagttc ctaaaaacta 13260
attgctgtcg cttccaggag aaggatgagg aaggcaattt attagactct tactttgtag
13320 ttaagaggca tactatgtct aactaccaac atgaagagac tatttataac
ttggttaaag 13380 attgtccagc ggttgctgtc catgactttt tcaagtttag
agtagatggt gacatggtac 13440 cacatatatc acgtcagcgt ctaactaaat
acacaatggc tgatttagtc tatgctctac 13500 gtcattttga tgagggtaat
tgtgatacat taaaagaaat actcgtcaca tacaattgct 13560 gtgatgatga
ttatttcaat aagaaggatt ggtatgactt cgtagagaat cctgacatct 13620
tacgcgtata tgctaactta ggtgagcgtg tacgccaatc attattaaag actgtacaat
13680 tctgcgatgc tatgcgtgat gcaggcattg taggcgtact gacattagat
aatcaggatc 13740 ttaatgggaa ctggtacgat ttcggtgatt tcgtacaagt
agcaccaggc tgcggagttc 13800 ctattgtgga ttcatattac tcattgctga
tgcccatcct cactttgact agggcattgg 13860 ctgctgagtc ccatatggat
gctgatctcg caaaaccact tattaagtgg gatttgctga 13920 aatatgattt
tacggaagag agactttgtc tcttcgaccg ttattttaaa tattgggacc 13980
agacatacca tcccaattgt attaactgtt tggatgatag gtgtatcctt cattgtgcaa
14040 actttaatgt gttattttct actgtgtttc cacctacaag ttttggacca
ctagtaagaa 14100 aaatatttgt agatggtgtt ccttttgttg tttcaactgg
ataccatttt cgtgagttag 14160 gagtcgtaca taatcaggat gtaaacttac
atagctcgcg tctcagtttc aaggaacttt 14220 tagtgtatgc tgctgatcca
gctatgcatg cagcttctgg caatttattg ctagataaac 14280 gcactacatg
cttttcagta gctgcactaa caaacaatgt tgcttttcaa actgtcaaac 14340
ccggtaattt taataaagac ttttatgact ttgctgtgtc taaaggtttc tttaaggaag
14400 gaagttctgt tgaactaaaa cacttcttct ttgctcagga tggcaacgct
gctatcagtg 14460 attatgacta ttatcgttat aatctgccaa caatgtgtga
tatcagacaa ctcctattcg 14520 tagttgaagt tgttgataaa tactttgatt
gttacgatgg tggctgtatt aatgccaacc 14580 aagtaatcgt taacaatctg
gataaatcag ctggtttccc atttaataaa tggggtaagg 14640 ctagacttta
ttatgactca atgagttatg aggatcaaga tgcacttttc gcgtatacta 14700
agcgtaatgt catccctact ataactcaaa tgaatcttaa gtatgccatt agtgcaaaga
14760 atagagctcg caccgtagct ggtgtctcta tctgtagtac tatgacaaat
agacagtttc 14820 atcagaaatt attgaagtca atagccgcca ctagaggagc
tactgtggta attggaacaa 14880 gcaagtttta cggtggctgg cataatatgt
taaaaactgt ttacagtgat gtagaaactc 14940 cacaccttat gggttgggat
tatccaaaat gtgacagagc catgcctaac atgcttagga 15000 taatggcctc
tcttgttctt gctcgcaaac ataacacttg ctgtaactta tcacaccgtt 15060
tctacaggtt agctaacgag tgtgcgcaag tattaagtga gatggtcatg tgtggcggct
15120 cactatatgt taaaccaggt ggaacatcat ccggtgatgc tacaactgct
tatgctaata 15180 gtgtctttaa catttgtcaa gctgttacag ccaatgtaaa
tgcacttctt tcaactgatg 15240 gtaataagat agctgacaag tatgtccgca
atctacaaca caggctctat gagtgtctct 15300 atagaaatag ggatgttgat
catgaattcg tggatgagtt ttacgcttac ctgcgtaaac 15360 atttctccat
gatgattctt tctgatgatg ccgttgtgtg ctataacagt aactatgcgg 15420
ctcaaggttt agtagctagc attaagaact ttaaggcagt tctttattat caaaataatg
15480 tgttcatgtc tgaggcaaaa tgttggactg agactgacct tactaaagga
cctcacgaat 15540 tttgctcaca gcatacaatg ctagttaaac aaggagatga
ttacgtgtac ctgccttacc 15600 cagatccatc aagaatatta ggcgcaggct
gttttgtcga tgatattgtc aaaacagatg 15660 gtacacttat gattgaaagg
ttcgtgtcac tggctattga tgcttaccca cttacaaaac 15720 atcctaatca
ggagtatgct gatgtctttc acttgtattt acaatacatt agaaagttac 15780
atgatgagct tactggccac atgttggaca tgtattccgt aatgctaact aatgataaca
15840 cctcacggta ctgggaacct gagttttatg aggctatgta cacaccacat
acagtcttgc 15900 aggctgtagg tgcttgtgta ttgtgcaatt
cacagacttc acttcgttgc ggtgcctgta 15960 ttaggagacc attcctatgt
tgcaagtgct gctatgacca tgtcatttca acatcacaca 16020 aattagtgtt
gtctgttaat ccctatgttt gcaatgcccc aggttgtgat gtcactgatg 16080
tgacacaact gtatctagga ggtatgagct attattgcaa gtcacataag cctcccatta
16140 gttttccatt atgtgctaat ggtcaggttt ttggtttata caaaaacaca
tgtgtaggca 16200 gtgacaatgt cactgacttc aatgcgatag caacatgtga
ttggactaat gctggcgatt 16260 acatacttgc caacacttgt actgagagac
tcaagctttt cgcagcagaa acgctcaaag 16320 ccactgagga aacatttaag
ctgtcatatg gtattgccac tgtacgcgaa gtactctctg 16380 acagagaatt
gcatctttca tgggaggttg gaaaacctag accaccattg aacagaaact 16440
atgtctttac tggttaccgt gtaactaaaa atagtaaagt acagattgga gagtacacct
16500 ttgaaaaagg tgactatggt gatgctgttg tgtacagagg tactacgaca
tacaagttga 16560 atgttggtga ttactttgtg ttgacatctc acactgtaat
gccacttagt gcacctactc 16620 tagtgccaca agagcactat gtgagaatta
ctggcttgta cccaacactc aacatctcag 16680 atgagttttc tagcaatgtt
gcaaattatc aaaaggtcgg catgcaaaag tactctacac 16740 tccaaggacc
acctggtact ggtaagagtc attttgccat cggacttgct ctctattacc 16800
catctgctcg catagtgtat acggcatgct ctcatgcagc tgttgatgcc ctatgtgaaa
16860 aggcattaaa atatttgccc atagataaat gtagtagaat catacctgcg
cgtgcgcgcg 16920 tagagtgttt tgataaattc aaagtgaatt caacactaga
acagtatgtt ttctgcactg 16980 taaatgcatt gccagaaaca actgctgaca
ttgtagtctt tgatgaaatc tctatggcta 17040 ctaattatga cttgagtgtt
gtcaatgcta gacttcgtgc aaaacactac gtctatattg 17100 gcgatcctgc
tcaattacca gccccccgca cattgctgac taaaggcaca ctagaaccag 17160
aatattttaa ttcagtgtgc agacttatga aaacaatagg tccagacatg ttccttggaa
17220 cttgtcgccg ttgtcctgct gaaattgttg acactgtgag tgctttagtt
tatgacaata 17280 agctaaaagc acacaaggat aagtcagctc aatgcttcaa
aatgttctac aaaggtgtta 17340 ttacacatga tgtttcatct gcaatcaaca
gacctcaaat aggcgttgta agagaatttc 17400 ttacacgcaa tcctgcttgg
agaaaagctg tttttatctc accttataat tcacagaacg 17460 ctgtagcttc
aaaaatctta ggattgccta cgcagactgt tgattcatca cagggttctg 17520
aatatgacta tgtcatattc acacaaacta ctgaaacagc acactcttgt aatgtcaacc
17580 gcttcaatgt ggctatcaca agggcaaaaa ttggcatttt gtgcataatg
tctgatagag 17640 atctttatga caaactgcaa tttacaagtc tagaaatacc
acgtcgcaat gtggctacat 17700 tacaagcaga aaatgtaact ggacttttta
aggactgtag taagatcatt actggtcttc 17760 atcctacaca ggcacctaca
cacctcagcg ttgatataaa gttcaagact gaaggattat 17820 gtgttgacat
accaggcata ccaaaggaca tgacctaccg tagactcatc tctatgatgg 17880
gtttcaaaat gaattaccaa gtcaatggtt accctaatat gtttatcacc cgcgaagaag
17940 ctattcgtca cgttcgtgcg tggattggct ttgatgtaga gggctgtcat
gcaactagag 18000 atgctgtggg tactaaccta cctctccagc taggattttc
tacaggtgtt aacttagtag 18060 ctgtaccgac tggttatgtt gacactgaaa
ataacacaga attcaccaga gttaatgcaa 18120 aacctccacc aggtgaccag
tttaaacatc ttataccact catgtataaa ggcttgccct 18180 ggaatgtagt
gcgtattaag atagtacaaa tgctcagtga tacactgaaa ggattgtcag 18240
acagagtcgt gttcgtcctt tgggcgcatg gctttgagct tacatcaatg aagtactttg
18300 tcaagattgg acctgaaaga acgtgttgtc tgtgtgacaa acgtgcaact
tgcttttcta 18360 cttcatcaga tacttatgcc tgctggaatc attctgtggg
ttttgactat gtctataacc 18420 catttatgat tgatgttcag cagtggggct
ttacgggtaa ccttcagagt aaccatgacc 18480 aacattgcca ggtacatgga
aatgcacatg tggctagttg tgatgctatc atgactagat 18540 gtttagcagt
ccatgagtgc tttgttaagc gcgttgattg gtctgttgaa taccctatta 18600
taggagatga actgagggtt aattctgctt gcagaaaagt acaacacatg gttgtgaagt
18660 ctgcattgct tgctgataag tttccagttc ttcatgacat tggaaatcca
aaggctatca 18720 agtgtgtgcc tcaggctgaa gtagaatgga agttctacga
tgctcagcca tgtagtgaca 18780 aagcttacaa aatagaggaa ctcttctatt
cttatgctac acatcacgat aaattcactg 18840 atggtgtttg tttgttttgg
aattgtaacg ttgatcgtta cccagccaat gcaattgtgt 18900 gtaggtttga
cacaagagtc ttgtcaaact tgaacttacc aggctgtgat ggtggtagtt 18960
tgtatgtgaa taagcatgca ttccacactc cagctttcga taaaagtgca tttactaatt
19020 taaagcaatt gcctttcttt tactattctg atagtccttg tgagtctcat
ggcaaacaag 19080 tagtgtcgga tattgattat gttccactca aatctgctac
gtgtattaca cgatgcaatt 19140 taggtggtgc tgtttgcaga caccatgcaa
atgagtaccg acagtacttg gatgcatata 19200 atatgatgat ttctgctgga
tttagcctat ggatttacaa acaatttgat acttataacc 19260 tgtggaatac
atttaccagg ttacagagtt tagaaaatgt ggcttataat gttgttaata 19320
aaggacactt tgatggacac gccggcgaag cacctgtttc catcattaat aatgctgttt
19380 acacaaaggt agatggtatt gatgtggaga tctttgaaaa taagacaaca
cttcctgtta 19440 atgttgcatt tgagctttgg gctaagcgta acattaaacc
agtgccagag attaagatac 19500 tcaataattt gggtgttgat atcgctgcta
atactgtaat ctgggactac aaaagagaag 19560 ccccagcaca tgtatctaca
ataggtgtct gcacaatgac tgacattgcc aagaaaccta 19620 ctgagagtgc
ttgttcttca cttactgtct tgtttgatgg tagagtggaa ggacaggtag 19680
acctttttag aaacgcccgt aatggtgttt taataacaga aggttcagtc aaaggtctaa
19740 caccttcaaa gggaccagca caagctagcg tcaatggagt cacattaatt
ggagaatcag 19800 taaaaacaca gtttaactac tttaagaaag tagacggcat
tattcaacag ttgcctgaaa 19860 cctactttac tcagagcaga gacttagagg
attttaagcc cagatcacaa atggaaactg 19920 actttctcga gctcgctatg
gatgaattca tacagcgata taagctcgag ggctatgcct 19980 tcgaacacat
cgtttatgga gatttcagtc atggacaact tggcggtctt catttaatga 20040
taggcttagc caagcgctca caagattcac cacttaaatt agaggatttt atccctatgg
20100 acagcacagt gaaaaattac ttcataacag atgcgcaaac aggttcatca
aaatgtgtgt 20160 gttctgtgat tgatctttta cttgatgact ttgtcgagat
aataaagtca caagatttgt 20220 cagtgatttc aaaagtggtc aaggttacaa
ttgactatgc tgaaatttca ttcatgcttt 20280 ggtgtaagga tggacatgtt
gaaaccttct acccaaaact acaagcaagt caagcgtggc 20340 aaccaggtgt
tgcgatgcct aacttgtaca agatgcaaag aatgcttctt gaaaagtgtg 20400
accttcagaa ttatggtgaa aatgctgtta taccaaaagg aataatgatg aatgtcgcaa
20460 agtatactca actgtgtcaa tacttaaata cacttacttt agctgtaccc
tacaacatga 20520 gagttattca ctttggtgct ggctctgata aaggagttgc
accaggtaca gctgtgctca 20580 gacaatggtt gccaactggc acactacttg
tcgattcaga tcttaatgac ttcgtctccg 20640 acgcagattc tactttaatt
ggagactgtg caacagtaca tacggctaat aaatgggacc 20700 ttattattag
cgatatgtat gaccctagga ccaaacatgt gacaaaagag aatgactcta 20760
aagaagggtt tttcacttat ctgtgtggat ttataaagca aaaactagcc ctgggtggtt
20820 ctatagctgt aaagataaca gagcattctt ggaatgctga cctttacaag
cttatgggcc 20880 atttctcatg gtggacagct tttgttacaa atgtaaatgc
atcatcatcg gaagcatttt 20940 taattggggc taactatctt ggcaagccga
aggaacaaat tgatggctat accatgcatg 21000 ctaactacat tttctggagg
aacacaaatc ctatccagtt gtcttcctat tcactctttg 21060 acatgagcaa
atttcctctt aaattaagag gaactgctgt aatgtctctt aaggagaatc 21120
aaatcaatga tatgatttat tctcttctgg aaaaaggtag gcttatcatt agagaaaaca
21180 acagagttgt ggtttcaagt gatattcttg ttaacaacta a 21221
32 297 DNA CORONAVIRUS 32
atggacccca atcaaaccaa cgtagtgccc cccgcattac
atttggtgga cccacagatt 60 caactgacaa taaccagaat ggaggacgca
atggggcaag gccaaaacag cgccgacccc 120 aaggtttacc caataatact
gcgtcttggt tcacagctct cactcagcat ggcaaggagg 180 aacttagatt
ccctcgaggc cagggcgttc caatcaacac caatagtggt ccagatgacc 240
aaattggcta ctaccgaaga gctacccgac gagttcgtgg tggtgacggc aaaatga 297
33 98 PRT CORONAVIRUS 33
Met Asp Pro Asn Gln Thr Asn Val Val Pro
Pro Ala Leu His Leu Val 1 5 10 15 Asp Pro Gln Ile Gln Leu Thr Ile
Thr Arg Met Glu Asp Ala Met Gly 20 25 30 Gln Gly Gln Asn Ser Ala
Asp Pro Lys Val Tyr Pro Ile Ile Leu Arg 35 40 45 Leu Gly Ser Gln
Leu Ser Leu Ser Met Ala Arg Arg Asn Leu Asp Ser 50 55 60 Leu Glu
Ala Arg Ala Phe Gln Ser Thr Pro Ile Val Val Gln Met Thr 65 70 75 80
Lys Leu Ala Thr Thr Glu Glu Leu Pro Asp Glu Phe Val Val Val Thr 85
90 95 Ala Lys
34 213 DNA CORONAVIRUS 34
atgctgccac cgtgctacaa
cttcctcaag gaacaacatt gccaaaaggc ttctacgcag 60 agggaagcag
aggcggcagt caagcctctt ctcgctcctc atcacgtagt cgcggtaatt 120
caagaaattc aactcctggc agcagtaggg gaaattctcc tgctcgaatg gctagcggag
180 gtggtgaaac tgccctcgcg ctattgctgc tag 213
35 70 PRT CORONAVIRUS 35
Met Leu Pro Pro Cys Tyr Asn Phe Leu Lys Glu Gln His Cys Gln Lys
1 5 10 15 Ala Ser Thr Gln Arg Glu Ala Glu Ala Ala Val Lys Pro Leu
Leu Ala 20 25 30 Pro His His Val Val Ala Val Ile Gln Glu Ile Gln
Leu Leu Ala Ala 35 40 45 Val Gly Glu Ile Leu Leu Leu Glu Trp Leu
Ala Glu Val Val Lys Leu 50 55 60 Pro Ser Arg Tyr Cys Cys 65 70
36 1377 DNA CORONAVIRUS CDS (67)..(1335) 36
atgaaggtca ccaaactgct
gcatttagag acgtacttgt tgttttaaat aaacgaacaa 60 attaaa atg tct gat
aat gga ccc caa tca aac caa cgt agt gcc ccc 108 Met Ser Asp Asn Gly
Pro Gln Ser Asn Gln Arg Ser Ala Pro 1 5 10 cgc att aca ttt ggt gga
ccc aca gat tca act gac aat aac cag aat 156 Arg Ile Thr Phe Gly Gly
Pro Thr Asp Ser Thr Asp Asn Asn Gln Asn 15 20 25 30 gga gga cgc aat
ggg gca agg cca aaa cag cgc cga ccc caa ggt tta 204 Gly Gly Arg Asn
Gly Ala Arg Pro Lys Gln Arg Arg Pro Gln Gly Leu 35 40 45 ccc aat
aat act gcg tct tgg ttc aca gct ctc act cag cat ggc aag 252 Pro Asn
Asn Thr Ala Ser Trp Phe Thr Ala Leu Thr Gln His Gly Lys 50 55 60
gag gaa ctt aga ttc cct cga ggc cag ggc gtt cca atc aac acc aat 300
Glu Glu Leu Arg Phe Pro Arg Gly Gln Gly Val Pro Ile Asn Thr Asn 65
70 75 agt ggt cca gat gac caa att ggc tac tac cga aga gct acc cga
cga 348 Ser Gly Pro Asp Asp Gln Ile Gly Tyr Tyr Arg Arg Ala Thr Arg
Arg 80 85 90 gtt cgt ggt ggt gac ggc aaa atg aaa gag ctc agc ccc
aga tgg tac 396 Val Arg Gly Gly Asp Gly Lys Met Lys Glu Leu Ser Pro
Arg Trp Tyr 95 100 105 110 ttc tat tac cta gga act ggc cca gaa gct
tca ctt ccc tac ggc gct 444 Phe Tyr Tyr Leu Gly Thr Gly Pro Glu Ala
Ser Leu Pro Tyr Gly Ala 115 120 125 aac aaa gaa ggc atc gta tgg gtt
gca act gag gga gcc ttg aat aca 492 Asn Lys Glu Gly Ile Val Trp Val
Ala Thr Glu Gly Ala Leu Asn Thr 130 135 140 ccc aaa gac cac att ggc
acc cgc aat cct aat aac aat gct gcc acc 540 Pro Lys Asp His Ile Gly
Thr Arg Asn Pro Asn Asn Asn Ala Ala Thr 145 150 155 gtg cta caa ctt
cct caa gga aca aca ttg cca aaa ggc ttc tac gca 588 Val Leu Gln Leu
Pro Gln Gly Thr Thr Leu Pro Lys Gly Phe Tyr Ala 160 165 170 gag gga
agc aga ggc ggc agt caa gcc tct tct cgc tcc tca tca cgt 636 Glu Gly
Ser Arg Gly Gly Ser Gln Ala Ser Ser Arg Ser Ser Ser Arg 175 180 185
190 agt cgc ggt aat tca aga aat tca act cct ggc agc agt agg gga aat
684 Ser Arg Gly Asn Ser Arg Asn Ser Thr Pro Gly Ser Ser Arg Gly Asn
195 200 205 tct cct gct cga atg gct agc gga ggt ggt gaa act gcc ctc
gcg cta 732 Ser Pro Ala Arg Met Ala Ser Gly Gly Gly Glu Thr Ala Leu
Ala Leu 210 215 220 ttg ctg cta gac aga ttg aac cag ctt gag agc aaa
gtt tct ggt aaa 780 Leu Leu Leu Asp Arg Leu Asn Gln Leu Glu Ser Lys
Val Ser Gly Lys 225 230 235 ggc caa caa caa caa ggc caa act gtc act
aag aaa tct gct gct gag 828 Gly Gln Gln Gln Gln Gly Gln Thr Val Thr
Lys Lys Ser Ala Ala Glu 240 245 250 gca tct aaa aag cct cgc caa aaa
cgt act gcc aca aaa cag tac aac 876 Ala Ser Lys Lys Pro Arg Gln Lys
Arg Thr Ala Thr Lys Gln Tyr Asn 255 260 265 270 gtc act caa gca ttt
ggg aga cgt ggt cca gaa caa acc caa gga aat 924 Val Thr Gln Ala Phe
Gly Arg Arg Gly Pro Glu Gln Thr Gln Gly Asn 275 280 285 ttc ggg gac
caa gac cta atc aga caa gga act gat tac aaa cat tgg 972 Phe Gly Asp
Gln Asp Leu Ile Arg Gln Gly Thr Asp Tyr Lys His Trp 290 295 300 ccg
caa att gca caa ttt gct cca agt gcc tct gca ttc ttt gga atg 1020
Pro Gln Ile Ala Gln Phe Ala Pro Ser Ala Ser Ala Phe Phe Gly Met 305
310 315 tca cgc att ggc atg gaa gtc aca cct tcg gga aca tgg ctg act
tat 1068 Ser Arg Ile Gly Met Glu Val Thr Pro Ser Gly Thr Trp Leu
Thr Tyr 320 325 330 cat gga gcc att aaa ttg gat gac aaa gat cca caa
ttc aaa gac aac 1116 His Gly Ala Ile Lys Leu Asp Asp Lys Asp Pro
Gln Phe Lys Asp Asn 335 340 345 350 gtc ata ctg ctg aac aag cac att
gac gca tac aaa aca ttc cca cca 1164 Val Ile Leu Leu Asn Lys His
Ile Asp Ala Tyr Lys Thr Phe Pro Pro 355 360 365 aca gag cct aaa aag
gac aaa aag aaa aag act gat gaa gct cag cct 1212 Thr Glu Pro Lys
Lys Asp Lys Lys Lys Lys Thr Asp Glu Ala Gln Pro 370 375 380 ttg ccg
cag aga caa aag aag cag ccc act gtg act ctt ctt cct gcg 1260 Leu
Pro Gln Arg Gln Lys Lys Gln Pro Thr Val Thr Leu Leu Pro Ala 385 390
395 gct gac atg gat gat ttc tcc aga caa ctt caa aat tcc atg agt gga
1308 Ala Asp Met Asp Asp Phe Ser Arg Gln Leu Gln Asn Ser Met Ser
Gly 400 405 410 gct tct gct gat tca act cag gca taa acactcatga
tgaccacaca 1355 Ala Ser Ala Asp Ser Thr Gln Ala 415 420 aggcagatgg
gctatgtaaa cg 1377
37 422 PRT CORONAVIRUS 37
Met Ser Asp Asn Gly
Pro Gln Ser Asn Gln Arg Ser Ala Pro Arg Ile 1 5 10 15 Thr Phe Gly
Gly Pro Thr Asp Ser Thr Asp Asn Asn Gln Asn Gly Gly 20 25 30 Arg
Asn Gly Ala Arg Pro Lys Gln Arg Arg Pro Gln Gly Leu Pro Asn 35 40
45 Asn Thr Ala Ser Trp Phe Thr Ala Leu Thr Gln His Gly Lys Glu Glu
50 55 60 Leu Arg Phe Pro Arg Gly Gln Gly Val Pro Ile Asn Thr Asn
Ser Gly 65 70 75 80 Pro Asp Asp Gln Ile Gly Tyr Tyr Arg Arg Ala Thr
Arg Arg Val Arg 85 90 95 Gly Gly Asp Gly Lys Met Lys Glu Leu Ser
Pro Arg Trp Tyr Phe Tyr 100 105 110 Tyr Leu Gly Thr Gly Pro Glu Ala
Ser Leu Pro Tyr Gly Ala Asn Lys 115 120 125 Glu Gly Ile Val Trp Val
Ala Thr Glu Gly Ala Leu Asn Thr Pro Lys 130 135 140 Asp His Ile Gly
Thr Arg Asn Pro Asn Asn Asn Ala Ala Thr Val Leu 145 150 155 160 Gln
Leu Pro Gln Gly Thr Thr Leu Pro Lys Gly Phe Tyr Ala Glu Gly 165 170
175 Ser Arg Gly Gly Ser Gln Ala Ser Ser Arg Ser Ser Ser Arg Ser Arg
180 185 190 Gly Asn Ser Arg Asn Ser Thr Pro Gly Ser Ser Arg Gly Asn
Ser Pro 195 200 205 Ala Arg Met Ala Ser Gly Gly Gly Glu Thr Ala Leu
Ala Leu Leu Leu 210 215 220 Leu Asp Arg Leu Asn Gln Leu Glu Ser Lys
Val Ser Gly Lys Gly Gln 225 230 235 240 Gln Gln Gln Gly Gln Thr Val
Thr Lys Lys Ser Ala Ala Glu Ala Ser 245 250 255 Lys Lys Pro Arg Gln
Lys Arg Thr Ala Thr Lys Gln Tyr Asn Val Thr 260 265 270 Gln Ala Phe
Gly Arg Arg Gly Pro Glu Gln Thr Gln Gly Asn Phe Gly 275 280 285 Asp
Gln Asp Leu Ile Arg Gln Gly Thr Asp Tyr Lys His Trp Pro Gln 290 295
300 Ile Ala Gln Phe Ala Pro Ser Ala Ser Ala Phe Phe Gly Met Ser Arg
305 310 315 320 Ile Gly Met Glu Val Thr Pro Ser Gly Thr Trp Leu Thr
Tyr His Gly 325 330 335 Ala Ile Lys Leu Asp Asp Lys Asp Pro Gln Phe
Lys Asp Asn Val Ile 340 345 350 Leu Leu Asn Lys His Ile Asp Ala Tyr
Lys Thr Phe Pro Pro Thr Glu 355 360 365 Pro Lys Lys Asp Lys Lys Lys
Lys Thr Asp Glu Ala Gln Pro Leu Pro 370 375 380 Gln Arg Gln Lys Lys
Gln Pro Thr Val Thr Leu Leu Pro Ala Ala Asp 385 390 395 400 Met Asp
Asp Phe Ser Arg Gln Leu Gln Asn Ser Met Ser Gly Ala Ser 405 410 415
Ala Asp Ser Thr Gln Ala 420
38 1377 DNA CORONAVIRUS 38
atgaaggtca
ccaaactgct gcatttagag acgtacttgt tgttttaaat aaacgaacaa 60
attaaaatgt ctgataatgg accccaatca aaccaacgta gtgccccccg cattacattt
120 ggtggaccca cagattcaac tgacaataac cagaatggag gacgcaatgg
ggcaaggcca 180 aaacagcgcc gaccccaagg tttacccaat aatactgcgt
cttggttcac agctctcact 240 cagcatggca aggaggaact tagattccct
cgaggccagg gcgttccaat caacaccaat 300 agtggtccag atgaccaaat
tggctactac cgaagagcta cccgacgagt tcgtggtggt 360 gacggcaaaa
tgaaagagct cagccccaga tggtacttct attacctagg aactggccca 420
gaagcttcac ttccctacgg cgctaacaaa gaaggcatcg tatgggttgc aactgaggga
480 gccttgaata cacccaaaga ccacattggc acccgcaatc ctaataacaa
tgctgccacc 540 gtgctacaac ttcctcaagg aacaacattg ccaaaaggct
tctacgcaga gggaagcaga 600 ggcggcagtc aagcctcttc tcgctcctca
tcacgtagtc gcggtaattc aagaaattca 660 actcctggca gcagtagggg
aaattctcct gctcgaatgg ctagcggagg tggtgaaact 720 gccctcgcgc
tattgctgct agacagattg aaccagcttg agagcaaagt ttctggtaaa 780
ggccaacaac aacaaggcca aactgtcact aagaaatctg ctgctgaggc atctaaaaag
840 cctcgccaaa aacgtactgc cacaaaacag tacaacgtca ctcaagcatt
tgggagacgt 900 ggtccagaac aaacccaagg aaatttcggg gaccaagacc
taatcagaca aggaactgat 960 tacaaacatt ggccgcaaat tgcacaattt
gctccaagtg cctctgcatt ctttggaatg 1020 tcacgcattg gcatggaagt
cacaccttcg ggaacatggc tgacttatca tggagccatt 1080
aaattggatg acaaagatcc acaattcaaa gacaacgtca tactgctgaa caagcacatt
1140 gacgcataca aaacattccc accaacagag cctaaaaagg acaaaaagaa
aaagactgat 1200 gaagctcagc ctttgccgca gagacaaaag aagcagccca
ctgtgactct tcttcctgcg 1260 gctgacatgg atgatttctc cagacaactt
caaaattcca tgagtggagc ttctgctgat 1320 tcaactcagg cataaacact
catgatgacc acacaaggca gatgggctat gtaaacg 1377
39 204 DNA CORONAVIRUS 39
atattaggtt tttacctacc caggaaaagc caaccaacct
cgatctcttg tagatctgtt 60 ctctaaacga actttaaaat ctgtgtagct
gtcgctcggc tgcatgccta gtgcacctac 120 gcagtataaa caataataaa
ttttactgtc gttgacaaga aacgagtaac tcgtccctct 180 tctgcagact
gcttacggtt tcgt 204
40 809 DNA CORONAVIRUS 40
actcaagcat ttgggagacg
tggtccagaa caaacccaag gaaatttcgg ggaccaagac 60 ctaatcagac
aaggaactga ttacaaacat tggccgcaaa ttgcacaatt tgctccaagt 120
gcctctgcat tctttggaat gtcacgcatt ggcatggaag tcacaccttc gggaacatgg
180 ctgacttatc atggagccat taaattggat gacaaagatc cacaattcaa
agacaacgtc 240 atactgctga acaagcacat tgacgcatac aaaacattcc
caccaacaga gcctaaaaag 300 gacaaaaaga aaaagactga tgaagctcag
cctttgccgc agagacaaaa gaagcagccc 360 actgtgactc ttcttcctgc
ggctgacatg gatgatttct ccagacaact tcaaaattcc 420 atgagtggag
cttctgctga ttcaactcag gcataaacac tcatgatgac cacacaaggc 480
agatgggcta tgtaaacgtt ttcgcaattc cgtttacgat acatagtcta ctcttgtgca
540 gaatgaattc tcgtaactaa acagcacaag taggtttagt taactttaat
ctcacatagc 600 aatctttaat caatgtgtaa cattagggag gacttgaaag
agccaccaca ttttcatcga 660 ggccacgcgg agtacgatcg agggtacagt
gaataatgct agggagagct gcctatatgg 720 aagagcccta atgtgtaaaa
ttaattttag tagtgctatc cccatgtgat tttaatagct 780 tcttaggaga
atgacaaaaa aaaaaaaaa 809
41 448 DNA CORONAVIRUS 41
aatgaacaca
tagggctgtt caagctgggg cagtacgcct ttttccagct ctactagacc 60
acaagtgcca tttttgaggt gttcacgtgc ctccgatagg gcctcttcca cagagtcccc
120 gaagccacgc actagcacgt ctctaacctg aaggacaggc aaactgagtt
ggacgtgtgt 180 tttctcgttg acaccaagaa caaggctctc catcttacct
ttcggtcaca cccggacgaa 240 acctaggtat gctgatgatc gactgcaaca
cggacgaaac cgtaagcagt ctgcagaaga 300 gggacgagtt actcgtttct
tgtcaacgac agtaaaattt attattgttt atactgcgta 360 ggtgcactag
gcatgcagcc gagcgacagc tacacagatt ttaaagttcg tttagagaac 420
agatctacaa gagatcgagg ttggttgg 448
42 2033 DNA CORONAVIRUS 42
atacctaggt ttcgtccggg tgtgaccgaa aggtaagatg gagagccttg ttcttggtgt
60 caacgagaaa acacacgtcc aactcagttt gcctgtcctt caggttagag
acgtgctagt 120 gcgtggcttc ggggactctg tggaagaggc cctatcggag
gcacgtgaac acctcaaaaa 180 tggcacttgt ggtctagtag agctggaaaa
aggcgtactg ccccagcttg aacagcccta 240 tgtgttcatt aaacgttctg
atgccttaag caccaatcac ggccacaagg tcgttgagct 300 ggttgcagaa
atggacggca ttcagtacgg tcgtagcggt ataacactgg gagtactcgt 360
gccacatgtg ggcgaaaccc caattgcata ccgcaatgtt cttcttcgta agaacggtaa
420 taagggagcc ggtggtcata gctatggcat cgatctaaag tcttatgact
taggtgacga 480 gcttggcact gatcccattg aagattatga acaaaactgg
aacactaagc atggcagtgg 540 tgcactccgt gaactcactc gtgagctcaa
tggaggtgca gtcactcgct atgtcgacaa 600 caatttctgt ggcccagatg
ggtaccctct tgattgcatc aaagattttc tcgcacgcgc 660 gggcaagtca
atgtgcactc tttccgaaca acttgattac atcgagtcga agagaggtgt 720
ctactgctgc cgtgaccatg agcatgaaat tgcctggttc actgagcgct ctgataagag
780 ctacgagcac cagacaccct tcgaaattaa gagtgccaag aaatttgaca
ctttcaaagg 840 ggaatgccca aagtttgtgt ttcctcttaa ctcaaaagtc
aaagtcattc aaccacgtgt 900 tgaaaagaaa aagactgagg gtttcatggg
gcgtatacgc tctgtgtacc ctgttgcatc 960 tccacaggag tgtaacaata
tgcacttgtc taccttgatg aaatgtaatc attgcgatga 1020 agtttcatgg
cagacgtgcg actttctgaa agccacttgt gaacattgtg gcactgaaaa 1080
tttagttatt gaaggaccta ctacatgtgg gtacctacct actaatgctg tagtgaaaat
1140 gccatgtcct gcctgtcaag acccagagat tggacctgag catagtgttg
cagattatca 1200 caaccactca aacattgaaa ctcgactccg caagggaggt
aggactagat gttttggagg 1260 ctgtgtgttt gcctatgttg gctgctataa
taagcgtgcc tactgggttc ctcgtgctag 1320 tgctgatatt ggctcaggcc
atactggcat tactggtgac aatgtggaga ccttgaatga 1380 ggatctcctt
gagatactga gtcgtgaacg tgttaacatt aacattgttg gcgattttca 1440
tttgaatgaa gaggttgcca tcattttggc atctttctct gcttctacaa gtgcctttat
1500 tgacactata aagagtcttg attacaagtc tttcaaaacc attgttgagt
cctgcggtaa 1560 ctataaagtt accaagggaa agcccgtaaa aggtgcttgg
aacattggac aacagagatc 1620 agttttaaca ccactgtgtg gttttccctc
acaggctgct ggtgttatca gatcaatttt 1680 tgcgcgcaca cttgatgcag
caaaccactc aattcctgat ttgcaaagag cagctgtcac 1740 catacttgat
ggtatttctg aacagtcatt acgtcttgtc gacgccatgg tttatacttc 1800
agacctgctc accaacagtg tcattattat ggcatatgta actggtggtc ttgtacaaca
1860 gacttctcag tggttgtcta atcttttggg cactactgtt gaaaaactca
ggcctatctt 1920 tgaatggatt gaggcgaaac ttagtgcagg agttgaattt
ctcaaggatg cttgggagat 1980 tctcaaattt ctcattacag gtgtttttga
catcgtcaag ggtcaaatac agg 2033
43 2018 DNA CORONAVIRUS 43
ggattgaggc gaaacttagt gcaggagttg aatttctcaa ggatgcttgg gagattctca
60 aatttctcat tacaggtgtt tttgacatcg tcaagggtca aatacaggtt
gcttcagata 120 acatcaagga ttgtgtaaaa tgcttcattg atgttgttaa
caaggcactc gaaatgtgca 180 ttgatcaagt cactatcgct ggcgcaaagt
tgcgatcact caacttaggt gaagtcttca 240 tcgctcaaag caagggactt
taccgtcagt gtatacgtgg caaggagcag ctgcaactac 300 tcatgcctct
taaggcacca aaagaagtaa cctttcttga aggtgattca catgacacag 360
tacttacctc tgaggaggtt gttctcaaga acggtgaact cgaagcactc gagacgcccg
420 ttgatagctt cacaaatgga gctatcgttg gcacaccagt ctgtgtaaat
ggcctcatgc 480 tcttagagat taaggacaaa gaacaatact gcgcattgtc
tcctggttta ctggctacaa 540 acaatgtctt tcgcttaaaa gggggtgcac
caattaaagg tgtaaccttt ggagaagata 600 ctgtttggga agttcaaggt
tacaagaatg tgagaatcac atttgagctt gatgaacgtg 660 ttgacaaagt
gcttaatgaa aagtgctctg tctacactgt tgaatccggt accgaagtta 720
ctgagtttgc atgtgttgta gcagaggctg ttgtgaagac tttacaacca gtttctgatc
780 tccttaccaa catgggtatt gatcttgatg agtggagtgt agctacattc
tacttatttg 840 atgatgctgg tgaagaaaac ttttcatcac gtatgtattg
ttccttttac cctccagatg 900 aggaagaaga ggacgatgca gagtgtgagg
aagaagaaat tgatgaaacc tgtgaacatg 960 agtacggtac agaggatgat
tatcaaggtc tccctctgga atttggtgcc tcagctgaaa 1020 cagttcgagt
tgaggaagaa gaagaggaag actggctgga tgatactact gagcaatcag 1080
agattgagcc agaaccagaa cctacacctg aagaaccagt taatcagttt actggttatt
1140 taaaacttac tgacaatgtt gccattaaat gtgttgacat cgttaaggag
gcacaaagtg 1200 ctaatcctat ggtgattgta aatgctgcta acatacacct
gaaacatggt ggtggtgtag 1260 caggtgcact caacaaggca accaatggtg
ccatgcaaaa ggagagtgat gattacatta 1320 agctaaatgg ccctcttaca
gtaggagggt cttgtttgct ttctggacat aatcttgcta 1380 agaagtgtct
gcatgttgtt ggacctaacc taaatgcagg tgaggacatc cagcttctta 1440
aggcagcata tgaaaatttc aattcacagg acatcttact tgcaccattg ttgtcagcag
1500 gcatatttgg tgctaaacca cttcagtctt tacaagtgtg cgtgcagacg
gttcgtacac 1560 aggtttatat tgcagtcaat gacaaagctc tttatgagca
ggttgtcatg gattatcttg 1620 ataacctgaa gcctagagtg gaagcaccta
aacaagagga gccaccaaac acagaagatt 1680 ccaaaactga ggagaaatct
gtcgtacaga agcctgtcga tgtgaagcca aaaattaagg 1740 cctgcattga
tgaggttacc acaacactgg aagaaactaa gtttcttacc aataagttac 1800
tcttgtttgc tgatatcaat ggtaagcttt accatgattc tcagaacatg cttagaggtg
1860 aagatatgtc tttccttgag aaggatgcac cttacatggt aggtgatgtt
atcactagtg 1920 gtgatatcac ttgtgttgta ataccctcca aaaaggctgg
tggcactact gagatgctct 1980 caagagcttt gaagaaagtg ccagttgatg
agtatata 2018
44 1442 DNA CORONAVIRUS 44
ttgatgaggt taccacaaca
ctggaagaaa ctaagtttct taccaataag ttactcttgt 60 ttgctgatat
caatggtaag ctttaccatg attctcagaa catgcttaga ggtgaagata 120
tgtctttcct tgagaaggat gcaccttaca tggtaggtga tgttatcact agtggtgata
180 tcacttgtgt tgtaataccc tccaaaaagg ctggtggcac tactgagatg
ctctcaagag 240 ctttgaagaa agtgccagtt gatgagtata taaccacgta
ccctggacaa ggatgtgctg 300 gttatacact tgaggaagct aagactgctc
ttaagaaatg caaatctgca ttttatgtac 360 taccttcaga agcacctaat
gctaaggaag agattctagg aactgtatcc tggaatttga 420 gagaaatgct
tgctcatgct gaagagacaa gaaaattaat gcctatatgc atggatgtta 480
gagccataat ggcaaccatc caacgtaagt ataaaggaat taaaattcaa gagggcatcg
540 ttgactatgg tgtccgattc ttcttttata ctagtaaaga gcctgtagct
tctattatta 600 cgaagctgaa ctctctaaat gagccgcttg tcacaatgcc
aattggttat gtgacacatg 660 gttttaatct tgaagaggct gcgcgctgta
tgcgttctct taaagctcct gccgtagtgt 720 cagtatcatc accagatgct
gttactacat ataatggata cctcacttcg tcatcaaaga 780 catctgagga
gcactttgta gaaacagttt ctttggctgg ctcttacaga gattggtcct 840
attcaggaca gcgtacagag ttaggtgttg aatttcttaa gcgtggtgac aaaattgtgt
900 accacactct ggagagcccc gtcgagtttc atcttgacgg tgaggttctt
tcacttgaca 960 aactaaagag tctcttatcc ctgcgggagg ttaagactat
aaaagtgttc acaactgtgg 1020 acaacactaa tctccacaca cagcttgtgg
atatgtctat gacatatgga cagcagtttg 1080 gtccaacata cttggatggt
gctgatgtta caaaaattaa acctcatgta aatcatgagg 1140 gtaagacttt
ctttgtacta cctagtgatg acacactacg tagtgaagct ttcgagtact 1200
accatactct tgatgagagt tttcttggta ggtacatgtc tgctttaaac cacacaaaga
1260 aatggaaatt tcctcaagtt ggtggtttaa cttcaattaa atgggctgat
aacaattgtt 1320 atttgtctag tgttttatta gcacttcaac agcttgaagt
caaattcaat gcaccagcac 1380 ttcaagaggc ttattataga gcccgtgctg
gtgatgctgc taacttttgt gcactcatac 1440 tc 1442
45 1050 DNA CORONAVIRUS 45
atatgtctat gacatatgga cagcagtttg gtccaacata
cttggatggt gctgatgtta 60 caaaaattaa acctcatgta aatcatgagg
gtaagacttt ctttgtacta cctagtgatg 120 acacactacg tagtgaagct
ttcgagtact accatactct tgatgagagt tttcttggta 180 ggtacatgtc
tgctttaaac cacacaaaga aatggaaatt tcctcaagtt ggtggtttaa 240
cttcaattaa atgggctgat aacaattgtt atttgtctag tgttttatta gcacttcaac
300 agcttgaagt caaattcaat gcaccagcac ttcaagaggc ttattataga
gcccgtgctg 360 gtgatgctgc taacttttgt gcactcatac tcgcttacag
taataaaact gttggcgagc 420 ttggtgatgt cagagaaact atgacccatc
ttctacagca tgctaatttg gaatctgcaa 480 agcgagttct taatgtggtg
tgtaaacatt gtggtcagaa aactactacc ttaacgggtg 540 tagaagctgt
gatgtatatg ggtactctat cttatgataa tcttaagaca ggtgtttcca 600
ttccatgtgt gtgtggtcgt gatgctacac aatatctagt acaacaagag tcttcttttg
660 ttatgatgtc tgcaccacct gctgagtata aattacagca aggtacattc
ttatgtgcga 720 atgagtacac tggtaactat cagtgtggtc attacactca
tataactgct aaggagaccc 780 tctatcgtat tgacggagct caccttacaa
agatgtcaga gtacaaagga ccagtgactg 840 atgttttcta caaggaaaca
tcttacacta caaccatcaa gcctgtgtcg tataaactcg 900 atggagttac
ttacacagag attgaaccaa aattggatgg gtattataaa aaggataatg 960
cttactatac agagcagcct atagaccttg taccaactca accattacca aatgcgagtt
ttgataattt caaactcaca tgttctaaca 1050
46 1995 DNA CORONAVIRUS 46
tttgtgcact catactcgct tacagtaata aaactgttgg cgagcttggt
gatgtcagag 60 aaactatgac ccatcttcta cagcatgcta atttggaatc
tgcaaagcga gttcttaatg 120 tggtgtgtaa acattgtggt cagaaaacta
ctaccttaac gggtgtagaa gctgtgatgt 180 atatgggtac tctatcttat
gataatctta agacaggtgt ttccattcca tgtgtgtgtg 240 gtcgtgatgc
tacacaatat ctagtacaac aagagtcttc ttttgttatg atgtctgcac 300
cacctgctga gtataaatta cagcaaggta cattcttatg tgcgaatgag tacactggta
360 actatcagtg tggtcattac actcatataa ctgctaagga gaccctctat
cgtattgacg 420 gagctcacct tacaaagatg tcagagtaca aaggaccagt
gactgatgtt ttctacaagg 480 aaacatctta cactacaacc atcaagcctg
tgtcgtataa actcgatgga gttacttaca 540 cagagattga accaaaattg
gatgggtatt ataaaaagga taatgcttac tatacagagc 600 agcctataga
ccttgtacca actcaaccat taccaaatgc gagttttgat aatttcaaac 660
tcacatgttc taacacaaaa tttgctgatg atttaaatca aatgacaggc ttcacaaagc
720 cagcttcacg agagctatct gtcacattct tcccagactt gaatggcgat
gtagtggcta 780 ttgactatag acactattca gcgagtttca agaaaggtgc
taaattactg cataagccaa 840 ttgtttggca cattaaccag gctacaacca
agacaacgtt caaaccaaac acttggtgtt 900 tacgttgtct ttggagtaca
aagccagtag atacttcaaa ttcatttgaa gttctggcag 960 tagaagacac
acaaggaatg gacaatcttg cttgtgaaag tcaacaaccc acctctgaag 1020
aagtagtgga aaatcctacc atacagaagg aagtcataga gtgtgacgtg aaaactaccg
1080 aagttgtagg caatgtcata cttaaaccat cagatgaagg tgttaaagta
acacaagagt 1140 taggtcatga ggatcttatg gctgcttatg tggaaaacac
aagcattacc attaagaaac 1200 ctaatgagct ttcactagcc ttaggtttaa
aaacaattgc cactcatggt attgctgcaa 1260 ttaatagtgt tccttggagt
aaaattttgg cttatgtcaa accattctta ggacaagcag 1320 caattacaac
atcaaattgc gctaagagat tagcacaacg tgtgtttaac aattatatgc 1380
cttatgtgtt tacattattg ttccaattgt gtacttttac taaaagtacc aattctagaa
1440 ttagagcttc actacctaca actattgcta aaaatagtgt taagagtgtt
gctaaattat 1500 gtttggatgc cggcattaat tatgtgaagt cacccaaatt
ttctaaattg ttcacaatcg 1560 ctatgtggct attgttgtta agtatttgct
taggttctct aatctgtgta actgctgctt 1620 ttggtgtact cttatctaat
tttggtgctc cttcttattg taatggcgtt agagaattgt 1680 atcttaattc
gtctaacgtt actactatgg atttctgtga aggttctttt ccttgcagca 1740
tttgtttaag tggattagac tcccttgatt cttatccagc tcttgaaacc attcaggtga
1800 cgatttcatc gtacaagcta gacttgacaa ttttaggtct ggccgctgag
tgggttttgg 1860 catatatgtt gttcacaaaa ttcttttatt tattaggtct
ttcagctata atgcaggtgt 1920 tctttggcta ttttgctagt catttcatca
gcaattcttg gctcatgtgg tttatcatta 1980 gtattgtaca aatgg 1995 47 1884
DNA CORONAVIRUS 47 aattcttggc tcatgtggtt tatcattagt attgtacaaa
tggcacccgt ttctgcaatg 60 gttaggatgt acatcttctt tgcttctttc
tactacatat ggaagagcta tgttcatatc 120 atggatggtt gcacctcttc
gacttgcatg atgtgctata agcgcaatcg tgccacacgc 180 gttgagtgta
caactattgt taatggcatg aagagatctt tctatgtcta tgcaaatgga 240
ggccgtggct tctgcaagac tcacaattgg aattgtctca attgtgacac attttgcact
300 ggtagtacat tcattagtga tgaagttgct cgtgatttgt cactccagtt
taaaagacca 360 atcaacccta ctgaccagtc atcgtatatt gttgatagtg
ttgctgtgaa aaatggcgcg 420 cttcacctct actttgacaa ggctggtcaa
aagacctatg agagacatcc gctctcccat 480 tttgtcaatt tagacaattt
gagagctaac aacactaaag gttcactgcc tattaatgtc 540 atagtttttg
atggcaagtc caaatgcgac gagtctgctt ctaagtctgc ttctgtgtac 600
tacagtcagc tgatgtgcca acctattctg ttgcttgacc aagctcttgt atcagacgtt
660 ggagatagta ctgaagtttc cgttaagatg tttgatgctt atgtcgacac
cttttcagca 720 acttttagtg ttcctatgga aaaacttaag gcacttgttg
ctacagctca cagcgagtta 780 gcaaagggtg tagctttaga tggtgtcctt
tctacattcg tgtcagctgc ccgacaaggt 840 gttgttgata ccgatgttga
cacaaaggat gttattgaat gtctcaaact ttcacatcac 900 tctgacttag
aagtgacagg tgacagttgt aacaatttca tgctcaccta taataaggtt 960
gaaaacatga cgcccagaga tcttggcgca tgtattgact gtaatgcaag gcatatcaat
1020 gcccaagtag caaaaagtca caatgtttca ctcatctgga atgtaaaaga
ctacatgtct 1080 ttatctgaac agctgcgtaa acaaattcgt agtgctgcca
agaagaacaa catacctttt 1140 agactaactt gtgctacaac tagacaggtt
gtcaatgtca taactactaa aatctcactc 1200 aagggtggta agattgttag
tacttgtttt aaacttatgc ttaaggccac attattgtgc 1260 gttcttgctg
cattggtttg ttatatcgtt atgccagtac atacattgtc aatccatgat 1320
ggttacacaa atgaaatcat tggttacaaa gccattcagg atggtgtcac tcgtgacatc
1380 atttctactg atgattgttt tgcaaataaa catgctggtt ttgacgcatg
gtttagccag 1440 cgtggtggtt catacaaaaa tgacaaaagc tgccctgtag
tagctgctat cattacaaga 1500 gagattggtt tcatagtgcc tggcttaccg
ggtactgtgc tgagagcaat caatggtgac 1560 ttcttgcatt ttctacctcg
tgtttttagt gctgttggca acatttgcta cacaccttcc 1620 aaactcattg
agtatagtga ttttgctacc tctgcttgcg ttcttgctgc tgagtgtaca 1680
atttttaagg atgctatggg caaacctgtg ccatattgtt atgacactaa tttgctagag
1740 ggttctattt cttatagtga gcttcgtcca gacactcgtt atgtgcttat
ggatggttcc 1800 atcatacagt ttcctaacac ttacctggag ggttctgtta
gagtagtaac aacttttgat 1860 gctgagtact gtagacatgg taca 1884
48 2020 DNA CORONAVIRUS 48
cactcgttat gtgcttatgg atggttccat catacagttt
cctaacactt acctggaggg 60 ttctgttaga gtagtaacaa cttttgatgc
tgagtactgt agacatggta catgcgaaag 120 gtcagaagta ggtatttgcc
tatctaccag tggtagatgg gttcttaata atgagcatta 180 cagagctcta
tcaggagttt tctgtggtgt tgatgcgatg aatctcatag ctaacatctt 240
tactcctctt gtgcaacctg tgggtgcttt agatgtgtct gcttcagtag tggctggtgg
300 tattattgcc atattggtga cttgtgctgc ctactacttt atgaaattca
gacgtgtttt 360 tggtgagtac aaccatgttg ttgctgctaa tgcacttttg
tttttgatgt ctttcactat 420 actctgtctg gtaccagctt acagctttct
gccgggagtc tactcagtct tttacttgta 480 cttgacattc tatttcacca
atgatgtttc attcttggct caccttcaat ggtttgccat 540 gttttctcct
attgtgcctt tttggataac agcaatctat gtattctgta tttctctgaa 600
gcactgccat tggttcttta acaactatct taggaaaaga gtcatgttta atggagttac
660 atttagtacc ttcgaggagg ctgctttgtg tacctttttg ctcaacaagg
aaatgtacct 720 aaaattgcgt agcgagacac tgttgccact tacacagtat
aacaggtatc ttgctctata 780 taacaagtac aagtatttca gtggagcctt
agatactacc agctatcgtg aagcagcttg 840 ctgccactta gcaaaggctc
taaatgactt tagcaactca ggtgctgatg ttctctacca 900 accaccacag
acatcaatca cttctgctgt tctgcagagt ggttttagga aaatggcatt 960
cccgtcaggc aaagttgaag ggtgcatggt acaagtaacc tgtggaacta caactcttaa
1020 tggattgtgg ttggatgaca cagtatactg tccaagacat gtcatttgca
cagcagaaga 1080 catgcttaat cctaactatg aagatctgct cattcgcaaa
tccaaccata gctttcttgt 1140 tcaggctggc aatgttcaac ttcgtgttat
tggccattct atgcaaaatt gtctgcttag 1200 gcttaaagtt gatacttcta
accctaagac acccaagtat aaatttgtcc gtatccaacc 1260 tggtcaaaca
ttttcagttc tagcatgcta caatggttca ccatctggtg tttatcagtg 1320
tgccatgaga cctaatcata ccattaaagg ttctttcctt aatggatcat gtggtagtgt
1380 tggttttaac attgattatg attgcgtgtc tttctgctat atgcatcata
tggagcttcc 1440 aacaggagta cacgctggta ctgacttaga aggtaaattc
tatggtccat ttgttgacag 1500 acaaactgca caggctgcag gtacagacac
aaccataaca ttaaatgttt tggcatggct 1560 gtatgctgct gttatcaatg
gtgataggtg gtttcttaat agattcacca ctactttgaa 1620 tgactttaac
cttgtggcaa tgaagtacaa ctatgaacct ttgacacaag atcatgttga 1680
catattggga cctctttctg ctcaaacagg aattgccgtc ttagatatgt gtgctgcttt
1740 gaaagagctg ctgcagaatg gtatgaatgg tcgtactatc cttggtagca
ctattttaga 1800 agatgagttt acaccatttg atgttgttag acaatgctct
ggtgttacct tccaaggtaa 1860 gttcaagaaa attgttaagg gcactcatca
ttggatgctt ttaactttct tgacatcact 1920 attgattctt gttcaaagta
cacagtggtc actgtttttc tttgtttacg agaatgcttt 1980 cttgccattt
actcttggta ttatggcaat tgctgcatgt 2020
49 2040 DNA CORONAVIRUS 49
agcatttcca gcctgaagac gtactgtagc agctaaactg cccagcacca tacctctatt
60 taggttgttt aagcctttga tgaagtacaa gtatttcact ttaggccctt
ttggtgtgtc 120 tgtaacaaac ctacaaggtg gttccagttc tgtgtaaatt
gtacctgtac catcactctt 180 agggaatcta gcccatttga gatcttggtg
gtctgatagt aatgccagca caaacctacc 240 tcccttcgaa ttgttatagt aggcaagtgc
attgtcatca gtacaagctg tttgtgtggt 300
accagccgca caggacatct gtcgtagtgc tactggactc agttcattat tctgtagttt
360 aacagctgag ttggctctta gagctgtaac aataagaggc caagccaaat
ttggtgaatt 420 gtccatgtta atttcactaa gttgaacaat cttgctatcc
gcatcaacaa cttgctggat 480 ttcccagagt gcagatgcat atgtaaaggt
gttaccatca caagtgttct tgtaggtacc 540 ataatcaggg acaacaacca
tgagtttggc tgctgtagtc aatggtatga tgttgagtgg 600 aacacaacca
tcacgcgcat tgttgataat gttgttaagt gcatcattat caagcttcct 660
aagcatagtg aagagcattg tttgcatagc actagttact tttgccctct tgtcctcaga
720 tcttgcctgt ttgtacattt gggtcatagc ctgatctgcc atcttttcca
acttgcgttg 780 catggcagca tcacggtcaa actcagattt agccacattc
aaagatttct ttaacttttt 840 gagaacgact tcagaatcac cattagctac
agcctgctca taggcctcct gggcagtggc 900 ataagcggca tatgatggta
aagaactaaa ttctgaagca atagcctgaa gagtagcacg 960 gttatcgagc
atttcctcgc acaacctatt aatgtctaca gcaccctgca tggatagcaa 1020
aacagacaaa agagaaacca tcttctcgaa agcttcagtt gtgtcttttg caagaagaat
1080 atcattgtgg agttgtacac attgtgccca caatttagaa gatgactcta
ctctaagttg 1140 ttgaagaacc gagagcagta ccacagatgt gcactttacg
tcagacattt tagactgtac 1200 agtagcaacc ttgatacatg gtttacctcc
aatacccaac aacttaatgt taagcttgaa 1260 agcatcaata ctactcttag
gaggcaaaag cccctgggag ttcatatacc taaattcttg 1320 tgtagagacc
aagtagtcat aaacaccaag agtaagcctg aagtaacggt tgagtaaaca 1380
gaaaaggcca aagtagcagc agcaacaata gcctaagaaa caataaacaa gcatgataca
1440 ctgtaaggtg ttgccagtaa taaataacaa tgggtaatac tcaacacaca
caaacactat 1500 agctctagct aaaaacatga tagtcgtaac gacaccagaa
tagttagagg ttacagaaat 1560 aactaaggcc cacatggaaa tagcttgatc
taaagcatta ccatagtaga ctttgtaaac 1620 aagtgtaatg acattcatca
gtgtccaaac acgtctagca gcatcatcat aaacagtgcg 1680 agctgtcatg
agaataagca aaactaaagc tgaagcatac ataacacaat ccttaagcct 1740
ataaccagac aagctagtgt cagccaattc aagccatgtc atgatacgca tcacccagct
1800 agcaggcatg tagaccatat taaagtaagc aactgttgca agagaaggta
acagaaacaa 1860 gcacaagaat gcgtgcttat gcttaacaag cagcatagca
catgcagcaa ttgccataat 1920 accaagagta aatggcaaga aagcattctc
gtaaacaaag aaaaacagtg accactgtgt 1980 actttgaaca agaatcaata
gtgatgtcaa gaaagttaaa agcatccaat gatgagtgca 2040
50 2012 DNA CORONAVIRUS 50
cttgtaggtt tgttacagac acaccaaaag ggcctaaagt
gaaatacttg tacttcatca 60 aaggcttaaa caacctaaat agaggtatgg
tgctgggcag tttagctgct acagtacgtc 120 ttcaggctgg aaatgctaca
gaagtacctg ccaattcaac tgtgctttcc ttctgtgctt 180 ttgcagtaga
ccctgctaaa gcatataagg attacctagc aagtggagga caaccaatca 240
ccaactgtgt gaagatgttg tgtacacaca ctggtacagg acaggcaatt actgtaacac
300 cagaagctaa catggaccaa gagtcctttg gtggtgcttc atgttgtctg
tattgtagat 360 gccacattga ccatccaaat cctaaaggat tctgtgactt
gaaaggtaag tacgtccaaa 420 tacctaccac ttgtgctaat gacccagtgg
gttttacact tagaaacaca gtctgtaccg 480 tctgcggaat gtggaaaggt
tatggctgta gttgtgacca actccgcgaa cccttgatgc 540 agtctgcgga
tgcatcaacg tttttaaacg ggtttgcggt gtaagtgcag cccgtcttac 600
accgtgcggc acaggcacta gtactgatgt cgtctacagg gcttttgata tttacaacga
660 aaaagttgct ggttttgcaa agttcctaaa aactaattgc tgtcgcttcc
aggagaagga 720 tgaggaaggc aatttattag actcttactt tgtagttaag
aggcatacta tgtctaacta 780 ccaacatgaa gagactattt ataacttggt
taaagattgt ccagcggttg ctgtccatga 840 ctttttcaag tttagagtag
atggtgacat ggtaccacat atatcacgtc agcgtctaac 900 taaatacaca
atggctgatt tagtctatgc tctacgtcat tttgatgagg gtaattgtga 960
tacattaaaa gaaatactcg tcacatacaa ttgctgtgat gatgattatt tcaataagaa
1020 ggattggtat gacttcgtag agaatcctga catcttacgc gtatatgcta
acttaggtga 1080 gcgtgtacgc caatcattat taaagactgt acaattctgc
gatgctatgc gtgatgcagg 1140 cattgtaggc gtactgacat tagataatca
ggatcttaat gggaactggt acgatttcgg 1200 tgatttcgta caagtagcac
caggctgcgg agttcctatt gtggattcat attactcatt 1260 gctgatgccc
atcctcactt tgactagggc attggctgct gagtcccata tggatgctga 1320
tctcgcaaaa ccacttatta agtgggattt gctgaaatat gattttacgg aagagagact
1380 ttgtctcttc gaccgttatt ttaaatattg ggaccagaca taccatccca
attgtattaa 1440 ctgtttggat gataggtgta tccttcattg tgcaaacttt
aatgtgttat tttctactgt 1500 gtttccacct acaagttttg gaccactagt
aagaaaaata tttgtagatg gtgttccttt 1560 tgttgtttca actggatacc
attttcgtga gttaggagtc gtacataatc aggatgtaaa 1620 cttacatagc
tcgcgtctca gtttcaagga acttttagtg tatgctgctg atccagctat 1680
gcatgcagct tctggcaatt tattgctaga taaacgcact acatgctttt cagtagctgc
1740 actaacaaac aatgttgctt ttcaaactgt caaacccggt aattttaata
aagactttta 1800 tgactttgct gtgtctaaag gtttctttaa ggaaggaagt
tctgttgaac taaaacactt 1860 cttctttgct caggatggca acgctgctat
cagtgattat gactattatc gttataatct 1920 gccaacaatg tgtgatatca
gacaactcct attcgtagtt gaagttgttg ataaatactt 1980 tgattgttac
gatggtggct gtattaatgc ca 2012
51 1877 DNA CORONAVIRUS 51
gtacttcgcg
tacagtggca ataccatatg acagcttaaa tgtttcctca gtggctttga 60
gcgtttctgc tgcgaaaagc ttgagtctct cagtacaagt gttggcaagt atgtaatcgc
120 cagcattagt ccaatcacat gttgctatcg cattgaagtc agtgacattg
tcactgccta 180 cacatgtgtt tttgtataaa ccaaaaacct gaccattagc
acataatgga aaactaatgg 240 gaggcttatg tgacttgcaa taatagctca
tacctcctag atacagttgt gtcacatcag 300 tgacatcaca acctggggca
ttgcaaacat agggattaac agacaacact aatttgtgtg 360 atgttgaaat
gacatggtca tagcagcact tgcaacatag gaatggtctc ctaatacagg 420
caccgcaacg aagtgaagtc tgtgaattgc acaatacaca agcacctaca gcctgcaaga
480 ctgtatgtgg tgtgtacata gcctcataaa actcaggttc ccagtaccgt
gaggtgttat 540 cattagttag cattacggaa tacatgtcca acatgtggcc
agtaagctca tcatgtaact 600 ttctaatgta ttgtaaatac aagtgaaaga
catcagcata ctcctgatta ggatgttttg 660 taagtgggta agcatcaata
gccagtgaca cgaacctttc aatcataagt gtaccatctg 720 ttttgacaat
atcatcgaca aaacagcctg cgcctaatat tcttgatgga tctgggtaag 780
gcaggtacac gtaatcatct ccttgtttaa ctagcattgt atgctgtgag caaaattcgt
840 gaggtccttt agtaaggtca gtctcagtcc aacattttgc ctcagacatg
aacacattat 900 tttgataata aagaactgcc ttaaagttct taatgctagc
tactaaacct tgagccgcat 960 agttactgtt atagcacaca acggcatcat
cagaaagaat catcatggag aaatgtttac 1020 gcaggtaagc gtaaaactca
tccacgaatt catgatcaac atccctattt ctatagagac 1080 actcatagag
cctgtgttgt agattgcgga catacttgtc agctatctta ttaccatcag 1140
ttgaaagaag tgcatttaca ttggctgtaa cagcttgaca aatgttaaag acactattag
1200 cataagcagt tgtagcatca ccggatgatg ttccacctgg tttaacatat
agtgagccgc 1260 cacacatgac catctcactt aatacttgcg cacactcgtt
agctaacctg tagaaacggt 1320 gtgataagtt acagcaagtg ttatgtttgc
gagcaagaac aagagaggcc attatcctaa 1380 gcatgttagg catggctctg
tcacattttg gataatccca acccataagg tgtggagttt 1440 ctacatcact
gtaaacagtt tttaacatat tatgccagcc accgtaaaac ttgcttgttc 1500
caattaccac agtagctcct ctagtggcgg ctattgactt caataatttc tgatgaaact
1560 gtctatttgt catagtacta cagatagaga caccagctac ggtgcgagct
ctattctttg 1620 cactaatggc atacttaaga ttcatttgag ttatagtagg
gatgacatta cgcttagtat 1680 acgcgaaaag tgcatcttga tcctcataac
tcattgagtc ataataaagt ctagccttac 1740 cccatttatt aaatgggaaa
ccagctgatt tatccagatt gttaacgatt acttggttgg 1800 cattaataca
gccaccatcg taacaatcaa agtatttatc aacaacttca actacgaata 1860
ggagttgtct gatatca 1877
52 2051 DNA CORONAVIRUS 52
tcaggtccaa
tcttgacaaa gtacttcatt gatgtaagct caaagccatg cgcccaaagg 60
acgaacacga ctctgtctga caatcctttc agtgtatcac tgagcatttg tactatctta
120 atacgcacta cattccaggg caagccttta tacatgagtg gtataagatg
tttaaactgg 180 tcacctggtg gaggttttgc attaactctg gtgaattctg
tgttattttc agtgtcaaca 240 taaccagtcg gtacagctac taagttaaca
cctgtagaaa atcctagctg gagaggtagg 300 ttagtaccca cagcatctct
agttgcatga cagccctcta catcaaagcc aatccacgca 360 cgaacgtgac
gaatagcttc ttcgcgggtg ataaacatat tagggtaacc attgacttgg 420
taattcattt tgaaacccat catagagatg agtctacggt aggtcatgtc ctttggtatg
480 cctggtatgt caacacataa tccttcagtc ttgaacttta tatcaacgct
gaggtgtgta 540 ggtgcctgtg taggatgaag accagtaatg atcttactac
agtccttaaa aagtccagtt 600 acattttctg cttgtaatgt agccacattg
cgacgtggta tttctagact tgtaaattgc 660 agtttgtcat aaagatctct
atcagacatt atgcacaaaa tgccaatttt tgcccttgtg 720 atagccacat
tgaagcggtt gacattacaa gagtgtgctg tttcagtagt ttgtgtgaat 780
atgacatagt catattcaga accctgtgat gaatcaacag tctgcgtagg caatcctaag
840 atttttgaag ctacagcgtt ctgtgaatta taaggtgaga taaaaacagc
ttttctccaa 900 gcaggattgc gtgtaagaaa ttctcttaca acgcctattt
gaggtctgtt gattgcagat 960 gaaacatcat gtgtaataac acctttgtag
aacattttga agcattgagc tgacttatcc 1020 ttgtgtgctt ttagcttatt
gtcataaact aaagcactca cagtgtcaac aatttcagca 1080 ggacaacggc
gacaagttcc aaggaacatg tctggaccta ttgttttcat aagtctgcac 1140
actgaattaa aatattctgg ttctagtgtg cctttagtca gcaatgtgcg gggggctggt
1200 aattgagcag gatcgccaat atagacgtag tgttttgcac gaagtctagc
attgacaaca 1260 ctcaagtcat aattagtagc catagagatt tcatcaaaga
ctacaatgtc agcagttgtt 1320 tctggcaatg catttacagt gcagaaaaca
tactgttcta gtgttgaatt cactttgaat 1380 ttatcaaaac actctacgcg
cgcacgcgca ggtatgattc tactacattt atctatgggc 1440 aaatatttta
atgccttttc acatagggca tcaacagctg catgagagca tgccgtatac 1500
actatgcgag cagatgggta atagagagca agtccgatgg caaaatgact cttaccagta
1560 ccaggtggtc cttggagtgt agagtacttt tgcatgccga ccttttgata
atttgcaaca 1620 ttgctagaaa actcatctga gatgttgagt gttgggtaca
agccagtaat tctcacatag 1680 tgctcttgtg gcactagagt aggtgcacta
agtggcatta cagtgtgaga tgtcaacaca 1740 aagtaatcac caacattcaa
cttgtatgtc gtagtacctc tgtacacaac agcatcacca 1800 tagtcacctt
tttcaaaggt gtactctcca atctgtactt tactattttt agttacacgg 1860
taaccagtaa agacatagtt tctgttcaat ggtggtctag gttttccaac ctcccatgaa
1920 agatgcaatt ctctgtcaga gagtacttcg cgtacagtgg caataccata
tgacagctta 1980 aatgtttcct cagtggcttt gagcgtttct gctgcgaaaa
gcttgagtct ctcagtacaa 2040 gtgttggcaa g 2051
53 2075 DNA CORONAVIRUS 53
tgcttgtagt tttgggtaga aggtttcaac atgtccatcc
ttacaccaaa gcatgaatga 60 aatttcagca tagtcaattg taaccttgac
cacttttgaa atcactgaca aatcttgtga 120 ctttattatc tcgacaaagt
catcaagtaa aagatcaatc acagaacaca cacattttga 180 tgaacctgtt
tgcgcatctg ttatgaagta atttttcact gtgctgtcca tagggataaa 240
atcctctaat ttaagtggtg aatcttgtga gcgcttggct aagcctatca ttaaatgaag
300 accgccaagt tgtccatgac tgaaatctcc ataaacgatg tgttcgaagg
catagccctc 360 gagcttatat cgctgtatga attcatccat agcgagctcg
agaaagtcag tttccatttg 420 tgatctgggc ttaaaatcct ctaagtctct
gctctgagta aagtaggttt caggcaactg 480 ttgaataatg ccgtctactt
tcttaaagta gttaaactgt gtttttactg attctccaat 540 taatgtgact
ccattgacgc tagcttgtgc tggtcccttt gaaggtgtta gacctttgac 600
tgaaccttct gttattaaaa caccattacg ggcgtttcta aaaaggtcta cctgtccttc
660 cactctacca tcaaacaaga cagtaagtga agaacaagca ctctcagtag
gtttcttggc 720 aatgtcagtc attgtgcaga cacctattgt agatacatgt
gctggggctt ctcttttgta 780 gtcccagatt acagtattag cagcgatatc
aacacccaaa ttattgagta tcttaatctc 840 tggcactggt ttaatgttac
gcttagccca aagctcaaat gcaacattaa caggaagtgt 900 tgtcttattt
tcaaagatct ccacatcaat accatctacc tttgtgtaaa cagcattatt 960
aatgatggaa acaggtgctt cgccggcgtg tccatcaaag tgtcctttat taacaacatt
1020 ataagccaca ttttctaaac tctgtaacct ggtaaatgta ttccacaggt
tataagtatc 1080 aaattgtttg taaatccata ggctaaatcc agcagaaatc
atcatattat atgcatccaa 1140 gtactgtcgg tactcatttg catggtgtct
gcaaacagca ccacctaaat tgcatcgtgt 1200 aatacacgta gcagatttga
gtggaacata atcaatatcc gacactactt gtttgccatg 1260 agactcacaa
ggactatcag aatagtaaaa gaaaggcaat tgctttaaat tagtaaatgc 1320
acttttatcg aaagctggag tgtggaatgc atgcttattc acatacaaac taccaccatc
1380 acagcctggt aagttcaagt ttgacaagac tcttgtgtca aacctacaca
caattgcatt 1440 ggctgggtaa cgatcaacgt tacaattcca aaacaaacaa
acaccatcag tgaatttatc 1500 gtgatgtgta gcataagaat agaagagttc
ctctattttg taagctttgt cactacatgg 1560 ctgagcatcg tagaacttcc
attctacttc agcctgaggc acacacttga tagcctttgg 1620 atttccaatg
tcatgaagaa ctggaaactt atcagcaagc aatgcagact tcacaaccat 1680
gtgttgtact tttctgcaag cagaattaac cctcagttca tctcctataa tagggtattc
1740 aacagaccaa tcaacgcgct taacaaagca ctcatggact gctaaacatc
tagtcatgat 1800 agcatcacaa ctagccacat gtgcatttcc atgtacctgg
caatgttggt catggttact 1860 ctgaaggtta cccgtaaagc cccactgctg
aacatcaatc ataaatgggt tatagacata 1920 gtcaaaaccc acagaatgat
tccagcaggc ataagtatct gatgaagtag aaaagcaagt 1980 tgcacgtttg
tcacacagac aacacgttct ttcaggtcca atcttgacaa agtacttcat 2040
tgatgtaagc tcaaagccat gcgcccaaag gacga 2075
54 1891 DNA CORONAVIRUS 54
aagattcacc acttaaatta gaggatttta tccctatgga cagcacagtg
aaaaattact 60 tcataacaga tgcgcaaaca ggttcatcaa aatgtgtgtg
ttctgtgatt gatcttttac 120 ttgatgactt tgtcgagata ataaagtcac
aagatttgtc agtgatttca aaagtggtca 180 aggttacaat tgactatgct
gaaatttcat tcatgctttg gtgtaaggat ggacatgttg 240 aaaccttcta
cccaaaacta caagcaagtc aagcgtggca accaggtgtt gcgatgccta 300
acttgtacaa gatgcaaaga atgcttcttg aaaagtgtga ccttcagaat tatggtgaaa
360 atgctgttat accaaaagga ataatgatga atgtcgcaaa gtatactcaa
ctgtgtcaat 420 acttaaatac acttacttta gctgtaccct acaacatgag
agttattcac tttggtgctg 480 gctctgataa aggagttgca ccaggtacag
ctgtgctcag acaatggttg ccaactggca 540 cactacttgt cgattcagat
cttaatgact tcgtctccga cgcagattct actttaattg 600 gagactgtgc
aacagtacat acggctaata aatgggacct tattattagc gatatgtatg 660
accctaggac caaacatgtg acaaaagaga atgactctaa agaagggttt ttcacttatc
720 tgtgtggatt tataaagcaa aaactagccc tgggtggttc tatagctgta
aagataacag 780 agcattcttg gaatgctgac ctttacaagc ttatgggcca
tttctcatgg tggacagctt 840 ttgttacaaa tgtaaatgca tcatcatcgg
aagcattttt aattggggct aactatcttg 900 gcaagccgaa ggaacaaatt
gatggctata ccatgcatgc taactacatt ttctggagga 960 acacaaatcc
tatccagttg tcttcctatt cactctttga catgagcaaa tttcctctta 1020
aattaagagg aactgctgta atgtctctta aggagaatca aatcaatgat atgatttatt
1080 ctcttctgga aaaaggtagg cttatcatta gagaaaacaa cagagttgtg
gtttcaagtg 1140 atattcttgt taacaactaa acgaacatgt ttattttctt
attatttctt actctcacta 1200 gtggtagtga ccttgaccgg tgcaccactt
ttgatgatgt tcaagctcct aattacactc 1260 aacatacttc atctatgagg
ggggtttact atcctgatga aatttttaga tcagacactc 1320 tttatttaac
tcaggattta tttcttccat tttattctaa tgttacaggg tttcatacta 1380
ttaatcatac gtttggcaac cctgtcatac cttttaagga tggtatttat tttgctgcca
1440 cagagaaatc aaatgttgtc cgtggttggg tttttggttc taccatgaac
aacaagtcac 1500 agtcggtgat tattattaac aattctacta atgttgttat
acgagcatgt aactttgaat 1560 tgtgtgacaa ccctttcttt gctgtttcta
aacccatggg tacacagaca catactatga 1620 tattcgataa tgcatttaat
tgcactttcg agtacatatc tgatgccttt tcgcttgatg 1680 tttcagaaaa
gtcaggtaat tttaaacact tacgagagtt tgtgtttaaa aataaagatg 1740
ggtttctcta tgtttataag ggctatcaac ctatagatgt agttcgtgat ctaccttctg
1800 gttttaacac tttgaaacct atttttaagt tgcctcttgg tattaacatt
acaaatttta 1860 gagccattct tacagccttt tcacctgctc a 1891 55 32 DNA
artificial sequence N sens primer 55 cccatatgtc tgataatgga
ccccaatcaa ac 32 56 32 DNA artificial sequence N antisens primer 56
cccccgggtg cctgagttga atcagcagaa gc 32 57 31 DNA artificial
sequence Sc sens primer 57 cccatatgag tgaccttgac cggtgcacca c 31 58
30 DNA artificial sequence SL sens primer 58 cccatatgaa accttgcacc
ccacctgctc 30 59 33 DNA Sc and SL antisens primer 59 cccccgggtt
taatatattg ctcatatttt ccc 33 60 16 DNA Sens set 1 primer 60
ggcatcgtat gggttg 16 61 16 DNA Antisens set 2 (28774-28759) primer
61 cagtttcacc acctcc 16 62 16 DNA Sens set 2 (28375-28390) primer
62 ggctactacc gaagag 16 63 16 DNA Antisens set 2
(28702-28687)primer 63 aattaccgcg actacg 16 64 26 DNA Probe 1/set 1
(28561-28586) 64 ggcacccgca atcctaataa caatgc 26 65 21 DNA Probe
2/set 1 (28588-28608) 65 gccaccgtgc tacaacttcc t 21 66 23 DNA Probe
1/set 2 /probe N/FL (28541-28563) 66 atacacccaa agaccacatt ggc 23
67 25 DNA Probe 2/set 2/probe SARS/N/LC705 (28565-28589) 67
cccgcaatcc taataacaat gctgc 25 68 30 DNA artificial sequence Anchor
primer 14T 68 agatgaattc ggtacctttt tttttttttt 30 69 13 PRT
artificial sequence M2-14 peptide 69 Ala Asp Asn Gly Thr Ile Thr
Val Glu Glu Leu Lys Gln 1 5 10 70 12 PRT artificial sequence E1-12
peptide 70 Met Tyr Ser Phe Val Ser Glu Glu Thr Gly Thr Leu 1 5 10
71 24 PRT artificial sequence E53-72 peptide 71 Lys Pro Thr Val Tyr
Val Tyr Ser Arg Val Lys Asn Leu Asn Ser Ser 1 5 10 15 Glu Gly Val
Pro Asp Leu Leu Val 20 72 153 DNA CORONAVIRUS 72 gatattaggt
ttttacctac ccaggaaaag ccaaccaacc tcgatctctt gtagatctgt 60
tctctaaacg aactttaaaa tctgtgtagc tgtcgctcgg ctgcatgcct agtgcaccta
120 cgcagtataa acaataataa attttactgt cgt 153
73 410 DNA CORONAVIRUS 73
ttctccagac aacttcaaaa ttccatgagt ggagcttctg ctgattcaac
tcaggcataa 60 acactcatga tgaccacaca aggcagatgg gctatgtaaa
cgttttcgca attccgttta 120 cgatacatag tctactcttg tgcagaatga
attctcgtaa ctaaacagca caagtaggtt 180 tagttaactt taatctcaca
tagcaatctt taatcaatgt gtaacattag ggaggacttg 240 aaagagccac
cacattttca tcgaggccac gcggagtacg atcgagggta cagtgaataa 300
tgctagggag agctgcctat atggaagagc cctaatgtgt aaaattaatt ttagtagtgc
tatccccatg tgattttaat agcttcttag gagaatgaca aaaaaaaaaa 410
74 4382 PRT CORONAVIRUS 74
Met Glu Ser Leu Val Leu Gly Val Asn Glu Lys
Thr His Val Gln Leu 1 5 10 15 Ser Leu Pro Val Leu Gln Val Arg Asp
Val Leu Val Arg Gly Phe Gly 20 25 30 Asp Ser Val Glu Glu Ala Leu
Ser Glu Ala Arg Glu His Leu Lys Asn 35 40 45 Gly Thr Cys Gly Leu
Val Glu Leu Glu Lys Gly Val Leu Pro Gln Leu 50 55 60 Glu Gln Pro
Tyr Val Phe Ile Lys Arg Ser Asp Ala Leu Ser Thr Asn 65 70 75 80 His
Gly His Lys Val Val Glu Leu Val Ala Glu Met Asp Gly Ile Gln 85 90 95
Tyr Gly Arg Ser Gly Ile Thr Leu Gly Val Leu Val Pro His Val Gly
100 105 110 Glu Thr Pro Ile Ala Tyr Arg Asn Val Leu Leu Arg Lys Asn
Gly Asn 115 120 125 Lys Gly Ala Gly Gly His Ser Tyr Gly Ile Asp Leu
Lys Ser Tyr Asp 130 135 140 Leu Gly Asp Glu Leu Gly Thr Asp Pro Ile
Glu Asp Tyr Glu Gln Asn 145 150 155 160 Trp Asn Thr Lys His Gly Ser
Gly Ala Leu Arg Glu Leu Thr Arg Glu 165 170 175 Leu Asn Gly Gly Ala
Val Thr Arg Tyr Val Asp Asn Asn Phe Cys Gly 180 185 190 Pro Asp Gly
Tyr Pro Leu Asp Cys Ile Lys Asp Phe Leu Ala Arg Ala 195 200 205 Gly
Lys Ser Met Cys Thr Leu Ser Glu Gln Leu Asp Tyr Ile Glu Ser 210 215
220 Lys Arg Gly Val Tyr Cys Cys Arg Asp His Glu His Glu Ile Ala Trp
225 230 235 240 Phe Thr Glu Arg Ser Asp Lys Ser Tyr Glu His Gln Thr
Pro Phe Glu 245 250 255 Ile Lys Ser Ala Lys Lys Phe Asp Thr Phe Lys
Gly Glu Cys Pro Lys 260 265 270 Phe Val Phe Pro Leu Asn Ser Lys Val
Lys Val Ile Gln Pro Arg Val 275 280 285 Glu Lys Lys Lys Thr Glu Gly
Phe Met Gly Arg Ile Arg Ser Val Tyr 290 295 300 Pro Val Ala Ser Pro
Gln Glu Cys Asn Asn Met His Leu Ser Thr Leu 305 310 315 320 Met Lys
Cys Asn His Cys Asp Glu Val Ser Trp Gln Thr Cys Asp Phe 325 330 335
Leu Lys Ala Thr Cys Glu His Cys Gly Thr Glu Asn Leu Val Ile Glu 340
345 350 Gly Pro Thr Thr Cys Gly Tyr Leu Pro Thr Asn Ala Val Val Lys
Met 355 360 365 Pro Cys Pro Ala Cys Gln Asp Pro Glu Ile Gly Pro Glu
His Ser Val 370 375 380 Ala Asp Tyr His Asn His Ser Asn Ile Glu Thr
Arg Leu Arg Lys Gly 385 390 395 400 Gly Arg Thr Arg Cys Phe Gly Gly
Cys Val Phe Ala Tyr Val Gly Cys 405 410 415 Tyr Asn Lys Arg Ala Tyr
Trp Val Pro Arg Ala Ser Ala Asp Ile Gly 420 425 430 Ser Gly His Thr
Gly Ile Thr Gly Asp Asn Val Glu Thr Leu Asn Glu 435 440 445 Asp Leu
Leu Glu Ile Leu Ser Arg Glu Arg Val Asn Ile Asn Ile Val 450 455 460
Gly Asp Phe His Leu Asn Glu Glu Val Ala Ile Ile Leu Ala Ser Phe 465
470 475 480 Ser Ala Ser Thr Ser Ala Phe Ile Asp Thr Ile Lys Ser Leu
Asp Tyr 485 490 495 Lys Ser Phe Lys Thr Ile Val Glu Ser Cys Gly Asn
Tyr Lys Val Thr 500 505 510 Lys Gly Lys Pro Val Lys Gly Ala Trp Asn
Ile Gly Gln Gln Arg Ser 515 520 525 Val Leu Thr Pro Leu Cys Gly Phe
Pro Ser Gln Ala Ala Gly Val Ile 530 535 540 Arg Ser Ile Phe Ala Arg
Thr Leu Asp Ala Ala Asn His Ser Ile Pro 545 550 555 560 Asp Leu Gln
Arg Ala Ala Val Thr Ile Leu Asp Gly Ile Ser Glu Gln 565 570 575 Ser
Leu Arg Leu Val Asp Ala Met Val Tyr Thr Ser Asp Leu Leu Thr 580 585
590 Asn Ser Val Ile Ile Met Ala Tyr Val Thr Gly Gly Leu Val Gln Gln
595 600 605 Thr Ser Gln Trp Leu Ser Asn Leu Leu Gly Thr Thr Val Glu
Lys Leu 610 615 620 Arg Pro Ile Phe Glu Trp Ile Glu Ala Lys Leu Ser
Ala Gly Val Glu 625 630 635 640 Phe Leu Lys Asp Ala Trp Glu Ile Leu
Lys Phe Leu Ile Thr Gly Val 645 650 655 Phe Asp Ile Val Lys Gly Gln
Ile Gln Val Ala Ser Asp Asn Ile Lys 660 665 670 Asp Cys Val Lys Cys
Phe Ile Asp Val Val Asn Lys Ala Leu Glu Met 675 680 685 Cys Ile Asp
Gln Val Thr Ile Ala Gly Ala Lys Leu Arg Ser Leu Asn 690 695 700 Leu
Gly Glu Val Phe Ile Ala Gln Ser Lys Gly Leu Tyr Arg Gln Cys 705 710
715 720 Ile Arg Gly Lys Glu Gln Leu Gln Leu Leu Met Pro Leu Lys Ala
Pro 725 730 735 Lys Glu Val Thr Phe Leu Glu Gly Asp Ser His Asp Thr
Val Leu Thr 740 745 750 Ser Glu Glu Val Val Leu Lys Asn Gly Glu Leu
Glu Ala Leu Glu Thr 755 760 765 Pro Val Asp Ser Phe Thr Asn Gly Ala
Ile Val Gly Thr Pro Val Cys 770 775 780 Val Asn Gly Leu Met Leu Leu
Glu Ile Lys Asp Lys Glu Gln Tyr Cys 785 790 795 800 Ala Leu Ser Pro
Gly Leu Leu Ala Thr Asn Asn Val Phe Arg Leu Lys 805 810 815 Gly Gly
Ala Pro Ile Lys Gly Val Thr Phe Gly Glu Asp Thr Val Trp 820 825 830
Glu Val Gln Gly Tyr Lys Asn Val Arg Ile Thr Phe Glu Leu Asp Glu 835
840 845 Arg Val Asp Lys Val Leu Asn Glu Lys Cys Ser Val Tyr Thr Val
Glu 850 855 860 Ser Gly Thr Glu Val Thr Glu Phe Ala Cys Val Val Ala
Glu Ala Val 865 870 875 880 Val Lys Thr Leu Gln Pro Val Ser Asp Leu
Leu Thr Asn Met Gly Ile 885 890 895 Asp Leu Asp Glu Trp Ser Val Ala
Thr Phe Tyr Leu Phe Asp Asp Ala 900 905 910 Gly Glu Glu Asn Phe Ser
Ser Arg Met Tyr Cys Ser Phe Tyr Pro Pro 915 920 925 Asp Glu Glu Glu
Glu Asp Asp Ala Glu Cys Glu Glu Glu Glu Ile Asp 930 935 940 Glu Thr
Cys Glu His Glu Tyr Gly Thr Glu Asp Asp Tyr Gln Gly Leu 945 950 955
960 Pro Leu Glu Phe Gly Ala Ser Ala Glu Thr Val Arg Val Glu Glu Glu
965 970 975 Glu Glu Glu Asp Trp Leu Asp Asp Thr Thr Glu Gln Ser Glu
Ile Glu 980 985 990 Pro Glu Pro Glu Pro Thr Pro Glu Glu Pro Val Asn
Gln Phe Thr Gly 995 1000 1005 Tyr Leu Lys Leu Thr Asp Asn Val Ala
Ile Lys Cys Val Asp Ile 1010 1015 1020 Val Lys Glu Ala Gln Ser Ala
Asn Pro Met Val Ile Val Asn Ala 1025 1030 1035 Ala Asn Ile His Leu
Lys His Gly Gly Gly Val Ala Gly Ala Leu 1040 1045 1050 Asn Lys Ala
Thr Asn Gly Ala Met Gln Lys Glu Ser Asp Asp Tyr 1055 1060 1065 Ile
Lys Leu Asn Gly Pro Leu Thr Val Gly Gly Ser Cys Leu Leu 1070 1075
1080 Ser Gly His Asn Leu Ala Lys Lys Cys Leu His Val Val Gly Pro
1085 1090 1095 Asn Leu Asn Ala Gly Glu Asp Ile Gln Leu Leu Lys Ala
Ala Tyr 1100 1105 1110 Glu Asn Phe Asn Ser Gln Asp Ile Leu Leu Ala
Pro Leu Leu Ser 1115 1120 1125 Ala Gly Ile Phe Gly Ala Lys Pro Leu
Gln Ser Leu Gln Val Cys 1130 1135 1140 Val Gln Thr Val Arg Thr Gln
Val Tyr Ile Ala Val Asn Asp Lys 1145 1150 1155 Ala Leu Tyr Glu Gln
Val Val Met Asp Tyr Leu Asp Asn Leu Lys 1160 1165 1170 Pro Arg Val
Glu Ala Pro Lys Gln Glu Glu Pro Pro Asn Thr Glu 1175 1180 1185 Asp
Ser Lys Thr Glu Glu Lys Ser Val Val Gln Lys Pro Val Asp 1190 1195
1200 Val Lys Pro Lys Ile Lys Ala Cys Ile Asp Glu Val Thr Thr Thr
1205 1210 1215 Leu Glu Glu Thr Lys Phe Leu Thr Asn Lys Leu Leu Leu
Phe Ala 1220 1225 1230 Asp Ile Asn Gly Lys Leu Tyr His Asp Ser Gln
Asn Met Leu Arg 1235 1240 1245 Gly Glu Asp Met Ser Phe Leu Glu Lys
Asp Ala Pro Tyr Met Val 1250 1255 1260 Gly Asp Val Ile Thr Ser Gly
Asp Ile Thr Cys Val Val Ile Pro 1265 1270 1275 Ser Lys Lys Ala Gly
Gly Thr Thr Glu Met Leu Ser Arg Ala Leu 1280 1285 1290 Lys Lys Val
Pro Val Asp Glu Tyr Ile Thr Thr Tyr Pro Gly Gln 1295 1300 1305 Gly
Cys Ala Gly Tyr Thr Leu Glu Glu Ala Lys Thr Ala Leu Lys 1310 1315
1320 Lys Cys Lys Ser Ala Phe Tyr Val Leu Pro Ser Glu Ala Pro Asn
1325 1330 1335 Ala Lys Glu Glu Ile Leu Gly Thr Val Ser Trp Asn Leu
Arg Glu 1340 1345 1350 Met Leu Ala His Ala Glu Glu Thr Arg Lys Leu
Met Pro Ile Cys 1355 1360 1365 Met Asp Val Arg Ala Ile Met Ala Thr
Ile Gln Arg Lys Tyr Lys 1370 1375 1380 Gly Ile Lys Ile Gln Glu Gly
Ile Val Asp Tyr Gly Val Arg Phe 1385 1390 1395 Phe Phe Tyr Thr Ser
Lys Glu Pro Val Ala Ser Ile Ile Thr Lys 1400 1405 1410 Leu Asn Ser
Leu Asn Glu Pro Leu Val Thr Met Pro Ile Gly Tyr 1415 1420 1425 Val
Thr His Gly Phe Asn Leu Glu Glu Ala Ala Arg Cys Met Arg 1430 1435
1440 Ser Leu Lys Ala Pro Ala Val Val Ser Val Ser Ser Pro Asp Ala
1445 1450 1455 Val Thr Thr Tyr Asn Gly Tyr Leu Thr Ser Ser Ser Lys
Thr Ser 1460 1465 1470 Glu Glu His Phe Val Glu Thr Val Ser Leu Ala
Gly Ser Tyr Arg 1475 1480 1485 Asp Trp Ser Tyr Ser Gly Gln Arg Thr
Glu Leu Gly Val Glu Phe 1490 1495 1500 Leu Lys Arg Gly Asp Lys Ile
Val Tyr His Thr Leu Glu Ser Pro 1505 1510 1515 Val Glu Phe His Leu
Asp Gly Glu Val Leu Ser Leu Asp Lys Leu 1520 1525 1530 Lys Ser Leu
Leu Ser Leu Arg Glu Val Lys Thr Ile Lys Val Phe 1535 1540 1545 Thr
Thr Val Asp Asn Thr Asn Leu His Thr Gln Leu Val Asp Met 1550 1555
1560 Ser Met Thr Tyr Gly Gln Gln Phe Gly Pro Thr Tyr Leu Asp Gly
1565 1570 1575 Ala Asp Val Thr Lys Ile Lys Pro His Val Asn His Glu
Gly Lys 1580 1585 1590 Thr Phe Phe Val Leu Pro Ser Asp Asp Thr Leu
Arg Ser Glu Ala 1595 1600 1605 Phe Glu Tyr Tyr His Thr Leu Asp Glu
Ser Phe Leu Gly Arg Tyr 1610 1615 1620 Met Ser Ala Leu Asn His Thr
Lys Lys Trp Lys Phe Pro Gln Val 1625 1630 1635 Gly Gly Leu Thr Ser
Ile Lys Trp Ala Asp Asn Asn Cys Tyr Leu 1640 1645 1650 Ser Ser Val
Leu Leu Ala Leu Gln Gln Leu Glu Val Lys Phe Asn 1655 1660 1665 Ala
Pro Ala Leu Gln Glu Ala Tyr Tyr Arg Ala Arg Ala Gly Asp 1670 1675
1680 Ala Ala Asn Phe Cys Ala Leu Ile Leu Ala Tyr Ser Asn Lys Thr
1685 1690 1695 Val Gly Glu Leu Gly Asp Val Arg Glu Thr Met Thr His
Leu Leu 1700 1705 1710 Gln His Ala Asn Leu Glu Ser Ala Lys Arg Val
Leu Asn Val Val 1715 1720 1725 Cys Lys His Cys Gly Gln Lys Thr Thr
Thr Leu Thr Gly Val Glu 1730 1735 1740 Ala Val Met Tyr Met Gly Thr
Leu Ser Tyr Asp Asn Leu Lys Thr 1745 1750 1755 Gly Val Ser Ile Pro
Cys Val Cys Gly Arg Asp Ala Thr Gln Tyr 1760 1765 1770 Leu Val Gln
Gln Glu Ser Ser Phe Val Met Met Ser Ala Pro Pro 1775 1780 1785 Ala
Glu Tyr Lys Leu Gln Gln Gly Thr Phe Leu Cys Ala Asn Glu 1790 1795
1800 Tyr Thr Gly Asn Tyr Gln Cys Gly His Tyr Thr His Ile Thr Ala
1805 1810 1815 Lys Glu Thr Leu Tyr Arg Ile Asp Gly Ala His Leu Thr
Lys Met 1820 1825 1830 Ser Glu Tyr Lys Gly Pro Val Thr Asp Val Phe
Tyr Lys Glu Thr 1835 1840 1845 Ser Tyr Thr Thr Thr Ile Lys Pro Val
Ser Tyr Lys Leu Asp Gly 1850 1855 1860 Val Thr Tyr Thr Glu Ile Glu
Pro Lys Leu Asp Gly Tyr Tyr Lys 1865 1870 1875 Lys Asp Asn Ala Tyr
Tyr Thr Glu Gln Pro Ile Asp Leu Val Pro 1880 1885 1890 Thr Gln Pro
Leu Pro Asn Ala Ser Phe Asp Asn Phe Lys Leu Thr 1895 1900 1905 Cys
Ser Asn Thr Lys Phe Ala Asp Asp Leu Asn Gln Met Thr Gly 1910 1915
1920 Phe Thr Lys Pro Ala Ser Arg Glu Leu Ser Val Thr Phe Phe Pro
1925 1930 1935 Asp Leu Asn Gly Asp Val Val Ala Ile Asp Tyr Arg His
Tyr Ser 1940 1945 1950 Ala Ser Phe Lys Lys Gly Ala Lys Leu Leu His
Lys Pro Ile Val 1955 1960 1965 Trp His Ile Asn Gln Ala Thr Thr Lys
Thr Thr Phe Lys Pro Asn 1970 1975 1980 Thr Trp Cys Leu Arg Cys Leu
Trp Ser Thr Lys Pro Val Asp Thr 1985 1990 1995 Ser Asn Ser Phe Glu
Val Leu Ala Val Glu Asp Thr Gln Gly Met 2000 2005 2010 Asp Asn Leu
Ala Cys Glu Ser Gln Gln Pro Thr Ser Glu Glu Val 2015 2020 2025 Val
Glu Asn Pro Thr Ile Gln Lys Glu Val Ile Glu Cys Asp Val 2030 2035
2040 Lys Thr Thr Glu Val Val Gly Asn Val Ile Leu Lys Pro Ser Asp
2045 2050 2055 Glu Gly Val Lys Val Thr Gln Glu Leu Gly His Glu Asp
Leu Met 2060 2065 2070 Ala Ala Tyr Val Glu Asn Thr Ser Ile Thr Ile
Lys Lys Pro Asn 2075 2080 2085 Glu Leu Ser Leu Ala Leu Gly Leu Lys
Thr Ile Ala Thr His Gly 2090 2095 2100 Ile Ala Ala Ile Asn Ser Val
Pro Trp Ser Lys Ile Leu Ala Tyr 2105 2110 2115 Val Lys Pro Phe Leu
Gly Gln Ala Ala Ile Thr Thr Ser Asn Cys 2120 2125 2130 Ala Lys Arg
Leu Ala Gln Arg Val Phe Asn Asn Tyr Met Pro Tyr 2135 2140 2145 Val
Phe Thr Leu Leu Phe Gln Leu Cys Thr Phe Thr Lys Ser Thr 2150 2155
2160 Asn Ser Arg Ile Arg Ala Ser Leu Pro Thr Thr Ile Ala Lys Asn
2165 2170 2175 Ser Val Lys Ser Val Ala Lys Leu Cys Leu Asp Ala Gly
Ile Asn 2180 2185 2190 Tyr Val Lys Ser Pro Lys Phe Ser Lys Leu Phe
Thr Ile Ala Met 2195 2200 2205 Trp Leu Leu Leu Leu Ser Ile Cys Leu
Gly Ser Leu Ile Cys Val 2210 2215 2220 Thr Ala Ala Phe Gly Val Leu
Leu Ser Asn Phe Gly Ala Pro Ser 2225 2230 2235 Tyr Cys Asn Gly Val
Arg Glu Leu Tyr Leu Asn Ser Ser Asn Val 2240 2245 2250 Thr Thr Met
Asp Phe Cys Glu Gly Ser Phe Pro Cys Ser Ile Cys 2255 2260 2265 Leu
Ser Gly Leu Asp Ser Leu Asp Ser Tyr Pro Ala Leu Glu Thr 2270 2275
2280 Ile Gln Val Thr Ile Ser Ser Tyr Lys Leu Asp Leu Thr Ile Leu
2285 2290 2295 Gly Leu Ala Ala Glu Trp Val Leu Ala Tyr Met Leu Phe
Thr Lys 2300 2305 2310 Phe Phe Tyr Leu Leu Gly Leu Ser Ala Ile Met
Gln Val Phe Phe 2315 2320 2325 Gly Tyr Phe Ala Ser His Phe Ile Ser
Asn Ser Trp Leu Met Trp 2330 2335 2340 Phe Ile Ile Ser Ile Val Gln
Met Ala Pro Val Ser Ala Met Val 2345 2350 2355 Arg Met Tyr Ile Phe
Phe Ala Ser Phe Tyr Tyr Ile Trp Lys Ser 2360 2365 2370 Tyr Val His
Ile Met Asp Gly Cys Thr Ser Ser Thr Cys Met Met 2375 2380 2385 Cys
Tyr Lys Arg Asn Arg Ala Thr Arg Val Glu Cys Thr Thr Ile 2390 2395
2400 Val Asn Gly Met Lys Arg Ser Phe Tyr Val Tyr Ala Asn Gly Gly
2405 2410 2415 Arg Gly Phe Cys Lys Thr His Asn Trp Asn Cys Leu Asn
Cys Asp 2420 2425 2430 Thr Phe Cys Thr Gly Ser Thr Phe Ile Ser Asp
Glu Val Ala Arg 2435 2440 2445 Asp Leu Ser Leu Gln Phe Lys Arg Pro
Ile Asn Pro Thr Asp Gln 2450 2455 2460 Ser Ser Tyr Ile Val Asp Ser
Val Ala Val Lys Asn Gly Ala Leu 2465 2470 2475 His Leu Tyr Phe Asp
Lys Ala Gly Gln Lys Thr Tyr Glu Arg His 2480 2485 2490 Pro Leu Ser
His Phe Val Asn Leu Asp Asn Leu Arg Ala Asn Asn 2495 2500 2505 Thr
Lys Gly Ser Leu Pro Ile Asn Val Ile Val Phe Asp Gly Lys 2510 2515 2520 Ser Lys Cys Asp
Glu Ser Ala Ser Lys Ser Ala Ser Val Tyr Tyr 2525 2530 2535 Ser Gln
Leu Met Cys Gln Pro Ile Leu Leu Leu Asp Gln Ala Leu 2540 2545 2550
Val Ser Asp Val Gly Asp Ser Thr Glu Val Ser Val Lys Met Phe 2555
2560 2565 Asp Ala Tyr Val Asp Thr Phe Ser Ala Thr Phe Ser Val Pro
Met 2570 2575 2580 Glu Lys Leu Lys Ala Leu Val Ala Thr Ala His Ser
Glu Leu Ala 2585 2590 2595 Lys Gly Val Ala Leu Asp Gly Val Leu Ser
Thr Phe Val Ser Ala 2600 2605 2610 Ala Arg Gln Gly Val Val Asp Thr
Asp Val Asp Thr Lys Asp Val 2615 2620 2625 Ile Glu Cys Leu Lys Leu
Ser His His Ser Asp Leu Glu Val Thr 2630 2635 2640 Gly Asp Ser Cys
Asn Asn Phe Met Leu Thr Tyr Asn Lys Val Glu 2645 2650 2655 Asn Met
Thr Pro Arg Asp Leu Gly Ala Cys Ile Asp Cys Asn Ala 2660 2665 2670
Arg His Ile Asn Ala Gln Val Ala Lys Ser His Asn Val Ser Leu 2675
2680 2685 Ile Trp Asn Val Lys Asp Tyr Met Ser Leu Ser Glu Gln Leu
Arg 2690 2695 2700 Lys Gln Ile Arg Ser Ala Ala Lys Lys Asn Asn Ile
Pro Phe Arg 2705 2710 2715 Leu Thr Cys Ala Thr Thr Arg Gln Val Val
Asn Val Ile Thr Thr 2720 2725 2730 Lys Ile Ser Leu Lys Gly Gly Lys
Ile Val Ser Thr Cys Phe Lys 2735 2740 2745 Leu Met Leu Lys Ala Thr
Leu Leu Cys Val Leu Ala Ala Leu Val 2750 2755 2760 Cys Tyr Ile Val
Met Pro Val His Thr Leu Ser Ile His Asp Gly 2765 2770 2775 Tyr Thr
Asn Glu Ile Ile Gly Tyr Lys Ala Ile Gln Asp Gly Val 2780 2785 2790
Thr Arg Asp Ile Ile Ser Thr Asp Asp Cys Phe Ala Asn Lys His 2795
2800 2805 Ala Gly Phe Asp Ala Trp Phe Ser Gln Arg Gly Gly Ser Tyr
Lys 2810 2815 2820 Asn Asp Lys Ser Cys Pro Val Val Ala Ala Ile Ile
Thr Arg Glu 2825 2830 2835 Ile Gly Phe Ile Val Pro Gly Leu Pro Gly
Thr Val Leu Arg Ala 2840 2845 2850 Ile Asn Gly Asp Phe Leu His Phe
Leu Pro Arg Val Phe Ser Ala 2855 2860 2865 Val Gly Asn Ile Cys Tyr
Thr Pro Ser Lys Leu Ile Glu Tyr Ser 2870 2875 2880 Asp Phe Ala Thr
Ser Ala Cys Val Leu Ala Ala Glu Cys Thr Ile 2885 2890 2895 Phe Lys
Asp Ala Met Gly Lys Pro Val Pro Tyr Cys Tyr Asp Thr 2900 2905 2910
Asn Leu Leu Glu Gly Ser Ile Ser Tyr Ser Glu Leu Arg Pro Asp 2915
2920 2925 Thr Arg Tyr Val Leu Met Asp Gly Ser Ile Ile Gln Phe Pro
Asn 2930 2935 2940 Thr Tyr Leu Glu Gly Ser Val Arg Val Val Thr Thr
Phe Asp Ala 2945 2950 2955 Glu Tyr Cys Arg His Gly Thr Cys Glu Arg
Ser Glu Val Gly Ile 2960 2965 2970 Cys Leu Ser Thr Ser Gly Arg Trp
Val Leu Asn Asn Glu His Tyr 2975 2980 2985 Arg Ala Leu Ser Gly Val
Phe Cys Gly Val Asp Ala Met Asn Leu 2990 2995 3000 Ile Ala Asn Ile
Phe Thr Pro Leu Val Gln Pro Val Gly Ala Leu 3005 3010 3015 Asp Val
Ser Ala Ser Val Val Ala Gly Gly Ile Ile Ala Ile Leu 3020 3025 3030
Val Thr Cys Ala Ala Tyr Tyr Phe Met Lys Phe Arg Arg Val Phe 3035
3040 3045 Gly Glu Tyr Asn His Val Val Ala Ala Asn Ala Leu Leu Phe
Leu 3050 3055 3060 Met Ser Phe Thr Ile Leu Cys Leu Val Pro Ala Tyr
Ser Phe Leu 3065 3070 3075 Pro Gly Val Tyr Ser Val Phe Tyr Leu Tyr
Leu Thr Phe Tyr Phe 3080 3085 3090 Thr Asn Asp Val Ser Phe Leu Ala
His Leu Gln Trp Phe Ala Met 3095 3100 3105 Phe Ser Pro Ile Val Pro
Phe Trp Ile Thr Ala Ile Tyr Val Phe 3110 3115 3120 Cys Ile Ser Leu
Lys His Cys His Trp Phe Phe Asn Asn Tyr Leu 3125 3130 3135 Arg Lys
Arg Val Met Phe Asn Gly Val Thr Phe Ser Thr Phe Glu 3140 3145 3150
Glu Ala Ala Leu Cys Thr Phe Leu Leu Asn Lys Glu Met Tyr Leu 3155
3160 3165 Lys Leu Arg Ser Glu Thr Leu Leu Pro Leu Thr Gln Tyr Asn
Arg 3170 3175 3180 Tyr Leu Ala Leu Tyr Asn Lys Tyr Lys Tyr Phe Ser
Gly Ala Leu 3185 3190 3195 Asp Thr Thr Ser Tyr Arg Glu Ala Ala Cys
Cys His Leu Ala Lys 3200 3205 3210 Ala Leu Asn Asp Phe Ser Asn Ser
Gly Ala Asp Val Leu Tyr Gln 3215 3220 3225 Pro Pro Gln Thr Ser Ile
Thr Ser Ala Val Leu Gln Ser Gly Phe 3230 3235 3240 Arg Lys Met Ala
Phe Pro Ser Gly Lys Val Glu Gly Cys Met Val 3245 3250 3255 Gln Val
Thr Cys Gly Thr Thr Thr Leu Asn Gly Leu Trp Leu Asp 3260 3265 3270
Asp Thr Val Tyr Cys Pro Arg His Val Ile Cys Thr Ala Glu Asp 3275
3280 3285 Met Leu Asn Pro Asn Tyr Glu Asp Leu Leu Ile Arg Lys Ser
Asn 3290 3295 3300 His Ser Phe Leu Val Gln Ala Gly Asn Val Gln Leu
Arg Val Ile 3305 3310 3315 Gly His Ser Met Gln Asn Cys Leu Leu Arg
Leu Lys Val Asp Thr 3320 3325 3330 Ser Asn Pro Lys Thr Pro Lys Tyr
Lys Phe Val Arg Ile Gln Pro 3335 3340 3345 Gly Gln Thr Phe Ser Val
Leu Ala Cys Tyr Asn Gly Ser Pro Ser 3350 3355 3360 Gly Val Tyr Gln
Cys Ala Met Arg Pro Asn His Thr Ile Lys Gly 3365 3370 3375 Ser Phe
Leu Asn Gly Ser Cys Gly Ser Val Gly Phe Asn Ile Asp 3380 3385 3390
Tyr Asp Cys Val Ser Phe Cys Tyr Met His His Met Glu Leu Pro 3395
3400 3405 Thr Gly Val His Ala Gly Thr Asp Leu Glu Gly Lys Phe Tyr
Gly 3410 3415 3420 Pro Phe Val Asp Arg Gln Thr Ala Gln Ala Ala Gly
Thr Asp Thr 3425 3430 3435 Thr Ile Thr Leu Asn Val Leu Ala Trp Leu
Tyr Ala Ala Val Ile 3440 3445 3450 Asn Gly Asp Arg Trp Phe Leu Asn
Arg Phe Thr Thr Thr Leu Asn 3455 3460 3465 Asp Phe Asn Leu Val Ala
Met Lys Tyr Asn Tyr Glu Pro Leu Thr 3470 3475 3480 Gln Asp His Val
Asp Ile Leu Gly Pro Leu Ser Ala Gln Thr Gly 3485 3490 3495 Ile Ala
Val Leu Asp Met Cys Ala Ala Leu Lys Glu Leu Leu Gln 3500 3505 3510
Asn Gly Met Asn Gly Arg Thr Ile Leu Gly Ser Thr Ile Leu Glu 3515
3520 3525 Asp Glu Phe Thr Pro Phe Asp Val Val Arg Gln Cys Ser Gly
Val 3530 3535 3540 Thr Phe Gln Gly Lys Phe Lys Lys Ile Val Lys Gly
Thr His His 3545 3550 3555 Trp Met Leu Leu Thr Phe Leu Thr Ser Leu
Leu Ile Leu Val Gln 3560 3565 3570 Ser Thr Gln Trp Ser Leu Phe Phe
Phe Val Tyr Glu Asn Ala Phe 3575 3580 3585 Leu Pro Phe Thr Leu Gly
Ile Met Ala Ile Ala Ala Cys Ala Met 3590 3595 3600 Leu Leu Val Lys
His Lys His Ala Phe Leu Cys Leu Phe Leu Leu 3605 3610 3615 Pro Ser
Leu Ala Thr Val Ala Tyr Phe Asn Met Val Tyr Met Pro 3620 3625 3630
Ala Ser Trp Val Met Arg Ile Met Thr Trp Leu Glu Leu Ala Asp 3635
3640 3645 Thr Ser Leu Ser Gly Tyr Arg Leu Lys Asp Cys Val Met Tyr
Ala 3650 3655 3660 Ser Ala Leu Val Leu Leu Ile Leu Met Thr Ala Arg
Thr Val Tyr 3665 3670 3675 Asp Asp Ala Ala Arg Arg Val Trp Thr Leu
Met Asn Val Ile Thr 3680 3685 3690 Leu Val Tyr Lys Val Tyr Tyr Gly
Asn Ala Leu Asp Gln Ala Ile 3695 3700 3705 Ser Met Trp Ala Leu Val
Ile Ser Val Thr Ser Asn Tyr Ser Gly 3710 3715 3720 Val Val Thr Thr
Ile Met Phe Leu Ala Arg Ala Ile Val Phe Val 3725 3730 3735 Cys Val
Glu Tyr Tyr Pro Leu Leu Phe Ile Thr Gly Asn Thr Leu 3740 3745 3750
Gln Cys Ile Met Leu Val Tyr Cys Phe Leu Gly Tyr Cys Cys Cys 3755
3760 3765 Cys Tyr Phe Gly Leu Phe Cys Leu Leu Asn Arg Tyr Phe Arg
Leu 3770 3775 3780 Thr Leu Gly Val Tyr Asp Tyr Leu Val Ser Thr Gln
Glu Phe Arg 3785 3790 3795 Tyr Met Asn Ser Gln Gly Leu Leu Pro Pro
Lys Ser Ser Ile Asp 3800 3805 3810 Ala Phe Lys Leu Asn Ile Lys Leu
Leu Gly Ile Gly Gly Lys Pro 3815 3820 3825 Cys Ile Lys Val Ala Thr
Val Gln Ser Lys Met Ser Asp Val Lys 3830 3835 3840 Cys Thr Ser Val
Val Leu Leu Ser Val Leu Gln Gln Leu Arg Val 3845 3850 3855 Glu Ser
Ser Ser Lys Leu Trp Ala Gln Cys Val Gln Leu His Asn 3860 3865 3870
Asp Ile Leu Leu Ala Lys Asp Thr Thr Glu Ala Phe Glu Lys Met 3875
3880 3885 Val Ser Leu Leu Ser Val Leu Leu Ser Met Gln Gly Ala Val
Asp 3890 3895 3900 Ile Asn Arg Leu Cys Glu Glu Met Leu Asp Asn Arg
Ala Thr Leu 3905 3910 3915 Gln Ala Ile Ala Ser Glu Phe Ser Ser Leu
Pro Ser Tyr Ala Ala 3920 3925 3930 Tyr Ala Thr Ala Gln Glu Ala Tyr
Glu Gln Ala Val Ala Asn Gly 3935 3940 3945 Asp Ser Glu Val Val Leu
Lys Lys Leu Lys Lys Ser Leu Asn Val 3950 3955 3960 Ala Lys Ser Glu
Phe Asp Arg Asp Ala Ala Met Gln Arg Lys Leu 3965 3970 3975 Glu Lys
Met Ala Asp Gln Ala Met Thr Gln Met Tyr Lys Gln Ala 3980 3985 3990
Arg Ser Glu Asp Lys Arg Ala Lys Val Thr Ser Ala Met Gln Thr 3995
4000 4005 Met Leu Phe Thr Met Leu Arg Lys Leu Asp Asn Asp Ala Leu
Asn 4010 4015 4020 Asn Ile Ile Asn Asn Ala Arg Asp Gly Cys Val Pro
Leu Asn Ile 4025 4030 4035 Ile Pro Leu Thr Thr Ala Ala Lys Leu Met
Val Val Val Pro Asp 4040 4045 4050 Tyr Gly Thr Tyr Lys Asn Thr Cys
Asp Gly Asn Thr Phe Thr Tyr 4055 4060 4065 Ala Ser Ala Leu Trp Glu
Ile Gln Gln Val Val Asp Ala Asp Ser 4070 4075 4080 Lys Ile Val Gln
Leu Ser Glu Ile Asn Met Asp Asn Ser Pro Asn 4085 4090 4095 Leu Ala
Trp Pro Leu Ile Val Thr Ala Leu Arg Ala Asn Ser Ala 4100 4105 4110
Val Lys Leu Gln Asn Asn Glu Leu Ser Pro Val Ala Leu Arg Gln 4115
4120 4125 Met Ser Cys Ala Ala Gly Thr Thr Gln Thr Ala Cys Thr Asp
Asp 4130 4135 4140 Asn Ala Leu Ala Tyr Tyr Asn Asn Ser Lys Gly Gly
Arg Phe Val 4145 4150 4155 Leu Ala Leu Leu Ser Asp His Gln Asp Leu
Lys Trp Ala Arg Phe 4160 4165 4170 Pro Lys Ser Asp Gly Thr Gly Thr
Ile Tyr Thr Glu Leu Glu Pro 4175 4180 4185 Pro Cys Arg Phe Val Thr
Asp Thr Pro Lys Gly Pro Lys Val Lys 4190 4195 4200 Tyr Leu Tyr Phe
Ile Lys Gly Leu Asn Asn Leu Asn Arg Gly Met 4205 4210 4215 Val Leu
Gly Ser Leu Ala Ala Thr Val Arg Leu Gln Ala Gly Asn 4220 4225 4230
Ala Thr Glu Val Pro Ala Asn Ser Thr Val Leu Ser Phe Cys Ala 4235
4240 4245 Phe Ala Val Asp Pro Ala Lys Ala Tyr Lys Asp Tyr Leu Ala
Ser 4250 4255 4260 Gly Gly Gln Pro Ile Thr Asn Cys Val Lys Met Leu
Cys Thr His 4265 4270 4275 Thr Gly Thr Gly Gln Ala Ile Thr Val Thr
Pro Glu Ala Asn Met 4280 4285 4290 Asp Gln Glu Ser Phe Gly Gly Ala
Ser Cys Cys Leu Tyr Cys Arg 4295 4300 4305 Cys His Ile Asp His Pro
Asn Pro Lys Gly Phe Cys Asp Leu Lys 4310 4315 4320 Gly Lys Tyr Val
Gln Ile Pro Thr Thr Cys Ala Asn Asp Pro Val 4325 4330 4335 Gly Phe
Thr Leu Arg Asn Thr Val Cys Thr Val Cys Gly Met Trp 4340 4345 4350
Lys Gly Tyr Gly Cys Ser Cys Asp Gln Leu Arg Glu Pro Leu Met 4355
4360 4365 Gln Ser Ala Asp Ala Ser Thr Phe Leu Asn Gly Phe Ala Val
4370 4375 4380 75 2695 PRT CORONAVIRUS 75 Arg Val Cys Gly Val Ser
Ala Ala Arg Leu Thr Pro Cys Gly Thr Gly 1 5 10 15 Thr Ser Thr Asp
Val Val Tyr Arg Ala Phe Asp Ile Tyr Asn Glu Lys 20 25 30 Val Ala
Gly Phe Ala Lys Phe Leu Lys Thr Asn Cys Cys Arg Phe Gln 35 40 45
Glu Lys Asp Glu Glu Gly Asn Leu Leu Asp Ser Tyr Phe Val Val Lys 50
55 60 Arg His Thr Met Ser Asn Tyr Gln His Glu Glu Thr Ile Tyr Asn
Leu 65 70 75 80 Val Lys Asp Cys Pro Ala Val Ala Val His Asp Phe Phe
Lys Phe Arg 85 90 95 Val Asp Gly Asp Met Val Pro His Ile Ser Arg
Gln Arg Leu Thr Lys 100 105 110 Tyr Thr Met Ala Asp Leu Val Tyr Ala
Leu Arg His Phe Asp Glu Gly 115 120 125 Asn Cys Asp Thr Leu Lys Glu
Ile Leu Val Thr Tyr Asn Cys Cys Asp 130 135 140 Asp Asp Tyr Phe Asn
Lys Lys Asp Trp Tyr Asp Phe Val Glu Asn Pro 145 150 155 160 Asp Ile
Leu Arg Val Tyr Ala Asn Leu Gly Glu Arg Val Arg Gln Ser 165 170 175
Leu Leu Lys Thr Val Gln Phe Cys Asp Ala Met Arg Asp Ala Gly Ile 180
185 190 Val Gly Val Leu Thr Leu Asp Asn Gln Asp Leu Asn Gly Asn Trp
Tyr 195 200 205 Asp Phe Gly Asp Phe Val Gln Val Ala Pro Gly Cys Gly
Val Pro Ile 210 215 220 Val Asp Ser Tyr Tyr Ser Leu Leu Met Pro Ile
Leu Thr Leu Thr Arg 225 230 235 240 Ala Leu Ala Ala Glu Ser His Met
Asp Ala Asp Leu Ala Lys Pro Leu 245 250 255 Ile Lys Trp Asp Leu Leu
Lys Tyr Asp Phe Thr Glu Glu Arg Leu Cys 260 265 270 Leu Phe Asp Arg
Tyr Phe Lys Tyr Trp Asp Gln Thr Tyr His Pro Asn 275 280 285 Cys Ile
Asn Cys Leu Asp Asp Arg Cys Ile Leu His Cys Ala Asn Phe 290 295 300
Asn Val Leu Phe Ser Thr Val Phe Pro Pro Thr Ser Phe Gly Pro Leu 305
310 315 320 Val Arg Lys Ile Phe Val Asp Gly Val Pro Phe Val Val Ser
Thr Gly 325 330 335 Tyr His Phe Arg Glu Leu Gly Val Val His Asn Gln
Asp Val Asn Leu 340 345 350 His Ser Ser Arg Leu Ser Phe Lys Glu Leu
Leu Val Tyr Ala Ala Asp 355 360 365 Pro Ala Met His Ala Ala Ser Gly
Asn Leu Leu Leu Asp Lys Arg Thr 370 375 380 Thr Cys Phe Ser Val Ala
Ala Leu Thr Asn Asn Val Ala Phe Gln Thr 385 390 395 400 Val Lys Pro
Gly Asn Phe Asn Lys Asp Phe Tyr Asp Phe Ala Val Ser 405 410 415 Lys
Gly Phe Phe Lys Glu Gly Ser Ser Val Glu Leu Lys His Phe Phe 420 425
430 Phe Ala Gln Asp Gly Asn Ala Ala Ile Ser Asp Tyr Asp Tyr Tyr Arg
435 440 445 Tyr Asn Leu Pro Thr Met Cys Asp Ile Arg Gln Leu Leu Phe
Val Val 450 455 460 Glu Val Val Asp Lys Tyr Phe Asp Cys Tyr Asp Gly
Gly Cys Ile Asn 465 470 475 480 Ala Asn Gln Val Ile Val Asn Asn Leu
Asp Lys Ser Ala Gly Phe Pro 485 490 495 Phe Asn Lys Trp Gly Lys Ala
Arg Leu Tyr Tyr Asp Ser Met Ser Tyr 500 505 510 Glu Asp Gln Asp Ala
Leu Phe Ala Tyr Thr Lys Arg Asn Val Ile Pro 515 520 525 Thr Ile Thr
Gln Met Asn Leu Lys Tyr Ala Ile Ser Ala Lys Asn Arg 530 535 540 Ala Arg Thr Val Ala Gly Val Ser Ile Cys
Ser Thr Met Thr Asn Arg 545 550 555 560 Gln Phe His Gln Lys Leu Leu
Lys Ser Ile Ala Ala Thr Arg Gly Ala 565 570 575 Thr Val Val Ile Gly
Thr Ser Lys Phe Tyr Gly Gly Trp His Asn Met 580 585 590 Leu Lys Thr
Val Tyr Ser Asp Val Glu Thr Pro His Leu Met Gly Trp 595 600 605 Asp
Tyr Pro Lys Cys Asp Arg Ala Met Pro Asn Met Leu Arg Ile Met 610 615
620 Ala Ser Leu Val Leu Ala Arg Lys His Asn Thr Cys Cys Asn Leu Ser
625 630 635 640 His Arg Phe Tyr Arg Leu Ala Asn Glu Cys Ala Gln Val
Leu Ser Glu 645 650 655 Met Val Met Cys Gly Gly Ser Leu Tyr Val Lys
Pro Gly Gly Thr Ser 660 665 670 Ser Gly Asp Ala Thr Thr Ala Tyr Ala
Asn Ser Val Phe Asn Ile Cys 675 680 685 Gln Ala Val Thr Ala Asn Val
Asn Ala Leu Leu Ser Thr Asp Gly Asn 690 695 700 Lys Ile Ala Asp Lys
Tyr Val Arg Asn Leu Gln His Arg Leu Tyr Glu 705 710 715 720 Cys Leu
Tyr Arg Asn Arg Asp Val Asp His Glu Phe Val Asp Glu Phe 725 730 735
Tyr Ala Tyr Leu Arg Lys His Phe Ser Met Met Ile Leu Ser Asp Asp 740
745 750 Ala Val Val Cys Tyr Asn Ser Asn Tyr Ala Ala Gln Gly Leu Val
Ala 755 760 765 Ser Ile Lys Asn Phe Lys Ala Val Leu Tyr Tyr Gln Asn
Asn Val Phe 770 775 780 Met Ser Glu Ala Lys Cys Trp Thr Glu Thr Asp
Leu Thr Lys Gly Pro 785 790 795 800 His Glu Phe Cys Ser Gln His Thr
Met Leu Val Lys Gln Gly Asp Asp 805 810 815 Tyr Val Tyr Leu Pro Tyr
Pro Asp Pro Ser Arg Ile Leu Gly Ala Gly 820 825 830 Cys Phe Val Asp
Asp Ile Val Lys Thr Asp Gly Thr Leu Met Ile Glu 835 840 845 Arg Phe
Val Ser Leu Ala Ile Asp Ala Tyr Pro Leu Thr Lys His Pro 850 855 860
Asn Gln Glu Tyr Ala Asp Val Phe His Leu Tyr Leu Gln Tyr Ile Arg 865
870 875 880 Lys Leu His Asp Glu Leu Thr Gly His Met Leu Asp Met Tyr
Ser Val 885 890 895 Met Leu Thr Asn Asp Asn Thr Ser Arg Tyr Trp Glu
Pro Glu Phe Tyr 900 905 910 Glu Ala Met Tyr Thr Pro His Thr Val Leu
Gln Ala Val Gly Ala Cys 915 920 925 Val Leu Cys Asn Ser Gln Thr Ser
Leu Arg Cys Gly Ala Cys Ile Arg 930 935 940 Arg Pro Phe Leu Cys Cys
Lys Cys Cys Tyr Asp His Val Ile Ser Thr 945 950 955 960 Ser His Lys
Leu Val Leu Ser Val Asn Pro Tyr Val Cys Asn Ala Pro 965 970 975 Gly
Cys Asp Val Thr Asp Val Thr Gln Leu Tyr Leu Gly Gly Met Ser 980 985
990 Tyr Tyr Cys Lys Ser His Lys Pro Pro Ile Ser Phe Pro Leu Cys Ala
995 1000 1005 Asn Gly Gln Val Phe Gly Leu Tyr Lys Asn Thr Cys Val
Gly Ser 1010 1015 1020 Asp Asn Val Thr Asp Phe Asn Ala Ile Ala Thr
Cys Asp Trp Thr 1025 1030 1035 Asn Ala Gly Asp Tyr Ile Leu Ala Asn
Thr Cys Thr Glu Arg Leu 1040 1045 1050 Lys Leu Phe Ala Ala Glu Thr
Leu Lys Ala Thr Glu Glu Thr Phe 1055 1060 1065 Lys Leu Ser Tyr Gly
Ile Ala Thr Val Arg Glu Val Leu Ser Asp 1070 1075 1080 Arg Glu Leu
His Leu Ser Trp Glu Val Gly Lys Pro Arg Pro Pro 1085 1090 1095 Leu
Asn Arg Asn Tyr Val Phe Thr Gly Tyr Arg Val Thr Lys Asn 1100 1105
1110 Ser Lys Val Gln Ile Gly Glu Tyr Thr Phe Glu Lys Gly Asp Tyr
1115 1120 1125 Gly Asp Ala Val Val Tyr Arg Gly Thr Thr Thr Tyr Lys
Leu Asn 1130 1135 1140 Val Gly Asp Tyr Phe Val Leu Thr Ser His Thr
Val Met Pro Leu 1145 1150 1155 Ser Ala Pro Thr Leu Val Pro Gln Glu
His Tyr Val Arg Ile Thr 1160 1165 1170 Gly Leu Tyr Pro Thr Leu Asn
Ile Ser Asp Glu Phe Ser Ser Asn 1175 1180 1185 Val Ala Asn Tyr Gln
Lys Val Gly Met Gln Lys Tyr Ser Thr Leu 1190 1195 1200 Gln Gly Pro
Pro Gly Thr Gly Lys Ser His Phe Ala Ile Gly Leu 1205 1210 1215 Ala
Leu Tyr Tyr Pro Ser Ala Arg Ile Val Tyr Thr Ala Cys Ser 1220 1225
1230 His Ala Ala Val Asp Ala Leu Cys Glu Lys Ala Leu Lys Tyr Leu
1235 1240 1245 Pro Ile Asp Lys Cys Ser Arg Ile Ile Pro Ala Arg Ala
Arg Val 1250 1255 1260 Glu Cys Phe Asp Lys Phe Lys Val Asn Ser Thr
Leu Glu Gln Tyr 1265 1270 1275 Val Phe Cys Thr Val Asn Ala Leu Pro
Glu Thr Thr Ala Asp Ile 1280 1285 1290 Val Val Phe Asp Glu Ile Ser
Met Ala Thr Asn Tyr Asp Leu Ser 1295 1300 1305 Val Val Asn Ala Arg
Leu Arg Ala Lys His Tyr Val Tyr Ile Gly 1310 1315 1320 Asp Pro Ala
Gln Leu Pro Ala Pro Arg Thr Leu Leu Thr Lys Gly 1325 1330 1335 Thr
Leu Glu Pro Glu Tyr Phe Asn Ser Val Cys Arg Leu Met Lys 1340 1345
1350 Thr Ile Gly Pro Asp Met Phe Leu Gly Thr Cys Arg Arg Cys Pro
1355 1360 1365 Ala Glu Ile Val Asp Thr Val Ser Ala Leu Val Tyr Asp
Asn Lys 1370 1375 1380 Leu Lys Ala His Lys Asp Lys Ser Ala Gln Cys
Phe Lys Met Phe 1385 1390 1395 Tyr Lys Gly Val Ile Thr His Asp Val
Ser Ser Ala Ile Asn Arg 1400 1405 1410 Pro Gln Ile Gly Val Val Arg
Glu Phe Leu Thr Arg Asn Pro Ala 1415 1420 1425 Trp Arg Lys Ala Val
Phe Ile Ser Pro Tyr Asn Ser Gln Asn Ala 1430 1435 1440 Val Ala Ser
Lys Ile Leu Gly Leu Pro Thr Gln Thr Val Asp Ser 1445 1450 1455 Ser
Gln Gly Ser Glu Tyr Asp Tyr Val Ile Phe Thr Gln Thr Thr 1460 1465
1470 Glu Thr Ala His Ser Cys Asn Val Asn Arg Phe Asn Val Ala Ile
1475 1480 1485 Thr Arg Ala Lys Ile Gly Ile Leu Cys Ile Met Ser Asp
Arg Asp 1490 1495 1500 Leu Tyr Asp Lys Leu Gln Phe Thr Ser Leu Glu
Ile Pro Arg Arg 1505 1510 1515 Asn Val Ala Thr Leu Gln Ala Glu Asn
Val Thr Gly Leu Phe Lys 1520 1525 1530 Asp Cys Ser Lys Ile Ile Thr
Gly Leu His Pro Thr Gln Ala Pro 1535 1540 1545 Thr His Leu Ser Val
Asp Ile Lys Phe Lys Thr Glu Gly Leu Cys 1550 1555 1560 Val Asp Ile
Pro Gly Ile Pro Lys Asp Met Thr Tyr Arg Arg Leu 1565 1570 1575 Ile
Ser Met Met Gly Phe Lys Met Asn Tyr Gln Val Asn Gly Tyr 1580 1585
1590 Pro Asn Met Phe Ile Thr Arg Glu Glu Ala Ile Arg His Val Arg
1595 1600 1605 Ala Trp Ile Gly Phe Asp Val Glu Gly Cys His Ala Thr
Arg Asp 1610 1615 1620 Ala Val Gly Thr Asn Leu Pro Leu Gln Leu Gly
Phe Ser Thr Gly 1625 1630 1635 Val Asn Leu Val Ala Val Pro Thr Gly
Tyr Val Asp Thr Glu Asn 1640 1645 1650 Asn Thr Glu Phe Thr Arg Val
Asn Ala Lys Pro Pro Pro Gly Asp 1655 1660 1665 Gln Phe Lys His Leu
Ile Pro Leu Met Tyr Lys Gly Leu Pro Trp 1670 1675 1680 Asn Val Val
Arg Ile Lys Ile Val Gln Met Leu Ser Asp Thr Leu 1685 1690 1695 Lys
Gly Leu Ser Asp Arg Val Val Phe Val Leu Trp Ala His Gly 1700 1705
1710 Phe Glu Leu Thr Ser Met Lys Tyr Phe Val Lys Ile Gly Pro Glu
1715 1720 1725 Arg Thr Cys Cys Leu Cys Asp Lys Arg Ala Thr Cys Phe
Ser Thr 1730 1735 1740 Ser Ser Asp Thr Tyr Ala Cys Trp Asn His Ser
Val Gly Phe Asp 1745 1750 1755 Tyr Val Tyr Asn Pro Phe Met Ile Asp
Val Gln Gln Trp Gly Phe 1760 1765 1770 Thr Gly Asn Leu Gln Ser Asn
His Asp Gln His Cys Gln Val His 1775 1780 1785 Gly Asn Ala His Val
Ala Ser Cys Asp Ala Ile Met Thr Arg Cys 1790 1795 1800 Leu Ala Val
His Glu Cys Phe Val Lys Arg Val Asp Trp Ser Val 1805 1810 1815 Glu
Tyr Pro Ile Ile Gly Asp Glu Leu Arg Val Asn Ser Ala Cys 1820 1825
1830 Arg Lys Val Gln His Met Val Val Lys Ser Ala Leu Leu Ala Asp
1835 1840 1845 Lys Phe Pro Val Leu His Asp Ile Gly Asn Pro Lys Ala
Ile Lys 1850 1855 1860 Cys Val Pro Gln Ala Glu Val Glu Trp Lys Phe
Tyr Asp Ala Gln 1865 1870 1875 Pro Cys Ser Asp Lys Ala Tyr Lys Ile
Glu Glu Leu Phe Tyr Ser 1880 1885 1890 Tyr Ala Thr His His Asp Lys
Phe Thr Asp Gly Val Cys Leu Phe 1895 1900 1905 Trp Asn Cys Asn Val
Asp Arg Tyr Pro Ala Asn Ala Ile Val Cys 1910 1915 1920 Arg Phe Asp
Thr Arg Val Leu Ser Asn Leu Asn Leu Pro Gly Cys 1925 1930 1935 Asp
Gly Gly Ser Leu Tyr Val Asn Lys His Ala Phe His Thr Pro 1940 1945
1950 Ala Phe Asp Lys Ser Ala Phe Thr Asn Leu Lys Gln Leu Pro Phe
1955 1960 1965 Phe Tyr Tyr Ser Asp Ser Pro Cys Glu Ser His Gly Lys
Gln Val 1970 1975 1980 Val Ser Asp Ile Asp Tyr Val Pro Leu Lys Ser
Ala Thr Cys Ile 1985 1990 1995 Thr Arg Cys Asn Leu Gly Gly Ala Val
Cys Arg His His Ala Asn 2000 2005 2010 Glu Tyr Arg Gln Tyr Leu Asp
Ala Tyr Asn Met Met Ile Ser Ala 2015 2020 2025 Gly Phe Ser Leu Trp
Ile Tyr Lys Gln Phe Asp Thr Tyr Asn Leu 2030 2035 2040 Trp Asn Thr
Phe Thr Arg Leu Gln Ser Leu Glu Asn Val Ala Tyr 2045 2050 2055 Asn
Val Val Asn Lys Gly His Phe Asp Gly His Ala Gly Glu Ala 2060 2065
2070 Pro Val Ser Ile Ile Asn Asn Ala Val Tyr Thr Lys Val Asp Gly
2075 2080 2085 Ile Asp Val Glu Ile Phe Glu Asn Lys Thr Thr Leu Pro
Val Asn 2090 2095 2100 Val Ala Phe Glu Leu Trp Ala Lys Arg Asn Ile
Lys Pro Val Pro 2105 2110 2115 Glu Ile Lys Ile Leu Asn Asn Leu Gly
Val Asp Ile Ala Ala Asn 2120 2125 2130 Thr Val Ile Trp Asp Tyr Lys
Arg Glu Ala Pro Ala His Val Ser 2135 2140 2145 Thr Ile Gly Val Cys
Thr Met Thr Asp Ile Ala Lys Lys Pro Thr 2150 2155 2160 Glu Ser Ala
Cys Ser Ser Leu Thr Val Leu Phe Asp Gly Arg Val 2165 2170 2175 Glu
Gly Gln Val Asp Leu Phe Arg Asn Ala Arg Asn Gly Val Leu 2180 2185
2190 Ile Thr Glu Gly Ser Val Lys Gly Leu Thr Pro Ser Lys Gly Pro
2195 2200 2205 Ala Gln Ala Ser Val Asn Gly Val Thr Leu Ile Gly Glu
Ser Val 2210 2215 2220 Lys Thr Gln Phe Asn Tyr Phe Lys Lys Val Asp
Gly Ile Ile Gln 2225 2230 2235 Gln Leu Pro Glu Thr Tyr Phe Thr Gln
Ser Arg Asp Leu Glu Asp 2240 2245 2250 Phe Lys Pro Arg Ser Gln Met
Glu Thr Asp Phe Leu Glu Leu Ala 2255 2260 2265 Met Asp Glu Phe Ile
Gln Arg Tyr Lys Leu Glu Gly Tyr Ala Phe 2270 2275 2280 Glu His Ile
Val Tyr Gly Asp Phe Ser His Gly Gln Leu Gly Gly 2285 2290 2295 Leu
His Leu Met Ile Gly Leu Ala Lys Arg Ser Gln Asp Ser Pro 2300 2305
2310 Leu Lys Leu Glu Asp Phe Ile Pro Met Asp Ser Thr Val Lys Asn
2315 2320 2325 Tyr Phe Ile Thr Asp Ala Gln Thr Gly Ser Ser Lys Cys
Val Cys 2330 2335 2340 Ser Val Ile Asp Leu Leu Leu Asp Asp Phe Val
Glu Ile Ile Lys 2345 2350 2355 Ser Gln Asp Leu Ser Val Ile Ser Lys
Val Val Lys Val Thr Ile 2360 2365 2370 Asp Tyr Ala Glu Ile Ser Phe
Met Leu Trp Cys Lys Asp Gly His 2375 2380 2385 Val Glu Thr Phe Tyr
Pro Lys Leu Gln Ala Ser Gln Ala Trp Gln 2390 2395 2400 Pro Gly Val
Ala Met Pro Asn Leu Tyr Lys Met Gln Arg Met Leu 2405 2410 2415 Leu
Glu Lys Cys Asp Leu Gln Asn Tyr Gly Glu Asn Ala Val Ile 2420 2425
2430 Pro Lys Gly Ile Met Met Asn Val Ala Lys Tyr Thr Gln Leu Cys
2435 2440 2445 Gln Tyr Leu Asn Thr Leu Thr Leu Ala Val Pro Tyr Asn
Met Arg 2450 2455 2460 Val Ile His Phe Gly Ala Gly Ser Asp Lys Gly
Val Ala Pro Gly 2465 2470 2475 Thr Ala Val Leu Arg Gln Trp Leu Pro
Thr Gly Thr Leu Leu Val 2480 2485 2490 Asp Ser Asp Leu Asn Asp Phe
Val Ser Asp Ala Asp Ser Thr Leu 2495 2500 2505 Ile Gly Asp Cys Ala
Thr Val His Thr Ala Asn Lys Trp Asp Leu 2510 2515 2520 Ile Ile Ser
Asp Met Tyr Asp Pro Arg Thr Lys His Val Thr Lys 2525 2530 2535 Glu
Asn Asp Ser Lys Glu Gly Phe Phe Thr Tyr Leu Cys Gly Phe 2540 2545
2550 Ile Lys Gln Lys Leu Ala Leu Gly Gly Ser Ile Ala Val Lys Ile
2555 2560 2565 Thr Glu His Ser Trp Asn Ala Asp Leu Tyr Lys Leu Met
Gly His 2570 2575 2580 Phe Ser Trp Trp Thr Ala Phe Val Thr Asn Val
Asn Ala Ser Ser 2585 2590 2595 Ser Glu Ala Phe Leu Ile Gly Ala Asn
Tyr Leu Gly Lys Pro Lys 2600 2605 2610 Glu Gln Ile Asp Gly Tyr Thr
Met His Ala Asn Tyr Ile Phe Trp 2615 2620 2625 Arg Asn Thr Asn Pro
Ile Gln Leu Ser Ser Tyr Ser Leu Phe Asp 2630 2635 2640 Met Ser Lys
Phe Pro Leu Lys Leu Arg Gly Thr Ala Val Met Ser 2645 2650 2655 Leu
Lys Glu Asn Gln Ile Asn Asp Met Ile Tyr Ser Leu Leu Glu 2660 2665
2670 Lys Gly Arg Leu Ile Ile Arg Glu Asn Asn Arg Val Val Val Ser
2675 2680 2685 Ser Asp Ile Leu Val Asn Asn 2690 2695 76 20 DNA
Artificial sequence S/L3/+/4932 primer 76 ccacacacag cttgtggata 20
77 20 DNA Artificial sequence S/L4/+/6401 primer 77 ccgaagttgt
aggcaatgtc 20 78 20 DNA Artificial sequence S/L4/+/6964 primer 78
tttggtgctc cttcttattg 20 79 20 DNA Artificial sequence S/L4/-/6817
primer 79 ccggcatcca aacataattt 20 80 20 DNA Artificial sequence
S/L5/-/7633 primer 80 tggtcagtag ggttgattgg 20 81 20 DNA Artificial
sequence S/L5/-/8127 primer 81 catcctttgt gtcaacatcg 20 82 20 DNA
Artificial sequence S/L5/-/8633 primer 82 gtcacgagtg acaccatcct 20
83 20 DNA Artificial sequence S/L5/+/7839 primer 83 atgcgacgag
tctgcttcta 20 84 20 DNA Artificial sequence S/L5/+/8785 primer 84
ttcatagtgc ctggcttacc 20 85 20 DNA Artificial sequence S/L5/+/8255
primer 85 atcttggcgc atgtattgac 20 86 20 DNA Artificial sequence
S/L6/-/9422 primer 86 tgcattagca gcaacaacat 20 87 20 DNA Artificial
sequence S/L6/-/9966 primer 87 tctgcagaac agcagaagtg 20 88 20 DNA
Artificial sequence S/L6/-/10542 primer 88 cctgtgcagt ttgtctgtca 20
89 20 DNA Artificial sequence S/L6/+/10677 primer 89 ccttgtggca
atgaagtaca 20 90 20 DNA Artificial sequence S/L6/+/10106 primer 90
atgtcatttg cacagcagaa 20 91 20 DNA Artificial sequence S/L6/+/9571 primer 91 cttcaatggt
ttgccatgtt 20 92 20 DNA Artificial sequence S/L7/-/11271 primer 92
tgcgagctgt catgagaata 20 93 20 DNA Artificial sequence S/L7/-/11801
primer 93 aaccgagagc agtaccacag 20 94 20 DNA Artificial sequence
S/L7/-/12383 primer 94 tttggctgct gtagtcaatg 20 95 20 DNA
Artificial sequence S/L7/+/12640 primer 95 ctacgacaga tgtcctgtgc 20
96 20 DNA Artificial sequence S/L7/+/12088 primer 96 gagcaggctg
tagctaatgg 20 97 20 DNA Artificial sequence S/L7/+/11551 primer 97
ttaggctatt gttgctgctg 20 98 20 DNA Artificial sequence S/L8/-/13160
primer 98 cagacaacat gaagcaccac 20 99 20 DNA Artificial sequence
S/L8/-/13704 primer 99 cgctgacgtg atatatgtgg 20 100 20 DNA
Artificial sequence S/L8/-/14284 primer 100 tgcacaatga aggatacacc
20 101 20 DNA Artificial sequence S/L8/+/14453 primer 101
acatagctcg cgtctcagtt 20 102 20 DNA Artificial sequence
S/L8/+/13968 primer 102 ggcattgtag gcgtactgac 20 103 19 DNA
Artificial sequence S/L8/+/13401 primer 103 gtttgcggtg taagtgcag 19
104 20 DNA Artificial sequence S/L9/-/15098 primer 104 tagtggcggc
tattgacttc 20 105 20 DNA Artificial sequence S/L9/-/15677 primer
105 ctaaaccttg agccgcatag 20 106 20 DNA Artificial sequence
S/L9/-/16247 primer 106 catggtcata gcagcacttg 20 107 21 DNA
Artificial sequence S/L9/+/16323 primer 107 ccaggttgtg atgtcactga t
21 108 20 DNA Artificial sequence S/L9/+/15858 primer 108
ccttacccag atccatcaag 20 109 20 DNA Artificial sequence
S/L9/+/15288 primer 109 cgcaaacata acacttgctg 20 110 20 DNA
Artificial sequence S/L10/-/16914 primer 110 agtgttgggt acaagccagt
20 111 20 DNA Artificial sequence S/L10/-/17466 primer 111
gttccaagga acatgtctgg 20 112 20 DNA Artificial sequence
S/L10/-/18022 primer 112 aggtgcctgt gtaggatgaa 20 113 20 DNA
Artificial sequence S/L10/+/18245 primer 113 gggctgtcat gcaactagag
20 114 20 DNA Artificial sequence S/L10/+/17663 primer 114
tcttacacgc aatcctgctt 20 115 20 DNA Artificial sequence
S/L10/+/17061 primer 115 tacccatctg ctcgcatagt 20 116 20 DNA
Artificial sequence S/L11/-/18877 primer 116 gcaagcagaa ttaaccctca
20 117 20 DNA Artificial sequence S/L11/-/19396 primer 117
agcaccacct aaattgcatc 20 118 20 DNA Artificial sequence
S/L11/-/20002 primer 118 tggtcccttt gaaggtgtta 20 119 20 DNA
Artificial sequence S/L11/+/20245 primer 119 tcgaacacat cgtttatgga
20 120 20 DNA Artificial sequence S/L11/+/19611 primer 120
gaagcacctg tttccatcat 20 121 20 DNA Artificial sequence
S/L11/+/19021 primer 121 acgatgctca gccatgtagt 20 122 20 DNA
Artificial sequence SARS/L1/F3/+/800 primer 122 gaggtgcagt
cactcgctat 20 123 20 DNA Artificial sequence SARS/L1/F4/+/1391
primer 123 cagagattgg acctgagcat 20 124 20 DNA Artificial sequence
SARS/L1/F5/+/1925 primer 124 cagcaaacca ctcaattcct 20 125 20 DNA
Artificial sequence SARS/L1/R3/-/1674 primer 125 aaatgatggc
aacctcttca 20 126 20 DNA Artificial sequence SARS/L1/R4/-/1107
primer 126 cacgtggttg aatgactttg 20 127 20 DNA Artificial sequence
SARS/L1/R5/-/520 primer 127 atttctgcaa ccagctcaac 20 128 20 DNA
Artificial sequence SARS/L2/F3/+/2664 primer 128 cgcattgtct
cctggtttac 20 129 20 DNA Artificial sequence SARS/L2/F4/+/3232
primer 129 gagattgagc cagaaccaga 20 130 20 DNA Artificial sequence
SARS/L2/F5/+/3746 primer 130 atgagcaggt tgtcatggat 20 131 20 DNA
Artificial sequence SARS/L2/R3/-/3579 primer 131 ctgccttaag
aagctggatg 20 132 20 DNA Artificial sequence SARS/L2/R4/-/2991
primer 132 tttcttcacc agcatcatca 20 133 20 DNA Artificial sequence
SARS/L2/R5/-/2529 primer 133 caccgttctt gagaacaacc 20 134 20 DNA
Artificial sequence SARS/L3/F3/+/4708 primer 134 tctttggctg
gctcttacag 20 135 20 DNA Artificial sequence SRAS/L3/F4/+/5305
primer 135 gctggtgatg ctgctaactt 20 136 20 DNA Artificial sequence
SARS/L3/F5/+/5822 primer 136 ccatcaagcc tgtgtcgtat 20 137 20 DNA
Artificial sequence SARS/L3/R3/-/5610 primer 137 caggtggtgc
agacatcata 20 138 20 DNA Artificial sequence SARS/L3/R4/-/4988
primer 138 aacatcagca ccatccaagt 20 139 20 DNA Artificial sequence
SARS/L3/R5/-/4437 primer 139 atcggacacc atagtcaacg 20 140 7788 DNA
Artificial sequence synthetic S gene 140 tcaatattgg ccattagcca
tattattcat tggttatata gcataaatca atattggcta 60 ttggccattg
catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120
aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg
180 gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg
taaatggccc 240 gcctggctga ccgcccaacg acccccgccc attgacgtca
ataatgacgt atgttcccat 300 agtaacgcca atagggactt tccattgacg
tcaatgggtg gagtatttac ggtaaactgc 360 ccacttggca gtacatcaag
tgtatcatat gccaagtccg ccccctattg acgtcaatga 420 cggtaaatgg
cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480
gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac
540 caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc
ccattgacgt 600 caatgggagt ttgttttggc accaaaatca acgggacttt
ccaaaatgtc gtaataaccc 660 cgccccgttg acgcaaatgg gcggtaggcg
tgtacggtgg gaggtctata taagcagagc 720 tcgtttagtg aaccgtcaga
tcactagaag ctttattgcg gtagtttatc acagttaaat 780 tgctaacgca
gtcagtgctt ctgacacaac agtctcgaac ttaagctgca gaagttggtc 840
gtgaggcact gggcaggtaa gtatcaaggt tacaagacag gtttaaggag accaatagaa
900 actgggcttg tcgagacaga gaagactctt gcgtttctga taggcaccta
ttggtcttac 960 tgacatccac tttgcctttc tctccacagg tgtccactcc
cagttcaatt acagctctta 1020 aggctagagt acttaatacg actcactata
ggctagcgga tccaccatgt tcatcttcct 1080 gctgttcctg accctgacca
gcggcagcga cctggaccgg tgcaccacct tcgacgacgt 1140 gcaggccccc
aactacaccc agcacaccag cagcatgcgg ggcgtgtact accccgacga 1200
gatctttcgg agcgacaccc tgtacctgac ccaggacctg ttcctgccct tctacagcaa
1260 cgtgaccggc ttccacacca tcaaccacac cttcggcaac cccgtgatcc
ccttcaagga 1320 cggcatctac ttcgccgcca ccgagaagag caacgtggtg
cggggctggg tgttcggcag 1380 caccatgaac aacaagagcc agagcgtgat
catcatcaac aacagcacca acgtggtgat 1440 ccgggcctgc aacttcgagc
tgtgcgacaa ccccttcttc gccgtgtcca aacccatggg 1500 cacccagacc
cacaccatga tcttcgacaa cgccttcaac tgcaccttcg agtacatcag 1560
cgacgccttc agcctggacg tgagcgagaa gagcggcaac ttcaagcacc tgcgggagtt
1620 cgtgttcaag aacaaggacg gcttcctgta cgtgtacaag ggctaccagc
ccatcgacgt 1680 ggtgagagac ctgcccagcg gcttcaacac cctgaagccc
atcttcaagc tgcccctggg 1740 catcaacatc accaacttcc gggccatcct
gaccgccttt agccctgccc aggacatctg 1800 gggcaccagc gccgccgcct
acttcgtggg ctacctgaag cctaccacct tcatgctgaa 1860 gtacgacgag
aacggcacca tcaccgacgc cgtggactgc agccagaacc ccctggccga 1920
gctgaagtgc agcgtgaaga gcttcgagat cgacaagggc atctaccaga ccagcaactt
1980 cagagtggtg cctagcggcg atgtggtgcg gttccccaat atcaccaacc
tgtgcccctt 2040 cggcgaagtg ttcaacgcca ccaagttccc cagcgtgtac
gcctgggagc ggaagaagat 2100 cagcaactgc gtggccgact acagcgtgct
gtacaactcc accttcttca gcaccttcaa 2160 gtgctacggc gtgagcgcca
ccaagctgaa cgacctgtgc ttcagcaacg tgtacgccga 2220 cagcttcgtg
gtgaagggcg acgacgtgag acagatcgcc cctggccaga ccggcgtgat 2280
cgccgactac aactacaagc tgcccgacga cttcatgggc tgcgtgctgg cctggaacac
2340 ccggaacatc gacgccacaa gcaccggcaa ctacaattac aagtaccgct
acctgcggca 2400 cggcaagctg cggcccttcg agcgggacat ctccaacgtg
cccttcagcc ccgacggcaa 2460 gccctgcacc ccccctgccc tgaactgcta
ctggcccctg aacgactacg gcttctacac 2520 caccaccggc atcggctatc
agccctacag agtggtggtg ctgagcttcg agctgctgaa 2580 cgcccctgcc
accgtgtgcg gccccaagct gagcaccgac ctgatcaaga accagtgcgt 2640
gaacttcaac ttcaacggcc tgaccggcac cggcgtgctg acccccagca gcaagcgctt
2700 ccagcccttc cagcagttcg gccgggatgt gagcgacttc accgacagcg
tgcgggaccc 2760 caagaccagc gagatcctgg acatcagccc ctgcagcttc
ggcggcgtgt ccgtgatcac 2820 ccccggcacc aacgccagca gcgaagtggc
cgtgctgtac caggacgtga actgcaccga 2880 cgtgagcacc gccatccacg
ccgaccagct gacccccgcc tggcggatct acagcaccgg 2940 gaacaacgtg
ttccagaccc aggccggctg cctgatcggc gccgagcacg tggacaccag 3000
ctacgagtgc gacatcccca ttggcgccgg aatctgcgcc agctaccaca ccgtgagcct
3060 gctgcggagc accagccaga agtccatcgt ggcctacacc atgagcctgg
gcgccgacag 3120 cagcatcgcc tacagcaaca acaccatcgc catccccacc
aacttcagca tctccatcac 3180 caccgaagtg atgcccgtga gcatggccaa
gacaagcgtg gattgcaaca tgtacatctg 3240 cggcgacagc accgagtgcg
ccaacctgct gctgcagtac ggcagcttct gcacccagct 3300 gaaccgggcc
ctgagcggca tcgccgccga gcaggaccgg aacaccagag aagtgttcgc 3360
ccaagtgaag cagatgtata agacccccac cctgaagtac ttcgggggct tcaacttctc
3420 tcagatcctg cccgaccctc tgaagcccac caagcgctcc ttcatcgagg
acctgctgtt 3480 caacaaagtg accctggccg acgccggctt tatgaagcag
tacggcgagt gcctgggcga 3540 catcaacgcc cgggacctga tctgcgccca
gaagtttaac gggctgaccg tgctgccccc 3600 cctgctgacc gacgacatga
tcgccgccta tacagccgcc ctggtgagcg gcaccgccac 3660 cgccggctgg
accttcggag ccggagccgc cctgcagatc cccttcgcca tgcagatggc 3720
ctaccggttc aacggcatcg gcgtgaccca gaacgtgctg tacgagaacc agaagcagat
3780 cgccaaccag ttcaacaagg ccatcagcca gatccaggag agcctgacca
caaccagcac 3840 cgccctgggc aagctgcagg acgtggtgaa ccagaacgcc
caggccctga acaccctggt 3900 gaagcagctg agcagcaact tcggcgccat
cagctctgtg ctgaacgaca tcctgagcag 3960 gctggacaaa gtggaggccg
aagtgcagat cgaccggctg atcaccggac gcctgcagtc 4020 cctgcagacc
tacgtgaccc agcagctgat cagagccgcc gagatccggg ccagcgccaa 4080
tctggccgcc accaagatga gcgagtgcgt gctgggccag agcaagagag tggacttctg
4140 cggcaagggc tatcacctga tgagcttccc ccaggccgcc ccccacggcg
tggtgttcct 4200 gcacgtgacc tacgtgccta gccaggagcg gaacttcacc
accgccccag ccatctgcca 4260 cgagggcaag gcctacttcc cccgggaggg
cgtgttcgtg tttaacggca ccagctggtt 4320 catcacccag cgcaacttct
tcagccccca gatcatcacc acagacaaca ccttcgtgtc 4380 cggcaactgt
gatgtggtga tcggcatcat caataacacc gtgtacgacc ccctgcagcc 4440
cgagctggac agcttcaagg aggagctgga caaatacttc aagaaccaca cctcccccga
4500 cgtggacctg ggcgatatca gcggcatcaa cgcctccgtg gtgaacatcc
agaaggagat 4560 cgacagactg aacgaagtgg ccaagaacct gaacgagagc
ctgatcgacc tgcaggagct 4620 gggcaagtac gagcagtaca tcaagtggcc
ctggtacgtg tggctgggct tcatcgccgg 4680 cctgatcgcc atcgtgatgg
tgaccatcct gctgtgctgc atgaccagct gctgtagctg 4740 cctgaaaggc
gcctgcagct gtggcagctg ctgcaagttc gacgaggacg acagcgagcc 4800
cgtgctgaag ggcgtgaagc tgcactacac ctgataactc gagaattcac gcgtggtacc
4860 tctagagtcg acccgggcgg ccgcttcgag cagacatgat aagatacatt
gatgagtttg 4920 gacaaaccac aactagaatg cagtgaaaaa aatgctttat
ttgtgaaatt tgtgatgcta 4980 ttgctttatt tgtaaccatt ataagctgca
ataaacaagt taacaacaac aattgcattc 5040 attttatgtt tcaggttcag
ggggagatgt gggaggtttt ttaaagcaag taaaacctct 5100 acaaatgtgg
taaaatcgat aaggatccgg gctggcgtaa tagcgaagag gcccgcaccg 5160
atcgcccttc ccaacagttg cgcagcctga atggcgaatg gacgcgccct gtagcggcgc
5220 attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc gctacacttg
ccagcgccct 5280 agcgcccgct cctttcgctt tcttcccttc ctttctcgcc
acgttcgccg gctttccccg 5340 tcaagctcta aatcgggggc tccctttagg
gttccgattt agagctttac ggcacctcga 5400 ccgcaaaaaa cttgatttgg
gtgatggttc acgtagtggg ccatcgccct gatagacggt 5460 ttttcgccct
ttgacgttgg agtccacgtt ctttaatagt ggactcttgt tccaaactgg 5520
aacaacactc aaccctatct cggtctattc ttttgattta taagggattt tgccgatttc
5580 ggcctattgg ttaaaaaatg agctgattta acaaatattt aacgcgaatt
ttaacaaaat 5640 attaacgttt acaatttcgc ctgatgcggt attttctcct
tacgcatctg tgcggtattt 5700 cacaccgcat atggtgcact ctcagtacaa
tctgctctga tgccgcatag ttaagccagc 5760 cccgacaccc gccaacaccc
gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg 5820 cttacagaca
agctgtgacc gtctccggga gctgcatgtg tcagaggttt tcaccgtcat 5880
caccgaaacg cgcgagacga aagggcctcg tgatacgcct atttttatag gttaatgtca
5940 tgataataat ggtttcttag acgtcaggtg gcacttttcg gggaaatgtg
cgcggaaccc 6000 ctatttgttt atttttctaa atacattcaa atatgtatcc
gctcatgaga caataaccct 6060 gataaatgct tcaataatat tgaaaaagga
agagtatgag tattcaacat ttccgtgtcg 6120 cccttattcc cttttttgcg
gcattttgcc ttcctgtttt tgctcaccca gaaacgctgg 6180 tgaaagtaaa
agatgctgaa gatcagttgg gtgcacgagt gggttacatc gaactggatc 6240
tcaacagcgg taagatcctt gagagttttc gccccgaaga acgttttcca atgatgagca
6300 cttttaaagt tctgctatgt ggcgcggtat tatcccgtat tgacgccggg
caagagcaac 6360 tcggtcgccg catacactat tctcagaatg acttggttga
gtactcacca gtcacagaaa 6420 agcatcttac ggatggcatg acagtaagag
aattatgcag tgctgccata accatgagtg 6480 ataacactgc ggccaactta
cttctgacaa cgatcggagg accgaaggag ctaaccgctt 6540 ttttgcacaa
catgggggat catgtaactc gccttgatcg ttgggaaccg gagctgaatg 6600
aagccatacc aaacgacgag cgtgacacca cgatgcctgt agcaatggca acaacgttgc
6660 gcaaactatt aactggcgaa ctacttactc tagcttcccg gcaacaatta
atagactgga 6720 tggaggcgga taaagttgca ggaccacttc tgcgctcggc
ccttccggct ggctggttta 6780 ttgctgataa atctggagcc ggtgagcgtg
ggtctcgcgg tatcattgca gcactggggc 6840 cagatggtaa gccctcccgt
atcgtagtta tctacacgac ggggagtcag gcaactatgg 6900 atgaacgaaa
tagacagatc gctgagatag gtgcctcact gattaagcat tggtaactgt 6960
cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt taatttaaaa
7020 ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa
cgtgagtttt 7080 cgttccactg agcgtcagac cccgtagaaa agatcaaagg
atcttcttga gatccttttt 7140 ttctgcgcgt aatctgctgc ttgcaaacaa
aaaaaccacc gctaccagcg gtggtttgtt 7200 tgccggatca agagctacca
actctttttc cgaaggtaac tggcttcagc agagcgcaga 7260 taccaaatac
tgtccttcta gtgtagccgt agttaggcca ccacttcaag aactctgtag 7320
caccgcctac atacctcgct ctgctaatcc tgttaccagt ggctgctgcc agtggcgata
7380 agtcgtgtct taccgggttg gactcaagac gatagttacc ggataaggcg
cagcggtcgg 7440 gctgaacggg gggttcgtgc acacagccca gcttggagcg
aacgacctac accgaactga 7500 gatacctaca gcgtgagcta tgagaaagcg
ccacgcttcc cgaagggaga aaggcggaca 7560 ggtatccggt aagcggcagg
gtcggaacag gagagcgcac gagggagctt ccagggggaa 7620 acgcctggta
tctttatagt cctgtcgggt ttcgccacct ctgacttgag cgtcgatttt 7680
tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc cagcaacgcg gcctttttac
7740 ggttcctggc cttttgctgg ccttttgctc acatggctcg acagatct 7788
141 23 DNA Artificial sequence SNE-S1 primer 141 ggttgggatt atccaaaatg tga 23
142 24 DNA Artificial sequence SNE-AS1 primer 142 gcatcatcag aaagaatcat catg 24
143 21 DNA Artificial sequence SAR1-S primer 143 cctctcttgt tcttgctcgc a 21
144 21 DNA Artificial sequence SAR1-AS primer 144 tatagtgagc cgccacacat g 21
145 45 DNA Artificial sequence PCR primer 145 ataggatcca ccatgtttat tttcttatta tttcttactc tcact 45
146 37 DNA Artificial sequence PCR primer 146 atactcgagt tatgtgtaat gtaatttgac acccttg 37
147 45 DNA Artificial sequence PCR primer 147 ataggatcca ccatgtttat tttcttatta tttcttactc tcact 45
148 36 DNA Artificial sequence PCR primer 148 acctccggat ttaatatatt gctcatattt tcccaa 36
149 13 PRT Artificial sequence N-terminal end of SARS-CoV S protein (amino acids 1 to 13) 149 Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly 1 5 10
150 10 PRT Artificial sequence oligopeptide 150 Ser Gly Asp Tyr Lys Asp Asp Asp Asp Lys 1 5 10
151 34 DNA Artificial sequence PCR primer 151 actagctagc ggatccacca tgttcatctt cctg 34
152 33 DNA Artificial sequence PCR primer 152 agtatccgga cttgatgtac tgctcgtact tgc 33
153 59 DNA Artificial sequence oligonucleotide 153 tatgagcttt tttttttttt tttttttggc atataaatag actcggcgcg ccatctgca 59
154 53 DNA Artificial sequence oligonucleotide 154 gatggcgcgc cgagtctatt tatatgccaa aaaaaaaaaa aaaaaaaagc tca 53
155 45 DNA Artificial sequence PCR primer 155 atacgtacga ccatgtttat tttcttatta tttcttactc tcact 45
156 40 DNA Artificial sequence PCR primer 156 atagcgcgct cattatgtgt aatgtaattt gacacccttg 40
157 20 DNA Artificial sequence PCR primer 157 ccatttcaac aatttggccg 20
158 45 DNA Artificial sequence PCR primer 158 ataggatccg cgcgctcatt atttatcgtc gtcatcttta taatc 45
* * * * *