U.S. patent application number 10/302947 was filed with the patent office on 2002-11-25 and published on 2003-12-25 for cloned DNA sequences related to the entire genomic RNA of human immunodeficiency virus II (HIV-2), polypeptides encoded by these DNA sequences and use of these DNA clones and polypeptides in diagnostic kits.
This patent application is currently assigned to Institut Pasteur. Invention is credited to Marc Alizon, Francois Clavel, Denise Guetard, Mireille Guyader, Luc Montagnier, and Pierre Sonigo.
Publication Number | 20030235835 |
Application Number | 10/302947 |
Family ID | 46252772 |
Filed Date | 2002-11-25 |
Publication Date | 2003-12-25 |
United States Patent Application | 20030235835 |
Kind Code | A1 |
Alizon, Marc ; et al. | December 25, 2003 |
Cloned DNA sequences related to the entire genomic RNA of human
immunodeficiency virus II (HIV-2), polypeptides encoded by these
DNA sequences and use of these DNA clones and polypeptides in
diagnostic kits
Abstract
A method for diagnosing an HIV-2 (LAV-II) infection and a kit
containing reagents for the same is disclosed. These reagents
include cDNA probes which are capable of hybridizing to at least a
portion of the genome of HIV-2. In one embodiment, the DNA probes
are capable of hybridizing to the entire genome of HIV-2. These
reagents also include polypeptides encoded by some of these DNA
sequences.
Inventors: | Alizon, Marc; (Paris, FR) ; Montagnier, Luc; (Le Plessis Robinson, FR) ; Guetard, Denise; (Paris, FR) ; Clavel, Francois; (Rockville, MD) ; Sonigo, Pierre; (Paris, FR) ; Guyader, Mireille; (Toulouse, FR) |
Correspondence
Address: |
FINNEGAN, HENDERSON, FARABOW, GARRETT & DUNNER
LLP
1300 I STREET, NW
WASHINGTON
DC
20005
US
|
Assignee: |
Institut Pasteur
|
Family ID: |
46252772 |
Appl. No.: |
10/302947 |
Filed: |
November 25, 2002 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10302947 | Nov 25, 2002 | |
07810908 | Dec 20, 1991 | 6544728 |
07810908 | Dec 20, 1991 | |
07752368 | Sep 3, 1991 | |
07752368 | Sep 3, 1991 | |
07013477 | Feb 11, 1987 | 5079342 |
07013477 | Feb 11, 1987 | |
07003764 | Jan 16, 1987 | 5051496 |
07003764 | Jan 16, 1987 | |
06933184 | Nov 21, 1986 | |
06933184 | Nov 21, 1986 | |
06916080 | Oct 6, 1986 | |
06933184 | Nov 21, 1986 | |
06835228 | Mar 3, 1986 | 4839288 |
Current U.S. Class: | 435/5 ; 536/23.72 |
Current CPC Class: | A61P 31/18 20180101; C12N 7/00 20130101; G01N 2333/162 20130101; C12N 2740/16122 20130101; A61K 38/00 20130101; C12N 2740/16322 20130101; G01N 2469/20 20130101; C12Q 1/703 20130101; C12N 2740/16021 20130101; C07K 7/06 20130101; G01N 33/56988 20130101; C07K 14/005 20130101; A61K 39/00 20130101; C12N 2740/16222 20130101 |
Class at Publication: | 435/6 ; 435/5; 536/23.72 |
International Class: | C12Q 001/68; C12Q 001/70; C07H 021/02 |
Foreign Application Data
Date | Code | Application Number |
Jan 22, 1986 | FR | 86 00911 |
Feb 6, 1986 | FR | 86 01635 |
Feb 13, 1986 | FR | 86 01985 |
Mar 18, 1986 | FR | 86 03881 |
Mar 24, 1986 | FR | 86 04215 |
Claims
What is claimed is:
1. A method for diagnosing an HIV-2 infection which comprises: (a)
contacting genetic DNA or RNA from a body sample obtained from a
person suspected of having an HIV-2 infection with a DNA probe
derived from at least a portion of the genome of the HIV-2 virus;
and (b) determining whether a hybridized complex is created.
2. The method of claim 1 wherein said body sample is selected from
the group consisting of tissue, blood cells, cells and body
fluids.
3. The method of claim 1 wherein the presence of the hybridized
complex is determined by a process selected from the group
consisting of Southern blot, Northern blot and dot blot.
4. The method of claim 1 wherein the cDNA probe is analogous to the
entire genome of the HIV-2 virus.
5. A DNA probe capable of hybridizing to the entire genome of the
HIV-2 virus.
6. A method for diagnosing an HIV-2 infection which comprises: (a)
contacting sera obtained from a patient suspected of having an
HIV-2 infection with a polypeptide expression product of a DNA
segment derived from the genome of the HIV-2 virus; and (b)
determining whether an immunocomplex is formed.
7. The method of claim 6 wherein the formation of the immunocomplex
is determined by a process selected from the group consisting of
radioimmunoassays (RIA), radioimmunoprecipitation assays (RIPA),
immunofluorescence assays (IFA), enzyme-linked immunosorbent assays
(ELISA) and Western blots.
8. A process for detecting the presence of a virus selected from
the group consisting of LAV-II, HIV-2, STLV-III and other viruses
which form complexes with LAV-II reagents comprising: (a)
contacting DNA or RNA from a sample suspected of containing viral
genetic material with a DNA probe derived from a portion of the
genome of the HIV-2 virus; and (b) determining whether a hybridized
complex is created.
9. A peptide selected from the group consisting of env1, env2,
env3, env4, env5, env6, env7, env8, env9, env10, env11 and
gag1.
10. A kit for diagnosing an HIV-2 infection by the method of claim
6 and comprising env1, env2, env3 and gag1 peptides as the
polypeptide expression product.
11. A vaccinating agent comprising at least one peptide selected
from the group consisting of env4, env5, env6, env7, env8, env9,
env10 and env11 in admixture with suitable carriers.
12. A peptide having common immunological properties with the
peptide structure of the envelope glycoprotein of a virus of the
HIV-2 class, said peptide having no more than 40 amino acid
residues.
13. A peptide according to claim 12 having either of the following
formulas:
5 XR--A-E-D-YL-DQ--L--WGC-----CZ XA-E-D-YL-DZ
in which X and Z are OH or NH.sub.2 or, to the extent that the
immunological properties of the natural peptides lacking these
groups shall not be essentially modified, the groups having from
one to five amino acid residues, and each of the hyphens
corresponding to an aminoacyl residue chosen from those which
permit the conservation for the peptide characterized above of the
immunological properties of either of the following peptide
sequences:
6 RVTAIEKYLQDQARLNSWGCAFRQVC AIEKYLQDQ
14. A peptide according to claim 12 having either of the following
formulas:
7 X--E--Q-QQEKN--EL--L---Z XQ-QQEKNZ
in which X and Z are OH or NH.sub.2 or, to the extent that the
immunological properties of the natural peptides lacking these
groups shall not be essentially modified, the groups having from
one to five amino acid residues, and each of the hyphens
corresponding to an aminoacyl residue chosen from those which
permit the conservation for the peptide characterized above of the
immunological properties of either of the following peptide
sequences:
8 SLEQAQIQQEKNIVIYELQKLNSW QIQQEKN
15. A peptide according to claim 12 characterized as having either
of the following formulas
9 XEL--YK-V-I-P-G--APTK-KR-----Z XYK-V-T-P-G-APTK-KRZ
in which X and Z are OH or NH.sub.2 or, to the extent that the
immunological properties of the natural peptides lacking these
groups shall not be essentially modified, the groups having from
one to five amino acid residues, and each of the hyphens
corresponding to an aminoacyl residue chosen from those which
permit the conservation for the peptide characterized above of the
immunological properties of either of the following peptide
sequences:
10 ELGDYKLVEITPIGFAPTKEKRYSSAH YKLVEITPIGFAPTKEK
16. A peptide according to claim 12 characterized as having either
of the following formulas:
11 X----VTV-YGVP-WK-AT--LPCA-Z XVTV-YGVP-WK-ATZ
in which X and Z are OH or NH.sub.2 or, to the extent that the
immunological properties of the natural peptides lacking these
groups shall not be essentially modified, the groups having from
one to five amino acid residues, and each of the hyphens
corresponding to an aminoacyl residue chosen from those which
permit the conservation for the peptide characterized above of the
immunological properties of one of the following peptide
sequences:
12 CTQYVTVFYGVPTWKNATIPLECAT VTVFYGVPTWKNAT
EKLWVTVYYGVPVWKEATTTLFCAS VTVYYGVPVWKEAT
17. A peptide according to claim 16 characterized as having one of
the following formulas:
13 CTQYVTVFYGVPTWKNATIPLFCAT VTVYYGVPTWKNAT
EKLWVTVYYGVPVWKEATTTLFCAS VTVYYGVPVWKEAT EDLWVTVYYGVPVWKEATTTLFCAS
VTVYYGVPVWKEAT DNLWVTVYYGVPVWKEATTTLFCAS VTVYYGVPVWKEAT
18. A peptide according to claim 12 characterized as having either
of the following formulas:
14 X---QE--L-NVTE-F--W-NZ XL-NVTE-FZ
in which X and Z are OH or NH.sub.2 or, to the extent that the
immunological properties of the natural peptides lacking these
groups shall not be essentially modified, the groups having from
one to five amino acid residues, and each of the hyphens
corresponding to an aminoacyl residue chosen from those which
permit the conservation for the peptide characterized above of the
immunological properties of one of the following peptide
sequences:
15 DDYQEITL-NVTEAFDAWNN L-NVTE PNPQEVVLVNVTENFNMWKN LVNVTE
19. A peptide according to claim 18 characterized as having one of
the following formulas:
16 DDYQEITL-NVTEAFDAWNN L-NVTEAF PNPQEVVLVNVTENFNMWKN LVNVTENF
PNPQEIELENVTEGFNMWKN LENVTEGF PNPQEIALENVTENFNMWKN LENVTENF
20. A peptide according to claim 12 characterized as having one of
the following formulas:
17 XL---S-KPCVKLTPLCV--KZ XKPCVKLTPLCVZ XS-KPCVKLTPLCVZ
in which X and Z are OH or NH.sub.2 or, to the extent that the
immunological properties of the natural peptides lacking these
groups shall not be essentially modified, the groups having from
one to five amino acid residues, and each of the hyphens
corresponding to an aminoacyl residue chosen from those which
permit the conservation for the peptide characterized above of the
immunological properties of one of the following peptide
sequences:
18 ETSIKPCVKLTPLCVAMK DQSLKPCVKLTPLCVSLK KPCVKLTPLCV
SLKPCVKLTPLCV
21. A peptide according to claim 20 characterized as having one of
the following formulas:
19 ETSIKPCVKLTPLCVAMK DQSLKPCVKLTPLCVSLK DQSLKPCVKLTPLCVTLN
PCVKLTPLC
22. A peptide characterized as having either of the following
formulas:
20 X---N-S-IT--C-Z XN-S-ITZ
in which X and Z are OH or NH.sub.2 or, to the extent that the
immunological properties of the natural peptides lacking these
groups shall not be essentially modified, the groups having from
one to five amino acid residues, and each of the hyphens
corresponding to an aminoacyl residue chosen from those which
permit the conservation for the peptide characterized above of the
immunological properties of one of the following peptide
sequences:
21 NHCNTSVITESCD NTSVIT TSCNTSVITQACP NTSAIT
23. A peptide according to claim 22 characterized as having one of
the following formulas:
22 NHCNTSVITESCD NTSVIT TSCNTSVITQACP NTSVIT INCNTSVITQACP NTSVIT
INCNTSAITQACP NTSAIT
24. A peptide according to claim 12
characterized as having the following formula: XYC-P-G-A-L-C-N-TZ
in which X and Z are OH or NH.sub.2 or, to the extent that the
immunological properties of the natural peptides lacking these
groups shall not be essentially modified, the groups having from
one to five amino acid residues, and each of the hyphens
corresponding to an aminoacyl residue chosen from those which
permit the conservation for the peptide characterized above of the
immunological properties of either of the following peptide
sequences:
23 11 YCAPPGYALLRC-NDT YCAPAGFAILKCNNKT
25. A peptide according to claim 24 characterized as having one of
the following formulas:
24 YCAPPGYALLRC-NDT YCAPAGFAILKCNNKT YCAPAGFAILKCNDKK
YCAPAGFAILKCRDKK
26. A peptide according to claim 12 characterized as having the
following formula: X------A-C------W--Z in which X and Z are OH or
NH.sub.2 or, to the extent that the immunological properties of the
natural peptides lacking these groups shall not be essentially
modified, the groups having from one to five amino acid residues,
and each of the hyphens corresponding to an aminoacyl residue
chosen from those which permit the conservation for the peptide
characterized above of the immunological properties of either of
the following peptide sequences:
25 NKRPRQAWCWFKG-KWKD N--MRQAHCNISRAKWNA
27. A peptide according to claim 26 characterized as having one of
the following formulas:
26 NKRPRQAWCWFKG-KWKT N--MRQAHCNISRAKWNA D--IRRAYCTINETEWDK
I--IGQAHCNISRAQWSK
28. A peptide according to claim 12 characterized as having either
of the following formulas:
27 7 X-G-DPE------NC-GEF-YCN-----NZ XNC-GEF-YCNZ
in which X and Z are OH or NH.sub.2 or, to the extent that the
immunological properties of the natural peptides lacking these
groups shall not be essentially modified, the groups having from
one to five amino acid residues, and each of the hyphens
corresponding to an aminoacyl residue chosen from those which
permit the conservation for the peptide characterized above of the
immunological properties of one of the following peptide
sequences:
28 KGSDPEVAYMWTNCRGEFLYCNMTWFLN NCRGEFLYCN
-GGDPEIVTHSFNCGGEFFYCNSTQLFN NCGGEFFYCN
29. A peptide according to claim 28 characterized as having one of
the following formulas:
29 KGSDPEVAYMWTNCRGEFLYCNMTWFLN NCRGEFLYCN
-GGDPEIVTHSFNCGGEFFYCNSTQLFN NCGGEFFYCN
-GGDPEITTHSFNCRGEFFYCNTSKLFN NCRGEFFYCN
-GGDPEITTHSFNCGGEFFYCNTSGLFN NCGGEFFYCN
30. A peptide according to claim 12 characterized as having either
of the following formulas:
30 X-----C-IKQ-I------G---YZ XC-IKQ-IZ
in which X and Z are OH or NH.sub.2 or, to the extent that the
immunological properties of the natural peptides lacking these
groups shall not be essentially modified, the groups having from
one to five amino acid residues, and each of the hyphens
corresponding to an aminoacyl residue chosen from those which
permit the conservation for the peptide characterized above of the
immunological properties of one of the following peptide
sequences:
31 RNYAPCHIKQIINTWHKVGRNVY CHIKQII TITLPCRIKQFINMWQEVGKAMY
CRIKQFI
31. A peptide according to claim 30 characterized as having one of
the following formulas:
32 RHYAPCHIKQIINTWHKVGRNVY CHIKQII TITLPCRIKQFINMWQEVGKAMY CRIKQFI
SITLPCRIKQIINMWQKTCKAMY CRIKQII NITLQCRIKQIIKMVAGR-KAIY CRIKQII
32. The antigenic peptide gag1 characterized as having the
following formula: XNCKLVLKGLGMNPTLEEMLTAZ in which X and Z are OH
or NH.sub.2 or, to the extent that the immunological properties of
the natural peptides lacking these groups shall not be essentially
modified, the groups having from one to five amino acid residues,
and each of the hyphens corresponding to an aminoacyl residue
chosen from those which permit the conservation for the peptide
characterized above of the immunological properties of the
following peptide sequence: XNCKLVLKGLGMNPTLEEMLTA
33. An antigenic composition containing at least one gag1 peptide
according to claim 32 or at least an oligomer of this peptide,
characterized as having the capacity to be recognized by human
biological fluids such as serum containing anti-HIV-2 antibodies
and under appropriate conditions anti-HIV-1 antibodies.
34. An antigenic composition containing at least one peptide
according to claims 13, 14 or 15, or at least an oligomer of the
peptide, characterized in that the peptide specifically recognizes
the presence of anti-HIV-2 antibodies.
35. An immunogenic composition containing at least one peptide
according to any one of the claims 16-31 or at least an oligomer of
the peptide or the peptide conjugated with a carrier molecule, in
association with an acceptable pharmaceutical vehicle for the
production of vaccines, the composition characterized in that it
induces antibody production against the peptide in sufficient
quantities to form an effective immunocomplex with the entire HIV-2
retrovirus and its corresponding proteins.
36. An immunogenic composition according to claim 35 further
comprising peptides having formulas corresponding to the envelope
glycoprotein sequences of HIV-1 and HIV-2 which have an amino acid
homology greater than 50%.
37. An immunogenic composition according to either of claims 35 or
36 having at least one peptide or at least an oligomer of the
peptide or the peptide conjugated with a carrier molecule, the
composition corresponding to a peptide chosen from the group
consisting of Env4, Env5, Env6 and Env10.
38. A procedure for the in vitro diagnosis of HIV-2 infections in a
biological fluid, comprising: contacting the biological fluid with
at least one peptide according to claims 12, 13, 14, 15 or 32, or a
conjugate of the peptide with a carrier molecule; detecting the
eventual presence in the biological fluid of an antigen-antibody
complex by physical or chemical methods.
39. The diagnostic procedure of claim 38, wherein the detection
step is performed by a test selected from the group consisting of
enzyme-linked immunosorbent assay (ELISA), immunofluorescence
assay (IFA), radioimmunoassay (RIA), and radioimmunoprecipitation
assay (RIPA).
40. A kit for the in vitro diagnosis of an HIV-2 infection in a
biological fluid comprising: a peptide composition containing a
peptide according to claims 12, 13, 14, 15 or 32, or a mixture of
such peptides, or a conjugate of such peptides with a carrier
molecule; an appropriate reaction environment for the production of
an antigen-antibody complex; one or more reagents adapted for the
detection of the formation of antigen-antibody complexes; and a
biological fluid as a reference sample having no antibodies
recognized by said peptide composition.
41. A protein selected from the group described in Example 4
consisting of p 16, p 26, p 12, polymerase, Q protein, R protein, X
protein, Y protein, env protein, F protein, TAT, ART, U5 and
U3.
42. A kit for diagnosing an HIV-2 infection by the method of claim
6 and comprising as the polypeptide expression product a protein of
claim 41.
43. A vaccinating agent comprising at least one protein of claim 41
in association with appropriate carriers.
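The peptide formulas recited in the claims above use each hyphen as a placeholder for one aminoacyl residue, with X and Z standing for terminal groups. As a rough illustration only, the following sketch converts the core of such a formula into a regular expression and checks whether a concrete sequence has the same fixed residues in the same positions. The helper names are hypothetical, X and Z are simply stripped, and the check says nothing about the immunological properties that the claims also require.

```python
import re

# The 20 standard one-letter amino acid codes (X and Z are not among them).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def formula_to_regex(formula):
    """Turn a claim-style formula such as 'XN-S-ITZ' into a compiled regex.

    Assumption: a leading 'X' and a trailing 'Z' are terminal-group markers
    and are dropped; every '-' stands for exactly one residue.
    """
    core = formula
    if core.startswith("X"):
        core = core[1:]
    if core.endswith("Z"):
        core = core[:-1]
    parts = ["[" + AMINO_ACIDS + "]" if ch == "-" else re.escape(ch) for ch in core]
    return re.compile("".join(parts))


def fits_formula(formula, peptide):
    """Return True if the whole peptide matches the fixed positions of the formula."""
    return formula_to_regex(formula).fullmatch(peptide) is not None


if __name__ == "__main__":
    # Formula and sequences taken from claim 22.
    for seq in ("NTSVIT", "NTSAIT"):
        print(seq, fits_formula("XN-S-ITZ", seq))
```

A purely positional check like this is only a first filter; whether a given substitution preserves antigenicity has to be established experimentally.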
Description
[0001] This application is a continuation-in-part of U.S. patent
application Ser. No. ______ of Alizon et al. for "Cloned DNA
Sequences Related to the Entire Genomic RNA of Human
Immunodeficiency Virus II (HIV-2), Polypeptides Encoded by these
DNA Sequences and Use of these DNA Clones and Polypeptides in
Diagnostic Kits," filed Jan. 16, 1987, which is a
continuation-in-part of U.S. patent application Ser. No. 931,866
filed Nov. 21, 1986, which is a continuation-in-part application of
U.S. patent application Ser. No. 916,080 of Montagnier et al. for
"Cloned DNA Sequences Related to the Genomic RNA of the Human
Immunodeficiency Virus II (HIV-2), Polypeptides Encoded by these
DNA Sequences and Use of these DNA Clones and Polypeptides in
Diagnostic Kits," filed Oct. 6, 1986 and U.S. patent application
Ser. No. 835,228 of Montagnier et al. for "New Retrovirus Capable
of Causing AIDS, Antigens Obtained from this Retrovirus and
Corresponding Antibodies and their Application for Diagnostic
Purposes," filed Mar. 3, 1986. The disclosures of each of these
predecessor applications are expressly incorporated herein by
reference.
[0002] The invention relates to cloned DNA sequences analogous to
the genomic RNA of a virus known as Lymphadenopathy-Associated
Virus II ("LAV-II"), a process for the preparation of these cloned
DNA sequences, and their use as probes in diagnostic kits. In one
embodiment, the invention relates to a cloned DNA sequence
analogous to the entire genomic RNA of HIV-2 and its use as a
probe. The invention also relates to polypeptides with amino acid
sequences encoded by these cloned DNA sequences and the use of
these polypeptides in diagnostic kits.
BACKGROUND OF THE INVENTION
[0003] According to recently adopted nomenclature, as reported in
Nature, May 1986, a substantially-identical group of retroviruses
which has been identified as one causative agent of AIDS is now
referred to as Human Immunodeficiency Virus I (HIV-1). This
previously-described group of retroviruses includes
Lymphadenopathy-Associated Virus I (LAV-I), Human T-cell
Lymphotropic Virus-III (HTLV-III), and AIDS-Related Virus
(ARV).
[0004] Lymphadenopathy-Associated virus II has been described in
U.S. application Ser. No. 835,228, which was filed Mar. 3, 1986,
and is specifically incorporated herein by reference. Because
LAV-II is a second, distinct causative agent of AIDS, LAV-II
properly is classifiable as a Human Immunodeficiency Virus II
(HIV-2). Therefore, "LAV-II" as used hereinafter describes a
particular genus of HIV-2 isolates.
[0005] While HIV-2 is related to HIV-1 by its morphology, its
tropism and its in vitro cytopathic effect on CD4 (T4) positive
cell lines and lymphocytes, HIV-2 differs from previously described
human retroviruses known to be responsible for AIDS. Moreover, the
proteins of HIV-1 and 2 have different sizes and their serological
cross-reactivity is restricted mostly to the major core protein, as
the envelope glycoproteins of HIV-2 are not immune precipitated by
HIV-1-positive sera except in some cases where very faint
cross-reactivity can be detected. Since a significant proportion of
the HIV infected patients lack antibodies to the major core protein
of their infecting virus, it is important to include antigens to
both HIV-1 and HIV-2 in an effective serum test for the diagnosis
of the infection by these viruses.
[0006] HIV-2 was first discovered in the course of serological
research on patients native to Guinea-Bissau who exhibited clinical
and immunological symptoms of AIDS and from whom sero-negative or
weakly sero-positive reactions to tests using an HIV-1 lysate were
obtained. Further clinical studies on these patients isolated
viruses which were subsequently named "LAV-II."
[0007] One LAV-II isolate, subsequently referred to as LAV-II MIR,
was deposited at the Collection Nationale des Cultures de
Micro-Organismes (CNCM) at the Institut Pasteur in Paris, France on
Dec. 19, 1985 under Accession No. I-502 and has also been deposited
at the British ECACC under No. 87.001.001 on Jan. 9, 1987. A
second LAV-II isolate was deposited at CNCM on Feb. 21, 1986 under
Accession No. I-532 and has also been deposited at the British
ECACC under No. 87.001.002 on Jan. 9, 1987. This second isolate has
been subsequently referred to as LAV-II ROD. Other isolates
deposited at the CNCM on Dec. 19, 1986 are HIV-2 IRMO (No. I-642)
and HIV-2 EHO (No. I-643). Several additional isolates have been
obtained from West African patients, some of whom have AIDS, others
with AIDS-related conditions and others with no AIDS symptoms. All
of these viruses have been isolated on normal human lymphocyte
cultures and some of them were thereafter propagated on lymphoid
tumor cell lines such as CEM and MOLT.
[0008] Due to the sero-negative or weak sero-positive results
obtained when using kits designed to identify HIV-1 infections in
the diagnosis of these new patients with HIV-2 disease, it has been
necessary to devise a new diagnostic kit capable of detecting HIV-2
infection, either by itself or in combination with an HIV-1
infection. The present inventors have, through the development of
cloned DNA sequences analogous to at least a portion of the genomic
RNA of LAV-II ROD Viruses, created the materials necessary for the
development of such kits.
SUMMARY OF THE INVENTION
[0009] As noted previously, the present invention relates to the
cloned nucleotide sequences homologous or identical to at least a
portion of the genomic RNA of HIV-2 viruses and to polypeptides
encoded by the same. The present invention also relates to kits
capable of diagnosing an HIV-2 infection.
[0010] Thus, a main object of the present invention is to provide a
kit capable of diagnosing an infection caused by the HIV-2 virus.
This kit may operate by detecting at least a portion of the RNA
genome of the HIV-2 virus or the provirus present in the infected
cells through hybridization with a DNA probe or it may operate
through the immunodiagnostic detection of polypeptides unique to
the HIV-2 virus.
[0011] Additional objects and advantages of the present invention
will be set forth in part in the description which follows, or may
be learned from practice of the invention. The objects and
advantages may be realized and attained by means of the
instrumentalities and combinations particularly pointed out in the
appended claims.
[0012] To achieve these objects and in accordance with the purposes
of the present invention, cloned DNA sequences related to the
entire genomic RNA of the LAV-II virus are set forth. These
sequences are analogous specifically to the entire genome of the
LAV-II ROD strain.
[0013] To further achieve the objects and in accordance with the
purposes of the present invention, a kit capable of diagnosing an
HIV-2 infection is described. This kit, in one embodiment, contains
the cloned DNA sequences of this invention which are capable of
hybridizing to viral RNA or analogous DNA sequences to indicate the
presence of an HIV-2 infection. Different diagnostic techniques can
be used which include, but are not limited to: (1) Southern blot
procedures to identify viral DNA which may or may not be digested
with restriction enzymes; (2) Northern blot techniques to identify
viral RNA extracted from cells; and (3) dot blot techniques, i.e.,
direct filtration of the sample through an ad hoc membrane such as
nitrocellulose or nylon without previous separation on agarose gel.
Suitable material for dot blot technique could be obtained from
body fluids including, but not limited to serum and plasma,
supernatants from culture cells, or cytoplasmic extracts obtained
after cell lysis and removal of membranes and nuclei of the cells
by ultra-centrifugation as accomplished in the "CYTODOT" procedure
as described in a booklet published by Schleicher and Schuell.
[0014] In an alternate embodiment, the kit contains the
polypeptides created using these cloned DNA sequences. These
polypeptides are capable of reacting with antibodies to the HIV-2
virus present in sera of infected individuals, thus yielding an
immunodiagnostic complex.
[0015] To further achieve the objects of the invention, a
vaccinating agent is provided which comprises at least one peptide
selected from the polypeptide expression products of the viral DNA
in admixture with suitable carriers, adjuvants, or stabilizers.
[0016] It is understood that both the foregoing general description
and the following detailed description are exemplary and
explanatory only and are not restrictive of the invention as
claimed. The accompanying drawings, which are incorporated in and
constitute a part of the specification, illustrate one embodiment
of the invention and, together with the description, serve to
explain the principles of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0017] FIG. 1 generally depicts the nucleotide sequence of a cloned
complementary DNA (cDNA) to the genomic RNA of HIV-2. FIG. 1A
depicts the genetic organization of HIV-1, position of the HIV-1
HindIII fragment used as a probe to screen the cDNA library, and
restriction map of the HIV-2 cDNA clone, E2. FIG. 1B depicts the
nucleotide sequence of the 3' end of HIV-2. The corresponding
region of the HIV-1 LTR was aligned using the Wilbur and Lipman
algorithm (window: 10; K-tuple: 7; gap penalty: 3) as described by
Wilbur and Lipman in Proc. Natl. Acad. Sci. USA 80: 726-730 (1983),
specifically incorporated herein by reference. The U3-R junction in
HIV-1 is indicated and the poly A addition signal and potential
TATA promoter regions are boxed. In FIG. 1B, the symbols B, H, Ps
and Pv refer to the restriction sites BamHI, HindIII, PstI and
PvuII, respectively.
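The alignment parameters quoted above include a K-tuple of 7, meaning that candidate alignments are seeded from exact matches of seven consecutive nucleotides. The sketch below illustrates only that seeding step: it indexes the k-tuples of one sequence and reports where they recur in the other, together with the diagonal (offset) of each hit. It does not implement the window or gap-penalty stages of the Wilbur and Lipman method, and the function name and toy sequences are hypothetical rather than taken from FIG. 1B.

```python
from collections import defaultdict


def ktuple_matches(a, b, k=7):
    """Return (i, j, i - j) for every exact k-tuple shared by a and b.

    i is the 0-based start in a, j the start in b, and i - j identifies
    the diagonal on which the match lies.
    """
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)

    hits = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            hits.append((i, j, i - j))
    return hits


if __name__ == "__main__":
    # Toy sequences chosen for illustration only.
    seq_a = "ACGTACGTTTGACCA"
    seq_b = "TTACGTACGTTTG"
    for hit in ktuple_matches(seq_a, seq_b):
        print(hit)
```

Hits that share a diagonal indicate a run of sequence that can be aligned without gaps, which is why the later stages of such methods concentrate on the most populated diagonals.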
[0018] FIG. 2 generally depicts the HIV-2 specificity of the E2
clone. FIGS. 2A and 2B specifically depict a Southern blot of DNA
extracted from CEM cells infected with the following isolates:
HIV-2.sub.ROD (a,c), HIV-2.sub.DUL (b,d), and HIV-1.sub.BRU (e,f).
DNA in lanes a,b,f was PstI digested; in c,d,e DNA was undigested.
FIGS. 2C and 2D specifically depict dot blot hybridization of
pelleted virions from CEM cells infected by HIV-1.sub.BRU (1), Simian
Immunodeficiency Virus (SIV) isolate Mm 142-83 (3), HIV-2.sub.DUL
(4), HIV-2.sub.ROD (5), and HIV-1.sub.ELI (6). Dot 2 is a pellet
from an equivalent volume of supernatant from uninfected CEM. Thus,
FIGS. 2A and 2C depict hybridization with the HIV-2 cDNA (E2) and
FIGS. 2B and 2D depict hybridization to an HIV-1 probe consisting of
a 9 Kb SacI insert from HIV-1.sub.BRU (clone lambda J 19).
[0019] FIG. 3 generally depicts a restriction map of the HIV-2 ROD
genome and its homology to HIV-1. FIG. 3A specifically depicts the
organization of three recombinant phage lambda clones, ROD 4, ROD
27, and ROD 35. In FIG. 3A, the open boxes represent viral
sequences, the LTR are filled, and the dotted boxes represent
cellular flanking sequences (not mapped). Only some characteristic
restriction enzyme sites are indicated. .lambda.ROD 27 and
.lambda.ROD 35 are derived from integrated proviruses while
.lambda.ROD 4 is derived from a circular viral DNA. The portion of
the lambda clones that hybridizes to the cDNA E2 is indicated below
the maps. A restriction map of the .lambda.ROD isolate was
reconstructed from these three lambda clones. In this map, the
restriction sites are identified as follows: B: BamHI; E: EcoRI; H:
HindIII; K: KpnI; Ps: PstI; Pv: PvuII; S: SacI; X: XbaI. R and L
are the right and left BamHI arms of the lambda L47.1 vector.
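The one- and two-letter symbols in the map legend correspond to enzymes with well-known six-base palindromic recognition sequences. As a minimal sketch of how such a map could be derived from sequence data, the code below scans a DNA string for those recognition sequences and lists the sites in order. The function name and the toy input are hypothetical, positions are 0-based starts of the recognition sequence rather than cut positions, and nothing here reproduces the actual HIV-2 ROD map.

```python
# Map-legend symbol -> (enzyme name, recognition sequence, 5' to 3').
SITES = {
    "B": ("BamHI", "GGATCC"),
    "E": ("EcoRI", "GAATTC"),
    "H": ("HindIII", "AAGCTT"),
    "K": ("KpnI", "GGTACC"),
    "Ps": ("PstI", "CTGCAG"),
    "Pv": ("PvuII", "CAGCTG"),
    "S": ("SacI", "GAGCTC"),
    "X": ("XbaI", "TCTAGA"),
}


def map_sites(seq):
    """Return a sorted list of (position, symbol, enzyme) for every site found.

    Because all of these recognition sequences are palindromic, scanning
    one strand is enough to find sites on both.
    """
    seq = seq.upper()
    found = []
    for symbol, (enzyme, site) in SITES.items():
        start = seq.find(site)
        while start != -1:
            found.append((start, symbol, enzyme))
            start = seq.find(site, start + 1)
    return sorted(found)


if __name__ == "__main__":
    toy = "AAGGATCCTTGAATTCAAAGCTT"  # illustrative sequence only
    for position, symbol, enzyme in map_sites(toy):
        print(position, symbol, enzyme)
```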
[0020] FIG. 3B specifically depicts dots 1-11 which correspond to
the single-stranded DNA form of M13 subclones from the
HIV-1.sub.BRU cloned genome (.lambda.J19). Their size and position
on the HIV-1 genome, determined by sequencing, are shown below the
figure. Dot 12 is a control containing lambda phage DNA. The
dot-blot was hybridized in low stringency conditions as described
in Example 1 with the complete .lambda.ROD 4 clone as a
probe, and successively washed in 2.times.SSC, 0.1% SDS at
25.degree. C. (Tm -42.degree. C.), 1.times.SSC, 0.1% SDS at
60.degree. C. (Tm -20.degree. C.), and 0.1.times.SSC, 0.1% SDS at
60.degree. C. (Tm -3.degree. C.) and exposed overnight. A duplicate
dot blot was hybridized and washed in stringent conditions (as
described in Example 2) with the labelled lambda J19 clone carrying
the complete HIV-1.sub.BRU genome. HIV-1 and HIV-2 probes were
labelled to the same specific activity (10.sup.8 cpm/ug.).
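The Tm offsets quoted for the last two washes can be related through the widely used empirical salt term for DNA duplexes, under which Tm changes by about 16.6 degrees C for each tenfold change in monovalent cation concentration. The sketch below is an illustration under that assumption only; the text does not state how its offsets were calculated, and the function names are hypothetical.

```python
import math

SALT_COEFFICIENT = 16.6  # degrees C per tenfold change in monovalent salt (empirical)


def tm_shift(ssc_from, ssc_to):
    """Approximate change in Tm when the SSC strength changes.

    Only the ratio of the two concentrations matters, so SSC multiples
    (for example 1.0 for 1xSSC and 0.1 for 0.1xSSC) can be used directly.
    """
    return SALT_COEFFICIENT * math.log10(ssc_to / ssc_from)


def new_offset(offset_below_tm, ssc_from, ssc_to):
    """Distance below Tm at a fixed wash temperature after changing the salt."""
    return offset_below_tm + tm_shift(ssc_from, ssc_to)


if __name__ == "__main__":
    # Washing at 60 degrees C in 1xSSC is quoted as 20 degrees below Tm.
    print(round(new_offset(20.0, 1.0, 0.1), 1))
```

Under this assumption, moving from 1xSSC to 0.1xSSC at a fixed 60 degrees C lowers Tm by roughly 17 degrees, which is in line with the change from Tm -20 to Tm -3 given in the text.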
[0021] FIG. 4 generally depicts the restriction map polymorphism in
different HIV-2 isolates and shows comparison of HIV-2 to SIV. FIG.
4A specifically depicts DNA (20 ug. per lane) from CEM cells
infected by the isolate HIV-2.sub.DUL (panel 1) or peripheral blood
lymphocytes (PBL) infected by the isolates HIV-2.sub.GOM (panel 2)
and HIV-2.sub.MIR (panel 3) digested with: EcoRI (a), PstI (b), and
HindIII (c). Much less viral DNA was obtained with HIV-2 isolates
propagated on PBL. Hybridization and washing were in stringent
conditions, as described in Example 2, with 10.sup.6 cpm/ml. of
each of the E2 insert (cDNA) and the 5 kb. HindIII fragment of
.lambda.ROD 4, labelled to 10.sup.9 cpm/ug.
[0022] FIG. 4B specifically depicts DNA from HUT 78 (a human T
lymphoid cell line) cells infected with STLV3 MAC isolate Mm
142-83. The same amounts of DNA and enzymes were used as indicated
in panel A. Hybridization was performed with the same probe as in
A, but in non-stringent conditions. As described in Example 1,
washing was for one hour in 2.times.SSC, 0.1% SDS at 40.degree. C.
(panel 1) and after exposure, the same filter was re-washed in
0.1.times.SSC, 0.1% SDS at 60.degree. C. (panel 2). The
autoradiographs were obtained after overnight exposure with
intensifying screens.
[0023] FIG. 5 depicts the position of derived plasmids from
.lambda.ROD 27, .lambda.ROD 35 and .lambda.ROD 4.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0024] Reference will now be made in detail to the presently
preferred embodiments of the invention, which, together with the
following examples, serve to explain the principles of the
invention.
[0025] The genetic structure of the HIV-2 virus has been analyzed
by molecular cloning according to the method set forth herein and
in the Examples. A restriction map of the genome of this virus is
included in FIG. 4. In addition, the partial sequence of a cDNA
complementary to the genomic RNA of the virus has been determined.
This cDNA sequence information is included in FIG. 1.
[0026] Also contained herein is data describing the molecular
cloning of the complete 9.5 kb genome of HIV-2, data describing the
observation of restriction map polymorphism between different
isolates, and an analysis of the relationship between HIV-2 and
other human and simian retroviruses. From the totality of these
data, diagnostic probes can be discerned and prepared.
[0027] Generally, to practice one embodiment of the present
invention, a series of filter hybridizations of the HIV-2 RNA
genome with probes derived from the complete cloned HIV-1 genome
and from the gag and pol genes was conducted. These hybridizations
yielded only extremely weak signals even in conditions of very low
stringency of hybridization and washing. Thus, it was found to be
difficult to assess the amount of HIV-2 viral and proviral DNA in
infected cells by Southern blot techniques.
[0028] Therefore, a complementary DNA (cDNA) to the HIV-2 genomic
RNA initially was cloned in order to provide a specific
hybridization probe. To construct this cDNA, an oligo (dT) primed
cDNA first-strand was made in a detergent-activated endogenous
reaction using HIV-2 reverse transcriptase with virions purified
from supernatants of infected CEM cells. The CEM cell line is a
lymphoblastoid CD4+ cell line described by G. E. Foley et al. in
Cancer 18: 522-529 (1965), specifically incorporated herein by
reference. The CEM cells used were infected with the isolate ROD
and were continuously producing high amounts of HIV-2.
[0029] After second-strand synthesis, the cDNAs were inserted into
the M13 tg 130 bacteriophage vector. A collection of 10.sup.4 M13
recombinant phages was obtained and screened in situ with an HIV-1
probe spanning 1.5 kb. of the 3' end of the LAV.sub.BRU isolate
(depicted in FIG. 1A). Some 50 positive plaques were detected,
purified, and characterized by end sequencing and cross-hybridizing
the inserts. This procedure is described in more detail in Example
1 and in FIG. 1.
[0030] The different clones were found to be complementary to the
3' end of a polyadenylated RNA having the AATAAA signal about 20
nucleotides upstream of the poly A tail, as found in the long
terminal repeat (LTR) of HIV-1. The LTR region of HIV-1 has been
described by S. Wain-Hobson et al. in Cell 40: 9-17 (1985),
specifically incorporated herein by reference. The portion of the
HIV-2 LTR that was sequenced was related only distantly to the
homologous domain in HIV-1 as demonstrated in FIG. 1B. Indeed, only
about 50% of the nucleotides could be aligned and about a hundred
insertions/deletions need to be introduced. In comparison, the
homology of the corresponding domains in HIV-1 isolates from USA
and Africa is greater than 95% and no insertions or deletions are
seen.
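The location of a polyadenylation signal such as AATAAA can be checked with a simple string scan. The following is a minimal sketch offered for illustration only and is not part of the original disclosure. The function name find_signal and its offset parameter are assumptions introduced here, and the 60-nucleotide excerpt is copied from nucleotides 121-180 of the sequence listing in Example 4.

```python
def find_signal(sequence, signal="AATAAA", offset=1):
    """Return the coordinate of every occurrence of `signal` in `sequence`.

    `offset` is the genome coordinate assigned to the first base of
    `sequence` (hypothetical helper, for illustration only).
    """
    sequence = sequence.upper()
    positions = []
    start = sequence.find(signal)
    while start != -1:
        positions.append(start + offset)
        # Resume one base later so overlapping matches are also reported.
        start = sequence.find(signal, start + 1)
    return positions


# Nucleotides 121-180 of the listing in Example 4 (within the R region).
R_EXCERPT = "GCCCCACGCTTGCTTGCTTAAAAACCTCTTAATAAAGCTGCCAGTTAGAAGCAAGTTAAG"

print(find_signal(R_EXCERPT, offset=121))
```

Because the scan resumes one base after each match, overlapping occurrences are reported as well, which may or may not be desired for a given analysis.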
[0031] The largest insert of this group of M13 clones was a 2 kb.
clone designated E2. Clone E2 was used as a probe to demonstrate
its HIV-2 specificity in a series of filter hybridization
experiments. Firstly, this probe could detect the genomic RNA of
HIV-2 but not HIV-1 in stringent conditions as shown in FIG. 2, C
and D. Secondly, positive signals were detected in Southern blots
of DNA from cells infected with the ROD isolate as well as other
isolates of HIV-2 as shown in FIG. 2, A and FIG. 4, A. No signal
was detected with DNA from uninfected cells or HIV-1 infected
cells, confirming the exogenous nature of HIV-2. In undigested DNA
from HIV-2 infected cells, an approximately 10 kb. species,
probably corresponding to linear unintegrated viral DNA, was
principally detected along with a species with an apparent size of
6 kb., likely to be the circular form of the viral DNA. Conversely,
rehybridization of the same filter with an HIV-1 probe under
stringent conditions showed hybridization to HIV-1 infected cells
only as depicted in FIG. 2, B.
[0032] To isolate the remainder of the genome of HIV-2, a genomic
library in lambda phage L47.1 was constructed. Lambda phage L47.1
has been described by W.A.M. Loenen et al. in Gene 10: 249-259
(1980), specifically incorporated herein by reference. The genomic
library was constructed with a partial Sau3AI restriction digest of
the DNA from the CEM cell line infected with HIV-2.sub.ROD.
[0033] About 2.times.10.sup.6 recombinant plaques were screened in
situ with labelled insert from the E2 cDNA clone. Ten recombinant
phages were detected and plaque purified. Of these phages, three
were characterized by restriction mapping and Southern blot
hybridization with the E2 insert and probes from its 3' end (LTR)
or 5' end (envelope), as well as with HIV-1 subgenomic probes. In
this instance, HIV-1 probes were used under non-stringent
conditions.
[0034] A clone carrying a 9.5 kb. insert and derived from a
circular viral DNA was identified as containing the complete genome
and designated .lambda.ROD 4. Two other clones, .lambda.ROD 27 and
.lambda.ROD 35, were derived from integrated proviruses and found to
carry an LTR and cellular flanking sequences and a portion of the
viral coding sequences as shown in FIG. 3, A.
[0035] Fragments of the lambda clones were subcloned into the plasmid
vector pUC 18.
[0036] Plasmid pROD 27-5' is derived from .lambda.ROD 27 and contains
the 5' 2 Kb of the HIV-2 genome and cellular flanking sequences (5' LTR
and 5' viral coding sequences to the EcoRI site).
[0037] Plasmid pROD 4-8 is derived from .lambda.ROD 4 and contains the
about 5 Kb HindIII fragment that is the central part of the HIV-2
genome.
[0038] The inserts of plasmids pROD 27-5' and pROD 4.8 overlap.
[0039] Plasmid pROD 4.7 contains a HindIII 1.8 Kb fragment from
.lambda.ROD 4. This fragment is located 3' to the fragment
subcloned into pROD 4.8 and contains about 0.8 Kb of viral coding
sequences and the part of the lambda phage (.lambda.L47.1) left arm
located between the BamHI and HindIII cloning sites.
[0040] Plasmid pROD 35 contains all the HIV-2 coding sequences 3'
to the EcoRI site, the 3' LTR and about 4 Kb of cellular flanking
sequences.
[0041] Plasmid pROD 27-5' and pROD 35 in E. coli strain HB 101 are
deposited respectively under No. 1-626 and 1-633 at the CNCM, and
have also been deposited at the NCIB (British Collection). These
plasmids are depicted in FIG. 5. Plasmids pROD 4-7 and pROD 4-8 in
E. coli strain TG1 are deposited respectively under No. 1-627 and
1-628 at the CNCM.
[0042] To reconstitute the complete HIV-2 ROD genome, pROD 35 is
linearized with EcoRI and the EcoRI insert of pROD 27-5' is ligated
in the correct orientation into this site.
[0043] The relationship of HIV-2 to other human and simian
retroviruses was surmised from hybridization experiments. The
relative homology of the different regions of the HIV-1 and 2
genomes was determined by hybridization of fragments of the cloned
HIV-1 genome with the labelled .lambda.ROD 4 expected to contain the
complete HIV-2 genome (FIG. 3, B). Even in very low stringency
conditions (Tm -42.degree. C.), the hybridization of HIV-1 and 2
was restricted to a fraction of their genomes, principally the gag
gene (dots 1 and 2), the reverse transcriptase domain in pol (dot
3), the end of pol and the Q (or sor) genes (dot 5) and the F gene
(or 3' orf) and 3' LTR (dot 11). The HIV-1 fragment used to detect
the HIV-2 cDNA clones contained the dot 11 subclone, which
hybridized well to HIV-2 under non-stringent conditions. Only the
signal from dot 5 persisted after stringent washing. The envelope
gene, the region of the tat gene and a part of pol thus seemed very
divergent. These data, along with the LTR sequence obtained (FIG.
1, B), indicated that HIV-2 is not an envelope variant of HIV-1, as
are African isolates from Zaire described by Alizon et al., Cell
46:63-74 (1986).
[0044] It was observed that HIV-2 is related more closely to the
Simian Immunodeficiency Virus (SIV) than it is to HIV-1. This
correlation has been described by F. Clavel et al. in C.R. Acad.
Sci. (Paris) 302: 485-488 (1986) and F. Clavel et al. in Science
233: 343-346 (1986), both of which are specifically incorporated
herein by reference. Simian Immunodeficiency virus (also designated
Simian T-cell Lymphotropic Virus Type 3, STLV-3) is a retrovirus
first isolated from captive macaques with an AIDS-like disease in
the USA. This simian virus has been described by M. D. Daniel et
al. in Science 228: 1201-1204 (1985), specifically incorporated
herein by reference.
[0045] All the SIV proteins, including the envelope, are immune
precipitated by sera from HIV-2 infected patients, whereas the
serological cross-reactivity of HIV-1 to 2 is restricted to the
core proteins. However SIV and HIV-2 can be distinguished by slight
differences in the apparent molecular weight of their proteins.
[0046] In terms of nucleotide sequence, it also appears that HIV-2
is closely related to SIV. The genomic RNA of SIV can be detected in
stringent conditions as shown in FIG. 2, C by HIV-2 probes
corresponding to the LTR and 3' end of the genome (E2) or to the
gag or pol genes. Under the same conditions, HIV-1 derived probes
do not detect the SIV genome as shown in FIG. 2, D.
[0047] In Southern blots of DNA from SIV-infected cells, a
restriction pattern clearly different from HIV-2.sub.ROD and other
isolates is seen. All the bands persist after a stringent washing,
even though the signal is considerably weakened, indicating a
sequence homology throughout the genomes of HIV-2 and SIV. It has
recently been shown that baboons and macaques could be infected
experimentally by HIV-2, thereby providing an interesting animal
model for the study of the HIV infection and its preventive
therapy. Indeed, attempts to infect non-human primates with HIV-1
have been successful only in chimpanzees, which are not a
convenient model.
[0048] From an initial survey of the restriction maps for certain
of the HIV-2 isolates obtained according to the methods described
herein, it is already apparent that HIV-2, like HIV-1, undergoes
restriction site polymorphism. FIG. 4A depicts examples of such
differences for three isolates, all different one from another and
from the cloned HIV-2.sub.ROD. It is very likely that these differences
at the nucleotide level are accompanied by variations in the
amino-acid sequence of the viral proteins, as evidenced in the case
of HIV-1 and described by M. Alizon et al. in Cell 46: 63-74
(1986), specifically incorporated herein by reference. It is also
to be expected that the various isolates of HIV-2 will exhibit
amino acid heterogeneities. See, for example, Clavel et al., Nature
324 (18):691-695 (1986), specifically incorporated herein by
reference.
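Restriction site polymorphism of the kind described above can be illustrated by scanning two sequences for enzyme recognition sites and comparing the results. The sketch below is illustrative only and is not part of the original disclosure. The names ENZYMES, find_sites and site_map are assumptions introduced here, the reference excerpt is copied from the sequence listing in Example 4, and the variant is a hypothetical single-base change invented for the example, not an observed isolate.

```python
# Recognition sequences of the three enzymes used in FIG. 4A.
ENZYMES = {"EcoRI": "GAATTC", "PstI": "CTGCAG", "HindIII": "AAGCTT"}


def find_sites(sequence, site):
    """Return the 1-based start positions of `site` within `sequence`."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(site, start + 1)
    return positions


def site_map(sequence):
    """Map each enzyme name to the list of its site positions."""
    return {name: find_sites(sequence, site) for name, site in ENZYMES.items()}


# Excerpt copied from the sequence listing in Example 4.
reference = "CACAGAAATTCAGTTAGGAATTCCACACCCAGCAGGGTTGGCCAAGAAGAGAAGAATTAC"
# Hypothetical variant: one base changed inside the EcoRI site.
variant = reference.replace("GGAATTCC", "GGAGTTCC")

print(site_map(reference))
print(site_map(variant))
```

A site present in one map and absent from the other corresponds to a band difference of the type seen between isolates in a Southern blot.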
[0049] Further, the characterization of HIV-2 will also delineate
the domain of the envelope glycoprotein that is responsible for the
binding to the surface of the target cells and the subsequent
internalization of the virus. This interaction was shown to be
mediated by the CD4 molecule itself in the case of HIV-1 and
similar studies tend to indicate that HIV-2 uses the same receptor.
Thus, although there is wide divergence between the env genes of
HIV-1 and 2, small homologous domains of the envelopes of the two
HIV could represent a candidate receptor binding site. This site
could be used to raise a protective immune response against this
group of retroviruses.
[0050] From the data discussed herein, certain nucleotide sequences
have been identified which are capable of being used as probes in
diagnostic methods to obtain the immunological reagents necessary
to diagnose an HIV-2 infection. In particular, these sequences may
be used as probes in hybridization reactions with the genetic
material of infected patients to indicate whether the RNA of the
HIV-2 virus is present in these patients' lymphocytes or whether an
analogous DNA is present. In this embodiment, the test methods
which may be utilized include Northern blots, Southern blots and
dot blots. One particular nucleotide sequence which may be useful
as a probe is the combination of the 5 kb. HindIII fragment of ROD
4 and the E2 cDNA used in FIG. 4.
[0051] In addition, the genetic sequences of the HIV-2 virus may be
used to create the polypeptides encoded by these sequences.
Specifically, these polypeptides may be created by expression of
the cDNA obtained according to the teachings herein in hosts such
as bacteria, yeast or animal cells. These polypeptides may be used
in diagnostic tests such as immunofluorescence assays (IFA),
radioimmunoassays (RIA) and Western Blot tests.
[0052] Moreover, it is also contemplated that additional diagnostic
tests, including additional immunodiagnostic tests, may be
developed in which the DNA probes or the polypeptides of this
invention may serve as one of the diagnostic reagents. The
invention described herein includes these additional test
methods.
[0053] In addition, monoclonal antibodies to these polypeptides or
fragments thereof may be created. The monoclonal antibodies may be
used in immunodiagnostic tests in an analogous manner as the
polypeptides described above.
[0054] The polypeptides of the present invention may also be used
as immunogenic reagents to induce protection against infection by
HIV-2 viruses. In this embodiment, the polypeptides, produced by
recombinant-DNA techniques, would function as vaccine agents.
[0055] Also, the polypeptides of this invention may be used in
competitive assays to test the ability of various antiviral agents
to prevent the virus from fixing on its target.
[0056] Thus, it is to be understood that application of the
teachings of the present invention to a specific problem or
environment will be within the capabilities of one having ordinary
skill in the art in light of the teachings contained herein.
Examples of the products of the present invention and
representative processes for their isolation and manufacture appear
above and in the following examples.
EXAMPLES
Example 1
Cloning of a cDNA Complementary to Genomic RNA From HIV-2 Virions
[0058] HIV-2 virions were purified from 5 liters of supernatant
from a culture of the CEM cell line infected with the ROD isolate
and a cDNA first strand using oligo (dT) primer was synthesized in
a detergent-activated endogenous reaction on pelleted virus, as
described by M. Alizon et al. in Nature, 312: 757-760 (1984),
specifically incorporated herein by reference. RNA-cDNA hybrids
were purified by phenol-chloroform extraction and ethanol
precipitation. The second-strand cDNA was created by the DNA
polymerase I/RNAase H method of Gubler and Hoffman in Gene, 25:
263-269 (1983), specifically incorporated herein by reference,
using a commercial cDNA synthesis kit obtained from Amersham. After
attachment of EcoRI linkers (obtained from Pharmacia), EcoRI
digestion, and ligation into EcoRI-digested dephosphorylated M13 tg
130 vector (obtained from Amersham), a cDNA library was obtained by
transformation of the E. coli TG1 strain. Recombinant plaques
(10.sup.4) were screened in situ on replica filters with the 1.5
kb. HindIII fragment from clone J19, corresponding to the 3' part
of the genome of the LAV.sub.BRU isolate of HIV-1, .sup.32P-labelled
to a specific activity of 10.sup.9 cpm/ug. The filters were
prehybridized in 5.times.SSC, 5.times. Denhardt solution, 25%
formamide, and denatured salmon sperm DNA (100 ug/ml.) at
37.degree. C. for 4 hours and hybridized for 16 hours in the same
buffer (Tm -42.degree. C.) plus 4.times.10.sup.7 cpm of the
labelled probe (10.sup.6 cpm/ml. of hybridization buffer). The
washing was done in 5.times.SSC, 0.1% SDS at 25.degree. C. for 2
hours. 20.times.SSC is 3M NaCl, 0.3M Na citrate. Positive plaques
were purified and single-stranded M13 DNA prepared and
end-sequenced according to the method described in Proc. Nat'l.
Acad. Sci. USA, 74: 5463-5467 (1977) of Sanger et al.
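The buffer arithmetic implied by this example can be written out explicitly. The sketch below is illustrative only and is not part of the original disclosure. It uses the 20.times.SSC definition (3M NaCl, 0.3M Na citrate) and the probe figures given above. The names STOCK_FOLD, STOCK_MOLAR, ssc_molarity and buffer_volume_ml are assumptions introduced here.

```python
# 20x SSC stock as defined in Example 1: 3 M NaCl, 0.3 M Na citrate.
STOCK_FOLD = 20
STOCK_MOLAR = {"NaCl": 3.0, "Na citrate": 0.3}


def ssc_molarity(fold):
    """Return the molar salt concentrations of an SSC solution at `fold`x."""
    return {salt: conc * fold / STOCK_FOLD for salt, conc in STOCK_MOLAR.items()}


def buffer_volume_ml(total_cpm, cpm_per_ml):
    """Return the buffer volume (ml) needed to reach a probe concentration."""
    return total_cpm / cpm_per_ml


# 5x SSC is used for hybridization and washing in Example 1.
print(ssc_molarity(5))
# 4 x 10^7 cpm of probe at 10^6 cpm per ml of hybridization buffer.
print(buffer_volume_ml(4e7, 1e6))
```

The same helper can be applied to the 2.times.SSC and 0.1.times.SSC washes mentioned in the figure descriptions.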
Example 2
Hybridization of DNA from HIV-1 and HIV-2 Infected Cells and RNA
from HIV-1 and 2 and SIV Virions With a Probe Derived From an HIV-2
Cloned cDNA
[0059] DNA was extracted from infected CEM cells continuously
producing HIV-1 or 2. The DNA (20 ug.), digested with PstI or
undigested, was electrophoresed on a 0.8% agarose gel, and
Southern-transferred to nylon membrane. Virion dot-blots were
prepared in duplicate, as described by F. Clavel et al. in Science
233: 343-346 (1986), specifically incorporated herein by reference,
by pelleting volumes of supernatant corresponding to the same
amount of reverse transcriptase activity. Prehybridization was done
in 50% formamide, 5.times.SSC, 5.times. Denhardt solution, and 100
ug./ml. denatured salmon sperm DNA for 4 hours at 42.degree. C.
Hybridization was performed in the same buffer plus 10% Dextran
sulphate, and 10.sup.6 cpm/ml. of the labelled E2 insert (specific
activity 10.sup.9 cpm/ug.) for 16 hours at 42.degree. C. Washing
was in 0.1.times.SSC, 0.1% SDS for 2.times.30 min. After exposure
for 16 hours with intensifying screens, the Southern blot was
dehybridized in 0.4 N NaOH, neutralized, and rehybridized in the
same conditions to the HIV-1 probe labelled to 10.sup.9 cpm/ug.
Example 3
Cloning in Lambda Phage of the Complete Provirus DNA of HIV-2
[0060] DNA from the HIV-2.sub.ROD infected CEM (FIG. 2, lanes a and
c) was partially digested with Sau3AI. The 9-15 kb. fraction was
selected on a 5-40% sucrose gradient and ligated to BamHI arms of
the lambda L47.1 vector. Plaques (2.times.10.sup.6) obtained after
in vitro packaging and plating on E. coli LA 101 strain were
screened in situ with the insert from the E2 cDNA clone.
Approximately 10 positive clones were plaque purified and
propagated on E. coli C600 recBC. The ROD 4, 27, and 35 clones were
amplified and their DNA characterized by restriction mapping and
Southern blotting with the HIV-2 cDNA clone under stringent
conditions, and gag-pol probes from HIV-1 used under non-stringent
conditions.
Example 4
Complete Genomic Sequence of the ROD HIV-2 Isolate
[0061] Experimental analysis of the HIV-2 ROD isolate yielded the
following sequence which represents the complete genome of this
HIV-2 isolate. Genes and major expression products identified
within the following sequence are indicated by nucleotides numbered
below:
[0062] 1) GAG gene (546-2111) expresses a protein product having a
molecular weight of around 55Kd and is cleaved into the following
proteins:
[0063] a) p 16 (546-950)
[0064] b) p 26 (951-1640)
[0065] c) p 12 (1701-2111)
[0066] 2) polymerase (1829-4936)
[0067] 3) Q protein (4869-5513)
[0068] 4) R protein (5682-5996)
[0069] 5) X protein (5344-5679)
[0070] 6) Y protein (5682-5996)
[0071] 7) Env protein (6147-8720)
[0072] 8) F protein (8557-9324)
[0073] 9) TAT gene (5845-6140 and 8307-8400) is expressed by two
exons separated by introns.
[0074] 10) ART protein (6071-6140 and 8307-8536) is similarly the
expression product of two exons.
[0075] 11) LTR:R (1-173 and 9498-9671)
[0076] 12) U5 (174-299)
[0077] 13) U3 (8942-9497)
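The coordinates listed above lend themselves to simple consistency checks, such as confirming that each coding region spans a whole number of codons and measuring how far adjacent reading frames overlap. The following sketch is illustrative only and is not part of the original disclosure. The coordinates are copied from the list above; the names FEATURES, length and overlap are assumptions introduced here.

```python
# Coding features and their nucleotide coordinates (inclusive),
# copied from the list in Example 4. Spliced genes have two segments.
FEATURES = {
    "gag": [(546, 2111)],
    "p16": [(546, 950)],
    "p26": [(951, 1640)],
    "p12": [(1701, 2111)],
    "pol": [(1829, 4936)],
    "Q": [(4869, 5513)],
    "R": [(5682, 5996)],
    "X": [(5344, 5679)],
    "Y": [(5682, 5996)],
    "env": [(6147, 8720)],
    "F": [(8557, 9324)],
    "tat": [(5845, 6140), (8307, 8400)],
    "art": [(6071, 6140), (8307, 8536)],
}


def length(segments):
    """Total number of nucleotides covered by a list of (start, end) pairs."""
    return sum(end - start + 1 for start, end in segments)


def overlap(a, b):
    """Number of nucleotide positions shared by two lists of segments."""
    shared = 0
    for s1, e1 in a:
        for s2, e2 in b:
            shared += max(0, min(e1, e2) - max(s1, s2) + 1)
    return shared


for name, segments in FEATURES.items():
    n = length(segments)
    print(name, n, "nt;", "whole codons" if n % 3 == 0 else "not a multiple of 3")

print("gag/pol overlap:", overlap(FEATURES["gag"], FEATURES["pol"]), "nt")
```

Keeping every feature as a list of segments lets the two-exon genes be handled by the same functions as the single-segment genes.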
[0078] It will be known to one of skill in the art that the
absolute numbering which has been adopted is not essential. For
example, the nucleotide within the LTR which is designated as "1"
is a somewhat arbitrary choice. What is important is the sequence
information provided.
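The sequence listing pairs each nucleotide line with a three-letter amino acid translation, and the pairing can be spot-checked by translating a stretch of nucleotides with the standard genetic code. The sketch below is illustrative only and is not part of the original disclosure. The names CODON_TABLE, THREE_LETTER and translate are assumptions introduced here, and the input is the first 30 nucleotides of the GAG gene (positions 546-575) copied from the listing.

```python
# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(dna):
    """Translate `dna` in frame 1 into concatenated three-letter codes.

    Trailing bases that do not fill a codon are ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


# Nucleotides 546-575 of the listing (start of the GAG gene).
GAG_START = "ATGGGCGCGAGAAACTCCGTCTTGAGAGGG"
print(translate(GAG_START))
```

Comparing the printed string with the amino acid line above the corresponding nucleotides is a quick way to locate transcription errors in either line.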
1 GGTCGCTCTGCGGAGAGGCTGGCAGATTGAGCCCTGGGAGGTTCTCTCCAGCACTAGCAG * *
* * * *
GTAGAGCCTGGGTGTTCCCTGCTAGACTCTCACCAGCACTTGGCCGGTGCTGGGCAGACG * * *
100 * *
GCCCCACGCTTGCTTGCTTAAAAACCTCTTAATAAAGCTGCCAGTTAGAAGCAAGTTAAG * * *
* * * TGTGTGCTCCCATCTCTCCTAGTCGCCGCCTGGTCATTCGGTGTTCACCTGAGTAACAAG
* 200 * * * *
ACCCTGGTCTGTTAGGACCCTTCTTGCTTTGGGAAACCGAGGCAGGAAAATCCCTAGCAG * * *
* * 300
GTTGGCGCCTGAACAGGGACTTCAAGAAGACTCAGAAGTCTTGGAACACGGCTGAGTGAA * * *
* * * GGCAGTAAGGGCGGCAGGAACAAACCACGACGGAGTGCTCCTAGAAAGGCGCGGGCCGAG
* * * 400 * *
GTACCAAAGGCAGCGTGTGGAGCGGGAGGAGAAGAGGCCTCCGGGTGAAGGTAAGTACCT * * *
* * * ACACCAAAAACTGTAGCCGAAAGGGCTTGCTATCCTACCTTTAGACAGGTAGAAGATTGT
* 500 * * * *
MetGlyAlaArgAsnSerValLeuArgGlyLysLysAlaAspGluLeuGluArgIle
GGGAGATGGGCGCGAGAAACTCCGTCTTGAGAGGGAAAAAAGCAGATGAATTAGAAAGAA * * *
* * 600
ArgLeuArgProGlyGlyLysLysLysTyrArgLeuLysHisIleValTrpAlaAlaAsn
TCAGGTTACGGCCCGGCCGAAAGAAAAAGTACAGGCTAAAACATATTGTGTGGGCAGCGA * * *
* * * LysLeuAspArgPheGlyLeuAlaGluSerLeuLeuGluSerLysGluGlyCysGlnLys
ATAAATTGGACAGATTCGGATTAGCAGAGAGCCTGTTGGAGTCAAAAGAGGGTTGTCAAA * * *
700 * *
IleLeuThrValLeuAspProMetValProThrGlySerGluAsnLeuLysSerLeuPhe
AAATTCTTACAGTTTTAGATCCAATGGTACCGACAGGTTCAGAAAATTTAAAAAGTCTTT * * *
* * * AsnThrValCysValIleTrpCysIleHisAlaGluGluLysValLysAspThrGluGly
TTAATACTGTCTGCGTCATTTGGTGCATACACGCAGAAGAGAAAGTGAAAGATACTGAAG * 800
* * * *
AlaLysGlnIleValArgArgHisLeuValAlaGluThrGlyThrAlaGluLysMetPro
GAGCAAAACAAATAGTGCGGAGACATCTAGTGGCAGAAACAGGAACTGCAGACAAAATGC * * *
* * 900
SerThrSerArgProThrAlaProSerSerGluLysGlyGlyAsnTyrProValGlnHis
CAAGCACAAGTAGACCAACAGCACCATCTAGCGAGAAGGGAGGAAATTACCCAGTGCAAC * * *
* * * ValGlyGlyAsnTyrThrHisIleProLeuSerProArgThrLeuAsnAlaTrpValLys
ATGTAGGCGGCAACTACACCCATATACCGCTGAGTCCCCGAACCCTAAATGCCTGGGTAA * * *
1000 * *
LeuValGluGluLysLysPheGlyAlaGluValValProGlyPheGlnAlaLeuSerGlu
AATTAGTAGAGGAAAAAAAGTTCGGGGCAGAAGTAGTGCCAGGATTTCAGGCACTCTCAG * * *
* * * GlyCysThrProTyrAspIleAsnGlnMetLeuAsnCysValGlyAspHisGlnAlaAla
AAGGCTGCACGCCCTATGATATCAACCAAATGCTTAATTGTGTGGGCGACCATCAAGCAG * 1100
* * * *
MetGlnIleIleArgGluIleIleAsnGluGluAlaAlaGluTrpAspValGlnHisPro
CCATGCAGATAATCAGGGAGATTATCAATGAGGAAGCAGCAGAATGGGATGTGCAACATC * * *
* * 1200
IleProGlyProLeuProAlaGlyGlnLeuArgGluProArgGlySerAspIleAlaGly
CAATACCAGGCCCCTTACCAGCGGGGCAGCTTAGAGAGCCAAGGGGATCTGACATAGCAG * * *
* * * ThrThrSerThrValGluGluGlnIleGlnTrpMetPheArgProGlnAsnProValPro
CGACAACAAGCACAGTAGAAGAACAGATCCAGTGGATGTTTAGGCCACAAAATCCTGTAC * * *
1300 * *
ValGlyAsnIleTyrArgArgTrpIleGlnIleGlyLeuGlnLysCysValArgMetTyr
CAGTAGGAAACATCTATAGAAGATGGATCCAGATAGGATTGCAGAAGTGTGTCAGGATGT * * *
* * * AsnProThrAsnIleLeuAspIleLysGlnGlyProLysGluProPheGlnSerTyrVal
ACAACCCGACCAACATCCTAGACATAAAACAGGGACCAAAGGAGCCGTTCCAAAGCTATG * 1400
* * * *
AspArgPheTyrLysSerLeuArgAlaGluGlnThrAspProAlaValLysAsnTrpMet
TAGATAGATTCTACAAAAGCTTGAGGGCAGAACAAACAGATCCAGCAGTGAAGAATTGGA * * *
* * 1500
ThrGlnThrLeuLeuValGlnAsnAlaAsnProAspCysLysLeuValLeuLysGlyLeu
TGACCCAAACACTGCTAGTACAAAATGCCAACCCAGACTGTAAATTAGTGCTAAAAGGAC * * *
* * * GlyMetAsnProThrLeuGluGluMetLeuThrAlaCysGlnGlyValGlyGlyProGly
TAGGGATGAACCCTACCTTAGAAGAGATGCTGACCGCCTGTCAGGGGGTAGGTGGGCCAG * * *
1600 * *
GlnLysAlaArgLeuMetAlaGluAlaLeuLysGluValIleGlyProAlaProIlePro
GCCAGAAAGCTAGATTAATGGCAGAGGCCCTGAAAGAGGTCATAGGACCTGCCCCTATCC * * *
* * * PheAlaAlaAlaGlnGlnArgLysAlaPheLysCysTrpAsnCysGlyLysGluGlyHis
CATTCGCAGCAGCCCAGCAGAGAAAGGCATTTAAATGCTGGAACTGTGGAAAGGAAGGGC * 1700
* * * *
SerAlaArgGlnCysArgAlaProArgArgGlnGlyCysTrpLysCysGlyLysProGly
ACTCGGCAAGACAATGCCGAGCACCTAGAAGGCAGGGCTGCTGGAAGTGTGGTAAGCCAG * * *
* * 1800 ThrGlyArgPhePheArgThrGlyProLeuGly
HisIleMetThrAsnCysProAspArgGlnAlaGlyPheLeuGlyLeuGlyProTrpGly
GACACATCATGACAAACTGCCCAGATAGACAGGCAGGTTTTTTAGGACTGGGCCCTTGGG * * *
* * * LysGluAlaProGlnLeuProArgGlyProSerSerAlaGlyAlaAspThrAsnSerThr
LysLysProArgAsnPheProValAlaGlnValProGlnGlyLeuThrProThrAlaPro
GAAAGAAGCCCCGCAACTTCCCCGTGGCCCAAGTTCCGCAGGGGCTGACACCAACAGCAC * * *
1900 * *
ProSerGlySerSerSerGlySerThrGlyGluIleTyrAlaAlaArgGluLysThrGlu
ProValAspProAlaValAspLeuLeuGluLysTyrMetGlnGlnGlyLysArgGlnArg
CCCCAGTGGATCCAGCAGTGGATCTACTGGAGAAATATATGCAGCAAGGGAAAAGACAGA * * *
* * * ArgAlaGluArgGluThrIleGlnGlySerAspArgGlyLeuThrAlaProArgAlaGly
GluGlnArgGluArgProTyrLysGluValThrGluAspLeuLeuHisLeuGluGlnGly
GAGAGCAGAGAGAGAGACCATACAAGGAAGTGACAGAGGACTTACTGCACCTCGAGCAGG * 2000
* * * *
GlyAspThrIleGlnGlyAlaThrAsnArgGlyLeuAlaAlaProGlnPheSerLeuTrp
GluThrProTyrArgGluProProThrGluAspLeuLeuHisLeuAsnSerLeuPheGly
GGGAGACACCATACAGGGAGCCACCAACAGAGGACTTGCTGCACCTCAATTCTCTCTTTC * * *
* * 2100
LysArgProValValThrAlaTyrIleGluGlyGlnProValGluValLeuLeuAspThr
LysAspGln GAAAAGACCAGTAGTCACAGCATACATTGAGGGTCAGCCAGTAGAAGTCTTGTTAGACAC
* * * * * *
GlyAlaAspAspSerIleValAlaGlyIleGluLeuGlyAsnAsnTyrSerProLysIle
AGGGGCTGACGACTCAATAGTAGCAGGAATAGAGTTAGGGAACAATTATAGCCCAAAAAT * * *
2200 * *
ValGlyGlyIleGlyGlyPheIleAsnThrLysGluTyrLysAsnValGluIleGluVal
AGTAGGGGGAATAGGGGGATTCATAAATACCAAGGAATATAAAAATGTAGAAATAGAAGT * * *
* * * LeuAsnLysLysValArgAlaThrIleMetThrGlyAspThrProIleAsnIlePheGly
TCTAAATAAAAAGGTACGGGCCACCATAATGACAGGCGACACCCCAATCAACATTTTTGG * 2300
* * * *
ArgAsnIleLeuThrAlaLeuGlyMetSerLeuAsnLeuProValAlaLysValGluPro
CAGAAATATTCTGACAGCCTTAGGCATGTCATTAAATCTACCAGTCGCCAAAGTAGAGCC * * *
* * 2400
IleLysIleMetLeuLysProGlyLysAspGlyProLysLeuArgGlnTrpProLeuThr
AATAAAAATAATGCTAAAGCCAGGGAAAGATGGACCAAAACTGAGACAATGGCCCTTAAC * * *
* * * LysGluLysIleGluAlaLeuLysGluIleCysGluLysMetGluLysGluGlyGlnLeu
AAAAGAAAAAATAGAAGCACTAAAAGAAATCTGTGAAAAAATGGAAAAAGAAGGCCAGCT * * *
2500 * *
GluGluAlaProProThrAsnProTyrAsnThrProThrPheAlaIleLysLysLysAsp
AGAGGAAGCACCTCCAACTAATCCTTATAATACCCCCACATTTGCAATCAAGAAAAAGGA * * *
* * * LysAsnLysTrpArgMetLeuIleAspPheArgGluLeuAsnLysValThrGlnAspPhe
CAAAAACAAATGGAGGATGCTAATAGATTTCAGAGAACTAAACAAGGTAACTCAAGATTT * 2600
* * * *
ThrGluIleGlnLeuGlyIleProHisProAlaGlyLeuAlaLysLysArgArgIleThr
CACAGAAATTCAGTTAGGAATTCCACACCCAGCAGGGTTGGCCAAGAAGAGAAGAATTAC * * *
* * 2700
ValLeuAspValGlyAspAlaTyrPheSerIleProLeuHisGluAspPheArgProTyr
TGTACTAGATGTAGGGGATGCTTACTTTTCCATACCACTACATGAGGACTTTAGACCATA * * *
* * * ThrAlaPheThrLeuProSerValAsnAsnAlaGluProGlyLysArgTyrIleTyrLys
TACTGCATTTACTCTACCATCAGTGAACAATGCAGAACCAGGAAAAAGATACATATATAA * * *
2800 * *
ValLeuProGlnGlyTrpLysGlySerProAlaIlePheGlnHisThrMetArgGlnVal
AGTCTTGCCACAGGGATGGAAGGGATCACCAGCAATTTTTCAACACACAATGAGACAGGT * * *
* * * LeuGluProPheArgLysAlaAsnLysAspValIleIleIleGlnTyrMetAspAspIle
ATTAGAACCATTCAGAAAAGCAAACAAGGATGTCATTATCATTCAGTACATGGATGATAT * 2900
* * * *
LeuIleAlaSerAspArgThrAspLeuGluHisAspArgValValLeuGlnLeuLysGlu
CTTAATAGCTAGTGACAGGACAGATTTAGAACATGATAGGGTAGTCCTGCAGCTCAAGGA * * *
* * 3000
LeuLeuAsnGlyLeuGlyPheSerThrProAspGluLysPheGlnLysAspProProTyr
ACTTCTAAATGGCCTAGGATTTTCTACCCCAGATGAGAAGTTCCAAAAAGACCCTCCATA * * *
* * * HisTrpMetGlyTyrGluLeuTrpProThrLysTrpLysLeuGlnLysIleGlnLeuPro
CCACTGGATGGGCTATGAACTATGGCCAACTAAATGGAAGTTGCAGAAAATACAGTTGCC * * *
3100 * *
GlnLysGluIleTrpThrValAsnAspIleGlnLysLeuValGlyValLeuAspTrpAla
CCAAAAAGAAATATGGACAGTCAATGACATCCAGAAGCTAGTCGGTGTCCTAAATTGGGC * * *
* * * AlaGlnLeuTyrProGlyIleLysThrLysHisLeuCysArgLeuIleArgGlyLysMet
AGCACAACTCTACCCAGGGATAAAGACCAAACACTTATGTAGGTTAATCAGAGGAAAAAT * 3200
* * * *
ThrLeuThrGluGluValGlnTrpThrGluLeuAlaGluAlaGluLeuGluGluAsnArg
GACACTCACAGAAGAAGTACAGTGGACAGAATTACCAGAAGCAGAGCTAGAAGAAAACAG * * *
* * 3300
IleIleLeuSerGlnGluGlnGluGlyHisTyrTyrGlnGluGluLysGluLeuGluAla
AATTATCCTAAGCCAGGAACAAGAGGGACACTATTACCAAGAAGAAAAAGAGCTAGAAGC * * *
* * * ThrValGlnLysAspGlnGluAsnGlnTrpThrTyrLysIleHisGlnGluGluLysIle
AACAGTCCAAAAGGATCAAGAGAATCAGTGGACATATAAAATACACCAGGAAGAAAAAAT * * *
3400 * *
LeuLysValGlyLysTyrAlaLysValLysAsnThrHisThrAspGlyIleArgLeuLeu
TCTAAAAGTAGGAAAATATGCAAAGGTGAAAAACACCCATACCAATGGAATCAGATTGTT * * *
* * * AlaGlnValValGlnLysIleGlyLysGluAlaLeuValIleTrpGlyArgIleProLys
AGCACAGGTAGTTCAGAAAATAGGAAAAGAAGCACTAGTCATTTGGGGACCAATACCAAA * 3500
* * * *
PheHisLeuProValGluArgGluIleTrpGluGlnTrpTrpAspAsnTyrTrpGlnVal
ATTTCACCTACCAGTAGAGAGAGAAATCTGGGAGCAGTGGTGGGATAACTACTGGCAAGT * * *
* * 3600
ThrTrpIleProAspTrpAspPheValSerThrProProLeuValArgLeuAlaPheAsn
GACATCCATCCCACACTGGGACTTCGTGTCTACCCCACCACTGGTCAGGTTAGCGTTTAA * * *
* * * LeuValGlyAspProIleProGlyAlaGluThrPheTyrThrAspGlySerCysAsnArg
CCTGGTAGGGGATCCTATACCAGGTGCAGAGACCTTCTACACAGATGGATCCTGCAATAG * * *
3700 * *
GlnSerLysGluGlyLysAlaGlyTyrValThrAspArgGlyLysAspLysValLysLys
GCAATCAAAAGAAGCAAAAGCAGGATATGTAACAGATAGAGGGAAAGACAAGGTAAAGAA * * *
* * * LeuGluGlnThrThrAsnGlnGlnAlaGluLeuGluAlaPheAlaMetAlaLeuThrAsp
ACTAGAGCAAACTACCAATCAGCAAGCAGAACTAGAAGCCTTTGCGATGGCACTAACAGA * 3800
* * * *
SerGlyProLysValAsnIleIleValAspSerGlnTyrValMetGlyIleSerAlaSer
CTCGGGTCCAAAAGTTAATATTATAGTAGACTCACAGTATGTAATGGGGATCAGTGCAAG * * *
* * 3900
GlnProThrGluSerGluSerLysIleValAsnGlnIleIleGluGluMetIleLysLys
CCAACCAACAGAGTCACAAAGTAAAATAGTGAACCAGATCATAGAAGAAATGATAAAAAA * * *
* * * GluAlaIleTyrValAlaTrpValProAlaHisLysGlyIleGlyGlyAsnGlnGluVal
GGAAGCAATCTATGTTGCATGGGTCCCAGCCCACAAAGGCATAGGGGGAAACCAGGAAGT * * *
4000 * *
AspHisLeuValSerGlnGlyIleArgGlnValLeuPheLeuGluLysIleGluProAla
AGATCATTTAGTGAGTCAGGGTATCAGACAAGTGTTGTTCCTGGAAAAAATAGAGCCCGC * * *
* * * GlnGluGluHisGluLysTyrHisSerAsnValLysGluLeuSerHisLysPheGlyIle
TCAGGAAGAACATGAAAAATATCATAGCAATGTAAAAGAACTGTCTCATAAATTTGGAAT * 4100
* * * *
ProAsnLeuValAlaArgGlnIleValAsnSerCysAlaGlnCysGlnGlnLysGlyGlu
ACCCAATTTAGTGGCAAGGCAAATAGTAAACTCATGTGCCCAATGTCAACAGAAAGGGGA * * *
* * 4200
AlaIleHisGlyGlnValAsnAlaGluLeuGlyThrTrpGlnMetAspCysThrHisLeu
AGCTATACATGGGCAAGTAAATGCAGAACTAGGCACTTGGCAAATGGACTGCACACATTT * * *
* * * GluGlyLysIleIleIleValAlaValHisValAlaSerGlyPheIleGluAlaGluVal
AGAAGGAAAGATCATTATAGTAGCAGTACATGTTGCAAGTGGATTTATAGAAGCAGAAGT * * *
4300 * *
IleProGlnGluSerGlyArgGlnThrAlaLeuPheLeuLeuLysLeuAlaSerArgTrp
CATCCCACAGGAATCAGGAAGACAAACAGCACTCTTCCTATTGAAACTGGCAAGTAGGTG * * *
* * * ProIleThrHisLeuHisThrAspAsnGlyAlaAsnPheThrSerGlnGluValLysMet
GCCAATAACACACTTGCATACAGATAATGGTGCCAACTTCACTTCACAGGAGGTGAAGAT * 4400
* * * *
ValAlaTrpTrpIleGlyIleGluGlnSerPheGlyValProTyrAsnProGlnSerGln
GGTAGCATGGTGGATAGGTATAGAACAATCCTTTGGAGTACCTTACAATCCACAGAGCCA * * *
* * 4500
GlyValValGluAlaMetAsnHisHisLeuLysAsnGlnIleSerArgIleArgGluGln
AGGAGTAGTACAAGCAATGAATCACCATCTAAAAAACCAAATAAGTAGAATCAGAGAACA * * *
* * * AlaAsnThrIleGluThrIleValLeuMetAlaIleHisCysMetAsnPheLysArgArg
GGCAAATACAATAGAAACAATAGTACTAATGGCAATTCATTGCATGAATTTTAAAAGAAG * * *
4600 * *
GlyGlyIleGlyAspMetThrProSerGluArgLeuIleAsnMetIleThrThrGluGln
GGGGGGAATAGGGGATATGACTCCATCAGAAAGATTAATCAATATGATCACCACAGAACA * * *
* * * GluIleGlnPheLeuGlnAlaLysAsnSerLysLeuLysAspPheArgValTyrPheArg
AGAGATACAATTCCTCCAAGCCAAAAATTCAAAATTAAAAGATTTTCGGGTCTATTTCAG * 4700
* * * *
GluGlyArgAspGlnLeuTrpLysGlyProGlyGluLeuLeuTrpLysGlyGluGlyAla
AGAAGGCAGAGATCAGTTGTGGAAAGGACCTGGGGAACTACTGTGGAAAGGAGAAGGAGC * * *
* * 4800
ValLeuValLysValGlyThrAspIleLysIleIleProArgArgLysAlaLysIleIle
AGTCCTAGTCAAGGTAGGAACAGACATAAAAATAATACCAAGAAGGAAAGCCAAGATCAT * * *
* * * ArgAspTyrGlyGlyArgGlnGluMetAspSerGlySerHisLeuGluGlyAlaArgGlu
MetGluGluAspLysArgTrpIleValValProThrTrpArgValProGlyArg
CAGAGACTATGGAGGAAGACAAGAGATGGATAGTGGTTCCCACCTGGAGGGTGCCAGGGA * * *
4900 * * AspGlyGluMetAla
MetGluLysTrpHisSerLeuValLysTyrLeuLysTyrLysThrLysAspLeuGluLys
GGATGGAGAAATGGCATAGCCTTGTCAAGTATCTAAAATACAAAACAAAGGATCTAGAAA * *
* * * * ValCysTyrValProHisHisLysValGlyTrpAlaTrpTrpThrCysSerArgValIle
AGGTGTGCTATGTTCCCCACCATAAGGTGGGATGGGCATGGTCGACTTGCAGCAGGGTAA
* 5000 * * * *
PheProLeuLysGlyAsnSerHisLeuGluIleGlnAlaTyrTrpAsnLeuThrProGlu
TATTCCCATTAAAAGGAAACAGTCATCTAGAGATACAGGCATATTGGAACTTAACACCAG
* * * * * 5100
LysGlyTrpLeuSerSerTyrSerValArgIleThrTrpTyrThrGluLysPheTrpThr
AAAAAGGATGGCTCTCCTCTTATTCAGTAAGAATAACTTGGTACACAGAAAAGTTCTGGA * * *
* * * AspValThrProAspCysAlaAspValLeuIleHisSerThrTyrPheProCysPheThr
CAGATGTTACCCCAGACTGTGCAGATGTCCTAATACATAGCACTTATTTCCCTTGCTTTA * * *
5200 * *
AlaGlyGluValArgArgAlaIleArgGlyGluLysLeuLeuSerCysCysAsnTyrPro
CAGCAGGTGAAGTAAGAAGAGCCATCAGAGGGGAAAAGTTATTGTCCTGCTGCAATTATC * * *
* * * ArgAlaHisArgAlaGlnValProSerLeuGlnPheLeuAlaLeuValValValGlnGln
CCCGAGCTCATAGAGCCCAGGTACCGTCACTTCAATTTCTGGCCTTAGTGGTAGTGCAAC * 5300
* * * * MetThrAspProArgGluThrValProProGlyAsnSerGlyGluGluThrIleGly
AsnAspArgProGlnArgAspSerThrThrArgLysGlnArgArgArgAspTyrArgArg
AAAATGACAGACCCCAGAGAGACAGTACCACCAGGAAACAGCGGCGAAGAGACTATCGGA * * *
* * 5400
GluAlaPheAlaTrpLeuAsnArgThrValGluAlaIleAsnArgGluAlaValAsnHis
GlyLeuArgLeuAlaLysGlnAspSerArgSerHisLysGlnArgSerSerGluSerPro
GAGGCCTTCGCCTGGCTAAACAGGACAGTAGAAGCCATAAACAGAGAAGCAGTGAATCAC * * *
* * * LeuProArgGluLeuIlePheGlnValTrpGlnArgSerTrpArgTyrTrpHisAspGlu
ThrProArgThrTyrPheProGlyValAlaGluValLeuGluIleLeuAla
CTACCCCGAGAACTTATTTTCCAGGTGTGGCAGAGGTCCTGGAGATACTGGCATGATGAA * * *
5500 * *
GlnGlyMetSerGluSerTyrThrLysTyrArgTyrLeuCysIleIleGlnLysAlaVal
CAAGGGATGTCAGAAAGTTACACAAAGTATAGATATTTGTGCATAATACAGAAAGCAGTG * * *
* * * TyrMetHisValArgLysGlyCysThrCysLeuGlyArgGlyHisGlyProGlyGlyTrp
TACATGCATGTTAGGAAAGGGTGTACTTGCCTGGGGAGGGGACATGGGCCAGGAGGGTGG * 5600
* * * * ArgProGlyProProProProProProProGlyLeuVal
MetAlaGluAlaProThrGlu
AGACCAGGGCCTCCTCCTCCTCCCCCTCCAGGTCTGGTCTAATGGCTGAAGCACCAACAG * * * * * 5700
LeuProProValAspGlyThrProLeuArgGluProGlyAspGluTrpIleIleGluIle
AGCTCCCCCCGGTGGATGGGACCCCACTGAGGGAGCCAGGGGATGAGTGGATAATAGAAA * *
* * * * LeuArgGluIleLysGluGluAlaLeuLysHisPheAspProArgLeuLeuIleAlaLeu
TCTTGAGAGAAATAAAAGAAGAAGCTTTAAAGCATTTTGACCCTCGCTTGCTAATTGCTC * *
* 5800 * * MetGluThrProLeuLysAlaProGluSerSerLeu
GlyLysTyrIleTyrThrArgHisGlyAspThrLeuGluGlyAlaArgGluLeuIleLys
TTGGCAAATATATCTATACTAGACATGGAGACACCCTTGAAGGCGCCAGAGAGCTCATTA * *
* * * *
LysSerCysAsnGluProPheSerArgThrSerGluGlnAspValAlaThrGlnGluLeu
ValLeuGlnArgAlaLeuPheThrHisPheArgAlaGlyCysGlyHisSerArgIleGly
AAGTCCTGCAACGAGCCCTTTTCACGCACTTCAGAGCAGGATGTGGCCACTCAAGAATTG * 5900
* * * *
AlaArgGlnGlyGluGluIleLeuSerGlnLeuTyrArgProLeuGluThrCysAsnAsn
GlnThrArgGlyGlyAsnProLeuSerAlaIleProThrProArgAsnMetGln
GCCAGACAAGGGGAGGAAATCCTCTCTCAGCTATACCGACCCCTAGAAACATGCAATAAC * * *
* * 6000
SerCysTyrCysLysArgCysCysTyrHisCysGlnMetCysPheLeuAsnLysGlyLeu
TCATGCTATTGTAAGCCATGCTGCTACCATTGTCAGATGTGTTTTCTAAACAAGGGGCTC * * *
* * * GlyIleCysTyrGluArgLysGlyArgArgArgArgThrProLysLysThrLysThrHis
MetAsnGluArgAlaAspGluGluGlyLeuGlnArgLysLeuArgLeuIle
GGGATATGTTATGAACGAAAGGGCAGACGAAGAAGGACTCCAAAGAAAACTAAGACTCAT * * *
6100 * * ProSerProThrProAspLys ArgLeuLeuHisGlnThr
MetMetAsnGlnLeuLeuIleAlaIleLeuLeuAla
CCGTCTCCTACACCAGACAAGTGAGTATGATGAATCAGCTGCTTATTGCCATTTTATTAG * * *
* * * SerAlaCysLeuValTyrCysThrGlnTyrValThrValPheTyrGlyValProThrTrp
CTAGTGCTTGCTTACTATATTGCACCCAATATGTAACTGTTTTCTATGGCGTACCCACCT * 6200
* * * *
LysAsnAlaThrIleProLeuPheCysAlaThrArgAsnArgAspThrTrpGlyThrIle
GGAAAAATGCAACCATTCCCCTCTTTTGTGCAACCAGAAATAGGGATACTTGGGGAACCA * * *
* * 6300
GlnCysLeuProAspAsnAspAspTyrGlnGluIleThrLeuAsnValThrGluAlaPhe
TACAGTGCTTGCCTGACAATGATGATTATCAGGAAATAACTTTGAATGTAACAGAGGCTT * * *
* * * AspAlaTrpAsnAsnThrValThrGluGlnAlaIleGluAspValTrpHisLeuPheGlu
TTGATGCATGGATAATACAGTAACAGAACAAGCAATAGAAGATGTGTGGCATCTATTCG * * *
6400 * *
ThrSerIleLysProCysValLysLeuThrProLeuCysValAlaMetLysCysSerSer
AGACATCAATAAAACCATGTGTCAAACTAACACCTTTATGTGTAGCAATGAAATGCAGGA * * *
* * ThrGluSerSerThrGlyAsnAsnThrThrSerLysSerThrSerThrThrThrThrThr
GCACAGAGAGCAGCACAGGGAACAACACAACCTCAAAGAGCACAAGCACAACCACAACCA * 6500
* * * ProThrAspGlnGluGlnGluIleSerGluAspThrProCysAlaArgAlaAspAsnCys
CACCCACAGACCAGGAGCAAGAGATAAGTGAGGATACTCCATGCGCACGCGCAGACAACT * * *
* * 6600
SerGlyLeuGlyGluGluGluThrIleAsnCysGlnPheAsnMetThrGlyLeuGluArg
GCTCAGGATTGGGAGAGGAAGAAACGATCAATTGCCAGTTCAATATGACAGGATTAGAAA * * *
* * AspLysLysLysGlnTyrAsnGluThrTrpTyrSerLysAspValValCysGluThrAsn
GAGATAAGAAAAAACAGTATAATGAAACATGGTACTCAAAAGATGTGGTTTGTGAGACAA * * *
6700 * AsnSerThrAsnGlnThrGlnCysTyrMetAsnHisCysAsnThrSerValIleThrGlu
ATAATAGCACAAATCAGACCCAGTGTTACATGAACCATTGCAACACATCAGTCATCACAG * * *
* * SerCysAspLysHisTyrTrpAspAlaIleArgPheArgTyrCysAlaProProGlyTyr
AATCATGTGACAAGCACTATTGGGATGCTATAAGGTTTAGATACTGTGCACCACCGGGTT * 6800
* * * AlaLeuLeuArgCysAsnAspThrAsnTyrSerGlyPheAlaProAsnCysSerLysVal
ATGCCCTATTAAGATGTAATGATACCAATTATTCAGGCTTTGCACCCAACTGTTCTAAAG * * *
* 6900 ValAlaSerThrCysThrArgMetMetGluThrGlnThrSerThrTrpPheGlyPheAsn
TAGTAGCTTCTACATGCACCAGGATGATGGAAACGCAAACTTCCACATGGTTTGGCTTTA * * *
* * GlyThrArgAlaGluAsnArgThrTyrIleTyrTrpHisGlyArgAspAsnArgThrIle
ATGGCACTAGAGCAGAGAATAGAACATATATCTATTGGCATGGCAGAGATAATAGAACTA * * *
7000 * IleSerLeuAsnLysTyrTyrAsnLeuSerLeuHisCysLysArgProGlyAsnLysThr
TCATCAGCTTAAACAAATATTATAATCTCAGTTTGCATTGTAAGAGGCCAGGGAATAAGA * * *
* * ValLysGlnIleMetLeuMetSerGlyHisValPheHisSerHisTyrGlnProIleAsn
CAGTGAAACAAATAATGCTTATGTCAGGACATGTGTTTCACTCCCACTACCAGCCGATCA * 7100
* * * LysArgProArgGlnAlaTrpCysTrpPheLysGlyLysTrpLysAspAlaMetGlnGlu
ATAAAAGACCCAGACAAGCATGGTGCTGGTTCAAAGGCAAATGGAAAGACGCCATGCAGG * * *
* 7200 ValLysGluThrLeuAlaLysHisProArgTyrArgGlyThrAsnAspThrArgAsnIle
AGGTGAAGGAAACCCTTGCAAAACATCCCAGGTATAGAGGAACCAATGACACAAGGAATA * * *
* * SerPheAlaAlaProGlyLysGlySerAspProGluValAlaTyrMetTrpThrAsnCys
TTAGCTTTGCAGCGCCAGGAAAAGGCTCAGACCCAGAAGTAGCATACATGTGGACTAACT * * *
7300 * ArgGlyGluPheLeuTyrCysAsnMetThrTrpPheLeuAsnTrpIleGluAsnLysThr
GCAGAGGAGAGTTTCTCTACTGCAACATGACTTGGTTCCTCAATTGGATAGAGAATAAGA * * *
* * HisArgAsnTyrAlaProCysHisIleLysGlnIleIleAsnThrTrpHisLysValGly
CACACCGCAATTATGCACCGTGCCATATAAAGCAAATAATTAACACATGGCATAAGGTAG * 7400
* * * ArgAsnValTyrLeuProProArgGluGlyGluLeuSerCysAsnSerThrValThrSer
GGAGAAATGTATATTTGCCTCCCAGGGAAGGGGAGCTGTCCTGCAACTCAACAGTAACCA * * *
* * 7500
IleIleAlaAsnIleAspTrpGlnAsnAsnAsnGlnThrAsnIleThrPheSerAlaGlu
GCATAATTGCTAACATTGACTGGCAAAACAATAATCAGACAAACATTACCTTTAGTGCAG * * *
* * ValAlaGluLeuTyrArgLeuGluLeuGlyAspTyrLysLeuValGluIleThrProIle
AGGTGGCAGAACTATACAGATTGGAGTTGGGAGATTATAAATTGGTAGAAATAACACCAA * * *
7600 * GlyPheAlaProThrLysGluLysArgTyrSerSerAlaHisGlyArgHisThrArgGly
TTGGCTTCGCACCTACAAAAGAAAAAAGATACTCCTCTGCTCACGGGAGACATACAAGAG * * *
* * ValPheValLeuGlyPheLeuGlyPheLeuAlaThrAlaGlySerAlaMetGlyAlaAla
GTGTGTTCGTGCTAGGGTTCTTGGGTTTTCTCGCAACAGCAGGTTCTGCAATGGGCGCGG * 7700
* * * SerLeuThrValSerAlaGlnSerArgThrLeuLeuAlaGlyIleValGlnGlnGlnGln
CGTCCCTGACCGTGTCGGCTCAGTCCCGGACTTTACTGGCCGGGATAGTGCAGCAACAGC * * *
* 7800 GlnLeuLeuAspValValLysArgGlnGlnGluLeuLeuArgLeuThrValTrpGlyThr
AACAGCTGTTGGACGTGGTCAAGAGACAACAAGAACTGTTGCGACTGACCGTCTGGGGAA * * *
* * LysAsnLeuGlnAlaArgValThrAlaIleGluLysTyrLeuGlnAspGlnAlaArgLeu
CGAAAAACCTCCAGGCAAGAGTCACTGCTATAGAGAAGTACCTACAGGACCAGGCGCGGC * * *
7900 * AsnSerTrpGlyCysAlaPheArgGlnValCysHisThrThrValProTrpValAsnAsp
TAAATTCATGGGGATGTGCGTTTAGACAAGTCTGCCACACTACTGTACCATGGGTTAATG * * *
* * SerLeuAlaProAspTrpAspAsnMetThrTrpGlnGluTrpGluLysGlnValArgTyr
ATTCCTTAGCACCTGACTGGGACAATATGACGTGGCAGGAATGGGAAAAACAAGTCCGCT * *
8000 * *
LeuGluAlaAsnIleSerLysSerLeuGluGlnAlaGlnIleGlnGlnGluLysAsnMet
ACCTGGAGGCAAATATCAGTAAAAGTTTAGAACAGGCACAAATTCAGCAAGAGAAAAATA * * *
* 8100 TyrGluLeuGlnLysLeuAsnSerTrpAspIlePheGlyAsnTrpPheAspLeuThrSer
TGTATGAACTACAAAAATTAAATAGCTGGGATATTTTTGGCAATTGGTTTGACTTAACCT * * *
* * TrpValLysTyrIleGlnTyrGlyValLeuIleIleValAlaValIleAlaLeuArgIle
CCTGGGTCAAGTATATTCAATATGGAGTGCTTATAATAGTAGCAGTAATAGCTTTAAGAA * * *
8200 * ValIleTyrValValGlnMetLeuSerArgLeuArgLysGlyTyrArgProValPheSer
TAGTGATATATGTAGTACAAATGTTAAGTAGGCTTAGAAAGGGCTATAGGCCTGTTTTCT * * *
* * SerIleSerThrArgThrGlyAspSerGlnPro
AsnProTyrProGlnGlyProGlyThrAlaSerGln
SerProProGlyTyrIleGlnGlnIleHisIleHisLysAspArgGlyGlnProAlaAsn
CTTCCCCCCCCGGTTATATCCAACAGATCCATATCCACAAGGACCGGGGACAGCCAGCCA * 8300
* * * ThrLysLysGlnLysLysThrValGluAlaThrValGluThrAspThrGlyProGlyArg
ArgArgAsnArgArgArgArgTrpLysGlnArgTrpArgGlnIleLeuAlaLeuAlaAsp
GluGluThrGluGluAspGlyGlySerAsnGlyGlyAspArgTyrTrpProTrpProIle
ACGAAGAAACAGAAGAAGACGGTGGAAGCAACGGTGGAGACAGATACTGGCCCTGGCCGA * * *
* 8400 SerIleTyrThrPheProAspProProAlaAspSerProLeuAspGlnThrIleGlnHis
AlaTyrIleHisPheLeuIleArgGlnLeuIleArgLeuLeuThrArgLeuTyrSerIle
TAGCATATATACATTTCCTGATCCGCCAGCTGATTCGCCTCTTGACCAGACTATACAGCA * * *
* * LeuGlnGlyLeuThrIleGlnGluLeuProAspProProThrHisLeuProGluSerGln
CysArgAspLeuLeuSerArgSerPheLeuThrLeuGlnLeuIleTyrGlnAsnLeuArg
TCTGCAGGGACTTACTATCCAGGAGCTTCCTGACCCTCCAACTCATCTACCAGAATCTCA * * *
8500 * ArgLeuAlaGluThr MetGlyAlaSerGlySerLysLys
AspTrpLeuArgLeuArgThrAlaPheLeuGlnTyrGlyCysGluTrpIleGlnGluAla
GAGACTGGCTGAGACTTAGAACAGCCTTCTTGCAATATGGGTGCGAGTGGATCCAAGAAG * * *
* * HisSerArgProProArgGlyLeuGlnGluArgLeuLeuArgAlaArgAlaGlyAlaCys
PheGlnAlaAlaAlaArgAlaThrArgGluThrLeuAlaGlyAlaCysArgGlnLeuTrp
CATTCCAGGCCGCCGCGAGGGCTACAAGAGAGACTCTTGCGGGCGCGTGCAGGGGCTTGT * 8600
* * * GlyGlyTyrTrpAsnGluSerGlyGlyGluTyrSerArgPheGlnGluGlySerAspArg
ArgValLeuGluArgIleGlyArgGlyIleLeuAlaValProArgArgIleArgGlnGly
GGAGGGTATTGGAACGAATCGGGAGGGGAATACTCGCGGTTCCAAGAAGGATCAGACAGG * * *
* 8700 GluGlnLysSerProSerCysGluGlyArgGlnTyrGlnGlnGlyAspPheMetAsnThr
AlaGluIleAlaLeuLeu
GAGCAGAAATCGCCCTCCTGTGAGGGACGGCAGTATCAGCAGGGAGACTTTATGAATACT * * * * *
ProTrpLysAspProAlaAlaGluArgGluLysAsnLeuTyrArgGlnGlnAsnMetAsp
CCATGGAAGGACCCAGCAGCAGAAAGGGAGAAAAATTTGTACAGGCAACAAAATATGGAT * *
* 8800 *
AspValAspSerAspAspAspAspGlnValArgValSerValThrProLysValProLeu
GATGTAGATTCAGATGATGATGACCAAGTAAGAGTTTCTGTCACACCAAAAGTACCACTGA * * *
* * ArgProMetThrHisArgLeuAlaIleAspMetSerHisLeuIleLysThrArgGlyGly
AGACCAATGACACATAGATTGGCAATAGATATGTCACATTTAATAAAAACAAGGGGGGA * 8900
* * * LeuGluGlyMetPheTyrSerGluArgArgHisLysIleLeuAsnIleTyrLeuGluLys
CTGGAAGGGATGTTTTACAGTGAAAGAAGACATAAAATCTTAAATATATACTTAGAAAAG * * *
* 9000 GluGluGlyIleIleAlaAspTrpGlnAsnTyrThrHisGlyProGlyValArgTyrPro
GAAGAAGGGATAATTGCAGATTGGCAGAACTACACTCATGGGCCAGGAGTAAGATACCCA * * *
* * MetPhePheGlyTrpLeuTrpLysLeuValProValAspValProGlnGluGlyGluAsp
ATGTTCTTTGGGTGGCTATGGAAGCTAGTACCAGTAGATGTCCCACAAGAAGGGGAGGAC * * *
9100 * ThrGluThrHisCysLeuValHisProAlaGlnThrSerLysPheAspAspProHisGly
ACTGAGACTCACTCCTTAGTACATCCAGCACAAACAAGCAAGTTTGATGACCCGCATGGG * * *
* * GluThrLeuValTrpGluPheAspProLeuLeuAlaTyrSerTyrGluAlaPheIleArg
GAGACACTAGTCTGGGAGTTTGATCCCTTGCTGGCTTATAGTTACGAGGCTTTTATTCCG * 9200
* * * TyrProGluGluPheGlyHisLysSerGlyLeuProGluGluGluTrpLysAlaArgLeu
TACCCAGAGGAATTTGGGCACAAGTCAGGCCTGCCAGAGGAAGAGTGGAAGGCGAGACTG * * *
* 9300 LysAlaArgGlyIleProPheSer
AAAGCAAGAGGAATACCATTTAGTTAAAGACAGGAACAGCTATACTTGGTCAGGGCAGGA * *
* * * AGTAACTAACAGAAACAGCTGAGACTGCAGGGACTTTCCAGAAGGGGCTGTAACCAAGGG * * * 9400 *
AGGGACATGGGAGGAGCTGGTGGGGAACGCCCTCATATTCTCTGTATAAATATACCCGCT * *
* * * AGCTTGCATTGTACTTCGGTCGCTCTGCGGAGAGGCTGGCAGATTGAGCCCTGGGAGGTT * 9500 * * *
CTCTCCAGCAGTAGCAGGTAGAGCCTGGGTGTTCCCTGCTAGACTCTCACCAGCACTTGG * *
* * 9600 CCGGTGCTGGGCAGACGGCCCCACGCTTGCTTGCTTAAAAACCTCCTTAATAAAGCTGCC * * * * *
AGTTAGAAGCA
Example 5
Sequences of the Coding Regions for the Envelope Protein and GAG
Product of the ROD HIV-2 Isolate
[0079] Through experimental analysis of the HIV-2 ROD isolate, the
following sequences were identified for the regions encoding the
env and gag gene products. One of ordinary skill in the art will
recognize that the numbering for both gene regions which follow
begins for convenience with "1" rather than the corresponding
number for its initial nucleotide as given in Example 4, above, in
the context of the complete genomic sequence.
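The relationship between the two numbering schemes is plain offset arithmetic. The following is a minimal sketch in Python, not part of the original disclosure. The function names and the `region_start` parameter are hypothetical, and the value 5000 is an arbitrary placeholder rather than the actual genomic start of env or gag, which must be read from Example 4.

```python
def local_to_genomic(local_pos: int, region_start: int) -> int:
    """Map a 1-based position inside a gene region to 1-based genomic numbering."""
    if local_pos < 1 or region_start < 1:
        raise ValueError("positions are 1-based and must be >= 1")
    return region_start + local_pos - 1


def genomic_to_local(genomic_pos: int, region_start: int) -> int:
    """Map a 1-based genomic position back to numbering that starts at 1."""
    local_pos = genomic_pos - region_start + 1
    if local_pos < 1:
        raise ValueError("genomic position lies before the region start")
    return local_pos


# Placeholder value for illustration only; NOT the real start of env or gag.
EXAMPLE_REGION_START = 5000

first = local_to_genomic(1, EXAMPLE_REGION_START)
round_trip = genomic_to_local(local_to_genomic(100, EXAMPLE_REGION_START),
                              EXAMPLE_REGION_START)
```

With this convention, nucleotide "1" of a region maps onto the region's own first genomic position, and the two functions are inverses of each other.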
[0080] Envelope Sequence
Envelope sequence
MetMetAsnGlnLeuLeuIleAlaIleLeuLeuAlaSerAlaCys
ATGATGAATCAGCTGCTTATTGCCATTTTATTAGCTAGTGCTTGC * * * * *
LeuValTyrCysThrGlnTyrValThrValPheTyrGlyValPro
TTACTATATTGCACCCAATATGTAACTGTTTTCTATGGCGTACCC * * * * *
ThrTrpLysAsnAlaThrIleProLeuPheCysAlaThrArgAsn
ACGTGGAAAAATGCAACCATTCCCCTGTTTTGTGCAACCAGAAAT 100 * * * *
ArgAspThrTrpGlyThrIleGlnCysLeuProAspAsnAspAsp
AGGGATACTTGCGGAACCATACAGTGCTTGCCTGACAATGATGAT * * * * *
TyrGlnGluIleThrLeuAsnValThrGluAlaPheAspAlaTrp
TATCAGGAAATAACTTTGAATGTAACAGAGCCTTTTGATGCATGG * 200 * * *
AsnAsnThrValThrGluGlnAlaIleGluAspValTrpHisLeu
AATAATACAGTAACAGAACAAGCAATAGAAGATGTCTGGCATCTA * * * * *
PheGluThrSerIleLysProCysValLysLeuThrProLeuCys
TTCGAGACATCAATAAAACCATGTGTCAAACTAACACCTTTATGT * * 300 * *
ValAlaMetLysCysSerSerThrGluSerSerThrGlyAsnAsn
GTAGCAATGAAATGCAGCAGCACAGAGAGCAGCACAGGGAACAAC * * * * *
ThrThrSerLysSerThrSerThrThrThrThrThrProThrAsp
ACAACCTCAAAGAGCACAAGCACAACCACAACCACACCCAGAGAC * * * 400 *
GlnGluGlnGluIleSerGluAspThrProCysAlaArgAlaAsp
CAGGAGCAAGAGATAAGTGAGGATACTCCATGCGCACGCGCAGAC * * * * *
AsnCysSerGlyLeuGlyGluGluGluThrIleAsnCysGlnPhe
AACTGCTCAGGATTGGGAGAGGAAGAAACGATCAATTGCCAGTTC * * * *
AsnMetThrGlyLeuGluArgAspLysLysLysGlnTyrAsnGlu
AATATGACAGGATTAGAAAGAGATAAGAAAAAAGACTATAATGAA 500 * * * *
ThrTrpTyrSerLysAspValValCysGluThrAsnAsnSerThr
ACATGGTACTCAAAAGATGTGGTTTGTGAGACAAATAATAGCACA * * * * *
AsnGlnThrGlnCysTyrMetAsnHisCysAsnThrSerValIle
AATCAGACCCAGTGTTACATGAACCATTGCAACACATCAGTCATC 600 * * * *
ThrGluSerCysAspLysHisTyrTrpAspAlaIleArgPheArg
ACAGAATCATGTGACAAGCACTATTGGGATGCTATAAGGTTTAGA * * * * *
TyrCysAlaProProGlyTyrAlaLeuLeuArgCysAsnAspThr
TACTGTGCACCACCGGGTTATGCCCTATTAAGATGTAATGATACC * * 700 * *
AsnTyrSerGlyPheAlaProAsnCysSerLysValValAlaSer
AATTATTCAGGCTTTGCACCCAACTGTTCTAAAGTAGTAGCTTCT * * * * *
ThrCysThrArgMetMetGluThrGlnThrSerThrTrpPheGly
ACATGCACCAGGATGATGGAAACGCAAACTTCCACATGGTTTGGG * * * 800 *
PheAsnGlyThrArgAlaGluAsnArgThrTyrIleTyrTrpHis
TTTAATGGCACTAGAGCAGAGAATAGAACATATATCTATTGGCAT * * * * *
GlyArgAspAsnArgThrIleIleSerLeuAsnLysTyrTyrAsn
GGCAGAGATAATAGAACTATCATCAGCTTAAACAAATATTATAAT * * * * 900
LeuSerLeuHisCysLysArgProGlyAsnLysThrValLysGln
CTCAGTTTGCATTGTAAGAGGCCAGGGAATAAGACAGTGAAACAA * * * *
IleMetLeuMetSerGlyHisValPheHisSerHisTyrGlnPro
ATAATGCTTATGTCAGGACATGTGTTTCACTCCCACTACCAGCCG * * * * *
IleAsnLysArgProArgGlnAlaTrpCysTrpPheLysGlyLys
ATCAATAAAAGACCCAGACAAGCATGGTGCTGGTTCAAAGGCAAA 1000 * * *
TrpLysAspAlaMetGlnGluValLysThrLeuAlaLysHisPro
TGGAAAGACGCCATGCAGGAGGTGAAGACCCTTGCAAAACATCCC * * * * *
ArgTyrArgGlyThrAsnAspThrArgAsnIleSerPheAlaAla
AGGTATAGAGGAACCAATGACACAAGGAATATTAGCTTTGCAGCG * 1100 * *
ProGlyLysGlySerAspProGluValAlaTyrMetTrpThrAsn
CCAGGAAAAGGCTCAGACCCAGAAGTAGCATACATGTGGACTAAG * * * * *
CysArgGlyGluPheLeuTyrCysAsnMetThrTrpPheLeuAsn
TGCAGAGGAGAGTTTCTCTACTGCAACATGACTTGGTTCCTCAAT * * 1200 *
TrpIleGluAsnLysThrHisArgAsnTyrAlaProCysHisIle
TGGATAGAGAATAAGACACACCGCAATTATGCACCGTGCCATATA * * * * *
LysGlnIleIleAsnThrTrpHisLysValGlyArgAsnValTyr
AAGCAAATAATTAACACATGGCATAAGGTAGGGAGAAATGTATAT * * * 1300
LeuProProArgGluGlyGluLeuSerCysAsnSerThrValThr
TTGCCTCCCAGGGAAGCGGAGCTGTCCTGCAACTCAACAGTAACC * * * * *
SerIleIleAlaAsnIleAspTrpGlnAsnAsnAsnGlnThrAsn
AGCATAATTGCTAACATTGACTGGCAAAACAATAATCAGACAAAC * * * *
IleThrPheSerAlaGluValAlaGluLeuTyrArgLeuGluLeu
ATTACCTTTAGTGCAGAGGTGGCAGAACTATACAGATTGGAGTTG 1400 * * * *
GlyAspTyrLysLeuValGluIleThrProIleGlyPheAlaPro
GGAGATTATAAATTGGTAGAAATAACACCAATTGGCTTCGCACCT * * * *
ThrLysGluLysArgTyrSerSerAlaHisGlyArgHisThrArg
ACAAAAGAAAAAAGATACTCCTCTGCTCACGGGAGACATACAAGA * 1500 * * *
GlyValPheValLeuGlyPheLeuGlyPheLeuAlaThrAlaGly
GGTGTGTTCGTGCTAGGGTTCTTGGGTTTTCTCGCAACAGCAGGT * * * *
SerAlaMetGlyAlaArgAlaSerLeuThrValSerAlaGlnSer
TCTGCAATGGGCGCTCGAGCGTCCCTGACCGTGTCGGCTCAGTCC * * 1600 * *
ArgThrLeuLeuAlaGlyIleValGlnGlnGlnGlnGlnLeuLeu
CGGACTTTACTGGCCGGGATAGTGCAGCAACAGCAACAGCTGTTG * * * *
AspValValLysArgGlnGlnGluLeuLeuArgLeuThrValTrp
GACGTGGTCAAGAGACAACAAGAACTGTTGCGACTGACCCTCTGG * * * 1700 *
GlyThrLysAsnLeuGlnAlaArgValThrAlaIleGluLysTyr
GGAACGAAAAACCTCCAGGCAAGAGTCACTGCTATAGAGAAGTAG * * * *
LeuGlnAspGlnAlaArgLeuAsnSerTrpGlyCysAlaPheArg
CTACAGGACCAGGCGCGGCTAAATTCATGGGGATGTGCGTTTAGA * * * * 1800
GlnValCysHisThrThrValProTrpValAsnAspSerLeuAla
CAAGTCTGCCACACTACTGTACCATGGGTTAATGATTCCTTAGCA * * * *
ProAspTrpAspAsnMetThrTrpGlnGluTrpGluLysGlnVal
CCTGACTGGGACAATATGACGTGGCAGGAATGGGAAAAACAAGTC * * * * *
ArgTyrLeuGluAlaAsnIleSerLysSerLeuGluGlnAlaGln
CGCTACCTGGAGGCAAATATCAGTAAAAGTTTAGAACAGGCACAA 1900 * * *
IleGlnGlnGluLysAsnMetTyrGluLeuGlnLysLeuAsnSer
ATTCAGCAAGAGAAAAATATGTATGAACTACAAAAATTAAATAGC * * * * *
TrpAspIlePheGlyAsnTrpPheAspLeuThrSerTrpValLys
TGGGATATTTTTGGCAATTGGTTTGACTTAACCTCCTGGGTCAAG * 2000 * *
TyrIleGlnTyrGlyValLeuIleIleValAlaValIleAlaLeu
TATATTCAATATGGAGTGCTTATAATAGTAGCAGTAATAGCTTTA * * * * *
ArgIleValIleTyrValValGlnMetLeuSerArgLeuArgLys
AGAATAGTGATATATGTAGTACAAATGTTAAGTAGGCTTAGAAAG * * 2100 *
GlyTyrArgProValPheSerSerProProGlyTyrIleGln***
GGCTATAGGCCTGTTTTCTCTTCCCCCCCCGGTTATATCCAATAG * * * * *
IleHisIleHisLysAspArgGlyGlnProAlaAsnGluGluThr
ATCCATATCCACAAGGACCGGGGACAGCCAGCCAACGAAGAAACA * * * 2200
GluGluAspGlyGlySerAsnGlyGlyAspArgTyrTrpProTrp
GAAGAAGACGGTGGAAGCAACGGTGGAGACAGATACTGGCCCTGG * * * * *
ProIleAlaTyrIleHisPheLeuIleArgGlnLeuIleArgLeu
GCGATAGCATATATACATTTCCTGATCCGCCAGCTGATTCGCCTC * * * *
LeuThrArgLeuTyrSerIleCysArgAspLeuLeuSerArgSer
TTGACCAGACTATACAGCATCTGCAGGGACTTACTATCCAGGAGC 2300 * * * *
PheLeuThrLeuGlnLeuIleTyrGlnAsnLeuArgAspTrpLeu
CTCCTGACCCTCCAACTCATCTACCAGAATCTCAGAGACTGGCTG * * * *
ArgLeuArgThrAlaPheLeuGlnTyrGlyCysGluTrpIleGln
AGACTTAGAACAGCCTTCTTGCAATATGGGTGCGAGTGGATCCAA * 2400 * * *
GluAlaPheGlnAlaAlaAlaArgAlaThrArgGluThrLeuAla
GAAGCATTCCAGGCCGCCGCGAGGGCTACAAGAGAGACTCTTGCG * * * *
GlyAlaCysArgGlyLeuTrpArgValLeuGluArgIleGlyArg
GGCGCGTGCAGGGGCTTGTGGAGGGTATTGGAACGAATCGGGAGG * * 2500 * *
GlyIleLeuAlaValProArgArgIleArgGlnGlyAlaGluIle
CGAATACTCGCGGTTCCAAGAAGGATCACAGAGGGAGCAGAAATC * * * *
AlaLeuLeu***GlyThrAlaValSerAlaGlyArgLeuTyrGlu
GCCCTCCTGTGAGGGACGGCAGTATCAGCAGGGAGACTTTATGAA * * * 2600 *
TyrSerMetGluGlyProSerSerArgLysGlyGluLysPheVal
TACTCCATGCAAGGACCCACCAGCAGAAAGGGAGAAAAATTTGTA * * * *
GlnAlaThrLysTyrGly
CAGGCAACAAAATATGGA * *
Gag sequence
MetGlyAlaArgAsnSerValLeuArgGlyLysLysAlaAspGlu
ATGGGCGCGAGAAACTCCGTCTTGAGAGGGAAAAAAGCAGATGAA * * * *
LeuGluArgIleArgLeuArgProGlyGlyLysLysLysTyrArg
TTAGAAAGAATCAGGTTACGGCCCGGGCGAAAGAAAAAGTACAGG * * * * *
LeuLysHisIleValTrpAlaAlaAsnLysLeuAspArgPheGly
CTAAAACATATTGTGTGGGCAGCGAATAAATTGGACAGATTCGGA 100 * * *
LeuAlaGluSerLeuLeuGluSerLysGluGlyCysGlnLysIle
TTAGCAGAGAGCCTGTTGGAGTCAAAAGACGGTTGTCAAAAAATT * * * *
LeuThrValLeuAspProMetValProThrGlySerGluAsnLeu
CTTACAGTTTTAGATCCAATGGTACCGACAGGTTCAGAAAATTTA * 200 * *
LysSerLeuPheAsnThrValCysValIleTrpCysIleHisAla
AAAAGTCTTTTTAATACTGTCTGCGTCATTTGGTGCATACACGCA * * * * *
GluGluLysValLysAspThrGluGlyAlaLysGlnIleValArg
GAAGAGAAAGTGAAAGATACTGAAGGAGCAAAACAAATAGTGCGG * * 300 *
ArgHisLeuValAlaGluThrGlyThrAlaGluLysMetProSer
AGACATCTAGTGGCAGAAACAGGAACTGCAGAGAAAATGCCAAGG * * * * *
ThrSerArgProThrAlaProSerSerGluLysGlyGlyAsnTyr
ACAAGTAGACCAACAGCACCATCTAGCGAGAAGGGAGGAAATTAC * * * 400
ProValGlnHisValGlyGlyAsnTyrThrHisIleProLeuSer
CCAGTGCAACATGTAGGCGGCAACTACACCCATATACCGCTGAGT * * * * *
ProArgThrLeuAsnAlaTrpValLysLeuValGluGluLysLys
CCCCGAACCCTAAATGCCTGGGTAAAATTAGTAGAGGAAAAAAAG * * * *
PheGlyAlaGluValValProGlyPheGlnAlaLeuSerGluGly
TTCGGGGCAGAAGTAGTGCCAGGATTTCAGGCACTCTCAGAAGGC 500 * * * *
CysThrProTyrAspIleAsnGlnMetLeuAsnCysValGlyAsp
TGCACGCCCTATGATATCAACCAAATGCTTAATTGTGTGGGCGAC * * * *
HisGlnAlaAlaMetGlnIleIleArgGluIleIleAsnGluGlu
CATCAAGCAGCCATGCAGATAATCAGGGAGATTATCAATGAGGAA * 600 * * *
AlaAlaGluTrpAspValGlnHisProIleProGlyProLeuPro
GCAGCAGAATGGGATGTGCAACATCCAATACCAGGCCCCTTACCA * * * *
AlaGlyGlnLeuArgGluProArgGlySerAspIleAlaGlyThr
GCGGGGCAGCTTAGAGAGCCAAGGGGATCTGACATAGCAGGCACA * * 700 * *
ThrSerThrValGluGluGlnIleGlnTrpMetPheArgProGln
ACAAGCACAGTAGAAGAACAGATCCAGTGGATGTTTAGGCCACAA * * * *
AsnProValProValGlyAsnIleTyrArgArgTrpIleGlnIle
AATCCTGTACCAGTAGGAAACATCTATAGAAGATCGATGCAGATA * * * 800 *
GlyLeuGlnLysCysValArgMetTyrAsnProThrAsnIleLeu
GGATTGCAGAAGTGTGTCAGGATGTACAACCCGACCAACATCCTA * * * *
AspIleLysGlnGlyProLysGluProPheGlnSerTyrValAsp
GACATAAAACAGGGACCAAAGGAGCCGTTCCAAAGCTATGTAGAT * * * * 900
ArgPheTyrLysSerLeuArgAlaGluGlnThrAspProAlaVal
AGATTCTACAAAAGCTTGAGGGCAGAACAAACAGATCCAGCAGTG * * * *
LysAsnTrpMetThrGlnThrLeuLeuValGlnAsnAlaAsnPro
AAGAATTGGATGACCCAAACACTGCTAGTACAAAATGCCAACCCA * * * * *
AspCysLysLeuValLeuLysGlyLeuGlyMetAsnProThrLeu
GACTCTAAATTAGTGCTAAAAGGACTAGGGATGAACCCTACCTTA 1000 * * *
GluGluMetLeuThrAlaCysGlnGlyValGlyGlyProGlyGln
GAAGAGATGCTGACCGCCTGTCAGGGGGTAGGTGGGCCAGGCGAG * * * * *
LysAlaArgLeuMetAlaGluAlaLeuLysGluValIleGlyPro
AAAGCTAGATTAATGGCAGAGGCCCTGAAAGAGGTCATAGGACCT * 1100 * *
AlaProIleProPheAlaAlaAlaGlnGlnArgLysAlaPheLys
GCCCCTATCCCATTCGCAGCAGCCCAGCAGAGAAAGGCATTTAAA * * * * *
CysTrpAsnCysGlyLysGluGlyHisSerAlaArgGlnCysArg
TGCTGGAACTGTGGAAAGGAACGGCACTCGGCAAGACAATGCCGA * * 1200 *
AlaProArgArgGlnGlyCysTrpLysCysGlyLysProGlyHis
GCACCTAGAAGGCAGGGCTGCTGGAAGTGTGGTAAGCCAGGACAC * * * * *
IleMetThrAsnCysProAspArgGlnAlaGlyPheLeuGlyLeu
ATCATGACAAACTGCCCAGATAGACAGGCAGGTTTTTTAGGACTG * * * 1300
GlyProTrpGlyLysLysProArgAsnPheProValAlaGlnVal
GGCCCTTGGGGAAAGAAGCCCCGCAACTTCCCCGTGGCCCAAGTT * * * * *
ProGlnGlyLeuThrProThrAlaProProValAspProAlaVal
CCGCAGGCGCTGACACCAACAGCACCCCCAGTGGATCCAGCAGTG * * * *
AspLeuLeuGluLysTyrMetGlnGlnGlyLysArgGlnArgGln
GATCTACTGGAGAAATATATGCAGCAAGGGAAAAGACAGAGAGAG 1400 * * * *
GlnArgGluArgProTyrLysGluValThrGluAspLeuLeuHis
CAGAGAGAGAGACCATACAAGGAAGTGACAGAGGACTTACTGCAC * * * *
LeuGluGlnGlyGluThrProTyrArgGlnProProThrGluAsp
CTCGACCAGGGGGAGACACCATACACGCAGCCACCAACAGAGGAC * 1500 * * *
LeuLeuHisLeuAsnSerLeuPheGlyLysAspGln
TTGCTGCACCTCAATTCTCTCTTTGGAAAAGACCAG * * *
Example 6
Peptide Sequences Encoded By the ENV and GAG Genes
[0081] The following coding regions for antigenic peptides,
identified for convenience only by the nucleotide numbers of
Example 5, within the env and gag gene regions are of particular
interest.
env1 (1732-1809)
ArgValThrAlaIleGluLysTyrLeuGlnAspGlnAlaArgLeuAsnSerTrpGlyCysAlaPheArgGlnValCys
AGAGTCACTGCTATAGAGAAGTACCTACAGGACCAGGCGCGGCTAAATTCATGGGGATGTGCGTTTAGACAAGTCTGC
* * * * * 1000
env2 (1912-1983)
SerLysSerLeuGluGlnAlaGlnIleGlnGlnGluLysAsnMetTyrGluLeuGlnLysLeuAsnSerTrp
AGTAAAAGTTTAGAACAGGCACAAATTCAGCAAGAGAAAAATATGTATCAACTACAAAAATTAAATAGCTGG
* * 1940 * * * *
env3 (1482-1530)
ProThrLysGluLysArgTyrSerSerAlaHisGlyArgHisThrArg
CCTACAAAAGAAAAAAGATACTCCTCTGCTCACGGGAGACATACAAGA
* 1500 * * *
env4 (55-129)
CysThrGlnTyrValThrValPheTyrGlyValProThrTrpLysAsnAlaThrIleProLeuPheCysAlaThr
TGCACCCAATATGTAACTGTTTTCTATGGCGTACCCACGTGGAAAAATGCAACCATTCCCCTGTTTTGTGCAACC
* * * * 100 * *
env5 (175-231)
AspAspTyrGlnGluIleThrLeuAsnValThrGluAlaPheAspAlaTrpAsnAsn
GATGATTATCAGGAAATAACTTTGAATGTAACAGAGGCTTTTGATGCATGGAATAAT
* 200 * *
env6 (274-330)
GluThrSerIleLysProCysValLysLeuThrProLeuCysValAlaMetLysCys
GAGACATCAATAAAACCATGTGTGAAACTAACACCTTTATGTGTAGCAATGAAATGC
* * 300 * * *
env7 (607-660)
AsnHisCysAsnThrSerValIleThrGluSerCysAspLysHisTyrTrpAsp
AACCATTGCAACACATCAGTCATCACAGAATCATGTGACAAGCACTATTGGGAT
610 * * * * *
env8 (661-720)
AlaIleArgPheArgTyrCysAlaProProGlyTyrAlaLeuLeuArgCysAsnAspThr
GCTATAAGGTTTAGATACTGTGCACCACCGGGTTATGCCCTATTAAGATGTAATGATACC
* * 700 * *
env9 (997-1044)
LysArgProArgGlnAlaTrpCysTrpPheLysGlyLysTrpLysAsp
AAAAGACCCAGACAAGCATGGTGCTGGTTCAAAGGCAAATGGAAAGAC
1000 * * *
env10 (1132-1215)
LysGlySerAspProGluValAlaTyrMetTrpThrAsnCysArgGlyGluPheLeuTyrCysAsnMetThrTrpPheLeuAsn
AAAGGCTCAGACCCAGAAGTAGCATACATGTGGACTAACTGCAGAGGAGAGTTTCTCTACTGCAACATGACTTGGTTCCTCAAT
* * * * * * 1200 *
env11 (1237-1305)
ArgAsnTyrAlaProCysHisIleLysGlnIleIleAsnThrTrpHisLysValGlyArgAsnValTyr
CGCAATTATGCACCGTGCCATATAAAGCAAATAATTAACACATGGCATAAGGTAGGGAGAAATGTATAT
* * * * * * 1300
gag1 (991-1053)
AspCysLysLeuValLeuLysGlyLeuGlyMetAsnProThrLeuGluGluMetLeuThrAla
GACTGTAAATTAGTGCTAAAAGGACTAGGGATGAACCCTACCTTAGAAGAGATGCTGACCGCC
1000 * * * * *
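Each entry above pairs a nucleotide range with the peptide it encodes, so a listing can be checked mechanically by slicing and translating. The following is a minimal sketch in Python, not part of the original disclosure. The helper names `subregion` and `translate` are hypothetical, the standard genetic code is assumed, and the only sequence used is the env8 fragment (661-720) exactly as listed above.

```python
# Standard genetic code in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def subregion(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end of a sequence (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range must satisfy 1 <= start <= end <= len(sequence)")
    return sequence[start - 1:end]


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# env8 fragment as listed above, with its stated range in the env numbering.
ENV8_DNA = "GCTATAAGGTTTAGATACTGTGCACCACCGGGTTATGCCCTATTAAGATGTAATGATACC"
ENV8_RANGE = (661, 720)

env8_peptide = translate(ENV8_DNA)
env8_expected_length = ENV8_RANGE[1] - ENV8_RANGE[0] + 1
```

Here `env8_peptide` can be compared against the three-letter listing for env8, and `env8_expected_length` against `len(ENV8_DNA)`; given the full env sequence of Example 5 as a string, `subregion(env, 661, 720)` would return the same fragment.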
[0082] Of the foregoing peptides, env1, env2, env3 and gag1 are
particularly contemplated for diagnostic purposes, and env4, env5,
env6, env7, env8, env9, env10 and env11 are particularly
contemplated as protecting agents. These peptides have been
selected in part because of their sequence homology to certain of
the envelope and gag protein products of other of the retroviruses
in the HIV group. For vaccinating purposes, the foregoing peptides
may be coupled to a carrier protein by utilizing suitable and well
known techniques to enhance the host's immune response. Adjuvants
such as calcium phosphate or aluminum hydroxide may also be added. The
foregoing peptides can be synthesized by conventional protein
synthesis techniques, such as that of Merrifield.
[0083] It will be apparent to those skilled in the art that various
modifications and variations can be made in the processes and
products of the present invention. Thus, it is intended that the
present application cover the modifications and variations of this
invention provided they come within the scope of the appended
claims and their equivalents. For convenience in interpreting the
following claims, the following table sets forth the correspondence
between codon codes and amino acids and the correspondence between
three-letter and one-letter amino acid symbols.
The second base of each codon is given by the column (T, C, A, G); the first and third bases are given at the left of each row.

1st 3rd   DNA CODON          AMINO ACID 3 LET.   AMINO ACID 1 LET.
          T   C   A   G      T   C   A   G       T  C  A  G
T   T     TTT TCT TAT TGT    PHE SER TYR CYS     F  S  Y  C
T   C     TTC TCC TAC TGC    PHE SER TYR CYS     F  S  Y  C
T   A     TTA TCA TAA TGA    LEU SER *** ***     L  S  *  *
T   G     TTG TCG TAG TGG    LEU SER *** TRP     L  S  *  W
C   T     CTT CCT CAT CGT    LEU PRO HIS ARG     L  P  H  R
C   C     CTC CCC CAC CGC    LEU PRO HIS ARG     L  P  H  R
C   A     CTA CCA CAA CGA    LEU PRO GLN ARG     L  P  Q  R
C   G     CTG CCG CAG CGG    LEU PRO GLN ARG     L  P  Q  R
A   T     ATT ACT AAT AGT    ILE THR ASN SER     I  T  N  S
A   C     ATC ACC AAC AGC    ILE THR ASN SER     I  T  N  S
A   A     ATA ACA AAA AGA    ILE THR LYS ARG     I  T  K  R
A   G     ATG ACG AAG AGG    MET THR LYS ARG     M  T  K  R
G   T     GTT GCT GAT GGT    VAL ALA ASP GLY     V  A  D  G
G   C     GTC GCC GAC GGC    VAL ALA ASP GLY     V  A  D  G
G   A     GTA GCA GAA GGA    VAL ALA GLU GLY     V  A  E  G
G   G     GTG GCG GAG GGG    VAL ALA GLU GLY     V  A  E  G

3 Letter  1 Letter  CODONS
ALA       A         GCT GCC GCA GCG
ARG       R         CGT CGC CGA CGG AGA AGG
ASN       N         AAT AAC
ASP       D         GAT GAC
CYS       C         TGT TGC
GLN       Q         CAA CAG
GLU       E         GAA GAG
GLY       G         GGT GGC GGA GGG
HIS       H         CAT CAC
ILE       I         ATT ATC ATA
LEU       L         CTT CTC CTA CTG TTA TTG
LYS       K         AAA AAG
MET       M         ATG
PHE       F         TTT TTC
PRO       P         CCT CCC CCA CCG
SER       S         TCT TCC TCA TCG AGT AGC
THR       T         ACT ACC ACA ACG
TRP       W         TGG
TYR       Y         TAT TAC
VAL       V         GTT GTC GTA GTG
***       *         TAA TAG TGA
* * * * *