U.S. patent application number 11/981459 was filed with the patent office on 2007-10-30 and published on 2008-12-25 for fusion proteins of Mycobacterium tuberculosis.
This patent application is currently assigned to Corixa Corporation. Invention is credited to Mark Alderson, Steven Reed, Yasir Skeiky.
Application Number | 20080317716 11/981459 |
Family ID | 40253655 |
Publication Date | 2008-12-25 |
United States Patent Application | 20080317716 |
Kind Code | A1 |
Skeiky; Yasir; et al. | December 25, 2008 |
Fusion proteins of mycobacterium tuberculosis
Abstract
The present invention relates to compositions and fusion
proteins containing at least two Mycobacterium sp. antigens, and
nucleic acids encoding such compositions and fusion proteins. The
compositions of the invention increase serological sensitivity of
sera from individuals infected with tuberculosis. The invention
also relates to methods for their use in the diagnosis, treatment,
and prevention of tuberculosis infection.
Inventors: | Skeiky; Yasir; (Bellevue, WA); Reed; Steven; (Bellevue, WA); Alderson; Mark; (Bainbridge Island, WA) |
Correspondence
Address: |
TOWNSEND AND TOWNSEND AND CREW, LLP
TWO EMBARCADERO CENTER, EIGHTH FLOOR
SAN FRANCISCO
CA
94111-3834
US
|
Assignee: |
Corixa Corporation
Seattle
WA
|
Family ID: |
40253655 |
Appl. No.: |
11/981459 |
Filed: |
October 30, 2007 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
11418848 | May 5, 2006 | |
11981459 | | |
09886349 | Jun 20, 2001 | 7083796 |
11418848 | | |
60265737 | Feb 1, 2001 | |
Current U.S.
Class: |
424/93.2 ;
435/252.3 |
Current CPC
Class: |
C07K 2319/00 20130101;
A61P 37/04 20180101; A61K 2039/55505 20130101; C07K 14/35 20130101;
A61P 31/06 20180101; A61K 2039/55566 20130101; A61K 39/04 20130101;
A61K 2039/53 20130101; A61K 39/00 20130101 |
Class at
Publication: |
424/93.2 ;
435/252.3 |
International
Class: |
A61K 35/74 20060101
A61K035/74; C12N 1/21 20060101 C12N001/21 |
Claims
1-88. (canceled)
89. An isolated host cell transfected with an expression vector
comprising a polynucleotide comprising a nucleotide sequence
encoding a fusion polypeptide, said fusion polypeptide comprising a
first amino acid sequence having at least 95% sequence identity to
SEQ ID NO:14 and a second amino acid sequence having at least 95%
sequence identity to SEQ ID NO:8, and wherein the amino acid
corresponding to position 176 in SEQ ID NO:8 is not a serine.
90. The host cell of claim 89, wherein the encoded fusion
polypeptide further comprises a third amino acid sequence having at
least 95% sequence identity to SEQ ID NO:10.
91. The host cell of claim 89, wherein the encoded fusion has an
amino acid corresponding to position 176 in SEQ ID NO:8 which is an
alanine.
92. The host cell of claim 90, wherein the encoded fusion has an
amino acid corresponding to position 176 in SEQ ID NO:8 which is an
alanine.
93. The host cell of claim 89, wherein the encoded fusion
polypeptide has at least 95% sequence identity to SEQ ID NO:18 or
SEQ ID NO:20.
94. The host cell of claim 89, wherein the encoded fusion
polypeptide comprises the amino acid sequence of SEQ ID NO:14 and
the amino acid sequence of SEQ ID NO:8, and wherein the amino acid
corresponding to position 176 in SEQ ID NO:8 is not a serine.
95. The host cell of claim 94, wherein the encoded fusion
polypeptide further comprises the amino acid sequence of SEQ ID
NO:10.
96. The host cell of claim 89, wherein the encoded fusion
polypeptide comprises the amino acid sequence of SEQ ID NO:18.
97. The host cell of claim 89, wherein the encoded fusion
polypeptide comprises the amino acid sequence of SEQ ID NO:20 with
an alanine replacing the serine at position 577.
98. The host cell of any one of claims 89 to 97, wherein the host
cell is a Bacillus Calmette-Guerin.
99. A method for the treatment and/or prevention of tuberculosis
comprising administering an effective amount of a
Bacillus Calmette-Guerin which expresses on its cell surface or
secretes a fusion polypeptide, said fusion polypeptide comprising a
first amino acid sequence having at least 95% sequence identity to
SEQ ID NO:14 and a second amino acid sequence having at least 95%
sequence identity to SEQ ID NO:8, and wherein the amino acid
corresponding to position 176 in SEQ ID NO:8 is not a serine.
100. The method of claim 99, wherein the fusion polypeptide further
comprises a third amino acid sequence having at least 95% sequence
identity to SEQ ID NO:10.
101. The method of claim 99 or 100, wherein the fusion polypeptide
has an amino acid corresponding to position 176 in SEQ ID NO:8
which is an alanine.
102. The method of claim 99, wherein the fusion polypeptide has at
least 95% sequence identity to SEQ ID NO:18 or SEQ ID NO:20.
103. The method of claim 99, wherein the encoded fusion polypeptide
comprises the amino acid sequence of SEQ ID NO:14 and the amino
acid sequence of SEQ ID NO:8, and wherein the amino acid
corresponding to position 176 in SEQ ID NO:8 is not a serine.
104. The method of claim 103, wherein the fusion polypeptide
further comprises the amino acid sequence of SEQ ID NO:10.
105. The method of claim 99, wherein the fusion polypeptide
comprises the amino acid sequence of SEQ ID NO:18.
106. The method of claim 99, wherein the fusion polypeptide
comprises the amino acid sequence of SEQ ID NO:20 with an alanine
replacing the serine at position 577.
Description
CROSS-REFERENCES TO RELATED APPLICATIONS
[0001] The present application claims priority to U.S. patent
application Ser. No. 09/597,796, filed Jun. 20, 2000, and U.S.
patent application No. 60/265,737, filed Feb. 1, 2001, herein each
incorporated by reference in their entirety.
[0002] The present application is related to U.S. patent
application Ser. No. 09/056,556, filed Apr. 7, 1998; U.S. patent
application Ser. No. 09/223,040, filed Dec. 30, 1998; U.S. patent
application Ser. No. 09/287,849, filed Apr. 7, 1999; published PCT
application No. WO99/51748, filed Apr. 7, 1999 (PCT/US99/07717),
U.S. patent application No. 60/158,338, filed Oct. 7, 1999, and
U.S. application No. 60/158,425, filed Oct. 7, 1999; U.S.
application Ser. No. 09/688,672, filed Oct. 10, 2000; and published
PCT application No. WO01/24820, filed Oct. 10, 2000
(PCT/US00/28095); herein each incorporated by reference in its
entirety.
STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED
RESEARCH AND DEVELOPMENT
[0003] Not applicable.
FIELD OF THE INVENTION
[0004] The present invention relates to fusion proteins containing
at least two Mycobacterium sp. antigens. In particular, it relates
to nucleic acids encoding fusion proteins that include two or more
individual M. tuberculosis antigens, which increase serological
sensitivity of sera from individuals infected with tuberculosis,
and methods for their use in the diagnosis, treatment, and
prevention of tuberculosis infection.
BACKGROUND OF THE INVENTION
[0005] Tuberculosis is a chronic infectious disease caused by
infection with M. tuberculosis and other Mycobacterium species. It
is a major disease in developing countries, as well as an
increasing problem in developed areas of the world, with about 8
million new cases and 3 million deaths each year. Although the
infection may be asymptomatic for a considerable period of time,
the disease is most commonly manifested as an acute inflammation of
the lungs, resulting in fever and a nonproductive cough. If
untreated, serious complications and death typically result.
[0006] Although tuberculosis can generally be controlled using
extended antibiotic therapy, such treatment is not sufficient to
prevent the spread of the disease. Infected individuals may be
asymptomatic, but contagious, for some time. In addition, although
compliance with the treatment regimen is critical, patient behavior
is difficult to monitor. Some patients do not complete the course
of treatment, which can lead to ineffective treatment and the
development of drug resistance.
[0007] In order to control the spread of tuberculosis, effective
vaccination and accurate early diagnosis of the disease are of
utmost importance. Currently, vaccination with live bacteria is the
most efficient method for inducing protective immunity. The most
common mycobacterium employed for this purpose is Bacillus
Calmette-Guerin (BCG), an avirulent strain of M. bovis. However,
the safety and efficacy of BCG are a source of controversy and some
countries, such as the United States, do not vaccinate the general
public with this agent.
[0008] Diagnosis of tuberculosis is commonly achieved using a skin
test, which involves intradermal exposure to tuberculin PPD
(purified protein derivative). Antigen-specific T cell responses
result in measurable induration at the injection site by 48-72
hours after injection, which indicates exposure to mycobacterial
antigens. Sensitivity and specificity have, however, been a problem
with this test, and individuals vaccinated with BCG cannot be
distinguished from infected individuals.
[0009] While macrophages have been shown to act as the principal
effectors of Mycobacterium immunity, T cells are the predominant
inducers of such immunity. The essential role of T cells in
protection against Mycobacterium infection is illustrated by the
frequent occurrence of Mycobacterium infection in AIDS patients,
due to the depletion of CD4+ T cells associated with human
immunodeficiency virus (HIV) infection. Mycobacterium-reactive
CD4+ T cells have been shown to be potent producers of
γ-interferon (IFN-γ), which, in turn, has been shown to
trigger the anti-mycobacterial effects of macrophages in mice.
While the role of IFN-γ in humans is less clear, studies have
shown that 1,25-dihydroxy-vitamin D3, either alone or in
combination with IFN-γ or tumor necrosis factor-alpha,
activates human macrophages to inhibit M. tuberculosis infection.
Furthermore, it is known that IFN-γ stimulates human
macrophages to make 1,25-dihydroxy-vitamin D3. Similarly,
interleukin-12 (IL-12) has been shown to play a role in stimulating
resistance to M. tuberculosis infection. For a review of the
immunology of M. tuberculosis infection, see Chan & Kaufmann,
Tuberculosis: Pathogenesis, Protection and Control (Bloom ed.,
1994), and Harrison's Principles of Internal Medicine, volume 1,
pp. 1004-1014 and 1019-1023 (14th ed., Fauci et al., eds.,
1998).
[0010] Accordingly, there is a need for improved diagnostic
reagents, and improved methods for diagnosing, preventing and
treating tuberculosis.
SUMMARY OF THE INVENTION
[0011] The present invention therefore provides compositions
comprising at least two heterologous antigens, fusion proteins
comprising the antigens, and nucleic acids encoding the antigens,
where the antigens are from a Mycobacterium species from the
tuberculosis complex and other Mycobacterium species that cause
opportunistic infections in immune compromised patients. The
present invention also relates to methods of using the polypeptides
and polynucleotides in the diagnosis, treatment and prevention of
Mycobacterium infection.
[0012] In one aspect, the present invention provides compositions
and fusion proteins comprising a mutated version of Ra35
(N-terminal portion of MTB32A) or Ra35FL (full length MTB32A), in
which one, two, or three of the three amino acids histidine,
aspartate, or serine at the active site has been mutated to a
different amino acid. In one embodiment, in Ra35FL, the serine at
position 183 has been mutated to an alanine residue, creating
Ra35FLMutSA. In one embodiment, the DNA encoding Ra35FL has been
mutated by changing a T to a G, resulting in a serine to alanine
mutation at amino acid 183 of SEQ ID NO:4. In another embodiment,
the present invention provides the fusion protein MTB72FMutSA, in
which the Ra35 component of the fusion protein has a serine to
alanine mutation at amino acid position 710 of the MTB72F sequence.
In another embodiment, the present invention provides a nucleic
acid encoding the fusion protein MTB72F, in which the nucleic acid
encoding the Ra35 component has been mutated by changing a T to a
G, resulting in a serine to alanine mutation at amino acid position
710 of the MTB72F sequence.
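The single-base change described above can be illustrated with the standard genetic code: each serine codon of the form TCN becomes an alanine codon GCN when its first T is replaced by G. The following Python sketch shows this under stated assumptions. The codon TCG in the usage section is a hypothetical example, because the text does not state which serine codon occurs in the Ra35FL coding sequence, and the function name is illustrative only.

```python
# Serine codons of the TCN family and alanine codons (GCN) from the standard genetic code.
# Serine is also encoded by AGT and AGC; those codons have no first-position T,
# so the single T-to-G change sketched here does not apply to them.
SERINE_TCN = {"TCT", "TCC", "TCA", "TCG"}
ALANINE_GCN = {"GCT", "GCC", "GCA", "GCG"}


def mutate_first_t_to_g(codon: str) -> str:
    """Replace the first base of a TCN serine codon with G."""
    codon = codon.upper()
    if codon not in SERINE_TCN:
        raise ValueError(f"{codon!r} is not a TCN serine codon")
    return "G" + codon[1:]


if __name__ == "__main__":
    example = "TCG"  # hypothetical codon; the actual codon is not given in the text
    mutant = mutate_first_t_to_g(example)
    print(example, "->", mutant, "| alanine codon:", mutant in ALANINE_GCN)
```

Whichever TCN codon is present, the same first-position substitution yields an alanine codon, which is consistent with a single T to G change producing the serine to alanine mutation.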
[0013] The present invention is based, in part, on the inventors'
discovery that fusion polynucleotides, fusion polypeptides, or
compositions that contain at least two heterologous M. tuberculosis
coding sequences or antigens are highly antigenic and upon
administration to a patient increase the sensitivity of
tuberculosis sera. In addition, the compositions, fusion
polypeptides and polynucleotides are useful as diagnostic tools in
patients that may have been infected with Mycobacterium.
[0014] In one aspect, the compositions, fusion polypeptides, and
nucleic acids of the invention are used in in vitro and in vivo
assays for detecting humoral antibodies or cell-mediated immunity
against M. tuberculosis for diagnosis of infection or monitoring of
disease progression. For example, the polypeptides may be used as
an in vivo diagnostic agent in the form of an intradermal skin
test. The polypeptides may also be used in in vitro tests such as
an ELISA with patient serum. Alternatively, the nucleic acids, the
compositions, and the fusion polypeptides may be used to raise
anti-M. tuberculosis antibodies in a non-human animal. The
antibodies can be used to detect the target antigens in vivo and in
vitro.
[0015] In another aspect, the compositions, fusion polypeptides and
nucleic acids may be used as immunogens to generate or elicit a
protective immune response in a patient. The isolated or purified
polynucleotides are used to produce recombinant fusion polypeptide
antigens in vitro, which are then administered as a vaccine.
Alternatively, the polynucleotides may be administered directly
into a subject as DNA vaccines to cause antigen expression in the
subject, and the subsequent induction of an anti-M. tuberculosis
immune response. Thus, the isolated or purified M. tuberculosis
polypeptides and nucleic acids of the invention may be formulated
as pharmaceutical compositions for administration into a subject in
the prevention and/or treatment of M. tuberculosis infection. The
immunogenicity of the fusion protein or antigens may be enhanced by
the inclusion of an adjuvant, as well as additional fusion
polypeptides, from Mycobacterium or other organisms, such as
bacterial, viral, or mammalian polypeptides. Additional polypeptides
may also be included in the compositions, either linked or unlinked
to the fusion polypeptide or compositions.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] FIG. 1 shows percent survival of guinea pigs vaccinated with
MTB72F polyprotein.
[0017] FIG. 2 shows CFUs from spleen cells (FIG. 2A) and lung cells
after immunization with MTB72F, MTB59F, MTB72F DNA, or a
composition comprising Ra12, TbH9, and Ra35 antigens.
[0018] FIG. 3 shows a schematic diagram of MTB72F.
[0019] FIG. 4 shows the nucleotide and amino acid sequence of Ra35
(195 amino acids from the N-terminal portion of MTB32A).
[0020] FIG. 5 shows an alignment of the amino acid sequences of
MTB72F and the mutated version MTB72FMutSA.
[0021] FIG. 6 shows an alignment of the amino acid sequences of
mature (full length) Ra35/MTB32A and the mutated version
Ra35FLMutSA.
[0022] FIG. 7 shows long term survival of guinea pigs vaccinated
with Mtb72F formulations.
DESCRIPTION OF THE SPECIFIC EMBODIMENTS
[0023] The present invention relates to compositions comprising
antigen compositions and fusion polypeptides useful for the
diagnosis and treatment of Mycobacterium infection, polynucleotides
encoding such antigens, and methods for their use. The antigens of
the present invention are polypeptides or fusion polypeptides of
Mycobacterium antigens and immunogenic fragments thereof. More specifically,
the compositions of the present invention comprise at least two
heterologous polypeptides of a Mycobacterium species of the
tuberculosis complex, e.g., a species such as M. tuberculosis, M.
bovis, or M. africanum, or a Mycobacterium species that is
environmental or opportunistic and that causes opportunistic
infections such as lung infections in immune compromised hosts
(e.g., patients with AIDS), e.g., BCG, M. avium, M. intracellulare,
M. celatum, M. genavense, M. haemophilum, M. kansasii, M. simiae,
M. vaccae, M. fortuitum, and M. scrofulaceum (see, e.g., Harrison's
Principles of Internal Medicine, volume 1, pp. 1004-1014 and
1019-1023 (14th ed., Fauci et al., eds., 1998)). The inventors
of the present application surprisingly discovered that
compositions and fusion proteins comprising at least two
heterologous Mycobacterium antigens, or immunogenic fragments
thereof, were highly antigenic. These compositions, fusion
polypeptides, and the nucleic acids that encode them are therefore
useful for eliciting a protective response in patients, and for
diagnostic applications.
[0024] The antigens of the present invention may further comprise
other components designed to enhance the antigenicity of the
antigens or to improve these antigens in other aspects, for
example, the isolation of these antigens through addition of a
stretch of histidine residues at one end of the antigen. The
compositions, fusion polypeptides, and nucleic acids of the
invention can comprise additional copies of antigens, or additional
heterologous polypeptides from Mycobacterium sp., such as MTB8.4
antigen, MTB9.8 antigen, MTB9.9 antigen, MTB40 antigen, MTB41
antigen, 38-1, TbRa3, 38 kD, DPEP, TbH4, DPPD, ESAT-6 antigen,
MTB85 complex antigen (e.g., MTB85b), or α-crystalline
antigen, and Erd14. The compositions, fusion polypeptides, and
nucleic acids of the invention can also comprise additional
heterologous polypeptides from other non-Mycobacterium sources. For
example, the compositions and fusion proteins of the invention can
include polypeptides or nucleic acids encoding polypeptides,
wherein the polypeptide enhances expression of the antigen, e.g.,
NS1, an influenza virus protein, or an immunogenic portion thereof
(see, e.g. WO99/40188 and WO93/04175). The nucleic acids of the
invention can be engineered based on codon preference in a species
of choice, e.g., humans.
[0025] The compositions of the invention can be naked DNA, or the
compositions, e.g., polypeptides can also comprise adjuvants, e.g.,
MPL, 3D-MPL, IFA, AS adjuvants such as AS2, AS2', AS2'', AS4, AS6,
ENHANZYN (Detox), QS21, CWS, TDM, AGP, CPG, Leif, saponin, and
saponin mimetics, and derivatives thereof. In addition, the
compositions of the invention can comprise BCG or Pvac as an
adjuvant.
[0026] In one embodiment, the compositions and fusion proteins of
the invention are composed of at least two antigens selected from
the group consisting of a MTB39 antigen or an immunogenic fragment
thereof from a Mycobacterium species of the tuberculosis complex,
and a MTB32A antigen or an immunogenic fragment thereof from a
Mycobacterium species of the tuberculosis complex.
[0027] In another embodiment, the antigens are selected from the
group consisting of a MTB39 antigen or an immunogenic fragment
thereof from a Mycobacterium species of the tuberculosis complex,
and a polypeptide comprising at least 205 amino acids of the
N-terminus of a MTB32A antigen from a Mycobacterium species of the
tuberculosis complex.
[0028] In another embodiment, the antigens are selected from the
group consisting of a MTB39 antigen or an immunogenic fragment
thereof from a Mycobacterium species of the tuberculosis complex, a
polypeptide comprising at least about 205 amino acids of the
N-terminus of a MTB32A antigen from a Mycobacterium species of the
tuberculosis complex, and a polypeptide comprising at least about
132 amino acids from the C-terminus of MTB32A antigen from a
Mycobacterium species of the tuberculosis complex.
[0029] In the nomenclature of the application, Ra35 refers to the
N-terminus of MTB32A (Ra35FL), comprising at least about 195 to 205
amino acids of MTB32A from M. tuberculosis, or the corresponding
region from another Mycobacterium species. Ra12 refers to the
C-terminus of MTB32A (Ra35FL), comprising at least about the last
132 amino acids from MTB32A from M. tuberculosis, or the
corresponding region from another Mycobacterium species.
[0030] The following provides sequences of some antigens used in
the compositions and fusion proteins of the invention:
[0031] SEQ ID NO:1-4: MTB32A (Ra35FL or Ra35 mature), the sequence
of which is also disclosed as SEQ ID NO:17 (cDNA) and SEQ ID NO:79
(protein) in the U.S. patent application Ser. Nos. 08/523,436,
08/523,435, Ser. No. 08/658,800, Ser. No. 08/659,683, Ser. No.
08/818,112, Ser. No. 09/056,556, and Ser. No. 08/818,111 and in the
WO97/09428 and WO97/09429 applications, see also Skeiky et al.,
Infection and Immunity 67:3998-4007 (1999). The term MTB32A also
includes MTB32A amino acid sequences in which any one of the three
amino acids at the active site triad (His, Asp, Ser), e.g., the
serine residue at amino acid position 207 in SEQ ID NO:2 or amino
acid position 183 in SEQ ID NO:4, has been changed to another amino
acid (e.g., alanine, Ra35FLMutSA, see, e.g., FIG. 6 and SEQ ID
NO:6).
[0032] SEQ ID NO:5 and 6: Ra35FLMutSA, the mature version of
Ra35FL in which the serine residue at amino acid position 183 of
SEQ ID NO:4 has been changed to an alanine residue.
[0033] SEQ ID NO:7 and 8: Ra35, the N-terminus of MTB32A (Ra35FL),
comprising at least about 195 amino acids from the N-terminus of
MTB32A from M. tuberculosis, the nucleotide and amino acid sequence
of which is disclosed in FIG. 4 (see also amino acids 33-227 of SEQ
ID NO:2 and amino acids 8 to 202 of SEQ ID NO:4). The term Ra35
(N-term) also includes Ra35 amino acid sequences in which any one
of the three amino acids at the active site triad (i.e., His, Asp,
or Ser) has been changed as described above.
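The residue numbering above can be cross-checked with simple arithmetic. The Python sketch below converts a position in the mature MTB32A sequence (SEQ ID NO:4) into the numbering of the Ra35 fragment, using only the range of amino acids 8 to 202 stated in the text. The function name and the 1-based, inclusive convention are assumptions made for illustration.

```python
# Range of the Ra35 fragment within the mature MTB32A sequence (SEQ ID NO:4),
# as stated in the text: amino acids 8 to 202, assumed 1-based and inclusive.
RA35_START = 8
RA35_END = 202


def mature_to_ra35(position: int) -> int:
    """Convert a SEQ ID NO:4 position to Ra35 fragment numbering."""
    if not RA35_START <= position <= RA35_END:
        raise ValueError(f"position {position} lies outside the Ra35 fragment")
    return position - RA35_START + 1


if __name__ == "__main__":
    print(RA35_END - RA35_START + 1)  # fragment length: 195
    print(mature_to_ra35(183))        # active-site serine: 176
```

Under these assumptions the fragment is 195 residues long, and the serine at position 183 of SEQ ID NO:4 maps to position 176 of the fragment, which is the position recited in the claims for SEQ ID NO:8.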
[0034] SEQ ID NO:9 and 10: MTBRa12, the C-terminus of MTB32A
(Ra35FL), comprising at least about 132 amino acids from the
C-terminus of MTB32A from M. tuberculosis (see, e.g., amino acids
224 to 355 of SEQ ID NO:2 and amino acids 199 to 330 of SEQ ID
NO:4), the sequence of which is disclosed as SEQ ID NO:4 (DNA) and
SEQ ID NO:66 (predicted amino acid sequence) in the U.S. patent
application Ser. No. 09/072,967.
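The stated residue ranges for Ra12 can be checked in the same way. The Python sketch below takes the ranges quoted in the text at face value (1-based, inclusive, which is an assumption), computes their lengths, and reports how many residues the Ra35 and Ra12 ranges share in SEQ ID NO:4 numbering. The helper names are illustrative only.

```python
from typing import Tuple

Range = Tuple[int, int]  # (first residue, last residue), 1-based and inclusive

# Ranges quoted in the text.
RA35_IN_SEQ4: Range = (8, 202)
RA12_IN_SEQ4: Range = (199, 330)
RA12_IN_SEQ2: Range = (224, 355)


def length(r: Range) -> int:
    """Number of residues in an inclusive range."""
    return r[1] - r[0] + 1


def overlap(a: Range, b: Range) -> int:
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


if __name__ == "__main__":
    print(length(RA12_IN_SEQ4))                 # 132
    print(length(RA12_IN_SEQ2))                 # 132
    print(overlap(RA35_IN_SEQ4, RA12_IN_SEQ4))  # 4
```

Both Ra12 ranges span 132 residues, in line with the "about 132 amino acids" given above, and the quoted Ra35 and Ra12 ranges share residues 199 to 202 of SEQ ID NO:4.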
[0035] SEQ ID NO:11, 12, 13, and 14: MTB39 (TbH9), the sequence of
which is disclosed as SEQ ID NO:106 (cDNA full length) and SEQ ID
NO:107 (protein full length) in the U.S. patent applications No.
08/658,800, Ser. No. 08/659,683, Ser. No. 08/818,112, and No.
08/818,111 and in the WO97/09428 and WO97/09429 applications. The
sequence is also disclosed as SEQ ID NO:33 (DNA) and SEQ ID NO:91
(amino acid) in U.S. patent application Ser. No. 09/056,559.
[0036] The following provides sequences of some fusion proteins of
the invention:
[0037] SEQ ID NO:15 and 16: MTB72F (Ra12-TbH9-Ra35), the sequence
of which is disclosed as SEQ ID NO:1 (DNA) and SEQ ID NO:2
(protein) in the U.S. patent application Ser. No. 09/223,040, Ser.
No. 09/223,040, and in the PCT/US99/07717 application. The term
MTB72F also includes MTB72F amino acid sequences in which any one
of the three amino acids at the active site triad in Ra35FL (i.e.,
His, Asp, or Ser), has been changed as described above (see, e.g.,
MTB72FMutSA, FIG. 5).
[0038] SEQ ID NO:17 and 18: MTB72FMutSA (Ra12-TbH9-Ra35MutSA),
wherein, in the Ra35 component of the fusion protein, the serine at
position 710 has been changed to an alanine.
[0039] SEQ ID NO:19 and 20: TbH9-Ra35 (MTB59F), the sequence of
which is disclosed as SEQ ID NO:23 (cDNA) and SEQ ID NO:24
(protein) in the U.S. patent application Ser. No. 09/287,849 and in
the PCT/US99/07717 application.
[0040] The following provides sequences of some additional antigens
used in the compositions and fusion proteins of the invention:
[0041] SEQ ID NO: 21 and 22: MTB8.4 (DPV), the sequence of which is
disclosed as SEQ ID NO:01 (cDNA) and SEQ ID NO:102 (protein) in the
U.S. patent application Ser. No. 08/658,800, Ser. No. 08/659,683,
Ser. No. 08/818,112 and Ser. No. 08/818,111 and in the WO97/09428
and WO97/09429 applications.
[0042] SEQ ID NO:23 and 24: MTB9.8 (MSL), the sequence of which is
disclosed as SEQ ID NO:12 (DNA), SEQ ID NO:109 (predicted amino
acid sequence) and SEQ ID NO:110 to 124 (peptides) in the U.S.
patent application Ser. No. 08/859,381, Ser. No. 08/858,998, Ser.
No. 09/073,009 and Ser. No. 09/073,010 and in the PCT/US98/10407
and PCT/US98/10514 applications.
[0043] SEQ ID NO:25, 26, and 27: MTB9.9A (MTI, also known as
MTI-A), the sequence of which is disclosed as SEQ ID NO:3 and SEQ
ID NO:4 (DNA) and SEQ ID NO:29 and SEQ ID NO:51 to 66 (ORF peptide
for MTI) in the U.S. patent application Ser. No. 08/859,381, Ser.
No. 08/858,998, Ser. Nos. 09/073,009 and 09/073,010 and in the
PCT/US98/10407 and PCT/US98/10514 applications. Two other MTI
variants also exist, called MTI-B and MTI-C.
[0044] SEQ ID NO:28 and 29: MTB40 (HTCC#1), the sequence of which
is disclosed as SEQ ID NO:137 (cDNA) and 138 (predicted amino acid
sequence) in the U.S. patent applications No. 09/073,009 and No.
09/073,010 and in the PCT/US98/10407 and PCT/US98/10514
applications.
[0045] SEQ ID NO:30 and 31: MTB41 (MTCC#2), the sequence of which
is disclosed as SEQ ID NO:140 (cDNA) and SEQ ID NO:142 (predicted
amino acid sequence) in the U.S. patent applications No. 09/073,009
and No. 09/073,010 and in the PCT/US98/10407 and PCT/US98/10514
applications.
[0046] SEQ ID NO:32 and 33: ESAT-6, the sequence of which is
disclosed as SEQ ID NO:103 (DNA) and SEQ ID NO:104 (predicted amino
acid sequence) in the U.S. patent application Ser. No. 09/072,967.
The sequence of ESAT-6 is also disclosed in U.S. Pat. No.
5,955,077.
[0047] SEQ ID NO:34 and 35: Tb38-1 or 38-1 (MTb1), the sequence of
which is disclosed in SEQ ID NO:46 (DNA) and SEQ ID NO:88
(predicted amino acid) in the U.S. patent application Ser. Nos.
09/072,96; 08/523,436; 08/523,435; 08/818,112; and 08/818,111; and
in the WO97/09428 and WO97/09429 applications.
[0048] SEQ ID NO:36 and 37: TbRa3, the sequence of which is
disclosed in SEQ ID NO:15 (DNA) and SEQ ID NO:77 (predicted amino
acid sequence) of WO 97/09428 and WO97/09429 applications.
[0049] SEQ ID NO:38 and 39: 38 kD, the sequence of which is
disclosed in SEQ ID NO:154 (DNA) and SEQ ID NO:155 (predicted amino
acid sequence) in the U.S. patent application Ser. No. 09/072,967.
38 kD has two alternative forms, with and without the N-terminal
cysteine residue.
[0050] SEQ ID NO:40 and 41: DPEP, the sequence of which is
disclosed in SEQ ID NO:52 (DNA) and SEQ ID NO:53 (predicted amino
acid sequence) in the WO97/09428 and WO97/09429 publications.
[0051] SEQ ID NO:42 and 43: TbH4, the sequence of which is
disclosed as SEQ ID NO:43 (DNA) and SEQ ID NO:81 (predicted amino
acid sequence) in WO97/09428 and WO97/09429 publications.
[0052] SEQ ID NO:44 and 45: DPPD, the sequence of which is
disclosed in SEQ ID NO:240 (DNA) and SEQ ID NO:241 (predicted amino
acid sequence) in U.S. Ser. No. 09/072,967 and in the
PCT/US99/03268 and PCT/US99/03265 applications. The secreted form
of DPPD is shown herein in FIG. 12 of PCT/US00/28095.
[0053] MTb82 (MTb867), the sequence of which is disclosed in FIGS.
8 (DNA) and 9 (amino acid) of PCT/US00/28095.
[0054] Erd14 (MTb16), the cDNA and amino acids sequences of which
are disclosed in Verbon et al., J. Bacteriology 174:1352-1359
(1992).
[0055] α-crystalline antigen, the sequence of which is
disclosed in Verbon et al., J. Bact. 174:1352-1359 (1992).
[0056] 85 complex antigen, e.g., 85b antigen, the sequence of which
is disclosed in Content et al., Infect. & Immunol. 59:3205-3212
(1991).
[0057] The following provides sequences of some additional fusion
proteins used in the compositions and fusion proteins of the
invention:
[0058] SEQ ID NO:46 and 47: DPV-MTI-MSL-MTCC#2 (MTb71F), the
sequence of which is disclosed as SEQ ID NO:15 (nucleic acid) and
SEQ ID NO:16 (protein) in the U.S. patent application Ser. No.
09/287,849 and in the PCT/US99/07717 application.
[0059] SEQ ID NO:48 and 49: DPV-MTI-MSL (MTb31F), the sequence of
which is disclosed in SEQ ID NO:18 (cDNA) and SEQ ID NO:19
(protein) in the U.S. patent application Ser. No. 09/287,849 and in
the PCT/US99/07717 application.
[0060] Each of the above sequences is also disclosed in Cole et al.
Nature 393:537 (1998) and can be found at, e.g.,
http://www.sanger.ac.uk and http://www.pasteur.fr/mycdb/.
[0061] The above sequences are disclosed in U.S. patent application
Ser. Nos. 08/523,435, 08/523,436, 08/658,800, 08/659,683,
08/818,111, 08/818,112, 08/942,341, 08/942,578, 08/858,998,
08/859,381, 09/056,556, 09/072,596, 09/072,967, 09/073,009,
09/073,010, 09/223,040, 09/287,849, 09/597,796; and in PCT patent
applications PCT/US00/28095; PCT/US98/10407, PCT/US98/10514,
PCT/US99/03265, PCT/US99/03268, PCT/US99/07717, WO97/09428 and
WO97/09429, WO98/16645, WO98/16646, each of which is herein
incorporated by reference.
[0062] The antigens described herein include polymorphic variants
and conservatively modified variations, as well as inter-strain and
interspecies Mycobacterium homologs. In addition, the antigens
described herein include subsequences or truncated sequences. The
fusion proteins may also contain additional polypeptides,
optionally heterologous peptides from Mycobacterium or other
sources. These antigens may be modified, for example, by adding
linker peptide sequences as described below. These linker peptides
may be inserted between one or more polypeptides which make up each
of the fusion proteins.
DEFINITIONS
[0063] "Fusion polypeptide" or "fusion protein" refers to a protein
having at least two heterologous Mycobacterium sp. polypeptides
covalently linked, either directly or via an amino acid linker. The
polypeptides forming the fusion protein are typically linked
C-terminus to N-terminus, although they can also be linked
C-terminus to C-terminus, N-terminus to N-terminus, or N-terminus
to C-terminus. The polypeptides of the fusion protein can be in any
order. This term also refers to conservatively modified variants,
polymorphic variants, alleles, mutants, subsequences, interspecies
homologs, and immunogenic fragments of the antigens that make up
the fusion protein. Mycobacterium tuberculosis antigens are
described in Cole et al., Nature 393:537 (1998), which discloses
the entire Mycobacterium tuberculosis genome. The complete sequence
of Mycobacterium tuberculosis can also be found at
http://www.sanger.ac.uk and at http://www.pasteur.fr/mycdb/
(MycDB). Antigens from other Mycobacterium species that correspond
to M. tuberculosis antigens can be identified, e.g., using sequence
comparison algorithms, as described herein, or other methods known
to those of skill in the art, e.g., hybridization assays and
antibody binding assays. Fusion proteins of the invention can also
comprise additional copies of a component antigen or immunogenic
fragment thereof.
[0064] A polynucleotide sequence encoding a fusion protein of the
invention hybridizes under stringent conditions to at least two
nucleotide sequences, each encoding an antigen polypeptide selected
from the group consisting of MTB39 or an immunogenic fragment
thereof and MTB32A or an immunogenic fragment thereof. The
polynucleotide sequences encoding the individual antigens of the
fusion polypeptide therefore include conservatively modified
variants, polymorphic variants, alleles, mutants, subsequences,
immunogenic fragments, and interspecies homologs of MTB39 and
MTB32A. The polynucleotide sequence encoding the individual
polypeptides of the fusion protein can be in any order.
[0065] In some embodiments, the individual polypeptides of the
fusion protein are in order (N- to C-terminus) from large to small.
Large antigens are approximately 30 to 150 kD in size, medium
antigens are approximately 10 to 30 kD in size, and small antigens
are less than approximately 10 kD in size. The sequence encoding
the individual polypeptide may be as small as, e.g., an immunogenic
fragment such as an individual CTL epitope encoding about 8 to 9
amino acids, or, e.g., an HTL or B cell epitope. The fragment may
also include multiple epitopes. The immunogenic fragment may also
represent a larger part of the antigen sequence, e.g., about 50% or
more of MTB39 and MTB32A, e.g., the N- and C-terminal portions of
MTB32A. Preferred immunogenic fragments of MTB32A include Ra12,
Ra35, and Ra35MutSA.
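The size classes and the large-to-small ordering described above can be illustrated with a minimal Python sketch. The antigen names and masses below are hypothetical placeholders, not values from this disclosure, and the hard cut-offs at 10 and 30 kD are an assumption, since the text gives only approximate ranges.

```python
from typing import List, Tuple

def size_class(mass_kd: float) -> str:
    """Classify an antigen by approximate mass (kD).

    Assumed cut-offs: under 10 kD is small, 10 to under 30 kD is medium,
    30 kD and above is large (the text gives about 30 to 150 kD).
    """
    if mass_kd < 10:
        return "small"
    if mass_kd < 30:
        return "medium"
    return "large"

def order_large_to_small(antigens: List[Tuple[str, float]]) -> List[str]:
    """Return antigen names in N- to C-terminal order, largest first."""
    ranked = sorted(antigens, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked]

# Hypothetical antigens and masses, for illustration only.
example = [("antigen_A", 9.0), ("antigen_B", 39.0), ("antigen_C", 14.0)]
for name in order_large_to_small(example):
    print(name, size_class(dict(example)[name]))
```

This ordering is only one embodiment; as stated elsewhere, the polypeptides of the fusion protein can be in any order.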
[0066] A fusion polypeptide of the invention specifically binds to
antibodies raised against at least two antigen polypeptides,
wherein each antigen polypeptide is selected from the group
consisting of MTB39 or an immunogenic portion or fragment thereof
and MTB32A or an immunogenic portion thereof. The antibodies can be
polyclonal or monoclonal. Optionally, the fusion polypeptide
specifically binds to antibodies raised against the fusion junction
of the antigens, which antibodies do not bind to the antigens
individually, i.e., when they are not part of a fusion protein. The
fusion polypeptides optionally comprise additional polypeptides,
e.g., three, four, five, six, or more polypeptides, up to about 25
polypeptides, optionally heterologous polypeptides or repeated
homologous polypeptides, fused to the at least two heterologous
antigens. The additional polypeptides of the fusion protein are
optionally derived from Mycobacterium as well as other sources,
such as other bacterial, viral, or invertebrate, vertebrate, or
mammalian sources. The individual polypeptides of the fusion
protein can be in any order. As described herein, the fusion
protein can also be linked to other molecules, including additional
polypeptides. The compositions of the invention can also comprise
additional polypeptides that are unlinked to the fusion proteins of
the invention. These additional polypeptides may be heterologous or
homologous polypeptides.
[0067] The term "fused" refers to the covalent linkage between two
polypeptides in a fusion protein. The polypeptides are typically
joined via a peptide bond, either directly to each other or via an
amino acid linker. Optionally, the peptides can be joined via
non-peptide covalent linkages known to those of skill in the
art.
[0068] "FL" refers to full-length, i.e., a polypeptide that is the
same length as the wild-type polypeptide.
[0069] The term "immunogenic fragment thereof" refers to a
polypeptide comprising an epitope that is recognized by cytotoxic T
lymphocytes, helper T lymphocytes or B cells. Preferred immunogenic
fragments of, e.g., MTB32A, are Ra35, Ra35MutSA, or Ra12.
[0070] The term "Mycobacterium species of the tuberculosis complex"
includes those species traditionally considered as causing the
disease tuberculosis, as well as Mycobacterium environmental and
opportunistic species that cause tuberculosis and lung disease in
immune compromised patients, such as patients with AIDS, e.g., M.
tuberculosis, M. bovis, or M. africanum, BCG, M. avium, M.
intracellulare, M. celatum, M. genavense, M. haemophilum, M.
kansasii, M. simiae, M. vaccae, M. fortuitum, and M. scrofulaceum
(see, e.g., Harrison's Principles of Internal Medicine, volume 1,
pp. 1004-1014 and 1019-1023 (14.sup.th ed., Fauci et al., eds.,
1998)).
[0071] An adjuvant refers to the components in a vaccine or
therapeutic composition that increase the specific immune response
to the antigen (see, e.g., Edelman, AIDS Res. Hum. Retroviruses
8:1409-1411 (1992)). Adjuvants induce immune responses of the
Th1 type and the Th2 type. Th1-type cytokines (e.g.,
IFN-.gamma., IL-2, and IL-12) tend to favor the induction of
cell-mediated immune responses to an administered antigen, while
Th2-type cytokines (e.g., IL-4, IL-5, IL-6, IL-10 and TNF-.beta.)
tend to favor the induction of humoral immune responses.
[0072] "Nucleic acid" refers to deoxyribonucleotides or
ribonucleotides and polymers thereof in either single- or
double-stranded form. The term encompasses nucleic acids containing
known nucleotide analogs or modified backbone residues or linkages,
which are synthetic, naturally occurring, and non-naturally
occurring, which have similar binding properties as the reference
nucleic acid, and which are metabolized in a manner similar to the
reference nucleotides. Examples of such analogs include, without
limitation, phosphorothioates, phosphoramidates, methyl
phosphonates, chiral-methyl phosphonates, 2-O-methyl
ribonucleotides, and peptide-nucleic acids (PNAs).
[0073] Unless otherwise indicated, a particular nucleic acid
sequence also implicitly encompasses conservatively modified
variants thereof (e.g., degenerate codon substitutions) and
complementary sequences, as well as the sequence explicitly
indicated. Specifically, degenerate codon substitutions may be
achieved by generating sequences in which the third position of one
or more selected (or all) codons is substituted with mixed-base
and/or deoxyinosine residues (Batzer et al., Nucleic Acids Res.
19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608
(1985); Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). The
term nucleic acid is used interchangeably with gene, cDNA, mRNA,
oligonucleotide, and polynucleotide.
[0074] The terms "polypeptide," "peptide" and "protein" are used
interchangeably herein to refer to a polymer of amino acid
residues. The terms apply to amino acid polymers in which one or
more amino acid residues are artificial chemical mimetics of a
corresponding naturally occurring amino acid, as well as to
naturally occurring amino acid polymers and non-naturally occurring
amino acid polymers.
[0075] The term "amino acid" refers to naturally occurring and
synthetic amino acids, as well as amino acid analogs and amino acid
mimetics that function in a manner similar to the naturally
occurring amino acids. Naturally occurring amino acids are those
encoded by the genetic code, as well as those amino acids that are
later modified, e.g., hydroxyproline, .gamma.-carboxyglutamate, and
O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino
acid, i.e., an .alpha. carbon that is bound to a hydrogen, a
carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such
analogs have modified R groups (e.g., norleucine) or modified
peptide backbones, but retain the same basic chemical structure as
a naturally occurring amino acid. Amino acid mimetics refers to
chemical compounds that have a structure that is different from the
general chemical structure of an amino acid, but that function in
a manner similar to a naturally occurring amino acid.
[0076] Amino acids may be referred to herein by either their
commonly known three letter symbols or by the one-letter symbols
recommended by the IUPAC-IUB Biochemical Nomenclature Commission.
Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes.
[0077] "Conservatively modified variants" applies to both amino
acid and nucleic acid sequences. With respect to particular nucleic
acid sequences, conservatively modified variants refers to those
nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino
acid sequence, to essentially identical sequences. Because of the
degeneracy of the genetic code, a large number of functionally
identical nucleic acids encode any given protein. For instance, the
codons GCA, GCC, GCG and GCU all encode the amino acid alanine.
Thus, at every position where an alanine is specified by a codon,
the codon can be altered to any of the corresponding codons
described without altering the encoded polypeptide. Such nucleic
acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence
herein which encodes a polypeptide also describes every possible
silent variation of the nucleic acid. One of skill will recognize
that each codon in a nucleic acid (except AUG, which is ordinarily
the only codon for methionine, and UGG, which is ordinarily the
only codon for tryptophan) can be modified to yield a functionally
identical molecule. Accordingly, each silent variation of a nucleic
acid which encodes a polypeptide is implicit in each described
sequence.
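The notion of silent variation can be made concrete with a short Python sketch that lists the synonymous codons for a given codon and counts the coding sequences that encode the same polypeptide. It assumes the standard genetic code and RNA codons; the function names are illustrative and do not come from this disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def synonymous_codons(codon: str) -> list:
    """Return all codons (sorted) encoding the same residue as `codon`."""
    residue = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)

def count_silent_variants(rna: str) -> int:
    """Count the coding sequences (including the input itself) that encode
    the same polypeptide as `rna`, whose length must be a multiple of 3."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    total = 1
    for i in range(0, len(rna), 3):
        total *= len(synonymous_codons(rna[i:i + 3]))
    return total

print(synonymous_codons("GCA"))            # the four alanine codons
print(count_silent_variants("AUGGCAUGG"))  # Met-Ala-Trp: only Ala can vary
```

Because methionine and tryptophan each have a single codon, only the alanine position contributes alternatives in the usage example.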
[0078] As to amino acid sequences, one of skill will recognize that
individual substitutions, deletions or additions to a nucleic acid,
peptide, polypeptide, or protein sequence which alters, adds or
deletes a single amino acid or a small percentage of amino acids in
the encoded sequence is a "conservatively modified variant" where
the alteration results in the substitution of an amino acid with a
chemically similar amino acid. Conservative substitution tables
providing functionally similar amino acids are well known in the
art. Such conservatively modified variants are in addition to and
do not exclude polymorphic variants, interspecies homologs, and
alleles of the invention.
[0079] The following eight groups each contain amino acids that are
conservative substitutions for one another:
1) Alanine (A), Glycine (G);
[0080] 2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q);
4) Arginine (R), Lysine (K);
5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V);
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W);
7) Serine (S), Threonine (T); and
8) Cysteine (C), Methionine (M)
[0081] (see, e.g., Creighton, Proteins (1984)).
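The eight groups listed above can be encoded directly as sets, which gives a simple test for whether a substitution is conservative in the sense used here. The following Python sketch is illustrative only; the function name is an assumption, and methionine is deliberately present in two groups because it appears in both group 5 and group 8 above.

```python
# The eight conservative substitution groups, by one-letter symbol.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]

def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)

print(is_conservative_substitution("D", "E"))
print(is_conservative_substitution("A", "W"))
```

Note that group membership is not transitive across groups: methionine pairs with cysteine and with leucine, but cysteine and leucine share no group.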
[0082] The term "heterologous" when used with reference to portions
of a nucleic acid indicates that the nucleic acid comprises two or
more subsequences that are not found in the same relationship to
each other in nature. For instance, the nucleic acid is typically
recombinantly produced, having two or more sequences from unrelated
genes arranged to make a new functional nucleic acid, e.g., a
promoter from one source and a coding region from another source.
Similarly, a heterologous protein indicates that the protein
comprises two or more subsequences that are not found in the same
relationship to each other in nature (e.g., a fusion protein).
[0083] The phrase "selectively (or specifically) hybridizes to"
refers to the binding, duplexing, or hybridizing of a molecule only
to a particular nucleotide sequence under stringent hybridization
conditions when that sequence is present in a complex mixture
(e.g., total cellular or library DNA or RNA).
[0084] The phrase "stringent hybridization conditions" refers to
conditions under which a probe will hybridize to its target
subsequence, typically in a complex mixture of nucleic acid, but to
no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences
hybridize specifically at higher temperatures. An extensive guide
to the hybridization of nucleic acids is found in Tijssen,
Techniques in Biochemistry and Molecular Biology--Hybridization
with Nucleic Acid Probes, "Overview of principles of hybridization and
the strategy of nucleic acid assays" (1993). Generally, stringent
conditions are selected to be about 5-10.degree. C. lower than the
thermal melting point (T.sub.m) for the specific sequence at a
defined ionic strength and pH. The T.sub.m is the temperature (under
defined ionic strength, pH, and nucleic acid concentration) at which 50%
of the probes complementary to the target hybridize to the target
sequence at equilibrium (as the target sequences are present in
excess, at T.sub.m, 50% of the probes are occupied at equilibrium).
Stringent conditions will be those in which the salt concentration
is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M
sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the
temperature is at least about 30.degree. C. for short probes (e.g.,
10 to 50 nucleotides) and at least about 60.degree. C. for long
probes (e.g., greater than 50 nucleotides). Stringent conditions
may also be achieved with the addition of destabilizing agents such
as formamide. For selective or specific hybridization, a positive
signal is at least two times background, optionally 10 times
background hybridization. Exemplary stringent hybridization
conditions can be as follows: 50% formamide, 5.times.SSC, and 1%
SDS, incubating at 42.degree. C., or, 5.times.SSC, 1% SDS,
incubating at 65.degree. C., with wash in 0.2.times.SSC, and 0.1%
SDS at 65.degree. C.
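The general numeric criteria given above for stringent conditions can be collected into a small Python check. This is a minimal sketch that treats the approximate ("about") limits in the text as hard limits, which is a simplifying assumption, and the function names are illustrative. It does not model formamide or other destabilizing agents.

```python
def stringent_temperature_window(tm_celsius: float) -> tuple:
    """Temperature range about 5 to 10 degrees C below the T.sub.m."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)

def meets_general_stringency(sodium_molar: float, ph: float,
                             temp_celsius: float, probe_length: int) -> bool:
    """Check salt, pH and temperature against the general criteria.

    Assumptions: probes of 50 nucleotides or fewer are treated as short
    (minimum about 30 degrees C); longer probes are treated as long
    (minimum about 60 degrees C).
    """
    salt_ok = 0.0 < sodium_molar <= 1.0
    ph_ok = 7.0 <= ph <= 8.3
    min_temp = 30.0 if probe_length <= 50 else 60.0
    return salt_ok and ph_ok and temp_celsius >= min_temp

print(stringent_temperature_window(70.0))
print(meets_general_stringency(0.5, 7.5, 65.0, probe_length=200))
```

In practice the appropriate temperature also depends on the sequence-specific T.sub.m, so a check of this kind is a screening aid rather than a substitute for empirical optimization.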
[0085] Nucleic acids that do not hybridize to each other under
stringent conditions are still substantially identical if the
polypeptides which they encode are substantially identical. This
occurs, for example, when a copy of a nucleic acid is created using
the maximum codon degeneracy permitted by the genetic code. In such
cases, the nucleic acids typically hybridize under moderately
stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of
40% formamide, 1 M NaCl, 1% SDS at 37.degree. C., and a wash in
1.times.SSC at 45.degree. C. A positive hybridization is at least
twice background. Those of ordinary skill will readily recognize
that alternative hybridization and wash conditions can be utilized
to provide conditions of similar stringency.
[0086] "Antibody" refers to a polypeptide comprising a framework
region from an immunoglobulin gene or fragments thereof that
specifically binds and recognizes an antigen. The recognized
immunoglobulin genes include the kappa, lambda, alpha, gamma,
delta, epsilon, and mu constant region genes, as well as the myriad
immunoglobulin variable region genes. Light chains are classified
as either kappa or lambda. Heavy chains are classified as gamma,
mu, alpha, delta, or epsilon, which in turn define the
immunoglobulin classes, IgG, IgM, IgA, IgD and IgE,
respectively.
[0087] An exemplary immunoglobulin (antibody) structural unit
comprises a tetramer. Each tetramer is composed of two identical
pairs of polypeptide chains, each pair having one "light" (about 25
kDa) and one "heavy" chain (about 50-70 kDa). The N-terminus of
each chain defines a variable region of about 100 to 110 or more
amino acids primarily responsible for antigen recognition. The
terms variable light chain (V.sub.L) and variable heavy chain
(V.sub.H) refer to these light and heavy chains respectively.
[0088] Antibodies exist, e.g., as intact immunoglobulins or as a
number of well-characterized fragments produced by digestion with
various peptidases. Thus, for example, pepsin digests an antibody
below the disulfide linkages in the hinge region to produce
F(ab)'.sub.2, a dimer of Fab which itself is a light chain joined
to V.sub.H-C.sub.H1 by a disulfide bond. The F(ab)'.sub.2 may be
reduced under mild conditions to break the disulfide linkage in the
hinge region, thereby converting the F(ab)'.sub.2 dimer into an
Fab' monomer. The Fab' monomer is essentially Fab with part of the
hinge region (see Fundamental Immunology (Paul ed., 3d ed. 1993)).
While various antibody fragments are defined in terms of the
digestion of an intact antibody, one of skill will appreciate that
such fragments may be synthesized de novo either chemically or by
using recombinant DNA methodology. Thus, the term antibody, as used
herein, also includes antibody fragments either produced by the
modification of whole antibodies, or those synthesized de novo
using recombinant DNA methodologies (e.g., single chain Fv) or
those identified using phage display libraries (see, e.g.,
McCafferty et al., Nature 348:552-554 (1990)).
[0089] For preparation of monoclonal or polyclonal antibodies, any
technique known in the art can be used (see, e.g., Kohler &
Milstein, Nature 256:495-497 (1975); Kozbor et al., Immunology
Today 4: 72 (1983); Cole et al., pp. 77-96 in Monoclonal Antibodies
and Cancer Therapy (1985)). Techniques for the production of single
chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to
produce antibodies to polypeptides of this invention. Also,
transgenic mice, or other organisms such as other mammals, may be
used to express humanized antibodies. Alternatively, phage display
technology can be used to identify antibodies and heteromeric Fab
fragments that specifically bind to selected antigens (see, e.g.,
McCafferty et al., Nature 348:552-554 (1990); Marks et al.,
Biotechnology 10:779-783 (1992)).
[0090] The phrase "specifically (or selectively) binds" to an
antibody or "specifically (or selectively) immunoreactive with,"
when referring to a protein or peptide, refers to a binding
reaction that is determinative of the presence of the protein in a
heterogeneous population of proteins and other biologics. Thus,
under designated immunoassay conditions, the specified antibodies
bind to a particular protein at least two times the background and
do not substantially bind in a significant amount to other proteins
present in the sample. Specific binding to an antibody under such
conditions may require an antibody that is selected for its
specificity for a particular protein. For example, polyclonal
antibodies raised to fusion proteins can be selected to obtain only
those polyclonal antibodies that are specifically immunoreactive
with fusion protein and not with individual components of the
fusion proteins. This selection may be achieved by subtracting out
antibodies that cross-react with the individual antigens. A variety
of immunoassay formats may be used to select antibodies
specifically immunoreactive with a particular protein. For example,
solid-phase ELISA immunoassays are routinely used to select
antibodies specifically immunoreactive with a protein (see, e.g.,
Harlow & Lane, Antibodies, A Laboratory Manual (1988), for a
description of immunoassay formats and conditions that can be used
to determine specific immunoreactivity). Typically a specific or
selective reaction will be at least twice background signal or
noise and more typically more than 10 to 100 times background.
[0091] Polynucleotides may comprise a native sequence (i.e., an
endogenous sequence that encodes an individual antigen or a portion
thereof) or may comprise a variant of such a sequence.
Polynucleotide variants may contain one or more substitutions,
additions, deletions and/or insertions such that the biological
activity of the encoded fusion polypeptide is not diminished,
relative to a fusion polypeptide comprising native antigens.
Variants preferably exhibit at least about 70% identity, more
preferably at least about 80% identity and most preferably at least
about 90% identity to a polynucleotide sequence that encodes a
native polypeptide or a portion thereof.
[0092] The terms "identical" or percent "identity," in the context
of two or more nucleic acids or polypeptide sequences, refer to two
or more sequences or subsequences that are the same or have a
specified percentage of amino acid residues or nucleotides that are
the same (i.e., 70% identity, optionally 75%, 80%, 85%, 90%, or 95%
identity over a specified region), when compared and aligned for
maximum correspondence over a comparison window, or designated
region as measured using one of the following sequence comparison
algorithms or by manual alignment and visual inspection. Such
sequences are then said to be "substantially identical." This
definition also refers to the complement of a test sequence.
Optionally, the identity exists over a region that is at least
about 25 to about 50 amino acids or nucleotides in length, or
optionally over a region that is 75-100 amino acids or nucleotides
in length.
[0093] For sequence comparison, typically one sequence acts as a
reference sequence, to which test sequences are compared. When
using a sequence comparison algorithm, test and reference sequences
are entered into a computer, subsequence coordinates are
designated, if necessary, and sequence algorithm program parameters
are designated. Default program parameters can be used, or
alternative parameters can be designated. The sequence comparison
algorithm then calculates the percent sequence identities for the
test sequences relative to the reference sequence, based on the
program parameters.
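As a simple illustration of the percent identity calculation, the following Python sketch compares a test sequence with a reference sequence that has already been aligned to it. It assumes equal-length aligned strings with "-" marking gaps, and it counts every alignment column that is not a gap in both sequences in the denominator; other programs use other denominators, so this convention is an assumption.

```python
def percent_identity(reference: str, test: str) -> float:
    """Percent of compared alignment columns in which residues are identical.

    Both inputs must be pre-aligned to the same length; '-' denotes a gap.
    Columns that are gaps in both sequences are ignored.
    """
    if len(reference) != len(test):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for ref_res, test_res in zip(reference, test):
        if ref_res == "-" and test_res == "-":
            continue
        compared += 1
        if ref_res == test_res:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

print(percent_identity("ACGTACGT", "ACGTTCGA"))
```

Restricting the inputs to a designated subsequence before calling the function corresponds to computing identity over a specified region.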
[0094] A "comparison window", as used herein, includes reference to
a segment of any one of the number of contiguous positions selected
from the group consisting of from 25 to 500, usually about 50 to
about 200, more usually about 100 to about 150 in which a sequence
may be compared to a reference sequence of the same number of
contiguous positions after the two sequences are optimally aligned.
Methods of alignment of sequences for comparison are well-known in
the art. Optimal alignment of sequences for comparison can be
conducted, e.g., by the local homology algorithm of Smith &
Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment
algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970),
by the search for similarity method of Pearson & Lipman, Proc.
Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized
implementations of these algorithms (GAP, BESTFIT, FASTA, and
TFASTA in the Wisconsin Genetics Software Package, Genetics
Computer Group, 575 Science Dr., Madison, Wis.), or by manual
alignment and visual inspection (see, e.g., Current Protocols in
Molecular Biology (Ausubel et al., eds. 1995 supplement)).
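For orientation, the local homology approach of Smith & Waterman can be sketched as a short dynamic program. The Python example below computes only the best local alignment score, with a linear gap penalty; the match, mismatch and gap values are hypothetical choices for illustration and are not parameters taken from this disclosure or from any particular software package.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between sequences a and b.

    Each cell holds the best score of a local alignment ending at that
    pair of positions; scores are floored at zero so that an alignment
    can start anywhere.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                      # start a new local alignment
                previous[j - 1] + pair, # align a[i-1] with b[j-1]
                previous[j] + gap,      # gap in b
                current[j - 1] + gap,   # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best

print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

Only two rows of the scoring matrix are kept in memory, which suffices for the score; recovering the alignment itself would require keeping the full matrix and tracing back from the best cell.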
[0095] One example of a useful algorithm is PILEUP. PILEUP creates
a multiple sequence alignment from a group of related sequences
using progressive, pairwise alignments to show relationship and
percent sequence identity. It also plots a tree or dendrogram
showing the clustering relationships used to create the alignment.
PILEUP uses a simplification of the progressive alignment method of
Feng & Doolittle, J. Mol. Evol. 25:351-360 (1987). The method
used is similar to the method described by Higgins & Sharp,
CABIOS 5:151-153 (1989). The program can align up to 300 sequences,
each of a maximum length of 5,000 nucleotides or amino acids. The
multiple alignment procedure begins with the pairwise alignment of
the two most similar sequences, producing a cluster of two aligned
sequences. This cluster is then aligned to the next most related
sequence or cluster of aligned sequences. Two clusters of sequences
are aligned by a simple extension of the pairwise alignment of two
individual sequences. The final alignment is achieved by a series
of progressive, pairwise alignments. The program is run by
designating specific sequences and their amino acid or nucleotide
coordinates for regions of sequence comparison and by designating
the program parameters. Using PILEUP, a reference sequence is
compared to other test sequences to determine the percent sequence
identity relationship using the following parameters: default gap
weight (3.00), default gap length weight (0.10), and weighted end
gaps. PILEUP can be obtained from the GCG sequence analysis
software package, e.g., version 7.0 (Devereux et al., Nuc. Acids
Res. 12:387-395 (1984)).
[0096] Other examples of algorithms that are suitable for
determining percent sequence identity and sequence similarity are
the BLAST and BLAST 2.0 algorithms, which are described in Altschul
et al., J. Mol. Biol. 215:403-410 (1990) and Altschul et al., Nuc.
Acids Res. 25:3389-3402 (1997), respectively. Software for
performing BLAST analyses is publicly available through the
National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov/). This algorithm involves first
identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or
satisfy some positive-valued threshold score T when aligned with a
word of the same length in a database sequence. T is referred to as
the neighborhood word score threshold (Altschul et al., supra).
These initial neighborhood word hits act as seeds for initiating
searches to find longer HSPs containing them. The word hits are
extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are
calculated using, for nucleotide sequences, the parameters M
(reward score for a pair of matching residues; always >0) and N
(penalty score for mismatching residues; always <0). For amino
acid sequences, a scoring matrix is used to calculate the
cumulative score. Extension of the word hits in each direction is
halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score
goes to zero or below, due to the accumulation of one or more
negative-scoring residue alignments; or the end of either sequence
is reached. The BLAST algorithm parameters W, T, and X determine
the sensitivity and speed of the alignment. The BLASTN program (for
nucleotide sequences) uses as defaults a wordlength (W) of 11, an
expectation (E) of 10, M=5, N=-4 and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a
wordlength of 3, an expectation (E) of 10, and the BLOSUM62
scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci.
USA 89:10915 (1992)), alignments (B) of 50, expectation (E) of 10,
M=5, N=-4, and a comparison of both strands.
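The word-hit extension step described above can be illustrated with a simplified Python sketch. It extends a seed in one direction only, without gaps, for nucleotide sequences, using the reward M and penalty N given above; the drop-off value X and the short seed length used in the example are hypothetical, and the sketch is not the BLAST implementation itself.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 word_len: int, M: int = 5, N: int = -4, X: int = 20) -> tuple:
    """Ungapped rightward extension of a word hit.

    Returns (best_score, best_length), where best_length counts aligned
    positions from the start of the seed. Extension halts when the score
    falls off by X from its maximum, when the score goes to zero or
    below, or when the end of either sequence is reached.
    """
    score = sum(M if query[q_start + k] == subject[s_start + k] else N
                for k in range(word_len))
    best_score, best_length = score, word_len
    k = word_len
    while q_start + k < len(query) and s_start + k < len(subject):
        score += M if query[q_start + k] == subject[s_start + k] else N
        if score > best_score:
            best_score, best_length = score, k + 1
        if best_score - score >= X or score <= 0:
            break
        k += 1
    return best_score, best_length

# Hypothetical sequences, seed of length 4 at position 0 in both.
print(extend_right("ACGTACGTAC", "ACGTACGTTT", 0, 0, word_len=4, X=8))
```

A larger X tolerates longer runs of mismatches before halting, which increases sensitivity at the cost of speed, consistent with the role of the parameters W, T, and X noted above.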
[0097] The BLAST algorithm also performs a statistical analysis of
the similarity between two sequences (see, e.g., Karlin &
Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One
measure of similarity provided by the BLAST algorithm is the
smallest sum probability (P(N)), which provides an indication of
the probability by which a match between two nucleotide or amino
acid sequences would occur by chance. For example, a nucleic acid
is considered similar to a reference sequence if the smallest sum
probability in a comparison of the test nucleic acid to the
reference nucleic acid is less than about 0.2, more preferably less
than about 0.01, and most preferably less than about 0.001.
Polynucleotide Compositions
[0098] As used herein, the terms "DNA segment" and "polynucleotide"
refer to a DNA molecule that has been isolated free of total
genomic DNA of a particular species. Therefore, a DNA segment
encoding a polypeptide refers to a DNA segment that contains one or
more coding sequences yet is substantially isolated away from, or
purified free from, total genomic DNA of the species from which the
DNA segment is obtained. Included within the terms "DNA segment"
and "polynucleotide" are DNA segments and smaller fragments of such
segments, and also recombinant vectors, including, for example,
plasmids, cosmids, phagemids, phage, viruses, and the like.
[0099] As will be understood by those skilled in the art, the DNA
segments of this invention can include genomic sequences,
extra-genomic and plasmid-encoded sequences and smaller engineered
gene segments that express, or may be adapted to express, proteins,
polypeptides, peptides and the like. Such segments may be naturally
isolated, or modified synthetically by the hand of man.
[0100] The terms "isolated," "purified," or "biologically pure"
therefore refer to material that is substantially or essentially
free from components that normally accompany it as found in its
native state. Of course, this refers to the DNA segment as
originally isolated, and does not exclude other isolated proteins,
genes, or coding regions later added to the composition by the hand
of man. Purity and homogeneity are typically determined using
analytical chemistry techniques such as polyacrylamide gel
electrophoresis or high performance liquid chromatography. A
protein that is the predominant species present in a preparation is
substantially purified. An isolated nucleic acid is separated from
other open reading frames that flank the gene and encode proteins
other than the gene.
[0101] As will be recognized by the skilled artisan,
polynucleotides may be single-stranded (coding or antisense) or
double-stranded, and may be DNA (genomic, cDNA or synthetic) or RNA
molecules. RNA molecules include HnRNA molecules, which contain
introns and correspond to a DNA molecule in a one-to-one manner,
and mRNA molecules, which do not contain introns. Additional coding
or non-coding sequences may, but need not, be present within a
polynucleotide of the present invention, and a polynucleotide may,
but need not, be linked to other molecules and/or support
materials.
[0102] Polynucleotides may comprise a native sequence (i.e., an
endogenous sequence that encodes a Mycobacterium antigen or a
portion thereof) or may comprise a variant, or a biological or
antigenic functional equivalent of such a sequence. Polynucleotide
variants may contain one or more substitutions, additions,
deletions and/or insertions, as further described below, preferably
such that the immunogenicity of the encoded polypeptide is not
diminished, relative to a native protein. The effect on the
immunogenicity of the encoded polypeptide may generally be assessed
as described herein. The term "variants" also encompasses
homologous genes of xenogenic origin.
[0103] In additional embodiments, the present invention provides
isolated polynucleotides and polypeptides comprising various
lengths of contiguous stretches of sequence identical to or
complementary to one or more of the sequences disclosed herein. For
example, polynucleotides are provided by this invention that
comprise at least about 15, 20, 30, 40, 50, 75, 100, 150, 200, 300,
400, 500 or 1000 or more contiguous nucleotides of one or more of
the sequences disclosed herein as well as all intermediate lengths
therebetween. It will be readily understood that "intermediate
lengths", in this context, means any length between the quoted
values, such as 16, 17, 18, 19, etc.; 21, 22, 23, etc.; 30, 31, 32,
etc.; 50, 51, 52, 53, etc.; 100, 101, 102, 103, etc.; 150, 151,
152, 153, etc.; including all integers through 200-500; 500-1,000,
and the like.
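Whether a candidate polynucleotide contains a contiguous stretch of a given minimum length that is identical or complementary to a reference sequence can be checked with a simple window search. The Python sketch below is illustrative only; the sequences are hypothetical, the function names are assumptions, and only the reverse complement is considered for the complementary case.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]

def has_contiguous_stretch(candidate: str, reference: str,
                           min_len: int = 15) -> bool:
    """True if `candidate` contains at least `min_len` contiguous
    nucleotides identical to, or complementary to, `reference`."""
    for target in (reference, reverse_complement(reference)):
        for i in range(len(target) - min_len + 1):
            if target[i:i + min_len] in candidate:
                return True
    return False

# Hypothetical 24-nucleotide reference and a candidate sharing 17 of them.
reference = "ATGGCGTACGTTAGCCTAGGATCC"
candidate = "GGG" + reference[3:20] + "TTT"
print(has_contiguous_stretch(candidate, reference, min_len=15))
```

Any intermediate length can be tested by changing `min_len`, e.g., 16, 17, or 18.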
[0104] The polynucleotides of the present invention, or fragments
thereof, regardless of the length of the coding sequence itself,
may be combined with other DNA sequences, such as promoters,
polyadenylation signals, additional restriction enzyme sites,
multiple cloning sites, other coding segments, and the like, such
that their overall length may vary considerably. It is therefore
contemplated that a nucleic acid fragment of almost any length may
be employed, with the total length preferably being limited by the
ease of preparation and use in the intended recombinant DNA
protocol. For example, illustrative DNA segments with total lengths
of about 10,000, about 5,000, about 3,000, about 2,000, about 1,000,
about 500, about 200, about 100, about 50 base pairs in length, and
the like, (including all intermediate lengths) are contemplated to
be useful in many implementations of this invention.
[0105] Moreover, it will be appreciated by those of ordinary skill
in the art that, as a result of the degeneracy of the genetic code,
there are many nucleotide sequences that encode a polypeptide as
described herein. Some of these polynucleotides bear minimal
homology to the nucleotide sequence of any native gene.
Nonetheless, polynucleotides that vary due to differences in codon
usage are specifically contemplated by the present invention, for
example polynucleotides that are optimized for human and/or primate
codon selection. Further, alleles of the genes comprising the
polynucleotide sequences provided herein are within the scope of
the present invention. Alleles are endogenous genes that are
altered as a result of one or more mutations, such as deletions,
additions and/or substitutions of nucleotides. The resulting mRNA
and protein may, but need not, have an altered structure or
function. Alleles may be identified using standard techniques (such
as hybridization, amplification and/or database sequence
comparison).
Polynucleotide Identification and Characterization
[0106] Polynucleotides may be identified, prepared and/or
manipulated using any of a variety of well established techniques.
For example, a polynucleotide may be identified, as described in
more detail below, by screening a microarray of cDNAs for
tumor-associated expression (i.e., expression that is at least two
fold greater in a tumor than in normal tissue, as determined using
a representative assay provided herein). Such screens may be
performed, for example, using a Synteni microarray (Palo Alto,
Calif.) according to the manufacturer's instructions (and
essentially as described by Schena et al., Proc. Natl. Acad. Sci.
USA 93:10614-10619 (1996) and Heller et al., Proc. Natl. Acad. Sci.
USA 94:2150-2155 (1997)). Alternatively, polynucleotides may be
amplified from cDNA prepared from cells expressing the proteins
described herein, such as M. tuberculosis cells. Such
polynucleotides may be amplified via polymerase chain reaction
(PCR). For this approach, sequence-specific primers may be designed
based on the sequences provided herein, and may be purchased or
synthesized.
[0107] An amplified portion of a polynucleotide of the present
invention may be used to isolate a full length gene from a suitable
library (e.g., an M. tuberculosis cDNA library) using well known
techniques. Within such techniques, a library (cDNA or genomic) is
screened using one or more polynucleotide probes or primers
suitable for amplification. Preferably, a library is size-selected
to include larger molecules. Random primed libraries may also be
preferred for identifying 5' and upstream regions of genes. Genomic
libraries are preferred for obtaining introns and extending 5'
sequences.
[0108] For hybridization techniques, a partial sequence may be
labeled (e.g., by nick-translation or end-labeling with .sup.32P)
using well known techniques. A bacterial or bacteriophage library
is then generally screened by hybridizing filters containing
denatured bacterial colonies (or lawns containing phage plaques)
with the labeled probe (see Sambrook et al., Molecular Cloning: A
Laboratory Manual (1989)). Hybridizing colonies or plaques are
selected and expanded, and the DNA is isolated for further
analysis. cDNA clones may be analyzed to determine the amount of
additional sequence by, for example, PCR using a primer from the
partial sequence and a primer from the vector. Restriction maps and
partial sequences may be generated to identify one or more
overlapping clones. The complete sequence may then be determined
using standard techniques, which may involve generating a series of
deletion clones. The resulting overlapping sequences can then
be assembled into a single contiguous sequence. A full length cDNA
molecule can be generated by ligating suitable fragments, using
well known techniques.
[0109] Alternatively, there are numerous amplification techniques
for obtaining a full length coding sequence from a partial cDNA
sequence. Within such techniques, amplification is generally
performed via PCR. Any of a variety of commercially available kits
may be used to perform the amplification step. Primers may be
designed using, for example, software well known in the art.
Primers are preferably 22-30 nucleotides in length, have a GC
content of at least 50% and anneal to the target sequence at
temperatures of about 68.degree. C. to 72.degree. C. The amplified
region may be sequenced as described above, and overlapping
sequences assembled into a contiguous sequence.
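As a non-limiting sketch, the length and GC-content preferences stated above may be checked programmatically before primers are ordered. The Python function names below are hypothetical; the default thresholds are the 22-30 nucleotide length and 50% GC content given above. The annealing temperature is not estimated here, since that would require a melting-temperature model not specified in this description.

```python
def gc_fraction(primer):
    """Return the fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def meets_length_and_gc(primer, min_len=22, max_len=30, min_gc=0.50):
    """Check the primer against the preferred length range and GC content."""
    if not min_len <= len(primer) <= max_len:
        return False
    return gc_fraction(primer) >= min_gc
```

A candidate that fails either test may simply be shifted by a few bases along the known sequence and checked again.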
[0110] One such amplification technique is inverse PCR (see Triglia
et al., Nucl. Acids Res. 16:8186 (1988)), which uses restriction
enzymes to generate a fragment in the known region of the gene. The
fragment is then circularized by intramolecular ligation and used
as a template for PCR with divergent primers derived from the known
region. Within an alternative approach, sequences adjacent to a
partial sequence may be retrieved by amplification with a primer to
a linker sequence and a primer specific to a known region. The
amplified sequences are typically subjected to a second round of
amplification with the same linker primer and a second primer
specific to the known region. A variation on this procedure, which
employs two primers that initiate extension in opposite directions
from the known sequence, is described in WO 96/38591. Another such
technique is known as "rapid amplification of cDNA ends" or RACE.
This technique involves the use of an internal primer and an
external primer, which hybridizes to a polyA region or vector
sequence, to identify sequences that are 5' and 3' of a known
sequence. Additional techniques include capture PCR (Lagerstrom et
al., PCR Methods Applic. 1:111-19 (1991)) and walking PCR (Parker
et al., Nucl. Acids. Res. 19:3055-60 (1991)). Other methods
employing amplification may also be employed to obtain a full
length cDNA sequence.
[0111] In certain instances, it is possible to obtain a full length
cDNA sequence by analysis of sequences provided in an expressed
sequence tag (EST) database, such as that available from GenBank.
Searches for overlapping ESTs may generally be performed using well
known programs (e.g., NCBI BLAST searches), and such ESTs may be
used to generate a contiguous full length sequence. Full length DNA
sequences may also be obtained by analysis of genomic
fragments.
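For purposes of illustration, the generation of a contiguous sequence from two overlapping fragments may be sketched as follows. This Python example assumes exact, error-free overlaps and a hypothetical minimum overlap length; it is a simplification and does not reproduce the alignment performed by BLAST or by sequence assembly software.

```python
def longest_overlap(left, right, min_overlap=20):
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 when no overlap of at least `min_overlap` bases exists.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_pair(left, right, min_overlap=20):
    """Join two fragments across their overlap, or return None if none is found."""
    size = longest_overlap(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]
```

Repeating the pairwise merge over a set of fragments yields progressively longer contiguous sequences. In practice, sequencing errors make approximate matching necessary.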
Polynucleotide Expression in Host Cells
[0112] In other embodiments of the invention, polynucleotide
sequences or fragments thereof which encode polypeptides of the
invention, or fusion proteins or functional equivalents thereof,
may be used in recombinant DNA molecules to direct expression of a
polypeptide in appropriate host cells. Due to the inherent
degeneracy of the genetic code, other DNA sequences that encode
substantially the same or a functionally equivalent amino acid
sequence may be produced and these sequences may be used to clone
and express a given polypeptide.
[0113] As will be understood by those of skill in the art, it may
be advantageous in some instances to produce polypeptide-encoding
nucleotide sequences possessing non-naturally occurring codons. For
example, codons preferred by a particular prokaryotic or eukaryotic
host can be selected to increase the rate of protein expression or
to produce a recombinant RNA transcript having desirable
properties, such as a half-life which is longer than that of a
transcript generated from the naturally occurring sequence.
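The selection of host-preferred codons may be illustrated by a simple back-translation in which each residue is replaced by a single preferred codon. In the Python sketch below, the table of preferred codons is hypothetical and covers only a few residues; it is not measured codon-usage data for any particular host and would be replaced by a usage table for the host actually employed.

```python
# Hypothetical preferred codon per amino acid (one-letter code).
# Illustrative values only, not measured usage data for any host.
PREFERRED_CODONS = {
    "M": "ATG",
    "K": "AAA",
    "L": "CTG",
    "E": "GAA",
}


def back_translate(peptide, preferred=PREFERRED_CODONS):
    """Return a coding sequence that uses one preferred codon per residue.

    Raises KeyError if the table has no entry for a residue.
    """
    return "".join(preferred[amino_acid] for amino_acid in peptide)
```

Because every codon in the table encodes the same amino acid as the native codon it replaces, the encoded polypeptide is unchanged while the nucleotide sequence is adapted to the host.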
[0114] Moreover, the polynucleotide sequences of the present
invention can be engineered using methods generally known in the
art in order to alter polypeptide encoding sequences for a variety
of reasons, including but not limited to, alterations which modify
the cloning, processing, and/or expression of the gene product. For
example, DNA shuffling by random fragmentation and PCR reassembly
of gene fragments and synthetic oligonucleotides may be used to
engineer the nucleotide sequences. In addition, site-directed
mutagenesis may be used to insert new restriction sites, alter
glycosylation patterns, change codon preference, produce splice
variants, or introduce mutations, and so forth.
[0115] In another embodiment of the invention, natural, modified,
or recombinant nucleic acid sequences may be ligated to a
heterologous sequence to encode a fusion protein. For example, to
screen peptide libraries for inhibitors of polypeptide activity, it
may be useful to encode a chimeric protein that can be recognized
by a commercially available antibody. A fusion protein may also be
engineered to contain a cleavage site located between the
polypeptide-encoding sequence and the heterologous protein
sequence, so that the polypeptide may be cleaved and purified away
from the heterologous moiety.
[0116] Sequences encoding a desired polypeptide may be synthesized,
in whole or in part, using chemical methods well known in the art
(see Caruthers, M. H. et al., Nucl. Acids Res. Symp. Ser. pp.
215-223 (1980), Horn et al., Nucl. Acids Res. Symp. Ser. pp.
225-232 (1980)). Alternatively, the protein itself may be produced
using chemical methods to synthesize the amino acid sequence of a
polypeptide, or a portion thereof. For example, peptide synthesis
can be performed using various solid-phase techniques (Roberge et
al., Science 269:202-204 (1995)) and automated synthesis may be
achieved, for example, using the ABI 431A Peptide Synthesizer
(Perkin Elmer, Palo Alto, Calif.).
[0117] A newly synthesized peptide may be substantially purified by
preparative high performance liquid chromatography (e.g.,
Creighton, Proteins, Structures and Molecular Principles (1983)) or
other comparable techniques available in the art. The composition
of the synthetic peptides may be confirmed by amino acid analysis
or sequencing (e.g., the Edman degradation procedure).
Additionally, the amino acid sequence of a polypeptide, or any part
thereof, may be altered during direct synthesis and/or combined
using chemical methods with sequences from other proteins, or any
part thereof, to produce a variant polypeptide.
[0118] In order to express a desired polypeptide, the nucleotide
sequences encoding the polypeptide, or functional equivalents, may
be inserted into an appropriate expression vector, i.e., a vector
which contains the necessary elements for the transcription and
translation of the inserted coding sequence. Methods which are well
known to those skilled in the art may be used to construct
expression vectors containing sequences encoding a polypeptide of
interest and appropriate transcriptional and translational control
elements. These methods include in vitro recombinant DNA
techniques, synthetic techniques, and in vivo genetic
recombination. Such techniques are described in Sambrook et al.,
Molecular Cloning, A Laboratory Manual (1989), and Ausubel et al.,
Current Protocols in Molecular Biology (1989).
[0119] A variety of expression vector/host systems may be utilized
to contain and express polynucleotide sequences. These include, but
are not limited to, microorganisms such as bacteria transformed
with recombinant bacteriophage, plasmid, or cosmid DNA expression
vectors; yeast transformed with yeast expression vectors; insect
cell systems infected with virus expression vectors (e.g.,
baculovirus); plant cell systems transformed with virus expression
vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic
virus, TMV) or with bacterial expression vectors (e.g., Ti or
pBR322 plasmids); or animal cell systems.
[0120] The "control elements" or "regulatory sequences" present in
an expression vector are those non-translated regions of the
vector--enhancers, promoters, 5' and 3' untranslated regions--which
interact with host cellular proteins to carry out transcription and
translation. Such elements may vary in their strength and
specificity. Depending on the vector system and host utilized, any
number of suitable transcription and translation elements,
including constitutive and inducible promoters, may be used. For
example, when cloning in bacterial systems, inducible promoters
such as the hybrid lacZ promoter of the PBLUESCRIPT phagemid
(Stratagene, La Jolla, Calif.) or PSPORT1 plasmid (Gibco BRL,
Gaithersburg, Md.) and the like may be used. In mammalian cell
systems, promoters from mammalian genes or from mammalian viruses
are generally preferred. If it is necessary to generate a cell line
that contains multiple copies of the sequence encoding a
polypeptide, vectors based on SV40 or EBV may be advantageously
used with an appropriate selectable marker.
[0121] In bacterial systems, a number of expression vectors may be
selected depending upon the use intended for the expressed
polypeptide. For example, when large quantities are needed, for
example for the induction of antibodies, vectors which direct high
level expression of fusion proteins that are readily purified may
be used. Such vectors include, but are not limited to, the
multifunctional E. coli cloning and expression vectors such as
BLUESCRIPT (Stratagene), in which the sequence encoding the
polypeptide of interest may be ligated into the vector in frame
with sequences for the amino-terminal Met and the subsequent 7
residues of .beta.-galactosidase so that a hybrid protein is
produced; pIN vectors (Van Heeke & Schuster, J. Biol. Chem.
264:5503-5509 (1989)); and the like. pGEX Vectors (Promega,
Madison, Wis.) may also be used to express foreign polypeptides as
fusion proteins with glutathione S-transferase (GST). In general,
such fusion proteins are soluble and can easily be purified from
lysed cells by adsorption to glutathione-agarose beads followed by
elution in the presence of free glutathione. Proteins made in such
systems may be designed to include heparin, thrombin, or factor Xa
protease cleavage sites so that the cloned polypeptide of interest
can be released from the GST moiety at will.
[0122] In the yeast, Saccharomyces cerevisiae, a number of vectors
containing constitutive or inducible promoters such as alpha
factor, alcohol oxidase, and PGH may be used. For reviews, see
Ausubel et al. (supra) and Grant et al., Methods Enzymol.
153:516-544 (1987).
[0123] In cases where plant expression vectors are used, the
expression of sequences encoding polypeptides may be driven by any
of a number of promoters. For example, viral promoters such as the
35S and 19S promoters of CaMV may be used alone or in combination
with the omega leader sequence from TMV (Takamatsu, EMBO J.
6:307-311 (1987)). Alternatively, plant promoters such as the small
subunit of RUBISCO or heat shock promoters may be used (Coruzzi et
al., EMBO J. 3:1671-1680 (1984); Broglie et al., Science
224:838-843 (1984); and Winter et al., Results Probl. Cell Differ.
17:85-105 (1991)). These constructs can be introduced into plant
cells by direct DNA transformation or pathogen-mediated
transfection. Such techniques are described in a number of
generally available reviews (see, e.g., Hobbs in McGraw Hill
Yearbook of Science and Technology pp. 191-196 (1992)).
[0124] An insect system may also be used to express a polypeptide
of interest. For example, in one such system, Autographa
californica nuclear polyhedrosis virus (AcNPV) is used as a vector
to express foreign genes in Spodoptera frugiperda cells or in
Trichoplusia larvae. The sequences encoding the polypeptide may be
cloned into a non-essential region of the virus, such as the
polyhedrin gene, and placed under control of the polyhedrin
promoter. Successful insertion of the polypeptide-encoding sequence
will render the polyhedrin gene inactive and produce recombinant
virus lacking coat protein. The recombinant viruses may then be
used to infect, for example, S. frugiperda cells or Trichoplusia
larvae in which the polypeptide of interest may be expressed
(Engelhard et al., Proc. Natl. Acad. Sci. U.S.A. 91:3224-3227
(1994)).
[0125] In mammalian host cells, a number of viral-based expression
systems are generally available. For example, in cases where an
adenovirus is used as an expression vector, sequences encoding a
polypeptide of interest may be ligated into an adenovirus
transcription/translation complex consisting of the late promoter
and tripartite leader sequence. Insertion in a non-essential E1 or
E3 region of the viral genome may be used to obtain a viable virus
which is capable of expressing the polypeptide in infected host
cells (Logan & Shenk, Proc. Natl. Acad. Sci. U.S.A.
81:3655-3659 (1984)). In addition, transcription enhancers, such as
the Rous sarcoma virus (RSV) enhancer, may be used to increase
expression in mammalian host cells.
[0126] Specific initiation signals may also be used to achieve more
efficient translation of sequences encoding a polypeptide of
interest. Such signals include the ATG initiation codon and
adjacent sequences. In cases where sequences encoding the
polypeptide, its initiation codon, and upstream sequences are
inserted into the appropriate expression vector, no additional
transcriptional or translational control signals may be needed.
However, in cases where only coding sequence, or a portion thereof,
is inserted, exogenous translational control signals including the
ATG initiation codon should be provided. Furthermore, the
initiation codon should be in the correct reading frame to ensure
translation of the entire insert. Exogenous translational elements
and initiation codons may be of various origins, both natural and
synthetic. The efficiency of expression may be enhanced by the
inclusion of enhancers which are appropriate for the particular
cell system which is used, such as those described in the
literature (Scharf et al., Results Probl. Cell Differ. 20:125-162
(1994)).
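The requirement that the initiation codon be in the correct reading frame may be checked, by way of example, before an insert is cloned. The Python sketch below uses hypothetical function names and tests only that the insert begins with ATG, has a length that is a multiple of three, ends with a stop codon, and contains no in-frame stop codon before its end. It does not evaluate the adjacent sequences or enhancers discussed above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_codons(seq):
    """Split the sequence into codons, reading from its first base."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_complete_open_reading_frame(seq):
    """True if the insert reads ATG ... stop with no internal in-frame stop."""
    codons = in_frame_codons(seq)
    if len(seq) % 3 != 0 or len(codons) < 2:
        return False
    starts_with_atg = codons[0] == "ATG"
    ends_with_stop = codons[-1] in STOP_CODONS
    no_internal_stop = all(c not in STOP_CODONS for c in codons[1:-1])
    return starts_with_atg and ends_with_stop and no_internal_stop
```

An insert that fails this check because of a single extra or missing base is shifted out of frame, so that translation would not proceed through the entire insert.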
[0127] In addition, a host cell strain may be chosen for its
ability to modulate the expression of the inserted sequences or to
process the expressed protein in the desired fashion. Such
modifications of the polypeptide include, but are not limited to,
acetylation, carboxylation, glycosylation, phosphorylation,
lipidation, and acylation. Post-translational processing which
cleaves a "prepro" form of the protein may also be used to
facilitate correct insertion, folding and/or function. Different
host cells such as CHO, HeLa, MDCK, HEK293, and WI38, which have
specific cellular machinery and characteristic mechanisms for such
post-translational activities, may be chosen to ensure the correct
modification and processing of the foreign protein.
[0128] For long-term, high-yield production of recombinant
proteins, stable expression is generally preferred. For example,
cell lines which stably express a polynucleotide of interest may be
transformed using expression vectors which may contain viral
origins of replication and/or endogenous expression elements and a
selectable marker gene on the same or on a separate vector.
Following the introduction of the vector, cells may be allowed to
grow for 1-2 days in an enriched medium before they are switched to
a selective medium. The purpose of the selectable marker is to confer
resistance to selection, and its presence allows growth and
recovery of cells which successfully express the introduced
sequences. Resistant clones of stably transformed cells may be
proliferated using tissue culture techniques appropriate to the
cell type.
[0129] Any number of selection systems may be used to recover
transformed cell lines. These include, but are not limited to, the
herpes simplex virus thymidine kinase (Wigler et al., Cell
11:223-32 (1977)) and adenine phosphoribosyltransferase (Lowy et
al., Cell 22:817-23 (1980)) genes which can be employed in tk.sup.-
or aprt.sup.- cells, respectively. Also, antimetabolite, antibiotic
or herbicide resistance can be used as the basis for selection; for
example, dhfr which confers resistance to methotrexate (Wigler et
al., Proc. Natl. Acad. Sci. U.S.A. 77:3567-70 (1980)); npt, which
confers resistance to the aminoglycosides, neomycin and G-418
(Colbere-Garapin et al., J. Mol. Biol. 150:1-14 (1981)); and als or
pat, which confer resistance to chlorsulfuron and phosphinothricin
acetyltransferase, respectively (Murry, supra). Additional
selectable genes have been described, for example, trpB, which
allows cells to utilize indole in place of tryptophan, or hisD,
which allows cells to utilize histidinol in place of histidine
(Hartman & Mulligan, Proc. Natl. Acad. Sci. U.S.A. 85:8047-51
(1988)). Recently, the use of visible markers has gained popularity
with such markers as anthocyanins, .beta.-glucuronidase and its
substrate GUS, and luciferase and its substrate luciferin, being
widely used not only to identify transformants, but also to
quantify the amount of transient or stable protein expression
attributable to a specific vector system (Rhodes et al., Methods
Mol. Biol. 55:121-131 (1995)).
[0130] Although the presence/absence of marker gene expression
suggests that the gene of interest is also present, its presence
and expression may need to be confirmed. For example, if the
sequence encoding a polypeptide is inserted within a marker gene
sequence, recombinant cells containing sequences can be identified
by the absence of marker gene function. Alternatively, a marker
gene can be placed in tandem with a polypeptide-encoding sequence
under the control of a single promoter. Expression of the marker
gene in response to induction or selection usually indicates
expression of the tandem gene as well.
[0131] Alternatively, host cells which contain and express a
desired polynucleotide sequence may be identified by a variety of
procedures known to those of skill in the art. These procedures
include, but are not limited to, DNA-DNA or DNA-RNA hybridizations
and protein bioassay or immunoassay techniques which include
membrane, solution, or chip based technologies for the detection
and/or quantification of nucleic acid or protein.
[0132] A variety of protocols for detecting and measuring the
expression of polynucleotide-encoded products, using either
polyclonal or monoclonal antibodies specific for the product are
known in the art. Examples include enzyme-linked immunosorbent
assay (ELISA), radioimmunoassay (RIA), and fluorescence activated
cell sorting (FACS). A two-site, monoclonal-based immunoassay
utilizing monoclonal antibodies reactive to two non-interfering
epitopes on a given polypeptide may be preferred for some
applications, but a competitive binding assay may also be employed.
These and other assays are described, among other places, in
Hampton et al., Serological Methods, a Laboratory Manual (1990) and
Maddox et al., J. Exp. Med. 158:1211-1216 (1983).
[0133] A wide variety of labels and conjugation techniques are
known by those skilled in the art and may be used in various
nucleic acid and amino acid assays. Means for producing labeled
hybridization or PCR probes for detecting sequences related to
polynucleotides include oligolabeling, nick translation,
end-labeling or PCR amplification using a labeled nucleotide.
Alternatively, the sequences, or any portions thereof may be cloned
into a vector for the production of an mRNA probe. Such vectors are
known in the art, are commercially available, and may be used to
synthesize RNA probes in vitro by addition of an appropriate RNA
polymerase such as T7, T3, or SP6 and labeled nucleotides. These
procedures may be conducted using a variety of commercially
available kits. Suitable reporter molecules or labels, which may be
used include radionuclides, enzymes, fluorescent, chemiluminescent,
or chromogenic agents as well as substrates, cofactors, inhibitors,
magnetic particles, and the like.
[0134] Host cells transformed with a polynucleotide sequence of
interest may be cultured under conditions suitable for the
expression and recovery of the protein from cell culture. The
protein produced by a recombinant cell may be secreted or contained
intracellularly depending on the sequence and/or the vector used.
As will be understood by those of skill in the art, expression
vectors containing polynucleotides of the invention may be designed
to contain signal sequences which direct secretion of the encoded
polypeptide through a prokaryotic or eukaryotic cell membrane.
Other recombinant constructions may be used to join sequences
encoding a polypeptide of interest to nucleotide sequence encoding
a polypeptide domain which will facilitate purification of soluble
proteins. Such purification facilitating domains include, but are
not limited to, metal chelating peptides such as
histidine-tryptophan modules that allow purification on immobilized
metals, protein A domains that allow purification on immobilized
immunoglobulin, and the domain utilized in the FLAGS
extension/affinity purification system (Immunex Corp., Seattle,
Wash.). The inclusion of cleavable linker sequences such as those
specific for Factor Xa or enterokinase (Invitrogen, San Diego,
Calif.) between the purification domain and the encoded polypeptide
may be used to facilitate purification. One such expression vector
provides for expression of a fusion protein containing a
polypeptide of interest and a nucleic acid encoding 6 histidine
residues preceding a thioredoxin or an enterokinase cleavage site.
The histidine residues facilitate purification on IMAC
(immobilized metal ion affinity chromatography) as described in
Porath et al., Prot. Exp. Purif. 3:263-281 (1992) while the
enterokinase cleavage site provides a means for purifying the
desired polypeptide from the fusion protein. A discussion of
vectors which contain fusion proteins is provided in Kroll et al.,
DNA Cell Biol. 12:441-453 (1993).
[0135] In addition to recombinant production methods, polypeptides
of the invention, and fragments thereof, may be produced by direct
peptide synthesis using solid-phase techniques (Merrifield, J. Am.
Chem. Soc. 85:2149-2154 (1963)). Protein synthesis may be performed
using manual techniques or by automation. Automated synthesis may
be achieved, for example, using Applied Biosystems 431A Peptide
Synthesizer (Perkin Elmer). Alternatively, various fragments may be
chemically synthesized separately and combined using chemical
methods to produce the full length molecule.
In Vivo Polynucleotide Delivery Techniques
[0136] In additional embodiments, genetic constructs comprising one
or more of the polynucleotides of the invention are introduced into
cells in vivo. This may be achieved using any of a variety of well
known approaches, several of which are outlined below for the
purpose of illustration.
[0137] 1. Adenovirus
[0138] One of the preferred methods for in vivo delivery of one or
more nucleic acid sequences involves the use of an adenovirus
expression vector. "Adenovirus expression vector" is meant to
include those constructs containing adenovirus sequences sufficient
to (a) support packaging of the construct and (b) express a
polynucleotide that has been cloned therein in a sense or antisense
orientation. Of course, in the context of an antisense construct,
expression does not require that the gene product be
synthesized.
[0139] The expression vector comprises a genetically engineered
form of an adenovirus. Knowledge of the genetic organization of
adenovirus, a 36 kb, linear, double-stranded DNA virus, allows
substitution of large pieces of adenoviral DNA with foreign
sequences up to 7 kb (Grunhaus & Horwitz, 1992). In contrast to
retrovirus, the adenoviral infection of host cells does not result
in chromosomal integration because adenoviral DNA can replicate in
an episomal manner without potential genotoxicity. Also,
adenoviruses are structurally stable, and no genome rearrangement
has been detected after extensive amplification. Adenovirus can
infect virtually all epithelial cells regardless of their cell
cycle stage. So far, adenoviral infection appears to be linked only
to mild disease such as acute respiratory disease in humans.
[0140] Adenovirus is particularly suitable for use as a gene
transfer vector because of its mid-sized genome, ease of
manipulation, high titer, wide target-cell range and high
infectivity. Both ends of the viral genome contain 100-200 base
pair inverted repeats (ITRs), which are cis elements necessary for
viral DNA replication and packaging. The early (E) and late (L)
regions of the genome contain different transcription units that
are divided by the onset of viral DNA replication. The E1 region
(E1A and E1B) encodes proteins responsible for the regulation of
transcription of the viral genome and a few cellular genes. The
expression of the E2 region (E2A and E2B) results in the synthesis
of the proteins for viral DNA replication. These proteins are
involved in DNA replication, late gene expression and host cell
shut-off (Renan, 1990). The products of the late genes, including
the majority of the viral capsid proteins, are expressed only after
significant processing of a single primary transcript issued by the
major late promoter (MLP). The MLP (located at 16.8 m.u.) is
particularly efficient during the late phase of infection, and all
the mRNAs issued from this promoter possess a 5'-tripartite leader
(TPL) sequence which makes them preferred mRNAs for
translation.
[0141] In a current system, recombinant adenovirus is generated
from homologous recombination between shuttle vector and provirus
vector. Due to the possible recombination between two proviral
vectors, wild-type adenovirus may be generated from this process.
Therefore, it is critical to isolate a single clone of virus from
an individual plaque and examine its genomic structure.
[0142] Generation and propagation of the current adenovirus
vectors, which are replication deficient, depend on a unique helper
cell line, designated 293, which was transformed from human
embryonic kidney cells by Ad5 DNA fragments and constitutively
expresses E1 proteins (Graham et al., 1977). Since the E3 region is
dispensable from the adenovirus genome (Jones & Shenk, 1978),
the current adenovirus vectors, with the help of 293 cells, carry
foreign DNA in either the E1, the E3 or both regions (Graham &
Prevec, 1991). In nature, adenovirus can package approximately 105%
of the wild-type genome (Ghosh-Choudhury et al., 1987), providing
capacity for about 2 extra kb of DNA. Combined with the
approximately 5.5 kb of DNA that is replaceable in the E1 and E3
regions, the maximum capacity of the current adenovirus vector is
under 7.5 kb, or about 15% of the total length of the vector. More
than 80% of the adenovirus viral genome remains in the vector
backbone and is the source of vector-borne cytotoxicity. Also, the
replication deficiency of the E1-deleted virus is incomplete. For
example, leakage of viral gene expression has been observed with
the currently available vectors at high multiplicities of infection
(MOI) (Mulligan, 1993).
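The capacity figures given above may be reproduced by simple arithmetic. The Python sketch below assumes the approximately 36 kb genome length, the approximately 105% packaging limit and the approximately 5.5 kb of replaceable E1 and E3 sequence stated in this description; the constant and function names are hypothetical.

```python
GENOME_KB = 36.0            # approximate wild-type genome length
PACKAGING_LIMIT = 1.05      # packageable fraction of the wild-type length
REPLACEABLE_E1_E3_KB = 5.5  # approximate replaceable E1 and E3 sequence


def packaging_headroom_kb(genome_kb=GENOME_KB, limit=PACKAGING_LIMIT):
    """Extra DNA, in kb, that fits beyond the wild-type genome length."""
    return genome_kb * (limit - 1.0)


def max_insert_kb(genome_kb=GENOME_KB, limit=PACKAGING_LIMIT,
                  replaceable_kb=REPLACEABLE_E1_E3_KB):
    """Headroom plus replaceable sequence, in kb."""
    return packaging_headroom_kb(genome_kb, limit) + replaceable_kb
```

With these inputs the headroom is 36 x 0.05 = 1.8 kb, i.e., about 2 kb, and the combined capacity is 1.8 + 5.5 = 7.3 kb, which is consistent with the stated maximum of under 7.5 kb.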
[0143] Helper cell lines may be derived from human cells such as
human embryonic kidney cells, muscle cells, hematopoietic cells or
other human embryonic mesenchymal or epithelial cells.
Alternatively, the helper cells may be derived from the cells of
other mammalian species that are permissive for human adenovirus.
Such cells include, e.g., Vero cells or other monkey embryonic
mesenchymal or epithelial cells. As stated above, the currently
preferred helper cell line is 293.
[0144] Recently, Racher et al. (1995) disclosed improved methods
for culturing 293 cells and propagating adenovirus. In one format,
natural cell aggregates are grown by inoculating individual cells
into 1 liter siliconized spinner flasks (Techne, Cambridge, UK)
containing 100-200 ml of medium. Following stirring at 40 rpm, the
cell viability is estimated with trypan blue. In another format,
Fibra-Cel microcarriers (Bibby Sterlin, Stone, UK) (5 g/l) are
employed as follows. A cell inoculum, resuspended in 5 ml of
medium, is added to the carrier (50 ml) in a 250 ml Erlenmeyer
flask and left stationary, with occasional agitation, for 1 to 4 h.
The medium is then replaced with 50 ml of fresh medium and shaking
initiated. For virus production, cells are allowed to grow to about
80% confluence, after which time the medium is replaced (to 25% of
the final volume) and adenovirus added at an MOI of 0.05. Cultures
are left stationary overnight, following which the volume is
increased to 100% and shaking commenced for another 72 h.
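By way of example only, the amount of virus stock to be added at a given multiplicity of infection, and the reduced medium volume used during infection, may be calculated as sketched below in Python. The MOI of 0.05 and the 25% volume fraction are taken from the format described above; the cell count, stock titer and final volume in the usage note are assumed values chosen solely for illustration, and the function names are hypothetical.

```python
def infectious_units_needed(cell_count, moi):
    """Plaque-forming units required to infect the culture at the given MOI."""
    return cell_count * moi


def inoculum_volume_ml(cell_count, moi, titer_pfu_per_ml):
    """Volume of virus stock, in ml, that supplies the required units."""
    return infectious_units_needed(cell_count, moi) / titer_pfu_per_ml


def infection_volume_ml(final_volume_ml, fraction=0.25):
    """Medium volume during infection, as a fraction of the final volume."""
    return final_volume_ml * fraction
```

For an assumed culture of 1 x 10^8 cells and an assumed stock titer of 1 x 10^9 plaque-forming units per ml, an MOI of 0.05 corresponds to 5 x 10^6 plaque-forming units, or 0.005 ml of stock. For an assumed final volume of 200 ml, infection would be carried out in 50 ml of medium.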
[0145] Other than the requirement that the adenovirus vector be
replication defective, or at least conditionally defective, the
nature of the adenovirus vector is not believed to be crucial to
the successful practice of the invention. The adenovirus may be of
any of the 42 different known serotypes or subgroups A-F.
Adenovirus type 5 of subgroup C is the preferred starting material
in order to obtain a conditional replication-defective adenovirus
vector for use in the present invention, since Adenovirus type 5 is
a human adenovirus about which a great deal of biochemical and
genetic information is known, and it has historically been used for
most constructions employing adenovirus as a vector.
[0146] As stated above, the typical vector according to the present
invention is replication defective and will not have an adenovirus
E1 region. Thus, it will be most convenient to introduce the
polynucleotide encoding the gene of interest at the position from
which the E1-coding sequences have been removed. However, the
position of insertion of the construct within the adenovirus
sequences is not critical to the invention. The polynucleotide
encoding the gene of interest may also be inserted in lieu of the
deleted E3 region in E3 replacement vectors as described by
Karlsson et al. (1986) or in the E4 region where a helper cell line
or helper virus complements the E4 defect.
[0147] Adenovirus is easy to grow and manipulate and exhibits broad
host range in vitro and in vivo. This group of viruses can be
obtained in high titers, e.g., 10.sup.9-10.sup.11 plaque-forming
units per ml, and they are highly infective. The life cycle of
adenovirus does not require integration into the host cell genome.
The foreign genes delivered by adenovirus vectors are episomal and,
therefore, have low genotoxicity to host cells. No side effects
have been reported in studies of vaccination with wild-type
adenovirus (Couch et al., 1963; Top et al., 1971), demonstrating
their safety and therapeutic potential as in vivo gene transfer
vectors.
[0148] Adenovirus vectors have been used in eukaryotic gene
expression (Levrero et al., 1991; Gomez-Foix et al., 1992) and
vaccine development (Grunhaus & Horwitz, 1992; Graham &
Prevec, 1992). Recently, animal studies suggested that recombinant
adenovirus could be used for gene therapy (Stratford-Perricaudet
& Perricaudet, 1991; Stratford-Perricaudet et al., 1990; Rich
et al., 1993). Studies in administering recombinant adenovirus to
different tissues include trachea instillation (Rosenfeld et al.,
1991; Rosenfeld et al., 1992), muscle injection (Ragot et al.,
1993), peripheral intravenous injections (Herz & Gerard, 1993)
and stereotactic inoculation into the brain (Le Gal La Salle et
al., 1993).
[0149] 2. Retroviruses
[0150] The retroviruses are a group of single-stranded RNA viruses
characterized by an ability to convert their RNA to double-stranded
DNA in infected cells by a process of reverse-transcription
(Coffin, 1990). The resulting DNA then stably integrates into
cellular chromosomes as a provirus and directs synthesis of viral
proteins. The integration results in the retention of the viral
gene sequences in the recipient cell and its descendants. The
retroviral genome contains three genes, gag, pol, and env that code
for capsid proteins, polymerase enzyme, and envelope components,
respectively. A sequence found upstream from the gag gene contains
a signal for packaging of the genome into virions. Two long
terminal repeat (LTR) sequences are present at the 5' and 3' ends
of the viral genome. These contain strong promoter and enhancer
sequences and are also required for integration in the host cell
genome (Coffin, 1990).
[0151] In order to construct a retroviral vector, a nucleic acid
encoding one or more oligonucleotide or polynucleotide sequences of
interest is inserted into the viral genome in the place of certain
viral sequences to produce a virus that is replication-defective.
In order to produce virions, a packaging cell line containing the
gag, pol, and env genes but without the LTR and packaging
components is constructed (Mann et al., 1983). When a recombinant
plasmid containing a cDNA, together with the retroviral LTR and
packaging sequences is introduced into this cell line (by calcium
phosphate precipitation for example), the packaging sequence allows
the RNA transcript of the recombinant plasmid to be packaged into
viral particles, which are then secreted into the culture media
(Nicolas & Rubenstein, 1988; Temin, 1986; Mann et al., 1983).
The media containing the recombinant retroviruses is then
collected, optionally concentrated, and used for gene transfer.
Retroviral vectors are able to infect a broad variety of cell
types. However, integration and stable expression require the
division of host cells (Paskind et al., 1975).
[0152] A novel approach designed to allow specific targeting of
retrovirus vectors was recently developed based on the chemical
modification of a retrovirus by the chemical addition of lactose
residues to the viral envelope. This modification could permit the
specific infection of hepatocytes via asialoglycoprotein
receptors.
[0153] A different approach to targeting of recombinant
retroviruses was designed in which biotinylated antibodies against
a retroviral envelope protein and against a specific cell receptor
were used. The antibodies were coupled via the biotin components by
using streptavidin (Roux et al., 1989). Using antibodies against
major histocompatibility complex class I and class II antigens,
they demonstrated the infection of a variety of human cells that
bore those surface antigens with an ecotropic virus in vitro (Roux
et al., 1989).
[0154] 3. Adeno-Associated Viruses
[0155] AAV (Ridgeway, 1988; Hermonat & Muzyczka, 1984) is a
parvovirus, discovered as a contaminant of adenoviral stocks. It
is a ubiquitous virus (antibodies are present in 85% of the US
human population) that has not been linked to any disease. It is
also classified as a dependovirus, because its replication is
dependent on the presence of a helper virus, such as adenovirus.
Five serotypes have been isolated, of which AAV-2 is the best
characterized. AAV has a single-stranded linear DNA that is
encapsidated into capsid proteins VP1, VP2 and VP3 to form an
icosahedral virion of 20 to 24 nm in diameter (Muzyczka &
McLaughlin, 1988).
[0156] The AAV DNA is approximately 4.7 kilobases long. It contains
two open reading frames and is flanked by two ITRs. There are two
major genes in the AAV genome: rep and cap. The rep gene codes for
proteins responsible for viral replication, whereas cap codes for
capsid proteins VP1-3. Each ITR forms a T-shaped hairpin structure.
These terminal repeats are the only essential cis components of the
AAV for chromosomal integration. Therefore, the AAV can be used as
a vector with all viral coding sequences removed and replaced by
the cassette of genes for delivery. Three viral promoters have been
identified and named p5, p19, and p40, according to their map
position. Transcription from p5 and p19 results in production of
rep proteins, and transcription from p40 produces the capsid
proteins (Hermonat & Muzyczka, 1984).
[0157] There are several factors that prompted researchers to study
the possibility of using rAAV as an expression vector. One is that
the requirements for delivering a gene to integrate into the host
chromosome are surprisingly few. It is necessary to have the 145-bp
ITRs, which are only 6% of the AAV genome. This leaves room in the
vector to assemble a 4.5-kb DNA insertion. While this carrying
capacity may prevent the AAV from delivering large genes, it is
amply suited for delivering the antisense constructs of the present
invention.
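The size figures in the paragraph above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the 6% figure refers to both 145-bp ITRs together relative to the approximately 4.7-kb genome, and the helper names and the example insert sizes are hypothetical.

```python
ITR_BP = 145                # length of one ITR, from the text
GENOME_BP = 4700            # approximate AAV genome length, from the text
STATED_CAPACITY_BP = 4500   # insert size stated in the text


def itr_fraction(itr_bp: int = ITR_BP, genome_bp: int = GENOME_BP) -> float:
    """Fraction of the genome occupied by the two ITRs (assumed reading of the 6% figure)."""
    return 2 * itr_bp / genome_bp


def fits_in_raav(insert_bp: int, capacity_bp: int = STATED_CAPACITY_BP) -> bool:
    """True if a cassette of `insert_bp` is within the stated carrying capacity."""
    return 0 < insert_bp <= capacity_bp


print(f"ITRs occupy {itr_fraction():.1%} of the genome")
print("3.0-kb cassette fits:", fits_in_raav(3000))
print("5.2-kb cassette fits:", fits_in_raav(5200))
```

Simple subtraction (4700 - 290) gives about 4.4 kb, close to the 4.5-kb figure quoted in the text.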
[0158] AAV is also a good choice of delivery vehicles due to its
safety. There is a relatively complicated rescue mechanism: not
only wild type adenovirus but also AAV genes are required to
mobilize rAAV. Likewise, AAV is not pathogenic and not associated
with any disease. The removal of viral coding sequences minimizes
immune reactions to viral gene expression, and therefore, rAAV does
not evoke an inflammatory response.
[0159] 4. Other Viral Vectors as Expression Constructs
[0160] Other viral vectors may be employed as expression constructs
in the present invention for the delivery of oligonucleotide or
polynucleotide sequences to a host cell. Vectors derived from
viruses such as vaccinia virus (Ridgeway, 1988; Coupar et al.,
1988), lentiviruses, polio viruses and herpes viruses may be
employed. They offer several attractive features for various
mammalian cells (Friedmann, 1989; Ridgeway, 1988; Coupar et al.,
1988; Horwich et al., 1990).
[0161] With the recent recognition of defective hepatitis B
viruses, new insight was gained into the structure-function
relationship of different viral sequences. In vitro studies showed
that the virus could retain the ability for helper-dependent
packaging and reverse transcription despite the deletion of up to
80% of its genome (Horwich et al., 1990). This suggested that large
portions of the genome could be replaced with foreign genetic
material. The hepatotropism and persistence (integration) were
particularly attractive properties for liver-directed gene
transfer. Chang et al. (1991) introduced the chloramphenicol
acetyltransferase (CAT) gene into duck hepatitis B virus genome in
the place of the polymerase, surface, and pre-surface coding
sequences. It was cotransfected with wild-type virus into an avian
hepatoma cell line. Culture media containing high titers of the
recombinant virus were used to infect primary duckling hepatocytes.
Stable CAT gene expression was detected for at least 24 days after
transfection (Chang et al., 1991).
[0162] 5. Non-Viral Vectors
[0163] In order to effect expression of the oligonucleotide or
polynucleotide sequences of the present invention, the expression
construct must be delivered into a cell. This delivery may be
accomplished in vitro, as in laboratory procedures for transforming
cell lines, or in vivo or ex vivo, as in the treatment of certain
disease states. As described above, one preferred mechanism for
delivery is via viral infection where the expression construct is
encapsulated in an infectious viral particle.
[0164] Once the expression construct has been delivered into the
cell, the nucleic acid encoding the desired oligonucleotide or
polynucleotide sequences may be positioned and expressed at
different sites. In certain embodiments, the nucleic acid encoding
the construct may be stably integrated into the genome of the cell.
This integration may be in the specific location and orientation
via homologous recombination (gene replacement) or it may be
integrated in a random, non-specific location (gene augmentation).
In yet further embodiments, the nucleic acid may be stably
maintained in the cell as a separate, episomal segment of DNA. Such
nucleic acid segments or "episomes" encode sequences sufficient to
permit maintenance and replication independent of or in
synchronization with the host cell cycle. How the expression
construct is delivered to a cell and where in the cell the nucleic
acid remains is dependent on the type of expression construct
employed.
[0165] In certain embodiments of the invention, the expression
construct comprising one or more oligonucleotide or polynucleotide
sequences may simply consist of naked recombinant DNA or plasmids.
Transfer of the construct may be performed by any of the methods
mentioned above which physically or chemically permeabilize the
cell membrane. This is particularly applicable for transfer in
vitro but it may be applied to in vivo use as well. Dubensky et al.
(1984) successfully injected polyomavirus DNA in the form of
calcium phosphate precipitates into liver and spleen of adult and
newborn mice demonstrating active viral replication and acute
infection. Benvenisty & Reshef (1986) also demonstrated that
direct intraperitoneal injection of calcium phosphate-precipitated
plasmids results in expression of the transfected genes. It is
envisioned that DNA encoding a gene of interest may also be
transferred in a similar manner in vivo and express the gene
product.
[0166] Another embodiment of the invention for transferring a naked
DNA expression construct into cells may involve particle
bombardment. This method depends on the ability to accelerate
DNA-coated microprojectiles to a high velocity allowing them to
pierce cell membranes and enter cells without killing them (Klein
et al., 1987). Several devices for accelerating small particles
have been developed. One such device relies on a high voltage
discharge to generate an electrical current, which in turn provides
the motive force (Yang et al., 1990). The microprojectiles used
have consisted of biologically inert substances such as tungsten or
gold beads.
[0167] Selected organs including the liver, skin, and muscle tissue
of rats and mice have been bombarded in vivo (Yang et al., 1990;
Zelenin et al., 1991). This may require surgical exposure of the
tissue or cells, to eliminate any intervening tissue between the
gun and the target organ, i.e., ex vivo treatment. Again, DNA
encoding a particular gene may be delivered via this method and
still be incorporated by the present invention.
Polypeptide Compositions
[0168] The present invention, in other aspects, provides
polypeptide compositions. Generally, a polypeptide of the invention
will be an isolated polypeptide (or an epitope, variant, or active
fragment thereof) derived from a mammalian species. Preferably, the
polypeptide is encoded by a polynucleotide sequence disclosed
herein or a sequence which hybridizes under moderately stringent
conditions to a polynucleotide sequence disclosed herein.
Alternatively, the polypeptide may be defined as a polypeptide
which comprises a contiguous amino acid sequence from an amino acid
sequence disclosed herein, or which polypeptide comprises an entire
amino acid sequence disclosed herein.
[0169] Immunogenic portions may generally be identified using well
known techniques, such as those summarized in Paul, Fundamental
Immunology, 3rd ed., 243-247 (1993) and references cited therein.
Such techniques include screening polypeptides for the ability to
react with antigen-specific antibodies, antisera and/or T-cell
lines or clones. As used herein, antisera and antibodies are
"antigen-specific" if they specifically bind to an antigen (i.e.,
they react with the protein in an ELISA or other immunoassay, and
do not react detectably with unrelated proteins). Such antisera and
antibodies may be prepared as described herein, and using well
known techniques. An immunogenic portion of a Mycobacterium sp.
protein is a portion that reacts with such antisera and/or T-cells
at a level that is not substantially less than the reactivity of
the full length polypeptide (e.g., in an ELISA and/or T-cell
reactivity assay). Such immunogenic portions may react within such
assays at a level that is similar to or greater than the reactivity
of the full length polypeptide. Such screens may generally be
performed using methods well known to those of ordinary skill in
the art, such as those described in Harlow & Lane, Antibodies:
A Laboratory Manual (1988). For example, a polypeptide may be
immobilized on a solid support and contacted with patient sera to
allow binding of antibodies within the sera to the immobilized
polypeptide. Unbound sera may then be removed and bound antibodies
detected using, for example, .sup.125I-labeled Protein A.
[0170] Polypeptides may be prepared using any of a variety of well
known techniques. Recombinant polypeptides encoded by DNA sequences
as described above may be readily prepared from the DNA sequences
using any of a variety of expression vectors known to those of
ordinary skill in the art. Expression may be achieved in any
appropriate host cell that has been transformed or transfected with
an expression vector containing a DNA molecule that encodes a
recombinant polypeptide. Suitable host cells include prokaryotes,
yeast, and higher eukaryotic cells, such as mammalian cells and
plant cells. Preferably, the host cells employed are E. coli, yeast
or a mammalian cell line such as COS or CHO. Supernatants from
suitable host/vector systems which secrete recombinant protein or
polypeptide into culture media may be first concentrated using a
commercially available filter. Following concentration, the
concentrate may be applied to a suitable purification matrix such
as an affinity matrix or an ion exchange resin. Finally, one or
more reverse phase HPLC steps can be employed to further purify a
recombinant polypeptide.
[0171] Polypeptides of the invention, immunogenic fragments
thereof, and other variants having less than about 100 amino acids,
and generally less than about 50 amino acids, may also be generated
by synthetic means, using techniques well known to those of
ordinary skill in the art. For example, such polypeptides may be
synthesized using any of the commercially available solid-phase
techniques, such as the Merrifield solid-phase synthesis method,
where amino acids are sequentially added to a growing amino acid
chain. See Merrifield, J. Am. Chem. Soc. 85:2149-2154 (1963).
Equipment for automated synthesis of polypeptides is commercially
available from suppliers such as Perkin Elmer/Applied BioSystems
Division (Foster City, Calif.), and may be operated according to
the manufacturer's instructions.
[0172] Within certain specific embodiments, a polypeptide may be a
fusion protein that comprises multiple polypeptides as described
herein, or that comprises at least one polypeptide as described
herein and an unrelated sequence, such as a known tumor protein. A
fusion partner may, for example, assist in providing T helper
epitopes (an immunological fusion partner), preferably T helper
epitopes recognized by humans, or may assist in expressing the
protein (an expression enhancer) at higher yields than the native
recombinant protein. Certain preferred fusion partners are both
immunological and expression enhancing fusion partners. Other
fusion partners may be selected so as to increase the solubility of
the protein or to enable the protein to be targeted to desired
intracellular compartments. Still further fusion partners include
affinity tags, which facilitate purification of the protein.
[0173] Fusion proteins may generally be prepared using standard
techniques, including chemical conjugation. Preferably, a fusion
protein is expressed as a recombinant protein, allowing the
production of increased levels, relative to a non-fused protein, in
an expression system. Briefly, DNA sequences encoding the
polypeptide components may be assembled separately, and ligated
into an appropriate expression vector. The 3' end of the DNA
sequence encoding one polypeptide component is ligated, with or
without a peptide linker, to the 5' end of a DNA sequence encoding
the second polypeptide component so that the reading frames of the
sequences are in phase. This permits translation into a single
fusion protein that retains the biological activity of both
component polypeptides.
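The in-frame joining described above can be sketched in code. This is a minimal illustration under stated assumptions, not the procedure used by the inventors: the function name `fuse_in_frame`, the toy coding sequences and the Gly-Ser linker are all hypothetical, and the check covers only reading frame and stop codons, not biological activity.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets starting at position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(first_cds: str, second_cds: str, linker: str = "") -> str:
    """Join the 3' end of `first_cds` to the 5' end of `second_cds`,
    optionally through `linker`, so that one reading frame is kept."""
    first, link, second = first_cds.upper(), linker.upper(), second_cds.upper()
    for name, part in (("first_cds", first), ("linker", link), ("second_cds", second)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of 3; frame would shift")
    # Drop the stop codon of the first component so translation reads through
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    fused = first + link + second
    # Only the final codon may be a stop codon
    if any(c in STOP_CODONS for c in codons(fused)[:-1]):
        raise ValueError("internal stop codon would truncate the fusion protein")
    return fused


# Toy sequences: Met-Ala-stop, a Gly-Ser linker, Lys-Glu-stop
fused = fuse_in_frame("ATGGCTTAA", "AAAGAATGA", linker="GGTTCT")
print(codons(fused))
```

A linker whose length is not a multiple of three is rejected, since it would put the second component out of phase.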
[0174] A peptide linker sequence may be employed to separate the
first and second polypeptide components by a distance sufficient to
ensure that each polypeptide folds into its secondary and tertiary
structures. Such a peptide linker sequence is incorporated into the
fusion protein using standard techniques well known in the art.
Suitable peptide linker sequences may be chosen based on the
following factors: (1) their ability to adopt a flexible extended
conformation; (2) their inability to adopt a secondary structure
that could interact with functional epitopes on the first and
second polypeptides; and (3) the lack of hydrophobic or charged
residues that might react with the polypeptide functional epitopes.
Preferred peptide linker sequences contain Gly, Asn and Ser
residues. Other near neutral amino acids, such as Thr and Ala may
also be used in the linker sequence. Amino acid sequences which may
be usefully employed as linkers include those disclosed in Maratea
et al., Gene 40:39-46 (1985); Murphy et al., Proc. Natl. Acad. Sci.
USA 83:8258-8262 (1986); U.S. Pat. No. 4,935,233 and U.S. Pat. No.
4,751,180. The linker sequence may generally be from 1 to about 50
amino acids in length. Linker sequences are not required when the
first and second polypeptides have non-essential N-terminal amino
acid regions that can be used to separate the functional domains
and prevent steric interference.
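The composition and length criteria for linkers can be expressed as a simple screen. The sketch below is a hedged illustration: it checks only the residue types named above (Gly, Asn, Ser, plus Thr and Ala) and the stated length range of 1 to about 50 amino acids. It does not assess conformation or secondary structure, and the function name and example sequences are hypothetical.

```python
PREFERRED = set("GNS")     # Gly, Asn, Ser (preferred residues named in the text)
NEAR_NEUTRAL = set("TA")   # Thr, Ala (also permitted by the text)


def check_linker(linker: str, max_len: int = 50) -> list[str]:
    """Return a list of problems found in a one-letter-code linker sequence;
    an empty list means the composition and length criteria are met."""
    linker = linker.upper()
    problems = []
    if not 1 <= len(linker) <= max_len:
        problems.append(f"length {len(linker)} is outside 1-{max_len}")
    disallowed = sorted(set(linker) - PREFERRED - NEAR_NEUTRAL)
    if disallowed:
        problems.append("residues outside Gly/Asn/Ser/Thr/Ala: " + ", ".join(disallowed))
    return problems


print(check_linker("GGGGSGGGGS"))   # flexible Gly/Ser linker
print(check_linker("GGKLGG"))       # contains a charged and a hydrophobic residue
```

A sequence that passes this screen would still need to be evaluated for criteria (1) and (2), which depend on structure rather than composition.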
[0175] The ligated DNA sequences are operably linked to suitable
transcriptional or translational regulatory elements. The
regulatory elements responsible for expression of DNA are located
only 5' to the DNA sequence encoding the first polypeptide.
Similarly, stop codons required to end translation and
transcription termination signals are only present 3' to the DNA
sequence encoding the second polypeptide.
[0176] Fusion proteins are also provided. Such proteins comprise a
polypeptide as described herein together with an unrelated
immunogenic protein. Preferably the immunogenic protein is capable
of eliciting a recall response. Examples of such proteins include
tetanus, tuberculosis and hepatitis proteins (see, e.g., Stoute et
al., New Engl. J. Med. 336:86-91 (1997)).
[0177] Within preferred embodiments, an immunological fusion
partner is derived from protein D, a surface protein of the
gram-negative bacterium Haemophilus influenzae B (WO 91/18926).
Preferably, a protein D derivative comprises approximately the
first third of the protein (e.g., the first N-terminal 100-110
amino acids), and a protein D derivative may be lipidated. Within
certain preferred embodiments, the first 109 residues of a
lipoprotein D fusion partner are included on the N-terminus to
provide the polypeptide with additional exogenous T-cell epitopes
and to increase the expression level in E. coli (thus functioning
as an expression enhancer). The lipid tail ensures optimal
presentation of the antigen to antigen presenting cells. Other
fusion partners include the non-structural protein from influenza
virus, NS1 (hemagglutinin). Typically, the N-terminal 81 amino acids
are used, although different fragments that include T-helper
epitopes may be used.
[0178] In another embodiment, the immunological fusion partner is
the protein known as LYTA, or a portion thereof (preferably a
C-terminal portion). LYTA is derived from Streptococcus pneumoniae,
which synthesizes an N-acetyl-L-alanine amidase known as amidase
LYTA (encoded by the LytA gene; Gene 43:265-292 (1986)). LYTA is an
autolysin that specifically degrades certain bonds in the
peptidoglycan backbone. The C-terminal domain of the LYTA protein
is responsible for the affinity to the choline or to some choline
analogues such as DEAE. This property has been exploited for the
development of E. coli C-LYTA expressing plasmids useful for
expression of fusion proteins. Purification of hybrid proteins
containing the C-LYTA fragment at the amino terminus has been
described (see Biotechnology 10:795-798 (1992)). Within a preferred
embodiment, a repeat portion of LYTA may be incorporated into a
fusion protein. A repeat portion is found in the C-terminal region
starting at residue 178. A particularly preferred repeat portion
incorporates residues 188-305.
[0179] In general, polypeptides (including fusion proteins) and
polynucleotides as described herein are isolated. An "isolated"
polypeptide or polynucleotide is one that is removed from its
original environment. For example, a naturally-occurring protein is
isolated if it is separated from some or all of the coexisting
materials in the natural system. Preferably, such polypeptides are
at least about 90% pure, more preferably at least about 95% pure
and most preferably at least about 99% pure. A polynucleotide is
considered to be isolated if, for example, it is cloned into a
vector that is not a part of the natural environment.
T Cells
[0180] Immunotherapeutic compositions may also, or alternatively,
comprise T cells specific for a Mycobacterium antigen. Such cells
may generally be prepared in vitro or ex vivo, using standard
procedures. For example, T cells may be isolated from bone marrow,
peripheral blood, or a fraction of bone marrow or peripheral blood
of a patient, using a commercially available cell separation
system, such as the Isolex.TM. System, available from Nexell
Therapeutics, Inc. (Irvine, Calif.; see also U.S. Pat. No.
5,240,856; U.S. Pat. No. 5,215,926; WO 89/06280; WO 91/16116 and WO
92/07243). Alternatively, T cells may be derived from related or
unrelated humans, non-human mammals, cell lines or cultures.
[0181] T cells may be stimulated with a polypeptide of the
invention, polynucleotide encoding such a polypeptide, and/or an
antigen presenting cell (APC) that expresses such a polypeptide.
Such stimulation is performed under conditions and for a time
sufficient to permit the generation of T cells that are specific
for the polypeptide. Preferably, the polypeptide or polynucleotide
is present within a delivery vehicle, such as a microsphere, to
facilitate the generation of specific T cells.
[0182] T cells are considered to be specific for a polypeptide of
the invention if the T cells specifically proliferate, secrete
cytokines or kill target cells coated with the polypeptide or
expressing a gene encoding the polypeptide. T cell specificity may
be evaluated using any of a variety of standard techniques. For
example, within a chromium release assay or proliferation assay, a
stimulation index of more than two fold increase in lysis and/or
proliferation, compared to negative controls, indicates T cell
specificity. Such assays may be performed, for example, as
described in Chen et al., Cancer Res. 54:1065-1070 (1994).
Alternatively, detection of the proliferation of T cells may be
accomplished by a variety of known techniques. For example, T cell
proliferation can be detected by measuring an increased rate of DNA
synthesis (e.g., by pulse-labeling cultures of T cells with
tritiated thymidine and measuring the amount of tritiated thymidine
incorporated into DNA). Contact with a polypeptide of the invention
(100 ng/ml-100 .mu.g/ml, preferably 200 ng/ml-25 .mu.g/ml) for 3-7
days should result in at least a two fold increase in proliferation
of the T cells. Contact as described above for 2-3 hours should
result in activation of the T cells, as measured using standard
cytokine assays in which a two fold increase in the level of
cytokine release (e.g., TNF or IFN-.gamma.) is indicative of T cell
activation (see Coligan et al., Current Protocols in Immunology,
vol. 1 (1998)). T cells that have been activated in response to a
polypeptide, polynucleotide or polypeptide-expressing APC may be
CD4.sup.+ and/or CD8.sup.+. Protein-specific T cells may be
expanded using standard techniques. Within preferred embodiments,
the T cells are derived from a patient, a related donor or an
unrelated donor, and are administered to the patient following
stimulation and expansion.
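The two-fold criterion described above amounts to a ratio of a stimulated readout to a negative-control readout. The following is a minimal sketch, assuming readouts such as counts per minute from tritiated thymidine incorporation; the function names and the example values are hypothetical, and the strict "more than two fold" comparison follows the wording used for the chromium release and proliferation assays.

```python
def stimulation_index(stimulated: float, negative_control: float) -> float:
    """Ratio of the stimulated readout to the negative-control readout."""
    if negative_control <= 0:
        raise ValueError("negative_control must be positive")
    if stimulated < 0:
        raise ValueError("stimulated must not be negative")
    return stimulated / negative_control


def is_specific(stimulated: float, negative_control: float, threshold: float = 2.0) -> bool:
    """True if the stimulation index exceeds the threshold (two fold by default)."""
    return stimulation_index(stimulated, negative_control) > threshold


# Hypothetical counts per minute from a proliferation assay
print(stimulation_index(12000, 3000))
print(is_specific(12000, 3000), is_specific(4500, 3000))
```

The same ratio applies to cytokine release, with cytokine concentrations in place of counts per minute.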
[0183] For therapeutic purposes, CD4.sup.+ or CD8.sup.+ T cells
that proliferate in response to a polypeptide, polynucleotide or
APC can be expanded in number either in vitro or in vivo.
Proliferation of such T cells in vitro may be accomplished in a
variety of ways. For example, the T cells can be re-exposed to a
polypeptide, or a short peptide corresponding to an immunogenic
portion of such a polypeptide, with or without the addition of T
cell growth factors, such as interleukin-2, and/or stimulator cells
that synthesize the polypeptide. Alternatively, one or more T cells
that proliferate in the presence of the protein can be expanded in
number by cloning. Methods for cloning cells are well known in the
art, and include limiting dilution.
Pharmaceutical Compositions
[0184] In additional embodiments, the present invention concerns
formulation of one or more of the polynucleotide, polypeptide,
T-cell and/or antibody compositions disclosed herein in
pharmaceutically-acceptable or physiologically-acceptable solutions
for administration to a cell or an animal, either alone, or in
combination with one or more other modalities of therapy. Such
compositions are also useful for diagnostic uses.
[0185] It will also be understood that, if desired, the nucleic
acid segment, RNA, DNA or PNA compositions that express a
polypeptide as disclosed herein may be administered in combination
with other agents as well, such as, e.g., other proteins or
polypeptides or various pharmaceutically-active agents. In fact,
there is virtually no limit to other components that may also be
included, given that the additional agents do not cause a
significant adverse effect upon contact with the target cells or
host tissues. The compositions may thus be delivered along with
various other agents as required in the particular instance. Such
compositions may be purified from host cells or other biological
sources, or alternatively may be chemically synthesized as
described herein. Likewise, such compositions may further comprise
substituted or derivatized RNA or DNA compositions.
[0186] Formulation of pharmaceutically-acceptable excipients and
carrier solutions is well-known to those of skill in the art, as is
the development of suitable dosing and treatment regimens for using
the particular compositions described herein by a variety of
routes, including, e.g., oral, parenteral, intravenous,
intranasal, and intramuscular administration and formulation.
[0187] 1. Oral Delivery
[0188] In certain applications, the pharmaceutical compositions
disclosed herein may be delivered via oral administration to an
animal. As such, these compositions may be formulated with an inert
diluent or with an assimilable edible carrier, or they may be
enclosed in hard- or soft-shell gelatin capsules, or they may be
compressed into tablets, or they may be incorporated directly with
the food of the diet.
[0189] The active compounds may even be incorporated with
excipients and used in the form of ingestible tablets, buccal
tablets, troches, capsules, elixirs, suspensions, syrups, wafers,
and the like (Mathiowitz et al., 1997; Hwang et al., 1998; U.S.
Pat. No. 5,641,515; U.S. Pat. No. 5,580,579 and U.S. Pat. No.
5,792,451, each specifically incorporated herein by reference in
its entirety). The tablets, troches, pills, capsules and the like
may also contain the following: a binder, such as gum tragacanth,
acacia, cornstarch, or gelatin; excipients, such as dicalcium
phosphate; a disintegrating agent, such as corn starch, potato
starch, alginic acid and the like; a lubricant, such as magnesium
stearate; and a sweetening agent, such as sucrose, lactose or
saccharin, or a flavoring agent, such as peppermint,
oil of wintergreen, or cherry flavoring. When the dosage unit form
is a capsule, it may contain, in addition to materials of the above
type, a liquid carrier. Various other materials may be present as
coatings or to otherwise modify the physical form of the dosage
unit. For instance, tablets, pills, or capsules may be coated with
shellac, sugar, or both. A syrup or elixir may contain the active
compound, sucrose as a sweetening agent, methyl and propylparabens as
preservatives, and a dye and flavoring, such as cherry or orange
flavor. Of course, any material used in preparing any dosage unit
form should be pharmaceutically pure and substantially non-toxic in
the amounts employed. In addition, the active compounds may be
incorporated into sustained-release preparations and
formulations.
[0190] Typically, these formulations may contain at least about
0.1% of the active compound or more, although the percentage of the
active ingredient(s) may, of course, be varied and may conveniently
be between about 1 or 2% and about 60% or 70% or more of the weight
or volume of the total formulation. Naturally, the amount of active
compound(s) in each therapeutically useful composition may be
prepared in such a way that a suitable dosage will be obtained in
any given unit dose of the compound. Factors such as solubility,
bioavailability, biological half-life, route of administration,
product shelf life, as well as other pharmacological considerations
will be contemplated by one skilled in the art of preparing such
pharmaceutical formulations, and as such, a variety of dosages and
treatment regimens may be desirable.
[0191] For oral administration the compositions of the present
invention may alternatively be incorporated with one or more
excipients in the form of a mouthwash, dentifrice, buccal tablet,
oral spray, or sublingual orally-administered formulation. For
example, a mouthwash may be prepared incorporating the active
ingredient in the required amount in an appropriate solvent, such
as a sodium borate solution (Dobell's Solution). Alternatively, the
active ingredient may be incorporated into an oral solution such as
one containing sodium borate, glycerin and potassium bicarbonate,
or dispersed in a dentifrice, or added in a
therapeutically-effective amount to a composition that may include
water, binders, abrasives, flavoring agents, foaming agents, and
humectants. Alternatively the compositions may be fashioned into a
tablet or solution form that may be placed under the tongue or
otherwise dissolved in the mouth.
[0192] 2. Injectable Delivery
[0193] In certain circumstances it will be desirable to deliver the
pharmaceutical compositions disclosed herein parenterally,
intravenously, intramuscularly, or even intraperitoneally as
described in U.S. Pat. No. 5,543,158; U.S. Pat. No. 5,641,515 and
U.S. Pat. No. 5,399,363 (each specifically incorporated herein by
reference in its entirety). Solutions of the active compounds as
free base or pharmacologically acceptable salts may be prepared in
water suitably mixed with a surfactant, such as
hydroxypropylcellulose. Dispersions may also be prepared in
glycerol, liquid polyethylene glycols, and mixtures thereof and in
oils. Under ordinary conditions of storage and use, these
preparations contain a preservative to prevent the growth of
microorganisms.
[0194] The pharmaceutical forms suitable for injectable use include
sterile aqueous solutions or dispersions and sterile powders for
the extemporaneous preparation of sterile injectable solutions or
dispersions (U.S. Pat. No. 5,466,468, specifically incorporated
herein by reference in its entirety). In all cases the form must be
sterile and must be fluid to the extent that easy syringability
exists. It must be stable under the conditions of manufacture and
storage and must be preserved against the contaminating action of
microorganisms, such as bacteria and fungi. The carrier can be a
solvent or dispersion medium containing, for example, water,
ethanol, polyol (e.g., glycerol, propylene glycol, and liquid
polyethylene glycol, and the like), suitable mixtures thereof,
and/or vegetable oils. Proper fluidity may be maintained, for
example, by the use of a coating, such as lecithin, by the
maintenance of the required particle size in the case of dispersion
and by the use of surfactants. The prevention of the action of
microorganisms can be facilitated by various antibacterial and
antifungal agents, for example, parabens, chlorobutanol, phenol,
sorbic acid, thimerosal, and the like. In many cases, it will be
preferable to include isotonic agents, for example, sugars or
sodium chloride. Prolonged absorption of the injectable
compositions can be brought about by the use in the compositions of
agents delaying absorption, for example, aluminum monostearate and
gelatin.
[0195] For parenteral administration in an aqueous solution, for
example, the solution should be suitably buffered if necessary and
the liquid diluent first rendered isotonic with sufficient saline
or glucose. These particular aqueous solutions are especially
suitable for intravenous, intramuscular, subcutaneous and
intraperitoneal administration. In this connection, a sterile
aqueous medium that can be employed will be known to those of skill
in the art in light of the present disclosure. For example, one
dosage may be dissolved in 1 ml of isotonic NaCl solution and
either added to 1000 ml of hypodermoclysis fluid or injected at the
proposed site of infusion (see, e.g., Remington's Pharmaceutical
Sciences, 15th Edition, pp. 1035-1038 and 1570-1580). Some
variation in dosage will necessarily occur depending on the
condition of the subject being treated. The person responsible for
administration will, in any event, determine the appropriate dose
for the individual subject. Moreover, for human administration,
preparations should meet sterility, pyrogenicity, and the general
safety and purity standards as required by FDA Office of Biologics
standards.
[0196] Sterile injectable solutions are prepared by incorporating
the active compounds in the required amount in the appropriate
solvent with various of the other ingredients enumerated above, as
required, followed by filtered sterilization. Generally,
dispersions are prepared by incorporating the various sterilized
active ingredients into a sterile vehicle which contains the basic
dispersion medium and the required other ingredients from those
enumerated above. In the case of sterile powders for the
preparation of sterile injectable solutions, the preferred methods
of preparation are vacuum-drying and freeze-drying techniques which
yield a powder of the active ingredient plus any additional desired
ingredient from a previously sterile-filtered solution thereof.
[0197] The compositions disclosed herein may be formulated in a
neutral or salt form. Pharmaceutically-acceptable salts include
the acid addition salts (formed with the free amino groups of the
protein), which are formed with inorganic acids such as, for
example, hydrochloric or phosphoric acids, or such organic acids as
acetic, oxalic, tartaric, mandelic, and the like. Salts formed with
the free carboxyl groups can also be derived from inorganic bases
such as, for example, sodium, potassium, ammonium, calcium, or
ferric hydroxides, and such organic bases as isopropylamine,
trimethylamine, histidine, procaine and the like. Upon formulation,
solutions will be administered in a manner compatible with the
dosage formulation and in such amount as is therapeutically
effective. The formulations are easily administered in a variety of
dosage forms such as injectable solutions, drug-release capsules,
and the like.
[0198] As used herein, "carrier" includes any and all solvents,
dispersion media, vehicles, coatings, diluents, antibacterial and
antifungal agents, isotonic and absorption delaying agents,
buffers, carrier solutions, suspensions, colloids, and the like.
The use of such media and agents for pharmaceutically active
substances is well known in the art. Except insofar as any
conventional media or agent is incompatible with the active
ingredient, its use in the therapeutic compositions is
contemplated. Supplementary active ingredients can also be
incorporated into the compositions.
[0199] The phrase "pharmaceutically-acceptable" refers to molecular
entities and compositions that do not produce an allergic or
similar untoward reaction when administered to a human. The
preparation of an aqueous composition that contains a protein as an
active ingredient is well understood in the art. Typically, such
compositions are prepared as injectables, either as liquid
solutions or suspensions; solid forms suitable for solution in, or
suspension in, liquid prior to injection can also be prepared. The
preparation can also be emulsified.
[0200] 3. Nasal Delivery
[0201] In certain embodiments, the pharmaceutical compositions may
be delivered by intranasal sprays, inhalation, and/or other aerosol
delivery vehicles. Methods for delivering genes, nucleic acids, and
peptide compositions directly to the lungs via nasal aerosol sprays
have been described, e.g., in U.S. Pat. No. 5,756,353 and U.S. Pat.
No. 5,804,212 (each specifically incorporated herein by reference
in its entirety). Likewise, the delivery of drugs using intranasal
microparticle resins (Takenaga et al., 1998) and
lysophosphatidyl-glycerol compounds (U.S. Pat. No. 5,725,871,
specifically incorporated herein by reference in its entirety) is
also well known in the pharmaceutical arts. Likewise, transmucosal
drug delivery in the form of a polytetrafluoroethylene support
matrix is described in U.S. Pat. No. 5,780,045 (specifically
incorporated herein by reference in its entirety).
[0202] 4. Liposome-, Nanocapsule-, and Microparticle-Mediated
Delivery
[0203] In certain embodiments, the inventors contemplate the use of
liposomes, nanocapsules, microparticles, microspheres, lipid
particles, vesicles, and the like, for the introduction of the
compositions of the present invention into suitable host cells. In
particular, the compositions of the present invention may be
formulated for delivery either encapsulated in a lipid particle, a
liposome, a vesicle, a nanosphere, or a nanoparticle or the
like.
[0204] Such formulations may be preferred for the introduction of
pharmaceutically-acceptable formulations of the nucleic acids or
constructs disclosed herein. The formation and use of liposomes is
generally known to those of skill in the art (see for example,
Couvreur et al., 1977; Couvreur, 1988; Lasic, 1998; which describes
the use of liposomes and nanocapsules in the targeted antibiotic
therapy for intracellular bacterial infections and diseases).
Recently, liposomes were developed with improved serum stability
and circulation half-times (Gabizon & Papahadjopoulos, 1988;
Allen and Chonn, 1987; U.S. Pat. No. 5,741,516, specifically
incorporated herein by reference in its entirety). Further, various
methods of liposome and liposome-like preparations as potential
drug carriers have been reviewed (Takakura, 1998; Chandran et al.,
1997; Margalit, 1995; U.S. Pat. No. 5,567,434; U.S. Pat. No.
5,552,157; U.S. Pat. No. 5,565,213; U.S. Pat. No. 5,738,868 and
U.S. Pat. No. 5,795,587, each specifically incorporated herein by
reference in its entirety).
[0205] Liposomes have been used successfully with a number of cell
types that are normally resistant to transfection by other
procedures including T cell suspensions, primary hepatocyte
cultures and PC12 cells (Renneisen et al., 1990; Muller et al.,
1990). In addition, liposomes are free of the DNA length
constraints that are typical of viral-based delivery systems.
Liposomes have been used effectively to introduce genes, drugs
(Heath & Martin, 1986; Heath et al., 1986; Balazsovits et al.,
1989; Fresta & Puglisi, 1996), radiotherapeutic agents (Pikul
et al., 1987), enzymes (Imaizumi et al., 1990a; Imaizumi et al.,
1990b), viruses (Faller & Baltimore, 1984), transcription
factors and allosteric effectors (Nicolau & Gersonde, 1979)
into a variety of cultured cell lines and animals. In addition,
several successful clinical trials examining the effectiveness of
liposome-mediated drug delivery have been completed
(Lopez-Berestein et al., 1985a; 1985b; Coune, 1988; Sculier et al.,
1988). Furthermore, several studies suggest that the use of
liposomes is not associated with autoimmune responses, toxicity or
gonadal localization after systemic delivery (Mori & Fukatsu,
1992).
[0206] Liposomes are formed from phospholipids that are dispersed
in an aqueous medium and spontaneously form multilamellar
concentric bilayer vesicles (also termed multilamellar vesicles
(MLVs)). MLVs generally have diameters of from 25 nm to 4 μm.
Sonication of MLVs results in the formation of small unilamellar
vesicles (SUVs) with diameters in the range of 200 to 500 Å,
containing an aqueous solution in the core.
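Because the two size ranges above are stated in different units, it can help to place them on a common scale. The following is a minimal sketch, assuming the standard conversions 1 nm = 10 angstroms and 1 micrometer = 1000 nm and treating the stated ranges as inclusive; the function and constant names are hypothetical.

```python
def angstrom_to_nm(value_angstrom: float) -> float:
    """Convert angstroms to nanometers (10 angstroms per nanometer)."""
    return value_angstrom / 10.0


def micrometer_to_nm(value_um: float) -> float:
    """Convert micrometers to nanometers (1000 nanometers per micrometer)."""
    return value_um * 1000.0


# Diameter ranges stated in the text, expressed in nanometers.
MLV_RANGE_NM = (25.0, micrometer_to_nm(4.0))
SUV_RANGE_NM = (angstrom_to_nm(200.0), angstrom_to_nm(500.0))


def possible_classes(diameter_nm: float) -> list[str]:
    """Return every vesicle class whose stated diameter range contains the value."""
    classes = []
    if MLV_RANGE_NM[0] <= diameter_nm <= MLV_RANGE_NM[1]:
        classes.append("MLV")
    if SUV_RANGE_NM[0] <= diameter_nm <= SUV_RANGE_NM[1]:
        classes.append("SUV")
    return classes
```

On a common scale the SUV range is 20 to 50 nm and the MLV range is 25 to 4000 nm, so the two overlap between 25 and 50 nm; diameter alone does not distinguish the vesicle types in that window, and lamellarity would also have to be considered.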
[0207] Liposomes bear resemblance to cellular membranes and are
contemplated for use in connection with the present invention as
carriers for the peptide compositions. They are widely suitable as
both water- and lipid-soluble substances can be entrapped, i.e. in
the aqueous spaces and within the bilayer itself, respectively. It
is possible that the drug-bearing liposomes may even be employed
for site-specific delivery of active agents by selectively
modifying the liposomal formulation.
[0208] In addition to the teachings of Couvreur et al. (1977;
1988), the following information may be utilized in generating
liposomal formulations. Phospholipids can form a variety of
structures other than liposomes when dispersed in water, depending
on the molar ratio of lipid to water. At low ratios the liposome is
the preferred structure. The physical characteristics of liposomes
depend on pH, ionic strength and the presence of divalent cations.
Liposomes can show low permeability to ionic and polar substances,
but at elevated temperatures undergo a phase transition which
markedly alters their permeability. The phase transition involves a
change from a closely packed, ordered structure, known as the gel
state, to a loosely packed, less-ordered structure, known as the
fluid state. This occurs at a characteristic phase-transition
temperature and results in an increase in permeability to ions,
sugars and drugs.
[0209] In addition to temperature, exposure to proteins can alter
the permeability of liposomes. Certain soluble proteins, such as
cytochrome c, bind, deform and penetrate the bilayer, thereby
causing changes in permeability. Cholesterol inhibits this
penetration of proteins, apparently by packing the phospholipids
more tightly. It is contemplated that the most useful liposome
formations for antibiotic and inhibitor delivery will contain
cholesterol.
[0210] The ability to trap solutes varies between different types
of liposomes. For example, MLVs are moderately efficient at
trapping solutes, but SUVs are extremely inefficient. SUVs offer
the advantage of homogeneity and reproducibility in size
distribution, however, and a compromise between size and trapping
efficiency is offered by large unilamellar vesicles (LUVs). These
are prepared by ether evaporation and are three to four times more
efficient at solute entrapment than MLVs.
[0211] In addition to liposome characteristics, an important
determinant in entrapping compounds is the physicochemical
properties of the compound itself. Polar compounds are trapped in
the aqueous spaces and nonpolar compounds bind to the lipid bilayer
of the vesicle. Polar compounds are released through permeation or
when the bilayer is broken, but nonpolar compounds remain
affiliated with the bilayer unless it is disrupted by temperature
or exposure to lipoproteins. Both types show maximum efflux rates
at the phase transition temperature.
[0212] Liposomes interact with cells via four different mechanisms:
endocytosis by phagocytic cells of the reticuloendothelial system
such as macrophages and neutrophils; adsorption to the cell
surface, either by nonspecific weak hydrophobic or electrostatic
forces, or by specific interactions with cell-surface components;
fusion with the plasma cell membrane by insertion of the lipid
bilayer of the liposome into the plasma membrane, with simultaneous
release of liposomal contents into the cytoplasm; and transfer
of liposomal lipids to cellular or subcellular membranes, or vice
versa, without any association of the liposome contents. It often
is difficult to determine which mechanism is operative and more
than one may operate at the same time.
[0213] The fate and disposition of intravenously injected liposomes
depend on their physical properties, such as size, fluidity, and
surface charge. They may persist in tissues for hours or days,
depending on their composition, and half-lives in the blood range
from minutes to several hours. Larger liposomes, such as MLVs and LUVs, are
taken up rapidly by phagocytic cells of the reticuloendothelial
system, but physiology of the circulatory system restrains the exit
of such large species at most sites. They can exit only in places
where large openings or pores exist in the capillary endothelium,
such as the sinusoids of the liver or spleen. Thus, these organs
are the predominant sites of uptake. On the other hand, SUVs show a
broader tissue distribution but still are sequestered highly in the
liver and spleen. In general, this in vivo behavior limits the
potential targeting of liposomes to only those organs and tissues
accessible to their large size. These include the blood, liver,
spleen, bone marrow, and lymphoid organs.
[0214] Targeting is generally not a limitation in terms of the
present invention. However, should specific targeting be desired,
methods are available for this to be accomplished. Antibodies may
be used to bind to the liposome surface and to direct the antibody
and its drug contents to specific antigenic receptors located on a
particular cell-type surface. Carbohydrate determinants
(glycoprotein or glycolipid cell-surface components that play a
role in cell-cell recognition, interaction and adhesion) may also
be used as recognition sites as they have potential in directing
liposomes to particular cell types. Mostly, it is contemplated that
intravenous injection of liposomal preparations would be used, but
other routes of administration are also conceivable.
[0215] Alternatively, the invention provides for
pharmaceutically-acceptable nanocapsule formulations of the
compositions of the present invention. Nanocapsules can generally
entrap compounds in a stable and reproducible way (Henry-Michelland
et al., 1987; Quintanar-Guerrero et al., 1998; Douglas et al.,
1987). To avoid side effects due to intracellular polymeric
overloading, such ultrafine particles (sized around 0.1 μm)
should be designed using polymers able to be degraded in vivo.
Biodegradable polyalkyl-cyanoacrylate nanoparticles that meet these
requirements are contemplated for use in the present invention.
Such particles may be easily made, as described (Couvreur et
al., 1980; 1988; zur Muhlen et al., 1998; Zambaux et al., 1998;
Pinto-Alphandary et al., 1995 and U.S. Pat. No. 5,145,684,
specifically incorporated herein by reference in its entirety).
Vaccines
[0216] In certain preferred embodiments of the present invention,
vaccines are provided. The vaccines will generally comprise one or
more pharmaceutical compositions, such as those discussed above, in
combination with an immunostimulant. An immunostimulant may be any
substance that enhances or potentiates an immune response (antibody
and/or cell-mediated) to an exogenous antigen. Examples of
immunostimulants include adjuvants, biodegradable microspheres
(e.g., polylactic glycolide) and liposomes (into which the compound
is incorporated; see, e.g., Fullerton, U.S. Pat. No. 4,235,877).
Vaccine preparation is generally described in, for example, Powell
& Newman, eds., Vaccine Design (the subunit and adjuvant
approach) (1995). Pharmaceutical compositions and vaccines within
the scope of the present invention may also contain other
compounds, which may be biologically active or inactive. For
example, one or more immunogenic portions of other tumor antigens
may be present, either incorporated into a fusion polypeptide or as
a separate compound, within the composition or vaccine.
[0217] Illustrative vaccines may contain DNA encoding one or more
of the polypeptides as described above, such that the polypeptide
is generated in situ. As noted above, the DNA may be present within
any of a variety of delivery systems known to those of ordinary
skill in the art, including nucleic acid expression systems,
bacteria and viral expression systems. Numerous gene delivery
techniques are well known in the art, such as those described by
Rolland, Crit. Rev. Therap. Drug Carrier Systems 15:143-198 (1998),
and references cited therein. Appropriate nucleic acid expression
systems contain the necessary DNA sequences for expression in the
patient (such as a suitable promoter and terminating signal).
Bacterial delivery systems involve the administration of a
bacterium (such as Bacillus Calmette-Guerin) that expresses an
immunogenic portion of the polypeptide on its cell surface or
secretes such an epitope. In a preferred embodiment, the DNA may be
introduced using a viral expression system (e.g., vaccinia or other
pox virus, retrovirus, or adenovirus), which may involve the use of
a non-pathogenic (defective), replication competent virus. Suitable
systems are disclosed, for example, in Fisher-Hoch et al., Proc.
Natl. Acad. Sci. USA 86:317-321 (1989); Flexner et al., Ann. N.Y.
Acad. Sci. 569:86-103 (1989); Flexner et al., Vaccine 8:17-21
(1990); U.S. Pat. Nos. 4,603,112, 4,769,330, and 5,017,487; WO
89/01973; U.S. Pat. No. 4,777,127; GB 2,200,651; EP 0,345,242; WO
91/02805; Berkner, Biotechniques 6:616-627 (1988); Rosenfeld et
al., Science 252:431-434 (1991); Kolls et al., Proc. Natl. Acad.
Sci. USA 91:215-219 (1994); Kass-Eisler et al., Proc. Natl. Acad.
Sci. USA 90:11498-11502 (1993); Guzman et al., Circulation
88:2838-2848 (1993); and Guzman et al., Circ. Res. 73:1202-1207
(1993). Techniques for incorporating DNA into such expression
systems are well known to those of ordinary skill in the art. The
DNA may also be "naked," as described, for example, in Ulmer et
al., Science 259:1745-1749 (1993) and reviewed by Cohen, Science
259:1691-1692 (1993). The uptake of naked DNA may be increased by
coating the DNA onto biodegradable beads, which are efficiently
transported into the cells. It will be apparent that a vaccine may
comprise both a polynucleotide and a polypeptide component. Such
vaccines may provide for an enhanced immune response.
[0218] It will be apparent that a vaccine may contain
pharmaceutically acceptable salts of the polynucleotides and
polypeptides provided herein. Such salts may be prepared from
pharmaceutically acceptable non-toxic bases, including organic
bases (e.g., salts of primary, secondary and tertiary amines and
basic amino acids) and inorganic bases (e.g., sodium, potassium,
lithium, ammonium, calcium and magnesium salts).
[0219] While any suitable carrier known to those of ordinary skill
in the art may be employed in the vaccine compositions of this
invention, the type of carrier will vary depending on the mode of
administration. Compositions of the present invention may be
formulated for any appropriate manner of administration, including
for example, topical, oral, nasal, intravenous, intracranial,
intraperitoneal, subcutaneous or intramuscular administration. For
parenteral administration, such as subcutaneous injection, the
carrier preferably comprises water, saline, alcohol, a fat, a wax
or a buffer. For oral administration, any of the above carriers or
a solid carrier, such as mannitol, lactose, starch, magnesium
stearate, sodium saccharin, talcum, cellulose, glucose, sucrose,
and magnesium carbonate, may be employed. Biodegradable
microspheres (e.g., polylactate polyglycolate) may also be employed
as carriers for the pharmaceutical compositions of this invention.
Suitable biodegradable microspheres are disclosed, for example, in
U.S. Pat. Nos. 4,897,268; 5,075,109; 5,928,647; 5,811,128;
5,820,883; 5,853,763; 5,814,344 and 5,942,252. One may also employ
a carrier comprising the particulate-protein complexes described in
U.S. Pat. No. 5,928,647, which are capable of inducing class
I-restricted cytotoxic T lymphocyte responses in a host.
[0220] Such compositions may also comprise buffers (e.g., neutral
buffered saline or phosphate buffered saline), carbohydrates (e.g.,
glucose, mannose, sucrose or dextrans), mannitol, proteins,
polypeptides or amino acids such as glycine, antioxidants,
bacteriostats, chelating agents such as EDTA or glutathione,
adjuvants (e.g., aluminum hydroxide), solutes that render the
formulation isotonic, hypotonic or weakly hypertonic with the blood
of a recipient, suspending agents, thickening agents and/or
preservatives. Alternatively, compositions of the present invention
may be formulated as a lyophilizate. Compounds may also be
encapsulated within liposomes using well known technology.
[0221] Any of a variety of immunostimulants may be employed in the
vaccines of this invention. For example, an adjuvant may be
included. Most adjuvants contain a substance designed to protect
the antigen from rapid catabolism, such as aluminum hydroxide or
mineral oil, and a stimulator of immune responses, such as lipid A,
Bordetella pertussis or Mycobacterium species or Mycobacterium
derived proteins. For example, delipidated, deglycolipidated M.
vaccae ("pVac") can be used. In another embodiment, BCG is used as
an adjuvant. In addition, the vaccine can be administered to a
subject previously exposed to BCG. Suitable adjuvants are
commercially available as, for example, Freund's Incomplete
Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
Mich.); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, N.J.);
AS-2 and derivatives thereof (SmithKline Beecham, Philadelphia,
Pa.); CWS, TDM, LeIF, aluminum salts such as aluminum hydroxide gel
(alum) or aluminum phosphate; salts of calcium, iron or zinc; an
insoluble suspension of acylated tyrosine; acylated sugars;
cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid
A and quil A. Cytokines, such as GM-CSF or interleukin-2, -7, or
-12, may also be used as adjuvants.
[0222] Within the vaccines provided herein, the adjuvant
composition is preferably designed to induce an immune response
predominantly of the Th1 type. High levels of Th1-type cytokines
(e.g., IFN-γ, TNFα, IL-2 and IL-12) tend to favor the
induction of cell mediated immune responses to an administered
antigen. In contrast, high levels of Th2-type cytokines (e.g.,
IL-4, IL-5, IL-6 and IL-10) tend to favor the induction of humoral
immune responses. Following application of a vaccine as provided
herein, a patient will support an immune response that includes
Th1- and Th2-type responses. Within a preferred embodiment, in
which a response is predominantly Th1-type, the level of Th1-type
cytokines will increase to a greater extent than the level of
Th2-type cytokines. The levels of these cytokines may be readily
assessed using standard assays. For a review of the families of
cytokines, see Mosmann & Coffman, Ann. Rev. Immunol. 7:145-173
(1989).
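The comparison described above, in which Th1-type cytokine levels rise to a greater extent than Th2-type levels, can be illustrated numerically. The following is a minimal sketch and not the assay itself: it assumes that cytokine levels measured before and after vaccination are available as positive numbers in consistent units, and it uses the mean fold change per group as the measure of "extent", which is an assumption made only for illustration. All names are hypothetical.

```python
from statistics import mean

# Cytokine groups as listed in the text (ASCII spellings used as dictionary keys).
TH1_CYTOKINES = ("IFN-gamma", "TNF-alpha", "IL-2", "IL-12")
TH2_CYTOKINES = ("IL-4", "IL-5", "IL-6", "IL-10")


def mean_fold_change(before: dict, after: dict, names: tuple) -> float:
    """Average post/pre ratio over the cytokines in `names` measured at both times."""
    ratios = []
    for name in names:
        if name in before and name in after:
            if before[name] <= 0:
                raise ValueError(f"baseline for {name} must be positive")
            ratios.append(after[name] / before[name])
    if not ratios:
        raise ValueError("no paired measurements for this cytokine group")
    return mean(ratios)


def is_th1_predominant(before: dict, after: dict) -> bool:
    """True when Th1-type cytokines increase more, on average, than Th2-type ones."""
    th1 = mean_fold_change(before, after, TH1_CYTOKINES)
    th2 = mean_fold_change(before, after, TH2_CYTOKINES)
    return th1 > th2
```

Other summary measures (for example, absolute differences or a single marker pair such as IFN-γ versus IL-4) could be substituted; the text does not prescribe one.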
[0223] Preferred adjuvants for use in eliciting a predominantly
Th1-type response include, for example, a combination of
monophosphoryl lipid A, preferably 3-de-O-acylated monophosphoryl
lipid A (3D-MPL), together with an aluminum salt. MPL adjuvants are
available from Corixa Corporation (Seattle, Wash.; see U.S. Pat.
Nos. 4,436,727; 4,877,611; 4,866,034 and 4,912,094). CpG-containing
oligonucleotides (in which the CpG dinucleotide is unmethylated)
also induce a predominantly Th1 response. Such oligonucleotides are
well known and are described, for example, in WO 96/02555, WO
99/33488 and U.S. Pat. Nos. 6,008,200 and 5,856,462.
Immunostimulatory DNA sequences are also described, for example, by
Sato et al., Science 273:352 (1996). Another preferred adjuvant
comprises a saponin, such as Quil A, or derivatives thereof,
including QS21 and QS7 (Aquila Biopharmaceuticals Inc., Framingham,
Mass.); Escin; Digitonin; or Gypsophila or Chenopodium quinoa
saponins. Other preferred formulations include more than one
saponin in the adjuvant combinations of the present invention, for
example combinations of at least two of the following group
comprising QS21, QS7, Quil A, β-escin, or digitonin.
[0224] Alternatively the saponin formulations may be combined with
vaccine vehicles composed of chitosan or other polycationic
polymers, polylactide and polylactide-co-glycolide particles,
poly-N-acetyl glucosamine-based polymer matrix, particles composed
of polysaccharides or chemically modified polysaccharides,
liposomes and lipid-based particles, particles composed of glycerol
monoesters, etc. The saponins may also be formulated in the
presence of cholesterol to form particulate structures such as
liposomes or ISCOMs. Furthermore, the saponins may be formulated
together with a polyoxyethylene ether or ester, in either a
non-particulate solution or suspension, or in a particulate
structure such as a paucilamellar liposome or ISCOM. The saponins
may also be formulated with excipients such as Carbopol® to
increase viscosity, or may be formulated in a dry powder form with
a powder excipient such as lactose.
[0225] In one preferred embodiment, the adjuvant system includes
the combination of a monophosphoryl lipid A and a saponin
derivative, such as the combination of QS21 and 3D-MPL®
adjuvant, as described in WO 94/00153, or a less reactogenic
composition where the QS21 is quenched with cholesterol, as
described in WO 96/33739. Other preferred formulations comprise an
oil-in-water emulsion and tocopherol. Another particularly
preferred adjuvant formulation employing QS21, 3D-MPL® adjuvant
and tocopherol in an oil-in-water emulsion is described in WO
95/17210.
[0226] Another enhanced adjuvant system involves the combination of
a CpG-containing oligonucleotide and a saponin derivative,
particularly the combination of CpG and QS21 as disclosed in WO
00/09159. Preferably the formulation additionally comprises an oil
in water emulsion and tocopherol.
[0227] Other preferred adjuvants include Montanide ISA 720 (Seppic,
France), SAF (Chiron, Calif., United States), ISCOMS (CSL), MF-59
(Chiron), the SBAS series of adjuvants (e.g., SBAS-2, AS2', AS2'',
SBAS4, or SBAS6, available from SmithKline Beecham, Rixensart,
Belgium), Detox (Corixa, Hamilton, Mont.), RC-529 (Corixa,
Hamilton, Mont.) and other aminoalkyl glucosaminide 4-phosphates
(AGPs), such as those described in pending U.S. patent application
Ser. Nos. 08/853,826 and 09/074,720, the disclosures of which are
incorporated herein by reference in their entireties, and
polyoxyethylene ether adjuvants such as those described in WO
99/52549A1.
[0228] Other preferred adjuvants include adjuvant molecules of the
general formula (I): HO(CH₂CH₂O)ₙ-A-R,
wherein n is 1-50, A is a bond or -C(O)-, R is C₁₋₅₀ alkyl
or phenyl C₁₋₅₀ alkyl.
[0229] One embodiment of the present invention consists of a
vaccine formulation comprising a polyoxyethylene ether of general
formula (I), wherein n is between 1 and 50, preferably 4-24, most
preferably 9; the R component is C₁₋₅₀, preferably
C₄-C₂₀ alkyl and most preferably C₁₂ alkyl, and A
is a bond. The concentration of the polyoxyethylene ethers should
be in the range 0.1-20%, preferably from 0.1-10%, and most
preferably in the range 0.1-1%. Preferred polyoxyethylene ethers
are selected from the following group: polyoxyethylene-9-lauryl
ether, polyoxyethylene-9-stearyl ether, polyoxyethylene-8-stearyl
ether, polyoxyethylene-4-lauryl ether, polyoxyethylene-35-lauryl
ether, and polyoxyethylene-23-lauryl ether. Polyoxyethylene ethers
such as polyoxyethylene lauryl ether are described in the Merck
Index (12th edition: entry 7717). These adjuvant molecules are
described in WO 99/52549.
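The nested ranges given above for n, for the alkyl chain length of R, and for the concentration can be checked mechanically. The following is a minimal sketch, assuming that every stated range is inclusive at both ends (the text does not say) and covering only the tiers that are stated as numeric ranges; the function names and tier labels are hypothetical.

```python
def classify_n(n: int) -> str:
    """Tier of the oxyethylene repeat count n in formula (I)."""
    if n == 9:
        return "most preferred"
    if 4 <= n <= 24:
        return "preferred"
    if 1 <= n <= 50:
        return "within formula (I)"
    return "outside formula (I)"


def classify_alkyl_carbons(carbons: int) -> str:
    """Tier of the alkyl chain length of R (number of carbon atoms)."""
    if 4 <= carbons <= 20:
        return "preferred"
    if 1 <= carbons <= 50:
        return "within formula (I)"
    return "outside formula (I)"


def classify_concentration(percent: float) -> str:
    """Tier of the polyoxyethylene ether concentration, in percent."""
    if 0.1 <= percent <= 1.0:
        return "most preferred"
    if 0.1 <= percent <= 10.0:
        return "preferred"
    if 0.1 <= percent <= 20.0:
        return "within stated range"
    return "outside stated range"
```

As an illustration, polyoxyethylene-9-lauryl ether has n = 9 and a lauryl (12-carbon) chain, so it falls in the most preferred tier for n and the preferred tier for chain length. Polyoxyethylene-35-lauryl ether, although listed among the preferred ethers, has n = 35, which lies within formula (I) but outside the 4-24 range.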
[0230] The polyoxyethylene ether according to the general formula
(I) above may, if desired, be combined with another adjuvant. For
example, a preferred adjuvant combination is with CpG as
described in the pending UK patent application GB 9820956.2.
[0231] Any vaccine provided herein may be prepared using well known
methods that result in a combination of antigen, immune response
enhancer and a suitable carrier or excipient. The compositions
described herein may be administered as part of a sustained release
formulation (i.e., a formulation such as a capsule, sponge or gel
(composed of polysaccharides, for example) that effects a slow
release of compound following administration). Such formulations
may generally be prepared using well known technology (see, e.g.,
Coombes et al., Vaccine 14:1429-1438 (1996)) and administered by,
for example, oral, rectal or subcutaneous implantation, or by
implantation at the desired target site. Sustained-release
formulations may contain a polypeptide, polynucleotide or antibody
dispersed in a carrier matrix and/or contained within a reservoir
surrounded by a rate controlling membrane.
[0232] Carriers for use within such formulations are biocompatible,
and may also be biodegradable; preferably the formulation provides
a relatively constant level of active component release. Such
carriers include microparticles of poly(lactide-co-glycolide),
polyacrylate, latex, starch, cellulose, dextran and the like. Other
delayed-release carriers include supramolecular biovectors, which
comprise a non-liquid hydrophilic core (e.g., a cross-linked
polysaccharide or oligosaccharide) and, optionally, an external
layer comprising an amphiphilic compound, such as a phospholipid
(see, e.g., U.S. Pat. No. 5,151,254 and PCT applications WO
94/20078, WO 94/23701 and WO 96/06638). The amount of active
compound contained within a sustained release formulation depends
upon the site of implantation, the rate and expected duration of
release and the nature of the condition to be treated or
prevented.
[0233] Any of a variety of delivery vehicles may be employed within
pharmaceutical compositions and vaccines to facilitate production
of an antigen-specific immune response that targets tumor cells.
Delivery vehicles include antigen presenting cells (APCs), such as
dendritic cells, macrophages, B cells, monocytes and other cells
that may be engineered to be efficient APCs. Such cells may, but
need not, be genetically modified to increase the capacity for
presenting the antigen, to improve activation and/or maintenance of
the T cell response, to have anti-tumor effects per se and/or to be
immunologically compatible with the receiver (i.e., matched HLA
haplotype). APCs may generally be isolated from any of a variety of
biological fluids and organs, including tumor and peritumoral
tissues, and may be autologous, allogeneic, syngeneic or xenogeneic
cells.
[0234] Certain preferred embodiments of the present invention use
dendritic cells or progenitors thereof as antigen-presenting cells.
Dendritic cells are highly potent APCs (Banchereau & Steinman,
Nature 392:245-251 (1998)) and have been shown to be effective as a
physiological adjuvant for eliciting prophylactic or therapeutic
antitumor immunity (see Timmerman & Levy, Ann. Rev. Med.
50:507-529 (1999)). In general, dendritic cells may be identified
based on their typical shape (stellate in situ, with marked
cytoplasmic processes (dendrites) visible in vitro), their ability
to take up, process and present antigens with high efficiency and
their ability to activate naive T cell responses. Dendritic cells
may, of course, be engineered to express specific cell-surface
receptors or ligands that are not commonly found on dendritic cells
in vivo or ex vivo, and such modified dendritic cells are
contemplated by the present invention. As an alternative to
dendritic cells, vesicles secreted by antigen-loaded dendritic cells
(called exosomes) may be used within a vaccine (see Zitvogel et
al., Nature Med. 4:594-600 (1998)).
[0235] Dendritic cells and progenitors may be obtained from
peripheral blood, bone marrow, tumor-infiltrating cells,
peritumoral tissues-infiltrating cells, lymph nodes, spleen, skin,
umbilical cord blood or any other suitable tissue or fluid. For
example, dendritic cells may be differentiated ex vivo by adding a
combination of cytokines such as GM-CSF, IL-4, IL-13 and/or
TNF.alpha. to cultures of monocytes harvested from peripheral
blood. Alternatively, CD34 positive cells harvested from peripheral
blood, umbilical cord blood or bone marrow may be differentiated
into dendritic cells by adding to the culture medium combinations
of GM-CSF, IL-3, TNF.alpha., CD40 ligand, LPS, flt3 ligand and/or
other compound(s) that induce differentiation, maturation and
proliferation of dendritic cells.
[0236] Dendritic cells are conveniently categorized as "immature"
and "mature" cells, which allows a simple way to discriminate
between two well characterized phenotypes. However, this
nomenclature should not be construed to exclude all possible
intermediate stages of differentiation. Immature dendritic cells
are characterized as APC with a high capacity for antigen uptake
and processing, which correlates with the high expression of
Fc.gamma. receptor and mannose receptor. The mature phenotype is
typically characterized by a lower expression of these markers, but
a high expression of cell surface molecules responsible for T cell
activation such as class I and class II MHC, adhesion molecules
(e.g., CD54 and CD11) and costimulatory molecules (e.g., CD40,
CD80, CD86 and 4-1BB).
[0237] APCs may generally be transfected with a polynucleotide
encoding a protein (or portion or other variant thereof) such that
the polypeptide, or an immunogenic portion thereof, is expressed on
the cell surface. Such transfection may take place ex vivo, and a
composition or vaccine comprising such transfected cells may then
be used for therapeutic purposes, as described herein.
Alternatively, a gene delivery vehicle that targets a dendritic or
other antigen presenting cell may be administered to a patient,
resulting in transfection that occurs in vivo. In vivo and ex vivo
transfection of dendritic cells, for example, may generally be
performed using any methods known in the art, such as those
described in WO 97/24447, or the gene gun approach described by
Mahvi et al., Immunology and Cell Biology 75:456-460 (1997).
Antigen loading of dendritic cells may be achieved by incubating
dendritic cells or progenitor cells with the polypeptide, DNA
(naked or within a plasmid vector) or RNA; or with
antigen-expressing recombinant bacterium or viruses (e.g.,
vaccinia, fowlpox, adenovirus or lentivirus vectors). Prior to
loading, the polypeptide may be covalently conjugated to an
immunological partner that provides T cell help (e.g., a carrier
molecule). Alternatively, a dendritic cell may be pulsed with a
non-conjugated immunological partner, separately or in the presence
of the polypeptide.
[0238] Vaccines and pharmaceutical compositions may be presented in
unit-dose or multi-dose containers, such as sealed ampoules or
vials. Such containers are preferably hermetically sealed to
preserve sterility of the formulation until use. In general,
formulations may be stored as suspensions, solutions or emulsions
in oily or aqueous vehicles. Alternatively, a vaccine or
pharmaceutical composition may be stored in a freeze-dried
condition requiring only the addition of a sterile liquid carrier
immediately prior to use.
Diagnostic Kits
[0239] The present invention further provides kits for use within
any of the above diagnostic methods. Such kits typically comprise
two or more components necessary for performing a diagnostic assay.
Components may be compounds, reagents, containers and/or equipment.
For example, one container within a kit may contain a monoclonal
antibody or fragment thereof that specifically binds to a protein.
Such antibodies or fragments may be provided attached to a support
material, as described above. One or more additional containers may
enclose elements, such as reagents or buffers, to be used in the
assay. Such kits may also, or alternatively, contain a detection
reagent as described above that contains a reporter group suitable
for direct or indirect detection of antibody binding.
[0240] Alternatively, a kit may be designed to detect the level of
mRNA encoding a protein in a biological sample. Such kits generally
comprise at least one oligonucleotide probe or primer, as described
above, that hybridizes to a polynucleotide encoding a protein. Such
an oligonucleotide may be used, for example, within a PCR or
hybridization assay. Additional components that may be present
within such kits include a second oligonucleotide and/or a
diagnostic reagent or container to facilitate the detection of a
polynucleotide encoding a protein of the invention.
[0241] All publications and patent applications cited in this
specification are herein incorporated by reference as if each
individual publication or patent application were specifically and
individually indicated to be incorporated by reference.
[0242] Although the foregoing invention has been described in some
detail by way of illustration and example for purposes of clarity
of understanding, it will be readily apparent to one of ordinary
skill in the art in light of the teachings of this invention that
certain changes and modifications may be made thereto without
departing from the spirit or scope of the appended claims.
EXAMPLES
[0243] The following examples are provided by way of illustration
only and not by way of limitation. Those of skill in the art will
readily recognize a variety of noncritical parameters that could be
changed or modified to yield essentially similar results.
Example 1
Guinea Pig Vaccination with MTB72F Fusion Protein and Compositions
with Individual Antigens
[0244] Guinea pigs were immunized with adjuvant alone (SBAS1,
SBAS2, or SBAS7 plus Al(OH).sub.3), MTB72F fusion protein in
adjuvant, or TbH9 plus Ra35 antigen composition.
[0245] Methods:
TABLE-US-00001
Groups:
 1) SBAS1
 2) SBAS2
 3) SBAS7 + Al(OH)3
 4) TbH9 + Ra35 + SBAS1
 5) TbH9 + Ra35 + SBAS2
 6) TbH9 + Ra35 + SBAS7(Al(OH)3)
 7) MTB72F in SBAS1
 8) MTB72F in SBAS2
 9) MTB72F in SBAS7 + Al(OH)3
10) PBS
11) BCG
[0246] Dosage:
[0247] 4 .mu.g each of TbH9 and Ra35
[0248] 8 .mu.g MTB72F
[0249] Protocol: 1st immunization, 2nd immunization approximately 3
weeks later, 3rd immunization approximately two and a half weeks
later.
[0250] Pre-challenge: DTH (delayed type hypersensitivity, used to
determine antigenicity; 10 .mu.g antigen)
[0251] Challenge: Aerosol with .about.30 cfu Erdman strain
[0252] Post challenge monitoring:
[0253] Weight loss
[0254] Death (.about.6 months post challenge)
[0255] Results:
[0256] 1. DTH
[0257] Positive reaction to the immunizing antigens. Reactions to
individual antigens or the fusion protein were comparable. Skin
test reactivity to PPD was seen only in the BCG-immunized
groups.
[0258] 2. Protection: Guinea pigs vaccinated with MTB72F fusion
protein were afforded protection, as compared with those immunized
with a mixture of antigens (see FIG. 1).
Example 2
Mouse Vaccination with MTB72F Fusion Protein and Compositions with
Individual Antigens
[0259] As described above, mice were immunized with adjuvant alone
(SBAS2, SBAS2', SBAS2'', or SBAS6), MTB72F fusion protein in
adjuvant, MTB72F DNA, MTB59F fusion protein in adjuvant, or TbH9,
Ra35 and Ra12 antigen composition.
[0260] Methods:
TABLE-US-00002
Groups:
 1) MTB72F + SBAS2
 2) MTB72F + SBAS2'
 3) MTB72F + SBAS2''
 4) MTB72F + SBAS6
 5) Ra12 + TbH9 + Ra35 in SBAS2
 6) MTB59F in SBAS2
 7) SBAS2
 8) MTB72F + delipidated, deglycolipidated M. vaccae
 9) MTB72F DNA
10) MTB72F + IFA
11) MTB72F + BCG
12) delipidated, deglycolipidated M. vaccae
13) BCG
14) Saline
15) MTB72F + SBAS2 (in house formulation)
[0261] 8 Animals Per Group
[0262] Immunization schedule: First immunization, second
immunization approximately 3 weeks later; third immunization
approximately three weeks later.
[0263] Aerosol challenge approximately three months after first
dose.
[0264] Spleen or lung cells were isolated and cultured; CFU in the
cultures were counted approximately three weeks after plating.
[0265] Dose: 8 .mu.g MTB72F, 6.56 .mu.g MTB59F, or 1.52, 4.3, and
2.24 .mu.g, respectively, of Ra12, TbH9, and Ra35, mixed.
[0266] Results:
[0267] Of the AS adjuvants, AS2''+MTB72F gave the best protection
in both the spleen and lung in this set of experiments (see FIGS.
2A and 2B). MTB72F gave .about.1 log better protection than MTB59F
in both spleen and lung in this set of experiments, indicating that
Ra12 provides additional benefit. A mixture of Ra12/TbH9/Ra35+AS2 gave
better protection than MTB72F in this experiment. MTB72F DNA gave
the best protection in this experiment, particularly in the spleen
(>2 log). The protection was comparable in the lung to that seen
with MTB72F protein+AS2'', in this experiment.
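The "log" protection figures quoted above are differences in log10 CFU between a control group and a vaccinated group. The sketch below shows that arithmetic only. The CFU counts in it are hypothetical placeholders chosen for illustration, since this section reports the differences but not the underlying counts, and the function name is likewise an assumption.

```python
import math


def log10_protection(control_cfu: float, vaccinated_cfu: float) -> float:
    """Return the log10 reduction in CFU relative to the control group."""
    if control_cfu <= 0 or vaccinated_cfu <= 0:
        raise ValueError("CFU counts must be positive to take a logarithm")
    return math.log10(control_cfu) - math.log10(vaccinated_cfu)


# Hypothetical counts, NOT data from this specification.
saline_spleen_cfu = 1_000_000
vaccinated_spleen_cfu = 10_000

reduction = log10_protection(saline_spleen_cfu, vaccinated_spleen_cfu)
print(f"Protection: {reduction:.1f} log10 CFU")
```

On this scale a 100-fold drop in bacterial load corresponds to 2 log, and a 10-fold drop to 1 log.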
Example 3
Guinea Pig Vaccination with MTB72F Fusion Protein and Compositions
with Individual Antigens
[0268] As described above, guinea pigs were immunized with adjuvant
alone (SBAS2, SBAS2', SBAS2'', or SBAS6), MTB72F fusion protein in
adjuvant, MTB72F DNA, MTB59F fusion protein in adjuvant, or TbH9,
Ra35 and Ra12 antigen composition.
[0269] Methods:
TABLE-US-00003
Groups:
 1) MTB72F + SBAS2
 2) MTB72F + SBAS2'
 3) MTB72F + SBAS2''
 4) MTB72F + SBAS6
 5) Ra12 + TbH9 + Ra35 in SBAS2
 6) MTB59F in SBAS2
 7) SBAS2
 8) MTB72F + pvac
 9) MTB72F DNA
10) MTB72F + IFA
11) MTB72F + BCG
12) BCG
13) Saline
14) delipidated, deglycolipidated M. vaccae
[0270] Antigens:
[0271] Antigens were formulated on a molar-equivalent basis.
[0272] 5 animals per group
[0273] Injection volume per dose is 250 .mu.l (IM) containing
TABLE-US-00004
MTB72F              20 .mu.g
Ra12, TbH9, Ra35    3.8, 10.8, and 5.6 .mu.g
MTB59F              16.4 .mu.g
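The molar-equivalent doses can be checked with simple proportional scaling. The sketch below takes the guinea pig doses listed above (anchored to 20 .mu.g of MTB72F) and rescales them to the 8 .mu.g MTB72F dose used in the mouse study of Example 2. The function and variable names are illustrative assumptions. The scaling assumes that each component keeps the same mass ratio to MTB72F, which is what molar equivalence implies for fixed molecular weights.

```python
# Doses in micrograms per injection, as listed for the guinea pig study.
GUINEA_PIG_DOSE_UG = {
    "MTB72F": 20.0,
    "Ra12": 3.8,
    "TbH9": 10.8,
    "Ra35": 5.6,
    "MTB59F": 16.4,
}


def scale_doses(reference_doses: dict, target_mtb72f_ug: float) -> dict:
    """Rescale every dose so that MTB72F equals target_mtb72f_ug."""
    factor = target_mtb72f_ug / reference_doses["MTB72F"]
    return {name: round(ug * factor, 2) for name, ug in reference_doses.items()}


mouse_doses = scale_doses(GUINEA_PIG_DOSE_UG, 8.0)
for name, ug in mouse_doses.items():
    print(f"{name}: {ug} ug")
```

The rescaled values agree with the mouse doses stated in Example 2 (6.56 .mu.g MTB59F, and 1.52, 4.3 and 2.24 .mu.g of Ra12, TbH9 and Ra35), with TbH9 coming out as 4.32 before rounding to one decimal place.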
[0274] Schedule:
[0275] 1st immunization, 2nd immunization approximately three weeks
later, 3rd immunization approximately three weeks later.
[0276] Challenge: .about.one and one half months after first
immunization.
[0277] Results:
[0278] .about.38 Wks post challenge
TABLE-US-00005
Groups                             Alive     State
G1.  MTB72F + AS2                  1/5       [losing weight]
G2.  MTB72F + AS2'                 2/5       [not gaining weight]
G3.  MTB72F + AS2''                3/5       [looking okay, but no weight gain]
G4.  MTB72F + AS6                  2/5       [both these gaining weight]
G5.  MTBRa12 + H9 + Ra35 + AS2     4/5       [one maybe a bit peaked, but two gaining]
G6.  MTB59F + AS2                  2/5       [both losing a little]
G7.  AS2                           2/5       [both losing]
G8.  MTB72F + pVac                 1/5       [not looking too good]
G9.  MTB72F DNA                    3/5       [all holding steady]
G10. MTB72F + IFA                  2/5       [doing okay]
G11. MTB72F + BCG                  5/5       [eating very well]
G12  BCG                           4/5       [doing fine]
G13  Saline                        all dead
G14  pVac                          2/5       [not gaining weight]
[0279] By 50 weeks post challenge, while 80% (4/5) of the guinea
pigs immunized with BCG+Mtb72F were still alive, only 20% (1/5) of
those immunized with BCG alone were alive. At 85 weeks, 4/5 of the
guinea pigs immunized with BCG+Mtb72F were still alive and healthy
(see FIG. 7).
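The survival fractions reported for Example 3 convert to percentages as sketched below. The counts are the "Alive" values at about 38 weeks post challenge from the table above, with "all dead" entered as 0 of 5. The dictionary layout and function name are illustrative assumptions, not part of the study protocol.

```python
GROUP_SIZE = 5  # animals per group in Example 3

# Animals alive at about 38 weeks post challenge, per group.
ALIVE_AT_38_WEEKS = {
    "G1 MTB72F + AS2": 1,
    "G2 MTB72F + AS2'": 2,
    "G3 MTB72F + AS2''": 3,
    "G4 MTB72F + AS6": 2,
    "G5 MTBRa12 + H9 + Ra35 + AS2": 4,
    "G6 MTB59F + AS2": 2,
    "G7 AS2": 2,
    "G8 MTB72F + pVac": 1,
    "G9 MTB72F DNA": 3,
    "G10 MTB72F + IFA": 2,
    "G11 MTB72F + BCG": 5,
    "G12 BCG": 4,
    "G13 Saline": 0,
    "G14 pVac": 2,
}


def percent_alive(alive: int, total: int = GROUP_SIZE) -> float:
    """Convert a survivor count into a percentage of the group."""
    return 100 * alive / total


survival = {group: percent_alive(n) for group, n in ALIVE_AT_38_WEEKS.items()}

# Rank groups from highest to lowest survival.
for group, pct in sorted(survival.items(), key=lambda item: -item[1]):
    print(f"{group}: {pct:.0f}%")
```

The same helper reproduces the later time points quoted in the text: `percent_alive(4)` gives 80 and `percent_alive(1)` gives 20.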
Example 4
Long Term Protection
[0280] As described above, guinea pigs were immunized with adjuvant
alone (AS2 or AS2''), MTB72F fusion protein in adjuvant, TbH9, Ra35
and Ra12 antigen composition, or a variety of individual antigens
in adjuvant.
[0281] Methods:
TABLE-US-00006
GROUPS  ANTIGEN                                 DOSE
 1.  AS2'' + MTB39 (TbH9)                       20 .mu.g/250 .mu.l (IM)
 2.  AS2'' + MTB8.4 (DPV)                       20 .mu.g
 3.  AS2'' + MTB9.9 (MTI)                       20 .mu.g
 4.  AS2'' + MTB41 (MTCC#2)                     20 .mu.g
 5.  AS2'' + MTB40 (HTCC#1)                     20 .mu.g
 6.  AS2'' + MTB9.8 (MSL)                       20 .mu.g
 7.  AS2'' + MTB72F                             20 .mu.g
 8.  AS2'' + Ra12 + TbH9 + Ra35                 3.8 .mu.g + 10.8 .mu.g + 5.6 .mu.g
     (molar equivalent)
 9.  AS2'' + MTB71F + MTB72F + HTCC#1           20 .mu.g + 20 .mu.g + 10 .mu.g
10.  AS2'' + Ra12                               20 .mu.g
11.  BCG
12.  AS2''
13.  AS2 + MTB72F
14.  AS2 + Ra12 + TbH9 + Ra35
15.  AS2
Example 5
Monkey Vaccination with MTB72F Fusion Protein and Compositions with
Individual Antigens
[0282] As described above, monkeys were immunized with MTB72F
fusion protein in SBAS2 adjuvant, or MTB8.4 antigen composition in
adjuvant, or a mixture of MTB72F and MTB8.4.
[0283] Methods:
TABLE-US-00007
Groups
1. Saline
2. BCG
3. MTB8.4/AS2
4. MTB72F/AS2
5. MTB72F/AS2 (one arm) + MTB8.4/AS2 (other arm)
40 .mu.g each antigen
[0284] Results:
[0285] At 8 weeks post challenge, monkeys immunized with BCG are
showing signs of infection.
[0286] Current data for 16 weeks post challenge reveals the
following trend:
[0287] Groups immunized with MTB72F (4 and 5) are maintaining their
weights and have low ESR values compared to group 3 (MTB8.4
immunization) (Tables 1 and 2).
TABLE-US-00008
TABLE 1
Prophylactic Vaccine Study in Cynomolgus Monkeys with MTB8.4 and
MTB72F formulated in AS2
20 Weeks Post Challenge
                       Net weight
Groups      ID         Change (kg)   Chest X-ray (onset)                      Status
AS2         1398K      24%           Pn, bil, prog (wk 8)                     Alive
            4437B      33%           Pn, bil, prog (wk 4)                     Dead
            2959G      8.30%         Pn, bil, prog (wk 4)                     Alive
            605AE      14.00%        Pn, rt, stable (wk 8)                    Alive
BCG         3436A      15.00%        Neg                                      Alive
            3642G      Plus 4.5%     Pn, rt, prog (wk 8)                      Alive
            1190H      0%            Neg                                      Alive
            1051I      30%           Pn, rt, prog (wk 8)                      Dead
MTB8.4      3665C      25%           Pn, rt, prog (wk 8)                      Dead
            2200F      18.00%        Pn, rt, stable (wk 8)                    Alive
            1654J      33.00%        Pn, bil, prog (wk 4)                     Dead
            4141C      33%           Pn, bil, prog (wk 4)                     Dead
MTB72F      3061C*     Died after IT challenge
            1228G      Plus 3.6%     Bron, bil, stable for 3 mo (wk 8)        Alive
            3462E      2.20%         Neg                                      Alive
            4254C      Plus 1.21     Pn, rt, stable for 3 mo (wk 4)           Alive
MTB8.4/     4496A      Plus 7%       Pn, rt, stable for 1 mo (wk 8)           Alive
MTB72F      4422C      39.00%        Pn, bil, prog (wk 4)                     Dead
            4416A      Plus 11%      Pn, rt, stable for 2 mo (wk 12)          Alive
            2734E      Plus 12.5%    Susp infil rt, stable for 3 mo (wk 8)    Alive
TABLE-US-00009
TABLE 2
Prophylactic Vaccine Study in Cynomolgus Monkeys with MTB8.4 and
MTB72F formulated in AS2
                       ESR, Wks Post Challenge
Groups      ID         4     8     12    16      16 wks Chest X-ray
AS2         1398K      3     3     10    19      Pn, bil, progrsv
            4437B      10    20    3     Died
            2959G      6     3     3     0       Pn, rt, progrsv
            605AE      1     4     7     3       Pn, rt, stable
BCG         3436A      0     8     7     15      Neg
            3642G      0     0     0     0       Pn, rt, progrsv
            1190H      1     0     2     0       Neg
            1051I      0     8     22    7       Pn, bil, w/furt progrsn Died
MTB8.4      3665C      12    30    19    Died
            2200F      1     7     2     0       Pn, rt, progrsv
            1654J      20    8     21    7       Pn, bil, w/fur progrsn
            4141C      13    8     2     15      Pn, bil, w/fur progrsn
MTB72F      3061C*     Died after IT challenge
            1228G      0     1     20    0       Now stable
            3462E      0     0     0     0       Neg
            4254C      13    0     0     0       Pn, now stable
MTB8.4/     4496A      5     1     0     5       Pn, rt, w/furt prog
MTB72F      4422C      10    3     0     Died
            4416A      6     0     1     0       Pn, now stable
            2734E      0     0     0     0       Susp infil, now stable
Example 6
BCG Priming Experiment in Monkeys
[0288] Five animals per group, with four groups immunized with BCG,
rested, and then immunized as described above and challenged. The
following protocol will be used:
TABLE-US-00010
Groups       # animals   Immunizing Antigen         Antigen Dose
1. Nothing   5           AS2
2. BCG       5           AS2
3. BCG       5           MTB72F                     40 ug
4. BCG       4           Ra12 + TbH9 + Ra35         Molar equiv of antigens in MTB72F dose
5. BCG       4           MTB72F + MTB71F + MTB40    40 ug MTB72F, 40 ug MTB72F, 20 ug MTB40
All antigens are formulated in AS2. Groups 4 and 5 have four animals
each. Two of the BCG-immunized monkeys died.
TABLE-US-00011
                         Immunizing                 Antigens for T cell proliferation
Groups       # animals   Antigen                    and cytokine production assays
1. Nothing   5           AS2                        PHA, PPD, MTB72F, MTB71F, HTCC#1, DPV,
                                                    MTCC#2, Ra12, TbH9, Ra35, MSL, MTI
2. BCG       5           AS2                        PHA, PPD, MTB72F, MTB71F, HTCC#1, DPV,
                                                    MTCC#2, Ra12, TbH9, Ra35, MSL, MTI
3. BCG       5           MTB72F                     PHA, PPD, MTB72F, Ra12, TbH9, Ra35
4. BCG       4           Ra12 + TbH9 + Ra35         PHA, PPD, MTB72F, Ra12, TbH9, Ra35
5. BCG       4           MTB72F + MTB71F + MTB40    PHA, PPD, MTB72F, MTB71F, HTCC#1, DPV,
                                                    MTCC-2, Ra12, TbH9, Ra35, MSL, MTI
Example 7
Construction of Ra35MutSA and MTB72FMutSA
[0289] Expression of Mtb72f typically results in some breakdown
products. In addition, the expression of the full-length sequences
of the mature or full length form of Ra35 (Mtb32A) in E. coli has
been difficult. The expressed product was only visible after
immunoblotting with a polyclonal rabbit anti-Ra35 Ab indicative of
low levels of protein expression. Even then, multiple specific
species (bands) were detected indicative of auto-catalytic
breakdown (degradation) of the recombinant antigen. This was
presumed to be due to the expression of Ra35FL in E. coli as a
biologically active form.
[0290] It has been previously shown that it was possible to express
Ra35FL as two overlapping halves comprising the N-terminal
(Ra35N-term, called Ra35) and C-term halves (Ra35C-term called
Ra12). To enhance and stabilize the expression of the whole Ra35
molecule, a single point mutation was introduced at one of the
residues within the active-site triad (substitution of Ser to Ala;
see FIG. 6). This mutagenized form of Mtb32A can now be easily
expressed at high levels in a stable form. In addition, to
stabilize expression of Mtb72F, a single nucleotide substitution (T
to G, resulting in a Ser to Ala change at position 710 of the
fusion polypeptide) was incorporated in the sequence of Mtb72F at
nucleotide position 2128 (see FIG. 5).
[0291] This stabilization is also readily accomplished by
mutagenizing any one, any two, or all three of the three residues
comprising the active site triad in Ra35FL, Ra35, or Mtb72F or
other fusion proteins comprising Ra35 (His, Asp, or Ser).
Mutagenesis can be performed using any technique known to one of
skill in the art.
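The Ser to Ala change described above follows directly from the standard genetic code: replacing the first base of a TCN serine codon with G yields a GCN alanine codon. The sketch below illustrates this on a 24-nucleotide fragment copied from the MTB32A (Ra35FL) sequence in the sequence listing, spanning the active-site serine. The fragment boundaries, the reading frame, the index of the substituted base within the fragment, and the helper names are choices made for this illustration; they do not correspond to the nucleotide position 2128 or residue 710 numbering of the full Mtb72F construct.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitute(dna: str, index: int, new_base: str) -> str:
    """Return a copy of dna with the base at index replaced."""
    return dna[:index] + new_base + dna[index + 1:]


# Fragment from the MTB32A sequence: cag ccc ggt gat tcg ggc ggg ccc
wild_type = "cagcccggtgattcgggcgggccc"
# Index 12 (0-based, within this fragment only) is the T of the tcg codon.
mutant = substitute(wild_type, 12, "g")

print(translate(wild_type))  # wild-type peptide containing Ser
print(translate(mutant))     # mutant peptide containing Ala
```

Translating the two fragments gives Gln-Pro-Gly-Asp-Ser-Gly-Gly-Pro for the wild type and Gln-Pro-Gly-Asp-Ala-Gly-Gly-Pro for the mutant, matching the Asp-Ser and Asp-Ala residues shown in the MTB32A and Ra35FLMutSA protein sequences of the listing.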
Example 8
Immunization of Mice with Ra35FLMutSA-TbH9 and MTB72FMutSA
[0292] Eight mice per group were immunized with the compositions
listed below, which include the adjuvant AS2A. The mice were then
challenged with Mycobacterium tuberculosis, and survival of the
mice was measured.
TABLE-US-00012
Group                             Concentration of protein or DNA
 1. Mtb72f protein                1.5 mg/ml
 2. Mtb72fDNA                     1.2 mg/ml
 3. Mtb72f-85b protein            0.6 mg/ml
 4. Mtb72f-85b DNA                1.1 mg/ml
 5. Mtb72f-MTI protein            1.3 mg/ml
 6. Mtb72f-MTI DNA                1.1 mg/ml
 7. Mtb72f MutSA protein          1.7 mg/ml
 8. MTB3AMutSA-TbH9 protein       2.4 mg/ml
 9. BCG
10. AS2
11. vector alone                  1.5 mg/ml
12. saline
Sequence CWU 1
1
5011872DNAMycobacterium tuberculosisMTB32A (Ra35FL) 1gactacgttg
gtgtagaaaa atcctgccgc ccggaccctt aaggctggga caatttctga 60tagctacccc
gacacaggag gttacgggat gagcaattcg cgccgccgct cactcaggtg
120gtcatggttg ctgagcgtgc tggctgccgt cgggctgggc ctggccacgg
cgccggccca 180ggcggccccg ccggccttgt cgcaggaccg gttcgccgac
ttccccgcgc tgcccctcga 240cccgtccgcg atggtcgccc aagtggcgcc
acaggtggtc aacatcaaca ccaaactggg 300ctacaacaac gccgtgggcg
ccgggaccgg catcgtcatc gatcccaacg gtgtcgtgct 360gaccaacaac
cacgtgatcg cgggcgccac cgacatcaat gcgttcagcg tcggctccgg
420ccaaacctac ggcgtcgatg tggtcgggta tgaccgcacc caggatgtcg
cggtgctgca 480gctgcgcggt gccggtggcc tgccgtcggc ggcgatcggt
ggcggcgtcg cggttggtga 540gcccgtcgtc gcgatgggca acagcggtgg
gcagggcgga acgccccgtg cggtgcctgg 600cagggtggtc gcgctcggcc
aaaccgtgca ggcgtcggat tcgctgaccg gtgccgaaga 660gacattgaac
gggttgatcc agttcgatgc cgcaatccag cccggtgatt cgggcgggcc
720cgtcgtcaac ggcctaggac aggtggtcgg tatgaacacg gccgcgtccg
ataacttcca 780gctgtcccag ggtgggcagg gattcgccat tccgatcggg
caggcgatgg cgatcgcggg 840ccaaatccga tcgggtgggg ggtcacccac
cgttcatatc gggcctaccg ccttcctcgg 900cttgggtgtt gtcgacaaca
acggcaacgg cgcacgagtc caacgcgtgg tcggaagcgc 960tccggcggca
agtctcggca tctccaccgg cgacgtgatc accgcggtcg acggcgctcc
1020gatcaactcg gccaccgcga tggcggacgc gcttaacggg catcatcccg
gtgacgtcat 1080ctcggtgaac tggcaaacca agtcgggcgg cacgcgtaca
gggaacgtga cattggccga 1140gggacccccg gcctgatttg tcgcggatac
cacccgccgg ccggccaatt ggattggcgc 1200cagccgtgat tgccgcgtga
gcccccgagt tccgtctccc gtgcgcgtgg cattgtggaa 1260gcaatgaacg
aggcagaaca cagcgttgag caccctcccg tgcagggcag ttacgtcgaa
1320ggcggtgtgg tcgagcatcc ggatgccaag gacttcggca gcgccgccgc
cctgcccgcc 1380gatccgacct ggtttaagca cgccgtcttc tacgaggtgc
tggtccgggc gttcttcgac 1440gccagcgcgg acggttccgn cgatctgcgt
ggactcatcg atcgcctcga ctacctgcag 1500tggcttggca tcgactgcat
ctgttgccgc cgttcctacg actcaccgct gcgcgacggc 1560ggttacgaca
ttcgcgactt ctacaaggtg ctgcccgaat tcggcaccgt cgacgatttc
1620gtcgccctgg tcgacaccgc tcaccggcga ggtatccgca tcatcaccga
cctggtgatg 1680aatcacacct cggagtcgca cccctggttt caggagtccc
gccgcgaccc agacggaccg 1740tacggtgact attacgtgtg gagcgacacc
agcgagcgct acaccgacgc ccggatcatc 1800ttcgtcgaca ccgaagagtc
gaactggtca ttcgatcctg tccgccgaca gttnctactg 1860gcaccgattc tt
18722355PRTMycobacterium tuberculosisMTB32A (Ra35FL) 2Met Ser Asn
Ser Arg Arg Arg Ser Leu Arg Trp Ser Trp Leu Leu Ser1 5 10 15Val Leu
Ala Ala Val Gly Leu Gly Leu Ala Thr Ala Pro Ala Gln Ala20 25 30Ala
Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu35 40
45Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Ala Pro Gln Val Val50
55 60Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly
Thr65 70 75 80Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn
Asn His Val85 90 95Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val
Gly Ser Gly Gln100 105 110Thr Tyr Gly Val Asp Val Val Gly Tyr Asp
Arg Thr Gln Asp Val Ala115 120 125Val Leu Gln Leu Arg Gly Ala Gly
Gly Leu Pro Ser Ala Ala Ile Gly130 135 140Gly Gly Val Ala Val Gly
Glu Pro Val Val Ala Met Gly Asn Ser Gly145 150 155 160Gly Gln Gly
Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala Leu165 170 175Gly
Gln Thr Val Gln Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr180 185
190Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln Pro Gly Asp
Ser195 200 205Gly Gly Pro Val Val Asn Gly Leu Gly Gln Val Val Gly
Met Asn Thr210 215 220Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly
Gly Gln Gly Phe Ala225 230 235 240Ile Pro Ile Gly Gln Ala Met Ala
Ile Ala Gly Gln Ile Arg Ser Gly245 250 255Gly Gly Ser Pro Thr Val
His Ile Gly Pro Thr Ala Phe Leu Gly Leu260 265 270Gly Val Val Asp
Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val Val275 280 285Gly Ser
Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val Ile290 295
300Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala
Asp305 310 315 320Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser
Val Asn Trp Gln325 330 335Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn
Val Thr Leu Ala Glu Gly340 345 350Pro Pro
Ala35531002DNAMycobacterium tuberculosisMTB32A (Ra35 mature)
3catatgcatc accatcacca tcacgccccg ccggccttgt cgcaggaccg gttcgccgac
60ttccccgcgc tgcccctcga cccgtccgcg atggtcgccc aagtggggcc acaggtggtc
120aacatcaaca ccaaactggg ctacaacaac gccgtgggcg ccgggaccgg
catcgtcatc 180gatcccaacg gtgtcgtgct gaccaacaac cacgtgatcg
cgggcgccac cgacatcaat 240gcgttcagcg tcggctccgg ccaaacctac
ggcgtcgatg tggtcgggta tgaccgcacc 300caggatgtcg cggtgctgca
gctgcgcggt gccggtggcc tgccgtcggc ggcgatcggt 360ggcggcgtcg
cggttggtga gcccgtcgtc gcgatgggca acagcggtgg gcagggcgga
420acgccccgtg cggtgcctgg cagggtggtc gcgctcggcc aaaccgtgca
ggcgtcggat 480tcgctgaccg gtgccgaaga gacattgaac gggttgatcc
agttcgatgc cgcgatccag 540cccggtgagg cgggcgggcc cgtcgtcaac
ggcctaggac aggtggtcgg tatgaacacg 600gccgcgtccg ataacttcca
gctgtcccag ggtgggcagg gattcgccat tccgatcggg 660caggcgatgg
cgatcgcggg ccagatccga tcgggtgggg ggtcacccac cgttcatatc
720gggcctaccg ccttcctcgg cttgggtgtt gtcgacaaca acggcaacgg
cgcacgagtc 780caacgcgtgg tcgggagcgc tccggcggca agtctcggca
tctccaccgg cgacgtgatc 840accgcggtcg acggcgctcc gatcaactcg
gccaccgcga tggcggacgc gcttaacggg 900catcatcccg gtgacgtcat
ctcggtgacc tggcaaacca agtcgggcgg cacgcgtaca 960gggaacgtga
cattggccga gggacccccg gcctgagaat tc 10024330PRTMycobacterium
tuberculosisMTB32A (Ra35 mature) 4Met His His His His His His Ala
Pro Pro Ala Leu Ser Gln Asp Arg1 5 10 15Phe Ala Asp Phe Pro Ala Leu
Pro Leu Asp Pro Ser Ala Met Val Ala20 25 30Gln Val Gly Pro Gln Val
Val Asn Ile Asn Thr Lys Leu Gly Tyr Asn35 40 45Asn Ala Val Gly Ala
Gly Thr Gly Ile Val Ile Asp Pro Asn Gly Val50 55 60Val Leu Thr Asn
Asn His Val Ile Ala Gly Ala Thr Asp Ile Asn Ala65 70 75 80Phe Ser
Val Gly Ser Gly Gln Thr Tyr Gly Val Asp Val Val Gly Tyr85 90 95Asp
Arg Thr Gln Asp Val Ala Val Leu Gln Leu Arg Gly Ala Gly Gly100 105
110Leu Pro Ser Ala Ala Ile Gly Gly Gly Val Ala Val Gly Glu Pro
Val115 120 125Val Ala Met Gly Asn Ser Gly Gly Gln Gly Gly Thr Pro
Arg Ala Val130 135 140Pro Gly Arg Val Val Ala Leu Gly Gln Thr Val
Gln Ala Ser Asp Ser145 150 155 160Leu Thr Gly Ala Glu Glu Thr Leu
Asn Gly Leu Ile Gln Phe Asp Ala165 170 175Ala Ile Gln Pro Gly Asp
Ser Gly Gly Pro Val Val Asn Gly Leu Gly180 185 190Gln Val Val Gly
Met Asn Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser195 200 205Gln Gly
Gly Gln Gly Phe Ala Ile Pro Ile Gly Gln Ala Met Ala Ile210 215
220Ala Gly Gln Ile Arg Ser Gly Gly Gly Ser Pro Thr Val His Ile
Gly225 230 235 240Pro Thr Ala Phe Leu Gly Leu Gly Val Val Asp Asn
Asn Gly Asn Gly245 250 255Ala Arg Val Gln Arg Val Val Gly Ser Ala
Pro Ala Ala Ser Leu Gly260 265 270Ile Ser Thr Gly Asp Val Ile Thr
Ala Val Asp Gly Ala Pro Ile Asn275 280 285Ser Ala Thr Ala Met Ala
Asp Ala Leu Asn Gly His His Pro Gly Asp290 295 300Val Ile Ser Val
Thr Trp Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly305 310 315 320Asn
Val Thr Leu Ala Glu Gly Pro Pro Ala325 33051002DNAArtificial
SequenceDescription of Artificial SequenceRa35FLMutSA 5catatgcatc
accatcacca tcacgccccg ccggccttgt cgcaggaccg gttcgccgac 60ttccccgcgc
tgcccctcga cccgtccgcg atggtcgccc aagtggggcc acaggtggtc
120aacatcaaca ccaaactggg ctacaacaac gccgtgggcg ccgggaccgg
catcgtcatc 180gatcccaacg gtgtcgtgct gaccaacaac cacgtgatcg
cgggcgccac cgacatcaat 240gcgttcagcg tcggctccgg ccaaacctac
ggcgtcgatg tggtcgggta tgaccgcacc 300caggatgtcg cggtgctgca
gctgcgcggt gccggtggcc tgccgtcggc ggcgatcggt 360ggcggcgtcg
cggttggtga gcccgtcgtc gcgatgggca acagcggtgg gcagggcgga
420acgccccgtg cggtgcctgg cagggtggtc gcgctcggcc aaaccgtgca
ggcgtcggat 480tcgctgaccg gtgccgaaga gacattgaac gggttgatcc
agttcgatgc cgcgatccag 540cccggtgatg cgggcgggcc cgtcgtcaac
ggcctaggac aggtggtcgg tatgaacacg 600gccgcgtccg ataacttcca
gctgtcccag ggtgggcagg gattcgccat tccgatcggg 660caggcgatgg
cgatcgcggg ccagatccga tcgggtgggg ggtcacccac cgttcatatc
720gggcctaccg ccttcctcgg cttgggtgtt gtcgacaaca acggcaacgg
cgcacgagtc 780caacgcgtgg tcgggagcgc tccggcggca agtctcggca
tctccaccgg cgacgtgatc 840accgcggtcg acggcgctcc gatcaactcg
gccaccgcga tggcggacgc gcttaacggg 900catcatcccg gtgacgtcat
ctcggtgacc tggcaaacca agtcgggcgg cacgcgtaca 960gggaacgtga
cattggccga gggacccccg gcctgagaat tc 10026330PRTArtificial
SequenceDescription of Artificial SequenceRa35FLMutSA 6Met His His
His His His His Ala Pro Pro Ala Leu Ser Gln Asp Arg1 5 10 15Phe Ala
Asp Phe Pro Ala Leu Pro Leu Asp Pro Ser Ala Met Val Ala20 25 30Gln
Val Gly Pro Gln Val Val Asn Ile Asn Thr Lys Leu Gly Tyr Asn35 40
45Asn Ala Val Gly Ala Gly Thr Gly Ile Val Ile Asp Pro Asn Gly Val50
55 60Val Leu Thr Asn Asn His Val Ile Ala Gly Ala Thr Asp Ile Asn
Ala65 70 75 80Phe Ser Val Gly Ser Gly Gln Thr Tyr Gly Val Asp Val
Val Gly Tyr85 90 95Asp Arg Thr Gln Asp Val Ala Val Leu Gln Leu Arg
Gly Ala Gly Gly100 105 110Leu Pro Ser Ala Ala Ile Gly Gly Gly Val
Ala Val Gly Glu Pro Val115 120 125Val Ala Met Gly Asn Ser Gly Gly
Gln Gly Gly Thr Pro Arg Ala Val130 135 140Pro Gly Arg Val Val Ala
Leu Gly Gln Thr Val Gln Ala Ser Asp Ser145 150 155 160Leu Thr Gly
Ala Glu Glu Thr Leu Asn Gly Leu Ile Gln Phe Asp Ala165 170 175Ala
Ile Gln Pro Gly Asp Ala Gly Gly Pro Val Val Asn Gly Leu Gly180 185
190Gln Val Val Gly Met Asn Thr Ala Ala Ser Asp Asn Phe Gln Leu
Ser195 200 205Gln Gly Gly Gln Gly Phe Ala Ile Pro Ile Gly Gln Ala
Met Ala Ile210 215 220Ala Gly Gln Ile Arg Ser Gly Gly Gly Ser Pro
Thr Val His Ile Gly225 230 235 240Pro Thr Ala Phe Leu Gly Leu Gly
Val Val Asp Asn Asn Gly Asn Gly245 250 255Ala Arg Val Gln Arg Val
Val Gly Ser Ala Pro Ala Ala Ser Leu Gly260 265 270Ile Ser Thr Gly
Asp Val Ile Thr Ala Val Asp Gly Ala Pro Ile Asn275 280 285Ser Ala
Thr Ala Met Ala Asp Ala Leu Asn Gly His His Pro Gly Asp290 295
300Val Ile Ser Val Thr Trp Gln Thr Lys Ser Gly Gly Thr Arg Thr
Gly305 310 315 320Asn Val Thr Leu Ala Glu Gly Pro Pro Ala325
3307585DNAMycobacterium tuberculosisRa35 N-terminus of MTB32A
(Ra35FL) 7gccccgccgg ccttgtcgca ggaccggttc gccgacttcc ccgcgctgcc
cctcgacccg 60tccgcgatgg tcgcccaagt ggggccacag gtggtcaaca tcaacaccaa
actgggctac 120aacaacgccg tgggcgccgg gaccggcatc gtcatcgatc
ccaacggtgt cgtgctgacc 180aacaaccacg tgatcgcggg cgccaccgac
atcaatgcgt tcagcgtcgg ctccggccaa 240acctacggcg tcgatgtggt
cgggtatgac cgcacccagg atgtcgcggt gctgcagctg 300cgcggtgccg
gtggcctgcc gtcggcggcg atcggtggcg gcgtcgcggt tggtgagccc
360gtcgtcgcga tgggcaacag cggtgggcag ggcggaacgc cccgtgcggt
gcctggcagg 420gtggtcgcgc tcggccaaac cgtgcaggcg tcggattcgc
tgaccggtgc cgaagagaca 480ttgaacgggt tgatccagtt cgatgccgcg
atccagcccg gtgaggcggg cgggcccgtc 540gtcaacggcc taggacaggt
ggtcggtatg aacacggccg cgtcc 5858195PRTMycobacterium
tuberculosisRa35 N-terminus of MTB32A (Ra35FL) 8Ala Pro Pro Ala Leu
Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu1 5 10 15Pro Leu Asp Pro
Ser Ala Met Val Ala Gln Val Gly Pro Gln Val Val20 25 30Asn Ile Asn
Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly Thr35 40 45Gly Ile
Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val50 55 60Ile
Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val Gly Ser Gly Gln65 70 75
80Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp Val Ala85
90 95Val Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile
Gly100 105 110Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met Gly
Asn Ser Gly115 120 125Gly Gln Gly Gly Thr Pro Arg Ala Val Pro Gly
Arg Val Val Ala Leu130 135 140Gly Gln Thr Val Gln Ala Ser Asp Ser
Leu Thr Gly Ala Glu Glu Thr145 150 155 160Leu Asn Gly Leu Ile Gln
Phe Asp Ala Ala Ile Gln Pro Gly Asp Ser165 170 175Gly Gly Pro Val
Val Asn Gly Leu Gly Gln Val Val Gly Met Asn Thr180 185 190Ala Ala
Ser1959447DNAMycobacterium tuberculosisRa12 or MTBRa12 C-terminus
of MTB32A (Ra35FL) 9cggtatgaac acggccgcgt ccgataactt ccagctgtcc
cagggtgggc agggattcgc 60cattccgatc gggcaggcga tggcgatcgc gggccagatc
cgatcgggtg gggggtcacc 120caccgttcat atcgggccta ccgccttcct
cggcttgggt gttgtcgaca acaacggcaa 180cggcgcacga gtccaacgcg
tggtcgggag cgctccggcg gcaagtctcg gcatctccac 240cggcgacgtg
atcaccgcgg tcgacggcgc tccgatcaac tcggccaccg cgatggcgga
300cgcgcttaac gggcatcatc ccggtgacgt catctcggtg aactggcaaa
ccaagtcggg 360cggcacgcgt acagggaacg tgacattggc cgagggaccc
ccggcctgat ttcgtcgygg 420ataccacccg ccggccggcc aattgga
44710132PRTMycobacterium tuberculosisRa 12 or MTBRa12 C-terminus of
MTB32A (Ra35FL) 10Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly
Gly Gln Gly Phe1 5 10 15Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala
Gly Gln Ile Arg Ser20 25 30Gly Gly Gly Ser Pro Thr Val His Ile Gly
Pro Thr Ala Phe Leu Gly35 40 45Leu Gly Val Val Asp Asn Asn Gly Asn
Gly Ala Arg Val Gln Arg Val50 55 60Val Gly Ser Ala Pro Ala Ala Ser
Leu Gly Ile Ser Thr Gly Asp Val65 70 75 80Ile Thr Ala Val Asp Gly
Ala Pro Ile Asn Ser Ala Thr Ala Met Ala85 90 95Asp Ala Leu Asn Gly
His His Pro Gly Asp Val Ile Ser Val Asn Trp100 105 110Gln Thr Lys
Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu115 120 125Gly
Pro Pro Ala 130
11 851 DNA Mycobacterium tuberculosis MTB39 (TbH9)
11 ctgcagggtg gcgtggatga gcgtcaccgc ggggcaggcc gagctgaccg ccgcccaggt
60ccgggttgct gcggcggcct acgagacggc gtatgggctg acggtgcccc cgccggtgat
120cgccgagaac cgtgctgaac tgatgattct gatagcgacc aacctcttgg
ggcaaaacac 180cccggcgatc gcggtcaacg aggccgaata cggcgagatg
tgggcccaag acgccgccgc 240gatgtttggc tacgccgcgg cgacggcgac
ggcgacggcg acgttgctgc cgttcgagga 300ggcgccggag atgaccagcg
cgggtgggct cctcgagcag gccgccgcgg tcgaggaggc 360ctccgacacc
gccgcggcga accagttgat gaacaatgtg ccccaggcgc tgaaacagtt
420ggcccagccc acgcagggca ccacgccttc ttccaagctg ggtggcctgt
ggaagacggt 480ctcgccgcat cggtcgccga tcagcaacat ggtgtcgatg
gccaacaacc acatgtcgat 540gaccaactcg ggtgtgtcga tgaccaacac
cttgagctcg atgttgaagg gctttgctcc 600ggcggcggcc gcccaggccg
tgcaaaccgc ggcgcaaaac ggggtccggg cgatgagctc 660gctgggcagc
tcgctgggtt cttcgggtct gggcggtggg gtggccgcca acttgggtcg
720ggcggcctcg gtacggtatg gtcaccggga tggcggaaaa tatgcanagt
ctggtcggcg 780gaacggtggt ccggcgtaag gtttaccccc gttttctgga
tgcggtgaac ttcgtcaacg 840gaaacagtta c 851
12 263 PRT Mycobacterium tuberculosis MTB39 (TbH9)
12 Val Ala Trp Met Ser Val Thr Ala Gly Gln
Ala Glu Leu Thr Ala Ala1 5 10 15Gln Val Arg Val Ala Ala Ala Ala Tyr
Glu Thr Ala Tyr Gly Leu Thr20 25 30Val Pro Pro Pro Val Ile Ala Glu
Asn Arg Ala Glu Leu Met Ile Leu35 40 45Ile Ala Thr Asn Leu Leu Gly
Gln Asn Thr Pro Ala Ile Ala Val Asn50 55 60Glu Ala Glu Tyr Gly Glu
Met Trp Ala Gln Asp Ala Ala Ala Met Phe65 70 75 80Gly Tyr Ala Ala
Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro Phe85 90 95Glu Glu Ala
Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln Ala100 105 110Ala
Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gln Leu Met115 120
125Asn Asn Val Pro Gln Ala Leu Lys Gln Leu Ala Gln Pro Thr Gln
Gly130 135 140Thr Thr Pro Ser Ser Lys Leu Gly Gly
Leu Trp Lys Thr Val Ser Pro145 150 155 160His Arg Ser Pro Ile Ser
Asn Met Val Ser Met Ala Asn Asn His Met165 170 175Ser Met Thr Asn
Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met180 185 190Leu Lys
Gly Phe Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr Ala195 200
205Ala Gln Asn Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu
Gly210 215 220Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly
Arg Ala Ala225 230 235 240Ser Val Arg Tyr Gly His Arg Asp Gly Gly
Lys Tyr Ala Xaa Ser Gly245 250 255Arg Arg Asn Gly Gly Pro
Ala 260
13 3058 DNA Mycobacterium tuberculosis MTB39 (TbH9FL)
13 gatcgtaccc gtgcgagtgc tcgggccgtt tgaggatgga gtgcacgtgt ctttcgtgat
60ggcataccca gagatgttgg cggcggcggc tgacaccctg cagagcatcg gtgctaccac
120tgtggctagc aatgccgctg cggcggcccc gacgactggg gtggtgcccc
ccgctgccga 180tgaggtgtcg gcgctgactg cggcgcactt cgccgcacat
gcggcgatgt atcagtccgt 240gagcgctcgg gctgctgcga ttcatgacca
gttcgtggcc acccttgcca gcagcgccag 300ctcgtatgcg gccactgaag
tcgccaatgc ggcggcggcc agctaagcca ggaacagtcg 360gcacgagaaa
ccacgagaaa tagggacacg taatggtgga tttcggggcg ttaccaccgg
420agatcaactc cgcgaggatg tacgccggcc cgggttcggc ctcgctggtg
gccgcggctc 480agatgtggga cagcgtggcg agtgacctgt tttcggccgc
gtcggcgttt cagtcggtgg 540tctggggtct gacggtgggg tcgtggatag
gttcgtcggc gggtctgatg gtggcggcgg 600cctcgccgta tgtggcgtgg
atgagcgtca ccgcggggca ggccgagctg accgccgccc 660aggtccgggt
tgctgcggcg gcctacgaga cggcgtatgg gctgacggtg cccccgccgg
720tgatcgccga gaaccgtgct gaactgatga ttctgatagc gaccaacctc
ttggggcaaa 780acaccccggc gatcgcggtc aacgaggccg aatacggcga
gatgtgggcc caagacgccg 840ccgcgatgtt tggctacgcc gcggcgacgg
cgacggcgac ggcgacgttg ctgccgttcg 900aggaggcgcc ggagatgacc
agcgcgggtg ggctcctcga gcaggccgcc gcggtcgagg 960aggcctccga
caccgccgcg gcgaaccagt tgatgaacaa tgtgccccag gcgctgcaac
1020agctggccca gcccacgcag ggcaccacgc cttcttccaa gctgggtggc
ctgtggaaga 1080cggtctcgcc gcatcggtcg ccgatcagca acatggtgtc
gatggccaac aaccacatgt 1140cgatgaccaa ctcgggtgtg tcgatgacca
acaccttgag ctcgatgttg aagggctttg 1200ctccggcggc ggccgcccag
gccgtgcaaa ccgcggcgca aaacggggtc cgggcgatga 1260gctcgctggg
cagctcgctg ggttcttcgg gtctgggcgg tggggtggcc gccaacttgg
1320gtcgggcggc ctcggtcggt tcgttgtcgg tgccgcaggc ctgggccgcg
gccaaccagg 1380cagtcacccc ggcggcgcgg gcgctgccgc tgaccagcct
gaccagcgcc gcggaaagag 1440ggcccgggca gatgctgggc gggctgccgg
tggggcagat gggcgccagg gccggtggtg 1500ggctcagtgg tgtgctgcgt
gttccgccgc gaccctatgt gatgccgcat tctccggcgg 1560ccggctagga
gagggggcgc agactgtcgt tatttgacca gtgatcggcg gtctcggtgt
1620ttccgcggcc ggctatgaca acagtcaatg tgcatgacaa gttacaggta
ttaggtccag 1680gttcaacaag gagacaggca acatggcctc acgttttatg
acggatccgc acgcgatgcg 1740ggacatggcg ggccgttttg aggtgcacgc
ccagacggtg gaggacgagg ctcgccggat 1800gtgggcgtcc gcgcaaaaca
tttccggtgc gggctggagt ggcatggccg aggcgacctc 1860gctagacacc
atggcccaga tgaatcaggc gtttcgcaac atcgtgaaca tgctgcacgg
1920ggtgcgtgac gggctggttc gcgacgccaa caactacgag cagcaagagc
aggcctccca 1980gcagatcctc agcagctaac gtcagccgct gcagcacaat
acttttacaa gcgaaggaga 2040acaggttcga tgaccatcaa ctatcaattc
ggggatgtcg acgctcacgg cgccatgatc 2100cgcgctcagg ccgggttgct
ggaggccgag catcaggcca tcattcgtga tgtgttgacc 2160gcgagtgact
tttggggcgg cgccggttcg gcggcctgcc aggggttcat tacccagttg
2220ggccgtaact tccaggtgat ctacgagcag gccaacgccc acgggcagaa
ggtgcaggct 2280gccggcaaca acatggcgca aaccgacagc gccgtcggct
ccagctgggc ctgacaccag 2340gccaaggcca gggacgtggt gtacgagtga
agttcctcgc gtgatccttc gggtggcagt 2400ctaagtggtc agtgctgggg
tgttggtggt ttgctgcttg gcgggttctt cggtgctggt 2460cagtgctgct
cgggctcggg tgaggacctc gaggcccagg tagcgccgtc cttcgatcca
2520ttcgtcgtgt tgttcggcga ggacggctcc gacgaggcgg atgatcgagg
cgcggtcggg 2580gaagatgccc acgacgtcgg ttcggcgtcg tacctctcgg
ttgaggcgtt cctgggggtt 2640gttggaccag atttggcgcc agatctgctt
ggggaaggcg gtgaacgcca gcaggtcggt 2700gcgggcggtg tcgaggtgct
cggccaccgc ggggagtttg tcggtcagag cgtcgagtac 2760ccgatcatat
tgggcaacaa ctgattcggc gtcgggctgg tcgtagatgg agtgcagcag
2820ggtgcgcacc cacggccagg agggcttcgg ggtggctgcc atcagattgg
ctgcgtagtg 2880ggttctgcag cgctgccagg ccgctgcggg cagggtggcg
ccgatcgcgg ccaccaggcc 2940ggcgtgggcg tcgctggtga ccagcgcgac
cccggacagg ccgcgggcga ccaggtcgcg 3000gaagaacgcc agccagccgg
ccccgtcctc ggcggaggtg acctggatgc ccaggatc 3058
14 391 PRT Mycobacterium tuberculosis MTB39 (TbH9FL)
14 Met Val Asp Phe Gly Ala Leu Pro Pro
Glu Ile Asn Ser Ala Arg Met1 5 10 15Tyr Ala Gly Pro Gly Ser Ala Ser
Leu Val Ala Ala Ala Gln Met Trp20 25 30Asp Ser Val Ala Ser Asp Leu
Phe Ser Ala Ala Ser Ala Phe Gln Ser35 40 45Val Val Trp Gly Leu Thr
Val Gly Ser Trp Ile Gly Ser Ser Ala Gly50 55 60Leu Met Val Ala Ala
Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr65 70 75 80Ala Gly Gln
Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala85 90 95Ala Tyr
Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala100 105
110Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu
Gly115 120 125Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr
Gly Glu Met130 135 140Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr
Ala Ala Ala Thr Ala145 150 155 160Thr Ala Thr Ala Thr Leu Leu Pro
Phe Glu Glu Ala Pro Glu Met Thr165 170 175Ser Ala Gly Gly Leu Leu
Glu Gln Ala Ala Ala Val Glu Glu Ala Ser180 185 190Asp Thr Ala Ala
Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu195 200 205Gln Gln
Leu Ala Gln Pro Thr Gln Gly Thr Thr Pro Ser Ser Lys Leu210 215
220Gly Gly Leu Trp Lys Thr Val Ser Pro His Arg Ser Pro Ile Ser
Asn225 230 235 240Met Val Ser Met Ala Asn Asn His Met Ser Met Thr
Asn Ser Gly Val245 250 255Ser Met Thr Asn Thr Leu Ser Ser Met Leu
Lys Gly Phe Ala Pro Ala260 265 270Ala Ala Ala Gln Ala Val Gln Thr
Ala Ala Gln Asn Gly Val Arg Ala275 280 285Met Ser Ser Leu Gly Ser
Ser Leu Gly Ser Ser Gly Leu Gly Gly Gly290 295 300Val Ala Ala Asn
Leu Gly Arg Ala Ala Ser Val Gly Ser Leu Ser Val305 310 315 320Pro
Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala Ala Arg325 330
335Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Glu Arg Gly Pro
Gly340 345 350Gln Met Leu Gly Gly Leu Pro Val Gly Gln Met Gly Ala
Arg Ala Gly355 360 365Gly Gly Leu Ser Gly Val Leu Arg Val Pro Pro
Arg Pro Tyr Val Met370 375 380Pro His Ser Pro Ala Ala Gly385
390
15 2287 DNA Artificial Sequence Description of Artificial Sequence: tri-fusion protein MTB72F (Ra12-TbH9-Ra35 or MTB32-MTB39 fusion)
15 tctagaaata attttgttta ctttaagaan ganatataca t atg cat cac
cat cac 56Met His His His His1 5cat cac acg gcc gcg tcc gat aac ttc
cag ctg tcc cag ggt ggg cag 104His His Thr Ala Ala Ser Asp Asn Phe
Gln Leu Ser Gln Gly Gly Gln10 15 20gga ttc gcc att ccg atc ggg cag
gcg atg gcg atc gcg ggc cag atc 152Gly Phe Ala Ile Pro Ile Gly Gln
Ala Met Ala Ile Ala Gly Gln Ile25 30 35cga tcg ggt ggg ggg tca ccc
acc gtt cat atc ggg cct acc gcc ttc 200Arg Ser Gly Gly Gly Ser Pro
Thr Val His Ile Gly Pro Thr Ala Phe40 45 50ctc ggc ttg ggt gtt gtc
gac aac aac ggc aac ggc gca cga gtc caa 248Leu Gly Leu Gly Val Val
Asp Asn Asn Gly Asn Gly Ala Arg Val Gln55 60 65cgc gtg gtc ggg agc
gct ccg gcg gca agt ctc ggc atc tcc acc ggc 296Arg Val Val Gly Ser
Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly70 75 80 85gac gtg atc
acc gcg gtc gac ggc gct ccg atc aac tcg gcc acc gcg 344Asp Val Ile
Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala90 95 100atg gcg
gac gcg ctt aac ggg cat cat ccc ggt gac gtc atc tcg gtg 392Met Ala
Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val105 110
115acc tgg caa acc aag tcg ggc ggc acg cgt aca ggg aac gtg aca ttg
440Thr Trp Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr
Leu120 125 130gcc gag gga ccc ccg gcc gaa ttc atg gtg gat ttc ggg
gcg tta cca 488Ala Glu Gly Pro Pro Ala Glu Phe Met Val Asp Phe Gly
Ala Leu Pro135 140 145ccg gag atc aac tcc gcg agg atg tac gcc ggc
ccg ggt tcg gcc tcg 536Pro Glu Ile Asn Ser Ala Arg Met Tyr Ala Gly
Pro Gly Ser Ala Ser150 155 160 165ctg gtg gcc gcg gct cag atg tgg
gac agc gtg gcg agt gac ctg ttt 584Leu Val Ala Ala Ala Gln Met Trp
Asp Ser Val Ala Ser Asp Leu Phe170 175 180tcg gcc gcg tcg gcg ttt
cag tcg gtg gtc tgg ggt ctg acg gtg ggg 632Ser Ala Ala Ser Ala Phe
Gln Ser Val Val Trp Gly Leu Thr Val Gly185 190 195tcg tgg ata ggt
tcg tcg gcg ggt ctg atg gtg gcg gcg gcc tcg ccg 680Ser Trp Ile Gly
Ser Ser Ala Gly Leu Met Val Ala Ala Ala Ser Pro200 205 210tat gtg
gcg tgg atg agc gtc acc gcg ggg cag gcc gag ctg acc gcc 728Tyr Val
Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr Ala215 220
225gcc cag gtc cgg gtt gct gcg gcg gcc tac gag acg gcg tat ggg ctg
776Ala Gln Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly
Leu230 235 240 245acg gtg ccc ccg ccg gtg atc gcc gag aac cgt gct
gaa ctg atg att 824Thr Val Pro Pro Pro Val Ile Ala Glu Asn Arg Ala
Glu Leu Met Ile250 255 260ctg ata gcg acc aac ctc ttg ggg caa aac
acc ccg gcg atc gcg gtc 872Leu Ile Ala Thr Asn Leu Leu Gly Gln Asn
Thr Pro Ala Ile Ala Val265 270 275aac gag gcc gaa tac ggc gag atg
tgg gcc caa gac gcc gcc gcg atg 920Asn Glu Ala Glu Tyr Gly Glu Met
Trp Ala Gln Asp Ala Ala Ala Met280 285 290ttt ggc tac gcc gcg gcg
acg gcg acg gcg acg gcg acg ttg ctg ccg 968Phe Gly Tyr Ala Ala Ala
Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro295 300 305ttc gag gag gcg
ccg gag atg acc agc gcg ggt ggg ctc ctc gag cag 1016Phe Glu Glu Ala
Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln310 315 320 325gcc
gcc gcg gtc gag gag gcc tcc gac acc gcc gcg gcg aac cag ttg 1064Ala
Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gln Leu330 335
340atg aac aat gtg ccc cag gcg ctg caa cag ctg gcc cag ccc acg cag
1112Met Asn Asn Val Pro Gln Ala Leu Gln Gln Leu Ala Gln Pro Thr
Gln345 350 355ggc acc acg cct tct tcc aag ctg ggt ggc ctg tgg aag
acg gtc tcg 1160Gly Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys
Thr Val Ser360 365 370ccg cat cgg tcg ccg atc agc aac atg gtg tcg
atg gcc aac aac cac 1208Pro His Arg Ser Pro Ile Ser Asn Met Val Ser
Met Ala Asn Asn His375 380 385atg tcg atg acc aac tcg ggt gtg tcg
atg acc aac acc ttg agc tcg 1256Met Ser Met Thr Asn Ser Gly Val Ser
Met Thr Asn Thr Leu Ser Ser390 395 400 405atg ttg aag ggc ttt gct
ccg gcg gcg gcc gcc cag gcc gtg caa acc 1304Met Leu Lys Gly Phe Ala
Pro Ala Ala Ala Ala Gln Ala Val Gln Thr410 415 420gcg gcg caa aac
ggg gtc cgg gcg atg agc tcg ctg ggc agc tcg ctg 1352Ala Ala Gln Asn
Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu425 430 435ggt tct
tcg ggt ctg ggc ggt ggg gtg gcc gcc aac ttg ggt cgg gcg 1400Gly Ser
Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala440 445
450gcc tcg gtc ggt tcg ttg tcg gtg ccg cag gcc tgg gcc gcg gcc aac
1448Ala Ser Val Gly Ser Leu Ser Val Pro Gln Ala Trp Ala Ala Ala
Asn455 460 465cag gca gtc acc ccg gcg gcg cgg gcg ctg ccg ctg acc
agc ctg acc 1496Gln Ala Val Thr Pro Ala Ala Arg Ala Leu Pro Leu Thr
Ser Leu Thr470 475 480 485agc gcc gcg gaa aga ggg ccc ggg cag atg
ctg ggc ggg ctg ccg gtg 1544Ser Ala Ala Glu Arg Gly Pro Gly Gln Met
Leu Gly Gly Leu Pro Val490 495 500ggg cag atg ggc gcc agg gcc ggt
ggt ggg ctc agt ggt gtg ctg cgt 1592Gly Gln Met Gly Ala Arg Ala Gly
Gly Gly Leu Ser Gly Val Leu Arg505 510 515gtt ccg ccg cga ccc tat
gtg atg ccg cat tct ccg gca gcc ggc gat 1640Val Pro Pro Arg Pro Tyr
Val Met Pro His Ser Pro Ala Ala Gly Asp520 525 530atc gcc ccg ccg
gcc ttg tcg cag gac cgg ttc gcc gac ttc ccc gcg 1688Ile Ala Pro Pro
Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala535 540 545ctg ccc
ctc gac ccg tcc gcg atg gtc gcc caa gtg ggg cca cag gtg 1736Leu Pro
Leu Asp Pro Ser Ala Met Val Ala Gln Val Gly Pro Gln Val550 555 560
565gtc aac atc aac acc aaa ctg ggc tac aac aac gcc gtg ggc gcc ggg
1784Val Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala
Gly570 575 580acc ggc atc gtc atc gat ccc aac ggt gtc gtg ctg acc
aac aac cac 1832Thr Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr
Asn Asn His585 590 595gtg atc gcg ggc gcc acc gac atc aat gcg ttc
agc gtc ggc tcc ggc 1880Val Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe
Ser Val Gly Ser Gly600 605 610caa acc tac ggc gtc gat gtg gtc ggg
tat gac cgc acc cag gat gtc 1928Gln Thr Tyr Gly Val Asp Val Val Gly
Tyr Asp Arg Thr Gln Asp Val615 620 625gcg gtg ctg cag ctg cgc ggt
gcc ggt ggc ctg ccg tcg gcg gcg atc 1976Ala Val Leu Gln Leu Arg Gly
Ala Gly Gly Leu Pro Ser Ala Ala Ile630 635 640 645ggt ggc ggc gtc
gcg gtt ggt gag ccc gtc gtc gcg atg ggc aac agc 2024Gly Gly Gly Val
Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser650 655 660ggt ggg
cag ggc gga acg ccc cgt gcg gtg cct ggc agg gtg gtc gcg 2072Gly Gly
Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala665 670
675ctc ggc caa acc gtg cag gcg tcg gat tcg ctg acc ggt gcc gaa gag
2120Leu Gly Gln Thr Val Gln Ala Ser Asp Ser Leu Thr Gly Ala Glu
Glu680 685 690aca ttg aac ggg ttg atc cag ttc gat gcc gcg atc cag
ccc ggt gat 2168Thr Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln
Pro Gly Asp695 700 705tcg ggc ggg ccc gtc gtc aac ggc cta gga cag
gtg gtc ggt atg aac 2216Ser Gly Gly Pro Val Val Asn Gly Leu Gly Gln
Val Val Gly Met Asn710 715 720 725acg gcc gcg tcc tag gatatccatc
acactggcgg ccgctcgagc agatccggnt 2271Thr Ala Ala Sergtaacaaagc
ccgaaa 2287
16 729 PRT Artificial Sequence Description of Artificial Sequence: tri-fusion protein MTB72F (Ra12-TbH9-Ra35 or MTB32-MTB39 fusion)
16 Met His His His His His His Thr Ala Ala Ser Asp Asn Phe
Gln Leu1 5 10 15Ser Gln Gly Gly Gln Gly Phe Ala Ile Pro Ile Gly Gln
Ala Met Ala20 25 30Ile Ala Gly Gln Ile Arg Ser Gly Gly Gly Ser Pro
Thr Val His Ile35 40 45Gly Pro Thr Ala Phe Leu Gly Leu Gly Val Val
Asp Asn Asn Gly Asn50 55 60Gly Ala Arg Val Gln Arg Val Val Gly Ser
Ala Pro Ala Ala Ser Leu65 70 75 80Gly Ile Ser Thr Gly Asp Val Ile
Thr Ala Val Asp Gly Ala Pro Ile85 90 95Asn Ser Ala Thr Ala Met Ala
Asp Ala Leu Asn Gly His His Pro Gly100 105 110Asp Val Ile Ser Val
Thr Trp Gln Thr Lys Ser Gly Gly Thr Arg Thr115 120 125Gly Asn Val
Thr Leu Ala Glu Gly Pro Pro Ala Glu Phe Met Val Asp130 135 140Phe
Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met Tyr Ala Gly145 150
155 160Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Gln Met Trp Asp Ser
Val165 170 175Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
Val Val Trp180 185 190Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser
Ala Gly Leu Met Val195 200 205Ala Ala Ala Ser Pro Tyr Val Ala Trp
Met Ser Val Thr Ala Gly Gln210 215 220Ala Glu Leu Thr Ala Ala Gln
Val Arg Val Ala Ala Ala Ala Tyr Glu225 230 235 240Thr Ala Tyr Gly
Leu Thr Val Pro Pro Pro Val Ile Ala Glu Asn Arg245 250 255Ala Glu
Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr260 265
270Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met Trp Ala
Gln275 280 285Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Ala Thr Ala
Thr Ala Thr290 295 300Ala Thr Leu Leu Pro Phe Glu Glu Ala Pro Glu
Met Thr Ser Ala Gly305 310 315 320Gly Leu Leu Glu Gln Ala Ala Ala
Val Glu Glu Ala Ser Asp Thr Ala325 330 335Ala Ala Asn Gln Leu Met Asn
Asn Val Pro Gln Ala Leu Gln Gln Leu340 345 350Ala Gln Pro Thr Gln
Gly Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu355 360 365Trp Lys Thr
Val Ser Pro His Arg Ser Pro Ile Ser Asn Met Val Ser370 375 380Met
Ala Asn Asn His Met Ser Met Thr Asn Ser Gly Val Ser Met Thr385 390
395 400Asn Thr Leu Ser Ser Met Leu Lys Gly Phe Ala Pro Ala Ala Ala
Ala405 410 415Gln Ala Val Gln Thr Ala Ala Gln Asn Gly Val Arg Ala
Met Ser Ser420 425 430Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly
Gly Gly Val Ala Ala435 440 445Asn Leu Gly Arg Ala Ala Ser Val Gly
Ser Leu Ser Val Pro Gln Ala450 455 460Trp Ala Ala Ala Asn Gln Ala
Val Thr Pro Ala Ala Arg Ala Leu Pro465 470 475 480Leu Thr Ser Leu
Thr Ser Ala Ala Glu Arg Gly Pro Gly Gln Met Leu485 490 495Gly Gly
Leu Pro Val Gly Gln Met Gly Ala Arg Ala Gly Gly Gly Leu500 505
510Ser Gly Val Leu Arg Val Pro Pro Arg Pro Tyr Val Met Pro His
Ser515 520 525Pro Ala Ala Gly Asp Ile Ala Pro Pro Ala Leu Ser Gln
Asp Arg Phe530 535 540Ala Asp Phe Pro Ala Leu Pro Leu Asp Pro Ser
Ala Met Val Ala Gln545 550 555 560Val Gly Pro Gln Val Val Asn Ile
Asn Thr Lys Leu Gly Tyr Asn Asn565 570 575Ala Val Gly Ala Gly Thr
Gly Ile Val Ile Asp Pro Asn Gly Val Val580 585 590Leu Thr Asn Asn
His Val Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe595 600 605Ser Val
Gly Ser Gly Gln Thr Tyr Gly Val Asp Val Val Gly Tyr Asp610 615
620Arg Thr Gln Asp Val Ala Val Leu Gln Leu Arg Gly Ala Gly Gly
Leu625 630 635 640Pro Ser Ala Ala Ile Gly Gly Gly Val Ala Val Gly
Glu Pro Val Val645 650 655Ala Met Gly Asn Ser Gly Gly Gln Gly Gly
Thr Pro Arg Ala Val Pro660 665 670Gly Arg Val Val Ala Leu Gly Gln
Thr Val Gln Ala Ser Asp Ser Leu675 680 685Thr Gly Ala Glu Glu Thr
Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala690 695 700Ile Gln Pro Gly
Asp Ser Gly Gly Pro Val Val Asn Gly Leu Gly Gln705 710 715 720Val
Val Gly Met Asn Thr Ala Ala Ser 725
17 2190 DNA Artificial Sequence Description of Artificial Sequence: MTB72FMutSA (Ra12-TbH9-Ra35MutSA) cDNA
17 atgcatcacc atcaccatca cacggccgcg
tccgataact tccagctgtc ccagggtggg 60cagggattcg ccattccgat cgggcaggcg
atggcgatcg cgggccagat ccgatcgggt 120ggggggtcac ccaccgttca
tatcgggcct accgccttcc tcggcttggg tgttgtcgac 180aacaacggca
acggcgcacg agtccaacgc gtggtcggga gcgctccggc ggcaagtctc
240ggcatctcca ccggcgacgt gatcaccgcg gtcgacggcg ctccgatcaa
ctcggccacc 300gcgatggcgg acgcgcttaa cgggcatcat cccggtgacg
tcatctcggt gacctggcaa 360accaagtcgg gcggcacgcg tacagggaac
gtgacattgg ccgagggacc cccggccgaa 420ttcatggtgg atttcggggc
gttaccaccg gagatcaact ccgcgaggat gtacgccggc 480ccgggttcgg
cctcgctggt ggccgcggct cagatgtggg acagcgtggc gagtgacctg
540ttttcggccg cgtcggcgtt tcagtcggtg gtctggggtc tgacggtggg
gtcgtggata 600ggttcgtcgg cgggtctgat ggtggcggcg gcctcgccgt
atgtggcgtg gatgagcgtc 660accgcggggc aggccgagct gaccgccgcc
caggtccggg ttgctgcggc ggcctacgag 720acggcgtatg ggctgacggt
gcccccgccg gtgatcgccg agaaccgtgc tgaactgatg 780attctgatag
cgaccaacct cttggggcaa aacaccccgg cgatcgcggt caacgaggcc
840gaatacggcg agatgtgggc ccaagacgcc gccgcgatgt ttggctacgc
cgcggcgacg 900gcgacggcga cggcgacgtt gctgccgttc gaggaggcgc
cggagatgac cagcgcgggt 960gggctcctcg agcaggccgc cgcggtcgag
gaggcctccg acaccgccgc ggcgaaccag 1020ttgatgaaca atgtgcccca
ggcgctgcaa cagctggccc agcccacgca gggcaccacg 1080ccttcttcca
agctgggtgg cctgtggaag acggtctcgc cgcatcggtc gccgatcagc
1140aacatggtgt cgatggccaa caaccacatg tcgatgacca actcgggtgt
gtcgatgacc 1200aacaccttga gctcgatgtt gaagggcttt gctccggcgg
cggccgccca ggccgtgcaa 1260accgcggcgc aaaacggggt ccgggcgatg
agctcgctgg gcagctcgct gggttcttcg 1320ggtctgggcg gtggggtggc
cgccaacttg ggtcgggcgg cctcggtcgg ttcgttgtcg 1380gtgccgcagg
cctgggccgc ggccaaccag gcagtcaccc cggcggcgcg ggcgctgccg
1440ctgaccagcc tgaccagcgc cgcggaaaga gggcccgggc agatgctggg
cgggctgccg 1500gtggggcaga tgggcgccag ggccggtggt gggctcagtg
gtgtgctgcg tgttccgccg 1560cgaccctatg tgatgccgca ttctccggca
gccggcgata tcgccccgcc ggccttgtcg 1620caggaccggt tcgccgactt
ccccgcgctg cccctcgacc cgtccgcgat ggtcgcccaa 1680gtggggccac
aggtggtcaa catcaacacc aaactgggct acaacaacgc cgtgggcgcc
1740gggaccggca tcgtcatcga tcccaacggt gtcgtgctga ccaacaacca
cgtgatcgcg 1800ggcgccaccg acatcaatgc gttcagcgtc ggctccggcc
aaacctacgg cgtcgatgtg 1860gtcgggtatg accgcaccca ggatgtcgcg
gtgctgcagc tgcgcggtgc cggtggcctg 1920ccgtcggcgg cgatcggtgg
cggcgtcgcg gttggtgagc ccgtcgtcgc gatgggcaac 1980agcggtgggc
agggcggaac gccccgtgcg gtgcctggca gggtggtcgc gctcggccaa
2040accgtgcagg cgtcggattc gctgaccggt gccgaagaga cattgaacgg
gttgatccag 2100ttcgatgccg cgatccagcc cggtgatgcg ggcgggcccg
tcgtcaacgg cctaggacag 2160gtggtcggta tgaacacggc cgcgtcctag
2190
18 729 PRT Artificial Sequence Description of Artificial Sequence: MTB72FMutSA (Ra12-TbH9-Ra35MutSA)
18 Met His His His His His
His Thr Ala Ala Ser Asp Asn Phe Gln Leu1 5 10 15Ser Gln Gly Gly Gln
Gly Phe Ala Ile Pro Ile Gly Gln Ala Met Ala20 25 30Ile Ala Gly Gln
Ile Arg Ser Gly Gly Gly Ser Pro Thr Val His Ile35 40 45Gly Pro Thr
Ala Phe Leu Gly Leu Gly Val Val Asp Asn Asn Gly Asn50 55 60Gly Ala
Arg Val Gln Arg Val Val Gly Ser Ala Pro Ala Ala Ser Leu65 70 75
80Gly Ile Ser Thr Gly Asp Val Ile Thr Ala Val Asp Gly Ala Pro Ile85
90 95Asn Ser Ala Thr Ala Met Ala Asp Ala Leu Asn Gly His His Pro
Gly100 105 110Asp Val Ile Ser Val Thr Trp Gln Thr Lys Ser Gly Gly
Thr Arg Thr115 120 125Gly Asn Val Thr Leu Ala Glu Gly Pro Pro Ala
Glu Phe Met Val Asp130 135 140Phe Gly Ala Leu Pro Pro Glu Ile Asn
Ser Ala Arg Met Tyr Ala Gly145 150 155 160Pro Gly Ser Ala Ser Leu
Val Ala Ala Ala Gln Met Trp Asp Ser Val165 170 175Ala Ser Asp Leu
Phe Ser Ala Ala Ser Ala Phe Gln Ser Val Val Trp180 185 190Gly Leu
Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly Leu Met Val195 200
205Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr Ala Gly
Gln210 215 220Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
Ala Tyr Glu225 230 235 240Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro
Val Ile Ala Glu Asn Arg245 250 255Ala Glu Leu Met Ile Leu Ile Ala
Thr Asn Leu Leu Gly Gln Asn Thr260 265 270Pro Ala Ile Ala Val Asn
Glu Ala Glu Tyr Gly Glu Met Trp Ala Gln275 280 285Asp Ala Ala Ala
Met Phe Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr290 295 300Ala Thr
Leu Leu Pro Phe Glu Glu Ala Pro Glu Met Thr Ser Ala Gly305 310 315
320Gly Leu Leu Glu Gln Ala Ala Ala Val Glu Glu Ala Ser Asp Thr
Ala325 330 335Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
Gln Gln Leu340 345 350Ala Gln Pro Thr Gln Gly Thr Thr Pro Ser Ser
Lys Leu Gly Gly Leu355 360 365Trp Lys Thr Val Ser Pro His Arg Ser
Pro Ile Ser Asn Met Val Ser370 375 380Met Ala Asn Asn His Met Ser
Met Thr Asn Ser Gly Val Ser Met Thr385 390 395 400Asn Thr Leu Ser
Ser Met Leu Lys Gly Phe Ala Pro Ala Ala Ala Ala405 410 415Gln Ala
Val Gln Thr Ala Ala Gln Asn Gly Val Arg Ala Met Ser Ser420 425
430Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly Gly Gly Val Ala
Ala435 440 445Asn Leu Gly Arg Ala Ala Ser Val Gly Ser Leu Ser Val
Pro Gln Ala450 455 460Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala
Ala Arg Ala Leu Pro465 470 475 480Leu Thr Ser Leu Thr Ser Ala Ala
Glu Arg Gly Pro Gly Gln Met Leu485 490 495Gly Gly Leu Pro Val Gly
Gln Met Gly Ala Arg Ala Gly Gly Gly Leu500 505 510Ser Gly Val Leu
Arg Val Pro Pro Arg Pro Tyr Val Met Pro His Ser515 520 525Pro Ala
Ala Gly Asp Ile Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe530 535
540Ala Asp Phe Pro Ala Leu Pro Leu Asp Pro Ser Ala Met Val Ala
Gln545 550 555 560Val Gly Pro Gln Val Val Asn Ile Asn Thr Lys Leu
Gly Tyr Asn Asn565 570 575Ala Val Gly Ala Gly Thr Gly Ile Val Ile
Asp Pro Asn Gly Val Val580 585 590Leu Thr Asn Asn His Val Ile Ala
Gly Ala Thr Asp Ile Asn Ala Phe595 600 605Ser Val Gly Ser Gly Gln
Thr Tyr Gly Val Asp Val Val Gly Tyr Asp610 615 620Arg Thr Gln Asp
Val Ala Val Leu Gln Leu Arg Gly Ala Gly Gly Leu625 630 635 640Pro
Ser Ala Ala Ile Gly Gly Gly Val Ala Val Gly Glu Pro Val Val645 650
655Ala Met Gly Asn Ser Gly Gly Gln Gly Gly Thr Pro Arg Ala Val
Pro660 665 670Gly Arg Val Val Ala Leu Gly Gln Thr Val Gln Ala Ser
Asp Ser Leu675 680 685Thr Gly Ala Glu Glu Thr Leu Asn Gly Leu Ile
Gln Phe Asp Ala Ala690 695 700Ile Gln Pro Gly Asp Ala Gly Gly Pro
Val Val Asn Gly Leu Gly Gln705 710 715 720Val Val Gly Met Asn Thr
Ala Ala Ser 725
19 1797 DNA Artificial Sequence Description of Artificial Sequence: bi-fusion protein TbH9-Ra35 (designated MTB59F)
19 cat atg
cat cac cat cac cat cac atg gtg gat ttc ggg gcg tta cca 48His Met
His His His His His His Met Val Asp Phe Gly Ala Leu Pro1 5 10 15ccg
gag atc aac tcc gcg agg atg tac gcc ggc ccg ggt tcg gcc tcg 96Pro
Glu Ile Asn Ser Ala Arg Met Tyr Ala Gly Pro Gly Ser Ala Ser20 25
30ctg gtg gcc gcg gct cag atg tgg gac agc gtg gcg agt gac ctg ttt
144Leu Val Ala Ala Ala Gln Met Trp Asp Ser Val Ala Ser Asp Leu
Phe35 40 45tcg gcc gcg tcg gcg ttt cag tcg gtg gtc tgg ggt ctg acg
gtg ggg 192Ser Ala Ala Ser Ala Phe Gln Ser Val Val Trp Gly Leu Thr
Val Gly50 55 60tcg tgg ata ggt tcg tcg gcg ggt ctg atg gtg gcg gcg
gcc tcg ccg 240Ser Trp Ile Gly Ser Ser Ala Gly Leu Met Val Ala Ala
Ala Ser Pro65 70 75 80tat gtg gcg tgg atg agc gtc acc gcg ggg cag
gcc gag ctg acc gcc 288Tyr Val Ala Trp Met Ser Val Thr Ala Gly Gln
Ala Glu Leu Thr Ala85 90 95gcc cag gtc cgg gtt gct gcg gcg gcc tac
gag acg gcg tat ggg ctg 336Ala Gln Val Arg Val Ala Ala Ala Ala Tyr
Glu Thr Ala Tyr Gly Leu100 105 110acg gtg ccc ccg ccg gtg atc gcc
gag aac cgt gct gaa ctg atg att 384Thr Val Pro Pro Pro Val Ile Ala
Glu Asn Arg Ala Glu Leu Met Ile115 120 125ctg ata gcg acc aac ctc
ttg ggg caa aac acc ccg gcg atc gcg gtc 432Leu Ile Ala Thr Asn Leu
Leu Gly Gln Asn Thr Pro Ala Ile Ala Val130 135 140aac gag gcc gaa
tac ggc gag atg tgg gcc caa gac gcc gcc gcg atg 480Asn Glu Ala Glu
Tyr Gly Glu Met Trp Ala Gln Asp Ala Ala Ala Met145 150 155 160ttt
ggc tac gcc gcg gcg acg gcg acg gcg acg gcg acg ttg ctg ccg 528Phe
Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro165 170
175ttc gag gag gcg ccg gag atg acc agc gcg ggt ggg ctc ctc gag cag
576Phe Glu Glu Ala Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu
Gln180 185 190gcc gcc gcg gtc gag gag gcc tcc gac acc gcc gcg gcg
aac cag ttg 624Ala Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala
Asn Gln Leu195 200 205atg aac aat gtg ccc cag gcg ctg caa cag ctg
gcc cag ccc acg cag 672Met Asn Asn Val Pro Gln Ala Leu Gln Gln Leu
Ala Gln Pro Thr Gln210 215 220ggc acc acg cct tct tcc aag ctg ggt
ggc ctg tgg aag acg gtc tcg 720Gly Thr Thr Pro Ser Ser Lys Leu Gly
Gly Leu Trp Lys Thr Val Ser225 230 235 240ccg cat cgg tcg ccg atc
agc aac atg gtg tcg atg gcc aac aac cac 768Pro His Arg Ser Pro Ile
Ser Asn Met Val Ser Met Ala Asn Asn His245 250 255atg tcg atg acc
aac tcg ggt gtg tcg atg acc aac acc ttg agc tcg 816Met Ser Met Thr
Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser260 265 270atg ttg
aag ggc ttt gct ccg gcg gcg gcc gcc cag gcc gtg caa acc 864Met Leu
Lys Gly Phe Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr275 280
285gcg gcg caa aac ggg gtc cgg gcg atg agc tcg ctg ggc agc tcg ctg
912Ala Ala Gln Asn Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser
Leu290 295 300ggt tct tcg ggt ctg ggc ggt ggg gtg gcc gcc aac ttg
ggt cgg gcg 960Gly Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu
Gly Arg Ala305 310 315 320gcc tcg gtc ggt tcg ttg tcg gtg ccg cag
gcc tgg gcc gcg gcc aac 1008Ala Ser Val Gly Ser Leu Ser Val Pro Gln
Ala Trp Ala Ala Ala Asn325 330 335cag gca gtc acc ccg gcg gcg cgg
gcg ctg ccg ctg acc agc ctg acc 1056Gln Ala Val Thr Pro Ala Ala Arg
Ala Leu Pro Leu Thr Ser Leu Thr340 345 350agc gcc gcg gaa aga ggg
ccc ggg cag atg ctg ggc ggg ctg ccg gtg 1104Ser Ala Ala Glu Arg Gly
Pro Gly Gln Met Leu Gly Gly Leu Pro Val355 360 365ggg cag atg ggc
gcc agg gcc ggt ggt ggg ctc agt ggt gtg ctg cgt 1152Gly Gln Met Gly
Ala Arg Ala Gly Gly Gly Leu Ser Gly Val Leu Arg370 375 380gtt ccg
ccg cga ccc tat gtg atg ccg cat tct ccg gca gcc ggc gat 1200Val Pro
Pro Arg Pro Tyr Val Met Pro His Ser Pro Ala Ala Gly Asp385 390 395
400atc gcc ccg ccg gcc ttg tcg cag gac cgg ttc gcc gac ttc ccc gcg
1248Ile Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro
Ala405 410 415ctg ccc ctc gac ccg tcc gcg atg gtc gcc caa gtg ggg
cca cag gtg 1296Leu Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Gly
Pro Gln Val420 425 430gtc aac atc aac acc aaa ctg ggc tac aac aac
gcc gtg ggc gcc ggg 1344Val Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn
Ala Val Gly Ala Gly435 440 445acc ggc atc gtc atc gat ccc aac ggt
gtc gtg ctg acc aac aac cac 1392Thr Gly Ile Val Ile Asp Pro Asn Gly
Val Val Leu Thr Asn Asn His450 455 460gtg atc gcg ggc gcc acc gac
atc aat gcg ttc agc gtc ggc tcc ggc 1440Val Ile Ala Gly Ala Thr Asp
Ile Asn Ala Phe Ser Val Gly Ser Gly465 470 475 480caa acc tac ggc
gtc gat gtg gtc ggg tat gac cgc acc cag gat gtc 1488Gln Thr Tyr Gly
Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp Val485 490 495gcg gtg
ctg cag ctg cgc ggt gcc ggt ggc ctg ccg tcg gcg gcg atc 1536Ala Val
Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile500 505
510ggt ggc ggc gtc gcg gtt ggt gag ccc gtc gtc gcg atg ggc aac agc
1584Gly Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met Gly Asn
Ser515 520 525ggt ggg cag ggc gga acg ccc cgt gcg gtg cct ggc agg
gtg gtc gcg 1632Gly Gly Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg
Val Val Ala530 535 540ctc ggc caa acc gtg cag gcg tcg gat tcg ctg
acc ggt gcc gaa gag 1680Leu Gly Gln Thr Val Gln Ala Ser Asp Ser Leu
Thr Gly Ala Glu Glu545 550 555 560aca ttg aac ggg ttg atc cag ttc
gat gcc gcg atc cag ccc ggt gat 1728Thr Leu Asn Gly Leu Ile Gln Phe
Asp Ala Ala Ile Gln Pro Gly Asp565 570 575tcg ggc ggg ccc gtc gtc
aac ggc cta gga cag gtg gtc ggt atg aac 1776Ser Gly Gly Pro Val Val
Asn Gly Leu Gly Gln Val Val Gly Met Asn580 585 590acg gcc gcg tcc
tag gatatc 1797Thr Ala Ala Ser 595
20 596 PRT Artificial Sequence Description of Artificial Sequence: bi-fusion protein TbH9-Ra35 (designated MTB59F)
20 His Met His His His His His His Met
Val Asp Phe Gly Ala Leu Pro1 5 10 15Pro Glu Ile Asn Ser Ala Arg Met
Tyr Ala Gly Pro Gly Ser Ala Ser20 25 30Leu Val Ala Ala Ala Gln Met
Trp Asp Ser Val Ala Ser Asp Leu Phe35 40 45Ser Ala Ala Ser Ala Phe
Gln Ser Val Val Trp Gly Leu Thr Val Gly50 55 60Ser Trp Ile Gly Ser
Ser Ala Gly Leu Met Val Ala Ala Ala Ser Pro65 70 75
80Tyr Val Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr Ala85
90 95Ala Gln Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly
Leu100 105 110Thr Val Pro Pro Pro Val Ile Ala Glu Asn Arg Ala Glu
Leu Met Ile115 120 125Leu Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr
Pro Ala Ile Ala Val130 135 140Asn Glu Ala Glu Tyr Gly Glu Met Trp
Ala Gln Asp Ala Ala Ala Met145 150 155 160Phe Gly Tyr Ala Ala Ala
Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro165 170 175Phe Glu Glu Ala
Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln180 185 190Ala Ala
Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gln Leu195 200
205Met Asn Asn Val Pro Gln Ala Leu Gln Gln Leu Ala Gln Pro Thr
Gln210 215 220Gly Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys
Thr Val Ser225 230 235 240Pro His Arg Ser Pro Ile Ser Asn Met Val
Ser Met Ala Asn Asn His245 250 255Met Ser Met Thr Asn Ser Gly Val
Ser Met Thr Asn Thr Leu Ser Ser260 265 270Met Leu Lys Gly Phe Ala
Pro Ala Ala Ala Ala Gln Ala Val Gln Thr275 280 285Ala Ala Gln Asn
Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu290 295 300Gly Ser
Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala305 310 315
320Ala Ser Val Gly Ser Leu Ser Val Pro Gln Ala Trp Ala Ala Ala
Asn325 330 335Gln Ala Val Thr Pro Ala Ala Arg Ala Leu Pro Leu Thr
Ser Leu Thr340 345 350Ser Ala Ala Glu Arg Gly Pro Gly Gln Met Leu
Gly Gly Leu Pro Val355 360 365Gly Gln Met Gly Ala Arg Ala Gly Gly
Gly Leu Ser Gly Val Leu Arg370 375 380Val Pro Pro Arg Pro Tyr Val
Met Pro His Ser Pro Ala Ala Gly Asp385 390 395 400Ile Ala Pro Pro
Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala405 410 415Leu Pro
Leu Asp Pro Ser Ala Met Val Ala Gln Val Gly Pro Gln Val420 425
430Val Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala
Gly435 440 445Thr Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr
Asn Asn His450 455 460Val Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe
Ser Val Gly Ser Gly465 470 475 480Gln Thr Tyr Gly Val Asp Val Val
Gly Tyr Asp Arg Thr Gln Asp Val485 490 495Ala Val Leu Gln Leu Arg
Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile500 505 510Gly Gly Gly Val
Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser515 520 525Gly Gly
Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala530 535
540Leu Gly Gln Thr Val Gln Ala Ser Asp Ser Leu Thr Gly Ala Glu
Glu545 550 555 560Thr Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile
Gln Pro Gly Asp565 570 575Ser Gly Gly Pro Val Val Asn Gly Leu Gly
Gln Val Val Gly Met Asn580 585 590Thr Ala Ala
Ser 595
21 500 DNA Mycobacterium tuberculosis MTB8.4 (DPV) cDNA
21
cgtggcaatg tcgttgaccg tcggggccgg ggtcgcctcc gcagatcccg tggacgcggt
60cattaacacc acctgcaatt acgggcaggt agtagctgcg ctcaacgcga cggatccggg
120ggctgccgca cagttcaacg cctcaccggt ggcgcagtcc tatttgcgca
atttcctcgc 180cgcaccgcca cctcagcgcg ctgccatggc cgcgcaattg
caagctgtgc cgggggcggc 240acagtacatc ggccttgtcg agtcggttgc
cggctcctgc aacaactatt aagcccatgc 300gggccccatc ccgcgacccg
gcatcgtcgc cggggctagg ccagattgcc ccgctcctca 360acgggccgca
tcccgcgacc cggcatcgtc gccggggcta ggccagattg ccccgctcct
420caacgggccg catctcgtgc cgaattcctg cagcccgggg gatccactag
ttctagagcg 480 gccgccaccg cggtggagct 500
22 96 PRT Mycobacterium tuberculosis MTB8.4 (DPV)
22
Val Ala Met Ser Leu Thr Val Gly Ala Gly
Val Ala Ser Ala Asp Pro1 5 10 15Val Asp Ala Val Ile Asn Thr Thr Cys
Asn Tyr Gly Gln Val Val Ala20 25 30Ala Leu Asn Ala Thr Asp Pro Gly
Ala Ala Ala Gln Phe Asn Ala Ser35 40 45Pro Val Ala Gln Ser Tyr Leu
Arg Asn Phe Leu Ala Ala Pro Pro Pro50 55 60Gln Arg Ala Ala Met Ala
Ala Gln Leu Gln Ala Val Pro Gly Ala Ala65 70 75 80Gln Tyr Ile Gly
Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr 85 90 95
23 585 DNA Mycobacterium tuberculosis MTB9.8 (MSL)
23
tggattccga
tagcggtttc ggcccctcga cgggcgacca cggcgcgcag gcctccgaac 60ggggggccgg
gacgctggga ttcgccggga ccgcaaccaa agaacgccgg gtccgggcgg
120tcgggctgac cgcactggcc ggtgatgagt tcggcaacgg cccccggatg
ccgatggtgc 180cggggacctg ggagcagggc agcaacgagc ccgaggcgcc
cgacggatcg gggagagggg 240gaggcgacgg cttaccgcac gacagcaagt
aaccgaattc cgaatcacgt ggacccgtac 300gggtcgaaag gagagatgtt
atgagccttt tggatgctca tatcccacag ttggtggcct 360cccagtcggc
gtttgccgcc aaggcggggc tgatgcggca cacgatcggt caggccgagc
420aggcggcgat gtcggctcag gcgtttcacc agggggagtc gtcggcggcg
tttcaggccg 480cccatgcccg gtttgtggcg gcggccgcca aagtcaacac
cttgttggat gtcgcgcagg 540cgaatctggg tgaggccgcc ggtacctatg
tggccgccga tgctg 585
24 97 PRT Mycobacterium tuberculosis MTB9.8 (MSL)
24
Met Ser Leu Leu Asp Ala His Ile Pro Gln Leu Val Ala Ser Gln Ser 1
5 10 15Ala Phe Ala Ala Lys Ala Gly Leu Met Arg His Thr Ile Gly Gln
Ala20 25 30Glu Gln Ala Ala Met Ser Ala Gln Ala Phe His Gln Gly Glu
Ser Ser35 40 45Ala Ala Phe Gln Ala Ala His Ala Arg Phe Val Ala Ala
Ala Ala Lys50 55 60Val Asn Thr Leu Leu Asp Val Ala Gln Ala Asn Leu
Gly Glu Ala Ala65 70 75 80Gly Thr Tyr Val Ala Ala Asp Ala Ala Ala
Ala Ser Thr Tyr Thr Gly 85 90 95 Phe
25 1742 DNA Mycobacterium tuberculosis MTB9.9A (MTI, also known as MTI-A)
25
ccgctctctt
tcaacgtcat aagttcggtg ggccagtcgg ccgcgcgtgc atatggcacc 60aataacgcgt
gtcccatgga tacccggacc gcacgacggt agagcggatc agcgcagccg
120gtgccgaaca ctaccgcgtc cacgctcagc cctgccgcgt tgcggaagat
cgagcccagg 180ttctcatggt cgttaacgcc ttccaacact gcgacggtgc
gcgccccggc gaccacctga 240gcaacgctcg gctccggcac ccggcgcgcg
gctgccaaca ccccacgatt gagatggaag 300ccgatcaccc gtgccatgac
atcagccgac gctcgatagt acggcgcgcc gacaccggcc 360agatcatcct
tgagctcggc cagccggcgg tcggtgccga acagcgccag cggcgtgaac
420cgtgaggcca gcatgcgctg caccaccagc acaccctcgg cgatcaccaa
cgccttgccg 480gtcggcagat cgggacnacn gtcgatgctg ttcaggtcac
ggaaatcgtc gagccgtggg 540tcgtcgggat cgcagacgtc ctgaacatcg
aggccgtcgg ggtgctgggc acaacggcct 600tcggtcacgg gctttcgtcg
accagagcca gcatcagatc ggcggcgctg cgcaggatgt 660cacgctcgct
gcggttcagc gtcgcgagcc gctcagccag ccactcttgc agagagccgt
720tgctgggatt aattgggaga ggaagacagc atgtcgttcg tgaccacaca
gccggaagcc 780ctggcagctg cggcggcgaa cctacagggt attggcacga
caatgaacgc ccagaacgcg 840gccgcggctg ctccaaccac cggagtagtg
cccgcagccg ccgatgaagt atcagcgctg 900accgcggctc agtttgctgc
gcacgcgcag atgtaccaaa cggtcagcgc ccaggccgcg 960gccattcacg
aaatgttcgt gaacacgctg gtggccagtt ctggctcata cgcggccacc
1020gaggcggcca acgcagccgc tgccggctga acgggctcgc acgaacctgc
tgaaggagag 1080ggggaacatc cggagttctc gggtcagggg ttgcgccagc
gcccagccga ttcagntatc 1140ggcgtccata acagcagacg atctaggcat
tcagtactaa ggagacaggc aacatggcct 1200cacgttttat gacggatccg
catgcgatgc gggacatggc gggccgtttt gaggtgcacg 1260cccagacggt
ggaggacgag gctcgccgga tgtgggcgtc cgcgcaaaac atttccggtg
1320cgggctggag tggcatggcc gaggcgacct cgctagacac catgacctag
atgaatcagg 1380cgtttcgcaa catcgtgaac atgctgcacg gggtgcgtga
cgggctggtt cgcgacgcca 1440acaantacga acagcaagag caggcctccc
agcagatcct gagcagntag cgccgaaagc 1500cacagctgng tacgntttct
cacattagga gaacaccaat atgacgatta attaccagtt 1560cggggacgtc
gacgctcatg gcgccatgat ccgcgctcag gcggcgtcgc ttgaggcgga
1620gcatcaggcc atcgttcgtg atgtgttggc cgcgggtgac ttttggggcg
gcgccggttc 1680ggtggcttgc caggagttca ttacccagtt gggccgtaac
ttccaggtga tctacgagca 1740 gg 1742
26 2836 DNA Mycobacterium tuberculosis MTB9.9A (MTI also known as MTI-A)
26
gttgattccg
ttcgcggcgc cgccgaagac caccaactcc gctggggtgg tcgcacaggc 60ggttgcgtcg
gtcagctggc cgaatcccaa tgattggtgg ctcngtgcgg ttgctgggct
120cgattacccc cacggaaagg acgacgatcg ttcgtttgct cggtcagtcg
tacttggcga 180cgggcatggc gcggtttctt acctcgatcg cacagcagct
gaccttcggc ccagggggca 240caacggctgg ctccggcgga gcctggtacc
caacgccaca attcgccggc ctgggtgcag 300gcccggcggt gtcggcgagt
ttggcgcggg cggagccggt cgggaggttg tcggtgccgc 360caagttgggc
cgtcgcggct ccggccttcg cggagaagcc tgaggcgggc acgccgatgt
420ccgtcatcgg cgaagcgtcc agctgcggtc agggaggcct gcttcgaggc
ataccgctgg 480cgagagcggg gcggcgtaca ggcgccttcg ctcaccgata
cgggttccgc cacagcgtga 540ttacccggtc tccgtcggcg ggatagcttt
cgatccggtc tgcgcggccg ccggaaatgc 600tgcagatagc gatcgaccgc
gccggtcggt aaacgccgca cacggcacta tcaatgcgca 660cggcgggcgt
tgatgccaaa ttgaccgtcc cgacggggct ttatctgcgg caagatttca
720tccccagccc ggtcggtggg ccgataaata cgctggtcag cgcgactctt
ccggctgaat 780tcgatgctct gggcgcccgc tcgacgccga gtatctcgag
tgggccgcaa acccggtcaa 840acgctgttac tgtggcgtta ccacaggtga
atttgcggtg ccaactggtg aacacttgcg 900aacgggtggc atcgaaatca
acttgttgcg ttgcagtgat ctactctctt gcagagagcc 960gttgctggga
ttaattggga gaggaagaca gcatgtcgtt cgtgaccaca cagccggaag
1020ccctggcagc tgcggcggcg aacctacagg gtattggcac gacaatgaac
gcccagaacg 1080cggccgcggc tgctccaacc accggagtag tgcccgcagc
cgccgatgaa gtatcagcgc 1140tgaccgcggc tcagtttgct gcgcacgcgc
agatgtacca aacggtcagc gcccaggccg 1200cggccattca cgaaatgttc
gtgaacacgc tggtggccag ttctggctca tacgcggcca 1260ccgaggcggc
caacgcagcc gctgccggct gaacgggctc gcacgaacct gctgaaggag
1320agggggaaca tccggagttc tcgggtcagg ggttgcgcca gcgcccagcc
gattcagcta 1380tcggcgtcca taacagcaga cgatctaggc attcagtact
aaggagacag gcaacatggc 1440ctcacgtttt atgacggatc cgcatgcgat
gcgggacatg gcgggccgtt ttgaggtgca 1500cgcccagacg gtggaggacg
aggctcgccg gatgtgggcg tccgcgcaaa acatttccgg 1560tgcgggctgg
agtggcatgg ccgaggcgac ctcgctagac accatgacct agatgaatca
1620ggcgtttcgc aacatcgtga acatgctgca cggggtgcgt gacgggctgg
ttcgcgacgc 1680caacaactac gaacagcaag agcaggcctc ccagcagatc
ctgagcagct agcgccgaaa 1740gccacagctg cgtacgcttt ctcacattag
gagaacacca atatgacgat taattaccag 1800ttcggggacg tcgacgctca
tggcgccatg atccgcgctc aggcggcgtc gcttgaggcg 1860gagcatcagg
ccatcgttcg tgatgtgttg gccgcgggtg acttttgggg cggcgccggt
1920tcggtggctt gccaggagtt cattacccag ttgggccgta acttccaggt
gatctacgag 1980caggccaacg cccacgggca gaaggtgcag gctgccggca
acaacatggc gcaaaccgac 2040agcgccgtcg gctccagctg ggcctaaaac
tgaacttcag tcgcggcagc acaccaacca 2100gccggtgtgc tgctgtgtcc
tgcagttaac tagcactcga ccgctgaggt agcgatggat 2160caacagagta
cccgcaccga catcaccgtc aacgtcgacg gcttctggat gcttcaggcg
2220ctactggata tccgccacgt tgcgcctgag ttacgttgcc ggccgtacgt
ctccaccgat 2280tccaatgact ggctaaacga gcacccgggg atggcggtca
tgcgcgagca gggcattgtc 2340gtcaacgacg cggtcaacga acaggtcgct
gcccggatga aggtgcttgc cgcacctgat 2400cttgaagtcg tcgccctgct
gtcacgcggc aagttgctgt acggggtcat agacgacgag 2460aaccagccgc
cgggttcgcg tgacatccct gacaatgagt tccgggtggt gttggcccgg
2520cgaggccagc actgggtgtc ggcggtacgg gttggcaatg acatcaccgt
cgatgacgtg 2580acggtctcgg atagcgcctc gatcgccgca ctggtaatgg
acggtctgga gtcgattcac 2640cacgccgacc cagccgcgat caacgcggtc
aacgtgccaa tggaggagat ctcgtgccga 2700attcggcacg aggcacgagg
cggtgtcggt gacgacggga tcgatcacga tcatcgaccg 2760gccgggatcc
ttggcgatct cgttgagcac gacccgggcc cgcgggaagc tctgcgacat
2820 ccatgggttc ttcccg 2836
27 94 PRT Mycobacterium tuberculosis MTB9.9A (MTI, also known as MTI-A) ORF peptide
27
Met Thr Ile Asn Tyr Gln
Phe Gly Asp Val Asp Ala His Gly Ala Met1 5 10 15Ile Arg Ala Leu Ala
Gly Leu Leu Glu Ala Glu His Gln Ala Ile Ile20 25 30Ser Asp Val Leu
Thr Ala Ser Asp Phe Trp Gly Gly Ala Gly Ser Ala35 40 45Ala Cys Gln
Gly Phe Ile Thr Gln Leu Gly Arg Asn Phe Gln Val Ile50 55 60Tyr Glu
Gln Ala Asn Ala His Gly Gln Lys Val Gln Ala Ala Gly Asn65 70 75
80 Asn Met Ala Gln Thr Asp Ser Ala Val Gly Ser Ser Trp Ala 85 90
28 1200 DNA Mycobacterium tuberculosis MTB40 (HTCC#1) cDNA
28
caggcatgag cagagcgttc atcatcgatc caacgatcag tgccattgac ggcttgtacg
60accttctggg gattggaata cccaaccaag ggggtatcct ttactcctca ctagagtact
120tcgaaaaagc cctggaggag ctggcagcag cgtttccggg tgatggctgg
ttaggttcgg 180ccgcggacaa atacgccggc aaaaaccgca accacgtgaa
ttttttccag gaactggcag 240acctcgatcg tcagctcatc agcctgatcc
acgaccaggc caacgcggtc cagacgaccc 300gcgacatcct ggagggcgcc
aagaaaggtc tcgagttcgt gcgcccggtg gctgtggacc 360tgacctacat
cccggtcgtc gggcacgccc tatcggccgc cttccaggcg ccgttttgcg
420cgggcgcgat ggccgtagtg ggcggcgcgc ttgcctactt ggtcgtgaaa
acgctgatca 480acgcgactca actcctcaaa ttgcttgcca aattggcgga
gttggtcgcg gccgccattg 540cggacatcat ttcggatgtg gcggacatca
tcaagggcac cctcggagaa gtgtgggagt 600tcatcacaaa cgcgctcaac
ggcctgaaag agctttggga caagctcacg gggtgggtga 660ccggactgtt
ctctcgaggg tggtcgaacc tggagtcctt ctttgcgggc gtccccggct
720tgaccggcgc gaccagcggc ttgtcgcaag tgactggctt gttcggtgcg
gccggtctgt 780ccgcatcgtc gggcttggct cacgcggata gcctggcgag
ctcagccagc ttgcccgccc 840tggccggcat tgggggcggg tccggttttg
ggggcttgcc gagcctggct caggtccatg 900ccgcctcaac tcggcaggcg
ctacggcccc gagctgatgg cccggtcggc gccgctgccg 960agcaggtcgg
cgggcagtcg cagctggtct ccgcgcaggg ttcccaaggt atgggcggac
1020ccgtaggcat gggcggcatg cacccctctt cgggggcgtc gaaagggacg
acgacgaaga 1080agtactcgga aggcgcggcg gcgggcactg aagacgccga
gcgcgcgcca gtcgaagctg 1140acgcgggcgg tgggcaaaag gtgctggtac
gaaacgtcgt ctaacggcat ggcgagccaa 1200
29 392 PRT Mycobacterium tuberculosis MTB40 (HTCC#1)
29
Met Ser Arg Ala Phe Ile Ile Asp Pro
Thr Ile Ser Ala Ile Asp Gly1 5 10 15Leu Tyr Asp Leu Leu Gly Ile Gly
Ile Pro Asn Gln Gly Gly Ile Leu20 25 30Tyr Ser Ser Leu Glu Tyr Phe
Glu Lys Ala Leu Glu Glu Leu Ala Ala35 40 45Ala Phe Pro Gly Asp Gly
Trp Leu Gly Ser Ala Ala Asp Lys Tyr Ala50 55 60Gly Lys Asn Arg Asn
His Val Asn Phe Phe Gln Glu Leu Ala Asp Leu65 70 75 80Asp Arg Gln
Leu Ile Ser Leu Ile His Asp Gln Ala Asn Ala Val Gln85 90 95Thr Thr
Arg Asp Ile Leu Glu Gly Ala Lys Lys Gly Leu Glu Phe Val100 105
110Arg Pro Val Ala Val Asp Leu Thr Tyr Ile Pro Val Val Gly His
Ala115 120 125Leu Ser Ala Ala Phe Gln Ala Pro Phe Cys Ala Gly Ala
Met Ala Val130 135 140Val Gly Gly Ala Leu Ala Tyr Leu Val Val Lys
Thr Leu Ile Asn Ala145 150 155 160Thr Gln Leu Leu Lys Leu Leu Ala
Lys Leu Ala Glu Leu Val Ala Ala165 170 175Ala Ile Ala Asp Ile Ile
Ser Asp Val Ala Asp Ile Ile Lys Gly Thr180 185 190Leu Gly Glu Val
Trp Glu Phe Ile Thr Asn Ala Leu Asn Gly Leu Lys195 200 205Glu Leu
Trp Asp Lys Leu Thr Gly Trp Val Thr Gly Leu Phe Ser Arg210 215
220Gly Trp Ser Asn Leu Glu Ser Phe Phe Ala Gly Val Pro Gly Leu
Thr225 230 235 240Gly Ala Thr Ser Gly Leu Ser Gln Val Thr Gly Leu
Phe Gly Ala Ala245 250 255Gly Leu Ser Ala Ser Ser Gly Leu Ala His
Ala Asp Ser Leu Ala Ser260 265 270Ser Ala Ser Leu Pro Ala Leu Ala
Gly Ile Gly Gly Gly Ser Gly Phe275 280 285Gly Gly Leu Pro Ser Leu
Ala Gln Val His Ala Ala Ser Thr Arg Gln290 295 300Ala Leu Arg Pro
Arg Ala Asp Gly Pro Val Gly Ala Ala Ala Glu Gln305 310 315 320Val
Gly Gly Gln Ser Gln Leu Val Ser Ala Gln Gly Ser Gln Gly Met325 330
335Gly Gly Pro Val Gly Met Gly Gly Met His Pro Ser Ser Gly Ala
Ser340 345 350Lys Gly Thr Thr Thr Lys Lys Tyr Ser Glu Gly Ala Ala
Ala Gly Thr355 360 365Glu Asp Ala Glu Arg Ala Pro Val Glu Ala Asp
Ala Gly Gly Gly Gln 370 375 380 Lys Val Leu Val Arg Asn Val Val 385 390
30 1441 DNA Mycobacterium tuberculosis MTB41 (MTCC#2) cDNA
30
gaggttgctg gcaatggatt tcgggctttt acctccggaa gtgaattcaa gccgaatgta
60ttccggtccg gggccggagt cgatgctagc cgccgcggcc gcctgggacg gtgtggccgc
120ggagttgact tccgccgcgg tctcgtatgg atcggtggtg tcgacgctga
tcgttgagcc 180gtggatgggg ccggcggcgg ccgcgatggc ggccgcggca
acgccgtatg tggggtggct 240ggccgccacg gcggcgctgg cgaaggagac
ggccacacag gcgagggcag cggcggaagc 300gtttgggacg gcgttcgcga
tgacggtgcc accatccctc gtcgcggcca accgcagccg 360gttgatgtcg
ctggtcgcgg cgaacattct ggggcaaaac agtgcggcga tcgcggctac
420ccaggccgag tatgccgaaa tgtgggccca agacgctgcc gtgatgtaca
gctatgaggg 480ggcatctgcg gccgcgtcgg cgttgccgcc gttcactcca
cccgtgcaag gcaccggccc 540ggccgggccc gcggccgcag ccgcggcgac
ccaagccgcc ggtgcgggcg ccgttgcgga 600tgcacaggcg acactggccc
agctgccccc ggggatcctg agcgacattc tgtccgcatt 660ggccgccaac
gctgatccgc tgacatcggg actgttgggg atcgcgtcga ccctcaaccc
720gcaagtcgga tccgctcagc cgatagtgat
ccccaccccg ataggggaat tggacgtgat 780cgcgctctac attgcatcca
tcgcgaccgg cagcattgcg ctcgcgatca cgaacacggc 840cagaccctgg
cacatcggcc tatacgggaa cgccggcggg ctgggaccga cgcagggcca
900tccactgagt tcggcgaccg acgagccgga gccgcactgg ggccccttcg
ggggcgcggc 960gccggtgtcc gcgggcgtcg gccacgcagc attagtcgga
gcgttgtcgg tgccgcacag 1020ctggaccacg gccgccccgg agatccagct
cgccgttcag gcaacaccca ccttcagctc 1080cagcgccggc gccgacccga
cggccctaaa cgggatgccg gcaggcctgc tcagcgggat 1140ggctttggcg
agcctggccg cacgcggcac gacgggcggt ggcggcaccc gtagcggcac
1200cagcactgac ggccaagagg acggccgcaa acccccggta gttgtgatta
gagagcagcc 1260gccgcccgga aaccccccgc ggtaaaagtc cggcaaccgt
tcgtcgccgc gcggaaaatg 1320cctggtgagc gtggctatcc gacgggccgt
tcacaccgct tgtagtagcg tacggctatg 1380gacgacggtg tctggattct
cggcggctat cagagcgatt ttgctcgcaa cctcagcaaa 1440 g 1441
31 423 PRT Mycobacterium tuberculosis MTB41 (MTCC#2)
31
Met Asp Phe
Gly Leu Leu Pro Pro Glu Val Asn Ser Ser Arg Met Tyr1 5 10 15Ser Gly
Pro Gly Pro Glu Ser Met Leu Ala Ala Ala Ala Ala Trp Asp20 25 30Gly
Val Ala Ala Glu Leu Thr Ser Ala Ala Val Ser Tyr Gly Ser Val35 40
45Val Ser Thr Leu Ile Val Glu Pro Trp Met Gly Pro Ala Ala Ala Ala50
55 60Met Ala Ala Ala Ala Thr Pro Tyr Val Gly Trp Leu Ala Ala Thr
Ala65 70 75 80Ala Leu Ala Lys Glu Thr Ala Thr Gln Ala Arg Ala Ala
Ala Glu Ala85 90 95Phe Gly Thr Ala Phe Ala Met Thr Val Pro Pro Ser
Leu Val Ala Ala100 105 110Asn Arg Ser Arg Leu Met Ser Leu Val Ala
Ala Asn Ile Leu Gly Gln115 120 125Asn Ser Ala Ala Ile Ala Ala Thr
Gln Ala Glu Tyr Ala Glu Met Trp130 135 140Ala Gln Asp Ala Ala Val
Met Tyr Ser Tyr Glu Gly Ala Ser Ala Ala145 150 155 160Ala Ser Ala
Leu Pro Pro Phe Thr Pro Pro Val Gln Gly Thr Gly Pro165 170 175Ala
Gly Pro Ala Ala Ala Ala Ala Ala Thr Gln Ala Ala Gly Ala Gly180 185
190Ala Val Ala Asp Ala Gln Ala Thr Leu Ala Gln Leu Pro Pro Gly
Ile195 200 205Leu Ser Asp Ile Leu Ser Ala Leu Ala Ala Asn Ala Asp
Pro Leu Thr210 215 220Ser Gly Leu Leu Gly Ile Ala Ser Thr Leu Asn
Pro Gln Val Gly Ser225 230 235 240Ala Gln Pro Ile Val Ile Pro Thr
Pro Ile Gly Glu Leu Asp Val Ile245 250 255Ala Leu Tyr Ile Ala Ser
Ile Ala Thr Gly Ser Ile Ala Leu Ala Ile260 265 270Thr Asn Thr Ala
Arg Pro Trp His Ile Gly Leu Tyr Gly Asn Ala Gly275 280 285Gly Leu
Gly Pro Thr Gln Gly His Pro Leu Ser Ser Ala Thr Asp Glu290 295
300Pro Glu Pro His Trp Gly Pro Phe Gly Gly Ala Ala Pro Val Ser
Ala305 310 315 320Gly Val Gly His Ala Ala Leu Val Gly Ala Leu Ser
Val Pro His Ser325 330 335Trp Thr Thr Ala Ala Pro Glu Ile Gln Leu
Ala Val Gln Ala Thr Pro340 345 350Thr Phe Ser Ser Ser Ala Gly Ala
Asp Pro Thr Ala Leu Asn Gly Met355 360 365Pro Ala Gly Leu Leu Ser
Gly Met Ala Leu Ala Ser Leu Ala Ala Arg370 375 380Gly Thr Thr Gly
Gly Gly Gly Thr Arg Ser Gly Thr Ser Thr Asp Gly385 390 395 400Gln
Glu Asp Gly Arg Lys Pro Pro Val Val Val Ile Arg Glu Gln Pro405 410
Pro Pro Gly Asn Pro Pro Arg 420
32 154 DNA Mycobacterium tuberculosis ESAT-6
32
atgacagagc agcagtggaa tttcgcgggt atcgaggccg
cggcaagcgc aatccaggga 60aatgtcacgt ccattcattc cctccttgac gaggggaagc
agtccctgac caagctcgca 120 gcggcctggg gcggtagcgg ttcggaagcg tacc 154
33 51 PRT Mycobacterium tuberculosis ESAT-6
33
Met Thr Glu Gln Gln
Trp Asn Phe Ala Gly Ile Glu Ala Ala Ala Ser1 5 10 15Ala Ile Gln Gly
Asn Val Thr Ser Ile His Ser Leu Leu Asp Glu Gly20 25 30Lys Gln Ser
Leu Thr Lys Leu Ala Ala Ala Trp Gly Gly Ser Gly Ser35 40 45Glu Ala
Tyr 50
34 327 DNA Mycobacterium tuberculosis Tb38-1 or 38-1 (MTb11)
34
cggcacgaga gaccgatgcc gctaccctcg cgcaggaggc aggtaatttc gagcggatct
60ccggcgacct gaaaacccag atcgaccagg tggagtcgac ggcaggttcg ttgcagggcc
120agtggcgcgg cgcggcgggg acggccgccc aggccgcggt ggtgcgcttc
caagaagcag 180ccaataagca gaagcaggaa ctcgacgaga tctcgacgaa
tattcgtcag gccggcgtcc 240aatactcgag ggccgacgag gagcagcagc
aggcgctgtc ctcgcaaatg ggcttctgac 300 ccgctaatac gaaaagaaac ggagcaa 327
35 95 PRT Mycobacterium tuberculosis Tb38-1 or 38-1 (MTb11)
35
Thr
Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile1 5 10
15Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly20
25 30Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln
Ala35 40 45Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln
Glu Leu50 55 60Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln
Tyr Ser Arg65 70 75 80Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser
Gln Met Gly Phe 85 90 95
36 542 DNA Mycobacterium tuberculosis TbRa3
36
gaattcggca cgagaggtga tcgacatcat cgggaccagc cccacatcct gggaacaggc
60ggcggcggag gcggtccagc gggcgcggga tagcgtcgat gacatccgcg tcgctcgggt
120cattgagcag gacatggccg tggacagcgc cggcaagatc acctaccgca
tcaagctcga 180agtgtcgttc aagatgaggc cggcgcaacc gcgctagcac
gggccggcga gcaagacgca 240aaatcgcacg gtttgcggtt gattcgtgcg
attttgtgtc tgctcgccga ggcctaccag 300gcgcggccca ggtccgcgtg
ctgccgtatc caggcgtgca tcgcgattcc ggcggccacg 360ccggagttaa
tgcttcgcgt cgacccgaac tgggcgatcc gccggngagc tgatcgatga
420ccgtggccag cccgtcgatg cccgagttgc ccgaggaaac gtgctgccag
gccggtagga 480agcgtccgta ggcggcggtg ctgaccggct ctgcctgcgc
cctcagtgcg gccagcgagc 540 gg 542
37 66 PRT Mycobacterium tuberculosis TbRa3
37
Val Ile Asp Ile Ile Gly Thr Ser Pro Thr Ser Trp
Glu Gln Ala Ala1 5 10 15Ala Glu Ala Val Gln Arg Ala Arg Asp Ser Val
Asp Asp Ile Arg Val20 25 30Ala Arg Val Ile Glu Gln Asp Met Ala Val
Asp Ser Ala Gly Lys Ile35 40 45Thr Tyr Arg Ile Lys Leu Glu Val Ser
Phe Lys Met Arg Pro Ala Gln 50 55 60 Pro Arg 65
38 1993 DNA Mycobacterium tuberculosis 38 kD
38
tgttcttcga cggcaggctg gtggaggaag ggcccaccga
acagctgttc tcctcgccga 60agcatgcgga aaccgcccga tacgtcgccg gactgtcggg
ggacgtcaag gacgccaagc 120gcggaaattg aagagcacag aaaggtatgg
cgtgaaaatt cgtttgcata cgctgttggc 180cgtgttgacc gctgcgccgc
tgctgctagc agcggcgggc tgtggctcga aaccaccgag 240cggttcgcct
gaaacgggcg ccggcgccgg tactgtcgcg actacccccg cgtcgtcgcc
300ggtgacgttg gcggagaccg gtagcacgct gctctacccg ctgttcaacc
tgtggggtcc 360ggcctttcac gagaggtatc cgaacgtcac gatcaccgct
cagggcaccg gttctggtgc 420cgggatcgcg caggccgccg ccgggacggt
caacattggg gcctccgacg cctatctgtc 480ggaaggtgat atggccgcgc
acaaggggct gatgaacatc gcgctagcca tctccgctca 540gcaggtcaac
tacaacctgc ccggagtgag cgagcacctc aagctgaacg gaaaagtcct
600ggcggccatg taccagggca ccatcaaaac ctgggacgac ccgcagatcg
ctgcgctcaa 660ccccggcgtg aacctgcccg gcaccgcggt agttccgctg
caccgctccg acgggtccgg 720tgacaccttc ttgttcaccc agtacctgtc
caagcaagat cccgagggct ggggcaagtc 780gcccggcttc ggcaccaccg
tcgacttccc ggcggtgccg ggtgcgctgg gtgagaacgg 840caacggcggc
atggtgaccg gttgcgccga gacaccgggc tgcgtggcct atatcggcat
900cagcttcctc gaccaggcca gtcaacgggg actcggcgag gcccaactag
gcaatagctc 960tggcaatttc ttgttgcccg acgcgcaaag cattcaggcc
gcggcggctg gcttcgcatc 1020gaaaaccccg gcgaaccagg cgatttcgat
gatcgacggg cccgccccgg acggctaccc 1080gatcatcaac tacgagtacg
ccatcgtcaa caaccggcaa aaggacgccg ccaccgcgca 1140gaccttgcag
gcatttctgc actgggcgat caccgacggc aacaaggcct cgttcctcga
1200ccaggttcat ttccagccgc tgccgcccgc ggtggtgaag ttgtctgacg
cgttgatcgc 1260gacgatttcc agctagcctc gttgaccacc acgcgacagc
aacctccgtc gggccatcgg 1320gctgctttgc ggagcatgct ggcccgtgcc
ggtgaagtcg gccgcgctgg cccggccatc 1380cggtggttgg gtgggatagg
tgcggtgatc ccgctgcttg cgctggtctt ggtgctggtg 1440gtgctggtca
tcgaggcgat gggtgcgatc aggctcaacg ggttgcattt cttcaccgcc
1500accgaatgga atccaggcaa cacctacggc gaaaccgttg tcaccgacgc
gtcgcccatc 1560cggtcggcgc ctactacggg gcgttgccgc tgatcgtcgg
gacgctggcg acctcggcaa 1620tcgccctgat catcgcggtg ccggtctctg
taggagcggc gctggtgatc gtggaacggc 1680tgccgaaacg gttggccgag
gctgtgggaa tagtcctgga attgctcgcc ggaatcccca 1740gcgtggtcgt
cggtttgtgg ggggcaatga cgttcgggcc gttcatcgct catcacatcg
1800ctccggtgat cgctcacaac gctcccgatg tgccggtgct gaactacttg
cgcggcgacc 1860cgggcaacgg ggagggcatg ttggtgtccg gtctggtgtt
ggcggtgatg gtcgttccca 1920ttatcgccac caccactcat gacctgttcc
ggcaggtgcc ggtgttgccc cgggagggcg 1980cgatcgggaa ttc
199339374PRTMycobacterium tuberculosis38 kD 39Met Lys Ile Arg Leu
His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro1 5 10 15Leu Leu Leu Ala
Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser20 25 30Pro Glu Thr
Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser35 40 45Ser Pro
Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu50 55 60Phe
Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr65 70 75
80Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala85
90 95Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu
Gly100 105 110Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu
Ala Ile Ser115 120 125Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val
Ser Glu His Leu Lys130 135 140Leu Asn Gly Lys Val Leu Ala Ala Met
Tyr Gln Gly Thr Ile Lys Thr145 150 155 160Trp Asp Asp Pro Gln Ile
Ala Ala Leu Asn Pro Gly Val Asn Leu Pro165 170 175Gly Thr Ala Val
Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr180 185 190Phe Leu
Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly195 200
205Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro
Gly210 215 220Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly
Cys Ala Glu225 230 235 240Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile
Ser Phe Leu Asp Gln Ala245 250 255Ser Gln Arg Gly Leu Gly Glu Ala
Gln Leu Gly Asn Ser Ser Gly Asn260 265 270Phe Leu Leu Pro Asp Ala
Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe275 280 285Ala Ser Lys Thr
Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro290 295 300Ala Pro
Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn305 310 315
320Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe
Leu325 330 335His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu
Asp Gln Val340 345 350His Phe Gln Pro Leu Pro Pro Ala Val Val Lys
Leu Ser Asp Ala Leu355 360 365Ile Ala Thr Ile Ser
Ser 370
40 999 DNA Mycobacterium tuberculosis DPEP
40
atgcatcacc
atcaccatca catgcatcag gtggacccca acttgacacg tcgcaaggga 60cgattggcgg
cactggctat cgcggcgatg gccagcgcca gcctggtgac cgttgcggtg
120cccgcgaccg ccaacgccga tccggagcca gcgcccccgg tacccacaac
ggccgcctcg 180ccgccgtcga ccgctgcagc gccacccgca ccggcgacac
ctgttgcccc cccaccaccg 240gccgccgcca acacgccgaa tgcccagccg
ggcgatccca acgcagcacc tccgccggcc 300gacccgaacg caccgccgcc
acctgtcatt gccccaaacg caccccaacc tgtccggatc 360gacaacccgg
ttggaggatt cagcttcgcg ctgcctgctg gctgggtgga gtctgacgcc
420gcccacttcg actacggttc agcactcctc agcaaaacca ccggggaccc
gccatttccc 480ggacagccgc cgccggtggc caatgacacc cgtatcgtgc
tcggccggct agaccaaaag 540ctttacgcca gcgccgaagc caccgactcc
aaggccgcgg cccggttggg ctcggacatg 600ggtgagttct atatgcccta
cccgggcacc cggatcaacc aggaaaccgt ctcgctcgac 660gccaacgggg
tgtctggaag cgcgtcgtat tacgaagtca agttcagcga tccgagtaag
720ccgaacggcc agatctggac gggcgtaatc ggctcgcccg cggcgaacgc
accggacgcc 780gggccccctc agcgctggtt tgtggtatgg ctcgggaccg
ccaacaaccc ggtggacaag 840ggcgcggcca aggcgctggc cgaatcgatc
cggcctttgg tcgccccgcc gccggcgccg 900gcaccggctc ctgcagagcc
cgctccggcg ccggcgccgg ccggggaagt cgctcctacc 960ccgacgacac
cgacaccgca gcggacctta ccggcctga 99941332PRTMycobacterium
tuberculosisDPEP 41Met His His His His His His Met His Gln Val Asp
Pro Asn Leu Thr1 5 10 15Arg Arg Lys Gly Arg Leu Ala Ala Leu Ala Ile
Ala Ala Met Ala Ser20 25 30Ala Ser Leu Val Thr Val Ala Val Pro Ala
Thr Ala Asn Ala Asp Pro35 40 45Glu Pro Ala Pro Pro Val Pro Thr Thr
Ala Ala Ser Pro Pro Ser Thr50 55 60Ala Ala Ala Pro Pro Ala Pro Ala
Thr Pro Val Ala Pro Pro Pro Pro65 70 75 80Ala Ala Ala Asn Thr Pro
Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala85 90 95Pro Pro Pro Ala Asp
Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro100 105 110Asn Ala Pro
Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser115 120 125Phe
Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp130 135
140Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe
Pro145 150 155 160Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile
Val Leu Gly Arg165 170 175Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu
Ala Thr Asp Ser Lys Ala180 185 190Ala Ala Arg Leu Gly Ser Asp Met
Gly Glu Phe Tyr Met Pro Tyr Pro195 200 205Gly Thr Arg Ile Asn Gln
Glu Thr Val Ser Leu Asp Ala Asn Gly Val210 215 220Ser Gly Ser Ala
Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys225 230 235 240Pro
Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn245 250
255Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu
Gly260 265 270Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala
Leu Ala Glu275 280 285Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala
Pro Ala Pro Ala Pro290 295 300Ala Glu Pro Ala Pro Ala Pro Ala Pro
Ala Gly Glu Val Ala Pro Thr305 310 315 320Pro Thr Thr Pro Thr Pro
Gln Arg Thr Leu Pro Ala 325 330
42 702 DNA Mycobacterium tuberculosis TbH4
42
cggcacgagg atcggtaccc cgcggcatcg gcagctgccg
attcgccggg tttccccacc 60cgaggaaagc cgctaccaga tggcgctgcc gaagtagggc
gatccgttcg cgatgccggc 120atgaacgggc ggcatcaaat tagtgcagga
acctttcagt ttagcgacga taatggctat 180agcactaagg aggatgatcc
gatatgacgc agtcgcagac cgtgacggtg gatcagcaag 240agattttgaa
cagggccaac gaggtggagg ccccgatggc ggacccaccg actgatgtcc
300ccatcacacc gtgcgaactc acggnggnta aaaacgccgc ccaacagntg
gtnttgtccg 360ccgacaacat gcgggaatac ctggcggccg gtgccaaaga
gcggcagcgt ctggcgacct 420cgctgcgcaa cgcggccaag gngtatggcg
aggttgatga ggaggctgcg accgcgctgg 480acaacgacgg cgaaggaact
gtgcaggcag aatcggccgg ggccgtcgga ggggacagtt 540cggccgaact
aaccgatacg ccgagggtgg ccacggccgg tgaacccaac ttcatggatc
600tcaaagaagc ggcaaggaag ctcgaaacgg gcgaccaagg cgcatcgctc
gcgcactgng 660gggatgggtg gaacacttnc accctgacgc tgcaaggcga cg
70243286PRTMycobacterium tuberculosisTbH4 43Gly Asp Ser Phe Trp Ala
Ala Ala Asp Gln Met Ala Arg Gly Phe Val1 5 10 15Leu Gly Ala Thr Ala
Gly Arg Thr Thr Leu Thr Gly Glu Gly Leu Gln20 25 30His Ala Asp Gly
His Ser Leu Leu Leu Asp Ala Thr Asn Pro Ala Val35 40 45Val Ala Tyr
Asp Pro Ala Phe Ala Tyr Glu Ile Gly Tyr Ile Xaa Glu50 55 60Ser Gly
Leu Ala Arg Met Cys Gly Glu Asn Pro Glu Asn Ile Phe Phe65 70 75
80Tyr Ile Thr Val Tyr Asn Glu Pro Tyr Val Gln Pro Pro Glu Pro Glu85
90 95Asn Phe Asp Pro Glu Gly Val Leu Gly Gly Ile Tyr Arg Tyr His
Ala100 105 110Ala Thr Glu Gln Arg Thr Asn Lys Xaa Gln Ile Leu Ala
Ser Gly Val115 120 125Ala Met Pro Ala Ala Leu Arg Ala Ala Gln Met
Leu Ala Ala Glu Trp130 135 140Asp Val Ala Ala Asp Val Trp Ser Val
Thr Ser Trp Gly Glu Leu Asn145 150 155 160Arg Asp Gly Val Val Ile
Glu Thr Glu Lys Leu Arg His Pro Asp Arg165 170 175Pro Ala Gly Val
Pro Tyr Val Thr Arg Ala Leu Glu Asn Ala Arg Gly180 185 190Pro Val
Ile Ala Val Ser Asp Trp Met Arg Ala Val Pro Glu Gln Ile195 200
205Arg Pro Trp Val Pro Gly Thr Tyr Leu Thr Leu Gly Thr Asp Gly
Phe210 215 220Gly Phe Ser Asp Thr Arg Pro Ala Gly Arg Arg Tyr
Phe Asn Thr Asp225 230 235 240Ala Glu Ser Gln Val Gly Arg Gly Phe
Gly Arg Gly Trp Pro Gly Arg245 250 255Arg Val Asn Ile Asp Pro Phe
Gly Ala Gly Arg Gly Pro Pro Ala Gln260 265 270Leu Pro Gly Phe Asp
Glu Gly Gly Gly Leu Arg Pro Xaa Lys 275 280 285
44 339 DNA Mycobacterium tuberculosis DPPD genomic DNA
44
atgaagttga agtttgctcg cctgagtact
gcgatactgg gttgtgcagc ggcgcttgtg 60tttcctgcct cggttgccag cgcagatcca
cctgacccgc atcagccgga catgacgaaa 120ggctattgcc cgggtggccg
atggggtttt ggcgacttgg ccgtgtgcga cggcgagaag 180taccccgacg
gctcgttttg gcaccagtgg atgcaaacgt ggtttaccgg cccacagttt
240tacttcgatt gtgtcagcgg cggtgagccc ctccccggcc cgccgccacc
gggtggttgc 300 ggtggggcaa ttccgtccga gcagcccaac gctccctga 339
45 112 PRT Mycobacterium tuberculosis DPPD
45
Met Lys Leu Lys Phe Ala
Arg Leu Ser Thr Ala Ile Leu Gly Cys Ala1 5 10 15Ala Ala Leu Val Phe
Pro Ala Ser Val Ala Ser Ala Asp Pro Pro Asp20 25 30Pro His Gln Pro
Asp Met Thr Lys Gly Tyr Cys Pro Gly Gly Arg Trp35 40 45Gly Phe Gly
Asp Leu Ala Val Cys Asp Gly Glu Lys Tyr Pro Asp Gly50 55 60Ser Phe
Trp His Gln Trp Met Gln Thr Trp Phe Thr Gly Pro Gln Phe65 70 75
80Tyr Phe Asp Cys Val Ser Gly Gly Glu Pro Leu Pro Gly Pro Pro Pro85
90 95Pro Gly Gly Cys Gly Gly Ala Ile Pro Ser Glu Gln Pro Asn Ala
Pro 100 105 110
46 921 DNA Artificial Sequence Description of Artificial Sequence: tri-fusion protein DPV-MTI-MSL (designated MTb31F) cDNA
46
cat atg cat cac cat cac cat cac gat ccc gtg gac gcg gtc att aac
48His Met His His His His His His Asp Pro Val Asp Ala Val Ile Asn1
5 10 15acc acc tgc aat tac ggg cag gta gta gct gcg ctc aac gcg acg
gat 96Thr Thr Cys Asn Tyr Gly Gln Val Val Ala Ala Leu Asn Ala Thr
Asp20 25 30ccg ggg gct gcc gca cag ttc aac gcc tca ccg gtg gcg cag
tcc tat 144Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser Pro Val Ala Gln
Ser Tyr35 40 45ttg cgc aat ttc ctc gcc gca ccg cca cct cag cgc gct
gcc atg gcc 192Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro Gln Arg Ala
Ala Met Ala50 55 60gcg caa ttg caa gct gtg ccg ggg gcg gca cag tac
atc ggc ctt gtc 240Ala Gln Leu Gln Ala Val Pro Gly Ala Ala Gln Tyr
Ile Gly Leu Val65 70 75 80gag tcg gtt gcc ggc tcc tgc aac aac tat
gag ctc atg acg att aat 288Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr
Glu Leu Met Thr Ile Asn85 90 95tac cag ttc ggg gac gtc gac gct cat
ggc gcc atg atc cgc gct cag 336Tyr Gln Phe Gly Asp Val Asp Ala His
Gly Ala Met Ile Arg Ala Gln100 105 110gcg gcg tcg ctt gag gcg gag
cat cag gcc atc gtt cgt gat gtg ttg 384Ala Ala Ser Leu Glu Ala Glu
His Gln Ala Ile Val Arg Asp Val Leu115 120 125gcc gcg ggt gac ttt
tgg ggc ggc gcc ggt tcg gtg gct tgc cag gag 432Ala Ala Gly Asp Phe
Trp Gly Gly Ala Gly Ser Val Ala Cys Gln Glu130 135 140ttc att acc
cag ttg ggc cgt aac ttc cag gtg atc tac gag cag gcc 480Phe Ile Thr
Gln Leu Gly Arg Asn Phe Gln Val Ile Tyr Glu Gln Ala145 150 155
160aac gcc cac ggg cag aag gtg cag gct gcc ggc aac aac atg gcg caa
528Asn Ala His Gly Gln Lys Val Gln Ala Ala Gly Asn Asn Met Ala
Gln165 170 175acc gac agc gcc gtc ggc tcc agc tgg gcc act agt atg
agc ctt ttg 576Thr Asp Ser Ala Val Gly Ser Ser Trp Ala Thr Ser Met
Ser Leu Leu180 185 190gat gct cat atc cca cag ttg gtg gcc tcc cag
tcg gcg ttt gcc gcc 624Asp Ala His Ile Pro Gln Leu Val Ala Ser Gln
Ser Ala Phe Ala Ala195 200 205aag gcg ggg ctg atg cgg cac acg atc
ggt cag gcc gag cag gcg gcg 672Lys Ala Gly Leu Met Arg His Thr Ile
Gly Gln Ala Glu Gln Ala Ala210 215 220atg tcg gct cag gcg ttt cac
cag ggg gag tcg tcg gcg gcg ttt cag 720Met Ser Ala Gln Ala Phe His
Gln Gly Glu Ser Ser Ala Ala Phe Gln225 230 235 240gcc gcc cat gcc
cgg ttt gtg gcg gcg gcc gcc aaa gtc aac acc ttg 768Ala Ala His Ala
Arg Phe Val Ala Ala Ala Ala Lys Val Asn Thr Leu245 250 255ttg gat
gtc gcg cag gcg aat ctg ggt gag gcc gcc ggt acc tat gtg 816Leu Asp
Val Ala Gln Ala Asn Leu Gly Glu Ala Ala Gly Thr Tyr Val260 265
270gcc gcc gat gct gcg gcc gcg tcg acc tat acc ggg ttc gat atc cat
864Ala Ala Asp Ala Ala Ala Ala Ser Thr Tyr Thr Gly Phe Asp Ile
His275 280 285cac act ggc ggc cgc tcg agc aga tcc ggc tgc taa
caaagcccga 910His Thr Gly Gly Arg Ser Ser Arg Ser Gly Cys290
295 aaggaagctg a 921
47 299 PRT Artificial Sequence Description of Artificial Sequence: tri-fusion protein DPV-MTI-MSL (designated MTb31F) cDNA
47
His Met His His His His His His Asp Pro Val Asp Ala
Val Ile Asn1 5 10 15Thr Thr Cys Asn Tyr Gly Gln Val Val Ala Ala Leu
Asn Ala Thr Asp20 25 30Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser Pro
Val Ala Gln Ser Tyr35 40 45Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro
Gln Arg Ala Ala Met Ala50 55 60Ala Gln Leu Gln Ala Val Pro Gly Ala
Ala Gln Tyr Ile Gly Leu Val65 70 75 80Glu Ser Val Ala Gly Ser Cys
Asn Asn Tyr Glu Leu Met Thr Ile Asn85 90 95Tyr Gln Phe Gly Asp Val
Asp Ala His Gly Ala Met Ile Arg Ala Gln100 105 110Ala Ala Ser Leu
Glu Ala Glu His Gln Ala Ile Val Arg Asp Val Leu115 120 125Ala Ala
Gly Asp Phe Trp Gly Gly Ala Gly Ser Val Ala Cys Gln Glu130 135
140Phe Ile Thr Gln Leu Gly Arg Asn Phe Gln Val Ile Tyr Glu Gln
Ala145 150 155 160Asn Ala His Gly Gln Lys Val Gln Ala Ala Gly Asn
Asn Met Ala Gln165 170 175Thr Asp Ser Ala Val Gly Ser Ser Trp Ala
Thr Ser Met Ser Leu Leu180 185 190Asp Ala His Ile Pro Gln Leu Val
Ala Ser Gln Ser Ala Phe Ala Ala195 200 205Lys Ala Gly Leu Met Arg
His Thr Ile Gly Gln Ala Glu Gln Ala Ala210 215 220Met Ser Ala Gln
Ala Phe His Gln Gly Glu Ser Ser Ala Ala Phe Gln225 230 235 240Ala
Ala His Ala Arg Phe Val Ala Ala Ala Ala Lys Val Asn Thr Leu245 250
255Leu Asp Val Ala Gln Ala Asn Leu Gly Glu Ala Ala Gly Thr Tyr
Val260 265 270Ala Ala Asp Ala Ala Ala Ala Ser Thr Tyr Thr Gly Phe
Asp Ile His275 280 285His Thr Gly Gly Arg Ser Ser Arg Ser Gly
Cys290 295
48 2168 DNA Artificial Sequence
Description of Artificial Sequence: tetra-fusion protein DPV-MTI-MSL-MTCC#2 (designated MTb71F)
48
cat atg cat cac cat cac cat cac gat ccc gtg gac gcg gtc att aac
48His Met His His His His His His Asp Pro Val Asp Ala Val Ile Asn1
5 10 15acc acc tgc aat tac ggg cag gta gta gct gcg ctc aac gcg acg
gat 96Thr Thr Cys Asn Tyr Gly Gln Val Val Ala Ala Leu Asn Ala Thr
Asp20 25 30ccg ggg gct gcc gca cag ttc aac gcc tca ccg gtg gcg cag
tcc tat 144Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser Pro Val Ala Gln
Ser Tyr35 40 45ttg cgc aat ttc ctc gcc gca ccg cca cct cag cgc gct
gcc atg gcc 192Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro Gln Arg Ala
Ala Met Ala50 55 60gcg caa ttg caa gct gtg ccg ggg gcg gca cag tac
atc ggc ctt gtc 240Ala Gln Leu Gln Ala Val Pro Gly Ala Ala Gln Tyr
Ile Gly Leu Val65 70 75 80gag tcg gtt gcc ggc tcc tgc aac aac tat
gag ctc atg acg att aat 288Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr
Glu Leu Met Thr Ile Asn85 90 95tac cag ttc ggg gac gtc gac gct cat
ggc gcc atg atc cgc gct cag 336Tyr Gln Phe Gly Asp Val Asp Ala His
Gly Ala Met Ile Arg Ala Gln100 105 110gcg gcg tcg ctt gag gcg gag
cat cag gcc atc gtt cgt gat gtg ttg 384Ala Ala Ser Leu Glu Ala Glu
His Gln Ala Ile Val Arg Asp Val Leu115 120 125gcc gcg ggt gac ttt
tgg ggc ggc gcc ggt tcg gtg gct tgc cag gag 432Ala Ala Gly Asp Phe
Trp Gly Gly Ala Gly Ser Val Ala Cys Gln Glu130 135 140ttc att acc
cag ttg ggc cgt aac ttc cag gtg atc tac gag cag gcc 480Phe Ile Thr
Gln Leu Gly Arg Asn Phe Gln Val Ile Tyr Glu Gln Ala145 150 155
160aac gcc cac ggg cag aag gtg cag gct gcc ggc aac aac atg gcg caa
528Asn Ala His Gly Gln Lys Val Gln Ala Ala Gly Asn Asn Met Ala
Gln165 170 175acc gac agc gcc gtc ggc tcc agc tgg gcc act agt atg
agc ctt ttg 576Thr Asp Ser Ala Val Gly Ser Ser Trp Ala Thr Ser Met
Ser Leu Leu180 185 190gat gct cat atc cca cag ttg gtg gcc tcc cag
tcg gcg ttt gcc gcc 624Asp Ala His Ile Pro Gln Leu Val Ala Ser Gln
Ser Ala Phe Ala Ala195 200 205aag gcg ggg ctg atg cgg cac acg atc
ggt cag gcc gag cag gcg gcg 672Lys Ala Gly Leu Met Arg His Thr Ile
Gly Gln Ala Glu Gln Ala Ala210 215 220atg tcg gct cag gcg ttt cac
cag ggg gag tcg tcg gcg gcg ttt cag 720Met Ser Ala Gln Ala Phe His
Gln Gly Glu Ser Ser Ala Ala Phe Gln225 230 235 240gcc gcc cat gcc
cgg ttt gtg gcg gcg gcc gcc aaa gtc aac acc ttg 768Ala Ala His Ala
Arg Phe Val Ala Ala Ala Ala Lys Val Asn Thr Leu245 250 255ttg gat
gtc gcg cag gcg aat ctg ggt gag gcc gcc ggt acc tat gtg 816Leu Asp
Val Ala Gln Ala Asn Leu Gly Glu Ala Ala Gly Thr Tyr Val260 265
270gcc gcc gat gct gcg gcc gcg tcg acc tat acc ggg ttc gat atc atg
864Ala Ala Asp Ala Ala Ala Ala Ser Thr Tyr Thr Gly Phe Asp Ile
Met275 280 285gat ttc ggg ctt tta cct ccg gaa gtg aat tca agc cga
atg tat tcc 912Asp Phe Gly Leu Leu Pro Pro Glu Val Asn Ser Ser Arg
Met Tyr Ser290 295 300ggt ccg ggg ccg gag tcg atg cta gcc gcc gcg
gcc gcc tgg gac ggt 960Gly Pro Gly Pro Glu Ser Met Leu Ala Ala Ala
Ala Ala Trp Asp Gly305 310 315 320gtg gcc gcg gag ttg act tcc gcc
gcg gtc tcg tat gga tcg gtg gtg 1008Val Ala Ala Glu Leu Thr Ser Ala
Ala Val Ser Tyr Gly Ser Val Val325 330 335tcg acg ctg atc gtt gag
ccg tgg atg ggg ccg gcg gcg gcc gcg atg 1056Ser Thr Leu Ile Val Glu
Pro Trp Met Gly Pro Ala Ala Ala Ala Met340 345 350gcg gcc gcg gca
acg ccg tat gtg ggg tgg ctg gcc gcc acg gcg gcg 1104Ala Ala Ala Ala
Thr Pro Tyr Val Gly Trp Leu Ala Ala Thr Ala Ala355 360 365ctg gcg
aag gag acg gcc aca cag gcg agg gca gcg gcg gaa gcg ttt 1152Leu Ala
Lys Glu Thr Ala Thr Gln Ala Arg Ala Ala Ala Glu Ala Phe370 375
380ggg acg gcg ttc gcg atg acg gtg cca cca tcc ctc gtc gcg gcc aac
1200Gly Thr Ala Phe Ala Met Thr Val Pro Pro Ser Leu Val Ala Ala
Asn385 390 395 400cgc agc cgg ttg atg tcg ctg gtc gcg gcg aac att
ctg ggg caa aac 1248Arg Ser Arg Leu Met Ser Leu Val Ala Ala Asn Ile
Leu Gly Gln Asn405 410 415agt gcg gcg atc gcg gct acc cag gcc gag
tat gcc gaa atg tgg gcc 1296Ser Ala Ala Ile Ala Ala Thr Gln Ala Glu
Tyr Ala Glu Met Trp Ala420 425 430caa gac gct gcc gtg atg tac agc
tat gag ggg gca tct gcg gcc gcg 1344Gln Asp Ala Ala Val Met Tyr Ser
Tyr Glu Gly Ala Ser Ala Ala Ala435 440 445tcg gcg ttg ccg ccg ttc
act cca ccc gtg caa ggc acc ggc ccg gcc 1392Ser Ala Leu Pro Pro Phe
Thr Pro Pro Val Gln Gly Thr Gly Pro Ala450 455 460ggg ccc gcg gcc
gca gcc gcg gcg acc caa gcc gcc ggt gcg ggc gcc 1440Gly Pro Ala Ala
Ala Ala Ala Ala Thr Gln Ala Ala Gly Ala Gly Ala465 470 475 480gtt
gcg gat gca cag gcg aca ctg gcc cag ctg ccc ccg ggg atc ctg 1488Val
Ala Asp Ala Gln Ala Thr Leu Ala Gln Leu Pro Pro Gly Ile Leu485 490
495agc gac att ctg tcc gca ttg gcc gcc aac gct gat ccg ctg aca tcg
1536Ser Asp Ile Leu Ser Ala Leu Ala Ala Asn Ala Asp Pro Leu Thr
Ser500 505 510gga ctg ttg ggg atc gcg tcg acc ctc aac ccg caa gtc
gga tcc gct 1584Gly Leu Leu Gly Ile Ala Ser Thr Leu Asn Pro Gln Val
Gly Ser Ala515 520 525cag ccg ata gtg atc ccc acc ccg ata ggg gaa
ttg gac gtg atc gcg 1632Gln Pro Ile Val Ile Pro Thr Pro Ile Gly Glu
Leu Asp Val Ile Ala530 535 540ctc tac att gca tcc atc gcg acc ggc
agc att gcg ctc gcg atc acg 1680Leu Tyr Ile Ala Ser Ile Ala Thr Gly
Ser Ile Ala Leu Ala Ile Thr545 550 555 560aac acg gcc aga ccc tgg
cac atc ggc cta tac ggg aac gcc ggc ggg 1728Asn Thr Ala Arg Pro Trp
His Ile Gly Leu Tyr Gly Asn Ala Gly Gly565 570 575ctg gga ccg acg
cag ggc cat cca ctg agt tcg gcg acc gac gag ccg 1776Leu Gly Pro Thr
Gln Gly His Pro Leu Ser Ser Ala Thr Asp Glu Pro580 585 590gag ccg
cac tgg ggc ccc ttc ggg ggc gcg gcg ccg gtg tcc gcg ggc 1824Glu Pro
His Trp Gly Pro Phe Gly Gly Ala Ala Pro Val Ser Ala Gly595 600
605gtc ggc cac gca gca tta gtc gga gcg ttg tcg gtg ccg cac agc tgg
1872Val Gly His Ala Ala Leu Val Gly Ala Leu Ser Val Pro His Ser
Trp610 615 620acc acg gcc gcc ccg gag atc cag ctc gcc gtt cag gca
aca ccc acc 1920Thr Thr Ala Ala Pro Glu Ile Gln Leu Ala Val Gln Ala
Thr Pro Thr625 630 635 640ttc agc tcc agc gcc ggc gcc gac ccg acg
gcc cta aac ggg atg ccg 1968Phe Ser Ser Ser Ala Gly Ala Asp Pro Thr
Ala Leu Asn Gly Met Pro645 650 655gca ggc ctg ctc agc ggg atg gct
ttg gcg agc ctg gcc gca cgc ggc 2016Ala Gly Leu Leu Ser Gly Met Ala
Leu Ala Ser Leu Ala Ala Arg Gly660 665 670acg acg ggc ggt ggc ggc
acc cgt agc ggc acc agc act gac ggc caa 2064Thr Thr Gly Gly Gly Gly
Thr Arg Ser Gly Thr Ser Thr Asp Gly Gln675 680 685gag gac ggc cgc
aaa ccc ccg gta gtt gtg att aga gag cag ccg ccg 2112Glu Asp Gly Arg
Lys Pro Pro Val Val Val Ile Arg Glu Gln Pro Pro690 695 700ccc gga
aac ccc ccg cgg taa gatttctaaa tccatcacac tggcggccgc 2163Pro Gly
Asn Pro Pro Arg705 710tcgag 2168
49 710 PRT Artificial Sequence
Description of Artificial Sequence: tetra-fusion protein DPV-MTI-MSL-MTCC#2 (designated MTb71F)
49
His Met His His His His
His His Asp Pro Val Asp Ala Val Ile Asn1 5 10 15Thr Thr Cys Asn Tyr
Gly Gln Val Val Ala Ala Leu Asn Ala Thr Asp20 25 30Pro Gly Ala Ala
Ala Gln Phe Asn Ala Ser Pro Val Ala Gln Ser Tyr35 40 45Leu Arg Asn
Phe Leu Ala Ala Pro Pro Pro Gln Arg Ala Ala Met Ala50 55 60Ala Gln
Leu Gln Ala Val Pro Gly Ala Ala Gln Tyr Ile Gly Leu Val65 70 75
80Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr Glu Leu Met Thr Ile Asn85
90 95Tyr Gln Phe Gly Asp Val Asp Ala His Gly Ala Met Ile Arg Ala
Gln100 105 110Ala Ala Ser Leu Glu Ala Glu His Gln Ala Ile Val Arg
Asp Val Leu115 120 125Ala Ala Gly Asp Phe Trp Gly Gly Ala Gly Ser
Val Ala Cys Gln Glu130 135 140Phe Ile Thr Gln Leu Gly Arg Asn Phe
Gln Val Ile Tyr Glu Gln Ala145 150 155 160Asn Ala His Gly Gln Lys
Val Gln Ala Ala Gly Asn Asn Met Ala Gln165 170 175Thr Asp Ser Ala
Val Gly Ser Ser Trp Ala Thr Ser Met Ser Leu Leu180 185 190Asp Ala
His Ile Pro Gln Leu Val Ala Ser Gln Ser Ala Phe Ala Ala195 200
205Lys Ala Gly Leu Met Arg His Thr Ile Gly Gln Ala Glu Gln Ala
Ala210 215 220Met Ser Ala Gln Ala Phe His Gln Gly Glu Ser Ser Ala
Ala Phe Gln225 230 235 240Ala Ala His Ala Arg Phe Val Ala Ala Ala
Ala Lys Val Asn Thr Leu245 250 255Leu Asp Val Ala Gln Ala Asn Leu
Gly Glu Ala Ala Gly Thr Tyr Val260 265 270Ala Ala Asp Ala Ala Ala
Ala Ser Thr Tyr Thr Gly Phe Asp Ile Met275 280 285Asp Phe Gly Leu
Leu Pro Pro Glu Val Asn Ser Ser Arg Met Tyr Ser290 295 300Gly Pro
Gly Pro Glu Ser Met Leu Ala Ala Ala Ala Ala Trp Asp Gly305 310 315
320Val Ala Ala Glu Leu Thr Ser Ala Ala Val Ser Tyr Gly Ser Val
Val325 330
335Ser Thr Leu Ile Val Glu Pro Trp Met Gly Pro Ala Ala Ala Ala
Met340 345 350Ala Ala Ala Ala Thr Pro Tyr Val Gly Trp Leu Ala Ala
Thr Ala Ala355 360 365Leu Ala Lys Glu Thr Ala Thr Gln Ala Arg Ala
Ala Ala Glu Ala Phe370 375 380Gly Thr Ala Phe Ala Met Thr Val Pro
Pro Ser Leu Val Ala Ala Asn385 390 395 400Arg Ser Arg Leu Met Ser
Leu Val Ala Ala Asn Ile Leu Gly Gln Asn405 410 415Ser Ala Ala Ile
Ala Ala Thr Gln Ala Glu Tyr Ala Glu Met Trp Ala420 425 430Gln Asp
Ala Ala Val Met Tyr Ser Tyr Glu Gly Ala Ser Ala Ala Ala435 440
445Ser Ala Leu Pro Pro Phe Thr Pro Pro Val Gln Gly Thr Gly Pro
Ala450 455 460Gly Pro Ala Ala Ala Ala Ala Ala Thr Gln Ala Ala Gly
Ala Gly Ala465 470 475 480Val Ala Asp Ala Gln Ala Thr Leu Ala Gln
Leu Pro Pro Gly Ile Leu485 490 495Ser Asp Ile Leu Ser Ala Leu Ala
Ala Asn Ala Asp Pro Leu Thr Ser500 505 510Gly Leu Leu Gly Ile Ala
Ser Thr Leu Asn Pro Gln Val Gly Ser Ala515 520 525Gln Pro Ile Val
Ile Pro Thr Pro Ile Gly Glu Leu Asp Val Ile Ala530 535 540Leu Tyr
Ile Ala Ser Ile Ala Thr Gly Ser Ile Ala Leu Ala Ile Thr545 550 555
560Asn Thr Ala Arg Pro Trp His Ile Gly Leu Tyr Gly Asn Ala Gly
Gly565 570 575Leu Gly Pro Thr Gln Gly His Pro Leu Ser Ser Ala Thr
Asp Glu Pro580 585 590Glu Pro His Trp Gly Pro Phe Gly Gly Ala Ala
Pro Val Ser Ala Gly595 600 605Val Gly His Ala Ala Leu Val Gly Ala
Leu Ser Val Pro His Ser Trp610 615 620Thr Thr Ala Ala Pro Glu Ile
Gln Leu Ala Val Gln Ala Thr Pro Thr625 630 635 640Phe Ser Ser Ser
Ala Gly Ala Asp Pro Thr Ala Leu Asn Gly Met Pro645 650 655Ala Gly
Leu Leu Ser Gly Met Ala Leu Ala Ser Leu Ala Ala Arg Gly660 665
670Thr Thr Gly Gly Gly Gly Thr Arg Ser Gly Thr Ser Thr Asp Gly
Gln675 680 685Glu Asp Gly Arg Lys Pro Pro Val Val Val Ile Arg Glu
Gln Pro Pro690 695 700Pro Gly Asn Pro Pro Arg705
710
50 588 DNA Mycobacterium tuberculosis
Ra35 N-terminus of MTB32A (Ra35FL)
50
gccccgccgg ccttgtcgca ggaccggttc gccgacttcc ccgcgctgcc cctcgacccg 60
tccgcgatgg tcgcccaagt ggggccacag gtggtcaaca tcaacaccaa actgggctac 120
aacaacgccg tgggcgccgg gaccggcatc gtcatcgatc ccaacggtgt cgtgctgacc 180
aacaaccacg tgatcgcggg cgccaccgac atcaatgcgt tcagcgtcgg ctccggccaa 240
acctacggcg tcgatgtggt cgggtatgac cgcacccagg atgtcgcggt gctgcagctg 300
cgcggtgccg gtggcctgcc gtcggcggcg atcggtggcg gcgtcgcggt tggtgagccc 360
gtcgtcgcga tgggcaacag cggtgggcag ggcggaacgc cccgtgcggt gcctggcagg 420
gtggtcgcgc tcggccaaac cgtgcaggcg tcggattcgc tgaccggtgc cgaagagaca 480
ttgaacgggt tgatccagtt cgatgccgcg atccagcccg gtgaggcggg cgggcccgtc 540
gtcaacggcc taggacaggt ggtcggtatg aacacggccg cgtcctag 588
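
The nucleotide listing above interleaves bases with running position counters (60, 120, ... 588). As a minimal sketch, assuming the listing is available as plain text, the counters and whitespace can be stripped and the recovered length compared with the declared length. The function names `clean_nucleotide_listing` and `matches_declared_length` are illustrative only and are not part of the application.

```python
import re


def clean_nucleotide_listing(raw: str) -> str:
    """Return the bare nucleotide string from a listing with position counters.

    Assumes the only non-sequence characters are digits and whitespace.
    """
    without_counters = re.sub(r"\d+", "", raw)          # drop 60, 120, ...
    sequence = re.sub(r"\s+", "", without_counters).lower()
    unexpected = set(sequence) - set("acgtn")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return sequence


def matches_declared_length(sequence: str, declared: int) -> bool:
    """Compare the recovered length with the length stated in the record header."""
    return len(sequence) == declared


# First 120 bases of the Ra35 listing, copied with their counters left in place.
fragment = (
    "gccccgccgg ccttgtcgca ggaccggttc gccgacttcc ccgcgctgcc\n"
    "cctcgacccg 60tccgcgatgg tcgcccaagt ggggccacag gtggtcaaca tcaacaccaa\n"
    "actgggctac 120"
)
cleaned = clean_nucleotide_listing(fragment)
print(len(cleaned), matches_declared_length(cleaned, 120))
```

Applied to the full record, the same check would be run against the declared length of 588.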
* * * * *
References