U.S. patent application number 10/080170 was filed with the patent office on 2002-02-22 and published on 2003-07-10 for comparative mycobacterial genomics as a tool for identifying targets for the diagnosis, prophylaxis or treatment of mycobacterioses.
Invention is credited to Cole, Stewart T..
Application Number | 20030129601 10/080170 |
Document ID | / |
Family ID | 23029995 |
Filed Date | 2002-02-22 |
United States Patent
Application |
20030129601 |
Kind Code |
A1 |
Cole, Stewart T. |
July 10, 2003 |
Comparative mycobacterial genomics as a tool for identifying
targets for the diagnosis, prophylaxis or treatment of
mycobacterioses
Abstract
The present invention is directed to a method of selection of
purified nucleotidic sequences or polynucleotides encoding proteins
or part of proteins carrying at least an essential function for the
survival or the virulence of mycobacterium species by a comparative
genomic analysis of the sequence of the genome of M. tuberculosis
aligned on the genome sequence of M. leprae. M. tuberculosis and
M. leprae marker polypeptides, nucleotides encoding the
polypeptides, and methods for using the nucleotides and the encoded
polypeptides are disclosed.
Inventors: |
Cole, Stewart T.; (Clamart,
FR) |
Correspondence
Address: |
FINNEGAN, HENDERSON, FARABOW, GARRETT &
DUNNER LLP
1300 I STREET, NW
WASHINGTON
DC
20006
US
|
Family ID: |
23029995 |
Appl. No.: |
10/080170 |
Filed: |
February 22, 2002 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60270123 | Feb 22, 2001 | |
Current U.S.
Class: |
435/6.16 |
Current CPC
Class: |
C07K 14/35 20130101;
A61P 31/06 20180101; C12Q 1/689 20130101; A61K 39/00 20130101; C07K
14/345 20130101; A61P 31/08 20180101; A61K 2039/505 20130101; C12N
15/1089 20130101 |
Class at
Publication: |
435/6 |
International
Class: |
C12Q 001/68 |
Claims
What is claimed is:
1. A method for identification and the selection of essential genes
for the survival or the virulence of mycobacterium species which
comprises: a. Aligning the genomic sequence of a first
mycobacterium species with the genomic sequence of a second
mycobacterium species, b. Selecting a polynucleotide sequence
highly conserved in both genomes with no counterparts in other
bacterial genomic sequences and which corresponds to an essential
gene for the survival or the virulence of mycobacterium species,
and c. Optionally, testing the polynucleotide selected in step b)
for its involvement in the virulence or the survival of a
mycobacterium species, said testing being based on the activation
or inactivation of said polynucleotide in a bacterial host or said
testing being based on the activity of the product of expression
of said polynucleotide in vivo or in vitro.
2. A method according to claim 1, wherein the first genomic
sequence of mycobacterium belongs to Mycobacterium
tuberculosis.
3. A method according to claim 1, wherein the second genomic
sequence of mycobacterium belongs to Mycobacterium leprae.
4. A method according to any one of claims 1 to 3, wherein the
complete genomic sequence of said mycobacterium species is
analysed.
5. A method for the identification and the selection in silico of
essential genes for the survival or the virulence of mycobacterium
species according to any one of claims 1 to 4.
6. Purified polynucleotide molecule obtained by the method
according to any one of claims 1 to 5.
7. A purified polynucleotide molecule of claim 6 which encodes
essential proteins or fragments of proteins of Mycobacterium
species.
8. A purified polynucleotide molecule of a formula selected from
the group consisting of polynucleotidic sequences, which encode for
polypeptides and regulatory sequences essential for the virulence
and/or the survival of mycobacterium which are, on the one hand,
specific to Mycobacterium tuberculosis and, on the other hand,
specific to Mycobacterium leprae, that is to say, said
polynucleotidic sequences are not found in publicly accessible
banks of non-Mycobacterium tuberculosis and non-Mycobacterium
leprae genome.
9. A purified polynucleotide molecule according to claim 8 obtained
by the method according to any one of claims 1 to 5.
10. A purified polynucleotide molecule that hybridizes to either
strand of a denatured, double-stranded DNA comprising the purified
polynucleotide sequence according to claims 6 to 9 under conditions
of moderate stringency in 50% formamide and 6× SSC at
42 °C with washing conditions of 60 °C,
0.5× SSC, 0.1% SDS.
11. The purified polynucleotide molecule as claimed in claim 10,
wherein said purified polynucleotide molecule is derived by
mutagenesis.
12. A purified polynucleotide molecule degenerate from the purified
polynucleotide molecule according to any one of claims 6 to 9 as a
result of the genetic code.
13. A purified polynucleotide according to any one of claims 6 to 9
which encodes M. tuberculosis or M. leprae marker polypeptide.
14. A purified polynucleotide molecule according to any one of
claims 6 to 9 which encodes an allelic variant of M. tuberculosis
or M. leprae marker polypeptide according to claim 13.
15. A purified polynucleotide molecule according to any one of
claims 6 to 9 which encodes a similar sequence of M. tuberculosis
or M. leprae marker polypeptide DNA according to claim 13.
16. A purified polypeptide encoded by a polynucleotide molecule
according to claim 8.
17. A purified polypeptide of a formula selected from the group
consisting of SEQ ID NO: X to SEQ ID NO: Y.
18. A purified polypeptide according to claim 17 encoded by a
polynucleotide molecule according to claims 6 or 9.
19. A purified polypeptide according to claim 16 in
non-glycosylated form.
20. A purified polypeptide according to claim 16 in glycosylated
form.
21. A purified polypeptide according to claim 17 in
non-glycosylated form.
22. A purified polypeptide according to claim 17 in glycosylated
form.
23. A purified polypeptide according to claim 18 in
non-glycosylated form.
24. A purified polypeptide according to claim 18 in glycosylated
form.
25. Process of screening of active molecules comprising: a.
Preparation of at least one purified polynucleotidic molecule or
fragment thereof according to claims 6 to 9 in an acceptable
medium, b. Contacting the purified polynucleotide sequence or a
fragment thereof corresponding to said essential gene of interest
with an active molecule to be tested, and c. Selecting the active
molecule by inhibition or activation of the activity of the
purified polynucleotide compared to the standard activity of the
essential gene in absence of said active molecule.
26. Process of screening of an active molecule comprising: a.
Preparation of at least one purified polypeptide according to
claims 16 to 24 or fragment thereof to be used as a target in an
acceptable medium, b. Contacting said purified polypeptide or
fragment thereof obtained in step a) with an active molecule to be
tested, and c. Selecting the active molecule by inhibition or
activation of the activity of the purified polypeptide obtainable
after expression of the essential gene selected according to claim
1 and compared to the standard activity of said polypeptide.
27. A recombinant BAC containing a fragment of M. tuberculosis
genome deposited on Feb. 20, 2001 at the C.N.C.M. under the
accession number I-2625.
28. A recombinant BAC containing a fragment of M. tuberculosis
genome deposited on Feb. 20, 2001 at the C.N.C.M. under the
accession number I-2626.
29. A recombinant BAC containing a fragment of M. tuberculosis
genome deposited on Feb. 20, 2001 at the C.N.C.M. under the
accession number I-2627.
30. A recombinant BAC containing a fragment of M. tuberculosis
genome deposited on Feb. 20, 2001 at the C.N.C.M. under the
accession number I-2628.
31. A recombinant BAC containing a fragment of M. tuberculosis
genome deposited on Feb. 20, 2001 at the C.N.C.M. under the
accession number I-2629.
32. A recombinant cosmid containing a fragment of M. leprae genome
deposited on Feb. 21, 2001 at the C.N.C.M. under the accession
number I-2632.
33. A recombinant cosmid containing a fragment of M. leprae genome
deposited on Feb. 21, 2001 at the C.N.C.M. under the accession
number I-2633.
34. A recombinant purified vector that directs the expression of a
polynucleotide molecule selected from the group consisting of
purified polynucleotide molecule selected according to claims 1 or
6, 7, 8, 9, 13, 14 and 15.
35. A recombinant purified vector that directs the expression of a
purified polynucleotide molecule according to claims 10 or 11.
36. A recombinant purified vector that directs the expression of a
purified polynucleotide molecule according to claim 12.
37. A recombinant purified vector containing a part of the
polynucleotide insert of claims 27 to 32 or claims 33 to 35 which
is a plasmid.
38. A host cell transfected or transduced with the vector of claim
34.
39. A host cell transfected or transduced with the vector of claim
35.
40. A host cell transfected or transduced with the vector of claim
36.
41. A host cell transfected or transduced with the plasmid of claim
37.
42. A method for the production of mycobacterium purified marker
polypeptide comprising culturing a host cell of claim 38 under
conditions promoting expression, and recovering the polypeptide
from the culture medium.
43. A method for the production of mycobacterium purified marker
polypeptide comprising culturing a host cell of claim 39 under
conditions promoting expression, and recovering the polypeptide
from the culture medium.
44. A method for the production of mycobacterium purified marker
polypeptide comprising culturing a host cell of claim 40 under
conditions promoting expression, and recovering the polypeptide
from the culture medium.
45. A method for the production of mycobacterium purified marker
polypeptide comprising culturing a host cell of claim 41 under
conditions promoting expression, and recovering the polypeptide
from the culture medium.
46. The method according to any one of claims 42 to 45, wherein the
host cell is selected from the group consisting of bacterial cells,
yeast cells, plant cells, and mammalian cells.
47. An immunological complex comprising a Mycobacterium purified
marker polypeptide produced by a method according to any one of
claims 42 to 45 and an antibody that specifically recognizes said
polypeptide.
48. A composition comprising at least a mycobacterium purified
polypeptide marker produced by a method according to any one of
claims 42 to 45.
49. A method for detecting infection by mycobacteria, said method
comprises: a. Providing a composition according to claim 48 with a
biological sample suspected to be infected with a mycobacterium, b.
Assaying for the presence of said mycobacterium, and c. Optionally,
detecting the presence of mycobacteria in said biological sample if
infected.
50. The method of claim 49, wherein in step b) the assay is performed
by electrophoresis or by immunoassay with antibodies that are
immunologically reactive with M. tuberculosis and/or M. leprae.
51. An in vitro diagnostic method for the detection of the presence
or the absence of antibodies which bind to an antigen or fragment
of antigen comprising a mycobacterium purified polypeptide molecule
according to any one of claims 16 to 24, wherein the method
comprises contacting the antigen or fragment of antigen with a
biological fluid for a time and under conditions sufficient for the
antigen and antibodies in the biological fluid to form an
immunological complex, detecting the formation of the complex, and
optionally measuring the formation of the immunological complex.
52. The method as claimed in claim 51, wherein the formation of the
immunological complex is detected by immunoassay method based on
western blot technique, ELISA, indirect immuno-fluorescence assay,
or immunoprecipitation assay.
53. An in vitro diagnostic method for the detection of the presence
or the absence of antibodies which bind to an antigen or fragment
of antigen comprising a mycobacterium purified polypeptide molecule
obtained by a method according to any one of claims 42 to 45,
wherein the method comprises contacting the antigen or fragment of
antigen with a biological fluid for a time and under conditions
sufficient for the antigen and antibodies in the biological fluid
to form an immunological complex, detecting the formation of the
complex, and optionally measuring the formation of the
immunological complex.
54. The method as claimed in claim 53, wherein the formation of the
immunological complex is detected by immunoassay method based on
western blot technique, ELISA, indirect immuno-fluorescence assay,
or immunoprecipitation assay.
55. A kit for the in vitro diagnostics of mycobacterium infections
comprising: a. A mycobacterium purified polypeptide molecule
according to any one of claims 16 to 24 or mixture thereof, b.
Antibodies capable of forming an immunological complex with said
polypeptides, and c. Acceptable medium to permit the detection of
the formation of the complex thereof.
56. A kit as claimed in claim 55 useful for the detection of M.
tuberculosis infections.
57. A kit as claimed in claim 55 useful for the detection of M.
leprae infections.
58. An immunogenic composition comprising at least a purified
polypeptide according to any one of claims 16 to 24 in an amount
sufficient to induce an immunogenic or protective response in vivo,
and a pharmaceutically acceptable carrier therefor.
59. A polynucleotidic probe comprising a purified polynucleotide
molecule or fragment thereof according to any one of claims 6 to
15.
60. A polynucleotidic probe which is complementary to the full
length sequence of a purified nucleic acid that hybridizes under
conditions of moderate stringency in 50% formamide and 6× SSC
at 42 °C with washing conditions of 60 °C,
0.5× SSC, 0.1% SDS with a nucleic acid encoding a purified
polypeptide according to any one of claims 6 to 24.
61. A method for the detection of the presence or the absence of
mycobacteria in a sample comprising: a. contacting a sample
suspected to contain genetic material of mycobacteria with at least
one probe according to claims 59 or 60, b. Detecting the
hybridization under conditions of moderate stringency in 50%
formamide and 6× SSC at 42 °C with washing conditions
of 60 °C, 0.5× SSC, 0.1% SDS.
62. A method for the detection of the presence or the absence of
mycobacteria according to claim 61, wherein said method is specific
for the detection of M. tuberculosis infections.
63. A method for the detection of the presence or the absence of
mycobacteria according to claim 61, wherein said method is specific
for the detection of M. leprae infections.
64. A method for the detection of the presence or the absence of
mycobacteria according to any one of claims 61 to 63, wherein the
sample contains nucleic acids of at least one microorganism other
than the mycobacteria.
65. A method of selection according to claim 1, which involves the
comparison of the genetic information of different types of
organisms, wherein the method comprises: a. Providing a database
including sequence libraries for a plurality of types of organism,
said libraries having multiple genomic sequences, b. Providing one
or more probe sequences according to claims 59 or 60, c.
Determining homologous matches between one or more of said probe
sequences and one or more sequences of said sequences in said
genomic libraries; and d. Displaying the results of said
determination.
66. A method according to claims 1 to 4, wherein the genomic
sequence of a first mycobacterium species is the recombinant BAC
deposited at the C.N.C.M. according to claim 27.
67. A method according to claims 1 to 4, wherein the genomic
sequence of a first mycobacterium species is the recombinant BAC
deposited at the C.N.C.M. according to claim 28.
68. A method according to claims 1 to 4, wherein the genomic
sequence of a first mycobacterium species is the recombinant BAC
deposited at the C.N.C.M. according to claim 29.
69. A method according to claims 1 to 4, wherein the genomic
sequence of a first mycobacterium species is the recombinant BAC
deposited at the C.N.C.M. according to claim 30.
70. A method according to claims 1 to 4, wherein the genomic
sequence of a first mycobacterium species is the recombinant BAC
deposited at the C.N.C.M. according to claim 31.
71. A method according to claims 1 to 4, wherein the genomic
sequence of the second mycobacterium species is the recombinant cosmid
deposited at the C.N.C.M. according to claim 32.
72. A method according to claims 1 to 4, wherein the genomic
sequence of the second mycobacterium species is the recombinant cosmid
deposited at the C.N.C.M. according to claim 33.
73. An in vitro diagnostic method for the detection of the presence
or the absence of essential nucleotidic sequences for the survival
or the virulence in mycobacterium by hybridization or amplification
of said specific sequence comprising: a. Providing a composition
comprising a probe according to claims 59 or 60 with a sequence
library of interest to be tested in an acceptable medium and in
sufficient time to obtain an hybridization and/or an amplification
of said sequence, b. Purifying the sequence which hybridizes with
said probe; and c. Optionally, quantifying said sequence.
74. A method according to claim 1, wherein the first genomic
sequence of mycobacterium belongs to Mycobacterium microti.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is based on and claims the benefit of U.S.
Provisional application Ser. No. 60/270,123, filed Feb. 22, 2001
(attorney docket no. 03495.6061). The entire disclosure of this
application is relied upon and incorporated by reference
herein.
BACKGROUND OF THE INVENTION
[0002] The present invention is directed to a method of selection
of purified nucleotidic sequences or polynucleotides encoding
proteins or part of proteins carrying at least an essential
function for the survival or the virulence of mycobacterium species
by a comparative genomic analysis of the sequence of the genome of
M. tuberculosis aligned on the genome sequence of M. leprae. The
selection, by the method of the invention, of the nucleotidic or
peptidic sequences of interest which encode said essential
functions of mycobacterium leads to the identification and
characterization of specific antigens or regulatory sequences,
said antigens being potential candidates for an immunogenic or
vaccine composition and also being useful for determining novel
potential drug targets for the pharmaceutical industry. The
molecules having essential functions encoded by these genes or
corresponding to regulatory elements also represent new, highly
specific targets for chemotherapy. The sequences of the
polynucleotides according to the invention have been maintained
during the evolution of the mycobacterium and have therefore been
highly conserved in pathogenic mycobacterium species. The invention is
directed to purified nucleic acid selected by the method of the
invention as well as the purified polypeptides with essential
functions for the survival or the virulence of mycobacterium
species encoded by these sequences. In a preferred embodiment, the
invention is directed to genes that code for essential proteins for
which the functions have been attributed. The invention is also
directed to a process for the production of recombinant
polypeptides and chimeric polypeptides comprising them, antibodies
generated against these polypeptides, immunogenic or vaccine
compositions comprising at least one polypeptide useful as a
protective antigen or capable of inducing a protective response in
vivo or in vitro against mycobacterium infections,
immunotherapeutic compositions comprising at least such a
polypeptide according to the invention, and the use of such nucleic
acids and polypeptides in diagnostic methods, vaccines, kits, or
antimicrobial therapy.
[0003] To illustrate the new approach of comparative mycobacterial
genomics for identifying essential molecules, such as regulatory
nucleotidic sequences and proteins, for the survival or the
virulence of mycobacterium species, the inventors provide several
examples, which do not limit the scope of the present invention. A
comparative genomic analysis, which permitted the inventors to
select the sequences encoding essential molecules, such as
regulatory nucleotidic sequences and proteins, for the survival or
the virulence of mycobacterium species, has been made by analysis of
the complete genome sequences of both Mycobacterium tuberculosis and
Mycobacterium leprae. The whole genome comparisons also led to the
identification of genes that are present in both M. tuberculosis
and M. leprae but have no counterparts elsewhere. The polypeptides
having essential functions for the survival or the virulence of
mycobacterium species are characterized by at least 40% identity at
the protein level and at least 70% identity at the gene level
between both genomic sequences. The amino acid sequences have been
compared using the program GAP of the "GCG" (Genetics Computer
Group) Wisconsin Sequence Analysis Package™ (Program Manual,
UNIX), which uses the algorithm of Needleman and Wunsch (J. Mol.
Biol. 48:443, 1970). The parameters are chosen as follows:
[0004] For amino acid comparisons:
[0005] Gap penalty: 5
[0006] Gap extension penalty: 0.30
[0007] Length: the sequences to be compared are the following: XXX
SEQ ID NO:XXX, having XXX amino acids.
[0008] For nucleotide comparisons:
[0009] Gap penalty: 50
[0010] Gap extension penalty: 3
[0011] The parameters may also be adapted case by case.
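By way of illustration only, a global alignment of the Needleman and Wunsch type, followed by a percent identity calculation, can be sketched in Python as follows. This sketch is not the GAP program: it assumes a single linear gap penalty and a plain match/mismatch score in place of a scoring matrix with separate gap creation and gap extension penalties. The function names, the scoring values and the convention of dividing the number of identical positions by the alignment length are assumptions made for the example; the 40% and 70% thresholds are those stated above.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return the two gapped strings.

    Assumption: linear gap penalty (one value per gap position).
    """
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to rebuild one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a, aln_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    if not aln_a:
        return 0.0
    same = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    return 100.0 * same / len(aln_a)


def meets_threshold(identity, level):
    """Apply the thresholds stated above: 40% (protein) or 70% (gene)."""
    minimum = {"protein": 40.0, "gene": 70.0}[level]
    return identity >= minimum


aln_a, aln_b = needleman_wunsch("GATTACA", "GATACA")
identity = percent_identity(aln_a, aln_b)
```

In this toy comparison the shorter sequence receives a single gap, and the resulting identity can then be passed to `meets_threshold` with the level `"gene"` or `"protein"` as appropriate.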
[0012] Other techniques for the comparison of sequences are known
to the person skilled in the art. Reference may be made to the
algorithm of Smith and Waterman (Adv. Appl. Math. 2: 482, 1981),
the similarity search method of Pearson and Lipman (Proc. Natl.
Acad. Sci. USA 85: 2444, 1988), and Zhang et al., "A greedy
algorithm for aligning DNA sequences" (J. Comp. Biol. 2000,
Feb-Apr. 7 (1-2) p203-214). These algorithms are implemented in
software tools (GAP, BLASTP, BLASTN, BLASTX, BESTFIT, FASTA and
TFASTA in the Wisconsin Genetics Software Package, Genetics
Computer Group, 575 Science Dr., Madison, Wis.).
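As a non-limiting illustration of a local alignment of the Smith and Waterman type, the following Python sketch finds the best-scoring pair of matching subsequences in two sequences. It is a simplified teaching version and not the BESTFIT, FASTA or BLAST programs; the function name, the scoring values and the linear gap penalty are assumptions made for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Assumption: linear gap penalty; scores are clamped at zero so that an
    alignment may start and end anywhere in either sequence.
    """
    n, m = len(a), len(b)
    h = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(0,
                          h[i - 1][j - 1] + sub,
                          h[i - 1][j] + gap,
                          h[i][j - 1] + gap)
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)
    # Trace back from the best cell until a zero score is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and h[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif h[i][j] == h[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


result = smith_waterman("TTTTGATTACATTTT", "CCCGATTACACCC")
```

Unlike a global alignment, the unrelated flanking regions of the two toy sequences are ignored, and only the shared core is reported.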
[0013] The recombinant clones carrying DNA from Mycobacterium
tuberculosis and Mycobacterium leprae strains, containing genomic
sequences of said bacteria, have been deposited at the Collection
Nationale de Cultures de Microorganismes (C.N.C.M.), of Institut
Pasteur, 28, rue du Docteur Roux, F-75724 Paris cedex 15, France,
and are designated as follows:
[0014] H37Rv genomic library, deposited on Nov. 19, 1997 . . .
under the accession number I-1945;
[0015] A recombinant BAC containing a fragment of M. tuberculosis
genome deposited at the C.N.C.M. under the accession number
I-2625.
[0016] A recombinant BAC containing a fragment of M. tuberculosis
genome deposited on Feb. 20, 2001, at the C.N.C.M. under the
accession number I-2626.
[0017] A recombinant BAC containing a fragment of M. tuberculosis
genome deposited on Feb. 20, 2001, at the C.N.C.M. under the
accession number I-2627.
[0018] A recombinant BAC containing a fragment of M. tuberculosis
genome deposited on Feb. 20, 2001, at the C.N.C.M. under the
accession number I-2628.
[0019] A recombinant BAC containing a fragment of M. tuberculosis
genome deposited on Feb. 20, 2001, at the C.N.C.M. under the
accession number I-2629.
[0020] A recombinant cosmid containing a fragment of M. leprae
genome deposited on Feb. 21, 2001, at the C.N.C.M. under the
accession number I-2632.
[0021] A recombinant cosmid containing a fragment of M. leprae
genome deposited on Feb. 21, 2001, at the C.N.C.M. under the
accession number I-2633.
[0022] Leprosy, one of the oldest recorded diseases, remains a
major public health problem. Although prevalence has been reduced
extensively by WHO multidrug therapy and vaccination with BCG (1, 2),
the incidence of the disease remains worrying with more than
690,000 new cases annually (3) in the world. Leprosy was common in
Europe in the Middle Ages but gradually disappeared.
[0023] In 1873, in the first convincing association of a
microorganism with a human disease, Armauer Hansen (4) discovered the
leprosy bacillus in skin biopsies but failed to culture
Mycobacterium leprae. A century later, the nine-banded armadillo (5)
was used as a surrogate host, enabling large quantities of the
bacillus to be isolated for biochemical and physiological studies.
Subsequent efforts to demonstrate multiplication in synthetic media
have been equally fruitless, although metabolic activity can be
detected (6). The exceptionally slow growth of the bacillus, which has
a doubling time of about 14 days (7), may contribute to these
failures.
[0024] The means of transmission of leprosy is uncertain but, like
tuberculosis, it is believed to be spread by the respiratory route
since lepromatous patients harbour bacilli in their nasal passages.
The bacterium accumulates principally in the extremities of the
body where it resides within macrophages and infects the Schwann
cells of the peripheral nervous system. Lack of myelin production
by infected Schwann cells, and their destruction by host-mediated
immune reactions, leads to nerve damage, sensory loss and the
disfiguration that, sadly, are hallmarks of leprosy.
[0025] There are no data or technical information in the prior art
which permit the specific selection of potential new targets and
protective antigens for new drugs and vaccine compositions to treat
and prevent mycobacterial diseases, particularly tuberculosis and
leprosy. Furthermore, there is a need for the development of new
tools for the selection of genes which encode essential
proteins or regulatory nucleotidic sequences involved in the
survival or infection of mycobacterium species and which are useful
for the design of antituberculosis drugs and vaccines based on the
knowledge of comparative mycobacterial genomics.
SUMMARY OF THE INVENTION
[0026] The invention aids in fulfilling these needs in the art. The
method according to the invention has the advantage of drastically
reducing the number of potential new targets and protective
antigens by giving for the first time an exhaustive description of
conserved proteins in the tuberculosis and leprosy bacilli. The
isolated polynucleotides and proteins described in the present
invention, which are highly conserved in both genomic sequences of
M. tuberculosis and M. leprae, are by this characteristic essential
for the survival or the virulence of these mycobacteria in the
host. The identification of antigens and potentially therapeutic
targets has been made on an evolutionary basis by a method of
comparative genomic analysis.
[0027] This invention provides a method for the identification and
the selection of essential genes for the survival or the virulence
of mycobacterium species which comprises:
[0028] a. Aligning the genomic sequence of a first mycobacterium
species with the genomic sequence of a second
mycobacterium species,
[0029] b. Selecting a polynucleotide sequence highly conserved
in both genomes with no counterparts in other bacterial genomic
sequences and which corresponds to an essential gene for the
survival or the virulence of mycobacterium species, and
[0030] c. Optionally, testing the polynucleotide selected in
step b) for its involvement in the virulence or the survival
of a mycobacterium species, said testing being based on the
activation or inactivation of said polynucleotide in a bacterial
host or said testing being based on the activity of the product of
expression of said polynucleotide in vivo or in vitro.
[0031] This invention provides also a method for the identification
and the selection in silico of essential genes for the survival or
the virulence of mycobacterium species which comprises the
following steps:
[0032] a. Aligning the genomic sequence of a first mycobacterium
species with the genomic sequence of a second
mycobacterium species, and
[0033] b. Selecting a polynucleotide sequence highly conserved in
both genomes with no counterparts in other bacterial genomic
sequences and which corresponds to an essential gene for the
survival or the virulence of mycobacterium species.
[0034] Optionally, the polynucleotide selected in step b) can be
tested for its involvement in the virulence or the survival of a
mycobacterium species, said testing being based
on the activation or inactivation of said polynucleotide in a
bacterial host or said testing being based on the activity of the
product of expression of said polynucleotide in vivo or in
vitro.
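By way of a non-limiting illustration, the selection of step b) of the in silico method can be sketched in Python as a simple filter over alignment results that are assumed to have been computed beforehand in step a). The record type, the field names, the gene labels and the set of genes having counterparts in other bacteria are all hypothetical; the default thresholds of 40% protein identity and 70% gene identity are those given in the Background section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One pair of aligned genes from the two genomes (hypothetical record)."""
    first_gene: str          # label of the gene in the first genome
    second_gene: str         # label of the gene in the second genome
    protein_identity: float  # percent identity at the protein level
    gene_identity: float     # percent identity at the nucleotide level


def select_candidates(pairs, genes_with_other_counterparts,
                      min_protein=40.0, min_gene=70.0):
    """Keep pairs that are conserved in both genomes and have no
    counterpart in other bacterial genomic sequences."""
    selected = []
    for pair in pairs:
        conserved = (pair.protein_identity >= min_protein
                     and pair.gene_identity >= min_gene)
        specific = pair.first_gene not in genes_with_other_counterparts
        if conserved and specific:
            selected.append(pair)
    return selected


# Toy input: labels and identity values are invented for the example.
pairs = [
    AlignedPair("first_gene_1", "second_gene_1", 82.0, 78.0),
    AlignedPair("first_gene_2", "second_gene_2", 35.0, 61.0),  # poorly conserved
    AlignedPair("first_gene_3", "second_gene_3", 90.0, 85.0),  # found elsewhere
]
found_elsewhere = {"first_gene_3"}

candidates = select_candidates(pairs, found_elsewhere)
```

The optional testing step is experimental and is not modelled here; the filter only narrows the list of candidates that would be submitted to such testing.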
[0035] The method according to the invention also permits the
determination of the polynucleotidic sequences which encode
polypeptides and regulatory sequences essential for the virulence
and/or the survival of mycobacterium and which are, on the one hand,
specific to Mycobacterium tuberculosis and, on the other hand,
specific to Mycobacterium leprae, that is to say, said
polynucleotidic sequences are not found in publicly accessible
banks of non-Mycobacterium tuberculosis and non-Mycobacterium
leprae genomes.
[0036] A gene according to the invention is a defined nucleotidic
sequence, which contains an open reading frame with base
composition, codon usage, GC skew and other features typical of a
microorganism, preferably a mycobacterium. The definition of a gene
according to the invention comprises nucleotidic sequences which
encode an antigen or a fragment thereof, nucleotidic sequences
which encode a polypeptide with an essential function in
the host, and nucleotidic sequences which encode a polypeptide with
a regulatory function in the bacteria, for example in DNA
expression or in transcription. An essential function of a
polypeptide in bacteria according to the invention comprises
functions implicated in the survival or in the virulence of the
bacteria.
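As an illustration only, some of the sequence features mentioned in this definition (open reading frame, base composition and GC skew) can be computed with the following Python sketch. It is deliberately simplified: open reading frames are searched on the forward strand only, with ATG as the sole start codon and TAA, TAG and TGA as stop codons, and GC skew is taken as (G - C) / (G + C) over non-overlapping windows. The function names and these conventions are assumptions made for the example, and codon usage is not covered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq, window):
    """(G - C) / (G + C) for each non-overlapping window of the given size."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        values.append((g - c) / (g + c) if g + c else 0.0)
    return values


def find_orfs(seq, min_codons=3):
    """Return (start, end) index pairs of forward-strand open reading frames.

    Each frame runs from an ATG to the first in-frame stop codon; the stop
    codon is included in the span and in the codon count.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


toy = "CCATGAAATTTTAGGG"
orfs = find_orfs(toy)
skews = gc_skew("GGGCAAAA", 4)
```

A fuller treatment would also scan the reverse complement and allow the alternative start codons used by bacteria.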
[0037] In a preferred embodiment the first genomic sequence of
mycobacterium belongs to Mycobacterium tuberculosis.
Mycobacterium microti is a mycobacterium which infects the vole. It
has a genome sequence close to the sequence of Mycobacterium
tuberculosis (Cole et al. (1998, Nature, 393, 537-544)) and
therefore, in a second preferred embodiment, the first genomic
sequence of mycobacterium belongs to Mycobacterium
microti.
[0038] In another preferred embodiment the second genomic sequence
of mycobacterium belongs to Mycobacterium leprae.
[0039] In a preferred embodiment of the method according to the
invention, the complete genomic sequence of said
mycobacterium species is analysed. This invention provides
a purified polynucleotide molecule obtained by the method
according to the invention.
[0040] Further, this invention provides a purified polynucleotide
molecule according to the invention which encodes essential
proteins or fragments of proteins of Mycobacterium species.
[0041] The invention also encompasses a purified polynucleotide
molecule of a formula selected from the group consisting of
polynucleotidic sequences, which encode for polypeptides and
regulatory sequences essential for the virulence and/or the
survival of mycobacterium which are, on the one hand, specific to
Mycobacterium tuberculosis and, on the other hand, specific to
Mycobacterium leprae, that is to say, said polynucleotidic
sequences are not found in publicly accessible banks of
non-Mycobacterium tuberculosis and non-Mycobacterium leprae genome.
In a preferred embodiment, this purified polynucleotide is obtained
by the method according to the invention.
[0042] The invention encompasses a purified polynucleotide
molecule that hybridizes to either strand of a denatured,
double-stranded DNA comprising the purified polynucleotide sequence
according to the invention under conditions of moderate stringency
in 50% formamide and 6× SSC at 42 °C with washing
conditions of 60 °C, 0.5× SSC, 0.1% SDS.
[0043] This invention provides a purified polypeptide of a formula
selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO:
644.
[0044] This invention also provides a purified nucleic acid
molecule encoding a polypeptide of a formula selected from the
group consisting of SEQ ID NO:1 to SEQ ID NO:644.
[0045] The nucleic acid molecules of the invention, which include
DNA and RNA, are referred to herein as "M. tuberculosis and M.
leprae marker nucleic acids" or "M. tuberculosis and M. leprae
marker DNA". The polypeptides encoded by these molecules, which are
referred to herein as "M. tuberculosis and M. leprae marker
polypeptides," have formulas selected from the group consisting of
SEQ ID NO:1 to SEQ ID NO:644.
[0046] Further, this invention provides a purified nucleic acid
molecule that hybridizes to either strand of a denatured,
double-stranded DNA comprising the nucleic acid molecule encoding
the polypeptide of a formula selected from the group consisting of
SEQ ID NO: 1 to SEQ ID NO:644 under conditions of moderate
stringency in 50% formamide and 6.times. SSC, at 42.degree. C. with
washing conditions of 60.degree. C., 0.5.times.SSC, 0.1% SDS. This
nucleic acid molecule that hybridizes under the stated conditions
can be derived by in vitro mutagenesis of a M. tuberculosis and M.
leprae marker nucleic acid of the invention.
[0047] The invention also encompasses purified nucleic acid
molecules degenerate from M. tuberculosis and M. leprae marker
nucleic acids as a result of the genetic code, purified nucleic
acid molecules that are allelic variants of M. tuberculosis and M.
leprae marker nucleic acids, and a species homolog of M.
tuberculosis and M. leprae marker nucleic acids. The invention also
encompasses recombinant vectors that direct the expression of these
nucleic acid molecules and host cells transformed or transfected
with these vectors.
[0048] The invention further encompasses methods for the production
of M. tuberculosis and M. leprae marker polypeptides, including
culturing a host cell under conditions promoting expression, and
recovering the polypeptide from the culture medium. Especially, the
expression of M. tuberculosis and M. leprae marker polypeptides in
bacteria, yeast, plant, and animal cells is encompassed by the
invention.
[0049] This invention also provides labeled M. tuberculosis and M.
leprae marker polypeptides. Preferably, the labeled polypeptides
are in purified form. It is also preferred that the unlabeled or
labeled polypeptide is capable of being immunologically recognized
by human body fluid containing antibodies to a mycobacterium. The
polypeptides can be labeled, for example, with an immunoassay label
selected from the group consisting of radioactive, enzymatic,
fluorescent, chemiluminescent labels, and chromophores.
[0050] Immunological complexes between the M. tuberculosis and M.
leprae marker polypeptides of the invention and antibodies
recognizing the polypeptides are also provided. The immunological
complexes can be labeled with an immunoassay label selected from
the group consisting of radioactive, enzymatic, fluorescent,
chemiluminescent labels, and chromophores.
[0051] Furthermore, this invention provides a method for detecting
infection by mycobacteria. The method comprises providing a
composition comprising a biological material suspected of being
infected with a mycobacterium, and assaying for the presence of a M.
tuberculosis and M. leprae marker polypeptide of the mycobacterium.
The polypeptides are typically assayed by electrophoresis or by
immunoassay with antibodies that are immunologically reactive with
M. tuberculosis and M. leprae marker polypeptides of the
invention.
[0052] This invention also provides an in vitro diagnostic method
for the detection of the presence or absence of antibodies, which
bind to an antigen comprising a M. tuberculosis and M. leprae
marker polypeptide of the invention or mixtures of the
polypeptides. The method comprises contacting the antigen with a
biological fluid for a time and under conditions sufficient for the
antigen and antibodies in the biological fluid to form an
antigen-antibody complex, and then detecting the formation of the
complex. The detection step can further comprise measuring the
formation of the antigen-antibody complex. The formation of the
antigen-antibody complex is preferably measured by immunoassay
based on Western blot technique, ELISA (enzyme linked immunosorbent
assay), indirect immunofluorescent assay, or immunoprecipitation
assay.
[0053] The polypeptides of this invention are thus useful as a
portion of a diagnostic composition for detecting the presence of
antibodies to antigenic proteins associated with mycobacteria.
Thus, a diagnostic kit for the detection of the presence or absence
of antibodies, which bind to the M. tuberculosis and M. leprae
marker polypeptide of the invention or mixtures of the
polypeptides, contains antigen comprising the M. tuberculosis and
M. leprae marker polypeptide, or mixtures thereof, and means for
detecting the formation of immune complex between the antigen and
antibodies. The antigens and the means are present in an amount
sufficient to perform the detection.
[0054] This invention also provides an immunogenic composition
comprising a M. tuberculosis and M. leprae marker polypeptide of
the invention or a mixture thereof in an amount sufficient to
induce an immunogenic or protective response in vivo, in
association with a pharmaceutically acceptable carrier therefor. A
vaccine composition of the invention comprises a neutralizing
amount of the M. tuberculosis and M. leprae marker polypeptide and
a pharmaceutically acceptable carrier therefor.
[0055] In addition, the M. tuberculosis and M. leprae marker
polypeptides can be used to raise antibodies for detecting the
presence of antigenic proteins associated with a mycobacterium.
Purified polyclonal or monoclonal antibodies that bind to M.
tuberculosis and M. leprae marker polypeptides are encompassed by
the invention.
[0056] The polypeptides of the invention can also be employed to
raise neutralizing antibodies that either inactivate the
mycobacteria, reduce the viability of a mycobacterium in vivo, or
inhibit or prevent bacterial replication. The ability to elicit
mycobacteria-neutralizing antibodies is especially important when
the proteins and polypeptides of the invention are used in
immunizing or vaccinating compositions to activate the B-cell arm
of the immune response or induce a cytotoxic T lymphocyte response
(CTL) in the recipient host, or other T cell mediated response.
[0057] Further, this invention provides a method for detecting the
presence or absence of a mycobacterium comprising:
[0058] (1) contacting a sample suspected of containing bacterial
genetic material of a mycobacterium with at least one nucleotide
probe, and
[0059] (2) detecting hybridization between the nucleotide probe and
the bacterial genetic material in the sample,
[0060] wherein said nucleotide probe is complementary to the
full-length sequence of a purified M. tuberculosis and M. leprae
marker nucleic acid of the invention.
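By way of illustration only, the notion of a probe that is complementary to the full-length sequence of a target strand can be sketched in Python as follows. The function names and the short target sequence are hypothetical and are not taken from the sequence listing; the sketch assumes unambiguous DNA letters (A, C, G, T) only.

```python
# Illustrative sketch only: hypothetical helper names, toy sequence.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_full_length_complement(probe: str, target: str) -> bool:
    """True if the probe pairs with every base of the target strand."""
    return probe.upper() == reverse_complement(target).upper()


# Toy target strand (hypothetical, not a marker nucleic acid).
target = "ATGGCCGTTA"
probe = reverse_complement(target)
print(probe, is_full_length_complement(probe, target))
```

A probe shorter than the target, or one carrying mismatches, would be rejected by this strict check, whereas hybridization under moderate stringency may tolerate some mismatch.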
[0061] Also, this invention provides a method of comparing genetic
complements of different types of organisms, wherein the method
comprises:
[0062] (a) providing a database including sequence libraries for a
plurality of types of organisms, said libraries having multiple
genomic sequences;
[0063] (b) providing one or more probe sequences encoding a
polypeptide of a formula selected from the group consisting of SEQ
ID NO: 1 to SEQ ID NO:644;
[0064] (c) determining homologous matches between one or more of
said probe sequences and one or more sequences of said sequences in
said genomic libraries; and
[0065] (d) displaying the results of said determination.
[0066] The method can be carried out using a computer system
comprising a database including sequence libraries for a plurality
of types of organisms, wherein the libraries have multiple genomic
sequences, and providing a database including the one or more probe
sequences encoding a polypeptide of a formula selected from the
group consisting of SEQ ID NO: 1 to SEQ ID NO:644. The computer
system includes a user interface capable of receiving sequence
information from the sequence libraries and the probe sequence
information for comparison and displaying the results of the
comparison.
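By way of illustration only, steps (a) to (d) above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the organism names, sequence identifiers and sequences are hypothetical toy data, and the shared k-mer score used here is merely a stand-in for whatever homology search program is actually employed; it is not presented as the comparison method of the invention.

```python
from typing import Dict, List, Tuple

Library = Dict[str, str]           # sequence id -> sequence
Database = Dict[str, Library]      # organism name -> sequence library
Hit = Tuple[str, str, str, float]  # probe id, organism, sequence id, score


def kmers(seq: str, k: int) -> set:
    """Return the set of all overlapping k-letter words in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def containment(probe: str, subject: str, k: int = 4) -> float:
    """Fraction of the probe's k-mers that also occur in the subject."""
    probe_words = kmers(probe, k)
    if not probe_words:
        return 0.0
    return len(probe_words & kmers(subject, k)) / len(probe_words)


def find_matches(database: Database, probes: Dict[str, str],
                 k: int = 4, threshold: float = 0.8) -> List[Hit]:
    """Step (c): score every probe against every library sequence."""
    hits: List[Hit] = []
    for probe_id, probe_seq in probes.items():
        for organism, library in database.items():
            for seq_id, seq in library.items():
                score = containment(probe_seq, seq, k)
                if score >= threshold:
                    hits.append((probe_id, organism, seq_id, score))
    return sorted(hits, key=lambda h: (h[0], -h[3], h[1], h[2]))


def display(hits: List[Hit]) -> None:
    """Step (d): print one tab-separated line per match."""
    for probe_id, organism, seq_id, score in hits:
        print(f"{probe_id}\t{organism}\t{seq_id}\t{score:.2f}")


# Step (a): toy database; step (b): toy probe (all hypothetical).
database = {
    "organism_A": {"A1": "ATGGCCGTTAACGGTCTGA", "A2": "ATGTTTTTTTTTTAA"},
    "organism_B": {"B1": "CCATGGCCGTTAACGGTCTGATT"},
}
probes = {"probe_1": "ATGGCCGTTAACGGTCTGA"}
hits = find_matches(database, probes)
display(hits)
```

In practice the threshold and word length would have to be chosen for the sequences at hand, and an alignment-based program would normally replace the k-mer score.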
BRIEF DESCRIPTION OF THE DRAWINGS
[0067] The patent or application file contains at least one drawing
executed in color. Copies of this patent or patent application
publication with color drawings will be provided by the Office upon
request and payment of the necessary fee. This invention will be
described with reference to the drawings in which:
[0068] FIG. 1 is a circular genome map. From the outside, circles
1, 2, clockwise and anticlockwise, genes on the - and + strands,
respectively; circles 3 and 4, pseudogenes; 5 and 6, M. leprae
specific genes; 7, repeat sequences; 8, G+C content; 9, G/C bias
(G+C)/(G-C). See legend to FIG. 2 for colour code.
[0069] FIG. 2 is a comparison of the proS loci of M. leprae and M.
tuberculosis. A. The M. leprae proS region is shown above that of
M. tuberculosis. Genes or operons are depicted by arrows while
crosses denote pseudogenes. Note the absence of ugpAEBC and dinF
from M. leprae and the presence of proS at this site. B. Domain
structures of prolyl-tRNA synthetases of bacterial (M. tuberculosis)
or eukaryotic (M. leprae) types after.sup.19.
[0070] FIG. 3 shows distribution of genes by functional category.
The number of complete genes (blue) and pseudogenes (red) within each
category for M. leprae is shown. Data for M. tuberculosis (green)
were taken from the published genome sequence.sup.8. Functional
categories: 1. Small-molecule catabolism, 2. Energy Metabolism, 3.
Central intermediary metabolism, 4. Amino acid biosynthesis, 5.
Nucleosides and nucleotide biosynthesis and metabolism, 6.
Biosynthesis of cofactors, prosthetic groups and carriers, 7. Lipid
biosynthesis, 8. Polyketide and nonribosomal peptide synthesis, 9.
Proteins performing regulatory functions, 10. Synthesis and
modification of macromolecules, 11. Degradation of macromolecules,
12. Cell envelope constituents, 13. Transport/binding proteins, 14.
Chaperones/Heat shock proteins, 15. Cell division proteins, 16.
Protein and peptide secretion, 17. Adaptations and atypical
conditions, 18. Detoxification, 19. Virulence determinants, 20. IS
elements and phage derived proteins, 21. PE and PPE families, 22.
Antibiotic production and resistance, 23. Cytochrome P450 enzymes,
24. Coenzyme F420-dependent enzymes, 25. Miscellaneous
transferases, 26. Miscellaneous phosphatases, lyases and
hydrolases, 27. Cyclases, and 28. Chelatases. Inset graph: The Y
axis shows the number of genes within each functional category. The
X axis shows the functional categories: 29. Conserved hypothetical
proteins, 30. Hypothetical proteins which share no significant
similarity with any protein currently in the databases.
[0071] FIG. 4: Polynucleotidic sequence of the Mycobacterium
tuberculosis H37Rv BAC clone, BAC-Rv221, deposited at the C.N.C.M.
under the accession number I-2625, which corresponds to pBeloBAC11
with a HindIII partial digest fragment from the genome of M.
tuberculosis H37Rv that starts at position 2,115,612 and extends
to position 2,198,604 according to Cole et al. (1998, Nature, 393,
537-544). All of the M. tuberculosis genes contained herein are of
interest.
[0072] FIG. 5: Polynucleotidic sequence of the Mycobacterium
tuberculosis H37Rv BAC clone, BAC-Rv230, deposited at the C.N.C.M.
under the accession number I-2626, which corresponds to pBeloBAC11
with a HindIII partial digest fragment from the genome of M.
tuberculosis H37Rv that starts at position 1,336,764 and extends
to position 1,411,979 according to Cole et al. (1998, Nature, 393,
537-544). All of the M. tuberculosis genes contained herein are of
interest.
[0073] FIG. 6: Polynucleotidic sequence of the Mycobacterium
tuberculosis H37Rv BAC clone, BAC-Rv234, deposited at the C.N.C.M.
under the accession number I-2627, which corresponds to pBeloBAC11
with a HindIII partial digest fragment from the genome of M.
tuberculosis H37Rv that starts at position 2,847,864 and extends
to position 2,928,420 according to Cole et al. (1998, Nature, 393,
537-544). All of the M. tuberculosis genes contained herein are of
interest.
[0074] FIG. 7: Polynucleotidic sequence of the Mycobacterium
tuberculosis H37Rv BAC clone, BAC-Rv265, deposited at the C.N.C.M.
under the accession number I-2628, which corresponds to pBeloBAC11
with a HindIII partial digest fragment from the genome of M.
tuberculosis H37Rv that starts at position 514,402 and extends to
position 599,515 according to Cole et al. (1998, Nature, 393,
537-544). All of the M. tuberculosis genes contained herein are of
interest.
[0075] FIG. 8: Polynucleotidic sequence of the Mycobacterium
tuberculosis H37Rv BAC clone, BAC-Rv267, deposited at the C.N.C.M.
under the accession number I-2629, which corresponds to pBeloBAC11
with a HindIII partial digest fragment from the genome of M.
tuberculosis H37Rv that starts at position 1,124,621 and extends
to position 1,169,811 according to Cole et al. (1998, Nature, 393,
537-544). All of the M. tuberculosis genes contained herein are of
interest.
[0076] FIG. 9: Polynucleotidic sequence of the Mycobacterium leprae
cosmid which corresponds to pYUB18 with Sau3A partial digest
fragment from the genome of M. leprae that starts at position
1,373,705 and extends to position 1,403,746. This sequence
comprises the sequence of the Mycobacterium leprae cosmid MLCY 811
which corresponds to pYUB18 with Sau3A partial digest fragment of
the genome of M. leprae deposited at the C.N.C.M. under the
accession number I-2633 that starts at position 1,363,759 and
extends to position 1,403,737 according to Cole et al. (2001,
Nature, 409, 1007-1011).
[0077] FIG. 10: Polynucleotidic sequence of the Mycobacterium
leprae cosmid which corresponds to pYUB18 with Sau3A partial digest
fragment of the genome of M. leprae that starts at position
3,160,443 and extends to position 3,194,161. This sequence
comprises the sequence of the Mycobacterium leprae cosmid MLCY 047
which corresponds to pYUB18 with Sau3A partial digest fragment of
the genome of M. leprae deposited at the C.N.C.M. under the
accession number I-2632 that starts at position 3,160,458 and
extends to position 3,194,087 according to Cole et al. (2001,
Nature, 409, 1007-1011).
[0078] FIGS. 4 to 10 can be found in the APPENDIX hereto.
DETAILED DESCRIPTION OF THE INVENTION
Sequence Analysis of M. leprae
[0079] The complete genome sequence of M. leprae contains 3,268,203
bp, and has an average G+C content of 57.8%. These values are much
lower than those reported for the M. tuberculosis genome,
comprising .about.4,000 genes, 4,411,529 bp and 65.6% G+C.sup.8. On
detailed pairwise comparisons of both genome and proteome
sequences.sup.8, 9, it was immediately apparent that only 49.5% of
the genome was occupied by protein-coding genes, while 27%
contained recognisable pseudogenes, inactive reading frames with
functional counterparts in the tubercle bacillus. The remaining
23.5% of the genome did not appear to be coding, and probably
contains gene remnants mutated beyond recognition. The distribution
of the 1,114 pseudogenes was essentially random (FIG. 1) and, after
their exclusion, 1,604 potentially active genes remained, of which
1,439 were common to both pathogens. Among the remaining 165 genes,
with no ortholog in M. tuberculosis, were 29 for which functions
could be attributed. Many of the 136 residual CDS in M. leprae,
showing no similarity to known genes, may also represent
pseudogenes as they are shorter than average and occur in regions
of low gene density (FIG. 1).
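By way of illustration only, the bookkeeping behind the figures quoted above can be checked with a few lines of Python. All numbers are taken from the preceding paragraph; the variable names are hypothetical, and the base-pair values derived from the percentages are approximations only.

```python
# Figures quoted in the text for the M. leprae genome.
GENOME_BP = 3_268_203
fractions = {
    "protein-coding genes": 0.495,
    "recognisable pseudogenes": 0.27,
    "apparently non-coding": 0.235,
}

# The three classes should account for the whole genome.
assert abs(sum(fractions.values()) - 1.0) < 1e-9

# Approximate size of each class in base pairs.
approx_bp = {name: round(GENOME_BP * f) for name, f in fractions.items()}

# Gene counts quoted in the text.
active_genes = 1604
shared_with_mtb = 1439
leprae_specific = active_genes - shared_with_mtb            # no ortholog in M. tuberculosis
with_attributed_function = 29
residual_cds = leprae_specific - with_attributed_function   # no similarity to known genes

for name, bp in approx_bp.items():
    print(f"{name}: about {bp:,} bp")
print(leprae_specific, residual_cds)
```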
Reductive evolution
[0080] Assuming that the genome of M. leprae was once topologically
equivalent and similar in size to those of all other mycobacteria
(.about.4.4 Mb).sup.10-12, then extensive downsizing and
rearrangement must have occurred during evolution. Loss of 1.1 Mb
would eliminate .about.1100 CDS, and M. leprae would, therefore, be
expected to produce 3000 proteins compared to the 4000 predicted in
M. tuberculosis. On analysis of the proteome only 391 soluble
protein species were detected.sup.13, compared to nearly 1800 in M.
tuberculosis.sup.4, consistent with there being many pseudogenes.
In conclusion, since diverging from the last common mycobacterial
ancestor, the leprosy bacillus may have lost >2000 genes.
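By way of illustration only, the estimate above can be reproduced as a short worked calculation in Python. The sketch assumes, as the paragraph does, that the ancestral genome resembled that of M. tuberculosis in size and gene number, and it assumes a density of about one CDS per kb (implied by equating 1.1 Mb with about 1100 CDS). The variable names are hypothetical.

```python
# Values quoted in the text.
MTB_GENOME_BP = 4_411_529
MTB_GENES = 4000            # approximate
MLEP_GENOME_BP = 3_268_203
MLEP_ACTIVE_GENES = 1604

# Assumption: roughly one CDS per kilobase of deleted DNA.
CDS_PER_KB = 1.0

size_difference_bp = MTB_GENOME_BP - MLEP_GENOME_BP
cds_lost_by_deletion = round(size_difference_bp / 1000 * CDS_PER_KB)

# Proteins expected if deletion were the only cause of gene loss.
expected_proteins = MTB_GENES - cds_lost_by_deletion

# Total loss once genes inactivated as pseudogenes are also counted.
total_genes_lost = MTB_GENES - MLEP_ACTIVE_GENES

print(size_difference_bp, cds_lost_by_deletion, expected_proteins, total_genes_lost)
```

Under these assumptions deletion alone accounts for only about half of the total loss, the remainder being attributable to genes that are still present but inactivated.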
[0081] Reductive evolution is documented in obligate intracellular
parasites, such as Rickettsia and Chlamydia spp., and in some
endosymbionts.sup.15, as genes become inactivated once their
functions are no longer required in highly specialised niches. This
process may have naturally defined the minimal gene set for a
pathogenic mycobacterium. The most extensive genome degradation
reported previously was in Rickettsia prowazekii, the typhus agent,
where only 76% of the potential coding capacity was used and 12
pseudogenes identified.sup.16. In comparison with M. leprae, the
level of gene loss detected was modest, and it is striking that
elimination of pseudogenes by deletion lags far behind gene
inactivation in both pathogens. Intriguingly, the G+C content of M.
leprae genes (60.1%) is higher than that of the pseudogenes
(56.5%), and the remainder of the genome (54.5%). The high G+C
content of M. leprae, and other mycobacteria, is apparently driven
by the codon preference of active genes, while random mutation
within non-coding regions results in drift towards a more neutral
G+C content, closer to that of the host.
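By way of illustration only, the G+C content of a class of sequences (for example genes, pseudogenes or the remainder of the genome) can be computed as sketched below. The function names and the toy sequences are hypothetical; ambiguous bases such as N are ignored, which is an assumption of this sketch.

```python
from typing import Iterable


def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def pooled_gc_content(seqs: Iterable[str]) -> float:
    """G+C percentage over a whole class of sequences, weighted by length."""
    return gc_content("".join(seqs))


# Toy sequences (hypothetical) standing in for two sequence classes.
toy_genes = ["GGCCGCAT", "GCGC"]
toy_noncoding = ["ATATGC", "TTAAGG"]
print(round(pooled_gc_content(toy_genes), 1),
      round(pooled_gc_content(toy_noncoding), 1))
```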
Mosaic Organisation and Horizontal Transfer
[0082] While the precise mechanism behind pseudogene formation in
M. leprae is unclear, loss of dnaQ-mediated proof-reading
activities of DNA polymerase III.sup.17 may have contributed. By
contrast, there is extensive evidence for large scale
rearrangements and deletions arising from homologous recombination
events. Comparison with M. tuberculosis delineated .about.65
segments that show synteny but differ in their relative order and
distribution in the M. leprae genome. Breaks in synteny generally
correspond to dispersed repeats, tRNA genes or gene-poor regions.
Copies of all three repeats, RLEP, REPLEP, and LEPREP, occur at the
junctions of discontinuity suggesting that the mosaic arrangement
of the M. leprae genome reflects multiple recombination events
between related repetitive sequences. In some cases, aberrant
recombination may have occurred as truncated repeats exist.
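By way of illustration only, segments that show synteny but differ in order and orientation can be delineated from a list of ortholog ranks as sketched below. The representation (for the i-th gene of one genome, the rank of its ortholog in the other genome) and the function name are assumptions of this sketch, not the procedure actually used to obtain the segments reported above.

```python
from typing import List, Tuple

# (first index, last index, orientation): +1 same order, -1 inverted,
# 0 for a block containing a single gene.
Block = Tuple[int, int, int]


def syntenic_blocks(ranks: List[int]) -> List[Block]:
    """Split a rank list into maximal runs of consecutive ranks.

    ranks[i] is the rank, in genome B, of the ortholog of the i-th
    gene of genome A. A run continues while the rank changes by +1
    (collinear) or by -1 (inverted) in a consistent direction.
    """
    blocks: List[Block] = []
    if not ranks:
        return blocks
    start, direction = 0, 0
    for i in range(1, len(ranks)):
        step = ranks[i] - ranks[i - 1]
        if step in (1, -1) and direction in (0, step):
            direction = step
            continue
        blocks.append((start, i - 1, direction))  # break in synteny
        start, direction = i, 0
    blocks.append((start, len(ranks) - 1, direction))
    return blocks


# Toy example: one collinear block, one inverted block, one displaced block.
example = [0, 1, 2, 7, 6, 5, 3, 4]
print(syntenic_blocks(example))
```

Real comparisons would also need to tolerate missing orthologs and pseudogenes within a segment, which this strict sketch does not attempt.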
[0083] While there is little sequence similarity indicating that
they are insertion sequences, RLEP is clearly capable of
transposition since it exists within sequences corresponding to
known genes. Unlike M. tuberculosis H37Rv, which contains at least
two prophages and 56 intact or truncated IS elements.sup.8, 18, M.
leprae has only three phage-like genes, all with M. tuberculosis
orthologs, and 26 transposase gene fragments. However, some signs
of horizontal transfer of genetic material were detected when the
aminoacyl-tRNA synthetase genes were examined. With one exception,
all of these are more closely related to M. tuberculosis enzymes
than to those of any other organism. Surprisingly, prolyl-tRNA
synthetase, encoded by proS, is more similar to the enzymes of
Borrelia burgdorferi and eukaryotes such as Drosophila, humans and
yeast. It has been proposed that horizontal transfer of tRNA
synthetase genes occurs frequently, and that the pathogen B.
burgdorferi may have acquired proS from its host.sup.19. Comparison
of the genetic context provides further support for this hypothesis
as the M. leprae proS is both displaced and inverted with respect
to the M. tuberculosis genome (FIG. 2), consistent with recent
acquisition.
Multigene Families
[0084] Half of the genes (52%) present in M. tuberculosis arose
from gene duplication events leading to extensive functional
redundancy.sup.9. Many of these are involved in lipid metabolism or
belong to the novel PE and PPE families, encoding unusual
glycine-rich proteins of repetitive structure and unknown function.
The latter are confined to certain mycobacterial species.sup.20,
and represent sources of genetic, and possibly antigenic,
variation.sup.8. The corresponding 167 genes are exceptionally
GC-rich and occupy >8% of the M. tuberculosis genome.sup.21. By
contrast, only 9 intact PE and PPE genes were found in M. leprae
although 30 pseudogenes were present. No intact members of the
PE-PGRS subfamily were found. This reduction partly contributes to
the smaller genome size and the lower GC content of M. leprae.
Recently, some PE-PGRS proteins were shown to be upregulated in
Mycobacterium marinum during granuloma formation in frogs.sup.22.
However, this effect is probably not mediated directly by the
PE-PGRS, as granulomas are a prominent cytological feature of all
forms of leprosy. Essentially all of the gene families.sup.9 have
undergone extensive retraction in M. leprae and now encode
"just-enough" activity to permit intracellular growth. Selected
examples of this are given in Table 1, whereas the comprehensive
comparison presented in FIG. 3 shows that all functional categories
have shrunk.
TABLE 1 Selected examples of metabolic streamlining.
M. tuberculosis gene | M. leprae gene | M. leprae pseudogene | Function | Pathway
gltA1, gltA2, citA | gltA2 | citA | Citrate synthase | Krebs cycle
icd1, icd2 | icd2 | icd1 | Isocitrate dehydrogenase | Krebs cycle
icl, aceA | aceA | icl | Isocitrate lyase | Glyoxylate cycle
gnd1, gnd2 | gnd1 | gnd2 | 6-Phosphogluconate dehydrogenase | Pentose phosphate pathway
pfkA, pfkB | pfkA | pfkB | Phosphofructokinase | Glycolysis
aceE, lpdA, pdhA, pdhB, pdhC, lpd (Rv0462) | aceE, lpd (Rv0462) | lpdA, pdhA, pdhB, pdhC | Pyruvate dehydrogenase complex | Energy metabolism
lldD1, lldD2 | lldD2 | lldD1 | L-lactate dehydrogenase | Respiration
mmaA1, mmaA2, mmaA3, mmaA4, cmaA1, cmaA2, umaA1, umaA2 | mmaA1, mmaA4, cmaA2, umaA2 | mmaA2, mmaA3, cmaA1, umaA1 | Methyltransferase | Mycolic acid modification
glnA1, glnA2, glnA3, glnA4 | glnA1, glnA2 | glnA3, glnA4 | Glutamine synthase | Glutamine biosynthesis
metA, metB, metC, metE, metH, metK, metZ | metA, metB, metE, metH, metK, metZ | metC | Various | Methionine biosynthesis
bfrA, bfrB | bfrA | bfrB | Bacterioferritin | Iron storage
ligA, ligB, ligC | ligA | ligB | DNA ligase | DNA metabolism
Metabolic clues
[0085] Successive generations of microbiologists have failed to
grow M. leprae in axenic culture leading to the notion that the
bacterium lacks certain biosynthetic pathways. Complete genome
comparisons shed new light on this. Lipid metabolism is prominent
in the biochemical repertoire of M. leprae but to a lesser extent
than in the tubercle bacillus whose cell envelope has a greater
diversity of lipids, glycolipids and carbohydrates.sup.23.
Envelope Biogenesis
[0086] Mycolic acids, structural components of all mycobacteria,
include the alpha mycolates, lacking oxygen functions, and the
oxygenated keto- and methoxy- forms. Reappraisal of mycolic acid
modification is now possible in the light of the reduced cmaA, mmaA
and umaA gene-sets encoding the effector methyltransferases. M.
leprae contains no methoxy-mycolates.sup.23, probably due to the
loss of the MmaA2 and MmaA3 enzymes that attach the methoxy group
in M. tuberculosis.sup.10, 24. However, the mycolic acids do
contain cyclopropane functions.sup.25 that in M. tuberculosis are
introduced by MmaA2 and CmaA1. Since both the mmaA2 and cmaA1 genes
have decayed in M. leprae, cyclopropanation must be encoded by one
of the related umaA genes. Recently, both umaA2 and cmaA2 were
shown to be essential for the cyclopropanation function in M.
tuberculosis.sup.26. The same enzymes also catalyse
cyclopropanation in M. leprae as their duplicate copies are both
inactive (Table 1).
[0087] Foremost among the outer lipids of the leprosy bacillus is
phenolic glycolipid 1 (PGL1), an envelope component not found in M.
tuberculosis.sup.27. PGL1 is derived from
phthiocerol-dimycocerosate (PDIM), an esterified compound lipid
generated by mycocerosic acid synthase and a type I polyketide
synthase (PKS), by addition of three o-methylated deoxy
sugars.sup.23. However, the genes for the glycosyltransferases,
that modify PDIM to produce PGL1, could not be detected despite
extensive comparisons. PDIM, a virulence factor in M. tuberculosis,
requires the RND protein, MmpL7, for its transport across the
cytoplasmic membrane.sup.9,28,29. Of the 18 PKS systems
identifiable in M. tuberculosis.sup.8, only six were predicted in
M. leprae and the number of mmpL genes (often linked to PKS genes)
has decreased from 16 to five, presumably because they are no
longer required for polyketide or lipid export. Deletion of such
systems may be reflected in the lack of mycolipenic and
hydroxylipenic acids, polyketides esterified to trehalose in M.
tuberculosis. Further PKS missing from M. leprae include the mbt
operon required for production of the salicylate-based mycobactin
siderophores. Lipids, polyketides and aromatic compounds are often
substrates for cytochrome-P450 monooxygenases.sup.30, enzymes that
are exceptionally abundant in M. tuberculosis.sup.8. Astonishingly,
none of these is functional in M. leprae although a novel enzyme is
predicted.
Lipolysis
[0088] Intracellular mycobacteria probably derive much of their
energy from degradation of host-derived lipids.sup.31, a process
initiated by lipases. In remarkable contrast to the 22 lip genes of
M. tuberculosis, M. leprae has only two lipase genes, one of which,
lipG, clusters with mmaA genes and could, therefore, effect fatty
acid remodelling. This appears to leave just one lipase for
scavenging fatty acids. The enzyme LipE (ML1190) or its counterpart
in M. tuberculosis (Rv3775) could represent an attractive drug
target. In addition to the multifunctional FadA and FadB enzymes,
which catalyse .beta.-oxidation, M. tuberculosis has numerous
alternative systems for fatty acid degradation.sup.8. Once again,
M. leprae has roughly one third as many potential enzymes; however,
there are three times more FadD acyl-CoA synthases than FadE
acyl-CoA dehydrogenases, whereas these are expected in equal
amounts in M. tuberculosis. This may be explained by the dual role
of FadD in .beta.-oxidative and anabolic processes while FadE only
participates catabolically.
[0089] The acetyl-CoA produced by .beta.-oxidation, or glycolysis,
flows into the central pathways of carbon metabolism in M. leprae.
However, the pattern of "just enough" genes for each step is firmly
established, so that the redundancy seen in M. tuberculosis almost
never occurs. For instance, there is only one isocitrate lyase
(with low predicted activity) capable of participating in the
glyoxylate shunt (Table 1).sup.32, 33, and one enzyme complex that
oxidatively decarboxylates pyruvate to acetyl-CoA, compared to two
such systems in M. tuberculosis. In the Krebs cycle, as in
glycolysis, replicate genes for the same activity are deleted
although differences in expression levels might compensate for some
missing copies. Thus, while lack of pdh genes is reflected in a low
rate of oxidative decarboxylation of pyruvate, isocitrate
dehydrogenase activity is comparable in host-grown leprosy and
tubercle bacilli.sup.34 even though a duplicate icd gene is
inactivated in M. leprae.
Central and Energy Metabolism
[0090] Despite an active glyoxylate cycle, there appear to be
fundamental differences elsewhere in anaplerotic pathways between
M. leprae and M. tuberculosis. Here, phospho-enol-pyruvate (PEP)
carboxylase replaces the pyruvate carboxylase of M. tuberculosis,
and the malic enzyme, associated with fast growth in
mycobacteria.sup.35, is missing. The metabolic implications are
that flux between C3 and C4 compounds and the balance between
glycolysis and gluconeogenesis will be very different. Another
missing link between by-products of lipid metabolism and the Krebs
cycle is the production of succinyl-CoA by catabolic Acc
carboxylases predicted for M. tuberculosis.sup.8. Other carbon
sources lost to M. leprae are acetate, as ackA, pta and acs are all
inactive, and galactose, so the cell wall galactan can only be
produced from glucose since the galK, T genes are missing. This
might imply that M. leprae is limited to growth on very few carbon
sources, or even a limited and rather specialised combination, on
which it can maintain a balanced carbon metabolism. Though a
similar range of potential substrates is available to both M.
leprae and M. tuberculosis in the host, marked differences in their
ability to exploit them are apparent on examination of the systems
involved in carbon and nitrogen compound degradation: there are
fewer oxidoreductases, oxygenases and short-chain alcohol
dehydrogenases, and their probable regulatory genes, (FIG. 3). The
inescapable conclusion is that catabolism in M. leprae is severely
limited.
[0091] In the same vein, the leprosy bacillus has lost anaerobic
and microaerophilic electron transfer systems, such as formate
dehydrogenase, nitrate, and fumarate reductase together with the
biosynthetic and transport systems required to produce the cognate
prosthetic groups. Likewise, the aerobic respiratory chain of M.
leprae is truncated as only the 3'-end of the NADH oxidase operon,
nuoA-N, remains. The consequences of this event are far-reaching,
for not only has the potential to produce ATP from the oxidation of
NADH been lost, but also regeneration of NAD.sup.+ may be limited,
relying heavily on ndh, which is involved only in recycling
NAD.sup.+. Alternatively, M. leprae may regenerate NAD.sup.+ from
NADH by (1) diverting pyruvate to acetate and CO.sub.2 using
lactate dehydrogenase and lactate oxidase; (2) diverting PEP to
malate or fumarate via oxaloacetate, using its PEP carboxylase (an
enzyme not found in the tubercle bacillus) that only catalyses the
reaction in this direction. Given the loss of genes reviewed above,
the acids produced by (1) and (2) cannot be recycled and must be
excreted.
Anabolism
[0092] In surprising contrast, all the anabolic pathways seem to be
relatively intact. With few exceptions, complete enzyme systems are
predicted for synthesis of amino acids, purines, pyrimidines,
nucleosides, nucleotides, most vitamins and cofactors. This
suggests that the availability of these metabolites in phagosomes
is either highly limited or that M. leprae cannot transport them
efficiently. It also sets the biology of the leprosy bacillus apart
from that of the other obligate parasites for which genomes have
been sequenced.sup.15, 16. M. leprae may, however, be auxotrophic
for methionine as metC, encoding cystathionine .beta.-lyase, is a
pseudogene, whereas the other counterparts of M. tuberculosis met
genes are all intact. This requirement for methionine may be
dictated by the inactivation of the sulphate transport operon,
cysTWA, and in turn this implies that M. leprae depends upon an
organic source of sulphur. A second auxotrophy that is predicted is
for cobinamide, as examination of the cob genes shows selective
deletion of those to make cobinamide, while the genes needed to
produce vitamin B12 from cobinamide are retained.
Pathogenesis and Disease Control
[0093] Central to a successful pathogenic lifestyle is the ability
to obtain iron. M. leprae has many genes for haem and iron-based
proteins and employs the iron regulatory systems, ideR and furB,
yet may be severely handicapped compared to M. tuberculosis as it
appears to have lost the mbt operon, encoding the non-ribosomal
peptide synthase required for production of the iron-scavenging
siderophores, mycobactin/exochelin.sup.8, 36, 37. However, part of
the iron uptake system is functional in M. leprae, as it transports
exochelinMN from M. neoaurum but not those of M. smegmatis or M.
tuberculosis.sup.38. The genes for exochelinMN are unknown and seem
unlikely to occur in M. leprae.
[0094] As might be expected given the differences in their
respective pathologies, M. leprae contains several enzymes that
have no counterparts in the tubercle bacillus, including a
eukaryotic-like uridine phosphorylase and adenylate cyclase. In
addition, there are two transport systems that may play significant
physiological roles: an ABC-transporter for sugars, and a second
member of the Nramp1 family, involved in divalent metal ion uptake.
M. leprae may have acquired another Nramp1 gene.sup.39 to ensure
adequate intracellular iron concentrations despite its lack
of mycobactin siderophores.
[0095] M. leprae shows a marked tropism for myelin-producing
Schwann cells, and a surface-exposed 21 kDa laminin-binding protein
(LBP) may be an important virulence factor.sup.40-42. Inspection of
the genome sequence revealed a single LBP gene and this also occurs
in M. tuberculosis. No further candidates for virulence genes were
detected, and many of those present in M. tuberculosis have been
inactivated or lost, including three of the Mce operons encoding
putative invasins.sup.9, 43. Although the leprosy and tubercle
bacilli both survive within macrophages, M. leprae has no
catalase-peroxidase.sup.44, and fewer peroxidoxins and epoxide
hydrolases to combat reactive oxygen species. It has retained both
superoxide dismutases suggesting that these may contribute to its
survival.
Comparative Mycobacterial Genomics
[0096] Comparative genomics is a powerful new tool for exploring
microbial evolution and identifying those genes that might encode
new drug targets or protective antigens. Coupled with knowledge
derived from bioinformatic analysis of the proteome, and
understanding of the underlying microbiology, it is possible to
reduce the number of potential new targets within a pathogen to a
more tangible level.
[0097] This invention includes discoveries resulting from the
findings of a comparative analysis in which gene and protein sets
of the leprosy and tubercle bacilli have been compared pairwise,
and against the completed genome sequences of various prokaryotes
and eukaryotes.
[0098] The genome of M. leprae, an exceptionally slow growing
bacterium, is substantially smaller than that of M. tuberculosis
and contains numerous pseudogenes. While the genome of M.
tuberculosis comprises 4.41 Mb and contains around 4,000 genes, the
genome of M. leprae is only 3.27 Mb and a mere 49.5% is occupied by
protein-coding genes. About 27% of the M. leprae genome contains
recognizable pseudogenes, inactive reading frames with functional
counterparts in the tubercle bacillus. The remaining 23.5% of the
genome does not appear to be coding, and probably contains gene
remnants mutated beyond recognition. The distribution of the 1,114
pseudogenes was essentially random throughout the chromosome. 1,604
potentially active genes were identified, of which 1,439 were
common to both pathogens. Among the remaining 165 genes, with no
ortholog in M. tuberculosis, were 29 for which functions could be
attributed. Many of the 136 residual CDS in M. leprae, showing no
similarity to known genes, may also represent pseudogenes as they
are shorter than average and occur in regions of low gene density.
In summary, assuming that all mycobacteria are descendants of a
common ancestor, M. leprae has probably lost around 2,000 genes
during evolution and the minimal gene set required by a pathogenic
mycobacterium has been naturally defined.
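The figures quoted in the paragraph above can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: the constant and function names are illustrative and do not come from the original analysis, and the values are simply those stated in the text.

```python
# Figures quoted in the text for the M. leprae genome.
# All names below are illustrative, not part of the original analysis.
GENOME_MB = 3.27          # genome size in megabases
CODING_PCT = 49.5         # protein-coding genes
PSEUDOGENE_PCT = 27.0     # recognizable pseudogenes
NONCODING_PCT = 23.5      # apparently non-coding remainder

ACTIVE_GENES = 1604       # potentially active genes
SHARED_WITH_MTB = 1439    # genes common to both pathogens
FUNCTION_ASSIGNED = 29    # M. leprae-specific genes with an attributed function


def genome_partition_total():
    """Sum of the three genome fractions; should come to 100 percent."""
    return CODING_PCT + PSEUDOGENE_PCT + NONCODING_PCT


def leprae_specific_genes():
    """Active genes with no ortholog in M. tuberculosis."""
    return ACTIVE_GENES - SHARED_WITH_MTB


def unassigned_cds():
    """M. leprae-specific genes left without an attributed function."""
    return leprae_specific_genes() - FUNCTION_ASSIGNED


def fraction_in_mb(pct):
    """Approximate size in Mb of a fraction of the genome."""
    return GENOME_MB * pct / 100.0
```

Under these figures the three fractions account for the whole genome, and the counts of 165 and 136 genes follow directly from the subtraction.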
[0099] Whole genome comparisons led to the identification of genes
that are present in both M. tuberculosis and M. leprae but have no
counterparts elsewhere. Since these pathogenic mycobacteria occupy
similar niches in the human body where they encounter the same
physiological stresses and immune responses, it is conceivable that
the products of some of these genes may conduct highly specialized
functions that could be essential for intracellular growth of
mycobacteria. If this were the case, the corresponding proteins or
enzymes might represent novel drug targets. In addition to those
proteins that are confined to the species or the genus, there is a
second group of polypeptides that also occur in Streptomyces spp.,
related members of the order Actinomycetales, but not in other
prokaryotes. It is reasonable to assume that such proteins confer
specific properties on actinomycetes.
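The selection logic described above amounts to set operations on ortholog families. The sketch below is an illustration under stated assumptions, not the procedure actually used: the function name and the idea of representing each genome as a set of ortholog-family identifiers are hypothetical, and the labels "M" and "A" are borrowed from the Group column of Table 2 on the assumption that they denote the mycobacteria-restricted and actinomycete-restricted groups respectively.

```python
def classify_shared_genes(mtb, mlep, streptomyces, other_genomes):
    """Split genes shared by M. tuberculosis and M. leprae by distribution.

    mtb, mlep, streptomyces: sets of ortholog-family identifiers (hypothetical).
    other_genomes: iterable of sets, one per additional completed genome.
    """
    # Genes present in both pathogenic mycobacteria.
    shared = mtb & mlep
    # Everything seen in any other prokaryote or eukaryote.
    elsewhere = set().union(*other_genomes)
    # Group "M" (assumed label): no counterpart outside the mycobacteria.
    mycobacterial_only = shared - streptomyces - elsewhere
    # Group "A" (assumed label): also in Streptomyces, but nowhere else.
    actinomycete_only = (shared & streptomyces) - elsewhere
    return {"M": mycobacterial_only, "A": actinomycete_only}
```

In practice the difficult part is deciding when two proteins count as counterparts, which depends on similarity thresholds that this sketch deliberately leaves out.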
[0100] Knowledge of the subcellular location of proteins is
particularly valuable for the design of new tuberculosis vaccines
since it is widely believed that surface-exposed or secreted
proteins correspond to those antigenic components that are first
encountered by the immune system during infection.sup.51.
Bioinformatics has also been used to identify proteins that
localize to the cell envelope and these include transmembrane
proteins with hydrophobic domains, and lipoproteins with N-terminal
cysteine residues that are modified by addition of lipid groups.
Proteins that are secreted via the general secretory pathway.sup.52
are readily identifiable by their characteristic signal peptides,
whereas those metallo-enzymes that are secreted by the twin
arginine transporter system, Tat, can be recognized by the presence
at the N-terminus of the cognate motif, S/TRRXFLK, preceding the
signal peptide.sup.53, .sup.5. This will be discussed further
below.
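Recognition of the twin-arginine motif can be illustrated with a simple pattern search. This is a minimal sketch only: it matches the motif S/TRRXFLK exactly as written in the text, whereas practical predictors tolerate variation; the function name and the 40-residue N-terminal window are assumptions made for the example.

```python
import re

# S/T-R-R-X-F-L-K, where X is any residue, as given in the text.
TAT_MOTIF = re.compile(r"[ST]RR.FLK")


def find_tat_motif(protein_seq, n_terminal_window=40):
    """Return the 0-based start of the Tat motif near the N-terminus, or None.

    n_terminal_window is an assumed cut-off, not a value from the source.
    """
    window = protein_seq[:n_terminal_window].upper()
    match = TAT_MOTIF.search(window)
    return match.start() if match else None
```

A sequence beginning MTDLSRRQFLK would be reported with the motif starting at the fifth residue (index 4), while a sequence carrying the motif only beyond the window would return None.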
[0101] Other proteins that lack signal peptides and are secreted
from mycobacteria in a Sec-independent manner, include those
belonging to the ESAT-6 family.sup.61. ESAT-6 is a potent T-cell
antigen that induces strong Th1-type responses.sup.55 and has been
extensively studied as a potential diagnostic reagent for
infection.sup.56, since its gene is missing from BCG.sup.57,
.sup.58, .sup.59, and as a component of a subunit vaccine.sup.60.
The comparative genomic analysis identified several ESAT-6
proteins, and their potential secretion machinery, that were common
to both M. tuberculosis and M. leprae (Table 2).
[0102] Several examples are given in Table 2 of proteins of limited
distribution with potential as drug targets, diagnostic antigens or
subunit vaccine components.
[0103] Legend of Table 2:
[0104] The first example, for instance, reads as follows:
[0105] M. leprae
[0106] ML0048: Name of an identified ORF in the genome of M.
leprae.
[0107] M. tub.:
[0108] Rv3876: Name of the equivalent ORF in the genome of M.
tuberculosis published in 1998.
[0109] BLASTP:
[0110] Method of comparing protein sequences for establishing their
degree of similarity or identity.
[0111] 1,00E-79:
[0112] BLASTP score, which indicates how similar the protein
sequences are. The analyses of the results are described in Cole et
al. for the comparisons between the genome of M. tuberculosis and
the genome of BCG (Analysis of the proteome of Mycobacterium
tuberculosis in silico, Tuber. Lung Dis. 1999; 79(6):329-42).
[0113] Description:
[0114] Description of the protein, identified from all publicly
accessible databases, with highest similarity for the M. leprae
protein ML0048.
[0115] SC3C3.03C: Nomenclature of the Streptomyces protein.
[0116] EMBL:AL031231: Accession number in the EMBL databank for the
sequence of the Streptomyces protein found to be most similar to
ML0048.
[0117] 1083: length of the sequence in the EMBL databank.
[0118] FASTA score: A different method, similar to BLAST, for comparing
sequences for similarity.
[0119] Score denotes the degree of similarity.
[0120] 31.6%: Percentage of identity between C terminal part of the
Streptomyces protein and the amino acid sequence of ML0048. This
31.6% identity is found in an overlapping region of 580 amino acids
between the two sequences. The other examples should be read
similarly.
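The legend above can be turned into a small parser for individual Table 2 entries. The sketch below is illustrative only: the function names are hypothetical, the comma in values such as 1,00E-79 is read as a decimal mark, and treating a bare entry such as e-135 as 1e-135 is an assumption about the notation rather than something stated in the text.

```python
import re


def parse_blastp_value(text):
    """Convert a BLASTP entry such as '1,00E-79', 'e-135' or '0.0' to a float."""
    cleaned = text.strip().replace(",", ".")  # comma assumed to be a decimal mark
    if cleaned.lower().startswith("e"):
        cleaned = "1" + cleaned               # assumption: 'e-135' means 1e-135
    return float(cleaned)


# Matches e.g. "E( ): 5.9e-27, 31.6% identity in 580 aa"
FASTA_RE = re.compile(
    r"E\(\s*\)\s*:?\s*(?P<evalue>[0-9.]+(?:e-?\d+)?)\s*,\s*"
    r"(?P<identity>[0-9.]+)% (?:identity|id) in (?P<overlap>\d+) aa",
    re.IGNORECASE,
)


def parse_fasta_summary(description):
    """Extract the FASTA E( ) value, percent identity and overlap length.

    Returns None when the description carries no complete FASTA summary.
    """
    match = FASTA_RE.search(description)
    if match is None:
        return None
    return {
        "evalue": float(match.group("evalue")),
        "identity_pct": float(match.group("identity")),
        "overlap_aa": int(match.group("overlap")),
    }
```

Applied to the first example, the description of ML0048 yields an E( ) value of 5.9e-27, 31.6% identity and an overlap of 580 amino acids, matching the reading given in the legend.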
TABLE 2 Proteins of limited distribution with potential as drug
targets, diagnostic antigens or subunit vaccine components
Group M. leprae M. tub. BLASTP Description
A ML0048 Rv3876 1,00E- C-terminal
half of Streptomyces coelicolor SC3C3.03C, 79 hypothetical protein,
TR:O86637 (EMBL:AL031231) (1083 aa); Fasta score E( ): 5.9e-27,
31.6% identity in 580 aa overlap, which contains Pro-Gln repeats A
ML0115 Rv3780 2,00E- S. coelicolor SCGD3.23C, hypothetical protein,
44 TR:Q9XA56 (EMBL:AL096822) A ML0124 Rv0164 2,00E- S. coelicolor
SC6G10.02C, hypothetical protein, 40 TR:Q9X7Y8 (EMBL:AL049497) (144
aa); Fasta score E( ): 7e-05, 21.9% identity in 137 aa overlap. A
ML0151 Rv0948c 2,00E- S. coelicolor SC063.16C, hypothetical
protein, 25 TR:CAB82023 (EMBL:AL161755) A ML0169 Rv0966c 7,00E- S.
coelicolor SCE6.30C, hypothetical protein, 66 TR:CAB88834
(EMBL:AL353832) (277 aa); Fasta score E( ): 3.3e-20, 41.0% identity
in 205 aa overlap. A ML0229 Rv3603c 2,00E- N-terminal half of S.
coelicolor SCE126.02C, 60 hypothetical protein, TR:Q9X845
(EMBL:AL049630) (420 aa); Fasta score E( ): 4.1e-24, 36.7% identity
in 294 aa overlap A lsr2 Rv3597c 1,00E- S. coelicolor SCE94.26C,
putative lsr2-like protein, 27 TR:Q9X8N1 (EMBL:AL049628) (111 aa);
Fasta score E( ): 7.3e-18, 56.3% identity in 112 aa overlap A
ML0284 Rv0360c 3,00E- S. coelicolor SCH10.25C, hypothetical
protein, 23 TR:Q9X8R4 (EMBL:AL049754) A whiB3 Rv3416 6,00E-
Transcriptional regulator 38 A lppS Rv2518c e-135 many predicted
lipoproteins from S. coelicolor. A ML0451 Rv2609c 2,00E- S.
coelicolor e.g. SC2E1.17, hypothetical protein, 85 TR:O69888
(EMBL:AL023797) (172 aa); Fasta score E( ): 2e-13, 43.3% identity
in 150 aa overlap. A ML0486 Rv2588c 2,00E- S. coelicolor SCL2.07C,
putative secreted protein, 19 TR:CAB70919 (EMBL:AL137778) (169 aa);
Fasta score E( ) 7.3e-08, 35.8% identity in 120 aa overlap A ML0542
Rv1390 6,00E- S. coelicolor SC9C5.02C, hypothetical protein, 51
TR:CAB93358 (EMBL:AL357523) (90 aa); Fasta score E( ): 2e-18, 71.3%
identity in 80 aa overlap. A ML0561 Rv1417 3,00E- Corynebacterium
ammoniagenes ribX, hypothetical 38 protein, TR:O24754
(EMBL:AB003693) (184 aa); Fasta score E( ): 2.1e-15, 34.5% identity
in 148 aa overlap. Contains hydrophobic, possible membrane-spanning
regions A ML0580 Rv1446c 2,00E- hypothetical proteins from S.
coelicolor e.g. 64 SCC22.20, hypothetical protein, TR:Q9XAB8
(EMBL:AL096839) (351 aa); Fasta score E( ): 7.1e-21, 36.0% identity
in 203 aa overlap, although these have a short N-terminal extension
relative to this homologue. A ML0603 Rv2413c 3,00E- S. coelicolor
SCC123.02C, putative DNA-binding 77 protein, TR:Q9RDM2
(EMBL:AL136518) (336 aa); Fasta score E( ): 0, 39.3% identity in
326 aa overlap. A ML0630 Rv2365c 2,00E- S. coelicolor SCC77.05,
hypothetical protein, 15 TR:Q9RDF3 (EMBL:AL136503) (132 aa); Fasta
score E( ): 3.3e-06, 39.4% identity in 99 aa overlap. A ML0642
Rv3195 e-143 S. coelicolor SCE9.14C, hypothetical protein,
TR:Q9X8I7 (EMBL:AL049841) (375 aa) ; Fasta score E( ) 4.9e-12,
24.9% identity in 305 aa overlap. A whiB2 Rv3260c 9,00E-
Transcription factor 31 A ML0762 Rv3258c 4,00E- S. coelicolor
hypothetical 15.0 kDa protein SCE34.11C 41 TR:CAB88914
(EMBL:AL353862) fasta scores: E( ): 4.8e- 16, 47.0% id in 151 aa. A
lpqB Rv3244c 0.0 S. coelicolor putative lipoprotein SCE33.13C
TR:CAB90922 (EMBL:AL355774) fasta scores: E( ): 0.00039, 24.4% id
in 624 aa A whiB1 Rv3219 6,00E- Transcription factor 31 A ML0814
Rv3208c 3,00E- S. coelicolor hypothetical protein 32
gp.vertline.AL390975.vertline.AL390975_32 (94 aa) E( ): 2.5e-09;
47.945% identity A ML0816 Rv3207c e-101 S. coelicolor putative
membrane protein SC2H12.28c (314 aa) TR:CAB94652 (EMBL:AL359215)
fasta scores: E( ): 1e-13, 30.2% id in 331 aa A ML0857 Rv2219
2,00E- S. coelicolor putative integral membrane protein 59
SC3H12.04 TR:CAB90843 (EMBL:AL355740) (234 aa) fasta scores: E( ):
1.2e-26, 39.6% id in 230 aa A ML0869 Rv2206 4,00E- S. coelicolor
putative integral membrane protein 40 SC5F7.32 TR:Q9S2R7
(EMBL:AL096872) A ML0876 Rv2199c 2,00E- S. coelicolor hypothetical
proteins e.g. putative 43 integral membrane protein SC6G10.27C
TR:Q9X812 (EMBL:AL049497) (132 aa) fasta scores: E( ) : 6.2e-15,
38.8% id in 139 aa A ML0920 Rv2147c 3,00E-
pir.vertline..vertline.T34949 hypothetical protein SC4A10.12c - 89
Streptomyces coelicolor A ML0921 Rv2146c 5,00E- S. coelicolor
TR:Q9S2X3 (EMBL:AL109663) (94 aa); Fasta 32 score E( ): 2.9e-12,
40.7% identity in 86 aa overlap. Contains possible membrane
spanning hydrophobic domains. A ML0986 Rv2738c 3,00E- S. coelicolor
TR:O50484 (EMBL:AL020958) (64 aa); Fasta 21 score E( ) : 2.5e-08,
44.4% identity in 63 aa overlap A ML0994 Rv2728c 1,00E- S.
coelicolor TR:O69964 (EMBL:AL022268) (237 aa); 56 Fasta score E( ):
1.3e-13, 32.9% identity in 243 aa overlap A ML1009 Rv2714 e-106
pir.vertline..vertline.T35742 hypothetical protein SC7H2.11c S.
coelicolor A ML1016 Rv2708c 1,00E-
emb.vertline.CAB72193.1.vertline. (AL138851) hypothetical protein
25 SCE59.06c [S. coelicolor A3(2)] Length = 97 A ML1026 Rv2699c
2,00E- T34816 hypothetical protein SC2E9.05 SC2E9.05 - S. 32
coelicolor 144 2e-34 A ML1027 Rv2698 1,00E- membrane protein, S.
coelicolor TR:O54132 33 (EMBL:AL021530) (154 aa); Fasta score E( )
: 1.1e-08, 33.6% identity in 149 aa overlap. A ML1029 Rv2696c
7,00E- pir.vertline..vertline.T34821 hypothetical protein SC2E9.10
SC2E9.10 - 69 S. coelicolor 86 4e-16 A ML1041 Rv2680 2,00E-
pir.vertline..vertline.T34710 hypothetical protein SC1C3.18c
SC1C3.18c - 62 S. coelicolor 158 5e-38 A ML1067 Rv1211 9,00E-
emb.vertline.CAC01346.1.vertline. (AL390975) conserved hypothetical
23 protein S. coelicolor 101 1e-21 A ML1093 Rv1244 5,00E-
lipoprotein, pir.vertline..vertline.T35857 probable secreted
substrate- 78 binding protein - S. coelicolor 67 3e-10 A ML1105
Rv1259 e-115 S. coelicolor TR:Q9S2L3 (EMBL:AL109732) (237 aa);
Fasta score E( ): 0, 54.5% identity in 231 aa overlap. A ML1117
Rv1276c 3,00E- pir.vertline..vertline.T36773 hypothetical protein
SC128.03c - S. 53 coelicolor 115 4e-25 A ML1147 Rv1312 3,00E-
possible secreted protein, emb.vertline.CAB94546.1.vertline.
(AL359152) 42 putative secreted/membrane protein S. coelicolor 66
2e-10 A ML1166 Rv1332 7,00E- S. coelicolor TR:Q9S2G6
(EMBL:AL096852) (202 aa); 54 Fasta score E( ) 1.5e-05, 34.6%
identity in 188 aa overlap. A ML1230 Rv1182 e-149 papA3,
emb.vertline.CAC08383.1.vertline. (AL392176) hypothetical protein
S.coelicolor 132 8e-30 A ML1306 Rv2125 5,00E- S. coelicolor
TR:Q9S2K6 (EMBL:AL109732) (312 aa); 87 Fasta score E( ) 1.6e-07,
23.4% identity in 278 aa overlap A ML1321 Rv2111c 3,00E- upstream
of bacterial proteasome beta subunits 07 including: Mycobacterium
smegmatis TR:O30517 (EMBL:AF009645) (64 aa); Fasta score E( ):
6.2e-18, 82.8% identity in 64 aa overlap, Rhodococcus A ML1338
Rv2673 e-150 conserved integral membrane protein, S. coelicolor
TR:Q53873 (EMBL:AL031317) (411 aa); Fasta score E( ) 1.1e-12, 28.3%
identity in 410 aa overlap A ML1439 Rv2050 4,00E-
emb.vertline.CAB61670.1.vertline. (AL133213) hypothetical protein
31 SC6D7.18c. S. coelicolor 101 4e-21 A ML1485 Rv2466c 2,00E- S.
coelicolor TR:CAB71809 (EMBL:AL138662) (216 aa); 66 Fasta score E(
): 0, 52.3% identity in 214 aa overlap A ML1508 Rv1155 2,00E- S.
coelicolor TR:Q9XAG1 (EMBL:AL079356) (144 aa); 48 Fasta score E( ):
5.6e-25, 54.3% identity in 140 aa overlap. A ML1525 Rv2771c 8,00E-
S. coelicolor TR:Q9RD46 (EMBL:AL133424) (151 aa); 27 Fasta score E(
): 1.3e-28, 56.1% identity in 148 aa overlap A ML1548 Rv2795c e-132
S. coelicolor TR:O88028 (EMBL:AL031107) (295 aa); Fasta score E( ):
0, 54.4% identity in 285 aa overlap A ML1557 Rv2840c 2,00E-
emb.vertline.CAB91141.1.vertline. (AL355913) hypothetical protein
S. 27 coelicolor 46 7e-05 A ML1561 Rv2844 1,00E- S. coelicolor
TR:CAB91137 (EMBL:AL355913) (167 aa); 39 Fasta score E( ): 1.4e-07,
35.8% identity in 137 aa A ML1624 Rv2917 0.0 S. coelicolor
TR:Q9S3Y6 (EMBL:AF170560) (597 aa); Fasta score E( ): 0, 55.5%
identity in 566 aa overlap A ML1644 Rv2235 e-113 N-terminal signal
sequence plus membrane spanning hydrophobic domain;
emb.vertline.CAB59445.1.vertline. (AL132644) putative membrane
protein [Streptomyc . . . 109 4e-23 A ML1649 Rv2239c 3,00E-
emb.vertline.CAB92846.1.vertline. (AL356892) hypothetical protein
36 [Streptomyces Co . . . 137 6e-32 A ML1652 Rv2242 0.0 S.
coelicolor TR:Q9RDP8 (EMBL:AL133423) (401 aa); Fasta score E( ) :
4.3e-26, 42.0% identity A ML1666 Rv2968c 9,00E- S. coelicolor
putative integral membrane protein 59 TR:CAB93387 (EMBL:AL357523)
(240 aa); Fasta score E( ): 3.6e-25, 36.1% identity in 191 aa
overlap A ML1698 Rv3005c 4,00E- conserved membrane protein,
emb.vertline.CAB61735.1.vertline. (AL133220) 54 putative membrane
protein. S. coelicolor 99 5e-20 A ML1706 Rv3015c 1,00E- S.
coelicolor TR:Q9Z586 (EMBL:AL035569) (331 aa); 91 Fasta score E( ):
0, 38.6% identity in 337 aa overlap, A ML1781 Rv2256c 4,00E-
4pir.vertline..vertline.T11215 hypothetical protein 5 -
Streptomyces 62 glaucescens >g . . . 153 1e-36 A ML1782 Rv2257c
e-121 S. coelicolor SC4A7.08 TR:Q9RDQ4 (EMBL:AL133423) (273 aa);
Fasta score E( ): 0, 53.2% identity in 269 aa overlap A ML1791
Rv1976c 8,00E- S. coelicolor hypothetical protein SC1C3.03C
TR:O69845 25 (EMBL:AL023702) (125 aa); Fasta score E( ): 4.3e-06,
36.6% identity in 112 aa overlap. A ML1908 Rv0637 3,00E- S.
coelicolor SCD82.07 TR:CAB77410 (EMBL:AL160431) 62 (150 aa); Fasta
score E( ): 4.7e-11, 29.3% identity in 150 aa overlap. A ML1910
Rv0635 9,00E- emb.vertline.CAB77410.1.vertline. (AL160431)
hypothetical protein 49 SCD82.07 S. coelicolor 83 1e-15 A ML1926
Rv0431 6,00E- S. coelicolor hypothetical protein SCD95A.20 24
TR:CAB93047 (EMBL:AL357432) (84 aa); Fasta score E( ): 4.1e-11 A
ML1927 Rv0430 2,00E- S. coelicolor hypothetical protein SCD95A.20
25 TR:CAB93047 (EMBL:AL357432) (84 aa) ; Fasta score E( ) 4.1e-11,
52.8% identity in 72 aa overlap. A ML1997 Rv0970 7,00E- S.
coelicolor putative integral membrane protein 39 SCM2.15C A ML2030
Rv1884c 1,00E- Rpf, emb.vertline.CAC09538.1.vertline. (AL442120)
putative secreted 34 protein S. coelicolor 108 5e-23 A ML2031
Rv1883c 1,00E- Streptomyces actuosus NSH-OrfB TR:P72384
(EMBL:U75434) 44 fasta scores: E( ): 2.5e-08, 34.4% in 125 aa A
ML2048 Rv1871c 2,00E- S. coelicolor hypothetical protein
TR:CAE88434 14 (EMBL:AL353815) fasta scores: E( ): 0.0092, 39.3% in
61 aa; truncated at C-terminus; may represent a pseudogene A ML2063
Rv1846c 3,00E- possible regulator, pir.vertline..vertline.T36388
hypothetical protein 35 SCE94.28c - S. coelicolor 64 6e-10 A ML2064
Rv1845c 3,00E- S. coelicolor putative integral membrane protein 82
SC10A7.04 TR:Q9XAS1 (EMBL:AL078618) fasta scores: E( ): 1.8e-19,
32.6% in 328 aa A ML2073 Rv1830 2,00E- S. coelicolor hypothetical
19.1 kDa protein 74 TR:CAB88877 (EMBL:AL353861) fasta scores: E( ):
3.7e- 30, 64.8% in 145 aa A ML2075 Rv1828 7,00E- S. coelicolor
hypothetical 26.5 kda protein 71 TR:CAB88879 (EMBL:AL353861) fasta
scores: E( ): 1.1e- 14, 41.4% in 237 aa. A ML2114 Rv0909 7,00E- S.
coelicolor hypothetical 9.9 kda protein TR:O69965 07
(EMBL:AL022268) fasta scores: E( ): 0.038, 41.3% in 46 aa A ML2135
Rv0885 e-123 S. coelicolor putative membrane protein TR:Q9XAE8
(EMBL:AL079356) fasta scores: E( ) : 1.5e-13, 27.1% in 255 aa A
ML2137 Rv0883c 1,00E- S. coelicolor hypothetical 39.0 kda protein
TR:O50529 76 (EMBL:AL009204) fasta scores: E( ): 2.2e-19, 36.0% in
247 aa A ML2142 Rv0877 8,00E- S. coelicolor hypothetical 32.2 kda
protein 91 TR:CAB93404 (EMBL:AL357524) fasta scores: E( ): 2.5e-
19, 43.3% in 270 aa. A ML2143 Rv0876c e-172 S. coelicolor putative
integral membrane protein TR:CAB93403 (EMBL:AL357524) fasta scores:
E( ): 5.3e- 16, 38.8% in 448 aa. A ML2151 Rv0867c 1,00E- Probable
resuscitation-promoting factors, exported 35 protein A ML2156
Rv0862c 0.0 S. coelicolor hypothetical 90.4 kda protein TR:CAB93395
(EMBL:AL357524) fasta scores: E( ): 3.9e- 27, 34.6% in 856 aa A
ML2193 Rv0819 2,00E- Acetyltransferase (GNAT) family,
emb.vertline.CAB88484.1.vertline. 87 (AL353816) putative
acetyltransferase S. coelicolor 216 3e-55 A ML2199 Rv3118 1,00E-
Saccharopolyspora erythraea hypothetical 10.2 kda 28 protein
TR:Q54084 (EMBL:M29612) fasta scores: E( ): 2.7e-16, 53.0% in 100
aa A ML2200 Rv0813c 3,00E- S. coelicolor hypothetical 21.7 kda
protein 59 TR:CAB94083 (EMBL:AL358692) fasta scores: E( ): 4.4e-
11, 30.5% in 167 aa A ML2204 Rv0810c 2,00E- S. coelicolor
hypothetical 9.3 kda protein SCD25.24C 13 TR:Q9RKJ8 (EMBL:AL118514)
fasta scores: E( ): 1.3e-06, 46.8% id in 62 aa. A ML2207 Rv0807
8,00E- S. coelicolor hypothetical protein SCD25.20 TR:Q9RKK0 36
(EMBL:AL118514) (202 aa) fasta scores: E( ): 6.6e-16, 52.5% id in
101 aa. A ML2219 ARv0787A 1,00E- S. coelicolor hypothetical protein
SCD25.13 (AL118514) 33 A ML2253 Rv2145c 1,00E- antigen 84 homolog,
also in S. coelicolor, etc. 06 A ML2261 Rv0546c 1,00E-
emb.vertline.CAB95979.1.vertline. (AL360034) conserved hypothetical
43 protein S. coelicolor 119 1e-26 A ML2289 Rv3662c 7,00E- S.
coelicolor putative oxidoreductase SCH5.22C 64 TR:Q9X924
(EMBL:AL035636) (274 aa) fasta scores: E( ) 1e-11, 40.9% id in 269
aa A ML2295 Rv3668c 7,00E- emb.vertline.CAB61552.1.vertline. (AL133171)
protease precursor S. 67 coelicolor 53 2e-06 A ML2296 Rv3669 2,00E-
Similar_to S. coelicolor putative integral membrane 43 transport
protein SCH5.28 TR:Q9X930 (EMBL:AL035636) (162 aa) fasta scores: E(
): 3.3e-10, 37.3% id A ML2306 Rv3680 e-110 S. coelicolor putative
ion-transporting ATPase TR:Q9XA35 (EMBL:AL079353) (481 aa) fasta
scores: E( ): 0, 48.6% id in 432 aa A ML2307 Rv3681c 4,00E- whiB4
28 A ML2330 Rv3716c 6,00E- pir.vertline..vertline.T35387
hypothetical protein SC66T3.30c - S. 10 coelicolor 47 6e-05 A
ML2332 Rv3718c 1,00E- S. coelicolor conserved hypothetical protein
TR:Q9ZBJ2 39 (EMBL:AL035161) (147 aa) fasta scores: E( ) : 1.4e-22,
47.6% id in 147 aa. A ML2410 Rv0528 e-160 conserved membrane
protein, emb.vertline.CAC08381.1.vertline. (AL392176) putative
integral membrane protein S. coelicolor 221 2e-56 A ML2425 Rv0504c
7,00E- emb.vertline.CAB77410.1.vertline. (AL160431) hypothetical
protein 52 SCD82.07 [Strept . . . 73 2e-12 A ML2428 ARv0500B 6,00E-
Small, strongly basic, S. coelicolor SCE68.25c, 17
gp.vertline.AL079345.vertline.AL079345_25 S. coelicolor (32 aa) E(
): 1.7e-07; 93.103% A ML2432 Rv0498 e-101 S. coelicolor
hypothetical protein TR:Q9X8H0 (EMBL:AL049819) (285 aa) fasta
scores: E( ): 3.2e-30, 51.6% id in 273 aa A ML2435 Rv0495c 7,00E-
S. coelicolor hypothetical protein TR:Q9X8H2 94 (EMBL:AL049819)
(271 aa) fasta scores: E( ): 0, 48.4% id in 250 aa A ML2442 Rv0487
1,00E- emb.vertline.CAC04041.1.vertline. (AL391406) conserved
hypothetical 47 protein S. coelicolor 142 2e-33 A ML2446 Rv0483
e-137 possible lipoprotein, S. coelicolor putative lipoprotein
TR:CAB76012 (EMBL:AL157916) fasta scores: E( ): 2.5e-24, 28.6% id
in 405 aa. A ML2453 Rv0476 9,00E- conserved membrane protein,
emb.vertline.CAC04036.1.vertline. (AL391406) 22 putative membrane
protein S. coelicolor 57 3e-08 A ML2522 Rv0309 5,00E- S. coelicolor
putative secreted protein
SCL24.08 65 TR:CAB76092 (EMBL:AL157956) A ML2529 Rv0290 e-116 S.
coelicolor putative integral membrane protein SC3C3.21 TR:O86654
(EMBL:AL031231) fasta scores: E( ): 1.9e-05, 23.8% id in 483 aa A
ML2566 Rv0241c e-125 S. coelicolor putative dehydratase TR:CAB77291
(EMBL:AL160312) A ML2630 Rv0007 4,00E-
emb.vertline.CAB92992.1.vertline. (AL357152) putative integral
membrane 06 protein S. coelicolor 69 5e-11 A ML2640 Rv0146 3,00E-
pir.vertline..vertline.T35930 hypothetical protein SC9E5.10 - S. 93
coelicolor 141 1e-32 A ML2664 Rv0116c 1,00E- possible secreted
protein, pir.vertline..vertline.T35535 probable 72 secreted protein
- S. coelicolor 154 7e-37 A ML2687 Rv0051 e-150 conserved membrane
protein, pir.vertline..vertline.T36589 probable transmembrane
protein - S. coelicolor 185 1e-45 A ML2699 Rv3909 0.0 putative
secreted protein, pir.vertline..vertline.T36582 hypothetical
protein SCH24.17c - S. coelicolor 90 8e-17 M ML0007 Rv0007 6,00E-
Putative membrane protein 51 M ML0012 Rv0010c 4,00E- Contains
hydrophobic, possible membrane-spanning 30 regions. M ML0013
Rv0011c 3,00E- Contains hydrophobic, possible membrane-spanning 36
regions. M ML0022 Rv0020c e-114 -- M ML0030 Rv0039c 9,00E- putative
membrane protein 06 M ML0031 Rv0040c 3,00E- Contains a probable
N-terminal signal sequence 48 M ML0042 Rv3882c e-138 putative
membrane protein M ML0044 Rv3880c 2,00E- -- 19 M ML0047 Rv3877
e-146 putative membrane protein M ML0049 Rv3875 5,00E- possible
secreted protein, ESAT-6 14 M ML0050 Rv3874 4,00E- possible
secreted protein, ESAT-6 12 M ML0051 Rv3873 1,00E- PPE-family
protein 30 M ML0054 Rv3869 e-151 putative membrane protein M ML0056
Rv3867 2,00E- -- 13 M ML0068 Rv3850 8,00E- -- 71 M ML0069 Rv3849
4,00E- -- 41 M ML0071 Rv3847 2,00E- -- 65 M ML0073 Rv3843c 3,00E-
putative membrane protein 51 M ML0081 Rv3835 e-107 putative
membrane protein, possible membrane-spanning region near the
N-terminus. M ML0091 Rv3810 1,00E- erp, pirG, exported repetitive
protein precursor 39 M ML0093 Rv3808c 0.0 -- M ML0094 Rv3807c
6,00E- putative membrane protein 30 M ML0096 Rv3805c 0.0 putative
membrane protein M ML0099 Rv3802c 8,00E- -- 96 M embB Rv3795 0.0
arabinosyl transferase M embA Rv3794 0.0 arabinosyl transferase M
embC Rv3793 0.0 arabinosyl transferase M ML0107 Rv3792 0.0
Mycobacterium smegmatis ORF3, hypothetical membrane protein M
ML0116 Rv3779 e-179 putative membrane protein M ML0133 Rv2949c
3,00E- Pfam match to entry PF01947 DUF98, Protein of unknown 64
function M lppX Rv2945c 6,00E- putative lipoprotein 60 M ML0158
Rv0954 4,00E- 34 kDa antigen, membrane protein 20 M ML0159 Rv0955
2,00E- putative membrane protein 74 M ML0185 Rv0996 2,00E- possible
membrane-spanning regions 74 M ML0187 Rv0998 e-124 Cyclic
nucleotide-binding domain. M ML0199 Rv3647c 2,00E- -- 52 M ML0208
Rv3632 2,00E- putative membrane protein 38 M ML0227 Rv3605c 3,00E-
putative membrane protein 36 M ML0228 Rv3604c 2,00E- putative
membrane protein 51 M lpqT Rv1016c 1,00E- putative lipoprotein 52 M
ML0256 Rv1024 2,00E- Contains hydrophobic, possible
membrane-spanning 42 region M ML0271 Rv0401 1,00E- putative
membrane protein 23 M ML0279 Rv0356c 9,00E- -- 63 M ML0281 Rv0358
2,00E- -- 36 M ML0285 Rv0361 1,00E- putative membrane protein 50 M
ML0298 Rv0416 5,00E- possibly thiamine_biosynthesis 10 M lpqE
Rv3584 3,00E- putative lipoprotein 40 M ML0370 Rv3438 2,00E-
Contains PS00107 Protein kinases ATP-binding region 78 signature M
ML0383 Rv3415c 5,00E- -- 59 M ML0386 Rv3412 4,00E- -- 45 M ML0405
Rv3616c 1,00E- -- 71 M ML0406 Rv3615c 2,00E- -- 14 M ML0407 Rv3614c
2,00E- -- 45 M ML0410 Rv2107 8,00E- PE-family protein 08 M ML0411
Rv2108 1,00E- PPE-family protein, serine-rich antigen 22 M ML0425
Rv2520c 2,00E- putative membrane protein 10 M ML0431 Rv2507 1,00E-
putative membrane protein 41 M ML0520 Rv2536 1,00E- putative
membrane protein 40 M PE Rv1386 1,00E- PE protein family 21 M PPE.0
Rv1387 3,00E- PPE protein family 99 M mihF Rv1388 4,00E-
integration host factor 24 M lprG Rv1411c 1,00E- putative
lipoprotein 50 M mtb12 Rv2376c 2,00E- putative secreted protein 28
M ML0676 Rv3354 2,00E- -- 15 M ML0703 Rv3311 e-125 -- M ML0730
Rv3281 1,00E- Contains Pfam match to entry PF01039 Carboxyl_trans,
20 Carboxyl transferase domain M ML0733 Rv3278c 4,00E- putative
membrane protein 53 M ML0734 Rv3277 2,00E- putative membrane
protein 64 M ML0748 Rv3269 1,00E- irpA 15 M ML0761 Rv3259 2,00E-
Mycobacterium smegmatis hypothetical 6.0 kDa protein 48 (partial
CDS) TR:Q9S425 (EMBL:AE164439) fasta scores: E( ): 1e-10, 75.5% id
in 53 aa M ML0764 Rv3256c 1,00E- -- 79 M ML0806 Rv3217c 5,00E-
putative membrane protein 25 M ML0810 Rv3212 e-104 putative
membrane protein M ML0813 Rv3209 2,00E- putative membrane protein
24 M ML0818 Rv3205c e-102 -- M ML0834 Rv2342 1,00E- -- 21 M ML0872
Rv2203 9,00E- putative membrane protein 43 M mmpS3 Rv2198c 3,00E-
putative membrane protein 49 M ML0878 Rv2197c 1,00E- putative
membrane protein 55 M ML0888 Rv2186c 8,00E- -- 41 M ML0889 Rv2185c
8,00E- -- 41 M ML0891 Rv2183c 2,00E- -- 27 M ML0895 Rv2179c 1,00E-
-- 60 M ML0898 Rv2175c 1,00E- putative DNA-binding protein 41 M
ML0901 Rv2172c e-102 -- M ML0902 Rv2171 3,00E- probable
lipoprotein 57 M ML0903 Rv2170 9,00E- -- 55 M ML0904 Rv2169c 7,00E-
putative membrane protein 32 M ML0907 Rv2164c 2,00E- putative
conserved membrane protein 50 M ML0923 Rv2144c 3,00E- possible
membrane protein 07 M ML0984 Rv2740 3,00E- -- 31 M ML0990 Rv2732c
9,00E- possible conserved membrane protein 46 M ML1001 Rv2722
7,00E- -- 06 M ML1004 Rv2719c 1,00E- possible conserved membrane
protein 17 M ML1015 Rv2709 7,00E- possible conserved membrane
protein 26 M ML1025 Rv2700 1,00E- possible secreted protein 62 M
ML1030 Rv2695 1,00E- -- 47 M ML1053 Rv2107 8,00E- PE protein 11 M
ML1055 Rv2347c 1,00E- --, family 19 M ML1056 Rv3619c 6,00E- --,
family 18 M ML1065 Rv1209 6,00E- membrane protein 21 M ML1077
Rv1222 3,00E- Mycobacterium avium TR:O05736 (EMBL:U87308) (133 aa);
34 Fasta score E( ): 0, 71.7% identity in 138 aa overlap M ML1079
Rv1224 2,00E- possible secreted protein 29 M ML1096 Rv1249c 2,00E-
putative membrane protein 48 M ML1098 Rv1251c 0.0
some_similarity_to_GTP-binding_proteins M ML1099 Rv1252c 5,00E-
putative lipoprotein 41 M ML1115 Rv1274 3,00E- lipoprotein, lprB 58
M ML1116 Rv1275 8,00E- lipoprotein, lprC 54 M ML1120 Rv1278 0.0
Contains multiple possible coiled-coils. Contains PS00017
ATP/GTP-binding site motif A (P-loop) M ML1138 Rv1303 3,00E-
integral membrane protein 20 M ML1176 Rv1342c 3,00E- possible
conserved membrane protein 34 M ML1177 Rv1343c 5,00E- possible
lipoprotein, membrane protein 43 M ML1180 Rv3619c 6,00E- ESAT-6
family 18 M ML1181 Rv2347c 1,00E- QILSS family 19 M ML1182 Rv1361c
2,00E- PPE family 47 M ML1183 Rv2107 8,00E- PE family 11 M ML1190
Rv2525c 3,00E- --, twin-Arginine secreted protein 70 M ML1221
Rv1590 2,00E- -- 18 M ML1222 Rv1591 1,00E- membrane protein 29 M
ML1232 Rv1184c 2,00E- Possibly secreted PE protein, Contains
PS00017 77 ATP/GTP-binding site motif A (P-loop) M ML1233 Rv3821
9,00E- conserved membrane protein 33 M ML1244 Rv2484c e-130
conserved membrane protein M ML1255 Rv2468c 1,00E- -- 41 M ML1270
Rv1610 8,00E- conserved membrane protein, Contains Pfam match to 36
entry PF00218 IGPS, M ML1296 Rv2137c 1,00E- -- 25 M ML1299 Rv2134c
9,00E- -- 60 M ML1300 Rv2133c 4,00E- -- 90 M ML1315 Rv2116 1,00E-
lipoprotein, LppK 35 M ML1334 Rv2091c 6,00E- conserved membrane
protein, calcium-binding 28 M ML1357 Rv1693 7,00E- -- 09 M ML1361
Rv1697 e-114 conserved membrane protein M ML1362 Rv1698 6,00E-
conserved secreted protein 58 M ML1389 Rv1635c e-144 conserved
membrane protein M ML1446 Rv2061c 5,00E- -- 35 M ML1470 Rv2446c
2,00E- conserved membrane protein 16 M ML1505 Rv1158c 7,00E-
conserved hypothetical Proline rich protein, possibly 17 secreted M
ML1506 Rv1157c 4,00E- -- 62 M ML1526 Rv2772c 2,00E- conserved
membrane protein 43 M ML1537 Rv1797 1,00E- possible secreted
protein 98 M ML1540 Rv1794 e-101 -- M ML1544 Rv1782 e-155 conserved
membrane protein M ML1560 Rv2843 8,00E- -- 24 M ML1584 Rv2876
3,00E- conserved membrane protein 25 M ML1607 Rv2898c 2,00E-
Contains Pfam match to entry PF02021 UPF0102, 17 Uncharacterised
protein family UPF0102, sp.vertline.O83883.vertline.Y913_TREPA
HYPOTHETICAL PROTEIN TP0913 >gi.vertline.7514634.vertline.pir. M
ML1610 Rv2901c 2,00E- -- 39 M ML1638 Rv2229c 2,00E- -- 63 M ML1677
Rv2980 3,00E- possible secreted protein 33 M ML1704 Rv3013 6,00E-
-- 71 M ML1720 Rv3035 e-107 -- M ML1813 Rv1476 3,00E- -- 39 M PPE.1
Rv0256c 3,00E- PPE-family protein 93 M ML1828A Rv0257 1,00E-
Probably pseudogene as Rv0257 is longer 15 M ML1911A Rv0634A --,
May be pseudogene as Rv0634A is predicted to be 13 aa longer M
ML1918 Rv3587c 5,00E- conserved hypothetical membrane protein 69 M
ML1937 Rv1111c 9,00E- probable integral membrane protein 39 M
ML1939 Rv1109c 9,00E- -- 49 M ML1945 Rv1100 6,00E- possible
membrane protein 57 M ML1991 Rv0096 4,00E- PPE 90 M ML1988 Rv0093c
1,00E- Contains possible membrane spanning hydrophobic 52 domains.
Note lacks the N-terminal 46 aa of the M. tuberculosis protein M
ML1993 Rv0098 3,00E- -- 50 M ML1995 Rv0100 1,00E- -- 18 M ML2010
Rv1906c 4,00E- putative lipoprotein (secreted in Mt) 31 M ML2022
Rv1893 2,00E- -- 13 M ML2023 Rv1891 2,00E- Contains probable
N-terminal signal sequence. 46 M ML2054 Rv1861 1,00E- integral
membrane protein 07 M ML2070 Rv1836c e-171 -- M ML2111 Rv0912
1,00E- membrane protein 35 M ML2113 Rv0910 6,00E- -- 49 M ML2141
Rv0879c 9,00E- -- 22 M ML2144 Rv0875c 2,00E- possible exported
protein 45 M ML2155 Rv0863 2,00E- -- 18 M ML2195 Rv0817c 4,00E-
probable exported protein 68 M ML2228 Rv0779c 3,00E- probable
membrane protein 50 M ML2258 Rv0543c 2,00E- -- 28 M ML2259 Rv0544c
2,00E- possible membrane protein 16 M ML2271 Rv0556 6,00E- putative
membrane protein 46 M ML2274 Rv0559c 9,00E- putative secreted
protein 23 M ML2320 Rv3705c 8,00E- -- 64 M ML2337 Rv3723 4,00E-
possible membrane spanning hydrophobic domains 57 M ML2377 Rv0451c
1,00E- mmpS4, Mycobacterium avium TmtpA TR:Q9XCF4 35
(EMBL:AF143772) (221 aa) fasta scores: E( ): 0, 58.9% id in 146 aa
M ML2380 Rv0455c 2,00E- possible secreted protein 37 M ML2388
Rv0463 9,00E- possible membrane protein 18 M ML2390 Rv1083 1,00E-
possible secreted/membrane protein 10 M ML2392 Rv1081c 6,00E-
conserved membrane protein, 34 hydrophobic_stretch_from_aa_26-48
M ML2407 Rv0531 5,00E- putative membrane protein 06 M ML2433 Rv0497
5,00E- conserved membrane protein 39 M ML2450 Rv0479c 7,00E-
possible secreted protein,
>gb|AAF74996.1|AF143402_1 57 (AF143402)
putative multicopper oxidase [Mycobacterium avium] M ML2452 Rv0477
2,00E- -- 23 M ML2454 Rv0475 6,00E- possible hemagglutinin 40 M
ML2465 Rv0464c 7,00E- -- 53 M ML2473 Rv3753c 2,00E- -- 53 M ML2489
Rv0383c 5,00E- possible secreted protein, hydrophobic N-terminus
and 91 Pro-rich C-terminus M ML2491 Rv1754c e-109 -- M ML2518
Rv0313 1,00E- -- 39 M ML2527 Rv0292 2,00E- conserved membrane
protein, 69 M ML2530 Rv0289 2,00E- -- 92 M ML2531 Rv0288 5,00E-
ESAT-6 family, possible cell surface protein 27 M ML2532 Rv3020c
9,00E- PE-family protein 10 M ML2534 Rv0285 9,00E- PE-family
protein 13 M ML2536 Rv0283 e-156 conserved membrane protein M
ML2557 Rv0250c 2,00E- -- 26 M mce Rv0169 e-107 Mce protein M
ML2569A Rv0236A 2,00E- Small secreted protein with typical
N-terminal signal 24 peptide M ML2570 Rv0236c 0.0 possible integral
membrane protein M ML2581 Rv0227c e-116 putative integral membrane
protein M ML2582 Rv0226c e-132 conserved membrane protein M ML2595
Rv0175 2,00E- possible membrane protein 41 M ML2596 Rv0176 1,00E-
conserved membrane protein 73 M ML2597 Rv0177 1,00E- conserved
membrane protein 42 M ML2598 Rv0178 2,00E- conserved membrane
protein 43 M ML2604 Rv0184 8,00E- -- 64 M ML2605 Rv0185 3,00E- --
47 M ML2614 Rv0199 3,00E- conserved membrane protein 47 M ML2615
Rv0200 5,00E- probable membrane protein 55 M ML2616 Rv0201c 5,00E-
-- 36 M ML2621 Rv0207c 2,00E- -- 43 M ML2627 Rv0216 e-103 -- M
ML2629 Rv0164 6,00E- -- 44 M ML2689 Rv0049 1,00E- -- 45 X ML0190
Rv1000 7,00E- gp|AL357613|AL357613_12 S.
coelicolor cosmid (210 aa) 53 E( ): 2.4e-44; 55.122% identity in
205 aa overlap; AE003963|AE003963_5 Xylella fastidiosa, E(
) 9.7e-14; 39.894% identity in 188 aa overlap. Weak similarity to
proteins involved in DNA repair X ML0257 Rv1025 4,00E- Also
hypothetical proteins from Thermotoga maritima 72 e.g. TN1078,
hypothetical protein, TR:Q9X0G7 (EMBL:AE001768) (170 aa) X ML0418
Rv3368c 2,00E- weak similarity Thermus aquaticus nox, NADH 76
dehydrogenase, SW:NOX_THETH (X60110) (205 aa); Fasta score E( )
0.00023, 28.8% identity in 212 aa overlap. X ML0577 Rv1440 9,00E-
putative protein-export membrane protein, secG 12 X ML0776 Rv3242c
3,00E- probable competence protein ComF - Deinococcus 11 radio . .
. 77 2e-13 X ML1037 Rv2683 2,00E- Contains 2 Pfam matches to entry
PF00571 CBS, CBS 42 domain. X ML1119 Rv1277 e-105 possibly
phosphoesterase X ML1159 Rv1324 e-116 probable thioredoxin X ML1249
Rv2476c 0.0 Rickettsia prowazekii TR:Q9ZCI2 (EMBL:AJ235273) (1581
aa); Fasta score E( ): 0, 32.9% identity in 1494 aa overlap X
ML1399 Rv1647 1,00E- weakly adenylate cyclases 76 X ML1444 Rv2054
3,00E- Weakly several carboxymethylenebutenolidases (EC 94
3.1.1.45) involved in 3-chlorocatechol degradation e.g. Pseudomonas
putida SW:CLCD_PSEPU (P11453) (236 aa) X ML1494 Rv1171 8,00E-
conserved membrane protein, pir||PH0210
hypothetical 19 protein 133 (fdxA 5' region) - Saccharo . . . 74
5e-13 X ML1503A Rv1159A 9,00E- S. coelicolor (SC5C7.25)
gp|AL031515|AL031515_25 33 (101 aa) E(
): 1.9e-06; 34.831% identity in 89 aa overlap; and archaebacteria.
X ML1660 Rv2926c 2,00E- --, pir||E72412 conserved
hypothetical protein -
69 Thermotoga maritima . . . 66 4e-10 X ML1723 Rv3038c e-152 --,
gb|AAC01738.1| (AF040571) methyltransferase
[Amycolatopsis medit . . . 59 1e-07 X ML1909 Rv0636 9,00E- Contains
Pfam match to entry PF01575 MaoC_dehydratas, 72 MaoC like domain.
ML2566 X desA2 Rv1094 7,00E- Gossypium hirsutum (Upland cotton)
acyl-[acyl-carrier 85 protein) desaturase precursor SW:STAD_GOSHI
(X95988) (397 aa); Fasta score E( ): 5.6e-05, 23.9% identity in 293
aa overlap. X ML1983 Rv1919c 8,00E- weakly similar pollen allergen
45 X ML2366 Rv3760 1,00E- Deinococcus radiodurans conserved
hypothetical 12 protein TR:Q9RU17 X ML2463 Rv0466 e-102 weakly
similar acyl-ACP thioesterase
A, Actinomycete-specific; M, mycobacterial-specific; X, limited
distribution; --, no information available.
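The similarity scores in the table above appear in two printed styles: a mantissa with a decimal comma followed by an exponent (for example, 1,00E-25) and a bare exponent (for example, e-114). The following is a minimal sketch, assuming a reader wishes to compare such values numerically; the function name `parse_evalue` is illustrative and does not appear in the application.

```python
def parse_evalue(text: str) -> float:
    """Convert a table-style E-value string to a float.

    Handles a decimal comma ("1,00E-25"), a bare exponent ("e-114"),
    and a plain decimal ("0.0").
    """
    cleaned = text.strip().replace(",", ".")
    # A bare exponent such as "e-114" has an implied mantissa of 1.
    if cleaned[:1] in ("e", "E"):
        cleaned = "1" + cleaned
    return float(cleaned)


if __name__ == "__main__":
    for raw in ("1,00E-25", "e-114", "0.0"):
        print(raw, "->", parse_evalue(raw))
```

Smaller values indicate stronger similarity, so an entry printed as e-114 ranks ahead of one printed as 1,00E-25.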
[0121]
TABLE 3 Possible twin arginine secreted proteins

M. tuberculosis  M. leprae   Gene   Predicted function
Rv0203           del         --     unknown
Rv0265c          NF          fecB2  iron_transport_protein_FeIII_dicitrate_transporter
Rv0846c          ML2171 pa   --     similar_to_several_L-ascorbate_oxidases
Rv1755c          del         plcD   phospholipase_C_precursor
Rv2349c          NF          plcC   phospholipase_C_precursor
Rv2350c          del         plcB   phospholipase_C_precursor
Rv2351c          NF          plcA   phospholipase_C_precursor
Rv2525c          ML1190      --     unknown
Rv2577           ML0497 ps   --     similarity_to_G755244_acid_phosphatase
Rv2833c          del         ugpB   sn-glycerol-3-phosphate transport
Rv3353c          del         --     unknown
NF               ML2649      --     unknown

del, deleted; NF, not found; ps, pseudogene.
[0122] The implications for this invention are widespread. M.
tuberculosis and M. leprae marker polypeptides are disclosed in SEQ
ID NO: 1 to SEQ ID NO:644. The discovery of the M. tuberculosis and
M. leprae marker polypeptides and DNA encoding the polypeptides
enables construction of expression vectors comprising nucleic acid
sequences encoding M. tuberculosis and M. leprae marker
polypeptides; host cells transfected or transformed with the
expression vectors; biologically active M. tuberculosis and M.
leprae marker polypeptides and M. tuberculosis and M. leprae marker
polypeptides as isolated or purified peptides; and antibodies
immunoreactive with M. tuberculosis and M. leprae marker
polypeptides. In addition, understanding of the mechanism by which
M. tuberculosis and M. leprae marker polypeptides function enables
the design of assays to detect inhibitors of M. tuberculosis and M.
leprae marker polypeptide activity.
[0123] As used herein, the term "M. tuberculosis and M. leprae
marker polypeptides" refers to a genus of polypeptides that
encompasses polypeptides of a formula selected from the group
consisting of SEQ ID NO: 1 to SEQ ID NO:644, as well as those
polypeptides having a high degree of similarity (at least 90%
homology) with such amino acid sequences and which polypeptides are
immunoreactive or biologically active.
[0124] The term "purified" as used herein, means that the M.
tuberculosis and M. leprae marker polypeptides are essentially free
of association with other proteins or polypeptides, for example, as
a purification product of recombinant host cell culture or as a
purified product from a non-recombinant source. The term
"substantially purified" as used herein, refers to a mixture that
contains M. tuberculosis and M. leprae marker polypeptides and is
essentially free of association with other proteins or
polypeptides, but for the presence of known proteins that can be
removed using a specific antibody, and which substantially purified
M. tuberculosis and M. leprae marker polypeptides can be used as
antigens.
[0125] A M. tuberculosis and M. leprae marker polypeptide "variant"
as referred to herein means a polypeptide substantially homologous
to native M. tuberculosis and M. leprae marker polypeptides, but
which has an amino acid sequence different from that of native M.
tuberculosis and M. leprae marker polypeptides because of one or
more deletions, insertions, or substitutions. The variant amino
acid sequence preferably is at least 80% identical to a native M.
tuberculosis and M. leprae marker polypeptide amino acid sequence,
most preferably at least 90% identical. The percent identity can be
determined, for example, by comparing sequence information using the
GAP computer program, version 6.0 described by Devereux et al.
(Nucl. Acids Res. 12:387, 1984) and available from the University
of Wisconsin Genetics Computer Group (UWGCG). The GAP program
utilizes the alignment method of Needleman and Wunsch (J. Mol.
Biol. 48:443, 1970), as revised by Smith and Waterman (Adv. Appl.
Math 2:482, 1981). The preferred default parameters for the GAP
program include: (1) a unary comparison matrix (containing a value
of 1 for identities and 0 for non-identities) for nucleotides, and
the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids
Res. 14:6745, 1986, as described by Schwartz and Dayhoff, eds.,
Atlas of Protein Sequence and Structure, National Biomedical
Research Foundation, pp. 353-358, 1979; (2) a penalty of 3.0 for
each gap and an additional 0.10 penalty for each symbol in each
gap; and (3) no penalty for end gaps.
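The percent identity calculation described above can be illustrated with a short sketch. The code below is not the GAP program: it is a minimal Needleman-Wunsch global alignment using the unary comparison matrix (1 for identities, 0 for non-identities) and, as a simplifying assumption, a single linear penalty of 1 per gap symbol in place of the 3.0 per gap plus 0.10 per symbol quoted above; end gaps are also penalized here. The function names and the choice of alignment length as the denominator are illustrative.

```python
def needleman_wunsch(a, b, match=1, mismatch=0, gap=1):
    """Globally align two sequences and return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = -gap * i
    for j in range(1, m + 1):
        score[0][j] = -gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] - gap,
                              score[i][j - 1] - gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] - gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of alignment length."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    top, bottom = needleman_wunsch("ACGT", "AGT")
    print(top, bottom, percent_identity(top, bottom))
```

For the toy input shown, the alignment is ACGT over A-GT, giving three identical columns out of four, or 75% identity. Other conventions divide by the length of the shorter sequence, so the denominator should be stated whenever a threshold such as 80% or 90% is applied.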
[0126] Variants can comprise conservatively substituted sequences,
meaning that a given amino acid residue is replaced by a residue
having similar physicochemical characteristics. Examples of
conservative substitutions include substitution of one aliphatic
residue for another, such as Ile, Val, Leu, or Ala for one another,
or substitutions of one polar residue for another, such as between
Lys and Arg; Glu and Asp; or Gln and Asn. Other such conservative
substitutions, for example, substitutions of entire regions having
similar hydrophobicity characteristics, are well known. Naturally
occurring M. tuberculosis and M. leprae marker polypeptide variants
are also encompassed by the invention. Examples of such variants
are proteins that result from alternate mRNA splicing events or
from proteolytic cleavage of the M. tuberculosis and M. leprae
marker polypeptides. Variations attributable to proteolysis
include, for example, differences in the termini upon expression in
different types of host cells, due to proteolytic removal of one or
more terminal amino acids from the M. tuberculosis and M. leprae
marker polypeptides. Variations attributable to frameshifting
include, for example, differences in the termini upon expression in
different types of host cells due to different amino acids.
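The conservative substitution classes named above can be expressed as a simple lookup. The following is a sketch only, limited to the four example groups given in the text (the aliphatic residues Ile, Val, Leu, and Ala, and the pairs Lys/Arg, Glu/Asp, and Gln/Asn); the names `CONSERVATIVE_GROUPS` and `is_conservative` are illustrative, and other conservative substitutions known in the art are not covered.

```python
# Example groups taken from the text; three-letter residue codes.
CONSERVATIVE_GROUPS = (
    frozenset({"Ile", "Val", "Leu", "Ala"}),  # aliphatic residues
    frozenset({"Lys", "Arg"}),                # polar pair
    frozenset({"Glu", "Asp"}),                # polar pair
    frozenset({"Gln", "Asn"}),                # polar pair
)


def is_conservative(original, replacement):
    """Return True if both residues fall in one of the example groups."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("Ile", "Val"))  # same aliphatic group
    print(is_conservative("Lys", "Glu"))  # different groups
```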
[0127] As stated above, the invention provides isolated and
purified, or homogeneous, M. tuberculosis and M. leprae marker
polypeptides, both recombinant and non-recombinant. Variants and
derivatives of native M. tuberculosis and M. leprae marker
polypeptides that can be used as antigens can be obtained by
mutations of nucleotide sequences coding for native M. tuberculosis
and M. leprae marker polypeptides. Alterations of the native amino
acid sequence can be accomplished by any of a number of
conventional methods. Mutations can be introduced at particular
loci by synthesizing oligonucleotides containing a mutant sequence,
flanked by restriction sites enabling ligation to fragments of the
native sequence. Following ligation, the resulting reconstructed
sequence encodes an analog having the desired amino acid insertion,
substitution, or deletion.
[0128] Alternatively, oligonucleotide directed, site specific
mutagenesis procedures can be employed to provide an altered gene
wherein predetermined codons can be altered by substitution,
deletion, or insertion. Exemplary methods of making the alterations
set forth above are disclosed by Walder et al. (Gene 42:133, 1986);
Bauer et al. (Gene 37:73, 1985); Craik (BioTechniques, January
1985, 12-19); Smith et al. (Genetic Engineering: Principles and
Methods, Plenum Press, 1981); Kunkel (Proc. Natl. Acad. Sci. USA
82:488, 1985); Kunkel et al. (Methods in Enzymol. 154:367, 1987);
and U.S. Pat. Nos. 4,518,584 and 4,737,462, all of which are
incorporated by reference.
[0129] Within an aspect of the invention, M. tuberculosis and M.
leprae marker polypeptides can be utilized to prepare antibodies
that specifically bind to M. tuberculosis and M. leprae marker
polypeptides. The term "antibodies" is meant to include polyclonal
antibodies, monoclonal antibodies, fragments thereof such as
F(ab')2 and Fab fragments, as well as any recombinantly produced
binding partners. Antibodies are defined to be specifically binding
if they bind M. tuberculosis and M. leprae marker polypeptides with
a Kₐ of greater than or equal to about 10⁷ M⁻¹.
Affinities of binding partners or antibodies can be readily
determined using conventional techniques, for example, those
described by Scatchard et al., Ann. N.Y. Acad. Sci., 51:660 (1949).
Polyclonal antibodies can be readily generated from a variety of
sources, for example, horses, cows, goats, sheep, dogs, chickens,
rabbits, mice, or rats, using procedures that are well known in the
art.
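The specific-binding criterion above can be checked numerically. The sketch below assumes that an affinity has been measured as a dissociation constant K_d in mol/L and converts it to an association constant using K_a = 1/K_d; the function names are illustrative, and because the text recites "about" 10^7 M^-1, the hard cutoff used here is a simplification.

```python
SPECIFIC_BINDING_KA = 1e7  # M^-1, the threshold recited in the text


def association_constant(kd_molar):
    """Convert a dissociation constant (mol/L) to K_a (L/mol)."""
    if kd_molar <= 0:
        raise ValueError("K_d must be positive")
    return 1.0 / kd_molar


def binds_specifically(ka_per_molar, threshold=SPECIFIC_BINDING_KA):
    """Return True if K_a meets or exceeds the threshold."""
    return ka_per_molar >= threshold


if __name__ == "__main__":
    for kd in (1e-9, 1e-5):
        ka = association_constant(kd)
        print(f"K_d={kd:g} M  K_a={ka:g} M^-1  specific={binds_specifically(ka)}")
```

Under this sketch a 1 nM binder qualifies as specifically binding, whereas a 10 µM binder does not.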
[0130] The invention further encompasses isolated fragments and
oligonucleotides derived from the nucleotide sequences of the
invention. The invention also encompasses polypeptides encoded by
these fragments and oligonucleotides.
[0131] Nucleic acid sequences within the scope of the invention
include isolated DNA and RNA sequences that hybridize to the native
M. tuberculosis and M. leprae marker nucleic acids disclosed herein
under conditions of moderate or severe stringency, and which encode
M. tuberculosis and M. leprae marker polypeptides. As used herein,
conditions of moderate stringency, as known to those having
ordinary skill in the art, and as defined by Sambrook et al.
Molecular Cloning: A Laboratory Manual, 2 ed. Vol. 1, pp.
1.101-104, Cold Spring Harbor Laboratory Press, (1989), include use
of a prewashing solution for the nitrocellulose filters 5×
SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0), hybridization conditions of
50% formamide, 6× SSC at 42°C (or other similar
hybridization solution, such as Stark's solution, in 50% formamide
at 42°C), and washing conditions of about 60°C,
0.5× SSC, 0.1% SDS. Conditions of high stringency are defined
as hybridization conditions as above, and with washing at
68°C, 0.2× SSC, 0.1% SDS. The skilled artisan will
recognize that the temperature and wash solution salt concentration
can be adjusted as necessary according to factors such as the
length of the probe.
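The two wash conditions recited above differ in both temperature and salt. The following sketch records them as data and estimates the sodium ion concentration of each wash. It assumes the standard laboratory recipe for SSC (20× SSC being 3.0 M NaCl and 0.3 M trisodium citrate), which is not stated in the text; the names `WASHES` and `sodium_molarity` are illustrative.

```python
# Assumed standard recipe: 1x SSC = 0.15 M NaCl + 0.015 M trisodium citrate.
NACL_M_PER_X = 0.15
CITRATE_M_PER_X = 0.015

# Wash conditions as recited in the text.
WASHES = {
    "moderate": {"temp_c": 60, "ssc": 0.5, "sds_pct": 0.1},
    "high": {"temp_c": 68, "ssc": 0.2, "sds_pct": 0.1},
}


def sodium_molarity(ssc_strength):
    """Approximate total Na+ (mol/L) in SSC of the given strength."""
    # NaCl supplies one Na+ per formula unit; trisodium citrate supplies three.
    return ssc_strength * (NACL_M_PER_X + 3 * CITRATE_M_PER_X)


if __name__ == "__main__":
    for name, wash in WASHES.items():
        na = sodium_molarity(wash["ssc"])
        print(f"{name}: {wash['temp_c']} C, {wash['ssc']}x SSC, Na+ ~ {na:.4f} M")
```

The high-stringency wash combines the higher temperature with the lower salt concentration, both of which destabilize imperfectly matched duplexes.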
[0132] Due to the known degeneracy of the genetic code, wherein
more than one codon can encode the same amino acid, a DNA sequence
can vary and still encode a M. tuberculosis and M. leprae marker
polypeptide of a formula selected from the group consisting of SEQ
ID NO: 1 to SEQ ID NO:644. Such variant DNA sequences can result
from silent mutations (e.g., occurring during PCR amplification),
or can be the product of deliberate mutagenesis of a native
sequence.
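The degeneracy described above can be demonstrated directly. The sketch below builds the standard genetic code table and translates two different DNA sequences that encode the same peptide. The sequences are hypothetical examples and are not taken from SEQ ID NO: 1 to SEQ ID NO:644; the function name `translate` is illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    native = "ATGGCTAAA"   # hypothetical sequence: Met-Ala-Lys
    variant = "ATGGCCAAG"  # silent changes in the second and third codons
    print(translate(native), translate(variant))
```

Both sequences translate to MAK, although they differ at two nucleotide positions.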
[0133] The invention thus provides equivalent isolated DNA
sequences, encoding M. tuberculosis and M. leprae marker
polypeptides, selected from: (a) DNA derived from the coding region
of a native M. tuberculosis and M. leprae marker nucleic acid; (b)
cDNA comprising the nucleotide sequence of the invention; (c) DNA
capable of hybridization to a DNA of (a) under conditions of
moderate stringency and which encode M. tuberculosis and M. leprae
marker polypeptides; and (d) DNA which is degenerate as a result of
the genetic code to a DNA defined in (a), (b) or (c) and which
encodes M. tuberculosis and M. leprae marker polypeptides. M.
tuberculosis and M. leprae marker polypeptides encoded by such DNA
equivalent sequences are encompassed by the invention.
[0134] DNA that is equivalent to the DNA sequence of the invention
will hybridize under moderately stringent conditions to the
double-stranded native DNA sequence that encodes polypeptides of a
formula selected from the group consisting of SEQ ID NO: 1 to SEQ
ID NO:644. Examples of M. tuberculosis and M. leprae marker
polypeptides encoded by such DNA, include, but are not limited to,
M. tuberculosis and M. leprae marker polypeptide fragments and M.
tuberculosis and M. leprae marker polypeptides comprising
inactivated N-glycosylation site(s), inactivated protease
processing site(s), or conservative amino acid substitution(s), as
described above. M. tuberculosis and M. leprae marker polypeptides
encoded by DNA derived from other species, wherein the DNA will
hybridize to the complement of the DNA of the invention are also
encompassed.
[0135] Recombinant expression vectors containing a nucleic acid
sequence encoding M. tuberculosis and M. leprae marker polypeptides
can be prepared using well known methods. The expression vectors
include a M. tuberculosis and M. leprae marker DNA sequence
operably linked to suitable transcriptional or translational
regulatory nucleotide sequences, such as those derived from a
mammalian, microbial, viral, or insect gene. Examples of regulatory
sequences include transcriptional promoters, operators, or
enhancers, an mRNA ribosomal binding site, and appropriate
sequences which control transcription and translation initiation
and termination. Nucleotide sequences are "operably linked" when
the regulatory sequence functionally relates to the M. tuberculosis
and M. leprae marker DNA sequence. Thus, a promoter nucleotide
sequence is operably linked to a M. tuberculosis and M. leprae
marker DNA sequence if the promoter nucleotide sequence controls
the transcription of the M. tuberculosis and M. leprae marker DNA
sequence. The ability to replicate in the desired host cells,
usually conferred by an origin of replication, and a selection gene
by which transformants are identified can additionally be
incorporated into the expression vector.
[0136] In addition, sequences encoding appropriate signal peptides
that are not naturally associated with M. tuberculosis and M.
leprae marker polypeptides can be incorporated into expression
vectors. For example, a DNA sequence for a signal peptide
(secretory leader) can be fused in-frame to the M. tuberculosis and
M. leprae marker nucleotide sequence so that the M. tuberculosis
and M. leprae marker polypeptide is initially translated as a
fusion protein comprising the signal peptide. A signal peptide that
is functional in the intended host cells enhances extracellular
secretion of the M. tuberculosis and M. leprae marker polypeptide.
The signal peptide can be cleaved from the M. tuberculosis and M.
leprae marker polypeptide upon secretion of the marker polypeptide
from the cell.
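The in-frame fusion of a signal peptide sequence described above amounts to two checks: the leader must contain a whole number of codons, and it must not contain a stop codon that would terminate translation before the marker sequence. The sketch below illustrates those checks only; the sequences are hypothetical, and the function name `fuse_in_frame` is illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(leader, coding):
    """Join a leader sequence to a coding sequence, preserving the frame."""
    leader, coding = leader.upper(), coding.upper()
    if len(leader) % 3 or len(coding) % 3:
        raise ValueError("both sequences must contain whole codons")
    leader_codons = [leader[i:i + 3] for i in range(0, len(leader), 3)]
    if any(codon in STOP_CODONS for codon in leader_codons):
        raise ValueError("leader contains a stop codon")
    return leader + coding


if __name__ == "__main__":
    # Hypothetical two-codon leader fused to a two-codon coding sequence.
    print(fuse_in_frame("ATGAAA", "GCTTAA"))
```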
[0137] Expression vectors for use in prokaryotic host cells
generally comprise one or more phenotypic selectable marker genes.
A phenotypic selectable marker gene is, for example, a gene
encoding a protein that confers antibiotic resistance or that
supplies an auxotrophic requirement. Examples of useful expression
vectors for prokaryotic host cells include those derived from
commercially available plasmids. Commercially available vectors
include those that are specifically designed for the expression of
proteins. These include pMAL-p2 and pMAL-c2 vectors, which are used
for the expression of proteins fused to maltose binding protein
(New England Biolabs, Beverly, Mass., USA).
[0138] Promoter sequences commonly used for recombinant prokaryotic
host cell expression vectors include β-lactamase
(penicillinase), lactose promoter system (Chang et al., Nature
275:615, 1978; and Goeddel et al., Nature 281:544, 1979),
tryptophan (trp) promoter system (Goeddel et al., Nucl. Acids Res.
8:4057, 1980; and EP-A-36776), and tac promoter (Maniatis,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, p. 412, 1982).
[0139] Suitable host cells for expression of M. tuberculosis and M.
leprae marker polypeptides include prokaryotes, yeast or higher
eukaryotic cells. Appropriate cloning and expression vectors for
use with bacterial, fungal, yeast, and mammalian cellular hosts are
described, for example, in Pouwels et al. Cloning Vectors: A
Laboratory Manual, Elsevier, N.Y., (1985). Cell-free translation
systems could also be employed to produce M. tuberculosis and M.
leprae marker polypeptides using RNAs derived from DNA constructs
disclosed herein.
[0140] It will be understood that the present invention is intended
to encompass the previously described proteins in isolated or
purified form, whether obtained using the techniques described
herein or other methods. In a preferred embodiment of this
invention, the M. tuberculosis and M. leprae marker polypeptides
are substantially free of human tissue and human tissue components,
nucleic acids, extraneous proteins and lipids, and adventitious
microorganisms, such as bacteria and mycoplasma. It will also be
understood that the invention encompasses equivalent proteins
having substantially the same biological and immunogenic
properties. Thus, this invention is intended to cover serotypic
variants of the proteins of the invention.
[0141] Depending on the use to be made of the M. tuberculosis and
M. leprae marker polypeptides of the invention, it may be desirable
to label them. Examples of suitable labels are radioactive labels,
enzymatic labels, fluorescent labels, chemiluminescent labels, and
chromophores. The methods for labeling proteins and glycoproteins
of the invention do not differ in essence from those widely used
for labeling immunoglobulin. The need to label may be avoided by
using labeled antibody to the antigen of the invention or
anti-immunoglobulin to the antibodies to the antigen as an indirect
marker.
[0142] Once the M. tuberculosis and M. leprae marker polypeptides
of the invention have been obtained, they can be used to produce
polyclonal and monoclonal antibodies reactive therewith. Thus, a
polypeptide of the invention can be used to immunize an animal host
by techniques known in the art. Such techniques usually involve
inoculation, but they may involve other modes of administration. A
sufficient amount of the polypeptide is administered to create an
immunogenic response in the animal host. Any host that produces
antibodies to the antigen of the invention can be used. Once the
animal has been immunized and sufficient time has passed for it to
begin producing antibodies to the antigen, polyclonal antibodies
can be recovered. The general method comprises removing blood from
the animal and separating the serum from the blood. The serum,
which contains antibodies to the antigen, can be used as an
antiserum to the antigen. Alternatively, the antibodies can be
recovered from the serum. Affinity purification is a preferred
technique for recovering purified polyclonal antibodies to the
antigen from the serum.
[0143] Monoclonal antibodies to the antigens of the invention can
also be prepared. One method for producing monoclonal antibodies
reactive with the antigens comprises the steps of immunizing a host
with the antigen; recovering antibody producing cells from the
spleen of the host; fusing the antibody producing cells with
myeloma cells deficient in the enzyme hypoxanthine-guanine
phosphoribosyl transferase to form hybridomas; selecting at least one
of the hybridomas by growth in a medium comprising hypoxanthine,
aminopterin, and thymidine; identifying at least one of the
hybridomas that produces an antibody to the antigen; culturing the
identified hybridoma to produce antibody in a recoverable quantity;
and recovering the antibodies produced by the cultured
hybridoma.
[0144] These polyclonal or monoclonal antibodies can be used in a
variety of applications. Among these is the neutralization of
corresponding proteins. They can also be used to detect mycobacterial
antigens in biological preparations or in purifying corresponding
proteins, glycoproteins, or mixtures thereof, for example, when
used in affinity chromatographic columns.
[0145] The M. tuberculosis and M. leprae marker polypeptides can be
used as antigens to identify antibodies to a mycobacterium in
materials and to determine the concentration of the antibodies in
those materials. Thus, the antigens can be used for qualitative or
quantitative determination of a mycobacterium in a material. Such
materials of course include human tissue and human cells, as well
as biological fluids, such as human body fluids, including human
sera. When used as a reagent in an immunoassay for determining the
presence or concentration of the antibodies to a mycobacterium, the
antigens of the present invention provide an assay that is
convenient, rapid, sensitive, and specific.
[0146] More particularly, the antigens of the invention can be
employed for the detection of a mycobacterium by means of
immunoassays that are well known for use in detecting or
quantifying humoral components in fluids. Thus, antigen-antibody
interactions can be directly observed or determined by secondary
reactions, such as precipitation or agglutination. In addition,
immunoelectrophoresis techniques can also be employed. For example,
the classic combination of electrophoresis in agar followed by
reaction with anti-serum can be utilized, as well as
two-dimensional electrophoresis, rocket electrophoresis, and
immunolabeling of polyacrylamide gel patterns (Western Blot or
immunoblot). Other immunoassays in which the antigens of the
present invention can be employed include, but are not limited to,
radioimmunoassay, competitive immunoprecipitation assay, enzyme
immunoassay, and immunofluorescence assay. It will be understood
that turbidimetric, colorimetric, and nephelometric techniques can
be employed. An immunoassay based on Western Blot technique is
preferred.
[0147] Immunoassays can be carried out by immobilizing one of the
immunoreagents, either an antigen of the invention or an antibody
of the invention to the antigen, on a carrier surface while
retaining immunoreactivity of the reagent. The reciprocal
immunoreagent can be unlabeled or labeled in such a manner that
immunoreactivity is also retained. These techniques are especially
suitable for use in enzyme immunoassays, such as enzyme linked
immunosorbent assay (ELISA) and competitive inhibition enzyme
immunoassay (CIEIA).
[0148] When either the antigen of the invention or antibody to the
antigen is attached to a solid support, the support is usually a
glass or plastic material. Plastic materials molded in the form of
plates, tubes, beads, or disks are preferred. Examples of suitable
plastic materials are polystyrene and polyvinyl chloride. If the
immunoreagent does not readily bind to the solid support, a carrier
material can be interposed between the reagent and the support.
Examples of suitable carrier materials are proteins, such as bovine
serum albumin, or chemical reagents, such as glutaraldehyde or
urea. Coating of the solid phase can be carried out using
conventional techniques.
[0149] The invention provides immunogenic M. tuberculosis and M.
leprae marker polypeptides, and more particularly, protective
polypeptides for use in the preparation of vaccine compositions
against a mycobacterium. These polypeptides can thus be employed as
vaccines by administering the polypeptides to a mammal
susceptible to a mycobacterial infection. Conventional modes of
administration can be employed. For example, administration can be
carried out by oral, respiratory, or parenteral routes.
Intradermal, subcutaneous, and intramuscular routes of
administration are preferred when the vaccine is administered
parenterally.
[0150] The major purpose of the immune response in a
mycobacteria-infected mammal is to inactivate the free mycobacteria
and to eliminate mycobacteria infected cells that have the
potential to release infectious mycobacteria. The B-cell arm of the
immune response has some responsibility for inactivating free
mycobacteria. The principal manner in which this is achieved is by
neutralization of infectivity. Another major mechanism for
destruction of mycobacteria-infected cells is provided by
cytotoxic T lymphocytes (CTL) that recognize M. tuberculosis and M.
leprae marker antigens expressed in combination with class I
histocompatibility antigens at the cell surface. The CTLs recognize
M. tuberculosis and M. leprae marker polypeptides processed within
cells from a M. tuberculosis and M. leprae marker polypeptide that
is produced, for example, by the infected cell or that is
internalized by a phagocytic cell. Thus, this invention can be
employed to stimulate a B-cell response to M. tuberculosis and M.
leprae marker polypeptides, as well as immunity mediated by a CTL
response following infection. The CTL response can play an
important role in mediating recovery from primary mycobacterial
infection and in accelerating recovery during subsequent
infections.
[0151] The ability of the M. tuberculosis and M. leprae marker
polypeptides and vaccines of the invention to induce protective
levels of neutralizing antibody in a host can be enhanced by
emulsification with an adjuvant, incorporating in a liposome,
coupling to a suitable carrier, or by combinations of these
techniques. For example, the M. tuberculosis and M. leprae marker
polypeptides of the invention can be administered with a
conventional adjuvant, such as aluminum phosphate and aluminum
hydroxide gel, in an amount sufficient to potentiate humoral or
cell-mediated immune responses in the host. Similarly, the M.
tuberculosis and M. leprae marker polypeptides can be bound to
lipid membranes or incorporated in lipid membranes to form
liposomes. The use of nonpyrogenic lipids free of nucleic acids and
other extraneous matter can be employed for this purpose.
[0152] The immunization schedule will depend upon several factors,
such as the susceptibility of the host to infection and the age of
the host. A single dose of the vaccine of the invention can be
administered to the host or a primary course of immunization can be
followed in which several doses at intervals of time are
administered. Subsequent doses used as boosters can be administered
as needed following the primary course.
[0153] The M. tuberculosis and M. leprae marker polypeptides and
vaccines of the invention can be administered to the host in an
amount sufficient to prevent or inhibit a mycobacterial infection or
replication in vivo. In any event, the amount administered should
be at least sufficient to protect the host against substantial
immunosuppression, even though a mycobacterial infection may not be
entirely prevented. An immunogenic response can be obtained by
administering the polypeptides of the invention to the host in an
amount of about 10 to about 500 micrograms antigen per kilogram of
body weight, preferably about 50 to about 100 micrograms antigen
per kilogram of body weight. The polypeptides and vaccines of the
invention can be administered together with a physiologically
acceptable carrier. For example, a diluent, such as water or a
saline solution, can be employed.
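The per-kilogram amounts recited above scale linearly with body weight. The following sketch performs that arithmetic only, using the broad range (about 10 to about 500 micrograms per kilogram) and the preferred range (about 50 to about 100 micrograms per kilogram) from the text. The function name `dose_range_ug` and the 70 kg example weight are illustrative assumptions.

```python
# Ranges recited in the text, in micrograms of antigen per kilogram.
BROAD_UG_PER_KG = (10, 500)
PREFERRED_UG_PER_KG = (50, 100)


def dose_range_ug(body_weight_kg, per_kg=PREFERRED_UG_PER_KG):
    """Return the (low, high) antigen amount in micrograms for a host."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = per_kg
    return low * body_weight_kg, high * body_weight_kg


if __name__ == "__main__":
    weight = 70  # kg, hypothetical host
    print("preferred:", dose_range_ug(weight))
    print("broad:", dose_range_ug(weight, BROAD_UG_PER_KG))
```

For a 70 kg host the preferred range corresponds to 3,500 to 7,000 micrograms of antigen, and the broad range to 700 to 35,000 micrograms.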
[0154] Another aspect of the invention provides a method of DNA
vaccination. The method also includes administering any combination
of the nucleic acids encoding M. tuberculosis and M. leprae marker
polypeptides, with or without carrier molecules, to an individual.
In embodiments, the individual is an animal, and is preferably a
mammal. More preferably, the mammal is selected from the group
consisting of a human, a dog, a cat, a bovine, a pig, and a horse.
In an especially preferred embodiment, the mammal is a human.
[0155] The methods of treating include administering not only
immunogenic compositions comprising M. tuberculosis and M. leprae
marker polypeptides, but also compositions comprising nucleic acids
encoding M. tuberculosis and M. leprae marker polypeptides. Those of
skill in the art are cognizant of the concept, application, and
effectiveness of nucleic acid vaccines (e.g., DNA vaccines) and
nucleic acid vaccine technology as well as protein and polypeptide
based technologies. The nucleic acid based technology allows the
administration of nucleic acids encoding M. tuberculosis and M.
leprae marker polypeptides, naked or encapsulated, directly to
tissues and cells without the need for production of encoded
proteins prior to administration. The technology is based on the
ability of these nucleic acids to be taken up by cells of the
recipient organism and expressed to produce an immunogenic
determinant to which the recipient's immune system responds.
Typically, the expressed antigens are displayed on the surface of
cells that have taken up and expressed the nucleic acids, but
expression and export of the encoded antigens into the circulatory
system of the recipient individual is also within the scope of the
present invention. Such nucleic acid vaccine technology includes,
but is not limited to, delivery of naked DNA and RNA and delivery
of expression vectors encoding M. tuberculosis and M. leprae marker
polypeptides. Although the technology is termed "vaccine", it is
equally applicable to immunogenic compositions that do not result
in a protective response. Such non-protection inducing compositions
and methods are encompassed within the present invention.
[0156] Although it is within the present invention to deliver
nucleic acids encoding M. tuberculosis and M. leprae marker
polypeptides and carrier molecules as naked nucleic acid, the
present invention also encompasses delivery of nucleic acids as
part of larger or more complex compositions. Included among these
delivery systems are mycobacterium, mycobacteria-like particles, or
bacteria containing the nucleic acid encoding M. tuberculosis and
M. leprae marker polypeptides. Also, complexes of the invention's
nucleic acids and carrier molecules with cell permeabilizing
compounds, such as liposomes, are included within the scope of the
invention. Other compounds, such as molecular vectors (EP 696,191,
Samain et al.) and delivery systems for nucleic acid vaccines are
known to the skilled artisan and exemplified in, for example, WO 93
06223 and WO 90 11092, U.S. Pat. Nos. 5,580,859, and U.S. 5,589,466
(Vical's patents), which are incorporated by reference herein, and
can be made and used without undue or excessive
experimentation.
[0157] To further achieve the objective and in accordance with the
purposes of the present invention, a kit capable of diagnosing
mycobacteria infection is described. This kit, in one embodiment,
contains the DNA sequences of this invention, which are capable of
hybridizing to RNA or analogous DNA sequences to indicate the
presence of a mycobacteria infection. Different diagnostic
techniques can be used which include, but are not limited to: (1)
Southern blot procedures to identify cellular DNA which may or may
not be digested with restriction enzymes; (2) Northern blot
techniques to identify RNA extracted from cells; and (3) dot blot
techniques, i.e., direct filtration of the sample through an ad hoc
membrane, such as nitrocellulose or nylon, without previous
separation on agarose gel. Suitable material for dot blot technique
could be obtained from body fluids including, but not limited to,
serum and plasma, supernatants from culture cells, or cytoplasmic
extracts obtained after cell lysis and removal of membranes and
nuclei of the cells by centrifugation.
[0158] The invention also provides screening assays for identifying
agents that modulate (e.g. augment or inhibit) the activity of M.
tuberculosis and M. leprae marker polypeptides. Assays for
detecting the ability of agents to inhibit or augment the activity
of M. tuberculosis and M. leprae marker polypeptides provide for
facile high-throughput screening of agent banks (e.g., compound
libraries, peptide libraries, and the like) to identify antagonists
or agonists of these marker polypeptides. Such M. tuberculosis and
M. leprae marker polypeptide antagonists and agonists may modulate
marker polypeptide activity and thereby modulate, inhibit, or even
prevent infection of a host by M. tuberculosis and M. leprae.
[0159] For example, yeast comprising (1) an expression cassette
encoding a GAL4 DNA binding domain (or GAL4 activator domain) fused
to a binding fragment of M. tuberculosis or M. leprae marker
polypeptide, (2) an expression cassette encoding a GAL4
activator domain (or GAL4 DNA binding domain, respectively) fused to a
binding fragment of a test polypeptide, and (3) a reporter gene
(e.g., .beta.-galactosidase) comprising a cis-linked GAL4
transcriptional response element can be used for agent screening.
Such yeast are incubated, and expression of the reporter gene
(e.g., .beta.-galactosidase) is determined by the capacity of the
agent to affect expression of the reporter gene and thereby
identify the test polypeptide as a candidate modulatory agent for
M. tuberculosis or M. leprae marker polypeptides.
[0160] Yeast two-hybrid systems can be used to screen a mammalian
(typically human) cDNA expression library, wherein cDNA is fused to
a GAL4 DNA binding domain or activator domain, and either a M.
tuberculosis or M. leprae marker polypeptide sequence is fused to a
GAL4 activator domain or DNA binding domain, respectively. Such a
yeast two-hybrid system can screen for cDNAs that encode proteins
that interact with M. tuberculosis or M. leprae marker
polypeptides.
[0161] Polypeptides that interact with M. tuberculosis or M. leprae
marker polypeptides can also be identified by immunoprecipitation
of M. tuberculosis or M. leprae marker polypeptides with antibody,
and identification of co-precipitating species. Further,
polypeptides that interact with M. tuberculosis or M. leprae marker
polypeptides can be identified by screening a peptide library
(e.g., a bacteriophage peptide display library) with a M.
tuberculosis or M. leprae marker polypeptide.
[0162] Additional embodiments of the invention are directed to
methods that employ specific antisense polynucleotides
complementary to all or part of M. tuberculosis or M. leprae marker
nucleic acids. Such complementary antisense polynucleotides may
include nucleotide substitutions, additions, deletions, or
transpositions, so long as specific hybridization to the relevant
target sequence corresponding to M. tuberculosis or M. leprae
marker nucleic acids is retained as a functional property of the
polynucleotide. Complementary antisense polynucleotides include
soluble antisense RNA or DNA oligonucleotides that can hybridize
specifically to M. tuberculosis and M. leprae marker nucleic acid
species and prevent transcription of the mRNA species and/or
translation of the encoded polypeptide. See Ching et al. (1989)
Proc. Natl. Acad. Sci. U.S.A. 86:10006; Broder et al. (1990) Ann.
Int. Med. 113:604; Loreau et al. (1990) FEBS Letters 274:53;
Holcenberg et al., WO91/11535; U.S. Ser. No. 07/530,165;
WO91/09865; WO91/04753; WO90/13641; and EP 386563. The antisense
polynucleotides, therefore, inhibit production of M. tuberculosis
or M. leprae marker polypeptides. Antisense polynucleotides that
prevent transcription and/or translation of mRNA corresponding to
M. tuberculosis or M. leprae marker polypeptides may inhibit or
prevent infection by M. tuberculosis or M. leprae. Antisense
polynucleotides of various lengths may be produced, although such
antisense polynucleotides typically comprise a sequence of at least
about 25 consecutive nucleotides, which are substantially identical
to a naturally-occurring M. tuberculosis or M. leprae marker
nucleic acid, and typically are identical to a M. tuberculosis or
M. leprae marker nucleic acid. For general methods relating to
antisense polynucleotides, see Antisense RNA and DNA (1988) D. A.
Melton, Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor,
N.Y.
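As a non-limiting illustration of the complementarity requirement discussed above, the following sketch derives a DNA oligonucleotide that is the reverse complement of a target RNA or DNA segment and enforces a minimum length of 25 nucleotides. The function name and the 27-nucleotide target are hypothetical; the target is not a sequence of the invention.

```python
# Watson-Crick complement for DNA output; U in an RNA target pairs with A.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(target, min_length=25):
    """Return the DNA reverse complement of `target` (DNA or RNA, 5'->3')."""
    target = target.upper()
    unexpected = set(target) - set("ACGTU")
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    if len(target) < min_length:
        raise ValueError(f"target shorter than {min_length} nucleotides")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


# Hypothetical 27-nt mRNA segment, used only to exercise the function.
toy_mrna = "AUGGCUACCGUUAGCCUGAAGGCUUAA"
oligo = antisense_dna(toy_mrna)
print(len(oligo), oligo)
```

The sketch checks base pairing only; it says nothing about hybridization specificity, secondary structure, or stability of the oligonucleotide.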
[0163] Polypeptides with similar sequence should have similar
function. Thus, the functions of M. tuberculosis and M. leprae
marker polypeptides can be assessed by a database search. One
method by which structural and functional domains can be identified
is by comparison of the nucleotide and/or amino acid sequence data
for M. tuberculosis and M. leprae marker polypeptides, or M.
tuberculosis or M. leprae marker nucleic acids, to public or
proprietary sequence databases. Preferably, computerized comparison
methods are used to identify sequence motifs or predict polypeptide
conformation domains that occur in other polypeptides of known
structure and/or function. For example, methods to identify protein
sequences that fold into a known three-dimensional structure are
known (Bowie et al. (1991) Science 253:164).
[0164] As other examples, but not for limitation, the programs GAP,
BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software
Package (Genetics Computer Group, 575 Science Dr., Madison, Wis.)
can be used to identify sequences in databases, such as
GenBank/EMBL, that have regions of homology with M. tuberculosis or
M. leprae marker polypeptides or M. tuberculosis or M. leprae
marker nucleic acids. Such homologous regions are candidate
structural or functional domains. Alternatively, other algorithms
are provided for identifying such domains from sequence data.
Further, network methods, whether implemented in hardware or
software, can be used to: (1) identify related protein sequences
and nucleotide sequences, and (2) define structural or functional
domains in M. tuberculosis and M. leprae marker polypeptides.
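To illustrate what is meant by locating a region of homology between two sequences, the following is a minimal Smith-Waterman local alignment. It is offered as a sketch only and is not the Wisconsin Package programs named above; the scoring values (match 2, mismatch -1, gap -2), the function name, and the two short peptide strings are assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


# Hypothetical peptides sharing the segment TAYIAK.
result = smith_waterman("MKTAYIAKQR", "GGTAYIAKGG")
print(result)  # (12, 'TAYIAK', 'TAYIAK')
```

Database search programs typically use substitution matrices and heuristics for speed; the uniform match and mismatch scores here are kept only for readability.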
[0165] Thus, those of skill in the art can recognize sequence
motifs and structural conformations that may be used to define
structural and functional domains in the M. tuberculosis and M.
leprae marker nucleic acids of the invention. Hydrophobicity
profiles can be generated and displayed graphically using the
ProtScale utility at ExPASy
(http://expasy.hcuge.ch/cgi-bin/protscale.pl). There are a number
of different ways of predicting transmembrane helices in sequences,
the simplest being merely to look for regions of the protein
containing a run of 20 hydrophobic residues. However, there are
also a number of more sophisticated, and accurate, algorithms which
can be used not only to predict the location of transmembrane
helices but also their orientation in the membrane.
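The two simplest approaches mentioned above, a windowed hydrophobicity profile and a scan for a run of 20 hydrophobic residues, can be sketched as follows. The sketch uses the Kyte-Doolittle hydropathy scale; the 19-residue window, the choice of A, C, F, I, L, M and V as the hydrophobic set (the residues with positive Kyte-Doolittle values), the function names, and the toy sequence are all assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of each full window; raises KeyError on non-standard residues."""
    values = [KD[residue] for residue in seq]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def hydrophobic_runs(seq, min_run=20, hydrophobic=frozenset("ACFILMV")):
    """Return (start, end) half-open, 0-based spans of uninterrupted hydrophobic residues."""
    runs, start = [], None
    for i, residue in enumerate(seq + "*"):  # "*" is a sentinel that closes a final run
        if residue in hydrophobic:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    return runs


# Hypothetical sequence: a 20-residue hydrophobic core flanked by charged residues.
toy = "MKDER" + "LIVAL" * 4 + "KRDEN"
profile = hydropathy_profile(toy)
print(hydrophobic_runs(toy), round(max(profile), 2))
```

A strict uninterrupted run is a crude criterion; real transmembrane segments often contain occasional polar residues, which is one reason the more sophisticated predictors mentioned above exist.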
[0166] Proteins can contain signals within their sequence which
assist in their processing within the cell, for example leader
sequences or signals which target proteins to specific compartments
within cells. Web resources are available to help predict both
these types of sites. Different regions of a polypeptide evolve at
different rates; some parts of a polypeptide must retain a certain
pattern of residues for the polypeptide to function. By identifying
such conserved regions, it is possible to make predictions about
the polypeptide function. Examples of conserved sequences can be
found around the active sites of enzymes, sites of
post-translational modification, binding sites for co-factors,
protein sorting signals, etc. A number of bioinformatics resources
have been developed both to build databases of conserved patterns
and to search for instances of such patterns in sequences. One of
the best known motif databases is PROSITE, which can be employed in
this invention.
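As a sketch of how a motif database entry can be applied to a sequence, the following converts a pattern written in PROSITE syntax into a regular expression and reports every match, including overlapping ones. Only the basic syntax elements are handled (residues, x, [..], {..}, repetition counts, and the < and > anchors). The example pattern N-{P}-[ST]-{P} is the PROSITE N-glycosylation site pattern; the function names and the test peptide are hypothetical.

```python
import re

_ELEMENT = re.compile(
    r"(<?)(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern):
    """Translate a basic PROSITE pattern into Python regular-expression syntax."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        n_anchor, core, low, high, c_anchor = m.groups()
        if core == "x":                      # x matches any residue
            core = "."
        elif core.startswith("{"):           # {..} lists residues that are not allowed
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if n_anchor else "") + core + repeat + ("$" if c_anchor else ""))
    return "".join(parts)


def find_motifs(seq, pattern):
    """Return (0-based start, matched text) for every match, overlaps included."""
    lookahead = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(seq)]


# Hypothetical peptide: NGSA matches, NPSA is excluded by the {P} position.
hits = find_motifs("AANGSAANPSA", "N-{P}-[ST]-{P}.")
print(hits)  # [(2, 'NGSA')]
```

Short patterns such as this one match by chance very frequently, so a hit is a candidate site rather than evidence of function.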
[0167] This invention will be described in greater detail in the
following Examples.
EXAMPLE 1
[0168] The whole genome sequence was obtained from a combination of
sequenced cosmids.sup.45 and 54,000 end sequences (giving
7.1.times. coverage) from a pUC18 genomic shotgun library using dye
terminator chemistry on ABI373 or 377 automated sequencers. The
sequences of 42 cosmids previously generated by multiplex
sequencing.sup.46 were used for scaffolding purposes only. The sequence
was assembled using Phrap (P. Green, unpublished), finished using
GAP4.sup.47 then compared with sequences present in public
databases using FASTA, BLASTN and BLASTX.sup.48. Potential CDS were
predicted, and gene and protein sequences analysed as described
previously.sup.8, 49, using Artemis.sup.50 to collate data and
facilitate annotation. The genome and proteome sequences of M.
leprae and M. tuberculosis H37Rv were compared pairwise to identify
conserved genes using the Artemis Comparison Tool (ACT) (K.
Rutherford; unpublished; http://www.sanger.ac.uk/Software/ACT/).
Pseudogenes had one or more mutations that would ablate expression
and were pinpointed by direct comparison with M. tuberculosis.
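As a worked illustration of the coverage figure given in this Example, fold coverage is commonly computed as the total number of sequenced bases divided by the genome length. The sketch below assumes that definition and uses the 3.27 Mb M. leprae genome size stated elsewhere in this specification. The mean read length is not given in the text and is back-calculated here purely as an estimate; the function names are hypothetical.

```python
def fold_coverage(n_reads, mean_read_length_bp, genome_size_bp):
    """Fold coverage = total bases sequenced / genome length."""
    return n_reads * mean_read_length_bp / genome_size_bp


def implied_read_length(n_reads, coverage, genome_size_bp):
    """Mean read length (bp) that would give the stated coverage."""
    return coverage * genome_size_bp / n_reads


GENOME_BP = 3_270_000   # 3.27 Mb, as stated in the summary paragraph
N_READS = 54_000        # end sequences from the pUC18 shotgun library
STATED_COVERAGE = 7.1

estimate = implied_read_length(N_READS, STATED_COVERAGE, GENOME_BP)
print(round(estimate))  # about 430 bp per read under these assumptions
```

The estimate ignores the cosmid-derived sequence and any trimming of low-quality bases, so it is best read as an order-of-magnitude consistency check.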
EXAMPLE 2
Target Discovery
[0169] To illustrate the usefulness of comparative mycobacterial
genomics for identifying potentially important proteins, a precise
example will now be given. Preproteins transported by the TAT
pathway generally bind redox cofactors and fold or oligomerize
before crossing the membrane.sup.54, 62. After removal of the
signal peptide, these proteins usually function in extracytoplasmic
electron transfer chains. The specialized machinery that recognizes
the twin-arginine motif, and translocates the preprotein across the
membrane, is composed of several different Tat proteins. In
Escherichia coli, TatA and TatE are 50% identical and share weak
similarity with TatB. All three proteins are predicted to be
anchored to the cytoplasmic membrane via an N-terminal hydrophobic
alpha-helix, and to have cytoplasmic amphipathic helices followed
by variable regions. The TatC protein is predicted to be an
integral membrane protein with six transmembrane segments. M.
tuberculosis and M. leprae both contain clearly identifiable tatA,
tatB, tatC, and tatD genes and must, therefore, produce a
functional Tat system.
[0170] On examination of the proteome of M. tuberculosis, eleven
potential substrates for the Tat export system were recognized on
the basis of their signal peptides containing potential twin
arginine motifs (Table 3). During the extensive reductive evolution
of the genome of M. leprae only one of the corresponding genes,
ML1190, has escaped inactivation. It is orthologous to Rv2525c of
M. tuberculosis but shows no similarity to proteins present in
sequence databases. The 240 amino acid long precursor protein
encoded by Rv2525c (or its counterpart ML1190) contains five
histidines and one cysteine residue that may be important for
coordinating divalent metal ions. The conservation of this coding
sequence by M. leprae, in the face of massive gene loss, is a
strong indication that it must play an important biological role.
Given the many parallels with Tat systems elsewhere, it is likely
to be in electron transport. These indirect arguments suggest on
the one hand that, if this function were essential, the
ML1190/Rv2525c gene product might represent a novel drug target or,
on the other, since it is likely to be located extracellularly it
may, therefore, be an important sentinel protein antigen.
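The kind of screen described in this Example can be sketched as a scan of the N-terminal region of each predicted protein for a twin-arginine motif, together with a count of the histidine and cysteine residues that might coordinate metal ions. The pattern below (two arginines, any residue, then two hydrophobic residues) is an illustrative relaxation of the published S/T-R-R-x-F-L-K consensus and is not the criterion actually used to compile Table 3; the 40-residue window, the function names, and the toy precursor are likewise assumptions.

```python
import re

# Illustrative relaxed motif: R-R-x followed by two hydrophobic residues.
TAT_MOTIF = re.compile(r"RR.[FILVMA]{2}")


def twin_arginine_hits(seq, n_terminal_window=40):
    """Return 0-based start positions of motif matches in the N-terminal window."""
    return [m.start() for m in TAT_MOTIF.finditer(seq[:n_terminal_window])]


def metal_ligand_counts(seq):
    """Count histidine and cysteine residues, the candidate metal ligands."""
    return {"H": seq.count("H"), "C": seq.count("C")}


# Hypothetical precursor; it is not the Rv2525c or ML1190 sequence.
toy_precursor = "MSDSRRSFLK" + "ALGAAAVAGLGLAPAHAADHCHEHHK"
print(twin_arginine_hits(toy_precursor), metal_ligand_counts(toy_precursor))
```

A motif match alone does not establish Tat-dependent export; the signal peptide context and experimental confirmation would still be needed.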
[0171] The Mycobacterium tuberculosis strain H37Rv genomic library
has been deposited at the Collection Nationale de Cultures de
Microorganismes (C.N.C.M.), of Institut Pasteur, 28, rue du Docteur
Roux, F-75724 Paris, Cedex 15, France, on Nov. 19, 1997, under the
Accession Number I-1945. This genomic DNA library is disclosed in
International patent application No. WO 9954487 (Institut
Pasteur).
[0172] In summary, leprosy, a chronic human neurological disease,
results from infection with the obligate intracellular pathogen
Mycobacterium leprae, a close relative of the tubercle bacillus.
M. leprae has the longest doubling time of all known bacteria and
has thwarted every effort at axenic culture. Comparison of the 3.27
Mb genome sequence of an armadillo-derived Indian isolate of the
leprosy bacillus with that of Mycobacterium tuberculosis (4.41 Mb)
provides clear explanations for these properties and reveals an
extreme case of reductive evolution. Less than half of the genome
contains functional genes while pseudogenes, with intact
counterparts in M. tuberculosis, abound. Genome downsizing and the
current mosaic arrangement appear to have resulted from extensive
recombination events between dispersed repetitive sequences. Gene
deletion and decay have eliminated many important metabolic
activities including siderophore production, part of the oxidative,
and all of the microaerophilic and anaerobic respiratory chains,
together with numerous catabolic systems and their regulatory
circuits.
REFERENCES
[0173] The entire disclosure of each of the following publications
is relied upon and incorporated by reference herein.
[0174] 1. Anon. Randomised controlled trial of single BCG, repeated
BCG, or combined BCG and killed Mycobacterium leprae vaccine for
prevention of leprosy and tuberculosis in Malawi. Karonga
Prevention Trial Group. Lancet 348, 17-24 (1996).
[0175] 2. Nordeen, S. K. & Hombach, J. M. in Tropical disease
research: Progress 1991-1992: eleventh programme report of the
UNDP/World Bank/WHO Special Programme for Research and Training in
Tropical Diseases. (eds. Walgate, R. & Simpson, K.) 47-55
(World Health Organisation, Geneva, 1993).
[0176] 3. Anon. WHO Weekly Epidemiological Record 73, 40
(1998).
[0177] 4. Hansen, G. H. A. Undersogelser angaende spedalskhedens
aasager. Norsk Magazin for Laegervidenskaben (supplement) 4, 1-88
(1874).
[0178] 5. Kirchheimer, W. K. & Storrs, E. E. Attempts to
establish the armadillo (Dasypus novemcinctus Linn.) as a model for
the study of leprosy. 1. Report of lepromatoid leprosy in an
experimentally infected armadillo. Int J Lep 39, 693-702 (1971).
[0179] 6. Franzblau, S. Drug susceptibility testing of
Mycobacterium leprae in the BACTEC 460 system. Antimicrobial Agents
and Chemotherapy 33, 2115-2117 (1989).
[0180] 7. Shephard, C. C. in Leprosy (eds. Hastings, R. C.) 269-286
(Churchill Livingstone, Edinburgh, 1985).
[0181] 8. Cole, S. T., et al. Deciphering the biology of
Mycobacterium tuberculosis from the complete genome sequence.
Nature 393, 537-544 (1998).
[0182] 9. Tekaia, F., et al. Analysis of the proteome of
Mycobacterium tuberculosis in silico. Tubercle Lung Disease
79, 329-342 (1999).
[0183] 10. Brosch, R., Gordon, S. V., Eiglmeier, K., Garnier, T.
& Cole, S. T. Comparative genomics of the leprosy and tubercle
bacilli. Res. Microbiol. 151, 135-142 (2000).
[0184] 11. Philipp, W., Schwartz, D. C., Telenti, A. & Cole, S.
T. Mycobacterial genome structure. Electrophoresis 19,
573-576 (1998).
[0185] 12. Stinear, T. P., Jenkin, G. A., Johnson, P. D. R. &
Davies, J. K. Comparative genetic analysis of Mycobacterium
ulcerans and Mycobacterium marinum reveals evidence of recent
divergence. J. Bacteriol. 182, 6322-6330 (2000).
[0186] 13. Marques, M. A. M., Chitale, S., Brennan, P. J. &
Pessolani, M. C. V. Mapping and identification of the major
cell-wall associated components of Mycobacterium leprae. Infection
Immunity 66, 2625-2631 (1998).
[0187] 14. Jungblut, P. R., et al. Comparative proteome analysis of
Mycobacterium tuberculosis and Mycobacterium bovis BCG strains:
towards functional genomics of microbial pathogens. Mol Microbiol
33, 1103-1117 (1999).
[0188] 15. Andersson, J. O. & Andersson, S. G. E. Insights into
the evolutionary process of genome degradation. Curr. Opin.
Genetics & Development 9, 664-671 (1999).
[0189] 16. Andersson, S. G. E., et al. The complete genome sequence
of the obligate intracellular parasite Rickettsia prowazekii.
Nature 396, 133-140 (1998).
[0190] 17. Mizrahi, V., Dawes, S. S. & Rubin, H. in Molecular
genetics of mycobacteria (eds. Hatfull, G. F. & Jacobs, W. R.,
Jr.) 159-172 (ASM Press, Washington, D.C., 2000).
[0191] 18. Gordon, S. V., Heym, B., Parkhill, J., Barrell, B. &
Cole, S. T. New insertion sequences and a novel repeated sequence
in the genome of Mycobacterium tuberculosis H37Rv. Microbiology
145, 881-892 (1999).
[0192] 19. Wolf, Y. I., Aravind, L., Grishin, N. V. & Koonin,
E. V. Evolution of amino-acyl-tRNA synthetases--analysis of unique
domain architectures and phylogenetic trees reveals a complex
history of horizontal gene transfer events. Genome Research 9,
689-710 (1999).
[0193] 20. Poulet, S. & Cole, S. T. Repeated DNA sequences in
mycobacteria. Arch. Microbiol. 163, 79-86 (1995).
[0194] 21. Cole, S. T. Learning from the genome sequence of
Mycobacterium tuberculosis H37Rv. FEBS Letters 452, 7-10 (1999).
[0195] 22. Ramakrishnan, L., Federspiel, N. A. & Falkow, S.
Granuloma-specific expression of Mycobacterium virulence proteins
from the glycine-rich PE-PGRS family. Science 288, 1436-1439
(2000).
[0196] 23. Daffe, M. & Draper, P. The envelope layers of
mycobacteria with reference to their pathogenicity. Advances in
Microbial Physiology 39, 131-203 (1998).
[0197] 24. Yuan, Y. & Barry, C. E., 3rd. A common mechanism for
the biosynthesis of methoxy and cyclopropyl mycolic acids in
Mycobacterium tuberculosis. Proceedings of the National Academy of
Sciences of the United States of America 93, 12828-33 (1996).
[0198] 25. Draper, P., Dobson, G., Minnikin, D. E. & Minnikin,
S. M. The mycolic acids of Mycobacterium leprae harvested from
experimentally infected nine-banded armadillos. Ann Microbiol
(Paris) 133, 39-47 (1982).
[0199] 26. Glickman, M. S., Cox, J. S. & Jacobs, W. R., Jr. A
novel mycolic acid cyclopropane synthetase is required for cording,
persistence, and virulence of Mycobacterium tuberculosis. Mol Cell
5, 717-27 (2000).
[0200] 27. Melancon-Kaplan, J., et al. Immunological significance
of the cell wall of Mycobacterium leprae. Proc. Natl. Acad. Sci. USA
85, 1917-1921 (1988).
[0201] 28. Cox, J. S., Chen, B., McNeil, M. & Jacobs, W. R.,
Jr. Complex lipid determines tissue specific replication of
Mycobacterium tuberculosis in mice. Nature 402, 79-83 (1999).
[0202] 29. Camacho, L. R., Ensergueix, D., Perez, E., Gicquel, B.
& Guilhot, C. Identification of a virulence gene cluster of
Mycobacterium tuberculosis by signature-tagged transposon
mutagenesis. Mol. Microbiol. 34, 257-267 (1999).
[0203] 30. Peterson, J. A. & Graham, S. E. A close family
resemblance: the importance of structure in understanding
cytochromes P450. Structure 6, 1079-1085 (1998).
[0204] 31. Wheeler, P. R. & Ratledge, C. in Tuberculosis:
Pathogenesis, protection, and control. (eds. Bloom, B. R.) 353-385
(American Society for Microbiology, Washington DC 20005, 1994).
[0205] 32. Honer Zu Bentrup, K., Miczak, A., Swenson, D. L. &
Russell, D. G. Characterization of activity and expression of
isocitrate lyase in Mycobacterium avium and Mycobacterium
tuberculosis. Journal of Bacteriology 181, 7161-7167 (1999).
[0206] 33. McKinney, J. D., et al. Persistence of Mycobacterium
tuberculosis in macrophages and mice requires the glyoxylate shunt
enzyme isocitrate lyase. Nature 406, 735-738 (2000).
[0207] 34. Wheeler, P. R. Oxidation of carbon sources through the
tricarboxylic acid cycle in Mycobacterium leprae grown in armadillo
liver. J Gen Microbiol 130, 381-9 (1984).
[0208] 35. Ratledge, C. R. in The Biology of the Mycobacteria (eds.
Ratledge, C. & Stanford, J.) 53-94 (Academic Press Limited, San
Diego, Calif., 1982).
[0209] 36. De Voss, J. J., et al. The salicylate-derived mycobactin
siderophores of Mycobacterium tuberculosis are essential for growth
in macrophages. Proc. Natl. Acad. Sci. USA 97, 1252-7 (2000).
[0210] 37. Quadri, L. E., Sello, J., Keating, T. A., Weinreb, P. H.
& Walsh, C. T. Identification of a Mycobacterium tuberculosis
gene cluster encoding the biosynthetic enzymes for assembly of the
virulence-conferring siderophore mycobactin. Chemistry &
Biology 5, 631-45 (1998).
[0211] 38. Hall, R. M. & Wheeler, P. R. Exochelin-mediated iron
uptake into Mycobacterium leprae. Int J Lepr Other Mycobact Dis
51, 490-4 (1983).
[0212] 39. Makui, H., et al. Identification of the Escherichia coli
K-12 Nramp orthologue (MntH) as a selective divalent metal ion
transporter. Molecular Microbiology 35, 1065-1078 (2000).
[0213] 40. Shimoji, Y., Ng, V., Matsumura, K., Fischetti, V. A.
& Rambukkana, A. A 21-kDa surface protein of Mycobacterium
leprae binds peripheral nerve laminin-2 and mediates Schwann cell
invasion. Proceedings of the National Academy of Sciences of the
United States of America 96, 9857-62 (1999).
[0214] 41. Rambukkana, A., et al. Role of alpha-dystroglycan as a
Schwann cell receptor for Mycobacterium leprae. Science 282, 2076-9
(1998).
[0215] 42. Rambukkana, A., Salzer, J. L., Yurchenco, P. D. &
Tuomanen, E. I. Neural targeting of Mycobacterium leprae mediated
by the G domain of the laminin-.alpha.2 chain. Cell 88, 811-821
(1997).
[0216] 43. Arruda, S., Bomfim, G., Knights, R., Huima-Byron, T.
& Riley, L. W. Cloning of an M. tuberculosis DNA fragment
associated with entry and survival inside cells. Science 261,
1454-1457 (1993).
[0217] 44. Eiglmeier, K., Fsihi, H., Heym, B. & Cole, S. T. On
the catalase-peroxidase gene, katG, of Mycobacterium leprae and the
implications for treatment of leprosy with isoniazid. FEMS
Microbiol. Lett. 149, 273-278 (1997).
[0218] 45. Eiglmeier, K., Honor, N., Woods, S. A., Caudron, B.
& Cole, S. T. Use of an ordered cosmid library to deduce the
genomic organisation of Mycobacterium leprae. Mol. Microbiol.
7, 197-206 (1993).
[0219] 46. Smith, D. R., et al. Multiplex sequencing of 1.5 Mb of
the Mycobacterium leprae genome. Genome Research 7, 802-819
(1997).
[0220] 47. Bonfield, J. K., Smith, K. F. & Staden, R. A new DNA
sequence assembly program. Nucleic Acids Res. 24, 4992-4999
(1995).
[0221] 48. Altschul, S. F., Boguski, M. S., Gish, W. & Wooton,
J. C. Issues in searching molecular sequence databases. Nature
Genet. 6, 119-129 (1994).
[0222] 49. Parkhill, J., et al. Complete DNA sequence of a
serogroup A strain of Neisseria meningitidis Z2491. Nature
404, 502-505 (2000).
[0223] 50. Rutherford, K., et al. Artemis: sequence visualization and
annotation. Bioinformatics 16, 944-945 (2000).
[0224] 51. Andersen, P. Host responses and antigens involved in
protective immunity to Mycobacterium tuberculosis. Scandinavian
Journal of Immunology, 45, 115-31 (1997).
[0225] 52. Pugsley, A. P. The complete general secretory pathway in
gram-negative bacteria. Microbiological Reviews, 57, 50-108
(1993).
[0226] 53. Stanley, N. R., Palmer, T. & Berks, B. C. The twin
arginine consensus motif of Tat signal peptides is involved in
Sec-independent protein targeting in Escherichia coli. J. Biol.
Chem., 275, 11591-61 (2000).
[0227] 54. Berks, B. C., Sargent, F. & Palmer, T. The Tat
protein export pathway. Mol. Microbiol., 35, 260-74 (2000).
[0228] 55. Lalvani, A., et al. Human cytolytic and interferon
gamma-secreting CD8+ T lymphocytes specific for Mycobacterium
tuberculosis. Proceedings of the National Academy of Sciences of
the United States of America, 95, 270-5 (1998).
[0229] 56. Pollock, J. M. & Andersen, P. The potential of the
ESAT-6 antigen secreted by virulent mycobacteria for specific
diagnosis of tuberculosis. J. Inf. Dis., 175, 1251-1254 (1997).
[0230] 57. Gordon, S. V., et al. Identification of variable regions
in the genomes of tubercle bacilli using bacterial artificial
chromosome arrays. Molec. Microbiol., 32, 643-656 (1999).
[0231] 58. Harboe, M., Oettinger, T., Wiker, H. G., Rosenkrands, I.
& Andersen, P. Evidence for occurrence of the ESAT-6 protein in
Mycobacterium tuberculosis and virulent Mycobacterium bovis and for
its absence in Mycobacterium bovis BCG. Infection Immun., 64, 16-22
(1996).
[0232] 59. Mahairas, G. G., Sabo, P. J., Hickey, M. J., Singh,
D. C. & Stover, C. K. Molecular analysis of genetic
differences between Mycobacterium bovis BCG and virulent M. bovis.
J. Bacteriol., 178, 1274-1282 (1996).
[0233] 60. Brandt, L., Elhay, M., Rosenkrands, I., Lindblad, E. B.
& Andersen, P. ESAT-6 subunit vaccination against Mycobacterium
tuberculosis. Infect Immun., 68, 791-795 (2000).
[0234] 61. Tekaia, F., et al. Analysis of the proteome of
Mycobacterium tuberculosis in silico. Tubercle Lung Disease, 79,
329-342 (1999).
[0235] 62. Berks, B. C., et al. A novel protein transport involved
in the biogenesis of bacterial electron transfer chains. Biochim.
Biophys. Acta., 1459, 325-330 (2000).
TABLE 4. M. leprae proteins that are potential targets for the diagnosis,
prophylaxis or treatment of mycobacterioses.
>BL;ML0007, ML.tab 11195:12106 forward MW:32204
VTSPNESRAFNAADDLIGDGSVERAGLHRATSVPGESSEGLQRGHSPEPNDSPPWQRGSARASQSGYRPSDPL-
TTTRQSN (SEQ ID NO:1)
PAPGANVRSNRFISGMTAPALSGQLPKKNNSTQALEPVLMSNEVP-
FTESYASELPDLSGPVQRTVPCKPSPDRGSSTPRM
GRLEITKVRGTGEIRSQISRRSHGPVRASMQ-
IRRIDPWSMLKVSLLLSVALFFVWMIAVAFLYLLLGGMGVWAKLNSNVG
DLLNNTGGNSGELVSNSTIFGCAVLVGLVNIVLMTTMAAIAAFVYNLSSDLVGGVEVTLADLD
>BL;ML0012, ML.tab 16566:16979 reverse MW:14690
MQQTAWGPRTARIAGCGGAGIVIAIACSTLDIDTPGFMLTGIAALGLILFAGLSWRARPKLAINPDGLAVQGW-
FRTRLFG (SEQ ID NO:2)
PADIKIIRITEFRRFGRKVRLLEIEAINGDLVILSRWDLGTGPLE- VLDALITAGYAG
>BL;ML0013, ML.tab 17134:17415 reverse MW:10464
MPKSKVRKKNDFTITSVSRTPVKVKVGPSSVWFVTLFVGLMLIGLVWLMVFQLAALG-
TQAPTALHWMAQLGPWNYAIAFA (SEQ ID NO:3) FMITGLLLTMRWH >BL;ML0022,
ML.tab 27173:28639 reverse MW:52748
MDNQKELIQRIERKLESSIDDAFARMFGGSIVPQEVEALLRREASDGVRSLQGNRLLAPNEYIITLGVHDLEK-
NKADPDL (SEQ ID NO:4)
TSSAFASDLADYINEQGWQTYGDVVVRFDQSSSLHTGQIRARSVV-
NPDVEPRPTVNDPVRTQSNQAFSAEPGVPPMTDNS
SYRGGQGQGRPSDDYYGRPQDDPRGADPQGG-
QDPRGCYPPKPGSYPQQAGHPPLHRPDQGGYPGQGGYEDQRAYHDQGQG
GYPSPYEQRPATPGGYGSQGHDQGYRPGSYGPPSGGQPGYGGYGDYGRGPARPDEGSYTPSGFPAPPEQRVAY-
PDQGGGY
DQGYQHSGLGYGREDYGRQEYTQYAENLPGGVYAPSSGGYAEPAGRDYDYGQPGAANDY-
SQPVIGGYGGYGALGSAVILQ
LDDGSGRTYQLREGSNIVGRGQDAQFRLPDTGVSRRHLEIRWDGQ-
VALLSDLNSTNGTTVNNAPVQEWQLADGDVIRLGH SEIIVRIH >BL;ML0030, ML.tab
34750:35091 reverse MW:11383
MLIAGTLCVCAAVISAVFGTWALIHNQTVDPTQLAMRAMAPPQLAAAIMLAAGGVVALVAVAHTALIVVAVCV-
TGAVGTL (SEQ ID NO:5) AAGSWQSARYTLRRRATATSCGKNCAGCILSCR
>BL;ML0031, ML.tab 35287:36123 reverse MW:28788
MIQSTQTWRVLAGGLAATAMGVTVFAGGTAAADPSPPAPPPAIPGVLPPASLPPIQSVTAVPGGITTNNRFVA-
TPQAPGP (SEQ ID NO:6)
AALGQPPLAVAAPVSESLHDYFKAKNIKLVAQKPHGFKALDITLP-
VPTRWTQVPDPNVPDAFAVIADRLGNSLYTSNAQL
VVYNLVGNFDPKEAITHGFVDTQQLSAWQTT-
NASKADFDGFPSSIIEGTYRENGMTLNTSRRHVIASSGPDKYLVSLSVT
TALSQAVADAPATNAIVNGFRVSSPTVSAPVPPQLGTR >BL;ML0042, ML.tab
51993:53396 forward MW:51453 MRNPVWLRFSMGRALLVTALVPPCIILFFH-
TQYWWAGIALVVLVVILTLVEFSGRWLSGWLMALYSFFRRSSKPLDTPSE (SEQ ID NO:7)
PVIGATVKPADQVIMRWQDGFLVSVVELIPRPFTPTVIVDGEAQTDDLLETQLLEHLLSVHCPDLEAVVVSAG-
YRVGHVA
SLDVVNLYQQVIGADPAPAHRRTWIMLRADPVRTRKSAQRRDAGVAGLARYLIASTTRI-
ADQLASHGVDAVCGHSFESVD
HATDVGFMQEKWSMMRGQNAYSVAYTAPAGPDAWWSARADHTITR-
VWVAPGKTPQATVVLTTLGKPKTPCGFYRLHGAQQ
PALLGRSFVAYQHCQMPIGSAGVLVGETVNR-
CSVYMPFDDVDVSVSLGDVQTFTQFVVRAAAAGGIVTLGQQFEKFARMI
GGQIGSVAKVAWPNATTYLDPYPGSERVILKHDIIGTPRHRKLPIRRISPPEEGHYQMVLPKSSYEL
>BL;ML0044, ML.tab 54698:55039 forward MW:12157
MDLDPTQAQTMALLGQFQSALDEQCNRMTDGVFKASDQEKTVEVTINGYQWLTGIRIESGALREFGHAVVADR-
INEALQN (SEQ ID NO:8) AQGVATAYNEVSGEQLAARLSALSCSIGEPPPT
>BL;ML0047, ML.tab 58020:59558 reverse MW:54486
LSAPAVTAGPATAGITPARPSATRVTILTGKRMTDLVLPSTVSIEAYIDETVAVLSDLLEDAPADVLAGFDFS-
AQGVWTF (SEQ ID NO:9)
ARPGSPPMKLDQSLDDAGVVDGSLLTLVSTSRTERYRPLVEDVID-
AIAVLNESPEFNRKAVDRFIGVAIPVLSLPITAVA
VWAWWVTGRSPFWSLAIGILSIVALTGSIVA-
EKFYKNLDLSESLLLTSYPLIASAAALVTPLPNGVDSLGPPQVAAAAAA
VLFLTLLTRGGARRHSGYASFTAITTIAIVVIAIAYGFGYQHWVPTGAVAFGLFIVTNAAKLTVAVARIALPP-
IPVPGET
VDNEELLDPVVTPHEATHEETPTWQAIIASVPDSAVRLTERSSLAKRLLIGYVISGTLI-
LCSGAIAVIVRGHFFAHSLVV
AFLLTVVCTFRSRLYAERWCAWALLAAAVVIPTGLTVKLCIWYTQ-
IAWLLLTSYLVAAIIALMVFGATVRVRRVSPVTKR
IMELIDGAVVASIIPLLLWIAGVYDMVRNLS- F >BL;ML0048, ML.tab
59555:61315 reverse MW:63226
MAADYDKLFRLDDGAYASPDQAAEQLFDDAPLYPPPIIPTCTTTPNGEVASPMPDWSEQLPPNPPAASKSPL-
PPMPIGSS (SEQ ID NO:10)
VQPPPASSESPRAPMPVSAPPRSPAASLMPISEPPQWPPAEAP-
EHQFAKAEPPSVPIPINEPSPAKPATPMPMTPIDGSQ
RTPVTSPEPSLAEFEAQPPATPKPSLLPR-
PMSSPPEAPRPSANQHSRHARRGHHHRDETQQANPASATEPMIAPRARTAE
LRQAPHAAAEPAPTQHLTRPDGLVSHRTALHDSTATSAIGVQTGRSTGAKKPSKVVAKRGWRHWVHTVTRINL-
GLSPDER
YELDLRTRVRRPPRGSYQIGILGLKGGAGKTTVTVTLGSMFARVRNDRILVVDADTSCG-
NLADRAGRFSEANIADLLADK
DVKSYNDIRTHTSVNAVNLEVLPAAEYSTAQHALSGEDWNFAAAT-
VSKYYNVMLADCGVGLFDPVTRGVLSTASGVVIVT
STSVDAARQAAIALDWLRHNGYQDLLSRACV-
VINHVMPKEPNIASKDLVQQFEQQIQPGRVVVLPWDKHIAAGTEIRLOR
LDPLYRRRILELAAALSDDFERAGRH >BL;ML0049, ML.tab 61406:61693
reverse MW:10464 MIQAWHFPALQGAVNELQGSQSRIDALLEQCQESLTK-
LQSSWHGSGNESYSSVQRRFNQNTEGINHALGDLVQAINHSAE (SEQ ID NO:11)
TMQQTEAGVMSMFTG >BL;ML0050, ML.tab 61720:62022 reverse MW:10964
MAEMITEAAILTQQAAQFDQIASGLSQERNFVDSIGQSFQNTWEGQAASAALGALGR-
FDEAMQDQIRQLESIVDKLNRSG (SEQ ID NO:12) GNYTKTDDEANQLLSSKNNF
>BL;ML0051, ML.tab 62201:63109 reverse MW:32135
MTWPMLWPASVPSECPPNYWHTPAPSAKCEPEQAAVAPIAAAKPMITWLQSAAEQTTTQAEAHRQAMASTPGM-
AVITENH (SEQ ID NO:13)
ITQAILATINFFGINMAPIAFTEAGDFICMRTQTALAMNSYQAE-
TLLNTAFQKLEPMAAILNPSSYSPPSALTSQVNQFT
QMISGFSAALPSTQVLQQTVGQVAELARPM-
QQVKSLFTSIDSTGVYTSAQRGDTESAHRIGLFGASTLSSHPLVGITGTT
TDTRLLCAESLPSASGSLAWTPLMTQFQLIDKSIAPEPRQRVMLPPWAAGSPGHNAQDGGTT
>BL;ML0054, ML.tab 67417:68862 reverse MW:51875
MGLRLTTKVQVSGWRFLLRRVEHAIVRRDTRMFDDPLQFYSRSIALGIVVAVALLIGAVLLAYFKPQGKLGAS-
TLLTDRA (SEQ ID NO:14)
TNQLYVLLSGHLYPVYNLTSARLALGKPANPAAVKSSELTKLPI-
GQTIGIPGAPYATPVSGDSSSTWTLCDTVAKIESES
PAVQTSVIARSLQIDPAINPLQPNEALLAS-
YRDKTWLVNSKGRHSIDLADRALTSAIGISVNVKPTPLSEGLFNALPDVG
RWELPTIPDTGAPNSLGLSPDLVIGSVFQIQMEKSPQYYVVLSDGIAQVNATTADALRATQSHGLVAPPSLVP-
NVVVQIP
ERVYDSPLPDEPLKMVARDDYPTLCWAWERKASDQAPKRTMLIGQHLPLQPSANSTGIK-
QIRGTATVYIDGGRYVALQSP
DPRYSESLYYVDPEGVRYGLANSEAVKALGLASPQTAPWVIVRLL-
VEGPVLSRDAALLEHDTLIADPSPRKVPAGYSGVR P >BL;ML0056, ML.tab
70584:71093 reverse MW:18452
VNLLGNDDDNHLASLDFYSANRYYEESLFFDELDGYAPTTPVGIEANDLDVFQSLTEPEEELEVELLAVTNPA-
KSVSALM (SEQ ID NO:15)
NGRVHQVELTDQVTRIGEKKPATEAFVLASLARQKARTSQGTCI-
LDSLQGGGENTTARCELVGLTLNLPTSEQAAAEAEM FSNNILRQK >BL;ML0068,
ML.tab 89620:90336 reverse MW:26088
MGLFHKRRSRAMRRAEARAIKARAKLEARLAAKNEARRLNSAQRATNKALKAQLKAKRNSDRVALKVAETELK-
AAKECKL (SEQ ID NO:16)
LSPTRIRRVLTVSRLLAPIVVPLIYRAAIATRALIDQRRADQLG-
IPLAQIGQFSGPSARLSARIARSEQSVLLVQEKKPK
DAETKQFVSTITERLIDLSAAVAAVENMPA-
TRRRAVHSTISSQLDGIEADLMARLGVDLTTSMADNRSVADSTRKAAT >BL;ML0069,
ML.tab 90521:90919 reverse MW:14807
MSTTFAARLNRLFDTVYPPGRGPHTSAEVIAALKAEGITMSAPYLSQLRSGNRTNPSSATISALANFFRIKPA-
YFTNDEY (SEQ ID NO:17)
YEKLDQELAWLATMRDEGVRRIAMRTIGLSAQAQQDIVDRVDEL- RRAEHLDV
>BL;ML0071, ML.tab 91913:92446 reverse MW:18446
METGSGLPIGVVPFHARGALKGFVISGRWPDSTKEWAQLLMVAVRIASLPGLLSTTT-
VFGAREELPDEPEPGTVGLVLAE (SEQ ID NO:18)
GTVFGESAIQPGYFADHQPPALLMLHPP-
SETMPSLPECTGAASGCVLLPGLPYLGLEHRAAWVEAEADGTITSMVSRVGV
DPISHPDTAILAMLLAA >BL;ML0073, ML.tab 94092:95126 forward
MW:38124 VIQVCSQCGTRWNVRERRREWCLRCRGALIAPQAEMPTAVKQWSSHVGL-
PAGIVPEALGWQRTPPGFRWIAVRPGAAPPN (SEQ ID NO:19)
RRRQRHHVPTPRYAVMPRWGLADRVDQDSTWIQAPLKPGPSSAKVRTTLFVAVLVFTLAALVYVVRYVLLVIN-
RNTLLNF
GVAAIADWLGVIASLAAIAATLACVMALSRWLIARRAAAYAHHAVPEQRSSWELRAGCL-
LPLVNLLWAPVYVIELALMEN
HYTQLQKPIFMWWIVWVFSYVISVVAVVTSWAKDAQGIANNTVAM-
VFAYLFAAAAVAAVARVFEGFECKSIKRPVHRWVV VHLGGSVVHPSPGSVELEGQEPAA
>BL;ML0081, ML.tab 100648:102000 reverse MW:48150
MLDAPEEEPALADDLTGEDEQPPVEFQWPSTLQARATRRGLLLTALGGLLIGGLVTAIPTVGTGSGLLATYID-
SNPVPST (SEQ ID NO:20)
GAKSNVAFNRATNGDCLMWPDSTPHTAVIVNCADDHRFEVAESI-
DMRTFPGSEYGPNAAPPSPARIQQISEEQCETAVRR
YLGTKFDPNSKYTISMLWSGDRAWRQSGER-
RMLCGLQLPGVNNQQVAFKGKVANIDQSKVWPAGTCLSIDLTTNQPIDIP
VDCVAPHAMEVTGTVNLADKFPNALPAASEQDTFIKDACTRLTDIYLAAIELRTTTLTLIYPTLSLSSWAAGS-
RKVACSI
GATMGNGGWATLVNSAKGPLLINNQPPTPPPDIPEERLGMPPIPLHHLQVPNSQSNVPV-
NPIPPGNQQHRKQQPIVTVPQ
SPASTAPAASVSPAKHLPKARHTRQMSKRRGRPISPHPWAGRASP- GGLAE >BL;ML0091,
ML.tab 113153:113863 reverse MW:23752
VPNRRRCKLSTAISTVATLAIASPCAYFLVYEPTASAKPAAKHYEFKQAASIADLPGEVLDAISQG-
LSQFGINLPPVPSL (SEQ ID NO:21)
TGTDDPGNGLRTPGLTSPDLTNQELGTPVLTAPGTGL-
TPPVTGSPICTAPDLNLGGTCPSEVPITTPISLDPGTDGTYPI
LGDPSTLGGTSPISTSSGELVND-
LLKVANQLGASQVMDLIKGVVMPAVMQGVQNGNVAGDLSGSVTPAAISLIPVT
>BL;ML0093, ML.tab 115371:117302 forward MW:72371
MTAVSLLARVILPRPGDPLDVRKLYLVESITNARRAHALSPTTLQIGAESEVSFATYFNAFPASYWRRWTICK-
SVVLRVE (SEQ ID NO:22)
VTGAGRVDVYRTKANGARIFVEGREFAGVEDDASKAQVVELEVG-
LQPFEDGGWIWFDITAETRVTLCSGGWYATSPAPGR
ANIAVGIPTFNRPADCTNALAELTADPLVD-
EVIGAVIVPDQGVRKVRDHPDFPEAAARLGDRLSIHDQPNLGGSGGYSRV
MYEALKNTDCQQILFMDDDIRIEPDSILRVLAMHRFAKSPMLVGGQMLSLQEPSHLHIMGEVVNRSNFIWTAA-
PHAEYDH
DFVEYPLNDKEDKSKLLHRRIDVDYNGWWTCMIPRQVAEELGQPLPLFIKWDDADYGLR-
AAEHGYPTVTLPGAAIWHMAW
SDKDDAIDWQAYFHLRNRLVVAAMHWDGDVTGLVRSHLKATLKHL-
ACLEYSTVAIQNKAIDDFLAGPDHIFSILESALPE
VHRMRKEYPDAVVLPAATELPPPVHKNKVMK-
PPENPLSIVYRLLRGIFHNLTAADPECHKRPEFNIPTQDARWFRLCTVD
GTTVTTADGCGVVYRQRDRAKNFLLLFSSLHRQLQLARRFDELRKIYRDALPVLSSKQKWEMALLPLPDSPTR-
FPAEQEP EHA >BL;ML0094, ML.tab 117295:117873 forward MW:19928
MPEDKAPTGELAAIAAVQSVLVDRPGVLPTARGMSHFGEHSIGWLAISL-
LGAILVPCRRRYWLVAGAGVFAAHVAAVLIK (SEQ ID NO:23)
RMVRRIRPNHPAVTVNVGTPSPLSFPSAHATSTAAAAILIGRASRLPKGIVAAVLVAPMALSRIVLGVHYPSD-
VAFGVVL GAAVAGTTARFDSRLSRRWTVQHGLSSGSAVK >BL;ML0096, ML.tab
118819:120768 forward MW:71091
MRRRMVSRQVGWPLFPYHIVVRVSLWASVLLVAALFGWGAWQRRWIADDGLIVLRTVRNLLAGNGPVFNQGER-
VEANTST (SEQ ID NO:24)
VWTYLLYAGSWVGGPMRFEYVALAAALVLSVLGMVLLMLGTGRL-
YAPSLQGRQAIMLPAGALVYVALPPARDFATSGLES
GLVLTYLGLLWWMMVCWAQPLRNRSQSRRF-
IGALAFVAGCSVLVRPELALMGGSALIMLMIAARTCWLRALIVVAGGSLP
VAYQLFRMGYYGLLVPGTALAKDAAGDKWSQGIIYLSNFNQPYVLWVPLVLLVLLGLLLMLIHRWPSFMHPLE-
TPDSGRV
ARAVQSPPAVVVFVVFSGLLQAFYWIRQGGDFMHGRVLLAPLFCLLAPVVVIPVVISEG-
ADFSRQTGNWLAGVTSLLWLG
VAGWSLWAANSPGMGDDATNVSYSGIVDERRFYAQATGHAHPLTA-
ADYLGYPRMAAVLVALNNTPDGALLLPSGNYIKWD
LVPMIQLSPSSPGSPPDSLVSQKPQHTVFFT-
NLGMLGMNVGLDVRVIDQIGLANPLAQHTERLQHGRIGHDKNLFPDWVI
ADGPWVKWYPGIPGYLDQAWIAQAVAALQCSGTQAVLSSVRAPMALHRFISNLLNSFEFTRYRFDRVPLYELV-
RCGLPVP DVLPATPPE >BL;ML0099, ML.tab 123370:124380 forward
MW:35611 MVEKVPRKRHRVLAWTAALSMAAVVALAIVAVVIL-
LRSAESPRSSLPPGVLPIPSTAPHPRKPRPAFQDVSCPDVQLLVV (SEQ ID NO:25)
PGTWESSLQDNPLDPVQFPDALLRNSTMTIGQQFPTSRVQTYTIPYTAQFHNPLSGDKQMTYNDSRAEGTRAM-
VQEMINV
NNKCPLTSYVLVGFSQGAVIAGDITSDIGNGHGPVDDDLVLGVTLIADGRRQQGVGNDI-
GPNPPGEGAEVTLHEVPVLSG
LGMTMTCARPGGFGVLHSRTNEICAPGDLICAAPAEAFSVANLPA-
TLNTLASGAGQPIHANYATAQFWDLDGAPATVWTL NWVHRLIEGAPHPKHG >BL;ML0107,
ML.tab 147460:149358 reverse MW:68650
MRNALASFGQIVLAAVVASGVAAVSLIAIARVHWPAFPSSNQLHALTTVGQVGCLTGLLAVGGVWQAGRFRRL-
AQLGGLV (SEQ ID NO:26)
FVSAFTVVTLGMPLGATKLYLFGISVDQQFRTEYLTRLTDSAAL-
QDMTYLGLPPFYPPGWFWIGGRVAALTGTPAWEIFK
PWAITSITIAVAITLVLWWQMIRFEYALLV-
TIATAAVTLVYSSPEPYAAMITVLLPPALVLTWSGLRAAEREADRTLGNK
RGWATVVGAGIFLGFAATWYTLLLAYTAFTVVLMTLLLATALCRRAGFRATFDPLRRLAGIVVIAAAIGAITW-
LPFLARA
AHDPVSDTGSAQHYLPADGAELAFPMLQFSLLGMICMLGTLWLIVRTSSSVRASALMIS-
VLAVYLWSLLSILTTLARTTL
LSFRLQPTLTVLLVTAGVFGFIETAQSLAKHNRAVLSVASAIGLA-
GAIAFSQDIPNVLRPDLTIAYTDTDGHGQRGDRRP
PGSEKYYWAIDEAVLHITGKPRDQTVVLTAD-
YSFLAYYPYWGFQGLTSHYANPLAQFDLRAAQIQQWSRLTTASELIHAL
DTLPWPPPTVFVMRHGAGNTYTLRLAKNVYPNQPNVRRYTVDLPAALFADQRFAVQDIGPFVLAIRKPMGNA
>BL;ML0115, ML.tab 155413:155937 reverse MW:19053
LNNDNDDSIEIIGGVDPRTMATRGEDESRDSDEPSLTDLVEQPAKVMRIGTMIKQLLEEVRAAPLDEASRN-
QLREIHATS (SEQ ID NO:27)
IRELEDGLAPELREELDRLTLPFNESTAPSNAELRIAQAQLV-
GWLEGLFHGIQTALFAQQMAARAQLEQMRNSALPPGMG KPGQAGGQGTGQYL
>BL;ML0116, ML.tab 155965:157929 reverse MW:70767
VGLCCGTLIALFLLIVPETIVARFAALTWPIAIAVSPALTYGVIALVIIPFGAVGIPWNSWTALAALVAVSML-
MIAFRLL (SEQ ID NO:28)
LVRYRDTAAETRGISGWPAVTVAVGVLLGALLIGWAAYRGLLHW-
QSIPSTWDAVWHANTVRFILDTGQASPTHMGELRNV
ETHSVLYYPSVLHALAGVYCQLTGAAPTTG-
YTVSSLAVAVWLFPVSAATLTWHLLRPVTTQKRAAGASATAAALSAAFTS
VPYVEFGVAANPNLAAYGVAVPTMVLITSTLRHRDRIPVAILALVGTFSVHLTGGIVVSLFLLGWWLMNALLH-
PVRSRAA
DARTLAAVVMPTALILAPQFIAVLNQADIIAGHSFPSFKSVKQGVIDALLLHTRHLNDF-
PIQYGLVVLAAIGMAILLYQK
IWWPSIAWLVLTVATVYSAAPFRGPIGSAIESFSQFFYNDPRRLS-
AVVTMLLTPMAGIALFAGVLLLVVGARRVTARFTA
LPRPVWTTATVVLLVAATVLTAWHYLFRHLV-
LFGDKYDSVMVNQKDLDAMSYLATLPGAHNTIIGNSNTDGSSWMYAVAD
LHPLWTHYDFPQQTGPGYFRYAFWAYARTGNPWVVEAVRVFNIRYILTTSPTVQGFAIPDGLVSLEESKSWTK-
IYDNGAA RIFEWSGNATATRA >BL;ML0124, ML.tab 166949:167389 reverse
MW:16448 MPVLSKTVEIDTDTATIMAIVTDFESYPQWHEWIK-
GVWVLAHYDDGRPSQLRIDINFQGMQGTYIQAVYYPGVNQIQTVM (SEQ ID NO:29)
QOGOLYSKQEQLFSVTQAEGGSVLTVDLDVELTMPVPAPMVKNLLNTALDRLAEKLKLYAEHLAPS
>BL;ML0133, ML.tab 179256:179888 forward MW:24045
MTNRTLSREEIRKLDRDLRILVATNGTLTRVLNVVANEEIVVDIINQQLLDVAPKIPELENLKIGRILQRDIL-
LKGQKSG (SEQ ID NO:30)
ILFVAAESLIVIDLLPTAITTYLTKTHHPIGEIMAASRIETYKE-
DAQVWIGDLPCWLADYGYWDLPKRAVGRRYRIIAGG
QPVIITTEYFLRSVFQDTPREELDRCQYSN- DIDTRSGDRFVLHGRVFKNL >BL;ML0151,
ML.tab 213175:213492 reverse MW:11830
MRPEPPHHENAELTEMNTEVVEAPLLTDIEELREEIDRLDAQILATVKR-
RAEVSQAIGKVRMASGGTRLVHSREMKVIER (SEQ ID NO:31)
YSELGPDGKDLAILLLRLGRGRLGH >BL;ML0158, ML.tab 221657:222601
forward MW:31355 LHCSSGAVTALEITGGVNTYLPGSPGYPLVQPAGS-
YPGATPSFVKSDVGESQLYHYLTIAVVVLGLAVYLGNFGPTFTSS (SEQ ID NO:32)
SDIGPGSGGFAGDAGTAVVVALLAALLAGLDLLPKAKSSAGVVGAIAVLGALLAISEMINMPAGFSIGWANWF-
ILVCSVL
QAIAAVAALLLEAGIITAPAPRLSYDPYLQYGQYGAQSYYGQPNRQLQVGLNAHSPQQS-
PAGYGAQYGAYTSSPTQIQAG
MPATGGFSAQHSAQQGPSTPPTGFPSFSPPPSVGAAAGSQAGSAP-
VSYSNPTDSKQGFGQGRESTSSSSGSAPV >BL;ML0159, ML.tab 222650:223942
forward MW:44093 VGDNRAAGVRQARDLVKVAFGPAVVALAIIAAITL-
LQLLIANSDMTGALGAIASMWLGVHQVPIAIGGRELSIMPLLPVL (SEQ ID NO:33)
LMVWATAHSTSQATSAYSSWLVIRWVVASALGGPLLIAAISLAVIHDASSVLTELQTPKALRAFTGVLVVHAI-
GAAIGVN
SRVGRRVLTASRLPDWVGDSVHAATAGVLALLGLSGLVTAGSLVVHWATMQEFYGITDS-
IFGQFSLTVLSVLYAPNVIVG
TSAVAVGSSAHLGFATFSSFTVFGGDIPALPVLAAAPTPPLAPVW-
VALLIVGAASGVAVGQQCTRHPLPLLAALAKLLVA
AATGALMMALLGYAGSGRLGNFGDIDVDQGA-
LVVGVFFWFAVVGWVTVVVACGIKRFPRHLKPPPALSSEEHADASSKDH
EAYFGVDLNVPFDLSGEDEIPKAEPGEAAD >BL;ML0169, ML.tab 234932:235534
forward MW:22082 MSNSAQRDAKGARDEPLRAADTDRIQIAQLLAYAA-
EQGRLELKDYEDRLAKAYAATTYQELEQLRDDLPGSQVSARRGGN (SEQ ID NO:34)
PNPAPSTLLLALMSGFERRGRWNVPRKLTTFSLWGSGVLDLRYADFTSTEVELHAYSVMGVQTILLPPEVNVE-
ISGHGVM GSFDRQVRGQGTPGAPTVKIRGFSLWGGVGIKRKARRPRR >BL;ML0185,
ML.tab 250221:251249 forward MW:38250
MPSIPQSLLWISLVVLWLFVLVPMLISKRDVVRRISDVALATRVLNGVAGARLLKRGGPATGHRSDHNWELDE-
DWRQNPV (SEQ ID NO:35)
DGEFADADQDIGEEQDQNVDDTQRTRPVVMEVAVAELTGTDYLD-
VDVVEDSVALPIEDSADVTESVLLAVGEGGSPGEEA
EAEQRQSDRYGYVDASSGLGLEQKDDKSPV-
PVAPTVSRQRRYDTKTATAVSARKYAFRKRVLMVMAIILVGSAAAAFEVD
SNAWWICGSSTTVTVLYLAYLRRQTRIEEKVRSRRMHRIARARVDVENAHDREFDVVPSRLRRPGAVVLEIDD-
EDPIFEH LDYEMPIRTFGWPRDLPRAVGQ >BL;ML0187, ML.tab 252658:253719
forward MW:38375 VAGSWQCGHCESCASPLGPRDIAVVELI-
ADRAEEFAAMDIFRGLPAEDLMSVAVSVEPVLAAAGEVLMQQGEQAVSFLLI (SEQ ID NO:36)
SSGNVEVRRVDDDGAVIVGQASHGMIIGEIALLRDGRRTATVITTEPLTGWVGDIDAFAQMVQIPSITRRLLL-
TVRQRLA
AFITPIPVQLRDGTHLMLRPVLPGDTERSLRGHVRFSRETLYLRFMSARAPSDELMHYL-
SEVDYVDHFVWVVTDGGDPVA
DARFVRDESDPTLAEIAFTVADAYQGRGVGNFLISALSIAAHVNG-
VNRFSARMLTDNGPMRAIMDHHGAVWRRYDVGVIT
TVIDVPRQRDLIIGRAMADQIAGVVRQVIGA- VG >BL;ML0190, ML.tab
255125:255742 reverse MW:22988
MCDMLVDVGIAFQGSLFEYHERRQLGDGAFIELRSGWLTDGVELLDTLLSEVPWRIERRRMYDKVVNVP-
RLVSFHDLTTD (SEQ ID NO:37)
DPPHPLLTRLRRRLNDIYAGELGEPFTSVGLCCYRDGSDS-
IAWHGDTIGRNSSEDTMVAIISLGATRVFALRKRGGGPSL
RLPLTHGDLLVMGGSCQRTWEHSVPK- TSASTGPRVSIQFRPRNVH >BL;ML0199,
ML.tab 265213:265815 forward MW:21217
VSQLSFFTAESLLPAIADLAGVLAASGQIVVVSASGQSPAPAARLSVVV-
DQLWRASALAEMISEAGLVPEISRTEEDTPL (SEQ ID NO:38)
VRTAVDPLLCPIAAEWTRGAVKTVPPRWLPGPRELRAWILAAGVPEAANRYLLGLDPHAPDTHSPLASALMRV-
GIAPTLI GTRSGRPALRISGRRRLSRLLENVGEPPDWAEALALWPRV >BL;ML0208,
ML.tab 279784:280125 forward MW:12884
MNWIQVLLIGSIIVLLIYLLRSRRNVRSRAWVKVGYIAFVLGGVYAVLRPNDTTVVAHWFGVCRGTDLMLYAL-
IMAFSFT (SEQ ID NO:39) TLSIYIRFKDLELRYACLARVVALEGARAPEPF
>BL;ML0227, ML.tab 298516:298992 forward MW:17121
MGPTRKRDLTAANIGAAVVGYLLVLVLYRWFPPITVWTGLSLLAVAIPEALWARYVRTKISDGEIGDGPGWLH-
PLAVAHS (SEQ ID NO:40)
LMVAKASAWVGALVLGWWVGVLVYFLPRWPWLRVADKDTSGTVV-
AALSALALLVAALWLQHCCKSPQDPTEHGEGAEN >BL;ML0229, ML.tab
300537:301466 forward MW:32734 MVQFDGLRSARLNIAILSTGRVGVALERADQVVVA-
CSAVSHASRQWVQFRLPETSVASPPEVASSAELLLLAVPDCEFAG (SEQ ID NO:41)
LMSGVAVTSVPRPGTIVAHTSWANGVGILAQLGKDGCIPLAIHPANMFSGSDEDLSQCQLRDTYFGITKTDDV-
GYAIAQS
LVLEMGGEPFCVVEYARILYHSVSPHVGNNIVTVLADALEVRRSALRGSELLGLGVPPA-
CRGEVVDDQLDVIVERIVGSL
ARAACENTLQRGQAGLTKLVARGDLDALAGHLVALMRIGPELAQA-
YRVNALRKTQRAHAPYDVVEALAP >BL;ML0256, ML.tab 335129:335812
forward MW:24402 MSEAKRLDPKRRSPASRPGKAGDSVRGRRSTKPVA-
KLSVKPSRTTPASSHSGRNSTRMLTQHVVEPIRQSIIESRERRSD (SEQ ID NO:42)
QQLGFTARRAAVLAAVVCVLTLTIAGPVRTYFAQHAEIEQLAATEATLRRQIADLEQQKGKLADSAYIAARAR-
ERLGFVM
PGDVPFQVQLPSTAAVSSQPGGRAAKPANNDPWYTSLWHNIADAPHLPPGAGTPPFSLT-
PLSTTSGG >BL;ML0257, ML.tab 335805:336308 forward MW:17960
VVDRADLEAVARHLGREPRGVLEIAYRCPSGEPGVVKTAPKLDDGTPFPTLYYLTHP-
VLIAAASRLESTGLMREMTERLG (SEQ ID NO:43)
QDPELAAGYRRAHESYLTERDAIESLGT-
TFSAGGMPDRVKCLHVLIAHSLAKGPGLNSLGDEVLALLAADPKTAATLVAG QWKECDR
>BL;ML0271, ML.tab 354849:355220 reverse MW:12992
MRLGQALAWLATDIVAVSVFCAVGRCSHAEGLTVADLAVTLWPFLTGTAIGWLASRGWQRPTAVVPTGVVVWL-
CTVVVGV (SEQ ID NO:44) ALRKASSAGVVANFMVVAASTTAALFLGWRAVVELILRRRSTR
>BL;ML0279, ML.tab 361297:361953 reverse MW:24062
VTYYSSLRPEDLPPERPKHEHYSGFPEYELANPGVGFRRFVATMRRLQDLAVSADPSDEVWYAAADRAVAL-
VELLGPFAT (SEQ ID NO:45)
DEGKAPAGRVPDMPGMGSLLLPPWTLTRSGPDSVEMTGYFTR-
FHVGFNHAVIIGGVLPLVFDHLFGMISYTAGRSISRTAF
LHVDYRKITPIDEPLVMRGRVTRTEGR- KAFVSAELVDGDEMLLAEGNGMMVRLLAGQP
>BL;ML0281, ML.tab 363432:364121 forward MW:25240
LNTSDSAPGVAVLLFGDDRTRQRWNTLTALSTYRA-
GGPDDIDSIDATIGPYRRLVVVGGDGDLAAVLGRLLRADRLDIEV (SEQ ID NO:46)
AYVPHQRTAATRVYRLPTGRRAARRARRGYATRVPLIRDETGSVIVGRADWLPVVDRQPLNGEAIVDDIPLFD-
GDVAGVR
IAPTLANPGLRARLHTSRTGIGIWSRWLTGRAVQLGSTGVAVVRDGVPTRRRERRSTFY-
RNVEGWMLVR >BL;ML0284, ML.tab 365860:366273 reverse MW:14541
MTSMGDLLGPDPILLPDDSAAEVELRANKDPGTVAAAHIPSASVAWAALAEGALADD-
KATTAYAYARTGYHRGLDQLRCNG (SEQ ID NO:47)
WKGFGPVPYSHEPNRGFLRCVAALARA- ANAIGETDEYRRCLNLLDDCDPAARNELGL
>BL;ML0285, ML.tab 366385:367263 forward MW:31593
MPNASEPERGITLNRPGLAPRTLDKNVVSNLPEAK-
DNPANTHGEEIAAGYPLAHSDSETEAMVLTKTEPDQDPGADRQHH (SEQ ID NO:48)
ERRFTAPGFDARATAIMATAPDPATEAIHPPLSSSDPPGHLGISPKAAVPQSIPPVLGTKLRSARHFHWGWVV-
ALLMMVL
ALAAIAILGTVLLTRGKHVKASPAEQVRHAIQSFDVAVQTGNLTALRSITCGTTRDGYV-
EYDESSWDETYHRVSAAKQYP
VIASIDQVVVNGQHAEANITTFMAYDPQVRSTRSLDLQFCDDQWK- ICQSPSG
>BL;ML0298, ML.tab 381423:381647 reverse MW:7851
MIVXTVNERPVEVNEQTTVAALLESLGFPASGIAVAVEFSVLPRSYWATKISELPAVT-
GRSEPIRLEVVTAVQGG (SEQ ID NO:49) >BL;ML0370, ML.tab
461945:462814 reverse MW:30055 VRYRRQVAHTRKLLAALSRRGPHRVLRGDLSFAGL-
PGVVYTPAGGLNLPGVAFGHDWLTGTARYAGLLEHLASWGIVTGA (SEQ ID NO:50)
PDTQRGLTPSVLNLAFDLGSALDIVAGVRLGPGNISVHPAKLGLVGHGFGGSAAVLAAAGLPGLAGLPAKSAV-
AIFPTVT
SPAPEQPAATCKVPGLILTAPGDPKTLNSNALSLYRAWDDATLRIVSKAKAGGLVEGWR-
MTKVVGLAGPHRATQKAVRSL
LTGYLLYALGGDKEYRDFADPDMHLPHTVPVDPEAPLVTPEQKIV- TLLK >BL;ML0383,
ML.tab 476440:477285 forward MW:29301
VTSVSIVVEIGHTSAAEPMLAAAAFGNQPGRWPLPTATTPHQLWLRAVAAGGQGHYSSAYRDLAVL-
RRSVPAGRLASLAH (SEQ ID NO:51)
STEGSFLRQLGWHSLARGWDGRALVLAGTDSEARADA-
LIGLAADALGVGRLAAAATLLRRVGSALAPAQLPAQVADRLAV
RRRWVAAELAMAVGDGATAVRNA-
REAVELAQVGRVSVRHQVKSDVVLALCSAATEPRVVAEAALAATGRLGLIPLR
WALACLLIDIGSVTFSEPELSELRDVCADQVRRAGGTWRTA >BL;ML0386, ML.tab
480093:480506 reverse MW:15294 VRDHLPPGLPPDPFADDPCDPSAALDAV-
EPGQPLDQQERIAVEADLADLAVYEALLAHKGIRGLVVCCDECQQDHYHDWD (SEQ ID NO:52)
MLRANLLQLLIDGTVRPHEPAYDPEPDSYVTWDYCRGYADASLNQATSDADGYRRRH
>BL;ML0405, ML.tab 503217:504401 forward MW:40754
MSGAFIIDPTLKAIEAWHALLGIGVPNDGGVLYSSLSFFEKALEHLAAAFPGDGWLGSAADKYAGQNRKRVDI-
FQELAEL (SEQ ID NO:53)
DKELIELIHNQANSVQTTRGILDGAKKALLFVRPVAIDLNYIPL-
VGSVMSASIQAQACAAAMAAVSGGLAYLLVQTAIHT
AKFVALLARLAHLLASAVADVVSDGVAIIK-
GIVDHLWHFIAGALTGLKDIVEKIIHWFFGLFSHWWSRLHSFFGGIPGLS
GATSGLSQVTGLFGVPGLAGSSGLLSGESLLSTENLPSLAGVGAGLGLGSLPQLAQLHAASTRQGTRSQAGVS-
AELSTEQ
FGGQQEPVSAQGSQGMGGSQGMGGMTPASTKSKKDERKKKKYSEGAAAGTDDAERAPIE-
VQSGGGKRALAQHVV >BL;ML0406, ML.tab 504459:504779 forward
MW:11110 MRSMIDNLTVQSEHLNSLASQHENEAACASSGVSAAAGLANAVSTSHGS-
YCAQFNDTLKMYEDAHRTLGESLHTGGIDLA (SEQ ID NO:54)
RVLRVAAANYCDADEICGSDIKSAFG >BL;ML0407, ML.tab 504793:505443
forward MW:24227 MGSRRRINRRLLPMSTFPAWQEFRRDVVVVFPGND-
FDRDDCDTVDPWGVGGAAHWTIDPIVGFASSSAPQDRGTDVDNTR (SEQ ID NO:55)
GQAEEDEKQKEPEVAIFTVTNPPRTVSVSVLMDGRIDHVELSKRVTWMSESQVASEILVLADLARQKAQSAQY-
TFILDKL SQLADGDEHRVALLRESVGNTWNLPSPEQAAEAEAEVFATRYSDYCPAQDTENDQW
>BL;ML0410, ML.tab 508327:508629 forward MW:10951
MSFFLRVEVGGLMMAAGRLERITSESMACNAKLTPVTTKVVPPAADQVSKLVSQVFSSYGKQYEGYAAQGVDQ-
SRLFVQS (SEQ ID NO:56) LKDAAGDYMDSDHMYLNTED >BL;ML0411, ML.tab
508755:509981 forward MW:42466
MFDFMVYSPEVNAFLMSRGPGSTPLWGAAEAWISLAEQLMEAAQEVSDTIVVAVPASFAGETSDMLASRVSTF-
VAWLDGN (SEQ ID NO:57)
AENAGLIARVLHAVAYAFEEARAGMVPLLTVLGNIIHTMALKAI-
NWFGQVSTTVAALEADYDLMWVQNSTAMTTYRDTVL
RETGKMENFEPAPQLVSRYCMDRRDSVNSF-
HSSSSSDSLYESIDNLYDSVAQSEEHGSDSMSQSYNTCGSVAQSELCDSP
FGTPSQSSQSNDLSATSLTQQLGGLDSIISSASASILLTTNSISSSTASSIMPIVASQVTETLGRSQVAVEKM-
IQSISSTA
VSVDVAASKVVAGVGQAVSVGALRVPENWATASQPVMATAHSVPAGCSAITTAVSGPL-
EGVTQPAEEVLTASVAGGSGTG GPAFNEAV >BL;ML0418, ML.tab 517644:518276
forward MW:23558 LLSVDEVLTTTRSVRKRLDFDKPVPRDV-
LMECLQLALQAPTGSNSQGWHWVFVEDAEKRKAIGDIYLVNARCYLSQPAPEY (SEQ ID
NO:58)
PEGDTRGERMRLVRDSATYLAEHMHEVPVLLIPCLLGRAEESPLGAVSYWASLFPAVWSFCLALRSRGLGTC-
WTSLHLLG DGEQRAAEVLGIPSDKYSQGGLFPIAYTKGTDFRPANRLPAENVTHWDIW
>BL;ML0425, ML.tab 524416:524643 forward MW:8231
VADRHPDTIKLEIDVAREQFAATVDSLAERANPRRLAGDLKARVVEFGRRPAVIAALVSCAVLTVIVVVRKVK-
NR (SEQ ID NO:59) >BL;ML0431, ML.tab 530916:531695 reverse
MW:27233 MNNPRRSEWLGPSLAGSGPIEPQVHQYPPLTDPAYAEQAPYAPAYGASL-
PPWTPKKPPQQLPRYWQQDQPPPTDIPPEGL (SEQ ID NO:60)
TLPPPHEPKSPHWFLWVVAGASVVLVVGLVMALIIANGAIKTQTAVPPLPAITESSSATPTPTTKTSPTPTAG-
PAPSTTG
SGTLTQTIGPSAMLDVVYSITGQGRAISVTYMDTGDVIQTEFNVVLPWSKQVSLSKSAV-
HPASVTIVNIGHDVTCSVTVA GVQIRQHTGVGLTICDAPR >BL;ML0451, ML.tab
551191:552240 reverse MW:37930
MLTLLVLLVALATLAGGWGYQTANRLNRLHVRYDLSWQALDGALARRAVVARAVAIDAYSGTSPGRRLAALAD-
AAECAPR (SEQ ID NO:61)
HTRENAENELSAALAMVDPASLPTALIAELADAEARVLLARRFH-
NDAVRDTLALGEQRLVRTLRLRGTASVPTYFEIVER
PHALTHGDHGVPNQRTSARVVLLDETGAVL-
LLCGSDPAITNGHAPRWWITVGGEVRPGERLAAAAARELAEETGLRVIPT
HMVGPIWRRDAIFEFNGSVIDSEEFYLVYRTRRFEPSTVGWTELEHLCLHGSRWCDANDIAELVASGEQVYPR-
QLGELLP VANQLADASTGTARGTAAARNTYVLLSIC >BL;ML0486, ML.tab
589652:589996 forward MW:12749
MESLVLLLLFLLIMGGFMFFASRRQRRSMQATIDLYNSLQPGDRVNTTSGLQATIIVVGDDTVDLEIAPGVVT-
TWMKLAI (SEQ ID NO:62) RDRILPDDAYMDEHEAEPGDFVYCDELEESDGSS
>BL;ML0520, ML.tab 631179:631787 reverse MW:21872
MNNWMLRGLVFAALMIVVRLMQGTMINVWQAQSVLISVVLLAVFIIAVVVWAARDGRADAIANPDPDRRRDLA-
MTWLLTG (SEQ ID NO:63)
ILVGVLSDAVAWVISLLYNGIYTGGLVSELTTFSAFTALIVFLT-
GIIGVACGRWRVDRRSPPVPEHSRSGQNRADSNVFA
AVCTDDDTPTGELSAAQTKEQTAAVATAES- EAPTEIIYIIQRA >BL;ML0542, ML.tab
657980:658312 forward MW:11942
VSIPQSNTSLSAVIAVDQFDPSSGGQGVYDTPLGITNPPIDELLDRVSSKYALVIYA-
AKRARQINDHYNQLGEGILEYVG (SEQ ID NO:64)
PLVEPGLQEKPLSIANREIHADLLEHTE- GE >BL;ML0561, ML.tab
678085:678555 forward MW:17254
MTAQLDRDDWDVELRPYWTPLFAYAAAFLIAAAHITVGLLLRIKSSGVVFRTADQVAIGALGLVIASAV-
LLLTRPRLRVG (SEQ ID NO:65)
AAGLLVRNIMFYRIIPWSHVVDVSFPLGSHWARIDLPDDE-
YIPLMAIQAVDKERAVEAMDAVRALLARYRAGPYGP >BL;ML0577, ML.tab
699950:700183 forward MW:8150 MELALQITLVVTSILVVLLVLLHRAKGGGLSTLFGG-
GVQSSLSGSTVVEKNLDRLTLFVTGIWLVSIIGVALLTKYR (SEQ ID NO:66)
>BL;ML0580, ML.tab 704001:704798 reverse MW:29039
MSRVLTLVITPYSKANLKESIEAANGASHKYPNRIIIANRVNSYANKARLDAQLWVGADTGAGVVVSRTLAVY-
AHSVVIS (SEQ ID NO:67)
ILLPDIPMVAWWPNIAPTMSGQDSLGKLAIQRITNATNSIDPLA-
TIKSRLSDYTADDTHLAWDLITYWRALLTSAVNLPP
HEPIDLALVSGMKTEPALDVLAGWLANRIN-
RPLRRAVADLKVELIRNSETIVLSRPQTWVTSTLIRTVKPDALVPWGAQG
SRGVPSRKSATTGSRQVLLQCLRRH >BL;ML0603, ML.tab 735002:736117
forward MW:39051 VLVLWRCFRVANISALMVAVACLPDWLSGFLTGGL-
IAGSSARRATIYGVSNKFSSLHLVLGNEELLVERAVGEVLRSARQ (SEQ ID NO:68)
RAGTQDVPVSRMRAGDVGTYELTELLSPSLFADERIVVLEAAAEAGKEAAALIVSAAADIPQGTVLVVVHSGG-
GRAKALA
NELQSLGATVHPCARITKLSERTDFVRKELRSLRVKVDEGAVTALLNAVGSDVRELASA-
CSQLVADTAGDVDADAVQRYH
SGKAEVKGFDIADKAVGGDVSGAVEALRWANMRGEPLVVLADALA-
EAVHTIGRVGPLSGDSYRLASRLGMPPWRVQKAQQ
QARRWSRDTVAAAMRVVAALNADVKGAAADA- YYALESAVRKVAELAADGSR
>BL;ML0630, ML.tab 763032:763358 forward MW:11114
MAELLNTEDAKLVVLVRAAMARTEAGSGAVVRDFDGRTYAAAPVTLSTL-
ELIGLQEEAAAAFSASSVVSGLEVGVLVAGS (SEQ ID NO:69)
VDEPDIAMVRELASTAVVILIDRNGNRV >BL;ML0642, ML.tab 774506:775945
reverse MW:50246 VAHSGSPVSTVDGVANPPFGFSSGNDSPNDESGRD-
KHGKNGPDSGSSGSDPLASFGMSGDFGMSDLGQIFTHLGQMFTNA (SEQ ID NO:70)
GTAMTADKQLGPVNYELARRVASSSIGFVAPIPATTSSAIGDAVHLAETWLDGVTALPAGTTKAEGWTPDDWV-
NNTLETW
KRLCDPMAQQISTVWAASLPEEAKSMASPLLSMMSQMGGMAFGSQLGQAFGQLSREVLT-
STDIGLPLGPRGVAAIMPQAV
ESFADGLEQPRCEILTFLATREAAHHRLFSHVPWLASQLLGAVEA-
YAAGMKIDMNGIEELARDFNPASLSDPTAIEELLG
QGVFEPQATPAQTQALERLEALLALIEGWVQ-
VVVTAALGDRIPGAAALGETLRRRRASGGPAEQTFATLVGLELRPRKLR
EAAVLWERLTQAAGVDARDAVWQHPDLLPSGKDLDDPASFIDRIIGGDTSGIDEAIAKLDLDRGNSDGRTPGS-
GGPVDN >BL;ML0676, ML.tab 811815:812291 reverse MW:16889
MFLRLMLSALKALQRLGAVMNSLARIDHWIWLFRCQPLTIRLLVATAALFTAATAFE-
VPAEADAIDDTFIKALNHAGVNF (SEQ ID NO:71)
GEPRSAMTMGHYVCPILAKSGGNFAAAV-
QRIRGNSDMSPQMAETFAKIAISIYCPTMMANVASGNLPSLPPGPGIPGI >BL;ML0703,
ML.tab 840882:842153 reverse MW:46028
MVADLVPICLSLPAGDRYTVWAPRWRDGGDEWEAFLGKDDNLYACETVADLVAFVRTDSDNDLVDHPAWKDLT-
SVHAHKL (SEQ ID NO:72)
DPSEDNQFDLVVVEELVAEKPTAESVTTLAATLAIVASIGSVCE-
LPAVSKFFNGNPSLGAVSGGIEHFTGRAGQRRWNSI
AEIIGRSWDDVLSAIDKVISTPRVNAAMSA-
KAADELAEEPVEPEVEPDDEDGADSATAQANNSDDTEDESRTAGDTVVLG
SDKDFWLQVGIDPVRIMTGAGTFYTLRCYLDDHPIFLGRNGRISVFSSERALARYLADEHDHDLSYLSTYDDI-
RTAATDG
SLAIDITDDNIYVLSGLSDDLADGPDAVDRDQLDLAVELLRDIGQYSEESAVDTALETN-
RPLGKLVAHVLSPSAVDKPVA PYSAAVREWEKLEQFVESRLRLE >BL;ML0730, ML.tab
871757:872011 reverse MW:9373
VSAANDGSETNKLPTTQNPHIQITKGQPTDQELAALIVVLSSIGGASQVKQPEPTRWGLPVDKLRYPVFSWQR-
ITLHEMT (SEQ ID NO:73) HMRR >BL;ML0733, ML.tab 874677:875195
forward MW:19913 MRYPGNTLVAGEQVVLHRHPHWKRLIWPAVVLILA-
TGLVSFGSGYVNSTHWAQVAKNVIYGVLWGVWLVIVGWLTLWPFL (SEQ ID NO:74)
NWLTTHFVVTNRRVMFRQGTLTRSGVDIPLARINSVEFRDRLFERMFRTGTLIIESASQDPVEFYNIPRLRQM-
YALLYHE VFDTLGSEESPS >BL;ML0734, ML.tab 875150:875836 reverse
MW:25515 VSFPDATITRLPTVLQPYAQRYHELIKFAIVGGTT-
FIIDSAIFYTLKLTILEPKPVTAKVVAGIVAVIASYVLNREWSFR (SEQ ID NO:75)
DRGGRERHNEALLFFAFSGIGVLLSNAPLWFSSYVLQLRAPTVSLTVENLADFLSAYIIGNLLQMAFRFWAFR-
RWVFPDA
FARNPEKTLESALTAGGIAEVFEDAIDGVFEDFGDALLRAWRNRSRRLDLSPASQLGDS-
SEPRVSKTS >BL;ML0748, ML.tab 890981:891259 reverse MW:9830
MVQGLLAKAATMVITGLTGVTAYEMLRKAVTKVPLHQIAVSALELGLRGSRKAEEAAE-
SARLKLADVMAEARERIGKETT (SEQ ID NO:76) APAVSDIHQHDH >BL;ML0761,
ML.tab 903525:904028 reverse MW:18775
VSNSCSSSRHGQWSRRFSSRRAAKRGRDIRGPLLPPTVPGWRSRAERFDMAVLEAYEPIEQRWQGRVSELDVA-
VDEIPRI (SEQ ID NO:77)
AARNPENVQWPPEVIADGPIALARLIPAGVDVRSNATRARIVLF-
RKPIERRAHDTVELGELLHDILVAQVAIYLDVEPSA IDPTMDD >BL;ML0762, ML.tab
904068:904565 forward MW:17248
MRVSGASATFSHDSLSVVNVPRRCCRPGCPHYAVATLTFVYSDSTAVVGPLATVREPHSWDLCVDHAARITAP-
RGWELVR (SEQ ID NO:78)
HAGPLPSNPDEDDLVALADAVREGPGGEHGSYGNGARASLGGFA-
DPQLQSAGAHATVPSGGLLAPSELRSGRRRGHLRVL PDPSD >BL;ML0764, ML.tab
906078:907175 forward MW:37464
VNATRLIDLEDTKGLIAADRDGLLRAASSAGAQVRAIAAAAEEGALETLRAHDRPRTVIWVAGRGTAETAGAM-
LAATSGG (SEQ ID NO:79)
ATTEPIVVASEAPPWVGPLDVLIVAGDDPGDPALVGAAATAVGR-
GARVVVVAPYEGPLRDATAGRVAVLEPRLRIPNEFG
LCRYLAAGLAALQTVDPRLRLDLANLADEL-
DSEALHNSVGYEVFTNPAKTLAASVSGHRVALAGDCAATLALARHGSSVL
LRIAHQVTSATGLSDAVVAVRSSVDVADYAPTSVDVLFHDEEIDGSLPERLRVLALTLASERTVVAARVVGLD-
DVYLVAA EDVPDGPSGLAGLPVSGGADRAEQELANLAVRLEMAAVYLRLVRG
>BL;ML0776, ML.tab 920258:920515 forward MW:8946
VAGCGVFATRWSDACTAELSVAAGEPRVVSLCVDPLVPVVVLGRYVGARRQAILAMKEHGRRNLVALPTRQCV-
SRLGSCT (SEQ ID NO:80) LPGRG >BL;ML0806, ML.tab 955206:955727
forward MW:17808 VDRVIALLSSGAIVGPCDYADVVTLPHKRAVFSRA-
PAAVRGAGLIVVVQGAVALVVAAALVVRGLTGADQRIVNGLGTAI (SEQ ID NO:81)
WFVVVGVAVLAAGCALLVGKRWGRGLAVFTQLLLLPVAWYLVVGSHQSAFGFPMGIVALIALILLFSPPAVRW-
SAGAYQR SVASSANRKADSR >BL;ML0810, ML.tab 958882:960105 reverse
MW:42979 MVRPERRTKADTIAAMTITVVMAAMVSLIWWTSDA-
QATHSRPATIPAPNPTPAREVPTAFNQLWAAASPATTAPVVVGGA (SEQ ID NO:82)
VITGDGHQIDGRNPVTGESRWSYARDSDLCGVSWVYHYAVAVYRDDRGCGQVSTIDGSTGRREAARSSYADPN-
VRLSSDG
TAVLSAGDTRLELWRSDMVRMLAYGEIDARVKPPARGLHSGCTLESTAASSSAVAVLEA-
CANQDDLQLVLLRPGKEDDEP
QQHLVAEPRVRSGSGARVLTVSDTHTAVYLPGEAGTQPRVDVIDE-
TGTTVASTLLTKPPSSSAVVSQAGNLVTWWTGDTL
MVFNQSNLTLRYTIAAGETTAPVGPGVMMAG-
QLLVPVTGKIGVYDLFSGANNRYIPVRRPPSSSAVIPAVSGSTVFEQRG DTLVALG
>BL;ML0813, ML.tab 962707:963294 reverse MW:20353
MRLTETTSIRRTTTTSYSGHPIVDGRQVAALLGSVAALCAIATAVIINSGDNATTKAIVGAPTPRPVLTTPSI-
PLPATPS (SEQ ID NO:83)
STPPLLLLFDTATATIPHKAAPPALHPRTVVYNVTGMKELLDLV-
TVVYTDARGYPKTEFNVVLPWTKAVVLNLGVKTQSV
VATSFHSQLHCSIVNAEGQPVVASTNNAVI- ATCTR >BL;ML0814, ML.tab
963593:963841 forward MW:8652
VEVKIGITDSPRELTFSSAQTPGEIEELVSAALREGLGLLVLTDERGRRFLIHGAKIAYVEIGVAD-
ARRVGFGIGAESAT (SEQ ID NO:84) NG >BL;ML0816, ML.tab
964701:965726 forward MW:37443 VSSPGPVGSPGRVPVLREEWRAPLRAQR-
EPLARGEGRVRVNRGRSRRWRKQTRLGRFVSVFGWRAYALPFLMALTAVVLY (SEQ ID NO:85)
QTVTGTNAPEPAASEPITEPPVIGAVGTAIMDVPPRGLAAFDANLPAGTLPDGGAFTEAGDKTWHVVPGTMPQ-
ISQSATK
VFKYSIEIENGLDPTMFGGDGAFAQMVDQTLANPKGWTHNPQFAFTRIDTGMPDFRISL-
VSPLTIRAGCGYEFRLETSCY
NPSFGPDRQARVLINEARWLRGALPFEGDVGSYRQYVINHEVGHA-
IGYVRHEPCDKQGGLAPVMMQQTFSTSNNDGAKFD PEWVKPDGKTCRFNPWPYPIA
>BL;ML0818, ML.tab 967172:968065 forward MW:32374
LRSPRSGSTRLTRVTVEPPPEHVLSAFGLTGVQPVPLGASWEGGWRCGEVVLSMVADNARAAWSARVRETLFV-
DGIRLAR (SEQ ID NO:86)
PVRSTDGRYVVSCWRANTFVAGTPEARHDEVVSAAVRLHEATGK-
LERPRFLTQGPTARWADVDIFIAADRAAWEGRPLQS
VPSGVWAAPMTTDGQRSVDLINQLAGLRKP-
TRSPNQLVHGDLYGTMLFVGTAAPGITDITPYWRPASWAAGVVVVDALSW
GEADDGLIERWNALPEWPQMLLRALMFRLAVHALHPRSTAEAFPGLARTAALVRLVL
>BL;ML0834, ML.tab 990401:990703 reverse MW:10909
VRQDGASRTVVGGTALIRYVIVLGLGYVLGAKAGRRRYEQIVGIYRTLTGSPMAKSMIAEGRRKVANRISPDE-
GFVTLAE (SEQ ID NO:87) IDNQTTVIERSAEWRENGGN >BL;ML0857, ML.tab
1019690:1020442 reverse MW:26808
MAKPRNAAAHKAARAEAKAARKAASRQRRLQLWQAFTIQRTEDKRLIPYMIAAFSLMVSASVTAGVLVGGLTM-
ITLILLG (SEQ ID NO:88)
VVLGALVAFIIFGRRTQQSVYHKAEGQTGGAAWALDNLRGKWRV-
SPGVAANGHFDAVHRVIGRPGVIFVAEGSAARVKPL LAQEKKRTARLVGDVPIYDI
IVGNGDGEVALVKLERHLARLPANISVKQVDILESRLAALGSRAGASLIPKCPLPNACKN
RGVQRTVRRK >BL;ML0869, ML.tab 1033201:1033575 reverse MW:13804
MMAGEEAYLPPRDQGPVRRYIRDLVDARRNALGLFTPSALVLLFITFGVPQLQLYMS-
PAMLVLLSVMGIDGIILGRKISK (SEQ ID NO:89)
LVDVKFPSNTESHWRLGLYAAGRASQMR- RLRVPRPQVEHGSSVG >BL;ML0872,
ML.tab 1035615:1036130 reverse MW:19047 MLPPAVSYPRRRSKRLI
ISVLVAIALVAAMTAVI IYGVRTNGSKTGGTFSEVTAKTAI EDYLKALEQSNINTIARNALCG
(SEQ ID NO:90) MYDSVRDQRPDQALAQLSSDAFRKQFSQVELTS
IDQIVYWSPYQAQVLFTMRTSPATGGPKRRQIQGIAQLL- YRRNQVLV CSYMLRTADSH
>BL;ML0876, ML.tab 1040896:1041315 forward MW:14969 MHI
EARLFEFVAVFFVIMAVLYGVLTSMFATG- GVDWVGTTALALTGGLALIVATFFRFVARRLDI
RPEDYEGAEISDGAG (SEQ ID NO:91)
ELGFFSPHSWWPVLVALSGSVAAVGIALWLPWLIVAGVVFVLASAAGLVFEYYVGPEKH
>BL;ML0878, ML.tab 1042383:1043021 forward MW:22756
MMNRYSPY2RGSDTIASDVIDRILVGVCAAVWLVLIGVSVAAAVALEDLGRGFHKIASDPHTTWVLYGIIVVS-
VLIIAGA (SEQ ID NO:92)
VPVLLWARRVARVEPPIRPAGVPERGGVRQLVSAGRSTARIEVE-
RVCAEERVQSVAQPGEWFDAAVDRIWLRGTVGLTGT
MGAALVAVAASTYLMAVGRDGASWVGYVLA- GIVTAVMPVIEWIYVRQLRRVG
>BL;ML0888, ML.tab 1055260:1055667 forward MW:15125 MNSTNS
IQIADETYVAADRALIGAAVADRSSW-
HRWWPDLRLQVVEDRAEKGIRWAVTGTLTGTMEIWLEPLTEELDGVVL (SEQ ID NO:93)
HYFLHAEPAGVAAWQLAKMNMAKVTHRRRVAGKAMAFEVKKTLERSRSIGVSPVI
>BL;ML0889, ML.tab 1055786:1056220 forward MW:16483
VADKTTQTFYIDANPGEVMKTIADIESYPQWISEYKEVEVLEVDDEDFPKRARMLMDAKIFKDTLIMSYDWTA-
DHQSVSW (SEQ ID NO:94)
ILESSSLLKSLEGSYRLVPKGSTTEVTYELAVDFAIPMIGMLKR- KAEHRLIDGALKDLKKRVEG
>BL;ML0891, ML.tab 1057485:1057877 forward MW:13550
MSGGYADIGPELRKLAQMTLDGIGPAVRSAAALVAGARGTGKCQQAWCP-
VCALTALVIGEQHPLLTVIADHSVALLDVIR (SEQ ID NO:95)
AIVDDIDQSNKIPPDSPHGGGLDTETPAQTNTSNGTVRRRYQPIPVSVED >BL;ML0895,
ML.tab 1061008:1061523 forward MW:19954
MRYPYDTEFIEDGRTIELVSIGVVAEDGREYYAVSNEFDPERAGNWVRVNVLSKLPPLASQLWRSRRQIRLDL-
EEFFGVD (SEQ ID NO:96)
GSEPTEPIELWAWVGAYDHVALCQLWGPMPDLPEALPRFTREIR-
QLWEDRGCPRMPPRPRDLHDALVDARDQLRRFRIIM SADDVGSLPTH >BL;ML0898,
ML.tab 1064343:1064747 forward MW:14698
VGSIPAGDDVLDPDEPTYDLTQVAELLGIPVSRVHRKLCEGYLVAVRRGDSLVVPQIFFTNSGAVVKSLPGLL-
TILHDGS (SEQ ID NO:97)
FHETEIVRWLFTPDPSLTLTRDGSRDVVSNARPVDALHTHQARE- VVRRAQAMAY
>BL;ML0902, ML.tab 1068511:1069230 reverse MW:25542
VRARFPPLFTRGCTAQRRRTLTIALLLVAMVPLATGCLRVTASITISPDNLVSGKII-
AAAKPKNKNDAGPQLNDNLPFSQ (SEQ ID NO:98)
KIAVSNYNSDGYVGSQAVFSDLTFAELP-
QLANMNSSTTDVTLSLRRNGNLVILESRADLTSVTDPDADVELTVAFPGVVT
STNGDRIETKVVAWKLKPGVVSTMSARARYTDPDTRSFTGAAVWLGIASFSAASVVVLLAWNERKSSARLQIP-
RDSSSS >BL;ML0903, ML.tab 1069302:1069934 reverse MW:23864
LAIFLINLSPNEMERRLNEALEVYVDAMRYPRNTENLRAGIWLEHIRRPGWQAVAAV-
EVRIEVADVADMADGPAHPAPSA (SEQ ID NO:99)
DELNNAPLRGVAYGYPGAPGQWWQQQVV-
QGLQRSGLSTLEIARLMNSYFELTELHIHPHTQGRGIGEALTRRLLAHRREN
NVLLSTPETNGETNRAWRLYRRLGFMDIIRRHYFAGDPRAFAILGRTLPL >BL;ML0904,
ML.tab 1070251:1070655 forward MW:14614
MPLSDHEQRMLDQIESALYAEDPKFVSSVRGGGLRVPTARRRTQGAALFVIGLGMLVCGVAFKATMIGSFPIL-
SVFGFVV (SEQ ID NO:100)
MFGGVLFAITGSRLSGREDHPGLAPGTSRQRRSKGAAGSFTSR- MEDRFRRRFDE
>BL;ML0907, ML.tab 1072702:1073835 forward MW:39543
MKAQRDTPIRRGDSGRPGGRDGAARSGKRTANEAGSRRLRTHAGKISASAREVGVPK-
SGPRTSPMSRPVERPARPRNTTQ (SEQ ID NO:101)
AKARAKARKAKAPKVVRPRLGECLIAR-
LALIDLRPRTLVNKVPFVVLVISSLGVGLGLTLWLSTDSAERSYQLGDAREQA
RMLQQQKEALERDVREAESAPALAETARKQGMIPTRDTAHLVQGPGGNWVVVGTPKPADGVPPPPLNTKLPDA-
GPPSLKP
PEIPLEVPVRVVPGPGGPPPPARSGPQMWLRVPDGATTLGGQHLPQELPQLPGMLNGPA-
AAQVQVPGFMPTPGLPIPGST
MRVPVPAPAPTEVPVRLQPGLVSPAVTSPVISTSPVPTPVNSEQF- GPVTATAPGTSR
>BL;ML0920, ML.tab 1090054:1090686 forward MW:24016
MSTLHKVKAYFGMAPMEDYDDEYYDDRSPTHGYGRSRFEEGYGRYEGRDYSDLRGDP-
TGYLPLGYRGGYGDEHRFRPREF (SEQ ID NO:102)
DRPDLSRPRLGSWLRNSTRGALAMDPR-
RMAMLFDEGSPLSKITTLRPKDYSEARTIGERFRDGTPVIIDLVSMDNADAKR
LVDFAAGLAFALRGSFDKVATKVFLLSPADVDVSPEERRRIAETGFYAYQ >BL;ML0921,
ML.tab 1090811:1091101 forward MW:10769
LALFYQILGLALFVFWLLLIARVVVEFIRSFSRDWRPNGVTVVILETIMSITDPPVKLLRRLIPQLTIGAVRF-
DLSIMVL (SEQ ID NO:103) LLVAFIGMQLALSAAA >BL;ML0923, ML.tab
1092221:1092613 forward MW:13650
MLIIALVLALIGLVVLVFAVATSNLLMAWVCIGASVLGVLLLIVDAVREHQCIDAANNEDKEDTDQDDGAVYV-
DYLDEVP (SEQ ID NO:104)
AGTSTEAPDAGSQEGDTNSGELSGYWGRLTIDTGEQSAVAADD- HDNDRAT >BL;ML0984,
ML.tab 1151024:1151473 reverse MW:16902
VAIAELTEASVQGVENIRTVEVFLAALQDAGLRNRIRDVGRQPRVYQNVGLPTIHGR-
SKTITLWRKMADCIGFEIKIHRI (SEQ ID NO:105)
AAVAIAVLCERADAVIVGPLWMQFWVC-
GTFEVQNKRIMLWPNYFDLFDLFKATMRSLVALRIPSLNAAF >BL;ML0986, ML.tab
1152702:1152905 forward MW:7514 LAGVRLTEFHERVVLRFGAAYGASVLV-
DHVLTGFDGRTVAQAIEDGVELRDVWRALCVDFDVPRDQW (SEQ ID NO:106)
>BL;ML0990, ML.tab 1157281:1157910 forward MW:22743
MNEEPNIVDFPDSNPLQAALEAEELRVVREIDSGAKIFVLIAVLVFMLLGSFILPHTGQVRGWDVLFDSHGAG-
AAAVALP (SEQ ID NO:107)
LRIFAWLSLVFGVGFSMLALMTRRWVLAWIALAGAANASIVGL-
LAVWSRQTVAVGQPGPGVGLIVAWITLILLTFHWARV
VWSSTIVQLATEEQRRRVVAQQQSKTLLD- GLYSAGNRDTRTRSDPQVGS >BL;ML0994,
ML.tab 1162620:1163318 forward MW:23576
VLSAIAIVPSAPVLVPELTGAAAAEVADLRSAVLAVAACLPPCWIVVGT-
GRADDVVGPGGCLGTFAGFGADVRVRLSPQV (SEQ ID NO:108)
GGEAELLVDFPVCALIAAWVRGQSQLDASAQVRVYCGDHDPDMALACGRQLRVEIEQAPDPIGVLVVADGATT-
LTSSSPG
GYDPSAADAELVLDDALASGDVAALTRLSCQISGRVAFQVLAGLVEPGPRLAKELYRGA-
PYGVGYFVGVWQP >BL;ML1001, ML.tab 1172599:1172874 reverse
MW:10150 MPDPVVMPVPCPTSGFTQYSPYYRGAQITLLQQTILAKLNQKYYNNRYR-
VDVEMVLSHTGVEADSAASHTILGLSSSILP (SEQ ID NO:109) PYGCRTKKQRS
>BL;ML1004, ML.tab 1176008:1176502 forward MW:17108
MLVIYAVPPLIGNVRHPMSRPILGPRCGSGESAGSRRPAPSRSASAPMRYSGASVANLVAPHRGRTVSLAKTI-
GLALLAG (SEQ ID NO:110)
MITLWLGLMADVSQVIDGDATGFVTHVPNRLAVVRVEAGESLQ-
DVAARVAPDAPVRQVSERIRELNVLDSSMLVAGQTLI APVG >BL;ML1009, ML.tab
1179519:1180499 reverse MW:36153
MSHSHYQDPDDEQHYQPGQPGMYVLEFPAPQLLASDGRGPVLIHALEGFSDAGHAIRLAATHLKAALNTELVA-
SFAIDEL (SEQ ID NO:111)
LDYRSRRPLMTFKTDHFTHYDDPELSLYALRDSVGTPFLLLAG-
MEPDLKWERFITAVRLLAERLGVRQTISLGTVPMAVP
HTRPITLTAHSNNGELIADFTPWITEIQV-
PGSASNLLEYRMGQHGHEVVGFTVHVPHYLTQTDYPAAAQALLEQVAKTGA
LQLPLSALAEAAAEIRAKIDEQVQASTEVAQVVAALERQYDAFIDAQENRSLLRRDEDLPSGDELGAEFERFL-
AQQAEKK RDDDLT >BL;ML1015, ML.tab 1185456:1185875 reverse
MW:15761 VGKLSTAPNRGTTDTFDDNSRPVLITTAAPSYE-
EERRTRVRKYMTLMAFRIPALMLTTVAYSAWHNGLISLLIVAASVPL (SEQ ID NO:112)
PWMAVLIANDRPLRRTEEPRRFDSRRRRTPLLLTTEQPAFKSLRRPPPKPTSLATDSRS
>BL;ML1016, ML.tab 1185903:1186226 forward MW:11848
VSGFCFSVGRVRRHNVTMLGRHNVTMLGMQTQTIEHTYTDEHVDDGTGSDTPKYFHYVKKDKIVESAVMGSNV-
VALCGEV (SEQ ID NO:113) FPVTRAAKPGSPVCSDCKRVYDMLKKG >BL;ML1025,
ML.tab 1192722:1193372 reverse MW:22931
VVTQITEGTAFDKHGRPFRRRNARPAIFVVVFLVIVAGVSWTIALTRPAKVREPEVCNPPTQSTGSVPTQLGK-
QVPRTEM (SEQ ID NO:114)
TDVTPAKLSDTKVHVLNASGRDGQAADIAGALRDLGFAQPTAA-
NDPMYADTLLNCQGQLRFGTAGQATVAAVWLVAPCTE
LLHDNRTDDSVDLALGTDFTALAHNDDID- AVLASLRPGATEPSDPALLQKIHANSC
>BL;ML1026, ML.tab 1193568:1193870 forward MW:10969
MPTDYDAPRRTETDNVPEDSLEELKARRNEAAS-
AVVDVDESESAESFELPGADLSGEELSVRVIPKQADEFTCSSCFLVQ (SEQ ID NO:115)
NRSRLASEKNGVMICTDCTA >BL;ML1027, ML.tab 1193875:1194348 reverse
MW:17131 VSGTPVAPHNVRYRERLWVPWWWWPLAFALASL-
IAFEVNLSGATLPSWLPFAVAAGTLLWLGRVEIQVIADAPLGGSVEL (SEQ ID NO:116)
WAGNAMLPITAIAQSAAISRSAKSAALGRQLDPAAYVLHRAWVGPMILVVLDDPDDPTPYWLVSCRHPERVLS-
ALRS >BL;ML1029, ML.tab 1194835:1195656 forward MW:29018
VTADDDAERSDEDGAAVMATFGKRTGKDGASRTLTEPADPEATELPAASEPDSEEVD-
ELEGPFDIDDFEDPAVAVLARLD (SEQ ID NO:117)
LGSVLIPLPEGSQLQVELTDVGVPNAV-
WVVTANGRFTITAYAAPKTGGLWREVAGELADSLRNDSAKVTVKDGPWGREVV
GTNTGVVRFIGVDGYRWMIRCVVNGPLETIDVLSEEARAALADTVVRRGDTPLPVRTPLPVQLPEQMAEQLRE-
AAVAQQS AQHADARQQSRELAPRRGAAGSAMQQLHNTTGG >BL;ML1030, ML.tab
1195701:1196399 reverse MW:23579
VAVDLRNVTTVLLPGTGSDDDYVYRAFSGPLHRVGAAVLTPPPQPNRLIDGYLSALDDAARAGPIGVGGVSIG-
AAVAAAW (SEQ ID NO:118)
ALAPERAVAVLAALPAWNGAPESAPAALAARYSASHLRRDGLA-
ATTLQMQASSPPWLADELARSWCGQWPLLPDANEEA
AAYIAPSCAELARLATPLGVAAAVDDPIHP-
LQVGVDWVTAAPHAALQTVTLNQIGTNAAALGTACLAALALT >BL;ML1037, ML.tab
1200579:1201133 reverse MW:19903 VSHEYWSTAVSCNPGIHHIVRLDVAJ-
4TPRTTQTYRHSMLAEDIAEDFPAISINSSALDAARMLAEHGLPGLLVTDMSDKP (SEQ ID
NO:119)
YAVLPASQVVRFIVPRYIQDDPSLAGVLNESTADQAAEKLSSKKVRDVLPDHLVNVSPVNADDTIIEVA-
ATMSRQRSPLL AVVKGGQLLGVITASRLLRAALKH >BL;ML1041, ML.tab
1206538:1207128 reverse MW:21153 VAPVTAAEPTPFREAVAAMNAFTVRP-
EIELGPIRPPQRLAPYSYALGAQVKHPELDIVPEQSEDNAFGRLILLYDPDGSD (SEQ ID
NO:120)
AWDGTIRLVAYIQSDLDSREAIDPLLPEVAWSWLIEALESRIDHVTALGGTVTATTSVRYGDISGPPRAH-
QLELRASWTA TTPEVGVNVKAFCEVLENAAGLPPAGVIDLGSRSRS >BL;ML1053,
ML.tab 1220374:1220673 forward MW:10117
MPLFLNAEPQALTAAANTLEGLSAATVASNAAAAQLTTEIAPPAADDVSILLAHFFSGHGRQYQAHASQGATN-
HQDLIQS (SEQ ID NO:121) LLTSSSAYAGTETANIHDSL >BL;ML1055, ML.tab
1221433:1221735 forward MW:11298
MTAAHFMTDPQAMRDMARKFDMHAQNVRDESHKNFMSSMDIAGAGWSGTAQLTSHDTMGQINQAFRHIVTLLQ-
DVRDQLG (SEQ ID NO:122) TAADRYEHQEENSRKILSGS >BL;ML1056, ML.tab
1221787:1222074 forward MW:10260
MGNINYQFGEIDAHGAAIRAQAAALETTHQAILATVRDAAEFWGGQGSTAHEMFIADLGRNFQMIYEQANSHG-
QKVQRAS (SEQ ID NO:123) SSMADTDRSVSSAWS >BL;ML1065, ML.tab
1230644:1230988 forward MW:11933
VALVLLYLVVLVLVAIVLFGAASLLFGRGERLPPLPRGTTATVLPAHGVTGADVDAVKFTQVLRGYKPSEVDW-
VLDRLGR (SEQ ID NO:124) ELEALRGQLAAIADAEADADVSNVPSGDDGQDVT
>BL;ML1067, ML.tab 1231777:1232004 forward MW:7998
MLGAEWVREGGPARVWREHTMAAMKPRTGDGPLEATKEGRGIVMRVPLEGGGRLVVELTPDEAAALSDELKGV-
TS (SEQ ID NO:125) >BL;ML1077, ML.tab 1240278:1240697 forward
MW:15219 VMADRGQVFRRVFSWLPAQFASQNDAPVGAPRRFGSTEHLSVEAIAAFV-
DGELRMNAHLRAAHHISLCAQCAAEVDDQSR (SEQ ID NO:126)
TRAALRDSHPIRIPSTLFGLLTAIPRCSPDYTSPVSEPFSEGSVSDRFVDGVAREQGKR
>BL;ML1079, ML.tab 1242373:1242735 forward MW:13348
VFANIGWGEMLVLVVVGLVVLGPERFPGAIRWTLGALRQTRDYLSGVTNQLREDIGPEFDDLRGQFGELQKLR-
GMTPRAA (SEQ ID NO:127) LTKHLLDGDDSLFTGNFDRPAAAKQQDRDNHQTPFDTDAT
>BL;ML1093, ML.tab 1257729:1258586 forward MW:29736
MRVGRLAALLLAGVGVFVAGGCATDRGDRHPELVVGSKPDSESTLLAAIYVAALRSYGFGARGETGADPMAM-
LDSGGFTV (SEQ ID NO:128)
VPGFTGKVLQILQPRAAVLSAARVYRANVSALPEGIAAGDYT-
TAAEDKPTLVVTPDTAKAWSGSOLSLVLSHCNELVVGI
VAGTHTPSAVGSCRLPAAREFPDYPTMF-
AALRAGQLTAGWTTTANPDLPADLIVLTDGKATLIQAENVVPLYRRNVLTDR
QMLAINEVAGVLDTAALIEMRRQVIRGADSQAIADGWLAEHPMGR >BL;ML1096, ML.tab
1263932:1264606 reverse MW:23505
LAQVIERSVWIQGPAAEAYVARLRRTHPSASPTEIVAKLEKHYLAALTASGAVVGSVATLPGIGTLAAVSANA-
GETVFFL (SEQ ID NO:129)
EATAVFVLTIASVYGIPANHRERRRALVLAVLAGDDTRLTIGE-
LIGPGRTNGGWLLEGMASLPLSTWSQLHTRMLRYAAK
RCTVRRGALMFGKILPIGIGAAVGGAGNR- VVGKKIISNTRNAFGTAPSRWPATLILLPTVHNAG
>BL;ML1098, ML.tab 1266648:1270106 reverse MW:126012
VFVTDETIVYSASDLAAASRCEYALLRDFDAR-
LGRGPVVATTEDELFARTSALGADHEQRHLDQLRHEFGDAVAVIGRPA (SEQ ID NO:130)
YTYAGFAAAAEATQRAIANRAPAVYQAANFDGRFVGFIDFLVRDGEQYRVVDTKLARSPKVTALLQLAAYADA-
LAHSGVP
VAPEAELRLGDGMVVSYRICDLIPVYRSQRSLLQRLLDRHYTAGTAVRWQDDEVRSCFR-
CPQCTEQLRATDDLLLIAGMR
ISQRSKLLNVGITTIAELADHSGPVPDLSSSALSELTAQAKLQVQ-
QRNTGTPQFEIVDPQPLALLPDPDPGDLFFDFEGD
PLWTVDGQEWGLEYLFGVLDSEISGTFRPLW-
AHNRVEERKALTEFLKMVTKRRKQRPHMHVYHYAPYEKTALLRLAGRYG
VCEDEVDELLRSGTLVDLYPLVSKSIRVGAESFSLKALEPLYMGKQLRSGDVTTATDSITCYGRYCELLSAGN-
FDEAATV
LKEIEDYNHYDCRSTRELRNWLLLQAYEAGVVPVGAQPVPEGNTVKDDDELSAILSALS-
GFTGDVAVGDRTPEQTAIALV
AAARGYHRREDKPFWWGHFDRLNFPVGEWADNTDVFVADDASIII-
DWHTPPRARKPQRRVRLRGRLARGNLGSAVFALYD
PPAPLANDVHPGRRAAGRAEVVEADDLSIPT-
EVVIVERVGNDGNTFHQLPFALTPGPPIATTALRDSIESTATTLAASLP
QLPRTALIDILLRRIPRTHSGATLPRGTDTVADITAAVLDLDSSYLAVHGPPGTGKTHTAAHVITQLVSNHSW-
RIGVVAQ
SHAAVENLLDGVITAGLDARQVAKKRHDRSAPPWQEIDGNDYPTFIADPMGCVIGGTAW-
DFANRNRVPPGSLDLLVIDEA
GQFCLANTIAVAPAAANLMLLGDPQQLPQVSQGTHPEPVNTSALD-
WLVEGQRTLPNERGYFLDRSYRMHPAICAAVSTLS
YEGKLHAHTEYTAARRLNEYQPGVHVLAVHH-
QGNSTESPEEAGAITAEIERLLGTPWTDEHGTRPLDVSDILVLAPYNAQ
VALVRQQLMSAGFSGVRVGTVDKFQGGQAPVVFISMTSSSVEVVPRGISFLLNRNRLNVAVSRAQYAAVIVRS-
ETLTEYL PATPVGLIDLGAFLTLTTFNGTGRLESRLDKP >BL;ML1099, ML.tab
1270157:1270765 reverse MW:20773
VRDVWSLPCRKSLLGVAAVVLVSGTLTGCSSGDSTVAKTPVPPSTTTGTISTIISSAPSPPFATAAPPTSNTP-
PDDPCAV (SEQ ID NO:131)
NLASPTIARVVSELPRDPRSAQPWNPEPLAGNYNECAQLSAVI-
IKANTNAVNPTTRAVLFHLGRFIPQGVPDTYGFNGID
PAQTTGDTVALTYPSSIDGLATAVRFHWN- GNAVELISNIAGG >BL;ML1105, ML.tab
1279262:1279951 forward MW:24905
VLSGGAGSIPELNAQISVCRACPRLVDWREEVAVVKRRAFADQPYWGRP-
VPGWGSEQPRLLIVGLAPAAhGANRTGRMFT (SEQ ID NO:132)
GDRSGDQLYAALHRAGLVNLPISMDAADGLQANQIRITAPVRCAPPGNAPTQAEWVTCSPWLEAEWRLVSEYV-
RAIVALG
GFAWQIVLRLPGVSANRKPRFSHGVVAQLYAGVRLLGCYHPSQQNMFTGRLTPANLDDI-
FRDAKKLAGI >BL;ML1115, ML.tab 1291563:1292129 forward MW:19913
VRCDVRALALAARGLIELMIVIPMVAGCSNAGSNKSVGTISSTPGNTEGHHGPMFPR-
CGGISDQTMSQLTKVTGLTNTAR (SEQ ID NO:133)
NSVGCQWLAGGGIVGPHFSFSWYRGSP-
IGRERKTEELSRASVDDININGHSGFIAVGNEPSLGDSLCEVGIQFQDDFIEW
SVSFSQKPFPSPCGIAKELTRQSIANSK >BL;ML1116, ML.tab 1292138:1292701
forward MW:19839 MRLSVRGRRSVFAGVAVLVSAALVVTGCSRSIG-
GTAVKAGSHDVPRNNNSQQQYPNLLKECEVLTTDILAKTVGADSLDI (SEQ ID NO:134)
QSTFVGAICRWQAANPASLIDITRFWFEQGSLANERKVADFLKYKVENRSIAGVDSIVMRPDDPNGACGVASD-
AAGVVGW WINPQASGIDACGQAIKLMELTLATNS >BL;ML1117, ML.tab
1292709:1293194 reverse MW:17617 MRHAKSSYPRGFPDHIADHDRRLAPR-
GVREASLAGGWLRTNVPAIEKVLCSTAMRARETLTHSGIEAPVRYTERLYRADP (SEQ ID
NO:135)
DTVIKEIKAISDEVTTSLIVSHEPTISAVALALTGSGTNNDAAQRISTKFPTSGIAVLNVAGRWQHLELE-
SAELVAFHVP R >BL;ML1119, ML.tab 1294763:1295914 forward MW:41281
MRFLHTADWQLGMTRHFLAGDAQPRYSAARRDAVAGLGALAAEVGAEFV-
VVAGDVFEHNQLAPQVVSQSLEAIRAIGIPV (SEQ ID NO:136)
YLLPGNHDPLDASSVYTSALFTAECFDNINVLDRAGVHQVRPGLEIVAAPWRSKAPTTDLVAEMLGGLTADIV-
TRVLVAH
GSIDVFDPDRDKPSLIRLAGIDDALAGGAVHYVALGDRHSLTQVGSSGRVWYSGSPEVT-
NFDDIESNSGNVLVVEIDEND
PRRPVTVTARHVGHWRFFTLHWQVDNGRDIADLDMNLDQMMDKDR-
SVVRLALTGSLTITDRAVLDACLDRYARLFAWLGL
WERRSDLAVIPADGEFTDIGIGGFAAAAVDE- LVATAREGDTESAIDAQAALALLLRLADRGVA
>BL;ML1120, ML.tab 1295911:1298532 forward MW:94011
MKLHRLALTNYRGTARREIEFPDRGVILVCGAN-
EIGKSSMIEALDLLLEFRDRSTKKEVKQVKPANADIGSEVCAEISSG (SEQ ID NO:137)
TYRFVYRKRFNKKCETALTVLAPHREQLTGDEAHERVRAMLAETVDNDLWHAQRVLQAASTAAVDLSGCDALS-
RALDLAA
GDHAELSGTEPLLIERIEAEYRRYFTSTGRPTGEWAVAISRLSDAETAVGECAAAVAEV-
DDRVRRHAVLTERVAGLAQQR
FAAGPRLAVAQAAADKVAVLTRQAREAELVAAAATATNAAAVAAH-
TSRLRLLAEIDTRAVVLAATQDEAQEAVDAVSTMR
ADAEASDAAVEESTEALMAAQQRADIARSTV-
DQLVDRQEADRLSTRLAKIDAIQGERDLICAELSAVTLTEQLLQRIENA
AAIVDRTGEQLKLISAAVEFTATADIEISVGQQRVSLEAGQSWSTTATGPTEVEVLGVLTACVIPGATALDAQ-
SKYVAAQ
EELSVALADGGVVDLAAARCANQRRRELQSSLDQLSAALAGVCGDDQIDQLRARLEQLR-
DGYPGEPDLLAVDIGSARAEL
EAAETVRAAVDFEREVCRRTAAAANCRLVETSARANFLLKKAETQ-
RAELDQDIDQLAQQRASVSDEDLASAAEAGLRAVQ
IAEQRVAKLTEELVAAEPEAVTAELVAATAA-
AESLRDQHEDAAGALREISIELSVFGTEGRQGKLDTAEAEREHAISQNT
QIGRRARAAQLLRSVMARHRDTTRLRYVEPYRTELQRLGRPVFGPTFEVDIDSDLRIRSRTLDGITVPFESLS-
GGAKEQL
GILARFAGATLVAKEDNVPVVVDDALGFTDPDRLAKMGEMFDTVGAHGQVIVLTCSPDR-
YDGFTGAHRIDLNV >BL;ML1138, ML.tab 1329962:1330423 forward
MW:16168 VTTPAQDAPLVLPAVAFRPVRLFIINIVLTGLANLAAGLSGHLMVGVFF-
GIGLLLGLLNALLVRCSVESITAQGHPLKRS (SEQ ID NO:138)
MALNSASRLAIITVFGLIIAYAFPLAGLGVVFGLALFQVLLVLSTMLPVWRKFRFGEADGGVLKGSEGEEQQR
>BL;ML1147, ML.tab 1337937:1338380 forward MW:16518
MSAPMVGMVVLVVTLGAAVLALSYRLWKLRQGGTAGIMRDIPAVGGHGWRHGVIRYRGEAAAFYRLSSL-
RLWPDRRLSRR (SEQ ID NO:139)
GVEIVARRGPRGDEFDIMTDKIVVLELRDTTQDRRSGYE-
IALDQGALAAFLSWLESRPSPRTRRRSV >BL;ML1159, ML.tab 1354280:1355185
forward MW:31923 MAGAVDLSGLKQRARQKASTSDPASRATLGARG-
TGSSENTSVIEITEANFEDEVLVRSHEVPILVLVWSPRSDACVKLLE (SEQ ID NO:140)
TLSGLAVADSGTWLLATVNVDAVPRVAQIFGVDAVPTVVALAAGQPLSSFQGMQSVDQLRRWLDSLLSVTAGK-
LRGPTRS
EDSAEIDPAIAQARQQLEAGDFLTAKQSYHAILDADPASVEAKAAIRQIDFLTRATAQH-
PDAVAVADAAPGDIAAAFAAA
DVQILNQDVTAAFERLIVLVRSTSGDERSSVRTRLIELFELFDPD- DPNVIVGRRNLANALY
>BL;ML1166, ML.tab 1364964:1365551 forward MW:21484
VRKWKRVETANGPRFRSVVAPHEVALLKHLVGALLGLLNERESSSPLDE-
LEVITGIKAGNAQRPEDPTLRRLLPDFYTPD (SEQ ID NO:141)
DKDQLDPAALDAVDSLNAALRSLHEPEIVDAKRSAAQQLLDTLPESDGRLELTEASANAWIAAVNDLRLALGV-
ILEIDRP APERVPAGHPLSVIIFDVYQWLTVLQEYLVLALMAT >BL;ML1176, ML.tab
1372485:1372844 reverse MW:13792
MTRPETPQAPDFDFEKSRTALLGYRIMAWTTGIWLIALCYEIVSNLVFHHEIRWIEVVHGWVYFVYVLTAFNL-
AIKVRWP (SEQ ID NO:142) IGKTVGVLLAGTVPLLGIIVEHFQTKflVKTRFGLRRSRT
>BL;ML1177, ML.tab 1372841:1373221 reverse MW:14447
VSTTRRRRPALVALVTIAACGCLALGWWQWTRFQSASGTFQNLGYALQWPLFAGFCLYTYHNFVRYEESPPQ-
PRHMNCIA (SEQ ID NO:143)
EIPPELLPARPKPEQQPPDDPALRKYNTYLAELAKQDAENHN- RTTT >BL;ML1180,
ML.tab 1380300:1380587 reverse MW:10260
MGNINYQFGEIDAHGAAIRAQAAALETTHQAILATVRDAAEFWGGQGSTAHEMFIAD-
LGRNFQMIYEQANSHGQKVQRAS (SEQ ID NO:144) SSMADTDRSVSSAWS
>BL;ML1181, ML.tab 1380639:1380941 reverse MW:11298
MTAAHFMTDPQAMRDMARKFDMHAQNVRDESHKMFMSSMDIAGAGWSGTAQLTSHDTMGQINQAFRHIVTLLQ-
DVRDQLG (SEQ ID NO:145) TAADRYEHQEENSRKILSGS >BL;ML1182, ML.tab
1381004:1382269 reverse MW:43120
MFDFAALSPETNSTRMYLGPGSSPILTAAAAWVVLAKELTAAAQGLQSAVEALLTTFEGESAAALAERVTPYE-
KWLTQNA (SEQ ID NO:146)
ASAELTATQLTVAANAYETAFTMTVPPLMVFVNRAQACLLIMS-
NIFGQNSTAIAEKEAEYTEMWIQDAAAMTSYQASVLE
AVGATKAFTAPPLGVNEVGLAQEVVEEVV-
EEVVEEVVEEVVEAEQAISQAALDQAVNEGMEATVVPQVDQQVNVDVATPQ
TAVPDSSSAAAPQLWGGFAQHLSPINDTLSMINNHAGMANAGLSLVNGMGSANKSLAPTTTKAAESAFKANGS-
AVQSTGR
GLLGSSSGGHVTAQLGRAASIGSLRVPQTWTTASQPVTAATRALSPARVAVATESESAP-
LLGGGLPMAPMVPGGGSGTGG VNTALRLQPRAFVMPRNPAAG >BL;ML1183, ML.tab
1382326:1382625 reverse MW:10117
MPLFLNAEPQALTAAANTLEGLSAATVASNAAAAQLTTEIAPPAADDVSILLARFFSCHGRQYQAHASQCATN-
HQDLIQS (SEQ ID NO:147) LLTSSSAYAGTETANHDSL >BL;ML1190, ML.tab
1395291:1396010 forward MW:25342
MSVSRRDVLKFATVTPGLLGLGVAAAALCAVPASTAGSLGTLLDYAAGVIPASQIRATGAVGAIRYVSDPRGT-
WAVGKPI (SEQ ID NO:148)
QVTEARDLINNGLKIVSCYQYGKGNTADWLGGATAGLRHAQRG-
VQLHTAAGGPVSAPIYASIDSNPTYEQYKQQVAPYLR
SWESVIIGHQRTGVYANSRTIAWALQDGL-
ASYFWQHNWGSPKGYTHPAANLHQVEIDRRTVGGVGVDVNTILKPQFGQWA >BL;ML1221,
ML.tab 1443565:1443807 forward MW:8793
VVQDLPTPIGAGIYNIYTGVHRDELAGASMPTVAQLGLEPPRFCAECGRRMVVQVRPDGWRAKCSRHGQVDSV-
DMEAKR (SEQ ID NO:149) >BL;ML1222, ML.tab 1443804:1444400
forward MW:20601 VTEHCASDISDVSCPPRGRVIVGVVLGLAGTGA-
LIGGLWAWIAPPIHAVVGLTRTGERGHDYLGNESEHFFVAPCLMLGL (SEQ ID NO:150)
LTVLAVTASVLAWQLRQHRGPGMVIGLAIGLMICAATAAAVGALLVWMRYGALNFDAVPLSYDHKVAHVIQAP-
PVFFAHG LLQVAATVLWPAGIAALVYAVLAAANGRDDLGGRLCSR >BL;ML1232,
ML.tab 1465612:1466688 reverse MW:38687
MKRLLATTLAALTVGTVSGFGSTVASSEPGEPWLPPSPVPVRENSPAKIVYALGGARPPTFDWDYYTIRAGDE-
FFPDVNR (SEQ ID NO:151)
KLIDYPARAPFRYVPTFLVPGPRDEVTIGEAIAVATKNLNQAI-
HRGTEPAAVVGLSQGSLALDTEQEQLATDPTAPPPDQ
LTFNTFGDPSGYHGFGKSVLASIFRPGDY-
IPLIDYTMPQRMDSQYDSNRVVAAYDGLSEFPDRADNLLAQLNCFAGGAIS
HTPSGFFNPEDVPPQNIRTTVNSRGAKTTTYLIPVNHLPLTLPLRYLGWSDALVDQIDAVLQPKIDAAYAYND-
NPLNKPI SVDPVNGMDPIAGIDAELRDSILNVFAQLRSILPPPPG >BL;ML1233,
ML.tab 1466853:1467545 reverse MW:24400
MWITILLMAIAISLEPFRIGMTVLMLNRPRPTLQLLVFLCSGFTMGMTVGFVVLFVFRRRLMASMQLTLPKVQ-
ILIGVLA (SEQ ID NO:152)
LLVAAVLTVQVCISSEPPAESPVDSASGPPKPSKWAPRPLARL-
LNGDSLWVAGVAGLGIALPSVDYLAALAVILTSGAAA
TTQVGALLMFNVVAFALIEIPLAAYLLAP-
DTTRAWTAALNNWIRSRRRLEVATLLAGVSCLLLAVGIAGL >BL;ML1244, ML.tab
1482817:1484292 forward MW:52449 MADSVEGIGPFDELGALDYLLHRGEA-
NPRTRAGIMAVELLDTTPDWNRFRSRIEDVSQRVLRLRQKVVVPTLPTAAPRWV (SEQ ID
NO:153)
VDPDFELDFHVRRVRVPDPGTLREVFDLAEVIQQSPMDVSRPLWTATLVEGLAAGRAANLLQISHAITDG-
VGSVEMFAENI
YDLERDPPSRPRSPQPIPQDLSRNDLMLQGINHLPVALAGGVXTGGLSGVASVAG-
RAILRPASTVSGVVGYVRSGIRVLSQ
AAEPSPLLRQRSLATRTEAIEIQLSDLHKAAKAGDGSIND-
AYLAGLVGALRRYNEALGVSISTLPMAVPVNVRTEADVVG
SNRFVGVTLAAPLGTNDPAARMQKIR-
SQMTQRRDEPAMNIIGSLAPLMTVLPASVLDFIVDSVASSDVNASNIPAYPGDT
YFAGAKILRQYGIGPRPGVAMMAVLMSRGGFCTVTVRYDRASVKSEALFARCLLEGFDEVLALAGDPTPHAVP-
ASFAARS SGSPAGWLSSS >BL;ML1249, ML.tab 1490899:1495767 forward
MW:177878 MTIDPGATHVAELCTTFTQGADVPDWISKAYI-
DSYRGSHGDVREAPETSRVNPNALVTPAMLSAHYRLGQCRPNGRNCVR (SEQ ID NO:154)
VYPADDPAGFGPALQIVTDHGGMVMDSITVLLHRLGVTYTAMMTPVFMVLRSPTGELLGVEPRASSTSHSIEG-
TWVGEVW
IYIQLLPAVDSKSLAEVEQLLPRTLVDVQRVAADAAALNATLSGLAADVKTNKEGHFSA-
SDRDDVAALLHWLGNGNFLLL
GYQRCRVHYGLVSCDRSTGLGVLRARTGSRPRLTDDNELLVLAQA-
AVGNYLRYGAYPYAIAVREYDDGGDCGIIEHRFVG
LFTVAAMNADVLEIPSISHRVRAALAMANSD-
PIYPGQLLLDVIQTVPRSELFTLSAERLFTMAKEVVDLGSGRRALLFLR
ADRLQYFVSCLVYVPRDRYTTGVRLQIEDILVREFGGTQVEFTARVSESPWALMHFMVRLSEGAATGSVDVSE-
GNRIRIQ
AMLSEAARTWSDRLIAAAASFSEGSVSYAEAEHYAATFSETYKQAVTPADAIDHIAIIK-
ELADDSVKLVFFERKADGFAQ
LTWFLGGRSASLSQLLPMLQSMGVVVLEERPFTVARTDGLPVWIY-
QFKISPHPTIPLASTANERELTAKRFSDAVTAIWQ
GRVEIDRFNELVMRARLTWQQVVLLRAYAKY-
LRQAGFNYSQSYIESVLNEHPSTARSLVALFEALFDPSPLSSSTNCDAQ
AAAAAVAADIDALVSLDTDRILRAFASLVQATLRTNYFVTQKFSARSKGVLVLKLDAQLINELPLPRPKFEIF-
VYSPRVE
GVHLRFGAVARGGLRWSDRLDDFRTEILGLVKAQAVKNAVIVPVGAKGGFVLKRPPLPT-
GDAAADRDAMRAEGIACYQLF
ISGLLDITDNVDHATGKVNAPPQVVRRDSDDAYLVVAADKGTATF-
SDIANDVAKSYGFWLGDAFASGGSVGYDHKAMGIT
AKGAWEAVKRHFREMGVDTQNEDFTVVGIGD-
MSGDVFGNGMLLSKHIRLIAAFDHRHVFLDPDPDAAVSWAERQRMFDLP
RSSWDDYNKSLISEGGGVYSREQKAIPTSPQVRTALGIDGEVTEMAPPNLIRAILQAPVDLLFNGGIGTYIKA-
ETESVAD
VGDRANDPVRVNANQVRAKVIGEGGNLGVTALGRVEFDLSGGRINTDAMDNSAGVDCSD-
HEVNIKILIDSLVTAGKVKVE
ERKHLLESMTDEVARLVLTDNEDQNDLIGTSRANAANMLSVHAMQ-
IKYLVDERGVNRELEALPSEKEIQRRSEAGIGLTS
PELSTLMAHVKLALKEQMLATELPDQOVFVS-
RLPRYFPKPLRERFTPEIRSHQLRREIVTTMLINDLVDTAGISYAFRIA
EDIGVGPIOAIRTYVATDAIFGVGOVLRRIRAANLSVVLSDRMTLDTRRLIORAGRWLLNYRPQPLAVGAEIN-
RFAAKVK
ALTPRMSEWLRGODQAIVEQQATEFVSQGAPEDLAYRVAVGLYRYSLLDIIDIADITEL-
DPAEVADTYFSLMDRLGTDGL
LTAVSKLPQNDRWHSLARLAIRDDIYASLRSLCFDVLAVGEPDES-
GEEKIAEWEHISASRVERARLMLAEIHASGEKDLA TLSVAARQIRRMTRTSGRGSSG
>BL;ML1255, ML.tab 1499768:1500259 forward MW:16842
MGNQSSQSEVAPVVRGDVVTELPKGWVITTSGRVSGVTEPGDRSVHYPFPIKDLVALDDALTYSSRASHARFA-
VYLGDLG (SEQ ID NO:155)
NDTAALAREILAQVPTPDDAVLVAVSPNQCAIEVVYGSQVRGR-
GAESAAPLGVAAASSAFEQGNLIDGLISAVRVLSAGI SRS >BL;ML1270, ML.tab
1513725:1514522 forward MW:27833
VMAPDIKSARAGRLTIQIAQLLLVVAAGALWMAARLPWVVIRSFDGLGPPKEVALSGASWSAVLLPLALLMLA-
ATVAAIA (SEQ ID NO:156)
VRGWPLRVLAGLLAVASFLVGYLGVSLWVLPDVTVRGAVLAHV-
SLLSLVGSQRHHLGAGAAVAASGCTLIAAVLLMRSAS
VIGSARQGTSKYVVPAQRRSIARRDGAAT-
AISQMSERMIWDALDEDRDPTDRLREPDTEGRWWTACRRSLPFMNVVEIGG
CTGSVAGRWVTSGKGNDTHVSGNCA >BL;ML1296, ML.tab 1545740:1546075
forward MW:12349 MYRFGMRYLDSMTVDRHVAGNEFTVEEISTGIF-
ASGYGQVGDGRSFSFHIEHWSLWEIYRTRLAGLVPQTEEVVPRA (SEQ ID NO:157)
IRGLVNIDLTDERSLAAAVRDLVARTLTVSG >BL;ML1299, ML.tab
1547791:1548381 forward MW:21454 MARAIHVFRTPDRFVAGTVGQPGNRTFYIQAVH-
DSRVVSVVLEKQQVAVLAERIGALLLEVHRRFGTPVPPEPAEINDLN (SEQ ID NO:158)
PLVMPVDAEFRVGTMGLGWDSEAQTVVVELLAVTDAEFDASVVLDDTDEGPDAVRVFLTPESARQFATRSNRV-
ILAGRPP CPLCDEPLDPEGHVCARTNGYKRSALLGPKDDDTEW >BL;ML1300, ML.tab
1548392:1549180 forward MW:28681
VLRNGELTVLGRIRSASNATFLCESTLDQRSVHCVYKPVSGEQPLWDFPEGTLAGRELSAYLVSTDLGWNIVP-
YTVIRDG (SEQ ID NO:159)
PAGPGMLQLWVQQPGDVADSAPRSGPDMVDLFPADKLQSGYLP-
VLRSYNYAGDEVILMHADDTQLRRLAVFDVLINNADR
KGGHILYGLDGHVYGVDHGVSLHVEDKLR-
TVLWGWAGKPIDNQTLEEVAGLADALSGPLADTLAGQITWAEIIALRRRAY
ANLDNPVMPGPNRDRAIPWPAF >BL;ML1306, ML.tab 1555819:1556643
reverse MW:30377 VAAFEGWNDASDAASGALEHLNAVWEADPIVEI-
DDEAYYDYQVNRPVIRQVDGVTRELVWPAMRISYCRPPGSDRNVVLM (SEQ ID NO:160)
HGVEPNMRWRTFCTELLTIADRLNVDTVVILGALLADTPHTRPVPVSGAAYSPESARRFGLEETRYEGPTGIA-
GVFQDAC
VAARIPAVMFWAAVPHYVSHPPNPKATVALLRRVEDVLDVEVPLADLPTQAEDWEQAIT-
EIAAEDDELAEYVHSLEQRGD AEVDVNDALGKIDGDALAAEFERYLRRRRPGFGR
>BL;ML1315, ML.tab 1567372:1567956 reverse MW:20428
VSRWTHRTFFIALSAIVTTAGFGSSGCAHGNSSTSESAVPSTFPGISSSITAPPATGLPAPEVLTNVLSRLAD-
PNIPGID (SEQ ID NO:161)
KLPLIESATPDSAVTLDKFSNALRONGYLPMTFTANNIAWSNK-
NPSDVLATISVNIAQTNNSVFSFPMEFTPFPPPQQSW
QLSKRTADMLLEFGNSSGLTNPAPIKAPT- PTPSH >BL;ML1321, ML.tab
1575493:1575684 forward MW:7067
MAQEQTRRGGGGDDDEFTSSTSVGQERREKLTEETDDLLDEIDDVLEENAEDFVRAYV- QKGGQ
(SEQ ID NO:162) >BL;ML1334, ML.tab 1587322:1588140 forward
MW:29047 MSEPQGSDPGKQWQSPGEGVENHSSDQPTQAAS-
PWQQQPSTQDSTWHPPAYASPECYNYPQLTEPVYPHQYPSATPGYGQ (SEQ ID NO:163)
PGYFGAQFSQCGIPGQYPQSGSPGQYGSPGQYGPPGQYGPPGQYGPPGQYGPPGQYGPPGQYGPPGQYSQQFQ-
PYEQPGT
KGFVALIGSIAGVIGVLIFAAILVTGFLWPAWLVTTKLDVNKAQASVQQVLTDETNGYG-
AKNVKDVKCNNGADPTVKKGD TFDCSVSIDGMQKRVTVTFQDDKGTYEVGRPQ
>BL;ML1338, ML.tab 1595449:1596771 reverse MW:50163
VYGALVTAADSTQTGLRNWLLAVFHPRTHTPSTATIVRSALWPAAILSVLHRSTVITTNGNITDDFKPVYRAV-
LNFRHGW (SEQ ID NO:164)
DIYNEHFDYVDPHYLYPPGGTLLMAPFGYLPFAPSRYLFILIN-
TGAILIAWYLILRLFKYTLSSVAAPTLLLAMFCTETV
TSTLVFTNINGCIMLLEVLFLWWLINGSE-
PKTVSQQWWAGGAIGLTLVLKPLLGPLLCLPLLNRQWQALVPAIALPVVIN
LAALPLVSHPMDFFTRTVPYILGTRDYFNSSIEGNGVYFGLPTWLIVFLRLLFTVLAICSLWLLNRYYRTRDP-
LFWFTCS
TGVLLLWSWLVLPLAQGYYSMMLFPFLMTVVLPNSLIRNWPAWLGIYGFLTLDRWLLFN-
WMRYGRALEYLKITYGWSLLL IVVSTVLCFRYLDAKAENRLDHGIDPAWLTAERERASVNA
>BL;ML1357, ML.tab 1617916:1618101 forward MW:6609
MTIDPDQIRAEIDALLAQLPDFADLEDSVSGLSLAQLEEVALRFSEVHSVLLQALESAEKG (SEQ
ID NO:165) >BL;ML1361, ML.tab 1621823:1623004 forward MW:42405
MRMSALLSRNNSRPGLVGTARVDRNIDRLLRRICPGDIVVLDVLDLDRITADALVEA-
DIVAVVNASPSVSGRYPNLGPEV (SEQ ID NO:166)
LVNNGVTLIDETGPEVFKKIKDGAKIR-
LHEGGVYSGDRRLICGTERTDHDIADLMREAKSGLATHLEAFAGNTIEFIKSE
SPLLIDGIGIPDIDVDLRRRHVVIVADEPSAADDLKSLKPFIKEYQPVLVGVSGGADVLRKAGYRPQLIVGDP-
EQISTEA
LRCGAHVVLPADADGHAPGLERIQDLGVGAMTFPAAGSATDLALLLADHHGAALLVTAG-
HTANIETFFDRTRTQSNPSTF
LTRLRVGEKLVDAKAVATLYRNHISFGAIALLALIMLIAVIVALW-
VSRTDGVVLNGVIDYWNRFSLWIQRLIA >BL;ML1362, ML.tab 1623026:1623979
forward MW:32360 MISLRQHAFSLAAVFLALAVGVVLGSGFLSDTL-
LSSLRDEKRDLYTQISGLNDQKNMLNEKVSAANNFDNQLLGRIVHDV (SEQ ID NO:167)
LGGTSVVVFRTPDAKDDDVAAVSKIVVQAGGTVTGTVSLTQEFVDANSTEKLRSVVNSSILPAGAQLSTKLVD-
QGSQAGD
LLGITLLVNANPAVPNVGDAQRSTVLVALRDTGFITYQTYNRNDHLGAANAALVITGGL-
LPQDAGNQGVSVARFSAALAP
HGSGTLLAGRDGSATGVAAVAVARADAGMAATISTVDNVDAEPGR-
ITAILGLHDLLSGGHTGQYGVGHGATSITVPQ >BL;ML1389, ML.tab
1665174:1666757 reverse MW:57894 VTSIMSVSALEQSAADVGDNSARQHAHGALPDS-
LAIAMLATVISGAWASRPSLWFDEAATISASASRTVPELWRLLSHID (SEQ ID NO:168)
AVHGLYYLLMHGWFAIFPSTEFWSRVPSCLAIGAAAAGVTVFTRQFATRTTAVYAGIVFAILPRITWAGIEAR-
SSALSVA
AANWLTVLLVASVQRNRPRLWLCYALTLMLSILLNLTLATLVLVYAVILPWLAPNKFRN-
SPFIWWAVTSVVALGTITPFI
LFAHGQVWQVDWIFRVSWHYVFDITQRQYFDHSVSFAIATAVIIV-
PAIATRLAGLRAPAGDLRSLVIICTAWIVIPTTLM
VGYSAVIEPVYYPRYLILTAPAAAIVIAVCI-
VTVARKPWPIAGVLVLFAVAAFPNYLFTQRGRYAKEGWDYSQVADVISS
QAAPGDCLIVDNTVPWRPGPIRALLAARPAAFRSLIDIERGFYGPTVGTLWDGHVPVWLVTAKINKCSTVWTV-
SDRDTSL PDHQAGQLLSPGLILGRAPAYQFPSYLGFRIVERWQFHYSQVIKSTR
>BL;ML1399, ML.tab 1679214:1680188 forward MW:34653
LPPSAKSTYPGQVEGAPHDGTPSAPQEPDEDTVTVPAPALIRRSSVSMPNAAQWLHTTNRSPRLVANVRRARR-
LLPGDPD (SEQ ID NO:169)
FGDPLSTAGEGGPRAAARAADRLLGDRGAASREVSLSVLQVWQ-
ALTEAIARRPVNPEVTLVFTDLVGFSGWSLQAGDEAT
LALLRQVARAVESPLLDAGGHIVKRMGDG-
IMAVFRDPSVAVQAVLAATEAMKSVEVGGYTPRIRVGIHTGRPQRLAADWL
GVDVNIAARVMERATKGGIMISGPTLDLIPQSDLKELGIITRRVRKPMFTSKFTGIPPDMVIYRIKARRELTA-
SDETAQT NSLT >BL;ML1439, ML.tab 1733347:1733682 forward MW:12946
MADRVLRGSRLGAVSYETDRNHDLAPRQIARYRTDNGEEFEVPFADDAD-
IPGTWLCRNCMEGTLIEGDLPEPKMVKPPRT (SEQ ID NO:170)
HWDMLLERRSIEELEELLKERLDLIRSRRRG >BL;ML1444, ML.tab
1738595:1739293 forward MW:25220 MTKIQIDAPDGPIDALLSVPPGPEPWPGVVVIH-
DAIGYEPDKESTNNHIAMAGYVAITPNLYSRGSRARCITRVMRELLT (SEQ ID NO:171)
KRGRAFDDILATRDYLLAMPKCSGRVGIAGFCMGGRFALVMSPKGFDASAPFYGTPLPRNLSETLNGACPIVA-
SFGGRDP
LGIGAPKRLRQATQTRHITTDIKVYPDAGHSFANKLPGQPLMRITGFGYNQAATEDAWS-
RVFAFFDKHLRTD >BL;ML1446, ML.tab 1740098:1740484 reverse
MW:13769 VTPTFADLAKAKYILLTTFTKDGRPKPTPIWAAADGDRLLAISAGKAWK-
VKRIRNNPRITLATCNVRGCATSAGVQGNAT (SEQ ID NO:172)
ILDKLQTGSVYDAICKQYGIQGRLFNFVSKLRGGMQNNVGLELRVSGS >BL;ML1470,
ML.tab 1768618:1768989 reverse MW:12909
MTDSPGEHGPQKPLPSADPWKSFAGVMAAILFLEAIVVLLALPVLGASGGLTLSALSFLIGLAGLLIVLVGLQ-
RKAWAIW (SEQ ID NO:173) VNLGVQVVVLVGCAVYPVLGFVGVLFAGLWALIVYFRAEVSRR
>BL;ML1485, ML.tab 1787915:1788538 reverse MW:23064
MPEKASVKYQADLWFDPLCPWCWITSRWILEVEKVRDVEVHFHVMSLAILNENREDLTENYLENITKAW-
GPARVVIAAEQ (SEQ ID NO:174)
ANGASVLDPLYTAMGIRIHNEDNKNLDEVIKRSLADTGL-
PAELAAAAQSNAYDDALRESHHAGMDPVGDDVGTPTVHVNG
VAFFGPVLSKIPRGEEAGKLWDASL- TLAAYPHFFELKRTRTEPPQFA >BL;ML1494,
ML.tab 1801370:1801723 reverse MW:12353
MRIVVLVLFAADGVLSAVVGANLMPLYIGSVPF-
PISGLISGLVNAVLVWAARLWTRSPWLVALPLWVWLLTVGLLSLGGP (SEQ ID NO:175)
GVDVVSGQSVMASGALWLILLGVSPPACVLWRCNRYG >BL;ML1503A, ML.tab
1815091:1815375 forward MW:10624 MTVLTDEQVDAALLDLNDWKHTDGAL-
CRSIKFPTFLAGIDAVCRVAEHAETKDHHPDIDIRYRTVTFILVTHYADGITKK (SEQ ID
NO:176) DITMARDIDRLIGD >BL;ML1505, ML.tab 1816758:1817306
forward MW:17516 MATIWTYLRATAIVVGSSAALLTGGIAHADTAPAPAPAPAPAPAPALNI-
PQQLISSAANAPQILQNLATALGATPPVTPS (SEQ ID NO:177)
APGISFPGLTPAAATVPTSSAATLPSLPGIMAPAISNTPTTPGLPASTPGFPQARVDMPAMPPLPVSVPPQIS-
LPGDLQA LASSASAAAAPVAAPTLLSALP >BL;ML1506, ML.tab
1817334:1818380 forward MW:34620 LKTGGFVTSAWNLPKGLVAVVTASTA-
AFGLCQNAAADPANPYGTPPNPTQQLPGLPALAQLSPIVQQAANNPQQATQLLM (SEQ ID
NO:178)
EAVSALTQNPTAPIASKNLATSVSQFMQEPNNPNPGASALDIPTSGVPAPAANGITPLDVVLVPHLPSAG-
AEPGAQAHLP
TGIDPVHAAGPATAATPTPGSPTNRTAAPPTPVASPAPTTPELPATTPGFGPDAPP-
TQDFIYPSISTNCLADGSSSIATA
LSVAGPAKIPLPGPGPGQAAYVFTAVGTPGPADVQKLPLNVT-
WVNLTTGKSGSATLKPRPDINPEGPTTLTVIADTGSGS IMSTIFGQVTTKEKQCQFMPTIGSTVVP
>BL;ML1508, ML.tab 1819557:1820048 reverse MW:18221
MRFGASLLRVRRRLYGMGSQVFDDKLLAVISGNSIGVLATIKRDGRPQLSNVQYHFDPRDLAVRVSITE-
PGVKTRNLRRD (SEQ ID NO:179)
PRASILVDVDDGWSYAVAEGTAELTPPAAAPDDDTVEAL-
IVLYRNIVGEHLDWDEYRQAMVTDRRVLLTLPILHVYGLPP GIR >BL;ML1525, ML.tab
1840011:1840466 reverse MW:15892
MTKTLLVVHHTPSPTTRELLEAVLAGANDSEIDGVEVVSRPALAATIPDMLNADGYLFGTTANFGYMSGALKH-
FFDTVYY (SEQ ID NO:180)
PILDNVSGRPYGLWVHGMNDTVGAAVAVGKLATGLSLTHAADV-
LEVVGPVDAMVCERAHELGGTLAAMLME >BL;ML1526, ML.tab 1840474:1840956
reverse MW:17670 MNDTVMTKRSLYILYIQLLIAALCFANLAYLVL-
LGRMAVANIGSGQAAAVSLGLALLIMPVIGLWANIATLRAGFAHQKL (SEQ ID NO:181)
ARLIAADGMELDTSVLLRRPSGRIQRDAADALFATVRAELAGDPDNWRCWYRLARAYDYAGDRRRAREANKTA-
LELHGRD >BL;ML1537, ML.tab 1853559:1854764 reverse MW:43686
MKAQRRLGLALSWSRVTTVFVVAIVVLVLASHVPELWQAGHHVAWCVGVGITVVVMV-
LVLVSYHGITLMSGLATWVWDCF (SEQ ID NO:182)
ADPRAALAAGCTPAIDHQRRFGRDVVG-
VREYKGRLVTVIAVDDGEDDPVGRHRHRTTPAGLPVQAVADGLRQFDIRLDSI
DIVSVKIRRGGNAAKFSALDNWGSEEWGLVCACPPTYQRCTWLVLRMNPQHNVVAVAARDSLASTLVAATERL-
AQDLDGQ
NCAARPLAAGELAEVDSAVLADLEPTWSYPGWRHIKYVNGFATSFWVTPSDIDSETLNE-
LWLSDLPDIKATVITIRLASR
SCQTRLSAWVRHHSETRLSKEACRGLNQLTGRQLAAVRASLPAPA-
TRPVLVVPSRELSDHDELVLPVGQGREGSASLFAG Q >BL;ML1540, ML.tab
1858295:1859197 reverse MW:32551
MDHQSTRTDITVNVDGFWLLQALLDIRHVAPELRCRPFVSTDSNDWLHEHPGMAVMREQGIVVNGNVNEHVAT-
RMRVFAA (SEQ ID NO:183)
PDLEVVALLSRGKLLYGAVNDENQPPGSRDIPENEFRVVLVRR-
GSHWASAVRVGNDITIDDVAIADSASITSLVMDGLES
IHHADPAAINAVNVPLEEMLEATKSWQDA-
GFNVFSGRDLRRMGISAATVAALGQALSDPAAEAAVYARQYRDDAKGPSAS
VLSLKDSSSGRIAIYQQARTAGLGDAWLAICPATPQLVQLGLKTVLDTLPYGDWKTHSRV
>BL;ML1544, ML.tab 1866369:1867889 reverse MW:53849
VAEESSGRRGSEYGLGLSTRTQVTGYQFLARRTAMALTRWRVRMEVEPGRRQTLAVVASVSAALLGCLGALLW-
SFISPSG (SEQ ID NO:184)
QLNESPIIIDRDSGALYVRVGDRLYPALNLASARLITGRPDNP-
HAVRSSQIATLPHGPLVGIPGAPSELSPETPQTSSWL
VCDTVATSTGIGSSSGVTVTVIDGSPDLS-
GHRRVLTGSDAVVLHYGGDAWVIRQGRRSRIDAMNRSVLLPLGLTPEQVSQ
ARPMSRALFDALPVGPDLVVPDVPNEGNPASFPGAPGPVGTVIVTPQISGPQQYSLVLADGVETVSPLVAQIM-
QNAGKPG
NTKLITVAPSVLVKMPVVNKLDLSVYPDTPLQVVDLRENPSTCWWWERTAGESRSRIQV-
ITGSTIPINSADVKKVVSLVK
ADTTGREADQVYFGSNYANFVAVTGNNPAAQTTESLWWVTRTGAR-
FGVEDTKDVRDALGLNGTPNLAPWVALRLLSQGPT LSRADALMEHDTLPMDMTPAELVVPK
>BL;ML1548, ML.tab 1872638:1873603 reverse MW:37427
VAKQEPMCGVEPTLWAISDLHTGHVGNKPVAESLYPLSPDDWLIVAGDVAECTDEIRWTLELLRRRFAKVIWV-
PGNHELW (SEQ ID NO:185)
TTNRDPMQIFGRARYDYLINMCDQMGVVTSEHPFPLWTERGGP-
ATIVPMFLLYDYTFLPTGADSKAKGLAIARERNVVAT
DEYLLSSEPYATREAWCRDRLDVTRSRLE-
QLDWMTPTVLVNNFPLVREPCDALFYPEFSLWCGTTKTADWHIRYNAVCSV
YGHLHIPRTTWYNEVRFEEVSVGYPREWRRRKPYSWLRQVLPDPQYAPGYLNDFGGHFVITSEMRAQAVQFRE-
RLRQKQS R >BL;ML1557, ML.tab 1883016:1883336 reverse MW:11540
VRTCVGCRKRELAVELLRVVAPSTGKGSYAVIVDTASSLSGRGAWLHPD-
MQCVQQAIRRRAFTGALRIAGSPDTSAVVEH (SEQ ID NO:186)
IEFLSELDRPGNRTGSKEHEHTVKSR >BL;ML1560, ML.tab 1885239:1885775
forward MW:17494 MLLVPSAVPVINRRGVLAGGATLTVLGVCFSAC-
SKSPSKTPEIEELLGPLDQARHDSALASAAAAAIGNLPQITAALAVV (SEQ ID NO:187)
ATQFAAHARALVTEISRATGKLASSSSDTTSPGLSPASPPSKPPPPVSDVIDALRTSAEGAGRLVSTASGYRA-
GLLASIA ASCTASYTVALVSGGPSI >BL;ML1561, ML.tab 1885772:1886269
forward MW:17495 MTSIEPSAPTPVATPKRTPVSQDSDNAGLSEAL-
VVEHSTIYGYGIVLALSPPNANSLVVDALIQHRQRRDDIIVMLTARR (SEQ ID NO:188)
VSPPVAASGYQLPMLVGSAADAARLAVRMENDGATAWRAVAEHAETADDRTFAAMALAQSAVMAARWNKMLGA-
WPITTTF PGSNE >BL;ML1584, ML.tab 1909590:1909844 forward MW:9465
VASTEGEHNAGVDPAEVPSVAWGWSRINHHTWHIVGLFAIGLLLANLRGN-
HIGHVENWYLIGFAALVFFVLIRDLLGRRR (SEQ ID NO:189) GWIR >BL;ML1607,
ML.tab 1932700:1932990 reverse MW:11018
MTTHKAMTRVQLEANGEVFAVDNLTRMGLRGLHCNWRCRYGECDVIASETAHRTVVSRLRSIAATVMEGSRRS-
APEQKVR (SEQ ID NO:190) WLRWLAGLWPANQDEF >BL;ML1610, ML.tab
1935020:1935325 reverse MW:12258
MSAEDLEKYETEMELSLYREYKDIVGQFSYVVETERRFYLANSVEMMPRNTDGEVYFELRLADAWVWDMYRPA-
RFVKQVR (SEQ ID NO:191) VVTFKDVNIEEVEKPELRLPE >BL;ML1624, ML.tab
1947637:1949427 forward MW:65207
MTQSDHRHTRPSSYGGTVLTQRVSQATVEDSRPLRGWQRRAMVKYLASQPRDFLAVATPGSGKTTFALRVMTE-
LLNSHAV (SEQ ID NO:192)
EQVTVVVPTEHLKVQWTRAAATHGLALDPKFSNTNPRTSPEYH-
GVTMTYAQVAAHPTLHRVRTEGRRTLVIFDEIHHCGN
AKAWGDAIREAFSDATRRLALTGTPFRSD-
DKPIPFVTYALDADGLMHSQADHTYSYAEGLADGVVRPVVFLVYSCQARWR
NSAGEEHAARLGEPLSAEQTARAWRTALDPSGEWMPAVISAADQRLRQLRTHVPDAGGMIIASDQTAARAYAN-
LLAQMTS
ETPTLVLSDDPGSSARITEFAKNTSQWLIAVRMVSEGVDIPRLSVGIYATSASTPLFFA-
QAIGRFVRSRHPGETASIFVP
SVPNLLQLASELETQRNHVLGKPHRESTDNPLGGNPATMTQTEQD-
DTEKYFTAIGADAELDQIIFDGSSFGTATPAGSEE
EAYYLGIPGLLDADQMRALLHRRQNEQLQKR-
TAAQQASSTPDRTSGAPASVHGQLRELRRELNSLVSIAHHHTGKPHGWI
HNELRRRCGGPPIAAATHDQLKARIDAVRQLNAEPS >BL;ML1638, ML.tab
1975122:1975820 reverse MW:25842 LSKLDDELSRIAHRANYLPQREAYER-
MRVERTGANDRLVAVQIALEDVDTQVFLLESEIDAMRQREDRDRLLLNSGATDA (SEQ ID
NO:193)
KQLSDLQPEFGTWQRRKNSLEDSLREVMKRRGELQDQLTAELGAIERMQTDLVGARQTLDVAPAEIDQVG-
QPHSSQCDVL
IAELAPALSAPYERLCAGGGLGVGQLQGHRCGACRSEIGRGELSCISVDVDDEVVK-
YPESGAIQLLDKGFFQ >BL;ML1644, ML.tab 1978961:1979773 forward
MW:30049 MRHLALLLRPGWIALTLVVTAFTYLCFMVLAPWQLGKNTRMSRENNQIE-
YSLNTPPVPVKTLLSHQDLSTSKSQWRQVTA (SEQ ID NO:194)
TGRYLPDVQVLARLRVVDSGQAFEVLAPFVVDDGPTVLVDRGYVRPEPGSHVPPIPRPPNEAVSITARLRDSE-
PVMKDKE
PFSRDGVQQVYSINIEQVARLTKIPLAGSYLQLVDNQPGGLGVIDIPHLDAGPFLSYGI-
QWISFGIIAPIGLGYLAYAEI RTHRQEKLAKPSQAPMAVEEKLADRYGHPR >BL;ML1649,
ML.tab 1987829:1988251 reverse MW:15144
VVAADHTPSFARKLGIQQGQVVQEWGWDEDTDDDIRASVEEACGGELLDEDTDEVVDVVLLWWRDGDGDLVDT-
LMDAITP (SEQ ID NO:195)
RAEDGVIWVLTPKTGKPGHVPPAEIAEAAPTAGLMLTSSVNLG- DWSASRLVQPKSRIGKR
>BL;ML1652, ML.tab 1992060:1993304 forward MW:44873
VNNNHFVPAPFRRLPLELLDTVPDSLLRRLKQYSGRLATEAVTAMQERL-
PFFADLEASQRASVTLVVQTAVVNFVQWMQN (SEQ ID NO:196)
PHSDVSYTAQAFELVPQDLARRIALRHTVDMVRVTMEFFEEVVPLLARSEEQLTALTVGILKYSRDLAFTAAT-
AYADAAE
ARGTWDSRMEASVVDAVVRGDTGPELLSRAAALNWDTTAPATVVVGTPTPNHDGPNGQV-
SSERASQEVREIAARHGRAAL
TDVHGTWLVAIISGQLAPTDKFFSDLLHAFSDGPVVIGPTAPMLT-
AAYHSASEAVSGMNAVAGWSGAPRPVQARELLPER
ALMGDASAIVALNTDVMLPLADAGPTLIETL-
DAYLDCGGAIEACARKLFVHPNTVRYRLKRITDFTGHDPTLPRDAYVLR VAATVGKLNYPTHH
>BL;ML1660, ML.tab 2001607:2002260 reverse MW:23449
VTVALALRDGCRRISTMTRQYGVTTQRHLISPVVFDITPLGRRPGAIIALQKTVPSLARIGLELVVIEWGAP-
INLDLRVE (SEQ ID NO:197)
SVSEDVLVAGTVTAPTVSECVRCLTAVHGHVQVTLNQLFAYP-
YSATKVTTEEDAVGHVVDGTIDLEQSIIDAVGIELPFA
PMCRSDCPGLCAECGTSLVVEPGHPHDR- IDPWWAKLTDMLAPDVPQTSETDGSRSEW
>BL;ML1666, ML.tab 2009601:2010245 reverse MW:23370
VSAQPVERPGDLKPAPASVLPMPVPTAWWVLIA-
GVIGLVASMMLTVEKIRILLNSAYVPSCNVNPIVACGSVMSTPQASV (SEQ ID NO:198)
LGFPNPLLGIVGFTLVTVTGVLSVAEVSLPQWYWIGLAVGTLAGVGFVHWLIFQSLYRIGALCAYCMVIWAVS-
VSLLVVV TAIVFRPLLEVLPGRTSAIARGIYQWRWSIATLWFITVFLLIMVRFWNYWQTLL
>BL;ML1677, ML.tab 2021235:2021810 forward MW:19668
VTSNSDAGSAVDAGGPPRTVIIAAVVLTAATIGTILVLAATLHEPPQPVVITAVPAPQATTAACRSLTQALPQ-
RLGDYER (SEQ ID NO:199)
APVAQPAPDAVAAWRTGSDTEPVVLRCGLDGPAEFVVGSPIQA-
VDQVQWFEVDAKPKPAIDAGRSTWYTVDRPVYVALTL
PSGSGPTPIQELSDVIDRTLAAIPISPAQ- SH >BL;ML1698, ML.tab
2046984:2047817 reverse MW:29424
VTGQSHDEHWRRPGECPEPIPGRPASASLVDPEDDLTPVGYPGDFGTTTVIPYSDPDHLKGPGGTG-
YNLLDQQEPLPYVQ (SEQ ID NO:200)
PQARHAVAEPTEIDSDQDNERLHTVGRRGTQHLGLL-
VLRVGLGVVLAAQGLHKLFGWWGGQGLTGFKNSLTQVGYQHADI
LAYVSAGGEAVAGVLLVLGLFAPVVAAGALAFLINGLLAAWPHSPLFSFFLPDGNEHQITLIVMDVTVILCGP-
GRYGLDA GRRWAYRPFIGSFIVLIAGIAAGIAVWVLLNGVNPLA >BL;ML1704, ML.tab
2055043:2055741 forward MW:24598
VEMPLTQRYATVLAVPSYLLRIELADRPGSLGSLAVALGSVGGDILSLDVVERSGGYAIDDLVIEVPSGAMPD-
KLITAFE (SEQ ID NO:201)
SLPGVRVDSVRPHSGLLEAHRELELLDQVAAADDNASKLQVLA-
DEAPKVLRVSWCTVLRSSQGKLLRLAASVGAPETRAN
SAPWLPIERAAPLDGTAEWVPQSLRDMNT-
TMIAAPLGDPHTAVVLGRPGPEFRPSEVARLGYLAGIVATMLR >BL;ML1706, ML.tab
2057887:2058900 reverse MW:34564 MSVFATASGVGSWPGSSPYPAAKVVV-
GELAGALAHIVELPARGVGADMLGRAGALLIDVAIDTVPRGYRIAARPGAVTRR (SEQ ID
NO:202)
AFSLLDEDMDAFEAAWEMAGLRGRGRVVKVQAPGPITLAAGLELANGHRAITDSCAVRDLAESLAEGVAA-
HRAALARRLD
TQVVVQFDEVSLPAALGGLLTGVTAFSPVAPLDETLAATLFDSCVATVGADVLLHS-
CAAELPWNFLQRSAIRAVSVDVNV
LRTGDLDGIAAFVESGRTVVLGVVPATAPQRLPSVEQVAAAV-
VGVTDRLGFGRSALRDRIGVSPACGLAGATPHWACTAI ELARKTAEAFAQDPDAI
>BL;ML1720, ML.tab 2076101:2077195 forward MW:38298
LAAGPALSARGYLAMNAQTQAGCSLMEWENNDNGRQRWCVRLVQGGGFGGPLFDGFENLYVGEPGTIFSFPMT-
QWTRWRQ (SEQ ID NO:203)
PVIGMPSTPRFLGNGQLLVTTHLGQVLVFDAHRGMVVGSPLDL-
VDGVNPTDPTRGLADCVSAQRGCPVASAPAYSPASOT
VVLDIWQPGAPTAGLIGLKYHSRQTPLLA-
RAWTSDAIGAGVLASPVLSADGSTVYVNGRDQHLWALHAANGKPKWSVPLE
FLAQTPPTVTPQGLIVSGGGPDTRLAAFKDAGDHAQQIWRRDDLTPLSTSSLAGVSVGYTVVSSSPAADAPGM-
SLLAFDP RNGHTLNSYPLPAATGYPVGVSIANNRRVITVTSDGQLYSFAPT >BL;ML1723,
ML.tab 2079775:2080758 reverse MW:36615
MTRSTEISANAVPNPHATAEQVAAARKDSKLAQVLYHDWEAENYDEKWSISYDQRCVDYVRHCFDAIVPDELF-
TELPYDC (SEQ ID NO:204)
ALELGCGTGFFLLNLIQSGVARRGSVTDLSPGMVKIATRNGQS-
LGLDINGRVADAEGIPYDDNTFDLVVGHAVLHHIPDV
ELSLREVLRVLKPGGRFVFAGEPTDVGNR-
YARVFADLTWKVTIRVVQLPGLSAWRRTQVEIDENSRAAALEWVVDLHTFE
PKVLETMATNAGAVQVKTVTEEFTAAMLGWPLRTFESTMPPSKLGWGWARFAFTSWKTLSWVDANVWRRVIPK-
SWFYNII ITGVKPS >BL;ML1781, ML.tab 2157673:2158185 reverse
MW:18049 MRASNQFADATAGVVYVHASPAAVCPHVEWALS-
STLQTKAINLVWTPQPAMPGQLRAVTNWTGPVGTGARLANALRSWSVL (SEQ ID NO:205)
RFEVTEDPSPGVDGQRFSHTPQLGLWSGSMSANGDIMVGEMRLRANMAHGADTLAAELDSVLGTAWDDALEVY-
RDGGDVG EVTWLSRGVG >BL;ML1782, ML.tab 2158324:2159145 reverse
MW:29017 MVALDALDDWPVPTTAAAVVGPTGVLAARGDTE-
QVFPLASVTKPLVARAVQIAIEEGVVDLDTVAGPPGSTIRHLLAHAS (SEQ ID NO:206)
GLAMHSKYVMAPPGTRRIYSNYGFRVLAETVQREAGIGFSRYLTEAVLEPLAMTATKLEDGTWAAGFGATSTV-
ADLAAFA
NDLLRPATVSAQIHAEATSVQFPDLDGVLPGHGVQRPNDWGLGYEIKNSKLPHWTGTLN-
SARTYGHFGQSGGFIWADPEA ELALVVLTDRDFGEWALQLWPAISDAVISEYAR
>BL;ML1791, ML.tab 2165620:2166063 reverse MW:16468
LGKIFIMLPTVTHQAITCTVRRVRWIVDAMNVIGTRPDCWWKDRRGAMVRLVGKLERWASTERNHVTVVFERP-
PSPSIRS (SEQ ID NO:207)
SVIVIAHAPKAFPDSADDEIVRLVQADPEPQGICVVTSDSALT-
DRVQEVGALAYPAARFRKHIDSID >BL;ML1813, ML.tab 2194277:2194849
reverse MW:20227 MTGQDVVVQQPQTIPMLPIYIPQDVDMTVVKSE-
VAAAGVSASPAAMPGLLEVVSHAQAEGINLKIVLLDHNLPNDTPLRD (SEQ ID NO:208)
IATVVGADYPDVTVLTLSPNYVGSYSTHYPRVTLEAGEDISKTGNPVQSAQNFLGELNVPEFPWTVLTIVLLI-
GVLVAAI GTRFMQLRSKRLATSLDAAGILAEDVNRAD >BL;ML1908, ML.tab
2293144:2293644 reverse MW:18864
VALKTDIRGMVWRYSDYFIVGREQCREFARAIKCDHPAYFSEDAAAELGYDAIVAPLTFVTIFAKYVQLDFFR-
NVDVGME (SEQ ID NO:209)
TNQIVQVDQRFVFHKPVLVGDKLWARMDIHSVSERFGADIVVT-
KNSCTSDDGELVMEAYTTLMGQQGDNSSQLKWDKESG QVIRSA >BL;ML1909, ML.tab
2293648:2294076 reverse MW:14875
MARREFSSVKVGDLLPEKTYPLTRQDLVNYAGVSGDLNPIHWDDEIAKVVGLDTAIAHGMLTMGIGGGYVTSW-
VGDPGAV (SEQ ID NO:210)
IEYNVRFTAVVPVPNDGQGAVLVFSGKVKSVDPDTKSVTIALS- ATTGGKKIFGRAIASAKLA
>BL;ML1910, ML.tab 2294063:2294542 reverse MW:17608
VGLTTNIVGMHYRYPDHYEVEREKIREYAVAVQNEDTSYFEEDAAAELG-
YKGLLAPLTFICLFGYKAQSAFFKHANIAVT (SEQ ID NO:211)
DQQIVQIDQVLKFVKPIVAGDKLYCDVYVDSMREAHGTQIIVTKNVVTNEVGDIVQETYTTLAGRVGEGGEEG-
FSDGAA >BL;ML1911A, ML.tab 2295156:2295371 reverse MW:7998
MLKKVEIEVDDDLVQEVIRRYGLLGRREAVHLALKALLGEPGVGGLSEQDPEYDEFSN-
PDAWRTRRSSDTG (SEQ ID NO:212) >BL;ML1918, ML.tab 2301835:2302626
reverse MW:27133 VLDLEPRGPLPTEIYWRRRGLAVGITVIVIVIA-
AIVAVVGNGAAAQPANVDKPGSSQNHPGSATPKVLPPNGHEGNLAPA (SEQ ID NO:213)
PPQGRNPESSTSTAAVQPPPVLREGDDCPDSTLAVKGLTNVPQYFLGDQPKFTMVVTNIGLVSCKRDVGAAVL-
AAYVYSL
DNKRLWSNLDCAPSNETLVKTFTPGEQVTTAVTWTGMGSSPHCPLPRPAIGPGTYNLVV-
QLGNLHSMPVPFILNQPPPPP GSSGPALAPSPEAPPANVPVSGG >BL;ML1926, ML.tab
2311558:2312061 reverse MW:17296
LVDTVGLVNKCVPDSSGLPLRAMVMVLLFLGVIFLLLGWQALGSSGNSDDYSALPMSSMPNTPTTPAATSTSS-
TSAANQA (SEQ ID NO:214)
EVRVYNISSKEGIAARTRDQLTTAGFKVTEVDNLVVSDVSATT-
VYYSDAEGEYATADAVGQKLGAPVEPRIAAITNQPPG VIVLVTS >BL;ML1927, ML.tab
2312092:2312400 reverse MW:11799
MDGAMARAHRAGDDVEIVDGLTRREHDILAFERQWWKFAGVKEEAIKELFSMSATRYYQVLNALVDRPEALAV-
DPMLVKR (SEQ ID NO:215) LRRLRASRQKARAARRLGFEVT >BL;ML1937,
ML.tab 2323320:2324552 forward MW:46582
MERPNEYMARQRSILNKVFTGFSAYCRYGQHVSKQQARSTVESTYRSILPNIPGVPWWAALLIATTASAIGYA-
IDAGNTD (SEQ ID NO:216)
LTTVFTGFYITGCVAAVLAVRQSGLFTAVIQPPLILFCAVPAA-
YWLFHGSKIKSLKDLLINCGYPLIERFPLMLGTAGSV
LLIGLVRWFFQLMYRTTASSNSEDDTVSK-
PNSLISGITAVLNSILCIYSNDQNHRANQDTADTELMHSDPPLRTRQTARD
NRSARTHSGQARRVVEYTSEPLVEPWQHSSQRSSEQARDFDAGESPRRSRRQPTPQSDPELRSQPPREIRRDA-
YGHRSGP
YEWPTNHSSHLEPYRRYKQPGPPEHFTEHGQLYERYKQPRRRATPPRASSVNPISQVRY-
RGSTARDDPRVDRQRSQTPRR PVTESWEFDV >BL;ML1945, ML.tab
2331776:2332549 reverse MW:27316 MVNDCPRSRSATWSWALAVNQQGWDT-
RRVTFEWQPNSEVANKPGSVSWSAKPRLLQDGRDMFWSLVPLVVGCILLAGMAG (SEQ ID
NO:217)
TCTFAPGGTTKSTVSSYDAAAALRADARALGFPVRLPELPTGWRSNSGSRGSIEDGRMDLSTGKRLPAVT-
STVGYITPTG
MYLSLTQSNADEDRLVASIHPSSHPTGTVDVAGTKWVVYQGSDQKKDHSGAAAEPV-
WTSRFASPVGAAQIAITGAGCSSQ FRMLASATQSQQPLPAR >BL;ML1983, ML.tab
2367418:2367885 forward MW:16955
MSNRKFSFEVTRTSSASAAVLFRLVTDGANWAKWAKPIVLQSSWARLGDLRPGGIGAVRKIGIWPILVREETV-
EYEQDRR (SEQ ID NO:218)
HLYKLIGPPNPAKDYLGEVLLAPNAAGGTNIHWSGSFTENIPG-
TGPVMRAALKGAVRCLTVRLVKTAERESDGVQ >BL;ML1988, ML.tab
2375705:2376418 reverse MW:25568 MDCEVAREALSARLDGEREAVPSVRVDEHLGEC-
SACRAWFDQVADQARDLRQLVDSRPAITPVDALGIVAPPRRRRPLMT (SEQ ID NO:219)
WQRWALLCVGIAQIVLATVQGLGVSVGLTHDRAMSFGSYLLHESTAWSASLGVIMMAAVLWPGVAAGLAGVLT-
VFVGLLT
VYVVVDVVAGATITLRILMHLPVAIGAVLAIAVWRHSSTPRPTFDDEVDVDLDIVLPRN-
ASRGRRRGHLWPTDGSAV >BL;ML1993, ML.tab 2380915:2381466 forward
MW:20576 MSAELAPSLQNAAESTNTFPMAEDLLGSILEPYSYKGCRYLIDAQYRAS-
PDSVFAYGNFGIEESAYIRGTGHFNAVELML (SEQ ID NO:220)
CFNQLGYSAYAQSVVNKDISALRGWSIADYCRNQLSGILIKNTSSRFKKLINPQKFSARLHVYDLRIVERTWR-
YLQLSNT IEFWDDNGGSAIGEFEVAILNIP >BL;ML1995, ML.tab
2383087:2383323 forward MW:8810 VTETARERILTAVCEVLYIAESDLVDG-
DETDLRDLGLDSVRFTLLMKQLGLSQEAEMQSKLMDNFSIANWVRQLESST (SEQ ID NO:221)
>BL;ML1997, ML.tab 2387532:2388164 reverse MW:23261
MIDDLLLRWVTGLFVLSAAECGFACLACRRPWTLVPSNGLHFAMAIAMAVMAWPRGAQLPTTGTVVFCGLVGA-
WFVILA (SEQ ID NO:222)
TVSSRRIAERAAYAYPALMMLANVWMYVIMDSHLHDHWATGHHT-
SPHTSMLGVDMTTTWPASGIPGWISIINWLWFAFF
CIATVFWTYRSFATSRRNAGFSRYCSLHPSG- QAMMSTGMAIMFGVLLFHV >BL;ML2010,
ML.tab 2402973:2403434 forward MW:15613
MWLTAAPAAARFVVASVIAASCAASTGVAGADPQSPSAPKTTIDHDGTY-
AVGTDIAPGTYSSAGPVGNGTCYWKRIDNPD (SEQ ID NO:223)
GPIDNAMSKKPKIVQIEASNKAFKTTGCQPWQQTSNTTVSTDLPGPIAGIQLESNLGILNGLLASNGQQVPRS
>BL;ML2022, ML.tab 2413644:2414168 reverse MW:18955
MSGPSSSCRCLPCCPHDRSSQPRCGESLGWQGYGRVYLESPWPFGLKQILQILRYGAEYAGKILAQPSK-
PTDVSSGPRRK (SEQ ID NO:224)
TQVFAKATVVPKILQPKETQMTFDPKNAVNAARDIATNF-
VEKASDIVENVSDIIKGDIAGGANDIVQNSIDIATHAVDKA KEIFIGKGEYDELE
>BL;ML2023, ML.tab 2414264:2414668 reverse MW:14091
MIRELLAISAIASGAVNSAPRAAANPHYDGDVPGMSYDASLSAPCFSWERYIFGRGPSGQAEACHFPPPNQFP-
PANTGYW (SEQ ID NO:225)
VISYPLYGVQQAGAPCPKTQAAAQSADGLPMLCLGGQGWQPGW- FTDKGFFPPAG
>BL;ML2030, ML.tab 2420647:2421120 forward MW:16613
LDMPGEMLDVRKLCKLFVKSAVVSGIVTASMALSTSTGMANAVPREPNWDAVAQCES-
GRNWRANTGNGFYGGLQFKPTIW (SEQ ID NO:226)
ARYGGVGNPAGASREQQITVANRVLAD-
QGLDAWPKCGAASDLPITLWSHPAQGVKQIINDIIQMGDTTLAAIALNGL >BL;ML2031,
ML.tab 2421151:2421606 forward MW:17306
MEGSVTVHMAAPADKVWNLIADVRNTGRFSPETFEAEWLDDVTGPALNAKFRGHVRRNEIGPVYWTTCKVTAC-
EPGREFG (SEQ ID NO:227)
FTVLLGNKPVNNWHYRLVTSGDGTYVTESFRFNRSPLLTVYWL-
LGGFLRKRRNIRDMTKTLRRIKDVVEAE >BL;ML2048, ML.tab 2437805:2438062
forward MW:9877 MTNLWDLKQKVIYSIQWLWNSLGHQLPRTMLETI-
RRKTGQPWRTAAGVHPVDNQFGMVCENSEYSDYVYNIKANTAVRT (SEQ ID NO:228) HKSKI
>BL;ML2054, ML.tab 2442182:2442481 reverse MW:10350
MDVMAATEYLARSTTLTSVGWIGYIIIGGIAGWIAGKIVQGGGSGILMNIVIGVVGALIGGFLLSFFVN-
TAAGGWWFTLF (SEQ ID NO:229) TSILGSVILLWVVGRVRKT >BL;ML2063,
ML.tab 2450656:2451084 forward MW:16160
MAKLTRLGDLERAVMDHLWSRQEPQTVRQVHEALSARRDLAYTTVMTVLQRLAKKNLVSQIRNNRAHRYAPVH-
GRDELVA (SEQ ID NO:230)
GLMVDALDQAEDSDSRQAALVHFVERVGADEADALRRALAELE- THQRHVRHLVALQRTTEGD
>BL;ML2064, ML.tab 2451089:2452039 forward MW:32995
VSALAFTILAVLLVGPTPTLVARSTWPLRAPRAAMVLWQTIALAAALST-
FSAGIAIASRVLMPGPDGRLLTASVIGAAGRL (SEQ ID NO:231)
GWPLWAAYVAAFALTVLVGARLIVAIVRVAIATRRRRAHHRMVVDLVGVGHNAALAQPCARARDLRVLEVAQP-
LAYCLPG
VRSRVVVSEGALTKLNDTEVTAILTHERAHLRARHDLVLEAFTAVHAAFPRLVRSSAAL-
SAVRLLVELLADDAAVRAAGR
TPLARALVACASGQAPSGALAAGGNTTVLRVRRLSGRSNSAVVSA-
AAYLAAAIVLLVPTVALAVPWLTELQRLFNI >BL;ML2070, ML.tab
2460317:2462518 forward MW:77018 MTKLGAEFKPDQTRINRKAANTGMGRHSMPDPE-
DSIDQPSNQFAASGPDQSDEIDHGYQSRMGYPEPVFEPAATGSPSYR (SEQ ID NO:232)
SYPHGAEHPADSTPEALDETIDYQSYWAEDRNEDLFVDGAADDHPDFPPRPAGSSTSSQAPTSLSHLFKASHR-
SVGKWQG
GHRSDGGRRGVSIGVIATLVAVVVLVGAVIMWSFLGHILNNRKHQAAARCVGGHQTVAV-
VADPSIADYLQEFAQSYNASA
RPVGDHCMMVTVKPVGSEAALTGFNDSWPANLGDKPALWIPGSSI-
SAARLAVTADQKTISESHSLVTSPVLLAVRPEFEQ
ALANKGWAALPGLQTNPNSLADLNLPAWGSL-
RLALPMNGNSDATFLAGEAVATASVPAGAPAIQGVGAVRTLMSAQPKLA
DSTWAEAMSTLLKPGDVATAPVHAVITTEQQLFQRGQSLSDAKSALGSWLPHGPAPVADYPAVLLNGSWLTQE-
QAAAASE
FARFVQKPDQLAKLAKAGFRVNGVTPPSSSVTSFAAVPSTVSVGDDGMRATLVEEMIQP-
SSGVAATIMLDQSMPTDEGGK
TRLANVVAALDDKINAMPPTSVMGLWTFDGHKGQTEVTTGQLADP-
VNGQPRSAALTAALDKQYSSNGGAVSFTTLRMIYQ
EMLANYHVGQTNSVLVITAGPHTDQTLDGAR-
LQDFIRTSADPAKPIAVNVIDFGTDPDQATWKAVAQISGGSYQNLSTSA SLDLATAINTFLS
>BL;ML2073, ML.tab 2466445:2467140 reverse MW:24857
MTHLVTRARSARGNTVSEQPRQGQLDLADYRDTPTATTHGDIGLNGPTAVSGPAQPGLFPDDSVPDELVGYRG-
PSACQIA (SEQ ID NO:233)
GITYRQLDYWARTSLVVPSIRGAAGSGSQRLYSFKDILVLKIV-
KRLLDTGISLHNIRVAVDHLRQRGVQDLANITLFSDG
TTVYECTSAEEVVDLLQGGQGVFGIAVSG-
ANRELTGVIDDFRGERADGGESIAAPEDELASRRKHRDRKIG >BL;ML2075, ML.tab
2467984:2468739 reverse MW:26531 VSAPDSPALVOMSIGAVLDLLRSDFP-
DVTISKIRFLEAEGLVTPQRSASGYRRFTAYDCARLRFILTAQRDHYLPLKVIR (SEQ ID
NO:234)
AQLDAQPDGELPSFGSPYVAPRLFSVTGGPGAGVGSGVGSDIPAVSPAGVRLSREDLLDRSGVADDLLTA-
LLKAGVITTG
PGGFFDEHAIVILQCARALSEYGVEPRHLRAFRSAADRQSDLIVQIAGPIVKAGKA-
GARDRADDLAREVAALAITLHTSL IKSAVRGVLHR >BL;ML2111, ML.tab
2512539:2512973 reverse MW:15031 MIHRLRLGWLVAFFATIVLISVWIPW-
LTTKVNDEGWANAIGGTNGNLELPSGFGAGQLIVLLSSTLLVAGAMMGRGLSVK (SEQ ID
NO:235)
VASIAALIISLLIVVIVVWYYQLNVNPPWAECGLYLGAAGAVCAMVCSVWAVIAAVAAGRSSL
>BL;ML2113, ML.tab 2513870:2514415 reverse MW:19748
VTSRIKRMAKLTGSIDVPLPPNEAWMCASDLARFGEWLTIHRAWRSKLPEVVEKGTVIESYVEVKGMPNRIRW-
TVVRYKA (SEQ ID NO:236)
PEANTLNGOGVGGVKVKLIAKVSPKDDGSVVSFDVNLGGSALF-
GPIGMIVAVALRTDIRESLQNFVTAFSRPEPGLIRSR ALVVDQHSRAVVEQRSASAGL
>BL;ML2114, ML.tab 2514412:2514582 reverse MW:6182
VLDKAKDILVQNADKVETVLDKAGEFVDEKAKGKYTDTIYQVAEETKKAASFDTFC (SEQ ID
NO:237) >BL;ML2135, ML.tab 2537496:2538521 reverse MW:40179
MARTRMVRRWRNNMEVRDDTDYVGMLTTLSEGSVRRNFNPYTDIDWESPAFAVKDNDPRWILPDTDPLG-
RHPWYLAQPEQ (SEQ ID NO:238)
RKIEIGMWRQANVAKVGLHFESILVRGLMNYTFWMPNGS-
PEYRYCLHECVEECNHTMMFQEMVNRVGADVPGMPRRLRWL
SPLVPLVAGPLPVAFFIGVLAGEEP-
IDHMQKNVLREGKSLHPIMERVMAIHVAEEARHISFAHEFLRKRLTHLTKRQLFW
VSLYYPLTMRVLCNAIMVPPKAFWQEFNIPREVKKELFFRSPESRKLLRDIFADVRMLAYDTRLMETRSARLM-
WHICKID GQPSRYRGEPQRQHPATTSAA >BL;ML2137, ML.tab 2539983:2540738
forward MW:27365 MRELKVVGLDPDSKTILCEGDIPGERFKLPADD-
RLRAAVRGDTTLLDQPQLDIQVTNMLSPKEIQARIRAGASVEQVANA (SEQ ID NO:239)
SGSDIARIRRFAHPVLLERSRAAELATAAHPVLADGPAVLTLLETVSAALVKRGLEPDKLSWDAWRNEDSRWT-
VQLMWHA
GRSDNLAHFCFTPGAHGGTVTASDDAANELIDPDFNRPLRPLARVAHLDFVEPATPATV-
TPDNQLVSSRRGKPTIPAWED VLLGVRSAGQP >BL;ML2141, ML.tab
2542620:2542895 forward MW:9766 MLVEPDRSREPPPLPSMLLEVWPVIMV-
GALAWLIAVVAAFVVSSLQSWRPVALAGLVVGLFGTGIFVWQLAAARRGARGA (SEQ ID
NO:240) QGGLETYLDPR >BL;ML2142, ML.tab 2543007:2543816 reverse
MW:28190 VTGPVEDSAVATVAEWPEELAAVLTNAADDARAAIEEFSGSVTVGDYLG-
VGYEDPNAATHRFLAHLPGYQGWQWAVVVAA (SEQ ID NO:241)
YPGAEHATVSEVVLVPGPTALLAPEWVPWEQRVRPGDLGPGDLLAPTSEDLRLVPGYNASGDPAVDEIAAEIG-
LGRRWVM
SVWGRSAAAERWHGGDYGPDSPMARSTKRVCRDCGFFLSLVGSLGANFGVCGNEMSADG-
HVVDKLYGCGAHSDTPAPAGS GSSVYEPYDDGVLDILEPSTVQLPESPAD >BL;ML2143,
ML.tab 2543926:2545665 forward MW:61675
VSGRRRNHPGRLASIPGLRTRTGSRNQHPGIANYPADSSDFRPAQQRRALEQREQQAGQLDPARRSPSMPSAN-
RYLPPLG (SEQ ID NO:242)
QQQSEPQHNSAPPCGPYPGERIKVTRAAAQRSREMGYRMYWMV-
QRAATADGADKSGLTALTWPVVTNFAVDSAMAVALAN
TLFFAAASGESKSKVALYLLITIAPFAVI-
APLIGPALDRLQHGRRVALATSFVLRTGLATLLIMNYDGATGSYPSMVLYP
CALAMMVLSKSFSVLRSAVTPRVMPPSIDLVRVNSRLTVFGLLGGTIAGGAIAAGVEFVCAHLFKLPGALFVV-
AAITISG
ALLSMRIPRWVEVTAGEIPATLSYRLHRKPLRQSWPEEVKKVSGRLRQPLGRNIITSLW-
GNCTIKVMVGFLFLYPAFVAK
EHQANGWVQLGMLGLIGTAAAIGNFAGNFTSARLQLGRPAVLVVR-
CTIAVTVLALAASVAGNLLMTTIATLITSGSSAIA
KASLDASLQNDLPEESRASGFGRSESTLQLA-
WVLGGALGVMVYTDLWVGFTAVSALLILGLAQTVVSFRGDSLIPGLGGN
RPVMIEQESMRRAAAVSPQ >BL;ML2144, ML.tab 2545673:2546161 forward
MW:18046 VQRRVTVLLPAVVMLLAAAAGFGVWLLVREPDPQRPEISVYSHGHLTRV-
GPYLHCNMLNLDECQTPQAQGVLSADEHNPV (SEQ ID NO:243)
QLSVPETISRAPWRLLRIYEDPTGTTSTLYRPNTRLAVTIPTLDPHRGRLTGIVVQLLTLVVDPAGEVHDVPH-
TEWLVRL TF >BL;ML21S1, ML.tab 2554745:2555269 forward MW:17666
MSESYRKLTTSSIIVAKITFTGAMLDGSIALAGQASPATDSEWDQVARC-
ESGGNWSINTGNGYLGGLQFSQGTWASHGGG (SEQ ID NO:244)
EYAPSAQLATREQQIAVAERVLATQGSGAWPACGHGLSGPSLQEVLPAGMGAPWINGAPAPLAPPPPAEPAPP-
QPPADNF PPTPGDVPSPLARP >BL;ML2155, ML.tab 2556716:2556940
reverse MW:8095 MKSVNYYVVADIAKSKAHKSRYVDNGWPTTDPDH-
HAVSELVTDCAGALSPFGDLVFPVPADDLPYVHPVTVVNR (SEQ ID NO:245)
>BL;ML2156, ML.tab 2557038:2559299 forward MW:80254
MTDNTPDIPLGSWLADLSDERLIQLLELRPDLAQPPPGSIAALAARAQARQSIKAATDELNFLQFAVIDALLV-
LQADSTP (SEQ ID NO:246)
VPTTKLKALIGDRAPQADVVSALDNLRQRALAWGETTVRIAAA-
AAAALPWHPGQVTLEDISSTGEQIAELIARLSQTQRD
VLQKLLEGSPLGRTRDAAPGAPPDRPVPQ-
LLAMGLLRRIDADTVILPRRVGQVLRGEQSGPTQLTQPYPVVSVTTPNDAD
AEAAGAVIEALHELDVLLETLGSAPVYELRNGGLGVREFKRLAKATGINEPRLGLLLEVTAAAGLIASGIPDP-
EPATGDS
PYWAPTVATDRFTANPPAERWHLLASTWLDLQCRPALIGSRGPDAKSYGALSNSLYSTA-
APLDRRLLLGMLAELPAGVGV
EAAEASAALIWRRPRWARRLQPGPVADMLAEGHTMGLVGRGAIST-
PGRALLDEAIASADPAIAVAAMTRALPEPIDHFLV
QADLTVVVPGPLQRNLAKELGTVATVESAGT-
ANVYRISEQSIRHALDIGKTRDWMHALFTNHSKTPVPQRLTYLINDVAR
RHGQLRIGMAASFVRCEDPALLTQTVAAAEELQLRALAPTVAVSPAPIAEVLVTLQSAGFAPAAEDSSGSIVD-
VRPRRAR
LPTPQHRRPYRPLQRPNLETLNAVIAVLRKVAATPFGGIRVDPTVTMSLLQRATKEKTT-
LVIGYLDAAGIATQRMVSPIA IRGGQLVAFDPGTGRLRDFVIHRITSVVSSDSQ
>BL;ML2193, ML.tab 2608113:2609048 reverse MW:33819
MVLNWRFALSADEQRLVREIISAATEFDEVSPVGEQVLRELGYDRTEHLLVTDSRPYAPIIGYLNLSSPRDAG-
VANAELV (SEQ ID NO:247)
VHPRERRRGVGAANVRAALAKTGGRNRFWAHGTLASARATASV-
LGLVPVRELVQMQRSLRTIPDPMVPDQLGVWVRTYVG
TVDDAELLRVNNAAFAGHPEQGGWTATQL-
AERRSEPWFDPAGLFLAFGDSSSNQPGKLLGFHWTKVHAAHPGLGEVYVLG
VDPSAQGRGLGQMLTSIGIASLAQRLVGPSAEPTVMLYVESDNVAAARTYERLGFTTYSVDTAYALARIDD
>BL;ML2195, ML.tab 2609965:2610816 forward MW:30382
MVKDATCIKVRHNMRMRKVLTSVIATVVTMAVLVVGVLGIDYGTSIYAEYLLSVNVRNAANLGSDPFVAIL-
AFPFIPQAM (SEQ ID NO:248)
RDHYTELEIKANAVDHANVGKASLEATMYSIDLTYASWLIK-
PDAKLPVDRLESRIIIDSMHLGQYLGISDLMVEAPHRET
NNATGGTTESGISSGHGLVLSGTPKSA-
NFDHRVSVLVDLSIAPEDQATLVFTPTGIATGPDTANQPVPDDKRNPVLRAFS
ARMSDQRLPFGVAPTSEGARGSDVIIEGITQGVTITLDGFKQS >BL;ML2199, ML.tab
2612896:2613198 forward MW:10570
MCSEPKQRLALPANVNLEKETVITGRVVDRKGQAVGGAFVRLLDSSNEFTAEVVTSATGDFRFFAAPGSWTLR-
ALSSVGN (SEQ ID NO:249) GDAMMSPSGTGIHKVDVKIT >BL;ML2200, ML.tab
2613457:2614143 forward MW:24489
VTSDEVRDGAGSPADSSKGNKCTAAGMFQAAKRSTVSAARNIPAFDDLPVPSDTANLREGANLNSTLLALLPL-
VGVWRGE (SEQ ID NO:250)
GEGRGPNGDYHFGQQIVVSHDGGNYLNWEARSWRLNDAGEYQE-
TSLRETGFWRFVSDPYDPTESQAIELLLAHSAGYVEL
FYGRPRNASSWELVTDALACSKSGVLVGG-
AKRLYGIVEGGDLAYVEERVDADGGLVPNLSARLYRFAG >BL;ML2204, ML.tab
2619139:2619327 forward MW:7054 MGRGRAKAKQTKVARELKYSSPQTDFQ-
RLQRELSSTGAADPGQLDGDDRVSEDSWDEDAWRR (SEQ ID NO:251) >BL;ML2207,
ML.tab 2622297:2622692 reverse MW:13959
MAVCDRADPAKTRQAVLALADWLKDRTLPAPDRDAVATAVRLTVRTLATLAPGASVEVRIPPYVAVQCVSGPS-
HTRGTPP (SEQ ID NO:252)
NVVETDSRTWLLVATGLMQLVEAVATGALRMSGSRAGDIEVWM- PLINLRCT
>BL;ML2219A, ML.tab 2636676:2636915 reverse MW:8613
VARAVVMVMLRAEILDPQGQAIAGALGRLGHTGISDVRQGKRFELEIDDTVDDSELAM-
IAESLLANTVIEDWTITRESQ (SEQ ID NO:253) >BL;ML2228, ML.tab
2645990:2646610 forward MW:21778 MNSRFLPYATSPGRLTIQLLSDIAVV-
MWTTFWVLVGIAVYNTISTIADTGAQVESGAHGIADNLASASHGMQLIPVLGNA (SEQ ID
NO:254)
VSKPLTLTSAAALDIADAGHSLNTTASWLAVLLALAVIALPILVAVVPWLVLRFWFFRHKWTVTTLAATP-
AGKQLLALRA LTLRSPSKLAAVSADPVGGWRREDPGTIRGLAALELQSSGITLRVH
>BL;ML2253, ML.tab 2676319:2676534 reverse MW:8295
MPNNIAFSRSYIGKRCYSEQEVGVFIDLAEQERTRHIEEDVEFRNRNAELRNQDGAPQPRSCADRARNCHV
(SEQ ID NO:255) >BL;ML2258, ML.tab 2682528:2682830 reverse
MW:11299 VNRFLTSIVSWLRAGYPEGIPATDTFAVLALLARRLTNDEVKIVARELI-
RRGEFDKIDIGVMISHLTDELPSPQDIERVR (SEQ ID NO:256)
TRLNAKGWLLDNARDNGEPT >BL;ML2259, ML.tab 2682860:2683150 reverse
MW:10541 LSPWFNYEATLKILLFSTLAGAALPGLFALGIR-
LQVNDAGDASTNNATPNRKPILVTLAWVIYALVLMVVILGVLYIVSR (SEQ ID NO:257)
DFIAHHTHYPFLGIKP >BL;ML2261, ML.tab 2684489:2684905 reverse
MW:15612 MEILASRMLLRPTDYQRSLSFYRDQIGLAIAREYGTGTVFFAGQSLLEL-
AGYVIQGAPDHSRGAFPGALWLQVRDIAVTQ (SEQ ID NO:258)
ADLEGRGVSITREPRREPWGLHEMHVTDPDEITLIFVEVPANHPLRRDTRSERPRTPD
>BL;ML2271, ML.tab 2695503:2696030 forward MW:19222
VILSKSLLRILIHGRSDEPPSTRARVIMRWGRIAVLIVTGLITLQSVLLVAGAWRNDLTIQHNMGVAQAEVLS-
AGPRRST (SEQ ID NO:259)
IEFVTPERVTYRPELGVLYLSELSTGMRIYVEYNKNDPNLVRV-
RHRNAGLAIIPAGSIAVVCWLAATVVLVALAVLDKRL DRHTESSAVPSQPTS
>BL;ML2274, ML.tab 2698017:2698355 reverse MW:11815
MKGTGLAANVAMAAAATVLAAPALADDYDAPFNAQLHSYGIYGAQDYNAWLGKIACQRLAKGVDGDVNKSATF
IQRNLPL (SEQ ID NO:260) YTTEGQSLQFLGAAINHYCPNQIGILQRAGAR
>BL;ML2289, ML.tab 2713408:2714178 reverse MW:26191
VTADLLAPLMELPGVATASDRARDALGRAHRHRANLRGWPVTAAEAALRAARASSVLDGGPVRLDHLPGASPQ-
AGGVSDP (SEQ ID NO:261)
VFGGALRVAQALEGGAGPLVGVWQRAPLQALARLHVLAAADQV-
GDEWLGRPRMDAEVGLRLGLLVDVVSGRTFAAAPVVA
AVVHGELLTLQPFGSADGVVARAVSRLVT-
IATGLDPHGLGVPEVSWMRRAAVYRDAACGFAGGTPKGVAAWLVLCCRALH
AGAQEALSIAESLPSR >BL;ML2295, ML.tab 2720556:2721260 reverse
MW:23836 LVLRRGHWCILVALVAVLLAVVSMPAKTVFADGRLPMGGGAGIVINGDT-
MCTLTTIGTDSAAELIGFTSAHCGGPGAQVA (SEQ ID NO:262)
AEGAENRGPMGTMIAGNDNLDYAVIKFDPAKVMPVAAYNGFVISGIGQDPAFGQIACKQGRTTGNSCGVAWGM-
GETSGTL
VMQVCGRPGDSGAPVTVNNLLVGMIHGAFTDNLPVCITKFIPLHTPAVVMSMNAILADV-
NKNNRPGAGFVPQPV >BL;ML2296, ML.tab 2721464:2722009 forward
MW:20041 MSKGDRKNGVPSTLTTIPLVDPHAEPTEPSIGDLIKDATTQVSTLVRAE-
VELARAEIIRDVKKGLTGSVFFIAALVVLFY (SEQ ID NO:263)
STFFFFLFLAELLDTWLWRWVALLIVFAIMVMVTAALALLGYVKIRRIRGPHQTIESVKETRTALTPGHDKAQ-
ARPRKLT GSGTNPENNPDRRTPADPSGW >BL;ML2306, ML.tab 2730213:2731358
forward MW:40804 MTAPTSHRPPTLDMAAILADTSNRVVVCCGAGG-
VGKTTTAAAMALQAAEYGRTVVVLTIDPAKRLAQALGVNDLGNTPQR (SEQ ID NO:264)
VPLAPEVPGELHAMMLDMRRTFDEMVVQYSGPERAQAILNSEFYQTVATSLAGTQEYMAMEKLGQLLSQDRWD-
LVVVDTP
PSRNALDFLDAPKRLGNFMNSRLGRLLLTPGRGIGRLVTGANGLAMRALSTVLGSQMLA-
DAATFVQSLDATFGGFRGKAD
RTYALLKQRGTQFVVVSAAEPDALREASFFVDRLSQENMPLAGLM-
LNRTHPTLCALPVEQAIDASKTLQDEITNSAAASL
ATAVLRINDRGQTAKREARLLSRFTGANPHV- PVIGIPLLPFDVSDLEALRAIADRLTASH
>BL;ML2307, ML.tab 2731463:2731708 reverse MW:9325
LCRATDPDELFVRGAAQRKAAVICRHCPVMQECR-
ADALDNKVEFGVWGGMTERQRRALLKQHPEVVSWADFFDTRKHRNV (SEQ ID NO:265) S
>BL;ML2320, ML.tab 2747152:2747799 reverse MW:22771
VSWVIRISAVVMSVGLGLGVPVASARPSEPGVVSYAVLGKGSVGNIIGRPMGWESLLTEPLQAYSVDLPMCN-
NWADIGLP (SEQ ID NO:266)
EVYHDVDLASFNGAITQTSANDQTHFVKQAVGVFATNDAAVR-
AFHRVVDRTVGCSGQTTAMHLDNGTTQVWSFVGGTPTY
ADANWTKQEAGTDRRCFVQTRLRENVLL- QTKVCQPGNAGPAVNVLAGAMQNALGQ
>BL;ML2330, ML.tab 2761253:2761603 reverse MW:11931
MQPGGDMSALLAQAQQMQQKLLETQQQLANAQV-
HGQGGGGLVEVVVKGSGEVVSVAIDPKVVDPGDIETLQDLIVGAMAD (SEQ ID NO:267)
ASKQVTKLAQERLGALTSAMRPTAPPPTPPTYMAGT >BL;ML2332, ML.tab
2762489:2762926 reverse MW:15496 MGPVSAVSTILVNVEPVATLAAVADY-
QKNRPKILSPQYNEYQVVQGGQGPGTVVKWKLQVTRSRVRDVQVNVDVAGHTVI (SEQ ID
NO:268)
EKDANSSMVTSWTVAPAGPGSSVTMKTAWTGAGGVKGFFEKTFAPLGLKKIQAEVLANLKNELER
>BL;ML2337, ML.tab 2769442:2770194 forward MW:27248
MGYKVATLWHASFSIGAGVLYFYFVLPRWPELIGETTHTLGTTLRIVTGVLFGLAALPVVFTLLRSQTSELGI-
PQLALSI (SEQ ID NO:269)
RTWSIVAHVLSGVLIVSTAIGEIWLSSDTAVQCLFGIYGAAAA-
SAVLGFVAFYLSFVAALPPPSPKPVKLKKANQRRIRR
RKGRKDNEAEDEEADAGEHNEAETPAQAE-
EVTSENPQPAPESGAQKEPDELLANKTEETDEPRRGLHNRRPTGKVSHQRR RARSGIAVEN
>BL;ML2366, ML.tab 2834617:2834958 reverse MW:11956
MAFMTSNPSGPPSQPAPAVGLTPGERAVPVTRAGALWSALFAGFLILILLLIFIAQNTTSTPFTFFGWHWSLP-
LGVAIML (SEQ ID NO:270) SAVGGGLLAVAVGTARILQLRRTVKKRYVAAHR
>BL;ML2377, ML.tab 2846366:2846830 forward MW:16863
MSKRQRGKGISIFKLLSRIWIPLVILVVLVVGGFVVYRVHSYFASEKRESYADSNLGSSKPFNPKQIVYEVFG-
PPGTVAD (SEQ ID NO:271)
ISYFDANSDPQRIDGAQLPWSLLMTTTLAAVMGNLVAQGNTDS-
IGCRIIVDGVVKAERVSNEVNAYTYCLVKSA >BL;ML2380, ML.tab
2850483:2850944 reverse MW:17491 MSRLSTSLCKGAVFLVFGIIPVAFPTTAVADGS-
TEDFPIPRRQIATTCDAEQYLAAVRDTSPIYYQRYMIDMHNKPTDIQ (SEQ ID NO:272)
QAAVNRIHWFYSLSPTDRRQYSEDTATNVYYEQMATHWGNWAKIFFNNKGVVAKATEVCNQYQAGDMSVWNWP
>BL;ML2388, ML.tab 2858302:2858607 forward MW:10942
VAAGDNRLIGNDTPEIAVGAIGGVATGYVLWLVVMSIGYNITTVSQWSLIVLILSMVSVFCSGMCGWWL-
RQRRKHAWGAF (SEQ ID NO:273) TFGLPVFPVVLTLAVLAKIYL >BL;ML2390,
ML.tab 2861300:2861605 reverse MW:10840
VSHIFLTLIADGELQYHGPDFGKASPMGLLVIVLLLVATLLLLWSMNRQLKKIPASFDSEHPELDQAADEGTE-
LGGYLDE (SEQ ID NO:274) EPSDTDRNGPSLPPEPGADSG >BL;ML2392, ML.tab
2862579:2863013 forward MW:15303
MNQAFTPLTETRYGRSRLPGRSRHRGVIALTVLAVAASTGIAVIGYQRLGTSDVAGSLASYRVLDDETVSVTI-
SVMRSDP (SEQ ID NO:275)
SRPVDCIVRVRAKDGSETGRREVLVAPAEATTVQVVTTVKSAR- PPVMADIYGCGTDVPGYLRPA
>BL;ML2407, ML.tab 2878350:2878685 reverse MW:12141
VSKGPNSICAADHNRDHLVVNVLLYAAARLLLV-
ILLGSVIYGVARLLGVTQLPIVVAALFALIIAMPLGIWLFSPLRRA (SEQ ID NO:276)
TAALAVVGERRRSEREQLRARLRGEPLPDEE >BL;ML2410, ML.tab
2881021:2882649 reverse MW:59195 LMQVLMRLLAWVRNTWRALTSMGTALVLLFLLA-
LGAIPGALLPQRDLNVGKVDDYVAAHPVIGSWLDQLQAFNVFSSVWF (SEQ ID NO:277)
TAIYTLLFVSLVGCLTPRMIEHARSLRAMPVIAPRNLARLPKHASFQVIGDDKTLAGTIAGRLRGWRTVIRTQ-
DGVVPET
VVSAEKGYLREFGNLVFHFSLLGLLVAVAVDKLFGYEGNVVVIADGGPGFCSASPAAFD-
AFRAGNTVDGTSLHPICIRVN
DFAAHYLPSGQAMSFTADIDYQDGHDLTVNSWRPYRLEVNHPLRV-
AGNRVYLQGHGYAPTFTVTFPDGQTRTSTVQWRPE
NQQTLLSSGVVRIDPPAGSYPNMSERRQHEI-
AIQGLLAPTEQLDGTLLLSRFPALNAPAVAIDIYRGDTGLDSGRPQSLF
TLDPRLINQGRLTKEKRVNLQAGEQVRLGQGPGAGTVVRFDGAVPFVNLQVSHDPGQAVFVFAIAMMVGLVVS-
LMVRRR
RVWVRLTPAAGTVNVELGGLARTDNSGWGDEFERLTERLLAGLVAADRTLAAARRSSQMD- VK
>BL;ML2425, ML.tab 2901005:2901505 forward MW:18654
MTVPLEAEGLIGKYRQLDHFQVGREKIREFAIAVKDDHPTHYNETAAFEAGYPALVAPLTFLAIAC-
RRVQLEIFTKFNI (SEQ ID NO:278)
PINVARVFHRDQKFRFYRTILAQDKLYFDTYLDSVIE-
SHGTVIAEVRSEVTDTEGKAVVTSIVTMLCELARQDATAEETV AAIASI >BL;ML2428A,
ML.tab 2904745:2904846 reverse MW:4145
MGSVIKKRRKRMSKKKHRKLLRRTRVQRRKLGK (SEQ ID NO:279) >BL;ML2432,
ML.tab 2907135:2907977 reverse MW:30442
LRPVIKVGLSTASVYPLRAEAAFEYAAKLGYDGVELMVWGESVSQDIDAVKGLSRRYRVPVLSVHAPCLLISQ-
RVWGANP (SEQ ID NO:280)
VPKLERSVRAAEQLGAQTVVVHPPFRWQRRYAEGFSDQVAELE-
AASDVMIAVENPFPFRADRFFGADQSRERMRKRGGGP
GPAISVFAPSFDPLAGNHAHYTLDLSHTA-
TAGSDSLEMVRRMGSGLVHLHLCDGSGLPADEHLVPGRGTQPTAAVCQLLA
GADFAGHVVLEVSTSSVRSATERETMLTESLQFARTYLLR >BL;ML2433, ML.tab
2908008:2909075 reverse MW:37938 MTGPHNDTESPHARPISVAELLARNG-
TIGAPAVSRRRRRRTDSDAVTVAELTCDIPIIHDDHADEQHLAATHAHRANIGV (SEQ ID
NO:281)
RVVEPAAQSPLEPVCEGIVAEPPVDDHGHVPPGCWSAPEPRWPKSPPLTHLRTGLQRSACSRPLPHLGDV-
RHPVAPDSIA
QKQSDAEGMSPDPVEPFADIPVDVMGSEVRAAELVAEESAYARYNLQMSAGALFSG-
HTLTNELAERRGDEHAAGGLLAVG
IDLDEDHLDLHTDLAGITSPARGWQSRFEALWRGSLIVLQSI-
LAVVFGAGLFVAFDQLWRWNSIVALVLSVLVILGLVVG
VRVVRRTEDIASTLIAVVVGALITLGPL- ALSLQSG >BL;ML2435, ML.tab
2910195:2911028 forward MW:31007
VSSCPESGSTFEYVANSHLEPVHPGEEVDLDFTREWVEFYDPDNSEQLIAADLTWLL-
SRWTCVFGTPACRGTVAGRPDDG (SEQ ID NO:282)
CCSHGAFLSDDADRTRLDDAVKKLSHD-
DWQFREKGLGRKGYLELDEHDGQSQFRTRKHKNACIFLNRPGFPIGAGCALHS
KALKLGVPPRTMKPDICWQLPIRHSQEWVTRPDGTEILKTTVTEYDRRSWGSGGADLHWYCTGDPASHVDSKQ-
LWESLAD ELTELLGAKAYAKLAAICKRRNRLGIIAVHPATQEAK >BL;ML2442, ML.tab
2917372:2917926 reverse MW:20820
MSTAWDTVWHACSVIEHALQASHLTYSEFSGVPDGLLRLVVELPGERRLKTNAILSIGEHSVNVEAFVCRKPD-
ENHEGVY (SEQ ID NO:283)
RFLLKRNRRLFCVSYTLDNVGDIYLVGRMSLASVDTDEIDRVL-
GQVLEAVESDFNTLLELGFRSSIQKEWDWRISRGESL NNLQAFAHLIDDEGDGDASIYARP
>BL;ML2446, ML.tab 2921383:2922708 reverse MW:46863
VIASTPRQNRESINRRVALTALGVGVFAPSVFVACAGSAIKPSEKKTTPAPHLTFQPATATDDVIPVAPISVQ-
IADGWFQ (SEQ ID NO:284)
RVTLTNPVGKVVAGVFNQDRTVYTITEPLGYDTTYTWNGSAVG-
HDGKAVPVTGKFSTVTPVKKVNGGFQLADGQTVGVAA
PITIQFDAPISDKSAVEKALTVTTTPPVE-
GSWAWLPDEAKGARIHYRPREYYPAGTTVNVDAKLYGLPFGDGAYGLQDMS
LNIQIGRRQVVKAEVSSHRIQVVTDAGVIMDFPCSYGEADQARNVTRNGIHVVTEKYSDFYMSNPAAGYSNVY-
ERWAVRI
SNNGEFIHANPASVGAQGNTNVTNGCINLSTGDAEQFYRSAIYGDPVEVTGSSIQLSYS-
DGDIWDWAVDWDTWVGMSALL SFPTVHQPATQIPVTAPVTPPGAPILSGTPTSGSGTARPGG
>BL;ML2450, ML.tab 2925762:2926499 forward MW:26174
LLRDPLAIVLILIIVVALVISGLIGAELFARHTANSKVARVVTCEIKDQATAKFGVTPLLLWQFATQHFTNIS-
VETAGNQ (SEQ ID NO:285)
IRDAKGMKIAIDIQNVQIRDTPTSRGTIGVLDAIITWSSDGIR-
QSVQNSIPVLGGVVTTSVTTHPTNGTIELKGMLNDIV
AKPVVSNGGLQLQIVSFNTLGFSLPKETV-
QFTLDDFTTNLTKNYPLGIHADNVEVTSTGVTSHFSARNTNIPNSTGGQDP CFANL
>BL;ML2452, ML.tab 2927236:2927607 reverse MW:13492
MSSPLSPLYVLPFVDHTKWTRWRSLISLQAYSNLFGRTSAMQPDVAAGDEAWGDVLTLSPDADTADMHAQFIC-
HGQFAEF (SEQ ID NO:286) VQPSNTSSNLEPWRPVVDDSEIFAAGCHPGISEGIQQADEGPR
>BL;ML2453, ML.tab 2927710:2927997 reverse MW:10004
MSHLVGTVMLVLQLAVLVTAVYAFVHAALQRPDAYTAAEKLTKPVWLVILGAAVSLTSILGFVFGVLGI-
VIAACAAGVYL (SEQ ID NO:287) VDVRPKLLDIQGKSR >BL;ML2454, ML.tab
2928117:2928683 reverse MW:20421
MAENPNVDDLRAPLLAALGAADLALTTVNELVGNMRERAEETRIDTRSRVEESRARVAKLQEVLPEHLSELRE-
KFTADEL (SEQ ID NO:288)
RKAAEGYLEAATNRYNELVERGEAALERLRSRPVFEDASARAE-
GYVDQAVELTQEALGTVASQTRAVGGRAAKLVGIELP KKAAAPARKAPAKKAPAKKAPAKKVTQK
>BL;ML2463, ML.tab 2936751:2937545 reverse MW:30332
VSLDKIMMPVPEGHPDVFDREWPLRVGDIDRTGRLRLDAGVRHIQDIGQDQLREMGFEETHPLWIVRRT-
MVDLIRPVEFQ (SEQ ID NO:289)
EMLRLRRWCSGTSNRWCEMRVRIDGRKGGLIESEAFWIN-
INRETQMPARIADDFLAGLHRTTSVDRLRWRGYLQPGSRDD
ASEIHEFPVRVTDIDLFDHMNNSVY-
WSVIEDYLVSHSELLKGPLRTTIEHEAPVALGDKLEIVLHVHPAGSTDQFGPGLV
DRSVITLTYTVGDETKAIAAIFAL >BL;ML2465, ML.tab 2939177:2939743
forward MW:21174 MSGGTGTGPVGRIPPGSLRQLGPINWVIAKLAA-
SLLRTSEMHLFTILGQRQLLFWAWLIYGGRLLRGKLPRVDTELVILR (SEQ ID NO:290)
VAHLRTCEYELQHHRRMARKRGLDTKIQAMIPAWPDVPTGAGLSVRQQALLAATDEFVKDRKITSSSWQQLET-
HLDRRRL IEFCMLISQYDGLAATISSLDIPLDNSC >BL;ML2473, ML.tab
2948272:2948751 forward MW:17160 MPDGFGVAVVREEGQWRCSAMASKSL-
TSLTAAETELRELRSVGAVFGLLDVDDEFFIILRPAPSGTRLLLSDATAALDYD (SEQ ID
NO:291)
IAAEILDSLDAEIDPEDLEDAYPFEEGDLGLLSDVGLPEATLGVILDQTDLYADEQLGHIAREMGFAEQL-
SAVINRLGR >BL;ML2489, ML.tab 2962275:2963135 reverse MW:32001
MVPLWFTLSALCFVGAVVLLYVDIDRRRGRSRRRKSWARSHGFDYERESTEILQRWK-
RGVMSTVGDISAHNVVLGQIRGE (SEQ ID NO:292)
AVYIFDLEEVATVIALHRKVGTNIVVD-
LRLKGLKEPRESDIWLLGAIGPRMVYSTNLDAARRACDRRMVTFAHTAPDCAE
VMWNEQNWTLVSMPIGSTRVQWDEGLRTVRQFNDLLRVLPPLPADTSQEAGASARNAAPSRPLASVGRAELPP-
DRGVESD VAGLLGSGVQAGRSAEPISRDEGRWDGIRRPPSVERNGHQTTNYQY
>BL;ML2491, ML.tab 2966697:2967698 forward MW:36545
VTGPNSPNKTLQRFGISGTDLGIPWDNGDPTNHQVLMAFGDSFGYCSVKGQQRRYNILFRSSNQDLSHGIRIA-
DGVPNDK (SEQ ID NO:293)
YSGSPVWTTGLAKQVVNTIHRAPHETGIIPTTAISIGKTQYMN-
YMSIKKWGRDGEWTTNYSAIARSIDNGQSWGTYPGTI
RTASPDAIPGTHFVPGNQNFQMGAFMRGN-
DGYLYSFGTPSGRSGAAYLARVPQNLVPDLSKYQYWNGNWVPNNPGAATPL
FSGPVGEMSAQYNDYLKQYIILYSNGDSNDVVARTAPAPQGPWSPEQPLVSSFQMPGGIYAPMIHPWSSGRDL-
YFNLSLW SAYDVVLMHTVLP >BL;ML2518, ML.tab 2999816:3000208 reverse
MW:14353 VGEYSAFGFDPDDFDRLIKEGSEGLRDAFERIS-
RFVGGPGVRTAWSAIFEDLSRRARPAQETADEAGDGVWAIYTVTGDG (SEQ ID NO:294)
AARVEQVYATELDALRANKNNVDPKRKVRFLPYGIAVSVLDSHQESTQQL >BL;ML2522,
ML.tab 3004590:3005246 reverse MW:22735
MCRLIALLSAVVCAAWATLILAPIGAAAGAAWFANKVGNATQVVSVVSTGGSNAKMDIYQRTGTGWQPLKTGI-
PTYVGSA (SEQ ID NO:295)
GLVAQAKSGYPATPMGVYSLDSAFGTAPNPGSQLPYTQVGPNH-
WWSGDDHSPTFNTMQVCQKSHCRFNTAESENLQIPSY
KHAVVMGVNKAKVPGSGSAFFLHTTGGGP- TEGCVAIDDVTLVQILRWLRPGAVIAITK
>BL;ML2527, ML.tab 3009256:3010275 reverse MW:36773
VSAHRSRPAAMWPGSSRITLALLAVMPALMAYP-
WWFTRSYWLLGIVALVVVVLFGWWRGLYLTTILRRRLAMMWRRGRPV (SEQ ID NO:296)
SASGSATRTTVLLRLGPPVGGSDVIPLPLITRYLNCYGIRADSIRITSRNNESDGALCETWIGLTVSAAKNLA-
ALQARSS
RIPLQETAQVAARRLADHLREIGWEVSLAVPDGIPRLITAAGDETWRGMQQGSDYLTAY-
RVNVDDELPGTLDAVRLYPAR
ETWTALEIACPDSSSNRNTIAAACAFLTDTAPQGAAPLAGLTPQH-
GNHRPALAVLDLFSAQRLDGHTDTDADLLTRLRWP APAAGVSSCTATRSPAVSA
>BL;ML2529, ML.tab 3011710:3013167 reverse MW:49866
MVYVFAEEFCEGPVTSGAVMPIVRVAILALSRLIEMALPTELPLREILPAVKRLVVPAASDNDSPLAANASLH-
LSLAPIG (SEQ ID NO:297)
GAPFSLDASLDTVGVVDGDLLALQPVPVGPAAPGIVEDIADAA-
MIFSTARLQSWGPTHIRRGALAATTAVTFAATGLGVT
YRAVTGALTGLLVVIVIAVLIALGGLVLR-
SRAARTGLVLSIAALVPIGAVFALAVPGIFGPAQVLLASAGVTAWSLIALI
VPGPERVRIVAFFTATVVIGVAVMLEAGAALLWQLTPLTIGCGLILAALLVTVEAAQLSALWARFPLPVIPAP-
GDPTPSA
PSFQVLEDLPRRVRISSAHQSGFIAAATLLSMLGSVAIALRPEAVSSVGWYLVAATAVA-
ATLRARVWDSVACKAWLLAQP
YLVASVLLGLYTATGRYVAASAALLVLVVVVLAWAVVALNPRIAS-
SDSYSLPLRRLLGMVASGLDASLIPAMAYLVSLFS WVLNR >BL;ML2530, ML.tab
3013309:3014178 reverse MW:31518
MKVDPNAVELTVDHAWFIAEVIGAGSFPWVLAITTPYRDAGERSAFVERQVDELTRMGLLVAENSVDPTVADW-
IRVVCFP (SEQ ID NO:298)
DRWLDLRYLRSTSAVGDSELLRGMVAQRAGVSDKTVVALRSAQ-
LITFTAMDIDDPLRLVPILGAGLAQRPPARFDEFSMP
MRVGVRADERLRSGTSLAEVADYLGIPKS-
AQPVVESVFSGPRSYVEIVAGCRRDGKHATTEVGMSIVDTTTGRVLVNPSR
AFDGEWVSTFSPGTPFAIAVGIEQLTATLPEGQWFPGQRLCRDFSGQTS >BL;ML2531,
ML.tab 3014189:3014479 reverse MW:10392
MTQIMYNYPAMLDHAGNMSACAGALQGVGIDIAAEQAALQACWGGDTGISYQAWQVQWNQATEEMVRAYHAMA-
NTHQNNT (SEQ ID NO:299) LAMLTRDQAEAAKWGG >BL;ML2532, ML.tab
3014518:3014814 reverse MW:10191
VSLLDVHIPQLVASESAFAAKAALMRSQINQAECEAISAQAFHQGESSAAFQSAHAQFVTAAEKINALLDIAQ-
QHLGEAA (SEQ ID NO:300) ETYVATDATAASTYTTGL >BL;ML2534, ML.tab
3016103:3016411 reverse MW:10243
MTLRVVPEKLAATSEAMKALTARLEAAHAAAFPCLVAVVPPAADPVSLQTAAGFSARGQEHALVAAQGVEELG-
RAGIGVG (SEQ ID NO:301) QSSTHYAISDALAASTYGIVES >BL;ML2536,
ML.tab 3020432:3022090 reverse MW:57975
MTSNELPGEWSGERRSFFSRTPVNDNPDKVVYRRGFVTRHQVTGWRFVMRRIASGIALHDTRMLVDPLRTQSR-
SVLVGAL (SEQ ID NO:302)
LVITGLIGCFVFSFIRPNGAAGNNAVLADRSTAALYVRVGDEL-
HPVLNLTSARLIVGRSVNPITVKSSELDRFPRGNLIG
IPGAPERMVQNTTHDANWTVCDVVSGEGG-
HAAHSMGVTVIAGPPDSHGMRAAVLGSAHGVLVDAPSERGGSTWLLWDGKR
SEIDLADHAVTDALGFGVGFAEVPAPRPIGAOLFNAIPEAPPLKAPVIPNAGATPSFGVRAPIGAVVVSFGLA-
ALGKNPY
DSVRYYAVLPDGLQPISPVLAAILRNINSYGLQQPPRLGADEVDKLPVSRMLDTERYPE-
QQISLIDAGYAPVSCAYWSKP
AGAATSFLSLMSGAALPVPDAARAVELVSAPSRGDSSTASRVVLT-
PGTGYFAQTVGVGSAAPATASLFWVSDTGVRYGID
TEADARSEATAGPGKIVEALGLKLPAVPIPW-
SILSLFAVGPTLSRADALLEHDGLAPDTRAGRTTTAYGEHR >BL;ML2557, ML.tab
3049294:3049590 forward MW:10997 LYATTELAELHDLIGRMRRSVASFKA-
RYGDSPNRRIAIDADRILSDIELLDADISELDLARATVQQSNEKIAIPDTQYD (SEQ ID
NO:303) SDFWRDVDDEGVGGHSRS >BL;ML2566, ML.tab 3061369:3062271
forward MW:32889 LTFGFVLRRLRRHFSVKENTVNQPSGLKNILRA-
IVGALPLLPRTDQLPSRTVTIEELPIDHTNVSAYASVTGLRYGNHVP (SEQ ID NO:304)
LTYPFALTFPANMSLVTGFDFPFAANGSVHTENHITQYRPIAVTDVVGVQVHAENLREHRKGLLVDLVTDVSV-
GNDTAWH
QVTTFLHLQRTSLSDEPKPPSQKQPKLLPPSAVLQITPRQIRRYAAVGGDHNPIHTNPI-
VAKLFGFPTTIAHGMFSAAAV
LANIEARLPDAVHYSARFVKPVVLPATTGLYVDESAGNWDLTLRN- IAKGYPHLAGTVQGV
>BL;ML2569A, ML.tab 3065268:3065441 forward MW:5896
MSRIVAPAAASVVVGLLLGAATIFGMTLMVQQDTKPPLPGGDPQSSVLNR- VEYGNRT
>BL;ML2570, ML.tab 3065557:3069774 forward MW:147681
VAAMSRWWLVLVGVVAVALTFAQSPGQISPDTKLDLTTNPLRFLARATNLWNSDLP-
FGQVQNQAYGYLFPHGTFFLIGQL (SEQ ID NO:305)
LGSPGWITQRLWWALLLTAGFWGLLR-
VAETLSIGSPTSRAIGAVAFALSPRVLTTLGSISSETLPMMLSPWVLLPTILAL
QGAPGRSVRTRAAQAGLAVALMGAVNAIATLAGCLPAVIWWACHRPNRLWWRYTGWWLLALCLATLWWVVALV-
LLHGVSP
PFLDFIESSGVTTQWSSLVEMLRGTDSWTPFVAQTATAGTPLVTESVAILGTCLVAAAG-
LAGLASTGMPARGRLVTMLVI
GVVLLSAGYSGGLGSPLAQAVQAFLDSSGAALRNVHKLESVIRTP-
FALGIAGLLGRIPLPGSAPVLVWLSSFAHPERDKR
VAATVAVLTALLVSTSSAWTGRLTPPGTFSA-
IPQYWNDTSDWLSEHNTGIPTPGRVLVVPGAPFATQVWGTSHDEPLQVL
GNSPWGVRDSIPLTPPQAIRALDSVQRLFASGRPSVGLADTLARQGISYVVLRNDLDPDTSRSARPILVHRAI-
AGSPQLE
KVAQFGAPVGTNMLKGFVADSELRPWYPAVEIYRVAVSDGTNPGKPYFADTDQLPRIDG-
GPEVLLRLDERRRLLGQPALG
PALMTADAQFAGLPLPSRAEVTITDTPVARETDYGRVDQNSSAIR-
AVNDARHTFNRVPDYPVPGAEMVFGGWSGGRITAS
SSSSDATSMPDVAPATSPAAAIDGDPATSWV-
SNALQPAVGQWLQVDFDHPVNNAVITVTPSATAVGAQVRRIEIETVNGT
TNLRVDEAGKPLAVALPYGETPWVRITAAATDDGSSGVQFGITDLTITQYDASGFAHPVNLRHTALVPGSPPG-
WAVAGWD
LGSELLGRPGCAPAPDNVRCAASMTLAPEEPVNFSRTLTVPYPISVTAMLWVRPRQGPK-
LADLIAEPKTTRAYGEADTVD
ILGSAYAATDGNPATSWTAPQRVVQHKTPPTLTLVLPRPTEVNGL-
RLAPSRSALPARPTLVAVNLGNGPQVRELQAGEPQ
ALSLKPRITDTVTISLLDWHDVIDRNALGFD-
QLKPPGLAEVTVLGTDGNPTAPANASENRIREVTVDCDHGPIIAVAGRF
VHTSIRTTAAALLDGEPVAAVPCERAPIVLPAGQQELLISPGAAFIVDGAQLSTQDGTELPSARTISADTGKW-
GPSRREV
RAPGSATSQVLVMPDSINPGWVAHTSTGVRLMPVAVNGWQQGWLVPAGNPGTITLTFTA-
NSLYRPGLAAGLALLPLLALL
ALWGRRNERAADAAAQPWTPGAWSAVAVLSAGAVIAGAAGVVVVG-
AALSLRYALRHQQRWRNGLTVGLSAGGLVLAGAAL
SRQPWRSVDGYSGHSANVQLLALISLAALAA-
SVVSPRCGSTGVAT >BL;ML2581, ML.tab 3081538:3082821 forward
MW:46681 VNRAVILRFTACGIIGLGAAFLIAALLLATYTSSRITKIPLDIDATLVS-
EGNGTALDSSSLSSEHIIVNQNVPLVSQQQI (SEQ ID NO:306)
TVESPANVDVVTFQVGVSIRRTDKQKDTGLLLAVVDTVTLNRKTANAVSDDTHTGGSIQKPRGFTDENPPTAI-
PLRHDGL
SYRFPFHTEKKTYPYFDPVAQKTFDVNYQNQEDINGLTTYRFTQNVGYDADGKLVAPIT-
YPSLYASDEDGKITTTAJUWG
LSGDPSEQITMTRYYAAQRTFWVDPVSGTIVKETEHVNIYFARDT-
LKPEVTLADYKVTSTEETIESQVNSARDERDRLAL
WSRVLPITFTAVGLITLLSGGFLASFSLRTE-
SALTESGLDRANRDAFGHCRTEEPVPGAEAETEKLPTQRPELRDSSILS
VSAHRRRSSESSPPNSGPADPGHPERG >BL;ML2582, ML.tab 3082923:3084725
forward MW:61888 VSRELYHRSMSWSRPSYALGMALLVVGPLMRPG-
YLLLRDAVSTPRSYLSDAALGLTSAPRSTPQDFAVAMASHLVDGGIV (SEQ ID NO:307)
VKSALVLGLWLAGWGAARLVVTALPSAGVAGQFVASTLAVWNPYVAERLLQGHWSLLVGYGCLPWVAEAMLML-
RSSDNAS
RPGLLGFFALACWIALAGLTPTGLMLAATVALICVAVPVEGPGEPRPRWLCAAATLGSA-
LGAALPWLTASAVGTSLTAHT
VANTLGVTVFAPRAEPGLGTLASLASLGGIWNGEAVPTSRTTLFA-
VLSATVLLGVVVAGLPVAVRRPAVVPLLVLAAVAV
ATPAALATGPGIDMLKAVVNAVPGLGVLRDG-
QKWVALAVPGYSLAGAGAVVTLGRWLRPSRPLSPVVTALACCLALILAL
PDLAWGVWGKVQPVHYPSGWAAVAATINDRGEGPGWVAVLPAGTMRRFSWSGTAPVLDPLPRWVRDDVLTTGD-
LIISGVM
VAGEGNHARAAQDLLLSGPNPSALTAAGVAWLVVESDTAGDMGASARTLAALQPTYRDD-
AIGLYRIGGSNAKTAPSPYRG LLIAAHLTWLVILVMAVVGMQITRHTHFRDVSSVALLSRR
>BL;ML2595, ML.tab 3100808:3101356 forward MW:19619
LPESQVAADDSGVTLNRSRLGRGWLTGIAVALLLAGCGIGTGGYCMLRYHQDSQAMARNDNAALKTALDCVAA-
TQAPDTN (SEQ ID NO:308)
TMAASEQKIIDCGTDAFHAQALLYTNMLVQAYQAANVHVQVSD-
MRAAVERHNNDGSIDVLVALRVKLSNDRAHNQETGYR LRVKNALAEGQYKISKLDQVTK
>BL;ML2596, ML.tab 3101353:3102330 forward MW:35970
VTVVVAKSQTAAVIPEPLSNRLAPWHLRLVALAVDVLPGLVAVSTMTLVVFTVPLRSAWWWLCMAVGGIVILS-
MLVNRLL (SEQ ID NO:309)
LPTIIGWSLGRALCGISVIMRDGVAIGPWRLLLRDLTNLLDTA-
AVFAGWLWPLWDSRRRTFADILLRTEVRCVQAVERQR
TIRWWASVALLTAAGVSLGGASVSWAVVY-
SHDRAIDQTRSEIAIQGPKMVAQMLTYNPKSLRDDFTHAQSLASDKYRRQL
AAQQDVVKKGHPVINEYWPTAGAIQSATRDRATMLLFMQGRRGAAPGERYISATVRVSFAKGEHNHWLVDDLT-
VLTKPKT TGNGR >BL;ML2597, ML.tab 3102327:3102881 forward
MW:20340 MSPRRKFQAGEGLLLVSHTVASQRRWGLPLAATFAALVMVAAITASTLM-
SISHASRELAVAKDQQVLSYVKWFMTQFTTL (SEQ ID NO:310)
DPYHANDYVARILAQATGDFAKQYNEKVNEILLQVAQAEPATGTVLDAGVERWNDDGSANVLVATEVTSKYPD-
EKQVLEN TNRWTATATRECNQWKISNLLQVI >BL;ML2598, ML.tab
3102971:3103525 forward MW:19619 VAAAEGGGSWRNRRAGQRTAIAVVVA-
AVLFVGSAAFAGAAVQPYLADRATVAVKLEVARTAANAITVLWTYTPENMDTLA (SEQ ID
NO:311)
DRAATYLSGDFGAQYRKFVDAIVGPNKQAKITNSTEVTGVAVESLDASNAIAIVYTNTTSTSPLTKNIPA-
LKYLSYRLFM KRSAVRWLVTRMTTITSLDLTPQL >BL;ML2604, ML.tab
3108446:3109195 forward MW:26859 MTDNKMLARIAALLRQAEGTDNAHEA-
DAFMATAQRLATAASIDLAVARSHVANRSTAQAPTQRTITIGTAGTRGLRTYVQ (SEQ ID
NO:312)
LFVLIAAANDVRCDVASNSTFLYAYGFAEDIDATHALYASLVVQMVRESDAYLASGAYRPTPTITARLNF-
QLGFGMRVGQ
RLTEARDHIRSAVTEAWDRPTATAIALRDKEIELIDYYRSASKARGTWQAARASAG-
YSSAARNAGDQAGRRAWIDNSTEL PGARAALGR >BL;ML2605, ML.tab
3109192:3109698 forward MW:18602 MNLLDSVRDAQRSKVYAAEEFIRTLF-
DRAVEHGSPAVEFFGAQLSLPPEARFGSVAAVQRYVDDVLALQAVRQRWPRMLP (SEQ ID
NO:313)
LTVRARRAATAAHYENLDGAGVIAVPGNNADWAMRELVVLHEVAHHLCKDPPPHGPEFVATICALTELVM-
GPELGYVFRV VYAKEGVR >BL;ML2614, ML.tab 3120828:3121502 forward
MW:24144 VNWPKTPGTLAAMPDEEQTELPVHKEFAGIADY-
SDPGLSDGSVFSQYGIASTVLAVLSAAAVVFGVVIWRAHHDNSAERA (SEQ ID NO:314)
YLTHVMQTAFDWTGVLINMNTSNVDASLQRLHAGTVGELNTDFDAAVQPYRKVVEKLQTQSRGQIEAVAIESV-
HHDLDTQ
PGVAHPVVTTKLLPPAARTDSVMLVATSVSENVGGKPTTVHWSLRLDVSDVDGKLMISH- LESIR
>BL;ML2615, ML.tab 3121499:3122188 forward MW:24416
MRNRWRLLAFDVVAPLVAIAALVMIGVVLDWPRWWVSACSVLVLLIVEGVGVNFWLL-
RRDSVTIGTDDDAPGLRLAXIXISV (SEQ ID NO:315)
CTAALCAAVLIGYMHWTSPDRDFSL-
DSREVVQIATGMAEAFVIASFTPSAPTSSIDRAAAMIMPDQAGVFKEQYRKSSADLA
RRNVTAQASVLAAGVEAIGSSAASVAVILRVTQNTPGQPPSQAAPAVRVTLIKRGSDWLVTDVLSINAR
>BL;ML2616, ML.tab 3122267:3122779 reverse MW:18682
VTLADDHRRPAAPPEQPAADQGRYDPDQPVEFWSTAAIRSALHAGSIEIWKLITAAVKHDPYGRTAHQVEEVL-
EGTRPYG (SEQ ID NO:316)
ICKALGEVLQRARTHLEINERAEVARHVRLLIDRSGLGHQEFA-
SRIGVAPEGLASYLDGSTSPSAALMVRMRRVSDRFVK VKAARSANSD >BL;ML2621,
ML.tab 3131780:3132367 reverse MW:21956
MLIWDAPNLDMGLDAIVDHHHRNALERPCFDALGRWLFTCNTEVAVGYPDSTIGLKGTMFTNIAQASADVVRL-
WVDTLRN (SEQ ID NO:317)
VEFVIFVKPKIDEDSHMLGRIKGRYNEGLAVQVVVSAYSQALR-
QTLERTAHAVIDVQMIGFREHTSWALASAILEFADLE
DIAGVFRESLPRISLDSLPAQGEWCAPFP- GRWLRY >BL;ML2627, ML.tab
3140261:3141280 forward MW:36081
VTANGHAGRREGGPYFDDLSIGQVFDWAPAVTLTSGMAAVHQAILGDRMRLALDAEL-
STTVIGTHAMLAHPGLVDDVAIG (SEQ ID NO:318)
QSTLVTQRVKANLFYRGLTFHRFPVIG-
DTLYTSTEVVGLQANSAKPGRPPTGMAALRITTTDQHDRLVLDFYRCAMLPAS
AAWHPHDALNRNDDLANVGADAVASASDPTGQWDAAAFRERVPGPHFDAGITGAVLRSTGDIVSSAPDLARLT-
LNIASTH
HDSRVRGLRLVYGGHTIGLALAQAGRMLPNLATVLNWRSCDHTGPVHEGDTLYSELHVE-
SAEATEQGGVLGLRSLVYAVS AAGGSDHLVLDWRFTVLQF >BL;ML2629, ML.tab
3144572:3145042 reverse MW:17022
LGSVPGYASPMPVMSKTVEVRATAASIMAIVTDFEAYPQWNDGVKGVWVLARYDDGRPSQLRLDTEIQGTKCT-
YIQAVYY (SEQ ID NO:319)
PATNQIQTIMQQGDLFTKQEQLFSAVEIGAASLLTVDIDVESS-
MPVPAPMVKALLNNVLDNLAENLKLRAEQLAAN >BL;ML2630, ML.tab
3145226:3145597 reverse MW:12982 LLTDGVLLPELLFGYLNKCCLLPQLFDTAINTS-
VGVTSPNESRAFNAADDLIGDGSVERAGLHRATSVPGESPEGLQRGH (SEQ ID NO:320)
SPEPNDSPPWQRGSAQASQSGYRPSDPLTTTRQSNPAPGANVR >BL;ML2640, ML.tab
3159055:3159987 reverse MW:34454
MRTHDDTWDIKTSVGTTAVMVAAARAAETDRPDALIRDPYAKLLVTNTGAGALWEAMLDPSMVAKVEAIDAEA-
AAMVEHM (SEQ ID NO:321)
RSYQAVRTNFFDTYFNNAVIDGIRQFVILASGLDSRAYRLDWP-
TGTTVYEIDQPKVLAYKSTTLAEHGVTPTADRREVPI
DLRQDWPPALRSAGFDPSARTAWLAEGLL-
MYLPATAQDGLFTEIGGLSAVGSRIAVETSPLHGDEWREQMQLRFRRVSDA
LGFEQAVDVQELIYHDENRAVVADWLNRHGWRATAQSAPDEMRRVGRWGDGVPMADDKDAFAEFVTAHRL
>BL;ML2664, ML.tab 3190775:3191530 forward MW:26919
MYQAVRYLLVMAAIILMAVAESGSPSVAAIPALKPTPEVASVLPTNGAVVGVAHLVVVTFTAPVTDRSAAER-
SIRITSPN (SEQ ID NO:322)
NMTGHFEWLDGDVVQWIPTKYWPAYTHVSVEVQALTTGFETG-
DALLGVASLSTHTFTVSRNGEVLRTMPASMGKPTRPTP
IGKFTALSKERTVVMDSRTIGIPLNSPE-
GYLITAQYAVRVTWSGVYVHSAPWSVNSQGYTNVSHGCINLSPDDATWYFNT VNVGDPIEVVA
>BL;ML2687, ML.tab 3234968:3236662 reverse MW:61392
VINAQTHSTTISPRPLAADRQSADNRDCPSRTDYLGAALADAIGGPVGCHALIGRSWLMTPLRVMFLIGLVF-
LALGWSTK (SEQ ID NO:323)
AACLQTTGTGPGGQRVPNWDNQRAYYELCYSDIVPLYGTELL-
SQGKFPYKSSWIETDSSGTPRTRYDGRLAVRYMEYPVL
TGIYQYVSMAVAKSYTALSEPVSLPAVA-
EVVMFFDVVAFGLALAWLATIWATAGLAGLRIWDAALVAASPLVIFQVFTNF
DALAIAFATGGLLAWSRCRPISAGVLIGLGAAAKLYPLLFLVPLFVLGVRTGRLGGVACAAVTAATTWLLVNL-
PVLLLFP
RGWSEFFRFNTRRGDDMDSLYNVVKSLTGWRGFDTKLGFCELPLVLNTVVTVLFALCCA-
AVAYIALTAAQRPRVVQLAFL
LVAVFLLTNKVWSPQFSLWLVPLAVLALPHRRVLLAWMTIDALVW-
VPRMYYLYGNPSRSLPEQWFTATVLLRDIAVVALC
ALVIRQIYRPDEDPVRLGGRVDDPAGGPFDR-
APYAPPSWLPDWLHPAGMRRVVTLAASSVTETELAAAATPSGPMHHPHA PSSI
>BL;ML2689, ML.tab 3239183:3239599 reverse MW:15278
VVNYSLRRRFLLAEVYSGRTGVSEVCDANPYLLRAAKFHGKPSQVMCPICRKEQLTLVSWVFGNQLGAISGSA-
RTAEELV (SEQ ID NO:324)
LLATRYEEFAVYVVEVCRTCSWNYLVRSYVLGAARSAPPPRGT- PVTRTACNGARMAIE
>BL;ML2699, ML.tab 3250667:3253060 forward MW:84667
VTASRLRLAGSLSIALVVDIVASFAVLLVAPTATPHAAADEPRATSFVR-
VRIDKVTPDVVTTSSEPVVTVSGVVTNIGDR (SEQ ID NO:325)
PVRDLMVRLEHESAVISSAVLRTYLDDGADQFQTAADFVTVAEELQRGQEAGFTLVAPIRSTTKPSMAIDQPG-
IYPVLVN
VNGTPDYGTPARLDNARFLLPVAGVPPAKSDAMDSAVAPDITKPVWITMLWPLADRPRL-
SPGAPGGTIPVRLVDDDLASS
LAPGGRLDILLTAAETATGRDVDPDGAVSRALCLAVDPDLLVTVN-
AMTGGYIVSNSPDGPAQQPGTPTHPGTGQDAAVIW
LNRLRALAHRMCVASLPYAQADLDALQRIND-
TELSTTATTSVGDIVDHILDVTSIRGVTMLPDSPLTNRVVDLLNDNNST
VAIAAAAFSAQDSTSGSLVDIDTEPRRLSPRVVVAPFDPAVGAALAAAGTDPIVPTYLDSSLNIRIVHDSDTA-
RRQDALS
SILWRALERDAAPRSQILVPPTSWHLQADDARVMLTTLSTVIRSGLAVARPLPTVIADA-
LARTKLSDTVGSYTSARGRFN
DDIIADIASQVGRLWGLTSALTADGRTGLTGVQYTAPLREDMLRA-
LSQLEPPATRNGLAQQRLAVVSKTIKDLIGAVTIV
NPGGSYTLATEHSPLPLALHNGLAVPIRVRL-
QVDAPPGMTVTDVSQIELPPGYLPLRVPIEVNFTQRVAVDVALQTPEGI
QLGEPVRLLVHSNAYGKVLFEITLTAATILIVLAGRRLWHRFRIQTEGADSNRPDPLIVDAHPQHQYDDWVDE-
ENRI >BL;PE, ML.tab 654129:654437 forward MW:10295
MTLGVIPEGLEGASAVIEALTAHLATVHAEAAPFIMEVIPPGSGSVSVQNQVGFNVHGCQYVAMTAHGAEE-
LGRWGVGVA (SEQ ID NO:326) ESGVSYALRDAFAVASYLGGGL >BL;desA2,
ML.tab 2339270:2340097 reverse MW:31139
MAQKPVPNALILQLEPVVKDNMARHFANEELWFAHDYVPFDRGENFAFLGGRDWDPSQATLAKAVTDACEILL-
ILKDNLA (SEQ ID NO:327)
GYHRELVEHFILEGWWGRWLGRWTAEEHLHAIALREYLVVTRE-
VDPVANEQVRVEHVMKGYRVNSYTQIETLVYMAFLER
SYAFFCGSLAAQIKEPALFGLINQIVKDE-
VRHEEFFANLVAHCLECNRDETVAAIAARAAGLDVLGADIDAYHDKVENIS
AAGIFGSVELRQVISDRITAWGLINEPQLAQFVTS >BL;embA, ML.tab
139821:143156 reverse MW:118457 VPHDGHEPPQRIIRLIAVGAGITGLLLCAVVPLL-
PVKQTTATIRWPQSATRDGWVTQITAPLVSGTPRALDISIPCSAMA (SEQ ID NO:328)
TLPDSVGLVVSTLPSGGVDTGKSGLFVRANKNAVVVAFRDSVAAVAPRPAVAAGNCSVLHIWANTRGAGANFV-
GIPGAAG
ILTAEKKPQVGGIFTDLKVPVQPGLSAHIDIDTRFITAPTAIKKIAVGVGAAAVLIAIL-
ALSALDRRNRNGHRLINWRVS
MAWLAQWRVILATPPRAGGASRIADGGVLATLLLWHIIGATSSDD-
GYNLTVARVSSEAGYLANYYRYFGATEAPFDWYFT
VLAKLASVSTAGVWMRIPATLAGIACWLIIN-
HWVLRRLGPGTGGLSTNRVAVLTAGAMFLAAWLPFNNGLRPEPLIALGV
LFTWVLVERAIALRRLASAATAAVVAILTATLAPQGLIAIAALLTGARAITQTIRRRRTTDGLLAPLLVLAAS-
LSLITLV
VFHSQTLATVGESARIKYKVGPTIACYQDFLRYYFLTVESNADGSMTRRFPVLVLLLCM-
FGVLVVLLRRSRVPGLASGPT
WRLIGTTATSLLLLTFTPTKWAIQFGALAGLTGTFGAIAAFAFAR-
ISLHTRRNLTVYITALLFVLAWATAGINGWFGVSN
YGVPWFDIQPVIAGHPVTSIFLTLSILTGLL-
AGGQHFRLDYAKHTEVKDTRRNRFLATTPLVVVATTMVLCEVGSLAKGA
VARYPLYTTAKANLAALRSGLAPSVCAMADDVLTEPDPNAGMLQPVPGQIFGPTGPLGGMNPIGFKPEGVNDD-
LKSDPVV
SKPGLVNSDASPNKPNVTFSDSAGTAGGKGPVGVNGSHVALPFGLDPDRTPVMGSYGEN-
TLAASATSAWYQLPLHWKESI
ADRPLVVVSAAGAIWSYKEDGNFIYGQSLKLQWGVTRPDGIIQPL-
AQVMPIDIGPQPAWRNLRFPLTWAPPEANVARVVA
YDPNLSPDQWLAFTPPRVPVLQTLQQLLGSQ-
TPVLMDIATAANFPCQRPFSEHLGIAELPQYRILPDHKQTAASSNLWQS
SEAGGPFLFLQALLRTSTISTYLRDDWYRDWGSVEQYYRLVPADQAPEAVVKQGMITVPGWIRRGPIRALP
>BL;embB, ML.tab 136573:139824 reverse MW:117159
MSVIYRAHRVAIANRTASRNVRVARWVAAIAGLIGFVSSVVTPLLPVVQTTATLNWPQNGQLNSVTAPLISLT-
PVDITAT (SEQ ID NO:329)
VPCAVVAALPPSGGVVLGTAPKQGKDANLNALFIDVNSQRVDV-
TDRNVVILSVPRNQVAGDAGAPGCSSIEVTSTHAGTF
ATFVGVTDSAGNPLRGGFPDPNLRPQIVG-
VFTDLTGGAPSGLRLSATIDTRFSSTPTTLKRFAMMLAIITTVGALVALWR
LDQLDGRRMRRLIPARWSMFTLVDVAVIFGFLLWHVIGANSSDDGYQMQMARTADHSGYMANYFRWFGSPEDP-
FGWYYNL
LALMIHVSDASMWIRLPDLICGVACWLLLSREVLPRLGPAIVGFKPALWAAGLVLLAAW-
MPFNNGLRPEGQIALGALITY
VLIERAITYGRMTPVALATLTAAFTIGIQPTGLIAVAALLAGGRP-
MLYILVRRHRAVGAWPLVAPLLAAGTVVLTVVFAE
QTLSTVLEATKVRTAIGPAQAWYTENLRYYY-
LILPTVDGSLSRRFGFLITALCLFTAVLITLRRKQIPGVARGPAWRLIG
TILGTMFFLTFAPTKWVHHFGLFAALGAAVAALTTVLVSHEVLRWSRNRMAFLAALLFVMTLCFATTNGWWYV-
SSYGVPF
NSAMPRIDGITFSTIFFILFAIVALYAYYLHFTNTGHGEGRLIRTLTVSFWAPIPFAAG-
LMTLVFIGSMVAGIVRQYPTY
SNGWANIRALTGGCGLADDVLVEPDSNAGYMTALPSNYGPLGPLG-
OVNAIGFTANGVPEHTVAEAIRITPNQPGTDYDWE
APTKLKAPGINGSVVPLPYGLNPNKVPIAGT-
YTTGAQQQSRLTSAWYQLPKPDDRHPLVVVTAAGKITGNSVLHGHTYGQ
TVVLEYGDPGPNGGLVPAGRLVPDDLYGEQPKAWRNLRFARSQMPFDAVAVRVVAENLSLTPEDWIAVTPPRV-
PELRSLQ
EYVGSSQPVLLDWEVGLAFPCQQPMLHANGVTDIPKFRITPDYSAKKIDTDTWEDGANG-
GLLGITDLLLRAHVMSTYLAR DWGRDWGSLRKFDPLVDTHPAQLDLDTATRSGWWSPGKIRIKP
>BL;embC, ML.tab 144115:147327 reverse MW:114723
VSGAGANYWIARLLAVIAGLLGALLAMATPFLPVNQNTAQLNWPQNSTFESVEAPLIGYVATGLNVTVPCAAA-
AGLTGPQ (SEQ ID NO:330)
SAGQTVLLSTVPKQAPKAVDRGLLIQRANDDLVLVVRNVPVVS-
APMSQVLSPACQRLTFAAYFDKITAEFVGLTYGPNAE
HPGVPLRGERSGYDFRPQIVGVFTDLSGP-
IPTGLNFSATIDTRYSSSPTLLKTIAMILGVVLTIVALVALHLLDTADGTQ
HRRLLPSRWWSIGCLDGLVITILAWWHFVGANTSDDGYILTMARVSEHAGYMANYYRWFGTPEAPFGWYYDLL-
ALWAHVT
TTSAWMRVPTLAMALTCWWLISREVIPRLGHAAKASRAAAWTAAGMFLAVWLPLDNGLR-
PEPIIALGILLTWCSVERAVA
TSRLLPVAVACIVGALTLFSGPTGIASIGALLVAVGPLLTILQRR-
SKQFGAVPLVAPILAASTVTAILIFRDQTFAGESQ
ASLLKRAVGPSLKWFDEHIRYERLFMASPDG-
SVARRFAVLALLVALSVAVAMSLRKGRIPGLAAGPSRRIIGITVTSFLA
MMFTPTKWTHHFGVFAGLAGSLGALAAVAVASAALRSRRNRTVFAAVVLFVVALSFASVNGWWYVSNFGVPWS-
NSFPKLR
WSLTTALLELTVIVLLLAAWFHFVATTNGSAKTRFGVRIDRIVQSPIAIATWSLVIFEV-
ASLTMAMIGQYPAWTVGKSNL
QALTGQTCGLAEEVLVEQDPNAGMLLPVSTPVADALGSSLAEAFT-
ANGIPADVSADPVMEPPGDRSFVKENGMTTGGEAG
NEGGTNATPGINGSRAQLPYNLDPARTPVLG-
SWQSGIQVVARLRSGWYRLPARDKAGPLLVVSAAGRFDHHEVKLQWATD
SGAASGQPGGAFQFSDVGASPAWRNLRLPLSAIPSMATQIRLVADDEDLAPQHWIALTPPRIPQLRTLQDVVG-
YQDPVFL
DWLVGLAFPCQRPFDHQYGVDETPKWRILPDRFGAEANSPVMDNNGGGPLGVTELLLKA-
TTVASYLKDDWSRDWGALQRL TPYYPNAQPARLSLGTTTRSGLWNPAPLRH >BL;lppS,
ML.tab 524929:526143 forward MW:43242
VGTATRRVQPKAWRALLTLLVISVVMPGVACNRGGGNVPVNVIGDKGTPFADLLVPKLTASVTDGAVGVNVDM-
PVTVTVA (SEQ ID NO:331)
DGVLAAVTMINDNGRMINGQFSPDGLRWSTTEPLGFNRCYTLS-
AKALGLGGVVNRQMTFQTSSPAHLTMPYVNPGNGEIV
GIGEPVAIRFDENIANRLAAQKAITITAN-
PPVEGAFYWLNNREVRWRPEHFWKSGTAVDVAVNTYGVDLGEGMFGEDNVK
THFTIGDEVFATADDATKMLTVRVNGEVVKIMPTSMGKDSTPTANGIYIVGARFKHIIMDSSTYGVPVNSPNG-
YRADVDW
ATQLSYSGVFLHSAPWSVGAQGHTNTSHGCLNVSPSNAQWFYDHVKRGDIVEVVNTVGD-
TLPGAEGLGDWNIPWEQWKAG NANI >BL;1ppX, ML.tab 188551:189252
forward MW:24411 MNDRKWVTSSVMLVTLSACLALGLSGCSSTKPDAQ-
EQSSSSSPASSDPALTAEIKQSLETTKALSSVHVVVQTTGKVDAL (SEQ ID NO:332)
LGISNADVDVQANPLAVKGTCTYNDQPGVPFRVLGDNISVKLFDDWSNLGSISDLSTSHVLDPNTGITQVLSG-
VINLQAQ
GTEVVDRIPTNKITGTVPTSSVKMLDPKAKGSKLATVWIAQDGSHHLVRASIDLGSGSI-
QLTQSKWNEPVNTN >BL;lpqB, ML.tab 918368:920137 forward MW:61768
VMRGVLVIMRLLCLGMLFTGCAGVPNSSAPQAIGTVERPVPSNLPKPTPGMDPDVLL-
REFFKATADPANRHLAARQFLTQ (SEQ ID NO:333)
SASNAWDDAGRALLIDHVVFVETRGAE-
RVSATMRADILGSLSDMGVFETAEGVLPDPGPVELIKTSGCWRIDRLPNGVFL
DWQQFQATYKRNTLYFADPTGKTVVPDPRYVAVLGHDQLATELVSKLLAGPRPEMAHAVRNLLAPPLRLRGPV-
TRADGSK
SGIGRGYGGARIDLEKLSTTDPHSRQLLAAQIIWTLARADIRGPYVINADGAPLDDRFA-
DGWTTSDVAATDPGVADGAGA
GLHALVGGALVSLIGQNTTTVLGAFGRMGYQTGAALSRSGRQVAS-
VVTLRRGAPDMAASLWIGDLGGEAVQSADGHSLSR
PSWSLDDAVWVVVDTNNVLRAIPEPASGQPA-
RIPVDSAAVASRFPGPITDLQLSRDGTRAAMVIGGQVILAGVEQTQAGQ
FALTYPRRLGFGLGSSVVSLSWRTGDDIVVTRTDATHPVSYVNLDGVNSDAPARGLQVPLSVIAANPSTVYVA-
GPQGVLQ YSASVAESQQGWSEVAGLTVMGAEPVLPG >BL;lpqE, ML.tab
409442:409993 reverse MW:19242 VSRFKISLPALATRVAVLGFLTLMASVL-
GGCGAGQISQTATQEPAVNGNRVTLNNLALRDIRIQAAQTGDFLQSGRTVDL (SEQ ID
NO:334)
MLVAINNSPYVTDRLVSITSDIGTVALNGYTQLPTNGMLFIGTSEGQRIKPPPLQSNNIAKAIVTLAKPITN-
GLTYNFTF NFEKAGQANVAVPVSAGLAPRQT >BL;lpqT, ML.tab 322399:323055
reverse MW:23455 MQAIRLGLHTAAAVVTLSISAVSCGTKT-
PDYQLILSKSSTTTTTTPDKPIPLPQYLESIGVTGQQVAPSSLPGLTVSIPT (SEQ ID
NO:335)
PPGWSPYSNPNITPETLIIAKSGKYPTARLVAFKLRGDFDPTQVIKHGNDDAQLFENFRQLDVSTANYNGFP-
SAMIQGSY DLEGRRLHAWNRIVIPTGPPPSKQQYLVQLTITSLANEAVAQSNDIEAIIRGFVVAPK
>BL;lprG, ML.tab 674679:675395 reverse MW:24874
MQAPKHHRRLFAVLATLNTATAVIAGCSSGSNLSSGPLPDATTWVKQATDITKNVTSAHLVLSVNGKITGLPV-
KTLTGDL (SEQ ID NO:336)
TTHPNTVASGNATITLDGADLNANFVVVDGELYATLTPSKWSD-
FGKASDIYDVASILNPDAGLANVLANFTGAKTEGRDS
INGQSAVRISGNVSADAVNKIAPPFNATQ-
PMPATVWIQETGDHQLAQIRIDNKSSGNSVQMTLSNWDEPVQVTKPQVS >BL;lsr2,
ML.tab 305368:305706 forward MW:12164
MAKKVTVTLVDDFDGAGAADETVEFGLDGVTYEIDLTNKNAAKLRGDLRQWVSAGRRVGGRRRGRSNSGRGRG-
AIDREQS (SEQ ID NO:337) AAIREWARRNGHNVSTRGRIPADVIDAFHAAT
>BL;mihF, ML.tab 656895:657212 forward MW:11474
VALPQLTDEQRAAALEKAAAARRARAELKDRLKRGGTNLTQVLKDAESDEVLGKMKVSALLEALPKVGKVKAQ-
EIMTELD (SEQ ID NO:338) IAPTRRLRGLGERQRKALLEKFGSA >BL;mmpS3,
ML.tab 1041502:1042383 forward MW:30891
MSGPNPPGRENEESDSGNELSGELDPHNGVESVDELVPVPDSDLVTASDHTSETEVYSQAYSAPEAEHFTAVP-
YVPADLR (SEQ ID NO:339)
LYDYDESSVYDEPGAAPRWPWVVGVAAILAAISLVVSVSLLFT-
RTDTSKLSTPTTGRSTPPVQDEITTVKPPPPSTETST
ATETQTVTVTPLPPPSATSTAVPPSSVVP-
PPPTTPTTTVTTLTGPRQVTYSVTGTKAPGDIISVTYVDASCRRRTQHMVY
IPWSMTVTPISQSDVGSVEAFSLFRVSKLNCLITTSDGTVLSSNSNDAPQTSC >BL;mtb12,
ML.tab 753532:754035 forward MW:17130
MTMKSIATYAALAIIGAAVDGLTSMAIPTGPAASHIQPVAFGVPLPQDPAPAADVPTAAELTSLLNKIVDPDV-
SFMHKSQ (SEQ ID NO:340)
LVEGGIGSAEAHIGDRELKNAAQKGELPLLFSVTNIRPGTSGS-
ATADVSVSGPKLNPPVTQNITFINKGSWVLSRHSAME LLQAAGR >BL;whiB1, ML.tab
953665:953919 reverse MW:9318
MDWRHKAVCRDEDPELFFPVGNSGPAIAQIADAKLVCNRCPVTTECLAWALNTGQDSGVWGGMSEDERRALKR-
RNTRTKA (SEQ ID NO:341) RSGV >BL;whiB2, ML.tab 903227:903496
forward MW:10119 VVPKALVAFEVESEPESSDQWQDRALCAQTDPEAF-
FPEKGGSTREAKKICLGCEVRHECLEYALAHDERFGIWGGLSERE (SEQ ID NO:342)
RRRLKRGVI >BL;whiB3, ML.tab 475771:476079 reverse MW:11576
MPQPKQLPGPNATIWNWQLQGLCRGVDSSMFFHPDGERGRARMQREQRAKEMCRRCP-
VIEECRAHALDVGEPYGVWGGLS (SEQ ID NO:343) ESERDLLLKGDLARSRSIPRSA
M. tuberculosis proteins that are potential targets for the diagnosis,
prophylaxis or treatment of mycobacterioses.
>BL;RV0007,
H37RV2.tab 9914:10825 forward MW:31041
VTAPNEPGALSKGDGPNADGLVDRGGAHRAATGPGRIPDAGDPPPWQRAATRQSQAGHRQPPPVSHPEGRPTN-
PPAAADA (SEQ ID NO:344)
RLNRFISGASAPVTGPAAAVRTPQPDPDASLGCGDGSPAEAYA-
SELPDLSGPTPRAPQRNPAPARPAEGGAGSRGDSAAG
SSGGRSITAESRDARVQLSARRSRGPVRA-
SMQIRRIDPWSTLKVSLLLSVALFFVWMITVAPLYLVLGGMGVWAKLNSNV
GDLLNNASGSSAELVSSGTIFGGAFLIGLVNIVLMTALATIGAFVYNLITDLIGGIEVTLADRD
>BL;Rv0010C, H37RV2.tab 13136:13558 reverse MW:15166
MQQTAWAPRTSGIAGCGAGGVVMAIASVTLVTDTPGRVLTGVAALGLILFASATWRARPRLAITPDGLAIRGW-
FRTQLLR (SEQ ID NO:345)
NSNIKIIRIDEFRRYGRLVRLLEIETVSGGLLILSRWDLGTDP- VEVLDALTAAGYAGRGQR
>BL;Rv0011c, H37RV2.tab 13717:13995
reverse MW:10429 MPKSKVRKKNDFTVSAVSRTPMKVKVGPSSVWFVSLFIGLMLIGLIWLM-
VFQLAAIGSQAPTALNWMAQLGPWNYAIAFA (SEQ ID NO:346) FMITGLLLTMRWH
>BL;RV0020C, H37RV2.tab 23864:25444 reverse MW:56880
MGSQKRLVQRVERKLEQTVGDAFARIFGGSIVPQEVEALLRREAADGIQSLQGNRLLAPNEYIITLGVHDFEK-
LGADPEL (SEQ ID NO:347)
KSTGFARDLADYIQEQGWQTYGDVVVRFEQSSNLHTGQFRARG-
TVNPDVETHPPVIDCARPQSNHAFGAEPGVAPMSDNS
SYRGGQGQGRPDEYYDDRYARPQEDPRGG-
PDPQGGSDPRGGYPPETGGYPPQPGYPRPRHPDQGDYPEQIGYPDQGGYPE
QRGYPEQRGYPDQRGYQDQGRGYPDQGQGGYPPPYEQRPPVSPGPAAGYGAPGYDQGYRQSGGYGPSPGGGQP-
GYGGYGE
YGRGPARHEEGSYVPSGPPGPPEQRPAYPDQGGYDQGYQQGATTYGRQDYGGGADYTRY-
TESPRVPGYAPQGGGYAEPAG
RDYDYGQSGAPDYGQPAPGGYSGYGQGGYGSAGTSVTLQLDDGSG-
RTYQLREGSNIIGRGQDAQFRLPDTGVSRRHLEIR
WDGQVALLADLNSTNGTTVNNAPVQEWQLAD- GDVIRLGHSEIIVRMH >BL;Rv0039C,
H37RV2.tab 42007:42351 reverse MW:11292
MFLAGVLCMCAAAASALFGSWSLCHTPTADPTALALRAMAPTQLAAAVM-
LAAGGVVAVAAPGHTALMVVIVCIAGAVGTL (SEQ ID NO:348)
AAGSWQSAQYALRRETASPTANCVGSCAVCTQACH >BL;Rv0040c, H37RV2.tab
42434:43365 reverse MW:31923 MIQIARTWRVFAGGMATGFIGVVLVT-
AGKASADPLLPPPPIPAPVSAPATVPPVQNLTALPGGSSNRFSPAPAPAPIASP (SEQ ID
NO:349)
IPVGAPGSTAVPPLPPPVTPAISGTLRDHLREKGVKLEAQRPHGFKALDITLPMPPRWTQVPDPNVPDAF-
VVIADRLGNS
VYTSNAQLVVYRLIGDFDPAEAITHGYIDSQKLLAWQTTNASMANFDGFPSSIIEG-
TYRENDMTLNTSRRHVIATSGADK
YLVSLSVTTALSQAVTDGPATDAIVNGFQVVAHAAPAQAPAP-
APGSAPVGLPGQAPGYPPAGTLTPVPPR >BL;Rv0049, H37RV2.tab 52831:53241
forward MW:15000 VDYTLRRRSLLAEVYSGRTGVSEVCDANPYLLRAAKF-
HGKPSRVICPICRKEQLTLVSWVFGEHLGAVSGSARTAEELIL (SEQ ID NO:350)
LATRFSEFAVHVVEVCRTCSWNHLVKSYVLGAARPARPPRGSGGTRTARNGARTASE
>BL;Rv0051, H37RV2.tab 55696:57375 forward MW:61209
VTGALSQSSNISPLPLAADLRSADNRDCPSRTDVLGAALANVVGGPVGRHALIGRTRLMTPLRVMFAIALVFL-
ALGWSTK (SEQ ID NO:351)
AACLQSTGTGPGDQRVANWDNQRAYYQLCYSDTVPLYGAELLS-
QGKFPYKSSWIETDSNGTPQLRYDGQIAVRYMEYPVL
TGIYQYLSMAIAKTYTALSKVAPLPVVAE-
VVMFFNVAAFGLALAWLTTVWATSGLAGRRIWDAALVAASPLVIFQIFTNF
DALATGLATSGLLAWARRRPVLAGVLIGLGSAAKLYPLLFLYPLLLLGIRAGRLNALARTMAAAAATWLLVNL-
PVMLLFP
RGWSEFFRLNTRRGDDMDSLYNVVKSFTGWRGFDPTLGFWEPPLVLNTVVTLLFVLCCA-
AIAYIALTAPHRPRVAQLTFL
TVASFLLVNKVWSPQFSLWLVPLAVLALPHRRILLAWMTIDALVW-
VPRMYYLYGNPSRSLPEQWFTTTVLLRDIAVMVLC
GLVWQIYRPGRDLVRTGGPGALPACGGVDDP-
VGGVFANAAPPGRLPSWLRPRLGDEHRERTPDAGRDRTFSGQHRA >BL;RV0093C,
H37RV2.tab 102818:103663 reverse MW:29599
VLAQATTAGSFNHHASTVLQGCRGVPAAMWSEPAGAIRRHCATIDGMDCEVAREALSARLDGERAPVPSARVD-
EHLGECS (SEQ ID NO:352)
ACRAWFTQVASQAGDLRRLAESRPVVPPVGRLGIRRAPRRQHS-
PMTWRRWALLCVGIAQIALGTVQGFGLDVGLTHQHPT
GAGTHLLNESTSWSIALGVIMVGAALWPS-
AAAGLAGVLTAFVAILTGYVIVDALSGAVSTTRILTHLPVVIGAVLAIMVW
RSASGPRPRPDAVAAEPDIVLPDNASRGRRRGHLWPTDGSAA >BL;RV0098,
H37RV2.tab 107600:108148 forward MW:20528 MSHTDLTPCTRVLASSGTVPIAEE-
LLARVLEPYSCKGCRYLIDAQYSATEDSVLAYGNFTIGESAYIRSTGHFNAVELIL (SEQ ID
NO:353)
CFNQLAYSAFAPAVLNEEIRVLRGWSIDDYCQHQLSSMLIRKASSRFRKPLNPQKFSARLLCRDLQVI-
ERTWRYLKVPCV IEFWDENGGAASGEIELAALNIP >BL;Rv0100, H37RV2.tab
109783:110016 forward MW:8660 VRDRILAAVCDVLYIDEADLIDGDE-
TDLRDLGLDSVRFVLLMKQLGVNRQSELPSRLAANPSIAGWLRELEAVCTEFG (SEQ ID
NO:354) >BL;Rv0116c, H37Rv2.tab 140270:141022 reverse MW:26915
MRRVVRYLSVVVAITLMLTAESVSIATAAVPPLQPIPGVASVSPANGAVVGVAHPVVVTFTTPVTDRRAVE-
RSIRISTPH (SEQ ID NO:355)
NTTGHFEWVASNVVRWVPHRYWPPHTRVSVGVQELTEGFET-
GDALIGVASISAHTFTVSRNGEVLRTMPASLGKPSRPTP
IGSFHAMSKERTVVMDSRTIGIPLNSS-
DGYLLTAHYAVRVTWSGVYVHSAPWSVNSQGYANVSHGCINLSPDNAAWYFDA VTVGDPIEVVG
>BL;Rv0146, H37Rv2.tab 172211:173140 forward MW:34016
MRTHDDTWDIKTSVGATAVMVAAARAVETDRPDPLIRDPYARLLVTNAGAGAIWEAMLDPTLVAKAAAI-
DAETAAIVAYL (SEQ ID NO:356)
RSYQAVRTNFFDTYFASAVAAGIRQVVILASGLDSRAYR-
LDWPAGTIVYEIDQPKVLSYKSTTLAENGVTPSAGRREVPA
DLRQDWPAALRDAGFDPTARTAWLA-
EGLLMYLPAEAQDRLFTQVGAVSVAGSRIAAETAPVHGEERRAEMRARFKKVADV
LGIEQTIDVQELVYHDQDRASVADWLTDHGWRARSQRAPDEMRRVGRWVEGVPMADDPTAFAEFVTAERL
>BL;Rv0164, H37Rv2.tab 193626:194180 forward MW:20165
MTAISCSPRPRYASRMPVLSKTVEVTADAASIMAIVADIERYPEWNEGVKGAWVLARYDDGRPSQVRLDT-
AVQGIEGTYI (SEQ ID NO:357)
HAVYYPGENQIQTVMQQGELFAKQEQLFSVVATGAASLLT-
VDMDVQVTMPVPEPMVKNLLNNVLEHLAENLKQPAEQLAA S#GMCGLSRRLRSQPGPPSACVPHR
>BL;Rv0175, H37Rv2.tab 206814:207452 forward MW:22324
VKAADSAESDAGADQTGPQVKAADSAESDAGELGEDACPEQALVERRPSRLRRGWLVGIAATLLAL-
AGGLGAAGYFALRS (SEQ ID NO:358)
HQESQSIAREDLAAIEAAKDCVAATQAPDAGANSAS-
MQKIIECGTGDFGAQASLYTSMLVEAYQAASVHVQVTDMRAAVE
RNNNDGSVDVLVALRVKVSNTDSDAHEVGYRLRVRMALDEGRYKIAKLDQVTK
>BL;Rv0176, H37Rv2.tab 207452:208417 forward MW:35405
VTVVVEKTPTTLPQATPNGAAPWHVRAGAFAIDVLPGLAVAATMALTALTVPPGSAWRWLCACLLGLTILLLA-
VNRLLLP (SEQ ID NO:359)
TITGWSLGRALTGIRVVRRDGSAIGPWRLLVRDLAHLVDTLSL-
FVGWLWPLWDSRRRTFADLLLRTEVRRVEPVQRPAVI
RRLTAAVALAAAGACASATAVGAAVVYVN-
EWQTDHTRAQLATRGPKLVVDVLSYDPETVQRDFERARSLATDRYRPQLSI
QQDSVRESGPVRNQYWVTDSAVLSATPAQATMLLFMQGERGTPPNQRYIQSTVRAIFQKSRGQWRLDDLAVVM-
KPRQPTG EK >BL;Rv0177, H37Rv2.tab 208417:208968 forward MW:20164
MSPRRKFEPGEGALLAPQSIEPSRRWGLPLALTASAVVMAAAISACALM-
RISHESHQRAAHKDIVMLSDVRSFMTMFTSP (SEQ ID NO:360)
DPFHANEYAERVLSHATGDFAKQYHERANDILIRISGVEPTTGTVLDAGVQRWNEDGSANVLVVTQITSKSAD-
GKRVVSN ANRWLVTAKQEGNEWKISSLLPVI >BL;Rv0178, H37Rv2.tab
208938:209669 forward MW:25879 VEDQQSASGDLTQKSVANGESTDT-
ASAATEGHRGEIDAAGEPDERGAAVADSQADEDDSAATAARGGKTRARRSRGRRLA (SEQ ID
NO:361)
ITVGVAAALFVGSAAFAGATVEPYLSERAVVATKLMVARTAANAITTLWTYTPENMDTLADRAANYLS-
GDFAAQYRRFVD
QIAAANKQAXITNDTEVTGAAVESLSGRDAVAIVYTNTTTTSPVTKNIPALKYL-
SYRLFMKRYDARWLVTRMTTITSLDL TPQV >BL;Rv0184, H37Rv2.tab
214969:215715 forward MW:26826 MTNDKMLARIAALLRQAEGTDNPH-
EADAFMSTAQRLATAASIDLAVARSHAGNRSPAQAPTQRTITIGAAGTRGLRTYVQ (SEQ ID
NO:362)
LFVLIAAANDVRCDVASNSTFVYAYGFAEDIDTSHALYASLVVQMVRASDAYLASGAHRPTPTITARL-
NFQLAFGARVGQ
RLADAREQTRQEATKDRDRPPGTAIALRDKDIELHEYYRRSSKARGAWRASRAT-
AGYSSAARRAGDRAGRQARLGNNPEL PGARAALGR >BL;Rv0185, H37Rv2.tab
215715:216221 forward MW:18365 VIGADVPRDSQRARVYAAEAFVRT-
LFDRVTAHGSPTVEFFGTQLTLPPEGRFGSVASVQRYVDDVLALPAVGQNWPTVSP (SEQ ID
NO:363)
VRVRARRAATAAHYENHGGTGTIAVPDRHTAGWANRELVVLHEVAHHLCQVPPPHGPEFVATVCTLTE-
LVMGPEVGHVFR VVYAQEGVR >BL;Rv0199, H37Rv2.tab 236550:237206
forward MW:23520 MPDGEQSQPPAQEDAEDDSRPDAAEAAAAEPKSSA-
GPMFSTYGIASTLLGVLSVAAVVLGAMIWSAHRDDSGERTYLTRV (SEQ ID NO:364)
MLTAAEWTAVLINMNADNIDASLQRLHDGTVGQLNTDFDAVVQPYRQVVEKLRTHSSGRIEAVAIDTVHRELD-
TQSGAAR PVVTTKLPPFATRTDSVLLVATSVSENAGAKPQTVHWNLRLDVSDVDGKLMISRLESIR
>BL;Rv0200, H37Rv2.tab 237206:237892 forward MW:24029
MRNAWRLVVFDVLAPLATIAALAAIGVLLGWPLWWVSTCSVLVLLVVEGVAINFWLLRRDSVTVGT-
DDDAPGLRLAVVFL (SEQ ID NO:365)
CAAAISAAVVTGYLRWTTPDRDFNRDSREVVHLATG-
MAETVASFSPSAPAAAVDRAAANMVPEHAGGFKEQYAKSSADLA
RRGVTAQAATLAAGVEAIGPSAASVAVILRVSQSIPGQPTSQAARALRVTLTKRGSGWLVLDVTPINAR
>BL;Rv0201c, H37Rv2.tab 237895:238395 reverse MW:18484
VTLAAEPHPAPPQQPTVAWSEPDVDRRVEFWPTVAIRSALESGDIATWQRIAAALKRDPYGRTARQVEEV-
LEGIPATGIA (SEQ ID NO:366)
NAFWEVLDRARTHLDANERAEVARQVGLLLDRSGLQRQEF-
ASRIGVTAQDLTAYLDGIVSPSASLMIRMRRLSDRFVRAK SVRAADS >BL;Rv0207c,
H37Rv2.tab 247387:248112 reverse MW:26175
MSLTEDVTSQTSESLARHSVLAEDLSQDGLTSLGAPGARVLLVWDAPNLDMGLGSILGRRPTALERPRFDALG-
RWLLART (SEQ ID NO:367)
AEIVAGRPGISTEPEATVFTNIAPGSAEVVRPWVDALRNVGFA-
VFAKPKVDEDSDVDRDMLAHIDERYREGLAALVVASA
DGQAFRQPLEAVARSGTPVQVLGFREHAS-
WALASDTLEFVDLEDIAGVFREPLPRIGLDSLPEQGAWLQPFRPLSSLLTS RV
>BL;Rv0216, H37Rv2.tab 258913:259923 forward MW:35756
VASGYGGIRVGGPYFDDLSKGQVFDWAPGVTLSLGLAAAHQSIVGNRLRLALDSDLCAAVTGMPGPLAHPGLV-
CDVAIGQ (SEQ ID NO:368)
STLATQRVKANLFYRGLRFHRFPAVGDTLYTRTEVVGLRANSP-
KPGRAPTGLAGLRMTTIDRTDRLVLDFYRCANLPASP
DWKPGAVPGDDLSRIGADAPAPAADPTAH-
WDGAVFRKRVPGPHFDAGIAGAVLHSTADLVSGAPELARLTLNIAATHHDW
RVSGRRLVYGGHTIGLALAQATRLLPNLATVLDWESCDHTAPVHEGDTLYSELHIESAQAHADGGVLGLRSLV-
YAVSDSA SEPDRQVLDWRFSALQF >BL;Rv0226c, H37Rv2.tab 269837:271564
reverse MW:59107 VRWFRPGYALVLVLLLAAPLLRPG-
YLLLRDAVSTPRSYVSANALGLTSAPRATPQDFAVALASHLVDGGVVVKALLLLGL (SEQ ID
NO:369)
WLAGWGAARLVATALPAAGAAGQFVAITLAIWNPYVAERLLQGHWSLLVGYGCLPWVATAMLTMRTTV-
GAGWFGLFGLAF
WVALAGLTPSGLLLAATVAVVCVANPGAGRPRWQCGVAALGSALVGALPWLTAS-
ALGSSLTSHTAANQLGVTAFAPRAEP
GLGTLGSLASLGGIWNGEAVPSSRTTLFAVASAVVLLANV-
AIGLPTVARRPVAVPLLTLAAVSVMVPAVLATGPGLHALR
VVVDAAPGLGVLRDGQKWVALAVPGY-
TLSGAGTVLTLRRWLRPATAAVVCCLALVLTLPDLAWGVWGKVAPVHYPSGWAA
VAAAINADPRTVAVLPAGTMRRFSWSGSAPVLDPLPRWVRADVLTTGDLVISGVTVPGEDAHARAVQELLLTG-
PHPSTLA
AAGVGWLVVESDSAGDMGAAARTLGRLAAAHRDDELALYRVGGQTSGASSARLKATMLA-
HWAWLSMLLVGGAGAAGYWVR RHLHHCEDTPASRAQD >BL;Rv0227c, H37Rv2.tab
271577:272839 reverse MW:45528
MLRFAACGAIGLGAALLIAALLLSTYTTSRIAEIPLDIDATLISDGTGTALDSASLATEHIVVNQDVPLVSQQ-
QVTVESP (SEQ ID NO:370)
ANADVVTLQVGSSLRRTDKQKDSGLLLAIVDTVTLNRKTAMAV-
SDDTHTGGAVQKPRGLNDENPPTAIPLRHDGLSYRFP
FHTEKKTYPYFDPIAQKAFDANYEGEEDV-
NGLTTYRFTQNVGYTPEGKLVAPLKYPSLYAGDEDGKVTTSAANWGLPGDP
NEQITMTRYYAAQRTFWVDPVSGTIVKETERANNYFARDPLKPEVTFADYOVTSTEETVESQVNAARDERDRL-
ALWSRVL
PITFTAAGLVALVGGGLFASFSLRTEGALMAASGDRDDHDYRRGGFEEPVPGAEAETEK-
LPTQRPDFPREPSGSDPPRLG SAQPPPPPDAGHPDPGPPERR >BL;Rv0236A,
H37Rv2.tab 286898:287071 reverse MW:5833
MNRIVAPAAASVVVGLLLGAAAIFGVTLMVQQDKKPPLPGGDPSSSVLNRVEYGNRS (SEQ ID
NO:372) >BL;Rv0236c, H37Rv2.tab 282652:286851 reverse MW:146250
VAPLSRKWLPVVGAVALALTFAQSPGQVSPDTKLDLTANPLRFLARATNLWNSDLP-
FGQAQNQAYGYLFPHGTFFVIGHL (SEQ ID NO:373)
LGVPGWVTQRLWWAVLLTVGFWGLLR-
VAEALGVGGPSSRVVGAVAFALSPRVLTTLGSISSETLPMMLAPWVLLPTILAL
RGTSGRSVRALAAQAGLAVALMGAVNAIATLAGCLPAVIWWACHRPNRLWWRYTAWWLLAMALATLWWVMALT-
QLNGVSP
PFLDPIESSGVTTQWSSLVEVLRGTDSWTPFVAPNATAGAPLVTGSAAILGTCLVAAAG-
LAGLTSPAMPARGRLVTMLLV
GVVLLAVGHRGGLASPVAHPVQAFLDAAGTPLRNVHKVGPVIRLP-
LVLGLAQLLSRVPLPGSAPRPAWLRAFAHPERDKR
VAVAVVALTALMVSTSLAWTGRVAPPGTFGA-
LPQYWQEAADWLRTHHAATPTPGRVLVVPGAPFATQVWGTSHDEPLQVL
GDGPWGVRDSIPLTPPQTIRALDSVQRLFAAGRPSAGLADTLARQGISYVLVRNDLDPETSRSARPILLHRSI-
AGSPGLA
KLAEFGAPVGPDPLAGFVNDSGLRPRYPAIEIYRVSAPANPGAPYFAATDQLARVDGGP-
EVLLRLDERRRLQGQPPLGPV
LMTADARAAGLPVPQVAVTDTPVARETDYGRVDHHSSAIRAPGDA-
RHTYNRVPDYPVPGAEPVVGGWTGGRITVSSSSAD
ATAMPDVAPASAPAAAVDGDPATAWVSNALQ-
AAVGQWLQVDFDRPVTNAVVTLTPSATAVGAQVRRILIETVNGSTTLRF
DEAGKPLTAALPYGETPWVRFTAAATDDGSAGVQFGITDLAITQYDASGFAHPVQLRHTVLVPGPPPGSAIAG-
WDLGSEL
LGRPGCAPGPDGVRCAASMALAPEEPANLSRTLTVPRPVSVTPMVWVRPRQGPKLADLI-
AAPSTTRASGDSDLVDILGSA
YAAADGDPATAWTAPQRVVQHKTPPTLTLTLPRPTVVTGLRLAAS-
RSMLPAHPTVVAINLGDGPQVRQLQVGELTTLWLH
PRVTDTVSVSLLDWDDVIDRNALGFDQLKPP-
GLAEVVVLSAGGAPIAPADAARNRARALTVDCDHGPVVAVAGRFVHTSI
RTTVGALLDGEPVAALPCEREPIALPAGQQELLISPGAAFVVDGAQLSTPGAGLSSATVTSAETGAWGPTHRE-
VRVPESA
TSRVLVVPESINSGWVARTSTGARLTPIAVNGWQQAWVVPAGNPGTITLTFAPNSLYRA-
SLAIGLALLPLLALLAFWRTG
RRQLADRPTPPWRPGAWAAAGVLAAGAVIASIAGVMVMGTALGVR-
YALRRRERLRDRVTVGLAAGGLILAGAALSRHPWR
SVDGYAGNWASVQLLALISVSVVAASVVATS- ESRGQDRMQ >BL;Rv0241c,
H37Rv2.tab 289815:290654 reverse MW:30163
VTQPSGLKNLLRAAAGALPVVPRTDQLPNRTVTVEELPIDPANVAAYAAVTGLRYGN-
QVPLTYPFALTFPSVMSLVTGFD (SEQ ID NO:374)
FPFAAMGAIHTENHITQYRPIAVTDAV-
GVRVRAENLREHRRGLLVDLVTNVSVGNDVAWHQVTTFLHQQRTSLSGEPKPP
PQKKPKLPPPAAVLRITPAKIRRYAAVCGDHNPIHTNPIAAKLFGFPTVIAHGMFTAAAVLANIEARFPDAVR-
YSVRFAK PVLLPATAGLYVAEGDGGWDLTLRNMAKGYPHLTATVRGL >BL;Rv0250c,
H37Rv2.tab 301738:302028 reverse MW:10878
LSTTAELAELHDLVGGLRRCVTALKARFGDNPATRRIVIDADRILTDIELLDTDVSELDLERAAVPQPSEKIA-
IPDTEYD (SEQ ID NO:375) REFWRDVDDEGVGGHRY >BL;Rv0257, H37Rv2.tab
309699:310071 forward MW:13053
MTRVSWLPDRCLPRLPACGRGLRGSLPGDSGGTAPDSHRLPASSSPDGKNIGMQSVDLHVERHLPSRGRSNRT-
VATVTCV (SEQ ID NO:376)
TALGDIRSAQLSATGAWPAVLFPSWSWLCGIGGGVDLQKPSCR- A >BL;Rv0283,
H37Rv2.tab 344022:345635 forward MW:55943
MTNQQHDHDFDHDRRSFASRTPVNNNPDKVVYRRGFVTRHQVTGWRFVMRRIAAGIALHDTRMLVD-
PLRTQSRAVLMGVL (SEQ ID NO:377)
IVITGLIGSFVFSLIRPNGQAGSNAVLADRSTAALY-
VRVGEQLHPVLNLTSARLIVGRPVSPTTVKSTELDQFPRGNLIG
IPGAPERMVQNTSTDANWTVCDGLNAPSRGGADGVGVTVIAGPLEDTGARAAALGPGQAVLVDSGAGTWLLWD-
GKRSPID
LADHAVTSGLGLGADVPAPRIIASGLFNAIPEAPPLTAPIIPDAGNPASFGVPAPIGAV-
VSSYALKDSGKTISDTVQYYA
VLPDGLQQISPVLAAILRNNNSYGLQQPPRLGADEVAKLPVSRVL-
DTRRYPSEPVSLVDVTRDPVTCAYWSKPVGAATSS
LTLLAGSALPVPDAVHTVELVGAGNGGVATR-
VALAAGTGYFTQTVGGGPDAPGAGSLFWVSDTGVRYGIDNEPQGVAGGG
KAVEALGLNPPPVPIPWSVLSLFVPGPTLSRADALLAHDTLVPDSRPARPVSAEGGYR
>BL;Rv0288, H37Rv2.tab 351848:352135 forward MW:10390
MSQIMYNYPMLGHAGDMAGYAGTLQSLGAEIAVEQAALQSAWQGDTGITYQAWQAQWNQANEDLVRAYHAMSS-
THEANT (SEQ ID NO:378) MAMMARDTAEAAKWGG >BL;Rv0289, H37Rv2.tab
352149:353033 forward MW:31559
MDATPNAVELTVDNAWFIAETIGAGTFPWVLAITMPYSDAAQRGAFVDRQRDELTRMGLLSPQGVINPAVADW-
IKVVCFP (SEQ ID NO:379)
DRWLDLRYVGPASADGACELLRGIVALRTGTGKTSNKTGNGVV-
ALRNAQLVTFTAMDIDDPRALVPILGVGLAHRPPARF
DEFSLPTRVGARADERLRSGVPLGEVVDY-
LGIPASARPVVESVFSCPRSYVEIVAGCNRDGRHTTTEVGLSIVDTSAGRV
LVSPSRAFDGEWVSTFSPGTPFAIAVAIQTLTACLPDGQWFPGQRVSRDFSTQSS
>BL;Rv0290, H37Rv2.tab 353083:354498 forward MW:47944
MSGTVMQIVRVAILADSRLTEMALPAELPLREILPAVQRLVVPSAQNGDGGQADSGAAVQLSLAPVGGQPFSL-
DASLDTV (SEQ ID NO:380)
GVVDGDLLVLQPVPAGPAAPGIVEDIADAANIFSTSRLKPWGI-
AHIQRGALAAVIAVALLATGLTVTYRVATGVLAGLLA
VAGIAVASALAGLLITIRSPRSGIALSIA-
ALVPIGAALALAVPGKFGPAQVLLGAAGVAAWSLIALMIPSAERERVVAFF
TAAAVVGASVALAAGAQLLWQLPLLSIGCGLIVAALLVTIQAAQLSALWARFPLPVIPAPGDPTPSAPPLRLL-
EDLPRRV
RVSDAHQSGFIAAAVLLSVLGSVAIAVRPEALSVVGWYLVAATAAAATLRARVWDSAAC-
KAWLLAQPYLVAGVLLVFYTA
TGRYVAAFGAVLVLAVLMLAWVVVALNPGIASPESYSLPLRRLLG-
LVAAGLDVSLIPVMAYLVGLFAWVLNR >BL;Rv0292, H37Rv2.tab 355880:356872
forward MW:35932 MNPIPSWPGRGRVTLVLLAVVPVALAYPWQSTRDY-
VLLGVAAAVVIGLFGFWRGLYFTTIARRGLAILRRRRRIAEPATC (SEQ ID NO:381)
TRTTVLVWVGPPASDTNVLPLTLIARYLDRYGIRADTIRITSRVTASGDCRTWVGLTVVADDNLAALQARSAR-
IPLQETA
QVAARRLADHLREIGWEAGTAAPDEIPALVAADSRETWRGMRHTDSDYVAAYRVSANAE-
LPDTLPAIRSRPAQETWIALE
IAYAAGSSTRYTVAAACALRTDWRPGGTAPVAGLLPQHGNHVPAL-
TALDPRSTRRLDGHTDAPADLLTRLHWPTPTAGAH RAPLTNAVSRT >BL;Rv0309,
H37Rv2.tab 377931:378584 forward MW:22528
MSRLLALLCAAVCTGCVAVVLAPVSLAVVNPWFANSVGNATQVVSVVGTGGSTAKMDVYQRTAAGWQPLKTGI-
TTHIGSA (SEQ ID NO:382)
GMAPEAKSGYPATPMGVYSLDSAFGTAPNPGGGLPYTQVGPNH-
WWSGDDNSPTFNSMQVCQKSQCPFSTADSENLQIPQY
KHSVVMGVNKAKVPGKGSAFFFHTTDGGP- TAGCVAIDDATLVQIIRWLRPGAVIAIAK
>BL;Rv0313, H37Rv2.tab 382490:382873 forward MW:13916
VGDYGPFGFDPDEFDRVIREGSEGLRDAFERIGRF-
LSSSGAGTGWSAIFEDLSRRSRPAPETAGEAGDGVWAIYTVDADG (SEQ ID NO:383)
GARVEQVYATELDALRANKDNTDPKRKVRFLPYGIAVSVLDDPVDEAQ >BL;Rv0356c,
H37Rv2.tab 434833:435474 reverse MW:22879
VTDASVHPDELDPEYHHHGGFPEYGPASPGAGFGQFVATMRRLQDLAVAADPGDAVWDEAAERAAALVELLSP-
FEADEGK (SEQ ID NO:384)
APAGRTPGLPGMGSLLLPPWTVTRYGTDGVEMRGSFSRFHVGG-
NSAVHGGVLPLLFDHMFGMISHAAGRPISRTAFLHVD
YRRITPIDVPLIVRGRVTNTEGRKAFVCA- ELFDSDETLLAEGNGLMVRLLPGQP
>BL;Rv0358, H37Rv2.tab 436860:437504 forward MW:23102
MYTAENAPGVAVLLSGDADVPGPLTGLPTHQDNLD-
TVIGRYSRLIVVGADADLGAVLTRLLRTDRLDVEVGYVPRRRSPA (SEQ ID NO:385)
TRAYRLPAGRRAARRARCGVARRVPLIRDETGSVIVGRAQWLPAEEQALIHGEAVVDDTVLFDGDVAGVCIEP-
TLTLPGL RAAVDGAGKWRRWIGGRAAQLGTTGAAVLRDGVAAPFPVRRSTFYRNVEGWLLVR
>BL;Rv0360c, H37Rv2.tab 438305:438739 reverse MW:15297
VTKRTITPMTSMGDLLGPEPILLPGDSDAEAELLANESPSIVAAAHPSASVAWAVLAEGALADDKTVTA-
YAYARTGYHRG (SEQ ID NO:386)
LDQLRRHGWKGRGPVPYSHQPNRGFLRCVAALARAAAAI- GETDEYGRCLDLLDDCDPAARPALGL
>BL;Rv0361, H37Rv2.tab 438822:439646 forward MW:29982
MSNAPEPDRSAGESGSEPAGERSADPGEERTESYP-
LVPHDAETETVVITTSDNDAAVTQPEAQRERRFTAPGFDAKETQV (SEQ ID NO:387)
IVTAHEAATEVFQTNQAPTTPPRMPTGMPPKTAVPQSIPPRTEATSVRQRTWGWALAVVVIVLALAAIAILGT-
VLLTRGK
HSKMSQEDQVRQAIQSLDIAIQTGDLTALRSLTCGSTRDGYVDYDERDWAETYRRVSAA-
KQYPVIASIDQVVVNGAHAEA NVTTFMAFDPQVRSTRSLDLQFRDDQWKICQSSSN
>BL;Rv0383c, H37Rv2.tab 458464:459315 reverse MW:31801
MVPLWFTLSALCFVGAVVLLYVDIDRRRGRSRRRKSWARSHGFDYERESTEILKRWTRGVMSTVGDVAAHNVV-
LGQIRGE (SEQ ID NO:388)
AVYIFDLEEVATVIALHRKVGTNVVVDLRLKGLKEPRESDIWL-
LGAIGPRMVYSTNLDAARRACDRRMVTFAHTAPDCAE
IMWNEQNWTLVSMPIASTRAQWDEGLRTV-
RQFNDLLRVLPPLPQEMPQQTGVGPRGAAPGRPVAPGGPAELPPRRAQPDP
ATTVLPDPARRAPEPIRRDEGRSEGVRRPPPAGRNGQQATNYQH >BL;Rv0401,
H37Rv2.tab 479789:480157 forward MW:12641
MRPRRALAGLAADVVAVLVFCAVGRRSHAEGLSVTGLAATAWPFLTGTGIGWVLARGWRRPTALAPTGVIVWL-
CTIWGM (SEQ ID NO:389) VLRKVSSAGVAASFVVVASAVTAVLLLGWRAAVALMAPHRADG
>BL;Rv0416, H37Rv2.tab 502167:502370 forward MW:7367
MIVVVNEQQVEVDEQTTIAALLDSLGFGDRGIAVALNFSVLPRSDWATKICELRKPVRLEVVTAVQ-
GG (SEQ ID NO:390) >BL;Rv0430, H37Rv2.tab 518733:519038 forward
MW:11723 MDSAMARAIRSGDDAEVADGLTRREHDILAFERQWWKFAGVKEEAIKEL-
FSMSATRYYQVLNALVDRPEALAADPMLVKR (SEQ ID NO:391)
LRRLRASRQKARAARRLGFEVT >BL;Rv0431, H37Rv2.tab 519073:519564
forward MW:16905 VLVTVGSMNERVPDSSGLPLRAMVMVLLFLGVVFL-
LLVWQALGSSPNSEDDSSAISTMTTTTAAPTSTSVKPAAPRAEVR (SEQ ID NO:392)
VYNISGTEGAAARTADRLKAAGFTVTDVGNLSLPDVAATTVYYTEVEGERATADAVGRTLGAAVELRLPELSD-
QPPGVIV VVTG >BL;Rv0455c, H37Rv2.tab 545378:545821 reverse
MW:16639 MSRLSSILRAGAAFLVLGIAAATFPQSAAADSTED-
FPIPRRMIATTCDAEQYLAAVRDTSPVYYQRYMIDFNNHANLQQA (SEQ ID NO:393)
TINKANWFFSLSPAERRDYSEHFYNGDPLTFAWVNHMKIFFNNKGVVAKGTEVCNGYPAGDMSVWNWA
>BL;Rv0463, H37Rv2.tab 554016:554306 forward MW:10111
MTRRASTDTPQIIMGAIGGVTGYILWLAAISVGDGLTTVSQWSRVVLLLSVLVAVCGAAGGLRLRSRGKLAW-
SAFAFSL (SEQ ID NO:394) PIPPVVLTVAVLADIYL >BL;Rv0464C,
H37Rv2.tab 554316:554885 reverse MW:21304
MTGQNGQVARISPGKFRQLGPVNWLVAKLAARAVGAPQMHLFTTLGYRQYLFWTFAIYTGRLLHGRLPGVDTE-
LVILRVA (SEQ ID NO:395)
HLRSCEYELQHHRRMARRRGLDANTQATIFAWPDVPDGDGPRK-
VLSARQQALLQATDELIKDRTITAGTWERLATNLDPR
LLIEFCLLATQYDAIAATITALAIPPDNP- Q >BL;Rv0466, H37Rv2.tab
556458:557249 forward MW:30153
VSLDKKLMPVPDGHPDVFDREWPLRVGDIDRAGRLRLDAACRHIQDIGQDQLREMGFEETHPLWIV-
RRTMVDLIRPIEFG (SEQ ID NO:396)
DMLRCRRWCSGTSNRWCEMRVRVDGRKGGLIESEAF-
WIHVNRETEMPARIADDFLAGLHRTTSVDRLRWKGYLKPGSRDD
ASEIHEFPVRVTDIDLFDHMNNAVYWSVIEDYLASHAELLRGPLRVTIEHEAPVALGDKLEIISHVHPAGSTE-
IFGPGLV DRAVTTLTYVVGDEPKAVASLFNL >BL;Rv0476, H37Rv2.tab
566508:566768 forward MW:9166 MLVLLVAVLVTAVYAFVHAALQRPD-
AYTAADKLTKPVWLVILGAAVALASILYPVLGVLGMAMSACASGVYLVDVRPKLL (SEQ ID
NO:397) EIQGKSR >BL;Rv0477, H37Rv2.tab 566776:567219 forward
MW:15658 MKALVAVSAVAVVALLGVSSAQADPEADPGAGEANYGGPPSSPRLVDHT-
EWAQWGSLPSLRVYPSQVGRTASRRLGMAAA (SEQ ID NO:398)
DAAWAEVLALSPEADTAGMRAQFICHWQYAEIRQPGKPSWNLEPWRPVVDDSEMLASGCNPGSPEESF
>BL;Rv0479C, H37Rv2.tab 567924:568967 reverse MW:37016
VTNPQGPPNDPSPWARPGDQGPLARPPASSEASTGRLRPGEPAGHIQEPVSPPTQPEQQPQTEHLAASHAH-
TRRSGRQAA (SEQ ID NO:399)
HQAWDPTGLLAAQEEEPAAVKTKRRARRDPLTVFLVLIIVF-
SLVLAGLIGGELYARHVANSKVAQAVACVVKDQATASFG
VAPLLLWQVATRHFTNISVETAGNQIR-
DAKGMQIKLTIQNVRLKNTPNSRGTIGALDATITWSSEGIKESVQNAIPILGA
FVTSSVVTHPADGTVELKGLLNNITAKPIVAGKGLELQIINFNTLGFSLPKETVQSTLNEFTSSLTKNYPLGI-
HADSVQV TSTGVVSRFSTRDAAIPTGIQNPCFSHI >BL;Rv0483, H37Rv2.tab
571710:573062 forward MW:47858 VVIRVLFRPVSLIPVNNSSTPQSQ-
GPISRRLALTALGFGVLAPNVLVACAGKVTKLAEKRPPPAPRLTFRPADSAADVVP (SEQ ID
NO:400)
IAPISVEVGDGWFQRVALTNSAGKVVAGAYSRDRTIYTITEPLGYDTTYTWSGSAVGHDGKAVPVAGK-
FTTVAPVKTINA
GFQLADGQTVGIAAPVIIQFDSPISDKAAVERALTVTTDPPVEGGWAWLPDEAQ-
GARVHWRPREYYPAGTTVDVDAKLYG
LPFGDGAYGAQDMSLHFQIGRRQVVKAEVSSHRIQVVTDA-
GVIMDFPCSYGEADLARNVTRNGIHVVTEKYSDFYMSNPA
AGYSHIHERWAVRISNNGEFIHANPM-
SAGAQGNSNVTNGCINLSTENAEQYYRSAVYGDPVEVTGSSIQLSYADGDIWDW
AVDWDTWVSMSALPPPAAKPAATQIPVTAPVTPSDAPTPSGTPTTTNGPGG >BL;Rv0487,
H37Rv2.tab 576787:577335 forward MW:20684
VTSSLPTVQRVIQNALEVSQLKYSQHPRPGGAPPALIVELPGERKLKINTILSVGEHSVRVEAFVCRKPDENR-
EDVYRFL (SEQ ID NO:401)
LRRNRRLYGVAYTLDNVGDIYLVGQMALSAVDADEVDRVLGQV-
LEVVDSDFNALLELGFRSSIQREWQWRLSRGESLQNL QAFAHLRPTTMQSAQRDEKELGG
>BL;Rv0495c, H37Rv2.tab 585427:586314 reverse MW:32960
VWRPAQGARWHVPAVLGYGGIPRRASWSNVESVANSRRRPVHPGQEVELDFAREWVEFYDPDNPEHLIAADL-
TWLLSRWA (SEQ ID NO:402)
CVFGTPACQGTVAGRPNDGCCSHGAFLSDDDDRTRLADAVHK-
LTDDDWQFRAKGLRRKGYLELDEHDGQPQHRTRKHKGA
CIFLNRPGFAGGAGCALHSKALKLGVPP-
LTMKPDVCWQLPIRRSQEWVTRPDGTEILKTTLTEYDRRGWGSGGADLHWYC
TGDPAAHVGTKQVWQSLADELTELLGEKAYGELAANCKRRSQLGLIAVHPATRAAQ
>BL;Rv0497, H37Rv2.tab 587377:588306 forward MW:33092
MTGPHPETESSGNRQISVAELLARQGVTGAPARRRRRRRGDSDAITVAELTGEIPIIRDDHHHAGPDAHASQS-
PAANGRV (SEQ ID NO:403)
QVGEAAPQSPAEPVAEQVAEEPTRTVYWSQPEPRWPKSPPQDR-
RESGPELSEYPRPLRHTHSDRAPAGPPSGAEHMSPDP
VEHYPDLWVDVLDTEVGEAEAETEVREAQ-
PGRGERHAAAAAAGTDVEGDGAAEARVARRALDVVPTLWRGALVVLQSILA
VAFGAGLFIAFDQLWRWNSIVALVLSVMVILGLVVSVRAVRKTEDIASTLIAVAVGALITLGPLALLQSG
>BL;Rv0498, H37Rv2.tab 588325:589164 forward MW:30433
VRPAIKVGLSTASVYPLRAEAAFEYADRLGYDGVELMVWGESVSQDIDAVRKLSRRYRVPVLSVHAPCLL-
ISQRVWGANP (SEQ ID NO:404)
ILKLDRSVRAAEQLGAQTVVVHPPFRWQRRYAEGFSDQVA-
ALEAASTVMVAVENMFPFRADRFFGAGQSRERMRKRGGGP
GPAISAFAPSYDPLDGNHAHYTLDLS-
HTATAGTDSLDMARRMGPGLVHLHLCDGSGLPADEHLVPGRGTQPTAEVCQMLA
GSGFVGHVVLEVSTSSARSANERESMLAESLQFARTHLLR >BL;Rv0500B, H37Rv2.tab
591475:591573 forward MW:4145 MGSVIKKRRKRMSKKKHRKLLRRTR- VQRRKLGK
(SEQ ID NO:405) >BL;Rv0504c, H37Rv2.tab 594805:595302 reverse
MW:18360 MTVPEEAQTLIGKHYRAPDHFLVGREKIREFAVAV-
KDDHPTHYSEPDAAAAGYPALVAPLTFLAIAGRRVQLEIFTKFNI (SEQ ID NO:406)
PINIARVFHRDQKFRFHRPILANDKLYFDTYLDSVIESHGTVLAEIRSEVTDAEGKPVVTSVVTMLGEAAHHE-
ADADATV AAIASI >BL;Rv0528, H37Rv2.tab 618305:619891 forward
MW:57131 MWRSLTSMGTALVLLFLLALAAIPGALLPQRGLNA-
AKVDDYLAAHPLIGPWLDELQAFDVFSSFWFTAIYVLLFVSLVGC (SEQ ID NO:407)
LAPRTIEHARSLRATPVAAPRNLARLPKHANARLAGEPAALAATITGRLRGWRSITRQQGDSVEVSAEKGYLR-
EFGNLVF
HFALLGLLVAVAVGKLFGYEGNVIVIADGGPGFCSASPAAFDSFRAGNTVDGTSLHPIC-
VRVNNFQAHYLPSGQATSFAA
DIDYQADPATADLIANSWRPYRLQVNHPLRVGGDRVYLQGHGYAP-
TFTVTFPDGQTRTSTVQWRPDNPQTLLSAGVVRID
PPAGSYPNPDERRKHQIAIQCLLAPTEQLDG-
TLLSSRFPALNAPAVAIDIYRGDTGLDSGRPQSLFTLDHRLIEQGRLVK
EKRVNLRAGQQVRIDQGPAAGTVVRFDGAVPFVNLQVSHDPGQSWVLVFAITMMAGLLVSLLVRRRRVWARIT-
PTTAGTV NVELGGLTRTDNSGWGAEFERLTGRLLAGFEARSPDMAEAAAGTGRDVD
>BL;Rv0531, H37Rv2.tab 622329:622643 forward MW:11436
VSEAPNDKTTRGVVDILVYATARLLLVVAVSAAIFGVARLIGLTEFPVVVATLFGLIIAMPLGIWVFSPLRRR-
ATAALAV (SEQ ID NO:408) AGERRRAERERLRARLRGESLPEEQ >BL;Rv0543c,
H37Rv2.tab 635576:635875 reverse MW:11279
VNRFLTSIVAWLRAGYPEGIPPTDSFAVLALLCRRLSHDEVKAVANELMRLGDFDQIDIGVVITHFTDELPSP-
EDVERVR (SEQ ID NO:409) ARLAAQGWPLDDVRDREEHA >BL;Rv0544c,
H37Rv2.tab 635938:636213 reverse MW:9747
VSAWFNYTATLKILIFSLLAGALLPGLFAVGVRLQAAGDGADATARRRPLLVAVSWAIFALVLAVVIIGVLYI-
ARDFIAH (SEQ ID NO:410) HTGWAFLGATPK >BL;Rv0546c, H37Rv2.tab
637586:637969 reverse MW:14346
MEILASRMLLRPADYQRSLSFYRDQIGLAIAREYGAGTVFFAGQSLLELAGYGEPDNSRGPFPGALWLQVRDL-
EATQTEL (SEQ ID NO:411)
VSRGVSIAREPRREPWGLHEMHVTDPDGITLIFVEVPEGHPLR- TDTRA >BL;Rv0556,
H37Rv2.tab 647959:648471 forward MW:18725
VISPKPLLHILIHGLSDELPDTRGRIVLRWLRIAVLIVTGLVTLQSVLLVAGAWRNDIAIQRNNGVAQAEV-
LSAGPRRST (SEQ ID NO:412)
IEFVTPDRITYRPQLGVLYPSELSTGMRIYVEYNKRDPNLV-
RVQHRNAGLAIIPAGSIAVVAWLIAAAALVVLAVLDKRL ERRENSASATG >BL;Rv0559c,
H37Rv2.tab 650410:650745 reverse MW:12116
MKGTKLAVVVGMTVAAVSLAAPAQADDYDAPFNNTIHRFGIYGPQDYNAWLAKISCERLSRGVDGDAYKSATF-
LQRNLPR (SEQ ID NO:413) GTTQGQAFQFLGAAIDHYCPEHVGVLQRAGTR
>BL;Rv0634A, H37Rv2.tab 731113:731364 forward MW:9408
LGSDCGCGGYLWSMLKRVEIEVDDDLIQKVIRRYRVKGAREAVNLALRTLLGEADTAEHGHDDEYDEFSDPNA-
WVPRRSR (SEQ ID NO:414) DTG >BL;Rv0635, H37Rv2.tab 731930:732403
forward MW:17448 VALSADIVGMHYRYPDHYEVEREK-
IREYAVAVQNDDAWYFEEDGAAELGYKGLLAPLTFICVFGYKAQAAFFKNANIATA (SEQ ID
NO:415)
EAQIVQVDQVLKFEKPIVAGDKLYCDVYVDSVREAHGTQIIVTKNIVTNEEGDLVQETYTTLAGRAGE-
DGEGFSDGAA >BL;Rv0636, H37Rv2.tab 732393:732818 forward MW:14934
MALREFSSVKVGDQLPEKTYPLTRQDLVNYAGVSGDLNPIHWDDEIAKVVGLDTAIA-
HGMLTMGIGGGYVTSWVGDPGAV (SEQ ID NO:416)
TEYNVRFTAVVPVPNDGKGAELVFNGR- VKSVDPESKSVTIALTATTGGKKIFGRAIASAKLA
>BL;Rv0637, H37Rv2.tab 732825:733322 forward MW:18929
MALKTDIRGMIWRYPDYFIVGREQ-
CREFARAVKCDHPAFFSEEAAADLGYDALVAPLTFVTILAKYVQLDFFRHVDVGME (SEQ ID
NO:417)
TMQIVQVDQRFVFHKPVLAGDKLWARMDIHSVDERFGADIVVTRNLCTNDDGELVMEAYTTLMGQQGD-
GSARLKWDKESG QVIRTA >BL;Rv0779c, H37Rv2.tab 872675:873292
reverse MW:21572 MRSRFLPYATTPGRLLAQLISDITVAVWTTLWMLV-
GLAVHDAISIIGEAGRQIEIGSHGIAGNLAAAGQDAQRIPVVGDA (SEQ ID NO:418)
LSNPITAASQAALDIAGAGHNLDTTAGWLAVVLALAVAATPILAVAMPWLFLRLRFCRRKWTVTTLAATPAGR-
QLLALRA LANRPPGKLAAVSTDPVGAWRREDPATMRALAALELRAAGIPLRGD
>BL;Rv0807, H37Rv2.tab 901635:902021 forward MW:13480
MSARDRVDPAKTRQVVLALADWLRDETLPAPDTDVLAAAVRLTARTLAALAPGASVEVRIPPFAAVQCISGPR-
HTRGTPP (SEQ ID NO:419)
NVVQTDPRTWLLVATGLSGVAQARGSGALQLSGSRAGEIEAWL- PLVDLG >BL;Rv0810c,
H37Rv2.tab 904908:905087 reverse MW:6900
MGRGRAKAKQTKVARELKYSSPQTDFQRLQRELSGTGTDRLDGDGPSDDDSWNDEDDW- RR (SEQ
ID NO:420) >BL;Rv0813C, H37Rv2.tab 907341:908018 reverse
MW:23868 VSSGAGSDATGAGGVHAAGSGDRAVAAAVERAKAT-
AARNIPAFDDLPVPADTANLREGADLNNALLALLPLVGVWRGEGE (SEQ ID NO:421)
GRGPDGDYRFGQQIVVSHDGGDYLNWESRSWRLTATGDYQEPGLREAGFWRFVADPYDPSESQAIELLLAHSA-
GYVELFY
GRPRTQSSWELVTDALARSRSGVLVGGAKRLYGIVEGGDLAYVEERVDADGGLVPHLSA-
RLSRFVG >BL;Rv0817C, H37Rv2.tab 910033:910842 reverse MW:28567
MPMRKVLVGVTGAAIVVAVLIVGAVGADFGASIYAEYRLSTTVRKAANLRSDPFVAI-
LRFPFIPQAMREHYAELEIKAFA (SEQ ID NO:422)
VEHAGSGTATLEATMHSIDLSYASWLI-
RPDAKLPVGELESRIIIDSMHLGRYLGISDLMVAAPRQESNDATGGTTESGIS
GSRGLVFSGTPISANFAHRVSVLVDLSVASDDRATLVITPTAVVTGPDTADQPVPDDKRDAVLHAFASKLPNQ-
KLPFGVV PNTVGARGSDVIIEGITRGVTISLDEFKQS >BL;Rv0819, H37Rv2.tab
911736:912680 forward MW:33567
VTALDWRSALTADEQRSVRALVTATTAVDGVAPVGEQVLRELGQQRTEHLLVAGSRPGGPIIGYLNLSPPRGA-
GGANAEL (SEQ ID NO:423)
VVHPQSRRRGIGTAMARAALAKTAGRNQFWAHGTLDPARATAS-
ALGLVGVRELIQMRRPLRDIPEPTIPDGVVIRTYAGT
SDDAELLRVNNAAFAGHPEQGGWTAVQLA-
ERRGEAWFDPDGLILAFGDSPRERPGRLLGFHWTKVHPDHPGLGEVYVLGV
DPAAQRRGLGQMLTSIGIVSLARRLGGRKTLDPAVEPAVLLYVESDNVAAVRTYQSLGFTTYSVDTAYALAGT-
DN >BL;Rv0862C, H37Rv2.tab 960345:962612 reverse MW:79667
MTEHTPDIPLGSWLAALPDERLTQLLELRPDLAQPPPGSIAALAARAQARQSVKAAT-
DELDFLRLAVFDALLVLQADTAP (SEQ ID NO:424)
VPIVRLLAVIGDRAAQADVLGALADLK-
QRALAWGETAVRVATDAGTALPWHPGQVTLEGSSRSGDQLADLIAGLDPAQRD
VLDKLLQGSPVGRTRDAAPGAPSDRPVPRLLAMGLLRRIDAETVILPRHVGQVLRGEQPGPMELTAPDPVVST-
TTPDDAD
AAAAGAVIDLLREVDVLLENLGATPVAELRSGGLGVREFKRLAKATGIDEPRLGLILEI-
AAAAGLIASGMPDPEPPHSDG
PFWAPTVAADRFATMSPAERWHLLASAWLDLPGRPALIGTRGPDA-
KPYGALSDSLFSTAAPLDRRLLLGMLAELPAGAGV
DASRASATLIWRRPRWARRLQPAPIADLLTE-
GHALGLVGRGAISTPARALLDEALEPATAPAAAVGVMARALPKPIDHFL
VQADLTVVVPGPLQRELADDLTTVATVESAGTAMVYRVSEQSIRHALDVGKSRDWLQEFFANRSKTPVPQGLT-
YLIDDVA
RRHGQLRIGMAASFVRCEDPTLLAQVVAAPEADGLALRALAPTVAVSPAPISEVLVTLR-
CAGFAPAAEDSTGAVVDVRTR
GARVPTPQRRRPYRPPPRPNSEALKAVVAVLREVTAAPFANVRVD-
PAVTMSLLQRAAKDQATLVISYLDAAGVATQRVVA
PITLRGGQLVAFDSSSGRLRDFAIHRITLVV- SAHDR >BL;Rv0863, H37Rv2.tab
962599:962877 forward MW:10079
VCSVIADQRRPDQPCGVGGCKTCQNGFVADIAEGKARKTRYVDHGWPTTDPDDHAVS-
ELVTDRTGALSPFGELTFPVPSD (SEQ ID NO:425) DLPYIHPVTVINR
>BL;Rv0875c, H37Rv2.tab 973809:974294 reverse MW:17800
VKRGVATLPVILVILLSVAAGAGAWLLVRGHGPQQPEISAYSHGHLTRVGPYLYCNVVDLDDCQTPQAQGELP-
VSERYPV (SEQ ID NO:426)
QLSVPEVISRAPWRLLQVYQDPANTTSTLFRPDTRLAVTIPTV-
DPQRGRLTGIVVQLLTLVVDHSGELRDVPHAEWSVRL IF >BL;Rv0876C, H37Rv2.tab
974294:975937 reverse MW:57938
MAPTPGRRTRNGSVNGHPGMANYPPDDANYRRSRRPPPMPSANRYLPPLGEQPEPERSRVPPRTTRAGERITV-
TRAAAMR (SEQ ID NO:427)
SREMGSRMYLLVHRAATADGADKSGLTALTWPVMANFAVDSAM-
AVALANTLFFAAASGESKSRVALYLLITIAPFAVIAP
LIGPALDRLQHGRRVALALSFGLRTALAV-
VLIMNYDGATGSFPSWVLYPCALANMVFSKSFSVLRSAVTPRVMPPTIDLV
RVNSRLTVFGLLGGTIAGGAIAAGVEFVCTHLFQLPGALFVVVAITIAGASLSMRIPRWVEVTSGEVPATLSY-
HRDRGRL
RRRWPEEVKNLGGTLRQPLGRNIITSLWGNCTIKVMVGFLFLYPAFVAKAHEANGWVQL-
GMLGLIGAAAAVGNFAGNFTS
ARLQLGRPAVLVVRCTVLVTVLAIAAAVAGSLAATAIATLITAGS-
SAIAKASLDASLQHDLPEESRASGFGRSESTLQLA
WVLGGAVGVLVYTELWVGFTAVSALLILGLA-
QTIVSFRGDSLIPGLGGNRPVMAEQETTRRGAAVAPQ >BL;Rv0877, H37Rv2.tab
976075:976860 forward MW:27437 VTGPTEESAVATVADWPEGLAAV-
LRGAADQARAAVVEFSGPEAVGDYLGVSYEDGNAATHRFIAHLPGYQGWQWAVVVAS (SEQ ID
NO:428)
YSGADHATISEVVLVPGPTALLAPDWVPWEQRVRPGDLSPGDLLAPAKDDPRLVPGYTASGDAQVD-
ETAAEIGLGRRWVM
SAWGRAQSAQRWHDGDYGPGSAMARSTKRVCRDCGFFLPLAGSLGAMFGVCG-
NELSADGHVVDRQYGCGAHSDTTAPAGG STPIYEPYDDGVLDIIEKPAES >BL;Rv0879c,
H37Rv2.tab 978484:978756 reverse MW:9512
MSVENSQIREPPPLPPVLLEVWPVIAVGALAWLVAAVAAFVVPGLASWRPVTVAQLATGLLGTTIFVWQLAAA-
RRGARGA (SEQ ID NO:429) QAGLETYLDPK >BL;Rv0883C, H37Rv2.tab
980509:981267 reverse MW:27373 MRELKVVGLDADGKNIICQGAIPS-
EQFKLPVDDRLRAALRDDSVQPEQAQLDIEVTNVLSPKEIQARIRAGASVEQVAAA (SEQ ID
NO:430)
SGSDIARIRRFAHPVLLERSRAAELATAAHPVLADGPAVLTMQETVAAALVARGLNPDSLTWDAWRNE-
DSRWTVQLAWKA
GRSDNLAHFRFTPGAHGGTATAIDDTAHELINPTFNRPLRPLAPVAHLDFDEPE-
PAQPTLTVPSAQPVSNRRGKPAIPAW EDVLLGVRSGGRR >BL;Rv0885, H37Rv2.tab
982762:983781 forward MW:39798
MDRTRIVRRWRRNMDVADDAEYVEMLATLSEGSVRRNFNPYTDIDWESPEFAVTDNDPRWILPATDPLGRHPW-
YQAQSRE (SEQ ID NO:431)
RQIEIGMWRQANVAKVGLHFESILIRGLMNYTFWMPNGSPEYR-
YCLHESVEECNHTMMFQEMVNRVGADVPGLPRRLRWV
SPLVPLVAGPLPVAFFIGVLAGEEPIDHT-
QKNVLREGKSLHPIMERVMSIHVAEEARHISFAHEYLRKRLPRLTRMQRFW
ISLYFPLTMRSLCNAIVVPPKAFWEEFDIPREVKKELFFGSPESRKWLCDMFADARMLAHDTGLMNPIARLVW-
RLCKIDG KPSRYRSEPQRQHLAAAPAA >BL;Rv0909, H37Rv2.tab
1014681:1014857 forward MW:6403 MGILDKVKNLLSQNADKVETVIN-
KAGEFVDEQTQGNYSDAIHKLHDAASNVVGMSDQQS (SEQ ID NO:432) >BL;Rv0910,
H37Rv2.tab 1014866:1015297 forward MW:15754
MAKLSGSIDVPLPPEEAWMHASDLTRYREWLTIHKVWRSKLPEVLEKGTVVESYVEVKGMPNRIKWTIVRYKP-
PEGMTLN (SEQ ID NO:433)
GDGVGGVKVKLIAKVAPKEHGSVVSFDVHLGGPALLGPIGMIV- AAALRADIRESLQNFVTVFAG
>BL;Rv0912, H37Rv2.tab 1016236:1016682 forward MW:15438
MTRRLRPGWLVALSAAVIAASTWMPWLTTTVGG-
GGWVNAIGGTHGSLELPHGFGPGQLIVLLSSTLLVVGAMAGRGLSVK (SEQ ID NO:434)
LSSIAALVVSLLIVALTVWYYKLNVNPPVSAEYGLYFGAAGGVCAVGCSLWAAVSAASPGRRRHREVVR
>BL;Rv0948C, H37Rv2.tab 1057649:1057963 reverse MW:11770
MRPEPPHHENAELAAMNLEMLESQPVPEIDTLREEIDRLDAEILALVKRRAEVSKAIGKARMASGGTR-
LVHSREMKVIER (SEQ ID NO:435) YSELGPDGKDLAILLLRLGRGRLGH
>BL;Rv0954, H37Rv2.tab 1065127:1066035 forward MW:30203
MTYSPGNPGYPQAQPAGSYGGVTPSFAMADEGASKLPMYLNIAVAVLGLAAYFASFGPMFTLSTELGGGDGAV-
SGDTGLP (SEQ ID NO:436)
VGVALLAALLAGVALVPKAKSHVTVVAVLGVLGVFLMVSATFN-
KPSAYSTGWALWVVLAFIVFQAVAAVLALLVETGAIT
APAPRPKFDPYGQYGRYGQYGQYGVQPGG-
YYGQQGAQQAAGLQSPGPQQSPQPPGYGSQYGGYSSSPSQSGSGYTAQPPA
QPPAQSGSQQSHQGPSTPPTGFPSFSPPPPVSAGTGSQAGSAPVNYSNPSGGEQSSSPGGAPV
>BL;Rv0955, H37Rv2.tab 1066078:1067442 forward MW:46056
VNRVSASADDRAAGARPARDLVRVAFGPGVVALGIIAAVTLLQLLIANSDMTGAWGAIASMWLGVHLVPISIG-
GRALGVM (SEQ ID NO:437)
PLLPVLLMVWATARSTARATSPQSSGLVVRWVVASALGGPLLM-
AAIALAVIHDASSVVTELQTPSALRAFTSVLVVHSVG
AATGVWSRVGRRALAATALPDWLHDSMRA-
AAAGVLALLGLSGVVTAGSLVVHWATMQELYGITDSIFGQFSLTVLSVLYA
PNVIVGTSAIAVGSSAHIGFATFSSFAVLGGDIPALPILAAAPTPPLGPAWVALLIVGASSGVAVGQQCARRA-
LPFVAAN
AKLLVAAVAGALVMAVLGYGGGGRLGNFGDVGVDEGALVLGVLFWFTFVGWVTVVIAGG-
ISRRPKRLRPAPPVELDADES
SPPVDMFDGAASEQPPASVAEDVPPSHDDIANGLKAPTADDEALP- LSDEPPPRAD
>BL;Rv0966c, H37Rv2.tab 1077236:1077835 reverse MW:22210
MSNSAQRDARNSRDESARASDTDRIQIAQLLAYAAEQGRLQLTDYEDRL-
ARAYAATTYQELDRLRADLPGAAIGPRRGGE (SEQ ID NO:438)
CNPAPSTLLLALLGGFERRGRWNVPKKLTTFTLWGSGVLDLRYADFTSTEVDIRAYSIMGAQTILLPPEVNVE-
IHGHRVM GGFDRKVVGEGTRGVPTVRIRGFSLWGDVGIKRKPRKPRK >BL;Rv0970,
H37Rv2.tab 1081052:1081681 forward MW:22887
MIHDLMLRWVVTGLFVLTAAECGLAIIAKRRPWTLIVNNGLHFANAVAMAVMAWPWGARVPTTGPAVFFLLAA-
VWFGATA (SEQ ID NO:439)
VVAVRGTATRGLYGYHGLMMLATAWMYAAMNPRLLPVRSCTEY-
ATEPDGSMPANDMTAMNMPPNSGSPIWFSAVNWIGTV
GFAVAAVFWACRFVMERRQEATQSRLPGS- IGQANMAAGMAMLFFAMLFPV >BL;Rv0996,
H37Rv2.tab 1112384:1113457 forward MW:39519
MPSIPQSLLWISLVVLWLFVLVPMLISKRDAVR-
RTSDVALATRVLNGGAGARLLKRGGPAAGHRWGYLPPEGQGDDPDWK (SEQ ID NO:440)
PEEDWRDDPVEDGFADVEHDIDEDQEADDARRRGAVVMKVAAPQTAGADEPDYLDVDVVEEDSEALPVGAGAA-
VGESADE
ADAEAADGVAGHADPEADPVEYEYEYEYVEDTCGLELEEDDQEAPPTVASGTSRRRRFD-
TKTAAAVSARKYTFRKRALIV
MAVILVGSAAAAFELTPVAWWICGSATGVTVLYLAYLRRQTRIEE-
KVRRRRMQRIARARLGVENTRDREYDVVPSRLRRP
GAVVLEIDDEDPIFTHLESAAPIRNYGWPRD- LPRAVGQ >BL;Rv0998, H37Rv2.tab
1114748:1115746 forward MW:35608
LDGIAELTGARVEDLAGMDVFQGCPAEGLVSLAASVQPLRAAAGQVLLRQGEPAVSF-
LLISSGSAEVSHVGDDGVAIIAR (SEQ ID NO:441)
ALPGMIVGEIALLRDSPRSATVTTIEP-
LTGWTGGRGAFATMVHIPGVGERLLRTARQRLAAFVSPIPVRLADGTQLMLRP
VLPGDRERTVHGHIQFSGETLYRRFMSARVPSPALMHYLSEVDYVDHFVWVVTDGSDPVADARFVRDETDPTV-
AEIAFTV
ADAYQGRGIGSFLIGALSVAARVDGVERFAARMLSDNVPMRTIMDRYGAVWQREDVGVI-
TTMIDVPGPGELSLGREMVDQ INRVARQVIEAVG >BL;Rv1000, H37Rv2.tab
1116531:1117148 reverse MW:22648
MCDKLGGVAIAVQGALFEHNERRQLGDGAFIDIRSGWLTGGEELLDALLSTVPWRAERRQMYDRVVDVPRLVS-
FHDLTIE (SEQ ID NO:442)
DPPHPQLARMRRRLNDIYGGELGEPFTTAGLCYYRDGSDSVAW-
HGDTIGRGSTEDTMVAIVSLGATRVFALRPRGRGPSL
RLPLAHGDLLVMGGSCQRTFEHAVPKTSA- PTGPRVSIQFRPRDVR >BL;Rv1024,
H37Rv2.tab 1145858:1146541 forward MW:24570
MPEAKRPESKRRSPASRPGKAGDSVRGGRATKPSAKPSTPAPHASRKTT-
RTPHEHIVEPIKRAITESVEKRSEQRLGFTA (SEQ ID NO:443)
RRAAILAAVVCVLTLTIARPVRTYFAQRAEMEQLAATEANLRRQIADLEEQQVKLADPAYIAAQARERLGFVM-
PGDIPFQ
VQLPSTPLAPPQPGSDAATATNNEPWYTALWHTIADDPHLPPAAPPAPEPGRPGPLPPA-
SPNPEQPGG >BL;Rv1025, H37Rv2.tab 1146561:1147025 forward
MW:16593 VVTRQLGRAPRGVLAIAYRCPNGEPGVVKTAPRLPDGTPFPTLYYLTHP-
VLTAAASRLETTGLMREMNRRLGQDAELAAA (SEQ ID NO:444)
YRRAHESYLSERDALEPLGTTVSAGGMPDRVKCLHVLIAHSLAKGPGLNPFGDEALALLAAEPRTAATLVAGQ-
WR >BL;Rv1081c, H37Rv2.tab 1205987:1206418 reverse MW:15384
MTHTPIPRPDARYGRPRLSRRARRRVAIALGVLVAAAGIVIAVIGYQRISTSAVTGS-
LVGYRLVDDETASVTISVTRSDP (SEQ ID NO:445)
SRPVACIVRVRATNGSETGRRELLVPP- SEATTVQVTTTVKSSQPPVMADVYGCGTEVPSYLRLP
>BL;Rv1083, H37Rv2.tab 1207383:1207646 forward MW:9263
VNQILLSVIAEGGPGNTGPDFGK-
ASPVGLLVIVLLVIATLFLVRSMNQQLKKVPKSFDRDHPELDQAADEGTDRDGPARP (SEQ ID
NO:446) PGPPHESG >BL;Rv1100, H37Rv2.tab 1228683:1229381
forward MW:24562 MVGDCPRSRTVRWSWDTGHVTAEPQPTPRPAKPRLLQDGRDMFWSLAPL-
VVGCILLAGLVGMCSFQLGGTKRGPIPSYDA (SEQ ID NO:447)
AQALRADAKTLGFPIRLPQLPGGWTPNSGGRGGIENGRADPATGQRRNAATSIVGFISPTGRYLSLTQSNADE-
DKLVGSI
HPSMYPTGTVDVGGTRWVVYEGSDENGAVEPVWTTRLTGPGGATQLAITGAGSIDQFRT-
LASATQSQPPLPAR >BL;Rv1109c, H37Rv2.tab 1235460:1236095 reverse
MW:22957 MATAPYGVRLLVGAATVAVEETMKLPRTILMYPMTLASQAAHVVMRFQQ-
GLAELVIKGDNTLETLFPPKDEKPEWATFDE (SEQ ID NO:448)
DLPDALEGTSIPLLGLSDASEAKNDDRRSDGRFALYSVSDTPETTTASRSADRSTNPKTAKHPKSAAKPTVPT-
PAVAAEL DYPALTLAQLRARLHTLDVPELEALLAYEQATKARAPFQTLLANRITRATAK
>BL;Rv1111c, H37Rv2.tab 1237212:1238192 reverse MW:36985
VSAQRARSAVQASHRSIHPHIPGVPWWAAILIAVTATAIGYAIDAGSGHKALTLVFTGCYIAGCVGAVLAV-
RQSDLFTAL (SEQ ID NO:449)
VQPPLILFCAVPGAYWLFHGGTIGKFKDLLINCGYSLIERF-
PLMLGTAAGVLLIGLVRWYLGTALFDSIARKLSSLMTGD
SDDDGGRRSAQRPARTRSRHARPPSED-
NREPIAERRSRRRPRPQNDPHPRRNAMERPAPRSSRFDSYRSYQPSEPSGPAE
PVNRYERRGARYQPYARYEPTYEPQRRRARPSEPTNPTHHPISQVRYRCSATRDARRDNYREEQRFDRRDRSR-
APRRPPA ESWEYDV >BL;Rv1155, H37Rv2.tab 1281429:1281869 forward
MW:16300 MARQVFDDKLLAVISGNSIGVLATIKHDGRPQL-
SNVQYHFDPRKLLIQVSIAEPRAKTRNLRRDPRASILVDADDGWSYA (SEQ ID NO:450)
VAEGTAQLTPPAAAPDDDTVEALIALYRNIAGEHSDWDDYRQAMVTDRRVLLTLPISHVYGLPPGMR
>BL;Rv1157c, H37Rv2.tab 1283059:1284171 reverse MW:36448
VRRLTNTEHRENTTVASTWSVCKGLAAVVITSAAAFALCPNAAADPATPQPNPTQQLPGLPALAQLSPII-
QQAAMNPAQA (SEQ ID NO:451)
TQLLMAAASAFAGNPAVPTESKNVASSVNQFVAEPTNPDS-
AALGVPAPHGVALPEAIPVPHVPPLGAEPGVQAHLPTGID
PSHAAGPAPAVAPTVTPPVAAPPASA-
PAPAPDAAQPVAVPGPPPAPPAPRAAAPAPASAAPAPAAAPAPASGFGADAPPT
QDFMYPSIGPNCVADGSNSIATALSVAGPAKIPLPGPGPGQTAYVFTAVGTPGPADVQRLPLNVTWVNLTTGK-
SGSATLR PRSDINPDGPTTLTVIADTGSGSIMSTIFGQVTTKDRQCQFMPTIGSTVVP
>BL;Rv1158c, H37Rv2.tab 1284182:1284862 reverse MW:21401
MPTIWTFVRAAAVLVGSSAALLTGGIAHADPAPAPAPAPNIPQQLISSAANAPQILQNLATALGATPPLSAP-
KVAEPAPA (SEQ ID NO:452)
APGITATFPGLTPAAPAAAAAPALTPSIPGVNAPIPGITPAA-
PALPVTAPAAAPTIPGVNAPIPGITAPAPAAAAVPASV
PGVPSAKVDLPQLPYLPLQVPQQLSLPA-
DLPALASGVIPAAPIAPTPPAPGAPALPPGPPSLLAALP >BL;Rv1159A, H37Rv2.tab
1286284:1286568 reverse MW:10379
MAVLTDEQVDAALHDLNGWQRAGGVLRRSIKFPTFMAGIDAVRRVAERAEEVNHHPDIDIRWRTVTFALVTHA-
VGGITEN (SEQ ID NO:453) DIAMAHDIDANFGA >BL;Rv1171, H37Rv2.tab
1301307:1301744 forward MW:15185
VGHRVDTLSDRQRANLTTGATDRAIRLVVLALLTVDGVVSALAGALLMPWYIGSAPFPISALISGLVNAALVW-
AAARWTT (SEQ ID NO:454)
SSRVAALPLWAWLLTVAAMSFGGPGDDVILGGQGLLVYGALVF-
VVAGAVPPAWVLWRRRVQADGSG >BL;Rv1184c, H37Rv2.tab 1324535:1325611
reverse MW:37818 MKRVIAGAFAVWLVGWAGGFGTAIAASEPAYPW-
APGPPPSPSPVGDASTAKVVYALGGARMPGIPWYEYTNQAGSQYFPN (SEQ ID NO:455)
AKHDLIDYPAGAAFSWWPTMLLPPGSHQDNMTVGVAVKDGTNSLDNAIHHGTDPAAAVGLSQGSLVLDQEQAR-
LANDPTA
PAPDKLQFTTFGDPTGRHAFGASFLARIFPPGSHIPIPFIEYTMPQQVDSQYDTNEVVT-
AYDGFSDFPDRPDNLLAVANA
AIGAAIAHTPIGFTGPGDVPPQNIRTTVNSRGATTTTYLVPVNHL-
PLTLPLRYLGMSDAEVDQIDSVLQPQIDAAYARND
NWFTRPVSVDPVRGLDPLTAPGSIVEGARGL- LGSPAFCG >BL;Rv1209, H37Rv2.tab
1353157:1353522 forward MW:13089
VALVLVYLVVLVLVAIVLFAAASLLFGRGEQLPPLPRATTATTLPAFGVTRADVDAV-
KFTQVLRGYKTSEVDWVLERLGR (SEQ ID NO:456)
ELEALRSQLGAIHASSEDAEAESDASN- PSRGETVVHYRSDPA >BL;Rv1211,
H37Rv2.tab 1354243:1354467 forward MW:7810
MLGADQARAGGPARIWREHSMAAMKPRTGDGPLEATKEGRGIVMRVPLEG-
GGRLVVELTPDEAAALGDELKGVTS (SEQ ID NO:457) >BL;Rv1222, H37Rv2.tab
1365344:1365805 forward MW:16250
MADPGSVGHVFRRAFSWLPAQFASQSDAPVGAPRQFRSTEHLSIEAIAAFVDGELRMNAHLRAAHHLSLCAQC-
AAEVDDQ (SEQ ID NO:458)
SRAPAALRDSHPIRIPSTLLGLLSEIPRCPPEGPSKGSSGGSS-
QGPPDGAAAGFGDRFADGDGGNRGRQSRVRR >BL;Rv1249c, H37Rv2.tab
1393197:1393982 reverse MW:27571 MSARRIRSWKRFDNRSANAAEPDPQLAGTGGRP-
KVSTRALAQVIERSSRIQGPAAQAYVARLRRAHPGASPAKIVAKLEK (SEQ ID NO:459)
RFLSVVTASGAAVGAAATLPGIGTLAAWFAAAGEVVVFLEATALFVLALASVHAIPLDHRERRRALVLAVLVG-
DNTTAVA
DLLGPGRTSGGWVSETMASLPLPAISSLNSRMLKYVVKRFALKRGALMFGKLVPMGIGA-
IIGAIGNRLVGKKLVRNARSA FGTPPARWPVTLNVLPTVRDAS >BL;Rv1251c,
H37Rv2.tab 1395824:1399240 reverse MW:123118
VFVTGDSIVYSASDLAAAARCQYALLREFDAKLGRGPAVAVDDELMARAAVLGSAHEGRRLDQLRHEFGDAVA-
IIGRPAY (SEQ ID NO:460)
TPAGLAAAADATRRAIANHAPVVYQAAMFDGRFVGFADFLIRD-
GHRYRVADTKLARSPTVTALLQLAAYADALVHSGVPV
AADAELELGDGTIVRYRVGELIPVYRSQP-
ALLQRLLDGHYTAGTAVRWDDERVQACFRCPQCTERLRASDDLLLVGGMRV
RQRDKLLEAGITTIAELADHTAPVPGLTTNALGKLTAQAKLQIRQRDTGAPQFEIVDPRPLTLLPEPNPGDLF-
FDFEGDP
LWTADGKQWGLEYLFGVLEAGRAGVFRPLWAHDRTAERQALTDFLAIVARRRRRHPNMN-
IYHYAPYEKTALLRLVGRYGI
GEDDVDDLLRNGVLVDLYPLVRKSIRVGTDSFSLKALEPLYLGTQ-
PRSGDVTTAADSINSYARYCELRAAGRIDEAATVL
KEIEGYNHYDCRSTRALRDWLLMRAWEAGVT-
PIGAQPVPDADPIDDGDSLASVLSKFTGDAAAGERTPEQTAVALLAAAR
GYHRREDKPFWWAHFDRLNYPVDEWSDSTDVFLASEASVTVDWHMPPRARKPQRRVRLTGELARGDLNGNVFA-
LYEPPAP
PGMTDNPDRRAAGPAAVVETDDPTVPTEVVIVERTGSDGNTFQQLPFALAPGPPVPTTA-
LRESIESTAAAVASGSPQLPS
TALMDVLLRRPPRTRSGAALPRSSDPVTDIAAAALDLDSSYLAVH-
GPPGTGKTYTAARVIAELVTEHAWRIGVVAQSHAT
VENLLEGVISAGLDPGQVAKKPHDHTAGRWQ-
SIDGSQYTEFIRDTAGCVIGGTAWDFANGNRVPKASLDLLVIDEAGQFC
LANTIAVAPAATNLLLLGDPQQLPQVSQGTHPEPVDTSALSWLVDGQHTLPDERGYFLDRSYRMHPAVCAAVS-
ALSYEGR
LCSHTERTAVRRLDGYPPGVHTRGVHHKGNSIESPEEAEAILAELRQLLGSPWTDEHGT-
RPLAASDVLVLAPYNAQVALV
RRRLASAGLGGADGVRVGTVDKFQGGQAPVVFISMTASSADDVPR-
GISFLLNRNRLNVAVSRAQYAAVIVRSELLTQYLP ATPDGLVDLGAFLGLTSTS
>BL;Rv1259, H37Rv2.tab 1407339:1408235 forward MW:31783
MNIAAESSAKPVWGPPNFCAAAARMQDVRVLMHPKTGRAFRSPVEPGSGWPGDPATPQTPVAADAAQVSALAG-
GAGSICE (SEQ ID NO:461)
LNALISVCRACPRLVSWREEVAVVKRRAFADQPYWGRPVPGWG-
SKRPRLLILGLAPAAHGANRTGRMFTGDRSGDQLYAA
LHRAGLVNSPVSVDAADGLRANRIRITAP-
VRCAPPGNSPTPAERLTCSPWLNAEWRLVSDHIRAIVALGGFAWQVALRLA
GASGTPKPRFGHGVVTELGAGVRLLGCYHPSQQNMFTGRLTPTMLDDIFREAKKLAGIE
>BL;Rv1276c, H37Rv2.tab 1425441:1425914 reverse MW:16461
MRHAKSAYPDGIADHDRPLAPRGIREAGLAGGWLRANLPAVDAVLCSTATRARQTLAHTGIDAPARYAERLYG-
AAPGTVI (SEQ ID NO:462)
EEINRVGDNVTTLLVVGHEPTTSALAIVLASISGTDAAVAERI-
SEKFPTSGIAVLRVAGHWADVEPGCAALVGFHVFR >BL;Rv1277, H37Rv2.tab
1426164:1427414 forward MW:44758
VSPRPGPAGRGPAPCRCADLHSLCVDSHALRRDGMRFLHTADWQLGMTRHFLAGDAQPRYSAARRDAVAGLKA-
LAADVGA (SEQ ID NO:463)
EFVVVAGDVFEHNQLAPQIVGQSLEANRVIGLPVYLLPGNHDP-
LDASSVYTSTLFRAERPDNVVVLDRAGVHEVRPGVQI
VAAPWRSKAPTTDPVAEVLAGLPTDAAIR-
LLVAHGGVDALDPDHDKPSLIRLAALDDALTRQAIHYVALGDKHSLTQVGS
SGRVWYSGAPEVTNFDDVEPDPGHVLVVDIDESDPRHPVTVDARRIGRWRFVTLHHQVDTSRDIADLDLNLDL-
MTDKDRT
VVRLALTGSLTVTDRAALDTCLDKYARLFAWLGLWERNTDLAVIPVDAEFTDLGIGGFA-
AAAVDELVATARGGQDESAVD AQAALALLLRLADRGAA >BL;Rv1278, H37Rv2.tab
1427414:1430038 forward MW:93319
VKLHRLALTNYRGIAHRDVEFPDHGVVVVCGANEIGKSSMVEALDLLLEYKDRSTKKEVKQVKPTNADVGSEV-
IAEISSG (SEQ ID NO:464)
PYRFVYRKRFHKRCETELTVLAPRREQLTGDEAHERVRTMLAE-
TVDTELWHAQRVLQAASTAAVDLSGCDALSRALDLAA
GDDAALSGTESLLIERIEAEYARYFTPTG-
RPTGEWSAAVSRLAAAEAAVADCAAAVAEVDDGVRRHTELTEQVAELSQQL
LAHQLRLEAARVAAEKIAAITDDAREAKLIATAAAATSGASTAAHAGRLGLLTEIDTRTAAVVAAEAKARQAA-
DEQATAR
AEAEACDAALTEATQVLTAVRLRAESARRTLDQLADCEEADRLAARLARIDDIEGDRDR-
VCAELSAVTLTEELLSRIERA
AAAVDRGGAQLASISAAVEFTAAVDIELGVGDQRVSLSAGQSWSV-
TATGPTEVKVPGVLTARIVPGATALDFQAKYAAAQ
QELADALAAGEVADLAAARSADLCRRELLSR-
RDQLTATLAGLCGDEQVDQLRSRLEQLCAGQPAELDLVSTDTATARAEL
DAVEAARIAAEKDCETRRQIAAGAARRLAETSTRATVLQNAAAAESAELGAAMTRLACERASVGDDELAAKAE-
ADLRVLQ
TAEQRVIDLADELAATAPDAVAAELAEAADAVELLRERHDEAIRALHEVGVELSVFGTQ-
GRKGKLDAAETEREHAASHHA
RVGRRARAARLLRSVMARHRDTTRLRYVEPYRAELHRLGRPVFGP-
SFEVEVDTDLRIRSRTLDDRTVPYECLSGGAKEQL
GILARLAGAALVAKEDAVPVLIDDALGFTDP-
ERLAKMGEVFDTIGADGQVIVLTCSPTRYGGVKGAHRIDLDAIQ >BL;Rv1303,
H37Rv2.tab 1459766:1460248 forward MW:16863
VTTPAQDAPLVFPSVAFRPVRLFFINVGLAAVAMLVAGVFGHLTVGMFLGLGLLLGLLNALLVRRSAESITAK-
EHPLKRS (SEQ ID NO:465)
NALNSASRLAIITILGLIIAYIFRPAGLGVVFGLAFFQVLLVA-
TTALPVLKKLRTATEEPVATYSSNGQTGGSEGRSASD D >BL;Rv1312, H37Rv2.tab
1467688:1468128 forward MW:16594
MSAPMIGMVVLVVVLGLAVLALSYRLWKLRQGGTAGIMRDIPAVGGHGWRHGVIRYRGGEAAFYRLSSLRLWP-
DRRLSRR (SEQ ID NO:466)
GVEIISRRAPRGDEFDIMTDEIVVVELCDSTQDRRVGYEIALD-
RGALTAFLSWLESRPSPRARRRSM >BL;Rv1324, H37Rv2.tab 1487161:1488072
forward MW:32138 VTRPRPPLGPAMAGAVDLSGIKQRAQQNAAAST-
DADRALSTPSGVTEITEANFEDEVIVRSDEVPVVVLLWSPRSEVCVD (SEQ ID NO:467)
LLDTLSGLAAAAKGKWSLASVNVDVAPRVAQIFGVQAVPTVVALAAGQPISSFQGLQPADQLSRWVDSLLSAT-
AGKLKGA
ASSEESTEVDPAVAQARQQLEDGDFVAARKSYQAILDANPGSVEAKAAIRQIEFLIRAT-
AQRPDAVSVADSLSDDIDAAF
AAADVQVLNQDVSAAFERLIALVRRTSGEERTRVRTRLIELFELF- DPADPEVVAGRRNLANALY
>BL;Rv1332, H37Rv2.tab 1500926:1501579 forward MW:24054
MPPVCGRRCSRTGEIRGYSGSIVRRWKRVETRD-
GPRFRSSLAPHEAALLKNLAGAMIGLLDDRDSSSPSDELEEITGIKT (SEQ ID NO:468)
GHAQRPGDPTLRRLLPDFYRPDDLDDDDPTAVDGSESFNAALRSLHEPEIIDAKRVAAQQLLDTVPDNGGRLE-
LTESDAN AWIAAVNDLRLALGVMLEIGPRGPERLPGNHPLAAHFNVYQWLTVLQEYLVLVLMGSR
>BL;Rv1342c, H37Rv2.tab 1508187:1508546 reverse MW:13382
MTAPETPAAQHAEPAIAVERIRTALLGYRIMAWTTGLWLIALCYEIVVRYVVKVDNP-
PTWIGVVNGWVYFTYLLLTLNLA (SEQ ID NO:469)
VKVRWPLGKTAGVLLAGTIPLLGIVVE- HFQTKEIKARFGL >BL;Rv1343c,
H37Rv2.tab 1508546:1508923 reverse MW:14241
VSTTRRRRPALIALVIIATCGCLALGWWQWTRFQSTSGTFQNLGYALQW-
PLFAWFCVYAYRNFVRYEETPPQPPTGGAAA (SEQ ID NO:470)
EIPAGLLPERPKPAQQPPDDPVLREYNAYLAELAKDDARKQNRTTA >BL;Rv1390,
H37Rv2.tab 1565093:1565422 forward MW:11810
VSISQSDASLAAVPAVDQFDPSSGASGGYDTPLGITNPPIDELLDRVSSKYALVIYAAKRARQINDYYNQLGE-
GILEYVG (SEQ ID NO:471) PLVEPGLQEKPLSIALREIHADLLEHTEGE
>BL;Rv1417, H37Rv2.tab 1592150:1592611 forward MW:16351
VTAAPNDWDVVLRPHWTPLFAYAAAFLIAVAHVAGGLLLKVGSSGVVFQTADQVAMGALGLVLAGAVLLFARP-
RLRVGSA (SEQ ID NO:472)
GLSVRNLLGDRIVGWSEVIGVSFPGGSRWARIDLADDEYIPVM-
AIQAVDKDRAVAANDTVRSLLARYRPDLCAR >BL;Rv1476, H37Rv2.tab
1666204:1666761 forward MW:19601 MTGPYFPQTIPFLPSYIPQDVDMTAVKAEVAAL-
GVSAPPAATPGLLEVVQHARDEGIDLKIVLLDHNPPNDTPLRDIATV (SEQ ID NO:473)
VGADYSDATVLVLSPNYVGSYSTQYPRVTLEAGEDHSKTGNPVQSAQNFVHELSTPEFPWSALTIVLLIGVLA-
AAVGARL MQLRGRRSATSTDAAPGAGDDLNQGV >BL;Rv1590, H37Rv2.tab
1791334:1791570 forward MW:8602 MVEIVAGKQRAPVAAGVYNVYTG-
ELADTATPTAARMGLEPPRFCAQCGRRMVVQVRPDGWWARCSRHGQVDSADLATQR (SEQ ID
NO:474) >BL;Rv1591, H37Rv2.tab 1791570:1792232 forward MW:23151
VTEPPGFGGPSEPSGAPRTSRTRAVLFVMLGLSATGVLVGGLWAWIAPPIHAVVAITRAGERVHEY-
LGSESQNFPIAPFM (SEQ ID NO:475)
LLGLLSVLAVVASALMWQWREHRGPQMVAGLSIQLT-
TAAAIAAGVGALVVRLRYGALDFDTVPLSRGDHALTYVTQAPPV
FFARRPLQIALTLMWPAGIASLVYALLAAGTARDDLGGYPAVDPSSNARTEALETPQAPVS
>BL;Rv1610, H37Rv2.tab 1809443:1810147 forward MW:24586
VAANAGSVRPNRRARPMIGIAQLLLVVAAGALWMAARLPWVVIGSFDELGPPKEVTLTGASWSTALLPLALLM-
LAAAVAA (SEQ ID NO:476)
LAVRGWPLRALAVLLAAASFAVGYLGISLWVVPDVAARGADLA-
HVPVVTLVGSARHYWGAVAAVLAAVCALLAAVFLMSS
AAIRGSAGEDMARYAAPRARRSIARRQHS-
NAAGRAAPQDDGPDMGPRMSERMIWEALDEGRDPTDREQESDTEGR >BL;Rv1635c,
H37Rv2.tab 1840575:1842242 reverse MW:60035
MNASRPGAPPHAGLPSRRTAGDQDHRADPKVTRIMSASTLEQPAAAHVDELVARMRGRLLDPLAIAVLAAVIS-
GAWASRP (SEQ ID NO:477)
SLWFDEGATISASASRTLPELWSLLGHIDAVHGLYYLLMHGWF-
AIFPPTELWSRLPSCLAIGAAAAGVVVFAKQFSGRTT
AVCAGAVFAILPRVTWAGIEARSSALSVA-
AAVWLTVLLVAAVRCNTQRRWLLYALVLMLSILVSINLALLVPAYATMVPL
LASGKSRKSPVIWWTVVTAAALGAMTPFILFAMGQVWQVGWIAGLNRNIILDVIHRQYFDHSVPFAILAGLIV-
AAGIAAH
LAGARGPGGDTHRLVLVSAAWIVVPTAVVLIYSATVEPIYYPRYLILTAPAAAVILAVC-
VVTIARKPWLIAGVVFLLAAA
AFPNYFFTQRGPYAKEGWDYSQVADVISAHAKPGDCLLVDNTAGW-
RPGPIRALLATRPAAFRSLIDVERGTYGPKVGTLW
DGHVAVWLTTAKIDKCPTLWTIANRDKSLPD-
HQVGEMLSPGTGFGRTPVYRFPSYLGFRIVERWQFHYSQVVKSTR >BL;Rv1647,
H37Rv2.tab 1856774:1857721 forward MW:33939
LAGSARTTYPCHVEVGPQDSESGAPDETATAMASPVPRQRSALRWLRTVNRSPGLVSFIHRARRLLPGDPEFG-
DPLSTAG (SEQ ID NO:478)
EGGPRAAARAADRLLRDRDAASREVGLSVLQVWQALTEAVSRR-
PANPEVTLVFTDLVGFSTWSLHAGDDATLTLLRQVAR
AVESPLLDAGGHIVKRLGDGIMAVFRNPT-
VALRAVLVAQDAVKSLEVQGYTPRMRIGIHTGRPQRLAADWLGVDVNIAAR
VMERATKGGIMISQPTLDLIPQSELDALGVVARRVRKPVFASKPTGIPPDLAIYRIKTVSESTAADNFDEMSP-
DAQ >BL;Rv1693, H37Rv2.tab 1917756:1917929 forward MW:6094
MTIDPDQIRAEIDALLASLPDPADAENGPSLAELEGIARRLSEAHEVLLAALESAEKG (SEQ ID
NO:479) >BL;Rv1697, H37Rv2.tab 1921542:1922720 forward MW:42423
MRMSALLSRNTSRPGLIGIARVDRNIDRLLRRVCPGDIVVLDVLDLDRI-
TADALVEAEIAAVVNASSSVSGRYPNLGPEV (SEQ ID NO:480)
LVTNGVTLIDETGPEIFKKVKDGAKVRLYEGGVYAGDRRLIRGTERTDHDIADLMREAKSGLVAHLEAFAGNT-
IEFIRSE
SPLLIDGIGIPDVDVDLRRRHVVIVADEPSGPDDLKSLKPFIKEYQPVLVGVGTGADVL-
RKAGYRPQLIVGDPDQISTEV
LKCGAQVVLPADADGHAPGLERIQDLGVGANTFPAAGSATDLALL-
LADHHGAALLVTAGHAANIETFFDRTRVQSNPSTF
LTRLRVGEKLVDAKAVATLYRNHISGGAIAL-
LALTMLIAIIVALWVSRTDGVVLHWIIDYWNRFSLWVQHLVS >BL;Rv1698,
H37Rv2.tab 1922745:1923686 forward MW:32391
MISLRQHAVSLAAVFLALAMGVVLGSGFFSDTLLSSLRSEKRDLYTQIDRLTDQRDALREKLSAADNFDIQVG-
SRIVHDA (SEQ ID NO:481)
LVGKSVVIFRTPDAHDDDIAAVSKIVGQAGGAVTATVSLTQEF-
VEANSAEKLRSVVNSSILPAGSQLSTKLVDQGSQAGD
LLGIALLSNADPAAPTVEQAQRDTVLAAL-
RETGFITYQPRDRIGTANATVVVTGGALSTDAGNQGVSVARFAAALAPRGS
GTLLAGRDGSANRPAAVAVTRADADMAAEISTVDDIDAEPGRITVILALHDLINGGNVGHYGTGHGAMSVTVS-
Q >BL;Rv1754c, H37Rv2.tab 1984982:1986670 reverse MW:60608
MYRYQVRVQQRRSEMNRWVATRSRRHTYQWITDHKSPRDHYRHISELRTSIATSSPG-
RCDMSPIPRIVSVSLAWAAAIGL (SEQ ID NO:482)
MVPIGLAPPAMAAPCSGDAANAPPPPS-
AIVTDPGATALGPVRPGHGPIPTGRKPRGANDRAPLPKLGPLISALLNPGARN
AAPLQQQALVPRANPGPNPAPNPPATGPQPPNATQLTPNFAPAPDPAPAAAPDPGATLAGATTSLAEWVTGPD-
SPNKTLE
RFGISGTDLGIPWDNGDPANRQVLMIFGDTFGYCAVDGHQWRYNTLFRSQDRDLGNGVH-
VTSGDASNRYSGSPVRQPGFS
KQLINSIKWARDETGIIPTAGIAVGKTQYVNFMSIRNWGRDGEWT-
TNYSGIAVSKDNGQTWGVFPGTIRASGPDSGGKAR
FVPGNENFQMGAYLKSNDGYLYSFGTPPGRG-
GSAYLARVPQRFVPDLTKYQYWNGDSNSWVPNKPDAATPVIPGPVGEMS
VQYNTYLKQYLALYTNGMNDVVARTAPAPQGPWSAEQMLVSSWQMPGGIYAPMMHPWSTGKDVYFNLSLWSAY-
NVNLMHT VLP >BL;Rv1782, H37Rv2.tab 2017740:2019257 forward
MW:53689 VAEESRGQRGSGYGLGLSTRTQVTGYQFLARRT-
ANALTRWRVRMEIEPGRRQTLAVVASVSAALVICLGALLWSFISPSG (SEQ ID NO:483)
QLNESPIIADRDSGALYVRVGDRLYPALNLASARLITGRPDNPHLVRSSQIATMPRGPLVGIPGAPSSFSPKS-
PPASSWL
VCDTVATSSSIGSLQGVTVTVIDGTPDLTGHRQILSGSDAVVLRYGGDAWVIREGRRSR-
IEPTNRAVLLPLGLTPEQVSQ
ARPMSRALFDALPVGPELLVPEVPNAGGPATFPGAPGPIGTVIVT-
PQISGPQQYSLVLGDGVQTLPPLVAQILQNAGSAG
NTKPLTVEPSTLAKMPVVNRLDLSAYPDNPL-
EVVDIREHPSTCWWWERTAGENRARVRVVSGPTIPVAATEMNKVVSLVK
ADTSGRQADQVYFGPDHANFVAVTGNNPGAQTSESLWWVTDAGARFGVEDSKEARDALGLTLTPSLAPWVALR-
LLPQGPT LSRADALVEHDTLPMDMTPAELVVPK >BL;Rv1794, H37Rv2.tab
2031066:2031965 forward MW:32399
MDQQSTRTDITVNVDGFWMLQALLDIRHVAPELRCRPYVSTDSNDWLNEHPGMAVMREQGIVVNDAVNEQVAA-
RMKVLAA (SEQ ID NO:484)
PDLEVVALLSRGKLLYGVIDDENQPPGSRDIPDNEFRVVLARR-
GQHWVSAVRVGNDITVDDVTVSDSASIAALVMDGLES
IHHADPAAINAVNVPMEEMLEATKSWQES-
GFNVPSGGDLRRMGISAATVAALGQALSDPAAEVAVYARQYRDDAKGPSAS
VLSLKDGSGGRIALYQQARTAGSGEAWLAICPATPQLVQVGVKTVLDTLPYGEWKTHSRV
>BL;Rv1797, H37Rv2.tab 2035483:2036700 forward MW:44178
MKAQRSFGLALSWPRVTAVFLVDVLILAVASHCPDSWQADHHVAWWVGVGVAAVVTLLSVVSYHGITVISGLA-
TWVRDWS (SEQ ID NO:485)
ADPGTTLGAGCTPAIDHQRRFGRDTVGVREYNGRLVSVIEVTC-
GESGPSGRHWHRKSPVPMLPVVAVADGLRQFDIHLDG
IDIVSVLVRGGVDAAKASASLQEWEPQGW-
KSEERAGDRTVADRRRTWLVLRMNPQRNVAAVACRDSLASTLVAATERLVQ
DLDGQSCAARPVTADELTEVDSAVLADLEPTWSRPGWRHLKHFNGYATSFWVTPSDITSETLDELCLPDSPEV-
GTTVVTV
RLTTRVGSPALSAWVRYHSDTRLPKEVAAGLNRLTGRQLAAVRASLPAPTHRPLLVIPS-
RNLRDHDELVLPVGQELEHAT SSFVGQ >BL;Rv1828, H37Rv2.tab
2073081:2073821 forward MW:26411
VSAPDSPALAGMSIGAVLDLLRPDFPDVTISKIRFLEAEGLVTPRRASSGYRRFTAYDCARLRFILTAQRDHY-
LPLKVIR (SEQ ID NO:486)
AQLDAQPDGELPPFGSPYVLPRLVPVAGDSAGGVGSDTASVSL-
TGIRLSREDLLERSEVADELLTALLKAGVITTGPGGF
FDEHAVVILQCARALAEYGVEPRHLRAFR-
SAADRQSDLIAQIAGPLVKAGKAGARDRADDLAREVAALAITLHTSLIKSA VRDVLHR
>BL;Rv1830, H37Rv2.tab 2074841:2075515 forward MW:23988
VTQLVTRARSARGSTLGEQPRQDQLDFADHTGTAGDGNDGAAAASGPVQPGLFPDDSVPDELVGYRGPSACQI-
AGITYRQ (SEQ ID NO:487)
LDYWARTSLVVPSIRSAAGSGSQRLYSFKDILVLKIVKRLLDT-
GISLHNIRVAVDHLRQRGVQDLANITLFSDGTTVYEC
TSAEEVVDLLQGGQGVFGIAVSGAMRELT- GVIADFNGERADGGESIAAPEDELASRRKHRDRKIG
>BL;Rv1836c, H37Rv2.tab 2082606:2084636 reverse MW:69677
MGRHSKPDPEDSVDDLSDGNAAEQQNWEDISGSYDYPGVDQPDDGPLSSEGHYSAVGGYSASGSEDYPDIPPR-
PDWEPTG (SEQ ID NO:488)
AEPIAAAPPPLFRFGHRGPGDWQAGHRSADGRRGVSIGVIVAL-
VAVVVMVAGVILWRFFGDALSNRSHTAAARCVGGKDT
VAVIADPSIADQVKESADSYNASAGPVGD-
RCVAVAVTSAGSDAVINGFIGKWPTELGGQPGLWIPSSSISAARLTGAAGS
QAISDSRSLVISPVLLAVRPELQQALANQNWAALPGLQTNPNSLSGLDLPAWGSLRLANPSSGNGDAAYLAGE-
AVAAASA
PAGAPATAGIGAVRTLMGARPKLADDSLTAANDTLLKPGDVATAPVHAVVTTEQQLFQR-
GQSLSDAENTLGSWLPPGPAA
VADYPTVLLSGAWLSQEQTSAASAFARYLHKPEQLAKLARAGFRV-
SDVKPPSSPVTSFPALPSTLSVGDDSMRATLADTM
VTASAGVAATIMLDQSMPNDEGGNSRLSNVV-
AALENRIKAMPPSSVVGLWTFDGREGRTEVPAGPLADPVNGQPRPAALT
AALGKQYSSGGGAVSFTTLRLIYQEMLANYRVGQANSVLVITAGPHTDQTLDGPGLQDFIRKSADPAKPIAVN-
IIDFGAD PDRATWEAVAQLSGGSYQNLETSASPDLATAVNIFLS >BL;Rv1845c,
H37Rv2.tab 2095221:2096168 reverse MW:32704
VSALAFTILAVLLAGPTPALLARATWPLRAPRAANVLWQAIALAAVLSSFSAGIAIASRLLMPGPDGRPTTSF-
VGAAGRL (SEQ ID NO:489)
GWPLWAAYITVFALTVLVGARLAVAVVRVATATRRRRAHHRMV-
VDLVGVGHNGALAQPCARARDLRVLDVAQPLAYCLPG
VRSRVVVSEGTLTALADAEVAAILTHERA-
HLRARHDLVLEAFTAVHAAFPRLVRSANALGAVQLLVELLADDAAVRAAGR
TPLARALVACASGRAPSGALAVGGPSTVLRVRRLSGRGNSAVLSAAAYLAAAAVLVVPTVALAVPWLTQLQRL-
FIA >BL;Rv1846c, H37Rv2.tab 2096186:2096599 reverse MW:15211
MAKLTRLGDLERAVMDHLWSRTEPQTVRQVHEALSARRDLAYTTVMTVLQRLAKKNL-
VLQIRDDRAHRYAPVHGRDELVA (SEQ ID NO:490)
GLMVDALAQAEDSGSRQAALVHFVERV- GADEADALRRALAELEAGHGNRPPAGAATET
>BL;Rv1861, H37Rv2.tab 2109165:2109467 forward MW:10330
MDITATTEFSAMNLDGKTGIGWLGYIVIGGIAG-
WLASKIVKGGGSGILMNVVIGVVGAFGAGLVLNALGVDVNHGGYWFT (SEQ ID NO:491)
FFVALGGAVVLLWIVCMVRKT >BL;Rv1871c, H37Rv2.tab 2121498:2121884
reverse MW:14663 LNAAMNLKREFVNRVQRFVVNPIGRQLPMTMLE-
TIGRKTGQPRRTAVGGRVVDNQFWMVSEHGEHSDYVYNIKANPAVRV (SEQ ID NO:492)
RIGGRWRSGTAYLLPDDDPRQRLRGLPRLNSAGVRANGTDLLTIRVDLD >BL;Rv1883c,
H37Rv2.tab 2133234:2133692 reverse MW:17280
MCLDQVMEGSATVHMAAPPDKIWTLIADVRNTGRFSPETFEAEWLDGATGPALGARFRGHVRRNGIGPVYWTV-
CEPGREF (SEQ ID NO:493)
GFAVLLGDRPVNNWHYRLTPTADGTEVTESFRLPPSVLTTVYY-
RVFGGWLRQRRNIRDMTKTLQRIKDLVEAG >BL;Rv1891, H37Rv2.tab
2139741:2140145 forward MW:14108 MIRELVTTAAITGAAIGGAPVAGADPQRYDGDV-
PGMNYDASLGAPCSSWERFIFGRGPSGQAEACHFPPPNQFPPAETGY (SEQ ID NO:494)
WVISYPLYGVQQVGAPCPKPQAAAQSPDGLPMLCLGARGWQPGWFTGAGFFPPEP
>BL;Rv1893, H37Rv2.tab 2140486:2140701 forward MW:7467
MSFNPKDAVDAVRDIAANAVEKASDIVENAGHIIRGDIAGGASGIVKDSIDIATHAVDRTKEVFTGKTDDEG
(SEQ ID NO:495) >BL;Rv1906c, H37Rv2.tab 2152428:2152895 reverse
MW:15536 MRLKPAPSPAAAFAVAGLILAGWAGSVGLAGAD-
PEPAPTPKTAIDSDGTYAVGIDIAPGTYSSAGPVGDGTCYWKRMGNP (SEQ ID NO:496)
DGALIDNALSKKPQVVTIEPTDKAFKTHGCQPWQNTGSEGAAPAGVPGPEAGAQLQNQLGILNGLLGPTGGRV-
PQP >BL;Rv1919c, H37Rv2.tab 2171064:2171525 reverse MW:16803
MSGRKFSFEVTKTSSAPAATLFRLVTDGGNWATWAKPIVAQSSWARRGDPAPGGIGA-
IRKLGMWPVFVQEETVEYEQDRR (SEQ ID NO:497)
HVYKLVGARTPVQDYFGEVVLTPNASG-
GTDLRWSGSFTEKVRGTGPVMRAALGGAVRFFAGQLVKAAEREAVRR >BL;Rv1976c,
H37Rv2.tab 2218847:2219251 reverse MW:14997
VRWIVDGMNVIGSRPDGWWRDRHRANVMLVERLEGWAITKARGDDVTVVFERPPSTAIPSSVVEVAHAPKAAA-
NSADDEI (SEQ ID NO:498)
VRLVRSGAQPQEIRVVTSDKALTDRVRDLGAAVYPAERFRDLI- DPRGSNAARRTQ
>BL;Rv2050, H37Rv2.tab 2307821:2308153 forward MW:12971
MADRVLRGSRLGAVSYETDRNHDLAPRQIARYRTDNGEEFEVPFADDAE-
IPGTWLCRNGMEGTLIEGDLPEPKKVKPPRT (SEQ ID NO:499)
HWDMLLERRSIEELEELLKERLELIRSRRRG >BL;Rv2054, H37Rv2.tab
2313125:2313835 forward MW:25183 MTTIEIDAPAGPIDALLGLPPGQGPWPGVVVVH-
DAVGYVPDNKLISERIARAGYVVLTPNMYARGGRARCITRVFRELLT (SEQ ID NO:500)
KRGRALDDILAARDHLLANPECSGRVGIVGFCMGGQFALVLSPRGFGATAPFYGTPLPRHLSETLNGACPIVA-
SFGTRDP
LGIGAANRLRKVTAAKNIPADIKSYPGAGHSFANKLPGQPLVRIAGFGYNEAATEDAWR-
RVFEFFGQHLRAGSPGEP >BL;Rv2061c, H37Rv2.tab 2316684:2317085
reverse MW:14782 VTPTFSDLAEAQYLLLTTFTKDGRPKPVPIWAA-
LDTDRGDRLLVITEKKSWKVKRIRNTPRVTLATCTLRGRPTSEAVEA (SEQ ID NO:501)
TAAILDESQTGAVYDAIVKRYGIQGKLFTFVSKLRGGMRNNIGLELKVAESETG
>BL;Rv2091c, H37Rv2.tab 2348561:2349292 reverse MW:26019
MSGPQGSDPRQPWQPPGQGADHSSDPTVAAGYPWQQQPTQEATWQAPAYTPQYQQPADPAYPQQYPQPTPGYA-
QPEQFGA (SEQ ID NO:502)
QPTQLGVPGQYGQYQQPGQYGQPGQYGQPGQYAPPGQYPGQYG-
PYGQSGQGSKRSVAVIGGVIAVMAVLFIGAVLILGFW
APGFFVTTKLDVIKAQAGVQQVLTDETTG-
YGAKNVKDVKCNNGSDPTVKKGATFECTVSIDGTSKRVTVTFQDNKGTYEV GRPQ
>BL;Rv2111c, H37Rv2.tab 2370601:2370792 reverse MW:6944
MAQEQTKRGGGGGDDDDIAGSTAAGQERREKLTEETDDLLDEIDDVLEENAEDFVRAYVQKGGQ
(SEQ ID NO:503) >BL;Rv2125, H37Rv2.tab 2386293:2387168 forward
MW:31808 VTPSEGNAPLPELHNTVVVAAFEGWNDAGDAAGDAVAHLAASWQALPIVEIDDEAYY-
DYQVNRPVIRQVDGVTRELQWPA (SEQ ID NO:504)
MRISHCRPPGSDRDVVLMCGVEPNMRW-
RTFCDELLAVIDKLNVDTVVILGALLADTPHTRPVPVSGAAYSAASARQFGLQ
ETRYEGPTGIAGVFQSACVGAGIPAVTFWAAVPHYVSHPPNPKATIALLRRVEDVLDVEVPLADLPAQAEAWE-
REITETI AEDHELAEYVQTLEQHGDAAVDMNEALGNIDGDALAAEFERYLRRRRPGFGR
>BL;Rv2133c, H37Rv2.tab 2393854:2394639 reverse MW:28284
VLADGELTVLGRIRSASNATFLCESTLGLRSLHCVYKPVSGERPLWDFPDGTLAGRELSAYLVSTQLGWNL-
VPHTIIRDG (SEQ ID NO:505)
PAGIGMLQLWVQQPGDAVDSDPLPGPDLVDLFPAHRPRPGY-
LPVLRAYDYAGDEVVLMHADDIRLRRMAVFDVLINNADR
KGGHILCGIDGQVYGVDHGLCLHVENK-
LRTVLWGWAGKPIDDQILQAVAGLADALGGPLAEALAGRIAAAEIGALRRRAQ
SLLDQPVMPGPNGHRPIPWPAF >BL;Rv2134c, H37Rv2.tab 2394653:2395237
reverse MW:21217 MARAIHVFRTPDRFVAGTVGQPGNRTFYLQAVH-
DSRVVSVVLEKQQVAVLAERIGALLFEVNRRFGTPVPPEPTEIDDLS (SEQ ID NO:506)
PLIMPVDAEFRVGTMGLGWDSEAQSVVVELLAVTDAEFDASVVLDDTEEGPDAVRVFLTPESARQFATRSYRV-
ISAGRPP CPLCDEPLDPEGHICARTNGYRRDVLLGSGDDPAG >BL;Rv2137c,
H37Rv2.tab 2396905:2397315 reverse MW:14965
MRNMKSTSHESESGKLLSISSCRPREMVLQRYSLGMTVTADRHLADKREEFAVEDISTGIFASGYGQVGDGRS-
FSFHIEH (SEQ ID NO:507)
RSLVVEIYRPRVAGPVPQAEDVVAMAVRGLVDIDLTDERSLAA- AVRDSVASAAPVSR
>BL;Rv2144c, H37Rv2.tab 2404168:2404521 reverse MW:12027
MLIIALVLALIGLLALVFAVVTSNQLVAWVCIGASVLGVALLIVDALRE-
RQQGGADEADGAGETGVAEEADVDYPEEAPE (SEQ ID NO:508)
ESQAVDAGVIGSEEPSEEASEATEESAVSADRSDDSAK >BL;Rv2146c, H37Rv2.tab
2405669:2405956 reverse MW:10804
LVVFFQILGFALFIFWLLLIARVVVEFIRSFSRDWRPTGVTVVILEIIMSITDPPVKVLRRLIPQLTIGAVRF-
DLSIMVL (SEQ ID NO:509) LLVAFIGMQLAFGAAA >BL;Rv2147c, H37Rv2.tab
2406121:2406843 reverse MW:27629
VNSHCSHTFITDNRSPRARRGHAMSTLHKVKAYFGMAPMEDYDDEYYDDRAPSRGYARPRFDDDYGRYDGRDY-
DDARSDS (SEQ ID NO:510)
RGDLRGEPADYPPPGYRGGYADEPRFRPREFDRAEMTRPRFGS-
WLRNSTRGALANDPRRMAMMFEDGHPLSKITTLRPKD
YSEARTIGERFRDGSPVIMDLVSMDNADA-
KRLVDFAAGLAFALRGSFDKVATKVFLLSPADVDVSPEERRRIAETGFYAY Q
>BL;Rv2164c, H37Rv2.tab 2427087:2428238 reverse MW:39647
MRAKREAPKSRSSDRRRRADSPAAATRRTTTNSAPSRRIRSRAGKTSAPGRQARVSRPGPQTSPMLSPFDRPA-
PAKNTSQ (SEQ ID NO:511)
AKARAKARKAKAPKLVRPTPMERLAARLTSIDLRPRTLANKVP-
FVVLVIGSLGVGLGLTLWLSTDAAERSYQLSNARERT
RMLQQHKEALERDVREAASAPALAEAARR-
QGMIPTRDTAHLVQDPDGNWVVVGTPKPADGVPPPPLNTKLPEDPPPPPKP
AAVPLEVPVRVTPGPDDPAPPARSGPEVLVRTPDGTATLGGATHLPTQAGPQLPGPVPIPGAPGPMPAPPLGA-
VPSPAPA
ENPVPLQVGAAPPACLPGPAPVAATPGLSGGSQPMVAPPAPVPANGEQFGPVTAPVPTA- PGAPR
>BL;Rv2169c, H37Rv2.tab 2431568:2431969 reverse MW:14571
MPLSDHEQRMLDQIESALYAEDPKFASSVRGGGFRAPTARRRLQGAALFIIGLGMLV-
SGVAFKETMIGSFPILSVFGFVV (SEQ ID NO:512)
MFGGVVYAITGPRLSGRMDRGGSAAGA- SRQRRTKGAGGSFTSRMEDRFRRRFDE
>BL;Rv2170, H37Rv2.tab 2432235:2432852 forward MW:22926
LAIFLIDLPPSDMERRLGDALTVYVDAMRYPRG-
TETLRAPMWLEHIRRRGWQAVAAVEVTAAEQAEAADTTALPSAAELS (SEQ ID NO:513)
NAPMLGVAYGYPGAPGQWWQQQVVLGLQRSGFPRLAIARLMTSYFELTELHILPRAQGRGLGEALARRLLAGR-
DEDNVLL STPETNGEDNRAWRLYRRLGFTDIIRGYHFAGDPRAFAILGRTLPL
>BL;Rv2172c, H37Rv2.tab 2433634:2434536 reverse MW:33007
VTLNTIALELVPPNLEGGKERAIEDARKVVQYSAASGLDGRIRHVMMPGMIAEDDDRPIPMQPKLDVLDFWSI-
IKPELAG (SEQ ID NO:514)
VHGLCTQVTAFMDEPSLHRRLVDLSDAGMEGIVFVGVPRTMQD-
GEGSGVAPTDALSLYRQLVANRGVIVIPTRDGEQGRL
NFKCSRGATYGMTQLLYSDAIVGFLREFA-
RTTEHRPEILLSFGFVPKVETRIGLINWLIQDPGNAAVADEQAFVQKLAGS
EPARRRRLMVDLYKRVLDGVADLGFPLSIHLEATYGVSAAAFETFAEMLAYWSPAEPGKPD
>BL;Rv2175c, H37Rv2.tab 2437449:2437886 reverse MW:15743
MPGRAPGSTLARVGSIPAGDDVLDPDEPTYDLPRVAELLGVPVSKVAQQLREGHLVAVRRAGGVVIPQVFFTN-
SGQVVKS (SEQ ID NO:515)
LPGLLTILHDGGYRDTEIMRWLFTPDPSLTITRDGSRDAVSNA-
RPVDALHAHQAREVVRRAQAMAY >BL;Rv2179c, H37Rv2.tab 2441814:2442317
reverse MW:19488 VRYFYDTEFIEDGNTIELISIGVVAEDGREYYA-
VSTEFDPERAGSWVRTHVLPKLPPPASQLWRSRQQIRLDLEEFLRID (SEQ ID NO:516)
GTDSIELWAWVGAYDHVALCQLWGPMTALPPTVPRFTRELRQLWEDRGCPRMPPRPRDVHDALVDARDQLRRF-
RLITSTD DAGRGAAR >BL;Rv2183c, H37Rv2.tab 2445418:2445810 reverse
MW:13322 VSGAHTDVRPELRKLAQAILDGIDPAVRVAAAM-
ASGGGPGTGKCQQVWCPLCALAALVTGEQHPLLTVIADHSLALLEVI (SEQ ID NO:517)
RAIVDDIDRSAKPPPEGPPGGGQTGASGGENTNGEGSMKSHYQAIPVTIEE >BL;Rv2185c,
H37Rv2.tab 2447069:2447500 reverse MW:16292
VADKTTQTIYIDADPGEVMKAIADIEAYPQWISEYKEVEILEADDEGYPKRARMLMDAAIFKDTLIMSYEWPE-
DRQSLSW (SEQ ID NO:518)
TLESSSLLKSLEGTYRLAPKGSGTEVTYELAVDLAVPMIGMLK- RKAERRLIDGALKDLKKRVEG
>BL;Rv2186c, H37Rv2.tab 2447608:2447994 reverse MW:14573
MNSIQIADETYVAADAARVSAAVADRCSWRRWW-
PDLRLQVTEDRADKGIRWTVTGALTGTMEIWLEPSMDGVLLHYFLHA (SEQ ID NO:519)
EPTGVAAWQLARMNLARMTHHRRVAGKKMAFEVKTVLERSRPIGVSPVT >BL;Rv2197c,
H37Rv2.tab 2461507:2462148 reverse MW:22480
MVSRYSAYRRGPDVISPDVIDRILVGACAAVWLVFTGVSVAAAVALMDLGRGFHEMAGNPHTTWVLYAVIVVS-
ALVIVGA (SEQ ID NO:520)
IPVLLRARRMAEAEPATRPTGASVRGGRSIGSGHPAKRAVAES-
APVQHADAFEVAAEWSSEAVDRIWLRGTVVLTSAIGI
ALIAVAAATYLMAVGHDGPSWISYGLAGV- VTAGMPVIEWLYARQLRRVVAPQSS
>BL;Rv2199c, H37Rv2.tab 2463236:2463652 reverse MW:14865
MHIEARLFEFVAAFFVVTAVLYGVLTSMFATGG-
VEWAGTTALALTGGMALIVATFFRFVARRLDSRPEDYEGAEISDGAG (SEQ ID NO:521)
ELGFFSPHSWWPIMVALSGSVAAVGIALWLPWLIAAGVAFILASAAGLVFEYYVGPEKH
>BL;Rv2203, H37Rv2.tab 2468231:2468920 forward MW:24371
MPGPHSPNPGVGTNGPAPYPEPSSHEPQALDYPHDLGAAEPAFAPGPADDAALPPAAYPGVPPQVSYPKRRHK-
RLLIGIV (SEQ ID NO:522)
VALALVSAMTAAIIYGVRTNGANTAGTFSEGPAKTAIQGYLNA-
LENRDVDTIVRNALCGIHDGVRDKRSDQALAKLSSDA
FRKQFSQVEVTSIDKIVYWSQYQAQVLFT-
MQVTPAAGGPPRGQVQGIAQLLFQRGQVLVCSYVLRTAGSY >BL;Rv2206, H37Rv2.tab
2470958:2471329 forward MW:13988
MMAGEEAYLLPRDRGPVRRYVRDVVDSRRNLLGLFMPSALTLLFVMFAVPQVQFYLSPAMLILLALMTIDAII-
LGRKVGR (SEQ ID NO:523)
LVDTKFPSNTESRWRLGLYAAGRASQIRRLRAPRPQVERGGDV- G >BL;Rv2219,
H37Rv2.tab 2486235:2486984 forward MW:26864
MAKPRNAAESKAAKAQANAARKAAARQRRAQLWQAFTLQRKEDKRLLPYMIGAFLLI-
VGASVGVGVWAGGFTMFTMIPLG (SEQ ID NO:524)
VLLGALVAFVIFGRRAQRTVYRKAEGQ-
TGAAAWALDNLRGKWRVTPGVAATGNLDAVHRVIGRPGVIFVGEGSAARVKPL
LAQEKKRTARLVGDVPIYDIIVGNGDGEVPLAKLERHLTRLPANITVKQMDTVESRLAALGSRAGAGVMPKGP-
LPTTAKM RSVQRTVRRK >BL;Rv2229c, H37Rv2.tab 2502738:2503472
reverse MW:26851 MKAGVAQQRSLLELAKLDAELTRIAHRATHLPQ-
RAAYQQVQAEHNAANDRMAALRIAAEDLDGQVSRFESEIDAVRKRGD (SEQ ID NO:525)
RDRSLLTSGATDAKQLADLQHELDSLQRRQASLEDALLEVLERREELQAQQTAESRALQALRADLAAAQQALD-
EALAEID
QARHQHSSQRDMLTATLDPELAGLYERQRAGGGPGAGRLQGHRCGACRIEIGRGELAQI-
SAAAEDEVVRCPECGAILLRL EGFEE >BL;Rv2235, H37Rv2.tab
2507637:2508449 forward MW:29762
MPRLAFLLRPGWLALALVVVAFTYLCFTVLAPWQLGKNAKTSRENQQIRYSLDTPPVPLKTLLPQQDSSAPDA-
QWRRVTA (SEQ ID NO:526)
TGQYLPDVQVLARLRVVEGDQAFEVLAPFVVDGGPTVLVDRGY-
VRPQVGSHVPPIPRLPVQTVTITARLRDSEPSVAGKD
PFVRDGFQQVYSINTGQVAALTGVQLAGS-
YLQLIEDQPGGLGVLGVPHLDPGPFLSYGIQWISFGILAPIGLGYFAYAEI
RARRREKAGSPPPDKPMTVEQKLADRYGRRR >BL;Rv2239c, H37Rv2.tab
2511179:2511652 reverse MW:16962
MPIATVCTWPAETEGGSTVVAADHASNYARKLGIQRDQLIQEWGWDEDTDDDIRAAIEEACGGELLDEDTDEV-
IDVVLLW (SEQ ID NO:527)
WRDGDGDLVDTLMDAIGPLAEDGVIWVVTPKTGQPGHVLPAEI-
AEAAPTAGLMPTSSVNLGNWSASRLVQPKSRAGKR >BL;Rv2242, H37Rv2.tab
2515304:2516545 forward MW:44606
VNDNQLAPVARPRSPLELLDTVPDSLLRRLKQYSGRLATEAVSAMQERLPFFADLEASQRASVALVVQTAVVN-
FVEWMHD (SEQ ID NO:528)
PHSDVGYTAQAFELVPQDLTRRIALRQTVDMVRVTMEFFEEVV-
PLLARSEEQLTALTVGILKYSRDLAFTAATAYADAAE
ARGTWDSRMEASVVDAVVRGDTGPELLSR-
AAALNWDTTAPATVLVGTPAPGPNGSNSDGDSERASQDVRDTAARHGRAAL
TDVHGTWLVAIVSGQLSPTEKFLKDLLAAFADAPVVIGPTAPMLTAAHRSASEAISGMNAVAGWRGAPRPVLA-
RELLPER
ALMGDASAIVALHTDVMRPLADAGPTLIETLDAYLDCGGAIEACARKLFVHPNTVRYRL-
KRITDFTGRDPTQPRDAYVLR VAATVGQLNYPTPH >BL;Rv2256c, H37Rv2.tab
2529344:2529874 reverse MW:18896
VEPKEQQMRASNQFADVTSGVVYIHASPAAVCPHVEWALSSTLQAKANLVWTPQPALPPQLRAVTNWVGPVGT-
GARLANA (SEQ ID NO:529)
LRSWSVLRFEVTEDPSPGVDGQRFSHTPQLGLWSGAMSANGDI-
MVGEMRLRAMMAQGADTLAAELDSVLGTAWDQALEVY RDGGDAGEVTWLSRGVG
>BL;Rv2257c, H37Rv2.tab 2530007:2530822 reverse MW:28385
MTALEVLGGWPVPAAAAAVIGPAGVLATHGDTARVFALASVTKPLVARAAQVAVEEGVVNLDTPAGPPGSTVR-
HLLAHTS (SEQ ID NO:530)
GLAMHSDQALARPGTRRMYSNYGFTVLAESVQRESGIEFGRYL-
TEAVCEPLGMVTTRLDGGPAAAGFGATSTVADLAVFA
GDLLRPSTVSAQMHADATTVQFPGLDGVL-
PGYGVQRPNDWGLGFEIENSKSPHWTGECNSTRTFGHFGQSGGFIWVDPKA
DLALVVLTARDFGDWALDLWPAISDAVLAEYT >BL;Rv2342, H37Rv2.tab
2620272:2620526 forward MW:9187 LIGYVAVLGLGYVLGAKAGRRRY-
EQIASTYRALTGSPVARSMIEGGRRKIANRISPDAGFVTLAEIDNQTAVVQRGVERQ (SEQ ID
NO:531) PKTAR >BL;Rv2347c, H37Rv2.tab 2626226:2626519 reverse
MW:10977 MATRFMTDPHAMRDMAGRFEVHAQTVEDEARRMWASAQNISGAGWSGMA-
EATSLDTMAQMNQAFRNIVNMLHGVRDGLVR (SEQ ID NO:532) DANNYEQQEQASQQILSS
>BL;Rv2365c, H37Rv2.tab 2646750:2647088 reverse MW:11130
MMRRPITLAEQLDAEDAKLVVLARAAMARAEAGAGAAVRDVDGRTYAAAPVALSALE-
LTGLQAAVAAAVSSGATGLQAAV (SEQ ID NO:533)
LVAGSVDDPGIAAVRELAPTAAIIVTD- RAGNPL >BL;Rv2376c, H37Rv2.tab
2655612:2656115 reverse MW:16635
MKMVKSIAAGLTAAAAIGAAAAGVTSIMAGGPVVYQMQPVVFGAPLPLDPASAPDVP-
TAAQLTSLLNSLADPNVSFANKG (SEQ ID NO:534)
SLVEGGIGGTEARIADHKLKKAAEHGD-
LPLSFSVTNIQPAAAGSATADVSVSGPKLSSPVTQNVTFVNQGGWMLSRASAN ELLQAAGN
>BL;Rv2413c, H37Rv2.tab 2710354:2711301 reverse MW:33113
LHLVLGDEELLVERAVADVLRSARQRAGTADVPVSRMRAGDVGAYELAELLSPSLFAEERIVVLGAAAE-
AGKDAAAVIES (SEQ ID NO:535)
AAADLPAGTVLVVVHSGGGRAKSLANQLRSMGAQVHPCA-
RITKVSERADFIRSEFASLRVKVDDETVTALLDAVGSDVRE
LASACSQLVADTGGAVDAAAVRRYH-
SGKAEVRGFDIADKAVAGDVAGAAEALRWAMMRGEPLVVLADALAEAVHTIGRVG
PQSGDPYRLAAQLGMPPWRVQKAQKQARRWSRDTVATAMRLVAELNANVKGAVADADYALESAVRQVAELVAD-
RGR >BL;Rv2446c, H37Rv2.tab 2745770:2746138 reverse MW:13311
MTDRSREPADPWKGFSAVMAATLILEAIVVLLAIPVVDAVGGGLRPASLGYLVGLAV-
LLILLTGLQRRPWAIWVNLGAQP (SEQ ID NO:536)
VLVAGFAVYPGVCFIGVLFAALWVLIA- YLRAEVRRRRDYRVSQ >BL;Rv2466c,
H37Rv2.tab 2768264:2768884 reverse MW:23035
MLEKAPQKSVADFWFDPLCPWCWITSRWILEVA-
KVRDIEVNFHVMSLAILNENRDDLPEQYREGMARAWGPVRVAIAAEQ (SEQ ID NO:537)
AHGAKVLDPLYTAMGNRIHNQGNHELDEVITQSLADAGLPAELAKAATSDAYDNALRKSHHAGMDAVGEDVGT-
PTIHVNG VAFFGPVLSKIPRGEEAGKLWDASVTFASYPHFFELKRTRTEPPQFD
>BL;Rv2468c, H37Rv2.tab 2771647:2772147 reverse MW:17288
MTHRSSRLEVGPVARGDVATIEHAELPPGWVLTTSGRISGVTEPGELSVHYPFPIADLVALDDALTYSSRACQ-
VRFAIYL (SEQ ID NO:538)
GDLGRDTAARAREILGKVPTPDNAVLLAVSPNQCAIEVVYGSQ-
VRGRGAESAAPLGVAAASSAFEQGELVDGLISAIRVL SAGIAPG >BL;Rv2476c,
H37Rv2.tab 2777391:2782262 reverse MW:176902
MTIDPGAKQDVEAWTTFTASADIPDWISKAYIDSYRGPRDDSSEATKAAEASWLPASLLTPAMLGAHYRLGRH-
RAAGESC (SEQ ID NO:539)
VAVYRADDPAGFGPALQVVAEHGGMLMDSVTVLLHRLGIAYAA-
ILTPVFDVHRSPTGELLRIEPKAEGTSPHLGEAWMHV
ALSPAVDHKGLAEVERLLPKVLADVQRVA-
TDATALIATLSELAGEVESNAGGRFSAPDRQDVGELLRWLGDGNFLLLGYQ
RCRVADGMVYGEGSSGMGVLRGRTGSRPRLTDDDKLLVLAQARVGSYLRYGAYPYAIAVREYVDGSVVEHRFV-
GLFSVAA
MNADVLEIPTISRRVREALAMAESDPSHPGQLLLDVIQTVPRPELFTLSAQRLLTMARA-
VVDLGSQRQALLFLRADRLQY
FVSCLVYMPRDRYTTAVRMQFEDILVREFGGTRLEFTARVSESPW-
ALMHFMVRLPEVGVAGEGAAAPPVDVSEANRIRIQ
GLLTEAARTWADRLIGAAAAAGSVGQADAMH-
YAAAFSEAYKQAVTPADAIGDIAVITELTDDSVKLVFSERDEQGVAQLT
WFLGGRTASLSQLLPMLQSMGVVVLEERPFSVTRPDGLPVWIYQFKISPHPTIPLAPTVAERAATANRFAEAV-
TAIWHGR
VEIDRFNELVMRAGLTWQQVVLLRAYAKYLRQAGFPYSQSYIESVLNEHPATVRSLVDL-
FEALFVPVPSGSASNRDAQAA
AAAVAADIDALVSLDTDRILRAFASLVQATLRTNYFVTRQGSARC-
RDVLALKLNAQLIDELPLPRPRYEIFVYSPRVEGV
HLRFGPVARGGLRWSDRRDDFRTEILGLVKA-
QAVKNAVIVPVGAKGGFVVKRPPLPTGDPAADRDATRAEGVACYQLFIS
GLLDVTDNVDHATASVNPPPEVVRRDGDDAYLVVAADKGTATFSDIANDVAKSYGFWLGDAFASGGSVGYDHK-
AMGITAR
GAWEAVKRHFREIGIDTQTQDFTVVGIGDMSGDVFGNGMLLSKHIRLIAAFDHRHIFLD-
PNPDAAVSWAERRRMFELPRS
SWSDYDRSLISEGGGVYSREQKAIPLSAQVRAVLGIDGSVDGGAA-
EMAPPNLIRAILRAPVDLLFNGGIGTYIKAESESD
ADVGDRANDPVRVNANQVRAKVIGEGGNLGV-
TALGRVEFDLSGGRINTDALDNSAGVDCSDHEVNIKILIDSLVSAGTVK
ADERTQLLESMTDEVAQLVLADNEDQNDLMGTSRANAASLLPVHAMQIKYLVAERGVNRELEALPSEKEIARR-
SEAGIGL
TSPELATLMAHVKLGLKEEVLATELPDQDVFASRLPRYFPTALRERFTPEIRSHQLRRE-
IVTTMLINDLVDTAGITYAFR
IAEDVGVTPIDAVRTYVATDAIFGVGHIWRRIRAANLPIALSDRL-
TLDTRRLIDRAGRWLLNYRPQPLAVGAEINRFAAM
VKALTPRMSEWLRGDDKAIVEKTAAEFASQG-
VPEDLAYRVSTGLYRYSLLDIIDIADIADIDAAEVADTYFALMDRLGTD
GLLTAVSQLPRHDRWHSLARLAIRDDIYGALRSLCFDVLAVGEPGESSEQKIAEWEHLSASRVARARRTLDDI-
RASGQKD LATLSVAARQIRRMTRTSGRGISG >BL;Rv2484c, H37Rv2.tab
2791022:2792494 reverse MW:52309
MAESGESPRLSDELGPVDYLMHRGEANPRTRSGIMALELLDGTPDWDRFRTRFENASRRVLRLRQKVVVPTLP-
TAAPRWV (SEQ ID NO:540)
VDPDFNLDFHVRRVRVSGPATLREVLDLAEVILQSPLDISRPL-
WTATLVEGMADGRAAMLLHVSHAVTDGVGGVEMFAQI
YDLERDPPPRSTPPQPIPEDLSPNDLMRR-
GINHLPIAVVGGVLDALSGAVSMAGRAVLEPVSTVSGILGYARSGIRVLNR
AAEPSPLLRRRSLTTRTEAIDIRLADLHKAAKAGGGSINDAYLAGLCGALRRYHEALGVPISTLPMAVPVNLR-
AEGDAAG
GNQFTGVNLAAPVGTIDPVARMKKIRAQMTQRRDEPAMNIIGSIAPVLSVLPTAVLEGI-
TGSVIGSDVQASNVPVYPGDT
YLAGAKILRQYGIGPLPGVAMMVVLISRGGWCTVTVRYDRASVRN-
DELFAQCLQAGFDEILALAGGPAPRVLPASFDTQG AGSVPRSVSGS >BL;Rv2507,
H37Rv2.tab 2822438:2823256 forward MW:28520
MNDPRRPQRFGPPLSGYGPTGPQVPPNPPTADPAYADQSPYASTYGGYVSPPWSPGGPPPRPPQWPPGPHEAS-
PTQQLPQ (SEQ ID NO:541)
YWQYDQPPPGGFPPDGLTPPPPQGPRTPRWLWFAAGSAVLLVV-
ALVIALVIANGSVKKQTAIEPLPPMPGPSPTRPTTTT
PTPPSPSAAPAPTTTTGTPSETVAGAMQT-
VVYDVTGEGRAISITYMDSGNVIQTEFNVALPWRKEVSLSKSSLHPASVTI
VNIGHNVTCSVTVAGVQVRQRTGAGLTICDAPS >BL;Rv2520c, H37Rv2.tab
2837391:2837615 reverse MW:8341 VVDRDPNTIKQEIDQTRDQLAAT-
IDSLAERANPRRLADDAKTRVIAFLRKPIVTVSLVGIGSVVVVVVIHKIRNR (SEQ ID
NO:542) >BL;Rv2525c, H37Rv2.tab 2849855:2850574 reverse
MW:25369
MSVSRRDVLKFAAATPGVLGLGVVASSLRAAPASAGSLGTLLDYAAGVIPASQIRAAGAVGAIRYVSDRR-
PGGAWMLGKP (SEQ ID NO:543)
IQLSEARDLSGNGLKIVSCYQYGKGSTADWLGGASAGVQH-
ARRGSELHAAAGGPTSAPIYASIDDNPSYEQYKNQIVPYL
RSWESVIGHQRTGVYANSKTIDWAVN-
DGLGSYFWQHNWGSPKGYTHPAAHLHQVEIDKRKVGGVGVDVNQILKPQFGQWA
>BL;Rv2536, H37Rv2.tab 2860452:2861141 forward MW:24626
MTNWMLRGLAFAAAMVVLRLFQGALINAWQMLSGLISLVLLLLFAIGGVVWGVMDGRADAKASPDPDRRQDLA-
MTWLLAG (SEQ ID NO:544)
LVAGALSGAVAWLISLFYKAIYTGGPINELTTFAAFTALIVFL-
VGIVGVAVGRWLVDRQLAKAPVRHHGLAAEHERAADT
DVFSAVRADDSPTGEMQVAQPEAQTAAVA-
TVEREAPTEVIRTTESDTPTEVIRTDTEADQTKPGDEPKKD >BL;Rv2588c,
H37Rv2.tab 2915849:2916193 reverse MW:12966
MESFVLFLPFLLIMGGFMYFASRRQRRAMQATIDLHDSLQPGERVHTTSGLEATIVAIADDTIDLEIAPGVVT-
TWMKLAI (SEQ ID NO:545) RDRILPDDDIDEELNEDLDKDVDDVAGERRVTNDS
>BL;Rv2609c, H37Rv2.tab 2936813:2937865 reverse MW:38096
MTWLVLAGAVLLVVLVAFGAWGYQTANRLNRLNVRYDLSWQSLDSALARRAVVARAVAIDAYGGAPQGSRLA-
ALADAAEG (SEQ ID NO:546)
APRHARENAENELSAALANVNPASLPAALIAELADAEARVLL-
ARRFHNDAVRDTLALGERRLVRLLRLGGTAVLPTYFEI
VERPHALVNGDQGASGRRTSARVVLLDD-
SGAVLLLCGSDPANPAFRDGAAPKWWFTVGGQVRPGERLAQAAARELAEETG
LRVAPADMIGPIWRRDEVFEFNGSLIDSEEFYLVHRTRRFEPAVQGRTELERRYIRDARWCDANDIAQLVAAG-
ERWPLQ LGELLPAANRLVDVALDNGAARDAGVPQPIR >BL;Rv2673, H37Rv2.tab
2989290:2990588 forward MW:48883
VYGALVTAADSIRTGLGASLLAGFRPRTGAPSTATILRSALWPAAVLSVLHRSIVLTTNGNITDDFKPVYRAV-
LNFRRGW (SEQ ID NO:547)
DIYNEHFDYVDPHYLYPPGGTLLMAPFGYLPFAPSRYLFISIN-
TAAILVAAYLLLRMFNFTLTSVAAPALILAMFATETV
TNTLVFTNINGCILLLEVLFLRWLLDGRA-
SRQWCGGLAIGLTLVLKPLLGPLLLLPLLNRQWRALVAAVVVPVVVNVAAL
PLVSDPMSFFTRTLPYILGTRDYFNSSILGNGVYFGLPTWLILFLRILFTAITFGALWLLYRYYRTGDPLFWF-
TTSSGVL
LLWSWLVMSLAQGYYSMMLFPFLMTVVLPNSVIRNWPAWLGVYGFMTLDRWLLFNWMRW-
GRALEYLKITYGWSLLLIVTF TVLYFRYLDAKADNRLDGGIDPAWLTPEREGQR
>BL;Rv2680, H37Rv2.tab 2996104:2996733 forward MW:22555
LTSAGDDAERSDEEERRLTSAEPALFREAVAANNAVTVRPEIELGPIRPPQRLAPYSYALGAEIKHPELDVIP-
ERSEGDA (SEQ ID NO:548)
FGRLIMLYDPDGSDAWDGTIRLVAYVQADLDSSEAVDPLLPEV-
AWSWLVDALTARTDQVRALGGTVTATTSVRYGDISGP
PRAHQLELRASWTATTPDLGAHVQAFCDV- LEHAAGLPPAGVTDLGSRSRA >BL;Rv2683,
H37Rv2.tab 3000111:3000605 forward MW:17729
MKVNIDPTAPTFATYRRDMRAEQMAEDYPVVSI-
DSDALDAARMLAEHRLPGLLVTAGAGKQYAVLPASQVVRFIVPRYVQ (SEQ ID NO:549)
DDPLLAGVLNESTADRCAERLSGKKVRDVLPDHLVEVPPANADDTIIEVAAVMARLRSPLLAVVKDGSLLGVV-
TASRLLA AALKT >BL;Rv2695, H37Rv2.tab 3011915:3012619 forward
MW:24154 MAVDLDGVTTVLLPGTGSDNDYVRRAFSAPLRR-
AGAVLVTPVPHPGRLIDGYRAALDDAARDGPVVVGGVSLGAAVAAAW (SEQ ID NO:550)
ALEHPDRAVAVLAALPAWTGEPELAPAAQAARYTAARLRCDGLAATTTRMRASSPVWLAEELTRSWRVQWPEL-
PDAMEEA
AAYVAPSRAELARLVAPLAVAAAVDDPIHPLQVAADWVSVAPHAALRTVTLDEIGADAA-
ALGSACLAALAEVSGA >BL;Rv2696c, H37Rv2.tab 3012831:3013607 reverse
MW:27216 MAFGRRTGKDGGKRKAGHAPVQPADEHVRPEDT-
VVASAAAASGVEDQEELQGPFDIDDFDDPSVAVLARLDLGSVLIPMP (SEQ ID NO:551)
AAGQVQVELTESGVPSAVWVITPNGRYSIAAYAAPKTGGLWREVAGELADSLRKDSAKVSIKDGPWGREVIGI-
AAGVVRF
IGVDGYRWNIRCVVNGPQETVDALTEEAREALADTVVRRGDTPLPVRTPLPVHLPEPMA-
AQLREAAAAQADTQRQAAAGV ARRGAQGSAMQQLRSTTGG >BL;Rv2698, H37Rv2.tab
3014172:3014654 forward MW:17530
VSGTRLAPHSVRYRERLWVPWWWWPLAFALAALIAFEVNLGVAALPDWVPFATLFTVAAGTLLWLGRVEIRVT-
AGSADGA (SEQ ID NO:552)
GVKLWAGPAHLPVAVIARSAEIPATAKSAALGRQLDPAAYVLH-
RAWVGPMVLVVLDDPNDPTPYWLVSCRHPERVLSALR S >BL;Rv2699c, H37Rv2.tab
3014665:3014964 reverse MW:10915
MPTDYDAPRRTETDDVSEDSLEELKARRNEAASAVVDVDESESAESFELPGADLSGEELSVRVVPKQADEFTC-
SSCFLVQ (SEQ ID NO:553) HRSRLASEKNGVMICTDCAA >BL;Rv2700,
H37Rv2.tab 3015202:3015849 forward MW:22627
VVAQITEGTAFDKHGRPFRRRNPRPAIVVVAFLVVVTCVMWTLALTRPPDVREAAVCNPPPQPAGSAPTNLGE-
QVSRTDM (SEQ ID NO:554)
TDVAPAKLSDTKVHVLNASGRGGQAADIAGALQDLGFAQPTAA-
NDPIYAGTRLDCQGQIRFGTAGQATAAALWLVAPCTE
LYHDSRADDSVDLALGTDFTTLAHNDDID- AVLANLRPGATEPSDPALLAKIHANSC
>BL;Rv2708c, H37Rv2.tab 3021550:3021795 reverse MW:8962
MSGMQTQTIERTDADERVDDGTGSDTPKYFHYVK-
KDKIAESAVMGSHVVALCGEVFPVTRAPKPGSPVCPDCKRIYDTLK (SEQ ID NO:555) KG
>BL;Rv2709, H37Rv2.tab 3021838:3022281 forward MW:16810
MWDSRVNKHGLRLGFNGQFDDFDDFDDKGRPVLITAAAPSYEVEHRTRVRKYLTLMAFRVPALILA-
AIAYGAWHNGLISL (SEQ ID NO:556)
LIVAASVPLPWMAVLIANDRPPRRADEPRRFDVARR-
RIPLFPTAERPALEPRRQPAERSAPRGFADHG >BL;Rv2714, H37Rv2.tab
3027064:3028035 forward MW:35520 MARDQGADEAREYEPGQPGMYELEFPAPQLSSS-
DGRGPVLVHALEGFSDAGHAIRLAAAHLKAALDTELVASFAIDELLD (SEQ ID NO:557)
YRSRRPLMTFKTDHFTHSDDPELSLYALRDSIGTPFLLLAGLEPDLKWERFITAVRLLAERLGVRQTIGLGTV-
PMAVPHT
RPITMTAHSNNRELISDFQPSISEIQVPGSASNLLEYRMAQHGHEVVGFTVHVPHYLTQ-
TDYPAAAQALLEQVAKTGSLQ
LPLAVLAEAAAEVQAKIDEQVQASAEVAQVVAALERQYDAFIDAQ-
ENRSLLTRDEDLPSGDELGAEFERFLAQQAEKKSD DDPT >BL;Rv2715c, H37Rv2.tab
3031042:3031536 reverse MW:17324
MTPVRPPHTPDPLNLRGPLDGPRWRRAEPAQSRRPGRSRPGGAPLRYHRTGVGMSRTGHGSRPVPPATTVGLA-
LLAAAIT (SEQ ID NO:558)
LWLGLVAQFGQMITGGSADGSADSTGRVPDRLAVVRVETGESL-
YDVAVRVAPNAPTRQVADRIRELNGLQTPALAVGQTL IAPVG >BL;Rv2722,
H37Rv2.tab 3034634:3034879 forward MW:9077
MPCLARQPVDLPPWAGPRCGPYCPRARITLLQRTTIAKSNRKYYENGYPADVKLMPGHAAVVSNRAAARAGFA-
LPCRKRQ (SEQ ID NO:559) PD >BL;Rv2728c, H37Rv2.tab
3040768:3041460 reverse MW:23455
VLSAIGIVPSAPVLVPELAGAAAAELADLGAAVIAAASLLPKSWIAVGTGRADDVVRPTDVGTFAGFGADVRV-
GLAPQDG (SEQ ID NO:560)
DGVAVPVELPLCALLTAWVRGQARPEARAQVHVYASDHGSDAA-
VARGRQLRADIDREPDPIGVLVVADGLNTLTPRAPGG
YDPDGAGMQRALDDALASGDLAVLTRLPA-
QVLGRVAFQVLAGLAEPGPRSAKEFYRGAPHGVGYFAGVWQP >BL;Rv2732c,
H37Rv2.tab 3044377:3044988 reverse MW:21989
MMSHEHDAGDLDALRAEIEAAERRVAREIEPGARALVVAILVFVLLGSFILFHTGSVRGWDVLFSSHGAGRAA-
VALPSRV (SEQ ID NO:561)
FAWLALVFGVGFSMLALLTRRWALAWVALAGSANASGTGLLAV-
WSRQTVAAGHPGPGIGLIVAWITAIVLTFXWAQVVWS
RTIVQLAAEERRRRVVAQQQCKTLLDHVQ- TDSEAGTTPDRGTDR >BL;Rv2738c,
H37Rv2.tab 3051808:3052011 reverse MW:7551
MLAGVRLTEFHERVALHFGAAYGSSVLLDHVLTGFDGRSAAQAIEDGVEP-
RDVWRALCADFDVPHDRW (SEQ ID NO:562) >BL;Rv2740, H37Rv2.tab
3053232:3053678 forward MW:16593
MAELTETSPETPETTEAIRAVEAFLNALQNEDFDTVDAALGDDLVYENVGFSRIRGGRRTATLLRRMQGRVGF-
EVKIHRI (SEQ ID NO:563)
GADGAAVLTERTDALIIGPLRVQFWVCGVFEVDDGRITLWRDY-
FDVYDMFKGLLRGLVALVVPSLKATL >BL;Rv2771c, H37Rv2.tab
3080583:3081032 reverse MW:16000 VRRLLIVHHTPSPHMQEMFEAVVSGATDPEIEG-
VEVVRRPALTVSPIEMLEADGYLLGTPANLGYISGALKHAFDVCYYL (SEQ ID NO:564)
CLDTTRGRSFGAYIHGNEGTEGAERAVDAITTGLGWVQAAETVVVMGKPSKADIEACWNLGATVAAQLMG
>BL;Rv2772c, H37Rv2.tab 3081121:3081591 reverse MW:17326
MTRRTLYVQLIIAFMCVAMVAYLVMLGRVAVAMIGSGRAAAAGLGLALLILPVIGLWANIATLRAG-
FAYQRLARLIAEDG (SEQ ID NO:565)
LDIDASALPRRASGRIQRDAADALFAAVRTELEDDA-
DDWRRWYRLARAYDYAGDRRRAREANKTALQLEGRARPGAR >BL;Rv2795c,
H37Rv2.tab 3103939:3104910 reverse MW:37568
VTWKGSGQETVGAEPTLWAISDLHTGHLGNKPVAESLYPSSPDDWLIVAGDVAERTDEIRWSLDLLRRRFAKV-
IWVPGNH (SEQ ID NO:566)
ELWTTNRDPMQIFGRARYDYLVNMCDEMGVVTPEHPFPVWTER-
GGPATIVPMFLLYDYSFLPEGANSKAEGVAIAKERNV
VATDEFLLSPEPYPTRDAWCHERVAATRA-
RLEQLDWMQPTVLVNHFPLLRQPCDALFYPEFSLWCGTTKTADWHTRYNAV
CSVYGHLHIPRTTWYDGVRFEEVSVGYPREWRRRKPYSWLRQVLPDPQYAPGYLNDFGGHFVITPEMRTQAAQ-
FRERLRQ RQSR >BL;Rv2840c, H37Rv2.tab 3147961:3148257 reverse
MW:10601 VRTCVGCRKRGLAVELLRVVAVSTGNGNYAVIV-
DTATSLPGRGAWLHPLRQCAQQAIRRAFARALRIAGSPDTSAVVEY (SEQ ID NO:567)
LESLGELEPPGNRTGSNRT >BL;Rv2843, H37Rv2.tab 3150170:3150712
forward MW:17735 VLPAAPVINRLTNRPISRRGVLAGGAALAALGV-
VSACGESAPKAPAVEELRSPLDQARHDGALAAAAATAIGIPPQVAAA (SEQ ID NO:568)
LTVVATQRTSHARALATEIARAAGKLVSATSETSSSSPSPTDPAAPPPAVSDVIDSLRTSAGEASRLVATTSG-
YRAGLLA SIAASCTASYTVALVPSGPSI >BL;Rv2844, H37Rv2.tab
3150712:3151197 forward MW:16940
MTSSEPAHGATPKRSPSEGSADNAALCDALAVEHATIYGYGIVSALSPPGVNFLVADALKQHRHRRDDVIVML-
SARGVTA (SEQ ID NO:569)
PIAAAGYQLPMQVSSAANAARLAVRMENDGATAWRAVVEHAET-
ADDRVFASTALTESAVMATRWNRVLGAWPITAAFPGG DE >BL;Rv2876, H37Rv2.tab
3187662:3187973 forward MW:11805
MFGQWEFDVSPTGGIAVASTEVEHFAGSQHEVDTAEVPSAAWGWSRIDHRTWHIVGLCIFGFLLAMLRGNHVG-
HVEDWFL (SEQ ID NO:570) ITFAAVVLFVLARDLWGRRRGWIR >BL;Rv2898c,
H37Rv2.tab 3207944:3208327 reverse MW:14223
MTTLKTMTRVQLGAMGEALAVDYLTSMGLRILNRNWRCRYGELDVIACDAATRTVVFVEVKTRTGDGYGGLAH-
AVTERKV (SEQ ID NO:571)
RRLRRLAGLWLADQEERWAAVRIDVIGVRVGPKNSGRTPELTH- LQGIG >BL;Rv2901c,
H37Rv2.tab 3211805:3212107 reverse MW:12225
MSAEDLEKYETEMELSLYREYKDIVGQFSYVVETERRFYLANSVEMVPRNTDGEVYF-
ELRLADAWVWDMYRPARFVKQVR (SEQ ID NO:572) VVTFKDVNIEEVEKPELRLPE
>BL;Rv2917, H37Rv2.tab 3226362:3228239 forward MW:68334
VRVTRLVDAESTRCDVGPAPKSVANLHFTAATSRFRLGRERANSVRSDGGWGVLQPVSATFNPPLRGWQRR-
ALVQYLGTQ (SEQ ID NO:573)
PRDFLAVATPGSGKTSFALRIAAELLRYHTVEQVTVVVPTE-
HLKVQWAHAAAAHGLSLDPKFANSNPQTSPEYHGVMVTY
AQVASHPTLHRVRTEARKTLVVFDEIH-
NGGDAKTWGDAIREAFGDATRRLALTGTPFRSDDSPIPFVSYQPDADGVLRSQ
ADHTYGYAEALADGVVRPVVFLAYSGQARWRDSAGEEYEARLGEPLSAEQTARAWRTALDPEGEWMPAVITAA-
DRRLRQL
RAHVPDAGGMIIASDRTTARAYARLLTTMTAEEPTVVLSDDPGSSARITEFAQGTSRWL-
VAVRMVSEGVDVPRLSVGVYA
TNASTPLFFAQAIGRFVRSRRPGETASIFVPSVPNLLQLASALEV-
QRNHVLGRPHRESAHDPLDGDPATRTQTERGGAER
GFTALGADAELDQVIFDGSSFGTATPTGSDE-
EADYLGIPGLLDAEQMRALLHRRQDEQLRKRAQLQKGATQPATSGASAS
VHGQLRDLRRELHTLVSIAHHRTGKPHGWIHDERRRRCGGPPIAAATRAQIKARIDALRQLNSERS
>BL;Rv2926c, H37Rv2.tab 3240550:3241170 reverse MW:22378
VDLGGVRRRISLMARQHGPTAQRHVASPMTVDIARLGRRPGAMFELHDTVHSPARIGLELIAIDQGALLDL-
DLRVESVSE (SEQ ID NO:574)
GVLVTGTVAAPTVGECARCLSPVRGRVQVALTELFAYPDSA-
TDETTEEDEVGRVVDETIDLEQPIIDAVGLELPFSPVCR
PDCPGLCPQCGVPLASEPGHRHEQIDP- RWAKLVEMLGPESDTLRGER >BL;Rv2949c,
H37Rv2.tab 3299973:3300569 reverse MW:22587
MTECFLSDQEIRKLNRDLRILIAANGTLTRVLN-
IVADDEVIVQIVKQRIHOVSPKLSEFEQLGQVGVGRVLQRYIILKGR (SEQ ID NO:575)
NSEHLFVAAESLIAIDRLPAAIITRLTQTNDPLGEVMAASHIETFKEEAKVWVGDLPGWLALHGYQNSRKRAV-
ARRYRVI SGGQPIMVVTEHFLRSVFRDAPHEEPDRWQFSNAITLAR >BL;Rv2968c,
H37Rv2.tab 3323073:3323702 reverse MW:23100
VVAARPAERSGDPAAVRVPVPSAWWVLIGGVIGLFASMTLTVEKVRILLDPIYVPSCNVNPIVSCGSVMTTPQ-
ASLLGFP (SEQ ID NO:576)
NPLLGIAGFTVVVVTGVLAVAKVPLPRWYWIGLAVGILVGVAF-
VHWLIPQSLYRIGALCPYCMVVWAVIATLLVVVASIV
FGPMRENRGSQERVGARLLYQWRWSLATL- WFTTVFLLIMVRFWDYWSTLI >BL;Rv2980,
H37Rv2.tab 3335959:3336501 forward MW:18752
VTGESDGPPRAVLIAAAALAAAVIGVILVVAAN-
RQPPERPVVIPAVPAPQATGPGCKALLAALPQRLGEYPRAPVAEPTT (SEQ ID NO:577)
AGATAWRTGPNSTPVILRCGLDRPAEFVVGSAIQVVDRVQWFQVAAQNPDEPGRSTWYTVDRPVYVALTLPSG-
SGPTAIQ ELSDVIDHTIPAVPIOPAPAR >BL;Rv3005c, H37Rv2.tab
3363695:3364531 reverse MW:28827
VTSSNDSHWQRPDDSPGPMPGRPVSASLVDPEDDLTPARYAGDFGSGTTTVIPPYDAASSGVGNSGYSLIEAA-
EPLPYVQ (SEQ ID NO:578)
PQPGRQVPAGSAGIDMDDDERVRAAGRRGTQNLGLLILRVGLG-
AVLIAHGLQKLFGWWDGQGLAGFQNSLSDIGYQHAEI
LAYVSAGGEIVAGVLLVLGLFTPLAAAGA-
LAFLINGLLAGISAQHSRPVAYFLQDGHEYQITLVVMAVAVILSGPGRYGL
DAARGWAHRPFIGSFVALLGGIAAGIAVWVLLNGANPLA >BL;Rv3013, H37Rv2.tab
3371814:3372467 forward MW:22967
VRSYLLRIELADRPGSLGSLAVALGSVGADILSLDVVERGNGYAIDDLVVELPPGAMPDTLITAAEALNGVRV-
DSVRPHT (SEQ ID NO:579)
GLLEAHRELELLDHVAAAEGATARLQVLVNEAPRVLRVSWCTV-
LRSSGGELHRLAGSPGAPETRANSAPWLPIERAAALD
GGADWVPQAWRDMDTTMVAAPLGDTHTAV- VLGRPGPEFRPSEVARLGYLAGIVATMLR
>BL;Rv3015c, H37Rv2.tab 3374653:3375663 reverse MW:34212
VSVFATATGIGSWPGTAAREAAQVVVGELAGAL-
AYLTELPARGVGADMLGRAGGLLVDVAIDTVPRGYRIAARPGAVTRR (SEQ ID NO:580)
AASLLDEDMDALEEAWETAGLRGCGRAVKVQAPGPVTLVAGLELANGHRAITDPGAVRDLAASLAEGVAAHRA-
ALARRLD
TPVVVQFDEPSLPAALGGRLTGVTALSPVAPLDETVAEALLDTCIAAVDADVALHSCSP-
DLPWDLLQRSRISAVSVDAST
LQAADLDAVAAFVESGRTVVLGLVPVTAPERAPSMEEVAAAAVAV-
TDRLGVPRSALRDRLGVSPACGLANATGQWARTAV GLARDVAEAFARDPEAI
>BL;Rv3035, H37Rv2.tab 3395378:3396457 forward MW:37305
LAAGPALSARGYLALNGQTPAGCSLMEWQNDNNGRQRWCVRLVQGGGFAGPLFDGFDNLYVGQPGAIISFPPT-
QWTRWRQ (SEQ ID NO:581)
PVIGMPSTPRFLGHGRLLVSTHLGQLLVFDTRRGMVVGSPVDL-
VDGIDPTDATRGLADCAPARPGCPVAAAPAFSSVNGT
VVVSVWQPGEPAAKLVGLKYHAEQLVREW-
TSDAVSAGVLASPVLSADGSTVYVNGRDHRLWALNAADGKAKWSAPLGFLA
QTPPALTPHGLIVSGGGPDTALAAFRDAGDHAEGAWRRDDVTALSTASLAGTGVGYTVISGPNHDGTPGLSLL-
VFDPANG HTVNSYPLPGATGYPVGVSVGNDRRvVTATSDGQVYSFAP >BL;Rv3038c,
H37Rv2.tab 3398427:3399407 reverse MW:36049
MTRSSNIPADATPNPHATAEQVAAARHDSKLAQVLYHDWEAENYDEKWSISYDQRCVDYARGRFDAIVPDEVI-
AQLPYDR (SEQ ID NO:582)
ALELGCGTGFFLLNLIQAGVARRGSVTDLSPGMVKVATRNGQA-
LGLDIDGRVADAEGIPYDDDAFDLVVGHAVLHHIPDV
ELSLREVVRVLKPGGRFVFAGEPTTVGDG-
YARTLSTLTWRVVTNATKLPGLRGWRRPQGELDESSRAAALEALVDLHTFT
PQDLQRIAHNAGAVEVQTATEEFTAANLGWPLRTFECTVPPGRLGWGWARFAFTSWKTLGWVDANVWRHVVPK-
GWFYNVM ITGVKPS >BL;Rv3195, H37Rv2.tab 3564363:3565778 forward
MW:49325 VSTGEVMGDLPFGFSSGDDPPEDPSGRDKRGKD-
GADSGSGANPLGAFGIGGEFNMADLGQIFTRLGEMFGGVGTAMAAGK (SEQ ID NO:583)
TSGPVNYDLARQVASSSIGFIAPIPAATNSAIADAVHLADTWLDGATSLPAGATKAVGWSPTDWVDNTLATWK-
RLCDPMA
QQISTVWASSLPEEAKSMAGPLLSIMSQMGGIAFGSQLGQALGRLSREVLTSTDIGLPL-
GPKGVAAILPGAVESFAAGLE
QPRSEILTFLATREAAHHRLFSHVPWLASQLLGAVEAYAMGMKID-
MTGIEELARDINPTSLADPAAMEQLLSQGVFEPKA
TPAQTQALERLETLLALIEGWVQTVVTAALG-
ERIPGEAALSETLRRRRASGGPAEQTFATLVGLELRPRKLREAGALWER
LTRAVGMDARDAVWQHPDLLPATDDLDDPAAFIDRVIGGDTSGIDEAIAELERDQQARGADDSGHDGGPVDN
>BL;Rv3205c, H37Rv2.tab 3581629:3582504 reverse MW:31352
MGSTRLTGVNVEPPPEHVLVAFGLAGAQPILLGAGWEGGWRCGEVVLSMVADNARAA-
WSARVRETLFVDGVRLARPVRST (SEQ ID NO:584)
DGRYVVSGWRADTFVAGAPEPRHDEVV-
SAAVRLHEATGKLERPRFLTQGPAAPWAEIDVFVAADRAGWEERPLQSVPPGV
PTAPPAADPQRSIDLINQLAGLRKPTKSPNQLVUGDLYGTVLFAGTAPPGITDITPYWRPASWAAGVAVVDAL-
SWGAADD GLIERWNALPEWPQMLLRALMFRLAVYALHPRSTAEAFPGLAHTAALVRLVL
>BL;Rv3207c, H37Rv2.tab 3583803:3584657 reverse MW:31034
VSTYGWRAYALPVLMVLTTVVVYQTVTGTSTPRPAAAQTVRDSPAIGVVGTAILDAPPRGLAVFDANLPAG-
TLPDGGPFT (SEQ ID NO:585)
EAGDKTWRVVPGTTPQVGQGTVKVFRYTVEIENGLDPTMYG-
GDNAFAQMVDQTLTNPKGWTHNPQFAFVRIDSGKPDFRI
SLVSPTTVRGGCGYEFRLETSCYNPSF-
GGMDRQSRVFINEARWVRGAVPFEGDVGSYRQYVINHEVGHAIGYLRHEPCDQ
QGGLAPVMMQQTFSTSNDDAAKFDPDFVKADGKTCRFNPWPYPIP >BL;Rv3208c,
H37Rv2.tab 3585679:3585948 reverse MW:9400
VEVKIGITDSPRELVFSSAQTPSEVEELVSNALRDDSGLLTLTDERGRRFLIHTARIAYVEIGVADARRVGFG-
VGVDAAA (SEQ ID NO:586) GSAGKVATSG >BL;Rv3209, H37Rv2.tab
3586273:3586830 forward MW:19118
VALGAVATAVIINSGDSTSTKAIVGAPAPRTVISTSPRPTAPTSTSPHPSPSTLRPQLPPETVTTVAPPGTGP-
TTVPTRT (SEQ ID NO:587)
PTAAPPQTAVPPPAPLNPRTVVYRVTGTKQLFDLVNVVYTDAR-
GFPVTDFNVSLPWTKMVVLNPGVQTESVVATSLYSRL NCSIVNTGAQTVVASTNNAIIATCTR
>BL;Rv3212, H37Rv2.tab 3589393:3590613 forward MW:42506
MVKPERRTKTDIAAAATIAVVVAVAASLIWWTSDARATISRPAAVAVPTPAPAREVPTSLKQLWTAAS-
PATRVPVVVGGT (SEQ ID NO:588)
VATGDGRQVDGRDPATGESLWSYARDTDLCGVTWVYHY-
AVAVYRYDRGCGQVSTIDGSTGRRGAARSGYADPRVRLPSDG
TTVLSAGDTRLELWRSDMVRMLAY-
GEIDARVKPSNRGLQSGCTLESAAASSAAVSVLEACTNQADLRLVLLRPGKEDDEP
IQRIVPEPGVRPGSGARVLVVSQNNTAVYLPARSGAQPRVDVIDETGATVSSTLLAKPPSTSAVASRTGNLVT-
WWTGDAL
LVFDAGNLTQRYTIAAGETTAPVGPGVMMAGQLLVPVTGGIGVYDPVSGANNRYIPVTR-
PPSTSAVIPAVSGSRVIEQRG DTLVALG >BL;Rv3217c, H37Rv2.tab
3593806:3594234 reverse MW:14260
VPVRAPAAVRGAGLIVAVQGGAALVVAAALLVRGLAGADQHIVNGLGTAGWFVLVGGAVLAAGCRLAVGKLWG-
RGLAVFA (SEQ ID NO:589)
QLLLLPVAWYLIVGSHQPAIGIPVGIIALGVLVLLFSPPSIRW- AAGRDQRGAASAANRGPDSR
>BL;Rv3242c, H37Rv2.tab 3621572:3622210 reverse MW:22481
VLDLVLPLECGGCGAPATRWCAACAAELSVAAG-
EPHVVSPRVDPQVPVFALGRYAGVRRQAILAMKEHGRRDLVAPLACA (SEQ ID NO:590)
LIVGVDHLLSWGMLENPLTMVPAPTRRWAARRRGGDPVSRMARIAGATLGRHHDVTVVPALRMRALARDSVGL-
GASARER NITGRVLLRGQRPRNEVVLVDDIITTGATARESVRVLQAAGVRVGAVLAVAAA
>BL;Rv3256c, H37Rv2.tab 3636277:3637314 reverse MW:35277
VNVARAIDLEDTEGLIAADRGALLRAASMAGAQVRAIAAAADEGELDLLRGSDRPRSVIWVTGRGTAETA-
GTILASTLGA (SEQ ID NO:591)
GAAEPIVLASAAPPWVGPLDVLIVAGDDPGDPALVGAAAI-
GVRRGARVVVVAPYEGPLRDSTAGRVAVLEPRLRVPDEFG
LSRYLAAGLAALQTVDPKLRIDLASL-
ADELDAEALRNSAGREVFTNPAKALAARVSGCQLALAGDNAATLALARHGSSVM
LRIANQVVAATRLSDAVVALRAGTPPDALFHDEEIDGPAPQRLRVLALALAGERTVVAARVAGLDDAYLVAAE-
DVPELLD APVGSGGAVLAVRLEMAAVYLRLVRG >BL;Rv3258c, H37Rv2.tab
3638813:3639301 reverse MW:16810
MRVSGASAALVHDSLSVVNVPRRCCRPGCPHYAVATLTFVYSDSTAVIGPLATAREPHSWDLCVGHAGRITAP-
RGWELVR (SEQ ID NO:592)
HAGPLPSHPDEDDLVALADAVREGGPSAGRRHHPGGNGAPLHG-
FDDFPAAATGAPTGGGVLAPPEPGAGRRRGHLRVLPD PAD >BL;Rv3259, H37Rv2.tab
3639424:3639840 forward MW:15649
MRGPLLPPTVPGWRSRAERFDMAVLEAYEPIERRWQERVSQLDIAVDEIPRIAAKDPESVQWPPEVIADGPIA-
LARLIPA (SEQ ID NO:593)
GVDVRGNATRARIVLFRKPIERRAKDTEELGELLHEILVAQVA- IYLDVDPSVIDPTIDD
>BL;Rv3269, H37Rv2.tab 3650233:3650511 forward MW:9750
MAIQVFLAKATTTVITGLAGVTAYEILKKAAAKAPLRQTAVSAAALGLRG-
TRKAEEAAESARLKVADVMAEARERIGEES (SEQ ID NO:594) PTPAISDLHDHDH
>BL;Rv3277, H37Rv2.tab 3659877:3660692 forward MW:30079
MNEVTAGVRELATAIMVSRHLTGVLAGHGSQTVTYHFASILCSSVHSLVVSFADATIARLPGVVQPYAQRHH-
ELIKFAIV (SEQ ID NO:595)
GGTTFIIDTAIFYTLKLTVLEPKPVTAKVIAGIVAVIASYVL-
NREWSFRDRGGRERHHEALLFFAFSGVGVLLSMAPLWF
SSYILQLRVPTVSLTMENIADFISAYII-
GNLLQMAFRFWAFRRWVFPDEFARNPDKALESALTAGGIAEVFEDVLEGGFE
DGNVTLLRAWRNRANRFAQLGDSSEPRVSKTS >BL;Rv3278c, H37Rv2.tab
3660653:3661168 reverse MW:19820
MSYPENVLAAGEQVVLHRHPHWNRLIWPVVVLVLLTGLAAFGSGFVNSTPWQQIAKNVIHAVIWGIWLVIVGW-
LTLWPFL (SEQ ID NO:596)
SWLTTHPVVTNRRVMFRHGVLTRSGIDIPLARINSVEFRDRIF-
ERIFRTGTLIIESASQDPLEFYNIPRLREVHALLYHE VFDTLGSDESPS >BL;Rv3281,
H37Rv2.tab 3663688:3664218 forward MW:19013
MGTCPCESSERNEPVSRVSGTNEVSDGNETNNPAEVSDGNETNNPAEVSDGNETNNPAPVSRVSGTNEVSDGN-
ETNNPAP (SEQ ID NO:597)
VSRVSGTNEVSDGNETNNPAPVTEKPLHPHEPHIEILRGQPTD-
QELAALIAVLGSISGSTPPAQPEPTRWGLPVDQLRYP VFSWQRITLQEMTHMRR
>BL;Rv3311, H37Rv2.tab 3698120:3699379 forward MW:45732
MVADLVPIRLSLSAGDRYTLWAPRWRDAGDEWEAFLGKDDDLYGFESVSDLVAFVRTDTENDLVDHPAWQDLT-
GAHAHNL (SEQ ID NO:598)
NPAEDNQFDLVVVEELLAEKPTAESVAALAASLAIVSAIGSVC-
ELAAVSKFFNGNPILGTVSGGLEHFTGKAGNKRWNSI
AEVIGRSWDDVLAAIDEIISTPEVDAELS-
EKVAEELAEEPEGAEEVAAEVEATQDTQEAAESDDEEADAPGDSVVLGGDR
DFWLQVGIDPIQIMTGTATFYTLRCYLDDRPIFLGRNGRISVFGSERALARYLADEHDHDLSDLSTYDDIRTA-
ATDGSLA
VAVTDDNVYVLSGLVDDFADGPDAVDREQLDLAVELLRDIGDYSEDSAVDKALETTRPL-
GQLVAYVLDPHSVGKPTAPYA AAVREWEKLERFVESRLRRE >BL;Rv3354,
H37Rv2.tab 3769110:3769496 forward MW:12987
MNLRRHQTLTLRLLAASAGILSAAAFAAPAQANPVDDAFIAALNNAGVNYGDPVDAKALGQSVCPILAEPGGS-
FNTAVAS (SEQ ID NO:599)
VVARAQGMSQDMAQTFTSIAISMYCPSVMADVASGNLPALPDM- PGLPGS >BL;Rv3368c,
H37Rv2.tab 3780337:3780978 reverse MW:23733
MTLNLSVDEVLTTTRSVRKRLDFDKPVPRDVLMECLELALQAPTGSNSQGWQWVFVE-
DAAKKKAIADVYLANARGYLSGP (SEQ ID NO:600)
APEYPDGDTRGERMGRVRDSATYLAEH-
MHRAPVLLIPCLKGREDESAVGGVSFWASLFPAVWSFCLALRSRGLGSCWTTL
IILLDNGEHKVADVLGIPYDEYSQGGLLPIAYTQGIDFRPAKRLPAESVTHWNGW
>BL;Rv3412, H37Rv2.tab 3831725:3832132 forward MW:15269
VRDHLPPGLPPDPFADDPCDPSAALEAVEPGQPLDQQERMAVEADLADLAVYEALLAHKGIRGLVVCCDECQQ-
DHYHDWD (SEQ ID NO:601)
MLRSNLLQLLIDGTVRPHEPAYDPEPDSYVTWDYCRGYADASL- NEAAPDADRFRRR
>BL;Rv3415c, H37Rv2.tab 3833696:3834520 reverse MW:28627
VNETPHAPVVEQVLVAAAFGNQPGSWPLPTAITPHHLWLRAVAAGGQGR-
YAHAYGDLSVLRRLVPAGPLASLAHSTQGSL (SEQ ID NO:602)
LRQLGWHTLARGWDGRALALAGADREAGADALIGLAADALGVGRFAAAGALLDRADPLVVSPLVADRLAVRRR-
WVAAELA
MATGDGATAVRHAEEAVELTQAMAVASARHRVKSDVVLAAALCSAGAVARARAVGEEAL-
DATARFGLLPLRWALACLLID IGTVTFSAQQLRELTKIRNICAGQVRRAGGCWRTA
>BL;Rv3438, H37Rv2.tab 3857396:3858235 forward MW:29209
VPRIRKLVAALHRRGPHRVLRGDLAFAGLPGVVYTPEAGLHLPGVAFGHDWLTGTSRYSGLLEHLASWGIVAA-
APDSERG (SEQ ID NO:603)
LAPSVLNLAFDLGVALDIVAGVRLGPGKISVHPAKLGLVGHGF-
GGSAAVFAAAGLTGTHVKSVAAIFPTVTNPAAEQPAA
TLDVPGLILTAPGDPKTLTSNALGLSRAW-
DKATLRIVSKARAGGLVEGRRLTKVLGLPGPHRRTQRSVRALLTGYLLYTL
GGDKTYRRFADPDLQLPKTDPIDPEAPPITPGEKIVTLLK >BL;Rv3587c, H37Rv2.tab
4028971:4029762 reverse MW:27067
VLDLEPRGPLPTEIYWRRRGLALGIAVVVVGIAVAIVIAFVDSSAGAKPVSADKPASAQSHPGSPAPQAPQPA-
GQTEGNA (SEQ ID NO:604)
AAAPPQGQNPETPTPTAAVQPPPVLKEGDDCPDSTLAVKGLTN-
APQYYVGDQPKFTMVVTNIGLVSCKRDVGAAVLAAYV
YSLDNKRLWSNLDCAPSNETLVKTFSPGE-
QVTTAVTWTGMGSAPRCPLPRPAIGPGTYNLVVQLGNLRSLPVPFILNQPP
PPPGPVPAPGPAQAPPPESPAQGG >BL;Rv3603c, H37Rv2.tab 4045210:4046118
reverse MW:31104 MERFDGLRPARLKVGIISAGRVGTALGVALQRA-
DHVVVACSAISHASRRRAQRRLPDTPVLPPLDVAASAELLLLAVTDS (SEQ ID NO:605)
ELAGLVSGLAATSAVRPQTIVAHTSGANGIGILAPLAQQGCIPLAIHPAMTFTGSDEDISRLPDTCFGITAAD-
DVGYAIG
QSLVLEMGGEPFCVREDARILYHAALAHASNHIVTVLADALEALRAALSGGELLGQQTV-
DDQPGGIVERIVGPLARAALE
NTLQRGQAALTGPVARGDAAAVADHLAALADVDAALAQAYRINAL- RTAQRAHAPADVVEVLTA
>BL;Rv3604c, H37Rv2.tab 4046306:4047691 reverse MW:49862
VPRAASAMAEPAMGVGRRRCWPGGRPGMRGCLR-
GEFGRTAYPAKPCGNRRTGATRGLTSPGYSQAMTVLSRGARVRRGGR (SEQ ID NO:606)
RPGWVLLTALLVLAIGASSALVFTDRVELLKLAVLLALWAAVAGAFVSVLYRRQSDVDQARVRDLKLVYDLQL-
DREISAR
REYELTLESQLRRELASELRAPAADEVAALRAELAALRTSLEILFDADLEHRPALGTVE-
KEARAARALDGESPPADWVSS
DRVMAVRGGDGASRTDEASIIDVPEVGVPPVSGGPRHYEAPPPPQ-
PEPLFEPRHRPPPLPPQQERPVWQPVTSHGQWLPA
ETPGSQWASVEPETTPAAPPPGRRRRARHAS-
PADQAYNPPAYVELAAQYGESGRRSRHSAEHRDHDIGGSGAGTGERPPS
PPMAPPPPAEPTRRHRTADTPPDDSGGLHARDPLTGGQSVADLMARLQVESTGGGRRRRRGE
>BL;Rv3605c, H37Rv2.tab 4047708:4048181 reverse MW:16789
MGPTRKRDLTAAVVGAAAVGYLLVAVLYRWFPPITVWTGLSLLAVAVAEALWARYVRVKISDGEIGDGPGWLH-
PLVVARS (SEQ ID NO:607)
LMVAKASAWVGALVTGWWIGVLAYFLPRRSWLRAAAEDTTGTV-
VAAGSALALVVAALWLQHCCKSPQDPTEHADGAES >BL;Rv3614c, H37Rv2.tab
4054145:4054696 reverse MW:19802
VDLPGNDFDSNDFDAVDLWGADGAEGWTADPIIGVGSAATPDTGPDLDNAHGQAETDTEQEIALFTVTNPPRT-
VSVSTLM (SEQ ID NO:608)
DGRIDHVELSARVAWMSESQLASEILVIADLARQKAQSAQYAF-
ILDRMSQQVDADEHRVALLRKTVGETWGLPSPEEAAA AEAEVFATRYSDDCPAPDDESDPW
>BL;Rv3615c, H37Rv2.tab 4054815:4055123 reverse MW:10795
MTENLTVQPERLGVLASHHDNAAVDASSGVEAAAGLGESVAITHGPYCSQFNDTLNVYLTAHNALGSSL-
HTAGVDLAKSL (SEQ ID NO:609) RIAAKIYSEADEAWRKAIDGLFT >BL;Rv3616c,
H37Rv2.tab 4055200:4056375 reverse MW:39888
MSRAFIIDPTISAIDGLYDLLGIGIPNQGGILYSSLEYFEKALEELAAAFPGDGWLGSAADKYAGKNRNHVNF-
FQELADL (SEQ ID NO:610)
DRQLISLIHDQANAVQTTRDILEGAKKGLEFVRPVAVDLTYIP-
VVGHALSAAFQAPFCAGAMAVVGGALAYLVVKTLINA
TQLLKLLAKLAELVAAAIADIISDVADII-
KGTLGEVWEFITNALNGLKELWDKLTGWVTGLFSRGWSNLESFFAGVPGLT
GATSGLSQVTGLFGAAGLSASSGLAHADSLASSASLPALAGIGGGSGFGGLPSLAWVHAASTRQALRPRADGP-
VGAAAEQ
VGGQSQLVSAQGSQGMGGPVGMGGMHPSSGASKGTTTKKYSEGAAAGTEDAERAPVEAD-
AGGGQKVLVRNVV >BL;Rv3619c, H37Rv2.tab 4059987:4060268 reverse
MW:9832 MTINYQFGDVDAHGAMIRAQAGSLEAEHQAIISDVLTASDFWGGAGSAAC-
QGFITQLGRNFQVIYEQANAHGQKVQAAGN (SEQ ID NO:611) NMAQTDSAVGSSWA
>BL;Rv3632, H37Rv2.tab 4071236:4071577 forward MW:13068
MNWIQVLLIASIIGLLFYLLRSRRSARSRAWVKVGYVLFVLAGIYAVLRPDDTTVVANWFGVRRGTDLMLY-
ALVIVIAFSFT (SEQ ID NO:612) TLSTYMRFKDLELRYARIARALALEGAQAPEQCR
>BL;Rv3647c, H37Rv2.tab 4087613:4088188 reverse MW:20314
VSQLSFFAAESVPPAVADLSGVLAGPGQIVLVGCGARLSVVVAESWRASALAEMIQEAGLVPEVARTDE-
NTPLVRTAVDP (SEQ ID NO:613)
LLCGIAAEWTRGAVKTVPPRWLPGPRELRAWTLAAGSPE- ADRYLLGLDPHAPDTHS
PLASALMRVGIAPTLIGTRGTRPA LRISGRRRLSRLVENVGEPPDGAE- AWVQWPRT
>BL;Rv3662c, H37Rv2.tab 4101268:4102035 reverse MW:26338
VTVDPLAPLMELPGVAAASDRVRDALSRVHRHRANLRGWPVAAAEASLR-
AARASSVLDGGPARLHDAGAPTSGKPALSDP (SEQ ID NO:614)
VFAGALRVGQALEGGAGPVVGVWRRAPLQALARLHMLAAADQVDDDRLGRPRSDADVGPRLELLALVVTHPTL-
ASAPVVA
AVAHGELLTLRPFGCADGVVARAVSRLVTIATGLDPHGLGVPEVIWMRQPAEYHDAARR-
FAGGTPDGVAGWLLLCCGAML DGAREALSIAESLSPG >BL;Rv3668c, H37Rv2.tab
4109786:4110481 reverse MW:23102
LQTAHRRFAAAFAAVLLAVVCLPANTAAADDKLPLGGGAGIVVNGDTMCTLTTIGHDKNGDLIGFTSAHCGGP-
GAQIAAE (SEQ ID NO:615)
GAENAGPVGIMVAGNDGLDYAVIKFDPAKVTPVAVFNGFAING-
IGPDPSFGQIACKQGRTTGNSCGVTWGPGESPGTLVM
QVCGGPGDSGAPVTVDNLLVGMIHGAFSD-
NLPSCITKYIPLHTPAVVMSINADLADINAKNRPGAGFVPVPA >BL;Rv3669,
H37Rv2.tab 4110827:4111342 forward MW:18887
VSKIDRKNGVPSTLTTIPLADPHAGPAEPSIGDLIKDATTQMSTLVRAEVELARAEITRDVKKGLTGSVFFIS-
SLVVGFY (SEQ ID NO:616)
STFFFFFFVAELLDTWIWRWVAFLLVFAIMVVVTAVLALLGFL-
KVRRIRGPRQTIASVKETRTALTPGHDKTPVTPKPVT SDRATPVDPSGW >BL;Rv3680,
H37Rv2.tab 4119795:4120952 forward MW:41405
MSVTPKTLDMGAILADTSNRVVVCCGAGGVGKTTTAAALALRAAEYGRTVVVLTIDPAKRLAQALGINDLGNT-
PQRVPLA (SEQ ID NO:617)
PEVPGELHAMMLDMRRTFDEMVMQYSGPERAQSILDNQFYQTV-
ATSLAGTQEYMAMEKLGQLLSQDRWDLIVVDTPPSRN
ALDFLDAPKRLGSFMDSRLWRLLLAPGRG-
IGRLITGVMGLAMKALSTVLGSQMLADAAAFVQSLDATFGGFREKADRTYA
LLKRRGTQFVVVSAAEPDALREASFFVDRLSQESMPLAGLVFNRTHPMLCALPIERAIDAAETLDAETTDSDA-
TSLAAAV
LRIHAERGQTAKREIRLLSRFTGANPTVPVVGVPSLPFDVSDLEALRALADQLTTVGND-
AGRAAGR >BL;Rv3705c, H37Rv2.tab 4148321:4148962 reverse MW:22359
MRIAAAVVSIGLAVIAGFAVPVADAHPSEPGVVSYAVLGKGSVGNIVGAPMGWEAVF-
TRPFQAFWVELPACNNWVDIGLP (SEQ ID NO:618)
EVYDDPDLASFNGATTQTSATDQTHLV-
KQAVGVFASNDAADRAFHRVVDRTVGCSGQTTAIHLDDGTTQVWSFAGGPSTG
TDEAWTKQEAGTDRRCFVQTRLRENVLLQAKVCQSGNAGPAVNVLAGAMQNTLG
>BL;Rv3716c, H37Rv2.tab 4160515:4160913 reverse MW:13357
MQPGGDMSALLAQAQQMQQKLLEAQQQLANSEVHGQAGGGLVKVVVKGSGEVIGVTIDPKVVDPDDIETLQDL-
IVGAMRD (SEQ ID NO:619)
ASQQVTKMAQERLGALAGANRPPAPPAAPPGAPGMPGMPGMPG- APGAPPVPGI
>BL;Rv3718c, H37Rv2.tab 4161818:4162258 reverse MW:15661
MGQVSAASTILINAEPTATLDALADYETVRPKILSPHYSEYQVLEGGKG-
RGTVAKWRLQATQSRVRDVQVNVDVAGHTVI (SEQ ID NO:620)
EKDMNSSMVTNWTVAPAGPGSSVTVKTTWTGAGGVKGFFEKTFAPLGLKKIQAEVLSNLKTELEGDA
>BL;Rv3723, H37Rv2.tab 4168536:4169297 forward MW:27367
MGRKVAVLWHASFSIGAGVLYFYFVLPRWPELMGDTGHSLGTGLRIATGALVGLAALPVVFTLLRTRKPEL-
GTPQLALSM (SEQ ID NO:621)
RIWSIMAHVLAGALIVGTAISEVWLSLDAAGQWLFGIYGAA-
AAIAVLGFFGFYLSFVAELPPPPPKPLKPKKPKQRRLRR
KKTAKGDEAEPEAAEEAENTELAAQED-
EEAVEAPPESIESPGGEPESATREAPAAETATAEEPRGGLRNRRPTGKTSHRR
RRTRSGVQVAKVDE >BL;Rv3753c, H37Rv2.tab 4199724:4200221 reverse
MW:17917 MQRPAADTPDGFGVAVVREEGRWRCSPMGPKALTSLRAAETELRELRSA-
GAVFGLLDVDDEFFVIVRPAPSGTRLLLSDA (SEQ ID NO:622)
TAALDYDIAAEVLDNLDAEIDPEDLEDADPFEEGDLGLLSDIGLPEAVLGVILDETDLYADEQLGRIAREMGF-
ADQLSAV IDRLGR >BL;Rv3760, H37Rv2.tab 4205538:4205837 forward
MW:10533 VPGSVPGKAPEEPPVKFTRAAAVWSALIVGFLI-
LILLLIFIAQNTASAQFAFFGWRWSLPLGVAILLAAVGGGLITVFAG (SEQ ID NO:623)
TARILQLRRAAKKTHAAALR >BL;Rv3779, H37Rv2.tab 4224985:4226982
forward MW:71763 VGLWFGTLIALILLIAPGAMVARIAQLRWPVAI-
AVGPALTYGVVALAIIPYGALGIPWNGWTALAALAVTCAVATGLQLL (SEQ ID NO:624)
LARFRDLDAEALAVSRWPAVTVAAGVLLGALLIGWAAYRGIPHWQSIPSTWDAVWHANTVRFILDTGQASSTH-
MGELRNV
ETHAPLYYPSVFHGLVAVFCQLTGAAPTTGYTLSSLAASVWLFPVSAAVLTWRAVRSHP-
GALWSASCASAEWRAAGAAGT
AAALSASFTAVPYVEFDTAAMPNLAAYGIAVPTMVLITSTLRHRD-
RIPVAVLALVGVFSLHITGGIVVALLVSAWWLFEA
LRHPVRSRLADLLTLAGVAAMAGLVMLPQFL-
SVRQQEDIIAGHAFPTYLSKKRGLFDAVFQHSRHLNDFPVQYALIVLAA
IGGLILLVKKIWWPLAVWLLLIVMNVDAGTPLGGPIGGVAGALGEFFYHDPRRIAAATTLLLMLMAGVALFAT-
VNLLVAA
AKRLTDRFRPQPVSVWASATATLLIGATLVSAWHYFPRHRFLFGDKYDSVMIDQKDLDA-
NAYLASLPGARDTLIGNANTD
GTAWMYAVAGLHPLWTHYDYPLQQGPGYHRFIFWAYGRNGESDPR-
VLEAIQVLRIRYILTSTPTVRGFAVPDGLVSLETS RSWAKIYDNGEARIYEWRGTAAATHS
>BL;Rv3780, H37Rv2.tab 4226989:4227522 forward MW:19484
VRKRMVIGLSTGSDDDDVEVIGGVDPRLIAVQENDSDESSLTDLVEQPAKVMRIGTMIKQLLEEVRAAPL-
DEASRNRLRD (SEQ ID NO:625)
IHATSIRELEDGLAPELREELDRLTLPFNEDAVPSDAELR-
IAQAQLVGWLEGLFHGIQTALFAQQMAARAQLQQMRQGAL PPGVGKSGQHGHGTGQYL
>BL;Rv3792, H37Rv2.tab 4237932:4239860 forward MW:69516
MPSRRKSPQFGHEMGAFTSARAREVLVALGQLAAAVVVAVGVAVVSLLAIARVEWPAFPSSNQLHALTTVGQV-
GCLAGLV (SEQ ID NO:626)
GIGWLWRHGRFRRLARLGGLVLVSAFTVVTLGMPLGATKLYLF-
GISVDQQFRTEYLTRLTDTAALRDMTYIGLPPFYPPG
WFWIGGRAAALTGTPAWEMFKPWAITSMA-
IAVAVALVLWWRMIRFEYALLVTVATAAVMLAYSSPEPYAAMITVLLPPML
VLTWSGLGARDRQGWAAVVGAGVFLGFAATWYTLLVAYGAFTVVLMALLLAGSRLQSGIKAAVDPLCRLAVVG-
AIAAAIG
STTWLPYLLRAARDPVSDTGSAQHYLPADGAALTFPMLQFSLLGAICLLGTLWLVMRAR-
SSAPAGALAIGVLAVYLWSLL
SMLATLARTTLLSFRLQPTLSVLLVAAGAFGFVEAVQALGKRGRG-
VIPMAAAIGLAGAIAFSQDIPDVLRPDLTIAYTDT
DGYGQRGDRRPPGSEKYYPAIDAAIRRVTGK-
RRDRTVVLTADYSFLSYYPYWGFQGLTPHYANPLAQFDKRATQIDSWSG
LSTADEFIAALDKLPWQPPTVFLMRHGAHNSYTLRLAQDVYPNQPNVRRYTVDLRTALFADPRFVVEDIGPFV-
LAIRKPQ ESA >BL;Rv3802c, H37Rv2.tab 4263358:4264365 reverse
MW:35448 MAKNSRRKRHRILAWIAAGAMASVVALVIVAVV-
IMLRGAESPPSAVPPGVLPPGPTPAHPHKPRPAFQDASCPDVQMISV (SEQ ID NO:627)
PGTWESSPQQNPLNPVQFPKALLLKVTGPIAQQFAPARVQTYTVAYTAQFHNPLTTDNQMSYNDSRAEGTRAM-
VAAMTDM
NNRCPLTSYVLIGFSQGAVIAGDVASDIGNGRGPVDEDLVLGVTLIADGRRQQGVGNQV-
PPSPRGEGAEITLHEVPVLSG
LGLTMTGPRPGGFGALDGRTNEICAQGDLICAAPAQAFSPANLPT-
TLNTLAGGAGQPVHAMYATPEFWNSDGEPATEWTL NWAHQLIENAPHPKHR
>BL;Rv3805c, H37Rv2.tab 4266956:4268836 reverse MW:68710
MVRVSLWLSVTAVAVLFGWGSWQRRWIADDGLIVLRTVRNLLAGNGPVFNQGERVEANTSTAWTYLLYVGGWV-
GGPMRLE (SEQ ID NO:628)
YVALALAMVLSLLGMVLLMLGTGRLYAPSLRGRRAIMLPAGAL-
VYIAVPPARDFATSGLESGLVLAYLGLLWWMMVCWSQ
PLRARPDSQMFLGALAFVAGCSVLVRPEF-
ALIGGLALIMMLIAARTWRRRVLIVLAGGFLPVAYQIFRMGYYGLLVPSTA
LAKDAAGDKWSQGMIYVSNFNRPYALWVPLVLSVPLGLLLMTARRRPSFLRPVLAPDYGRVARAVQSPPAVVA-
FIVGSGV
LQALYWIRQGGDFMHGRVLLAPLFCLLAPVGVIPILLPDGKDFSRETGRWLVGALSGLW-
LGIAGWSLWAANSPGMGDDAT
RVTYSGIVDERRFYAQATGHAHPLTAADYLDYPRMAAVLTALNNT-
PEGALLLPSGNYNQWDLVPMIRPSSGTAPGGKPAP
KPQHAVFFTNMGMLGMNVGLDVRVIDQIGLV-
NPLAAHTERLKHARIGHDKNLFPDWVIADGPWVKWYPGIPGYIDQQWVT
QAEAALQCPATRAVLNSVRAPITLHRFLSNVLHSYEFTRYRIDRVPRYELVRCGLDVPDGPGPPPRE
>BL;Rv3807c, H37Rv2.tab 4269843:4270337 reverse MW:17218
MVAVQSALVDRPGMLATARGLSHFGEHCIGWLILALLGAIALPRRRREWLVAGAGAFVAHAIAVLIKRLV-
RRQRPDHPAI (SEQ ID NO:629)
AVNVDTPSQLSFPSAHATSTTAAALLMGRATGLPLPVVLV-
PPMALSRILLGVHYPSDVAVGVALGATVGAIVDSVGGGRQ RARKR >BL;Rv3808c,
H37Rv2.tab 4270369:4272279 reverse MW:71507
MSELAASLLSRVILPRPGEPLDVRKLYLEESTTNARRAHAPTRTSLQIGAESEVSFATYFNAFPASYWRRWTT-
CKSVVLR (SEQ ID NO:630)
VQVTGAGRVDVYRTKATGARIFVEGHDFTGTEDQPAAVETEVV-
LQPFEDGGWVWFDITTDTAVTLHSGGWYATSPAPGTA
NIAVOIPTFNRPADCVNALRELTADPLVD-
QVIGAVIVPDQGERKVRDHPDFPAAAARLGSRLSIHDQPNLGGSGGYSRVM
YEALKNTDCQQILFMDDDIRLEPDSILRVLAMHRFAKAPMLVGGQMLNLQEPSHLNIMGEVVDRSIFMWTAAP-
HAEYDHD
FAEYPLNDNNSRSKLLHRRIDVDYNGWWTCMIPRQVAEELGQPLPLFIKWDDADYGLRA-
AEHGYPTVTLPGAAIWHMAWS
DKDDAIDWQAYFHLRNRLVVAAMHWDGPKAQVIGLVRSHLKATLK-
HLACLEYSTVAIQNKAIDDFLAGPEHIFSILESAL
PQVHRIRKSYPDAVVLPAASELPPPLHKNKA-
MKPPVNPLVIGYRLARGIMHNLTAANPQHHRRPEFNVPTQDARWFLLCT
VDGATVTTADGCGVVYRQRDRAKNFALLWQSLRRQRQLLKRFEEMRRIYRDALPTLSSKQKWETALLPAANQE-
PEHG >BL;Rv3821, H37Rv2.tab 4285973:4286683 forward MW:24627
MWSTVLVLALSVICEPVRIGLVVLMLNRRRPLLHLLTFLCGGYTMAGGVANVTLVVL-
GATPLAGHFSVAEVQIGTGLIAL (SEQ ID NO:631)
LIAFALTTNVIGKHVRRATHARVGDDG-
GRVLRESVPPSGAHKLAVRARCFLQGDSLYVAGVSGLGAALPSANYMGAMAAI
LASGATPATQALAVVTFNVVAFTVAEVPLVSYLAAPRKTRAFMAALQSWLRSRSRRDAALLVAAGGCLMLTLG-
LSNL >BL;Rv3835, H37Rv2.tab 4309047:4310393 forward MW:47043
MLDAPEQDPVDPGDPASPPHGEAEQPLPGPRWPRALRASATRRALLLTALGGLLIAG-
LVTAIPAVGRAPERLAGYIASNP (SEQ ID NO:632)
VPSTGAKINASFNRVASGDCLMWPDGT-
PESAAIVSCADEHRFEVAESIDMRTFPGMEYGQNAAPPSPARIQQISEEQCEA
AVRRYLGTKFDPNSKFTISMLWPGDRAWRQAGERRMLCGLQSPGPNNQQLAFKGKVADIDQSKVWPAGTCLGI-
DATTNQP
IDVPVDCAAPHAMEVSGTVNLAERFPDALPSEPEQDGFIKDACTRMTDAYLAPLKLRTT-
TLTLIYPTLTLPSWSAGSRVV
ACSIGATLGNGGWATLVNSAKGALLINGQPPVPPPDIPEERLNLP-
PIPLQLPTPRPAPPAQQLPSTPPGTQHLPAQQPVV
TPTRPPESHAPASAAPAETQPPPPDAGAPPA- TQSPEATPPGPAEPAPAG >BL;Rv3843c,
H37Rv2.tab 4315571:4316596 reverse MW:37353
VIQVCSQCGTGWNVRERQRVWCPRCRGMLLAPL-
ADMPAEARWRTPARPQVPTASDTRRTPPRLPPGFRWIAVRPGAAPPP (SEQ ID NO:633)
RHGPRLRGPTPRYAGIPRWGLTDHVDQAPVPASAKAGPSPAAVRTTLLVSLLVFSIAVVVFVVRYVLLVINRN-
TLLNSVV
ASASVWLGVLVSLAAIAAAGTTIVLLVRWLVARRAAAFMHQGLPERRSARELWAGCLLP-
MVNLLWAPLYVIELALVEDRY
TRLRRPIVVWWIVWIVSNAISMFAFATSWVTDAQGIANNTTMMVL-
AYLCAAAAVAAAARVFEGFEQKPVERPAHRWVVVN TDGRSAPASSVAVELDGQEPAA
>BL;Rv3847, H37Rv2.tab 4321538:4322068 forward MW:18278
MGTGSGGPIGVSPFHSRGALKGFVISGRWPDSTKEWAQLLMVAVRVASLPGLLSTTTVFGAREELPDEPEPGT-
VGLVLAE (SEQ ID NO:634)
GTVFGESAIQPGYFADHQPPALLMLHPPSETTPSLPECTGAAS-
GCVLLPGLPYLGLEHRAAWVEAEADGTITSMVSRVGV DPISHPDTAILAMLLAA
>BL;Rv3849, H37Rv2.tab 4323499:4323894 forward MW:14708
MSTTFAARLNRLFDTVYPPGRGPHTSAEVIAALKAEGITMSAPYLSQLRSGNRTNPSGATMAALANFERIKAA-
YFTDDEY (SEQ ID NO:635)
YEKLDKELQWLCTMRDDGVRRIAQRAHGLPSAAQQKVLDRIDE- LRRAEGIDA
>BL;Rv3850, H37Rv2.tab 4324015:4324668 forward MW:23811
MGLFGKRKSRATRRAEARAIKARAKLEAKLSAKNEARRIKAAQRAESKA-
LKAQLKARRDSDRAALKVAEAELKVAREGKL (SEQ ID NO:636)
LSPTRIRRLLTVSRLLAPILTPVIYRAANAARGLIDQRRADQLGVPLAQIGRFSGHGARLSARVGGAERSLRM-
VQEKKPK DVETKQFVSAVTNRLTDLSAIWAAAEHMPAKRRRTAHSAISSQLDGIEADLMARLGLT
>BL;Rv3867, H37Rv2.tab 4342770:4343318 forward MW:19945
MVDPPGNDDDHGDLDALDFSAAHTNEASPLDALDDYAPVQTDDAEGDLDALHALTER-
DEEPELELFTVTNPQGSVSVSTL (SEQ ID NO:637)
MDGRIQHVELTDKATSMSEAQLADEIF-
VIADLARQKARASQYTFMVENIGELTDEDAEGSALLREFVGMTLNLPTPEEAA
AAEAEVFATRYDVDYTSRYKADD >BL;Rv3869, H37Rv2.tab 4345039:4346478
forward MW:51093 MGLRLTTKVQVSGWRFLLRRLEHAIVRRDTRMF-
DDPLQFYSRSIALGIVVAVLILAGAALLAYFKPQGKLGGTSLFTDRA (SEQ ID NO:638)
TNQLYVLLSGQLHPVYNLTSARLVLGNPANPATVKSSELSKLPMGQTVGIPGAPYATPVSAGSTSIWTLCDTV-
ARADSTS
PVVQTAVIANPLEIDASIDPLQSHEAVLVSYQGETWIVTTKGRHAIDLTDRALTSSMGI-
PVTARPTPISEGMFNALPDMG
PWQLPPIPAAGAPNSLGLPDDLVIGSVFQIHTDKGPQYYVVLPDG-
IAQVNATTAAALRATQAHGLVAPPAMVPSLVVRIA
ERVYPSPLPDEPLKIVSRPQDPALCWSWQRS-
AGDQSPQSTVLSGRHLPISPSAMNMGIKQIHGTATVYLDGGKFVALQSP
DPRYTESMYYIDPQGVRYGVPNAETAKSLGLSSPQNAPWEIVRLLVDGPVLSKDAALLEHDTLPADPSPRKVP-
AGASGAP >BL;Rv3874, H37Rv2.tab 4352274:4352573 forward MW:10793
MAEMKTDAATLAQEAGNFERISGDLKTQIDQVESTAGSLQGQWRGAAGTAAQAAVVR-
FQEAANKQKQELDEISTNIRQAG (SEQ ID NO:639) VQYSRADEEQQQALSSQMGF
>BL;Rv3876, H37Rv2.tab 4353010:4355007 forward MW:70645
MAADYDKLFRPHEGMEAPDDMAAQPFFDPSASFPPAPASANLPKPNGQTPPPTSDDLSERFVSAPPPPPPPP-
PPPPPTPM (SEQ ID NO:640)
PIAAGEPPSPEPAASKPPTPPMPIAGPEPAPPKPPTPPMPIA-
GPEPAPPKPPTPPMPIAGPAPTPTESQLAPPRPPTPQT
PTGAPQQPESPAPHVPSHGPHQPRRTAP-
APPWAKMPIGEPPPAPSRPSASPAEPPTRPAPQHSRRARRGHRYRTDTERNV
GKVATGPSIQARLRAEEASGAQLAPGTEPSPAPLGQPRSYLAPPTRPAPTEPPPSPSPQRNSGRRAERRVHPD-
LAAQNAA
AQPDSITAATTGGRRRKRAAPDLDATQKSLRPAAKGPKVKKVKPQKPKATKPPKVVSQR-
GWRHWVHALTRINLGLSPDEK
YELDLHARVRRNPRGSYQIAVVGLKGGAGKTTLTAALGSTLAQVR-
ADRILALDADPGAGNLADRVGRQSGATIADVLAEK
ELSHYNDIRAHTSVNAVNLEVLPAPEYSSAQ-
RALSDADWHFIADPASRFYNLVLADCGAGFFDPLTRGVLSTVSGVVVVA
SVSIDGAQQASVALDWLRNNGYQDLASRACVVINHIMPGEPNVAVKDLVRHFEQQVQPGRVVVMPWDRHIAAG-
TEISLDL LDPIYKRKVLELAAALSDDFERAGRR >BL;Rv3877, H37Rv2.tab
4355007:4356539 forward MW:53981
LSAPAVAAGPTAAGATAARPATTRVTILTGRRMTDLVLPAAVPMETYIDDTVAVLSEVLEDTPADVLGGFDFT-
AQGVWAF (SEQ ID NO:641)
ARPGSPPLKLDQSLDDAGVVDGSLLTLVSVSRTERYRPLVEDV-
IDAIAVLDESPEFDRTALNRFVGAAIPLLTAPVIGMA
MRAWWETGRSLWWPLAIGILGIAVLVGSF-
VANRFYQSGHLAECLLVTTYLLIATAAALAVPLPRGVNSLGAPQVAGAATA
VLFLTLMTRGGPRKRHELASFAVITAIAVIAAAAAFGYGYQDWVPAGGIAFGLFIVTNAAKLTVAVARIALPP-
IPVPGET
VDNEELLDPVATPEATSEETPTWQAIIASVPASAVRLTERSKLAKQLLIGYVTSGTLIL-
AAGAIAVVVRGHFFVHSLVVA
GLITTVCGFRSRLYAERWCAWALLAATVAIPTGLTAKLIIWYPHY-
AWLLLSVYLTVALVALVVVGSMAHVRRVSPVVKRT LELIDGAMIAAIIPMLLWITGVYDTVRNIRF
>BL;Rv3880c, H37Rv2.tab 4360202:4360546 reverse MW:12167
VSMDELDPHVARALTLAARFQSALDGTLNQMNNGSFRATDEAETVEVTINGHQWLTG-
LRIEDGLLKKLGAEAVAQRVNEA (SEQ ID NO:642)
LHNAQAASAYNDAAGEQLTAALSAMSR- ANNEGMA >BL;Rv3882c, H37Rv2.tab
4362035:4363420 reverse MW:50397
MRNPLGLRFSTGHALLASALAPPCIIAFLETRYWWAGIALASLGVIVATVTFYGRRI-
TGWVAAVYAWLRRRRRPPDSSSE (SEQ ID NO:643)
PVVGATVKPGDHVAVRWQGEFLVAVIE-
LIPRPFTPTVIVDGQAHTDDMLDTGLVEELLSVHCPDLEADIVSAGYRVGNTA
APDVVSLYQQVIGTDPAPANRRTWIVLRADPERTRKSAQRRDEGVAGLARYLVASATRIADRLASHGVDAVCG-
RSFDDYD
HATDIGFVREKWSMIKGRDAYTAAYAAPGGPDVWWSARADHTITRVRVAPGMAPQSTVL-
LTTADKPKTPRGFARLFGGQR
PALQGQHLVANRHCQLPIGSAGVLVGETVNRCPVYMPFDDVDIAL-
NLGDAQTFTQFVVRAAAAGAMVTVGPQFEEFARLI
GAHIGQEVKVAWPNATTYLGPHPGIDRVILR- HNVIGTPRHRQLPIRRVSPPEESRYQMALPK
>BL;Rv3909, H37Rv2.tab 4394192:4396597 forward MW:83878
VTALQLGWAALARVTSAIGVVAGLGMALTVPSA-
APHHALAGEPSPTPFVQVRIDQVTPDVVTTSSEPHVTVSGTVTNTGDR (SEQ ID NO:644)
PVRDVMVRLEHAAAVTSSTALRTSLDGGTDQYQPAADFLTVAPELDRGQEAGFTLSAPLRSLTRPSLAVNQPG-
IYPVLLVN
VNGTPDYGAPARLDNARFLLPVVGVPPDQATDFGSAVAPETTAPVWITMLWPLADRPR-
LAPGAPGGTVPVRLVDDDLANS
VNGTPDYGAPARLDNARFLLPVVGVPPDQATDFGSAVAPETTAP-
VWITMLWPLADRPRLAPGAPGGTVPVRLVDDDLANS
LANGGRLDILLSAAEFATNREVDPDGAVGR-
ALCLAIDPDLLITVNAMTGGYVVSDSPDGAAQLPGTPTHPGTGQAAASSW
LDRLRTLVHRTCVTPLPFAQADLDALQRVNDPRLSAIATISPADIVDRILDVSSTRGATVLPDGPLTGRAINL-
LSTHGNT
VAVAAADFSPEEQQGSSQIGSALLPATAPRRLSPRVVAAPFDPAVGAALAAAGTNPTVP-
TYLDPSLFVRIAHESITARRQ
DALGAMLWRSLEPNAAPRTQILVPPASWSLASDDAQVILTALATA-
IRSGLAVPRPLPAVIADAAARTEPPEPPGAYSAAR
GRFNDDITTQIGGQVARLWKLTSALTIDDRT-
GLTGVQYTAPLREDMLRALSQSLPPDTRNGLAQQRLAVVGKTIDDLFGA
VTIVNPGGSYTLATEHSPLPLALHNGLAVPIRVRLQVDAPPGMTVADVGQIELPPGYLPLRVPIEVNFTQRVA-
VDVSLRT
PDGVALGEPVRLSVHSNAYGKVLFAITLSAAAVLVTLAGRRLWHRFRGQPDRADLDRPD-
LPTGKHAPQRRAVASRDDEKH RV
[0236]
* * * * *
References