U.S. patent application number 11/607413 was filed with the patent office on 2006-12-01 and published on 2007-05-03 for nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics.
Invention is credited to David Bush, Lynn Doucette-Stamm, Chad Eric Houseweart, Timothy Opperman, Qiandong Zeng.
Publication Number | 20070099861
Application Number | 11/607413
Family ID | 31721337
Filed Date | 2006-12-01
Publication Date | 2007-05-03
United States Patent Application | 20070099861
Kind Code | A1
Doucette-Stamm; Lynn; et al. | May 3, 2007
Nucleic acid and amino acid sequences relating to Streptococcus
pneumoniae for diagnostics and therapeutics
Abstract
The invention provides isolated polypeptide and nucleic acid
sequences derived from Streptococcus pneumoniae that are useful in
diagnosis and therapy of pathological conditions; antibodies
against the polypeptides; and methods for the production of the
polypeptides. The invention also provides methods for the
detection, prevention and treatment of pathological conditions
resulting from bacterial infection.
Inventors:
Doucette-Stamm; Lynn (Framingham, MA)
Bush; David (Somerville, MA)
Zeng; Qiandong (Waltham, MA)
Opperman; Timothy (Somerville, MA)
Houseweart; Chad Eric (Waltham, MA)
Correspondence Address:
HAMILTON, BROOK, SMITH & REYNOLDS, P.C.
530 VIRGINIA ROAD
P.O. BOX 9133
CONCORD, MA 01742-9133
US
Family ID: | 31721337
Appl. No.: | 11/607413
Filed: | December 1, 2006
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number
11028458 | Dec 30, 2004 |
11607413 | Dec 1, 2006 |
10640833 | Aug 14, 2003 |
11028458 | Dec 30, 2004 |
09583110 | May 26, 2000 | 6699703
10640833 | Aug 14, 2003 |
09107433 | Jun 30, 1998 | 6800744
09583110 | May 26, 2000 |
60085131 | May 12, 1998 |
60051553 | Jul 2, 1997 |
Current U.S. Class: | 514/44A; 536/23.7
Current CPC Class: | A61P 31/04 20180101; A61K 2039/53 20130101;
A61K 39/00 20130101; C07K 14/3156 20130101; C07K 14/315 20130101;
C07H 21/04 20130101; A61K 31/7052 20130101; C07H 21/02 20130101;
A61K 39/092 20130101
Class at Publication: | 514/044; 536/023.7
International Class: | A61K 48/00 20060101 A61K048/00;
C07H 21/04 20060101 C07H021/04
Claims
1. A method of treating a subject, comprising the step of
administering to the subject a composition that includes an
isolated nucleic acid encoding a S. pneumoniae polypeptide
comprising the amino acid sequence as set forth in SEQ ID NO: 5225,
wherein the polypeptide elicits an immune response in the
subject.
2. The method of claim 1, wherein the isolated nucleic acid
administered to the subject is a component of a vector.
3. The method of claim 2, wherein the vector is a component of a
cell.
4. A method of treating a subject, comprising the step of
administering to the subject a composition that includes an
isolated nucleic acid as set forth in SEQ ID NO: 2564, wherein the
isolated nucleic acid encodes a polypeptide that elicits an immune
response in the subject.
5. The method of claim 4, wherein the isolated nucleic acid
administered to the subject is a component of a vector.
6. The method of claim 5, wherein the vector is a component of a
cell.
7. A method of treating a subject, comprising the step of
administering to the subject a composition that includes an
isolated nucleic acid selected from the group consisting of: a) SEQ
ID NO: 2564; b) a nucleic acid fully complementary to SEQ ID NO:
2564; c) a nucleic acid having at least 99% identity to SEQ ID NO:
2564; d) a nucleic acid fully complementary to a nucleic acid
having at least 99% identity to SEQ ID NO: 2564; and e) an RNA of
a), b), c) or d) wherein U is substituted for T, wherein the
isolated nucleic acid encodes a polypeptide that elicits an immune
response in the subject.
8. A method of treating a subject, comprising the step of
administering to the subject a composition that includes an
isolated nucleic acid encoding a S. pneumoniae polypeptide selected
from the group consisting of: (a) SEQ ID NO: 5225; and (b) a
polypeptide having at least 99% identity to SEQ ID NO: 5225,
wherein administration of the polypeptide elicits an immune
response in the subject.
9. A method of treating a subject, comprising the step of
administering to the subject a composition that includes an
isolated nucleic acid encoding a S. pneumoniae polypeptide having
at least 99% identity to SEQ ID NO: 5225, wherein the polypeptide
elicits an immune response in the subject.
10. A method of treating a S. pneumoniae infection in a subject,
comprising the step of administering to a subject having an S.
pneumoniae infection a composition that includes an isolated
nucleic acid encoding a S. pneumoniae polypeptide comprising the
amino acid sequence as set forth in SEQ ID NO: 5225.
11. The method of claim 10, wherein the isolated nucleic acid
administered to the subject is a component of a vector.
12. The method of claim 11, wherein the vector is a component of a
cell.
13. A method of treating a S. pneumoniae infection in a subject,
comprising the step of administering to a subject having an S.
pneumoniae infection a composition that includes an isolated
nucleic acid as set forth in SEQ ID NO: 2564.
14. The method of claim 13, wherein the isolated nucleic acid
administered to the subject is a component of a vector.
15. The method of claim 14, wherein the vector is a component of a
cell.
16. A method of treating a S. pneumoniae infection in a subject,
comprising the step of administering to a subject having an S.
pneumoniae infection a composition that includes an isolated
nucleic acid selected from the group consisting of: a) SEQ ID NO:
2564; b) a nucleic acid fully complementary to SEQ ID NO: 2564; c)
a nucleic acid having at least 99% identity to SEQ ID NO: 2564; d)
a nucleic acid fully complementary to a nucleic acid having at
least 99% identity to SEQ ID NO: 2564; and e) an RNA of a), b), c)
or d) wherein U is substituted for T.
17. A method of treating a S. pneumoniae infection in a subject,
comprising the step of administering to a subject having an S.
pneumoniae infection a composition that includes an isolated
nucleic acid encoding a S. pneumoniae polypeptide selected from the
group consisting of: (a) SEQ ID NO: 5225 and (b) a polypeptide
having at least 99% identity to SEQ ID NO: 5225.
18. A method of treating a S. pneumoniae infection in a subject,
comprising the step of administering to a subject having an S.
pneumoniae infection a composition that includes an isolated
nucleic acid encoding a S. pneumoniae polypeptide having at least
99% identity to SEQ ID NO: 5225.
19. A method of treating a subject, comprising the step of
administering to the subject a composition that includes an
isolated nucleic acid encoding a S. pneumoniae polypeptide
comprising the amino acid sequence as set forth in SEQ ID NO: 5225,
wherein the polypeptide provides protective immunity against an
infection by S. pneumoniae.
20. The method of claim 19, wherein the isolated nucleic acid
administered to the subject is a component of a vector.
21. The method of claim 20, wherein the vector is a component of a
cell.
22. A method of treating a subject, comprising the step of
administering to the subject a composition that includes an
isolated nucleic acid as set forth in SEQ ID NO: 2564, wherein the
isolated nucleic acid encodes a polypeptide that provides
protective immunity against an infection by S. pneumoniae.
23. The method of claim 22, wherein the isolated nucleic acid
administered to the subject is a component of a vector.
24. The method of claim 23, wherein the vector is a component of a
cell.
25. A method of treating a subject, comprising the step of
administering to the subject a composition that includes an
isolated nucleic acid selected from the group consisting of: a) SEQ
ID NO: 2564; b) a nucleic acid fully complementary to SEQ ID NO:
2564; c) a nucleic acid having at least 99% identity to SEQ ID NO:
2564; d) a nucleic acid fully complementary to a nucleic acid
having at least 99% identity to SEQ ID NO: 2564; and e) an RNA of
a), b), c) or d) wherein U is substituted for T, wherein the
isolated nucleic acid encodes a polypeptide that provides
protective immunity against an infection by S. pneumoniae.
26. A method of treating a subject, comprising the step of
administering to the subject a composition that includes an
isolated nucleic acid encoding a S. pneumoniae polypeptide selected
from the group consisting of: (a) SEQ ID NO: 5225; and (b) a
polypeptide having at least 99% identity to SEQ ID NO: 5225,
wherein administration of the polypeptide provides protective
immunity against an infection by S. pneumoniae.
27. A method of treating a subject, comprising the step of
administering to the subject a composition that includes an
isolated nucleic acid encoding a S. pneumoniae polypeptide having
at least 99% identity to SEQ ID NO: 5225, wherein the polypeptide
provides protective immunity against an infection by S. pneumoniae.
Description
RELATED APPLICATIONS
[0001] This application is a divisional of U.S. application Ser.
No. 11/028,458, filed Dec. 30, 2004, which is a continuation of
U.S. application Ser. No. 10/640,833, filed Aug. 14, 2003, which is
a continuation of U.S. application Ser. No. 09/583,110 (now U.S.
Pat. No. 6,699,703) filed May 26, 2000, which is a
continuation-in-part of U.S. application Ser. No. 09/107,433 (now
U.S. Pat. No. 6,800,744), filed Jun. 30, 1998, which claims the
benefit of U.S. Application No. 60/085,131, filed May 12, 1998 and
of U.S. Application No. 60/051,553, filed Jul. 2, 1997. The entire
teachings of the above applications are incorporated herein by
reference.
INCORPORATION BY REFERENCE OF MATERIAL ON COMPACT DISK
[0002] This application incorporates by reference the Sequence
Listing contained on the two compact disks (Copy 1 and Copy 2),
filed concurrently herewith, containing the following file:
[0003] File name: 3687.1000-048SequenceList.txt; created Nov. 30,
2006, 8,135 KB in size.
[0004] This application also incorporates by reference Table 2
contained on the two compact disks (Copy 1 and Copy 2), filed
concurrently herewith, containing the following file:
[0005] File name: Table2_2.txt; created Nov. 6, 2006, 351 KB
in size.
FIELD OF THE INVENTION
[0006] The invention relates to isolated nucleic acids and
polypeptides derived from Streptococcus pneumoniae that are useful
as molecular targets for diagnostics, prophylaxis and treatment of
pathological conditions, as well as materials and methods for the
diagnosis, prevention, and amelioration of pathological conditions
resulting from bacterial infection.
BACKGROUND OF THE INVENTION
[0007] Streptococcus pneumoniae (S. pneumoniae) is a common,
spherical, gram-positive bacterium. Worldwide it is a leading cause
of illness among children, the elderly, and individuals with
debilitating medical conditions (Breiman, R. F. et al., 1994, JAMA
271: 1831). S. pneumoniae is estimated to be the causal agent in
3,000 cases of meningitis, 50,000 cases of bacteremia, 500,000
cases of pneumonia, and 7,000,000 cases of otitis media annually in
the United States alone (Reichler, M. R. et al., 1992, J. Infect.
Dis. 166: 1346; Stool, S. E. and Field, M. J., 1989 Pediatr.
Infect. Dis. J. 8: S11). In the United States alone, 40,000 deaths
result annually from S. pneumoniae infections (Williams, W. W. et
al., 1988 Ann. Intern. Med. 108: 616) with a death rate approaching
30% from bacteremia (Butler, J. C. et al., 1993, JAMA 270: 1826).
Pneumococcal pneumonia is a serious problem among the elderly of
industrialized nations (Kayhty, H. and Eskola, J., 1996 Emerg.
Infect. Dis. 2: 289) and is a leading cause of death among children
in developing nations (Kayhty, H. and Eskola, J., 1996 Emerg.
Infect. Dis. 2: 289; Stansfield, S. K., 1987 Pediatr. Infect. Dis.
6: 622).
[0008] Vaccines against S. pneumoniae have been available for a
number of years. There are a large number of serotypes based on the
polysaccharide capsule (van Dam, J. E., Fleer, A., and Snippe, H.,
1990 Antonie van Leeuwenhoek 58: 1) although only a fraction of the
serotypes seem to be associated with infections (Martin, D. R. and
Brett, M. S., 1996 N. Z. Med. J. 109: 288). A multivalent vaccine
against capsular polysaccharides of 23 serotypes (Smart, L. E.,
Dougall, A. J. and Gridwood, R. W., 1987 J. Infect. 14: 209) has
provided protection for some groups but not for several groups at
risk for pneumococcal infections, such as infants and the elderly
(Makel, P. H. et al., 1980 Lancet 2: 547; Sankilampi, U., 1996 J.
Infect. Dis. 173: 387). Conjugated pneumococcal capsular
polysaccharide vaccines have somewhat improved efficacy, but are
costly and, therefore, are not likely to be in widespread use
(Kayhty, H. and Eskola, J., 1996 Emerg. Infect. Dis. 2: 289).
[0009] At one time, S. pneumoniae strains were uniformly
susceptible to penicillin. The report of a penicillin-resistant
strain of S. pneumoniae (Hansman, D. and Bullen, M. M., 1967 Lancet
1: 264) was
followed rapidly by many reports indicating the worldwide emergence
of penicillin-resistant and penicillin non-susceptible strains
(Klugman, K. P., 1990 Clin. Microbiol. Rev. 3: 171). S. pneumoniae
strains which are resistant to multiple antibiotics (including
penicillin) have also been observed recently within the United
States (Welby, P. L., 1994 Pediatr. Infect. Dis. J. 13: 281; Ducin,
J. S. et al., 1995 Pediatr. Infect. Dis. J. 14: 745; Butler, J. C.,
1996 J. Infect. Dis. 174: 986) as well as internationally (Boswell,
T. C. et al., 1996 J. Infect. 33: 17; Catchpole, C., Fraise, A.,
and Wise, R., 1996 Microb. Drug Resist. 2: 431; Tarasi, A. et al.,
1997 Microb. Drug Resist. 3: 105).
[0010] A high incidence of morbidity is associated with invasive S.
pneumoniae infections (Williams, W. W. et al., 1988 Ann. Intern.
Med. 108: 616). Because of the incomplete effectiveness of
currently available vaccines and antibiotics, the identification of
new targets for antimicrobial therapies, including, but not limited
to, the design of vaccines and antibiotics, which may help prevent
infection or that may be useful in fighting existing infections, is
highly desirable.
SUMMARY OF THE INVENTION
[0011] The present invention fulfills the need for diagnostic tools
and therapeutics by providing bacterial-specific compositions and
methods for detecting, treating, and preventing bacterial
infection, in particular S. pneumoniae infection.
[0012] The present invention encompasses isolated polypeptides and
nucleic acids derived from S. pneumoniae that are useful as
reagents for diagnosis of bacterial infection, components of
effective antibacterial vaccines, and/or as targets for
antibacterial drugs, including anti-S. pneumoniae drugs. The
nucleic acids and peptides of the present invention also have
utility for diagnostics and therapeutics for S. pneumoniae and
other Streptococcus species. They can also be used to detect the
presence of S. pneumoniae and other Streptococcus species in a
sample; and in screening compounds for the ability to interfere
with the S. pneumoniae life cycle or to inhibit S. pneumoniae
infection. More specifically, this invention features compositions
of nucleic acids corresponding to entire coding sequences of S.
pneumoniae proteins, including surface or secreted proteins or
parts thereof, nucleic acids capable of binding mRNA from S.
pneumoniae proteins to block protein translation, and methods for
producing S. pneumoniae proteins or parts thereof using peptide
synthesis and recombinant DNA techniques. This invention also
features antibodies and nucleic acids useful as probes to detect S.
pneumoniae infection. In addition, vaccine compositions and methods
for the protection or treatment of infection by S. pneumoniae are
within the scope of this invention.
[0013] The nucleotide sequences provided in SEQ ID NO: 1-SEQ ID
NO: 2661, a fragment thereof, or a nucleotide sequence at least
99.5% identical to a sequence contained within SEQ ID NO: 1-SEQ ID
NO: 2661 may be "provided" in a variety of media to facilitate use
thereof. As used herein, "provided" refers to a manufacture, other
than an isolated nucleic acid molecule, which contains a nucleotide
sequence of the present invention, i.e., the nucleotide sequence
provided in SEQ ID NO: 1-SEQ ID NO: 2661, a fragment thereof, or a
nucleotide sequence at least 99.5% identical to a sequence
contained within SEQ ID NO: 1-SEQ ID NO: 2661. Uses for and methods
for providing nucleotide sequences in a variety of media are well
known in the art (see, e.g., EPO Publication No. EP 0 756 006).
[0014] In one application of this embodiment, a nucleotide sequence
of the present invention can be recorded on computer readable
media. As used herein, "computer readable media" refers to any
media which can be read and accessed directly by a computer. Such
media include, but are not limited to: magnetic storage media, such
as floppy discs, hard disc storage media, and magnetic tape;
optical storage media such as CD-ROM; electrical storage media such
as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media. A person skilled in the art can
readily appreciate how any of the presently known computer readable
media can be used to create a manufacture comprising computer
readable media having recorded thereon a nucleotide sequence of the
present invention.
[0015] As used herein, "recorded" refers to a process for storing
information on computer readable media. A person skilled in the art
can readily adopt any of the presently known methods for recording
information on computer readable media to generate manufactures
comprising the nucleotide sequence information of the present
invention.
[0016] A variety of data storage structures are available to a
person skilled in the art for creating a computer readable medium
having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally
be based on the means chosen to access the stored information. In
addition, a variety of data processor programs and formats can be
used to store the nucleotide sequence information of the present
invention on computer readable media. The sequence information can
be represented in a word processing text file, formatted in
commercially-available software such as WordPerfect and Microsoft
Word, or represented in the form of an ASCII file, stored in a
database application, such as DB2, Sybase, Oracle, or the like. A
person skilled in the art can readily adapt any number of data
processor structuring formats (e.g. text file or database) in order
to obtain computer readable media having recorded thereon the
nucleotide sequence information of the present invention.
[0017] By providing the nucleotide sequence of SEQ ID NO: 1-SEQ ID
NO: 2661, a fragment thereof, or a nucleotide sequence at least
99.5% identical to a sequence contained within SEQ ID NO: 1-SEQ ID
NO: 2661 in computer readable form, a person skilled in the art can
routinely access the sequence information for a variety of
purposes. Computer software is publicly available which allows a
person skilled in the art to access sequence information provided
on computer readable media. Examples of such computer software
include programs of the "Staden Package", "DNA Star", "MacVector",
GCG "Wisconsin Package" (Genetics Computer Group, Madison, Wis.) and
"NCBI toolbox" (National Center for Biotechnology Information).
[0018] Computer algorithms enable the identification of S.
pneumoniae open reading frames (ORFs) within SEQ ID NO: 1-SEQ ID
NO: 2661 which contain homology to ORFs or proteins from other
organisms. Examples of such similarity-search algorithms include
the BLAST [Altschul et al., J. Mol. Biol. 215:403-410 (1990)] and
Smith-Waterman [Smith and Waterman (1981) Advances in Applied
Mathematics, 2:482-489] search algorithms. These algorithms are
utilized on computer systems as exemplified below. The ORFs so
identified represent protein encoding fragments within the S.
pneumoniae genome and are useful in producing commercially
important proteins such as enzymes used in fermentation reactions
and in the production of commercially useful metabolites.
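As a minimal sketch of the Smith-Waterman local alignment recurrence cited above, the following function computes only the best local alignment score between two sequences. The scoring values (match +2, mismatch -1, linear gap penalty -2) and the function name are assumptions chosen for illustration; production search tools use substitution matrices and affine gap penalties, which this sketch omits.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score of a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty.
    Only two rows of the dynamic-programming matrix are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                          # start a new local alignment
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # The shared core "ACGT" scores 4 matches x 2 under the assumed scheme.
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

The floor at zero is what makes the alignment local: a poorly matching prefix cannot drag down the score of a well matching region that follows it.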
[0019] The present invention further provides systems, particularly
computer-based systems, which contain the sequence information
described herein. Such systems are designed to identify
commercially important fragments of the S. pneumoniae genome. As
used herein, "a computer-based system" refers to the hardware
means, software means, and data storage means used to analyze the
nucleotide sequence information of the present invention. The
minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means,
output means, and data storage means. A person skilled in the art
can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present
invention. The computer-based systems of the present invention
comprise a data storage means having stored therein a nucleotide
sequence of the present invention and the necessary hardware means
and software means for supporting and implementing a search means.
As used herein, "data storage means" refers to memory which can
store nucleotide sequence information of the present invention, or
a memory access means which can access manufactures having recorded
thereon the nucleotide sequence information of the present
invention.
[0020] As used herein, "search means" refers to one or more
programs which are implemented on the computer-based system to
compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search
means are used to identify fragments or regions of the S.
pneumoniae genome which are similar to, or "match", a particular
target sequence or target motif. A variety of algorithms are
known in the art and have been disclosed publicly, and a variety of
commercially available software packages for conducting
homology-based similarity searches are available and can be used in
the computer-based systems of the present invention. Examples of
such software include, but are not limited to, FASTA (GCG Wisconsin
Package), Bic_SW (Compugen Bioaccelerator), BLASTN2, BLASTP2 and
BLASTX2 (NCBI), and Motifs (GCG). A person skilled in the
art can readily recognize that any one of the available algorithms
or implementing software packages for conducting homology searches
can be adapted for use in the present computer-based systems.
[0021] As used herein, a "target sequence" can be any DNA or amino
acid sequence of six or more nucleotides or two or more amino
acids. A person skilled in the art can readily recognize that the
longer a target sequence is, the less likely a target sequence will
be present as a random occurrence in the database. The most
preferred sequence length of a target sequence is from about 10 to
100 amino acids or from about 30 to 300 nucleotide residues.
However, it is well recognized that many genes are longer than 500
amino acids, or 1.5 kb in length, and that commercially important
fragments of the S. pneumoniae genome, such as sequence fragments
involved in gene expression and protein processing, will often be
shorter than 30 nucleotides.
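The relationship between target length and chance occurrence can be made concrete with a rough estimate. The sketch below assumes a simplified model in which every residue is independent and equally likely and only one strand is searched; under that assumption, a target of length n is expected to occur about (N - n + 1) / 4^n times by chance in a nucleotide database of length N. The function name and the database size of two million nucleotides are illustrative assumptions.

```python
def expected_random_hits(target_len: int, database_len: int,
                         alphabet_size: int = 4) -> float:
    """Expected number of exact chance matches of a target in a database.

    Assumes independent, uniformly distributed residues and a single
    strand; real genomes deviate from this model.
    """
    if target_len <= 0 or database_len <= 0:
        raise ValueError("lengths must be positive")
    if target_len > database_len:
        return 0.0
    positions = database_len - target_len + 1
    return positions * float(alphabet_size) ** (-target_len)


if __name__ == "__main__":
    # Illustrative database size only.
    for n in (6, 12, 30):
        print(n, expected_random_hits(n, 2_000_000))
```

Under these assumptions a six-nucleotide target is expected to occur hundreds of times by chance, whereas a 30-nucleotide target is expected essentially never, which is consistent with the preference for longer targets stated above. Passing alphabet_size=20 gives the corresponding estimate for amino acid targets.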
[0022] As used herein, "a target structural motif," or "target
motif," refers to any rationally selected sequence or combination
of sequences in which the sequence(s) are chosen based on a
specific functional domain or three-dimensional configuration which
is formed upon the folding of the target polypeptide. There are a
variety of target motifs known in the art. Protein target motifs
include, but are not limited to, enzymatic active sites, membrane
spanning regions, and signal sequences. Nucleic acid target motifs
include, but are not limited to, promoter sequences, hairpin
structures and inducible expression elements (protein binding
sequences).
[0023] A variety of structural formats for the input and output
means can be used to input and output the information in the
computer-based systems of the present invention. A preferred format
for an output means ranks fragments of the S. pneumoniae genome
possessing varying degrees of homology to the target sequence or
target motif. Such presentation provides a person skilled in the
art with a ranking of sequences which contain various amounts of
the target sequence or target motif and identifies the degree of
homology contained in the identified fragment.
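A ranked output of the kind described above might be produced as in the following sketch. It assumes the simplest possible comparison, an ungapped sliding window scored by percent identity, rather than any of the search programs named in this description; the function names and the demonstration sequences are hypothetical.

```python
def window_identity(target: str, window: str) -> float:
    """Percent of positions at which target and window carry the same residue."""
    matches = sum(1 for x, y in zip(target, window) if x == y)
    return 100.0 * matches / len(target)


def rank_fragments(target: str, genome: str,
                   top: int = 3) -> list[tuple[int, float]]:
    """Rank genome windows by ungapped identity to the target.

    Returns (start_position, percent_identity) pairs, best first; ties
    are broken by the earlier start position.
    """
    if not target:
        raise ValueError("target must not be empty")
    n = len(target)
    hits = [
        (start, window_identity(target, genome[start:start + n]))
        for start in range(len(genome) - n + 1)
    ]
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits[:top]


if __name__ == "__main__":
    # Made-up sequences for demonstration.
    for start, identity in rank_fragments("ACGT", "TTACGTTTACCTAA"):
        print(f"start={start} identity={identity:.0f}%")
```

Each output row reports both where the fragment lies and how closely it matches, which corresponds to the ranking and degree-of-homology information described in the paragraph above.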
[0024] A variety of comparing means can be used to compare a target
sequence or target motif with the data storage means to identify
sequence fragments of the S. pneumoniae genome. In the present
examples, implementing software which implements the BLASTP2 and
bic_SW algorithms (Altschul et al., J. Mol. Biol. 215:403-410
(1990); Compugen Bioaccelerator) was used to identify open reading
frames within the S. pneumoniae genome. A person skilled in the art
can readily recognize that any one of the publicly available
homology search programs can be used as the search means for the
computer-based systems of the present invention.
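For readers unfamiliar with the term, an open reading frame can be illustrated with a naive codon scan. The sketch below is not the BLASTP2 or bic_SW procedure referred to above; it assumes only the standard ATG start codon and the TAA, TAG, and TGA stop codons, scans the forward strand in three frames, and uses a hypothetical minimum length. The function name and the demonstration sequence are likewise assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int, int]]:
    """Return (start, end, frame) for simple ATG-to-stop ORFs.

    Scans the forward strand only. 'start' is the index of the ATG,
    'end' is the index just past the stop codon, and min_codons counts
    codons from the ATG up to, but not including, the stop codon.
    """
    seq = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    # Made-up sequence: ATG GCT AAA GGT TAA embedded in flanking bases.
    print(find_orfs("CCATGGCTAAAGGTTAACC"))
```

A practical pipeline would also scan the reverse complement and would typically confirm candidate ORFs by similarity to known proteins, as the search programs named above do.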
[0025] The invention features S. pneumoniae polypeptides,
preferably a substantially pure preparation of an S. pneumoniae
polypeptide, or a recombinant S. pneumoniae polypeptide. In
preferred embodiments: the polypeptide has biological activity; the
polypeptide has an amino acid sequence at least 60%, 70%, 80%, 90%,
95%, 98%, or 99% identical to an amino acid sequence of the
invention contained in the Sequence Listing, preferably it has
about 65% sequence identity with an amino acid sequence of the
invention contained in the Sequence Listing, and most preferably it
has about 92% to about 99% sequence identity with an amino acid
sequence of the invention contained in the Sequence Listing; the
polypeptide has an amino acid sequence essentially the same as an
amino acid sequence of the invention contained in the Sequence
Listing; the polypeptide is at least 5, 10, 20, 50, 100, or 150
amino acid residues in length; the polypeptide includes at least 5,
preferably at least 10, more preferably at least 20, more
preferably at least 50, 100, or 150 contiguous amino acid residues
of the invention contained in the Sequence Listing. In yet another
preferred embodiment, the amino acid sequence which differs in
sequence identity by about 7% to about 8% from the S. pneumoniae
amino acid sequences of the invention contained in the Sequence
Listing is also encompassed by the invention.
[0026] In preferred embodiments: the S. pneumoniae polypeptide is
encoded by a nucleic acid of the invention contained in the
Sequence Listing, or by a nucleic acid having at least 60%, 70%,
80%, 90%, 95%, 98%, or 99% homology with a nucleic acid of the
invention contained in the Sequence Listing.
[0027] In a preferred embodiment, the subject S. pneumoniae
polypeptide differs in amino acid sequence at 1, 2, 3, 5, 10 or
more residues from a sequence of the invention contained in the
Sequence Listing. The differences, however, are such that the S.
pneumoniae polypeptide exhibits an S. pneumoniae biological
activity, e.g., the S. pneumoniae polypeptide retains a biological
activity of a naturally occurring S. pneumoniae enzyme.
[0028] In preferred embodiments, the polypeptide includes all or a
fragment of an amino acid sequence of the invention contained in
the Sequence Listing; fused, in reading frame, to additional amino
acid residues, preferably to residues encoded by genomic DNA 5' or
3' to the genomic DNA which encodes a sequence of the invention
contained in the Sequence Listing.
[0029] In yet other preferred embodiments, the S. pneumoniae
polypeptide is a recombinant fusion protein having a first S.
pneumoniae polypeptide portion and a second polypeptide portion,
e.g., a second polypeptide portion having an amino acid sequence
unrelated to S. pneumoniae. The second polypeptide portion can be,
e.g., any of glutathione-S-transferase, a DNA binding domain, or a
polymerase activating domain. In a preferred embodiment, the fusion
protein can be used in a two-hybrid assay.
[0030] Polypeptides of the invention include those which arise as a
result of alternative transcription events, alternative RNA
splicing events, and alternative translational and posttranslational
events.
[0031] In a preferred embodiment, the encoded S. pneumoniae
polypeptide differs (e.g., by amino acid substitution, addition or
deletion of at least one amino acid residue) in amino acid sequence
at 1, 2, 3, 5, 10 or more residues, from a sequence of the
invention contained in the Sequence Listing. The differences,
however, are such that the encoded S. pneumoniae polypeptide
exhibits an S. pneumoniae biological activity, e.g., the encoded S.
pneumoniae polypeptide retains a biological activity of a naturally
occurring S. pneumoniae enzyme.
[0032] In preferred embodiments, the encoded polypeptide includes
all or a fragment of an amino acid sequence of the invention
contained in the Sequence Listing; fused, in reading frame, to
additional amino acid residues, preferably to residues encoded by
genomic DNA 5' or 3' to the genomic DNA which encodes a sequence of
the invention contained in the Sequence Listing.
[0033] The S. pneumoniae strain, 14453, from which the genomic
sequences were obtained, was deposited on Jun. 26, 1997
with the American Type Culture Collection, 10801 University Blvd.,
Manassas, Virginia 20110-2209, and assigned the ATCC designation
#55987.
[0034] Included in the invention are: allelic variations; natural
mutants; induced mutants; proteins encoded by DNA that hybridizes
under high or low stringency conditions to a nucleic acid which
encodes a polypeptide of the invention contained in the Sequence
Listing (for definitions of high and low stringency see Current
Protocols in Molecular Biology, John Wiley & Sons, New York,
1989, 6.3.1-6.3.6, hereby incorporated by reference); and,
polypeptides specifically bound by antisera to S. pneumoniae
polypeptides, especially by antisera to an active site or binding
domain of S. pneumoniae polypeptide. The invention also includes
fragments, preferably biologically active fragments. These and
other polypeptides are also referred to herein as S. pneumoniae
polypeptide analogs or variants.
[0035] The invention further provides nucleic acids, e.g., RNA or
DNA, encoding a polypeptide of the invention. This includes double
stranded nucleic acids as well as coding and antisense single
strands.
[0036] In preferred embodiments, the subject S. pneumoniae nucleic
acid will include a transcriptional regulatory sequence, e.g. at
least one of a transcriptional promoter or transcriptional enhancer
sequence, operably linked to the S. pneumoniae gene sequence, e.g.,
to render the S. pneumoniae gene sequence suitable for expression
in a recombinant host cell.
[0037] In yet a further preferred embodiment, the nucleic acid
which encodes an S. pneumoniae polypeptide of the invention,
hybridizes under stringent conditions to a nucleic acid probe
corresponding to at least 8 consecutive nucleotides of the
invention contained in the Sequence Listing; more preferably to at
least 12 consecutive nucleotides of the invention contained in the
Sequence Listing; more preferably to at least 20 consecutive
nucleotides of the invention contained in the Sequence Listing;
more preferably to at least 40 consecutive nucleotides of the
invention contained in the Sequence Listing.
[0038] In another aspect, the invention provides a substantially
pure nucleic acid having a nucleotide sequence which encodes an S.
pneumoniae polypeptide. In preferred embodiments: the encoded
polypeptide has biological activity; the encoded polypeptide has an
amino acid sequence at least 60%, 70%, 80%, 90%, 95%, 98%, or 99%
homologous to an amino acid sequence of the invention contained in
the Sequence Listing; the encoded polypeptide has an amino acid
sequence essentially the same as an amino acid sequence of the
invention contained in the Sequence Listing; the encoded
polypeptide is at least 5, 10, 20, 50, 100, or 150 amino acids in
length; the encoded polypeptide comprises at least 5, preferably at
least 10, more preferably at least 20, more preferably at least 50,
100, or 150 contiguous amino acids of the invention contained in
the Sequence Listing.
[0039] In another aspect, the invention encompasses: a vector
including a nucleic acid which encodes an S. pneumoniae polypeptide
or an S. pneumoniae polypeptide variant as described herein; a host
cell transfected with the vector; and a method of producing a
recombinant S. pneumoniae polypeptide or S. pneumoniae polypeptide
variant, including culturing the cell, e.g., in a cell culture
medium, and isolating an S. pneumoniae polypeptide or an S.
pneumoniae polypeptide variant, e.g., from the cell or from the
cell culture medium.
[0040] In another series of embodiments, the invention provides
isolated nucleic acids comprising sequences at least about 8
nucleotides in length, more preferably at least about 12
nucleotides in length, and most preferably at least about 15-20
nucleotides in length, that correspond to a subsequence of any one
of SEQ ID NO: 1-SEQ ID NO: 2661 or complements thereof.
Alternatively, the nucleic acids comprise sequences contained
within any ORF (open reading frame), including a complete
protein-coding sequence, of which any of SEQ ID NO: 1-SEQ ID NO:
2661 forms a part. The invention encompasses sequence-conservative
variants and function-conservative variants of these sequences. The
nucleic acids may be DNA, RNA, DNA/RNA duplexes, protein-nucleic
acid (PNA), or derivatives thereof.
[0041] In another aspect, the invention features a purified
recombinant nucleic acid having at least 50%, 60%, 70%, 80%, 90%,
95%, 98%, or 99% homology with a sequence of the invention
contained in the Sequence Listing.
[0042] In another aspect, the invention features nucleic acids
capable of binding mRNA of S. pneumoniae. Such nucleic acid is
capable of acting as antisense nucleic acid to control the
translation of mRNA of S. pneumoniae. A further aspect features a
nucleic acid which is capable of binding specifically to an S.
pneumoniae nucleic acid. These nucleic acids are also referred to
herein as complements and have utility as probes and as capture
reagents.
[0043] In another aspect, the invention features an expression
system comprising an open reading frame corresponding to S.
pneumoniae nucleic acid. The nucleic acid further comprises a
control sequence compatible with an intended host. The expression
system is useful for making polypeptides corresponding to S.
pneumoniae nucleic acid.
[0044] In another aspect, the invention features a cell transformed
with the expression system to produce S. pneumoniae
polypeptides.
[0045] In yet another embodiment, the invention encompasses
reagents for detecting bacterial infection, including S. pneumoniae
infection, which comprise at least one S. pneumoniae-derived
nucleic acid defined by any one of SEQ ID NO: 1-SEQ ID NO: 2661, or
sequence-conservative or function-conservative variants thereof.
Alternatively, the diagnostic reagents comprise polypeptide
sequences that are contained within any open reading frames (ORFs),
including complete protein-coding sequences, contained within any
of SEQ ID NO: 1-SEQ ID NO: 2661, or polypeptide sequences contained
within any of SEQ ID NO: 2662-SEQ ID NO: 5322, or polypeptides of
which any of the above sequences forms a part, or antibodies
directed against any of the above peptide sequences or
function-conservative variants and/or fragments thereof.
[0046] The invention further provides antibodies, preferably
monoclonal antibodies, which specifically bind to the polypeptides
of the invention. Methods are also provided for producing
antibodies in a host animal. The methods of the invention comprise
immunizing an animal with at least one S. pneumoniae-derived
immunogenic component, wherein the immunogenic component comprises
one or more of the polypeptides encoded by any one of SEQ ID NO:
1-SEQ ID NO: 2661 or sequence-conservative or function-conservative
variants thereof; or polypeptides that are contained within any
ORFs, including complete protein-coding sequences, of which any of
SEQ ID NO: 1-SEQ ID NO: 2661 forms a part; or polypeptide sequences
contained within any of SEQ ID NO: 2662-SEQ ID NO: 5322; or
polypeptides of which any of SEQ ID NO: 2662-SEQ ID NO: 5322 forms
a part. Host animals include any warm blooded animal, including
without limitation mammals and birds. Such antibodies have utility
as reagents for immunoassays to evaluate the abundance and
distribution of S. pneumoniae-specific antigens.
[0047] In yet another aspect, the invention provides a method for
detecting bacterial antigenic components in a sample, which
comprises the steps of: (i) contacting a sample suspected to
contain a bacterial antigenic component with a bacterial-specific
antibody, under conditions in which a stable antigen-antibody
complex can form between the antibody and bacterial antigenic
components in the sample; and (ii) detecting any antigen-antibody
complex formed in step (i), wherein detection of an
antigen-antibody complex indicates the presence of at least one
bacterial antigenic component in the sample. In different
embodiments of this method, the antibodies used are directed
against a sequence encoded by any of SEQ ID NO: 1-SEQ ID NO: 2661
or sequence-conservative or function-conservative variants thereof,
or against a polypeptide sequence contained in any of SEQ ID NO:
2662-SEQ ID NO: 5322 or function-conservative variants thereof.
[0048] In yet another aspect, the invention provides a method for
detecting antibacterial-specific antibodies in a sample, which
comprises: (i) contacting a sample suspected to contain
antibacterial-specific antibodies with a S. pneumoniae antigenic
component, under conditions in which a stable antigen-antibody
complex can form between the S. pneumoniae antigenic component and
antibacterial antibodies in the sample; and (ii) detecting any
antigen-antibody complex formed in step (i), wherein detection of
an antigen-antibody complex indicates the presence of antibacterial
antibodies in the sample. In different embodiments of this method,
the antigenic component is encoded by a sequence contained in any
of SEQ ID NO: 1-SEQ ID NO: 2661 or sequence-conservative and
function-conservative variants thereof, or is a polypeptide
sequence contained in any of SEQ ID NO: 2662-SEQ ID NO: 5322 or
function-conservative variants thereof.
[0049] In another aspect, the invention features a method of
generating vaccines for immunizing an individual against S.
pneumoniae. The method includes: immunizing a subject with an S.
pneumoniae polypeptide, e.g., a surface or secreted polypeptide, or
active portion thereof, and a pharmaceutically acceptable carrier.
Such vaccines have therapeutic and prophylactic utilities.
[0050] In another aspect, the invention features a method of
evaluating a compound, e.g. a polypeptide, e.g., a fragment of a
host cell polypeptide, for the ability to bind an S. pneumoniae
polypeptide. The method includes: contacting the candidate compound
with an S. pneumoniae polypeptide and determining if the compound
binds or otherwise interacts with an S. pneumoniae polypeptide.
Compounds which bind S. pneumoniae are candidates as activators or
inhibitors of the bacterial life cycle. These assays can be
performed in vitro or in vivo.
[0051] In another aspect, the invention features a method of
evaluating a compound, e.g. a polypeptide, e.g., a fragment of a
host cell polypeptide, for the ability to bind an S. pneumoniae
nucleic acid, e.g., DNA or RNA. The method includes: contacting the
candidate compound with an S. pneumoniae nucleic acid and
determining if the compound binds or otherwise interacts with an S.
pneumoniae nucleic acid. Compounds which bind S. pneumoniae are
candidates as activators or inhibitors of the bacterial life cycle.
These assays can be performed in vitro or in vivo.
DETAILED DESCRIPTION OF THE INVENTION
[0052] The sequences of the present invention include the specific
nucleic acid and amino acid sequences set forth in the Sequence
Listing that forms a part of the present specification, and which
are designated SEQ ID NO: 1-SEQ ID NO: 5322. Use of the terms "SEQ
ID NO: 1-SEQ ID NO: 2661", "SEQ ID NO: 2662-SEQ ID NO: 5322", "the
sequences depicted in Table 2", etc., is intended, for convenience,
to refer to each individual SEQ ID NO individually, and is not
intended to refer to the genus of these sequences. In other words,
it is a shorthand for listing all of these sequences individually.
The invention encompasses each sequence individually, as well as
any combination thereof.
DEFINITIONS
[0053] "Nucleic acid" or "polynucleotide" as used herein refers to
purine- and pyrimidine-containing polymers of any length, either
polyribonucleotides or polydeoxyribonucleotides or mixed
polyribo-polydeoxyribo nucleotides. This includes single- and
double-stranded molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA
hybrids, as well as "protein nucleic acids" (PNA) formed by
conjugating bases to an amino acid backbone. This also includes
nucleic acids containing modified bases.
[0054] A nucleic acid or polypeptide sequence that is "derived
from" a designated sequence refers to a sequence that corresponds
to a region of the designated sequence. For nucleic acid sequences,
this encompasses sequences that are homologous or complementary to
the sequence, as well as "sequence-conservative variants" and
"function-conservative variants." For polypeptide sequences, this
encompasses "function-conservative variants." Sequence-conservative
variants are those in which a change of one or more nucleotides in
a given codon position results in no alteration in the amino acid
encoded at that position. Function-conservative variants are those
in which a given amino acid residue in a polypeptide has been
changed without altering the overall conformation and function of
the native polypeptide, including, but not limited to, replacement
of an amino acid with one having similar physico-chemical
properties (such as, for example, acidic, basic, hydrophobic, and
the like). "Function-conservative" variants also include any
polypeptides that have the ability to elicit antibodies specific to
a designated polypeptide.
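The definition of a sequence-conservative variant can be illustrated with a minimal Python sketch. It assumes the standard genetic code, in-frame coding DNA, and hypothetical function names (`translate`, `is_sequence_conservative`); the example codons are invented for illustration and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    coding_dna = coding_dna.upper()
    if len(coding_dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(
        CODON_TABLE[coding_dna[i:i + 3]] for i in range(0, len(coding_dna), 3)
    )


def is_sequence_conservative(original: str, variant: str) -> bool:
    """True if the nucleotides differ but the encoded amino acids do not."""
    return (
        original.upper() != variant.upper()
        and translate(original) == translate(variant)
    )


# Hypothetical example: GCT -> GCC and AAA -> AAG are silent changes.
print(is_sequence_conservative("ATGGCTAAA", "ATGGCCAAG"))
# GCT -> GAT changes alanine to aspartic acid, so this is not silent.
print(is_sequence_conservative("ATGGCTAAA", "ATGGATAAA"))
```

Whether a non-silent change is function-conservative cannot be decided from the codon table alone, since it depends on the conformation and function of the resulting polypeptide.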
[0055] An "S. pneumoniae-derived" nucleic acid or polypeptide
sequence may or may not be present in other bacterial species, and
may or may not be present in all S. pneumoniae strains. This term
is intended to refer to the source from which the sequence was
originally isolated. Thus, a S. pneumoniae-derived polypeptide, as
used herein, may be used, e.g., as a target to screen for a broad
spectrum antibacterial agent, to search for homologous proteins in
other species of bacteria or in eukaryotic organisms such as fungi
and humans, etc.
[0056] A purified or isolated polypeptide or a substantially pure
preparation of a polypeptide are used interchangeably herein and,
as used herein, mean a polypeptide that has been separated from
other proteins, lipids, and nucleic acids with which it naturally
occurs. Preferably, the polypeptide is also separated from
substances, e.g., antibodies or gel matrix, e.g., polyacrylamide,
which are used to purify it. Preferably, the polypeptide
constitutes at least 10, 20, 50, 70, 80 or 95% dry weight of the
purified preparation. Preferably, the preparation contains:
sufficient polypeptide to allow protein sequencing; at least 1, 10,
or 100 mg of the polypeptide.
[0057] A purified preparation of cells refers to, in the case of
plant or animal cells, an in vitro preparation of cells and not an
entire intact plant or animal. In the case of cultured cells or
microbial cells, it consists of a preparation of at least 10% and
more preferably 50% of the subject cells.
[0058] A purified or isolated or a substantially pure nucleic acid,
e.g., a substantially pure DNA, (are terms used interchangeably
herein) is a nucleic acid which is one or both of the following:
not immediately contiguous with both of the coding sequences with
which it is immediately contiguous (i.e., one at the 5' end and one
at the 3' end) in the naturally-occurring genome of the organism
from which the nucleic acid is derived; or which is substantially
free of a nucleic acid with which it occurs in the organism from
which the nucleic acid is derived. The term includes, for example,
a recombinant DNA which is incorporated into a vector, e.g., into
an autonomously replicating plasmid or virus, or into the genomic
DNA of a prokaryote or eukaryote, or which exists as a separate
molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or
restriction endonuclease treatment) independent of other DNA
sequences. Substantially pure DNA also includes a recombinant DNA
which is part of a hybrid gene encoding additional S. pneumoniae
DNA sequence.
[0059] A "contig" as used herein is a nucleic acid representing a
continuous stretch of genomic sequence of an organism.
[0060] An "open reading frame", also referred to herein as ORF, is
a region of nucleic acid which encodes a polypeptide. This region
usually represents the total coding region for the polypeptide and
can be determined from a stop to stop codon or from a start to stop
codon.
[0061] As used herein, a "coding sequence" is a nucleic acid which
is transcribed into messenger RNA and/or translated into a
polypeptide when placed under the control of appropriate regulatory
sequences. The boundaries of the coding sequence are determined by
a translation start codon at the five prime terminus and a
translation stop codon at the three prime terminus. A coding
sequence can include but is not limited to messenger RNA, synthetic
DNA, and recombinant nucleic acid sequences.
[0062] A "complement" of a nucleic acid as used herein refers to an
anti-parallel or antisense sequence that participates in
Watson-Crick base-pairing with the original sequence.
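As a brief illustration of the anti-parallel complement defined above, the following Python sketch computes the complement of a DNA strand and tests whether two strands are complements of one another. The function names are hypothetical, only the four unmodified DNA bases are handled, and any other character is passed through unchanged.

```python
# Watson-Crick pairing for unmodified DNA bases: A-T and G-C.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement_strand(seq: str) -> str:
    """Return the anti-parallel complement of seq, written 5' to 3'."""
    return seq.upper().translate(_DNA_COMPLEMENT)[::-1]


def are_complements(strand_a: str, strand_b: str) -> bool:
    """True if strand_b is the anti-parallel complement of strand_a."""
    return complement_strand(strand_a) == strand_b.upper()


print(complement_strand("ATTGCC"))
print(are_complements("ATTGCC", "GGCAAT"))
```

Applying `complement_strand` twice returns the original sequence, which is a convenient sanity check.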
[0063] A "gene product" is a protein or structural RNA which is
specifically encoded by a gene.
[0064] As used herein, the term "probe" refers to a nucleic acid,
peptide or other chemical entity which specifically binds to a
molecule of interest. Probes are often associated with or capable
of associating with a label. A label is a chemical moiety capable
of detection. Typical labels comprise dyes, radioisotopes,
luminescent and chemiluminescent moieties, fluorophores, enzymes,
precipitating agents, amplification sequences, and the like.
Similarly, a nucleic acid, peptide or other chemical entity which
specifically binds to a molecule of interest and immobilizes such
molecule is referred to herein as a "capture ligand". Capture ligands
are typically associated with or capable of associating with a
support such as nitro-cellulose, glass, nylon membranes, beads,
particles and the like. The specificity of hybridization is
dependent on conditions such as the base pair composition of the
nucleotides, and the temperature and salt concentration of the
reaction. These conditions are readily discernable to one of
ordinary skill in the art using routine experimentation.
[0065] "Homologous" refers to the sequence similarity or sequence
identity between two polypeptides or between two nucleic acid
molecules. When a position in both of the two compared sequences is
occupied by the same base or amino acid monomer subunit, e.g., if a
position in each of two DNA molecules is occupied by adenine, then
the molecules are homologous at that position. The percent of
homology between two sequences is a function of the number of
matching or homologous positions shared by the two sequences
divided by the number of positions compared × 100. For example,
if 6 of 10 of the positions in two sequences are matched or
homologous then the two sequences are 60% homologous. By way of
example, the DNA sequences ATTGCC and TATGGC share 50% homology.
Generally, a comparison is made when two sequences are aligned to
give maximum homology.
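The percent homology calculation described above can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the two sequences have already been aligned to equal length; it does not itself perform the alignment that gives maximum homology.

```python
def percent_homology(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# The example from the text: positions 3, 4 and 6 match, i.e. 3 of 6.
print(percent_homology("ATTGCC", "TATGGC"))  # 50.0
```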
[0066] Nucleic acids are hybridizable to each other when at least
one strand of a nucleic acid can anneal to the other nucleic acid
under defined stringency conditions. Stringency of hybridization is
determined by: (a) the temperature at which hybridization and/or
washing is performed; and (b) the ionic strength and polarity of
the hybridization and washing solutions. Hybridization requires
that the two nucleic acids contain complementary sequences;
depending on the stringency of hybridization, however, mismatches
may be tolerated. Typically, hybridization of two sequences at high
stringency (such as, for example, in a solution of 0.5× SSC at
65° C.) requires that the sequences be essentially
completely homologous. Conditions of intermediate stringency (such
as, for example, 2× SSC at 65° C.) and low stringency
(such as, for example, 2× SSC at 55° C.) require
correspondingly less overall complementarity between the
hybridizing sequences. (1× SSC is 0.15 M NaCl, 0.015 M Na
citrate.)
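For convenience, the salt concentrations implied by the example conditions can be computed from the stated composition of 1× SSC. The following Python sketch is illustrative only; the dictionary and function names are assumptions, and the three conditions are simply those quoted in the text.

```python
# Composition of 1x SSC as stated in the text (mol/L).
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}

# Example conditions quoted in the text: (SSC fold, temperature in deg C).
EXAMPLE_CONDITIONS = {
    "high": (0.5, 65),
    "intermediate": (2.0, 65),
    "low": (2.0, 55),
}


def ssc_molarity(fold: float) -> dict:
    """Return solute molarities (mol/L) for an n-fold SSC solution."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {solute: fold * molar for solute, molar in SSC_1X.items()}


for name, (fold, temp_c) in EXAMPLE_CONDITIONS.items():
    salts = ssc_molarity(fold)
    print(
        f"{name} stringency: {fold}x SSC at {temp_c} C = "
        f"{salts['NaCl']:.3f} M NaCl, {salts['Na citrate']:.4f} M Na citrate"
    )
```

Lower salt and higher temperature both raise stringency, which is why the 0.5× SSC, 65° C. condition is the most demanding of the three.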
[0067] The terms peptides, proteins, and polypeptides are used
interchangeably herein.
[0068] As used herein, the term "surface protein" refers to all
surface accessible proteins, e.g. inner and outer membrane
proteins, proteins adhering to the cell wall, and secreted
proteins.
[0069] A polypeptide has S. pneumoniae biological activity if it
has one, two and preferably more of the following properties: (1)
when expressed in the course of an S. pneumoniae infection, it
can promote or mediate the attachment of S. pneumoniae to a cell;
(2) it has an enzymatic activity, structural or regulatory function
characteristic of an S. pneumoniae protein; or (3) the gene which
encodes it can rescue a lethal mutation in an S. pneumoniae gene. A
polypeptide has biological activity if it is an antagonist,
agonist, or super-agonist of a polypeptide having one of the
above-listed properties.
[0070] A biologically active fragment or analog is one having an in
vivo or in vitro activity which is characteristic of the S.
pneumoniae polypeptides of the invention contained in the Sequence
Listing, or of other naturally occurring S. pneumoniae
polypeptides, e.g., one or more of the biological activities
described herein. Especially preferred are fragments which exist in
vivo, e.g., fragments which arise from post transcriptional
processing or which arise from translation of alternatively spliced
RNAs. Fragments include those expressed in native or endogenous
cells as well as those made in expression systems, e.g., in CHO
cells. Because peptides such as S. pneumoniae polypeptides often
exhibit a range of physiological properties and because such
properties may be attributable to different portions of the
molecule, a useful S. pneumoniae fragment or S. pneumoniae analog
is one which exhibits a biological activity in any biological assay
for S. pneumoniae activity. Most preferably the fragment or analog
possesses 10%, preferably 40%, more preferably 60%, 70%, 80% or 90%
or greater of the activity of S. pneumoniae, in any in vivo or in
vitro assay.
[0071] Analogs can differ from naturally occurring S. pneumoniae
polypeptides in amino acid sequence or in ways that do not involve
sequence, or both. Non-sequence modifications include changes in
acetylation, methylation, phosphorylation, carboxylation, or
glycosylation. Preferred analogs include S. pneumoniae polypeptides
(or biologically active fragments thereof) whose sequences differ
from the wild-type sequence by one or more conservative amino acid
substitutions or by one or more non-conservative amino acid
substitutions, deletions, or insertions which do not substantially
diminish the biological activity of the S. pneumoniae polypeptide.
Conservative substitutions typically include the substitution of
one amino acid for another with similar characteristics, e.g.,
substitutions within the following groups: valine, glycine;
glycine, alanine; valine, isoleucine, leucine; aspartic acid,
glutamic acid; asparagine, glutamine; serine, threonine; lysine,
arginine; and phenylalanine, tyrosine. Other conservative
substitutions can be made in view of the table below.
TABLE 1. CONSERVATIVE AMINO ACID REPLACEMENTS
For Amino Acid | Code | Replace with any of
Alanine | A | D-Ala, Gly, β-Ala, L-Cys, D-Cys
Arginine | R | D-Arg, Lys, D-Lys, homo-Arg, D-homo-Arg, Met, Ile, D-Met, D-Ile, Orn, D-Orn
Asparagine | N | D-Asn, Asp, D-Asp, Glu, D-Glu, Gln, D-Gln
Aspartic Acid | D | D-Asp, D-Asn, Asn, Glu, D-Glu, Gln, D-Gln
Cysteine | C | D-Cys, S-Me-Cys, Met, D-Met, Thr, D-Thr
Glutamine | Q | D-Gln, Asn, D-Asn, Glu, D-Glu, Asp, D-Asp
Glutamic Acid | E | D-Glu, D-Asp, Asp, Asn, D-Asn, Gln, D-Gln
Glycine | G | Ala, D-Ala, Pro, D-Pro, β-Ala, Acp
Isoleucine | I | D-Ile, Val, D-Val, Leu, D-Leu, Met, D-Met
Leucine | L | D-Leu, Val, D-Val, Leu, D-Leu, Met, D-Met
Lysine | K | D-Lys, Arg, D-Arg, homo-Arg, D-homo-Arg, Met, D-Met, Ile, D-Ile, Orn, D-Orn
Methionine | M | D-Met, S-Me-Cys, Ile, D-Ile, Leu, D-Leu, Val, D-Val
Phenylalanine | F | D-Phe, Tyr, D-Thr, L-Dopa, His, D-His, Trp, D-Trp, Trans-3,4, or 5-phenylproline, cis-3,4, or 5-phenylproline
Proline | P | D-Pro, L-I-thioazolidine-4-carboxylic acid, D- or L-1-oxazolidine-4-carboxylic acid
Serine | S | D-Ser, Thr, D-Thr, allo-Thr, Met, D-Met, Met(O), D-Met(O), L-Cys, D-Cys
Threonine | T | D-Thr, Ser, D-Ser, allo-Thr, Met, D-Met, Met(O), D-Met(O), Val, D-Val
Tyrosine | Y | D-Tyr, Phe, D-Phe, L-Dopa, His, D-His
Valine | V | D-Val, Leu, D-Leu, Ile, D-Ile, Met, D-Met
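By way of illustration, the substitution groups named in paragraph [0071] (valine, glycine; glycine, alanine; valine, isoleucine, leucine; and so on) can be encoded as sets and used to classify the differences between a wild-type sequence and an analog. This Python sketch is a simplification under stated assumptions: it uses one-letter codes for the standard L-amino acids only, it requires equal-length sequences with no insertions or deletions, the function names are hypothetical, and the example peptides are invented.

```python
# One-letter codes for the substitution groups named in paragraph [0071].
CONSERVATIVE_GROUPS = [
    {"V", "G"}, {"G", "A"}, {"V", "I", "L"}, {"D", "E"},
    {"N", "Q"}, {"S", "T"}, {"K", "R"}, {"F", "Y"},
]


def is_conservative_substitution(residue_a: str, residue_b: str) -> bool:
    """True if two different residues share one of the listed groups."""
    a, b = residue_a.upper(), residue_b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(wild_type: str, analog: str) -> list:
    """Return (position, wild-type residue, analog residue, kind) tuples.

    Positions are 1-based; only positions that differ are reported.
    """
    if len(wild_type) != len(analog):
        raise ValueError("this sketch handles substitutions only")
    result = []
    for pos, (wt, an) in enumerate(zip(wild_type.upper(), analog.upper()), 1):
        if wt != an:
            kind = (
                "conservative"
                if is_conservative_substitution(wt, an)
                else "non-conservative"
            )
            result.append((pos, wt, an, kind))
    return result


# Hypothetical peptides: K->R and L->I fall within listed groups, S->A does not.
for change in classify_substitutions("MKVLDS", "MRVIDA"):
    print(change)
```

A change classified here as non-conservative may still be acceptable in an analog, provided it does not substantially diminish biological activity; that determination requires an assay rather than a lookup.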
[0072] Other analogs within the invention are those with
modifications which increase peptide stability; such analogs may
contain, for example, one or more non-peptide bonds (which replace
the peptide bonds) in the peptide sequence. Also included are:
analogs that include residues other than naturally occurring
L-amino acids, e.g., D-amino acids or non-naturally occurring or
synthetic amino acids, e.g., β or γ amino acids; and
cyclic analogs.
[0073] As used herein, the term "fragment", as applied to an S.
pneumoniae analog, will ordinarily be at least about 20 residues,
more typically at least about 40 residues, preferably at least
about 60 residues in length. Fragments of S. pneumoniae
polypeptides can be generated by methods known to those skilled in
the art. The ability of a candidate fragment to exhibit a
biological activity of S. pneumoniae polypeptide can be assessed by
methods known to those skilled in the art as described herein. Also
included are S. pneumoniae polypeptides containing residues that
are not required for biological activity of the peptide or that
result from alternative mRNA splicing or alternative protein
processing events.
[0074] An "immunogenic component" as used herein is a moiety, such
as an S. pneumoniae polypeptide, analog or fragment thereof, that
is capable of eliciting a humoral and/or cellular immune response
in a host animal.
[0075] An "antigenic component" as used herein is a moiety, such as
an S. pneumoniae polypeptide, analog or fragment thereof, that is
capable of binding to a specific antibody with sufficiently high
affinity to form a detectable antigen-antibody complex.
[0076] The term "antibody" as used herein is intended to include
fragments thereof which are specifically reactive with S.
pneumoniae polypeptides.
[0077] As used herein, the term "cell-specific promoter" means a
DNA sequence that serves as a promoter, i.e., regulates expression
of a selected DNA sequence operably linked to the promoter, and
which effects expression of the selected DNA sequence in specific
cells of a tissue. The term also covers so-called "leaky"
promoters, which regulate expression of a selected DNA primarily in
one tissue, but cause expression in other tissues as well.
[0078] Misexpression, as used herein, refers to a non-wild type
pattern of gene expression. It includes: expression at non-wild
type levels, i.e., over or under expression; a pattern of
expression that differs from wild type in terms of the time or
stage at which the gene is expressed, e.g., increased or decreased
expression (as compared with wild type) at a predetermined
developmental period or stage; a pattern of expression that differs
from wild type in terms of decreased expression (as compared with
wild type) in a predetermined cell type or tissue type; a pattern
of expression that differs from wild type in terms of the splicing
size, amino acid sequence, post-translational modification, or
biological activity of the expressed polypeptide; a pattern of
expression that differs from wild type in terms of the effect of an
environmental stimulus or extracellular stimulus on expression of
the gene, e.g., a pattern of increased or decreased expression (as
compared with wild type) in the presence of an increase or decrease
in the strength of the stimulus.
[0079] As used herein, "host cells" and other such terms denoting
microorganisms or higher eukaryotic cell lines cultured as
unicellular entities refers to cells which can become or have been
used as recipients for a recombinant vector or other transfer DNA,
and include the progeny of the original cell which has been
transfected. It is understood by individuals skilled in the art
that the progeny of a single parental cell may not necessarily be
completely identical in genomic or total DNA complement to the
original parent, due to accidental or deliberate mutation.
[0080] As used herein, the term "control sequence" refers to a
nucleic acid having a base sequence which is recognized by the host
organism to effect the expression of encoded sequences to which
they are ligated. The nature of such control sequences differs
depending upon the host organism; in prokaryotes, such control
sequences generally include a promoter, ribosomal binding site,
terminators, and in some cases operators; in eukaryotes, generally
such control sequences include promoters, terminators and in some
instances, enhancers. The term control sequence is intended to
include at a minimum, all components whose presence is necessary
for expression, and may also include additional components whose
presence is advantageous, for example, leader sequences.
[0081] As used herein, the term "operably linked" refers to
sequences joined or ligated to function in their intended manner.
For example, a control sequence is operably linked to coding
sequence by ligation in such a way that expression of the coding
sequence is achieved under conditions compatible with the control
sequence and host cell.
[0082] The "metabolism" of a substance, as used herein, means any
aspect of the expression, function, action, or regulation of the
substance. The metabolism of a substance includes modifications,
e.g., covalent or non-covalent modifications of the substance. The
metabolism of a substance includes modifications, e.g., covalent or
non-covalent modification, the substance induces in other
substances. The metabolism of a substance also includes changes in
the distribution of the substance. The metabolism of a substance
includes changes the substance induces in the distribution of other
substances.
[0083] A "sample" as used herein refers to a biological sample,
such as, for example, tissue or fluid isolated from an individual
(including without limitation plasma, serum, cerebrospinal fluid,
lymph, tears, saliva and tissue sections) or from in vitro cell
culture constituents, as well as samples from the environment.
[0084] Technical and scientific terms used herein have the meanings
commonly understood by one of ordinary skill in the art to which
the present invention pertains, unless otherwise defined. Reference
is made herein to various methodologies known to those of skill in
the art. Publications and other materials setting forth such known
methodologies to which reference is made are incorporated herein by
reference in their entireties as though set forth in full. The
practice of the invention will employ, unless otherwise indicated,
conventional techniques of chemistry, molecular biology,
microbiology, recombinant DNA, and immunology, which are within the
skill of the art. Such techniques are explained fully in the
literature. See, e.g., Sambrook, Fritsch, and Maniatis, Molecular
Cloning: A Laboratory Manual, 2nd ed. (1989); DNA Cloning: A
Practical Approach, Volumes I and II (D. N. Glover ed., 1985);
Oligonucleotide Synthesis (M. J. Gait ed., 1984); Nucleic Acid
Hybridization (B. D. Hames & S. J. Higgins eds., 1984); the series,
Methods in Enzymology (Academic Press, Inc.), particularly Vol. 154
and Vol. 155 (Wu and Grossman, eds.); PCR: A Practical Approach
(McPherson, Quirke, and Taylor, eds., 1991); Immunology, 2d
Edition, 1989, Roitt et al., C. V. Mosby Company, New York;
Advanced Immunology, 2d Edition, 1991, Male et al., Gower Medical
Publishing, New York; Transcription and Translation, 1984 (Hames
and Higgins eds.); Animal Cell Culture, 1986 (R. I. Freshney ed.);
Immobilized Cells and Enzymes, 1986 (IRL Press); Perbal, 1984, A
Practical Guide to Molecular Cloning; and Gene Transfer Vectors for
Mammalian Cells, 1987 (J. H. Miller and M. P. Calos eds., Cold
Spring Harbor Laboratory).
[0085] Any suitable materials and/or methods known to those of
skill can be utilized in carrying out the present invention;
however, preferred materials and/or methods are described.
Materials, reagents and the like to which reference is made in the
following description and examples are obtainable from commercial
sources, unless otherwise noted.
S. pneumoniae Genomic Sequence
[0086] This invention provides nucleotide sequences of the genome
of S. pneumoniae which thus comprises a DNA sequence library of S.
pneumoniae genomic DNA. The detailed description that follows
provides nucleotide sequences of S. pneumoniae, and also describes
how the sequences were obtained and how ORFs and protein-coding
sequences were identified. Also described are methods of using the
disclosed S. pneumoniae sequences in methods including diagnostic
and therapeutic applications. Furthermore, the library can be used
as a database for identification and comparison of medically
important sequences in this and other strains of S. pneumoniae.
[0087] To determine the genomic sequence of S. pneumoniae, DNA was
isolated from strain 14453 of S. pneumoniae and mechanically
sheared by nebulization to a median size of 2 kb. Following size
fractionation by gel electrophoresis, the fragments were
blunt-ended, ligated to adapter oligonucleotides, and cloned into
each of 20 different pMPX vectors (Rice et al., abstracts of
Meeting of Genome Mapping and Sequencing, Cold Spring Harbor, N.Y.,
May 11-15, 1994, p. 225) and the pUC19 vector to construct a series
of "shotgun" subclone libraries.
[0088] DNA sequencing was achieved using two sequencing methods.
The first method used multiplex sequencing procedures essentially
as disclosed in Church et al., 1988, Science 240:185 and U.S. Pat.
Nos. 4,942,124 and 5,149,625. DNA was extracted from pooled
cultures and subjected to chemical or enzymatic sequencing.
Sequencing reactions were resolved by electrophoresis, and the
products were transferred and covalently bound to nylon membranes.
Finally, the membranes were sequentially hybridized with a series
of labelled oligonucleotides complementary to "tag" sequences
present in the different shotgun cloning vectors. In this manner, a
large number of sequences could be obtained from a single set of
sequencing reactions. The remainder of the sequencing was performed
on ABI377 automated DNA sequencers. The cloning and sequencing
procedures are described in more detail in the Exemplification.
[0089] Individual sequence reads were assembled using PHRAP (P.
Green, Abstracts of DOE Human Genome Program Contractor-Grantee
Workshop V, Jan. 1996, p. 157). The average contig length was about
3-4 kb.
[0090] A variety of approaches are used to order the contigs so as
to obtain a continuous sequence representing the entire S.
pneumoniae genome. Synthetic oligonucleotides are designed that are
complementary to sequences at the end of each contig. These
oligonucleotides may be hybridized to libraries of S. pneumoniae
genomic DNA in, for example, lambda phage vectors or plasmid
vectors to identify clones that contain sequences corresponding to
the junctional regions between individual contigs. Such clones are
then used to isolate template DNA and the same oligonucleotides are
used as primers in polymerase chain reaction (PCR) to amplify
junctional fragments, the nucleotide sequence of which is then
determined.
[0091] The S. pneumoniae sequences were analyzed for the presence
of open reading frames (ORFs) comprising at least 180 nucleotides.
As a result of the initial analysis of ORFs based on stop-to-stop
codon reads, it should be understood that these ORFs may not
correspond to the ORF of a naturally-occurring S. pneumoniae
polypeptide. These ORFs may contain start codons which indicate the
initiation of protein synthesis of a naturally-occurring S.
pneumoniae polypeptide. Such start codons within the ORFs provided
herein can be identified by those of ordinary skill in the relevant
art, and the resulting ORF and the encoded S. pneumoniae
polypeptide are within the scope of this invention. For example,
within the ORFs a codon such as AUG or GUG (encoding methionine or
valine) which is part of the initiation signal for protein
synthesis can be identified and the portion of an ORF
corresponding to a naturally-occurring S. pneumoniae polypeptide
can be recognized.
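The stop-to-stop analysis described above can be illustrated with a minimal sketch. It assumes a plain DNA string containing only A, C, G and T and the standard stop codons TAA, TAG and TGA; the function names are hypothetical and are not taken from the software actually used. It scans all six reading frames for stretches of at least 180 nucleotides between two in-frame stop codons and then lists in-frame ATG or GTG codons as candidate starts.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def stop_to_stop_orfs(seq, min_len=180):
    """Yield (strand, frame, start, end) for each stretch lying between two
    in-frame stop codons that is at least min_len nucleotides long.

    Coordinates are 0-based on the scanned strand. 'start' is the first base
    after the upstream stop codon; 'end' is the index of the downstream stop
    codon, so seq[start:end] excludes both stops.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            prev_stop = None
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOP_CODONS:
                    if prev_stop is not None and i - (prev_stop + 3) >= min_len:
                        yield strand, frame, prev_stop + 3, i
                    prev_stop = i


def in_frame_starts(s, start, end, start_codons=("ATG", "GTG")):
    """List positions of candidate start codons inside seq[start:end].

    For ORFs reported on the '-' strand, pass the reverse-complemented
    sequence, since coordinates refer to the scanned strand.
    """
    s = s.upper()
    return [i for i in range(start, end, 3) if s[i:i + 3] in start_codons]
```

Each reported interval is only a candidate: as noted above, the naturally occurring coding region begins at one of the in-frame start codons and may be shorter than the stop-to-stop stretch.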
[0092] The second analysis of the ORFs included identifying the
start codons and the predicted coding regions. These ORFs provided
in this invention were defined by one or more of the following
methods: evaluating the coding potential of such sequences with the
program GENEMARK™ (Borodovsky and McIninch, 1993, Comp. Chem.
17:123), distinguishing the coding from noncoding regions using the
program Glimmer (Fraser et al., Nature, 1997), determining codon
usage (Staden et al., Nucleic Acids Research 10:141), and comparing
each predicted ORF amino acid sequence with all protein sequences
found in current GENBANK, SWISS-PROT, and PIR databases using the
BLAST algorithm. BLAST identifies local alignments between the ORF
sequence and the sequences in the databank and estimates the
probability that each alignment occurred by chance (Altschul et
al., 1990, J. Mol. Biol. 215:403-410). Homologous ORFs
(probabilities less than 10⁻⁵ by chance) and ORFs that are
probably non-homologous (probabilities greater than 10⁻⁵ by
chance) but have good codon usage were identified. Both homologous
sequences and non-homologous sequences with good codon usage are
likely to encode proteins and are encompassed by the invention.
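The two-part criterion above (a chance probability below 10⁻⁵, or otherwise good codon usage) can be sketched as follows. This is an illustration only: the scoring function is a simple mean log-frequency of codons relative to a reference set of coding sequences, which is an assumption and not the codon usage method of Staden et al., and the default score cutoff (uniform usage of 64 codons) is hypothetical.

```python
import math
from collections import Counter


def codon_counts(seq):
    """Count in-frame codons of a DNA string, ignoring a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def codon_frequencies(reference_orfs):
    """Relative codon frequencies pooled over a set of reference coding sequences."""
    total = Counter()
    for orf in reference_orfs:
        total.update(codon_counts(orf))
    n = sum(total.values())
    return {codon: count / n for codon, count in total.items()}


def codon_score(orf, freqs, floor=1e-4):
    """Mean log-frequency of the ORF's codons; unseen codons get a small floor.

    The ORF is assumed to contain at least one complete codon.
    """
    counts = codon_counts(orf)
    n = sum(counts.values())
    return sum(c * math.log(freqs.get(codon, floor)) for codon, c in counts.items()) / n


def classify_orf(p_value, score, p_cutoff=1e-5, score_cutoff=math.log(1 / 64)):
    """Apply the two criteria in order: database homology, then codon usage.

    p_value is None when no database match was reported. score_cutoff is a
    hypothetical threshold, not a value taken from the text.
    """
    if p_value is not None and p_value < p_cutoff:
        return "homologous"
    if score >= score_cutoff:
        return "non-homologous, good codon usage"
    return "unsupported"
```

In practice the reference frequencies would be derived from ORFs already supported by database matches, so that the second criterion reflects the codon preferences of the organism itself.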
S. pneumoniae Nucleic Acids
[0093] The nucleic acids of this invention may be obtained directly
from the DNA of the above referenced S. pneumoniae strain by using
the polymerase chain reaction (PCR). See "PCR, A Practical
Approach" (McPherson, Quirke, and Taylor, eds., IRL Press, Oxford,
UK, 1991) for details about the PCR. High fidelity PCR can be used
to ensure a faithful DNA copy prior to expression. In addition, the
authenticity of amplified products can be verified by conventional
sequencing methods. Clones carrying the desired sequences described
in this invention may also be obtained by screening the libraries
by means of the PCR or by hybridization of synthetic
oligonucleotide probes to filter lifts of the library colonies or
plaques as known in the art (see, e.g., Sambrook et al., Molecular
Cloning, A Laboratory Manual 2nd edition, 1989, Cold Spring Harbor
Press, NY).
[0094] It is also possible to obtain nucleic acids encoding S.
pneumoniae polypeptides from a cDNA library in accordance with
protocols herein described. A cDNA encoding an S. pneumoniae
polypeptide can be obtained by isolating total mRNA from an
appropriate strain. Double stranded cDNAs can then be prepared from
the total mRNA. Subsequently, the cDNAs can be inserted into a
suitable plasmid or viral (e.g., bacteriophage) vector using any
one of a number of known techniques. Genes encoding S. pneumoniae
polypeptides can also be cloned using established polymerase chain
reaction techniques in accordance with the nucleotide sequence
information provided by the invention. The nucleic acids of the
invention can be DNA or RNA. Preferred nucleic acids of the
invention are contained in the Sequence Listing.
[0095] The nucleic acids of the invention can also be chemically
synthesized using standard techniques. Various methods of
chemically synthesizing polydeoxynucleotides are known, including
solid-phase synthesis which, like peptide synthesis, has been fully
automated in commercially available DNA synthesizers (See e.g.,
Itakura et al. U.S. Pat. No. 4,598,049; Caruthers et al. U.S. Pat.
No. 4,458,066; and Itakura U.S. Pat. Nos. 4,401,796 and 4,373,071,
incorporated by reference herein).
[0096] Nucleic acids isolated or synthesized in accordance with
features of the present invention are useful, by way of example,
without limitation, as probes, primers, capture ligands, antisense
genes and for developing expression systems for the synthesis of
proteins and peptides corresponding to such sequences. As probes,
primers, capture ligands and antisense agents, the nucleic acid
normally consists of all or part (approximately twenty or more
nucleotides for specificity as well as the ability to form stable
hybridization products) of the nucleic acids of the invention
contained in the Sequence Listing. These uses are described in
further detail below.
[0097] Probes
[0098] A nucleic acid isolated or synthesized in accordance with
the sequence of the invention contained in the Sequence Listing can
be used as a probe to specifically detect S. pneumoniae. With the
sequence information set forth in the present application,
sequences of twenty or more nucleotides are identified which
provide the desired inclusivity and exclusivity with respect to S.
pneumoniae, and extraneous nucleic acids likely to be encountered
during hybridization conditions. More preferably, the sequence will
comprise at least twenty to thirty nucleotides to convey stability
to the hybridization product formed between the probe and the
intended target molecules.
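As a rough illustration of how candidate probe sequences of twenty nucleotides might be screened for exclusivity, the sketch below lists every window of the target sequence that has no exact match, on either strand, in a set of non-target sequences. It is a simplification under stated assumptions: only exact matches are excluded, whereas real hybridization tolerates mismatches depending on stringency, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_probes(target, non_targets, k=20):
    """Return (position, sequence) for each k-mer of target that is absent
    from both strands of every non-target sequence."""
    excluded = set()
    for other in non_targets:
        excluded |= kmers(other, k)
        excluded |= kmers(reverse_complement(other), k)
    seq = target.upper()
    return [
        (i, seq[i:i + k])
        for i in range(len(seq) - k + 1)
        if seq[i:i + k] not in excluded
    ]
```

Inclusivity would be checked in the same manner by requiring that a candidate occur in every S. pneumoniae strain of interest.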
[0099] Sequences larger than 1000 nucleotides in length are
difficult to synthesize but can be generated by recombinant DNA
techniques. Individuals skilled in the art will readily recognize
that the nucleic acids, for use as probes, can be provided with a
label to facilitate detection of a hybridization product.
[0100] Nucleic acid isolated and synthesized in accordance with the
sequence of the invention contained in the Sequence Listing can
also be useful as probes to detect homologous regions (especially
homologous genes) of other Streptococcus species using appropriate
stringency hybridization conditions as described herein.
[0101] Capture Ligand
[0102] For use as a capture ligand, the nucleic acid selected in
the manner described above with respect to probes, can be readily
associated with a support. The manner in which nucleic acid is
associated with supports is well known. Nucleic acid having twenty
or more nucleotides in a sequence of the invention contained in the
Sequence Listing has utility to separate S. pneumoniae nucleic
acids from each other and from the nucleic acid of other organisms.
Nucleic acid having twenty or more nucleotides in a sequence of the
invention contained in the Sequence Listing can also have utility
to separate other Streptococcus species from each other and from
other organisms. Preferably, the sequence will comprise at least
twenty nucleotides to convey stability to the hybridization product
formed between the probe and the intended target molecules.
Sequences larger than 1000 nucleotides in length are difficult to
synthesize but can be generated by recombinant DNA techniques.
[0103] Primers
[0104] Nucleic acid isolated or synthesized in accordance with the
sequences described herein have utility as primers for the
amplification of S. pneumoniae nucleic acid. These nucleic acids
may also have utility as primers for the amplification of nucleic
acids in other Streptococcus species. With respect to polymerase
chain reaction (PCR) techniques, nucleic acid sequences of
≥10-15 nucleotides of the invention contained in the
Sequence Listing have utility in conjunction with suitable enzymes
and reagents to create copies of S. pneumoniae nucleic acid. More
preferably, the sequence will comprise twenty or more nucleotides
to convey stability to the hybridization product formed between the
primer and the intended target molecules. Binding conditions of
primers greater than 100 nucleotides are more difficult to control
to obtain specificity. High fidelity PCR can be used to ensure a
faithful DNA copy prior to expression. In addition, amplified
products can be checked by conventional sequencing methods.
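The relationship between primer length, base composition and stability of the primer-target hybrid can be illustrated with a short sketch. It takes a primer pair from the two ends of a template and estimates a melting temperature with the Wallace rule (2 degrees C per A or T, 4 degrees C per G or C), a rough approximation suited to short oligonucleotides. The Wallace rule and the function names are assumptions introduced for illustration and are not prescribed by the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Approximate melting temperature in degrees C by the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def end_primers(template, length=20):
    """Return (forward, reverse) primers annealing to the two ends of template.

    The forward primer is the first 'length' bases of the given strand; the
    reverse primer is the reverse complement of the last 'length' bases.
    """
    t = template.upper()
    if len(t) < 2 * length:
        raise ValueError("template is too short for two non-overlapping primers")
    return t[:length], reverse_complement(t[-length:])
```

Primers of a pair are usually chosen with similar estimated melting temperatures so that both anneal under the same conditions.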
[0105] The copies can be used in diagnostic assays to detect
specific sequences, including genes from S. pneumoniae and/or other
Streptococcus species. The copies can also be incorporated into
cloning and expression vectors to generate polypeptides
corresponding to the nucleic acid synthesized by PCR, as is
described in greater detail herein.
[0106] Antisense
[0107] Nucleic acid or nucleic acid-hybridizing derivatives
isolated or synthesized in accordance with the sequences described
herein have utility as antisense agents to prevent the expression
of S. pneumoniae genes. These sequences also have utility as
antisense agents to prevent expression of genes of other
Streptococcus species.
[0108] In one embodiment, nucleic acid or derivatives corresponding
to S. pneumoniae nucleic acids is loaded into a suitable carrier
such as a liposome or bacteriophage for introduction into bacterial
cells. For example, a nucleic acid having twenty or more
nucleotides is capable of binding to bacterial nucleic acid or
bacterial messenger RNA. Preferably, the antisense nucleic acid is
comprised of 20 or more nucleotides to provide necessary stability
of a hybridization product of non-naturally occurring nucleic acid
and bacterial nucleic acid and/or bacterial messenger RNA. Nucleic
acid having a sequence greater than 1000 nucleotides in length is
difficult to synthesize but can be generated by recombinant DNA
techniques. Methods for loading antisense nucleic acid in liposomes
are known in the art, as exemplified by U.S. Pat. No. 4,241,046
issued Dec. 23, 1980 to Papahadjopoulos et al.
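By way of illustration, an antisense sequence of twenty nucleotides directed against a chosen region of a messenger RNA is simply the reverse complement of that region. The sketch below assumes the mRNA is supplied as a string of A, C, G and U (or T) and returns the antisense strand as DNA; the function name and the choice of a DNA oligonucleotide are assumptions made for the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide complementary to mrna[start:start+length].

    The result is written 5' to 3', so it is the reverse complement of the
    targeted window.
    """
    dna = mrna.upper().replace("U", "T")
    window = dna[start:start + length]
    if len(window) < length:
        raise ValueError("target window extends past the end of the mRNA")
    return window.translate(_COMPLEMENT)[::-1]
```

A window overlapping the ribosome binding site or the start codon is a common choice of target, although the text does not prescribe one.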
[0109] The present invention encompasses isolated polypeptides and
nucleic acids derived from S. pneumoniae that are useful as
reagents for diagnosis of bacterial infection, components of
effective antibacterial vaccines, and/or as targets for
antibacterial drugs, including anti-S. pneumoniae drugs.
Expression of S. pneumoniae Nucleic Acids
[0110] Table 2 provides a list of open reading frames (ORFs) in
both strands. An ORF is a region of nucleic acid which encodes a
polypeptide. This region normally represents a complete coding
sequence or a total sequence and was determined from an initial
analysis of stop to stop codons followed by the prediction of start
codons. The first column lists the ORF designation. The second and
third columns list the SEQ ID numbers for the nucleic acid and
amino acid sequences corresponding to each ORF, respectively. The
fourth and fifth columns list the length of the nucleic acid ORF
and the length of the amino acid ORF, respectively. Most of the
nucleotide sequences corresponding to each ORF begin at the first
nucleotide of the start codon and end at the nucleotide immediately
preceding the next downstream stop codon in the same reading frame.
It will be recognized by one skilled in the art that the natural
translation initiation sites will correspond to ATG, GTG, or TTG
codons located within the ORFs. The natural initiation sites depend
not only on the sequence of a start codon but also on the context
of the DNA sequence adjacent to the start codon. Usually, a
recognizable ribosome binding site is found within 20 nucleotides
upstream from the initiation codon. In some cases where genes are
translationally coupled and coordinately expressed together in
"operons", ribosome binding sites are not present, but the
initiation codon of a downstream gene may occur very close to, or
overlap, the stop codon of an upstream gene in the same operon.
The correct start codons can be generally identified rapidly and
efficiently because only a few codons need be tested. It is
recognized that the translational machinery in bacteria initiates
most polypeptide chains with the amino acid methionine. In some
cases, polypeptides are post-translationally modified, resulting in
an N-terminal amino acid other than methionine in vivo. The sixth
and seventh columns provide metrics for assessing the likelihood of
the homology match (determined by the BLASTP2 algorithm), as is
known in the art, to the genes indicated in the description field.
Specifically, the sixth column represents the "Score" for the match
(a higher score is a better match), and the seventh column
represents the "P-value" for the match (the probability that such a
match could have occurred by chance; the lower the value, the more
likely the match is valid). If a BLASTP2 score of less than 46 was
obtained, no value is reported in the table for the "P-value". The
description field provides, where available, the accession number
(AC) or the Swissprot accession number (SP), the locus name (LN),
Superfamily Classification (CL), the Organism (OR), Source of
variant (SR), E.C. number (EC), the gene name (GN), the product
name (PN), the Function Description (FN), the Map Position (MP),
Left End (LE), Right End (RE), Coding Direction (DI), the Database
from which the sequence originates (DB), and the description (DE)
or notes (NT) for each ORF. This information allows one of ordinary
skill in the art to determine a potential use and function for each
identified coding sequence and, as a result, allows the use of the
polypeptides of the present invention for commercial and industrial
purposes.
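The reasoning above about natural initiation sites can be sketched in code. The example lists in-frame ATG, GTG and TTG codons within an ORF and flags those that have a ribosome-binding-site-like motif within 20 nucleotides upstream, ranking flagged candidates first. The motif set (four-base cores of the Shine-Dalgarno consensus AGGAGG) and the function name are assumptions for illustration; the text does not specify how the ribosome binding site is recognized.

```python
import re

START_CODONS = ("ATG", "GTG", "TTG")
# Hypothetical motif: any four consecutive bases of the consensus AGGAGG.
RBS_MOTIF = re.compile(r"AGGA|GGAG|GAGG")


def ranked_starts(seq, orf_start, orf_end, window=20):
    """Return (position, codon, has_rbs) for each in-frame start codon in
    seq[orf_start:orf_end], with RBS-supported candidates listed first and
    ties broken by position (most upstream first)."""
    seq = seq.upper()
    candidates = []
    for i in range(orf_start, orf_end - 2, 3):
        codon = seq[i:i + 3]
        if codon in START_CODONS:
            upstream = seq[max(0, i - window):i]
            candidates.append((i, codon, bool(RBS_MOTIF.search(upstream))))
    return sorted(candidates, key=lambda c: (not c[2], c[0]))
```

Because only a few candidates remain per ORF, each can then be tested experimentally, consistent with the observation that only a few codons need be tested. Translationally coupled genes lacking a ribosome binding site would not be flagged by this sketch and need separate handling.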
[0111] Using the information provided in SEQ ID NO: 1-SEQ ID NO:
2661 and in Table 2 together with routine cloning and sequencing
methods, one of ordinary skill in the art will be able to clone and
sequence all the nucleic acid fragments of interest including open
reading frames (ORFs) encoding a large variety of proteins of S.
pneumoniae.
[0112] Nucleic acid isolated or synthesized in accordance with the
sequences described herein have utility to generate polypeptides.
The nucleic acid of the invention exemplified in SEQ ID NO: 1-SEQ
ID NO: 2661 and in Table 2 or fragments of said nucleic acid
encoding active portions of S. pneumoniae polypeptides can be
cloned into suitable vectors or used to isolate nucleic acid. The
isolated nucleic acid is combined with suitable DNA linkers and
cloned into a suitable vector.
[0113] The function of a specific gene or operon can be ascertained
by expression in a bacterial strain under conditions where the
activity of the gene product(s) specified by the gene or operon in
question can be specifically measured. Alternatively, a gene
product may be produced in large quantities in an expressing strain
for use as an antigen, an industrial reagent, for structural
studies, etc. This expression can be accomplished in a mutant
strain which lacks the activity of the gene to be tested, or in a
strain that does not produce the same gene product(s). This
includes, but is not limited to, Eucaryotic species such as the
yeast Saccharomyces cerevisiae, Methanobacterium strains or other
Archaea, and Eubacteria such as E. coli, B. subtilis, S. aureus, S.
pneumoniae or Pseudomonas putida. In some cases the expression host
will utilize the natural S. pneumoniae promoter whereas in others,
it will be necessary to drive the gene with a promoter sequence
derived from the expressing organism (e.g., an E. coli
beta-galactosidase promoter for expression in E. coli).
[0114] To express a gene product using the natural S. pneumoniae
promoter, a procedure such as the following can be used. A
restriction fragment containing the gene of interest, together with
its associated natural promoter element and regulatory sequences
(identified using the DNA sequence data) is cloned into an
appropriate recombinant plasmid containing an origin of replication
that functions in the host organism and an appropriate selectable
marker. This can be accomplished by a number of procedures known to
those skilled in the art. It is most preferably done by cutting the
plasmid and the fragment to be cloned with the same restriction
enzyme to produce compatible ends that can be ligated to join the
two pieces together. The recombinant plasmid is introduced into the
host organism by, for example, electroporation and cells containing
the recombinant plasmid are identified by selection for the marker
on the plasmid. Expression of the desired gene product is detected
using an assay specific for that gene product.
[0115] In the case of a gene that requires a different promoter,
the body of the gene (coding sequence) is specifically excised and
cloned into an appropriate expression plasmid. This subcloning can
be done by several methods, but is most easily accomplished by PCR
amplification of a specific fragment and ligation into an
expression plasmid after treating the PCR product with a
restriction enzyme or exonuclease to create suitable ends for
cloning.
[0116] A suitable host cell for expression of a gene can be any
procaryotic or eucaryotic cell. For example, an S. pneumoniae
polypeptide can be expressed in bacterial cells such as E. coli or
B. subtilis, insect cells (baculovirus), yeast, or mammalian cells
such as Chinese hamster ovary (CHO) cells. Other suitable host cells
are known to those skilled in the art.
[0117] Expression in eucaryotic cells such as mammalian, yeast, or
insect cells can lead to partial or complete glycosylation and/or
formation of relevant inter- or intra-chain disulfide bonds of a
recombinant peptide product. Examples of vectors for expression in
yeast S. cerevisiae include pYepSec1 (Baldari et al., (1987) EMBO
J. 6:229-234), pMFa (Kurjan and Herskowitz, (1982) Cell
30:933-943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), and
pYES2 (Invitrogen Corporation, San Diego, Calif.). Baculovirus
vectors available for expression of proteins in cultured insect
cells (SF 9 cells) include the pAc series (Smith et al., (1983)
Mol. Cell Biol. 3:2156-2165) and the pVL series (Luckow, V. A.,
and Summers, M. D., (1989) Virology 170:31-39). Generally, COS
cells (Gluzman, Y., (1981) Cell 23:175-182) are used in conjunction
with such vectors as pCDM 8 (Aruffo, A. and Seed, B., (1987) Proc.
Natl. Acad. Sci. USA 84:8573-8577) for transient
amplification/expression in mammalian cells, while CHO (dhfr⁻
Chinese Hamster Ovary) cells are used with vectors such as pMT2PC
(Kaufman et al. (1987), EMBO J 6:187-195) for stable
amplification/expression in mammalian cells. Vector DNA can be
introduced into mammalian cells via conventional techniques such as
calcium phosphate or calcium chloride co-precipitation,
DEAE-dextran-mediated transfection, or electroporation. Suitable
methods for transforming host cells can be found in Sambrook et al.
(Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring
Harbor Laboratory Press (1989)), and other laboratory
textbooks.
[0118] Expression in procaryotes is most often carried out in E.
coli with either fusion or non-fusion inducible expression vectors.
Fusion vectors usually add a number of NH₂-terminal amino
acids to the expressed target gene. These NH₂-terminal amino
acids often are referred to as a reporter group or an affinity
purification group. Such reporter groups usually serve two
purposes: 1) to increase the solubility of the target recombinant
protein; and 2) to aid in the purification of the target
recombinant protein by acting as a ligand in affinity purification.
Often, in fusion expression vectors, a proteolytic cleavage site is
introduced at the junction of the reporter group and the target
recombinant protein to enable separation of the target recombinant
protein from the reporter group subsequent to purification of the
fusion protein. Suitable proteolytic enzymes, and their cognate recognition
sequences, include Factor Xa, thrombin and enterokinase. Typical
fusion expression vectors include pGEX (Amrad Corp., Melbourne,
Australia), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5
(Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase,
maltose E binding protein, or protein A, respectively, to the
target recombinant protein. A preferred reporter group is
poly(His), which may be fused to the amino or carboxy terminus of
the protein and which renders the recombinant fusion protein easily
purifiable by metal chelate chromatography.
[0119] Inducible non-fusion expression vectors include pTrc (Amann
et al., (1988) Gene 69:301-315) and pET11d (Studier et al., Gene
Expression Technology: Methods in Enzymology 185, Academic Press,
San Diego, Calif. (1990) 60-89). While target gene expression
relies on host RNA polymerase transcription from the hybrid trp-lac
fusion promoter in pTrc, expression of target genes inserted into
pET11d relies on transcription from the T7 gn10-lac 0 fusion
promoter mediated by coexpressed viral RNA polymerase (T7 gn1).
This viral polymerase is supplied by host strains BL21(DE3) or
HMS174(DE3) from a resident λ prophage harboring a T7 gn1
under the transcriptional control of the lacUV5 promoter.
[0120] For example, a host cell transfected with a nucleic acid
vector directing expression of a nucleotide sequence encoding an S.
pneumoniae polypeptide can be cultured under appropriate conditions
to allow expression of the polypeptide to occur. The polypeptide
may be secreted and isolated from a mixture of cells and medium
containing the peptide. Alternatively, the polypeptide may be
retained cytoplasmically and the cells harvested, lysed and the
protein isolated. A cell culture includes host cells, media and
other byproducts. Suitable media for cell culture are well known in
the art. Polypeptides of the invention can be isolated from cell
culture medium, host cells, or both using techniques known in the
art for purifying proteins including ion-exchange chromatography,
gel filtration chromatography, ultrafiltration, electrophoresis,
and immunoaffinity purification with antibodies specific for such
polypeptides. Additionally, in many situations, polypeptides can be
produced by chemical cleavage of a native protein (e.g., tryptic
digestion) and the cleavage products can then be purified by
standard techniques.
[0121] In the case of membrane bound proteins, these can be
isolated from a host cell by contacting a membrane-associated
protein fraction with a detergent forming a solubilized complex,
where the membrane-associated protein is no longer entirely
embedded in the membrane fraction and is solubilized at least to an
extent which allows it to be chromatographically isolated from the
membrane fraction. Several different criteria are used for choosing
a detergent suitable for solubilizing these complexes. For example,
one property considered is the ability of the detergent to
solubilize the S. pneumoniae protein within the membrane fraction
with minimal denaturation of the membrane-associated protein allowing
for the activity or functionality of the membrane-associated
protein to return upon reconstitution of the protein. Another
property considered when selecting the detergent is the critical
micelle concentration (CMC) of the detergent in that the detergent
of choice preferably has a high CMC value allowing for ease of
removal after reconstitution. A third property considered when
selecting a detergent is the hydrophobicity of the detergent.
Typically, membrane-associated proteins are very hydrophobic and
therefore detergents which are also hydrophobic, e.g., the triton
series, would be useful for solubilizing the hydrophobic proteins.
Another property important to a detergent can be the capability of
the detergent to remove the S. pneumoniae protein with minimal
protein-protein interaction facilitating further purification. A
fifth property of the detergent which should be considered is the
charge of the detergent. For example, if it is desired to use ion
exchange resins in the purification process then preferably
the detergent should be an uncharged detergent. Chromatographic
techniques which can be used in the final purification step are
known in the art and include hydrophobic interaction, lectin
affinity, ion exchange, dye affinity and immunoaffinity.
[0122] One strategy to maximize recombinant S. pneumoniae peptide
expression in E. coli is to express the protein in a host bacterium
with an impaired capacity to proteolytically cleave the recombinant
protein (Gottesman, S., Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128).
Another strategy would be to alter the nucleic acid encoding an S.
pneumoniae peptide to be inserted into an expression vector so that
the individual codons for each amino acid would be those
preferentially utilized in highly expressed E. coli proteins (Wada
et al., (1992) Nuc. Acids Res. 20:2111-2118). Such alteration of
nucleic acids of the invention can be carried out by standard DNA
synthesis techniques.
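The codon alteration strategy above can be illustrated with a minimal sketch that recodes a coding sequence codon by codon without changing the encoded amino acids. The standard genetic code is assumed. The table of preferred codons is a hypothetical placeholder chosen only so that each codon encodes the correct amino acid; in practice it would be replaced with measured usage from highly expressed E. coli genes, such as the data of Wada et al. cited above.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical preferred codon per amino acid (placeholder values).
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TAA",
}


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds):
    """Replace every codon with the preferred synonymous codon."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(PREFERRED[CODON_TABLE[cds[i:i + 3]]] for i in range(0, len(cds), 3))
```

The sketch assumes the sequence begins with ATG. A natural GTG or TTG initiation codon would be read as valine or leucine and should be handled separately before recoding.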
[0123] The nucleic acids of the invention can also be chemically
synthesized using standard techniques. Various methods of
chemically synthesizing polydeoxynucleotides are known, including
solid-phase synthesis which, like peptide synthesis, has been fully
automated in commercially available DNA synthesizers (See, e.g.,
Itakura et al. U.S. Pat. No. 4,598,049; Caruthers et al. U.S. Pat.
No. 4,458,066; and Itakura U.S. Pat. Nos. 4,401,796 and 4,373,071,
incorporated by reference herein).
[0124] The present invention provides a library of S.
pneumoniae-derived nucleic acid sequences. The libraries provide
probes, primers, and markers which can be used in
epidemiological studies. The present invention also provides a
library of S. pneumoniae-derived nucleic acid sequences which
comprise or encode targets for therapeutic drugs.
[0125] Nucleic acids comprising any of the sequences disclosed
herein or sub-sequences thereof can be prepared by standard methods
using the nucleic acid sequence information provided in SEQ ID NO:
1-SEQ ID NO: 2661. For example, DNA can be chemically synthesized
using, e.g., the phosphoramidite solid support method of Matteucci
et al., 1981, J. Am. Chem. Soc. 103:3185, the method of Yoo et al.,
1989, J. Biol. Chem. 264:17078, or other well known methods. This
can be done by sequentially linking a series of oligonucleotide
cassettes comprising pairs of synthetic oligonucleotides, as
described below.
[0126] Of course, due to the degeneracy of the genetic code, many
different nucleotide sequences can encode polypeptides having the
amino acid sequences defined by SEQ ID NO: 2662-SEQ ID NO: 5322 or
sub-sequences thereof. The codons can be selected for optimal
expression in prokaryotic or eukaryotic systems. Such degenerate
variants are also encompassed by this invention.
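The extent of this degeneracy can be made concrete with a short calculation. Under the standard genetic code, the number of distinct nucleotide sequences encoding a given polypeptide is the product, over its residues, of the number of synonymous codons for each amino acid. The sketch below is illustrative; the function name is hypothetical.

```python
from collections import Counter
from math import prod

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Number of codons encoding each amino acid (and '*' for stop).
DEGENERACY = Counter(CODON_TABLE.values())


def encoding_count(peptide):
    """Number of distinct coding sequences for a peptide (stop codon excluded)."""
    return prod(DEGENERACY[aa] for aa in peptide.upper())
```

Even a tripeptide of leucine, serine and arginine has 6 x 6 x 6 = 216 possible coding sequences, so the number of degenerate variants of a full-length polypeptide is very large.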
[0127] Insertion of nucleic acids (typically DNAs) encoding the
polypeptides of the invention into a vector is easily accomplished
when the termini of both the DNAs and the vector comprise
compatible restriction sites. If this cannot be done, it may be
necessary to modify the termini of the DNAs and/or vector by
digesting back single-stranded DNA overhangs generated by
restriction endonuclease cleavage to produce blunt ends, or to
achieve the same result by filling in the single-stranded termini
with an appropriate DNA polymerase.
[0128] Alternatively, any site desired may be produced, e.g., by
ligating nucleotide sequences (linkers) onto the termini. Such
linkers may comprise specific oligonucleotide sequences that define
desired restriction sites. Restriction sites can also be generated
by the use of the polymerase chain reaction (PCR). See, e.g., Saiki
et al., 1988, Science 239:48. The cleaved vector and the DNA
fragments may also be modified if required by homopolymeric
tailing.
[0129] In certain embodiments, the invention encompasses isolated
nucleic acid fragments comprising all or part of the individual
nucleic acid sequences disclosed herein. The fragments are at least
about 8 nucleotides in length, preferably at least about 12
nucleotides in length, and most preferably at least about 15-20
nucleotides in length.
[0130] The nucleic acids may be isolated directly from cells.
Alternatively, the polymerase chain reaction (PCR) method can be
used to produce the nucleic acids of the invention, using either
chemically synthesized strands or genomic material as templates.
Primers used for PCR can be synthesized using the sequence
information provided herein and can further be designed to
introduce appropriate new restriction sites, if desirable, to
facilitate incorporation into a given vector for recombinant
expression.
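A minimal sketch of primer design that introduces new restriction sites is given below. Each primer carries, from 5' to 3', a short clamp, a restriction site, and a region annealing to one end of the coding sequence. The recognition sequences listed are those of commonly used enzymes; the two-base clamp, the annealing length and the function names are hypothetical choices for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sequences of some common restriction enzymes (all palindromic).
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primers(cds, fwd_site, rev_site, anneal=20, clamp="GC"):
    """Return (forward, reverse) primers with 5' restriction-site tails.

    Raises ValueError if either site already occurs inside the coding
    sequence, since digestion would then cut the insert itself.
    """
    cds = cds.upper()
    for name in (fwd_site, rev_site):
        if SITES[name] in cds:
            raise ValueError(f"{name} site occurs within the insert")
    forward = clamp + SITES[fwd_site] + cds[:anneal]
    reverse = clamp + SITES[rev_site] + reverse_complement(cds[-anneal:])
    return forward, reverse
```

Using two different sites, one on each primer, allows the amplified fragment to be inserted into the vector in a defined orientation.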
[0131] The nucleic acids of the present invention may be flanked by
natural S. pneumoniae regulatory sequences, or may be associated
with heterologous sequences, including promoters, enhancers,
response elements, signal sequences, polyadenylation sequences,
introns, 5'- and 3'-noncoding regions, and the like. The nucleic
acids may also be modified by many means known in the art.
Non-limiting examples of such modifications include methylation,
"caps", substitution of one or more of the naturally occurring
nucleotides with an analog, internucleotide modifications such as,
for example, those with uncharged linkages (e.g., methyl
phosphonates, phosphotriesters, phosphoramidates, carbamates,
etc.) and with charged linkages (e.g., phosphorothioates,
phosphorodithioates, etc.). Nucleic acids may contain one or more
additional covalently linked moieties, such as, for example,
proteins (e.g., nucleases, toxins, antibodies, signal peptides,
poly-L-lysine, etc.), intercalators (e.g., acridine, psoralen,
etc.), chelators (e.g., metals, radioactive metals, iron, oxidative
metals, etc.), and alkylators. PNAs are also included. The nucleic
acid may be derivatized by formation of a methyl or ethyl
phosphotriester or an alkyl phosphoramidate linkage. Furthermore,
the nucleic acid sequences of the present invention may also be
modified with a label capable of providing a detectable signal,
either directly or indirectly. Exemplary labels include
radioisotopes, fluorescent molecules, biotin, and the like.
[0132] The invention also provides nucleic acid vectors comprising
the disclosed S. pneumoniae-derived sequences or derivatives or
fragments thereof. A large number of vectors, including plasmid and
fungal vectors, have been described for replication and/or
expression in a variety of eukaryotic and prokaryotic hosts, and
may be used for gene therapy as well as for simple cloning or
protein expression.
[0133] The encoded S. pneumoniae polypeptides may be expressed by
using many known vectors, such as pUC plasmids, pET plasmids
(Novagen, Inc., Madison, Wis.), or pRSET or pREP (Invitrogen, San
Diego, Calif.), and many appropriate host cells, using methods
disclosed or cited herein or otherwise known to those skilled in
the relevant art. The particular choice of vector/host is not
critical to the practice of the invention.
[0134] Recombinant cloning vectors will often include one or more
replication systems for cloning or expression, one or more markers
for selection in the host, e.g. antibiotic resistance, and one or
more expression cassettes. The inserted S. pneumoniae coding
sequences may be synthesized by standard methods, isolated from
natural sources, or prepared as hybrids, etc. Ligation of the S.
pneumoniae coding sequences to transcriptional regulatory elements
and/or to other amino acid coding sequences may be achieved by
known methods. Suitable host cells may be
transformed/transfected/infected as appropriate by any suitable
method including electroporation, CaCl.sub.2 mediated DNA uptake,
fungal infection, microinjection, microprojectile, or other
established methods.
[0135] Appropriate host cells include bacteria, archaebacteria,
fungi, especially yeast, and plant and animal cells, especially
mammalian cells. Of particular interest are S. pneumoniae, E. coli,
B. subtilis, Saccharomyces cerevisiae, Saccharomyces
carlsbergensis, Schizosaccharomyces pombe, SF9 cells, C129 cells,
293 cells, Neurospora, and CHO cells, COS cells, HeLa cells, and
immortalized mammalian myeloid and lymphoid cell lines. Preferred
replication systems include M13, ColE1, SV40, baculovirus, lambda,
adenovirus, and the like. A large number of transcription
initiation and termination regulatory regions have been isolated
and shown to be effective in the transcription and translation of
heterologous proteins in the various hosts. Examples of these
regions, methods of isolation, manner of manipulation, etc. are
known in the art. Under appropriate expression conditions, host
cells can be used as a source of recombinantly produced S.
pneumoniae-derived peptides and polypeptides.
[0136] Advantageously, vectors may also include a transcription
regulatory element (i.e., a promoter) operably linked to the S.
pneumoniae portion. The promoter may optionally contain operator
portions and/or ribosome binding sites. Non-limiting examples of
bacterial promoters compatible with E. coli include: beta-lactamase
(penicillinase) promoter; lactose promoter; tryptophan (trp)
promoter; araBAD (arabinose) operon promoter; lambda-derived
P.sub.L promoter and N gene ribosome binding site; and the hybrid
tac promoter derived from sequences of the trp and lac UV5
promoters. Non-limiting examples of yeast promoters include
3-phosphoglycerate kinase promoter, glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) promoter, galactokinase (GAL1) promoter,
galactoepimerase promoter, and alcohol dehydrogenase (ADH)
promoter. Suitable promoters for mammalian cells include without
limitation viral promoters such as that from Simian Virus 40
(SV40), Rous sarcoma virus (RSV), adenovirus (ADV), and bovine
papilloma virus (BPV). Mammalian cells may also require terminator
sequences, polyA addition sequences and enhancer sequences to
increase expression. Sequences which cause amplification of the
gene may also be desirable. Furthermore, sequences that facilitate
secretion of the recombinant product from cells, including, but not
limited to, bacteria, yeast, and animal cells, such as secretory
signal sequences and/or prohormone pro region sequences, may also
be included. These sequences are well described in the art.
[0137] Nucleic acids encoding wild-type or variant S.
pneumoniae-derived polypeptides may also be introduced into cells
by recombination events. For example, such a sequence can be
introduced into a cell, and thereby effect homologous recombination
at the site of an endogenous gene or a sequence with substantial
identity to the gene. Other recombination-based methods such as
nonhomologous recombinations or deletion of endogenous genes by
homologous recombination may also be used.
[0138] The nucleic acids of the present invention find use as
templates for the recombinant production of S. pneumoniae-derived
peptides or polypeptides.
Identification and Use of S. pneumoniae Nucleic Acid Sequences
[0139] The disclosed S. pneumoniae polypeptide and nucleic acid
sequences, or other sequences that are contained within ORFs,
including complete protein-coding sequences, of which any of the
disclosed S. pneumoniae-specific sequences forms a part, are useful
as target components for diagnosis and/or treatment of S.
pneumoniae-caused infection.
[0140] It will be understood that the sequence of an entire
protein-coding sequence of which each disclosed nucleic acid
sequence forms a part can be isolated and identified based on each
disclosed sequence. This can be achieved, for example, by using an
isolated nucleic acid encoding the disclosed sequence, or fragments
thereof, to prime a sequencing reaction with genomic S. pneumoniae
DNA as template; this is followed by sequencing the amplified
product. The isolated nucleic acid encoding the disclosed sequence,
or fragments thereof, can also be hybridized to S. pneumoniae
genomic libraries to identify clones containing additional complete
segments of the protein-coding sequence of which the shorter
sequence forms a part. Then, the entire protein-coding sequence, or
fragments thereof, or nucleic acids encoding all or part of the
sequence, or sequence-conservative or function-conservative
variants thereof, may be employed in practicing the present
invention.
[0141] Preferred sequences are those that are useful in diagnostic
and/or therapeutic applications. Diagnostic applications include
without limitation nucleic-acid-based and antibody-based methods
for detecting bacterial infection. Therapeutic applications include
without limitation vaccines, passive immunotherapy, and drug
treatments directed against gene products that are both unique to
bacteria and essential for growth and/or replication of
bacteria.
Identification of Nucleic Acids Encoding Vaccine Components and
Targets for Agents Effective Against S. pneumoniae
[0142] The disclosed S. pneumoniae genome sequence includes
segments that direct the synthesis of ribonucleic acids and
polypeptides, as well as origins of replication, promoters, other
types of regulatory sequences, and intergenic nucleic acids. The
invention encompasses nucleic acids encoding immunogenic components
of vaccines and targets for agents effective against S. pneumoniae.
Identification of said immunogenic components involves the
determination of the function of the disclosed sequences, which can
be achieved using a variety of approaches. Non-limiting examples of
these approaches are described briefly below.
[0143] Homology to Known Sequences:
[0144] Computer-assisted comparison of the disclosed S. pneumoniae
sequences with previously reported sequences present in publicly
available databases is useful for identifying functional S.
pneumoniae nucleic acid and polypeptide sequences. It will be
understood that protein-coding sequences, for example, may be
compared as a whole, and that a high degree of sequence homology
between two proteins (such as, for example, >80-90%) at the
amino acid level indicates that the two proteins also possess some
degree of functional homology, such as, for example, among enzymes
involved in metabolism, DNA synthesis, or cell wall synthesis, and
proteins involved in transport, cell division, etc. In addition,
many structural features of particular protein classes have been
identified and correlate with specific consensus sequences, such
as, for example, binding domains for nucleotides, DNA, metal ions,
and other small molecules; sites for covalent modifications such as
phosphorylation, acylation, and the like; sites of protein:protein
interactions, etc. These consensus sequences may be quite short and
thus may represent only a fraction of the entire protein-coding
sequence. Identification of such a feature in an S. pneumoniae
sequence is therefore useful in determining the function of the
encoded protein and identifying useful targets of antibacterial
drugs.
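By way of non-limiting illustration, the two comparisons described
above (whole-sequence identity between two aligned proteins, and
detection of a short consensus feature) can be sketched as follows.
The function names are hypothetical, the alignment is assumed to have
been produced beforehand by any suitable tool, and the Walker A
(P-loop) nucleotide-binding pattern is used only as one example of a
short consensus sequence; it is not taken from the present
disclosure.

```python
import re

# Illustrative consensus only: the Walker A (P-loop) nucleotide-binding
# motif, written as the pattern [AG]-x(4)-G-K-[ST].
WALKER_A = re.compile(r"[AG].{4}GK[ST]")


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def find_motif(protein: str, pattern: re.Pattern = WALKER_A) -> list:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein)]
```

A pair scoring above the 80-90% range mentioned above would be
flagged for likely functional homology, while a motif hit covers only
a few residues and is best treated as a hint to be confirmed
experimentally.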
[0145] Of particular relevance to the present invention are
structural features that are common to secretory, transmembrane,
and surface proteins, including secretion signal peptides and
hydrophobic transmembrane domains. S. pneumoniae proteins
identified as containing putative signal sequences and/or
transmembrane domains are useful as immunogenic components of
vaccines.
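As one hedged illustration of how a hydrophobic transmembrane domain
might be flagged computationally, the sketch below averages
Kyte-Doolittle hydropathy values over a sliding window. This is a
substitute technique chosen for brevity, not a method recited in this
disclosure; the 19-residue window and the 1.6 cutoff are the
commonly quoted Kyte-Doolittle defaults and are assumptions here, as
are the function names.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_windows(protein: str, window: int = 19) -> list:
    """Mean hydropathy of every full-length window along the sequence.

    Raises KeyError if the sequence contains a non-standard residue code.
    """
    scores = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        scores.append(sum(KD[aa] for aa in segment) / window)
    return scores


def has_candidate_tm_segment(protein: str, window: int = 19,
                             threshold: float = 1.6) -> bool:
    """True if any window reaches the (assumed) hydropathy threshold."""
    return any(score >= threshold
               for score in hydropathy_windows(protein, window))
```

Sequences shorter than the window yield no scores and are therefore
reported as having no candidate segment.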
[0146] Targets for therapeutic drugs according to the invention
include, but are not limited to, polypeptides of the invention,
whether unique to S. pneumoniae or not, that are essential for
growth and/or viability of S. pneumoniae under at least one growth
condition. Polypeptides essential for growth and/or viability can
be determined by examining the effect of deleting and/or disrupting
the genes, i.e., by so-called gene "knockout". Alternatively,
genetic footprinting can be used (Smith et al., 1995, Proc. Natl.
Acad. Sci. USA 92:6479-6483; Published International Application WO
94/26933; U.S. Pat. No. 5,612,180). Still other methods for
assessing essentiality include the ability to isolate conditional
lethal mutations in the specific gene (e.g., temperature sensitive
mutations). Other useful targets for therapeutic drugs, which
include polypeptides that are not essential for growth or viability
per se but lead to loss of viability of the cell, can be used to
target therapeutic agents to cells.
[0147] Strain-Specific Sequences:
[0148] Because of the evolutionary relationship between different
S. pneumoniae strains, it is believed that the presently disclosed
S. pneumoniae sequences are useful for identifying, and/or
discriminating between, previously known and new S. pneumoniae
strains. It is believed that other S. pneumoniae strains will
exhibit at least 70% sequence homology with the presently disclosed
sequence. Systematic and routine analyses of DNA sequences derived
from samples containing S. pneumoniae strains, and comparison with
the present sequence allows for the identification of sequences
that can be used to discriminate between strains, as well as those
that are common to all S. pneumoniae strains. In one embodiment,
the invention provides nucleic acids, including probes, and peptide
and polypeptide sequences that discriminate between different
strains of S. pneumoniae. Strain-specific components can also be
identified functionally by their ability to elicit or react with
antibodies that selectively recognize one or more S. pneumoniae
strains.
[0149] In another embodiment, the invention provides nucleic acids,
including probes, and peptide and polypeptide sequences that are
common to all S. pneumoniae strains but are not found in other
bacterial species.
S. pneumoniae Polypeptides
[0150] This invention encompasses isolated S. pneumoniae
polypeptides encoded by the disclosed S. pneumoniae genomic
sequences, including the polypeptides of the invention contained in
the Sequence Listing. Polypeptides of the invention are preferably
at least 5 amino acid residues in length. Using the DNA sequence
information provided herein, the amino acid sequences of the
polypeptides encompassed by the invention can be deduced using
methods well-known in the art. It will be understood that the
sequence of an entire nucleic acid encoding an S. pneumoniae
polypeptide can be isolated and identified based on an ORF that
encodes only a fragment of the cognate protein-coding region. This
can be achieved, for example, by using the isolated nucleic acid
encoding the ORF, or fragments thereof, to prime a polymerase chain
reaction with genomic S. pneumoniae DNA as template; this is
followed by sequencing the amplified product.
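The deduction of an amino acid sequence from a DNA sequence mentioned
above can be sketched, for illustration only, as a codon-by-codon
lookup. The sketch assumes the input is the coding strand, already in
frame, and uses the standard codon assignments; the function name and
the use of "X" for unrecognized codons are illustrative choices.

```python
from itertools import product

# Standard genetic code laid out in the conventional T, C, A, G order:
# the first base varies slowest and the third base varies fastest.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str, to_stop: bool = True) -> str:
    """Translate an in-frame coding-strand DNA sequence.

    Trailing bases that do not form a full codon are ignored. Codons
    containing characters other than A, C, G, T are reported as "X".
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)
```

For example, translate("ATGAAAACCGCGTAA") yields "MKTA", stopping at
the TAA codon.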
[0151] The polypeptides of the present invention, including
function-conservative variants of the disclosed ORFs, may be
isolated from wild-type or mutant S. pneumoniae cells, or from
heterologous organisms or cells (including, but not limited to,
bacteria, fungi, insect, plant, and mammalian cells) including S.
pneumoniae into which a S. pneumoniae-derived protein-coding
sequence has been introduced and expressed. Furthermore, the
polypeptides may be part of recombinant fusion proteins.
[0152] S. pneumoniae polypeptides of the invention can be
chemically synthesized using commercially automated procedures such
as those referenced herein, including, without limitation,
exclusive solid phase synthesis, partial solid phase methods,
fragment condensation or classical solution synthesis. The
polypeptides are preferably prepared by solid phase peptide
synthesis as described by Merrifield, 1963, J. Am. Chem. Soc.
85:2149. The synthesis is carried out with amino acids that are
protected at the alpha-amino terminus. Trifunctional amino acids
with labile side-chains are also protected with suitable groups to
prevent undesired chemical reactions from occurring during the
assembly of the polypeptides. The alpha-amino protecting group is
selectively removed to allow subsequent reaction to take place at
the amino-terminus. The conditions for the removal of the
alpha-amino protecting group do not remove the side-chain
protecting groups.
[0153] The alpha-amino protecting groups are those known to be
useful in the art of stepwise polypeptide synthesis. Included are
acyl type protecting groups, e.g., formyl, trifluoroacetyl, acetyl,
aromatic urethane type protecting groups, e.g., benzyloxycarbonyl
(Cbz), substituted benzyloxycarbonyl and
9-fluorenylmethyloxycarbonyl (Fmoc), aliphatic urethane protecting
groups, e.g., t-butyloxycarbonyl (Boc), isopropyloxycarbonyl,
cyclohexyloxycarbonyl, and alkyl type protecting groups, e.g.,
benzyl, triphenylmethyl. The preferred protecting group is Boc. The
side-chain protecting groups for Tyr include tetrahydropyranyl,
tert-butyl, trityl, benzyl, Cbz, 4-Br-Cbz and 2,6-dichlorobenzyl.
The preferred side-chain protecting group for Tyr is
2,6-dichlorobenzyl. The side-chain protecting groups for Asp
include benzyl, 2,6-dichlorobenzyl, methyl, ethyl and cyclohexyl.
The preferred side-chain protecting group for Asp is cyclohexyl.
The side-chain protecting groups for Thr and Ser include acetyl,
benzoyl, trityl, tetrahydropyranyl, benzyl, 2,6-dichlorobenzyl and
Cbz. The preferred protecting group for Thr and Ser is benzyl. The
side-chain protecting groups for Arg include nitro, Tos, Cbz,
adamantyloxycarbonyl and Boc. The preferred protecting group for
Arg is Tos. The side-chain amino group of Lys may be protected with
Cbz, 2-Cl-Cbz, Tos or Boc. The 2-Cl-Cbz group is the preferred
protecting group for Lys.
[0154] The side-chain protecting groups selected must remain intact
during coupling and not be removed during the deprotection of the
amino-terminus protecting group or during coupling conditions. The
side-chain protecting groups must also be removable upon the
completion of synthesis, using reaction conditions that will not
alter the finished polypeptide.
[0155] Solid phase synthesis is usually carried out from the
carboxy-terminus by coupling the alpha-amino protected (side-chain
protected) amino acid to a suitable solid support. An ester linkage
is formed when the attachment is made to a chloromethyl or
hydroxymethyl resin, and the resulting polypeptide will have a free
carboxyl group at the C-terminus. Alternatively, when a
benzhydrylamine or p-methylbenzhydrylamine resin is used, an amide
bond is formed and the resulting polypeptide will have a
carboxamide group at the C-terminus. These resins are commercially
available, and their preparation was described by Stewart et al.,
1984, Solid Phase Peptide Synthesis (2nd Edition), Pierce Chemical
Co., Rockford, Ill.
[0156] The C-terminal amino acid, protected at the side chain if
necessary and at the alpha-amino group, is coupled to the
benzhydrylamine resin using various activating agents including
dicyclohexylcarbodiimide (DCC), N,N'-diisopropyl-carbodiimide and
carbonyldiimidazole. Following the attachment to the resin support,
the alpha-amino protecting group is removed using trifluoroacetic
acid (TFA) or HCl in dioxane at a temperature between 0 and
25.degree. C. Dimethylsulfide is added to the TFA after the
introduction of methionine (Met) to suppress possible S-alkylation.
After removal of the alpha-amino protecting group, the remaining
protected amino acids are coupled stepwise in the required order to
obtain the desired sequence.
[0157] Various activating agents can be used for the coupling
reactions including DCC, N,N'-diisopropyl-carbodiimide,
benzotriazol-1-yl-oxy-tris-(dimethylamino)-phosphonium
hexa-fluorophosphate (BOP) and DCC-hydroxybenzotriazole (HOBt).
Each protected amino acid is used in excess (>2.0 equivalents),
and the couplings are usually carried out in N-methylpyrrolidone
(NMP) or in DMF, CH.sub.2Cl.sub.2 or mixtures thereof. The extent
of completion of the coupling reaction is monitored at each stage,
e.g., by the ninhydrin reaction as described by Kaiser et al.,
1970, Anal. Biochem. 34:595. In cases where incomplete coupling is
found, the coupling reaction is repeated. The coupling reactions
can be performed automatically with commercially available
instruments.
[0158] After the entire assembly of the desired polypeptide, the
polypeptide-resin is cleaved with a reagent such as liquid HF for
1-2 hours at 0.degree. C., which cleaves the polypeptide from the
resin and removes all side-chain protecting groups. A scavenger
such as anisole is usually used with the liquid HF to prevent
cations formed during the cleavage from alkylating the amino acid
residues present in the polypeptide. The polypeptide-resin may be
deprotected with TFA/dithioethane prior to cleavage if desired.
[0159] Side-chain to side-chain cyclization on the solid support
requires the use of an orthogonal protection scheme which enables
selective cleavage of the side-chain functions of acidic amino
acids (e.g., Asp) and the basic amino acids (e.g., Lys). The
9-fluorenylmethyl (Fm) protecting group for the side-chain of Asp
and the 9-fluorenylmethyloxycarbonyl (Fmoc) protecting group for
the side-chain of Lys can be used for this purpose. In these cases,
the side-chain protecting groups of the Boc-protected
polypeptide-resin are selectively removed with piperidine in DMF.
Cyclization is achieved on the solid support using various
activating agents including DCC, DCC/HOBt or BOP. The HF reaction
is carried out on the cyclized polypeptide-resin as described
above.
[0160] Methods for polypeptide purification are well-known in the
art, including, without limitation, preparative disc-gel
electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC,
gel filtration, ion exchange and partition chromatography, and
countercurrent distribution. For some purposes, it is preferable to
produce the polypeptide in a recombinant system in which the S.
pneumoniae protein contains an additional sequence tag that
facilitates purification, such as, but not limited to, a
polyhistidine sequence. The polypeptide can then be purified from a
crude lysate of the host cell by chromatography on an appropriate
solid-phase matrix. Alternatively, antibodies produced against a S.
pneumoniae protein or against peptides derived therefrom can be
used as purification reagents. Other purification methods are
possible.
[0161] The present invention also encompasses derivatives and
homologues of S. pneumoniae-encoded polypeptides. For some
purposes, nucleic acid sequences encoding the peptides may be
altered by substitutions, additions, or deletions that provide for
functionally equivalent molecules, i.e., function-conservative
variants. For example, one or more amino acid residues within the
sequence can be substituted by another amino acid of similar
properties, such as, for example, positively charged amino acids
(arginine, lysine, and histidine); negatively charged amino acids
(aspartate and glutamate); polar neutral amino acids; and non-polar
amino acids. The isolated polypeptides may be modified by, for
example, phosphorylation, sulfation, acylation, or other protein
modifications. They may also be modified with a label capable of
providing a detectable signal, either directly or indirectly,
including, but not limited to, radioisotopes and fluorescent
compounds.
[0162] To identify S. pneumoniae-derived polypeptides for use in
the present invention, essentially the complete genomic sequence of
a virulent, methicillin-resistant isolate of Streptococcus
pneumoniae was analyzed. While, in very rare instances, a
nucleic acid sequencing error may be revealed, resolving a rare
sequencing error is well within the art, and such an occurrence
will not prevent one skilled in the art from practicing the
invention.
[0163] Also encompassed are any S. pneumoniae polypeptide sequences
that are contained within the open reading frames (ORFs), including
complete protein-coding sequences, of which any of SEQ ID NO:
2662-SEQ ID NO: 5322 forms a part. Table 2, which is appended
herewith and which forms part of the present specification,
provides a putative identification of the particular function of a
polypeptide which is encoded by each ORF. As a result, one skilled
in the art can use the polypeptides of the present invention for
commercial and industrial purposes consistent with the type of
putative identification of the polypeptide.
[0164] The present invention provides a library of S.
pneumoniae-derived polypeptide sequences, and a corresponding
library of nucleic acid sequences encoding the polypeptides,
wherein the polypeptides themselves, or polypeptides contained
within ORFs of which they form a part, comprise sequences that are
contemplated for use as components of vaccines. Non-limiting
examples of such sequences are listed by SEQ ID NO in Table 2,
which is appended herewith and which forms part of the present
specification.
[0165] The present invention also provides a library of S.
pneumoniae-derived polypeptide sequences, and a corresponding
library of nucleic acid sequences encoding the polypeptides,
wherein the polypeptides themselves, or polypeptides contained
within ORFs of which they form a part, comprise sequences lacking
homology to any known prokaryotic or eukaryotic sequences. Such
libraries provide probes, primers, and markers which can be used to
diagnose S. pneumoniae infection, including use as markers in
epidemiological studies. Non-limiting examples of such sequences
are listed by SEQ ID NO in Table 2, which is appended herewith and
which forms part of the present specification.
[0166] The present invention also provides a library of S.
pneumoniae-derived polypeptide sequences, and a corresponding
library of nucleic acid sequences encoding the polypeptides,
wherein the polypeptides themselves, or polypeptides contained
within ORFs of which they form a part, comprise targets for
therapeutic drugs.
Specific Example: Determination of Candidate Protein Antigens for
Antibody and Vaccine Development
[0167] The selection of candidate protein antigens for vaccine
development can be derived from the nucleic acids encoding S.
pneumoniae polypeptides. First, the ORF's can be analyzed for
homology to other known exported or membrane proteins and analyzed
using the discriminant analysis described by Klein, et al. (Klein,
P., Kanehisa, M., and DeLisi, C. (1985) Biochimica et Biophysica
Acta 815:468-476) for predicting exported and membrane
proteins.
[0168] Homology searches can be performed using the BLAST algorithm
contained in the Wisconsin Sequence Analysis Package (Genetics
Computer Group, University Research Park, 575 Science Drive,
Madison, Wis. 53711) to compare each predicted ORF amino acid
sequence with all sequences found in the current GenBank,
SWISS-PROT and PIR databases. BLAST searches for local alignments
between the ORF and the databank sequences and reports a
probability score which indicates the probability of finding this
sequence by chance in the database. ORF's with significant homology
(e.g. probabilities lower than 1.times.10.sup.-6 that the homology
is only due to random chance) to membrane or exported proteins
represent protein antigens for vaccine development. Possible
functions can be provided to S. pneumoniae genes based on sequence
homology to genes cloned in other organisms.
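The selection step described above can be sketched as a simple
filter over search results. The Hit record, its field names, and the
keyword list are hypothetical and do not correspond to the output
format of any particular BLAST implementation; only the
1.times.10.sup.-6 cutoff is taken from the text above.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one database match of a predicted ORF."""
    orf_id: str
    subject_description: str
    probability: float  # chance probability reported by the search


def select_candidate_antigens(hits, cutoff=1e-6,
                              keywords=("membrane", "exported")):
    """Return sorted ORF ids with a significant hit to a membrane or
    exported protein, judged by keywords in the subject description."""
    selected = set()
    for hit in hits:
        description = hit.subject_description.lower()
        if hit.probability < cutoff and any(k in description
                                            for k in keywords):
            selected.add(hit.orf_id)
    return sorted(selected)
```

Keyword matching on free-text descriptions is crude, so a sketch of
this kind would serve only as a first pass ahead of manual review.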
[0169] Discriminant analysis (Klein, et al. supra) can be used to
examine the ORF amino acid sequences. This algorithm uses the
intrinsic information contained in the ORF amino acid sequence and
compares it to information derived from the properties of known
membrane and exported proteins. This comparison predicts which
proteins will be exported, membrane associated or cytoplasmic. ORF
amino acid sequences identified as exported or membrane associated
by this algorithm are likely protein antigens for vaccine
development.
Production of Fragments and Analogs of S. pneumoniae Nucleic Acids
and Polypeptides
[0170] Based on the discovery of the S. pneumoniae gene products of
the invention provided in the Sequence Listing, one skilled in the
art can alter the disclosed structure (of S. pneumoniae genes),
e.g., by producing fragments or analogs, and test the newly
produced structures for activity. Examples of techniques known to
those skilled in the relevant art which allow the production and
testing of fragments and analogs are discussed below. These, or
analogous methods can be used to make and screen libraries of
polypeptides, e.g., libraries of random peptides or libraries of
fragments or analogs of cellular proteins for the ability to bind
S. pneumoniae polypeptides. Such screens are useful for the
identification of inhibitors of S. pneumoniae.
[0171] Generation of Fragments
[0172] Fragments of a protein can be produced in several ways,
e.g., recombinantly, by proteolytic digestion, or by chemical
synthesis. Internal or terminal fragments of a polypeptide can be
generated by removing one or more nucleotides from one end (for a
terminal fragment) or both ends (for an internal fragment) of a
nucleic acid which encodes the polypeptide. Expression of the
mutagenized DNA produces polypeptide fragments. Digestion with
"end-nibbling" endonucleases can thus generate DNA's which encode
an array of fragments. DNA's which encode fragments of a protein
can also be generated by random shearing, restriction digestion or
a combination of the above-discussed methods.
[0173] Fragments can also be chemically synthesized using
techniques known in the art such as conventional Merrifield solid
phase Fmoc or t-Boc chemistry. For example, peptides of the
present invention may be arbitrarily divided into fragments of
desired length with no overlap of the fragments, or divided into
overlapping fragments of a desired length.
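The division of a peptide into non-overlapping or overlapping
fragments of a desired length can be illustrated with the following
sketch. The function name and parameters are hypothetical; an overlap
of zero gives the non-overlapping case, and the final fragment may be
shorter than the requested length.

```python
def split_into_fragments(peptide: str, length: int, overlap: int = 0) -> list:
    """Divide a peptide into consecutive fragments.

    length  -- desired fragment length (residues)
    overlap -- residues shared by neighbouring fragments (0 = none)
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < length")
    step = length - overlap
    fragments = []
    for start in range(0, len(peptide), step):
        fragments.append(peptide[start:start + length])
        # Stop once a fragment has reached the end of the peptide.
        if start + length >= len(peptide):
            break
    return fragments
```

For a 10-residue peptide, a length of 4 with no overlap gives three
fragments, while a length of 4 with an overlap of 2 gives four.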
Alteration of Nucleic Acids and Polypeptides: Random Methods
[0174] Amino acid sequence variants of a protein can be prepared by
random mutagenesis of DNA which encodes a protein or a particular
domain or region of a protein. Useful methods include PCR
mutagenesis and saturation mutagenesis. A library of random amino
acid sequence variants can also be generated by the synthesis of a
set of degenerate oligonucleotide sequences. (Methods for screening
proteins in a library of variants are described elsewhere herein).
[0175] PCR Mutagenesis
[0176] In PCR mutagenesis, reduced Taq polymerase fidelity is used
to introduce random mutations into a cloned fragment of DNA (Leung
et al., 1989, Technique 1:11-15). The DNA region to be mutagenized
is amplified using the polymerase chain reaction (PCR) under
conditions that reduce the fidelity of DNA synthesis by Taq DNA
polymerase, e.g., by using a dGTP/dATP ratio of five and adding
Mn.sup.2+ to the PCR reaction. The pool of amplified DNA fragments
is inserted into appropriate cloning vectors to provide random
mutant libraries.
[0177] Saturation Mutagenesis
[0178] Saturation mutagenesis allows for the rapid introduction of
a large number of single base substitutions into cloned DNA
fragments (Myers et al., 1985, Science 229:242). This technique
includes generation of mutations, e.g., by chemical treatment or
irradiation of single-stranded DNA in vitro, and synthesis of a
complementary DNA strand. The mutation frequency can be modulated
by modulating the severity of the treatment, and essentially all
possible base substitutions can be obtained. Because this procedure
does not involve a genetic selection for mutant fragments, both
neutral substitutions, as well as those that alter function, are
obtained. The distribution of point mutations is not biased toward
conserved sequence elements.
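To indicate the size of the sequence space that saturation
mutagenesis aims to cover, the sketch below enumerates every
single-base substitution of a short DNA fragment. It models only the
outcome, not the chemical treatment; the function name and the tuple
layout are illustrative assumptions.

```python
def single_base_substitutions(dna: str) -> list:
    """List every single-base substitution of a DNA sequence.

    Each entry is (0-based position, original base, new base, variant).
    A sequence of n bases drawn from A, C, G, T yields 3 * n variants.
    """
    dna = dna.upper()
    variants = []
    for i, ref in enumerate(dna):
        for alt in "ACGT":
            if alt != ref:
                variants.append((i, ref, alt, dna[:i] + alt + dna[i + 1:]))
    return variants
```

A 300-base fragment thus has 900 possible single substitutions, which
gives a rough lower bound on the library size needed to sample
essentially all of them.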
[0179] Degenerate Oligonucleotides
[0180] A library of homologs can also be generated from a set of
degenerate oligonucleotide sequences. Chemical synthesis of
degenerate sequences can be carried out in an automatic DNA
synthesizer, and the synthetic genes then ligated into an
appropriate expression vector. The synthesis of degenerate
oligonucleotides is known in the art (see for example, Narang, S A
(1983) Tetrahedron 39:3; Itakura et al. (1981) Recombinant DNA,
Proc 3rd Cleveland Sympos. Macromolecules, ed. A G Walton,
Amsterdam: Elsevier pp 273-289; Itakura et al. (1984) Annu. Rev.
Biochem. 53:323; Itakura et al. (1977) Science 198:1056; Ike et al.
(1983) Nucleic Acid Res. 11:477). Such techniques have been employed
in the directed evolution of other proteins (see, for example,
Scott et al. (1990) Science 249:386-390; Roberts et al. (1992) PNAS
89:2429-2433; Devlin et al. (1990) Science 249: 404-406; Cwirla et
al. (1990) PNAS 87: 6378-6382; as well as U.S. Pat. Nos. 5,223,409,
5,198,346, and 5,096,815).
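For illustration, the set of sequences represented by a degenerate
oligonucleotide can be enumerated from the standard IUPAC nucleotide
ambiguity codes, as in the sketch below. The function names are
hypothetical, and the sketch says nothing about the relative
abundance of each sequence in an actual synthesis.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences encoded by a degenerate oligo."""
    count = 1
    for base in oligo.upper():
        count *= len(IUPAC[base])
    return count


def expand_degenerate(oligo: str) -> list:
    """Enumerate every sequence encoded by a degenerate oligo.

    The result grows multiplicatively, so check degeneracy() first
    for long or highly degenerate inputs.
    """
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

An NNK codon, for instance, has a degeneracy of 32, so a library
randomizing five such codons would encode 32 to the fifth power
nucleotide sequences.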
Alteration of Nucleic Acids and Polypeptides: Methods for Directed
Mutagenesis
[0181] Non-random, or directed, mutagenesis techniques can be used
to provide specific sequences or mutations in specific regions.
These techniques can be used to create variants which include,
e.g., deletions, insertions, or substitutions, of residues of the
known amino acid sequence of a protein. The sites for mutation can
be modified individually or in series, e.g., by (1) substituting
first with conserved amino acids and then with more radical choices
depending upon results achieved, (2) deleting the target residue,
or (3) inserting residues of the same or a different class adjacent
to the located site, or combinations of options 1-3.
[0182] Alanine Scanning Mutagenesis
[0183] Alanine scanning mutagenesis is a useful method for
identification of certain residues or regions of the desired
protein that are preferred locations or domains for mutagenesis
(Cunningham and Wells, Science 244:1081-1085, 1989). In alanine
scanning, a residue or group of target residues is identified
(e.g., charged residues such as Arg, Asp, His, Lys, and Glu) and
replaced by a neutral or negatively charged amino acid (most
preferably alanine or polyalanine). Replacement of an amino acid
can affect the interaction of the amino acids with the surrounding
aqueous environment in or outside the cell. Those domains
demonstrating functional sensitivity to the substitutions are then
refined by introducing further or other variants at or for the
sites of substitution. Thus, while the site for introducing an
amino acid sequence variation is predetermined, the nature of the
mutation per se need not be predetermined. For example, to optimize
the performance of a mutation at a given site, alanine scanning or
random mutagenesis may be conducted at the target codon or region
and the expressed desired protein subunit variants are screened for
the optimal combination of desired activity.
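The design of an alanine scan over the charged residues named above
can be sketched as follows. The function name and the variant naming
convention (original residue, 1-based position, "A") are illustrative
assumptions; the sketch generates single substitutions only and does
not model polyalanine replacement.

```python
# Charged residues named in the text: Arg, Asp, His, Lys, Glu.
CHARGED = frozenset("RDHKE")


def alanine_scan(protein: str, targets=CHARGED) -> list:
    """Return (variant name, variant sequence) for each target residue
    replaced individually by alanine, e.g. ("K2A", "MADLA")."""
    variants = []
    for i, aa in enumerate(protein):
        if aa in targets and aa != "A":
            name = f"{aa}{i + 1}A"
            variants.append((name, protein[:i] + "A" + protein[i + 1:]))
    return variants
```

Each variant would then be expressed and assayed, and positions
showing functional sensitivity would be revisited with further
substitutions.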
[0184] Oligonucleotide-Mediated Mutagenesis
[0185] Oligonucleotide-mediated mutagenesis is a useful method for
preparing substitution, deletion, and insertion variants of DNA,
see, e.g., Adelman et al., (DNA 2:183, 1983). Briefly, the desired
DNA is altered by hybridizing an oligonucleotide encoding a
mutation to a DNA template, where the template is the
single-stranded form of a plasmid or bacteriophage containing the
unaltered or native DNA sequence of the desired protein. After
hybridization, a DNA polymerase is used to synthesize an entire
second complementary strand of the template that will thus
incorporate the oligonucleotide primer, and will code for the
selected alteration in the desired protein DNA. Generally,
oligonucleotides of at least 25 nucleotides in length are used. An
optimal oligonucleotide will have 12 to 15 nucleotides that are
completely complementary to the template on either side of the
nucleotide(s) coding for the mutation. This ensures that the
oligonucleotide will hybridize properly to the single-stranded DNA
template molecule. The oligonucleotides are readily synthesized
using techniques known in the art such as that described by Crea et
al. (Proc. Natl. Acad. Sci. USA, 75: 5765[1978]).
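The oligonucleotide design rule described above (a mutated codon
flanked on either side by 12 to 15 perfectly matched nucleotides,
for a total length of at least 25) can be sketched as follows. This
is a minimal illustration under stated assumptions: the function
names and the example template are hypothetical, the template is
given as the coding strand, and no check of melting temperature or
secondary structure is attempted.

```python
# Illustrative sketch only; function names and the example template are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(_COMPLEMENT)[::-1]


def design_mutagenic_oligo(template, codon_start, new_codon, flank=12):
    """Return an oligo carrying new_codon with `flank` matched bases on each side.

    template    -- coding-strand DNA, written 5' to 3'
    codon_start -- 0-based index of the first base of the codon to replace
    flank       -- matched bases on either side (12 to 15 in the text above)
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly 3 nucleotides")
    left = codon_start - flank
    right = codon_start + 3 + flank
    if left < 0 or right > len(template):
        raise ValueError("not enough flanking sequence around the codon")
    return template[left:codon_start] + new_codon + template[codon_start + 3:right]


if __name__ == "__main__":
    template = "ACGT" * 10
    oligo = design_mutagenic_oligo(template, codon_start=18, new_codon="GCC")
    print(len(oligo), oligo)
    # If the single-stranded template is the coding strand, the primer that
    # anneals to it is the reverse complement of the sequence printed above.
    print(reverse_complement(oligo))
```

With the default flank of 12 the oligonucleotide is 27 nucleotides
long, which satisfies the minimum length of 25 stated above.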
[0186] Cassette Mutagenesis
[0187] Another method for preparing variants, cassette mutagenesis,
is based on the technique described by Wells et al. (Gene,
34:315[1985]). The starting material is a plasmid (or other vector)
which includes the protein subunit DNA to be mutated. The codon(s)
in the protein subunit DNA to be mutated are identified. There must
be a unique restriction endonuclease site on each side of the
identified mutation site(s). If no such restriction sites exist,
they may be generated using the above-described
oligonucleotide-mediated mutagenesis method to introduce them at
appropriate locations in the desired protein subunit DNA. After the
restriction sites have been introduced into the plasmid, the
plasmid is cut at these sites to linearize it. A double-stranded
oligonucleotide encoding the sequence of the DNA between the
restriction sites but containing the desired mutation(s) is
synthesized using standard procedures. The two strands are
synthesized separately and then hybridized together using standard
techniques. This double-stranded oligonucleotide is referred to as
the cassette. This cassette is designed to have 3' and 5' ends that
are compatible with the ends of the linearized plasmid, such that
it can be directly ligated to the plasmid. This plasmid now
contains the mutated desired protein subunit DNA sequence.
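The cassette replacement described above can be illustrated with a
simplified string model. The sketch below is an assumption-laden toy:
the function name swap_cassette and the example sequences are
hypothetical, the plasmid is treated as a linear string read on one
strand, the recognition sites are kept intact, and enzyme cut
positions and overhangs are not modeled. It does enforce the stated
requirement that each flanking restriction site be unique.

```python
# Illustrative sketch only; names and sequences are hypothetical.
def swap_cassette(plasmid, site_5, site_3, cassette_insert):
    """Replace the sequence between two unique recognition sites.

    Simplification: both sites are retained unchanged and only the
    intervening sequence is exchanged for cassette_insert.
    """
    for site in (site_5, site_3):
        if plasmid.count(site) != 1:
            raise ValueError(f"site {site} must occur exactly once in the plasmid")
    start = plasmid.index(site_5) + len(site_5)
    end = plasmid.index(site_3)
    if start > end:
        raise ValueError("site_5 must lie upstream of site_3")
    return plasmid[:start] + cassette_insert + plasmid[end:]


if __name__ == "__main__":
    eco_ri, bam_hi = "GAATTC", "GGATCC"
    plasmid = "TTTT" + eco_ri + "ATGAAACCC" + bam_hi + "AAAA"
    # Cassette changes the second codon from AAA (Lys) to GCA (Ala).
    print(swap_cassette(plasmid, eco_ri, bam_hi, "ATGGCACCC"))
```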
[0188] Combinatorial Mutagenesis
[0189] Combinatorial mutagenesis can also be used to generate
mutants (Ladner et al., WO 88/06630). In this method, the amino
acid sequences for a group of homologs or other related proteins
are aligned, preferably to promote the highest homology possible.
All of the amino acids which appear at a given position of the
aligned sequences can be selected to create a degenerate set of
combinatorial sequences. The variegated library of variants is
generated by combinatorial mutagenesis at the nucleic acid level,
and is encoded by a variegated gene library. For example, a mixture
of synthetic oligonucleotides can be enzymatically ligated into
gene sequences such that the degenerate set of potential sequences
is expressible as individual peptides, or alternatively, as a set
of larger fusion proteins containing the set of degenerate
sequences.
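As a minimal sketch of how a degenerate set follows from an
alignment, the Python below collects the residues observed at each
aligned position and enumerates every combination. The function
names and the three short aligned sequences are hypothetical; gap
characters are ignored, and the sketch works at the amino acid level
only, without designing the degenerate oligonucleotides themselves.

```python
# Illustrative sketch only; names and example sequences are hypothetical.
from itertools import product
from math import prod


def degenerate_positions(aligned):
    """Return, per alignment column, the sorted residues observed (gaps ignored)."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    columns = [sorted(set(column) - {"-"}) for column in zip(*aligned)]
    return [column for column in columns if column]  # drop all-gap columns


def library_size(positions):
    """Number of distinct sequences in the combinatorial set."""
    return prod(len(position) for position in positions)


def enumerate_library(positions):
    """List every member of the combinatorial set (practical for small sets only)."""
    return ["".join(combo) for combo in product(*positions)]


if __name__ == "__main__":
    positions = degenerate_positions(["MKTL", "MRTL", "MKSV"])
    print(library_size(positions))
    print(enumerate_library(positions))
```

The combinatorial set includes sequences that appear in none of the
aligned homologs, which is the point of the approach: three parent
sequences here give rise to eight library members.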
Other Modifications of S. pneumoniae Nucleic Acids and
Polypeptides
[0190] It is possible to modify the structure of an S. pneumoniae
polypeptide for such purposes as increasing solubility or enhancing
stability (e.g., shelf life ex vivo and resistance to proteolytic
degradation in vivo). A modified S. pneumoniae protein or peptide
can be produced in which the amino acid sequence has been altered,
such as by amino acid substitution, deletion, or addition as
described herein.
[0191] An S. pneumoniae peptide can also be modified by
substitution of cysteine residues preferably with alanine, serine,
threonine, leucine or glutamic acid residues to minimize
dimerization via disulfide linkages. In addition, amino acid side
chains of fragments of the protein of the invention can be
chemically modified. Another modification is cyclization of the
peptide.
[0192] In order to enhance stability and/or reactivity, an S.
pneumoniae polypeptide can be modified to incorporate one or more
polymorphisms in the amino acid sequence of the protein resulting
from any natural allelic variation. Additionally, D-amino acids,
non-natural amino acids, or non-amino acid analogs can be
substituted or added to produce a modified protein within the scope
of this invention. Furthermore, an S. pneumoniae polypeptide can be
modified using polyethylene glycol (PEG) according to the method of
A. Sehon and co-workers (Wie et al., supra) to produce a protein
conjugated with PEG. In addition, PEG can be added during chemical
synthesis of the protein. Other modifications of S. pneumoniae
proteins include reduction/alkylation (Tarr, Methods of Protein
Microcharacterization, J. E. Silver ed., Humana Press, Clifton, NJ,
155-194 (1986)); acylation (Tarr, supra); chemical coupling to an
appropriate carrier (Mishell and Shiigi, eds., Selected Methods in
Cellular Immunology, W. H. Freeman, San Francisco, Calif. (1980);
U.S. Pat. No. 4,939,239); or mild formalin treatment (Marsh (1971)
Int. Arch. of Allergy and Appl. Immunol., 41: 199-215).
[0193] To facilitate purification and potentially increase
solubility of an S. pneumoniae protein or peptide, it is possible
to add an amino acid fusion moiety to the peptide backbone. For
example, hexa-histidine can be added to the protein for
purification by immobilized metal ion affinity chromatography
(Hochuli, E. et al., (1988) Bio/Technology, 6: 1321-1325). In
addition, to facilitate isolation of peptides free of irrelevant
sequences, specific endoprotease cleavage sites can be introduced
between the sequences of the fusion moiety and the peptide.
[0194] To potentially aid proper antigen processing of epitopes
within an S. pneumoniae polypeptide, canonical protease sensitive
sites can be engineered between regions, each comprising at least
one epitope via recombinant or synthetic methods. For example,
charged amino acid pairs, such as KK or RR, can be introduced
between regions within a protein or fragment during recombinant
construction thereof. The resulting peptide can be rendered
sensitive to cleavage by cathepsin and/or other trypsin-like
enzymes which would generate portions of the protein containing one
or more epitopes. In addition, such charged amino acid residues can
result in an increase in the solubility of the peptide.
Primary Methods for Screening Polypeptides and Analogs
[0195] Various techniques are known in the art for screening
generated mutant gene products. Techniques for screening large gene
libraries often include cloning the gene library into replicable
expression vectors, transforming appropriate cells with the
resulting library of vectors, and expressing the genes under
conditions in which detection of a desired activity, e.g., in this
case, binding to S. pneumoniae polypeptide or an interacting
protein, facilitates relatively easy isolation of the vector
encoding the gene whose product was detected. Each of the
techniques described below is amenable to high through-put analysis
for screening large numbers of sequences created, e.g., by random
mutagenesis techniques.
[0196] Two Hybrid Systems
[0197] Two hybrid assays such as the system described above (as
with the other screening methods described herein) can be used to
identify polypeptides, e.g., fragments or analogs of a
naturally-occurring S. pneumoniae polypeptide, e.g., of cellular
proteins, or of randomly generated polypeptides which bind to an S.
pneumoniae protein. (The S. pneumoniae domain is used as the bait
protein and the library of variants is expressed as prey fusion
proteins.) In an analogous fashion, a two hybrid assay (as with the
other screening methods described herein) can be used to find
polypeptides which bind an S. pneumoniae polypeptide.
[0198] Display Libraries
[0199] In one approach to screening assays, the candidate peptides
are displayed on the surface of a cell or viral particle, and the
ability of particular cells or viral particles to bind an
appropriate receptor protein via the displayed product is detected
in a "panning assay". For example, the gene library can be cloned
into the gene for a surface membrane protein of a bacterial cell,
and the resulting fusion protein detected by panning (Ladner et
al., WO 88/06630; Fuchs et al. (1991) Bio/Technology 9:1370-1371;
and Goward et al. (1992) TIBS 18:136-140). In a similar fashion, a
detectably labeled ligand can be used to score for potentially
functional peptide homologs. Fluorescently labeled ligands, e.g.,
receptors, can be used to detect homologs which retain
ligand-binding activity. The use of fluorescently labeled ligands
allows cells to be visually inspected and separated under a
fluorescence microscope, or, where the morphology of the cell
permits, to be separated by a fluorescence-activated cell
sorter.
[0200] A gene library can be expressed as a fusion protein on the
surface of a viral particle. For instance, in the filamentous phage
system, foreign peptide sequences can be expressed on the surface
of infectious phage, thereby conferring two significant benefits.
First, since these phage can be applied to affinity matrices at
concentrations well over 10.sup.13 phage per milliliter, a large
number of phage can be screened at one time. Second, since each
infectious phage displays a gene product on its surface, if a
particular phage is recovered from an affinity matrix in low yield,
the phage can be amplified by another round of infection. The group
of almost identical E. coli filamentous phages M13, fd, and f1 are
most often used in phage display libraries. Either of the phage
gIII or gVIII coat proteins can be used to generate fusion proteins
without disrupting the ultimate packaging of the viral particle.
Foreign epitopes can be expressed at the NH.sub.2-terminal end of
pIII and phage bearing such epitopes recovered from a large excess
of phage lacking this epitope (Ladner et al. PCT publication WO
90/02909; Garrard et al., PCT publication WO 92/09690; Marks et al.
(1992) J. Biol. Chem. 267:16007-16010; Griffiths et al. (1993) EMBO
J 12:725-734; Clackson et al. (1991) Nature 352:624-628; and Barbas
et al. (1992) PNAS 89:4457-4461).
[0201] A common approach uses the maltose receptor of E. coli (the
outer membrane protein, LamB) as a peptide fusion partner (Charbit
et al. (1986) EMBO 5, 3029-3037). Oligonucleotides have been
inserted into plasmids encoding the LamB gene to produce peptides
fused into one of the extracellular loops of the protein. These
peptides are available for binding to ligands, e.g., to antibodies,
and can elicit an immune response when the cells are administered
to animals. Other cell surface proteins, e.g., OmpA (Schorr et al.
(1991) Vaccines 91, pp. 387-392), PhoE (Agterberg, et al. (1990)
Gene 88, 37-45), and PAL (Fuchs et al. (1991) Bio/Tech 9,
1369-1372), as well as large bacterial surface structures have
served as vehicles for peptide display. Peptides can be fused to
pilin, a protein which polymerizes to form the pilus, a conduit for
interbacterial exchange of genetic information (Thiry et al. (1989)
Appl. Environ. Microbiol. 55, 984-993). Because of its role in
interacting with other cells, the pilus provides a useful support
for the presentation of peptides to the extracellular environment.
Another large surface structure used for peptide display is the
bacterial motive organ, the flagellum. Fusion of peptides to the
subunit protein flagellin offers a dense array of many peptide
copies on the host cells (Kuwajima et al. (1988) Bio/Tech. 6,
1080-1083). Surface proteins of other bacterial species have also
served as peptide fusion partners. Examples include the
Staphylococcus protein A and the outer membrane IgA protease of
Neisseria (Hansson et al. (1992) J. Bacteriol. 174, 4239-4245 and
Klauser et al. (1990) EMBO J. 9, 1991-1999).
[0202] In the filamentous phage systems and the LamB system
described above, the physical link between the peptide and its
encoding DNA occurs by the containment of the DNA within a particle
(cell or phage) that carries the peptide on its surface. Capturing
the peptide captures the particle and the DNA within. An
alternative scheme uses the DNA-binding protein LacI to form a link
between peptide and DNA (Cull et al. (1992) PNAS USA 89:1865-1869).
This system uses a plasmid containing the LacI gene with an
oligonucleotide cloning site at its 3'-end. Under the controlled
induction by arabinose, a LacI-peptide fusion protein is produced.
This fusion retains the natural ability of LacI to bind to a short
DNA sequence known as the lac operator (LacO). By installing two
copies of LacO on the expression plasmid, the LacI-peptide fusion
binds tightly to the plasmid that encoded it. Because the plasmids
in each cell contain only a single oligonucleotide sequence and
each cell expresses only a single peptide sequence, the peptides
become specifically and stably associated with the DNA sequence
that directed their synthesis. The cells of the library are gently
lysed and the peptide-DNA complexes are exposed to a matrix of
immobilized receptor to recover the complexes containing active
peptides. The associated plasmid DNA is then reintroduced into
cells for amplification and DNA sequencing to determine the
identity of the peptide ligands. As a demonstration of the
practical utility of the method, a large random library of
dodecapeptides was made and selected on a monoclonal antibody
raised against the opioid peptide dynorphin B. A cohort of peptides
was recovered, all related by a consensus sequence corresponding to
a six-residue portion of dynorphin B (Cull et al. (1992) Proc.
Natl. Acad. Sci. U.S.A. 89:1865-1869).
[0203] This scheme, sometimes referred to as peptides-on-plasmids,
differs in two important ways from the phage display methods.
First, the peptides are attached to the C-terminus of the fusion
protein, resulting in the display of the library members as
peptides having free carboxy termini. Both of the filamentous phage
coat proteins, pIII and pVIII, are anchored to the phage through
their C-termini, and the guest peptides are placed into the
outward-extending N-terminal domains. In some designs, the
phage-displayed peptides are presented right at the amino terminus
of the fusion protein (Cwirla et al. (1990) Proc. Natl. Acad.
Sci. U.S.A. 87, 6378-6382). A second difference is the set of
biological biases affecting the population of peptides actually
present in the libraries. The LacI fusion molecules are confined to
the cytoplasm of the host cells. The phage coat fusions are exposed
briefly to the cytoplasm during translation but are rapidly
secreted through the inner membrane into the periplasmic
compartment, remaining anchored in the membrane by their C-terminal
hydrophobic domains, with the N-termini, containing the peptides,
protruding into the periplasm while awaiting assembly into phage
particles. The peptides in the LacI and phage libraries may differ
significantly as a result of their exposure to different
proteolytic activities. The phage coat proteins require transport
across the inner membrane and signal peptidase processing as a
prelude to incorporation into phage. Certain peptides exert a
deleterious effect on these processes and are underrepresented in
the libraries (Gallop et al. (1994) J. Med. Chem. 37(9):1233-1251).
These particular biases are not a factor in the LacI display
system.
[0204] The number of small peptides available in recombinant random
libraries is enormous. Libraries of 10.sup.7-10.sup.9 independent
clones are routinely prepared. Libraries as large as 10.sup.11
recombinants have been created, but this size approaches the
practical limit for clone libraries. This limitation in library
size occurs at the step of transforming the DNA containing
randomized segments into the host bacterial cells. To circumvent
this limitation, an in vitro system based on the display of nascent
peptides in polysome complexes has recently been developed. This
display library method has the potential of producing libraries 3-6
orders of magnitude larger than the currently available
phage/phagemid or plasmid libraries. Furthermore, the construction
of the libraries, expression of the peptides, and screening are all
done in an entirely cell-free format.
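The relationship between library size and the number of possible
peptides can be made concrete with a short calculation. The sketch
below assumes the 20 standard amino acids at every randomized
position; the function names are hypothetical, and the fraction it
returns is only an upper bound, since independent clones in a real
library may encode the same peptide more than once.

```python
# Illustrative sketch only; function names are hypothetical.
def sequence_space(length, alphabet_size=20):
    """Number of distinct peptides of the given length."""
    return alphabet_size ** length


def max_fraction_sampled(independent_clones, length):
    """Upper bound on the fraction of all peptides of this length in a library."""
    return min(1.0, independent_clones / sequence_space(length))


if __name__ == "__main__":
    for length in (6, 10, 12):
        print(length, sequence_space(length))
    print(max_fraction_sampled(10**9, 6))
    print(max_fraction_sampled(10**11, 10))
```

On these assumptions a library of 10.sup.9 clones is large enough to
contain every hexapeptide (6.4 x 10.sup.7 sequences), whereas even a
library of 10.sup.11 clones can contain at most about 1% of all
possible decapeptides (about 10.sup.13 sequences).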
[0205] In one application of this method (Gallop et al. (1994) J.
Med. Chem. 37(9):1233-1251), a molecular DNA library encoding
10.sup.12 decapeptides was constructed and the library expressed in
an E. coli S30 in vitro coupled transcription/translation system.
Conditions were chosen to stall the ribosomes on the mRNA, causing
the accumulation of a substantial proportion of the RNA in
polysomes and yielding complexes containing nascent peptides still
linked to their encoding RNA. The polysomes are sufficiently robust
to be affinity purified on immobilized receptors in much the same
way as the more conventional recombinant peptide display libraries
are screened. RNA from the bound complexes is recovered, converted
to cDNA, and amplified by PCR to produce a template for the next
round of synthesis and screening. The polysome display method can
be coupled to the phage display system. Following several rounds of
screening, cDNA from the enriched pool of polysomes was cloned into
a phagemid vector. This vector serves as both a peptide expression
vector, displaying peptides fused to the coat proteins, and as a
DNA sequencing vector for peptide identification. By expressing the
polysome-derived peptides on phage, one can either continue the
affinity selection procedure in this format or assay the peptides
on individual clones for binding activity in a phage ELISA, or for
binding specificity in a competition phage ELISA (Barret, et al.
(1992) Anal. Biochem 204,357-364). To identify the sequences of the
active peptides one sequences the DNA produced by the phagemid
host.
Secondary Screening of Polypeptides and Analogs
[0206] The high through-put assays described above can be followed
by secondary screens in order to identify further biological
activities which will, e.g., allow one skilled in the art to
differentiate agonists from antagonists. The type of a secondary
screen used will depend on the desired activity that needs to be
tested. For example, an assay can be developed in which the ability
to inhibit an interaction between a protein of interest and its
respective ligand can be used to identify antagonists from a group
of peptide fragments isolated through one of the primary screens
described above.
[0207] Therefore, methods for generating fragments and analogs and
testing them for activity are known in the art. Once the core
sequence of interest is identified, it is routine for one skilled
in the art to obtain analogs and fragments.
Peptide Mimetics of S. pneumoniae Polypeptides
[0208] The invention also provides for reduction of the protein
binding domains of the subject S. pneumoniae polypeptides to
generate mimetics, e.g. peptide or non-peptide agents. The peptide
mimetics are able to disrupt binding of a polypeptide to its
counter ligand, e.g., in the case of an S. pneumoniae polypeptide
binding to a naturally occurring ligand. The critical residues of a
subject S. pneumoniae polypeptide which are involved in molecular
recognition of a polypeptide can be determined and used to generate
S. pneumoniae-derived peptidomimetics which competitively or
noncompetitively inhibit binding of the S. pneumoniae polypeptide
with an interacting polypeptide (see, for example, European patent
applications EP-412,762A and EP-B31,080A).
[0209] For example, scanning mutagenesis can be used to map the
amino acid residues of a particular S. pneumoniae polypeptide
involved in binding an interacting polypeptide. Peptidomimetic
compounds (e.g. diazepine or isoquinoline derivatives) can then be
generated which mimic those residues in binding to an interacting
polypeptide, and which therefore can inhibit binding of an S.
pneumoniae polypeptide to an interacting polypeptide and thereby
interfere with the function of the S. pneumoniae polypeptide. For
instance, non-hydrolyzable peptide analogs of such residues can be
generated using benzodiazepine (e.g., see Freidinger et al. in
Peptides: Chemistry and Biology, G. R. Marshall ed., ESCOM
Publisher: Leiden, Netherlands, 1988), azepine (e.g., see Huffman
et al. in Peptides: Chemistry and Biology, G. R. Marshall ed.,
ESCOM Publisher: Leiden, Netherlands, 1988), substituted gamma
lactam rings (Garvey et al. in Peptides: Chemistry and Biology, G.
R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988),
keto-methylene pseudopeptides (Ewenson et al. (1986) J Med Chem
29:295; and Ewenson et al. in Peptides: Structure and Function
(Proceedings of the 9th American Peptide Symposium) Pierce Chemical
Co. Rockland, Ill., 1985), β-turn dipeptide cores (Nagai et al.
(1985) Tetrahedron Lett 26:647; and Sato et al. (1986) J Chem Soc
Perkin Trans 1:1231), and β-aminoalcohols (Gordon et al. (1985)
Biochem Biophys Res Commun 126:419; and et al. (1986) Biochem
Biophys Res Commun 134:71).
Vaccine Formulations for S. pneumoniae Nucleic Acids and
Polypeptides
[0210] This invention also features vaccine compositions for
protection against infection by S. pneumoniae, a gram-positive
bacterium, or for treatment of S. pneumoniae infection. In one
embodiment, the vaccine compositions contain one
or more immunogenic components such as a surface protein from S.
pneumoniae, or portion thereof, and a pharmaceutically acceptable
carrier. Nucleic acids within the scope of the invention are
exemplified by the nucleic acids of the invention contained in the
Sequence Listing which encode S. pneumoniae surface proteins. Any
nucleic acid encoding an immunogenic S. pneumoniae protein, or
portion thereof, which is capable of expression in a cell, can be
used in the present invention. These vaccines have therapeutic and
prophylactic utilities.
[0211] One aspect of the invention provides a vaccine composition
for protection against infection by S. pneumoniae which contains at
least one immunogenic fragment of an S. pneumoniae protein and a
pharmaceutically acceptable carrier. Preferred fragments include
peptides of at least about 10 amino acid residues in length,
preferably about 10-20 amino acid residues in length, and more
preferably about 12-16 amino acid residues in length.
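As an illustration of how candidate fragments in the preferred
length range might be enumerated from a full-length sequence, the
sketch below tiles a protein with overlapping peptides. The
15-residue length falls within the 12-16 residue range stated above;
the 5-residue step, the function name, and the example sequence are
assumptions made for the example and are not specified in this
disclosure.

```python
# Illustrative sketch only; the name, step size, and example are hypothetical.
def overlapping_peptides(protein, length=15, step=5):
    """Return (1-based start, peptide) pairs tiling the protein.

    A final window is added when needed so the C-terminus is always covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(protein) < length:
        return []
    last_start = len(protein) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(start + 1, protein[start:start + length]) for start in starts]


if __name__ == "__main__":
    example = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMN"  # 32 residues, arbitrary
    for start, peptide in overlapping_peptides(example):
        print(start, peptide)
```

Overlapping windows are used so that an epitope spanning the
boundary between two adjacent fragments is still contained intact in
at least one peptide, provided the epitope is no longer than the
overlap.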
[0212] Immunogenic components of the invention can be obtained, for
example, by screening polypeptides recombinantly produced from the
corresponding fragment of the nucleic acid encoding the full-length
S. pneumoniae protein. In addition, fragments can be chemically
synthesized using techniques known in the art such as conventional
Merrifield solid phase Fmoc or t-Boc chemistry.
[0213] In one embodiment, immunogenic components are identified by
the ability of the peptide to stimulate T cells. Peptides which
stimulate T cells, as determined by, for example, T cell
proliferation or cytokine secretion are defined herein as
comprising at least one T cell epitope. T cell epitopes are
believed to be involved in initiation and perpetuation of the
immune response to the protein allergen which is responsible for
the clinical symptoms of allergy. These T cell epitopes are thought
to trigger early events at the level of the T helper cell by
binding to an appropriate HLA molecule on the surface of an antigen
presenting cell, thereby stimulating the T cell subpopulation with
the relevant T cell receptor for the epitope. These events lead to
T cell proliferation, lymphokine secretion, local inflammatory
reactions, recruitment of additional immune cells to the site of
antigen/T cell interaction, and activation of the B cell cascade,
leading to the production of antibodies. A T cell epitope is the
basic element, or smallest unit of recognition by a T cell
receptor, where the epitope comprises amino acids essential to
receptor recognition (e.g., approximately 6 or 7 amino acid
residues). Amino acid sequences which mimic those of the T cell
epitopes are within the scope of this invention.
[0214] Screening immunogenic components can be accomplished using
one or more of several different assays. For example, in vitro,
peptide T cell stimulatory activity is assayed by contacting a
peptide known or suspected of being immunogenic with an antigen
presenting cell which presents appropriate MHC molecules in a T
cell culture. Presentation of an immunogenic S. pneumoniae peptide
in association with appropriate MHC molecules to T cells in
conjunction with the necessary co-stimulation has the effect of
transmitting a signal to the T cell that induces the production of
increased levels of cytokines, particularly of interleukin-2 and
interleukin-4. The culture supernatant can be obtained and assayed
for interleukin-2 or other known cytokines. For example, any one of
several conventional assays for interleukin-2 can be employed, such
as the assay described in Proc. Natl. Acad. Sci USA, 86: 1333
(1989) the pertinent portions of which are incorporated herein by
reference. A kit for an assay for the production of interferon is
also available from Genzyme Corporation (Cambridge, Mass.).
[0215] Alternatively, a common assay for T cell proliferation
entails measuring tritiated thymidine incorporation. The
proliferation of T cells can be measured in vitro by determining
the amount of .sup.3H-labeled thymidine incorporated into the
replicating DNA of cultured cells. Therefore, the rate of DNA
synthesis and, in turn, the rate of cell division can be
quantified.
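Thymidine incorporation data of this kind are commonly summarized as
a stimulation index, the ratio of mean counts in peptide-stimulated
cultures to mean counts in unstimulated control cultures. The sketch
below shows that arithmetic; the stimulation index is a general
convention rather than a measure defined in this disclosure, and the
function name and the counts-per-minute values are hypothetical.

```python
# Illustrative sketch only; the function name and cpm values are hypothetical.
from statistics import mean


def stimulation_index(stimulated_cpm, control_cpm):
    """Ratio of mean counts per minute: stimulated wells over control wells."""
    control = mean(control_cpm)
    if control <= 0:
        raise ValueError("control counts must have a positive mean")
    return mean(stimulated_cpm) / control


if __name__ == "__main__":
    peptide_wells = [3000, 3300, 2700]   # replicate wells with peptide
    medium_wells = [1000, 900, 1100]     # replicate wells with medium alone
    print(stimulation_index(peptide_wells, medium_wells))
```

A ratio near 1 indicates no proliferation above background; the
cutoff used to call a peptide stimulatory is assay-dependent and is
not fixed here.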
[0216] Vaccine compositions of the invention containing immunogenic
components (e.g., S. pneumoniae polypeptide or fragment thereof or
nucleic acid encoding an S. pneumoniae polypeptide or fragment
thereof) preferably include a pharmaceutically acceptable carrier.
The term "pharmaceutically acceptable carrier" refers to a carrier
that does not cause an allergic reaction or other untoward effect
in patients to whom it is administered. Suitable pharmaceutically
acceptable carriers include, for example, one or more of water,
saline, phosphate buffered saline, dextrose, glycerol, ethanol and
the like, as well as combinations thereof. Pharmaceutically
acceptable carriers may further comprise minor amounts of auxiliary
substances such as wetting or emulsifying agents, preservatives or
buffers, which enhance the shelf life or effectiveness of the
antibody. For vaccines of the invention containing S. pneumoniae
polypeptides, the polypeptide is co-administered with a suitable
adjuvant.
[0217] It will be apparent to those of skill in the art that the
therapeutically effective amount of DNA or protein of this
invention will depend, inter alia, upon the administration
schedule, the unit dose of antibody administered, whether the
protein or DNA is administered in combination with other
therapeutic agents, the immune status and health of the patient,
and the therapeutic activity of the particular protein or DNA.
[0218] Vaccine compositions are conventionally administered
parenterally, e.g., by injection, either subcutaneously or
intramuscularly. Methods for intramuscular immunization are
described by Wolff et al. (1990) Science 247: 1465-1468 and by
Sedegah et al. (1994) Immunology 91: 9866-9870. Other modes of
administration include oral and pulmonary formulations,
suppositories, and transdermal applications. Oral immunization is
preferred over parenteral methods for inducing protection against
infection by S. pneumoniae (Cain et al. (1993) Vaccine 11:
637-642). Oral formulations include such normally employed
excipients as, for example, pharmaceutical grades of mannitol,
lactose, starch, magnesium stearate, sodium saccharin, cellulose,
magnesium carbonate, and the like.
[0219] The vaccine compositions of the invention can include an
adjuvant, including, but not limited to aluminum hydroxide;
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP);
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred
to as nor-MDP);
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(CGP 19835A, referred to as MTP-PE); RIBI, which contains three
components from bacteria (monophosphoryl lipid A, trehalose
dimycolate, and cell wall skeleton; MPL+TDM+CWS) in a 2%
squalene/Tween 80 emulsion; and cholera toxin. Others which may be
used are non-toxic derivatives of cholera toxin, including its B
subunit, and/or conjugates or genetically engineered fusions of the
S. pneumoniae polypeptide with cholera toxin or its B subunit,
procholeragenoid, fungal polysaccharides, including schizophyllan,
muramyl dipeptide, muramyl dipeptide derivatives, phorbol esters,
labile toxin of E. coli, non-S. pneumoniae bacterial lysates, block
polymers or saponins.
[0220] Other suitable delivery methods include biodegradable
microcapsules or immuno-stimulating complexes (ISCOMs), cochleates,
or liposomes, genetically engineered attenuated live vectors such
as viruses or bacteria, and recombinant (chimeric) virus-like
particles, e.g., bluetongue. The amount of adjuvant employed will
depend on the type of adjuvant used. For example, when the mucosal
adjuvant is cholera toxin, it is suitably used in an amount of 5 mg
to 50 mg, for example 10 mg to 35 mg. When used in the form of
microcapsules, the amount used will depend on the amount employed
in the matrix of the microcapsule to achieve the desired dosage.
The determination of this amount is within the skill of a person of
ordinary skill in the art.
[0221] Carrier systems in humans may include enteric release
capsules protecting the antigen from the acidic environment of the
stomach, and including S. pneumoniae polypeptide in an insoluble
form as fusion proteins. Suitable carriers for the vaccines of the
invention are enteric coated capsules and polylactide-glycolide
microspheres. Suitable diluents are 0.2 N NaHCO3 and/or saline.
[0222] Vaccines of the invention can be administered as a primary
prophylactic agent in adults or in children, as a secondary
prevention, after successful eradication of S. pneumoniae in an
infected host, or as a therapeutic agent with the aim of inducing an
immune response in a susceptible host to prevent infection by S.
pneumoniae. The vaccines of the invention are administered in
amounts readily determined by persons of ordinary skill in the art.
Thus, for adults a suitable dosage will be in the range of 10 mg to
10 g, preferably 10 mg to 100 mg. A suitable dosage for adults will
also be in the range of 5 mg to 500 mg. Similar dosage ranges will
be applicable for children. Those skilled in the art will recognize
that the optimal dose may be more or less depending upon the
patient's body weight, disease, the route of administration, and
other factors. Those skilled in the art will also recognize that
appropriate dosage levels can be obtained based on results with
known oral vaccines such as, for example, a vaccine based on an E.
coli lysate (6 mg dose daily up to total of 540 mg) and with an
enterotoxigenic E. coli purified antigen (4 doses of 1 mg)
(Schulman et al., J. Urol. 150:917-921 (1993); Boedecker et al.,
American Gastroenterological Assoc. 999:A-222 (1993)). The number
of doses will depend upon the disease, the formulation, and
efficacy data from clinical trials. Without intending any
limitation as to the course of treatment, the treatment can be
administered over 3 to 8 doses for a primary immunization schedule
over 1 month (Boedeker, American Gastroenterological Assoc.
888:A-222 (1993)).
[0223] In a preferred embodiment, a vaccine composition of the
invention can be based on a killed whole E. coli preparation with
an immunogenic fragment of an S. pneumoniae protein of the
invention expressed on its surface or it can be based on an E. coli
lysate, wherein the killed E. coli acts as a carrier or an
adjuvant.
[0224] It will be apparent to those skilled in the art that some of
the vaccine compositions of the invention are useful only for
preventing S. pneumoniae infection, some are useful only for
treating S. pneumoniae infection, and some are useful for both
preventing and treating S. pneumoniae infection. In a preferred
embodiment, the vaccine composition of the invention provides
protection against S. pneumoniae infection by stimulating humoral
and/or cell-mediated immunity against S. pneumoniae. It should be
understood that amelioration of any of the symptoms of S.
pneumoniae infection is a desirable clinical goal, including a
lessening of the dosage of medication used to treat S.
pneumoniae-caused disease, or an increase in the production of
antibodies in the serum or mucus of patients.
Antibodies Reactive with S. pneumoniae Polypeptides
[0225] The invention also includes antibodies specifically reactive
with the subject S. pneumoniae polypeptide.
Anti-protein/anti-peptide antisera or monoclonal antibodies can be
made by standard protocols (See, for example, Antibodies: A
Laboratory Manual ed. by Harlow and Lane (Cold Spring Harbor Press:
1988)). A mammal such as a mouse, a hamster or rabbit can be
immunized with an immunogenic form of the peptide. Techniques for
conferring immunogenicity on a protein or peptide include
conjugation to carriers or other techniques well known in the art.
An immunogenic portion of the subject S. pneumoniae polypeptide can
be administered in the presence of adjuvant. The progress of
immunization can be monitored by detection of antibody titers in
plasma or serum. Standard ELISA or other immunoassays can be used
with the immunogen as antigen to assess the levels of
antibodies.
[0226] In a preferred embodiment, the subject antibodies are
immunospecific for antigenic determinants of the S. pneumoniae
polypeptides of the invention, e.g. antigenic determinants of a
polypeptide of the invention contained in the Sequence Listing, or
a closely related human or non-human mammalian homolog (e.g., 90%
homologous, more preferably at least 95% homologous). In yet a
further preferred embodiment of the invention, the anti-S.
pneumoniae antibodies do not substantially cross react (i.e., react
specifically) with a protein which is, for example, less than 80%
homologous to a sequence of the invention contained in the
Sequence Listing. By "not substantially cross react", it is meant
that the antibody has a binding affinity for a non-homologous
protein which is less than 10 percent, more preferably less than 5
percent, and even more preferably less than 1 percent, of the
binding affinity for a protein of the invention contained in the
Sequence Listing. In a most preferred embodiment, there is no
cross-reactivity between bacterial and mammalian antigens.
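By way of non-limiting illustration, the cross-reactivity thresholds recited above can be expressed as a short Python sketch. The function name, the tier labels, and the use of association constants (larger value meaning tighter binding) as the measure of binding affinity are assumptions made for illustration only.

```python
def cross_reactivity_tier(affinity_other, affinity_target):
    """Classify an antibody by its relative affinity for a non-homologous protein.

    Both arguments are assumed to be association constants in the same units,
    so that a larger number means tighter binding (illustrative assumption).
    """
    if affinity_target <= 0 or affinity_other < 0:
        raise ValueError("affinities must be non-negative and the target affinity positive")
    percent = 100.0 * affinity_other / affinity_target
    if percent < 1.0:
        return "less than 1 percent"
    if percent < 5.0:
        return "less than 5 percent"
    if percent < 10.0:
        return "less than 10 percent"
    return "substantially cross-reactive"


# Hypothetical example: target affinity 1e9, off-target affinity 2e6 (0.2 percent).
print(cross_reactivity_tier(2e6, 1e9))
```

Any antibody falling in one of the first three tiers would satisfy the "not substantially cross react" definition given above, with the lower tiers being progressively more preferred.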
[0227] The term antibody as used herein is intended to include
fragments thereof which are also specifically reactive with S.
pneumoniae polypeptides. Antibodies can be fragmented using
conventional techniques and the fragments screened for utility in
the same manner as described above for whole antibodies. For
example, F(ab').sub.2 fragments can be generated by treating
antibody with pepsin. The resulting F(ab').sub.2 fragment can be
treated to reduce disulfide bridges to produce Fab' fragments. The
antibody of the invention is further intended to include bispecific
and chimeric molecules having an anti-S. pneumoniae portion.
[0228] Both monoclonal and polyclonal antibodies (Ab) directed
against S. pneumoniae polypeptides or S. pneumoniae polypeptide
variants, and antibody fragments such as Fab' and F(ab').sub.2, can
be used to block the action of an S. pneumoniae polypeptide, for
example by microinjection of anti-S. pneumoniae polypeptide
antibodies of the present invention, and thereby allow the study of
the role of a particular S. pneumoniae polypeptide of the invention
in aberrant or unwanted intracellular signaling, as well as in the
normal cellular function of S. pneumoniae.
[0229] Antibodies which specifically bind S. pneumoniae epitopes
can also be used in immunohistochemical staining of tissue samples
in order to evaluate the abundance and pattern of expression of S.
pneumoniae antigens. Anti-S. pneumoniae polypeptide antibodies can
be used diagnostically in immuno-precipitation and immuno-blotting
to detect and evaluate S. pneumoniae levels in tissue or bodily
fluid as part of a clinical testing procedure. Likewise, the
ability to monitor S. pneumoniae polypeptide levels in an
individual can allow determination of the efficacy of a given
treatment regimen for an individual afflicted with such a disorder.
The level of an S. pneumoniae polypeptide can be measured in cells
found in bodily fluid, such as in urine samples or can be measured
in tissue, such as produced by gastric biopsy. Diagnostic assays
using anti-S. pneumoniae antibodies can include, for example,
immunoassays designed to aid in early diagnosis of S. pneumoniae
infections. The present invention can also be used as a method of
detecting antibodies contained in samples from individuals infected
by this bacterium using specific S. pneumoniae antigens.
[0230] Another application of anti-S. pneumoniae polypeptide
antibodies of the invention is in the immunological screening of
cDNA libraries constructed in expression vectors such as λgt11,
λgt18-23, λZAP, and λORF8. Messenger libraries of this type, having
coding sequences inserted in the correct reading frame and
orientation, can produce fusion proteins. For instance, λgt11 will
produce fusion proteins whose amino termini consist of
β-galactosidase amino acid sequences and whose carboxy termini
consist of a foreign polypeptide. Antigenic epitopes of a subject
S. pneumoniae polypeptide can then be detected with antibodies, as,
for example, by reacting nitrocellulose filters lifted from infected
plates with anti-S. pneumoniae polypeptide antibodies. Phage,
scored by this assay, can then be isolated from the infected plate.
Thus, the presence of S. pneumoniae gene homologs can be detected
and cloned from other species, and alternate isoforms (including
splicing variants) can be detected and cloned.
Kits Containing Nucleic Acids, Polypeptides or Antibodies of the
Invention
[0231] The nucleic acid, polypeptides and antibodies of the
invention can be combined with other reagents and articles to form
kits. Kits for diagnostic purposes typically comprise the nucleic
acid, polypeptides or antibodies in vials or other suitable
vessels. Kits typically comprise other reagents for performing
hybridization reactions, polymerase chain reactions (PCR), or for
reconstitution of lyophilized components, such as aqueous media,
salts, buffers, and the like. Kits may also comprise reagents for
sample processing such as detergents, chaotropic salts and the
like. Kits may also comprise immobilization means such as
particles, supports, wells, dipsticks and the like. Kits may also
comprise labeling means such as dyes, developing reagents,
radioisotopes, fluorescent agents, luminescent or chemiluminescent
agents, enzymes, intercalating agents and the like. With the
nucleic acid and amino acid sequence information provided herein,
individuals skilled in the art can readily assemble kits to serve their
particular purpose. Kits further can include instructions for
use.
Bio Chips and Microarrays
[0232] The nucleic acid sequence of the present invention may be
used to detect nucleic acid sequence of S. pneumoniae or other
species of Streptococcus using bio chip technology. Bio chips containing arrays of
nucleic acid sequence can also be used to measure expression of
genes of S. pneumoniae or other species of Streptococcus. For
example, to diagnose a patient with a S. pneumoniae or other
Streptococcus infection, a sample from a human or animal can be
used as a probe on a bio chip containing an array of nucleic acid
sequence from the present invention. In addition, a sample from a
disease state can be compared to a sample from a non-disease state
which would help identify a gene that is up-regulated or expressed
in the disease state. This would provide valuable insight as to the
mechanism by which the disease manifests. Changes in gene
expression can also be used to identify critical pathways involved
in drug transport or metabolism, and may enable the identification
of novel targets involved in virulence or host cell interactions
involved in maintenance of an infection. Procedures using such
techniques have been described by Brown et al., 1995, Science 270:
467-470.
[0233] Bio chips can also be used to monitor genetic changes
associated with potential therapeutic compounds, including
deletions, insertions or mismatches. Once the therapeutic is
administered to the patient, changes to the genetic sequence can be
evaluated to assess its efficacy. In
addition, the nucleic acid sequence of the present invention can be
used to determine essential genes in cell cycling. As described in
Iyer et al., 1999 (Science, 283:83-87) genes essential in the cell
cycle can be identified using bio chips. Furthermore, the present
invention provides nucleic acid sequence which can be used with bio
chip technology to understand regulatory networks in bacteria,
measure the response to environmental signals or drugs as in drug
screening, and study virulence induction (Mons et al., 1998,
Nature Biotechnology, 16: 45-48). Patents teaching this technology
include U.S. Pat. Nos. 5,445,934, 5,744,305, and 5,800,992.
Drug Screening Assays Using S. pneumoniae Polypeptides
[0234] By making available purified and recombinant S. pneumoniae
polypeptides, the present invention provides assays which can be
used to screen for drugs which are either agonists or antagonists
of the normal cellular function, in this case, of the subject S.
pneumoniae polypeptides, or of their role in intracellular
signaling. Such inhibitors or potentiators may be useful as new
therapeutic agents to combat S. pneumoniae infections in humans. A
variety of assay formats will suffice and, in light of the present
inventions, will be comprehended by the skilled artisan.
[0235] In many drug screening programs which test libraries of
compounds and natural extracts, high throughput assays are
desirable in order to maximize the number of compounds surveyed in
a given period of time. Assays which are performed in cell-free
systems, such as may be derived with purified or semi-purified
proteins, are often preferred as "primary" screens in that they can
be generated to permit rapid development and relatively easy
detection of an alteration in a molecular target which is mediated
by a test compound. Moreover, the effects of cellular toxicity
and/or bioavailability of the test compound can be generally
ignored in the in vitro system, the assay instead being focused
primarily on the effect of the drug on the molecular target as may
be manifest in an alteration of binding affinity with other
proteins or change in enzymatic properties of the molecular target.
Accordingly, in an exemplary screening assay of the present
invention, the compound of interest is contacted with an isolated
and purified S. pneumoniae polypeptide.
[0236] Screening assays can be constructed in vitro with a purified
S. pneumoniae polypeptide or fragment thereof, such as an S.
pneumoniae polypeptide having enzymatic activity, such that the
activity of the polypeptide produces a detectable reaction product.
The efficacy of the compound can be assessed by generating dose
response curves from data obtained using various concentrations of
the test compound. Moreover, a control assay can also be performed
to provide a baseline for comparison. Suitable products include,
for example, those with distinctive absorption, fluorescence, or
chemiluminescence properties, because their detection may
be easily automated. A variety of synthetic or naturally occurring
compounds can be tested in the assay to identify those which
inhibit or potentiate the activity of the S. pneumoniae
polypeptide. Some of these active compounds may directly, or with
chemical alterations to promote membrane permeability or
solubility, also inhibit or potentiate the same activity (e.g.,
enzymatic activity) in whole, live S. pneumoniae cells.
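By way of non-limiting illustration, the generation of a dose response curve from such an assay can be sketched in Python as follows. The sketch assumes that the reaction product is read as a signal, that a no-compound control provides the baseline, and that a half-maximal inhibitory concentration is estimated by linear interpolation on a logarithmic concentration axis. None of these choices, nor the function names, is prescribed by the present disclosure.

```python
import math


def percent_inhibition(signal, control_signal, background=0.0):
    """Inhibition relative to the no-compound control, in percent."""
    return 100.0 * (1.0 - (signal - background) / (control_signal - background))


def ic50_by_interpolation(concentrations, inhibitions):
    """Estimate the concentration giving 50 percent inhibition.

    Interpolates linearly in log10(concentration) between the two tested
    concentrations that bracket 50 percent. Returns None if 50 percent is
    never bracketed by the data.
    """
    points = sorted(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi != i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical readings at three concentrations of a test compound.
concs = [1.0, 10.0, 100.0]
inhib = [percent_inhibition(s, control_signal=100.0) for s in (90.0, 50.0, 10.0)]
print(ic50_by_interpolation(concs, inhib))
```

A fitted sigmoidal model would normally be preferred where enough concentrations are tested; interpolation is used here only to keep the sketch free of external libraries.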
[0237] Overexpression Assays
[0238] Overexpression assays are based on the premise that
overproduction of a protein would lead to a higher level of
resistance to compounds that selectively interfere with the
function of that protein. Overexpression assays may be used to
identify compounds that interfere with the function of virtually
any type of protein, including without limitation enzymes,
receptors, DNA- or RNA-binding proteins, or any proteins that are
directly or indirectly involved in regulating cell growth.
[0239] Typically, two bacterial strains are constructed. One
contains a single copy of the gene of interest, and a second
contains several copies of the same gene. Identification of useful
inhibitory compounds using this type of assay is based on a comparison
of the activity of a test compound in inhibiting growth and/or
viability of the two strains. The method involves constructing a
nucleic acid vector that directs high level expression of a
particular target nucleic acid. The vectors are then transformed
into host cells in single or multiple copies to produce strains
that express low to moderate and high levels of protein encoded by
the target sequence (strain A and B, respectively). Nucleic acid
comprising sequences encoding the target gene can, of course, be
directly integrated into the host cell.
[0240] Large numbers of compounds (or crude substances which may
contain active compounds) are screened for their effect on the
growth of the two strains. Agents which interfere with an unrelated
target equally inhibit the growth of both strains. Agents which
interfere with the function of the target at high concentration
should inhibit the growth of both strains. It should be possible,
however, to titrate out the inhibitory effect of the compound in
the overexpressing strain. That is, if the compound is affecting
the particular target that is being tested, it should be possible
to inhibit the growth of strain A at a concentration of the
compound that allows strain B to grow.
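By way of non-limiting illustration, the comparison of strain A and strain B can be sketched in Python as follows. The sketch assumes that each compound is summarized by a minimum inhibitory concentration (MIC) for each strain and that a four-fold shift is taken as evidence of target-specific titration. Both the use of MIC values and the four-fold threshold are illustrative assumptions rather than requirements of the assay.

```python
def fold_shift(mic_strain_a, mic_strain_b):
    """Ratio of the MIC in the overexpressing strain (B) to the MIC in strain A."""
    if mic_strain_a <= 0 or mic_strain_b <= 0:
        raise ValueError("MIC values must be positive")
    return mic_strain_b / mic_strain_a


def is_target_specific(mic_strain_a, mic_strain_b, min_fold_shift=4.0):
    """True if overexpression of the target titrates out the inhibitory effect."""
    return fold_shift(mic_strain_a, mic_strain_b) >= min_fold_shift


def screen(compounds, min_fold_shift=4.0):
    """Return names of compounds whose MIC rises in strain B.

    `compounds` maps a compound name to a (MIC in strain A, MIC in strain B) pair.
    """
    return [
        name
        for name, (mic_a, mic_b) in compounds.items()
        if is_target_specific(mic_a, mic_b, min_fold_shift)
    ]


# Hypothetical screening data, in micrograms per milliliter.
data = {
    "compound 1": (2.0, 2.0),    # equal inhibition: unrelated target
    "compound 2": (1.0, 16.0),   # strain B grows where strain A does not
}
print(screen(data))
```

A compound such as the hypothetical "compound 2" inhibits strain A at a concentration that still allows strain B to grow, which is the behavior expected of an agent acting on the overexpressed target.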
[0241] Alternatively, a bacterial strain is constructed that
contains the gene of interest under the control of an inducible
promoter. Identification of useful inhibitory agents using this
type of assay is based on a comparison of the activity of a test
compound in inhibiting growth and/or viability of this strain under
both inducing and non-inducing conditions. The method involves
constructing a nucleic acid vector that directs high-level
expression of a particular target nucleic acid. The vector is then
transformed into host cells that are grown under both non-inducing
and inducing conditions (conditions A and B, respectively).
[0242] Large numbers of compounds (or crude substances which may
contain active compounds) are screened for their effect on growth
under these two conditions. Agents that interfere with the function
of the target should inhibit growth under both conditions. It
should be possible, however, to titrate out the inhibitory effect
of the compound in the overexpressing strain. That is, if the
compound is affecting the particular target that is being tested,
it should be possible to inhibit growth under condition A at a
concentration that allows the strain to grow under condition B.
[0243] Ligand-Binding Assays
[0244] Many of the targets according to the invention have
functions that have not yet been identified. Ligand-binding assays
are useful to identify inhibitor compounds that interfere with the
function of a particular target, even when that function is
unknown. These assays are designed to detect binding of test
compounds to particular targets. The detection may involve direct
measurement of binding. Alternatively, indirect indications of
binding may involve stabilization of protein structure or
disruption of a biological function. Non-limiting examples of
useful ligand-binding assays are detailed below.
[0245] A useful method for the detection and isolation of binding
proteins is the Biomolecular Interaction Assay (BIAcore) system
developed by Pharmacia Biosensor and described in the
manufacturer's protocol (LKB Pharmacia, Sweden). The BIAcore system
uses an affinity purified anti-GST antibody to immobilize
GST-fusion proteins onto a sensor chip. The sensor utilizes surface
plasmon resonance which is an optical phenomenon that detects
changes in refractive indices. In accordance with the practice of
the invention, a protein of interest is coated onto a chip and test
compounds are passed over the chip. Binding is detected by a change
in the refractive index (surface plasmon resonance).
[0246] A different type of ligand-binding assay involves
scintillation proximity assays (SPA, described in U.S. Pat. No.
4,568,649).
[0247] Another type of ligand binding assay, also undergoing
development, is based on the fact that proteins containing
mitochondrial targeting signals are imported into isolated
mitochondria in vitro (Hurt et al., 1985, Embo J. 4:2061-2068;
Eilers and Schatz, Nature, 1986, 322:228-231). In a mitochondrial
import assay, expression vectors are constructed in which nucleic
acids encoding particular target proteins are inserted downstream
of sequences encoding mitochondrial import signals. The chimeric
proteins are synthesized and tested for their ability to be
imported into isolated mitochondria in the absence and presence of
test compounds. A test compound that binds to the target protein
should inhibit its uptake into isolated mitochondria in vitro.
[0248] Another ligand-binding assay is the yeast two-hybrid system
(Fields and Song, 1989, Nature 340:245-246). The yeast two-hybrid
system takes advantage of the properties of the GAL4 protein of the
yeast Saccharomyces cerevisiae. The GAL4 protein is a
transcriptional activator required for the expression of genes
encoding enzymes of galactose utilization. This protein consists of
two separable and functionally essential domains: an N-terminal
domain which binds to specific DNA sequences (UAS.sub.G); and a
C-terminal domain containing acidic regions, which is necessary to
activate transcription. The native GAL4 protein, containing both
domains, is a potent activator of transcription when yeast are
grown on galactose media. The N-terminal domain binds to DNA in a
sequence-specific manner but is unable to activate transcription.
The C-terminal domain contains the activating regions but cannot
activate transcription because it fails to be localized to
UAS.sub.G. The two-hybrid system uses two hybrid
proteins containing parts of GAL4: (1) a GAL4 DNA-binding domain
fused to a protein `X` and (2) a GAL4 activation region fused to a
protein `Y`. If X and Y can form a protein-protein complex and
reconstitute proximity of the GAL4 domains, transcription of a gene
regulated by UAS.sub.G occurs. Creation of two hybrid proteins,
each containing one of the interacting proteins X and Y, allows the
activation region of GAL4 to be brought to its normal site of
action.
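By way of non-limiting illustration, the logic of the two-hybrid readout can be modeled in Python as follows. The class and function names are hypothetical, and the model reduces the biology to a single rule: the reporter is transcribed only when a protein fused to the DNA-binding domain interacts with a protein fused to the activation region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hybrid:
    gal4_part: str       # "DBD" (DNA-binding domain) or "AD" (activation region)
    fused_protein: str   # name of the protein fused to that GAL4 part


def reporter_transcribed(hybrids, interacting_pairs):
    """Return True if any DBD fusion partner interacts with any AD fusion partner.

    `interacting_pairs` is a set of frozensets, each holding the names of two
    proteins that form a protein-protein complex.
    """
    dbd_partners = {h.fused_protein for h in hybrids if h.gal4_part == "DBD"}
    ad_partners = {h.fused_protein for h in hybrids if h.gal4_part == "AD"}
    return any(
        frozenset((x, y)) in interacting_pairs
        for x in dbd_partners
        for y in ad_partners
    )


pairs = {frozenset(("X", "Y"))}
both = [Hybrid("DBD", "X"), Hybrid("AD", "Y")]
print(reporter_transcribed(both, pairs))
```

In a screening context, a test compound that disrupts the X-Y complex would correspond to removing the pair from `interacting_pairs`, with loss of reporter transcription as the readout.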
[0249] The binding assay described in Fodor et al., 1991, Science
251:767-773, which involves testing the binding affinity of test
compounds for a plurality of defined polymers synthesized on a
solid substrate, may also be useful.
[0250] Compounds which bind to the polypeptides of the invention
are potentially useful as antibacterial agents for use in
therapeutic compositions.
[0251] Pharmaceutical formulations suitable for antibacterial
therapy comprise the antibacterial agent in conjunction with one or
more biologically acceptable carriers. Suitable biologically
acceptable carriers include, but are not limited to,
phosphate-buffered saline, saline, deionized water, or the like.
Preferred biologically acceptable carriers are physiologically or
pharmaceutically acceptable carriers.
[0252] The antibacterial compositions include an antibacterial
effective amount of active agent. Antibacterial effective amounts
are those quantities of the antibacterial agents of the present
invention that afford prophylactic protection against bacterial
infections or which result in amelioration or cure of an existing
bacterial infection. This antibacterial effective amount will
depend upon the agent, the location and nature of the infection,
and the particular host. The amount can be determined by
experimentation known in the art, such as by establishing a matrix
of dosages and frequencies and comparing a group of experimental
units or subjects to each point in the matrix.
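By way of non-limiting illustration, a matrix of dosages and frequencies of the kind referred to above can be laid out in Python as follows. The dose levels, frequencies, subject identifiers, and the round-robin assignment are all hypothetical placeholders, and the sketch addresses only the bookkeeping of the experimental design.

```python
from itertools import product


def dosing_matrix(doses_mg, doses_per_day):
    """Enumerate every (dose, frequency) point in the matrix."""
    return [
        {"dose_mg": dose, "per_day": freq, "daily_total_mg": dose * freq}
        for dose, freq in product(doses_mg, doses_per_day)
    ]


def assign_groups(matrix, subject_ids):
    """Distribute subjects over the matrix points in round-robin order."""
    groups = {index: [] for index in range(len(matrix))}
    for n, subject in enumerate(subject_ids):
        groups[n % len(matrix)].append(subject)
    return groups


matrix = dosing_matrix(doses_mg=[5, 50, 500], doses_per_day=[1, 2])
groups = assign_groups(matrix, subject_ids=range(12))
for index, point in enumerate(matrix):
    print(point, groups[index])
```

Each point in the matrix thus receives its own group of experimental units, and the outcomes of the groups can then be compared to identify an effective amount.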
[0253] The antibacterial active agents or compositions can be
formed into dosage unit forms, such as for example, creams,
ointments, lotions, powders, liquids, tablets, capsules,
suppositories, sprays, aerosols or the like. If the antibacterial
composition is formulated into a dosage unit form, the dosage unit
form may contain an antibacterial effective amount of active agent.
Alternatively, the dosage unit form may include less than such an
amount if multiple dosage unit forms or multiple dosages are to be
used to administer a total dosage of the active agent. Dosage unit
forms can include, in addition, one or more excipient(s),
diluent(s), disintegrant(s), lubricant(s), plasticizer(s),
colorant(s), dosage vehicle(s), absorption enhancer(s),
stabilizer(s), bactericide(s), or the like.
[0254] For general information concerning formulations, see, e.g.,
Gilman et al. (eds.), 1990, Goodman and Gilman's: The
Pharmacological Basis of Therapeutics, 8th ed., Pergamon Press; and
Remington's Pharmaceutical Sciences, 17th ed., 1990, Mack
Publishing Co., Easton, Pa.; Avis et al. (eds.), 1993,
Pharmaceutical Dosage Forms: Parenteral Medications, Dekker, New
York; Lieberman et al (eds.), 1990, Pharmaceutical Dosage Forms:
Disperse Systems, Dekker, New York.
[0255] The antibacterial agents and compositions of the present
invention are useful for preventing or treating S. pneumoniae
infections. Infection prevention methods incorporate a
prophylactically effective amount of an antibacterial agent or
composition. A prophylactically effective amount is an amount
effective to prevent S. pneumoniae infection and will depend upon
the specific bacterial strain, the agent, and the host. These
amounts can be determined experimentally by methods known in the
art and as described above.
[0256] S. pneumoniae infection treatment methods incorporate a
therapeutically effective amount of an antibacterial agent or
composition. A therapeutically effective amount is an amount
sufficient to ameliorate or eliminate the infection. The
prophylactically and/or therapeutically effective amounts can be
administered in one administration or over repeated
administrations. Therapeutic administration can be followed by
prophylactic administration, once the initial bacterial infection
has been resolved.
[0257] The antibacterial agents and compositions can be
administered topically or systemically. Topical application is
typically achieved by administration of creams, ointments, lotions,
or sprays as described above. Systemic administration includes both
oral and parenteral routes. Parenteral routes include, without
limitation, subcutaneous, intramuscular, intraperitoneal,
intravenous, transdermal, inhalation and intranasal
administration.
EXEMPLIFICATION
I. Cloning and Sequencing of S. pneumoniae DNA
[0258] S. pneumoniae chromosomal DNA was isolated according to a
basic DNA protocol outlined in Schleif R. F. and Wensink P. C.,
Practical Methods in Molecular Biology, p. 98, Springer-Verlag,
N.Y., 1981, with minor modifications. Briefly, cells were pelleted,
resuspended in TE (10 mM Tris, 1 mM EDTA, pH 7.6) and GES lysis
buffer (5.1 M guanidinium thiocyanate, 0.1 M EDTA, pH 8.0, 0.5%
N-laurylsarcosine) was added. The suspension was chilled and ammonium
acetate (NH4Ac) was added to a final concentration of 2.0 M. DNA was
extracted, first with chloroform, then with phenol-chloroform, and
reextracted with chloroform. DNA was precipitated with isopropanol,
washed twice with 70% EtOH, dried and resuspended in TE.
[0259] Following isolation, whole genomic S. pneumoniae DNA was
nebulized (Bodenteich et al., Automated DNA Sequencing and Analysis
(J. C. Venter, ed.), Academic Press, 1994) to a median size of 2000
bp. After nebulization, the DNA was concentrated and separated on a
standard 1% agarose gel. Several fractions, corresponding to
approximate sizes 1000-1500 bp, 1500-2000 bp, 2000-2500 bp,
2500-3000 bp, were excised from the gel and purified by the
GeneClean procedure (Bio101, Inc.).
[0260] The purified DNA fragments were then blunt-ended using T4
DNA polymerase. The healed DNA was then ligated to unique
BstXI-linker adapters (5' GTCTTCACCACGGGG (SEQ ID NO: 5323) and 5'
GTGGTGAAGAC (SEQ ID NO: 5324) in 100-1000 fold molar excess). These
linkers are complementary to the BstXI-cut pMPX vectors, while the
overhang is not self-complementary. Therefore, the linkers will not
concatemerize nor will the cut-vector religate itself easily. The
linker-adapted inserts were separated from the unincorporated
linkers on a 1% agarose gel and purified using GeneClean. The
linker-adapted inserts were then ligated to each of 20 pMPX vectors
to construct a series of "shotgun" subclone libraries. Blunt ended
vector was used for cloning into the pUC19 vector. The vectors
contain an out-of-frame lacZ gene at the cloning site which becomes
in-frame in the event that an adapter-dimer is cloned, allowing
these to be avoided by their blue color.
[0261] All subsequent steps were based either on the multiplex DNA
sequencing protocols outlined in Church G. M. and Kieffer-Higgins
S., Science 240:185-188, 1988 or on ABI377 automated DNA sequencing
methods. Only major modifications to the protocols are highlighted.
Briefly, each of the 20 vectors was then transformed into DH5α
competent cells (Gibco/BRL, DH5α transformation protocol). The
libraries were assessed by plating onto antibiotic plates
containing ampicillin, methicillin and IPTG/Xgal. The plates were
incubated overnight at 37° C. Successful transformants were
then used for plating of clones and pooling into the multiplex
pools. The clones were picked and pooled into 40 ml growth medium
cultures. The cultures were grown overnight at 37° C. DNA
was purified using the Qiagen Midi-prep kits and Tip-100 columns
(Qiagen, Inc.). In this manner, 100 µg of DNA was obtained per
pool.
[0262] These purified DNA samples were then sequenced either using
the multiplex DNA sequencing based on chemical degradation methods
(Church G. M. and Kieffer-Higgins S., Science 240:185-188, 1988) or
by SequiTherm (Epicentre Technologies) dideoxy sequencing protocols
or by ABI dye-terminator chemistry. For the multiplex portion the
sequencing reactions were electrophoresed and transferred onto
nylon membranes by direct transfer electrophoresis from 40 cm gels
(Richterich P. and Church G. M., Methods in Enzymology 218:187-222,
1993). The DNA was covalently bound to the membranes by exposure to
ultraviolet light, and hybridized with labeled oligonucleotides
complementary to tag sequences on the vectors (Church, supra). The
membranes were washed to rinse off non-specifically bound probe,
and exposed to X-ray film to visualize individual sequence ladders.
After autoradiography, the hybridized probe was removed by
incubation at 65° C., and the hybridization cycle repeated
with another tag sequence until the membrane had been probed 41
times. Thus, each gel produced a large number of films, each
containing new sequencing information. Whenever a new blot was
processed, it was initially probed for an internal standard
sequence added to each of the pools. Digital images of the films
were generated using a laser-scanning densitometer (Molecular
Dynamics, Sunnyvale, Calif.). The digitized images were processed
on computer workstations (VaxStation 4000's) using the program
REPLICA.TM. (Church et al., Automated DNA Sequencing and Analysis
(J. C. Venter, ed.), Academic Press, 1994). Image processing
included lane straightening, contrast adjustment to smooth out
intensity differences, and resolution enhancement by iterative
Gaussian deconvolution. The sequences were then converted to an SCF
format so that processing and assembly could proceed on UNIX
machines. The ABI dye terminator sequence reads were run on ABI377
machines and the data was directly transferred to UNIX machines
following lane tracking of the gels. All multiplex and ABI reads
were assembled using PHRAP (P. Green, Abstracts of DOE Human Genome
Program Contractor-Grantee Workshop V, January 1996, p. 157) with
default parameters and not using quality scores. The initial
assembly was done at 7-fold coverage and yielded 511 contigs. Short
read length fragments of 200 bp or less found on the ends of
contigs facing in the appropriate direction were used to extend off
the end of the contigs. These reads were then resequenced with
primers using ABI technology to give sequences with a read length
of 500 or more bases. This allowed end extensions to be performed
without ordering new primers. In addition, missing mates (sequences
from clones that only gave one strand reads) were identified and
sequenced with ABI technology to allow the identification of
additional overlapping contigs.
[0263] End-sequencing of randomly picked genomic lambda clones was
also performed. Sequencing on both sides was done for all lambda
sequences. The lambda library backbone helped to verify the
integrity of the assembly and allowed closure of some of the
physical gaps.
[0264] To identify S. pneumoniae polypeptides, the complete genomic
sequence of S. pneumoniae was analyzed essentially as follows:
First, all possible stop-to-stop open reading frames (ORFs) greater
than 180 nucleotides in all six reading frames were translated into
amino acid sequences. Second, the identified ORFs were analyzed for
homology to known (archaebacterial, prokaryotic and eukaryotic)
protein sequences. Third, the predicted coding regions of the
sequences and start codons were evaluated with the programs
GENEMARK.TM. (Borodovsky and McIninch, 1993, Comp. Chem. 17:123)
and Glimmer (Fraser et al, Nature, 1997).
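By way of non-limiting illustration, the first step described above (translation of all stop-to-stop ORFs greater than 180 nucleotides in all six reading frames) can be sketched in Python as follows. The sketch assumes the standard genetic code, treats the ends of the input sequence as ORF boundaries, and uses hypothetical function names. It is not the software actually used in the analysis.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate codon by codon; unrecognized codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def stop_to_stop_orfs(genome, min_nt=180):
    """Return stretches longer than min_nt between stop codons in all six frames."""
    genome = genome.upper()
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = frame  # first base of the current open stretch
            boundaries = [
                pos for pos in range(frame, len(seq) - 2, 3)
                if seq[pos:pos + 3] in STOP_CODONS
            ]
            # The end of the last complete codon also closes a stretch (assumption).
            boundaries.append(frame + 3 * ((len(seq) - frame) // 3))
            for pos in boundaries:
                if pos - start > min_nt:
                    orfs.append({
                        "strand": strand, "frame": frame,
                        "start": start, "end": pos,
                        "protein": translate(seq[start:pos]),
                    })
                start = pos + 3
    return orfs
```

Coordinates reported for the "-" strand refer to the reverse-complemented sequence. The translated products of such a scan would then be passed to the homology search and coding-region evaluation steps described above.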
Identification, Cloning and Expression of S. pneumoniae Nucleic
Acids
[0265] Expression and purification of the S. pneumoniae
polypeptides of the invention can be performed essentially as
outlined below.
[0266] To facilitate the cloning, expression and purification of
membrane and secreted proteins from S. pneumoniae, a gene
expression system, such as the pET System (Novagen), for cloning
and expression of recombinant proteins in E. coli, is selected.
Also, a DNA sequence encoding a peptide tag, the His-Tag, is fused
to the 3' end of DNA sequences of interest in order to facilitate
purification of the recombinant protein products. The 3' end is
selected for fusion in order to avoid alteration of any 5' terminal
signal sequence.
[0267] PCR Amplification and Cloning of Nucleic Acids Containing
ORF's Encoding Enzymes
[0268] Nucleic acids chosen (for example, from the nucleic acids
set forth in SEQ ID NO: 1-SEQ ID NO: 2661) for cloning from the
14453 strain of S. pneumoniae are prepared for amplification
cloning by polymerase chain reaction (PCR). Synthetic
oligonucleotide primers specific for the 5' and 3' ends
of open reading frames (ORFs) are designed and purchased from
GibcoBRL Life Technologies (Gaithersburg, Md., USA). All forward
primers (specific for the 5' end of the sequence) are designed
to include an NcoI cloning site at the extreme 5' terminus.
These primers are designed to permit initiation of protein
translation at a methionine residue followed by a valine residue
and the coding sequence for the remainder of the native S.
pneumoniae DNA sequence. All reverse primers (specific for the
3' end of any S. pneumoniae ORF) include an EcoRI site at the
extreme 5' terminus to permit cloning of each S. pneumoniae
sequence into the reading frame of the pET-28b. The pET-28b vector
provides sequence encoding an additional 20 carboxy-terminal amino
acids including six histidine residues (at the extreme C-terminus),
which comprise the His-Tag.
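By way of non-limiting illustration, the primer design rules described above can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the text: the native start codon is replaced by the ATG contained in the NcoI site (CCATGG), the valine codon is chosen as GTT, the native stop codon is omitted from the reverse primer so that the C-terminal His-Tag can be read through, and 18 nucleotides anneal to the template. Any additional bases needed to keep the insert in frame with the vector are not modeled.

```python
NCOI_SITE = "CCATGG"   # contains the ATG start codon; the trailing G begins codon 2
ECORI_SITE = "GAATTC"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_primers(orf, anneal_len=18):
    """Return (forward, reverse) primers, each written 5' to 3'.

    `orf` is the coding strand from its start codon through its stop codon.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    body = orf[3:]                     # drop the native start codon (assumption)
    if body[-3:] in STOP_CODONS:
        body = body[:-3]               # drop the stop codon (assumption)
    if len(body) < anneal_len:
        raise ValueError("ORF too short for the requested annealing length")
    # CC ATG GTT ... : methionine, then valine, then the native sequence.
    forward = NCOI_SITE + "TT" + body[:anneal_len]
    reverse = ECORI_SITE + reverse_complement(body[-anneal_len:])
    return forward, reverse


# Hypothetical ORF used only to exercise the function.
fwd, rev = design_primers("ATG" + "GCA" * 20 + "TAA")
print(fwd)
print(rev)
```

The choice of valine follows from the NcoI recognition sequence itself: the G that follows ATG within CCATGG becomes the first base of the second codon, and GTN codons encode valine.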
[0269] Genomic DNA prepared from strain 14453 of S. pneumoniae is
used as the source of template DNA for PCR amplification reactions
(Current Protocols in Molecular Biology, John Wiley and Sons, Inc.,
F. Ausubel et al., eds., 1994). To amplify a DNA sequence
containing an S. pneumoniae ORF, genomic DNA (50 nanograms) is
introduced into a reaction vial containing 2 mM MgCl2, 1
micromolar synthetic oligonucleotide primers (forward and reverse
primers) complementary to and flanking a defined S. pneumoniae ORF,
0.2 mM of each deoxynucleotide triphosphate (dATP, dGTP, dCTP and
dTTP) and 2.5 units of heat stable DNA polymerase (Amplitaq, Roche
Molecular Systems, Inc., Branchburg, N.J., USA) in a final volume
of 100 microliters.
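As a worked illustration of the reaction composition above, the volume of each stock solution needed for a 100 microliter reaction follows from C(stock) x V(stock) = C(final) x V(final). The final concentrations below are taken from the text; the stock concentrations are hypothetical values chosen for the example and are not stated in the source.

```python
def stock_volume_ul(stock_conc: float, final_conc: float,
                    final_volume_ul: float) -> float:
    """Volume of stock to add so that C_stock * V_stock = C_final * V_final.

    Both concentrations must be expressed in the same unit.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as "
                         "concentrated as the target")
    return final_conc * final_volume_ul / stock_conc


FINAL_VOLUME_UL = 100.0  # final reaction volume stated in the text

# name -> (hypothetical stock concentration, final concentration from the text)
components = {
    "MgCl2 (mM)": (25.0, 2.0),
    "forward primer (uM)": (20.0, 1.0),
    "reverse primer (uM)": (20.0, 1.0),
    "dNTP mix, each (mM)": (10.0, 0.2),
}

volumes = {
    name: stock_volume_ul(stock, final, FINAL_VOLUME_UL)
    for name, (stock, final) in components.items()
}

for name, vol in volumes.items():
    print(f"{name}: {vol:.1f} ul")
```

The remaining volume would be made up with buffer, template, polymerase and water, none of which are modeled here.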
[0270] Upon completion of thermal cycling reactions, each sample of
amplified DNA is washed and purified using the Qiaquick Spin PCR
purification kit (Qiagen, Gaithersburg, Md., USA). All amplified
DNA samples are subjected to digestion with the restriction
endonucleases, e.g., NcoI and EcoRI (New England BioLabs, Beverly,
Mass., USA) (Current Protocols in Molecular Biology, John Wiley and
Sons, Inc., F. Ausubel et al., eds., 1994). DNA samples are then
subjected to electrophoresis on 1.0% NuSieve (FMC BioProducts,
Rockland, Me., USA) agarose gels. DNA is visualized by exposure to
ethidium bromide and long wave UV irradiation. DNA contained in
slices isolated from the agarose gel is purified using the Bio 101
GeneClean Kit protocol (Bio 101, Vista, Calif., USA).
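Because NcoI and EcoRI cut wherever their recognition sequences occur, an amplified ORF that carries an internal site would be cut within the insert during the digestion described above. The following is a minimal sketch of a screen for such sites; it is an illustrative addition rather than a step recited in the text, and the function name is hypothetical. Both recognition sequences are palindromic, so scanning one strand is sufficient.

```python
SITES = {"NcoI": "CCATGG", "EcoRI": "GAATTC"}


def find_sites(seq: str) -> dict[str, list[int]]:
    """Return the 0-based start positions of each recognition site in seq."""
    seq = seq.upper()
    hits: dict[str, list[int]] = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            # Continue one base further so overlapping matches are not skipped.
            start = seq.find(site, start + 1)
        hits[enzyme] = positions
    return hits


# Invented test sequence: one NcoI site and two EcoRI sites.
print(find_sites("CCATGGAAAGAATTCTTTGAATTC"))
```

For a primer-flanked amplicon, exactly one site per enzyme, each near one end, would be the expected result.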
Cloning of S. pneumoniae Nucleic Acids Into an Expression
Vector
[0271] The pET-28b vector is prepared for cloning by digestion with
endonucleases, e.g., NcoI and EcoRI (Current Protocols in Molecular
Biology, John Wiley and Sons, Inc., F. Ausubel et al., eds., 1994).
The pET-28a vector, which encodes a His-Tag that can be fused to
the 5' end of an inserted gene, is prepared by digestion with
appropriate restriction endonucleases.
[0272] Following digestion, DNA inserts are cloned (Current
Protocols in Molecular Biology, John Wiley and Sons, Inc., F.
Ausubel et al., eds., 1994) into the previously digested pET-28b
expression vector. Products of the ligation reaction are then used
to transform the BL21 strain of E. coli (Current Protocols in
Molecular Biology, John Wiley and Sons, Inc., F. Ausubel et al.,
eds., 1994) as described below.
[0273] Transformation of Competent Bacteria with Recombinant
Plasmids
[0274] Competent bacteria, E. coli strain BL21 or E. coli strain
BL21(DE3), are transformed with recombinant pET expression
plasmids carrying the cloned S. pneumoniae sequences according to
standard methods (Current Protocols in Molecular Biology, John Wiley
and Sons, Inc., F. Ausubel et al., eds., 1994). Briefly, 1 microliter
of ligation reaction is mixed with 50 microliters of
electrocompetent cells and subjected to a high voltage pulse, after
which samples are incubated in 0.45 milliliters SOC medium (0.5%
yeast extract, 2.0% tryptone, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2,
10 mM MgSO4 and 20 mM glucose) at 37° C. with shaking for 1
hour. Samples are then spread on LB agar plates containing 25
microgram/ml kanamycin sulfate for growth overnight. Transformed
colonies of BL21 are then picked and analyzed to evaluate cloned
inserts as described below.
[0275] Identification of Recombinant Expression Vectors with S.
pneumoniae Nucleic Acids
[0276] Individual BL21 clones transformed with recombinant pET-28b
S. pneumoniae ORFs are analyzed by PCR amplification of the cloned
inserts using the same forward and reverse primers, specific for
each S. pneumoniae sequence, that were used in the original PCR
amplification cloning reactions. Successful amplification verifies
the integration of the S. pneumoniae sequences in the expression
vector (Current Protocols in Molecular Biology, John Wiley and
Sons, Inc., F. Ausubel et al., eds., 1994).
[0277] Isolation and Preparation of Nucleic Acids from
Transformants
[0278] Individual clones of recombinant pET-28b vectors carrying
properly cloned S. pneumoniae ORFs are picked and incubated in 5
ml of LB broth plus 25 microgram/ml kanamycin sulfate overnight.
The following day plasmid DNA is isolated and purified using the
Qiagen plasmid purification protocol (Qiagen Inc., Chatsworth,
Calif., USA).
[0279] Expression of Recombinant S. pneumoniae Sequences in E.
coli
[0280] The pET vector can be propagated in any E. coli K-12 strain,
e.g., HMS174, HB101, JM109, DH5, etc., for the purpose of cloning or
plasmid preparation. Hosts for expression include E. coli strains
containing a chromosomal copy of the gene for T7 RNA polymerase.
These hosts are lysogens of bacteriophage DE3, a lambda derivative
that carries the lacI gene, the lacUV5 promoter and the gene for T7
RNA polymerase. T7 RNA polymerase is induced by addition of
isopropyl-beta-D-thiogalactoside (IPTG), and the T7 RNA polymerase
transcribes any target plasmid, such as pET-28b, carrying the gene
of interest. Strains used include: BL21(DE3) (Studier, F. W.,
Rosenberg, A. H., Dunn, J. J., and Dubendorff, J. W. (1990) Meth.
Enzymol. 185, 60-89).
[0281] To express recombinant S. pneumoniae sequences, 50 nanograms
of plasmid DNA isolated as described above is used to transform
competent BL21(DE3) bacteria (provided by Novagen as part of the
pET expression system kit) as described above. The lacZ gene
(beta-galactosidase) is expressed in the pET-System as described
for the S. pneumoniae recombinant constructions. Transformed cells
are cultured in SOC medium for 1 hour, and the culture is then
plated on LB plates containing 25 micrograms/ml kanamycin sulfate.
The following day, bacterial colonies are pooled and grown in LB
medium containing kanamycin sulfate (25 micrograms/ml) to an
optical density at 600 nm of 0.5 to 1.0 O.D. units, at which point
1 millimolar IPTG is added to the culture for 3 hours to induce
gene expression of the S. pneumoniae recombinant DNA
constructions.
[0282] After induction of gene expression with IPTG, bacteria are
pelleted by centrifugation in a Sorvall RC-3B centrifuge at
3500×g for 15 minutes at 4° C. Pellets are resuspended
in 50 milliliters of cold 10 mM Tris-HCl, pH 8.0, 0.1 M NaCl and
0.1 mM EDTA (STE buffer). Cells are then centrifuged at
2000×g for 20 min at 4° C. Wet pellets are weighed and
frozen at -80° C. until ready for protein purification.
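The centrifugation steps above are specified as relative centrifugal force (×g), which must be converted to a rotor speed for a given rotor. A minimal sketch using the standard relation RCF = 1.118e-5 x r x rpm^2 (r in centimeters) is given below. The rotor radius in the usage lines is a hypothetical value chosen for illustration; it is not a specification of the Sorvall RC-3B rotor.

```python
import math

RCF_CONSTANT = 1.118e-5  # RCF = 1.118e-5 * radius_cm * rpm**2


def rcf_to_rpm(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius_cm must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rpm_to_rcf(rpm: float, radius_cm: float) -> float:
    """RCF produced by a given rotor speed at a given radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


ASSUMED_RADIUS_CM = 20.0  # hypothetical rotor radius, not from the source

for rcf in (3500.0, 2000.0):  # the two forces named in the text
    rpm = rcf_to_rpm(rcf, ASSUMED_RADIUS_CM)
    print(f"{rcf:.0f} x g at {ASSUMED_RADIUS_CM} cm: about {rpm:.0f} rpm")
```

The actual maximum radius of the rotor in use should be taken from the manufacturer's documentation.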
[0283] A variety of methodologies known in the art can be utilized
to purify the isolated proteins (Current Protocols in Protein
Science, John Wiley and Sons, Inc., J. E. Coligan et al., eds.,
1995). For example, the frozen cells may be thawed, resuspended in
buffer and ruptured by several passages through a small volume
microfluidizer (Model M-110S, Microfluidics International
Corporation, Newton, Mass.). The resultant homogenate may be
centrifuged to yield a clear supernatant (crude extract) and
following filtration the crude extract may be fractionated over
columns. Fractions may be monitored by absorbance at 280 nm,
and peak fractions may be analyzed by SDS-PAGE.
[0284] The concentrations of purified protein preparations may be
quantified spectrophotometrically using absorbance coefficients
calculated from amino acid content (Perkins, S. J. 1986 Eur. J.
Biochem. 157, 169-180). Protein concentrations are also measured by
the method of Bradford, M. M. (1976) Anal. Biochem. 72, 248-254,
and Lowry, O. H., Rosebrough, N., Farr, A. L. & Randall, R. J.
(1951) J. Biol. Chem. 193, 265-275, using bovine serum
albumin as a standard.
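The spectrophotometric approach above can be illustrated with the Beer-Lambert law, c = A / (epsilon x l), where epsilon is estimated from the numbers of tryptophan, tyrosine and cystine residues. The sketch below uses commonly quoted approximate per-residue coefficients at 280 nm; these are an assumption made for the example and are not the parameters of the Perkins (1986) method cited in the text. The function names and the example sequence are likewise hypothetical.

```python
# Approximate per-residue molar extinction coefficients at 280 nm
# (M^-1 cm^-1). Assumed values for illustration; NOT the Perkins (1986)
# parameters referenced in the text.
EPS_TRP = 5500.0
EPS_TYR = 1490.0
EPS_CYSTINE = 125.0  # per disulfide bond


def molar_extinction_280(protein_seq: str, disulfides: int = 0) -> float:
    """Estimate epsilon at 280 nm from a one-letter amino acid sequence."""
    seq = protein_seq.upper()
    return (seq.count("W") * EPS_TRP
            + seq.count("Y") * EPS_TYR
            + disulfides * EPS_CYSTINE)


def molar_concentration(a280: float, epsilon: float,
                        path_cm: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), in mol per liter."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path_cm must be positive")
    return a280 / (epsilon * path_cm)


eps = molar_extinction_280("MWWYK")  # invented peptide: 2 Trp, 1 Tyr
print(eps)
print(molar_concentration(0.6245, eps))
```

A sequence with no tryptophan or tyrosine gives an epsilon of zero under this estimate, in which case a dye-based assay such as the Bradford or Lowry method named above would be the practical alternative.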
[0285] SDS-polyacrylamide gels of various concentrations may be
purchased from BioRad (Hercules, Calif., USA), and stained with
Coomassie blue. Molecular weight markers may include rabbit
skeletal muscle myosin (200 kDa), E. coli beta-galactosidase (116
kDa), rabbit muscle phosphorylase B (97.4 kDa), bovine serum
albumin (66.2 kDa), ovalbumin (45 kDa), bovine carbonic anhydrase
(31 kDa), soybean trypsin inhibitor (21.5 kDa), egg white lysozyme
(14.4 kDa) and bovine aprotinin (6.5 kDa).
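As a small illustration of how the marker set above might be used, the following sketch looks up which two markers bracket a protein of a given expected mass, which indicates where on the gel its band would be expected. The marker masses are those listed in the text; the function name and the 50 kDa example are hypothetical.

```python
# Molecular weight markers listed in the text, in kDa.
MARKERS_KDA = [
    ("myosin", 200.0),
    ("beta-galactosidase", 116.0),
    ("phosphorylase B", 97.4),
    ("bovine serum albumin", 66.2),
    ("ovalbumin", 45.0),
    ("carbonic anhydrase", 31.0),
    ("trypsin inhibitor", 21.5),
    ("lysozyme", 14.4),
    ("aprotinin", 6.5),
]


def bracketing_markers(mass_kda: float):
    """Return (nearest marker at or above, nearest marker below) the mass.

    Either element is None when the mass lies outside the marker range.
    """
    larger = [m for m in MARKERS_KDA if m[1] >= mass_kda]
    smaller = [m for m in MARKERS_KDA if m[1] < mass_kda]
    above = min(larger, key=lambda m: m[1]) if larger else None
    below = max(smaller, key=lambda m: m[1]) if smaller else None
    return above, below


print(bracketing_markers(50.0))
```

Apparent mass on SDS-PAGE can differ from the calculated mass, so the bracket is a guide rather than a prediction.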
EQUIVALENTS
[0286] While this invention has been particularly shown and
described with references to preferred embodiments thereof, it will
be understood by those skilled in the art that various changes in
form and details may be made therein without departing from the
scope of the invention encompassed by the appended claims.
SEQUENCE LISTING
The patent application contains a lengthy "Sequence Listing"
section. A copy of the
"Sequence Listing" is available in electronic form from the USPTO
web site
(http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US20070099861A1).
An electronic copy of the "Sequence Listing" will also be available
from the USPTO upon request and payment of the fee set forth in 37
CFR 1.19(b)(3).
* * * * *
References