U.S. patent application number 13/758595 was filed with the patent office on 2013-02-04 and published on 2013-12-19 for methods and composition for the identification of antibiotics that are not susceptible to antibiotic resistance.
This patent application is currently assigned to Wayne State University. The applicant listed for this patent is WAYNE STATE UNIVERSITY. Invention is credited to Philip R. Cunningham.
Application Number | 13/758595 |
Publication Number | 20130337544 |
Family ID | 30003301 |
Filed Date | 2013-02-04 |
United States Patent Application | 20130337544 |
Kind Code | A1 |
Inventor | Cunningham; Philip R. |
Publication Date | December 19, 2013 |
METHODS AND COMPOSITION FOR THE IDENTIFICATION OF ANTIBIOTICS THAT
ARE NOT SUSCEPTIBLE TO ANTIBIOTIC RESISTANCE
Abstract
Compositions are provided to identify functional mutant
ribosomes that may be used as drug targets. The compositions allow
isolation and analysis of mutations that would normally be lethal
and allow direct selection of rRNA mutants with predetermined
levels of ribosome function. The compositions of the present
invention may be used to identify antibiotics to treat a large
number of human pathogens through the use of genetically engineered
rRNA genes from a variety of species. The invention further
provides novel plasmid constructs to be used in the methods of the
invention.
Inventors: | Cunningham; Philip R. (Troy, MI) |
Applicant: | Name | City | State | Country | Type |
 | WAYNE STATE UNIVERSITY | Detroit | MI | US | |
Assignee: | Wayne State University, Detroit, MI |
Family ID: | 30003301 |
Appl. No.: | 13/758595 |
Filed: | February 4, 2013 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number | Child Application |
12771340 | Apr 30, 2010 | 8367319 | 13758595 |
11436349 | May 18, 2006 | 7709196 | 12771340 |
10612224 | Jul 1, 2003 | 7081341 | 11436349 |
60452012 | Mar 5, 2003 | | |
60393237 | Jul 1, 2002 | | |
Current U.S. Class: | 435/252.3; 435/320.1 |
Current CPC Class: | C12N 15/70 20130101; C12N 15/1058 20130101; C12N 15/1086 20130101 |
Class at Publication: | 435/252.3; 435/320.1 |
International Class: | C12N 15/70 20060101 C12N015/70 |
Claims
1. A plasmid comprising an rRNA gene having a mutant
Anti-Shine-Dalgarno sequence, at least one mutation in said rRNA
gene, and a genetically engineered gene which encodes a selectable
marker having a mutant Shine-Dalgarno sequence, wherein the mutant
Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence are a
mutually compatible pair.
2. The plasmid of claim 1, wherein the rRNA gene is from a species
selected from the group consisting of Mycobacterium tuberculosis,
Pseudomonas aeruginosa, Salmonella typhi, Yersinia pestis,
Staphylococcus aureus, Streptococcus pyogenes, Enterococcus
faecalis, Chlamydia trachomatis, Saccharomyces cerevisiae, Candida
albicans, and trypanosome.
3. The plasmid of claim 1, wherein the selectable marker is
chloramphenicol acetyltransferase (CAT), green fluorescent protein
(GFP), or both CAT and GFP.
4. The plasmid of claim 1, wherein the mutant Anti-Shine-Dalgarno
sequence is selected from the group consisting of SEQ ID NOs: 25,
27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59,
61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93,
95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121,
123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147,
149, 151, 153, 155, 157, and 159.
5. The plasmid of claim 1, wherein the mutant Shine-Dalgarno
sequence is selected from the group consisting of SEQ ID NOs: 24,
26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58,
60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92,
94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120,
122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146,
148, 150, 152, 154, 156, and 158.
6. The plasmid of claim 4, wherein the mutant Anti-Shine-Dalgarno
sequence and the mutant SD sequence are a mutually compatible
pair.
7. The plasmid of claim 6, wherein the mutually compatible mutant
Shine-Dalgarno and mutant Anti-Shine-Dalgarno pair permits
translation by the rRNA of the selectable marker.
8. The plasmid of claim 3, wherein the selectable marker is
CAT.
9. The plasmid of claim 3, wherein the selectable marker is
GFP.
10. A cell comprising the plasmid of claim 1.
11. The cell of claim 10, wherein the mutations in the rRNA gene
affect the quantity of selectable marker produced.
12. The cell of claim 10, wherein the cell is a bacterial cell.
13. The plasmid of claim 1, wherein the DNA sequence encoding the
rRNA gene is under the control of an inducible promoter.
14. A plasmid comprising an E. coli 16S rRNA gene having a mutant
Anti-Shine-Dalgarno sequence, at least one mutation in said 16S
rRNA gene, and a genetically engineered gene which encodes GFP
having a mutant Shine-Dalgarno sequence, wherein the mutant
Anti-Shine-Dalgarno and the mutant Shine-Dalgarno sequence are a
mutually compatible pair.
15. The plasmid of claim 14, wherein the mutant Anti-Shine-Dalgarno
sequence is selected from the group consisting of SEQ ID NOs: 25,
27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59,
61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93,
95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121,
123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147,
149, 151, 153, 155, 157, and 159.
16. The plasmid of claim 14, wherein the mutant Shine-Dalgarno
sequence is selected from the group consisting of SEQ ID NOs: 24,
26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58,
60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92,
94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120,
122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146,
148, 150, 152, 154, 156, and 158.
17. The plasmid of claim 15, wherein the mutant Anti-Shine-Dalgarno
sequence and the mutant Shine-Dalgarno sequence are a mutually
compatible pair.
18. The plasmid of claim 17, wherein the mutually compatible mutant
Shine-Dalgarno and mutant Anti-Shine-Dalgarno pair permits
translation by the mutant 16S rRNA of the selectable marker
GFP.
19. A cell comprising the plasmid of claim 14.
20. The cell of claim 19, wherein the mutation in the 16S rRNA gene
affects the quantity of selectable marker produced.
21. The cell of claim 19, wherein the cell is a bacterial cell.
22. The plasmid of claim 14, wherein the DNA sequence encoding the
16S rRNA gene is under the control of an inducible promoter.
Description
RELATED APPLICATIONS
[0001] The present application is a divisional of U.S. patent
application Ser. No. 12/771,340, filed on Apr. 30, 2010, which is a
continuation of U.S. patent application Ser. No. 11/436,349, filed
on May 18, 2006, now U.S. Pat. No. 7,709,196; which is a divisional
of U.S. patent application Ser. No. 10/612,224, filed on Jul. 1,
2003, now U.S. Pat. No. 7,081,341; which claims priority from U.S.
provisional patent application Ser. No. 60/393,237, filed on Jul.
1, 2002, and U.S. provisional patent application Ser. No.
60/452,012, filed on Mar. 5, 2003, all of which are expressly
incorporated by reference.
SEQUENCE LISTING
[0002] The instant application contains a Sequence Listing which is
identical to the Sequence Listing submitted in the parent
application Ser. No. 12/771,340, filed on Apr. 30, 2010 via EFS-Web
in ASCII format on Aug. 10, 2010 and is hereby incorporated by
reference in its entirety. The ASCII copy is entitled
"WSS59703_SEQLIST.txt" and is 111,235 bytes in size.
BACKGROUND OF THE INVENTION
[0003] Ribosomes are composed of one large and one small subunit
containing three or four RNA molecules and over fifty proteins. The
part of the ribosome that is directly involved in protein synthesis
is the ribosomal RNA (rRNA). The ribosomal proteins are responsible
for folding the rRNAs into their correct three-dimensional
structures. Ribosomes and the protein synthesis process are very
similar in all organisms. One difference between bacteria and other
organisms, however, is the way that ribosomes recognize mRNA
molecules that are ready to be translated. In bacteria, this
process involves a base-pairing interaction between several
nucleotides near the beginning of the mRNA and an equal number of
nucleotides at the end of the ribosomal RNA molecule in the small
subunit. The mRNA sequence is known as the Shine-Dalgarno (SD)
sequence and its counterpart on the rRNA is called the
Anti-Shine-Dalgarno (ASD) sequence.
[0004] There is now extensive biochemical, genetic and phylogenetic
evidence indicating that rRNA is directly involved in virtually
every aspect of ribosome function (Garrett, R. A., et al. (2000)
The Ribosome: Structure, Function, Antibiotics, and Cellular
Interactions. ASM Press, Washington, D.C.). Genetic and functional
analyses of rRNA mutations in E. coli and most other organisms have
been complicated by the presence of multiple rRNA genes and by the
occurrence of dominant lethal rRNA mutations. Because there are
seven rRNA operons in E. coli, the phenotypic expression of rRNA
mutations may be affected by the relative amounts of mutant and
wild-type ribosomes in the cell. Thus, detection of mutant
phenotypes can be hindered by the presence of wild-type ribosomes.
A variety of approaches have been designed to circumvent these
problems.
[0005] One common approach uses cloned copies of a wild-type rRNA
operon (Brosius, J., et al. (1981) Plasmid 6: 112-118; Sigmund, C.
D. et al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79: 5602-5606).
Several groups have used this system to detect phenotypic
differences caused by a high level of expression of mutant
ribosomes. Recently, a strain of E. coli was constructed in which
the only supply of ribosomal RNA was plasmid encoded (Asai, T.,
(1999) J. Bacteriol. 181: 3803-3809). This system has been used to
study transcriptional regulation of rRNA synthesis, as well as
ribosomal RNA function (Voulgaris, J., et al. (1999) J. Bacteriol.
181: 4170-4175; Koosha, H., et al. (2000) RNA. 6: 1166-1173;
Sergiev, P. V., et al. (2000) J. Mol. Biol. 299: 379-389; O'Connor,
M. et al. (2001) Nucl. Acids Res. 29: 1420-1425; O'Connor, M., et
al. (2001) Nucl. Acids Res. 29: 710-715; Vila-Sanjurjo, A. et al.
(2001) J. Mol. Biol. 308: 457-463; Morosyuk S. V., et al. (2000)
J. Mol. Biol. 300 (1):113-126; Morosyuk S. V., et al. (2001) J.
Mol. Biol. 307 (1):197-210; and Morosyuk S. V., et al. (2001) J.
Mol. Biol. 307 (1):211-228). Hui et al. showed that mRNA could be
directed to a specific subset of plasmid-encoded ribosomes by
altering the message binding site (MBS) of the ribosome while at
the same time altering the ribosome binding site (RBS) of an mRNA
(Hui, A., et al. (1987) Methods Enzymol. 153: 432-452).
[0006] Although each of the above methods has contributed
significantly to the understanding of rRNA function, progress in
this field has been hampered both by the complexity of translation
and by difficulty in applying standard genetic selection techniques
to these systems.
[0007] Resistance to antibiotics, a matter of growing concern, is
caused partly by antibiotic overuse. According to a study published
by the Journal of the American Medical Association in 2001, between
1989 and 1999 American adults made some 6.7 million visits a year to
the doctor for sore throat. In 73% of those visits, the study
found, the patient was treated with antibiotics, though only 5%-17%
of sore throats are caused by bacterial infections, the only kind
that respond to antibiotics. Macrolide antibiotics in particular
are becoming extremely popular for treatment of upper respiratory
infections, in part because of their typically short, convenient
course of treatment. Research has linked such vast use to a rise in
resistant bacteria and the recent development of multiple drug
resistance has underscored the need for antibiotics which are
highly specific and refractory to the development of drug
resistance.
[0008] Microorganisms can be resistant to antibiotics by four
mechanisms. First, resistance can occur by reducing the amount of
antibiotic that accumulates in the cell. Cells can accomplish this
by either reducing the uptake of the antibiotic into the cell or by
pumping the antibiotic out of the cell. Uptake mediated resistance
often occurs because a particular organism does not have the
antibiotic transport protein on the cell surface or occasionally
when the constituents of the membrane are mutated in a way that
interferes with transport of the antibiotic into a cell. Uptake
mediated resistance is only possible in instances where the drug
gains entry through a nonessential transport molecule. Efflux
mechanisms of antibiotic resistance occur via transporter proteins.
These can be highly specific transporters that transport a
particular antibiotic, such as tetracycline, out of the cell or
they can be more general transporters that transport groups of
molecules with similar characteristics out of the cell. The most
notorious example of a nonspecific transporter is the multidrug
resistance transporter (MDR).
[0009] Inactivating the antibiotic is another mechanism by which
microorganisms can become resistant to antibiotics. Antibiotic
inactivation is accomplished when an enzyme in the cell chemically
alters the antibiotic so that it no longer binds to its intended
target. These enzymes are usually very specific and have evolved
over millions of years, along with the antibiotics that they
inactivate. Examples of antibiotics that are enzymatically
inactivated are penicillin, chloramphenicol, and kanamycin.
[0010] Resistance can also occur by modifying or overproducing the
target site. The target molecule of the antibiotic is either
mutated or chemically modified so that it no longer binds the
antibiotic. This is possible only if modification of the target
does not interfere with normal cellular functions. Target site
overproduction is less common but can also produce cells that are
resistant to antibiotics.
[0011] Lastly, target bypass is a mechanism by which microorganisms
can become resistant to antibiotics. In bypass mechanisms, two
metabolic pathways or targets exist in the cell and one is not
sensitive to the antibiotic. Treatment with the antibiotic selects
cells with more reliance on the second, antibiotic-resistant
pathway.
[0012] Among these mechanisms, the greatest concern for new
antibiotic development is target site modification. Enzymatic
inactivation and specific transport mechanisms require the
existence of a substrate specific enzyme to inactivate or transport
the antibiotic out of the cell. Enzymes have evolved over millions
of years in response to naturally occurring antibiotics. Since
microorganisms cannot spontaneously generate new enzymes, these
mechanisms are unlikely to pose a significant threat to the
development of new synthetic antibiotics. Target bypass only occurs
in cells where redundant metabolic pathways exist. As understanding
of the MDR transporters increases, it is increasingly possible to
develop drugs that are not transported out of the cell by them.
Thus, target site modification poses the greatest risk for the
development of antibiotic resistance for new classes of antibiotic
and this is particularly true for those antibiotics that target
ribosomes. The only new class of antibiotics in thirty-five years,
the oxazolidinones, is a recent example of an antibiotic that has
been compromised because of target site modification. Resistant
strains containing a single mutation in rRNA developed within seven
months of its use in clinical settings.
SUMMARY OF THE INVENTION
[0013] The present invention provides compositions and methods
which may be used to identify antibiotics that are not susceptible
to the development of antibiotic resistance. In particular, rRNA
genes from E. coli and other disease causing organisms are
genetically engineered to allow identification of functional mutant
ribosomes that may be used as drug targets, e.g., to screen
chemical and peptide libraries to identify compounds that bind to
all functional mutant ribosomes but do not bind to human ribosomes.
Antibiotics that recognize all biologically active forms of the
target molecule and are therefore not susceptible to the
development of drug resistance by target site modification are thus
identified.
[0014] The invention provides plasmid constructs comprising an rRNA
gene having a mutant ASD sequence set forth in FIGS. 12 (SEQ ID
NOS:24-47), 13 (SEQ ID NOS:48-61), 15 (SEQ ID NOS:62-111), and 16
(SEQ ID NOS:112-159), at least one mutation in the rRNA gene, and a
genetically engineered gene which encodes a selectable marker
having a mutant SD sequence set forth in FIGS. 12, 13, 15, and 16.
The mutant SD-ASD sequences are mutually compatible pairs and
therefore permit translation of only the mRNA containing the
compatible mutant SD sequence, i.e., translation of the selectable
marker. In one embodiment, the selectable marker is chosen from the
group consisting of chloramphenicol acetyltransferase (CAT), green
fluorescent protein (GFP), or both CAT and GFP. In another
embodiment, the DNA sequence encoding the rRNA gene is under the
control of an inducible promoter.
[0015] The rRNA gene may be selected from a variety of species,
thereby providing for the identification of functional mutant
ribosomes that may be used as drug targets to identify drug
candidates that are effective against the selected species.
Examples of species include, without limitation, Mycobacterium
tuberculosis (tuberculosis), Pseudomonas aeruginosa (multidrug
resistant nosocomial infections), Salmonella typhi (typhoid fever),
Yersinia pestis (plague), Staphylococcus aureus (multidrug
resistant infections causing impetigo, folliculitis, abscesses,
boils, infected lacerations, endocarditis, meningitis, septic
arthritis, pneumonia, osteomyelitis, and toxic shock),
Streptococcus pyogenes (streptococcal sore throat, scarlet fever,
impetigo, erysipelas, puerperal fever, and necrotizing fasciitis),
Enterococcus faecalis (vancomycin resistant nosocomial infections,
endocarditis, and bacteremia), Chlamydia trachomatis
(lymphogranuloma venereum, trachoma and inclusion conjunctivitis,
nongonococcal urethritis, epididymitis, cervicitis, urethritis,
infant pneumonia, pelvic inflammatory diseases, Reiter's syndrome
(oligoarthritis) and neonatal conjunctivitis), Saccharomyces
cerevisiae, Candida albicans, and trypanosomes. In one embodiment,
the rRNA gene is from Mycobacterium tuberculosis (see, e.g.,
Example 6 and FIG. 17).
[0016] In still other embodiments of the invention, the rRNA genes
are mitochondrial rRNA genes, i.e., eukaryotic rRNA genes (e.g.,
human mitochondrial rRNA genes).
[0017] The plasmid constructs of the invention, such as the plasmid
constructs set forth in FIGS. 22-26, may include novel mutant ASD
and SD sequences set forth herein. In particular, the present
invention provides novel mutant ASD sequences and novel mutant SD
sequences, set forth in FIGS. 12, 13, 15, and 16, which may be used
in the plasmid constructs and methods of the invention. The mutant
ASD and mutant SD sequences may be used as mutually compatible
pairs (see FIGS. 12, 13, 15, and 16). It will be appreciated that
the mutually compatible pairs of mutant ASD and SD sequences
interact as pairs in the form of RNA and permit translation of only
the mRNAs containing the compatible mutant SD sequence.
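The pairing rule described above can be illustrated with a minimal Python sketch. It assumes, as a simplification, that a "mutually compatible pair" means the ASD (message binding site) is the exact Watson-Crick reverse complement of the SD (ribosome binding site); G-U wobble pairs and partial duplexes are ignored. The function names are hypothetical, and the three example pairs are the RBS/MBS pairs quoted in the description of FIG. 4.

```python
# Watson-Crick complements for RNA bases (wobble pairing is deliberately ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def is_compatible_pair(sd: str, asd: str) -> bool:
    """Simplifying assumption: compatible means exact reverse complements."""
    return reverse_complement(sd) == asd.upper()


# (RBS, MBS) pairs quoted in the description of FIG. 4.
PAIRS = [("GGAGG", "CCUCC"), ("GUGUG", "CACAC"), ("AUCCC", "GGGAU")]

for sd, asd in PAIRS:
    print(sd, asd, is_compatible_pair(sd, asd))

# A mutant SD paired with the wild-type ASD is not complementary.
print("AUCCC", "CCUCC", is_compatible_pair("AUCCC", "CCUCC"))
```

Under this simplified rule each listed pair is complementary, while the mutant SD sequence AUCCC does not pair with the wild-type ASD sequence CCUCC, which is the property that restricts translation of the marker mRNA to the plasmid-encoded ribosomes.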
[0018] In another aspect, the present invention provides a plasmid
comprising an E. coli 16S rRNA gene having a mutant ASD sequence,
at least one mutation in said 16S rRNA gene, and a genetically
engineered gene which encodes a selectable marker, e.g., GFP,
having a mutant SD sequence. In another embodiment, the 16S rRNA
gene is from a species other than E. coli. In one embodiment, the
mutant ASD sequence is selected from the sequences set forth in
FIGS. 12, 13, 15, and 16. In another embodiment, the mutant SD
sequence is selected from the sequences set forth in FIGS. 12, 13,
15, and 16. In yet another embodiment, the mutant ASD sequence and
the mutant SD sequence are in mutually compatible pairs (see FIGS.
12, 13, 15, and 16). Each mutually compatible mutant SD and mutant
ASD pair permits translation of the selectable marker.
[0019] In one embodiment, the invention features a cell comprising
a plasmid of the invention. In another embodiment, the cell is a
bacterial cell.
[0020] It will be appreciated that the rRNA gene used in the
methods of the present invention may be from the 16S rRNA, 23S
rRNA, and 5S rRNA gene.
[0021] Other features and advantages of the invention will be
apparent from the following detailed description and claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] FIG. 1 depicts the plasmid construct pRNA123. The locations
of specific sites in pRNA123 are as follows: the 16S rRNA E. coli
rrnB operon corresponds to nucleic acids 1-1542; the 16S MBS
(message binding sequence) GGGAU corresponds to nucleic acids
1536-1540; the 16S-23S spacer region corresponds to nucleic acids
1543-1982; the 23S rRNA of E. coli rrnB operon corresponds to
nucleic acids 1983-4886; the 23S-5S spacer region corresponds to
nucleic acids 4887-4982; the 5S rRNA of E. coli rrnB operon
corresponds to nucleic acids 4983-5098; the terminator T1 of E.
coli rrnB operon corresponds to nucleic acids 5102-5145; the
terminator T2 of E. coli rrnB operon corresponds to nucleic acids
5276-5305; the bla (β-lactamase; ampicillin resistance)
corresponds to nucleic acids 6575-7432; the replication origin
corresponds to nucleic acids 7575-8209; the rop (Rop protein)
corresponds to nucleic acids 8813-8622; the GFP corresponds to
nucleic acids 10201-9467; the GFP RBS (ribosome binding sequence)
AUCCC corresponds to nucleic acids 10213-10209; the trp^c
promoter corresponds to nucleic acids 10270-10230; the trp^c
promoter corresponds to nucleic acids 10745-10785; the CAT RBS
AUCCC corresponds to nucleic acids 10802-10806; the cam
(chloramphenicol acetyltransferase; CAT) corresponds to nucleic
acids 10814-11473; the lacI^q promoter corresponds to nucleic
acids 11782-11859; the lacI^q (lac repressor) corresponds to
nucleic acids 11860-12942; and the lacUV5 promoter corresponds to
nucleic acids 12985-13026.
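The coordinate list above can be held in a small lookup table for sanity checks. The following Python sketch is illustrative only: the `Feature` class and its field names are hypothetical, only a subset of the listed features is included, and it assumes that a feature whose coordinates are listed in descending order (for example GFP, 10201-9467) lies on the complementary strand, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    first: int  # first coordinate as listed in the FIG. 1 description
    last: int   # second coordinate as listed

    @property
    def length(self) -> int:
        """Number of positions spanned, inclusive of both ends."""
        return abs(self.last - self.first) + 1

    @property
    def strand(self) -> str:
        # Assumption: descending coordinates mean the complementary strand.
        return "+" if self.first <= self.last else "-"

    def contains(self, other: "Feature") -> bool:
        """True if `other` lies entirely within this feature's span."""
        lo, hi = sorted((self.first, self.last))
        return lo <= min(other.first, other.last) and max(other.first, other.last) <= hi


# Subset of the pRNA123 features, with coordinates copied from the text above.
FEATURES = {
    f.name: f
    for f in [
        Feature("16S rRNA", 1, 1542),
        Feature("16S MBS", 1536, 1540),
        Feature("23S rRNA", 1983, 4886),
        Feature("5S rRNA", 4983, 5098),
        Feature("bla", 6575, 7432),
        Feature("rop", 8813, 8622),
        Feature("GFP", 10201, 9467),
        Feature("GFP RBS", 10213, 10209),
        Feature("CAT RBS", 10802, 10806),
        Feature("cam (CAT)", 10814, 11473),
    ]
}

for feature in FEATURES.values():
    print(f"{feature.name:10s} {feature.strand} {feature.length}")
```

A table like this makes it easy to confirm, for instance, that the five-nucleotide 16S MBS falls inside the 16S rRNA gene and that each RBS sits immediately upstream of its reporter gene.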
[0023] FIG. 2 depicts a scheme for construction of pRNA9. The
abbreviations in FIG. 2 are defined as follows: Ap^r,
ampicillin resistance; cam, CAT gene; lacI^q, lactose
repressor; PlacUV5, lacUV5 promoter; Ptrp^c, constitutive trp
promoter. The restriction sites used are also indicated.
[0024] FIG. 3 depicts an autoradiogram of sequencing gels with
pRNA8-rMBS-rRBS. The mutagenic MBS and RBS are shown: B = C, G, T;
D = A, G, T; H = A, C, T; V = A, C, G. The start codon of cam and
the 3' end of 16S rRNA are indicated. Panel A depicts the RBS of
the CAT gene. Panel B depicts the MBS of the 16S rRNA gene.
[0025] FIG. 4 depicts a graph of the effect of MBSs on growth. The
abbreviations in FIG. 4 are defined as follows: pBR322, vector;
pRNA6, RBS = GUGUG, MBS = CACAC; pRNA9, RBS = GGAGG (wt), MBS =
CCUCC (wt); and Clone 1×24, RBS = AUCCC, MBS = GGGAU.
[0026] FIG. 5 depicts a scheme for construction of pRNA122. The
abbreviations in FIG. 5 are defined as follows: Ap^r,
ampicillin resistance; cam, CAT gene; lacI^q, lactose
repressor; PlacUV5, lacUV5 promoter; Ptrp^c, constitutive trp
promoter; N = A, C, G, and T. The four nucleotides mutated are
underlined and the restriction sites used are indicated.
[0027] FIG. 6 depicts a plasmid-derived ribosome distribution and
CAT activity. Cultures were induced (or not) in early log phase (as
shown in FIG. 4) and samples were withdrawn for CAT assay and total
RNA preparation at the points indicated. Open squares represent the
percent plasmid-derived rRNA in uninduced cells. Closed squares
represent the percent plasmid-derived rRNA in induced cells. Open
circles represent CAT activity in uninduced cells. Closed circles
represent CAT activity in induced cells.
[0028] FIG. 7 depicts a scheme for construction of single mutations
at positions 516 or 535. The abbreviations in FIG. 7 are defined as
follows: Ap^r, ampicillin resistance; cam, CAT gene;
lacI^q, lactose repressor; PlacUV5, lacUV5 promoter;
Ptrp^c, constitutive trp promoter. C516 was substituted to V
(A, C, or G) and A535 was substituted to B (C, G, or T) in pRNA122
and the restriction sites that were used are also indicated.
[0029] FIG. 8 depicts the functional analysis of mutations
constructed at positions 516 and 535 of 16S rRNA in pRNA122.
Nucleotide identities are indicated in the order of 516:535 and
mutations are underlined. pRNA122 containing the wild-type MBS (wt.
MBS) was used as a negative control to assess the degree of MIC and
the level of CAT activity due to CAT mRNA translation by wild-type
ribosomes. Standard error of the mean is used to indicate the range
of the assay results.
[0030] FIG. 9 depicts a description and use of
oligodeoxynucleotides (SEQ ID NOS:6-23). Primer binding sites are
indicated by the number of nucleotides from the 5' nucleotide of
the coding region. Negative numbers indicate binding sites 5' to
the coding region.
[0031] FIG. 10 describes several plasmids used in Example 4.
[0032] FIG. 11 depicts the specificity of the selected
recombinants. The concentrations of chloramphenicol used are
indicated and the unit of MIC is micrograms of
chloramphenicol/mL.
[0033] FIG. 12 depicts novel mutant ASD sequences and novel mutant
SD sequences of the present invention (SEQ ID NOS:24-47). FIG. 12
also shows a sequence analysis of chloramphenicol resistant
isolates. The mutated nucleotides are underlined and potential
duplex formations are boxed. CAT activity was measured twice for
each culture and the unit is CPM/0.1 μL of culture/OD600.
Induction was measured by dividing CAT activity in induced cells
with CAT activity in uninduced cells. A -1 indicates no induction,
while a +1 indicates induction with 1 mM IPTG.
[0034] FIG. 13 depicts novel mutant ASD sequences and novel mutant
SD sequences of the present invention (SEQ ID NOS:48-61). FIG. 13
also shows a sequence analysis of CAT mRNA mutants. Potential
duplex formations are boxed and the mutated nucleotides are
underlined. The start codon (AUG) is in bold. A -1 indicates no
induction, while a +1 indicates induction with 1 mM IPTG.
[0035] FIG. 14 depicts the effect of Pseudouridine-516
Substitutions on subunit assembly. The percent plasmid-derived 30S
data are presented as the percentage of the total 30S in each peak
and in crude ribosomes.
[0036] FIG. 15 depicts novel mutant ASD sequences and novel mutant
SD sequences of the present invention (SEQ ID NOS:62-111).
[0037] FIG. 16 depicts novel mutant ASD sequences and novel mutant
SD sequences of the present invention (SEQ ID NOS:112-159).
[0038] FIG. 17 depicts a hybrid construct. This hybrid construct
contains a 16S rRNA from Mycobacterium tuberculosis. The specific
sites on the hybrid construct are as follows: the part of rRNA from
E. coli rrnB operon corresponds to nucleic acids 1-931; the part of
16S rRNA from Mycobacterium tuberculosis rrnB operon corresponds to
nucleic acids 932-1542; the 16S MBS (message binding sequence)
GGGAU corresponds to nucleic acids 1536-1540; the terminator T1 of
E. coli rrnB operon corresponds to nucleic acids 1791-1834; the
terminator T2 of E. coli rrnB operon corresponds to nucleic acids
1965-1994; the replication origin corresponds to nucleic acids
3054-2438; the bla (β-lactamase; ampicillin resistance)
corresponds to nucleic acids 3214-4074; the GFP corresponds to
nucleic acids 5726-4992; the GFP RBS (ribosome binding sequence)
AUCCC corresponds to nucleic acids 5738-5734; the trp^c
promoter corresponds to nucleic acids 5795-5755; the trp^c
promoter corresponds to nucleic acids 6270-6310; the CAT RBS
(ribosome binding sequence) AUCCC corresponds to nucleic acids
6327-6331; the cam (chloramphenicol acetyltransferase; CAT)
corresponds to nucleic acids 6339-6998; the lacI^q promoter
corresponds to nucleic acids 7307-7384; the lacI^q (lac
repressor) corresponds to nucleic acids 7385-8467; and the lacUV5
promoter corresponds to nucleic acids 8510-8551.
[0039] FIG. 18 depicts a plasmid map of pRNA122.
[0040] FIG. 19 depicts a table of sequences and MICs of functional
mutants (SEQ ID NOS:160-238). Sequences are ranked by the minimum
inhibitory concentration ("MIC") of chloramphenicol required to
fully inhibit growth of cells expressing the mutant ribosomes. The
nucleotide sequences ("Nucleotide sequence") are the 790 loop
sequences selected from the pool of functional, randomized mutants.
Mutations are underlined. The number of mutations ("Number of
mutations") in each mutant sequence are indicated, as well as the
number of occurrences ("Number of occurrences") which represents
the number of clones with the indicated sequence. The sequence and
activity of the unmutated control, pRNA122 (WT, wild-type) is
depicted in the first row of FIG. 19, in which the MIC is 600
μg/ml.
[0041] FIG. 20 depicts the 790-loop sequence variation. In the
consensus sequence R=A or G; N=A, C, G or U; M=A or C; H=A, C or U;
W=A or U; Y=C or U; Δ=deletion; and underlined numbers
indicate the wild-type E. coli sequence.
[0042] FIG. 21 depicts functional and thermodynamic analysis of
positions 787 and 795. Mutations have been underlined and "n.d."
represents not determined. FIG. 21 shows site-directed mutations
("Nucleotide") that were constructed using PCR, as described for
the random mutants, except that the mutagenic primers contained
substitutions corresponding only to positions 787 and 795. In order
to determine ribosome function ("Mean CAT activity"), each strain
was grown and assayed for CAT activity at least twice, the data
were averaged, and presented as percentages of the unmutated
control, pRNA122, ± the standard error of the mean. The ratio of
plasmid to chromosome-derived rRNA in 30S and 70S ribosomes ("%
Mutant 30S in 30S peak/70S peak") was determined by primer
extension. Cultures were grown and assayed at least twice and the
mean values are presented as a percentage of the total 30S in each
peak ± the standard error of the mean. Thermodynamic parameters
("Thermodynamics") are for the higher-temperature transition of
model oligonucleotides and are the average of results for four or
five different oligomer concentrations. Standard errors for the
ΔG°37 are ±5% (1 kcal=4184 J). Errors in T_m
are estimated as ±1 °C. All solutions were at pH 7.
[0043] FIG. 22 depicts the DNA sequence of pRNA8 (SEQ ID NO:1).
[0044] FIG. 23 depicts the DNA sequence of pRNA122 (SEQ ID
NO:2).
[0045] FIG. 24 depicts the DNA sequence of pRNA123 (SEQ ID
NO:3).
[0046] FIG. 25 depicts the DNA sequence of pRNA123 Mycobacterium
tuberculosis-2 (pRNA123 containing a hybrid of E. coli and
Mycobacterium tuberculosis 16S rRNA genes) (SEQ ID NO:4).
[0047] FIG. 26 depicts the DNA sequence of pRep-Mycobacterium
tuberculosis-2 (containing a pUC19 derivative containing the rRNA
operon from pRNA122; however, the 23S and 5S rRNA genes are
deleted) (SEQ ID NO:5).
[0048] FIGS. 2-14 may be found in Lee, K., et al. Genetic
Approaches to Studying Protein Synthesis Effects of Mutations at
Pseudouridine 516 and A535 in Escherichia coli 16S rRNA. Symposium:
Translational Control: A Mechanistic Perspective at the
Experimental Biology 2001 Meeting (2001); and FIGS. 18-21 may be
found in Lee, K. et al., J. Mol. Biol. 269: 732-743 (1997), all of
which are expressly incorporated by reference herein.
DETAILED DESCRIPTION OF THE INVENTION
[0049] Compositions and methods are provided to identify functional
mutant ribosomes suitable as drug targets. The compositions and
methods allow isolation and analysis of mutations that would
normally be lethal and allow direct selection of rRNA mutants with
predetermined levels of ribosome function. The compositions and
methods of the present invention may be used to identify
antibiotics that treat human pathogens generally and/or
selectively.
[0050] According to one embodiment of the invention, a functional
genomics database for rRNA genes of a variety of species may be
generated. In particular, the rRNA gene is randomly mutated using a
generalized mutational strategy. A host cell is then transformed
with a mutagenized plasmid of the invention comprising: an rRNA
gene having a mutant ASD sequence, the mutated rRNA gene, and a
genetically engineered gene which encodes a selectable marker
having a mutant SD sequence. The selectable marker gene, such as
CAT, may be used to select mutants that are functional, e.g., by
plating the transformed cells onto growth medium containing
chloramphenicol. The mutant rRNA genes contained in each plasmid
DNA of the individual clones from each colony are selected and
characterized. The function of each of the mutant rRNA genes is
assessed by measuring the amount of an additional selectable marker
gene, such as GFP, produced by each clone upon induction of the
rRNA operon. A functional genomics database may thus be assembled,
which contains the sequence and functional data of the functional
mutant rRNA genes. In particular, functionally important regions of
the rRNA gene that will serve as drug targets are identified by
comparing the sequences of the functional genomics database and
correlating the sequence with the amount of GFP protein
produced.
[0051] In another embodiment, the nucleotides in the functionally
important target regions identified in the above methods may be
simultaneously randomly mutated, e.g., by using standard methods of
molecular mutagenesis, and cloned into a plasmid of the invention
to form a plasmid pool containing random mutations at each of the
nucleotide positions in the target region. The resulting pool of
plasmids containing random mutations is then used to transform
cells, e.g., E. coli cells, and form a library of clones, each of
which contains a unique combination of mutations in the target
region. The library of mutant clones is grown in the presence of
IPTG to induce production of the mutant rRNA genes and a selectable
marker is used, such as CAT, to select clones of rRNA mutants
containing nucleotide combinations of the target region that
produce functional ribosomes. The rRNA genes producing functional
ribosomes are sequenced and may be incorporated into a
database.
[0052] In yet another embodiment, a series of oligonucleotides may
be synthesized that contain the functionally-important nucleotides
and nucleotide motifs within the target region and may be used to
sequentially screen compounds and compound libraries to identify
compounds that recognize (bind to) the functionally important
sequences and motifs. The compounds that bind to all of the
oligonucleotides are then counterscreened against oligonucleotides
and/or other RNA containing molecules to identify drug candidates.
Drug candidates selected by the methods of the present invention
are thus capable of recognizing all of the functional variants of
the target sequence, i.e., the target cannot be mutated in a way
that the drug cannot bind, without causing loss of function to the
ribosome.
[0053] In still another embodiment, after the first stage
mutagenesis of the entire rRNA is performed using techniques known
in the art, e.g., error-prone PCR mutagenesis, the mutants are
analyzed to identify regions within the rRNA that are important for
function. These regions are then sorted based on their phylogenetic
conservation, as described herein, and are then used for further
mutagenesis.
[0054] Ribosomal RNA sequences from each species are different and
the more closely related two species are, the more their rRNAs are
alike. For instance, humans and monkeys have very similar rRNA
sequences, but humans and bacteria have very different rRNA
sequences. These differences may be utilized for the development of
very specific drugs with a narrow spectrum of action and also for
the development of broad-spectrum drugs that inhibit large groups
of organisms that are only distantly related, such as all
bacteria.
[0055] In another embodiment, the functionally important regions
identified above are divided into groups based upon whether or not
they occur in closely related groups of organisms. For instance,
some regions of rRNA are found in all bacteria but not in other
organisms. Other areas of rRNA are found only in closely related
groups of bacteria, such as all of the members of a particular
genus, e.g., Mycobacterium or Streptococcus.
[0056] In a further embodiment, the regions found in very large
groups of organisms, e.g., all bacteria or all fungi, are used to
develop broad-spectrum antibiotics that may be used to treat
infections from a large number of organisms within that group. The
methods of the present invention may be performed on these regions
and functional mutant ribosomes identified. These functional mutant
ribosomes may be screened, for example, with compound
libraries.
[0057] In yet another embodiment, regions that are located only in
relatively small groups of organisms, such as all members of the
genus Streptococcus or all members of the genus Mycobacterium, may
be used to design narrow spectrum antibiotics that will only
inhibit the growth of organisms that fall within these smaller
groups. The methods of the present invention may be performed on
these regions and functional mutant ribosomes identified. These
functional mutant ribosomes will be screened, e.g., with compound
libraries.
[0058] The invention provides novel plasmid constructs, e.g.
pRNA123 (FIGS. 1 and 24). The novel plasmid constructs of the
present invention employ novel mutant ASD and mutant SD sequences
set forth in FIGS. 12, 13, 15 and 16. The mutant ASD and mutant SD
sequences may be used as mutually compatible pairs (see FIGS. 12,
13, 15 and 16). It will be appreciated that the mutually compatible
pairs of mutant ASD and SD sequences interact as pairs in the form
of RNA, to permit translation of only the mRNAs containing the
altered SD sequence.
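Because the SD and ASD sequences pair as RNA, the compatibility of a pair can be checked by testing whether one sequence is the reverse complement of the other. The following Python sketch does this for exact Watson-Crick pairing only, which is a simplification (it ignores G-U wobble pairs and partial matches). The function names are hypothetical. The wild-type E. coli core sequences (SD GGAGG, ASD CCUCC) serve as the example, and the altered pair shown is invented for illustration; it is not one of the pairs set forth in FIGS. 12, 13, 15 and 16.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def is_compatible_pair(sd, asd):
    """True if the mRNA SD and the rRNA ASD (both 5' to 3') pair exactly."""
    return reverse_complement_rna(sd) == asd.upper()


def translated_by(sd, ribosome_asds):
    """Return the names of ribosome populations whose ASD pairs with this SD."""
    return [name for name, asd in ribosome_asds.items()
            if is_compatible_pair(sd, asd)]


ribosomes = {
    "chromosomal": "CCUCC",  # wild-type E. coli ASD core
    "plasmid": "CACAC",      # invented altered ASD, illustration only
}
print(translated_by("GGAGG", ribosomes))  # wild-type SD
print(translated_by("GUGUG", ribosomes))  # invented altered SD
```

In this toy model, a message carrying the altered SD is assigned only to the ribosome population carrying the matching altered ASD, which mirrors the selective translation described above.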
DEFINITIONS
[0059] As used herein, each of the following terms has the meaning
associated with it in this section.
[0060] The articles "a" and "an" are used herein to refer to one or
to more than one (i.e. to at least one) of the grammatical object
of the article. By way of example, "an element" means one element
or more than one element.
[0061] An "inducible" promoter is a nucleotide sequence which, when
operably linked with a polynucleotide which encodes or specifies a
gene product, causes the gene product to be produced in a living
cell substantially only when an inducer which corresponds to the
promoter is present in the cell.
[0062] As used herein, the term "mutation" includes an alteration
in the nucleotide sequence of a given gene or regulatory sequence
from the naturally occurring or normal nucleotide sequence. A
mutation may be a single nucleotide alteration (e.g., deletion,
insertion, substitution, including a point mutation), or a
deletion, insertion, or substitution of a number of
nucleotides.
[0063] By the term "selectable marker" is meant a gene whose
expression allows one to identify functional mutant ribosomes.
[0064] Various aspects of the invention are described in further
detail in the following subsections:
I. Isolated Nucleic Acid Molecules
[0065] As used herein, the term "nucleic acid molecule" is intended
to include DNA molecules (e.g., cDNA or genomic DNA) and RNA
molecules (e.g., mRNA) and analogs of the DNA or RNA generated
using nucleotide analogs. The nucleic acid molecule can be
single-stranded or double-stranded, but preferably is
double-stranded DNA.
[0066] The term "isolated nucleic acid molecule" includes nucleic
acid molecules which are separated from other nucleic acid
molecules which are present in the natural source of the nucleic
acid. For example, with regard to genomic DNA, the term "isolated"
includes nucleic acid molecules which are separated from the
chromosome with which the genomic DNA is naturally associated.
Preferably, an "isolated" nucleic acid is free of sequences which
naturally flank the nucleic acid (i.e., sequences located at the 5'
and 3' ends of the nucleic acid) in the genomic DNA of the organism
from which the nucleic acid is derived. Moreover, an "isolated"
nucleic acid molecule, such as a cDNA molecule, can be
substantially free of other cellular material, or culture medium,
when produced by recombinant techniques, or substantially free of
chemical precursors or other chemicals when chemically
synthesized.
[0067] A nucleic acid molecule of the present invention, e.g., a
nucleic acid molecule having the nucleotide sequence set forth in
FIGS. 12, 13, 15, and 16, or a portion thereof, can be isolated
using standard molecular biology techniques and the sequence
information provided herein. Using all or a portion of the nucleic
acid sequence set forth in FIGS. 12, 13, 15, and 16 as a
hybridization probe, the nucleic acid molecules of the present
invention can be isolated using standard hybridization and cloning
techniques (e.g., as described in Sambrook, J., Fritsch, E. F., and
Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold
Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y., 1989).
[0068] Moreover, a nucleic acid molecule encompassing all or a
portion of the sequence set forth in FIGS. 12, 13, 15, and 16 can
be isolated by the polymerase chain reaction (PCR) using synthetic
oligonucleotide primers designed based upon the sequence set forth
in FIGS. 12, 13, 15, and 16.
[0069] A nucleic acid of the invention can be amplified using cDNA,
mRNA or, alternatively, genomic DNA as a template and appropriate
oligonucleotide primers according to standard PCR amplification
techniques. The nucleic acid so amplified can be cloned into an
appropriate vector and characterized by DNA sequence analysis.
Furthermore, oligonucleotides corresponding to the nucleotide
sequences of the present invention can be prepared by standard
synthetic techniques, e.g., using an automated DNA synthesizer.
[0070] In another preferred embodiment, an isolated nucleic acid
molecule of the invention comprises a nucleic acid molecule which
is a complement of the nucleotide sequence set forth in FIGS. 12,
13, 15, and 16, or a portion of any of these nucleotide sequences.
A nucleic acid molecule which is complementary to the nucleotide
sequence shown in FIGS. 12, 13, 15, and 16, is one which is
sufficiently complementary to the nucleotide sequence shown in
FIGS. 12, 13, 15, and 16, such that it can hybridize to the
nucleotide sequence shown in FIGS. 12, 13, 15, and 16,
respectively, thereby forming a stable duplex.
II. Recombinant Expression Vectors and Host Cells
[0071] Another aspect of the invention pertains to vectors,
preferably expression vectors, containing a nucleic acid molecule
of the present invention (or a portion thereof). As used herein,
the term "vector" refers to a nucleic acid molecule capable of
transporting another nucleic acid to which it has been linked. One
type of vector is a "plasmid", which refers to a circular double
stranded DNA loop into which additional DNA segments can be
ligated. Another type of vector is a viral vector, wherein
additional DNA segments can be ligated into the viral genome.
Certain vectors are capable of autonomous replication in a host
cell into which they are introduced (e.g., bacterial vectors having
a bacterial origin of replication and episomal mammalian vectors).
Other vectors (e.g., non-episomal mammalian vectors) are integrated
into the genome of a host cell upon introduction into the host
cell, and thereby are replicated along with the host genome.
Moreover, certain vectors are capable of directing the expression
of genes to which they are operatively linked. Such vectors are
referred to herein as "expression vectors". In general, expression
vectors of utility in recombinant DNA techniques are often in the
form of plasmids. In the present specification, "plasmid" and
"vector" can be used interchangeably as the plasmid is the most
commonly used form of vector. However, the invention is intended to
include such other forms of expression vectors, such as viral
vectors (e.g., replication defective retroviruses, adenoviruses and
adeno-associated viruses), which serve equivalent functions.
[0072] The recombinant expression vectors of the invention comprise
a nucleic acid of the invention in a form suitable for expression
of the nucleic acid in a host cell, which means that the
recombinant expression vectors include one or more regulatory
sequences, selected on the basis of the host cells to be used for
expression, which are operatively linked to the nucleic acid
sequence to be expressed. Within a recombinant expression vector,
"operably linked" is intended to mean that the nucleotide sequence
of interest is linked to the regulatory sequence(s) in a manner
which allows for expression of the nucleotide sequence (e.g., in an
in vitro transcription/translation system or in a host cell when
the vector is introduced into the host cell). The term "regulatory
sequence" is intended to include promoters, enhancers and other
expression control elements (e.g., polyadenylation signals). Such
regulatory sequences are described, for example, in Goeddel (1990)
Methods Enzymol. 185:3-7. Regulatory sequences include those which
direct constitutive expression of a nucleotide sequence in many
types of host cells and those which direct expression of the
nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be appreciated by
those skilled in the art that the design of the expression vector
can depend on such factors as the choice of the host cell to be
transformed, the level of expression of protein desired, and the
like. The expression vectors of the invention can be introduced
into host cells to thereby produce proteins or peptides, including
fusion proteins or peptides, encoded by nucleic acids as described
herein.
[0073] Expression of proteins in prokaryotes is most often carried
out in E. coli with vectors containing constitutive or inducible
promoters directing the expression of either fusion or non-fusion
proteins. Fusion vectors add a number of amino acids to a protein
encoded therein, usually to the amino terminus of the recombinant
protein. Such fusion vectors typically serve three purposes: 1) to
increase expression of recombinant protein; 2) to increase the
solubility of the recombinant protein; and 3) to aid in the
purification of the recombinant protein by acting as a ligand in
affinity purification. Often, in fusion expression vectors, a
proteolytic cleavage site is introduced at the junction of the
fusion moiety and the recombinant protein to enable separation of
the recombinant protein from the fusion moiety subsequent to
purification of the fusion protein. Such enzymes, and their cognate
recognition sequences, include Factor Xa, thrombin and
enterokinase. Typical fusion expression vectors include pGEX
(Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene
67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5
(Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase
(GST), maltose E binding protein, or protein A, respectively, to
the target recombinant protein.
[0074] Examples of suitable inducible non-fusion E. coli expression
vectors include pTrc (Amann et al. (1988) Gene 69:301-315) and pET
11d (Studier et al. (1990) Methods Enzymol. 185:60-89). Target gene
expression from the pTrc vector relies on host RNA polymerase
transcription from a hybrid trp-lac fusion promoter. Target gene
expression from the pET 11d vector relies on transcription from a
T7 gn10-lac fusion promoter mediated by a coexpressed viral RNA
polymerase (T7 gn1). This viral polymerase is supplied by host
strains BL21(DE3) or HMS174(DE3) from a resident prophage harboring
a T7 gn1 gene under the transcriptional control of the lacUV5
promoter.
[0075] One strategy to maximize recombinant protein expression in
E. coli is to express the protein in a host bacterium with an
impaired capacity to proteolytically cleave the recombinant protein
(Gottesman, S. (1990) Methods Enzymol. 185:119-128). Another
strategy is to alter the nucleic acid sequence of the nucleic acid
to be inserted into an expression vector so that the individual
codons for each amino acid are those preferentially utilized in E.
coli (Wada et al. (1992) Nucleic Acids Res. 20:2111-2118). Such
alteration of nucleic acid sequences of the invention can be
carried out by standard DNA synthesis techniques.
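The codon-alteration strategy can be sketched as a codon-by-codon substitution that leaves the encoded amino acids unchanged. The Python example below handles only two amino acids to stay short. The synonymous codon families follow the standard genetic code, while the choice of CTG and CGT as the preferred codons is an assumption made for illustration; in practice the preferences would be taken from codon-usage data such as that of Wada et al. The function name is hypothetical.

```python
# Synonymous codon families (standard genetic code) for two amino acids.
SYNONYMS = {
    "Leu": {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"},
    "Arg": {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"},
}

# Assumed preferred codons; a real table would come from codon-usage data.
PREFERRED = {"Leu": "CTG", "Arg": "CGT"}


def recode(coding_dna):
    """Replace each Leu or Arg codon with the assumed preferred synonym."""
    dna = coding_dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        for amino_acid, family in SYNONYMS.items():
            if codon in family:
                codon = PREFERRED[amino_acid]
                break
        out.append(codon)  # codons outside the two families pass through unchanged
    return "".join(out)


print(recode("ATGTTAAGATAA"))
```

Only synonymous codons are exchanged, so the protein sequence is preserved while the nucleotide sequence changes.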
[0076] In another embodiment, the expression vector may be a yeast
expression vector. Examples of vectors for expression in yeast S.
cerevisiae include pYepSec1 (Baldari et al. (1987) EMBO J.
6:229-234), pMFa (Kurjan and Herskowitz (1982) Cell 30:933-943),
pJRY88 (Schultz et al. (1987) Gene 54:113-123), pYES2 (Invitrogen
Corporation, San Diego, Calif.), and picZ (Invitrogen Corp, San
Diego, Calif.).
[0077] In yet another embodiment, a nucleic acid of the invention
is expressed in mammalian cells using a mammalian expression
vector. Examples of mammalian expression vectors include pCDM8
(Seed, B. (1987) Nature 329:840) and pMT2PC (Kaufman et al. (1987)
EMBO J. 6:187-195). When used in mammalian cells, the expression
vector's control functions are often provided by viral regulatory
elements. For example, commonly used promoters are derived from
polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40. For
other suitable expression systems for both prokaryotic and
eukaryotic cells see chapters 16 and 17 of Sambrook, J. et al.,
Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor
Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, N.Y., 1989.
[0078] In another embodiment, the recombinant mammalian expression
vector is capable of directing expression of the nucleic acid
preferentially in a particular cell type (e.g., tissue-specific
regulatory elements are used to express the nucleic acid).
Tissue-specific regulatory elements are known in the art.
Non-limiting examples of suitable tissue-specific promoters include
the albumin promoter (liver-specific; Pinkert et al. (1987) Genes
Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton
(1988) Adv. Immunol. 43:235-275), particular promoters of T cell
receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and
immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and
Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g.,
the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl.
Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund
et al. (1985) Science 230:912-916), and mammary gland-specific
promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and
European Application Publication No. 264,166).
Developmentally-regulated promoters are also encompassed, for
example, the murine hox promoters (Kessel and Gruss (1990)
Science 249:374-379).
[0079] Another aspect of the invention pertains to host cells into
which a nucleic acid molecule of the invention is introduced.
The terms "host cell" and "recombinant host cell" are used
interchangeably herein. It is understood that such terms refer not
only to the particular subject cell but to the progeny or potential
progeny of such a cell. Because certain modifications may occur in
succeeding generations due to either mutation or environmental
influences, such progeny may not, in fact, be identical to the
parent cell, but are still included within the scope of the term as
used herein.
[0080] A host cell can be any prokaryotic or eukaryotic cell. Other
suitable host cells are known to those skilled in the art.
[0081] Vector DNA can be introduced into prokaryotic or eukaryotic
cells via conventional transformation or transfection techniques.
As used herein, the terms "transformation" and "transfection" are
intended to refer to a variety of art-recognized techniques for
introducing foreign nucleic acid (e.g., DNA) into a host cell,
including calcium phosphate or calcium chloride co-precipitation,
DEAE-dextran-mediated transfection, lipofection, or
electroporation. Suitable methods for transforming or transfecting
host cells can be found in Sambrook et al. (Molecular Cloning: A
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989),
and other laboratory manuals.
[0082] For stable transfection of mammalian cells, it is known
that, depending upon the expression vector and transfection
technique used, only a small fraction of cells may integrate the
foreign DNA into their genome. In order to identify and select
these integrants, a gene that encodes a selectable marker (e.g.,
resistance to antibiotics) is generally introduced into the host
cells along with the gene of interest. Nucleic acid encoding a
selectable marker can be introduced into a host cell on the same
vector as that encoding a protein or can be introduced on a
separate vector. Cells stably transfected with the introduced
nucleic acid can be identified by drug selection (e.g., cells that
have incorporated the selectable marker gene will survive, while
the other cells die).
III. Uses and Methods of the Invention
[0083] The nucleic acid molecules described herein may be used in a
plasmid construct, e.g. pRNA123, to carry out one or more of the
following methods: (1) creation of a functional genomics database
of the rRNA genes generated by the methods of the present
invention; (2) mining of the database to identify functionally
important regions of the rRNA; (3) identification of functionally
important sequences and structural motifs within each target
region; (4) screening compounds and compound libraries against a
series of functional variants of the target sequence to identify
compounds that bind to all functional variants of the target
sequence; and (5) counterscreening the compounds against nontarget
RNAs, such as human ribosomes or ribosomal RNA sequences.
[0084] This invention is further illustrated by the following
examples, which should not be construed as limiting. The contents
of all references, patents and published patent applications cited
throughout this application, as well as the Figures and Appendices,
are incorporated herein by reference.
SPECIFIC EXAMPLES
Example 1
Identification of Mutant SD and Mutant ASD Combinations
[0085] It has been shown that by coordinately changing the SD and
ASD, a particular mRNA containing an altered SD could be targeted
to ribosomes containing the altered ASD. This and all other efforts
to modify the ASD, however, have proved lethal, as cells containing
these mutations died within two hours after the genes containing
them were activated.
[0086] Using random mutagenesis and genetic selection, mutant
SD-ASD combinations were screened in order to identify nonlethal
SD-ASD combinations. The mutant SD-ASD mutually compatible pairs
are set forth in FIGS. 12, 13, 15 and 16. The mutually compatible
pairs of mutant sequences interact as pairs in the form of RNA. The
novel mutant SD-ASD sequence combinations of the present invention
permit translation of only the mRNAs containing the altered SD
sequence.
Example 2
Construction of the pRNA123 Plasmid
[0087] A plasmid construct of the present invention, identified as
the pRNA123 plasmid, is set forth in FIGS. 1 and 24. E. coli cells
contain a single chromosome with seven copies of the rRNA genes and
all of the genes for the ribosomal proteins. The plasmid, pRNA123,
in the cell contains a genetically engineered copy of one of the
rRNA genes from E. coli and two genetically engineered genes that
are not normally found in E. coli, referred to herein as
"selectable markers." One gene encodes the protein chloramphenicol
acetyltransferase (CAT). This protein renders cells resistant to
chloramphenicol by chemically modifying the antibiotic. Another
gene, encoding the Green Fluorescent Protein (GFP), is also included in the
system. GFP facilitates high-throughput functional analysis. The
amount of green light produced upon irradiation with ultraviolet
light is proportional to the amount of GFP present in the cell.
[0088] Ribosomes from pRNA123 have an altered ASD sequence.
Therefore, the ribosomes can only translate mRNAs that have an
altered SD sequence. Only two genes in the cell produce mRNAs with
altered SD sequences that may be translated by the plasmid-encoded
ribosomes: the CAT and GFP genes. Mutations in rRNA affect the
ability of the resulting mutant ribosome to make protein. The
present invention thus provides a system whereby the mutations in
the plasmid-encoded rRNA gene only affect the amount of GFP and CAT
produced. A decrease in plasmid ribosome function makes the cell
more sensitive to chloramphenicol and reduces the amount of green
fluorescence of the cells. Translation of the other mRNAs in the
cell is unaffected since these mRNAs are translated only by
ribosomes that come from the chromosome. Hence, cells containing
functional mutants may be identified and isolated via the
selectable marker.
Example 3
Genetic System for Functional Analysis of Ribosomal RNA
[0089] Identification of Functionally Important Regions of
rRNA.
[0090] Functionally important regions of rRNA molecules that may be
used as drug targets may be identified using a functional genomics
approach through a series of steps. Namely, in step I.a., the
entire rRNA gene is randomly mutated using error-prone PCR or
another generalized mutational strategy. In step I.b., a host cell
is then transformed with a mutagenized plasmid comprising: an rRNA
gene having a mutant ASD sequence, at least one mutation in said
rRNA gene, and a genetically engineered gene which encodes a
selectable marker having a mutant SD sequence, and production of
the rRNA genes from the plasmid is induced by growing the cells in
the presence of IPTG. In step I.c., the CAT gene is used to select
mutants that are functional by plating the transformed cells onto
growth medium containing chloramphenicol. In step I.d., individual
clones from each of the colonies obtained in step I.c. are
isolated. In step I.e., the plasmid DNA from each of the individual
clones from step I.d. is isolated. In step I.f., the rRNA genes
contained in each of the plasmids that had been isolated in step
I.e. are sequenced. In step I.g., the function of each of the
mutants from step I.f. is assessed by measuring the amount of GFP
produced by each clone from step I.e. upon induction of the rRNA
operon. In step I.h., a functional genomics database is assembled
containing the sequence and functional data from steps I.f. and
I.g. In step I.i., functionally important regions of the rRNA gene
that will serve as drug targets are identified. Functionally
important regions may be identified by comparing all of the sequences
in the functional genomics database assembled in step I.h.
and correlating the sequence with the amount of GFP protein
produced. Contiguous sequences of three or more rRNA nucleotides,
in which substitution of the nucleotides in the region produces
significant loss of function, will constitute a functionally
important region and therefore a potential drug target.
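The criterion of three or more contiguous nucleotides whose substitution causes significant loss of function can be expressed as a simple run-finding search. The Python sketch below assumes the database has been reduced to one relative GFP value per rRNA position (mutant output divided by the output of an unmutated control). The threshold of 0.5 for "significant loss", the function name, and the data are all hypothetical.

```python
def important_regions(function_by_position, loss_threshold=0.5, min_length=3):
    """Find runs of consecutive positions whose substitution lowers function.

    function_by_position maps an rRNA position to the relative GFP output
    observed when that position is substituted.
    Returns a list of (start, end) tuples, inclusive.
    """
    sensitive = sorted(
        pos for pos, value in function_by_position.items()
        if value < loss_threshold
    )
    regions = []
    run = []
    for pos in sensitive:
        # A gap in position numbers ends the current run.
        if run and pos != run[-1] + 1:
            if len(run) >= min_length:
                regions.append((run[0], run[-1]))
            run = []
        run.append(pos)
    if len(run) >= min_length:
        regions.append((run[0], run[-1]))
    return regions


# Hypothetical per-position data, illustration only.
data = {10: 0.9, 11: 0.2, 12: 0.1, 13: 0.3, 14: 0.8, 15: 0.1, 16: 0.2, 17: 0.95}
print(important_regions(data))
```

In this example positions 11 to 13 form a candidate region, while positions 15 and 16 do not because the run is shorter than three nucleotides.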
[0091] Isolation of Functional Variants of the Target Regions.
[0092] A second aspect of the invention features identification of
mutations of the target site that might lead to antibiotic
resistance using a process termed "instant evolution," as
described below. In step II.a., for a given target region
identified in step I.i., each of the nucleotides in the target
region is simultaneously randomly mutated using standard methods of
molecular mutagenesis, such as cassette mutagenesis or PCR
mutagenesis, and cloned into the plasmid of step I.b. to form a
plasmid pool containing random mutations at each of the nucleotide
positions in the target region. In step II.b., the resulting pool
of plasmids containing random mutations from step II.a. is used to
transform E. coli cells and form a library of clones, each of which
contains a unique combination of mutations in the target region. In
step II.c., the library of mutant clones from step II.b. is grown
in the presence of IPTG to induce production of the mutant rRNA
genes. In step II.d., the induced mutants are plated on medium
containing chloramphenicol, and CAT is used to select clones of
rRNA mutants containing nucleotide combinations of the target
region that produce functional ribosomes. In step II.e., the
functional clones isolated in step II.d. are sequenced and GFP is
used to measure ribosome function in each one. In step II.f., the
data from step II.e. are incorporated into a mutational
database.
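Steps II.a. through II.f. can be modeled in outline as enumerating every sequence of a fully randomized target region, applying a selection cutoff, and recording the survivors with their measured function. In the Python sketch below the experimental readout is replaced by a caller-supplied function; the toy readout used in the example has no biological meaning, and all names are hypothetical.

```python
import itertools

NUCLEOTIDES = "ACGU"


def library_size(region_length):
    """Number of distinct sequences when every position is fully randomized."""
    return len(NUCLEOTIDES) ** region_length


def all_variants(region_length):
    """Yield every possible sequence for a randomized target region."""
    for combo in itertools.product(NUCLEOTIDES, repeat=region_length):
        yield "".join(combo)


def select_functional(variants, measure_function, min_function):
    """Keep variants whose measured function meets the selection cutoff.

    measure_function stands in for the experimental readout (CAT selection
    followed by a GFP measurement) and is supplied by the caller.
    Returns a dict mapping sequence to measured function.
    """
    database = {}
    for seq in variants:
        value = measure_function(seq)
        if value >= min_function:
            database[seq] = value
    return database


def toy_readout(seq):
    """Arbitrary stand-in readout: fraction of G or C bases (illustration only)."""
    return sum(base in "GC" for base in seq) / len(seq)


db = select_functional(all_variants(3), toy_readout, min_function=1.0)
print(library_size(3), sorted(db))
```

The library size grows as 4 to the power of the region length, which is why genetic selection, rather than clone-by-clone screening, is used to recover the functional variants.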
[0093] Isolation of Drug Leads.
[0094] In step III.a., the database in step II.f. is analyzed to
identify functionally-important nucleotides and nucleotide motifs
within the target region. In step III.b., the information from step
III.a. is used to synthesize a series of oligonucleotides that
contain the functionally important nucleotides and nucleotide
motifs identified in step III.a. In step III.c., the
oligonucleotides from step III.b. are used to sequentially screen
compounds and compound libraries to identify compounds that
recognize (bind to) the functionally important sequences and
motifs. In step III.d., compounds that bind to all of the
oligonucleotides are counterscreened against oligonucleotides
and/or other RNA containing molecules to identify drug candidates.
"Drug candidates" are compounds that 1) bind to all of the
oligonucleotides containing the functionally important nucleotides
and nucleotide motifs, but do not bind to molecules that do not
contain the functionally important nucleotides and nucleotide
motifs and 2) do not recognize human ribosomes. Drug candidates
selected by the methods of the present invention therefore
recognize all of the functional variants of the target sequence,
i.e., the target cannot be mutated in a way that the drug cannot
bind, without causing loss of function to the ribosome.
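The screening and counterscreening logic of steps III.c. and III.d. amounts to a set comparison: a candidate must bind every functional-variant oligonucleotide and none of the nontarget RNAs. The Python sketch below assumes binding results are available as a set of bound RNA names per compound; the function name, compound names, and results are hypothetical.

```python
def drug_candidates(binding, variant_oligos, nontarget_rnas):
    """Return compounds that bind every variant oligo and no nontarget RNA.

    binding maps a compound name to the set of RNA names it was observed to bind.
    """
    variants = set(variant_oligos)
    nontargets = set(nontarget_rnas)
    return sorted(
        compound
        for compound, bound in binding.items()
        if variants <= bound and not (bound & nontargets)
    )


# Hypothetical screening results, illustration only.
binding = {
    "cmpd-1": {"variant-A", "variant-B", "variant-C"},
    "cmpd-2": {"variant-A", "variant-B"},
    "cmpd-3": {"variant-A", "variant-B", "variant-C", "human-rRNA"},
}
print(drug_candidates(
    binding,
    ["variant-A", "variant-B", "variant-C"],
    ["human-rRNA", "control-oligo"],
))
```

Here cmpd-2 is rejected because it misses one functional variant, and cmpd-3 is rejected because it also binds a nontarget RNA.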
Example 4
Genetic System for Studying Protein Synthesis
[0095] Materials and Methods
[0096] Reagents.
[0097] All reagents and chemicals were as in Lee, K., et al. (1996)
RNA 2: 1270-1285. PCR-directed mutagenesis was performed
essentially by the method of Higuchi, R. (1989) PCR Technology
(Erlich, H. A., ed.), pp. 61-70. Stockton Press, New York, N.Y. The
primers used in the present invention are listed in FIG. 9. The
plasmids used in the present invention are listed in FIG. 10.
[0098] Bacterial Strains and Media.
[0099] All plasmids were maintained and expressed in E. coli DH5
(supE44, hsdR17, recA1, endA1, gyrA96, thi-1 and relA1) (36). To
induce synthesis of plasmid-derived rRNA from the lacUV5 promoter,
IPTG was added to a final concentration of 1 mM. Chloramphenicol
acetyltransferase activity was determined essentially as described
by Nielsen et al. (1989) Anal. Biochem. 179: 19-23. Cultures for
CAT assays were grown in LB-Ap100. MICs were determined by standard
methods in microtiter plates as described in Lee, K., et al. (1997)
J. Mol. Biol. 269: 732-743.
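As a general illustration of how an MIC is read from a microtiter dilution series, the Python sketch below returns the lowest tested concentration at which no growth is observed. This is the conventional definition and is assumed here; the cited protocol may differ in its details. The function name and the well data are hypothetical.

```python
def minimal_inhibitory_concentration(growth_by_concentration):
    """Return the lowest tested concentration at which no growth was observed.

    growth_by_concentration maps an antibiotic concentration (e.g., in ug/mL)
    to True if the well showed growth and False otherwise.
    Returns None if growth occurred at every tested concentration.
    """
    inhibitory = [c for c, grew in growth_by_concentration.items() if not grew]
    return min(inhibitory) if inhibitory else None


# Hypothetical twofold dilution series, illustration only.
wells = {25: True, 50: True, 100: True, 200: False, 400: False}
print(minimal_inhibitory_concentration(wells))
```

The sketch does not check that growth decreases monotonically with concentration, so inconsistent plates would need to be flagged separately.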
[0100] Primer Extension.
[0101] To determine the ratio of plasmid to chromosome-derived
rRNA, pRNA104-containing cells growing in LB-Ap100 were harvested
at the time intervals indicated and total RNA was extracted using
the Qiagen RNeasy kit (Chatsworth, Calif.). The 30S, 70S, and crude
ribosomes were isolated from 200 mL of induced, plasmid-containing
cells by the method of Powers and Noller (Powers, T. et al. (1991)
EMBO J. 10: 2203-2214). The purified RNA was analyzed by primer
extension according to Sigmund, C. D., et al. (1988) Methods
Enzymol. 164: 673-690.
[0102] Experimental Procedures
[0103] Generation of pRNA9 Construct.
[0104] The initial construct, pRNA9, was generated using the
following methods. Plasmid pRNA9 contains a copy of the rrnB operon
from pKK3535 under transcriptional regulation of the lacUV5
promoter; this well-characterized promoter is not subject to
catabolite repression and is easily and reproducibly inducible with
isopropyl-.beta.-D-thiogalactoside (IPTG). To minimize
transcription in the absence of inducer, PCR was used to amplify
and subclone the lac repressor variant, lacI.sup.q (Calos, M. P.
(1978) Nature 274: 762-765) from pSPORT1 (Life Technologies,
Rockville, Md.). The chloramphenicol acetyltransferase gene (cam)
is present and transcribed constitutively from a mutant tryptophan
promoter, trp.sup.c (De Boer, H. A., et al. (1983) Proc. Natl.
Acad. Sci. U.S.A. 80: 21-25; Hui, A., et al. (1987) Proc. Natl.
Acad. Sci. U.S.A. 84: 4762-4766). The .beta.-lactamase gene is also
present to allow maintenance of plasmids in the host strain. To
allow genetic selection, the CAT structural gene from pJLS1021
(Schottel, J. L., et al. (1984) Gene 28: 177-193) was amplified and
placed downstream of a constitutive trp.sup.c promoter using PCR.
Expression of the CAT gene in E. coli renders the cell resistant to
chloramphenicol and the minimal inhibitory concentration,
hereinafter referred to as MIC, of chloramphenicol increases
proportionally with the amount of CAT protein produced (Lee, K., et
al. (1996) RNA 2: 1270-1285; Lee, K., et al. (1997) J. Mol. Biol.
269: 732-743). An overview of the steps used to construct the system
is shown in FIG. 2.
[0105] Selection of a New MBS-RBS Pair.
[0106] To isolate message binding site-ribosome binding site,
hereinafter referred to as MBS-RBS, combinations that are nonlethal
and efficiently translated only by plasmid-derived ribosomes, a
random mutagenesis and selection scheme was used. In particular,
the plasmid-encoded 16S MBS and CAT RBS were randomly mutated using
PCR so that the wild-type nucleotide at each position was excluded.
An autoradiogram of sequencing gels with pRNA8-rMBS-rRBS is
provided in FIG. 3. The resulting 2.5.times.10.sup.6 doubly mutated
transformants were induced for 3.5 hours in SOC medium containing 1
mM IPTG and plated on Luria broth medium containing 100 .mu.g/mL
ampicillin, 350 .mu.g/mL chloramphenicol and 1 mM IPTG. To confirm
the presence of all three alternative nucleotides at each mutated
position, plasmid DNA from approximately 2.0.times.10.sup.5
transformants was sequenced (FIG. 3).
Results
[0107] The data show that all of the nonexcluded nucleotides were
equally represented in the random pool. Of the 2.5.times.10.sup.6
transformants plated, 536 survived the chloramphenicol selection.
The efficiency of the selected MBS-RBS combinations was determined
by measuring the minimal inhibitory concentration, hereinafter
referred to as MIC, of chloramphenicol for each survivor in the
presence and absence of inducer (FIG. 11) (Lee, K., et al. (1996)
RNA 2: 1270-1285; Lee, K., et al. (1997) J. Mol. Biol. 269:
732-743). Nine of the isolates (1.7%) showed MICs in the presence of
inducer that were lower than the 350 .mu.g/mL concentration at
which they were selected. These were slow-growing mutants that
appeared after 48 hours during the initial isolation; the MICs,
however, were scored after only 24 hours. The MICs for 451 of the
isolates (84.1%) were between 400 and 600 .mu.g/mL, and those for the
remaining 76 clones (14.2%) were 600 .mu.g/mL. The difference in
chloramphenicol resistance between induced and uninduced cells
(.DELTA.MIC) reflects the amount of CAT translation by
plasmid-derived ribosomes only. A specific interaction between
plasmid-derived ribosomes and CAT mRNA was indicated in 79 (14.7%)
of the clones, which showed four- to eightfold increases in
chloramphenicol resistance upon
addition of IPTG (FIG. 11).
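The proportions quoted for the survivors follow directly from the reported counts. The short Python sketch below simply redoes that arithmetic; the variable names and class labels are illustrative and do not come from the source.

```python
# Counts reported for the 536 chloramphenicol-resistant survivors.
total_survivors = 536
mic_classes = {
    "below selection concentration": 9,
    "400-600 ug/mL": 451,
    "600 ug/mL": 76,
}

# The three MIC classes should account for every survivor.
assert sum(mic_classes.values()) == total_survivors

percentages = {
    label: round(100 * count / total_survivors, 1)
    for label, count in mic_classes.items()
}

# Clones showing a four- to eightfold MIC increase upon IPTG induction.
inducible_clones = 79
inducible_pct = round(100 * inducible_clones / total_survivors, 1)

# Fraction of the plated transformants that survived the selection.
plated = 2.5e6
survival_fraction = total_survivors / plated

print(percentages, inducible_pct, survival_fraction)
```

On these counts roughly two transformants in every 10,000 plated survived the selection.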
[0108] Based on these analyses, 11 clones were retained for
additional study. The MBS and RBS in plasmids from these clones
were sequenced and CAT assays and growth curves were performed
(FIGS. 4 and 12). Although a wide range of inducibility was
observed, there was no correlation between specificity and
predicted free energy (.DELTA.G .degree..sup.37). Purines were
preferred in all of the MBS positions, but the RBS did not show
this sort of selectivity. This can be explained partially by the
observation that the selected RBS can base pair with sequences
adjacent to the mutated region of 16S rRNA (Lee, K., et al. (1996)
RNA 2: 1270-1285).
[0109] Growth curves were performed for all of the selected mutants
and compared with strains containing control constructs (FIG. 4).
Only one mutant (IX24) is shown in FIG. 4, but all strains
containing the selected MBS/RBS sequences showed the same pattern
of growth as this mutant. Because of its induction profile, strain
IX24 (containing plasmid pRNA100) was chosen for additional
experimentation. To eliminate the possibility that mutations
outside the MBS and RBS had been inadvertently selected, the DraIII
and XbaI fragment containing the MBS and the KpnI and XhoI fragment
containing the RBS sequence from pRNA100 (FIG. 5) were transferred
to pRNA9.
[0110] Specificity of the System.
[0111] The rate of ribosome induction and the ratio of plasmid to
chromosome-derived rRNA at each stage of growth were determined.
For this, a pRNA100 derivative, pRNA104, which contains a C1192U
mutation in 16S rRNA was constructed (Sigmund, C. D., et al. (1984)
Nucleic Acids Res. 12: 4653-4663; Triman, K., et al. (1989) J. Mol.
Biol. 209: 645-653) so that plasmid-derived rRNA could be
differentiated from wild-type rRNA by primer extension. The C1192U
mutation does not affect ribosome function in other expression
systems (Sigmund, C. D., et al. (1984) Nucleic Acids Res. 12:
4653-4663; Makosky, P. C. et al. (1987) Biochimie 69: 885-889). To
show that the same is true in the present system, CAT activity was
measured after 3 hours induction with 1 mM IPTG in DH5 cells
expressing pRNA100 or pRNA104 and the two were compared. In these
experiments, no significant difference between cells expressing
pRNA104 (99.2.+-.2.8%) or pRNA100 (100%) was observed.
[0112] To determine the percentage of plasmid-derived ribosomes in
cells containing the plasmid, total RNA was isolated from DH5 cells
carrying pRNA104 before and after induction with IPTG and subjected
to primer extension analysis (Lee, K., et al. (1997) J. Mol. Biol.
269: 732-743; Sigmund, C. D., et al. (1984) Nucleic Acids Res. 12:
4653-4663; Makosky, P. C. et al. (1987) Biochimie 69: 885-889).
Maximum induction of plasmid-derived ribosomes occurred 3 hours
after induction at which point they constituted approximately 40%
of the total ribosome pool (FIG. 6). CAT activities in these cells
paralleled induction of plasmid-derived ribosomes and began to
decrease 4 hours after induction, presumably due to protein
degradation during stationary phase. In uninduced cells,
approximately 3% of the total ribosome pool contains
plasmid-derived ribosomes because of basal level transcription from
the lacUV5 promoter.
[0113] Optimization of the System.
[0114] Chloramphenicol resistance in uninduced cells containing
pRNA100 is 75 .mu.g/mL (FIG. 13, MIC=100 .mu.g/mL). By measuring
CAT resistance in a derivative of pRNA100 containing a wild-type
16S rRNA gene, it was determined that approximately one-half of
this background activity was due to CAT translation by wild-type
ribosomes (FIG. 13, pRNA100 + wt MBS). The remaining activity in
uninduced cells is presumably due to leakiness of the lacUV5
promoter (FIG. 6). The nucleotide sequence located between the RBS
and the start codon in mRNA affects translational efficiency
(Calos, M. P. (1978) Nature 274: 762-765; Stormo, G. D., et al.
(1982) Nucleic Acids Res. 10: 2971-2996; Chen, H., et al. (1994)
Nucl. Acids Res. 22: 4953-4957). In pRNA100, three of the
nucleotides found in this region of the CAT mRNA are complementary
with the 3' terminus of wild-type E. coli 16S RNA (FIG. 11, pRNA100
+ wt MBS). To eliminate the possibility that this was contributing
to CAT translation in the absence of plasmid-encoded ribosomes,
four nucleotides in the CAT gene (underlined in FIG. 11) were
randomly mutagenized and screened to identify mutants with reduced
translation by host ribosomes. A total of 2000 clones were screened
in the absence of plasmid-encoded ribosomes using pCAM9 and six
poorly translated CAT sequences were isolated (FIG. 13). Next, the
BamHI fragment of pRNA100 containing lacI.sup.q and the rrnB operon
was added, and MIC, CAT assays and growth curves were performed on
cells expressing these constructs (data not shown).
[0115] Based on these data, pRNA122 was chosen because it produced
a slightly better induction profile than the others (FIGS. 11 and
23). Translation of the pRNA122 CAT message by wild-type ribosomes
(FIG. 11, pRNA122 + wt MBS) produces cells that are sensitive to
chloramphenicol concentrations <10 .mu.g/mL. In the presence of
specialized ribosomes (FIG. 13, pRNA122), the background
chloramphenicol MIC is between 40 and 50 .mu.g/mL and the MIC for
induced cells is between 550 and 600 .mu.g/mL, producing an
approximately 13-fold increase in CAT expression upon induction in
pRNA122. Induction of the rrnB operon in pRNA100 produces only an
eightfold increase.
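The fold increase upon induction can be read as the ratio of the induced MIC to the uninduced (background) MIC. The sketch below works through that ratio for the pRNA122 ranges, assuming both ranges are expressed in the same unit; the function name is hypothetical.

```python
def fold_induction(induced_mic, uninduced_mic):
    """Ratio of induced to uninduced MIC (both in the same unit)."""
    if uninduced_mic <= 0:
        raise ValueError("uninduced MIC must be positive")
    return induced_mic / uninduced_mic


# MIC ranges reported for pRNA122, taken here in a common unit (assumption).
uninduced_range = (40, 50)
induced_range = (550, 600)

lowest = fold_induction(induced_range[0], uninduced_range[1])   # 550 / 50
highest = fold_induction(induced_range[1], uninduced_range[0])  # 600 / 40
midpoint = fold_induction(sum(induced_range) / 2, sum(uninduced_range) / 2)

print(lowest, highest, round(midpoint, 1))
```

The ratio of the range midpoints is close to 13, in line with the approximately 13-fold figure given for pRNA122.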
[0116] Use of the System.
[0117] To test the system, the effects of nucleotide substitutions
at the sole pseudouridine in E. coli 16S rRNA, located at position
516, were examined. Because pseudouridine and U form equally stable
base pairs with adenosine (Maden, B. E. (1990) Prog. Nucleic Acid
Res. Mol. Biol. 39: 241-303), mutations at A535 were also
constructed to determine whether the potential for base pair
formation between these two loci affected ribosome function. The
mutations were constructed initially in a pUC19 (Yanisch-Perron,
C., et al. (1985) Gene 33: 103-119) derivative containing the 16S
RNA gene, p16ST, as shown in FIG. 7 and then transferred to pRNA122
for analysis. This two-step process was used, because the SacII
restriction site located between the two mutated positions is
unique in pRNA16ST and is not unique in pRNA122. The effect of the
mutations in pRNA122 on protein synthesis in vivo was determined by
measuring the MIC and CAT activity of the mutant cells (FIG. 8). At
position 516, ribosomes containing the single transition mutation,
pseudouridine-516C, produced approximately 60% of the amount of
functional CAT protein produced by wild-type ribosomes. The
transversion mutations, pseudouridine-516A or pseudouridine-516G,
however, reduced ribosome function by >90%. All of the single
mutations at position 535 retained >50% of the function of
wild-type ribosomes. To examine the possibility that the potential
for base pairing between positions 516 and 535 is necessary for
ribosome function, all possible mutations between these loci were
also constructed and analyzed (FIG. 8). These data show that all of
the double mutants were inactive (10% or less of the wild-type)
regardless of the potential to base pair. To examine the reasons
for loss of function in the 516 mutants, ribosomes from cells
expressing single mutations at position 516 were fractionated by
sucrose density gradient centrifugation and the 30S and 70S peaks
were analyzed by primer extension to determine the percentage of
plasmid-derived 30S subunits present. The data in FIG. 14 show a
strong correlation between ribosome function and the presence of
plasmid-derived ribosomes in the 70S ribosomal fraction, indicating
that mutations at position 516 affect the ability of the mutant
30S subunits to form 70S ribosomes.
[0118] The references cited in Example 4 may be found in Lee, K.,
et al. Genetic Approaches to Studying Protein Synthesis: Effects of
Mutations at Pseudouridine-516 and A535 in Escherichia coli 16S
rRNA. Symposium: Translational Control: A Mechanistic Perspective
at the Experimental Biology 2001 Meeting (2001) and in Lee, K. et
al. (2001) Genetic Approaches to Studying Protein Synthesis:
Effects of Mutations at Pseudouridine-516 and A535 in Escherichia
coli 16S rRNA. J. Nutrition 131 (11):2994-3004.
Example 5
In Vivo Determination of RNA Structure-Function Relationships
[0119] Materials and Methods
[0120] Reagents.
[0121] Restriction enzymes, ligase, AMV reverse transcriptase and
calf intestine alkaline phosphatase were from New England Biolabs
and from Gibco-BRL. Sequenase modified DNA polymerase, nucleotides
and sequencing buffers were from USB/Amersham. Oligonucleotides
were synthesized on-site using a Beckman Oligo 1000 DNA
synthesizer. Amplitaq DNA polymerase and PCR reagents were from
Perkin-Elmer-Cetus. [.sup.3H]Chloramphenicol (30.1 Ci/mmol) was
from Amersham and [.alpha.-.sup.35 S]dATP (1000 Ci/mmol) was from
New England Nuclear. Other chemicals were from Sigma.
[0122] pRNA122.
[0123] The key features of this construct are: (1) it contains a
copy of the rrnB operon from pKK3535 (Brosius, J., et al. (1981)
Plasmid 6:112-118) under transcriptional regulation of the lacUV5
promoter; (2) it contains a copy of the lactose repressor allele
lacI.sup.q (Calos, M. P. (1978) Nature 274:762-765); (3) the
chloramphenicol acetyltransferase gene (cam) is present and
transcribed constitutively from a mutant tryptophan promoter,
trp.sup.c (de Boer, H. A., et al. (1983) Proc. Natl. Acad. Sci. USA
80:21-25); (4) the RBS of the CAT message has been changed from the
wild-type 5'-GGAGG to 5'-AUCCC, and the MBS of the 16S rRNA gene
has been changed to 5'-GGGAU; and (5) the .beta.-lactamase gene is
present to allow maintenance of plasmids in the host strain.
[0124] Bacterial Strains and Media.
[0125] Plasmids were maintained and expressed in E. coli DH5
(supE44, hsdR17, recA1, endA1, gyrA96, thi-1; Hanahan, D. (1983) J.
Mol. Biol. 166:557-580). Cultures were grown in LB medium (Luria,
S. E. & Burrous, J. W. (1957) J. Bacteriol. 74:461-476) or LB
medium containing 100 .mu.g/ml ampicillin (LB-Ap100). To induce
synthesis of plasmid-derived rRNA from the lacUV5 promoter, IPTG
was added to a final concentration of 1 mM at the times indicated
in each experiment. Strains were transformed by electroporation
(Dower, W. J., et al. (1988) Nucl. Acids Res. 16: 6127) using a
Gibco-BRL Cell Porator. Unless otherwise indicated, transformants
were grown in SOC medium (Hanahan, 1983, supra) for one hour prior
to plating on selective medium to allow expression of
plasmid-derived genes.
[0126] Chloramphenicol Acetyltransferase Assays.
[0127] CAT activity was determined essentially as described
(Nielsen, D. A. et al. (1989) Anal. Biochem. 179:19-23). Cultures
for CAT assays were grown in LB-Ap100. Briefly, 0.5 ml aliquots of
mid-log cultures (unless otherwise indicated) were added to an
equal volume of 500 mM Tris-HCl (pH 8) and lysed using 0.01% (w/v)
SDS and chloroform (Miller, J. H. (1992) A Short Course in
Bacterial Genetics, (Miller, J. H., ed.), pp. 71-80, Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, N.Y.). The resulting
lysate was either used directly or diluted in assay buffer prior to
use. Assay mixtures contained cell extract (5 .mu.l or 10 .mu.l),
250 mM Tris (pH 8), 214 .mu.M butyryl-coenzyme A (Bu-CoA), and 40
.mu.M [.sup.3H]chloramphenicol in a 125 .mu.l volume. Two
concentrations of lysate were assayed for one hour at 37.degree. C.
to ensure that the signal was proportional to protein
concentration. The product, butyryl-[.sup.3H]chloramphenicol, was
extracted into 2,6,10,14-tetramethylpentadecane:xylenes (2:1) and
measured directly in a Beckman LS-3801 liquid scintillation
counter. Blanks were prepared exactly as described above, except
that uninoculated LB medium was used instead of culture.
[0128] Minimum Inhibitory Concentration Determination.
[0129] MICs were determined by standard methods in microtiter
plates or on solid medium. Overnight cultures grown in LB-Ap100
were diluted and induced in the same medium containing 1 mM IPTG
for three hours. Approximately 10.sup.4 induced cells were then
added to wells (or spotted onto solid medium) containing
LB-Ap100+IPTG (1 mM) and chloramphenicol at increasing
concentrations. Cultures were grown for 24 hours and the lowest
concentration of chloramphenicol that completely inhibited growth
was designated as the MIC.
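The MIC rule stated above (the lowest tested concentration that completely inhibits growth) can be sketched as a small function. The function name and the growth readings below are hypothetical and serve only to illustrate the rule.

```python
def minimum_inhibitory_concentration(growth_by_concentration):
    """Return the lowest concentration at which no growth was observed.

    growth_by_concentration maps a chloramphenicol concentration (ug/mL)
    to True if the culture grew after incubation and False otherwise.
    Returns None when every tested concentration allowed growth.
    """
    inhibited = [
        conc for conc, grew in growth_by_concentration.items() if not grew
    ]
    return min(inhibited) if inhibited else None


# Hypothetical 24-hour readings for one clone (illustrative values only).
readings = {0: True, 100: True, 200: True, 300: True, 350: False, 400: False}
mic = minimum_inhibitory_concentration(readings)
print(mic)
```

The resolution of the result is limited by the spacing of the concentrations tested, so the true inhibitory concentration lies between the last well that grew and the reported MIC.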
[0130] Random Mutagenesis and Selection.
[0131] Random mutagenesis of the 790 loop was performed essentially
by the method of Higuchi (1989) using PCR and cloned in pRNA122
using the unique BglII and DraIII restriction sites (Higuchi, R.
(1989) PCR Technology (Erlich, H. A., ed.), pp. 61-70, Stockton
Press, New York) (FIG. 18). For each set of mutations, four primers
were used: two "outside" primers and two "inside" primers. The two
outside primers were designed to anneal to either side of the BglII
and DraIII restriction sites in pRNA122 (FIG. 2). These primers
were 16S-DraIII, 5'-GACAATCTGTGTGAGCACTA-3' (SEQ ID NO:239) and
16S-535, 5'-TGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGT-3' (SEQ ID
NO:240). The inside primers were 16S-786R,
5'-CCTGTTTGCTCCCCACGCTTTCGCACCTGAGCG-3' (SEQ ID NO:241) and
16S-ASS-3,
5'-CTCAGGTGCGAAAGCGTGGGGAGCAAACAGGNNNNNNNNNCCTGGTAGTCCACGCCGTAA-3'
(SEQ ID NO:242) (N=A, T, C or G). Thus,
4.sup.9=262,144 possible combinations were created, with the
exception of 320 sequences that were eliminated because they formed
either BglII or DraIII recognition sites (256 BglII sites and 64
DraIII sites).
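The counts of excluded sequences can be checked by brute-force enumeration of the nine randomized positions. The sketch below assumes the standard recognition sequences AGATCT (BglII) and CACNNNGTG (DraIII) and counts only sites that lie wholly within the nine randomized positions, ignoring the flanking primer sequence; the variable names are illustrative.

```python
from itertools import product
import re

BGLII_SITE = "AGATCT"                  # BglII recognition sequence
DRAIII_SITE = re.compile("CAC...GTG")  # DraIII recognition sequence, CACNNNGTG

total = 0
bglii_count = 0
draiii_count = 0
either_count = 0

# Enumerate every possible 9-nucleotide loop sequence (4**9 in total).
for bases in product("ACGT", repeat=9):
    loop = "".join(bases)
    total += 1
    has_bglii = BGLII_SITE in loop                        # 6-mer anywhere in the 9-mer
    has_draiii = DRAIII_SITE.fullmatch(loop) is not None  # 9-mer spans the whole loop
    bglii_count += has_bglii
    draiii_count += has_draiii
    either_count += has_bglii or has_draiii

print(total, bglii_count, draiii_count, either_count, total - either_count)
```

Both recognition sequences are palindromic, so scanning one strand is sufficient. Under these assumptions the enumeration gives 256 BglII-containing and 64 DraIII-containing sequences with no overlap between the two sets, leaving 261,824 clonable sequences.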
[0132] Transformants were incubated in SOC medium containing 1 mM
IPTG for four hours to induce rRNA synthesis and then plated on LB
agar containing 100 .mu.g/ml chloramphenicol. A total of
2.times.10.sup.6 transformants were plated yielding approximately
2000 chloramphenicol-resistant survivors. Next, 736 of these
survivors were randomly chosen and assayed to determine the MIC of
chloramphenicol necessary to completely inhibit growth in cells
expressing mutant ribosomes. From this pool, 182 transformants with
MICs greater than 100 .mu.g/ml were randomly selected and
sequenced.
[0133] Site-Directed Mutation of Positions 787 and 795.
[0134] Mutations at positions 787 and 795 were constructed as
described above for the random mutants, except that the inside
primers were 16S-786R (see above) and
16S-ASS-4,
5'-CTCAGGTGCGAAAGCGTGGGGAGCAAACAGGNTTAGATANCCTGGTAGTCCACGCCGTAA-3'
(SEQ ID NO:243) (N=A, T, C or G). Transformants were
selected on LB-Ap100 agar plates and grouped according to their
MICs for chloramphenicol. Representatives from each group were then
sequenced to identify the mutations.
[0135] Primer Extension.
[0136] To determine the ratio of plasmid to chromosome-derived
rRNA, 30S and 70S ribosomes were isolated from 200 ml of induced,
plasmid-containing cells by the method of Powers & Noller
(1991). The purified RNA was then used in primer extension
experiments (Triman, K., et al. (1989) J. Mol. Biol. 209:645-653).
End-labeled primers complementary to sequences 3' to the 788 and
795 mutation sites were annealed to rRNA from induced cells and
extended through the mutation site using AMV reverse transcriptase.
The primers used were: 16S-806R, 5'-GGACTACCAGGGTATCT-3' (SEQ ID
NO:244); 16S-814R, 5'-TACGGCGTGGACTACCA-3' (SEQ ID NO:245). For
wild-type pRNA122 ribosomes, position 1192 in the 16S RNA gene was
changed from C to U and primers were constructed as described above
(Triman et al., 1989, supra). This mutation has previously been
shown not to affect subunit association (Sigmund, C. D., et al.
(1988) Methods Enzymol. 164:673-690). The extension reaction
contained a mixture of three deoxyribonucleotides and one
dideoxyribonucleotide. The cDNAs were resolved by PAGE and the
ratios of mutant to non-mutant ribosomes were determined by
comparing the amount of radioactivity in each of the two bands.
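The band comparison described above amounts to expressing the mutant (plasmid-derived) band as a fraction of the total signal. The following sketch assumes background-corrected counts and equal labeling of both extension products; the function name and the counts are hypothetical.

```python
def plasmid_fraction(mutant_band_counts, wild_type_band_counts):
    """Fraction of rRNA that is plasmid-derived, from two band intensities."""
    total = mutant_band_counts + wild_type_band_counts
    if total <= 0:
        raise ValueError("total band counts must be positive")
    return mutant_band_counts / total


# Hypothetical background-corrected counts for the two primer-extension bands.
mutant_cpm = 4000
wild_type_cpm = 6000

fraction = plasmid_fraction(mutant_cpm, wild_type_cpm)
mutant_to_wild_type = mutant_cpm / wild_type_cpm
print(f"{fraction:.0%} plasmid-derived, ratio {mutant_to_wild_type:.2f}")
```

Reporting the fraction rather than the raw ratio makes values comparable between the 30S and 70S peaks of the same gradient.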
[0137] Oligoribonucleotide Synthesis.
[0138] Oligoribonucleotides were synthesized on solid support with
the phosphoramidite method (Capaldi, D. & Reese, C. (1994)
Nucl. Acids Res. 22:2209-2216) on a Cruachem PS 250 DNA/RNA
synthesizer. Oligomers were removed from solid support and
deprotected by treatment with ammonia and acid following the
manufacturer's recommendations. The RNA was purified on a silica
gel Si500F TLC plate (Baker) eluted for five hours with
n-propanol/ammonia/water (55:35:10, by vol.). Bands were visualized
with an ultraviolet lamp and the least mobile band was cut out and
eluted three times with 1 ml of purified water. Oligomers were
further purified with a Sep-pak C-18 cartridge (Waters) and
desalted by continuous-flow dialysis (BRL). Purities were checked
by analytical C-8 HPLC (Perceptive Biosystems) and were greater
than 95%.
[0139] Experimental Procedures
[0140] Sequence Analysis of Functional Mutants.
[0141] Random mutations were introduced simultaneously at all nine
positions (787 to 795) in the 790 loop. Functional
(chloramphenicol-resistant) mutants were then selected in E. coli
DH5 cells (Hanahan, 1983, supra) and the effects of these mutations
on ribosome function were determined. A total of 182 mutants that
retained chloramphenicol resistance were randomly selected and
sequenced. Wild-type 790-loop sequences were obtained from 81 of
the sequenced transformants, while the remaining 101 contained
mutant sequences. One of the transformants was
chloramphenicol-resistant in the absence of inducer, presumably due
to a spontaneous mutation in the CAT gene, and was excluded from
further analysis. Of 100 sequenced functional mutants, 14 were
duplicates and four sequences occurred three times. Thus, 78
different, functional, 790-loop mutants were analyzed (FIG. 19).
According to resampling theory, this distribution indicates that of
the 4.sup.9=262,144 possible sequences, only 190 (standard
deviation 30) unique sequences exist in the pool of selected
functional mutants. Of the 78 mutants, 44 contained four to six
substitutions out of the nine bases mutated and 21 of these
retained greater than 50% of the wild-type activity. The minimal
inhibitory concentration (MIC) of chloramphenicol for cells
expressing wild-type rRNA from pRNA122 is 600 .mu.g/ml. MICs of the
mutants ranged from 150 to 550 .mu.g/ml with a mean of 320 .mu.g/ml
(standard deviation 89). The median and mode were both 350
.mu.g/ml.
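The source does not state which resampling calculation produced the estimate of 190 unique sequences. As an illustration only, the sketch below applies a different, commonly used richness estimator (the bias-corrected Chao1 formula) to the reported frequency classes; it is a swapped-in technique, not the method used in the study.

```python
# Frequency classes reported for the 100 sequenced functional mutants:
# 14 sequences were seen twice, 4 were seen three times, the rest once.
unique_sequences = 78
doubletons = 14
tripletons = 4
singletons = unique_sequences - doubletons - tripletons

# Consistency check: the classes must add up to the 100 sequenced clones.
clones = singletons + 2 * doubletons + 3 * tripletons
assert clones == 100


def chao1_bias_corrected(observed, f1, f2):
    """Bias-corrected Chao1 richness estimate (assumed estimator)."""
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


estimate = chao1_bias_corrected(unique_sequences, singletons, doubletons)
print(singletons, estimate)
```

With 60 singletons and 14 doubletons this estimator gives 196 unique sequences, which falls within one standard deviation of the reported value of 190.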
[0142] Functional 790-loop mutants showed strong nucleotide
preferences at all mutated positions, except positions 788 and 792,
which showed a random distribution (FIG. 20) but significant
covariation. No mutations were observed at U789 or G791. Mutations
at these positions, however, were present in mutants that were
selected for loss of function (not shown). Thus, these nucleotides
appear to be directly involved in ribosome function. U789 is
strictly conserved among bacteria but is frequently C789 among
other organisms (FIG. 20). Chemical protection studies have shown
that G791 is specifically protected from kethoxal modification in
70S ribosomes and polysomes (Brow, D. A. & Noller, H. F.
(1983) J. Mol. Biol. 163: 112-118; Moazed, D. & Noller, H. F.
(1986) J. Mol. Biol. 191: 483-493) and by poly(U) (Moazed &
Noller, 1986, supra), and that G791 becomes more accessible to
kethoxal modification when 30S subunits are converted from the
"inactive" to the "active" conformation (Moazed et al., 1986,
supra).
[0143] Purines were strongly selected at position 787 (97.4%) while
A and, to a lesser extent, C were preferred at position 790 (98.7%)
and U was completely excluded at both positions. At both positions
793 and 795, A, C and U were equally distributed but G was selected
against. Adenine and uracil were preferred at position 794
(81.8%).
[0144] Non-random distribution of nucleotides among the selected
functional clones indicates that nucleotide identity affects the
level of ribosome function. To examine this, a single-factor
analysis of variance was performed at each mutated position,
comparing the mean activities (MICs) of ribosomes grouped by the
nucleotide present at that position.
Positions that showed a significant effect of nucleotide
identity upon the level of ribosome function were 787 (P<0.001),
788 (P<0.05) and 795 (P<0.001). The absence of mutations at
positions U789 and G791 in the functional clones prevents
statistical analysis of these positions but mutations at these
positions presumably strongly affect ribosome function as well.
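A single-factor analysis of variance of this kind compares the variation in MIC between nucleotide groups with the variation within groups. The sketch below computes the F statistic from first principles; the MIC values are hypothetical placeholders, not data from the study, and the function name is illustrative.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    groups = [list(g) for g in groups if len(g) > 0]
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and n > number of groups")
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical MICs (ug/ml) grouped by the nucleotide at one loop position.
mic_by_nucleotide = {
    "A": [400, 450, 350],
    "G": [300, 350, 250],
    "C": [150, 200, 250],
}
f_stat, df_b, df_w = one_way_anova_f(mic_by_nucleotide.values())
print(f_stat, df_b, df_w)
```

The P value is then obtained from the F distribution with the two degrees of freedom returned; that lookup is omitted here to keep the sketch free of third-party dependencies.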
[0145] FIG. 20 shows a comparison of the selected functional
mutants with current phylogenetic data (R. Gutell, unpublished
results; Gutell, R. R. (1994) Nucl. Acids Res. 22(17): 3502-3507;
Maidak, B. L. et al. (1996) Nucl. Acids Res. 24: 82-85). While
nucleotide preferences in the selected mutants are similar to those
observed in the phylogenetic data, the mutant sequences selected in
this study show much more variability than those found in nature.
This may be because all of the positions in the loop were mutated
simultaneously, allowing normally deleterious mutations in one
position to be compensated for by mutations at other positions, a
process that is unlikely to occur in nature. In addition, none of
the mutants was as functional as the wild-type, suggesting that
wild-type 790-loop sequences have been selected for optimal
activity or that other portions of the translational machinery have
been optimized to function with the wild-type sequence.
[0146] To identify potential nucleotide covariation within the
loop, the paired distribution of selected nucleotides was examined
for goodness of fit. The most significant covariations were
observed between positions 787 and 795 (P<0.001) and between
positions 790 and 793 (P<0.001). For positions 790 and 793, only
eight double mutants were available for analysis; therefore, the
covariation observed between these positions should be regarded
with caution. Position 788, which showed no nucleotide specificity,
did show significant covariation with positions 787 (P<0.01),
794 (P<0.01) and 795 (P<0.01).
[0147] Analysis of Site-Directed Mutations Constructed at the Base
of the Loop: Functional Analysis of Mutations at Positions 787 and
795. The observed covariations among positions 787, 788 and 795 are
particularly interesting, since nucleotide identity at these
positions correlated with the level of ribosome function. Further
analysis of nucleotides at positions 787 and 795 revealed that 72
of the 78 functional mutants have the potential to form mismatched
base-pairs (A.cndot.C, G.cndot.U, A.cndot.A and G.cndot.G). Other
mismatches, such as G.cndot.A and U.cndot.G, however, were not
found. In addition, only four sequences with an A.cndot.U
Watson-Crick pair and no sequences with a U.cndot.A, G.cndot.C or
C.cndot.G pair were present, suggesting that strong base-pairs
between these positions inhibit ribosome function. Therefore all
possible nucleotide combinations at positions 787 and 795 were
constructed and analyzed without changing other nucleotides in the
790 loop. Ribosome function of the mutants (FIG. 21) varied from
84% (A.cndot.A) to 1% (C.cndot.G) of the wild-type. As predicted by
analysis of the pool of functional random mutants, site-directed
mutants with G.cndot.C, C.cndot.G and U.cndot.A Watson-Crick pairs
between positions 787 and 795 were strongly inhibitory.
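The sixteen possible 787/795 combinations can be grouped by pairing potential. The sketch below uses a conventional three-way classification as an assumption; the text above groups G.cndot.U with the mismatched pairs, whereas this sketch reports it separately as a wobble pair. All names are illustrative.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pair(n787, n795):
    """Classify the nucleotides at positions 787 and 795 by pairing type."""
    pair = (n787.upper(), n795.upper())
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


# All possible nucleotide combinations at the two positions.
all_pairs = [(a, b) for a in "ACGU" for b in "ACGU"]

counts = {}
for a, b in all_pairs:
    kind = classify_pair(a, b)
    counts[kind] = counts.get(kind, 0) + 1

print(len(all_pairs), counts)
```

Only four of the sixteen combinations are Watson-Crick pairs, so a selected pool that largely lacks them is consistent with selection against strong pairing at the base of the loop.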
[0148] Results
[0149] These data suggest that strong pairing between nucleotides
at positions 787 and 795 inhibits ribosome function. In addition,
some of the site-directed substitutions at positions 787 and 795
that produced functional ribosomes were largely excluded from the
pool of mutants in which all of the loop positions were mutated
simultaneously (e.g. CC, CU, UU and UC). The observed nucleotide
preferences at positions 787 and 795 in the selected random pool
presumably reflect interaction of nucleotides at these positions
with other nucleotides in the loop. This is consistent with our
findings of extensive covariations among these sites.
[0150] Perturbations of the 790 loop have been shown to affect
ribosomal subunit association (Herr, W., et al. (1979) J. Mol.
Biol. 130: 433-449; Tapprich, W. & Hill, W., (1986) Proc. Natl.
Acad. Sci. USA 83: 556-560; Tapprich, W., et al. (1989) Proc. Natl.
Acad. Sci. USA 86: 4927-4931). Therefore several of the 787 to 795
mutants were tested for their ability to form 70S ribosomes.
Ribosomes were isolated from selected mutants and the distribution
of mutant ribosomes in both the 70S and 30S peaks was determined
by primer extension (FIG. 21). These data show that CAT activity
correlates with the presence of mutant 30S subunits in the 70S
ribosome pool. Thus, loss of function may be due to the inability
of mutant 30S and 50S subunits to associate. Another explanation
for this observation is that the mutations may directly affect a
stage of the protein synthesis process prior to subunit
association, such as initiation, which prevents subsequent steps
from occurring. Other mutations in the 16S rRNA have been
identified for which this appears to be the case (Cunningham, P.,
et al. (1993) Biochemistry 32: 7172-7180).
[0151] The references cited in Example 5 may be found in Lee, K. et
al., J. Mol. Biol. 269: 732-743 (1997), expressly incorporated by
reference herein.
Example 6
Construction of a Hybrid Construct
[0152] A plasmid construct of the present invention, identified as
the hybrid construct, is set forth in FIGS. 17 and 25. This hybrid
construct contains a 16S rRNA from Mycobacterium tuberculosis. The
specific sites on the hybrid construct are as follows: the part of
rRNA from E. coli rrnB operon corresponds to nucleic acids 1-931;
the part of 16S rRNA from Mycobacterium tuberculosis rrn operon
corresponds to nucleic acids 932-1542; the 16S MBS GGGAU
corresponds to nucleic acids 1536-1540; the terminator T1 of E.
coli rrnB operon corresponds to nucleic acids 1791-1834; the
terminator T2 of E. coli rrnB operon corresponds to nucleic acids
1965-1994; the replication origin corresponds to nucleic acids
3054-2438; the bla (.beta.-lactamase; ampicillin resistance)
corresponds to nucleic acids 3214-4074; the GFP corresponds to
nucleic acids 5726-4992; the GFP RBS (ribosome binding sequence)
AUCCC corresponds to nucleic acids 5738-5734; the trp.sup.c
promoter corresponds to nucleic acids 5795-5755; the trp.sup.c
promoter corresponds to nucleic acids 6270-6310; the CAT RBS
(ribosome binding sequence) AUCCC corresponds to nucleic acids
6327-6331; the cam (chloramphenicol acetyltransferase; CAT)
corresponds to nucleic acids 6339-6998; the lacI.sup.q promoter
corresponds to nucleic acids 7307-7384; the lacI.sup.q (lac
repressor) corresponds to nucleic acids 7385-8467; and the lac UV5
promoter corresponds to nucleic acids 8510-8551.
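The coordinates listed above can be tabulated to check feature lengths and orientation. In the sketch below, a start coordinate greater than the end coordinate is interpreted as a feature on the complementary strand; that interpretation, the labels "first listed" and "second listed", and all variable names are assumptions made for illustration.

```python
# (name, start, end) exactly as listed for the hybrid construct.
FEATURES = [
    ("E. coli 16S rRNA segment", 1, 931),
    ("M. tuberculosis 16S rRNA segment", 932, 1542),
    ("16S MBS (GGGAU)", 1536, 1540),
    ("terminator T1", 1791, 1834),
    ("terminator T2", 1965, 1994),
    ("replication origin", 3054, 2438),
    ("bla", 3214, 4074),
    ("GFP", 5726, 4992),
    ("GFP RBS (AUCCC)", 5738, 5734),
    ("trp-c promoter, first listed", 5795, 5755),
    ("trp-c promoter, second listed", 6270, 6310),
    ("CAT RBS (AUCCC)", 6327, 6331),
    ("cam", 6339, 6998),
    ("lacIq promoter", 7307, 7384),
    ("lacIq", 7385, 8467),
    ("lacUV5 promoter", 8510, 8551),
]


def describe(start, end):
    """Return (length, strand); start > end is read as the minus strand."""
    length = abs(end - start) + 1
    strand = "+" if start <= end else "-"
    return length, strand


summary = {name: describe(start, end) for name, start, end in FEATURES}

for name, (length, strand) in summary.items():
    print(f"{name:34s} {length:5d} nt  strand {strand}")

# Combined length of the two 16S segments that make up the hybrid gene.
hybrid_16s_length = (
    summary["E. coli 16S rRNA segment"][0]
    + summary["M. tuberculosis 16S rRNA segment"][0]
)
```

Read this way, the two 16S segments together span 1542 nucleotides, both five-nucleotide ribosome binding sequences have the stated length, and the GFP cassette is oriented opposite to the cam cassette.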
[0153] All references cited herein are expressly incorporated by
reference.
EQUIVALENTS
[0154] Those skilled in the art will recognize, or be able to
ascertain using no more than routine experimentation, many
equivalents to the specific embodiments of the invention described
herein. Such equivalents are intended to be encompassed by the
following claims.
SEQUENCE LISTING
Number of SEQ ID NOs: 245
SEQ ID NO: 1; length: 10903; type: DNA; organism: Artificial Sequence;
other information: primer
gacgccgggc aagagcaact
cggtcgccgc atacactatt ctcagaatga cttggttgag 60tactcaccag tcacagaaaa
gcatcttacg gatggcatga cagtaagaga attatgcagt 120gctgccataa
ccatgagtga taacactgcg gccaacttac ttctgacaac gatcggagga
180ccgaaggagc taaccgcttt tttgcacaac atgggggatc atgtaactcg
ccttgatcgt 240tgggaaccgg agctgaatga agccatacca aacgacgagc
gtgacaccac gatgcctgca 300gcaatggcaa caacgttgcg caaactatta
actggcgaac tacttactct agcttcccgg 360caacaattaa tagactggat
ggaggcggat aaagttgcag gaccacttct gcgctcggcc 420cttccggctg
gctggtttat tgctgataaa tctggagccg gtgagcgtgg gtctcgcggt
480atcattgcag cactggggcc agatggtaag ccctcccgta tcgtagttat
ctacacgacg 540gggagtcagg caactatgga tgaacgaaat agacagatcg
ctgagatagg tgcctcactg 600attaagcatt ggtaactgtc agaccaagtt
tactcatata tactttagat tgatttaaaa 660cttcattttt aatttaaaag
gatctaggtg aagatccttt ttgataatct catgaccaaa 720atcccttaac
gtgagttttc gttccactga gcgtcagacc ccttaataag atgatcttct
780tgagatcgtt ttggtctgcg cgtaatctct tgctctgaaa acgaaaaaac
cgccttgcag 840ggcggttttt cgaaggttct ctgagctacc aactctttga
accgaggtaa ctggcttgga 900ggagcgcagt caccaaaact tgtcctttca
gtttagcctt aaccggcgca tgacttcaag 960actaactcct ctaaatcaat
taccagtggc tgctgccagt ggtgcttttg catgtctttc 1020cgggttggac
tcaagacgat agttaccgga taaggcgcag cggtcggact gaacgggggg
1080ttcgtgcata cagtccagct tggagcgaac tgcctacccg gaactgagtg
tcaggcgtgg 1140aatgagacaa acgcggccat aacagcggaa tgacaccggt
aaaccgaaag gcaggaacag 1200gagagcgcac gagggagccg ccagggggaa
acgcctggta tctttatagt cctgtcgggt 1260ttcgccacca ctgatttgag
cgtcagattt cgtgatgctt gtcagggggg cggagcctat 1320ggaaaaacgg
ctttgccgcg gccctctcac ttccctgtta agtatcttcc tggcatcttc
1380caggaaatct ccgccccgtt cgtaagccat ttccgctcgc cgcagtcgaa
cgaccgagcg 1440tagcgagtca gtgagcgagg aagcggaata tatcctgtat
cacatattct gctgacgcac 1500cggtgcagcc ttttttctcc tgccacatga
agcacttcac tgacaccctc atcagtgcca 1560acatagtaag ccagtataca
ctccgctagc atcgtccatt ccgacagcat cgccagtcac 1620tatggcgtgc
tgctagcgct atatgcgttg atgcaatttc tatgcgcacc cgttctcgga
1680gcactgtccg accgctttgg ccgccgccca gtcctgctcg cttcgctact
tggagccact 1740atcgactacg cgatcatggc gaccacaccc gtcctgtgga
tcctctacgc cggacgcatc 1800gtggccggcc acgatgcgtc cggcgtagag
gatctattta acgaccctgc cctgaaccga 1860cgaccgggtc gaatttgctt
tcgaatttct gccattcatc cgcttattat cacttattca 1920ggcgtagcac
caggcgttta agggcaccaa taactgcctt aaaaaaatta cgccccgccc
1980tgccactcat cgcagtactg ttgtaattca ttaagcattc tgccgacatg
gaagccatca 2040cagacggcat gatgaacctg aatcgccagc ggcatcagca
ccttgtcgcc ttgcgtataa 2100tatttgccca tggtgaaaac gggggcgaag
aagttgtcca tattggccac gtttaaatca 2160aaactggtga aactcaccca
gggattggct gagacgaaaa acatattctc aataaaccct 2220ttagggaaat
aggccaggtt ttcaccgtaa cacgccacat cttgcgaata tatgtgtaga
2280aactgccgga aatcgtcgtg gtattcactc cagagcgatg aaaacgtttc
agtttgctca 2340tggaaaacgg tgtaacaagg gtgaacacta tcccatatca
ccagctcacc gtctttcatt 2400gccatacgga attccggatg agcattcatc
aggcgggcaa gaatgtgaat aaaggccgga 2460taaaacttgt gcttattttt
ctttacggtc tttaaaaagg ccgtaatatc cagctgaacg 2520gtctggttat
aggtacattg agcaactgac tgaaatgcct caaaatgttc tttacgatgc
2580cattgggata tatcaacggt ggtatatcca gtgatttttt tctccatttc
tcgagcacac 2640tgaaagcggc cgcttccaca cattaaacta gttcgatgat
taattgtcaa cagctcgccg 2700ctatatgcgt tgatgcaatt tctatgcgca
cccgttctcg gagcactgtc cgaccgcttt 2760ggccgccgcc cagtcctgct
cgcttcgcta cttggagcca ctatcgacta cgcgatcatg 2820gcgaccacac
ccgtcctgtg gatcccagac gagttaagtc accatacgtt agtacaggtt
2880gccactcttt tggcagacgc agacctacgg ctacaatagc gaagcggtcc
tggtattcat 2940gtttaaaaat actgtcgcga tagccaaaac ggcactcttt
ggcagttaag cgcacttgct 3000tgcctgtcgc cagttcaaca gaatcaacat
aagcgcaaac tcgctgtaat tctacgccat 3060aagcaccaat attctggata
ggtgatgagc cgacacaacc aggaattaat gccagatttt 3120ccagaccagg
cataccttcc tgcaaagtgt attttaccag acgatgccag ttttctccgg
3180ctcctacatg taaataccac gcatcaggtt catcatgaat ttcgatacct
ttgatccggt 3240tgatgatcac cgtgccgcga tagtcctcca gaaaaagtac
attacttcct tcacccagaa 3300taagaacggg ttgtccttct gcggttgcat
actgccaggc attgagtaat tgttgttcgt 3360cttcggcaca tacaatgtgc
tgagcattat gatcaatgcc aaatgtgttc cagggtttta 3420aggagtggtt
catagctgct ttcctgatgc aaaaacgagg ctagtttacc gtatctgtgg
3480ggggatggct tgtagatatg acgacaggaa gagtttgtag aaacgcaaaa
aggccatccg 3540tcaggatggc cttctgctta atttgatgcc tggcagttta
tggcgggcgt cctgcccgcc 3600accctccggg ccgttgcttc gcaacgttca
aatccgctcc cggcggattt gtcctactca 3660ggagagcgtt caccgacaaa
caacagataa aacgaaaggc ccagtctttc gactgagcct 3720ttcgttttat
ttgatgcctg gcagttccct actctcgcat ggggagaccc cacactacca
3780tcggcgctac ggcgtttcac ttctgagttc ggcatggggt caggtgggac
caccgcgcta 3840ctgccgccag gcaaattctg ttttatcaga ccgcttctgc
gttctgattt aatctgtatc 3900aggctgaaaa tcttctctca tccgccaaaa
cagcttcggc gttgtaaggt taagcctcac 3960ggttcattag taccggttag
ctcaacgcat cgctgcgctt acacacccgg cctatcaacg 4020tcgtcgtctt
caacgttcct tcaggaccct taaagggtca gggagaactc atctcggggc
4080aagtttcgtg cttagatgct ttcagcactt atctcttccg catttagcta
ccgggcagtg 4140ccattggcat gacaacccga acaccagtga tgcgtccact
ccggtcctct cgtactagga 4200gcagcccccc tcagttctcc agcgcccacg
gcagataggg accgaactgt ctcacgacgt 4260tctaaaccca gctcgcgtac
cactttaaat ggcgaacagc catacccttg ggacctactt 4320cagccccagg
atgtgatgag ccgacatcga ggtgccaaac accgccgtcg atatgaactc
4380ttgggcggta tcagcctgtt atccccggag taccttttat ccgttgagcg
atggcccttc 4440cattcagaac caccggatca ctatgacctg ctttcgcacc
tgctcgcgcc gtcacgctcg 4500cagtcaagct ggcttatgcc attgcactaa
cctcctgatg tccgaccagg attagccaac 4560cttcgtgctc ctccgttact
ctttaggagg agaccgcccc agtcaaacta cccaccagac 4620actgtccgca
acccggatta cgggtcaacg ttagaacatc aaacattaaa gggtggtatt
4680tcaaggtcgg ctccatgcag actggcgtcc acacttcaaa gcctcccacc
tatcctacac 4740atcaaggctc aatgttcagt gtcaagctat agtaaaggtt
cacggggtct ttccgtcttg 4800ccgcgggtac actgcatctt cacagcgagt
tcaatttcac tgagtctcgg gtggagacag 4860cctggccatc attacgccat
tcgtgcaggt cggaacttac ccgacaagga atttcgctac 4920cttaggaccg
ttatagttac ggccgccgtt taccggggct tcgatcaaga gcttcgcttg
4980cgctaacccc atcaattaac cttccggcac cgggcaggcg tcacaccgta
tacgtccact 5040ttcgtgtttg cacagtgctg tgtttttaat aaacagttgc
agccagctgg tatcttcgac 5100tgatttcagc tccatccgcg agggacctca
cctacatatc agcgtgcctt ctcccgaagt 5160tacggcacca ttttgcctag
ttccttcacc cgagttctct caagcgcctt ggtattctct 5220acctgaccac
ctgtgtcggt ttggggtacg atttgatgtt acctgatgct tagaggcttt
5280tcctggaagc agggcatttg ttgcttcagc accgtagtgc ctcgtcatca
cgcctcagcc 5340ttgattttcc ggatttgcct ggaaaaccag cctacacgct
taaaccggga caaccgtcgc 5400ccggccaaca tagccttctc cgtcccccct
tcgcagtaac accaagtaca ggaatattaa 5460cctgtttccc atcgactacg
cctttcggcc tcgccttagg ggtcgactca ccctgccccg 5520attaacgttg
gacaggaacc cttggtcttc cggcgagcgg gcttttcacc cgctttatcg
5580ttacttatgt cagcattcgc acttctgata cctccagcat gcctcacagc
acaccttcgc 5640aggcttacag aacgctcccc tacccaacaa cgcataagcg
tcgctgccgc agcttcggtg 5700catggtttag ccccgttaca tcttccgcgc
aggccgactc gaccagtgag ctattacgct 5760ttctttaaat gatggctgct
tctaagccaa catcctggct gtctgggcct tcccacatcg 5820tttcccactt
aaccatgact ttgggacctt agctggcggt ctgggttgtt tccctcttca
5880cgacggacgt tagcacccgc cgtgtgtctc ccgtgataac attctccggt
attcgcagtt 5940tgcatcgggt tggtaagtcg ggatgacccc cttgccgaaa
cagtgctcta cccccggaga 6000tgaattcacg aggcgctacc taaatagctt
tcggggagaa ccagctatct cccggtttga 6060ttggcctttc acccccagcc
acaagtcatc cgctaatttt tcaacattag tcggttcggt 6120cctccagtta
gtgttaccca accttcaacc tgcccatggc tagatcaccg ggtttcgggt
6180ctataccctg caacttaacg cccagttaag actcggtttc ccttcggctc
ccctattcgg 6240ttaaccttgc tacagaatat aagtcgctga cccattatac
aaaaggtacg cagtcacacg 6300cctaagcgtg ctcccactgc ttgtacgtac
acggtttcag gttctttttc actcccctcg 6360ccggggttct tttcgccttt
ccctcacggt actggttcac tatcggtcag tcaggagtat 6420ttagccttgg
aggatggtcc ccccatattc agacaggata ccacgtgtcc cgccctactc
6480atcgagctca cagcatgtgc atttttgtgt acggggctgt caccctgtat
cgcgcgcctt 6540tccagacgct tccactaaca cacacactga ttcaggctct
gggctgctcc ccgttcgctc 6600gccgctactg ggggaatctc ggttgatttc
ttttcctcgg ggtacttaga tgtttcagtt 6660cccccggttc gcctcattaa
cctatggatt cagttaatga tagtgtgtcg aaacacactg 6720ggtttcccca
ttcggaaatc gccggttata acggttcata tcaccttacc gacgcttatc
6780gcagattagc acgtccttca tcgcctctga ctgccagggc atccaccgtg
tacgcttagt 6840cgcttaacct cacaacccga agatgtttct ttcgattcat
catcgtgttg cgaaaatttg 6900agagactcac gaacaactct cgttgttcag
tgtttcaatt ttcagcttga tccagatttt 6960taaagagcaa aaatctcaaa
catcacccga agatgagttt tgagatatta aggtcggcga 7020ctttcactca
caaaccagca agtggcgtcc cctaggggat tcgaacccct gttaccgccg
7080tgaaagggcg gtgtcctggg cctctagacg aaggggacac gaaaattgct
tatcacgcgt 7140tgcgtgatat tttcgtgtag ggtgagcttt cattaataga
aagcgaacgg ccttattctc 7200ttcagcctca ctcccaacgc gtaaacgcct
tgcttttcac tttctatcag acaatctgtg 7260tgagcactac aaagtacgct
tctttaaggt aagtgtgtga tccaaccgca ggttccccta 7320cggttacctt
gttacgactt caccccagtc atgaatcaca aagtggtaag cgccctcccg
7380aaggttaagc tacctacttc ttttgcaacc cactcccatg gtgtgacggg
cggtgtgtac 7440aaggcccggg aacgtattca ccgtggcatt ctgatccacg
attactagcg attccgactt 7500catggagtcg agttgcagac tccaatccgg
actacgacgc actttatgag gtccgcttgc 7560tctcgcgagg tcgcttctct
ttgtatgcgc cattgtagca cgtgtgtagc cctggtcgta 7620agggccatga
tgacttgacg tcatccccac cttcctccag tttatcactg gcagtctcct
7680ttgagttccc ggccggaccg ctggcaacaa aggataaggg ttgcgctcgt
tgcgggactt 7740aacccaacat ttcacaacac gagctgacga cagccatgca
gcacctgtct cacggttccc 7800gaaggcacat tctcatctct gaaaacttcc
gtggatgtca agaccaggta aggttcttcg 7860cgttgcatcg aattaaacca
catgctccac cgcttgtgcg ggcccccgtc aattcatttg 7920agttttaacc
ttgcggccgt actccccagg cggtcgactt aacgcgttag ctccggaagc
7980cacgcctcaa gggcacaacc tccaagtcga catcgtttac ggcgtggact
accagggtat 8040ctaatcctgt ttgctcccca cgctttcgca cctgagcgtc
agtcttcgtc cagggggccg 8100ccttcgccac cggtattcct ccagatctct
acgcatttca ccgctacacc tggaattcta 8160cccccctcta cgagactcaa
gcttgccagt atcagatgca gttcccaggt tgagcccggg 8220gatttcacat
ctgacttaac aaaccgcctg cgtgcgcttt acgcccagta attccgatta
8280acgcttgcac cctccgtatt accgcggctg ctggcacgga gttagccggt
gcttcttctg 8340cgggtaacgt caatgagcaa aggtattaac tttactccct
tcctccccgc tgaaagtact 8400ttacaacccg aaggccttct tcatacacgc
ggcatggctg catcaggctt gcgcccattg 8460tgcaatattc cccactgctg
cctcccgtag gagtctggac cgtgtctcag ttccagtgtg 8520gctggtcatc
ctctcagacc agctagggat cgtcgcctag gtgagccgtt accccaccta
8580ctagctaatc ccatctgggc acatccgatg gcaagaggcc cgaaggtccc
cctctttggt 8640cttgcgacgt tatgcggtat tagctaccgt ttccagtagt
tatccccctc catcaggcag 8700tttcccagac attactcacc cgtccgccac
tcgtcagcaa agaagcaagc ttcttcctgt 8760taccgttcga cttgcatgtg
ttaggcctgc cgccagcgtt caatctgagc catgatcaaa 8820ctcttcaatt
taaaagtttg acgctcaaag aattaaactt cgtaatgaat tacgtgttca
8880ctcttgagac ttggtattca tttttcgtct tgcgacgtta agaatccgta
tcttcgagtg 8940cccacacaga ttgtctgata aattgttaaa gagcagtgcc
gcttcgcttt ttctcagcgg 9000ccgctgtgtg aaattgttat ccgctcacaa
ttccacacat tatacgagcc ggaagcataa 9060agtgtaaagc ctggggtgcc
taatgagtga gctaactcac attaattgcg ttgcgctcac 9120tgcccgcttt
ccagtcggga aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg
9180cggggagagg cggtttgcgt attgggcgcc agggtggttt ttcttttcac
cagtgagacg 9240ggcaacagct gattgccctt caccgcctgg ccctgagaga
gttgcagcaa gcggtccacg 9300ctggtttgcc ccagcaggcg aaaatcctgt
ttgatggtgg ttgacggcgg gatataacat 9360gagctgtctt cggtatcgtc
gtatcccact accgagatat ccgcaccaac gcgcagcccg 9420gactcggtaa
tggcgcgcat tgcgcccagc gccatctgat cgttggcaac cagcatcgca
9480gtgggaacga tgccctcatt cagcatttgc atggtttgtt gaaaaccgga
catggcactc 9540cagtcgcctt cccgttccgc tatcggctga atttgattgc
gagtgagata tttatgccag 9600ccagccagac gcagacgcgc cgagacagaa
cttaatgggc ccgctaacag cgcgatttgc 9660tggtgaccca atgcgaccag
atgctccacg cccagtcgcg taccgtcttc atgggagaaa 9720ataatactgt
tgatgggtgt ctggtcagag acatcaagaa ataacgccgg aacattagtg
9780caggcagctt ccacagcaat ggcatcctgg tcatccagcg gatagttaat
gatcagccca 9840ctgacccgtt gcgcgagaag attgtgcacc gccgctttac
aggcttcgac gccgcttcgt 9900tctaccatcg acaccaccac gctggcaccc
agttgatcgg cgcgagattt aatcgccgcg 9960acaatttgcg acggcgcgtg
cagggccaga ctggaggtgg caacgccaat cagcaacgac 10020tgtttgcccg
ccagttgttg tgccacgcgg ttgggaatgt aattcagctc cgccatcgcc
10080gcttccactt tttcccgcgt tttcgcagaa acgtggctgg cctggttcac
cacgcgggaa 10140acggtctgat aagagacacc ggcatactct gcgacatcgt
ataacgttac tggtttcaca 10200ttcaccaccc tgaattgact ctcttccggg
cgctatcatg ccataccgcg aaaggttttg 10260caccattcga tggtgtcgga
tcctagagcg cacgaatgag ggccgacagg aagcaaagct 10320gaaaggaatc
aaatttggcc gcaggcgtac cgtggacagg aacgtcgtgc tgacgcttca
10380tcagaagggc actggtgcaa cggaaattgc tcatcagctc agtattgccc
gctccacggt 10440ttataaaatt cttgaagacg aaagggcctc gtgcatacgc
ctatttttat aggttaatgt 10500catgataata atggtttctt agacgtcagg
tggcactttt cggggaaatg tgcgcggaac 10560ccctatttgt ttatttttct
aaatacattc aaatatgtat ccgctcatga gacaataacc 10620ctgataaatg
cttcaataat attgaaaaag gaagagtatg agtattcaac atttccgtgt
10680cgcccttatt cccttttttg cggcattttg ccttcctgtt tttgctcacc
cagaaacgct 10740ggtgaaagta aaagatgctg aagatcagtt gggtgcacga
gtgggttaca tcgaactgga 10800tctcaacagc ggtaagatcc ttgagagttt
tcgccccgaa gaacgttttc caatgatgag 10860cacttttaaa gttctgctat
gtggcgcggt attatcccgt gtt 10903
2 11918 DNA Artificial Sequence primer
2 gatcctctac gccggacgca tcgtggccgg ccacgatgcg tccggcgtag aggatctatt
60taacgaccct gccctgaacc gacgaccggg tcgaatttgc tttcgaattt ctgccattca
120tccgcttatt atcacttatt caggcgtagc accaggcgtt taagggcacc
aataactgcc 180ttaaaaaaat tacgccccgc cctgccactc atcgcagtac
tgttgtaatt cattaagcat 240tctgccgaca tggaagccat cacagacggc
atgatgaacc tgaatcgcca gcggcatcag 300caccttgtcg ccttgcgtat
aatatttgcc catggtgaaa acgggggcga agaagttgtc 360catattggcc
acgtttaaat caaaactggt gaaactcacc cagggattgg ctgagacgaa
420aaacatattc tcaataaacc ctttagggaa ataggccagg ttttcaccgt
aacacgccac 480atcttgcgaa tatatgtgta gaaactgccg gaaatcgtcg
tggtattcac tccagagcga 540tgaaaacgtt tcagtttgct catggaaaac
ggtgtaacaa gggtgaacac tatcccatat 600caccagctca ccgtctttca
ttgccatacg gaattccgga tgagcattca tcaggcgggc 660aagaatgtga
ataaaggccg gataaaactt gtgcttattt ttctttacgg tctttaaaaa
720ggccgtaata tccagctgaa cggtctggtt ataggtacat tgagcaactg
actgaaatgc 780ctcaaaatgt tctttacgat gccattggga tatatcaacg
gtggtatatc cagtgatttt 840tttctccatt tgcggaggga tatgaaagcg
gccgcttcca cacattaaac tagttcgatg 900attaattgtc aacagctcgc
cggcggcacc tcgctaacgg attcaccact ccaagaattg 960gagccaatcg
attcttgcgg agaactgtga atgcgcaaac caacccttgg cagaacatat
1020ccatcgcgtc cgccatctcc agcagccgca cgcggcgcat ctcgggcagc
gttgggtcct 1080ggccacgggt gcgcatgatc gtgctcctgt cgttgaggac
ccggctaggc tggcggggtt 1140gccttactgg ttagcagaat gaatcaccga
tacgcgagcg aacgtgaagc gactgctgct 1200gcaaaacgtc tgcgacctga
gcaacaacat gaatggtctt cggtttccgt gtttcgtaaa 1260gtctggaaac
gcggaagtca gcgccctgca ccattatgtt ccggatctgg gtaccgagct
1320cgaattcact ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc
gttacccaac 1380ttaatcgcct tgcagcacat ccccctttcg ccaggcatcg
caggatgctg ctggctaccc 1440tgtggaacac ctacatctgt attaacgaag
cgctggcatt gaccctgagt gatttttctc 1500tggtcccgcc gcatccatac
cgccagttgt ttaccctcac aacgttccag taaccgggca 1560tgttcatcat
cagtaacccg tatcgtgagc atcctctctc gtttcatcgg tatcattacc
1620cccatgaaca gaaattcccc cttacacgga ggcatcaagt gaccaaacag
gaaaaaaccg 1680cccttaacat ggcccgcttt atcagaagcc agacattaac
gcttctggag aaactcaacg 1740agctggacgc ggatgaacag gcagacatct
gtgaatcgct tcacgaccac gctgatgagc 1800tttaccgcag ctgcctcgcg
cgtttcggtg atgacggtga aaacctctga cacatgcagc 1860tcccggagac
ggtcacagct tgtctgtaag cggatgccgg gagcagacaa gcccgtcagg
1920gcgcgtcagc gggtgttggc gggtgtcggg gcgcagccat gacccagtca
cgtagcgata 1980gcggagtgta tactggctta actatgcggc atcagagcag
attgtactga gagtgcacca 2040tatgcggtgt gaaataccgc acagatgcgt
aaggagaaaa taccgcatca ggcgctcttc 2100cgcttcctcg ctcactgact
cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc 2160tcactcaaag
gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat
2220gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc
tggcgttttt 2280ccataggctc cgcccccctg acgagcatca caaaaatcga
cgctcaagtc agaggtggcg 2340aaacccgaca ggactataaa gataccaggc
gtttccccct ggaagctccc tcgtgcgctc 2400tcctgttccg accctgccgc
ttaccggata cctgtccgcc tttctccctt cgggaagcgt 2460ggcgctttct
catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa
2520gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat
ccggtaacta 2580tcgtcttgag tccaacccgg taagacacga cttatcgcca
ctggcagcag ccactggtaa 2640caggattagc agagcgaggt atgtaggcgg
tgctacagag ttcttgaagt ggtggcctaa 2700ctacggctac actagaagga
cagtatttgg tatctgcgct ctgctgaagc cagttacctt 2760cggaaaaaga
gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt
2820ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag
atcctttgat 2880cttttctacg gggtctgacg ctcagtggaa cgaaaactca
cgttaaggga ttttggtcat 2940gagattatca aaaaggatct tcacctagat
ccttttaaat taaaaatgaa gttttaaatc 3000aatctaaagt atatatgagt
aaacttggtc tgacagttac caatgcttaa tcagtgaggc 3060acctatctca
gcgatctgtc tatttcgttc atccatagtt gcctgactcc ccgtcgtgta
3120gataactacg atacgggagg gcttaccatc tggccccagt gctgcaatga
taccgcgaga 3180cccacgctca ccggctccag atttatcagc aataaaccag
ccagccggaa gggccgagcg 3240cagaagtggt cctgcaactt tatccgcctc
catccagtct attaattgtt gccgggaagc 3300tagagtaagt agttcgccag
ttaatagttt gcgcaacgtt gttgccattg ctgcaggcat 3360cgtggtgtca
cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag
3420gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt agctccttcg
gtcctccgat 3480cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg
gttatggcag cactgcataa 3540ttctcttact gtcatgccat ccgtaagatg
cttttctgtg actggtgagt actcaaccaa 3600gtcattctga gaatagtgta
tgcggcgacc gagttgctct tgcccggcgt caacacggga 3660taataccgcg
ccacatagca gaactttaaa agtgctcatc attggaaaac gttcttcggg
3720gcgaaaactc tcaaggatct taccgctgtt gagatccagt tcgatgtaac
ccactcgtgc 3780acccaactga tcttcagcat cttttacttt caccagcgtt
tctgggtgag caaaaacagg 3840aaggcaaaat gccgcaaaaa agggaataag
ggcgacacgg aaatgttgaa tactcatact 3900cttccttttt caatattatt
gaagcattta tcagggttat tgtctcatga gcggatacat 3960atttgaatgt
atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt
4020gccacctgac gtctaagaaa ccattattat catgacatta acctataaaa
ataggcgtat
4080cacgaggccc tttcgtcttc aagaattctc atgtttgaca gcttatcatc
gataagcttt 4140aatgcggtag tttatcacag ttaaattgct aacgcagtca
ggcaccgtgt atgaaatcta 4200acaatgcgct catcgtcatc ctcggcaccg
tcaccctgga tgctgtaggc ataggcttgg 4260ttatgccggt actgccgggc
ctcttgcggg atatcgtcca ttccgacagc atcgccagtc 4320actatggcgt
gctgctagcg ctatatgcgt tgatgcaatt tctatgcgca cccgttctcg
4380gagcactgtc cgaccgcttt ggccgccgcc cagtcctgct cgcttcgcta
cttggagcca 4440ctatcgacta cgcgatcatg gcgaccacac ccgtcctgtg
gatcccagac gagttaagtc 4500accatacgtt agtacaggtt gccactcttt
tggcagacgc agacctacgg ctacaatagc 4560gaagcggtcc tggtattcat
gtttaaaaat actgtcgcga tagccaaaac ggcactcttt 4620ggcagttaag
cgcacttgct tgcctgtcgc cagttcaaca gaatcaacat aagcgcaaac
4680tcgctgtaat tctacgccat aagcaccaat attctggata ggtgatgagc
cgacacaacc 4740aggaattaat gccagatttt ccagaccagg cataccttcc
tgcaaagtgt attttaccag 4800acgatgccag ttttctccgg ctcctacatg
taaataccac gcatcaggtt catcatgaat 4860ttcgatacct ttgatccggt
tgatgatcac cgtgccgcga tagtcctcca gaaaaagtac 4920attacttcct
tcacccagaa taagaacggg ttgtccttct gcggttgcat actgccaggc
4980attgagtaat tgttgttcgt cttcggcaca tacaatgtgc tgagcattat
gatcaatgcc 5040aaatgtgttc cagggtttta aggagtggtt catagctgct
ttcctgatgc aaaaacgagg 5100ctagtttacc gtatctgtgg ggggatggct
tgtagatatg acgacaggaa gagtttgtag 5160aaacgcaaaa aggccatccg
tcaggatggc cttctgctta atttgatgcc tggcagttta 5220tggcgggcgt
cctgcccgcc accctccggg ccgttgcttc gcaacgttca aatccgctcc
5280cggcggattt gtcctactca ggagagcgtt caccgacaaa caacagataa
aacgaaaggc 5340ccagtctttc gactgagcct ttcgttttat ttgatgcctg
gcagttccct actctcgcat 5400ggggagaccc cacactacca tcggcgctac
ggcgtttcac ttctgagttc ggcatggggt 5460caggtgggac caccgcgcta
ctgccgccag gcaaattctg ttttatcaga ccgcttctgc 5520gttctgattt
aatctgtatc aggctgaaaa tcttctctca tccgccaaaa cagcttcggc
5580gttgtaaggt taagcctcac ggttcattag taccggttag ctcaacgcat
cgctgcgctt 5640acacacccgg cctatcaacg tcgtcgtctt caacgttcct
tcaggaccct taaagggtca 5700gggagaactc atctcggggc aagtttcgtg
cttagatgct ttcagcactt atctcttccg 5760catttagcta ccgggcagtg
ccattggcat gacaacccga acaccagtga tgcgtccact 5820ccggtcctct
cgtactagga gcagcccccc tcagttctcc agcgcccacg gcagataggg
5880accgaactgt ctcacgacgt tctaaaccca gctcgcgtac cactttaaat
ggcgaacagc 5940catacccttg ggacctactt cagccccagg atgtgatgag
ccgacatcga ggtgccaaac 6000accgccgtcg atatgaactc ttgggcggta
tcagcctgtt atccccggag taccttttat 6060ccgttgagcg atggcccttc
cattcagaac caccggatca ctatgacctg ctttcgcacc 6120tgctcgcgcc
gtcacgctcg cagtcaagct ggcttatgcc attgcactaa cctcctgatg
6180tccgaccagg attagccaac cttcgtgctc ctccgttact ctttaggagg
agaccgcccc 6240agtcaaacta cccaccagac actgtccgca acccggatta
cgggtcaacg ttagaacatc 6300aaacattaaa gggtggtatt tcaaggtcgg
ctccatgcag actggcgtcc acacttcaaa 6360gcctcccacc tatcctacac
atcaaggctc aatgttcagt gtcaagctat agtaaaggtt 6420cacggggtct
ttccgtcttg ccgcgggtac actgcatctt cacagcgagt tcaatttcac
6480tgagtctcgg gtggagacag cctggccatc attacgccat tcgtgcaggt
cggaacttac 6540ccgacaagga atttcgctac cttaggaccg ttatagttac
ggccgccgtt taccggggct 6600tcgatcaaga gcttcgcttg cgctaacccc
atcaattaac cttccggcac cgggcaggcg 6660tcacaccgta tacgtccact
ttcgtgtttg cacagtgctg tgtttttaat aaacagttgc 6720agccagctgg
tatcttcgac tgatttcagc tccatccgcg agggacctca cctacatatc
6780agcgtgcctt ctcccgaagt tacggcacca ttttgcctag ttccttcacc
cgagttctct 6840caagcgcctt ggtattctct acctgaccac ctgtgtcggt
ttggggtacg atttgatgtt 6900acctgatgct tagaggcttt tcctggaagc
agggcatttg ttgcttcagc accgtagtgc 6960ctcgtcatca cgcctcagcc
ttgattttcc ggatttgcct ggaaaaccag cctacacgct 7020taaaccggga
caaccgtcgc ccggccaaca tagccttctc cgtcccccct tcgcagtaac
7080accaagtaca ggaatattaa cctgtttccc atcgactacg cctttcggcc
tcgccttagg 7140ggtcgactca ccctgccccg attaacgttg gacaggaacc
cttggtcttc cggcgagcgg 7200gcttttcacc cgctttatcg ttacttatgt
cagcattcgc acttctgata cctccagcat 7260gcctcacagc acaccttcgc
aggcttacag aacgctcccc tacccaacaa cgcataagcg 7320tcgctgccgc
agcttcggtg catggtttag ccccgttaca tcttccgcgc aggccgactc
7380gaccagtgag ctattacgct ttctttaaat gatggctgct tctaagccaa
catcctggct 7440gtctgggcct tcccacatcg tttcccactt aaccatgact
ttgggacctt agctggcggt 7500ctgggttgtt tccctcttca cgacggacgt
tagcacccgc cgtgtgtctc ccgtgataac 7560attctccggt attcgcagtt
tgcatcgggt tggtaagtcg ggatgacccc cttgccgaaa 7620cagtgctcta
cccccggaga tgaattcacg aggcgctacc taaatagctt tcggggagaa
7680ccagctatct cccggtttga ttggcctttc acccccagcc acaagtcatc
cgctaatttt 7740tcaacattag tcggttcggt cctccagtta gtgttaccca
accttcaacc tgcccatggc 7800tagatcaccg ggtttcgggt ctataccctg
caacttaacg cccagttaag actcggtttc 7860ccttcggctc ccctattcgg
ttaaccttgc tacagaatat aagtcgctga cccattatac 7920aaaaggtacg
cagtcacacg cctaagcgtg ctcccactgc ttgtacgtac acggtttcag
7980gttctttttc actcccctcg ccggggttct tttcgccttt ccctcacggt
actggttcac 8040tatcggtcag tcaggagtat ttagccttgg aggatggtcc
ccccatattc agacaggata 8100ccacgtgtcc cgccctactc atcgagctca
cagcatgtgc atttttgtgt acggggctgt 8160caccctgtat cgcgcgcctt
tccagacgct tccactaaca cacacactga ttcaggctct 8220gggctgctcc
ccgttcgctc gccgctactg ggggaatctc ggttgatttc ttttcctcgg
8280ggtacttaga tgtttcagtt cccccggttc gcctcattaa cctatggatt
cagttaatga 8340tagtgtgtcg aaacacactg ggtttcccca ttcggaaatc
gccggttata acggttcata 8400tcaccttacc gacgcttatc gcagattagc
acgtccttca tcgcctctga ctgccagggc 8460atccaccgtg tacgcttagt
cgcttaacct cacaacccga agatgtttct ttcgattcat 8520catcgtgttg
cgaaaatttg agagactcac gaacaactct cgttgttcag tgtttcaatt
8580ttcagcttga tccagatttt taaagagcaa aaatctcaaa catcacccga
agatgagttt 8640tgagatatta aggtcggcga ctttcactca caaaccagca
agtggcgtcc cctaggggat 8700tcgaacccct gttaccgccg tgaaagggcg
gtgtcctggg cctctagacg aaggggacac 8760gaaaattgct tatcacgcgt
tgcgtgatat tttcgtgtag ggtgagcttt cattaataga 8820aagcgaacgg
ccttattctc ttcagcctca ctcccaacgc gtaaacgcct tgcttttcac
8880tttctatcag acaatctgtg tgagcactac aaagtacgct tctttaaggt
aatcccatga 8940tccaaccgca ggttccccta cggttacctt gttacgactt
caccccagtc atgaatcaca 9000aagtggtaag cgccctcccg aaggttaagc
tacctacttc ttttgcaacc cactcccatg 9060gtgtgacggg cggtgtgtac
aaggcccggg aacgtattca ccgtggcatt ctgatccacg 9120attactagcg
attccgactt catggagtcg agttgcagac tccaatccgg actacgacgc
9180actttatgag gtccgcttgc tctcgcgagg tcgcttctct ttgtatgcgc
cattgtagca 9240cgtgtgtagc cctggtcgta agggccatga tgacttgacg
tcatccccac cttcctccag 9300tttatcactg gcagtctcct ttgagttccc
ggccggaccg ctggcaacaa aggataaggg 9360ttgcgctcgt tgcgggactt
aacccaacat ttcacaacac gagctgacga cagccatgca 9420gcacctgtct
cacggttccc gaaggcacat tctcatctct gaaaacttcc gtggatgtca
9480agaccaggta aggttcttcg cgttgcatcg aattaaacca catgctccac
cgcttgtgcg 9540ggcccccgtc aattcatttg agttttaacc ttgcggccgt
actccccagg cggtcgactt 9600aacgcgttag ctccggaagc cacgcctcaa
gggcacaacc tccaagtcga catcgtttac 9660ggcgtggact accagggtat
ctaatcctgt ttgctcccca cgctttcgca cctgagcgtc 9720agtcttcgtc
cagggggccg ccttcgccac cggtattcct ccagatctct acgcatttca
9780ccgctacacc tggaattcta cccccctcta cgagactcaa gcttgccagt
atcagatgca 9840gttcccaggt tgagcccggg gatttcacat ctgacttaac
aaaccgcctg cgtgcgcttt 9900acgcccagta attccgatta acgcttgcac
cctccgtatt accgcggctg ctggcacgga 9960gttagccggt gcttcttctg
cgggtaacgt caatgagcaa aggtattaac tttactccct 10020tcctccccgc
tgaaagtact ttacaacccg aaggccttct tcatacacgc ggcatggctg
10080catcaggctt gcgcccattg tgcaatattc cccactgctg cctcccgtag
gagtctggac 10140cgtgtctcag ttccagtgtg gctggtcatc ctctcagacc
agctagggat cgtcgcctag 10200gtgagccgtt accccaccta ctagctaatc
ccatctgggc acatccgatg gcaagaggcc 10260cgaaggtccc cctctttggt
cttgcgacgt tatgcggtat tagctaccgt ttccagtagt 10320tatccccctc
catcaggcag tttcccagac attactcacc cgtccgccac tcgtcagcaa
10380agaagcaagc ttcttcctgt taccgttcga cttgcatgtg ttaggcctgc
cgccagcgtt 10440caatctgagc catgatcaaa ctcttcaatt taaaagtttg
acgctcaaag aattaaactt 10500cgtaatgaat tacgtgttca ctcttgagac
ttggtattca tttttcgtct tgcgacgtta 10560agaatccgta tcttcgagtg
cccacacaga ttgtctgata aattgttaaa gagcagtgcc 10620gcttcgcttt
ttctcagcgg ccgctgtgtg aaattgttat ccgctcacaa ttccacacat
10680tatacgagcc ggaagcataa agtgtaaagc ctggggtgcc taatgagtga
gctaactcac 10740attaattgcg ttgcgctcac tgcccgcttt ccagtcggga
aacctgtcgt gccagctgca 10800ttaatgaatc ggccaacgcg cggggagagg
cggtttgcgt attgggcgcc agggtggttt 10860ttcttttcac cagtgagacg
ggcaacagct gattgccctt caccgcctgg ccctgagaga 10920gttgcagcaa
gcggtccacg ctggtttgcc ccagcaggcg aaaatcctgt ttgatggtgg
10980ttgacggcgg gatataacat gagctgtctt cggtatcgtc gtatcccact
accgagatat 11040ccgcaccaac gcgcagcccg gactcggtaa tggcgcgcat
tgcgcccagc gccatctgat 11100cgttggcaac cagcatcgca gtgggaacga
tgccctcatt cagcatttgc atggtttgtt 11160gaaaaccgga catggcactc
cagtcgcctt cccgttccgc tatcggctga atttgattgc 11220gagtgagata
tttatgccag ccagccagac gcagacgcgc cgagacagaa cttaatgggc
11280ccgctaacag cgcgatttgc tggtgaccca atgcgaccag atgctccacg
cccagtcgcg 11340taccgtcttc atgggagaaa ataatactgt tgatgggtgt
ctggtcagag acatcaagaa 11400ataacgccgg aacattagtg caggcagctt
ccacagcaat ggcatcctgg tcatccagcg 11460gatagttaat gatcagccca
ctgacccgtt gcgcgagaag attgtgcacc gccgctttac 11520aggcttcgac
gccgcttcgt tctaccatcg acaccaccac gctggcaccc agttgatcgg
11580cgcgagattt aatcgccgcg acaatttgcg acggcgcgtg cagggccaga
ctggaggtgg 11640caacgccaat cagcaacgac tgtttgcccg ccagttgttg
tgccacgcgg ttgggaatgt 11700aattcagctc cgccatcgcc gcttccactt
tttcccgcgt tttcgcagaa acgtggctgg 11760cctggttcac cacgcgggaa
acggtctgat aagagacacc ggcatactct gcgacatcgt 11820ataacgttac
tggtttcaca ttcaccaccc tgaattgact ctcttccggg cgctatcatg
11880ccataccgcg aaaggttttg caccattcga tggtgtcg
11918
3 13278 DNA Artificial Sequence primer
3 aaattgaaga gtttgatcat
ggctcagatt gaacgctggc ggcaggccta acacatgcaa 60gtcgaacggt aacaggaaga
agcttgcttc tttgctgacg agtggcggac gggtgagtaa 120tgtctgggaa
actgcctgat ggagggggat aactactgga aacggtagct aataccgcat
180aacgtcgcaa gaccaaagag ggggaccttc gggcctcttg ccatcggatg
tgcccagatg 240ggattagcta gtaggtgggg taacggctca cctaggcgac
gatccctagc tggtctgaga 300ggatgaccag ccacactgga actgagacac
ggtccagact cctacgggag gcagcagtgg 360ggaatattgc acaatgggcg
caagcctgat gcagccatgc cgcgtgtatg aagaaggcct 420tcgggttgta
aagtactttc agcggggagg aagggagtaa agttaatacc tttgctcatt
480gacgttaccc gcagaagaag caccggctaa ctccgtgcca gcagccgcgg
taatacggag 540ggtgcaagcg ttaatcggaa ttactgggcg taaagcgcac
gcaggcggtt tgttaagtca 600gatgtgaaat ccccgggctc aacctgggaa
ctgcatctga tactggcaag cttgagtctc 660gtagaggggg gtagaattcc
aggtgtagcg gtgaaatgcg tagagatctg gaggaatacc 720ggtggcgaag
gcggccccct ggacgaagac tgacgctcag gtgcgaaagc gtggggagca
780aacaggatta gataccctgg tagtccacgc cgtaaacgat gtcgacttgg
aggttgtgcc 840cttgaggcgt ggcttccgga gctaacgcgt taagtcgacc
gcctggggag tacggccgca 900aggttaaaac tcaaatgaat tgacgggggc
ccgcacaagc ggtggagcat gtggtttaat 960tcgatgcaac gcgaagaacc
ttacctggtc ttgacatcca cggaagtttt cagagatgag 1020aatgtgcctt
cgggaaccgt gagacaggtg ctgcatggct gtcgtcagct cgtgttgtga
1080aatgttgggt taagtcccgc aacgagcgca acccttatcc tttgttgcca
gcggtccggc 1140cgggaactca aaggagactg ccagtgataa actggaggaa
ggtggggatg acgtcaagtc 1200atcatggccc ttacgaccag ggctacacac
gtgctacaat ggcgcataca aagagaagcg 1260acctcgcgag agcaagcgga
cctcataaag tgcgtcgtag tccggattgg agtctgcaac 1320tcgactccat
gaagtcggaa tcgctagtaa tcgtggatca gaatgccacg gtgaatacgt
1380tcccgggcct tgtacacacc gcccgtcaca ccatgggagt gggttgcaaa
agaagtaggt 1440agcttaacct tcgggagggc gcttaccact ttgtgattca
tgactggggt gaagtcgtaa 1500caaggtaacc gtaggggaac ctgcggttgg
atcatgggat taccttaaag aagcgtactt 1560tgtagtgctc acacagattg
tctgatagaa agtgaaaagc aaggcgttta cgcgttggga 1620gtgaggctga
agagaataag gccgttcgct ttctattaat gaaagctcac cctacacgaa
1680aatatcacgc aacgcgtgat aagcaatttt cgtgtcccct tcgtctagag
gcccaggaca 1740ccgccctttc acggcggtaa caggggttcg aatcccctag
gggacgccac ttgctggttt 1800gtgagtgaaa gtcgccgacc ttaatatctc
aaaactcatc ttcgggtgat gtttgagatt 1860tttgctcttt aaaaatctgg
atcaagctga aaattgaaac actgaacaac gagagttgtt 1920cgtgagtctc
tcaaattttc gcaacacgat gatgaatcga aagaaacatc ttcgggttgt
1980gaggttaagc gactaagcgt acacggtgga tgccctggca gtcagaggcg
atgaaggacg 2040tgctaatctg cgataagcgt cggtaaggtg atatgaaccg
ttataaccgg cgatttccga 2100atggggaaac ccagtgtgtt tcgacacact
atcattaact gaatccatag gttaatgagg 2160cgaaccgggg gaactgaaac
atctaagtac cccgaggaaa agaaatcaac cgagattccc 2220ccagtagcgg
cgagcgaacg gggagcagcc cagagcctga atcagtgtgt gtgttagtgg
2280aagcgtctgg aaaggcgcgc gatacagggt gacagccccg tacacaaaaa
tgcacatgct 2340gtgagctcga tgagtagggc gggacacgtg gtatcctgtc
tgaatatggg gggaccatcc 2400tccaaggcta aatactcctg actgaccgat
agtgaaccag taccgtgagg gaaaggcgaa 2460aagaaccccg gcgaggggag
tgaaaaagaa cctgaaaccg tgtacgtaca agcagtggga 2520gcacgcttag
gcgtgtgact gcgtaccttt tgtataatgg gtcagcgact tatattctgt
2580agcaaggtta accgaatagg ggagccgaag ggaaaccgag tcttaactgg
gcgttaagtt 2640gcagggtata gacccgaaac ccggtgatct agccatgggc
aggttgaagg ttgggtaaca 2700ctaactggag gaccgaaccg actaatgttg
aaaaattagc ggatgacttg tggctggggg 2760tgaaaggcca atcaaaccgg
gagatagctg gttctccccg aaagctattt aggtagcgcc 2820tcgtgaattc
atctccgggg gtagagcact gtttcggcaa gggggtcatc ccgacttacc
2880aacccgatgc aaactgcgaa taccggagaa tgttatcacg ggagacacac
ggcgggtgct 2940aacgtccgtc gtgaagaggg aaacaaccca gaccgccagc
taaggtccca aagtcatggt 3000taagtgggaa acgatgtggg aaggcccaga
cagccaggat gttggcttag aagcagccat 3060catttaaaga aagcgtaata
gctcactggt cgagtcggcc tgcgcggaag atgtaacggg 3120gctaaaccat
gcaccgaagc tgcggcagcg acgcttatgc gttgttgggt aggggagcgt
3180tctgtaagcc tgcgaaggtg tgctgtgagg catgctggag gtatcagaag
tgcgaatgct 3240gacataagta acgataaagc gggtgaaaag cccgctcgcc
ggaagaccaa gggttcctgt 3300ccaacgttaa tcggggcagg gtgagtcgac
ccctaaggcg aggccgaaag gcgtagtcga 3360tgggaaacag gttaatattc
ctgtacttgg tgttactgcg aaggggggac ggagaaggct 3420atgttggccg
ggcgacggtt gtcccggttt aagcgtgtag gctggttttc caggcaaatc
3480cggaaaatca aggctgaggc gtgatgacga ggcactacgg tgctgaagca
acaaatgccc 3540tgcttccagg aaaagcctct aagcatcagg taacatcaaa
tcgtacccca aaccgacaca 3600ggtggtcagg tagagaatac caaggcgctt
gagagaactc gggtgaagga actaggcaaa 3660atggtgccgt aacttcggga
gaaggcacgc tgatatgtag gtgaggtccc tcgcggatgg 3720agctgaaatc
agtcgaagat accagctggc tgcaactgtt tattaaaaac acagcactgt
3780gcaaacacga aagtggacgt atacggtgtg acgcctgccc ggtgccggaa
ggttaattga 3840tggggttagc gcaagcgaag ctcttgatcg aagccccggt
aaacggcggc cgtaactata 3900acggtcctaa ggtagcgaaa ttccttgtcg
ggtaagttcc gacctgcacg aatggcgtaa 3960tgatggccag gctgtctcca
cccgagactc agtgaaattg aactcgctgt gaagatgcag 4020tgtacccgcg
gcaagacgga aagaccccgt gaacctttac tatagcttga cactgaacat
4080tgagccttga tgtgtaggat aggtgggagg ctttgaagtg tggacgccag
tctgcatgga 4140gccgaccttg aaataccacc ctttaatgtt tgatgttcta
acgttgaccc gtaatccggg 4200ttgcggacag tgtctggtgg gtagtttgac
tggggcggtc tcctcctaaa gagtaacgga 4260ggagcacgaa ggttggctaa
tcctggtcgg acatcaggag gttagtgcaa tggcataagc 4320cagcttgact
gcgagcgtga cggcgcgagc aggtgcgaaa gcaggtcata gtgatccggt
4380ggttctgaat ggaagggcca tcgctcaacg gataaaaggt actccgggga
taacaggctg 4440ataccgccca agagttcata tcgacggcgg tgtttggcac
ctcgatgtcg gctcatcaca 4500tcctggggct gaagtaggtc ccaagggtat
ggctgttcgc catttaaagt ggtacgcgag 4560ctgggtttag aacgtcgtga
gacagttcgg tccctatctg ccgtgggcgc tggagaactg 4620aggggggctg
ctcctagtac gagaggaccg gagtggacgc atcactggtg ttcgggttgt
4680catgccaatg gcactgcccg gtagctaaat gcggaagaga taagtgctga
aagcatctaa 4740gcacgaaact tgccccgaga tgagttctcc ctgacccttt
aagggtcctg aaggaacgtt 4800gaagacgacg acgttgatag gccgggtgtg
taagcgcagc gatgcgttga gctaaccggt 4860actaatgaac cgtgaggctt
aaccttacaa cgccgaagct gttttggcgg atgagagaag 4920attttcagcc
tgatacagat taaatcagaa cgcagaagcg gtctgataaa acagaatttg
4980cctggcggca gtagcgcggt ggtcccacct gaccccatgc cgaactcaga
agtgaaacgc 5040cgtagcgccg atggtagtgt ggggtctccc catgcgagag
tagggaactg ccaggcatca 5100aataaaacga aaggctcagt cgaaagactg
ggcctttcgt tttatctgtt gtttgtcggt 5160gaacgctctc ctgagtagga
caaatccgcc gggagcggat ttgaacgttg cgaagcaacg 5220gcccggaggg
tggcgggcag gacgcccgcc ataaactgcc aggcatcaaa ttaagcagaa
5280ggccatcctg acggatggcc tttttgcgtt tctacaaact cttcctgtcg
tcatatctac 5340aagccatccc cccacagata cggtaaacta gcctcgtttt
tgcatcagga aagcagctat 5400gaaccactcc ttaaaaccct ggaacacatt
tggcattgat cataatgctc agcacattgt 5460atgggcctta agggcccaac
aattactcaa tgcctggcag tatgcaaccg cagaaggaca 5520acccgttctt
attctgggtg aaggaagtaa tgtacttttt ctggaggact atcgcggcac
5580ggtgatcatc aaccggatca aaggtatcga aattcatgat gaacctgatg
cgtggtattt 5640acatgtagga gccggagaaa actggcatcg tctggtaaaa
tacactttgc aggaaggtat 5700gcctggtctg gaaaatctgg cattaattcc
tggttgtgtc ggctcatcac ctatccagaa 5760tattggtgct tatggcgtag
aattacagcg agtttgcgct tatgttgatt ctgttgaact 5820ggcgacaggc
aagcaagtgc gcttaactgc caaagagtgc cgttttggct atcgcgacag
5880tatttttaaa catgaatacc aggaccgctt cgctattgta gccgtaggtc
tgcgtctgcc 5940aaaagagtgg caacctgtac taacgtatgg tgacttaact
cgtctgggat ccacaggacg 6000ggtgtggtcg ccatgatcgc gtagtcgata
gtggctccaa gtagcgaagc gagcaggact 6060gggcggcggc caaagcggtc
ggacagtgct ccgagaacgg gtgcgcatag aaattgcatc 6120aacgcatata
gcgctagcag cacgccatag tgactggcga tgctgtcgga atggacgata
6180tcccgcaaga ggcccggcag taccggcata accaagccta tgcctacagc
atccagggtg 6240acggtgccga ggatgacgat gagcgcattg ttagatttca
tacacggtgc ctgactgcgt 6300tagcaattta actgtgataa actaccgcat
taaagcttat cgatgataag ctgtcaaaca 6360tgagaattct tgaagacgaa
agggcctcgt gatacgccta tttttatagg ttaatgtcat 6420gataataatg
gtttcttaga cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc
6480tatttgttta tttttctaaa tacattcaaa tatgtatccg ctcatgagac
aataaccctg 6540ataaatgctt caataatatt gaaaaaggaa gagtatgagt
attcaacatt tccgtgtcgc 6600ccttattccc ttttttgcgg cattttgcct
tcctgttttt gctcacccag aaacgctggt 6660gaaagtaaaa gatgctgaag
atcagttggg tgcacgagtg ggttacatcg aactggatct 6720caacagcggt
aagatccttg agagttttcg ccccgaagaa cgttttccaa tgatgagcac
6780ttttaaagtt ctgctatgtg gcgcggtatt atcccgtgtt gacgccgggc
aagagcaact 6840cggtcgccgc atacactatt ctcagaatga cttggttgag
tactcaccag tcacagaaaa 6900gcatcttacg gatggcatga cagtaagaga
attatgcagt gctgccataa ccatgagtga 6960taacactgcg gccaacttac
ttctgacaac gatcggagga ccgaaggagc taaccgcttt 7020tttgcacaac
atgggggatc atgtaactcg ccttgatcgt tgggaaccgg agctgaatga
7080agccatacca aacgacgagc gtgacaccac gatgcctgca gcaatggcaa
caacgttgcg 7140caaactatta actggcgaac tacttactct
agcttcccgg caacaattaa tagactggat 7200ggaggcggat aaagttgcag
gaccacttct gcgctcggcc cttccggctg gctggtttat 7260tgctgataaa
tctggagccg gtgagcgtgg gtctcgcggt atcattgcag cactggggcc
7320agatggtaag ccctcccgta tcgtagttat ctacacgacg gggagtcagg
caactatgga 7380tgaacgaaat agacagatcg ctgagatagg tgcctcactg
attaagcatt ggtaactgtc 7440agaccaagtt tactcatata tactttagat
tgatttaaaa cttcattttt aatttaaaag 7500gatctaggtg aagatccttt
ttgataatct catgaccaaa atcccttaac gtgagttttc 7560gttccactga
gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag atcctttttt
7620tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg
tggtttgttt 7680gccggatcaa gagctaccaa ctctttttcc gaaggtaact
ggcttcagca gagcgcagat 7740accaaatact gtccttctag tgtagccgta
gttaggccac cacttcaaga actctgtagc 7800accgcctaca tacctcgctc
tgctaatcct gttaccagtg gctgctgcca gtggcgataa 7860gtcgtgtctt
accgggttgg actcaagacg atagttaccg gataaggcgc agcggtcggg
7920ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca
ccgaactgag 7980atacctacag cgtgagctat gagaaagcgc cacgcttccc
gaagggagaa aggcggacag 8040gtatccggta agcggcaggg tcggaacagg
agagcgcacg agggagcttc cagggggaaa 8100cgcctggtat ctttatagtc
ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt 8160gtgatgctcg
tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg
8220gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat
cccctgattc 8280tgtggataac cgtattaccg cctttgagtg agctgatacc
gctcgccgca gccgaacgac 8340cgagcgcagc gagtcagtga gcgaggaagc
ggaagagcgc ctgatgcggt attttctcct 8400tacgcatctg tgcggtattt
cacaccgcat atggtgcact ctcagtacaa tctgctctga 8460tgccgcatag
ttaagccagt atacactccg ctatcgctac gtgactgggt catggctgcg
8520ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct
cccggcatcc 8580gcttacagac aagctgtgac cgtctccggg agctgcatgt
gtcagaggtt ttcaccgtca 8640tcaccgaaac gcgcgaggca gctgcggtaa
agctcatcag cgtggtcgtg aagcgattca 8700cagatgtctg cctgttcatc
cgcgtccagc tcgttgagtt tctccagaag cgttaatgtc 8760tggcttctga
taaagcgggc catgttaagg gcggtttttt cctgtttggt cacttgatgc
8820ctccgtgtaa gggggaattt ctgttcatgg gggtaatgat accgatgaaa
cgagagagga 8880tgctcacgat acgggttact gatgatgaac atgcccggtt
actggaacgt tgtgagggta 8940aacaactggc ggtatggatg cggcgggacc
agagaaaaat cactcagggt caatgccagc 9000gcttcgttaa tacagatgta
ggtgttccac agggtagcca gcagcatcct gcgatgcctg 9060gcgaaagggg
gatgtgctgc aaggcgatta agttgggtaa cgccagggtt ttcccagtca
9120cgacgttgta aaacgacggc cagtgaattc gagctcggta cctgcactga
cgacaggaag 9180agtttgtaga aacgcaaaaa ggccatccgt caggatggcc
ttctgcttaa tttgatgcct 9240ggcagtttat ggcgggcgtc ctgcccgcca
ccctccgggc cgttgcttcg caacgttcaa 9300atccgctccc ggcggatttg
tcctactcag gagagcgttc accgacaaac aacagataaa 9360acgaaaggcc
cagtctttcg actgagcctt tcgttttatt tgatgcctgg cagttcccta
9420ctctcgcatg gggagacccc acactaccat cggcgctacg actagattat
ttgtagagct 9480catccatgcc atgtgtaatc ccagcagcag ttacaaactc
aagaaggacc atgtggtcac 9540gcttttcgtt gggatctttc gaaagggcag
attgtgtcga caggtaatgg ttgtctggta 9600aaaggacagg gccatcgcca
attggagtat tttgttgata atggtctgct agttgaacgg 9660atccatcttc
aatgttgtgg cgaattttga agttagcttt gattccattc ttttgtttgt
9720ctgccgtgat gtatacattg tgtgagttat agttgtactc gagtttgtgt
ccgagaatgt 9780ttccatcttc tttaaaatca atacctttta actcgatacg
attaacaagg gtatcacctt 9840caaacttgac ttcagcacgc gtcttgtagt
tcccgtcatc tttgaaagat atagtgcgtt 9900cctgtacata accttcgggc
atggcactct tgaaaaagtc atgccgtttc atatgatccg 9960gataacggga
aaagcattga acaccataag agaaagtagt gacaagtgtt ggccatggaa
10020caggtagttt tccagtagtg caaataaatt taagggtaag ctttccgtat
gtagcatcac 10080cttcaccctc tccactgaca gaaaatttgt gcccattaac
atcaccatct aattcaacaa 10140gaattgggac aactccagtg aaaagttctt
ctcctttgct cgcagtgatt tttttctcca 10200tttgcggagg gatatgaaag
cggccgcttc cacacattaa actagttcga tgattaattg 10260tcaacagctc
gccggcggca cctcgctaac ggattcacca ctccaagaat tggagccaat
10320cgattcttgc ggagaactgt gaatgcgggt acccagatcc ggaacataat
ggtgcagggc 10380gctgacttcc gcgtttccag actttacgaa acacggaaac
cgaagaccat tcatgttgtt 10440gctcaggtcg cagacgtttt gcagcagcag
tcgcttcacg ttcgctcgcg tatcggtgat 10500tcattctgct aaccagtaag
gcaaccccgc cagcctagcc gggtcctcaa cgacaggagc 10560acgatcatgc
gcacccgtgg ccaggaccca acgctgcccg agatgcgccg cgtgcggctg
10620ctggagatgg cggacgcgat ggatatgttc tgccaagggt tggtttgcgc
attcacagtt 10680ctccgcaaga atcgattggc tccaattctt ggagtggtga
atccgttagc gaggtgccgc 10740cggcgagctg ttgacaatta atcatcgaac
tagtttaatg tgtggaagcg gccgctttca 10800tatccctccg caaatggaga
aaaaaatcac tggatatacc accgttgata tatcccaatg 10860gcatcgtaaa
gaacattttg aggcatttca gtcagttgct caatgtacct ataaccagac
10920cgttcagctg gatattacgg cctttttaaa gaccgtaaag aaaaataagc
acaagtttta 10980tccggccttt attcacattc ttgcccgcct gatgaatgct
catccggaat tccgtatggc 11040aatgaaagac ggtgagctgg tgatatggga
tagtgttcac ccttgttaca ccgttttcca 11100tgagcaaact gaaacgtttt
catcgctctg gagtgaatac cacgacgatt tccggcagtt 11160tctacacata
tattcgcaag atgtggcgtg ttacggtgaa aacctggcct atttccctaa
11220agggtttatt gagaatatgt ttttcgtctc agccaatccc tgggtgagtt
tcaccagttt 11280tgatttaaac gtggccaata tggacaactt cttcgccccc
gttttcacca tgggcaaata 11340ttatacgcaa ggcgacaagg tgctgatgcc
gctggcgatt caggttcatc atgccgtctg 11400tgatggcttc catgtcggca
gaatgcttaa tgaattacaa cagtactgcg atgagtggca 11460gggcggggcg
taattttttt aaggcagtta ttggtgccct taaacgcctg gtgctacgcc
11520tgaataagtg ataataagcg gatgaatggc agaaattcga aagcaaattc
gacccggtcg 11580tcggttcagg gcagggtcgt taaatagccg cttatgtcta
ttgctggttt acggtttatt 11640gactacccga agcagtgtga ccctgtgctt
ctcaaatgcc tgagggcagt ttgctcaggt 11700ctcccgtggg ggggaataat
taacggtatg agccttacgg cggacggatc gtggccgcaa 11760gtgggtccgg
ctagaggatc cgacaccatc gaatggtgca aaacctttcg cggtatggca
11820tgatagcgcc cggaagagag tcaattcagg gtggtgaatg tgaaaccagt
aacgttatac 11880gatgtcgcag agtatgccgg tgtctcttat cagaccgttt
cccgcgtggt gaaccaggcc 11940agccacgttt ctgcgaaaac gcgggaaaaa
gtggaagcgg cgatggcgga gctgaattac 12000attcccaacc gcgtggcaca
acaactggcg ggcaaacagt cgttgctgat tggcgttgcc 12060acctccagtc
tggccctgca cgcgccgtcg caaattgtcg cggcgattaa atctcgcgcc
12120gatcaactgg gtgccagcgt ggtggtgtcg atggtagaac gaagcggcgt
cgaagcctgt 12180aaagcggcgg tgcacaatct tctcgcgcaa cgggtcagtg
ggctgatcat taactatccg 12240ctggatgacc aggatgccat tgctgtggaa
gctgcctgca ctaatgttcc ggcgttattt 12300cttgatgtct ctgaccagac
acccatcaac agtattattt tctcccatga agacggtacg 12360cgactgggcg
tggagcatct ggtcgcattg ggtcaccagc aaatcgcgct gttagcgggc
12420ccattaagtt ctgtctcggc gcgtctgcgt ctggctggct ggcataaata
tctcactcgc 12480aatcaaattc agccgatagc ggaacgggaa ggcgactgga
gtgccatgtc cggttttcaa 12540caaaccatgc aaatgctgaa tgagggcatc
gttcccactg cgatgctggt tgccaacgat 12600cagatggcgc tgggcgcaat
gcgcgccatt accgagtccg ggctgcgcgt tggtgcggat 12660atctcggtag
tgggatacga cgataccgaa gacagctcat gttatatccc gccgtcaacc
12720accatcaaac aggattttcg cctgctgggg caaaccagcg cggaccgctt
gctgcaactc 12780tctcagggcc aggcggtgaa gggcaatcag ctgttgcccg
tctcactggt gaaaagaaaa 12840accaccctgg cgcccaatac gcaaaccgcc
tctccccgcg cgttggccga ttcattaatg 12900cagctggcac gacaggtttc
ccgactggaa agcgggcagt gagcgcaacg caattaatgt 12960gagttagctc
actcattagg caccccaggc tttacacttt atgcttccgg ctcgtataat
13020gtgtggaatt gtgagcggat aacaatttca cacagcggcc gctgagaaaa
agcgaagcgg 13080cactgctctt taacaattta tcagacaatc tgtgtgggca
ctcgaagata cggattctta 13140acgtcgcaag acgaaaaatg aataccaagt
ctcaagagtg aacacgtaat tcattacgaa 13200gtttaattct ttgagcgtca
aacttttaac gacggccagt gaattcgagc tcggtacctg 13260cactgacgac
aggaagag 13278
<210> 4 <211> 13227 <212> DNA <213> Artificial Sequence <220> <223> primer <400> 4
aaattgaaga
gtttgatcat ggctcagatt gaacgctggc ggcaggccta acacatgcaa 60gtcgaacggt
aacaggaaga agcttgcttc tttgctgacg agtggcggac gggtgagtaa
120tgtctgggaa actgcctgat ggagggggat aactactgga aacggtagct
aataccgcat 180aacgtcgcaa gaccaaagag ggggaccttc gggcctcttg
ccatcggatg tgcccagatg 240ggattagcta gtaggtgggg taacggctca
cctaggcgac gatccctagc tggtctgaga 300ggatgaccag ccacactgga
actgagacac ggtccagact cctacgggag gcagcagtgg 360ggaatattgc
acaatgggcg caagcctgat gcagccatgc cgcgtgtatg aagaaggcct
420tcgggttgta aagtactttc agcggggagg aagggagtaa agttaatacc
tttgctcatt 480gacgttaccc gcagaagaag caccggctaa ctccgtgcca
gcagccgcgg taatacggag 540ggtgcaagcg ttaatcggaa ttactgggcg
taaagcgcac gcaggcggtt tgttaagtca 600gatgtgaaat ccccgggctc
aacctgggaa ctgcatctga tactggcaag cttgagtctc 660gtagaggggg
gtagaattcc aggtgtagcg gtgaaatgcg tagagatctg gaggaatacc
720ggtggcgaag gcggccccct ggacgaagac tgacgctcag gtgcgaaagc
gtggggagca 780aacaggatta gataccctgg tagtccacgc cgtaaacgat
gtcgacttgg aggttgtgcc 840cttgaggcgt ggcttccgga gctaacgcgt
taagtcgacc gcctggggag tacggccgca 900aggttaaaac tcaaatgaat
tgacgggggc ccgcacaagc ggcggagcat gtggattaat 960tcgatgcaac
gcgaagaacc ttacctgggt ttgacatgca caggacgcgt ctagagatag
1020gcgttccctt gtggcctgtg tgcaggtggt gcatggctgt cgtcagctcg
tgtcgtgaga 1080tgttgggtta agtcccgcaa cgagcgcaac ccttgtctca
tgttgccagc acgtaatggt 1140ggggactcgt gagagactgc cggggtcaac
tcggaggaag gtggggatga cgtcaagtca 1200tcatgcccct tatgtccagg
gcttcacaca tgctacaatg gccggtacaa agggctgcga 1260tgccgcgagg
ttaagcgaat ccttaaaagc cggtctcagt tcggatcggg gtctgcaact
1320cgaccccgtg aagtcggagt cgctagtaat cgcagatcag caacgctgcg
gtgaatacgt 1380tcccgggcct tgtacacacc gcccgtcacg tcatgaaagt
cggtaacacc cgaagccagt 1440ggcctaaccc tcgggaggga gctgtcgaag
gtgggatcgg cgattgggac gaagtcgtaa 1500caaggtaacc gtaggggaac
ctgcggttgg atcatgggat taccttaaag aagcgtactt 1560tgtagtgctc
acacagattg tctgatagaa agtgaaaagc aaggcgttta cgcgttggga
1620gtgaggctga agagaataag gccgttcgct ttctattaat gaaagctcac
cctacacgaa 1680aatatcacgc aacgcgtgat aagcaatttt cgtgtcccct
tcgtctagag gcccaggaca 1740ccgccctttc acggcggtaa caggggttcg
aatcccctag gggacgccac ttgctggttt 1800gtgagtgaaa gtcgccgacc
ttaatatctc aaaactcatc ttcgggtgat gtttgagatt 1860tttgctcttt
aaaaatctgg atcaagctga aaattgaaac actgaacaac gagagttgtt
1920cgtgagtctc tcaaattttc gcaacacgat gatgaatcga aagaaacatc
ttcgggttgt 1980gaggttaagc gactaagcgt acacggtgga tgccctggca
gtcagaggcg atgaaggacg 2040tgctaatctg cgataagcgt cggtaaggtg
atatgaaccg ttataaccgg cgatttccga 2100atggggaaac ccagtgtgtt
tcgacacact atcattaact gaatccatag gttaatgagg 2160cgaaccgggg
gaactgaaac atctaagtac cccgaggaaa agaaatcaac cgagattccc
2220ccagtagcgg cgagcgaacg gggagcagcc cagagcctga atcagtgtgt
gtgttagtgg 2280aagcgtctgg aaaggcgcgc gatacagggt gacagccccg
tacacaaaaa tgcacatgct 2340gtgagctcga tgagtagggc gggacacgtg
gtatcctgtc tgaatatggg gggaccatcc 2400tccaaggcta aatactcctg
actgaccgat agtgaaccag taccgtgagg gaaaggcgaa 2460aagaaccccg
gcgaggggag tgaaaaagaa cctgaaaccg tgtacgtaca agcagtggga
2520gcacgcttag gcgtgtgact gcgtaccttt tgtataatgg gtcagcgact
tatattctgt 2580agcaaggtta accgaatagg ggagccgaag ggaaaccgag
tcttaactgg gcgttaagtt 2640gcagggtata gacccgaaac ccggtgatct
agccatgggc aggttgaagg ttgggtaaca 2700ctaactggag gaccgaaccg
actaatgttg aaaaattagc ggatgacttg tggctggggg 2760tgaaaggcca
atcaaaccgg gagatagctg gttctccccg aaagctattt aggtagcgcc
2820tcgtgaattc atctccgggg gtagagcact gtttcggcaa gggggtcatc
ccgacttacc 2880aacccgatgc aaactgcgaa taccggagaa tgttatcacg
ggagacacac ggcgggtgct 2940aacgtccgtc gtgaagaggg aaacaaccca
gaccgccagc taaggtccca aagtcatggt 3000taagtgggaa acgatgtggg
aaggcccaga cagccaggat gttggcttag aagcagccat 3060catttaaaga
aagcgtaata gctcactggt cgagtcggcc tgcgcggaag atgtaacggg
3120gctaaaccat gcaccgaagc tgcggcagcg acgcttatgc gttgttgggt
aggggagcgt 3180tctgtaagcc tgcgaaggtg tgctgtgagg catgctggag
gtatcagaag tgcgaatgct 3240gacataagta acgataaagc gggtgaaaag
cccgctcgcc ggaagaccaa gggttcctgt 3300ccaacgttaa tcggggcagg
gtgagtcgac ccctaaggcg aggccgaaag gcgtagtcga 3360tgggaaacag
gttaatattc ctgtacttgg tgttactgcg aaggggggac ggagaaggct
3420atgttggccg ggcgacggtt gtcccggttt aagcgtgtag gctggttttc
caggcaaatc 3480cggaaaatca aggctgaggc gtgatgacga ggcactacgg
tgctgaagca acaaatgccc 3540tgcttccagg aaaagcctct aagcatcagg
taacatcaaa tcgtacccca aaccgacaca 3600ggtggtcagg tagagaatac
caaggcgctt gagagaactc gggtgaagga actaggcaaa 3660atggtgccgt
aacttcggga gaaggcacgc tgatatgtag gtgaggtccc tcgcggatgg
3720agctgaaatc agtcgaagat accagctggc tgcaactgtt tattaaaaac
acagcactgt 3780gcaaacacga aagtggacgt atacggtgtg acgcctgccc
ggtgccggaa ggttaattga 3840tggggttagc gcaagcgaag ctcttgatcg
aagccccggt aaacggcggc cgtaactata 3900acggtcctaa ggtagcgaaa
ttccttgtcg ggtaagttcc gacctgcacg aatggcgtaa 3960tgatggccag
gctgtctcca cccgagactc agtgaaattg aactcgctgt gaagatgcag
4020tgtacccgcg gcaagacgga aagaccccgt gaacctttac tatagcttga
cactgaacat 4080tgagccttga tgtgtaggat aggtgggagg ctttgaagtg
tggacgccag tctgcatgga 4140gccgaccttg aaataccacc ctttaatgtt
tgatgttcta acgttgaccc gtaatccggg 4200ttgcggacag tgtctggtgg
gtagtttgac tggggcggtc tcctcctaaa gagtaacgga 4260ggagcacgaa
ggttggctaa tcctggtcgg acatcaggag gttagtgcaa tggcataagc
4320cagcttgact gcgagcgtga cggcgcgagc aggtgcgaaa gcaggtcata
gtgatccggt 4380ggttctgaat ggaagggcca tcgctcaacg gataaaaggt
actccgggga taacaggctg 4440ataccgccca agagttcata tcgacggcgg
tgtttggcac ctcgatgtcg gctcatcaca 4500tcctggggct gaagtaggtc
ccaagggtat ggctgttcgc catttaaagt ggtacgcgag 4560ctgggtttag
aacgtcgtga gacagttcgg tccctatctg ccgtgggcgc tggagaactg
4620aggggggctg ctcctagtac gagaggaccg gagtggacgc atcactggtg
ttcgggttgt 4680catgccaatg gcactgcccg gtagctaaat gcggaagaga
taagtgctga aagcatctaa 4740gcacgaaact tgccccgaga tgagttctcc
ctgacccttt aagggtcctg aaggaacgtt 4800gaagacgacg acgttgatag
gccgggtgtg taagcgcagc gatgcgttga gctaaccggt 4860actaatgaac
cgtgaggctt aaccttacaa cgccgaagct gttttggcgg atgagagaag
4920attttcagcc tgatacagat taaatcagaa cgcagaagcg gtctgataaa
acagaatttg 4980cctggcggca gtagcgcggt ggtcccacct gaccccatgc
cgaactcaga agtgaaacgc 5040cgtagcgccg atggtagtgt ggggtctccc
catgcgagag tagggaactg ccaggcatca 5100aataaaacga aaggctcagt
cgaaagactg ggcctttcgt tttatctgtt gtttgtcggt 5160gaacgctctc
ctgagtagga caaatccgcc gggagcggat ttgaacgttg cgaagcaacg
5220gcccggaggg tggcgggcag gacgcccgcc ataaactgcc aggcatcaaa
ttaagcagaa 5280ggccatcctg acggatggcc tttttgcgtt tctacaaact
cttcctgtcg tcatatctac 5340aagccatccc cccacagata cggtaaacta
gcctcgtttt tgcatcagga aagcagctat 5400gaaccactcc ttaaaaccct
ggaacacatt tggcattgat cataatgctc agcacattgt 5460atgggcctta
agggcccaac aattactcaa tgcctggcag tatgcaaccg cagaaggaca
5520acccgttctt attctgggtg aaggaagtaa tgtacttttt ctggaggact
atcgcggcac 5580ggtgatcatc aaccggatca aaggtatcga aattcatgat
gaacctgatg cgtggtattt 5640acatgtagga gccggagaaa actggcatcg
tctggtaaaa tacactttgc aggaaggtat 5700gcctggtctg gaaaatctgg
cattaattcc tggttgtgtc ggctcatcac ctatccagaa 5760tattggtgct
tatggcgtag aattacagcg agtttgcgct tatgttgatt ctgttgaact
5820ggcgacaggc aagcaagtgc gcttaactgc caaagagtgc cgttttggct
atcgcgacag 5880tatttttaaa catgaatacc aggaccgctt cgctattgta
gccgtaggtc tgcgtctgcc 5940aaaagagtgg caacctgtac taacgtatgg
tgacttaact cgtctgggat ccacaggacg 6000ggtgtggtcg ccatgatcgc
gtagtcgata gtggctccaa gtagcgaagc gagcaggact 6060gggcggcggc
caaagcggtc ggacagtgct ccgagaacgg gtgcgcatag aaattgcatc
6120aacgcatata gcgctagcag cacgccatag tgactggcga tgctgtcgga
atggacgata 6180tcccgcaaga ggcccggcag taccggcata accaagccta
tgcctacagc atccagggtg 6240acggtgccga ggatgacgat gagcgcattg
ttagatttca tacacggtgc ctgactgcgt 6300tagcaattta actgtgataa
actaccgcat taaagcttat cgatgataag ctgtcaaaca 6360tgagaattct
tgaagacgaa agggcctcgt gatacgccta tttttatagg ttaatgtcat
6420gataataatg gtttcttaga cgtcaggtgg cacttttcgg ggaaatgtgc
gcggaacccc 6480tatttgttta tttttctaaa tacattcaaa tatgtatccg
ctcatgagac aataaccctg 6540ataaatgctt caataatatt gaaaaaggaa
gagtatgagt attcaacatt tccgtgtcgc 6600ccttattccc ttttttgcgg
cattttgcct tcctgttttt gctcacccag aaacgctggt 6660gaaagtaaaa
gatgctgaag atcagttggg tgcacgagtg ggttacatcg aactggatct
6720caacagcggt aagatccttg agagttttcg ccccgaagaa cgttttccaa
tgatgagcac 6780ttttaaagtt ctgctatgtg gcgcggtatt atcccgtgtt
gacgccgggc aagagcaact 6840cggtcgccgc atacactatt ctcagaatga
cttggttgag tactcaccag tcacagaaaa 6900gcatcttacg gatggcatga
cagtaagaga attatgcagt gctgccataa ccatgagtga 6960taacactgcg
gccaacttac ttctgacaac gatcggagga ccgaaggagc taaccgcttt
7020tttgcacaac atgggggatc atgtaactcg ccttgatcgt tgggaaccgg
agctgaatga 7080agccatacca aacgacgagc gtgacaccac gatgcctgca
gcaatggcaa caacgttgcg 7140caaactatta actggcgaac tacttactct
agcttcccgg caacaattaa tagactggat 7200ggaggcggat aaagttgcag
gaccacttct gcgctcggcc cttccggctg gctggtttat 7260tgctgataaa
tctggagccg gtgagcgtgg gtctcgcggt atcattgcag cactggggcc
7320agatggtaag ccctcccgta tcgtagttat ctacacgacg gggagtcagg
caactatgga 7380tgaacgaaat agacagatcg ctgagatagg tgcctcactg
attaagcatt ggtaactgtc 7440agaccaagtt tactcatata tactttagat
tgatttaaaa cttcattttt aatttaaaag 7500gatctaggtg aagatccttt
ttgataatct catgaccaaa atcccttaac gtgagttttc 7560gttccactga
gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag atcctttttt
7620tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg
tggtttgttt 7680gccggatcaa gagctaccaa ctctttttcc gaaggtaact
ggcttcagca gagcgcagat 7740accaaatact gtccttctag tgtagccgta
gttaggccac cacttcaaga actctgtagc 7800accgcctaca tacctcgctc
tgctaatcct gttaccagtg gctgctgcca gtggcgataa 7860gtcgtgtctt
accgggttgg actcaagacg atagttaccg gataaggcgc agcggtcggg
7920ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca
ccgaactgag 7980atacctacag cgtgagctat gagaaagcgc cacgcttccc
gaagggagaa aggcggacag 8040gtatccggta agcggcaggg tcggaacagg
agagcgcacg agggagcttc cagggggaaa 8100cgcctggtat ctttatagtc
ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt 8160gtgatgctcg
tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg
8220gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat
cccctgattc 8280tgtggataac cgtattaccg cctttgagtg agctgatacc
gctcgccgca gccgaacgac 8340cgagcgcagc gagtcagtga gcgaggaagc
ggaagagcgc ctgatgcggt attttctcct 8400tacgcatctg tgcggtattt
cacaccgcat atggtgcact ctcagtacaa tctgctctga 8460tgccgcatag
ttaagccagt atacactccg ctatcgctac gtgactgggt catggctgcg
8520ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct
cccggcatcc 8580gcttacagac aagctgtgac cgtctccggg agctgcatgt
gtcagaggtt ttcaccgtca 8640tcaccgaaac gcgcgaggca gctgcggtaa
agctcatcag cgtggtcgtg aagcgattca 8700cagatgtctg cctgttcatc
cgcgtccagc tcgttgagtt tctccagaag cgttaatgtc 8760tggcttctga
taaagcgggc catgttaagg gcggtttttt cctgtttggt cacttgatgc
8820ctccgtgtaa gggggaattt ctgttcatgg gggtaatgat accgatgaaa
cgagagagga 8880tgctcacgat acgggttact gatgatgaac atgcccggtt actggaacgt
tgtgagggta 8940aacaactggc ggtatggatg cggcgggacc agagaaaaat
cactcagggt caatgccagc 9000gcttcgttaa tacagatgta ggtgttccac
agggtagcca gcagcatcct gcgatgcctg 9060gcgaaagggg gatgtgctgc
aaggcgatta agttgggtaa cgccagggtt ttcccagtca 9120cgacgttgta
aaacgacggc cagtgaattc gagctcggta cctgcactga cgacaggaag
9180agtttgtaga aacgcaaaaa ggccatccgt caggatggcc ttctgcttaa
tttgatgcct 9240ggcagtttat ggcgggcgtc ctgcccgcca ccctccgggc
cgttgcttcg caacgttcaa 9300atccgctccc ggcggatttg tcctactcag
gagagcgttc accgacaaac aacagataaa 9360acgaaaggcc cagtctttcg
actgagcctt tcgttttatt tgatgcctgg cagttcccta 9420ctctcgcatg
gggagacccc acactaccat cggcgctacg actagattat ttgtagagct
9480catccatgcc atgtgtaatc ccagcagcag ttacaaactc aagaaggacc
atgtggtcac 9540gcttttcgtt gggatctttc gaaagggcag attgtgtcga
caggtaatgg ttgtctggta 9600aaaggacagg gccatcgcca attggagtat
tttgttgata atggtctgct agttgaacgg 9660atccatcttc aatgttgtgg
cgaattttga agttagcttt gattccattc ttttgtttgt 9720ctgccgtgat
gtatacattg tgtgagttat agttgtactc gagtttgtgt ccgagaatgt
9780ttccatcttc tttaaaatca atacctttta actcgatacg attaacaagg
gtatcacctt 9840caaacttgac ttcagcacgc gtcttgtagt tcccgtcatc
tttgaaagat atagtgcgtt 9900cctgtacata accttcgggc atggcactct
tgaaaaagtc atgccgtttc atatgatccg 9960gataacggga aaagcattga
acaccataag agaaagtagt gacaagtgtt ggccatggaa 10020caggtagttt
tccagtagtg caaataaatt taagggtaag ctttccgtat gtagcatcac
10080cttcaccctc tccactgaca gaaaatttgt gcccattaac atcaccatct
aattcaacaa 10140gaattgggac aactccagtg aaaagttctt ctcctttgct
cgcagtgatt tttttctcca 10200tttgcggagg gatatgaaag cggccgcttc
cacacattaa actagttcga tgattaattg 10260tcaacagctc gccggcggca
cctcgctaac ggattcacca ctccaagaat tggagccaat 10320cgattcttgc
ggagaactgt gaatgcgggt acccagatcc ggaacataat ggtgcagggc
10380gctgacttcc gcgtttccag actttacgaa acacggaaac cgaagaccat
tcatgttgtt 10440gctcaggtcg cagacgtttt gcagcagcag tcgcttcacg
ttcgctcgcg tatcggtgat 10500tcattctgct aaccagtaag gcaaccccgc
cagcctagcc gggtcctcaa cgacaggagc 10560acgatcatgc gcacccgtgg
ccaggaccca acgctgcccg agatgcgccg cgtgcggctg 10620ctggagatgg
cggacgcgat ggatatgttc tgccaagggt tggtttgcgc attcacagtt
10680ctccgcaaga atcgattggc tccaattctt ggagtggtga atccgttagc
gaggtgccgc 10740cggcgagctg ttgacaatta atcatcgaac tagtttaatg
tgtggaagcg gccgctttca 10800tatccctccg caaatggaga aaaaaatcac
tggatatacc accgttgata tatcccaatg 10860gcatcgtaaa gaacattttg
aggcatttca gtcagttgct caatgtacct ataaccagac 10920cgttcagctg
gatattacgg cctttttaaa gaccgtaaag aaaaataagc acaagtttta
10980tccggccttt attcacattc ttgcccgcct gatgaatgct catccggaat
tccgtatggc 11040aatgaaagac ggtgagctgg tgatatggga tagtgttcac
ccttgttaca ccgttttcca 11100tgagcaaact gaaacgtttt catcgctctg
gagtgaatac cacgacgatt tccggcagtt 11160tctacacata tattcgcaag
atgtggcgtg ttacggtgaa aacctggcct atttccctaa 11220agggtttatt
gagaatatgt ttttcgtctc agccaatccc tgggtgagtt tcaccagttt
11280tgatttaaac gtggccaata tggacaactt cttcgccccc gttttcacca
tgggcaaata 11340ttatacgcaa ggcgacaagg tgctgatgcc gctggcgatt
caggttcatc atgccgtctg 11400tgatggcttc catgtcggca gaatgcttaa
tgaattacaa cagtactgcg atgagtggca 11460gggcggggcg taattttttt
aaggcagtta ttggtgccct taaacgcctg gtgctacgcc 11520tgaataagtg
ataataagcg gatgaatggc agaaattcga aagcaaattc gacccggtcg
11580tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt
acggtttatt 11640gactacccga agcagtgtga ccctgtgctt ctcaaatgcc
tgagggcagt ttgctcaggt 11700ctcccgtggg ggggaataat taacggtatg
agccttacgg cggacggatc gtggccgcaa 11760gtgggtccgg ctagaggatc
cgacaccatc gaatggtgca aaacctttcg cggtatggca 11820tgatagcgcc
cggaagagag tcaattcagg gtggtgaatg tgaaaccagt aacgttatac
11880gatgtcgcag agtatgccgg tgtctcttat cagaccgttt cccgcgtggt
gaaccaggcc 11940agccacgttt ctgcgaaaac gcgggaaaaa gtggaagcgg
cgatggcgga gctgaattac 12000attcccaacc gcgtggcaca acaactggcg
ggcaaacagt cgttgctgat tggcgttgcc 12060acctccagtc tggccctgca
cgcgccgtcg caaattgtcg cggcgattaa atctcgcgcc 12120gatcaactgg
gtgccagcgt ggtggtgtcg atggtagaac gaagcggcgt cgaagcctgt
12180aaagcggcgg tgcacaatct tctcgcgcaa cgggtcagtg ggctgatcat
taactatccg 12240ctggatgacc aggatgccat tgctgtggaa gctgcctgca
ctaatgttcc ggcgttattt 12300cttgatgtct ctgaccagac acccatcaac
agtattattt tctcccatga agacggtacg 12360cgactgggcg tggagcatct
ggtcgcattg ggtcaccagc aaatcgcgct gttagcgggc 12420ccattaagtt
ctgtctcggc gcgtctgcgt ctggctggct ggcataaata tctcactcgc
12480aatcaaattc agccgatagc ggaacgggaa ggcgactgga gtgccatgtc
cggttttcaa 12540caaaccatgc aaatgctgaa tgagggcatc gttcccactg
cgatgctggt tgccaacgat 12600cagatggcgc tgggcgcaat gcgcgccatt
accgagtccg ggctgcgcgt tggtgcggat 12660atctcggtag tgggatacga
cgataccgaa gacagctcat gttatatccc gccgtcaacc 12720accatcaaac
aggattttcg cctgctgggg caaaccagcg cggaccgctt gctgcaactc
12780tctcagggcc aggcggtgaa gggcaatcag ctgttgcccg tctcactggt
gaaaagaaaa 12840accaccctgg cgcccaatac gcaaaccgcc tctccccgcg
cgttggccga ttcattaatg 12900cagctggcac gacaggtttc ccgactggaa
agcgggcagt gagcgcaacg caattaatgt 12960gagttagctc actcattagg
caccccaggc tttacacttt atgcttccgg ctcgtataat 13020gtgtggaatt
gtgagcggat aacaatttca cacagcggcc gctgagaaaa agcgaagcgg
13080cactgctctt taacaattta tcagacaatc tgtgtgggca ctcgaagata
cggattctta 13140acgtcgcaag acgaaaaatg aataccaagt ctcaagagtg
aacacgtaat tcattacgaa 13200gtttaattct ttgagcgtca aactttt 13227
<210> 5 <211> 8752 <212> DNA <213> Artificial Sequence <220> <223> primer <400> 5
aaattgaaga gtttgatcat
ggctcagatt gaacgctggc ggcaggccta acacatgcaa 60gtcgaacggt aacaggaaga
agcttgcttc tttgctgacg agtggcggac gggtgagtaa 120tgtctgggaa
actgcctgat ggagggggat aactactgga aacggtagct aataccgcat
180aacgtcgcaa gaccaaagag ggggaccttc gggcctcttg ccatcggatg
tgcccagatg 240ggattagcta gtaggtgggg taacggctca cctaggcgac
gatccctagc tggtctgaga 300ggatgaccag ccacactgga actgagacac
ggtccagact cctacgggag gcagcagtgg 360ggaatattgc acaatgggcg
caagcctgat gcagccatgc cgcgtgtatg aagaaggcct 420tcgggttgta
aagtactttc agcggggagg aagggagtaa agttaatacc tttgctcatt
480gacgttaccc gcagaagaag caccggctaa ctccgtgcca gcagccgcgg
taatacggag 540ggtgcaagcg ttaatcggaa ttactgggcg taaagcgcac
gcaggcggtt tgttaagtca 600gatgtgaaat ccccgggctc aacctgggaa
ctgcatctga tactggcaag cttgagtctc 660gtagaggggg gtagaattcc
aggtgtagcg gtgaaatgcg tagagatctg gaggaatacc 720ggtggcgaag
gcggccccct ggacgaagac tgacgctcag gtgcgaaagc gtggggagca
780aacaggatta gataccctgg tagtccacgc cgtaaacgat gtcgacttgg
aggttgtgcc 840cttgaggcgt ggcttccgga gctaacgcgt taagtcgacc
gcctggggag tacggccgca 900aggttaaaac tcaaatgaat tgacgggggc
ccgcacaagc ggcggagcat gtggattaat 960tcgatgcaac gcgaagaacc
ttacctgggt ttgacatgca caggacgcgt ctagagatag 1020gcgttccctt
gtggcctgtg tgcaggtggt gcatggctgt cgtcagctcg tgtcgtgaga
1080tgttgggtta agtcccgcaa cgagcgcaac ccttgtctca tgttgccagc
acgtaatggt 1140ggggactcgt gagagactgc cggggtcaac tcggaggaag
gtggggatga cgtcaagtca 1200tcatgcccct tatgtccagg gcttcacaca
tgctacaatg gccggtacaa agggctgcga 1260tgccgcgagg ttaagcgaat
ccttaaaagc cggtctcagt tcggatcggg gtctgcaact 1320cgaccccgtg
aagtcggagt cgctagtaat cgcagatcag caacgctgcg gtgaatacgt
1380tcccgggcct tgtacacacc gcccgtcacg tcatgaaagt cggtaacacc
cgaagccagt 1440ggcctaaccc tcgggaggga gctgtcgaag gtgggatcgg
cgattgggac gaagtcgtaa 1500caaggtaacc gtaggggaac ctgcggttgg
atcatgggat taccttaaag aagcgtactt 1560tgtagtgctc acacagattg
tctgatagaa agtgaaaagc aaggcgttta cgcgttggga 1620gtgaggctga
agagaataag gccgttcgct ttctattaat gaaagctcac cctacacgaa
1680aatatcacgc aacgcgtgat aagcaatttt cgtgtcccct tcgtctagac
gtagcgccga 1740tggtagtgtg gggtctcccc atgcgagagt agggaactgc
caggcatcaa ataaaacgaa 1800aggctcagtc gaaagactgg gcctttcgtt
ttatctgttg tttgtcggtg aacgctctcc 1860tgagtaggac aaatccgccg
ggagcggatt tgaacgttgc gaagcaacgg cccggagggt 1920ggcgggcagg
acgcccgcca taaactgcca ggcatcaaat taagcagaag gccatcctga
1980cggatggcct ttttgcgttt ctacaaactc ttcctgtcgt cactgcaggc
atgcaagctt 2040ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt
tatccgctca caattccaca 2100caacatacga gccggaagca taaagtgtaa
agcctggggt gcctaatgag tgagctaact 2160cacattaatt gcgttgcgct
cactgcccgc tttccagtcg ggaaacctgt cgtgccagct 2220gcattaatga
atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc
2280ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg
tatcagctca 2340ctcaaaggcg gtaatacggt tatccacaga atcaggggat
aacgcaggaa agaacatgtg 2400agcaaaaggc cagcaaaagg ccaggaaccg
taaaaaggcc gcgttgctgg cgtttttcca 2460taggctccgc ccccctgacg
agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 2520cccgacagga
ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc
2580tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg
gaagcgtggc 2640gctttctcat agctcacgct gtaggtatct cagttcggtg
taggtcgttc gctccaagct 2700gggctgtgtg cacgaacccc ccgttcagcc
cgaccgctgc gccttatccg gtaactatcg 2760tcttgagtcc aacccggtaa
gacacgactt atcgccactg gcagcagcca ctggtaacag 2820gattagcaga
gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta
2880cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag
ttaccttcgg 2940aaaaagagtt ggtagctctt gatccggcaa acaaaccacc
gctggtagcg gtggtttttt 3000tgtttgcaag cagcagatta cgcgcagaaa
aaaaggatct caagaagatc ctttgatctt 3060ttctacgggg tctgacgctc
agtggaacga aaactcacgt taagggattt tggtcatgag 3120attatcaaaa
aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat
3180ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca
gtgaggcacc 3240tatctcagcg atctgtctat ttcgttcatc catagttgcc
tgactccccg tcgtgtagat 3300aactacgata cgggagggct taccatctgg
ccccagtgct gcaatgatac cgcgagaccc 3360acgctcaccg gctccagatt
tatcagcaat aaaccagcca gccggaaggg ccgagcgcag 3420aagtggtcct
gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag
3480agtaagtagt tcgccagtta atagtttgcg caacgttgtt gccattgcta
caggcatcgt 3540ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc
ggttcccaac gatcaaggcg 3600agttacatga tcccccatgt tgtgcaaaaa
agcggttagc tccttcggtc ctccgatcgt 3660tgtcagaagt aagttggccg
cagtgttatc actcatggtt atggcagcac tgcataattc 3720tcttactgtc
atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc
3780attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa
tacgggataa 3840taccgcgcca catagcagaa ctttaaaagt gctcatcatt
ggaaaacgtt cttcggggcg 3900aaaactctca aggatcttac cgctgttgag
atccagttcg atgtaaccca ctcgtgcacc 3960caactgatct tcagcatctt
ttactttcac cagcgtttct gggtgagcaa aaacaggaag 4020gcaaaatgcc
gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt
4080cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg
gatacatatt 4140tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc
acatttcccc gaaaagtgcc 4200acctgacgtc taagaaacca ttattatcat
gacattaacc tataaaaata ggcgtatcac 4260gaggcccttt cgtctcgcgc
gtttcggtga tgacggtgaa aacctctgac acatgcagct 4320cccggagacg
gtcacagctt gtctgtaagc ggatgccggg agcagacaag cccgtcaggg
4380cgcgtcagcg ggtgttggcg ggtgtcgggg ctggcttaac tatgcggcat
cagagcagat 4440tgtactgaga gtgcaccata tgcggtgtga aataccgcac
agatgcgtaa ggagaaaata 4500ccgcatcagg cgccattcgc cattcaggct
gcgcaactgt tgggaagggc gatcggtgcg 4560
ggcctcttcg ctattacgcc agctggcgaa agggggatgt gctgcaaggc gattaagttg 4620
ggtaacgcca gggttttccc agtcacgacg ttgtaaaacg acggccagtg aattcgagct 4680
cggtacctgc agtgacgaca ggaagagttt gtagaaacgc aaaaaggcca tccgtcagga 4740
tggccttctg cttaatttga tgcctggcag tttatggcgg gcgtcctgcc cgccaccctc 4800
cgggccgttg cttcgcaacg ttcaaatccg ctcccggcgg atttgtccta ctcaggagag 4860
cgttcaccga caaacaacag ataaaacgaa aggcccagtc tttcgactga gcctttcgtt 4920
ttatttgatg cctggcagtt ccctactctc gcatggggag accccacact accatcggcg 4980
ctacgtctag attatttgta gagctcatcc atgccatgtg taatcccagc agcagttaca 5040
aactcaagaa ggaccatgtg gtcacgcttt tcgttgggat ctttcgaaag ggcagattgt 5100
gtcgacaggt aatggttgtc tggtaaaagg acagggccat cgccaattgg agtattttgt 5160
tgataatggt ctgctagttg aacggatcca tcttcaatgt tgtggcgaat tttgaagtta 5220
gctttgattc cattcttttg tttgtctgcc gtgatgtata cattgtgtga gttatagttg 5280
tactcgagtt tgtgtccgag aatgtttcca tcttctttaa aatcaatacc ttttaactcg 5340
atacgattaa caagggtatc accttcaaac ttgacttcag cacgcgtctt gtagttcccg 5400
tcatctttga aagatatagt gcgttcctgt acataacctt cgggcatggc actcttgaaa 5460
aagtcatgcc gtttcatatg atccggataa cgggaaaagc attgaacacc ataagagaaa 5520
gtagtgacaa gtgttggcca tggaacaggt agttttccag tagtgcaaat aaatttaagg 5580
gtaagctttc cgtatgtagc atcaccttca ccctctccac tgacagaaaa tttgtgccca 5640
ttaacatcac catctaattc aacaagaatt gggacaactc cagtgaaaag ttcttctcct 5700
ttgctagcag tgattttttt ctccatttgc ggagggatat gaaagcggcc gcttccacac 5760
attaaactag ttcgatgatt aattgtcaac agctcgccgg cggcacctcg ctaacggatt 5820
caccactcca agaattggag ccaatcgatt cttgcggaga actgtgaatg cgggtaccca 5880
gatccggaac ataatggtgc agggcgctga cttccgcgtt tccagacttt acgaaacacg 5940
gaaaccgaag accattcatg ttgttgctca ggtcgcagac gttttgcagc agcagtcgct 6000
tcacgttcgc tcgcgtatcg gtgattcatt ctgctaacca gtaaggcaac cccgccagcc 6060
tagccgggtc ctcaacgaca ggagcacgat catgcgcacc cgtggccagg acccaacgct 6120
gcccgagatg cgccgcgtgc ggctgctgga gatggcggac gcgatggata tgttctgcca 6180
agggttggtt tgcgcattca cagttctccg caagaatcga ttggctccaa ttcttggagt 6240
ggtgaatccg ttagcgaggt gccgccggcg agctgttgac aattaatcat cgaactagtt 6300
taatgtgtgg aagcggccgc tttcatatcc ctccgcaaat ggagaaaaaa atcactggat 6360
ataccaccgt tgatatatcc caatggcatc gtaaagaaca ttttgaggca tttcagtcag 6420
ttgctcaatg tacctataac cagaccgttc agctggatat tacggccttt ttaaagaccg 6480
taaagaaaaa taagcacaag ttttatccgg cctttattca cattcttgcc cgcctgatga 6540
atgctcatcc ggaattccgt atggcaatga aagacggtga gctggtgata tgggatagtg 6600
ttcacccttg ttacaccgtt ttccatgagc aaactgaaac gttttcatcg ctctggagtg 6660
aataccacga cgatttccgg cagtttctac acatatattc gcaagatgtg gcgtgttacg 6720
gtgaaaacct ggcctatttc cctaaagggt ttattgagaa tatgtttttc gtctcagcca 6780
atccctgggt gagtttcacc agttttgatt taaacgtggc caatatggac aacttcttcg 6840
cccccgtttt caccatgggc aaatattata cgcaaggcga caaggtgctg atgccgctgg 6900
cgattcaggt tcatcatgcc gtctgtgatg gcttccatgt cggcagaatg cttaatgaat 6960
tacaacagta ctgcgatgag tggcagggcg gggcgtaatt tttttaaggc agttattggt 7020
gcccttaaac gcctggtgct acgcctgaat aagtgataat aagcggatga atggcagaaa 7080
ttcgaaagca aattcgaccc ggtcgtcggt tcagggcagg gtcgttaaat agccgcttat 7140
gtctattgct ggtttacggt ttattgacta cccgaagcag tgtgaccctg tgcttctcaa 7200
atgcctgagg gcagtttgct caggtctccc gtggggggga ataattaacg gtatgagcct 7260
tacggcggac ggatcgtggc cgcaagtggg tccggctaga ggatccgaca ccatcgaatg 7320
gtgcaaaacc tttcgcggta tggcatgata gcgcccggaa gagagtcaat tcagggtggt 7380
gaatgtgaaa ccagtaacgt tatacgatgt cgcagagtat gccggtgtct cttatcagac 7440
cgtttcccgc gtggtgaacc aggccagcca cgtttctgcg aaaacgcggg aaaaagtgga 7500
agcggcgatg gcggagctga attacattcc caaccgcgtg gcacaacaac tggcgggcaa 7560
acagtcgttg ctgattggcg ttgccacctc cagtctggcc ctgcacgcgc cgtcgcaaat 7620
tgtcgcggcg attaaatctc gcgccgatca actgggtgcc agcgtggtgg tgtcgatggt 7680
agaacgaagc ggcgtcgaag cctgtaaagc ggcggtgcac aatcttctcg cgcaacgggt 7740
cagtgggctg attattaact atccgctgga tgaccaggat gccattgctg tggaagctgc 7800
ctgcactaat gttccggcgt tatttcttga tgtctctgac cagacaccca tcaacagtat 7860
tattttctcc catgaagacg gtacgcgact gggcgtggag catctggtcg cattgggcca 7920
ccagcaaatc gcgctgttag cgggcccatt aagttctgtc tcggcgcgtc tgcgtctggc 7980
tggctggcat aaatatctca ctcgcaatca aattcagccg atagcggaac gggaaggcga 8040
ctggagtgcc atgtccggtt ttcaacaaac catgcaaatg ctgaatgagg gcatcgttcc 8100
cactgcgatg ctggttgcca acgatcagat ggcgctgggc gcaatgcgcg ccattaccga 8160
gtccgggctg cgcgttggtg cggatatctc ggtagtggga tacgacgata ccgaagacag 8220
ctcatgttat atcccgccgt caaccaccat caaacaggat tttcgcctgc tggggcaaac 8280
cagcgtggac cgcttgctgc aactctctca gggccaggcg gtgaagggca atcagctgtt 8340
gcccgtctca ctggtgaaaa gaaaaaccac cctggcgccc aatacgcaaa ccgcctctcc 8400
ccgcgcgttg gccgattcat taatgcagct ggcacgacag gtttcccgac tggaaagcgg 8460
gcagtgagcg caacgcaatt aatgtgagtt agctcactca ttaggcaccc caggctttac 8520
actttatgct tccggctcgt ataatgtgtg gaattgtgag cggataacaa tttcacacag 8580
cggccgctga gaaaaagcga agcggcactg ctctttaaca atttatcaga caatctgtgt 8640
gggcactcga agatacggat tcttaacgtc gcaagacgaa aaatgaatac caagtctcaa 8700
gagtgaacac gtaattcatt acgaagttta attctttgag cgtcaaactt tt 8752
SEQ ID NO 6 (20 nt, DNA, Artificial Sequence, primer): ataggggttc cgcgcacatt
SEQ ID NO 7 (48 nt, DNA, Artificial Sequence, primer): ctcgagcctc ctgaaagcgg ccgcaactca aaaaatacgc ccggtagt
SEQ ID NO 8 (20 nt, DNA, Artificial Sequence, primer): aaatcgtcgt ggtattcact
SEQ ID NO 9 (44 nt, DNA, Artificial Sequence, primer): gcggccgctt tcaggaggct cgagaaatgg agaaaaaaat cact
SEQ ID NO 10 (59 nt, DNA, Artificial Sequence, primer): ggccgctagc cggcgagctg ttgacaatta atcatcgaac tagtttaatg tgtggaagc
SEQ ID NO 11 (59 nt, DNA, Artificial Sequence, primer): ggccgcttcc acacattaaa ctagttcgat gattaattgt caacagctcg ccggctagc
SEQ ID NO 12 (17 nt, DNA, Artificial Sequence, primer): tcgagcacac tgaaagc
SEQ ID NO 13 (17 nt, DNA, Artificial Sequence, primer): ggccgctttc agtgtgc
SEQ ID NO 14 (68 nt, DNA, Artificial Sequence, primer): ggtcataggc ggccgctgtg tgaaattgtt atccgctcac aattccacac attatacgag ccggaagc
SEQ ID NO 15 (34 nt, DNA, Artificial Sequence, primer): ttggatccga caccatcgaa tggtgcaaaa cctt
SEQ ID NO 16 (29 nt, DNA, Artificial Sequence, primer): gaagggatcc ggcgaagatg tttctctgg
SEQ ID NO 17 (27 nt, DNA, Artificial Sequence, primer): gcggccgctt aaaataattt tctgacc
SEQ ID NO 18 (31 nt, DNA, Artificial Sequence, primer): ccacaagctt cgcacctgag cgtcagtctt c
SEQ ID NO 19 (37 nt, DNA, Artificial Sequence, primer): aaaattattt taagcggccg ctgagaaaaa gcgaagc
SEQ ID NO 20 (20 nt, DNA, Artificial Sequence, primer): ggcgactttc actcacaaac
SEQ ID NO 21 (65 nt, DNA, Artificial Sequence, primer): gtcgaagctt ggtaaccgta ggggaacctg cggttggatc acacacttac cttaaagaag cgtac
SEQ ID NO 22 (54 nt, DNA, Artificial Sequence, primer): ttaatgtgtg gaagcggccg ctttcatatc cctnnnnaaa tggagaaaaa aatc
SEQ ID NO 23 (19 nt, DNA, Artificial Sequence, primer): cagcaccttg tcgccttgc
SEQ ID NO 24 (11 nt, DNA, Artificial Sequence, primer): caggaggcuc g
SEQ ID NO 25 (11 nt, RNA, Artificial Sequence, primer): ucaccuccuu a
SEQ ID NO 26 (11 nt, RNA, Artificial Sequence, primer): cagugugcuc g
SEQ ID NO 27 (11 nt, RNA, Artificial Sequence, primer): ucacacacuu a
SEQ ID NO 28 (11 nt, RNA, Artificial Sequence, primer): cauaucccuc g
SEQ ID NO 29 (11 nt, RNA, Artificial Sequence, primer): ucagggauuu a
SEQ ID NO 30 (11 nt, RNA, Artificial Sequence, primer): caaacaccuc g
SEQ ID NO 31 (11 nt, RNA, Artificial Sequence, primer): ucaagagguu a
SEQ ID NO 32 (11 nt, RNA, Artificial Sequence, primer): cauaccucuc g
SEQ ID NO 33 (11 nt, RNA, Artificial Sequence, primer): ucaugagguu a
SEQ ID NO 34 (11 nt, RNA, Artificial Sequence, primer): cauaauccuc g
SEQ ID NO 35 (11 nt, RNA, Artificial Sequence, primer): ucagaggauu a
SEQ ID NO 36 (11 nt, RNA, Artificial Sequence, primer): caaauaccuc g
SEQ ID NO 37 (11 nt, RNA, Artificial Sequence, primer): ucaugagguu a
SEQ ID NO 38 (11 nt, RNA, Artificial Sequence, primer): cacauaccuc g
SEQ ID NO 39 (11 nt, RNA, Artificial Sequence, primer): ucaugagguu a
SEQ ID NO 40 (11 nt, RNA, Artificial Sequence, primer): caccgaccuc g
SEQ ID NO 41 (11 nt, RNA, Artificial Sequence, primer): ucaagagguu a
SEQ ID NO 42 (11 nt, RNA, Artificial Sequence, primer): cauaucccuc g
SEQ ID NO 43 (11 nt, RNA, Artificial Sequence, primer): ucaugggauu a
SEQ ID NO 44 (11 nt, RNA, Artificial Sequence, primer): caacuaccuc g
SEQ ID NO 45 (11 nt, RNA, Artificial Sequence, primer): ucaugagguu a
SEQ ID NO 46 (11 nt, RNA, Artificial Sequence, primer): cauauaccuc g
SEQ ID NO 47 (11 nt, RNA, Artificial Sequence, primer): ucaagagguu a
SEQ ID NO 48 (18 nt, RNA, Artificial Sequence, primer): cauaucccuc gagaaaug
SEQ ID NO 49 (14 nt, RNA, Artificial Sequence, primer): ggaucauggg auua
SEQ ID NO 50 (18 nt, RNA, Artificial Sequence, primer): cauaucccuc gagaaaug
SEQ ID NO 51 (14 nt, RNA, Artificial Sequence, primer): ggaucaccuc cuua
SEQ ID NO 52 (18 nt, RNA, Artificial Sequence, primer): cauaucccuc cgcaaaug
SEQ ID NO 53 (14 nt, RNA, Artificial Sequence, primer): ggaucauggg auua
SEQ ID NO 54 (18 nt, RNA, Artificial Sequence, primer): cauaucccuc cgcaaaug
SEQ ID NO 55 (14 nt, RNA, Artificial Sequence, primer): ggaucaccuc cuua
SEQ ID NO 56 (18 nt, RNA, Artificial Sequence, primer): cauaucccuc cugaaaug
SEQ ID NO 57 (14 nt, RNA, Artificial Sequence, primer): ggaucauggg auua
SEQ ID NO 58 (17 nt, RNA, Artificial Sequence, primer): cauaucccuc ccaaaug
SEQ ID NO 59 (14 nt, RNA, Artificial Sequence, primer): ggaucauggg auua
SEQ ID NO 60 (18 nt, RNA, Artificial Sequence, primer): cauaucccuc cacaaaug
SEQ ID NO 61 (14 nt, RNA, Artificial Sequence, primer): ggaucauggg auua
SEQ ID NO 62 (11 nt, RNA, Artificial Sequence, primer): caggaggcuc g
SEQ ID NO 63 (11 nt, RNA, Artificial Sequence, primer): ucaccuccuu a
SEQ ID NO 64 (11 nt, RNA, Artificial Sequence, primer): caauccccuc g
SEQ ID NO 65 (11 nt, RNA, Artificial Sequence, primer): ucaagggauu a
SEQ ID NO 66 (11 nt, RNA, Artificial Sequence, primer): cauaccucuc g
SEQ ID NO 67 (11 nt, RNA, Artificial Sequence, primer): ucaauggguu a
SEQ ID NO 68 (11 nt, RNA, Artificial Sequence, primer): cacaguccuc g
SEQ ID NO 69 (11 nt, RNA, Artificial Sequence, primer): ucagacgauu a
SEQ ID NO 70 (11 nt, RNA, Artificial Sequence, primer): caaaccacuc g
SEQ ID NO 71 (11 nt, RNA, Artificial Sequence, primer): ucagugauuu a
SEQ ID NO 72 (11 nt, RNA, Artificial Sequence, primer): cauagcccuc g
SEQ ID NO 73 (11 nt, RNA, Artificial Sequence, primer): ucauuggguu a
SEQ ID NO 74 (11 nt, RNA, Artificial Sequence, primer): caucuuccuc g
SEQ ID NO 75 (11 nt, RNA, Artificial Sequence, primer): ucaggagguu a
SEQ ID NO 76 (11 nt, RNA, Artificial Sequence, primer): caauuaucuc g
SEQ ID NO 77 (11 nt, RNA, Artificial Sequence, primer): ucagaauuuu a
SEQ ID NO 78 (11 nt, RNA, Artificial Sequence, primer): cacagaacuc g
SEQ ID NO 79 (11 nt, RNA, Artificial Sequence, primer): ucaaucaguu a
SEQ ID NO 80 (11 nt, RNA, Artificial Sequence, primer): caaaguucuc g
SEQ ID NO 81 (11 nt, RNA, Artificial Sequence, primer): ucaaugaguu a
SEQ ID NO 82 (11 nt, RNA, Artificial Sequence, primer): caauucacuc g
SEQ ID NO 83 (11 nt, RNA, Artificial Sequence, primer): ucagugaauu a
SEQ ID NO 84 (11 nt, RNA, Artificial Sequence, primer): caacucacuc g
SEQ ID NO 85 (11 nt, RNA, Artificial Sequence, primer): ucagaguguu a
SEQ ID NO 86 (11 nt, RNA, Artificial Sequence, primer): caacccacuc g
SEQ ID NO 87 (11 nt, RNA, Artificial Sequence, primer): ucaugggauu a
SEQ ID NO 88 (11 nt, RNA, Artificial Sequence, primer): caucguucuc g
SEQ ID NO 89 (11 nt, RNA, Artificial Sequence, primer): ucaaagaguu a
SEQ ID NO 90 (11 nt, RNA, Artificial Sequence, primer): cacaccacuc g
SEQ ID NO 91 (11 nt, RNA, Artificial Sequence, primer): ucaugguuuu a
SEQ ID NO 92 (11 nt, RNA, Artificial Sequence, primer): cacccaccuc g
SEQ ID NO 93 (11 nt, RNA, Artificial Sequence, primer): ucaaaggguu a
SEQ ID NO 94 (11 nt, RNA, Artificial Sequence, primer): caucccacuc g
SEQ ID NO 95 (11 nt, RNA, Artificial Sequence, primer): ucaagggguu a
SEQ ID NO 96 (11 nt, RNA, Artificial Sequence, primer): caaacuccuc g
SEQ ID NO 97 (11 nt, RNA, Artificial Sequence, primer): ucauacuauu a
SEQ ID NO 98 (11 nt, RNA, Artificial Sequence, primer): cauacaucuc g
SEQ ID NO 99 (11 nt, RNA, Artificial Sequence, primer): ucaagaguuu a
SEQ ID NO 100 (11 nt, RNA, Artificial Sequence, primer): caacucucuc g
SEQ ID NO 101 (11 nt, RNA, Artificial Sequence, primer): ucaggagauu a
SEQ ID NO 102 (11 nt, RNA, Artificial Sequence, primer): caaauaucuc g
SEQ ID NO 103 (11 nt, RNA, Artificial Sequence, primer): ucagagauuu a
SEQ ID NO 104 (11 nt, RNA, Artificial Sequence, primer): cauaccucuc g
SEQ ID NO 105 (11 nt, RNA, Artificial Sequence, primer): ucaugagguu a
SEQ ID NO 106 (11 nt, RNA, Artificial Sequence, primer): cauaguacuc g
SEQ ID NO 107 (11 nt, RNA, Artificial Sequence, primer): ucauggauuu a
SEQ ID NO 108 (11 nt, RNA, Artificial Sequence, primer): caauccacuc g
SEQ ID NO 109 (11 nt, RNA, Artificial Sequence, primer): ucaguggauu a
SEQ ID NO 110 (11 nt, RNA, Artificial Sequence, primer): cacagaucuc g
SEQ ID NO 111 (11 nt, RNA, Artificial Sequence, primer): ucaggcuuuu a
SEQ ID NO 112 (11 nt, RNA, Artificial Sequence, primer): cauagcacuc g
SEQ ID NO 113 (11 nt, RNA, Artificial Sequence, primer): ucaugcuauu a
SEQ ID NO 114 (11 nt, RNA, Artificial Sequence, primer): caacuaacuc g
SEQ ID NO 115 (11 nt, RNA, Artificial Sequence, primer): ucauaguguu a
SEQ ID NO 116 (11 nt, RNA, Artificial Sequence, primer): caaauaucuc g
SEQ ID NO 117 (11 nt, RNA, Artificial Sequence, primer): ucaagguauu a
SEQ ID NO 118 (11 nt, RNA, Artificial Sequence, primer): caaauaucuc g
SEQ ID NO 119 (11 nt, RNA, Artificial Sequence, primer): ucaggagauu a
SEQ ID NO 120 (11 nt, RNA, Artificial Sequence, primer): cacuccucuc g
SEQ ID NO 121 (11 nt, RNA, Artificial Sequence, primer): ucagaggauu a
SEQ ID NO 122 (11 nt, RNA, Artificial Sequence, primer): cauauuccuc g
SEQ ID NO 123 (11 nt, RNA, Artificial Sequence, primer): ucauggaauu a
SEQ ID NO 124 (11 nt, RNA, Artificial Sequence, primer): caaccuacuc g
SEQ ID NO 125 (11 nt, RNA, Artificial Sequence, primer): ucaggagauu a
SEQ ID NO 126 (11 nt, RNA, Artificial Sequence, primer): caauccacuc g
SEQ ID NO 127 (11 nt, RNA, Artificial Sequence, primer): ucaggagauu a
SEQ ID NO 128 (11 nt, RNA, Artificial Sequence, primer): caacccccuc g
SEQ ID NO 129 (11 nt, RNA, Artificial Sequence, primer): ucagaggguu a
SEQ ID NO 130 (11 nt, RNA, Artificial Sequence, primer): caaacaucuc g
SEQ ID NO 131 (11 nt, RNA, Artificial Sequence, primer): ucaagauguu a
SEQ ID NO 132 (11 nt, RNA, Artificial Sequence, primer): caucccacuc g
SEQ ID NO 133 (11 nt, RNA, Artificial Sequence, primer): ucaggguauu a
SEQ ID NO 134 (11 nt, RNA, Artificial Sequence, primer): cacugaucuc g
SEQ ID NO 135 (11 nt, RNA, Artificial Sequence, primer): ucagaggauu a
SEQ ID NO 136 (11 nt, RNA, Artificial Sequence, primer): cauaucccuc g
SEQ ID NO 137 (11 nt, RNA, Artificial Sequence, primer): ucagggauuu a
SEQ ID NO 138 (11 nt, RNA, Artificial Sequence, primer): caaacaccuc g
SEQ ID NO 139 (11 nt, RNA, Artificial Sequence, primer): ucaagagguu a
SEQ ID NO 140 (11 nt, RNA, Artificial Sequence, primer): caacgaacuc g
SEQ ID NO 141 (11 nt, RNA, Artificial Sequence, primer): ucagaguguu a
SEQ ID NO 142 (11 nt, RNA, Artificial Sequence, primer): caucuaucuc g
SEQ ID NO 143 (11 nt, RNA, Artificial Sequence, primer): ucaggagauu a
SEQ ID NO 144 (11 nt, RNA, Artificial Sequence, primer): cauaccucuc g
SEQ ID NO 145 (11 nt, RNA, Artificial Sequence, primer): ucaugagguu a
SEQ ID NO 146 (11 nt, RNA, Artificial Sequence, primer): cauauaacuc g
SEQ ID NO 147 (11 nt, RNA, Artificial Sequence, primer): ucaagagauu a
SEQ ID NO 148 (11 nt, RNA, Artificial Sequence, primer): caaauaccuc g
SEQ ID NO 149 (11 nt, RNA, Artificial Sequence, primer): ucaugagguu a
SEQ ID NO 150 (11 nt, RNA, Artificial Sequence, primer): cacauaccuc g
SEQ ID NO 151 (11 nt, RNA, Artificial Sequence, primer): ucaugagguu a
SEQ ID NO 152 (11 nt, RNA, Artificial Sequence, primer): caccgaccuc g
SEQ ID NO 153 (11 nt, RNA, Artificial Sequence, primer): ucaagagguu a
SEQ ID NO 154 (11 nt, RNA, Artificial Sequence, primer): cauaucccuc g
SEQ ID NO 155 (11 nt, RNA, Artificial Sequence, primer): ucaugggguu a
SEQ ID NO 156 (11 nt, RNA, Artificial Sequence, primer): caacuaccuc g
SEQ ID NO 157 (11 nt, RNA, Artificial Sequence, primer): ucaugagguu a
SEQ ID NO 158 (11 nt, RNA, Artificial Sequence, primer): cauauaccuc g
SEQ ID NO 159 (11 nt, RNA, Artificial Sequence, primer): ucaagagguu a
SEQ ID NO 160 (9 nt, RNA, Artificial Sequence, primer): auuagauac
SEQ ID NO 161 (9 nt, RNA, Artificial Sequence, primer): auuagguaa
SEQ ID NO 162 (9 nt, RNA, Artificial Sequence, primer): auucgacau
SEQ ID NO 163 (9 nt, RNA, Artificial Sequence, primer): aauagguac
SEQ ID NO 164 (9 nt, RNA, Artificial Sequence, primer): aauagucuc
SEQ ID NO 165 (9 nt, RNA, Artificial Sequence, primer): auuagcuac
SEQ ID NO 166 (9 nt, RNA, Artificial Sequence, primer): auucgacac
SEQ ID NO 167 (9 nt, RNA, Artificial Sequence, primer): acuagcaca
SEQ ID NO 168 (9 nt, RNA, Artificial Sequence, primer): acuagcuuc
SEQ ID NO 169 (9 nt, RNA, Artificial Sequence, primer): aauagauac
SEQ ID NO 170 (9 nt, RNA, Artificial Sequence, primer): aauaguauc
SEQ ID NO 171 (9 nt, RNA, Artificial Sequence, primer): aaucgccuc
SEQ ID NO 172 (9 nt, RNA, Artificial Sequence, primer): gauagguau
SEQ ID NO 173 (9 nt, RNA, Artificial Sequence, primer): auuaggcac
SEQ ID NO 174 (9 nt, RNA, Artificial Sequence, primer): aauagguuc
SEQ ID NO 175 (9 nt, RNA, Artificial Sequence, primer): aauagucaa
SEQ ID NO 176 (9 nt, RNA, Artificial Sequence, primer): aaucgucuc
SEQ ID NO 177 (9 nt, RNA, Artificial Sequence, primer): auuagaaaa
SEQ ID NO 178 (9 nt, RNA, Artificial Sequence, primer): auuagcgac
SEQ ID NO 179 (9 nt, RNA, Artificial Sequence, primer): auuaggagc
SEQ ID NO 180 (9 nt, RNA, Artificial Sequence, primer): auuaggcaa
SEQ ID NO 181 (9 nt, RNA, Artificial Sequence, primer): aguagccuc
SEQ ID NO 182 (9 nt, RNA, Artificial Sequence, primer): aguagcuuc
SEQ ID NO 183 (9 nt, RNA, Artificial Sequence, primer): aguaggauc
SEQ ID NO 184 (9 nt, RNA, Artificial Sequence, primer): aguagguuc
SEQ ID NO 185 (9 nt, RNA, Artificial Sequence, primer): aguagucuc
SEQ ID NO 186 (9 nt, RNA, Artificial Sequence, primer): acuagauau
SEQ ID NO 187 (9 nt, RNA, Artificial Sequence, primer): acuagaucc
SEQ ID NO 188 (9 nt, RNA, Artificial Sequence, primer): acuagcaac
SEQ ID NO 189 (9 nt, RNA, Artificial Sequence, primer): acuagcauc
SEQ ID NO 190 (9 nt, RNA, Artificial Sequence, primer): acuagcuaa
SEQ ID NO 191 (9 nt, RNA, Artificial Sequence, primer): acuaggcuc
SEQ ID NO 192 (9 nt, RNA, Artificial Sequence, primer): acuaguaac
SEQ ID NO 193 (9 nt, RNA, Artificial Sequence, primer): acuaguauc
SEQ ID NO 194 (9 nt, RNA, Artificial Sequence, primer): acuaguuuc
SEQ ID NO 195 (9 nt, RNA, Artificial Sequence, primer): aauagauuc
SEQ ID NO 196 (9 nt, RNA, Artificial Sequence, primer): aauagcagc
SEQ ID NO 197 (9 nt, RNA, Artificial Sequence, primer): aauagccaa
SEQ ID NO 198 (9 nt, RNA, Artificial Sequence, primer): aauagccac
SEQ ID NO 199 (9 nt, RNA, Artificial Sequence, primer): aauagccua
SEQ ID NO 200 (9 nt, RNA, Artificial Sequence, primer): aauagcuaa
SEQ ID NO 201 (9 nt, RNA, Artificial Sequence, primer): guuaguuau
SEQ ID NO 202 (9 nt, RNA, Artificial Sequence, primer): gguaguagu
SEQ ID NO 203 (9 nt, RNA, Artificial Sequence, primer): gguagucag
SEQ ID NO 204 (9 nt, RNA, Artificial Sequence, primer): gauaguagu
SEQ ID NO 205 (9 nt, RNA, Artificial Sequence, primer): aauagaaac
SEQ ID NO 206 (9 nt, RNA, Artificial Sequence, primer): guuagauag
SEQ ID NO 207 (9 nt, RNA, Artificial Sequence, primer): gguagcuuu
SEQ ID NO 208 (9 nt, RNA, Artificial Sequence, primer): gguaguuug
SEQ ID NO 209 (9 nt, RNA, Artificial Sequence, primer): auucggaaa
SEQ ID NO 210 (9 nt, RNA, Artificial Sequence, primer): auuggagac
SEQ ID NO 211 (9 nt, RNA, Artificial Sequence, primer): acuagacgc
SEQ ID NO 212 (9 nt, RNA, Artificial Sequence, primer): acuagccaa
SEQ ID NO 213 (9 nt, RNA, Artificial Sequence, primer): acuaggcua
SEQ ID NO 214 (9 nt, RNA, Artificial Sequence, primer): aauagcaca
SEQ ID NO 215 (9 nt, RNA, Artificial Sequence, primer): aauagucau
SEQ ID NO 216 (9 nt, RNA, Artificial Sequence, primer): aauagucca
SEQ ID NO 217 (9 nt, RNA, Artificial Sequence, primer): cuuaguuaa
SEQ ID NO 218 (9 nt, RNA, Artificial Sequence, primer): guuagagau
SEQ ID NO 219 (9 nt, RNA, Artificial Sequence, primer): guuagucau
SEQ ID NO 220 (9 nt, RNA, Artificial Sequence, primer): gguagccuu
SEQ ID NO 221 (9 nt, RNA, Artificial Sequence, primer): gguaggaau
SEQ ID NO 222 (9 nt, RNA, Artificial Sequence, primer): gguagguag
SEQ ID NO 223 (9 nt, RNA, Artificial Sequence, primer): gguagguuu
SEQ ID NO 224 (9 nt, RNA, Artificial Sequence, primer): gguaguuuu
SEQ ID NO 225 (9 nt, RNA, Artificial Sequence, primer): gauagccuu
SEQ ID NO 226 (9 nt, RNA, Artificial Sequence, primer): gauaguccu
SEQ ID NO 227 (9 nt, RNA, Artificial Sequence, primer): auuagauga
SEQ ID NO 228 (9 nt, RNA, Artificial Sequence, primer): aguagcuuu
SEQ ID NO 229 (9 nt, RNA, Artificial Sequence, primer): aguaguuag
SEQ ID NO 230 (9 nt, RNA, Artificial Sequence, primer): agucgccuc
SEQ ID NO 231 (9 nt, RNA, Artificial Sequence, primer): acuagaguc
SEQ ID NO 232 (9 nt, RNA, Artificial Sequence, primer): aaucgcagc
SEQ ID NO 233 (9 nt, RNA, Artificial Sequence, primer): cauaguuuu
SEQ ID NO 234 (9 nt, RNA, Artificial Sequence, primer): gguagaugu
SEQ ID NO 235 (9 nt, RNA, Artificial Sequence, primer): gguagucgu
SEQ ID NO 236 (9 nt, RNA, Artificial Sequence, primer): ggucgcuau
SEQ ID NO 237 (9 nt, RNA, Artificial Sequence, primer): gcuaguaag
SEQ ID NO 238 (9 nt, RNA, Artificial Sequence, primer): gguagguug
SEQ ID NO 239 (20 nt, DNA, Artificial Sequence, primer): gacaatctgt gtgagcacta
SEQ ID NO 240 (36 nt, DNA, Artificial Sequence, primer): tgccagcagc cgcggtaata cggagggtgc aagcgt
SEQ ID NO 241 (33 nt, DNA, Artificial Sequence, primer): cctgtttgct ccccacgctt tcgcacctga gcg
SEQ ID NO 242 (60 nt, DNA, Artificial Sequence, primer): ctcaggtgcg aaagcgtggg gagcaaacag gnnnnnnnnn cctggtagtc cacgccgtaa
SEQ ID NO 243 (60 nt, DNA, Artificial Sequence, primer): ctcaggtgcg aaagcgtggg gagcaaacag gnttagatan cctggtagtc cacgccgtaa
SEQ ID NO 244 (17 nt, DNA, Artificial Sequence, primer): ggactaccag ggtatct
SEQ ID NO 245 (17 nt, DNA, Artificial Sequence, primer): tacggcgtgg actacca
* * * * *