U.S. patent application number 13/642107 was filed with the patent office on 2013-02-14 for process for the production of cells which are capable of converting arabinose.
This patent application is currently assigned to DSM IP ASSETS. The applicant listed for this patent is Bianca Elisabeth Maria Gielesen, Wilbert Herman Marie Heijne, Paul Klaassen, Gijsberdina Pieternella Van Suylekom. Invention is credited to Bianca Elisabeth Maria Gielesen, Wilbert Herman Marie Heijne, Paul Klaassen, Gijsberdina Pieternella Van Suylekom.
Application Number | 20130040297 13/642107 |
Document ID | / |
Family ID | 44833745 |
Filed Date | 2013-02-14 |
United States Patent Application | 20130040297 |
Kind Code | A1 |
Klaassen; Paul ; et al. | February 14, 2013 |
PROCESS FOR THE PRODUCTION OF CELLS WHICH ARE CAPABLE OF CONVERTING
ARABINOSE
Abstract
The invention relates to a process for the production of cells
which are capable of converting arabinose, comprising the following
steps: a) Introducing into a host strain that cannot convert
arabinose, the genes araA, araB and araD, this cell is designated
as constructed cell; b) Subjecting the constructed cell to adaptive
evolution until a cell that converts arabinose is obtained, c)
Optionally, subjecting the first arabinose converting cell to
adaptive evolution to improve the arabinose conversion; the cell
produced in step b) or c) is designated as first arabinose
converting cell; d) Analysing the full genome or part of the genome
of the first arabinose converting cell and that of the constructed
cell; e) Identifying single nucleotide polymorphisms (SNP's) in the
first arabinose converting cell; and f) Using the information of
the SNP's in rational design of a cell capable of converting
arabinose; g) Construction of the cell capable of converting
arabinose designed in step f).
Inventors: |
Klaassen; Paul; (Dordrecht,
NL) ; Gielesen; Bianca Elisabeth Maria; (Maassluis,
NL) ; Heijne; Wilbert Herman Marie; (Dordrecht,
NL) ; Van Suylekom; Gijsberdina Pieternella;
(Gravenmoer, NL) |
Applicant: |
Name | City | State | Country | Type |
Klaassen; Paul | Dordrecht | | NL | |
Gielesen; Bianca Elisabeth Maria | Maassluis | | NL | |
Heijne; Wilbert Herman Marie | Dordrecht | | NL | |
Van Suylekom; Gijsberdina Pieternella | Gravenmoer | | NL | |
Assignee: |
DSM IP ASSETS | Heerlen | NL |
Family ID: |
44833745 |
Appl. No.: |
13/642107 |
Filed: |
April 19, 2011 |
PCT Filed: |
April 19, 2011 |
PCT NO: |
PCT/EP2011/056242 |
371 Date: |
October 18, 2012 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
61326351 | Apr 21, 2010 | |
61326358 | Apr 21, 2010 | |
Current U.S.
Class: |
435/6.11 ;
435/108; 435/109; 435/113; 435/115; 435/136; 435/139; 435/140;
435/142; 435/144; 435/145; 435/146; 435/157; 435/158; 435/159;
435/160; 435/162; 435/167; 435/168; 435/189; 435/193; 435/198;
435/200; 435/201; 435/209; 435/219; 435/232; 435/254.2; 435/43;
435/47; 530/350 |
Current CPC
Class: |
C12N 1/36 20130101; C12N
9/92 20130101; C12Y 503/01004 20130101; C07K 14/395 20130101; C12N
9/0006 20130101; C12N 9/1205 20130101; C12N 9/16 20130101; C12N
9/2437 20130101; C12P 7/48 20130101; C12P 7/16 20130101; C12P 7/10
20130101; C12P 7/56 20130101; C12N 9/2411 20130101; C12P 7/20
20130101; C12P 13/20 20130101; C12P 7/42 20130101; C12N 9/248
20130101; C12P 13/12 20130101; C12P 13/227 20130101; C12N 9/2468
20130101; C12N 9/2405 20130101; C12N 9/60 20130101; C12P 7/44
20130101; C12Y 207/01016 20130101; C12P 7/54 20130101; C12P 35/00
20130101; Y02E 50/10 20130101; C12P 5/026 20130101; C12Y 501/03004
20130101; Y02E 50/30 20130101; C12P 7/46 20130101; C12P 7/40
20130101; C12N 9/10 20130101; C12N 9/0004 20130101; C12Y 101/01
20130101; C12N 9/88 20130101; C12N 9/90 20130101; C12P 7/18
20130101; C12P 13/08 20130101; C12P 17/10 20130101; C12P 13/005
20130101 |
Class at
Publication: |
435/6.11 ;
435/254.2; 435/162; 435/160; 435/157; 435/139; 435/146; 435/136;
435/140; 435/145; 435/142; 435/144; 435/113; 435/108; 435/115;
435/109; 435/158; 435/167; 435/159; 435/47; 435/168; 435/219;
435/209; 435/201; 435/200; 435/198; 435/232; 435/189; 435/193;
435/43; 530/350 |
International
Class: |
C12N 1/19 20060101
C12N001/19; C12P 7/14 20060101 C12P007/14; C12P 7/16 20060101
C12P007/16; C12P 7/04 20060101 C12P007/04; C12P 7/56 20060101
C12P007/56; C12P 7/42 20060101 C12P007/42; C12P 7/40 20060101
C12P007/40; C12P 7/54 20060101 C12P007/54; C12P 7/46 20060101
C12P007/46; C12P 7/44 20060101 C12P007/44; C12P 7/48 20060101
C12P007/48; C12P 13/12 20060101 C12P013/12; C12P 13/22 20060101
C12P013/22; C12P 13/08 20060101 C12P013/08; C12P 13/20 20060101
C12P013/20; C12P 7/18 20060101 C12P007/18; C12P 5/02 20060101
C12P005/02; C12P 7/20 20060101 C12P007/20; C12P 35/00 20060101
C12P035/00; C12P 3/00 20060101 C12P003/00; C12N 9/50 20060101
C12N009/50; C12N 9/42 20060101 C12N009/42; C12N 9/26 20060101
C12N009/26; C12N 9/24 20060101 C12N009/24; C12N 9/20 20060101
C12N009/20; C12N 9/88 20060101 C12N009/88; C12N 9/02 20060101
C12N009/02; C12N 9/10 20060101 C12N009/10; C12P 37/00 20060101
C12P037/00; C07K 14/00 20060101 C07K014/00; C12Q 1/68 20060101
C12Q001/68 |
Foreign Application Data
Date | Code | Application Number |
Apr 21, 2010 | EP | 10160622.6 |
Apr 21, 2010 | EP | 10160647.3 |
Claims
1. A process for producing cells which are capable of converting
arabinose, comprising: a) Introducing into a host strain that
cannot convert arabinose, genes araA, araB and araD, to form a
constructed cell; b) Subjecting the constructed cell to adaptive
evolution until a first arabinose converting cell that converts
arabinose is obtained, c) Optionally, subjecting the first
arabinose converting cell to adaptive evolution to improve the
arabinose conversion; said cell produced in step b) or c) is
designated as first arabinose converting cell; d) Analysing a full
genome or part of a genome of said first arabinose converting cell
and that of said constructed cell; e) Identifying single nucleotide
polymorphisms (SNP's) in said first arabinose converting cell; and
f) Using information of said SNP's in rational design of a cell
capable of converting arabinose; g) Constructing said cell capable
of converting arabinose designed in f).
2. The process according to claim 1, wherein in e), f) and/or g) at
least one technique of phenotyping is used in combination with at
least one technique of genotyping.
3. The process according to claim 1, wherein, in said process, a
yeast cell capable of converting arabinose has a chromosome that is
amplified compared to the host strain, wherein the amplified
chromosome has the same number as the chromosome in which the araA,
araB and araD genes were introduced in the host strain.
4. The process according to claim 3, wherein said amplified
chromosome is chromosome VII.
5. A yeast cell having araA, araB and araD genes wherein chromosome
VII has a size of from 1300 to 1600 Kb as determined by
electrophoresis, with the exclusion of a yeast cell BIE201.
6. The yeast cell according to claim 5, wherein a copy number of
the araA, araB and araD genes is from three to five each.
7. The yeast cell according to claim 6, comprising at least one
single nucleotide polymorphism selected from the group consisting
of mutations G1363T in the SSY1 gene, A512T in YJR154w gene, A1186G
in CEP3 gene, and A436C in GAL80 gene.
8. The yeast cell according to claim 7, comprising a single
polymorphism A436C in GAL80 gene.
9. The yeast cell according to claim 8, comprising a single
nucleotide polymorphism A1186G in CEP3 gene.
10. A polypeptide belonging to the group consisting of the
polypeptides: a. A polypeptide comprising the sequence encoded by
polynucleotide SEQ ID NO: 14 having a substitution E455stop in SSY1
and variant polypeptides thereof wherein at least one of other
positions have mutation of an aminoacid with another aminoacid that
is an existing aminoacid in AA trans superfamily; b. A polypeptide
comprising the sequence encoded by the polynucleotide SEQ ID NO: 16
having a substitution D171G in YJR154w and variant polypeptides
thereof wherein at least one of other positions have mutation of an
aminoacid with another aminoacid that is an existing conserved
aminoacid in PhyH superfamily; c. A polypeptide comprising the
sequence encoded by the polynucleotide SEQ ID NO: 18 comprising a
substitution S396G in CEP3; d. A polypeptide comprising the
sequence encoded by SEQ ID NO: 20 comprising a substitution T146P
in GAL80; and variant polypeptides thereof, wherein at least one of
other positions may have mutation of an aminoacid with an aminoacid
that is an existing conserved aminoacid in NADB Rossmann
superfamily.
11. A process for producing at least one fermentation product from
a sugar composition comprising glucose, galactose, arabinose and
xylose, said process comprising fermenting said sugar composition
with a yeast cell according to claim 5.
12. The process according to claim 11, wherein said sugar
composition is produced from lignocellulosic material by: a)
pretreatment of at least one lignocellulosic material to produce
pretreated lignocellulosic material; b) enzymatic treatment of said
pretreated lignocellulosic material to produce said sugar
composition.
13. The process according to claim 11, wherein said fermentation is
anaerobic.
14. The process according to claim 11, wherein said fermentation
product is selected from the group consisting of ethanol,
n-butanol, isobutanol, lactic acid, 3-hydroxy-propionic acid,
acrylic acid, acetic acid, succinic acid, fumaric acid, malic acid,
itaconic acid, maleic acid, citric acid, adipic acid, an amino
acid, such as lysine, methionine, tryptophan, threonine, and
aspartic acid, 1,3-propane-diol, ethylene, glycerol, a
β-lactam antibiotic and a cephalosporin, vitamins,
pharmaceuticals, animal feed supplements, specialty chemicals,
chemical feedstocks, plastics, solvents, fuels, biofuels and biogas
or organic polymers, and an industrial enzyme, a protease, a
cellulase, an amylase, a glucanase, a lactase, a lipase, a lyase,
an oxidoreductase, a transferase or a xylanase.
Description
FIELD OF THE INVENTION
[0001] The invention relates to a process for the production of
cells which are capable of converting arabinose. The invention also
relates to cells that may be produced by the process. The invention
further relates to a process in which such cells are used for the
production of a fermentation product, such as ethanol.
BACKGROUND OF THE INVENTION
[0002] Large-scale consumption of traditional, fossil fuels
(petroleum-based fuels) in recent decades has contributed to high
levels of pollution. This, along with the realisation that the
world stock of fossil fuels is not unlimited and a growing
environmental awareness, has stimulated new initiatives to
investigate the feasibility of alternative fuels such as ethanol,
which is a particulate-free burning fuel source that releases less
CO2 than unleaded gasoline on a per litre basis. Although
biomass-derived ethanol may be produced by the fermentation of
hexose sugars obtained from many different sources, the substrates
typically used for commercial scale production of fuel alcohol,
such as cane sugar and corn starch, are expensive. Increases in the
production of fuel ethanol will therefore require the use of
lower-cost feedstocks. Currently, only lignocellulosic feedstock
derived from plant biomass is available in sufficient quantities to
substitute the crops currently used for ethanol production. Most
lignocellulosic material contains, next to C6 sugars, considerable
amounts of C5 sugars, including arabinose. Thus, for an economically
feasible fuel production process, both hexose and pentose sugars must
be fermented to form ethanol. The yeast Saccharomyces cerevisiae is
robust and well adapted for ethanol production, but it is unable to
convert
arabinose. Also, no naturally-occurring organisms are known which
can ferment xylose to ethanol with both a high ethanol yield and a
high ethanol productivity. There is therefore a need for an
organism possessing these properties so as to enable the
commercially-viable production of ethanol from lignocellulosic
feedstocks.
SUMMARY OF THE INVENTION
[0003] An object of the invention is to provide a cell, in
particular a yeast cell that is capable of converting
arabinose.
[0004] This object is attained according to the invention that
provides a process for the production of cells which are capable of
converting arabinose, comprising the following steps: [0005] a)
Introducing into a host strain that cannot convert arabinose, the
genes araA, araB and araD, this cell is designated as constructed
cell; [0006] b) Subjecting the constructed cell to adaptive
evolution until a cell that converts arabinose is obtained, [0007]
c) Optionally, subjecting the first arabinose converting cell to
adaptive evolution to improve the arabinose conversion; the cell
produced in step b) or c) is designated as first arabinose
converting cell; [0008] d) Analysing the full genome or part of the
genome of the first arabinose converting cell and that of the
constructed cell; [0009] e) Identifying single nucleotide
polymorphisms (SNP's) in the first arabinose converting cell; and
[0010] f) Using the information of the SNP's in rational design of
a cell capable of converting arabinose; [0011] g) Construction of
the cell capable of converting arabinose designed in step f).
[0012] The invention further provides a yeast cell having araA,
araB and araD genes wherein chromosome VII has a size of from 1300
to 1600 Kb as determined by electrophoresis, with the exclusion of
yeast cell BIE201.
[0013] The invention further relates to a polypeptide belonging to
the group consisting of the polypeptides: [0014] a. A polypeptide
having a sequence encoded by polynucleotide SEQ ID NO: 14 having a
substitution E455stop in SSY1 and variant polypeptides thereof
wherein one or more of the other positions have mutation of an
aminoacid with another aminoacid that is an existing aminoacid in
the AA trans superfamily; [0015] b. A polypeptide having the
sequence encoded by the polynucleotide SEQ ID NO: 16 having a
substitution D171G in YJR154w and variant polypeptides thereof
wherein one or more of the other positions have mutation of the
aminoacid with another aminoacid that is an existing conserved
aminoacid in the PhyH superfamily; [0016] c. A polypeptide having
the sequence encoded by the polynucleotide SEQ ID NO: 18 having a
substitution S396G in CEP3; [0017] d. A polypeptide having the
sequence encoded by SEQ ID NO: 20 having a substitution T146P in
GAL80 and variant polypeptides thereof wherein one or more of the
other positions may have mutation of the aminoacid with an
aminoacid that is an existing conserved aminoacid in the NADB
Rossmann superfamily.
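The nucleotide-level SNP labels used for the claimed cells (for example G1363T in SSY1) and the amino-acid substitutions listed above (for example E455stop) are linked by codon arithmetic. The following Python sketch is illustrative only and is not part of the disclosed process. It assumes that nucleotide positions are 1-based and counted from the first base of the coding sequence, and that the standard genetic code applies; the function names `parse_snp`, `codon_of` and `possible_changes` are hypothetical.

```python
import re

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_snp(label):
    """Split a label such as 'G1363T' into (reference base, position, new base)."""
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", label)
    if match is None:
        raise ValueError(f"not a nucleotide SNP label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def codon_of(position):
    """Map a 1-based coding position to (codon number, position within codon)."""
    return (position - 1) // 3 + 1, (position - 1) % 3 + 1


def possible_changes(label):
    """List the amino-acid changes a SNP could cause.

    The actual codon is not known here, so every codon carrying the
    reference base at the affected position is tried ('*' means stop).
    """
    ref, pos, alt = parse_snp(label)
    codon_no, offset = codon_of(pos)
    changes = set()
    for codon, amino in CODON_TABLE.items():
        if codon[offset - 1] == ref:
            mutated = codon[:offset - 1] + alt + codon[offset:]
            changes.add((amino, codon_no, CODON_TABLE[mutated]))
    return changes


if __name__ == "__main__":
    for snp in ("G1363T", "A512T", "A1186G", "A436C"):
        print(snp, "-> codon number, position in codon:", codon_of(parse_snp(snp)[1]))
    print(("E", 455, "*") in possible_changes("G1363T"))
```

Under these assumptions, position 1363 falls on the first base of codon 455, where a G to T change can turn a glutamate codon (GAA or GAG) into a stop codon (TAA or TAG).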
BRIEF DESCRIPTION OF THE DRAWINGS
[0018] FIG. 1 sets out a physical map of vector pPWT006.
[0019] FIG. 2 sets out a physical map of plasmid pPWT018, the
sequence of which is given in SEQ ID NO: 1.
[0020] FIG. 3 sets out an autoradiogram of a hybridization
experiment showing the correct integration of one
copy of the plasmid pPWT080 in CEN.PK113-7D.
[0021] FIG. 4 sets out a physical map of plasmid pPWT080, the
sequence of which is given in SEQ ID NO: 8.
[0022] FIG. 5 sets out an aerobic growth curve of reference strain
BIE104A2P1 on 2% arabinose as sole carbon source.
[0023] FIG. 6 sets out an anaerobic growth curve of BIE104A2P1c on
2% arabinose as sole carbon source.
[0024] FIG. 7 sets out growth curves (sugar, ethanol and glycerol
concentrations, OD600 and CO2 produced (ml/hr, second axis)) for
BIE104 precultured on 2% glucose, and grown on Verduyn medium with
5% glucose, 5% xylose, 3.5% arabinose and 1% galactose. All % in
w/w.
[0025] FIG. 8 sets out growth curves (sugar, ethanol and glycerol
concentrations, OD600 and CO2 produced (ml/hr, second axis)) for
BIE104A2P1c precultured on 2% glucose, and grown on Verduyn medium
with 5% glucose, 5% xylose, 3.5% arabinose and 1% galactose.
[0026] FIG. 9 sets out growth curves (sugar, ethanol and glycerol
concentrations, OD600 and CO2 produced (ml/hr, second axis)) for
BIE201 precultured on 2% glucose, and grown on Verduyn medium with
5% glucose, 5% xylose, 3.5% arabinose and 1% galactose. All % in
w/w.
[0027] FIG. 10 sets out a schematic overview of crossing.
[0028] FIG. 11 sets out an example of "Normalized Melting Curves"
(melting curves; top panel) and a "Normalized melting Peaks" curve
(lower panel). The latter is derived from the first graph and is
showing the change in fluorescence signal as a function of the
temperature. Strains BIE104A2P1 and BIE201 are displayed. The gene
tested in this figure is YJR154w. The difference in melting
temperature of the probe is clear between the two strains tested,
BIE201 and BIE104A2P1.
[0029] FIG. 12 sets out a schematic representation (coverage plot)
of chromosome VII in strain BIE201. The read depth is set out as a
function of the position along the chromosome. Some parts of
chromosome VII are present in multiple copies, i.e. two or three
times overrepresented.
[0030] FIG. 13 sets out a CHEF gel, stained with ethidium bromide.
Chromosomes were separated on their size using the CHEF technique.
Strains analyzed are BIE104 (untransformed yeast cell), BIE104A2P1a
(primary transformant unable to consume arabinose, synonym of
BIE104A2P1), BIE104A2P1c, a strain derived from BIE104A2P1 by
adaptive evolution, which is able to grow on arabinose, and strain
BIE201, derived from BIE104A2P1c by adaptive evolution, which can
grow on arabinose under anaerobic conditions. Shifts in chromosomes
are observed (see text). Strain YNN295 is a marker strain
(Bio-Rad).
[0031] FIG. 14 sets out a CHEF gel, blotted and hybridized with the
araA probe.
[0032] Chromosomes were separated on their size using the CHEF
technique. Strains analyzed are BIE104 (untransformed yeast cell),
BIE104A2P1a (primary transformant unable to consume arabinose,
synonym of BIE104A2P1), BIE104A2P1c, a strain derived from
BIE104A2P1 by adaptive evolution, which is able to grow on
arabinose, and strain BIE201, derived from BIE104A2P1c by adaptive
evolution, which can grow on arabinose under anaerobic conditions.
Shifts in chromosomes are observed (see text). Strain YNN295 is a
marker strain (Bio-Rad), used as a reference for the size of the
chromosomes.
[0033] FIG. 15 sets out a CHEF gel, blotted and hybridized with the
ACT1 probe. Chromosomes were separated on their size using the CHEF
technique. Strains analyzed are BIE104 (untransformed yeast cell),
BIE104A2P1a (primary transformant unable to consume arabinose,
synonym of BIE104A2P1), BIE104A2P1c, a strain derived from
BIE104A2P1 by adaptive evolution, which is able to grow on
arabinose, and strain BIE201, derived from BIE104A2P1c by adaptive
evolution, which can grow on arabinose under anaerobic conditions.
Shifts in chromosomes are observed (see text). Strain YNN295 is a
marker strain (Bio-Rad), used as a reference for the size of the
chromosomes.
[0034] FIG. 16 sets out a CHEF gel, blotted and hybridized with the
PNC1 probe. Chromosomes were separated on their size using the CHEF
technique. Strains analyzed are BIE104 (untransformed yeast cell),
BIE104A2P1a (primary transformant unable to consume arabinose,
synonym of BIE104A2P1), BIE104A2P1c, a strain derived from
BIE104A2P1 by adaptive evolution, which is able to grow on
arabinose, and strain BIE201, derived from BIE104A2P1c by adaptive
evolution, which can grow on arabinose under anaerobic conditions.
Shifts in chromosomes are observed (see text). Strain YNN295 is a
marker strain (Bio-Rad), used as a reference for the size of the
chromosomes.
[0035] FIG. 17 sets out a CHEF gel, blotted and hybridized with the
HSF1 probe. Chromosomes were separated on their size using the CHEF
technique. Strains analyzed are BIE104 (untransformed yeast cell),
BIE104A2P1a (primary transformant unable to consume arabinose,
synonym of BIE104A2P1), BIE104A2P1c, a strain derived from
BIE104A2P1 by adaptive evolution, which is able to grow on
arabinose, and strain BIE201, derived from BIE104A2P1c by adaptive
evolution, which can grow on arabinose under anaerobic conditions.
Shifts in chromosomes are observed (see text). Strain YNN295 is a
marker strain (Bio-Rad), used as a reference for the size of the
chromosomes.
[0036] FIG. 18 sets out a CHEF gel, blotted and hybridized with the
YGR031w probe. Chromosomes were separated on their size using the
CHEF technique. Strains analyzed are BIE104 (untransformed yeast
cell), BIE104A2P1a (primary transformant unable to consume
arabinose, synonym of BIE104A2P1), BIE104A2P1c, a strain derived
from BIE104A2P1 by adaptive evolution, which is able to grow on
arabinose, and strain BIE201, derived from BIE104A2P1c by adaptive
evolution, which can grow on arabinose under anaerobic conditions.
Shifts in chromosomes are observed (see text). Strain YNN295 is a
marker strain (Bio-Rad), used as a reference for the size of the
chromosomes.
[0037] FIG. 19 sets out an example of ten dissected asci from the
cross BIE104A2P1 × BIE201. The asci were dissected with a
Singer Micromanipulator. Each ascus consists of four ascospores.
These ascospores are separated from each other and are put on the
agar plate at distinctive distances. In theory, four haploid spore
isolates can give rise to four individual colonies. The four
colonies in a "column" originate from one ascus.
[0038] FIG. 20 illustrates the performance of strain BIE252 in the
BAM (Biological Activity Monitor, Halotec, The Netherlands). The
strain was precultured in Verduyn medium 2% glucose. Application in
the BAM was done on Verduyn medium supplemented with 5% glucose, 5%
xylose, 3.5% arabinose, 1% galactose and 0.5% mannose, pH 4.2, under
anaerobic conditions.
[0039] FIG. 21 illustrates the performance of strain
BIE252ΔGAL80 in the BAM. The strain was precultured in
Verduyn medium 2% glucose. Application in the BAM was done on
Verduyn medium supplemented with 5% glucose, 5% xylose, 3.5%
arabinose, 1% galactose and 0.5% mannose, pH 4.2, under anaerobic
conditions.
[0040] FIG. 22 sets out a schematic view of the double crossover
integration of the complete adipic acid pathway into the
genome.
[0041] FIG. 23 sets out a resulting chromatogram of an adipic acid
standard and a sample measured with the analysis method.
[0042] FIG. 24 sets out a physical map of plasmid pGBS416ARABD.
BRIEF DESCRIPTION OF THE SEQUENCE LISTING
[0043] SEQ ID NO: 1 sets out the sequence of pPWT018;
[0044] SEQ ID NO: 2 sets out the sequence of a primer for checking
integration of pPWT018;
[0045] SEQ ID NO: 3 sets out a primer for checking integration of
pPWT018 (with SEQ ID NO: 2) and for checking copy number pPWT018
(with SEQ ID NO: 4);
[0046] SEQ ID NO: 4 sets out the sequence for a primer for checking
copy number pPWT018;
[0047] SEQ ID NO: 5 sets out the sequence for a primer for checking
presence of pPWT018 in genome in combination with SEQ ID NO: 4;
[0048] SEQ ID NO: 6 sets out the sequence for a forward primer for
generating the SIT2 probe;
[0049] SEQ ID NO: 7 sets out the sequence for a reverse primer for
generating the SIT2 probe;
[0050] SEQ ID NO: 8 sets out the sequence for plasmid pPWT080;
[0051] SEQ ID NO: 9 sets out the sequence for a forward primer for
checking correct integration of pPWT080 at the 3'-end of the
GRE3-locus (with SEQ ID NO: 10) and for checking the copy number of
plasmid pPWT080 (with SEQ ID NO: 11);
[0052] SEQ ID NO: 10 sets out the sequence for a reverse primer for
checking correct integration of pPWT080 at the 3'-end of the
GRE3-locus;
[0053] SEQ ID NO: 11 sets out the sequence for a reverse primer for
checking the copy number of plasmid pPWT080 (with SEQ ID NO:
9);
[0054] SEQ ID NO: 12 sets out the sequence for a forward primer for
generating an RKI1-probe;
[0055] SEQ ID NO: 13 sets out the sequence for a reverse primer for
generating an RKI1-probe;
[0056] SEQ ID NO: 14 sets out the sequence of the
SSY1-gene in wild type strain BIE104;
[0057] SEQ ID NO: 15 sets out the sequence for the SSY1-gene in
strains BIE104A2P1c and BIE201;
[0058] SEQ ID NO: 16 sets out the sequence for the YJR154w-gene in
wild type strain BIE104;
[0059] SEQ ID NO: 17 sets out the sequence for the YJR154w-gene in
strains BIE104A2P1c and BIE201;
[0060] SEQ ID NO: 18 sets out the sequence for the CEP3-gene in wild
type strain BIE104;
[0061] SEQ ID NO: 19 sets out the sequence for the CEP3-gene in strains
BIE104A2P1c and BIE201;
[0062] SEQ ID NO: 20 sets out the sequence for the YPL277c-gene in wild
type strain BIE104;
[0063] SEQ ID NO: 21 sets out the sequence for the YPL277c-gene in
strains BIE104A2P1c and BIE201;
[0064] SEQ ID NO: 22 sets out the sequence for the GAL80-gene in
wild type strain BIE104;
[0065] SEQ ID NO: 23 sets out the sequence for the GAL80-gene in strain
BIE201;
[0066] SEQ ID NO 24 sets out the sequence of forward primer
SSY1;
[0067] SEQ ID NO 25 sets out the sequence of reverse primer
SSY1;
[0068] SEQ ID NO 26 sets out the sequence of forward primer
YJR154w;
[0069] SEQ ID NO 27 sets out the sequence of reverse primer
YJR154w;
[0070] SEQ ID NO 28 sets out the sequence of forward primer
CEP3;
[0071] SEQ ID NO 29 sets out the sequence of reverse primer
CEP3;
[0072] SEQ ID NO 30 sets out the sequence of forward primer
YPL277c;
[0073] SEQ ID NO 31 sets out the sequence of reverse primer
YPL277c;
[0074] SEQ ID NO 32 sets out the sequence of forward primer
GAL80;
[0075] SEQ ID NO 33 sets out the sequence of reverse primer
GAL80;
[0076] SEQ ID NO 34 sets out the sequence of Hi-Res probe SSY1;
[0077] SEQ ID NO 35 sets out the sequence of Hi-Res probe
YJR154w;
[0078] SEQ ID NO 36 sets out the sequence of Hi-Res probe CEP3;
[0079] SEQ ID NO 37 sets out the sequence of Hi-Res probe
YPL277c;
[0080] SEQ ID NO 38 sets out the sequence of Hi-Res probe
GAL80;
[0081] SEQ ID NO 39 sets out the sequence of forward primer
YGL057c;
[0082] SEQ ID NO 40 sets out the sequence of reverse primer
YGL057c;
[0083] SEQ ID NO 41 sets out the sequence of forward primer
SDS23;
[0084] SEQ ID NO 42 sets out the sequence of reverse primer
SDS23;
[0085] SEQ ID NO 43 sets out the sequence of forward primer
ACT1;
[0086] SEQ ID NO 44 sets out the sequence of reverse primer
ACT1;
[0087] SEQ ID NO 45 sets out the sequence of forward primer
araA;
[0088] SEQ ID NO 46 sets out the sequence of reverse primer
araA;
[0089] SEQ ID NO 47 sets out the sequence of forward primer
ACT1;
[0090] SEQ ID NO 48 sets out the sequence of reverse primer
ACT1;
[0091] SEQ ID NO 49 sets out the sequence of forward primer
PNC1;
[0092] SEQ ID NO 50 sets out the sequence of reverse primer
PNC1;
[0093] SEQ ID NO 51 sets out the sequence of forward primer
HSF1;
[0094] SEQ ID NO 52 sets out the sequence of reverse primer
HSF1;
[0095] SEQ ID NO 53 sets out the sequence of forward primer
YGR031w;
[0096] SEQ ID NO 54 sets out the sequence of reverse primer
YGR031w;
[0097] SEQ ID NO 55 sets out the sequence of forward primer (matA,
matα);
[0098] SEQ ID NO 56 sets out the sequence of reverse primer
matA;
[0099] SEQ ID NO 57 sets out the sequence of reverse primer
matα (alpha);
[0100] SEQ ID NO 58 sets out the sequence of forward primer
GAL80::kanMX;
[0101] SEQ ID NO 59 sets out the sequence of reverse primer
GAL80::kanMX;
[0102] SEQ ID NO 60 sets out the sequence of Forward primer for
amplification of the INT1LF;
[0103] SEQ ID NO 61 sets out the sequence of Reverse primer for the
amplification of INT1LF with a 50 bp flank overlapping Adi21
expression cassette;
[0104] SEQ ID NO 62 sets out the sequence of Forward primer for
amplification of the Adi21 expression cassette with 50 bp flank
INT1LF;
[0105] SEQ ID NO 63 sets out the sequence of Reverse primer for the
amplification of the Adi21 expression cassette;
[0106] SEQ ID NO 64 sets out the sequence of Forward primer for the
amplification of the Adi22 expression cassette;
[0107] SEQ ID NO 65 sets out the sequence of Reverse primer for the
amplification of the Adi22 expression cassette;
[0108] SEQ ID NO 66 sets out the sequence of Forward primer for the
amplification of the Adi23 expression cassette;
[0109] SEQ ID NO 67 sets out the sequence of Reverse primer for the
amplification of the Adi23 expression cassette;
[0110] SEQ ID NO 68 sets out the sequence of Forward primer for the
amplification of the kanMX marker from pUG7 with 50 bp flank
overlapping with Adi23;
[0111] SEQ ID NO 69 sets out the sequence of Reverse primer for the
amplification of the kanMX marker from pUG7 with 50 bp flank
overlapping with Adi8;
[0112] SEQ ID NO 70 sets out the sequence of Forward primer for the
amplification of the Adi8 expression cassette with 25 bp flank
overlap with kanMX of pUG7;
[0113] SEQ ID NO 71 sets out the sequence of Reverse primer Adi8
expression cassette;
[0114] SEQ ID NO 72 sets out the sequence of Forward primer for the
amplification of the Adi24 expression cassette;
[0115] SEQ ID NO 73 sets out the sequence of Reverse primer for the
amplification of the Adi24 expression cassette;
[0116] SEQ ID NO 74 sets out the sequence of Forward primer for the
amplification of the Adi25 expression cassette;
[0117] SEQ ID NO 75 sets out the sequence of Reverse primer for the
amplification of the Adi25 expression cassette with 50 bp overlap
with SucC;
[0118] SEQ ID NO 76 sets out the sequence of Forward primer for the
amplification of the SucC with 50 bp overlap with Adi25;
[0119] SEQ ID NO 77 sets out the sequence of Reverse primer for the
amplification of the SucC expression cassette;
[0120] SEQ ID NO 78 sets out the sequence of Forward primer for the
amplification of the SucD expression cassette;
[0121] SEQ ID NO 79 sets out the sequence of Reverse primer for the
amplification of the SucD expression cassette;
[0122] SEQ ID NO 80 sets out the sequence of Forward primer for the
amplification of the acdh67 expression cassette;
[0123] SEQ ID NO 81 sets out the sequence of Reverse primer for the
amplification of the acdh67 construct with 50 bp flank overlapping
with INTRF;
[0124] SEQ ID NO 82 sets out the sequence of Forward primer for the
amplification of the INT1LF site on yeast genome;
[0125] SEQ ID NO 83 sets out the sequence of Reverse primer for the
amplification of the INT1LF site on yeast genome;
[0126] SEQ ID NO 84 sets out the sequence of ADI21 PCR
fragment;
[0127] SEQ ID NO 85 sets out the sequence of ADI22 PCR
fragment;
[0128] SEQ ID NO 86 sets out the sequence of ADI23 PCR
fragment;
[0129] SEQ ID NO 87 sets out the sequence of ADI8 PCR fragment;
[0130] SEQ ID NO 88 sets out the sequence of ADI24 PCR
fragment;
[0131] SEQ ID NO 89 sets out the sequence of ADI25 PCR
fragment;
[0132] SEQ ID NO 90 sets out the sequence of SUCC PCR fragment;
[0133] SEQ ID NO 91 sets out the sequence of SUCD PCR fragment;
[0134] SEQ ID NO 92 sets out the sequence of ACDH67 PCR
fragment;
[0135] SEQ ID NO 93 sets out the sequence of KANMX marker
fragment;
[0136] SEQ ID NO 94 sets out the sequence of INT1LF PCR
fragment;
[0137] SEQ ID NO 95 sets out the sequence of INT1RF PCR
fragment;
[0138] SEQ ID NO 96 sets out the sequence of forward primer araABD
cassette;
[0139] SEQ ID NO 97 sets out the sequence of reverse primer araABD
cassette;
[0140] SEQ ID NO 98 sets out the sequence of forward primer
Ty1::araABD;
[0141] SEQ ID NO 99 sets out the sequence of reverse primer
TY1::araABD;
[0142] SEQ ID NO 100 sets out the sequence of forward primer
Ty1::kanMX;
[0143] SEQ ID NO 101 sets out the sequence of reverse primer
Ty1::kanMX.
DETAILED DESCRIPTION OF THE INVENTION
[0144] Throughout the present specification and the accompanying
claims, the words "comprise" and "include" and variations such as
"comprises", "comprising", "includes" and "including" are to be
interpreted inclusively. That is, these words are intended to
convey the possible inclusion of other elements or integers not
specifically recited, where the context allows.
[0145] The articles "a" and "an" are used herein to refer to one or
to more than one (i.e. to one or at least one) of the grammatical
object of the article. By way of example, "an element" may mean one
element or more than one element.
[0146] The various embodiments of the invention described herein
may be cross-combined. The invention provides a process for the
production of cells which are capable of converting arabinose,
comprising steps a) to g), which are described here in more
detail:
[0147] Step a) Introducing into a host strain that cannot convert
arabinose, the genes araA, araB and araD, this cell is designated
as constructed cell [0148] Step a) will be described below in
detail in the description as well as being illustrated by the
examples.
[0149] Steps b) and c) Subjecting the constructed cell to adaptive
evolution until a cell that converts arabinose is obtained,
Optionally, subjecting the first arabinose converting cell to
adaptive evolution to improve the arabinose conversion; the cell
produced in step b) or c) is designated as first arabinose
converting cell; [0150] Steps b) and c) will be described below in
detail in the description under adaptive evolution as well as being
illustrated by the examples.
[0151] Step d) Analysing the full genome or part of the genome of
the first arabinose converting cell and that of the constructed
cell;
[0152] This step d) may be executed using common techniques of
genome resequencing.
[0153] Step e) Identifying single nucleotide polymorphisms (SNP's)
in the first arabinose converting cell;
[0154] The SNP's are identified by comparing the genome of the first
arabinose converting cell with that of the constructed cell.
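The comparison in step e) can be illustrated with a minimal Python sketch. It assumes that the two sequences have already been aligned to the same length and that only substitutions are of interest; the function name `find_snps` and the two short sequences are hypothetical and serve only as an illustration, not as the resequencing pipeline actually used.

```python
def find_snps(reference, evolved):
    """Return (1-based position, reference base, evolved base) for each substitution.

    Both sequences are assumed to be pre-aligned and of equal length;
    insertions and deletions are outside the scope of this sketch.
    """
    if len(reference) != len(evolved):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, ref_base, alt_base)
        for index, (ref_base, alt_base) in enumerate(
            zip(reference.upper(), evolved.upper())
        )
        if ref_base != alt_base
    ]


# Hypothetical fragments: constructed cell versus first arabinose converting cell.
constructed_fragment = "ATGGAAACCTTG"
evolved_fragment = "ATGTAAACCTTG"

for position, ref_base, alt_base in find_snps(constructed_fragment, evolved_fragment):
    print(f"{ref_base}{position}{alt_base}")
```

The output uses the same notation as the specification (reference base, position, new base). In practice the comparison would be carried out on resequencing data with dedicated variant-calling software.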
[0155] Step f) Using the information of the SNP's in rational
design of a cell capable of converting arabinose;
[0156] In step f) the skilled person will know to which SNP's
arabinose conversion is attributed, and with common skill will be
able to design an improved strain based on that information.
[0157] In steps e), f) and/or g) the skilled person preferably uses
techniques of phenotyping, i.e. the identification of cells with
desired traits, in combination with techniques of genotyping,
i.e. the identification of candidate genes associated with the
chosen traits.
[0158] Examples of techniques for phenotyping are growth
experiments, in shake flasks or fermentors, in the presence of
single sugars or sugar mixtures. Also growth assays on solid agar
media can be applied. However, other suitable known methods may be
used.
[0159] Examples of techniques for genotyping are re-sequencing
techniques, such as Solexa and the like, quantitative PCR (Q-PCR),
and Southern blotting. However, other suitable known methods may be
used.
[0160] Step g) Construction of the cell capable of converting
arabinose designed in step f). In step g) all common techniques of
construction of new strains may be used. In one embodiment,
different strains (parents) are combined in order to combine
advantageous properties of the parents. For example a crossing
technique may be used involving the strain of step b) or c) which
is crossed with a strain that does not have all SNP's present in
the strains of step b) or c).
[0161] For example, a haploid yeast strain, transformed with genes
necessary for or enhancing the ability to ferment arabinose
(designated all together as ARA) was enhanced by a process called
adaptive evolution. During the adaptive evolution process, three
mutations have been introduced into the genome, designated mut1,
mut2 and mut3. The genotype of such a yeast strain could be written
as mut1 mut2 mut3 ARA.
[0162] Such a yeast strain may be crossed with another haploid
yeast strain, which also contains the genes needed for arabinose
conversion but is as yet unable to convert arabinose, because it
lacks the additional mutations. However, this strain may have another
beneficial property, such as tolerance to inhibitors. This property
is designated as ABC. Such a process is illustrated in FIG. 10.
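The outcome of such a cross can be sketched in Python. The sketch assumes, purely for illustration, that mut1, mut2, mut3 and ABC are single unlinked loci that segregate independently and that ARA is present in both parents and therefore in every spore; none of these assumptions is stated in the specification, and the function name `segregant_genotypes` is hypothetical.

```python
from itertools import product


def segregant_genotypes(heterozygous_loci):
    """Enumerate the haploid genotypes expected from a diploid that is
    heterozygous at each listed locus, assuming independent assortment."""
    genotypes = []
    for inherited in product([True, False], repeat=len(heterozygous_loci)):
        genotypes.append(
            frozenset(
                locus
                for locus, present in zip(heterozygous_loci, inherited)
                if present
            )
        )
    return genotypes


# Parent 1: mut1 mut2 mut3 ARA; parent 2: ABC ARA (ARA is shared by both parents).
loci = ["mut1", "mut2", "mut3", "ABC"]
spores = segregant_genotypes(loci)

desired = frozenset(loci)  # mut1 mut2 mut3 ABC (plus ARA)
fraction_desired = spores.count(desired) / len(spores)
print(f"{len(spores)} genotype classes, desired fraction = {fraction_desired}")
```

Under these assumptions one genotype class in sixteen combines all three mutations with the ABC property, which is why segregants are screened by phenotyping and genotyping after the cross.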
[0163] In an embodiment, in the above process, the yeast cell
capable of converting arabinose has a chromosome that is amplified
compared to the host strain, wherein the amplified chromosome has
the same number as the chromosome in which the araA, araB and araD
genes were introduced in the host strain. In an embodiment the
amplified chromosome is chromosome VII. In an embodiment, in the
yeast cell parts of chromosome VII, surrounding the centromere, are
amplified (as compared to the host strain). In an embodiment, a
region on the left arm of chromosome VII was amplified three times.
In an embodiment, part of the right arm of chromosome VII was
amplified twice, and an adjacent part was amplified three times
(see FIG. 12).
[0164] The part on the right arm of chromosome VII that was
amplified three times contains the arabinose expression cassette,
i.e. the genes araA, araB and araD under control of strong
constitutive promoters.
[0165] The invention further relates to a yeast cell having araA,
araB and araD genes wherein chromosome VII has a size of from 1300
to 1600 Kb as determined by electrophoresis, with the exclusion of
a yeast cell BIE201. Strain BIE201 has been disclosed in
WO2011003893.
[0166] BIE201 has all the single nucleotide polymorphisms G1363T in
the SSY1 gene, A512T in YJR154w gene, A1186G in CEP3 gene, and
A436C in GAL80 gene.
[0167] In an embodiment, in the yeast cell, the copy number of the
araA, araB and araD genes is two to ten, in an embodiment two to
eight or three to five each. The copy number of the araA, araB and
araD genes may be 2, 3, 4, 5, 6, 7, 8, 9, or 10. The copy number
may be determined with methods known to the skilled person.
Suitable methods are illustrated in the examples, and results are
e.g. shown in FIG. 12.
[0168] In an embodiment, the yeast cell has one or more, but not all,
of the single nucleotide polymorphisms chosen from the group
consisting of mutations G1363T in the SSY1 gene, A512T in YJR154w
gene, A1186G in CEP3 gene, and A436C in GAL80 gene. In an
embodiment, the yeast cell has a single polymorphism A436C in GAL80
gene. In an embodiment, the yeast cell has a single polymorphism
A1186G in CEP3 gene.
[0169] Sexual Conjugation
[0170] Mating in yeast which is mediated by diffusible molecules,
pheromones, can be readily demonstrated (Manney, Duntze & Betz
1981). When cells of opposite mating type are mixed on the surface
of agar growth medium in a petri dish, changes become apparent
within two to three hours. As each type of cell secretes its
pheromone into the medium, it responds to the one produced by the
opposite type (MacKay & Manney 1974). They each respond by
differentiating into a specialized functional form, a gamete. The
cells stop dividing and change their shape. They elongate and
become pear-shaped. These distinctive cells have been termed
"shmoos". Cells of opposite mating types that are in contact or
close proximity join at the surface and fuse together forming a
characteristic "peanut" shape with a central constriction, i.e. two
shmoos fused at their small ends. The two haploid nuclei within
each joined pair fuse into a diploid nucleus, forming a true
zygote. The diploid promptly buds at the constriction, forming a
characteristic "clover leaf" figure. One can easily observe all of
these stages under the microscope.
[0171] The mating pheromones that are secreted by haploid cells are
small peptide molecules that diffuse through agar (Betz, Manney
& Duntze 1981). Consequently, their existence and their effects
on cells of the opposite mating types are easy to demonstrate. If
cells of the mating type α (alpha) are grown overnight on agar
medium, a high concentration of the pheromone accumulates in the
agar surrounding the growth. If cells of the mating type a (matA or
matα) are placed on this agar, they begin to undergo the
"shmoo" transformation within a couple of hours. The same effect
can be demonstrated in a liquid medium in which mating type α
(alpha) cells have been grown.
[0172] Meiosis
[0173] Shmoos are the gametes in yeast. They differentiate from
normal vegetative haploid cells only when a cell of the opposite
mating type is present. In a like manner, any diploid cell can go
through meiosis forming haploids which have the potential to become
gametes (Esposito & Klapholz 1981; Fowell 1969). Meiosis is
part of the process of sporulation which is initiated when diploid
cells are transferred to a nutritionally unbalanced medium, but the
changes become apparent under the microscope only after three to
five days when the asci become quite distinctive. Theoretically,
all asci should contain four spores but in practice, some contain
only two or three. The ascus has a characteristic shape. Treating
the sporulation mixture with a readily available crude preparation
of digestive enzymes (e.g. Zymolyase, Glusulase) will remove the
wall of the ascus, liberating the spores. When the spores, either
within the ascus or after being liberated, are returned to a
nutritionally adequate environment, they germinate and undergo
vegetative growth in a stable haploid phase. Haploid strains occur
in two mating types, called a and α (alpha). Within each
ascus, two spores are normally mating-type a (matA) and the other
two are α (matα). When a cell of one mating type
encounters one of the other mating type, they initiate a series of
events that leads to conjugation (See Sexual Conjugation). The
result is a diploid cell, which grows by mitotic cell division in a
stable diploid phase. If one merely transfers a sporulated cell
culture to growth medium the result is a mixed population of
haploid strains and new diploid strains which are analogous to the
progeny from a cross between diploid higher organisms.
[0174] Normally, yeast geneticists isolate the spores, either
randomly or by micromanipulation, to prevent the haploid strains
from mating and forming the next generation of diploid strains.
This degree of control and the ability to observe the genetic
traits in the haploid phase makes genetic analysis in yeast
powerful and efficient.
[0175] Adaptation
[0176] Adaptation is the evolutionary process whereby a population
becomes better suited (adapted) to its habitat or habitats. This
process takes place over several to many generations, and is one of
the basic phenomena of biology.
[0177] The term adaptation may also refer to a feature which is
especially important for an organism's survival. Such adaptations
are produced in a variable population by the better suited forms
reproducing more successfully, by natural selection.
[0178] Changes in environmental conditions alter the outcome of
natural selection, affecting the selective benefits of subsequent
adaptations that improve an organism's fitness under the new
conditions. In the case of an extreme environmental change, the
appearance and fixation of beneficial adaptations can be essential
for survival. A large number of different factors, such as e.g.
nutrient availability, temperature, the availability of oxygen,
etcetera, can drive adaptive evolution.
[0179] Fitness
[0180] There is a clear relationship between adaptedness (the
degree to which an organism is able to live and reproduce in a
given set of habitats) and fitness. Fitness is an estimate and a
predictor of the rate of natural selection. By the application of
natural selection, the relative frequencies of alternative
phenotypes will vary in time, if they are heritable.
[0181] Genetic Changes
[0182] When natural selection acts on the genetic variability of
the population, genetic changes are the underlying mechanism. By
this means, the population adapts genetically to its circumstances.
Genetic changes may result in visible structures, or may adjust the
physiological activity of the organism in a way that suits the
changed habitat.
[0183] It may occur that habitats frequently change. Therefore, it
follows that the process of adaptation is never finally complete.
In time, it may happen that the environment changes gradually, and
the species comes to fit its surroundings better and better. On the
other hand, it may happen that changes in the environment occur
relatively rapidly, and then the species becomes less and less well
adapted. Adaptation is a genetic process, which goes on all the
time to some extent, also when the population does not change the
habitat or environment.
[0184] Single nucleotides in a DNA sequence may be changed
(substitution), removed (deletions) or added (insertion). Insertion
or deletion SNPs (InDels) may shift the translational frame.
[0185] Single nucleotide polymorphisms may fall within coding
sequences of genes (Open Reading Frames or ORFs), non-coding
regions of genes (like promoter sequences, terminator sequences and
the like), or in the intergenic regions between genes. SNPs within
a coding sequence will not necessarily change the amino acid
sequence of the corresponding protein that is produced after
transcription and translation, due to degeneracy of the genetic
code. A SNP in which both forms lead to the same polypeptide
sequence is termed synonymous (a silent mutation). If a different
polypeptide sequence is produced they are nonsynonymous. A
nonsynonymous change may either be missense or nonsense. A missense
change results in a different amino acid in the corresponding
polypeptide, while a nonsense change results in a premature stop
codon, sometimes leading to the formation of a truncated
protein.
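The distinction between synonymous, missense and nonsense changes can be made concrete with a short Python sketch based on the standard genetic code. The function name `classify_codon_change` is hypothetical, and the example codons are illustrative only, since the actual codons affected in the genes of the invention are not given here.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(reference_codon, variant_codon):
    """Classify a codon change as synonymous, missense or nonsense."""
    reference_aa = CODON_TABLE[reference_codon.upper()]
    variant_aa = CODON_TABLE[variant_codon.upper()]
    if reference_aa == variant_aa:
        return "synonymous"
    if variant_aa == "*":  # a stop codon is created
        return "nonsense"
    return "missense"


# Illustrative codons only.
print(classify_codon_change("GAT", "GAC"))  # Asp -> Asp
print(classify_codon_change("GAT", "GGT"))  # Asp -> Gly
print(classify_codon_change("GAA", "TAA"))  # Glu -> stop
```

The sketch does not treat the loss of an existing stop codon separately; such a change would be reported as missense here.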
[0186] SNPs that are not in protein-coding regions may still have
consequences for gene expression, for instance by a changed
transcription factor binding or stability of the corresponding
mRNA.
[0187] The changes that may occur in the DNA are not necessarily
limited to the change (substitution, deletion or insertion) of a
single nucleotide, but may also comprise a change of two or more
nucleotides (Small Nucleotide Variations).
[0188] In addition, chromosomal translocations may occur. A
chromosome translocation is a chromosome abnormality caused by
rearrangement of parts between nonhomologous chromosomes.
[0189] In particular, according to the invention SNP's are created in
the following reading frames: SSY1, CEP3 and GAL80.
[0190] SSY1 is herein a component of the SPS plasma membrane amino
acid sensor system (Ssy1p-Ptr3p-Ssy5p), which senses external amino
acid concentration and transmits intracellular signals that result
in regulation of expression of amino acid permease genes.
[0191] CEP3 is herein an essential kinetochore protein, component
of the CBF3 complex that binds the CDEIII region of the centromere;
contains an N-terminal Zn2Cys6 type zinc finger domain, a
C-terminal acidic domain, and a putative coiled coil dimerization
domain. GAL80 is herein a transcriptional regulator involved in the
repression of GAL genes in the absence of galactose. Typically it
inhibits transcriptional activation by Gal4p and inhibition is
relieved by Gal3p or Gal1p binding.
[0192] According to the invention, SNP's in the genes SSY1, CEP3
and GAL80 have been shown to be important for the cell to be able
to ferment a mixed sugar composition. BLAST searches were conducted
for the SNP's found in these genes.
[0193] An overview of the SNP that were identified is given in
table 1:
TABLE-US-00001 TABLE 1 Overview of SNP's of the invention
Gene      Nucleotide mutation   Amino acid mutation
          position in ORF*      position in protein
SSY1      G1363T                E455stop
YJR154w   A512G                 D171G
CEP3      A1186G                S396G
GAL80     A436C                 T146P
*the A of the start codon ATG is the first nucleotide position
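The relation between the nucleotide positions and the amino acid positions in table 1 follows from the numbering convention in the footnote. A minimal Python sketch, in which the function names `residue_number` and `position_in_codon` are hypothetical, shows the arithmetic:

```python
def residue_number(orf_position):
    """Amino acid position (1-based) encoded by a 1-based ORF nucleotide
    position, where the A of the start codon ATG is position 1."""
    if orf_position < 1:
        raise ValueError("ORF positions are 1-based")
    return (orf_position - 1) // 3 + 1


def position_in_codon(orf_position):
    """Position (1, 2 or 3) of the nucleotide within its codon."""
    if orf_position < 1:
        raise ValueError("ORF positions are 1-based")
    return (orf_position - 1) % 3 + 1


# Nucleotide and amino acid positions as listed in table 1.
TABLE_1 = {
    "SSY1": (1363, 455),
    "YJR154w": (512, 171),
    "CEP3": (1186, 396),
    "GAL80": (436, 146),
}

for gene, (nucleotide, amino_acid) in TABLE_1.items():
    print(gene, residue_number(nucleotide) == amino_acid, position_in_codon(nucleotide))
```

For each of the four genes the computed residue number agrees with the amino acid position given in the table.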
[0194] A blast of the genes containing the SNP resulted in the
following data:
[0195] Ssy1p (Member of the AA Trans Superfamily)
[0196] Component of the SPS plasma membrane amino acid sensor
system (Ssy1p-Ptr3p-Ssy5p), which senses external amino acid
concentration and transmits intracellular signals that result in
regulation of expression of amino acid permease genes
[Saccharomyces cerevisiae]
TABLE-US-00002
Protein                Organism                   Length   Identity
Ssy1p                  S. cerevisiae JAY291       852 aa   99%
Ssy1p                  S. cerevisiae YJM789       852 aa   99%
YDR160w-like protein   S. cerevisiae AWRI1631     791 aa   99%
ZYRO0F13838p           Z. rouxii CBS 732          836 aa   56%
hypothetical protein   C. glabrata CBS 138        853 aa   53%
KLTH0G11726p           Lachancea thermotolerans   824 aa   46%
[0197] The shorter protein found in S. cerevisiae BIE201 is a unique
feature.
[0198] YJR154w (Member of the PhyH Superfamily)
[0199] Putative protein of unknown function; green fluorescent
protein (GFP)-fusion protein localizes to the cytoplasm
[Saccharomyces cerevisiae]
TABLE-US-00003
Protein                        Organism                   Length   Identity
YJR154w                        S. cerevisiae JAY291       346 aa   100%
conserved protein              S. cerevisiae YJM789       346 aa   99%
putative pimeloyl-CoA synth.   S. cerevisiae              346 aa   71%
YJR154Wp-like protein          S. cerevisiae AWRI1631     227 aa   99%
KLTH0E09900p                   Lachancea thermotolerans   340 aa   48%
[0200] In all these proteins, the D-residue at position 171 (or
equivalent position based on the BLAST results) is conserved.
[0201] CEP3 (GAL4-Like Zn2Cys6 Binuclear Cluster DNA-Binding
Domain; Found in Transcription Regulators like GAL4)
[0202] Centromere DNA-binding protein complex CBF3 subunit B
TABLE-US-00004
Protein                   Organism                  Length   Identity
CEP3                      S. cerevisiae JAY291      608 aa   100%
ZYRO0A07260p              Z. rouxii CBS 732         596 aa   46%
unnamed protein product   Candida glabrata CBS138   611 aa   44%
AFL200Wp                  A. gossypii ATCC 10895    596 aa   41%
[0203] In all these proteins, the S-residue at position 396 (or
equivalent position based on the BLAST results) is conserved.
[0204] GAL80 (Member of the NADB Rossmann Superfamily)
[0205] Galactose/lactose metabolism regulatory protein GAL80
TABLE-US-00005
Protein                     Organism                       Length   Identity
transcriptional regulator   S. cerevisiae YJM789           435 aa   100%
GAL80p                      S. kudriavzevii                435 aa   89%
protein Kpol_1059p5         V. polyspora DSM 70294         429 aa   73%
ZYRO0G04664p                Z. rouxii CBS 732              437 aa   67%
KLTH0C02838p                L. thermotolerans              424 aa   64%
KIGAL80 protein             Kluyveromyces lactis           457 aa   58%
NECHADRAFT_86878            N. haematococca mpVI 77-13-4   367 aa   30%
[0206] In all these proteins, the T-residue at position 146 (or
equivalent position based on the BLAST results) is conserved.
[0207] The Sugar Composition
[0208] The sugar composition according to the invention comprises
glucose, arabinose and xylose. Any sugar composition may be used in
the invention that satisfies those criteria. Optional sugars in the
sugar composition are galactose and rhamnose. In a preferred
embodiment, the sugar composition is a hydrolysate of one or more
lignocellulosic materials. Lignocellulose herein includes
hemicellulose and hemicellulose parts of biomass. Also
lignocellulose includes lignocellulosic fractions of biomass.
Suitable lignocellulosic materials may be found in the following
list: orchard prunings, chaparral, mill waste, urban wood waste,
municipal waste, logging waste, forest thinnings, short-rotation
woody crops, industrial waste, wheat straw, oat straw, rice straw,
barley straw, rye straw, flax straw, soy hulls, rice hulls, rice
straw, corn gluten feed, oat hulls, sugar cane, corn stover, corn
stalks, corn cobs, corn husks, switch grass, miscanthus, sweet
sorghum, canola stems, soybean stems, prairie grass, gamagrass,
foxtail; sugar beet pulp, citrus fruit pulp, seed hulls, cellulosic
animal wastes, lawn clippings, cotton, seaweed, trees, softwood,
hardwood, poplar, pine, shrubs, grasses, wheat, wheat straw, sugar
cane bagasse, corn, corn husks, corn cobs, corn kernel, fiber from
kernels, products and by-products from wet or dry milling of
grains, municipal solid waste, waste paper, yard waste, herbaceous
material, agricultural residues, forestry residues, municipal solid
waste, waste paper, pulp, paper mill residues, branches, bushes,
canes, corn, corn husks, an energy crop, forest, a fruit, a flower,
a grain, a grass, a herbaceous crop, a leaf, bark, a needle, a log,
a root, a sapling, a shrub, switch grass, a tree, a vegetable,
fruit peel, a vine, sugar beet pulp, wheat middlings, oat hulls,
hard or soft wood, organic waste material generated from an
agricultural process, forestry wood waste, or a combination of any
two or more thereof.
[0209] An overview of some suitable sugar compositions derived from
lignocellulose and the sugar composition of their hydrolysates is
given in table 1. The listed lignocelluloses include: corn cobs,
corn fiber, rice hulls, melon shells, sugar beet pulp, wheat straw,
sugar cane bagasse, wood, grass and olive pressings.
TABLE-US-00006 TABLE 1 Overview of sugar compositions from
lignocellulosic materials.
Lignocellulosic material   Gal   Xyl   Ara   Man   Glu   Rham   Sum   % Gal.   Lit.
Corn cob a                  10   286    36     -   227     11   570     1.7    (1)
Corn cob b                 131   228   160     -   144      -   663    19.8    (1)
Rice hulls a                 9   122    24    18   234     10   417     2.2    (1)
Rice hulls b                 8   120    28     -   209     12   378     2.2    (1)
Melon Shells                 6   120    11     -   208     16   361     1.7    (1)
Sugar beet pulp             51    17   209    11   211     24   523     9.8    (2)
Wheat straw Idaho           15   249    36     -   396      -   696     2.2    (3)
Corn fiber                  36   176   113     -   372      -   697     5.2    (4)
Cane Bagasse                14   180    24     5   391      -   614     2.3    (5)
Corn stover                 19   209    29     -   370      -   626       -    (6)
Athel (wood)                 5   118     7     3   493      -   625     0.7    (7)
Eucalyptus (wood)           22   105     8     3   445      -   583     3.8    (7)
CWR (grass)                  8   165    33     -   340      -   546     1.4    (7)
JTW (grass)                  7   169    28     -   311      -   515     1.3    (7)
MSW                          4    24     5    20   440      -   493     0.9    (7)
Reed Canary Grass Veg       16   117    30     6   209      1   379     4.2    (8)
Reed Canary Grass Seed      13   163    28     6   265      1   476     2.7    (9)
Olive pressing residue      15   111    24     8   329      -   487     3.1    (9)
Gal = galactose, Xyl = xylose, Ara = arabinose, Man = mannose, Glu =
glucose, Rham = rhamnose. The percentage galactose (% Gal) and
literature source is given. A dash indicates that no value is given.
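The percentage galactose column appears to be the galactose value divided by the sum of the listed sugars. This reading is an inference from the numbers and is not stated in the table. The following Python sketch recomputes the percentage for a few rows; the function name `percent_galactose` is hypothetical, and because the tabulated values appear to be rounded, the recomputed figure may differ from the listed one in the last digit.

```python
def percent_galactose(galactose, total_sugar):
    """Galactose as a percentage of the summed sugars of one material."""
    if total_sugar <= 0:
        raise ValueError("total sugar must be positive")
    return 100.0 * galactose / total_sugar


# (galactose, sum, % Gal. as listed) for a few rows of the table.
ROWS = {
    "Corn cob a": (10, 570, 1.7),
    "Corn cob b": (131, 663, 19.8),
    "Sugar beet pulp": (51, 523, 9.8),
    "Cane Bagasse": (14, 614, 2.3),
}

for material, (galactose, total, listed) in ROWS.items():
    computed = percent_galactose(galactose, total)
    print(f"{material}: computed {computed:.2f}%, listed {listed}%")
```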
[0210] It is clear from table 1 that in these lignocelluloses a
high amount of sugar is present in the form of glucose, xylose,
arabinose and galactose. The conversion of glucose, xylose,
arabinose and galactose to fermentation product is thus of great
economic importance. Also rhamnose is present in some
lignocellulose materials be it in lower amounts than the previously
mentioned sugars. Advantageously therefore also rhamnose is
converted by the mixed sugar cell.
[0211] Pretreatment and Enzymatic Hydrolysis
[0212] Pretreatment and enzymatic hydrolysis may be needed to
release sugars that may be fermented according to the invention
from the lignocellulosic (including hemicellulosic) material. These
steps may be executed with conventional methods.
[0213] The Mixed Sugar Cell
[0214] The mixed sugar cell comprises the genes araA, araB and
araD integrated into the mixed sugar cell genome as defined
hereafter. It is able to ferment glucose, arabinose, xylose,
galactose and mannose. In one embodiment of the invention the mixed
sugar cell is able to ferment one or more additional sugars,
preferably C5 and/or C6 sugars. In an embodiment of the invention
the mixed sugar cell comprises one or more of: a xylA-gene and/or
XKS1-gene, to allow the mixed sugar cell to ferment xylose;
deletion of the aldose reductase (GRE3) gene; overexpression of
PPP-genes TAL1, TKL1, RPE1 and RKI1 to allow the increase of the
flux through the pentose phosphate pathway in the cell.
[0215] Construction of the Mixed Sugar Strain
[0216] The genes may be introduced in the mixed sugar cell by
introduction into a host cell: [0217] a) a cluster consisting of
PPP-genes TAL1, TKL1, RPE1 and RKI1, under control of strong
promoters; [0218] b) a cluster consisting of a xylA-gene and a
XKS1-gene both under control of constitutive promoters, [0219] c) a
cluster consisting of the genes araA, araB and araD and/or a
cluster of xylA-gene and/or the XKS1-gene; and [0220] d) deletion
of an aldose reductase gene and adaptive evolution to produce the
mixed sugar cell. The above cell may be constructed using
recombinant expression techniques.
[0221] Recombinant Expression
[0222] The cell of the invention is a recombinant cell. That is to
say, a cell of the invention comprises, or is transformed with or
is genetically modified with a nucleotide sequence that does not
naturally occur in the cell in question.
[0223] Techniques for the recombinant expression of enzymes in a
cell, as well as for the additional genetic modifications of a cell
of the invention are well known to those skilled in the art.
Typically such techniques involve transformation of a cell with a
nucleic acid construct comprising the relevant sequence. Such
methods are, for example, known from standard handbooks, such as
Sambrook and Russell (2001) "Molecular Cloning: A Laboratory Manual"
(3rd edition), Cold Spring Harbor Laboratory, Cold Spring Harbor
Laboratory Press, or F. Ausubel et al., eds., "Current protocols in
molecular biology", Greene Publishing and Wiley Interscience, New
York (1987). Methods for transformation and genetic modification of
fungal host cells are known from e.g. EP-A-0635574, WO 98/46772,
WO 99/60102, WO 00/37671, WO 90/14423, EP-A-0481008
and U.S. Pat. No. 6,265,186.
[0224] Typically, the nucleic acid construct may be a plasmid, for
instance a low copy plasmid or a high copy plasmid. The cell
according to the present invention may comprise a single or
multiple copies of the nucleotide sequence encoding an enzyme, for
instance by multiple copies of a nucleotide construct or by use of
a construct which has multiple copies of the enzyme sequence.
[0225] The nucleic acid construct may be maintained episomally and
thus comprise a sequence for autonomous replication, such as an
autonomously replicating sequence (ARS). A suitable episomal
nucleic acid construct may e.g. be based on the yeast 2μ or pKD1
plasmids (Gleer et al., 1991, Biotechnology 9: 968-975), or the AMA
plasmids (Fierro et al., 1995, Curr Genet. 29:482-489).
Alternatively, each nucleic acid construct may be integrated in one
or more copies into the genome of the cell. Integration into the
cell's genome may occur at random by non-homologous recombination
but preferably, the nucleic acid construct may be integrated into
the cell's genome by homologous recombination as is well known in
the art (see e.g. WO90/14423, EP-A-0481008, EP-A-0635 574 and U.S.
Pat. No. 6,265,186).
[0226] Most episomal or 2μ plasmids are relatively unstable,
being lost in approximately 10⁻² or more cells after each
generation. Even under conditions of selective growth, only 60% to
95% of the cells retain the episomal plasmid. The copy number of
most episomal plasmids ranges from 10-40 per cell of cir⁺
hosts. However, the plasmids are not equally distributed among the
cells, and there is a high variance in the copy number per cell in
populations. Strains transformed with integrative plasmids are
extremely stable, even in the absence of selective pressure.
However, plasmid loss can occur at approximately 10⁻³ to
10⁻⁴ frequencies by homologous recombination between tandemly
repeated DNA, leading to looping out of the vector sequence.
Preferably, in the case of stable integration, the vector is
designed such that, upon loss of the selection marker genes (which
also occurs by intramolecular, homologous recombination), looping
out of the integrated construct is no longer possible. Preferably
the genes are thus stably integrated. Stable integration is herein
defined as integration into the genome, wherein looping out of the
integrated construct is no longer possible. Preferably selection
markers are absent. Typically, the enzyme encoding sequence will be
operably linked to one or more nucleic acid sequences, capable of
providing for or aiding the transcription and/or translation of the
enzyme sequence.
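The practical consequence of the loss frequencies mentioned above can be illustrated with a simple Python sketch. It assumes, as a simplification not made in the specification, that the loss frequency is a constant probability per cell per generation, that there is no selective pressure, and that cells with and without the construct grow equally fast. The function name `fraction_retaining` and the choice of 50 generations are hypothetical.

```python
def fraction_retaining(loss_per_generation, generations):
    """Fraction of cells still carrying the construct after the given number
    of generations, for a constant per-generation loss probability."""
    if not 0.0 <= loss_per_generation <= 1.0:
        raise ValueError("loss_per_generation must be between 0 and 1")
    if generations < 0:
        raise ValueError("generations must not be negative")
    return (1.0 - loss_per_generation) ** generations


# Loss frequencies taken from the text; 50 generations is an arbitrary example.
for label, loss in [
    ("episomal, 1e-2", 1e-2),
    ("integrative, 1e-3", 1e-3),
    ("integrative, 1e-4", 1e-4),
]:
    print(f"{label}: {fraction_retaining(loss, 50):.3f}")
```

Under these assumptions the difference between a loss frequency of 10⁻² and 10⁻⁴ per generation becomes substantial over many generations, which is consistent with the stated preference for stable integration.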
[0227] The term "operably linked" refers to a juxtaposition wherein
the components described are in a relationship permitting them to
function in their intended manner. For instance, a promoter or
enhancer is operably linked to a coding sequence if the said promoter
or enhancer affects the transcription of the coding sequence.
[0228] As used herein, the term "promoter" refers to a nucleic acid
fragment that functions to control the transcription of one or more
genes, located upstream with respect to the direction of
transcription of the transcription initiation site of the gene, and
is structurally identified by the presence of a binding site for
DNA-dependent RNA polymerase, transcription initiation sites and
any other DNA sequences known to one skilled in the art. A
"constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter
is a promoter that is active under environmental or developmental
regulation.
[0229] The promoter that could be used to achieve the expression of
a nucleotide sequence coding for an enzyme according to the present
invention, may be not native to the nucleotide sequence coding for
the enzyme to be expressed, i.e. a promoter that is heterologous to
the nucleotide sequence (coding sequence) to which it is operably
linked. The promoter may, however, be homologous, i.e. endogenous,
to the host cell.
[0230] Promoters are widely available and known to the skilled
person. Suitable examples of such promoters include e.g. promoters
from glycolytic genes, such as the phosphofructokinase (PFK),
triose phosphate isomerase (TPI), glyceraldehyde-3-phosphate
dehydrogenase (GPD, TDH3 or GAPDH), pyruvate kinase (PYK),
phosphoglycerate kinase (PGK) promoters from yeasts or filamentous
fungi; more details about such promoters from yeast may be found in
(WO 93/03159). Other useful promoters are ribosomal protein
encoding gene promoters, the lactase gene promoter (LAC4), alcohol
dehydrogenase promoters (ADH1, ADH4, and the like), and the enolase
promoter (ENO). Other promoters, both constitutive and inducible,
and enhancers or upstream activating sequences will be known to
those of skill in the art. The promoters used in the host cells of
the invention may be modified, if desired, to affect their control
characteristics. Suitable promoters in this context include both
constitutive and inducible natural promoters as well as engineered
promoters, which are well known to the person skilled in the art.
Suitable promoters in eukaryotic host cells may be GAL7, GAL10, or
GAL1, CYC1, HIS3, ADH1, PGL, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2,
ENO1, TPI1, and AOX1. Other suitable promoters include PDC1, GPD1,
PGK1, TEF1, and TDH3.
[0231] In a cell of the invention, the 3'-end of the nucleotide
sequence encoding the enzyme preferably is operably linked to a
transcription terminator sequence. Preferably the terminator
sequence is operable in a host cell of choice, such as e.g. the
yeast species of choice. In any case the choice of the terminator
is not critical; it may e.g. be from any yeast gene, although
terminators may sometimes work if from a non-yeast, eukaryotic,
gene. Usually a nucleotide sequence encoding the enzyme comprises a
terminator. Preferably, such terminators are combined with
mutations that prevent nonsense mediated mRNA decay in the host
cell of the invention (see for example: Shirley et al., 2002,
Genetics 161:1465-1482).
[0232] The transcription termination sequence further preferably
comprises a polyadenylation signal.
[0233] Optionally, a selectable marker may be present in a nucleic
acid construct suitable for use in the invention. As used herein,
the term "marker" refers to a gene encoding a trait or a phenotype
which permits the selection of, or the screening for, a host cell
containing the marker. The marker gene may be an antibiotic
resistance gene whereby the appropriate antibiotic can be used to
select for transformed cells from among cells that are not
transformed. Examples of suitable antibiotic resistance markers
include e.g. dihydrofolate reductase,
hygromycin-B-phosphotransferase, 3'-O-phosphotransferase II
(kanamycin, neomycin and G418 resistance). Antibiotic resistance
markers may be most convenient for the transformation of polyploid
host cells. Also non-antibiotic resistance markers may be used,
such as auxotrophic markers (URA3, TRP1, LEU2) or the S. pombe TPI
gene (described by Russell P R, 1985, Gene 40: 125-130). In a
preferred embodiment the host cells transformed with the nucleic
acid constructs are marker gene free. Methods for constructing
recombinant marker gene free microbial host cells are disclosed in
EP-A-0 635 574 and are based on the use of bidirectional markers
such as the A. nidulans amdS (acetamidase) gene or the yeast URA3
and LYS2 genes. Alternatively, a screenable marker such as Green
Fluorescent Protein, lacZ, luciferase, chloramphenicol
acetyltransferase or beta-glucuronidase may be incorporated into the
nucleic acid constructs of the invention to allow screening for
transformed cells.
[0234] Optional further elements that may be present in the nucleic
acid constructs suitable for use in the invention include, but are
not limited to, one or more leader sequences, enhancers,
integration factors, and/or reporter genes, intron sequences,
centromeres, telomeres and/or matrix attachment (MAR) sequences. The
nucleic acid constructs of the invention may further comprise a
sequence for autonomous replication, such as an ARS sequence.
[0235] The recombination process may thus be executed with known
recombination techniques. Various means are known to those skilled
in the art for expression and overexpression of enzymes in a cell
of the invention. In particular, an enzyme may be overexpressed by
increasing the copy number of the gene coding for the enzyme in the
host cell, e.g. by integrating additional copies of the gene in the
host cell's genome, by expressing the gene from an episomal
multicopy expression vector or by introducing an episomal expression
vector that comprises multiple copies of the gene.
[0236] Alternatively, overexpression of enzymes in the host cells
of the invention may be achieved by using a promoter that is not
native to the sequence coding for the enzyme to be overexpressed,
i.e. a promoter that is heterologous to the coding sequence to
which it is operably linked. Although the promoter preferably is
heterologous to the coding sequence to which it is operably linked,
it is also preferred that the promoter is homologous, i.e.
endogenous to the host cell. Preferably the heterologous promoter
is capable of producing a higher steady state level of the
transcript comprising the coding sequence (or is capable of
producing more transcript molecules, i.e. mRNA molecules, per unit
of time) than is the promoter that is native to the coding
sequence. Suitable promoters in this context include both
constitutive and inducible natural promoters as well as engineered
promoters.
[0237] The coding sequence used for overexpression of the enzymes
mentioned above may preferably be homologous to the host cell of
the invention. However, coding sequences that are heterologous to
the host cell of the invention may be used.
[0238] Overexpression of an enzyme, when referring to the
production of the enzyme in a genetically modified cell, means that
the enzyme is produced at a higher level of specific enzymatic
activity as compared to the unmodified host cell under identical
conditions. Usually this means that the enzymatically active
protein (or proteins in case of multi-subunit enzymes) is produced
in greater amounts, or rather at a higher steady state level as
compared to the unmodified host cell under identical conditions.
Similarly this usually means that the mRNA coding for the
enzymatically active protein is produced in greater amounts, or
again rather at a higher steady state level as compared to the
unmodified host cell under identical conditions. Preferably in a
host cell of the invention, an enzyme to be overexpressed is
overexpressed by at least a factor of about 1.1, about 1.2, about
1.5, about 2, about 5, about 10 or about 20 as compared to a strain
which is genetically identical except for the genetic modification
causing the overexpression. It is to be understood that these
levels of overexpression may apply to the steady state level of the
enzyme's activity, the steady state level of the enzyme's protein
as well as to the steady state level of the transcript coding for
the enzyme.
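The fold-change criterion in the preceding paragraph can be illustrated with a minimal sketch. The function names and the example measurements are hypothetical; only the list of factors (about 1.1 to about 20) is taken from the text, and the same ratio may be applied to activity, protein or transcript levels.

```python
# Overexpression factors named in the text ("about 1.1 ... about 20").
THRESHOLDS = (1.1, 1.2, 1.5, 2, 5, 10, 20)


def overexpression_factor(modified_level, reference_level):
    """Ratio of a steady state level in the modified strain to the level in
    a strain that is genetically identical except for the modification."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return modified_level / reference_level


def thresholds_met(factor):
    """Return the listed factors that the measured ratio reaches."""
    return [t for t in THRESHOLDS if factor >= t]


# Hypothetical measurement: 6.0 units in the modified strain, 2.0 in the reference.
factor = overexpression_factor(modified_level=6.0, reference_level=2.0)
print(factor, thresholds_met(factor))
```

Both strains are assumed to have been measured under identical conditions, as the paragraph requires.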
The Adaptive Evolution
[0239] The mixed sugar cells are in their preparation subjected to
adaptive evolution. A cell of the invention may be adapted to sugar
utilisation by selection of mutants, either spontaneous or induced
(e.g. by radiation or chemicals), for growth on the desired sugar,
preferably as sole carbon source, and more preferably under
anaerobic conditions. Selection of mutants may be performed by
techniques including serial transfer of cultures as e.g. described
by Kuyper et al. (2004, FEMS Yeast Res. 4: 655-664) or by
cultivation under selective pressure in a chemostat culture. E.g.
in a preferred host cell of the invention at least one of the
genetic modifications described above, including modifications
obtained by selection of mutants, confers on the host cell the
ability to grow on xylose as carbon source, preferably as sole
carbon source, and preferably under anaerobic conditions.
Preferably the cell produces essentially no xylitol, e.g. the
xylitol produced is below the detection limit or e.g. less than
about 5, about 2, about 1, about 0.5, or about 0.3% of the carbon
consumed on a molar basis.
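As an illustration of the xylitol criterion above, the following sketch expresses xylitol formation as a percentage of the carbon consumed. It assumes that "on a molar basis" can be read as a carbon-mole (C-mol) basis and that xylitol contains five carbon atoms; the function names and the example amounts are hypothetical.

```python
# Upper limits listed in the text, in percent of the carbon consumed.
LIMITS_PERCENT = (5, 2, 1, 0.5, 0.3)


def xylitol_carbon_percent(xylitol_mol, carbon_consumed_cmol, carbons_per_xylitol=5):
    """Carbon recovered in xylitol as a percentage of the carbon consumed.

    Assumption: both quantities are compared on a C-mol basis.
    """
    if carbon_consumed_cmol <= 0:
        raise ValueError("carbon consumed must be positive")
    return 100.0 * xylitol_mol * carbons_per_xylitol / carbon_consumed_cmol


def limits_satisfied(percent):
    """Return the listed limits that the measured percentage stays below."""
    return [limit for limit in LIMITS_PERCENT if percent < limit]


# Hypothetical run: 1.0 mol xylose consumed (5 C-mol), 0.004 mol xylitol formed.
pct = xylitol_carbon_percent(xylitol_mol=0.004, carbon_consumed_cmol=5.0)
print(round(pct, 3), limits_satisfied(pct))
```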
[0240] Adaptive evolution is also described e.g. in Wisselink H. W.
et al., Applied and Environmental Microbiology, August 2007, p.
4881-4891.
[0241] In one embodiment of adaptive evolution a regimen consisting
of repeated batch cultivation with repeated cycles of consecutive
growth in different media is applied, e.g. three media with
different compositions (glucose, xylose, and arabinose; xylose and
arabinose). See Wisselink et al. (2009) Applied and Environmental
Microbiology, February 2009, p. 907-914.
[0242] The Host Cell
[0243] The host cell may be any host cell suitable for production
of a useful product. A cell of the invention may be any suitable
cell, such as a prokaryotic cell, such as a bacterium, or a
eukaryotic cell. Typically, the cell will be a eukaryotic cell, for
example a yeast or a filamentous fungus.
[0244] Yeasts are herein defined as eukaryotic microorganisms and
include all species of the subdivision Eumycotina (Alexopoulos, C.
J., 1962, In: Introductory Mycology, John Wiley & Sons, Inc.,
New York) that predominantly grow in unicellular form.
[0245] Yeasts may either grow by budding of a unicellular thallus
or may grow by fission of the organism. A preferred yeast as a cell
of the invention may belong to the genera Saccharomyces,
Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula,
Kloeckera, Schwanniomyces or Yarrowia. Preferably the yeast is one
capable of anaerobic fermentation, more preferably one capable of
anaerobic alcoholic fermentation.
[0246] Filamentous fungi are herein defined as eukaryotic
microorganisms that include all filamentous forms of the
subdivision Eumycotina. These fungi are characterized by a
vegetative mycelium composed of chitin, cellulose, and other
complex polysaccharides. The filamentous fungi suitable for
use as a cell of the present invention are morphologically,
physiologically, and genetically distinct from yeasts. Filamentous
fungal cells may be advantageously used since most fungi do not
require sterile conditions for propagation and are insensitive to
bacteriophage infections. Vegetative growth by filamentous fungi is
by hyphal elongation and carbon catabolism of most filamentous
fungi is obligately aerobic. Preferred filamentous fungi as a host
cell of the invention may belong to the genus Aspergillus,
Trichoderma, Humicola, Acremonium, Fusarium or Penicillium. More
preferably, the filamentous fungal cell may be an Aspergillus niger,
Aspergillus oryzae, a Penicillium chrysogenum, or Rhizopus oryzae
cell.
[0247] In one embodiment the host cell may be yeast.
[0248] Preferably the host is an industrial host, more preferably
an industrial yeast. An industrial host and industrial yeast cell
may be defined as follows. The living environments of yeast cells
in industrial processes are significantly different from those in
the laboratory. Industrial yeast cells must be able to perform well
under multiple environmental conditions which may vary during the
process. Such variations include change in nutrient sources, pH,
ethanol concentration, temperature, oxygen concentration, etc.,
which together have potential impact on the cellular growth and
ethanol production of Saccharomyces cerevisiae. Under adverse
industrial conditions, the environmental tolerant strains should
allow robust growth and production. Industrial yeast strains are
generally more robust towards these changes in environmental
conditions which may occur in the applications in which they are
used, such as in the baking industry, brewing industry, wine making
and the ethanol industry. Examples of industrial yeast (S.
cerevisiae) are Ethanol Red® (Fermentis), Fermiol® (DSM) and
Thermosacc® (Lallemand).
[0249] In an embodiment the host is inhibitor tolerant. Inhibitor
tolerant host cells may be selected by screening strains for growth
on inhibitor-containing materials, such as illustrated in Kadar et
al, Appl. Biochem. Biotechnol. (2007), Vol. 136-140, 847-858,
wherein an inhibitor tolerant S. cerevisiae strain ATCC 26602 was
selected.
[0250] Preferably the host cell is industrial and inhibitor
tolerant.
[0251] araA, araB and araD Genes
[0252] A cell of the invention is capable of using arabinose. A
cell of the invention is, therefore, capable of converting
L-arabinose into L-ribulose and/or xylulose 5-phosphate and/or into
a desired fermentation product, for example one of those mentioned
herein.
[0253] Organisms, for example S. cerevisiae strains, able to
produce ethanol from L-arabinose may be produced by modifying a
cell introducing the araA (L-arabinose isomerase), araB
(L-ribulokinase) and araD (L-ribulose-5-P 4-epimerase) genes from a
suitable source. Such genes may be introduced into a cell of the
invention in order that it is capable of using arabinose. Such an
approach is described in WO2003/095627. araA, araB and
araD genes from Lactobacillus plantarum may be used and are
disclosed in WO2008/041840. The araA gene from Bacillus subtilis
and the araB and araD genes from Escherichia coli may be used and
are disclosed in EP1499708.
[0254] PPP-Genes
[0255] A cell of the invention may comprise one or more genetic
modifications that increase the flux of the pentose phosphate
pathway. In particular, the genetic modification(s) may lead to an
increased flux through the non-oxidative part of the pentose
phosphate pathway. A genetic modification that causes an increased flux of
the non-oxidative part of the pentose phosphate pathway is herein
understood to mean a modification that increases the flux by at
least a factor of about 1.1, about 1.2, about 1.5, about 2, about
5, about 10 or about 20 as compared to the flux in a strain which
is genetically identical except for the genetic modification
causing the increased flux. The flux of the non-oxidative part of
the pentose phosphate pathway may be measured by growing the
modified host on xylose as sole carbon source, determining the
specific xylose consumption rate and subtracting the specific
xylitol production rate from the specific xylose consumption rate,
if any xylitol is produced. However, the flux of the non-oxidative
part of the pentose phosphate pathway is proportional to the
growth rate on xylose as sole carbon source, preferably to the
anaerobic growth rate on xylose as sole carbon source. There is a
linear relation between the growth rate on xylose as sole carbon
source (μ_max) and the flux of the non-oxidative part of the
pentose phosphate pathway. The specific xylose consumption rate
(Q_s) is equal to the growth rate (μ) divided by the yield
of biomass on sugar (Y_xs), i.e. Q_s = μ/Y_xs, and the yield of
biomass on sugar is constant under a given set of conditions
(anaerobic, growth medium, pH, genetic background of the strain,
etc.). Therefore the increased flux of the
non-oxidative part of the pentose phosphate pathway may be deduced
from the increase in maximum growth rate under these conditions
unless transport (uptake) is limiting.
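The flux estimate described in this paragraph can be written out as a short calculation. This is a sketch under stated assumptions: the function names, the units (growth rate per hour, yield in g biomass per g xylose) and the example values are hypothetical, and the fold increase is derived from growth rates only, which holds when the biomass yield is constant and uptake is not limiting.

```python
def specific_consumption_rate(mu, y_xs):
    """Q_s = mu / Y_xs, with mu the growth rate and Y_xs the biomass yield on sugar."""
    if y_xs <= 0:
        raise ValueError("biomass yield must be positive")
    return mu / y_xs


def non_oxidative_ppp_flux(q_xylose, q_xylitol=0.0):
    """Specific xylose consumption rate minus specific xylitol production rate."""
    return q_xylose - q_xylitol


def flux_fold_increase(mu_modified, mu_reference):
    """Fold increase in flux deduced from maximum growth rates.

    Assumption: Y_xs is the same for both strains and uptake is not limiting.
    """
    if mu_reference <= 0:
        raise ValueError("reference growth rate must be positive")
    return mu_modified / mu_reference


# Hypothetical values chosen for easy arithmetic.
q_s = specific_consumption_rate(mu=0.125, y_xs=0.25)        # 0.5
flux = non_oxidative_ppp_flux(q_xylose=q_s, q_xylitol=0.125)  # 0.375
fold = flux_fold_increase(mu_modified=0.25, mu_reference=0.125)
print(q_s, flux, fold)
```

The resulting fold increase can then be compared with the factors of about 1.1 to about 20 listed earlier in the paragraph.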
[0256] One or more genetic modifications that increase the flux of
the pentose phosphate pathway may be introduced in the host cell in
various ways. These include e.g. achieving higher steady state
activity levels of xylulose kinase and/or one or more of the
enzymes of the non-oxidative part of the pentose phosphate pathway and/or
a reduced steady state level of unspecific aldose reductase
activity. These changes in steady state activity levels may be
effected by selection of mutants (spontaneous or induced by
chemicals or radiation) and/or by recombinant DNA technology e.g.
by overexpression or inactivation, respectively, of genes encoding
the enzymes or factors regulating these genes.
[0257] In a preferred host cell, the genetic modification comprises
overexpression of at least one enzyme of the (non-oxidative part)
pentose phosphate pathway. Preferably the enzyme is selected from
the group consisting of the enzymes ribulose-5-phosphate
isomerase, ribulose-5-phosphate epimerase, transketolase
and transaldolase. Various combinations of enzymes of the
(non-oxidative part) pentose phosphate pathway may be
overexpressed. E.g. the enzymes that are overexpressed may be at
least the enzymes ribulose-5-phosphate isomerase and
ribulose-5-phosphate epimerase; or at least the enzymes
ribulose-5-phosphate isomerase and transketolase; or at least the
enzymes ribulose-5-phosphate isomerase and transaldolase; or at
least the enzymes ribulose-5-phosphate epimerase and transketolase;
or at least the enzymes ribulose-5-phosphate epimerase and
transaldolase; or at least the enzymes transketolase and
transaldolase; or at least the enzymes ribulose-5-phosphate
epimerase, transketolase and transaldolase; or at least the enzymes
ribulose-5-phosphate isomerase, transketolase and transaldolase; or
at least the enzymes ribulose-5-phosphate isomerase,
ribulose-5-phosphate epimerase, and transaldolase; or at least the
enzymes ribulose-5-phosphate isomerase, ribulose-5-phosphate
epimerase, and transketolase. In one embodiment of the invention
each of the enzymes ribulose-5-phosphate isomerase,
ribulose-5-phosphate epimerase, transketolase and transaldolase is
overexpressed in the host cell. More preferred is a host cell in
which the genetic modification comprises at least overexpression of
both the enzymes transketolase and transaldolase as such a host
cell is already capable of anaerobic growth on xylose. In fact,
under some conditions host cells overexpressing only the
transketolase and the transaldolase already have the same anaerobic
growth rate on xylose as do host cells that overexpress all four of
the enzymes, i.e. the ribulose-5-phosphate isomerase,
ribulose-5-phosphate epimerase, transketolase and transaldolase.
Moreover, host cells overexpressing both of the enzymes
ribulose-5-phosphate isomerase and ribulose-5-phosphate epimerase
are preferred over host cells overexpressing only the isomerase or
only the epimerase as overexpression of only one of these enzymes
may produce metabolic imbalances.
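The combinations listed in this paragraph are all subsets of two, three or four of the named enzymes. A minimal sketch that enumerates them is given below; the variable and function names are illustrative only and do not appear in the application.

```python
from itertools import combinations

# The four enzymes of the non-oxidative part of the pentose phosphate pathway
# named in the text.
PPP_ENZYMES = (
    "ribulose-5-phosphate isomerase",
    "ribulose-5-phosphate epimerase",
    "transketolase",
    "transaldolase",
)


def overexpression_sets(min_size=2):
    """Enumerate every set of at least `min_size` of the four enzymes."""
    return [
        combo
        for size in range(min_size, len(PPP_ENZYMES) + 1)
        for combo in combinations(PPP_ENZYMES, size)
    ]


sets_of_enzymes = overexpression_sets()
for combo in sets_of_enzymes:
    print(" + ".join(combo))
```

The enumeration yields six pairs, four triples and the set of all four enzymes, which matches the combinations written out in the paragraph.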
[0258] The enzyme "ribulose 5-phosphate epimerase" (EC 5.1.3.1) is
herein defined as an enzyme that catalyses the epimerisation of
D-xylulose 5-phosphate into D-ribulose 5-phosphate and vice versa.
The enzyme is also known as phosphoribulose epimerase;
erythrose-4-phosphate isomerase; phosphoketopentose 3-epimerase;
xylulose phosphate 3-epimerase; phosphoketopentose epimerase;
ribulose 5-phosphate 3-epimerase; D-ribulose
phosphate-3-epimerase; D-ribulose 5-phosphate epimerase;
D-ribulose-5-P 3-epimerase; D-xylulose-5-phosphate 3-epimerase;
pentose-5-phosphate 3-epimerase; or D-ribulose-5-phosphate
3-epimerase. A ribulose 5-phosphate epimerase may be further
defined by its amino acid sequence. Likewise a ribulose 5-phosphate
epimerase may be defined by a nucleotide sequence encoding the
enzyme as well as by a nucleotide sequence hybridising to a
reference nucleotide sequence encoding a ribulose 5-phosphate
epimerase. The nucleotide sequence encoding for ribulose
5-phosphate epimerase is herein designated RPE1.
[0259] The enzyme "ribulose 5-phosphate isomerase" (EC 5.3.1.6) is
herein defined as an enzyme that catalyses direct isomerisation of
D-ribose 5-phosphate into D-ribulose 5-phosphate and vice versa.
The enzyme is also known as phosphopentosisomerase;
phosphoriboisomerase; ribose phosphate isomerase; 5-phosphoribose
isomerase; D-ribose 5-phosphate isomerase; D-ribose-5-phosphate
ketol-isomerase; or D-ribose-5-phosphate aldose-ketose-isomerase. A
ribulose 5-phosphate isomerase may be further defined by its amino
acid sequence. Likewise a ribulose 5-phosphate isomerase may be
defined by a nucleotide sequence encoding the enzyme as well as by
a nucleotide sequence hybridising to a reference nucleotide
sequence encoding a ribulose 5-phosphate isomerase. The nucleotide
sequence encoding for ribulose 5-phosphate isomerase is herein
designated RPI1.
[0260] The enzyme "transketolase" (EC 2.2.1.1) is herein defined as
an enzyme that catalyses the reaction: D-ribose
5-phosphate + D-xylulose 5-phosphate <-> sedoheptulose
7-phosphate + D-glyceraldehyde 3-phosphate and vice versa. The
enzyme is also known as glycolaldehydetransferase or
sedoheptulose-7-phosphate:D-glyceraldehyde-3-phosphate
glycolaldehydetransferase. A transketolase may be further defined
by its amino acid sequence. Likewise a transketolase may be defined by a
nucleotide sequence encoding the enzyme as well as by a nucleotide
sequence hybridising to a reference nucleotide sequence encoding a
transketolase. The nucleotide sequence encoding for transketolase
is herein designated TKL1.
[0261] The enzyme "transaldolase" (EC 2.2.1.2) is herein defined as
an enzyme that catalyses the reaction: sedoheptulose
7-phosphate + D-glyceraldehyde 3-phosphate <-> D-erythrose
4-phosphate + D-fructose 6-phosphate and vice versa. The enzyme is
also known as dihydroxyacetonetransferase; dihydroxyacetone
synthase; formaldehyde transketolase; or
sedoheptulose-7-phosphate:D-glyceraldehyde-3-phosphate
glyceronetransferase. A transaldolase
may be further defined by its amino acid sequence. Likewise a
transaldolase may be defined by a nucleotide sequence encoding the
enzyme as well as by a nucleotide sequence hybridising to a
reference nucleotide sequence encoding a transaldolase. The
nucleotide sequence encoding for transaldolase is herein
designated TAL1.
Xylose Isomerase Gene
[0262] The presence of the nucleotide sequence encoding a xylose
isomerase confers on the cell the ability to isomerise xylose to
xylulose. According to the invention, two to fifteen copies of one
or more xylose isomerase genes are introduced into the host
cell.
[0263] In one embodiment, two to fifteen copies of one or more
xylose isomerase genes are introduced into the host cell.
[0264] A "xylose isomerase" (EC 5.3.1.5) is herein defined as an
enzyme that catalyses the direct isomerisation of D-xylose into
D-xylulose and/or vice versa. The enzyme is also known as a
D-xylose ketoisomerase. A xylose isomerase herein may also be
capable of catalysing the conversion between D-glucose and
D-fructose (and accordingly may therefore be referred to as a
glucose isomerase). A xylose isomerase herein may require a
bivalent cation, such as magnesium, manganese or cobalt as a
cofactor.
[0265] Accordingly, a cell of the invention is capable of
isomerising xylose to xylulose. The ability of isomerising xylose
to xylulose is conferred on the host cell by transformation of the
host cell with a nucleic acid construct comprising a nucleotide
sequence encoding a defined xylose isomerase. A cell of the
invention isomerises xylose into xylulose by the direct
isomerisation of xylose to xylulose. This is understood to mean
that xylose is isomerised into xylulose in a single reaction
catalysed by a xylose isomerase, as opposed to two step conversion
of xylose into xylulose via a xylitol intermediate as catalysed by
xylose reductase and xylitol dehydrogenase, respectively.
[0266] A unit (U) of xylose isomerase activity may herein be
defined as the amount of enzyme producing 1 nmol of xylulose per
minute, under conditions as described by Kuyper et al. (2003, FEMS
Yeast Res. 4: 69-78). The xylose isomerase gene may have various
origins, such as for example Piromyces sp. as disclosed in
WO2006/009434. Other suitable origins are Bacteroides, in
particular Bacteroides uniformis as described in PCT/EP2009/52623,
Bacillus, in particular Bacillus stearothermophilus as described in
PCT/EP2009/052625, Thermotoga, in particular Thermotoga maritima,
as described in PCT/EP2009/052621 and Clostridium, in particular
Clostridium cellulolyticum as described in PCT/EP2009/052620.
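The unit definition given at the start of this paragraph (1 U = 1 nmol of xylulose produced per minute) can be applied as in the sketch below. The function names and the example readings are hypothetical, and normalising to milligrams of protein is an assumption made for illustration; the assay conditions of Kuyper et al. are not modelled.

```python
def xylose_isomerase_units(xylulose_nmol, minutes):
    """Activity in U, where 1 U produces 1 nmol of xylulose per minute."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return xylulose_nmol / minutes


def specific_activity(units, protein_mg):
    """Activity per mg of protein (normalisation assumed for illustration)."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return units / protein_mg


# Hypothetical assay: 600 nmol xylulose formed in 10 minutes by 0.5 mg protein.
units = xylose_isomerase_units(xylulose_nmol=600.0, minutes=10.0)
per_mg = specific_activity(units, protein_mg=0.5)
print(units, per_mg)
```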
[0267] XKS1 Gene
[0268] A cell of the invention may comprise one or more genetic
modifications that increase the specific xylulose kinase activity.
Preferably the genetic modification or modifications cause
overexpression of a xylulose kinase, e.g. by overexpression of a
nucleotide sequence encoding a xylulose kinase. The gene encoding
the xylulose kinase may be endogenous to the host cell or may be a
xylulose kinase that is heterologous to the host cell. A nucleotide
sequence used for overexpression of xylulose kinase in the host
cell of the invention is a nucleotide sequence encoding a
polypeptide with xylulose kinase activity.
[0269] The enzyme "xylulose kinase" (EC 2.7.1.17) is herein defined
as an enzyme that catalyses the reaction
ATP+D-xylulose=ADP+D-xylulose 5-phosphate. The enzyme is also known
as a phosphorylating xylulokinase, D-xylulokinase or
ATP:D-xylulose 5-phosphotransferase. A xylulose kinase of the
invention may be further defined by its amino acid sequence.
Likewise a xylulose kinase may be defined by a nucleotide sequence
encoding the enzyme as well as by a nucleotide sequence hybridising
to a reference nucleotide sequence encoding a xylulose kinase.
[0270] In a cell of the invention, a genetic modification or
modifications that increase(s) the specific xylulose kinase
activity may be combined with any of the modifications increasing
the flux of the pentose phosphate pathway as described above. This
is not, however, essential.
[0271] Thus, a host cell of the invention may comprise only a
genetic modification or modifications that increase the specific
xylulose kinase activity. The various means available in the art
for achieving and analysing overexpression of a xylulose kinase in
the host cells of the invention are the same as described above for
enzymes of the pentose phosphate pathway. Preferably in the host
cells of the invention, a xylulose kinase to be overexpressed is
overexpressed by at least a factor of about 1.1, about 1.2, about
1.5, about 2, about 5, about 10 or about 20 as compared to a strain
which is genetically identical except for the genetic
modification(s) causing the overexpression. It is to be understood
that these levels of overexpression may apply to the steady state
level of the enzyme's activity, the steady state level of the
enzyme's protein as well as to the steady state level of the
transcript coding for the enzyme.
[0272] Aldose Reductase (GRE3) Gene Deletion
[0273] A cell of the invention may comprise one or more genetic
modifications that reduce unspecific aldose reductase activity in
the host cell. Preferably, unspecific aldose reductase activity is
reduced in the host cell by one or more genetic modifications that
reduce the expression of or inactivate a gene encoding an
unspecific aldose reductase. Preferably, the genetic
modification(s) reduce or inactivate the expression of each
endogenous copy of a gene encoding an unspecific aldose reductase
in the host cell (herein called GRE3 deletion). Host cells may
comprise multiple copies of genes encoding unspecific aldose
reductases as a result of di-, poly- or aneu-ploidy, and/or the
host cell may contain several different (iso)enzymes with aldose
reductase activity that differ in amino acid sequence and that are
each encoded by a different gene. Also in such instances preferably
the expression of each gene that encodes an unspecific aldose
reductase is reduced or inactivated. Preferably, the gene is
inactivated by deletion of at least part of the gene or by
disruption of the gene, whereby in this context the term gene also
includes any non-coding sequence up- or down-stream of the coding
sequence, the (partial) deletion or inactivation of which results
in a reduction of expression of unspecific aldose reductase
activity in the host cell.
[0274] A nucleotide sequence encoding an aldose reductase whose
activity is to be reduced in the host cell of the invention is a
nucleotide sequence encoding a polypeptide with aldose reductase
activity.
[0275] Thus, a host cell of the invention comprising only a genetic
modification or modifications that reduce(s) unspecific aldose
reductase activity in the host cell is specifically included in the
invention.
[0276] The enzyme "aldose reductase" (EC 1.1.1.21) is herein
defined as any enzyme that is capable of reducing xylose or
xylulose to xylitol. In the context of the present invention an
aldose reductase may be any unspecific aldose reductase that is
native (endogenous) to a host cell of the invention and that is
capable of reducing xylose or xylulose to xylitol. Unspecific
aldose reductases catalyse the reaction:
aldose + NAD(P)H + H⁺ <-> alditol + NAD(P)⁺
[0277] The enzyme has a wide specificity and is also known as
aldose reductase; polyol dehydrogenase (NADP⁺); alditol:NADP
oxidoreductase; alditol:NADP⁺ 1-oxidoreductase;
NADPH-aldopentose reductase; or NADPH-aldose reductase.
[0278] A particular example of such an unspecific aldose reductase
is the one that is endogenous to S. cerevisiae and is encoded by the
GRE3 gene (Traff et al., 2001, Appl. Environ. Microbiol. 67: 5668-74).
Thus, an aldose reductase of the invention may be further defined
by its amino acid sequence. Likewise an aldose reductase may be
defined by the nucleotide sequences encoding the enzyme as well as
by a nucleotide sequence hybridising to a reference nucleotide
sequence encoding an aldose reductase.
[0279] Bioproducts Production
[0280] Over the years suggestions have been made for the
introduction of various organisms for the production of bio-ethanol
from crop sugars. In practice, however, all major bio-ethanol
production processes have continued to use the yeasts of the genus
Saccharomyces as ethanol producer. This is due to the many
attractive features of Saccharomyces species for industrial
processes, i.e., a high acid-, ethanol- and osmo-tolerance,
capability of anaerobic growth, and of course its high alcoholic
fermentative capacity. Preferred yeast species as host cells
include S. cerevisiae, S. bulderi, S. barnetti, S. exiguus, S.
uvarum, S. diastaticus, K. lactis, K. marxianus or K. fragilis.
[0281] A cell of the invention may be able to convert plant
biomass, celluloses, hemicelluloses, pectins, rhamnose, galactose,
fructose, maltose, maltodextrins, ribose, ribulose, or starch,
starch derivatives, sucrose, lactose and glycerol, for example into
fermentable sugars. Accordingly, a cell of the invention may
express one or more enzymes such as a cellulase (an endocellulase
or an exocellulase), a hemicellulase (an endo- or exo-xylanase or
arabinase) necessary for the conversion of cellulose into glucose
monomers and hemicellulose into xylose and arabinose monomers, a
pectinase able to convert pectins into glucuronic acid and
galacturonic acid or an amylase to convert starch into glucose
monomers.
[0282] The cell further preferably comprises those enzymatic
activities required for conversion of pyruvate to a desired
fermentation product, such as ethanol, butanol, lactic acid,
3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid,
citric acid, fumaric acid, malic acid, itaconic acid, an amino
acid, 1,3-propane-diol, ethylene, glycerol, a β-lactam
antibiotic or a cephalosporin.
[0283] A preferred cell of the invention is a cell that is
naturally capable of alcoholic fermentation, preferably, anaerobic
alcoholic fermentation. A cell of the invention preferably has a
high tolerance to ethanol, a high tolerance to low pH (i.e. capable
of growth at a pH lower than about 5, about 4, about 3, or about
2.5) and towards organic acids like lactic acid, acetic acid or
formic acid and/or sugar degradation products such as furfural and
hydroxy-methylfurfural and/or a high tolerance to elevated
temperatures.
[0284] Any of the above characteristics or activities of a cell of
the invention may be naturally present in the cell or may be
introduced or modified by genetic modification.
[0285] A cell of the invention may be a cell suitable for the
production of ethanol. A cell of the invention may, however, be
suitable for the production of fermentation products other than
ethanol. Such non-ethanolic fermentation products include in
principle any bulk or fine chemical that is producible by a
eukaryotic microorganism such as a yeast or a filamentous
fungus.
[0286] Such fermentation products may be, for example, butanol,
lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid,
succinic acid, citric acid, malic acid, fumaric acid, itaconic
acid, an amino acid, 1,3-propane-diol, ethylene, glycerol, a
β-lactam antibiotic or a cephalosporin. A preferred cell of
the invention for production of non-ethanolic fermentation products
is a host cell that contains a genetic modification that results in
decreased alcohol dehydrogenase activity.
[0287] In a further aspect the invention relates to fermentation
processes in which the cells of the invention are used for the
fermentation of a carbon source comprising a source of xylose, such
as xylose. In addition to a source of xylose the carbon source in
the fermentation medium may also comprise a source of glucose. The
source of xylose or glucose may be xylose or glucose as such or may
be any carbohydrate oligo- or polymer comprising xylose or glucose
units, such as e.g. lignocellulose, xylans, cellulose, starch and
the like. For release of xylose or glucose units from such
carbohydrates, appropriate carbohydrases (such as xylanases,
glucanases, amylases and the like) may be added to the fermentation
medium or may be produced by the cell. In the latter case the cell
may be genetically engineered to produce and excrete such
carbohydrases. An additional advantage of using oligo- or polymeric
sources of glucose is that it makes it possible to maintain a low(er)
concentration of free glucose during the fermentation, e.g. by
using rate-limiting amounts of the carbohydrases. This, in turn,
will prevent repression of systems required for metabolism and
transport of non-glucose sugars such as xylose.
[0288] In a preferred process the cell ferments both the xylose and
glucose, preferably simultaneously in which case preferably a cell
is used which is insensitive to glucose repression to prevent
diauxic growth. In addition to a source of xylose (and glucose) as
carbon source, the fermentation medium will further comprise the
appropriate ingredients required for growth of the cell.
Compositions of fermentation media for growth of microorganisms
such as yeasts are well known in the art. The fermentation process
is a process for the production of a fermentation product such as
e.g. ethanol, butanol, lactic acid, 3-hydroxy-propionic acid,
acrylic acid, acetic acid, succinic acid, citric acid, malic acid,
fumaric acid, itaconic acid, an amino acid, 1,3-propane-diol,
ethylene, glycerol, a β-lactam antibiotic, such as Penicillin
G or Penicillin V and fermentative derivatives thereof, and a
cephalosporin.
[0289] Bioproducts Production
[0290] Over the years suggestions have been made for the
introduction of various organisms for the production of bio-ethanol
from crop sugars. In practice, however, all major bio-ethanol
production processes have continued to use the yeasts of the genus
Saccharomyces as ethanol producer. This is due to the many
attractive features of Saccharomyces species for industrial
processes, i.e., a high acid-, ethanol- and osmo-tolerance,
capability of anaerobic growth, and of course its high alcoholic
fermentative capacity. Preferred yeast species as host cells
include S. cerevisiae, S. bulderi, S. barnetti, S. exiguus, S.
uvarum, S. diastaticus, K. lactis, K. marxianus or K. fragilis.
[0291] A mixed sugar cell may be a cell suitable for the production
of ethanol. A mixed sugar cell may, however, be suitable for the
production of fermentation products other than ethanol. Such
non-ethanolic fermentation products include in principle any bulk
or fine chemical that is producible by a eukaryotic microorganism
such as a yeast or a filamentous fungus.
[0292] A mixed sugar cell that may be used for production of
non-ethanolic fermentation products is a host cell that contains a
genetic modification that results in decreased alcohol
dehydrogenase activity.
[0293] In an embodiment the mixed sugar cell may be used in a
process wherein sugars originating from lignocellulose are
converted into ethanol.
[0294] Lignocellulose
[0295] Lignocellulose, which may be considered as a potential
renewable feedstock, generally comprises the polysaccharides
cellulose (glucans) and hemicelluloses (xylans, heteroxylans and
xyloglucans). In addition, some hemicellulose may be present as
glucomannans, for example in wood-derived feedstocks. The enzymatic
hydrolysis of these polysaccharides to soluble sugars, including
both monomers and multimers, for example glucose, cellobiose,
xylose, arabinose, galactose, fructose, mannose, rhamnose, ribose,
galacturonic acid, glucuronic acid and other hexoses and pentoses
occurs under the action of different enzymes acting in concert.
[0296] In addition, pectins and other pectic substances such as
arabinans may make up a considerable proportion of the dry mass of
typical cell walls from non-woody plant tissues (about a quarter
to half of dry mass may be pectins).
[0297] Pretreatment
[0298] Before enzymatic treatment, the lignocellulosic material may
be pretreated. The pretreatment may comprise exposing the
lignocellulosic material to an acid, a base, a solvent, heat, a
peroxide, ozone, mechanical shredding, grinding, milling or rapid
depressurization, or a combination of any two or more thereof. This
chemical pretreatment is often combined with heat-pretreatment,
e.g. at 150-220 °C for 1 to 30 minutes.
[0299] Enzymatic Hydrolysis
[0300] The pretreated material is commonly subjected to enzymatic
hydrolysis to release sugars that may be fermented according to the
invention. This may be executed with conventional methods, e.g.
contacting with cellulases, for instance cellobiohydrolase(s),
endoglucanase(s), beta-glucosidase(s) and optionally other enzymes.
The conversion with the cellulases may be executed at ambient
temperatures or at higher temperatures, at a reaction time to release
sufficient amounts of sugar(s). The result of the enzymatic
hydrolysis is a hydrolysis product comprising C5/C6 sugars, herein
designated as the sugar composition.
[0301] Fermentation
[0302] The fermentation process may be an aerobic or an anaerobic
fermentation process. An anaerobic fermentation process is herein
defined as a fermentation process run in the absence of oxygen or
in which substantially no oxygen is consumed, preferably less than
about 5, about 2.5 or about 1 mmol/L/h, more preferably 0 mmol/L/h
is consumed (i.e. oxygen consumption is not detectable), and
wherein organic molecules serve as both electron donor and electron
acceptors. In the absence of oxygen, NADH produced in glycolysis
and biomass formation cannot be oxidised by oxidative
phosphorylation. To solve this problem many microorganisms use
pyruvate or one of its derivatives as an electron and hydrogen
acceptor thereby regenerating NAD+.
[0303] Thus, in a preferred anaerobic fermentation process pyruvate
is used as an electron (and hydrogen) acceptor and is reduced to
fermentation products such as ethanol, butanol, lactic acid,
3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid,
citric acid, malic acid, fumaric acid, an amino acid,
1,3-propane-diol, ethylene, glycerol, a β-lactam antibiotic
and a cephalosporin.
[0304] The fermentation process is preferably run at a temperature
that is optimal for the cell. Thus, for most yeasts or fungal host
cells, the fermentation process is performed at a temperature which
is less than about 42 °C, preferably less than about
38 °C. For yeast or filamentous fungal host cells, the
fermentation process is preferably performed at a temperature which
is lower than about 35, about 33, about 30 or about 28 °C
and at a temperature which is higher than about 20, about 22, or
about 25 °C.
[0305] The ethanol yield on xylose and/or glucose in the process
preferably is at least about 50, about 60, about 70, about 80,
about 90, about 95 or about 98%. The ethanol yield is herein
defined as a percentage of the theoretical maximum yield.
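As a worked illustration of the yield definition above, the following sketch expresses a measured ethanol amount as a percentage of the theoretical maximum. It assumes the usual stoichiometric maxima (2 mol ethanol per mol glucose and 5/3 mol ethanol per mol xylose, both about 0.511 g/g); the function names and the example figures are hypothetical and are not data from this application.

```python
# Molar masses in g/mol (standard values).
M_ETHANOL = 46.07
MOLAR_MASS = {"glucose": 180.16, "xylose": 150.13}

# Assumed stoichiometric maxima, mol ethanol per mol sugar consumed.
MAX_MOL_ETHANOL_PER_MOL = {"glucose": 2.0, "xylose": 5.0 / 3.0}


def theoretical_max_g_per_g(sugar: str) -> float:
    """Theoretical maximum ethanol yield in g ethanol per g sugar."""
    return MAX_MOL_ETHANOL_PER_MOL[sugar] * M_ETHANOL / MOLAR_MASS[sugar]


def ethanol_yield_percent(ethanol_g: float, consumed_g: dict) -> float:
    """Ethanol yield as a percentage of the theoretical maximum.

    consumed_g maps a sugar name to the grams of that sugar consumed.
    """
    max_ethanol_g = sum(
        theoretical_max_g_per_g(sugar) * grams for sugar, grams in consumed_g.items()
    )
    return 100.0 * ethanol_g / max_ethanol_g


# Hypothetical example: 46 g ethanol from 50 g glucose plus 50 g xylose.
example_yield = ethanol_yield_percent(46.0, {"glucose": 50.0, "xylose": 50.0})
print(f"yield: {example_yield:.1f}% of theoretical maximum")
```

Under these assumptions a process producing 46 g ethanol from 100 g of mixed sugar would fall in the "at least about 80%" class of the paragraph above.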
[0306] The invention also relates to a process for producing a
fermentation product.
[0307] The fermentation processes may be carried out in batch,
fed-batch or continuous mode. A separate hydrolysis and
fermentation (SHF) process or a simultaneous saccharification and
fermentation (SSF) process may also be applied. A combination of
these fermentation process modes may also be possible for optimal
productivity.
[0308] The fermentation process according to the present invention
may be run under aerobic or anaerobic conditions. Preferably, the
process is carried out under micro-aerophilic or oxygen-limited
conditions.
[0309] An anaerobic fermentation process is herein defined as a
fermentation process run in the absence of oxygen or in which
substantially no oxygen is consumed, preferably less than about 5,
about 2.5 or about 1 mmol/L/h, and wherein organic molecules serve
as both electron donor and electron acceptors.
[0310] An oxygen-limited fermentation process is a process in which
the oxygen consumption is limited by the oxygen transfer from the
gas to the liquid. The degree of oxygen limitation is determined by
the amount and composition of the ingoing gasflow as well as the
actual mixing/mass transfer properties of the fermentation
equipment used. Preferably, in a process under oxygen-limited
conditions, the rate of oxygen consumption is at least about 5.5,
more preferably at least about 6, such as at least 7 mmol/L/h. A
process of the invention comprises recovery of the fermentation
product.
[0311] Fermentation Product
[0312] The fermentation product of the invention may be any useful
product. In one embodiment, it is a product selected from the group
consisting of ethanol, n-butanol, isobutanol, lactic acid,
3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid,
fumaric acid, malic acid, itaconic acid, maleic acid, citric acid,
adipic acid, an amino acid, such as lysine, methionine, tryptophan,
threonine, and aspartic acid, 1,3-propane-diol, ethylene, glycerol,
a β-lactam antibiotic and a cephalosporin, vitamins,
pharmaceuticals, animal feed supplements, specialty chemicals,
chemical feedstocks, plastics, solvents, fuels, including biofuels
and biogas or organic polymers, and an industrial enzyme, such as a
protease, a cellulase, an amylase, a glucanase, a lactase, a
lipase, a lyase, an oxidoreductase, a transferase or a xylanase.
For example the fermentation products may be produced by cells
according to the invention, additionally following prior art cell
preparation methods and fermentation processes; these examples
should not, however, be construed herein as limiting. For example,
n-butanol may be produced by cells as described in WO2008121701 or
WO2008086124; lactic acid as described in US2011053231 or
US2010137551; 3-hydroxy-propionic acid as described in
WO2010010291; acrylic acid as described in WO2009153047. An
overview of all kinds of fermentation products and how they can
be prepared in yeast is given in Romanos, M. A., et al, "Foreign
Gene Expression in Yeast: a Review", Yeast vol. 8: 423-488 (1992),
see e.g. table 7. The production of glycerol, 1,3-propane-diol,
organic acids, and vitamin C (table 2) is described in Nevoigt,
E., Microbiol. Mol. Biol. Rev. 72(3) 379-412 (2008). Gidijala, L.,
et al, BMC Biotechnology 8(29) (2008) describes production of
beta-lactams in yeast.
[0313] Recovery of the Fermentation Product
[0314] For the recovery of the fermentation product existing
technologies are used. For different fermentation products
different recovery processes are appropriate. Existing methods of
recovering ethanol from aqueous mixtures commonly use fractionation
and adsorption techniques. For example, a beer still can be used to
process a fermented product, which contains ethanol in an aqueous
mixture, to produce an enriched ethanol-containing mixture that is
then subjected to fractionation (e.g., fractional distillation or
other like techniques). Next, the fractions containing the highest
concentrations of ethanol can be passed through an adsorber to
remove most, if not all, of the remaining water from the
ethanol.
[0315] The following examples illustrate the invention:
EXAMPLES
[0316] Unless indicated otherwise, the methods described herein
are standard biochemical techniques. Examples of suitable general
methodology textbooks include Sambrook et al., Molecular Cloning, a
Laboratory Manual (1989) and Ausubel et al., Current Protocols in
Molecular Biology (1995), John Wiley & Sons, Inc.
[0317] Medium Composition
[0318] Growth experiments: Saccharomyces cerevisiae strains are
grown on medium having the following composition: 0.67% (w/v) yeast
nitrogen base or synthetic medium (Verduyn et al., Yeast 8:501-517,
1992) and glucose, arabinose, galactose or xylose, or a combination
of these substrates, at varying concentrations (see examples for
specific details; concentrations in % weight over volume (w/v)).
For agar plates the medium is supplemented with 2% (w/v)
bacteriological agar.
[0319] Ethanol Production
[0320] Pre-cultures were prepared by inoculating 25 ml
Verduyn-medium (Verduyn et al., Yeast 8:501-517, 1992)
supplemented with 2% glucose in a 100 ml shake flask with a frozen
stock culture or a single colony from an agar plate. After
incubation at 30 °C in an orbital shaker (280 rpm) for
approximately 24 hours, this culture was harvested and used for
determination of CO2 evolution and ethanol production experiments.
[0322] Cultivations for ethanol production were performed at
30 °C in 100 ml synthetic model medium (Verduyn-medium
(Verduyn et al., Yeast 8:501-517, 1992) with 5% glucose, 5% xylose,
3.5% arabinose and 1% galactose) in the BAM (Biological Activity
Monitor, Halotec, The Netherlands). The pH of the medium was
adjusted to 4.2 with 2 M NaOH/H2SO4 prior to sterilisation.
The synthetic medium for anaerobic cultivation was supplemented
with 0.01 g/l ergosterol and 0.42 g/l Tween 80
dissolved in ethanol (Andreasen and Stier. J. Cell Physiol.
41:23-36, 1953; and Andreasen and Stier. J. Cell Physiol.
43:271-281, 1954). The medium was inoculated at an initial OD600 of
approximately 2. Cultures were stirred by a magnetic stirrer.
Anaerobic conditions developed rapidly during fermentation as the
culture was not aerated. CO2 production was monitored
constantly. Sugar conversion and product formation (ethanol,
glycerol) were analyzed by NMR. Growth was monitored by following
optical density of the culture at 600 nm on a LKB Ultrospec K
spectrophotometer.
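Because the sugar concentrations above are given in % weight over volume and the anaerobic supplements in grams per litre, the amounts to weigh out follow directly from the culture volume. The sketch below does this arithmetic for the 100 ml model medium; the helper names are hypothetical and only the concentrations are taken from the text.

```python
# Concentrations as stated in the text for the synthetic model medium.
SUGARS_PERCENT_WV = {"glucose": 5.0, "xylose": 5.0, "arabinose": 3.5, "galactose": 1.0}
ANAEROBIC_SUPPLEMENTS_G_PER_L = {"ergosterol": 0.01, "Tween 80": 0.42}


def grams_from_percent_wv(percent_wv: float, volume_ml: float) -> float:
    """x % (w/v) means x grams per 100 ml."""
    return percent_wv * volume_ml / 100.0


def grams_from_g_per_l(g_per_l: float, volume_ml: float) -> float:
    """Convert a g/l concentration to grams for a volume given in ml."""
    return g_per_l * volume_ml / 1000.0


def recipe(volume_ml: float) -> dict:
    """Grams of each sugar and supplement needed for the given volume."""
    amounts = {
        name: grams_from_percent_wv(pct, volume_ml)
        for name, pct in SUGARS_PERCENT_WV.items()
    }
    amounts.update(
        {
            name: grams_from_g_per_l(conc, volume_ml)
            for name, conc in ANAEROBIC_SUPPLEMENTS_G_PER_L.items()
        }
    )
    return amounts


model_medium_100ml = recipe(100.0)
for component, grams in model_medium_100ml.items():
    print(f"{component}: {grams:.3f} g")
```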
[0323] Transformation of S. cerevisiae
[0324] Transformation of S. cerevisiae was done as described by
Gietz and Woods (2002; Transformation of the yeast by the LiAc/SS
carrier DNA/PEG method. Methods in Enzymology 350: 87-96).
[0325] Colony PCR
[0326] A single colony isolate was picked with a plastic toothpick
and resuspended in 50 µl milliQ water. The sample was incubated
for 10 minutes at 99 °C. 5 µl of the incubated sample was
used as a template for the PCR reaction, using Phusion® DNA
polymerase (Finnzymes) according to the instructions provided by
the supplier.
[0327] PCR Reaction Conditions:
TABLE-US-00007
step 1   3'     98 °C
step 2   10''   98 °C
step 3   15''   58 °C   repeat step 2 to 4 for 30 cycles
step 4   30''   72 °C
step 5   4'     72 °C
step 6   30''   20 °C
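The cycling conditions above can be summed to estimate the minimum run time of the colony PCR programme. The sketch below adds up the hold times only; ramping between temperatures is instrument dependent and is ignored here, which is an assumption of this example. The constant and function names are hypothetical.

```python
# Hold times in seconds, transcribed from the cycling conditions above.
INITIAL_DENATURATION_S = 3 * 60      # step 1: 3 min at 98 degrees C
CYCLE_STEPS_S = [10, 15, 30]         # steps 2-4: 98, 58 and 72 degrees C
NUMBER_OF_CYCLES = 30                # steps 2 to 4 are repeated 30 times
FINAL_EXTENSION_S = 4 * 60           # step 5: 4 min at 72 degrees C
FINAL_HOLD_S = 30                    # step 6: 30 s at 20 degrees C


def total_hold_time_s() -> int:
    """Sum of all programmed hold times, excluding temperature ramps."""
    cycling = NUMBER_OF_CYCLES * sum(CYCLE_STEPS_S)
    return INITIAL_DENATURATION_S + cycling + FINAL_EXTENSION_S + FINAL_HOLD_S


programme_seconds = total_hold_time_s()
print(f"{programme_seconds} s = {programme_seconds / 60:.0f} min of hold time")
```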
[0328] Chromosomal DNA Isolation
[0329] Yeast cells were grown in YEP-medium containing 2% glucose,
in a rotary shaker (overnight, at 30 °C and 280 rpm). 1.5
ml of these cultures were transferred to an Eppendorf tube and
centrifuged for 1 minute at maximum speed. The supernatant was
decanted and the pellet was resuspended in 200 µl of YCPS (0.1%
SB3-14 (Sigma Aldrich, the Netherlands) in 10 mM Tris.HCl pH 7.5; 1
mM EDTA) and 1 µl RNase (20 mg/ml RNase A from bovine pancreas,
Sigma, the Netherlands). The cell suspension was incubated for 10
minutes at 65 °C. The suspension was centrifuged in an
Eppendorf centrifuge for 1 minute at 7000 rpm. The supernatant was
discarded. The pellet was carefully dissolved in 200 µl CLS (25
mM EDTA, 2% SDS) and 1 µl RNase A. After incubation at
65 °C for 10 minutes, the suspension was cooled on ice.
After addition of 70 µl PPS (10 M ammonium acetate) the solutions
were thoroughly mixed on a Vortex mixer. After centrifugation (5
minutes in Eppendorf centrifuge at maximum speed), the supernatant
was mixed with 200 µl ice-cold isopropanol. The DNA readily
precipitated and was pelleted by centrifugation (5 minutes, maximum
speed). The pellet was washed with 400 µl ice-cold 70% ethanol.
The pellet was dried at room temperature and dissolved in 50 µl
TE (10 mM Tris.HCl pH 7.5, 1 mM EDTA).
Example 1
Construction of Strain BIE104A2P1
[0330] 1.1 Construction of an Expression Vector Containing the
Genes for Arabinose Pathway
[0331] Plasmid pPWT018, as set out in FIG. 2, was constructed as
follows: vector pPWT006 (FIG. 1, consisting of a SIT2-locus
(Gottlin-Ninfa and Kaback (1986) Molecular and Cell Biology vol. 6,
no. 6, 2185-2197) and the markers allowing for selection of
transformants on the antibiotic G418 and the ability to grow on
acetamide) was digested with the restriction enzymes BsiWI and MluI.
The kanMX-marker, conferring resistance to G418, was isolated from
p427TEF (Dualsystems Biotech) and a fragment containing the
amdS-marker has been described in the literature (Swinkels, B. W.,
Noordermeer, A. C. M. and Renniers, A. C. H. M (1995) The use of
the amdS cDNA of Aspergillus nidulans as a dominant, bidirectional
selectable marker for yeast transformation. Yeast Volume 11, Issue
1995A, page S579; and US 6051431). The genes encoding arabinose
isomerase (araA), L-ribulokinase (araB) and
L-ribulose-5-phosphate-4-epimerase (araD) from Lactobacillus
plantarum, as disclosed in patent application WO2008/041840, were
synthesized by BaseClear (Leiden, the Netherlands). One large
fragment was synthesized, harbouring the three arabinose-genes
mentioned above, under control of (or operably linked to) strong
promoters from S. cerevisiae, i.e. the TDH3-promoter controlling
the expression of the araA-gene, the ENO1-promoter controlling the
araB-gene and the PGI1-promoter controlling the araD-gene. This
fragment was flanked by the unique restriction sites Acc65I
and MluI. Cloning of this fragment into pPWT006 digested with MluI
and BsiWI, resulted in plasmid pPWT018 (FIG. 2). The sequence of
plasmid pPWT018 is set out in SEQ ID 1.
[0332] 1.2 Yeast Transformation
[0333] CEN.PK113-7D (MATa URA3 HIS3 LEU2 TRP1 MAL2-8 SUC2) was
transformed with plasmid pPWT018, which was previously linearized
with SfiI (New England Biolabs), according to the instructions of
the supplier. A synthetic SfiI-site was designed in the 5'-flank of
the SIT2-gene (see FIG. 2). Transformation mixtures were plated on
YPD-agar (per liter: 10 grams of yeast extract, 20 grams of
peptone, 20 grams of dextrose, 20 grams of agar) containing
100 µg G418 (Sigma Aldrich) per ml. After two to four days,
colonies appeared on the plates, whereas the negative control (i.e.
no addition of DNA in the transformation experiment) resulted in
blank YPD/G418-plates. The integration of plasmid pPWT018 is
directed to the SIT2-locus. Transformants were characterized using
PCR and Southern blotting techniques.
[0334] PCR reactions, which are indicative of the correct
integration of one copy of plasmid pPWT018, were performed with the
primers indicated by SEQ ID 2 and 3, and 3 and 4. With the primer
pair of SEQ ID 2 and 3, the correct integration at the SIT2-locus
was checked. If plasmid pPWT018 was integrated in multiple copies
(head-to-tail integration), the primer pair of SEQ ID 3 and 4 will
give a PCR-product. If the latter PCR product is absent, this is
indicative of one copy integration of pPWT018. A strain in which
one copy of plasmid pPWT018 was integrated in the SIT2-locus was
designated BIE104R2.
[0335] 1.3 Marker Rescue
[0336] In order to be able to transform the yeast strain with other
constructs using the same selection markers, it is necessary to
remove the selectable markers. The design of plasmid pPWT018 was
such, that upon integration of pPWT018 in the chromosome,
homologous sequences are in close proximity of each other. This
design allows the selectable markers to be lost by spontaneous
intramolecular recombination of these homologous regions.
[0337] Upon vegetative growth, intramolecular recombination will
take place, although at low frequency. The frequency of this
recombination depends on the length of the homology and the locus
in the genome (unpublished results). Upon sequential transfer of a
subfraction of the culture to fresh medium, intramolecular
recombinants will accumulate in time.
[0338] To this end, strain BIE104R2 was cultured in YPD-medium (per
liter: 10 grams of yeast extract, 20 grams of peptone, 20
grams of dextrose), starting from a single colony isolate.
25 µl of an overnight culture was used to inoculate fresh YPD
medium. After at least five of such serial transfers, the optical
density of the culture was determined and cells were diluted to a
concentration of approximately 5000 per ml. 100 µl of the cell
suspension was plated on Yeast Carbon Base medium (Difco)
containing 30 mM KPi (pH 6.8), 0.1% (NH4)2SO4, 40 mM
fluoro-acetamide (Amersham) and 1.8% agar (Difco). Cells identical
to cells of strain BIE104R2, i.e. without intramolecular
recombination, still contain the amdS-gene. To those cells,
fluoro-acetamide is toxic. These cells will not be able to grow and
will not form colonies on a medium containing fluoro-acetamide.
However, if intramolecular recombination has occurred,
BIE104R2-variants that have lost the selectable markers will be
able to grow on the fluoro-acetamide medium, since they are unable
to convert fluoro-acetamide into growth inhibiting compounds. Those
cells will form colonies on this agar medium.
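The plating step above implies a simple calculation: at approximately 5000 cells per ml, a 100 µl aliquot carries roughly 500 cells onto each fluoro-acetamide plate. The sketch below reproduces that figure and estimates the dilution needed from a measured optical density. The conversion from OD600 to cells per ml is an assumption of this example (it depends on strain and spectrophotometer and is not stated in the text), and the function names are hypothetical.

```python
# ASSUMPTION, not from the text: cells per ml corresponding to one OD600 unit.
ASSUMED_CELLS_PER_ML_PER_OD = 1.0e7


def cells_plated(cells_per_ml: float, plated_volume_ul: float) -> float:
    """Number of cells carried onto a plate by the plated volume."""
    return cells_per_ml * plated_volume_ul / 1000.0


def dilution_factor(
    od600: float,
    target_cells_per_ml: float = 5000.0,
    cells_per_ml_per_od: float = ASSUMED_CELLS_PER_ML_PER_OD,
) -> float:
    """Fold dilution needed to bring a culture to the target cell density."""
    return od600 * cells_per_ml_per_od / target_cells_per_ml


per_plate = cells_plated(5000.0, 100.0)
fold = dilution_factor(10.0)  # hypothetical overnight culture at OD600 = 10
print(f"about {per_plate:.0f} cells per plate; dilute about {fold:.0f}-fold")
```

Since only the rare recombinants that lost the amdS marker form colonies, a few hundred cells per plate keeps resistant colonies well separated.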
[0339] The thus obtained fluoro-acetamide resistant colonies were
subjected to PCR analysis using primers of SEQ ID 2 and 3, and 4
and 5. Primers of SEQ ID 2 and 3 will give a band if recombination
of the selectable markers has taken place as intended. As a result,
the cassette with the genes araA, araB and araD under control of
the strong yeast promoters has been integrated in the SIT2-locus
of the genome of the host strain. In that case, a PCR reaction
using primers of SEQ ID 4 and 5 should not result in a PCR product,
since primer 4 primes in a region that should be lost due to the
recombination. If a band is obtained with the latter primers, this
is indicative of the presence of the complete plasmid pPWT018 in
the genome, so no recombination has taken place.
[0340] If primers of SEQ ID 2 and 3 do not result in a PCR product,
recombination has taken place, but in such a way that the complete
plasmid pPWT018 has recombined out of the genome. Not only were the
selectable markers lost, but also the arabinose-genes. In fact,
wild-type yeast has been retrieved.
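The interpretation of the two diagnostic PCR reactions described in the preceding paragraphs can be summarised as a small decision function. This is an illustrative sketch only: the function name and return strings are hypothetical, and the combination "no product with SEQ ID 2 and 3 but a product with SEQ ID 4 and 5" is not discussed in the text, so it is reported here as inconclusive.

```python
def interpret_marker_rescue(band_seq2_3: bool, band_seq4_5: bool) -> str:
    """Classify a fluoro-acetamide resistant colony from its two PCR results.

    band_seq2_3: PCR product obtained with primers SEQ ID 2 and 3.
    band_seq4_5: PCR product obtained with primers SEQ ID 4 and 5.
    """
    if band_seq2_3 and not band_seq4_5:
        # Intended outcome: markers recombined out, arabinose genes retained.
        return "markers lost, araA/araB/araD cassette retained"
    if band_seq2_3 and band_seq4_5:
        # Primer 4 primes in the region that should have been lost.
        return "complete plasmid still present, no recombination"
    if not band_seq2_3 and not band_seq4_5:
        # Markers and arabinose genes were both lost.
        return "complete plasmid recombined out, wild-type retrieved"
    return "inconclusive (combination not described)"


for result in [(True, False), (True, True), (False, False), (False, True)]:
    print(result, "->", interpret_marker_rescue(*result))
```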
[0341] Isolates that showed PCR results in accordance with one copy
integration of pPWT018 were subjected to Southern blot analysis.
The chromosomal DNA of strain CEN.PK113-7D and of the correct
recombinants was digested with EcoRI and HindIII (double
digestion). A SIT2-probe was prepared with primers of SEQ ID 6 and
7, using pPWT018 as a template. The result of the hybridisation
experiment is shown in FIG. 3.
[0342] In the wild-type strain, a band of 2.35 kb is observed,
which is in accordance with the expected size of the wild-type
gene. Upon integration and partial loss by recombination of the
plasmid pPWT018, a band of 1.06 kb was expected. Indeed, this band
is observed, as shown in FIG. 3 (lane 2).
[0343] One of the strains that showed the correct pattern of bands
on the Southern blot (as can be deduced from FIG. 3) is the strain
designated as BIE104A2.
[0344] 1.4 Introduction of Four Constitutively Expressed Genes of
the Non-Oxidative Pentose Phosphate Pathway
[0345] Saccharomyces cerevisiae BIE104A2, expressing the genes
araA, araB and araD constitutively, was transformed with plasmid
pPWT080 (FIG. 4). The sequence of plasmid pPWT080 is set out in SEQ
ID 8. The procedures for transformation and selection, after
selecting a one copy integration transformant, are the same as
described above in sections 1.1, 1.2 and 1.3. In short, BIE104A2
was transformed with SfiI-digested pPWT080. Transformation mixtures
were plated on YPD-agar (per liter: 10 grams of yeast extract, 20
grams of peptone, 20 grams of dextrose, 20 grams of
agar) containing 100 µg G418 (Sigma Aldrich) per ml.
[0346] After two to four days, colonies appeared on the plates,
whereas the negative control (i.e. no addition of DNA in the
transformation experiment) resulted in blank YPD/G418-plates.
[0347] The integration of plasmid pPWT080 is directed to the
GRE3-locus. Transformants were characterized using PCR and Southern
blotting techniques. The correct integration of the plasmid pPWT080
at the GRE3-locus was checked by PCR using primer pairs SEQ ID 9
and SEQ ID 10, and the primer pair SEQ ID 9 and SEQ ID 11 was used
to detect single or multicopy integration of the plasmid pPWT080.
For Southern analysis, a probe was prepared by PCR using SEQ ID 12
and SEQ ID 13, amplifying a part of the RKI1-gene of S. cerevisiae.
Next to the native RKI1-gene, an extra signal was obtained
resulting from the integration of the plasmid pPWT080 (data not
shown).
[0348] A transformant showing correct integration of one copy of
plasmid pPWT080, in accordance with the expected hybridisation
pattern, was designated BIE104A2F1.
[0349] In order to remove the selection markers introduced by the
integration of plasmid pPWT080, strain BIE104A2F1 was cultured in
YPD-medium, starting from a colony isolate. 25 µl of an
overnight culture was used to inoculate fresh YPD-medium. After
five serial transfers, the optical density of the culture was
determined and cells were diluted to a concentration of
approximately 5000 per ml. 100 µl of the cell suspension was
plated on Yeast Carbon Base medium (Difco) containing 30 mM KPi (pH
6.8), 0.1% (NH4)2SO4, 40 mM fluoro-acetamide (Amersham) and 1.8%
agar (Difco). Fluoro-acetamide resistant colonies were subjected to
PCR analysis using the primers of SEQ ID 9 and SEQ ID 10. In case
of correct PCR-profiles, Southern blot analysis was performed in
order to verify the correct integration, again using the probe of
the RKI1-gene. One of the strains that showed the correct pattern
of bands on the Southern blot is the strain designated as
BIE104A2P1.
Example 2
Adaptive Evolution in Shake Flask Leading to BIE104A2P1c and
BIE201
[0350] 2.1 Adaptive Evolution (Aerobically)
[0351] A single colony isolate of strain BIE104A2P1 was used to
inoculate YNB-medium (Difco) supplemented with 2% galactose. The
preculture was incubated for approximately 24 hours at 30 °C
and 280 rpm. Cells were harvested and inoculated in YNB medium
containing 1% galactose and 1% arabinose at a starting OD600
of 0.2 (FIG. 5). Cells were grown at 30 °C and 280 rpm. The
optical density at 600 nm was monitored regularly.
[0352] When the optical density reached a value of 5, an aliquot of
the culture was transferred to fresh YNB medium of the same
composition. The amount of cells added was such that the starting
OD600 of the culture was 0.2. After reaching an OD600 of
5 again, an aliquot of the culture was transferred to YNB medium
containing 2% arabinose as sole carbon source (event indicated by
(1) in FIG. 5).
[0353] Upon transfer to YNB with 2% arabinose as sole carbon source
growth could be observed after approximately two weeks. When the
optical density at 600 nm reached a value of at least 1, cells were
transferred to a shake flask with fresh YNB-medium supplemented
with 2% arabinose at a starting OD600 of 0.2 (FIG. 5, day 28).
Sequential transfer was repeated three times, as is set out in
FIG. 5. The resulting strain, which was able to grow fast on
arabinose, was designated BIE104A2P1c.
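Each serial transfer described above resets the culture to a starting OD600 of 0.2. The sketch below shows the dilution arithmetic for such a transfer and the number of doublings the culture undergoes before the next one. It assumes that optical density is proportional to cell density over the measured range; the flask volume in the example is hypothetical and the function names are illustrative.

```python
import math


def inoculum_volume_ml(culture_od: float, target_od: float, final_volume_ml: float) -> float:
    """Volume of grown culture needed so the fresh culture starts at target_od.

    Uses C1 * V1 = C2 * V2, with final_volume_ml including the inoculum.
    """
    if culture_od <= target_od:
        raise ValueError("culture OD must exceed the target starting OD")
    return target_od * final_volume_ml / culture_od


def doublings_per_transfer(start_od: float, end_od: float) -> float:
    """Number of population doublings between inoculation and transfer."""
    return math.log2(end_od / start_od)


# Transfer at OD600 = 5 to a starting OD600 of 0.2; 25 ml is a hypothetical volume.
volume = inoculum_volume_ml(5.0, 0.2, 25.0)
doublings = doublings_per_transfer(0.2, 5.0)
print(f"inoculum: {volume:.2f} ml, about {doublings:.1f} doublings per transfer")
```

With transfers at an OD600 of 5, each round therefore corresponds to between four and five doublings, during which faster-growing variants can be enriched.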
[0354] 2.2 Adaptive Evolution (Anaerobically)
[0355] After adaptation to growth on arabinose under aerobic
conditions, a single colony from strain BIE104A2P1c was inoculated
in YNB medium supplemented with 2% glucose. The preculture was
incubated for approximately 24 hours at 30 °C and 280 rpm.
Cells were harvested and inoculated in YNB medium containing 2%
arabinose, at an initial OD600 of 0.2. The
flasks were closed with waterlocks, ensuring anaerobic growth
conditions after the oxygen was exhausted from the medium and head
space. After reaching an OD600 of at least 3, an aliquot of the
culture was transferred to fresh YNB medium containing 2% arabinose
(FIG. 6), each time at an initial OD600 value of 0.2. After
several transfers the resulting strain was designated BIE104A2P1d
(=BIE201).
Example 3
Performance Test of Strains in the BAM Showing that Adaptive
Evolution has Led to (Improved) Arabinose Conversion.
Co-Fermentation with Galactose
[0356] Single colony isolates of strains BIE104, BIE104A2P1c and
BIE201 were used to inoculate YNB-medium (Difco) supplemented with
2% glucose. The precultures were incubated for approximately 24
hours at 30 °C and 280 rpm. Cells were harvested and
inoculated in a synthetic model medium (Verduyn et al., Yeast
8:501-517, 1992; 5% glucose, 5% xylose, 3.5% arabinose, 1%
galactose) at an initial OD600 of approximately 2 in the BAM.
CO2 production was monitored constantly. Sugar conversion and
product formation were analyzed by NMR. The data represent the
residual amount of sugars at the indicated time points (glucose,
arabinose, galactose and xylose in grams per litre) and the
formation of (by-)products (ethanol, glycerol). Growth was
monitored by following optical density of the culture at 600 nm
(FIGS. 7, 8 and 9). The experiment ran for approximately 140 hours.
[0357] The experiments clearly show that reference strain BIE104
converted glucose rapidly, but was not able to convert arabinose,
xylose and/or galactose within 140 hours (FIG. 7). However, strains
BIE104A2P1c and BIE201 were capable of converting arabinose and
galactose (FIGS. 8 and 9, respectively). Galactose and arabinose
utilization started immediately after glucose depletion, after less
than 20 hours. Both sugars were converted simultaneously. However,
strain BIE201, which was improved for arabinose growth under
anaerobic conditions, consumed both sugars more rapidly (FIG. 9).
In all fermentations only glycerol was generated as by-product.
Example 4
Resequencing of the Strains and Identification of SNPs Involved in
Arabinose Fermentation
[0358] As can be concluded from examples 1, 2 and 3, mere
introduction of the genes encoding enzymes needed for or enhancing
the utilization of arabinose is not sufficient to allow growth on
arabinose as sole carbon source. As shown in example 2, a process
called adaptive evolution is required to select cells that utilize
arabinose as sole C-source.
[0359] Presumably, spontaneous mutations (SNPs, for Single
Nucleotide Polymorphisms) in the genome are responsible for this
phenotypic change. Alternatively, larger variations in the genome
(not limited to the substitution, insertion or deletion of a single
nucleotide) may have taken place.
[0360] In order to learn which mutations or SNPs are responsible
for this phenotypic change, we resequenced the genomic DNA of the
transformants, using the technology known in the art as Solexa®
technology, on the Illumina® Genome Analyzer.
[0361] To this end, chromosomal DNA was isolated from the strains
BIE104, primary transformant BIE104A2P1, evolved transformant
BIE104A2P1c and further evolved transformant BIE201 from YEP 2%
glucose overnight cultures. The DNA was sent to ServiceXS (Leiden,
the Netherlands) for resequencing using the Illumina® Genome
Analyzer (50 bp reads, paired-end sequencing).
[0362] Per strain, about 1800 Mb of sequences were obtained, which
corresponds to an average genome coverage of 140, which means that
on average, every base has been read 140 times.
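The relation between sequence yield and coverage stated above is a simple ratio. The sketch below illustrates it using a haploid S. cerevisiae genome size of roughly 12.1 Mb, which is an assumption of this example and is not given in the text. With that value, about 1800 Mb of sequence corresponds to a raw coverage in the same range as the figure of 140 quoted above; the exact number depends on the genome size used and on the fraction of reads that align.

```python
# ASSUMPTION, not from the text: approximate S. cerevisiae genome size in Mb.
ASSUMED_GENOME_SIZE_MB = 12.1


def average_coverage(total_sequence_mb: float, genome_size_mb: float = ASSUMED_GENOME_SIZE_MB) -> float:
    """Average number of times each base is read."""
    return total_sequence_mb / genome_size_mb


def number_of_reads(total_sequence_mb: float, read_length_bp: int) -> float:
    """Number of reads of the given length that make up the total yield."""
    return total_sequence_mb * 1e6 / read_length_bp


raw_coverage = average_coverage(1800.0)
reads = number_of_reads(1800.0, 50)
print(f"raw coverage about {raw_coverage:.0f}x from about {reads / 1e6:.0f} million 50 bp reads")
```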
[0363] Using NextGene software (SoftGenetics LLC, State College,
Pa. 16803, USA), the sequencing reads were aligned using the S288c
genome sequence as a template. Mutations (single nucleotide polymorphisms and
insertion/deletions up to 30 bp) were detected using NextGene
software and summarised in a mutation report. The alignments of the
different strains were compared to each other to identify the
unique variations between the strains. Every entry of the mutation
report was checked manually, in order to rule out the possibility
of misalignment of the reads, sequencing errors or mutation calls
in areas where the sequencing coverage was too low to support this.
False positive mutations were removed from the mutation report.
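The comparison of mutation reports between strains can be pictured as a set difference followed by a coverage filter. The sketch below is not the NextGene workflow; it is a minimal illustration in which the data structure, the coverage threshold and all positions are hypothetical placeholders, and the manual inspection of each call described above is not modelled.

```python
from typing import Dict, NamedTuple, Set


class Mutation(NamedTuple):
    chromosome: str
    position: int
    ref: str
    alt: str


def unique_variations(evolved: Set[Mutation], parent: Set[Mutation]) -> Set[Mutation]:
    """Calls present in the evolved strain but absent from its parent."""
    return evolved - parent


def well_supported(
    mutations: Set[Mutation], coverage: Dict[Mutation, int], min_coverage: int
) -> Set[Mutation]:
    """Drop calls in regions where coverage is too low to support them."""
    return {m for m in mutations if coverage.get(m, 0) >= min_coverage}


# Hypothetical placeholder calls (positions are NOT from the application).
shared_with_reference = Mutation("chrI", 1000, "A", "G")
candidate = Mutation("chrIV", 2000, "G", "T")
low_coverage_call = Mutation("chrX", 3000, "C", "T")

parent_calls = {shared_with_reference}
evolved_calls = {shared_with_reference, candidate, low_coverage_call}
coverage_at = {shared_with_reference: 150, candidate: 140, low_coverage_call: 3}

new_calls = well_supported(
    unique_variations(evolved_calls, parent_calls), coverage_at, min_coverage=20
)
print(new_calls)
```

Differences that both strains share relative to the S288c template cancel out in the set difference, so only variations that arose between parent and evolved strain remain.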
[0364] The sequence of the primary transformant (BIE104A2P1) was
identical to the sequence of wild-type strain BIE104, with the
exception of the sequences that were introduced and the sequences
that were deleted by the integration of the plasmids and the
subsequent removal of the markers by recombination.
[0365] In the evolved transformant, strain BIE104A2P1c, a limited
number of SNPs was introduced:
TABLE-US-00008
SSY1    YDR160w   G → T   introduction stop-codon
        YJR154w   A → G   D → G
CEP3    YMR168c   A → G   S → G
        YPL277c   C → T   silent
[0366] In the further evolved transformant, strain BIE201, one
additional SNP was observed, besides the 4 SNPs mentioned
above:
TABLE-US-00009
GAL80   YML051w   A → C   T → P
[0367] The sequences of the five open reading frames of the genes
containing the SNPs, both in the wild type strain BIE104 and in the
evolved strains BIE104A2P1c and BIE201, are given in SEQ ID 14, SEQ
ID 15 (SSY1), SEQ ID 16, SEQ ID 17 (YJR154w), SEQ ID 18, SEQ ID 19
(CEP3), SEQ ID 20, SEQ ID 21 (YPL277c), SEQ ID 22 and SEQ ID 23
(GAL80).
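The nucleotide and amino acid changes listed in the two tables above can be cross-checked against the standard genetic code. The sketch below enumerates, for a given base substitution, the codons in which that single change produces the stated amino acid change. The actual codons affected are those in the listed sequences and are not reproduced here; the sketch also assumes that each base change is stated on the coding strand, which the text does not say explicitly. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_changes(ref_base, alt_base, ref_aa, alt_aa):
    """List (codon, mutated codon) pairs consistent with a reported SNP.

    ref_aa may be None when the original amino acid is not reported;
    "*" denotes a stop codon.
    """
    hits = []
    for codon, aa in CODON_TABLE.items():
        if ref_aa is not None and aa != ref_aa:
            continue
        for i, base in enumerate(codon):
            if base != ref_base:
                continue
            mutated = codon[:i] + alt_base + codon[i + 1:]
            if CODON_TABLE[mutated] == alt_aa:
                hits.append((codon, mutated))
    return sorted(hits)


print("A->G, D->G:", codon_changes("A", "G", "D", "G"))
print("A->G, S->G:", codon_changes("A", "G", "S", "G"))
print("A->C, T->P:", codon_changes("A", "C", "T", "P"))
print("G->T, stop:", codon_changes("G", "T", None, "*"))
```

Each reported change has at least one matching codon, so the table entries are internally consistent with the standard code under the coding-strand assumption.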
Example 5
Confirmation of the SNPs
[0368] In order to (re)confirm the SNPs that were detected in the
example described above, two methods were employed. The first
method comprised amplification of the regions containing the SNPs
followed by single read (Sanger) sequencing on an ABI 3730XL
sequencer (outsourced to BaseClear BV, Leiden, the Netherlands).
The second method consisted of High Resolution Melting Analysis
(Hi-Res).
[0369] 5.1 Single Read Sanger Sequencing
[0370] Genomic DNA isolated from cultures of strains BIE104A2P1 and
BIE201 was used as a template for PCR reactions using Phusion®
High-Fidelity DNA Polymerase (Finnzymes, Vantaa, Finland). The PCR
reactions were performed according to the suggestions made by the
supplier. The following primers were used to amplify the following
genes, expected to have a SNP.
TABLE-US-00010 TABLE 2 Primers used for amplification of PCR products
Gene of interest   Forward primer   Reverse primer
SSY1 (YDR160w)     SEQ ID NO 24     SEQ ID NO 25
YJR154w            SEQ ID NO 26     SEQ ID NO 27
CEP3 (YMR168c)     SEQ ID NO 28     SEQ ID NO 29
YPL277c            SEQ ID NO 30     SEQ ID NO 31
GAL80 (YML051w)    SEQ ID NO 32     SEQ ID NO 33
[0371] The PCR products were cloned into the pTOPO Blunt II vector
(Invitrogen, Carlsbad, USA). The correct clones were selected on
basis of restriction enzyme analysis. Correct clones were sent to
BaseClear BV (Leiden, the Netherlands) for single stranded Sanger
sequencing.
[0372] The TOPO cloning of the CEP3 fragment was not successful. No
Sanger sequencing data was obtained for this gene.
[0373] The sequence of YPL277c appeared to be identical to the
sequence of the wild-type strain BIE104.
[0374] The Sanger sequencing results confirmed the SNPs in the
genes SSY1, YJR154w and GAL80, i.e. the SNPs were the same as
described in Example 4.
[0375] 5.2 Hi-Res Analysis
[0376] The Hi-Res technology is commercialized by Idaho
Technologies (Salt Lake City, Utah 84108, USA). In short, mutations
in PCR products are detected by the presence of heteroduplexes
optimally detected by LCGreen.RTM. dye. Variations are identified
by changes in the shape of the melting profile compared to a
reference sample. Hi-Res Melting.RTM. (HRM) on the
LightScanner.RTM. is being used for mutation discovery in numerous
research and clinical applications.
[0377] For each SNP, two primers were designed in order to amplify
a region of around 100 to 200 bp containing the SNP or the
wild-type sequence. In addition, a third primer was designed to
function as a probe in the experiments in which the melting
profile is detected. The latter primer was designed such that it
covers the SNP region and is exactly complementary to the wild-type
sequence. The matching to the SNP sequence is imperfect, i.e. all
but one of the nucleotides of the probe are complementary to the
region of interest. Mismatched DNA strands will melt at a lower
temperature than matched DNA strands, which results in different
melting curves of wild type and SNP amplicons, which are detected
using the LightScanner.RTM. (Idaho Technologies, Salt Lake City,
Utah, USA).
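The probe design rule described above (a perfect match to the wild-type sequence, exactly one mismatch to the SNP allele) can be checked with a short sketch. The sequences below are hypothetical placeholders and are not the probes of SEQ ID NO 34 to 38; for simplicity the probe is written in the same orientation as the strand it is compared with.

```python
def count_mismatches(probe, target):
    """Count the positions at which two equal-length sequences differ."""
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    return sum(p != t for p, t in zip(probe.upper(), target.upper()))


# Hypothetical 15-nt region around a SNP (assumption, for illustration only).
WILD_TYPE = "GCTAGATCGTACCGA"
SNP_ALLELE = "GCTAGATTGTACCGA"  # differs from WILD_TYPE at one position
PROBE = WILD_TYPE               # probe designed on the wild-type sequence

print(count_mismatches(PROBE, WILD_TYPE))   # perfect match expected
print(count_mismatches(PROBE, SNP_ALLELE))  # single mismatch expected
```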
[0378] The table below summarizes the primer sequences that were
used to amplify the gene or ORF of interest, of which the SNP
should be verified in strain BIE201.
TABLE-US-00011 TABLE 3 Primers for amplification of PCR products
Gene of interest   Forward primer   Reverse primer
SSY1 (YDR160w)     SEQ ID NO 24     SEQ ID NO 25
YJR154w            SEQ ID NO 26     SEQ ID NO 27
CEP3 (YMR168c)     SEQ ID NO 28     SEQ ID NO 29
YPL277c            SEQ ID NO 30     SEQ ID NO 31
GAL80 (YML051w)    SEQ ID NO 32     SEQ ID NO 33
[0379] The table below summarizes the SEQ ID NOs that have been
used to verify the SNPs in strain BIE201 (the probes).
TABLE-US-00012 TABLE 4 Primers used as probes in Hi-Res analysis
Gene of interest   Probe wild-type sequence
SSY1 (YDR160w)     SEQ ID NO 34
YJR154w            SEQ ID NO 35
CEP3 (YMR168c)     SEQ ID NO 36
YPL277c            SEQ ID NO 37
GAL80 (YML051w)    SEQ ID NO 38
[0380] PCR reactions were carried out using chromosomal DNA of the
strains BIE104 (wild type yeast strain) and strain BIE201 (the
yeast strain capable of growing anaerobically on arabinose), using
primer pairs of SEQ ID NO 24 and 25 (SSY1), 26 and 27 (YJR154w), 28
and 29 (CEP3), 30 and 31 (YPL277c) and 32 and 33 (GAL80), according
to the instructions as provided by Idaho Technologies but in the
absence of probe. The amplified fragments were checked on a 2%
agarose gel for yield and integrity.
[0381] The Hi-Res analysis was performed as follows, analogous to
the protocol provided by Idaho Technologies: 2 .mu.l of probe (5
.mu.M) was added to 10 .mu.l PCR product in a PCR microplate
(4titude Framestart 96, black frame, white wells (Bioke, Leiden,
the Netherlands)). After mixing, the microplate was spun down. The
plate was incubated for 30 seconds at 99.degree. C. and cooled to
room temperature (.about.20.degree. C.). Subsequently, the melting
protocol on the LightScanner was followed with a start temperature
of 55.degree. C., an end temperature of 94.degree. C. and exposure
settings on "auto". After the measurements were complete, data
analysis was performed. The temperature boundaries between which
the change in fluorescence was analysed were manually set at the
temperature interval where the probe was expected to melt from the
PCR products.
[0382] An example of a melting curve is shown in FIG. 11. FIG. 11
displays an example of both "Normalized Melting Curves" (melting
curves; top panel) and a "Normalized melting Peaks" curve (lower
panel). The latter is derived from the first graph and shows
the change in fluorescence signal as a function of the temperature.
Strains BIE104A2P1 and BIE201 are displayed. The gene tested in
this figure is YJR154w. The difference in melting temperature of
the probe is clear between the two strains tested, BIE201 and
BIE104A2P1.
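The relation between the two panels described above, where the melting peaks are derived from the melting curve as the change in fluorescence with temperature, can be sketched as a negative finite-difference derivative. The curves below are synthetic logistic curves with hypothetical melting temperatures (68.25 and 64.25 degrees C); they are assumptions for illustration and are not the data of FIG. 11, nor the algorithm of the instrument software.

```python
import math


def melting_peaks(temps, fluorescence):
    """Return interval midpoints and -dF/dT between successive readings."""
    mids, neg_deriv = [], []
    for i in range(len(temps) - 1):
        dt = temps[i + 1] - temps[i]
        mids.append((temps[i] + temps[i + 1]) / 2)
        neg_deriv.append(-(fluorescence[i + 1] - fluorescence[i]) / dt)
    return mids, neg_deriv


def peak_temperature(temps, fluorescence):
    """Temperature at which -dF/dT is largest (the melting peak)."""
    mids, neg_deriv = melting_peaks(temps, fluorescence)
    return mids[neg_deriv.index(max(neg_deriv))]


def synthetic_curve(tm, temps, width=1.0):
    """Normalized logistic melting curve with midpoint tm (toy model)."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


# Readings every 0.5 degrees from 55 to 94 degrees C, as in the melting protocol.
TEMPS = [55 + 0.5 * i for i in range(79)]
matched = synthetic_curve(68.25, TEMPS)     # probe on wild-type amplicon
mismatched = synthetic_curve(64.25, TEMPS)  # probe on SNP amplicon

print(peak_temperature(TEMPS, matched), peak_temperature(TEMPS, mismatched))
```

In this toy model the mismatched probe gives a peak at a lower temperature than the matched probe, which is the kind of shift used to distinguish the two strains.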
[0383] All expected SNPs, except the one in YPL277c, were
confirmed. The sequence of this ORF (YPL277c) in BIE201 appeared to
be identical to the sequence of the wild-type strain BIE104.
[0384] In summary, in Example 5 the SNPs in the ORFs SSY1
(YDR160w), YJR154w, CEP3 (YMR168c) and GAL80 (YML051w) were
confirmed. The SNP that was previously identified (Example 4) in
the ORF of YPL277c was falsified using two independent methods.
Example 6
Amplification of Parts of Chromosome VII
[0385] 6.1 Amplification of a Part of Chromosome VII
[0386] As was described in Example 4, resequencing of the wild-type
strain BIE104, primary transformant BIE104A2P1, evolved
transformant BIE104A2P1c and further evolved strain BIE201 yielded
several interesting SNPs.
[0387] Using the coverage plots, which indicate the read depth of
every single nucleotide of the genome, we have searched for areas
in the genome that were over- or underrepresented. Indeed, we have
identified a region on chromosome VII that was overrepresented (see
FIG. 12).
[0388] From the read depth, it was concluded that parts of
chromosome VII, surrounding the centromere, were amplified. A
region on the left arm of chromosome VII was amplified three times.
A part of the right arm of chromosome VII was amplified twice, and
an adjacent part was amplified three times (see FIG. 12).
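The read-depth reasoning above can be sketched as follows: the mean depth in a window is divided by a baseline depth representing a single-copy region, and the ratio is rounded to the nearest integer. This is a minimal sketch under stated assumptions; the window size, the use of the median as baseline and the toy depth values are all hypothetical and do not reproduce the coverage plot of FIG. 12.

```python
from statistics import median


def window_copy_number(depth, window, baseline=None):
    """Estimate the copy number per window as mean depth / baseline depth."""
    if baseline is None:
        # Assumption: most of the genome is single copy, so the median
        # depth represents one copy in a haploid strain.
        baseline = median(depth)
    estimates = []
    for start in range(0, len(depth) - window + 1, window):
        chunk = depth[start:start + window]
        estimates.append(round(sum(chunk) / window / baseline))
    return estimates


# Toy per-nucleotide read depth (hypothetical): a triplicated stretch and
# a duplicated stretch embedded in single-copy sequence.
toy_depth = [30] * 40 + [90] * 20 + [30] * 20 + [60] * 10 + [30] * 10

print(window_copy_number(toy_depth, window=10))
```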
[0389] The part on the right arm of chromosome VII that was
amplified three times contains the arabinose expression cassette,
i.e. the genes araA, araB and araD under control of strong
constitutive promoters.
[0390] Firstly, the copy number of several genes was confirmed by
Q-PCR. Secondly, it was investigated whether the amplification took
place on the same chromosome (duplication or triplication) or
whether the amplified region was integrated into another chromosome
(translocation).
[0391] 6.2 Copy Number Determination by Q-PCR
[0392] In order to verify the amplification of parts of chromosome
VII, as indicated by the coverage plot of FIG. 12, Q-PCR
experiments were performed. Specifically, this method measures the
relative copy number of a gene of interest by comparing it with
another gene, with a known copy number.
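One common way to express such a comparison is the delta-delta-Ct calculation sketched below. The text does not state which calculation was used, so this is an assumption: the gene of interest is compared with the reference gene in the test strain, and that difference is normalized against the same difference in a calibrator strain in which the gene is single copy. The threshold cycle (Ct) values and the amplification efficiency of 2 per cycle are hypothetical.

```python
def relative_copy_number(ct_target, ct_ref, ct_target_cal, ct_ref_cal,
                         efficiency=2.0):
    """Relative copy number of a target gene by the delta-delta-Ct approach.

    ct_target, ct_ref         : Ct values in the strain under test
    ct_target_cal, ct_ref_cal : Ct values in the calibrator strain
    efficiency                : fold amplification per cycle (assumed 2.0)
    """
    delta_sample = ct_target - ct_ref
    delta_calibrator = ct_target_cal - ct_ref_cal
    return efficiency ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: the target crosses the threshold two cycles
# earlier than the reference in the test strain, but not in the calibrator.
print(relative_copy_number(18.0, 20.0, 20.0, 20.0))
```

A gene that reaches the threshold two cycles earlier than in the calibrator would, under these assumptions, be estimated at four copies.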
[0393] To this end, the Bio-Rad iCycler iQ system from Bio-Rad
(Bio-Rad Laboratories, Hercules, Calif., USA) was used. The iQ SYBR
Green Supermix (Bio-Rad) was used. Experiments were set up as
suggested in the manual of the provider.
[0394] From the coverage plot (read depth) it was deduced that
genes SDS23 and YGL057c were expected to be part of the amplified
region on the left arm of chromosome VII. As a reference single
copy gene, the ACT1 gene was chosen.
[0395] The primers for the detection of the genes YGL057c, SDS23
and ACT1 are summarized in the table below.
TABLE-US-00013 TABLE 5 Primers used for amplification in the Q-PCR
experiment
Gene of interest   Forward primer   Reverse primer
YGL057c            SEQ ID NO 39     SEQ ID NO 40
SDS23              SEQ ID NO 41     SEQ ID NO 42
ACT1               SEQ ID NO 43     SEQ ID NO 44
[0396] The Q-PCR conditions were as follows:
[0397] 1) 95.degree. C. for 3 min
[0398] for 40 cycles, steps 2-4
[0399] 2) 95.degree. C. for 10 sec
[0400] 3) 58.degree. C. for 45 sec
[0401] 4) 72.degree. C. for 45 sec
[0402] 5) 65.degree. C. for 10 sec
[0403] 6) Increase of temperature by 0.5.degree. C. per 10 sec to
95.degree. C.
[0404] The melting curve is determined by first measuring
fluorescence at 65.degree. C. for 10 seconds. The temperature is
then increased by 0.5.degree. C. every 10 seconds, until a
temperature of 95.degree. C. is reached. From the readings, the copy
number of the gene of interest was calculated and/or estimated. The
results are presented in the table below.
TABLE-US-00014 TABLE 6 Relative copy number of selected genes in
strains BIE104A2P1 and BIE201
Gene of interest   Copy number in BIE104A2P1   Copy number in BIE201
YGL057c            1.2                         5.1
SDS23              1.2                         4.4
ACT1               1.0 (reference)             1.0 (reference)
[0405] The results corroborate the amplification as was apparent
from the read depth analysis in Example 6 (section 6.1). The
observed values are higher than the expected copy number of 3.0.
The difference may be caused by a number of factors, as previously
disclosed by Klein (Klein, D. (2002) TRENDS in Molecular Medicine
Vol. 8 No. 6, 257-260).
[0406] 6.3 Analysis of the Nature of the Duplication
[0407] In order to determine whether the amplified regions are
located on the chromosome on which the genes are originally located,
i.e. chromosome VII, or have been translocated to another
chromosome, CHEF electrophoresis (Clamped Homogeneous Electric
Fields electrophoresis; CHEF-DR.RTM. III Variable Angle System;
Bio-Rad, Hercules, Calif. 94547, USA) was applied. Agarose plugs of
yeast strains (see below) were prepared using the CHEF Yeast
Genomic DNA Plug Kit (BioRad) according to the instructions of the
supplier. 1% Agarose gels (Pulse Field Agarose, Bio-Rad) were
prepared in 0.5.times. TBE (Tris-Borate-EDTA) according to the
supplier's instructions. Gels were run according to the following
settings:
[0408] Block 1: initial time 60 sec
[0409] final time 80 sec
[0410] ratio 1
[0411] run time 15 hours
[0412] Block 2: initial time 90 sec
[0413] final time 120 sec
[0414] ratio 1
[0415] run time 9 hours
[0416] As a marker for size determination of the chromosomes,
agarose plugs of strain YNN295 (Bio-Rad) were included in the
experiment.
[0417] After electrophoresis, gels were stained using
ethidium bromide at a final concentration of 70 pg per litre, for 30
minutes. In FIG. 13, an example of a stained gel is shown.
[0418] After staining, gels were blotted onto Amersham Hybond N+
membranes (GE Healthcare Life Sciences, Diegem, Belgium).
[0419] In order to be able to establish if the amplified genes are
located on one chromosome or translocated to other chromosomes,
probes were made for hybridization with the blotted membranes.
Probes (see table below) were prepared using the PCR DIG Probe
Synthesis Kit (Roche, Almere, the Netherlands) according to the
instructions of the supplier.
[0420] The following probes were prepared.
TABLE-US-00015 TABLE 7 Primers for amplification of the indicated
probes
Probe     Systematic   Forward        Reverse        Size PCR       Chromosome
          name gene    primer         primer         product (bp)
araA                   SEQ ID NO 45   SEQ ID NO 46   641            VII
ACT1      YFL039c      SEQ ID NO 47   SEQ ID NO 48   392            VI
PNC1      YGL037c      SEQ ID NO 49   SEQ ID NO 50   384            VII
HSF1      YGL073w      SEQ ID NO 51   SEQ ID NO 52   381            VII
YGR031w   YGR031w      SEQ ID NO 53   SEQ ID NO 54   392            VII
[0421] The araA-gene is expected to be amplified three times in
BIE104A2P1c and BIE201.
[0422] The ACT1-gene is located on chromosome VI and not expected
to be amplified. Hence, this probe serves as a control.
[0423] PNC1 is located on the left arm of chromosome VII and is
expected to be amplified three times in BIE104A2P1c and BIE201.
[0424] HSF1 is located on the left arm of chromosome VII and is
located upstream of the amplified region. Hence, this gene is
expected to be present in the genome as a single gene in the
strains tested.
[0425] YGR031w is located on the right arm of chromosome VII. This
gene is expected to be present in two copies in the genome of
strains BIE104A2P1c and BIE201.
[0426] Membranes were prehybridized in DIG Easy Hyb Buffer (Roche)
according to the instructions of the supplier. The probes were
denatured at 99.degree. C. for 5 minutes, chilled on ice for 5
minutes, and added to the prehybridized membranes. Hybridization
was done overnight at 42.degree. C.
[0427] Washing of the membranes and blocking of the membranes prior
to detection of the hybridized probes were done using the DIG Wash
and Block Buffer Set (Roche) according to the instructions of the
supplier. The detection was done by incubation with
anti-digoxigenin-AP Fab fragments (Roche) followed by the addition
of detection reagents using the CDP-Star ready-to-use kit (Roche).
Detection of the chemiluminescent signals was performed using the
Bio-Rad Chemidoc XRS+ System, using the appropriate settings
provided by the Chemidoc apparatus.
[0428] The results are shown in FIGS. 13, 14, 15, 16, 17 and
18.
[0429] From FIG. 13 it can already be inferred that there are
differences in the size of the chromosomes in the strain lineage
from BIE104 to BIE201. In strain BIE104A2P1(a), the primary
transformant, no large differences are observed with respect to the
size of the chromosomes when compared to BIE104. In strains
BIE104A2P1c and BIE201 however, the size of chromosome VII has
increased. In strain BIE104, chromosome VII is close to chromosome
XV; in BIE104A2P1c and BIE201 however, the chromosome has increased
in size and is almost as large as chromosome IV.
[0430] Hybridization with probes of the genes araA (FIG. 14), PNC1
(FIG. 16) and HSF1 (FIG. 17) gives the same picture. This suggests
that the amplification has taken place within the same chromosome,
i.e. that all amplified regions are still on chromosome VII. If a
translocation had occurred, multiple signals would have been
expected, which is not the case. In strain BIE104A2P1(a), a smaller
band is observed under the band of chromosome VII, with all three
probes. This suggests that a second, smaller version of chromosome
VII is present. Since its intensity is lower than that of the larger
band, it may be present in only a fraction of the cells. It may also
be explained by assuming an electrophoresis artefact.
[0431] The hybridisation with the ACT1 probe (FIG. 15) results in a
single band in all strains which, as expected, represents chromosome
VI.
[0432] Finally, the hybridisation with the YGR031w probe (FIG. 18)
resulted in many bands. Apparently, cross-hybridization occurred,
resulting in multiple signals in each strain. Therefore, this
result cannot be used for the purpose of this experiment.
[0433] Though some differences in intensity are observed between
the strains, it is difficult to conclude from these data whether
amplification can be shown. Although an increase in the signal
intensity may suggest an increase of the copy number of a certain
gene, other factors may also influence the signal strength, like
the amount of DNA applied on the gel, blotting efficiency,
detection saturation, and the like.
[0434] Taken together, the results of Example 6 clearly indicate
that the amplification has taken place within chromosome VII. There
is no evidence for a translocation of the genetic context of the
genes araA, araB and araD (including surrounding sequences) to
another chromosome.
Example 7
Phenotypic Validation of the SNPs and Amplification
[0435] In order to validate whether, and if so to what extent, the
discovered SNPs and the amplification contribute to the ability of
yeast cells to convert arabinose into ethanol (apart from the
introduced homologous and heterologous pathways), cross-breeding
experiments were performed. To this end, the
following experiments were performed: mating type switch of strain
BIE201, cross-breeding of the mating type switched BIE201 with the
non-evolved parent strain BIE104A2P1, sporulation of the diploid
strain followed by dissection of the four ascospores, determination
of the ability to utilize arabinose as sole carbon and energy
source in the haploid offspring, SNP detection in the haploid
offspring using Hi-Res, and analysis of these datasets.
[0436] By crossing the evolved, mating type switched BIE201 with
the non-evolved primary transformant BIE104A2P1, a diploid cell is
constructed which is completely homozygous, except for the
identified genomic changes (SNPs and amplification). By
subsequently sporulating this diploid cell followed by dissection
of the ascospores, haploid cells will be obtained which may have
none, some or all genomic changes that were introduced during
adaptive evolution. The distribution of the genomic changes over the
four haploid derivatives of one diploid cell is random, although
per SNP, DIP or amplification, a 2:2 segregation is expected over
the four haploid derivatives. For more theoretical background, see
e.g. Mortimer R. K. and Hawthorne D. C. (1975) Genetic Mapping in
Yeast. Methods Cell Biol. 11:221-33.
[0437] 7.1 Mating Type Switch of Strain BIE201
[0438] Plasmid pGal-HO (KAN) is a derivative of the plasmid pGAL-HO
(Herskowitz, I. and Jensen, R. E. (1991) Methods in Enzymology,
194:132-146). The URA3-marker in pGAL-HO has been replaced by the
kanMX marker, by cutting pGAL-HO with EcoRV followed by the
ligation of the kanMX fragment from pUG6 (Guldener, U. et al (1996)
Nucleic Acids Research 24: 2519-2524). The kanMX marker, allowing
for G418 selection in S. cerevisiae, was cut from pUG6 with the
restriction enzymes XbaI and XhoI, followed by filling in the
overhanging ends with Klenow polymerase. The resulting plasmid is
pGal-HO (KAN).
[0439] Strain BIE201 (relevant genotype in relation to this
experiment: matA) was transformed according to the method of Gietz
and Woods (2002) with the plasmid pGal-HO (KAN). Transformants were
selected on YEP/agar-plates containing glucose (2%) and G418 (100
.mu.g/ml). Colonies appeared after two days of incubation at
30.degree. C. Eight colonies were restreaked on fresh
YEP/agar-plates with glucose and G418. Two colonies of each
transformation were used to inoculate 20 ml YEP-medium containing
1% galactose and 0.1% glucose. After 2 days of incubation at
30.degree. C. and 280 rpm, cells were restreaked on YEPD-plates.
Plates were incubated for 2 days at 30.degree. C., after which
colonies were visible. PCR reactions were performed for the determination of
the mating-type using the primers of SEQ ID NO 55 and 56 (for
identification of matA cells), and primers of SEQ ID NO 55 and 57
(for identification of mat.alpha. (alpha) cells).
[0440] Several mat.alpha. (alpha) variants of BIE201 were obtained.
In order to test whether these derivatives have indeed switched
their mating type, they were restreaked on fresh YEPD-plates. Also,
strain BIE104A2P1 (the primary transformant, relevant genotype in
this experiment: matA) was restreaked on a separate fresh
YEPD-plate.
[0441] Subsequently, both strains were allowed to mate by mixing a
loopful of each strain on a fresh YEPD-agar plate. After 6 hours of
incubation at 30.degree. C., mating was scored under the
microscope. Some isolates indeed appeared to form zygotes, i.e.
structures in which two cells of opposite mating type have fused to
form a diploid strain. These BIE201 derivatives indeed changed the
mating type to mat.alpha. (alpha).
[0442] 7.2 Cross-Breeding of the Mating Type Switched BIE201 with
the Non-Evolved Parent Strain BIE104A2P1
[0443] The preparations in which the formation of hybrids (zygotes)
was observed by microscopy (section 7.1) were plated on YEPD-agar
plates. Plates were incubated at 30.degree. C. for two days. The
larger colonies were picked and restreaked on fresh YEPD-plates.
Subsequently, colony PCR was performed using the primers of SEQ ID
NO 55 and 56 and SEQ ID NO 55 and 57. Diploids will form a PCR
product with both primer pairs. Several of these colonies were
obtained and used to inoculate YEP-medium with 2% glucose
(30.degree. C., 280 rpm).
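The interpretation of the colony PCR described above can be summarized in a small decision function: a product with the primers of SEQ ID NO 55 and 56 indicates matA, a product with the primers of SEQ ID NO 55 and 57 indicates mat alpha, and a product with both pairs indicates a diploid. The function name and the returned labels are illustrative choices, not terms from the protocol.

```python
def mating_type_from_pcr(product_55_56, product_55_57):
    """Interpret colony PCR results for the two mating-type primer pairs.

    product_55_56: True if a product is formed with SEQ ID NO 55 and 56
    product_55_57: True if a product is formed with SEQ ID NO 55 and 57
    """
    if product_55_56 and product_55_57:
        return "diploid"
    if product_55_56:
        return "matA"
    if product_55_57:
        return "mat alpha"
    return "no product"


print(mating_type_from_pcr(True, True))
```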
[0444] 7.3 Sporulation of the Diploid Strain and Dissection of the
Ascospores
[0445] After overnight growth at 30.degree. C. and 280 rpm, 2.5 ml
was transferred to 25 ml 1.5% KAc in tap water (sterilized).
Incubation was continued at 30.degree. C. and 280 rpm. Each day,
the degree of sporulation was checked microscopically. When the
ratio of asci versus vegetative cells was larger than 2, 60 asci
were dissected using the Singer MSM System.COPYRGT. series 300
(Somerset, UK) apparatus, using the instructions and protocols of
the supplier. Dissection was done on YEPD-plates. Plates were
incubated for 2 days at 30.degree. C. An example of the result is
set out in FIG. 19.
[0446] FIG. 19 shows 10 asci that were dissected. The ascospores
from the ascus were separated from each other and put on the agar
plate at distinctive distances. Colonies in a "column" (10 columns
are shown) originate from one ascus.
[0447] As is apparent from FIG. 19, not all four spores were viable
in all cases. In a minority of the cases, only three and sometimes
even only two ascospores grew into viable colonies.
[0448] Also, some differences in the colony size were observed
between the colonies from one ascospore.
[0449] 7.4 Determination of the Ability to Utilize Arabinose as
Sole Carbon and Energy Source in the Haploid Offspring
[0450] All complete sets of haploid derivatives, i.e. those
cases where four viable spores were obtained from an ascus, were
inoculated in YEPD-agar in 96-wells microplates. Strains
BIE104A2P1 and BIE201 were included as controls on each microplate,
at least in duplicate. The plates were incubated for 2 days at
30.degree. C. These plates are called the "masterplates".
[0451] 96-Well microplates containing 200 .mu.l Verduyn-medium and
2% glucose were inoculated with colony material from the
masterplates, with the aid of a disposable pin tool, which allows
the transfer of cell material of all 96 strains in a microplate in
one movement.
[0452] The microplate containing the liquid Verduyn medium with 2%
glucose was grown for two days at 30.degree. C. and 550 rpm, in an
Infors microplate shaker, at 80% humidity.
[0453] Subsequently, 10 .mu.l of the glucose grown microplate
cultures were transferred to microplates containing 200 .mu.l
Verduyn medium containing 2% arabinose as a carbon source. The
incubation in an Infors shaker at 30.degree. C., 550 rpm and 80%
humidity lasted for four days. Each day, the growth was monitored
by measuring the optical density at 620 nm using a BMG FLUOstar
microplate reader (BMG, Offenburg, Germany). The ability to utilize
arabinose was expressed by dividing the final optical density after
4 days of incubation on arabinose as sole carbon source by the
initial optical density of the same microplate. An example of the
results is summarized in table 8.
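The growth measure defined above is simply the ratio of the final to the initial optical density of the same well. A minimal sketch, with hypothetical optical density readings:

```python
def growth_ratio(initial_od, final_od):
    """Final optical density divided by the initial optical density."""
    if initial_od <= 0:
        raise ValueError("initial optical density must be positive")
    return final_od / initial_od


# Hypothetical readings at 620 nm for one well (assumed values).
print(growth_ratio(initial_od=0.04, final_od=1.0))
```

Because the measure is a ratio, a value somewhat above 1 can arise without growth on arabinose itself, for example from nutrients carried over with the inoculum.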
TABLE-US-00016 TABLE 8 Of each haploid derivative from the
dissected asci and the controls BIE104A2P1 and BIE201, the growth
(defined as the final optical density at 620 nm divided by the
initial optical density at 620 nm) was determined.
Haploid strain   Growth
A1               27
A2               7
A3               5
A4               26
B1               6
B2               29
B3               9
B4               5
BIE201           25
BIE104A2P1a      5
C1               9
C2               11
C3               25
C4               12
D1               17
D2               8
D3               11
D4               15
E1               18
E2               6
E3               9
E4               10
F1               9
F2               8
F3               10
F4               7
G1               9
G2               9
G3               17
G4               32
[0454] From table 8 it is clear that there is, as can be expected,
a large difference between the two control strains, BIE104A2P1 and
BIE201. BIE104A2P1 reaches a level of 5, which in practice means
that no growth was obtained. Though a factor 5 suggests that some
growth has occurred, this will most likely be caused by carry over
of nutrients (residual glucose, ethanol) from the preculture.
Strain BIE201 reaches a growth ratio of 25, which is markedly
higher than that of strain BIE104A2P1.
[0455] The haploid derivatives display a wide range of growth
phenotypes, ranging from low growth (similar to BIE104A2P1) to high
levels of growth (similar to and exceeding the level of BIE201).
Also, strains with intermediate growth levels were obtained. For
instance, in the first ascus, ascus A, resulting in four haploid
strains A1, A2, A3 and A4, a 2:2 segregation of the arabinose
growth phenotype is obtained. In some other asci, the segregation
between low and high growth levels obtained does not follow a 2:2
pattern. For instance, in ascus B, one high level growth phenotype
strain is obtained, one with an intermediate level (value of 9),
and two haploids that have a low growth phenotype. Similar
observations can be made for the haploid strains derived from the
other asci.
[0456] 7.5 SNP Detection in the Haploid Offspring using Hi-Res
[0457] 96-Well microplates containing YEP-medium supplemented with
2% glucose were inoculated with colony material from the
masterplates (section 7.4). Cells were allowed to grow in an Infors
shaker at 30.degree. C., 550 rpm and 80% humidity for 2 days. As
controls, strain BIE104A2P1 and BIE201 were included.
[0458] Chromosomal DNA was isolated using the above protocol in a
downscaled fashion. The chromosomal DNA served as a template for
Hi-Res analysis as described in section 5.2. The Hi-Res analysis
allowed the identification of the SNPs in each haploid segregant
from the cross BIE201 (mat.alpha.) X BIE104A2P1 (matA). Likewise,
the presence of the amplified regions on chromosome VII was
determined according to the methods described in section 6.2. For
each haploid segregant, the genotype with respect to the SNPs and
amplification was determined. The results are presented in table
9.
TABLE-US-00017 TABLE 9 Overview of the presence of the SNPs and the
amplification in the haploid derivatives of the cross BIE104A2P1
.times. BIE201. As controls, BIE104A2P1 and BIE201 were included.
Haploid strain   YJR154w   SSY1   CEP3   GAL80   Amplification
A1               WT        WT     WT     SNP     +
A2               SNP       SNP    SNP    WT      -
A3               WT        WT     WT     WT      -
A4               WT        SNP    WT     SNP     +
B1               SNP       WT     SNP    SNP     -
B2               WT        WT     WT     SNP     +
B3               WT        SNP    SNP    WT      +
B4               SNP       SNP    WT     WT      -
BIE201           SNP       SNP    SNP    SNP     +
BIE104A2P1a      WT        WT     WT     WT      -
C1               SNP       SNP    SNP    WT      -
C2               WT        WT     SNP    SNP     -
C3               WT        WT     WT     SNP     +
C4               SNP       SNP    WT     WT      +
D1               WT        SNP    SNP    WT      +
D2               SNP       SNP    WT     SNP     -
D3               SNP       WT     SNP    SNP     -
D4               WT        WT     WT     WT      -
E1               WT        SNP    WT     WT      +
E2               WT        WT     SNP    SNP     -
E3               SNP       SNP    WT     SNP     -
E4               SNP       WT     SNP    WT      +
F1               SNP       WT     WT     WT      -
F2               WT        WT     SNP    SNP     -
F3               WT        SNP    SNP    WT      -
F4               SNP       SNP    WT     SNP     -
G1               SNP       SNP    WT     SNP     -
G2               WT        WT     WT     WT      -
G3               WT        SNP    SNP    WT      +
G4               SNP       WT     SNP    SNP     +
[0459] In most asci, a 2:2 segregation of the SNPs and
amplification is observed. There are some exceptions to this,
which may be caused by e.g. meiotic gene conversion.
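The segregation check described above can be sketched by counting, per ascus and per marker, how many of the four spores carry the genomic change. The genotypes below are copied from table 9 for asci A and B only; the data structure and function name are illustrative choices.

```python
MARKERS = ("YJR154w", "SSY1", "CEP3", "GAL80", "Amplification")

# Genotypes from table 9: True = SNP (or amplification) present.
ASCI = {
    "A": [
        (False, False, False, True, True),     # A1
        (True, True, True, False, False),      # A2
        (False, False, False, False, False),   # A3
        (False, True, False, True, True),      # A4
    ],
    "B": [
        (True, False, True, True, False),      # B1
        (False, False, False, True, True),     # B2
        (False, True, True, False, True),      # B3
        (True, True, False, False, False),     # B4
    ],
}


def segregation(tetrad):
    """Return {marker: 'n:m'}; n spores carry the change, m do not."""
    result = {}
    for idx, marker in enumerate(MARKERS):
        present = sum(spore[idx] for spore in tetrad)
        result[marker] = f"{present}:{len(tetrad) - present}"
    return result


for name, tetrad in ASCI.items():
    print(name, segregation(tetrad))
```

In ascus B every marker segregates 2:2, whereas in ascus A the YJR154w and CEP3 SNPs are each found in only one spore, which is an example of the exceptions mentioned above.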
[0460] 7.6 Analysis of these Datasets
[0461] Combining the datasets of sections 7.4 and 7.5 (tables 8 and
9 respectively) yields the following table, table 10. In table 10,
however, the results have been sorted from high growth to low
growth on arabinose.
TABLE-US-00018 TABLE 10 Overview of the SNPs, the amplification and
the growth phenotype of haploid derivatives of the cross BIE104A2P1
.times. BIE201, and the respective parent strains.
Strain        YJR154w   SSY1   CEP3   GAL80   Amplification   Growth
G4            SNP       WT     SNP    SNP     +               32
B2            WT        WT     WT     SNP     +               29
A1            WT        WT     WT     SNP     +               27
A4            WT        SNP    WT     SNP     +               26
BIE201        SNP       SNP    SNP    SNP     +               25
C3            WT        WT     WT     SNP     +               25
E1            WT        SNP    WT     WT      +               18
G3            WT        SNP    SNP    WT      +               17
D1            WT        SNP    SNP    WT      +               17
D4            WT        WT     WT     WT      -               15
C4            SNP       SNP    WT     WT      +               12
D3            SNP       WT     SNP    SNP     -               11
C2            WT        WT     SNP    SNP     -               11
E4            SNP       WT     SNP    WT      +               10
F3            WT        SNP    SNP    WT      -               10
E3            SNP       SNP    WT     SNP     -               9
G2            WT        WT     WT     WT      -               9
B3            WT        SNP    SNP    WT      +               9
G1            SNP       SNP    WT     SNP     -               9
C1            SNP       SNP    SNP    WT      -               9
F1            SNP       WT     WT     WT      -               9
F2            WT        WT     SNP    SNP     -               8
D2            SNP       SNP    WT     SNP     -               8
F4            SNP       SNP    WT     SNP     -               7
A2            SNP       SNP    SNP    WT      -               7
B1            SNP       WT     SNP    SNP     -               6
E2            WT        WT     SNP    SNP     -               6
BIE104A2P1a   WT        WT     WT     WT      -               5
A3            WT        WT     WT     WT      -               5
B4            SNP       SNP    WT     WT      -               5
[0462] The results of table 10 strongly suggest that the
amplification is the key event determining the ability to grow on
arabinose at a relatively high growth rate. Most of the strains
having the amplification are located in the top 9 of table 10.
Two-thirds of these strains also have a SNP in the GAL80 gene,
suggesting an interaction between the presence of the SNP in the
GAL80 gene and the presence of the amplification.
[0463] In order to determine, statistically, which of the
factors are relevant for high growth and whether there are
synergistic effects, ANOVA analysis was applied. Though the design
is not balanced, based on the statistical testing of the data, it
is clear that the presence of the amplification (p<<0.01) has
a positive effect on the growth. The results also reveal that a
strong interaction between GAL80 SNP and the presence of the
amplification (p<<0.01) exists while the other SNPs have no
significant effect (p>0.01).
[0464] A median growth of 8.4 is estimated in case of absence of
the amplification, while in the presence of the amplification, the
median growth is 17.6. A median growth of 8.7 is estimated in case
of absence of both the GAL80 SNP and the amplification, while in
case both are present, the median growth is 26.8.
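As a rough cross-check of the grouping behind these estimates, the growth values of table 10 can be grouped by genotype and summarized with a plain sample median. This is only a sketch: it uses raw medians over all 30 strains of table 10 (including the two parent strains), not the ANOVA-based estimates quoted above, so the values are not expected to coincide exactly with those in the text.

```python
from statistics import median

# (strain, GAL80 SNP present, amplification present, growth), from table 10.
TABLE_10 = [
    ("G4", True, True, 32), ("B2", True, True, 29), ("A1", True, True, 27),
    ("A4", True, True, 26), ("BIE201", True, True, 25), ("C3", True, True, 25),
    ("E1", False, True, 18), ("G3", False, True, 17), ("D1", False, True, 17),
    ("D4", False, False, 15), ("C4", False, True, 12), ("D3", True, False, 11),
    ("C2", True, False, 11), ("E4", False, True, 10), ("F3", False, False, 10),
    ("E3", True, False, 9), ("G2", False, False, 9), ("B3", False, True, 9),
    ("G1", True, False, 9), ("C1", False, False, 9), ("F1", False, False, 9),
    ("F2", True, False, 8), ("D2", True, False, 8), ("F4", True, False, 7),
    ("A2", False, False, 7), ("B1", True, False, 6), ("E2", True, False, 6),
    ("BIE104A2P1a", False, False, 5), ("A3", False, False, 5),
    ("B4", False, False, 5),
]


def median_growth(rows, amplification=None, gal80_snp=None):
    """Median growth of the strains matching the requested genotype.

    A value of None for a factor means that the factor is not filtered on.
    """
    selected = [
        growth for _, snp, amp, growth in rows
        if (amplification is None or amp == amplification)
        and (gal80_snp is None or snp == gal80_snp)
    ]
    return median(selected)


print(median_growth(TABLE_10, amplification=True))
print(median_growth(TABLE_10, amplification=False))
print(median_growth(TABLE_10, amplification=True, gal80_snp=True))
print(median_growth(TABLE_10, amplification=False, gal80_snp=False))
```

With the table 10 data, the raw medians are 21.5 with and 8.5 without the amplification, and 26.5 when both the GAL80 SNP and the amplification are present against 9 when both are absent; the ordering agrees with the estimates given in the text.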
[0465] Also, the interaction of the presence of the CEP3 SNP and
the presence of the amplification appears to have a synergistic
effect, although to a lesser extent than the interaction between
the presence of the GAL80 SNP and the amplification.
[0466] In conclusion, the effects and the significance of effects
on growth due to the presence of SNPs and/or the amplification
could be determined. The amplification has a significant effect on
the growth. This effect is increased through combination of the
amplification and the GAL80 SNP. A minor interaction effect was
detected for the combination of amplification and the CEP3 SNP and
the combination of amplification, the GAL80 SNP and the CEP3
SNP.
Example 8
Deletion of GAL80 Leads to an Even Better Arabinose Conversion
[0467] In Example 7 it was shown that the identified SNP in the
GAL80 gene has a positive additive effect on the growth on
arabinose, if the amplification of a part of chromosome VII is also
present.
[0468] GAL80 encodes a transcriptional repressor involved in
transcriptional regulation in response to galactose (Timson D J, et
al. (2002) Biochem J 363(Pt 3):515-20). In conjunction with Gal4p
and Gal3p, Gal80p coordinately regulates the expression of genes
containing a GAL upstream activation site in their promoter
(UAS-GAL), which includes the GAL metabolic genes GAL1, GAL10,
GAL2, and GAL7 (reviewed in Lohr D, et al. (1995) FASEB J
9(9):777-87). Cells null for gal80 constitutively express GAL
genes, even in non-inducing media (Torchia T E, et al. (1984) Mol
Cell Biol 4(8):1521-7).
[0469] The hypothesis is that the SNP that was identified in the
GAL80 gene influences the interaction between Gal80p, Gal3p and
Gal4p. Hence, the expression of the galactose metabolic genes,
including GAL2 encoding galactose permease, will be changed as well
as compared to a yeast cell with a wild type GAL80 allele. Gal2p
(galactose permease) is the main sugar transporter for arabinose
(Kou et al (1970) J Bacteriol. 103(3):671-678; Becker and Boles
(2003) Appl Environ Microbiol. 69(7): 4144-4150).
[0470] Apparently, the SNP in the GAL80 gene has a positive effect
on the ability to convert L-arabinose. In order to investigate
whether the arabinose growth phenotype could further be improved,
the coding sequence of the GAL80 gene was deleted in its entirety,
using a PCR-mediated gene replacement strategy.
[0471] 8.1 Disruption of the GAL80 Gene
[0472] Primers of SEQ ID NO 58 and 59 (the forward and reverse
primers respectively) were used for amplification of the
kanMX-marker from plasmid p427-TEF (Dualsystems Biotech, Schlieren,
Switzerland). The flanks of the primers are homologous to the
5'-region and 3'-region of the GAL80 gene. Upon homologous
recombination, the ORF of the GAL80 gene will be replaced by the
kanMX marker, similar to the approach described by Wach (Wach et al
(1994) Yeast 10, 1793-1808). The obtained fragment is designated as
the GAL80::kanMX fragment.
[0473] A yeast transformation of strain BIE252 was done with the
purified GAL80::kanMX fragment according to the protocol described
by Gietz and Woods (Gietz and Woods (2002), Methods in Enzymology
350: 87-96). The construction of strain BIE252 has been described in
EP10160622.6.
Strain BIE252 is a xylose and arabinose fermenting strain of S.
cerevisiae, which is a derivative of BIE201. Strain BIE252 also
contains the GAL80 SNP.
[0474] The transformed cells were plated on YEPD-agar containing
100 .mu.g/ml G418 for selection. The plates were incubated at
30.degree. C. until colonies were visible. Plasmid p427-TEF was
included as a positive control and yielded many colonies. MilliQ
(i.e. no DNA) was included as a negative control and yielded no
colonies. The GAL80::kanMX fragment yielded many colonies. Two
independent colonies were tested by Southern blotting in order to
verify the correct integration (data not shown). A colony with the
correct deletion of the GAL80 gene was designated
BIE252.DELTA.GAL80.
[0475] 8.2 Effect of GAL80 Gene Replacement on the Performance in
the BAM
[0476] A BAM (Biological Activity Monitor; Halotec B V, Veenendaal,
the Netherlands) experiment was performed. Single colony isolates
of strain BIE252 and strain BIE252.DELTA.GAL80 (a transformant in
which the ORF of the GAL80 gene was correctly replaced by the kanMX
marker) were used to inoculate Verduyn medium (Verduyn et al.,
Yeast 8:501-517, 1992) supplemented with 2% glucose. The
precultures were incubated for approximately 24 hours at 30.degree.
C. and 280 rpm. Cells were harvested and inoculated in a synthetic
model medium (Verduyn medium supplemented with 5% glucose, 5%
xylose, 3.5% arabinose, 1% galactose and 0.5% mannose, pH 4.2) at a
cell density of about 1 gram dry weight per kg of medium. CO.sub.2
production was monitored constantly. Sugar conversion and product
formation were analyzed by NMR. The data represent the residual
amount of sugars at the indicated time points (glucose, arabinose,
galactose, mannose and xylose in grams per litre) and the formation
of (by-)products (ethanol, glycerol, and the like). Growth was
monitored by following optical density of the culture at 600 nm. The
experiment ran for approximately 72 hours.
[0477] The graphs are displayed in FIG. 20 (BIE252) and FIG. 21
(BIE252.DELTA.GAL80).
[0478] The experiments clearly show that reference strain BIE252
converted glucose and mannose rapidly. After glucose depletion
(around 10 hours), the conversion of xylose and arabinose
commenced. Some galactose was already being fermented around the
10-hour time point, which might be due to the GAL80 SNP in this
strain, which would allow (partial) simultaneous utilisation of
glucose and galactose. At the end of the experiment, around 72
hours, almost all sugars were converted. An ethanol yield of 0.37
grams of ethanol per gram sugar was obtained.
[0479] Strain BIE252.DELTA.GAL80 exhibits faster sugar conversion
ability than strain BIE252. Also in case of this strain, mannose
and glucose are converted in the first hours of fermentation.
However, as opposed to strain BIE252, in this transformant there is
some co-consumption of glucose, galactose and mannose with
arabinose and especially xylose. In general, sugar consumption is
faster, leading to a more complete use of all available sugars.
This is also apparent from the CO.sub.2 evolution in time. In case
of BIE252, a first peak is observed, which is basically the CO.sub.2
formed from glucose and mannose. After reaching a minimum of just
above 10 ml/hr (FIG. 20) a second, more flat peak is observed. In
case of BIE252.DELTA.GAL80 however (FIG. 21), the second peak
appears as a tail of the first peak, due to an intensified co-use
of glucose, xylose, arabinose, mannose and galactose, as is
apparent from the sugar analysis by NMR. In the parent strain
BIE252, the use of the different sugars is more sequential. Hence,
the yield of strain BIE252.DELTA.GAL80 is higher at the end of the
experiment (72 h): 0.40 grams of ethanol per gram sugar.
[0480] In conclusion, the deletion of the ORF of the GAL80 gene
resulted in a further improved performance, as was tested in strain
BIE252.
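The two reported ethanol yields can be put in perspective with a short calculation. The sketch below is not part of the described experiment; it assumes the stoichiometric maximum for ethanol from a hexose (one glucose giving two ethanol and two CO.sub.2) as a reference value and uses standard molar masses.

```python
# Reported ethanol yields (g ethanol per g sugar) at about 72 hours.
yield_bie252 = 0.37
yield_bie252_dgal80 = 0.40

# Assumption: reference maximum from hexose stoichiometry,
# 1 glucose -> 2 ethanol + 2 CO2, with standard molar masses.
M_ETHANOL = 46.07    # g/mol
M_GLUCOSE = 180.16   # g/mol
theoretical_max = 2 * M_ETHANOL / M_GLUCOSE

# Relative improvement of the deletion strain over the parent strain.
relative_gain = (yield_bie252_dgal80 - yield_bie252) / yield_bie252


def fraction_of_max(observed_yield):
    """Observed yield as a fraction of the assumed stoichiometric maximum."""
    return observed_yield / theoretical_max


print(f"relative gain: {relative_gain:.1%}")
print(f"BIE252: {fraction_of_max(yield_bie252):.0%} of the assumed maximum")
print(f"deletion strain: {fraction_of_max(yield_bie252_dgal80):.0%} of the assumed maximum")
```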
Example 9
Adipic Acid Production in Strain BIE201
[0481] 9.1 Synthetic DNA Fragments Ordered at DNA2.0
[0482] Nine DNA fragments containing the nine open reading frames
involved in the adipic acid pathway (see European Patent
Application EP11160000.3 filed 28 Mar. 2011) and a S. cerevisiae
promoter and terminator for efficient expression were ordered
synthetically at DNA2.0 (Menlo Park, Calif. 94025, USA). In some
cases homology to an adjacent part of the adipic acid pathway was
added to the synthetic fragment for in vivo recombination of the
pathway after transformation to BIE201. DNA2.0 delivered the
synthetic fragments as cloned inserts in a standard cloning vector.
This resulted in the following plasmids (between brackets the
abbreviation), pADI141 (Adi21), pADI142 (Adi22), pADI143 (Adi23),
pADI199 (Adi8), pADI145 (Adi24), pADI146 (Adi25), pADI149 (SucC),
pADI150 (SucD) and pADI200 (Acdh67). Table 11 shows the genes
involved in the pathway, the used abbreviations, source, Uniprot
code and involvement in the pathway.
TABLE-US-00019 TABLE 11 Overview of the genes in the adipic acid pathway transformed to the BIE201 strain
Abbreviation | Name | Source | Uniprot code | Step in pathway
Adi21 | beta-ketoadipyl CoA thiolase (DcaF) | Acinetobacter sp. | Q6FBN0 | 1
Adi22 | beta-hydroxy-adipoyl dehydrogenase (DcaH) | Acinetobacter sp. | Q937T5 | 2
Adi23 | enoyl-CoA hydratase (DcaE) | Acinetobacter sp. | Q937T3 | 3
Adi8 | trans-2-enoyl-CoA-reductase | Candida tropicalis | Q8WZM3 | 4
Adi24 | acyl-CoA transferase (Dcal) (subunit A) | Acinetobacter sp. | Q937T0 | 5
Adi25 | acyl-CoA transferase (Dcal) (subunit B) | Acinetobacter sp. | Q937S9 | 5
Acdh67 | Acetylating Acetaldehyde dehydrogenase | Listeria innocua | Q92CP2 | Acetyl-CoA supply
SucC | Succinyl-CoA synthetase subunit A | E. coli | P0A836 | Succinyl-CoA supply
SucD | Succinyl-CoA synthetase subunit B | E. coli | P0AGE9 | Succinyl-CoA supply
[0483] 9.2 Preparation of PCR Fragments for Transformation to
BIE201
[0484] In vivo homologous recombination was used to assemble and
integrate the complete adipic acid pathway into BIE201. The
necessary homology for recombination of the complete pathway
(50-250 bp) was added during synthesis of the synthetic fragment or
by adding the sequence to the primers used for amplification of the
fragment. Primer sequences are listed in table 12.
TABLE-US-00020 TABLE 12 A list of all primer sequences used in the PCR-reactions to create the fragments for transformation to the BIE201 strain.
Primer | Short description
SEQ ID NO 60 | Forward primer for amplification of the INT1LF
SEQ ID NO 61 | Reverse primer for the amplification of INT1LF with a 50 bp flank overlapping Adi21 expression cassette
SEQ ID NO 62 | Forward primer for amplification of the Adi21 expression cassette with 50 bp flank INT1LF
SEQ ID NO 63 | Reverse primer for the amplification of the Adi21 expression cassette
SEQ ID NO 64 | Forward primer for the amplification of the Adi22 expression cassette
SEQ ID NO 65 | Reverse primer for the amplification of the Adi22 expression cassette
SEQ ID NO 66 | Forward primer for the amplification of the Adi23 expression cassette
SEQ ID NO 67 | Reverse primer for the amplification of the Adi23 expression cassette
SEQ ID NO 68 | Forward primer for the amplification of the kanMX marker from pUG7 with 50 bp flank overlapping with Adi23
SEQ ID NO 69 | Reverse primer for the amplification of the kanMX marker from pUG7 with 50 bp flank overlapping with Adi8
SEQ ID NO 70 | Forward primer for the amplification of the Adi8 expression cassette with 25 bp flank overlap with kanMX of pUG7
SEQ ID NO 71 | Reverse primer Adi8 expression cassette
SEQ ID NO 72 | Forward primer for the amplification of the Adi24 expression cassette
SEQ ID NO 73 | Reverse primer for the amplification of the Adi24 expression cassette
SEQ ID NO 74 | Forward primer for the amplification of the Adi25 expression cassette
SEQ ID NO 75 | Reverse primer for the amplification of the Adi25 expression cassette with 50 bp overlap with SucC
SEQ ID NO 76 | Forward primer for the amplification of the SucC with 50 bp overlap with Adi25
SEQ ID NO 77 | Reverse primer for the amplification of the SucC expression cassette
SEQ ID NO 78 | Forward primer for the amplification of the SucD expression cassette
SEQ ID NO 79 | Reverse primer for the amplification of the SucD expression cassette
SEQ ID NO 80 | Forward primer for the amplification of the acdh67 expression cassette
SEQ ID NO 81 | Reverse primer for the amplification of the acdh67 construct with 50 bp flank overlapping with INT1RF
SEQ ID NO 82 | Forward primer for the amplification of the INT1RF site on yeast genome
SEQ ID NO 83 | Reverse primer for the amplification of the INT1RF site on yeast genome
[0485] In total 12 fragments (see FIG. 22) were needed to integrate
the complete adipic acid pathway into the genome of BIE201, 9 PCR
fragments containing the gene expression cassettes belonging to the
adipic acid pathway (SEQ ID NO 84-92), one PCR fragment containing
the kanMX-marker conferring resistance to G418 (SEQ ID NO 93) and
finally the INT1LF (INTegration Left Flank) and INT1RF (INTegration
Right Flank) integration flanks (SEQ ID NO 94 and SEQ ID NO 95
respectively). All fragments were created with overlapping homology
to each neighboring fragment in the pathway and on the outside of
the pathway to the INT1LF and INT1RF for integration of the pathway
via a double crossover into the genome. The homologous
recombination event, complete assembly and integration of the
pathway, is shown in a drawing in FIG. 22. The created PCR
fragments used in the transformation are listed in table 13. The
sequences are included herein as SEQ ID NO 84 until and including
SEQ ID NO 95. Table 13 shows information on the used promoters and
terminators for the genes and the primers used in the PCR
amplification reactions to create the fragments for
transformation.
TABLE-US-00021 TABLE 13 Overview of DNA elements used for in vivo recombination/integration of the adipic acid pathway. The promoter-ORF-terminator fragments are referred to as the name of the ORF. The columns 5' and 3' homology indicate with which other fragment(s) homology is shared (see FIG. 22). The `plasmid name` column shows the name of the DNA2.0 plasmid containing the synthetic fragment.
ID# element | Promoter | ORF/element | terminator | Forward primer | Reverse primer | 5' homology with element | 3' homology with element | plasmid name | Fragment sequence
ADI21 | pTPI1 | ADI21 | tGND2 | SEQ ID NO 62 | SEQ ID NO 63 | INT1LF | ADI22 | pADI141 | SEQ ID NO 84
ADI22 | pFBA1 | ADI22 | tPMA1 | SEQ ID NO 64 | SEQ ID NO 65 | ADI21 | ADI23 | pADI142 | SEQ ID NO 85
ADI23 | pADH1 | ADI23 | tTDH1 | SEQ ID NO 66 | SEQ ID NO 67 | ADI22 | KANMX | pADI143 | SEQ ID NO 86
ADI8 | pENO1 | ADI8 | tPDC1 | SEQ ID NO 70 | SEQ ID NO 71 | KANMX | ADI24 | pADI199 | SEQ ID NO 87
ADI24 | pTDH1 | ADI24 | tADH2 | SEQ ID NO 72 | SEQ ID NO 73 | ADI8 | ADI25 | pADI145 | SEQ ID NO 88
ADI25 | pENO2 | ADI25 | tGPM1 | SEQ ID NO 74 | SEQ ID NO 75 | ADI24 | SUCC | pADI146 | SEQ ID NO 89
SUCC | pPDC1 | SUCC | tGND2 | SEQ ID NO 76 | SEQ ID NO 77 | ADI25 | SUCD | pADI149 | SEQ ID NO 90
SUCD | pGPM1 | SUCD | tADH1 | SEQ ID NO 78 | SEQ ID NO 79 | SUCC | ACDH67 | pADI150 | SEQ ID NO 91
A67 | pOYE2 | ACDH67 | tTPI1 | SEQ ID NO 80 | SEQ ID NO 81 | SUCD | INT1RF | pADI200 | SEQ ID NO 92
INT1LF | -- | INT1LF | -- | SEQ ID NO 60 | SEQ ID NO 61 | -- | ADI21 | -- | SEQ ID NO 94
INT1RF | -- | INT1RF | -- | SEQ ID NO 82 | SEQ ID NO 83 | ACDH67 | -- | -- | SEQ ID NO 95
KANMX | -- | KANMX | -- | SEQ ID NO 68 | SEQ ID NO 69 | ADI23 | ADI8 | pUG7 | SEQ ID NO 93
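The homology relationships of Table 13 can be checked for internal consistency with a small script. The sketch below is an illustration only: the dictionary is transcribed from the 5' and 3' homology columns of Table 13, the function name is hypothetical, and the row labelled A67 is entered under its ORF name ACDH67.

```python
# 5' and 3' homology partners per fragment, transcribed from Table 13.
# None marks an outer end (given as "--" in the table).
HOMOLOGY = {
    "INT1LF": (None, "ADI21"),
    "ADI21": ("INT1LF", "ADI22"),
    "ADI22": ("ADI21", "ADI23"),
    "ADI23": ("ADI22", "KANMX"),
    "KANMX": ("ADI23", "ADI8"),
    "ADI8": ("KANMX", "ADI24"),
    "ADI24": ("ADI8", "ADI25"),
    "ADI25": ("ADI24", "SUCC"),
    "SUCC": ("ADI25", "SUCD"),
    "SUCD": ("SUCC", "ACDH67"),
    "ACDH67": ("SUCD", "INT1RF"),
    "INT1RF": ("ACDH67", None),
}


def assembly_order(homology):
    """Follow the 3' links from the left end and return the fragment order."""
    starts = [name for name, (five, _) in homology.items() if five is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one fragment without 5' homology")
    order, current = [], starts[0]
    while current is not None:
        if current in order:
            raise ValueError(f"cycle detected at {current}")
        order.append(current)
        nxt = homology[current][1]
        # The 3' partner must list the current fragment as its 5' partner.
        if nxt is not None and homology[nxt][0] != current:
            raise ValueError(f"{current} -> {nxt} is not reciprocal")
        current = nxt
    if len(order) != len(homology):
        raise ValueError("some fragments are not part of the chain")
    return order


order = assembly_order(HOMOLOGY)
print(" - ".join(order))
```

With the table as transcribed, the twelve fragments form a single linear chain from INT1LF to INT1RF, with the kanMX marker between ADI23 and ADI8.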
[0486] All PCR reactions were performed with Phusion.RTM.
polymerase (Finnzymes) according to the manual. The plasmids
ordered at DNA2.0 were used as template for amplifying the 9 adipic
acid pathway genes. The kanMX-marker was amplified from a plasmid
pUG7 carrying the marker sequence. pUG7 was constructed as follows:
the loxP-sites of plasmid pUG6 (Guldener, U. et al (1996) Nucleic
Acids Research 24: 2519-2524) were replaced in two steps by cloning
linkers containing the modified loxP-sites lox66 and lox71 (Araki
et al (1997) Nucleic Acids Research 25: 868-872). Restriction
analysis and sequencing were done to confirm correct replacement.
[0487] The INT1LF and INT1RF (the left and right flanks,
respectively) for integration at the "INT1 locus" were amplified
using chromosomal DNA isolated from BIE104 as a template.
[0488] Size of the PCR fragments was checked with standard agarose
electrophoresis techniques. PCR amplified DNA fragments were
purified and concentrated with the PCR purification kit from
Qiagen, according to the manual. DNA concentration was measured
using the Nanodrop from Thermo Scientific (A260/A280
absorbance).
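As an illustration of how absorbance readings translate into pipetting volumes, the following sketch converts an A260 value into a double-stranded DNA concentration and the volume holding a given mass. The conversion factor of 50 ng/.mu.l per A260 unit (10 mm path length) is the commonly used value for double-stranded DNA and is an assumption here, as are the example readings and function names; the text does not give the actual measurements.

```python
# Assumption: common conversion factor for double-stranded DNA,
# 1 A260 unit corresponds to 50 ng/ul at a 10 mm path length.
DSDNA_NG_PER_UL_PER_A260 = 50.0


def dsdna_concentration(a260):
    """Concentration in ng/ul estimated from the absorbance at 260 nm."""
    return a260 * DSDNA_NG_PER_UL_PER_A260


def purity_ratio(a260, a280):
    """A260/A280 ratio, often used as a rough indicator of protein carry-over."""
    return a260 / a280


def volume_for_mass(mass_ng, conc_ng_per_ul):
    """Volume in ul that contains the requested mass of DNA."""
    return mass_ng / conc_ng_per_ul


# Hypothetical readings for one purified PCR fragment.
a260, a280 = 4.0, 2.2
conc = dsdna_concentration(a260)
volume_1ug = volume_for_mass(1000.0, conc)   # volume holding 1 ug of DNA
print(conc, round(purity_ratio(a260, a280), 2), volume_1ug)
```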
[0489] 9.3. Yeast Transformation
[0490] Transformation of S. cerevisiae was done as described by
Gietz and Woods (2002, Methods in Enzymology 350: 87-96). BIE201
was transformed with 1 .mu.g of each of the 12 amplified and
purified PCR fragments. Transformation mixtures were plated on
YPD-agar (per liter: 10 grams of yeast extract, 20 grams of
peptone, 20 grams of dextrose, 20 grams of agar) containing
100 .mu.g G418 (Sigma Aldrich) per ml. After two to four days,
colonies appeared on the plates, whereas the negative control (i.e.
no addition of DNA in the transformation experiment) resulted in
blank YPD/G418-plates. From the transformation plate single
colonies were transferred to new YPD-agar plates containing 100
.mu.g G418 per ml. The plates were incubated 2 days at 30.degree.
C.
[0491] 9.4 Adipic acid Production on Arabinose
[0492] Single colonies of 4 transformants (strains 1, 2, 3 and 4)
and BIE201 as a control strain were inoculated in duplicate in a half
deepwell MTP (microplate) containing 200 .mu.l Verduyn medium with
2% arabinose and 0.05% glucose per well. The MTP was incubated 48
hours at 30.degree. C., 550 rpm and 80% humidity in an Infors
shaker for microplates. After 48 hours incubation 40 .mu.l of each
culture was transferred to two 24-well plates containing 2.5 ml
Verduyn medium with 2% arabinose per well. The 24-well plates were
covered with a standard MTP lid and incubated for 24 hours at
30.degree. C., 550 rpm and 80% humidity. After the 24 hours
incubation the 24-well plates were centrifuged for 10 minutes in
a Heraeus centrifuge at 2750 g. The supernatant was removed and to
each well containing cell pellet, 4.5 ml fresh Verduyn medium with
2% arabinose was added. The cell pellet was re-suspended with a
pipette. For one plate the standard MTP lid was replaced by an
airpore sheet (Qiagen) to improve aeration. For the second 24-well
plate it was replaced by a BugStopper.TM. Capmat (Whatman) which
creates a micro-aerobic environment. The 24-well plates were
incubated in the Infors Microtron incubator for 72 hours at
30.degree. C., 350 rpm and 80% humidity. After incubation the
plates were centrifuged for 10 minutes at 2750 g in a Heraeus
Centrifuge. Adipic acid concentrations were measured in the
supernatant with LC-MS. Results are shown in table 14.
TABLE-US-00022 TABLE 14 Resulting adipic acid concentrations in supernatant produced by the BIE201 transformants after growth on arabinose.
Strain | Used lid | Adipic acid concentration (mg/l)
BIE201 | Airpore sheets | <0.2
BIE201 | Airpore sheets | <0.2
Strain 2 | Airpore sheets | 1.4
Strain 2 | Airpore sheets | 1.4
Strain 3 | Airpore sheets | 1.2
Strain 3 | Airpore sheets | 1.3
Strain 4 | Airpore sheets | 1.6
Strain 4 | Airpore sheets | 2.0
BIE201 | Bugstopper | <0.2
BIE201 | Bugstopper | <0.2
Strain 2 | Bugstopper | 3.0
Strain 2 | Bugstopper | 2.4
Strain 3 | Bugstopper | 1.8
Strain 3 | Bugstopper | 2.2
Strain 4 | Bugstopper | 2.5
Strain 4 | Bugstopper | 2.8
[0493] Strains 2, 3 and 4 produce adipic acid on Verduyn medium with
arabinose as sole C-source. Under oxygen limited conditions, i.e.
with the bugstopper lids, a higher level is obtained as compared to
the plates with airpore sheets.
[0494] Reference strain BIE201 grows on arabinose but does not
produce adipic acid.
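The comparison between the two lid types can be made explicit by averaging the duplicate values of Table 14 per strain. The sketch below is illustrative; the function name is hypothetical, and the BIE201 entries, reported only as below 0.2 mg/l, are left out rather than treated as numbers.

```python
from statistics import mean

# Duplicate adipic acid measurements (mg/l) transcribed from Table 14.
# BIE201 values are reported as "<0.2" and are therefore omitted.
ADIPIC = {
    ("Strain 2", "Airpore"): [1.4, 1.4],
    ("Strain 3", "Airpore"): [1.2, 1.3],
    ("Strain 4", "Airpore"): [1.6, 2.0],
    ("Strain 2", "Bugstopper"): [3.0, 2.4],
    ("Strain 3", "Bugstopper"): [1.8, 2.2],
    ("Strain 4", "Bugstopper"): [2.5, 2.8],
}


def lid_comparison(data):
    """Return {strain: (mean airpore, mean bugstopper, bugstopper/airpore)}."""
    result = {}
    for strain in sorted({name for name, _ in data}):
        air = mean(data[(strain, "Airpore")])
        bug = mean(data[(strain, "Bugstopper")])
        result[strain] = (air, bug, bug / air)
    return result


comparison = lid_comparison(ADIPIC)
for strain, (air, bug, ratio) in comparison.items():
    print(f"{strain}: airpore {air:.2f}, bugstopper {bug:.2f}, ratio {ratio:.2f}")
```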
[0495] 9.5 UPLC-MS/MS Analysis (ESI Negative Mode)
[0496] The samples were analysed with a column having the following
specifications "Waters Acquity UPLC HSS T3, 1.8 .mu.m, 100 mm*2.1
mm I.D.". Injection volume was 5 .mu.l using a full loop, the flow
through the column was 0.250 ml/min and the column temperature was
40.degree. C. Table 15 shows the gradient used for mobile phase A
and B. Mobile phase A contains 0.1% formic acid in water and Mobile
phase B contains 0.1% formic acid in acetonitrile.
TABLE-US-00023 TABLE 15 The gradient used during UPLC-MS/MS analysis of adipic acid concentrations in the supernatant.
Time (min.) | 0.0 | 5.0 | 6.5 | 7.0 | 10.0 | 10.5 | 15.0
% A | 100.0 | 85.0 | 85.0 | 20.0 | 20.0 | 100.0 | 100.0
% B | 0.0 | 15.0 | 15.0 | 80.0 | 80.0 | 0.0 | 0.0
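The gradient of Table 15 can be expressed as a function of time. The sketch below assumes that the mobile phase composition changes linearly between the listed time points, which is a common convention for gradient tables but is not stated in the text; the function names are hypothetical.

```python
from bisect import bisect_right

# Gradient time points (min) and percentage of mobile phase B from Table 15.
TIMES_MIN = [0.0, 5.0, 6.5, 7.0, 10.0, 10.5, 15.0]
PERCENT_B = [0.0, 15.0, 15.0, 80.0, 80.0, 0.0, 0.0]


def percent_b(t):
    """Percentage of mobile phase B at time t, assuming linear segments."""
    if not TIMES_MIN[0] <= t <= TIMES_MIN[-1]:
        raise ValueError("time outside the gradient programme")
    i = bisect_right(TIMES_MIN, t) - 1
    if i == len(TIMES_MIN) - 1:
        return PERCENT_B[-1]
    t0, t1 = TIMES_MIN[i], TIMES_MIN[i + 1]
    b0, b1 = PERCENT_B[i], PERCENT_B[i + 1]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


def percent_a(t):
    """Percentage of mobile phase A, the remainder of the binary mixture."""
    return 100.0 - percent_b(t)


for t in (0.0, 2.5, 6.75, 10.0, 15.0):
    print(t, percent_a(t), percent_b(t))
```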
[0497] FIG. 23 depicts an MRM chromatogram of a standard containing
10, 5 mg/L adipic acid and of a sample containing 3 mg/l adipic acid,
produced by strain 3 on arabinose with a Bugstopper.
Example 10
Succinic Acid Production
[0498] 10.1 Expression Constructs
[0499] Expression construct pGBS414PPK-3 comprising a
phosphoenolpyruvate carboxykinase PCKa (E.C. 4.1.1.49) from Actinobacillus
succinogenes, and glycosomal fumarate reductase FRDg (E.C. 1.3.1.6)
from Trypanosoma brucei, and an expression construct pGBS415FUM3
comprising a fumarase (E.C. 4.2.1.2.) from Rhizopus oryzae, and a
peroxisomal malate dehydrogenase MDH3 (E.C. 1.1.1.37) were made as
described previously in WO2009/065778 on p. 19-20, and 22-30 which
are herein incorporated by reference including the figures and sequence
listing.
[0500] Expression construct pGBS416ARAABD comprising the genes
araA, araB and araD, derived from Lactobacillus plantarum, was
constructed by cloning a PCR product, comprising the araABD
expression cassette from plasmid pPWT018, into plasmid pRS416. The
PCR fragment was generated using Phusion.RTM. DNA polymerase
(Finnzymes) and PCR primers defined herein as SEQ ID NO 96 and SEQ ID
NO 97. The PCR product was cut with the restriction enzymes SalI and
NotI, as was plasmid pRS416. After ligation and transformation of
E. coli TOP10, the correct recombinants were selected on basis of
restriction enzyme analysis. The physical map of plasmid
pGBS416ARAABD is set out in FIG. 24.
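Directional cloning with SalI and NotI requires that each enzyme cuts the PCR product only where intended. The sketch below shows one way to check this on a sequence; the recognition sequences GTCGAC (SalI) and GCGGCCGC (NotI) are the known sites of these enzymes, while the example amplicon and the function names are hypothetical. How the sites were introduced into the actual PCR product is not stated in the text.

```python
# Recognition sequences of the two enzymes. Both are palindromic, so
# scanning one strand is sufficient.
SITES = {"SalI": "GTCGAC", "NotI": "GCGGCCGC"}


def find_sites(seq, site):
    """Return the 0-based start positions of every occurrence of a site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def cuts_once_each(seq):
    """True if every enzyme in SITES has exactly one site in the sequence."""
    return all(len(find_sites(seq, site)) == 1 for site in SITES.values())


# Hypothetical amplicon with one SalI site and one NotI site near its ends.
amplicon = "AAAA" + "GTCGAC" + "ATGCATGCATGC" + "GCGGCCGC" + "TTTT"
print(find_sites(amplicon, SITES["SalI"]), find_sites(amplicon, SITES["NotI"]))
```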
[0501] 10.2 S. cerevisiae Strains
[0502] The plasmids pGBS414PPK-3 and pGBS415-FUM-3 were transformed
into S. cerevisiae strain CEN.PK113-6B (MATA ura3-52 leu2-112
trp1-289). In addition plasmid pGBS416ARAABD was transformed into
this yeast to create prototrophic yeast strains. The expression
vectors were transformed into yeast by electroporation. The
transformation mixtures were plated on Yeast Nitrogen Base (YNB)
w/o AA (Difco)+2% glucose. One such transformant was called
SUC595.
[0503] As a control, strain CEN.PK113-6B was transformed with
plasmid pGBS416ARAABD only. One such transformant was called
SUC600.
[0504] Strains were subjected to adaptive evolution (see Example 2,
section 2.1) for growth on arabinose as sole carbon source. In
Example 2, YNB-medium containing arabinose was used, while in the
present Example, Verduyn medium with 2% arabinose was used.
[0505] Single colony isolates from the adaptive evolution
shake flasks were characterized for their ability to grow on
arabinose as sole carbon source. SUC689, a derivative of SUC595
through adaptive evolution, has a growth rate of 0.1 h.sup.-1 on
arabinose as sole carbon source. SUC694, a derivative of SUC600
through adaptive evolution, has a growth rate of 0.09 h.sup.-1 on
arabinose as sole carbon source.
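For reference, the reported specific growth rates can be converted into doubling times with the standard relation for exponential growth, doubling time = ln(2) / growth rate. The sketch below applies this relation; the function name is hypothetical.

```python
import math


def doubling_time(mu_per_hour):
    """Doubling time in hours for a specific growth rate given in 1/h."""
    return math.log(2) / mu_per_hour


# Growth rates on arabinose as reported for the two evolved strains.
for strain, mu in (("SUC689", 0.1), ("SUC694", 0.09)):
    print(f"{strain}: {doubling_time(mu):.1f} h per doubling")
```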
[0506] 10.3 Growth Experiments and Succinic Acid Production
[0507] Single colony isolates of transformants SUC689 and SUC694
were inoculated in 96 wells microplates containing YNB (Difco), 4%
galactose and 2% agar. Four independent colonies were inoculated
per strain. After growth for 2 days at 30.degree. C., with the aid
of a pin tool, colony material was transferred to a 96 wells
microplate containing 200 .mu.l pre-culture medium consisting of
Verduyn medium (Verduyn et al., 1992, Yeast. July; 8(7):501-17)
comprising 4% galactose (w/v) and grown under aerobic conditions in
an Infors shaking incubator at 30.degree. C., 550 rpm and 80%
humidity. After approximately 48 hours, cells were transferred in
duplicate to 24 wells microplates, containing 2.5 ml fresh Verduyn
medium supplemented with 4% galactose. After 72 hours of incubation
at 30.degree. C., the plates were spun down in a microplate
centrifuge, in order to separate the cells from the medium. The
supernatant was discarded. The cells were resuspended in 4 ml
Verduyn medium comprising 8% arabinose. At two time intervals, 48
hours (microplate 1) and 72 hours (microplate 2), the incubation
was stopped by spinning down the cells. The supernatant was used to
measure succinic acid levels by NMR as described in section
10.4.
[0508] 10.4 NMR Analysis
[0509] NMR was performed for the determination of organic acids and
sugars in broth samples.
[0510] The results are presented in tables 16 and 17.
TABLE-US-00024 TABLE 16 Results of the NMR analysis at time point 48 hours.
Strain | Arabinose | Malic acid | Glycerol | Succinic acid | Ethanol
SUC689 | 18.5 | 0.4 | 3.3 | 0.7 | 8.4
SUC689 | 14.5 | 0.4 | 4.3 | 0.8 | 10.0
SUC689 | 16.6 | 0.4 | 4.3 | 0.8 | 9.7
SUC689 | 14.9 | 0.4 | 4.1 | 0.7 | 9.1
SUC694 | 0.7 | N.D. | N.D. | 0.2 | 18.8
SUC694 | 0.4 | N.D. | 0.0 | 0.2 | 18.5
SUC694 | 1.1 | N.D. | N.D. | 0.3 | 18.4
SUC694 | 0.7 | N.D. | N.D. | 0.2 | 17.8
All values are in grams per litre. N.D. means not detected.
TABLE-US-00025 TABLE 17 Results of the NMR analysis at time point 72 hours.
Strain | Arabinose | Malic acid | Glycerol | Succinic acid | Ethanol
SUC689 | 14.0 | 0.5 | 3.5 | 0.7 | 6.7
SUC689 | 11.2 | 0.5 | 4.3 | 0.8 | 6.8
SUC689 | 13.7 | 0.5 | 3.9 | 0.8 | 6.0
SUC689 | 10.3 | 0.5 | 3.9 | 0.7 | 7.5
SUC694 | 0.1 | N.D. | N.D. | 0.2 | 15.6
SUC694 | 0.1 | N.D. | N.D. | 0.2 | 15.2
SUC694 | 0.2 | N.D. | N.D. | 0.2 | 15.6
SUC694 | 0.3 | N.D. | N.D. | 0.3 | 13.6
All values are in grams per litre. N.D. means not detected.
[0511] It is clear from tables 16 and 17 that the amount of
succinic acid is higher in case of strain SUC689, as compared to
strain SUC694. The latter converts almost all arabinose, and as
products mainly biomass and ethanol were formed. In case of strain
SUC689, less ethanol is formed, but a significantly higher amount
of succinic acid, 3 to 4 times higher as compared to SUC694.
Succinic acid yields were calculated and are shown in the table
below.
TABLE-US-00026 TABLE 18 Succinic acid yields on arabinose as a carbon source.
Strain | Average succinic acid yield (gram succinic acid per gram arabinose) at 48 hours | Average succinic acid yield (gram succinic acid per gram arabinose) at 72 hours
SUC689 | 0.012 | 0.011
SUC694 | 0.003 | 0.003
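The text does not state how the yields of Table 18 were calculated. One calculation that reproduces the rounded values is sketched below, under two assumptions: that 8% arabinose corresponds to 80 grams per litre at the start of the incubation, and that the yield is the succinic acid concentration divided by the arabinose consumed, averaged over the four replicates. The variable and function names are hypothetical.

```python
from statistics import mean

# Assumption: 8% arabinose is read as 80 g/l at the start of the incubation.
INITIAL_ARABINOSE = 80.0

# (residual arabinose, succinic acid) in g/l, transcribed from Tables 16 and 17.
DATA = {
    ("SUC689", 48): [(18.5, 0.7), (14.5, 0.8), (16.6, 0.8), (14.9, 0.7)],
    ("SUC694", 48): [(0.7, 0.2), (0.4, 0.2), (1.1, 0.3), (0.7, 0.2)],
    ("SUC689", 72): [(14.0, 0.7), (11.2, 0.8), (13.7, 0.8), (10.3, 0.7)],
    ("SUC694", 72): [(0.1, 0.2), (0.1, 0.2), (0.2, 0.2), (0.3, 0.3)],
}


def average_yield(replicates, initial=INITIAL_ARABINOSE):
    """Mean succinic acid formed per gram of arabinose consumed."""
    return mean(succ / (initial - residual) for residual, succ in replicates)


yields = {key: round(average_yield(reps), 3) for key, reps in DATA.items()}
for (strain, hours), value in sorted(yields.items()):
    print(f"{strain} at {hours} h: {value} g/g")
```

Under these assumptions the rounded results agree with the four values of Table 18.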
[0512] In conclusion, succinic acid was produced from arabinose in
strain SUC689, at a level significantly higher than in strain SUC694,
the strain not expressing the succinic acid pathway.
Example 11
Introduction of Extra Copies of the araA, araB and araD-Genes
[0513] 11.1 Amplification of the araABD-Cassette
[0514] In order to introduce extra copies of the araA, araB and
araD genes into the genome, a PCR reaction is performed using
Phusion.RTM. DNA polymerase (Finnzymes) with plasmid pPWT018 as a
template and the oligonucleotides with SEQ ID NO 98 and SEQ ID NO 99 as
primers. With these primers, the araABD-cassette is amplified.
The primer design is such that the flanks of the PCR
fragment are homologous to the consensus sequence of the
delta-sequences of the yeast transposon Ty-1. These sequences can
be obtained from NCBI (http://www.ncbi.nlm.nih.gov/) and aligned
using a suitable software package, e.g. Clone Manager
9 Professional Edition (Scientific & Educational Software,
Cary, USA).
[0515] The araABD-cassette does not contain a selectable marker
with which the integration into the genome can be selected for. In
order to estimate transformation frequency, a second control
transformation was done with the kanMX-marker. To this end, the
kanMX-cassette from plasmid p427TEF (Dualsystems Biotech) was
amplified in a PCR reaction using the primers corresponding to SEQ
ID NO 100 and SEQ ID NO 101.
[0516] 11.2 Transformation of BIE104A2P1
[0517] BIE104A2P1 is transformed according to the electroporation
protocol (as described above) with the fragments comprising either
30 .mu.g of the araABD-cassette (designated Ty1::araABD) or 10
.mu.g of the kanMX-cassette. The kanMX-transformation mixture is
plated on YPD-agar (per liter: 10 grams of yeast extract, 20 grams
of peptone, 20 grams of dextrose, 20 grams of agar)
containing 100 .mu.g G418 (Sigma Aldrich) per ml. After two to four
days, colonies appear on the plates, whereas the negative
control (i.e. no addition of DNA in the transformation experiment)
results in blank YPD/G418-plates. The transformation frequency
is higher than 600 colonies per .mu.g of kanMX-cassette.
[0518] The Ty1::araABD transformation mixture is used to inoculate
a shake flask containing 100 ml of Verduyn medium, supplemented
with 2% arabinose. As a control, the negative control of the
transformation (i.e. no addition of DNA in the transformation
experiment) is used. The shake flasks were incubated at 30.degree.
C. and 280 rpm in an orbital shaker. Growth is followed by
measuring the optical density at 600 nm on a regular basis.
[0519] After approximately 25 days, the optical density of the
Ty1::araABD shake flask increases, while the growth in the negative
control is still absent. At day 25, a flask containing fresh
Verduyn medium supplemented with 2% arabinose is inoculated from
the Ty1::araABD culture to a start optical density at 600 nm of
0.15. The culture starts to grow on arabinose immediately and
rapidly. Since it is likely that the culture consists of a mixture
of subcultures, thus consisting of cells with differences in copy
number of the Ty1::araABD cassette and in growth rate on arabinose,
cells are diluted in milliQ water and are plated on YPD-agar plates
in order to get single colony isolates. The single colony isolates
are tested for their ability to utilize different carbon
sources.
[0520] 11.3 Selection of Better Arabinose Converting Strains
[0521] In order to select a strain which has gained improved growth
on arabinose as a sole carbon source without losing its ability to
utilize the other important sugars (glucose and galactose), ten
single colony isolates of the adaptive evolution culture are
restreaked on YPD-agar. Subsequently, a preculture is done on
YPD-medium supplemented with 2% glucose. The ten cultures are
incubated overnight at 30.degree. C. and 280 rpm. Aliquots of
each culture are used to inoculate fresh Verduyn medium
supplemented with either 2% glucose, or 2% arabinose or 2%
galactose, at an initial optical density of 0.15. As controls,
strains BIE201, BIE104A2P1 and the mixed population (from which the
ten single colony isolates are retrieved) are included in the
experiment. Cells are grown at 30.degree. C. and 280 rpm in an
orbital shaker. Growth is assessed on basis of optical density
measurements at 600 nm.
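Growth rates can be derived from such optical density measurements by fitting a straight line to the natural logarithm of the optical density against time during exponential growth. The sketch below illustrates this with a least-squares slope; the measurement values are synthetic, generated for a culture starting at an optical density of 0.15 and growing at 0.1 per hour, and are not data from the experiment. The function name is hypothetical.

```python
import math


def specific_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) against time, in 1/h."""
    logs = [math.log(value) for value in od600]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    return sxy / sxx


# Synthetic example: start OD600 of 0.15, exponential growth at 0.1 per hour.
times = [0.0, 2.0, 4.0, 6.0, 8.0]
ods = [0.15 * math.exp(0.1 * t) for t in times]
rate = specific_growth_rate(times, ods)
print(f"estimated growth rate: {rate:.3f} 1/h")
```

Only points from the exponential phase should be included in such a fit, since lag and stationary phases flatten the slope.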
[0522] The results show that both the mixed culture and the
ten single colony isolates exhibit a higher final optical density
at 600 nm.
[0523] One colony (colony T) is selected on basis of its growth on
arabinose as sole carbon source. This colony, if inoculated in
Verduyn medium supplemented with 2% arabinose, shows a higher
growth rate than parent strain BIE104A2P1. Its growth rate is
comparable to the growth rate of strain BIE201.
[0524] Q-PCR is done on the chromosomal DNA of strains BIE201,
BIE104A2P1 and colony T. The copy number of the araABD cassette is
determined to be 1 in case of BIE104A2P1, and larger than 2 in case
of both colony T and BIE201.
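The text does not describe how the copy number was derived from the Q-PCR data. A frequently used approach is the comparative Ct method, sketched below under the assumptions that amplification efficiency is close to 100%, that a single-copy reference gene is measured alongside the araABD target, and that BIE104A2P1 with one copy serves as calibrator. All Ct values and names in the sketch are hypothetical.

```python
def relative_copy_number(ct_target, ct_reference,
                         ct_target_cal, ct_reference_cal,
                         calibrator_copies=1):
    """Copy number estimate by the comparative Ct method.

    Assumes a doubling of product per cycle (100% efficiency) and a
    calibrator strain with a known copy number of the target.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return calibrator_copies * 2 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: the target in the sample appears two cycles
# earlier than in the single-copy calibrator, at equal reference Ct.
copies = relative_copy_number(ct_target=20.0, ct_reference=22.0,
                              ct_target_cal=22.0, ct_reference_cal=22.0)
print(copies)
```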
Sequence CWU 1
1
101 1 18215 DNA Artificial Sequence synthetic plasmid 1 ggccaagatg
gccgatctgc atttttcata ataatcctcg gtactttcta caagatcaat 60taaattccaa
tcaaaaatcg tcttttgcaa gattttgaag tcacagtact tttcattttc
120aatgtcaaca gcgccccatt tgtattgtct tcctttaact ttttcgccct
tttcattaaa 180aatgtactca ttagatgcaa ttatactgaa tggatatttt
tgaaaaatat cttgtgttgc 240attcaaaact tcatcgccga aaaagaaaca
tacagggata tcttgtactc ttattatttc 300tctaacttgt gttttgaagt
ttttcaattc ctctttcgtt agcaaatctg atttagcaat 360aaccgggatt
aaattcactc tcttcgctaa ttttttcatt gttacgacgt ctaaagtatc
420aattccctta tttgaaggtc tcagaaagta caaacaacaa tggactctat
tatcaaccat 480ttttgtccta tcaggttgtt cttcttggaa aatgtacgat
cttatttctt catcaatata 540gtttctagac tgcagcccgg gatccgtcga
caagcttgtg gagaggtgac ttcatgaacc 600aagtgtctgt cgatatacaa
caaaaaggaa ccattttcat cttgatggac aacatgtgca 660tcaaaaacct
tatcgtaaag agttcttgga cccttggatg gagtgtaaac catgatttaa
720aacagcaaat aataaaaatc gatagcgaca aaaactgtca atttcaatat
tctttatatt 780tgttgactgc ttagatattt tgagaaaatt cagcggaaac
agcgtgatga gtgagttaag 840ttctgctgtt taaataagta ttcaactact
attgaagccg actcatgaag ccggttacgg 900acaaaaccgg gcaaatttcg
ccggtcccgg aattttcgtt tccgcaataa aagaaccgct 960catcatcata
gcgccagggt agtatactat agaaggtcag actaaactga gtcatctaga
1020gtaatgacgc cttagtagct tttacatctt cataagaaaa ggaaacttgt
agaatggcct 1080ggcgatttgt ttgctttctt gtgatgaaga aatttcgatg
cgattaaccg gcaaaatcag 1140taaaggtatt tcgcggaggc ggccttcaat
catcgaatac tacgtcttaa tatgatgtac 1200tgtggttcat attttcaagt
agtgttagta aatttgtata cgttcatgta agtgtgtatc 1260ttgagtgtct
gtatgggcgc ataaacgtaa gcgagacttc caaatggagc aaacgagaag
1320agatctttaa agtattatag aagagctggg caggaactat tatgacgtaa
agccttgacc 1380ataataaaga cgattctttg tccctctata caaacatctt
gcaaagatac caaatatttt 1440caaatcctac tcaataaaaa attaatgaat
aaattagtgt gtgtgcatta tatatattaa 1500aaattaagaa ttagactaaa
taaagtgttt ctaaaaaaat attaaagttg aaatgtgcgt 1560gttgtgaatt
gtgctctatt agaataatta tgacttgtgt gcgtttcata ttttaaaata
1620ggaaataacc aagaaagaaa aagtaccatc cagagaaacc aattatatca
aatcaaataa 1680aacaaccagc ttcggtgtgt gtgtgtgtgt gaagctaaga
gttgatgcca tttaatctaa 1740aaattttaag gtgtgtgtgt ggataaaata
ttagaatgac aattcgaatt gcgtacctta 1800gtcaaaaaat tagcctttta
attctgctgt aacccgtaca tgcccaaaat agggggcggg 1860ttacacagaa
tatataacat cgtaggtgtc tgggtgaaca gtttattcct ggcatccact
1920aaatataatg gagcccgctt tttaagctgg catccagaaa aaaaaagaat
cccagcacca 1980aaatattgtt ttcttcacca accatcagtt cataggtcca
ttctcttagc gcaactacag 2040agaacagggg cacaaacagg caaaaaacgg
gcacaacctc aatggagtga tgcaacctgc 2100ctggagtaaa tgatgacaca
aggcaattga cccacgcatg tatctatctc attttcttac 2160accttctatt
accttctgct ctctctgatt tggaaaaagc tgaaaaaaaa ggttgaaacc
2220agttccctga aattattccc ctacttgact aataagtata taaagacggt
aggtattgat 2280tgtaattctg taaatctatt tcttaaactt cttaaattct
acttttatag ttagtctttt 2340ttttagtttt aaaacaccaa gaacttagtt
tcgaataaac acacataaac aaacaaaatg 2400ttatcagtac ctgattatga
gttttggttt gttaccggtt cacaacacct ttatggtgaa 2460gaacaattga
agtctgttgc taaggatgcg caagatattg cggataaatt gaatgcaagc
2520ggcaagttac cttataaagt agtctttaag gatgttatga cgacggctga
aagtatcacc 2580aactttatga aagaagttaa ttacaatgat aaggtagccg
gtgttattac ttggatgcac 2640acattctcac cagctaagaa ctggattcgt
ggaactgaac tgttacaaaa accattatta 2700cacttagcaa cgcaatattt
gaataatatt ccatatgcag acattgactt tgattacatg 2760aaccttaacc
aaagtgccca tggcgaccgc gagtatgcct acattaacgc ccggttgcag
2820aaacataata agattgttta cggctattgg ggcgatgaag atgtgcaaga
gcagattgca 2880cgttgggaag acgtcgccgt agcgtacaat gagagcttta
aagttaaggt tgctcgcttt 2940ggcgacacaa tgcgtaatgt ggccgttact
gaaggtgaca aggttgaggc tcaaattaag 3000atgggctgga cagttgacta
ttatggtatc ggtgacttag ttgaagagat caataaggtt 3060tcggatgctg
atgttgataa ggaatacgct gacttggagt ctcggtatga aatggtccaa
3120ggtgataacg atgcggacac gtataaacat tcagttcggg ttcaattggc
acaatatctg 3180ggtattaagc ggttcttaga aagaggcggt tacacagcct
ttaccacgaa ctttgaagat 3240ctttggggga tggagcaatt acctggtcta
gcttcacaat tattaattcg tgatgggtat 3300ggttttggtg ctgaaggtga
ctggaagacg gctgctttag gacgggttat gaagattatg 3360tctcacaaca
agcaaaccgc ctttatggaa gactacacgt tagacttgcg tcatggtcat
3420gaagcgatct taggttcaca catgttggaa gttgatccgt ctatcgcaag
tgataaacca 3480cgggtcgaag ttcatccatt ggatattggg ggtaaagatg
atcctgctcg cctagtattt 3540actggttcag aaggtgaagc aattgatgtc
accgttgccg atttccgtga tgggttcaag 3600atgattagct acgcggtaga
tgcgaataag ccagaagccg aaacacctaa tttaccagtt 3660gctaagcaat
tatggacccc aaagatgggc ttaaagaaag gtgcactaga atggatgcaa
3720gctggtggtg gtcaccacac gatgctgtcc ttctcgttaa ctgaagaaca
aatggaagac 3780tatgcaacca tggttggcat gactaaggca ttcttaaagt
aagtgaattt actttaaatc 3840ttgcatttaa ataaattttc tttttatagc
tttatgactt agtttcaatt tatatactat 3900tttaatgaca ttttcgattc
attgattgaa agctttgtgt tttttcttga tgcgctattg 3960cattgttctt
gtctttttcg ccacatgtaa tatctgtagt agatacctga tacattgtgg
4020atgctgagtg aaattttagt taataatgga ggcgctctta ataattttgg
ggatattggc 4080tttttttttt aaagtttaca aatgaatttt ttccgccagg
atcgtacgcc gcggaaccgc 4140cagatattca ttacttgacg caaaagcgtt
tgaaataatg acgaaaaaga aggaagaaaa 4200aaaaagaaaa ataccgcttc
taggcgggtt atctactgat ccgagcttcc actaggatag 4260cacccaaaca
cctgcatatt tggacgacct ttacttacac caccaaaaac cactttcgcc
4320tctcccgccc ctgataacgt ccactaattg agcgattacc tgagcggtcc
tcttttgttt 4380gcagcatgag acttgcatac tgcaaatcgt aagtagcaac
gtctcaaggt caaaactgta 4440tggaaacctt gtcacctcac ttaattctag
ctagcctacc ctgcaagtca agaggtctcc 4500gtgattccta gccacctcaa
ggtatgcctc tccccggaaa ctgtggcctt ttctggcaca 4560catgatctcc
acgatttcaa catataaata gcttttgata atggcaatat taatcaaatt
4620tattttactt ctttcttgta acatctctct tgtaatccct tattccttct
agctattttt 4680cataaaaaac caagcaactg cttatcaaca cacaaacact
aaatcaaaat gaatttagtt 4740gaaacagccc aagcgattaa aactggcaaa
gtttctttag gaattgagct tggctcaact 4800cgaattaaag ccgttttgat
cacggacgat tttaatacga ttgcttcggg aagttacgtt 4860tgggaaaacc
aatttgttga tggtacttgg acttacgcac ttgaagatgt ctggaccgga
4920attcaacaaa gttatacgca attagcagca gatgtccgca gtaaatatca
catgagtttg 4980aagcatatca atgctattgg cattagtgcc atgatgcacg
gatacctagc atttgatcaa 5040caagcgaaat tattagttcc gtttcggact
tggcgtaata acattacggg gcaagcagca 5100gatgaattga ccgaattatt
tgatttcaac attccacaac ggtggagtat cgcacactta 5160taccaggcaa
tcttaaataa tgaagcgcac gttaaacagg tggacttcat aacaacgctg
5220gctggctatg taacctggaa attgtcgggt gagaaagttc taggaatcgg
tgatgcgtct 5280ggcgttttcc caattgatga aacgactgac acatacaatc
agacgatgtt aaccaagttt 5340agccaacttg acaaagttaa accgtattca
tgggatatcc ggcatatttt accgcgggtt 5400ttaccagcgg gagccattgc
tggaaagtta acggctgccg gggcgagctt acttgatcag 5460agcggcacgc
tcgacgctgg cagtgttatt gcaccgccag aaggggatgc tggaacagga
5520atggtcggta cgaacagcgt ccgtaaacgc acgggtaaca tctcggtggg
aacctcagca 5580ttttcgatga acgttctaga taaaccattg tctaaagtct
atcgcgatat tgatattgtt 5640atgacgccag atgggtcacc agttgcaatg
gtgcatgtta ataattgttc atcagatatt 5700aatgcgtggg caacgatttt
tcatgagttt gcagcccggt tgggaatgga attgaaaccg 5760gatcgattat
atgaaacgtt attcttggaa tcaactcgcg ctgatgcgga tgctggaggg
5820ttggctaatt atagttatca atccggtgag aatattacta agattcaagc
tggtcggccg 5880ctatttgtac ggacaccaaa cagtaaattt agtttaccga
actttatgtt gactcaatta 5940tatgcggcgt tcgcacccct ccaacttggt
atggatattc ttgttaacga agaacatgtt 6000caaacggacg ttatgattgc
acagggtgga ttgttccgaa cgccggtaat tggccaacaa 6060gtattggcca
acgcactgaa cattccgatt actgtaatga gtactgctgg tgaaggcggc
6120ccatggggga tggcagtgtt agccaacttt gcttgtcggc aaactgcaat
gaacctagaa 6180gatttcttag atcaagaagt ctttaaagag ccagaaagta
tgacgttgag tccagaaccg 6240gaacgggtgg ccggatatcg tgaatttatt
caacgttatc aagctggctt accagttgaa 6300gcagcggctg ggcaagcaat
caaatattag agcttttgat taagccttct agtccaaaaa 6360acacgttttt
ttgtcattta tttcattttc ttagaatagt ttagtttatt cattttatag
6420tcacgaatgt tttatgattc tatatagggt tgcaaacaag catttttcat
tttatgttaa 6480aacaatttca ggtttacctt ttattctgct tgtggtgacg
cgggtatccg cccgctcttt 6540tggtcaccca tgtatttaat tgcataaata
attcttaaaa gtggagctag tctatttcta 6600tttacatacc tctcatttct
catttcctcc actagtagag aattttgcca tcggacatgc 6660taccttacgc
ttatatctct cattggaata tcgttttctg attaaaacac ggaagtaaga
6720acttaattcg tttttcgttg aactatgttg tgccagcgta acattaaaaa
agagtgtaca 6780aggccacgtt ctgtcaccgt cagaaaaata tgtcaatgag
gcaagaaccg ggatggtaac 6840aaaaatcacg atctgggtgg gtgtgggtgt
attggattat aggaagccac gcgctcaacc 6900tggaattaca ggaagctggt
aattttttgg gtttgcaatc atcaccatct gcacgttgtt 6960ataatgtccc
gtgtctatat atatccattg acggtattct atttttttgc tattgaaatg
7020agcgtttttt gttactacaa ttggttttac agacggaatt ttccctattt
gtttcgtccc 7080atttttcctt ttctcattgt tctcatatct taaaaaggtc
ctttcttcat aatcaatgct 7140ttcttttact taatatttta cttgcattca
gtgaatttta atacatattc ctctagtctt 7200gcaaaatcga tttagaatca
agataccagc ctaaaaatgc tagaagcatt aaaacaagaa 7260gtttatgagg
ctaacatgca gcttccaaag ctgggcctgg ttacttttac ctggggcaat
7320gtctcgggca ttgaccggga aaaaggccta ttcgtgatca agccatctgg
tgttgattat 7380ggtgaattaa aaccaagcga tttagtcgtt gttaacttac
agggtgaagt ggttgaaggt 7440aaactaaatc cgtctagtga tacgccgact
catacggtgt tatataacgc ttttcctaat 7500attggcggaa ttgtccatac
tcattcgcca tgggcagttg cctatgcagc tgctcaaatg 7560gatgtgccag
ctatgaacac gacccatgct gatacgttct atggtgacgt gccggccgcg
7620gatgcgctga ctaaggaaga aattgaagca gattatgaag gcaacacggg
taaaaccatt 7680gtgaagacgt tccaagaacg gggcctcgat tatgaagctg
taccagcctc attagtcagc 7740cagcacggcc catttgcttg gggaccaacg
ccagctaaag ccgtttacaa tgctaaagtg 7800ttggaagtgg ttgccgaaga
agattatcat actgcgcaat tgacccgtgc aagtagcgaa 7860ttaccacaat
atttattaga taagcattat ttacgtaagc atggtgcaag tgcctattat
7920ggtcaaaata atgcgcattc taaggatcat gcagttcgca agtaaacaaa
tcgctcttaa 7980atatatacct aaagaacatt aaagctatat tataagcaaa
gatacgtaaa ttttgcttat 8040attattatac acatatcata tttctatatt
tttaagattt ggttatataa tgtacgtaat 8100gcaaaggaaa taaattttat
acattattga acagcgtcca agtaactaca ttatgtgcac 8160taatagttta
gcgtcgtgaa gactttattg tgtcgcgaaa agtaaaaatt ttaaaaatta
8220gagcaccttg aacttgcgaa aaaggttctc atcaactgtt taaaaacgcg
tgtcttctgt 8280gtttcagttc agggcttttc ggaggatgtg aatcgacggc
gtactgtcct tgggaacttt 8340gtctacgtat tttcacttcc tcagcgaatc
cagagactat cttgggaaat tcgacaggac 8400agtctgttga caaccgactc
ccttttgact tcataataaa aattcaatga cgcaaaagga 8460attttaggtt
tttattattt atttatttat ttctgttaat tgatcctttt ctttccacta
8520ccaacaacaa aaaagggggg aaaaagatgt ataatctaaa agacactaat
ctgctcttga 8580tatccttatt atgtaatgga ataactcata taaatgtaaa
atagaacttc aaattaatat 8640tataatgata gtcgaggtca gacacactta
taatacatta agtaaagaaa aaaaaatgtc 8700tgtcatcgag gtctcttttg
tgtcgctaac aaaacatcac taaatacgaa gacactttgc 8760atgggaagga
tgcagcaaat ggcaaactaa cgggccattg attggtttac ctcttctatt
8820tgtattacga ccagaaagaa cgaatggttt tcatcaatga ggtaggaaac
gacctaaata 8880taatgtagca tagataaaat ctttgtactg tatggttgca
atgccttctt gattagtatc 8940gaatttcctg aataattttg ttaatctcat
tagccaaact aacgcctcaa cgaatttatc 9000aaactttagt tcttttcctg
ttccatttct gtttataaac tcagcatatt ggtcaaatgt 9060tttctcgcta
acttcaaaag gtattagata tcctagttct tgaagtgagt tatgaaattc
9120gcttacagaa atggtgagcg atccgttgat atcattgtcc acataaactt
ttctccaact 9180tttcactctt ttgtataggg cgatgaattc tgcctggttg
acagtgccaa acctggaagc 9240accaaataaa tttatcagcg catctactga
tgatatacaa aaatgggagt tgtcgtcgtt 9300ttgtagtaag ttctgtagtt
cctcagctgt cagtcggttt ttgcccttta catcatggtt 9360atgaaatagc
tgtgtggcca cttgcatgtc tcgtacatct tctctgctat cgaacgaagc
9420aggtgcaact ttcttcaaga gttgtgcagg cactgcttga ttgtgaatta
ggggaggagg 9480agaggaagct atccgttgag cggaagtgtt caagttgtta
taatgggttg gcgctggagg 9540tataggcctg cctgctggtt tctgtgcgat
aacattatat ctaggatcca caggtgtttt 9600cgtatgtctt ggagaataac
tttggggaga accataggag tggtgaccgt tttctgctct 9660gtttttgtta
tattgagttt gtaagggaat tggagctgag tggactctag tgttgggagt
9720ttgtgcttga gtaaccggta ccacggctcc tcgctgcaga cctgcgagca
gggaaacgct 9780cccctcacag tcgcgttgaa ttgtccccac gccgcgcccc
tgtagagaaa tataaaaggt 9840taggatttgc cactgaggtt cttctttcat
atacttcctt ttaaaatctt gctaggatac 9900agttctcaca tcacatccga
acataaacaa ccatgggtaa ggaaaagact cacgtttcga 9960ggccgcgatt
aaattccaac atggatgctg atttatatgg gtataaatgg gctcgcgata
10020atgtcgggca atcaggtgcg acaatctatc gattgtatgg gaagcccgat
gcgccagagt 10080tgtttctgaa acatggcaaa ggtagcgttg ccaatgatgt
tacagatgag atggtcagac 10140taaactggct gacggaattt atgcctcttc
cgaccatcaa gcattttatc cgtactcctg 10200atgatgcatg gttactcacc
actgcgatcc ccggcaaaac agcattccag gtattagaag 10260aatatcctga
ttcaggtgaa aatattgttg atgcgctggc agtgttcctg cgccggttgc
10320attcgattcc tgtttgtaat tgtcctttta acagcgatcg cgtatttcgt
ctcgctcagg 10380cgcaatcacg aatgaataac ggtttggttg atgcgagtga
ttttgatgac gagcgtaatg 10440gctggcctgt tgaacaagtc tggaaagaaa
tgcataagct tttgccattc tcaccggatt 10500cagtcgtcac tcatggtgat
ttctcacttg ataaccttat ttttgacgag gggaaattaa 10560taggttgtat
tgatgttgga cgagtcggaa tcgcagaccg ataccaggat cttgccatcc
10620tatggaactg cctcggtgag ttttctcctt cattacagaa acggcttttt
caaaaatatg 10680gtattgataa tcctgatatg aataaattgc agtttcattt
gatgctcgat gagtttttct 10740aatcagtact gacaataaaa agattcttgt
tttcaagaac ttgtcatttg tatagttttt 10800ttatattgta gttgttctat
tttaatcaaa tgttagcgtg atttatattt tttttcgcct 10860cgacatcatc
tgcccagatg cgaagttaag tgcgcagaaa gtaatatcat gcgtcaatcg
10920tatgtgaatg ctggtcgcta tactgctgtc gattcgatac taacgccgcc
atccagggta 10980ccatcctttt gttgtttccg ggtgtacaat atggacttcc
tcttttctgg caaccaaacc 11040catacatcgg gattcctata ataccttcgt
tggtctccct aacatgtagg tggcggaggg 11100gagatataca atagaacaga
taccagacaa gacataatgg gctaaacaag actacaccaa 11160ttacactgcc
tcattgatgg tggtacataa cgaactaata ctgtagccct agacttgata
11220gccatcatca tatcgaagtt tcactaccct ttttccattt gccatctatt
gaagtaataa 11280taggcgcatg caacttcttt tctttttttt tcttttctct
ctcccccgtt gttgtctcac 11340catatccgca atgacaaaaa aaatgatgga
agacactaaa ggaaaaaatt aacgacaaag 11400acagcaccaa cagatgtcgt
tgttccagag ctgatgaggg gtatcttcga acacacgaaa 11460ctttttcctt
ccttcattca cgcacactac tctctaatga gcaacggtat acggccttcc
11520ttccagttac ttgaatttga aataaaaaaa gtttgccgct ttgctatcaa
gtataaatag 11580acctgcaatt attaatcttt tgtttcctcg tcattgttct
cgttcccttt cttccttgtt 11640tctttttctg cacaatattt caagctatac
caagcataca atcaactatc tcatatacaa 11700tgcctcaatc ctgggaagaa
ctggccgctg ataagcgcgc ccgcctcgca aaaaccatcc 11760ctgatgaatg
gaaagtccag acgctgcctg cggaagacag cgttattgat ttcccaaaga
11820aatcggggat cctttcagag gccgaactga agatcacaga ggcctccgct
gcagatcttg 11880tgtccaagct ggcggccgga gagttgacct cggtggaagt
tacgctagca ttctgtaaac 11940gggcagcaat cgcccagcag ttaacaaact
gcgcccacga gttcttccct gacgccgctc 12000tcgcgcaggc aagggaactc
gatgaatact acgcaaagca caagagaccc gttggtccac 12060tccatggcct
ccccatctct ctcaaagacc agcttcgagt caagggctac gaaacatcaa
12120tgggctacat ctcatggcta aacaagtacg acgaagggga ctcggttctg
acaaccatgc 12180tccgcaaagc cggtgccgtc ttctacgtca agacctctgt
cccgcagacc ctgatggtct 12240gcgagacagt caacaacatc atcgggcgca
ccgtcaaccc acgcaacaag aactggtcgt 12300gcggcggcag ttctggtggt
gagggtgcga tcgttgggat tcgtggtggc gtcatcggtg 12360taggaacgga
tatcggtggc tcgattcgag tgccggccgc gttcaacttc ctgtacggtc
12420taaggccgag tcatgggcgg ctgccgtatg caaagatggc gaacagcatg
gagggtcagg 12480agacggtgca cagcgttgtc gggccgatta cgcactctgt
tgaggacctc cgcctcttca 12540ccaaatccgt cctcggtcag gagccatgga
aatacgactc caaggtcatc cccatgccct 12600ggcgccagtc cgagtcggac
attattgcct ccaagatcaa gaacggcggg ctcaatatcg 12660gctactacaa
cttcgacggc aatgtccttc cacaccctcc tatcctgcgc ggcgtggaaa
12720ccaccgtcgc cgcactcgcc aaagccggtc acaccgtgac cccgtggacg
ccatacaagc 12780acgatttcgg ccacgatctc atctcccata tctacgcggc
tgacggcagc gccgacgtaa 12840tgcgcgatat cagtgcatcc ggcgagccgg
cgattccaaa tatcaaagac ctactgaacc 12900cgaacatcaa agctgttaac
atgaacgagc tctgggacac gcatctccag aagtggaatt 12960accagatgga
gtaccttgag aaatggcggg aggctgaaga aaaggccggg aaggaactgg
13020acgccatcat cgcgccgatt acgcctaccg ctgcggtacg gcatgaccag
ttccggtact 13080atgggtatgc ctctgtgatc aacctgctgg atttcacgag
cgtggttgtt ccggttacct 13140ttgcggataa gaacatcgat aagaagaatg
agagtttcaa ggcggttagt gagcttgatg 13200ccctcgtgca ggaagagtat
gatccggagg cgtaccatgg ggcaccggtt gcagtgcagg 13260ttatcggacg
gagactcagt gaagagagga cgttggcgat tgcagaggaa gtggggaagt
13320tgctgggaaa tgtggtgact ccataggtcg agaatttata cttagataag
tatgtactta 13380caggtatatt tctatgagat actgatgtat acatgcatga
taatatttaa acggttatta 13440gtgccgattg tcttgtgcga taatgacgtt
cctatcaaag caatacactt accacctatt 13500acatgggcca agaaaatatt
ttcgaacttg tttagaatat tagcacagag tatatgatga 13560tatccgttag
attatgcatg attcattcct acaacttttt cgtagcataa ggattaatta
13620cttggatgcc aataaaaaaa aaaaacatcg agaaaatttc agcatgctca
gaaacaattg 13680cagtgtatca aagtaaaaaa aagattttcg ctacatgttc
cttttgaaga aagaaaatca 13740tggaacatta gatttacaaa aatttaacca
ccgctgatta acgattagac cgttaagcgc 13800acaacaggtt attagtacag
agaaagcatt ctgtggtgtt gccccggact ttcttttgcg 13860acataggtaa
atcgaatacc atcatactat cttttccaat gactccctaa agaaagactc
13920ttcttcgatg ttgtatacgt tggagcatag ggcaagaatt gtggcttgag
atctagatta 13980cgtggaagaa aggtagtaaa agtagtagta taagtagtaa
aaagaggtaa aaagagaaaa 14040ccggctacat actagagaag cacgtacaca
aaaactcata ggcacttcat catacgacag 14100tttcttgatg cattataata
gtgtattaga tattttcaga aatatgcata gaacctcttc 14160ttgcctttac
tttttataca tagaacattg gcagatttac ttacactact ttgtttctac
14220gccatttctt ttgttttcaa cacttagaca agttgttgag aaccggacta
ctaaaaagca 14280atgttcccac tgaaaatcat gtacctgcag gataataacc
ccctaattct gcatcgatcc 14340agtatgtttt tttttctcta ctcattttta
cctgaagata gagcttctaa aacaaaaaaa 14400atcagcgatt acatgcatat
tgtgtgttct agaattgcgg atcaccagat cgccattaca 14460atgtatgcag
gcaaatattt ctcagaatga aaaatagaga aaaggaaacg aaaattctgt
14520aagatgcctt cgaagagatt tctcgatatg caaggcgtgc atcagggtga
tccaaaggaa 14580ctcgagagag agggcgaaag gcaatttaat gcattgcttc
tccattgact tctagttgag 14640cggataagtt cggaaatgta agtcacagct
aatgacaaat ccactttagg tttcgaggca 14700ctatttaggc aaaaagacga
gtggggaaat aacaaacgct caaacatatt agcatatacc 14760ttcaaaaaat
gggaatagta tataaccttc cggttcgtta ataaatcaaa tctttcatct
14820agttctctta agatttcaat attttgcttt cttgaagaaa gaatctactc
tcctccccca 14880ttcgcactgc aaagctagct tggcactggc cgtcgtttta
caacgtcgtg actgggaaaa 14940ccctggcctt acccaactta atcgccttgc
agcacatccc cctttcgcca gctggcgtaa 15000tagcgaagag gcccgcaccg
atcgcccttc ccaacagttg cgcagcctga atggcgaatg 15060ggaaattgta
aacgttaata ttttgttaaa attcgcgtta aatttttgtt aaatcagctc
15120attttttaac caataggccg aaatcggcaa aatcccttat aaatcaaaag
aatagaccga 15180gatagggttg agtgttgttc cagtttggaa caagagtcca
ctattaaaga acgtggactc 15240caacgtcaaa gggcgaaaaa ccgtctatca
gggcgatggc ccactacgtg aaccatcacc 15300ctaatcaagt tttttggggt
cgaggtgccg taaagcacta aatcggaacc ctaaagggag 15360cccccgattt
agagcttgac ggggaaagcc ggcgaacgtg gcgagaaagg aagggaagaa
15420agcgaaagga gcgggcgcta gggcgctggc aagtgtagcg gtcacgctgc
gcgtaaccac 15480cacacccgcc gcgcttaatg cgccgctaca gggcgcgtca
ggtggcactt ttcggggaaa 15540tgtgcgcgga acccctattt gtttattttt
ctaaatacat tcaaatatgt atccgctcat 15600gagacaataa ccctgataaa
tgcttcaata atattgaaaa aggaagagta tgagtattca 15660acatttccgt
gtcgccctta ttcccttttt tgcggcattt tgccttcctg tttttgctca
15720cccagaaacg ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac
gagtgggtta 15780catcgaactg gatctcaaca gcggtaagat ccttgagagt
tttcgccccg aagaacgttt 15840tccaatgatg agcactttta aagttctgct
atgtggcgcg gtattatccc gtattgacgc 15900cgggcaagac caactcggtc
gccgcataca ctattctcag aatgacttgg ttgagtactc 15960accagtcaca
gaaaagcatc ttacggatgg catgacagta agagaattat gcagtgctgc
16020cataaccatg agtgataaca ctgcggccaa cttacttctg acaacgatcg
gaggaccgaa 16080ggagctaacc gcttttttgc acaacatggg ggatcatgta
actcgccttg atcgttggga 16140accggagctg aatgaagcca taccaaacga
cgagcgtgac accacgatgc ctgtagcaat 16200ggcaacaacg ttgcgcaaac
tattaactgg cgaactactt agtctagctt cccggcaaca 16260attaatagac
tggatggagg cggataaagt tgcaggacca cttctgcgct cggcccttcc
16320ggctggctgg tttattgctg ataaatctgg agccggtgag cgtgggtctc
gcggtatcat 16380tgcagcactg gggccagatg gtaagccctc ccgtatcgta
gttatctaca cgacggggag 16440tcaggcaact atggatgaac gaaatagaca
gatcgctgag ataggtgcct cactgattaa 16500gcattggtaa ctgtcagacc
aagtttactc atatatactt tagattgatt taaaacttca 16560tttttaattt
aaaaggatct aggtgaagat cctttttgat aatctcatga ccaaaatccc
16620ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca
aaggatcttc 16680ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa
acaaaaaaac caccgctacc 16740agcggtggtt tgtttgccgg atcaagagct
accacctctt tttccgaagg taactggctt 16800cagcagagcg cagataccaa
atactgtcct tctagtgtag ccgtagttag gccaccactt 16860caagaactct
gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc
16920tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt
taccggataa 16980ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag
cccagcttgg agcgaacgac 17040ctacaccgaa ctgagatacc tacagcgtga
gcattgagaa agcgccacgc ttcccgaagg 17100gagaaaggcg gacaggtatc
cggtaagcgg cagggtcgga acaggagagc gcacgaggga 17160gcttccaggg
ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact
17220tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa
acgccagcaa 17280cgcggccttt ttacggttcc tggccttttg ctggcctttt
gctcacatgt tctttcctgc 17340gttatcccct gattctgtgg ataaccgtat
taccgccttt gagtgagctg ataccgctcg 17400ccgcagccga acgaccgagc
gcagcgagtc agtgagcgag gaagcggaag agcgcccaat 17460acgcaaaccg
cctctccccg cgcgttggcc gattcattaa tgcagctggc acgacaggtt
17520tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc
tcactcatta 17580ggcaccccag gctttacact ttatgcttcc ggctcgtatg
ttgtgtggaa ttgtgagcgg 17640ataacaattt cacacaggaa acagctatga
catgattacg aatttaatac gactcacaat 17700agggaattag cttgcgcgaa
attattggct tttttttttt tttaattaat actacctttt 17760gatgtgaacg
tttactaaag tagcactatc tgtggaatgg ctgttggaac tttttccgat
17820taacagcttg tattccaagt cctgacattc cagttgtaag ttttccaact
tgtgattcaa 17880ttgttcaatc tcttggttaa aattctcttg ttccatgaat
aggctctttt tccagtctcg 17940aaattttgaa atttctctgt tggacagctc
gttgaatttt ttcttagctt ctaattgtct 18000agttataaat tcaggatccc
attctgtagc caccttatcc atgaccgttt tattaattat 18060ttcatagcac
ttgtaatttt tgagtttgtt ttcctcgatt tcatcgaagt tcatttcttc
18120ctccaaaaat ttcctttgtt cttccgttat gtcaacactt ttcgttgtta
agcaatctct 18180ggcctttaat agcctagttc ttagcatttc agatc
18215
SEQ ID NO: 2, length 23, DNA, Artificial Sequence, synthetic primer
tgatcttgta gaaagtaccg agg 23
SEQ ID NO: 3, length 24, DNA, Artificial Sequence, synthetic primer
ggaaacagct atgacatgat tacg 24
SEQ ID NO: 4, length 23, DNA, Artificial Sequence, synthetic primer
tgcacatgtt gtccatcaag atg 23
SEQ ID NO: 5, length 25, DNA, Artificial Sequence, synthetic primer
ctttgttctt ccgttatgtc aacac 25
SEQ ID NO: 6, length 23, DNA, Artificial Sequence, synthetic primer
ttccaagaag aacaacctga tag 23
SEQ ID NO: 7, length 21, DNA, Artificial Sequence, synthetic primer
tgatgtgaac gtttactaaa g 21
SEQ ID NO: 8, length 16176, DNA, Artificial Sequence, synthetic plasmid
tcgcgcgttt cggtgatgac ggtgaaaacc tcttgacaca tgcagctccc ggagacggtc
60acagcttgtc tgtaagcgga tgccgggagc agacaagccc gtcagggcgc gtcagcgggt
120gttggcgggt gtcggggctg gcttaactat gcggcatcag agcagattgt
actgagagtg 180caccatatgc ggtgtgaaat accgcacaga tgcgtaagga
gaaaataccg catcaggcgc 240cattcgccat tcaggctgcg caactgttgg
gaagggcgat cggtgcgggc ctcttcgcta 300ttacgccagc tggcgaaagg
gggatgtgct gcaaggcgat taagttgggt aacgccaggg 360ttttcccagt
cacgacgttg taaaacgacg gccagtaagc ttgcatgcct gcaggtcgac
420gcggccgcat attttttgta actgtaattt cactcatgca caagaaaaaa
aaaactggat 480taaaagggag cccaaggaaa actcctcagc atatatttag
aagtctcctc agcatatagt 540tgtttgtttt ctttacacat tcactgttta
ataaaacttt tataatattt cattatcgga 600actctagatt ctatacttgt
ttcccaattg ggccgatcgg gccttgctgg tagtaaacgt 660atacgtcata
aaagggaaaa gccacatgcg gaagaatttt atggaaaaaa aaaaaacctc
720gaagttacta cttctagggg gcctatcaag taaattactc ctggtacact
gaagtatata 780agggatatag aagcaaatag ttgtcagtgc aatccttcaa
gacgattggg aaaatactgt 840aggtaccgga gacctaacta catagtgttt
aaagattacg gatatttaac ttacttagaa 900taatgccatt tttttgagtt
ataataatcc tacgttagtg tgagcgggat ttaaactgtg 960aggaccttaa
tacattcaga cacttctgcg gtatcaccct acttattccc ttcgagatta
1020tatctaggaa cccatcaggt tggtggaaga ttacccgttc taagactttt
cagcttcctc 1080tattgatgtt acacctggac accccttttc tggcatccag
tttttaatct tcagtggcat 1140gtgagattct ccgaaattaa ttaaagcaat
cacacaattc tctcggatac cacctcggtt 1200gaaactgaca ggtggtttgt
tacgcatgct aatgcaaagg agcctatata cctttggctc 1260ggctgctgta
acagggaata taaagggcag cataatttag gagtttagtg aacttgcaac
1320atttactatt ttcccttctt acgtaaatat ttttcttttt aattctaaat
caatcttttt 1380caattttttg tttgtattct tttcttgctt aaatctataa
ctacaaaaaa cacatacata 1440aactaaaaat gtctgaacca gctcaaaaga
aacaaaaggt tgctaacaac tctctagaac 1500aattgaaagc ctccggcact
gtcgttgttg ccgacactgg tgatttcggc tctattgcca 1560agtttcaacc
tcaagactcc acaactaacc catcattgat cttggctgct gccaagcaac
1620caacttacgc caagttgatc gatgttgccg tggaatacgg taagaagcat
ggtaagacca 1680ccgaagaaca agtcgaaaat gctgtggaca gattgttagt
cgaattcggt aaggagatct 1740taaagattgt tccaggcaga gtctccaccg
aagttgatgc tagattgtct tttgacactc 1800aagctaccat tgaaaaggct
agacatatca ttaaattgtt tgaacaagaa ggtgtctcca 1860aggaaagagt
ccttattaaa attgcttcca cttgggaagg tattcaagct gccaaagaat
1920tggaagaaaa ggacggtatc cactgtaatt tgactctatt attctccttc
gttcaagcag 1980ttgcctgtgc cgaggcccaa gttactttga tttccccatt
tgttggtaga attctagact 2040ggtacaaatc cagcactggt aaagattaca
agggtgaagc cgacccaggt gttatttccg 2100tcaagaaaat ctacaactac
tacaagaagt acggttacaa gactattgtt atgggtgctt 2160ctttcagaag
cactgacgaa atcaaaaact tggctggtgt tgactatcta acaatttctc
2220cagctttatt ggacaagttg atgaacagta ctgaaccttt cccaagagtt
ttggaccctg 2280tctccgctaa gaaggaagcc ggcgacaaga tttcttacat
cagcgacgaa tctaaattca 2340gattcgactt gaatgaagac gctatggcca
ctgaaaaatt gtccgaaggt atcagaaaat 2400tctctgccga tattgttact
ctattcgact tgattgaaaa gaaagttacc gcttaaggaa 2460gtatctcgga
aatattaatt taggccatgt ccttatgcac gtttcttttg atacttacgg
2520gtacatgtac acaagtatat ctatatatat aaattaatga aaatccccta
tttatatata 2580tgactttaac gagacagaac agttttttat tttttatcct
atttgatgaa tgatacagtt 2640tcttattcac gtgttatacc cacaccaaat
ccaatagcaa taccggccat cacaatcact 2700gtttcggcag cccctaagat
cagacaaaac atccggaacc accttaaatc aacgtcccat 2760atgaatcctt
gcagcaaagc cgctcgtacc ggagatatac aatagaacag ataccagaca
2820agacataatg ggctaaacaa gactacacca attacactgc ctcattgatg
gtggtacata 2880acgaactaat actgtagccc tagacttgat agccatcatc
atatcgaagt ttcactaccc 2940tttttccatt tgccatctat tgaagtaata
ataggcgcat gcaacttctt ttcttttttt 3000ttcttttctc tctcccccgt
tgttgtctca ccatatccgc aatgacaaaa aaatgatgga 3060agacactaaa
ggaaaaaatt aacgacaaag acagcaccaa cagatgtcgt tgttccagag
3120ctgatgaggg gtatctcgaa gcacacgaaa ctttttcctt ccttcattca
cgcacactac 3180tctctaatga gcaacggtat acggccttcc ttccagttac
ttgaatttga aataaaaaaa 3240agtttgctgt cttgctatca agtataaata
gacctgcaat tattaatctt ttgtttcctc 3300gtcattgttc tcgttccctt
tcttccttgt ttctttttct gcacaatatt tcaagctata 3360ccaagcatac
aatcaactat ctcatataca atgactcaat tcactgacat tgataagcta
3420gccgtctcca ccataagaat tttggctgtg gacaccgtat ccaaggccaa
ctcaggtcac 3480ccaggtgctc cattgggtat ggcaccagct gcacacgttc
tatggagtca aatgcgcatg 3540aacccaacca acccagactg gatcaacaga
gatagatttg tcttgtctaa cggtcacgcg 3600gtcgctttgt tgtattctat
gctacatttg actggttacg atctgtctat tgaagacttg 3660aaacagttca
gacagttggg ttccagaaca ccaggtcatc ctgaatttga gttgccaggt
3720gttgaagtta ctaccggtcc attaggtcaa ggtatctcca acgctgttgg
tatggccatg 3780gctcaagcta acctggctgc cacttacaac aagccgggct
ttaccttgtc tgacaactac 3840acctatgttt tcttgggtga cggttgtttg
caagaaggta tttcttcaga agcttcctcc 3900ttggctggtc atttgaaatt
gggtaacttg attgccatct acgatgacaa caagatcact 3960atcgatggtg
ctaccagtat ctcattcgat gaagatgttg ctaagagata cgaagcctac
4020ggttgggaag ttttgtacgt agaaaatggt aacgaagatc tagccggtat
tgccaaggct 4080attgctcaag ctaagttatc caaggacaaa ccaactttga
tcaaaatgac cacaaccatt 4140ggttacggtt ccttgcatgc cggctctcac
tctgtgcacg gtgccccatt gaaagcagat 4200gatgttaaac aactaaagag
caaattcggt ttcaacccag acaagtcctt tgttgttcca 4260caagaagttt
acgaccacta ccaaaagaca attttaaagc caggtgtcga agccaacaac
4320aagtggaaca agttgttcag cgaataccaa aagaaattcc cagaattagg
tgctgaattg 4380gctagaagat tgagcggcca actacccgca aattgggaat
ctaagttgcc aacttacacc 4440gccaaggact ctgccgtggc cactagaaaa
ttatcagaaa ctgttcttga ggatgtttac 4500aatcaattgc cagagttgat
tggtggttct gccgatttaa caccttctaa cttgaccaga 4560tggaaggaag
cccttgactt ccaacctcct tcttccggtt caggtaacta ctctggtaga
4620tacattaggt acggtattag agaacacgct atgggtgcca taatgaacgg
tatttcagct 4680ttcggtgcca actacaaacc atacggtggt actttcttga
acttcgtttc ttatgctgct 4740ggtgccgtta gattgtccgc tttgtctggc
cacccagtta tttgggttgc tacacatgac 4800tctatcggtg tcggtgaaga
tggtccaaca catcaaccta ttgaaacttt agcacacttc 4860agatccctac
caaacattca agtttggaga ccagctgatg gtaacgaagt ttctgccgcc
4920tacaagaact ctttagaatc caagcatact ccaagtatca ttgctttgtc
cagacaaaac 4980ttgccacaat tggaaggtag ctctattgaa agcgcttcta
agggtggtta cgtactacaa 5040gatgttgcta acccagatat tattttagtg
gctactggtt ccgaagtgtc tttgagtgtt 5100gaagctgcta agactttggc
cgcaaagaac atcaaggctc gtgttgtttc tctaccagat 5160ttcttcactt
ttgacaaaca acccctagaa tacagactat cagtcttacc agacaacgtt
5220ccaatcatgt ctgttgaagt tttggctacc acatgttggg gcaaatacgc
tcatcaatcc 5280ttcggtattg acagatttgg tgcctccggt aaggcaccag
aagtcttcaa gttcttcggt 5340ttcaccccag aaggtgttgc tgaaagagct
caaaagacca ttgcattcta taagggtgac 5400aagctaattt ctcctttgaa
aaaagctttc taaattctga tcgtagatca tcagatttga 5460tatgatatta
tttgtgaaaa aatgaaataa aactttatac aacttaaata caactttttt
5520tataaacgat taagcaaaaa aatagtttca aacttttaac aatattccaa
acactcagtc 5580cttttccttc ttatattata ggtgtacgta ttatagaaaa
atttcaatga ttactttttc 5640tttctttttc cttgtaccag cacatggccg
agcttgaatg ttaaaccctt cgagagaatc 5700acaccattca agtataaagc
caataaagaa tatcgtacca gagaattttg ccatcggaca 5760tgctacctta
cgcttatatc tctcattgga atatcgtttt ctgattaaaa cacggaagta
5820agaacttaat tcgtttttcg ttgaactatg ttgtgccagc gtaacattaa
aaaagagtgt 5880acaaggccac gttctgtcac cgtcagaaaa atatgtcaat
gaggcaagaa ccgggatggt 5940aacaaaaatc acgatctggg tgggtgtggg
tgtattggat tataggaagc cacgcgctca 6000acctggaatt acaggaagct
ggtaattttt tgggtttgca atcatcacca tctgcacgtt 6060gttataatgt
cccgtgtcta tatatatcca ttgacggtat tctatttttt tgctattgaa
6120atgagcgttt tttgttacta caattggttt tacagacgga attttcccta
tttgtttcgt 6180cccatttttc cttttctcat tgttctcata tcttaaaaag
gtcctttctt cataatcaat 6240gctttctttt acttaatatt ttacttgcat
tcagtgaatt ttaatacata ttcctctagt 6300cttgcaaaat cgatttagaa
tcaagatacc agcctaaaaa tggtcaaacc aattatagct 6360cccagtatcc
ttgcttctga cttcgccaac ttgggttgcg aatgtcataa ggtcatcaac
6420gccggcgcag attggttaca tatcgatgtc atggacggcc attttgttcc
aaacattact 6480ctgggccaac caattgttac ctccctacgt cgttctgtgc
cacgccctgg cgatgctagc 6540aacacagaaa agaagcccac tgcgttcttc
gattgtcaca tgatggttga aaatcctgaa 6600aaatgggtcg acgattttgc
taaatgtggt gctgaccaat ttacgttcca ctacgaggcc 6660acacaagacc
ctttgcattt agttaagttg attaagtcta agggcatcaa agctgcatgc
6720gccatcaaac ctggtacttc tgttgacgtt ttatttgaac tagctcctca
tttggatatg 6780gctcttgtta tgactgtgga acctgggttt ggaggccaaa
aattcatgga agacatgatg 6840ccaaaagtgg aaactttgag agccaagttc
ccccatttga atatccaagt cgatggtggt 6900ttgggcaagg agaccatccc
gaaagccgcc aaagccggtg ccaacgttat tgtcgctgga 6960accagtgttt
tcactgcagc tgacccgcac gatgttatct ccttcatgaa agaagaagtc
7020tcgaaggaat tgcgttctag agatttgcta gattagttgt acatatgcgg
catttcttat 7080atttatactc tctatactat acgatatggt atttttttct
cgttttgatc tcctaatata 7140cataaaccga gccattccta ctatacaaga
tacgtaagtg cctaactcat gggaaaaatg 7200ggccgcccag ggtggtgcct
tgtccgtttt cgatgatcaa tccctgggat gcagtatcgt 7260caatgacact
ccataaggct tccttaacca aagtcaaaga actcttcttt tcattctctt
7320tcactttctt accgccatct agatcaatat ccatttcgta ccccgcggaa
ccgccagata 7380ttcattactt gacgcaaaag cgtttgaaat aatgacgaaa
aagaaggaag aaaaaaaaag 7440aaaaataccg cttctaggcg ggttatctac
tgatccgagc ttccactagg atagcaccca 7500aacacctgca tatttggacg
acctttactt acaccaccaa aaaccacttt cgcctctccc 7560gcccctgata
acgtccacta attgagcgat tacctgagcg gtcctctttt gtttgcagca
7620tgagacttgc atactgcaaa tcgtaagtag caacgtctca aggtcaaaac
tgtatggaaa 7680ccttgtcacc tcacttaatt ctagctagcc taccctgcaa
gtcaagaggt ctccgtgatt 7740cctagccacc tcaaggtatg cctctccccg
gaaactgtgg ccttttctgg cacacatgat 7800ctccacgatt tcaacatata
aatagctttt gataatggca atattaatca aatttatttt 7860acttctttct
tgtaacatct ctcttgtaat cccttattcc ttctagctat ttttcataaa
7920aaaccaagca actgcttatc aacacacaaa cactaaatca aaatggctgc
cggtgtccca 7980aaaattgatg cgttagaatc tttgggcaat cctttggagg
atgccaagag agctgcagca 8040tacagagcag ttgatgaaaa tttaaaattt
gatgatcaca aaattattgg aattggtagt 8100ggtagcacag tggtttatgt
tgccgaaaga attggacaat atttgcatga ccctaaattt 8160tatgaagtag
cgtctaaatt catttgcatt ccaacaggat tccaatcaag aaacttgatt
8220ttggataaca agttgcaatt aggctccatt gaacagtatc ctcgcattga
tatagcgttt 8280gacggtgctg atgaagtgga tgagaattta caattaatta
aaggtggtgg tgcttgtcta 8340tttcaagaaa aattggttag tactagtgct
aaaaccttca ttgtcgttgc tgattcaaga 8400aaaaagtcac caaaacattt
aggtaagaac tggaggcaag gtgttcccat tgaaattgta 8460ccttcctcat
acgtgagggt caagaatgat ctattagaac aattgcatgc tgaaaaagtt
8520gacatcagac aaggaggttc tgctaaagca ggtcctgttg taactgacaa
taataacttc 8580attatcgatg cggatttcgg tgaaatttcc gatccaagaa
aattgcatag agaaatcaaa 8640ctgttagtgg gcgtggtgga aacaggttta
ttcatcgaca acgcttcaaa agcctacttc 8700ggtaattctg acggtagtgt
tgaagttacc gaaaagtgag cagatcaaag gcaaagacag 8760aaaccgtagt
aaaggttgac ttttcacaac agtgtctcca ttttttatat tgtattatta
8820aagctattta gttatttgga tactgttttt tttccagaag ttttcttttt
agtaaagtac 8880aatccagtaa aaatgaagga tgaacaatcg gtgtatgcag
attcaacacc aataaatgca 8940atgtttattt ctttggaacg tttgtgttgt
tcgaaatcca ggataatcct tcaacaagac 9000cctgtccgga taaggcgtta
ctaccgatga cacaccaagc tcgagtaacg gagcaagaat 9060tgaaggatat
ttctgcacta aatgccaaca tcagatttaa tgatccatgg acctggttgg
9120atggtaaatt ccccactttt gcctgatcca gccagtaaaa tccatactca
acgacgatat 9180gaacaaattt ccctcattcc gatgctgtat atgtgtataa
atttttacat gctcttctgt 9240ttagacacag aacagcttta aataaaatgt
tggatatact ttttctgcct gtggtgtcat 9300ccacgctttt aattcatctc
ttgtatggtt gacaatttgg ctatttttta acagaaccca 9360acggtaattg
aaattaaaag ggaaacgagt gggggcgatg agtgagtgat actaaaatag
9420acaccaagag agcaaagcgg tcccagcggc cgcgaattcg gcgtaatcat
ggtcatagct 9480gtttcctgtg tgaaattgtt atccgctcac aattccacac
aacatacgag ccggaagcat 9540aaagtgtaaa gcctggggtg cctaatgagt
gagctaactc acattaattg cgttgcgctc 9600actgcccgct ttccagtcgg
gaaacctgtc gtgccagctg cattaatgaa tcggccaacg 9660cgcggggaga
ggcggtttgc gtattgggcg ctcttccgct tcctcgctca ctgactcgct
9720gcgctcggtc gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg
taatacggtt 9780atccacagaa tcaggggata acgcaggaaa gaacatgtga
gcaaaaggcc agcaaaaggc 9840caggaaccgt aaaaaggccg cgttgctggc
gtttttccat aggctccgcc cccctgacga 9900gcatcacaaa aatcgacgct
caagtcagag gtggcgaaac ccgacaggac tataaagata 9960ccaggcgttt
ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac
10020cggatacctg tccgcctttc tcccttcggg aagcgtggcg ctttctcaat
gctcacgctg 10080taggtatctc agttcggtgt aggtcgttcg ctccaagctg
ggctgtgtgc acgaaccccc 10140cgttcagccc gaccgctgcg ccttatccgg
taactatcgt cttgagtcca acccggtaag 10200acacgactta tcgccactgg
cagcagccac tggtaacagg attagcagag cgaggtatgt 10260aggcggtgct
acagagttct tgaagtggtg gcctaactac ggctacacta gaaggacagt
10320atttggtatc tgcgctctgc tgaagccagt taccttcgga aaaagagttg
gtagctcttg 10380atccggcaaa caaaccaccg ctggtagcgg tggttttttt
gtttgcaagc agcagattac 10440gcgcagaaaa aaaggatctc aagaagatcc
tttgatcttt tctacggggt ctgacgctca 10500gtggaacgaa aactcacgtt
aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac 10560ctagatcctt
ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac
10620ttggtctgac agttaccaat gcttaatcag tgaggcacct atctcagcga
tctgtctatt 10680tcgttcatcc atagttgcct gactccccgt cgtgtagata
actacgatac gggagggctt 10740accatctggc cccagtgctg caatgatacc
gcgagaccca cgctcaccgg ctccagattt 10800atcagcaata aaccagccag
ccggaagggc cgagcgcaga agtggtcctg caactttatc 10860cgcctccatc
cagtctatta attgttgccg ggaagctaga gtaagtagtt cgccagttaa
10920tagtttgcgc aacgttgttg ccattgctac aggcatcgtg gtgtcacgct
cgtcgtttgg 10980tatggcttca ttcagctccg gttcccaacg atcaaggcga
gttacatgat cccccatgtt 11040gtgcaaaaaa gcggttagct ccttcggtcc
tccgatcgtt gtcagaagta agttggccgc 11100agtgttatca ctcatggtta
tggcagcact gcataattct cttactgtca tgccatccgt 11160aagatgcttt
tctgtgactg gtgagtactc aaccaagtca
ttctgagaat agtgtatgcg 11220gcgaccgagt tgctcttgcc cggcgtcaat
acgggataat accgcgccac atagcagaac 11280tttaaaagtg ctcatcattg
gaaaacgttc ttcggggcga aaactctcaa ggatcttacc 11340gctgttgaga
tccagttcga tgtaacccac tcgtgcaccc aactgatctt cagcatcttt
11400tactttcacc agcgtttctg ggtgagcaaa aacaggaagg caaaatgccg
caaaaaaggg 11460aataagggcg acacggaaat gttgaatact catactcttc
ctttttcaat attattgaag 11520catttatcag ggttattgtc tcatgagcgg
atacatattt gaatgtattt agaaaaataa 11580acaaataggg gttccgcgca
catttccccg aaaagtgcca cctgacgtca actatacaaa 11640tgacaagttc
ttgaaaacaa gaatcttttt attgtcagta ctgattagaa aaactcatcg
11700agcatcaaat gaaactgcaa tttattcata tcaggattat caataccata
tttttgaaaa 11760agccgtttct gtaatgaagg agaaaactca ccgaggcagt
tccataggat ggcaagatcc 11820tggtatcggt ctgcgattcc gactcgtcca
acatcaatac aacctattaa tttcccctcg 11880tcaaaaataa ggttatcaag
tgagaaatca ccatgagtga cgactgaatc cggtgagaat 11940ggcaaaagct
tatgcatttc tttccagact tgttcaacag gccagccatt acgctcgtca
12000tcaaaatcac tcgcatcaac caaaccgtta ttcattcgtg attgcgcctg
agcgagacga 12060aatacgcgat cgctgttaaa aggacaatta caaacaggaa
tcgaatgcaa ccggcgcagg 12120aacactgcca gcgcatcaac aatattttca
cctgaatcag gatattcttc taatacctgg 12180aatgctgttt tgccggggat
cgcagtggtg agtaaccatg catcatcagg agtacggata 12240aaatgcttga
tggtcggaag aggcataaat tccgtcagcc agtttagtct gaccatctca
12300tctgtaacat cattggcaac gctacctttg ccatgtttca gaaacaactc
tggcgcatcg 12360ggcttcccat acaatcgata gattgtcgca cctgattgcc
cgacattatc gcgagcccat 12420ttatacccat ataaatcagc atccatgttg
gaatttaatc gcggcctcga aacgtgagtc 12480ttttccttac ccatggttgt
ttatgttcgg atgtgatgtg agaactgtat cctagcaaga 12540ttttaaaagg
aagtatatga aagaagaacc tcagtggcaa atcctaacct tttatatttc
12600tctacagggg cgcggcgtgg ggacaattca acgcgactgt gacgcgttct
agaacacaca 12660atatgcatgt aatcgctgat tttttttgtt ttagaagctc
tatcttcagg taaaaatgag 12720tagagaaaaa aaaacatact ggatcgatgc
agaattaggg ggttattatc ctgcaggtac 12780atgattttca gtgggaacat
tgctttttag tagtccggtt ctcaacaact tgtctaagtg 12840ttgaaaacaa
aagaaatggc gtagaaacaa agtagtgtaa gtaaatctgc caatgttcta
12900tgtataaaaa gtaaaggcaa gaagaggttc tatgcatatt tctgaaaata
tctaatacac 12960tattataatg catcaagaaa ctgtcgtatg atgaagtgcc
tatgagtttt tgtgtacgtg 13020cttctctagt atgtagccgg ttttctcttt
ttacctcttt ttactactta tactactact 13080tttactacct ttcttccacg
taatctagat ctcaagccac aattcttgcc ctatgctcca 13140acgtatacaa
catcgaagaa gagtctttct ttagggagtc attggaaaag atagtatgat
13200ggtattcgat ttacctatgt cgcaaaagaa agtccggggc aacaccacag
aatgctttct 13260ctgtactaat aacctgttgt gcgcttaacg gtctaatcgt
taatcagcgg tggttaaatt 13320tttgtaaatc taatgttcca tgattttctt
tcttcaaaag gaacatgtag cgaaaatctt 13380ttttttactt tgatacactg
caattgtttc tgagcatgct gaaattttct cgatgttttt 13440tttttttatt
ggcatccaag taattaatcc ttatgctacg aaaaagttgt aggaatgaat
13500catgcataat ctaacggata tcatcatata ctctgtgcta atattctaaa
caagttcgaa 13560aatattttct tggcccatgt aataggtggt aagtgtattg
ctttgatagg aacgtcatta 13620tcgcacaaga caatcggcac taataaccgt
ttaaatatta tcatgcatgt atacatcagt 13680atctcataga aatatacctg
taagtacata cttatctaag tataaattct cgacctatgg 13740agtcaccaca
tttcccagca acttccccac ttcctctgca atcgccaacg tcctctcttc
13800actgagtctc cgtccgataa cctgcactgc aaccggtgcc ccatggtacg
cctccggatc 13860atactcttcc tgcacgaggg catcaagctc actaaccgcc
ttgaaactct cattcttctt 13920atcgatgttc ttatccgcaa aggtaaccgg
aacaaccacg ctcgtgaaat ccagcaggtt 13980gatcacagag gcatacccat
agtaccggaa ctggtcatgc cgtaccgcag cggtaggcgt 14040aatcggcgcg
atgatggcgt ccagttcctt cccggccttt tcttcagcct cccgccattt
14100ctcaaggtac tccatctggt aattccactt ctggagatgc gtgtcccaga
gctcgttcat 14160gttaacagct ttgatgttcg ggttcagtag gtctttgata
tttggaatcg ccggctcgcc 14220ggatgcactg atatcgcgca ttacgtcggc
gctgccgtca gccgcgtaga tatgggagat 14280gagatcgtgg ccgaaatcgt
gcttgtatgg cgtccacggg gtcacggtgt gaccggcttt 14340ggcgagtgcg
gcgacggtgg tttccacgcc gcgcaggata ggagggtgtg gaaggacatt
14400gccgtcgaag ttgtagtagc cgatattgag cccgccgttc ttgatcttgg
aggcaataat 14460gtccgactcg gactggcgcc agggcatggg gatgaccttg
gagtcgtatt tccatggctc 14520ctgaccgagg acggatttgg tgaagaggcg
gaggtcctca acagagtgcg taatcggccc 14580gacaacgctg tgcaccgtct
cctgaccctc catgctgttc gccatctttg catacggcag 14640ccgcccatga
ctcggcctta gaccgtacag gaagttgaac gcggccggca ctcgaatcga
14700gccaccgata tccgttccta caccgatgac gccaccacga atcccaacga
tcgcaccctc 14760accaccagaa ctgccgccgc acgaccagtt cttgttgcgt
gggttgacgg tgcgcccgat 14820gatgttgttg actgtctcgc agaccatcag
ggtctgcggg acagaggtct tgacgtagaa 14880gacggcaccg gctttgcgga
gcatggttgt cagaaccgag tccccttcgt cgtacttgtt 14940tagccatgag
atgtagccca ttgatgtttc gtagcccttg actcgaagct ggtctttgag
15000agagatgggg aggccatgga gtggaccaac gggtctcttg tgctttgcgt
agtattcatc 15060gagttccctt gcctgcgcga gagcggcgtc agggaagaac
tcgtgggcgc agtttgttaa 15120ctgctgggcg attgctgccc gtttacagaa
tgctagcgta acttccaccg aggtcaactc 15180tccggccgcc agcttggaca
caagatctgc agcggaggcc tctgtgatct tcagttcggc 15240ctctgaaagg
atccccgatt tctttgggaa atcaataacg ctgtcttccg caggcagcgt
15300ctggactttc cattcatcag ggatggtttt tgcgaggcgg gcgcgcttat
cagcggccag 15360ttcttcccag gattgaggca ttgtatatga gatagttgat
tgtatgcttg gtatagcttg 15420aaatattgtg cagaaaaaga aacaaggaag
aaagggaacg agaacaatga cgaggaaaca 15480aaagattaat aattgcaggt
ctatttatac ttgatagcaa agcggcaaac tttttttatt 15540tcaaattcaa
gtaactggaa ggaaggccgt ataccgttgc tcattagaga gtagtgtgcg
15600tgaatgaagg aaggaaaaag tttcgtgtgt tcgaagatac ccctcatcag
ctctggaaca 15660acgacatctg ttggtgctgt ctttgtcgtt aattttttcc
tttagtgtct tccatcattt 15720tttttgtcat tgcggatatg gtgagacaac
aacgggggag agagaaaaga aaaaaaaaga 15780aaagaagttg catgcgccta
ttattacttc aatagatggc aaatggaaaa agggtagtga 15840aacttcgata
tgatgatggc tatcaagtct agggctacag tattagttcg ttatgtacca
15900ccatcaatga ggcagtgtaa tttgtgtagt cttgtttagc ccattatgtc
ttgtctggta 15960tctgttctat tgtatatctc ccctccgcca cctacatgtt
agggagacca acgaaggtat 16020tataggaatc ccgatgtatg ggtttggttg
ccagaaaaga ggaagtccat attgtacacc 16080cggaaacaac aaaaggatgg
gcccatgacg tctaagaaac cattattatc atgacattaa 16140cctataaaaa
taggcgtatc acgaggccct ttcgtc 16176
SEQ ID NO 9 | 22 | DNA | Artificial Sequence | synthetic primer
9 acgccagggt tttcccagtc ac 22
SEQ ID NO 10 | 22 | DNA | Artificial Sequence | synthetic primer
10 caccaacctg atgggttcct ag 22
SEQ ID NO 11 | 22 | DNA | Artificial Sequence | synthetic primer
11 caccaacctg atgggttcct ag 22
SEQ ID NO 12 | 21 | DNA | Artificial Sequence | synthetic primer
12 acggtgctga tgaagtggat g 21
SEQ ID NO 13 | 21 | DNA | Artificial Sequence | synthetic primer
13 accacgccca ctaacagttt g 21
SEQ ID NO 14 | 2559 | DNA | Saccharomyces cerevisiae
14 atgagttctg tcaaccaaat
atatgaccta tttcccaata agcataatat ccaatttaca 60gattctcatt cacaggagca
tgatacttcg tccagccttg ctaagaatga tacagacgga 120actataagta
taccaggtag tatagacact ggcattttaa agagcattat tgaggagcaa
180ggttggaatg acgctgagtt atatagaagt tcaatacaaa atcaaagatt
ttttttaacg 240gataaataca ctaaaaagaa gcatttgact atggaggaca
tgcttagccc agaagaagaa 300caaatatatc aggaacctat tcaagatttc
caaacatata acaaacgtgt tcaaagggaa 360tatgagctca gggaaaggat
ggaagaattc ttccgtcaaa acaccaaaaa tgatttacat 420attttaaacg
aggattcatt aaatcagcaa tattccccgt taggacctgc agattatgtt
480ctgcccctcg atagatactc cagaatgaaa cacattgcct caaacttttt
cagaaaaaaa 540cttggtattc ctagaaaact gaaaagaaga agccattata
atcccaacgc agagggccac 600accaaaggga attcttctat attgagttcc
actactgatg taattgataa cgccagctac 660aggaatattg caatagatga
aaatgttgac ataacacata aagaacacgc cattgacgaa 720ataaacgagc
agggtgcatc aggtagtgaa tctgttgtgg aaggtggatc gttattgcat
780gacattgaaa aggttttcaa taggtccagg gcaactagga aataccatat
ccaacggaaa 840ttaaaagtgc gccatattca aatgctttct atcggggctt
gctttagtgt cggattattt 900ttaacctcag ggaaagcctt ttctattgcc
gggccatttg gtacactact tgggtttgga 960ctcacaggta gcatcatttt
agccacaatg ctgtcattta cagagttatc cacccttatt 1020cctgtgtctt
ctgggttctc aggactggct tctagatttg tagaggatgc tttcggattt
1080gcattgggct ggacgtattg gatttcctgt atgcttgctc ttcctgccca
agtttcctca 1140agtacattct atctcagcta ttataataat gtcaatatat
caaagggagt aacagcaggg 1200tttatcacgc tgttttctgc atttagcatt
gtagtaaatt tactggatgt cagcataatg 1260ggtgaaattg tatatgttgc
tggaataagc aaagtgataa ttgcaatttt gatggttttc 1320acgatgatca
tcctaaatgc cggacatgga aatgacattc acgaaggagt cggttttaga
1380tattgggata gctctaaatc tgtccgaaat ttgacctacg ggctatatcg
tccaacattt 1440gacctggctg atgctggcga aggaagcaaa aaaggaattt
caggcccaaa aggccgattt 1500ttagctacgg catcagtaat gctaatttca
acatttgcgt ttagcggtgt tgagatgact 1560tttttagcta gtggggaagc
tataaatcca aggaaaacaa ttccttctgc tacaaaaagg 1620acattttcca
ttgtactgat atcttacgtt tttttgattt tttcggtagg catcaacata
1680tacagtggcg atccaagact actatcatat tttcctggta tttccgaaaa
gaggtatgaa 1740gccattataa aaggcacagg aatggactgg agacttagga
ctaattgtcg cggcggtatt 1800gattataggc agatttcagt aggaacaggt
tattctagtc cttgggttgt tgcattgcag 1860aactttgggc tatgtacttt
cgcatctgct tttaacgcaa tactgatatt tttcactgct 1920acagcaggga
tatcctcgtt atttagttgt tcaagaacac tatacgccat gtctgtacaa
1980cggaaggcac cgccagtttt cgaaatttgc agcaagagag gtgttcctta
tgtttcagtg 2040atattctcct ctttattttc agtcattgct tatattgcag
ttgaccaaac cgcgattgaa 2100aacttcgacg tcttggccaa tgtttctagt
gctagtacgt ctattatatg gatgggattg 2160aatctttcct ttttgcgatt
ctattacgcc ctaaaacaaa ggaaggatat tatatcaaga 2220aatgattcat
catacccata taaatcgcca ttccaaccat atctagcgat ttatggtcta
2280gttggatgtt cattatttgt tatatttatg ggatatccta actttataca
tcatttctgg 2340agtactaaag cttttttttc agcatatggt ggcctgatgt
ttttctttat cagttacaca 2400gcttataagg ttctcggaac gtcaaagatt
caaagactag atcagttaga tatggacagt 2460gggaggaggg aaatggacag
aactgactgg accgaacata gccaatattt gggaacatat 2520agggaaagag
cgaagaagtt ggttacctgg ctgatttag 2559
SEQ ID NO 15 | 2559 | DNA | Saccharomyces cerevisiae
15 atgagttctg tcaaccaaat atatgaccta tttcccaata agcataatat
ccaatttaca 60gattctcatt cacaggagca tgatacttcg tccagccttg ctaagaatga
tacagacgga 120actataagta taccaggtag tatagacact ggcattttaa
agagcattat tgaggagcaa 180ggttggaatg acgctgagtt atatagaagt
tcaatacaaa atcaaagatt ttttttaacg 240gataaataca ctaaaaagaa
gcatttgact atggaggaca tgcttagccc agaagaagaa 300caaatatatc
aggaacctat tcaagatttc caaacatata acaaacgtgt tcaaagggaa
360tatgagctca gggaaaggat ggaagaattc ttccgtcaaa acaccaaaaa
tgatttacat 420attttaaacg aggattcatt aaatcagcaa tattccccgt
taggacctgc agattatgtt 480ctgcccctcg atagatactc cagaatgaaa
cacattgcct caaacttttt cagaaaaaaa 540cttggtattc ctagaaaact
gaaaagaaga agccattata atcccaacgc agagggccac 600accaaaggga
attcttctat attgagttcc actactgatg taattgataa cgccagctac
660aggaatattg caatagatga aaatgttgac ataacacata aagaacacgc
cattgacgaa 720ataaacgagc agggtgcatc aggtagtgaa tctgttgtgg
aaggtggatc gttattgcat 780gacattgaaa aggttttcaa taggtccagg
gcaactagga aataccatat ccaacggaaa 840ttaaaagtgc gccatattca
aatgctttct atcggggctt gctttagtgt cggattattt 900ttaacctcag
ggaaagcctt ttctattgcc gggccatttg gtacactact tgggtttgga
960ctcacaggta gcatcatttt agccacaatg ctgtcattta cagagttatc
cacccttatt 1020cctgtgtctt ctgggttctc aggactggct tctagatttg
tagaggatgc tttcggattt 1080gcattgggct ggacgtattg gatttcctgt
atgcttgctc ttcctgccca agtttcctca 1140agtacattct atctcagcta
ttataataat gtcaatatat caaagggagt aacagcaggg 1200tttatcacgc
tgttttctgc atttagcatt gtagtaaatt tactggatgt cagcataatg
1260ggtgaaattg tatatgttgc tggaataagc aaagtgataa ttgcaatttt
gatggttttc 1320acgatgatca tcctaaatgc cggacatgga aatgacattc
actaaggagt cggttttaga 1380tattgggata gctctaaatc tgtccgaaat
ttgacctacg ggctatatcg tccaacattt 1440gacctggctg atgctggcga
aggaagcaaa aaaggaattt caggcccaaa aggccgattt 1500ttagctacgg
catcagtaat gctaatttca acatttgcgt ttagcggtgt tgagatgact
1560tttttagcta gtggggaagc tataaatcca aggaaaacaa ttccttctgc
tacaaaaagg 1620acattttcca ttgtactgat atcttacgtt tttttgattt
tttcggtagg catcaacata 1680tacagtggcg atccaagact actatcatat
tttcctggta tttccgaaaa gaggtatgaa 1740gccattataa aaggcacagg
aatggactgg agacttagga ctaattgtcg cggcggtatt 1800gattataggc
agatttcagt aggaacaggt tattctagtc cttgggttgt tgcattgcag
1860aactttgggc tatgtacttt cgcatctgct tttaacgcaa tactgatatt
tttcactgct 1920acagcaggga tatcctcgtt atttagttgt tcaagaacac
tatacgccat gtctgtacaa 1980cggaaggcac cgccagtttt cgaaatttgc
agcaagagag gtgttcctta tgtttcagtg 2040atattctcct ctttattttc
agtcattgct tatattgcag ttgaccaaac cgcgattgaa 2100aacttcgacg
tcttggccaa tgtttctagt gctagtacgt ctattatatg gatgggattg
2160aatctttcct ttttgcgatt ctattacgcc ctaaaacaaa ggaaggatat
tatatcaaga 2220aatgattcat catacccata taaatcgcca ttccaaccat
atctagcgat ttatggtcta 2280gttggatgtt cattatttgt tatatttatg
ggatatccta actttataca tcatttctgg 2340agtactaaag cttttttttc
agcatatggt ggcctgatgt ttttctttat cagttacaca 2400gcttataagg
ttctcggaac gtcaaagatt caaagactag atcagttaga tatggacagt
2460gggaggaggg aaatggacag aactgactgg accgaacata gccaatattt
gggaacatat 2520agggaaagag cgaagaagtt ggttacctgg ctgatttag 2559
SEQ ID NO 16 | 1041 | DNA | Saccharomyces cerevisiae
16 atgaacacag attcacacaa
ccttagtgag ccatacaata taggtggcca aaaatacatt 60aatatgaaaa aaaaggaaga
tcttggcgta tgccagcctg gcttaacgca aaaggcattc 120acagtcgaag
acaagttcga ttacaaagca attattgaaa aaatggaagt atacggactt
180tgcgtggtca agaattttat agagacctcc agatgtgatg aaatattgaa
agaaatcgaa 240ccgcattttt atagatacga atcatggcaa ggctcaccgt
ttcctaagga aactactgtg 300gcaacgagat cggttttaca ctcatctaca
gtcttaaagg atgtggtatg cgaccgtatg 360ttttgtgata tctcaaaaca
ttttttgaat gaagaaaact actttgcggc gggaaaggtg 420attaataaat
gcactagtga tattcaactg aactccggta tagtctacaa ggttggcgct
480ggtgcaagtg accagggcta ccaccgagaa gatattgttc atcatacgac
ccatcaagca 540tgtgaacgtt tccagtatgg aaccgaaacc atggtagggt
taggtgtagc ttttacagat 600atgaataaag aaaatggctc tacgcgaatg
atagtcggtt cacatttgtg gggtccgcac 660gattcctgtg ggaactttga
caagaggatg gaatttcacg ttaatgttgc aaagggagac 720gcagttctat
tcttagggag cctctaccat gcagccagtg caaatcgtac gtcacaagac
780agagttgctg gatatttttt tatgacaaag agctacttga aaccagagga
aaatcttcac 840ttagggactg atttgcgagt gtttaagggt ttaccattgg
aagccttgca actgttgggg 900ctcggaatta gtgagccatt ttgtggtcac
atagattata agagtccagg acatcttatc 960agttctagtt tgtttgaaaa
tgatatcgaa aaggggtact atggagagac aataagggtg 1020aattatgggt
ccacgcaata a 1041
SEQ ID NO 17 | 1041 | DNA | Saccharomyces cerevisiae
17 atgaacacag
attcacacaa ccttagtgag ccatacaata taggtggcca aaaatacatt 60aatatgaaaa
aaaaggaaga tcttggcgta tgccagcctg gcttaacgca aaaggcattc
120acagtcgaag acaagttcga ttacaaagca attattgaaa aaatggaagt
atacggactt 180tgcgtggtca agaattttat agagacctcc agatgtgatg
aaatattgaa agaaatcgaa 240ccgcattttt atagatacga atcatggcaa
ggctcaccgt ttcctaagga aactactgtg 300gcaacgagat cggttttaca
ctcatctaca gtcttaaagg atgtggtatg cgaccgtatg 360ttttgtgata
tctcaaaaca ttttttgaat gaagaaaact actttgcggc gggaaaggtg
420attaataaat gcactagtga tattcaactg aactccggta tagtctacaa
ggttggcgct 480ggtgcaagtg accagggcta ccaccgagaa ggtattgttc
atcatacgac ccatcaagca 540tgtgaacgtt tccagtatgg aaccgaaacc
atggtagggt taggtgtagc ttttacagat 600atgaataaag aaaatggctc
tacgcgaatg atagtcggtt cacatttgtg gggtccgcac 660gattcctgtg
ggaactttga caagaggatg gaatttcacg ttaatgttgc aaagggagac
720gcagttctat tcttagggag cctctaccat gcagccagtg caaatcgtac
gtcacaagac 780agagttgctg gatatttttt tatgacaaag agctacttga
aaccagagga aaatcttcac 840ttagggactg atttgcgagt gtttaagggt
ttaccattgg aagccttgca actgttgggg 900ctcggaatta gtgagccatt
ttgtggtcac atagattata agagtccagg acatcttatc 960agttctagtt
tgtttgaaaa tgatatcgaa aaggggtact atggagagac aataagggtg
1020aattatgggt ccacgcaata a 1041
SEQ ID NO 18 | 1827 | DNA | Saccharomyces cerevisiae
18 atgtttaacc gtaccactca actgaaatcc aagcatccct gttccgtgtg tacgaggcga
60aaagtcaaat gtgatcgtat gataccgtgt ggaaactgca ggaagagagg acaggactcc
120gaatgtatga aatcaacaaa actaataacg gcttcatctt ccaaggaata
tctccctgac 180ctattgttat tctggcaaaa ttatgaatat tggataacga
atattgggct gtacaaaaca 240aaacaaagag atcttactag aacaccagct
aatttggata ctgatactga agaatgtatg 300ttttggatga attatcttca
aaaagaccaa tcattccaat tgatgaactt tgctatggaa 360aacttaggtg
ctttgtattt tggttccatt ggagatatca gtgaattata tttgagggtg
420gaacagtact gggatagaag ggcagacaag aatcacagtg tagacggcaa
atactgggac 480gcactaatat ggtctgtctt taccatgtgc atttattata
tgccagtcga gaagttagca 540gaaatatttt cagtatatcc tctccatgaa
tatttgggta gcaacaaaag gctcaattgg 600gaagatggta tgcaattagt
catgtgccaa aattttgcac gctgctcatt attccaattg 660aaacaatgtg
atttcatggc gcatcccgat ataaggctcg ttcaagcata tctgatttta
720gccactacaa ctttccccta cgatgaaccg ttgttggcaa attcgctcct
aacacagtgc 780atccatacct ttaaaaattt tcatgtggat gactttagac
ctttacttaa tgatgacccc 840gttgaaagca tcgctaaagt aaccttggga
agaatattct atcgcctgtg tggatgcgat 900tatcttcaat cggggccacg
caaaccaatt gcacttcata cagaagtatc ctccctatta 960caacatgcag
catatttgca ggatttgcct aacgttgatg tttacaggga agaaaacagc
1020acagaggtct tgtattggaa aatcatctca ttagacagag atttagatca
atacttgaac 1080aagagttcta aacctccctt aaaaacattg gatgctataa
ggagggagct cgatattttt 1140caatacaaag tagattcgtt ggaagaagat
tttagatcaa ataacagcag atttcaaaaa 1200tttattgcac tttttcaaat
atctactgtt tcctggaaat tgtttaagat gtatctcatt 1260tattatgata
ccgcagattc actactaaag gttatacatt attctaaggt aatcattagt
1320cttattgtta ataacttcca tgcaaaaagt gagtttttca acagacatcc
gatggtgatg 1380caaaccatta cgcgcgtggt ctctttcatc tccttttacc
aaatttttgt ggaatcggct 1440gctgtcaaac aacttttagt agatctaact
gaacttactg caaatctgcc cacaatattc 1500ggttcaaaac tagataaact
agtttacttg accgaaaggc tcagtaaatt aaaactttta 1560tgggacaagg
tacagcttct agattcaggt gattcgtttt accatcctgt tttcaaaata
1620ctacaaaatg atattaagat aattgagttg aaaaatgatg aaatgttttc
tctcataaaa 1680ggactcgggt ctttggtacc gttgaataag cttagacaag
aatcgttgct tgaggaagag 1740gacgaaaaca atacggaacc aagtgacttc
agaactattg tagaagagtt tcaatccgaa 1800tataacattt ctgacatact ttcctaa
1827191827DNASaccharomyces cerevisiae 19atgtttaacc gtaccactca
actgaaatcc aagcatccct gttccgtgtg tacgaggcga 60aaagtcaaat gtgatcgtat
gataccgtgt ggaaactgca ggaagagagg acaggactcc 120gaatgtatga
aatcaacaaa actaataacg gcttcatctt ccaaggaata tctccctgac
180ctattgttat tctggcaaaa ttatgaatat tggataacga atattgggct
gtacaaaaca 240aaacaaagag
atcttactag aacaccagct aatttggata ctgatactga agaatgtatg
300ttttggatga attatcttca aaaagaccaa tcattccaat tgatgaactt
tgctatggaa 360aacttaggtg ctttgtattt tggttccatt ggagatatca
gtgaattata tttgagggtg 420gaacagtact gggatagaag ggcagacaag
aatcacagtg tagacggcaa atactgggac 480gcactaatat ggtctgtctt
taccatgtgc atttattata tgccagtcga gaagttagca 540gaaatatttt
cagtatatcc tctccatgaa tatttgggta gcaacaaaag gctcaattgg
600gaagatggta tgcaattagt catgtgccaa aattttgcac gctgctcatt
attccaattg 660aaacaatgtg atttcatggc gcatcccgat ataaggctcg
ttcaagcata tctgatttta 720gccactacaa ctttccccta cgatgaaccg
ttgttggcaa attcgctcct aacacagtgc 780atccatacct ttaaaaattt
tcatgtggat gactttagac ctttacttaa tgatgacccc 840gttgaaagca
tcgctaaagt aaccttggga agaatattct atcgcctgtg tggatgcgat
900tatcttcaat cggggccacg caaaccaatt gcacttcata cagaagtatc
ctccctatta 960caacatgcag catatttgca ggatttgcct aacgttgatg
tttacaggga agaaaacagc 1020acagaggtct tgtattggaa aatcatctca
ttagacagag atttagatca atacttgaac 1080aagagttcta aacctccctt
aaaaacattg gatgctataa ggagggagct cgatattttt 1140caatacaaag
tagattcgtt ggaagaagat tttagatcaa ataacggcag atttcaaaaa
1200tttattgcac tttttcaaat atctactgtt tcctggaaat tgtttaagat
gtatctcatt 1260tattatgata ccgcagattc actactaaag gttatacatt
attctaaggt aatcattagt 1320cttattgtta ataacttcca tgcaaaaagt
gagtttttca acagacatcc gatggtgatg 1380caaaccatta cgcgcgtggt
ctctttcatc tccttttacc aaatttttgt ggaatcggct 1440gctgtcaaac
aacttttagt agatctaact gaacttactg caaatctgcc cacaatattc
1500ggttcaaaac tagataaact agtttacttg accgaaaggc tcagtaaatt
aaaactttta 1560tgggacaagg tacagcttct agattcaggt gattcgtttt
accatcctgt tttcaaaata 1620ctacaaaatg atattaagat aattgagttg
aaaaatgatg aaatgttttc tctcataaaa 1680ggactcgggt ctttggtacc
gttgaataag cttagacaag aatcgttgct tgaggaagag 1740gacgaaaaca
atacggaacc aagtgacttc agaactattg tagaagagtt tcaatccgaa
1800tataacattt ctgacatact ttcctaa 1827
SEQ ID NO 20 | 1464 | DNA | Saccharomyces cerevisiae
20 atgcgattcc atcgtcaagg tatctcagcc atcataggcg tactactcat
tgtactgctt 60ggtttctgtt ggaagttatc tggatcttac ggcatagtat caactgccct
accacacaat 120caatctgcaa ttaaaagcac agacttacct tctatacgat
gggataatta ccatgagttc 180gtcagagaca ttgattttga taacagtacg
gctatcttta attccattcg ggctgcttta 240agacagtctc catcggatat
acatcctgtc ggagtatctt attttcccgc tgtaattccc 300aaaggaactt
taatgtacca tgccggatca aaagtgccaa ctaccttcga atggctagct
360atggaccatg aattcagcta ctctttcggc ttgaggtcac catcctatgg
gagaaaatct 420ttggaaagaa ggcatgggag gttcggcaat ggcaccaacg
gtgatcatcc aaaagggcca 480ccaccaccac caccaccacc agacgaaaaa
ggtcggggtt cacaaaaaat gcttacttat 540agagcagcac gggacctcaa
caaatttctc tatcttgatg gggcttctgc tgcgaaaact 600gactcaggag
agatggacac gcagctaatg ttgtcaaatg ttattaaaga gaaattgaac
660cttacagatg atggtgaaaa cgaacgaatg gccgaacgac tctacgctgc
tagaatatgc 720aaatggggga agccattcgg gcttgacgga attatcaggg
tagaggttgg ctttgaggtc 780gttttgtgtg atttttcggc tgataacgtc
gaacttgttt caatgttaga aatggtccag 840cctaaccagt acctaggctt
accagcacct accgtaatat cgaaggaaga aggttggcct 900ctggatgaaa
atggaagcct agttgaagat cagctaacag atgaccaaaa ggcgattctg
960gaaagagaag atggttggga gaaggctttt tctaatttca acgcagttaa
aagcttcaat 1020cagttgagag cgggtgcagc gcatgacaac ggggagcatc
gaatccatat cgactatagg 1080tacctagtga gcgggataaa caggacgtac
attgctcctg atcctaacaa cagaagatta 1140ctcgatgaag gaatgacatg
ggaaaagcaa ttggacatgg tagatgactt agaaaaggcg 1200ctggaagtcg
gatttgatgc cacgcaaagt atggattggc agttagcatt tgatgagctt
1260gtccttaaat ttgctccatt actaaaatct gttagtaaca tactgaacag
cgatggtgat 1320attaatgagt caattgccat caatgcaaca gcactcacat
tgaacttttg tctaccaata 1380tgtgagccca taccaggcct taaaaacgga
tgcagacttt ttgatttggt catctgctgt 1440cagcgttgtc ggagaaattg ttga
1464211464DNASaccharomyces cerevisiae 21atgcgattcc atcgtcaagg
tatctcagcc atcataggcg tactactcat tgtactgctt 60ggtttctgtt ggaagttatc
tggatcttac ggcatagtat caactgccct accacacaat 120caatctgcaa
ttaaaagcac agacttacct tctatacgat gggataatta ccatgagttt
180gtcagagaca ttgattttga taacagtacg gctatcttta attccattcg
ggctgcttta 240agacagtctc catcggatat acatcctgtc ggagtatctt
attttcccgc tgtaattccc 300aaaggaactt taatgtacca tgccggatca
aaagtgccaa ctaccttcga atggctagct 360atggaccatg aattcagcta
ctctttcggc ttgaggtcac catcctatgg gagaaaatct 420ttggaaagaa
ggcatgggag gttcggcaat ggcaccaacg gtgatcatcc aaaagggcca
480ccaccaccac caccaccacc agacgaaaaa ggtcggggtt cacaaaaaat
gcttacttat 540agagcagcac gggacctcaa caaatttctc tatcttgatg
gggcttctgc tgcgaaaact 600gactcaggag agatggacac gcagctaatg
ttgtcaaatg ttattaaaga gaaattgaac 660cttacagatg atggtgaaaa
cgaacgaatg gccgaacgac tctacgctgc tagaatatgc 720aaatggggga
agccattcgg gcttgacgga attatcaggg tagaggttgg ctttgaggtc
780gttttgtgtg atttttcggc tgataacgtc gaacttgttt caatgttaga
aatggtccag 840cctaaccagt acctaggctt accagcacct accgtaatat
cgaaggaaga aggttggcct 900ctggatgaaa atggaagcct agttgaagat
cagctaacag atgaccaaaa ggcgattctg 960gaaagagaag atggttggga
gaaggctttt tctaatttca acgcagttaa aagcttcaat 1020cagttgagag
cgggtgcagc gcatgacaac ggggagcatc gaatccatat cgactatagg
1080tacctagtga gcgggataaa caggacgtac attgctcctg atcctaacaa
cagaagatta 1140ctcgatgaag gaatgacatg ggaaaagcaa ttggacatgg
tagatgactt agaaaaggcg 1200ctggaagtcg gatttgatgc cacgcaaagt
atggattggc agttagcatt tgatgagctt 1260gtccttaaat ttgctccatt
actaaaatct gttagtaaca tactgaacag cgatggtgat 1320attaatgagt
caattgccat caatgcaaca gcactcacat tgaacttttg tctaccaata
1380tgtgagccca taccaggcct taaaaacgga tgcagacttt ttgatttggt
catctgctgt 1440cagcgttgtc ggagaaattg ttga 1464
SEQ ID NO 22 | 1308 | DNA | Saccharomyces cerevisiae
22 atggactaca acaagagatc
ttcggtctca accgtgccta atgcagctcc cataagagtc 60ggattcgtcg gtctcaacgc
agccaaagga tgggcaatca agacacatta ccccgccata 120ctgcaactat
cgtcacaatt tcaaatcact gccttataca gtccaaaaat tgagacttct
180attgccacca ttcagcgtct aaaattgagt aatgccactg cttttcccac
tttagagtca 240tttgcatcat cttccactat agatatgata gtgatagcta
tccaagtggc cagccattat 300gaagttgtta tgcctctctt ggaattctcc
aaaaataatc cgaacctcaa gtatcttttc 360gtagaatggg cccttgcatg
ttcactagat caagccgaat ccatttataa ggctgctgct 420gaacgtgggg
ttcaaaccat catctcttta caaggtcgta aatcaccata tattttgaga
480gcaaaagaat taatatctca aggctatatc ggcgacatta attcgatcga
gattgctgga 540aatggcggtt ggtacggcta cgaaaggcct gttaaatcac
caaaatacat ctatgaaatc 600gggaacggtg tagatctggt aaccacaaca
tttggtcaca caatcgatat tttacaatac 660atgacaagtt cgtacttttc
caggataaat gcaatggttt tcaataatat tccagagcaa 720gagctgatag
atgagcgtgg taaccgattg ggccagcgag tcccaaagac agtaccggat
780catcttttat tccaaggcac attgttaaat ggcaatgttc cagtgtcatg
cagtttcaaa 840ggtggcaaac ctaccaaaaa atttaccaaa aatttggtca
ttgacattca cggtaccaag 900ggagatttga aacttgaagg cgatgccggc
ttcgcagaaa tttcaaatct ggtcctttac 960tacagtggaa ctagagcaaa
cgacttcccg ctagccaatg gacaacaagc tcctttagac 1020ccggggtatg
atgcaggtaa agaaatcatg gaagtatatc atttacgaaa ttataatgcc
1080attgtgggta atattcatcg actgtatcaa tctatctctg acttccactt
caatacaaag 1140aaaattcctg aattaccctc acaatttgta atgcaaggtt
tcgatttcga aggctttccc 1200accttgatgg atgctctgat attacacagg
ttaatcgaga gcgtttataa aagtaacatg 1260atgggctcca cattaaacgt
tagcaatatc tcgcattata gtttataa 1308
SEQ ID NO 23 | 1308 | DNA | Saccharomyces cerevisiae
23 atggactaca acaagagatc ttcggtctca accgtgccta atgcagctcc
cataagagtc 60ggattcgtcg gtctcaacgc agccaaagga tgggcaatca agacacatta
ccccgccata 120ctgcaactat cgtcacaatt tcaaatcact gccttataca
gtccaaaaat tgagacttct 180attgccacca ttcagcgtct aaaattgagt
aatgccactg cttttcccac tttagagtca 240tttgcatcat cttccactat
agatatgata gtgatagcta tccaagtggc cagccattat 300gaagttgtta
tgcctctctt ggaattctcc aaaaataatc cgaacctcaa gtatcttttc
360gtagaatggg cccttgcatg ttcactagat caagccgaat ccatttataa
ggctgctgct 420gaacgtgggg ttcaacccat catctcttta caaggtcgta
aatcaccata tattttgaga 480gcaaaagaat taatatctca aggctatatc
ggcgacatta attcgatcga gattgctgga 540aatggcggtt ggtacggcta
cgaaaggcct gttaaatcac caaaatacat ctatgaaatc 600gggaacggtg
tagatctggt aaccacaaca tttggtcaca caatcgatat tttacaatac
660atgacaagtt cgtacttttc caggataaat gcaatggttt tcaataatat
tccagagcaa 720gagctgatag atgagcgtgg taaccgattg ggccagcgag
tcccaaagac agtaccggat 780catcttttat tccaaggcac attgttaaat
ggcaatgttc cagtgtcatg cagtttcaaa 840ggtggcaaac ctaccaaaaa
atttaccaaa aatttggtca ttgacattca cggtaccaag 900ggagatttga
aacttgaagg cgatgccggc ttcgcagaaa tttcaaatct ggtcctttac
960tacagtggaa ctagagcaaa cgacttcccg ctagccaatg gacaacaagc
tcctttagac 1020ccggggtatg atgcaggtaa agaaatcatg gaagtatatc
atttacgaaa ttataatgcc 1080attgtgggta atattcatcg actgtatcaa
tctatctctg acttccactt caatacaaag 1140aaaattcctg aattaccctc
acaatttgta atgcaaggtt tcgatttcga aggctttccc 1200accttgatgg
atgctctgat attacacagg ttaatcgaga gcgtttataa aagtaacatg
1260atgggctcca cattaaacgt tagcaatatc tcgcattata gtttataa 1308
SEQ ID NO 24 | 27 | DNA | Artificial Sequence | Synthetic DNA
24 tggatgtcag cataatgggt gaaattg 27
SEQ ID NO 25 | 19 | DNA | Artificial Sequence | Synthetic DNA
25 cgccagcatc agccaggtc 19
SEQ ID NO 26 | 27 | DNA | Artificial Sequence | Synthetic DNA
26 tctacagtct taaaggatgt ggtatgc 27
SEQ ID NO 27 | 25 | DNA | Artificial Sequence | Synthetic DNA
27 atgcttgatg ggtcgtatga tgaac 25
SEQ ID NO 28 | 23 | DNA | Artificial Sequence | Synthetic DNA
28 cattggatgc tataaggagg gag 23
SEQ ID NO 29 | 24 | DNA | Artificial Sequence | Synthetic DNA
29 ctttagtagt gaatctgcgg tatc 24
SEQ ID NO 30 | 27 | DNA | Artificial Sequence | Synthetic DNA
30 agcacagact taccttctat acgatgg 27
SEQ ID NO 31 | 26 | DNA | Artificial Sequence | Synthetic DNA
31 cagcccgaat ggaattaaag atagcc 26
SEQ ID NO 32 | 25 | DNA | Artificial Sequence | Synthetic DNA
32 atgttcacta gatcaagccg aatcc 25
SEQ ID NO 33 | 22 | DNA | Artificial Sequence | Synthetic DNA
33 ccaaccgcca tttccagcaa tc 22
SEQ ID NO 34 | 31 | DNA | Artificial Sequence | Synthetic DNA
34 ccggacatgg aaatgacatt cacgaaggag t 31
SEQ ID NO 35 | 23 | DNA | Artificial Sequence | Synthetic DNA
35 ccaccgagaa gatattgttc atc 23
SEQ ID NO 36 | 21 | DNA | Artificial Sequence | Synthetic DNA
36 gatcaaataa cagcagattt c 21
SEQ ID NO 37 | 35 | DNA | Artificial Sequence | Synthetic DNA
37 gataattacc atgagttcgt cagagacatt gattt 35
SEQ ID NO 38 | 21 | DNA | Artificial Sequence | Synthetic DNA
38 cgtggggttc aaaccatcat c 21
SEQ ID NO 39 | 27 | DNA | Artificial Sequence | Synthetic DNA
39 ccccaccttg accacgtttt gcacatc 27
SEQ ID NO 40 | 28 | DNA | Artificial Sequence | Synthetic DNA
40 ccccttcaat ccgtcttcga cacataac 28
SEQ ID NO 41 | 25 | DNA | Artificial Sequence | Synthetic DNA
41 ggggatggag actgtggttg agaag 25
SEQ ID NO 42 | 24 | DNA | Artificial Sequence | Synthetic DNA
42 gggcatgtcg tcattttgtc tcgg 24
SEQ ID NO 43 | 21 | DNA | Artificial Sequence | Synthetic DNA
43 gttacgtcgc cttggacttc g 21
SEQ ID NO 44 | 21 | DNA | Artificial Sequence | Synthetic DNA
44 cggcaatacc tgggaacatg g 21
SEQ ID NO 45 | 23 | DNA | Artificial Sequence | Synthetic DNA
45 gaagtctgtt gctaaggatg cgc 23
SEQ ID NO 46 | 23 | DNA | Artificial Sequence | Synthetic DNA
46 tcataccgag actccaagtc agc 23
SEQ ID NO 47 | 21 | DNA | Artificial Sequence | Synthetic DNA
47 gtaagaagaa ttgcacggtc c 21
SEQ ID NO 48 | 21 | DNA | Artificial Sequence | Synthetic DNA
48 taccttggtg tcttggtcta c 21
SEQ ID NO 49 | 22 | DNA | Artificial Sequence | Synthetic DNA
49 gatagagact ggcacaggat tg 22
SEQ ID NO 50 | 21 | DNA | Artificial Sequence | Synthetic DNA
50 acaatactcc aaagctacac c 21
SEQ ID NO 51 | 21 | DNA | Artificial Sequence | Synthetic DNA
51 ggttaaatcg cgacaacaca g 21
SEQ ID NO 52 | 21 | DNA | Artificial Sequence | Synthetic DNA
52 cgatatcaaa gggcgttagg c 21
SEQ ID NO 53 | 21 | DNA | Artificial Sequence | Synthetic DNA
53 cgtgtatctg ctggacctaa g 21
SEQ ID NO 54 | 21 | DNA | Artificial Sequence | Synthetic DNA
54 tcagcgccgt taggagaaac c 21
SEQ ID NO 55 | 23 | DNA | Artificial Sequence | Synthetic DNA
55 agtcacatca agatcgttta tgg 23
SEQ ID NO 56 | 23 | DNA | Artificial Sequence | Synthetic DNA
56 actccacttc aagtaagagt ttg 23
SEQ ID NO 57 | 23 | DNA | Artificial Sequence | Synthetic DNA
57 gcacggaata tgggactact tcg 23
SEQ ID NO 58 | 82 | DNA | Artificial Sequence | Synthetic DNA
58 atggactaca acaagagatc ttcggtctca accgtgccta atgcagctcc cataagagtc 60
agacgcgttg aattgtcccc ac 82
SEQ ID NO 59 | 86 | DNA | Artificial Sequence | Synthetic DNA
59 catgttactt ttataaacgc tctcgattaa cctgtgtaat atcagagcat ccatcaaggt 60
acaaatgaca agttcttgaa aacaag 86
SEQ ID NO 60 | 19 | DNA | Artificial Sequence | Synthetic DNA
60 cggcattatt gtgtatggc 19
SEQ ID NO 61 | 75 | DNA | Artificial Sequence | Synthetic DNA
61 attttttgga aattaccaaa atcttgttcc cttattcttg gctcatcctt agggtttcaa 60
agatccatac ttctc 75
SEQ ID NO 62 | 71 | DNA | Artificial Sequence | Synthetic DNA
62 cagttttaaa aagtcagaga atgtagagaa gtatggatct ttgaaaccct aaggatgagc 60
caagaataag g 71
SEQ ID NO 63 | 26 | DNA | Artificial Sequence | Synthetic DNA
63 tggttgccat ctttagagct tccgtg 26
SEQ ID NO 64 | 19 | DNA | Artificial Sequence | Synthetic DNA
64 ggatccactg gtagagagc 19
SEQ ID NO 65 | 20 | DNA | Artificial Sequence | Synthetic DNA
65 actagtaaac gtgtgtgtgc 20
SEQ ID NO 66 | 20 | DNA | Artificial Sequence | Synthetic DNA
66 atatgaaacg cacacaagtc 20
SEQ ID NO 67 | 54 | DNA | Artificial Sequence | Synthetic DNA
67 gaattcgtcg acctgcagcg tacgattctt agtatatata tactgctcaa gggc 54
SEQ ID NO 68 | 75 | DNA | Artificial Sequence | Synthetic DNA
68 atttccaaag taattgcatt tgcccttgag cagtatatat atactaagaa tcgtacgctg 60
caggtcgacg aattc 75
SEQ ID NO 69 | 75 | DNA | Artificial Sequence | Synthetic DNA
69 gttaattcca ggattgaaag gaagtgtcga atagtatagt atgctttcta taggccacta 60
gtggatctga tatcg 75
SEQ ID NO 70 | 56 | DNA | Artificial Sequence | Synthetic DNA
70 cgatatcaga tccactagtg gcctatagaa agcatactat actattcgac acttcc 56
SEQ ID NO 71 | 24 | DNA | Artificial Sequence | Synthetic DNA
71 caagctgctt ttacttagct aaac 24
SEQ ID NO 72 | 20 | DNA | Artificial Sequence | Synthetic DNA
72 ttccctttta cagtgcttcg 20
SEQ ID NO 73 | 20 | DNA | Artificial Sequence | Synthetic DNA
73 tgagggtgtg tacattgcag 20
SEQ ID NO 74 | 23 | DNA | Artificial Sequence | Synthetic DNA
74 tttactcatc tcatctcatc aag 23
SEQ ID NO 75 | 70 | DNA | Artificial Sequence | Synthetic DNA
75 accctttacg tcctggttgt cccttcccgc cttgatttgg ccttcatttt tctcaaaatt 60
caccaacctc 70
SEQ ID NO 76 | 71 | DNA | Artificial Sequence | Synthetic DNA
76 agttacatgc atgatgaata tgcgccatga gaggttggtg aattttgaga aaaatgaagg 60
ccaaatcaag g 71
SEQ ID NO 77 | 24 | DNA | Artificial Sequence | Synthetic DNA
77 ttttcactat cgggtgagaa tatc 24
SEQ ID NO 78 | 23 | DNA | Artificial Sequence | Synthetic DNA
78 gactatgtga tgccataggc aag 23
SEQ ID NO 79 | 23 | DNA | Artificial Sequence | Synthetic DNA
79 gtaaaaaaag catgcacgta tac 23
SEQ ID NO 80 | 23 | DNA | Artificial Sequence | Synthetic DNA
80 tctatcttca tcgtcattca ttg 23
SEQ ID NO 81 | 75 | DNA | Artificial Sequence | Synthetic DNA
81 atcttacata gtgtcgggaa caggtcattc taaaaaaagt aaaataaaat tccaccgcgg 60
tggcggccgc tctag 75
SEQ ID NO 82 | 65 | DNA | Artificial Sequence | Synthetic DNA
82 ctagagcggc cgccaccgcg gtggaatttt attttacttt ttttagaatg acctgttccc 60
gacac 65
SEQ ID NO 83 | 24 | DNA | Artificial Sequence | Synthetic DNA
83 cacaagctta ttcttccaaa aatc 24
SEQ ID NO 84 | 2916 | DNA | Artificial Sequence | Synthetic DNA
84 cagttttaaa aagtcagaga atgtagagaa gtatggatct
ttgaaaccct aaggatgagc 60caagaataag ggaacaagat tttggtaatt tccaaaaaat
caatagcatg caggacgtta 120tgaagaagag atctacgtat ggtcatttct
tcttcagatt ccctcatgga gaaagtgcgg 180cagatgtata tgacagagtc
gccagtttcc aagagacttt attcaggcac ttccatgata 240ggcaagagag
aagacccaga gatgttgttg tcctagttac acatggtatt tattccagag
300tattcctgat gaaatggttt agatggacat acgaagagtt tgaatcgttt
accaatgttc 360ctaacgggag cgtaatggtg atggaactgg acgaatccat
caatagatac gtcctgagga 420ccgtgctacc caaatggact gattgtgagg
gagacctaac tacatagtgt ttaaagatta 480cggatattta acttacttag
aataatgcca tttttttgag ttataataat cctacgttag 540tgtgagcggg
atttaaactg tgaggacctt aatacattca gacacttctg cggtatcacc
600ctacttattc ccttcgagat tatatctagg aacccatcag gttggtggaa
gattacccgt 660tctaagactt ttcagcttcc tctattgatg ttacacctgg
acaccccttt tctggcatcc 720agtttttaat cttcagtggc atgtgagatt
ctccgaaatt aattaaagca atcacacaat 780tctctcggat accacctcgg
ttgaaactga caggtggttt gttacgcatg ctaatgcaaa 840ggagcctata
tacctttggc tcggctgctg taacagggaa tataaagggc agcataattt
900aggagtttag tgaacttgca acatttacta ttttcccttc ttacgtaaat
atttttcttt 960ttaattctaa atcaatcttt ttcaattttt tgtttgtatt
cttttcttgc ttaaatctat 1020aactacaaaa aacacataca taaactaaaa
atgttgaacg cttacatcta cgatggtttg 1080agaactccat tcggtagaca
tgccggtgaa ttggcttcca tcagaccaga tgacttggct 1140ggtttagtca
tccaaagatt gattgaaaag accggtgttg ctggtgctga cattgaagat
1200gtcatcttcg gtgacaccaa ccaagctggt gaagattcca gaaacattgc
ccgtcacgct 1260gctttgttgg ctggtttgcc agttaccgtt ccaggtcaaa
ccgtcaacag attatgtgct 1320tctggtttag ctgccatcat tgactctgcc
agagccatca cctgtggtga aggtgactta 1380tacattgctg gtggtgttga
atccatgtcc agagctccat tcgtcatggg taaggctgaa 1440tctgcttact
ccagagatgc caagatctac gacaccacca ttggtaccag attcccaaac
1500aagaagattg ttgctcaata cggtggtcac tccatgccag aaaccggtga
caacgttgct 1560gtcgaatacg gtatctccag agaacaagct gacttattcg
ctgctcaatc tcaagccaag 1620taccaaaagg ctttggaaga aggtttcttt
gctggtgaaa tcactgctgt cgaagtttct 1680caaggtaaga aattgcctcc
aaagcaagtc actgaagatg aacacccaag accatcttcc 1740actttggaag
ctctatccaa gttgaagcca ttgttcgaag gtggtgttgt cactgctggt
1800aacgcttctg gtatcaacga tggtgctgct gctttgttga ttggttctga
agttgccggt 1860caaaagtacg gtttgactcc aatggccaag atcttgtctg
ctgctgctgc tggtgttgaa 1920ccaagaatca tgggtgctgg tccaattgaa
gccatcaaga aggctgttgc cagagctggt 1980ttgactttgg atgacttgga
catcattgaa atcaacgaag cctttgcttc tcaagtcttg 2040tcttgtttga
aaggtttggg tattgacttc aacgacccaa gagtcaaccc aaacggtggt
2100gccattgctg tcggtcaccc attgggtgct tctggtgctc gtttggcttt
gactgttgcc 2160cgtgaattgc aaagaagaaa caagaaatac gctgttgttt
ctctatgtat cggtgtcggt 2220caaggtttgg ctatggttat cgaaaatgta
tcataagtaa ggagttaaag gcaaagtttt 2280ctttactaga gccgttccca
caaataatta tacgtatatg cttcttttcg tttactatat 2340atctatattt
acaagccttt attcactgat gcaatttgtt tccaaatact tttttggaga
2400tctcataact agatatgatg atggcgcaac ttgggcgtat cttaattact
ctggctgcca 2460ggcccgtgta gagggccgca agaccttctg tacgccatat
agtctctaag aacttgaaca 2520tgttactaga cctattgccg cctttcggat
cgctattgtt catcatggat atttgccatc 2580tcgtcttacc gacatcaaaa
gggtgtgtgc atatagcagc tatcatccca cttatgcaac 2640cactggcaaa
actgtttata aaatggaccc agtttgcgtc cttagatgca aatcgagtag
2700aatctagcca tagtctttcc ttgcaaagtt cataggaact ccaatatatt
gcactaaacg 2760ggatccactg gtagagagcg actttgtatg ccccaattgc
gaaacccgcg ttatccttct 2820cgattcttta gtacccgacc aggacaagga
aaaggaggtc gaaacgtttt tgaagaaaca 2880agaggaacta cacggaagct
ctaaagatgg caacca 2916

SEQ ID NO: 85, length 3034, DNA, Artificial Sequence, Synthetic DNA
ggatccactg gtagagagcg actttgtatg ccccaattgc gaaacccgcg ttatccttct
60cgattcttta gtacccgacc aggacaagga aaaggaggtc gaaacgtttt tgaagaaaca
120agaggaacta cacggaagct ctaaagatgg caaccagcca gaaactaaga
aaatgaagtt 180gatggttcca actggcaccg ctggcttgaa caacaatacc
agccttccaa cttctgtaaa 240taacggcggt acgccagtgc caccagtacc
gttacctttc ggtatacctc ctttccccat 300gtttccaatg cccttcatgc
ctccaacggc tactatcaca aatcctcatc aagctgacgc 360aagccctaag
aaatgaataa caatactgac agtactaaat aattgcctac ttggcttcac
420atacgttgca tacgtcgata tagataataa tgataatgac agcaggatta
tcgtaatacg 480taatagttga aaatctcaaa aatgtgtggg tcattacgta
aataatgata ggaatgggat 540tcttctattt ttcctttttc cattctagca
gccgtcggga aaacgtggca tcctctcttt 600cgggctcaat tggagtcacg
ctgccgtgag catcctctct ttccatatct aacaactgag 660cacgtaacca
atggaaaagc atgagcttag cgttgctcca aaaaagtatt ggatggttaa
720taccatttgt ctgttctctt ctgactttga ctcctcaaaa aaaaaaaatc
tacaatcaac 780agatcgcttc aattacgccc tcacaaaaac ttttttcctt
cttcttcgcc cacgttaaat 840tttatccctc atgttgtcta acggatttct
gcacttgatt tattataaaa agacaaagac 900ataatacttc tctatcaatt
tcagttattg ttcttccttg cgttattctt ctgttcttct 960ttttcttttg
tcatatataa ccataaccaa gtaatacata ttcaaaatga cccacccaat
1020caagaagatt gccatcatcg gtgtcggtgt catgggttcc ggtattgctc
aaattgctgc 1080tcaatctggt cacatcactt acttatacga tgctaaggct
ggtgctgctc aacaagctaa 1140gcaacaattg gccatcactt tccaaaaatt
gttggacaag aacaagatca ccactgaata 1200cgctgatgct gctaacgcta
acttgttgat tgctaacgaa ttgcacgatt tgaaggactg 1260tgacttgatt
gtcgaagcca ttgttgaaag attagatatt aaacaatctt tgatgtccca
1320attggaagcc atcgttccag aaaccaccat cttggcttct aacacctctt
ctttgtccat 1380cactgccatt gcttccaact gtaagcatcc agaaagagtt
gctggttacc atttcttcaa 1440cccagttcca ttgatgaagg ttgttgaagt
catccaaggt ttgaaaactg acccaaagca 1500cattgaaact ttgaaccaat
tgtccagagt cttaggtcac agacctgttg ttgccaagga 1560caccccaggt
ttcatcatca accacgctgg tagagcttac ggtactgaag ccttgaaaat
1620cttgaatgaa aacgttaccg acatctctga aatcgacaga atcttgcgtg
acggtgttgg 1680tttcagaatg ggtccatttg aattgatgga cttgactggt
ttagatgtct cccacccagt 1740catggaatcc atttaccatc aatactacga
agaagctcgt tacagaccaa actctttgac 1800caagcaaatg ttggaagcta
agcaattagg tagaaaggtc ggtcaaggtt tctacgacta 1860cagaaccggt
tccaagactg gtgaaacttc tgccaaggtt gctgaaagat tgactttgta
1920cccaaaggtc tggattgctg ctgacttcga agatgacaaa caattgttga
tcaactattt 1980gaccacccac aacattcaat tggatgtcgg tgccaagcct
caagctgact ctttgtgtct 2040attagcttgt tacggtgaag ataccactca
cgctgctttg agattaaacg tcaacccagc 2100tcactctgtt gccattgaca
tgttgtacgg tatcgaaaag cacagaactt tgatgccatc 2160tttgatcact
gaagtcacct actctcacgc tgctcactcc atcttcaact tggatggtgc
2220catggtttcc actatcggtg aatctattgg tttcgttgct caaagaatct
tagctatggt 2280tatcaacttg ggttgtgaca ttgctcaaca agccattgct
tctgtcgatg acattaatgc 2340tgctgtccgt ttgggtctag gttacccatt
cggtccaatc gaatggggtg atgaaattgg 2400ttccaacaag atcttgttga
tcttgaacag aatcactgct ttgacctctg acccaagata 2460cagaccatct
ccatggttac aaagaagagt tgctttgaac ttgccattga cctttacgac
2520ctaagtaagc tcctgttgaa gtagcattta atcataattt ttgtcacatt
ttaatcaact 2580tgatttttct ggtttaattt ttctaatttt aattttaatt
tttttatcaa tgggaactga 2640tacactaaaa agaattagga gccaacaaga
ataagccgct tatttcctac tagagtttgc 2700ttaaaatttc atctcgaatt
gtcattctaa tattttatcc acacacacac cttaaaattt 2760ttagattaaa
tggcatcaac tcttagcttc acacacacac acacaccgaa gctggttgtt
2820ttatttgatt tgatataatt ggtttctctg gatggtactt tttctttctt
ggttatttcc 2880tattttaaaa tatgaaacgc acacaagtca taattattct
aatagagcac aattcacaac 2940acgcacattt caactttaat atttttttag
aaacacttta tttagtctaa ttcttaattt 3000ttaatatata taatgcacac
acacgtttac tagt 3034

SEQ ID NO: 86, length 2460, DNA, Artificial Sequence, Synthetic DNA
atatgaaacg cacacaagtc ataattattc taatagagca caattcacaa cacgcacatt
60tcaactttaa tattttttta gaaacacttt atttagtcta attcttaatt tttaatatat
120ataatgcaca cacacgttta ctagtaaggt gagacgcgca taaccgctag
agtactttga 180agaggaaaca gcaatagggt tgctaccagt ataaatagac
aggtacatac aacactggaa 240atggttgtct gtttgagtac gctttcaatt
catttgggtg tgcactttat tatgttacaa 300tatggaaggg aactttacac
ttctcctatg cacatatatt aattaaagtc caatgctagt 360agagaagggg
ggtaacaccc ctccgcgctc ttttccgatt tttttctaaa ccgtggaata
420tttcggttat ccttttgttg tttccgggtg tacaatatgg acttcctctt
ttctggcaac 480caaacccata catcgggatt cctataatac cttcgttggt
ctccctaaca tgtaggtggc 540ggaggggaga tatacaatag aacagatacc
agacaagaca taatgggcta aacaagacta 600caccaattac actgcctcat
tgatggtggt acataacgaa ctaatactgt agccctagac 660ttgatagcca
tcatcatatc gaagtttcac tacccttttt ccatttgcca tctattgaag
720taataatagg cgcatgcaac ttcttttctt tttttttctt ttctctctcc
cccgttgttg 780tctcaccata tccgcaatga caaaaaaatg atggaagaca
ctaaaggaaa aaattaacga 840caaagacagc accaacagat gtcgttgttc
cagagctgat gaggggtatc tcgaagcaca 900cgaaactttt tccttccttc
attcacgcac actactctct aatgagcaac ggtatacggc 960cttccttcca
gttacttgaa tttgaaataa aaaaaagttt gctgtcttgc tatcaagtat
1020aaatagacct gcaattatta atcttttgtt tcctcgtcat tgttctcgtt
ccctttcttc 1080cttgtttctt tttctgcaca atatttcaag ctataccaag
catacaatca actatctcat 1140atacaatgat tccagaccaa gacaactttg
ttgaaatcga cttctccatt gaacaaatcg 1200ctattgtcaa gatcaacaga
ccagcttcca agaacgcttt gaacactgaa gtcagaaagc 1260aattggctca
agccttcacc gaattgtctt tcaacgacca aatcaacgcc attgttttga
1320ctggtggtga agatgttttc gctgctggtg ctgacttgaa ggaaatggct
accgcttctt 1380ccactgacat gttgttgaga cacactgaac gttactggaa
cgccattgct caatgtccaa 1440agccagttat cgctgctgtc aacggttacg
ctttaggtgg tggttgtgaa ttggccatgc 1500acactgacat catcattgct
ggtaaatctg ccacctttgg tcaaccagaa atcaaggtcg 1560gtttgatgcc
aggtgctggt ggtacccaaa gattattcag agctgttggt aaattccacg
1620ctatgagaat gatcatgacc ggtgtcatgg ttcctgctga agaagcctac
ttgattggtt 1680tggtttctca agtcactgaa gattctcaaa ccattccaac
tgccatcaag atggctcaat 1740ctttggccaa gatgccacca attgctttgc
aacaaatcaa ggaagttgct ttgatgtccg 1800aagatgtccc attgaacgct
ggtttgactt tggaaagaaa gtctttccaa ttattattct 1860ccactgaaga
taagaacgaa ggtatcaatg ctttcatcga aaagagaaag ccatcttacc
1920atggaaaata agtaaataaa gcaatcttga tgaggataat gatttttttt
tgaatataca 1980taaatactac cgtttttctg ctagattttg tgaagacgta
aataagtaca tattactttt 2040taagccaaga caagattaag cattaacttt
acccttttct cttctaagtt tcaattctag 2100ttatcactgt ttaaaagtta
tggcgagaac gtcggcggtt aaaatatatt accctgaacg 2160tggtgaattg
aagttctagg atggtttaaa gatttttcct ttttgggaaa taagtaaaca
2220atatattgct gcctttgcaa aacgcacata cccacaatat gtgactattg
gcaaagaacg 2280cattatcctt tgaagaggtg gatactgata ctaagagagt
ctctattccg gctccacttt 2340tagtccagag attacttgtc ttcttacgta
tcagaacaag aaagcatttc caaagtaatt 2400gcatttgccc ttgagcagta
tatatatact aagaatcgta cgctgcaggt cgacgaattc 2460

SEQ ID NO: 87, length 2460, DNA, Artificial Sequence, Synthetic DNA
atatgaaacg cacacaagtc ataattattc taatagagca
caattcacaa cacgcacatt 60tcaactttaa tattttttta gaaacacttt atttagtcta
attcttaatt tttaatatat 120ataatgcaca cacacgttta ctagtaaggt
gagacgcgca taaccgctag agtactttga 180agaggaaaca gcaatagggt
tgctaccagt ataaatagac aggtacatac aacactggaa 240atggttgtct
gtttgagtac gctttcaatt catttgggtg tgcactttat tatgttacaa
300tatggaaggg aactttacac ttctcctatg cacatatatt aattaaagtc
caatgctagt 360agagaagggg ggtaacaccc ctccgcgctc ttttccgatt
tttttctaaa ccgtggaata 420tttcggttat ccttttgttg tttccgggtg
tacaatatgg acttcctctt ttctggcaac 480caaacccata catcgggatt
cctataatac cttcgttggt ctccctaaca tgtaggtggc 540ggaggggaga
tatacaatag aacagatacc agacaagaca taatgggcta aacaagacta
600caccaattac actgcctcat tgatggtggt acataacgaa ctaatactgt
agccctagac 660ttgatagcca tcatcatatc gaagtttcac tacccttttt
ccatttgcca tctattgaag 720taataatagg cgcatgcaac ttcttttctt
tttttttctt ttctctctcc cccgttgttg 780tctcaccata tccgcaatga
caaaaaaatg atggaagaca ctaaaggaaa aaattaacga 840caaagacagc
accaacagat gtcgttgttc cagagctgat gaggggtatc tcgaagcaca
900cgaaactttt tccttccttc attcacgcac actactctct aatgagcaac
ggtatacggc 960cttccttcca gttacttgaa tttgaaataa aaaaaagttt
gctgtcttgc tatcaagtat 1020aaatagacct gcaattatta atcttttgtt
tcctcgtcat tgttctcgtt ccctttcttc 1080cttgtttctt tttctgcaca
atatttcaag ctataccaag catacaatca actatctcat 1140atacaatgat
tccagaccaa gacaactttg ttgaaatcga cttctccatt gaacaaatcg
1200ctattgtcaa gatcaacaga ccagcttcca agaacgcttt gaacactgaa
gtcagaaagc 1260aattggctca agccttcacc gaattgtctt tcaacgacca
aatcaacgcc attgttttga 1320ctggtggtga agatgttttc gctgctggtg
ctgacttgaa ggaaatggct accgcttctt 1380ccactgacat gttgttgaga
cacactgaac gttactggaa cgccattgct caatgtccaa 1440agccagttat
cgctgctgtc aacggttacg ctttaggtgg tggttgtgaa ttggccatgc
1500acactgacat catcattgct ggtaaatctg ccacctttgg tcaaccagaa
atcaaggtcg 1560gtttgatgcc aggtgctggt ggtacccaaa gattattcag
agctgttggt aaattccacg 1620ctatgagaat gatcatgacc ggtgtcatgg
ttcctgctga agaagcctac ttgattggtt 1680tggtttctca agtcactgaa
gattctcaaa ccattccaac tgccatcaag atggctcaat 1740ctttggccaa
gatgccacca attgctttgc aacaaatcaa ggaagttgct ttgatgtccg
1800aagatgtccc attgaacgct ggtttgactt tggaaagaaa gtctttccaa
ttattattct 1860ccactgaaga taagaacgaa ggtatcaatg ctttcatcga
aaagagaaag ccatcttacc 1920atggaaaata agtaaataaa gcaatcttga
tgaggataat gatttttttt tgaatataca 1980taaatactac cgtttttctg
ctagattttg tgaagacgta aataagtaca tattactttt 2040taagccaaga
caagattaag cattaacttt acccttttct cttctaagtt tcaattctag
2100ttatcactgt ttaaaagtta tggcgagaac gtcggcggtt aaaatatatt
accctgaacg 2160tggtgaattg aagttctagg atggtttaaa gatttttcct
ttttgggaaa taagtaaaca 2220atatattgct gcctttgcaa aacgcacata
cccacaatat gtgactattg gcaaagaacg 2280cattatcctt tgaagaggtg
gatactgata ctaagagagt ctctattccg gctccacttt 2340tagtccagag
attacttgtc ttcttacgta tcagaacaag aaagcatttc caaagtaatt
2400gcatttgccc ttgagcagta tatatatact aagaatcgta cgctgcaggt
cgacgaattc 2460

SEQ ID NO: 88, length 2168, DNA, Artificial Sequence, Synthetic DNA
ttccctttta cagtgcttcg gaaaagcaca gcgttgtcca agggaacaat ttttcttcaa
60gttaatgcat aagaaatatc tttttttatg tttagctaag taaaagcagc ttggagtaaa
120aaaaaaaatg agtaaatttc tcgatggatt agtttctcac aggtaacata
acaaaaacca 180agaaaagccc gcttctgaaa actacagttg acttgtatgc
taaagggcca gactaatggg 240aggagaaaaa gaaacgaatg tatatgctca
tttacactct atatcaccat atggaggata 300agttgggctg agcttctgat
ccaatttatt ctatccatta gttgctgata tgtcccacca 360gccaacactt
gatagtatct actcgccatt cacttccagc agcgccagta gggttgttga
420gcttagtaaa aatgtgcgca ccacaagcct acatgactcc acgtcacatg
aaaccacacc 480gtggggcctt gttgcgctag gaataggata tgcgacgaag
acgcttctgc ttagtaacca 540caccacattt tcagggggtc gatctgcttg
cttcctttac tgtcacgagc ggcccataat 600cgcgcttttt ttttaaaagg
cgcgagacag caaacaggaa gctcgggttt caaccttcgg 660agtggtcgca
gatctggaga ctggatcttt acaatacagt aaggcaagcc accatctgct
720tcttaggtgc atgcgacggt atccacgtgc agaacaacat agtctgaaga
agggggggag 780gagcatgttc attctctgta gcagtaagag cttggtgata
atgaccaaaa ctggagtctc 840gaaatcatat aaatagacaa tatattttca
cacaatgaga tttgtagtac agttctattc 900tctctcttgc ataaataaga
aattcatcaa gaacttggtt tgatatttca ccaacacaca 960caaaaaacag
tacttcacta aatttacaca caatgatcaa caaaatcatc aacgacattg
1020aaccaatctt gaaatccatt ccagatggtt ccaccatcat gacttctggt
ttcggtacca 1080ctggtcaacc agaagctcta ttagaagcct tgattgactt
tgctccaaag gaattgacca 1140tcatcaacaa caatgcttct tctggtccaa
acggtttgac tcaattattc actgctggtt 1200tggtcaagaa attgatctgt
tcttacccaa agtccatttc ttccactgtt ttcccagatt 1260tgtacagagc
tggtaagatt gaattggaat tggttcctca aggtaactta gcttgtcgta
1320tccaagctgc tggtgctggt ttgggtgccg ttttcactcc aactggttac
ggtaccaaga 1380ttgctgaagg taaggaaacc agaatcatca acggtaagaa
ctacgttttg gaatacccat 1440tggaagctga ttacgctttc atctacgctg
acaaggctga cagatggggt aacttgacct 1500acagaaaggc tgccagaaac
ttcggtccaa tcatggccaa ggctgccaag accaccattg 1560ctcaagtcaa
ccaaaccgtc gaattgggtg atttggaccc agaatgtatc atcactccag
1620gtattttcgt ccaacacgtt gtcagattgg gtgacattaa gtaagtaagg
gcgcggatct 1680cttatgtctt tacgatttat agttttcatt atcaagtatg
cctatattag tatatagcat 1740ctttagatga cagtgttcga agtttcacga
ataaaagata atattctact ttttgctccc 1800accgcgtttg ctagcacgag
tgaacaccat ccctcgcctg tgagttgtac ccattcctct 1860aaactgtaga
catggtagct tcagcagtgt tcgttatgta cggcatcctc caacaaacag
1920tcggttatag tttgtcctgc tcctctgaat cgtctccctc gatatttctc
attttccttc 1980gcatgccagc attgaaatga tcgaagttca atgatgaaac
ggtaattctt ctgtcattta 2040ctcatctcat ctcatcaagt tatataattc
tatacggatg taatttttca cttttcgtct 2100tgacgtccac cctataattt
caattattga accctcacaa atgatgcact gcaatgtaca 2160caccctca
2168

SEQ ID NO: 89, length 2362, DNA, Artificial Sequence, Synthetic DNA
tttactcatc
tcatctcatc aagttatata attctatacg gatgtaattt ttcacttttc 60gtcttgacgt
ccaccctata atttcaatta ttgaaccctc acaaatgatg cactgcaatg
120tacacaccct caactagtaa tcctactctt gccgttgcca tccaaaatga
gctagaaggt 180ggattaacaa atataatgac aaatcgttgc ttgtctgact
tgattccact acagttacaa 240atatttgaca ttgtatataa gttttgcaag
ttcatcaaat ctatgagagc aaaattatgt 300caactggacc ccgtactata
tgagaaacac aaaagcggga tgatgaaaac actaaacgaa 360ggctatcgta
caaacaatgg cggtcaggaa gatgttggtt accaagaaga tgccgccctg
420gaattaattc agaagctgat tgaatacatt agcaacgcgt ccagcatttt
tcggaagtgt 480ctcataaact ttactcaaga gttaagtact gaaaaattcg
acttttatga tagttcaagt 540gtcgacgctg cgggtataga aagggttctt
tactctatag tacctcctcg ctcagcatct 600gcttcttccc aaagatgaac
gcggcgttat gtcactaacg acgtgcacca acttgcggaa 660agtggaatcc
cgttccaaaa ctggcatcca ctaattgata catctacaca ccgcacgcct
720tttttctgaa gcccactttc gtggactttg ccatatgcaa aattcatgaa
gtgtgatacc 780aagtcagcat acacctcact agggtagttt ctttggttgt
attgatcatt tggttcatcg 840tggttcatta attttttttc tccattgctt
tctggctttg atcttactat catttggatt 900tttgtcgaag gttgtagaat
tgtatgtgac aagtggcacc aagcatatat aaaaaaaaaa 960agcattatct
tcctaccaga gttgattgtt aaaaacgtat ttatagcaaa cgcaattgta
1020attaattctt attttgtatc ttttcttccc ttgtctcaat cttttatttt
tattttattt 1080ttcttttctt agtttctttc ataacaccaa gcaactaata
ctataacata caataataat 1140gaccatccaa aagagatcca gagaagatat
tgccatcatg attgctaagg acattccaga 1200tggttcttac gtcaacttgg
gtattggttt accaactcac gttgctaaat acttgccaaa 1260ggacaaggaa
atctttttgc actctgaaaa cggtgttttg gctttcggtc caccacctgc
1320tgaaggtgaa gaagatcaag atttggttaa cgctggtaag gaattagtca
ctttgttgtc 1380cggtggttgt ttcatgcacc acggtgactc tttcgacatc
atgagaggtg gtcatttgga 1440catctgtgtt atcggtgctt tccaagttgc
tttgaacggt gacttggcta actggcacac 1500tggtaaggat gacgatgttc
cagccgtcgg tggtgctatg gacttggctg tcggtgccaa 1560gagaattttc
gtctacatgg aacacaccac caagaagggt gaaccaaaga tcgtcaagca
1620tttgacctac ccaatcactg gtgaacaatg tgttgacaga atctacaccg
atttgtgtac 1680cattgaattg aaagatggtc aagcctacgt catcgaaatg
gttgacggtt tggacttcga 1740cactttacaa gctctaactg aatgtccatt
gattgaccac tgtacctact cctctttgat 1800ccaattgcga taagtaagtc
tgaagaatga atgatttgat gatttctttt tccctccatt 1860tttcttactg
aatatatcaa tgatatagac ttgtatagtt tattatttca aattaagtag
1920ctatatatag tcaagataac gtttgtttga cacgattaca ttattcgtcg
acatcttttt 1980tcagcctgtc gtggtagcaa tttgaggagt attattaatt
gaataggttc attttgcgct 2040cgcataaaca gttttcgtca gggacagtat
gttggaatga gtggtaatta atggtgacat 2100gacatgttat agcaataacc
ttgatgttta catcgtagtt taatgtacac cccgcgaatt 2160cgttcaagta
ggagtgcacc aattgcaaag ggaaaagctg aatgggcagt tcgaatagta
2220cttaagatta gttaaaagtc catgattgaa cattgatgtg gtagttacat
gcatgatgaa 2280tatgcgccat gagaggttgg tgaattttga gaaaaatgaa
ggccaaatca aggcgggaag 2340ggacaaccag gacgtaaagg gt
2362

SEQ ID NO: 90, length 2874, DNA, Artificial Sequence, Synthetic DNA
agttacatgc
atgatgaata tgcgccatga gaggttggtg aattttgaga aaaatgaagg 60ccaaatcaag
gcgggaaggg acaaccagga cgtaaagggt agcctcccca taacataaac
120tcaataaaat atatagtctt caacttgaaa aaggaacaag ctcatgcaaa
gaggtggtac 180ccgcacgccg aaatgcatgc aagtaaccta ttcaaagtaa
tatctcatac atgtttcatg 240agggtaacaa catgcgactg ggtgagcata
tgttccgctg
atgtgatgtg caagataaac 300aagcaagaca gaaactaact tcttcttcat
gtaataaaca caccccgcgt ttatttacct 360atctttaaac ttcaacacct
tatatcataa ctaatatttc ttgagataag cacactgcac 420ccataccttc
cttaaaaacg tagcttccag tttttggtgg ttctggcttc cttcccgatt
480ccgcccgcta aacgcataat tttgttgcct ggtggcattt gcaaaatgca
taacctatgc 540atttaaaaga ttatgtatgg tcttctgact tttcgtgtga
tgaggctcgt ggaaaaaatg 600aataatttat gaatttgaga acaattttgt
gttgttacgg tattttacta tggaataatc 660aatcaattga ggattttatg
caaatatcgt ttgaatattt ttccgaccct ttgagtactt 720ttcttcataa
ttgcataata ttgtccgctg cccgtttttc tgttagacgg tgtcttgatc
780tacttgctat cgttcaacac caccttattt tctaactatt ttttttttag
ctcatttgaa 840tcagcttatg gtgatggcac atttttgcat aaacctagct
gtcctcgttg aacataggaa 900aaaaaaatat ataaacaagg ctctttcact
ctccttggaa tcagatttgg gtttgttccc 960tttattttca tatttcttgt
catattcttt tctcaattat tatcttctac tcataacctc 1020acgcaaaata
acacagtcaa atcaatcaaa atgaacttgc acgaatacca agccaagcaa
1080ttgtttgctc gttacggtct accagctcca gttggttacg cttgtaccac
tccaagagaa 1140gctgaagaag ctgcctccaa gattggtgct ggtccatggg
ttgtcaagtg tcaagtccac 1200gctggtggtc gtggtaaggc tggtggtgtc
aaggttgtca actccaagga agatattaga 1260gctttcgctg aaaactggtt
aggtaagaga ttagtcacct accaaactga cgctaacggt 1320caacctgtta
accaaatctt agtcgaagct gccactgaca ttgccaagga attatacttg
1380ggtgccgttg ttgaccgttc ttccagaaga gttgttttca tggcttctac
tgaaggtggt 1440gttgaaatcg aaaaggttgc tgaagaaact ccacatttga
ttcacaaggt tgctttggac 1500ccattgactg gtccaatgcc ataccaaggt
agagaattgg ccttcaaatt gggtttggaa 1560ggtaagttgg tccaacaatt
caccaagatc ttcatgggtt tggctaccat cttcttggaa 1620agagacttgg
ctttgattga aatcaaccca ttagtcatca ccaagcaagg tgacttgatc
1680tgtttggatg gtaagttggg tgctgacggt aacgctttat tcagacaacc
agatttgaga 1740gaaatgagag atcaatctca agaagatcca agagaagctc
aagctgctca atgggaattg 1800aactacgttg ctttggacgg taacatcggt
tgtatggtta acggtgccgg tttggccatg 1860ggtaccatgg acattgtcaa
attgcacggt ggtgaaccag ctaacttctt ggatgtcggt 1920ggtggtgcca
ccaaggaaag agttactgaa gccttcaaga tcatcttatc tgacgacaag
1980gtcaaggctg tcttggttaa catcttcggt ggtattgtca gatgtgactt
gattgctgat 2040ggtatcatcg gtgctgttgc tgaagttggt gtcaatgtcc
cagttgttgt cagattggaa 2100ggtaacaacg ctgaattggg tgccaagaaa
ttggctgact ctggtttgaa catcattgct 2160gccaagggtt tgaccgatgc
tgctcaacaa gttgttgctg ctgtcgaagg gaaataagta 2220aggagttaaa
ggcaaagttt tctttactag agccgttccc acaaataatt atacgtatat
2280gcttcttttc gtttactata tatctatatt tacaagcctt tattcactga
tgcaatttgt 2340ttccaaatac ttttttggag atctcataac tagatatgat
gatggcgcaa cttgggcgta 2400tcttaattac tctggctgcc aggcccgtgt
agagggccgc aagaccttct gtacgccata 2460tagtctctaa gaacttgaac
atgttactag acctattgcc gcctttcgga tcgctattgt 2520tcatcatgga
tatttgccat ctcgtcttac cgacatcaaa agggtgtgtg catatagcag
2580ctatcatccc acttatgcaa ccactggcaa aactgtttat aaaatggacc
cagtttgcgt 2640ccttagatgc aaatcgagta gaatctagcc atagtctttc
cttgcaaagt tcataggaac 2700tccaatatat tgcactaaac gggatcctgt
ggtagaatac aaaagactat gtgatgccat 2760aggcaagaag ggagactctc
actccgagat gggcagcttg atcgcccagg aattgaattg 2820tattgtggtg
gagaaaggtc agtcagataa gatattctca cccgatagtg aaaa
2874

SEQ ID NO: 91, length 2356, DNA, Artificial Sequence, Synthetic DNA
gactatgtga
tgccataggc aagaagggag actctcactc cgagatgggc agcttgatcg 60cccaggaatt
gaattgtatt gtggtggaga aaggtcagtc agataagata ttctcacccg
120atagtgaaaa agacatgttg acgaacagcg aagagggcag caacaagagg
gtaggaggcc 180aaggtgatac tttgacagga gctatatcat gcatgcttgc
atttagtcgt gcaatgtatg 240actttaagat ttgtgagcag gaagaaaagg
gagaatcttc taacgataaa cccttgaaaa 300actgggtaga ctacgctatg
ttgagttgct acgcaggctg cacaattaca cgagaatgct 360cccgcgtagg
atttaaggct aagggacgtg caatgcagac gacagatcta aatgaccgtg
420tcggtgaagt gttcgccaaa cttttcggtt aacacatgca gtgatgcacg
cgcgatggtg 480ctaagttaca tatatatata tatatatata tatatatata
tatagccata gtgatgtcta 540agtaaccttt atggtatatt tcttaatgtg
gaaagatact agcgcgcgca cccacacact 600agcttcgtct tttcttgaag
aaaagaggaa gctcgctaaa tgggattcca ctttccgttc 660cctgccagct
catggaaaaa ggttagtgga acgatgaaga ataaaaagag agatccactg
720aggtgaaatt tgagctgaca gcgagtttca tgatcgtgat gaacaatggt
aacgagttgt 780ggctgttgcc agggagggtg gttctcaact tttaatgtat
ggccaaatcg ctacttgggt 840ttgttatata acaaagaaga aataatgaac
tgattctctt cctccttctt gtcctttctt 900aattctgttg taattacctt
cctttgtaat tttttttgta attattcttc ttaataatcc 960aaacaaacac
acatattaca ataatgtcca tcttgattga caagaacacc aaggtcatct
1020gtcaaggttt caccggttct caaggtactt tccactctga acaagccatt
gcttacggta 1080ccaagatggt tggcggtgtc accccaggta agggtggtac
cactcacttg ggtttaccag 1140ttttcaacac cgtcagagaa gctgttgctg
ccactggtgc taccgcttct gtcatctacg 1200ttccagctcc attctgtaag
gattccatct tggaagccat tgatgctggt atcaaattga 1260tcattaccat
tactgaaggt atcccaactt tggacatgtt gactgtcaag gtcaaattgg
1320atgaagctgg tgttagaatg attggtccaa actgtccagg tgtcatcact
ccaggtgaat 1380gtaagatcgg tattcaacca ggtcacattc acaagccagg
taaggttggt atcgtttccc 1440gttctggtac tttgacctac gaagctgtca
agcaaaccac tgactacggt ttcggtcaat 1500ctacctgtgt tggtatcggt
ggtgacccaa ttccaggttc caacttcatc gacatcttgg 1560aaatgtttga
aaaggaccct caaactgaag ccattgtcat gatcggtgaa atcggtggtt
1620ctgctgaaga agaagctgct gcttacatca aggaacacgt taccaagcca
gttgttggtt 1680acattgctgg tgttactgct ccaaagggta agagaatggg
tcatgccggt gccatcattg 1740ctggtggtaa gggtactgct gatgaaaaat
tcgctgcttt ggaagctgct ggtgtcaaga 1800ccgtcagatc tttggctgac
atcggtgaag ccttaaagac tgttttgaaa taagtaagcg 1860aatttcttat
gatttatgat ttttattatt aaataagtta taaaaaaaat aagtgtatac
1920aaattttaaa gtgactctta ggttttaaaa cgaaaattct tattcttgag
taactctttc 1980ctgtaggtca ggttgctttc tcaggtatag catgaggtcg
ctcttattga ccacacctct 2040accggcatgc cgagcaaatg cctgcaaatc
gctccccatt tcacccaatt gtagatatgc 2100taactccagc aatgagttga
tgaatctcgg tgtgtatttt atgtcctcag aggacaacac 2160ctgttgtaat
cgttcttcca caccgatcca cagcctagcc ttcagttggg ctctatcttc
2220atcgtcattc attgcatcta ctagcccctt acctgagctt caagacgtta
tatcgctttt 2280atgtatcatg atcttatctt gagatatgaa tacataaata
tatttactca agtgtatacg 2340tgcatgcttt ttttac 2356923138DNAArtificial
SequenceSynthetic DNA 92tctatcttca tcgtcattca ttgcatctac tagcccctta
cctgagcttc aagacgttat 60atcgctttta tgtatcatga tcttatcttg agatatgaat
acataaatat atttactcaa 120gtgtatacgt gcatgctttt tttacgacta
gtacgtctct tgggttgata aacttgtatg 180acatatgttc accgagtttt
gtcatgtcgt catactatac ggcagcggct tgttgctgcc 240gtttaatgaa
acagtttttt tcacgacaag attcttctat tgattattca catatgtatt
300ttaatgaaaa atgagtactt tataacacaa ccctaatgac aaatgaaaaa
gttgattgcc 360atgaactctt aaagcgattt atgagaacaa ttaattgatt
atatatatat atctttgcaa 420ttatgtcgtt tgttgcaaga tgcttctgaa
agtaagtaac tctataagat agataatgct 480acaagacgcc aaacgcaagt
gagtaagaaa taagagctgg caggtcttcg ccggaacact 540atcatcaaaa
tcactacaat ttagcggctt agcacaatac gcgttttcaa cttcctacgc
600tagcgatgac aaaatgtctc caagaggcgg aacttgcgac ggatgcatgg
aaatatctta 660cgtaatgaac ttccgtaatg aacttccgta attcaagatc
tcttagcatc tcttgttcaa 720tcttcagact ctactaagtg ttcttaccaa
ccattggatg ctcattacaa atgaatgaat 780atattgcacg gaacggaagc
ggcatgcttt ttccgtctcg tgtgcttagt aaagcaaaac 840ggagtagaat
cggtaagaac ttcctttttg ggttggaaaa tcattgccat tgtttggaca
900cctttctttt tccgtattgt tcgagcaccg cgtttctttt tgggtacttg
atgaggtagc 960agattcctgg aacgtgcttt ctctcgaggt aacctgcctt
gttcctcctg gtgactttct 1020aaaatataaa aggaaaagca tatctctagt
ttcgagtttt ttcttcatac tttatttcct 1080tatgttaaac ggtccagata
tagaataaat catcatatta agctaaatat agacgataat 1140atagtatcga
taatggaatc tttggaattg gaacaattag tcaagaaggt tttgttggaa
1200aaattggctg aacaaaagga agttccaacc aagaccacca cccaaggtgc
caagtccggt 1260gttttcgata ccgtcgatga agctgtccaa gctgccgtca
ttgctcaaaa ctgttacaag 1320gaaaaatctt tggaagaaag aagaaacgtt
gtcaaggcca tcagagaagc tttataccca 1380gaaatcgaaa ccattgctac
cagagctgtt gctgaaaccg gtatgggtaa tgtcaccgat 1440aaaatcttga
agaacacttt agctatcgaa aagactccag gtgttgaaga cttgtacact
1500gaagttgcta ccggtgacaa cggtatgact ttatacgaat tatctccata
cggtgtcatc 1560ggtgctgttg ctccatctac caacccaact gaaactttga
tctgtaactc catcggtatg 1620ttggctgctg gtaacgccgt tttctactct
cctcacccag gtgccaagaa catctcttta 1680tggttgattg aaaagttgaa
cactatcgtc agagattctt gtggtattga caacttgatt 1740gtcaccgttg
ccaagccatc tatccaagct gctcaagaaa tgatgaacca cccaaaggtt
1800ccattgttgg tcatcactgg tggtccaggt gttgtcttgc aagctatgca
atctggtaag 1860aaggttatcg gtgctggtgc tggtaaccct ccatccatcg
ttgacgaaac cgctaacatt 1920gaaaaggctg ctgctgacat tgtcgacggt
gcttcctttg accataatat cttgtgtatc 1980gctgaaaagt ctgttgttgc
cgttgactcc attgctgact tcttgttgtt ccaaatggaa 2040aagaacggtg
ctttgcacgt cactaaccca tctgacatcc aaaaattgga aaaggttgcc
2100gtcactgaca agggtgtcac caacaagaaa ttggttggta agtctgccac
tgaaatcttg 2160aaagaagctg gtattgcttg tgatttcacc ccaagattga
tcattgtcga aactgaaaag 2220tcccacccat tcgctactgt tgaattgttg
atgccaattg ttccagttgt cagagttcca 2280gacttcgatg aagctttgga
agttgccatt gaattggaac aaggtctaca tcacactgct 2340accatgcact
ctcaaaacat ctccagattg aacaaggctg cccgtgacat gcaaacctcc
2400atctttgtca agaacggtcc atctttcgct ggtttaggtt tcagaggtga
aggttccacc 2460actttcacca ttgctactcc aactggtgaa ggtactacca
ctgcccgtca cttcgctaga 2520agaagaagat gtgtcttgac tgatggtttc
tccattagat aagattaata taattatata 2580aaaatattat cttcttttct
ttatatctag tgttatgtaa aataaattga tgactacgga 2640aagctttttt
atattgtttc tttttcattc tgagccactt aaatttcgtg aatgttcttg
2700taagggacgg tagatttaca agtgatacaa caaaaagcaa ggcgcttttt
ctaataaaaa 2760gaagaaaagc atttaacaat tgaacacctc tatatcaacg
aagaatatta ctttgtctct 2820aaatccttgt aaaatgtgta cgatctctat
atgggttact cataagtgta ccgaagactg 2880cattgaaagt ttatgttttt
tcactggagg cgtcattttc gcgttgagaa gatgttctta 2940tccaaatttc
aactgttata tacaagagca aaaaattgcc aaaaaaaaca acatttattc
3000atttgaaata taaaatttgg gcttctatat tttaatattg cttttcaatt
actgttatta 3060aatctagagc ggccgccacc gcggtggaat tttattttac
tttttttaga atgacctgtt 3120cccgacacta tgtaagat
3138

SEQ ID NO: 93, length 1672, DNA, Artificial Sequence, Synthetic DNA
atttccaaag
taattgcatt tgcccttgag cagtatatat atactaagaa tcgtacgctg 60caggtcgacg
aattctaccg ttcgtataat gtatgctata cgaagttata gatctgttta
120gcttgcctcg tccccgccgg gtcacccggc cagcgacatg gaggcccaga
ataccctcct 180tgacagtctt gacgtgcgca gctcaggggc atgatgtgac
tgtcgcccgt acatttagcc 240catacatccc catgtataat catttgcatc
catacatttt gatggccgca cggcgcgaag 300caaaaattac ggctcctcgc
tgcagacctg cgagcaggga aacgctcccc tcacagacgc 360gttgaattgt
ccccacgccg cgcccctgta gagaaatata aaaggttagg atttgccact
420gaggttcttc tttcatatac ttccttttaa aatcttgcta ggatacagtt
ctcacatcac 480atccgaacat aaacaaccat gggtaaggaa aagactcacg
tttcgaggcc gcgattaaat 540tccaacatgg atgctgattt atatgggtat
aaatgggctc gcgataatgt cgggcaatca 600ggtgcgacaa tctatcgatt
gtatgggaag cccgatgcgc cagagttgtt tctgaaacat 660ggcaaaggta
gcgttgccaa tgatgttaca gatgagatgg tcagactaaa ctggctgacg
720gaatttatgc ctcttccgac catcaagcat tttatccgta ctcctgatga
tgcatggtta 780ctcaccactg cgatccccgg caaaacagca ttccaggtat
tagaagaata tcctgattca 840ggtgaaaata ttgttgatgc gctggcagtg
ttcctgcgcc ggttgcattc gattcctgtt 900tgtaattgtc cttttaacag
cgatcgcgta tttcgtctcg ctcaggcgca atcacgaatg 960aataacggtt
tggttgatgc gagtgatttt gatgacgagc gtaatggctg gcctgttgaa
1020caagtctgga aagaaatgca taagcttttg ccattctcac cggattcagt
cgtcactcat 1080ggtgatttct cacttgataa ccttattttt gacgagggga
aattaatagg ttgtattgat 1140gttggacgag tcggaatcgc agaccgatac
caggatcttg ccatcctatg gaactgcctc 1200ggtgagtttt ctccttcatt
acagaaacgg ctttttcaaa aatatggtat tgataatcct 1260gatatgaata
aattgcagtt tcatttgatg ctcgatgagt ttttctaatc agtactgaca
1320ataaaaagat tcttgttttc aagaacttgt catttgtata gtttttttat
attgtagttg 1380ttctatttta atcaaatgtt agcgtgattt atattttttt
tcgcctcgac atcatctgcc 1440cagatgcgaa gttaagtgcg cagaaagtaa
tatcatgcgt caatcgtatg tgaatgctgg 1500tcgctatact gctgtcgatt
cgatactaac gccgccatcc agtgtcgaaa acgagctcat 1560aacttcgtat
aatgtatgct atacgaacgg tagaattcga tatcagatcc actagtggcc
1620tatagaaagc atactatact attcgacact tcctttcaat cctggaatta ac
1672

SEQ ID NO: 94, length 550, DNA, Artificial Sequence, Synthetic DNA
cggcattatt
gtgtatggct caataatttt ataaaaaaag gaactattgg ttcttagtat 60tttcttgcta
gaagacatat tcttaccaat cctttcataa gctaattatg ccatccatat
120agcaagagaa tccggtgggg gcgccatgcc tatccggcgg caacattatt
actctggtat 180acgggcgtaa ctccataata tgccaccact tacctttaac
atgttcatgg taggtacccc 240acccagccat aaggaaattt tcaaaggcgt
tggatcaaaa aataggcctt tatttcatcg 300cgtgattgag gagcataaca
tgtttagtga aggtttcttt tggaaaactt cagtcgctca 360ttattagaac
cagggaggtc caggctttgc tggtgggaga gaaagcttat gaagctgggg
420ttgcagattt gtcgattggt cgccagtaca cagttttaaa aagtcagaga
atgtagagaa 480gtatggatct ttgaaaccct aaggatgagc caagaataag
ggaacaagat tttggtaatt 540tccaaaaaat 55095523DNAArtificial
SequenceSynthetic DNA 95ctagagcggc cgccaccgcg gtggaatttt attttacttt
ttttagaatg acctgttccc 60gacactatgt aagatctagc ttttaacata ttatggaaac
ctgaaatgta aaatctgaat 120ttttgtatat gtgtttatat ttgggtagtt
cttttgagga aagcatgcat agacttgctg 180tacgaacttt atgtgacttg
tagtgacgct gtttcatgag actttagccc tttgaacata 240ttatcatatc
tcagcttgaa atactataga tttacttttg cagccatttc ttggtgctcc
300aaggttgtgc gtatctatta cttaatttct gtccttgcca agttttgcag
cagggcggtc 360acaagactcc tctgccgtca ttccttagtc cttcgggaac
acacttattt atgtatttgt 420attctacaat tctacggtgc acaagggttg
ggcactgttg agctcagcac gcaactattg 480ctggcatgaa gataagattg
atttttggaa gaataagctt gtg 523

SEQ ID NO: 96, length 22, DNA, Artificial Sequence, Synthetic DNA
tgttcttctt ggaaaatgta cg 22

SEQ ID NO: 97, length 34, DNA, Artificial Sequence, Synthetic DNA
gattcgcggc cgcctgaact gaaacacaga agac 34

SEQ ID NO: 98, length 121, DNA, Artificial Sequence, Synthetic DNA
tttctcatgg tagcgcctgt gcttcggtta cttctaagga agtccacaca aatcaagatc 60
cgttagacgt ttcagcttcc aaaacagaag aatgtgagat gttcttcttg gaaaatgtac 120
g 121

SEQ ID NO: 99, length 120, DNA, Artificial Sequence, Synthetic DNA
gaggtggtac tgaagcaggt tgaggagagg catgatgggg gttctctgga acagctgatg 60
aagcaggtgt tgttgtctgt tgagagttag ccttagtgcc tgaactgaaa cacagaagac 120

SEQ ID NO: 100, length 120, DNA, Artificial Sequence, Synthetic DNA
tttctcatgg tagcgcctgt gcttcggtta cttctaagga agtccacaca aatcaagatc 60
cgttagacgt ttcagcttcc aaaacagaag aatgtgagag ctcccctcac agacgcgttg 120

SEQ ID NO: 101, length 120, DNA, Artificial Sequence, Synthetic DNA
gaggtggtac tgaagcaggt tgaggagagg catgatgggg gttctctgga acagctgatg 60
aagcaggtgt tgttgtctgt tgagagttag ccttagtgca aatgacaagt tcttgaaaac 120
* * * * *
References