U.S. patent application number 16/864098 was filed with the patent office on 2020-04-30 and published on 2020-10-15 for enhancing microbial metabolism of C5 organic carbon.
The applicant listed for this patent is MARA Renewables Corporation. Invention is credited to Roberto E. Armenta, Jeremy Benjamin, Alexandra Merkx-Jacques, Denise Muise, Holly Rasmussen, Mark Scaife, David Woodhall.
Publication Number | 20200325465 |
Application Number | 16/864098 |
Family ID | 1000004925969 |
Filed Date | 2020-04-30 |
Publication Date | 2020-10-15 |
United States Patent Application | 20200325465 |
Kind Code | A1 |
Merkx-Jacques; Alexandra; et al. | October 15, 2020 |
ENHANCING MICROBIAL METABOLISM OF C5 ORGANIC CARBON
Abstract
Provided herein are recombinant microorganisms having two or
more copies of a nucleic acid sequence encoding xylose isomerase,
wherein the nucleic acid encoding the xylose isomerase is an
exogenous nucleic acid. Optionally, the recombinant microorganisms
include at least one nucleic acid sequence encoding a xylulose
kinase and/or at least one nucleic acid sequence encoding a xylose
transporter. The provided recombinant microorganisms are capable of
growing on xylose as a carbon source.
Inventors:
Merkx-Jacques; Alexandra (Dartmouth, CA)
Woodhall; David (Dartmouth, CA)
Scaife; Mark (Dartmouth, CA)
Armenta; Roberto E. (Dartmouth, CA)
Muise; Denise (Dartmouth, CA)
Rasmussen; Holly (Dartmouth, CA)
Benjamin; Jeremy (Dartmouth, CA)
Applicant:
Name | City | State | Country | Type |
MARA Renewables Corporation | Dartmouth | | CA | |
Family ID: | 1000004925969 |
Appl. No.: | 16/864098 |
Filed: | April 30, 2020 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
15918786 | Mar 12, 2018 | 10662418 |
16864098 | | |
15208849 | Jul 13, 2016 | 9951326 |
15918786 | | |
62191983 | Jul 13, 2015 | |
62354444 | Jun 24, 2016 | |
Current U.S. Class: | 1/1 |
Current CPC Class: | C12P 7/6427 20130101; C07K 14/40 20130101; C12N 1/12 20130101; C07K 14/38 20130101; C12Y 503/01005 20130101; C12N 1/10 20130101; C12Y 207/01017 20130101; C07K 2319/21 20130101; C12N 9/92 20130101; C12P 7/64 20130101; C12N 9/1205 20130101 |
International Class: | C12N 9/92 20060101 C12N009/92; C12N 9/12 20060101 C12N009/12; C07K 14/40 20060101 C07K014/40; C12P 7/64 20060101 C12P007/64; C07K 14/38 20060101 C07K014/38; C12N 1/12 20060101 C12N001/12; C12N 1/10 20060101 C12N001/10 |
Claims
1. A recombinant microorganism comprising two or more copies of a
nucleic acid sequence encoding xylose isomerase, wherein the
nucleic acid encoding xylose isomerase is an exogenous nucleic
acid.
2. The recombinant microorganism of claim 1, further comprising at
least one nucleic acid sequence encoding a xylulose kinase.
3. The recombinant microorganism of claim 2, further comprising at
least one nucleic acid sequence encoding a xylose transporter.
4. The recombinant microorganism of claim 1, wherein the
microorganism comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 copies of the
exogenous nucleic acid sequence encoding xylose isomerase.
5. The recombinant microorganism of claim 1, wherein the nucleic
acid sequence encoding the xylose isomerase is at least 90%
identical to SEQ ID NO:2.
6. The recombinant microorganism of claim 2, wherein the
microorganism comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 copies of the
nucleic acid sequence encoding the xylulose kinase.
7. The recombinant microorganism of claim 6, wherein the nucleic
acid sequence encoding the xylulose kinase is at least 90%
identical to SEQ ID NO:5.
8. The recombinant microorganism of claim 3, wherein the
microorganism comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 copies of the
nucleic acid sequence encoding the xylose transporter.
9. The recombinant microorganism of claim 8, wherein the xylose
transporter is GXS1 from Candida intermedia.
10. The recombinant microorganism of claim 8, wherein the nucleic
acid sequence encoding the xylose transporter is at least 90%
identical to SEQ ID NO:23.
11. The recombinant microorganism of claim 3, wherein the
recombinant microorganism has increased xylose transport activity
as compared to a non-recombinant control microorganism, increased
xylose isomerase activity as compared to a non-recombinant control
microorganism, increased xylulose kinase activity as compared to a
non-recombinant control microorganism, or any combination
thereof.
12. The recombinant microorganism of claim 3, wherein the
recombinant microorganism grows with xylose as the sole carbon
source.
13. The recombinant microorganism of claim 3, wherein the nucleic
acid sequence encoding the xylose isomerase, the xylulose kinase
and/or the xylose transporter is operably linked to a promoter.
14. The recombinant microorganism of claim 13, wherein the promoter
is a tubulin promoter that is at least 80% identical to SEQ ID
NO:25 or SEQ ID NO:26.
15. The recombinant microorganism of claim 3, wherein the nucleic
acid sequence encoding the xylose isomerase, the xylulose kinase
and/or the xylose transporter comprises a terminator.
16. The recombinant microorganism of claim 15, wherein the
terminator is a tubulin terminator that is at least 80% identical
to SEQ ID NO:27, SEQ ID NO:28, or SEQ ID NO:30.
17. The recombinant microorganism of claim 1, wherein the
microorganism is either a Thraustochytrium or a Schizochytrium
microorganism.
18. The recombinant microorganism of claim 17, wherein the
microorganism is ONC-T18.
19. A method of making a recombinant xylose-metabolizing
microorganism comprising: providing one or more nucleic acid
constructs comprising a nucleic acid sequence encoding a xylose
isomerase, a nucleic acid sequence encoding a xylulose kinase and a
nucleic acid sequence encoding a xylose transporter; transforming
the microorganism with the one or more nucleic acid constructs; and
isolating microorganisms comprising at least two or more copies of
the nucleic acid sequences encoding the xylose isomerase.
20. The method of claim 19, further comprising isolating
microorganisms comprising at least one copy of the nucleic acid
sequence encoding the xylulose kinase.
21. The method of claim 20, further comprising isolating
microorganisms comprising at least one copy of the xylose transporter.
22. The method of claim 19, wherein the providing comprises
providing a first nucleic acid construct comprising a nucleic acid
sequence encoding a xylose isomerase, a second nucleic acid
construct comprising a nucleic acid sequence encoding a xylulose
kinase and a third nucleic acid construct comprising a nucleic acid
sequence encoding a xylose transporter.
23. The method of claim 22, wherein the first, second, and/or third
nucleic acid construct comprises a promoter, a selectable marker, a
nucleic acid sequence encoding a 2A peptide, the nucleic acid
sequence encoding the xylose isomerase, and a terminator.
24. The method of claim 23, wherein the promoter is a tubulin
promoter that is at least 80% identical to SEQ ID NO:25 or SEQ ID
NO:26.
25. The method of claim 23, wherein the terminator is a tubulin
terminator that is at least 80% identical to SEQ ID NO:27, SEQ ID
NO:28, or SEQ ID NO:30.
26. The method of claim 19, wherein the isolated recombinant
microorganism comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 copies of the
exogenous nucleic acid sequence encoding xylose isomerase.
27. The method of claim 19, wherein the nucleic acid sequence
encoding the xylose isomerase is at least 90% identical to SEQ ID
NO:2.
28. The method of claim 19, wherein the isolated recombinant
microorganism comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 copies of the
nucleic acid sequence encoding the xylulose kinase.
29. The method of claim 19, wherein the nucleic acid sequence
encoding the xylulose kinase is at least 90% identical to SEQ ID
NO:5.
30. The method of claim 19, wherein the isolated recombinant
microorganism comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 copies of the
nucleic acid sequence encoding the xylose transporter.
31. The method of claim 30, wherein the xylose transporter is GXS1
from Candida intermedia.
32. The method of claim 30, wherein the nucleic acid sequence
encoding the xylose transporter is at least 90% identical to SEQ ID
NO:23.
33. The method of claim 19, wherein the isolated recombinant
microorganism has increased xylose transport activity as compared
to a control non-recombinant microorganism, increased xylose
isomerase activity as compared to a control non-recombinant
microorganism, increased xylulose kinase activity as compared to a
control non-recombinant microorganism, or a combination
thereof.
34. The method of claim 19, wherein the isolated recombinant
microorganism grows with xylose as the sole carbon source.
35. The method of claim 19, wherein the microorganism is either a
Thraustochytrium or a Schizochytrium microorganism.
36. The method of claim 19, wherein the microorganism is ONC-T18.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of currently pending U.S.
application Ser. No. 15/918,786, filed Mar. 12, 2018, which is a
continuation of U.S. application Ser. No. 15/208,849, filed Jul.
13, 2016 (now U.S. Pat. No. 9,951,326), which claims the benefit of
priority to U.S. Provisional Application No. 62/191,983, filed Jul.
13, 2015, and U.S. Provisional Application No. 62/354,444, filed
Jun. 24, 2016, which are incorporated by reference herein in their
entireties.
BACKGROUND OF THE INVENTION
[0002] Heterotrophic fermentation of microorganisms is an efficient
way of generating high value oil and biomass products. Under
certain cultivation conditions, microorganisms synthesize
intracellular oil, which can be extracted and used to produce fuel
(e.g., biodiesel, bio-jetfuel, and the like) and nutritional lipids
(e.g., polyunsaturated fatty acids such as DHA, EPA, and DPA). The
biomass of some microorganisms is of great nutritional value due to
high polyunsaturated fatty acid (PUFA) and protein content, and can
be used as a nutritional supplement for animal feed.
Thraustochytrids are eukaryotic, single-cell, microorganisms which
can be used in such fermentation processes to produce lipids.
Heterotrophic fermentations with Thraustochytrids convert organic
carbon provided in the growth medium to lipids, which are harvested
from the biomass at the end of the fermentation process. However,
existing microorganism fermentations use mainly expensive
carbohydrates, such as glucose, as the carbon source.
BRIEF SUMMARY OF THE INVENTION
[0003] Provided herein are recombinant microorganisms having two or
more copies of a nucleic acid sequence encoding xylose isomerase,
wherein the nucleic acid encoding the xylose isomerase is an
exogenous nucleic acid. Optionally, the recombinant microorganisms
include at least one nucleic acid sequence encoding a xylulose
kinase and/or at least one nucleic acid sequence encoding a xylose
transporter. The provided recombinant microorganisms are capable of
growing on xylose as a carbon source.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] FIG. 1 is a schematic of the xylose metabolism pathway.
[0005] FIG. 2 is a graph showing expression of xylose isomerase in
WT ONC-T18 during cycles of glucose starvation.
[0006] FIG. 3 is a graph showing expression of the putative
xylulose kinase in WT ONC-T18 during cycles of glucose
starvation.
[0007] FIG. 4 is a schematic showing an alpha-tubulin ble-isomerase
plasmid construct.
[0008] FIG. 5 is a schematic showing an alpha-tubulin hygro-xylB
plasmid construct.
[0009] FIG. 6 is a schematic showing a nucleic acid construct
having an alpha-tubulin promoter, a ble sequence, a 2A sequence, a
xylose isomerase sequence, and an alpha-tubulin terminator.
[0010] FIG. 7 is an image of a Southern blot to probe the xylose
isomerase His-tagged gene within recombinant ONC-T18 strains "6"
and "16".
[0011] FIG. 8 is a graph showing the qPCR determination of the
number of xylose isomerase His-tagged gene insertions in
recombinant ONC-T18 strains.
[0012] FIG. 9 is an image of a Southern blot to probe the xylB gene
within recombinant ONC-T18 strains containing both xylose isomerase
and xylulose kinase referred to in the graph as "7-3" and
"7-7".
[0013] FIG. 10 is a graph of qPCR determination of the number of
xylB gene insertions in recombinant 7-3 and 7-7 ONC-T18
strains.
[0014] FIG. 11 is a graph showing the expression of the xylose
isomerase gene transcript in recombinant ONC-T18 strains "6" and
"16."
[0015] FIG. 12 is a graph showing the in vitro xylose isomerase
activity in WT ONC-T18 and recombinant ONC-T18 strains "6" and
"16."
[0016] FIG. 13 is a graph showing the combined xylose isomerase and
xylulose kinase activity in vitro of recombinant ONC-T18 strain
"16" encoding only xylose isomerase and recombinant ONC-T18 strains
"7-3" and "7-7" encoding xylose isomerase and xylulose kinase.
[0017] FIGS. 14A and 14B are graphs showing xylose uptake
improvement and decreased xylitol production in recombinant ONC-T18
strain "16" (squares). The Wild Type (WT) strain is represented by
diamonds.
[0018] FIGS. 15A and 15B are graphs showing xylose usage
improvement and decreased xylitol production in recombinant ONC-T18
strain "16" (squares) and recombinant ONC-T18 strains "7-3"
(triangles) and "7-7" (asterisks). The Wild Type (WT) strain is
represented by diamonds.
[0019] FIG. 16 is a graph showing accumulation of xylitol during a
glucose:xylose fermentation with recombinant ONC-T18 strain "16"
and recombinant ONC-T18 strain "7-7."
[0020] FIG. 17 is a schematic of different versions of the
constructs used for transformation of ONC-T18.
[0021] FIG. 18 is a graph showing the alignment of the xylB
sequence from E. coli (SEQ ID NO:20) with the codon optimized
version of E. coli xylB (SEQ ID NO:5).
[0022] FIGS. 19A, 19B, and 19C are graphs showing xylose usage
(FIG. 19A), glucose usage (FIG. 19B) and percent xylitol made (FIG.
19C) in strains comprising xylose isomerase, xylulose kinase and
the sugar transporter Gxs1. WT is wild-type; IsoHis XylB "7-7"
contains the xylose isomerase and xylB sequences; and 36-2, 36-9 and
36-16 are transformants containing Gxs1, xylose isomerase and the
xylB (xylulose kinase) sequences.
[0023] FIGS. 20A and 20B are graphs showing the impact of
incubation temperature on the activity of isomerase from T18 (FIG.
20A) and E. coli (FIG. 20B) with xylose (diamond) and xylulose
(square).
[0024] FIGS. 21A and 21B are graphs showing dose dependency of
isomerase from T18 (FIG. 21A) and E. coli (FIG. 21B) with xylose
(diamond) and xylulose (square).
[0025] FIGS. 22A and 22B are graphs showing xylose use (FIG. 22A)
and decreased xylitol production (FIG. 22B) in a T18B strain
engineered with xylose isomerases ("16" (squares), "B" (x), and "6"
(crosses)). FIGS. 22C (xylose) and 22D (xylitol production) show
the same data expressed relative to wild type (diamonds) at 4
(gray) and 7 (black) days.
[0026] FIGS. 23A and 23B are graphs showing xylose use and
decreased xylitol production in a T18B strain engineered with a
xylose isomerase "16" (squares) and strains engineered to express a
xylose isomerase and xylulose kinase "7-7" (x) and "7-3"
(triangles). FIGS. 23C (xylose) and 23D (xylitol production) show
the same data relative to wild type (diamonds) at 9 (gray) and 11
(black) days.
[0027] FIG. 24 is a graph showing improved xylose usage and
decreased xylitol production in a T18B strain engineered to express
a xylose isomerase and xylulose kinase "7-7" in fermentation. The
wild type strain is represented by diamonds and the dotted line and
the strain "7-7" is represented by circles.
[0028] FIG. 25 is a schematic showing α-tubulin aspTx-neo and
α-tubulin gxs1-neo constructs.
[0029] FIG. 26A is an image of a Southern blot to probe the Gxs1
gene within "7-7" T18B strains engineered with the xylose
transporter Gxs1. FIG. 26B is an image of a Southern blot to probe
the AspTx gene within "7-7" T18B strains engineered with the xylose
transporter AspTx.
[0030] FIG. 27A is a graph showing the use of xylose in T18 strains
engineered with a xylose isomerase, a xylulose kinase and either
the Gxs1 transporter (triangles) or AspTx transporter (circles).
Strain "7-7" is represented by diamonds. FIG. 27B is a bar graph of
the ratio of xylitol production versus xylose use for each of the 3
modified strains. FIG. 27C is a bar graph showing xylose use
relative to strain "7-7." FIG. 27D is a bar graph showing xylitol
production relative to strain "7-7."
[0031] FIG. 28 is a graph showing growth of wild type (WT)
(diamonds), isohis strain "16" (squares), strain "7-7" (x), and
transporter strains Gxs1 (asterisks) and AspTx (triangles) in media
containing xylose as sole carbon source.
[0032] FIG. 29A is a graph showing remaining glucose in alternative
feedstock containing glucose and xylose during growth of WT
(squares), strain "7-7" (triangles), and transporter strains Gxs1
(asterisks) and AspTx (crosses). FIG. 29B is a graph showing xylose
remaining and xylitol produced over time when WT (squares), strain
"7-7" (triangles), and transporter strains Gxs1 (asterisks) and
AspTx (crosses) are grown on alternative feedstock containing
glucose and xylose.
DETAILED DESCRIPTION OF THE INVENTION
[0033] Microorganisms such as Thraustochytrids encode genes
required for the metabolism of xylose. However, the microorganism's
innate metabolic pathways produce a large amount of the sugar
alcohol, xylitol, which is secreted and potentially hinders growth
of the microorganisms (see FIG. 14, WT). Furthermore, carbon atoms
sequestered into xylitol are atoms that are diverted away from the
target product in this process, namely, lipid production. In
nature, two xylose metabolism pathways exist, the xylose
reductase/xylitol dehydrogenase pathway and the xylose
isomerase/xylulose kinase pathway (FIG. 1). Thraustochytrids have
genes that encode proteins active in both pathways; however, the
former pathway appears to be dominant as evidenced by a build-up of
xylitol when grown in a xylose medium. In other organisms, the
build-up of xylitol has been shown to be due to an imbalance of the
redox co-factors required for the xylose reductase/xylitol
dehydrogenase pathway. Since the isomerase/kinase pathway does not depend on
redox co-factors, over-expression of the isomerase gene removes
co-factor dependence in the conversion of xylose to xylulose. As
shown herein, transcriptomic studies with ONC-T18 showed that its
xylose isomerase and putative xylulose kinase genes are mostly
expressed during glucose starvation (FIG. 2 and FIG. 3); whereas,
the putatively identified genes encoding for the xylose reductase
and xylitol dehydrogenase are constitutively expressed. To increase
the expression of the isomerase and kinase throughout all growth
stages, microorganisms were engineered to include an ONC-T18 isomerase
gene and an E. coli xylulose kinase gene (xylB) such that they are
under the control of a constitutively expressed promoter and
terminator, e.g., an α-tubulin promoter and terminator.
Optionally, the genes can be under the control of an inducible
promoter and/or terminator.
[0034] The provided recombinant microorganisms demonstrate a level
of control of the amount of expression of a gene of interest via
the number of integrated transgene copies. As shown in the examples
below, a recombinant ONC-T18 strain (Iso-His #16) harbouring eight
(8) transgene copies demonstrates higher levels of xylose isomerase
transcript expression, enzyme activity and xylose metabolism than a
strain harbouring a single copy of the transgene (Iso-His #6). When
Iso-His #16 was further modified to incorporate the xylB gene, a
similar phenomenon was observed. Multiple copies of the xylB gene
conferred greater enzyme activity and xylose metabolism
productivity compared to single insertions. Thus, unexpectedly, it
was not only necessary to recreate a xylose metabolism pathway, but
to do so with multiple copies of the necessary transgenes. It was
not anticipated that the Thraustochytrid genome could accommodate
multiple transgene copies and remain viable; therefore, it was not
expected to observe such variability in expression levels amongst
transformant strains. However, as provided herein, recombinant
microorganisms can be produced that allow for controlled expression
levels of transgenes indirectly by selecting among transformant
strains that possess a transgene copy number "tailored" to a
particular expression level optimized for the metabolic engineering
of a particular pathway, e.g., the xylose pathway.
[0035] Provided herein are nucleic acids encoding one or more genes
involved in xylose metabolism. The present application provides
recombinant microorganisms, methods for making the microorganisms,
and methods for producing oil using the microorganisms that are
capable of metabolizing xylose. Specifically, provided herein are
nucleic acids and polypeptides encoding xylose isomerase, xylulose
kinase and xylose transporters for modifying microorganisms to be
capable of metabolizing xylose and/or growing on xylose as the sole
carbon source. Thus, provided are nucleic acids encoding a xylose
isomerase. The nucleic acid sequences can be endogenous or
heterologous to the microorganism. Exemplary nucleic acid
sequences of xylose isomerases include, but are not limited to,
those from Piromyces sp., Streptococcus sp., and Thraustochytrids.
For example, exemplary nucleic acid sequences encoding xylose
isomerases include, but are not limited to, SEQ ID NO:2 and SEQ ID
NO:15; and exemplary polypeptide sequences of xylose isomerase
include, but are not limited to, SEQ ID NO:16. Exemplary nucleic
acid sequences of xylulose kinases include, but are not limited
to, those from E. coli, Piromyces sp., Saccharomyces sp., and
Pichia sp. For example, exemplary nucleic acid sequences encoding
xylulose kinases include, but are not limited to, SEQ ID NO:5, SEQ
ID NO:17, SEQ ID NO:18, SEQ ID NO:19 and SEQ ID NO:20. Exemplary
nucleic acid sequences encoding sugar transporters, e.g., xylose
transporters, include, but are not limited to, those from
Aspergillus sp., Gfx1, Gxs1 and Sut1. For example, exemplary
nucleic acid sequences encoding xylose transporters include, but
are not limited to, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, and
SEQ ID NO:24.
[0036] Nucleic acid, as used herein, refers to deoxyribonucleotides
or ribonucleotides and polymers and complements thereof. The term
includes deoxyribonucleotides or ribonucleotides in either single-
or double-stranded form. The term encompasses nucleic acids
containing known nucleotide analogs or modified backbone residues
or linkages, which are synthetic, naturally occurring, and
non-naturally occurring, which have similar binding properties as
the reference nucleic acid, and which are metabolized in a manner
similar to the reference nucleotides. Examples of such analogs
include, without limitation, phosphorothioates, phosphoramidates,
methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl
ribonucleotides, and peptide-nucleic acids (PNAs). Unless otherwise
indicated, conservatively modified variants of nucleic acid
sequences (e.g., degenerate codon substitutions) and complementary
sequences can be used in place of a particular nucleic acid
sequence recited herein. Specifically, degenerate codon
substitutions may be achieved by generating sequences in which the
third position of one or more selected (or all) codons is
substituted with mixed-base and/or deoxyinosine residues (Batzer et
al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol.
Chem. 260:2605-2608 (1985); Rossolini et al., Mol. Cell. Probes
8:91-98 (1994)). The term nucleic acid is used interchangeably with
gene, cDNA, mRNA, oligonucleotide, and polynucleotide.
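By way of illustration only, the third-position degeneracy described above can be sketched in Python. The sketch assumes the standard genetic code and IUPAC mixed-base symbols; it considers only codons sharing the same first two bases, and it does not model deoxyinosine substitution. The function name and any input sequence are hypothetical and do not correspond to any SEQ ID NO recited herein.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order (assumption:
# the standard code applies; host codon usage is not modelled here).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC symbols for the third-position base sets that occur in the standard code.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("ACT"): "H", frozenset("ACGT"): "N",
}


def degenerate_third_positions(cds: str) -> str:
    """Replace the third base of each codon with a mixed-base symbol that
    covers every third base encoding the same amino acid (hypothetical helper)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for k in range(0, len(cds), 3):
        codon = cds[k:k + 3]
        amino_acid = CODON_TABLE[codon]
        # Third bases that keep the encoded amino acid unchanged.
        thirds = frozenset(b for b in BASES if CODON_TABLE[codon[:2] + b] == amino_acid)
        out.append(codon[:2] + IUPAC[thirds])
    return "".join(out)
```

For a toy input such as ATGGCTAAA (Met-Ala-Lys), the methionine codon is left unchanged because it has no synonymous third base, while the alanine and lysine codons receive mixed-base third positions.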
[0037] A nucleic acid is operably linked when it is placed into a
functional relationship with another nucleic acid sequence. For
example, DNA that encodes a presequence or secretory leader is
operably linked to DNA that encodes a polypeptide if it is
expressed as a preprotein that participates in the secretion of the
polypeptide; a promoter or enhancer is operably linked to a coding
sequence if it affects the transcription of the sequence; or a
ribosome binding site is operably linked to a coding sequence if it
is positioned so as to facilitate translation. Generally, operably
linked means that the DNA sequences being linked are near each
other, and, in the case of a secretory leader, contiguous and in
reading phase. However, enhancers do not have to be contiguous. For
example, a nucleic acid sequence that is operably linked to a
second nucleic acid sequence is covalently linked, either directly
or indirectly, to such second sequence, although any effective
three-dimensional association is acceptable. A single nucleic acid
sequence can be operably linked to multiple other sequences. For
example, a single promoter can direct transcription of multiple RNA
species. Linking can be accomplished by ligation at convenient
restriction sites. If such sites do not exist, synthetic
oligonucleotide adaptors or linkers are used in accordance with
conventional practice.
[0038] The terms identical or percent identity, in the context of
two or more nucleic acids or polypeptide sequences, refer to two or
more sequences or subsequences that are the same or have a
specified percentage of amino acid residues or nucleotides that are
the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%,
85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher
identity over a specified region, when compared and aligned for
maximum correspondence over a comparison window or designated
region) as measured using the BLAST or BLAST 2.0 sequence comparison
algorithms with default parameters described below, or by manual
alignment and visual inspection (see, e.g., NCBI web site or the
like). Such sequences are then said to be substantially identical.
This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes
sequences that have deletions and/or additions, as well as those
that have substitutions. As described below, the preferred
algorithms can account for gaps and the like. Preferably, identity
exists over a region that is at least about 25 amino acids or
nucleotides in length, or more preferably over a region that is
50-100 amino acids or nucleotides in length.
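As a minimal sketch of the percent identity calculation described above, the following Python assumes that the two sequences have already been aligned (for example, by one of the algorithms discussed below) and that gaps are written as "-". Counting a column with a gap in one sequence as a non-identical position is one common convention and is an assumption here; the function names and the 90% default threshold are illustrative only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Columns that are gaps in both sequences are ignored; columns with a gap
    in exactly one sequence count toward the length but not toward matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A test sequence that differs from a ten-residue reference at one aligned position would score 90% and so would just satisfy an "at least 90% identical" criterion under this convention.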
[0039] For sequence comparison, typically one sequence acts as a
reference sequence, to which test sequences are compared. When
using a sequence comparison algorithm, test and reference sequences
are entered into a computer, subsequence coordinates are
designated, if necessary, and sequence algorithm program parameters
are designated. Preferably, default program parameters can be used,
or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence
identities for the test sequences relative to the reference
sequence, based on the program parameters.
[0040] A comparison window, as used herein, includes reference to a
segment of any one of the number of contiguous positions selected
from the group consisting of from 20 to 600, usually about 50 to
about 200, more usually about 100 to about 150 in which a sequence
may be compared to a reference sequence of the same number of
contiguous positions after the two sequences are optimally aligned.
Methods of alignment of sequences for comparison are well-known in
the art. Optimal alignment of sequences for comparison can be
conducted, e.g., by the local homology algorithm of Smith &
Waterman, Adv. Appl. Math. 2:482 (1981); by the homology alignment
algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970);
by the search for similarity method of Pearson & Lipman, Proc.
Nat'l. Acad. Sci. USA 85:2444 (1988); by computerized
implementations of these algorithms (GAP, BESTFIT, FASTA, and
TFASTA in the Wisconsin Genetics Software Package, Genetics
Computer Group, 575 Science Dr., Madison, Wis.); or by manual
alignment and visual inspection (see, e.g., Current Protocols in
Molecular Biology (Ausubel et al., eds. 1995 supplement)).
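For illustration, a global alignment in the style of Needleman & Wunsch can be sketched in Python as follows. The scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for simplicity and are not the parameters of any program named above; the sketch returns one optimal alignment when several exist.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -1):
    """Global alignment by dynamic programming; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The two returned strings have equal length and can be passed to a percent identity calculation over aligned columns. A local alignment in the style of Smith & Waterman differs mainly in clamping cell scores at zero and tracing back from the highest-scoring cell.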
[0041] Preferred examples of algorithms that are suitable for
determining percent sequence identity and sequence similarity are
the BLAST and BLAST 2.0 algorithms, which are described in Altschul
et al., Nuc. Acids Res. 25:3389-3402 (1997), and Altschul et al.,
J. Mol. Biol. 215:403-410 (1990), respectively. BLAST and BLAST 2.0
are used, with the parameters described herein, to determine
percent sequence identity for nucleic acids or proteins. Software
for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information, as known in the art.
This algorithm involves first identifying high scoring sequence
pairs (HSPs) by identifying short words of a selected length (W) in
the query sequence, which either match or satisfy some
positive-valued threshold score T when aligned with a word of the
same length in a database sequence. T is referred to as the
neighborhood word score threshold (Altschul et al., supra). These
initial neighborhood word hits act as seeds for initiating searches
to find longer HSPs containing them. The word hits are extended in
both directions along each sequence for as far as the cumulative
alignment score can be increased. Cumulative scores are calculated
using, for nucleotide sequences, the parameters M (reward score for
a pair of matching residues; always >0) and N (penalty score for
mismatching residues; always <0). For amino acid sequences, a
scoring matrix is used to calculate the cumulative score. Extension
of the word hits in each direction is halted when: the cumulative
alignment score falls off by the quantity X from its maximum
achieved value; the cumulative score goes to zero or below, due to
the accumulation of one or more negative-scoring residue
alignments; or the end of either sequence is reached. The BLAST
algorithm parameters W, T, and X determine the sensitivity and
speed of the alignment. The Expectation value (E) represents the
number of different alignments with scores equivalent to or better
than what is expected to occur in a database search by chance. The
BLASTN program (for nucleotide sequences) uses as defaults a
wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP
program uses as defaults a wordlength of 3, expectation (E) of 10,
and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc.
Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of 50,
expectation (E) of 10, M=5, N=-4, and a comparison of both
strands.
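The seed-and-extend procedure described above for nucleotide sequences can be sketched, in simplified form, as follows. This is not the BLAST implementation: it seeds only on exact word matches (the neighborhood threshold T is not modelled), extends without gaps, and uses an assumed drop-off value X of 20. The reward M=5 and penalty N=-4 follow the defaults recited above; the function names are hypothetical.

```python
def find_seeds(query: str, subject: str, w: int = 11):
    """Return (query_pos, subject_pos) pairs where an exact word of length w matches."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in index.get(query[i:i + w], [])]


def extend_one_way(query, subject, i, j, step, start_score, m, n, x):
    """Ungapped extension from (i, j) in direction `step` (+1 or -1).

    Stops at the end of either sequence, when the running score is zero or
    below, or when it has fallen by at least x from the best score seen.
    Returns (best_score, number_of_positions_kept).
    """
    score = best = start_score
    length = best_len = 0
    while 0 <= i < len(query) and 0 <= j < len(subject):
        score += m if query[i] == subject[j] else n
        length += 1
        if score > best:
            best, best_len = score, length
        elif score <= 0 or best - score >= x:
            break
        i += step
        j += step
    return best, best_len


def extend_seed(query, subject, i, j, w=11, m=5, n=-4, x=20):
    """Extend one seed in both directions.

    Returns (score, (query_start, query_end), (subject_start, subject_end))
    with half-open coordinate ranges.
    """
    seed_score = w * m
    best, right_len = extend_one_way(query, subject, i + w, j + w, 1,
                                     seed_score, m, n, x)
    best, left_len = extend_one_way(query, subject, i - 1, j - 1, -1,
                                    best, m, n, x)
    q_start, q_end = i - left_len, i + w + right_len
    s_start = j - left_len
    return best, (q_start, q_end), (s_start, s_start + (q_end - q_start))
```

Calling find_seeds on a query and a subject and then extend_seed on each returned pair yields candidate high scoring segment pairs; a larger x extends further through mismatches at the cost of speed, which mirrors the sensitivity and speed trade-off noted above for W, T, and X.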
[0042] The term polypeptide, as used herein, generally has its
art-recognized meaning of a polymer of at least three amino acids
and is intended to include peptides and proteins. However, the term
is also used to refer to specific functional classes of
polypeptides, such as, for example, desaturases, elongases, etc.
For each such class, the present disclosure provides several
examples of known sequences of such polypeptides. Those of ordinary
skill in the art will appreciate, however, that the term
polypeptide is intended to be sufficiently general as to encompass
not only polypeptides having the complete sequence recited herein
(or in a reference or database specifically mentioned herein), but
also to encompass polypeptides that represent functional fragments
(i.e., fragments retaining at least one activity) of such complete
polypeptides. Moreover, those in the art understand that protein
sequences generally tolerate some substitution without destroying
activity. Thus, any polypeptide that retains activity and shares at
least about 30-40% overall sequence identity, often greater than
about 50%, 60%, 70%, or 80%, and further usually including at least
one region of much higher identity, often greater than 90% or even
95%, 96%, 97%, 98%, or 99% in one or more highly conserved regions,
usually encompassing at least 3-4 and often up to 20 or more amino
acids, with another polypeptide of the same class, is encompassed
within the relevant term polypeptide as used herein. Those in the
art can determine other regions of similarity and/or identity by
analysis of the sequences of various polypeptides described herein.
As those in the art will appreciate, a variety of strategies are known,
and tools are available, for performing comparisons of amino acid
or nucleotide sequences in order to assess degrees of identity
and/or similarity. These strategies include, for example, manual
alignment, computer assisted sequence alignment and combinations
thereof. A number of algorithms (which are generally computer
implemented) for performing sequence alignment are widely
available, or can be produced by one of skill in the art.
Representative algorithms include, e.g., the local homology
algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2: 482);
the homology alignment algorithm of Needleman and Wunsch (J. Mol.
Biol., 1970, 48: 443); the search for similarity method of Pearson
and Lipman (Proc. Natl. Acad. Sci. (USA), 1988, 85: 2444); and/or
by computerized implementations of these algorithms (e.g., GAP,
BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software
Package Release 7.0, Genetics Computer Group, 575 Science Dr.,
Madison, Wis.). Readily available computer programs incorporating
such algorithms include, for example, BLASTN, BLASTP, Gapped BLAST,
PILEUP, CLUSTALW, etc. When utilizing BLAST and Gapped BLAST
programs, default parameters of the respective programs may be
used. Alternatively, the practitioner may use non-default
parameters depending on his or her experimental and/or other
requirements (see for example, the Web site having URL
www.ncbi.nlm.nih.gov).
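As an illustration of how a percent identity threshold (for example, "at least 80% identical") might be evaluated once two sequences have been aligned, the following minimal sketch counts identical columns in a pre-computed alignment. The function names and the convention used (identical columns divided by all alignment columns, with gapped columns counted as mismatches) are assumptions; alignment programs differ in how they define the denominator, so values from BLAST or other tools may not match this calculation exactly.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumed convention: identical columns / total columns, where a column
    containing a gap counts as a mismatch and all-gap columns are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold):
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
    print(meets_identity_threshold("ACGT-A", "ACGTTA", 80.0))
```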
[0043] As discussed above, the nucleic acids encoding the xylose
transporter, xylulose kinase and xylose isomerase can be linked to
a promoter and/or terminator. Examples of promoters and terminators
include, but are not limited to, tubulin promoters and terminators.
By way of example, the promoter is a tubulin promoter, e.g., an
alpha-tubulin promoter. Optionally, the promoter is at least 80%
identical to SEQ ID NO:25 or SEQ ID NO:26. Optionally, the
terminator is a tubulin terminator. Optionally, the terminator is
at least 80% identical to SEQ ID NO:27, SEQ ID NO:28, or SEQ ID
NO:30.
[0044] As used herein, the terms promoter, promoter element, and
regulatory sequence refer to a polynucleotide that regulates
expression of a selected polynucleotide sequence operably linked to
the promoter, and that effects expression of the selected
polynucleotide sequence in cells. The term Thraustochytrium
promoter, as used herein, refers to a promoter that functions in a
Thraustochytrium cell. In some embodiments, a promoter element is
or comprises untranslated regions (UTR) in a position 5' of coding
sequences. 5' UTRs form part of the mRNA transcript and so are an
integral part of protein expression in eukaryotic organisms.
Following transcription, 5' UTRs can regulate protein expression at
both the transcription and translation levels.
[0045] As used herein, the term terminator refers to a
polynucleotide that abrogates expression of, targets for maturation
(e.g., adding a polyA tail), or imparts mRNA stability to a
selected polynucleotide sequence operably linked to the terminator
in cells. A terminator sequence may be downstream of a stop codon
in a gene. The term Thraustochytrium terminator, as used herein,
refers to a terminator that functions in a Thraustochytrium cell.
Provided herein are also nucleic acid constructs that include
nucleic acid sequences encoding xylose isomerase, xylulose kinase
and xylose transporter as well as promoters, terminators,
selectable markers, 2A peptides or any combination thereof. By way
of example, provided is a first nucleic acid construct including a
promoter, a selectable marker, a nucleic acid sequence encoding a
2A peptide, a nucleic acid sequence encoding a xylose isomerase,
and a terminator. Also provided is a second nucleic acid construct
including a promoter, selectable marker, a nucleic acid sequence
encoding a 2A peptide, a nucleic acid sequence encoding a xylulose
kinase, and a terminator. Further provided is a third nucleic acid
construct including a promoter, a nucleic acid sequence encoding a
xylose transporter, a nucleic acid sequence encoding a 2A peptide,
a selectable marker, and a terminator. These constructs are
exemplary and the nucleic acid sequences encoding the xylose
isomerase, xylulose kinase and xylose transporter can be included
on the same construct under control of the same or different
promoters. Optionally, each of the nucleic acid sequences encoding
the xylose isomerase, xylulose kinase and xylose transporter are on
the same construct and are separated by 2A polypeptide sequences,
e.g., as shown in SEQ ID NO:6. Thus, by way of example, a nucleic
acid construct can include a tubulin promoter, nucleic acid
sequences encoding a xylose isomerase, xylulose kinase, and xylose
transporter separated by a nucleic acid sequence encoding SEQ ID
NO:6, a tubulin terminator and a selectable marker. Optionally, the
selectable marker is the ble gene. Optionally, the selectable
marker comprises SEQ ID NO:29.
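The element order of the three exemplary constructs described above can be written down as simple ordered records. The following sketch is illustrative only: the element labels and the is_well_formed check (promoter first, terminator last, coding elements separated by a 2A sequence) are assumptions made for this example and are not a design tool described in this disclosure.

```python
# Coding elements that may appear between the promoter and the terminator.
CODING_ELEMENTS = {
    "selectable_marker",
    "xylose_isomerase",
    "xylulose_kinase",
    "xylose_transporter",
}

# Element order of the first, second and third exemplary constructs.
CONSTRUCTS = {
    "first": ("promoter", "selectable_marker", "2A",
              "xylose_isomerase", "terminator"),
    "second": ("promoter", "selectable_marker", "2A",
               "xylulose_kinase", "terminator"),
    "third": ("promoter", "xylose_transporter", "2A",
              "selectable_marker", "terminator"),
}


def is_well_formed(elements):
    """Check promoter ... terminator, with coding elements separated by 2A."""
    if len(elements) < 3:
        return False
    if elements[0] != "promoter" or elements[-1] != "terminator":
        return False
    inner = elements[1:-1]
    if len(inner) % 2 == 0:               # must start and end on a coding element
        return False
    for index, element in enumerate(inner):
        if index % 2 == 0 and element not in CODING_ELEMENTS:
            return False
        if index % 2 == 1 and element != "2A":
            return False
    return True


if __name__ == "__main__":
    for name, elements in CONSTRUCTS.items():
        print(name, is_well_formed(elements))
```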
[0046] The phrase selectable marker, as used herein, refers either
to a nucleotide sequence, e.g., a gene, that encodes a product
(polypeptide) that allows for selection, or to the gene product
(e.g., polypeptide) itself. The term selectable marker is used
herein as it is generally understood in the art and refers to a
marker whose presence within a cell or organism confers a
significant growth or survival advantage or disadvantage on the
cell or organism under certain defined culture conditions
(selective conditions). For example, the conditions may be the
presence or absence of a particular compound or a particular
environmental condition such as increased temperature, increased
radiation, presence of a compound that is toxic in the absence of
the marker, etc. The presence or absence of such compound(s) or
environmental condition(s) is referred to as a selective condition
or selective conditions. By growth advantage is meant either
enhanced viability (e.g., cells or organisms with the growth
advantage have an increased life span, on average, relative to
otherwise identical cells), increased rate of proliferation (also
referred to herein as growth rate) relative to otherwise identical
cells or organisms, or both. In general, a population of cells
having a growth advantage will exhibit fewer dead or nonviable
cells and/or a greater rate of cell proliferation than a population
of otherwise identical cells lacking the growth advantage. Although
typically a selectable marker will confer a growth advantage on a
cell, certain selectable markers confer a growth disadvantage on a
cell, e.g., they make the cell more susceptible to the deleterious
effects of certain compounds or environmental conditions than
otherwise identical cells not expressing the marker. Antibiotic
resistance markers are a non-limiting example of a class of
selectable marker that can be used to select cells that express the
marker. In the presence of an appropriate concentration of
antibiotic (selective conditions), such a marker confers a growth
advantage on a cell that expresses the marker. Thus, cells that
express the antibiotic resistance marker are able to survive and/or
proliferate in the presence of the antibiotic while cells that do
not express the antibiotic resistance marker are not able to
survive and/or are unable to proliferate in the presence of the
antibiotic.
[0047] Examples of selectable markers include common bacterial
antibiotics, such as but not limited to ampicillin, kanamycin and
chloramphenicol, as well as selective compounds known to function
in microalgae; examples include rrnS and AadA (Aminoglycoside
3'-adenylyltransferase), which may be isolated from E. coli plasmid
R538-1, conferring resistance to spectinomycin and streptomycin,
respectively in E. coli and some microalgae (Hollingshead and
Vapnek, Plasmid 13:17-30, 1985; Meslet-Cladiere and Vallon,
Eukaryot Cell. 10(12):1670-8 2011). Another example is the 23S RNA
protein, rrnL, which confers resistance to erythromycin (Newman,
Boynton et al., Genetics, 126:875-888 1990; Roffey, Golbeck et al.,
Proc. Natl Acad. Sci. USA, 88:9122-9126 1991). Another example is
Ble, a GC rich gene isolated from Streptoalloteichus hindustanus
that confers resistance to zeocin (Stevens, Purton et al., Mol.
Gen. Genet., 251:23-30 1996). Aph7 is yet another example, which is
a Streptomyces hygroscopicus-derived aminoglycoside
phosphotransferase gene that confers resistance to hygromycin B
(Berthold, Schmitt et al., Protist 153(4):401-412 2002). Additional
examples include: AphVIII, a Streptomyces rimosus derived
aminoglycoside 3'-phosphotransferase type VIII that confers
resistance to paromomycin in E. coli and some microalgae (Sizova,
Lapina et al., Gene 181(1-2):13-18 1996; Sizova, Fuhrmann et al.,
Gene 277(1-2):221-229 2001); Nat & Sat-1, which encode
nourseothricin acetyl transferase from Streptomyces noursei and
streptothricin acetyl transferase from E. coli, which confer
resistance to nourseothricin (Zaslavskaia, Lippmeier et al.,
Journal of Phycology 36(2):379-386, 2000); Neo, an aminoglycoside
3'-phosphotransferase, conferring resistance to the
aminoglycosides kanamycin, neomycin, and the analog G418 (Hasnain,
Manavathu et al., Molecular and Cellular Biology 5(12):3647-3650,
1985); and Cry1, a ribosomal protein S14 that confers resistance to
emetine (Nelson, Savereide et al., Molecular and Cellular Biology
14(6):4011-4019, 1994).
[0048] Other selectable markers include nutritional markers, also
referred to as auto- or auxo-trophic markers. These include
photoautotrophy markers that impose selection based on the
restoration of photosynthetic activity within a photosynthetic
organism. Photoautotrophic markers include, but are not limited to,
AtpB, TscA, PetB, NifH, psaA and psaB (Boynton, Gillham et al.,
Science 240(4858):1534-1538 1988; Goldschmidt-Clermont, Nucleic
Acids Research 19(15):4083-4089, 1991; Kindle, Richards et al.,
PNAS, 88(5):1721-1725, 1991; Redding, MacMillan et al., EMBO J
17(1):50-60, 1998; Cheng, Day et al., Biochemical and Biophysical
Research Communications 329(3):966-975, 2005). Alternative or
additional nutritional markers include ARG7, which encodes
argininosuccinate lyase, a critical step in arginine biosynthesis
(Debuchy, Purton et al., EMBO J 8(10):2803-2809, 1989); NIT1, which
encodes a nitrate reductase essential to nitrogen metabolism
(Fernandez, Schnell et al., PNAS, 86(17):6449-6453, 1989); THI10,
which is essential to thiamine biosynthesis (Ferris, Genetics
141(2):543-549, 1995); and NIC1, which catalyzes an essential step
in nicotinamide biosynthesis (Ferris, Genetics 141(2):543-549,
1995). Such markers are generally enzymes that function in a
biosynthetic pathway to produce a compound that is needed for cell
growth or survival. In general, under nonselective conditions, the
required compound is present in the environment or is produced by
an alternative pathway in the cell. Under selective conditions,
functioning of the biosynthetic pathway, in which the marker is
involved, is needed to produce the compound.
[0049] The phrase selection agent, as used herein, refers to an
agent that introduces a selective pressure on a cell or populations
of cells either in favor of or against the cell or population of
cells that bear a selectable marker. For example, the selection
agent is an antibiotic and the selectable marker is an antibiotic
resistance gene. Optionally, zeocin is used as the selection
agent.
[0050] Suitable microorganisms that can be transformed with the
provided nucleic acids encoding the genes involved in xylose
metabolism and nucleic acid constructs containing the same include,
but are not limited to, algae (e.g., microalgae), fungi (including
yeast), bacteria, or protists. Optionally, the microorganism
includes Thraustochytrids of the order Thraustochytriales, more
specifically Thraustochytriales of the genus Thraustochytrium.
Optionally, the population of microorganisms includes
Thraustochytriales as described in U.S. Pat. Nos. 5,340,594 and
5,340,742, which are incorporated herein by reference in their
entireties. The microorganism can be a Thraustochytrium species,
such as the Thraustochytrium species deposited as ATCC Accession
No. PTA-6245 (i.e., ONC-T18) as described in U.S. Pat. No.
8,163,515, which is incorporated by reference herein in its
entirety. Thus, the microorganism can have an 18S rRNA sequence
that is at least 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%,
99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% or more (e.g., including
100%) identical to SEQ ID NO:1.
[0051] Microalgae are acknowledged in the field to represent a
diverse group of organisms. For the purpose of this document, the
term microalgae will be used to describe unicellular microorganisms
derived from aquatic and/or terrestrial environments (some
cyanobacteria are terrestrial/soil dwelling). Aquatic environments
extend from oceanic environments to freshwater lakes and rivers,
and also include brackish environments such as estuaries and river
mouths. Microalgae can be photosynthetic; optionally, microalgae
are heterotrophic. Microalgae can be of eukaryotic nature or of
prokaryotic nature. Microalgae can be non-motile or motile.
[0052] The term thraustochytrid, as used herein, refers to any
member of the order Thraustochytriales, which includes the family
Thraustochytriaceae. Strains described as thraustochytrids include
the following organisms: Order: Thraustochytriales; Family:
Thraustochytriaceae; Genera: Thraustochytrium (Species: sp.,
arudimentale, aureum, benthicola, globosum, kinnei, motivum,
multirudimentale, pachydermum, proliferum, roseum, striatum),
Ulkenia (Species: sp., amoeboidea, kerguelensis, minuta, profunda,
radiata, sailens, sarkariana, schizochytrops, visurgensis,
yorkensis), Schizochytrium (Species: sp., aggregatum, limnaceum,
mangrovei, minutum, octosporum), Japonochytrium (Species: sp.,
marinum), Aplanochytrium (Species: sp., haliotidis, kerguelensis,
profunda, stocchinoi), Althornia (Species: sp., crouchii), or Elina
(Species: sp., marisalba, sinorifica). Species described within
Ulkenia will be considered to be members of the genus
Thraustochytrium. Strains described as being within the genus
Thraustochytrium may share traits in common with and also be
described as falling within the genus Schizochytrium. For example,
in some taxonomic classifications ONC-T18 may be considered within
the genus Thraustochytrium, while in other classifications it may be
described as within the genus Schizochytrium because it comprises
traits indicative of both genera.
[0053] The term transformation, as used herein, refers to a process
by which an exogenous or heterologous nucleic acid molecule (e.g.,
a vector or recombinant nucleic acid molecule) is introduced into a
recipient cell or microorganism. The exogenous or heterologous
nucleic acid molecule may or may not be integrated into (i.e.,
covalently linked to) chromosomal DNA making up the genome of the
host cell or microorganism. For example, the exogenous or
heterologous polynucleotide may be maintained on an episomal
element, such as a plasmid. Alternatively or additionally, the
exogenous or heterologous polynucleotide may become integrated into
a chromosome so that it is inherited by daughter cells through
chromosomal replication. Methods for transformation include, but
are not limited to, calcium phosphate precipitation; Ca²⁺
treatment; fusion of recipient cells with bacterial protoplasts
containing the recombinant nucleic acid; treatment of the recipient
cells with liposomes containing the recombinant nucleic acid; DEAE
dextran; fusion using polyethylene glycol (PEG); electroporation;
magnetoporation; biolistic delivery; retroviral infection;
lipofection; and micro-injection of DNA directly into cells.
[0054] The term transformed, as used in reference to cells, refers
to cells that have undergone transformation as described herein
such that the cells carry exogenous or heterologous genetic
material (e.g., a recombinant nucleic acid). The term transformed
can also or alternatively be used to refer to microorganisms,
strains of microorganisms, tissues, organisms, etc. that contain
exogenous or heterologous genetic material.
[0055] The term introduce, as used herein with reference to
introduction of a nucleic acid into a cell or organism, is intended
to have its broadest meaning and to encompass introduction, for
example by transformation methods (e.g., calcium-chloride-mediated
transformation, electroporation, particle bombardment), and also
introduction by other methods including transduction, conjugation,
and mating. Optionally, a construct is utilized to introduce a
nucleic acid into a cell or organism.
[0056] The microorganisms for use in the methods described herein
can produce a variety of lipid compounds. As used herein, the term
lipid includes phospholipids, free fatty acids, esters of fatty
acids, triacylglycerols, sterols and sterol esters, carotenoids,
xanthophylls (e.g., oxycarotenoids), hydrocarbons, and other lipids
known to one of ordinary skill in the art. Optionally, the lipid
compounds include unsaturated lipids. The unsaturated lipids can
include polyunsaturated lipids (i.e., lipids containing at least 2
unsaturated carbon-carbon bonds, e.g., double bonds) or highly
unsaturated lipids (i.e., lipids containing 4 or more unsaturated
carbon-carbon bonds). Examples of unsaturated lipids include
omega-3 and/or omega-6 polyunsaturated fatty acids, such as
docosahexaenoic acid (i.e., DHA), eicosapentaenoic acid (i.e.,
EPA), and other naturally occurring unsaturated, polyunsaturated
and highly unsaturated compounds.
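The definitions above (polyunsaturated: at least 2 unsaturated carbon-carbon bonds; highly unsaturated: 4 or more) can be expressed as a short sketch. The function names are assumptions for illustration; the double-bond counts listed are the standard values for these fatty acids (for example, DHA is 22:6 and EPA is 20:5).

```python
# Number of carbon-carbon double bonds for some common fatty acids.
FATTY_ACID_DOUBLE_BONDS = {
    "linoleic acid": 2,            # 18:2
    "alpha-linolenic acid": 3,     # 18:3
    "arachidonic acid": 4,         # 20:4
    "EPA": 5,                      # 20:5
    "DHA": 6,                      # 22:6
}


def is_polyunsaturated(double_bonds):
    """At least 2 unsaturated carbon-carbon bonds."""
    return double_bonds >= 2


def is_highly_unsaturated(double_bonds):
    """4 or more unsaturated carbon-carbon bonds."""
    return double_bonds >= 4


if __name__ == "__main__":
    for name, bonds in FATTY_ACID_DOUBLE_BONDS.items():
        print(name, is_polyunsaturated(bonds), is_highly_unsaturated(bonds))
```

Under these definitions every highly unsaturated lipid is also polyunsaturated, so the two checks are not mutually exclusive.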
[0057] Provided herein are recombinant microorganisms engineered to
express polypeptides for metabolizing C5 carbon sugars such as
xylose. Specifically, provided is a recombinant microorganism
having one or more copies of a nucleic acid sequence encoding
xylose isomerase, wherein the nucleic acid encoding xylose
isomerase is an exogenous nucleic acid. Optionally, the recombinant
microorganism comprises two or more copies of the nucleic acid
sequence encoding xylose isomerase. Optionally, the recombinant
microorganism also contains one or two copies of an endogenous
nucleic acid sequence encoding xylose isomerase. By way of example,
the recombinant microorganisms can contain one or two copies of an
endogenous nucleic acid sequence encoding xylose isomerase and one
copy of an exogenous nucleic acid sequence encoding xylose
isomerase. Optionally, the recombinant microorganism includes three
copies of a nucleic acid sequence encoding xylose isomerase, one
being exogenously introduced and the other two being endogenous.
The term recombinant when used with reference to a cell, nucleic
acid, polypeptide, vector, or the like indicates that the cell,
nucleic acid, polypeptide, vector or the like has been modified by
or is the result of laboratory methods and is non-naturally
occurring. Thus, for example, recombinant microorganisms include
microorganisms produced by or modified by laboratory methods, e.g.,
transformation methods for introducing nucleic acids into the
microorganism. Recombinant microorganisms can include nucleic acid
sequences not found within the native (non-recombinant) form of the
microorganisms or can include nucleic acid sequences that have been
modified, e.g., linked to a non-native promoter.
[0058] As used herein, the term exogenous refers to a substance,
such as a nucleic acid (e.g., nucleic acids including regulatory
sequences and/or genes) or polypeptide, that is artificially
introduced into a cell or organism and/or does not naturally occur
in the cell in which it is present. In other words, the substance,
such as nucleic acid or polypeptide, originates from outside a cell
or organism into which it is introduced. An exogenous nucleic acid
can have a nucleotide sequence that is identical to that of a
nucleic acid naturally present in the cell. For example, a
Thraustochytrid cell can be engineered to include a nucleic acid
having a Thraustochytrid or Thraustochytrium regulatory sequence.
In a particular example, an endogenous Thraustochytrid or
Thraustochytrium regulatory sequence is operably linked to a gene
with which the regulatory sequence is not involved under natural
conditions. Although the Thraustochytrid or Thraustochytrium
regulatory sequence may naturally occur in the host cell, the
introduced nucleic acid is exogenous according to the present
disclosure. An exogenous nucleic acid can have a nucleotide
sequence that is different from that of any nucleic acid that is
naturally present in the cell. For example, the exogenous nucleic
acid can be a heterologous nucleic acid, i.e., a nucleic acid from
a different species or organism. Thus, an exogenous nucleic acid
can have a nucleic acid sequence that is identical to that of a
nucleic acid that is naturally found in a source organism but that
is different from the cell into which the exogenous nucleic acid is
introduced. As used herein, the term endogenous refers to a
nucleic acid sequence that is native to a cell. As used herein, the
term heterologous refers to a nucleic acid sequence that is not
native to a cell, i.e., is from a different organism than the cell.
The terms exogenous and endogenous or heterologous are not mutually
exclusive. Thus, a nucleic acid sequence can be exogenous and
endogenous, meaning the nucleic acid sequence can be introduced
into a cell but have a sequence that is the same as or similar to
the sequence of a nucleic acid naturally present in the cell.
Similarly, a nucleic acid sequence can be exogenous and
heterologous meaning the nucleic acid sequence can be introduced
into a cell but have a sequence that is not native to the cell,
e.g., a sequence from a different organism.
[0059] As discussed above, the provided recombinant microorganisms
contain at least two copies of a nucleic acid sequence encoding a
xylose isomerase. The provided microorganisms optionally also
contain at least one nucleic acid sequence encoding a xylulose
kinase. Optionally, the recombinant microorganisms comprise at
least one nucleic acid sequence encoding a xylose transporter. The
nucleic acid sequences encoding the xylose isomerase, xylulose
kinase, and/or xylose transporter are, optionally, exogenous
nucleic acid sequences. Optionally, the nucleic acid sequence
encoding the xylose isomerase is an endogenous nucleic acid
sequence. Optionally, the nucleic acid sequence encoding the
xylulose kinase and/or xylose transporter is a heterologous nucleic
acid. Optionally, the microorganism contains at least two copies of
a nucleic acid sequence encoding a xylose isomerase, at least two
copies of a nucleic acid sequence encoding a xylulose kinase, and
at least one nucleic acid sequence encoding a xylose transporter.
Optionally, the heterologous nucleic acid sequence encoding the
xylose isomerase is at least 90% identical to SEQ ID NO:2.
Optionally, the heterologous nucleic acid sequence encoding the
xylulose kinase is at least 90% identical to SEQ ID NO:5. As noted
above, optionally, the nucleic acid encoding the xylose transporter
is a heterologous nucleic acid. Optionally, the xylose transporter
encoded by the heterologous nucleic acid is GXS1 from Candida
intermedia. Optionally, the heterologous nucleic acid sequence
encoding the xylose transporter is at least 90% identical to SEQ ID
NO:23.
[0060] The provided recombinant microorganisms not only contain
nucleic acid sequences encoding genes involved in xylose
metabolism, they can include multiple copies of such sequences.
Thus, the microorganism comprises at least 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 copies of
the nucleic acid sequence encoding xylose isomerase. Optionally,
the microorganism comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 copies of the
nucleic acid sequence encoding the xylulose kinase. Optionally, the
microorganism comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 copies of the
nucleic acid sequence encoding the xylose transporter.
[0061] In the provided microorganisms, the nucleic acids, e.g.,
those encoding the xylose isomerase, xylulose kinase or xylose
transporter, can be operably linked to a promoter and/or terminator.
Optionally, the
exogenous nucleic acid sequence encoding the xylose isomerase is
operably linked to a promoter. Optionally, the nucleic acid
sequence encoding the xylulose kinase and/or the nucleic acid
sequence encoding the xylose transporter are also operably linked
to a promoter. Optionally, the promoter is a tubulin promoter.
Optionally, the promoter is at least 80% identical to SEQ ID NO:25
or SEQ ID NO:26. Optionally, the exogenous nucleic acid sequence
encoding the xylose isomerase comprises a terminator. Optionally,
the nucleic acid sequence encoding the xylulose kinase comprises a
terminator. Optionally, the nucleic acid sequence encoding the
xylose transporter comprises a terminator. Optionally, the
terminator is a tubulin terminator. Optionally, the terminator is
at least 80% identical to SEQ ID NO:27, SEQ ID NO:28, or SEQ ID
NO:30.
[0062] The provided microorganisms can include a selectable marker
to confirm transformation of genes of interest. Thus, the
microorganism can further include a selectable marker. Optionally,
the selectable marker is an antibiotic resistance gene. Optionally,
the antibiotic is zeocin, hygromycin B, kanamycin or neomycin.
Optionally, the microorganism is either a Thraustochytrium or a
Schizochytrium microorganism. Optionally, the microorganism is
ONC-T18.
[0063] The provided microorganisms have distinguishing features
over wild type microorganisms. For example, the recombinant
microorganisms can have increased xylose transport activity as
compared to a non-recombinant control (or wild type) microorganism,
increased xylose isomerase activity as compared to a
non-recombinant control (or wild type) microorganism, increased
xylulose kinase activity as compared to a non-recombinant control
(or wild type) microorganism, or any combination of these
activities. Optionally, the recombinant microorganism grows with
xylose as the sole carbon source.
[0064] Also provided are methods of making the recombinant
microorganisms. Thus, provided is a method of making a recombinant
xylose-metabolizing microorganism including providing one or more
nucleic acid constructs comprising a nucleic acid sequence encoding
a xylose isomerase, a nucleic acid sequence encoding a xylulose
kinase and a nucleic acid sequence encoding a xylose transporter;
transforming the microorganism with the one or more nucleic acid
constructs; and isolating microorganisms comprising at least two
copies of the nucleic acid sequences encoding the xylose isomerase.
Optionally, the methods further include isolating microorganisms
comprising at least two copies of the nucleic acid sequence
encoding the xylulose kinase. Optionally, the method includes
isolating microorganisms comprising at least one copy of the xylose
transporter. Optionally, the one or more nucleic acid constructs
further comprise a selectable marker.
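The isolation step described above amounts to keeping only those transformants whose measured copy numbers meet the stated minimums. The sketch below illustrates that filter. The Transformant record, the select_isolates function, and the clone identifiers and copy numbers in the usage example are all hypothetical; how copy number is measured is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformant:
    """Hypothetical record of measured copy numbers for one transformant."""
    clone_id: str
    xylose_isomerase_copies: int
    xylulose_kinase_copies: int = 0
    xylose_transporter_copies: int = 0


def select_isolates(candidates, min_isomerase=2, min_kinase=0, min_transporter=0):
    """Keep transformants meeting the minimum copy number for each gene.

    The default reflects the required step (at least two copies of the
    xylose isomerase sequence); the kinase and transporter minimums are
    optional and default to no requirement.
    """
    return [
        c for c in candidates
        if c.xylose_isomerase_copies >= min_isomerase
        and c.xylulose_kinase_copies >= min_kinase
        and c.xylose_transporter_copies >= min_transporter
    ]


if __name__ == "__main__":
    # Hypothetical clones and copy numbers, for illustration only.
    clones = [
        Transformant("clone-A", 1, 2, 1),
        Transformant("clone-B", 3, 2, 1),
        Transformant("clone-C", 2, 0, 0),
    ]
    print([c.clone_id for c in select_isolates(clones)])
    print([c.clone_id for c in select_isolates(clones, min_kinase=2, min_transporter=1)])
```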
[0065] In the provided methods, the nucleic acid sequences encoding
the xylose isomerase, xylulose kinase and xylose transporter can be
located on the same or different constructs. Optionally, the method
includes providing a first nucleic acid construct comprising a
nucleic acid sequence encoding a xylose isomerase, a second nucleic
acid construct comprising a nucleic acid sequence encoding a
xylulose kinase and a third nucleic acid construct comprising a
nucleic acid sequence encoding a xylose transporter. Optionally,
the first, second and third nucleic acid constructs comprise the
same selectable marker. Optionally, the first nucleic acid
construct comprises a promoter, a selectable marker, a nucleic acid
sequence encoding a 2A peptide, the nucleic acid sequence encoding
the xylose isomerase, and a terminator. Optionally, the second
nucleic acid construct comprises a promoter, selectable marker, a
nucleic acid sequence encoding a 2A peptide, the nucleic acid
sequence encoding the xylulose kinase, and a terminator.
Optionally, the third nucleic acid construct comprises a promoter,
the nucleic acid sequence encoding the xylose transporter, a
nucleic acid sequence encoding a 2A peptide, a selectable marker,
and a terminator. As noted above, selectable markers include, but
are not limited to, antibiotic resistance genes. Optionally, the
antibiotic is zeocin, hygromycin B, kanamycin or neomycin.
Promoters used for the constructs include, but are not limited to,
a tubulin promoter. Optionally, the promoter is at least 80%
identical to SEQ ID NO:25 or SEQ ID NO:26. Terminators used for the
constructs include, but are not limited to, a tubulin terminator.
Optionally, the terminator is at least 80% identical to SEQ ID
NO:27, SEQ ID NO:28, or SEQ ID NO:30.
[0066] In the provided methods, the isolated recombinant
microorganisms can include one or more copies of the xylose
isomerase, xylulose kinase and xylose transporter. Optionally, the
isolated recombinant microorganism comprises at least 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or
40 copies of the nucleic acid sequence encoding xylose isomerase.
Optionally, the xylose isomerase is an endogenous xylose isomerase
or a heterologous xylose isomerase. Optionally, the nucleic acid
sequence encoding the xylose isomerase is at least 90% identical to
SEQ ID NO:2. Optionally, the isolated recombinant microorganism
comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
33, 34, 35, 36, 37, 38, 39, or 40 copies of the nucleic acid
sequence encoding the xylulose kinase. Optionally, the xylulose
kinase is a heterologous xylulose kinase. Optionally, the nucleic
acid sequence encoding the xylulose kinase is at least 90%
identical to SEQ ID NO:5. Optionally, the isolated recombinant
microorganism comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 copies of the
nucleic acid sequence encoding the xylose transporter. Optionally,
the xylose transporter is a heterologous xylose transporter.
Optionally, the xylose transporter is GXS1 from Candida intermedia.
Optionally, the nucleic acid sequence encoding the xylose
transporter is at least 90% identical to SEQ ID NO:23. Optionally,
the microorganism is either a Thraustochytrium or a Schizochytrium
microorganism. Optionally, the microorganism is ONC-T18.
[0067] As noted above, the isolated recombinant microorganisms can
have increased xylose transport activity as compared to a control
non-recombinant microorganism, increased xylose isomerase activity
as compared to a control non-recombinant microorganism, increased
xylulose kinase activity as compared to a control non-recombinant
microorganism, or a combination thereof. Optionally, the isolated
recombinant microorganism grows with xylose as the sole carbon
source.
[0068] As described herein, a control or standard control refers to
a sample, measurement, or value that serves as a reference, usually
a known reference, for comparison to a test sample, measurement, or
value. For example, a test microorganism, e.g., a microorganism
transformed with nucleic acid sequences encoding genes for
metabolizing xylose can be compared to a known normal (wild-type)
microorganism (e.g., a standard control microorganism). A standard
control can also represent an average measurement or value gathered
from a population of microorganisms (e.g., standard control
microorganisms) that do not grow or grow poorly on xylose as the
sole carbon source or that do not have or have minimal levels of
xylose isomerase activity, xylulose kinase activity and/or xylose
transport activity. One of skill will recognize that standard
controls can be designed for assessment of any number of parameters
(e.g., RNA levels, polypeptide levels, specific cell types, and the
like).
[0069] Provided herein are also methods of producing oil using the
recombinant microorganisms. The method includes providing the
recombinant microorganism, wherein the microorganism grows on
xylose as the sole carbon source, and culturing the microorganism
in a culture medium under suitable conditions to produce the oil.
Optionally, the oil comprises triglycerides. Optionally, the oil
comprises alpha linolenic acid, arachidonic acid, docosahexaenoic
acid, docosapentaenoic acid, eicosapentaenoic acid, gamma-linolenic
acid, linoleic acid, linolenic acid, or a combination thereof.
Optionally, the method further includes isolating the oil.
[0070] The provided methods include or can be used in conjunction
with additional steps for culturing microorganisms according to
methods known in the art. For example, a Thraustochytrid, e.g., a
Thraustochytrium sp., can be cultivated according to methods
described in U.S. Patent Publications 2009/0117194 or 2012/0244584,
which are herein incorporated by reference in their entireties for
each step of the methods or composition used therein.
[0071] Microorganisms are grown in a growth medium (also known as
culture medium). Any of a variety of media can be suitable for use
in culturing the microorganisms described herein. Optionally, the
medium supplies various nutritional components, including a carbon
source and a nitrogen source, for the microorganism. Medium for
Thraustochytrid culture can include any of a variety of carbon
sources. Examples of carbon sources include fatty acids, lipids,
glycerols, triglycerols, carbohydrates, polyols, amino sugars, and
any kind of biomass or waste stream. Fatty acids include, for
example, oleic acid. Carbohydrates include, but are not limited to,
glucose, cellulose, hemicellulose, fructose, dextrose, xylose,
lactulose, galactose, maltotriose, maltose, lactose, glycogen,
gelatin, starch (corn or wheat), acetate, m-inositol (e.g., derived
from corn steep liquor), galacturonic acid (e.g., derived from
pectin), L-fucose (e.g., derived from galactose), gentiobiose,
glucosamine, alpha-D-glucose-1-phosphate (e.g., derived from
glucose), cellobiose, dextrin, alpha-cyclodextrin (e.g., derived
from starch), and sucrose (e.g., from molasses). Polyols include,
but are not limited to, maltitol, erythritol, and adonitol. Amino
sugars include, but are not limited to, N-acetyl-D-galactosamine,
N-acetyl-D-glucosamine, and N-acetyl-beta-D-mannosamine.
[0072] Optionally, the microorganisms provided herein are
cultivated under conditions that increase biomass and/or production
of a compound of interest (e.g., oil or total fatty acid (TFA)
content). Thraustochytrids, for example, are typically cultured in
saline medium. Optionally, Thraustochytrids can be cultured in
medium having a salt concentration from about 0.5 g/L to about 50.0
g/L. Optionally, Thraustochytrids are cultured in medium having a
salt concentration from about 0.5 g/L to about 35 g/L (e.g., from
about 18 g/L to about 35 g/L). Optionally, the Thraustochytrids
described herein can be grown in low salt conditions. For example,
the Thraustochytrids can be cultured in a medium having a salt
concentration from about 0.5 g/L to about 20 g/L (e.g., from about
0.5 g/L to about 15 g/L). The culture medium optionally includes
NaCl. Optionally, the medium includes natural or artificial sea
salt and/or artificial seawater.
[0073] The culture medium can include non-chloride-containing
sodium salts as a source of sodium. Examples of non-chloride sodium
salts suitable for use in accordance with the present methods
include, but are not limited to, soda ash (a mixture of sodium
carbonate and sodium oxide), sodium carbonate, sodium bicarbonate,
sodium sulfate, and mixtures thereof. See, e.g., U.S. Pat. Nos.
5,340,742 and 6,607,900, the entire contents of each of which are
incorporated by reference herein. A significant portion of the
total sodium, for example, can be supplied by non-chloride salts
such that less than about 100%, 75%, 50%, or 25% of the total
sodium in culture medium is supplied by sodium chloride.
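By way of illustration only, the share of total sodium supplied by
sodium chloride in a given medium can be estimated from the salt
concentrations. The following is a minimal sketch, not part of the
disclosed methods; the function names, the molar-mass table, and the
example medium are assumptions introduced here.

```python
# Illustrative sketch; the example medium below is hypothetical.
NA_ATOMIC_MASS = 22.99  # g/mol

# salt name -> (molar mass in g/mol, sodium atoms per formula unit)
SODIUM_SALTS = {
    "NaCl": (58.44, 1),
    "Na2CO3": (105.99, 2),
    "NaHCO3": (84.01, 1),
    "Na2SO4": (142.04, 2),
}


def sodium_g_per_l(salt, grams_per_l):
    """Grams of sodium per litre contributed by one sodium salt."""
    molar_mass, sodium_atoms = SODIUM_SALTS[salt]
    return grams_per_l / molar_mass * sodium_atoms * NA_ATOMIC_MASS


def nacl_sodium_fraction(medium):
    """Fraction (0 to 1) of total sodium supplied by NaCl.

    `medium` maps a sodium salt name to its concentration in g/L.
    """
    total = sum(sodium_g_per_l(s, g) for s, g in medium.items())
    if total <= 0:
        raise ValueError("medium contains no sodium")
    return sodium_g_per_l("NaCl", medium.get("NaCl", 0.0)) / total


if __name__ == "__main__":
    example = {"NaCl": 2.0, "Na2SO4": 10.0}  # hypothetical g/L values
    print(f"{nacl_sodium_fraction(example):.1%} of sodium from NaCl")
```

Such a calculation can be used to check whether a candidate medium
falls below a chosen threshold (e.g., 25%) of sodium supplied by
sodium chloride.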
[0074] Medium for Thraustochytrids culture can include any of a
variety of nitrogen sources. Exemplary nitrogen sources include
ammonium solutions (e.g., NH.sub.4 in H.sub.2O), ammonium or amine
salts (e.g., (NH.sub.4).sub.2SO.sub.4, (NH.sub.4).sub.3PO.sub.4,
NH.sub.4NO.sub.3, NH.sub.4OOCH.sub.2CH.sub.3 (NH.sub.4Ac)),
peptone, tryptone, yeast extract, malt extract, fish meal, sodium
glutamate, soy extract, casamino acids and distiller grains.
Concentrations of nitrogen sources in suitable medium typically
range between and including about 1 g/L and about 25 g/L.
[0075] The medium optionally includes a phosphate, such as
potassium phosphate or sodium phosphate. Inorganic salts and trace
nutrients in medium can include ammonium sulfate, sodium
bicarbonate, sodium orthovanadate, potassium chromate, sodium
molybdate, selenous acid, nickel sulfate, copper sulfate, zinc
sulfate, cobalt chloride, iron chloride, manganese chloride, calcium
chloride, and EDTA. Vitamins such as pyridoxine hydrochloride,
thiamine hydrochloride, calcium pantothenate, p-aminobenzoic acid,
riboflavin, nicotinic acid, biotin, folic acid and vitamin B12 can
be included.
[0076] The pH of the medium can be adjusted to between and
including 3.0 and 10.0 using acid or base, where appropriate,
and/or using the nitrogen source. Optionally, the medium can be
sterilized.
[0077] Generally a medium used for culture of a microorganism is a
liquid medium. However, the medium used for culture of a
microorganism can be a solid medium. In addition to carbon and
nitrogen sources as discussed herein, a solid medium can contain
one or more components (e.g., agar or agarose) that provide
structural support and/or allow the medium to be in solid form.
[0078] Optionally, the resulting biomass is pasteurized to
inactivate undesirable substances present in the biomass. For
example, the biomass can be pasteurized to inactivate compound
degrading substances. The biomass can be present in the
fermentation medium or isolated from the fermentation medium for
the pasteurization step. The pasteurization step can be performed
by heating the biomass and/or fermentation medium to an elevated
temperature. For example, the biomass and/or fermentation medium
can be heated to a temperature from about 50.degree. C. to about
95.degree. C. (e.g., from about 55.degree. C. to about 90.degree.
C. or from about 65.degree. C. to about 80.degree. C.). Optionally,
the biomass and/or fermentation medium can be heated from about 30
minutes to about 120 minutes (e.g., from about 45 minutes to about
90 minutes, or from about 55 minutes to about 75 minutes). The
pasteurization can be performed using a suitable heating means,
such as, for example, by direct steam injection.
[0079] Optionally, no pasteurization step is performed. Stated
differently, the method taught herein optionally lacks a
pasteurization step.
[0080] Optionally, the biomass can be harvested according to a
variety of methods, including those currently known to one skilled
in the art. For example, the biomass can be collected from the
fermentation medium using, for example, centrifugation (e.g., with
a solid-ejecting centrifuge) or filtration (e.g., cross-flow
filtration). Optionally, the harvesting step includes use of a
precipitation agent for the accelerated collection of cellular
biomass (e.g., sodium phosphate or calcium chloride).
[0081] Optionally, the biomass is washed with water. Optionally,
the biomass can be concentrated up to about 20% solids. For
example, the biomass can be concentrated to about 5% to about 20%
solids, from about 7.5% to about 15% solids, or from about solids
to about 20% solids, or any percentage within the recited ranges.
Optionally, the biomass can be concentrated to about 20% solids or
less, about 19% solids or less, about 18% solids or less, about 17%
solids or less, about 16% solids or less, about 15% solids or less,
about 14% solids or less, about 13% solids or less, about 12%
solids or less, about 11% solids or less, about 10% solids or less,
about 9% solids or less, about 8% solids or less, about 7% solids
or less, about 6% solids or less, about 5% solids or less, about 4%
solids or less, about 3% solids or less, about 2% solids or less,
or about 1% solids or less.
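As a non-limiting illustration, the amount of liquid to be removed
to reach a target solids content can be estimated by a simple mass
balance. The sketch below assumes that percent solids is expressed
weight/weight and that no solids are lost during concentration;
both assumptions, the function name, and the example values are
introduced here and are not taken from the disclosure.

```python
def concentrate_to_solids(initial_mass_kg, initial_solids_pct,
                          target_solids_pct):
    """Return (final_mass_kg, liquid_removed_kg) for a dewatering step.

    Assumes percent solids is weight/weight and solids are conserved.
    """
    if not 0 < initial_solids_pct <= target_solids_pct <= 100:
        raise ValueError("need 0 < initial <= target <= 100 percent")
    solids_kg = initial_mass_kg * initial_solids_pct / 100.0
    final_mass_kg = solids_kg * 100.0 / target_solids_pct
    return final_mass_kg, initial_mass_kg - final_mass_kg


if __name__ == "__main__":
    # Hypothetical example: 100 kg of broth at 5% solids taken to 20%.
    final_mass, removed = concentrate_to_solids(100.0, 5.0, 20.0)
    print(f"final mass {final_mass:.1f} kg, removed {removed:.1f} kg")
```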
[0082] The provided methods, optionally, include isolating the
polyunsaturated fatty acids from the biomass or microorganisms.
Isolation of the polyunsaturated fatty acids can be performed using
one or more of a variety of methods, including those currently
known to one of skill in the art. For example, methods of isolating
polyunsaturated fatty acids are described in U.S. Pat. No.
8,163,515, which is incorporated by reference herein in its
entirety. Optionally, the medium is not sterilized prior to
isolation of the polyunsaturated fatty acids. Optionally,
sterilization comprises an increase in temperature. Optionally, the
polyunsaturated fatty acids produced by the microorganisms and
isolated from the provided methods are medium chain fatty acids.
Optionally, the one or more polyunsaturated fatty acids are
selected from the group consisting of alpha linolenic acid,
arachidonic acid, docosahexaenoic acid, docosapentaenoic acid,
eicosapentaenoic acid, gamma-linolenic acid, linoleic acid,
linolenic acid, and combinations thereof.
[0083] Oil including polyunsaturated fatty acids (PUFAs) and other
lipids produced according to the method described herein can be
utilized in any of a variety of applications exploiting their
biological, nutritional, or chemical properties. Thus, the provided
methods optionally include isolating oil from the harvested portion
of the threshold volume. Optionally, the oil is used to produce
fuel, e.g., biofuel. Optionally, the oil can be used in
pharmaceuticals, food supplements, animal feed additives,
cosmetics, and the like. Lipids produced according to the methods
described herein can also be used as intermediates in the
production of other compounds.
[0084] By way of example, the oil produced by the microorganisms
cultured using the provided methods can comprise fatty acids.
Optionally, the fatty acids are selected from the group consisting
of alpha linolenic acid, arachidonic acid, docosahexaenoic acid,
docosapentaenoic acid, eicosapentaenoic acid, gamma-linolenic acid,
linoleic acid, linolenic acid, and combinations thereof.
Optionally, the oil comprises triglycerides. Optionally, the oil
comprises fatty acids selected from the group consisting of
palmitic acid (C16:0), myristic acid (C14:0), palmitoleic acid
(C16:1(n-7)), cis-vaccenic acid (C18:1(n-7)), docosapentaenoic acid
(C22:5(n-6)), docosahexaenoic acid (C22:6(n-3)), and combinations
thereof.
[0085] Optionally, the lipids produced according to the methods
described herein can be incorporated into a final product (e.g., a
food or feed supplement, an infant formula, a pharmaceutical, a
fuel, etc.). Suitable food or feed supplements into which the
lipids can be incorporated include beverages such as milk, water,
sports drinks, energy drinks, teas, and juices; confections such as
candies, jellies, and biscuits; fat-containing foods and beverages
such as dairy products; processed food products such as soft rice
(or porridge); infant formulae; breakfast cereals; or the like.
Optionally, one or more produced lipids can be incorporated into a
dietary supplement, such as, for example, a vitamin or
multivitamin. Optionally, a lipid produced according to the method
described herein can be included in a dietary supplement and
optionally can be directly incorporated into a component of food or
feed (e.g., a food supplement).
[0086] Examples of feedstuffs into which lipids produced by the
methods described herein can be incorporated include pet foods such
as cat foods; dog foods; feeds for aquarium fish, cultured fish or
crustaceans, etc.; feed for farm-raised animals (including
livestock and fish or crustaceans raised in aquaculture). Food or
feed material into which the lipids produced according to the
methods described herein can be incorporated is preferably
palatable to the organism which is the intended recipient. This
food or feed material can have any physical properties currently
known for a food material (e.g., solid, liquid, soft).
[0087] Optionally, one or more of the produced compounds (e.g.,
PUFAs) can be incorporated into a nutraceutical or pharmaceutical
product. Examples of such nutraceuticals or pharmaceuticals
include various types of tablets, capsules, drinkable agents, etc.
Optionally, the nutraceutical or pharmaceutical is suitable for
topical application. Dosage forms can include, for example,
capsules, oils, granula, granula subtilae, pulveres, tabellae,
pilulae, trochisci, or the like.
[0088] The oil or lipids produced according to the methods
described herein can be incorporated into products as described
herein in combination with any of a variety of other agents. For
instance, such compounds can be combined with one or more binders
or fillers, chelating agents, pigments, salts, surfactants,
moisturizers, viscosity modifiers, thickeners, emollients,
fragrances, preservatives, etc., or any combination thereof.
[0089] Disclosed are materials, compositions, and components that
can be used for, can be used in conjunction with, can be used in
preparation for, or are products of the disclosed methods and
compositions. These and other materials are disclosed herein, and
it is understood that when combinations, subsets, interactions,
groups, etc. of these materials are disclosed that while specific
reference of each various individual and collective combinations
and permutations of these compounds may not be explicitly
disclosed, each is specifically contemplated and described herein.
For example, if a method is disclosed and discussed and a number of
modifications that can be made to a number of molecules including
the method are discussed, each and every combination and
permutation of the method, and the modifications that are possible
are specifically contemplated unless specifically indicated to the
contrary. Likewise, any subset or combination of these is also
specifically contemplated and disclosed. This concept applies to
all aspects of this disclosure including, but not limited to, steps
in methods using the disclosed compositions. Thus, if there are a
variety of additional steps that can be performed, it is understood
that each of these additional steps can be performed with any
specific method steps or combination of method steps of the
disclosed methods, and that each such combination or subset of
combinations is specifically contemplated and should be considered
disclosed.
[0090] Publications cited herein and the material for which they
are cited are hereby specifically incorporated by reference in
their entireties.
[0091] The examples below are intended to further illustrate
certain aspects of the methods and compositions described herein,
and are not intended to limit the scope of the claims.
EXAMPLES
Example 1. C5 Carbon Metabolism by Recombinant Thraustochytrids
[0092] In nature, two xylose metabolism pathways exist, the xylose
reductase/xylitol dehydrogenase pathway and the xylose
isomerase/xylulose kinase pathway (FIG. 1). ONC-T18 encodes genes
from both pathways, and, as described above, the xylose
reductase/xylitol dehydrogenase pathway is dominant, as evidenced
by a build-up of xylitol when grown in a xylose medium. Since the
isomerase/kinase pathway does not depend on redox co-factors,
over-expression of ONC-T18's isomerase gene removes co-factor
dependence in the conversion of xylose to xylulose. As shown herein
in FIGS. 2 and 3, transcriptomic studies with ONC-T18 showed that
its xylose isomerase and putative xylulose kinase genes were mostly
expressed during glucose starvation; whereas, the putatively
identified genes encoding for the xylose reductase and xylitol
dehydrogenase were constitutively expressed.
[0093] T18 isomerase was purified by metal-affinity chromatography
following his-tagging and over-expression in yeast INVSc1. As a
positive control, his-tagged XylA from E. coli strain W3110 was
over-expressed and purified from E. coli strain BL21(DE3)pLysS. The
protein concentration of purified proteins was determined by a
standard Bradford assay. The impact of temperature on the activity
of T18 isomerase and E. coli isomerase was determined using 5 .mu.g
of protein and 0.75 g/L of either xylose or xylulose in 5 mM MgATP,
50 mM Hepes (pH 7.4), 10 mM MgCl.sub.2. Reactions were incubated
overnight at 10.degree. C., 25.degree. C., 30.degree. C.,
37.degree. C., 50.degree. C., 60.degree. C., and 80.degree. C.
Reactions were stopped by heat inactivation at 95.degree. C. for 5
mins. Reactions were analyzed by HPLC and the concentration of the
sugars present was determined from the area under the peak relative
to a standard curve. T18 isomerase had higher activity on both
xylose and xylulose at temperatures at and above 37.degree. C.
(FIG. 20A). This is in contrast to E. coli isomerase, which had
higher activity at temperatures between 25.degree. C. and
30.degree. C. (FIG. 20B).
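The conversion of an HPLC peak area to a sugar concentration by
reference to a standard curve, as described above, can be sketched
as follows. This is an illustrative example only: it assumes a
linear detector response fitted by ordinary least squares, and the
function names and standard values are hypothetical rather than
taken from the experiments reported herein.

```python
def fit_standard_curve(concentrations, peak_areas):
    """Least-squares line through standards: area = slope*conc + intercept."""
    n = len(concentrations)
    if n < 2 or n != len(peak_areas):
        raise ValueError("need at least two matched standards")
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(area, slope, intercept):
    """Invert the standard curve to obtain concentration from peak area."""
    return (area - intercept) / slope


if __name__ == "__main__":
    # Hypothetical xylose standards (g/L) and their peak areas.
    standards = [0.0, 0.25, 0.5, 0.75, 1.0]
    areas = [10.0, 60.0, 110.0, 160.0, 210.0]
    slope, intercept = fit_standard_curve(standards, areas)
    print(concentration_from_area(135.0, slope, intercept))
```

In practice, a separate curve would be fitted for each sugar
(e.g., xylose and xylulose), since each elutes as its own peak.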
[0094] Dose-dependency was determined by incubating increasing
protein concentrations of the isomerase with 0.75 g/L xylose or
xylulose in 5 mM MgATP, 50 mM Hepes (pH 7.4), 10 mM MgCl.sub.2.
Reactions were incubated overnight at 30.degree. C. (E. coli) or
50.degree. C. (T18) then stopped by heat inactivation at 95.degree.
C. for 5 mins. Reactions were analyzed by HPLC and the
concentration of the sugars present was determined from the area
under the peak relative to a standard curve. Observed was a dose
dependency of T18 isomerase on both xylose and xylulose (FIGS. 21A
and 21B).
[0095] This example describes the use of a Thraustochytrium
ONC-T18-derived (ONC-T18) alpha-tubulin promoter to express
endogenous and/or heterologous xylose metabolism transgenes in
Thraustochytrid species, including ONC-T18. However, as discussed
throughout, other regulatory elements can be used. FIGS. 4 and 5
show constructs of the plasmids containing the xylose isomerase and
xylulose kinase genes, respectively. As described herein, the
xylose metabolism transgenes were present in multiple (.gtoreq.8)
copies within the genome of the host. In the case of ONC-T18, the
modified organisms demonstrated an increased metabolism of xylose
compared to wild-type (WT) cells. For example, a strain modified to
express an endogenous xylose isomerase gene (SEQ ID NO:2) (strain
Iso-His #16) and a strain modified to express an endogenous xylose
isomerase gene (SEQ ID NO:2) and a xylulose kinase gene (SEQ ID
NO:5) (Iso-His+xylB, strain 7-7) both used 40% more xylose than the
WT strain. Both Iso-His #16 and 7-7 converted less xylose to
xylitol than the WT strain, 40% less and 420% less, respectively.
The constructs used for transformation of ONC-T18 are shown in
FIGS. 4 and 5. ONC-T18 transformants were created using standard
biolistics protocols as described by BioRad's Biolistic PDS-1000/He
Particle Delivery System (Hercules, Calif.). Briefly, 0.6 .mu.m
gold particles were coated with 2.5 .mu.g of linearized plasmid DNA
(EcoRI, 37.degree. C., overnight). The coated gold particles were
used to bombard plates previously spread with 1 ml of ONC-T18 cells
at an OD600 of 1.0. The bombardment parameters included using a
helium pressure of 1350 or 1100 psi with a target distance of 3 or
6 cm. After an overnight recovery, the cells were washed off the
plate and plated on media containing selection antibiotics (Zeo 250
.mu.g/mL and hygro 400 .mu.g/mL). Plates were incubated for 1 week
at 25.degree. C. to identify resistant colonies. The resulting
transformants were screened by PCR and Southern blot.
[0096] Southern blots were performed using standard protocols.
Briefly, approximately 20 .mu.g of genomic DNA were digested with
40 units of BamHI restriction enzyme in a total volume of 50 .mu.L
overnight at 37.degree. C. 7.2 .mu.g of each digested sample was
run on a 1.0% agarose gel at 50 V for approximately 1.5 hours, with
a digoxigenin (DIG) DNA molecular-weight marker II (Roche, Basel,
Switzerland). DNA was depurinated in the gel by submerging the gel
in 250 mM HCl for 15 minutes. The gel was further denatured by
incubation in a solution containing 0.5 M NaOH and 1.5 M NaCl (pH
7.5) for two 15 minute washes. The reaction was then neutralized by
incubation in 0.5 M Tris-HCl (pH 7.5) for two 15 minute washes.
Finally, the gel was equilibrated in 20.times. saline-sodium
citrate (SSC) buffer for 15 minutes. DNA was transferred to a
positively charged nylon membrane using a standard transfer
apparatus. DNA was fixed to the membrane using a UV Stratalinker at
an exposure of 120,000 .mu.J. Southern blot probe was generated
using a PCR DIG Probe Synthesis Kit (Roche, Basel, Switzerland) to
generate a DIG-labelled probe according to the manufacturer's
instructions. The DNA affixed to the nylon membrane was
prehybridized with 20 mL of DIG EasyHyb solution (DIG EasyHyb
Granules, Roche, Basel, Switzerland). The DIG-labelled probe was
denatured by adding 40 .mu.L of the ble-probe reaction mixture to
300 .mu.L of ddH.sub.2O and incubated at 99.degree. C. for 5
minutes. This solution was then added to 20 mL of DIG hybridization
solution to create the probe solution. The probe solution was then
added to the DNA-affixed nylon membrane and incubated at 53.degree.
C. overnight. The following day, the membrane was washed twice in
2.times.SSC, 0.1% SDS at room temperature. The membrane was further
washed twice in 0.1.times.SSC, 0.1% SDS at 68.degree. C. for 15
minutes. For detection, the membrane was washed and blocked using
DIG Wash and Block Buffer set (Roche, Basel, Switzerland) according
to the manufacturer's instructions. An anti-DIG-AP conjugated
antibody from a DIG Nucleic Acid Detection Kit (Roche, Basel,
Switzerland) was used for detection. 2 .mu.L of the antibody
solution was added to 20 mL detection solution and incubated with
the membrane at room temperature for 30 minutes. The blot was then
immersed in a washing buffer provided with the kit. CDP-Star
(Roche, Basel, Switzerland) was used for visualization. 10 .mu.L of
the CDP-star solution was incubated on the membrane in 1 mL of
detection solution, which was covered in a layer of
`sheet-protector` plastic to hold the solution to the membrane.
Signal was immediately detected using a ChemiDoc imaging system
(BioRad Laboratories, Hercules, Calif.).
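For the restriction digest described above (approximately 20 .mu.g
of genomic DNA in a total volume of 50 .mu.L, with 7.2 .mu.g loaded
per lane), the corresponding gel loading volume follows from the
DNA concentration of the digest. The sketch below is illustrative;
the function name is hypothetical and the calculation assumes the
DNA is uniformly distributed in the digest.

```python
def load_volume_ul(dna_in_digest_ug, digest_volume_ul, dna_to_load_ug):
    """Volume of digest (uL) that contains the desired mass of DNA."""
    if dna_to_load_ug > dna_in_digest_ug:
        raise ValueError("cannot load more DNA than the digest contains")
    concentration_ug_per_ul = dna_in_digest_ug / digest_volume_ul
    return dna_to_load_ug / concentration_ug_per_ul


if __name__ == "__main__":
    # Values from the digest described above: 20 ug in 50 uL, load 7.2 ug.
    print(f"{load_volume_ul(20.0, 50.0, 7.2):.1f} uL per lane")
```

With the stated values, the digest is 0.4 .mu.g/.mu.L, so 7.2 .mu.g
corresponds to 18 .mu.L of digest per lane.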
[0097] The codon optimized ble gene was cloned under the control of
T18B .alpha.-tubulin promoter and terminator elements (FIG. 6). The
isomerase gene was cloned from T18B in such a way as to add a
six-histidine tag on the N-terminus of the expressed protein
(Iso-His). Xylose isomerase enzymatic activity was confirmed by
over-expression and purification of the histidine-tagged protein in
yeast. The isomerase gene (along with the introduced six-histidine
tag) was cloned under the control of the .alpha.-tubulin promoter
and terminators by cloning the gene downstream of the ble gene and
a 2A sequence (FIG. 4 and FIG. 6). Biolistic transformation of T18B
with this plasmid (pALPHTB-B2G-hisIso) resulted in Zeocin (zeo)
resistant transformants. Many transformant strains were obtained
from this procedure. Two of these strains are shown as example #6
containing one copy of the transgene and example #16 containing
eight copies of the transgene (FIG. 7).
[0098] The insertion of the Iso-His transgene within the T18B
genome was confirmed by PCR and Southern blot analysis (FIG. 7).
Qualitatively, these data showed the presence of a single copy of
the transgene in strain #6 and multiple, concatameric, transgene
copies, at a single site, in strain #16. The precise number of
Iso-His transgene insertions was determined by qPCR on genomic DNA
(FIG. 8). These data showed the presence of one copy of the
transgene in strain #6 and eight copies of the transgene in strain
#16 (FIG. 8). To test whether an increase in copy number correlated
with an increase in expression level, mRNA was isolated from WT,
Iso-His #6 and Iso-His #16 T18B cells and qRT-PCR was performed.
FIG. 11 shows significantly increased expression of the Iso-His
transcript in strain #16 cells, containing eight copies of the
transgene, compared to strain #6, containing a single copy of the
transgene. No Iso-His transcript is detectable in WT cells (FIG.
11). To assess whether increased mRNA expression correlated with
increased isomerase enzymatic activity, cell extracts were
harvested from WT, Iso-His #6 and Iso-His #16 cells. Enhanced
isomerase enzyme activity is observed in strain #16 cells compared
with strain #6 and WT cells (FIG. 12). Finally, the ability of
strain #16 to metabolize xylose was examined in xylose depletion
assays (FIG. 14) and compared with WT cells. These flask
fermentations demonstrated the ability to metabolise xylose and
quantify the amount of xylose converted to xylitol. Thus, FIG. 14
shows an increase in xylose metabolism in Iso-His strain #16
compared with WT cells and significantly less production of
xylitol.
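The qPCR calculation used to determine transgene copy number is not
detailed above. One common approach, shown here only as a sketch
under stated assumptions, is the comparative Ct method: the
transgene signal is normalized to a single-copy reference gene and
then to a calibrator strain of known copy number (for example, a
single-copy transformant). The sketch assumes 100% amplification
efficiency; the function name and all Ct values are hypothetical.

```python
def relative_copy_number(ct_target, ct_reference,
                         ct_target_calibrator, ct_reference_calibrator,
                         calibrator_copies=1):
    """Estimate transgene copies by the comparative Ct method.

    Assumes perfect doubling per cycle (100% efficiency) and a
    single-copy reference gene in both sample and calibrator.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_sample - delta_calibrator
    return calibrator_copies * 2 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the sample's transgene amplifies three
    # cycles earlier, relative to the reference, than the calibrator's.
    print(relative_copy_number(19.0, 22.0, 22.0, 22.0))
```

A three-cycle shift relative to a single-copy calibrator corresponds
to an eight-fold difference in template, i.e., about eight copies.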
[0099] For flask assays, cells were grown in media for 2 to 3
days. Pellets were washed twice in Media 2 (9 g/L NaCl, 4 g/L
MgSO.sub.4, 100 mg/L CaCl.sub.2, 5 mg/L FeCl.sub.3, 20 g/L
(NH.sub.4).sub.2SO.sub.4, 0.86 g/L KH.sub.2PO.sub.4, 150 .mu.g/L
vitamin B12, 30 .mu.g/L biotin, 6 mg/L thiamine hydrochloride, 1.5
mg/L cobalt (II) chloride, 3 mg/L manganese chloride) containing no
sugar. Then, minimal media containing 20 g/L glucose & 50 g/L
xylose was inoculated to an OD600 of 0.05 with the washed cells.
Samples were taken at various time points and the amount of sugar
remaining in the supernatant was analyzed by HPLC. As shown in
FIGS. 22A, 22B, 22C and 22D, increased xylose isomerase gene copy
number resulted in up to 40% more xylose usage and a 20% decrease
in xylitol production when compared to WT.
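Inoculating a flask to a target OD600 of 0.05, as described above,
amounts to a simple dilution calculation. The following sketch is
illustrative only; it assumes OD600 scales linearly with cell
density over the range used, and the function name and example
values are hypothetical.

```python
def inoculum_volume_ml(stock_od600, target_od600, final_volume_ml):
    """Volume of washed cell suspension needed to reach the target OD600.

    Uses C1*V1 = C2*V2, assuming OD600 is linear in cell density.
    """
    if stock_od600 <= target_od600:
        raise ValueError("stock must be denser than the target culture")
    return target_od600 * final_volume_ml / stock_od600


if __name__ == "__main__":
    # Hypothetical example: washed cells at OD600 2.5, 50 mL flask culture.
    print(f"{inoculum_volume_ml(2.5, 0.05, 50.0):.2f} mL of washed cells")
```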
[0100] Iso-His strain #16 was then used as the parent strain for a
second round of transformation to introduce the E. coli xylB gene.
This gene was introduced under hygromycin (hygro) selection. The
hygro gene from pChlamy_3, the 2A sequence, and the T18B codon
optimized W3110 E. coli xylB gene were cloned under the control of
the T18B .alpha.-tubulin promoter and terminator elements for
expression in T18B iso-his #16 (FIG. 5). The in vitro ability of
the E. coli xylulose kinase to work in concert with the T18B
isomerase was confirmed by over-expression and purification of the
histidine-tagged proteins in yeast followed by enzymatic reactions
with xylose and xylulose. Biolistic transformation of T18B iso-his
strain #16 with the xylB plasmid (pJB47) resulted in hygro and zeo
resistant transformants. The insertion of the hygro-2A-xylB genes
within the T18B genome was confirmed by PCR and Southern blot
analysis (FIG. 9). Qualitatively, these data show the presence of a
single copy of the transgene in strain #7-3 and multiple,
concatameric, transgene copies, at a single site, in strain #7-7.
The number of xylB gene insertions was determined by qPCR on
genomic DNA isolations (FIG. 10). FIG. 10 shows sixteen insertions
of the transgene in strain 7-7 and one copy in strain 7-3. To
determine whether multiple copies of the transgene confer enhanced
xylose metabolism in vitro, cell extract assays were performed and
the ability of the cells extracts to metabolise xylose was analysed
(FIG. 13). The ability of the transformant cells to metabolize
xylose was examined through flask-based xylose depletion assays
(FIG. 15). In this experiment, WT cells consumed the least amount
of xylose and made the most xylitol. Strain Iso-His #16, 7-3 and
7-7 all consumed similar amounts of xylose; however, only 7-7,
containing multiple copies of the xylB transgene, did not make
significant amounts of xylitol. Finally, strains Iso-His #16 and
7-7 were tested in 5 L fermentation vessels in media containing
glucose and xylose. During a seventy-seven (77) hour fermentation,
strain Iso-His #16 converted approximately 8% of xylose to xylitol,
whereas strain 7-7 converted approximately 2% of xylose to xylitol.
Xylitol accumulation in this fermentation is shown in FIG. 16.
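The percentage of xylose converted to xylitol can be computed from
HPLC measurements of the two compounds. The basis of the
percentages reported above (mass or molar) is not stated; the
sketch below assumes a mass basis, i.e., grams of xylitol
accumulated per gram of xylose consumed. The function name and the
example concentrations are hypothetical.

```python
def percent_xylose_to_xylitol(xylose_start_g_l, xylose_end_g_l,
                              xylitol_end_g_l, xylitol_start_g_l=0.0):
    """Percent of consumed xylose recovered as xylitol (mass basis)."""
    consumed = xylose_start_g_l - xylose_end_g_l
    if consumed <= 0:
        raise ValueError("no xylose was consumed")
    produced = xylitol_end_g_l - xylitol_start_g_l
    return 100.0 * produced / consumed


if __name__ == "__main__":
    # Hypothetical example: 25 g/L xylose consumed, 2 g/L xylitol formed.
    print(f"{percent_xylose_to_xylitol(50.0, 25.0, 2.0):.1f}%")
```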
[0101] For flask assays, cells were grown in media for 2 to 3
days. Pellets were washed twice in media containing no sugar. Media
containing 20 g/L glucose and 50 g/L xylose was inoculated to an
OD600 of 0.05 with the washed cells. Samples were taken at various
time points and the amount of sugar remaining in the supernatant
was analyzed by HPLC. As shown in FIGS. 23A, 23B, 23C and 23D, up
to 50% more xylose was used and an 80% reduction in xylitol was
observed in strains over-expressing both a xylose isomerase and a
xylulose kinase when compared to WT.
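Comparisons such as "50% more xylose" or "an 80% reduction in
xylitol" express a strain's value as a percent change relative to
WT. A minimal sketch of that arithmetic follows; the function name
and the example values are hypothetical and are not the measured
data from the figures.

```python
def percent_change_vs_wt(strain_value, wt_value):
    """Percent change of a strain measurement relative to wild type.

    Positive values mean more than WT; negative values mean less.
    """
    if wt_value == 0:
        raise ValueError("wild-type value must be non-zero")
    return 100.0 * (strain_value - wt_value) / wt_value


if __name__ == "__main__":
    # Hypothetical g/L values for xylose used and xylitol produced.
    print(percent_change_vs_wt(15.0, 10.0))  # xylose used vs WT
    print(percent_change_vs_wt(1.0, 5.0))    # xylitol produced vs WT
```

Because the change is expressed relative to the WT value, a
reduction computed this way cannot exceed 100%.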
[0102] To further analyze these strains, the strains were grown in
parallel 5 L Sartorius fermenters. Initial media contained 20 g/L
Glucose and 50 g/L xylose along with other basal media components.
Both cultures were maintained at 28.degree. C. and 5.5 pH, with
constant mixing at 720 RPM and constant aeration at 1 Lpm of
environmental air. The cultures were fed glucose for 16 hrs
followed by an 8 hr starvation period. This cycle was completed 3
times. During starvation periods, 10 mL samples were taken every
0.5 hr. Glucose, xylose and xylitol concentrations were quantified
in these samples by HPLC. Larger 50 mL samples were taken
periodically for further biomass and oil content quantification.
Glucose feed rates matched glucose consumption rates, which were
quantified by CO.sub.2 detected in the culture exhaust gas. As
shown in FIG. 24, the 7-7 strain used up to 52% more xylose than WT
under these conditions.
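The feed and starvation cycle described above (16 hr of glucose
feeding, 8 hr of starvation, three cycles, with sampling every 0.5
hr during starvation) can be laid out as a sampling schedule. The
sketch below is illustrative: it assumes that the run begins with a
feed period at time zero and that the first sample of each
starvation period is taken one interval after feeding stops,
neither of which is stated above. The function name is hypothetical.

```python
def starvation_sample_times(feed_h=16.0, starve_h=8.0, cycles=3,
                            interval_h=0.5):
    """Sampling times (hours from start) during each starvation period.

    Assumes feeding starts at t = 0 and that sampling begins one
    interval after each feed period ends.
    """
    samples_per_period = int(round(starve_h / interval_h))
    times = []
    for cycle in range(cycles):
        starvation_start = cycle * (feed_h + starve_h) + feed_h
        for i in range(1, samples_per_period + 1):
            times.append(starvation_start + i * interval_h)
    return times


if __name__ == "__main__":
    times = starvation_sample_times()
    sample_volume_ml = 10  # per the 10 mL samples described above
    print(f"{len(times)} samples, first at {times[0]} h, "
          f"last at {times[-1]} h, "
          f"{len(times) * sample_volume_ml} mL withdrawn in total")
```

Under these assumptions the schedule also gives the cumulative
volume withdrawn from the vessel, which may be relevant when
interpreting concentrations in a 5 L culture.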
[0103] By Southern blot analysis, it was observed that strain
Iso-His #16 contains eight (8) insertions of the isomerase
transgene (FIG. 8). This unexpected multiple insertion resulted in
an increase in isomerase gene expression relative to strains
harbouring a single copy (FIG. 11) as well as increased isomerase
in vitro activity (FIG. 12). Strain Iso-His #16 demonstrated
greater xylose productivity than strains harbouring a single copy
of the isomerase transgene (FIG. 14).
[0104] Similarly, within the Iso-His+xylB transformants, one of the
clones (Iso-His+xylB 7-7) also had multiple insertions of the xylB
gene (FIG. 10), which resulted in increased in vivo activity of
both the xylose isomerase and xylulose kinase within the cell (FIG.
13). This clone was capable of using either as much or more xylose
than the parental strain, Iso-His #16, while producing
significantly less xylitol (FIG. 15). Furthermore, the Iso-His+xylB
7-7 strain produced more biomass than WT in the presence of xylose. These
two strains showed that, not only is the presence of both the
isomerase and the kinase genes important, but the number of
insertions is as well.
[0105] To further optimize the iso-his & xylB containing "7-7"
strain, this strain was transformed with a xylose transporter. FIG.
17 shows exemplary constructs for transformation. Examples of
xylose transporters to be used include, but are not limited to,
At5g17010 and At5g59250 (Arabidopsis thaliana), Gfx1 and GXS1
(Candida), AspTx (Aspergillus), and Sut1 (Pichia). Gxs1 (SEQ ID
NO:23) was selected for transformation. The results are shown in
FIGS. 19A, 19B, and 19C. The transformants 36-2, 36-9, and 36-16,
containing GXS1, use more xylose than 7-7 and WT strains. They also
use glucose more slowly than WT and 7-7 strains. The data demonstrate
both xylose and glucose being used in the earlier stages by the
GXS1 containing strains. Further, the percent of xylitol made by
the GXS1 containing strains is lower than both WT and 7-7
strains.
[0106] To further analyze the effect of sugar transporters on the
metabolism of xylose, codon optimized xylose transporters AspTX
from Aspergillus (An11g01100) and Gxs1 from Candida were introduced
in the 7-7 strain (isohis+xylB). FIG. 25 shows the alpha-tubulin
aspTx-neo and alpha-tubulin gxs1-neo constructs. T18 transformants
were created using standard biolistics protocols as described by
BioRad's Biolistic PDS-1000/He Particle Delivery System. Briefly,
0.6 µm gold particles were coated with 2.5 µg of linearized
plasmid DNA (EcoRI, 37 °C, o/n). The coated gold particles
were used to bombard WD plates previously spread with 1 ml of T18
cells at an OD600 of 1.0. The bombardment parameters included using
a Helium pressure of 1350 or 1100 psi with a target distance of 3
or 6 cm. After an overnight recovery, the cells were washed off the
plate and plated on media containing selection antibiotics (G418 at
2 mg/mL). Plates were incubated for 1 week at 25 °C to
identify resistant colonies. The resulting transformants were
screened by PCR and Southern blot (FIG. 26).
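Codon optimization changes the nucleotide sequence while leaving the encoded protein unchanged. The sketch below illustrates that check on the first 60 nucleotides of SEQ ID NO:5 (artificial construct) and SEQ ID NO:20 (E. coli) from the sequence listing. It assumes that SEQ ID NO:5 is a codon-optimized form of SEQ ID NO:20, which is inferred from the listing (both are 1455 nt) and is not stated in this section. The same check would apply to the codon-optimized transporter sequences. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def clean(dna):
    """Remove the spacing used in the sequence listing and lower-case."""
    return dna.replace(" ", "").lower()


def translate(dna):
    """Translate complete codons of a DNA string in reading frame 1."""
    dna = clean(dna)
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 60 nt of each sequence, copied from the sequence listing.
SEQ5_START = ("atgtacatcg gcatcgacct cggcacctcg "
              "ggcgtcaagg tcatcctcct caacgagcag")
SEQ20_START = ("atgtatatcg ggatagatct tggcacctcg "
               "ggcgtaaaag ttattttgct caacgagcag")

protein_5 = translate(SEQ5_START)
protein_20 = translate(SEQ20_START)
nt_differences = sum(a != b
                     for a, b in zip(clean(SEQ5_START), clean(SEQ20_START)))

print(protein_5 == protein_20, nt_differences)
```

The two fragments differ at several nucleotide positions yet translate to the same 20 amino acids, which is the property a codon-optimized gene is expected to have.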
[0107] Southern blots were performed using standard protocols.
Briefly, approximately 20 µg of genomic DNA was digested with
40 units of BamHI restriction enzyme in a total volume of 50 µL
o/n at 37 °C. 7.2 µg of each digested sample was run on
a 1.0% agarose gel at 50 V for approximately 1.5 hours, with a
digoxigenin (DIG) DNA molecular-weight marker II (Roche). DNA was
depurinated in the gel by submerging the gel in 250 mM HCl for 15
minutes. The gel was further denatured by incubation in a solution
containing 0.5 M NaOH and 1.5 M NaCl (pH 7.5) for two 15 minute
washes. The reaction was then neutralized by incubation in 0.5 M
Tris-HCl (pH 7.5) for two 15 minute washes. Finally, the gel was
equilibrated in 20× saline-sodium citrate (SSC) buffer for 15
minutes. DNA was transferred to a positively charged nylon membrane
(Roche) using a standard transfer apparatus. DNA was fixed to the
membrane using a UV Stratalinker at an exposure of 120,000 µJ.
Southern blot probe was generated using a PCR DIG Probe Synthesis
Kit (Roche) to generate a DIG-labelled probe according to the
manufacturer's instructions. The DNA affixed to the nylon membrane
was prehybridised with 20 mL of DIG EasyHyb solution (DIG EasyHyb
Granules, Roche). The DIG-labelled probe was denatured by adding 40
µL of the ble-probe reaction mixture to 300 µL of ddH2O
and incubating at 99 °C for 5 minutes. This solution was
then added to 20 mL of DIG hybridization solution to create the
probe solution. The probe solution was then added to the
DNA-affixed nylon membrane and incubated at 53 °C
overnight. The following day, the membrane was washed twice in
2× SSC, 0.1% SDS at RT. The membrane was further washed
twice in 0.1× SSC, 0.1% SDS at 68 °C for 15 minutes.
For detection, the membrane was washed and blocked using DIG Wash
and Block Buffer set (Roche) according to the manufacturer's
instructions. An anti-DIG-AP conjugated antibody from a DIG Nucleic
Acid Detection Kit (Roche) was used for detection. 2 µL of the
antibody solution was added to 20 mL detection solution and
incubated with the membrane at RT for 30 minutes. The blot was then
immersed in a washing buffer provided with the kit. CDP-Star
(Roche) was used for visualization. 10 µL of the CDP-star
solution was incubated on the membrane in 1 mL of detection
solution, which was covered in a layer of `sheet-protector` plastic
to hold the solution to the membrane. Signal was immediately
detected using a ChemiDoc imaging system.
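Several quantities in the Southern blot protocol follow from simple proportions. The sketch below works through them, assuming the digested DNA is evenly distributed in the 50 µL reaction and treating reagent dilutions as approximate 1:N ratios that ignore the small added stock volume. The function names are hypothetical.

```python
def load_volume_ul(dna_ug, digest_volume_ul, load_ug):
    """Volume of digest containing load_ug of DNA (before loading dye)."""
    concentration_ug_per_ul = dna_ug / digest_volume_ul
    return load_ug / concentration_ug_per_ul


def dilution_factor(stock_ul, diluent_ml):
    """Approximate N in a 1:N dilution, ignoring the stock volume itself."""
    return diluent_ml * 1000.0 / stock_ul


# Values taken from the protocol text.
gel_load_ul = load_volume_ul(dna_ug=20.0, digest_volume_ul=50.0, load_ug=7.2)
enzyme_units_per_ug = 40.0 / 20.0          # BamHI units per µg genomic DNA
antibody_dilution = dilution_factor(stock_ul=2.0, diluent_ml=20.0)
cdp_star_dilution = dilution_factor(stock_ul=10.0, diluent_ml=1.0)

print(f"load {gel_load_ul:.1f} uL per lane")
print(f"{enzyme_units_per_ug:.1f} U BamHI per ug DNA")
print(f"antibody about 1:{antibody_dilution:.0f}, "
      f"CDP-Star about 1:{cdp_star_dilution:.0f}")
```

Because the DNA amount is given as approximate, the computed lane volume should be read as a nominal value rather than an exact pipetting instruction.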
[0108] For flask assays, cells were grown in media for 2 to 3
days. Pellets were washed twice in Media 2 (9 g/L NaCl, 4 g/L
MgSO4, 100 mg/L CaCl2, 5 mg/L FeCl3, 20 g/L (NH4)2SO4, 0.86 g/L
KH2PO4, 150 µg/L vitamin B12, 30 µg/L biotin, 6 mg/L thiamine
hydrochloride, 1.5 mg/L cobalt (II) chloride, 3 mg/L manganese
chloride) containing no sugar. Then, Media 2 containing 20 g/L
Glucose and 20 g/L Xylose was inoculated to an OD600 of 0.05 with
the washed cells. As shown in FIGS. 27A, 27B, 27C and 27D, the
expression of the xylose isomerase, xylulose kinase, and either
xylose transporter resulted in up to 71% more xylose used and 40%
less xylitol produced than the parental strain 7-7.
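The Media 2 recipe mixes g/L, mg/L and µg/L units, so scaling it to a working volume is easier once everything is expressed in one unit. The sketch below converts each listed component to mg/L and scales it. It assumes the listed concentrations refer to the compounds exactly as named (hydrate forms are not specified in the text), and the dictionary and function names are hypothetical.

```python
# Media 2 components from the text, all converted to mg per litre.
MEDIA_2_MG_PER_L = {
    "NaCl": 9000.0,
    "MgSO4": 4000.0,
    "CaCl2": 100.0,
    "FeCl3": 5.0,
    "(NH4)2SO4": 20000.0,
    "KH2PO4": 860.0,
    "vitamin B12": 0.150,
    "biotin": 0.030,
    "thiamine hydrochloride": 6.0,
    "cobalt (II) chloride": 1.5,
    "manganese chloride": 3.0,
}

# Sugars added to the assay medium (20 g/L each), absent from the wash.
SUGARS_MG_PER_L = {"glucose": 20000.0, "xylose": 20000.0}


def batch_masses_mg(volume_l, with_sugars=True):
    """Mass of each component (mg) needed for volume_l litres of medium."""
    recipe = dict(MEDIA_2_MG_PER_L)
    if with_sugars:
        recipe.update(SUGARS_MG_PER_L)
    return {name: conc * volume_l for name, conc in recipe.items()}


assay_batch = batch_masses_mg(0.5)                     # assay medium
wash_batch = batch_masses_mg(0.5, with_sugars=False)   # sugar-free wash

for name, mass in assay_batch.items():
    print(f"{name}: {mass:g} mg")
```

Keeping the sugar-free wash and the assay medium in one function makes it harder to prepare the two with inconsistent base salts.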
[0109] For flask assays, cells were grown in media for 2 to 3
days. Pellets were washed twice in saline. Then, media containing
60 g/L xylose instead of glucose was inoculated to an OD600 of 0.05
with the washed cells. Samples were taken at various time points
and the amount of sugar remaining in the supernatant was analyzed
by HPLC. FIG. 28 shows that T18 growth in media containing xylose as the
main carbon source requires over-expression of both an isomerase
and a kinase. The expression of the transporters in this background
did not significantly increase xylose usage in this media.
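Inoculating to a target OD600 of 0.05 is a standard C1V1 = C2V2 dilution. The sketch below assumes that OD600 scales linearly with cell density over the range used and that the final volume includes the inoculum. The starter OD600 and flask volume are hypothetical values chosen for illustration.

```python
def inoculum_volume_ml(starter_od600, target_od600, final_volume_ml):
    """Volume of washed cell suspension needed to reach target_od600."""
    if starter_od600 <= target_od600:
        raise ValueError("starter culture is not denser than the target")
    # C1 * V1 = C2 * V2, solved for V1.
    return target_od600 * final_volume_ml / starter_od600


# Hypothetical washed-cell suspension at OD600 5.0, 50 mL flask culture.
volume_ml = inoculum_volume_ml(starter_od600=5.0,
                               target_od600=0.05,
                               final_volume_ml=50.0)
print(f"add {volume_ml:.2f} mL of washed cells to reach OD600 0.05")
```

Dense suspensions generally read outside the linear range of a spectrophotometer, so the starter OD600 would normally be measured on a diluted aliquot and multiplied back.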
[0110] Enhanced xylose usage by T18 7-7 and transporter strains was
observed in media containing carbon from alternative feed stocks.
For flask assays, cells were grown in media for 2 to 3 days.
Pellets were washed twice in 0.9% saline solution. Media 2
containing 20 g/L glucose:50 g/L xylose as a combination of lab
grade glucose and glucose and xylose from an alternative feedstock
from forestry, was inoculated to an OD600nm of 0.05 with the washed
cells. Samples were taken at various time points and the amount of
sugar remaining in the supernatant was analyzed by HPLC. As shown
in FIGS. 29A and 29B, in media containing sugars from an
alternative feedstock, the T18 7-7 strains encoding
transporters used more xylose than wild-type or T18 7-7.
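Reaching 20 g/L glucose and 50 g/L xylose from a mixed feedstock plus lab-grade glucose is a two-step mass balance. The sketch below assumes that all xylose comes from the feedstock and that lab-grade glucose makes up the remaining glucose. That reading of the text, the feedstock sugar concentrations, and the function name are assumptions for illustration only. Volume changes from dissolving solids are ignored.

```python
def blend_for_targets(feed_glucose_g_l, feed_xylose_g_l,
                      target_glucose_g_l=20.0, target_xylose_g_l=50.0,
                      final_volume_l=1.0):
    """Return (feedstock volume in L, lab-grade glucose in g)."""
    # Step 1: feedstock volume that supplies all of the xylose.
    feed_volume_l = target_xylose_g_l * final_volume_l / feed_xylose_g_l
    if feed_volume_l > final_volume_l:
        raise ValueError("feedstock too dilute to reach the xylose target")
    # Step 2: glucose already supplied by that feedstock volume.
    glucose_from_feed_g = feed_glucose_g_l * feed_volume_l
    lab_glucose_g = target_glucose_g_l * final_volume_l - glucose_from_feed_g
    if lab_glucose_g < 0:
        raise ValueError("feedstock alone exceeds the glucose target")
    return feed_volume_l, lab_glucose_g


# Hypothetical forestry hydrolysate: 10 g/L glucose, 100 g/L xylose.
feed_l, lab_glucose_g = blend_for_targets(feed_glucose_g_l=10.0,
                                          feed_xylose_g_l=100.0)
print(f"use {feed_l} L feedstock and {lab_glucose_g} g lab-grade glucose "
      f"per litre of medium")
```

The two error checks mark the cases where this simple scheme cannot work: a feedstock too dilute in xylose, or one that already carries more glucose than the target.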
Sequence CWU 1
1
3011723DNAThraustochytrium sp. 1gtagtcatac gctcgtctca aagattaagc
catgcatgtg taagtataag cgattatact 60gtgagactgc gaacggctca ttatatcagt
tatgatttct tcggtatttt ctttatatgg 120atacctgcag taattctgga
attaatacat gctgagaggg cccgactgtt cgggagggcc 180gcacttatta
gagttgaagc caagtaagat ggtgagtcat gataattgag cagatcgctt
240gtttggagcg atgaatcgtt tgagtttctg ccccatcagt tgtcgacggt
agtgtattgg 300actacggtga ctataacggg tgacggggag ttagggctcg
actccggaga gggagcctga 360gagacggcta ccacatccaa ggaaggcagc
aggcgcgtaa attacccaat gtggactcca 420cgaggtagtg acgagaaata
tcaatgcggg gcgcttcgcg tcttgctatt ggaatgagag 480caatgtaaaa
ccctcatcga ggatcaactg gagggcaagt ctggtgccag cagccgcggt
540aattccagct ccagaagcgt atgctaaagt tgttgcagtt aaaaagctcg
tagttgaatt 600tctggggcgg gagccccggt ctttgcgcga ctgcgctctg
tttgccgagc ggctcctctg 660ccatcctcgc ctcttttttt agtggcgtcg
ttcactgtaa ttaaagcaga gtgttccaag 720caggtcgtat gacctggatg
tttattatgg gatgatcaga tagggctcgg gtgctatttt 780gttggtttgc
acatctgagt aatgatgaat aggaacagtt gggggtattc gtatttagga
840gctagaggtg aaattcttgg atttccgaaa gacgaactac agcgaaggca
tttaccaagc 900atgttttcat taatcaagaa cgaaagtctg gggatcgaag
atgattagat accatcgtag 960tctagaccgt aaacgatgcc gacttgcgat
tgcggggtgt ttgtattgga ccctcgcagc 1020agcacatgag aaatcaaagt
ctttgggttc cggggggagt atggtcgcaa ggctgaaact 1080taaaggaatt
gacggaaggg caccaccagg agtggagcct gcggcttaat ttgactcaac
1140acgggaaaac ttaccaggtc cagacatagg taggattgac agattgagag
ctctttcttg 1200attctatggg tggtggtgca tggccgttct tagttggtgg
agtgatttgt ctggttaatt 1260ccgttaacga acgagacctc ggcctactaa
atagcggtgg gtatggcgac atacttgcgt 1320acgcttctta gagggacatg
ttcggtatac gagcaggaag ttcgaggcaa taacaggtct 1380gtgatgccct
tagatgttct gggccgcacg cgcgctacac tgatgggttc aacgggtggt
1440catcgttgtt cgcagcgagg tgctttgccg gaaggcatgg caaatccttt
caacgcccat 1500cgtgctgggg ctagattttt gcaattatta atctccaacg
aggaattcct agtaaacgca 1560agtcatcagc ttgcattgaa tacgtccctg
ccctttgtac acaccgcccg tcgcacctac 1620cgattgaacg gtccgatgaa
accatgggat gaccttttga gcgtttgttc gcgagggggg 1680tcagaactcg
ggtgaatctt attgtttaga ggaaggtgaa gtc 172321320DNAThraustochytrium
sp. 2atggagttct tccccgaggt ggccaaggtg gagtacgccg gccccgagag
ccgcgacgtc 60ctggcgtata gatggtacaa caaggaagag gtagtgatgg ggaagaaaat
gaaggagtgg 120ctgaggttct cggtgtgctt ttggcatacc tttcgcggaa
acgggtcgga cccctttggc 180aagcccacca tcacgcaccg cttcgcaggc
gacgatggtt cggacaccat ggagaacgcc 240ctccggcgcg ttgaggcggc
ctttgagctc tttgtcaagc tcggcgtgga gttctactcc 300tttcacgacg
tcgatgtggc gcctgagggc aagacgctca aggagacaaa cgagaacctg
360gacaagatca cggaccgcat gctcgagctg caacaggaga cgggcgtcaa
gctgctctgg 420ggcactgcca acttgttctc tcatccgcga tacatgaacg
gcgggtcaac aaacccggat 480cccaaggtct ttgtgcgcgc cgccgcgcag
gtgaaaaagg ccatcgacgt gacccacaaa 540ctcggtggcg aaggctttgt
gttctggggc ggtcgggagg gttacatgca cattctcaac 600acggatatgg
tccgtgaaat gaatcattac gcgaaaatgc tcaagatggc catcgcctac
660aagaaaaaga tcggcttcgg cgggcagatc ctggtcgaac ccaagccccg
cgagcccatg 720aagcaccagt atgactacga cgtgcagacc gtcattggct
ttctcagaca gcacggcctg 780gaaaacgagg tcagcctcaa cgtggagccc
aatcacacgc agctcgccgg gcacgagttt 840gagcacgatg tcgtcctcgc
cgcgcagctc ggcatgctcg gcagcgtcga cgccaacacg 900ggctccgaga
gcctcgggtg ggacacggac gagttcatca ccgaccaaac gcgcgccact
960gtgctttgca aggccatcat tgagatgggt ggtttcgttc agggcggtct
caactttgac 1020gccaaggtcc gtcgggagag caccgacccg gaggacctct
ttatcgctca tgtcgcctcg 1080attgacgcgc tcgccaaggg tctgcgcaac
gcttcgcagc tcgtttctga cggccgcatg 1140cgcaaaatgc tccaggaccg
gtacgccggc tgggatgagg gcatcggaca aaagattgag 1200attggggaaa
cctcgcttga ggacctcgag gcccactgcc tgcaggacga cacggaacca
1260gtcaagacgt cggccaagca ggagaaattc cttgccgttc tcaaccacta
catttcctaa 1320311PRTArtificial SequenceSynthetic Construct 3Met
His His His His His His Gly Ser Met Ser1 5 10433DNAArtificial
SequenceSynthetic Construct 4atgcaccacc accaccacca cggttccatg tcg
3351455DNAArtificial SequenceArtificial Construct 5atgtacatcg
gcatcgacct cggcacctcg ggcgtcaagg tcatcctcct caacgagcag 60ggcgaggtcg
tcgccgccca gaccgagaag ctcaccgtct cgcgcccgca cccgctctgg
120tcggagcagg acccggagca gtggtggcag gccaccgacc gcgccatgaa
ggccctcggc 180gaccagcact cgctccagga cgtcaaggcc ctcggcatcg
ccggccagat gcacggcgcc 240accctcctcg acgcccagca gcgcgtcctc
cgcccggcca tcctctggaa cgacggccgc 300tgcgcccagg agtgcaccct
cctcgaggcc cgcgtcccgc agtcgcgcgt catcaccggc 360aacctcatga
tgccgggctt caccgccccg aagctcctct gggtccagcg ccacgagccg
420gagatcttcc gccagatcga caaggtcctc ctcccgaagg actacctccg
cctccgcatg 480accggcgagt tcgcctcgga catgtcggac gccgccggca
ccatgtggct cgacgtcgcc 540aagcgcgact ggtcggacgt catgctccag
gcctgcgacc tctcgcgcga ccagatgccg 600gccctctacg agggctcgga
gatcaccggc gccctcctcc cggaggtcgc caaggcctgg 660ggcatggcca
ccgtcccggt cgtcgccggc ggcggcgaca acgccgccgg cgccgtcggc
720gtcggcatgg tcgacgccaa ccaggccatg ctctcgctcg gcacctcggg
cgtctacttc 780gccgtctcgg agggcttcct ctcgaagccg gagtcggccg
tccactcgtt ctgccacgcc 840ctcccgcagc gctggcacct catgtcggtc
atgctctcgg ccgcctcgtg cctcgactgg 900gccgccaagc tcaccggcct
ctcgaacgtc ccggccctca tcgccgccgc ccagcaggcc 960gacgagtcgg
ccgagccggt ctggttcctc ccgtacctct cgggcgagcg caccccgcac
1020aacaacccgc aggccaaggg cgtcttcttc ggcctcaccc accagcacgg
cccgaacgag 1080ctcgcccgcg ccgtcctcga gggcgtcggc tacgccctcg
ccgacggcat ggacgtcgtc 1140cacgcctgcg gcatcaagcc gcagtcggtc
accctcatcg gcggcggcgc ccgctcggag 1200tactggcgcc agatgctcgc
cgacatctcg ggccagcagc tcgactaccg caccggcggc 1260gacgtcggcc
cggccctcgg cgccgcccgc ctcgcccaga tcgccgccaa cccggagaag
1320tcgctcatcg agctcctccc gcagctcccg ctcgagcagt cgcacctccc
ggacgcccag 1380cgctacgccg cctaccagcc gcgccgcgag accttccgcc
gcctctacca gcagctcctc 1440ccgctcatgg cctaa 1455622PRTArtificial
SequenceSynthetic Construct 6Gly Ser Gly Ala Thr Asn Phe Ser Leu
Leu Lys Gln Ala Gly Asp Val1 5 10 15Glu Glu Asn Pro Gly Pro
20720DNAArtificial SequenceSynthetic Construct 7gacgacgtga
ccctgttcat 20819DNAArtificial SequenceSynthetic Construct
8tcccggaagt tcgtggaca 19920DNAArtificial SequenceSynthetic
Construct 9tgagattggg gaaacctcgc 201022DNAArtificial
SequenceSynthetic Construct 10tcttgactgg ttggccgtgt cg
221120DNAArtificial SequenceSynthetic Construct 11cgtcctgcgc
attgatcttg 201220DNAArtificial SequenceSynthetic Construct
12ggcgagcttc tccttgatgt 201320DNAArtificial SequenceSynthetic
Construct 13ggcgtcaacc acaaggagta 201420DNAArtificial
SequenceSynthetic Construct 14tgtcgttgat gaccttggca
20151669DNAPiromyces sp. 15gtaaatggct aaggaatatt tcccacaaat
tcaaaagatt aagttcgaag gtaaggattc 60taagaatcca ttagccttcc actactacga
tgctgaaaag gaagtcatgg gtaagaaaat 120gaaggattgg ttacgtttcg
ccatggcctg gtggcacact ctttgcgccg aaggtgctga 180ccaattcggt
ggaggtacaa agtctttccc atggaacgaa ggtactgatg ctattgaaat
240tgccaagcaa aaggttgatg ctggtttcga aatcatgcaa aagcttggta
ttccatacta 300ctgtttccac gatgttgatc ttgtttccga aggtaactct
attgaagaat acgaatccaa 360ccttaaggct gtcgttgctt acctcaagga
aaagcaaaag gaaaccggta ttaagcttct 420ctggagtact gctaacgtct
tcggtcacaa gcgttacatg aacggtgcct ccactaaccc 480agactttgat
gttgtcgccc gtgctattgt tcaaattaag aacgccatag acgccggtat
540tgaacttggt gctgaaaact acgtcttctg gggtggtcgt gaaggttaca
tgagtctcct 600taacactgac caaaagcgtg aaaaggaaca catggccact
atgcttacca tggctcgtga 660ctacgctcgt tccaagggat tcaagggtac
tttcctcatt gaaccaaagc caatggaacc 720aaccaagcac caatacgatg
ttgacactga aaccgctatt ggtttcctta aggcccacaa 780cttagacaag
gacttcaagg tcaacattga agttaaccac gctactcttg ctggtcacac
840tttcgaacac gaacttgcct gtgctgttga tgctggtatg ctcggttcca
ttgatgctaa 900ccgtggtgac taccaaaacg gttgggatac tgatcaattc
ccaattgatc aatacgaact 960cgtccaagct tggatggaaa tcatccgtgg
tggtggtttc gttactggtg gtaccaactt 1020cgatgccaag actcgtcgta
actctactga cctcgaagac atcatcattg cccacgtttc 1080tggtatggat
gctatggctc gtgctcttga aaacgctgcc aagctcctcc aagaatctcc
1140atacaccaag atgaagaagg aacgttacgc ttccttcgac agtggtattg
gtaaggactt 1200tgaagatggt aagctcaccc tcgaacaagt ttacgaatac
ggtaagaaga acggtgaacc 1260aaagcaaact tctggtaagc aagaactcta
cgaagctatt gttgccatgt accaataagt 1320taatcgtagt taaattggta
aaataattgt aaaatcaata aacttgtcaa tcctccaatc 1380aagtttaaaa
gatcctatct ctgtactaat taaatatagt acaaaaaaaa atgtataaac
1440aaaaaaaagt ctaaaagacg gaagaattta atttagggaa aaaataaaaa
taataataaa 1500caatagataa atcctttata ttaggaaaat gtcccattgt
attattttca tttctactaa 1560aaaagaaagt aaataaaaca caagaggaaa
ttttcccttt tttttttttt tgtaataaat 1620tttatgcaaa tataaatata
aataaaataa taaaaaaaaa aaaaaaaaa 166916395PRTStreptomyces lividans
16Met Asn Tyr Gln Pro Thr Ser Glu Asp Arg Phe Thr Phe Gly Leu Trp1
5 10 15Thr Val Gly Trp Gln Gly Leu Asp Pro Phe Gly Asp Ala Thr Arg
Glu 20 25 30Ala Leu Asp Pro Ala Glu Ser Val Arg Arg Leu Ser Gln Leu
Gly Ala 35 40 45Tyr Gly Val Thr Phe His Asp Asp Glu Leu Ile Pro Phe
Gly Ser Ser 50 55 60Asp Asn Glu Arg Gly Val Ala His Gly Ala Gly Val
Ala His Gln Ala65 70 75 80Val Pro Ala Gly Ala Gly Arg Asp Arg His
Glu Gly Ala Asp Gly Asp 85 90 95Asp Glu Pro Val His Ala Pro Gly Cys
Ser Arg Asp Gly Ala Phe Thr 100 105 110Ala Asn Asp Arg Asp Val Arg
Gly Thr Arg Cys Ala Arg Ala Ile Arg 115 120 125Asn Ile Asp Leu Ala
Val Glu His Val Ala Arg Ala Ser Thr Cys Ala 130 135 140Trp Gly Gly
Arg Glu Gly Ala Glu Ser Gly Ala Ala Lys Asp Val Arg145 150 155
160Asp Ala Leu Asp Arg Met Lys Glu Ala Phe Asp Leu Leu Gly Glu Tyr
165 170 175Val Thr Glu Gln Gly Tyr Asp Leu Lys Phe Ala Ile Glu Pro
Lys Pro 180 185 190Asn Glu Pro Arg Gly Asp Ile Leu Leu Pro Thr Val
Gly His Ala Leu 195 200 205Ala Phe Ile Glu Arg Leu Glu Arg Pro Glu
Leu Tyr Gly Val Asn Pro 210 215 220Glu Val Gly His Glu Gln Met Ala
Gly Leu Asn Phe Pro His Gly Ile225 230 235 240Ala Gln Ala Leu Trp
Ala Gly Lys Leu Phe His Ile Asp Leu Asn Gly 245 250 255Gln Ser Gly
Ile Lys Tyr Asp Gln Asp Leu Arg Phe Gly Ala Gly Asp 260 265 270Leu
Arg Ala Ala Phe Trp Leu Val Asp Leu Leu Glu Arg Ala Gly Tyr 275 280
285Ala Gly Pro Arg His Phe Asp Phe Lys Pro Pro Arg Thr Glu Asn Phe
290 295 300Asp Ala Val Trp Pro Ser Ala Ala Gly Cys Met Arg Asn Tyr
Leu Ile305 310 315 320Leu Lys Asp Arg Ala Ala Ala Phe Arg Ala Asp
Pro Gln Val Gln Glu 325 330 335Ala Leu Ala Ala Ala Arg Leu Asp Glu
Leu Ala Arg Pro Thr Ala Glu 340 345 350Asp Gly Leu Ala Ala Leu Leu
Ala Asp Arg Ser Ala Tyr Asp Thr Phe 355 360 365Asp Val Asp Ala Ala
Ala Ala Arg Gly Met Ala Phe Glu His Leu Asp 370 375 380Gln Leu Ala
Met Asp His Leu Leu Gly Ala Arg385 390 395172040DNAPiromyces sp.
17attatataaa ataactttaa ataaaacaat ttttatttgt ttatttaatt attcaaaaaa
60aattaaagta aaagaaaaat aatacagtag aacaatagta ataatatcaa aatgaagact
120gttgctggta ttgatcttgg aactcaaagt atgaaagtcg ttatttacga
ctatgaaaag 180aaagaaatta ttgaaagtgc tagctgtcca atggaattga
tttccgaaag tgacggtacc 240cgtgaacaaa ccactgaatg gtttgacaag
ggtcttgaag tttgttttgg taagcttagt 300gctgataaca aaaagactat
tgaagctatt ggtatttctg gtcaattaca cggttttgtt 360cctcttgatg
ctaacggtaa ggctttatac aacatcaaac tttggtgtga tactgctacc
420gttgaagaat gtaagattat cactgatgct gccggtggtg acaaggctgt
tattgatgcc 480cttggtaacc ttatgctcac cggtttcacc gctccaaaga
tcctctggct caagcgcaac 540aagccagaag ctttcgctaa cttaaagtac
attatgcttc cacacgatta cttaaactgg 600aagcttactg gtgattacgt
tatggaatac ggtgatgcct ctggtaccgc tctcttcgat 660tctaagaacc
gttgctggtc taagaagatt tgcgatatca ttgacccaaa acttttagat
720ttacttccaa agttaattga accaagcgct ccagctggta aggttaatga
tgaagccgct 780aaggcttacg gtattccagc cggtattcca gtttccgctg
gtggtggtga taacatgatg 840ggtgctgttg gtactggtac tgttgctgat
ggtttcctta ccatgtctat gggtacttct 900ggtactcttt acggttacag
tgacaagcca attagtgacc cagctaatgg tttaagtggt 960ttctgttctt
ctactggtgg atggcttcca ttactttgta ctatgaactg tactgttgcc
1020actgaattcg ttcgtaacct cttccaaatg gatattaagg aacttaatgt
tgaagctgcc 1080aagtctccat gtggtagtga aggtgtttta gttattccat
tcttcaatgg tgaaagaact 1140ccaaacttac caaacggtcg tgctagtatt
actggtctta cttctgctaa caccagccgt 1200gctaacattg ctcgtgctag
tttcgaatcc gccgttttcg ctatgcgtgg tggtttagat 1260gctttccgta
agttaggttt ccaaccaaag gaaattcgtc ttattggtgg tggttctaag
1320ctgatctctg gagacaaatt gccgctgata tcatgaacct tccaatcaga
gttccacttt 1380tagaagaagc tgctgctctt ggtggtgctg ttcaagcttt
atggtgtctt aagaaccaat 1440ctggtaagtg tgatattgtt gaactttgca
aagaacacat taagattgat gaatctaaga 1500atgctaaccc aattgccgaa
aatgttgctg tttacgacaa ggcttacgat gaatactgca 1560aggttgtaaa
tactctttct ccattatatg cttaaattgc caatgtaaaa aaaaatataa
1620tgccatataa ttgccttgtc aatacactgt tcatgttcat ataatcatag
gacattgaat 1680ttacaaggtt tatacaatta atatctatta tcatattatt
atacagcatt tcattttcta 1740agattagacg aaacaattct tggttccttg
caatatacaa aatttacatg aatttttaga 1800atagtctcgt atttatgccc
aataatcagg aaaattacct aatgctggat tcttgttaat 1860aaaaacaaaa
taaataaatt aaataaacaa ataaaaatta taagtaaata taaatatata
1920agtaatataa aaaaaaagta aataaataaa taaataaata aaaatttttt
gcaaatatat 1980aaataaataa ataaaatata aaaataattt agcaaataaa
ttaaaaaaaa aaaaaaaaaa 2040181803DNASaccharomyces sp. 18atgttgtgtt
cagtaattca gagacagaca agagaggttt ccaacacaat gtctttagac 60tcatactatc
ttgggtttga tctttcgacc caacaactga aatgtctcgc cattaaccag
120gacctaaaaa ttgtccattc agaaacagtg gaatttgaaa aggatcttcc
gcattatcac 180acaaagaagg gtgtctatat acacggcgac actatcgaat
gtcccgtagc catgtggtta 240gaggctctag atctggttct ctcgaaatat
cgcgaggcta aatttccatt gaacaaagtt 300atggccgtct cagggtcctg
ccagcagcac gggtctgtct actggtcctc ccaagccgaa 360tctctgttag
agcaattgaa taagaaaccg gaaaaagatt tattgcacta cgtgagctct
420gtagcatttg caaggcaaac cgcccccaat tggcaagacc acagtactgc
aaagcaatgt 480caagagtttg aagagtgcat aggtgggcct gaaaaaatgg
ctcaattaac agggtccaga 540gcccatttta gatttactgg tcctcaaatt
ctgaaaattg cacaattaga accagaagct 600tacgaaaaaa caaagaccat
ttctttagtg tctaattttt tgacttctat cttagtgggc 660catcttgttg
aattagagga ggcagatgcc tgtggtatga acctttatga tatacgtgaa
720agaaaattca gtgatgagct actacatcta attgatagtt cttctaagga
taaaactatc 780agacaaaaat taatgagagc acccatgaaa aatttgatag
cgggtaccat ctgtaaatat 840tttattgaga agtacggttt caatacaaac
tgcaaggtct ctcccatgac tggggataat 900ttagccacta tatgttcttt
acccctgcgg aagaatgacg ttctcgtttc cctaggaaca 960agtactacag
ttcttctggt caccgataag tatcacccct ctccgaacta tcatcttttc
1020attcatccaa ctctgccaaa ccattatatg ggtatgattt gttattgtaa
tggttctttg 1080gcaagggaga ggataagaga cgagttaaac aaagaacggg
aaaataatta tgagaagact 1140aacgattgga ctctttttaa tcaagctgtg
ctagatgact cagaaagtag tgaaaatgaa 1200ttaggtgtat attttcctct
gggggagatc gttcctagcg taaaagccat aaacaaaagg 1260gttatcttca
atccaaaaac gggtatgatt gaaagagagg tggccaagtt caaagacaag
1320aggcacgatg ccaaaaatat tgtagaatca caggctttaa gttgcagggt
aagaatatct 1380cccctgcttt cggattcaaa cgcaagctca caacagagac
tgaacgaaga tacaatcgtg 1440aagtttgatt acgatgaatc tccgctgcgg
gactacctaa ataaaaggcc agaaaggact 1500ttttttgtag gtggggcttc
taaaaacgat gctattgtga agaagtttgc tcaagtcatt 1560ggtgctacaa
agggtaattt taggctagaa acaccaaact catgtgccct tggtggttgt
1620tataaggcca tgtggtcatt gttatatgac tctaataaaa ttgcagttcc
ttttgataaa 1680tttctgaatg acaattttcc atggcatgta atggaaagca
tatccgatgt ggataatgaa 1740aattgggatc gctataattc caagattgtc
cccttaagcg aactggaaaa gactctcatc 1800taa 1803192942DNAPichia
sp.misc_feature(4)..(4)n is a, c, g, or tmisc_feature(38)..(38)n is
a, c, g, or tmisc_feature(85)..(85)n is a, c, g, or
tmisc_feature(88)..(88)n is a, c, g, or tmisc_feature(146)..(146)n
is a, c, g, or tmisc_feature(174)..(174)n is a, c, g, or t
19ttanacagtt ttccagaatc caaattttcc aaccaacnaa aaacggaccc agaaagttac
60agatttttca gagcttcatc ttttntanga tttcacagct tcatcaattt cagaccatag
120ccataatgac ttttgtagag tttccnatca ctattcccaa ccagcagcgt
gtgnaaactg 180ccatcaccta tagtgcctac ttttcggttt tcaccagtgt
ggtttttggc ctagttacaa 240attcgctaga gaatgttgtg tatgcttttg
gagcgcagac tgccatcacg ttagtgttga 300ctgcattcaa ctggccgtgg
ttcacgagtg ctcccggtat cgaatggctc ccggtagaat 360tttaggatcg
tatggtgact tggcgattta actgggtagc acaagggaat tttcaggaaa
420ttttctggtt ggacattttg ggcggctgaa ctttcatggt taaaaggact
aaggccagat 480tctcgggggg agaaaaattt ctgttagttt ggaattttcc
gagccccaca cattgcgatg 540gtagattcgg tacgaaacta tataaacggt
tggattccta gaaagggcca gatcagattg 600tagstagtat atatagcata
tagatccctg gaggataccc acagacatta ctgctactaa 660ttcataccat
acttgacgta tatctgcgca tacatatcta ccccaacttt catataaaat
720tcctagattt attgcatctt ctaatagagt catttttcag atttttcaat
ttccatagaa 780agcatacatt ttcatacagc ttctatttgt taatcgacct
gataatttta ctagccatat 840ttcttttttt gatttttcac
ttaatcgaca tataaatact cacgtagttg acactcacaa 900tgaccactac
cccatttgat gctccagata agctcttcct cgggttcgat ctttcgactc
960agcagttgaa gatcatcgtc accgatgaaa acctcgctgc tctcaaaacc
tacaatgtcg 1020agttcgatag catcaacagc tctgtccaga agggtgtcat
tgctatcaac gacgaaatca 1080gcaagggtgc cattatttcc cccgtttaca
tgtggttgga tgcccttgac catgtttttg 1140aagacatgaa gaaggacgga
ttccccttca acaaggttgt tggtatttcc ggttcttgtc 1200aacagcacgg
ttcggtatac tggtctagaa cggccgagaa ggtcttgtcc gaattggacg
1260ctgaatcttc gttatcgagc cagatgagat ctgctttcac cttcaagcac
gctccaaact 1320ggcaggatca ctctaccggt aaagagcttg aagagttcga
aagagtgatt ggtgctgatg 1380ccttggctga tatctctggt tccagagccc
attacagatt cacagggctc cagattagaa 1440agttgtctac cagattcaag
cccgaaaagt acaacagaac tgctcgtatc tctttagttt 1500cgtcatttgt
tgccagtgtg ttgcttggta gaatcacctc cattgaagaa gccgatgctt
1560gtggaatgaa cttgtacgat atcgaaaagc gcgagttcaa cgaagagctc
ttggccatcg 1620ctgctggtgt ccaccctgag ttggatggtg tagaacaaga
cggtgaaatt tacagagctg 1680gtatcaatga gttgaagaga aagttgggtc
ctgtcaaacc tataacatac gaaagcgaag 1740gtgacattgc ctcttacttt
gtcaccagat acggcttcaa ccccgactgt aaaatctact 1800cgttcaccgg
agacaatttg gccacgatta tctcgttgcc tttggctcca aatgatgctt
1860tgatctcatt gggtacttct actacagttt taattatcac caagaactac
gctccttctt 1920ctcaatacca tttgtttaaa catccaacca tgcctgacca
ctacatgggc atgatctgct 1980actgtaacgg ttccttggcc agagaaaagg
ttagagacga agtcaacgaa aagttcaatg 2040tagaagacaa gaagtcgtgg
gacaagttca atgaaatctt ggacaaatcc acagacttca 2100acaacaagtt
gggtatttac ttcccacttg gcgaaattgt ccctaatgcc gctgctcaga
2160tcaagagatc ggtgttgaac agcaagaacg aaattgtaga cgttgagttg
ggcgacaaga 2220actggcaacc tgaagatgat gtttcttcaa ttgtagaatc
acagactttg tcttgtagat 2280tgagaactgg tccaatgttg agcaagagtg
gagattcttc tgcttccagc tctgcctcac 2340ctcaaccaga aggtgatggt
acagatttgc acaaggtcta ccaagacttg gttaaaaagt 2400ttggtgactt
gttcactgat ggaaagaagc aaacctttga gtctttgacc gccagaccta
2460accgttgtta ctacgtcggt ggtgcttcca acaacggcag cattatccsc
aagatgggtt 2520ccatcttggc tcccgtcaac ggaaactaca aggttgacat
tcctaacgcc tgtgcattgg 2580gtggtgctta caaggccagt tggagttacg
agtgtgaagc caagaaggaa tggatcggat 2640acgatcagta tatcaacaga
ttgtttgaag taagtgacga gatgaatctg ttcgaagtca 2700aggataaatg
gctcgaatat gccaacgggg ttggaatgtt ggccaagatg gaaagtgaat
2760tgaaacacta aaatccataa tagcttgtat agaggtatag aaaaagagaa
cgttatagag 2820taaagacaat gtagcatata tgtgcgaata tcacgataga
cgttatacag aagattactt 2880tcacatcatt ttgaaaatat cttgatatgt
tcatatttca ttcgcctcta gcatttttca 2940ga 2942201455DNAE. coli
20atgtatatcg ggatagatct tggcacctcg ggcgtaaaag ttattttgct caacgagcag
60ggtgaggtgg ttgctgcgca aacggaaaag ctgaccgttt cgcgcccgca tccactctgg
120tcggaacaag acccggaaca gtggtggcag gcaactgatc gcgcaatgaa
agctctgggc 180gatcagcatt ctctgcagga cgttaaagca ttgggtattg
ccggccagat gcacggagca 240accttgctgg atgctcagca acgggtgtta
cgccctgcca ttttgtggaa cgacgggcgc 300tgtgcgcaag agtgcacttt
gctggaagcg cgagttccgc aatcgcgggt gattaccggc 360aacctgatga
tgcccggatt tactgcgcct aaattgctat gggttcagcg gcatgagccg
420gagatattcc gtcaaatcga caaagtatta ttaccgaaag attacttgcg
tctgcgtatg 480acgggggagt ttgccagcga tatgtctgac gcagctggca
ccatgtggct ggatgtcgca 540aagcgtgact ggagtgacgt catgctgcag
gcttgcgact tatctcgtga ccagatgccc 600gcattatacg aaggcagcga
aattactggt gctttgttac ctgaagttgc gaaagcgtgg 660ggtatggcga
cggtgccagt tgtcgcaggc ggtggcgaca atgcagctgg tgcagttggt
720gtgggaatgg ttgatgctaa tcaggcaatg ttatcgctgg ggacgtcggg
ggtctatttt 780gctgtcagcg aagggttctt aagcaagcca gaaagcgccg
tacatagctt ttgccatgcg 840ctaccgcaac gttggcattt aatgtctgtg
atgctgagtg cagcgtcgtg tctggattgg 900gccgcgaaat taaccggcct
gagcaatgtc ccagctttaa tcgctgcagc tcaacaggct 960gatgaaagtg
ccgagccagt ttggtttctg ccttatcttt ccggcgagcg tacgccacac
1020aataatcccc aggcgaaggg ggttttcttt ggtttgactc atcaacatgg
ccccaatgaa 1080ctggcgcgag cagtgctgga aggcgtgggt tatgcgctgg
cagatggcat ggatgtcgtg 1140catgcctgcg gtattaaacc gcaaagtgtt
acgttgattg ggggcggggc gcgtagtgag 1200tactggcgtc agatgctggc
ggatatcagc ggtcagcagc tcgattaccg tacggggggg 1260gatgtggggc
cagcactggg cgcagcaagg ctggcgcaga tcgcggcgaa tccagagaaa
1320tcgctcattg aattgttgcc gcaactaccg ttagaacagt cgcatctacc
agatgcgcag 1380cgttatgccg cttatcagcc acgacgagaa acgttccgtc
gcctctatca gcaacttctg 1440ccattaatgg cgtaa 1455211623DNAAspergillus
sp. 21atggctatcg gcaatcttta cttcattgcg gccatcgccg tcgtcggcgg
tggtctgttc 60ggtttcgata tctcgtcgat gtcggccatc atcgagaccg atgcctatct
ctgttacttc 120aaccaggctc ctgtcactta cgatgatgat ggcaagaggg
tctgtcaggg ccccagcgcg 180agtgtgcagg gtggtatcac cgcctccatg
gctggtggtt cctggttggg ctcgttgatc 240tcgggtttca tctcggacag
gcttggtcgt cgtactgcca ttcagatcgg ttccatcatc 300tggtgcattg
gatccatcat tgtctgtgcc tcccagaaca ttcccatgct gatcgtcggt
360cgtatcatca acggtctgag tgtgggtatc tgctccgctc aggtgccagt
gtatatttcg 420gagattgctc ctccaaccaa gcgtggtcgt gtcgtcggtc
tgcaacaatg ggctattacc 480tggggtatcc tgatcatgtt ctacgtctcc
tatggatgca gcttcatcaa gggtacggcg 540gccttccgga ttccctgggg
tctgcagatg atccctgccg tgctattgtt cctgggtatg 600atgctcctgc
ctgagtcacc ccgctggctg gcacgcaagg accgatggga ggagtgccac
660gctgttttga ccctcgtcca cggtcaggga gacccgagct ctccctttgt
gcagcgtgaa 720tatgaagaga tcaagagcat gtgcgagttt gagcgccaaa
acgcggatgt ctcctacctc 780gagctgttca agcccaacat gcttaaccgt
acccatgtgg gtgttttcgt tcagatctgg 840tctcagttga ctggaatgaa
cgtcatgatg tactacatca cctacgtctt tgccatggcc 900ggcttgaaag
gtaacaacaa cttgatctcc tccagtatcc agtacgtgat caacgtgtgc
960atgactgtgc cggctctggt gtggggtgat cagtggggcc gtcgcccgac
cttcttgatc 1020ggttccctct tcatgatgat ctggatgtac attaatgctg
gtctgatggc cagctacggt 1080catcccgcgc cgcccggcgg tctcaacaac
gtggaagccg agtcctgggt catccacggc 1140gcgcccagca aggctgtcat
tgccagtacc tacctcttcg tagcctcata cgccatctcc 1200ttcggccccg
ccagctgggt gtacccgccg gaactcttcc ctctgcgtgt gcgcggcaag
1260gctaccgccc tctgcacttc agccaactgg gccttcaact tcgccctcag
ctattttgtc 1320cccccggcat ttgtcaacat ccagtggaag gtctacatcc
tcttcggtgt cttctgtact 1380gccatgttct tgcacatttt cttcttcttt
cccgagacca cgggtaagac cctggaagag 1440gtcgaggcca tcttcactga
tcccaatggt attccgtaca tcggtactcc cgcctggaag 1500acaaagaacg
agtactcgcg cggtgcacac attgaggagg ttggctttga agatgagaag
1560aaggttgctg gtgggcagac tatccaccag gaggtcacgg ctactccgga
taagattgct 1620tga 1623221644DNACandida sp. 22atgtcacaag attcgcattc
ttctggtgcc gctacaccag tcaatggttc catccttgaa 60aaggaaaaag aagactctcc
agttcttcaa gttgatgccc cacaaaaggg tttcaaggac 120tacattgtca
tttctatctt ctgttttatg gttgccttcg gtggtttcgt cttcggtttc
180gacactggta ccatttccgg tttcgtgaac atgtctgact ttaaagacag
attcggtcaa 240caccacgctg atggtactcc ttacttgtcc gacgttagag
ttggtttgat gatttctatt 300ttcaacgttg gttgcgctgt cggtggtatt
ttcctctgca aggtcgctga tgtctggggt 360agaagaattg gtcttatgtt
ctccatggct gtctacgttg ttggtattat tattcagatc 420tcttcatcca
ccaagtggta ccagttcttc attggtcgtc ttattgctgg tttggctgtt
480ggtaccgttt ctgtcgtttc cccacttttc atctctgagg tttctccaaa
gcaaattaga 540ggtactttag tgtgctgctt ccagttgtgt atcaccttgg
gtatcttctt gggttactgt 600actacttacg gtactaagac ctacactgac
tctagacagt ggagaattcc tttgggtttg 660tgtttcgctt gggctatctt
gttggttgtc ggtatgttga acatgccaga gtctccaaga 720tacttggttg
agaagcacag aattgatgag gccaagagat ccattgccag atccaacaag
780atccctgagg aggacccatt cgtctacact gaggttcagc ttattcaggc
cggtattgag 840agagaagctt tggctggtca ggcatcttgg aaggagttga
tcactggtaa gccaaagatc 900ttcagaagag ttatcatggg tattatgctt
cagtccttgc aacagttgac cggtgacaac 960tacttcttct actacggtac
taccattttc caggctgtcg gtttgaagga ttctttccag 1020acttctatca
ttttgggtat tgtcaacttt gcttccacct tcgttggtat ctatgtcatt
1080gagagattgg gtagaagatt gtgtcttttg accggttccg ctgctatgtt
catctgtttc 1140atcatctact ctttgattgg tactcagcac ttgtacaagc
aaggttactc caacgagacc 1200tccaacactt acaaggcttc tggtaacgct
atgatcttca tcacttgtct ttacattttc 1260ttctttgctt ctacctgggc
tggtggtgtt tactgtatca tttccgagtc ctacccattg 1320agaattagat
ccaaggccat gtctattgct accgctgcta actggttgtg gggtttcttg
1380atttccttct tcactccatt catcaccagt gccatccact tctactacgg
tttcgttttc 1440actggttgtt tggctttctc tttcttctac gtctacttct
tcgtctacga aaccaagggt 1500ctttctttgg aggaggttga tgagatgtac
gcttccggtg ttcttccact caagtctgcc 1560agctgggttc caccaaatct
tgagcacatg gctcactctg ccggttacgc tggtgctgac 1620aaggccaccg
acgaacaggt ttaa 1644231569DNACandida sp. 23atgggtttgg aggacaatag
aatggttaag cgtttcgtca acgttggcga gaagaaggct 60ggctctactg ccatggccat
catcgtcggt ctttttgctg cttctggtgg tgtccttttc 120ggatacgata
ctggtactat ttctggtgtg atgaccatgg actacgttct tgctcgttac
180ccttccaaca agcactcttt tactgctgat gaatcttctt tgattgtttc
tatcttgtct 240gttggtactt tctttggtgc actttgtgct ccattcctta
acgacaccct cggtagacgt 300tggtgtctta ttctttctgc tcttattgtc
ttcaacattg gtgctatctt gcaggtcatc 360tctactgcca ttccattgct
ttgtgctggt agagttattg caggttttgg tgtcggtttg 420atttctgcta
ctattccatt gtaccaatct gagactgctc caaagtggat cagaggtgcc
480attgtctctt gttaccagtg ggctattacc attggtcttt tcttggcctc
ttgtgtcaac 540aagggtactg agcacatgac taactctgga tcttacagaa
ttccacttgc tattcaatgt 600ctttggggtc ttatcttggg tatcggtatg
atcttcttgc cagagactcc aagattctgg 660atctccaagg gtaaccagga
gaaggctgct gagtctttgg ccagattgag aaagcttcca 720attgaccacc
cagactctct cgaggaatta agagacatca ctgctgctta cgagttcgag
780actgtgtacg gtaagtcctc ttggagccag gtgttctctc acaagaacca
ccagttgaag 840agattgttca ctggtgtggc tatccaggct ttccagcaat
tgaccggtgt taacttcatt 900ttctactacg gtactacctt cttcaagaga
gctggtgtta acggtttcac tatctccttg 960gccactaaca ttgtcaatgt
cggttctact attccaggta ttcttttgat ggaagtcctc 1020ggtagaagaa
acatgttgat gggtggtgct actggtatgt ctctttctca attgatcgtt
1080gccattgttg gtgttgctac ctcggaaaac aacaagtctt cccagtccgt
ccttgttgct 1140ttctcctgta ttttcattgc cttcttcgct gccacctggg
gtccatgtgc ttgggttgtt 1200gttggtgagt tgttcccatt gagaaccaga
gctaagtctg tctccttgtg tactgcttcc 1260aactggttgt ggaactgggg
tattgcttac gctactccat acatggtgga tgaagacaag 1320ggtaacttgg
gttccaatgt gttcttcatc tggggtggtt tcaacttggc ttgtgttttc
1380ttcgcttggt acttcatcta cgagaccaag ggtctttctt tggagcaggt
cgacgagttg 1440tacgagcatg tcagcaaggc ttggaagtct aagggcttcg
ttccatctaa gcactctttc 1500agagagcagg tggaccagca aatggactcc
aaaactgaag ctattatgtc tgaagaagct 1560
tctgtttaa 1569
SEQ ID NO: 24 (1662 nt, DNA, Pichia sp.)
atgtctgtag atgaaaatca attggagaat ggacaacttc tatcctccga
aaatgaggca 60tcatcacctt ttaaagagtc tatcccttct cgctcttccc tctacttaat
agctcttaca 120gtttcacttt tgggagttca attgacttgg tcggttgaac
ttggttatgg tacaccgtat 180ttattctcac ttggtcttcg taaagaatgg
acttcaatta tatggattgc cggtcctttg 240actggaatat taattcagcc
aattgctggt atattgtccg accgggttaa ttcaagaata 300ggtcggcgga
gaccgttcat gctctgtgct agtttgttag gaacattcag cttattcctt
360atgggctggg cccctgatat ttgcctcttt atatttagca atgaggttct
aatgaaacgt 420gttactatcg ttttggctac gattagcatt tatttgcttg
acgtggccgt caatgtcgta 480atggctagca ctcgatcttt aattgttgat
tcagtccgtt cagatcaaca gcatgaagca 540aattcctggg ctggaagaat
gataggtgta ggcaatgtgc ttgggtactt actaggctat 600ttacctctat
atcgcatctt ctcctttctc aatttcacac agttacaggt gttttgcgta
660cttgcctcca tttccttggt actcacagtt accatcacaa caatatttgt
gagtgaaagg 720agattcccac cagttgaaca cgagaaatcg gttgctggag
aaatctttga attttttaca 780actatgcgac aaagtattac cgcacttcca
tttacattaa aaagaatttg ttttgttcaa 840ttttttgcat actttggatg
gtttccattt ttgttttata ttactaccta tgtgggtatt 900ttatatttac
gccatgctcc taaaggccat gaagaagatt gggacatggc gactcgtcaa
960gggtcgttcg cattactgct ttttgctatc atttctcttg ccgcaaatac
agcacttcca 1020ttgttgctcg aggacacgga ggatgatgag gaggacgaat
cgagtgatgc atctaataat 1080gaatacaaca ttcaagaaag aaacgatctc
ggaaatataa gaactggtac taatacaccc 1140cgtcttggta atttgagcga
aacaacttct ttccgttcgg aaaatgagcc ctcacgacgc 1200aggcttttac
cgtctagtag atcaattatg acaacgatat cctccaaggt acaaatcaaa
1260ggacttactc ttcctattct gtggttgagc tcccatgtcc tttttggtgt
ttgtatgttg 1320agcacgatat tcttgcaaac atcatggcaa gcgcaggcaa
tggtagctat ctgtggactg 1380tcctgggcat gtactctatg gattccatat
tcgctatttt cttcagaaat agggaagctt 1440ggattacgag aaagcagtgg
caaaatgatt ggtgttcaca atgtatttat atctgccccc 1500caagtgttga
gcaccatcat tgccaccatt gtatttattc aatcggaggg cagtcatcga
1560gacatcgccg acaatagtat agcatgggtg ttgagaattg gaggtatatc
tgcatttcta 1620gccgcgtacc aatgccggca tcttttgccc atcaactttt ga
166225490DNAThraustochytrium sp. 25cgcggcttcc cgtctccaag cttcgtctcg
gtagagattc tatcttcgcc cggcagcccg 60ccgccgtccg gcaagtgtag aacggcagaa
agcccacttg cacggaacgc ccgacaagtt 120gacgaaagcg gcccgcaagt
gcggcagccc ggctggtttt tcctcgcggc gaggccaaac 180cgccaacgcc
accaagccag acaccaggta tgtgccgcac gcgccgccgc acgcgagccc
240cgaggatgcc ccgtacgcgc tgacgccttt ctccgccccg cccgcgagaa
gacgcgctcc 300ggcaacggcg ggagccgagc gaacgggcga ggattgatcg
agtagctgca ggttgagaaa 360aaaggaaaac cgccgagatg gacaacggct
ggatggacga gaagacgcac gaggacgcga 420ggactgacga tgatcacgtg
cgcaggaaga cttgaaaaga agcaaggaag gtagaaaaaa 480
aagaagaaat 490
SEQ ID NO: 26 (1004 nt, DNA, Thraustochytrium sp.)
ggcctgtctc ccttggccat ccattgcgct
gcggaagcat tggattgcga actgcgtcgg 60ccagatcgct tggtttccca acatgagacg
cgctctgtcg gcaagaccat ttccgccccc 120ggctttgctc acaaccaact
cgtagtagat tttgtaaaga acactgcacg tctgactgct 180cccagcccgc
acgcattgcg cttggcagcc tcggtcccaa accgtcacgg tcgctgcccg
240gtccacggga aaaaataact tttgtccgcg agcggccgtt caaggcgcag
ccgcgagcgt 300gccaaccgtc cgtcccgcat tcttttccca atgttggatt
cattcattct tgccaggcca 360gatcatctgt gcctccctcg cgtgcccttc
cttagcgtgc gcagatctct tcttcccaga 420gcccgcgcgg cgcttcgtgg
agtcggcgtc catgtcatgc gcgcgcggcg tcttgacccc 480ctcggcccct
ttggttcgcg gctgcgcaac gagccgtttc acgccattgc gaccaaccgc
540gcgctaaaat cggattggcc gttgcacgcc gattttgcag cacctctggg
ctgtgaggga 600cgaccgtcca cttttacccg cacagagtgg actttcaccc
cctcactcca ctgaagccaa 660cttttcgccg tcttcccaac ccaaagttta
tgctagccct catgccgcaa cggacgtcac 720ccccatttcc actggcgacg
tggggacctg ggcgcaataa ggcgcgagaa ggaaattacg 780acggcacact
ggggccagaa gagggcacta ggagcggcaa cccactggcg cggcacagcg
840gtttggcgcg gggatcaaag caaaacccgg ctcatccaga gcaaacccga
atcagccttc 900agacggtcgt gcctaacaac acgccgttct accccgcctt
ccttgcgtcc ctcgcctccc 960ccgagcccaa gtcttccgcc cgctcctaac
gccaaccaag caag 1004
SEQ ID NO: 27 (590 nt, DNA, Thraustochytrium sp.)
atgcattcgc
atggctccgc accaccacac accaccgccc ctcttctttc cttgctcact 60cgatccatag
ccacttacct gccccttccc tctaccactg ccacgtgcgg cgtatgagcg
120cgcttgcacc cgcaaccttc tctctagttg ttcacaatta cacccgctat
caatactcac 180gcattcatct tccccttttt ttctacttta cgtaccggtg
ctcacttact tacacctgcc 240cgccttgttc attcattctt ctcgatgaca
acggcaggct ctgcttgcgg cgcgcgcacg 300catcccttac tccgccgcgc
accgacaagc ctgcgcaaaa aacaaaaaaa acttatcttc 360gctcgcggct
ccgatgtcgc ggcggcgtac gagaccgcgc cgagttccgc ccgccatgcg
420atcgagagtc tctctcgtag gagcgggacc gcgagcgacc tcggtgcctc
cgatagccag 480ctgggcttct agaccggctg ggggaccgcc cgcggcgtac
ctctgcgctt cggtggccct 540taaaaggctg atcgtggaaa aggtcgctct
ccagtctgcg gtttagcggc 590
SEQ ID NO: 28 (640 nt, DNA, Thraustochytrium sp.)
gcgccttcag
gcaggctgat ccctactgtg ggggctctga cggacggccg gtctttgtac 60gtaaacaggc
gcttcttcgc ggcccgccga ggggggcggc aacgagccgg gtggcgtggc
120acggacaagg caagagcctt tccatcccgc ataaagtgat gcaccatttt
gaccttgttg 180atcgtttttg tgtgtttaga gcggccccgt gcgggtaggc
gaagtgcgct tctgagcaag 240gaagagagag gtgcagcttc ttcttgatca
gtgtggtaat cttcaacggc cacgctcgct 300tattcgatac ctgtaaagct
accggtgcac ccgtgcaagt tgggcaccac gtagttgtac 360tggtgaatcc
aaatgttagc cgctagcttg gtgccctttt cgacaggaag ggcttggtga
420aaagccatgc tgtcgatctc ccttgggtcc tcgttcgtga cgctaggcca
gagaatagct 480gtgtgccgcg cagtcgaagc cagcgcgcgc gcgtcggggc
cgagcataga gttagcaatt 540cagttgtttc gggctcttga tgaggccgcc
agagagcgaa gaaggatgaa cttaccagat 600ccgcgctccg gtgtattggt
gatgggcggc ttggtctccg 640
SEQ ID NO: 29 (372 nt, DNA, Artificial Sequence, Synthetic Construct)
atggccaagc tcacctcggc cgtcccggtg ctcaccgcgc gcgacgtcgc
cggcgcggtc 60gagttctgga ccgaccggct cgggttctcc cgggacttcg tggaggacga
cttcgccggc 120gtggtccggg acgacgtgac cctgttcatc agcgcggtcc
aggaccaggt ggtgccggac 180aacaccctgg cctgggtgtg ggtgcgcggc
ctggacgagc tgtacgccga gtggtcggag 240gtcgtgtcca cgaacttccg
ggacgcctcc gggccggcca tgaccgagat cggcgagcag 300ccgtgggggc
gggagttcgc cctgcgcgac ccggccggca actgcgtgca cttcgtggcc 360
gaggagcagg ac 372
SEQ ID NO: 30 (1017 nt, DNA, Thraustochytrium sp.)
gcgttgtttc
ggcacgcgca attgccggga tgggaatgtg cattggtgca cgggattgct 60gcgtcggccg
ggccgtctcc caacatgaga cgcgctcggc aggaccgctt ccggttggca
120taacgtcgtt ttttccctgc tgtcccagct cgcgctttcg agacgaagct
atctgtacta 180ccctctctac tcgcgatcac tcgctcgcaa ggaaaatgga
agttaacgaa aggtcaatca 240ttttttgcgc tctgcattca tttgctctct
tcttgttgtt tgtggaacca aacggtcaga 300cgcgtggatc gcttttgtta
ggcactcggg aacggctgtc cctttaagca ctcaaccgaa 360cagtcgggcc
acttggtctg caacagcgag accaacttgg gtgcatggcg gcggctcatc
420ttccactgcc actccatggg caggtcgtga aaggagagcc acagcgagta
gcccgcctgc 480tggcggctct gtccgcacga ctggcacaca gacgtcccgg
cgtcgttctg gccaaagcac 540atggtctgga agcgggtccg gtaacaagag
gcgcaacgcc aaggctcgct cgaggccggc 600tcgttgtccg cgttgagcgt
caaaatcacg gggggcggca cgcccgcagg gcgctcgggc 660cccgtgatcg
acggatcgtc gatagcgagc acgacctcgt aacgccaggc gcgaaaatcg
720ttgggcttgg cagccttgtc gtcaggatgc gcctgcacaa tgtcctcgac
ctcgccaggc 780actggtttga agtacgcctc cttgtgctcc tccggggccg
cctctgcctc cttgtcgccg 840tggatctcga gttgggccat gcggggaccg
gccggcggca gaccgttaaa atgccagagg 900tgcagcggtg gcttgtacga
gtcgagggcg tggtcgcaga agagtagcgt aaaatggtca 960ccgccgtgca
ggatccatac agggtgcccc ggcgtcttga ggcagtcgtg tacagga 1017
* * * * *
References