U.S. patent application number 16/938605 was filed with the patent office on 2020-07-24 and published on 2021-01-14 for production of manool.
The applicant listed for this patent is FIRMENICH SA. Invention is credited to Laurent Daviet, Letizia Rocci, Michel Schalk, Daniel Solis Escalante.
Application Number | 20210010035 16/938605 |
Family ID | 1000005120548 |
Filed Date | 2020-07-24 |
United States Patent Application | 20210010035 |
Kind Code | A1 |
Schalk; Michel; et al. | January 14, 2021 |
PRODUCTION OF MANOOL
Abstract
Described herein are methods of producing (+)-manool, the
methods including: contacting geranylgeranyl diphosphate with a
copalyl diphosphate (CPP) synthase to form a (9S, 10S)-copalyl
diphosphate and contacting the CPP with a sclareol synthase enzyme
to form (+)-manool and derivatives thereof. Also described herein
are nucleic acids encoding CPP synthases and sclareol synthases for
use in the methods. Further described herein are expression vectors
and non-human host organisms and cells including nucleic acids
encoding a CPP synthase and a sclareol synthase as described
herein.
Inventors: | Schalk; Michel (Satigny, CH); Daviet; Laurent (Satigny, CH); Rocci; Letizia (Satigny, CH); Solis Escalante; Daniel (Satigny, CH) |
Applicant:
Name | City | State | Country | Type |
FIRMENICH SA | Satigny | | CH | |
Family ID: | 1000005120548 |
Appl. No.: | 16/938605 |
Filed: | July 24, 2020 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
16472120 | Jun 20, 2019 | 10752922 |
PCT/EP2017/083372 | Dec 18, 2017 | |
16938605 | | |
Current U.S. Class: | 1/1 |
Current CPC Class: | C12Y 505/01012 20130101; C12N 1/16 20130101; C12N 9/88 20130101; C12N 15/70 20130101; C12N 15/81 20130101; C12P 7/02 20130101; C12N 1/20 20130101 |
International Class: | C12P 7/02 20060101 C12P007/02; C12N 9/88 20060101 C12N009/88; C12N 15/70 20060101 C12N015/70; C12N 15/81 20060101 C12N015/81; C12N 1/16 20060101 C12N001/16; C12N 1/20 20060101 C12N001/20 |
Foreign Application Data
Date | Code | Application Number |
Dec 22, 2016 | EP | 16206349.9 |
Claims
1. A method of producing (+)-manool, the method comprising: a)
contacting geranylgeranyl diphosphate (GGPP) with a copalyl
diphosphate (CPP) synthase to form a copalyl diphosphate, wherein
the CPP synthase comprises a) an amino acid sequence having at
least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or
100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or b) an
amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%,
95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID
NO: 18; or c) an amino acid sequence having at least 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ
ID NO: 20 or SEQ ID NO: 21; and b) contacting the CPP with a
sclareol synthase to form the (+)-manool; and c) optionally
isolating the (+)-manool.
2. The method of claim 1, wherein the CPP synthase comprises a) a
polypeptide comprising an amino acid sequence having at least 90%,
95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID
NO: 15; or b) a polypeptide comprising an amino acid sequence
having at least 90%, 95%, 98%, 99%, or 100% sequence identity to
SEQ ID NO: 17 or SEQ ID NO: 18; or c) a polypeptide comprising an
amino acid sequence having at least 95%, 96%, 97%, 98%, 99%, or
100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21.
3. The method of claim 1, wherein step a) further comprises
culturing a non-human host organism or cell capable of producing
GGPP and transformed to express at least one polypeptide comprising
a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%,
75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ
ID NO: 14 or SEQ ID NO: 15; or b) an amino acid sequence having at
least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 17 or SEQ ID NO: 18; or c) an amino acid
sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%,
98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO:
21; and having a CPP synthase activity, under conditions conducive
to a production of CPP.
4. The method of claim 1, wherein the method further comprises,
prior to step a), transforming a non-human host organism or cell
capable of producing GGPP with a) at least one nucleic acid
encoding a polypeptide comprising a) an amino acid sequence having
at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%,
99%, or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15;
or b) an amino acid sequence having at least 71%, 72%, 75%, 80%,
85%, 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 17
and SEQ ID NO: 18; or c) an amino acid sequence having at least
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence
identity to SEQ ID NO: 20 or SEQ ID NO: 21; and having a CPP
synthase activity, so that said organism or cell expresses said
polypeptide having a CPP synthase activity; and b) at least one
nucleic acid encoding a polypeptide having a sclareol synthase
activity, so that said organism or cell expresses said polypeptide
having a sclareol synthase activity.
5. The method as recited in claim 4, wherein the polypeptide having
sclareol synthase activity comprises an amino acid sequence having
at least 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID
NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25.
6. The method as recited in claim 1, further comprising processing
the (+)-manool to a (+)-manool derivative using a chemical or
biochemical synthesis or a combination of both.
7. The method as recited in claim 6, wherein the derivative is an
alcohol, acetal, aldehyde, acid, ether, ketone, lactone, acetate or
an ester.
8. The method as recited in claim 6, wherein the derivative is
selected from the group consisting of copalol, copalal, manooloxy,
Z-11, gamma-ambrol and ambrox.
9. A method for transforming a host cell or non-human organism, the
method comprising transforming a host cell or non-human organism
with a nucleic acid encoding a polypeptide having a copalyl
diphosphate synthase activity and a nucleic acid encoding a
polypeptide having a sclareol synthase activity, wherein the
polypeptide having copalyl diphosphate activity comprises a) an
amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%,
80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID
NO: 14 or SEQ ID NO: 15; or b) an amino acid sequence having at
least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 17 or SEQ ID NO: 18; or c) an amino acid
sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%,
98%, 99% or 100% sequence identity to SEQ ID NO: 20 and SEQ ID NO:
21; and wherein the polypeptide having sclareol synthase activity
comprises an amino acid sequence having at least 90%, 95%, 98%,
99%, or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ
ID NO: 23, or SEQ ID NO: 25.
10. The method as recited in claim 4, wherein the host cell or
non-human organism is a plant, a prokaryote, or a fungus.
11. The method as recited in claim 4, wherein the non-human host
organism or cell is E. coli or Saccharomyces cerevisiae.
12. An expression vector comprising a) a nucleic acid encoding a
polypeptide having a CPP synthase activity comprising a) an amino
acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 14 or SEQ ID NO: 15; or b) an amino acid
sequence having at least 95%, 98%, 99% or 100% sequence identity to
SEQ ID NO: 17 or SEQ ID NO: 18; or c) an amino acid sequence having
at least 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO:
21; or b) a nucleic acid encoding a polypeptide having a CPP
synthase activity comprising a nucleotide sequence having at least
90%, 95%, 98%, 99% or 100% sequence identity to a nucleic acid
sequence as set forth in SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO:
19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ
ID NO: 31, or SEQ ID NO: 32.
13. The expression vector of claim 12 further comprising a) a
nucleic acid encoding a polypeptide having a sclareol synthase
activity, wherein the polypeptide having sclareol synthase activity
comprises an amino acid sequence having at least 90%, 95%, 98%, 99%
or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO:
23, or SEQ ID NO: 25; or b) a nucleic acid encoding a polypeptide
having a sclareol synthase activity comprising a nucleotide
sequence having at least 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO:
33, or SEQ ID NO: 34.
14. A non-human host organism or cell comprising a) the expression
vector as recited in claim 12; or b) a nucleic acid encoding a
polypeptide having a copalyl diphosphate synthase activity and a
nucleic acid encoding a polypeptide having a sclareol synthase
activity, wherein the polypeptide having copalyl diphosphate
activity comprises i. an amino acid sequence having at least 50%,
55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100%
sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or ii. an
amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%,
95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID
NO: 18; or iii. an amino acid sequence having at least 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to
SEQ ID NO: 20 or SEQ ID NO: 21; and wherein the polypeptide having
sclareol synthase activity comprises an amino acid sequence having
at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO:
4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25; and wherein at
least one of the nucleic acids is heterologous to the non-human
host organism or cell.
15. The non-human host organism or cell of claim 14, wherein the
non-human host organism or cell is a plant, a prokaryote, a fungus,
Escherichia coli, or Saccharomyces cerevisiae.
16. The method as recited in claim 9, wherein the host cell or
non-human organism is a plant, a prokaryote, or a fungus.
17. The method as recited in claim 9, wherein the non-human host
organism or cell is E. coli or Saccharomyces cerevisiae.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a Divisional Application of U.S.
Non-Provisional application Ser. No. 16/472,120 filed Jun. 20,
2019, which claims priority to U.S. National Phase Application of
PCT/EP2017/083372, filed Dec. 18, 2017, which claims the benefit of
priority to European Patent Application No. 16206349.9, filed Dec.
22, 2016, the entire contents of which are hereby incorporated by
reference herein.
TECHNICAL FIELD
[0002] Provided herein are biochemical methods of producing
(+)-manool using a copalyl diphosphate synthase and a sclareol
synthase.
BACKGROUND
[0003] Terpenes are found in most organisms (microorganisms,
animals and plants). These compounds are made up of five carbon
units called isoprene units and are classified by the number of
these units present in their structure. Thus monoterpenes,
sesquiterpenes and diterpenes are terpenes containing 10, 15 and 20
carbon atoms respectively. Sesquiterpenes, for example, are widely
found in the plant kingdom. Many sesquiterpene molecules are known
for their flavor and fragrance properties and their cosmetic,
medicinal and antimicrobial effects. Numerous sesquiterpene
hydrocarbons and sesquiterpenoids have been identified.
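The carbon-count classification described above can be illustrated with a short sketch. The function name `classify_terpene` and the lookup table are hypothetical and introduced only for illustration; the only facts taken from the text are that isoprene units contain five carbons and that monoterpenes, sesquiterpenes and diterpenes contain 10, 15 and 20 carbon atoms respectively.

```python
# Illustrative sketch only: classify a terpene by its count of
# five-carbon isoprene units. Names here are hypothetical.
ISOPRENE_CARBONS = 5

TERPENE_CLASSES = {
    2: "monoterpene",    # 10 carbon atoms
    3: "sesquiterpene",  # 15 carbon atoms
    4: "diterpene",      # 20 carbon atoms
}


def classify_terpene(carbon_atoms: int) -> str:
    """Return the terpene class for a carbon count, or 'unclassified'."""
    units, remainder = divmod(carbon_atoms, ISOPRENE_CARBONS)
    if remainder != 0:
        # Not a whole number of isoprene units.
        return "unclassified"
    return TERPENE_CLASSES.get(units, "unclassified")
```

Under this scheme a 20-carbon skeleton such as that of geranylgeranyl diphosphate falls in the diterpene class; classes outside the three named in the text are deliberately reported as unclassified.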
[0004] Biosynthetic production of terpenes involves enzymes called
terpene synthases. These enzymes convert an acyclic terpene
precursor into one or more terpene products. In particular, diterpene
synthases produce diterpenes by cyclization of the precursor
geranylgeranyl diphosphate (GGPP). The cyclization of GGPP often
requires two enzyme polypeptides, a type I and a type II diterpene
synthase working in combination in two successive enzymatic
reactions. The type II diterpene synthases catalyze a
cyclization/rearrangement of GGPP initiated by the protonation of
the terminal double bond of GGPP leading to a cyclic diterpene
diphosphate intermediate. This intermediate is then further
converted by a type I diterpene synthase catalyzing an ionization
initiated cyclization.
[0005] Diterpene synthases are present in plants and other
organisms and use substrates such as GGPP but they have different
product profiles. Genes and cDNAs encoding diterpene synthases have
been cloned and the corresponding recombinant enzymes
characterized.
[0006] Copalyl diphosphate (CPP) synthase enzymes and sclareol
synthase enzymes are enzymes that occur in plants. Hence, it is
desirable to discover and use these enzymes and variants in
biochemical processes to generate (+)-manool.
SUMMARY
[0007] Provided herein is a method of producing (+)-manool
comprising:
[0008] a) contacting geranylgeranyl diphosphate (GGPP) with a
copalyl diphosphate (CPP) synthase to form a copalyl diphosphate,
wherein the CPP synthase comprises
[0009] i) an amino acid sequence having at least 50%, 55%, 60%,
65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence
identity to a polypeptide selected from the group consisting of SEQ
ID NO: 14 and SEQ ID NO: 15; or
[0010] ii) an amino acid sequence having at least 71%, 72%, 75%,
80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 17 and
SEQ ID NO: 18; or
[0011] iii) an amino acid sequence having at least 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 20 and
SEQ ID NO: 21; and
[0012] b) contacting the CPP with a sclareol synthase to form
(+)-manool; and
[0013] c) optionally isolating the (+)-manool.
[0014] Provided herein is the above method further comprising
processing the (+)-manool to a (+)-manool derivative.
[0015] Also provided herein is a polypeptide having CPP synthase
activity, wherein the polypeptide comprises
[0016] a) an amino acid sequence having at least 50%, 55%, 60%,
65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence
identity to a polypeptide selected from the group consisting of SEQ
ID NO: 14 and SEQ ID NO: 15; or
[0017] b) an amino acid sequence having at least 71%, 72%, 75%,
80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 17 and
SEQ ID NO: 18; or
[0018] c) an amino acid sequence having at least 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 20 and
SEQ ID NO: 21.
[0019] Further provided is a polypeptide having sclareol synthase
activity, wherein the polypeptide comprises an amino acid sequence
having at least 90%, 95%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 4, SEQ ID
NO: 5, SEQ ID NO: 23, and SEQ ID NO: 25.
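The percent sequence identity thresholds recited above can be made concrete with a minimal sketch. This excerpt does not state how identity is calculated, so the convention used here (identical columns divided by the full alignment length, with `-` as the gap character) is an assumption, and the function names `percent_identity` and `meets_threshold` are hypothetical. The inputs are assumed to be already aligned; in practice a dedicated alignment program would produce the alignment.

```python
# Illustrative sketch only. Assumes the two sequences are already
# aligned to equal length, with '-' marking gaps. The denominator
# convention (alignment length) is an assumption, not taken from
# the application.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if identity is at least the given threshold percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A candidate polypeptide would then be compared against a reference sequence with, for example, `meets_threshold(candidate, reference, 90.0)`. Other denominator conventions (shorter sequence length, ungapped columns only) give different values for the same alignment, so the convention should be fixed before comparing against a threshold.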
[0020] Also provided herein is a nucleic acid encoding a
polypeptide described above.
[0021] Also provided herein is a nucleic acid encoding a CPP
synthase wherein the nucleic acid comprises a nucleotide sequence
having at least 90%, 95%, 98%, 99% or 100% sequence identity to the
nucleic acid sequence as set forth in SEQ ID NO: 13, SEQ ID NO: 16,
SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID
NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
[0022] Further provided herein is a nucleic acid encoding a
sclareol synthase wherein the nucleic acid comprises a nucleotide
sequence having at least 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO:
33, or SEQ ID NO: 34.
[0023] Also provided is an expression vector comprising the nucleic
acids described above, a non-human host organism or cell comprising
the nucleic acids described above or comprising the expression
vector, non-human host organisms or cells capable of producing
GGPP, methods of transforming a non-human host organism or cell,
and methods for culturing the non-human host organisms or cells for
producing (+)-manool.
DESCRIPTION OF THE DRAWINGS
[0024] FIG. 1. Enzymatic pathway from geranylgeranyl diphosphate
(GGPP) to (+)-manool.
[0025] FIG. 2. Enzymatic pathways from geranylgeranyl diphosphate
(GGPP) to (+)-manool and sclareol.
[0026] FIG. 3. GCMS analysis of the in vitro enzymatic conversion
of GGPP. A. Using the recombinant SmCPS enzyme. B. Using the
recombinant ScScS enzyme. C. Combining the SmCPS with ScScS enzymes
in a single assay.
[0027] FIG. 4. GCMS analysis of (+)-manool produced using
Escherichia coli cells expressing SmCPS, ScScS and mevalonate
pathway enzymes. A. Total ion chromatogram of an extract of the E.
coli culture medium. B. Total ion chromatogram of a (+)-manool
standard. C. Mass spectrum of the major peak (retention time of
14.55 min) in chromatogram A. D. Mass spectrum of the (+)-manool
authentic standard.
[0028] FIG. 5. GCMS analysis of (+)-manool produced using E. coli
cells expressing mevalonate pathway enzymes, a GGPP synthase,
ScSCS and five different CPP synthases: SmCPS2 from Salvia
miltiorrhiza, CfCPS1 from Coleus forskohlii, TaTps1 from Triticum
aestivum, MvCps3 from Marrubium vulgare and RoCPS1 from Rosmarinus
officinalis.
[0029] FIG. 6. GCMS analysis of (+)-manool produced using E. coli
cells expressing mevalonate pathway enzymes, a GGPP synthase,
SmCPS2 and a class I diterpene synthase: NgSCS-del29 from
Nicotiana glutinosa or SsScS from Salvia sclarea.
[0030] FIG. 7. Saccharomyces cerevisiae expression plasmids were
constructed in vivo by co-transformation of yeast with six DNA
fragments: a) LEU2 yeast marker, b) AmpR E. coli marker, c) Yeast
origin of replication, d) E. coli replication origin, e) a fragment
for co-expression of CrtE and one of the sclareol synthases coding
sequences tested, and f) a fragment for expression of one of the
copalyl diphosphate (CPP) synthases coding sequences tested.
[0031] FIG. 8. GCMS analysis of (+)-manool produced using the
modified S. cerevisiae strain YST045 expressing a GGPP synthase,
ScSCS and five different truncated versions of CPP synthases:
SmCPS2 from Salvia miltiorrhiza, CfCPS1 from Coleus forskohlii,
TaTps1 from Triticum aestivum, MvCps3 from Marrubium vulgare and
RoCPS1 from Rosmarinus officinalis.
DETAILED DESCRIPTION
Definitions
[0032] For the descriptions herein and the appended claims, the use
of "or" means "and/or" unless stated otherwise.
[0033] Similarly, "comprise," "comprises," "comprising," "include,"
"includes," and "including" are interchangeable and not intended to
be limiting.
[0034] It is to be further understood that where descriptions of
various embodiments use the term "comprising," those skilled in the
art would understand that in some specific instances, an embodiment
can be alternatively described using language "consisting
essentially of" or "consisting of."
[0035] The following terms have the meanings ascribed to them
unless specified otherwise.
[0036] The term "polypeptide" means an amino acid sequence of
consecutively polymerized amino acid residues, for instance, at
least 15 residues, at least 30 residues, at least 50 residues. In
some embodiments provided herein, a polypeptide comprises an amino
acid sequence that is an enzyme, or a fragment, or a variant
thereof.
[0037] The term "isolated" polypeptide refers to an amino acid
sequence that is removed from its natural environment by any method
or combination of methods known in the art and includes
recombinant, biochemical and synthetic methods.
[0038] The term "protein" refers to an amino acid sequence of any
length wherein amino acids are linked by covalent peptide bonds,
and includes oligopeptide, peptide, polypeptide and full length
protein whether naturally occurring or synthetic.
[0039] The terms "biological function," "function," "biological
activity" or "activity" refer to the ability of the CPP synthase
and the sclareol synthase to catalyze the formation of
(+)-manool.
[0040] The terms "nucleic acid sequence," "nucleic acid," and
"polynucleotide" are used interchangeably meaning a sequence of
nucleotides. A nucleic acid sequence may be a single-stranded or
double-stranded deoxyribonucleotide, or ribonucleotide of any
length, and include coding and non-coding sequences of a gene,
exons, introns, sense and anti-sense complementary sequences,
genomic DNA, cDNA, miRNA, siRNA, mRNA, rRNA, tRNA, recombinant
nucleic acid sequences, isolated and purified naturally occurring
DNA and/or RNA sequences, synthetic DNA and RNA sequences,
fragments, primers and nucleic acid probes; and the complement of
such sequences. The skilled artisan is aware that the nucleic acid
sequences of RNA are identical to the DNA sequences with the
difference of thymine (T) being replaced by uracil (U).
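The relationship between DNA and RNA sequences noted above amounts to a single character substitution. A minimal sketch, with the hypothetical function name `dna_to_rna`:

```python
# Illustrative sketch only: write a DNA sequence as the corresponding
# RNA sequence by replacing thymine (T) with uracil (U).
def dna_to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence (upper-cased)."""
    return dna.upper().replace("T", "U")
```

For instance, `dna_to_rna("ATGGCT")` yields the string `AUGGCU`. The function does not validate its input, so characters other than A, C, G and T pass through unchanged.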
[0041] An isolated nucleic acid or isolated nucleic acid sequence
refers to a nucleic acid or nucleic acid sequence that is in an
environment different from that in which the nucleic acid or
nucleic acid sequence naturally occurs. The term
"naturally-occurring" as used herein as applied to a nucleic acid
refers to a nucleic acid that is found in a cell in nature. For
example, a nucleic acid sequence that is present in an organism,
for instance in the cells of an organism, that can be isolated from
a source in nature and which has not been intentionally modified
by a human in the laboratory is naturally occurring.
[0042] The terms "purified," "substantially purified," and
"isolated" as used herein refer to the state of being free of
other, dissimilar compounds with which the compound of the
invention is normally associated in its natural state, so that the
"purified," "substantially purified," and "isolated" subject
comprises at least 0.5%, 1%, 5%, 10%, or 20%, or at least 50% or
75% of the mass, by weight, of a given sample. In one particular
embodiment, these terms refer to the compound of the invention
comprising at least 95, 96, 97, 98, 99 or 100% of the mass, by
weight, of a given sample. As used herein, the terms "purified,"
"substantially purified," and "isolated," when referring to a
nucleic acid or protein, or classes of nucleic acids or proteins,
also refer to a state of purification or concentration different
from that which occurs naturally in a cell or organism. Any degree of
purification or concentration greater than that which occurs
naturally in a cell or organism, including (1) the purification
from other associated structures or compounds or (2) the
association with structures or compounds to which it is not
normally associated in the cell or organism, is within the meaning
of "isolated." The nucleic acid or protein or classes of nucleic
acids or proteins, described herein, may be isolated, or otherwise
associated with structures or compounds to which they are not
normally associated in nature, according to a variety of methods
and processes known to those of skill in the art.
[0043] As used herein, the terms "amplifying" and "amplification"
refer to the use of any suitable amplification methodology for
generating or detecting recombinant or naturally expressed nucleic
acid, as described in detail, below. For example, the invention
provides methods and reagents (e.g., specific degenerate
oligonucleotide primer pairs, oligo dT primer) for amplifying
(e.g., by polymerase chain reaction, PCR) naturally expressed
(e.g., genomic DNA or mRNA) or recombinant (e.g., cDNA) nucleic
acids of the invention in vivo, ex vivo or in vitro.
[0044] "Recombinant nucleic acid sequences" are nucleic acid
sequences that result from the use of laboratory methods (molecular
cloning) to bring together genetic material from more than one
source, creating a nucleic acid sequence that does not occur
naturally and would not be otherwise found in biological
organisms.
[0045] "Recombinant DNA technology" refers to molecular biology
procedures to prepare a recombinant nucleic acid sequence as
described, for instance, in Laboratory Manuals edited by Weigel and
Glazebrook, 2002 Cold Spring Harbor Lab Press; and Sambrook et al.,
1989 Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory
Press.
[0046] The term "gene" means a DNA sequence comprising a region,
which is transcribed into an RNA molecule, e.g., an mRNA in a cell,
operably linked to suitable regulatory regions, e.g., a promoter. A
gene may thus comprise several operably linked sequences, such as a
promoter, a 5' leader sequence comprising, e.g., sequences involved
in translation initiation, a coding region of cDNA or genomic DNA,
introns, exons, and/or a 3' non-translated sequence comprising,
e.g., transcription termination sites.
[0047] A "chimeric gene" refers to any gene, which is not normally
found in nature in a species, in particular, a gene in which one or
more parts of the nucleic acid sequence are present that are not
associated with each other in nature. For example the promoter is
not associated in nature with part or all of the transcribed region
or with another regulatory region. The term "chimeric gene" is
understood to include expression constructs in which a promoter or
transcription regulatory sequence is operably linked to one or more
coding sequences or to an antisense, i.e., reverse complement of
the sense strand, or inverted repeat sequence (sense and antisense,
whereby the RNA transcript forms double stranded RNA upon
transcription). The term "chimeric gene" also includes genes
obtained through the combination of portions of one or more coding
sequences to produce a new gene.
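The antisense and inverted repeat arrangements mentioned above both rest on the reverse complement of the sense strand. The sketch below shows that operation; the function names and the `spacer` parameter are hypothetical, and the toy sequences used with it are not sequences from the application.

```python
# Illustrative sketch only: reverse complement of a DNA sense strand,
# and a sense + spacer + antisense inverted repeat built from it.
# Characters outside A, C, G, T (either case) are left unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the antisense (reverse complement) of a sense strand."""
    return dna.translate(_COMPLEMENT)[::-1]


def inverted_repeat(sense: str, spacer: str = "") -> str:
    """Join sense strand, optional spacer and antisense strand."""
    return sense + spacer + reverse_complement(sense)
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check on any input. A transcript of the `inverted_repeat` construct can fold back on itself because its two arms are complementary, which is the double-stranded RNA arrangement the paragraph describes.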
[0048] A "3' UTR" or "3' non-translated sequence" (also referred to
as "3' untranslated region," or "3' end") refers to the nucleic acid
sequence found downstream of the coding sequence of a gene, which
comprises for example a transcription termination site and (in
most, but not all eukaryotic mRNAs) a polyadenylation signal such
as AAUAAA or variants thereof. After termination of transcription,
the mRNA transcript may be cleaved downstream of the
polyadenylation signal and a poly(A) tail may be added, which is
involved in the transport of the mRNA to the site of translation,
e.g., cytoplasm.
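Locating the canonical polyadenylation signal named above in a transcript can be sketched as a plain substring search. The function name `find_polya_signals` is hypothetical, only the AAUAAA motif comes from the text, and variant signals would need to be passed in explicitly through the `signal` argument.

```python
from typing import List


# Illustrative sketch only: find every occurrence of a polyadenylation
# signal (default AAUAAA) in an mRNA sequence written with U, not T.
def find_polya_signals(mrna: str, signal: str = "AAUAAA") -> List[int]:
    """Return 0-based start positions of each signal occurrence."""
    mrna = mrna.upper()
    signal = signal.upper()
    positions: List[int] = []
    start = mrna.find(signal)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping occurrences are also reported.
        start = mrna.find(signal, start + 1)
    return positions
```

An empty list means the motif is absent, which is consistent with the statement that not all eukaryotic mRNAs carry the signal. The search says nothing about whether a given occurrence is functional.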
[0049] "Expression of a gene" involves transcription of the gene
and translation of the mRNA into a protein. Overexpression refers
to the production of the gene product as measured by levels of
mRNA, polypeptide and/or enzyme activity in transgenic cells or
organisms that exceeds levels of production in non-transformed
cells or organisms of a similar genetic background.
[0050] "Expression vector" as used herein means a nucleic acid
molecule engineered using molecular biology methods and recombinant
DNA technology for delivery of foreign or exogenous DNA into a host
cell. The expression vector typically includes sequences required
for proper transcription of the nucleotide sequence. The coding
region usually codes for a protein of interest but may also code
for an RNA, e.g., an antisense RNA, siRNA and the like.
[0051] An "expression vector" as used herein includes any linear or
circular recombinant vector including but not limited to viral
vectors, bacteriophages and plasmids. The skilled person is capable
of selecting a suitable vector according to the expression system.
In one embodiment, the expression vector includes the nucleic acid
of an embodiment herein operably linked to at least one regulatory
sequence, which controls transcription, translation, initiation and
termination, such as a transcriptional promoter, operator or
enhancer, or an mRNA ribosomal binding site and, optionally,
including at least one selection marker. Nucleotide sequences are
"operably linked" when the regulatory sequence functionally relates
to the nucleic acid of an embodiment herein. "Regulatory sequence"
refers to a nucleic acid sequence that determines the expression
level of the nucleic acid sequences of an embodiment herein and is
capable of regulating the rate of transcription of the nucleic acid
sequence operably linked to the regulatory sequence. Regulatory
sequences comprise promoters, enhancers, transcription factors,
promoter elements and the like.
[0052] "Promoter" refers to a nucleic acid sequence that controls
the expression of a coding sequence by providing a binding site for
RNA polymerase and other factors required for proper transcription
including without limitation transcription factor binding sites,
repressor and activator protein binding sites. The meaning of the
term promoter also includes the term "promoter regulatory
sequence". Promoter regulatory sequences may include upstream and
downstream elements that may influence transcription, RNA
processing or stability of the associated coding nucleic acid
sequence. Promoters include naturally-derived and synthetic
sequences. The coding nucleic acid sequence is usually located
downstream of the promoter with respect to the direction of the
transcription starting at the transcription initiation site.
[0053] The term "constitutive promoter" refers to an unregulated
promoter that allows for continual transcription of the nucleic
acid sequence it is operably linked to.
[0054] As used herein, the term "operably linked" refers to a
linkage of polynucleotide elements in a functional relationship. A
nucleic acid is "operably linked" when it is placed into a
functional relationship with another nucleic acid sequence. For
instance, a promoter, or rather a transcription regulatory
sequence, is operably linked to a coding sequence if it affects the
transcription of the coding sequence. Operably linked means that
the DNA sequences being linked are typically contiguous. The
nucleotide sequence associated with the promoter sequence may be of
homologous or heterologous origin with respect to the cell or
organism, e.g. host cell, plant cell, plant, or microorganism, to
be transformed. The sequence also may be entirely or partially
synthetic. Regardless of the origin, the nucleic acid sequence
associated with the promoter sequence will be expressed or silenced
in accordance with promoter properties to which it is linked. The
associated nucleic acid may code for a protein that is desired to
be expressed or suppressed throughout the organism at all times or,
alternatively, at a specific time or in specific tissues, cells, or
cell compartment. Such nucleotide sequences particularly encode
proteins conferring desirable phenotypic traits to the host cells
or organism altered or transformed therewith. More particularly,
the associated nucleotide sequence leads to the production of a
(+)-manool synthase in the host cell or organism.
[0055] "Target peptide" refers to an amino acid sequence which
targets a protein, or polypeptide to intracellular organelles,
i.e., mitochondria, or plastids, or to the extracellular space
(secretion signal peptide). A nucleic acid sequence encoding a
target peptide may be fused to the nucleic acid sequence encoding
the amino terminal end, e.g., N-terminal end, of the protein or
polypeptide, or may be used to replace a native targeting
polypeptide.
[0056] The term "primer" refers to a short nucleic acid sequence
that is hybridized to a template nucleic acid sequence and is used
for polymerization of a nucleic acid sequence complementary to the
template.
[0057] As used herein, the term "host cell" or "transformed cell"
refers to a cell (or organism) altered to harbor at least one
nucleic acid molecule, for instance, a recombinant gene encoding a
desired protein or nucleic acid sequence which upon transcription
yields a CPP synthase protein and/or a sclareol synthase protein or
which together produce (+)-manool.
[0058] The host cell is particularly a bacterial cell, a fungal
cell or a plant cell. The host cell may contain a recombinant gene
which has been integrated into the nuclear or organelle genomes of
the host cell. Alternatively, the host may contain the recombinant
gene extra-chromosomally. Homologous sequences include orthologous
or paralogous sequences. Methods of identifying orthologs or
paralogs including phylogenetic methods, sequence similarity and
hybridization methods are known in the art and are described
herein.
[0059] Paralogs result from gene duplication that gives rise to two
or more genes with similar sequences and similar functions.
Paralogs typically cluster together and are formed by duplications
of genes within related plant species. Paralogs are found in groups
of similar genes using pair-wise BLAST analysis or during
phylogenetic analysis of gene families using programs such as
CLUSTAL. In paralogs, consensus sequences can be identified that are
characteristic of sequences within related genes having similar
functions.
[0060] Orthologs, or orthologous sequences, are sequences similar
to each other because they are found in species that descended from
a common ancestor. For instance, plant species that have common
ancestors are known to contain many enzymes that have similar
sequences and functions. The skilled artisan can identify
orthologous sequences and predict the functions of the orthologs,
for example, by constructing a phylogenetic tree for a gene family
of one species using, for example, CLUSTAL or BLAST programs.
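As an illustration of how candidate orthologs might be shortlisted from pair-wise similarity searches, the following minimal sketch applies the reciprocal-best-hit heuristic. This heuristic is a commonly used stand-in and is not a method stated in this document; the gene names and scores are hypothetical, and the score tables are assumed to have been extracted beforehand from BLAST-style output.

```python
from typing import Dict, List, Tuple

# Score tables map (query, subject) pairs to a similarity score,
# e.g. a bit score parsed from a pair-wise search (assumed input).
ScoreTable = Dict[Tuple[str, str], float]


def best_hits(scores: ScoreTable) -> Dict[str, str]:
    """Return, for each query, the subject with the highest score."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: ScoreTable, b_vs_a: ScoreTable) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical example data: species A genes searched against species B and back.
A_VS_B = {("A1", "B1"): 900.0, ("A1", "B2"): 300.0, ("A2", "B2"): 500.0, ("A3", "B2"): 450.0}
B_VS_A = {("B1", "A1"): 880.0, ("B2", "A2"): 510.0, ("B2", "A3"): 400.0}

if __name__ == "__main__":
    print(reciprocal_best_hits(A_VS_B, B_VS_A))
```

Pairs returned by such a filter are only candidates; a phylogenetic analysis of the gene family, as described above, would still be needed to support an ortholog assignment.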
[0061] The term "selectable marker" refers to any gene which upon
expression may be used to select a cell or cells that include the
selectable marker. Examples of selectable markers are described
below. The skilled artisan will know that different antibiotic,
fungicide, auxotrophic or herbicide selectable markers are
applicable to different target species.
[0062] The term "organism" refers to any non-human multicellular or
unicellular organism such as a plant, or a microorganism.
Particularly, a microorganism is a bacterium, a yeast, an alga or
a fungus.
[0063] The term "plant" is used interchangeably to include plant
cells including plant protoplasts, plant tissues, plant cell tissue
cultures giving rise to regenerated plants, or parts of plants, or
plant organs such as roots, stems, leaves, buds, flowers, petioles,
petals, pollen, ovules, embryos, tubers, fruits, seed, progeny
thereof and the like. Any plant can be used to carry out the
methods of an embodiment herein.
Particular Embodiments
[0064] In one embodiment provided herein is a method for
transforming a host cell or non-human organism comprising
transforming a host cell or non-human organism with a nucleic acid
encoding a polypeptide having a copalyl diphosphate synthase
activity and with a nucleic acid encoding a polypeptide having a
sclareol synthase activity, wherein the polypeptide having the
copalyl diphosphate synthase activity comprises
[0065] a) an amino acid sequence having at least 50%, 55%, 60%,
65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence
identity to a polypeptide selected from the group consisting of SEQ
ID NO: 14 and SEQ ID NO: 15; or
[0066] b) an amino acid sequence having at least 71%, 72%, 75%,
80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 17 and
SEQ ID NO: 18; or
[0067] c) an amino acid sequence having at least 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 20 and
SEQ ID NO: 21.
[0068] In one embodiment, the polypeptide having the sclareol
synthase activity comprises an amino acid sequence having at least
90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide
selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5,
SEQ ID NO: 23, and SEQ ID NO: 25.
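To make the percent-identity thresholds above concrete, the following minimal sketch computes identity between two sequences that have already been aligned. The convention used here (identical residues divided by the total number of alignment columns, with gap columns counted as non-identical) is an assumption chosen for illustration; it is not taken from this document, and other conventions give different values. The example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: identical, non-gap columns divided by the
    total number of alignment columns, times 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_minimum(identity: float, minimum: float) -> bool:
    """True if an identity value satisfies an 'at least X%' condition."""
    return identity >= minimum


if __name__ == "__main__":
    # Hypothetical six-column alignment with one gap and one substitution.
    value = percent_identity("MKT-LV", "MKTALI")
    print(round(value, 1), meets_minimum(value, 50.0), meets_minimum(value, 71.0))
```

In practice the alignment itself would come from a dedicated alignment program, and the identity definition used for a given comparison should be the one given elsewhere in the specification.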
[0069] In one embodiment provided herein is a method comprising
cultivating a non-human host organism or cell capable of producing
a geranylgeranyl diphosphate (GGPP) and transformed to express a
polypeptide having a copalyl diphosphate synthase activity wherein
the polypeptide having the copalyl diphosphate synthase activity
comprises
[0070] a) an amino acid sequence having at least 50%, 55%, 60%,
65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence
identity to a polypeptide selected from the group consisting of SEQ
ID NO: 14 and SEQ ID NO: 15; or
[0071] b) an amino acid sequence having at least 71%, 72%, 75%,
80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 17 and
SEQ ID NO: 18; or
[0072] c) an amino acid sequence having at least 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 20 and
SEQ ID NO: 21; and
further transformed to express a polypeptide having a sclareol
synthase activity.
[0073] Particularly, the polypeptide having the sclareol synthase
activity comprises an amino acid sequence having at least 90%, 95%,
98%, 99% or 100% sequence identity to a polypeptide selected from
the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23,
and SEQ ID NO: 25.
[0074] Further provided herein is an expression vector comprising a
nucleic acid encoding a CPP synthase wherein the CPP synthase
comprises a polypeptide comprising
[0075] a) an amino acid sequence having at least 50%, 55%, 60%,
65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
[0076] b) an amino acid sequence having at least 71%, 72%, 75%,
80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID
NO: 17 or SEQ ID NO: 18; or
[0077] c) an amino acid sequence having at least 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ
ID NO: 20 or SEQ ID NO: 21; and further the expression vector
comprises a nucleic acid encoding a sclareol synthase enzyme.
[0078] Particularly, the sclareol synthase comprises an amino acid
sequence having at least 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID
NO: 25. In a particular embodiment, the two enzymes, i.e. the CPP
synthase and the sclareol synthase, could be on two different
vectors or plasmids transformed in the same cell. In a further
embodiment, these two enzymes could be on two different vectors or
plasmids transformed in two different cells.
[0079] Further provided herein is a non-human host organism or cell
comprising or transformed to harbor at least one nucleic acid
encoding a CPP synthase wherein the CPP synthase comprises
[0080] a) a polypeptide comprising an amino acid sequence having at
least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or
100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
[0081] b) an amino acid sequence having at least 71%, 72%, 75%,
80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID
NO: 17 or SEQ ID NO: 18; or
[0082] c) an amino acid sequence having at least 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ
ID NO: 20 or SEQ ID NO: 21; and at least one nucleic acid encoding
a sclareol synthase enzyme.
[0083] Particularly, the sclareol synthase comprises an amino acid
sequence having at least 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID
NO: 25.
[0084] Further provided herein is a non-human host organism or cell
comprising or transformed to harbor at least one nucleic acid
encoding a CPP synthase wherein the CPP synthase comprises an amino
acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence
identity to a polypeptide selected from the group consisting of SEQ
ID NO: 1 and SEQ ID NO: 2; and
at least one nucleic acid encoding a sclareol synthase enzyme wherein
the sclareol synthase comprises an amino acid sequence having at least
90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide
selected from the group consisting of SEQ ID NO: 23 and SEQ ID NO: 25.
[0085] In one embodiment, the nucleic acid that encodes for a CPP
synthase provided herein comprises a nucleotide sequence that has
at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO:
3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ
ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID
NO: 32.
[0086] In one embodiment, the nucleic acid that encodes for a CPP
synthase provided herein comprises a nucleotide sequence having at
least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ
ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO:
26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO:
32.
[0087] In one embodiment, the nucleic acid that encodes for a CPP
synthase provided herein comprises a nucleotide sequence having at
least 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID
NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26,
SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
[0088] In one embodiment, the nucleic acid that encodes for a CPP
synthase provided herein comprises a nucleotide sequence having 99%
or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID
NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29,
SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
[0089] In one embodiment, the nucleic acid that encodes for a CPP
synthase provided herein comprises the nucleotide sequence as set
forth in SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19,
SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID
NO: 31, or SEQ ID NO: 32.
[0090] In one embodiment, the CPP synthase comprises a polypeptide
comprising
[0091] a) an amino acid sequence having at least 50%, 55%, 60%,
65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence
identity to a polypeptide selected from the group consisting of SEQ
ID NO: 14 and SEQ ID NO: 15; or
[0092] b) an amino acid sequence having at least 71%, 72%, 75%,
80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 17 and
SEQ ID NO: 18; or
[0093] c) an amino acid sequence having at least 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a
polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ
ID NO: 21.
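The three alternatives a) to c) above use different minimum identities depending on the reference sequence. The following minimal sketch records the broadest threshold of each alternative in a lookup table and reports which reference sequences a candidate polypeptide would satisfy. The function name and the input identities are hypothetical; the identities are assumed to have been computed beforehand, and the sketch does not address the additional requirement that the polypeptide have CPP synthase activity.

```python
from typing import Dict, List

# Broadest "at least" threshold listed for each reference SEQ ID NO
# in alternatives a), b) and c) above.
MIN_IDENTITY_BY_SEQ_ID: Dict[int, float] = {
    14: 50.0, 15: 50.0,   # alternative a)
    17: 71.0, 18: 71.0,   # alternative b)
    20: 90.0, 21: 90.0,   # alternative c)
}


def satisfied_references(identities: Dict[int, float]) -> List[int]:
    """Return the reference SEQ ID NOs whose minimum identity is met.

    `identities` maps a reference SEQ ID NO to the percent identity
    measured between the candidate and that reference (assumed input).
    """
    return sorted(
        seq_id
        for seq_id, pct in identities.items()
        if seq_id in MIN_IDENTITY_BY_SEQ_ID and pct >= MIN_IDENTITY_BY_SEQ_ID[seq_id]
    )


if __name__ == "__main__":
    # Hypothetical candidate: 62% to SEQ ID NO: 14, 70.5% to 17, 93% to 20.
    print(satisfied_references({14: 62.0, 17: 70.5, 20: 93.0}))
```

The narrower embodiments that follow could be represented the same way by substituting a higher minimum for the relevant reference sequence.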
[0094] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 55%, 60%, 65%,
70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% sequence identity to SEQ
ID NO: 14.
[0095] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 60%, 65%, 70%,
75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ
ID NO: 14.
[0096] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 70%, 75%, 80%,
85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO:
14.
[0097] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 90%, 95%, 98%,
99% or 100% sequence identity to SEQ ID NO: 14.
[0098] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 98%, 99% or 100%
sequence identity to SEQ ID NO: 14.
[0099] In one embodiment, the CPP synthase comprises a polypeptide
comprising the amino acid sequence as set forth in SEQ ID NO:
14.
[0100] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 55%, 60%, 65%,
70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to
SEQ ID NO: 15.
[0101] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 60%, 65%, 70%,
75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ
ID NO: 15.
[0102] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 70%, 75%, 80%,
85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO:
15.
[0103] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 90%, 95%, 98%,
99% or 100% sequence identity to SEQ ID NO: 15.
[0104] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 98%, 99% or 100%
sequence identity to SEQ ID NO: 15.
[0105] In one embodiment, the CPP synthase comprises a polypeptide
comprising the amino acid sequence as set forth in SEQ ID NO:
15.
[0106] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 75%, 80%, 85%,
90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
[0107] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 80%, 85%, 90%,
95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
[0108] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 90%, 95%, 98%,
99% or 100% sequence identity to SEQ ID NO: 17.
[0109] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 98%, 99% or 100%
sequence identity to SEQ ID NO: 17.
[0110] In one embodiment, the CPP synthase comprises a polypeptide
comprising the amino acid sequence as set forth in SEQ ID NO:
17.
[0111] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 75%, 80%, 85%,
90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
[0112] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 80%, 85%, 90%,
95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
[0113] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 90%, 95%, 98%,
99% or 100% sequence identity to SEQ ID NO: 18.
[0114] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 98%, 99% or 100%
sequence identity to SEQ ID NO: 18.
[0115] In one embodiment, the CPP synthase comprises a polypeptide
comprising the amino acid sequence as set forth in SEQ ID NO:
18.
[0116] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 91%, 92%, 93%,
94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID
NO: 20.
[0117] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 95%, 96%, 97%,
98%, 99% or 100% sequence identity to SEQ ID NO: 20.
[0118] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 98%, 99% or 100%
sequence identity to SEQ ID NO: 20.
[0119] In one embodiment, the CPP synthase comprises a polypeptide
comprising the amino acid sequence as set forth in SEQ ID NO:
20.
[0120] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 91%, 92%, 93%,
94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID
NO: 21.
[0121] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 95%, 96%, 97%,
98%, 99% or 100% sequence identity to SEQ ID NO: 21.
[0122] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 98%, 99% or 100%
sequence identity to SEQ ID NO: 21.
[0123] In one embodiment, the CPP synthase comprises a polypeptide
comprising an amino acid sequence having at least 99% sequence
identity to SEQ ID NO: 21.
[0124] In one embodiment, the CPP synthase comprises a polypeptide
comprising the amino acid sequence as set forth in SEQ ID NO:
21.
[0125] In one embodiment, the nucleic acid encoding the sclareol
synthase enzyme comprises a nucleotide sequence having at least
90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 6, SEQ
ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
[0126] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 4.
[0127] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 4.
[0128] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 98%, 99% or 100% sequence identity to
SEQ ID NO: 4.
[0129] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 99% sequence identity to SEQ ID NO:
4.
[0130] In one embodiment, the sclareol synthase comprises the amino
acid sequence as set forth in SEQ ID NO: 4.
[0131] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 5.
[0132] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 5.
[0133] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 98%, 99% or 100% sequence identity to
SEQ ID NO: 5.
[0134] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 99% sequence identity to SEQ ID NO:
5.
[0135] In one embodiment, the sclareol synthase comprises the amino
acid sequence as set forth in SEQ ID NO: 5.
[0136] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 23.
[0137] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 23.
[0138] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 98%, 99% or 100% sequence identity to
SEQ ID NO: 23.
[0139] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 99% sequence identity to SEQ ID NO:
23.
[0140] In one embodiment, the sclareol synthase comprises the amino
acid sequence as set forth in SEQ ID NO: 23.
[0141] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 25.
[0142] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 25.
[0143] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 98%, 99% or 100% sequence identity to
SEQ ID NO: 25.
[0144] In one embodiment, the sclareol synthase comprises an amino
acid sequence having at least 99% sequence identity to SEQ ID NO:
25.
[0145] In one embodiment, the sclareol synthase comprises the amino
acid sequence as set forth in SEQ ID NO: 25.
[0146] In another embodiment, provided herein is an expression
vector comprising at least one of the nucleic acids described
herein.
[0147] In another embodiment, provided herein is a non-human host
organism or cell that comprises one or more expression vectors
comprising a nucleic acid encoding a CPP synthase as described
herein and a nucleic acid encoding a sclareol synthase as described
herein.
[0148] In another embodiment, provided herein is a non-human host
organism or cell comprising or transformed to harbor at least one
nucleic acid described herein so that it heterologously expresses
or over-expresses at least one polypeptide described herein.
[0149] In an embodiment, the present invention provides a
transformed cell or organism, in which the polypeptides are
expressed in higher quantity than in the same cell or organism not
so transformed.
[0150] There are several methods known in the art for the creation
of transgenic host organisms or cells such as plants, fungi,
prokaryotes, or cultures of higher eukaryotic cells. Appropriate
cloning and expression vectors for use with bacterial, fungal,
yeast, plant and mammalian cellular hosts are described, for
example, in Pouwels et al., Cloning Vectors: A Laboratory Manual,
1985, Elsevier, New York and Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd edition, 1989, Cold Spring Harbor
Laboratory Press. Cloning and expression vectors for higher plants
and/or plant cells in particular are available to the skilled
person. See for example Schardl et al., Gene, 1987, 61:1-11.
[0151] Methods for transforming host organisms or cells to harbor
transgenic nucleic acids are familiar to the skilled person. For
the creation of transgenic plants, for example, current methods
include: electroporation of plant protoplasts, liposome-mediated
transformation, Agrobacterium-mediated transformation,
polyethylene-glycol-mediated transformation, particle bombardment,
microinjection of plant cells, and transformation using
viruses.
[0152] In one embodiment, transformed DNA is integrated into a
chromosome of a non-human host organism and/or cell such that a
stable recombinant system results. Any chromosomal integration
method known in the art may be used in the practice of the
invention, including but not limited to recombinase-mediated
cassette exchange (RMCE), viral site-specific chromosomal
insertion, adenovirus and pronuclear injection.
[0153] In one embodiment for carrying out the method for producing
(+)-manool, herein provided is a method of making at least one
polypeptide having a CPP synthase activity and at least one
polypeptide having a sclareol synthase activity as described in any
embodiment of the invention.
[0154] One embodiment provides a method for producing manool
comprising
[0155] a) contacting geranylgeranyl diphosphate (GGPP) with a
copalyl diphosphate (CPP) synthase as described herein to form a
copalyl diphosphate; and
[0156] b) contacting the CPP with a sclareol synthase as described
herein to form (+)-manool;
wherein step a) comprises culturing a non-human host organism or
host cell capable of producing GGPP and transformed with one or more
nucleic acids as described herein or with one or more expression
vectors as described herein, so that the non-human host organism or
host cell harbors a nucleic acid encoding a polypeptide having CPP
synthase activity as described herein and a nucleic acid encoding a
polypeptide having a sclareol synthase activity as described herein
and expresses or over-expresses the polypeptides.
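The two-step method above can be sketched as simple bookkeeping over substrates and products. The following minimal sketch is only a toy model: it tracks which compound is expected given the enzymes expressed in a host that produces GGPP, and it says nothing about kinetics, yields or side products. All names in the code are hypothetical.

```python
from enum import Enum
from typing import Dict, Iterable, Tuple


class Metabolite(Enum):
    GGPP = "geranylgeranyl diphosphate"
    CPP = "copalyl diphosphate"
    MANOOL = "(+)-manool"


# Each enzyme converts one substrate into one product, as in steps a) and b).
REACTIONS: Dict[str, Tuple[Metabolite, Metabolite]] = {
    "CPP synthase": (Metabolite.GGPP, Metabolite.CPP),
    "sclareol synthase": (Metabolite.CPP, Metabolite.MANOOL),
}


def final_product(start: Metabolite, expressed: Iterable[str]) -> Metabolite:
    """Follow the reactions of the expressed enzymes until none applies."""
    enzymes = list(expressed)
    current = start
    progressed = True
    while progressed:
        progressed = False
        for enzyme in enzymes:
            substrate, product = REACTIONS[enzyme]
            if substrate is current:
                current = product
                progressed = True
    return current


if __name__ == "__main__":
    both = ["CPP synthase", "sclareol synthase"]
    print(final_product(Metabolite.GGPP, both).value)
    print(final_product(Metabolite.GGPP, ["CPP synthase"]).value)
```

The model makes the dependency explicit: in this sketch, a host expressing only the sclareol synthase leaves GGPP unconverted, and a host expressing only the CPP synthase stops at the copalyl diphosphate.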
[0157] One embodiment provides the above method for producing
manool further comprising, prior to step a), transforming a
non-human host organism or host cell capable of producing GGPP with
[0158] a) at least one nucleic acid encoding a polypeptide
comprising
[0159] i. an amino acid sequence having at least 50%, 55%, 60%,
65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence
identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
[0160] ii. an amino acid sequence having at least 71%, 72%, 75%,
80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID
NO: 17 or SEQ ID NO: 18; or
[0161] iii. an amino acid sequence having at least 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ
ID NO: 20 or SEQ ID NO: 21; and
[0162] having a CPP synthase activity, so that said organism or cell
expresses said polypeptide having a CPP synthase activity; and
[0163] b) at least one nucleic acid encoding a polypeptide having a
sclareol synthase activity as described herein, so that said
organism or cell expresses said polypeptide having a sclareol
synthase activity.
[0164] In one embodiment, the non-human host organism or host cell
capable of producing GGPP comprises
[0165] a) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 15 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 5; or
[0166] b) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 18 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 5; or
[0167] c) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 21 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 5; or
[0168] d) a nucleic acid comprising SEQ ID NO: 16 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 6 which
encodes for a sclareol synthase; or
[0169] e) a nucleic acid comprising SEQ ID NO: 19 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 6 which
encodes for a sclareol synthase; or
[0170] f) a nucleic acid comprising SEQ ID NO: 22 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 6 which
encodes for a sclareol synthase; or
[0171] g) a nucleic acid comprising SEQ ID NO: 26 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which
encodes for a sclareol synthase; or
[0172] h) a nucleic acid comprising SEQ ID NO: 29 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which
encodes for a sclareol synthase; or
[0173] i) a nucleic acid comprising SEQ ID NO: 30 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which
encodes for a sclareol synthase; or
[0174] j) a nucleic acid comprising SEQ ID NO: 31 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which
encodes for a sclareol synthase; or
[0175] k) a nucleic acid comprising SEQ ID NO: 32 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which
encodes for a sclareol synthase; or
[0176] l) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 2 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 23; or
[0177] m) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 15 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 23; or
[0178] n) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 18 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 23; or
[0179] o) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 21 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 23; or
[0180] p) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 2 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 25; or
[0181] q) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 15 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 25; or
[0182] r) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 18 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 25; or
[0183] s) a nucleic acid encoding a CPP synthase comprising SEQ ID
NO: 21 and a nucleic acid encoding a sclareol synthase comprising
SEQ ID NO: 25; or
[0184] t) a nucleic acid comprising SEQ ID NO: 16 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 24 which
encodes for a sclareol synthase; or
[0185] u) a nucleic acid comprising SEQ ID NO: 19 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 24 which
encodes for a sclareol synthase; or
[0186] v) a nucleic acid comprising SEQ ID NO: 22 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 24 which
encodes for a sclareol synthase; or
[0187] w) a nucleic acid comprising SEQ ID NO: 26 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which
encodes for a sclareol synthase; or
[0188] x) a nucleic acid comprising SEQ ID NO: 26 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which
encodes for a sclareol synthase; or
[0189] y) a nucleic acid comprising SEQ ID NO: 29 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which
encodes for a sclareol synthase; or
[0190] z) a nucleic acid comprising SEQ ID NO: 29 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which
encodes for a sclareol synthase; or
[0191] aa) a nucleic acid comprising SEQ ID NO: 30 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which
encodes for a sclareol synthase; or
[0192] bb) a nucleic acid comprising SEQ ID NO: 30 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which
encodes for a sclareol synthase; or
[0193] cc) a nucleic acid comprising SEQ ID NO: 31 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which
encodes for a sclareol synthase; or
[0194] dd) a nucleic acid comprising SEQ ID NO: 31 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which
encodes for a sclareol synthase; or
[0195] ee) a nucleic acid comprising SEQ ID NO: 32 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which
encodes for a sclareol synthase; or
[0196] ff) a nucleic acid comprising SEQ ID NO: 32 which encodes for
a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which
encodes for a sclareol synthase;
wherein the above combinations of nucleic acid sequences and/or
synthases also comprise the variants and various percent identities
to the SEQ ID NO enumerated as described herein.
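The combinations a) to ff) above follow a regular pattern: each is one CPP synthase sequence paired with one sclareol synthase sequence, grouped by whether the SEQ ID NO designates the encoded polypeptide or the nucleic acid itself. The following minimal sketch regenerates the list from that grouping as a cross-check. The grouping and the helper names are an editorial reading of the list for illustration and carry no meaning beyond it.

```python
from itertools import product
from typing import List, Tuple

# (what the SEQ ID NOs designate, CPP synthase SEQ ID NOs, sclareol synthase SEQ ID NOs)
GROUPS = [
    ("polypeptide", (15, 18, 21), (5,)),                # a) to c)
    ("nucleic acid", (16, 19, 22), (6,)),               # d) to f)
    ("nucleic acid", (26, 29, 30, 31, 32), (27,)),      # g) to k)
    ("polypeptide", (2, 15, 18, 21), (23,)),            # l) to o)
    ("polypeptide", (2, 15, 18, 21), (25,)),            # p) to s)
    ("nucleic acid", (16, 19, 22), (24,)),              # t) to v)
    ("nucleic acid", (26, 29, 30, 31, 32), (33, 34)),   # w) to ff)
]


def enumerate_combinations() -> List[Tuple[str, int, int]]:
    """List every (kind, CPP SEQ ID NO, sclareol SEQ ID NO) in document order."""
    return [
        (kind, cpp, scl)
        for kind, cpps, scls in GROUPS
        for cpp, scl in product(cpps, scls)
    ]


def item_label(index: int) -> str:
    """Label for a zero-based position: a..z, then aa, bb, and so on."""
    letter = chr(ord("a") + index % 26)
    return letter * (index // 26 + 1)


if __name__ == "__main__":
    for i, (kind, cpp, scl) in enumerate(enumerate_combinations()):
        print(f"{item_label(i)}) {kind}: CPP synthase SEQ ID NO: {cpp}, "
              f"sclareol synthase SEQ ID NO: {scl}")
```

Generating the list this way gives 32 entries, which matches the 32 lettered items above.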
[0197] In one embodiment, the non-human host organism provided
herein is a plant, a prokaryote or a fungus.
[0198] In one embodiment, the non-human host provided herein is a
microorganism, particularly bacteria or yeast.
[0199] In one embodiment, the bacterium provided herein is
Escherichia coli and the yeast is Saccharomyces cerevisiae.
[0200] In one embodiment, the non-human organism provided herein is
Saccharomyces cerevisiae.
[0201] In one embodiment, the cell is a prokaryotic cell.
[0202] In another embodiment, the cell is a bacterial cell.
[0203] In one embodiment, the cell is a eukaryotic cell.
[0204] In one embodiment, the eukaryotic cell is a yeast cell or a
plant cell.
[0205] In one embodiment, the manool can be produced by culturing
the transformed bacteria or yeast described herein, including
through fermentation, for example as described in Paddon et al.,
Nature, 2013, 496:528-532.
[0206] In one embodiment, the process of producing (+)-manool
produces the (+)-manool at a purity of at least 98.5%.
[0207] In another embodiment, a method provided herein further
comprises processing the (+)-manool to a derivative using a
chemical or biochemical synthesis or a combination of both using
methods commonly known in the art.
[0208] In one embodiment, the (+)-manool derivative is selected
from the group consisting of a hydrocarbon, an alcohol, acetal,
aldehyde, acid, ether, ketone, lactone, acetate and an ester.
[0209] According to any embodiment of the invention, said
(+)-manool derivative is a C10 to C25 compound optionally
comprising one, two or three oxygen atoms.
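As an illustration of the carbon and oxygen limits stated above, the following minimal sketch counts atoms in a simple molecular formula and checks that the carbon count lies between 10 and 25 and that there are at most three oxygen atoms. The parser handles only flat formulas without brackets or charges, and the example formulas (C20H34O for manool, C16H28O for Ambrox and C16H26O2 for sclareolide) come from general chemical knowledge rather than from this document.

```python
import re
from typing import Dict

# One element symbol followed by an optional count, e.g. "C20", "H34", "O".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def element_counts(formula: str) -> Dict[str, int]:
    """Count atoms per element in a flat molecular formula such as 'C20H34O'."""
    counts: Dict[str, int] = {}
    for element, number in _TOKEN.findall(formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def in_derivative_range(formula: str) -> bool:
    """True for 10 to 25 carbon atoms and zero to three oxygen atoms."""
    counts = element_counts(formula)
    carbons = counts.get("C", 0)
    oxygens = counts.get("O", 0)
    return 10 <= carbons <= 25 and oxygens <= 3


if __name__ == "__main__":
    for name, formula in [("manool", "C20H34O"), ("Ambrox", "C16H28O"),
                          ("sclareolide", "C16H26O2"), ("isoprene", "C5H8")]:
        print(name, formula, in_derivative_range(formula))
```

Oxygen is treated as optional here because the wording above is "optionally comprising" and the list of derivative classes in this section includes a hydrocarbon.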
[0210] In a further embodiment, the derivative is selected from the
group consisting of manool acetate
((3R)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-1-penten-3-yl acetate),
copalol
((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-penten-1-ol),
copalol acetate
((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-penten-1-yl acetate),
copalal
((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-pentenal),
(+)-manooloxy
(4-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-butanone),
Z-11
((3S,5aR,7aS,11aS,11bR)-3,8,8,11a-tetramethyldodecahydro-3,5a-epoxynaphtho[2,1-c]oxepin),
gamma-ambrol
(2-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]ethanol)
and Ambrox®
((3aR,5aS,9aS,9bR)-3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan).
[0211] In another embodiment, a method provided herein further
comprises contacting the (+)-manool with a suitable reacting system
to convert said (+)-manool into a suitable (+)-manool derivative.
Said suitable reacting system can be of enzymatic nature (e.g.
requiring one or more enzymes) or of chemical nature (e.g.
requiring one or more synthetic chemicals).
[0212] For example, (+)-manool may be enzymatically converted to
manooloxy or gamma-ambrol using a process described in the
literature, for example as set forth in U.S. Pat. No. 7,294,492,
wherein said patent is hereby incorporated by reference in its
entirety herein.
[0213] In yet another embodiment, the (+)-manool derivative is
copalol and its esters with a C.sub.1-C.sub.5 carboxylic acids.
[0214] In yet another embodiment, the (+)-manool derivative is a
(+)-manool ester with a C.sub.1-C.sub.5 carboxylic acid.
[0215] In one embodiment, the (+)-manool derivative is copalal.
[0216] In one embodiment, the (+)-manool derivative is
manooloxy.
[0217] In yet another embodiment, the (+)-manool derivative is
Z-11.
[0218] In one embodiment, the (+)-manool derivative is an ambrol or
is a mixture thereof and its esters with C.sub.1-C.sub.5
carboxylic acids, and in particular gamma-ambrol and its
esters.
[0219] In a further embodiment, the (+)-manool derivative is
Ambrox.RTM., sclareolide (also known as
3a,6,6,9a-tetramethyldecahydronaphtho[2,1-b]furan-2(1H)-one and all
its diastereoisomers and stereoisomers),
3,4a,7,7,10a-pentamethyldodecahydro-1H-benzo[f]chromen-3-ol or
3,4a,7,7,10a-pentamethyl-4a,5,6,6a,7,8,9,10,10a,10b-decahydro-1H-benzo[f]chromene
and all their diastereoisomers and stereoisomers cyclic
ketone and open form,
(1R,2R,4aS,8aS)-1-(2-hydroxyethyl)-2,5,5,8a-tetramethyldecahydronaphthale-
n-2-ol DOL, gamma-ambrol.
[0220] Specific examples of how said derivatives (e.g. a triene
hydrocarbon, an acetate or copalol) can be obtained are detailed in
the Examples.
[0221] For instance, the manool obtained according to the invention
can be processed into Manooloxy (a ketone, as per known methods)
and then into ambrol (an alcohol) and ambrox (an ether), according
to EP 212254.
[0222] The ability of a polypeptide to catalyze the synthesis of a
particular diterpene can be confirmed by performing the enzyme
assay as detailed in the Examples provided herein.
[0223] Polypeptides are also meant to include truncated
polypeptides provided that they keep their (+)-manool synthase
activity and their sclareol synthase activity.
[0224] As intended herein below, a nucleotide sequence may be obtained by
modifying the sequences described herein using any
method known in the art, for example by introducing any type of
mutation, such as deletion, insertion or substitution mutations.
Examples of such methods are cited in the part of the description
relative to the variant polypeptides and the methods to prepare
them.
[0225] The percentage of identity between two peptide or nucleotide
sequences is a function of the number of amino acids or nucleotide
residues that are identical in the two sequences when an alignment
of these two sequences has been generated. Identical residues are
defined as residues that are the same in the two sequences in a
given position of the alignment. The percentage of sequence
identity, as used herein, is calculated from the optimal alignment
by taking the number of residues identical between two sequences,
dividing it by the total number of residues in the shortest
sequence and multiplying by 100. The optimal alignment is the
alignment in which the percentage of identity is the highest
possible. Gaps may be introduced into one or both sequences in one
or more positions of the alignment to obtain the optimal alignment.
These gaps are then taken into account as non-identical residues
for the calculation of the percentage of sequence identity.
Alignment for the purpose of determining the percentage of amino
acid or nucleic acid sequence identity can be achieved in various
ways using computer programs and for instance publicly available
computer programs available on the world wide web. Preferably, the
BLAST program (Tatusova and Madden, FEMS Microbiol Lett., 1999,
174:247-250) set to the default parameters, available from the
National Center for Biotechnology Information (NCBI) at
http://www.ncbi.nlm.nih.gov/BLAST/bl2seq/wblast2.cgi, can be used
to obtain an optimal alignment of protein or nucleic acid sequences
and to calculate the percentage of sequence identity.
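The identity calculation defined above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (for example by BLAST) and that gaps are written as "-"; the function name and the gap symbol are illustrative choices, not part of the disclosure, and the sketch does not itself search for the optimal alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Identical residues are counted position by position and divided by the
    number of residues in the shorter (ungapped) sequence, then multiplied
    by 100. Gap positions never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Identical residues: same non-gap character at the same position.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Denominator: residue count (gaps excluded) of the shortest sequence.
    len_a = len(aligned_a) - aligned_a.count(gap)
    len_b = len(aligned_b) - aligned_b.count(gap)
    shortest = min(len_a, len_b)
    if shortest == 0:
        raise ValueError("each sequence must contain at least one residue")

    return 100.0 * identical / shortest
```

For instance, calling `percent_identity("ACGT-A", "ACCTTA")` compares a 5-residue sequence with a 6-residue sequence over a 6-column alignment in which four columns hold identical residues.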
[0226] The polypeptide to be contacted with GGPP in vitro can be
obtained by extraction from any organism expressing it, using
standard protein or enzyme extraction technologies. If the host
organism is a unicellular organism or cell releasing the
polypeptide of an embodiment herein into the culture medium, the
polypeptide may simply be collected from the culture medium, for
example by centrifugation, optionally followed by washing steps and
re-suspension in suitable buffer solutions. In another embodiment,
the GGPP may be contacted with the polypeptide in the culture
medium where the polypeptide may be released from the host
organism, unicellular organism or cell. If the organism or cell
accumulates the polypeptide within its cells, the polypeptide may
be obtained by disruption or lysis of the cells. The GGPP may be
contacted with the polypeptide upon further extraction of the
polypeptide from the cell lysate or through contact with the cell
lysate without necessarily conducting such an extraction.
[0227] According to another particular embodiment, the method of
any of the above-described embodiments is carried out in vivo.
These embodiments provided herein are particularly advantageous
since it is possible to carry out the method in vivo without
previously isolating the polypeptide. The reaction occurs directly
within the organism or cell transformed to express said
polypeptide.
[0228] The organism or cell is meant to "express" a polypeptide,
provided that the organism or cell is transformed to harbor a
nucleic acid encoding said polypeptide, this nucleic acid is
transcribed to mRNA and the polypeptide is found in the host
organism or cell. The term "express" encompasses "heterologously
express" and "over-express", the latter referring to levels of
mRNA, polypeptide and/or enzyme activity over and above what is
measured in a non-transformed organism or cell. A more detailed
description of suitable methods to transform a non-human host
organism or cell will be described later on in the part of the
specification that is dedicated to such transformed non-human host
organisms or cells.
[0229] A particular organism or cell is meant to be "capable of
producing GGPP" when it produces GGPP naturally or when it does not
produce GGPP naturally but is transformed to produce GGPP, either
prior to the transformation with a nucleic acid as described herein
or together with said nucleic acid. Organisms or cells transformed
to produce a higher amount of GGPP than the naturally occurring
organism or cell are also encompassed by the "organisms or cells
capable of producing GGPP". Several methods to transform organisms,
for example microorganisms, so that they produce GGPP are known,
for example in Schalk et al., J. Am. Chem. Soc., 2013,
134:18900-18903.
[0230] Non-human host organisms suitable to carry out the method of
an embodiment herein in vivo may be any non-human multicellular or
unicellular organisms. In a particular embodiment, the non-human
host organism used to carry out an embodiment herein in vivo is a
plant, a prokaryote or a fungus. Any plant, prokaryote or fungus
can be used. Particularly useful plants are those that naturally
produce high amounts of terpenes. In a more particular embodiment
the non-human host organism used to carry out the method of an
embodiment herein in vivo is a microorganism. Any microorganism can
be used but according to an even more particular embodiment said
microorganism is a bacterium or a yeast. Most particularly, said
bacterium is E. coli and said yeast is Saccharomyces
cerevisiae.
[0231] Some of these organisms do not produce GGPP naturally or
only in small amounts. To be suitable to carry out the method of an
embodiment herein, these organisms have to be transformed to
produce said precursor or engineered to produce said precursor in
larger amounts. They can be so transformed either before the
modification with the nucleic acid described according to any of
the above embodiments or simultaneously, as explained above.
[0232] In one embodiment, the non-human host organism or cell
capable of producing GGPP is transformed with a nucleic acid
encoding a CPP synthase or variant thereof as described herein and
a nucleic acid encoding a sclareol synthase or variant thereof as
described herein, wherein the non-human host organism or cell
capable of producing GGPP has been engineered to over-express a
GGPP synthase or transformed with a nucleic acid encoding a GGPP
synthase.
[0233] In one embodiment, the non-human host organism or cell
comprises a nucleic acid encoding a GGPP synthase, a nucleic acid
encoding a CPP synthase or variant thereof as described herein, and
a nucleic acid encoding a sclareol synthase or variant thereof as
described herein, wherein at least one of said nucleic acids is
heterologous to the non-human host organism or cell.
[0234] Isolated higher eukaryotic cells can also be used, instead
of complete organisms, as hosts to carry out the method of an
embodiment herein in vivo. Suitable eukaryotic cells may be any
non-human cell, but are particularly plant or fungal cells.
[0235] According to another embodiment, the polypeptides having a
CPP synthase activity used in any of the embodiments described
herein or encoded by the nucleic acids described herein may be
variants obtained by genetic engineering, provided that said
variant keeps its CPP synthase activity.
[0236] According to another embodiment, the polypeptides having a
sclareol synthase activity used in any of the embodiments described
herein or encoded by the nucleic acids described herein may be
variants obtained by genetic engineering, provided that said
variant keeps its sclareol synthase activity or has manool synthase
activity.
[0237] As used herein, the polypeptide is intended as a polypeptide
or peptide fragment that encompasses the amino acid sequences
identified herein, as well as truncated or variant polypeptides,
provided that they keep their CPP synthase activity and their
sclareol synthase activity and/or manool synthase activity.
[0238] Examples of variant polypeptides are naturally occurring
proteins that result from alternate mRNA splicing events or from
proteolytic cleavage of the polypeptides described herein.
Variations attributable to proteolysis include, for example,
differences in the N- or C-termini upon expression in different
types of host cells, due to proteolytic removal of one or more
terminal amino acids from the polypeptides of an embodiment herein.
Polypeptides encoded by a nucleic acid obtained by natural or
artificial mutation of a nucleic acid of an embodiment herein, as
described thereafter, are also encompassed by an embodiment
herein.
[0239] Polypeptide variants resulting from a fusion of additional
peptide sequences at the amino and carboxyl terminal ends can also
be used in the methods of an embodiment herein. In particular such
a fusion can enhance expression of the polypeptides, be useful in
the purification of the protein or improve the enzymatic activity
of the polypeptide in a desired environment or expression system.
Such additional peptide sequences may be signal peptides, for
example. Accordingly, encompassed herein are methods using variant
polypeptides, such as those obtained by fusion with other oligo- or
polypeptides and/or those which are linked to signal peptides.
Polypeptides resulting from a fusion with another functional
protein, such as another protein from the terpene biosynthesis
pathway, can also advantageously be used in the methods of an
embodiment herein.
[0240] A variant may also differ from the polypeptide of an
embodiment herein by attachment of modifying groups which are
covalently or non-covalently linked to the polypeptide
backbone.
[0241] The variant also includes a polypeptide which differs from
the polypeptide described herein by introduced N-linked or O-linked
glycosylation sites, and/or an addition of cysteine residues. The
skilled artisan will recognize how to modify an amino acid sequence
and preserve biological activity.
[0242] Therefore, in an embodiment, the present invention provides
a method for preparing a variant polypeptide having a CPP synthase
activity or a sclareol synthase activity or a manool synthase
activity, as described in any of the above embodiments, and
comprising the steps of:
[0243] (a) selecting a nucleic acid according to any of the
embodiments described above;
[0244] (b) modifying the selected nucleic acid to obtain at least
one mutant nucleic acid;
[0245] (c) transforming host cells or unicellular organisms with
the mutant nucleic acid sequence to express a polypeptide encoded
by the mutant nucleic acid sequence;
[0246] (d) screening the polypeptide for at least one modified
property; and,
[0247] (e) optionally, if the polypeptide has no desired variant
CPP synthase activity, sclareol synthase activity, or manool
synthase activity, repeating the process steps (a) to (d) until a
polypeptide with a desired variant CPP synthase activity, sclareol
synthase activity, or manool synthase activity is obtained;
[0248] (f) optionally, if a polypeptide having a desired variant
CPP synthase activity or a sclareol synthase activity or manool
synthase activity was identified in step (d), isolating the
corresponding mutant nucleic acid obtained in step (c).
[0249] According to an embodiment, the variant polypeptide prepared,
when in combination with either a polypeptide with CPP synthase
activity or a polypeptide with sclareol synthase activity, is
capable of producing (+)-manool.
[0250] In step (b), a large number of mutant nucleic acid sequences
may be created, for example by random mutagenesis, site-specific
mutagenesis, or DNA shuffling. The detailed procedures of gene
shuffling are found in Stemmer, DNA shuffling by random
fragmentation and reassembly: in vitro recombination for molecular
evolution (Proc Natl Acad Sci USA., 1994, 91(22): 10747-10751). In
short, DNA shuffling refers to a process of random recombination of
known sequences in vitro, involving at least two nucleic acids
selected for recombination. For example mutations can be introduced
at particular loci by synthesizing oligonucleotides containing a
mutant sequence, flanked by restriction sites enabling ligation to
fragments of the native sequence. Following ligation, the resulting
reconstructed sequence encodes an analog having the desired amino
acid insertion, substitution, or deletion. Alternatively,
oligonucleotide-directed site-specific mutagenesis procedures can
be employed to provide an altered gene wherein predetermined codons
can be altered by substitution, deletion or insertion.
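As a purely illustrative sketch of altering a predetermined codon by substitution, the following Python function replaces one codon in a coding sequence. The function name, the 0-based codon indexing and the toy sequence in the usage note are assumptions made for the example; they are not sequences or procedures of the disclosure, and the sketch models only the in silico design step, not the laboratory mutagenesis itself.

```python
def substitute_codon(cds: str, codon_index: int, new_codon: str) -> str:
    """Return a copy of a coding sequence with one codon replaced.

    cds          coding DNA sequence, length a multiple of three
    codon_index  0-based index of the codon to alter (hypothetical convention)
    new_codon    three DNA bases encoding the desired residue
    """
    cds = cds.upper()
    new_codon = new_codon.upper()

    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if len(new_codon) != 3 or set(new_codon) - set("ACGT"):
        raise ValueError("new_codon must consist of three DNA bases")

    start = 3 * codon_index
    if not 0 <= start < len(cds):
        raise IndexError("codon_index is outside the coding sequence")

    # Splice the new codon in place of the original one.
    return cds[:start] + new_codon + cds[start + 3:]
```

With the toy input `substitute_codon("ATGGCGAAA", 1, "CTG")`, the second codon (GCG, alanine) is replaced by CTG (leucine) while the reading frame is preserved.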
[0251] Mutant nucleic acids may be obtained and separated, which
may be used for transforming a host cell according to standard
procedures, for example such as disclosed in the present
examples.
[0252] In step (d), the polypeptide obtained in step (c) is
screened for at least one modified property, for example a desired
modified enzymatic activity. Examples of desired enzymatic
activities, for which an expressed polypeptide may be screened,
include enhanced or reduced enzymatic activity, as measured by
K.sub.M or V.sub.max value, modified regio-chemistry or
stereochemistry and altered substrate utilization or product
distribution. The screening of enzymatic activity can be performed
according to procedures familiar to the skilled person and those
disclosed in the present examples.
[0253] Step (e) provides for repetition of process steps (a)-(d),
which may preferably be performed in parallel. Accordingly, by
creating a significant number of mutant nucleic acids, many host
cells may be transformed with different mutant nucleic acids at the
same time, allowing for the subsequent screening of an elevated
number of polypeptides. The chances of obtaining a desired variant
polypeptide may thus be increased at the discretion of the skilled
person.
[0254] In addition to the gene sequences shown in the sequences
disclosed herein, it will be apparent for the person skilled in the
art that DNA sequence polymorphisms may exist within a given
population, which may lead to changes in the amino acid sequence of
the polypeptides disclosed herein. Such genetic polymorphisms may
exist in cells from different populations or within a population
due to natural allelic variation. Allelic variants may also include
functional equivalents.
[0255] Further embodiments also relate to the molecules derived by
such sequence polymorphisms from the concretely disclosed nucleic
acids. These natural variations usually bring about a variance of
about 1 to 5% in the nucleotide sequence of a gene or in the amino
acid sequence of the polypeptides disclosed herein. As mentioned
above, the nucleic acid encoding the polypeptide of an embodiment
herein is a useful tool to modify non-human host organisms or cells
intended to be used when the method is carried out in vivo.
[0256] A nucleic acid encoding a polypeptide according to any of
the above-described embodiments is therefore also provided
herein.
[0257] The nucleic acid of an embodiment herein can be defined as
including deoxyribonucleotide or ribonucleotide polymers in either
single- or double-stranded form (DNA and/or RNA). The term
"nucleotide sequence" should also be understood as comprising a
polynucleotide molecule or an oligonucleotide molecule in the form
of a separate fragment or as a component of a larger nucleic acid.
Nucleic acids of an embodiment herein also encompass certain
isolated nucleotide sequences including those that are
substantially free from contaminating endogenous material. The
nucleic acid of an embodiment herein may be truncated, provided
that it encodes a polypeptide encompassed herein, as described
above.
[0258] In one embodiment, the nucleic acid of an embodiment herein
that encodes for a CPP synthase can be either present naturally in
a plant such as Salvia miltiorrhiza, or other species, such as
Coleus forskohlii, Triticum aestivum, Marrubium vulgare or
Rosmarinus officinalis, or be obtained by modifying SEQ ID NO: 3,
SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID
NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO:
32.
[0259] In a further embodiment, the nucleic acid of an embodiment
herein that encodes for a sclareol synthase can be either present
naturally in a plant such as Salvia sclarea, or other species such
as Nicotiana glutinosa, or can be obtained by modifying SEQ ID NO:
6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO:
34.
[0260] Mutations may be any kind of mutations of these nucleic
acids, such as point mutations, deletion mutations, insertion
mutations and/or frame shift mutations. A variant nucleic acid may
be prepared in order to adapt its nucleotide sequence to a specific
expression system. For example, bacterial expression systems are
known to more efficiently express polypeptides if amino acids are
encoded by particular codons.
[0261] Due to the degeneracy of the genetic code, more than one
codon may encode the same amino acid, so multiple nucleic
acid sequences can code for the same protein or polypeptide, all
these DNA sequences being encompassed by an embodiment herein.
Where appropriate, the nucleic acid sequences encoding the CPP
synthase and the sclareol synthase may be optimized for increased
expression in the host cell. For example, nucleotides of an
embodiment herein may be synthesized using codons particular to a
host for improved expression.
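The idea of choosing host-preferred codons can be sketched in Python as a simple back-translation. The codon table below is an illustrative assumption (one commonly used E. coli codon per amino acid) and is not taken from the disclosure; a real design would rely on a measured codon usage table for the chosen host, and commercial optimization as mentioned in the Examples typically weighs further criteria that are not modeled here.

```python
# Illustrative single-codon-per-residue table (assumption, not from the source).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str, table=PREFERRED_CODON, stop: str = "TAA") -> str:
    """Encode a protein sequence using one preferred codon per amino acid.

    Because the genetic code is degenerate, the result is only one of many
    DNA sequences that encode the same polypeptide.
    """
    try:
        codons = [table[residue] for residue in protein.upper()]
    except KeyError as err:
        raise ValueError(f"no codon defined for residue {err.args[0]!r}") from None
    # Append a stop codon so the sequence is a complete open reading frame.
    return "".join(codons) + stop
```

For example, `back_translate("MAK")` encodes methionine, alanine and lysine followed by a stop codon.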
[0262] Another important tool for transforming host organisms or
cells suitable to carry out the method of an embodiment herein in
vivo is an expression vector comprising a nucleic acid according to
any embodiment herein. Such a vector is therefore
also provided herein.
[0263] Recombinant non-human host organisms and cells transformed
to harbor at least one nucleic acid of an embodiment herein so that
they heterologously express or over-express at least one
polypeptide of an embodiment herein are also very useful tools to
carry out the method of an embodiment herein. Such non-human host
organisms and cells are therefore also provided herein.
[0264] A nucleic acid according to any of the above-described
embodiments can be used to transform the non-human host organisms
and cells and the expressed polypeptide can be any of the
above-described polypeptides.
[0265] Non-human host organisms of an embodiment herein may be any
non-human multicellular or unicellular organisms. In a particular
embodiment, the non-human host organism is a plant, a prokaryote or
a fungus. Any plant, prokaryote or fungus is suitable to be
transformed according to the methods provided herein. Particularly
useful plants are those that naturally produce high amounts of
terpenes.
[0266] In a more particular embodiment the non-human host organism
is a microorganism. Any microorganism is suitable to be used
herein, but according to an even more particular embodiment said
microorganism is a bacterium or a yeast. Most particularly, said
bacterium is E. coli and said yeast is Saccharomyces
cerevisiae.
[0267] Isolated higher eukaryotic cells can also be transformed,
instead of complete organisms. As higher eukaryotic cells, we mean
here any non-human eukaryotic cell except yeast cells. Particular
higher eukaryotic cells are plant cells or fungal cells.
[0268] Embodiments provided herein include, but are not limited to
cDNA, genomic DNA and RNA sequences.
[0269] Genes, including the polynucleotides of an embodiment
herein, can be cloned on basis of the available nucleotide sequence
information, such as found in the attached sequence listing and by
methods known in the art. These include e.g. the design of DNA
primers representing the flanking sequences of such a gene, of which
one is generated in sense orientation and initiates
synthesis of the sense strand, and the other is created in reverse
complementary fashion and generates the antisense strand. Thermo
stable DNA polymerases such as those used in polymerase chain
reaction are commonly used to carry out such experiments.
Alternatively, DNA sequences representing genes can be chemically
synthesized and subsequently introduced in DNA vector molecules
that can be multiplied by e.g. compatible bacteria such as e.g. E.
coli.
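The primer design principle described above (one primer in sense orientation, the other as the reverse complement of the opposite flank) can be sketched in Python as follows. The function names and the default primer length are hypothetical choices for illustration; practical primer design also considers melting temperature, secondary structure and added restriction sites, none of which are modeled here.

```python
from typing import Tuple

# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanking_primers(template: str, length: int = 20) -> Tuple[str, str]:
    """Return (forward, reverse) primers covering the two ends of a template.

    The forward primer is the first `length` bases of the sense strand.
    The reverse primer is the reverse complement of the last `length` bases,
    so that it anneals to the sense strand and primes the antisense strand.
    """
    if length <= 0 or length > len(template):
        raise ValueError("primer length must be between 1 and len(template)")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse
```

Both primers are returned in 5' to 3' orientation, which is the usual convention for ordering oligonucleotides.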
[0270] Provided herein are nucleic acid sequences obtained by
mutations of SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO:
19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ
ID NO: 31, or SEQ ID NO: 32, and SEQ ID NO: 6, SEQ ID NO: 24, SEQ
ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34; such mutations can be
routinely made. It is clear to the skilled artisan that mutations,
deletions, insertions, and/or substitutions of one or more
nucleotides can be introduced into these DNA sequences.
[0271] The nucleic acid sequences of an embodiment herein encoding
CPP synthase and the sclareol synthase proteins can be inserted in
expression vectors and/or be contained in chimeric genes inserted
in expression vectors, to produce CPP synthase and sclareol
synthase in a host cell or host organism. The vectors for inserting
transgenes into the genome of host cells are well known in the art
and include plasmids, viruses, cosmids and artificial chromosomes.
Binary or co-integration vectors into which a chimeric gene is
inserted are also used for transforming host cells.
[0272] An embodiment provided herein provides recombinant
expression vectors comprising nucleic acids encoding a CPP
synthase and a sclareol synthase, each separately operably
linked to associated nucleic acid sequences such as, for instance,
promoter sequences.
[0273] Alternatively, the promoter sequence may already be present
in a vector so that the nucleic acid sequence which is to be
transcribed is inserted into the vector downstream of the promoter
sequence. Vectors are typically engineered to have an origin of
replication, a multiple cloning site, and a selectable marker.
EXAMPLES
Example 1
[0274] Diterpene Synthase Genes.
[0275] Two diterpene synthases are necessary for the conversion of
geranylgeranyl diphosphate (GGPP) to manool: a type II and a type I
diterpene synthase. In the following examples, several type II and
type I diterpene synthase combinations were selected and evaluated
for the production of manool. For the type II synthases, five
copalyl diphosphate (CPP) synthases were selected:
[0276] SmCPS, NCBI accession No ABV57835.1, from Salvia miltiorrhiza.
[0277] CfCPS1, NCBI accession No AHW04046.1, from Coleus forskohlii.
[0278] TaTps1, NCBI accession No BAH56559.1, from Triticum aestivum.
[0279] MvCps3, NCBI accession No AIE77092.1, from Marrubium vulgare.
[0280] RoCPS1, NCBI accession No AHL67261.1, from Rosmarinus officinalis.
[0281] The codon usage of the cDNAs encoding the five CPP
synthases was modified for optimal expression in E. coli (DNA 2.0,
Menlo Park, Calif. 94025) and the NdeI and KpnI restriction sites
were added at the 5'-end and 3'-end, respectively. In addition, the
cDNAs were designed to express the recombinant CPP synthases with
deletion of the predicted signal peptide (58, 63, 59, 63 and 67
amino acids for SmCPS, CfCPS1, TaTps1, MvCps3 and RoCPS1,
respectively).
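The N-terminal deletions listed above can be expressed as a small Python sketch. The deletion lengths are those stated in the text; the function name and the placeholder sequence in the usage note are hypothetical, and whether a start methionine is re-added to the truncated protein is not modeled because the text does not specify it.

```python
# Predicted signal peptide lengths (amino acids) as stated in the text.
SIGNAL_PEPTIDE_LENGTH = {
    "SmCPS": 58,
    "CfCPS1": 63,
    "TaTps1": 59,
    "MvCps3": 63,
    "RoCPS1": 67,
}


def truncate_signal_peptide(name: str, protein: str) -> str:
    """Remove the predicted N-terminal signal peptide of a CPP synthase.

    name     enzyme identifier, one of the keys of SIGNAL_PEPTIDE_LENGTH
    protein  full-length amino acid sequence (one-letter code)
    """
    n = SIGNAL_PEPTIDE_LENGTH[name]
    if len(protein) <= n:
        raise ValueError("sequence is not longer than the signal peptide")
    # Keep everything after the first n residues.
    return protein[n:]
```

As a usage illustration with a placeholder sequence (not a real enzyme sequence), `truncate_signal_peptide("SmCPS", "A" * 58 + "MKLV")` drops the first 58 residues and keeps the remainder.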
[0282] For the type I diterpene synthase, the sclareol synthase
from Salvia sclarea (SsScS) was used (NCBI accession No AET21246.1,
WO2009095366). The codon usage of the cDNA was optimized for E.
coli expression (DNA 2.0, Menlo Park, Calif. 94025), the first 50
N-terminal codons were removed and the NdeI and KpnI restriction
sites were added at the 5'-end and 3'-end, respectively. All the
cDNAs were synthesized in vitro and cloned in the pJ208 or pJ401
plasmid (DNA 2.0, Menlo Park, Calif. 94025, USA).
Example 2
[0283] Expression Plasmids.
[0284] The modified SmCPS-encoding cDNA (SmCPS2) and sclareol
synthase (SsScS)-encoding cDNA (1132-2-5_opt) were digested with
NdeI and KpnI and ligated into the pETDuet-1 plasmid providing the
pETDuet-SmCPS2 and pETDuet-1132opt expression plasmids,
respectively.
[0285] Another plasmid was constructed to co-express the SmCPS2
and SsScS enzymes together with a geranylgeranyl diphosphate (GGPP)
synthase. For the GGPP synthase, the CrtE gene from Pantoea
agglomerans (NCBI accession M38424.1) encoding for a GGPP synthase
(NCBI accession number AAA24819.1) was used. The CrtE gene was
synthesized with codon optimization and addition of the NcoI and
BamHI restriction enzyme recognition sites at the 3' and 5' ends
(DNA 2.0, Menlo Park, Calif. 94025, USA) and ligated between NcoI
and BamHI site of the pETDuet-1 plasmid to obtain the pETDuet-CrtE
plasmid. The SmCPS2 encoding cDNA was digested with NdeI and KpnI
and ligated into the pETDuet-CrtE plasmid thus providing the
pETDuet-CrtE-SmCPS2 construct. The optimized cDNA (1132-2-5_opt)
encoding for the truncated SsScS was then introduced in the
pETDuet-CrtE-SmCPS2 plasmid using the In-Fusion.RTM. technique
(Clontech, Takara Bio Europe). For this cloning, the
pETDuet-1132opt was used as template in a PCR amplification using
the forward primer SmCPS2-1132Inf_F1
5'-CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGCGAAAATG
AAGGAGAACTTTAAACG-3' (SEQ ID NO: 9) and the reverse primer
1132-pET_Inf_R1
5'-GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCATGTCCTCT-3' (SEQ
ID NO: 10). The PCR product was ligated in the plasmid
pETDuet-CrtE-SmCPS2 digested with the KpnI and XhoI restriction
enzymes and using the In-Fusion.RTM. Dry-Down PCR Cloning Kit
(Clontech, Takara Bio Europe), providing the new plasmid
pETDuet-CrtE-SmCPS2-SsScS. In this plasmid the CrtE gene is under
the control of the first T7 promoter of the pETDuet plasmid and the
CPP synthase and sclareol synthase encoding cDNAs are organized in
a bi-cistronic construct under the control of the second T7
promoter.
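As a quick consistency check on the cloning strategy described above, the following Python sketch locates the KpnI and XhoI recognition sequences (GGTACC and CTCGAG) in the two In-Fusion primers. The primer sequences are copied from SEQ ID NO: 9 and SEQ ID NO: 10 as printed in this example; the helper function and constant names are illustrative and not part of the disclosure.

```python
# Primer sequences as given in the text (5' to 3').
FORWARD_PRIMER = (
    "CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGCGAAAATG"
    "AAGGAGAACTTTAAACG"
)  # SmCPS2-1132Inf_F1, SEQ ID NO: 9
REVERSE_PRIMER = (
    "GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCATGTCCTCT"
)  # 1132-pET_Inf_R1, SEQ ID NO: 10

# Recognition sequences of the enzymes used to open the vector.
RECOGNITION_SITES = {"KpnI": "GGTACC", "XhoI": "CTCGAG"}


def find_sites(seq: str, site: str) -> list:
    """Return every 0-based start position of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


if __name__ == "__main__":
    for label, primer in (("forward", FORWARD_PRIMER), ("reverse", REVERSE_PRIMER)):
        for enzyme, site in RECOGNITION_SITES.items():
            print(label, enzyme, find_sites(primer, site))
```

The forward primer carries the KpnI sequence and the reverse primer carries the XhoI sequence, which matches the KpnI and XhoI digestion of pETDuet-CrtE-SmCPS2 described in the text.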
[0286] The pETDuet-CrtE-SmCPS2-SsScS plasmid was used as template
for construction of new expression plasmids carrying the cDNAs
encoding the four other CPP synthases. The SmCPS2 cDNA was replaced by one
of the four new CPP synthase-encoding cDNAs using an NdeI-KpnI
restriction digestion-ligation approach providing the new plasmids
pETDuet-CrtE-CfCPS1del63-SsScS, pETDuet-CrtE-TaTps1del59-SsScS,
pETDuet-CrtE-MvCps3del63-SsScS and
pETDuet-CrtE-RoCPS1del67-SsScS.
Example 3
[0287] Heterologous Expression in E. coli and Enzymatic
Activities.
[0288] The expression plasmids (pETDuet-SmCPS2 or pETDuet-1132opt)
were used to transform BL21(DE3) E. coli cells (Novagen, Madison,
Wis.). Single colonies of transformed cells were used to inoculate
25 ml LB medium. After 5 to 6 hours incubation at 37.degree. C.,
the cultures were transferred to a 20.degree. C. incubator and left
1 hour for equilibration. Expression of the protein was then
induced by the addition of 0.1 mM IPTG and the culture was
incubated over-night at 20.degree. C. The next day, the cells were
collected by centrifugation, re-suspended in 0.1 volume of 50 mM
MOPSO (3-morpholino-2-hydroxypropanesulfonic acid sodium salt,
3-(N-morpholinyl)-2-hydroxypropanesulfonic acid sodium salt) buffer
at pH 7, 10% glycerol, 1 mM DTT and lysed by sonication. The
extracts were cleared by centrifugation (30 min at 20,000 g) and
the supernatants containing the soluble proteins were used for
further experiments.
Example 4
[0289] In Vitro Diterpene Synthase Activity Assays.
[0290] Enzymatic assays were performed in Teflon sealed glass tubes
using 50 to 100 .mu.l of protein extract in a final volume of 1 mL
of 50 mM MOPSO pH 7, 10% glycerol supplemented with 20 mM
MgCl.sub.2 and 50 to 200 .mu.M purified geranylgeranyl diphosphate
(GGPP) (prepared as described by Keller and Thompson, J.
Chromatogr, 1993, 645(1):161-167). The tubes were incubated 5 to 48
hours at 30.degree. C. and the enzyme products were extracted twice
with one volume of pentane. After concentration under a nitrogen
flux, the extracts were analyzed by GC-MS and compared to extracts
from control proteins (obtained from cells transformed with the
empty plasmid). GC-MS analyses were performed on an Agilent 6890
series GC system equipped with a DB1 column (30 m.times.0.25
mm.times.0.25 .mu.m film thickness; Agilent) and coupled with a 5975
series mass spectrometer. The carrier gas was helium at a constant
flow of 1 ml/min. Injection was in split-less mode with the
injector temperature set at 260.degree. C. and the oven temperature
was programmed from 100.degree. C. to 225.degree. C. at 10.degree.
C./min and to 280.degree. C. at 30.degree. C./min. The identities
of the products were confirmed based on the concordance of the
retention indices and mass spectra of authentic standards.
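The oven program stated above implies a minimum run time that can be computed with a short Python sketch. The two ramps (100 to 225 degrees C at 10 degrees C/min, then to 280 degrees C at 30 degrees C/min) are taken from the text; initial and final hold times are not stated and are therefore assumed to be zero here, so the result is a lower bound on the actual run time. The function and constant names are illustrative.

```python
def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Duration in minutes of one linear temperature ramp."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return (end_c - start_c) / rate_c_per_min


# (start temperature, end temperature, rate) for each ramp, from the text.
OVEN_PROGRAM = [
    (100.0, 225.0, 10.0),
    (225.0, 280.0, 30.0),
]


def total_ramp_minutes(program=OVEN_PROGRAM) -> float:
    """Total ramp time of the oven program, excluding any hold times."""
    return sum(ramp_minutes(*step) for step in program)
```

The first ramp covers 125 degrees C at 10 degrees C/min and the second covers 55 degrees C at 30 degrees C/min; `total_ramp_minutes()` adds the two durations.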
[0291] In these conditions and with the recombinant protein from E.
coli cells transformed with the plasmids pETDuet-SmCPS2 or
pETDuet-1132opt (heterologously expressing the SmCPS or SsScS
enzymes, respectively) no production of diterpene molecules was
detected in the solvent extracts (the diphosphate-containing
diterpenes are not detected in these conditions). Similar assays
were then performed but combining the 2 protein extracts containing
the recombinant SmCPS and SsScS in a single assay. In these assays,
one major product was formed and was identified as being (+)-manool
by matching of the mass spectrum and retention index with authentic
standards (FIG. 3). This experiment demonstrated that a sclareol
synthase can be used together with a CPP synthase to produce
manool.
Example 5
[0292] In Vivo Manool Production Using E. coli Cells.
[0293] The in vivo production of manool using cultures of whole
cells was evaluated using E. coli cells. The CrtE gene inserted in
the co-expression plasmids described in Example 2 encodes for an
enzyme having GGPP synthase activity that uses farnesyl-diphosphate
(FPP) to produce geranylgeranyl diphosphate (GGPP). To increase the
level of the endogenous GGPP pool and therefore the productivity in
diterpene of the cells, a heterologous complete mevalonate pathway
leading to FPP was co-expressed in the same cells. The enzymes of
this pathway were expressed using a single plasmid containing all
the genes organized in two operons under the control of two
promoters. The construction of this expression plasmid is described
in patent application WO2013064411 or in Schalk et al. (J. Am.
Chem. Soc., 2013, 134:18900-18903). Briefly, a first synthetic
operon consisting of an E. coli acetoacetyl-CoA thiolase (atoB), a
Staphylococcus aureus HMG-CoA synthase (mvaS), a Staphylococcus
aureus HMG-CoA reductase (mvaA) and a Saccharomyces cerevisiae FPP
synthase (ERG20) gene was synthesized in vitro (DNA2.0, Menlo
Park, Calif., USA) and ligated into the NcoI-BamHI digested
pACYCDuet-1 vector (Invitrogen) yielding pACYC-29258. A second
operon containing a mevalonate kinase (MvaK1), a phosphomevalonate
kinase (MvaK2), a mevalonate diphosphate decarboxylase (MvaD), and
an isopentenyl diphosphate isomerase (idi) was amplified from
genomic DNA of Streptococcus pneumoniae (ATCC BAA-334) and ligated
into the second multicloning site of pACYC-29258 providing the
plasmid pACYC-29258-4506. This plasmid thus contains the genes
encoding all enzymes of the biosynthetic pathway leading from
acetyl-coenzyme A to FPP.
[0294] KRX E. coli cells (Promega) were co-transformed with the
plasmid pACYC-29258-4506 and one plasmid selected from
pETDuet-CrtE-SmCPS2-SsScS, pETDuet-CrtE-CfCPS1del63-SsScS,
pETDuet-CrtE-TaTps1del59-SsScS, pETDuet-CrtE-MvCps3del63-SsScS, or
pETDuet-CrtE-RoCPS1del67-SsScS. Transformed cells were selected on
carbenicillin (50 .mu.g/ml) and chloramphenicol (34 .mu.g/ml)
LB-agarose plates. Single colonies were used to inoculate 5 mL
liquid LB medium supplemented with the same antibiotics. The
cultures were incubated overnight at 37.degree. C. The next day 2
mL of TB medium supplemented with the same antibiotics were
inoculated with 0.2 mL of the overnight culture. After 6 hours
incubation at 37.degree. C., the culture was cooled down to
28.degree. C. and 0.1 mM IPTG, 0.2% rhamnose and 1:10 volume of
decane were added to each tube. The cultures were incubated for 48
hours at 28.degree. C. The cultures were then extracted twice with
2 volumes of MTBE (methyl tert-butyl ether); the organic phases were
concentrated to 500 .mu.L and analyzed by GC-MS as described above
in Example 4, except for the oven temperature program, which was a 1 min
hold at 100.degree. C., followed by a temperature gradient of 10.degree.
C./min to 220.degree. C. and then 20.degree. C./min to 300.degree.
C.
[0295] Under these culture conditions, manool was produced with
each combination of type II diterpene synthase and the Salvia
sclarea sclareol synthase (SsScS) (FIGS. 4 and 5). The amounts of
diterpene compounds produced were quantified using an internal
standard (alpha-longipinene). The table below shows the quantities
of manool produced relative to the SmCPS/SsScS combination, when
the SsScS is combined with various type II diterpene synthases
(under these experimental conditions, the concentration of manool
produced by cells expressing the SmCPS and the SsScS was 300 to 500
mg/L (FIG. 4)). Under these conditions, the highest relative
quantity of manool produced was with the TaTps1del59
combination.
TABLE-US-00001
Type II diterpene    Type I diterpene    Relative quantity of
synthase             synthase            manool produced
SmCPS2               SsScS               100
CfCPS1del63          SsScS               125.3
TaTps1del59          SsScS               139.4
MvCps3del63          SsScS               14.9
RoCPS1del67          SsScS               77.7
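The relative quantities in the table can be related back to the 300 to 500 mg/L range reported for the SmCPS/SsScS reference combination. The sketch below does this arithmetic; it assumes that each relative value scales both ends of the reference range proportionally, which is an illustrative simplification and not a measurement reported in the text. The function name is hypothetical.

```python
# Relative quantities of manool (SmCPS2/SsScS = 100), as listed in the table.
relative_quantity = {
    "SmCPS2": 100.0,
    "CfCPS1del63": 125.3,
    "TaTps1del59": 139.4,
    "MvCps3del63": 14.9,
    "RoCPS1del67": 77.7,
}

# Concentration range reported for the SmCPS/SsScS reference, in mg/L.
REFERENCE_RANGE_MG_PER_L = (300.0, 500.0)

def implied_range(relative_percent, reference=REFERENCE_RANGE_MG_PER_L):
    """Scale the reference range by a relative quantity given in percent.

    Assumption: proportional scaling of both ends of the reference range.
    """
    low, high = reference
    return (low * relative_percent / 100.0, high * relative_percent / 100.0)

for synthase, percent in relative_quantity.items():
    low, high = implied_range(percent)
    print(f"{synthase:<12} {percent:6.1f} %  ~{low:.0f} to {high:.0f} mg/L")
```

Read this way, the TaTps1del59 combination would correspond to roughly 420 to 700 mg/L, subject to the proportionality assumption stated above.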
Example 6
[0296] Production of (+)-Manool Using Recombinant Cells,
Purification and NMR Analysis.
[0297] One litre of E. coli culture was prepared in the conditions
described in Example 5, using the SmCPS/SsScS enzyme combination,
except that the decane organic phase was replaced by 50 g/L
Amberlite XAD-4 for solid phase extraction. The culture medium was
filtered to recover the resin. The resin was then washed with 3
column volumes of water, and eluted using 3 column volumes of MTBE.
The product was then further purified by flash chromatography on
silica gel using a mobile phase composed of heptane:MTBE 8:2 (v/v).
The structure of manool was confirmed by 1H- and 13C-NMR using a
Bruker Avance 500 MHz spectrometer. The optical rotation was
measured using a Perkin-Elmer 241 polarimeter and the value of
[.alpha.].sup.D.sub.20=+26.9.degree. (0.3%, CHCl.sub.3) confirmed
the production of (+)-manool.
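For orientation, the specific rotation quoted above can be related to the angle actually read on the polarimeter through the usual definition [alpha] = 100 x alpha(observed) / (l x c), with l in dm and c in g/100 mL. The sketch below assumes that "0.3%" means 0.3 g per 100 mL and that a 1 dm cell was used; neither the cell length nor the observed angle is stated in the text, so both are assumptions made only for illustration.

```python
def observed_rotation_deg(specific_rotation_deg, conc_g_per_100ml, path_dm=1.0):
    """Observed rotation implied by a specific rotation.

    Uses [alpha] = 100 * alpha_observed / (l * c), with l in dm and
    c in g/100 mL. The default 1 dm path length is an assumption.
    """
    return specific_rotation_deg * path_dm * conc_g_per_100ml / 100.0

# Values from the text: [alpha] = +26.9 degrees at 0.3% in CHCl3.
alpha_observed = observed_rotation_deg(26.9, 0.3)
print(f"Implied observed rotation: {alpha_observed:+.4f} degrees")
```

Under these assumptions the reading would be on the order of +0.08 degrees, which shows why the sign of the rotation, rather than its magnitude, is the informative result here.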
Example 7
[0298] In Vivo Manool Production in E. coli Cells Using Sclareol
Synthases from Nicotiana glutinosa.
[0299] Sclareol synthases from the plant Nicotiana glutinosa are
described in WO 2014/022434 and are shown to produce sclareol from
labdenediol diphosphate (LPP). Two of the sclareol synthases
described in WO 2014/022434 were evaluated, NgSCS-del29
(corresponding to SEQ ID NO: 78 of WO 2014/022434) and NgSCS-del38
(corresponding to SEQ ID NO: 40 of WO 2014/022434) for the
production of (+)-manool under conditions similar to Example 5.
[0300] A cDNA encoding for NgSCS-del29 was designed with a codon
usage optimal for E. coli expression and including the KpnI and
XhoI sites at the 5'-end and 3'-end respectively. This DNA was
synthesized by DNA 2.0 (Newark, CA 94560).
[0301] The pETDuet-CrtE-SmCPS2-SsScS plasmid (Example 2) was used
as template for construction of a new expression plasmid. The
pETDuet-CrtE-SmCPS2-SsScS plasmid was digested with the KpnI and
XhoI restriction enzymes to replace the SsScS cDNA with the
NgSCS-del29 cDNA, providing the new pETDuet-CrtE-SmCPS2-del29
plasmid.
[0302] KRX E. coli cells (Promega) were co-transformed with the
plasmid pACYC-29258-4506 (Example 5) and the
pETDuet-CrtE-SmCPS2-del29 plasmid. Transformed cells were selected
and cultivated in conditions for production of diterpene as
described in Example 5. The production of diterpenes was evaluated
using GC-MS analysis and the diterpene compounds produced were
quantified using an internal standard (alpha-longipinene). With the
new combination of the diterpene synthases SmCPS2 and NgSCS-del29,
manool was produced by transformed E. coli cells (FIG. 6). The
combination of the diterpene synthases SmCPS2 and NgSCS-del38 did
not produce manool under the experimental conditions used. Thus at
least one of the Nicotiana glutinosa sclareol synthases tested can
also be used to produce manool from CPP. However, the quantities
produced using the Nicotiana glutinosa synthase were much lower
than with the SsScS synthase (see table below).
TABLE-US-00002
Type II diterpene    Type I diterpene    Relative quantity of
synthase             synthase            manool produced
SmCPS2               SsScS               100
SmCPS2               NgSCS-del29         3.1
Example 8
[0303] The manool obtained in the above examples was converted into
its esters according to the following experimental part (exemplified
below by the conversion into its acetate):
##STR00001##
Following the literature (G. Ohloff, Helv. Chim. Acta 41, 845
(1958)), 32.0 g (0.11 mole) of pure crystalline (+)-Manool were
treated by 20.0 g (0.25 mole) of acetyl chloride in 100 ml of
dimethyl aniline for 5 days at room temperature. The mixture was
additionally heated for 7 hours at 50.degree. to reach 100% of
conversion. After cooling, the reaction mixture was diluted with
ether, washed successively with 10% H.sub.2SO.sub.4, aqueous
NaHCO.sub.3 and water to neutrality. After drying
(Na.sub.2SO.sub.4) and concentration, the product was distilled
(bulb-to-bulb, B.p.=160.degree., 0.1 mbar) to give 20.01 g (79.4%)
of Manool Acetate which was used without further purification.
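The molar amounts quoted for the two reagents can be checked from their masses. The sketch below does so using standard atomic masses and the molecular formulas of manool (C20H34O) and acetyl chloride (C2H3ClO); these formulas are not written out in the text and are supplied here as known chemistry, and the helper name is hypothetical.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "Cl": 35.45}

def molar_mass(composition):
    """Molar mass in g/mol from a mapping of element symbol to atom count."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())

# Molecular formulas (assumed from general chemistry, not stated in the text).
MANOOL = {"C": 20, "H": 34, "O": 1}
ACETYL_CHLORIDE = {"C": 2, "H": 3, "Cl": 1, "O": 1}

# Masses charged, as stated in the text.
mol_manool = 32.0 / molar_mass(MANOOL)
mol_acetyl_chloride = 20.0 / molar_mass(ACETYL_CHLORIDE)
equivalents = mol_acetyl_chloride / mol_manool

print(f"Manool:          {mol_manool:.3f} mol")
print(f"Acetyl chloride: {mol_acetyl_chloride:.3f} mol ({equivalents:.1f} eq.)")
```

Both results agree with the amounts given in the text (0.11 mole and 0.25 mole), corresponding to roughly 2.3 equivalents of acetyl chloride.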
[0304] MS: M.sup.+ 332 (0); m/e: 272 (27), 257 (83), 137 (62), 95
(90), 81 (100).
[0305] .sup.1H-NMR (CDCl.sub.3): 0.67, 0.80, 0.87, 1.54 and 2.01 (5
s, 3H each), 4.49 (s, 1H), 4.80 (s, 1H), 5.11 (m, 1H), 5.13 (m,
1H), 5.95 (m, 1H).
[0306] .sup.13C-NMR (CDCl.sub.3): 14.5 (q), 17.4 (t), 19.4 (t),
21.7 (q), 22.2 (q), 23.5 (q), 24.2 (t), 33.5 (s), 33.6 (t), 38.3
(t), 39.0 (t), 39.3 (t), 39.8 (s), 42.2 (t), 55.6 (d), 57.2 (t),
83.4 (s), 106.4 (t), 113.0 (t), 142.0 (d), 148.6 (s), 169.9
(s).
Example 9
[0307] The manool acetate obtained in the above examples was
converted into its trienes according to the following experimental
part (herein below as example into its Sclarene and
(Z+E)-Biformene):
##STR00002##
To a solution of 0.4 g of Manool Acetate in 4 ml of cyclohexane at
room temperature was added 0.029 g (0.05 eq.) of BF.sub.3.AcOH
complex. After 15 minutes at room temperature, the reaction was
quenched with aqueous NaHCO.sub.3 and washed with water to
neutrality. GC-MS analysis showed only hydrocarbons which were
identified as Sclarene, (Z) and (E)-biformene. No Copalol Acetate
was detected. Another trial with more catalyst (0.15 eq) gave the
same result.
[0308] Sclarene: MS: M.sup.+ 272 (18); m/e: 257 (100), 149 (15),
105 (15).
[0309] (Z) and (E)-Biformene (identical spectra): MS: M.sup.+ 272
(29); m/e: 257 (100), 187 (27), 161 (33), 105 (37).
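The molecular ions reported in Examples 8 and 9 can be cross-checked against nominal masses. The sketch below assumes the molecular formulas C22H36O2 for manool acetate and C20H32 for sclarene and the biformenes; the reading of m/e 272 as loss of acetic acid and of m/e 257 as a further loss of a methyl group is a common interpretation offered here as an assumption, not an assignment made in the text.

```python
# Nominal (integer) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}

def nominal_mass(composition):
    """Nominal mass from a mapping of element symbol to atom count."""
    return sum(NOMINAL_MASS[element] * count for element, count in composition.items())

# Molecular formulas (assumed; not written out in the text).
MANOOL_ACETATE = {"C": 22, "H": 36, "O": 2}
TRIENE = {"C": 20, "H": 32}          # sclarene, (Z)- and (E)-biformene
ACETIC_ACID = {"C": 2, "H": 4, "O": 2}
METHYL = {"C": 1, "H": 3}

m_acetate = nominal_mass(MANOOL_ACETATE)
m_triene = nominal_mass(TRIENE)
m_after_acoh_loss = m_acetate - nominal_mass(ACETIC_ACID)
m_after_methyl_loss = m_triene - nominal_mass(METHYL)

print(f"Manool acetate M+:       {m_acetate}")
print(f"Triene M+:               {m_triene}")
print(f"Acetate minus AcOH:      {m_after_acoh_loss}")
print(f"Triene minus CH3:        {m_after_methyl_loss}")
```

The computed values match the reported ions at 332, 272 and 257.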
Example 10
[0310] The manool obtained in the above examples was converted into
Copalyl esters according to the following experimental part (herein
below as example into the acetate):
##STR00003##
To a solution of 0.474 g (0.826 mmole, 0.27 eq.) of BF.sub.3.AcOH
in 100 ml of cyclohexane at room temperature was added 4.4 g of
acetic anhydride and 12.1 g of acetic acid. At room temperature,
10.0 g (33 mmole) of pure crystalline Manool in 15 ml of
cyclohexane were added (sl. exothermic) and the temperature was
maintained at room temperature using a water bath. After 30 min. of
stirring at room temperature, a GC control showed no starting
material. The reaction mixture was quenched with 300 ml of aq.
saturated NaHCO.sub.3 and treated as usual. The crude mixture (9.9
g) was purified by flash chromatography (SiO2, pentane/ether 95:5)
and bulb-to-bulb distillation (Eb.=130.degree., 0.1 mbar) to give
4.34 g (37.1%) of a 27/73 mixture of (Z) and (E)-Copalyl
Acetate.
[0311] (Z)-Copalyl Acetate:
[0312] MS: M.sup.+ 332 (0); m/e: 317 (2), 272 (35), 257 (100), 137
(48), 95 (68), 81 (70).
[0313] .sup.1H-NMR (CDCl.sub.3): 0.67, 0.80, 0.87, 1.76 and 2.04
(5s, 3H each), 4.86 (s, 1H), 5.35 (t: J=6 Hz, 1H).
[0314] (E)-Copalyl Acetate:
[0315] MS: M.sup.+ 332 (0); m/e: 317 (2), 272 (33), 257 (100), 137
(54), 95 (67), 81 (74).
[0316] .sup.1H-NMR (CDCl.sub.3): 0.68, 0.80, 0.87, 1.70 and 2.06
(5s, 3H each), 4.82 (s, 1H), 5.31 (t: J=6 Hz, 1H).
[0317] .sup.13C-NMR (CDCl.sub.3): (Spectrum recorded on (Z/E)
mixture, only significant signals are given): 61.4 (t), 106.2 (t),
117.9 (d), 143.1 (s), 148.6 (s), 171.1 (s).
Example 11
[0318] The copalyl acetate obtained in the above examples was
converted into Copalol according to the following experimental
part:
##STR00004##
Copalyl Acetate (4.17 g, 12.5 mmole), KOH pellets (3.35 g, 59.7
mmole), water (1.5 g) and EtOH (9.5 ml) were mixed together and
stirred for 3 hours at 50.degree.. After usual workup, 3.7 g of
crude (Z+E)-Copalol were obtained and purified by flash
chromatography (SiO2, pentane/ether 7:2). After evaporation of the
solvent, a bulb-to-bulb distillation (Eb=170.degree., 0.1 mbar)
furnished 3.25 g (92%) of a 27/73 mixture of (Z) and
(E)-Copalol.
[0319] (Z)-Copalol
[0320] MS: M.sup.+ 290 (3); m/e: 275 (18), 272 (27), 257 (82), 137
(71), 95 (93), 81 (100), 69 (70).
[0321] .sup.1H-NMR (CDCl.sub.3): 0.67, 0.80, 0.87 and 1.74 (4s, 3H
each); 4.06 (m, 2H), 4.55 (s, 1H), 4.86 (s, 1H), 5.42 (t: J=6 Hz,
1H).
[0322] (E)-Copalol
[0323] MS: M.sup.+ 290 (3); m/e: 275 (27), 272 (22), 257 (75), 137
(75), 95 (91), 81 (100), 69 (68).
[0324] .sup.1H-NMR (CDCl.sub.3): 0.68, 0.80, 0.87 and 1.67 (4s, 3H
each); 4.15 (m, 2H), 4.51 (s, 1H), 4.83 (s, 1H), 5.39 (t, J=6 Hz,
1H).
[0325] .sup.13C-NMR (CDCl.sub.3): (Spectrum recorded on (Z/E)
mixture, only significant signals are given): 59.4 (t), 106.2 (t),
123.0 (d), 140.6 (s), 148.6 (s).
Example 12
[0326] In Vivo Manool Production in Saccharomyces cerevisiae Cells
Using Different Combinations of CPP Synthases and Sclareol
Synthases.
[0327] Different combinations of class I and class II diterpene
synthases were evaluated for the production of manool in S.
cerevisiae cells.
[0328] For the class II diterpene synthase, five CPP synthases were
selected:
[0329] SmCPS, NCBI accession No ABV57835.1, from Salvia miltiorrhiza.
[0330] CfCPS1, NCBI accession No AHW04046.1, from Coleus forskohlii.
[0331] TaTps1, NCBI accession No BAH56559.1, from Triticum aestivum.
[0332] MvCps3, NCBI accession No AIE77092.1, from Marrubium vulgare.
[0333] RoCPS1, NCBI accession No AHL67261.1, from Rosmarinus officinalis.
[0334] For the class I diterpene synthase, two putative sclareol
synthases from Nicotiana glutinosa and one from Salvia sclarea were
selected:
[0335] NgSCS-del38 (corresponding to SEQ ID NO: 40 of WO 2014/022434).
[0336] NgSCS-del29 (corresponding to SEQ ID NO: 78 of WO 2014/022434).
[0337] SsScS, NCBI accession No AET21246.1, from Salvia sclarea.
[0338] The codon usage of the DNA encoding for different CPP
synthases was modified for optimal expression in S. cerevisiae. In
addition, the DNA sequences were designed to express the
recombinant CPP synthase with deletion of the predicted signal
peptide (58, 63, 59, 63 and 67 amino acids for SmCPS, CfCPS1,
TaTps1, MvCps3 and RoCPS1, respectively). The NgSCS-del38,
NgSCS-del29 and SsScS DNA sequences were also codon optimized for
S. cerevisiae expression.
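Codon optimization changes the nucleotide sequence but must leave the encoded protein unchanged, so a simple sanity check is to translate the optimized cDNA and compare it with the intended amino acid sequence. The sketch below illustrates this with the first 45 nucleotides of SEQ ID NO: 3 and the first 15 residues of SEQ ID NO: 2 from the Sequence Listing (an E. coli-optimized sequence, used here only because it is available in the listing). The standard genetic code is assumed, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    usable = len(dna) - len(dna) % 3   # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# First 45 nt of SEQ ID NO: 3 (SmCPS2opt) and first 15 aa of SEQ ID NO: 2.
SMCPS2OPT_START = "ATGGCAACTGTTGACGCACCTCAAGTCCATGATCACGATGGCACC"
SMCPS2_START = "MATVDAPQVHDHDGT"

translated = translate(SMCPS2OPT_START)
print(translated, translated == SMCPS2_START)
```

The same comparison, run over full-length sequences, would flag any position at which an optimized cDNA and its intended protein disagree.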
[0339] For expression of the different genes in S. cerevisiae, a
set of plasmids was constructed in vivo using yeast endogenous
homologous recombination as previously described in Kuijpers et
al., Microb Cell Fact., 2013, 12:47. Each plasmid is composed of
six DNA fragments which were used for S. cerevisiae
co-transformation. The fragments were:
[0340] a) LEU2 yeast marker,
constructed by PCR using the primers 5'
AGGTGCAGTTCGCGTGCAATTATAACGTCGTGGCAACTGTTATCAGTCG
TACCGCGCCATTCGACTACGTCGTAAGGCC-3' (SEQ ID NO: 44) and 5'
TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCGTT
GTTGCTGACCATCGACGGTCGAGGAGAACTT-3' (SEQ ID NO: 45) with the plasmid
pESC-LEU (Agilent Technologies, California, USA) as template;
[0341] b) AmpR E. coli marker, constructed by PCR using the primers
5'-TGGTCAGCAACAACGCCGAAGAATCACTCTCGTGTTGAGAATTGCACGCC
TTGACCACGACACGTTAAGGGATTTTGGTCATGAG-3' (SEQ ID NO: 37) and
5'-AACGCGTACCCTAAGTACGGCACCACAGTGACTATGCAGTCCGCACTTTG
CCAATGCCAAAAATGTGCGCGGAACCCCTA-3' (SEQ ID NO: 38) with the plasmid
pESC-URA as template;
[0342] c) Yeast origin of replication,
obtained by PCR using the primers
5'-TTGGCATTGGCAAAGTGCGGACTGCATAGTCACTGTGGTGCCGTACTTAG
GGTACGCGTTCCTGAACGAAGCATCTGTGCTTCA-3' (SEQ ID NO: 39) and
5'-CCGAGATGCCAAAGGATAGGTGCTATGTTGATGACTACGACACAGAACTG
CGGGTGACATAATGATAGCATTGAAGGATGAGACT-3' (SEQ ID NO: 40) with
pESC-URA as template;
[0343] d) E. coli replication origin,
obtained by PCR using the primers
5'-ATGTCACCCGCAGTTCTGTGTCGTAGTCATCAACATAGCACCTATCCTTTG
GCATCTCGGTGAGCAAAAGGCCAGCAAAAGG-3' (SEQ ID NO: 41) and
5'-CTCAGATGTACGGTGATCGCCACCATGTGACGGAAGCTATCCTGACAGTG
TAGCAAGTGCTGAGCGTCAGACCCCGTAGAA-3' (SEQ ID NO: 42) with the plasmid
pESC-URA as template;
[0344] e) a fragment composed of the last 60
nucleotides of the fragment "d", 200 nucleotides downstream of the
stop codon of the yeast gene PGK1, the GGPP synthase coding
sequence CrtE (from Pantoea agglomerans, NCBI accession M38424.1)
codon optimized for its expression in S. cerevisiae, the
bidirectional yeast promoter of GAL10/GAL1, one of the tested
sclareol synthase coding sequences, 200 nucleotides downstream of the
stop codon of the yeast gene CYC1 and the sequence
5'-ATTCCTAGTGACGGCCTTGGGAACTCGATACACGATGTTCAGTAGACCGC
TCACACATGG-3' (SEQ ID NO: 43); this fragment was obtained by DNA
synthesis (DNA 2.0, Menlo Park, Calif. 94025); and
[0345] f) a fragment composed of the last 60 nucleotides of fragment "e", 200
nucleotides downstream of the stop codon of the yeast gene CYC1, one
of the tested CPP synthase coding sequences, the bidirectional
yeast promoter of GAL10/GAL1 and 60 nucleotides corresponding to
the beginning of the fragment "a", this fragment was obtained by
DNA synthesis (DNA 2.0, Menlo Park, Calif. 94025).
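In vivo assembly by homologous recombination relies on each fragment ending with the same 60 nucleotides with which the next fragment begins. The sketch below checks this for the junction between fragments "a" and "b" using the primer sequences given above. It assumes that SEQ ID NO: 45 is the reverse primer of fragment "a" (so its 5' tail is the reverse complement of the end of that fragment) and that SEQ ID NO: 37 is the forward primer of fragment "b"; these primer orientations are inferred, not stated in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Primer sequences copied from the text (5' to 3').
SEQ_ID_45 = (
    "TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCGTT"
    "GTTGCTGACCATCGACGGTCGAGGAGAACTT"
)
SEQ_ID_37 = (
    "TGGTCAGCAACAACGCCGAAGAATCACTCTCGTGTTGAGAATTGCACGCC"
    "TTGACCACGACACGTTAAGGGATTTTGGTCATGAG"
)

OVERLAP_NT = 60  # overlap length described for the synthesized fragments

# The 5' tail of the reverse primer, read on the top strand, is the end of
# fragment "a"; the 5' tail of the forward primer is the start of fragment "b".
end_of_fragment_a = reverse_complement(SEQ_ID_45[:OVERLAP_NT])
start_of_fragment_b = SEQ_ID_37[:OVERLAP_NT]
overlap_ok = end_of_fragment_a == start_of_fragment_b

print("Junction a/b shares a 60-nt overlap:", overlap_ok)
```

The same check could be repeated for the other junctions to confirm that the six fragments form a closed circle before transformation.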
[0346] In total 15 plasmids were constructed which cover all the
possible combinations of class I and class II diterpene synthases
listed above. The table below shows all the plasmids.
TABLE-US-00003
Plasmid    Class II diterpene    Class I diterpene
name       synthase              synthase
Nm         SmCPS2                SsScS
Cf         CfCPS1del63           SsScS
Mv         MvCps3del63           SsScS
Ro         RoCPS1del67           SsScS
Ta         TaTps1del59           SsScS
Nt_Sm      SmCPS2                NgSCS-del38
Nt_Cf      CfCPS1del63           NgSCS-del38
Nt_Mv      MvCps3del63           NgSCS-del38
Nt_Ro      RoCPS1del67           NgSCS-del38
Nt_Ta      TaTps1del59           NgSCS-del38
Nt2_Sm     SmCPS2                NgSCS-del29
Nt2_Cf     CfCPS1del63           NgSCS-del29
Nt2_Mv     MvCps3del63           NgSCS-del29
Nt2_Ro     RoCPS1del67           NgSCS-del29
Nt2_Ta     TaTps1del59           NgSCS-del29
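The count of 15 plasmids follows from pairing each of the five class II synthases with each of the three class I synthases. A minimal sketch of that enumeration is given below; the variable names are illustrative only.

```python
from itertools import product

# Truncated class II (CPP) synthases and class I synthases named in the text.
CLASS_II = ["SmCPS2", "CfCPS1del63", "MvCps3del63", "RoCPS1del67", "TaTps1del59"]
CLASS_I = ["SsScS", "NgSCS-del38", "NgSCS-del29"]

# Every class I synthase is paired with every class II synthase.
combinations = [(class_ii, class_i) for class_i, class_ii in product(CLASS_I, CLASS_II)]

for class_ii, class_i in combinations:
    print(f"{class_ii:<12} + {class_i}")
print("Number of plasmids:", len(combinations))
```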
[0347] To increase the level of endogenous farnesyl-diphosphate
(FPP) pool in S. cerevisiae cells, an extra copy of all the yeast
endogenous genes involved in the mevalonate pathway, from ERG10
coding for acetyl-CoA C-acetyltransferase to ERG20 coding for FPP
synthetase, were integrated in the genome of the S. cerevisiae
strain CEN.PK2-1C (Euroscarf, Frankfurt, Germany) under the control
of galactose-inducible promoters, similarly to the approach described in
Paddon et al., Nature, 2013, 496:528-532. Briefly, three cassettes were
integrated in the LEU2, TRP1 and URA3 loci, respectively. The first
cassette contained the genes ERG20 and a truncated HMG1 (tHMG1), as
described in Donald et al., Proc Natl Acad Sci USA, 1997,
109:E111-8, under the control of the bidirectional promoter
GAL10/GAL1, and the genes ERG19 and ERG13, also under the control of
the GAL10/GAL1 promoter; this cassette was flanked by two
100-nucleotide regions corresponding to the up- and down-stream
sections of LEU2. In the second cassette, the genes IDI1 and tHMG1
were under the control of the GAL10/GAL1 promoter and the gene
ERG13 was under the control of the promoter region of GAL7; this
cassette was flanked by two 100-nucleotide regions corresponding
to the up- and down-stream sections of TRP1. The third cassette
contained the genes ERG10, ERG12, tHMG1 and ERG8, all under the
control of GAL10/GAL1 promoters; this cassette was flanked by two
100-nucleotide regions corresponding to the up- and down-stream
sections of URA3. All genes in the three cassettes included 200
nucleotides of their own terminator regions. Also, an extra copy of
GAL4 under the control of a mutated version of its own promoter, as
described in Griggs and Johnston, Proc Natl Acad Sci USA, 1991,
88:8597-8601, was integrated upstream of the ERG9 promoter region. In
addition, the endogenous promoter of ERG9 was replaced by the yeast
promoter region of CTR3, generating the strain YST035. Finally,
YST035 was mated with the strain CEN.PK2-1D (Euroscarf, Frankfurt,
Germany), yielding a diploid strain termed YST045.
[0348] YST045 was transformed with the above described fragments
required for in vivo plasmid assembly. Yeast transformations were
performed with the lithium acetate protocol as described in Gietz
and Woods, Methods Enzymol., 2002, 350:87-96. Transformation
mixtures were plated on SmLeu-media containing 6.7 g/L of Yeast
Nitrogen Base without amino acids (BD Difco, New Jersey, USA), 1.6
g/L Dropout supplement without leucine (Sigma Aldrich, Missouri,
USA), 20 g/L glucose and 20 g/L agar. Plates were incubated for 3-4
days at 30.degree. C. Single cells were used to produce manool in
cultures as described in Westfall et al., Proc Natl Acad Sci USA,
2012, 109:E111-118.
[0349] Under these culture conditions, manool was produced with
some combinations of type II and type I diterpene synthases. The
production of manool was evaluated using GC-MS analysis and
quantified using an internal standard. The table below shows the
quantities of manool produced relative to the SmCPS/SsScS
combination (under these experimental conditions, the concentration
of manool produced by cells expressing the SmCPS and the SsScS was
100 to 250 mg/L, the highest quantity of manool produced).
TABLE-US-00004
Class II diterpene    Class I diterpene    Relative quantity of
synthase              synthase             manool produced
SmCPS2                SsScS                100
CfCPS1del63           SsScS                67
MvCps3del63           SsScS                1
RoCPS1del67           SsScS                29
TaTps1del59           SsScS                16
SmCPS2                NgSCS-del38          0
CfCPS1del63           NgSCS-del38          0
MvCps3del63           NgSCS-del38          0
RoCPS1del67           NgSCS-del38          0
TaTps1del59           NgSCS-del38          0
SmCPS2                NgSCS-del29          0
CfCPS1del63           NgSCS-del29          0
MvCps3del63           NgSCS-del29          0
RoCPS1del67           NgSCS-del29          0
TaTps1del59           NgSCS-del29          0
TABLE-US-00005 Sequence Listing.
SEQ ID NO: 1 SmCPS, full-length copalyl diphosphate synthase from Salvia miltiorrhiza
MASLSSTILSRSPAARRRITPASAKLHRPECFATSAWMGSSSKNLSL
SYQLNHKKISVATVDAPQVHDHDGTTVHQGHDAVKNIEDPIEYIRTL
LRTTGDGRISVSPYDTAWVAMIKDVEGRDGPQFPSSLEWIVQNQLED
GSWGDQKLFCVYDRLVNTIACVVALRSWNVHAHKVKRGVTYKENVDK
LMEGNEEHMTCGFEWFPALLQKAKSLGIEDLPYDSPAVQEVYHVREQ
KLKRIPLEIMHKIPTSLLFSLEGLENLDWDKLLKLQSADGSFLTSPS
STAFAFMQTKDEKCYQFIKNTIDTFNGGAPHTYPVDWGRLWAIDRLQ
RLGISRFFEPEIADCLSHIHKFWTDKGVFSGRESEFCDIDDTSMGMR
LMRMHGYDVDPNVLRNFKQKDGKFSCYGGQMIESPSPIYNLYRASQL
RFPGEEILEDAKRFAYDFLKEKLANNQILDKWVISKHLPDEIKLGLE
MPWLATLPRVEAKYYIQYYAGSGDVWIGKTLYRMPEISNDTYHDLAK
TDFKRCQAKHQFEWLYMQEWYESCGIEEFGISRKDLLLSYFLATASI
FELERTNERIAWAKSQIIAKMITSFFNKETTSEEDKRALLNELGNIN
GLNDTNGAGREGGAGSIALATLTQFLEGFDRYTRHQLKNAWSVWLTQ
LQHGEADDAELLTNTLNICAGHIAFREEILAHNEYKALSNLTSKICR
QLSFIQSEKEMGVEGEIAAKSSIKNKELEEDMQMLVKLVLEKYGGID
RNIKKAFLAVAKTYYYRAYHAADTIDTHMFKVLFEPVA
SEQ ID NO: 2 SmCPS2, truncated copalyl diphosphate synthase from S. miltiorrhiza
MATVDAPQVHDHDGTTVHQGHDAVKNIEDPIEYIRTLLRTTGDGRIS
VSPYDTAWVAMIKDVEGRDGPQFPSSLEWIVQNQLEDGSWGDQKLFC
VYDRLVNTIACVVALRSWNVHAHKVKRGVTYIKENVDKLMEGNEEHM
TCGFEVVFPALLQKAKSLGIEDLPYDSPAVQEVYHVREQKLKRIPLE
IMHKIPTSLLFSLEGLENLDWDKLLKLQSADGSFLTSPSSTAFAFMQ
TKDEKCYQFIKNTIDTFNGGAPHTYPVDVFGRLWAIDRLQRLGISRF
FEPEIADCLSHIHKFWTDKGVFSGRESEFCDIDDTSMGMRLMRMHGY
DVDPNVLRNFKQKDGKFSCYGGQMIESPSPIYNLYRASQLRFPGEEI
LEDAKRFAYDFLKEKLANNQILDKWVISKHLPDEIKLGLEMPWLATL
PRVEAKYYIQYYAGSGDVWIGKTLYRMPEISNDTYHDLAKTDFKRCQ
AKHQFEWLYMQEWYESCGIEEFGISRKDLLLSYFLATASIFELERTN
ERIAWAKSQIIAKMITSFFNKETTSEEDKRALLNELGNINGLNDTNG
AGREGGAGSIALATLTQFLEGFDRYTRHQLKNAWSVWLTQLQHGEAD
DAELLTNTLNICAGHIAFREEILAHNEYKALSNLTSKICRQLSFIQS
EKEMGVEGEIAAKSSIKNKELEEDMQMLVKLVLEKYGGIDRNIKKAF
LAVAKTYYYRAYHAADTIDTHMFKVLFEPVA
SEQ ID NO: 3 SmCPS2opt, optimized cDNA for E. coli expression encoding for SmCPS2
ATGGCAACTGTTGACGCACCTCAAGTCCATGATCACGATGGCACCAC
CGTTCACCAGGGTCACGACGCGGTGAAGAACATCGAGGACCCGATCG
AATACATTCGTACCCTGCTGCGTACCACTGGTGATGGTCGCATCAGC
GTCAGCCCGTATGACACGGCGTGGGTGGCGATGATTAAAGACGTCGA
GGGTCGCGATGGCCCGCAATTTCCTTCTAGCCTGGAGTGGATTGTCC
AAAATCAGCTGGAAGATGGCTCGTGGGGTGACCAGAAGCTGTTTTGT
GTTTACGATCGCCTGGTTAATACCATCGCATGTGTGGTTGCGCTGC
GTAGCTGGAATGTTCACGCTCATAAAGTCAAACGTGGCGTGACGTAT
ATCAAGGAAAACGTGGATAAGCTGATGGAAGGCAACGAAGAACACAT
GACGTGTGGCTTCGAGGTTGTTTTTCCAGCCTTGCTGCAGAAAGCAA
AGTCCCTGGGTATTGAGGATCTGCCGTACGACTCGCCGGCAGTGCAA
GAAGTCTATCACGTCCGCGAGCAGAAGCTGAAACGCATCCCGCTGGA
GATTATGCATAAGATTCCGACCTCTCTGCTGTTCTCTCTGGAAGGTC
TGGAGAACCTGGATTGGGACAAACTGCTGAAGCTGCAGTCCGCTGAC
GGTAGCTTTCTGACCAGCCCGAGCAGCACGGCCTTTGCGTTTATGCA
GACCAAAGATGAGAAGTGCTATCAATTCATCAAGAATACTATTGATA
CCTTCAACGGTGGCGCACCGCACACGTACCCAGTAGACGTTTTTGGT
CGCCTGTGGGCGATTGACCGTTTGCAGCGTCTGGGTATCAGCCGTTT
CTTCGAGCCGGAGATTGCGGACTGCTTGAGCCATATTCACAAATTCT
GGACGGACAAAGGCGTGTTCAGCGGTCGTGAGAGCGAGTTCTGCGAC
ATCGACGATACGAGCATGGGTATGCGTCTGATGCGTATGCACGGTTA
CGACGTGGACCCGAATGTGTTGCGCAACTTCAAGCAAAAAGATGGCA
AGTTTAGCTGCTACGGTGGCCAAATGATTGAGAGCCCGAGCCCGATC
TATAACTTATATCGTGCGAGCCAACTGCGTTTCCCGGGTGAAGAAAT
TCTGGAAGATGCGAAGCGTTTTGCGTATGACTTCCTGAAGGAAAAGC
TCGCAAACAATCAAATCTTGGATAAATGGGTGATCAGCAAGCACTTG
CCGGATGAGATTAAACTGGGTCTGGAGATGCCGTGGTTGGCCACCCT
GCCGAGAGTTGAGGCGAAATACTATATTCAGTATTACGCGGGTAGCG
GTGATGTTTGGATTGGCAAGACCCTGTACCGCATGCCGGAGATCAGC
AATGATACCTATCATGACCTGGCCAAGACCGACTTCAAACGCTGTCA
AGCGAAACATCAATTTGAATGGTTATACATGCAAGAGTGGTACGAAA
GCTGCGGCATCGAAGAGTTCGGTATCTCCCGTAAAGATCTGCTGCTG
TCTTACTTTCTGGCAACGGCCAGCATTTTCGAGCTGGAGCGTACCAA
TGAGCGTATTGCCTGGGCGAAATCACAAATCATTGCTAAGATGATTA
CGAGCTTTTTCAATAAAGAAACCACGTCCGAGGAAGATAAACGTGCT
CTGCTGAATGAACTGGGCAACATCAACGGTCTGAATGACACCAACGG
TGCCGGTCGTGAGGGTGGCGCAGGCAGCATTGCACTGGCCACGCTGA
CCCAGTTCCTGGAAGGTTTCGACCGCTACACCCGTCACCAGCTGAAG
AACGCGTGGTCCGTCTGGCTGACCCAGCTGCAGCATGGTGAGGCAGA
CGACGCGGAGCTGCTGACCAACACGTTGAATATCTGCGCTGGCCATA
TCGCGTTTCGCGAAGAGATTCTGGCGCACAACGAGTACAAAGCCCTG
AGCAATCTGACCTCTAAAATCTGTCGTCAGCTTAGCTTTATTCAGAG
CGAGAAAGAAATGGGCGTGGAAGGTGAGATCGCGGCAAAATCCAGCA
TCAAGAACAAAGAACTGGAAGAAGATATGCAGATGTTGGTCAAGCTC
GTCCTGGAGAAGTATGGTGGCATCGACCGTAATATCAAGAAAGCGTT
TCTGGCCGTGGCGAAAACGTATTACTACCGCGCGTACCACGCGGCAG
ATACCATTGACACCCACATGTTTAAGGTTTTGTTTGAGCCGGTTGCTTAA
SEQ ID NO: 4 Full-length sclareol synthase from Salvia sclarea
MSLAFNVGVTPFSGQRVGSRKEKFPVQGFPVTTPNRSRLIVNCSLTT
IDFMAKMKENFKREDDKFPTTTTLRSEDIPSNLCIIDTLQRLGVDQF
FQYEINTILDNTFRLWQEKHKVIYGNVTTHAMAFRLLRVKGYEVSSE
ELAPYGNQEAVSQQTNDLPMIIELYRAANERIYEEERSLEKILAWTT
IFLNKQVQDNSIPDKKLHKLVEFYLRNYKGITIRLGARRNLELYDMT
YYQALKSTNRFSNLCNEDFLVFAKQDFDIHEAQNQKGLQQLQRWYAD
CRLDTLNFGRDVVIIANYLASLIIGDHAFDYVRLAFAKTSVLVTIMD
DFFDCHGSSQECDKIIELVKEWKENPDAEYGSEELEILFMALYNTVN
ELAERARVEQGRSVKEFLVKLWVEILSAFKIELDTWSNGTQQSFDEY
ISSSWLSNGSRLTGLLTMQFVGVKLSDEMLMSEECTDLARHVCMVGR
LLNDVCSSEREREENIAGKSYSILLATEKDGRKVSEDEAIAEINEMV
EYHWRKVLQIVYKKESILPRRCKDVFLEMAKGTFYAYGINDELTSPQ
QSKEDMKSFVF
SEQ ID NO: 5 Truncated sclareol synthase from Salvia sclarea (SsScS)
MAKMKENFKREDDKFPTTTTLRSEDIPSNLCIIDTLQRLGVDQFFQY
EINTILDNTFRLWQEKHKVIYGNVTTHAMAFRLLRVKGYEVSSEELA
PYGNQEAVSQQTNDLPMIIELYRAANERIYEEERSLEKILAWTTIFL
NKQVQDNSIPDKKLHKLVEFYLRNYKGITIRLGARRNLELYDMTYYQ
ALKSTNRFSNLCNEDFLVFAKQDFDIHEAQNQKGLQQLQRWYADCRL
DTLNFGRDVVIIANYLASLIIGDHAFDYVRLAFAKTSVLVTIMDDFF
DCHGSSQECDKIIELVKEWKENPDAEYGSEELEILFMALYNTVNELA
ERARVEQGRSVKEFLVKLWVEILSAFKIELDTWSNGTQQSFDEYISS
SWLSNGSRLTGLLTMQFVGVKLSDEMLMSEECTDLARHVCMVGRLLN
DVCSSEREREENIAGKSYSILLATEKDGRKVSEDEAIAEINEMVEYH
WRKVLQIVYKKESILPRRCKDVFLEMAKGTFYAYGINDELTSPQQSK
EDMKSFVF
SEQ ID NO: 6 1132-2-5_opt, optimized cDNA for E. coli expression encoding the truncated sclareol synthase from Salvia sclarea
ATGGCGAAAATGAAGGAGAACTTTAAACGCGAGGACGATAAATTCCC
GACGACCACGACCCTGCGCAGCGAGGATATCCCGAGCAACCTGTGCA
TCATTGATACCCTGCAGCGCCTGGGTGTCGATCAGTTCTTCCAATAC
GAAATCAATACCATTCTGGACAATACTTTTCGTCTGTGGCAAGAGAA
ACACAAAGTGATCTACGGCAACGTTACCACCCACGCGATGGCGTTCC
GTTTGTTGCGTGTCAAGGGCTACGAGGTTTCCAGCGAGGAACTGGCG
CCGTACGGTAATCAGGAAGCAGTTAGCCAACAGACGAATGATCTGCC
TATGATCATTGAGCTGTATCGCGCAGCAAATGAGCGTATCTACGAAG
AGGAACGCAGCCTGGAAAAGATCCTGGCGTGGACCACGATCTTCCTG
AACAAACAAGTTCAAGACAATTCTATTCCTGATAAGAAGCTGCATAA
ACTGGTCGAATTCTATCTGCGTAATTACAAGGGCATCACGATCCGTC
TGGGCGCACGCCGTAACCTGGAGTTGTATGATATGACGTATTACCAG
GCTCTGAAAAGCACCAATCGTTTCTCCAATCTGTGTAATGAGGATTT
TCTGGTGTTCGCCAAGCAGGATTTTGACATCCACGAGGCGCAAAATC
AAAAAGGTCTGCAACAACTGCAACGTTGGTACGCTGACTGTCGCCTG
GACACCCTGAATTTCGGTCGCGACGTTGTCATTATTGCAAACTATCT
GGCCAGCCTGATCATCGGTGATCACGCATTCGACTACGTCCGCCTGG
CCTTCGCTAAGACCAGCGTTCTGGTGACCATTATGGATGATTTCTTC
GATTGCCACGGTTCTAGCCAGGAATGCGACAAAATCATTGAGCTGGT
GAAAGAGTGGAAAGAAAACCCTGATGCGGAATACGGTTCCGAAGAGT
TGGAGATCCTGTTTATGGCCTTGTACAACACCGTGAATGAACTGGCC
GAGCGTGCTCGTGTGGAGCAGGGCCGTTCTGTGAAGGAGTTTTTGGT
CAAGTTGTGGGTGGAAATCCTGTCCGCGTTCAAGATCGAACTGGATA
CGTGGTCGAATGGTACGCAACAGAGCTTCGACGAATACATCAGCAGC
AGCTGGCTGAGCAATGGCAGCCGTCTGACCGGTTTGCTGACCATGCA
ATTTGTGGGTGTTAAACTGTCCGATGAAATGCTGATGAGCGAAGAAT
GCACCGACCTGGCACGCCATGTGTGTATGGTGGGTCGCCTGCTGAAC
GACGTCTGCAGCAGCGAACGTGAGCGCGAGGAAAACATTGCAGGCAA
GAGCTACAGCATCTTGTTGGCCACCGAGAAAGATGGTCGCAAAGTGT
CTGAGGACGAAGCAATTGCAGAGATTAATGAAATGGTCGAGTACCAC
TGGCGTAAGGTTTTGCAGATTGTGTATAAGAAAGAGAGCATCTTGCC
GCGTCGCTGTAAGGATGTTTTCTTGGAGATGGCGAAGGGCACGTTCT
ATGCGTACGGCATTAACGACGAGCTGACGAGCCCGCAACAATCGAAA
GAGGACATGAAGAGCTTCGTGTTCTGAGGTAC
SEQ ID NO: 7 GGPP synthase from Pantoea agglomerans
MVSGSKAGVSPHREIEVMRQSIDDHLAGLLPETDSQDIVSLAMREGV
MAPGKRIRPLLMLLAARDLRYQGSMPTLLDLACAVELTHTASLMLDD
MPCMDNAELRRGQPTTHKKFGESVAILASVGLLSKAFGLIAATGDLP
GERRAQAVNELSTAVGVQGLVLGQFRDLNDAALDRTPDAILSTNHLK
TGILFSAMLQIVAIASASSPSTRETLHAFALDFGQAFQLLDDLRDDH
PETGKDRNKDAGKSTLVNRLGADAARQKLREHIDSADKHLTFACPQG
GAIRQFMHLWFGHHLADWSPVMKIA
SEQ ID NO: 8 CrtEopt, optimized cDNA encoding for the GGPP synthase from Pantoea agglomerans.
ATGGTTTCTGGTTCGAAAGCAGGAGTATCACCTCATAGGGAAATCGA
AGTCATGAGACAGTCCATTGATGACCACTTAGCAGGATTGTTGCCAG
AAACAGATTCCCAGGATATCGTTAGCCTTGCTATGAGAGAAGGTGTT
ATGGCACCTGGTAAACGTATCAGACCTTTGCTGATGTTACTTGCTGC
AAGAGACCTGAGATATCAGGGTTCTATGCCTACACTACTGGATCTAG
CTTGTGCTGTTGAACTGACACATACTGCTTCCTTGATGCTGGATGAC
ATGCCTTGTATGGACAATGCGGAACTTAGAAGAGGTCAACCAACAAC
CCACAAGAAATTCGGAGAATCTGTTGCCATTTTGGCTTCTGTAGGTC
TGTTGTCGAAAGCTTTTGGCTTGATTGCTGCAACTGGTGATCTTCCA
GGTGAAAGGAGAGCACAAGCTGTAAACGAGCTATCTACTGCAGTTGG
TGTTCAAGGTCTAGTCTTAGGACAGTTCAGAGATTTGAATGACGCAG
CTTTGGACAGAACTCCTGATGCTATCCTGTCTACGAACCATCTGAAG
ACTGGCATCTTGTTCTCAGCTATGTTGCAAATCGTAGCCATTGCTTC
TGCTTCTTCACCATCTACTAGGGAAACGTTACACGCATTCGCATTGG
ACTTTGGTCAAGCCTTTCAACTGCTAGACGATTTGAGGGATGATCAT
CCAGAGACAGGTAAAGACCGTAACAAAGACGCTGGTAAAAGCACTCT
AGTCAACAGATTGGGTGCTGATGCAGCTAGACAGAAACTGAGAGAGC
ACATTGACTCTGCTGACAAACACCTGACATTTGCATGTCCACAAGGA
GGTGCTATAAGGCAGTTTATGCACCTATGGTTTGGACACCATCTTGC
TGATTGGTCTCCAGTGATGAAGATCGCCTAA
SEQ ID NO: 9 Forward primer SmCPS2-1132Inf_F1
CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGC
GAAAATGAAGGAGAACTTTAAACG
SEQ ID NO: 10 Reverse primer 1132-pET_Inf_R1
GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCA
TGTCCTCT
SEQ ID NO: 11 CfCPS1, full-length copalyl diphosphate synthase from Coleus forskohlii
MGSLSTMNLNHSPMSYSGILPSSSAKAKLLLPGCFSISAWMNNGKNL
NCQLTHKKISKVAE1RVATVNAPPVHDQDDSTENQCHDAVNNIEDPI
EYIRTLLRTTGDGRISVSPYDTAWVALIKDLQGRDAPEFPSSLEWII
QNQLADGSWGDAKFFCVYDRLVNTIACVVALRSWDVHAEKVERGVRY
INENVEKLRDGNEEHMTCGFEVVFPALLQRAKSLGIQDLPYDAPVIQ
EIYHSREQKSKRIPLEMMHKVPTSLLFSLEGLENLEWDKLLKLQSAD
GSFLTSPSSTAFAFMQTRDPKCYQFIKNTIQTFNGGAPHTYPVDVFG
RLWAIDRLQRLGISRFFESEIADCIAHIHRFWTEKGVFSGRESEFCD
IDDTSMGVRLMRMHGYDVDPNVLKNFKKDDKFSCYGGQMIESPSPIY
NLYRASQLRFPGEQILEDANKFAYDFLQEKLAHNQILDKWVISKHLP
DEIKLGLEMPWYATLPRVEARYYIQYYAGSGDVWIGKTLYRMPEISN
DTYFIELAKTDFKRCQAQHQFEWIYMQEWYESCNMEEFGISRKELLV
AYFLATASIFELERANERIAWAKSQIISTIIASFFNNQNTSPEDKLA
FLTDFKNGNSTNMALVTLTQFLEGFDRYTSHQLKNAWSVWLRKLQQG
IEGNGGADAELLVNTLNICAGHIAFREELAHNDYKTLSNLTSKICRQ
LSQIQNEKELETEGQKTSIKNKELEEDMQRLVKLVLEKSRVGINRDM
KKTFLAVVKTYYYKAYHSAQAIDNHMFKVLFEPVA
SEQ ID NO: 12 CfCPS1-del63, truncated copalyl diphosphate synthase from Coleus forskohlii
MVATVNAPPVHDQDDSTENQCHDAVNNIEDPIEYIRTLLRTTGDGRI
SVSPYDTAWVALIKDLQGRDAPEFPSSLEWIIQNQLADGSWGDAKFF
CVYDRLVNTIACVVALRSWDVHAEKVERGVRYINENVEKLRDGNEEH
MTCGFEVVFPALLQRAKSLGIQDLPYDAPVIQEIYHSREQKSKRIPL
EMMHKVPTSLLFSLEGLENLEWDKLLKLQSADGSFLTSPSSTAFAFM
QTRDPKCYQFIKNTIQTFNGGAPHTYPVDVFGRLWAIDRLQRLGISR
FFESEIADCIAHIHRFWTEKGVFSGRESEFCDIDDTSMGVRLMRMHG
YDVDPNVLKNFKKDDKFSCYGGQMIESPSPIYNLYRASQLRFPGEQI
LEDANKFAYDFLQEKLAHNQILDKWVISKHLPDEIKLGLEMPWYATL
PRVEARYYIQYYAGSGDVWIGKTLYRMPEISNDTYHELAKTDFKRCQ
AQHQFEWIYMQEWYESCNMEEFGISRKELLVAYFLATASIFELERAN
ERIAWAKSQIISTIIASFFNNQNTSPEDKLAFLTDFKNGNSTNMALV
TLTQFLEGFDRYTSHQLKNAWSVWLRKLQQGEGNGGADAELLVNTLN
ICAGHIAFREEILAHNDYKTLSNLTSKICRQLSQIQNEKELETEGQK
TSIKNKELEEDMQRLVKLVLEKSRVGINRDMKKTFLAVVKTYYYKAY
HSAQAIDNHMFKVLFEPVA
SEQ ID NO: 13 Optimized cDNA for E. coli expression encoding for CfCPS1-del63
ATGGTCGCTACTGTCAATGCTCCACCGGTCCACGATCAAGACGACAG
CACTGAGAATCAATGTCATGATGCCGTAAACAATATTGAAGATCCAA
TCGAGTATATCCGTACCCTGTTGCGCACGACGGGTGATGGTCGTATC
AGCGTCAGCCCGTACGATACCGCGTGGGTGGCGCTGATCAAAGATCT
GCAGGGCCGTGACGCACCGGAGTTTCCGTCCTCTCTTGAGTGGATCA
TTCAAAACCAGCTGGCCGACGGTTCTTGGGGCGACGCCAAATTTTTC
TGCGTGTATGACCGTCTGGTGAACACCATCGCGTGCGTCGTTGCGCT
GCGTTCCTGGGACGTCCACGCGGAAAAAGTTGAGCGTGGCGTGCGCT
ATATCAACGAAAATGTCGAAAAGCTGCGCGACGGTAATGAAGAACAC
ATGACCTGTGGCTTTGAAGTTGTTTTCCCGGCGCTCCTGCAGCGCGC
GAAGTCTCTGGGTATTCAAGATCTGCCGTACGATGCTCCGGTGATCC
AAGAGATTTATCACTCTCGTGAGCAGAAGTCCAAGCGTATCCCGTTG
GAGATGATGCACAAAGTTCCGACGAGCCTGCTGTTCAGCTTGGAAGG
CCTGGAAAATCTGGAGTGGGACAAACTGCTGAAGCTGCAGAGCGCGG
ACGGTAGCTTCCTGACGAGCCCGAGCAGCACCGCATTTGCATTTATG
CAGACCCGTGACCCGAAGTGTTACCAATTTATTAAGAACACGATTCA
GACGTTTAACGGTGGTGCACCGCATACCTATCCGGTAGACGTCTTTG
GTCGCCTGTGGGCAATTGATCGTCTGCAGCGTTTGGGTATCAGCCGC
TTCTTCGAAAGCGAAATTGCAGATTGTATCGCACACATCCATCGTTT
TTGGACCGAGAAAGGCGTCTTTAGCGGCCGTGAGTCTGAGTTCTGTG
ACATCGATGACACGAGCATGGGTGTCCGTCTGATGCGTATGCATGGC
TATGATGTTGACCCGAACGTGCTGAAGAATTTTAAAAAAGATGACAA
GTTTAGCTGCTACGGCGGTCAGATGATTGAGAGCCCGAGCCCGATTT
ATAATCTGTACCGCGCGAGCCAACTGCGTTTCCCGGGTGAACAGATT
CTGGAAGATGCCAATAAATTCGCGTATGATTTCCTGCAGGAAAAACT
GGCGCACAATCAGATCCTGGATAAATGGGTTATCAGCAAGCATCTGC
CTGACGAAATCAAATTGGGCCTGGAGATGCCGTGGTATGCGACCTTG
CCGCGTGTCGAAGCGCGTTACTACATCCAGTACTATGCGGGTAGCGG
CGATGTCTGGATTGGTAAGACGCTGTACCGTATGCCAGAGATTAGCA
ACGACACCTACCATGAATTGGCAAAGACCGATTTCAAGCGTTGCCAA
GCCCAACACCAGTTCGAGTGGATTTACATGCAAGAGTGGTACGAGTC
GTGCAACATGGAAGAGTTCGGTATTAGCCGCAAAGAACTGCTGGTTG
CATATTTCCTGGCCACGGCGAGCATCTTTGAGCTGGAGCGTGCGAAT
GAACGCATTGCATGGGCAAAAAGCCAAATCATTTCTACCATTATCGC
TTCGTTCTTTAATAACCAAAATACGAGCCCTGAGGATAAACTGGCGT
TTCTGACTGATTTCAAAAATGGCAACAGCACCAACATGGCTCTGGTG
ACCCTGACCCAGTTCCTGGAAGGCTTTGACCGCTACACTTCCCATCA
ACTGAAAAACGCGTGGAGCGTTTGGCTGCGTAAGCTGCAACAGGGTG
AGGGTAATGGCGGTGCCGACGCCGAGTTACTGGTGAATACGCTGAAC
ATTTGCGCGGGTCACATCGCGTTCCGTGAAGAAATTCTGGCACATAA
TGACTATAAAACGTTGTCGAACCTGACCAGCAAGATTTGTCGCCAGC
TGAGCCAGATTCAGAATGAAAAAGAATTGGAAACCGAAGGCCAAAAG
ACTTCCATTAAGAACAAAGAACTGGAAGAAGATATGCAGCGCCTGGT
TAAACTGGTTTTGGAGAAAAGCCGTGTGGGTATCAATCGTGACATGA
AGAAAACGTTCCTGGCTGTGGTGAAAACCTACTATTACAAAGCATAC
CACTCCGCGCAGGCAATCGATAACCACATGTTCAAGGTTCTGTTCGA
ACCGGTGGCCTAA
SEQ ID NO: 14 TaTps1, full-length copalyl diphosphate synthase from Triticum aestivum.
MLTFTAALRHVPVLDQPTSEPWRRLSLHLHSQRRPCGLVLISKSPSY
PEVDVGEWKVDEYRQRTDEPSETRQMIDDIRTALASLGDDETSMSVS
AYDTALVALVKNLDGGDGPQFPSCIDWIVQNQLPDGSWGDPAFFMVQ
DRMISTLACVVAVKSWNIDRDNLCDRGVLFIKENMSRLVEEEQDWMPC
GFEINFPALLEKAKDLDLDIPYDHPVLEEIYAKRNLKLLKIPLDVLH
AIPTTLLFSVEGMVDLPLDWEKLLRLRCPDGSFHSSPAATAAALSHT
GDKECHAFLDRLIQKFEGGVPCSHSMDTFEQLWVVDRLMRLGISRHF
TSEIQQCLEFIYRRWTQKGLAHNMHCPIPDIDDTAMGFRLLRQHGYD
VTPSVFKHFEKDGKFVCFPMETNHASVTPMHNTYRASQFMFPGDDDV
LARAGRYCRAFLQERQSSNKLYDKWIITKDLPGEVGYTLNFPWKSSL
PRIETRMYLDQYGGNNDVWIAKVLYRMNLVSNDLYLKMAKADFTEYQ
RLSRIEWNGLRKWYFRNHLQRYGATPKSALKAYFLASANIFEPGRAA
ERLAWARMAVLAEAVTTHFRHIGGPCYSTENLEELIDLVSFDDVSGG
LREAWKQWLMAWTAKESHGSVDGDTALLFVRTIEICSGRIVSSEQKL
NLWDYSQLEQLTSSICHKLATIGLSQNEASMENTEDLHQQVDLEMQE
LSWRVHQGCHGINRETRQTFLNVVKSFYYSAHCSPETVDSHIAKVIF
QDVI
SEQ ID NO: 15 TaTps1-del59, truncated copalyl diphosphate synthase from Triticum aestivum.
MYRQRTDEPSETRQMIDDIRTALASLGDDETSMSVSAYDTALVALVK
NLDGGDGPQFPSCIDWIVQNQLPDGSWGDPAFFMVQDRMISTLACVV
AVKSWNIDRDNLCDRGVLFIKENMSRLVEEEQDWMPCGFEINFPALL
EKAKDLDLDIPYDHPVLEEIYAKRNLKLLKIPLDVLHAIPTTLLFSV
EGMVDLPLDWEKLLRLRCPDGSFHSSPAATAAALSHTGDKECHAFLD
RLIQKFEGGVPCSHSMDTFEQLWVVDRLMRLGISRHFTSEIQQCLEF
IYRRWTQKGLAHNMHCPIPDIDDTAMGFRLLRQHGYDVTPSVFKHFE
KDGKFVCFPMETNHASVTPMHNTYRASQFMFPGDDDVLARAGRYCRA
FLQERQSSNKLYDKWIITKDLPGEVGYTLNFPWKSSLPRIETRMYLD
QYGGNNDVWIAKVLYRMNLVSNDLYLKMAKADFTEYQRLSRIEWNGL
RKWYFRNHLQRYGATPKSALKAYFLASANIFEPGRAAERLAWARMAV
LAEAVTTHFRHIGGPCYSTENLEELIDLVSFDDVSGGLREAWKQWLM
AWTAKESHGSVDGDTALLFVRTIEICSGRIVSSEQKLNLWDYSQLEQ
LTSSICHKLATIGLSQNEASMENTEDLHQQVDLEMQELSWRVHQGCH
GINRETRQTFLNVVKSFYYSAHCSPETVDSHIAKVIFQDVI
SEQ ID NO: 16 Optimized cDNA for E. coli expression encoding for TaTps1-del59
ATGTATCGCCAAAGAACTGATGAGCCAAGCGAAACCCGCCAGATGAT
CGATGATATTCGCACCGCTTTGGCTAGCCTGGGTGACGATGAAACCA
GCATGAGCGTGAGCGCATACGACACCGCCCTGGTTGCCCTGGTGAAG
AACCTGGACGGTGGCGATGGCCCGCAGTTCCCGAGCTGCATTGACTG
GATTGTTCAGAACCAGCTGCCGGACGGTAGCTGGGGCGACCCGGCTT
TCTTTATGGTTCAGGACCGTATGATCAGCACCCTGGCCTGTGTCGTG
GCCGTGAAATCCTGGAATATCGATCGTGACAACTTGTGCGATCGTGG
TGTCCTGTTTATCAAAGAAAACATGTCGCGTCTGGTTGAAGAAGAAC
AAGATTGGATGCCATGTGGCTTCGAGATTAACTTTCCTGCACTGTTG
GAGAAAGCTAAAGACCTGGACTTGGACATTCCGTACGATCATCCTGT
GCTGGAAGAGATTTACGCGAAGCGTAATCTGAAACTGCTGAAGATTC
CGTTAGATGTCCTCCATGCGATCCCGACGACGCTGTTGTTTTCCGTT
GAGGGTATGGTCGATCTGCCGCTGGATTGGGAGAAACTGCTGCGTCT
GCGTTGCCCGGACGGTTCTTTTCATTCTAGCCCGGCGGCGACGGCAG
CGGCGCTGAGCCACACGGGTGACAAAGAGTGTCACGCCTTCCTGGAC
CGCCTGATTCAAAAGTTCGAGGGTGGCGTCCCGTGCTCCCACAGCAT
GGACACCTTCGAGCAACTGTGGGTTGTTGACCGTTTGATGCGTCTGG
GTATCAGCCGTCATTTTACGAGCGAGATCCAGCAGTGCTTGGAGTTC
ATCTATCGTCGTTGGACCCAGAAAGGTCTGGCGCACAATATGCACTG
CCCGATCCCGGACATTGATGACACTGCGATGGGTTTTCGTCTGTTGA
GACAGCACGGTTACGACGTGACCCCGTCGGTTTTCAAGCATTTCGAG
AAAGACGGCAAGTTCGTATGCTTCCCGATGGAAACCAACCATGCGAG
CGTGACGCCGATGCACAATACCTACCGTGCGAGCCAGTTCATGTTCC
CGGGTGATGACGACGTGCTGGCCCGTGCCGGCCGCTACTGTCGCGCA
TTCTTGCAAGAGCGTCAGAGCTCTAACAAGTTGTACGATAAGTGGAT
TATCACGAAAGATCTGCCGGGTGAGGTTGGCTACACGCTGAACTTTC
CGTGGAAAAGCTCCCTGCCGCGTATTGAAACTCGTATGTATCTGGAT
CAGTACGGTGGCAATAACGATGTCTGGATTGCAAAGGTCCTGTATCG
CATGAACCTGGTTAGCAATGACCTGTACCTGAAAATGGCGAAAGCCG
ACTTTACCGAGTATCAACGTCTGTCTCGCATTGAGTGGAACGGCCTG
CGCAAATGGTATTTTCGCAATCATCTGCAGCGTTACGGTGCGACCCC
GAAGTCCGCGCTGAAAGCGTATTTCCTGGCGTCGGCAAACATCTTTG
AGCCTGGCCGCGCAGCCGAGCGCCTGGCATGGGCACGTATGGCCGTG
CTGGCTGAAGCTGTAACGACTCATTTCCGTCACATTGGCGGCCCGTG
CTACAGCACCGAGAATCTGGAAGAACTGATCGACCTTGTTAGCTTCG
ACGACGTGAGCGGCGGCTTGCGTGAGGCGTGGAAGCAATGGCTGATG
GCGTGGACCGCAAAAGAATCACACGGCAGCGTGGACGGTGACACGGC
ACTGCTGTTTGTCCGCACGATTGAGATTTGCAGCGGCCGCATCGTTT
CCAGCGAGCAGAAACTGAATCTGTGGGATTACAGCCAGTTAGAGCAA
TTGACCAGCAGCATCTGTCATAAACTGGCCACCATCGGTCTGAGCCA
GAACGAAGCTAGCATGGAAAATACCGAAGATCTGCACCAACAAGTCG
ATTTGGAAATGCAAGAACTGTCATGGCGTGTTCACCAGGGTTGTCAC
GGTATTAATCGCGAAACCCGTCAAACCTTCCTGAATGTTGTTAAGTC
TTTTTATTACTCCGCACACTGCAGCCCGGAAACCGTGGACAGCCATA
TTGCAAAAGTGATCTTTCAAGACGTTATCTGA
SEQ ID NO: 17 MvCps3, full-length copalyl diphosphate synthase from Marrubium vulgare.
MGSLSTLNLIKTCVTLASSEKLNQPSQCYTISTCMKSSNNPPFNYYQ
INGRKKMSTAIDSSVNAPPEQKYNSTALEHDTEIIEIEDHIECIRRL
LRTAGDGRISVSPYDTAWIALIKDLDGHDSPQFPSSMEWVADNQLPD
GSWGDEHFVCVYDRLVNTIACVVALRSWNVHAHKCEKGIKYIKENVH
KLEDANEEHMTCGFEVVFPALLQRAQSMGIKGIPYNAPVIEEIYNSR
EKKLKRIPMEVVHKVATSLLFSLEGLENLEWEKLLKLQSPDGSFLTS
PSSTAFAFIHTKDRKCFNFINNIVHTFKGGAPHTYPVDIFGRLWAVD
RLQRLGISRFFESEIAEFLSHVHRFWSDEAGVFSGRESVFCDIDDTS
HMGLRLLRMHGYHVDPNVLKNFKQSDKFSCYGGQMMECSSPIYNLYR
ASQLQFPGEEILEEANKFAYKFLQEKLESNQILDKWLISNLSDEIKV
GLEMPWYATLPRVETSYYIHHYGGGDDVWIGKTLYRMPEISNDTYRE
LARLDFRRCQAQHQLEWIYMQRWYESCRMQEFGISRKEVLRAYFLAS
GTIFEVERAKERVAWARSQIISHMIKSFFNKETTSSDQKQALLTELL
FGNISASETEKRELDGVVVATLRQFLEGFDIGTRHQVKAAWDVWLRKV
EQGEAHGGADAELCTTTLNTCANQHLSSHPDYNTLSKLTNKICHKLS
QIQHQKEMKGGIKAKCSINNKEVDIEMQWLVKLVLEKSGLNRKAKQA
FLSIAKTYYYRAYYADQTMDAHEFKVLFEPVV
SEQ ID NO: 18 MvCps3-del63, truncated copalyl diphosphate synthase from Marrubium vulgare
MAPPEQKYNSTALEHDTEIIEIEDHIECIRRLLRTAGDGRISVSPYD
TAWIALIKDLDGHDSPQFPSSMEWVADNQLPDGSWGDEHFVCVYDRL
VNTIACVVALRSWNVHAHKCEKGIKYIKENVHKLEDANEEHMTCGFEV
VFPALLQRAQSMGIKGIPYNAPVIEEIYNSREKKLKRIPMEVVHKVA
TSLLFSLEGLENLEWEKLLKLQSPDGSFLTSPSSTAFAFIHTKDRKC
FNFINNIVHTFKGGAPHTYPVDIFGRLWAVDRLQRLGISRFFESEIA
EFLSHVHRFWSDEAGVFSGRESVFCDIDDTSMGLRLLRMHGYHVDPN
VLKNFKQSDKFSCYGGQMMECSSPIYNLYRASQLQFPGEEILEEANK
FAYKFLQEKLESNQILDKWLISNHLSDEIKVGLEMPWYATLPRVETS
YYIHHYGGGDDVWIGKTLYRMPEISNDTYRELARLDFRRCQAQHQLE
WIYMQRWYESCRMQEFGISRKEVLRAYFLASGTIFEVERAKERVAWA
RSQIISHMIKSFFNKETTSSDQKQALLTELLFGNISASETEKRELDG
VVVATLRQFLEGFDIGTRHQVKAAWDVWLRKVEQGEAHGGADAELCT
TTLNTCANQHLSSHPDYNTLSKLTNKICHKLSQIQHQKEMKGGIKAK
CSINNKEVDIEMQWLVKLVLEKSGLNRKAKQAFLSIAKTYYYRAYYA
DQTMDAHIFKVLFEPVV
SEQ ID NO: 19 Optimized cDNA for E. coli expression encoding for MvCps3-del63
ATGGCCCCGCCGGAACAAAAGTACAACAGCACTGCATTAGAACACGA
CACCGAGATTATTGAGATCGAGGACCACATCGAGTGTATCCGCCGTC
TGCTGCGTACCGCGGGTGATGGTCGTATTAGCGTGAGCCCGTATGAT
ACCGCGTGGATTGCACTGATTAAAGATTTGGATGGCCACGACTCCCC
GCAATTCCCGTCGAGCATGGAATGGGTTGCTGATAATCAGCTGCCGG
ACGGTAGCTGGGGTGACGAGCACTTCGTTTGCGTTTACGATCGCCTG
GTTAATACCATCGCATGCGTCGTGGCGCTGCGCAGCTGGAATGTCCA
TGCACATAAGTGCGAGAAAGGTATTAAGTACATTAAAGAAAATGTCC
ACAAACTGGAAGATGCGAACGAAGAACACATGACTTGCGGCTTCGAA
GTCGTTTTTCCGGCCTTGCTGCAGCGTGCACAGAGCATGGGTATTAA
GGGCATCCCGTACAACGCGCCTGTCATTGAAGAAATTTACAATTCCC
GTGAGAAAAAGCTGAAACGTATTCCGATGGAAGTTGTCCACAAAGTC
GCGACCAGCCTGCTGTTCTCCCTGGAAGGTCTGGAGAACCTGGAGTG
GGAGAAATTGCTGAAACTGCAGAGCCCGGACGGTTCGTTTCTGACCA
GCCCGAGCTCTACGGCATTCGCGTTTATCCATACCAAAGACCGTAAA
TGTTTTAACTTTATTAACAATATCGTTCATACCTTTAAGGGTGGTGC
ACCGCACACGTACCCTGTGGACATCTTTGGCCGCCTGTGGGCAGTGG
ATCGCTTGCAGCGTCTGGGTATTAGCCGCTTCTTCGAGAGCGAGATC
GCGGAATTTCTGAGCCACGTGCACCGTTTTTGGAGCGACGAAGCGGG
CGTTTTCAGCGGCCGTGAGAGCGTGTTCTGTGATATTGATGACACCA
GCATGGGTCTGCGCCTGCTTCGTATGCATGGCTACCATGTAGACCCA
AACGTTCTGAAGAACTTCAAGCAATCTGACAAGTTTAGCTGCTACGG
TGGCCAGATGATGGAATGCAGCAGCCCAATTTACAATCTGTACCGTG
CGAGCCAACTGCAATTTCCGGGTGAAGAAATCTTGGAAGAGGCTAAC
AAATTCGCGTATAAGTTTTTGCAAGAGAAACTGGAGTCCAATCAGAT
TCTGGACAAGTGGCTGATCTCCAACCACCTGAGCGACGAAATCAAAG
TTGGCCTGGAAATGCCGTGGTATGCGACCTTGCCGCGCGTTGAGACT
AGCTATTATATTCACCATTACGGCGGTGGCGACGATGTGTGGATTGG
TAAAACGCTGTATCGCATGCCGGAAATTAGCAACGACACCTACCGTG
AGCTGGCACGTCTGGACTTCCGCCGCTGCCAGGCGCAGCACCAGTTG
GAATGGATCTATATGCAACGTTGGTATGAGAGCTGTCGTATGCAAGA
ATTTGGTATTTCCCGCAAAGAAGTCCTGCGTGCCTACTTCCTGGCCT
CTGGCACGATTTTCGAAGTTGAGCGCGCCAAAGAGCGCGTGGCGTGG
GCTCGTAGCCAAATCATTTCCCACATGATCAAGAGCTTCTTCAATAA
AGAAACCACGAGCAGCGATCAGAAACAAGCGCTGCTGACCGAGTTGC
TGTTTGGTAACATCTCTGCAAGCGAGACTGAGAAACGTGAGCTGGAT
GGTGTTGTGGTTGCGACCCTGCGTCAGTTCCTGGAAGGCTTCGATAT
CGGCACCCGTCACCAAGTGAAGGCAGCGTGGGATGTGTGGCTGCGTA
AAGTCGAACAGGGTGAGGCACATGGTGGCGCGGACGCCGAGTTGTGT
ACGACGACGCTGAACACGTGCGCGAATCAGCATCTGTCTAGCCATCC
GGACTACAATACCCTGTCGAAACTCACCAATAAGATTTGTCACAAGC
TGTCCCAAATCCAGCATCAGAAAGAAATGAAGGGCGGTATTAAGGCA
AAGTGCTCTATCAATAACAAAGAAGTGGATATCGAGATGCAATGGCT
GGTCAAACTGGTCCTGGAGAAATCCGGTCTGAACCGCAAGGCTAAAC
AAGCGTTTCTGAGCATTGCCAAAACCTATTATTATCGTGCTTACTAT
GCCGACCAGACGATGGATGCCCACATCTTCAAGGTCCTGTTTGAACC
GGTCGTGTAA
SEQ ID NO: 20 RoCPS1, full-length copalyl diphosphate synthase from Rosmarinus officinalis
MTSMSSLNLSRAPAISRRLQLPAKVQLPEFYAVCSWLNNSSKHTPLS
CHIHRKQLSKVTKCRVASLDASQVSEKGTSSPVQTPEEVNEKIENYI
EYIKNLLTTSGDGRISVSPYDTSIVALIKDLKGRDTPQFPSCLEWIA
QHQMADGSWGDEFFCIYDRILNTLACVVALKSWNVHADMIEKGVTYVN
ENVQKLEDGNLEHMTSGFEIVVPALVQRAQDLGIQGLPYDHPLIKEI
ANTKEGRLKKIPKDMIYQKPTTLLFSLEGLGDLEWEKILKLQSGDGS
FLTSPSSTAHVFMKTKDEKCLKFIENAVKNCNGGAPHTYPVDVFARL
WAVDRLQRLGISRFFQQEIKYFLDHINSVWTENGVFSGRDSEFCDID
DTSMGIRLLKMHGYDIDPNALEHFKQQDGKFSCYGGQMIESASPIYN
LYRAAQLRFPGEEILEEATKFAYNFLQEKIANDQFQEKWVISDHLID
EVKLGLKMPWYATLPRVEAAYYLQYYAGCGDVWIGKVFYRMPEIS
NDTYKKLAILDFNRCQAQHQFEWIYMQEWYHRSSVSEFGISKKDLLR
AYFLAAATIFEPERTQERLVWAKTQIVSGMITSFVNSGTTLSLHQKT
ALLSQIGHNFDGLDEIISAMKDHGLAATLLTTFQQLLDGFDRYTRHQ
LKNAWSQWFMKLQQGEASGGEDAELLANTLNICAGLIAFNEDVLSHH
EYTTLSTLTNKICKRLTQIQDKKTLEVVDGSIKDKELEKDIQMLVKL
VLEENGGGVDRNIKHTFLSVFKTFYYNAYHDDETTDVHIFKVLFGPV
V
SEQ ID NO: 21 RoCPS1-del67, truncated copalyl diphosphate synthase from Rosmarinus officinalis
MASQVSEKGTSSPVQTPEEVNEKIENYIEYIKNLLTTSGDGRISVSP
YDTSIVALIKDLKGRDTPQFPSCLEWIAQHQMADGSWGDEFFCIYDR
ILNTLACVVALKSWNVHADMIEKGVTYVNENVQKLEDGNLEHMTSGF
EIVVPALVQRAQDLGIQGLPYDHPLIKEIANTKEGRLKKIPKDMIYQ
KPTTLLFSLEGLGDLEWEKILKLQSGDGSFLTSPSSTAHVFMKTKDE
KCLKFIENAVKNCNGGAPHTYPVDVFARLWAVDRLQRLGISRFFQQE
IKYFLDHINSVWTENGVFSGRDSEFCDIDDTSMGIRLLKMHGYDIDP
NALEHFKQQDGKFSCYGGQMIESASPIYNLYRAAQLRFPGEEILEEA
TKFAYNFLQEKIANDQFQEKWVISDHLIDEVKLGLKMPWYATLPRVE
AAYYLQYYAGCGDVWIGKVFYRMPEISNDTYKKLAILDFNRCQAQHQ
FEWIYMQEWYHRSSVSEFGISKKDLLRAYFLAAATIFEPERTQERL
VWAKTQIVSGMITSFVNSGTTLSLHQKTALLSQIGHNFDGLDEIISA
MKDHGLAATLLTTFQQLLDGFDRYTRHQLKNAWSQWFMKLQQGEASG
GEDAELLANTLNICAGLIAFNEDVLSHHEYTTLSTLTNKICKRLTQI
QDKKTLEVVDGSIKDKELEKDIQMLVKLVLEENGGGVDRNIKHTFLSV
FKTFYYNAYHDDETTDVHIFKVLFGPVV
SEQ ID NO: 22 Optimized cDNA for E. coli expression encoding for RoCPS1-del67
ATGGCATCACAAGTTAGCGAGAAAGGCACCAGCTCCCCAGTTCAAAC
GCCAGAGGAAGTGAACGAAAAGATCGAGAATTACATTGAGTATATTA
AAAATCTGCTGACTACTTCGGGCGACGGCCGCATCAGCGTCAGCCCG
TACGACACGAGCATCGTTGCCCTGATTAAAGACCTGAAGGGTCGTGA
CACCCCGCAGTTTCCGTCCTGTCTGGAGTGGATTGCCCAACACCAAA
TGGCCGATGGTTCCTGGGGTGATGAATTTTTCTGCATTTACGACCGC
GATCCTGAATACGCTGGCTTGTGTTGTCGCCCTGAAGTCCTGGAATT
TCATGCAGACATGATCGAAAAGGGTGTCACTTACGTTAACGAAAACG
TGCAGAAACTGGAAGATGGCAATCTGGAGCACATGACGAGCGGTTTC
CGAGATTGTTGTCCCGGCGCTGGTTCAGAGAGCGCAAGACCTGGGCA
TCCAGGGCCTGCCGTATGATATCCGTTGATCAAAGAAATCGCAAACA
CCAAAGAGGGCCGCCTGAAGAAAATTCCTAAAGACATGATTTATCAG
AAACCGACTACGCTGCTGTTCAGCCTGGAAGGCTTGGGCGACCTGGA
GTGGGAAAAGATCCTGAAGTTACAGTCTGGTGATGGTTCTTTCCTGA
CCAGCCCGAGCTCTACGGCCCATGTTTTCATGAAAACCAAAGATGAG
AAGTGTCTGAAGTTTATTGAAAATGCCGTCAAGAATTGCAACGGTGG
CGCGCCTCACACCTACCCGGTGGACGTTTTCGCTCGTCTGTGGGCCG
TCGATCGTCTGCAACGCCTGGGCATCTCGCGTTTCTTCCAGCAAGAG
ATTAAGTACTTCCTGGACCACATTAATAGCGTGTGGACCGAAAACGG
CGTTTTCAGCGGTCGCGACAGCGAGTTTTGTGATATTGATGACACCT
CTATGGGTATCCGTTTGCTGAAGATGCACGGTTACGACATTGACCCG
AATGCCCTGGAGCACTTTAAACAACAGGATGGTAAGTTCTCCTGCTA
CGGTGGTCAGATGATTGAGAGCGCGAGCCCGATCTACAACCTGTACC
GTGCTGCGCAGCTGCGTTTTCCGGGTGAAGAGATTCTGGAAGAGGCC
ACCAAATTTGCGTATAATTTTTTGCAAGAGAAAATTGCAAACGACCA
AATTCCAGGAAAAATGGGTTATTAGCGATCACCTTATCGATGAAGTG
AAAACTGGGTTTGAAGATGCCGTGGTACGCGCGCTGCCACGTGTCGA
GGCAGCGTATTATCTGCAGTATTATGCGGGCTGTGGTGTGTGTGGAT
CGGCAAAGTGTTCTACCGTATGCCGGAAATCAGCAATGACACCTACA
AGAAACTGGCCATCCTGGATTTCAACCGTTGCCAGGCGCAACACCAA
TTCGAGTGGATCTACATGCAAGAGTGGTATCATCGTAGCAGCGTTTC
TGAGTTTGGCATTTCCAAAAAAGACTTGCTGCGCGCGTATTTTCTGG
CGGCAGCGACCATTTTCGAACCGGAGCGCACCCAGGAACGTCTGGTG
TGGGCTAAGACGCAAATCGTCAGCGGTATGATTACGTCCTTTGTTAA
TAGCGGTACGACTCTGAGCCTGCACCAGAAAACGGCACTGTTGAGCC
AAATCGGTCATAACTTTGACGGCCTGGATGAGATTATCAGCGCGATG
AAAGACCACGGCCTGGCAGCGACGCTGTTAACGACCTTTCAACAGCT
GCTGGACGGCTTCGATCGCTACACCCGTCATCAGCTGAAAAACGCGT
GGAGCCAGTGGTTCATGAAGCTGCAACAGGGTGAGGCGTCGGGTGGC
GAAGATGCTGAGCTGCTGGCTAATACCCTGAACATTTGCGCGGGTTT
GATTGCGTTTAATGAAGATGTGTTGAGCCACCATGAGTACACCACCC
TGAGCACCCTGACCAACAAGATCTGTAAGCGCTTGACTCAAATCCAG
GATAAGAAAACGCTGGAAGTCGTGGATGGTAGCATCAAAGATAAAGA
ACTGGAAAAAGACATTCAAATGCTGGTGAAACTGGTCCTTGAAGAGA
ACGGCGGTGGCGTTGACCGTAACATCAAGCACACCTTCCTGAGCGTC
TTTAAAACCTTTTATTATAATGCCTATCATGACGATGAAACGACCGA
CGTGCACATTTTCAAAGTTCTGTTCGGTCCGGTCGTGTAA
SEQ ID NO: 23 NgSCS-del29, truncated putative sclareol synthase from Nicotiana glutinosa
MANFHRPSRVRCSHSTASSLEEAKERIRETFGKNELSPSSYDTAWVA
MVPSRYSMNQPCFPRCLDWILENQREDGSWGLNPSHPLLVKDSLSST
LACLLALRKWRIGDNQVQRGLGFIETHGWAVDNVDQISPLGFDIIFP
SMIKYAEKLNLDLPFDPNLVNMMLRERELTIERALKNEFEGNMANVE
YFAEGLGELCHWKEIMLHQRRNGSLFDSPATTAAALIYHQHDEKCFG
YLSSILKLHENWVPTIYPTKVHSNLFFVDALQNLGVDRYFKTELKSV
LDEIYRLWLEKNEEIFSDIAHCAMAFRLLRMNNYEVSSEELEGFVDQ
EHFFTTSGGKLISHVAILELHRASQVDIQEGKDLILDKISTWTRNFM
EQELLDNQILDRSKKEMEFAMRKFYGTFDRVETRRYIESYKMDSFKI
LKAAYRSSNINNIDLLKFSEHDFNLCQARHKEELQQIKRWFADCKL
EQVGSSQNYLYTSYFPIAAILFEPEYGDARLAFAKCGIIATTVDDFF
DGFACNEELQNIIELVERWDGYPTVGFRSERVRIFFLALYKMIEEIA
AKAETKQGRCVKDLLINLWIDLLKCMLVELDLWKIKSTTPSIEEYLS
IACVTTGVKCLILISLHLLGPKLSKDVTESSEVSALWNCTAVVARLN
NDIHSYKREQAESSTNMAAILISQSQRTISEEEAIRQIKEMMESKRR
ELLGMVLQNKESQLPQVCKDLFWTTFKAAYSIYTHGDEYRFPQELKN
HINDVIYKPLNQYSP
SEQ ID NO: 24 Optimized cDNA for E. coli expression encoding for NgSCS-del29
ATGGCTAATTTCCATCGCCCATCCCGTGTTCGTTGTTCCCACTCTAC
CGCAAGCTCCCTGGAAGAGGCAAAAGAGCGCATCCGTGAAACCTTCG
GCAAAAATGAACTCTCTCCTTCTAGCTATGATACGGCCTGGGTTGCT
ATGGTCCCGAGCCGCTACAGCATGAACCAGCCGTGCTTTCCGCGCTG
CCTGGACTGGATTCTGGAGAACCAACGTGAGGATGGCAGCTGGGGTC
TGAACCCGAGCCATCCGTTACTGGTGAAAGACAGCTTGAGCAGCACG
CTGGCGTGTTTGCTGGCGCTGCGTAAGTGGCGTATTGGCGACAACCA
AGTCCAGCGTGGCCTGGGTTTTATCGAGACTCATGGTTGGGCAGTGG
ACAACGTAGACCAGATCTCTCCACTGGGTTTTGACATCATTTTCCCG
AGCATGATTAAATATGCGGAAAAGCTGAATCTGGATTTGCCTTTTGA
TCCGAACCTGGTGAACATGATGCTGCGCGAGCGCGAGCTGACGATCG
AGCGTGCGCTGAAAAACGAATTTGAGGGTAATATGGCTAATGTCGAG
TACTTCGCCGAGGGTTTGGGTGAGCTGTGTCACTGGAAAGAAATCAT
GCTGCACCAACGCCGTAACGGTAGCCTGTTCGACTCTCCGGCAACGA
CCGCCGCGGCTCTTATTTATCATCAGCACGATGAGAAGTGCTTCGGC
TATCTGTCTAGCATCCTGAAATTACACGAGAACTGGGTGCCGACCAT
CTATCCGACCAAGGTTCACTCCAATCTGTTTTTCGTCGATGCGCTGC
AGAACCTGGGTGTTGACCGTTACTTCAAAACCGAACTGAAGTCCGTC
CTGGATGAGATCTACCGTTTGTGGCTGGAGAAAAACGAAGAGATCTT
CAGCGATATTGCGCACTGCGCAATGGCGTTTCGCCTGTTGCGCATGA
ATAATTACGAGGTTAGCAGCGAAGAACTGGAAGGCTTCGTGGACCAA
GAACATTTTTTCACCACGTCGGGTGGCAAGCTGATCAGCCACGTTGC
CATCCTGGAACTGCACCGTGCAAGCCAAGTGGACATTCAGGAGGGCA
AAGACCTGATCCTGGACAAAATTAGCACCTGGACTCGCAACTTTATG
GAACAGGAACTGCTGGATAACCAGATCTTGGATCGTAGCAAAAAAGA
AATGGAATTTGCAATGCGTAAGTTTTACGGTACGTTCGATCGCGTGG
AAACCCGTCGTTATATTGAAAGCTACAAAATGGATTCCTTCAAGATC
CTGAAGGCAGCGTACCGTAGCTCCAACATTAACAATATTGACCTGTT
GAAGTTCAGCGAGCACGACTTCAATCTCTGCCAGGCGCGTCACAAGG
AAGAACTGCAGCAAATCAAACGCTGGTTCGCAGATTGCAAACTGGAG
CAAGTCGGTAGCAGCCAGAACTACTTGTACACCTCTTACTTCCCGAT
CGCGGCCATTTTGTTCGAGCCGGAGTATGGCGACGCACGCCTGGCGT
TCGCGAAGTGCGGTATTATCGCGACCACCGTTGACGATTTTTTTGAC
GGTTTTGCATGTAATGAAGAACTGCAAAACATCATCGAACTGGTCGA
GAGATGGGACGGTTATCCGACGGTTGGTTTCCGCTCCGAGCGTGTGC
GCATTTTCTTTCTGGCGCTGTACAAAATGATTGAAGAAATTGCCGCG
AAAGCGGAAACGAAACAGGGCCGTTGCGTGAAAGATCTGTTGATCAA
TCTGTGGATTGATCTGCTGAAATGCATGCTGGTCGAACTGGATCTGT
GGAAAATTAAGAGCACGACCCCGAGCATTGAAGAGTATCTGAGCATT
GCCTGTGTGACGACCGGCGTTAAGTGCTTGATCCTGATTAGCCTGCA
TCTGCTGGGCCCGAAACTGAGCAAAGACGTGACCGAATCCAGCGAAG
TTAGCGCTCTGTGGAACTGTACGGCCGTGGTTGCGCGCCTGAACAAC
GACATTCATAGCTACAAGCGTGAGCAAGCCGAGAGCAGCACTAATAT
GGCCGCAATCCTGATTTCGCAAAGCCAGCGTACCATCTCAGAAGAAG
AAGCTATCCGCCAGATCAAAGAGATGATGGAATCGAAACGCCGTGAG
CTGCTGGGCATGGTGCTGCAGAATAAAGAGAGCCAATTGCCGCAAGT
CTGCAAAGACCTGTTTTGGACCACCTTCAAAGGCGCGTACAGCATTT
ATACCCACGGTGATGAGTACCGTTTTCCACAAGAACTGAAGAACCAT
ATCAACGATGTCATCTATAAGCCGTTAAATCAATACAGCCCTTAA
SEQ ID NO: 25 NgSCS-del38, putative sclareol synthase from Nicotiana glutinosa
MSHSTASSLEEAKERIRETFGKNELSSSSYDTAWVAMVPSRYSMNQP
CFPRCLDWILENQREDGSWGLNPSLPLLVKDSLSSTLACLLALRKWR
IGDNQVQRGLGFIETHGWAVDNVDQISPLGFDIIFPSMIKYAEKLNL
DLPFDPNLVNMMLRERELTIERALKNEFEGNMANVEYFAEGLGELCH
WKEIMLHQRRNGSPFDSPATTAAALIYHQHDEKCFGYLSSILKLHEN
WVPTIYPTKVHSNLFFVDALQNLGVDRYFKTELKSVLDEIYRLWLEK
NEEIFSDIAHCAMAFRLLRMNNYEVSSEELEGFVDQEHFFTTSGGKL
ISHVAILELHRASQVDIQEGKDLILDKISTWTRNFMEQELLDNQILD
RSKKEMEFAMRKFYGTFDRVETRRYIESYKMDSFKILKAAYRSSNIN
NIDLLKFSEHDFNLCQARHKEELQQIKRWFADCKLEQVGSSQNYLYT
SYFPIAAILFEPEYGDARLAFAKCGIIATTVDDFFDGFACNEELQNI
IELVERWDGYPTVGFRSERVRIFFLALYKMIEEIAAKAETKQGRCVK
DLLINLWIDLLKCMLVELDLWKIKSTTPSIEEYLSIACVTTGVKCLI
LISLHLLGPKLSKDVTESSEVSALWNCTAVVARLNNDIHSYKREQAE
SSTNMVAILISQSQRTISEEEAIRQIKEMMESKRRELLGMVLQNKES
QLPQVCKDLFWTTFKAAYSIYTHGDEYRFPQELKNHINDVIYKPLNQ
YSP
SEQ ID NO: 26 Optimized cDNA for Saccharomyces cerevisiae expression encoding for SmCPS2.
ATGGCTACTGTTGACGCTCCACAAGTTCACGACCACGACGGTACTAC
TGTTCACCAAGGTCACGACGCTGTTAAGAACATCGAAGACCCAATCG
AATACATCAGAACTTTGTTGAGAACTACTGGTGACGGTAGAATCTCT
GTTTCTCCATACGACACTGCTTGGGTTGCTATGATCAAGGACGTTGA
AGGTAGAGACGGTCCACAATTCCCATCTTCTTTGGAATGGATCGTTC
AAAACCAATTGGAAGACGGTTCTTGGGGTGACCAAAAGTTGTTCTGT
GTTTACGACAGATTGGTTAACACTATCGCTTGTGTTGTTGCTTTGAG
ATCTTGGAACGTTCACGCTCACAAGGTTAAGAGAGGTGTTACTTACA
TCAAGGAAAACGTTGACAAGTTGATGGAAGGTAACGAAGAACACATG
ACTTGTGGTTTCGAAGTTGTTTTCCCAGCTTTGTTGCAAAAGGCTAA
GTCTTTGGGTATCGAAGACTTGCCATACGACTCTCCAGCTGTTCAAG
AAGTTTACCACGTTAGAGAACAAAAGTTGAAGAGAATCCCATTGGAA
ATCATGCACAAGATCCCAACTTCTTTGTTGTTCTCTTTGGAAGGTTT
GGAAAACTTGGACTGGGACAAGTTGTTGAAGTTGCAATCTGCTGACG
GTTCTTTCTTGACTTCTCCATCTTCTACTGCTTTCGCTTTCATGCAA
ACTAAGGACGAAAAGTGTTACCAATTCATCAAGAACACTATCGACAC
TTTCAACGGTGGTGCTCCACACACTTACCCAGTTGACGTTTTCGGTA
GATTGTGGGCTATCGACAGATTGCAAAGATTGGGTATCTCTAGATTC
TTCGAACCAGAAATCGCTGACTGTTTGTCTCACATCCACAAGTTCTG
GACTGACAAGGGTGTTTTCTCTGGTAGAGAATCTGAATTCTGTGACA
TCGACGACACTTCTATGGGTATGAGATTGATGAGAATGCACGGTTAC
GACGTTGACCCAAACGTTTTGAGAAACTTCAAGCAAAAGGACGGTAA
GTTCTCTTGTTACGGTGGTCAAATGATCGAATCTCCATCTCCAATCT
ACAACTTGTACAGAGCTTCTCAATTGAGATTCCCAGGTGAAGAAATC
TTGGAAGACGCTAAGAGATTCGCTTACGACTTCTTGAAGGAAAAGTT
GGCTAACAACCAAATCTTGGACAAGTGGGTTATCTCTAAGCACTTGC
CAGACGAAATCAAGTTGGGTTTGGAAATGCCATGGTTGGCTACTTTG
CCAAGAGTTGAAGCTAAGTACTACATCCAATACTACGCTGGTTCTGG
TGACGTTTGGATCGGTAAGACTTTGTACAGAATGCCAGAAATCTCTA
ACGACACTTACCACGACTTGGCTAAGACTGACTTCAAGAGATGTCAA
GCTAAGCACCAATTCGAATGGTTGTACATGCAAGAATGGTACGAATC
TTGTGGTATCGAAGAATTCGGTATCTCTAGAAAGGACTTGTTGTTGT
CTTACTTCTTGGCTACTGCTTCTATCTTCGAATTGGAAAGAACTAAC
GAAAGAATCGCTTGGGCTAAGTCTCAAATCATCGCTAAGATGATCAC
TTCTTTCTTCAACAAGGAAACTACTTCTGAAGAAGACAAGAGAGCTT
TGTTGAACGAATTGGGTAACATCAACGGTTTGAACGACACTAACGGT
GCTGGTAGAGAAGGTGGTGCTGGTTCTATCGCTTTGGCTACTTTGAC
TCAATTCTTGGAAGGTTTCGACAGATACACTAGACACCAATTGAAGA
ACGCTTGGTCTGTTTGGTTGACTCAATTGCAACACGGTGAAGCTGAC
GACGCTGAATTGTTGACTAACACTTTGAACATCTGTGCTGGTCACAT
CGCTTTCAGAGAAGAAATCTTGGCTCACAACGAATACAAGGCTTTGT
CTAACTTGACTTCTAAGATCTGTAGACAATTGTCTTTCATCCAATCT
GAAAAGGAAATGGGTGTTGAAGGTGAAATCGCTGCTAAGTCTTCTAT
CAAGAACAAGGAATTGGAAGAAGACATGCAAATGTTGGTTAAGTTGG
TTTTGGAAAAGTACGGTGGTATCGACAGAAACATCAAGAAGGCTTTC
TTGGCTGTTGCTAAGACTTACTACTACAGAGCTTACCACGCTGCTGA
CACTATCGACACTCACATGTTCAAGGTTTTGTTCGAACCAGTTGCTT
AA
SEQ ID NO: 27 Optimized cDNA for S. cerevisiae expression encoding for truncated SsScS from Salvia sclarea
ATGGCTAAGATGAAGGAAAACTTCAAGAGAGAAGACGACAAGTTCCC
AACTACTACTACTTTGAGATCTGAAGACATCCCATCTAACTTGTGTA
TCATCGACACTTTGCAAAGATTGGGTGTTGACCAATTCTTCCAATAC
GAAATCAACACTATCTTGGACAACACTTTCAGATTGTGGCAAGAAAA
GCACAAGGTTATCTACGGTAACGTTACTACTCACGCTATGGCTTTCA
GATTGTTGAGAGTTAAGGGTTACGAAGTTTCTTCTGAAGAATTGGCT
CCATACGGTAACCAAGAAGCTGTTTCTCAACAAACTAACGACTTGCC
AATGATCATCGAATTGTACAGAGCTGCTAACGAAAGAATCTACGAAG
AAGAAAGATCTTTGGAAAAGATCTTGGCTTGGACTACTATCTTCTTG
AACAAGCAAGTTCAAGACAACTCTATCCCAGACAAGAAGTTGCACAA
GTTGGTTGAATTCTACTTGAGAAACTACAAGGGTATCACTATCAGAT
TGGGTGCTAGAAGAAACTTGGAATTGTACGACATGACTTACTACCAA
GCTTTGAAGTCTACTAACAGATTCTCTAACTTGTGTAACGAAGACTT
CTTGGTTTTCGCTAAGCAAGACTTCGACATCCACGAAGCTCAAAACC
AAAAGGGTTTGCAACAATTGCAAAGATGGTACGCTGACTGTAGATTG
GACACTTTGAACTTCGGTAGAGACGTTGTTATCATCGCTAACTACTT
GGCTTCTTTGATCATCGGTGACCACGCTTTCGACTACGTTAGATTGG
CTTTCGCTAAGACTTCTGTTTTGGTTACTATCATGGACGACTTCTTC
GACTGTCACGGTTCTTCTCAAGAATGTGACAAGATCATCGAATTGGT
TAAGGAATGGAAGGAAAACCCAGACGCTGAATACGGTTCTGAAGAAT
TGGAAATCTTGTTCATGGCTTTGTACAACACTGTTAACGAATTGGCT
GAAAGAGCTAGAGTTGAACAAGGTAGATCTGTTAAGGAATTCTTGGT
TAAGTTGTGGGTTGAAATCTTGTCTGCTTTCAAGATCGAATTGGACA
CTTGGTCTAACGGTACTCAACAATCTTTCGACGAATACATCTCTTCT
TCTTGGTTGTCTAACGGTTCTAGATTGACTGGTTTGTTGACTATGCA
ATTCGTTGGTGTTAAGTTGTCTGACGAAATGTTGATGTCTGAAGAAT
GTACTGACTTGGCTAGACACGTTTGTATGGTTGGTAGATTGTTGAAC
GACGTTTGTTCTTCTGAAAGAGAAAGAGAAGAAAACATCGCTGGTAA
GTCTTACTCTATCTTGTTGGCTACTGAAAAGGACGGTAGAAAGGTTT
CTGAAGACGAAGCTATCGCTGAAATCAACGAAATGGTTGAATACCAC
TGGAGAAAGGTTTTGCAAATCGTTTACAAGAAGGAATCTATCTTGCC
AAGAAGATGTAAGGACGTTTTCTTGGAAATGGCTAAGGGTACTTTCT
ACGCTTACGGTATCAACGACGAATTGACTTCTCCACAACAATCTAAG
GAAGACATGAAGTCTTTCGTTTTCTAA
SEQ ID NO: 28 Optimized cDNA for S. cerevisiae expression encoding for the GGPP synthase from Pantoea agglomerans
ATGGTTTCTGGTTCTAAGGCTGGTGTTTCTCCACACAGAGAAATCGA
AGTTATGAGACAATCTATCGACGACCACTTGGCTGGTTTGTTGCCAG
AAACTGACTCTCAAGACATCGTTTCTTTGGCTATGAGAGAAGGTGTT
ATGGCTCCAGGTAAGAGAATCAGACCATTGTTGATGTTGTTGGCTGC
TAGAGACTTGAGATACCAAGGTTCTATGCCAACTTTGTTGGACTTGG
CTTGTGCTGTTGAATTGACTCACACTGCTTCTTTGATGTTGGACGAC
ATGCCATGTATGGACAACGCTGAATTGAGAAGAGGTCAACCAACTAC
TCACAAGAAGTTCGGTGAATCTGTTGCTATCTTGGCTTCTGTTGGTT
TGTTGTCTAAGGCTTTCGGTTTGATCGCTGCTACTGGTGACTTGCCA
GGTGAAAGAAGAGCTCAAGCTGTTAACGAATTGTCTACTGCTGTTGG
TGTTCAAGGTTTGGTTTTGGGTCAATTCAGAGACTTGAACGACGCTG
CTTTGGACAGAACTCCAGACGCTATCTTGTCTACTAACCACTTGAAG
ACTGGTATCTTGTTCTCTGCTATGTTGCAAATCGTTGCTATCGCTTC
TGCTTCTTCTCCATCTACTAGAGAAACTTTGCACGCTTTCGCTTTGG
ACTTCGGTCAAGCTTTCCAATTGTTGGACGACTTGAGAGACGACCAC
CCAGAAACTGGTAAGGACAGAAACAAGGACGCTGGTAAGTCTACTTT
GGTTAACAGATTGGGTGCTGACGCTGCTAGACAAAAGTTGAGAGAAC
ACATCGACTCTGCTGACAAGCACTTGACTTTCGCTTGTCCACAAGGT
GGTGCTATCAGACAATTCATGCACTTGTGGTTCGGTCACCACTTGGC
TGACTGGTCTCCAGTTATGAAGATCGCTTAA
SEQ ID NO: 29 Optimized cDNA for S. cerevisiae expression encoding for CfCPS1-del63
ATGGTTGCTACTGTTAACGCTCCACCAGTTCACGACCAAGACGACTC
TACTGAAAACCAATGTCACGACGCTGTTAACAACATCGAAGACCCAA
TCGAATACATCAGAACTTTGTTGAGAACTACTGGTGACGGTAGAATC
TCTGTTTCTCCATACGACACTGCTTGGGTTGCTTTGATCAAGGACTT
GCAAGGTAGAGACGCTCCAGAATTCCCATCTTCTTTGGAATGGATCA
TCCAAAACCAATTGGCTGACGGTTCTTGGGGTGACGCTAAGTTCTTC
TGTGTTTACGACAGATTGGTTAACACTATCGCTTGTGTTGTTGCTTT
GAGATCTTGGGACGTTCACGCTGAAAAGGTTGAAAGAGGTGTTAGAT
ACATCAACGAAAACGTTGAAAAGTTGAGAGACGGTAACGAAGAACAC
ATGACTTGTGGTTTCGAAGTTGTTTTCCCAGCTTTGTTGCAAAGAGC
TAAGTCTTTGGGTATCCAAGACTTGCCATACGACGCTCCAGTTATCC
AAGAAATCTACCACTCTAGAGAACAAAAGTCTAAGAGAATCCCATTG
GAAATGATGCACAAGGTTCCAACTTCTTTGTTGTTCTCTTTGGAAGG
TTTGGAAAACTTGGAATGGGACAAGTTGTTGAAGTTGCAATCTGCTG
ACGGTTCTTTCTTGACTTCTCCATCTTCTACTGCTTTCGCTTTCATG
CAAACTAGAGACCCAAAGTGTTACCAATTCATCAAGAACACTATCCA
AACTTTCAACGGTGGTGCTCCACACACTTACCCAGTTGACGTTTTCG
GTAGATTGTGGGCTATCGACAGATTGCAAAGATTGGGTATCTCTAGA
TTCTTCGAATCTGAAATCGCTGACTGTATCGCTCACATCCACAGATT
CTGGACTGAAAAGGGTGTTTTCTCTGGTAGAGAATCTGAATTCTGTG
ACATCGACGACACTTCTATGGGTGTTAGATTGATGAGAATGCACGGT
TACGACGTTGACCCAAACGTTTTGAAGAACTTCAAGAAGGACGACAA
GTTCTCTTGTTACGGTGGTCAAATGATCGAATCTCCATCTCCAATCT
ACAACTTGTACAGAGCTTCTCAATTGAGATTCCCAGGTGAACAAATC
TTGGAAGACGCTAACAAGTTCGCTTACGACTTCTTGCAAGAAAAGTT
GGCTCACAACCAAATCTTGGACAAGTGGGTTATCTCTAAGCACTTGC
CAGACGAAATCAAGTTGGGTTTGGAAATGCCATGGTACGCTACTTTG
CCAAGAGTTGAAGCTAGATACTACATCCAATACTACGCTGGTTCTGG
TGACGTTTGGATCGGTAAGACTTTGTACAGAATGCCAGAAATCTCTA
ACGACACTTACCACGAATTGGCTAAGACTGACTTCAAGAGATGTCAA
GCTCAACACCAATTCGAATGGATCTACATGCAAGAATGGTACGAATC
TTGTAACATGGAAGAATTCGGTATCTCTAGAAAGGAATTGTTGGTTG
CTTACTTCTTGGCTACTGCTTCTATCTTCGAATTGGAAAGAGCTAAC
GAAAGAATCGCTTGGGCTAAGTCTCAAATCATCTCTACTATCATCGC
TTCTTTCTTCAACAACCAAAACACTTCTCCAGAAGACAAGTTGGCTT
TCTTGACTGACTTCAAGAACGGTAACTCTACTAACATGGCTTTGGTT
ACTTTGACTCAATTCTTGGAAGGTTTCGACAGATACACTTCTCACCA
ATTGAAGAACGCTTGGTCTGTTTGGTTGAGAAAGTTGCAACAAGGTG
AAGGTAACGGTGGTGCTGACGCTGAATTGTTGGTTAACACTTTGAAC
ATCTGTGCTGGTCACATCGCTTTCAGAGAAGAAATCTTGGCTCACAA
CGACTACAAGACTTTGTCTAACTTGACTTCTAAGATCTGTAGACAAT
TGTCTCAAATCCAAAACGAAAAGGAATTGGAAACTGAAGGTCAAAAG
ACTTCTATCAAGAACAAGGAATTGGAAGAAGACATGCAAAGATTGGT
TAAGTTGGTTTTGGAAAAGTCTAGAGTTGGTATCAACAGAGACATGA
AGAAGACTTTCTTGGCTGTTGTTAAGACTTACTACTACAAGGCTTAC
CACTCTGCTCAAGCTATCGACAACCACATGTTCAAGGTTTTGTTCGA
ACCAGTTGCTTAA
SEQ ID NO: 30 Optimized cDNA for S. cerevisiae expression encoding for TaTps1-del59
ATGTACAGACAAAGAACTGACGAACCATCTGAAACTAGACAAATGAT
CGACGACATCAGAACTGCTTTGGCTTCTTTGGGTGACGACGAAACTT
CTATGTCTGTTTCTGCTTACGACACTGCTTTGGTTGCTTTGGTTAAG
AACTTGGACGGTGGTGACGGTCCACAATTCCCATCTTGTATCGACTG
GATCGTTCAAAACCAATTGCCAGACGGTTCTTGGGGTGACCCAGCTT
TCTTCATGGTTCAAGACAGAATGATCTCTACTTTGGCTTGTGTTGTT
GCTGTTAAGTCTTGGAACATCGACAGAGACAACTTGTGTGACAGAGG
TGTTTTGTTCATCAAGGAAAACATGTCTAGATTGGTTGAAGAAGAAC
AAGACTGGATGCCATGTGGTTTCGAAATCAACTTCCCAGCTTTGTTG
GAAAAGGCTAAGGACTTGGACTTGGACATCCCATACGACCACCCAGT
TTTGGAAGAAATCTACGCTAAGAGAAACTTGAAGTTGTTGAAGATCC
CATTGGACGTTTTGCACGCTATCCCAACTACTTTGTTGTTCTCTGTT
GAAGGTATGGTTGACTTGCCATTGGACTGGGAAAAGTTGTTGAGATT
GAGATGTCCAGACGGTTCTTTCCACTCTTCTCCAGCTGCTACTGCTG
CTGCTTTGTCTCACACTGGTGACAAGGAATGTCACGCTTTCTTGGAC
AGATTGATCCAAAAGTTCGAAGGTGGTGTTCCATGTTCTCACTCTAT
GGACACTTTCGAACAATTGTGGGTTGTTGACAGATTGATGAGATTGG
GTATCTCTAGACACTTCACTTCTGAAATCCAACAATGTTTGGAATTC
ATCTACAGAAGATGGACTCAAAAGGGTTTGGCTCACAACATGCACTG
TCCAATCCCAGACATCGACGACACTGCTATGGGTTTCAGATTGTTGA
GACAACACGGTTACGACGTTACTCCATCTGTTTTCAAGCACTTCGAA
AAGGACGGTAAGTTCGTTTGTTTCCCAATGGAAACTAACCACGCTTC
TGTTACTCCAATGCACAACACTTACAGAGCTTCTCAATTCATGTTCC
CAGGTGACGACGACGTTTTGGCTAGAGCTGGTAGATACTGTAGAGCT
TTCTTGCAAGAAAGACAATCTTCTAACAAGTTGTACGACAAGTGGAT
CATCACTAAGGACTTGCCAGGTGAAGTTGGTTACACTTTGAACTTCC
CATGGAAGTCTTCTTTGCCAAGAATCGAAACTAGAATGTACTTGGAC
CAATACGGTGGTAACAACGACGTTTGGATCGCTAAGGTTTTGTACAG
AATGAACTTGGTTTCTAACGACTTGTACTTGAAGATGGCTAAGGCTG
ACTTCACTGAATACCAAAGATTGTCTAGAATCGAATGGAACGGTTTG
AGAAAGTGGTACTTCAGAAACCACTTGCAAAGATACGGTGCTACTCC
AAAGTCTGCTTTGAAGGCTTACTTCTTGGCTTCTGCTAACATCTTCG
AACCAGGTAGAGCTGCTGAAAGATTGGCTTGGGCTAGAATGGCTGTT
TTGGCTGAAGCTGTTACTACTCACTTCAGACACATCGGTGGTCCATG
TTACTCTACTGAAAACTTGGAAGAATTGATCGACTTGGTTTCTTTCG
ACGACGTTTCTGGTGGTTTGAGAGAAGCTTGGAAGCAATGGTTGATG
GCTTGGACTGCTAAGGAATCTCACGGTTCTGTTGACGGTGACACTGC
TTTGTTGTTCGTTAGAACTATCGAAATCTGTTCTGGTAGAATCGTTT
CTTCTGAACAAAAGTTGAACTTGTGGGACTACTCTCAATTGGAACAA
TTGACTTCTTCTATCTGTCACAAGTTGGCTACTATCGGTTTGTCTCA
AAACGAAGCTTCTATGGAAAACACTGAAGACTTGCACCAACAAGTTG
ACTTGGAAATGCAAGAATTGTCTTGGAGAGTTCACCAAGGTTGTCAC
GGTATCAACAGAGAAACTAGACAAACTTTCTTGAACGTTGTTAAGTC
TTTCTACTACTCTGCTCACTGTTCTCCAGAAACTGTTGACTCTCACA
TCGCTAAGGTTATCTTCCAAGACGTTATCTAA
SEQ ID NO: 31 Optimized cDNA for S. cerevisiae expression encoding for MvCps3-del63
ATGGCTCCACCAGAACAAAAGTACAACTCTACTGCTTTGGAACACGA
CACTGAAATCATCGAAATCGAAGACCACATCGAATGTATCAGAAGAT
TGTTGAGAACTGCTGGTGACGGTAGAATCTCTGTTTCTCCATACGAC
ACTGCTTGGATCGCTTTGATCAAGGACTTGGACGGTCACGACTCTCC
ACAATTCCCATCTTCTATGGAATGGGTTGCTGACAACCAATTGCCAG
ACGGTTCTTGGGGTGACGAACACTTCGTTTGTGTTTACGACAGATTG
GTTAACACTATCGCTTGTGTTGTTGCTTTGAGATCTTGGAACGTTCA
CGCTCACAAGTGTGAAAAGGGTATCAAGTACATCAAGGAAAACGTTC
ACAAGTTGGAAGACGCTAACGAAGAACACATGACTTGTGGTTTCGAA
GTTGTTTTCCCAGCTTTGTTGCAAAGAGCTCAATCTATGGGTATCAA
GGGTATCCCATACAACGCTCCAGTTATCGAAGAAATCTACAACTCTA
GAGAAAAGAAGTTGAAGAGAATCCCAATGGAAGTTGTTCACAAGGTT
GCTACTTCTTTGTTGTTCTCTTTGGAAGGTTTGGAAAACTTGGAATG
GGAAAAGTTGTTGAAGTTGCAATCTCCAGACGGTTCTTTCTTGACTT
CTCCATCTTCTACTGCTTTCGCTTTCATCCACACTAAGGACAGAAAG
TGTTTCAACTTCATCAACAACATCGTTCACACTTTCAAGGGTGGTGC
TCCACACACTTACCCAGTTGACATCTTCGGTAGATTGTGGGCTGTTG
ACAGATTGCAAAGATTGGGTATCTCTAGATTCTTCGAATCTGAAATC
GCTGAATTCTTGTCTCACGTTCACAGATTCTGGTCTGACGAAGCTGG
TGTTTTCTCTGGTAGAGAATCTGTTTTCTGTGACATCGACGACACTT
CTATGGGTTTGAGATTGTTGAGAATGCACGGTTACCACGTTGACCCA
AACGTTTTGAAGAACTTCAAGCAATCTGACAAGTTCTCTTGTTACGG
TGGTCAAATGATGGAATGTTCTTCTCCAATCTACAACTTGTACAGAG
CTTCTCAATTGCAATTCCCAGGTGAAGAAATCTTGGAAGAAGCTAAC
AAGTTCGCTTACAAGTTCTTGCAAGAAAAGTTGGAATCTAACCAAAT
CTTGGACAAGTGGTTGATCTCTAACCACTTGTCTGACGAAATCAAGG
TTGGTTTGGAAATGCCATGGTACGCTACTTTGCCAAGAGTTGAAACT
TCTTACTACATCCACCACTACGGTGGTGGTGACGACGTTTGGATCGG
TAAGACTTTGTACAGAATGCCAGAAATCTCTAACGACACTTACAGAG
AATTGGCTAGATTGGACTTCAGAAGATGTCAAGCTCAACACCAATTG
GAATGGATCTACATGCAAAGATGGTACGAATCTTGTAGAATGCAAGA
ATTCGGTATCTCTAGAAAGGAAGTTTTGAGAGCTTACTTCTTGGCTT
CTGGTACTATCTTCGAAGTTGAAAGAGCTAAGGAAAGAGTTGCTTGG
GCTAGATCTCAAATCATCTCTCACATGATCAAGTCTTTCTTCAACAA
GGAAACTACTTCTTCTGACCAAAAGCAAGCTTTGTTGACTGAATTGT
TGTTCGGTAACATCTCTGCTTCTGAAACTGAAAAGAGAGAATTGGAC
GGTGTTGTTGTTGCTACTTTGAGACAATTCTTGGAAGGTTTCGACAT
CGGTACTAGACACCAAGTTAAGGCTGCTTGGGACGTTTGGTTGAGAA
AGGTTGAACAAGGTGAAGCTCACGGTGGTGCTGACGCTGAATTGTGT
ACTACTACTTTGAACACTTGTGCTAACCAACACTTGTCTTCTCACCC
AGACTACAACACTTTGTCTAAGTTGACTAACAAGATCTGTCACAAGT
TGTCTCAAATCCAACACCAAAAGGAAATGAAGGGTGGTATCAAGGCT
AAGTGTTCTATCAACAACAAGGAAGTTGACATCGAAATGCAATGGTT
GGTTAAGTTGGTTTTGGAAAAGTCTGGTTTGAACAGAAAGGCTAAGC
AAGCTTTCTTGTCTATCGCTAAGACTTACTACTACAGAGCTTACTAC
GCTGACCAAACTATGGACGCTCACATCTTCAAGGTTTTGTTCGAACC
AGTTGTTTAA
SEQ ID NO: 32 Optimized cDNA for S. cerevisiae expression encoding for RoCPS1-del67
ATGGCTTCTCAAGTTTCTGAAAAGGGTACTTCTTCTCCAGTTCAAAC
TCCAGAAGAAGTTAACGAAAAGATCGAAAACTACATCGAATACATCA
AGAACTTGTTGACTACTTCTGGTGACGGTAGAATCTCTGTTTCTCCA
TACGACACTTCTATCGTTGCTTTGATCAAGGACTTGAAGGGTAGAGA
CACTCCACAATTCCCATCTTGTTTGGAATGGATCGCTCAACACCAAA
TGGCTGACGGTTCTTGGGGTGACGAATTCTTCTGTATCTACGACAGA
ATCTTGAACACTTTGGCTTGTGTTGTTGCTTTGAAGTCTTGGAACGT
TCACGCTGACATGATCGAAAAGGGTGTTACTTACGTTAACGAAAACG
TTCAAAAGTTGGAAGACGGTAACTTGGAACACATGACTTCTGGTTTC
GAAATCGTTGTTCCAGCTTTGGTTCAAAGAGCTCAAGACTTGGGTAT
CCAAGGTTTGCCATACGACCACCCATTGATCAAGGAAATCGCTAACA
CTAAGGAAGGTAGATTGAAGAAGATCCCAAAGGACATGATCTACCAA
AAGCCAACTACTTTGTTGTTCTCTTTGGAAGGTTTGGGTGACTTGGA
ATGGGAAAAGATCTTGAAGTTGCAATCTGGTGACGGTTCTTTCTTGA
CTTCTCCATCTTCTACTGCTCACGTTTTCATGAAGACTAAGGACGAA
TAAGTGTTGAAGTTCATCGAAAACGCTGTTAAGAACTGTAACGGTGG
TGCTCCACACACTTACCCAGTTGACGTTTTCGCTAGATTGTGGGCTG
TTGACAGATTGCAAAGATTGGGTATCTCTAGATTCTTCCAACAAGAA
ATCAAGTACTTCTTGGACCACATCAACTCTGTTTGGACTGAAAACGG
TGTTTTCTCTGGTAGAGACTCTGAATTCTGTGACATCGACGACACTT
CTATGGGTATCAGATTGTTGAAGATGCACGGTTACGACATCGACCCA
AACGCTTTGGAACACTTCAAGCAACAAGACGGTAAGTTCTCTTGTTA
CGGTGGTCAAATGATCGAATCTGCTTCTCCAATCTACAACTTGTACA
GAGCTGCTCAATTGAGATTCCCAGGTGAAGAAATCTTGGAAGAAGCT
ACTAAGTTCGCTTACAACTTCTTGCAAGAAAAGATCGCTAACGACCA
ATTCCAAGAAAAGTGGGTTATCTCTGACCACTTGATCGACGAAGTTA
AGTTGGGTTTGAAGATGCCATGGTACGCTACTTTGCCAAGAGTTGAA
TGCTGCTTACTACTTGCAATACTACGCTGGTTGTGGGACGTTTGGAT
CGGTAAGGTTTTCTACAGAATGCCAGAAATCTCTAACGACACTTACA
AGAAGTTGGCTATCTTGGACTTCAACAGATGTCAAGCTCAACACCAA
TTCGAATGGATCTACATGCAAGAATGGTACCACAGATCTTCTGTTTC
TGAATTCGGTATCTCTAAGAAGGACTTGTTGAGAGCTTACTTCTTGG
CTGCTGCTACTATCTTCGAACCAGAAAGAACTCAAGAAAGATTGGTT
TGGGCTAAGACTCAAATCGTTTCTGGTATGATCACTTCTTTCGTTAA
CTCTGGTACTACTTTGTCTTTGCACCAAAAGACTGCTTTGTTGTCTC
AAATCGGTCACAACTTCGACGGTTTGGACGAAATCATCTCTGCTATG
AAGGACCACGGTTTGGCTGCTACTTTGTTGACTACTTTCCAACAATT
GTTGGACGGTTTCGACAGATACACTAGACACCAATTGAAGAACGCTT
GGTCTCAATGGTTCATGAAGTTGCAACAAGGTGAAGCTTCTGGTGGT
GAAGACGCTGAATTGTTGGCTAACACTTTGAACATCTGTGCTGGTTT
GATCGCTTTCAACGAAGACGTTTTGTCTCACCACGAATACACTACTT
TGTCTACTTTGACTAACAAGATCTGTAAGAGATTGACTCAAATCCAA
GACAAGAAGACTTTGGAAGTTGTTGACGGTTCTATCAAGGACAAGGA
ATTGGAAAAGGACATCCAAATGTTGGTTAAGTTGGTTTTGGAAGAAA
ACGGTGGTGGTGTTGACAGAAACATCAAGCACACTTTCTTGTCTGTT
TTCAAGACTTTCTACTACAACGCTTACCACGACGACGAAACTACTGA
CGTTCACATCTTCAAGGTTTTGTTCGGTCCAGTTGTTTAA
SEQ ID NO: 33 Optimized cDNA for S. cerevisiae expression encoding for NgSCS-del29
ATGGCTAACTTCCACAGACCATCTAGAGTTAGATGTTCTCACTCTAC
TGCTTCTTCTTTGGAAGAAGCTAAGGAAAGAATCAGAGAAACTTTCG
GTAAGAACGAATTGTCTCCATCTTCTTACGACACTGCTTGGGTTGCT
ATGGTTCCATCTAGATACTCTATGAACCAACCATGTTTCCCAAGATG
TTTGGACTGGATCTTGGAAAACCAAAGAGAAGACGGTTCTTGGGGTT
TGAACCCATCTCACCCATTGTTGGTTAAGGACTCTTTGTCTTCTACT
TTGGCTTGTTTGTTGGCTTTGAGAAAGTGGAGAATCGGTGACAACCA
AGTTCAAAGAGGTTTGGGTTTCATCGAAACTCACGGTTGGGCTGTTG
ACAACGTTGACCAAATCTCTCCATTGGGTTTCGACATCATCTTCCCA
TCTATGATCAAGTACGCTGAAAAGTTGAACTTGGACTTGCCATTCGA
CCCAAACTTGGTTAACATGATGTTGAGAGAAAGAGAATTGACTATCG
AAAGAGCTTTGAAGAACGAATTCGAAGGTAACATGGCTAACGTTGAA
TACTTCGCTGAAGGTTTGGGTGAATTGTGTCACTGGAAGGAAATCAT
GTTGCACCAAAGAAGAAACGGTTCTTTGTTCGACTCTCCAGCTACTA
CTGCTGCTGCTTTGATCTACCACCAACACGACGAAAAGTGTTTCGGT
TACTTGTCTTCTATCTTGAAGTTGCACGAAAACTGGGTTCCAACTAT
CTACCCAACTAAGGTTCACTCTAACTTGTTCTTCGTTGACGCTTTGC
AAAACTTGGGTGTTGACAGATACTTCAAGACTGAATTGAAGTCTGTT
TTGGACGAAATCTACAGATTGTGGTTGGAAAAGAACGAAGAAATCTT
CTCTGACATCGCTCACTGTGCTATGGCTTTCAGATTGTTGAGAATGA
ACAACTACGAAGTTTCTTCTGAAGAATTGGAAGGTTTCGTTGACCAA
GAACACTTCTTCACTACTTCTGGTGGTAAGTTGATCTCTCACGTTGC
TATCTTGGAATTGCACAGAGCTTCTCAAGTTGACATCCAAGAAGGTA
AGGACTTGATCTTGGACAAGATCTCTACTTGGACTAGAAACTTCATG
GAACAAGAATTGTTGGACAACCAAATCTTGGACAGATCTAAGAAGGA
AATGGAATTCGCTATGAGAAAGTTCTACGGTACTTTCGACAGAGTTG
AAACTAGAAGATACATCGAATCTTACAAGATGGACTCTTTCAAGATC
TTGAAGGCTGCTTACAGATCTTCTAACATCAACAACATCGACTTGTT
GAAGTTCTCTGAACACGACTTCAACTTGTGTCAAGCTAGACACAAGG
AAGAATTGCAACAAATCAAGAGATGGTTCGCTGACTGTAAGTTGGAA
CAAGTTGGTTCTTCTCAAAACTACTTGTACACTTCTTACTTCCCAAT
CGCTGCTATCTTGTTCGAACCAGAATACGGTGACGCTAGATTGGCTT
TCGCTAAGTGTGGTATCATCGCTACTACTGTTGACGACTTCTTCGAC
GGTTTCGCTTGTAACGAAGAATTGCAAAACATCATCGAATTGGTTGA
AAGATGGGACGGTTACCCAACTGTTGGTTTCAGATCTGAAAGAGTTA
GAATCTTCTTCTTGGCTTTGTACAAGATGATCGAAGAAATCGCTGCT
AAGGCTGAAACTAAGCAAGGTAGATGTGTTAAGGACTTGTTGATCAA
CTTGTGGATCGACTTGTTGAAGTGTATGTTGGTTGAATTGGACTTGT
GGAAGATCAAGTCTACTACTCCATCTATCGAAGAATACTTGTCTATC
GCTTGTGTTACTACTGGTGTTAAGTGTTTGATCTTGATCTCTTTGCA
CTTGTTGGGTCCAAAGTTGTCTAAGGACGTTACTGAATCTTCTGAAG
TTTCTGCTTTGTGGAACTGTACTGCTGTTGTTGCTAGATTGAACAAC
GACATCCACTCTTACAAGAGAGAACAAGCTGAATCTTCTACTAACAT
GGCTGCTATCTTGATCTCTCAATCTCAAAGAACTATCTCTGAAGAAG
AAGCTATCAGACAAATCAAGGAAATGATGGAATCTAAGAGAAGAGAA
TTGTTGGGTATGGTTTTGCAAAACAAGGAATCTCAATTGCCACAAGT
TTGTAAGGACTTGTTCTGGACTACTTTCAAGGCTGCTTACTCTATCT
ACACTCACGGTGACGAATACAGATTCCCACAAGAATTGAAGAACCAC
ATCAACGACGTTATCTACAAGCCATTGAACCAATACTCTCCATAA
SEQ ID NO: 34 Optimized cDNA for S. cerevisiae expression encoding for NgSCS-del38
ATGTCTCACTCTACTGCTTCTTCTTTGGAAGAAGCTAAGGAAAGAAT
CAGAGAAACTTTCGGTAAGAACGAATTGTCTTCTTCTTCTTACGACA
CTGCTTGGGTTGCTATGGTTCCATCTAGATACTCTATGAACCAACCA
TGTTTCCCAAGATGTTTGGACTGGATCTTGGAAAACCAAAGAGAAGA
CGGTTCTTGGGGTTTGAACCCATCTTTGCCATTGTTGGTTAAGGACT
CTTTGTCTTCTACTTTGGCTTGTTTGTTGGCTTTGAGAAAGTGGAGA
ATCGGTGACAACCAAGTTCAAAGAGGTTTGGGTTTCATCGAAACTCA
CGGTTGGGCTGTTGACAACGTTGACCAAATCTCTCCATTGGGTTTCG
ACATCATCTTCCCATCTATGATCAAGTACGCTGAAAAGTTGAACTTG
GACTTGCCATTCGACCCAAACTTGGTTAACATGATGTTGAGAGAAAG
AGAATTGACTATCGAAAGAGCTTTGAAGAACGAATTCGAAGGTAACA
TGGCTAACGTTGAATACTTCGCTGAAGGTTTGGGTGAATTGTGTCAC
TGGAAGGAAATCATGTTGCACCAAAGAAGAAACGGTTCTCCATTCGA
CTCTCCAGCTACTACTGCTGCTGCTTTGATCTACCACCAACACGACG
AAAAGTGTTTCGGTTACTTGTCTTCTATCTTGAAGTTGCACGAAAAC
TGGGTTCCAACTATCTACCCAACTAAGGTTCACTCTAACTTGTTCTT
CGTTGACGCTTTGCAAAACTTGGGTGTTGACAGATACTTCAAGACTG
AATTGAAGTCTGTTTTGGACGAAATCTACAGATTGTGGTTGGAAAAG
AACGAAGAAATCTTCTCTGACATCGCTCACTGTGCTATGGCTTTCAG
ATTGTTGAGAATGAACAACTACGAAGTTTCTTCTGAAGAATTGGAAG
GTTTCGTTGACCAAGAACACTTCTTCACTACTTCTGGTGGTAAGTTG
ATCTCTCACGTTGCTATCTTGGAATTGCACAGAGCTTCTCAAGTTGA
CATCCAAGAAGGTAAGGACTTGATCTTGGACAAGATCTCTACTTGGA
CTAGAAACTTCATGGAACAAGAATTGTTGGACAACCAAATCTTGGAC
AGATCTAAGAAGGAAATGGAATTCGCTATGAGAAAGTTCTACGGTAC
TTTCGACAGAGTTGAAACTAGAAGATACATCGAATCTTACAAGATGG
ACTCTTTCAAGATCTTGAAGGCTGCTTACAGATCTTCTAACATCAAC
AACATCGACTTGTTGAAGTTCTCTGAACACGACTTCAACTTGTGTCA
AGCTAGACACAAGGAAGAATTGCAACAAATCAAGAGATGGTTCGCTG
ACTGTAAGTTGGAACAAGTTGGTTCTTCTCAAAACTACTTGTACACT
TCTTACTTCCCAATCGCTGCTATCTTGTTCGAACCAGAATACGGTGA
CGCTAGATTGGCTTTCGCTAAGTGTGGTATCATCGCTACTACTGTTG
ACGACTTCTTCGACGGTTTCGCTTGTAACGAAGAATTGCAAAACATC
ATCGAATTGGTTGAAAGATGGGACGGTTACCCAACTGTTGGTTTCAG
ATCTGAAAGAGTTAGAATCTTCTTCTTGGCTTTGTACAAGATGATCG
AAGAAATCGCTGCTAAGGCTGAAACTAAGCAAGGTAGATGTGTTAAG
GACTTGTTGATCAACTTGTGGATCGACTTGTTGAAGTGTATGTTGGT
TGAATTGGACTTGTGGAAGATCAAGTCTACTACTCCATCTATCGAAG
AATACTTGTCTATCGCTTGTGTTACTACTGGTGTTAAGTGTTTGATC
TTGATCTCTTTGCACTTGTTGGGTCCAAAGTTGTCTAAGGACGTTAC
TGAATCTTCTGAAGTTTCTGCTTTGTGGAACTGTACTGCTGTTGTTG
CTAGATTGAACAACGACATCCACTCTTACAAGAGAGAACAAGCTGAA
TCTTCTACTAACATGGTTGCTATCTTGATCTCTCAATCTCAAAGAAC
TATCTCTGAAGAAGAAGCTATCAGACAAATCAAGGAAATGATGGAAT
CTAAGAGAAGAGAATTGTTGGGTATGGTTTTGCAAAACAAGGAATCT
CAATTGCCACAAGTTTGTAAGGACTTGTTCTGGACTACTTTCAAGGC
TGCTTACTCTATCTACACTCACGGTGACGAATACAGATTCCCACAAG
AATTGAAGAACCACATCAACGACGTTATCTACAAGCCATTGAACCAA
TACTCTCCATAA
SEQ ID NO: 35 Primer for construction of fragment "a" (URA3 yeast marker)
AGGTGCAGTTCGCGTGCAATTATAACGTCGTGGCAACTGTTATCAGT
CGTACCGCGCCATTGAGAGTGCACCATACCACAGCTTT
SEQ ID NO: 36 Primer for construction of fragment "a" (URA3 yeast marker)
TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCG
TTGTTGCTGACCAGCGGTATTTCACACCGCATAGGGTA
SEQ ID NO: 37 Primer for construction of fragment "b" (AmpR E. coli marker)
TGGTCAGCAACAACGCCGAAGAATCACTCTCGTGTTGAGAATTGCAC
GCCTTGACCACGACACGTTAAGGGATTTTGGTCATGAG
SEQ ID NO: 38 Primer for construction of fragment "b" (AmpR E. coli marker)
AACGCGTACCCTAAGTACGGCACCACAGTGACTATGCAGTCCGCACT
TTGCCAATGCCAAAAATGTGCGCGGAACCCCTA
SEQ ID NO: 39 Primer for construction of fragment "c" (Yeast origin of replication)
TTGGCATTGGCAAAGTGCGGACTGCATAGTCACTGTGGTGCCGTACT
TAGGGTACGCGTTCCTGAACGAAGCATCTGTGCTTCA
SEQ ID NO: 40 Primer for construction of fragment "c" (Yeast origin of replication)
CCGAGATGCCAAAGGATAGGTGCTATGTTGATGACTACGACACAGAA
CTGCGGGTGACATAATGATAGCATTGAAGGATGAGACT
SEQ ID NO: 41 Primer for construction of fragment "d" (E. coli origin of replication)
ATGTCACCCGCAGTTCTGTGTCGTAGTCATCAACATAGCACCTATCC
TTTGGCATCTCGGTGAGCAAAAGGCCAGCAAAAGG
SEQ ID NO: 42 Primer for construction of fragment "d" (E. coli origin of replication)
CTCAGATGTACGGTGATCGCCACCATGTGACGGAAGCTATCCTGACA
GTGTAGCAAGTGCTGAGCGTCAGACCCCGTAGAA
SEQ ID NO: 43 Part of fragment "d" obtained by DNA synthesis
ATTCCTAGTGACGGCCTTGGGAACTCGATACACGATGTTCAGTAGAC
CGCTCACACATGG
SEQ ID NO: 44 Primer for construction of fragment "a" (LEU2 yeast marker)
AGGTGCAGTTCGCGTGCAATTATAACGTCGTGGCAACTGTTATCAGT
CGTACCGCGCCATTCGACTACGTCGTAAGGCC
SEQ ID NO: 45 Primer for construction of fragment "a" (LEU2 yeast marker)
TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCG
TTGTTGCTGACCATCGACGGTCGAGGAGAACTT
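The primer pairs listed above for neighbouring fragments can be checked for complementary 5' tails with a short script. The sketch below is illustrative only: the function names `reverse_complement` and `shared_tail_length` are hypothetical helpers introduced here, and the reading that such tails are meant to let adjacent fragments overlap during assembly is an assumption not stated in this listing. The two sequences are copied from SEQ ID NO: 36 and SEQ ID NO: 37.

```python
# Illustrative sketch: helper names are hypothetical, not part of the application.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def shared_tail_length(primer_a: str, primer_b: str) -> int:
    """Largest n such that the first n bases of primer_a equal the
    reverse complement of the first n bases of primer_b (0 if none)."""
    for n in range(min(len(primer_a), len(primer_b)), 0, -1):
        if primer_a[:n] == reverse_complement(primer_b[:n]):
            return n
    return 0


# SEQ ID NO: 36, primer for fragment "a" (URA3 yeast marker)
SEQ_36 = (
    "TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCG"
    "TTGTTGCTGACCAGCGGTATTTCACACCGCATAGGGTA"
)
# SEQ ID NO: 37, primer for fragment "b" (AmpR E. coli marker)
SEQ_37 = (
    "TGGTCAGCAACAACGCCGAAGAATCACTCTCGTGTTGAGAATTGCAC"
    "GCCTTGACCACGACACGTTAAGGGATTTTGGTCATGAG"
)

if __name__ == "__main__":
    print(shared_tail_length(SEQ_36, SEQ_37))
```

For these two primers the script should report a 60-nucleotide complementary tail, which leaves 25 nucleotides at the 3' end of each 85-nucleotide primer for annealing to its own template.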
Sequence Listing (45 sequences)
SEQ ID NO: 1 (793 residues, PRT, Salvia miltiorrhiza)
Met Ala Ser Leu Ser Ser Thr Ile Leu
Ser Arg Ser Pro Ala Ala Arg1 5 10 15Arg Arg Ile Thr Pro Ala Ser Ala
Lys Leu His Arg Pro Glu Cys Phe 20 25 30Ala Thr Ser Ala Trp Met Gly
Ser Ser Ser Lys Asn Leu Ser Leu Ser 35 40 45Tyr Gln Leu Asn His Lys
Lys Ile Ser Val Ala Thr Val Asp Ala Pro 50 55 60Gln Val His Asp His
Asp Gly Thr Thr Val His Gln Gly His Asp Ala65 70 75 80Val Lys Asn
Ile Glu Asp Pro Ile Glu Tyr Ile Arg Thr Leu Leu Arg 85 90 95Thr Thr
Gly Asp Gly Arg Ile Ser Val Ser Pro Tyr Asp Thr Ala Trp 100 105
110Val Ala Met Ile Lys Asp Val Glu Gly Arg Asp Gly Pro Gln Phe Pro
115 120 125Ser Ser Leu Glu Trp Ile Val Gln Asn Gln Leu Glu Asp Gly
Ser Trp 130 135 140Gly Asp Gln Lys Leu Phe Cys Val Tyr Asp Arg Leu
Val Asn Thr Ile145 150 155 160Ala Cys Val Val Ala Leu Arg Ser Trp
Asn Val His Ala His Lys Val 165 170 175Lys Arg Gly Val Thr Tyr Ile
Lys Glu Asn Val Asp Lys Leu Met Glu 180 185 190Gly Asn Glu Glu His
Met Thr Cys Gly Phe Glu Val Val Phe Pro Ala 195 200 205Leu Leu Gln
Lys Ala Lys Ser Leu Gly Ile Glu Asp Leu Pro Tyr Asp 210 215 220Ser
Pro Ala Val Gln Glu Val Tyr His Val Arg Glu Gln Lys Leu Lys225 230
235 240Arg Ile Pro Leu Glu Ile Met His Lys Ile Pro Thr Ser Leu Leu
Phe 245 250 255Ser Leu Glu Gly Leu Glu Asn Leu Asp Trp Asp Lys Leu
Leu Lys Leu 260 265 270Gln Ser Ala Asp Gly Ser Phe Leu Thr Ser Pro
Ser Ser Thr Ala Phe 275 280 285Ala Phe Met Gln Thr Lys Asp Glu Lys
Cys Tyr Gln Phe Ile Lys Asn 290 295 300Thr Ile Asp Thr Phe Asn Gly
Gly Ala Pro His Thr Tyr Pro Val Asp305 310 315 320Val Phe Gly Arg
Leu Trp Ala Ile Asp Arg Leu Gln Arg Leu Gly Ile 325 330 335Ser Arg
Phe Phe Glu Pro Glu Ile Ala Asp Cys Leu Ser His Ile His 340 345
350Lys Phe Trp Thr Asp Lys Gly Val Phe Ser Gly Arg Glu Ser Glu Phe
355 360 365Cys Asp Ile Asp Asp Thr Ser Met Gly Met Arg Leu Met Arg
Met His 370 375 380Gly Tyr Asp Val Asp Pro Asn Val Leu Arg Asn Phe
Lys Gln Lys Asp385 390 395 400Gly Lys Phe Ser Cys Tyr Gly Gly Gln
Met Ile Glu Ser Pro Ser Pro 405 410 415Ile Tyr Asn Leu Tyr Arg Ala
Ser Gln Leu Arg Phe Pro Gly Glu Glu 420 425 430Ile Leu Glu Asp Ala
Lys Arg Phe Ala Tyr Asp Phe Leu Lys Glu Lys 435 440 445Leu Ala Asn
Asn Gln Ile Leu Asp Lys Trp Val Ile Ser Lys His Leu 450 455 460Pro
Asp Glu Ile Lys Leu Gly Leu Glu Met Pro Trp Leu Ala Thr Leu465 470
475 480Pro Arg Val Glu Ala Lys Tyr Tyr Ile Gln Tyr Tyr Ala Gly Ser
Gly 485 490 495Asp Val Trp Ile Gly Lys Thr Leu Tyr Arg Met Pro Glu
Ile Ser Asn 500 505 510Asp Thr Tyr His Asp Leu Ala Lys Thr Asp Phe
Lys Arg Cys Gln Ala 515 520 525Lys His Gln Phe Glu Trp Leu Tyr Met
Gln Glu Trp Tyr Glu Ser Cys 530 535 540Gly Ile Glu Glu Phe Gly Ile
Ser Arg Lys Asp Leu Leu Leu Ser Tyr545 550 555 560Phe Leu Ala Thr
Ala Ser Ile Phe Glu Leu Glu Arg Thr Asn Glu Arg 565 570 575Ile Ala
Trp Ala Lys Ser Gln Ile Ile Ala Lys Met Ile Thr Ser Phe 580 585
590Phe Asn Lys Glu Thr Thr Ser Glu Glu Asp Lys Arg Ala Leu Leu Asn
595 600 605Glu Leu Gly Asn Ile Asn Gly Leu Asn Asp Thr Asn Gly Ala
Gly Arg 610 615 620Glu Gly Gly Ala Gly Ser Ile Ala Leu Ala Thr Leu
Thr Gln Phe Leu625 630 635 640Glu Gly Phe Asp Arg Tyr Thr Arg His
Gln Leu Lys Asn Ala Trp Ser 645 650 655Val Trp Leu Thr Gln Leu Gln
His Gly Glu Ala Asp Asp Ala Glu Leu 660 665 670Leu Thr Asn Thr Leu
Asn Ile Cys Ala Gly His Ile Ala Phe Arg Glu 675 680 685Glu Ile Leu
Ala His Asn Glu Tyr Lys Ala Leu Ser Asn Leu Thr Ser 690 695 700Lys
Ile Cys Arg Gln Leu Ser Phe Ile Gln Ser Glu Lys Glu Met Gly705 710
715 720Val Glu Gly Glu Ile Ala Ala Lys Ser Ser Ile Lys Asn Lys Glu
Leu 725 730 735Glu Glu Asp Met Gln Met Leu Val Lys Leu Val Leu Glu
Lys Tyr Gly 740 745 750Gly Ile Asp Arg Asn Ile Lys Lys Ala Phe Leu
Ala Val Ala Lys Thr 755 760 765Tyr Tyr Tyr Arg Ala Tyr His Ala Ala
Asp Thr Ile Asp Thr His Met 770 775 780Phe Lys Val Leu Phe Glu Pro
Val Ala785 790
SEQ ID NO: 2 (736 residues, PRT, Artificial Sequence) Truncated copalyl diphosphate synthase
Met Ala Thr Val Asp Ala Pro Gln Val His Asp
His Asp Gly Thr Thr1 5 10 15Val His Gln Gly His Asp Ala Val Lys Asn
Ile Glu Asp Pro Ile Glu 20 25 30Tyr Ile Arg Thr Leu Leu Arg Thr Thr
Gly Asp Gly Arg Ile Ser Val 35 40 45Ser Pro Tyr Asp Thr Ala Trp Val
Ala Met Ile Lys Asp Val Glu Gly 50 55 60Arg Asp Gly Pro Gln Phe Pro
Ser Ser Leu Glu Trp Ile Val Gln Asn65 70 75 80Gln Leu Glu Asp Gly
Ser Trp Gly Asp Gln Lys Leu Phe Cys Val Tyr 85 90 95Asp Arg Leu Val
Asn Thr Ile Ala Cys Val Val Ala Leu Arg Ser Trp 100 105 110Asn Val
His Ala His Lys Val Lys Arg Gly Val Thr Tyr Ile Lys Glu 115 120
125Asn Val Asp Lys Leu Met Glu Gly Asn Glu Glu His Met Thr Cys Gly
130 135 140Phe Glu Val Val Phe Pro Ala Leu Leu Gln Lys Ala Lys Ser
Leu Gly145 150 155 160Ile Glu Asp Leu Pro Tyr Asp Ser Pro Ala Val
Gln Glu Val Tyr His 165 170 175Val Arg Glu Gln Lys Leu Lys Arg Ile
Pro Leu Glu Ile Met His Lys 180 185 190Ile Pro Thr Ser Leu Leu Phe
Ser Leu Glu Gly Leu Glu Asn Leu Asp 195 200 205Trp Asp Lys Leu Leu
Lys Leu Gln Ser Ala Asp Gly Ser Phe Leu Thr 210 215 220Ser Pro Ser
Ser Thr Ala Phe Ala Phe Met Gln Thr Lys Asp Glu Lys225 230 235
240Cys Tyr Gln Phe Ile Lys Asn Thr Ile Asp Thr Phe Asn Gly Gly Ala
245 250 255Pro His Thr Tyr Pro Val Asp Val Phe Gly Arg Leu Trp Ala
Ile Asp 260 265 270Arg Leu Gln Arg Leu Gly Ile Ser Arg Phe Phe Glu
Pro Glu Ile Ala 275 280 285Asp Cys Leu Ser His Ile His Lys Phe Trp
Thr Asp Lys Gly Val Phe 290 295 300Ser Gly Arg Glu Ser Glu Phe Cys
Asp Ile Asp Asp Thr Ser Met Gly305 310 315 320Met Arg Leu Met Arg
Met His Gly Tyr Asp Val Asp Pro Asn Val Leu 325 330 335Arg Asn Phe
Lys Gln Lys Asp Gly Lys Phe Ser Cys Tyr Gly Gly Gln 340 345 350Met
Ile Glu Ser Pro Ser Pro Ile Tyr Asn Leu Tyr Arg Ala Ser Gln 355 360
365Leu Arg Phe Pro Gly Glu Glu Ile Leu Glu Asp Ala Lys Arg Phe Ala
370 375 380Tyr Asp Phe Leu Lys Glu Lys Leu Ala Asn Asn Gln Ile Leu
Asp Lys385 390 395 400Trp Val Ile Ser Lys His Leu Pro Asp Glu Ile
Lys Leu Gly Leu Glu 405 410 415Met Pro Trp Leu Ala Thr Leu Pro Arg
Val Glu Ala Lys Tyr Tyr Ile 420 425 430Gln Tyr Tyr Ala Gly Ser Gly
Asp Val Trp Ile Gly Lys Thr Leu Tyr 435 440 445Arg Met Pro Glu Ile
Ser Asn Asp Thr Tyr His Asp Leu Ala Lys Thr 450 455 460Asp Phe Lys
Arg Cys Gln Ala Lys His Gln Phe Glu Trp Leu Tyr Met465 470 475
480Gln Glu Trp Tyr Glu Ser Cys Gly Ile Glu Glu Phe Gly Ile Ser Arg
485 490 495Lys Asp Leu Leu Leu Ser Tyr Phe Leu Ala Thr Ala Ser Ile
Phe Glu 500 505 510Leu Glu Arg Thr Asn Glu Arg Ile Ala Trp Ala Lys
Ser Gln Ile Ile 515 520 525Ala Lys Met Ile Thr Ser Phe Phe Asn Lys
Glu Thr Thr Ser Glu Glu 530 535 540Asp Lys Arg Ala Leu Leu Asn Glu
Leu Gly Asn Ile Asn Gly Leu Asn545 550 555 560Asp Thr Asn Gly Ala
Gly Arg Glu Gly Gly Ala Gly Ser Ile Ala Leu 565 570 575Ala Thr Leu
Thr Gln Phe Leu Glu Gly Phe Asp Arg Tyr Thr Arg His 580 585 590Gln
Leu Lys Asn Ala Trp Ser Val Trp Leu Thr Gln Leu Gln His Gly 595 600
605Glu Ala Asp Asp Ala Glu Leu Leu Thr Asn Thr Leu Asn Ile Cys Ala
610 615 620Gly His Ile Ala Phe Arg Glu Glu Ile Leu Ala His Asn Glu
Tyr Lys625 630 635 640Ala Leu Ser Asn Leu Thr Ser Lys Ile Cys Arg
Gln Leu Ser Phe Ile 645 650 655Gln Ser Glu Lys Glu Met Gly Val Glu
Gly Glu Ile Ala Ala Lys Ser 660 665 670Ser Ile Lys Asn Lys Glu Leu
Glu Glu Asp Met Gln Met Leu Val Lys 675 680 685Leu Val Leu Glu Lys
Tyr Gly Gly Ile Asp Arg Asn Ile Lys Lys Ala 690 695 700Phe Leu Ala
Val Ala Lys Thr Tyr Tyr Tyr Arg Ala Tyr His Ala Ala705 710 715
720Asp Thr Ile Asp Thr His Met Phe Lys Val Leu Phe Glu Pro Val Ala
725 730 735
SEQ ID NO: 3 (2211 nt, DNA, Artificial Sequence) Optimized cDNA for E. coli expression encoding for SmCPS2
atggcaactg ttgacgcacc tcaagtccat
gatcacgatg gcaccaccgt tcaccagggt 60cacgacgcgg tgaagaacat cgaggacccg
atcgaataca ttcgtaccct gctgcgtacc 120actggtgatg gtcgcatcag
cgtcagcccg tatgacacgg cgtgggtggc gatgattaaa 180gacgtcgagg
gtcgcgatgg cccgcaattt ccttctagcc tggagtggat tgtccaaaat
240cagctggaag atggctcgtg gggtgaccag aagctgtttt gtgtttacga
tcgcctggtt 300aataccatcg catgtgtggt tgcgctgcgt agctggaatg
ttcacgctca taaagtcaaa 360cgtggcgtga cgtatatcaa ggaaaacgtg
gataagctga tggaaggcaa cgaagaacac 420atgacgtgtg gcttcgaggt
tgtttttcca gccttgctgc agaaagcaaa gtccctgggt 480attgaggatc
tgccgtacga ctcgccggca gtgcaagaag tctatcacgt ccgcgagcag
540aagctgaaac gcatcccgct ggagattatg cataagattc cgacctctct
gctgttctct 600ctggaaggtc tggagaacct ggattgggac aaactgctga
agctgcagtc cgctgacggt 660agctttctga ccagcccgag cagcacggcc
tttgcgttta tgcagaccaa agatgagaag 720tgctatcaat tcatcaagaa
tactattgat accttcaacg gtggcgcacc gcacacgtac 780ccagtagacg
tttttggtcg cctgtgggcg attgaccgtt tgcagcgtct gggtatcagc
840cgtttcttcg agccggagat tgcggactgc ttgagccata ttcacaaatt
ctggacggac 900aaaggcgtgt tcagcggtcg tgagagcgag ttctgcgaca
tcgacgatac gagcatgggt 960atgcgtctga tgcgtatgca cggttacgac
gtggacccga atgtgttgcg caacttcaag 1020caaaaagatg gcaagtttag
ctgctacggt ggccaaatga ttgagagccc gagcccgatc 1080tataacttat
atcgtgcgag ccaactgcgt ttcccgggtg aagaaattct ggaagatgcg
1140aagcgttttg cgtatgactt cctgaaggaa aagctcgcaa acaatcaaat
cttggataaa 1200tgggtgatca gcaagcactt gccggatgag attaaactgg
gtctggagat gccgtggttg 1260gccaccctgc cgagagttga ggcgaaatac
tatattcagt attacgcggg tagcggtgat 1320gtttggattg gcaagaccct
gtaccgcatg ccggagatca gcaatgatac ctatcatgac 1380ctggccaaga
ccgacttcaa acgctgtcaa gcgaaacatc aatttgaatg gttatacatg
1440caagagtggt acgaaagctg cggcatcgaa gagttcggta tctcccgtaa
agatctgctg 1500ctgtcttact ttctggcaac ggccagcatt ttcgagctgg
agcgtaccaa tgagcgtatt 1560gcctgggcga aatcacaaat cattgctaag
atgattacga gctttttcaa taaagaaacc 1620acgtccgagg aagataaacg
tgctctgctg aatgaactgg gcaacatcaa cggtctgaat 1680gacaccaacg
gtgccggtcg tgagggtggc gcaggcagca ttgcactggc cacgctgacc
1740cagttcctgg aaggtttcga ccgctacacc cgtcaccagc tgaagaacgc
gtggtccgtc 1800tggctgaccc agctgcagca tggtgaggca gacgacgcgg
agctgctgac caacacgttg 1860aatatctgcg ctggccatat cgcgtttcgc
gaagagattc tggcgcacaa cgagtacaaa 1920gccctgagca atctgacctc
taaaatctgt cgtcagctta gctttattca gagcgagaaa 1980gaaatgggcg
tggaaggtga gatcgcggca aaatccagca tcaagaacaa agaactggaa
2040gaagatatgc agatgttggt caagctcgtc ctggagaagt atggtggcat
cgaccgtaat 2100atcaagaaag cgtttctggc cgtggcgaaa acgtattact
accgcgcgta ccacgcggca 2160gataccattg acacccacat gtttaaggtt
ttgtttgagc cggttgctta a 2211
SEQ ID NO: 4 (575 residues, PRT, Salvia sclarea)
Met Ser Leu Ala
Phe Asn Val Gly Val Thr Pro Phe Ser Gly Gln Arg1 5 10 15Val Gly Ser
Arg Lys Glu Lys Phe Pro Val Gln Gly Phe Pro Val Thr 20 25 30Thr Pro
Asn Arg Ser Arg Leu Ile Val Asn Cys Ser Leu Thr Thr Ile 35 40 45Asp
Phe Met Ala Lys Met Lys Glu Asn Phe Lys Arg Glu Asp Asp Lys 50 55
60Phe Pro Thr Thr Thr Thr Leu Arg Ser Glu Asp Ile Pro Ser Asn Leu65
70 75 80Cys Ile Ile Asp Thr Leu Gln Arg Leu Gly Val Asp Gln Phe Phe
Gln 85 90 95Tyr Glu Ile Asn Thr Ile Leu Asp Asn Thr Phe Arg Leu Trp
Gln Glu 100 105 110Lys His Lys Val Ile Tyr Gly Asn Val Thr Thr His
Ala Met Ala Phe 115 120 125Arg Leu Leu Arg Val Lys Gly Tyr Glu Val
Ser Ser Glu Glu Leu Ala 130 135 140Pro Tyr Gly Asn Gln Glu Ala Val
Ser Gln Gln Thr Asn Asp Leu Pro145 150 155 160Met Ile Ile Glu Leu
Tyr Arg Ala Ala Asn Glu Arg Ile Tyr Glu Glu 165 170 175Glu Arg Ser
Leu Glu Lys Ile Leu Ala Trp Thr Thr Ile Phe Leu Asn 180 185 190Lys
Gln Val Gln Asp Asn Ser Ile Pro Asp Lys Lys Leu His Lys Leu 195 200
205Val Glu Phe Tyr Leu Arg Asn Tyr Lys Gly Ile Thr Ile Arg Leu Gly
210 215 220Ala Arg Arg Asn Leu Glu Leu Tyr Asp Met Thr Tyr Tyr Gln
Ala Leu225 230 235 240Lys Ser Thr Asn Arg Phe Ser Asn Leu Cys Asn
Glu Asp Phe Leu Val 245 250 255Phe Ala Lys Gln Asp Phe Asp Ile His
Glu Ala Gln Asn Gln Lys Gly 260 265 270Leu Gln Gln Leu Gln Arg Trp
Tyr Ala Asp Cys Arg Leu Asp Thr Leu 275 280 285Asn Phe Gly Arg Asp
Val Val Ile Ile Ala Asn Tyr Leu Ala Ser Leu 290 295 300Ile Ile Gly
Asp His Ala Phe Asp Tyr Val Arg Leu Ala Phe Ala Lys305 310 315
320Thr Ser Val Leu Val Thr Ile Met Asp Asp Phe Phe Asp Cys His Gly
325 330 335Ser Ser Gln Glu Cys Asp Lys Ile Ile Glu Leu Val Lys Glu
Trp Lys 340 345 350Glu Asn Pro Asp Ala Glu Tyr Gly Ser Glu Glu Leu
Glu Ile Leu Phe 355 360 365Met Ala Leu Tyr Asn Thr Val Asn Glu Leu
Ala Glu Arg Ala Arg Val 370 375 380Glu Gln Gly Arg Ser Val Lys Glu
Phe Leu Val Lys Leu Trp Val Glu385 390 395 400Ile Leu Ser Ala Phe
Lys Ile Glu Leu Asp Thr Trp Ser Asn Gly Thr 405 410 415Gln Gln Ser
Phe Asp Glu Tyr Ile Ser Ser Ser Trp Leu Ser Asn Gly 420 425 430Ser
Arg Leu Thr Gly Leu Leu Thr Met Gln Phe Val Gly Val Lys Leu 435 440
445Ser Asp Glu Met Leu Met Ser Glu Glu Cys Thr Asp Leu Ala Arg His
450 455 460Val Cys Met Val Gly Arg Leu Leu Asn Asp Val Cys Ser Ser
Glu Arg465 470 475 480Glu Arg Glu Glu Asn Ile Ala Gly Lys Ser Tyr
Ser Ile Leu Leu Ala 485 490 495Thr Glu Lys Asp Gly Arg Lys Val Ser
Glu Asp Glu Ala Ile Ala Glu 500 505 510Ile Asn Glu Met Val Glu Tyr
His Trp Arg Lys Val Leu Gln Ile Val 515 520 525Tyr Lys Lys Glu Ser
Ile Leu Pro Arg Arg Cys Lys Asp Val Phe Leu 530 535 540Glu Met Ala
Lys Gly Thr Phe Tyr Ala Tyr Gly Ile Asn Asp Glu Leu545 550
555 560Thr Ser Pro Gln Gln Ser Lys Glu Asp Met Lys Ser Phe Val Phe
565 570 575
SEQ ID NO: 5 (525 residues, PRT, Artificial Sequence) Truncated sclareol synthase from Salvia sclarea (SsScS)
Met Ala Lys Met Lys Glu Asn Phe Lys
Arg Glu Asp Asp Lys Phe Pro1 5 10 15Thr Thr Thr Thr Leu Arg Ser Glu
Asp Ile Pro Ser Asn Leu Cys Ile 20 25 30Ile Asp Thr Leu Gln Arg Leu
Gly Val Asp Gln Phe Phe Gln Tyr Glu 35 40 45Ile Asn Thr Ile Leu Asp
Asn Thr Phe Arg Leu Trp Gln Glu Lys His 50 55 60Lys Val Ile Tyr Gly
Asn Val Thr Thr His Ala Met Ala Phe Arg Leu65 70 75 80Leu Arg Val
Lys Gly Tyr Glu Val Ser Ser Glu Glu Leu Ala Pro Tyr 85 90 95Gly Asn
Gln Glu Ala Val Ser Gln Gln Thr Asn Asp Leu Pro Met Ile 100 105
110Ile Glu Leu Tyr Arg Ala Ala Asn Glu Arg Ile Tyr Glu Glu Glu Arg
115 120 125Ser Leu Glu Lys Ile Leu Ala Trp Thr Thr Ile Phe Leu Asn
Lys Gln 130 135 140Val Gln Asp Asn Ser Ile Pro Asp Lys Lys Leu His
Lys Leu Val Glu145 150 155 160Phe Tyr Leu Arg Asn Tyr Lys Gly Ile
Thr Ile Arg Leu Gly Ala Arg 165 170 175Arg Asn Leu Glu Leu Tyr Asp
Met Thr Tyr Tyr Gln Ala Leu Lys Ser 180 185 190Thr Asn Arg Phe Ser
Asn Leu Cys Asn Glu Asp Phe Leu Val Phe Ala 195 200 205Lys Gln Asp
Phe Asp Ile His Glu Ala Gln Asn Gln Lys Gly Leu Gln 210 215 220Gln
Leu Gln Arg Trp Tyr Ala Asp Cys Arg Leu Asp Thr Leu Asn Phe225 230
235 240Gly Arg Asp Val Val Ile Ile Ala Asn Tyr Leu Ala Ser Leu Ile
Ile 245 250 255Gly Asp His Ala Phe Asp Tyr Val Arg Leu Ala Phe Ala
Lys Thr Ser 260 265 270Val Leu Val Thr Ile Met Asp Asp Phe Phe Asp
Cys His Gly Ser Ser 275 280 285Gln Glu Cys Asp Lys Ile Ile Glu Leu
Val Lys Glu Trp Lys Glu Asn 290 295 300Pro Asp Ala Glu Tyr Gly Ser
Glu Glu Leu Glu Ile Leu Phe Met Ala305 310 315 320Leu Tyr Asn Thr
Val Asn Glu Leu Ala Glu Arg Ala Arg Val Glu Gln 325 330 335Gly Arg
Ser Val Lys Glu Phe Leu Val Lys Leu Trp Val Glu Ile Leu 340 345
350Ser Ala Phe Lys Ile Glu Leu Asp Thr Trp Ser Asn Gly Thr Gln Gln
355 360 365Ser Phe Asp Glu Tyr Ile Ser Ser Ser Trp Leu Ser Asn Gly
Ser Arg 370 375 380Leu Thr Gly Leu Leu Thr Met Gln Phe Val Gly Val
Lys Leu Ser Asp385 390 395 400Glu Met Leu Met Ser Glu Glu Cys Thr
Asp Leu Ala Arg His Val Cys 405 410 415Met Val Gly Arg Leu Leu Asn
Asp Val Cys Ser Ser Glu Arg Glu Arg 420 425 430Glu Glu Asn Ile Ala
Gly Lys Ser Tyr Ser Ile Leu Leu Ala Thr Glu 435 440 445Lys Asp Gly
Arg Lys Val Ser Glu Asp Glu Ala Ile Ala Glu Ile Asn 450 455 460Glu
Met Val Glu Tyr His Trp Arg Lys Val Leu Gln Ile Val Tyr Lys465 470
475 480Lys Glu Ser Ile Leu Pro Arg Arg Cys Lys Asp Val Phe Leu Glu
Met 485 490 495Ala Lys Gly Thr Phe Tyr Ala Tyr Gly Ile Asn Asp Glu
Leu Thr Ser 500 505 510Pro Gln Gln Ser Lys Glu Asp Met Lys Ser Phe
Val Phe 515 520 525
SEQ ID NO: 6 (1583 nt, DNA, Artificial Sequence) Optimized cDNA for E. coli expression encoding the truncated sclareol synthase from Salvia sclarea
atggcgaaaa tgaaggagaa ctttaaacgc gaggacgata
aattcccgac gaccacgacc 60ctgcgcagcg aggatatccc gagcaacctg tgcatcattg
ataccctgca gcgcctgggt 120gtcgatcagt tcttccaata cgaaatcaat
accattctgg acaatacttt tcgtctgtgg 180caagagaaac acaaagtgat
ctacggcaac gttaccaccc acgcgatggc gttccgtttg 240ttgcgtgtca
agggctacga ggtttccagc gaggaactgg cgccgtacgg taatcaggaa
300gcagttagcc aacagacgaa tgatctgcct atgatcattg agctgtatcg
cgcagcaaat 360gagcgtatct acgaagagga acgcagcctg gaaaagatcc
tggcgtggac cacgatcttc 420ctgaacaaac aagttcaaga caattctatt
cctgataaga agctgcataa actggtcgaa 480ttctatctgc gtaattacaa
gggcatcacg atccgtctgg gcgcacgccg taacctggag 540ttgtatgata
tgacgtatta ccaggctctg aaaagcacca atcgtttctc caatctgtgt
600aatgaggatt ttctggtgtt cgccaagcag gattttgaca tccacgaggc
gcaaaatcaa 660aaaggtctgc aacaactgca acgttggtac gctgactgtc
gcctggacac cctgaatttc 720ggtcgcgacg ttgtcattat tgcaaactat
ctggccagcc tgatcatcgg tgatcacgca 780ttcgactacg tccgcctggc
cttcgctaag accagcgttc tggtgaccat tatggatgat 840ttcttcgatt
gccacggttc tagccaggaa tgcgacaaaa tcattgagct ggtgaaagag
900tggaaagaaa accctgatgc ggaatacggt tccgaagagt tggagatcct
gtttatggcc 960ttgtacaaca ccgtgaatga actggccgag cgtgctcgtg
tggagcaggg ccgttctgtg 1020aaggagtttt tggtcaagtt gtgggtggaa
atcctgtccg cgttcaagat cgaactggat 1080acgtggtcga atggtacgca
acagagcttc gacgaataca tcagcagcag ctggctgagc 1140aatggcagcc
gtctgaccgg tttgctgacc atgcaatttg tgggtgttaa actgtccgat
1200gaaatgctga tgagcgaaga atgcaccgac ctggcacgcc atgtgtgtat
ggtgggtcgc 1260ctgctgaacg acgtctgcag cagcgaacgt gagcgcgagg
aaaacattgc aggcaagagc 1320tacagcatct tgttggccac cgagaaagat
ggtcgcaaag tgtctgagga cgaagcaatt 1380gcagagatta atgaaatggt
cgagtaccac tggcgtaagg ttttgcagat tgtgtataag 1440aaagagagca
tcttgccgcg tcgctgtaag gatgttttct tggagatggc gaagggcacg
1500ttctatgcgt acggcattaa cgacgagctg acgagcccgc aacaatcgaa
agaggacatg 1560aagagcttcg tgttctgagg tac 1583
SEQ ID NO: 7 (307 residues, PRT, Pantoea agglomerans)
Met Val Ser Gly Ser Lys Ala Gly Val Ser Pro His Arg
Glu Ile Glu1 5 10 15Val Met Arg Gln Ser Ile Asp Asp His Leu Ala Gly
Leu Leu Pro Glu 20 25 30Thr Asp Ser Gln Asp Ile Val Ser Leu Ala Met
Arg Glu Gly Val Met 35 40 45Ala Pro Gly Lys Arg Ile Arg Pro Leu Leu
Met Leu Leu Ala Ala Arg 50 55 60Asp Leu Arg Tyr Gln Gly Ser Met Pro
Thr Leu Leu Asp Leu Ala Cys65 70 75 80Ala Val Glu Leu Thr His Thr
Ala Ser Leu Met Leu Asp Asp Met Pro 85 90 95Cys Met Asp Asn Ala Glu
Leu Arg Arg Gly Gln Pro Thr Thr His Lys 100 105 110Lys Phe Gly Glu
Ser Val Ala Ile Leu Ala Ser Val Gly Leu Leu Ser 115 120 125Lys Ala
Phe Gly Leu Ile Ala Ala Thr Gly Asp Leu Pro Gly Glu Arg 130 135
140Arg Ala Gln Ala Val Asn Glu Leu Ser Thr Ala Val Gly Val Gln
Gly145 150 155 160Leu Val Leu Gly Gln Phe Arg Asp Leu Asn Asp Ala
Ala Leu Asp Arg 165 170 175Thr Pro Asp Ala Ile Leu Ser Thr Asn His
Leu Lys Thr Gly Ile Leu 180 185 190Phe Ser Ala Met Leu Gln Ile Val
Ala Ile Ala Ser Ala Ser Ser Pro 195 200 205Ser Thr Arg Glu Thr Leu
His Ala Phe Ala Leu Asp Phe Gly Gln Ala 210 215 220Phe Gln Leu Leu
Asp Asp Leu Arg Asp Asp His Pro Glu Thr Gly Lys225 230 235 240Asp
Arg Asn Lys Asp Ala Gly Lys Ser Thr Leu Val Asn Arg Leu Gly 245 250
255Ala Asp Ala Ala Arg Gln Lys Leu Arg Glu His Ile Asp Ser Ala Asp
260 265 270Lys His Leu Thr Phe Ala Cys Pro Gln Gly Gly Ala Ile Arg
Gln Phe 275 280 285Met His Leu Trp Phe Gly His His Leu Ala Asp Trp
Ser Pro Val Met 290 295 300Lys Ile Ala3058924DNAArtificial
SequenceOptimized cDNA encoding for the GGPP synthase from Pantoea
agglomerans 8atggtttctg gttcgaaagc aggagtatca cctcataggg aaatcgaagt
catgagacag 60tccattgatg accacttagc aggattgttg ccagaaacag attcccagga
tatcgttagc 120cttgctatga gagaaggtgt tatggcacct ggtaaacgta
tcagaccttt gctgatgtta 180cttgctgcaa gagacctgag atatcagggt
tctatgccta cactactgga tctagcttgt 240gctgttgaac tgacacatac
tgcttccttg atgctggatg acatgccttg tatggacaat 300gcggaactta
gaagaggtca accaacaacc cacaagaaat tcggagaatc tgttgccatt
360ttggcttctg taggtctgtt gtcgaaagct tttggcttga ttgctgcaac
tggtgatctt 420ccaggtgaaa ggagagcaca agctgtaaac gagctatcta
ctgcagttgg tgttcaaggt 480ctagtcttag gacagttcag agatttgaat
gacgcagctt tggacagaac tcctgatgct 540atcctgtcta cgaaccatct
gaagactggc atcttgttct cagctatgtt gcaaatcgta 600gccattgctt
ctgcttcttc accatctact agggaaacgt tacacgcatt cgcattggac
660tttggtcaag cctttcaact gctagacgat ttgagggatg atcatccaga
gacaggtaaa 720gaccgtaaca aagacgctgg taaaagcact ctagtcaaca
gattgggtgc tgatgcagct 780agacagaaac tgagagagca cattgactct
gctgacaaac acctgacatt tgcatgtcca 840caaggaggtg ctataaggca
gtttatgcac ctatggtttg gacaccatct tgctgattgg 900tctccagtga
tgaagatcgc ctaa 924
SEQ ID NO: 9 (71 nt, DNA, Artificial Sequence) Primer Sequence
ctgtttgagc cggtcgccta aggtaccaga aggagataaa taatggcgaa aatgaaggag
60aactttaaac g 71
SEQ ID NO: 10 (55 nt, DNA, Artificial Sequence) Primer Sequence
gcagcggttt ctttaccaga ctcgaggtca gaacacgaag ctcttcatgt cctct
55
SEQ ID NO: 11 (786 residues, PRT, Coleus forskohlii)
Met Gly Ser Leu Ser Thr Met Asn Leu
Asn His Ser Pro Met Ser Tyr1 5 10 15Ser Gly Ile Leu Pro Ser Ser Ser
Ala Lys Ala Lys Leu Leu Leu Pro 20 25 30Gly Cys Phe Ser Ile Ser Ala
Trp Met Asn Asn Gly Lys Asn Leu Asn 35 40 45Cys Gln Leu Thr His Lys
Lys Ile Ser Lys Val Ala Glu Ile Arg Val 50 55 60Ala Thr Val Asn Ala
Pro Pro Val His Asp Gln Asp Asp Ser Thr Glu65 70 75 80Asn Gln Cys
His Asp Ala Val Asn Asn Ile Glu Asp Pro Ile Glu Tyr 85 90 95Ile Arg
Thr Leu Leu Arg Thr Thr Gly Asp Gly Arg Ile Ser Val Ser 100 105
110Pro Tyr Asp Thr Ala Trp Val Ala Leu Ile Lys Asp Leu Gln Gly Arg
115 120 125Asp Ala Pro Glu Phe Pro Ser Ser Leu Glu Trp Ile Ile Gln
Asn Gln 130 135 140Leu Ala Asp Gly Ser Trp Gly Asp Ala Lys Phe Phe
Cys Val Tyr Asp145 150 155 160Arg Leu Val Asn Thr Ile Ala Cys Val
Val Ala Leu Arg Ser Trp Asp 165 170 175Val His Ala Glu Lys Val Glu
Arg Gly Val Arg Tyr Ile Asn Glu Asn 180 185 190Val Glu Lys Leu Arg
Asp Gly Asn Glu Glu His Met Thr Cys Gly Phe 195 200 205Glu Val Val
Phe Pro Ala Leu Leu Gln Arg Ala Lys Ser Leu Gly Ile 210 215 220Gln
Asp Leu Pro Tyr Asp Ala Pro Val Ile Gln Glu Ile Tyr His Ser225 230
235 240Arg Glu Gln Lys Ser Lys Arg Ile Pro Leu Glu Met Met His Lys
Val 245 250 255Pro Thr Ser Leu Leu Phe Ser Leu Glu Gly Leu Glu Asn
Leu Glu Trp 260 265 270Asp Lys Leu Leu Lys Leu Gln Ser Ala Asp Gly
Ser Phe Leu Thr Ser 275 280 285Pro Ser Ser Thr Ala Phe Ala Phe Met
Gln Thr Arg Asp Pro Lys Cys 290 295 300Tyr Gln Phe Ile Lys Asn Thr
Ile Gln Thr Phe Asn Gly Gly Ala Pro305 310 315 320His Thr Tyr Pro
Val Asp Val Phe Gly Arg Leu Trp Ala Ile Asp Arg 325 330 335Leu Gln
Arg Leu Gly Ile Ser Arg Phe Phe Glu Ser Glu Ile Ala Asp 340 345
350Cys Ile Ala His Ile His Arg Phe Trp Thr Glu Lys Gly Val Phe Ser
355 360 365Gly Arg Glu Ser Glu Phe Cys Asp Ile Asp Asp Thr Ser Met
Gly Val 370 375 380Arg Leu Met Arg Met His Gly Tyr Asp Val Asp Pro
Asn Val Leu Lys385 390 395 400Asn Phe Lys Lys Asp Asp Lys Phe Ser
Cys Tyr Gly Gly Gln Met Ile 405 410 415Glu Ser Pro Ser Pro Ile Tyr
Asn Leu Tyr Arg Ala Ser Gln Leu Arg 420 425 430Phe Pro Gly Glu Gln
Ile Leu Glu Asp Ala Asn Lys Phe Ala Tyr Asp 435 440 445Phe Leu Gln
Glu Lys Leu Ala His Asn Gln Ile Leu Asp Lys Trp Val 450 455 460Ile
Ser Lys His Leu Pro Asp Glu Ile Lys Leu Gly Leu Glu Met Pro465 470
475 480Trp Tyr Ala Thr Leu Pro Arg Val Glu Ala Arg Tyr Tyr Ile Gln
Tyr 485 490 495Tyr Ala Gly Ser Gly Asp Val Trp Ile Gly Lys Thr Leu
Tyr Arg Met 500 505 510Pro Glu Ile Ser Asn Asp Thr Tyr His Glu Leu
Ala Lys Thr Asp Phe 515 520 525Lys Arg Cys Gln Ala Gln His Gln Phe
Glu Trp Ile Tyr Met Gln Glu 530 535 540Trp Tyr Glu Ser Cys Asn Met
Glu Glu Phe Gly Ile Ser Arg Lys Glu545 550 555 560Leu Leu Val Ala
Tyr Phe Leu Ala Thr Ala Ser Ile Phe Glu Leu Glu 565 570 575Arg Ala
Asn Glu Arg Ile Ala Trp Ala Lys Ser Gln Ile Ile Ser Thr 580 585
590Ile Ile Ala Ser Phe Phe Asn Asn Gln Asn Thr Ser Pro Glu Asp Lys
595 600 605Leu Ala Phe Leu Thr Asp Phe Lys Asn Gly Asn Ser Thr Asn
Met Ala 610 615 620Leu Val Thr Leu Thr Gln Phe Leu Glu Gly Phe Asp
Arg Tyr Thr Ser625 630 635 640His Gln Leu Lys Asn Ala Trp Ser Val
Trp Leu Arg Lys Leu Gln Gln 645 650 655Gly Glu Gly Asn Gly Gly Ala
Asp Ala Glu Leu Leu Val Asn Thr Leu 660 665 670Asn Ile Cys Ala Gly
His Ile Ala Phe Arg Glu Glu Ile Leu Ala His 675 680 685Asn Asp Tyr
Lys Thr Leu Ser Asn Leu Thr Ser Lys Ile Cys Arg Gln 690 695 700Leu
Ser Gln Ile Gln Asn Glu Lys Glu Leu Glu Thr Glu Gly Gln Lys705 710
715 720Thr Ser Ile Lys Asn Lys Glu Leu Glu Glu Asp Met Gln Arg Leu
Val 725 730 735Lys Leu Val Leu Glu Lys Ser Arg Val Gly Ile Asn Arg
Asp Met Lys 740 745 750Lys Thr Phe Leu Ala Val Val Lys Thr Tyr Tyr
Tyr Lys Ala Tyr His 755 760 765Ser Ala Gln Ala Ile Asp Asn His Met
Phe Lys Val Leu Phe Glu Pro 770 775 780Val Ala78512724PRTArtificial
SequenceTruncated copalyl diphosphate synthase from Coleus
forskohlii 12Met Val Ala Thr Val Asn Ala Pro Pro Val His Asp Gln
Asp Asp Ser1 5 10 15Thr Glu Asn Gln Cys His Asp Ala Val Asn Asn Ile
Glu Asp Pro Ile 20 25 30Glu Tyr Ile Arg Thr Leu Leu Arg Thr Thr Gly
Asp Gly Arg Ile Ser 35 40 45Val Ser Pro Tyr Asp Thr Ala Trp Val Ala
Leu Ile Lys Asp Leu Gln 50 55 60Gly Arg Asp Ala Pro Glu Phe Pro Ser
Ser Leu Glu Trp Ile Ile Gln65 70 75 80Asn Gln Leu Ala Asp Gly Ser
Trp Gly Asp Ala Lys Phe Phe Cys Val 85 90 95Tyr Asp Arg Leu Val Asn
Thr Ile Ala Cys Val Val Ala Leu Arg Ser 100 105 110Trp Asp Val His
Ala Glu Lys Val Glu Arg Gly Val Arg Tyr Ile Asn 115 120 125Glu Asn
Val Glu Lys Leu Arg Asp Gly Asn Glu Glu His Met Thr Cys 130 135
140Gly Phe Glu Val Val Phe Pro Ala Leu Leu Gln Arg Ala Lys Ser
Leu145 150 155 160Gly Ile Gln Asp Leu Pro Tyr Asp Ala Pro Val Ile
Gln Glu Ile Tyr 165 170 175His Ser Arg Glu Gln Lys Ser Lys Arg Ile
Pro Leu Glu Met Met His 180 185 190Lys Val Pro Thr Ser Leu Leu Phe
Ser Leu Glu Gly Leu Glu Asn Leu 195 200 205Glu Trp Asp Lys Leu Leu
Lys Leu Gln Ser Ala Asp Gly Ser Phe Leu 210 215 220Thr Ser Pro Ser
Ser Thr Ala Phe Ala Phe Met Gln Thr Arg Asp Pro225 230 235 240Lys
Cys Tyr Gln Phe Ile Lys Asn Thr Ile Gln Thr Phe Asn Gly Gly 245 250
255Ala Pro His Thr Tyr Pro Val Asp Val Phe Gly Arg Leu Trp Ala Ile
260 265 270Asp Arg Leu Gln Arg Leu Gly Ile Ser Arg Phe Phe Glu Ser
Glu Ile 275 280 285Ala Asp Cys Ile Ala His Ile His Arg Phe Trp Thr
Glu Lys Gly Val 290 295 300Phe Ser Gly
Arg Glu Ser Glu Phe Cys Asp Ile Asp Asp Thr Ser Met305 310 315
320Gly Val Arg Leu Met Arg Met His Gly Tyr Asp Val Asp Pro Asn Val
325 330 335Leu Lys Asn Phe Lys Lys Asp Asp Lys Phe Ser Cys Tyr Gly
Gly Gln 340 345 350Met Ile Glu Ser Pro Ser Pro Ile Tyr Asn Leu Tyr
Arg Ala Ser Gln 355 360 365Leu Arg Phe Pro Gly Glu Gln Ile Leu Glu
Asp Ala Asn Lys Phe Ala 370 375 380Tyr Asp Phe Leu Gln Glu Lys Leu
Ala His Asn Gln Ile Leu Asp Lys385 390 395 400Trp Val Ile Ser Lys
His Leu Pro Asp Glu Ile Lys Leu Gly Leu Glu 405 410 415Met Pro Trp
Tyr Ala Thr Leu Pro Arg Val Glu Ala Arg Tyr Tyr Ile 420 425 430Gln
Tyr Tyr Ala Gly Ser Gly Asp Val Trp Ile Gly Lys Thr Leu Tyr 435 440
445Arg Met Pro Glu Ile Ser Asn Asp Thr Tyr His Glu Leu Ala Lys Thr
450 455 460Asp Phe Lys Arg Cys Gln Ala Gln His Gln Phe Glu Trp Ile
Tyr Met465 470 475 480Gln Glu Trp Tyr Glu Ser Cys Asn Met Glu Glu
Phe Gly Ile Ser Arg 485 490 495Lys Glu Leu Leu Val Ala Tyr Phe Leu
Ala Thr Ala Ser Ile Phe Glu 500 505 510Leu Glu Arg Ala Asn Glu Arg
Ile Ala Trp Ala Lys Ser Gln Ile Ile 515 520 525Ser Thr Ile Ile Ala
Ser Phe Phe Asn Asn Gln Asn Thr Ser Pro Glu 530 535 540Asp Lys Leu
Ala Phe Leu Thr Asp Phe Lys Asn Gly Asn Ser Thr Asn545 550 555
560Met Ala Leu Val Thr Leu Thr Gln Phe Leu Glu Gly Phe Asp Arg Tyr
565 570 575Thr Ser His Gln Leu Lys Asn Ala Trp Ser Val Trp Leu Arg
Lys Leu 580 585 590Gln Gln Gly Glu Gly Asn Gly Gly Ala Asp Ala Glu
Leu Leu Val Asn 595 600 605Thr Leu Asn Ile Cys Ala Gly His Ile Ala
Phe Arg Glu Glu Ile Leu 610 615 620Ala His Asn Asp Tyr Lys Thr Leu
Ser Asn Leu Thr Ser Lys Ile Cys625 630 635 640Arg Gln Leu Ser Gln
Ile Gln Asn Glu Lys Glu Leu Glu Thr Glu Gly 645 650 655Gln Lys Thr
Ser Ile Lys Asn Lys Glu Leu Glu Glu Asp Met Gln Arg 660 665 670Leu
Val Lys Leu Val Leu Glu Lys Ser Arg Val Gly Ile Asn Arg Asp 675 680
685Met Lys Lys Thr Phe Leu Ala Val Val Lys Thr Tyr Tyr Tyr Lys Ala
690 695 700Tyr His Ser Ala Gln Ala Ile Asp Asn His Met Phe Lys Val
Leu Phe705 710 715 720Glu Pro Val Ala
13 2175 DNA Artificial Sequence Optimized cDNA for E. coli expression encoding for CfCPS1-del63
13 atggtcgcta ctgtcaatgc tccaccggtc cacgatcaag
acgacagcac tgagaatcaa 60tgtcatgatg ccgtaaacaa tattgaagat ccaatcgagt
atatccgtac cctgttgcgc 120acgacgggtg atggtcgtat cagcgtcagc
ccgtacgata ccgcgtgggt ggcgctgatc 180aaagatctgc agggccgtga
cgcaccggag tttccgtcct ctcttgagtg gatcattcaa 240aaccagctgg
ccgacggttc ttggggcgac gccaaatttt tctgcgtgta tgaccgtctg
300gtgaacacca tcgcgtgcgt cgttgcgctg cgttcctggg acgtccacgc
ggaaaaagtt 360gagcgtggcg tgcgctatat caacgaaaat gtcgaaaagc
tgcgcgacgg taatgaagaa 420cacatgacct gtggctttga agttgttttc
ccggcgctcc tgcagcgcgc gaagtctctg 480ggtattcaag atctgccgta
cgatgctccg gtgatccaag agatttatca ctctcgtgag 540cagaagtcca
agcgtatccc gttggagatg atgcacaaag ttccgacgag cctgctgttc
600agcttggaag gcctggaaaa tctggagtgg gacaaactgc tgaagctgca
gagcgcggac 660ggtagcttcc tgacgagccc gagcagcacc gcatttgcat
ttatgcagac ccgtgacccg 720aagtgttacc aatttattaa gaacacgatt
cagacgttta acggtggtgc accgcatacc 780tatccggtag acgtctttgg
tcgcctgtgg gcaattgatc gtctgcagcg tttgggtatc 840agccgcttct
tcgaaagcga aattgcagat tgtatcgcac acatccatcg tttttggacc
900gagaaaggcg tctttagcgg ccgtgagtct gagttctgtg acatcgatga
cacgagcatg 960ggtgtccgtc tgatgcgtat gcatggctat gatgttgacc
cgaacgtgct gaagaatttt 1020aaaaaagatg acaagtttag ctgctacggc
ggtcagatga ttgagagccc gagcccgatt 1080tataatctgt accgcgcgag
ccaactgcgt ttcccgggtg aacagattct ggaagatgcc 1140aataaattcg
cgtatgattt cctgcaggaa aaactggcgc acaatcagat cctggataaa
1200tgggttatca gcaagcatct gcctgacgaa atcaaattgg gcctggagat
gccgtggtat 1260gcgaccttgc cgcgtgtcga agcgcgttac tacatccagt
actatgcggg tagcggcgat 1320gtctggattg gtaagacgct gtaccgtatg
ccagagatta gcaacgacac ctaccatgaa 1380ttggcaaaga ccgatttcaa
gcgttgccaa gcccaacacc agttcgagtg gatttacatg 1440caagagtggt
acgagtcgtg caacatggaa gagttcggta ttagccgcaa agaactgctg
1500gttgcatatt tcctggccac ggcgagcatc tttgagctgg agcgtgcgaa
tgaacgcatt 1560gcatgggcaa aaagccaaat catttctacc attatcgctt
cgttctttaa taaccaaaat 1620acgagccctg aggataaact ggcgtttctg
actgatttca aaaatggcaa cagcaccaac 1680atggctctgg tgaccctgac
ccagttcctg gaaggctttg accgctacac ttcccatcaa 1740ctgaaaaacg
cgtggagcgt ttggctgcgt aagctgcaac agggtgaggg taatggcggt
1800gccgacgccg agttactggt gaatacgctg aacatttgcg cgggtcacat
cgcgttccgt 1860gaagaaattc tggcacataa tgactataaa acgttgtcga
acctgaccag caagatttgt 1920cgccagctga gccagattca gaatgaaaaa
gaattggaaa ccgaaggcca aaagacttcc 1980attaagaaca aagaactgga
agaagatatg cagcgcctgg ttaaactggt tttggagaaa 2040agccgtgtgg
gtatcaatcg tgacatgaag aaaacgttcc tggctgtggt gaaaacctac
2100tattacaaag cataccactc cgcgcaggca atcgataacc acatgttcaa
ggttctgttc 2160gaaccggtgg cctaa 217514757PRTTriticum aestivum 14Met
Leu Thr Phe Thr Ala Ala Leu Arg His Val Pro Val Leu Asp Gln1 5 10
15Pro Thr Ser Glu Pro Trp Arg Arg Leu Ser Leu His Leu His Ser Gln
20 25 30Arg Arg Pro Cys Gly Leu Val Leu Ile Ser Lys Ser Pro Ser Tyr
Pro 35 40 45Glu Val Asp Val Gly Glu Trp Lys Val Asp Glu Tyr Arg Gln
Arg Thr 50 55 60Asp Glu Pro Ser Glu Thr Arg Gln Met Ile Asp Asp Ile
Arg Thr Ala65 70 75 80Leu Ala Ser Leu Gly Asp Asp Glu Thr Ser Met
Ser Val Ser Ala Tyr 85 90 95Asp Thr Ala Leu Val Ala Leu Val Lys Asn
Leu Asp Gly Gly Asp Gly 100 105 110Pro Gln Phe Pro Ser Cys Ile Asp
Trp Ile Val Gln Asn Gln Leu Pro 115 120 125Asp Gly Ser Trp Gly Asp
Pro Ala Phe Phe Met Val Gln Asp Arg Met 130 135 140Ile Ser Thr Leu
Ala Cys Val Val Ala Val Lys Ser Trp Asn Ile Asp145 150 155 160Arg
Asp Asn Leu Cys Asp Arg Gly Val Leu Phe Ile Lys Glu Asn Met 165 170
175Ser Arg Leu Val Glu Glu Glu Gln Asp Trp Met Pro Cys Gly Phe Glu
180 185 190Ile Asn Phe Pro Ala Leu Leu Glu Lys Ala Lys Asp Leu Asp
Leu Asp 195 200 205Ile Pro Tyr Asp His Pro Val Leu Glu Glu Ile Tyr
Ala Lys Arg Asn 210 215 220Leu Lys Leu Leu Lys Ile Pro Leu Asp Val
Leu His Ala Ile Pro Thr225 230 235 240Thr Leu Leu Phe Ser Val Glu
Gly Met Val Asp Leu Pro Leu Asp Trp 245 250 255Glu Lys Leu Leu Arg
Leu Arg Cys Pro Asp Gly Ser Phe His Ser Ser 260 265 270Pro Ala Ala
Thr Ala Ala Ala Leu Ser His Thr Gly Asp Lys Glu Cys 275 280 285His
Ala Phe Leu Asp Arg Leu Ile Gln Lys Phe Glu Gly Gly Val Pro 290 295
300Cys Ser His Ser Met Asp Thr Phe Glu Gln Leu Trp Val Val Asp
Arg305 310 315 320Leu Met Arg Leu Gly Ile Ser Arg His Phe Thr Ser
Glu Ile Gln Gln 325 330 335Cys Leu Glu Phe Ile Tyr Arg Arg Trp Thr
Gln Lys Gly Leu Ala His 340 345 350Asn Met His Cys Pro Ile Pro Asp
Ile Asp Asp Thr Ala Met Gly Phe 355 360 365Arg Leu Leu Arg Gln His
Gly Tyr Asp Val Thr Pro Ser Val Phe Lys 370 375 380His Phe Glu Lys
Asp Gly Lys Phe Val Cys Phe Pro Met Glu Thr Asn385 390 395 400His
Ala Ser Val Thr Pro Met His Asn Thr Tyr Arg Ala Ser Gln Phe 405 410
415Met Phe Pro Gly Asp Asp Asp Val Leu Ala Arg Ala Gly Arg Tyr Cys
420 425 430Arg Ala Phe Leu Gln Glu Arg Gln Ser Ser Asn Lys Leu Tyr
Asp Lys 435 440 445Trp Ile Ile Thr Lys Asp Leu Pro Gly Glu Val Gly
Tyr Thr Leu Asn 450 455 460Phe Pro Trp Lys Ser Ser Leu Pro Arg Ile
Glu Thr Arg Met Tyr Leu465 470 475 480Asp Gln Tyr Gly Gly Asn Asn
Asp Val Trp Ile Ala Lys Val Leu Tyr 485 490 495Arg Met Asn Leu Val
Ser Asn Asp Leu Tyr Leu Lys Met Ala Lys Ala 500 505 510Asp Phe Thr
Glu Tyr Gln Arg Leu Ser Arg Ile Glu Trp Asn Gly Leu 515 520 525Arg
Lys Trp Tyr Phe Arg Asn His Leu Gln Arg Tyr Gly Ala Thr Pro 530 535
540Lys Ser Ala Leu Lys Ala Tyr Phe Leu Ala Ser Ala Asn Ile Phe
Glu545 550 555 560Pro Gly Arg Ala Ala Glu Arg Leu Ala Trp Ala Arg
Met Ala Val Leu 565 570 575Ala Glu Ala Val Thr Thr His Phe Arg His
Ile Gly Gly Pro Cys Tyr 580 585 590Ser Thr Glu Asn Leu Glu Glu Leu
Ile Asp Leu Val Ser Phe Asp Asp 595 600 605Val Ser Gly Gly Leu Arg
Glu Ala Trp Lys Gln Trp Leu Met Ala Trp 610 615 620Thr Ala Lys Glu
Ser His Gly Ser Val Asp Gly Asp Thr Ala Leu Leu625 630 635 640Phe
Val Arg Thr Ile Glu Ile Cys Ser Gly Arg Ile Val Ser Ser Glu 645 650
655Gln Lys Leu Asn Leu Trp Asp Tyr Ser Gln Leu Glu Gln Leu Thr Ser
660 665 670Ser Ile Cys His Lys Leu Ala Thr Ile Gly Leu Ser Gln Asn
Glu Ala 675 680 685Ser Met Glu Asn Thr Glu Asp Leu His Gln Gln Val
Asp Leu Glu Met 690 695 700Gln Glu Leu Ser Trp Arg Val His Gln Gly
Cys His Gly Ile Asn Arg705 710 715 720Glu Thr Arg Gln Thr Phe Leu
Asn Val Val Lys Ser Phe Tyr Tyr Ser 725 730 735Ala His Cys Ser Pro
Glu Thr Val Asp Ser His Ile Ala Lys Val Ile 740 745 750Phe Gln Asp
Val Ile 755
15 699 PRT Artificial Sequence Truncated copalyl diphosphate synthase from Triticum aestivum
15 Met Tyr Arg Gln Arg Thr Asp Glu
Pro Ser Glu Thr Arg Gln Met Ile1 5 10 15Asp Asp Ile Arg Thr Ala Leu
Ala Ser Leu Gly Asp Asp Glu Thr Ser 20 25 30Met Ser Val Ser Ala Tyr
Asp Thr Ala Leu Val Ala Leu Val Lys Asn 35 40 45Leu Asp Gly Gly Asp
Gly Pro Gln Phe Pro Ser Cys Ile Asp Trp Ile 50 55 60Val Gln Asn Gln
Leu Pro Asp Gly Ser Trp Gly Asp Pro Ala Phe Phe65 70 75 80Met Val
Gln Asp Arg Met Ile Ser Thr Leu Ala Cys Val Val Ala Val 85 90 95Lys
Ser Trp Asn Ile Asp Arg Asp Asn Leu Cys Asp Arg Gly Val Leu 100 105
110Phe Ile Lys Glu Asn Met Ser Arg Leu Val Glu Glu Glu Gln Asp Trp
115 120 125Met Pro Cys Gly Phe Glu Ile Asn Phe Pro Ala Leu Leu Glu
Lys Ala 130 135 140Lys Asp Leu Asp Leu Asp Ile Pro Tyr Asp His Pro
Val Leu Glu Glu145 150 155 160Ile Tyr Ala Lys Arg Asn Leu Lys Leu
Leu Lys Ile Pro Leu Asp Val 165 170 175Leu His Ala Ile Pro Thr Thr
Leu Leu Phe Ser Val Glu Gly Met Val 180 185 190Asp Leu Pro Leu Asp
Trp Glu Lys Leu Leu Arg Leu Arg Cys Pro Asp 195 200 205Gly Ser Phe
His Ser Ser Pro Ala Ala Thr Ala Ala Ala Leu Ser His 210 215 220Thr
Gly Asp Lys Glu Cys His Ala Phe Leu Asp Arg Leu Ile Gln Lys225 230
235 240Phe Glu Gly Gly Val Pro Cys Ser His Ser Met Asp Thr Phe Glu
Gln 245 250 255Leu Trp Val Val Asp Arg Leu Met Arg Leu Gly Ile Ser
Arg His Phe 260 265 270Thr Ser Glu Ile Gln Gln Cys Leu Glu Phe Ile
Tyr Arg Arg Trp Thr 275 280 285Gln Lys Gly Leu Ala His Asn Met His
Cys Pro Ile Pro Asp Ile Asp 290 295 300Asp Thr Ala Met Gly Phe Arg
Leu Leu Arg Gln His Gly Tyr Asp Val305 310 315 320Thr Pro Ser Val
Phe Lys His Phe Glu Lys Asp Gly Lys Phe Val Cys 325 330 335Phe Pro
Met Glu Thr Asn His Ala Ser Val Thr Pro Met His Asn Thr 340 345
350Tyr Arg Ala Ser Gln Phe Met Phe Pro Gly Asp Asp Asp Val Leu Ala
355 360 365Arg Ala Gly Arg Tyr Cys Arg Ala Phe Leu Gln Glu Arg Gln
Ser Ser 370 375 380Asn Lys Leu Tyr Asp Lys Trp Ile Ile Thr Lys Asp
Leu Pro Gly Glu385 390 395 400Val Gly Tyr Thr Leu Asn Phe Pro Trp
Lys Ser Ser Leu Pro Arg Ile 405 410 415Glu Thr Arg Met Tyr Leu Asp
Gln Tyr Gly Gly Asn Asn Asp Val Trp 420 425 430Ile Ala Lys Val Leu
Tyr Arg Met Asn Leu Val Ser Asn Asp Leu Tyr 435 440 445Leu Lys Met
Ala Lys Ala Asp Phe Thr Glu Tyr Gln Arg Leu Ser Arg 450 455 460Ile
Glu Trp Asn Gly Leu Arg Lys Trp Tyr Phe Arg Asn His Leu Gln465 470
475 480Arg Tyr Gly Ala Thr Pro Lys Ser Ala Leu Lys Ala Tyr Phe Leu
Ala 485 490 495Ser Ala Asn Ile Phe Glu Pro Gly Arg Ala Ala Glu Arg
Leu Ala Trp 500 505 510Ala Arg Met Ala Val Leu Ala Glu Ala Val Thr
Thr His Phe Arg His 515 520 525Ile Gly Gly Pro Cys Tyr Ser Thr Glu
Asn Leu Glu Glu Leu Ile Asp 530 535 540Leu Val Ser Phe Asp Asp Val
Ser Gly Gly Leu Arg Glu Ala Trp Lys545 550 555 560Gln Trp Leu Met
Ala Trp Thr Ala Lys Glu Ser His Gly Ser Val Asp 565 570 575Gly Asp
Thr Ala Leu Leu Phe Val Arg Thr Ile Glu Ile Cys Ser Gly 580 585
590Arg Ile Val Ser Ser Glu Gln Lys Leu Asn Leu Trp Asp Tyr Ser Gln
595 600 605Leu Glu Gln Leu Thr Ser Ser Ile Cys His Lys Leu Ala Thr
Ile Gly 610 615 620Leu Ser Gln Asn Glu Ala Ser Met Glu Asn Thr Glu
Asp Leu His Gln625 630 635 640Gln Val Asp Leu Glu Met Gln Glu Leu
Ser Trp Arg Val His Gln Gly 645 650 655Cys His Gly Ile Asn Arg Glu
Thr Arg Gln Thr Phe Leu Asn Val Val 660 665 670Lys Ser Phe Tyr Tyr
Ser Ala His Cys Ser Pro Glu Thr Val Asp Ser 675 680 685His Ile Ala
Lys Val Ile Phe Gln Asp Val Ile 690 695
16 2100 DNA Artificial Sequence Optimized cDNA for E. coli expression encoding for TaTps1-del59
16 atgtatcgcc aaagaactga tgagccaagc gaaacccgcc
agatgatcga tgatattcgc 60accgctttgg ctagcctggg tgacgatgaa accagcatga
gcgtgagcgc atacgacacc 120gccctggttg ccctggtgaa gaacctggac
ggtggcgatg gcccgcagtt cccgagctgc 180attgactgga ttgttcagaa
ccagctgccg gacggtagct ggggcgaccc ggctttcttt 240atggttcagg
accgtatgat cagcaccctg gcctgtgtcg tggccgtgaa atcctggaat
300atcgatcgtg acaacttgtg cgatcgtggt gtcctgttta tcaaagaaaa
catgtcgcgt 360ctggttgaag aagaacaaga ttggatgcca tgtggcttcg
agattaactt tcctgcactg 420ttggagaaag ctaaagacct ggacttggac
attccgtacg atcatcctgt gctggaagag 480atttacgcga agcgtaatct
gaaactgctg aagattccgt tagatgtcct ccatgcgatc 540ccgacgacgc
tgttgttttc cgttgagggt atggtcgatc tgccgctgga ttgggagaaa
600ctgctgcgtc tgcgttgccc ggacggttct tttcattcta gcccggcggc
gacggcagcg 660gcgctgagcc acacgggtga caaagagtgt cacgccttcc
tggaccgcct gattcaaaag 720ttcgagggtg gcgtcccgtg ctcccacagc
atggacacct tcgagcaact gtgggttgtt 780gaccgtttga tgcgtctggg
tatcagccgt cattttacga gcgagatcca gcagtgcttg 840gagttcatct
atcgtcgttg gacccagaaa ggtctggcgc acaatatgca ctgcccgatc
900ccggacattg atgacactgc gatgggtttt cgtctgttga gacagcacgg
ttacgacgtg 960accccgtcgg ttttcaagca tttcgagaaa gacggcaagt
tcgtatgctt cccgatggaa 1020accaaccatg cgagcgtgac gccgatgcac
aatacctacc gtgcgagcca gttcatgttc 1080ccgggtgatg acgacgtgct
ggcccgtgcc ggccgctact gtcgcgcatt cttgcaagag 1140cgtcagagct
ctaacaagtt gtacgataag tggattatca cgaaagatct gccgggtgag
1200gttggctaca cgctgaactt
tccgtggaaa agctccctgc cgcgtattga aactcgtatg 1260tatctggatc
agtacggtgg caataacgat gtctggattg caaaggtcct gtatcgcatg
1320aacctggtta gcaatgacct gtacctgaaa atggcgaaag ccgactttac
cgagtatcaa 1380cgtctgtctc gcattgagtg gaacggcctg cgcaaatggt
attttcgcaa tcatctgcag 1440cgttacggtg cgaccccgaa gtccgcgctg
aaagcgtatt tcctggcgtc ggcaaacatc 1500tttgagcctg gccgcgcagc
cgagcgcctg gcatgggcac gtatggccgt gctggctgaa 1560gctgtaacga
ctcatttccg tcacattggc ggcccgtgct acagcaccga gaatctggaa
1620gaactgatcg accttgttag cttcgacgac gtgagcggcg gcttgcgtga
ggcgtggaag 1680caatggctga tggcgtggac cgcaaaagaa tcacacggca
gcgtggacgg tgacacggca 1740ctgctgtttg tccgcacgat tgagatttgc
agcggccgca tcgtttccag cgagcagaaa 1800ctgaatctgt gggattacag
ccagttagag caattgacca gcagcatctg tcataaactg 1860gccaccatcg
gtctgagcca gaacgaagct agcatggaaa ataccgaaga tctgcaccaa
1920caagtcgatt tggaaatgca agaactgtca tggcgtgttc accagggttg
tcacggtatt 1980aatcgcgaaa cccgtcaaac cttcctgaat gttgttaagt
ctttttatta ctccgcacac 2040tgcagcccgg aaaccgtgga cagccatatt
gcaaaagtga tctttcaaga cgttatctga 2100
17 785 PRT Marrubium vulgare
17 Met Gly Ser Leu Ser Thr Leu Asn Leu Ile Lys Thr Cys Val Thr Leu1
5 10 15Ala Ser Ser Glu Lys Leu Asn Gln Pro Ser Gln Cys Tyr Thr Ile
Ser 20 25 30Thr Cys Met Lys Ser Ser Asn Asn Pro Pro Phe Asn Tyr Tyr
Gln Ile 35 40 45Asn Gly Arg Lys Lys Met Ser Thr Ala Ile Asp Ser Ser
Val Asn Ala 50 55 60Pro Pro Glu Gln Lys Tyr Asn Ser Thr Ala Leu Glu
His Asp Thr Glu65 70 75 80Ile Ile Glu Ile Glu Asp His Ile Glu Cys
Ile Arg Arg Leu Leu Arg 85 90 95Thr Ala Gly Asp Gly Arg Ile Ser Val
Ser Pro Tyr Asp Thr Ala Trp 100 105 110Ile Ala Leu Ile Lys Asp Leu
Asp Gly His Asp Ser Pro Gln Phe Pro 115 120 125Ser Ser Met Glu Trp
Val Ala Asp Asn Gln Leu Pro Asp Gly Ser Trp 130 135 140Gly Asp Glu
His Phe Val Cys Val Tyr Asp Arg Leu Val Asn Thr Ile145 150 155
160Ala Cys Val Val Ala Leu Arg Ser Trp Asn Val His Ala His Lys Cys
165 170 175Glu Lys Gly Ile Lys Tyr Ile Lys Glu Asn Val His Lys Leu
Glu Asp 180 185 190Ala Asn Glu Glu His Met Thr Cys Gly Phe Glu Val
Val Phe Pro Ala 195 200 205Leu Leu Gln Arg Ala Gln Ser Met Gly Ile
Lys Gly Ile Pro Tyr Asn 210 215 220Ala Pro Val Ile Glu Glu Ile Tyr
Asn Ser Arg Glu Lys Lys Leu Lys225 230 235 240Arg Ile Pro Met Glu
Val Val His Lys Val Ala Thr Ser Leu Leu Phe 245 250 255Ser Leu Glu
Gly Leu Glu Asn Leu Glu Trp Glu Lys Leu Leu Lys Leu 260 265 270Gln
Ser Pro Asp Gly Ser Phe Leu Thr Ser Pro Ser Ser Thr Ala Phe 275 280
285Ala Phe Ile His Thr Lys Asp Arg Lys Cys Phe Asn Phe Ile Asn Asn
290 295 300Ile Val His Thr Phe Lys Gly Gly Ala Pro His Thr Tyr Pro
Val Asp305 310 315 320Ile Phe Gly Arg Leu Trp Ala Val Asp Arg Leu
Gln Arg Leu Gly Ile 325 330 335Ser Arg Phe Phe Glu Ser Glu Ile Ala
Glu Phe Leu Ser His Val His 340 345 350Arg Phe Trp Ser Asp Glu Ala
Gly Val Phe Ser Gly Arg Glu Ser Val 355 360 365Phe Cys Asp Ile Asp
Asp Thr Ser Met Gly Leu Arg Leu Leu Arg Met 370 375 380His Gly Tyr
His Val Asp Pro Asn Val Leu Lys Asn Phe Lys Gln Ser385 390 395
400Asp Lys Phe Ser Cys Tyr Gly Gly Gln Met Met Glu Cys Ser Ser Pro
405 410 415Ile Tyr Asn Leu Tyr Arg Ala Ser Gln Leu Gln Phe Pro Gly
Glu Glu 420 425 430Ile Leu Glu Glu Ala Asn Lys Phe Ala Tyr Lys Phe
Leu Gln Glu Lys 435 440 445Leu Glu Ser Asn Gln Ile Leu Asp Lys Trp
Leu Ile Ser Asn His Leu 450 455 460Ser Asp Glu Ile Lys Val Gly Leu
Glu Met Pro Trp Tyr Ala Thr Leu465 470 475 480Pro Arg Val Glu Thr
Ser Tyr Tyr Ile His His Tyr Gly Gly Gly Asp 485 490 495Asp Val Trp
Ile Gly Lys Thr Leu Tyr Arg Met Pro Glu Ile Ser Asn 500 505 510Asp
Thr Tyr Arg Glu Leu Ala Arg Leu Asp Phe Arg Arg Cys Gln Ala 515 520
525Gln His Gln Leu Glu Trp Ile Tyr Met Gln Arg Trp Tyr Glu Ser Cys
530 535 540Arg Met Gln Glu Phe Gly Ile Ser Arg Lys Glu Val Leu Arg
Ala Tyr545 550 555 560Phe Leu Ala Ser Gly Thr Ile Phe Glu Val Glu
Arg Ala Lys Glu Arg 565 570 575Val Ala Trp Ala Arg Ser Gln Ile Ile
Ser His Met Ile Lys Ser Phe 580 585 590Phe Asn Lys Glu Thr Thr Ser
Ser Asp Gln Lys Gln Ala Leu Leu Thr 595 600 605Glu Leu Leu Phe Gly
Asn Ile Ser Ala Ser Glu Thr Glu Lys Arg Glu 610 615 620Leu Asp Gly
Val Val Val Ala Thr Leu Arg Gln Phe Leu Glu Gly Phe625 630 635
640Asp Ile Gly Thr Arg His Gln Val Lys Ala Ala Trp Asp Val Trp Leu
645 650 655Arg Lys Val Glu Gln Gly Glu Ala His Gly Gly Ala Asp Ala
Glu Leu 660 665 670Cys Thr Thr Thr Leu Asn Thr Cys Ala Asn Gln His
Leu Ser Ser His 675 680 685Pro Asp Tyr Asn Thr Leu Ser Lys Leu Thr
Asn Lys Ile Cys His Lys 690 695 700Leu Ser Gln Ile Gln His Gln Lys
Glu Met Lys Gly Gly Ile Lys Ala705 710 715 720Lys Cys Ser Ile Asn
Asn Lys Glu Val Asp Ile Glu Met Gln Trp Leu 725 730 735Val Lys Leu
Val Leu Glu Lys Ser Gly Leu Asn Arg Lys Ala Lys Gln 740 745 750Ala
Phe Leu Ser Ile Ala Lys Thr Tyr Tyr Tyr Arg Ala Tyr Tyr Ala 755 760
765Asp Gln Thr Met Asp Ala His Ile Phe Lys Val Leu Phe Glu Pro Val
770 775 780Val 785
18 723 PRT Artificial Sequence Truncated copalyl diphosphate synthase from Marrubium vulgare
18 Met Ala Pro Pro Glu
Gln Lys Tyr Asn Ser Thr Ala Leu Glu His Asp1 5 10 15Thr Glu Ile Ile
Glu Ile Glu Asp His Ile Glu Cys Ile Arg Arg Leu 20 25 30Leu Arg Thr
Ala Gly Asp Gly Arg Ile Ser Val Ser Pro Tyr Asp Thr 35 40 45Ala Trp
Ile Ala Leu Ile Lys Asp Leu Asp Gly His Asp Ser Pro Gln 50 55 60Phe
Pro Ser Ser Met Glu Trp Val Ala Asp Asn Gln Leu Pro Asp Gly65 70 75
80Ser Trp Gly Asp Glu His Phe Val Cys Val Tyr Asp Arg Leu Val Asn
85 90 95Thr Ile Ala Cys Val Val Ala Leu Arg Ser Trp Asn Val His Ala
His 100 105 110Lys Cys Glu Lys Gly Ile Lys Tyr Ile Lys Glu Asn Val
His Lys Leu 115 120 125Glu Asp Ala Asn Glu Glu His Met Thr Cys Gly
Phe Glu Val Val Phe 130 135 140Pro Ala Leu Leu Gln Arg Ala Gln Ser
Met Gly Ile Lys Gly Ile Pro145 150 155 160Tyr Asn Ala Pro Val Ile
Glu Glu Ile Tyr Asn Ser Arg Glu Lys Lys 165 170 175Leu Lys Arg Ile
Pro Met Glu Val Val His Lys Val Ala Thr Ser Leu 180 185 190Leu Phe
Ser Leu Glu Gly Leu Glu Asn Leu Glu Trp Glu Lys Leu Leu 195 200
205Lys Leu Gln Ser Pro Asp Gly Ser Phe Leu Thr Ser Pro Ser Ser Thr
210 215 220Ala Phe Ala Phe Ile His Thr Lys Asp Arg Lys Cys Phe Asn
Phe Ile225 230 235 240Asn Asn Ile Val His Thr Phe Lys Gly Gly Ala
Pro His Thr Tyr Pro 245 250 255Val Asp Ile Phe Gly Arg Leu Trp Ala
Val Asp Arg Leu Gln Arg Leu 260 265 270Gly Ile Ser Arg Phe Phe Glu
Ser Glu Ile Ala Glu Phe Leu Ser His 275 280 285Val His Arg Phe Trp
Ser Asp Glu Ala Gly Val Phe Ser Gly Arg Glu 290 295 300Ser Val Phe
Cys Asp Ile Asp Asp Thr Ser Met Gly Leu Arg Leu Leu305 310 315
320Arg Met His Gly Tyr His Val Asp Pro Asn Val Leu Lys Asn Phe Lys
325 330 335Gln Ser Asp Lys Phe Ser Cys Tyr Gly Gly Gln Met Met Glu
Cys Ser 340 345 350Ser Pro Ile Tyr Asn Leu Tyr Arg Ala Ser Gln Leu
Gln Phe Pro Gly 355 360 365Glu Glu Ile Leu Glu Glu Ala Asn Lys Phe
Ala Tyr Lys Phe Leu Gln 370 375 380Glu Lys Leu Glu Ser Asn Gln Ile
Leu Asp Lys Trp Leu Ile Ser Asn385 390 395 400His Leu Ser Asp Glu
Ile Lys Val Gly Leu Glu Met Pro Trp Tyr Ala 405 410 415Thr Leu Pro
Arg Val Glu Thr Ser Tyr Tyr Ile His His Tyr Gly Gly 420 425 430Gly
Asp Asp Val Trp Ile Gly Lys Thr Leu Tyr Arg Met Pro Glu Ile 435 440
445Ser Asn Asp Thr Tyr Arg Glu Leu Ala Arg Leu Asp Phe Arg Arg Cys
450 455 460Gln Ala Gln His Gln Leu Glu Trp Ile Tyr Met Gln Arg Trp
Tyr Glu465 470 475 480Ser Cys Arg Met Gln Glu Phe Gly Ile Ser Arg
Lys Glu Val Leu Arg 485 490 495Ala Tyr Phe Leu Ala Ser Gly Thr Ile
Phe Glu Val Glu Arg Ala Lys 500 505 510Glu Arg Val Ala Trp Ala Arg
Ser Gln Ile Ile Ser His Met Ile Lys 515 520 525Ser Phe Phe Asn Lys
Glu Thr Thr Ser Ser Asp Gln Lys Gln Ala Leu 530 535 540Leu Thr Glu
Leu Leu Phe Gly Asn Ile Ser Ala Ser Glu Thr Glu Lys545 550 555
560Arg Glu Leu Asp Gly Val Val Val Ala Thr Leu Arg Gln Phe Leu Glu
565 570 575Gly Phe Asp Ile Gly Thr Arg His Gln Val Lys Ala Ala Trp
Asp Val 580 585 590Trp Leu Arg Lys Val Glu Gln Gly Glu Ala His Gly
Gly Ala Asp Ala 595 600 605Glu Leu Cys Thr Thr Thr Leu Asn Thr Cys
Ala Asn Gln His Leu Ser 610 615 620Ser His Pro Asp Tyr Asn Thr Leu
Ser Lys Leu Thr Asn Lys Ile Cys625 630 635 640His Lys Leu Ser Gln
Ile Gln His Gln Lys Glu Met Lys Gly Gly Ile 645 650 655Lys Ala Lys
Cys Ser Ile Asn Asn Lys Glu Val Asp Ile Glu Met Gln 660 665 670Trp
Leu Val Lys Leu Val Leu Glu Lys Ser Gly Leu Asn Arg Lys Ala 675 680
685Lys Gln Ala Phe Leu Ser Ile Ala Lys Thr Tyr Tyr Tyr Arg Ala Tyr
690 695 700Tyr Ala Asp Gln Thr Met Asp Ala His Ile Phe Lys Val Leu
Phe Glu705 710 715 720Pro Val Val
19 2172 DNA Artificial Sequence Optimized cDNA for E. coli expression encoding for MvCps3-del63
19 atggccccgc cggaacaaaa gtacaacagc actgcattag
aacacgacac cgagattatt 60gagatcgagg accacatcga gtgtatccgc cgtctgctgc
gtaccgcggg tgatggtcgt 120attagcgtga gcccgtatga taccgcgtgg
attgcactga ttaaagattt ggatggccac 180gactccccgc aattcccgtc
gagcatggaa tgggttgctg ataatcagct gccggacggt 240agctggggtg
acgagcactt cgtttgcgtt tacgatcgcc tggttaatac catcgcatgc
300gtcgtggcgc tgcgcagctg gaatgtccat gcacataagt gcgagaaagg
tattaagtac 360attaaagaaa atgtccacaa actggaagat gcgaacgaag
aacacatgac ttgcggcttc 420gaagtcgttt ttccggcctt gctgcagcgt
gcacagagca tgggtattaa gggcatcccg 480tacaacgcgc ctgtcattga
agaaatttac aattcccgtg agaaaaagct gaaacgtatt 540ccgatggaag
ttgtccacaa agtcgcgacc agcctgctgt tctccctgga aggtctggag
600aacctggagt gggagaaatt gctgaaactg cagagcccgg acggttcgtt
tctgaccagc 660ccgagctcta cggcattcgc gtttatccat accaaagacc
gtaaatgttt taactttatt 720aacaatatcg ttcatacctt taagggtggt
gcaccgcaca cgtaccctgt ggacatcttt 780ggccgcctgt gggcagtgga
tcgcttgcag cgtctgggta ttagccgctt cttcgagagc 840gagatcgcgg
aatttctgag ccacgtgcac cgtttttgga gcgacgaagc gggcgttttc
900agcggccgtg agagcgtgtt ctgtgatatt gatgacacca gcatgggtct
gcgcctgctt 960cgtatgcatg gctaccatgt agacccaaac gttctgaaga
acttcaagca atctgacaag 1020tttagctgct acggtggcca gatgatggaa
tgcagcagcc caatttacaa tctgtaccgt 1080gcgagccaac tgcaatttcc
gggtgaagaa atcttggaag aggctaacaa attcgcgtat 1140aagtttttgc
aagagaaact ggagtccaat cagattctgg acaagtggct gatctccaac
1200cacctgagcg acgaaatcaa agttggcctg gaaatgccgt ggtatgcgac
cttgccgcgc 1260gttgagacta gctattatat tcaccattac ggcggtggcg
acgatgtgtg gattggtaaa 1320acgctgtatc gcatgccgga aattagcaac
gacacctacc gtgagctggc acgtctggac 1380ttccgccgct gccaggcgca
gcaccagttg gaatggatct atatgcaacg ttggtatgag 1440agctgtcgta
tgcaagaatt tggtatttcc cgcaaagaag tcctgcgtgc ctacttcctg
1500gcctctggca cgattttcga agttgagcgc gccaaagagc gcgtggcgtg
ggctcgtagc 1560caaatcattt cccacatgat caagagcttc ttcaataaag
aaaccacgag cagcgatcag 1620aaacaagcgc tgctgaccga gttgctgttt
ggtaacatct ctgcaagcga gactgagaaa 1680cgtgagctgg atggtgttgt
ggttgcgacc ctgcgtcagt tcctggaagg cttcgatatc 1740ggcacccgtc
accaagtgaa ggcagcgtgg gatgtgtggc tgcgtaaagt cgaacagggt
1800gaggcacatg gtggcgcgga cgccgagttg tgtacgacga cgctgaacac
gtgcgcgaat 1860cagcatctgt ctagccatcc ggactacaat accctgtcga
aactcaccaa taagatttgt 1920cacaagctgt cccaaatcca gcatcagaaa
gaaatgaagg gcggtattaa ggcaaagtgc 1980tctatcaata acaaagaagt
ggatatcgag atgcaatggc tggtcaaact ggtcctggag 2040aaatccggtc
tgaaccgcaa ggctaaacaa gcgtttctga gcattgccaa aacctattat
2100tatcgtgctt actatgccga ccagacgatg gatgcccaca tcttcaaggt
cctgtttgaa 2160ccggtcgtgt aa 2172
20 799 PRT Rosmarinus officinalis
20 Met Thr Ser Met Ser Ser Leu Asn Leu Ser Arg Ala Pro Ala Ile Ser1
5 10 15Arg Arg Leu Gln Leu Pro Ala Lys Val Gln Leu Pro Glu Phe Tyr
Ala 20 25 30Val Cys Ser Trp Leu Asn Asn Ser Ser Lys His Thr Pro Leu
Ser Cys 35 40 45His Ile His Arg Lys Gln Leu Ser Lys Val Thr Lys Cys
Arg Val Ala 50 55 60Ser Leu Asp Ala Ser Gln Val Ser Glu Lys Gly Thr
Ser Ser Pro Val65 70 75 80Gln Thr Pro Glu Glu Val Asn Glu Lys Ile
Glu Asn Tyr Ile Glu Tyr 85 90 95Ile Lys Asn Leu Leu Thr Thr Ser Gly
Asp Gly Arg Ile Ser Val Ser 100 105 110Pro Tyr Asp Thr Ser Ile Val
Ala Leu Ile Lys Asp Leu Lys Gly Arg 115 120 125Asp Thr Pro Gln Phe
Pro Ser Cys Leu Glu Trp Ile Ala Gln His Gln 130 135 140Met Ala Asp
Gly Ser Trp Gly Asp Glu Phe Phe Cys Ile Tyr Asp Arg145 150 155
160Ile Leu Asn Thr Leu Ala Cys Val Val Ala Leu Lys Ser Trp Asn Val
165 170 175His Ala Asp Met Ile Glu Lys Gly Val Thr Tyr Val Asn Glu
Asn Val 180 185 190Gln Lys Leu Glu Asp Gly Asn Leu Glu His Met Thr
Ser Gly Phe Glu 195 200 205Ile Val Val Pro Ala Leu Val Gln Arg Ala
Gln Asp Leu Gly Ile Gln 210 215 220Gly Leu Pro Tyr Asp His Pro Leu
Ile Lys Glu Ile Ala Asn Thr Lys225 230 235 240Glu Gly Arg Leu Lys
Lys Ile Pro Lys Asp Met Ile Tyr Gln Lys Pro 245 250 255Thr Thr Leu
Leu Phe Ser Leu Glu Gly Leu Gly Asp Leu Glu Trp Glu 260 265 270Lys
Ile Leu Lys Leu Gln Ser Gly Asp Gly Ser Phe Leu Thr Ser Pro 275 280
285Ser Ser Thr Ala His Val Phe Met Lys Thr Lys Asp Glu Lys Cys Leu
290 295 300Lys Phe Ile Glu Asn Ala Val Lys Asn Cys Asn Gly Gly Ala
Pro His305 310 315 320Thr Tyr Pro Val Asp Val Phe Ala Arg Leu Trp
Ala Val Asp Arg Leu 325 330 335Gln Arg Leu Gly Ile Ser Arg Phe Phe
Gln Gln Glu Ile Lys Tyr Phe 340 345 350Leu Asp His Ile Asn Ser Val
Trp Thr Glu Asn Gly Val Phe Ser Gly 355 360 365Arg Asp Ser Glu Phe
Cys Asp Ile Asp Asp Thr Ser Met Gly Ile Arg 370 375 380Leu Leu Lys
Met His Gly Tyr Asp Ile Asp Pro Asn Ala Leu Glu His385 390 395
400Phe Lys Gln Gln Asp Gly Lys Phe Ser Cys Tyr Gly Gly Gln Met Ile
405 410 415Glu Ser Ala Ser Pro Ile Tyr Asn Leu Tyr Arg Ala Ala Gln Leu Arg
420 425 430Phe Pro Gly Glu Glu Ile Leu Glu Glu Ala Thr Lys Phe Ala
Tyr Asn 435 440 445Phe Leu Gln Glu Lys Ile Ala Asn Asp Gln Phe Gln
Glu Lys Trp Val 450 455 460Ile Ser Asp His Leu Ile Asp Glu Val Lys
Leu Gly Leu Lys Met Pro465 470 475 480Trp Tyr Ala Thr Leu Pro Arg
Val Glu Ala Ala Tyr Tyr Leu Gln Tyr 485 490 495Tyr Ala Gly Cys Gly
Asp Val Trp Ile Gly Lys Val Phe Tyr Arg Met 500 505 510Pro Glu Ile
Ser Asn Asp Thr Tyr Lys Lys Leu Ala Ile Leu Asp Phe 515 520 525Asn
Arg Cys Gln Ala Gln His Gln Phe Glu Trp Ile Tyr Met Gln Glu 530 535
540Trp Tyr His Arg Ser Ser Val Ser Glu Phe Gly Ile Ser Lys Lys
Asp545 550 555 560Leu Leu Arg Ala Tyr Phe Leu Ala Ala Ala Thr Ile
Phe Glu Pro Glu 565 570 575Arg Thr Gln Glu Arg Leu Val Trp Ala Lys
Thr Gln Ile Val Ser Gly 580 585 590Met Ile Thr Ser Phe Val Asn Ser
Gly Thr Thr Leu Ser Leu His Gln 595 600 605Lys Thr Ala Leu Leu Ser
Gln Ile Gly His Asn Phe Asp Gly Leu Asp 610 615 620Glu Ile Ile Ser
Ala Met Lys Asp His Gly Leu Ala Ala Thr Leu Leu625 630 635 640Thr
Thr Phe Gln Gln Leu Leu Asp Gly Phe Asp Arg Tyr Thr Arg His 645 650
655Gln Leu Lys Asn Ala Trp Ser Gln Trp Phe Met Lys Leu Gln Gln Gly
660 665 670Glu Ala Ser Gly Gly Glu Asp Ala Glu Leu Leu Ala Asn Thr
Leu Asn 675 680 685Ile Cys Ala Gly Leu Ile Ala Phe Asn Glu Asp Val
Leu Ser His His 690 695 700Glu Tyr Thr Thr Leu Ser Thr Leu Thr Asn
Lys Ile Cys Lys Arg Leu705 710 715 720Thr Gln Ile Gln Asp Lys Lys
Thr Leu Glu Val Val Asp Gly Ser Ile 725 730 735Lys Asp Lys Glu Leu
Glu Lys Asp Ile Gln Met Leu Val Lys Leu Val 740 745 750Leu Glu Glu
Asn Gly Gly Gly Val Asp Arg Asn Ile Lys His Thr Phe 755 760 765Leu
Ser Val Phe Lys Thr Phe Tyr Tyr Asn Ala Tyr His Asp Asp Glu 770 775
780Thr Thr Asp Val His Ile Phe Lys Val Leu Phe Gly Pro Val Val785
790 795
21 733 PRT Artificial Sequence Truncated copalyl diphosphate synthase from Rosmarinus officinalis
21 Met Ala Ser Gln Val Ser Glu
Lys Gly Thr Ser Ser Pro Val Gln Thr1 5 10 15Pro Glu Glu Val Asn Glu
Lys Ile Glu Asn Tyr Ile Glu Tyr Ile Lys 20 25 30Asn Leu Leu Thr Thr
Ser Gly Asp Gly Arg Ile Ser Val Ser Pro Tyr 35 40 45Asp Thr Ser Ile
Val Ala Leu Ile Lys Asp Leu Lys Gly Arg Asp Thr 50 55 60Pro Gln Phe
Pro Ser Cys Leu Glu Trp Ile Ala Gln His Gln Met Ala65 70 75 80Asp
Gly Ser Trp Gly Asp Glu Phe Phe Cys Ile Tyr Asp Arg Ile Leu 85 90
95Asn Thr Leu Ala Cys Val Val Ala Leu Lys Ser Trp Asn Val His Ala
100 105 110Asp Met Ile Glu Lys Gly Val Thr Tyr Val Asn Glu Asn Val
Gln Lys 115 120 125Leu Glu Asp Gly Asn Leu Glu His Met Thr Ser Gly
Phe Glu Ile Val 130 135 140Val Pro Ala Leu Val Gln Arg Ala Gln Asp
Leu Gly Ile Gln Gly Leu145 150 155 160Pro Tyr Asp His Pro Leu Ile
Lys Glu Ile Ala Asn Thr Lys Glu Gly 165 170 175Arg Leu Lys Lys Ile
Pro Lys Asp Met Ile Tyr Gln Lys Pro Thr Thr 180 185 190Leu Leu Phe
Ser Leu Glu Gly Leu Gly Asp Leu Glu Trp Glu Lys Ile 195 200 205Leu
Lys Leu Gln Ser Gly Asp Gly Ser Phe Leu Thr Ser Pro Ser Ser 210 215
220Thr Ala His Val Phe Met Lys Thr Lys Asp Glu Lys Cys Leu Lys
Phe225 230 235 240Ile Glu Asn Ala Val Lys Asn Cys Asn Gly Gly Ala
Pro His Thr Tyr 245 250 255Pro Val Asp Val Phe Ala Arg Leu Trp Ala
Val Asp Arg Leu Gln Arg 260 265 270Leu Gly Ile Ser Arg Phe Phe Gln
Gln Glu Ile Lys Tyr Phe Leu Asp 275 280 285His Ile Asn Ser Val Trp
Thr Glu Asn Gly Val Phe Ser Gly Arg Asp 290 295 300Ser Glu Phe Cys
Asp Ile Asp Asp Thr Ser Met Gly Ile Arg Leu Leu305 310 315 320Lys
Met His Gly Tyr Asp Ile Asp Pro Asn Ala Leu Glu His Phe Lys 325 330
335Gln Gln Asp Gly Lys Phe Ser Cys Tyr Gly Gly Gln Met Ile Glu Ser
340 345 350Ala Ser Pro Ile Tyr Asn Leu Tyr Arg Ala Ala Gln Leu Arg
Phe Pro 355 360 365Gly Glu Glu Ile Leu Glu Glu Ala Thr Lys Phe Ala
Tyr Asn Phe Leu 370 375 380Gln Glu Lys Ile Ala Asn Asp Gln Phe Gln
Glu Lys Trp Val Ile Ser385 390 395 400Asp His Leu Ile Asp Glu Val
Lys Leu Gly Leu Lys Met Pro Trp Tyr 405 410 415Ala Thr Leu Pro Arg
Val Glu Ala Ala Tyr Tyr Leu Gln Tyr Tyr Ala 420 425 430Gly Cys Gly
Asp Val Trp Ile Gly Lys Val Phe Tyr Arg Met Pro Glu 435 440 445Ile
Ser Asn Asp Thr Tyr Lys Lys Leu Ala Ile Leu Asp Phe Asn Arg 450 455
460Cys Gln Ala Gln His Gln Phe Glu Trp Ile Tyr Met Gln Glu Trp
Tyr465 470 475 480His Arg Ser Ser Val Ser Glu Phe Gly Ile Ser Lys
Lys Asp Leu Leu 485 490 495Arg Ala Tyr Phe Leu Ala Ala Ala Thr Ile
Phe Glu Pro Glu Arg Thr 500 505 510Gln Glu Arg Leu Val Trp Ala Lys
Thr Gln Ile Val Ser Gly Met Ile 515 520 525Thr Ser Phe Val Asn Ser
Gly Thr Thr Leu Ser Leu His Gln Lys Thr 530 535 540Ala Leu Leu Ser
Gln Ile Gly His Asn Phe Asp Gly Leu Asp Glu Ile545 550 555 560Ile
Ser Ala Met Lys Asp His Gly Leu Ala Ala Thr Leu Leu Thr Thr 565 570
575Phe Gln Gln Leu Leu Asp Gly Phe Asp Arg Tyr Thr Arg His Gln Leu
580 585 590Lys Asn Ala Trp Ser Gln Trp Phe Met Lys Leu Gln Gln Gly
Glu Ala 595 600 605Ser Gly Gly Glu Asp Ala Glu Leu Leu Ala Asn Thr
Leu Asn Ile Cys 610 615 620Ala Gly Leu Ile Ala Phe Asn Glu Asp Val
Leu Ser His His Glu Tyr625 630 635 640Thr Thr Leu Ser Thr Leu Thr
Asn Lys Ile Cys Lys Arg Leu Thr Gln 645 650 655Ile Gln Asp Lys Lys
Thr Leu Glu Val Val Asp Gly Ser Ile Lys Asp 660 665 670Lys Glu Leu
Glu Lys Asp Ile Gln Met Leu Val Lys Leu Val Leu Glu 675 680 685Glu
Asn Gly Gly Gly Val Asp Arg Asn Ile Lys His Thr Phe Leu Ser 690 695
700Val Phe Lys Thr Phe Tyr Tyr Asn Ala Tyr His Asp Asp Glu Thr
Thr705 710 715 720Asp Val His Ile Phe Lys Val Leu Phe Gly Pro Val
Val 725 730
22 2202 DNA Artificial Sequence Optimized cDNA for E. coli expression encoding for RoCPS1-del67
22 atggcatcac aagttagcga
gaaaggcacc agctccccag ttcaaacgcc agaggaagtg 60aacgaaaaga tcgagaatta
cattgagtat attaaaaatc tgctgactac ttcgggcgac 120ggccgcatca
gcgtcagccc gtacgacacg agcatcgttg ccctgattaa agacctgaag
180ggtcgtgaca ccccgcagtt tccgtcctgt ctggagtgga ttgcccaaca
ccaaatggcc 240gatggttcct ggggtgatga atttttctgc atttacgacc
gcatcctgaa tacgctggct 300tgtgttgtcg ccctgaagtc ctggaatgtt
catgcagaca tgatcgaaaa gggtgtcact 360tacgttaacg aaaacgtgca
gaaactggaa gatggcaatc tggagcacat gacgagcggt 420ttcgagattg
ttgtcccggc gctggttcag agagcgcaag acctgggcat ccagggcctg
480ccgtatgatc atccgttgat caaagaaatc gcaaacacca aagagggccg
cctgaagaaa 540attcctaaag acatgattta tcagaaaccg actacgctgc
tgttcagcct ggaaggcttg 600ggcgacctgg agtgggaaaa gatcctgaag
ttacagtctg gtgatggttc tttcctgacc 660agcccgagct ctacggccca
tgttttcatg aaaaccaaag atgagaagtg tctgaagttt 720attgaaaatg
ccgtcaagaa ttgcaacggt ggcgcgcctc acacctaccc ggtggacgtt
780ttcgctcgtc tgtgggccgt cgatcgtctg caacgcctgg gcatctcgcg
tttcttccag 840caagagatta agtacttcct ggaccacatt aatagcgtgt
ggaccgaaaa cggcgttttc 900agcggtcgcg acagcgagtt ttgtgatatt
gatgacacct ctatgggtat ccgtttgctg 960aagatgcacg gttacgacat
tgacccgaat gccctggagc actttaaaca acaggatggt 1020aagttctcct
gctacggtgg tcagatgatt gagagcgcga gcccgatcta caacctgtac
1080cgtgctgcgc agctgcgttt tccgggtgaa gagattctgg aagaggccac
caaatttgcg 1140tataattttt tgcaagagaa aattgcaaac gaccaattcc
aggaaaaatg ggttattagc 1200gatcacctta tcgatgaagt gaaactgggt
ttgaagatgc cgtggtacgc gacgctgcca 1260cgtgtcgagg cagcgtatta
tctgcagtat tatgcgggct gtggtgatgt gtggatcggc 1320aaagtgttct
accgtatgcc ggaaatcagc aatgacacct acaagaaact ggccatcctg
1380gatttcaacc gttgccaggc gcaacaccaa ttcgagtgga tctacatgca
agagtggtat 1440catcgtagca gcgtttctga gtttggcatt tccaaaaaag
acttgctgcg cgcgtatttt 1500ctggcggcag cgaccatttt cgaaccggag
cgcacccagg aacgtctggt gtgggctaag 1560acgcaaatcg tcagcggtat
gattacgtcc tttgttaata gcggtacgac tctgagcctg 1620caccagaaaa
cggcactgtt gagccaaatc ggtcataact ttgacggcct ggatgagatt
1680atcagcgcga tgaaagacca cggcctggca gcgacgctgt taacgacctt
tcaacagctg 1740ctggacggct tcgatcgcta cacccgtcat cagctgaaaa
acgcgtggag ccagtggttc 1800atgaagctgc aacagggtga ggcgtcgggt
ggcgaagatg ctgagctgct ggctaatacc 1860ctgaacattt gcgcgggttt
gattgcgttt aatgaagatg tgttgagcca ccatgagtac 1920accaccctga
gcaccctgac caacaagatc tgtaagcgct tgactcaaat ccaggataag
1980aaaacgctgg aagtcgtgga tggtagcatc aaagataaag aactggaaaa
agacattcaa 2040atgctggtga aactggtcct tgaagagaac ggcggtggcg
ttgaccgtaa catcaagcac 2100accttcctga gcgtctttaa aaccttttat
tataatgcct atcatgacga tgaaacgacc 2160gacgtgcaca ttttcaaagt
tctgttcggt ccggtcgtgt aa 2202
SEQ ID NO: 23; LENGTH: 766; TYPE: PRT; ORGANISM: Artificial Sequence; OTHER INFORMATION: Truncated putative sclareol synthase from Nicotiana glutinosa
Met Ala Asn
Phe His Arg Pro Ser Arg Val Arg Cys Ser His Ser Thr1 5 10 15Ala Ser
Ser Leu Glu Glu Ala Lys Glu Arg Ile Arg Glu Thr Phe Gly 20 25 30Lys
Asn Glu Leu Ser Pro Ser Ser Tyr Asp Thr Ala Trp Val Ala Met 35 40
45Val Pro Ser Arg Tyr Ser Met Asn Gln Pro Cys Phe Pro Arg Cys Leu
50 55 60Asp Trp Ile Leu Glu Asn Gln Arg Glu Asp Gly Ser Trp Gly Leu
Asn65 70 75 80Pro Ser His Pro Leu Leu Val Lys Asp Ser Leu Ser Ser
Thr Leu Ala 85 90 95Cys Leu Leu Ala Leu Arg Lys Trp Arg Ile Gly Asp
Asn Gln Val Gln 100 105 110Arg Gly Leu Gly Phe Ile Glu Thr His Gly
Trp Ala Val Asp Asn Val 115 120 125Asp Gln Ile Ser Pro Leu Gly Phe
Asp Ile Ile Phe Pro Ser Met Ile 130 135 140Lys Tyr Ala Glu Lys Leu
Asn Leu Asp Leu Pro Phe Asp Pro Asn Leu145 150 155 160Val Asn Met
Met Leu Arg Glu Arg Glu Leu Thr Ile Glu Arg Ala Leu 165 170 175Lys
Asn Glu Phe Glu Gly Asn Met Ala Asn Val Glu Tyr Phe Ala Glu 180 185
190Gly Leu Gly Glu Leu Cys His Trp Lys Glu Ile Met Leu His Gln Arg
195 200 205Arg Asn Gly Ser Leu Phe Asp Ser Pro Ala Thr Thr Ala Ala
Ala Leu 210 215 220Ile Tyr His Gln His Asp Glu Lys Cys Phe Gly Tyr
Leu Ser Ser Ile225 230 235 240Leu Lys Leu His Glu Asn Trp Val Pro
Thr Ile Tyr Pro Thr Lys Val 245 250 255His Ser Asn Leu Phe Phe Val
Asp Ala Leu Gln Asn Leu Gly Val Asp 260 265 270Arg Tyr Phe Lys Thr
Glu Leu Lys Ser Val Leu Asp Glu Ile Tyr Arg 275 280 285Leu Trp Leu
Glu Lys Asn Glu Glu Ile Phe Ser Asp Ile Ala His Cys 290 295 300Ala
Met Ala Phe Arg Leu Leu Arg Met Asn Asn Tyr Glu Val Ser Ser305 310
315 320Glu Glu Leu Glu Gly Phe Val Asp Gln Glu His Phe Phe Thr Thr
Ser 325 330 335Gly Gly Lys Leu Ile Ser His Val Ala Ile Leu Glu Leu
His Arg Ala 340 345 350Ser Gln Val Asp Ile Gln Glu Gly Lys Asp Leu
Ile Leu Asp Lys Ile 355 360 365Ser Thr Trp Thr Arg Asn Phe Met Glu
Gln Glu Leu Leu Asp Asn Gln 370 375 380Ile Leu Asp Arg Ser Lys Lys
Glu Met Glu Phe Ala Met Arg Lys Phe385 390 395 400Tyr Gly Thr Phe
Asp Arg Val Glu Thr Arg Arg Tyr Ile Glu Ser Tyr 405 410 415Lys Met
Asp Ser Phe Lys Ile Leu Lys Ala Ala Tyr Arg Ser Ser Asn 420 425
430Ile Asn Asn Ile Asp Leu Leu Lys Phe Ser Glu His Asp Phe Asn Leu
435 440 445Cys Gln Ala Arg His Lys Glu Glu Leu Gln Gln Ile Lys Arg
Trp Phe 450 455 460Ala Asp Cys Lys Leu Glu Gln Val Gly Ser Ser Gln
Asn Tyr Leu Tyr465 470 475 480Thr Ser Tyr Phe Pro Ile Ala Ala Ile
Leu Phe Glu Pro Glu Tyr Gly 485 490 495Asp Ala Arg Leu Ala Phe Ala
Lys Cys Gly Ile Ile Ala Thr Thr Val 500 505 510Asp Asp Phe Phe Asp
Gly Phe Ala Cys Asn Glu Glu Leu Gln Asn Ile 515 520 525Ile Glu Leu
Val Glu Arg Trp Asp Gly Tyr Pro Thr Val Gly Phe Arg 530 535 540Ser
Glu Arg Val Arg Ile Phe Phe Leu Ala Leu Tyr Lys Met Ile Glu545 550
555 560Glu Ile Ala Ala Lys Ala Glu Thr Lys Gln Gly Arg Cys Val Lys
Asp 565 570 575Leu Leu Ile Asn Leu Trp Ile Asp Leu Leu Lys Cys Met
Leu Val Glu 580 585 590Leu Asp Leu Trp Lys Ile Lys Ser Thr Thr Pro
Ser Ile Glu Glu Tyr 595 600 605Leu Ser Ile Ala Cys Val Thr Thr Gly
Val Lys Cys Leu Ile Leu Ile 610 615 620Ser Leu His Leu Leu Gly Pro
Lys Leu Ser Lys Asp Val Thr Glu Ser625 630 635 640Ser Glu Val Ser
Ala Leu Trp Asn Cys Thr Ala Val Val Ala Arg Leu 645 650 655Asn Asn
Asp Ile His Ser Tyr Lys Arg Glu Gln Ala Glu Ser Ser Thr 660 665
670Asn Met Ala Ala Ile Leu Ile Ser Gln Ser Gln Arg Thr Ile Ser Glu
675 680 685Glu Glu Ala Ile Arg Gln Ile Lys Glu Met Met Glu Ser Lys
Arg Arg 690 695 700Glu Leu Leu Gly Met Val Leu Gln Asn Lys Glu Ser
Gln Leu Pro Gln705 710 715 720Val Cys Lys Asp Leu Phe Trp Thr Thr
Phe Lys Ala Ala Tyr Ser Ile 725 730 735Tyr Thr His Gly Asp Glu Tyr
Arg Phe Pro Gln Glu Leu Lys Asn His 740 745 750Ile Asn Asp Val Ile
Tyr Lys Pro Leu Asn Gln Tyr Ser Pro 755 760 765
SEQ ID NO: 24; LENGTH: 2301; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Optimized cDNA for E. coli expression encoding for NgSCS-del29
atggctaatt tccatcgccc atcccgtgtt cgttgttccc
actctaccgc aagctccctg 60gaagaggcaa aagagcgcat ccgtgaaacc ttcggcaaaa
atgaactctc tccttctagc 120tatgatacgg cctgggttgc tatggtcccg
agccgctaca gcatgaacca gccgtgcttt 180ccgcgctgcc tggactggat
tctggagaac caacgtgagg atggcagctg gggtctgaac 240ccgagccatc
cgttactggt gaaagacagc ttgagcagca cgctggcgtg tttgctggcg
300ctgcgtaagt ggcgtattgg cgacaaccaa gtccagcgtg gcctgggttt
tatcgagact 360catggttggg cagtggacaa cgtagaccag atctctccac
tgggttttga catcattttc 420ccgagcatga ttaaatatgc ggaaaagctg
aatctggatt tgccttttga tccgaacctg 480gtgaacatga tgctgcgcga
gcgcgagctg acgatcgagc gtgcgctgaa aaacgaattt 540gagggtaata
tggctaatgt cgagtacttc gccgagggtt tgggtgagct gtgtcactgg
600aaagaaatca tgctgcacca acgccgtaac ggtagcctgt tcgactctcc
ggcaacgacc 660gccgcggctc ttatttatca tcagcacgat gagaagtgct
tcggctatct gtctagcatc 720ctgaaattac acgagaactg ggtgccgacc
atctatccga ccaaggttca ctccaatctg 780tttttcgtcg atgcgctgca
gaacctgggt gttgaccgtt acttcaaaac cgaactgaag 840tccgtcctgg
atgagatcta ccgtttgtgg ctggagaaaa acgaagagat cttcagcgat
900attgcgcact gcgcaatggc gtttcgcctg ttgcgcatga ataattacga
ggttagcagc 960gaagaactgg aaggcttcgt ggaccaagaa cattttttca
ccacgtcggg tggcaagctg 1020atcagccacg ttgccatcct
ggaactgcac cgtgcaagcc aagtggacat tcaggagggc 1080aaagacctga
tcctggacaa aattagcacc tggactcgca actttatgga acaggaactg
1140ctggataacc agatcttgga tcgtagcaaa aaagaaatgg aatttgcaat
gcgtaagttt 1200tacggtacgt tcgatcgcgt ggaaacccgt cgttatattg
aaagctacaa aatggattcc 1260ttcaagatcc tgaaggcagc gtaccgtagc
tccaacatta acaatattga cctgttgaag 1320ttcagcgagc acgacttcaa
tctctgccag gcgcgtcaca aggaagaact gcagcaaatc 1380aaacgctggt
tcgcagattg caaactggag caagtcggta gcagccagaa ctacttgtac
1440acctcttact tcccgatcgc ggccattttg ttcgagccgg agtatggcga
cgcacgcctg 1500gcgttcgcga agtgcggtat tatcgcgacc accgttgacg
atttttttga cggttttgca 1560tgtaatgaag aactgcaaaa catcatcgaa
ctggtcgaga gatgggacgg ttatccgacg 1620gttggtttcc gctccgagcg
tgtgcgcatt ttctttctgg cgctgtacaa aatgattgaa 1680gaaattgccg
cgaaagcgga aacgaaacag ggccgttgcg tgaaagatct gttgatcaat
1740ctgtggattg atctgctgaa atgcatgctg gtcgaactgg atctgtggaa
aattaagagc 1800acgaccccga gcattgaaga gtatctgagc attgcctgtg
tgacgaccgg cgttaagtgc 1860ttgatcctga ttagcctgca tctgctgggc
ccgaaactga gcaaagacgt gaccgaatcc 1920agcgaagtta gcgctctgtg
gaactgtacg gccgtggttg cgcgcctgaa caacgacatt 1980catagctaca
agcgtgagca agccgagagc agcactaata tggccgcaat cctgatttcg
2040caaagccagc gtaccatctc agaagaagaa gctatccgcc agatcaaaga
gatgatggaa 2100tcgaaacgcc gtgagctgct gggcatggtg ctgcagaata
aagagagcca attgccgcaa 2160gtctgcaaag acctgttttg gaccaccttc
aaagccgcgt acagcattta tacccacggt 2220gatgagtacc gttttccaca
agaactgaag aaccatatca acgatgtcat ctataagccg 2280ttaaatcaat
acagccctta a 2301
SEQ ID NO: 25; LENGTH: 755; TYPE: PRT; ORGANISM: Nicotiana glutinosa
Met Ser His Ser Thr
Ala Ser Ser Leu Glu Glu Ala Lys Glu Arg Ile1 5 10 15Arg Glu Thr Phe
Gly Lys Asn Glu Leu Ser Ser Ser Ser Tyr Asp Thr 20 25 30Ala Trp Val
Ala Met Val Pro Ser Arg Tyr Ser Met Asn Gln Pro Cys 35 40 45Phe Pro
Arg Cys Leu Asp Trp Ile Leu Glu Asn Gln Arg Glu Asp Gly 50 55 60Ser
Trp Gly Leu Asn Pro Ser Leu Pro Leu Leu Val Lys Asp Ser Leu65 70 75
80Ser Ser Thr Leu Ala Cys Leu Leu Ala Leu Arg Lys Trp Arg Ile Gly
85 90 95Asp Asn Gln Val Gln Arg Gly Leu Gly Phe Ile Glu Thr His Gly
Trp 100 105 110Ala Val Asp Asn Val Asp Gln Ile Ser Pro Leu Gly Phe
Asp Ile Ile 115 120 125Phe Pro Ser Met Ile Lys Tyr Ala Glu Lys Leu
Asn Leu Asp Leu Pro 130 135 140Phe Asp Pro Asn Leu Val Asn Met Met
Leu Arg Glu Arg Glu Leu Thr145 150 155 160Ile Glu Arg Ala Leu Lys
Asn Glu Phe Glu Gly Asn Met Ala Asn Val 165 170 175Glu Tyr Phe Ala
Glu Gly Leu Gly Glu Leu Cys His Trp Lys Glu Ile 180 185 190Met Leu
His Gln Arg Arg Asn Gly Ser Pro Phe Asp Ser Pro Ala Thr 195 200
205Thr Ala Ala Ala Leu Ile Tyr His Gln His Asp Glu Lys Cys Phe Gly
210 215 220Tyr Leu Ser Ser Ile Leu Lys Leu His Glu Asn Trp Val Pro
Thr Ile225 230 235 240Tyr Pro Thr Lys Val His Ser Asn Leu Phe Phe
Val Asp Ala Leu Gln 245 250 255Asn Leu Gly Val Asp Arg Tyr Phe Lys
Thr Glu Leu Lys Ser Val Leu 260 265 270Asp Glu Ile Tyr Arg Leu Trp
Leu Glu Lys Asn Glu Glu Ile Phe Ser 275 280 285Asp Ile Ala His Cys
Ala Met Ala Phe Arg Leu Leu Arg Met Asn Asn 290 295 300Tyr Glu Val
Ser Ser Glu Glu Leu Glu Gly Phe Val Asp Gln Glu His305 310 315
320Phe Phe Thr Thr Ser Gly Gly Lys Leu Ile Ser His Val Ala Ile Leu
325 330 335Glu Leu His Arg Ala Ser Gln Val Asp Ile Gln Glu Gly Lys
Asp Leu 340 345 350Ile Leu Asp Lys Ile Ser Thr Trp Thr Arg Asn Phe
Met Glu Gln Glu 355 360 365Leu Leu Asp Asn Gln Ile Leu Asp Arg Ser
Lys Lys Glu Met Glu Phe 370 375 380Ala Met Arg Lys Phe Tyr Gly Thr
Phe Asp Arg Val Glu Thr Arg Arg385 390 395 400Tyr Ile Glu Ser Tyr
Lys Met Asp Ser Phe Lys Ile Leu Lys Ala Ala 405 410 415Tyr Arg Ser
Ser Asn Ile Asn Asn Ile Asp Leu Leu Lys Phe Ser Glu 420 425 430His
Asp Phe Asn Leu Cys Gln Ala Arg His Lys Glu Glu Leu Gln Gln 435 440
445Ile Lys Arg Trp Phe Ala Asp Cys Lys Leu Glu Gln Val Gly Ser Ser
450 455 460Gln Asn Tyr Leu Tyr Thr Ser Tyr Phe Pro Ile Ala Ala Ile
Leu Phe465 470 475 480Glu Pro Glu Tyr Gly Asp Ala Arg Leu Ala Phe
Ala Lys Cys Gly Ile 485 490 495Ile Ala Thr Thr Val Asp Asp Phe Phe
Asp Gly Phe Ala Cys Asn Glu 500 505 510Glu Leu Gln Asn Ile Ile Glu
Leu Val Glu Arg Trp Asp Gly Tyr Pro 515 520 525Thr Val Gly Phe Arg
Ser Glu Arg Val Arg Ile Phe Phe Leu Ala Leu 530 535 540Tyr Lys Met
Ile Glu Glu Ile Ala Ala Lys Ala Glu Thr Lys Gln Gly545 550 555
560Arg Cys Val Lys Asp Leu Leu Ile Asn Leu Trp Ile Asp Leu Leu Lys
565 570 575Cys Met Leu Val Glu Leu Asp Leu Trp Lys Ile Lys Ser Thr
Thr Pro 580 585 590Ser Ile Glu Glu Tyr Leu Ser Ile Ala Cys Val Thr
Thr Gly Val Lys 595 600 605Cys Leu Ile Leu Ile Ser Leu His Leu Leu
Gly Pro Lys Leu Ser Lys 610 615 620Asp Val Thr Glu Ser Ser Glu Val
Ser Ala Leu Trp Asn Cys Thr Ala625 630 635 640Val Val Ala Arg Leu
Asn Asn Asp Ile His Ser Tyr Lys Arg Glu Gln 645 650 655Ala Glu Ser
Ser Thr Asn Met Val Ala Ile Leu Ile Ser Gln Ser Gln 660 665 670Arg
Thr Ile Ser Glu Glu Glu Ala Ile Arg Gln Ile Lys Glu Met Met 675 680
685Glu Ser Lys Arg Arg Glu Leu Leu Gly Met Val Leu Gln Asn Lys Glu
690 695 700Ser Gln Leu Pro Gln Val Cys Lys Asp Leu Phe Trp Thr Thr
Phe Lys705 710 715 720Ala Ala Tyr Ser Ile Tyr Thr His Gly Asp Glu
Tyr Arg Phe Pro Gln 725 730 735Glu Leu Lys Asn His Ile Asn Asp Val
Ile Tyr Lys Pro Leu Asn Gln 740 745 750Tyr Ser Pro
755
SEQ ID NO: 26; LENGTH: 2211; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Optimized cDNA for Saccharomyces cerevisiae expression encoding for SmCPS2
atggctactg ttgacgctcc
acaagttcac gaccacgacg gtactactgt tcaccaaggt 60cacgacgctg ttaagaacat
cgaagaccca atcgaataca tcagaacttt gttgagaact 120actggtgacg
gtagaatctc tgtttctcca tacgacactg cttgggttgc tatgatcaag
180gacgttgaag gtagagacgg tccacaattc ccatcttctt tggaatggat
cgttcaaaac 240caattggaag acggttcttg gggtgaccaa aagttgttct
gtgtttacga cagattggtt 300aacactatcg cttgtgttgt tgctttgaga
tcttggaacg ttcacgctca caaggttaag 360agaggtgtta cttacatcaa
ggaaaacgtt gacaagttga tggaaggtaa cgaagaacac 420atgacttgtg
gtttcgaagt tgttttccca gctttgttgc aaaaggctaa gtctttgggt
480atcgaagact tgccatacga ctctccagct gttcaagaag tttaccacgt
tagagaacaa 540aagttgaaga gaatcccatt ggaaatcatg cacaagatcc
caacttcttt gttgttctct 600ttggaaggtt tggaaaactt ggactgggac
aagttgttga agttgcaatc tgctgacggt 660tctttcttga cttctccatc
ttctactgct ttcgctttca tgcaaactaa ggacgaaaag 720tgttaccaat
tcatcaagaa cactatcgac actttcaacg gtggtgctcc acacacttac
780ccagttgacg ttttcggtag attgtgggct atcgacagat tgcaaagatt
gggtatctct 840agattcttcg aaccagaaat cgctgactgt ttgtctcaca
tccacaagtt ctggactgac 900aagggtgttt tctctggtag agaatctgaa
ttctgtgaca tcgacgacac ttctatgggt 960atgagattga tgagaatgca
cggttacgac gttgacccaa acgttttgag aaacttcaag 1020caaaaggacg
gtaagttctc ttgttacggt ggtcaaatga tcgaatctcc atctccaatc
1080tacaacttgt acagagcttc tcaattgaga ttcccaggtg aagaaatctt
ggaagacgct 1140aagagattcg cttacgactt cttgaaggaa aagttggcta
acaaccaaat cttggacaag 1200tgggttatct ctaagcactt gccagacgaa
atcaagttgg gtttggaaat gccatggttg 1260gctactttgc caagagttga
agctaagtac tacatccaat actacgctgg ttctggtgac 1320gtttggatcg
gtaagacttt gtacagaatg ccagaaatct ctaacgacac ttaccacgac
1380ttggctaaga ctgacttcaa gagatgtcaa gctaagcacc aattcgaatg
gttgtacatg 1440caagaatggt acgaatcttg tggtatcgaa gaattcggta
tctctagaaa ggacttgttg 1500ttgtcttact tcttggctac tgcttctatc
ttcgaattgg aaagaactaa cgaaagaatc 1560gcttgggcta agtctcaaat
catcgctaag atgatcactt ctttcttcaa caaggaaact 1620acttctgaag
aagacaagag agctttgttg aacgaattgg gtaacatcaa cggtttgaac
1680gacactaacg gtgctggtag agaaggtggt gctggttcta tcgctttggc
tactttgact 1740caattcttgg aaggtttcga cagatacact agacaccaat
tgaagaacgc ttggtctgtt 1800tggttgactc aattgcaaca cggtgaagct
gacgacgctg aattgttgac taacactttg 1860aacatctgtg ctggtcacat
cgctttcaga gaagaaatct tggctcacaa cgaatacaag 1920gctttgtcta
acttgacttc taagatctgt agacaattgt ctttcatcca atctgaaaag
1980gaaatgggtg ttgaaggtga aatcgctgct aagtcttcta tcaagaacaa
ggaattggaa 2040gaagacatgc aaatgttggt taagttggtt ttggaaaagt
acggtggtat cgacagaaac 2100atcaagaagg ctttcttggc tgttgctaag
acttactact acagagctta ccacgctgct 2160gacactatcg acactcacat
gttcaaggtt ttgttcgaac cagttgctta a 2211
SEQ ID NO: 27; LENGTH: 1578; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Optimized cDNA for S. cerevisiae expression encoding for truncated SsScS from Salvia sclarea
atggctaaga tgaaggaaaa
cttcaagaga gaagacgaca agttcccaac tactactact 60ttgagatctg aagacatccc
atctaacttg tgtatcatcg acactttgca aagattgggt 120gttgaccaat
tcttccaata cgaaatcaac actatcttgg acaacacttt cagattgtgg
180caagaaaagc acaaggttat ctacggtaac gttactactc acgctatggc
tttcagattg 240ttgagagtta agggttacga agtttcttct gaagaattgg
ctccatacgg taaccaagaa 300gctgtttctc aacaaactaa cgacttgcca
atgatcatcg aattgtacag agctgctaac 360gaaagaatct acgaagaaga
aagatctttg gaaaagatct tggcttggac tactatcttc 420ttgaacaagc
aagttcaaga caactctatc ccagacaaga agttgcacaa gttggttgaa
480ttctacttga gaaactacaa gggtatcact atcagattgg gtgctagaag
aaacttggaa 540ttgtacgaca tgacttacta ccaagctttg aagtctacta
acagattctc taacttgtgt 600aacgaagact tcttggtttt cgctaagcaa
gacttcgaca tccacgaagc tcaaaaccaa 660aagggtttgc aacaattgca
aagatggtac gctgactgta gattggacac tttgaacttc 720ggtagagacg
ttgttatcat cgctaactac ttggcttctt tgatcatcgg tgaccacgct
780ttcgactacg ttagattggc tttcgctaag acttctgttt tggttactat
catggacgac 840ttcttcgact gtcacggttc ttctcaagaa tgtgacaaga
tcatcgaatt ggttaaggaa 900tggaaggaaa acccagacgc tgaatacggt
tctgaagaat tggaaatctt gttcatggct 960ttgtacaaca ctgttaacga
attggctgaa agagctagag ttgaacaagg tagatctgtt 1020aaggaattct
tggttaagtt gtgggttgaa atcttgtctg ctttcaagat cgaattggac
1080acttggtcta acggtactca acaatctttc gacgaataca tctcttcttc
ttggttgtct 1140aacggttcta gattgactgg tttgttgact atgcaattcg
ttggtgttaa gttgtctgac 1200gaaatgttga tgtctgaaga atgtactgac
ttggctagac acgtttgtat ggttggtaga 1260ttgttgaacg acgtttgttc
ttctgaaaga gaaagagaag aaaacatcgc tggtaagtct 1320tactctatct
tgttggctac tgaaaaggac ggtagaaagg tttctgaaga cgaagctatc
1380gctgaaatca acgaaatggt tgaataccac tggagaaagg ttttgcaaat
cgtttacaag 1440aaggaatcta tcttgccaag aagatgtaag gacgttttct
tggaaatggc taagggtact 1500ttctacgctt acggtatcaa cgacgaattg
acttctccac aacaatctaa ggaagacatg 1560aagtctttcg ttttctaa
1578
SEQ ID NO: 28; LENGTH: 924; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Optimized cDNA for S. cerevisiae expression encoding for the GGPP synthase from Pantoea agglomerans
atggtttctg gttctaaggc tggtgtttct ccacacagag aaatcgaagt tatgagacaa
60tctatcgacg accacttggc tggtttgttg ccagaaactg actctcaaga catcgtttct
120ttggctatga gagaaggtgt tatggctcca ggtaagagaa tcagaccatt
gttgatgttg 180ttggctgcta gagacttgag ataccaaggt tctatgccaa
ctttgttgga cttggcttgt 240gctgttgaat tgactcacac tgcttctttg
atgttggacg acatgccatg tatggacaac 300gctgaattga gaagaggtca
accaactact cacaagaagt tcggtgaatc tgttgctatc 360ttggcttctg
ttggtttgtt gtctaaggct ttcggtttga tcgctgctac tggtgacttg
420ccaggtgaaa gaagagctca agctgttaac gaattgtcta ctgctgttgg
tgttcaaggt 480ttggttttgg gtcaattcag agacttgaac gacgctgctt
tggacagaac tccagacgct 540atcttgtcta ctaaccactt gaagactggt
atcttgttct ctgctatgtt gcaaatcgtt 600gctatcgctt ctgcttcttc
tccatctact agagaaactt tgcacgcttt cgctttggac 660ttcggtcaag
ctttccaatt gttggacgac ttgagagacg accacccaga aactggtaag
720gacagaaaca aggacgctgg taagtctact ttggttaaca gattgggtgc
tgacgctgct 780agacaaaagt tgagagaaca catcgactct gctgacaagc
acttgacttt cgcttgtcca 840caaggtggtg ctatcagaca attcatgcac
ttgtggttcg gtcaccactt ggctgactgg 900tctccagtta tgaagatcgc ttaa
924
SEQ ID NO: 29; LENGTH: 2175; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Optimized cDNA for S. cerevisiae expression encoding for CfCPS1-del63
atggttgcta ctgttaacgc
tccaccagtt cacgaccaag acgactctac tgaaaaccaa 60tgtcacgacg ctgttaacaa
catcgaagac ccaatcgaat acatcagaac tttgttgaga 120actactggtg
acggtagaat ctctgtttct ccatacgaca ctgcttgggt tgctttgatc
180aaggacttgc aaggtagaga cgctccagaa ttcccatctt ctttggaatg
gatcatccaa 240aaccaattgg ctgacggttc ttggggtgac gctaagttct
tctgtgttta cgacagattg 300gttaacacta tcgcttgtgt tgttgctttg
agatcttggg acgttcacgc tgaaaaggtt 360gaaagaggtg ttagatacat
caacgaaaac gttgaaaagt tgagagacgg taacgaagaa 420cacatgactt
gtggtttcga agttgttttc ccagctttgt tgcaaagagc taagtctttg
480ggtatccaag acttgccata cgacgctcca gttatccaag aaatctacca
ctctagagaa 540caaaagtcta agagaatccc attggaaatg atgcacaagg
ttccaacttc tttgttgttc 600tctttggaag gtttggaaaa cttggaatgg
gacaagttgt tgaagttgca atctgctgac 660ggttctttct tgacttctcc
atcttctact gctttcgctt tcatgcaaac tagagaccca 720aagtgttacc
aattcatcaa gaacactatc caaactttca acggtggtgc tccacacact
780tacccagttg acgttttcgg tagattgtgg gctatcgaca gattgcaaag
attgggtatc 840tctagattct tcgaatctga aatcgctgac tgtatcgctc
acatccacag attctggact 900gaaaagggtg ttttctctgg tagagaatct
gaattctgtg acatcgacga cacttctatg 960ggtgttagat tgatgagaat
gcacggttac gacgttgacc caaacgtttt gaagaacttc 1020aagaaggacg
acaagttctc ttgttacggt ggtcaaatga tcgaatctcc atctccaatc
1080tacaacttgt acagagcttc tcaattgaga ttcccaggtg aacaaatctt
ggaagacgct 1140aacaagttcg cttacgactt cttgcaagaa aagttggctc
acaaccaaat cttggacaag 1200tgggttatct ctaagcactt gccagacgaa
atcaagttgg gtttggaaat gccatggtac 1260gctactttgc caagagttga
agctagatac tacatccaat actacgctgg ttctggtgac 1320gtttggatcg
gtaagacttt gtacagaatg ccagaaatct ctaacgacac ttaccacgaa
1380ttggctaaga ctgacttcaa gagatgtcaa gctcaacacc aattcgaatg
gatctacatg 1440caagaatggt acgaatcttg taacatggaa gaattcggta
tctctagaaa ggaattgttg 1500gttgcttact tcttggctac tgcttctatc
ttcgaattgg aaagagctaa cgaaagaatc 1560gcttgggcta agtctcaaat
catctctact atcatcgctt ctttcttcaa caaccaaaac 1620acttctccag
aagacaagtt ggctttcttg actgacttca agaacggtaa ctctactaac
1680atggctttgg ttactttgac tcaattcttg gaaggtttcg acagatacac
ttctcaccaa 1740ttgaagaacg cttggtctgt ttggttgaga aagttgcaac
aaggtgaagg taacggtggt 1800gctgacgctg aattgttggt taacactttg
aacatctgtg ctggtcacat cgctttcaga 1860gaagaaatct tggctcacaa
cgactacaag actttgtcta acttgacttc taagatctgt 1920agacaattgt
ctcaaatcca aaacgaaaag gaattggaaa ctgaaggtca aaagacttct
1980atcaagaaca aggaattgga agaagacatg caaagattgg ttaagttggt
tttggaaaag 2040tctagagttg gtatcaacag agacatgaag aagactttct
tggctgttgt taagacttac 2100tactacaagg cttaccactc tgctcaagct
atcgacaacc acatgttcaa ggttttgttc 2160gaaccagttg cttaa
2175
SEQ ID NO: 30; LENGTH: 2100; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Optimized cDNA for S. cerevisiae expression encoding for TaTps1-del59
atgtacagac aaagaactga
cgaaccatct gaaactagac aaatgatcga cgacatcaga 60actgctttgg cttctttggg
tgacgacgaa acttctatgt ctgtttctgc ttacgacact 120gctttggttg
ctttggttaa gaacttggac ggtggtgacg gtccacaatt cccatcttgt
180atcgactgga tcgttcaaaa ccaattgcca gacggttctt ggggtgaccc
agctttcttc 240atggttcaag acagaatgat ctctactttg gcttgtgttg
ttgctgttaa gtcttggaac 300atcgacagag acaacttgtg tgacagaggt
gttttgttca tcaaggaaaa catgtctaga 360ttggttgaag aagaacaaga
ctggatgcca tgtggtttcg aaatcaactt cccagctttg 420ttggaaaagg
ctaaggactt ggacttggac atcccatacg accacccagt tttggaagaa
480atctacgcta agagaaactt gaagttgttg aagatcccat tggacgtttt
gcacgctatc 540ccaactactt tgttgttctc tgttgaaggt atggttgact
tgccattgga ctgggaaaag 600ttgttgagat tgagatgtcc agacggttct
ttccactctt ctccagctgc tactgctgct 660gctttgtctc acactggtga
caaggaatgt cacgctttct tggacagatt gatccaaaag 720ttcgaaggtg
gtgttccatg ttctcactct atggacactt tcgaacaatt gtgggttgtt
780gacagattga tgagattggg tatctctaga cacttcactt ctgaaatcca
acaatgtttg 840gaattcatct acagaagatg gactcaaaag ggtttggctc
acaacatgca ctgtccaatc 900ccagacatcg acgacactgc tatgggtttc
agattgttga gacaacacgg ttacgacgtt 960actccatctg ttttcaagca
cttcgaaaag gacggtaagt tcgtttgttt cccaatggaa 1020actaaccacg
cttctgttac tccaatgcac aacacttaca gagcttctca attcatgttc
1080ccaggtgacg acgacgtttt ggctagagct ggtagatact gtagagcttt
cttgcaagaa 1140agacaatctt ctaacaagtt gtacgacaag tggatcatca
ctaaggactt gccaggtgaa 1200gttggttaca ctttgaactt cccatggaag
tcttctttgc caagaatcga aactagaatg 1260tacttggacc aatacggtgg
taacaacgac gtttggatcg ctaaggtttt gtacagaatg 1320aacttggttt
ctaacgactt gtacttgaag atggctaagg ctgacttcac tgaataccaa
1380agattgtcta gaatcgaatg gaacggtttg agaaagtggt acttcagaaa
ccacttgcaa 1440agatacggtg ctactccaaa gtctgctttg aaggcttact
tcttggcttc tgctaacatc 1500ttcgaaccag gtagagctgc tgaaagattg
gcttgggcta gaatggctgt tttggctgaa 1560gctgttacta ctcacttcag
acacatcggt ggtccatgtt actctactga aaacttggaa 1620gaattgatcg
acttggtttc tttcgacgac gtttctggtg gtttgagaga agcttggaag
1680caatggttga tggcttggac tgctaaggaa tctcacggtt ctgttgacgg
tgacactgct 1740ttgttgttcg ttagaactat cgaaatctgt tctggtagaa
tcgtttcttc tgaacaaaag 1800ttgaacttgt gggactactc tcaattggaa
caattgactt cttctatctg tcacaagttg 1860gctactatcg gtttgtctca
aaacgaagct tctatggaaa acactgaaga cttgcaccaa 1920caagttgact
tggaaatgca agaattgtct tggagagttc accaaggttg tcacggtatc
1980aacagagaaa ctagacaaac tttcttgaac gttgttaagt ctttctacta
ctctgctcac 2040tgttctccag aaactgttga ctctcacatc gctaaggtta
tcttccaaga cgttatctaa 2100312172DNAArtificial SequenceOptimized
cDNA for S. cerevisiae expression encoding for MvCps3-del63
31atggctccac cagaacaaaa gtacaactct actgctttgg aacacgacac tgaaatcatc
60gaaatcgaag accacatcga atgtatcaga agattgttga gaactgctgg tgacggtaga
120atctctgttt ctccatacga cactgcttgg atcgctttga tcaaggactt
ggacggtcac 180gactctccac aattcccatc ttctatggaa tgggttgctg
acaaccaatt gccagacggt 240tcttggggtg acgaacactt cgtttgtgtt
tacgacagat tggttaacac tatcgcttgt 300gttgttgctt tgagatcttg
gaacgttcac gctcacaagt gtgaaaaggg tatcaagtac 360atcaaggaaa
acgttcacaa gttggaagac gctaacgaag aacacatgac ttgtggtttc
420gaagttgttt tcccagcttt gttgcaaaga gctcaatcta tgggtatcaa
gggtatccca 480tacaacgctc cagttatcga agaaatctac aactctagag
aaaagaagtt gaagagaatc 540ccaatggaag ttgttcacaa ggttgctact
tctttgttgt tctctttgga aggtttggaa 600aacttggaat gggaaaagtt
gttgaagttg caatctccag acggttcttt cttgacttct 660ccatcttcta
ctgctttcgc tttcatccac actaaggaca gaaagtgttt caacttcatc
720aacaacatcg ttcacacttt caagggtggt gctccacaca cttacccagt
tgacatcttc 780ggtagattgt gggctgttga cagattgcaa agattgggta
tctctagatt cttcgaatct 840gaaatcgctg aattcttgtc tcacgttcac
agattctggt ctgacgaagc tggtgttttc 900tctggtagag aatctgtttt
ctgtgacatc gacgacactt ctatgggttt gagattgttg 960agaatgcacg
gttaccacgt tgacccaaac gttttgaaga acttcaagca atctgacaag
1020ttctcttgtt acggtggtca aatgatggaa tgttcttctc caatctacaa
cttgtacaga 1080gcttctcaat tgcaattccc aggtgaagaa atcttggaag
aagctaacaa gttcgcttac 1140aagttcttgc aagaaaagtt ggaatctaac
caaatcttgg acaagtggtt gatctctaac 1200cacttgtctg acgaaatcaa
ggttggtttg gaaatgccat ggtacgctac tttgccaaga 1260gttgaaactt
cttactacat ccaccactac ggtggtggtg acgacgtttg gatcggtaag
1320actttgtaca gaatgccaga aatctctaac gacacttaca gagaattggc
tagattggac 1380ttcagaagat gtcaagctca acaccaattg gaatggatct
acatgcaaag atggtacgaa 1440tcttgtagaa tgcaagaatt cggtatctct
agaaaggaag ttttgagagc ttacttcttg 1500gcttctggta ctatcttcga
agttgaaaga gctaaggaaa gagttgcttg ggctagatct 1560caaatcatct
ctcacatgat caagtctttc ttcaacaagg aaactacttc ttctgaccaa
1620aagcaagctt tgttgactga attgttgttc ggtaacatct ctgcttctga
aactgaaaag 1680agagaattgg acggtgttgt tgttgctact ttgagacaat
tcttggaagg tttcgacatc 1740ggtactagac accaagttaa ggctgcttgg
gacgtttggt tgagaaaggt tgaacaaggt 1800gaagctcacg gtggtgctga
cgctgaattg tgtactacta ctttgaacac ttgtgctaac 1860caacacttgt
cttctcaccc agactacaac actttgtcta agttgactaa caagatctgt
1920cacaagttgt ctcaaatcca acaccaaaag gaaatgaagg gtggtatcaa
ggctaagtgt 1980tctatcaaca acaaggaagt tgacatcgaa atgcaatggt
tggttaagtt ggttttggaa 2040aagtctggtt tgaacagaaa ggctaagcaa
gctttcttgt ctatcgctaa gacttactac 2100tacagagctt actacgctga
ccaaactatg gacgctcaca tcttcaaggt tttgttcgaa 2160ccagttgttt aa
2172
SEQ ID NO: 32; LENGTH: 2202; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Optimized cDNA for S. cerevisiae expression encoding for RoCPS1-del67
atggcttctc aagtttctga
aaagggtact tcttctccag ttcaaactcc agaagaagtt 60aacgaaaaga tcgaaaacta
catcgaatac atcaagaact tgttgactac ttctggtgac 120ggtagaatct
ctgtttctcc atacgacact tctatcgttg ctttgatcaa ggacttgaag
180ggtagagaca ctccacaatt cccatcttgt ttggaatgga tcgctcaaca
ccaaatggct 240gacggttctt ggggtgacga attcttctgt atctacgaca
gaatcttgaa cactttggct 300tgtgttgttg ctttgaagtc ttggaacgtt
cacgctgaca tgatcgaaaa gggtgttact 360tacgttaacg aaaacgttca
aaagttggaa gacggtaact tggaacacat gacttctggt 420ttcgaaatcg
ttgttccagc tttggttcaa agagctcaag acttgggtat ccaaggtttg
480ccatacgacc acccattgat caaggaaatc gctaacacta aggaaggtag
attgaagaag 540atcccaaagg acatgatcta ccaaaagcca actactttgt
tgttctcttt ggaaggtttg 600ggtgacttgg aatgggaaaa gatcttgaag
ttgcaatctg gtgacggttc tttcttgact 660tctccatctt ctactgctca
cgttttcatg aagactaagg acgaaaagtg tttgaagttc 720atcgaaaacg
ctgttaagaa ctgtaacggt ggtgctccac acacttaccc agttgacgtt
780ttcgctagat tgtgggctgt tgacagattg caaagattgg gtatctctag
attcttccaa 840caagaaatca agtacttctt ggaccacatc aactctgttt
ggactgaaaa cggtgttttc 900tctggtagag actctgaatt ctgtgacatc
gacgacactt ctatgggtat cagattgttg 960aagatgcacg gttacgacat
cgacccaaac gctttggaac acttcaagca acaagacggt 1020aagttctctt
gttacggtgg tcaaatgatc gaatctgctt ctccaatcta caacttgtac
1080agagctgctc aattgagatt cccaggtgaa gaaatcttgg aagaagctac
taagttcgct 1140tacaacttct tgcaagaaaa gatcgctaac gaccaattcc
aagaaaagtg ggttatctct 1200gaccacttga tcgacgaagt taagttgggt
ttgaagatgc catggtacgc tactttgcca 1260agagttgaag ctgcttacta
cttgcaatac tacgctggtt gtggtgacgt ttggatcggt 1320aaggttttct
acagaatgcc agaaatctct aacgacactt acaagaagtt ggctatcttg
1380gacttcaaca gatgtcaagc tcaacaccaa ttcgaatgga tctacatgca
agaatggtac 1440cacagatctt ctgtttctga attcggtatc tctaagaagg
acttgttgag agcttacttc 1500ttggctgctg ctactatctt cgaaccagaa
agaactcaag aaagattggt ttgggctaag 1560actcaaatcg tttctggtat
gatcacttct ttcgttaact ctggtactac tttgtctttg 1620caccaaaaga
ctgctttgtt gtctcaaatc ggtcacaact tcgacggttt ggacgaaatc
1680atctctgcta tgaaggacca cggtttggct gctactttgt tgactacttt
ccaacaattg 1740ttggacggtt tcgacagata cactagacac caattgaaga
acgcttggtc tcaatggttc 1800atgaagttgc aacaaggtga agcttctggt
ggtgaagacg ctgaattgtt ggctaacact 1860ttgaacatct gtgctggttt
gatcgctttc aacgaagacg ttttgtctca ccacgaatac 1920actactttgt
ctactttgac taacaagatc tgtaagagat tgactcaaat ccaagacaag
1980aagactttgg aagttgttga cggttctatc aaggacaagg aattggaaaa
ggacatccaa 2040atgttggtta agttggtttt ggaagaaaac ggtggtggtg
ttgacagaaa catcaagcac 2100actttcttgt ctgttttcaa gactttctac
tacaacgctt accacgacga cgaaactact 2160gacgttcaca tcttcaaggt
tttgttcggt ccagttgttt aa 2202
SEQ ID NO: 33; LENGTH: 2301; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Optimized cDNA for S. cerevisiae expression encoding for NgSCS-del29
atggctaact tccacagacc atctagagtt agatgttctc actctactgc ttcttctttg
60gaagaagcta aggaaagaat cagagaaact ttcggtaaga acgaattgtc tccatcttct
120tacgacactg cttgggttgc tatggttcca tctagatact ctatgaacca
accatgtttc 180ccaagatgtt tggactggat cttggaaaac caaagagaag
acggttcttg gggtttgaac 240ccatctcacc cattgttggt taaggactct
ttgtcttcta ctttggcttg tttgttggct 300ttgagaaagt ggagaatcgg
tgacaaccaa gttcaaagag gtttgggttt catcgaaact 360cacggttggg
ctgttgacaa cgttgaccaa atctctccat tgggtttcga catcatcttc
420ccatctatga tcaagtacgc tgaaaagttg aacttggact tgccattcga
cccaaacttg 480gttaacatga tgttgagaga aagagaattg actatcgaaa
gagctttgaa gaacgaattc 540gaaggtaaca tggctaacgt tgaatacttc
gctgaaggtt tgggtgaatt gtgtcactgg 600aaggaaatca tgttgcacca
aagaagaaac ggttctttgt tcgactctcc agctactact 660gctgctgctt
tgatctacca ccaacacgac gaaaagtgtt tcggttactt gtcttctatc
720ttgaagttgc acgaaaactg ggttccaact atctacccaa ctaaggttca
ctctaacttg 780ttcttcgttg acgctttgca aaacttgggt gttgacagat
acttcaagac tgaattgaag 840tctgttttgg acgaaatcta cagattgtgg
ttggaaaaga acgaagaaat cttctctgac 900atcgctcact gtgctatggc
tttcagattg ttgagaatga acaactacga agtttcttct 960gaagaattgg
aaggtttcgt tgaccaagaa cacttcttca ctacttctgg tggtaagttg
1020atctctcacg ttgctatctt ggaattgcac agagcttctc aagttgacat
ccaagaaggt 1080aaggacttga tcttggacaa gatctctact tggactagaa
acttcatgga acaagaattg 1140ttggacaacc aaatcttgga cagatctaag
aaggaaatgg aattcgctat gagaaagttc 1200tacggtactt tcgacagagt
tgaaactaga agatacatcg aatcttacaa gatggactct 1260ttcaagatct
tgaaggctgc ttacagatct tctaacatca acaacatcga cttgttgaag
1320ttctctgaac acgacttcaa cttgtgtcaa gctagacaca aggaagaatt
gcaacaaatc 1380aagagatggt tcgctgactg taagttggaa caagttggtt
cttctcaaaa ctacttgtac 1440acttcttact tcccaatcgc tgctatcttg
ttcgaaccag aatacggtga cgctagattg 1500gctttcgcta agtgtggtat
catcgctact actgttgacg acttcttcga cggtttcgct 1560tgtaacgaag
aattgcaaaa catcatcgaa ttggttgaaa gatgggacgg ttacccaact
1620gttggtttca gatctgaaag agttagaatc ttcttcttgg ctttgtacaa
gatgatcgaa 1680gaaatcgctg ctaaggctga aactaagcaa ggtagatgtg
ttaaggactt gttgatcaac 1740ttgtggatcg acttgttgaa gtgtatgttg
gttgaattgg acttgtggaa gatcaagtct 1800actactccat ctatcgaaga
atacttgtct atcgcttgtg ttactactgg tgttaagtgt 1860ttgatcttga
tctctttgca cttgttgggt ccaaagttgt ctaaggacgt tactgaatct
1920tctgaagttt ctgctttgtg gaactgtact gctgttgttg ctagattgaa
caacgacatc 1980cactcttaca agagagaaca agctgaatct tctactaaca
tggctgctat cttgatctct 2040caatctcaaa gaactatctc tgaagaagaa
gctatcagac aaatcaagga aatgatggaa 2100tctaagagaa gagaattgtt
gggtatggtt ttgcaaaaca aggaatctca attgccacaa 2160gtttgtaagg
acttgttctg gactactttc aaggctgctt actctatcta cactcacggt
2220gacgaataca gattcccaca agaattgaag aaccacatca acgacgttat
ctacaagcca 2280ttgaaccaat actctccata a 2301
SEQ ID NO: 34; LENGTH: 2268; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Optimized cDNA for S. cerevisiae expression encoding for NgSCS-del38
atgtctcact ctactgcttc ttctttggaa gaagctaagg
aaagaatcag agaaactttc 60ggtaagaacg aattgtcttc ttcttcttac gacactgctt
gggttgctat ggttccatct 120agatactcta tgaaccaacc atgtttccca
agatgtttgg actggatctt ggaaaaccaa 180agagaagacg gttcttgggg
tttgaaccca tctttgccat tgttggttaa ggactctttg 240tcttctactt
tggcttgttt gttggctttg agaaagtgga gaatcggtga caaccaagtt
300caaagaggtt tgggtttcat cgaaactcac ggttgggctg ttgacaacgt
tgaccaaatc 360tctccattgg gtttcgacat catcttccca tctatgatca
agtacgctga aaagttgaac 420ttggacttgc cattcgaccc aaacttggtt
aacatgatgt tgagagaaag agaattgact 480atcgaaagag ctttgaagaa
cgaattcgaa ggtaacatgg ctaacgttga atacttcgct 540gaaggtttgg
gtgaattgtg tcactggaag gaaatcatgt tgcaccaaag aagaaacggt
600tctccattcg actctccagc tactactgct gctgctttga tctaccacca
acacgacgaa 660aagtgtttcg gttacttgtc ttctatcttg aagttgcacg
aaaactgggt tccaactatc 720tacccaacta aggttcactc taacttgttc
ttcgttgacg ctttgcaaaa cttgggtgtt 780gacagatact tcaagactga
attgaagtct gttttggacg aaatctacag attgtggttg 840gaaaagaacg
aagaaatctt ctctgacatc gctcactgtg ctatggcttt cagattgttg
900agaatgaaca actacgaagt ttcttctgaa gaattggaag gtttcgttga
ccaagaacac 960ttcttcacta cttctggtgg taagttgatc tctcacgttg
ctatcttgga attgcacaga 1020gcttctcaag ttgacatcca agaaggtaag
gacttgatct tggacaagat ctctacttgg 1080actagaaact tcatggaaca
agaattgttg gacaaccaaa tcttggacag atctaagaag 1140gaaatggaat
tcgctatgag aaagttctac ggtactttcg acagagttga aactagaaga
1200tacatcgaat cttacaagat ggactctttc aagatcttga aggctgctta
cagatcttct 1260aacatcaaca acatcgactt gttgaagttc tctgaacacg
acttcaactt gtgtcaagct 1320agacacaagg aagaattgca acaaatcaag
agatggttcg ctgactgtaa gttggaacaa 1380gttggttctt ctcaaaacta
cttgtacact tcttacttcc caatcgctgc tatcttgttc 1440gaaccagaat
acggtgacgc tagattggct ttcgctaagt gtggtatcat cgctactact
1500gttgacgact tcttcgacgg tttcgcttgt aacgaagaat tgcaaaacat
catcgaattg 1560gttgaaagat gggacggtta cccaactgtt ggtttcagat
ctgaaagagt tagaatcttc 1620ttcttggctt tgtacaagat gatcgaagaa
atcgctgcta aggctgaaac taagcaaggt 1680agatgtgtta aggacttgtt
gatcaacttg tggatcgact tgttgaagtg tatgttggtt 1740gaattggact
tgtggaagat caagtctact actccatcta tcgaagaata cttgtctatc
1800gcttgtgtta ctactggtgt taagtgtttg atcttgatct ctttgcactt
gttgggtcca 1860aagttgtcta aggacgttac tgaatcttct gaagtttctg
ctttgtggaa ctgtactgct 1920gttgttgcta gattgaacaa cgacatccac
tcttacaaga gagaacaagc tgaatcttct 1980actaacatgg ttgctatctt
gatctctcaa tctcaaagaa ctatctctga agaagaagct 2040atcagacaaa
tcaaggaaat gatggaatct aagagaagag aattgttggg tatggttttg
2100caaaacaagg aatctcaatt gccacaagtt tgtaaggact tgttctggac
tactttcaag 2160gctgcttact ctatctacac tcacggtgac gaatacagat
tcccacaaga attgaagaac 2220cacatcaacg acgttatcta caagccattg
aaccaatact ctccataa 2268
SEQ ID NO: 35; LENGTH: 85; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
aggtgcagtt cgcgtgcaat tataacgtcg tggcaactgt tatcagtcgt accgcgccat 60
tgagagtgca ccataccaca gcttt 85
SEQ ID NO: 36; LENGTH: 85; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
tcgtggtcaa ggcgtgcaat tctcaacacg agagtgattc ttcggcgttg ttgctgacca 60
gcggtatttc acaccgcata gggta 85
SEQ ID NO: 37; LENGTH: 85; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
tggtcagcaa caacgccgaa gaatcactct cgtgttgaga attgcacgcc ttgaccacga 60
cacgttaagg gattttggtc atgag 85
SEQ ID NO: 38; LENGTH: 80; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
aacgcgtacc ctaagtacgg caccacagtg actatgcagt ccgcactttg ccaatgccaa 60
aaatgtgcgc ggaaccccta 80
SEQ ID NO: 39; LENGTH: 84; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
ttggcattgg caaagtgcgg actgcatagt cactgtggtg ccgtacttag ggtacgcgtt 60
cctgaacgaa gcatctgtgc ttca 84
SEQ ID NO: 40; LENGTH: 85; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
ccgagatgcc aaaggatagg tgctatgttg atgactacga cacagaactg cgggtgacat 60
aatgatagca ttgaaggatg agact 85
SEQ ID NO: 41; LENGTH: 82; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
atgtcacccg cagttctgtg tcgtagtcat caacatagca cctatccttt ggcatctcgg 60
tgagcaaaag gccagcaaaa gg 82
SEQ ID NO: 42; LENGTH: 81; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
ctcagatgta cggtgatcgc caccatgtga cggaagctat cctgacagtg tagcaagtgc 60
tgagcgtcag accccgtaga a 81
SEQ ID NO: 43; LENGTH: 60; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
attcctagtg acggccttgg gaactcgata cacgatgttc agtagaccgc tcacacatgg 60
SEQ ID NO: 44; LENGTH: 79; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
aggtgcagtt cgcgtgcaat tataacgtcg tggcaactgt tatcagtcgt accgcgccat 60
tcgactacgt cgtaaggcc 79
SEQ ID NO: 45; LENGTH: 80; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Primer Sequence
tcgtggtcaa ggcgtgcaat tctcaacacg agagtgattc ttcggcgttg ttgctgacca 60
tcgacggtcg aggagaactt 80
* * * * *