U.S. patent number 10,385,363 [Application Number 15/330,814] was granted by the patent office on 2019-08-20 for Drimenol synthases II.
This patent grant is currently assigned to Firmenich SA. The grantee listed for this patent is Firmenich SA. Invention is credited to Fabienne Deguerry, Olivier Haefliger, Xiu-Feng He, Michel Schalk, Yu-Hua Zhang.
![](/patent/grant/10385363/US10385363-20190820-D00001.png)
![](/patent/grant/10385363/US10385363-20190820-D00002.png)
![](/patent/grant/10385363/US10385363-20190820-D00003.png)
![](/patent/grant/10385363/US10385363-20190820-D00004.png)
![](/patent/grant/10385363/US10385363-20190820-D00005.png)
![](/patent/grant/10385363/US10385363-20190820-D00006.png)
![](/patent/grant/10385363/US10385363-20190820-D00007.png)
United States Patent 10,385,363
Zhang, et al.
August 20, 2019
Drimenol synthases II
Abstract
The present invention relates to a method of producing drimenol
and/or drimenol derivatives by contacting at least one polypeptide
with farnesyl diphosphate. The method may be performed in vitro or
in vivo. The present invention also provides amino acid sequences
of polypeptides useful in the method of the invention and nucleic
acid encoding the polypeptides of the invention. The method further
provides host cells or organisms genetically modified to express
the polypeptides of the invention and useful to produce drimenol
and/or drimenol derivatives.
Inventors: Zhang; Yu-Hua (Shanghai, CN), Schalk; Michel (Geneva, CH), Haefliger; Olivier (Shanghai, CN), He; Xiu-Feng (Shanghai, CN), Deguerry; Fabienne (Geneva, CH)
Applicant: Firmenich SA (City: Geneva; State: N/A; Country: CH)
Assignee: Firmenich SA (Geneva, CH)
Family ID: 53483771
Appl. No.: 15/330,814
Filed: May 6, 2015
PCT Filed: May 06, 2015
PCT No.: PCT/EP2015/059988
371(c)(1),(2),(4) Date: November 07, 2016
PCT Pub. No.: WO2015/169871
PCT Pub. Date: November 12, 2015
Prior Publication Data
Document Identifier: US 20180251797 A1; Publication Date: Sep 6, 2018
Foreign Application Priority Data
May 6, 2014 [WO] PCT/CN2014/076890
Current U.S. Class: 1/1
Current CPC Class: C12P 7/04 (20130101); C12P 7/02 (20130101); C12N 9/16 (20130101); C12N 9/0004 (20130101); C12Y 301/07007 (20130101)
Current International Class: C12N 9/02 (20060101); C12P 7/04 (20060101); C12P 7/02 (20060101); C12N 9/16 (20060101)
Foreign Patent Documents
WO2012058636, May 2012, WO
WO2013058655, Apr 2013, WO
Other References
Whisstock et al., "Prediction of protein function from protein sequence and structure", Quarterly Reviews of Biophysics, 2003, 36(3):307-340. cited by examiner.
Witkowski et al., "Conversion of a beta-ketoacyl synthase to a malonyl decarboxylase by replacement of the active-site cysteine with glutamine", Biochemistry, Sep. 7, 1999; 38(36):11643-50. cited by examiner.
International Search Report and Written Opinion, application PCT/EP2015/059988, dated Nov. 26, 2015. cited by applicant.
Altschul, Stephen F., et al.; J. Mol. Biol. (1990) 215, 403-410. cited by applicant.
Schalk, Michel, et al.; Journal of the American Chemical Society (2012), 134, 18900-18903. cited by applicant.
Tatusova, Tatiana A., et al.; FEMS Microbiology Letters 174 (1999) 247-250. cited by applicant.
Munoz-Concha, et al.; Biochemical Systematics and Ecology, vol. 35, No. 7, 2007, p. 434-438. cited by applicant.
XP55211211, Calgary, Alberta; retrieved from the Internet, URL: http://theses.ucalgary.ca/bitstream/11023/129/2/ucalgary_2012_pyle_bryan.pdf. cited by applicant.
Primary Examiner: Chowdhury; Iqbal H
Attorney, Agent or Firm: Armstrong Teasdale LLP
Claims
What is claimed is:
1. A method of producing drimenol comprising: i) contacting
farnesyl diphosphate (FPP) with a polypeptide having drimenol
synthase activity and comprising an amino acid sequence having at
least 85% sequence identity to a sequence selected from the group
consisting of SEQ ID NO: 2 and SEQ ID NO: 5 to produce the
drimenol; and ii) optionally isolating the drimenol.
2. The method as recited in claim 1 wherein the drimenol is
isolated.
3. The method as recited in claim 1 wherein the drimenol is
produced with at least 30% selectivity.
4. The method as recited in claim 1 comprising the steps of
transforming a host cell or non-human organism with a nucleic acid
encoding a polypeptide comprising an amino acid sequence having at
least 85% sequence identity of a sequence selected from the group
consisting of SEQ ID NO: 2 and SEQ ID NO: 5 and culturing the host
cell or organism under conditions that allow for the production of
the polypeptide.
5. The method recited in claim 4 wherein the cell is a prokaryotic
cell or a eukaryotic cell.
6. The method as recited in claim 4 wherein the cell is a bacterial
cell.
7. The method as recited in claim 5 wherein the eukaryotic cell is
a yeast cell or a plant cell.
8. The method as recited in claim 2 wherein the drimenol is
produced with at least 30% selectivity.
Description
TECHNICAL FIELD
The field relates to methods of producing Drimenol, said methods
comprising contacting at least one polypeptide with farnesyl
pyrophosphate (FPP). In particular, said methods may be carried out
in vitro or in vivo to produce Drimenol, a very useful compound in
the field of perfumery. Also provided herein is an amino acid
sequence of a polypeptide useful in the methods provided herein. A
nucleic acid encoding the polypeptide of an embodiment herein and
an expression vector containing said nucleic acid are also provided
herein. A non-human host organism or a cell transformed to be used
in the method of producing Drimenol is further provided herein.
BACKGROUND
Terpenes are found in most organisms (microorganisms, animals and
plants). These compounds are made up of five carbon units called
isoprene units and are classified by the number of these units
present in their structure. Thus monoterpenes, sesquiterpenes and
diterpenes are terpenes containing 10, 15 and 20 carbon atoms
respectively. Sesquiterpenes, for example, are widely found in the
plant kingdom. Many sesquiterpene molecules are known for their
flavor and fragrance properties and their cosmetic, medicinal and
antimicrobial effects. Numerous sesquiterpene hydrocarbons and
sesquiterpenoids have been identified.
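The classification by isoprene units described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `terpene_class` and the lookup table are hypothetical, and only the three classes named in the text are covered.

```python
# Minimal sketch: classify a terpene by its number of carbon atoms,
# assuming each isoprene unit contributes five carbons (as stated above).
ISOPRENE_CARBONS = 5

# Only the classes named in the text; other classes are deliberately omitted.
TERPENE_CLASSES = {2: "monoterpene", 3: "sesquiterpene", 4: "diterpene"}


def terpene_class(carbon_atoms: int) -> str:
    """Return the terpene class for a carbon count, or 'unclassified'."""
    units, remainder = divmod(carbon_atoms, ISOPRENE_CARBONS)
    if remainder != 0 or units not in TERPENE_CLASSES:
        return "unclassified"
    return TERPENE_CLASSES[units]


print(terpene_class(15))  # a 15-carbon skeleton is three isoprene units
```

A 15-carbon skeleton corresponds to three isoprene units and is therefore reported as a sesquiterpene.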
Biosynthetic production of terpenes involves enzymes called terpene
synthases. A very large number of sesquiterpene synthases are
present in the plant kingdom, all using the same substrate
(farnesyl pyrophosphate, FPP) but having different product
profiles. Genes and cDNAs encoding sesquiterpene synthases
have been cloned and the corresponding recombinant enzymes
characterized.
Currently the main sources of Drimenol are plants that naturally
contain Drimenol, and the Drimenol content of these natural
sources is low. Chemical synthesis approaches have been developed
but are still complex and not cost-effective.
SUMMARY
Provided herein is a method of producing Drimenol comprising: i)
contacting an acyclic terpene pyrophosphate with a polypeptide
having Drimenol synthase activity and having at least, or at least
about, 70% sequence identity to a sequence selected from the group
consisting of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO:
11 and SEQ ID NO: 14 to produce the Drimenol; and ii) optionally
isolating the Drimenol.
Further provided herein is an isolated polypeptide having Drimenol
synthase activity comprising an amino acid sequence having at
least, or at least about, 70% or more identity to an amino acid
sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID
NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 and SEQ ID NO: 14.
Also provided herein is an isolated nucleic acid molecule encoding
a polypeptide having at least, or at least about, 70% sequence
identity to a sequence selected from the group consisting of SEQ ID
NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 and SEQ ID NO:
14.
DESCRIPTION OF THE DRAWINGS
FIG. 1. GCMS analysis of the leaves of Drimys lanceolata and Drimys
winteri.
FIG. 2. GCMS analysis of the sesquiterpenes produced by the
recombinant DlTps589 in in-vitro assays. A. Total ion chromatogram
of the sesquiterpene profile of an incubation of the recombinant
DlTps589 protein with FPP. B. Negative control performed in the
same conditions with E. coli cells transformed with an empty
plasmid. C. Mass spectrum of the peak at 11.76 min. D. Mass
spectrum of an authentic standard of (-)-drimenol.
FIG. 3. GCMS analysis of the sesquiterpenes produced in vivo by the
recombinant DlTps589 in engineered bacterial cells. A. Total ion
chromatogram. B. Mass spectrum of the peak at 11.49 min. C. Mass
spectrum of an authentic standard of (-)-drimenol. The compound
eluting at 10.98 min is farnesol produced by the DlTps589 enzyme or
resulting from the hydrolysis of excess FPP produced by the E. coli
cells.
FIG. 4. Structure of (-)-drimenol produced by the recombinant
DlTps589 synthase.
FIG. 5. Chiral GC/FID chromatograms of (-)-drimenol produced by the
recombinant enzyme (upper), racemic drimenol obtained chemically
(middle) and authentic (-)-drimenol (lower).
FIG. 6. Total ion chromatogram of GCMS analysis of the
sesquiterpenes produced in in-vitro assays by the recombinant
proteins SCH51-3228-9 (A), SCH51-998-28 (B) or SCH52-13163-6
(C).
FIG. 7. Total ion chromatogram of GCMS analysis of the
sesquiterpenes produced in vivo by engineered bacterial cells
expressing the different recombinant proteins SCH51-3228-9 (A),
SCH51-998-28 (B) or SCH52-13163-6 (C). The farnesol detected
results from the hydrolysis of excess FPP produced by the E. coli
cells or could be in part produced by the recombinant proteins.
DETAILED DESCRIPTION
For the descriptions herein and the appended claims, the use of
"or" means "and/or" unless stated otherwise. Similarly, "comprise,"
"comprises," "comprising," "include," "includes," and "including"
are interchangeable and not intended to be limiting.
It is to be further understood that where descriptions of various
embodiments use the term "comprising," those skilled in the art
would understand that in some specific instances, an embodiment can
be alternatively described using language "consisting essentially
of" or "consisting of."
In one aspect, provided here is a method of producing Drimenol
comprising:
i) contacting an acyclic terpene pyrophosphate, particularly
farnesyl diphosphate (FPP), with a polypeptide having Drimenol
synthase activity and having at least, or at least about, 70%,
particularly 75%, particularly 80%, particularly 85%, particularly
90%, particularly 95%, particularly 96%, particularly 97%,
particularly 98% or particularly 99% or more sequence identity to a
sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID
NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 and SEQ ID NO: 14 to produce the
Drimenol; and
ii) optionally isolating the Drimenol.
In one aspect, the Drimenol is isolated.
Further provided here is an isolated polypeptide having Drimenol
synthase activity comprising an amino acid sequence having at
least, or at least about, 70%, particularly 75%, particularly 80%,
particularly 85%, particularly 90%, particularly 95%, particularly
96%, particularly 97%, particularly 98% or more particularly 99% or
more identity to an amino acid sequence selected from the group
consisting of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO:
11 and SEQ ID NO: 14.
Further provided herein is an isolated nucleic acid molecule
encoding a polypeptide comprising an amino acid sequence having at
least, or at least about, 70%, particularly 75%, particularly 80%,
particularly 85%, particularly 90%, particularly 95%, particularly
96%, particularly 97%, particularly 98% or more particularly 99% or
more identity to an amino acid sequence selected from the group
consisting of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO:
11 and SEQ ID NO: 14.
Further provided herein is a nucleic acid molecule comprising the
sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID
NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ
ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 and SEQ ID NO: 15.
Further provided here is a method as recited in claim 1 comprising
the steps of transforming a host cell or non-human organism with a
nucleic acid encoding a polypeptide having at least, or at least
about, 70%, particularly 75%, particularly 80%, particularly 85%,
particularly 90%, particularly 95%, particularly 96%, particularly
97%, particularly 98% or particularly 99% or more sequence identity
to a sequence selected from the group consisting of
SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 and SEQ ID
NO: 14 and culturing the host cell or organism under conditions
that allow for the production of the polypeptide.
Further provided is at least one vector comprising the nucleic acid
molecules described.
Further provided herein is a vector selected from the group of a
prokaryotic vector, viral vector and a eukaryotic vector.
Further provided here is a vector that is an expression vector.
As a "Drimenol synthase" or as a "polypeptide having a Drimenol
synthase activity", we mean here a polypeptide capable of
catalyzing the synthesis of Drimenol, in the form of any of its
stereoisomers or a mixture thereof, starting from an acyclic
terpene pyrophosphate, particularly FPP. Drimenol may be the only
product or may be part of a mixture of sesquiterpenes.
The ability of a polypeptide to catalyze the synthesis of a
particular sesquiterpene (for example Drimenol) can be simply
confirmed by performing the enzyme assay as detailed in Examples 2
to 5.
Polypeptides are also meant to include truncated polypeptides
provided that they keep their Drimenol synthase activity.
As intended herein below, "a nucleotide sequence obtained by
modifying SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO:
13 or SEQ ID NO: 15 or the complement thereof" encompasses any
sequence that has been obtained by changing the sequence of SEQ ID
NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ
ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 12 or of the complement
thereof using any method known in the art, for example by
introducing any type of mutations such as deletion, insertion or
substitution mutations. Examples of such methods are cited in the
part of the description relative to the variant polypeptides and
the methods to prepare them.
The percentage of identity between two peptidic or nucleotidic
sequences is a function of the number of amino acids or nucleotide
residues that are identical in the two sequences when an alignment
of these two sequences has been generated. Identical residues are
defined as residues that are the same in the two sequences in a
given position of the alignment. The percentage of sequence
identity, as used herein, is calculated from the optimal alignment
by taking the number of residues identical between the two
sequences, dividing it by the total number of residues in the
shorter sequence and multiplying by 100. The optimal alignment is the
alignment in which the percentage of identity is the highest
possible. Gaps may be introduced into one or both sequences in one
or more positions of the alignment to obtain the optimal alignment.
These gaps are then taken into account as non-identical residues
for the calculation of the percentage of sequence identity.
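The calculation described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the two inputs are already aligned strings of equal length with "-" marking gaps, the function name `percent_identity` and the example sequences are hypothetical, and the code does not search for the optimal alignment (an alignment program such as BLAST would supply it).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Identical residues are counted column by column; gap columns never count
    as identical. The count is divided by the number of residues in the
    shorter (ungapped) sequence and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    length_a = sum(1 for x in aligned_a if x != gap)
    length_b = sum(1 for y in aligned_b if y != gap)
    shortest = min(length_a, length_b)
    if shortest == 0:
        raise ValueError("sequences must contain at least one residue")

    return 100.0 * identical / shortest


# Hypothetical aligned peptide fragments, for illustration only.
print(round(percent_identity("MKT-LLV", "MKTALIV"), 1))
```

In this example five columns are identical and the shorter sequence has six residues, so the result is 5/6 x 100.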
Alignment for the purpose of determining the percentage of amino
acid or nucleic acid sequence identity can be achieved in various
ways using computer programs, for instance publicly available
computer programs available on the world wide web. Preferably, the
BLAST program (Tatusova et al., FEMS Microbiol Lett., 1999,
174:247-250) set to the default parameters, available from
the National Center for Biotechnology Information (NCBI), can be
used to obtain an optimal alignment of peptidic or nucleotidic
sequences and to calculate the percentage of sequence identity.
ABBREVIATIONS USED
bp base pair
kb kilobase
BSA bovine serum albumin
DNA deoxyribonucleic acid
cDNA complementary DNA
DTT dithiothreitol
FID Flame ionization detector
FPP farnesyl pyrophosphate
GC gas chromatograph
IPTG isopropyl-β-D-thiogalactopyranoside
LB lysogeny broth
MS mass spectrometer
MVA mevalonic acid
PCR polymerase chain reaction
RMCE recombinase-mediated cassette exchange
3'-/5'-RACE 3' and 5' rapid amplification of cDNA ends
RNA ribonucleic acid
mRNA messenger ribonucleic acid
miRNA micro RNA
siRNA small interfering RNA
rRNA ribosomal RNA
tRNA transfer RNA
Definitions
The term "polypeptide" means an amino acid sequence of
consecutively polymerized amino acid residues, for instance, at
least 15 residues, at least 30 residues, at least 50 residues. In
some embodiments provided herein, a polypeptide comprises an amino
acid sequence that is an enzyme, or a fragment, or a variant
thereof.
The term "isolated" polypeptide refers to an amino acid sequence
that is removed from its natural environment by any method or
combination of methods known in the art and includes recombinant,
biochemical and synthetic methods.
The term "protein" refers to an amino acid sequence of any length
wherein amino acids are linked by covalent peptide bonds, and
includes oligopeptide, peptide, polypeptide and full length protein
whether naturally occurring or synthetic.
The terms "Drimenol synthase" or "Drimenol synthase protein" refer
to an enzyme that is capable of converting farnesyl diphosphate
(FPP) to Drimenol.
The terms "biological function," "function," "biological activity"
or "activity" refer to the ability of the Drimenol synthase to
catalyze the formation of Drimenol from FPP.
The terms "nucleic acid sequence," "nucleic acid," and
"polynucleotide" are used interchangeably meaning a sequence of
nucleotides. A nucleic acid sequence may be a single-stranded or
double-stranded deoxyribonucleotide, or ribonucleotide of any
length, and includes coding and non-coding sequences of a gene,
exons, introns, sense and anti-sense complementary sequences,
genomic DNA, cDNA, miRNA, siRNA, mRNA, rRNA, tRNA, recombinant
nucleic acid sequences, isolated and purified naturally occurring
DNA and/or RNA sequences, synthetic DNA and RNA sequences,
fragments, primers and nucleic acid probes. The skilled artisan is
aware that the nucleic acid sequences of RNA are identical to the
DNA sequences with the difference of thymine (T) being replaced by
uracil (U).
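The DNA-to-RNA relationship stated above can be illustrated with a short Python sketch. The function name `dna_to_rna` and the example sequence are hypothetical and serve only as an illustration.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA sequence matching a DNA sequence (T replaced by U)."""
    return dna.upper().replace("T", "U")


print(dna_to_rna("ATGGCT"))  # AUGGCU
```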
An "isolated nucleic acid" or "isolated nucleic acid sequence" is
defined as a nucleic acid or nucleic acid sequence that is in an
environment different from that in which the nucleic acid or
nucleic acid sequence naturally occurs. The term
"naturally-occurring" as used herein as applied to a nucleic acid
refers to a nucleic acid that is found in a cell in nature. For
example, a nucleic acid sequence that is present in an organism,
for instance in the cells of an organism, that can be isolated from
a source in nature and which has not been intentionally modified by
a human in the laboratory is naturally occurring.
"Recombinant nucleic acid sequences" are nucleic acid sequences that
result from the use of laboratory methods (molecular cloning) to
bring together genetic material from more than one source, creating
a nucleic acid sequence that does not occur naturally and would not
be otherwise found in biological organisms.
"Recombinant DNA technology" refers to molecular biology procedures
to prepare a recombinant nucleic acid sequence as described, for
instance, in Laboratory Manuals edited by Weigel and Glazebrook,
2002 Cold Spring Harbor Lab Press; and Sambrook et al., 1989 Cold
Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press.
The term "gene" means a DNA sequence comprising a region, which is
transcribed into an RNA molecule, e.g., an mRNA in a cell, operably
linked to suitable regulatory regions, e.g., a promoter. A gene may
thus comprise several operably linked sequences, such as a
promoter, a 5' leader sequence comprising, e.g., sequences involved
in translation initiation, a coding region of cDNA or genomic DNA,
introns, exons, and/or a 3' non-translated sequence comprising,
e.g., transcription termination sites.
A "chimeric gene" refers to any gene, which is not normally found
in nature in a species, in particular, a gene in which one or more
parts of the nucleic acid sequence are present that are not
associated with each other in nature. For example the promoter is
not associated in nature with part or all of the transcribed region
or with another regulatory region. The term "chimeric gene" is
understood to include expression constructs in which a promoter or
transcription regulatory sequence is operably linked to one or more
coding sequences or to an antisense, i.e., reverse complement of
the sense strand, or inverted repeat sequence (sense and antisense,
whereby the RNA transcript forms double stranded RNA upon
transcription).
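The antisense sequence mentioned above, i.e., the reverse complement of the sense strand, can be computed as in the following minimal Python sketch. The function name `reverse_complement` and the example sequence are hypothetical; only the four standard DNA bases are handled.

```python
# Translation table for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sense: str) -> str:
    """Return the reverse complement (antisense strand) of a DNA sequence."""
    return sense.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGGCT"))  # AGCCAT
```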
A "3' UTR" or "3' non-translated sequence" (also referred to as "3'
untranslated region," or "3' end") refers to the nucleic acid
sequence found downstream of the coding sequence of a gene, which
comprises for example a transcription termination site and (in
most, but not all eukaryotic mRNAs) a polyadenylation signal such
as AAUAAA or variants thereof. After termination of transcription,
the mRNA transcript may be cleaved downstream of the
polyadenylation signal and a poly(A) tail may be added, which is
involved in the transport of the mRNA to the site of translation,
e.g., cytoplasm.
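As an illustration of the polyadenylation signal mentioned above, the following Python sketch locates every occurrence of the AAUAAA motif in an mRNA string. The function name `find_polya_signals` and the example transcript are hypothetical, and variant signals are not searched for.

```python
from typing import List


def find_polya_signals(mrna: str, signal: str = "AAUAAA") -> List[int]:
    """Return the 0-based start positions of each occurrence of the signal."""
    positions = []
    start = mrna.find(signal)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping matches are also found.
        start = mrna.find(signal, start + 1)
    return positions


# Hypothetical transcript fragment, for illustration only.
print(find_polya_signals("GGCAAUAAAGCUUAAUAAACC"))  # [3, 13]
```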
"Expression of a gene" involves transcription of the gene and
translation of the mRNA into a protein. Overexpression refers to
the production of the gene product as measured by levels of mRNA,
polypeptide and/or enzyme activity in transgenic cells or organisms
that exceeds levels of production in non-transformed cells or
organisms of a similar genetic background.
"Expression vector" as used herein means a nucleic acid molecule
engineered using molecular biology methods and recombinant DNA
technology for delivery of foreign or exogenous DNA into a host
cell. The expression vector typically includes sequences required
for proper transcription of the nucleotide sequence. The coding
region usually codes for a protein of interest but may also code
for an RNA, e.g., an antisense RNA, siRNA and the like.
An "expression vector" as used herein includes any linear or
circular recombinant vector including but not limited to viral
vectors, bacteriophages and plasmids. The skilled person is capable
of selecting a suitable vector according to the expression system.
In one embodiment, the expression vector includes the nucleic acid
of an embodiment herein operably linked to at least one regulatory
sequence, which controls transcription, translation, initiation and
termination, such as a transcriptional promoter, operator or
enhancer, or an mRNA ribosomal binding site and, optionally,
including at least one selection marker. Nucleotide sequences are
"operably linked" when the regulatory sequence functionally relates
to the nucleic acid of an embodiment herein. "Regulatory sequence"
refers to a nucleic acid sequence that determines expression level
of the nucleic acid sequences of an embodiment herein and is
capable of regulating the rate of transcription of the nucleic acid
sequence operably linked to the regulatory sequence. Regulatory
sequences comprise promoters, enhancers, transcription factors,
promoter elements and the like.
"Promoter" refers to a nucleic acid sequence that controls the
expression of a coding sequence by providing a binding site for RNA
polymerase and other factors required for proper transcription
including without limitation transcription factor binding sites,
repressor and activator protein binding sites. The meaning of the
term promoter also includes the term "promoter regulatory sequence".
Promoter regulatory sequences may include upstream and downstream
elements that may influence transcription, RNA processing or
stability of the associated coding nucleic acid sequence. Promoters
include naturally-derived and synthetic sequences. The coding
nucleic acid sequence is usually located downstream of the
promoter with respect to the direction of the transcription
starting at the transcription initiation site.
The term "constitutive promoter" refers to an unregulated promoter
that allows for continual transcription of the nucleic acid
sequence it is operably linked to.
As used herein, the term "operably linked" refers to a linkage of
polynucleotide elements in a functional relationship. A nucleic
acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For instance, a
promoter, or rather a transcription regulatory sequence, is
operably linked to a coding sequence if it affects the
transcription of the coding sequence. Operably linked means that
the DNA sequences being linked are typically contiguous. The
nucleotide sequence associated with the promoter sequence may be of
homologous or heterologous origin with respect to the plant to be
transformed. The sequence also may be entirely or partially
synthetic. Regardless of the origin, the nucleic acid sequence
associated with the promoter sequence will be expressed or silenced
in accordance with promoter properties to which it is linked after
binding to the polypeptide of an embodiment herein. The associated
nucleic acid may code for a protein that is desired to be expressed
or suppressed throughout the organism at all times or,
alternatively, at a specific time or in specific tissues, cells, or
cell compartment. Such nucleotide sequences particularly encode
proteins conferring desirable phenotypic traits to the host cells
or organism altered or transformed therewith. More particularly,
the associated nucleotide sequence leads to the production of
Drimenol in the organism. Particularly, the nucleotide sequence
encodes Drimenol synthase.
"Target peptide" refers to an amino acid sequence which targets a
protein, or polypeptide to intracellular organelles, i.e.,
mitochondria, or plastids, or to the extracellular space (secretion
signal peptide). A nucleic acid sequence encoding a target peptide
may be fused to the nucleic acid sequence encoding the amino
terminal end, i.e., N-terminal end, of the protein or polypeptide,
or may be used to replace a native targeting polypeptide.
The term "primer" refers to a short nucleic acid sequence that is
hybridized to a template nucleic acid sequence and is used for
polymerization of a nucleic acid sequence complementary to the
template.
As used herein, the term "host cell" or "transformed cell" refers
to a cell (or organism) altered to harbor at least one nucleic acid
molecule, for instance, a recombinant gene encoding a desired
protein or nucleic acid sequence which upon transcription yields a
Drimenol synthase protein useful to produce Drimenol. The host cell
is particularly a bacterial cell, a fungal cell or a plant cell.
The host cell may contain a recombinant gene which has been
integrated into the nuclear or organelle genomes of the host cell.
Alternatively, the host may contain the recombinant gene
extra-chromosomally. Homologous sequences include orthologous or
paralogous sequences. Methods of identifying orthologs or paralogs
including phylogenetic methods, sequence similarity and
hybridization methods are known in the art and are described
herein.
Paralogs result from gene duplication that gives rise to two or
more genes with similar sequences and similar functions. Paralogs
typically cluster together and are formed by duplications of genes
within related plant species. Paralogs are found in groups of
similar genes using pair-wise Blast analysis or during phylogenetic
analysis of gene families using programs such as CLUSTAL. In
paralogs, consensus sequences can be identified characteristic to
sequences within related genes and having similar functions of the
genes.
Orthologs, or orthologous sequences, are sequences similar to each
other because they are found in species that descended from a
common ancestor. For instance, plant species that have common
ancestors are known to contain many enzymes that have similar
sequences and functions. The skilled artisan can identify
orthologous sequences and predict the functions of the orthologs,
for example, by constructing a phylogenetic tree for a gene family of
one species using CLUSTAL or BLAST programs. A method for
identifying or confirming similar functions among homologous
sequences is by comparing the transcript profiles in plants
overexpressing or lacking (in knockouts/knockdowns) related
polypeptides. The skilled person will understand that genes having
similar transcript profiles, with greater than 50% regulated
transcripts in common, or with greater than 70% regulated
transcripts in common, or greater than 90% regulated transcripts in
common will have similar functions. Homologs, paralogs, orthologs
and any other variants of the sequences herein are expected to
function in a similar manner by making plants producing Drimenol
synthase proteins.
An embodiment provided herein provides amino acid sequences
of Drimenol synthase proteins including orthologs and paralogs as
well as methods for identifying and isolating orthologs and
paralogs of the Drimenol synthases in other organisms.
Particularly, so identified orthologs and paralogs of the Drimenol
synthase retain Drimenol synthase activity and are capable of
producing Drimenol starting from FPP precursors.
The term "selectable marker" refers to any gene which upon
expression may be used to select a cell or cells that include the
selectable marker. Examples of selectable markers are described
below. The skilled artisan will know that different antibiotic,
fungicide, auxotrophic or herbicide selectable markers are
applicable to different target species.
"Drimenol" for purposes of this application refers to (-)-drimenol
(CAS: 468-68-8).
The term "organism" refers to any non-human multicellular or
unicellular organism such as a plant, or a microorganism.
Particularly, a micro-organism is a bacterium, a yeast, an alga or
a fungus. The term "plant" is used interchangeably to include plant
cells including plant protoplasts, plant tissues, plant cell tissue
cultures giving rise to regenerated plants, or parts of plants, or
plant organs such as roots, stems, leaves, flowers, pollen, ovules,
embryos, fruits and the like. Any plant can be used to carry out
the methods of an embodiment herein.
The polypeptide to be contacted with an acyclic pyrophosphate, e.g.
FPP, in vitro can be obtained by extraction from any organism
expressing it, using standard protein or enzyme extraction
technologies. If the host organism is a unicellular organism or
cell releasing the polypeptide of an embodiment herein into the
culture medium, the polypeptide may simply be collected from the
culture medium, for example by centrifugation, optionally followed
by washing steps and re-suspension in suitable buffer solutions. If
the organism or cell accumulates the polypeptide within its cells,
the polypeptide may be obtained by disruption or lysis of the cells
and further extraction of the polypeptide from the cell lysate.
The polypeptide having a Drimenol synthase activity, either in an
isolated form or together with other proteins, for example in a
crude protein extract obtained from cultured cells or
microorganisms, may then be suspended in a buffer solution at
optimal pH. If adequate, salts, DTT, inorganic cations and other
kinds of enzymatic co-factors, may be added in order to optimize
enzyme activity. The precursor FPP is added to the polypeptide
suspension, which is then incubated at optimal temperature, for
example between 15 and 40.degree. C., particularly between 25 and
35.degree. C., more particularly at 30.degree. C. After incubation,
the Drimenol produced may be isolated from the incubated solution
by standard isolation procedures, such as solvent extraction and
distillation, optionally after removal of polypeptides from the
solution.
According to another particular embodiment, the method of any of
the above-described embodiments is carried out in vivo. In this
case, step a) comprises cultivating a non-human host organism or
cell capable of producing FPP and transformed to express at least
one polypeptide comprising an amino acid sequence at least 70%
identical to a sequence selected from the group consisting of SEQ
ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 and SEQ ID NO:
14 and having a Drimenol synthase activity, under conditions
conducive to the production of Drimenol.
According to a more particular embodiment, the method further
comprises, prior to step a), transforming a non-human organism or
cell capable of producing FPP with at least one nucleic acid
encoding a polypeptide comprising an amino acid sequence at least
70% identical to a sequence selected from the group consisting of
SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 and SEQ ID
NO: 14 and having a Drimenol synthase activity, so that said
organism expresses said polypeptide.
These embodiments provided herein are particularly advantageous
since it is possible to carry out the method in vivo without
previously isolating the polypeptide. The reaction occurs directly
within the organism or cell transformed to express said
polypeptide.
According to a more particular embodiment at least one nucleic acid
used in any of the above embodiments comprises a nucleotide
sequence that has been obtained by modifying SEQ ID NO: 1, SEQ ID
NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ
ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO: 15 or the
complement thereof. According to another embodiment, the at least
one nucleic acid is isolated from a plant of the Winteraceae family
or the Canellaceae family, particularly from Drimys winteri or
Drimys lanceolata.
The organism or cell is meant to "express" a polypeptide, provided
that the organism or cell is transformed to harbor a nucleic acid
encoding said polypeptide, this nucleic acid is transcribed to mRNA
and the polypeptide is found in the host organism or cell. The term
"express" encompasses "heterologously express" and "over-express",
the latter referring to levels of mRNA, polypeptide and/or enzyme
activity over and above what is measured in a non-transformed
organism or cell. A more detailed description of suitable methods
to transform a non-human host organism or cell will be described
later on in the part of the specification that is dedicated to such
transformed non-human host organisms or cells.
A particular organism or cell is meant to be "capable of producing
FPP" when it produces FPP naturally or when it does not produce FPP
naturally but is transformed to produce FPP, either prior to the
transformation with a nucleic acid as described herein or together
with said nucleic acid. Organisms or cells transformed to produce a
higher amount of FPP than the naturally occurring organism or cell
are also encompassed by the "organisms or cells capable of
producing FPP". Methods to transform organisms, for example
microorganisms, so that they produce FPP are already known in the
art.
To carry out an embodiment herein in vivo, the host organism or
cell is cultivated under conditions conducive to the production of
Drimenol. Accordingly, if the host is a transgenic plant, optimal
growth conditions are provided, such as optimal light, water and
nutrient conditions, for example. If the host is a unicellular
organism, conditions conducive to the production of Drimenol may
comprise addition of suitable cofactors to the culture medium of
the host. In addition, a culture medium may be selected, so as to
maximize Drimenol synthesis. Optimal culture conditions are
described in a more detailed manner in the following Examples.
Non-human host organisms suitable to carry out the method of an
embodiment herein in vivo may be any non-human multicellular or
unicellular organisms. In a particular embodiment, the non-human
host organism used to carry out an embodiment herein in vivo is a
plant, a prokaryote or a fungus. Any plant, prokaryote or fungus
can be used. Particularly useful plants are those that naturally
produce high amounts of terpenes. In a more particular embodiment
the non-human host organism used to carry out the method of an
embodiment herein in vivo is a microorganism. Any microorganism can
be used but according to an even more particular embodiment said
microorganism is a bacterium or yeast. Most particularly, said
bacterium is E. coli and said yeast is Saccharomyces cerevisiae.
Some of these organisms do not produce FPP naturally. To be
suitable to carry out the method of an embodiment herein, these
organisms have to be transformed to produce said precursor. They
can be so transformed either before the modification with the
nucleic acid described according to any of the above embodiments or
simultaneously, as explained above.
Isolated higher eukaryotic cells can also be used, instead of
complete organisms, as hosts to carry out the method of an
embodiment herein in vivo. Suitable eukaryotic cells may be any
non-human cell, but are particularly plant or fungal cells.
In another particular embodiment, the polypeptide consists of an
amino acid sequence having at least 70%, particularly at least
75%, particularly at least 80%, particularly at least 85%,
particularly at least 90%, particularly at least 95%, particularly
at least 96%, particularly at least 97%, particularly at least 98%
and even more particularly at least 99% sequence identity to a
sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID
NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 and SEQ ID NO: 14. In an even
more particular embodiment, said polypeptide consists of SEQ
ID.
According to another particular embodiment, the at least one
polypeptide having a Drimenol synthase activity used in any of the
above-described embodiments or encoded by the nucleic acid used in
any of the above-described embodiments comprises an amino acid
sequence that is a variant of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID
NO: 8, SEQ ID NO: 11 or SEQ ID NO: 14 obtained by genetic
engineering, provided that said variant keeps its Drimenol synthase
activity, as defined above and has the required percentage of
identity to SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO: 11
or SEQ ID NO: 14. In other terms, said polypeptide particularly
comprises an amino acid sequence encoded by a nucleotide sequence
that has been obtained by modifying SEQ ID NO: 1, SEQ ID NO: 3, SEQ
ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10,
SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO: 15 or the complement
thereof. According to a more particular embodiment, the at least
one polypeptide having a Drimenol synthase activity used in any of
the above-described embodiments or encoded by the nucleic acid used
in any of the above-described embodiments consists of an amino acid
sequence that is a variant of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID
NO: 8, SEQ ID NO: 11 or SEQ ID NO: 14 obtained by genetic
engineering, i.e. an amino acid sequence encoded by a nucleotide
sequence that has been obtained by modifying SEQ ID NO:
1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID
NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO: 15
or the complement thereof.
According to another particular embodiment, the at least one
polypeptide having a Drimenol synthase activity used in any of the
above-described embodiments or encoded by the nucleic acid used in
any of the above-described embodiments is a variant of SEQ ID NO:
2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 or SEQ ID NO: 14 that
can be found naturally in other organisms, such as other plant
species, provided that it keeps its Drimenol synthase activity as
defined above and has the required percentage of identity to SEQ
ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 or SEQ ID NO:
14.
As used herein, the polypeptide is intended as a polypeptide or
peptide fragment that encompasses the amino acid sequences
identified herein, as well as truncated or variant polypeptides,
provided that they keep their Drimenol synthase activity as defined
above and that they share at least the defined percentage of
identity with the corresponding fragment of SEQ ID NO: 2, SEQ ID
NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 or SEQ ID NO: 14.
Examples of variant polypeptides are naturally occurring proteins
that result from alternate mRNA splicing events or from proteolytic
cleavage of the polypeptides described herein. Variations
attributable to proteolysis include, for example, differences in
the N- or C-termini upon expression in different types of host
cells, due to proteolytic removal of one or more terminal amino
acids from the polypeptides of an embodiment herein. Polypeptides
encoded by a nucleic acid obtained by natural or artificial
mutation of a nucleic acid of an embodiment herein, as described
thereafter, are also encompassed by an embodiment herein.
Polypeptide variants resulting from a fusion of additional peptide
sequences at the amino and carboxyl terminal ends can also be used
in the methods of an embodiment herein. In particular such a fusion
can enhance expression of the polypeptides, be useful in the
purification of the protein or improve the enzymatic activity of
the polypeptide in a desired environment or expression system. Such
additional peptide sequences may be signal peptides, for example.
Accordingly, encompassed herein are methods using variant
polypeptides, such as those obtained by fusion with other oligo- or
polypeptides and/or those which are linked to signal peptides.
Polypeptides resulting from a fusion with another functional
protein, such as another protein from the terpene biosynthesis
pathway, can also advantageously be used in the methods of an
embodiment herein.
According to another embodiment, the at least one polypeptide
having a Drimenol synthase activity used in any of the
above-described embodiments or encoded by the nucleic acid used in
any of the above-described embodiments is isolated from a plant of
the Winteraceae family or the Canellaceae family, particularly from
Drimys winteri or Drimys lanceolata.
An important tool to carry out the method of an embodiment herein
is the polypeptide itself. A polypeptide having a Drimenol synthase
activity and comprising an amino acid sequence at least 70%
identical to SEQ ID NO:2 is therefore provided herein.
According to a particular embodiment, the polypeptide is capable of
producing a mixture of sesquiterpenes wherein Drimenol represents
at least 20%, particularly at least 30%, particularly at least 35%,
particularly at least 90%, particularly at least 95%, more
particularly at least 98% of the sesquiterpenes produced. In
another aspect provided here, the Drimenol is produced with greater
than or equal to 95%, more particularly 98% selectivity.
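As a purely illustrative sketch, the share of Drimenol in a sesquiterpene mixture can be estimated from integrated chromatogram peak areas. The function name, the peak labels and the numerical peak areas below are hypothetical and are not taken from the Examples; the sketch further assumes that all sesquiterpenes have equal detector response factors, which may not hold in practice.

```python
from typing import Mapping


def product_share(peak_areas: Mapping[str, float], product: str = "drimenol") -> float:
    """Return the percentage of `product` among all listed sesquiterpene peaks.

    Assumption: peak area is proportional to amount for every compound.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * peak_areas.get(product, 0.0) / total


# Hypothetical peak areas (arbitrary units), for illustration only.
areas = {"drimenol": 980.0, "other sesquiterpenes": 20.0}
share = product_share(areas)
print(f"Drimenol share: {share:.1f}%")
print("meets the 95% level:", share >= 95.0)
```

The same function can be applied to any of the percentage levels recited above by changing the comparison value.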
According to a particular embodiment, the polypeptide comprises an
amino acid sequence at least 70%, particularly at least 75%,
particularly at least 80%, particularly at least 85%, particularly
at least 90%, particularly at least 95%, particularly at least 96%,
particularly at least 97%, particularly at least 98% and even more
particularly at least 99% identical to a sequence selected from the
group consisting of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ
ID NO: 11 and SEQ ID NO: 14. According to a more particular
embodiment, the polypeptide comprises an amino acid sequence selected
from the group consisting of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID
NO: 8, SEQ ID NO: 11 and SEQ ID NO: 14.
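By way of a hedged illustration, the identity levels recited above can be checked on a pairwise alignment that has already been computed. The sketch below assumes one common convention (matching columns divided by alignment length, with gap characters counted as mismatches); other conventions exist and the definition given elsewhere in this specification governs. The function names and the toy peptide strings are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumed convention: identical columns / alignment length, where '-'
    marks a gap and counts as a mismatch. Columns gapped in both
    sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the given identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned peptides (hypothetical, for illustration only).
print(percent_identity("MKTA", "MKTG"))
print(meets_identity("MK-A", "MKTA", threshold=70.0))
```

The threshold argument can be set to any of the levels listed above, for example 70, 90 or 99.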
According to another particular embodiment, the polypeptide
consists of an amino acid sequence at least 70%, particularly at
least 75%, particularly at least 80%, particularly at least 85%,
particularly at least 90%, particularly at least 95%, particularly
at least 96%, particularly at least 97%, particularly at least 98%
and even more particularly at least 99% identical to a sequence
selected from the group consisting of SEQ ID NO: 2, SEQ ID NO:
5, SEQ ID NO: 8, SEQ ID NO: 11 or SEQ ID NO: 14. According to a
more particular embodiment, the polypeptide consists of an amino
acid sequence selected from the group consisting of SEQ ID NO: 2,
SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO: 11 or SEQ ID NO: 14.
The at least one polypeptide comprises an amino acid sequence that
is a variant of SEQ ID NO:2, either obtained by genetic engineering
or found naturally in Drimys plants or in other plant species. In
other terms, when the variant polypeptide is obtained by genetic
engineering, said polypeptide comprises an amino acid sequence
encoded by a nucleotide sequence that has been obtained by
modifying SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO:
13 or SEQ ID NO: 15 or the complement thereof. According to a more
particular embodiment, the at least one polypeptide having a
Drimenol synthase activity consists of an amino acid sequence that
is a variant of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID
NO: 11 or SEQ ID NO: 14 obtained by genetic engineering, i.e. an
amino acid sequence encoded by a nucleotide sequence that has been
obtained by modifying SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ
ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12,
SEQ ID NO: 13 or SEQ ID NO: 15 or the complement thereof.
According to another embodiment, the polypeptide is isolated from a
plant of the Winteraceae family or the Canellaceae family,
particularly from Drimys winteri or Drimys lanceolata. As used
herein, the polypeptide is intended as a polypeptide or peptide
fragment that encompasses the amino acid sequence identified
herein, as well as truncated or variant polypeptides, provided that
they keep their activity as defined above and that they share at
least the defined percentage of identity with the corresponding
fragment of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 8, SEQ ID NO: 11
or SEQ ID NO: 14.
As mentioned above, the nucleic acid encoding the polypeptide of an
embodiment herein is a useful tool to modify non-human host
organisms or cells intended to be used when the method is carried
out in vivo.
A nucleic acid encoding a polypeptide according to any of the
above-described embodiments is therefore also provided herein.
According to a particular embodiment, the nucleic acid comprises a
nucleotide sequence at least 50%, particularly at least 55%,
particularly at least 60%, particularly at least 65%, particularly
at least 70%, particularly at least 75%, particularly at least 80%,
particularly at least 85%, particularly at least 90%, more
particularly at least 95%, particularly at least 96%, particularly
at least 97%, particularly at least 98%, and even more particularly
at least 99% identical to a sequence selected from the group
consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID
NO: 13 or SEQ ID NO: 15 or the complement thereof. According to a
more particular embodiment, the nucleic acid comprises a nucleotide
sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID
NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ
ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO: 15 or the
complement thereof.
According to another particular embodiment, the nucleic acid
consists of a nucleotide sequence at least 70%, particularly at
least 75%, particularly at least 80%, particularly at least 85%,
particularly at least 90%, particularly at least 95%, particularly
at least 96%, particularly at least 97%, particularly at least 98%
and even more particularly at least 99% identical to a
sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID
NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ
ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO: 15 or the complement
thereof. According to an even more particular embodiment, the
nucleic acid consists of a sequence selected from the group
consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO:
6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID
NO: 13 or SEQ ID NO: 15 or the complement thereof.
The nucleic acid of an embodiment herein can be defined as
including deoxyribonucleotide or ribonucleotide polymers in either
single- or double-stranded form (DNA and/or RNA). The term
"nucleotide sequence" should also be understood as comprising a
polynucleotide molecule or an oligonucleotide molecule in the form
of a separate fragment or as a component of a larger nucleic acid.
Nucleic acids of an embodiment herein also encompass certain
isolated nucleotide sequences including those that are
substantially free from contaminating endogenous material. The
nucleic acid of an embodiment herein may be truncated, provided
that it encodes a polypeptide encompassed herein, as described
above.
In one embodiment, the nucleic acid of an embodiment herein can be
either present naturally in plants of the Drimys species or other
species, or be obtained by modifying SEQ ID NO: 1, SEQ ID NO: 3,
SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO:
10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO: 15 or the complement
thereof. Particularly said nucleic acid consists of a nucleotide
sequence that has been obtained by modifying SEQ ID NO: 1, SEQ ID
NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ
ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO: 15 or the
complement thereof.
The nucleic acids comprising a sequence obtained by mutation of SEQ
ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7,
SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID
NO: 15 or the complement thereof are encompassed by an embodiment
herein, provided that the sequences they comprise share at least
the defined percentage of identity with the corresponding fragments
of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID
NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or
SEQ ID NO: 15 or the complement thereof and provided that they
encode a polypeptide having a Drimenol synthase activity, as
defined in any of the above embodiments. Mutations may be any kind
of mutations of these nucleic acids, such as point mutations,
deletion mutations, insertion mutations and/or frame shift
mutations. A variant nucleic acid may be prepared in order to adapt
its nucleotide sequence to a specific expression system. For
example, bacterial expression systems are known to more efficiently
express polypeptides if amino acids are encoded by particular
codons.
Due to the degeneracy of the genetic code, more than one codon may
encode the same amino acid, so that multiple nucleic acid
sequences can code for the same protein or polypeptide, all these
DNA sequences being encompassed by an embodiment herein. Where
appropriate, the nucleic acid sequences encoding the Drimenol
synthase may be optimized for increased expression in the host
cell. For example, nucleotides of an embodiment herein may be
synthesized using codons preferred by a host for improved
expression.
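The degeneracy described above can be illustrated with a minimal sketch based on the standard genetic code. The helper names are hypothetical, and the codon preference passed to `recode` is a placeholder chosen only to show the mechanism; it is not actual codon usage data for any host, and real codon optimization would rely on host-specific usage tables.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def synonymous_codons(amino_acid: str) -> list:
    """All codons of the standard code that encode the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def recode(protein: str, preferred: dict) -> str:
    """Back-translate a protein, using `preferred` codons where supplied.

    `preferred` is a caller-supplied (here hypothetical) mapping of amino
    acid to codon; other residues fall back to the first synonymous codon.
    """
    return "".join(preferred.get(aa, synonymous_codons(aa)[0]) for aa in protein)


# Two different DNA sequences encoding the same dipeptide.
print(translate("ATGGCT"), translate("ATGGCG"))
print(recode("MA", {"A": "GCG"}))
```

Both input sequences translate to the same peptide, which is the sense in which several nucleic acid sequences are encompassed for one polypeptide.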
Another important tool for transforming host organisms or cells
suitable to carry out the method of an embodiment herein in vivo is
an expression vector comprising a nucleic acid according to any
embodiment herein. Such a vector is therefore also
provided herein.
The expression vectors provided herein may be used in the methods
for preparing a genetically transformed host organism and/or cell,
in host organisms and/or cells harboring the nucleic acids of an
embodiment herein and in the methods for making polypeptides having
a Drimenol synthase activity, as disclosed further below.
Recombinant non-human host organisms and cells transformed to
harbor at least one nucleic acid of an embodiment herein so that it
heterologously expresses or over-expresses at least one polypeptide
of an embodiment herein are also very useful tools to carry out the
method of an embodiment herein. Such non-human host organisms and
cells are therefore also provided herein.
A nucleic acid according to any of the above-described embodiments
can be used to transform the non-human host organisms and cells and
the expressed polypeptide can be any of the above-described
polypeptides.
Non-human host organisms of an embodiment herein may be any
non-human multicellular or unicellular organisms. In a particular
embodiment, the non-human host organism is a plant, a prokaryote or
a fungus. Any plant, prokaryote or fungus is suitable to be
transformed according to the methods provided herein. Particularly
useful plants are those that naturally produce high amounts of
terpenes.
In a more particular embodiment the non-human host organism is a
microorganism. Any microorganism is suitable to be used herein, but
according to an even more particular embodiment said microorganism
is a bacterium or yeast. Most particularly, said bacterium is E. coli
and said yeast is Saccharomyces cerevisiae.
Isolated higher eukaryotic cells can also be transformed, instead
of complete organisms. As higher eukaryotic cells, we mean here any
non-human eukaryotic cell except yeast cells. Particular higher
eukaryotic cells are plant cells or fungal cells.
A variant may also differ from the polypeptide of an embodiment
herein by attachment of modifying groups which are covalently or
non-covalently linked to the polypeptide backbone. The variant also
includes a polypeptide which differs from the polypeptide described
herein by introduced N-linked or O-linked glycosylation sites,
and/or an addition of cysteine residues. The skilled artisan will
recognise how to modify an amino acid sequence and preserve
biological activity.
The functionality or activity of any Drimenol synthase protein,
variant or fragment, may be determined using various methods. For
example, transient or stable overexpression in plant, bacterial or
yeast cells can be used to test whether the protein has activity,
i.e., produces Drimenol from FPP precursors. Drimenol synthase
activity may be assessed in a microbial expression system, such as
the assay described in Example 2 or 3 herein on the production of
Drimenol, indicating functionality. A variant or derivative of a
Drimenol synthase polypeptide of an embodiment herein retains an
ability to produce Drimenol from FPP precursors. Amino acid sequence
variants of the Drimenol synthases provided herein may have
additional desirable biological functions including, e.g., altered
substrate utilization, reaction kinetics, product distribution or
other alterations.
An embodiment herein provides polypeptides of an embodiment herein
to be used in a method to produce Drimenol by contacting an FPP
precursor with the polypeptides of an embodiment herein either in
vitro or in vivo.
Provided herein is also an isolated, recombinant or synthetic
polynucleotide encoding a polypeptide or variant polypeptide
provided herein.
An embodiment herein provides an isolated,
recombinant or synthetic nucleic acid sequence of SEQ ID NO: 1, SEQ
ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9,
SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO: 15 or a
variant thereof encoding a Drimenol synthase having an amino
acid sequence which is at least 70%, 75%, 80%, 85%, 90%, 92%, 95%,
96%, 97%, 98% or 99% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO:
8, SEQ ID NO: 11 or SEQ ID NO: 14 or fragments thereof that
catalyze production of Drimenol in a cell from an FPP precursor.
Embodiments provided herein include, but are not limited to cDNA,
genomic DNA and RNA sequences. Any nucleic acid sequence encoding
the Drimenol synthase or variants thereof is referred herein as a
Drimenol synthase encoding sequence.
According to a particular embodiment, the nucleic acid of SEQ ID
NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ
ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO:
15 is the coding sequence of a Drimenol synthase gene encoding
the Drimenol synthase obtained as described in the Examples.
A fragment of a polynucleotide of SEQ ID NO: 1, SEQ ID NO: 3, SEQ
ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10,
SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO: 15 refers to a stretch of
contiguous nucleotides of the polynucleotide of an embodiment herein
that is particularly at least 15 bp, at least 30 bp, at least 40 bp,
at least 50 bp and/or at least 60 bp in length. Particularly the
fragment of a polynucleotide comprises at least 25, more
particularly at least 50, more particularly at least 75, more
particularly at least 100, more particularly at least 150, more
particularly at least 200, more particularly at least 300, more
particularly at least 400, more particularly at least 500, more
particularly at least 600, more particularly at least 700, more
particularly at least 800, more particularly at least 900, more
particularly at least 1000 contiguous nucleotides of the
polynucleotide of an embodiment herein. Without being limited,
the fragment of the polynucleotides herein may be used as a PCR
primer, and/or as a probe, or for anti-sense gene silencing or
RNAi.
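The notion of a fragment as a run of contiguous nucleotides of a minimum length can be sketched as follows. This is an illustration only: the function names are hypothetical and the 30-nucleotide toy sequence is invented for the example and is not any of the sequences of the sequence listing.

```python
from typing import Iterator


def is_fragment(candidate: str, polynucleotide: str, min_length: int = 15) -> bool:
    """True if `candidate` is a contiguous stretch of `polynucleotide`
    that is at least `min_length` nucleotides long."""
    candidate = candidate.upper()
    polynucleotide = polynucleotide.upper()
    return len(candidate) >= min_length and candidate in polynucleotide


def fragments(polynucleotide: str, length: int) -> Iterator[str]:
    """Yield every contiguous fragment of exactly `length` nucleotides."""
    for start in range(len(polynucleotide) - length + 1):
        yield polynucleotide[start:start + length]


# Hypothetical 30-nucleotide toy sequence, for illustration only.
TOY = "ATGGCCAAATTTGGGCCCTTTAAAGGGTAA"
print(is_fragment(TOY[5:20], TOY))        # 15 contiguous nucleotides
print(is_fragment("ATGGCC", TOY))         # contiguous, but shorter than 15
print(sum(1 for _ in fragments(TOY, 15)))  # number of 15-nucleotide windows
```

The `min_length` argument can be raised to any of the lengths recited above, such as 25, 100 or 1000.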
It is clear to the person skilled in the art that genes, including
the polynucleotides of an embodiment herein, can be cloned on basis
of the available nucleotide sequence information, such as found in
the attached sequence listing, by methods known in the art. These
include e.g. the design of DNA primers representing the flanking
sequences of such a gene, one of which is generated in the sense
orientation and initiates synthesis of the sense strand, while
the other is created in reverse complementary fashion and generates
the antisense strand. Thermostable DNA polymerases such as those
used in polymerase chain reaction are commonly used to carry out
such experiments. Alternatively, DNA sequences representing genes
can be chemically synthesized and subsequently introduced in DNA
vector molecules that can be multiplied by compatible bacteria
such as e.g. E. coli.
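The primer geometry described above, one primer in sense orientation and the other in reverse complementary fashion, can be sketched as below. This is a simplified illustration under stated assumptions: the function names, the primer length and the toy sequence are hypothetical, and practical primer design would additionally consider melting temperature, GC content and secondary structure, none of which is modelled here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def flanking_primers(sense_strand: str, length: int = 20) -> tuple:
    """Return (forward, reverse) primers for the ends of `sense_strand`.

    The forward primer copies the 5' end of the sense strand; the reverse
    primer is the reverse complement of its 3' end.
    """
    if len(sense_strand) < 2 * length:
        raise ValueError("sequence too short for two non-overlapping primers")
    forward = sense_strand[:length]
    reverse = reverse_complement(sense_strand[-length:])
    return forward, reverse


# Hypothetical toy sequence; real primers would be longer than 6 nt.
forward, reverse = flanking_primers("ATGGCCAAATTTGGGTAA", length=6)
print(forward, reverse)
```

The reverse primer anneals to the sense strand and is extended to give the antisense strand, while the forward primer anneals to the antisense strand and regenerates the sense strand.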
In a related embodiment herein, PCR primers and/or
probes for detecting nucleic acid sequences encoding a Drimenol
synthase are provided. The skilled artisan will be aware of methods
to synthesize degenerate or specific PCR primer pairs to amplify a
nucleic acid sequence encoding the Drimenol synthase or fragments
thereof, based on SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID
NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ
ID NO: 13 or SEQ ID NO: 15. A detection kit for nucleic acid
sequences encoding the Drimenol synthase may include primers and/or
probes specific for nucleic acid sequences encoding the Drimenol
synthase, and an associated protocol to use the primers and/or
probes to detect nucleic acid sequences encoding the Drimenol
synthase in a sample. Such detection kits may be used to determine
whether a plant has been modified, i.e., transformed with a
sequence encoding the Drimenol synthase.
Provided herein are nucleic acid sequences obtained by mutations of
SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7,
SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID
NO: 15; such mutations can be routinely made. It is clear to the skilled
artisan that mutations, deletions, insertions, and/or substitutions
of one or more nucleotides can be introduced into the DNA sequence
of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID
NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or
SEQ ID NO: 15. Generally, a mutation is a change in the DNA
sequence of a gene that can alter the amino acid sequence of the
polypeptide produced.
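The kinds of sequence change mentioned above (substitutions, insertions and deletions) can be sketched with simple string operations, together with a check of whether a change shifts the reading frame. The function names are hypothetical, positions are 0-based, and the six-nucleotide input is a toy sequence rather than any sequence of the sequence listing.

```python
def substitute(sequence: str, position: int, base: str) -> str:
    """Replace the nucleotide at a 0-based position (point mutation)."""
    return sequence[:position] + base + sequence[position + 1:]


def insert(sequence: str, position: int, bases: str) -> str:
    """Insert one or more nucleotides before a 0-based position."""
    return sequence[:position] + bases + sequence[position:]


def delete(sequence: str, position: int, count: int = 1) -> str:
    """Delete `count` nucleotides starting at a 0-based position."""
    return sequence[:position] + sequence[position + count:]


def shifts_reading_frame(original: str, mutated: str) -> bool:
    """True if the length change is not a multiple of three nucleotides."""
    return (len(mutated) - len(original)) % 3 != 0


toy = "ATGGCT"  # hypothetical toy coding sequence
print(substitute(toy, 5, "G"), shifts_reading_frame(toy, substitute(toy, 5, "G")))
print(insert(toy, 3, "A"), shifts_reading_frame(toy, insert(toy, 3, "A")))
print(delete(toy, 3, 3), shifts_reading_frame(toy, delete(toy, 3, 3)))
```

A single-nucleotide insertion or deletion changes every downstream codon, whereas a substitution or a three-nucleotide change leaves the downstream reading frame intact.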
To test a function of variant DNA sequences according to an
embodiment herein, the sequence of interest is operably linked to a
selectable or screenable marker gene and expression of the reporter
gene is tested in transient expression assays with protoplasts or
in stably transformed plants. The skilled artisan will recognize
that DNA sequences capable of driving expression are built as
modules. Accordingly, expression levels from shorter DNA fragments
may be different than the one from the longest fragment and may be
different from each other. Further provided herein are also
functional equivalents of the nucleic acid sequence coding the
Drimenol synthase proteins, i.e., nucleotide sequences that
hybridize under stringent conditions to the nucleic acid sequence
of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID
NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or
SEQ ID NO: 15. The skilled artisan will be aware of methods to
identify homologous sequences in other organisms and methods
(identified in the Definition section herein) to determine the
percentage of sequence identity between homologous sequences. Such
newly identified DNA molecules then can be sequenced and the
sequence can be compared with the nucleic acid sequence of SEQ ID
NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ
ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 13 or SEQ ID NO:
15 and tested for functional equivalence. Provided herein are
DNA molecules having at least 70%, particularly 75%, particularly
80%, particularly 85%, particularly 90%, particularly 95%,
particularly 96%, particularly 97%, particularly 98%, or more
particularly 99% or more sequence identity to the nucleotide
sequence of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 6,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO:
13 or SEQ ID NO: 15.
A related embodiment provides a nucleic acid sequence which is
complementary to the nucleic acid sequence according to SEQ ID NO:1
or SEQ ID NO:3, such as inhibitory RNAs, or a nucleic acid sequence
which hybridizes under stringent conditions to at least part of the
nucleotide sequence according to SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID
NO: 4, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ
ID NO: 12, SEQ ID NO: 13 or SEQ ID NO: 15.
An alternative embodiment provided herein provides a method to
alter gene expression in a host cell. For instance, the
polynucleotide of an embodiment herein may be enhanced or
overexpressed or induced in certain contexts (e.g. following insect
bites or stings or upon exposure to a certain temperature) in a
host cell or host organism.
Alteration of expression of a polynucleotide provided herein also
results in "ectopic expression" which is a different expression
pattern in an altered and in a control or wild-type organism.
Alteration of expression occurs from interactions of polypeptide of
an embodiment herein with exogenous or endogenous modulators, or as
a result of chemical modification of the polypeptide. The term also
refers to an altered expression pattern of the polynucleotide of an
embodiment herein in which expression is reduced below the detection
level or activity is completely suppressed.
In one embodiment, several Drimenol synthase encoding nucleic acid
sequences are co-expressed in a single host, particularly under
control of different promoters. Alternatively, several Drimenol
synthase protein encoding nucleic acid sequences can be present on
a single transformation vector or be co-transformed at the same
time using separate vectors and selecting transformants comprising
both chimeric genes. Similarly, one or more Drimenol synthase
encoding genes may be expressed in a single plant together with
other chimeric genes, for example encoding other proteins which
enhance insect pest resistance, or others.
The nucleic acid sequences of an embodiment herein encoding
Drimenol synthase proteins can be inserted in expression vectors
and/or be contained in chimeric genes inserted in expression
vectors, to produce Drimenol synthase proteins in a host cell or
host organism. The vectors for inserting transgenes into the genome
of host cells are well known in the art and include plasmids,
viruses, cosmids and artificial chromosomes. Binary or
co-integration vectors into which a chimeric gene is inserted are
also used for transforming host cells.
An embodiment provided herein provides recombinant
expression vectors comprising a nucleic acid sequence of a Drimenol
synthase gene, or a chimeric gene comprising a nucleic acid
sequence of a Drimenol synthase gene, operably linked to associated
nucleic acid sequences such as, for instance, promoter sequences.
For example, a chimeric gene comprising a nucleic acid sequence of
SEQ ID NO:1 or SEQ ID NO:3 may be operably linked to a promoter
sequence suitable for expression in plant cells, bacterial cells or
fungal cells, optionally linked to a 3' non-translated nucleic acid
sequence.
Alternatively, the promoter sequence may already be present in a
vector so that the nucleic acid sequence which is to be transcribed
is inserted into the vector downstream of the promoter sequence.
Vectors are typically engineered to have an origin of replication,
a multiple cloning site, and a selectable marker.
The following examples are illustrative only and are not intended
to limit the scope of the claims or embodiments provided
herein.
EXAMPLES
Example 1
Drimys lanceolata and Drimys winteri Plant Material and Leaf
Transcriptome Sequencing.
Drimys winteri and Drimys lanceolata plants were obtained from
Bluebell Nursery (Leicestershire, UK). For analysis of the
composition in terpene molecules, the leaves were collected and
solvent extracted using MTBE (methyl tert-butyl ether). The extract
was analyzed by GCMS using an Agilent 6890 Series GC system
connected to an Agilent 5975 mass detector. The GC was equipped
with a 0.25 mm inner diameter by 30 m DB-1ms capillary column
(Agilent). The carrier gas was He at a constant flow of 1 mL/min.
The initial oven temperature was 50 °C (1 min hold)
followed by a gradient of 10 °C/min to 300 °C. The
injection was made in a split/splitless injector set at 260 °C
and used in splitless mode. The identification of the products
was based on the comparison of the mass spectra and retention
indices with authentic standards and internal mass spectra
databases. The leaves of the two plants contained significant
quantities of drimane sesquiterpene compounds including
(-)-drimenol, polygodial and epipolygodial (FIG. 1).
Small leaves of D. winteri and D. lanceolata were thus taken for
transcriptome analysis. Total RNA was extracted using the
Concert™ Plant RNA Reagent (Invitrogen). This total RNA was
processed using the Illumina Total RNA-Seq technique and the
Illumina HiSeq 2000 sequencer. A total of 101 and 105 million
paired reads of 2×100 bp were generated for D. winteri and D.
lanceolata, respectively. The reads were assembled using the Velvet
de novo genomic assembler and the Oases software. For D. winteri,
40,586 contigs with an average size of 1,080 bp were assembled and
for D. lanceolata, 28,255 contigs with an average size of 1,179 bp
were obtained. The contigs were searched using the tBlastn algorithm
(Altschul et al, J. Mol. Biol. 215, 403-410, 1990), using as
query the amino acid sequences of known sesquiterpene synthases.
This approach provided the sequences for 37 new putative
sesquiterpene synthases. The enzymatic activity of these synthases
was evaluated as described in the following examples for the
synthases showing Drimenol synthase activity.
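As a rough sanity check on the sequencing and assembly figures above, the raw and assembled base counts can be estimated with simple arithmetic. The following is an illustrative sketch only: it assumes that each counted paired read is one pair of two 100 bp reads (the text does not say so explicitly), and the function and variable names are hypothetical rather than part of the described workflow.

```python
# Back-of-the-envelope estimate of sequencing yield and assembly size.
# Assumption (not stated in the text): each "paired read" is one pair of
# two 100 bp reads. All names below are illustrative.

def raw_bases(read_pairs: int, read_length_bp: int = 100) -> int:
    """Total sequenced bases for paired-end data (two reads per pair)."""
    return read_pairs * 2 * read_length_bp


def assembled_bases(n_contigs: int, mean_contig_bp: int) -> int:
    """Approximate assembly size from contig count and average contig size."""
    return n_contigs * mean_contig_bp


# Figures taken from the paragraph above.
LIBRARIES = {
    "D. winteri": {"read_pairs": 101_000_000, "contigs": 40_586, "mean_bp": 1_080},
    "D. lanceolata": {"read_pairs": 105_000_000, "contigs": 28_255, "mean_bp": 1_179},
}

SUMMARY = {
    name: {
        "raw_gbp": raw_bases(lib["read_pairs"]) / 1e9,
        "assembled_mbp": assembled_bases(lib["contigs"], lib["mean_bp"]) / 1e6,
    }
    for name, lib in LIBRARIES.items()
}

if __name__ == "__main__":
    for name, stats in SUMMARY.items():
        print(f"{name}: {stats['raw_gbp']:.1f} Gbp raw, "
              f"{stats['assembled_mbp']:.1f} Mbp assembled")
```

Under these assumptions the raw data amount to roughly 20 Gbp per species, while the assembled contigs total a few tens of Mbp.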
Example 2. Functional Expression and Characterization of DlTps589
from D. Lanceolata
The DNA sequence of one of the selected sesquiterpene synthases,
DlTps589, was codon-optimized, synthesized in vitro and cloned in
the pJ444-SR expression plasmid (DNA2.0, Menlo Park, Calif.,
USA).
Heterologous expression of the DlTps589 synthase was performed in
KRX E. coli cells (Promega). Single colonies of cells transformed
with the pJ444SR-DlTps589 expression plasmid were used to inoculate
5 mL LB medium. After 5 to 6 hours incubation at 37 °C, the
cultures were transferred to a 20 °C incubator and left 1
hour for equilibration. Expression of the protein was then induced
by the addition of 1 mM IPTG and 0.2% L-rhamnose and the culture
was incubated overnight at 20 °C. The next day, the cells
were collected by centrifugation, resuspended in 0.1 volume of 50
mM MOPSO pH 7, 10% glycerol and lysed by sonication. The extracts
were cleared by centrifugation (30 min at 20,000 g) and the
supernatants containing the soluble proteins were used for further
experiments.
The crude E. coli protein extracts containing the recombinant
protein were used for the characterization of the enzymatic
activities. The assays were performed in 2 mL of 50 mM MOPSO pH 7,
10% glycerol, 1 mM DTT, 15 mM MgCl2 in the presence of 80
µM of farnesyl-diphosphate (FPP, Sigma) and 0.1 to 0.5 mg of
crude protein. The tubes were incubated 12 to 24 hours at
30 °C and extracted twice with one volume of pentane. After
concentration under a nitrogen flux, the extracts were analysed by
GC and GC-MS and compared to extracts from assays with control
proteins. The analysis of the products formed by the enzymes was
made by GCMS as described in example 1. A negative control was
performed in the same conditions using E. coli cells transformed
with an empty pJ444 plasmid. In these conditions, the DlTps589
recombinant enzyme produced (-)-drimenol as major product with a
selectivity over 98% (FIG. 2). The identity of (-)-drimenol was
confirmed by matching of the mass spectrum and retention time of an
authentic Drimenol standard isolated from Sandalwood Oil West
(Amyris balsamifera).
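To put the assay scale in perspective, the amount of FPP present in each assay, and the corresponding upper bound on the drimenol that could form, can be worked out from the stated concentration and volume. The following is a minimal sketch under explicit assumptions that are not taken from the text above: the molecular formula of drimenol is taken as C15H26O, standard atomic masses are used, and conversion is assumed to be complete with one drimenol per FPP. All function names are illustrative.

```python
# Upper-bound estimate of drimenol obtainable from one in vitro assay.
# Assumptions (not from the source): drimenol is C15H26O, standard atomic
# masses, and complete 1:1 conversion of FPP into drimenol.

ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}  # g/mol
DRIMENOL_FORMULA = {"C": 15, "H": 26, "O": 1}


def molar_mass(formula: dict) -> float:
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def substrate_nmol(conc_um: float, volume_ml: float) -> float:
    """Amount of substrate in nmol (umol/L multiplied by mL gives nmol)."""
    return conc_um * volume_ml


def max_product_ug(amount_nmol: float, mw_g_per_mol: float) -> float:
    """Mass in micrograms if every substrate molecule becomes one product."""
    return amount_nmol * mw_g_per_mol / 1000.0  # nmol * g/mol = ng


FPP_NMOL = substrate_nmol(conc_um=80, volume_ml=2)  # assay conditions above
DRIMENOL_MW = molar_mass(DRIMENOL_FORMULA)
MAX_DRIMENOL_UG = max_product_ug(FPP_NMOL, DRIMENOL_MW)

if __name__ == "__main__":
    print(f"FPP per assay: {FPP_NMOL:.0f} nmol")
    print(f"Drimenol upper bound: {MAX_DRIMENOL_UG:.1f} ug")
```

The result, a few tens of micrograms at most per assay, is consistent with concentrating the pentane extract before GC-MS analysis.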
Example 3. In Vivo Production of (-)-Drimenol in E. coli Cells
Using DlTps589
To evaluate the in-vivo production of (-)-drimenol in heterologous
cells, E. coli cells were transformed with the pJ444SR-DlTps589
expression plasmid and the production of sesquiterpenes from the
endogenous FPP pool was evaluated. To increase the productivity of
the cells, a heterologous FPP synthase and the enzymes of a
complete heterologous mevalonate (MVA) pathway were also expressed
in the same cells. The construction of the expression plasmid
containing an FPP synthase gene and the genes for a complete MVA
pathway was described in patent WO2013064411 or in Schalk et al
(2013) J. Am. Chem. Soc. 134, 18900-18903. Briefly, an expression
plasmid was prepared containing two operons composed of the genes
encoding the enzymes for a complete mevalonate pathway. A first
synthetic operon consisting of an E. coli acetoacetyl-CoA thiolase
(atoB), a Staphylococcus aureus HMG-CoA synthase (mvaS), a
Staphylococcus aureus HMG-CoA reductase (mvaA) and a Saccharomyces
cerevisiae FPP synthase (ERG20) genes was synthesized in vitro
(DNA2.0, Menlo Park, Calif., USA) and ligated into the NcoI-BamHI
digested pACYCDuet-1 vector (Invitrogen) yielding pACYC-29258. A
second operon containing a mevalonate kinase (MvaK1), a
phosphomevalonate kinase (MvaK2), a mevalonate diphosphate
decarboxylase (MvaD), and an isopentenyl diphosphate isomerase
(idi) was amplified from genomic DNA of Streptococcus pneumoniae
(ATCC BAA-334) and ligated into the second multicloning site of
pACYC-29258 providing the plasmid pACYC-29258-4506. This plasmid
thus contains the genes encoding all enzymes of the biosynthetic
pathway leading from acetyl-coenzyme A to FPP.
KRX E. coli cells (Promega) were co-transformed with the plasmid
pACYC-29258-4506 and the plasmid pJ444SR-DlTps589. Transformed
cells were selected on carbenicillin (50 µg/mL) and
chloramphenicol (34 µg/mL) LB-agarose plates. Single colonies
were used to inoculate 5 mL liquid LB medium supplemented with the
same antibiotics. The culture was incubated overnight at 37 °C.
The next day 2 mL of TB medium supplemented with the same
antibiotics were inoculated with 0.2 mL of the overnight culture.
After 6 hours incubation at 37 °C, the culture was cooled
down to 28 °C and 0.1 mM IPTG and 0.2% rhamnose were added
to each tube. The cultures were incubated for 48 hours at
28 °C. The cultures were then extracted twice with 2 volumes
of MTBE, the organic phases were concentrated to 500 µL and
analyzed by GC-MS as described above in Example 1.
Under these in vivo conditions, the DlTps589 recombinant enzyme produced
(-)-drimenol as major product with the same apparent selectivity as
in the in vitro assay described in example 2 (FIG. 3). Using these
engineered E. coli cells, larger (1 L) cultures were used to purify
the sesquiterpene produced by the enzyme in sufficient quantity to
confirm the structure by NMR analysis and specific rotation
measurement as being the structure of (-)-drimenol shown in FIG. 4.
The enantiopurity was confirmed by chiral GC analysis on a Varian
CP-3800 GC system equipped with a ChiraSil column (Agilent) and
using an oven temperature gradient of 3.0 °C/min from 125 to
180 °C (FIG. 5).
Example 4. Functional Expression and Characterization of
SCH51-3228-9 and SCH51-3228-11 from D. Winteri
SCH51-3228-9 and SCH51-3228-11 are two other DNA sequences
putatively encoding sesquiterpene synthases and isolated from
the Drimys winteri transcriptome sequences. The deduced amino acid
sequences share 92.6 and 95.1% identity, respectively with the
DlTps589 amino acid sequence. The two sequences were codon
optimized, synthesized in-vitro (Invitrogen) and cloned between the
NdeI and KpnI restriction enzyme recognition sites of the pETDuet-1
(Novagen) expression plasmid (Invitrogen).
Heterologous expression of the SCH51-3228-9 and SCH51-3228-11
synthases was performed in BL21 (DE3) E. coli cells (Invitrogen).
Single colonies of cells transformed with the pETDuet-SCH51-3228-9
or the pETDuet-SCH51-3228-11 expression plasmids were used to
produce the recombinant enzymes as described in example 2. The
crude E. coli protein extracts containing the recombinant proteins
were used for the characterization of the enzymatic activities as
described in example 2, except for the GCMS analysis conditions,
which were as follows. The GCMS analysis was made on an
Agilent 6890 Series GC system connected to an Agilent 5975 mass
detector. The GC was equipped with a 0.25 mm inner diameter by 30 m
DB-1ms capillary column (Agilent). The carrier gas was He at a
constant flow of 1 mL/min. The initial oven temperature was
50 °C (5 min hold) followed by a gradient of 5 °C/min
to 300 °C. The injection was made in split mode at
250 °C with a split ratio of 5:1.
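The oven program used here differs from the one in example 1 in both the initial hold and the ramp rate, which changes the length of each GC run. The following is an illustrative calculation only: it assumes a single linear ramp with no final hold (none is stated in either example), and the function name is hypothetical.

```python
# Duration of a single-ramp GC oven program: initial hold plus ramp time.
# Assumption: no final hold at the end temperature, since none is stated.

def oven_program_minutes(start_c: float, hold_min: float,
                         rate_c_per_min: float, end_c: float) -> float:
    """Return the total program time in minutes for one linear ramp."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return hold_min + (end_c - start_c) / rate_c_per_min


# Example 1: 50 C (1 min hold), then 10 C/min to 300 C.
EXAMPLE_1_MIN = oven_program_minutes(50, 1, 10, 300)
# Example 4: 50 C (5 min hold), then 5 C/min to 300 C.
EXAMPLE_4_MIN = oven_program_minutes(50, 5, 5, 300)

if __name__ == "__main__":
    print(f"Example 1 program: {EXAMPLE_1_MIN:.0f} min")
    print(f"Example 4 program: {EXAMPLE_4_MIN:.0f} min")
```

Because the ramp rates differ, retention times obtained with the two programs are not directly comparable, which is why standards are run under the same conditions as the samples.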
The two recombinant enzymes produced (-)-drimenol as major
product with high selectivity. The identity of (-)-drimenol was
confirmed by matching of the mass spectrum and retention time of an
authentic Drimenol standard isolated from Sandalwood Oil West
(Amyris balsamifera) (FIG. 6).
Using the whole E. coli cell system and method described in example
3 (except for the GCMS analysis conditions, which were as described
above), Drimenol could also be produced in vivo in bacterial cultures
using the SCH51-3228-9 and SCH51-3228-11 recombinant proteins (FIG.
7).
Example 5. Functional Expression and Characterization of
SCH51-998-28 from D. Winteri and SCH52-13163-6 from D.
Lanceolata
Similarly to example 4, the two cDNAs SCH51-998-28 and
SCH52-13163-6 were optimized and cloned in the pETDuet expression
plasmid.
The recombinant proteins were produced in BL21 (DE3) E. coli
cells (Invitrogen) and the in vitro assays using FPP as substrate
were performed as described in examples 2 and 4. These assays showed
Drimenol synthase activity for SCH51-998-28 and SCH52-13163-6 (FIG.
6). Using E. coli cells overproducing FPP from a recombinant
mevalonate pathway (examples 2 and 4), Drimenol could also be
produced in vivo using the SCH51-998-28 and SCH52-13163-6 proteins
(FIG. 7).
Example 6. Sequence Comparison of the Drimenol Synthases from
Drimys Species
The amino acid sequences of the Drimenol synthases from Drimys
winteri and Drimys lanceolata were aligned using the ClustalW
Multiple alignment program (Thompson et al, 1994, Nucleic Acids Res.
22(22), 4673-4680) and the sequence identities were calculated based
on this alignment.
Percent identity (%) between the different Drimenol synthases from
Drimys species:
TABLE-US-00001
                 DlTps589  SCH51-3228-11  SCH51-3228-9  SCH51-998-28  SCH52-13163-6
DlTps589         ID        95.1           92.6          70.5          88
SCH51_3228_11    95.1      ID             97.1          70.6          87.6
SCH51_3228_9     92.6      97             ID            71            90.1
SCH51_998_28     70.5      70.6           71            ID            72.5
SCH52_13163_6    88        87.6           90.1          72.5          ID
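A percent identity of the kind tabulated above can be computed from a pairwise alignment by counting identical residues. The following is a minimal sketch, not the ClustalW procedure itself. It assumes identity is the number of matching positions divided by the number of positions where neither sequence has a gap; other denominators are in common use and the text does not say which was applied. The two fragments are the N-terminal residues of SEQ ID NO: 2 and SEQ ID NO: 5, aligned by hand for illustration only.

```python
# Percent identity between two sequences that are already aligned.
# Assumption: gap-containing columns are excluded from the denominator.
# This convention is one of several and is not confirmed by the source.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hand-made illustrative alignment of the first residues of
# SEQ ID NO: 2 (DlTps589) and SEQ ID NO: 5 (SCH51-3228-9).
FRAGMENT_SEQ2 = "MDLINPSPAASTLPLPVDGDSEVVRRSAGF"
FRAGMENT_SEQ5 = "M--------ASTLPLPAYGDSEVVRRSAGF"

if __name__ == "__main__":
    pid = percent_identity(FRAGMENT_SEQ2, FRAGMENT_SEQ5)
    print(f"Identity over aligned fragment: {pid:.1f}%")
```

In this fragment 20 of the 22 compared positions match. The values in the table are computed over the full-length alignments, so they will differ from any fragment-level figure.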
SEQUENCE LISTINGS
1
1511680DNADrimys lanceolata 1atggatctta ttaatccctc cccagcggct
tccaccctcc ctctcccagt tgatggagat 60tcagaagttg ttaggcgatc tgccgggttt
catccgacta tctggggcga tcacttcctc 120tcctacaagc ccgatccaaa
gaaaatagat gcatggaata aaagggttga agagctgaag 180gaagaagtga
agaagatatt aagcaatgca aaagggacgg tggaagagct gaatttgatt
240gatgatctcg tacaccttgg gattagttat cattttgaga aggagattga
tgatgctcta 300caacacatct ttgataccca tcttgatgat tttcctaagg
atgatctata tgtcgccgct 360ctccgatttg gcgtcttaag gaaacagggg
caccgtgttt ctccagatgt attcaaaaaa 420ttcaaagatg agcaggggaa
tttcaaggca gagttgagca ccgatgcgaa aggtttgcta 480tgtttaaatg
atgtggctta tctcagcaca agaggggaag atatcttgga tgaagccatt
540cctttcactg aggagcacct taggtcttgt attagccatg tagattctca
tatggcagca 600aaaattgaac attctctcga gcttcccctt catcatcgca
taccaaggct agagaacagg 660cactacatct cagtctatga aggagacaag
gaaaggaacg aagttgtcct tgagcttgcc 720aatttagatt tcaatctgat
tcaaatcttg caccaaagag agctgagaga catcacaatg 780tggtggaagg
agattgacct tgcagcaaag ctgcctttta ttagggatag gttggtggag
840tgctactact ggatcatggg ggtctatttt gaaccaatat actcgagggc
tagggttttt 900tccaccaaaa tgacaatgtt ggtctcagtt gtggacgaca
tatatgatgt gtatgctacc 960gaggatgagc ttcaactatt cactgatgcc
atctataggt gggatgctga tgacattgat 1020cagctgcctc agtacttgaa
agatgctttt atggtactct acaacactgt gaagactcta 1080gaagaagaac
ttgaaccaga aggaaactct tatcgtggat tctatgtaaa agatgcaatg
1140aaggttttgg caagggatta ctttgtggag cacaaatggt ataacagaaa
aattgtgcca 1200tccgtagagg aatacttgaa aatttcttgc atcagtgtgg
ccgttcatat ggctacagtt 1260cactgtattg ctgggatgta tgaaattgca
accaaagagg cattcgaatg gttgatgact 1320gagcccaaac ttgttattga
tgcatctctg attggtcgtc tccttgatga catgcagtcc 1380acctcgtttg
agcaacagag aggccacgtg tcatcagcag tacagtgtta catggctgaa
1440tatggtgtaa cagcggaaga agcatgtgaa aagctccgag atatggctgc
aattgcttgg 1500aaagatgtga acgaggcatg ccttaggccc acggttttcc
ctatgcctat ccttttgcct 1560tctatcaact tggcacgtgt ggcagaagtc
atctacctac gtggagatgg atacacgcac 1620gctgggggtg agaccaagaa
acacatcacg gccatgcttg ttaagccaat tgaagtctga 16802559PRTDrimys
lanceolata 2Met Asp Leu Ile Asn Pro Ser Pro Ala Ala Ser Thr Leu Pro
Leu Pro1 5 10 15Val Asp Gly Asp Ser Glu Val Val Arg Arg Ser Ala Gly
Phe His Pro 20 25 30Thr Ile Trp Gly Asp His Phe Leu Ser Tyr Lys Pro
Asp Pro Lys Lys 35 40 45Ile Asp Ala Trp Asn Lys Arg Val Glu Glu Leu
Lys Glu Glu Val Lys 50 55 60Lys Ile Leu Ser Asn Ala Lys Gly Thr Val
Glu Glu Leu Asn Leu Ile65 70 75 80Asp Asp Leu Val His Leu Gly Ile
Ser Tyr His Phe Glu Lys Glu Ile 85 90 95Asp Asp Ala Leu Gln His Ile
Phe Asp Thr His Leu Asp Asp Phe Pro 100 105 110Lys Asp Asp Leu Tyr
Val Ala Ala Leu Arg Phe Gly Val Leu Arg Lys 115 120 125Gln Gly His
Arg Val Ser Pro Asp Val Phe Lys Lys Phe Lys Asp Glu 130 135 140Gln
Gly Asn Phe Lys Ala Glu Leu Ser Thr Asp Ala Lys Gly Leu Leu145 150
155 160Cys Leu Asn Asp Val Ala Tyr Leu Ser Thr Arg Gly Glu Asp Ile
Leu 165 170 175Asp Glu Ala Ile Pro Phe Thr Glu Glu His Leu Arg Ser
Cys Ile Ser 180 185 190His Val Asp Ser His Met Ala Ala Lys Ile Glu
His Ser Leu Glu Leu 195 200 205Pro Leu His His Arg Ile Pro Arg Leu
Glu Asn Arg His Tyr Ile Ser 210 215 220Val Tyr Glu Gly Asp Lys Glu
Arg Asn Glu Val Val Leu Glu Leu Ala225 230 235 240Asn Leu Asp Phe
Asn Leu Ile Gln Ile Leu His Gln Arg Glu Leu Arg 245 250 255Asp Ile
Thr Met Trp Trp Lys Glu Ile Asp Leu Ala Ala Lys Leu Pro 260 265
270Phe Ile Arg Asp Arg Leu Val Glu Cys Tyr Tyr Trp Ile Met Gly Val
275 280 285Tyr Phe Glu Pro Ile Tyr Ser Arg Ala Arg Val Phe Ser Thr
Lys Met 290 295 300Thr Met Leu Val Ser Val Val Asp Asp Ile Tyr Asp
Val Tyr Ala Thr305 310 315 320Glu Asp Glu Leu Gln Leu Phe Thr Asp
Ala Ile Tyr Arg Trp Asp Ala 325 330 335Asp Asp Ile Asp Gln Leu Pro
Gln Tyr Leu Lys Asp Ala Phe Met Val 340 345 350Leu Tyr Asn Thr Val
Lys Thr Leu Glu Glu Glu Leu Glu Pro Glu Gly 355 360 365Asn Ser Tyr
Arg Gly Phe Tyr Val Lys Asp Ala Met Lys Val Leu Ala 370 375 380Arg
Asp Tyr Phe Val Glu His Lys Trp Tyr Asn Arg Lys Ile Val Pro385 390
395 400Ser Val Glu Glu Tyr Leu Lys Ile Ser Cys Ile Ser Val Ala Val
His 405 410 415Met Ala Thr Val His Cys Ile Ala Gly Met Tyr Glu Ile
Ala Thr Lys 420 425 430Glu Ala Phe Glu Trp Leu Met Thr Glu Pro Lys
Leu Val Ile Asp Ala 435 440 445Ser Leu Ile Gly Arg Leu Leu Asp Asp
Met Gln Ser Thr Ser Phe Glu 450 455 460Gln Gln Arg Gly His Val Ser
Ser Ala Val Gln Cys Tyr Met Ala Glu465 470 475 480Tyr Gly Val Thr
Ala Glu Glu Ala Cys Glu Lys Leu Arg Asp Met Ala 485 490 495Ala Ile
Ala Trp Lys Asp Val Asn Glu Ala Cys Leu Arg Pro Thr Val 500 505
510Phe Pro Met Pro Ile Leu Leu Pro Ser Ile Asn Leu Ala Arg Val Ala
515 520 525Glu Val Ile Tyr Leu Arg Gly Asp Gly Tyr Thr His Ala Gly
Gly Glu 530 535 540Thr Lys Lys His Ile Thr Ala Met Leu Val Lys Pro
Ile Glu Val545 550 55531680DNAArtificial SequenceCodon optimized
DNA sequence of DlTps589 from D. lanceolata 3atggacctga ttaacccgag
ccctgctgca tccaccctgc cactgccagt cgatggtgat 60agcgaagttg tgcgccgtag
cgcgggtttc catccgacca tctggggtga ccactttctg 120tcttataagc
cggacccgaa aaagattgat gcgtggaaca agcgtgttga ggaactgaaa
180gaagaggtca aaaagatttt gagcaatgcg aaaggcacgg ttgaggaact
gaatttgatt 240gacgacctgg tacacctggg tattagctat cactttgaga
aagaaatcga cgacgcgctg 300cagcatatct tcgatacgca cctggatgat
ttcccgaaag atgacctcta cgtggctgcg 360ctgcgttttg gcgtcctgcg
taagcaaggc catcgtgtca gcccggacgt ctttaagaaa 420ttcaaagacg
agcaaggcaa cttcaaagcg gagctgtcaa ccgatgcaaa gggcctgttg
480tgcctgaacg atgtggcgta cctgagcacc cgtggtgagg atatcctgga
cgaagcgatc 540ccgttcacgg aagaacattt gcgctcgtgc attagccacg
ttgatagcca catggcagcg 600aagattgagc actctctgga gctgccgctg
caccatcgca ttccgcgttt agagaatcgc 660cattacatct ccgtgtacga
gggtgacaaa gagcgtaatg aagtcgttct ggagttggct 720aacttggact
ttaatcttat ccagatcctg caccagcgcg agctgcgcga catcacgatg
780tggtggaaag aaattgatct ggccgcaaag ctgccgttta ttcgtgaccg
tctggtggag 840tgttactatt ggattatggg cgtgtacttc gagccgatct
acagccgtgc gcgcgtgttt 900agcaccaaga tgaccatgct ggttagcgtg
gtggatgaca tctatgatgt ctacgctacg 960gaagatgagt tgcagctgtt
taccgacgcc atttacagat gggacgccga tgacattgat 1020caactgccgc
aatatctgaa agacgccttt atggttctgt acaacaccgt caaaaccctg
1080gaagaagaac tggagccgga aggtaactct tatcgtggtt tctacgttaa
agatgcgatg 1140aaagttctgg cgcgtgacta tttcgttgag cataagtggt
acaatcgtaa gatcgtcccg 1200tccgttgaag agtacttgaa gattagctgt
atcagcgtcg cagtccacat ggcgaccgtg 1260cactgtatcg ccggcatgta
tgagatcgcc acgaaagaag cattcgagtg gctgatgacc 1320gagccgaaac
tggtgattga cgcaagcctg attggtcgcc tgctggacga tatgcagagc
1380acgagctttg agcagcagcg cggtcatgtt agctccgcag ttcaatgcta
catggctgag 1440tacggtgtga ctgccgaaga agcatgcgag aagctgcgtg
atatggcggc cattgcgtgg 1500aaagatgtga atgaagcatg cctgcgcccg
accgttttcc cgatgccgat tttactgcct 1560agcatcaacc tggcacgtgt
ggcggaagtt atctatctgc gtggcgacgg ttatacgcac 1620gcgggtggtg
agactaagaa gcacatcacc gcgatgctgg tcaagccgat cgaagtgtaa
168041656DNADrimys winteri 4atggcttcca ccctccctct cccagcttat
ggagattcag aagttgttag gcgatctgcc 60gggtttcatc cgacgatctg gggcgatcac
ttcctctcct acaagcctga tccaacgaaa 120atagatgaat ggaataaaag
ggttgaagag ctgaaggaag aagtgaagaa gatattaagc 180aatgcaaaag
ggacagtgga agagctgaat ttgcttgatg atctcgtaca ccttgggatt
240agttatcatt ttgagaagga gattgatgat gctttacaac aaatctttga
tacccatctt 300gatgtttttc ctaaggatga tctatatgcc accgctctcc
gatttggcgt cttaaggaaa 360caggggcacc gtgtttctcc agatgtattc
aaaaaattca aagatgagca ggggaatttc 420aaggcagagt tgagcaccga
tgcgaagggt ttgctatgtt tatatgatgt ggcttatctc 480agcacaagag
gggaagatat cttggatgaa gccattcctt tcactaagga gcaccttagg
540tcttgtatta gccatgtcga ttctcatatg gcagcaaaaa ttgagcattc
tctagagctt 600ccccttcatc atcgcatacc aaggctagag aacaggcact
acatctcagt ctatgaagga 660gacaaggaaa ggaatgaagt tgtccttgag
cttgccaaat tagatttcaa tctgattcaa 720atcttgcacc aaagagagct
gagggacatc acaacgtggt ggaaggagat tgaccttgca 780gcaaagctac
cttttattag ggataggttg gtggagtgct actattggat catgggagtc
840tattttgaac caatatactc aagggctaga gttttttcga ccaaaatgac
aatcttggtc 900tcagttgtgg acgacatata tgatgtatat gctacagagg
atgagctcca acttttcact 960gatgcaatct ataggtggga tgctgaggac
attgagcagc ttccacagta cttgaaagat 1020gcttttcttg tactctataa
cactgtgaag gacctagaag aggaattgga accagaagga 1080aactcttatc
gtggatacta tgtaaaagat gcgatgaagg ttttggcaag ggattacttt
1140gtggagcaca aatggtataa cagaaaaatt gtgccatcag tagaggacta
cctgcgaatt 1200tcttgcatta gtgttgccgt tcatatggcc acagttcatt
gtattgctgg gatgtatgaa 1260attgcaacca aagaggcatt cgaatggttg
aagacggaac ctaaacttgt tatagatgca 1320tcactgattg ggcgtctcct
cgatgacatg cagtccacct cgtttgagca acagagaggt 1380catgtgtcat
cagcggtaca gtgttacatg atccaatatg gggtatcaca cgaagaagcg
1440tgtgagaagt tgcgagaaat ggctgcaatt gcgtggaaag atgtaaacca
agcatgcctt 1500aggcccactg ttttccctat gcctattctt ctgccctcca
tcaaccttgc acgtgtggca 1560gaagtgattt acctacgcgg agatggatat
acacatgcgg gtggtgagac caaaaaacat 1620atcacggcca tgcttgttga
tccaatcaaa gtctga 16565551PRTDrimys winteri 5Met Ala Ser Thr Leu
Pro Leu Pro Ala Tyr Gly Asp Ser Glu Val Val1 5 10 15Arg Arg Ser Ala
Gly Phe His Pro Thr Ile Trp Gly Asp His Phe Leu 20 25 30Ser Tyr Lys
Pro Asp Pro Thr Lys Ile Asp Glu Trp Asn Lys Arg Val 35 40 45Glu Glu
Leu Lys Glu Glu Val Lys Lys Ile Leu Ser Asn Ala Lys Gly 50 55 60Thr
Val Glu Glu Leu Asn Leu Leu Asp Asp Leu Val His Leu Gly Ile65 70 75
80Ser Tyr His Phe Glu Lys Glu Ile Asp Asp Ala Leu Gln Gln Ile Phe
85 90 95Asp Thr His Leu Asp Val Phe Pro Lys Asp Asp Leu Tyr Ala Thr
Ala 100 105 110Leu Arg Phe Gly Val Leu Arg Lys Gln Gly His Arg Val
Ser Pro Asp 115 120 125Val Phe Lys Lys Phe Lys Asp Glu Gln Gly Asn
Phe Lys Ala Glu Leu 130 135 140Ser Thr Asp Ala Lys Gly Leu Leu Cys
Leu Tyr Asp Val Ala Tyr Leu145 150 155 160Ser Thr Arg Gly Glu Asp
Ile Leu Asp Glu Ala Ile Pro Phe Thr Lys 165 170 175Glu His Leu Arg
Ser Cys Ile Ser His Val Asp Ser His Met Ala Ala 180 185 190Lys Ile
Glu His Ser Leu Glu Leu Pro Leu His His Arg Ile Pro Arg 195 200
205Leu Glu Asn Arg His Tyr Ile Ser Val Tyr Glu Gly Asp Lys Glu Arg
210 215 220Asn Glu Val Val Leu Glu Leu Ala Lys Leu Asp Phe Asn Leu
Ile Gln225 230 235 240Ile Leu His Gln Arg Glu Leu Arg Asp Ile Thr
Thr Trp Trp Lys Glu 245 250 255Ile Asp Leu Ala Ala Lys Leu Pro Phe
Ile Arg Asp Arg Leu Val Glu 260 265 270Cys Tyr Tyr Trp Ile Met Gly
Val Tyr Phe Glu Pro Ile Tyr Ser Arg 275 280 285Ala Arg Val Phe Ser
Thr Lys Met Thr Ile Leu Val Ser Val Val Asp 290 295 300Asp Ile Tyr
Asp Val Tyr Ala Thr Glu Asp Glu Leu Gln Leu Phe Thr305 310 315
320Asp Ala Ile Tyr Arg Trp Asp Ala Glu Asp Ile Glu Gln Leu Pro Gln
325 330 335Tyr Leu Lys Asp Ala Phe Leu Val Leu Tyr Asn Thr Val Lys
Asp Leu 340 345 350Glu Glu Glu Leu Glu Pro Glu Gly Asn Ser Tyr Arg
Gly Tyr Tyr Val 355 360 365Lys Asp Ala Met Lys Val Leu Ala Arg Asp
Tyr Phe Val Glu His Lys 370 375 380Trp Tyr Asn Arg Lys Ile Val Pro
Ser Val Glu Asp Tyr Leu Arg Ile385 390 395 400Ser Cys Ile Ser Val
Ala Val His Met Ala Thr Val His Cys Cys Ala 405 410 415Gly Met Asp
Glu Ile Ala Thr Lys Glu Ala Phe Glu Trp Leu Lys Thr 420 425 430Glu
Pro Lys Leu Val Ile Asp Ala Ser Leu Ile Gly Arg Leu Leu Asp 435 440
445Asp Met Gln Ser Thr Ser Phe Glu Gln Gln Arg Gly His Val Ser Ser
450 455 460Ala Val Gln Cys Tyr Met Ile Gln Tyr Gly Val Ser His Glu
Glu Ala465 470 475 480Cys Glu Lys Leu Arg Glu Met Ala Ala Ile Ala
Trp Lys Asp Val Asn 485 490 495Gln Ala Cys Leu Arg Pro Thr Val Phe
Pro Met Pro Ile Leu Leu Pro 500 505 510Ser Ile Asn Leu Ala Arg Val
Ala Glu Val Ile Tyr Leu Arg Gly Asp 515 520 525Gly Tyr Thr His Ala
Gly Gly Glu Thr Lys Lys His Ile Thr Ala Met 530 535 540Leu Val Asp
Pro Ile Lys Val545 55061656DNAArtificial SequenceCodon optimized
DNA sequence of SCH51-3228-9 6atggcaagca ccctgccgct gcctgcctat
ggtgatagcg aagttgttcg tcgtagcgca 60ggttttcatc cgaccatttg gggtgatcat
tttctgagct ataaaccgga tccgaccaaa 120attgatgaat ggaataaacg
tgtcgaagaa ctgaaagaag aagtgaaaaa aatcctgagc 180aatgccaaag
gcaccgttga ggaactgaat ctgctggatg atctggttca tctgggtatc
240agctatcact ttgagaaaga aatcgatgat gcactgcagc agatttttga
tacccatctg 300gatgttttcc cgaaagatga tctgtatgca accgcactgc
gttttggtgt tctgcgtaaa 360cagggtcatc gtgttagtcc ggatgtgttc
aaaaaattca aagatgaaca gggcaacttc 420aaagcagaac tgagcaccga
tgcaaaaggt ctgctgtgtc tgtatgatgt tgcatatctg 480agcacccgtg
gtgaagatat tctggatgaa gcaattccgt ttaccaaaga acatctgcgt
540agctgtatta gccatgttga tagccacatg gcagcgaaaa ttgaacatag
cctggaactg 600cctctgcatc accgtattcc gcgtctggaa aatcgtcact
atattagcgt ttatgagggc 660gataaagaac gcaatgaagt tgtgctggaa
ctggcaaaac tggattttaa cctgattcag 720attctgcatc agcgtgaact
gcgtgatatt accacctggt ggaaagaaat tgatctggca 780gcaaaactgc
cgtttattcg tgatcgtctg gttgaatgct attattggat tatgggcgtg
840tatttcgaac cgatttatag ccgtgcacgt gtttttagca ccaaaatgac
cattctggtt 900agcgtggtgg atgatatcta tgatgtttat gccaccgaag
atgaactgca gctgtttacc 960gatgccattt atcgttggga tgcagaagat
attgaacagc tgccgcagta tctgaaagat 1020gcatttctgg ttctgtacaa
caccgtgaaa gatctggaag aagaactgga accggaaggt 1080aatagctatc
gtggttatta tgttaaagat gccatgaaag ttctggcacg cgattatttt
1140gttgagcaca aatggtataa ccgcaaaatt gttccgagcg tggaagatta
tctgcgtatt 1200agctgcatta gcgttgcagt tcacatggca accgttcatt
gttgtgcagg tatggatgaa 1260attgcaacca aagaagcatt tgagtggctg
aaaaccgaac cgaaactggt tattgatgca 1320agcctgattg gtcgtctgct
ggacgatatg cagagcacca gctttgaaca gcagcgtggt 1380catgttagca
gcgcagttca gtgttatatg attcagtatg gtgttagcca tgaagaagca
1440tgcgaaaaac tgcgcgaaat ggcagcaatt gcatggaaag atgttaatca
ggcatgtctg 1500cgtccgaccg tttttccgat gccgattctg ctgccgagca
ttaatctggc acgtgttgcc 1560gaagttatct atctgcgtgg tgatggttat
acccatgccg gtggtgaaac caaaaaacat 1620attaccgcaa tgctggtcga
tccgattaaa gtttaa 165671656DNADrimys winteri 7atggcttcca ccctccctct
cccagcttat ggagattcag aagttgttag gcgatctgcc 60gggtttcatc cgacgatctg
gggcgatcac ttcctctcct acaagcctga tccaacgaaa 120atagatgaat
ggaataaaag ggttgaagag ctgaaggaag aagtgaagaa gatattaagc
180aatgcaaaag ggacagtgga agagctgaat ttgcttgatg atctcgtaca
ccttgggatt 240agttatcatt ttgagaagga gattgatgat gctttacaac
aaatctttga tacccatctt 300gatgtttttc ctaaggatga tctatatgcc
accgctctcc gatttggcgt cttaaggaaa 360caggggcacc gtgtttctcc
agatgtattc aaaaaattca aagatgagca ggggaatttc 420aaggcagagt
tgagcaccga tgcgaagggt ttgctatgtt tatatgatgt ggcttatctc
480agcacaagag gggaagatat cttggatgaa gccattcctt tcactaagga
gcaccttagg 540tcttgtatta gccatgtcga ttctcatatg gcagcaaaaa
ttgagcattc tctagagctt 600ccccttcatc atcgcatacc aaggctagag
aacaggcact acatctcagt ctatgaagga 660gacaaggaaa ggaatgaagt
tgtccttgag cttgccaaat tagatttcaa tctgattcaa 720atcttgcacc
aaagagagct gagggacatc acaatgtggt ggaaggagat tgaccttgca
780gcaaagctac cttttattag agataggttg gtggagtgct actactggat
catgggggtc 840tattttgaac caatatactc cagggctagg gttttttcca
ctaaaatgac aatcttggtc 900tcagttgtgg acgacatata tgatgtctat
gctacggagg atgagcttca actattcact 960gatgcaatct ataggtggga
tgctgatgac attgatcagc tgcctcagta cttgaaagat 1020gcttttatgg
tactctataa cactgtgaag actctagaag aagaacttga accagaagga
1080aactcttatc gtggatacta cgtaaaagat gcaatgaagg ttttggcaag
agattacttt 1140gtggaacaca aatggtataa cagacaaatt gtgccatccg
tagaggaata cttgaaaatt 1200tcttgcatta gtgtggctgt tcatatggct
acagttcatt gtattgctgg gatgtatgaa 1260attgctacca aagaggcatt
cgaatggttg aagactgaac ccaaacttgt tatcgatgca 1320tctctgatcg
gtcgtcttct tgatgacatg
cagtctacct cgtttgagca acaaagaggg 1380cacgtgtcat cagcagtaca
gtgttacatg gcccaatatg gagtaacagc agaagaagca 1440tgtgaaaagc
tacgagaaat ggctgcaatt gcttggaaag atgtgaatga agcatgcctt
1500aggcccacgg tattccctat gcctatcctc ttgccttcta tcaacttggc
acgtgtggca 1560gaagtgatct acctacgtgg agatggatac acgcacgctg
ggggtgagac caaaaaacac 1620atcacggcca tgcttgttaa gccaattgaa gtctga
16568551PRTDrimys winteri 8Met Ala Ser Thr Leu Pro Leu Pro Ala Tyr
Gly Asp Ser Glu Val Val1 5 10 15Arg Arg Ser Ala Gly Phe His Pro Thr
Ile Trp Gly Asp His Phe Leu 20 25 30Ser Tyr Lys Pro Asp Pro Thr Lys
Ile Asp Glu Trp Asn Lys Arg Val 35 40 45Glu Glu Leu Lys Glu Glu Val
Lys Lys Ile Leu Ser Asn Ala Lys Gly 50 55 60Thr Val Glu Glu Leu Asn
Leu Leu Asp Asp Leu Val His Leu Gly Ile65 70 75 80Ser Tyr His Phe
Glu Lys Glu Ile Asp Asp Ala Leu Gln Gln Ile Phe 85 90 95Asp Thr His
Leu Asp Val Phe Pro Lys Asp Asp Leu Tyr Ala Thr Ala 100 105 110Leu
Arg Phe Gly Val Leu Arg Lys Gln Gly His Arg Val Ser Pro Asp 115 120
125Val Phe Lys Lys Phe Lys Asp Glu Gln Gly Asn Phe Lys Ala Glu Leu
130 135 140Ser Thr Asp Ala Lys Gly Leu Leu Cys Leu Tyr Asp Val Ala
Tyr Leu145 150 155 160Ser Thr Arg Gly Glu Asp Ile Leu Asp Glu Ala
Ile Pro Phe Thr Lys 165 170 175Glu His Leu Arg Ser Cys Ile Ser His
Val Asp Ser His Met Ala Ala 180 185 190Lys Ile Glu His Ser Leu Glu
Leu Pro Leu His His Arg Ile Pro Arg 195 200 205Leu Glu Asn Arg His
Tyr Ile Ser Val Tyr Glu Gly Asp Lys Glu Arg 210 215 220Asn Glu Val
Val Leu Glu Leu Ala Lys Leu Asp Phe Asn Leu Ile Gln225 230 235
240Ile Leu His Gln Arg Glu Leu Arg Asp Ile Thr Met Trp Trp Lys Glu
245 250 255Ile Asp Leu Ala Ala Lys Leu Pro Phe Ile Arg Asp Arg Leu
Val Glu 260 265 270Cys Tyr Tyr Trp Ile Met Gly Val Tyr Phe Glu Pro
Ile Tyr Ser Arg 275 280 285Ala Arg Val Phe Ser Thr Lys Met Thr Ile
Leu Val Ser Val Val Asp 290 295 300Asp Ile Tyr Asp Val Tyr Ala Thr
Glu Asp Glu Leu Gln Leu Phe Thr305 310 315 320Asp Ala Ile Tyr Arg
Trp Asp Ala Asp Asp Ile Asp Gln Leu Pro Gln 325 330 335Tyr Leu Lys
Asp Ala Phe Met Val Leu Tyr Asn Thr Val Lys Thr Leu 340 345 350Glu
Glu Glu Leu Glu Pro Glu Gly Asn Ser Tyr Arg Gly Tyr Tyr Val 355 360
365Lys Asp Ala Met Lys Val Leu Ala Arg Asp Tyr Phe Val Glu His Lys
370 375 380Trp Tyr Asn Arg Gln Ile Val Pro Ser Val Glu Glu Tyr Leu
Lys Ile385 390 395 400Ser Cys Ile Ser Val Ala Val His Met Ala Thr
Val His Cys Ile Ala 405 410 415Gly Met Tyr Glu Ile Ala Thr Lys Glu
Ala Phe Glu Trp Leu Lys Thr 420 425 430Glu Pro Lys Leu Val Ile Asp
Ala Ser Leu Ile Gly Arg Leu Leu Asp 435 440 445Asp Met Gln Ser Thr
Ser Phe Glu Gln Gln Arg Gly His Val Ser Ser 450 455 460Ala Val Gln
Cys Tyr Met Ala Gln Tyr Gly Val Thr Ala Glu Glu Ala465 470 475
480Cys Glu Lys Leu Arg Glu Met Ala Ala Ile Ala Trp Lys Asp Val Asn
485 490 495Glu Ala Cys Leu Arg Pro Thr Val Phe Pro Met Pro Ile Leu
Leu Pro 500 505 510Ser Ile Asn Leu Ala Arg Val Ala Glu Val Ile Tyr
Leu Arg Gly Asp 515 520 525Gly Tyr Thr His Ala Gly Gly Glu Thr Lys
Lys His Ile Thr Ala Met 530 535 540Leu Val Lys Pro Ile Glu Val545
55091656DNAArtificial SequenceCodon optimized DNA sequence of
SCH51-3228-11 9atggcatcta ctcttccact gccggcttat ggtgattctg
aggttgttcg tcgttccgcg 60ggttttcacc ctaccatctg gggcgatcac tttctgtcct
ataagccaga cccgaccaag 120attgacgagt ggaataagcg tgtcgaggaa
ctgaaagaag aagtgaaaaa gatcctgtcc 180aacgcaaaag gtactgtcga
ggagctgaat ctgctggatg acctggtgca tctgggcatc 240agctatcact
tcgaaaagga aattgacgac gctttgcagc aaatttttga tacgcacctg
300gacgtctttc cgaaagatga cctgtatgcg accgcgctgc gctttggtgt
gctgcgtaaa 360cagggtcatc gcgtgtctcc tgatgtgttc aagaaattta
aagatgaaca gggcaatttc 420aaggccgagt tgagcacgga cgccaaaggt
ttgctctgcc tgtacgacgt tgcatatctg 480agcacccgtg gtgaagatat
cctggacgaa gcgattccgt tcaccaagga acatctgcgc 540tcgtgcattt
cccatgtaga tagccacatg gcggccaaga tcgagcacag cctggagctg
600cctttgcacc atcgtattcc gcgcctggag aatcgccatt acattagcgt
ctatgagggt 660gacaaagagc gcaacgaagt cgtgttagag ctggcgaagc
tggacttcaa cctgattcaa 720attctgcatc aacgcgagct gcgcgacatt
accatgtggt ggaaagagat tgatctggca 780gcgaagctgc cgttcatccg
cgatcgtctg gttgagtgct actactggat catgggcgtc 840tacttcgagc
cgatctacag ccgcgctcgt gtgttttcga cgaagatgac catcctggtt
900agcgttgttg atgacattta tgacgtttac gcgaccgaag atgaactgca
gctgtttacg 960gacgcaatct accgttggga cgcggatgat atcgaccagc
tgccgcaata cttgaaagat 1020gcgttcatgg ttttgtacaa caccgtcaaa
acgctggaag aagaactgga gccggaaggc 1080aacagctacc gtggttacta
tgttaaagat gcgatgaaag ttctggcgcg cgactacttc 1140gtcgagcaca
agtggtataa ccgtcagatt gtgccgagcg tcgaggaata cctgaagatt
1200agctgtatca gcgttgccgt tcacatggca acggtgcact gcatcgccgg
tatgtacgag 1260attgcgacga aagaagcctt cgaatggttg aaaaccgagc
cgaagctggt tatcgacgcc 1320agcctgatcg gtcgtttgct ggacgacatg
caaagcacga gcttcgagca gcagcgcggc 1380catgtgagca gcgctgttca
gtgttatatg gcgcaatatg gcgtgaccgc agaagaagcg 1440tgcgagaagc
tgcgtgagat ggcagcaatt gcgtggaaag atgtgaatga agcctgtctg
1500cgtccgactg tgtttccgat gccgatcctg ctgccgagca ttaacctggc
gcgtgtggca 1560gaggtcatct atctgcgtgg tgacggttac acccacgcgg
gtggcgaaac caagaaacat 1620atcaccgcaa tgctggttaa gccgattgaa gtgtaa
1656

SEQ ID NO: 10; Length: 1677; Type: DNA; Organism: Drimys winteri
atggatctta gtacttcacc tgttctttct
tcctcccccc ttccggtgga agacggaaaa 60aatccggccg ttcgccgttc agctggattt
caccccagta tttggggtga tcatttcctc 120tcctacactg aagatcacaa
gaagctggat gcatggagcg aaaggactca agtgttgaag 180gaagaggtga
ggagaatttt aatcaatgcc aaggggtcac tagaagagtt ggatttgttg
240gatgcaatcc aacgccttgg ggtgaaatat cactttgaga aagagattga
agaggcatta 300caccatattt atgttgcaga aactcatgtt tctactgatg
acttatattc cgtttctctc 360cggtttcgac ttcttagaca acaagggtac
aatgtatctg ctgatgtatt taaaaagttc 420aaagatgaga ggggcaactt
caaggcaagc ttaagtactg atgccagggg gttgctaagc 480ttgtatgaag
ctgcatttct cagcatacga ggagatgata tcttagatga agccataact
540ttcacaagag agcagcttaa gtcttctatg acccatgttg atgcccctct
tgccaaacaa 600atagcccatg ccttagaggt accagcgcac aagcgcatac
aaagactaga gaacattcgc 660tacctcacaa tctaccaaga agagaaagga
aggaatgatg tgttgcttga gcttgccaag 720ttggatttca atatcttaca
acaattgcat aagaaagaac tgagagacct tacaaagtgg 780tggaaggaca
cagacgttgc aggaaagcta cctttcatca gagataggtt ggtggaatgc
840tattattgga tcttgggtgt gtattatgag ccagaatact ccagagctag
aattttttct 900accaaaatga caatcatggt ctcagttgtt gatgacatat
atgacgtata tgctactgaa 960gatgagctcc aactattcac tgatgcaatc
tataggtggg atctggaggg cctagatcaa 1020ctcccacagt tcttgaaaga
ctgttttctt gtactctatg acaccgtcaa ggaattagaa 1080gacgaactag
aaccggaagg aaaatcctat cgtggatact atgtaaagga tgcgatgaag
1140gttttggcta gagattactt cgttgagcac aaatggtata acagaaacat
agtgccaagt 1200gtagaagaat atctccgtgt ttcttgcatc agtgttgcag
tccatatggc taacgtccat 1260tgctgtgctg ggatgggaga tgtaatgagc
aaagaggcat tcgaatggtt gaagagtgaa 1320ccaaaggttg taatggatgc
atcactaatt ggccgactgc tcgatgacat gcagtccacc 1380gagtttgagc
aaaagagagg ccatgttgca tcggctgtcc aatgttacat gaatgagtat
1440ggagtgactt acaaagaagc gtgtgaaaag ctgcatgaaa tggctgccct
tgcatggaaa 1500gacgtaaacc aggcttgcct taaaccaact gttttccctc
tccctgtatt tatgcctgca 1560atcaaccttg cgcgagtggc tgaagtcatc
taccttcgtg gagatgggta tactcattca 1620ggaggagaga ctaaagaaaa
tatcacgttg atgcttgtca atccaatctc tgtgtga 1677

SEQ ID NO: 11; Length: 558; Type: PRT; Organism: Drimys winteri
Met Asp Leu Ser Thr Ser Pro Val Leu Ser Ser Ser Pro Leu Pro Val1
5 10 15Glu Asp Gly Lys Asn Pro Ala Val Arg Arg Ser Ala Gly Phe His
Pro 20 25 30Ser Ile Trp Gly Asp His Phe Leu Ser Tyr Thr Glu Asp His
Lys Lys 35 40 45Leu Asp Ala Trp Ser Glu Arg Thr Gln Val Leu Lys Glu
Glu Val Arg 50 55 60Arg Ile Leu Ile Asn Ala Lys Gly Ser Leu Glu Glu
Leu Asp Leu Leu65 70 75 80Asp Ala Ile Gln Arg Leu Gly Val Lys Tyr
His Phe Glu Lys Glu Ile 85 90 95Glu Glu Ala Leu His His Ile Tyr Val
Ala Glu Thr His Val Ser Thr 100 105 110Asp Asp Leu Tyr Ser Val Ser
Leu Arg Phe Arg Leu Leu Arg Gln Gln 115 120 125Gly Tyr Asn Val Ser
Ala Asp Val Phe Lys Lys Phe Lys Asp Glu Arg 130 135 140Gly Asn Phe
Lys Ala Ser Leu Ser Thr Asp Ala Arg Gly Leu Leu Ser145 150 155
160Leu Tyr Glu Ala Ala Phe Leu Ser Ile Arg Gly Asp Asp Ile Leu Asp
165 170 175Glu Ala Ile Thr Phe Thr Arg Glu Gln Leu Lys Ser Ser Met
Thr His 180 185 190Val Asp Ala Pro Leu Ala Lys Gln Ile Ala His Ala
Leu Glu Val Pro 195 200 205Ala His Lys Arg Ile Gln Arg Leu Glu Asn
Ile Arg Tyr Leu Thr Ile 210 215 220Tyr Gln Glu Glu Lys Gly Arg Asn
Asp Val Leu Leu Glu Leu Ala Lys225 230 235 240Leu Asp Phe Asn Ile
Leu Gln Gln Leu His Lys Lys Glu Leu Arg Asp 245 250 255Leu Thr Lys
Trp Trp Lys Asp Thr Asp Val Ala Gly Lys Leu Pro Phe 260 265 270Ile
Arg Asp Arg Leu Val Glu Cys Tyr Tyr Trp Ile Leu Gly Val Tyr 275 280
285Tyr Glu Pro Glu Tyr Ser Arg Ala Arg Ile Phe Ser Thr Lys Met Thr
290 295 300Ile Met Val Ser Val Val Asp Asp Ile Tyr Asp Val Tyr Ala
Thr Glu305 310 315 320Asp Glu Leu Gln Leu Phe Thr Asp Ala Ile Tyr
Arg Trp Asp Leu Glu 325 330 335Gly Leu Asp Gln Leu Pro Gln Phe Leu
Lys Asp Cys Phe Leu Val Leu 340 345 350Tyr Asp Thr Val Lys Glu Leu
Glu Asp Glu Leu Glu Pro Glu Gly Lys 355 360 365Ser Tyr Arg Gly Tyr
Tyr Val Lys Asp Ala Met Lys Val Leu Ala Arg 370 375 380Asp Tyr Phe
Val Glu His Lys Trp Tyr Asn Arg Asn Ile Val Pro Ser385 390 395
400Val Glu Glu Tyr Leu Arg Val Ser Cys Ile Ser Val Ala Val His Met
405 410 415Ala Asn Val His Cys Cys Ala Gly Met Gly Asp Val Met Ser
Lys Glu 420 425 430Ala Phe Glu Trp Leu Lys Ser Glu Pro Lys Val Val
Met Asp Ala Ser 435 440 445Leu Ile Gly Arg Leu Leu Asp Asp Met Gln
Ser Thr Glu Phe Glu Gln 450 455 460Lys Arg Gly His Val Ala Ser Ala
Val Gln Cys Tyr Met Asn Glu Tyr465 470 475 480Gly Val Thr Tyr Lys
Glu Ala Cys Glu Lys Leu His Glu Met Ala Ala 485 490 495Leu Ala Trp
Lys Asp Val Asn Gln Ala Cys Leu Lys Pro Thr Val Phe 500 505 510Pro
Leu Pro Val Phe Met Pro Ala Ile Asn Leu Ala Arg Val Ala Glu 515 520
525Val Ile Tyr Leu Arg Gly Asp Gly Tyr Thr His Ser Gly Gly Glu Thr
530 535 540Lys Glu Asn Ile Thr Leu Met Leu Val Asn Pro Ile Ser
Val545 550 555

SEQ ID NO: 12; Length: 1677; Type: DNA; Organism: Artificial Sequence; Other information: Codon optimized DNA sequence of SCH51-998-28
atggatctga gcaccagtcc ggttctgagc
agctcaccgc tgccggttga agatggtaaa 60aatccggcag ttcgtcgtag cgcaggtttt
catccgagca tttggggtga tcattttctg 120agctataccg aggatcacaa
aaaactggat gcatggtcag aacgtaccca ggttctgaaa 180gaagaagtgc
gtcgtattct gattaatgca aaaggtagcc tggaagaact ggatctgctg
240gatgcaattc agcgtctggg tgttaaatat cactttgaga aagaaatcga
agaagccctg 300catcatattt atgttgcaga aacccatgtg tcaaccgatg
atctgtatag cgttagcctg 360cgttttcgtc tgctgcgtca gcagggttat
aatgttagcg cagatgtgtt caaaaaattc 420aaagatgaac gcggtaactt
caaagcaagc ctgagcaccg atgcacgtgg tctgctgagc 480ctgtatgaag
cagcatttct gagcattcgt ggtgatgata ttctggatga agcaattacc
540tttacccgtg aacagctgaa aagcagcatg acccatgttg atgcaccgct
ggcaaaacaa 600attgcacatg cactggaagt tccggcacat aaacgtattc
agcgcctgga aaatattcgc 660tatctgacca tttaccaaga agagaaaggt
cgtaacgatg ttctgctgga actggccaaa 720ctggatttta acattctgca
gcagctgcat aaaaaagaac tgcgtgatct gaccaaatgg 780tggaaagata
ccgatgttgc aggtaaactg ccgtttattc gtgatcgtct ggttgaatgc
840tattattgga ttctgggcgt ttattatgag ccggaatata gccgtgcacg
tatttttagc 900accaaaatga ccattatggt tagcgtggtg gatgacatct
atgatgttta tgcaaccgaa 960gatgaactgc agctgtttac cgatgcaatt
tatcgttggg atctggaagg tctggatcag 1020ctgccgcagt tcctgaaaga
ttgttttctg gttctgtatg ataccgtgaa agaactggaa 1080gatgagctgg
aaccggaagg taaaagctat cgtggttatt atgttaaaga tgccatgaaa
1140gttctggcac gcgattattt tgttgagcac aaatggtata accgcaatat
tgttccgagc 1200gtggaagaat atctgcgtgt tagctgtatt agcgttgcag
ttcacatggc aaatgttcat 1260tgttgtgcag gtatgggtga tgtgatgagc
aaagaagcat ttgaatggct gaaaagtgaa 1320ccgaaagttg ttatggatgc
cagcctgatt ggtcgcctgc tggacgatat gcagagcacc 1380gaatttgaac
agaaacgtgg tcatgttgca agcgcagttc agtgttatat gaatgaatat
1440ggcgtgacct ataaagaggc atgcgaaaaa ctgcatgaaa tggcagcact
ggcatggaaa 1500gatgttaatc aggcatgtct gaaaccgacc gtttttccgc
tgcctgtttt tatgcctgca 1560attaatctgg cacgtgttgc cgaagttatt
tacctgcgtg gggatggtta tacccatagc 1620ggtggtgaaa ccaaagaaaa
cattaccctg atgctggtta atccgattag cgtttaa 1677

SEQ ID NO: 13; Length: 1680; Type: DNA; Organism: Drimys lanceolata
atggatgttc taattccctc ccctgtggct tccactctcc ctctgcccga
agatggaaac 60ttggacgtcg ttcgcagatc cgccgggttt catccgacgg tctggggcga
tcacttcctc 120gcttactcgc ccgatccaac caaaatagat gcttggacta
aaagagttga agagctgaag 180caagaagtga agaggattct aagcaatgtg
aaagggtcac tggaagagct gaacttgctt 240gatgctatcc aacaccttgg
gattggttat cattttgaga aagagattga tgatgcttta 300caactaatct
ttgattccca tattgatgct tttcctactg atgatctata tgtggctgcc
360ctccgattta gcctactaag gcgacaaggg cactgtgttt cttcagatgt
attcaaaaaa 420ttcaaagatg agcaggggaa tttcaaggca gagctgagca
ccgatgcgaa aggtttgctg 480agtctctatg acgcggcgta tctcagtgta
agaggggaag atatattgga tgaggccatt 540cctttcacta gggagcacct
taggacttgt attagccatg tagattctca tttggcagca 600aaaattgagc
attctctaga gcttcccctg catcatcgca taccaaggct agagaacagg
660cactacatct cagtgtacga aggagagaag gaaaggaatg aagttgtact
agagcttgcc 720aaattagatt tcaatctgat tcaaatcttg caccaaagag
agctgaggga catcacaacg 780tggtggaatg agattgacct cgcagcaaag
ctaccattta ttagggatag gttggtggag 840tgctactatt ggatcatggg
tgtctatttt gaaccaatat tctcaagggc tagagttttt 900tcgaccaaaa
tgacaatttt ggtctcagtt gtcgacgaca tatatgatgt ctacgctaca
960gaggatgagc tccaactttt cactgacgca atctataggt gggatgccga
ggacattgag 1020cagcttccac agtacttgaa agattctttt cttgtactct
ataacaccgt gaaggactta 1080gaagaggagc tgaaaccaga aggaaactca
tatcgtggag actatgtaaa agatgcgatg 1140aaggttttgg caagagatta
ctttgtggag cacaaatggt ataacagaaa aattgtaccg 1200tcagtagagg
actacctacg aatttcttgc attagtgttg ccgttcatat ggctacagtt
1260cattgttgtg ctgggatgga tgaaattgca accaaagagg cattcgaatg
gttgaagacc 1320gaacctaaac ttgttataga tgcatcactg attgggcgtc
tcctcgatga catgcagtcc 1380acctcgtttg agcaacagag aggtcatgtg
tcatcggcgg tacagtgtta catgatccaa 1440tatggcgtat cacacgaaga
agcgtgtgag aagttgacag aaatggctgc aattgcatgg 1500aaagatgtaa
accaagcatg ccttaggccc actgttttcc caatgcctat tcttctgcct
1560tcaatcaacc ttgcacgtgt ggcagaagtc atctacctgc gcggagatgg
atatacacat 1620gctggtggtg agaccaaaaa acatatcacg gccatgcttg
ttgaaccaat ccaagtctga 1680

SEQ ID NO: 14; Length: 559; Type: PRT; Organism: Drimys lanceolata
Met Asp Val
Leu Ile Pro Ser Pro Val Ala Ser Thr Leu Pro Leu Pro1 5 10 15Glu Asp
Gly Asn Leu Asp Val Val Arg Arg Ser Ala Gly Phe His Pro 20 25 30Thr
Val Trp Gly Asp His Phe Leu Ala Tyr Ser Pro Asp Pro Thr Lys 35 40
45Ile Asp Ala Trp Thr Lys Arg Val Glu Glu Leu Lys Gln Glu Val Lys
50 55 60Arg Ile Leu Ser Asn Val Lys Gly Ser Leu Glu Glu Leu Asn Leu
Leu65 70 75 80Asp Ala Ile Gln His Leu Gly Ile Gly Tyr His Phe Glu
Lys Glu Ile 85 90 95Asp Asp Ala Leu Gln Leu Ile Phe Asp Ser His Ile
Asp Ala Phe Pro 100 105 110Thr Asp Asp Leu Tyr Val Ala Ala Leu Arg
Phe Ser Leu Leu Arg Arg 115 120 125Gln Gly His Cys Val Ser Ser Asp
Val Phe Lys Lys Phe Lys Asp Glu 130 135 140Gln Gly Asn Phe Lys Ala
Glu Leu Ser Thr Asp Ala Lys Gly Leu Leu145 150 155 160Ser Leu Tyr
Asp Ala Ala Tyr Leu Ser Val Arg Gly Glu Asp Ile Leu
165 170 175Asp Glu Ala Ile Pro Phe Thr Arg Glu His Leu Arg Thr Cys
Ile Ser 180 185 190His Val Asp Ser His Leu Ala Ala Lys Ile Glu His
Ser Leu Glu Leu 195 200 205Pro Leu His His Arg Ile Pro Arg Leu Glu
Asn Arg His Tyr Ile Ser 210 215 220Val Tyr Glu Gly Glu Lys Glu Arg
Asn Glu Val Val Leu Glu Leu Ala225 230 235 240Lys Leu Asp Phe Asn
Leu Ile Gln Ile Leu His Gln Arg Glu Leu Arg 245 250 255Asp Ile Thr
Thr Trp Trp Asn Glu Ile Asp Leu Ala Ala Lys Leu Pro 260 265 270Phe
Ile Arg Asp Arg Leu Val Glu Cys Tyr Tyr Trp Ile Met Gly Val 275 280
285Tyr Phe Glu Pro Ile Phe Ser Arg Ala Arg Val Phe Ser Thr Lys Met
290 295 300Thr Ile Leu Val Ser Val Val Asp Asp Ile Tyr Asp Val Tyr
Ala Thr305 310 315 320Glu Asp Glu Leu Gln Leu Phe Thr Asp Ala Ile
Tyr Arg Trp Asp Ala 325 330 335Glu Asp Ile Glu Gln Leu Pro Gln Tyr
Leu Lys Asp Ser Phe Leu Val 340 345 350Leu Tyr Asn Thr Val Lys Asp
Leu Glu Glu Glu Leu Lys Pro Glu Gly 355 360 365Asn Ser Tyr Arg Gly
Asp Tyr Val Lys Asp Ala Met Lys Val Leu Ala 370 375 380Arg Asp Tyr
Phe Val Glu His Lys Trp Tyr Asn Arg Lys Ile Val Pro385 390 395
400Ser Val Glu Asp Tyr Leu Arg Ile Ser Cys Ile Ser Val Ala Val His
405 410 415Met Ala Thr Val His Cys Cys Ala Gly Met Asp Glu Ile Ala
Thr Lys 420 425 430Glu Ala Phe Glu Trp Leu Lys Thr Glu Pro Lys Leu
Val Ile Asp Ala 435 440 445Ser Leu Ile Gly Arg Leu Leu Asp Asp Met
Gln Ser Thr Ser Phe Glu 450 455 460Gln Gln Arg Gly His Val Ser Ser
Ala Val Gln Cys Tyr Met Ile Gln465 470 475 480Tyr Gly Val Ser His
Glu Glu Ala Cys Glu Lys Leu Thr Glu Met Ala 485 490 495Ala Ile Ala
Trp Lys Asp Val Asn Gln Ala Cys Leu Arg Pro Thr Val 500 505 510Phe
Pro Met Pro Ile Leu Leu Pro Ser Ile Asn Leu Ala Arg Val Ala 515 520
525Glu Val Ile Tyr Leu Arg Gly Asp Gly Tyr Thr His Ala Gly Gly Glu
530 535 540Thr Lys Lys His Ile Thr Ala Met Leu Val Glu Pro Ile Gln
Val545 550 555

SEQ ID NO: 15; Length: 1680; Type: DNA; Organism: Artificial Sequence; Other information: Codon optimized DNA sequence of SCH51-13163-6
atggatgttc tgattccgag tccggttgca
agcaccctgc cgctgccgga agatggtaat 60ctggatgttg ttcgtcgtag cgcaggtttt
catccgaccg tttggggtga tcattttctg 120gcatatagtc cggatccgac
caaaattgat gcatggacca aacgtgttga ggaactgaaa 180caagaagtga
aacgtattct gagcaatgtg aaaggtagcc tggaagaact gaatctgctg
240gatgcaattc agcatctggg tattggttat cacttcgaga aagaaattga
tgatgcactg 300cagctgatct ttgatagcca tattgatgcc tttccgaccg
atgatctgta tgttgcagca 360ctgcgtttta gcctgctgcg tcgtcagggt
cattgtgtta gcagtgatgt tttcaaaaaa 420ttcaaagacg agcagggcaa
ctttaaagca gaactgagca ccgatgcaaa aggtctgctg 480agcctgtatg
atgccgcata tctgagcgtt cgtggtgaag atattctgga tgaagcaatt
540ccgtttaccc gtgaacatct gcgtacctgt attagccatg tggatagcca
tctggcagca 600aaaattgaac atagtctgga actgcctctg catcatcgta
ttccgcgtct ggaaaatcgt 660cactatatta gcgtttatga aggcgaaaaa
gaacgcaatg aagttgtgct ggaactggca 720aaactggatt ttaacctgat
tcagattctg catcagcgtg aactgcgtga tattaccacc 780tggtggaatg
aaattgacct ggcagccaaa ctgccgttta ttcgtgatcg tctggttgaa
840tgctattatt ggattatggg cgtgtatttt gaaccgattt ttagccgtgc
acgtgtgttt 900agcaccaaaa tgaccattct ggttagcgtg gtggatgata
tctatgatgt ttatgcaacc 960gaagatgagc tgcaactgtt taccgatgcc
atttatcgtt gggatgcaga agatattgaa 1020cagctgcctc agtatctgaa
agatagcttt ctggttctgt acaacaccgt gaaagatctg 1080gaagaagaac
tgaaaccgga aggtaatagc tatcgtggtg attatgttaa agacgccatg
1140aaagttctgg cacgcgatta ttttgttgag cacaaatggt ataaccgcaa
aattgttccg 1200agcgtggaag attatctgcg tattagctgc attagcgttg
cagttcacat ggcaaccgtt 1260cattgttgtg caggtatgga tgaaattgca
accaaagaag catttgagtg gctgaaaacc 1320gaaccgaaac tggttattga
tgcaagcctg attggtcgtc tgctggacga tatgcagtca 1380accagctttg
aacagcagcg tggtcatgtt agcagcgcag ttcagtgtta tatgattcag
1440tatggtgtta gccatgaaga agcatgcgaa aaactgaccg aaatggcagc
aattgcatgg 1500aaagatgtta atcaggcatg tctgcgtccg accgtgtttc
ctatgccgat tctgctgccg 1560agcattaatc tggcacgtgt tgccgaagtt
atctatctgc gtggtgatgg ttatacccat 1620gccggtggtg aaaccaaaaa
acatattacc gcaatgctgg tagaaccgat tcaggtttaa 1680
* * * * *
References