U.S. patent application number 12/364724 was filed with the patent office on 2009-02-03 and published on 2009-06-11 for gdc-1 genes conferring herbicide resistance.
This patent application is currently assigned to Athenix Corporation. Invention is credited to Brian Carr, Nicholas B. Duck, Philip E. Hammer, Todd K. Hinson.
Application Number | 20090151018 12/364724 |
Family ID | 46123678 |
Filed Date | 2009-02-03 |
United States Patent
Application |
20090151018 |
Kind Code |
A1 |
Hammer; Philip E. ; et
al. |
June 11, 2009 |
GDC-1 GENES CONFERRING HERBICIDE RESISTANCE
Abstract
Compositions and methods for conferring herbicide resistance to
plants, plant cells, tissues and seeds are provided. Compositions
comprising a coding sequence for a polypeptide that confers
resistance or tolerance to glyphosate herbicides are provided. The
coding sequences can be used in DNA constructs or expression
cassettes for transformation and expression in plants. Compositions
also comprise transformed plants, plant cells, tissues, and seeds.
In particular, isolated nucleic acid molecules corresponding to
glyphosate resistant nucleic acid sequences are provided.
Additionally, amino acid sequences corresponding to the
polynucleotides are encompassed. In particular, the present
invention provides for isolated nucleic acid molecules comprising
nucleotide sequences encoding an amino acid sequence shown in SEQ
ID NO:3, 6, 8, 11, 19, or 21, or a nucleotide sequence set forth in
SEQ ID NO:1, 2, 4, 5, 7, 9, 10, 18, or 20, as well as variants and
fragments thereof.
Inventors: |
Hammer; Philip E.; (Cary,
NC) ; Hinson; Todd K.; (Rougemont, NC) ; Carr;
Brian; (Raleigh, NC) ; Duck; Nicholas B.;
(Apex, NC) |
Correspondence
Address: |
ALSTON & BIRD LLP
BANK OF AMERICA PLAZA, 101 SOUTH TRYON STREET, SUITE 4000
CHARLOTTE
NC
28280-4000
US
|
Assignee: |
Athenix Corporation
Research Triangle Park
NC
|
Family ID: |
46123678 |
Appl. No.: |
12/364724 |
Filed: |
February 3, 2009 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
11185342 | Jul 20, 2005 | 7504561 |
12364724 | | |
10796953 | Mar 10, 2004 | |
11185342 | | |
60453237 | Mar 10, 2003 | |
Current U.S.
Class: |
800/278 ;
435/232; 435/252.3; 435/320.1; 435/419; 435/69.1; 536/23.2;
800/300; 800/300.1 |
Current CPC
Class: |
C12N 9/88 20130101; C12N
15/8275 20130101 |
Class at
Publication: |
800/278 ;
800/300; 800/300.1; 536/23.2; 435/320.1; 435/252.3; 435/419;
435/232; 435/69.1 |
International
Class: |
C12N 15/60 20060101
C12N015/60; A01H 5/00 20060101 A01H005/00; C12N 15/11 20060101
C12N015/11; C12N 15/00 20060101 C12N015/00; C12N 1/21 20060101
C12N001/21; C12N 5/04 20060101 C12N005/04; C12N 9/88 20060101
C12N009/88; C12P 21/02 20060101 C12P021/02 |
Claims
1. An isolated nucleic acid molecule selected from the group
consisting of: a) a nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1, 2, 4, 5, 9, 10, 18, or 20; b) a nucleic
acid molecule which encodes a polypeptide comprising the amino acid
sequence of SEQ ID NO:3, 6, 11, 19, or 21; c) a nucleic acid
molecule comprising a nucleotide sequence encoding a polypeptide
having at least 95% amino acid sequence identity to the amino acid
sequence of SEQ ID NO:3, 6, 11, 19, or 21, wherein said polypeptide
has glyphosate resistance and decarboxylase activity; and d) a
complement of any of a)-c).
2. The isolated nucleic acid molecule of claim 1, wherein said
nucleotide sequence is a synthetic sequence that has been designed
for expression in a plant.
3. The nucleic acid molecule of claim 2, wherein said synthetic
sequence has an increased GC content relative to the GC content of
SEQ ID NO: 1, 2, 4, 5, 9, 10, 18, or 20.
4. A vector comprising the nucleic acid molecule of claim 1.
5. The vector of claim 4, further comprising a nucleic acid
molecule encoding a heterologous polypeptide.
6. A host cell that contains the vector of claim 4.
7. The host cell of claim 6 that is a bacterial host cell.
8. The host cell of claim 6 that is a plant cell.
9. A transgenic plant comprising the host cell of claim 8.
10. The plant of claim 9, wherein said plant is selected from the
group consisting of maize, sorghum, wheat, sunflower, tomato,
crucifers, peppers, potato, cotton, rice, soybean, sugarbeet,
sugarcane, tobacco, barley, and oilseed rape.
11. A transgenic seed comprising the nucleic acid molecule of claim
1.
12. An isolated polypeptide selected from the group consisting of:
a) a polypeptide comprising the amino acid sequence of SEQ ID NO:3, 6,
11, 19, or 21; b) a polypeptide encoded by the nucleotide sequence
of SEQ ID NO:1, 2, 4, 5, 9, 10, 18, or 20; and c) a polypeptide
comprising an amino acid sequence having at least 95% sequence
identity to the amino acid sequence of SEQ ID NO:3, 6, 11, 19, or
21, wherein said polypeptide has glyphosate resistance and
decarboxylase activity.
13. The polypeptide of claim 12 further comprising a heterologous
amino acid sequence.
14. A method for producing a polypeptide with glyphosate resistance
activity, comprising culturing the host cell of claim 6 under
conditions in which a nucleic acid molecule encoding the
polypeptide is expressed.
15. A method for conferring resistance to glyphosate in a plant,
said method comprising transforming said plant with a DNA
construct, said construct comprising a promoter that drives
expression in a plant cell operably linked with a nucleotide
sequence at least 95% identical to the nucleotide sequence of SEQ
ID NO:3, 6, 11, 19, or 21, and regenerating a transformed plant,
wherein said nucleotide sequence encodes a polypeptide that has
glyphosate resistance and decarboxylase activity.
16. A plant having stably incorporated into its genome a DNA
construct comprising a nucleotide sequence that encodes a protein
having glyphosate resistance activity, wherein said nucleotide
sequence is selected from the group consisting of: a) the
nucleotide sequence of SEQ ID NO:1, 2, 4, 5, 9, 10, 18, or 20; b) a
nucleotide sequence encoding a polypeptide comprising the amino
acid sequence of SEQ ID NO:3, 6, 11, 19, or 21; and c) a nucleotide
sequence encoding a polypeptide having at least 95% amino acid
sequence identity to the amino acid sequence of SEQ ID NO:3, 6, 11,
19, or 21, wherein said polypeptide has glyphosate resistance and
decarboxylase activity; wherein said nucleotide sequence is
operably linked to a promoter that drives expression of a coding
sequence in a plant cell.
17. A plant cell having stably incorporated into its genome a DNA
construct comprising a nucleotide sequence that encodes a protein
having herbicide resistance activity, wherein said nucleotide
sequence is selected from the group consisting of: a) the
nucleotide sequence of SEQ ID NO:1, 2, 4, 5, 9, 10, 18, or 20; b) a
nucleotide sequence encoding a polypeptide comprising the amino
acid sequence of SEQ ID NO:3, 6, 11, 19, or 21; and c) a nucleotide
sequence encoding a polypeptide having at least 95% amino acid
sequence identity to the amino acid sequence of SEQ ID NO:3, 6, 11,
19, or 21, wherein said polypeptide has glyphosate resistance and
decarboxylase activity; wherein said nucleotide sequence is
operably linked to a promoter that drives expression of a coding
sequence in a plant cell.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of copending U.S. patent
application Ser. No. 11/185,342, filed Jul. 20, 2005, which is a
continuation-in-part of U.S. patent application Ser. No.
10/796,953, filed Mar. 10, 2004, now abandoned, which claims the
benefit of U.S. Provisional Application Ser. No. 60/453,237, filed
Mar. 10, 2003, each of which is hereby incorporated in its entirety
by reference herein.
REFERENCE TO SEQUENCE LISTING SUBMITTED AS A TEXT FILE VIA
EFS-WEB
[0002] The official copy of the sequence listing is submitted
concurrently with the specification as an ASCII formatted text file
via EFS-Web, with a file name of "367460_SequenceListing.txt", a
creation date of Jan. 31, 2009, and a size of 96 kilobytes. The
sequence listing filed via EFS-Web is part of the specification and
is hereby incorporated in its entirety by reference herein.
FIELD OF THE INVENTION
[0003] This invention provides novel genes encoding herbicide
resistance, which are useful in plant biology, crop breeding, and
plant cell culture.
BACKGROUND OF THE INVENTION
[0004] N-phosphonomethylglycine, commonly referred to as
glyphosate, is an important agronomic chemical. Glyphosate inhibits
the enzyme that converts phosphoenolpyruvic acid (PEP) and
3-phosphoshikimic acid to 5-enolpyruvyl-3-phosphoshikimic acid.
Inhibition of this enzyme (5-enolpyruvylshikimate-3-phosphate
synthase; referred to herein as "EPSP synthase") kills plant cells
by shutting down the shikimate pathway, thereby inhibiting aromatic
amino acid biosynthesis.
[0005] Since glyphosate-class herbicides inhibit aromatic amino
acid biosynthesis, they not only kill plant cells, but are also
toxic to bacterial cells. Glyphosate inhibits many bacterial EPSP
synthases, and thus is toxic to these bacteria. However, certain
bacterial EPSP synthases may have a high tolerance to
glyphosate.
[0006] Plant cells resistant to glyphosate toxicity can be produced
by transforming plant cells to express glyphosate-resistant EPSP
synthases. A mutated EPSP synthase from Salmonella typhimurium
strain CT7 confers glyphosate resistance in bacterial cells, and
confers glyphosate resistance on plant cells (U.S. Pat. Nos.
4,535,060, 4,769,061, and 5,094,945). Thus, there is a precedent
for use of glyphosate-resistant bacterial EPSP synthases to confer
glyphosate resistance upon plant cells.
[0007] An alternative method to generate target genes resistant to
a toxin (such as an herbicide) is to identify and develop enzymes
that result in detoxification of the toxin to an inactive or less
active form. This can be accomplished by identifying enzymes that
encode resistance to the toxin in a toxin-sensitive test organism,
such as a bacterium.
[0008] Castle et al. (WO 02/36782 A2) describe proteins (glyphosate
N-acetyltransferases) that modify glyphosate by acetylation of a
secondary amine to yield N-acetylglyphosate.
[0009] Barry et al. (U.S. Pat. No. 5,463,175) describe genes
encoding an oxidoreductase (GOX), and state that GOX proteins
degrade glyphosate by removing the phosphonate residue to yield
aminomethylphosphonic acid (AMPA). This suggests that glyphosate
resistance can also be conferred, at least partially, by removal of
the phosphonate group from glyphosate. However, the resulting
compound (AMPA) appears to retain reduced but measurable toxicity
to plant cells. Barry describes AMPA accumulation in plant cells
as causing effects including chlorosis of leaves, infertility,
stunted growth, and death. Barry (U.S. Pat. No. 6,448,476)
describes plant cells expressing an AMPA-N-acetyltransferase (phnO)
to detoxify AMPA.
[0010] Phosphonates, such as glyphosate, can also be degraded by
cleavage of the C-P bond by a C-P lyase. Wackett et al. (1987) J.
Bacteriol. 169:710-717 described strains that utilize glyphosate as
a sole phosphate source. Kishore et al. (1987) J. Biol. Chem.
262:12164-12168 and Shinabarger et al. (1986) J. Bacteriol.
168:702-707 describe degradation of glyphosate by C-P lyase to
yield glycine and inorganic phosphate.
[0011] While several strategies are available for detoxification of
toxins, such as the herbicide glyphosate, as described above, new
activities capable of degrading glyphosate are useful. Novel genes
and genes conferring glyphosate resistance by novel mechanisms of
action would be of additional usefulness. Single genes conferring
glyphosate resistance by formation of non-toxic products would be
especially useful.
[0012] Thus, novel genes encoding resistance to herbicides are
needed.
SUMMARY OF INVENTION
[0013] Compositions and methods for conferring herbicide resistance
to plants, plant cells, tissues and seeds are provided.
Compositions comprising a coding sequence for a polypeptide that
confers resistance or tolerance to glyphosate herbicides are
provided. The coding sequences can be used in DNA constructs or
expression cassettes for transformation and expression in plants.
Compositions also comprise transformed plants, plant cells,
tissues, and seeds.
[0014] In particular, isolated nucleic acid molecules corresponding
to glyphosate resistance-conferring nucleic acid sequences are
provided. Additionally, amino acid sequences corresponding to the
polynucleotides are encompassed. In particular, the present
invention provides for isolated nucleic acid molecules comprising
nucleotide sequences encoding an amino acid sequence shown in SEQ
ID NO:3, 6, 8, 11, 19, or 21, or a nucleotide sequence set forth in
SEQ ID NO:1, 2, 4, 5, 7, 9, 10, 18, or 20, as well as variants and
fragments thereof. Nucleotide sequences that are complementary to a
nucleotide sequence of the invention, or that hybridize to a
sequence of the invention are also encompassed.
DESCRIPTION OF FIGURES
[0015] FIG. 1 is a diagram that shows GDC-1 (full), GDC-1 (23),
GDC-1 (35), GDC-1 (59), and GDC-1 (35H3mut), as well as the
location of the TPP binding domains and the location (X) of a
mutation.
[0016] FIG. 2 shows an alignment of the predicted proteins
resulting from translation of the clones GDC-1 (full) (SEQ ID
NO:19), GDC-1 (23) (SEQ ID NO:6), GDC-1 (35) (SEQ ID NO:8), and
GDC-1 (59) (SEQ ID NO:11).
[0017] FIG. 3 shows an alignment of GDC-1 protein (SEQ ID NO:19) to
pyruvate decarboxylase of Saccharomyces cerevisiae (SEQ ID NO:13),
a putative indole-3-pyruvate decarboxylase from Salmonella
typhimurium (SEQ ID NO:14), pyruvate decarboxylase (EC 4.1.1.1)
from Zymomonas mobilis (SEQ ID NO:15), acetolactate synthase from
Saccharomyces cerevisiae (SEQ ID NO:16), and acetolactate synthase
from Magnaporthe grisea (SEQ ID NO:17). The alignment shows the
most highly conserved amino acid residues highlighted in black, and
highly conserved amino acid residues highlighted in gray.
DETAILED DESCRIPTION
[0018] The present invention is drawn to compositions and methods
for regulating herbicide resistance in organisms, particularly in
plants or
plant cells. The methods involve transforming organisms with
nucleotide sequences encoding a glyphosate resistance protein of
the invention. In particular, the nucleotide sequences of the
invention are useful for preparing plants that show increased
tolerance to the herbicide glyphosate. Thus, transformed plants,
plant cells, plant tissues and seeds are provided. Compositions
include nucleic acids and proteins relating to glyphosate tolerance
in plants as well as transformed plants, plant tissues and seeds.
More particularly, nucleotide sequences encoding all or part of the
"glyphosate resistance-conferring decarboxylase" gene GDC-1 and the
amino acid sequences of the proteins encoded thereby are disclosed.
The sequences find use in the construction of expression vectors
for subsequent transformation into organisms of interest, as probes
for the isolation of other glyphosate resistance genes, as
selectable markers, and the like.
DEFINITIONS
[0019] "Glyphosate" includes any herbicidal form of
N-phosphonomethylglycine (including any salt thereof) and other
forms that result in the production of the glyphosate anion in
planta.
[0020] "Glyphosate (or herbicide) resistance-conferring
decarboxylase" or "GDC" includes a DNA segment that encodes all or
part of a glyphosate (or herbicide) resistance protein. This
includes DNA segments that are capable of expressing a protein that
confers glyphosate (herbicide) resistance to a cell.
[0021] An "herbicide resistance protein" or a protein resulting
from expression of an "herbicide resistance-encoding nucleic acid
molecule" includes proteins that confer upon a cell the ability to
tolerate a higher concentration of an herbicide than cells that do
not express the protein, or to tolerate a certain concentration of
an herbicide for a longer time than cells that do not express the
protein.
[0022] A "glyphosate resistance protein" includes a protein that
confers upon a cell the ability to tolerate a higher concentration
of glyphosate than cells that do not express the protein, or to
tolerate a certain concentration of glyphosate for a longer time
than cells that do not express the protein. By "tolerate" or
"tolerance" is intended either to survive, or to carry out
essential cellular functions, such as protein synthesis and
respiration, in a manner that is not readily discernable from
untreated cells.
[0023] By "decarboxylase" is intended a protein, or a gene encoding
a protein, whose catalytic mechanism can include cleavage and
release of a carboxylic acid. This includes enzymes that liberate
CO2, such as pyruvate decarboxylases, acetolactate synthases,
and ornithine decarboxylases, as well as enzymes that liberate
larger carboxylic acids. "Decarboxylase" includes proteins that
utilize thiamine pyrophosphate (TPP) as a cofactor in enzymatic
catalysis. Many such decarboxylases also utilize other cofactors,
such as FAD.
[0024] By "TPP-binding domain" is intended a region of conserved
amino acids present in enzymes that are capable of utilizing TPP as
a cofactor.
[0025] "Plant tissue" includes all known forms of plants, including
undifferentiated tissue (e.g. callus), suspension culture cells,
protoplasts, plant cells including leaf cells, root cells and
phloem cells, plant seeds, pollen, propagules, embryos and the
like.
[0026] "Plant expression cassette" includes DNA constructs that are
capable of resulting in the expression of a protein from an open
reading frame in a plant cell. Typically these contain a promoter
and a coding sequence. Often, such constructs will also contain a
3' untranslated region. Such constructs may contain a 'signal
sequence' or 'leader sequence' to facilitate co-translational or
post-translational transport of the peptide to certain
intracellular structures such as the chloroplast (or other
plastid), endoplasmic reticulum, or Golgi apparatus.
[0027] "Signal sequence" includes sequences that are known or
suspected to result in cotranslational or post-translational
peptide transport across the cell membrane. In eukaryotes, this
typically involves secretion into the Golgi apparatus, with some
resulting glycosylation.
[0028] "Leader sequence" includes any sequence that when
translated, results in an amino acid sequence sufficient to trigger
co-translational transport of the peptide chain to a sub-cellular
organelle. Thus, this includes leader sequences targeting transport
and/or glycosylation by passage into the endoplasmic reticulum,
passage to vacuoles, plastids including chloroplasts, mitochondria,
and the like.
[0029] "Plant transformation vector" includes DNA molecules that
are necessary for efficient transformation of a plant cell. Such a
molecule may consist of one or more plant expression cassettes, and
may be organized into more than one 'vector' DNA molecule. For
example, binary vectors are plant transformation vectors that
utilize two non-contiguous DNA vectors to encode all requisite cis-
and trans-acting functions for transformation of plant cells
(Hellens and Mullineaux (2000) Trends in Plant Science
5:446-451).
[0030] "Vector" refers to a nucleic acid construct designed for
transfer between different host cells. "Expression vector" refers
to a vector that has the ability to incorporate, integrate and
express heterologous DNA sequences or fragments in a foreign
cell.
[0031] "Transgenic plants" or "transformed plants" or "stably
transformed plants or cells or tissues" refers to plants that have
incorporated or integrated exogenous or endogenous nucleic acid
sequences or DNA fragments or chimeric nucleic acid sequences or
fragments.
[0032] "Heterologous" generally refers to the nucleic acid
sequences that are not endogenous to the cell or part of the native
genome in which they are present, and have been added to the cell
by infection, transfection, microinjection, electroporation,
microprojection, or the like.
[0033] "Promoter" refers to a nucleic acid sequence that functions
to direct transcription of a downstream coding sequence. The
promoter, together with other transcriptional and translational
regulatory nucleic acid sequences (also termed "control sequences"),
is necessary for the expression of a DNA sequence of interest.
[0034] Provided here is a novel isolated gene that confers
resistance to glyphosate. Also provided are amino acid sequences of
the GDC-1 protein. The protein resulting from translation of this
gene allows cells to function in the presence of concentrations of
glyphosate that are otherwise toxic to cells, including plant cells
and bacterial cells.
[0035] An "isolated" or "purified" nucleic acid molecule or
protein, or biologically active portion thereof, is substantially
free of other cellular material, or culture medium when produced by
recombinant techniques, or substantially free of chemical
precursors or other chemicals when chemically synthesized.
Preferably, an "isolated" nucleic acid is free of sequences
(preferably protein encoding sequences) that naturally flank the
nucleic acid (i.e., sequences located at the 5' and 3' ends of the
nucleic acid) in the genomic DNA of the organism from which the
nucleic acid is derived. For purposes of the invention, "isolated"
when used to refer to nucleic acid molecules excludes isolated
chromosomes. For example, in various embodiments, the isolated
glyphosate resistance-encoding nucleic acid molecule can contain
less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of
nucleotide sequence that naturally flanks the nucleic acid molecule
in genomic DNA of the cell from which the nucleic acid is derived.
A glyphosate resistance protein that is substantially free of
cellular material includes preparations of protein having less than
about 30%, 20%, 10%, or 5% (by dry weight) of non-glyphosate
resistance protein (also referred to herein as a "contaminating
protein"). Various aspects of the invention are described in
further detail in the following subsections.
Isolated Nucleic Acid Molecules, and Variants and Fragments
Thereof
[0036] One aspect of the invention pertains to isolated nucleic
acid molecules comprising nucleotide sequences encoding glyphosate
resistance proteins and polypeptides or biologically active
portions thereof, as well as nucleic acid molecules sufficient for
use as hybridization probes to identify glyphosate
resistance-encoding nucleic acids. As used herein, the term
"nucleic acid molecule" is intended to include DNA molecules (e.g.,
cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of
the DNA or RNA generated using nucleotide analogs. The nucleic acid
molecule can be single-stranded or double-stranded, but preferably
is double-stranded DNA.
[0037] Nucleotide sequences encoding the proteins of the present
invention include the sequences set forth in SEQ ID NOS:1, 2, 18,
and 20, and complements thereof. By "complement" is intended a
nucleotide sequence that is sufficiently complementary to a given
nucleotide sequence such that it can hybridize to the given
nucleotide sequence to thereby form a stable duplex. The
corresponding amino acid sequences for the glyphosate resistance
proteins encoded by the nucleotide sequences are set forth in SEQ
ID NOS:3, 19, and 21. The invention also encompasses nucleic acid
molecules comprising nucleotide sequences encoding partial-length
glyphosate resistance proteins, including the sequences set forth
in SEQ ID NOS:4, 5, 7, 9, and 10, and complements thereof. The
corresponding amino acid sequences for the glyphosate resistance
proteins encoded by these partial-length nucleotide sequences are
set forth in SEQ ID NOS:6, 8, and 11.
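As an illustrative sketch of the "complement" notion defined above, the following Python function computes the reverse complement of a DNA strand, that is, the perfectly complementary sequence that would form a duplex with it. The function name and the example sequence are hypothetical and are not taken from the sequence listing. The definition above is broader, since it also covers sequences that are only sufficiently, rather than perfectly, complementary.

```python
# Hypothetical helper: perfect Watson-Crick reverse complement of a DNA string.
# Only the four unambiguous bases are handled; other symbols pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.translate(_COMPLEMENT)[::-1]


example = "ATGGCC"  # made-up sequence, not from any SEQ ID NO
print(reverse_complement(example))
```

Applying the function twice returns the original strand, which is a convenient sanity check.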
[0038] Nucleic acid molecules that are fragments of these
glyphosate resistance-encoding nucleotide sequences are also
encompassed by the present invention. By "fragment" is intended a
portion of the nucleotide sequence encoding a glyphosate resistance
protein. A fragment of a nucleotide sequence may encode a
biologically active portion of a glyphosate resistance protein, or
it may be a fragment that can be used as a hybridization probe or
PCR primer using methods disclosed below. Nucleic acid molecules
that are fragments of a glyphosate resistance nucleotide sequence
comprise at least about 15, 20, 50, 75, 100, 200, 300, 350, 400,
450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050,
1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600,
1650, 1700, 1750, 1800, 1850, 1900, 1950, 2000, 2050, 2100, 2150,
2200 nucleotides, or up to the number of nucleotides present in a
full-length glyphosate resistance-encoding nucleotide sequence
disclosed herein (for example, 2210 nucleotides for SEQ ID NO:1)
depending upon the intended use.
[0039] Fragments of the nucleotide sequences of the present
invention generally will encode protein fragments that retain the
biological activity of the full-length glyphosate resistance
protein; i.e., glyphosate resistance activity. By "retains
glyphosate resistance activity" is intended that the fragment will
have at least about 30%, preferably at least about 50%, more
preferably at least about 70%, even more preferably at least about
80% of the glyphosate resistance activity of the full-length
glyphosate resistance protein disclosed herein as SEQ ID NO:19.
Methods for measuring glyphosate resistance activity are well known
in the art. See, for example, U.S. Pat. Nos. 4,535,060 and
5,188,642, each of which is herein incorporated by reference in
its entirety.
[0040] A fragment of a glyphosate resistance-encoding nucleotide
sequence that encodes a biologically active portion of a protein of
the invention will encode at least about 15, 25, 30, 50, 75, 100,
125, 150, 175, 200, 250, 300, 350, 400, 450, 500, or 550 contiguous
amino acids, or up to the total number of amino acids present in a
full-length glyphosate resistance protein of the invention (for
example, 575 amino acids for SEQ ID NO:3).
[0041] Preferred glyphosate resistance proteins of the present
invention are encoded by a nucleotide sequence sufficiently
identical to the nucleotide sequence of SEQ ID NO:1, 2, 4, 5, 7, 9,
10, 18, or 20. By "sufficiently identical" is intended an
amino acid or nucleotide sequence that has at least about 60% or
65% sequence identity, preferably about 70% or 75% sequence
identity, more preferably about 80% or 85% sequence identity, most
preferably about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%
sequence identity compared to a reference sequence using one of the
alignment programs described herein using standard parameters. One
of skill in the art will recognize that these values can be
appropriately adjusted to determine corresponding identity of
proteins encoded by two nucleotide sequences by taking into account
codon degeneracy, amino acid similarity, reading frame positioning,
and the like.
[0042] To determine the percent identity of two amino acid
sequences or of two nucleic acids, the sequences are aligned for
optimal comparison purposes. The percent identity between the two
sequences is a function of the number of identical positions shared
by the sequences (i.e., percent identity = number of identical
positions / total number of positions (e.g., overlapping
positions) × 100). In one embodiment, the two sequences are the
same length. The percent identity between two sequences can be
determined using techniques similar to those described below, with
or without allowing gaps. In calculating percent identity,
typically exact matches are counted.
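The percent identity calculation described in this paragraph can be sketched in Python as follows. This is a minimal illustration that assumes the two sequences have already been aligned to equal length, with "-" marking gaps. The function name and the `count_gaps` switch (which chooses between dividing by all alignment columns or only by overlapping, gap-free columns) are assumptions made for the example, not terms from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str, count_gaps: bool = True) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    '-' marks a gap. With count_gaps=False, columns containing a gap are
    excluded from the denominator (only overlapping positions are counted).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    positions = 0
    for a, b in zip(aligned_a, aligned_b):
        has_gap = a == "-" or b == "-"
        if has_gap and not count_gaps:
            continue  # skip non-overlapping columns
        positions += 1
        if a == b and not has_gap:
            matches += 1  # only exact residue matches are counted
    if positions == 0:
        raise ValueError("no positions to compare")
    return 100.0 * matches / positions


print(percent_identity("ACGT", "ACGA"))                    # one mismatch in four columns
print(percent_identity("AC-T", "ACGT", count_gaps=False))  # gap column ignored
```

The choice of denominator matters: the same alignment can give different percentages depending on whether gap columns are counted, so the convention should be stated whenever a threshold such as 95% is applied.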
[0043] The determination of percent identity between two sequences
can be accomplished using a mathematical algorithm. A nonlimiting
example of a mathematical algorithm utilized for the comparison of
two sequences is the algorithm of Karlin and Altschul (1990) Proc.
Natl. Acad. Sci. USA 87:2264, modified as in Karlin and Altschul
(1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm
is incorporated into the BLASTN and BLASTX programs of Altschul et
al. (1990) J. Mol. Biol. 215:403. BLAST nucleotide searches can be
performed with the BLASTN program, score=100, wordlength=12, to
obtain nucleotide sequences homologous to GDC-like nucleic acid
molecules of the invention. BLAST protein searches can be performed
with the BLASTX program, score=50, wordlength=3, to obtain amino
acid sequences homologous to glyphosate resistance protein
molecules of the invention. To obtain gapped alignments for
comparison purposes, Gapped BLAST can be utilized as described in
Altschul et al. (1997) Nucleic Acids Res. 25:3389. Alternatively,
PSI-Blast can be used to perform an iterated search that detects
distant relationships between molecules. See Altschul et al. (1997)
supra. When utilizing BLAST, Gapped BLAST, and PSI-Blast programs,
the default parameters of the respective programs (e.g., BLASTX and
BLASTN) can be used. See www.ncbi.nlm.nih.gov. Another non-limiting
example of a mathematical algorithm utilized for the comparison of
sequences is the ClustalW algorithm (Higgins et al. (1994) Nucleic
Acids Res. 22:4673-4680). ClustalW compares sequences and aligns
the entirety of the amino acid or DNA sequence, and thus can
provide data about the sequence conservation of the entire amino
acid sequence. The ClustalW algorithm is used in several
commercially available DNA/amino acid analysis software packages,
such as the ALIGNX module of the Vector NTI Program Suite
(InforMax, Inc.). After alignment of amino acid sequences with
ClustalW, the percent amino acid identity can be assessed. A
non-limiting example of a software program useful for analysis of
ClustalW alignments is GeneDoc™. GeneDoc™ (Karl Nicholas)
allows assessment of amino acid (or DNA) similarity and identity
between multiple proteins. Another non-limiting example of a
mathematical algorithm utilized for the comparison of sequences is
the algorithm of Myers and Miller (1988) CABIOS 4:11-17. Such an
algorithm is incorporated into the ALIGN program (version 2.0),
which is part of the GCG sequence alignment software package
(available from Accelrys, Inc., 9865 Scranton Rd., San Diego,
Calif., USA). When utilizing the ALIGN program for comparing amino
acid sequences, a PAM 120 weight residue table, a gap length
penalty of 12, and a gap penalty of 4 can be used.
[0044] A preferred program is GAP version 10, which uses the
algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443-453. GAP Version 10 may
be used with the following parameters: % identity and % similarity
for a nucleotide sequence using GAP Weight of 50 and Length Weight
of 3, and the nwsgapdna.cmp scoring matrix; % identity and %
similarity for an amino acid sequence using GAP Weight of 8 and
Length Weight of 2, and the BLOSUM62 Scoring Matrix. Equivalent
programs may also be used. By "equivalent program" is intended any
sequence comparison program that, for any two sequences in
question, generates an alignment having identical nucleotide or
amino acid residue matches and an identical percent sequence
identity when compared to the corresponding alignment generated by
GAP Version 10.
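For orientation, a global alignment in the style of Needleman and Wunsch can be sketched in Python as below. This is not the GAP program and does not reproduce its parameters: it assumes a simple match/mismatch score and a linear gap penalty, whereas GAP as described above uses separate gap creation and extension weights together with a scoring matrix such as BLOSUM62. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment by dynamic programming with a linear gap penalty.

    Returns (aligned_a, aligned_b, score). Scores are illustrative only.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "ACT")
print(aligned_a, aligned_b, total)
```

Because the whole of both sequences is aligned end to end, the resulting alignment can be passed directly to a percent identity calculation over all columns.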
[0045] The invention also encompasses variant nucleic acid
molecules. "Variants" of the glyphosate resistance-encoding
nucleotide sequences include those sequences that encode the
glyphosate resistance proteins disclosed herein but that differ
conservatively because of the degeneracy of the genetic code, as
well as those that are sufficiently identical as discussed above.
Naturally occurring allelic variants can be identified with the use
of well-known molecular biology techniques, such as polymerase
chain reaction (PCR) and hybridization techniques as outlined
below. Variant nucleotide sequences also include synthetically
derived nucleotide sequences that have been generated, for example,
by using site-directed mutagenesis but which still encode the
glyphosate resistance proteins disclosed in the present invention
as discussed below. Variant proteins encompassed by the present
invention are biologically active, that is they retain the desired
biological activity of the native protein, that is, glyphosate
resistance activity. By "retains glyphosate resistance activity" is
intended that the variant will have at least about 30%, preferably
at least about 50%, more preferably at least about 70%, even more
preferably at least about 80% of the glyphosate resistance activity
of the native protein. Methods for measuring glyphosate resistance
activity are well known in the art. See, for example, U.S. Pat.
Nos. 4,535,060 and 5,188,642, each of which is herein
incorporated by reference in its entirety.
[0046] The skilled artisan will further appreciate that changes can
be introduced by mutation into the nucleotide sequences of the
invention thereby leading to changes in the amino acid sequence of
the encoded glyphosate resistance proteins, without altering the
biological activity of the proteins. Thus, variant isolated nucleic
acid molecules can be created by introducing one or more nucleotide
substitutions, additions, or deletions into the corresponding
nucleotide sequence disclosed herein, such that one or more amino
acid substitutions, additions or deletions are introduced into the
encoded protein. Mutations can be introduced by standard
techniques, such as site-directed mutagenesis and PCR-mediated
mutagenesis. Such variant nucleotide sequences are also encompassed
by the present invention.
[0047] For example, conservative amino acid substitutions may be
made at one or more predicted, preferably nonessential amino acid
residues. A "nonessential" amino acid residue is a residue that can
be altered from the wild-type sequence of a glyphosate resistance
protein without altering the biological activity, whereas an
"essential" amino acid residue is required for biological activity.
A "conservative amino acid substitution" is one in which the amino
acid residue is replaced with an amino acid residue having a
similar side chain. Families of amino acid residues having similar
side chains have been defined in the art. These families include
amino acids with basic side chains (e.g., lysine, arginine,
histidine), acidic side chains (e.g., aspartic acid, glutamic
acid), uncharged polar side chains (e.g., glycine, asparagine,
glutamine, serine, threonine, tyrosine, cysteine), nonpolar side
chains (e.g., alanine, valine, leucine, isoleucine, proline,
phenylalanine, methionine, tryptophan), beta-branched side chains
(e.g., threonine, valine, isoleucine) and aromatic side chains
(e.g., tyrosine, phenylalanine, tryptophan, histidine). Amino acid
substitutions may be made in nonconserved regions that retain
function. In general, such substitutions would not be made for
conserved amino acid residues, or for amino acid residues residing
within a conserved motif, where such residues are essential for
protein activity. Examples of residues that are conserved and that
may be essential for protein activity include, for example,
residues that are identical between all proteins contained in the
alignment of FIG. 3. However, one of skill in the art would
understand that functional variants may have minor conserved or
nonconserved alterations in the conserved residues.
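The side chain families recited above can be expressed, purely as an
illustrative sketch, as a lookup that reports whether a proposed
substitution stays within at least one family. The family membership
is taken from the preceding paragraph using one-letter amino acid
codes; the names SIDE_CHAIN_FAMILIES and is_conservative are
hypothetical, and the check says nothing about whether a residue is
essential for activity.

```python
# Families as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original, replacement):
    """Return True if both residues share at least one listed family."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in family and replacement in family
        for family in SIDE_CHAIN_FAMILIES.values()
    )
```

Under this sketch, lysine to arginine (K to R) is conservative,
whereas aspartic acid to lysine (D to K) is not. Note that a residue
may belong to more than one family; threonine to valine qualifies
only through the beta-branched family.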
[0048] Alternatively, variant nucleotide sequences can be made by
introducing mutations randomly along all or part of the coding
sequence, such as by saturation mutagenesis, and the resultant
mutants can be screened for ability to confer glyphosate resistance
activity to identify mutants that retain activity. Following
mutagenesis, the encoded protein can be expressed recombinantly,
and the activity of the protein can be determined using standard
assay techniques.
[0049] Using methods such as PCR, hybridization, and the like,
corresponding glyphosate resistance sequences can be identified,
such sequences having substantial identity to the sequences of the
invention. See, for example, Sambrook J., and Russell, D. W. (2001)
Molecular Cloning: A Laboratory Manual. (Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y.) and Innis, et al.
(1990) PCR Protocols: A Guide to Methods and Applications (Academic
Press, NY).
[0050] In a hybridization method, all or part of the glyphosate
resistance nucleotide sequence can be used to screen cDNA or
genomic libraries. Methods for construction of such cDNA and
genomic libraries are generally known in the art and are disclosed
in Sambrook and Russell, 2001. The so-called hybridization probes
may be genomic DNA fragments, cDNA fragments, RNA fragments, or
other oligonucleotides, and may be labeled with a detectable group
such as .sup.32P, or any other detectable marker, such as other
radioisotopes, a fluorescent compound, an enzyme, or an enzyme
co-factor. Probes for hybridization can be made by labeling
synthetic oligonucleotides based on the known glyphosate
resistance-encoding nucleotide sequence disclosed herein.
Degenerate primers designed on the basis of conserved nucleotides
or amino acid residues in the nucleotide sequence or encoded amino
acid sequence can additionally be used. The probe typically
comprises a region of nucleotide sequence that hybridizes under
stringent conditions to at least about 12, preferably about 25,
more preferably at least about 50, 75, 100, 125, 150, 175, 200,
250, 300, 350, or 400 consecutive nucleotides of glyphosate
resistance-encoding nucleotide sequence of the invention or a
fragment or variant thereof. Preparation of probes for
hybridization is generally known in the art and is disclosed in
Sambrook and Russell, 2001 and Sambrook et al. (1989) Molecular
Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory
Press, Plainview, N.Y.), both of which are herein incorporated by
reference.
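As a non-limiting illustration of the degenerate primers mentioned
above, a primer written with standard IUPAC nucleotide ambiguity
codes can be expanded into the set of specific oligonucleotides it
represents. The function name expand_degenerate and the example
primers are hypothetical and are not sequences of the invention.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(primer):
    """List every specific sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]
```

The number of expanded sequences is the product of the pool sizes at
each position, so a primer with one Y and one N position represents
2 x 4 = 8 oligonucleotides.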
[0051] For example, an entire glyphosate resistance sequence
disclosed herein, or one or more portions thereof, may be used as a
probe capable of specifically hybridizing to corresponding
glyphosate resistance sequences and messenger RNAs. To achieve
specific hybridization under a variety of conditions, such probes
include sequences that are unique and are preferably at least about
10 nucleotides in length, and most preferably at least about 20
nucleotides in length. Such probes may be used to amplify
corresponding glyphosate resistance sequences from a chosen
organism by PCR. This technique may be used to isolate additional
coding sequences from a desired organism or as a diagnostic assay
to determine the presence of coding sequences in an organism.
Hybridization techniques include hybridization screening of plated
DNA libraries (either plaques or colonies; see, for example,
Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d
ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.)).
[0052] Hybridization of such sequences may be carried out under
stringent conditions. By "stringent conditions" or "stringent
hybridization conditions" is intended conditions under which a
probe will hybridize to its target sequence to a detectably greater
degree than to other sequences (e.g., at least 2-fold over
background). Stringent conditions are sequence-dependent and will
be different in different circumstances. By controlling the
stringency of the hybridization and/or washing conditions, target
sequences that are 100% complementary to the probe can be
identified (homologous probing). Alternatively, stringency
conditions can be adjusted to allow some mismatching in sequences
so that lower degrees of similarity are detected (heterologous
probing). Generally, a probe is less than about 1000 nucleotides in
length, preferably less than 500 nucleotides in length.
[0053] Typically, stringent conditions will be those in which the
salt concentration is less than about 1.5 M Na ion, typically about
0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to
8.3 and the temperature is at least about 30.degree. C. for short
probes (e.g., 10 to 50 nucleotides) and at least about 60.degree.
C. for long probes (e.g., greater than 50 nucleotides). Stringent
conditions may also be achieved with the addition of destabilizing
agents such as formamide. Exemplary low stringency conditions
include hybridization with a buffer solution of 30 to 35%
formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37.degree.
C., and a wash in 1.times. to 2.times.SSC (20.times.SSC=3.0 M
NaCl/0.3 M trisodium citrate) at 50 to 55.degree. C. Exemplary
moderate stringency conditions include hybridization in 40 to 45%
formamide, 1.0 M NaCl, 1% SDS at 37.degree. C., and a wash in
0.5.times. to 1.times.SSC at 55 to 60.degree. C. Exemplary high
stringency conditions include hybridization in 50% formamide, 1 M
NaCl, 1% SDS at 37.degree. C., and a wash in 0.1.times.SSC at 60 to
65.degree. C. Optionally, wash buffers may comprise about 0.1% to
about 1% SDS. Duration of hybridization is generally less than
about 24 hours, usually about 4 to about 12 hours.
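For illustration, the sodium ion concentration contributed by a
given SSC wash strength can be estimated from the 20.times.SSC
composition recited above. The sketch assumes complete dissociation,
with each trisodium citrate contributing three sodium ions; the
function names are hypothetical.

```python
NACL_MOLAR_AT_20X = 3.0      # M NaCl in 20x SSC, per the text
CITRATE_MOLAR_AT_20X = 0.3   # M trisodium citrate in 20x SSC, per the text
SODIUM_PER_CITRATE = 3       # assumption: full dissociation of trisodium citrate


def sodium_molarity(ssc_strength):
    """Approximate Na+ molarity of an SSC solution of the given strength.

    ssc_strength is the 'x' factor, e.g. 0.1 for 0.1x SSC.
    """
    sodium_at_20x = NACL_MOLAR_AT_20X + SODIUM_PER_CITRATE * CITRATE_MOLAR_AT_20X
    return ssc_strength / 20.0 * sodium_at_20x


def within_stringent_salt_limit(ssc_strength, limit_molar=1.5):
    """Check the 'less than about 1.5 M Na ion' guideline from the text."""
    return sodium_molarity(ssc_strength) < limit_molar
```

On these assumptions 1.times.SSC corresponds to roughly 0.195 M
sodium ion and 0.1.times.SSC to roughly 0.0195 M, both within the
salt range described for stringent conditions.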
[0054] Specificity is typically the function of post-hybridization
washes, the critical factors being the ionic strength and
temperature of the final wash solution. For DNA-DNA hybrids, the
T.sub.m can be approximated from the equation of Meinkoth and Wahl
(1984) Anal. Biochem. 138:267-284: T.sub.m=81.5.degree. C.+16.6
(log M)+0.41 (% GC)-0.61 (% form)-500/L; where M is the molarity of
monovalent cations, % GC is the percentage of guanosine and
cytosine nucleotides in the DNA, % form is the percentage of
formamide in the hybridization solution, and L is the length of the
hybrid in base pairs. The T.sub.m is the temperature (under defined
ionic strength and pH) at which 50% of a complementary target
sequence hybridizes to a perfectly matched probe. T.sub.m is
reduced by about 1.degree. C. for each 1% of mismatching; thus,
T.sub.m, hybridization, and/or wash conditions can be adjusted to
hybridize to sequences of the desired identity. For example, if
sequences with .gtoreq.90% identity are sought, the T.sub.m can be
decreased 10.degree. C. Generally, stringent conditions are
selected to be about 5.degree. C. lower than the thermal melting
point (T.sub.m) for the specific sequence and its complement at a
defined ionic strength and pH. However, severely stringent
conditions can utilize a hybridization and/or wash at 1, 2, 3, or
4.degree. C. lower than the thermal melting point (T.sub.m);
moderately stringent conditions can utilize a hybridization and/or
wash at 6, 7, 8, 9, or 10.degree. C. lower than the thermal
melting point (T.sub.m); low stringency conditions can utilize a
hybridization and/or wash at 11, 12, 13, 14, 15, or 20.degree. C.
lower than the thermal melting point (T.sub.m). Using the equation,
hybridization and wash compositions, and desired T.sub.m, those of
ordinary skill will understand that variations in the stringency of
hybridization and/or wash solutions are inherently described. If
the desired degree of mismatching results in a T.sub.m of less than
45.degree. C. (aqueous solution) or 32.degree. C. (formamide
solution), it is preferred to increase the SSC concentration so
that a higher temperature can be used. An extensive guide to the
hybridization of nucleic acids is found in Tijssen (1993)
Laboratory Techniques in Biochemistry and Molecular
Biology--Hybridization with Nucleic Acid Probes, Part I, Chapter 2
(Elsevier, New York); and Ausubel et al., eds. (1995) Current
Protocols in Molecular Biology, Chapter 2 (Greene Publishing and
Wiley-Interscience, New York). See Sambrook et al. (1989) Molecular
Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory
Press, Plainview, N.Y.).
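The Meinkoth and Wahl approximation and the adjustments described
above can be worked through with a short sketch. The function names
are hypothetical, the logarithm is taken as base 10, and the example
values (1.0 M monovalent cation, 50% GC, 50% formamide, 500 base
pairs) are chosen only to make the arithmetic easy to follow.

```python
import math


def melting_temp(monovalent_molar, pct_gc, pct_formamide, length_bp):
    """Approximate Tm (deg C) of a DNA-DNA hybrid per the equation above."""
    return (
        81.5
        + 16.6 * math.log10(monovalent_molar)  # assumption: log is base 10
        + 0.41 * pct_gc
        - 0.61 * pct_formamide
        - 500.0 / length_bp
    )


def adjust_for_mismatch(tm, pct_mismatch):
    """Lower Tm by about 1 deg C for each 1% of mismatching."""
    return tm - 1.0 * pct_mismatch


def wash_temperature(tm, degrees_below=5.0):
    """Temperature a chosen number of degrees below Tm (default 5 deg C)."""
    return tm - degrees_below


# Worked example: 81.5 + 0 + 20.5 - 30.5 - 1.0 = 70.5 deg C.
tm_perfect = melting_temp(1.0, 50.0, 50.0, 500)
# Seeking sequences with at least 90% identity allows 10% mismatch.
tm_90 = adjust_for_mismatch(tm_perfect, 10.0)
wash_c = wash_temperature(tm_90)
```

In this example the perfectly matched hybrid has a T.sub.m of about
70.5.degree. C., the T.sub.m for 90% identity is about 60.5.degree.
C., and a wash 5.degree. C. below that would be carried out at about
55.5.degree. C.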
Isolated Proteins and Variants and Fragments Thereof
[0055] Glyphosate resistance proteins are also encompassed within
the present invention. By "glyphosate resistance protein" is
intended a protein having the amino acid sequence set forth in SEQ
ID NO:3, 19, or 21. Fragments, biologically active portions, and
variants thereof are also provided, and may be used to practice the
methods of the present invention.
[0056] "Fragments" or "biologically active portions" include
polypeptide fragments comprising a portion of an amino acid
sequence of a glyphosate resistance protein as set forth in
SEQ ID NO:3, 19, or 21, and that retain glyphosate resistance
activity. A biologically active portion of a glyphosate resistance
protein can be a polypeptide that is, for example, 10, 25, 50, 100
or more amino acids in length. Such biologically active portions
can be prepared by recombinant techniques and evaluated for
glyphosate resistance activity. Methods for measuring glyphosate
resistance activity are well known in the art. See, for example,
U.S. Pat. Nos. 4,535,060 and 5,188,642, each of which is herein
incorporated by reference in its entirety. As used here, a
fragment comprises at least 8 contiguous amino acids of SEQ ID
NO:3, 19, or 21. The invention encompasses other fragments,
however, such as any fragment in the protein greater than about 10,
20, 30, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, or
600 amino acids.
[0057] By "variants" is intended proteins or polypeptides having an
amino acid sequence that is at least about 60%, 65%, preferably
about 70%, 75%, more preferably, 80%, 85%, most preferably 90%,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to the
amino acid sequence of SEQ ID NO:3, 6, 8, 11, 19, or 21. Variants
also include polypeptides encoded by a nucleic acid molecule that
hybridizes to the nucleic acid molecule of SEQ ID NO:1, 2, 4, 5, 7,
9, 10, 18, or 20, or a complement thereof, under stringent
conditions. Variants include polypeptides that differ in amino acid
sequence due to mutagenesis. Variant proteins encompassed by the
present invention are biologically active, that is they continue to
possess the desired biological activity of the native protein, that
is, retaining glyphosate resistance activity. Methods for measuring
glyphosate resistance activity are well known in the art. See, for
example, U.S. Pat. Nos. 4,535,060 and 5,188,642, each of which is
herein incorporated by reference in its entirety.
Altered or Improved Variants
[0058] It is recognized that DNA sequences of GDC-1 may be altered
by various methods, and that these alterations may result in DNA
sequences encoding proteins with amino acid sequences different
than that encoded by GDC-1. This protein may be altered in various
ways including amino acid substitutions, deletions, truncations,
and insertions. Methods for such manipulations are generally known
in the art. For example, amino acid sequence variants of the GDC-1
protein can be prepared by mutations in the DNA. This may also be
accomplished by one of several forms of mutagenesis and/or by
directed evolution. In some aspects, the changes encoded in the
amino acid sequence will not substantially affect the function of
the protein. Such variants will possess the desired glyphosate
resistance activity. However, it is understood that the ability of
GDC-1 to confer glyphosate resistance may be improved by the use of
such techniques upon the compositions of this invention. For
example, one may express GDC-1 in host cells that exhibit high
rates of base misincorporation during DNA replication, such as XL-1
Red (Stratagene). After propagation in such strains, one can
isolate the GDC-1 DNA (for example by preparing plasmid DNA, or by
amplifying by PCR and cloning the resulting PCR fragment into a
vector), culture the GDC-1 mutations in a non-mutagenic strain, and
identify mutated GDC-1 genes with improved resistance to
glyphosate, for example by growing cells in increasing
concentrations of glyphosate and testing for clones that confer
ability to tolerate increased concentrations of glyphosate.
[0059] Alternatively, alterations may be made to the protein
sequence of many proteins at the amino or carboxy terminus without
substantially affecting activity. This can include insertions,
deletions, or alterations introduced by modern molecular methods,
such as PCR, including PCR amplifications that alter or extend the
protein coding sequence by virtue of inclusion of amino acid
encoding sequences in the oligonucleotides utilized in the PCR
amplification. Alternatively, the protein sequences added can
include entire protein-coding sequences, such as those used
commonly in the art to generate protein fusions. Such fusion
proteins are often used to (1) increase expression of a protein of
interest (2) introduce a binding domain, enzymatic activity, or
epitope to facilitate either protein purification, protein
detection, or other experimental uses known in the art (3) target
secretion or translation of a protein to a subcellular organelle,
such as the periplasmic space of Gram-negative bacteria, or the
endoplasmic reticulum of eukaryotic cells, the latter of which
often results in glycosylation of the protein.
[0060] Variant nucleotide and amino acid sequences of the present
invention also encompass sequences derived from mutagenic and
recombinogenic procedures such as DNA shuffling. With such a
procedure, one or more different glyphosate resistance protein
coding regions can be used to create a new glyphosate resistance
protein possessing the desired properties. In this manner,
libraries of recombinant polynucleotides are generated from a
population of related sequence polynucleotides comprising sequence
regions that have substantial sequence identity and can be
homologously recombined in vitro or in vivo. For example, using
this approach, sequence motifs encoding a domain of interest may be
shuffled between the glyphosate resistance gene of the invention
and other known glyphosate resistance genes to obtain a new gene
coding for a protein with an improved property of interest, such as
an increased glyphosate resistance activity. Strategies for such
DNA shuffling are known in the art. See, for example, Stemmer
(1994) Proc. Natl. Acad. Sci. USA 91:10747-10751; Stemmer (1994)
Nature 370:389-391; Crameri et al. (1997) Nature Biotech.
15:436-438; Moore et al. (1997) J. Mol. Biol. 272:336-347; Zhang et
al. (1997) Proc. Natl. Acad. Sci. USA 94:4504-4509; Crameri et al.
(1998) Nature 391:288-291; and U.S. Pat. Nos. 5,605,793 and
5,837,458.
Transformation of Bacterial or Plant Cells
[0061] In one aspect of the invention, the GDC-1 gene is useful as
a marker to assess transformation of bacterial or plant cells.
Transformation of bacterial cells is accomplished by one of several
techniques known in the art, including but not limited to
electroporation or chemical transformation (see, for example,
Ausubel (ed.), Current Protocols in Molecular Biology, John Wiley
and Sons, Inc. (1994)).
Markers conferring resistance to toxic substances are useful in
identifying transformed cells (having taken up and expressed the
test DNA) from non-transformed cells (those not containing or not
expressing the test DNA). By (1) engineering GDC-1 to be expressed
from a bacterial promoter known to stimulate transcription in the
organism to be tested, (2) ensuring that it is properly translated
to generate an intact GDC-1 peptide, and (3) placing the cells in an
otherwise toxic concentration of glyphosate, one can identify cells
that have been transformed with DNA by virtue of their resistance to
glyphosate.
[0062] Transformation of plant cells can be accomplished in similar
fashion. First, one engineers the GDC-1 gene in a way that allows
its expression in plant cells. The glyphosate resistance sequences
of the invention may be provided in expression cassettes for
expression in the plant of interest. The cassette will include 5'
and 3' regulatory sequences operably linked to a sequence of the
invention. By "operably linked" is intended a functional linkage
between a promoter and a second sequence, wherein the promoter
sequence initiates and mediates transcription of the DNA sequence
corresponding to the second sequence. Generally, operably linked
means that the nucleic acid sequences being linked are contiguous
and, where necessary to join two protein coding regions, contiguous
and in the same reading frame. The cassette may additionally
contain at least one additional gene to be cotransformed into the
organism. Alternatively, the additional gene(s) can be provided on
multiple expression cassettes. The organization of such constructs
is well known in the art.
[0063] Such an expression cassette is provided with a plurality of
restriction sites for insertion of the glyphosate resistance
sequence to be under the transcriptional regulation of the
regulatory regions. The expression cassette will include in the
5'-3' direction of transcription, a transcriptional and
translational initiation region (i.e., a promoter), a DNA sequence
of the invention, and a transcriptional and translational
termination region (i.e., termination region) functional in plants.
The promoter may be native or analogous, or foreign or
heterologous, to the plant host and/or to the DNA sequence of the
invention. Additionally, the promoter may be the natural sequence
or alternatively a synthetic sequence. Where the promoter is
"native" or "homologous" to the plant host, it is intended that the
promoter is found in the native plant into which the promoter is
introduced. Where the promoter is "foreign" or "heterologous" to
the DNA sequence of the invention, it is intended that the promoter
is not the native or naturally occurring promoter for the operably
linked DNA sequence of the invention.
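The 5'-3' organization of such an expression cassette can be
sketched as a simple data structure. This is an illustrative model
only: the class name, field names, and placeholder sequences are
hypothetical, and the reading frame check merely tests that each
coding region has a length divisible by three, so that joined coding
regions stay in the same frame.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExpressionCassette:
    promoter: str                # transcriptional/translational initiation region
    coding_regions: List[str]    # one or more coding regions, joined in order
    terminator: str              # termination region functional in plants

    def assemble(self) -> str:
        """Concatenate the elements in the 5'-3' direction of transcription."""
        return self.promoter + "".join(self.coding_regions) + self.terminator

    def coding_regions_in_frame(self) -> bool:
        """True if every coding region is a whole number of codons long."""
        return all(len(region) % 3 == 0 for region in self.coding_regions)


# Placeholder sequences, not sequences of the invention.
cassette = ExpressionCassette(
    promoter="TATAAA",
    coding_regions=["ATGGCC"],
    terminator="AATAAA",
)
```

Keeping the regulatory regions and coding regions as separate fields
mirrors the description above, in which the promoter and termination
region may each be native or heterologous independently of the
operably linked sequence.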
[0064] The termination region may be native with the
transcriptional initiation region, may be native with the
operably-linked DNA sequence of interest, may be native with the
plant host, or may be derived from another source (i.e., foreign or
heterologous to the promoter, the DNA sequence of interest, the
plant host, or any combination thereof). Convenient termination
regions are available from the Ti-plasmid of A. tumefaciens, such
as the octopine synthase and nopaline synthase termination regions.
See also Guerineau et al. (1991) Mol. Gen. Genet. 262:141-144;
Proudfoot (1991) Cell 64:671-674; Sanfacon et al. (1991) Genes Dev.
5:141-149; Mogen et al. (1990) Plant Cell 2:1261-1272; Munroe et
al. (1990) Gene 91:151-158; Ballas et al. (1989) Nucleic Acids Res.
17:7891-7903; and Joshi et al. (1987) Nucleic Acids Res.
15:9627-9639.
[0065] Where appropriate, the gene(s) may be optimized for
increased expression in the transformed host cell. That is, the
genes can be synthesized using host cell-preferred codons for
improved expression, or may be synthesized using codons at a
host-preferred codon usage frequency. Generally, the GC content of
the gene will be increased. See, for example, Campbell and Gowri
(1990) Plant Physiol. 92:1-11 for a discussion of host-preferred
codon usage. Methods are known in the art for synthesizing
host-preferred genes. See, for example, U.S. Pat. Nos. 6,320,100;
6,075,185; 5,380,831; and 5,436,391, U.S. Published Application
Nos. 20040005600 and 20010003849, and Murray et al. (1989) Nucleic
Acids Res. 17:477-498, herein incorporated by reference.
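The idea of resynthesizing a gene with preferred codons while
leaving the encoded protein unchanged can be illustrated as follows.
The sketch uses the standard genetic code, but the "preferred" codon
for each amino acid is simply the synonymous codon with the most G
and C bases (ties broken alphabetically). That rule is an assumption
standing in for a real host codon usage table, which is not
reproduced here, and the example sequence is a placeholder.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def gc_count(seq):
    return sum(base in "GC" for base in seq)


def gc_rich_preferences():
    """Hypothetical preference map: most GC-rich synonymous codon per residue."""
    prefs = {}
    for codon, aa in CODON_TABLE.items():
        best = prefs.get(aa)
        if best is None or (-gc_count(codon), codon) < (-gc_count(best), best):
            prefs[aa] = codon
    return prefs


def translate(cds):
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Replace each codon with the preferred synonymous codon."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(
        preferred[CODON_TABLE[cds[i:i + 3]]] for i in range(0, len(cds), 3)
    )


def gc_fraction(seq):
    return gc_count(seq) / len(seq)


native = "ATGGCTAAATTATAA"                    # placeholder: Met-Ala-Lys-Leu-stop
optimized = recode(native, gc_rich_preferences())
```

Here the recoded sequence encodes the same peptide as the original
while its GC content rises from 3 of 15 bases to 8 of 15, consistent
with the general observation above that GC content is increased.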
[0066] In some instances, it may be useful to engineer the gene
such that the resulting peptide is secreted, or otherwise targeted
within the plant cell. For example, the gene can be engineered to
contain a signal peptide to facilitate transfer of the peptide to
the endoplasmic reticulum. It may also be preferable to engineer
the plant expression cassette to contain an intron, such that mRNA
processing of the intron is required for expression. In one
embodiment, the nucleic acids of interest are targeted to the
chloroplast for expression. In this manner, where the nucleic acid
of interest is not directly inserted into the chloroplast, the
expression cassette will additionally contain a nucleic acid
encoding a transit peptide to direct the gene product of interest
to the chloroplasts. Such transit peptides are known in the art.
See, for example, Von Heijne et al. (1991) Plant Mol. Biol. Rep.
9:104-126; Clark et al. (1989) J. Biol. Chem. 264:17544-17550;
Della-Cioppa et al. (1987) Plant Physiol. 84:965-968; Romer et al.
(1993) Biochem. Biophys. Res. Commun. 196:1414-1421; and Shah et
al. (1986) Science 233:478-481.
[0067] Methods for transformation of chloroplasts are known in the
art. See, for example, Svab et al. (1990) Proc. Natl. Acad. Sci.
USA 87:8526-8530; Svab and Maliga (1993) Proc. Natl. Acad. Sci. USA
90:913-917; Svab and Maliga (1993) EMBO J. 12:601-606. The method
relies on particle gun delivery of DNA containing a selectable
marker and targeting of the DNA to the plastid genome through
homologous recombination. Additionally, plastid transformation can
be accomplished by transactivation of a silent plastid-borne
transgene by tissue-preferred expression of a nuclear-encoded and
plastid-directed RNA polymerase. Such a system has been reported in
McBride et al. (1994) Proc. Natl. Acad. Sci. USA 91:7301-7305.
[0068] The nucleic acids of interest to be targeted to the
chloroplast may be optimized for expression in the chloroplast to
account for differences in codon usage between the plant nucleus
and this organelle. In this manner, the nucleic acids of interest
may be synthesized using chloroplast-preferred codons. See, for
example, U.S. Pat. No. 5,380,831, herein incorporated by
reference.
[0069] Typically this `plant expression cassette` will be inserted
into a `plant transformation vector`. This plant transformation
vector may be comprised of one or more DNA vectors needed for
achieving plant transformation. For example, it is a common
practice in the art to utilize plant transformation vectors that
are comprised of more than one contiguous DNA segment. These
vectors are often referred to in the art as `binary vectors`.
Binary vectors as well as vectors with helper plasmids are most
often used for Agrobacterium-mediated transformation, where the
size and complexity of DNA segments needed to achieve efficient
transformation is quite large, and it is advantageous to separate
functions onto separate DNA molecules. Binary vectors typically
contain a plasmid vector that contains the cis-acting sequences
required for T-DNA transfer (such as left border and right border),
a selectable marker that is engineered to be capable of expression
in a plant cell, and a `gene of interest` (a gene engineered to be
capable of expression in a plant cell for which generation of
transgenic plants is desired). Also present on this plasmid vector
are sequences required for bacterial replication. The cis-acting
sequences are arranged in a fashion to allow efficient transfer
into plant cells and expression therein. For example, the
selectable marker gene and the gene of interest are located between
the left and right borders. Often a second plasmid vector contains
the trans-acting factors that mediate T-DNA transfer from
Agrobacterium to plant cells. This plasmid often contains the
virulence functions (Vir genes) that allow infection of plant cells
by Agrobacterium, and transfer of DNA by cleavage at border
sequences and vir-mediated DNA transfer, as is understood in the
art (Hellens and Mullineaux (2000) Trends in Plant Science
5:446-451). Several types of Agrobacterium strains (e.g. LBA4404,
GV3101, EHA101, EHA105, etc.) can be used for plant transformation.
The second plasmid vector is not necessary for transforming the
plants by other methods such as microprojection, microinjection,
electroporation, polyethylene glycol, etc. Many types of vectors
can be used to transform plant cells for achieving glyphosate
resistance.
[0070] In general, plant transformation methods involve
transferring heterologous DNA into target plant cells (e.g.
immature or mature embryos, suspension cultures, undifferentiated
callus, protoplasts, etc.), followed by applying a maximum
threshold level of appropriate selection (depending on the
selectable marker gene, in this case "glyphosate") to recover
the transformed plant cells from the untransformed cell
mass. Explants are typically transferred to a fresh supply of the
same medium and cultured routinely. Subsequently, the transformed
cells are differentiated into shoots after placing on regeneration
medium supplemented with a maximum threshold level of selecting
agent (e.g. "glyphosate"). The shoots are then transferred to a
selective rooting medium to recover a rooted shoot or plantlet.
The transgenic plantlet then grows into a mature plant and produces
fertile seeds (e.g. Hiei et al. (1994) The Plant Journal 6:271-282;
Ishida et al. (1996) Nature Biotechnology 14:745-750). A general
description of the techniques and methods for generating transgenic
plants is found in Ayres and Park (1994) Critical Reviews in Plant
Science 13:219-239 and Bommineni and Jauhar (1997) Maydica
42:107-120. Because the transformed material contains many cells,
both transformed and non-transformed cells are present in any piece
of subjected target callus or tissue or group of cells. The ability
to kill
non-transformed cells and allow transformed cells to proliferate
results in transformed plant cultures. Often, the ability to remove
non-transformed cells is a limitation to rapid recovery of
transformed plant cells and successful generation of transgenic
plants.
[0071] Generation of transgenic plants may be performed by one of
several methods, including but not limited to introduction of
heterologous DNA by Agrobacterium into plant cells
(Agrobacterium-mediated transformation), bombardment of plant cells
with heterologous foreign DNA adhered to particles, including
aerosol beam transformation (U.S. Published Application No.
20010026941; U.S. Pat. No. 4,945,050; International Publication No.
WO 91/00915; U.S. Published Application No. 2002015066), and
various other non-particle direct-mediated methods (e.g. Hiei et
al. (1994) The Plant Journal 6:271-282; Ishida et al. (1996) Nature
Biotechnology 14:745-750; Ayres and Park (1994) Critical Reviews in
Plant Science 13:219-239; Bommineni and Jauhar (1997) Maydica
42:107-120) to transfer DNA.
[0072] Following integration of heterologous foreign DNA into plant
cells, one then applies a maximum threshold level of glyphosate in
the medium to kill the untransformed cells and separate and
proliferate the putatively transformed cells that survive from this
selection treatment by transferring regularly to a fresh medium. By
continuous passage and challenge with glyphosate, one identifies
and proliferates the cells that are transformed with the plasmid
vector. Molecular and biochemical methods are then used to
confirm the presence of the integrated heterologous gene of
interest in the genome of the transgenic plant.
[0073] The cells that have been transformed may be grown into
plants in accordance with conventional ways. See, for example,
McCormick et al. (1986) Plant Cell Reports 5:81-84. These plants
may then be grown, and either pollinated with the same transformed
strain or different strains, and the resulting hybrid having
constitutive expression of the desired phenotypic characteristic
identified. Two or more generations may be grown to ensure that
expression of the desired phenotypic characteristic is stably
maintained and inherited and then seeds harvested to ensure
expression of the desired phenotypic characteristic has been
achieved. In this manner, the present invention provides
transformed seed (also referred to as "transgenic seed") having a
nucleotide construct of the invention, for example, an expression
cassette of the invention, stably incorporated into their
genome.
Evaluation of Plant Transformation
[0074] Following introduction of heterologous foreign DNA into
plant cells, the transformation or integration of the heterologous gene
in the plant genome is confirmed by various methods such as
analysis of nucleic acids, proteins and metabolites associated with
the integrated gene.
PCR Analysis: PCR analysis is a rapid method to screen transformed
cells, tissue or shoots for the presence of incorporated gene at
the earlier stage before transplanting into the soil (Sambrook and
Russell, 2001). PCR is carried out using oligonucleotide primers
specific to the gene of interest or Agrobacterium vector
background, etc. Southern Analysis Plant transformation is
confirmed by Southern blot analysis of genomic DNA (Sambrook and
Russell, 2001). In general, total DNA is extracted from the
transformant, digested with appropriate restriction enzymes,
fractionated in an agarose gel and transferred to a nitrocellulose
or nylon membrane. The membrane or "blot" then is probed with, for
example, radiolabeled .sup.32P target DNA fragment to confirm the
integration of introduced gene in the plant genome according to
standard techniques (Sambrook and Russell, 2001. Molecular Cloning:
A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y. Northern Analysis: RNA is isolated from
specific tissues of transformant, fractionated in a formaldehyde
agarose gel, blotted onto a nylon filter according to standard
procedures that are routinely used in the art (Sambrook, J., and
Russell, D. W. 2001. Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).
Expression of RNA encoded by GDC-1 is then tested by hybridizing
the filter to a radioactive probe derived from a GDC, by methods
known in the art (Sambrook and Russell, 2001) Western blot and
Biochemical assays: Western blot and biochemical assays and the
like may be carried out on the transgenic plants to determine the
presence of protein encoded by the glyphosate resistance gene by
standard procedures (Sambrook, J., and Russell, D. W. 2001.
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y.) using antibodies that
bind to one or more epitopes present on the glyphosate resistance
protein.
Transgenic Plants
[0075] In another aspect of the invention, one may generate
transgenic plants expressing GDC-1 that are more resistant to high
concentrations of glyphosate than non-transformed plants. Methods
described above by way of example may be utilized to generate
transgenic plants, but the manner in which the transgenic plant
cells are generated is not critical to this invention. Methods
known or described in the art such as Agrobacterium-mediated
transformation, biolistic transformation, and non-particle-mediated
methods may be used at the discretion of the experimenter. Plants
expressing GDC-1 may be isolated by common methods described in the
art, for example by transformation of callus, selection of
transformed callus, and regeneration of fertile plants from such
transgenic callus. In such a process, GDC-1 may be used as a
selectable marker. Alternatively, one may use any gene as a
selectable marker so long as its expression in plant cells confers
the ability to identify or select for transformed cells. Genes
known to function effectively as selectable markers in plant
transformation are well known in the art.
[0076] The following examples are offered by way of illustration
and not by way of limitation.
EXPERIMENTAL
Example 1
Isolation of ATX6394
[0077] Glyphosate-resistant fungi were isolated by plating samples
of soil on Enriched Minimal Media (EMM) containing glyphosate as
the sole source of phosphorus. Since EMM contains no aromatic amino
acids, a strain must be resistant to glyphosate in order to grow on
this media.
[0078] Two grams of soil was suspended in approximately 30 ml of
water, and sonicated for 30 seconds in an Aquasonic sonicator water
bath. The sample was vortexed for 5 seconds and permitted to settle
for 60 seconds. This process was repeated 3 times. 100 .mu.l of
this suspension was added to 2 ml of Enriched Minimal Media II (EMM
II) supplemented with 4 mM glyphosate (pH 6.0). EMMII contains
Solution A (In 900 mls: 10 g sucrose (or other carbon source), 2 g
NaNO.sub.3, 1.0 ml 0.8 M MgSO.sub.4, 1.0 ml 0.1 M CaCl.sub.2, 1.0
ml Trace Elements Solution (In 100 ml of 1000.times. solution: 0.1
g FeSO.sub.4.7H.sub.2O, 0.5 mg CuSO.sub.4.5H.sub.2O, 1.0 mg
H.sub.3BO.sub.3, 1.0 mg MnSO.sub.4.5H.sub.2O, 7.0 mg
ZnSO.sub.4.7H.sub.2O, 1.0 mg MoO.sub.3, 4.0 g KCl)) and Solution B
(In 100 mls: 0.21 g Na.sub.2HPO.sub.4, 0.09 g NaH.sub.2PO.sub.4, pH
7.0). The culture was shaken on a tissue culture roller drum for
eight days at 21.degree. C. and then transferred into 2 ml of fresh
EMMII containing 4 mM glyphosate as the only phosphorus source.
After five days, the culture was plated onto solid media by
streaking a 1 .mu.l loop onto the surface of a plate of EMMII agar
containing 5 mM glyphosate as the sole phosphorus
source. The plate was sealed with parafilm and incubated until
suitable growth was attained. Fresh plates were inoculated by agar
plugs to isolate the fungus into pure culture.
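The glyphosate concentrations used in this example (4 mM and 5 mM) can be converted to masses when preparing media. The following is a minimal sketch only, assuming the free-acid form of glyphosate (C3H8NO5P) and standard atomic masses; the function names and the 1 liter batch volume are illustrative and are not part of the procedure described above. A glyphosate salt would require a different molar mass.

```python
# Standard atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "P": 30.974}

# Assumption: glyphosate free acid, C3H8NO5P (not a salt formulation).
GLYPHOSATE_FORMULA = {"C": 3, "H": 8, "N": 1, "O": 5, "P": 1}


def molar_mass(formula):
    """Sum atomic masses weighted by the atom counts in the formula."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def grams_needed(millimolar, volume_ml, formula=GLYPHOSATE_FORMULA):
    """Mass in grams required for the given concentration and volume."""
    moles = (millimolar / 1000.0) * (volume_ml / 1000.0)
    return moles * molar_mass(formula)


if __name__ == "__main__":
    for concentration in (4, 5):  # mM values used in Example 1
        grams = grams_needed(concentration, 1000)
        print(f"{concentration} mM in 1 L: {grams:.3f} g")
```

Under these assumptions the molar mass evaluates to about 169.07 g/mol, so 4 mM corresponds to roughly 0.68 g per liter.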
[0079] One particular strain, designated ATX6394, was selected due
to its ability to grow in the presence of high glyphosate
concentrations.
Example 2
Construction of cDNA Library from Strain ATX6394
[0080] ATX6394 was grown in (liquid media L+phosphorous) containing
5 mM glyphosate, and total RNA was isolated using Trizol reagent
(Invitrogen). poly(A)+ mRNA was isolated from total RNA using
Poly(A) Purist mRNA Purification kit (Ambion). cDNA was synthesized
from polyA+ mRNA using ZAP cDNA Synthesis kit from Stratagene, and
cloned into the lambda Zap II expression vector (Stratagene).
Example 3
In Vivo Excision of cDNA Clones
[0081] The ATX6394 cDNA library was excised in bulk as per the
manufacturer's protocol (Stratagene), transfected into the SOLR
strain of E. coli (Stratagene), plated directly onto M9 minimal
media plates containing thiamine, proline, ampicillin and 5 mM
glyphosate and incubated at 37.degree. C. (M9 media contains 30 g
Na.sub.2HPO.sub.4, 15 g KH.sub.2PO.sub.4, 5 g NH.sub.4Cl, 2.5 g
NaCl, and 15 mg CaCl.sub.2).
Example 4
Identification of cDNA Clones Conferring Glyphosate Resistance in
E. coli
[0082] Following 2 days growth, 51 colonies had grown in the
presence of 5 mM glyphosate, and these clones were selected for
further study. Plasmid DNA from 48 of the 51 positive clones was
isolated and transformed into the alternate host strain XL-1 Blue
MRF' (Stratagene) and plasmid DNA was prepared for sequencing.
[0083] We determined the DNA sequence of 48 clones conferring
glyphosate resistance (5 mM). Three clones (#23, 35, 59) were found
to represent the same open reading frame. Therefore we designated
this open reading frame GDC-1. The nucleotide sequences of clones
#23, 35, and 59 are provided in SEQ ID NOS:4, 7, and 9
respectively.
Example 5
Isolation of Full-Length GDC-1 Construct (GDC-1 (Full))
[0084] Comparison of GDC-1(23), GDC-1(35), and GDC-1(59) suggested
that these clones did not represent the entire cDNA for the GDC-1
mRNA. To generate a full-length GDC-1 clone, we performed 5' RACE
using the SMART RACE cDNA Amplification kit (BD Biosciences) to
amplify the 5' end of GDC-1 from ATX6394 poly(A)+ mRNA. Oligo
SMARTgrg3.rev [5' TCCCAGATGCCAAAGTTGGCTGTTCCAGTC 3'; SEQ ID NO:12]
was derived from the sequence of GDC-1(35). We cut the resultant
PCR product with HindIII and ligated this to the existing GDC-1(59)
cDNA in pBluescript to generate the full length cDNA, referred to
herein as GDC-1(full). The DNA sequence of GDC-1 (full) was
determined, and found to contain a complete protein-coding region.
This coding region is referred to herein as GDC-1. Amino acid
sequences resulting from the translation of the GDC-1 gene are
provided in SEQ ID NOS:3, 19, and 21.
[0085] GDC1(59) consists of amino acid residues 118 to 575 of
GDC-1(full) (SEQ ID NO:19). GDC1(35) consists of amino acid
residues 331 to 556 of GDC-1(full) (SEQ ID NO:19). GDC1(23)
consists of amino acid residues 379 to 575 of GDC-1(full) (SEQ ID
NO:19).
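The residue ranges given for the partial clones can be related to nucleotide coordinates. The sketch below assumes the CDS annotation shown for SEQ ID NO:1 in the sequence listing (nucleotides 224 to 1951, inclusive) and assumes that residue numbering starts at the initiator methionine; the function names are illustrative only.

```python
# CDS coordinates (1-based, inclusive) as annotated for SEQ ID NO:1.
CDS_START, CDS_END = 224, 1951

# Residue ranges of the partial clones, as stated in paragraph [0085].
FRAGMENTS = {"GDC1(59)": (118, 575), "GDC1(35)": (331, 556), "GDC1(23)": (379, 575)}


def cds_protein_length(start=CDS_START, end=CDS_END):
    """Number of amino acids encoded, excluding the terminal stop codon."""
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return n_nt // 3 - 1


def residues_to_nucleotides(first_res, last_res, start=CDS_START):
    """Map a 1-based residue range to 1-based nucleotide coordinates."""
    first_nt = start + 3 * (first_res - 1)
    last_nt = start + 3 * last_res - 1
    return first_nt, last_nt


if __name__ == "__main__":
    print("encoded residues:", cds_protein_length())
    for name, (first, last) in FRAGMENTS.items():
        nt_range = residues_to_nucleotides(first, last)
        print(name, "residues", (first, last), "-> nucleotides", nt_range)
```

With these inputs the annotated CDS of 1,728 nucleotides encodes 575 residues plus a stop codon, which agrees with the residue numbering used above.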
Example 6
Disruption of GDC-1 ORF Eliminates Glyphosate Resistance
[0086] To confirm that the GDC-1 ORF is responsible for conferring
glyphosate resistance, we engineered a mutant of GDC1(35), and
tested its ability to confer glyphosate resistance. The GDC1(35)
construct contains a single recognition site for the HindIII
restriction enzyme. GDC1(35) was digested with HindIII, and the
resulting recessed 3' ends were extended by incubating with T4 DNA
polymerase and dNTPs, as known in the art (Sambrook). The resulting
molecules were then religated using T4 DNA ligase (Maniatis). The
religated molecules were identified by mini-prep of transformed
clones, and the DNA was sequenced. The resulting clone,
GDC1(35-H3mut), contains a four nucleotide insertion in the GDC-1
open reading frame. This four nucleotide insertion shifts the
reading frame and terminates translation of the GDC1(35) protein at
a premature stop codon at nucleotides 1451-1453 of the GDC-1 full
length sequence.
TABLE-US-00001 TABLE 1 Glyphosate resistance of GDC-1(35) and the
mutant GDC-1(35-H3mut)
Construct                   M9 media + Amp + 10 mM Glyphosate
Vector (pBluescript SK+)    -
GDC-1(35)                   +++
GDC-1(35-H3mut)             -
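The effect of the HindIII fill-in described in this example can be illustrated in silico. The following is a sketch under stated assumptions: the excerpt is copied from SEQ ID NO:1 of the sequence listing and, by our count, begins in frame at nucleotide 1352; HindIII is modeled as cutting A^AGCTT; and the helper names are hypothetical. It is not a record of how the construct was actually analyzed.

```python
# Excerpt of SEQ ID NO:1 copied from the sequence listing. By our count it
# starts in frame at nucleotide 1352 (codon "caa"); that offset is an assumption.
EXCERPT_START = 1352
EXCERPT = (
    "caagctttcttctggccgcgcgtgggagagttcctg"
    "aagaagaacgacatcgtcattaccgagactggaacagccaactttggc"
    "atctgggatactaagtttccc"
).upper()

HINDIII_SITE = "AAGCTT"  # modeled as A^AGCTT, leaving a 5' AGCT overhang
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fill_in_and_religate(seq, site=HINDIII_SITE):
    """Model digestion, fill-in of the overhang, and blunt-end religation."""
    i = seq.find(site)
    if i < 0 or seq.find(site, i + 1) >= 0:
        raise ValueError("expected exactly one HindIII site")
    # Both filled-in ends carry a copy of AGCT, so AAGCTT becomes AAGCTAGCTT.
    return seq[: i + 5] + site[1:5] + seq[i + 5:]


def first_stop(seq):
    """Return the 0-based offset of the first in-frame stop codon, or None."""
    for pos in range(0, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos
    return None


if __name__ == "__main__":
    mutant = fill_in_and_religate(EXCERPT)
    offset = first_stop(mutant)
    print("stop in original excerpt:", first_stop(EXCERPT))
    print("first stop codon in mutant starts at nucleotide", EXCERPT_START + offset)
```

Under these assumptions the four-nucleotide insertion (AGCT) creates an in-frame TAA beginning at position 1451 in the numbering of the mutated sequence, which appears consistent with the coordinates reported above.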
Example 7
GDC-1 Does Not Complement an aroA Mutation in E. coli
[0087] The E. coli aroA gene codes for EPSP synthase, the target
enzyme for glyphosate. EPSP synthase catalyzes the sixth step in
the biosynthesis of aromatic amino acids in microbes and plants.
aroA mutants that lack an EPSP synthase do not grow on minimal
media that lacks aromatic amino acids (Pittard and Wallace (1966)
J. Bacteriol. 91:1494-508), but can grow in rich media, such as LB.
However, genes encoding EPSPS activity can restore to aroA mutant
E. coli strains the ability to grow on glyphosate. Thus, a test
for genetic complementation of an aroA mutant is a highly sensitive
method to test if a gene is capable of functioning as an EPSPS in
E. coli. Such tests for gene function by genetic complementation
are known in the art.
[0088] A deletion of the aroA gene was created in E. coli XL-1 MRF'
(Stratagene) by PCR/recombination methods known in the art and
outlined by Datsenko and Wanner, (2000) Proc. Natl. Acad. Sci. USA
97:6640-6645. This system is based on the Red system that allows
for chromosomal disruptions of targeted sequences. A large portion
(1067 nt of the 1283 nt) of the aroA coding region was disrupted by
the engineered deletion. The presence of the deletion was confirmed
by PCR with several sets of oligonucleotides, and by the appearance
of an aroA phenotype in the strain, referred to herein as
`.DELTA.aroA`. .DELTA.aroA grows on LB media (which contains all
amino acids) and grows on M63 media supplemented with
phenylalanine, tryptophan, and tyrosine, but does not grow on M63
minimal media (which lacks aromatic amino acids). These results
indicate that .DELTA.aroA exhibits an aroA phenotype.
[0089] The ability of an EPSPS to complement the mutant phenotype
of .DELTA.aroA was confirmed. Clone pAX482, an E. coli expression
vector containing the wild-type E. coli aroA gene, was transformed
into .DELTA.aroA, and transformed cells were selected. These cells
(containing a functional aroA gene residing on a plasmid) were then
plated on LB media, M63, and M63 with amino acid supplements.
Whereas the .DELTA.aroA mutant strain grew only on LB and M63
supplemented with aromatic amino acids, .DELTA.aroA cells
containing the functional aroA gene on a plasmid grew on all three
media types.
[0090] In order to determine whether or not GDC-1 could confer
complementation, plasmid pAX472, the expression vector containing
GDC-1, was transformed into .DELTA.aroA and plated on the same
three types of media. Cells transformed with pAX472 were able to
grow on M63 media supplemented with phenylalanine, tryptophan, and
tyrosine and LB media but they were not able to grow on M63 alone.
Thus, GDC-1 was not capable of complementing the aroA mutation, and
thus GDC-1 is not EPSP synthase.
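The logic of the complementation test in this example can be summarized as a small decision rule. The sketch below encodes the growth results reported in paragraphs [0088] to [0090]; the data structure and function name are illustrative assumptions, and the rule is a simplification of how such plate results would be interpreted.

```python
# Media in the order used for each growth tuple below.
MEDIA = ("LB", "M63 + Phe/Trp/Tyr", "M63")

# Growth (True) or no growth (False), as reported in the text of Example 7.
OBSERVED = {
    "deltaAroA (no plasmid)": (True, True, False),
    "deltaAroA + pAX482 (E. coli aroA)": (True, True, True),
    "deltaAroA + pAX472 (GDC-1)": (True, True, False),
}


def complements_aroA(growth):
    """Decide whether a plasmid complements the aroA deletion.

    Growth on both permissive media is required as a viability control;
    growth on unsupplemented M63 then indicates complementation.
    """
    lb, supplemented, minimal = growth
    if not (lb and supplemented):
        raise ValueError("no growth on permissive media; result is uninterpretable")
    return minimal


if __name__ == "__main__":
    for strain, growth in OBSERVED.items():
        verdict = "complements" if complements_aroA(growth) else "does not complement"
        print(f"{strain}: {verdict}")
```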
Example 8
GDC-1 is a TPP-Binding Decarboxylase
[0091] The predicted amino acid sequence of GDC-1 was compared to
the non-redundant database of sequences maintained by the National
Center for Biotechnology Information (NCBI), using the BLAST2
algorithm (Altschul et al. (1990) J. Mol. Biol. 215:403-410;
Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402; Gish and
States (1993) Nature Genet. 3:266-272). Comparison of GDC-1 with
public DNA and amino acid databases, such as the non-redundant
database of GenBank, the Swissprot database, and the `pat` database
of GenBank, shows that GDC-1 encodes a novel protein. Results from a
BLAST search of the NCBI nr database are shown in Table 2. The
sequences obtained using the Genbank Accession Nos. provided are
herein incorporated by reference in their entirety. The results of
BLAST searches identified homology between the predicted GDC-1 open
reading frame (SEQ ID NO:3) and several known proteins. The highest
scoring amino acid sequences from this search were aligned with
GDC-1 using the ClustalW algorithm (Higgins et al. (1994) Nucleic
Acids Res. 22:4673-4680) [as incorporated into the ALIGNX module of
the Vector NTI Program Suite, Informax, Inc.]. After alignment
with ClustalW, the percent amino acid identity was assessed. The
protein encoded by GDC-1 has homology to several members of the
fungal pyruvate decarboxylase enzyme family. The highest protein
homology identified is to the product of the Aspergillus oryzae
pyruvate decarboxylase (pdcA) gene. GDC-1 also shares homology with indole-3
pyruvate decarboxylases, found in bacteria such as Salmonella
typhimurium. A similar search of the patent database at NCBI also
identifies proteins with homology to GDC-1, though proteins
identified in this search are less related to GDC-1. The percent
amino acid identity of GDC-1 with members of these protein classes
is shown in Table 3.
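Percent amino acid identity can be computed from a pairwise alignment in more than one way. The following is a minimal sketch of one common convention (identical residues divided by aligned, ungapped columns); it is not necessarily the convention applied by the alignment software named above, and the two short sequences are invented solely for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gapped columns are excluded under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment; not taken from any database record.
    query = "MASINIRV"
    subject = "MATIN-RV"
    print(f"{percent_identity(query, subject):.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which can change the reported value by several percentage points.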
[0092] Further analysis of GDC-1 sequence shows that GDC-1 contains
conserved domains characteristic of proteins that utilize Thiamine
Pyrophosphate (TPP) as a cofactor. These domains are collectively
and singly referred to as a "TPP binding domain". Analysis of GDC-1
sequence shows that amino acids 13-187 of SEQ ID NOS:3, 19, and 21
constitute an N-terminal domain of TPP-binding domain, amino acids
375-547 of SEQ ID NOS:3, 19, and 21 constitute a central domain of
TPP-binding domain, and amino acids 209-348 of SEQ ID NOS:3, 19,
and 21 constitute a C-terminal domain of TPP-binding domain. It is
understood that these amino acid coordinates are only
approximations of the location of such domains as judged by
homology with known TPP binding proteins, and are not limiting to
the invention. An alignment of GDC-1 with other known TPP-binding
proteins is shown in FIG. 3.
TABLE-US-00002 TABLE 2 High scoring open reading frames from BLAST
search of NCBI nr database
Genbank Accession No.              Organism                  Gene Description
gi|4323052|gb|AF098293.1|AF098293  Aspergillus oryzae        pyruvate decarboxylase (pdcA)
gi|2160687|gb|U73194.1|ENU73194    Emericella nidulans       pyruvate decarboxylase (pdcA)
gi|25992751|gb|AF545432.1|         Candida glabrata          pyruvate decarboxylase (PDC)
gi|4115|emb|X55905.1|SCPDC6        Saccharomyces cerevisiae  PDC6 gene for pyruvate decarboxylase
gi|173308|gb|L09727.1|YSKPDC1A     Kluyveromyces marxianus   pyruvate decarboxylase (PDC1)
gi|535343|gb|U13635.1|HUU13635     Hanseniaspora uvarum      pyruvate decarboxylase (PDC)
gi|4113|emb|X15668.1|SCPDC5        Saccharomyces cerevisiae  PDC5 gene for pyruvate decarboxylase (EC4.1.1.1.)
gi|452688|emb|X77316.1|SCPDC1A     Saccharomyces cerevisiae  PDC1
TABLE-US-00003 TABLE 3 Percent identity of GDC-1 to related
proteins from various fungi and bacteria
Organism                  Gene Product                     % Amino Acid Identity
Aspergillus oryzae        Pyruvate decarboxylase           58%
Emericella nidulans       Pyruvate decarboxylase           56%
Candida glabrata          Pyruvate decarboxylase           49%
Kluyveromyces marxianus   Pyruvate decarboxylase           47%
Saccharomyces cerevisiae  Pyruvate decarboxylase PDC1      46%
Saccharomyces cerevisiae  Pyruvate decarboxylase PDC5      47%
Saccharomyces cerevisiae  Pyruvate decarboxylase PDC6      47%
Pichia stipitis           Pyruvate decarboxylase PDC2      45%
Salmonella typhimurium    Indole-3 pyruvate decarboxylase  33%
Neurospora crassa         Pyruvate decarboxylase           28%
Nicotiana tabacum         Pyruvate decarboxylase           28%
Zymomonas mobilis         Pyruvate decarboxylase           27%
Example 9
Engineering GDC-1 for Expression in E. coli
[0093] To express GDC-1 in E. coli, the GDC-1 open reading frame
was engineered into a customized expression vector (pAX481).
pAX481 contains the pBR322
origin of replication, a chloramphenicol acetyl transferase gene
(for selection and maintenance of the plasmid), the lacI gene, the
Ptac promoter and the rrnB transcriptional terminator. The GDC-1
open reading frame was amplified by PCR using a high fidelity DNA
polymerase, as known in the art. The oligonucleotides for PCR
amplification of GDC-1 were designed to place the ATG start site of
the gene at the proper distance from the ribosome binding site of
pAX481.
[0094] The GDC-1 PCR product was cloned into the expression vector
pAX481 and transformed into E. coli XL1 Blue MRF' to yield the
plasmid pAX472. GDC-1 positive clones were identified by standard
methods known in the art. The sequence of pAX472 was confirmed by
DNA sequence analysis as known in the art.
Example 10
Purification of GDC-1 Expressed as a 6.times.His-Tagged Protein in
E. coli
[0095] The GDC-1 coding region (1,728 nucleotides) was amplified by
PCR using ProofStart.TM. DNA polymerase. Oligonucleotides used to
prime PCR were designed to introduce restriction enzyme recognition
sites near the 5' and 3' ends of the resulting PCR product. The
resulting PCR product was digested with BamH I and Sal I. BamH I
cleaved the PCR product at the 5' end, and Sal I cleaved the PCR
product at the 3' end. The digested product was cloned into the
6.times.His-tag expression vector pQE-30 (Qiagen), prepared by
digestion with BamH I and Sal I. The resulting clone, pAX623,
contained GDC-1 in the same translational reading frame as, and
immediately C-terminal to, the 6.times.His tag of pQE-30. General
strategies for generating such clones, and for expressing proteins
containing 6.times.His-tag are well known in the art.
[0096] The ability of this clone to confer glyphosate resistance
was confirmed by plating cells containing pAX623 onto M63 media
containing 5 mM glyphosate. pAX623-containing cells gave rise to
colonies, whereas cells containing the vector alone gave no colonies.
[0097] GDC-1 protein from pAX623-containing cells was isolated by
expression of GDC-1-6.times.His-tagged protein in E. coli, and the
resulting protein purified using Ni-NTA Superflow Resin (Qiagen) as
per manufacturer's instructions.
Example 11
Assay of GDC-1 Pyruvate Decarboxylase Activity
[0098] 100 ng of GDC-1 protein was tested for activity in a
standard pyruvate decarboxylase assay (Gounaris et al. (1971) J. of
Biol. Chem. 246:1302-1309). This assay is a coupled reaction
wherein, in the first step, pyruvate decarboxylase (PDC) converts
pyruvate to acetaldehyde and CO.sub.2. The acetaldehyde produced in
this reaction is a substrate for alcohol dehydrogenase, which
converts acetaldehyde and .beta.-NADH to ethanol and .beta.-NAD.
Thus, PDC activity is detected by virtue of utilization of
.beta.-NADH, measured as a decrease in absorbance at 340 nm in a
spectrophotometer. GDC-1 as well as a control enzyme (pyruvate
decarboxylase, Sigma) were tested in this assay. GDC-1 showed
activity as a pyruvate decarboxylase, and the reaction rate
correlated with the concentration of pyruvate in the assay.
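The absorbance change in such a coupled assay can be converted to a reaction rate with the Beer-Lambert law. The sketch below assumes the commonly used molar extinction coefficient of NADH at 340 nm (6220 M^-1 cm^-1), a 1 cm path length, and one NADH oxidized per pyruvate decarboxylated; the absorbance slope and assay volume are hypothetical inputs, not measurements from this example.

```python
# Molar extinction coefficient of NADH at 340 nm (M^-1 cm^-1); a commonly
# used literature value, assumed here rather than taken from this example.
EPSILON_NADH_340 = 6220.0


def nadh_consumption_rate(delta_a340_per_min, path_cm=1.0, volume_ml=1.0):
    """Return micromoles of NADH oxidized per minute.

    delta_a340_per_min is the slope of absorbance versus time and is
    negative when NADH is being consumed.
    """
    molar_per_min = -delta_a340_per_min / (EPSILON_NADH_340 * path_cm)
    return molar_per_min * (volume_ml / 1000.0) * 1e6


if __name__ == "__main__":
    # Hypothetical slope of -0.0622 absorbance units per minute in a 1 ml assay.
    rate = nadh_consumption_rate(-0.0622)
    print(f"{rate:.4f} micromol NADH oxidized per minute")
```

Because the coupling is assumed to be stoichiometric, the same figure estimates micromoles of pyruvate decarboxylated per minute, provided alcohol dehydrogenase is not rate limiting.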
Example 12
Engineering GDC-1 for Plant Transformation
[0099] The GDC-1 open reading frame (ORF) was amplified by PCR from
a full-length cDNA template. HindIII restriction sites were added
to each end of the ORF during PCR. Additionally, the nucleotide
sequence ACC was added immediately 5' to the start codon of the
gene to increase translational efficiency (Kozak (1987) Nucleic
Acids Research 15:8125-8148; Joshi (1987) Nucleic Acids Research
15:6643-6653). The PCR product was cloned and sequenced, using
techniques well known in the art, to ensure that no mutations were
introduced during PCR.
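The primer design strategy described above can be sketched as follows. The actual oligonucleotide sequences are not given in this example, so the primers produced here are hypothetical: the sketch assumes 21 nucleotides of template-matching sequence at each end of the coding region (copied from the sequence listing) and omits any extra 5' bases that might be added to aid restriction digestion.

```python
HINDIII_SITE = "AAGCTT"
KOZAK_PREFIX = "ACC"  # placed immediately 5' of the ATG, as described above

# First and last 21 nt of the GDC-1 coding region, from the sequence listing.
ORF_5PRIME = "ATGGCCAGCATCAACATCAGG"
ORF_3PRIME = "AACAATGCCAAGACAGAGTAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(orf_5prime, orf_3prime):
    """Build hypothetical forward and reverse primers, written 5' to 3'."""
    forward = HINDIII_SITE + KOZAK_PREFIX + orf_5prime
    reverse = HINDIII_SITE + reverse_complement(orf_3prime)
    return forward, reverse


if __name__ == "__main__":
    forward, reverse = design_primers(ORF_5PRIME, ORF_3PRIME)
    print("forward:", forward)
    print("reverse:", reverse)
```

Because the coding region itself contains a HindIII site, a complete digest of the resulting product would split the ORF, which is why a partial digest is used in the next step.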
[0100] The plasmid containing the GDC-1 PCR product was partially
digested with Hind III and the 1.7 kb Hind III fragment containing
the intact ORF was isolated. (GDC-1 contains an internal Hind III
site in addition to the sites added by PCR.) This fragment was
cloned into the Hind III site of plasmid pAX200, a plant expression
vector containing the rice actin promoter (McElroy et al. (1991)
Mol. Gen. Genet. 231:150-160) and the PinII terminator (An et al.
(1989) The Plant Cell 1:115-122). The promoter--gene--terminator
fragment from this intermediate plasmid was subcloned into Xho I
site of plasmid pSB11 (Japan Tobacco, Inc.) to form the plasmid
pAX810. pAX810 is organized such that the 3.45 kb DNA fragment
containing the promoter--GDC-1--terminator construct may be excised
from pAX810 by double digestion with KpnI and XbaI for
transformation into plants using aerosol beam injection. The
structure of pAX810 was verified by restriction digests and gel
electrophoresis and by sequencing across the various cloning
junctions.
[0101] Plasmid pAX810 was mobilized into Agrobacterium tumefaciens
strain LBA4404 which also harbored the plasmid pSB1 (Japan Tobacco,
Inc.), using triparental mating procedures well known in the art,
and plated on media containing spectinomycin. Plasmid pAX810
carries spectinomycin resistance but is a narrow host range plasmid
and cannot replicate in Agrobacterium. Spectinomycin resistant
colonies arise when pAX810 integrates into the broad host range
plasmid pSB1 through homologous recombination. The cointegrate
product of pSB1 and pAX810 was named pAX204 and was verified by
Southern hybridization (data not shown). The Agrobacterium strain
harboring pAX204 was used to transform maize by the PureIntro
method (Japan Tobacco).
Example 13
Transformation of GDC-1 into Plant Cells
[0102] Maize ears are collected 8-12 days after pollination.
Embryos are isolated from the ears, and those embryos 0.8-1.5 mm in
size are used for transformation. Embryos are plated scutellum
side-up on a suitable incubation media, such as DN62A5S media (3.98
g/L N6 Salts; 1 mL/L (of 1000.times. Stock) N6 Vitamins; 800 mg/L
L-Asparagine; 100 mg/L Myo-inositol; 1.4 g/L L-Proline; 100 mg/L
Casaminoacids; 50 g/L sucrose; 1 mL/L (of 1 mg/mL Stock) 2,4-D).
However, media and salts other than DN62A5S are suitable and are
known in the art. Embryos are incubated overnight at 25.degree. C.
in the dark.
[0103] The resulting explants are transferred to mesh squares
(30-40 per plate), transferred onto osmotic media for 30-45
minutes, and then transferred to a beaming plate (see, for example,
PCT Publication No. WO/0138514 and U.S. Pat. No. 5,240,842).
[0104] DNA constructs designed to express GDC-1 in plant cells are
accelerated into plant tissue using an aerosol beam accelerator,
using conditions essentially as described in PCT Publication No.
WO/0138514. After beaming, embryos are incubated for 30 min on
osmotic media, and placed onto incubation media overnight at
25.degree. C. in the dark. To avoid unduly damaging beamed
explants, they are incubated for at least 24 hours prior to
transfer to recovery media. Embryos are then spread onto recovery
period media for 5 days at 25.degree. C. in the dark, then
transferred to selection media. Explants are incubated in
selection media for up to eight weeks, depending on the nature and
characteristics of the particular selection utilized. After the
selection period, the resulting callus is transferred to embryo
maturation media, until the formation of mature somatic embryos is
observed. The resulting mature somatic embryos are then placed
under low light, and the process of regeneration is initiated by
methods known in the art. The resulting shoots are allowed to root
on rooting media, and the resulting plants are transferred to
nursery pots and propagated as transgenic plants.
Materials
TABLE-US-00004 [0105] DN62A5S Media
Components                                    per liter                   Source
Chu's N6 Basal Salt Mixture (Prod. No. C 416) 3.98 g/L                    Phytotechnology Labs
Chu's N6 Vitamin Solution (Prod. No. C 149)   1 mL/L (of 1000x Stock)     Phytotechnology Labs
L-Asparagine                                  800 mg/L                    Phytotechnology Labs
Myo-inositol                                  100 mg/L                    Sigma
L-Proline                                     1.4 g/L                     Phytotechnology Labs
Casaminoacids                                 100 mg/L                    Fisher Scientific
Sucrose                                       50 g/L                      Phytotechnology Labs
2,4-D (Prod. No. D-7299)                      1 mL/L (of 1 mg/mL Stock)   Sigma
[0106] Adjust the pH of the solution to 5.8 with 1N KOH/1N
KCl, add Gelrite (Sigma) to 3 g/L, and autoclave. After cooling to
50.degree. C., add 2 ml/L of a 5 mg/ml stock solution of Silver
Nitrate (Phytotechnology Labs). Recipe yields about 20 plates.
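The stock additions in this recipe translate to final concentrations by simple dilution arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it ignores the small change in total volume caused by adding the stock solutions.

```python
def final_mg_per_liter(stock_mg_per_ml, added_ml_per_liter):
    """Final concentration in mg/L, ignoring the added stock volume."""
    return stock_mg_per_ml * added_ml_per_liter


# Stock concentrations (mg/ml) and additions (ml/L) as stated in the recipe.
ADDITIONS = {
    "Silver nitrate": (5.0, 2.0),
    "2,4-D": (1.0, 1.0),
}

if __name__ == "__main__":
    for name, (stock, added) in ADDITIONS.items():
        print(f"{name}: {final_mg_per_liter(stock, added):g} mg/L final")
```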
Example 14
Transformation of GDC-1 into Plant Cells by Agrobacterium-Mediated
Transformation
[0107] Ears are collected 8-12 days after pollination. Embryos are
isolated from the ears, and those embryos 0.8-1.5 mm in size are
used for transformation. Embryos are plated scutellum side-up on a
suitable incubation media, and incubated overnight at 25.degree. C.
in the dark. However, it is not necessary per se to incubate the
embryos overnight. Embryos are contacted with an Agrobacterium
strain containing the appropriate vectors for Ti plasmid mediated
transfer for 5-10 min, and then plated onto co-cultivation media
for 3 days (25.degree. C. in the dark). After co-cultivation,
explants are transferred to recovery period media for five days (at
25.degree. C. in the dark). Explants are incubated in selection
media for up to eight weeks, depending on the nature and
characteristics of the particular selection utilized. After the
selection period, the resulting callus is transferred to embryo
maturation media, until the formation of mature somatic embryos is
observed. The resulting mature somatic embryos are then placed
under low light, and the process of regeneration is initiated as
known in the art. The resulting shoots are allowed to root on
rooting media, and the resulting plants are transferred to nursery
pots and propagated as transgenic plants.
[0108] All publications and patent applications mentioned in the
specification are indicative of the level of skill of those skilled
in the art to which this invention pertains. All publications and
patent applications are herein incorporated by reference to the
same extent as if each individual publication or patent application
was specifically and individually indicated to be incorporated by
reference.
[0109] Although the foregoing invention has been described in some
detail by way of illustration and example for purposes of clarity
of understanding, it will be obvious that certain changes and
modifications may be practiced within the scope of the appended
claims.
Sequence CWU 1
1
2112210DNAUnknownCDS(224)...(1951)Fungal isolate from soil sample
1acgcggggtg cccacggaca acaattccct taggattatc tcctgtattg aatacactct
60actttgcaac tttacctatt attcgacttt cttttagagg agcagcattg tcatcattac
120ctgcccctcc atctgatacc taccttacat tgtcgccaac acacctataa
gccataatat 180accgactcaa agcaaaccac gcccattgtt tgattgttta atc atg
gcc agc atc 235Met Ala Ser Ile1aac atc agg gtg cag aat ctc gag caa
ccc atg gac gtt gcc gag tat 283Asn Ile Arg Val Gln Asn Leu Glu Gln
Pro Met Asp Val Ala Glu Tyr5 10 15 20ctt ttt cgg cgt ctc cac gaa
atc ggc att cgc tcc atc cac ggt ctt 331Leu Phe Arg Arg Leu His Glu
Ile Gly Ile Arg Ser Ile His Gly Leu25 30 35cca ggc gat tac aac ctt
ctt gcc ctc gac tat ttg cca tca tgt ggc 379Pro Gly Asp Tyr Asn Leu
Leu Ala Leu Asp Tyr Leu Pro Ser Cys Gly40 45 50ctg aga tgg gtt ggc
agc gtc aac gaa ctc aat gct gct tat gct gct 427Leu Arg Trp Val Gly
Ser Val Asn Glu Leu Asn Ala Ala Tyr Ala Ala55 60 65gat ggc tat gcc
cgc gtc aag cag atg gga gct ctc atc acc act ttt 475Asp Gly Tyr Ala
Arg Val Lys Gln Met Gly Ala Leu Ile Thr Thr Phe70 75 80gga gtg gga
gag ctc tca gcc atc aat ggc gtt gcc ggt gcc ttt tcg 523Gly Val Gly
Glu Leu Ser Ala Ile Asn Gly Val Ala Gly Ala Phe Ser85 90 95 100gaa
cac gtc cca gtc gtt cac att gtt ggc tgc cct tcc act gtc tcg 571Glu
His Val Pro Val Val His Ile Val Gly Cys Pro Ser Thr Val Ser105 110
115cag cga aac ggc atg ctc ctc cac cac acg ctt gga aac ggc gac ttc
619Gln Arg Asn Gly Met Leu Leu His His Thr Leu Gly Asn Gly Asp
Phe120 125 130aac atc ttt gcc aac atg agc gct caa atc tct tgc gaa
gtg gcc aag 667Asn Ile Phe Ala Asn Met Ser Ala Gln Ile Ser Cys Glu
Val Ala Lys135 140 145ctc acc aac cct gcc gaa att gcg acc cag atc
gac cat gcc ctc cgc 715Leu Thr Asn Pro Ala Glu Ile Ala Thr Gln Ile
Asp His Ala Leu Arg150 155 160gtt tgc ttc att cgt tct cgg ccc gtc
tac atc atg ctt ccc acc gat 763Val Cys Phe Ile Arg Ser Arg Pro Val
Tyr Ile Met Leu Pro Thr Asp165 170 175 180atg gtc cag gcc aaa gta
gaa ggt gcc aga ctc aag gaa cca att gac 811Met Val Gln Ala Lys Val
Glu Gly Ala Arg Leu Lys Glu Pro Ile Asp185 190 195ttg tcg gag cct
cca aat gat ccc gag aaa gaa gca tac gtc gtt gac 859Leu Ser Glu Pro
Pro Asn Asp Pro Glu Lys Glu Ala Tyr Val Val Asp200 205 210gtt gtc
ctc aag tay ctc cgt gct gca aag aac ccc gtc atc ctt gtc 907Val Val
Leu Lys Tyr Leu Arg Ala Ala Lys Asn Pro Val Ile Leu Val215 220
225gat gct tgt gct atc cgt cat cgt gtt ctt gat gag gtt cat gat ctc
955Asp Ala Cys Ala Ile Arg His Arg Val Leu Asp Glu Val His Asp
Leu230 235 240atc gaa aag aca aac ctc cct gtc ttt gtc act cct atg
ggc aaa ggt 1003Ile Glu Lys Thr Asn Leu Pro Val Phe Val Thr Pro Met
Gly Lys Gly245 250 255 260gct gtt aac gaa gaa cac ccg aca tat ggt
ggt gtc tat gcc ggt gac 1051Ala Val Asn Glu Glu His Pro Thr Tyr Gly
Gly Val Tyr Ala Gly Asp265 270 275ggc tca cat ccg cct caa gtt aag
gac atg gtt gag tct tct gat ttg 1099Gly Ser His Pro Pro Gln Val Lys
Asp Met Val Glu Ser Ser Asp Leu280 285 290ata ttg aca atc ggt gct
ctc aag agc gac ttc aac act gct ggc ttc 1147Ile Leu Thr Ile Gly Ala
Leu Lys Ser Asp Phe Asn Thr Ala Gly Phe295 300 305tct tac cgt acc
tca cag ctg aac acg att gat cta cac agc gac cac 1195Ser Tyr Arg Thr
Ser Gln Leu Asn Thr Ile Asp Leu His Ser Asp His310 315 320tgc att
gtc aaa tac tcg aca tat cca ggt gtc cag atg agg ggt gtg 1243Cys Ile
Val Lys Tyr Ser Thr Tyr Pro Gly Val Gln Met Arg Gly Val325 330 335
340ctg cga caa gtg att aag cag ctc gat gca tct gag atc aac gct cag
1291Leu Arg Gln Val Ile Lys Gln Leu Asp Ala Ser Glu Ile Asn Ala
Gln345 350 355cca gcg cca gtc gtc gag aat gaa gtt gcc aaa aac cga
gat aac tca 1339Pro Ala Pro Val Val Glu Asn Glu Val Ala Lys Asn Arg
Asp Asn Ser360 365 370ccc gtc att aca caa gct ttc ttc tgg ccg cgc
gtg gga gag ttc ctg 1387Pro Val Ile Thr Gln Ala Phe Phe Trp Pro Arg
Val Gly Glu Phe Leu375 380 385aag aag aac gac atc gtc att acc gag
act gga aca gcc aac ttt ggc 1435Lys Lys Asn Asp Ile Val Ile Thr Glu
Thr Gly Thr Ala Asn Phe Gly390 395 400atc tgg gat act aag ttt ccc
tct ggc gtt act gcg ctt tct cag gtc 1483Ile Trp Asp Thr Lys Phe Pro
Ser Gly Val Thr Ala Leu Ser Gln Val405 410 415 420ctt tgg gga agc
att ggt tgg tcc gtt ggt gcc tgc caa gga gcc gtt 1531Leu Trp Gly Ser
Ile Gly Trp Ser Val Gly Ala Cys Gln Gly Ala Val425 430 435ctt gca
gcc gcc gat gac aac agc gat cgc aga act atc ctc ttt gtt 1579Leu Ala
Ala Ala Asp Asp Asn Ser Asp Arg Arg Thr Ile Leu Phe Val440 445
450ggt gat ggc tca ttc cag ctc act gct caa gaa ttg agc aca atg att
1627Gly Asp Gly Ser Phe Gln Leu Thr Ala Gln Glu Leu Ser Thr Met
Ile455 460 465cgt ctc aag ctg aag ccc atc atc ttt gtc atc tgc aac
gat ggc ttt 1675Arg Leu Lys Leu Lys Pro Ile Ile Phe Val Ile Cys Asn
Asp Gly Phe470 475 480acc att gaa cga ttc att cac ggc atg gaa gcc
gag tac aac gac atc 1723Thr Ile Glu Arg Phe Ile His Gly Met Glu Ala
Glu Tyr Asn Asp Ile485 490 495 500gca aat tgg gac ttc aag gct ctg
gtt gac gtc ttt ggc ggc tct aag 1771Ala Asn Trp Asp Phe Lys Ala Leu
Val Asp Val Phe Gly Gly Ser Lys505 510 515acg gcc aag aag ttc gcc
gtc aag acc aag gac gag ctg gac agc ctt 1819Thr Ala Lys Lys Phe Ala
Val Lys Thr Lys Asp Glu Leu Asp Ser Leu520 525 530ctc aca gac cct
acc ttt aac gcc gca gaa tgc ctc cag ttt gtc gag 1867Leu Thr Asp Pro
Thr Phe Asn Ala Ala Glu Cys Leu Gln Phe Val Glu535 540 545cta tat
atg ccc aaa gaa gat gct cct cga gca ttg atc atg act gca 1915Leu Tyr
Met Pro Lys Glu Asp Ala Pro Arg Ala Leu Ile Met Thr Ala550 555
560gaa gct agc gcg agg aac aat gcc aag aca gag taa agtggactgt
1961Glu Ala Ser Ala Arg Asn Asn Ala Lys Thr Glu *565 570
575catgaaggcc gatttaccac ctcataaatt gtaatagacc tgatacacat
agatcaaggc 2021aggtaccgat cattaatcaa gcaggtttgg atggggaagg
attttgaaaa tgaggaaacg 2081atgggatgat atttggaata actggccatt
attttgagta cttataaaca aatttgaagt 2141tcaatttttt ttcaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2201aaaaaaaaa 2210

SEQ ID NO: 2; length: 1725; type: DNA; organism: Unknown; feature: CDS (1)...(1725); other information: Fungal isolate from soil sample
atg gcc agc atc aac atc agg gtg cag aat ctc gag caa ccc atg gac
48Met Ala Ser Ile Asn Ile Arg Val Gln Asn Leu Glu Gln Pro Met Asp1
5 10 15gtt gcc gag tat ctt ttt cgg cgt ctc cac gaa atc ggc att cgc
tcc 96Val Ala Glu Tyr Leu Phe Arg Arg Leu His Glu Ile Gly Ile Arg
Ser20 25 30atc cac ggt ctt cca ggc gat tac aac ctt ctt gcc ctc gac
tat ttg 144Ile His Gly Leu Pro Gly Asp Tyr Asn Leu Leu Ala Leu Asp
Tyr Leu35 40 45cca tca tgt ggc ctg aga tgg gtt ggc agc gtc aac gaa
ctc aat gct 192Pro Ser Cys Gly Leu Arg Trp Val Gly Ser Val Asn Glu
Leu Asn Ala50 55 60gct tat gct gct gat ggc tat gcc cgc gtc aag cag
atg gga gct ctc 240Ala Tyr Ala Ala Asp Gly Tyr Ala Arg Val Lys Gln
Met Gly Ala Leu65 70 75 80atc acc act ttt gga gtg gga gag ctc tca
gcc atc aat ggc gtt gcc 288Ile Thr Thr Phe Gly Val Gly Glu Leu Ser
Ala Ile Asn Gly Val Ala85 90 95ggt gcc ttt tcg gaa cac gtc cca gtc
gtt cac att gtt ggc tgc cct 336Gly Ala Phe Ser Glu His Val Pro Val
Val His Ile Val Gly Cys Pro100 105 110tcc act gtc tcg cag cga aac
ggc atg ctc ctc cac cac acg ctt gga 384Ser Thr Val Ser Gln Arg Asn
Gly Met Leu Leu His His Thr Leu Gly115 120 125aac ggc gac ttc aac
atc ttt gcc aac atg agc gct caa atc tct tgc 432Asn Gly Asp Phe Asn
Ile Phe Ala Asn Met Ser Ala Gln Ile Ser Cys130 135 140gaa gtg gcc
aag ctc acc aac cct gcc gaa att gcg acc cag atc gac 480Glu Val Ala
Lys Leu Thr Asn Pro Ala Glu Ile Ala Thr Gln Ile Asp145 150 155
160cat gcc ctc cgc gtt tgc ttc att cgt tct cgg ccc gtc tac atc atg
528His Ala Leu Arg Val Cys Phe Ile Arg Ser Arg Pro Val Tyr Ile
Met165 170 175ctt ccc acc gat atg gtc cag gcc aaa gta gaa ggt gcc
aga ctc aag 576Leu Pro Thr Asp Met Val Gln Ala Lys Val Glu Gly Ala
Arg Leu Lys180 185 190gaa cca att gac ttg tcg gag cct cca aat gat
ccc gag aaa gaa gca 624Glu Pro Ile Asp Leu Ser Glu Pro Pro Asn Asp
Pro Glu Lys Glu Ala195 200 205tac gtc gtt gac gtt gtc ctc aag tay
ctc cgt gct gca aag aac ccc 672Tyr Val Val Asp Val Val Leu Lys Tyr
Leu Arg Ala Ala Lys Asn Pro210 215 220gtc atc ctt gtc gat gct tgt
gct atc cgt cat cgt gtt ctt gat gag 720Val Ile Leu Val Asp Ala Cys
Ala Ile Arg His Arg Val Leu Asp Glu225 230 235 240gtt cat gat ctc
atc gaa aag aca aac ctc cct gtc ttt gtc act cct 768Val His Asp Leu
Ile Glu Lys Thr Asn Leu Pro Val Phe Val Thr Pro245 250 255atg ggc
aaa ggt gct gtt aac gaa gaa cac ccg aca tat ggt ggt gtc 816Met Gly
Lys Gly Ala Val Asn Glu Glu His Pro Thr Tyr Gly Gly Val260 265
270tat gcc ggt gac ggc tca cat ccg cct caa gtt aag gac atg gtt gag
864Tyr Ala Gly Asp Gly Ser His Pro Pro Gln Val Lys Asp Met Val
Glu275 280 285tct tct gat ttg ata ttg aca atc ggt gct ctc aag agc
gac ttc aac 912Ser Ser Asp Leu Ile Leu Thr Ile Gly Ala Leu Lys Ser
Asp Phe Asn290 295 300act gct ggc ttc tct tac cgt acc tca cag ctg
aac acg att gat cta 960Thr Ala Gly Phe Ser Tyr Arg Thr Ser Gln Leu
Asn Thr Ile Asp Leu305 310 315 320cac agc gac cac tgc att gtc aaa
tac tcg aca tat cca ggt gtc cag 1008His Ser Asp His Cys Ile Val Lys
Tyr Ser Thr Tyr Pro Gly Val Gln325 330 335atg agg ggt gtg ctg cga
caa gtg att aag cag ctc gat gca tct gag 1056Met Arg Gly Val Leu Arg
Gln Val Ile Lys Gln Leu Asp Ala Ser Glu340 345 350atc aac gct cag
cca gcg cca gtc gtc gag aat gaa gtt gcc aaa aac 1104Ile Asn Ala Gln
Pro Ala Pro Val Val Glu Asn Glu Val Ala Lys Asn355 360 365cga gat
aac tca ccc gtc att aca caa gct ttc ttc tgg ccg cgc gtg 1152Arg Asp
Asn Ser Pro Val Ile Thr Gln Ala Phe Phe Trp Pro Arg Val370 375
380gga gag ttc ctg aag aag aac gac atc gtc att acc gag act gga aca
1200Gly Glu Phe Leu Lys Lys Asn Asp Ile Val Ile Thr Glu Thr Gly
Thr385 390 395 400gcc aac ttt ggc atc tgg gat act aag ttt ccc tct
ggc gtt act gcg 1248Ala Asn Phe Gly Ile Trp Asp Thr Lys Phe Pro Ser
Gly Val Thr Ala405 410 415ctt tct cag gtc ctt tgg gga agc att ggt
tgg tcc gtt ggt gcc tgc 1296Leu Ser Gln Val Leu Trp Gly Ser Ile Gly
Trp Ser Val Gly Ala Cys420 425 430caa gga gcc gtt ctt gca gcc gcc
gat gac aac agc gat cgc aga act 1344Gln Gly Ala Val Leu Ala Ala Ala
Asp Asp Asn Ser Asp Arg Arg Thr435 440 445atc ctc ttt gtt ggt gat
ggc tca ttc cag ctc act gct caa gaa ttg 1392Ile Leu Phe Val Gly Asp
Gly Ser Phe Gln Leu Thr Ala Gln Glu Leu450 455 460agc aca atg att
cgt ctc aag ctg aag ccc atc atc ttt gtc atc tgc 1440Ser Thr Met Ile
Arg Leu Lys Leu Lys Pro Ile Ile Phe Val Ile Cys465 470 475 480aac
gat ggc ttt acc att gaa cga ttc att cac ggc atg gaa gcc gag 1488Asn
Asp Gly Phe Thr Ile Glu Arg Phe Ile His Gly Met Glu Ala Glu485 490
495tac aac gac atc gca aat tgg gac ttc aag gct ctg gtt gac gtc ttt
1536Tyr Asn Asp Ile Ala Asn Trp Asp Phe Lys Ala Leu Val Asp Val
Phe500 505 510ggc ggc tct aag acg gcc aag aag ttc gcc gtc aag acc
aag gac gag 1584Gly Gly Ser Lys Thr Ala Lys Lys Phe Ala Val Lys Thr
Lys Asp Glu515 520 525ctg gac agc ctt ctc aca gac cct acc ttt aac
gcc gca gaa tgc ctc 1632Leu Asp Ser Leu Leu Thr Asp Pro Thr Phe Asn
Ala Ala Glu Cys Leu530 535 540cag ttt gtc gag cta tat atg ccc aaa
gaa gat gct cct cga gca ttg 1680Gln Phe Val Glu Leu Tyr Met Pro Lys
Glu Asp Ala Pro Arg Ala Leu545 550 555 560atc atg act gca gaa gct
agc gcg agg aac aat gcc aag aca gag 1725Ile Met Thr Ala Glu Ala Ser
Ala Arg Asn Asn Ala Lys Thr Glu565 570 575

SEQ ID NO: 3; length: 575; type: PRT; organism: Unknown; other information: Fungal isolate from soil sample
Met Ala Ser Ile Asn Ile Arg Val Gln Asn
Leu Glu Gln Pro Met Asp1 5 10 15Val Ala Glu Tyr Leu Phe Arg Arg Leu
His Glu Ile Gly Ile Arg Ser20 25 30Ile His Gly Leu Pro Gly Asp Tyr
Asn Leu Leu Ala Leu Asp Tyr Leu35 40 45Pro Ser Cys Gly Leu Arg Trp
Val Gly Ser Val Asn Glu Leu Asn Ala50 55 60Ala Tyr Ala Ala Asp Gly
Tyr Ala Arg Val Lys Gln Met Gly Ala Leu65 70 75 80Ile Thr Thr Phe
Gly Val Gly Glu Leu Ser Ala Ile Asn Gly Val Ala85 90 95Gly Ala Phe
Ser Glu His Val Pro Val Val His Ile Val Gly Cys Pro100 105 110Ser
Thr Val Ser Gln Arg Asn Gly Met Leu Leu His His Thr Leu Gly115 120
125Asn Gly Asp Phe Asn Ile Phe Ala Asn Met Ser Ala Gln Ile Ser
Cys130 135 140Glu Val Ala Lys Leu Thr Asn Pro Ala Glu Ile Ala Thr
Gln Ile Asp145 150 155 160His Ala Leu Arg Val Cys Phe Ile Arg Ser
Arg Pro Val Tyr Ile Met165 170 175Leu Pro Thr Asp Met Val Gln Ala
Lys Val Glu Gly Ala Arg Leu Lys180 185 190Glu Pro Ile Asp Leu Ser
Glu Pro Pro Asn Asp Pro Glu Lys Glu Ala195 200 205Tyr Val Val Asp
Val Val Leu Lys Tyr Leu Arg Ala Ala Lys Asn Pro210 215 220Val Ile
Leu Val Asp Ala Cys Ala Ile Arg His Arg Val Leu Asp Glu225 230 235
240Val His Asp Leu Ile Glu Lys Thr Asn Leu Pro Val Phe Val Thr
Pro245 250 255Met Gly Lys Gly Ala Val Asn Glu Glu His Pro Thr Tyr
Gly Gly Val260 265 270Tyr Ala Gly Asp Gly Ser His Pro Pro Gln Val
Lys Asp Met Val Glu275 280 285Ser Ser Asp Leu Ile Leu Thr Ile Gly
Ala Leu Lys Ser Asp Phe Asn290 295 300Thr Ala Gly Phe Ser Tyr Arg
Thr Ser Gln Leu Asn Thr Ile Asp Leu305 310 315 320His Ser Asp His
Cys Ile Val Lys Tyr Ser Thr Tyr Pro Gly Val Gln325 330 335Met Arg
Gly Val Leu Arg Gln Val Ile Lys Gln Leu Asp Ala Ser Glu340 345
350Ile Asn Ala Gln Pro Ala Pro Val Val Glu Asn Glu Val Ala Lys
Asn355 360 365Arg Asp Asn Ser Pro Val Ile Thr Gln Ala Phe Phe Trp
Pro Arg Val370 375 380Gly Glu Phe Leu Lys Lys Asn Asp Ile Val Ile
Thr Glu Thr Gly Thr385 390 395 400Ala Asn Phe Gly Ile Trp Asp Thr
Lys Phe Pro Ser Gly Val Thr Ala405 410 415Leu Ser Gln Val Leu Trp
Gly Ser Ile Gly Trp Ser Val Gly Ala Cys420 425 430Gln Gly Ala Val
Leu Ala Ala Ala Asp Asp Asn Ser Asp Arg Arg Thr435 440 445Ile Leu
Phe Val Gly Asp Gly Ser Phe Gln Leu Thr Ala Gln Glu Leu450 455
460Ser Thr Met Ile Arg Leu Lys Leu Lys Pro Ile Ile Phe Val Ile
Cys465 470 475 480Asn Asp Gly Phe Thr Ile Glu Arg Phe Ile His Gly
Met Glu Ala Glu485 490 495Tyr Asn Asp Ile Ala Asn Trp Asp Phe Lys
Ala Leu Val Asp Val Phe500 505 510Gly Gly Ser Lys Thr Ala Lys Lys
Phe Ala Val Lys Thr Lys Asp Glu515 520 525Leu Asp Ser Leu Leu Thr
Asp Pro Thr Phe Asn Ala Ala Glu Cys Leu530 535 540Gln Phe Val Glu
Leu Tyr Met Pro Lys Glu Asp Ala Pro Arg Ala Leu545 550 555 560Ile
Met Thr Ala Glu Ala Ser Ala Arg Asn Asn Ala Lys Thr Glu565 570 575

SEQ ID NO: 4; length: 835; type: DNA; organism: Unknown; feature: CDS (3)...(596); other information: Fungal isolate from soil sample
ct
ttc ttc tgg ccg cgc gtg gga gag ttc ctg aag aag aac gac atc 47Phe
Phe Trp Pro Arg Val Gly Glu Phe Leu Lys Lys Asn Asp Ile1 5 10 15gtc
att acc gag act gga aca gcc aac ttt ggc atc tgg gat act aag 95Val
Ile Thr Glu Thr Gly Thr Ala Asn
Phe Gly Ile Trp Asp Thr Lys20 25 30ttt ccc tct ggc gtt act gcg ctt
tct cag gtc ctt tgg gga agc att 143Phe Pro Ser Gly Val Thr Ala Leu
Ser Gln Val Leu Trp Gly Ser Ile35 40 45ggt tgg tcc gtt ggt gcc tgc
caa gga gcc gtt ctt gca gcc gcc gat 191Gly Trp Ser Val Gly Ala Cys
Gln Gly Ala Val Leu Ala Ala Ala Asp50 55 60gac aac agc gat cgc aga
act atc ctc ttt gtt ggt gat ggc tca ttc 239Asp Asn Ser Asp Arg Arg
Thr Ile Leu Phe Val Gly Asp Gly Ser Phe65 70 75cag ctc act gct caa
gaa ttg agc aca atg att cgt ctc aag ctg aag 287Gln Leu Thr Ala Gln
Glu Leu Ser Thr Met Ile Arg Leu Lys Leu Lys80 85 90 95ccc atc atc
ttt gtc atc tgc aac gat ggc ttt acc att gaa cga ttc 335Pro Ile Ile
Phe Val Ile Cys Asn Asp Gly Phe Thr Ile Glu Arg Phe100 105 110att
cac ggc atg gaa gcc gag tac aac gac atc gca aat tgg gac ttc 383Ile
His Gly Met Glu Ala Glu Tyr Asn Asp Ile Ala Asn Trp Asp Phe115 120
125aag gct ctg gtt gac gtc ttt ggc ggc tct aag acg gcc aag aag ttc
431Lys Ala Leu Val Asp Val Phe Gly Gly Ser Lys Thr Ala Lys Lys
Phe130 135 140gcc gtc aag acc aag gac gag ctg gac agc ctt ctc aca
gac cct acc 479Ala Val Lys Thr Lys Asp Glu Leu Asp Ser Leu Leu Thr
Asp Pro Thr145 150 155ttt aac gcc gca gaa tgc ctc cag ttt gtc gag
cta tat atg ccc aaa 527Phe Asn Ala Ala Glu Cys Leu Gln Phe Val Glu
Leu Tyr Met Pro Lys160 165 170 175gaa gat gct cct cga gca ttg atc
atg act gca gaa gct agc gcg agg 575Glu Asp Ala Pro Arg Ala Leu Ile
Met Thr Ala Glu Ala Ser Ala Arg180 185 190aac aat gcc aag aca gag
taa agtggactgt catgaaggcc gatttaccac 626Asn Asn Ala Lys Thr Glu
*195ctcataaatt gtaatagacc tgatacacat agatcaaggc aggtaccgat
cattaatcaa 686gcaggtttgg atggggaagg attttgaaaa tgaggaaacg
atgggatgat atttggaata 746actggccatt attttgagta cttataaaca
aatttgaagt tcaatttttt ttcaaaaaaa 806aaaaaaaaaa aaaaaaaaaa aaaaaaaaa 835

SEQ ID NO: 5; length: 591; type: DNA; organism: Unknown; feature: CDS (1)...(591); other information: Fungal isolate from soil sample
ttc
ttc tgg ccg cgc gtg gga gag ttc ctg aag aag aac gac atc gtc 48Phe
Phe Trp Pro Arg Val Gly Glu Phe Leu Lys Lys Asn Asp Ile Val1 5 10
15att acc gag act gga aca gcc aac ttt ggc atc tgg gat act aag ttt
96Ile Thr Glu Thr Gly Thr Ala Asn Phe Gly Ile Trp Asp Thr Lys Phe20
25 30ccc tct ggc gtt act gcg ctt tct cag gtc ctt tgg gga agc att
ggt 144Pro Ser Gly Val Thr Ala Leu Ser Gln Val Leu Trp Gly Ser Ile
Gly35 40 45tgg tcc gtt ggt gcc tgc caa gga gcc gtt ctt gca gcc gcc
gat gac 192Trp Ser Val Gly Ala Cys Gln Gly Ala Val Leu Ala Ala Ala
Asp Asp50 55 60aac agc gat cgc aga act atc ctc ttt gtt ggt gat ggc
tca ttc cag 240Asn Ser Asp Arg Arg Thr Ile Leu Phe Val Gly Asp Gly
Ser Phe Gln65 70 75 80ctc act gct caa gaa ttg agc aca atg att cgt
ctc aag ctg aag ccc 288Leu Thr Ala Gln Glu Leu Ser Thr Met Ile Arg
Leu Lys Leu Lys Pro85 90 95atc atc ttt gtc atc tgc aac gat ggc ttt
acc att gaa cga ttc att 336Ile Ile Phe Val Ile Cys Asn Asp Gly Phe
Thr Ile Glu Arg Phe Ile100 105 110cac ggc atg gaa gcc gag tac aac
gac atc gca aat tgg gac ttc aag 384His Gly Met Glu Ala Glu Tyr Asn
Asp Ile Ala Asn Trp Asp Phe Lys115 120 125gct ctg gtt gac gtc ttt
ggc ggc tct aag acg gcc aag aag ttc gcc 432Ala Leu Val Asp Val Phe
Gly Gly Ser Lys Thr Ala Lys Lys Phe Ala130 135 140gtc aag acc aag
gac gag ctg gac agc ctt ctc aca gac cct acc ttt 480Val Lys Thr Lys
Asp Glu Leu Asp Ser Leu Leu Thr Asp Pro Thr Phe145 150 155 160aac
gcc gca gaa tgc ctc cag ttt gtc gag cta tat atg ccc aaa gaa 528Asn
Ala Ala Glu Cys Leu Gln Phe Val Glu Leu Tyr Met Pro Lys Glu165 170
175gat gct cct cga gca ttg atc atg act gca gaa gct agc gcg agg aac
576Asp Ala Pro Arg Ala Leu Ile Met Thr Ala Glu Ala Ser Ala Arg
Asn180 185 190aat gcc aag aca gag 591Asn Ala Lys Thr
Glu195

SEQ ID NO: 6; length: 197; type: PRT; organism: Unknown; other information: Fungal isolate from soil sample
Phe Phe Trp
Pro Arg Val Gly Glu Phe Leu Lys Lys Asn Asp Ile Val1 5 10 15Ile Thr
Glu Thr Gly Thr Ala Asn Phe Gly Ile Trp Asp Thr Lys Phe20 25 30Pro
Ser Gly Val Thr Ala Leu Ser Gln Val Leu Trp Gly Ser Ile Gly35 40
45Trp Ser Val Gly Ala Cys Gln Gly Ala Val Leu Ala Ala Ala Asp Asp50
55 60Asn Ser Asp Arg Arg Thr Ile Leu Phe Val Gly Asp Gly Ser Phe
Gln65 70 75 80Leu Thr Ala Gln Glu Leu Ser Thr Met Ile Arg Leu Lys
Leu Lys Pro85 90 95Ile Ile Phe Val Ile Cys Asn Asp Gly Phe Thr Ile
Glu Arg Phe Ile100 105 110His Gly Met Glu Ala Glu Tyr Asn Asp Ile
Ala Asn Trp Asp Phe Lys115 120 125Ala Leu Val Asp Val Phe Gly Gly
Ser Lys Thr Ala Lys Lys Phe Ala130 135 140Val Lys Thr Lys Asp Glu
Leu Asp Ser Leu Leu Thr Asp Pro Thr Phe145 150 155 160Asn Ala Ala
Glu Cys Leu Gln Phe Val Glu Leu Tyr Met Pro Lys Glu165 170 175Asp
Ala Pro Arg Ala Leu Ile Met Thr Ala Glu Ala Ser Ala Arg Asn180 185
190Asn Ala Lys Thr Glu195

SEQ ID NO: 7; length: 678; type: DNA; organism: Unknown; feature: CDS (1)...(678); other information: Fungal isolate from soil sample
aca tat cca ggt gtc cag atg agg ggt gtg ctg cga
caa gtg att aag 48Thr Tyr Pro Gly Val Gln Met Arg Gly Val Leu Arg
Gln Val Ile Lys1 5 10 15cag ctc gat gca tct gag atc aac gct cag cca
gcg cca gtc gtc gag 96Gln Leu Asp Ala Ser Glu Ile Asn Ala Gln Pro
Ala Pro Val Val Glu20 25 30aat gaa gtt gcc aaa aac cga gat aac tca
ccc gtc att aca caa gct 144Asn Glu Val Ala Lys Asn Arg Asp Asn Ser
Pro Val Ile Thr Gln Ala35 40 45ttc ttc tgg ccg cgc gtg gga gag ttc
ctg aag aag aac gac atc gtc 192Phe Phe Trp Pro Arg Val Gly Glu Phe
Leu Lys Lys Asn Asp Ile Val50 55 60att acc gag act gga aca gcc aac
ttt ggc atc tgg gat act aag ttt 240Ile Thr Glu Thr Gly Thr Ala Asn
Phe Gly Ile Trp Asp Thr Lys Phe65 70 75 80ccc tct ggc gtt act gcg
ctt tct cag gtc ctt tgg gga agc att ggt 288Pro Ser Gly Val Thr Ala
Leu Ser Gln Val Leu Trp Gly Ser Ile Gly85 90 95tgg tcc gtt ggt gcc
tgc caa gga gcc gtt ctt gca gcc gcc gat gac 336Trp Ser Val Gly Ala
Cys Gln Gly Ala Val Leu Ala Ala Ala Asp Asp100 105 110aac agc gat
cgc aga act atc ctc ttt gtt ggt gat ggc tca ttc cag 384Asn Ser Asp
Arg Arg Thr Ile Leu Phe Val Gly Asp Gly Ser Phe Gln115 120 125ctc
act gct caa gaa ttg agc aca atg att cgt ctc aag ctg aag ccc 432Leu
Thr Ala Gln Glu Leu Ser Thr Met Ile Arg Leu Lys Leu Lys Pro130 135
140atc atc ttt gtc atc tgc aac gat ggc ttt acc att gaa cga ttc att
480Ile Ile Phe Val Ile Cys Asn Asp Gly Phe Thr Ile Glu Arg Phe
Ile145 150 155 160cac ggc atg gaa gcc gag tac aac gac atc gca aat
tgg gac ttc aag 528His Gly Met Glu Ala Glu Tyr Asn Asp Ile Ala Asn
Trp Asp Phe Lys165 170 175gct ctg gtt gac gtc ttt ggc ggc tct aag
acg gcc aag aag ttc gcc 576Ala Leu Val Asp Val Phe Gly Gly Ser Lys
Thr Ala Lys Lys Phe Ala180 185 190gtc aag acc aag gac gag ctg gac
agc ctt ctc aca gac cct acc ttt 624Val Lys Thr Lys Asp Glu Leu Asp
Ser Leu Leu Thr Asp Pro Thr Phe195 200 205aac gcc gca gaa tgc ctc
cag ttt gtc gag cta tat atg ccc aaa gaa 672Asn Ala Ala Glu Cys Leu
Gln Phe Val Glu Leu Tyr Met Pro Lys Glu210 215 220gat gct 678Asp
Ala225

SEQ ID NO: 8; length: 226; type: PRT; organism: Unknown; other information: Fungal isolate from soil sample
Thr Tyr Pro
Gly Val Gln Met Arg Gly Val Leu Arg Gln Val Ile Lys1 5 10 15Gln Leu
Asp Ala Ser Glu Ile Asn Ala Gln Pro Ala Pro Val Val Glu20 25 30Asn
Glu Val Ala Lys Asn Arg Asp Asn Ser Pro Val Ile Thr Gln Ala35 40
45Phe Phe Trp Pro Arg Val Gly Glu Phe Leu Lys Lys Asn Asp Ile Val50
55 60Ile Thr Glu Thr Gly Thr Ala Asn Phe Gly Ile Trp Asp Thr Lys
Phe65 70 75 80Pro Ser Gly Val Thr Ala Leu Ser Gln Val Leu Trp Gly
Ser Ile Gly85 90 95Trp Ser Val Gly Ala Cys Gln Gly Ala Val Leu Ala
Ala Ala Asp Asp100 105 110Asn Ser Asp Arg Arg Thr Ile Leu Phe Val
Gly Asp Gly Ser Phe Gln115 120 125Leu Thr Ala Gln Glu Leu Ser Thr
Met Ile Arg Leu Lys Leu Lys Pro130 135 140Ile Ile Phe Val Ile Cys
Asn Asp Gly Phe Thr Ile Glu Arg Phe Ile145 150 155 160His Gly Met
Glu Ala Glu Tyr Asn Asp Ile Ala Asn Trp Asp Phe Lys165 170 175Ala
Leu Val Asp Val Phe Gly Gly Ser Lys Thr Ala Lys Lys Phe Ala180 185
190Val Lys Thr Lys Asp Glu Leu Asp Ser Leu Leu Thr Asp Pro Thr
Phe195 200 205Asn Ala Ala Glu Cys Leu Gln Phe Val Glu Leu Tyr Met
Pro Lys Glu210 215 220Asp
Ala225

SEQ ID NO: 9; length: 1636; type: DNA; organism: Unknown; feature: CDS (1)...(1377); other information: Fungal isolate from soil sample
cga aac ggc atg ctc ctc cac cac acg ctt gga aac ggc gac ttc aac
48Arg Asn Gly Met Leu Leu His His Thr Leu Gly Asn Gly Asp Phe Asn1
5 10 15atc ttt gcc aac atg agc gct caa atc tct tgc gaa gtg gcc aag
ctc 96Ile Phe Ala Asn Met Ser Ala Gln Ile Ser Cys Glu Val Ala Lys
Leu20 25 30acc aac cct gcc gaa att gcg acc cag atc gac cat gcc ctc
cgc gtt 144Thr Asn Pro Ala Glu Ile Ala Thr Gln Ile Asp His Ala Leu
Arg Val35 40 45tgc ttc att cgt tct cgg ccc gtc tac atc atg ctt ccc
acc gat atg 192Cys Phe Ile Arg Ser Arg Pro Val Tyr Ile Met Leu Pro
Thr Asp Met50 55 60gtc cag gcc aaa gta gaa ggt gcc aga ctc aag gaa
cca att gac ttg 240Val Gln Ala Lys Val Glu Gly Ala Arg Leu Lys Glu
Pro Ile Asp Leu65 70 75 80tcg gag cct cca aat gat ccc gag aaa gaa
gca tac gtc gtt gac gtt 288Ser Glu Pro Pro Asn Asp Pro Glu Lys Glu
Ala Tyr Val Val Asp Val85 90 95gtc ctc aag tac ctc cgt gct gca aag
aac ccc gtc atc ctt gtc gat 336Val Leu Lys Tyr Leu Arg Ala Ala Lys
Asn Pro Val Ile Leu Val Asp100 105 110gct tgt gct atc cgt cat cgt
gtt ctt gat gag gtt cat gat ctc atc 384Ala Cys Ala Ile Arg His Arg
Val Leu Asp Glu Val His Asp Leu Ile115 120 125gaa aag aca aac ctc
cct gtc ttt gtc act cct atg ggc aaa ggt gct 432Glu Lys Thr Asn Leu
Pro Val Phe Val Thr Pro Met Gly Lys Gly Ala130 135 140gtt aac gaa
gaa cac ccg aca tat ggt ggt gtc tat gcc ggt gac ggc 480Val Asn Glu
Glu His Pro Thr Tyr Gly Gly Val Tyr Ala Gly Asp Gly145 150 155
160tca cat ccg cct caa gtt aag gac atg gtt gag tct tct gat ttg ata
528Ser His Pro Pro Gln Val Lys Asp Met Val Glu Ser Ser Asp Leu
Ile165 170 175ttg aca atc ggt gct ctc aag agc gac ttc aac act gct
ggc ttc tct 576Leu Thr Ile Gly Ala Leu Lys Ser Asp Phe Asn Thr Ala
Gly Phe Ser180 185 190tac cgt acc tca cag ctg aac acg att gat cta
cac agc gac cac tgc 624Tyr Arg Thr Ser Gln Leu Asn Thr Ile Asp Leu
His Ser Asp His Cys195 200 205att gtc aaa tac tcg aca tat cca ggt
gtc cag atg agg ggt gtg ctg 672Ile Val Lys Tyr Ser Thr Tyr Pro Gly
Val Gln Met Arg Gly Val Leu210 215 220cga caa gtg att aag cag ctc
gat gca tct gag atc aac gct cag cca 720Arg Gln Val Ile Lys Gln Leu
Asp Ala Ser Glu Ile Asn Ala Gln Pro225 230 235 240gcg cca gtc gtc
gag aat gaa gtt gcc aaa aac cga gat aac tca ccc 768Ala Pro Val Val
Glu Asn Glu Val Ala Lys Asn Arg Asp Asn Ser Pro245 250 255gtc att
aca caa gct ttc ttc tgg ccg cgc gtg gga gag ttc ctg aag 816Val Ile
Thr Gln Ala Phe Phe Trp Pro Arg Val Gly Glu Phe Leu Lys260 265
270aag aac gac atc gtc att acc gag act gga aca gcc aac ttt ggc atc
864Lys Asn Asp Ile Val Ile Thr Glu Thr Gly Thr Ala Asn Phe Gly
Ile275 280 285tgg gat act aag ttt ccc tct ggc gtt act gcg ctt tct
cag gtc ctt 912Trp Asp Thr Lys Phe Pro Ser Gly Val Thr Ala Leu Ser
Gln Val Leu290 295 300tgg gga agc att ggt tgg tcc gtt ggt gcc tgc
caa gga gcc gtt ctt 960Trp Gly Ser Ile Gly Trp Ser Val Gly Ala Cys
Gln Gly Ala Val Leu305 310 315 320gca gcc gcc gat gac aac agc gat
cgc aga act atc ctc ttt gtt ggt 1008Ala Ala Ala Asp Asp Asn Ser Asp
Arg Arg Thr Ile Leu Phe Val Gly325 330 335gat ggc tca ttc cag ctc
act gct caa gaa ttg agc aca atg att cgt 1056Asp Gly Ser Phe Gln Leu
Thr Ala Gln Glu Leu Ser Thr Met Ile Arg340 345 350ctc aag ctg aag
ccc atc atc ttt gtc atc tgc aac gat ggc ttt acc 1104Leu Lys Leu Lys
Pro Ile Ile Phe Val Ile Cys Asn Asp Gly Phe Thr355 360 365att gaa
cga ttc att cac ggc atg gaa gcc gag tac aac gac atc gca 1152Ile Glu
Arg Phe Ile His Gly Met Glu Ala Glu Tyr Asn Asp Ile Ala370 375
380aat tgg gac ttc aag gct ctg gtt gac gtc ttt ggc ggc tct aag acg
1200Asn Trp Asp Phe Lys Ala Leu Val Asp Val Phe Gly Gly Ser Lys
Thr385 390 395 400gcc aag aag ttc gcc gtc aag acc aag gac gag ctg
gac agc ctt ctc 1248Ala Lys Lys Phe Ala Val Lys Thr Lys Asp Glu Leu
Asp Ser Leu Leu405 410 415aca gac cct acc ttt aac gcc gca gaa tgc
ctc cag ttt gtc gag cta 1296Thr Asp Pro Thr Phe Asn Ala Ala Glu Cys
Leu Gln Phe Val Glu Leu420 425 430tat atg ccc aaa gaa gat gct cct
cga gca ttg atc atg act gca gaa 1344Tyr Met Pro Lys Glu Asp Ala Pro
Arg Ala Leu Ile Met Thr Ala Glu435 440 445gct agc gcg agg aac aat
gcc aag aca gag taa agtggactgt catgaaggcc 1397Ala Ser Ala Arg Asn
Asn Ala Lys Thr Glu *450 455gatttaccac ctcataaatt gtaatagacc
tgatacacat agatcaaggc aggtaccgat 1457cattaatcaa gcaggtttgg
atggggaagg attttgaaaa tgaggaaacg atgggatgat 1517atttggaata
actggccatt attttgagta cttataaaca aatttgaagt tcaatttttt
1577ttcaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaa 1636

SEQ ID NO: 10; length: 1374; type: DNA; organism: Unknown; feature: CDS (1)...(1374); other information: Fungal isolate from soil sample
cga aac ggc atg ctc ctc cac cac acg ctt gga aac ggc
gac ttc aac 48Arg Asn Gly Met Leu Leu His His Thr Leu Gly Asn Gly
Asp Phe Asn1 5 10 15atc ttt gcc aac atg agc gct caa atc tct tgc gaa
gtg gcc aag ctc 96Ile Phe Ala Asn Met Ser Ala Gln Ile Ser Cys Glu
Val Ala Lys Leu20 25 30acc aac cct gcc gaa att gcg acc cag atc gac
cat gcc ctc cgc gtt 144Thr Asn Pro Ala Glu Ile Ala Thr Gln Ile Asp
His Ala Leu Arg Val35 40 45tgc ttc att cgt tct cgg ccc gtc tac atc
atg ctt ccc acc gat atg 192Cys Phe Ile Arg Ser Arg Pro Val Tyr Ile
Met Leu Pro Thr Asp Met50 55 60gtc cag gcc aaa gta gaa ggt gcc aga
ctc aag gaa cca att gac ttg 240Val Gln Ala Lys Val Glu Gly Ala Arg
Leu Lys Glu Pro Ile Asp Leu65 70 75 80tcg gag cct cca aat gat ccc
gag aaa gaa gca tac gtc gtt gac gtt 288Ser Glu Pro Pro Asn Asp Pro
Glu Lys Glu Ala Tyr Val Val Asp Val85 90 95gtc ctc aag tac ctc cgt
gct gca aag aac ccc gtc atc ctt gtc gat 336Val Leu Lys Tyr Leu Arg
Ala Ala Lys Asn Pro Val Ile Leu Val Asp100 105 110gct tgt gct atc
cgt cat cgt gtt ctt gat gag gtt cat gat ctc atc 384Ala Cys Ala Ile
Arg His Arg Val Leu Asp Glu Val His Asp Leu Ile115 120 125gaa aag
aca aac ctc cct gtc ttt gtc act cct atg ggc aaa ggt gct 432Glu Lys
Thr Asn Leu Pro Val Phe Val Thr Pro Met Gly Lys Gly Ala130 135
140gtt aac gaa gaa cac ccg aca tat ggt ggt gtc tat gcc ggt gac ggc
480Val Asn Glu Glu His Pro Thr Tyr Gly Gly Val Tyr Ala Gly Asp
Gly145 150 155 160tca cat ccg cct caa gtt aag gac atg gtt gag tct
tct gat ttg ata 528Ser His Pro Pro Gln Val Lys Asp Met Val Glu Ser
Ser Asp Leu Ile165 170 175ttg aca atc ggt gct ctc aag agc gac ttc
aac act gct ggc ttc tct 576Leu Thr Ile Gly Ala Leu Lys Ser Asp Phe
Asn Thr Ala Gly Phe Ser180 185 190tac cgt acc tca cag ctg aac acg
att gat cta cac agc gac cac tgc 624Tyr
Arg Thr Ser Gln Leu Asn Thr Ile Asp Leu His Ser Asp His Cys195 200
205att gtc aaa tac tcg aca tat cca ggt gtc cag atg agg ggt gtg ctg
672Ile Val Lys Tyr Ser Thr Tyr Pro Gly Val Gln Met Arg Gly Val
Leu210 215 220cga caa gtg att aag cag ctc gat gca tct gag atc aac
gct cag cca 720Arg Gln Val Ile Lys Gln Leu Asp Ala Ser Glu Ile Asn
Ala Gln Pro225 230 235 240gcg cca gtc gtc gag aat gaa gtt gcc aaa
aac cga gat aac tca ccc 768Ala Pro Val Val Glu Asn Glu Val Ala Lys
Asn Arg Asp Asn Ser Pro245 250 255gtc att aca caa gct ttc ttc tgg
ccg cgc gtg gga gag ttc ctg aag 816Val Ile Thr Gln Ala Phe Phe Trp
Pro Arg Val Gly Glu Phe Leu Lys260 265 270aag aac gac atc gtc att
acc gag act gga aca gcc aac ttt ggc atc 864Lys Asn Asp Ile Val Ile
Thr Glu Thr Gly Thr Ala Asn Phe Gly Ile275 280 285tgg gat act aag
ttt ccc tct ggc gtt act gcg ctt tct cag gtc ctt 912Trp Asp Thr Lys
Phe Pro Ser Gly Val Thr Ala Leu Ser Gln Val Leu290 295 300tgg gga
agc att ggt tgg tcc gtt ggt gcc tgc caa gga gcc gtt ctt 960Trp Gly
Ser Ile Gly Trp Ser Val Gly Ala Cys Gln Gly Ala Val Leu305 310 315
320gca gcc gcc gat gac aac agc gat cgc aga act atc ctc ttt gtt ggt
1008Ala Ala Ala Asp Asp Asn Ser Asp Arg Arg Thr Ile Leu Phe Val
Gly325 330 335gat ggc tca ttc cag ctc act gct caa gaa ttg agc aca
atg att cgt 1056Asp Gly Ser Phe Gln Leu Thr Ala Gln Glu Leu Ser Thr
Met Ile Arg340 345 350ctc aag ctg aag ccc atc atc ttt gtc atc tgc
aac gat ggc ttt acc 1104Leu Lys Leu Lys Pro Ile Ile Phe Val Ile Cys
Asn Asp Gly Phe Thr355 360 365att gaa cga ttc att cac ggc atg gaa
gcc gag tac aac gac atc gca 1152Ile Glu Arg Phe Ile His Gly Met Glu
Ala Glu Tyr Asn Asp Ile Ala370 375 380aat tgg gac ttc aag gct ctg
gtt gac gtc ttt ggc ggc tct aag acg 1200Asn Trp Asp Phe Lys Ala Leu
Val Asp Val Phe Gly Gly Ser Lys Thr385 390 395 400gcc aag aag ttc
gcc gtc aag acc aag gac gag ctg gac agc ctt ctc 1248Ala Lys Lys Phe
Ala Val Lys Thr Lys Asp Glu Leu Asp Ser Leu Leu405 410 415aca gac
cct acc ttt aac gcc gca gaa tgc ctc cag ttt gtc gag cta 1296Thr Asp
Pro Thr Phe Asn Ala Ala Glu Cys Leu Gln Phe Val Glu Leu420 425
430tat atg ccc aaa gaa gat gct cct cga gca ttg atc atg act gca gaa
1344Tyr Met Pro Lys Glu Asp Ala Pro Arg Ala Leu Ile Met Thr Ala
Glu435 440 445gct agc gcg agg aac aat gcc aag aca gag 1374Ala Ser
Ala Arg Asn Asn Ala Lys Thr Glu450 455

SEQ ID NO: 11; length: 458; type: PRT; organism: Unknown; other information: Fungal isolate from soil sample
Arg Asn Gly Met Leu Leu His His Thr Leu Gly Asn
Gly Asp Phe Asn1 5 10 15Ile Phe Ala Asn Met Ser Ala Gln Ile Ser Cys
Glu Val Ala Lys Leu20 25 30Thr Asn Pro Ala Glu Ile Ala Thr Gln Ile
Asp His Ala Leu Arg Val35 40 45Cys Phe Ile Arg Ser Arg Pro Val Tyr
Ile Met Leu Pro Thr Asp Met50 55 60Val Gln Ala Lys Val Glu Gly Ala
Arg Leu Lys Glu Pro Ile Asp Leu65 70 75 80Ser Glu Pro Pro Asn Asp
Pro Glu Lys Glu Ala Tyr Val Val Asp Val85 90 95Val Leu Lys Tyr Leu
Arg Ala Ala Lys Asn Pro Val Ile Leu Val Asp100 105 110Ala Cys Ala
Ile Arg His Arg Val Leu Asp Glu Val His Asp Leu Ile115 120 125Glu
Lys Thr Asn Leu Pro Val Phe Val Thr Pro Met Gly Lys Gly Ala130 135
140Val Asn Glu Glu His Pro Thr Tyr Gly Gly Val Tyr Ala Gly Asp
Gly145 150 155 160Ser His Pro Pro Gln Val Lys Asp Met Val Glu Ser
Ser Asp Leu Ile165 170 175Leu Thr Ile Gly Ala Leu Lys Ser Asp Phe
Asn Thr Ala Gly Phe Ser180 185 190Tyr Arg Thr Ser Gln Leu Asn Thr
Ile Asp Leu His Ser Asp His Cys195 200 205Ile Val Lys Tyr Ser Thr
Tyr Pro Gly Val Gln Met Arg Gly Val Leu210 215 220Arg Gln Val Ile
Lys Gln Leu Asp Ala Ser Glu Ile Asn Ala Gln Pro225 230 235 240Ala
Pro Val Val Glu Asn Glu Val Ala Lys Asn Arg Asp Asn Ser Pro245 250
255Val Ile Thr Gln Ala Phe Phe Trp Pro Arg Val Gly Glu Phe Leu
Lys260 265 270Lys Asn Asp Ile Val Ile Thr Glu Thr Gly Thr Ala Asn
Phe Gly Ile275 280 285Trp Asp Thr Lys Phe Pro Ser Gly Val Thr Ala
Leu Ser Gln Val Leu290 295 300Trp Gly Ser Ile Gly Trp Ser Val Gly
Ala Cys Gln Gly Ala Val Leu305 310 315 320Ala Ala Ala Asp Asp Asn
Ser Asp Arg Arg Thr Ile Leu Phe Val Gly325 330 335Asp Gly Ser Phe
Gln Leu Thr Ala Gln Glu Leu Ser Thr Met Ile Arg340 345 350Leu Lys
Leu Lys Pro Ile Ile Phe Val Ile Cys Asn Asp Gly Phe Thr355 360
365Ile Glu Arg Phe Ile His Gly Met Glu Ala Glu Tyr Asn Asp Ile
Ala370 375 380Asn Trp Asp Phe Lys Ala Leu Val Asp Val Phe Gly Gly
Ser Lys Thr385 390 395 400Ala Lys Lys Phe Ala Val Lys Thr Lys Asp
Glu Leu Asp Ser Leu Leu405 410 415Thr Asp Pro Thr Phe Asn Ala Ala
Glu Cys Leu Gln Phe Val Glu Leu420 425 430Tyr Met Pro Lys Glu Asp
Ala Pro Arg Ala Leu Ile Met Thr Ala Glu435 440 445Ala Ser Ala Arg
Asn Asn Ala Lys Thr Glu450 455

SEQ ID NO: 12; length: 30; type: DNA; organism: Unknown; feature: CDS (1)...(30); other information: Oligonucleotide used for PCR amplification of GDC-1
tcc cag atg cca aag ttg gct gtt cca gtc 30
Ser Gln Met Pro Lys Leu Ala Val Pro Val
1 5 10

SEQ ID NO: 13; length: 563; type: PRT; organism: Saccharomyces cerevisiae
Met Ser Glu Ile Thr Leu Gly
Lys Tyr Leu Phe Glu Arg Leu Lys Gln1 5 10 15Val Asn Val Asn Thr Val
Phe Gly Leu Pro Gly Asp Phe Asn Leu Ser20 25 30Leu Leu Asp Lys Ile
Tyr Glu Val Glu Gly Met Arg Trp Ala Gly Asn35 40 45Ala Asn Glu Leu
Asn Ala Arg Tyr Ala Ala Asp Gly Tyr Ala Arg Ile50 55 60Lys Gly Met
Ser Cys Ile Ile Thr Thr Phe Gly Val Gly Glu Leu Ser65 70 75 80Ala
Leu Asn Gly Ile Ala Gly Ser Tyr Ala Glu His Val Gly Val Leu85 90
95His Val Val Gly Val Pro Ser Ile Ser Ser Gln Ala Lys Gln Leu
Leu100 105 110Leu His His Thr Leu Gly Asn Gly Asp Phe Thr Val Phe
His Arg Met115 120 125Ser Ala Asn Ile Ser Glu Thr Thr Ala Met Ile
Thr Asp Ile Cys Thr130 135 140Ala Pro Ala Glu Ile Asp Arg Cys Ile
Arg Thr Thr Tyr Val Thr Gln145 150 155 160Arg Pro Val Tyr Leu Gly
Leu Pro Ala Asn Leu Val Asp Leu Asn Val165 170 175Pro Ala Lys Leu
Leu Gln Thr Pro Ile Asp Met Ser Leu Lys Pro Asn180 185 190Asp Ala
Glu Ser Glu Lys Glu Val Ile Asp Thr Ile Leu Val Leu Ala195 200
205Lys Asp Ala Lys Asn Pro Val Ile Leu Ala Asp Ala Cys Cys Ser
Arg210 215 220His Asp Val Lys Ala Glu Thr Lys Lys Leu Ile Asp Leu
Thr Gln Phe225 230 235 240Pro Ala Phe Val Thr Pro Met Gly Lys Gly
Ser Ile Ser Glu Gln His245 250 255Pro Arg Tyr Gly Gly Val Tyr Val
Gly Thr Leu Ser Lys Pro Glu Val260 265 270Lys Glu Ala Val Glu Ser
Ala Asp Leu Ile Leu Ser Val Gly Ala Leu275 280 285Leu Ser Asp Phe
Asn Thr Gly Ser Phe Ser Tyr Ser Tyr Lys Thr Lys290 295 300Asn Ile
Val Glu Phe His Ser Asp His Met Lys Ile Arg Asn Ala Thr305 310 315
320Phe Pro Gly Val Gln Met Lys Phe Val Leu Gln Lys Leu Leu Thr
Asn325 330 335Ile Ala Asp Ala Ala Lys Gly Tyr Lys Pro Val Ala Val
Pro Ala Arg340 345 350Thr Pro Ala Asn Ala Ala Val Pro Ala Ser Thr
Pro Leu Lys Gln Glu355 360 365Trp Met Trp Asn Gln Leu Gly Asn Phe
Leu Gln Glu Gly Asp Val Val370 375 380Ile Ala Glu Thr Gly Thr Ser
Ala Phe Gly Ile Asn Gln Thr Thr Phe385 390 395 400Pro Asn Asn Thr
Tyr Gly Ile Ser Gln Val Leu Trp Gly Ser Ile Gly405 410 415Phe Thr
Thr Gly Ala Thr Leu Gly Ala Ala Phe Ala Ala Glu Glu Ile420 425
430Asp Pro Lys Lys Arg Val Ile Leu Phe Ile Gly Asp Gly Ser Leu
Gln435 440 445Leu Thr Val Gln Glu Ile Ser Thr Met Ile Arg Trp Gly
Leu Lys Pro450 455 460Tyr Leu Phe Val Leu Asn Asn Asp Gly Tyr Thr
Ile Glu Lys Leu Ile465 470 475 480His Gly Pro Lys Ala Gln Tyr Asn
Glu Ile Gln Gly Trp Asp His Leu485 490 495Ser Leu Leu Pro Thr Phe
Gly Ala Lys Asp Tyr Glu Thr His Arg Val500 505 510Ala Thr Thr Gly
Glu Trp Asp Lys Leu Thr Gln Asp Lys Ser Phe Asn515 520 525Asp Asn
Ser Lys Ile Arg Met Ile Glu Val Met Leu Pro Val Phe Asp530 535
540Ala Pro Gln Asn Leu Val Glu Gln Ala Lys Leu Thr Ala Ala Thr
Asn545 550 555 560Ala Lys Gln

SEQ ID NO: 14; length: 550; type: PRT; organism: Salmonella typhimurium
Met
Gln Asn Pro Tyr Thr Val Ala Asp Tyr Leu Leu Asp Arg Leu Ala1 5 10
15Gly Cys Gly Ile Gly His Leu Phe Gly Val Pro Gly Asp Tyr Asn Leu20
25 30Gln Phe Leu Asp His Val Ile Asp His Pro Thr Leu Arg Trp Val
Gly35 40 45Cys Ala Asn Glu Leu Asn Ala Ala Tyr Ala Ala Asp Gly Tyr
Ala Arg50 55 60Met Ser Gly Ala Gly Ala Leu Leu Thr Thr Phe Gly Val
Gly Glu Leu65 70 75 80Ser Ala Ile Asn Gly Ile Ala Gly Ser Tyr Ala
Glu Tyr Val Pro Val85 90 95Leu His Ile Val Gly Ala Pro Cys Ser Ala
Ala Gln Gln Arg Gly Glu100 105 110Leu Met His His Thr Leu Gly Asp
Gly Asp Phe Arg His Phe Tyr Arg115 120 125Met Ser Gln Ala Ile Ser
Ala Ala Ser Ala Ile Leu Asp Glu Gln Asn130 135 140Ala Cys Phe Glu
Ile Asp Arg Val Leu Gly Glu Met Leu Ala Ala Arg145 150 155 160Arg
Pro Gly Tyr Ile Met Leu Pro Ala Asp Val Ala Lys Lys Thr Ala165 170
175Ile Pro Pro Thr Gln Ala Leu Ala Leu Pro Val His Glu Ala Gln
Ser180 185 190Gly Val Glu Thr Ala Phe Arg Tyr His Ala Arg Gln Cys
Leu Met Asn195 200 205Ser Arg Arg Ile Ala Leu Leu Ala Asp Phe Leu
Ala Gly Arg Phe Gly210 215 220Leu Arg Pro Leu Leu Gln Arg Trp Met
Ala Glu Thr Pro Ile Ala His225 230 235 240Ala Thr Leu Leu Met Gly
Lys Gly Leu Phe Asp Glu Gln His Pro Asn245 250 255Phe Val Gly Thr
Tyr Ser Ala Gly Ala Ser Ser Lys Glu Val Arg Gln260 265 270Ala Ile
Glu Asp Ala Asp Arg Val Ile Cys Val Gly Thr Arg Phe Val275 280
285Asp Thr Leu Thr Ala Gly Phe Thr Gln Gln Leu Pro Ala Glu Arg
Thr290 295 300Leu Glu Ile Gln Pro Tyr Ala Ser Arg Ile Gly Glu Thr
Trp Phe Asn305 310 315 320Leu Pro Met Ala Gln Ala Val Ser Thr Leu
Arg Glu Leu Cys Leu Glu325 330 335Cys Ala Phe Ala Pro Pro Pro Thr
Arg Ser Ala Gly Gln Pro Val Arg340 345 350Ile Asp Lys Gly Glu Leu
Thr Gln Glu Ser Phe Trp Gln Thr Leu Gln355 360 365Gln Tyr Leu Lys
Pro Gly Asp Ile Ile Leu Val Asp Gln Gly Thr Ala370 375 380Ala Phe
Gly Ala Ala Ala Leu Ser Leu Pro Asp Gly Ala Glu Val Val385 390 395
400Leu Gln Pro Leu Trp Gly Ser Ile Gly Tyr Ser Leu Pro Ala Ala
Phe405 410 415Gly Ala Gln Thr Ala Cys Pro Asp Arg Arg Val Ile Leu
Ile Ile Gly420 425 430Asp Gly Ala Ala Gln Leu Thr Ile Gln Glu Met
Gly Ser Met Leu Arg435 440 445Asp Gly Gln Ala Pro Val Ile Leu Leu
Leu Asn Asn Asp Gly Tyr Thr450 455 460Val Glu Arg Ala Ile His Gly
Ala Ala Gln Arg Tyr Asn Asp Ile Ala465 470 475 480Ser Trp Asn Trp
Thr Gln Ile Pro Pro Ala Leu Asn Ala Ala Gln Gln485 490 495Ala Glu
Cys Trp Arg Val Thr Gln Ala Ile Gln Leu Ala Glu Val Leu500 505
510Glu Arg Leu Ala Arg Pro Gln Arg Leu Ser Phe Ile Glu Val Met
Leu515 520 525Pro Lys Ala Asp Leu Pro Glu Leu Leu Arg Thr Val Thr
Arg Ala Leu530 535 540Glu Ala Arg Asn Gly Gly545 550
<210> SEQ ID NO 15
<211> LENGTH: 568
<212> TYPE: PRT
<213> ORGANISM: Zymomonas mobilis
<400> SEQUENCE: 15
Met Ser Tyr Thr Val Gly Thr Tyr Leu
Ala Glu Arg Leu Val Gln Ile1 5 10 15Gly Leu Lys His His Phe Ala Val
Ala Gly Asp Tyr Asn Leu Val Leu20 25 30Leu Asp Asn Leu Leu Leu Asn
Lys Asn Met Glu Gln Val Tyr Cys Cys35 40 45Asn Glu Leu Asn Cys Gly
Phe Ser Ala Glu Gly Tyr Ala Arg Ala Lys50 55 60Gly Ala Ala Ala Ala
Val Val Thr Tyr Ser Val Gly Ala His Ser Ala65 70 75 80Phe Asp Ala
Ile Gly Gly Ala Tyr Ala Glu Asn Leu Pro Val Ile Leu85 90 95Ile Ser
Gly Ala Pro Asn Asn Asn Asp His Ala Ala Gly His Val Leu100 105
110His His Ala Leu Gly Lys Thr Asp Tyr His Tyr Gln Leu Glu Met
Ala115 120 125Lys Asn Ile Thr Ala Ala Ala Glu Ala Ile Tyr Thr Pro
Glu Glu Ala130 135 140Pro Ala Lys Ile Asp His Val Ile Lys Thr Ala
Leu Ala Lys Lys Lys145 150 155 160Pro Val Tyr Leu Glu Ile Ala Cys
Asn Ile Ala Ser Met Pro Cys Ala165 170 175Ala Pro Gly Pro Ala Ser
Ala Leu Phe Asn Asp Glu Ala Ser Asp Glu180 185 190Ala Ser Leu Asn
Ala Ala Val Asp Glu Thr Leu Lys Phe Ile Ala Asn195 200 205Arg Asp
Lys Val Ala Val Leu Val Gly Ser Lys Leu Arg Ala Ala Gly210 215
220Ala Glu Glu Ala Ala Val Lys Phe Thr Asp Ala Leu Gly Gly Ala
Val225 230 235 240Ala Thr Met Ala Ala Ala Lys Ser Phe Phe Pro Glu
Glu Asn Pro His245 250 255Tyr Ile Gly Thr Ser Trp Gly Glu Val Ser
Tyr Pro Gly Val Glu Lys260 265 270Thr Met Lys Glu Ala Asp Ala Val
Ile Ala Leu Ala Pro Val Phe Asn275 280 285Asp Tyr Ser Thr Thr Gly
Trp Thr Asp Ile Pro Asp Pro Lys Lys Leu290 295 300Val Leu Ala Glu
Pro Arg Ser Val Val Val Arg Arg Ile Arg Phe Pro305 310 315 320Ser
Val His Leu Lys Asp Tyr Leu Thr Arg Leu Ala Gln Lys Val Ser325 330
335Lys Lys Thr Gly Ser Leu Asp Phe Phe Lys Ser Leu Asn Ala Gly
Glu340 345 350Leu Lys Lys Ala Ala Pro Ala Asp Pro Ser Ala Pro Leu
Val Asn Ala355 360 365Glu Ile Ala Arg Gln Val Glu Ala Leu Leu Thr
Pro Asn Thr Thr Val370 375 380Ile Ala Glu Thr Gly Asp Ser Trp Phe
Asn Ala Gln Arg Met Lys Leu385 390 395 400Pro Asn Gly Ala Arg Val
Glu Tyr Glu Met Gln Trp Gly His Ile Gly405 410 415Trp Ser Val Pro
Ala Ala Phe Gly Tyr Ala Val Gly Ala Pro Glu Arg420 425 430Arg Asn
Ile Leu Met Val Gly Asp Gly Ser Phe Gln Leu Thr Ala Gln435 440
445Glu Val Ala Gln Met Val Arg Leu Lys Leu Pro Val Ile Ile Phe
Leu450 455 460Ile Asn Asn Tyr Gly Tyr Thr Ile Glu Val Met Ile His
Asp Gly Pro465 470 475 480Tyr Asn Asn Ile Lys Asn Trp Asp Tyr Ala
Gly Leu Met Glu Val Phe485 490 495Asn Gly Asn Gly Gly Tyr Asp Ser
Gly Ala Ala Lys Gly Leu Lys Ala500 505 510Lys Thr Gly Gly Glu Leu
Ala Glu Ala Ile Lys Val Ala Leu Ala Asn515 520 525Thr Asp Gly Pro
Thr Leu Ile Glu Cys Phe Ile Gly Arg Glu Asp Cys530 535 540Thr Glu
Glu Leu Val Lys Trp Gly Lys Arg Val Ala Ala Ala Asn Ser545 550 555
560Arg Lys Pro Val Asn Lys Leu Leu565
<210> SEQ ID NO 16
<211> LENGTH: 687
<212> TYPE: PRT
<213> ORGANISM: Saccharomyces cerevisiae
<400> SEQUENCE: 16
Met Ile Arg Gln Ser Thr Leu Lys Asn Phe Ala Ile Lys
Arg Cys Phe1 5 10 15Gln His Ile Ala Tyr Arg Asn Thr Pro Ala Met Arg
Ser Val Ala Leu20 25 30Ala Gln Arg Phe Tyr Ser Ser Ser Ser Arg Tyr Tyr
Ser Ala Ser Pro35 40 45Leu Pro Ala Ser Lys Arg Pro Glu Pro Ala Pro
Ser Phe Asn Val Asp50 55 60Pro Leu Glu Gln Pro Ala Glu Pro Ser Lys
Leu Ala Lys Lys Leu Arg65 70 75 80Ala Glu Pro Asp Met Asp Thr Ser
Phe Val Gly Leu Thr Gly Gly Gln85 90 95Ile Phe Asn Glu Met Met Ser
Arg Gln Asn Val Asp Thr Val Phe Gly100 105 110Tyr Pro Gly Gly Ala
Ile Leu Pro Val Tyr Asp Ala Ile His Asn Ser115 120 125Asp Lys Phe
Asn Phe Val Leu Pro Lys His Glu Gln Gly Ala Gly His130 135 140Met
Ala Glu Gly Tyr Ala Arg Ala Ser Gly Lys Pro Gly Val Val Leu145 150
155 160Val Thr Ser Gly Pro Gly Ala Thr Asn Val Val Thr Pro Met Ala
Asp165 170 175Ala Phe Ala Asp Gly Ile Pro Met Val Val Phe Thr Gly
Gln Val Pro180 185 190Thr Ser Ala Ile Gly Thr Asp Ala Phe Gln Glu
Ala Asp Val Val Gly195 200 205Ile Ser Arg Ser Cys Thr Lys Trp Asn
Val Met Val Lys Ser Val Glu210 215 220Glu Leu Pro Leu Arg Ile Asn
Glu Ala Phe Glu Ile Ala Thr Ser Gly225 230 235 240Arg Pro Gly Pro
Val Leu Val Asp Leu Pro Lys Asp Val Thr Ala Ala245 250 255Ile Leu
Arg Asn Pro Ile Pro Thr Lys Thr Thr Leu Pro Ser Asn Ala260 265
270Leu Asn Gln Leu Thr Ser Arg Ala Gln Asp Glu Phe Val Met Gln
Ser275 280 285Ile Asn Lys Ala Ala Asp Leu Ile Asn Leu Ala Lys Lys
Pro Val Leu290 295 300Tyr Val Gly Ala Gly Ile Leu Asn His Ala Asp
Gly Pro Arg Leu Leu305 310 315 320Lys Glu Leu Ser Asp Arg Ala Gln
Ile Pro Val Thr Thr Thr Leu Gln325 330 335Gly Leu Gly Ser Phe Asp
Gln Glu Asp Pro Lys Ser Leu Asp Met Leu340 345 350Gly Met His Gly
Cys Ala Thr Ala Asn Leu Ala Val Gln Asn Ala Asp355 360 365Leu Ile
Ile Ala Val Gly Ala Arg Phe Asp Asp Arg Val Thr Gly Asn370 375
380Ile Ser Lys Phe Ala Pro Glu Ala Arg Arg Ala Ala Ala Glu Gly
Arg385 390 395 400Gly Gly Ile Ile His Phe Glu Val Ser Pro Lys Asn
Ile Asn Lys Val405 410 415Val Gln Thr Gln Ile Ala Val Glu Gly Asp
Ala Thr Thr Asn Leu Gly420 425 430Lys Met Met Ser Lys Ile Phe Pro
Val Lys Glu Arg Ser Glu Trp Phe435 440 445Ala Gln Ile Asn Lys Trp
Lys Lys Glu Tyr Pro Tyr Ala Tyr Met Glu450 455 460Glu Thr Pro Gly
Ser Lys Ile Lys Pro Gln Thr Val Ile Lys Lys Leu465 470 475 480Ser
Lys Val Ala Asn Asp Thr Gly Arg His Val Ile Val Thr Thr Gly485 490
495Val Gly Gln His Gln Met Trp Ala Ala Gln His Trp Thr Trp Arg
Asn500 505 510Pro His Thr Phe Ile Thr Ser Gly Gly Leu Gly Thr Met
Gly Tyr Gly515 520 525Leu Pro Ala Ala Ile Gly Ala Gln Val Ala Lys
Pro Glu Ser Leu Val530 535 540Ile Asp Ile Asp Gly Asp Ala Ser Phe
Asn Met Thr Leu Thr Glu Leu545 550 555 560Ser Ser Ala Val Gln Ala
Gly Thr Pro Val Lys Ile Leu Ile Leu Asn565 570 575Asn Glu Glu Gln
Gly Met Val Thr Gln Trp Gln Ser Leu Phe Tyr Glu580 585 590His Arg
Tyr Ser His Thr His Gln Leu Asn Pro Asp Phe Ile Lys Leu595 600
605Ala Glu Ala Met Gly Leu Lys Gly Leu Arg Val Lys Lys Gln Glu
Glu610 615 620Leu Asp Ala Lys Leu Lys Glu Phe Val Ser Thr Lys Gly
Pro Val Leu625 630 635 640Leu Glu Val Glu Val Asp Lys Lys Val Pro
Val Leu Pro Met Val Ala645 650 655Gly Gly Ser Gly Leu Asp Glu Phe
Ile Asn Phe Asp Pro Glu Val Glu660 665 670Arg Gln Gln Thr Glu Leu
Arg His Lys Arg Thr Gly Gly Lys His675 680 685
<210> SEQ ID NO 17
<211> LENGTH: 686
<212> TYPE: PRT
<213> ORGANISM: Magnaporthe grisea
<400> SEQUENCE: 17
Met Leu Arg Thr Val Gly Arg Lys Ala Leu Arg Gly Ser Ser
Lys Gly1 5 10 15Cys Ser Arg Thr Ile Ser Thr Leu Lys Pro Ala Thr Ala
Thr Ile Ala20 25 30Lys Pro Gly Ser Arg Thr Leu Ser Thr Pro Ala Thr
Ala Thr Ala Thr35 40 45Ala Pro Arg Thr Lys Pro Ser Ala Ser Phe Asn
Ala Arg Arg Asp Pro50 55 60Gln Pro Leu Val Asn Pro Arg Ser Gly Glu
Ala Asp Glu Ser Phe Ile65 70 75 80Gly Lys Thr Gly Gly Glu Ile Phe
His Glu Met Met Leu Arg Gln Asn85 90 95Val Lys His Ile Phe Gly Tyr
Pro Gly Gly Ala Ile Leu Pro Val Phe100 105 110Asp Ala Ile Tyr Asn
Ser Lys His Ile Asp Phe Val Leu Pro Lys His115 120 125Glu Gln Gly
Ala Gly His Met Ala Glu Gly Tyr Ala Arg Ala Ser Gly130 135 140Lys
Pro Gly Val Val Leu Val Thr Ser Gly Pro Gly Ala Thr Asn Val145 150
155 160Ile Thr Pro Met Ala Asp Ala Leu Ala Asp Gly Thr Pro Leu Val
Val165 170 175Phe Ser Gly Gln Val Val Thr Ser Asp Ile Gly Ser Asp
Ala Phe Gln180 185 190Glu Ala Asp Val Ile Gly Ile Ser Arg Ser Cys
Thr Lys Trp Asn Val195 200 205Met Val Lys Ser Ala Asp Glu Leu Pro
Arg Arg Ile Asn Glu Ala Phe210 215 220Glu Ile Ala Thr Ser Gly Arg
Pro Gly Pro Val Leu Val Asp Pro Ala225 230 235 240Lys Asp Val Thr
Ala Ser Val Leu Arg Arg Ala Ile Pro Thr Glu Thr245 250 255Ser Ile
Pro Ser Ile Ser Ala Ala Ala Arg Ala Val Gln Glu Ala Gly260 265
270Arg Lys Gln Leu Glu His Ser Ile Lys Arg Val Ala Asp Leu Val
Asn275 280 285Ile Ala Lys Lys Pro Val Ile Tyr Ala Gly Gln Gly Val
Ile Leu Ser290 295 300Glu Gly Gly Val Glu Leu Leu Lys Ala Leu Ala
Asp Lys Ala Ser Ile305 310 315 320Pro Val Thr Thr Thr Leu His Gly
Leu Gly Ala Phe Asp Glu Leu Asp325 330 335Glu Lys Ala Leu His Met
Leu Gly Met His Gly Ser Ala Tyr Ala Asn340 345 350Met Ser Met Gln
Glu Ala Asp Leu Ile Ile Ala Leu Gly Gly Arg Phe355 360 365Asp Asp
Arg Val Thr Gly Ser Ile Pro Lys Phe Ala Pro Ala Ala Lys370 375
380Leu Ala Ala Ala Glu Gly Arg Gly Gly Ile Val His Phe Glu Ile
Met385 390 395 400Pro Lys Asn Ile Asn Lys Val Val Gln Ala Thr Glu
Ala Ile Glu Gly405 410 415Asp Val Ala Ser Asn Leu Lys Leu Leu Leu
Pro Lys Ile Glu Gln Arg420 425 430Ser Met Thr Asp Arg Lys Glu Trp
Phe Asp Gln Ile Lys Glu Trp Lys435 440 445Glu Lys Trp Pro Leu Ser
His Tyr Glu Arg Ala Glu Arg Ser Gly Leu450 455 460Ile Lys Pro Gln
Thr Leu Ile Glu Glu Leu Ser Asn Leu Thr Ala Asp465 470 475 480Arg
Lys Asp Met Thr Tyr Ile Thr Thr Gly Val Gly Gln His Gln Met485 490
495Trp Thr Ala Gln His Phe Arg Trp Arg His Pro Arg Ser Met Ile
Thr500 505 510Ser Gly Gly Leu Gly Thr Met Gly Tyr Gly Leu Pro Ala
Ala Ile Gly515 520 525Ala Lys Val Ala Arg Pro Asp Ala Leu Val Ile
Asp Ile Asp Gly Asp530 535 540Ala Ser Phe Asn Met Thr Leu Thr Glu
Leu Ser Thr Ala Ala Gln Phe545 550 555 560Asn Ile Gly Val Lys Val
Ile Val Leu Asn Asn Glu Glu Gln Gly Met565 570 575Val Thr Gln Trp
Gln Asn Leu Phe Tyr Glu Asp Arg Tyr Ser His Thr580 585 590His Gln
Arg Asn Pro Asp Phe Met Lys Leu Ala Asp Ala Met Asp Val595 600
605Gln His Arg Arg Val Ser Lys Pro Asp Asp Val Gly Asp Ala Leu
Thr610 615 620Trp Leu Ile Asn Thr Asp Gly Pro Ala Leu Leu Glu Val
Met Thr Asp625 630 635 640Lys Lys Val Pro Val Leu Pro Met Val Pro
Gly Gly Asn Gly Leu His645 650 655Glu Phe Ile Thr Phe Asp Ala Ser
Lys Asp Lys Gln Arg Arg Glu Leu660 665 670Met Arg Ala Arg Thr Asn
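Gly Leu His Gly Arg Thr Ala Val675 680 685
<210> SEQ ID NO 18
<211> LENGTH: 1728
<212> TYPE: DNA
<213> ORGANISM: Unknown
<220> FEATURE:
<223> OTHER INFORMATION: Fungal isolate from soil sample
<400> SEQUENCE: 18
atg gcc agc atc aac atc agg gtg cag aat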
ctc gag caa ccc atg gac 48Met Ala Ser Ile Asn Ile Arg Val Gln Asn
Leu Glu Gln Pro Met Asp1 5 10 15gtt gcc gag tat ctt ttc cgg cgt ctc
cac gaa atc ggc att cgc tcc 96Val Ala Glu Tyr Leu Phe Arg Arg Leu
His Glu Ile Gly Ile Arg Ser20 25 30atc cac ggt ctt cca ggc gat tac
aac cct ctt gcc ctc gac tat ttg 144Ile His Gly Leu Pro Gly Asp Tyr
Asn Pro Leu Ala Leu Asp Tyr Leu35 40 45cca tca tgt ggc ctg aga tgg
gtt ggc agc gtc aac gaa ctc aat gct 192Pro Ser Cys Gly Leu Arg Trp
Val Gly Ser Val Asn Glu Leu Asn Ala50 55 60gct tat gct gct gat ggc
tat gcc cgc gtc aag cag atg gga gct ctc 240Ala Tyr Ala Ala Asp Gly
Tyr Ala Arg Val Lys Gln Met Gly Ala Leu65 70 75 80atc acc act ttt
gga gtg gga gag ctc tca gcc atc aat ggc gtt gcc 288Ile Thr Thr Phe
Gly Val Gly Glu Leu Ser Ala Ile Asn Gly Val Ala85 90 95ggt gcc ttt
tcg gaa cac gtc cca gtc gtt cac att gtt ggc tgc cct 336Gly Ala Phe
Ser Glu His Val Pro Val Val His Ile Val Gly Cys Pro100 105 110tcc
act gcc tcg cag cga aac ggc atg ctc ctc cac cac acg ctt gga 384Ser
Thr Ala Ser Gln Arg Asn Gly Met Leu Leu His His Thr Leu Gly115 120
125aac ggc gac ttc aac atc ttt gcc aac atg agc gct caa atc tct tgc
432Asn Gly Asp Phe Asn Ile Phe Ala Asn Met Ser Ala Gln Ile Ser
Cys130 135 140gaa gtg gcc aag ctc acc aac cct gcc gaa att gcg acc
cag atc gac 480Glu Val Ala Lys Leu Thr Asn Pro Ala Glu Ile Ala Thr
Gln Ile Asp145 150 155 160cat gcc ctc cgc gtt tgc ttc att cgt tct
cgg ccc gtc tac atc atg 528His Ala Leu Arg Val Cys Phe Ile Arg Ser
Arg Pro Val Tyr Ile Met165 170 175ctt ccc acc gat atg gtc cag gcc
aaa gta gaa ggt gcc aga ctc aag 576Leu Pro Thr Asp Met Val Gln Ala
Lys Val Glu Gly Ala Arg Leu Lys180 185 190gaa cca att gac ttg tcg
gag cct cca aat gat ccc gag aaa gaa gca 624Glu Pro Ile Asp Leu Ser
Glu Pro Pro Asn Asp Pro Glu Lys Glu Ala195 200 205tac gtc gtt gac
gtt gtc ctc aag tac ctc cgt gct gca aag aac ccc 672Tyr Val Val Asp
Val Val Leu Lys Tyr Leu Arg Ala Ala Lys Asn Pro210 215 220gtc atc
ctt gtc gat gct tgt gct atc cgt cat cgt gtt ctt gat gag 720Val Ile
Leu Val Asp Ala Cys Ala Ile Arg His Arg Val Leu Asp Glu225 230 235
240gtt cat gat ctc atc gaa aag aca aac ctc ccc gtc ttt gtc act cct
768Val His Asp Leu Ile Glu Lys Thr Asn Leu Pro Val Phe Val Thr
Pro245 250 255atg ggc aaa ggt gct gtt aac gaa gaa cac ccg aca tat
ggt ggt gtc 816Met Gly Lys Gly Ala Val Asn Glu Glu His Pro Thr Tyr
Gly Gly Val260 265 270tat gcc ggt gac ggc tca cat ccg cct caa gtt
aag gac atg gtt gag 864Tyr Ala Gly Asp Gly Ser His Pro Pro Gln Val
Lys Asp Met Val Glu275 280 285tct tct gat ttg ata ttg aca atc ggt
gct ctc aag agc gac ttc aac 912Ser Ser Asp Leu Ile Leu Thr Ile Gly
Ala Leu Lys Ser Asp Phe Asn290 295 300act gct ggc ttc tct tac cgt
acc tca cag ctg aac acg att gat cta 960Thr Ala Gly Phe Ser Tyr Arg
Thr Ser Gln Leu Asn Thr Ile Asp Leu305 310 315 320cac agc gac cac
tgc att gtc aaa tac tcg aca tat cca ggt gtc cag 1008His Ser Asp His
Cys Ile Val Lys Tyr Ser Thr Tyr Pro Gly Val Gln325 330 335atg agg
ggt gtg ctg cga caa gtg att aag cag ctc gat gca tct gag 1056Met Arg
Gly Val Leu Arg Gln Val Ile Lys Gln Leu Asp Ala Ser Glu340 345
350atc aac gct cag cca gcg cca gtc gtc gag aat gaa gtt gcc aaa aac
1104Ile Asn Ala Gln Pro Ala Pro Val Val Glu Asn Glu Val Ala Lys
Asn355 360 365cga gat aac tca ccc gtc att aca caa gct ttc ttc tgg
ccg cgc gtg 1152Arg Asp Asn Ser Pro Val Ile Thr Gln Ala Phe Phe Trp
Pro Arg Val370 375 380gga gag ttc ctg aag aag aac gac atc gtc att
acc gag act gga aca 1200Gly Glu Phe Leu Lys Lys Asn Asp Ile Val Ile
Thr Glu Thr Gly Thr385 390 395 400gcc aac ttt ggc atc tgg gat act
aag ttt ccc tct ggc gtt act gcg 1248Ala Asn Phe Gly Ile Trp Asp Thr
Lys Phe Pro Ser Gly Val Thr Ala405 410 415ctt tct cag gtc ctt tgg
gga agc att ggt tgg tcc gtt ggt gcc tgc 1296Leu Ser Gln Val Leu Trp
Gly Ser Ile Gly Trp Ser Val Gly Ala Cys420 425 430caa gga gcc gtt
ctt gca gcc gcc gat gac aac agc gat cgc aga act 1344Gln Gly Ala Val
Leu Ala Ala Ala Asp Asp Asn Ser Asp Arg Arg Thr435 440 445atc ctc
ttt gtt ggt gat ggc tca ttc cag ctc act gct caa gaa ttg 1392Ile Leu
Phe Val Gly Asp Gly Ser Phe Gln Leu Thr Ala Gln Glu Leu450 455
460agc aca atg att cgt ctc aag ctg aag ccc atc atc ttt gtc atc tgc
1440Ser Thr Met Ile Arg Leu Lys Leu Lys Pro Ile Ile Phe Val Ile
Cys465 470 475 480aac gat ggc ttt acc att gaa cga ttc att cac ggc
atg gaa gcc gag 1488Asn Asp Gly Phe Thr Ile Glu Arg Phe Ile His Gly
Met Glu Ala Glu485 490 495tac aac gac atc gca aat tgg gac ttc aag
gct ctg gtt gac gtc ttt 1536Tyr Asn Asp Ile Ala Asn Trp Asp Phe Lys
Ala Leu Val Asp Val Phe500 505 510ggc ggc tct aag acg gcc aag aag
ttc gcc gtc aag acc aag gac gag 1584Gly Gly Ser Lys Thr Ala Lys Lys
Phe Ala Val Lys Thr Lys Asp Glu515 520 525ctg gac agc ctt ctc aca
gac cct acc ttt aac gcc gca gaa tgc ctc 1632Leu Asp Ser Leu Leu Thr
Asp Pro Thr Phe Asn Ala Ala Glu Cys Leu530 535 540cag ttt gtc gag
cta tat atg ccc aaa gaa gat gct cct cga gca ttg 1680Gln Phe Val Glu
Leu Tyr Met Pro Lys Glu Asp Ala Pro Arg Ala Leu545 550 555 560atc
atg acg gca gaa gct agc gcg agg aac aat gcc aag aca gag taa 1728Ile
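Met Thr Ala Glu Ala Ser Ala Arg Asn Asn Ala Lys Thr Glu *565 570 575
<210> SEQ ID NO 19
<211> LENGTH: 575
<212> TYPE: PRT
<213> ORGANISM: Unknown
<220> FEATURE:
<223> OTHER INFORMATION: Fungal isolate from soil sample
<400> SEQUENCE: 19
Met Ala Ser Ile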
Asn Ile Arg Val Gln Asn Leu Glu Gln Pro Met Asp1 5 10 15Val Ala Glu
Tyr Leu Phe Arg Arg Leu His Glu Ile Gly Ile Arg Ser20 25 30Ile His
Gly Leu Pro Gly Asp Tyr Asn Pro Leu Ala Leu Asp Tyr Leu35 40 45Pro
Ser Cys Gly Leu Arg Trp Val Gly Ser Val Asn Glu Leu Asn Ala50 55
60Ala Tyr Ala Ala Asp Gly Tyr Ala Arg Val Lys Gln Met Gly Ala Leu65
70 75 80Ile Thr Thr Phe Gly Val Gly Glu Leu Ser Ala Ile Asn Gly Val
Ala85 90 95Gly Ala Phe Ser Glu His Val Pro Val Val His Ile Val Gly
Cys Pro100 105 110Ser Thr Ala Ser Gln Arg Asn Gly Met Leu Leu His
His Thr Leu Gly115 120 125Asn Gly Asp Phe Asn Ile Phe Ala Asn Met
Ser Ala Gln Ile Ser Cys130 135 140Glu Val Ala Lys Leu Thr Asn Pro
Ala Glu Ile Ala Thr Gln Ile Asp145 150 155 160His Ala Leu Arg Val
Cys Phe Ile Arg Ser Arg Pro Val Tyr Ile Met165 170 175Leu Pro Thr
Asp Met Val Gln Ala Lys Val Glu Gly Ala Arg Leu Lys180 185 190Glu
Pro Ile Asp Leu Ser Glu Pro Pro Asn Asp Pro Glu Lys Glu Ala195 200
205Tyr Val Val Asp Val Val Leu Lys Tyr Leu Arg Ala Ala Lys Asn
Pro210 215 220Val Ile Leu Val Asp Ala Cys Ala Ile Arg His Arg Val
Leu Asp Glu225 230 235 240Val His Asp Leu Ile Glu Lys Thr Asn Leu
Pro Val Phe Val Thr Pro245 250 255Met Gly Lys Gly Ala Val Asn Glu
Glu His Pro Thr Tyr Gly Gly Val260 265 270Tyr Ala Gly Asp Gly Ser
His Pro Pro Gln Val Lys Asp Met Val Glu275 280 285Ser Ser Asp Leu
Ile Leu Thr Ile Gly Ala Leu Lys Ser Asp Phe Asn290 295 300Thr Ala
Gly Phe Ser Tyr Arg Thr Ser Gln Leu Asn Thr Ile Asp Leu305 310 315
320His Ser Asp His Cys Ile Val Lys Tyr Ser Thr Tyr Pro Gly Val
Gln325 330 335Met Arg Gly Val Leu Arg Gln Val Ile Lys Gln Leu Asp
Ala Ser Glu340 345
350Ile Asn Ala Gln Pro Ala Pro Val Val Glu Asn Glu Val Ala Lys
Asn355 360 365Arg Asp Asn Ser Pro Val Ile Thr Gln Ala Phe Phe Trp
Pro Arg Val370 375 380Gly Glu Phe Leu Lys Lys Asn Asp Ile Val Ile
Thr Glu Thr Gly Thr385 390 395 400Ala Asn Phe Gly Ile Trp Asp Thr
Lys Phe Pro Ser Gly Val Thr Ala405 410 415Leu Ser Gln Val Leu Trp
Gly Ser Ile Gly Trp Ser Val Gly Ala Cys420 425 430Gln Gly Ala Val
Leu Ala Ala Ala Asp Asp Asn Ser Asp Arg Arg Thr435 440 445Ile Leu
Phe Val Gly Asp Gly Ser Phe Gln Leu Thr Ala Gln Glu Leu450 455
460Ser Thr Met Ile Arg Leu Lys Leu Lys Pro Ile Ile Phe Val Ile
Cys465 470 475 480Asn Asp Gly Phe Thr Ile Glu Arg Phe Ile His Gly
Met Glu Ala Glu485 490 495Tyr Asn Asp Ile Ala Asn Trp Asp Phe Lys
Ala Leu Val Asp Val Phe500 505 510Gly Gly Ser Lys Thr Ala Lys Lys
Phe Ala Val Lys Thr Lys Asp Glu515 520 525Leu Asp Ser Leu Leu Thr
Asp Pro Thr Phe Asn Ala Ala Glu Cys Leu530 535 540Gln Phe Val Glu
Leu Tyr Met Pro Lys Glu Asp Ala Pro Arg Ala Leu545 550 555 560Ile
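Met Thr Ala Glu Ala Ser Ala Arg Asn Asn Ala Lys Thr Glu565 570 575
<210> SEQ ID NO 20
<211> LENGTH: 1728
<212> TYPE: DNA
<213> ORGANISM: Unknown
<220> FEATURE:
<223> OTHER INFORMATION: Fungal isolate from soil sample
<400> SEQUENCE: 20
atg gcc agc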
atc aac atc agg gtg cag aat ctc gag caa ccc atg gac 48Met Ala Ser
Ile Asn Ile Arg Val Gln Asn Leu Glu Gln Pro Met Asp1 5 10 15gtt gcc
gag tat ctt ttc cgg cgt ctc cac gaa atc ggc att cgc tcc 96Val Ala
Glu Tyr Leu Phe Arg Arg Leu His Glu Ile Gly Ile Arg Ser20 25 30atc
cac ggt ctt cca ggc gat tac aac ctt ctt gcc ctc gac tat ttg 144Ile
His Gly Leu Pro Gly Asp Tyr Asn Leu Leu Ala Leu Asp Tyr Leu35 40
45cca tca tgt ggc ctg aga tgg gtt ggc agc gtc aac gaa ctc aat gct
192Pro Ser Cys Gly Leu Arg Trp Val Gly Ser Val Asn Glu Leu Asn
Ala50 55 60gct tat gct gct gat ggc tat gcc cgc gtc aag cag atg gga
gct ctc 240Ala Tyr Ala Ala Asp Gly Tyr Ala Arg Val Lys Gln Met Gly
Ala Leu65 70 75 80atc acc act ttt gga gtg gga gag ctc tca gcc atc
aat ggc gtt gcc 288Ile Thr Thr Phe Gly Val Gly Glu Leu Ser Ala Ile
Asn Gly Val Ala85 90 95ggt gcc ttt tcg gaa cac gtc cca gtc gtt cac
att gtt ggc tgc cct 336Gly Ala Phe Ser Glu His Val Pro Val Val His
Ile Val Gly Cys Pro100 105 110tcc act gcc tcg cag cga aac ggc atg
ctc ctc cac cac acg ctt gga 384Ser Thr Ala Ser Gln Arg Asn Gly Met
Leu Leu His His Thr Leu Gly115 120 125aac ggc gac ttc aac atc ttt
gcc aac atg agc gct caa atc tct tgc 432Asn Gly Asp Phe Asn Ile Phe
Ala Asn Met Ser Ala Gln Ile Ser Cys130 135 140gaa gtg gcc aag ctc
acc aac cct gcc gaa att gcg acc cag atc gac 480Glu Val Ala Lys Leu
Thr Asn Pro Ala Glu Ile Ala Thr Gln Ile Asp145 150 155 160cat gcc
ctc cgc gtt tgc ttc att cgt tct cgg ccc gtc tac atc atg 528His Ala
Leu Arg Val Cys Phe Ile Arg Ser Arg Pro Val Tyr Ile Met165 170
175ctt ccc acc gat atg gtc cag gcc aaa gta gaa ggt gcc aga ctc aag
576Leu Pro Thr Asp Met Val Gln Ala Lys Val Glu Gly Ala Arg Leu
Lys180 185 190gaa cca att gac ttg tcg gag cct cca aat gat ccc gag
aaa gaa gca 624Glu Pro Ile Asp Leu Ser Glu Pro Pro Asn Asp Pro Glu
Lys Glu Ala195 200 205tac gtc gtt gac gtt gtc ctc aag tac ctc cgt
gct gca aag aac ccc 672Tyr Val Val Asp Val Val Leu Lys Tyr Leu Arg
Ala Ala Lys Asn Pro210 215 220gtc atc ctt gtc gat gct tgt gct atc
cgt cat cgt gtt ctt gat gag 720Val Ile Leu Val Asp Ala Cys Ala Ile
Arg His Arg Val Leu Asp Glu225 230 235 240gtt cat gat ctc atc gaa
aag aca aac ctc ccc gtc ttt gtc act cct 768Val His Asp Leu Ile Glu
Lys Thr Asn Leu Pro Val Phe Val Thr Pro245 250 255atg ggc aaa ggt
gct gtt aac gaa gaa cac ccg aca tat ggt ggt gtc 816Met Gly Lys Gly
Ala Val Asn Glu Glu His Pro Thr Tyr Gly Gly Val260 265 270tat gcc
ggt gac ggc tca cat ccg cct caa gtt aag gac atg gtt gag 864Tyr Ala
Gly Asp Gly Ser His Pro Pro Gln Val Lys Asp Met Val Glu275 280
285tct tct gat ttg ata ttg aca atc ggt gct ctc aag agc gac ttc aac
912Ser Ser Asp Leu Ile Leu Thr Ile Gly Ala Leu Lys Ser Asp Phe
Asn290 295 300act gct ggc ttc tct tac cgt acc tca cag ctg aac acg
att gat cta 960Thr Ala Gly Phe Ser Tyr Arg Thr Ser Gln Leu Asn Thr
Ile Asp Leu305 310 315 320cac agc gac cac tgc att gtc aaa tac tcg
aca tat cca ggt gtc cag 1008His Ser Asp His Cys Ile Val Lys Tyr Ser
Thr Tyr Pro Gly Val Gln325 330 335atg agg ggt gtg ctg cga caa gtg
att aag cag ctc gat gca tct gag 1056Met Arg Gly Val Leu Arg Gln Val
Ile Lys Gln Leu Asp Ala Ser Glu340 345 350atc aac gct cag cca gcg
cca gtc gtc gag aat gaa gtt gcc aaa aac 1104Ile Asn Ala Gln Pro Ala
Pro Val Val Glu Asn Glu Val Ala Lys Asn355 360 365cga gat aac tca
ccc gtc att aca caa gct ttc ttc tgg ccg cgc gtg 1152Arg Asp Asn Ser
Pro Val Ile Thr Gln Ala Phe Phe Trp Pro Arg Val370 375 380gga gag
ttc ctg aag aag aac gac atc gtc att acc gag act gga aca 1200Gly Glu
Phe Leu Lys Lys Asn Asp Ile Val Ile Thr Glu Thr Gly Thr385 390 395
400gcc aac ttt ggc atc tgg gat act aag ttt ccc tct ggc gtt act gcg
1248Ala Asn Phe Gly Ile Trp Asp Thr Lys Phe Pro Ser Gly Val Thr
Ala405 410 415ctt tct cag gtc ctt tgg gga agc att ggt tgg tcc gtt
ggt gcc tgc 1296Leu Ser Gln Val Leu Trp Gly Ser Ile Gly Trp Ser Val
Gly Ala Cys420 425 430caa gga gcc gtt ctt gca gcc gcc gat gac aac
agc gat cgc aga act 1344Gln Gly Ala Val Leu Ala Ala Ala Asp Asp Asn
Ser Asp Arg Arg Thr435 440 445atc ctc ttt gtt ggt gat ggc tca ttc
cag ctc act gct caa gaa ttg 1392Ile Leu Phe Val Gly Asp Gly Ser Phe
Gln Leu Thr Ala Gln Glu Leu450 455 460agc aca atg att cgt ctc aag
ctg aag ccc atc atc ttt gtc atc tgc 1440Ser Thr Met Ile Arg Leu Lys
Leu Lys Pro Ile Ile Phe Val Ile Cys465 470 475 480aac gat ggc ttt
acc att gaa cga ttc att cac ggc atg gaa gcc gag 1488Asn Asp Gly Phe
Thr Ile Glu Arg Phe Ile His Gly Met Glu Ala Glu485 490 495tac aac
gac atc gca aat tgg gac ttc aag gct ctg gtt gac gtc ttt 1536Tyr Asn
Asp Ile Ala Asn Trp Asp Phe Lys Ala Leu Val Asp Val Phe500 505
510ggc ggc tct aag acg gcc aag aag ttc gcc gtc aag acc aag gac gag
1584Gly Gly Ser Lys Thr Ala Lys Lys Phe Ala Val Lys Thr Lys Asp
Glu515 520 525ctg gac agc ctt ctc aca gac cct acc ttt aac gcc gca
gaa tgc ctc 1632Leu Asp Ser Leu Leu Thr Asp Pro Thr Phe Asn Ala Ala
Glu Cys Leu530 535 540cag ttt gtc gag cta tat atg ccc aaa gaa gat
gct cct cga gca ttg 1680Gln Phe Val Glu Leu Tyr Met Pro Lys Glu Asp
Ala Pro Arg Ala Leu545 550 555 560atc atg acg gca gaa gct agc gcg
agg aac aat gcc aag aca gag taa 1728Ile Met Thr Ala Glu Ala Ser Ala
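Arg Asn Asn Ala Lys Thr Glu *565 570 575
<210> SEQ ID NO 21
<211> LENGTH: 575
<212> TYPE: PRT
<213> ORGANISM: Unknown
<220> FEATURE:
<223> OTHER INFORMATION: Fungal isolate from soil sample
<400> SEQUENCE: 21
Met Ala Ser Ile Asn Ile Arg Val Gln Asn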
Leu Glu Gln Pro Met Asp1 5 10 15Val Ala Glu Tyr Leu Phe Arg Arg Leu
His Glu Ile Gly Ile Arg Ser20 25 30Ile His Gly Leu Pro Gly Asp Tyr
Asn Leu Leu Ala Leu Asp Tyr Leu35 40 45Pro Ser Cys Gly Leu Arg Trp
Val Gly Ser Val Asn Glu Leu Asn Ala50 55 60Ala Tyr Ala Ala Asp Gly
Tyr Ala Arg Val Lys Gln Met Gly Ala Leu65 70 75 80Ile Thr Thr Phe
Gly Val Gly Glu Leu Ser Ala Ile Asn Gly Val Ala85 90 95Gly Ala Phe
Ser Glu His Val Pro Val Val His Ile Val Gly Cys Pro100 105 110Ser
Thr Ala Ser Gln Arg Asn Gly Met Leu Leu His His Thr Leu Gly115 120
125Asn Gly Asp Phe Asn Ile Phe Ala Asn Met Ser Ala Gln Ile Ser
Cys130 135 140Glu Val Ala Lys Leu Thr Asn Pro Ala Glu Ile Ala Thr
Gln Ile Asp145 150 155 160His Ala Leu Arg Val Cys Phe Ile Arg Ser
Arg Pro Val Tyr Ile Met165 170 175Leu Pro Thr Asp Met Val Gln Ala
Lys Val Glu Gly Ala Arg Leu Lys180 185 190Glu Pro Ile Asp Leu Ser
Glu Pro Pro Asn Asp Pro Glu Lys Glu Ala195 200 205Tyr Val Val Asp
Val Val Leu Lys Tyr Leu Arg Ala Ala Lys Asn Pro210 215 220Val Ile
Leu Val Asp Ala Cys Ala Ile Arg His Arg Val Leu Asp Glu225 230 235
240Val His Asp Leu Ile Glu Lys Thr Asn Leu Pro Val Phe Val Thr
Pro245 250 255Met Gly Lys Gly Ala Val Asn Glu Glu His Pro Thr Tyr
Gly Gly Val260 265 270Tyr Ala Gly Asp Gly Ser His Pro Pro Gln Val
Lys Asp Met Val Glu275 280 285Ser Ser Asp Leu Ile Leu Thr Ile Gly
Ala Leu Lys Ser Asp Phe Asn290 295 300Thr Ala Gly Phe Ser Tyr Arg
Thr Ser Gln Leu Asn Thr Ile Asp Leu305 310 315 320His Ser Asp His
Cys Ile Val Lys Tyr Ser Thr Tyr Pro Gly Val Gln325 330 335Met Arg
Gly Val Leu Arg Gln Val Ile Lys Gln Leu Asp Ala Ser Glu340 345
350Ile Asn Ala Gln Pro Ala Pro Val Val Glu Asn Glu Val Ala Lys
Asn355 360 365Arg Asp Asn Ser Pro Val Ile Thr Gln Ala Phe Phe Trp
Pro Arg Val370 375 380Gly Glu Phe Leu Lys Lys Asn Asp Ile Val Ile
Thr Glu Thr Gly Thr385 390 395 400Ala Asn Phe Gly Ile Trp Asp Thr
Lys Phe Pro Ser Gly Val Thr Ala405 410 415Leu Ser Gln Val Leu Trp
Gly Ser Ile Gly Trp Ser Val Gly Ala Cys420 425 430Gln Gly Ala Val
Leu Ala Ala Ala Asp Asp Asn Ser Asp Arg Arg Thr435 440 445Ile Leu
Phe Val Gly Asp Gly Ser Phe Gln Leu Thr Ala Gln Glu Leu450 455
460Ser Thr Met Ile Arg Leu Lys Leu Lys Pro Ile Ile Phe Val Ile
Cys465 470 475 480Asn Asp Gly Phe Thr Ile Glu Arg Phe Ile His Gly
Met Glu Ala Glu485 490 495Tyr Asn Asp Ile Ala Asn Trp Asp Phe Lys
Ala Leu Val Asp Val Phe500 505 510Gly Gly Ser Lys Thr Ala Lys Lys
Phe Ala Val Lys Thr Lys Asp Glu515 520 525Leu Asp Ser Leu Leu Thr
Asp Pro Thr Phe Asn Ala Ala Glu Cys Leu530 535 540Gln Phe Val Glu
Leu Tyr Met Pro Lys Glu Asp Ala Pro Arg Ala Leu545 550 555 560Ile
Met Thr Ala Glu Ala Ser Ala Arg Asn Asn Ala Lys Thr Glu565 570
575
* * * * *