U.S. patent application number 12/568846 was filed with the patent office on 2009-09-29 and published on 2010-02-04 for sucrose transport proteins.
This patent application is currently assigned to E.I. DU PONT DE NEMOURS AND COMPANY. Invention is credited to Stephen M. Allen, William D. Hitz, J. Antoni Rafalski.
Application Number | 12/568846 |
Publication Number | 20100029489 |
Family ID | 22162389 |
Publication Date | 2010-02-04 |
United States Patent Application | 20100029489 |
Kind Code | A1 |
Allen; Stephen M.; et al. | February 4, 2010 |
SUCROSE TRANSPORT PROTEINS
Abstract
This invention relates to an isolated nucleic acid fragment
encoding a sucrose transport protein. The invention also relates to
the construction of a chimeric gene encoding all or a portion of
the sucrose transport protein, in sense or antisense orientation,
wherein expression of the chimeric gene results in production of
altered levels of the sucrose transport protein in a transformed
host cell.
Inventors: | Allen; Stephen M. (Wilmington, DE); Hitz; William D. (Wilmington, DE); Rafalski; J. Antoni (Wilmington, DE) |
Correspondence Address: | E I DU PONT DE NEMOURS AND COMPANY; LEGAL PATENT RECORDS CENTER, BARLEY MILL PLAZA 25/1122B, 4417 LANCASTER PIKE, WILMINGTON, DE 19805, US |
Assignee: | E.I. DU PONT DE NEMOURS AND COMPANY, Wilmington, DE |
Family ID: | 22162389 |
Appl. No.: | 12/568846 |
Filed: | September 29, 2009 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
11901765 | Sep 18, 2007 | 7605247 |
12568846 | | |
09679687 | Oct 5, 2000 | 7288645 |
11901765 | | |
PCT/US99/07562 | Apr 7, 1999 | |
09679687 | | |
60081148 | Apr 9, 1998 | |
Current U.S. Class: | 506/2; 435/419; 435/468; 530/350; 536/23.6 |
Current CPC Class: | C07K 14/415 20130101; C12N 15/8245 20130101 |
Class at Publication: | 506/2; 536/23.6; 435/419; 530/350; 435/468 |
International Class: | C07H 21/04 20060101 C07H021/04; C12N 5/10 20060101 C12N005/10; C07K 14/415 20060101 C07K014/415; C12N 15/82 20060101 C12N015/82; C40B 20/00 20060101 C40B020/00 |
Claims
1. An isolated nucleic acid fragment encoding all or a substantial
portion of a sucrose transport protein comprising a member selected
from the group consisting of: (a) an isolated nucleic acid fragment
encoding all or a substantial portion of the amino acid sequence
set forth in a member selected from the group consisting of SEQ ID
NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24; (b) an isolated
nucleic acid fragment that is substantially similar to an isolated
nucleic acid fragment encoding all or a substantial portion of the
amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and
24; and (c) an isolated nucleic acid fragment that is complementary
to (a) or (b).
2. The isolated nucleic acid fragment of claim 1 wherein the
nucleotide sequence of the fragment comprises all or a portion of
the sequence set forth in a member selected from the group
consisting of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21 and
23.
3. A chimeric gene comprising the nucleic acid fragment of claim 1
operably linked to suitable regulatory sequences.
4. A transformed host cell comprising the chimeric gene of claim
3.
5. A sucrose transport protein polypeptide comprising all or a
substantial portion of the amino acid sequence set forth in a
member selected from the group consisting of SEQ ID NO:2, 4, 6, 8,
10, 12, 14, 16, 18, 20, 22 and 24.
6. A method of altering the level of expression of a sucrose
transport protein in a host cell comprising: (a) transforming a
host cell with the chimeric gene of claim 3; and (b) growing the
transformed host cell produced in step (a) under conditions that
are suitable for expression of the chimeric gene wherein expression
of the chimeric gene results in production of altered levels of a
sucrose transport protein in the transformed host cell.
7. A method of obtaining a nucleic acid fragment encoding all or a
substantial portion of the amino acid sequence encoding a sucrose
transport protein comprising: (a) probing a cDNA or genomic library
with the nucleic acid fragment of claim 1; (b) identifying a DNA
clone that hybridizes with the nucleic acid fragment of claim 1;
(c) isolating the DNA clone identified in step (b); and (d)
sequencing the cDNA or genomic fragment that comprises the clone
isolated in step (c) wherein the sequenced nucleic acid fragment
encodes all or a substantial portion of the amino acid sequence
encoding a sucrose transport protein.
8. A method of obtaining a nucleic acid fragment encoding a
substantial portion of an amino acid sequence encoding a sucrose
transport protein comprising: (a) synthesizing an oligonucleotide
primer corresponding to a portion of the sequence set forth in any
of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21 and 23; and (b)
amplifying a cDNA insert present in a cloning vector using the
oligonucleotide primer of step (a) and a primer representing
sequences of the cloning vector wherein the amplified nucleic acid
fragment encodes a substantial portion of an amino acid sequence
encoding a sucrose transport protein.
9. The product of the method of claim 7.
10. The product of the method of claim 8.
Description
[0001] This application claims the benefit of U.S. Provisional
Application No. 60/081,148, filed Apr. 9, 1998.
FIELD OF THE INVENTION
[0002] This invention is in the field of plant molecular biology.
More specifically, this invention pertains to nucleic acid
fragments encoding sucrose transport proteins in plants and
seeds.
BACKGROUND OF THE INVENTION
[0003] Sucrose is the first form of carbohydrate to leave
photosynthesizing cells in most higher plants and is the main form
of transported carbon in most annual field crop plants such as
corn, soybeans and wheat. As such its movement and concentration
across various plant membranes is critical to plant growth and
development. In addition sucrose is the main form of carbon that
moves into developing seeds of soybeans, corn and wheat. This
movement and concentration is accomplished by the action of sucrose
carrier proteins that act to move sucrose against a concentration
gradient by coupling sucrose movement to the opposite vectorial
movement of a proton. Specific sucrose carrier sequences from these
crop plants should find use in controlling the timing and extent of
phenomena such as grain fill duration that are important factors in
crop yield and quality. Accordingly, the availability of nucleic
acid sequences encoding all or a portion of these proteins would
facilitate studies to better understand carbohydrate metabolism and
function in plants, provide genetic tools for the manipulation of
these biosynthetic pathways, and provide a means to control
carbohydrate transport and distribution in plant cells.
SUMMARY OF THE INVENTION
[0004] The instant invention relates to isolated nucleic acid
fragments encoding proteins involved in sucrose transport.
Specifically, this invention concerns an isolated nucleic acid
fragment encoding a sucrose transport protein. In addition, this
invention relates to a nucleic acid fragment that is complementary
to the nucleic acid fragment encoding the sucrose transport
protein. An additional embodiment of the instant invention pertains
to a polypeptide comprising all or a substantial portion of a sucrose
transport protein.
[0005] In another embodiment, the instant invention relates to a
chimeric gene encoding a sucrose transport protein, or to a
chimeric gene that comprises a nucleic acid fragment that is
complementary to a nucleic acid fragment encoding a sucrose
transport protein, operably linked to suitable regulatory
sequences, wherein expression of the chimeric gene results in
production of levels of the encoded protein in a transformed host
cell that are altered (i.e., increased or decreased) from the level
produced in an untransformed host cell.
[0006] In a further embodiment, the instant invention concerns a
transformed host cell comprising in its genome a chimeric gene
encoding a sucrose transport protein, operably linked to suitable
regulatory sequences. Expression of the chimeric gene results in
production of altered levels of the encoded protein in the
transformed host cell. The transformed host cell can be of
eukaryotic or prokaryotic origin, and includes cells derived from
higher plants and microorganisms. The invention also includes
transformed plants that arise from transformed host cells of higher
plants, and seeds derived from such transformed plants.
[0007] An additional embodiment of the instant invention concerns a
method of altering the level of expression of a sucrose transport
protein in a transformed host cell comprising: a) transforming a
host cell with a chimeric gene comprising a nucleic acid fragment
encoding a sucrose transport protein; and b) growing the
transformed host cell under conditions that are suitable for
expression of the chimeric gene wherein expression of the chimeric
gene results in production of altered levels of sucrose transport
protein in the transformed host cell.
[0008] An additional embodiment of the instant invention concerns a
method for obtaining a nucleic acid fragment encoding all or a
substantial portion of an amino acid sequence encoding a sucrose
transport protein.
BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE DESCRIPTIONS
[0009] The invention can be more fully understood from the
following detailed description and the accompanying drawings and
Sequence Listing which form a part of this application.
[0010] FIG. 1 shows a comparison of the amino acid sequences set
forth in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24
and the Daucus carota (SEQ ID NO:25), Oryza sativa (SEQ ID NO:26),
Ricinus communis (SEQ ID NO:27) and Vicia faba (SEQ ID NO:28)
sucrose transport protein amino acid sequences.
[0011] The following sequence descriptions and sequence listings
attached hereto comply with the rules governing nucleotide and/or
amino acid sequence disclosures in patent applications as set forth
in 37 C.F.R. § 1.821-1.825.
[0012] SEQ ID NO:1 is the nucleotide sequence comprising the entire
cDNA insert in clone cepe7.pk0015.d10 encoding an entire corn
sucrose transport protein.
[0013] SEQ ID NO:2 is the deduced amino acid sequence of a sucrose
transport protein derived from the nucleotide sequence of SEQ ID
NO:1.
[0014] SEQ ID NO:3 is the nucleotide sequence comprising a portion
of the cDNA insert in clone cr1n.pk0075.f5 encoding a portion of a
corn sucrose transport protein.
[0015] SEQ ID NO:4 is the deduced amino acid sequence of a portion
of a sucrose transport protein derived from the nucleotide sequence
of SEQ ID NO:3.
[0016] SEQ ID NO:5 is the nucleotide sequence comprising a portion
of the cDNA insert in clone cr1n.pk0095.c10 encoding a portion of a
corn sucrose transport protein.
[0017] SEQ ID NO:6 is the deduced amino acid sequence of a portion
of a sucrose transport protein derived from the nucleotide sequence
of SEQ ID NO:5.
[0018] SEQ ID NO:7 is the nucleotide sequence comprising the entire
cDNA insert in clone rlr2.pk0043.b1 encoding a portion of a rice
sucrose transport protein.
[0019] SEQ ID NO:8 is the deduced amino acid sequence of a portion
of a sucrose transport protein derived from the nucleotide sequence
of SEQ ID NO:7.
[0020] SEQ ID NO:9 is the nucleotide sequence comprising the entire
cDNA insert in clone rls6.pk0076.e2 encoding an entire rice sucrose
transport protein.
[0021] SEQ ID NO:10 is the deduced amino acid sequence of a sucrose
transport protein derived from the nucleotide sequence of SEQ ID
NO:9.
[0022] SEQ ID NO:11 is the nucleotide sequence comprising the
entire cDNA insert in clone sfl1.pk0001.g1 encoding an entire
soybean sucrose transport protein.
[0023] SEQ ID NO:12 is the deduced amino acid sequence of a sucrose
transport protein derived from the nucleotide sequence of SEQ ID
NO:11.
[0024] SEQ ID NO:13 is the nucleotide sequence comprising a contig
assembled from the cDNA inserts in clones sfl1.pk0043.c7 and
sdp3c.pk012.c13 encoding a portion of a soybean sucrose transport
protein.
[0025] SEQ ID NO:14 is the deduced amino acid sequence of a portion
of a sucrose transport protein derived from the nucleotide sequence
of SEQ ID NO:13.
[0026] SEQ ID NO:15 is the nucleotide sequence comprising a portion
of the cDNA insert in clone vs1n.pk0002.h3 encoding a portion of a
Vernonia sucrose transport protein.
[0027] SEQ ID NO:16 is the deduced amino acid sequence of a portion
of a sucrose transport protein derived from the nucleotide sequence
of SEQ ID NO:15.
[0028] SEQ ID NO:17 is the nucleotide sequence comprising the
entire cDNA insert in clone wle1n.pk0007.h8 encoding a portion of a
wheat sucrose transport protein.
[0029] SEQ ID NO:18 is the deduced amino acid sequence of a portion
of a sucrose transport protein derived from the nucleotide sequence
of SEQ ID NO:17.
[0030] SEQ ID NO:19 is the nucleotide sequence comprising the
entire cDNA insert in clone wle1n.pk0103.c11 encoding an entire
wheat sucrose transport protein.
[0031] SEQ ID NO:20 is the deduced amino acid sequence of a sucrose
transport protein derived from the nucleotide sequence of SEQ ID
NO:19.
[0032] SEQ ID NO:21 is the nucleotide sequence comprising the
entire cDNA insert in clone wlm24.pk0015.g11 encoding an entire
wheat sucrose transport protein.
[0033] SEQ ID NO:22 is the deduced amino acid sequence of a sucrose
transport protein derived from the nucleotide sequence of SEQ ID
NO:21.
[0034] SEQ ID NO:23 is the nucleotide sequence comprising the
entire cDNA insert in clone wlmk1.pk0002.e11 encoding an entire
wheat sucrose transport protein.
[0035] SEQ ID NO:24 is the deduced amino acid sequence of a sucrose
transport protein derived from the nucleotide sequence of SEQ ID
NO:23.
[0036] SEQ ID NO:25 is the amino acid sequence of a Daucus carota
sucrose transport protein (NCBI Identifier No. gi 2969887).
[0037] SEQ ID NO:26 is the amino acid sequence of an Oryza sativa
sucrose transport protein (NCBI Identifier No. gi 2723471).
[0038] SEQ ID NO:27 is the amino acid sequence of a Ricinus
communis sucrose transport protein (NCBI Identifier No. gi
542020).
[0039] SEQ ID NO:28 is the amino acid sequence of a Vicia faba
sucrose transport protein (NCBI Identifier No. gi 1935019).
[0040] The Sequence Listing contains the one letter code for
nucleotide sequence characters and the three letter codes for amino
acids as defined in conformity with the IUPAC-IUBMB standards
described in Nucleic Acids Research 13:3021-3030 (1985) and in the
Biochemical Journal 219 (No. 2):345-373 (1984) which are herein
incorporated by reference. The symbols and format used for
nucleotide and amino acid sequence data comply with the rules set
forth in 37 C.F.R. § 1.822.
DETAILED DESCRIPTION OF THE INVENTION
[0041] In the context of this disclosure, a number of terms shall
be utilized. As used herein, an "isolated nucleic acid fragment" is
a polymer of RNA or DNA that is single- or double-stranded,
optionally containing synthetic, non-natural or altered nucleotide
bases. An isolated nucleic acid fragment in the form of a polymer
of DNA may be comprised of one or more segments of cDNA, genomic
DNA or synthetic DNA. As used herein, "contig" refers to an
assemblage of overlapping nucleic acid sequences to form one
contiguous nucleotide sequence. For example, several DNA sequences
can be compared and aligned to identify common or overlapping
regions. The individual sequences can then be assembled into a
single contiguous nucleotide sequence. As used herein,
"substantially similar" refers to nucleic acid fragments wherein
changes in one or more nucleotide bases result in substitution of
one or more amino acids, but do not affect the functional
properties of the protein encoded by the DNA sequence.
[0042] "Substantially similar" also refers to nucleic acid
fragments wherein changes in one or more nucleotide bases do not
affect the ability of the nucleic acid fragment to mediate
alteration of gene expression by antisense or co-suppression
technology. "Substantially similar" also refers to modifications of
the nucleic acid fragments of the instant invention such as
deletion or insertion of one or more nucleotides that do not
substantially affect the functional properties of the resulting
transcript vis-a-vis the ability to mediate alteration of gene
expression by antisense or co-suppression technology or alteration
of the functional properties of the resulting protein molecule. It
is therefore understood that the invention encompasses more than
the specific exemplary sequences.
[0043] For example, it is well known in the art that antisense
suppression and co-suppression of gene expression may be
accomplished using nucleic acid fragments representing less than
the entire coding region of a gene, and by nucleic acid fragments
that do not share 100% sequence identity with the gene to be
suppressed. Moreover, alterations in a gene which result in the
production of a chemically equivalent amino acid at a given site,
but do not affect the functional properties of the encoded protein,
are well known in the art. Thus, a codon for the amino acid
alanine, a hydrophobic amino acid, may be substituted by a codon
encoding another less hydrophobic residue, such as glycine, or a
more hydrophobic residue, such as valine, leucine, or isoleucine.
Similarly, changes which result in substitution of one negatively
charged residue for another, such as aspartic acid for glutamic
acid, or one positively charged residue for another, such as lysine
for arginine, can also be expected to produce a functionally
equivalent product. Nucleotide changes which result in alteration
of the N-terminal and C-terminal portions of the protein molecule
would also not be expected to alter the activity of the protein.
Each of the proposed modifications is well within the routine skill
in the art, as is determination of retention of biological activity
of the encoded products.
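By way of illustration only, the substitution examples given above can be sketched as a lookup of residue groups. The grouping below is an assumption drawn solely from the residues named in this paragraph (one-letter codes), and the names `RESIDUE_GROUPS`, `shared_group` and `is_conservative` are hypothetical. Membership in a common group suggests, but does not establish, functional equivalence; retention of biological activity would still be determined experimentally.

```python
# Illustrative residue groups taken from the examples in the paragraph above.
# This grouping is an assumption for demonstration, not an exhaustive
# chemical classification of amino acids.
RESIDUE_GROUPS = {
    "hydrophobic": set("AGVLI"),      # alanine, glycine, valine, leucine, isoleucine
    "negatively charged": set("DE"),  # aspartic acid, glutamic acid
    "positively charged": set("KR"),  # lysine, arginine
}


def shared_group(residue_a, residue_b):
    """Return the name of the group containing both residues, or None."""
    a, b = residue_a.upper(), residue_b.upper()
    for name, members in RESIDUE_GROUPS.items():
        if a in members and b in members:
            return name
    return None


def is_conservative(residue_a, residue_b):
    """True when both residues fall in the same illustrative group."""
    return shared_group(residue_a, residue_b) is not None


if __name__ == "__main__":
    for pair in [("A", "V"), ("D", "E"), ("K", "R"), ("D", "K")]:
        print(pair, shared_group(*pair))
```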
[0044] Moreover, substantially similar nucleic acid fragments may
also be characterized by their ability to hybridize, under
stringent conditions (0.1×SSC, 0.1% SDS, 65 °C), with
the nucleic acid fragments disclosed herein.
[0045] Substantially similar nucleic acid fragments of the instant
invention may also be characterized by the percent similarity of
the amino acid sequences that they encode to the amino acid
sequences disclosed herein, as determined by algorithms commonly
employed by those skilled in this art. Preferred are those nucleic
acid fragments whose nucleotide sequences encode amino acid
sequences that are 90% similar to the amino acid sequences reported
herein. Most preferred are nucleic acid fragments that encode amino
acid sequences that are 95% similar to the amino acid sequences
reported herein. Sequence alignments and percent similarity
calculations were performed using the Megalign program of the
LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison,
Wis.). Multiple alignment of the sequences was performed using the
Clustal method of alignment (Higgins, D. G. and Sharp, P. M. (1989)
CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP
LENGTH PENALTY=10), (hereafter Clustal algorithm). Default
parameters for pairwise alignments using the Clustal method were
KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5.
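The 90% and 95% preferences above can be illustrated with a simplified calculation. The sketch below is not the Megalign or Clustal computation; it assumes two sequences that have already been aligned (gaps written as "-") and computes percent identity over the non-gap columns, which is a stricter measure than percent similarity. The function names and the default threshold argument are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of non-gap alignment columns in which both residues match.

    Assumes the inputs are rows of an existing pairwise alignment, so they
    must have equal length; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_preference(aligned_a, aligned_b, threshold=90.0):
    """True when the simplified score reaches the stated threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("MKTLV", "MRTLV"))    # 4 of 5 columns match
    print(meets_preference("MKTLV", "MRTLV"))    # below the 90% preference
```

A production comparison would instead rely on the alignment program's own scoring, since similarity scores also credit conservative replacements.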
[0046] A "substantial portion" of an amino acid or nucleotide
sequence comprises enough of the amino acid sequence of a
polypeptide or the nucleotide sequence of a gene to afford putative
identification of that polypeptide or gene, either by manual
evaluation of the sequence by one skilled in the art, or by
computer-automated sequence comparison and identification using
algorithms such as BLAST (Basic Local Alignment Search Tool;
Altschul, S. F., et al., (1990) J. Mol. Biol. 215:403-410; see also
www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more
contiguous amino acids or thirty or more nucleotides is necessary
in order to putatively identify a polypeptide or nucleic acid
sequence as homologous to a known protein or gene. Moreover, with
respect to nucleotide sequences, gene specific oligonucleotide
probes comprising 20-30 contiguous nucleotides may be used in
sequence-dependent methods of gene identification (e.g., Southern
hybridization) and isolation (e.g., in situ hybridization of
bacterial colonies or bacteriophage plaques). In addition, short
oligonucleotides of 12-15 bases may be used as amplification
primers in PCR in order to obtain a particular nucleic acid
fragment comprising the primers. Accordingly, a "substantial
portion" of a nucleotide sequence comprises enough of the sequence
to afford specific identification and/or isolation of a nucleic
acid fragment comprising the sequence. The instant specification
teaches partial or complete amino acid and nucleotide sequences
encoding one or more particular plant proteins. The skilled
artisan, having the benefit of the sequences as reported herein,
may now use all or a substantial portion of the disclosed sequences
for purposes known to those skilled in this art. Accordingly, the
instant invention comprises the complete sequences as reported in
the accompanying Sequence Listing, as well as substantial portions
of those sequences as defined above.
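The length guidelines in the preceding paragraph can be summarized, purely as a sketch, by a helper that reports which of the named uses a fragment of a given length could serve. The constant and function names are hypothetical, and the ranges are taken directly from the paragraph above; actual suitability of a probe or primer also depends on its sequence, not only its length.

```python
# Length guidelines as stated in the paragraph above.
MIN_AA_FOR_IDENTIFICATION = 10   # ten or more contiguous amino acids
MIN_NT_FOR_IDENTIFICATION = 30   # thirty or more nucleotides
PROBE_NT_RANGE = (20, 30)        # gene specific hybridization probes
PRIMER_NT_RANGE = (12, 15)       # short PCR amplification primers


def possible_uses(sequence, kind):
    """List the uses named in the text that a fragment of this length fits.

    kind is "aa" for amino acid sequences or "nt" for nucleotide sequences.
    """
    n = len(sequence)
    uses = []
    if kind == "aa":
        if n >= MIN_AA_FOR_IDENTIFICATION:
            uses.append("putative identification")
    elif kind == "nt":
        if PRIMER_NT_RANGE[0] <= n <= PRIMER_NT_RANGE[1]:
            uses.append("PCR amplification primer")
        if PROBE_NT_RANGE[0] <= n <= PROBE_NT_RANGE[1]:
            uses.append("hybridization probe")
        if n >= MIN_NT_FOR_IDENTIFICATION:
            uses.append("putative identification")
    else:
        raise ValueError('kind must be "aa" or "nt"')
    return uses


if __name__ == "__main__":
    print(possible_uses("ACGT" * 6, "nt"))   # a 24-nucleotide fragment
    print(possible_uses("M" * 10, "aa"))     # a 10-residue fragment
```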
[0047] "Codon degeneracy" refers to divergence in the genetic code
permitting variation of the nucleotide sequence without affecting
the amino acid sequence of an encoded polypeptide. Accordingly, the
instant invention relates to any nucleic acid fragment that encodes
all or a substantial portion of the amino acid sequence encoding
the sucrose transport proteins as set forth in SEQ ID NOs:2, 4, 6,
8, 10, 12, 14, 16, 18, 20, 22 and 24. The skilled artisan is well
aware of the "codon-bias" exhibited by a specific host cell in
usage of nucleotide codons to specify a given amino acid.
Therefore, when synthesizing a gene for improved expression in a
host cell, it is desirable to design the gene such that its
frequency of codon usage approaches the frequency of preferred
codon usage of the host cell.
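Codon degeneracy can be illustrated with a short sketch that translates two different nucleotide sequences into the same peptide. The example assumes the standard genetic code; the two short sequences are invented for demonstration and are not taken from the Sequence Listing, and the names `CODON_TABLE`, `translate` and `synonymous_codons` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid):
    """Return every codon that specifies the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    # Two invented sequences that differ at three positions.
    print(translate("ATGGCTGAAAAA"))   # MAEK
    print(translate("ATGGCCGAGAAG"))   # MAEK
    print(synonymous_codons("A"))      # the four alanine codons
```

Both sequences encode Met-Ala-Glu-Lys, so either would fall within the definition above despite the nucleotide differences.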
[0048] "Synthetic genes" can be assembled from oligonucleotide
building blocks that are chemically synthesized using procedures
known to those skilled in the art. These building blocks are
ligated and annealed to form gene segments which are then
enzymatically assembled to construct the entire gene. "Chemically
synthesized", as related to a sequence of DNA, means that the
component nucleotides were assembled in vitro. Manual chemical
synthesis of DNA may be accomplished using well established
procedures, or automated chemical synthesis can be performed using
one of a number of commercially available machines. Accordingly,
the genes can be tailored for optimal gene expression based on
optimization of nucleotide sequence to reflect the codon bias of
the host cell. The skilled artisan appreciates the likelihood of
successful gene expression if codon usage is biased towards those
codons favored by the host. Determination of preferred codons can
be based on a survey of genes derived from the host cell where
sequence information is available.
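The survey-based determination of preferred codons described above might be sketched as follows. This is a minimal illustration under stated assumptions: the standard genetic code, a hypothetical list of in-frame host coding sequences, and a simple "most frequent codon wins" rule. Practical gene design would weigh additional factors, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def preferred_codons(host_coding_sequences):
    """Survey host genes and pick the most frequent codon per amino acid."""
    counts = Counter()
    for cds in host_coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    best = {}
    for codon, aa in CODON_TABLE.items():
        # On ties, the codon seen first in table order is kept.
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    return best


def back_translate(protein, preferred):
    """Write a coding sequence for the protein using the preferred codons."""
    return "".join(preferred[aa] for aa in protein.upper())


if __name__ == "__main__":
    # Invented host sequences used only to demonstrate the survey.
    host_genes = ["ATGGCCGCCAAGGAGTAA", "ATGGCCAAGGAGGCTTGA"]
    table = preferred_codons(host_genes)
    print(back_translate("MAEK", table))
```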
[0049] "Gene" refers to a nucleic acid fragment that expresses a
specific protein, including regulatory sequences preceding (5'
non-coding sequences) and following (3' non-coding sequences) the
coding sequence. "Native gene" refers to a gene as found in nature
with its own regulatory sequences. "Chimeric gene" refers to any gene
that is not a native gene, comprising regulatory and coding
sequences that are not found together in nature. Accordingly, a
chimeric gene may comprise regulatory sequences and coding
sequences that are derived from different sources, or regulatory
sequences and coding sequences derived from the same source, but
arranged in a manner different than that found in nature.
"Endogenous gene" refers to a native gene in its natural location
in the genome of an organism. A "foreign" gene refers to a gene not
normally found in the host organism, but that is introduced into
the host organism by gene transfer. Foreign genes can comprise
native genes inserted into a non-native organism, or chimeric
genes. A "transgene" is a gene that has been introduced into the
genome by a transformation procedure.
[0050] "Coding sequence" refers to a DNA sequence that codes for a
specific amino acid sequence. "Regulatory sequences" refer to
nucleotide sequences located upstream (5' non-coding sequences),
within, or downstream (3' non-coding sequences) of a coding
sequence, and which influence the transcription, RNA processing or
stability, or translation of the associated coding sequence.
Regulatory sequences may include promoters, translation leader
sequences, introns, and polyadenylation recognition sequences.
[0051] "Promoter" refers to a DNA sequence capable of controlling
the expression of a coding sequence or functional RNA. In general,
a coding sequence is located 3' to a promoter sequence. The
promoter sequence consists of proximal and more distal upstream
elements, the latter elements often referred to as enhancers.
Accordingly, an "enhancer" is a DNA sequence which can stimulate
promoter activity and may be an innate element of the promoter or a
heterologous element inserted to enhance the level or
tissue-specificity of a promoter. Promoters may be derived in their
entirety from a native gene, or be composed of different elements
derived from different promoters found in nature, or even comprise
synthetic DNA segments. It is understood by those skilled in the
art that different promoters may direct the expression of a gene in
different tissues or cell types, or at different stages of
development, or in response to different environmental conditions.
Promoters which cause a gene to be expressed in most cell types at
most times are commonly referred to as "constitutive promoters".
New promoters of various types useful in plant cells are constantly
being discovered; numerous examples may be found in the compilation
by Okamuro and Goldberg, (1989) Biochemistry of Plants 15: 1-82. It
is further recognized that since in most cases the exact boundaries
of regulatory sequences have not been completely defined, DNA
fragments of different lengths may have identical promoter
activity.
[0052] The "translation leader sequence" refers to a DNA sequence
located between the promoter sequence of a gene and the coding
sequence. The translation leader sequence is present in the fully
processed mRNA upstream of the translation start sequence. The
translation leader sequence may affect processing of the primary
transcript to mRNA, mRNA stability or translation efficiency.
Examples of translation leader sequences have been described
(Turner, R. and Foster, G. D. (1995) Molecular Biotechnology
3:225).
[0053] The "3' non-coding sequences" refer to DNA sequences located
downstream of a coding sequence and include polyadenylation
recognition sequences and other sequences encoding regulatory
signals capable of affecting mRNA processing or gene expression.
The polyadenylation signal is usually characterized by affecting
the addition of polyadenylic acid tracts to the 3' end of the mRNA
precursor. The use of different 3' non-coding sequences is
exemplified by Ingelbrecht et al., (1989) Plant Cell 1:671-680.
[0054] "RNA transcript" refers to the product resulting from RNA
polymerase-catalyzed transcription of a DNA sequence. When the RNA
transcript is a perfect complementary copy of the DNA sequence, it
is referred to as the primary transcript; when it is an RNA
sequence derived from posttranscriptional processing of the primary
transcript, it is referred to as the mature RNA. "Messenger RNA
(mRNA)" refers to the RNA that is without introns and that can be
translated into protein by the cell. "cDNA" refers to a
double-stranded DNA that is complementary to and derived from mRNA.
"Sense" RNA refers to an RNA transcript that includes the mRNA and so
can be translated into protein by the cell. "Antisense RNA" refers
to an RNA transcript that is complementary to all or part of a
target primary transcript or mRNA and that blocks the expression of
a target gene (U.S. Pat. No. 5,107,065, incorporated herein by
reference). The complementarity of an antisense RNA may be with any
part of the specific gene transcript, i.e., at the 5' non-coding
sequence, 3' non-coding sequence, introns, or the coding sequence.
"Functional RNA" refers to sense RNA, antisense RNA, ribozyme RNA,
or other RNA that may not be translated but yet has an effect on
cellular processes.
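The relationship between sense and antisense RNA defined above can be shown with a brief sketch. It assumes the input is the coding (sense) strand of a DNA fragment written 5' to 3' using only A, C, G and T; the six-base example is invented for illustration and the function names are hypothetical.

```python
# Complement map for unambiguous DNA bases.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def sense_rna(coding_strand):
    """RNA with the same sequence as the coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def antisense_rna(coding_strand):
    """RNA complementary to the sense transcript, written 5' to 3'."""
    return sense_rna(reverse_complement(coding_strand))


if __name__ == "__main__":
    fragment = "ATGGCC"              # invented coding-strand fragment
    print(sense_rna(fragment))       # AUGGCC
    print(antisense_rna(fragment))   # GGCCAU
```

Read antiparallel to one another, the two transcripts pair base for base, which is the complementarity the definition of antisense RNA relies on.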
[0055] The term "operably linked" refers to the association of
nucleic acid sequences on a single nucleic acid fragment so that
the function of one is affected by the other. For example, a
promoter is operably linked with a coding sequence when it is
capable of affecting the expression of that coding sequence (i.e.,
that the coding sequence is under the transcriptional control of
the promoter). Coding sequences can be operably linked to
regulatory sequences in sense or antisense orientation.
[0056] The term "expression", as used herein, refers to the
transcription and stable accumulation of sense (mRNA) or antisense
RNA derived from the nucleic acid fragment of the invention.
Expression may also refer to translation of mRNA into a
polypeptide. "Antisense inhibition" refers to the production of
antisense RNA transcripts capable of suppressing the expression of
the target protein. "Overexpression" refers to the production of a
gene product in transgenic organisms that exceeds levels of
production in normal or non-transformed organisms. "Co-suppression"
refers to the production of sense RNA transcripts capable of
suppressing the expression of identical or substantially similar
foreign or endogenous genes (U.S. Pat. No. 5,231,020, incorporated
herein by reference).
[0057] "Altered levels" refers to the production of gene product(s)
in transgenic organisms in amounts or proportions that differ from
those of normal or non-transformed organisms.
[0058] "Mature" protein refers to a post-translationally processed
polypeptide; i.e., one from which any pre- or propeptides present
in the primary translation product have been removed. "Precursor"
protein refers to the primary product of translation of mRNA; i.e.,
with pre- and propeptides still present. Pre- and propeptides may
be but are not limited to intracellular localization signals.
[0059] A "chloroplast transit peptide" is an amino acid sequence
which is translated in conjunction with a protein and directs the
protein to the chloroplast or other plastid types present in the
cell in which the protein is made. "Chloroplast transit sequence"
refers to a nucleotide sequence that encodes a chloroplast transit
peptide. A "signal peptide" is an amino acid sequence which is
translated in conjunction with a protein and directs the protein to
the secretory system (Chrispeels, J. J., (1991) Ann. Rev. Plant
Phys. Plant Mol. Biol. 42:21-53). If the protein is to be directed
to a vacuole, a vacuolar targeting signal (supra) can further be
added, or if to the endoplasmic reticulum, an endoplasmic reticulum
retention signal (supra) may be added. If the protein is to be
directed to the nucleus, any signal peptide present should be
removed and instead a nuclear localization signal included (Raikhel
(1992) Plant Phys. 100:1627-1632).
[0060] "Transformation" refers to the transfer of a nucleic acid
fragment into the genome of a host organism, resulting in
genetically stable inheritance. Host organisms containing the
transformed nucleic acid fragments are referred to as "transgenic"
organisms. Examples of methods of plant transformation include
Agrobacterium-mediated transformation (De Blaere et al. (1987)
Meth. Enzymol. 143:277) and particle-accelerated or "gene gun"
transformation technology (Klein et al. (1987) Nature (London)
327:70-73; U.S. Pat. No. 4,945,050, incorporated herein by
reference).
[0061] Standard recombinant DNA and molecular cloning techniques
used herein are well known in the art and are described more fully
in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning:
A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold
Spring Harbor, 1989 (hereinafter "Maniatis").
[0062] Nucleic acid fragments encoding at least a portion of
several sucrose transport proteins have been isolated and
identified by comparison of random plant cDNA sequences to public
databases containing nucleotide and protein sequences using the
BLAST algorithms well known to those skilled in the art. Table 1
lists the proteins that are described herein, and the designation
of the cDNA clones that comprise the nucleic acid fragments
encoding these proteins.
TABLE 1
Sucrose Transport Proteins

Enzyme               Clone              Plant
Sucrose Transporter  cepe7.pk0015.d10   Corn
                     cr1n.pk0095.c10    Corn
                     cr1n.pk0075.f5     Corn
                     rlr2.pk0043.b1     Rice
                     rls6.pk0076.e2     Rice
                     sfl1.pk0001.g1     Soybean
                     sfl1.pk0043.c7     Soybean
                     sdp3c.pk012.c13    Soybean
                     vs1n.pk0002.h3     Vernonia
                     wle1n.pk0007.h8    Wheat
                     wle1n.pk0103.c11   Wheat
                     wlm24.pk0015.g11   Wheat
                     wlmk1.pk0002.e11   Wheat
[0063] The nucleic acid fragments of the instant invention may be
used to isolate cDNAs and genes encoding homologous proteins from
the same or other plant species. Isolation of homologous genes
using sequence-dependent protocols is well known in the art.
Examples of sequence-dependent protocols include, but are not
limited to, methods of nucleic acid hybridization, and methods of
DNA and RNA amplification as exemplified by various uses of nucleic
acid amplification technologies (e.g., polymerase chain reaction,
ligase chain reaction).
[0064] For example, genes encoding other sucrose transport
proteins, either as cDNAs or genomic DNAs, could be isolated
directly by using all or a portion of the instant nucleic acid
fragments as DNA hybridization probes to screen libraries from any
desired plant employing methodology well known to those skilled in
the art. Specific oligonucleotide probes based upon the instant
nucleic acid sequences can be designed and synthesized by methods
known in the art (Maniatis). Moreover, the entire sequences can be
used directly to synthesize DNA probes by methods known to the
skilled artisan such as random primer DNA labeling, nick
translation, or end-labeling techniques, or RNA probes using
available in vitro transcription systems. In addition, specific
primers can be designed and used to amplify a part or all of the
instant sequences. The resulting amplification products can be
labeled directly during amplification reactions or labeled after
amplification reactions, and used as probes to isolate full length
cDNA or genomic fragments under conditions of appropriate
stringency.
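By way of illustration only, the use of specific primers to amplify part of a known sequence can be sketched in silico. The following is a minimal sketch, assuming exact primer matches and a single binding site per primer; the template and primer sequences are hypothetical and are not taken from the instant sequences, and the function names are introduced here for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def predict_amplicon(template: str, forward: str, reverse: str):
    """Return the product expected from a primer pair, or None.

    The forward primer matches the template strand as written. The reverse
    primer anneals to that strand, so its reverse complement is the text
    searched for downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Hypothetical template and primers, invented for illustration only.
template = "GGATCCATGGCTAGCTTGACCGATTACGGCATTAAGCTT"
product = predict_amplicon(template, "ATGGCTAGC", "AATGCCGTA")
print(product, len(product))
```

A real primer design would also weigh melting temperature, primer length and mismatch tolerance, none of which this sketch models.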
[0065] In addition, two short segments of the instant nucleic acid
fragments may be used in polymerase chain reaction protocols to
amplify longer nucleic acid fragments encoding homologous genes
from DNA or RNA. The polymerase chain reaction may also be
performed on a library of cloned nucleic acid fragments wherein the
sequence of one primer is derived from the instant nucleic acid
fragments, and the sequence of the other primer takes advantage of
the presence of the polyadenylic acid tracts at the 3' end of the
mRNA precursor encoding plant genes. Alternatively, the second
primer sequence may be based upon sequences derived from the
cloning vector. For example, the skilled artisan can follow the
RACE protocol (Frohman et al., (1988) PNAS USA 85:8998) to generate
cDNAs by using PCR to amplify copies of the region between a single
point in the transcript and the 3' or 5' end. Primers oriented in
the 3' and 5' directions can be designed from the instant
sequences. Using commercially available 3' RACE or 5' RACE systems
(BRL), specific 3' or 5' cDNA fragments can be isolated (Ohara et
al., (1989) PNAS USA 86:5673; Loh et al., (1989) Science 243:217).
Products generated by the 3' and 5' RACE procedures can be combined
to generate full-length cDNAs (Frohman, M. A. and Martin, G. R.,
(1989) Techniques 1: 165).
[0066] Availability of the instant nucleotide and deduced amino
acid sequences facilitates immunological screening of cDNA
expression libraries. Synthetic peptides representing portions of
the instant amino acid sequences may be synthesized. These peptides
can be used to immunize animals to produce polyclonal or monoclonal
antibodies with specificity for peptides or proteins comprising the
amino acid sequences. These antibodies can then be used to
screen cDNA expression libraries to isolate full-length cDNA clones
of interest (Lerner, R. A. (1984) Adv. Immunol. 36:1;
Maniatis).
[0067] The nucleic acid fragments of the instant invention may be
used to create transgenic plants in which the disclosed sucrose
transport proteins are present at higher or lower levels than
normal or in cell types or developmental stages in which they are
not normally found. This would have the effect of altering the
level of sucrose metabolism in those cells.
[0068] Overexpression of the sucrose transport proteins of the
instant invention may be accomplished by first constructing a
chimeric gene in which the coding region is operably linked to a
promoter capable of directing expression of a gene in the desired
tissues at the desired stage of development. For reasons of
convenience, the chimeric gene may comprise promoter sequences and
translation leader sequences derived from the same genes. 3'
Non-coding sequences encoding transcription termination signals may
also be provided. The instant chimeric gene may also comprise one
or more introns in order to facilitate gene expression.
[0069] Plasmid vectors comprising the instant chimeric gene can
then be constructed. The choice of plasmid vector is dependent upon
the method that will be used to transform host plants. The skilled
artisan is well aware of the genetic elements that must be present
on the plasmid vector in order to successfully transform, select
and propagate host cells containing the chimeric gene. The skilled
artisan will also recognize that different independent
transformation events will result in different levels and patterns
of expression (Jones et al., (1985) EMBO J. 4:2411-2418; De Almeida
et al., (1989) Mol. Gen. Genetics 218:78-86), and thus that
multiple events must be screened in order to obtain lines
displaying the desired expression level and pattern. Such screening
may be accomplished by Southern analysis of DNA, Northern analysis
of mRNA expression, Western analysis of protein expression, or
phenotypic analysis.
[0070] For some applications it may be useful to direct the instant
sucrose transport proteins to different cellular compartments, or
to facilitate their secretion from the cell. It is thus envisioned
that the chimeric gene described above may be further supplemented
by altering the coding sequence to encode a sucrose transport
protein with appropriate intracellular targeting sequences such as
transit sequences (Keegstra, K. (1989) Cell 56:247-253), signal
sequences or sequences encoding endoplasmic reticulum localization
(Chrispeels, J. J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol.
42:21-53), or nuclear localization signals (Raikhel, N. (1992)
Plant Phys. 100:1627-1632) added and/or with targeting sequences
that are already present removed. While the references cited give
examples of each of these, the list is not exhaustive and more
targeting signals of utility may be discovered in the future.
[0071] It may also be desirable to reduce or eliminate expression
of genes encoding sucrose transport proteins in plants for some
applications. In order to accomplish this, a chimeric gene designed
for co-suppression of the instant sucrose transport proteins can be
constructed by linking a gene or gene fragment encoding a sucrose
transport protein to plant promoter sequences. Alternatively, a
chimeric gene designed to express antisense RNA for all or part of
the instant nucleic acid fragment can be constructed by linking the
gene or gene fragment in reverse orientation to plant promoter
sequences. Either the co-suppression or antisense chimeric genes
could be introduced into plants via transformation wherein
expression of the corresponding endogenous genes is reduced or
eliminated.
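The relationship between a fragment linked in reverse orientation and the antisense transcript it would yield can be sketched as follows. This is an illustrative sketch only: the fragment is hypothetical, promoter and terminator sequences are ignored, and the helper names are introduced here rather than drawn from any cited method.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def transcribe(coding_strand: str) -> str:
    """RNA with the same 5'->3' sequence as the given DNA strand (T -> U)."""
    return coding_strand.upper().replace("T", "U")


def antisense_rna(sense_fragment: str) -> str:
    """Transcript expected when the fragment is read in reverse orientation."""
    return transcribe(reverse_complement(sense_fragment))


def is_complementary(rna_a: str, rna_b: str) -> bool:
    """True if the two RNAs pair base-for-base in antiparallel orientation."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return len(rna_a) == len(rna_b) and all(
        pairs[x] == y for x, y in zip(rna_a, reversed(rna_b))
    )


# Hypothetical coding fragment, invented for illustration only.
fragment = "ATGGCTTCAGGA"
sense = transcribe(fragment)
anti = antisense_rna(fragment)
print(sense, anti, is_complementary(sense, anti))
```

The check at the end confirms only that the two transcripts can pair along their full length; it says nothing about whether suppression would occur in a plant.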
[0072] The instant sucrose transport proteins (or portions thereof)
may be produced in heterologous host cells, particularly in the
cells of microbial hosts, and can be used to prepare antibodies to
these proteins by methods well known to those skilled in the
art. The antibodies are useful for detecting sucrose transport
proteins in situ in cells or in vitro in cell extracts. Preferred
heterologous host cells for production of the instant sucrose
transport proteins are microbial hosts. Microbial expression
systems and expression vectors containing regulatory sequences that
direct high level expression of foreign proteins are well known to
those skilled in the art. Any of these could be used to construct a
chimeric gene for production of the instant sucrose transport
proteins. This chimeric gene could then be introduced into
appropriate microorganisms via transformation to provide high level
expression of the encoded sucrose transport protein. An example of
a vector for high level expression of the instant sucrose transport
proteins in a bacterial host is provided (Example 6).
[0073] All or a substantial portion of the nucleic acid fragments
of the instant invention may also be used as probes for genetically
and physically mapping the genes that they are a part of, and as
markers for traits linked to those genes. Such information may be
useful in plant breeding in order to develop lines with desired
phenotypes. For example, the instant nucleic acid fragments may be
used as restriction fragment length polymorphism (RFLP) markers.
Southern blots (Maniatis) of restriction-digested plant genomic DNA
may be probed with the nucleic acid fragments of the instant
invention. The resulting banding patterns may then be subjected to
genetic analyses using computer programs such as MapMaker (Lander
et al., (1987) Genomics 1:174-181) in order to construct a genetic
map. In addition, the nucleic acid fragments of the instant
invention may be used to probe Southern blots containing
restriction endonuclease-treated genomic DNAs of a set of
individuals representing parent and progeny of a defined genetic
cross. Segregation of the DNA polymorphisms is noted and used to
calculate the position of the instant nucleic acid sequence in the
genetic map previously obtained using this population (Botstein, D.
et al., (1980) Am. J. Hum. Genet. 32:314-331).
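As a simplified illustration of how segregation data can place a probe relative to a known marker, the sketch below computes a two-point recombination fraction from backcross scores and converts it to a map distance with the Haldane mapping function. This is an assumption-laden stand-in, not the multipoint analysis performed by programs such as MapMaker: the scores are invented, the markers are assumed to be in coupling phase, and the function names are hypothetical.

```python
import math


def recombination_fraction(marker_a: str, marker_b: str) -> float:
    """Fraction of individuals whose scores differ between two markers.

    Each string holds one score per backcross individual ('A' or 'H'),
    in the same order. Coupling phase is assumed, so a differing score
    is counted as a recombinant.
    """
    if len(marker_a) != len(marker_b) or not marker_a:
        raise ValueError("score strings must be non-empty and equal in length")
    recombinants = sum(1 for a, b in zip(marker_a, marker_b) if a != b)
    return recombinants / len(marker_a)


def haldane_cm(r: float) -> float:
    """Haldane map distance in centimorgans for a recombination fraction r."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Invented scores for 20 backcross individuals, for illustration only.
known_marker = "AH" * 10
probe_rflp = "AH" * 9 + "HA"
r = recombination_fraction(known_marker, probe_rflp)
print(f"r = {r:.2f}, distance = {haldane_cm(r):.1f} cM")
```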
[0074] The production and use of plant gene-derived probes for use
in genetic mapping is described in Bernatzky, R. and Tanksley,
S. D. (1986) Plant Mol. Biol. Reporter 4(1):37-41. Numerous
publications describe genetic mapping of specific cDNA clones using
the methodology outlined above or variations thereof. For example,
F2 intercross populations, backcross populations, randomly mated
populations, near isogenic lines, and other sets of individuals may
be used for mapping. Such methodologies are well known to those
skilled in the art.
[0075] Nucleic acid probes derived from the instant nucleic acid
sequences may also be used for physical mapping (i.e., placement of
sequences on physical maps; see Hoheisel, J. D., et al., In:
Nonmammalian Genomic Analysis: A Practical Guide, Academic Press
1996, pp. 319-346, and references cited therein).
[0076] In another embodiment, nucleic acid probes derived from the
instant nucleic acid sequences may be used in direct fluorescence
in situ hybridization (FISH) mapping (Trask, B. J. (1991) Trends
Genet. 7:149-154). Although current methods of FISH mapping favor
use of large clones (several to several hundred KB; see Laan, M. et
al. (1995) Genome Research 5:13-20), improvements in sensitivity
may allow performance of FISH mapping using shorter probes.
[0077] A variety of nucleic acid amplification-based methods of
genetic and physical mapping may be carried out using the instant
nucleic acid sequences. Examples include allele-specific
amplification (Kazazian, H. H. (1989) J. Lab. Clin. Med.
114(2):95-96), polymorphism of PCR-amplified fragments (CAPS;
Sheffield, V. C. et al. (1993) Genomics 16:325-332),
allele-specific ligation (Landegren, U. et al. (1988) Science
241:1077-1080), nucleotide extension reactions (Sokolov, B. P.
(1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping
(Walter, M. A. et al. (1997) Nature Genetics 7:22-28) and Happy
Mapping (Dear, P. H. and Cook, P. R. (1989) Nucleic Acid Res.
17:6795-6807). For these methods, the sequence of a nucleic acid
fragment is used to design and produce primer pairs for use in the
amplification reaction or in primer extension reactions. The design
of such primers is well known to those skilled in the art. In
methods employing PCR-based genetic mapping, it may be necessary to
identify DNA sequence differences between the parents of the
mapping cross in the region corresponding to the instant nucleic
acid sequence. This, however, is generally not necessary for
mapping methods.
[0078] Loss of function mutant phenotypes may be identified for the
instant cDNA clones either by targeted gene disruption protocols or
by identifying specific mutants for these genes contained in a
maize population carrying mutations in all possible genes
(Ballinger and Benzer, (1989) Proc. Natl. Acad. Sci. USA 86:9402;
Koes et al., (1995) Proc. Natl. Acad. Sci. USA 92:8149; Bensen et
al., (1995) Plant Cell 7:75). The latter approach may be
accomplished in two ways. First, short segments of the instant
nucleic acid fragments may be used in polymerase chain reaction
protocols in conjunction with a mutation tag sequence primer on
DNAs prepared from a population of plants in which Mutator
transposons or some other mutation-causing DNA element has been
introduced (see Bensen, supra). The amplification of a specific DNA
fragment with these primers indicates the insertion of the mutation
tag element in or near the plant gene encoding the sucrose
transport protein. Alternatively, the instant nucleic acid fragment
may be used as a hybridization probe against PCR amplification
products generated from the mutation population using the mutation
tag sequence primer in conjunction with an arbitrary genomic site
primer, such as that for a restriction enzyme site-anchored
synthetic adaptor. With either method, a plant containing a
mutation in the endogenous gene encoding a sucrose transport
protein can be identified and obtained. This mutant plant can then
be used to determine or confirm the natural function of the sucrose
transport protein gene product.
EXAMPLES
[0079] The present invention is further defined in the following
Examples, in which all parts and percentages are by weight and
degrees are Celsius, unless otherwise stated. It should be
understood that these Examples, while indicating preferred
embodiments of the invention, are given by way of illustration
only. From the above discussion and these Examples, one skilled in
the art can ascertain the essential characteristics of this
invention, and without departing from the spirit and scope thereof,
can make various changes and modifications of the invention to
adapt it to various usages and conditions.
Example 1
Composition of cDNA Libraries; Isolation and Sequencing of cDNA
Clones
[0080] cDNA libraries representing mRNAs from various corn, rice,
soybean, Vernonia and wheat tissues were prepared. The
characteristics of the libraries are described below.
TABLE 2
cDNA Libraries from Corn, Rice, Soybean, Vernonia and Wheat

Library  Tissue                                                Clone
cepe7    Corn epicotyl from 7 day old etiolated seedling       cepe7.pk0015.d10
cr1n     Corn root from 7 day seedling grown in light*         cr1n.pk0075.f5
                                                               cr1n.pk0095.c10
rlr2     Rice leaf 15 days after germination 2 hours after     rlr2.pk0043.b1
         infection of strain Magnaporthe grisea 4360-R-62
         (AVR2-YAMO)
rls6     Rice leaf 15 days after germination 6 hours after     rls6.pk0076.e2
         infection of strain Magnaporthe grisea 4360-R-62
         (AVR2-YAMO)
sdp3c    Soybean developing pods 8-9 mm                        sdp3c.pk012.c13
sfl1     Soybean immature flower                               sfl1.pk0001.g1
                                                               sfl1.pk0043.c7
vs1      Vernonia developing seed                              vs1n.pk0002.h3
wle1n    Wheat leaf 7 day old etiolated seedling light grown*  wle1n.pk0007.h8
                                                               wle1n.pk0103.c11
wlm24    Wheat seedling 24 hours after inoculation with        wlm24.pk0015.g11
         Erysiphe graminis
wlmk1    Wheat seedlings 1 hour after inoculation with         wlmk1.pk0002.e11
         Erysiphe graminis and treatment with fungicide**

*These libraries were normalized essentially as described in U.S.
Pat. No. 5,482,845.
**Application of 6-iodo-2-propoxy-3-propyl-4(3H)-quinazolinone;
synthesis and methods of using this compound are described in USSN
08/545,827, incorporated herein by reference.
[0081] cDNA libraries were prepared in Uni-ZAP™ XR vectors
according to the manufacturer's protocol (Stratagene Cloning
Systems, La Jolla, Calif.). Conversion of the Uni-ZAP™ XR
libraries into plasmid libraries was accomplished according to the
protocol provided by Stratagene. Upon conversion, cDNA inserts were
contained in the plasmid vector pBluescript. cDNA inserts from
randomly picked bacterial colonies containing recombinant
pBluescript plasmids were amplified via polymerase chain reaction
using primers specific for vector sequences flanking the inserted
cDNA sequences or plasmid DNA was prepared from cultured bacterial
cells. Amplified insert DNAs or plasmid DNAs were sequenced in
dye-primer sequencing reactions to generate partial cDNA sequences
(expressed sequence tags or "ESTs"; see Adams, M. D. et al., (1991)
Science 252:1651). The resulting ESTs were analyzed using a Perkin
Elmer Model 377 fluorescent sequencer.
Example 2
Identification of cDNA Clones
[0082] ESTs encoding sucrose transport proteins were identified by
conducting BLAST (Basic Local Alignment Search Tool; Altschul, S.
F., et al., (1993) J. Mol. Biol. 215:403-410; see also
www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences
contained in the BLAST "nr" database (comprising all non-redundant
GenBank CDS translations, sequences derived from the 3-dimensional
structure Brookhaven Protein Data Bank, the last major release of
the SWISS-PROT protein sequence database, EMBL, and DDBJ
databases). The cDNA sequences obtained in Example 1 were analyzed
for similarity to all publicly available DNA sequences contained in
the "nr" database using the BLASTN algorithm provided by the
National Center for Biotechnology Information (NCBI). The DNA
sequences were translated in all reading frames and compared for
similarity to all publicly available protein sequences contained in
the "nr" database using the BLASTX algorithm (Gish, W. and States,
D. J. (1993) Nature Genetics 3:266-272 and Altschul, Stephen F., et
al. (1997) Nucleic Acids Res. 25:3389-3402) provided by the NCBI.
For convenience, the P-value (probability) of observing a match of
a cDNA sequence to a sequence contained in the searched databases
merely by chance as calculated by BLAST is reported herein as a
"pLog" value, which represents the negative of the logarithm of the
reported P-value. Accordingly, the greater the pLog value, the
greater the likelihood that the cDNA sequence and the BLAST "hit"
represent homologous proteins.
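The pLog conversion described above can be written out as a short calculation. The sketch below assumes a base-10 logarithm, which the text does not state explicitly; the function name and the example P-values are hypothetical.

```python
import math


def plog(p_value: float) -> float:
    """Negative logarithm of a P-value (base 10 is an assumption here)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must be greater than 0 and at most 1")
    return -math.log10(p_value)


# Hypothetical P-values, chosen only to show the direction of the scale.
for p in (1e-3, 1e-31, 1e-148):
    print(f"P = {p:.0e} -> pLog = {plog(p):.2f}")
```

Under this assumption a smaller P-value gives a larger pLog, which matches the statement that larger pLog values indicate a greater likelihood of homology. A P-value of exactly zero has no finite pLog and is rejected by the function.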
Example 3
Characterization of cDNA Clones Encoding Sucrose Transporter
Proteins
[0083] The BLASTX search using the EST sequences from clones
cepe7.pk0015.d10, cr1n.pk0095.c10, cr1n.pk0075.f5, rls6.pk0076.e2,
wle1n.pk0007.h8, wle1n.pk0103.c11,
wlm24.pk0015.g11 and wlmk1.pk0002.e11 revealed similarity of the
proteins encoded by the cDNAs to a sucrose transporter from Oryza
sativa (NCBI Identifier No. gi 2723471). The BLASTX search using
the EST sequence from clone rlr2.pk0043.b1 revealed similarity of
the protein encoded by the cDNA to a sucrose transporter from
Daucus carota (NCBI Identifier No. gi 2969887). The BLASTX search
using the EST sequence from clone sfl1.pk0001.g1 revealed
similarity of the protein encoded by the cDNA to a sucrose
transporter from Vicia faba (NCBI Identifier No. gi 1935019). The
BLASTX search using the EST sequences from clones sfl1.pk0043.c7,
sdp3c.pk012.c13 and vs1n.pk0002.h3 revealed similarity of the
proteins encoded by the cDNAs to a sucrose transporter from Ricinus
communis (NCBI Identifier No. gi 542020).
[0084] In the process of comparing the ESTs it was found that
soybean clones sfl1.pk0043.c7 and sdp3c.pk012.c13 had overlapping
regions of homology. Using this homology it was possible to align
the ESTs and assemble a contig encoding a unique soybean sucrose
transport protein.
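The idea of joining two overlapping ESTs into a contig can be sketched as a suffix-prefix merge. This is a minimal sketch under a strong simplifying assumption, namely that the overlap is an exact match; the text does not describe the alignment procedure actually used, and the sequences and function name below are hypothetical.

```python
def merge_by_overlap(left: str, right: str, min_overlap: int = 10):
    """Merge two sequences where a suffix of `left` equals a prefix of `right`.

    The longest exact overlap of at least `min_overlap` bases is used.
    Returns the merged sequence, or None if no such overlap exists.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


# Hypothetical EST fragments sharing an 11-base overlap (AGGCTTACCGA).
est_a = "ATGGCCTTAGCTAGGCTTACCGA"
est_b = "AGGCTTACCGATTTGGCACTGA"
print(merge_by_overlap(est_a, est_b))
```

Real EST assembly must tolerate sequencing errors and consider both strands, so an alignment-based tool would be used in practice rather than exact string matching.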
[0085] The BLAST results for each of these ESTs and the soybean
contig are shown in Table 3:
TABLE 3
BLAST Results for Clones Encoding Polypeptides Homologous to Daucus
carota, Oryza sativa, Ricinus communis and Vicia faba Sucrose
Transport Proteins

Clone                  BLAST pLog Score
cepe7.pk0015.d10       >250.00
cr1n.pk0095.c10        >250.00
cr1n.pk0075.f5         31.10
rlr2.pk0043.b1         148.00
rls6.pk0076.e2         >250.00
sfl1.pk0001.g1         >250.00
Contig composed of:    142.00
  sfl1.pk0043.c7
  sdp3c.pk012.c13
vs1n.pk0002.h3         59.30
wle1n.pk0007.h8        110.00
wle1n.pk0103.c11       >250.00
wlm24.pk0015.g11       >250.00
wlmk1.pk0002.e11       177.00
[0086] The sequence of a portion of the cDNA insert from clone
cepe7.pk0015.d10 is shown in SEQ ID NO:1; the deduced amino acid
sequence of this cDNA, which represents 100% of the protein, is
shown in SEQ ID NO:2. A calculation of the percent similarity of
the amino acid sequence set forth in SEQ ID NO:2 and the Oryza
sativa sequence (using the Clustal algorithm) revealed that the
protein encoded by SEQ ID NO:2 is 82% similar to the Oryza sativa
protein.
[0087] The sequence of a portion of the cDNA insert from clone
cr1n.pk0075.f5 is shown in SEQ ID NO:3; the deduced amino acid
sequence of this cDNA, which represents 93% of the protein, is
shown in SEQ ID NO:4. A calculation of the percent similarity of
the amino acid sequence set forth in SEQ ID NO:4 and the Oryza
sativa sequence (using the Clustal algorithm) revealed that the
protein encoded by SEQ ID NO:4 is 50% similar to the Oryza sativa
protein.
[0088] The sequence of a portion of the cDNA insert from clone
cr1n.pk0095.c10 is shown in SEQ ID NO:5; the deduced amino acid
sequence of this cDNA, which represents 20% of the protein
(C-terminal region), is shown in SEQ ID NO:6. A calculation of the
percent similarity of the amino acid sequence set forth in SEQ ID
NO:6 and the Oryza sativa sequence (using the Clustal algorithm)
revealed that the protein encoded by SEQ ID NO:6 is 86% similar to
the Oryza sativa protein.
[0089] The sequence of a portion of the cDNA insert from clone
rlr2.pk0043.b1 is shown in SEQ ID NO:7; the deduced amino acid
sequence of this cDNA, which represents 79% of the protein
(C-terminal region), is shown in SEQ ID NO:8. A calculation of the
percent similarity of the amino acid sequence set forth in SEQ ID
NO:8 and the Daucus carota sequence (using the Clustal algorithm)
revealed that the protein encoded by SEQ ID NO:8 is 60% similar to
the Daucus carota protein.
[0090] The sequence of a portion of the cDNA insert from clone
rls6.pk0076.e2 is shown in SEQ ID NO:9; the deduced amino acid
sequence of this cDNA, which represents 100% of the protein, is
shown in SEQ ID NO:10. A calculation of the percent similarity of
the amino acid sequence set forth in SEQ ID NO:10 and the Oryza
sativa sequence (using the Clustal algorithm) revealed that the
protein encoded by SEQ ID NO:10 is 55% similar to the Oryza sativa
protein. Due to a percent similarity of only 55% with a known rice
sucrose transport protein, clone rls6.pk0076.e2 appears to represent
a second rice sucrose transport protein.
[0091] The sequence of a portion of the cDNA insert from clone
sfl1.pk0001.g1 is shown in SEQ ID NO:11; the deduced amino acid
sequence of this cDNA, which represents 100% of the protein, is
shown in SEQ ID NO:12. A calculation of the percent similarity of
the amino acid sequence set forth in SEQ ID NO:12 and the Vicia
faba sequence (using the Clustal algorithm) revealed that the
protein encoded by SEQ ID NO:12 is 67% similar to the Vicia faba
protein.
[0092] The sequence of a portion of the contig composed of clones
sfl1.pk0043.c7 and sdp3c.pk012.c13 is shown in SEQ ID NO:13; the
deduced amino acid sequence of this contig, which represents 62% of
the protein (N-terminal region), is shown in SEQ ID NO:14. A
calculation of the percent similarity of the amino acid sequence
set forth in SEQ ID NO:14 and the Ricinus communis sequence (using
the Clustal algorithm) revealed that the protein encoded by SEQ ID
NO:14 is 66% similar to the Ricinus communis protein.
[0093] The sequence of a portion of the cDNA insert from clone
vs1n.pk0002.h3 is shown in SEQ ID NO:15; the deduced amino acid
sequence of this cDNA, which represents 31% of the protein
(C-terminal region), is shown in SEQ ID NO:16. A calculation of the
percent similarity of the amino acid sequence set forth in SEQ ID
NO:16 and the Ricinus communis sequence (using the Clustal
algorithm) revealed that the protein encoded by SEQ ID NO:16 is 66%
similar to the Ricinus communis protein.
[0094] The sequence of a portion of the cDNA insert from clone
wle1n.pk0007.h8 is shown in SEQ ID NO:17; the deduced amino acid
sequence of this cDNA, which represents 43% of the protein
(C-terminal region), is shown in SEQ ID NO:18. A calculation of the
percent similarity of the amino acid sequence set forth in SEQ ID
NO:18 and the Oryza sativa sequence (using the Clustal algorithm)
revealed that the protein encoded by SEQ ID NO:18 is 80% similar to
the Oryza sativa protein.
[0095] The sequence of a portion of the cDNA insert from clone
wle1n.pk0103.c11 is shown in SEQ ID NO:19; the deduced amino acid
sequence of this cDNA, which represents 100% of the protein, is
shown in SEQ ID NO:20. A calculation of the percent similarity of
the amino acid sequence set forth in SEQ ID NO:20 and the Oryza
sativa sequence (using the Clustal algorithm) revealed that the
protein encoded by SEQ ID NO:20 is 80% similar to the Oryza sativa
protein.
[0096] The sequence of a portion of the cDNA insert from clone
wlm24.pk0015.g11 is shown in SEQ ID NO:21; the deduced amino acid
sequence of this cDNA, which represents 100% of the protein, is
shown in SEQ ID NO:22. A calculation of the percent similarity of
the amino acid sequence set forth in SEQ ID NO:22 and the Oryza
sativa sequence (using the Clustal algorithm) revealed that the
protein encoded by SEQ ID NO:22 is 80% similar to the Oryza sativa
protein.
[0097] The sequence of a portion of the cDNA insert from clone
wlmk1.pk0002.e11 is shown in SEQ ID NO:23; the deduced amino acid
sequence of this cDNA, which represents 97% of the protein, is
shown in SEQ ID NO:24. A calculation of the percent similarity of
the amino acid sequence set forth in SEQ ID NO:24 and the Oryza
sativa sequence (using the Clustal algorithm) revealed that the
protein encoded by SEQ ID NO:24 is 54% similar to the Oryza sativa
protein.
[0098] The percent similarity between each of the corn, rice,
soybean, Vernonia and wheat amino acid sequences was calculated to
range from 12 to 98% using the Clustal algorithm. FIG. 1 presents
an alignment of the amino acid sequences set forth in SEQ ID NOs:2,
4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24 and the Daucus carota,
Oryza sativa, Ricinus communis and Vicia faba sucrose transport
protein amino acid sequences.
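For orientation, a pairwise percentage can be computed from two rows of an alignment as sketched below. This counts identical residues over columns where neither sequence has a gap, which is a simple stand-in and not necessarily the similarity measure reported by the Clustal algorithm; the aligned strings and function name are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned peptide fragments, for illustration only.
print(round(percent_identity("MAV-GLLK", "MAVSGIL-"), 1))
```

Because fragments covering different regions of a protein share few alignable columns, pairwise values computed across partial sequences can vary widely.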
[0099] BLAST scores and probabilities indicate that the instant
nucleic acid fragments encode entire proteins or portions of
proteins. These sequences represent the first corn, soybean and
wheat amino acid sequences and a new rice sequence encoding sucrose
transport proteins.
Example 4
Expression of Chimeric Genes in Monocot Cells
[0100] A chimeric gene comprising a cDNA encoding a sucrose
transport protein in sense orientation with respect to the maize 27
kD zein promoter that is located 5' to the cDNA fragment, and the
10 kD zein 3' end that is located 3' to the cDNA fragment, can be
constructed. The cDNA fragment of this gene may be generated by
polymerase chain reaction (PCR) of the cDNA clone using appropriate
oligonucleotide primers. Cloning sites (NcoI or SmaI) can be
incorporated into the oligonucleotides to provide proper
orientation of the DNA fragment when inserted into the digested
vector pML103 as described below. Amplification is then performed
in a standard PCR. The amplified DNA is then digested with
restriction enzymes NcoI and SmaI and fractionated on an agarose
gel. The appropriate band can be isolated from the gel and combined
with a 4.9 kb NcoI-SmaI fragment of the plasmid pML103. Plasmid
pML103 has been deposited under the terms of the Budapest Treaty at
ATCC (American Type Culture Collection, 10801 University Blvd.,
Manassas, Va. 20110-2209), and bears accession number ATCC 97366.
The DNA segment from pML103 contains a 1.05 kb SalI-NcoI promoter
fragment of the maize 27 kD zein gene and a 0.96 kb SmaI-SalI
fragment from the 3' end of the maize 10 kD zein gene in the vector
pGem9Zf(+) (Promega). Vector and insert DNA can be ligated at
15° C. overnight, essentially as described (Maniatis). The
ligated DNA may then be used to transform E. coli XL1-Blue
(Epicurian Coli XL-1 Blue™; Stratagene). Bacterial transformants
can be screened by restriction enzyme digestion of plasmid DNA and
limited nucleotide sequence analysis using the dideoxy chain
termination method (Sequenase™ DNA Sequencing Kit; U.S.
Biochemical). The resulting plasmid construct would comprise a
chimeric gene encoding, in the 5' to 3' direction, the maize 27 kD
zein promoter, a cDNA fragment encoding a sucrose transport
protein, and the 10 kD zein 3' region.
[0101] The chimeric gene described above can then be introduced
into corn cells by the following procedure. Immature corn embryos
can be dissected from developing caryopses derived from crosses of
the inbred corn lines H99 and LH132. The embryos are isolated 10 to
11 days after pollination when they are 1.0 to 1.5 mm long. The
embryos are then placed with the axis-side facing down and in
contact with agarose-solidified N6 medium (Chu et al., (1975) Sci.
Sin. Peking 18:659-668). The embryos are kept in the dark at
27° C. Friable embryogenic callus consisting of
undifferentiated masses of cells with somatic proembryoids and
embryoids borne on suspensor structures proliferates from the
scutellum of these immature embryos. The embryogenic callus
isolated from the primary explant can be cultured on N6 medium and
sub-cultured on this medium every 2 to 3 weeks.
[0102] The plasmid, p35S/Ac (obtained from Dr. Peter Eckes, Hoechst
Ag, Frankfurt, Germany) may be used in transformation experiments
in order to provide for a selectable marker. This plasmid contains
the Pat gene (see European Patent Publication 0 242 236) which
encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT
confers resistance to herbicidal glutamine synthetase inhibitors
such as phosphinothricin. The pat gene in p35S/Ac is under the
control of the 35S promoter from Cauliflower Mosaic Virus (Odell et
al. (1985) Nature 313:810-812) and the 3' region of the nopaline
synthase gene from the T-DNA of the Ti plasmid of Agrobacterium
tumefaciens.
[0103] The particle bombardment method (Klein et al., (1987) Nature
327:70-73) may be used to transfer genes to the callus culture
cells. According to this method, gold particles (1 µm in
diameter) are coated with DNA using the following technique. Ten
µg of plasmid DNAs are added to 50 µL of a suspension of gold
particles (60 mg per mL). Calcium chloride (50 µL of a 2.5 M
solution) and spermidine free base (20 µL of a 1.0 M solution)
are added to the particles. The suspension is vortexed during the
addition of these solutions. After 10 minutes, the tubes are
briefly centrifuged (5 sec at 15,000 rpm) and the supernatant
removed. The particles are resuspended in 200 µL of absolute
ethanol, centrifuged again and the supernatant removed. The ethanol
rinse is performed again and the particles resuspended in a final
volume of 30 µL of ethanol. An aliquot (5 µL) of the
DNA-coated gold particles can be placed in the center of a
Kapton™ flying disc (Bio-Rad Labs). The particles are then
accelerated into the corn tissue with a Biolistic™ PDS-1000/He
(Bio-Rad Instruments, Hercules, Calif.), using a helium pressure of
1000 psi, a gap distance of 0.5 cm and a flying distance of 1.0
cm.
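The per-shot loading implied by the coating quantities above may be estimated as in the following minimal Python sketch. It assumes, for illustration only, that no gold or DNA is lost during the ethanol rinses and that all of the DNA binds to the particles; the function name per_shot_loading and its parameters are hypothetical and are not part of the procedure.

```python
def per_shot_loading(gold_mg_per_ml, gold_volume_ul, dna_ug,
                     final_volume_ul, aliquot_ul):
    """Return (gold in mg, DNA in ug) carried by one aliquot.

    Assumption (not stated in the procedure): nothing is lost in the
    washes, so the aliquot carries a simple volume fraction of the total.
    """
    total_gold_mg = gold_mg_per_ml * gold_volume_ul / 1000.0
    fraction = aliquot_ul / final_volume_ul
    return total_gold_mg * fraction, dna_ug * fraction


# Quantities taken from the corn bombardment paragraph:
# 50 uL of 60 mg/mL gold, 10 ug DNA, 30 uL final volume, 5 uL per disc.
gold_mg, dna_ug = per_shot_loading(60, 50, 10, 30, 5)
print(f"{gold_mg:.2f} mg gold and {dna_ug:.2f} ug DNA per shot")
```

Under these assumptions each 5 µL aliquot carries about 0.5 mg of gold and about 1.7 µg of DNA; actual recoveries would be expected to be somewhat lower.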
[0104] For bombardment, the embryogenic tissue is placed on filter
paper over agarose-solidified N6 medium. The tissue is arranged as
a thin lawn and covers a circular area of about 5 cm in diameter.
The petri dish containing the tissue can be placed in the chamber
of the PDS-1000/He approximately 8 cm from the stopping screen. The
air in the chamber is then evacuated to a vacuum of 28 inches of
Hg. The macrocarrier is accelerated with a helium shock wave using
a rupture membrane that bursts when the He pressure in the shock
tube reaches 1000 psi.
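For laboratories working in SI units, the pressure settings above may be converted as in the sketch below. The conversion factors are standard; the helper names are hypothetical, and the estimate of residual chamber pressure assumes that the 28 inches of Hg is a gauge vacuum read against a standard atmosphere of 101.325 kPa, which the procedure does not state.

```python
PSI_TO_KPA = 6.894757       # kilopascals per pound-force per square inch
INHG_TO_KPA = 3.386389      # kilopascals per inch of mercury
STANDARD_ATM_KPA = 101.325  # assumed ambient pressure


def psi_to_kpa(psi):
    """Convert a pressure in psi to kPa."""
    return psi * PSI_TO_KPA


def inhg_to_kpa(inches_hg):
    """Convert a pressure in inches of mercury to kPa."""
    return inches_hg * INHG_TO_KPA


def residual_pressure_kpa(vacuum_inhg, ambient_kpa=STANDARD_ATM_KPA):
    """Absolute chamber pressure, assuming the vacuum is a gauge reading."""
    return ambient_kpa - inhg_to_kpa(vacuum_inhg)


print(f"rupture pressure: {psi_to_kpa(1000):.0f} kPa")
print(f"chamber vacuum:   {inhg_to_kpa(28):.1f} kPa below ambient")
print(f"residual (est.):  {residual_pressure_kpa(28):.1f} kPa absolute")
```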
[0105] Seven days after bombardment the tissue can be transferred
to N6 medium that contains glufosinate (2 mg per liter) and lacks
casein or proline. The tissue continues to grow slowly on this
medium. After an additional 2 weeks the tissue can be transferred
to fresh N6 medium containing glufosinate. After 6 weeks, areas of
about 1 cm in diameter of actively growing callus can be identified
on some of the plates containing the glufosinate-supplemented
medium. These calli may continue to grow when sub-cultured on the
selective medium.
[0106] Plants can be regenerated from the transgenic callus by
first transferring clusters of tissue to N6 medium supplemented
with 0.2 mg per liter of 2,4-D. After two weeks the tissue can be
transferred to regeneration medium (Fromm et al., (1990)
Bio/Technology 8:833-839).
Example 5
Expression of Chimeric Genes in Dicot Cells
[0107] A seed-specific expression cassette composed of the promoter
and transcription terminator from the gene encoding the β
subunit of the seed storage protein phaseolin from the bean
Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem.
261:9228-9238) can be used for expression of the instant sucrose
transport proteins in transformed soybean. The phaseolin cassette
includes about 500 nucleotides upstream (5') from the translation
initiation codon and about 1650 nucleotides downstream (3') from
the translation stop codon of phaseolin. Between the 5' and 3'
regions are the unique restriction endonuclease sites Nco I (which
includes the ATG translation initiation codon), Sma I, Kpn I and
Xba I. The entire cassette is flanked by Hind III sites.
[0108] The cDNA fragment of this gene may be generated by
polymerase chain reaction (PCR) of the cDNA clone using appropriate
oligonucleotide primers. Cloning sites can be incorporated into the
oligonucleotides to provide proper orientation of the DNA fragment
when inserted into the expression vector. Amplification is then
performed as described above, and the isolated fragment is inserted
into a pUC18 vector carrying the seed expression cassette.
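The incorporation of cloning sites into the oligonucleotides can be sketched as follows. This is an illustration under stated assumptions, not the primers actually used: the choice of Nco I at the start codon and Kpn I at the 3' end is one possible pairing of the cassette sites listed above, the two-base "GC" clamp and the 18-base gene-specific core are arbitrary, and the toy template simply joins the first 21 and last 21 bases of the coding region readable in the Zea mays cDNA of the sequence listing, with the interior omitted. A real design would check the full insert for internal sites.

```python
NCO_I = "CCATGG"  # Nco I recognition sequence, contains the ATG start codon
KPN_I = "GGTACC"  # Kpn I recognition sequence
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(orf, core_len=18, clamp="GC"):
    """Return (forward, reverse) primers carrying Nco I and Kpn I sites.

    Hypothetical helper: the forward primer places CC before the ATG so
    that CCATGG is formed, which requires a G right after the start codon.
    """
    orf = orf.upper()
    if not orf.startswith("ATGG"):
        raise ValueError("Nco I at the start codon requires a G after ATG")
    if NCO_I in orf or KPN_I in orf:
        raise ValueError("chosen site also occurs inside the insert")
    forward = clamp + "CC" + orf[:core_len]
    reverse = clamp + KPN_I + reverse_complement(orf[-core_len:])
    return forward, reverse


# Toy template: first and last 21 bases of a coding region, interior omitted.
toy_orf = "atggctcgtggcgacggcggg" + "gtcagcgcgggcggccactga"
forward, reverse = design_primers(toy_orf)
print("forward:", forward)
print("reverse:", reverse)
```

Because the two ends carry different sites, the amplified fragment can be ligated into the cassette in only one orientation.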
[0109] Soybean embryos may then be transformed with the expression
vector comprising a sequence encoding a sucrose transport protein.
To induce somatic embryos, cotyledons 3-5 mm in length, dissected
from surface-sterilized, immature seeds of the soybean cultivar
A2872, can be cultured in the light or dark at 26° C. on an
appropriate agar medium for 6-10 weeks. Somatic embryos which
produce secondary embryos are then excised and placed into a
suitable liquid medium. After repeated selection for clusters of
somatic embryos which multiply as early, globular-stage embryos,
the suspensions are maintained as described below.
[0110] Soybean embryogenic suspension cultures can be maintained in
35 mL of liquid media on a rotary shaker, 150 rpm, at 26° C. with
fluorescent lights on a 16:8 hour day/night schedule. Cultures are
subcultured every two weeks by inoculating approximately 35 mg of
tissue into 35 mL of liquid medium.
[0111] Soybean embryogenic suspension cultures may then be
transformed by the method of particle gun bombardment (Klein et al.
(1987) Nature (London) 327:70, U.S. Pat. No. 4,945,050). A DuPont
Biolistic™ PDS1000/HE instrument (helium retrofit) can be used
for these transformations.
[0112] A selectable marker gene which can be used to facilitate
soybean transformation is a chimeric gene composed of the 35S
promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature
313:810-812), the hygromycin phosphotransferase gene from plasmid
pJR225 (from E. coli; Gritz et al. (1983) Gene 25:179-188) and the
3' region of the nopaline synthase gene from the T-DNA of the Ti
plasmid of Agrobacterium tumefaciens. The seed expression cassette
comprising the phaseolin 5' region, the fragment encoding the
sucrose transport protein and the phaseolin 3' region can be
isolated as a restriction fragment. This fragment can then be
inserted into a unique restriction site of the vector carrying the
marker gene.
[0113] To 50 µL of a 60 mg/mL 1 µm gold particle suspension
is added (in order): 5 µL DNA (1 µg/µL), 20 µL
spermidine (0.1 M), and 50 µL CaCl2 (2.5 M). The particle
preparation is then agitated for three minutes, spun in a microfuge
for 10 seconds and the supernatant removed. The DNA-coated
particles are then washed once in 400 µL 70% ethanol and
resuspended in 40 µL of anhydrous ethanol. The DNA/particle
suspension can be sonicated three times for one second each. Five
µL of the DNA-coated gold particles are then loaded on each
macrocarrier disk.
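The concentrations reached in the precipitation mixture, and the number of disks one preparation can load, follow from the volumes given above. The sketch below assumes that the volumes are additive and that nothing is lost in the wash; the helper name mix_concentration and the dictionary keys are hypothetical.

```python
def mix_concentration(stock_conc, stock_ul, total_ul):
    """Concentration of one component after dilution into the mixture."""
    return stock_conc * stock_ul / total_ul


# Volumes (uL) added in order, as given in the soybean procedure.
added_ul = {"gold suspension": 50, "DNA": 5, "spermidine": 20, "CaCl2": 50}
total_ul = sum(added_ul.values())

spermidine_mM = mix_concentration(0.1, added_ul["spermidine"], total_ul) * 1000
cacl2_M = mix_concentration(2.5, added_ul["CaCl2"], total_ul)

gold_mg = 60 * added_ul["gold suspension"] / 1000  # 60 mg/mL stock
dna_ug = 1.0 * added_ul["DNA"]                     # 1 ug/uL stock
disks = 40 // 5                                    # 40 uL final, 5 uL per disk

print(f"total volume: {total_ul} uL")
print(f"spermidine: {spermidine_mM:.0f} mM, CaCl2: {cacl2_M:.1f} M")
print(f"{disks} disks, each with about {gold_mg / disks:.3f} mg gold "
      f"and {dna_ug / disks:.3f} ug DNA")
```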
[0114] Approximately 300-400 mg of a two-week-old suspension
culture is placed in an empty 60×15 mm petri dish and the
residual liquid removed from the tissue with a pipette. For each
transformation experiment, approximately 5-10 plates of tissue are
normally bombarded. Membrane rupture pressure is set at 1100 psi
and the chamber is evacuated to a vacuum of 28 inches mercury. The
tissue is placed approximately 3.5 inches away from the retaining
screen and bombarded three times. Following bombardment, the tissue
can be divided in half and placed back into liquid and cultured as
described above.
[0115] Five to seven days post bombardment, the liquid media may be
exchanged with fresh media, and eleven to twelve days post
bombardment with fresh media containing 50 mg/mL hygromycin. This
selective media can be refreshed weekly. Seven to eight weeks post
bombardment, green, transformed tissue may be observed growing from
untransformed, necrotic embryogenic clusters. Isolated green tissue
is removed and inoculated into individual flasks to generate new,
clonally propagated, transformed embryogenic suspension cultures.
Each new line may be treated as an independent transformation
event. These suspensions can then be subcultured and maintained as
clusters of immature embryos or regenerated into whole plants by
maturation and germination of individual somatic embryos.
Example 6
Expression of Chimeric Genes in Microbial Cells
[0116] The cDNAs encoding the instant sucrose transport proteins
can be inserted into the T7 E. coli expression vector pBT430. This
vector is a derivative of pET-3a (Rosenberg et al. (1987) Gene
56:125-135) which employs the bacteriophage T7 RNA polymerase/T7
promoter system. Plasmid pBT430 was constructed by first destroying
the EcoR I and Hind III sites in pET-3a at their original
positions. An oligonucleotide adaptor containing EcoR I and Hind
III sites was inserted at the BamH I site of pET-3a. This created
pET-3aM with additional unique cloning sites for insertion of genes
into the expression vector. Then, the Nde I site at the position of
translation initiation was converted to an Nco I site using
oligonucleotide-directed mutagenesis. The DNA sequence of pET-3aM
in this region, 5'-CATATGG, was converted to 5'-CCCATGG in
pBT430.
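The effect of the mutagenesis on the two restriction sites can be checked with a few lines of Python. The two seven-base sequences are those given above, and CATATG and CCATGG are the standard recognition sequences of Nde I and Nco I; the variable and function names are illustrative only.

```python
NDE_I = "CATATG"  # Nde I recognition sequence
NCO_I = "CCATGG"  # Nco I recognition sequence


def differences(before, after):
    """List (index, old base, new base) for each position that changed."""
    if len(before) != len(after):
        raise ValueError("sequences must have the same length")
    return [(i, b, a) for i, (b, a) in enumerate(zip(before, after)) if b != a]


pet3am = "CATATGG"  # region in pET-3aM
pbt430 = "CCCATGG"  # same region in pBT430

print("changed positions:", differences(pet3am, pbt430))
print("pET-3aM has Nde I:", NDE_I in pet3am, "| Nco I:", NCO_I in pet3am)
print("pBT430  has Nde I:", NDE_I in pbt430, "| Nco I:", NCO_I in pbt430)
print("ATG offset:", pet3am.find("ATG"), "->", pbt430.find("ATG"))
```

Two base changes suffice to replace the Nde I site with an Nco I site while leaving the ATG initiation codon at the same position.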
[0117] Plasmid DNA containing a cDNA may be appropriately digested
to release a nucleic acid fragment encoding the protein. This
fragment may then be purified on a 1% NuSieve GTG™ low melting
agarose gel (FMC). Buffer and agarose contain 10 µg/mL ethidium
bromide for visualization of the DNA fragment. The fragment can
then be purified from the agarose gel by digestion with GELase™
(Epicentre Technologies) according to the manufacturer's
instructions, ethanol precipitated, dried and resuspended in 20
µL of water. Appropriate oligonucleotide adapters may be ligated
to the fragment using T4 DNA ligase (New England Biolabs, Beverly,
Mass.). The fragment containing the ligated adapters can be
purified from the excess adapters using low melting agarose as
described above. The vector pBT430 is digested, dephosphorylated
with alkaline phosphatase (NEB) and deproteinized with
phenol/chloroform as described above. The prepared vector pBT430
and fragment can then be ligated at 16° C. for 15 hours
followed by transformation into DH5 electrocompetent cells (GIBCO
BRL). Transformants can be selected on agar plates containing LB
media and 100 µg/mL ampicillin. Transformants containing the
gene encoding the sucrose transport protein are then screened for
the correct orientation with respect to the T7 promoter by
restriction enzyme analysis.
[0118] For high level expression, a plasmid clone with the cDNA
insert in the correct orientation relative to the T7 promoter can
be transformed into E. coli strain BL21(DE3) (Studier et al. (1986)
J. Mol. Biol. 189:113-130). Cultures are grown in LB medium
containing ampicillin (100 mg/L) at 25° C. At an optical
density at 600 nm of approximately 1, IPTG
(isopropylthio-β-galactoside, the inducer) can be added to a
final concentration of 0.4 mM and incubation can be continued for 3
h at 25° C. Cells are then harvested by centrifugation and
re-suspended in 50 µL of 50 mM Tris-HCl at pH 8.0 containing 0.1
mM DTT and 0.2 mM phenylmethylsulfonyl fluoride. A small amount of
1 mm glass beads can be added and the mixture sonicated 3 times for
about 5 seconds each time with a microprobe sonicator. The mixture
is centrifuged and the protein concentration of the supernatant
determined. One µg of protein from the soluble fraction of the
culture can be separated by SDS-polyacrylamide gel electrophoresis.
Gels can be observed for protein bands migrating at the expected
molecular weight.
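The volume of inducer needed to reach the 0.4 mM final concentration may be worked out as in the following sketch. The 100 mM stock concentration and the 50 mL culture volume are assumptions chosen for illustration; neither is given in the procedure, and the function name is hypothetical.

```python
def stock_volume_ml(final_mM, culture_ml, stock_mM):
    """Volume of stock to add so that the mixture reaches final_mM.

    Solves stock_mM * v = final_mM * (culture_ml + v) for v, so the
    small volume contributed by the stock itself is accounted for.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ml / (stock_mM - final_mM)


# Assumed values: 100 mM IPTG stock, 50 mL culture; 0.4 mM is from the text.
volume_ml = stock_volume_ml(final_mM=0.4, culture_ml=50.0, stock_mM=100.0)
print(f"add about {volume_ml * 1000:.0f} uL of stock")
```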
Sequence CWU 1
1
Number of SEQ ID NOs: 28
SEQ ID NO: 1; length: 2088; type: DNA; organism: Zea mays
gcacgagaca ctcctcacct ctcctcgctc cacgcacgcg
ctctctcacc cgctggctat 60tagtcgtcgt cccttggatt tcgacactct ctctagcggg
cgcctgttcc gccgccgtcc 120atcgatccta gctagctagc tagctagggc
gcgaccgtcg tctcggtggt tgttgacagg 180tcccgtacgt gtgtgctcgc
catggctcgt ggcgacggcg ggcagctggc ggagctgtcc 240gcgggggtcc
gcggcgcggc cgcggtggtg gaccacgtgg ccccgatcag cctcgggagg
300ctcatcctcg ccggcatggt cgccggcggc gtgcagtacg gctgggcgct
gcagctctcc 360ctcctcacgc cctacgtgca gactctgggg ctttcacatg
cgctcacttc attcatgtgg 420ctctgcggcc ctattgccgg cttagtggtc
caaccgctgg ttggcctgta cagcgacagg 480tgtacatcga gatgggggag
acggaggccg tttatcctga cagggtgcat gctcatctgc 540gttgccgtca
ttgttgtcgg attctcgtca gacatcggag ctgctctagg ggacacgaag
600gaacactgca gcctctacca cggtcctcgt tggcacgctg cgatcgtgta
cgttctgggg 660ttttggctcc ttgacttctc caacaacact gtgcagggtc
cagcacgtgc tatgatggct 720gatctatgtg accatcatgg gccaagtgcg
gctaactcca tcttctgttc ttggatggcg 780ctgggaaaca tcctaggcta
ctcctctggc tccacgaaca attggcacaa gtggtttccc 840ttccttaaaa
cgagcgcctg ctgtgaggcc tgtgcgaacc tgaaaggtgc atttctggtg
900gccgtggtgt tcctagtcct gtgcctgacg gtaaccctga tcttcgccaa
ggaggtgccg 960tacagagcga acgagaacct cccgacgacg aaggccggcg
gcgaggtcga gactgagcct 1020accgggccac ttgccgtgct caagggcttc
aaggacctgc ctcccgggat gccgtccgtg 1080ctcctcgtga ctgccatcac
ctggctttcg tggttcccgt tcatcctcta cgacaccgac 1140tggatgggcc
gggagatcta ccacggcgac cccaagggga gcaacgccca gatctcggcg
1200ttcaacgaag gtgtccgagt cggcgcgttc gggctgctac tcaactcggt
tattctaggg 1260ttcagctcgt tcctgatcga gcccatgtgc cggaaggtcg
ggccgagggt ggtgtgggtg 1320acgagcaact tcatggtctg cgtcgccatg
gcggccaccg cgctgatcag cttctggtcg 1380ctcagggact accacgggta
cgtgcaggac gccatcaccg cgaacgccag catcaaggcc 1440gtctgcctcg
tcctcttcgc cttcctgggc gtccctctcg ccatcctgta cagcgtcccg
1500ttcgcggtga cggcgcagct ggcggccacc cggggcggcg ggcaggggct
gtgcaccggc 1560gtcctcaaca tctccatcgt catccctcag gtgatcatcg
cgctgggcgc cggcccgtgg 1620gacgcgctgt tcgggaaggg caacatcccg
gcgttcggcg tcgcgtcggc cttcgccctc 1680gtcggcggcg tcgtgggcgt
gttcctgctg cccaagatct ccaagcgcca gttccgggcc 1740gtcagcgcgg
gcggccactg atcgaacccg gccggggccg gccgccggca cgcagcccgg
1800caagagctgt atgttgttga gagttgaaca gaaaccatgc atgtgtgctt
ctgtagttct 1860gttgtttgtg gtcgatcgat gggcgttgcg tggcagcgtg
ggcaagcgag gcgaggtgcg 1920cggatccaaa aaaagggcca ttcgatcaat
caatgtgtag tagagtacaa ctagacgatg 1980atgttcacat catttgtctt
taatacatac cggtttctat tgtctttaaa aaaaaaaaaa 2040aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa 2088
SEQ ID NO: 2; length: 519; type: PRT; organism: Zea mays
Met
Ala Arg Gly Asp Gly Gly Gln Leu Ala Glu Leu Ser Ala Gly Val1 5 10
15Arg Gly Ala Ala Ala Val Val Asp His Val Ala Pro Ile Ser Leu Gly
20 25 30Arg Leu Ile Leu Ala Gly Met Val Ala Gly Gly Val Gln Tyr Gly
Trp 35 40 45Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr Val Gln Thr Leu
Gly Leu 50 55 60Ser His Ala Leu Thr Ser Phe Met Trp Leu Cys Gly Pro
Ile Ala Gly65 70 75 80Leu Val Val Gln Pro Leu Val Gly Leu Tyr Ser
Asp Arg Cys Thr Ser 85 90 95Arg Trp Gly Arg Arg Arg Pro Phe Ile Leu
Thr Gly Cys Met Leu Ile 100 105 110Cys Val Ala Val Ile Val Val Gly
Phe Ser Ser Asp Ile Gly Ala Ala 115 120 125Leu Gly Asp Thr Lys Glu
His Cys Ser Leu Tyr His Gly Pro Arg Trp 130 135 140His Ala Ala Ile
Val Tyr Val Leu Gly Phe Trp Leu Leu Asp Phe Ser145 150 155 160Asn
Asn Thr Val Gln Gly Pro Ala Arg Ala Met Met Ala Asp Leu Cys 165 170
175Asp His His Gly Pro Ser Ala Ala Asn Ser Ile Phe Cys Ser Trp Met
180 185 190Ala Leu Gly Asn Ile Leu Gly Tyr Ser Ser Gly Ser Thr Asn
Asn Trp 195 200 205His Lys Trp Phe Pro Phe Leu Lys Thr Ser Ala Cys
Cys Glu Ala Cys 210 215 220Ala Asn Leu Lys Gly Ala Phe Leu Val Ala
Val Val Phe Leu Val Leu225 230 235 240Cys Leu Thr Val Thr Leu Ile
Phe Ala Lys Glu Val Pro Tyr Arg Ala 245 250 255Asn Glu Asn Leu Pro
Thr Thr Lys Ala Gly Gly Glu Val Glu Thr Glu 260 265 270Pro Thr Gly
Pro Leu Ala Val Leu Lys Gly Phe Lys Asp Leu Pro Pro 275 280 285Gly
Met Pro Ser Val Leu Leu Val Thr Ala Ile Thr Trp Leu Ser Trp 290 295
300Phe Pro Phe Ile Leu Tyr Asp Thr Asp Trp Met Gly Arg Glu Ile
Tyr305 310 315 320His Gly Asp Pro Lys Gly Ser Asn Ala Gln Ile Ser
Ala Phe Asn Glu 325 330 335Gly Val Arg Val Gly Ala Phe Gly Leu Leu
Leu Asn Ser Val Ile Leu 340 345 350Gly Phe Ser Ser Phe Leu Ile Glu
Pro Met Cys Arg Lys Val Gly Pro 355 360 365Arg Val Val Trp Val Thr
Ser Asn Phe Met Val Cys Val Ala Met Ala 370 375 380Ala Thr Ala Leu
Ile Ser Phe Trp Ser Leu Arg Asp Tyr His Gly Tyr385 390 395 400Val
Gln Asp Ala Ile Thr Ala Asn Ala Ser Ile Lys Ala Val Cys Leu 405 410
415Val Leu Phe Ala Phe Leu Gly Val Pro Leu Ala Ile Leu Tyr Ser Val
420 425 430Pro Phe Ala Val Thr Ala Gln Leu Ala Ala Thr Arg Gly Gly
Gly Gln 435 440 445Gly Leu Cys Thr Gly Val Leu Asn Ile Ser Ile Val
Ile Pro Gln Val 450 455 460Ile Ile Ala Leu Gly Ala Gly Pro Trp Asp
Ala Leu Phe Gly Lys Gly465 470 475 480Asn Ile Pro Ala Phe Gly Val
Ala Ser Ala Phe Ala Leu Val Gly Gly 485 490 495Val Val Gly Val Phe
Leu Leu Pro Lys Ile Ser Lys Arg Gln Phe Arg 500 505 510Ala Val Ser
Ala Gly Gly His 515
SEQ ID NO: 3; length: 825; type: DNA; organism: Zea mays
gcacgagtta agttggatct
cttctgatct gtactcaagc aaacttcatc acatcatcgg 60ggcaaataaa acagtcaaga
tcacggcatt ggttgttttc tctcttctcg gattgccact 120ctccatcact
tacagcgttc cgttttctgt gactgctgag ctgactgccg gtacaggagg
180tggacaaggt ttggccacag gagtcctaaa tcttgctatc gtggttcccc
agatagtagt 240gtcgcttgga gcaggtccat gggacgctct gtatggagga
gggaataccc cggcgttcgt 300cttggcttcg gtcttctccc tggcagcagg
tgtgctcgca gttctcaagc tgccaaagct 360gtccaactcg taccaatctg
ccgggttcca tggatttggc tgatgctcat gcccaaaaca 420cccccgtctg
ccatgtaaaa catcacacca acacttggcc ccattttgcc attcgtttac
480agagaaatga ttcttttttc ctcgtacaac tacagaataa tgacagtgaa
agtaggagtt 540taggtgagag agagagagag gctaggtagg ttgatgtgaa
ggtgtaaaag ctgtatcctc 600ctttttttgt ttttgttttt gtttttgaca
gtgtatgtaa gagctgtcca caagaaaatt 660tacaagtggt gtaacctgcc
ctcgtttgta cattgtacta ctactacatg acaatcatat 720gtcctttgtc
tttatccaag gttgaagacg taaactgagg ccatctattt atcttgggcc
atgaaaaaaa aaaaaaaaaa aaaaaaaact cgaaactagt tctct 825
SEQ ID NO: 4; length: 133; type: PRT; organism: Zea mays
His Glu Leu Ser Trp Ile Ser Ser Asp Leu Tyr Ser Ser Lys Leu
His1 5 10 15His Ile Ile Gly Ala Asn Lys Thr Val Lys Ile Thr Ala Leu
Val Val 20 25 30Phe Ser Leu Leu Gly Leu Pro Leu Ser Ile Thr Tyr Ser
Val Pro Phe 35 40 45Ser Val Thr Ala Glu Leu Thr Ala Gly Thr Gly Gly
Gly Gln Gly Leu 50 55 60Ala Thr Gly Val Leu Asn Leu Ala Ile Val Val
Pro Gln Ile Val Val65 70 75 80Ser Leu Gly Ala Gly Pro Trp Asp Ala
Leu Tyr Gly Gly Gly Asn Thr 85 90 95Pro Ala Phe Val Leu Ala Ser Val
Phe Ser Leu Ala Ala Gly Val Leu 100 105 110Ala Val Leu Lys Leu Pro
Lys Leu Ser Asn Ser Tyr Gln Ser Ala Gly 115 120 125Phe His Gly Phe
Gly 130
SEQ ID NO: 5; length: 1977; type: DNA; organism: Zea mays
gcggcggacc acgtggcgcc gatcagcctc
ggcaggctca tcctcgccgg catggtcgcc 60ggcggcgtgc agtacggctg ggcgctgcag
ctctccctcc tcacgcccta cgtgcagact 120ctggggctct cacatgccct
cacttcattc atgtggctat gcggtcctat tgctggctta 180gtggtccaac
cgctggttgg cctgtacagc gataggtgca cagcaagatg gggaagacgc
240aggccattta tcctgatagg atgcatgctc atctgccttg ccgtcattgt
tgttggcttc 300tcgtccgaca tcggagctgc tctaggggac acaaaggaac
actgcagcct ctaccacggc 360cctcgttggc atgctgcgat cgtgtacgtt
ctggggtttt ggctccttga cttctccaac 420aatactgtgc aaggtccagc
gcgtgctatg atggctgatc tgtgcggtca tcatgggcct 480agtgcagcca
actcaatctt ctgttcttgg atggcgctgg gaaacatcct aggctattcc
540tctggctcca caaacaactg gcacaagtgg tttccgttcc ttatgacaaa
cgcgtgctgt 600gaagcctgcg caaacctgaa aggcgcgttt ctggtggctg
tggtgttcct aatcatgtgc 660ttgactataa ccctgttctt cgccaaggaa
gtgccctaca gaggaaacca gaacctcccc 720acaaaggcaa acggcgaggt
cgagactgaa ccttccggcc cactcgctgt gctcaagggc 780ttcaagaact
tgcccacggg gatgccgtcc gtgctcctcg taactggact cacctggctc
840tcttggttcc cgttcatcct ctacgacacc gactggatgg gccgtgagat
ctaccacggc 900gaccccaagg gtagcaacgc tcagatctcg gcgttcgacg
aaggcgtcag agttggctcg 960ttcgggctgc tgctcaactc gatcgttcta
ggattcagct cgttcctgat cgagcccatg 1020tgccggaagg tcgggccgag
ggtggtgtgg gtgacgagca acttcatggt ctgcgtcgcc 1080atggcggcca
ccgcgctgat cagcttctgg tcgctcaagg actaccacgg atacgtgcag
1140gacgccatca ccgccagcac gagcatcaag gccgtctgcc tcgtcctctt
cgcgttcctg 1200ggtgtccctc tcgccatcct gtacagcgtc ccgttcgcgg
tgacggcgca gctggcggcc 1260acgaagggcg gcgggcaggg gctgtgcacc
ggcgtgctca acatctccat cgtcatccct 1320caggtgatca tcgcgctggg
cgcgggcccg tgggacgcgc tgttcggcaa gggcaacatc 1380ccggcgttcg
gcgtggcgtc ggggttcgcc ctcatcggcg gcgtcgtggg cgtgttcctg
1440ctgcccaaga tctccaagcg ccagttccgc gccgtcagcg cgggcggcca
ctgatcgcgg 1500ccgccgcgcc ggagcacggc acggcggcac agcccagccg
tgctagagct gtatgttttg 1560aaagttgaaa cagaataaga agcgggcgaa
acgagaaaac catgcatgtc atgtgtgtgc 1620ttttgttgtg tgtggggtgg
ggcaagcgag gcgaggtgtg tggaggtgaa gtgaaggtga 1680gcatatccag
caccagctgg taccaaggtc gggtctctgt gctagtgcta ttagctagtg
1740taaggagcga gtaggtcagt taaggctggt gcgtcgtgag ggctgtcttg
tgtgtagcta 1800cagcagacgg ttcatcagaa ggattattcg tgcagtatat
acagtacaac tagacaatga 1860tgttgatgat tggtctagag ctagaggcct
atagccctat actactgtgt attgtccgcc 1920gttttagttt tttggtccca
tcccatcaat gcaaccgcct tgttttaaaa aaaaaaa 19776497PRTZea mays 6Ala
Ala Asp His Val Ala Pro Ile Ser Leu Gly Arg Leu Ile Leu Ala1 5 10
15Gly Met Val Ala Gly Gly Val Gln Tyr Gly Trp Ala Leu Gln Leu Ser
20 25 30Leu Leu Thr Pro Tyr Val Gln Thr Leu Gly Leu Ser His Ala Leu
Thr 35 40 45Ser Phe Met Trp Leu Cys Gly Pro Ile Ala Gly Leu Val Val
Gln Pro 50 55 60Leu Val Gly Leu Tyr Ser Asp Arg Cys Thr Ala Arg Trp
Gly Arg Arg65 70 75 80Arg Pro Phe Ile Leu Ile Gly Cys Met Leu Ile
Cys Leu Ala Val Ile 85 90 95Val Val Gly Phe Ser Ser Asp Ile Gly Ala
Ala Leu Gly Asp Thr Lys 100 105 110Glu His Cys Ser Leu Tyr His Gly
Pro Arg Trp His Ala Ala Ile Val 115 120 125Tyr Val Leu Gly Phe Trp
Leu Leu Asp Phe Ser Asn Asn Thr Val Gln 130 135 140Gly Pro Ala Arg
Ala Met Met Ala Asp Leu Cys Gly His His Gly Pro145 150 155 160Ser
Ala Ala Asn Ser Ile Phe Cys Ser Trp Met Ala Leu Gly Asn Ile 165 170
175Leu Gly Tyr Ser Ser Gly Ser Thr Asn Asn Trp His Lys Trp Phe Pro
180 185 190Phe Leu Met Thr Asn Ala Cys Cys Glu Ala Cys Ala Asn Leu
Lys Gly 195 200 205Ala Phe Leu Val Ala Val Val Phe Leu Ile Met Cys
Leu Thr Ile Thr 210 215 220Leu Phe Phe Ala Lys Glu Val Pro Tyr Arg
Gly Asn Gln Asn Leu Pro225 230 235 240Thr Lys Ala Asn Gly Glu Val
Glu Thr Glu Pro Ser Gly Pro Leu Ala 245 250 255Val Leu Lys Gly Phe
Lys Asn Leu Pro Thr Gly Met Pro Ser Val Leu 260 265 270Leu Val Thr
Gly Leu Thr Trp Leu Ser Trp Phe Pro Phe Ile Leu Tyr 275 280 285Asp
Thr Asp Trp Met Gly Arg Glu Ile Tyr His Gly Asp Pro Lys Gly 290 295
300Ser Asn Ala Gln Ile Ser Ala Phe Asp Glu Gly Val Arg Val Gly
Ser305 310 315 320Phe Gly Leu Leu Leu Asn Ser Ile Val Leu Gly Phe
Ser Ser Phe Leu 325 330 335Ile Glu Pro Met Cys Arg Lys Val Gly Pro
Arg Val Val Trp Val Thr 340 345 350Ser Asn Phe Met Val Cys Val Ala
Met Ala Ala Thr Ala Leu Ile Ser 355 360 365Phe Trp Ser Leu Lys Asp
Tyr His Gly Tyr Val Gln Asp Ala Ile Thr 370 375 380Ala Ser Thr Ser
Ile Lys Ala Val Cys Leu Val Leu Phe Ala Phe Leu385 390 395 400Gly
Val Pro Leu Ala Ile Leu Tyr Ser Val Pro Phe Ala Val Thr Ala 405 410
415Gln Leu Ala Ala Thr Lys Gly Gly Gly Gln Gly Leu Cys Thr Gly Val
420 425 430Leu Asn Ile Ser Ile Val Ile Pro Gln Val Ile Ile Ala Leu
Gly Ala 435 440 445Gly Pro Trp Asp Ala Leu Phe Gly Lys Gly Asn Ile
Pro Ala Phe Gly 450 455 460Val Ala Ser Gly Phe Ala Leu Ile Gly Gly
Val Val Gly Val Phe Leu465 470 475 480Leu Pro Lys Ile Ser Lys Arg
Gln Phe Arg Ala Val Ser Ala Gly Gly 485 490 495His
SEQ ID NO: 7; length: 1653; type: DNA; organism: Oryza sativa
gcacgagatc actgcttcca tcgctgccgc agttctcacc gtcggattct
ccgccgacct 60cggccgaatc ttcggcgatt ccatcacccc gggctccacc cgcctcggcg
ccatcatcgt 120ctacctcgtc ggcttctggc tcctcgacgt cggcaacaac
gctacacagg gaccctgcag 180ggccttcctc gccgacctca ccgagaatga
cccaaggagg actcggatag ctaatgctta 240cttctcattg ttcatggccc
tgggaaacat acttggatat gccactggag catacagtgg 300ctggtacaag
atattcccgt tcaccgttac tccatcatgt agcatcagct gtgccaactt
360caagtctgcc tttctacttg atattatcat tttggtggtc actacatgca
tcactgtagc 420atcagtgcaa gagcctcaat cctttggaag tgatgaagca
gatcacccta gcacagaaca 480ggaagctttc ctctgggaac tttttggatc
attccggtac tttacattac cggtttggat 540ggttttgatt gttactgccc
tcacatggat tggatggttt ccatttatcc tctttgatac 600cgattggatg
ggtcgagaga tctatcgtgg aagtccagat gatccaagta taactcagag
660ctatcatgat ggtgtgagaa tgggttcttt tggtctgatg ctgaactcgg
tccttcttgg 720attcacttct attgtactag agaagttatg tcggaagtgg
ggagctggac tggtgtgggg 780tgtctccaat atcctaatgg cattgtgctt
tgtggcaatg cttgtaataa catatgtggc 840aaagaatatg gattatccac
ctagtggagt accaccaacc ggcattgtca ttgcttccct 900ggtagttttt
acaattttag gagcgcccct ggcgatcacg tacagtatac catatgcaat
960ggctgctagt cgggttgaaa atctgggact tggccaaggt ctagcaatgg
gcattcttaa 1020tttggctatt gtcataccac aggttattgt gtcactgggt
agcgggccct gggaccaact 1080gtttggtggt ggcaatgcac cagcctttgc
agtggctgct gctgcatctt ttatcggtgg 1140gctggtggct attctgggcc
ttccacgagc ccgcattgca tcaaggagga gaggtcaccg 1200ataagaatat
tgctacatat aaattgtcgg ccattctttg caattcgact cataagaggc
1260actcggaacg ctatgcagtg catgggggaa ttgtatatta tctccgaatc
aagaagggga 1320taatgcttgc tttctccatg agctattttt gcctttttca
tgccggatca tcatatgctg 1380tcgtacattg gatgatctta tgctgttgta
cattggatgt tggtcatttg tagagatact 1440agtgaataaa agttgcagga
gttggttcac tcgagaaaat tctggtcagt atgtcgtcca 1500tctgctgcac
gacagcagtt aggagccgaa tagcatgtcc atgggttttc atcaaatgtt
1560gtatcatcat ttgttttttg atacgttcag acggcttcag tgctgtgtga
atatatatgt 1620atggaatata tcgagaaaaa aaaaaaaaaa aaa
1653
SEQ ID NO: 8; length: 400; type: PRT; organism: Oryza sativa
His Glu Ile Thr Ala Ser Ile Ala Ala Ala
Val Leu Thr Val Gly Phe1 5 10 15Ser Ala Asp Leu Gly Arg Ile Phe Gly
Asp Ser Ile Thr Pro Gly Ser 20 25 30Thr Arg Leu Gly Ala Ile Ile Val
Tyr Leu Val Gly Phe Trp Leu Leu 35 40 45Asp Val Gly Asn Asn Ala Thr
Gln Gly Pro Cys Arg Ala Phe Leu Ala 50 55 60Asp Leu Thr Glu Asn Asp
Pro Arg Arg Thr Arg Ile Ala Asn Ala Tyr65 70 75 80Phe Ser Leu Phe
Met Ala Leu Gly Asn Ile Leu Gly Tyr Ala Thr Gly 85 90 95Ala Tyr Ser
Gly Trp Tyr Lys Ile Phe Pro Phe Thr Val Thr Pro Ser 100 105 110Cys
Ser Ile Ser Cys Ala Asn Phe Lys Ser Ala Phe Leu Leu Asp Ile 115 120
125Ile Ile Leu Val Val Thr Thr Cys Ile Thr Val Ala Ser Val Gln Glu
130 135 140Pro Gln Ser Phe Gly Ser Asp Glu Ala Asp His Pro Ser Thr
Glu Gln145 150 155 160Glu Ala Phe Leu Trp Glu Leu Phe Gly Ser Phe
Arg Tyr Phe Thr Leu 165 170 175Pro Val Trp Met Val Leu Ile Val Thr
Ala Leu Thr Trp Ile Gly Trp 180 185 190Phe Pro Phe Ile Leu Phe Asp
Thr Asp Trp Met Gly Arg Glu Ile Tyr 195 200 205Arg Gly Ser Pro Asp
Asp Pro Ser Ile Thr Gln Ser Tyr His Asp Gly 210 215 220Val Arg
Met Gly Ser Phe Gly Leu Met Leu Asn Ser Val Leu Leu Gly225 230 235
240Phe Thr Ser Ile Val Leu Glu Lys Leu Cys Arg Lys Trp Gly Ala Gly
245 250 255Leu Val Trp Gly Val Ser Asn Ile Leu Met Ala Leu Cys Phe
Val Ala 260 265 270Met Leu Val Ile Thr Tyr Val Ala Lys Asn Met Asp
Tyr Pro Pro Ser 275 280 285Gly Val Pro Pro Thr Gly Ile Val Ile Ala
Ser Leu Val Val Phe Thr 290 295 300Ile Leu Gly Ala Pro Leu Ala Ile
Thr Tyr Ser Ile Pro Tyr Ala Met305 310 315 320Ala Ala Ser Arg Val
Glu Asn Leu Gly Leu Gly Gln Gly Leu Ala Met 325 330 335Gly Ile Leu
Asn Leu Ala Ile Val Ile Pro Gln Val Ile Val Ser Leu 340 345 350Gly
Ser Gly Pro Trp Asp Gln Leu Phe Gly Gly Gly Asn Ala Pro Ala 355 360
365Phe Ala Val Ala Ala Ala Ala Ser Phe Ile Gly Gly Leu Val Ala Ile
370 375 380Leu Gly Leu Pro Arg Ala Arg Ile Ala Ser Arg Arg Arg Gly
His Arg385 390 395 400
SEQ ID NO: 9; length: 2375; type: DNA; organism: Oryza sativa
gcacgaggtt ctaacccgcg
ccttcgccga gggaggccga ccaacgcatc aatcaaacac 60acaagcacac cacgcggacg
cagcagcagg ggaggagaca atttcctatt cttcctcgcc 120ccgcgtcgcc
tcgcctgagt ctgactctcc aaacgccgac cagtgacgcc gcgagccttg
180ccccttgccc gcgcagatct caccaaaccc taccagatct gcgccccgcc
atggactccg 240ccgccggcgg tggcggcctc acggccatcc gcctgcccta
ccgccacctc cgcgacgccg 300agatggagct cgtcagcctc aacggcggca
ccccccgcgg aggctccccc aaggaccccg 360acgccacgca ccagcagggg
ccccccgccg cccgtaccac caccaccagg aagctcgtcc 420tcgcctgcat
ggtcgccgcc ggcgtgcagt tcggctgggc gcttcagctc tcgctcctca
480cgccctacat ccagacccta ggaatagacc atgccatggc atcattcatt
tggctttgtg 540gacctattac tggttttgtg gttcaaccat gtgttggtgt
ctggagtgac aaatgccgtt 600caaagtatgg aagaaggaga ccgttcattt
tggctggatg cttgatgata tgctttgctg 660taactttaat cggattttct
gcagaccttg gttacatttt aggagatacc actgagcact 720gcagtacata
taaaggttca agatttcgag cagctattat tttcgttctt gggttctgga
780tgttggatct cgcaaacaat acagttcaag gtcctgctcg tgccctttta
gctgaccttt 840caggtcctga tcagtgtaat tctgcaaatg caattttttg
cacatggatg gctgttggaa 900acgttcttgg tttttcatct ggtgctagtg
ggaattggca caagtggttt ccttttctaa 960tgacaagagc atgctgtgaa
gcttgtagta atttgaaagc cgcttttctg gttgcagttg 1020tattcctttt
gttttgtatg tctgttaccc tgtactttgc tgaagagatc ccgctggaac
1080caacagatgc acaacgatta tctgattctg cgcctctcct gaatggttct
agagatgata 1140acaatgcatc aaatgaacct cgtaatggag cacttcctaa
tggtcataca gatggaagca 1200atgtcccagc taactccaac gctgaggact
ccaattcaaa cagagagaat gtcgaagttt 1260tcaatgatgg accaggagca
gttttggtga atattttgac tagcatgagg catctacctc 1320ctggaatgta
ctctgttctt ctagttatgg ctctaacatg gttgtcgtgg tttccctttt
1380tcctttttga tactgactgg atgggacgtg aggtttacca tggggaccca
aatggcaact 1440tgagtgaaag gaaagcttat gacaacggtg tccgagaagg
tgcatttggt ttgctattga 1500attcagttgt ccttggaatt gggtccttcc
ttgttgatcc actatgccga ctgatgggtg 1560ctagactggt ttgggcaatc
agcaacttca cagtgtttat ctgcatgctg gctacagcaa 1620tattaagttg
gatctctttt gatttgtact caagtaaact tcaccacatc attggagcaa
1680ataaaacagt gaagaattca gccttgattg ttttctccct acttggactg
ccactctcga 1740tcacatatag cgttcctttt tctgtgactg ctgagctgac
tgctggaaca ggaggtggac 1800aaggtctggc aacaggagtc ctgaaccttg
caatcgttgt tccgcagata gtagtgtcac 1860taggagcagg tccatgggat
gctctctttg ggggagggaa cgtccctgct ttcgccttgg 1920cttccgtttt
ctcactagga gctggtgtcc tcgcggtcct taagctaccc aagctgccaa
1980actcttacag atctgctggg ttccatggat ttggctgagc agaacaccag
ccgcatggtg 2040tgtaacattg agaaatgcaa ctccattttg ccattcgttt
acagtgaaat gattcttttt 2100acctactact acaacagaat aagctgaaaa
gatagagatt aggatagaga gctaggtaac 2160tagtccagtt aggttgatgt
gcatacaagg caattggaag gtgtaagagc tgtatctact 2220tttttgacag
aaaaatgtaa gctctgcccg aatgacatgg cggatagatt ttacaatgga
2280tgtaatcatg tactatatat aacacgtttt ggtcacagct tgccaagttt
catgtatagt 2340actgctacta aaaaaaaaaa aaaaaaaaaa aaaaa
2375
SEQ ID NO: 10; length: 667; type: PRT; organism: Oryza sativa
Pro Ala Pro Ser Pro Arg Glu Ala Asp Gln
Arg Ile Asn Gln Thr His1 5 10 15Lys His Thr Thr Arg Thr Gln Gln Gln
Gly Arg Arg Gln Phe Pro Ile 20 25 30Leu Pro Arg Pro Ala Ser Pro Arg
Leu Ser Leu Thr Leu Gln Thr Pro 35 40 45Thr Ser Asp Ala Ala Ser Leu
Ala Pro Cys Pro Arg Arg Ser His Gln 50 55 60Thr Leu Pro Asp Leu Arg
Pro Ala Met Asp Ser Ala Ala Gly Gly Gly65 70 75 80Gly Leu Thr Ala
Ile Arg Leu Pro Tyr Arg His Leu Arg Asp Ala Glu 85 90 95Met Glu Leu
Val Ser Leu Asn Gly Gly Thr Pro Arg Gly Gly Ser Pro 100 105 110Lys
Asp Pro Asp Ala Thr His Gln Gln Gly Pro Pro Ala Ala Arg Thr 115 120
125Thr Thr Thr Arg Lys Leu Val Leu Ala Cys Met Val Ala Ala Gly Val
130 135 140Gln Phe Gly Trp Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr
Ile Gln145 150 155 160Thr Leu Gly Ile Asp His Ala Met Ala Ser Phe
Ile Trp Leu Cys Gly 165 170 175Pro Ile Thr Gly Phe Val Val Gln Pro
Cys Val Gly Val Trp Ser Asp 180 185 190Lys Cys Arg Ser Lys Tyr Gly
Arg Arg Arg Pro Phe Ile Leu Ala Gly 195 200 205Cys Leu Met Ile Cys
Phe Ala Val Thr Leu Ile Gly Phe Ser Ala Asp 210 215 220Leu Gly Tyr
Ile Leu Gly Asp Thr Thr Glu His Cys Ser Thr Tyr Lys225 230 235
240Gly Ser Arg Phe Arg Ala Ala Ile Ile Phe Val Leu Gly Phe Trp Met
245 250 255Leu Asp Leu Ala Asn Asn Thr Val Gln Gly Pro Ala Arg Ala
Leu Leu 260 265 270Ala Asp Leu Ser Gly Pro Asp Gln Cys Asn Ser Ala
Asn Ala Ile Phe 275 280 285Cys Thr Trp Met Ala Val Gly Asn Val Leu
Gly Phe Ser Ser Gly Ala 290 295 300Ser Gly Asn Trp His Lys Trp Phe
Pro Phe Leu Met Thr Arg Ala Cys305 310 315 320Cys Glu Ala Cys Ser
Asn Leu Lys Ala Ala Phe Leu Val Ala Val Val 325 330 335Phe Leu Leu
Phe Cys Met Ser Val Thr Leu Tyr Phe Ala Glu Glu Ile 340 345 350Pro
Leu Glu Pro Thr Asp Ala Gln Arg Leu Ser Asp Ser Ala Pro Leu 355 360
365Leu Asn Gly Ser Arg Asp Asp Asn Asn Ala Ser Asn Glu Pro Arg Asn
370 375 380Gly Ala Leu Pro Asn Gly His Thr Asp Gly Ser Asn Val Pro
Ala Asn385 390 395 400Ser Asn Ala Glu Asp Ser Asn Ser Asn Arg Glu
Asn Val Glu Val Phe 405 410 415Asn Asp Gly Pro Gly Ala Val Leu Val
Asn Ile Leu Thr Ser Met Arg 420 425 430His Leu Pro Pro Gly Met Tyr
Ser Val Leu Leu Val Met Ala Leu Thr 435 440 445Trp Leu Ser Trp Phe
Pro Phe Phe Leu Phe Asp Thr Asp Trp Met Gly 450 455 460Arg Glu Val
Tyr His Gly Asp Pro Asn Gly Asn Leu Ser Glu Arg Lys465 470 475
480Ala Tyr Asp Asn Gly Val Arg Glu Gly Ala Phe Gly Leu Leu Leu Asn
485 490 495Ser Val Val Leu Gly Ile Gly Ser Phe Leu Val Asp Pro Leu
Cys Arg 500 505 510Leu Met Gly Ala Arg Leu Val Trp Ala Ile Ser Asn
Phe Thr Val Phe 515 520 525Ile Cys Met Leu Ala Thr Ala Ile Leu Ser
Trp Ile Ser Phe Asp Leu 530 535 540Tyr Ser Ser Lys Leu His His Ile
Ile Gly Ala Asn Lys Thr Val Lys545 550 555 560Asn Ser Ala Leu Ile
Val Phe Ser Leu Leu Gly Leu Pro Leu Ser Ile 565 570 575Thr Tyr Ser
Val Pro Phe Ser Val Thr Ala Glu Leu Thr Ala Gly Thr 580 585 590Gly
Gly Gly Gln Gly Leu Ala Thr Gly Val Leu Asn Leu Ala Ile Val 595 600
605Val Pro Gln Ile Val Val Ser Leu Gly Ala Gly Pro Trp Asp Ala Leu
610 615 620Phe Gly Gly Gly Asn Val Pro Ala Phe Ala Leu Ala Ser Val
Phe Ser625 630 635 640Leu Gly Ala Gly Val Leu Ala Val Leu Lys Leu
Pro Lys Leu Pro Asn 645 650 655Ser Tyr Arg Ser Ala Gly Phe His Gly
Phe Gly 660 665
SEQ ID NO: 11; length: 1885; type: DNA; organism: Glycine max
gcacgaggag agaaagagaa
aacatttaaa aaaatataaa aaaaaataaa cctctttctc 60tctctgaatt tctaagcctc
tctcaaaata atggaggagc cacaaccagg acccagcccg 120ttacgcaaaa
tgattttggt gtcgtcaatg gcggccggta tccaattcgg gtgggcccta
180cagctctccc ttctcacccc atatgttcaa accctaggcg tcccgcatgc
ttgggcctca 240tttatttggc tatgtggccc gatatctggg ctgctggtgc
agcccattgt gggctacagc 300agcgaccgat gccaatcccg tttcggtcgt
cgccgtccct ttatcctagc cgggtctttg 360gccgtcgcca ttgctgtgtt
cctaattggt tacgcggccg atataggaca cgcggcaggc 420gacaacctga
cccaaaagac tcggccacgt gcagtggcga tcttcgtgat cgggttttgg
480atcctcgacg tggctaacaa catgctccag ggtccatgcc gtgcctttct
gggcgacctc 540gctgccgggg atgagaaaaa gacaaaggca gccaatgcct
tcttctcttt cttcatggcc 600gtcggcaaca tcctgggcta tgctgcggga
tcctacgacg gcctccaccg cctcttcccc 660ttcacggaaa ccgaggcatg
caacgtcttc tgcgcaaacc tcaagagttg cttcttcttc 720gctatcgtcc
tcctggtggt cctcaccacc ttggtgctga ttaccgtgaa agaaactccc
780tacacgccaa aggcagagaa ggaaaccgaa gatgcagaga agacacactt
ctcgtgcttc 840tgcggagaac tttgtcttgc attcaagggg ctgaagaggc
caatgtggat gttgatgttg 900gtgaccgccg tgaactggat agcgtggttc
ccttacttct tgttcgacac cgattggatg 960ggtcgtgagg tgtacggtgg
tgacgtgggg cagaaggcgt acgattcggg agttcatgca 1020ggttctctag
ggctaatgtt gaatgcggtg gtgttggctg tgatgtcatt ggcaattgaa
1080ccgttggggc gtgtggttgg gggaatcaag tggttgtggg gaatcgttaa
catcttgttg 1140gctatatgct tgggaatgac cgttctcatc acaaagatcg
ctgagcatga acgtcttctt 1200aaccctgctt tggttgggaa cccttccctc
ggtatcaaag ttggttccat ggttttcttc 1260tctgtccttg gaatccctct
tgcgattact ttcagtgtcc catttgctct agcatctata 1320tactccagca
cttccggagc aggccaaggt ctatctttgg gtgtccttaa tattgcaatt
1380gtcgttccac agatgatagt atcaaccata agtggacctt gggatgcctt
gttcggcggt 1440ggaaacttgc ctgcattcgt gttgggtgcg gtggccgccg
tcgtgagtgc aatattagca 1500gttcttctgc tgccaactcc aaagaaagct
gatgaggtca gggcttctag cctcaacatg 1560ggaagtttgc attagtgtgt
ctattatagg gctttacatg tttcactttc aaccttgctt 1620tgatatggga
aaaagaactt agtctttaga ttcgaagtgg gtgtgtgcat gtgtatatta
1680ggtattagac atgggtttta gatgcttcca tagccacttt atgtccaagg
acaatcatta 1740atttgtaaac tttggtgcga caattatacc gaatagaaaa
tcattaaaca tacatctttt 1800tatttcacac attaaaaaaa tatcataata
aatatatata ttatcatatt ataaaagaaa 1860tatttgaaaa aaaaaaaaaa aaaaa
188512494PRTGlycine max 12Met Glu Glu Pro Gln Pro Gly Pro Ser Pro
Leu Arg Lys Met Ile Leu1 5 10 15Val Ser Ser Met Ala Ala Gly Ile Gln
Phe Gly Trp Ala Leu Gln Leu 20 25 30Ser Leu Leu Thr Pro Tyr Val Gln
Thr Leu Gly Val Pro His Ala Trp 35 40 45Ala Ser Phe Ile Trp Leu Cys
Gly Pro Ile Ser Gly Leu Leu Val Gln 50 55 60Pro Ile Val Gly Tyr Ser
Ser Asp Arg Cys Gln Ser Arg Phe Gly Arg65 70 75 80Arg Arg Pro Phe
Ile Leu Ala Gly Ser Leu Ala Val Ala Ile Ala Val 85 90 95Phe Leu Ile
Gly Tyr Ala Ala Asp Ile Gly His Ala Ala Gly Asp Asn 100 105 110Leu
Thr Gln Lys Thr Arg Pro Arg Ala Val Ala Ile Phe Val Ile Gly 115 120
125Phe Trp Ile Leu Asp Val Ala Asn Asn Met Leu Gln Gly Pro Cys Arg
130 135 140Ala Phe Leu Gly Asp Leu Ala Ala Gly Asp Glu Lys Lys Thr
Lys Ala145 150 155 160Ala Asn Ala Phe Phe Ser Phe Phe Met Ala Val
Gly Asn Ile Leu Gly 165 170 175Tyr Ala Ala Gly Ser Tyr Asp Gly Leu
His Arg Leu Phe Pro Phe Thr 180 185 190Glu Thr Glu Ala Cys Asn Val
Phe Cys Ala Asn Leu Lys Ser Cys Phe 195 200 205Phe Phe Ala Ile Val
Leu Leu Val Val Leu Thr Thr Leu Val Leu Ile 210 215 220Thr Val Lys
Glu Thr Pro Tyr Thr Pro Lys Ala Glu Lys Glu Thr Glu225 230 235
240Asp Ala Glu Lys Thr His Phe Ser Cys Phe Cys Gly Glu Leu Cys Leu
245 250 255Ala Phe Lys Gly Leu Lys Arg Pro Met Trp Met Leu Met Leu
Val Thr 260 265 270Ala Val Asn Trp Ile Ala Trp Phe Pro Tyr Phe Leu
Phe Asp Thr Asp 275 280 285Trp Met Gly Arg Glu Val Tyr Gly Gly Asp
Val Gly Gln Lys Ala Tyr 290 295 300Asp Ser Gly Val His Ala Gly Ser
Leu Gly Leu Met Leu Asn Ala Val305 310 315 320Val Leu Ala Val Met
Ser Leu Ala Ile Glu Pro Leu Gly Arg Val Val 325 330 335Gly Gly Ile
Lys Trp Leu Trp Gly Ile Val Asn Ile Leu Leu Ala Ile 340 345 350Cys
Leu Gly Met Thr Val Leu Ile Thr Lys Ile Ala Glu His Glu Arg 355 360
365Leu Leu Asn Pro Ala Leu Val Gly Asn Pro Ser Leu Gly Ile Lys Val
370 375 380Gly Ser Met Val Phe Phe Ser Val Leu Gly Ile Pro Leu Ala
Ile Thr385 390 395 400Phe Ser Val Pro Phe Ala Leu Ala Ser Ile Tyr
Ser Ser Thr Ser Gly 405 410 415Ala Gly Gln Gly Leu Ser Leu Gly Val
Leu Asn Ile Ala Ile Val Val 420 425 430Pro Gln Met Ile Val Ser Thr
Ile Ser Gly Pro Trp Asp Ala Leu Phe 435 440 445Gly Gly Gly Asn Leu
Pro Ala Phe Val Leu Gly Ala Val Ala Ala Val 450 455 460Val Ser Ala
Ile Leu Ala Val Leu Leu Leu Pro Thr Pro Lys Lys Ala465 470 475
480Asp Glu Val Arg Ala Ser Ser Leu Asn Met Gly Ser Leu His 485
490131041DNAGlycine maxmisc_feature(1007)..(1007)n is a, c, g, or t
13gcacgagctc acactctctc tttctttctt cctgctgcta caatatggag cctctctctt
60ccaccaaaca caacaacaat ctctccaagc cttcctccct ccacacggag gctccgccgc
120cggaggccag tcccctccgg aagatcatgg tggtggcctc catcgccgcc
ggggtgcaat 180tcgggtgggc cctacagctc tctctactta ccccttacgt
ccaactgctg gggattcccc 240acacttgggc cgccttcatc tggctctgcg
gcccaatctc cggcatgctc gtccagccca 300tcgtgggata ccacagcgac
cgctgcacct cccgcttcgg ccgccgccgc cccttcatcg 360ccgccggctc
cctcgccgtc gccatcgccg tcttccttat cggctacgcc gccgacctcg
420gccacatgtt cggcgactcc ctagccaaaa aaaccgcccc gcgccatcgc
atcttcgttg 480tcggcttctg gattctcgac gtcgcaaaca acatgctaca
agggccctgc cgcgccctcc 540tgggcgacct ctgcgccgga gaacaacgga
aaacgcgaaa cgcaaacgcc ttcttctcct 600tcttcatggc cgtcggaaac
gtcctgggct acgccgcggg ctcttacagc ggcctccaca 660acgtcttccc
tttcactaaa acaaaagcat gtgatgttta ctgcgcgaat ttgaagagtt
720gtttcttcct ctccatcgcg cttcttctca ctctctccac aatcgccttg
acctacgtga 780aggagaaaac ggtgtcgtca gagaaaacgg tgaggagttc
ggtggaggag gatgggtccc 840acgggggcat gccgtgcttc gggcaattat
tcggtgcgtt ccgcgaactg aagcgtccca 900tgtggatcct tctgttggtg
acgtgtctga actgggattg cctggttcct tttttgctat 960tcgacaccga
ctgggattgg ggcgtgaggt gtacggaggg aaaattnggg gaaaggaaag
1020ggtacgataa ggggttccgt t 104114322PRTGlycine
maxmisc_feature(311)..(311)Xaa can be any naturally occurring amino
acid 14Met Glu Pro Leu Ser Ser Thr Lys His Asn Asn Asn Leu Ser Lys
Pro1 5 10 15Ser Ser Leu His Thr Glu Ala Pro Pro Pro Glu Ala Ser Pro
Leu Arg 20 25 30Lys Ile Met Val Val Ala Ser Ile Ala Ala Gly Val Gln
Phe Gly Trp 35 40 45Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr Val Gln
Leu Leu Gly Ile 50 55 60Pro His Thr Trp Ala Ala Phe Ile Trp Leu Cys
Gly Pro Ile Ser Gly65 70 75 80Met Leu Val Gln Pro Ile Val Gly Tyr
His Ser Asp Arg Cys Thr Ser 85 90 95Arg Phe Gly Arg Arg Arg Pro Phe
Ile Ala Ala Gly Ser Leu Ala Val 100 105 110Ala Ile Ala Val Phe Leu
Ile Gly Tyr Ala Ala Asp Leu Gly His Met 115 120 125Phe Gly Asp Ser
Leu Ala Lys Lys Thr Ala Pro Arg His Arg Ile Phe 130 135 140Val Val
Gly Phe Trp Ile Leu Asp Val Ala Asn Asn Met Leu Gln Gly145 150 155
160Pro Cys Arg Ala Leu Leu Gly Asp Leu Cys Ala Gly Glu Gln Arg Lys
165 170 175Thr Arg Asn Ala Asn Ala Phe Phe Ser Phe Phe Met Ala Val
Gly Asn 180 185 190Val Leu Gly Tyr Ala Ala Gly Ser Tyr Ser Gly Leu
His Asn Val Phe 195 200 205Pro Phe Thr Lys Thr Lys Ala Cys Asp Val
Tyr Cys Ala Asn Leu Lys 210 215 220Ser Cys Phe Phe Leu Ser Ile Ala Leu Leu
Leu Thr Leu Ser Thr Ile225 230 235 240Ala Leu Thr Tyr Val Lys Glu
Lys Thr Val Ser Ser Glu Lys Thr Val 245 250 255Arg Ser Ser Val Glu
Glu Asp Gly Ser His Gly Gly Met Pro Cys Phe 260 265 270Gly Gln Leu
Phe Gly Ala Phe Arg Glu Leu Lys Arg Pro Met Trp Ile 275 280 285Leu
Leu Leu Val Thr Cys Leu Asn Trp Asp Cys Leu Val Pro Phe Leu 290 295
300Leu Phe Asp Thr Asp Trp Xaa Gly Arg Glu Val Tyr Gly Gly Lys
Ile305 310 315 320Xaa Gly15578DNAVernonia mespilifolia 15gcacgaggtt
ggcttggcgg tgtgaaacgg ttatggggtg gcatcaattt ccttctagct 60gtttgtttgg
ccatgacggt ggtggtgacc aaaatggcag actctgaacg acagtttaag
120acgttgcccg acggtagcaa aaccgcgttg ccaccaggcg gcgacattaa
agccggtgct 180ttgtcaattt ttgccgtcct cggtgcccca ctagctgtga
ctttcagtgt tccatgtgct 240cttgcatcaa tattttctaa cagttcagga
gctggacaag gtctatcact tggtgttttg 300aatctagcaa tcgtcatacc
acagatgttc gtatcagtac taagtggacc atgggacgca 360ctgttcggcg
gtggaaactt accagcattt gtggttggag caatttcggc tgcagtaagt
420gggatattat cgttcaccat gcttccttcg ccacccccag atgtcgtact
ttcaaaggtt 480tccggaggtg ggatgcatta gagagtaaat aactgccact
caacacgtcc cgattgtgtc 540agattgggac atttaggacc aaaaaaaaaa aaaaaaaa
57816166PRTVernonia mespilifolia 16Ala Arg Gly Trp Leu Gly Gly Val
Lys Arg Leu Trp Gly Gly Ile Asn1 5 10 15Phe Leu Leu Ala Val Cys Leu
Ala Met Thr Val Val Val Thr Lys Met 20 25 30Ala Asp Ser Glu Arg Gln
Phe Lys Thr Leu Pro Asp Gly Ser Lys Thr 35 40 45Ala Leu Pro Pro Gly
Gly Asp Ile Lys Ala Gly Ala Leu Ser Ile Phe 50 55 60Ala Val Leu Gly
Ala Pro Leu Ala Val Thr Phe Ser Val Pro Cys Ala65 70 75 80Leu Ala
Ser Ile Phe Ser Asn Ser Ser Gly Ala Gly Gln Gly Leu Ser 85 90 95Leu
Gly Val Leu Asn Leu Ala Ile Val Ile Pro Gln Met Phe Val Ser 100 105
110Val Leu Ser Gly Pro Trp Asp Ala Leu Phe Gly Gly Gly Asn Leu Pro
115 120 125Ala Phe Val Val Gly Ala Ile Ser Ala Ala Val Ser Gly Ile
Leu Ser 130 135 140Phe Thr Met Leu Pro Ser Pro Pro Pro Asp Val Val
Leu Ser Lys Val145 150 155 160Ser Gly Gly Gly Met His
165171062DNATriticum aestivum 17ctggaatgcc gtcagtgctc ctcgtcaccg
gcctcacctg gctgtcctgg ttccccttca 60tcctgtacga caccgactgg atgggtcgtg
agatctacca cggtgacccc aagggaaccc 120ccgacgaggc caacgcgttc
caggcaggtg tcagggccgg ggcgttcggc ctgctactca 180actcggtcgt
cctggggttc agctcgttcc tgatcgagcc gctgtgcaag aggctaggcc
240cgcgggtggt gtgggtgtca agcaacttcc tcgtctgcat ctccatggcc
gccatttgca 300tcataagctg gtgggccact caggacctgc atgggtacat
ccagcacgcc atcaccgcca 360gcaaggagat caagatcgtc tccctcgccc
tcttcgcctt cctcggaatc cctctcgcca 420ttctgtacag tgtccctttc
gcggtgacgg cgcagctggc ggcgaacaga ggcggtggcc 480aagggctgtg
cacgggcgtg ctgaacatcg ccatcgtgat accccaggtg atcatcgcgg
540tgggggcggg gccgtgggac gagctgttcg gcaagggcaa catcccggcg
ttcggcgtgg 600cgtccgcctt cgcgctcatc ggcggcatcg tcggcatatt
cctgctgccc aagatctcca 660ggcgccagtt ccgggccgtc agcggcggcg
gtcactgacc gcgccgcgcg ccggtcggcc 720tgagcatggc gaaggccgat
cgcgccggcc cgaaggtccc agcccagctc ggcatttacc 780aaattttcgc
ataggcgtaa ctagggggct ctcgcctaag gactccgtag agcagaataa
840gaattgtgag gaacctgtat gtgttgtgtc tgtatgtgcg tgtaagtcag
tgcgtgtagc 900ggaaaatgga cagaggaatg cgggcatcca tcgccggctg
gggtgtcgtc tttgggttgt 960gacttgtgtg tagcaaacca aggtgatcaa
gtgaggggaa aagaatggat gatgaacttt 1020cagcgacaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aa 106218232PRTTriticum aestivum 18Ala Gly
Met Pro Ser Val Leu Leu Val Thr Gly Leu Thr Trp Leu Ser1 5 10 15
Trp Phe Pro Phe Ile Leu Tyr Asp Thr Asp Trp Met Gly Arg Glu Ile 20
25 30Tyr His Gly Asp Pro Lys Gly Thr Pro Asp Glu Ala Asn Ala Phe
Gln 35 40 45Ala Gly Val Arg Ala Gly Ala Phe Gly Leu Leu Leu Asn Ser
Val Val 50 55 60Leu Gly Phe Ser Ser Phe Leu Ile Glu Pro Leu Cys Lys
Arg Leu Gly65 70 75 80Pro Arg Val Val Trp Val Ser Ser Asn Phe Leu
Val Cys Ile Ser Met 85 90 95Ala Ala Ile Cys Ile Ile Ser Trp Trp Ala
Thr Gln Asp Leu His Gly 100 105 110Tyr Ile Gln His Ala Ile Thr Ala
Ser Lys Glu Ile Lys Ile Val Ser 115 120 125Leu Ala Leu Phe Ala Phe
Leu Gly Ile Pro Leu Ala Ile Leu Tyr Ser 130 135 140Val Pro Phe Ala
Val Thr Ala Gln Leu Ala Ala Asn Arg Gly Gly Gly145 150 155 160Gln
Gly Leu Cys Thr Gly Val Leu Asn Ile Ala Ile Val Ile Pro Gln 165 170
175Val Ile Ile Ala Val Gly Ala Gly Pro Trp Asp Glu Leu Phe Gly Lys
180 185 190Gly Asn Ile Pro Ala Phe Gly Val Ala Ser Ala Phe Ala Leu
Ile Gly 195 200 205Gly Ile Val Gly Ile Phe Leu Leu Pro Lys Ile Ser
Arg Arg Gln Phe 210 215 220Arg Ala Val Ser Gly Gly Gly His225
230192083DNATriticum aestivummisc_feature(1093)..(1093)n is a, c,
g, or t 19gcacgagcac accacaccac acctctctct ctctcactcg cactttccgc
tctcgtctcc 60tcctcttcct cctcccgtca gacccttctt ccccggcgtt gatccgatca
acgtcctcct 120ccgtcctgcc cctagatcct tggccgggca gggatacgcc
gtagaattga taggcgaacg 180gacgaggtgg tgatcgccag ggcggcctct
ctgccatggc gcgcggcgga ggcaacggcg 240aggtggagct ctcggtcggg
gtcggcggcg gaggcggcgg cgccgccggc gggggggagc 300aacccgccgt
ggacatcagc ctcggcagac tcatcctcgc cggcatggtc gccggcggcg
360tgcagtacgg atgggcgctc cagctctccc tgctcacccc ctacgtccag
actctgggac 420tttcgcatgc tctgacttca ttcatgtggc tctgcggccc
tattgctgga ttagtggttc 480aaccatgcgt tgggctctac agtgacaagt
gcacatctag atggggaaga cgcagaccgt 540ttattctgac aggatgcatc
ctcatctgca ttgctgttgt ggtcgtcggc ttctcggctg 600acattggagc
tggtctgggt gacagcaagg aagagtgcag tctctatcat gggcctcgtt
660ggcacgctgc aattgtgtat gttcttggat tctggctcct tgacttctcc
aacaacactg 720tgcaaggtcc agcgcgtgct ctgatggctg atttatcagc
tcagcatgga cccagtgcag 780caaattcaat cttctgttct tggatggcgc
taggaaatat ccttggatac tcctctggtt 840ccacaaacaa ctggcacaag
tggtttccgt tcctccggac aagggcttgc tgtgaagcct 900gcgcaaatct
gaaaggcgca tttctggtgg cagtgctggt cctggccttc tgtttggtga
960taactgtgat cttcgccaag gagataccgt acaaggcgat tgcgcccctc
ccaacaaagg 1020gcaatggcca ggttgaagtc gagcccaccg ggccgctcgc
cgtgttcaaa ggcttcaaga 1080acttgcctcc tgnaatgccg tcggtgctcc
tcgtcactgg cctcacctgg ctgtcctggt 1140tccccttcat cctgtacgac
accgactgga tgggtcgtga gatctaccac ggtgacccca 1200agggaacccc
cgacgaggcc aacgcgttcc aggcaggtgt cagggccggg gcgttcggcc
1260tgctactcaa ctcggtcgtc ctggggttca gctcgttcct gatcgagccg
ctgtgcaaga 1320ggctaggccc gcgggtggtg tgggtgtcga gcaacttcct
cgtctgcctc tccatggccg 1380cgatttgcat cataagctgg tgggctactc
aggacttgca tgggtatatc cagcacgcca 1440tcaccgccag caaggagatc
aagatcgtct ccctcgccct cttcgccttc ctcggaatcc 1500ctctcgccat
tctgtacagt gtccctttcg cggtgacggc gcagctggcg gcgaagagag
1560gcggtggcca agggctgtgc acgggcgtgc tcaacatcgc catcgtgata
ccccaggtga 1620tcatcgcggt gggggcgggg ccgtgggacg agctgttcgg
caagggcaac atcccggcgt 1680tcggcatggc ctccgccttc gcgctcatcg
gcggcatcgt cggcatattc ctgctgccca 1740agatctccag gcgccagttc
cgggccgtca gcggcggcgg tcactgagca tggccaaggc 1800cggaggtccc
agcccagccc gccatttacc aaattttcgc ataggcgtaa ctaggtggct
1860ctcgcctaag gactccgtag agcagaataa gaattgtgag gaacctgtat
gtgttgtgtc 1920tgtatgtgcg tgtaagtcag tgcgtgtagc ggaaaatgga
cagaggaatg tgggcatcca 1980tcaccggctg gggtgtcgtc tttgggttgt
gacttgtgtg tagcaaacca aggtgatcaa 2040gtgaggggaa atgaatggat
gatgaacttt cagcgacaaa aaa 208320522PRTTriticum aestivum 20Met Ala
Arg Gly Gly Gly Asn Gly Glu Val Glu Leu Ser Val Gly Val1 5 10 15Gly
Gly Gly Gly Gly Gly Ala Ala Gly Gly Gly Glu Gln Pro Ala Val 20 25
30Asp Ile Ser Leu Gly Arg Leu Ile Leu Ala Gly Met Val Ala Gly Gly
35 40 45Val Gln Tyr Gly Trp Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr
Val 50 55 60Gln Thr Leu Gly Leu Ser His Ala Leu Thr Ser Phe Met Trp
Leu Cys65 70 75 80Gly Pro Ile Ala Gly Leu Val Val Gln Pro Cys Val
Gly Leu Tyr Ser 85 90 95Asp Lys Cys Thr Ser Arg Trp Gly Arg Arg Arg
Pro Phe Ile Leu Thr 100 105 110Gly Cys Ile Leu Ile Cys Ile Ala Val
Val Val Val Gly Phe Ser Ala 115 120 125Asp Ile Gly Ala Gly Leu Gly
Asp Ser Lys Glu Glu Cys Ser Leu Tyr 130 135 140His Gly Pro Arg Trp
His Ala Ala Ile Val Tyr Val Leu Gly Phe Trp145 150 155 160Leu Leu
Asp Phe Ser Asn Asn Thr Val Gln Gly Pro Ala Arg Ala Leu 165 170
175Met Ala Asp Leu Ser Ala Gln His Gly Pro Ser Ala Ala Asn Ser Ile
180 185 190Phe Cys Ser Trp Met Ala Leu Gly Asn Ile Leu Gly Tyr Ser
Ser Gly 195 200 205Ser Thr Asn Asn Trp His Lys Trp Phe Pro Phe Leu
Arg Thr Arg Ala 210 215 220Cys Cys Glu Ala Cys Ala Asn Leu Lys Gly
Ala Phe Leu Val Ala Val225 230 235 240Leu Val Leu Ala Phe Cys Leu
Val Ile Thr Val Ile Phe Ala Lys Glu 245 250 255Ile Pro Tyr Lys Ala
Ile Ala Pro Leu Pro Thr Lys Gly Asn Gly Gln 260 265 270Val Glu Val
Glu Pro Thr Gly Pro Leu Ala Val Phe Lys Gly Phe Lys 275 280 285Asn
Leu Pro Pro Met Pro Ser Val Leu Leu Val Thr Gly Leu Thr Trp 290 295
300Leu Ser Trp Phe Pro Phe Ile Leu Tyr Asp Thr Asp Trp Met Gly
Arg305 310 315 320Glu Ile Tyr His Gly Asp Pro Lys Gly Thr Pro Asp
Glu Ala Asn Ala 325 330 335Phe Gln Ala Gly Val Arg Ala Gly Ala Phe
Gly Leu Leu Leu Asn Ser 340 345 350Val Val Leu Gly Phe Ser Ser Phe
Leu Ile Glu Pro Leu Cys Lys Arg 355 360 365Leu Gly Pro Arg Val Val
Trp Val Ser Ser Asn Phe Leu Val Cys Leu 370 375 380Ser Met Ala Ala
Ile Cys Ile Ile Ser Trp Trp Ala Thr Gln Asp Leu385 390 395 400His
Gly Tyr Ile Gln His Ala Ile Thr Ala Ser Lys Glu Ile Lys Ile 405 410
415Val Ser Leu Ala Leu Phe Ala Phe Leu Gly Ile Pro Leu Ala Ile Leu
420 425 430Tyr Ser Val Pro Phe Ala Val Thr Ala Gln Leu Ala Ala Lys
Arg Gly 435 440 445Gly Gly Gln Gly Leu Cys Thr Gly Val Leu Asn Ile
Ala Ile Val Ile 450 455 460Pro Gln Val Ile Ile Ala Val Gly Ala Gly
Pro Trp Asp Glu Leu Phe465 470 475 480Gly Lys Gly Asn Ile Pro Ala
Phe Gly Met Ala Ser Ala Phe Ala Leu 485 490 495Ile Gly Gly Ile Val
Gly Ile Phe Leu Leu Pro Lys Ile Ser Arg Arg 500 505 510Gln Phe Arg
Ala Val Ser Gly Gly Gly His 515 520212160DNATriticum aestivum
21gcacgagacc acccctctct ctctctctca ctcgcgcttt ccgctctcgt ctcctcctct
60tcctcctccc gtcagcccct tcttccccgg cgttgatccg atcgacgtcc tccctcctcc
120ccggcgttga tccgacgcgc cgtagagttg ataggcgaac gaacggggcg
gtgatcgtcc 180gggcggcccc cctgcgacga tggcgcgcgg cggcggcaac
ggcgaggtgg agctctcggt 240gggggtcggc ggaggcggcg ccggcgccgg
cggggcggac gcccccgccg tggacatcag 300cctcggcagg ctcatcctcg
ccggcatggt cgccggcggc gtgcagtacg gatgggcgct 360ccagctctcc
ctgctcaccc cctacgtcca gactctggga ctttcgcatg ctctgacttc
420attcatgtgg ctctgcggcc ctattgctgg attagtggtt caaccatgcg
ttgggctcta 480cagtgacaag tgcacttcaa gatggggaag acgcagaccg
ttcattctga caggatgtat 540cctcatctgc attgctgtcg tggtcgtcgg
cttctcggct gacattggag ctgctctggg 600tgacagcaag gaagagtgca
gtctctatca tgggcctcgt tggcacgctg caattgtgta 660tgttcttgga
ttctggctcc ttgacttctc caacaacaca gtgcaaggac cagcgcgtgc
720tctgatggct gatttatcag cccagcatgg acccagtgca gcaaattcaa
tcttctgttc 780ttggatggca ctgggaaata tcctaggata ctcatctggt
tccacaaata actggcacaa 840gtggtttccg ttcctccgga caagggcttg
ctgtgaagcc tgcgcaaatc tgaaaggcgc 900atttctggtg gcagtgctgt
tcctggcctt ctgtttggtg ataaccgtga tcttcgccaa 960ggagataccg
tacaaggcga ttgcgcccct cccaacaaag gccaatggcc aggttgaagt
1020cgagcccacc gggccgctcg ccgtcttcaa aggcttcaag aacttgcctc
ctggaatgcc 1080gtcagtgctc ctcgtcaccg gcctcacctg gctgtcctgg
ttccccttca tcctgtacga 1140caccgactgg atgggtcgtg agatctacca
cggtgacccc aagggaaccc ccgacgaggc 1200caacgcgttc caggcaggtg
tcagggccgg ggcgttcggc ctgctactca actcggtcgt 1260cctggggttc
agctcgttcc tgatcgagcc gctgtgcaag aggctaggcc cgcgggtggt
1320gtgggtgtca agcaacttcc tcgtctgcct ctccatggcc gccatttgca
tcataagctg 1380gtgggccact caggacctgc atgggtacat ccagcacgcc
atcaccgcca gcaaggagat 1440caagatcgtc tccctcgccc tcttcgcctt
cctcggaatc cctctcgcca ttctgtacag 1500tgtcactttc gccgtgacgg
cgcagctggc ggcgaacaga tgcggtgggc aatggctgtg 1560cacgggcgtg
ctgaacatcg ccatcgcgat accccaggtg atcatcgcgt tgggggcggg
1620gccgtgggac gagctgttcg gcaagggcaa catcccggcg ttcggcgtgg
cgtccgcctt 1680cgcgctcatc ggcggcatcg tcggcatatt cctgctgccc
aagatctcca ggctccagtt 1740ccgggccgtc agcggcggcg gtcactgacc
gcgccgcgcg ccggtcggcc tgagcatggc 1800gaaggccgat cgcgccggcc
cgaaggtccc agcccagctc ggcatttacc aaattttcgc 1860ataggcgtaa
ctagggggct ctcgcctaag gactccgtag agcagaataa gaattgtgag
1920gaacctgtat gtgttgtgtc tgtatgtgcg tgtaagtcag tgcgtgtagc
ggaaaatgga 1980cagaggaatg cgggcatcca tcgccggctg gggtgtcgtc
tttgggttgt gacttgtgtg 2040tagcaaacca aggtgatcaa gtgaggggaa
aagaatggat gatgaacttt cagcgacaaa 2100aaaaaaaaaa aaaaaaaaaa
aaaaaaataa aaaaaaaaaa aagaaaaaaa taaaaaaaaa 216022522PRTTriticum
aestivum 22Met Ala Arg Gly Gly Gly Asn Gly Glu Val Glu Leu Ser Val
Gly Val1 5 10 15Gly Gly Gly Gly Ala Gly Ala Gly Gly Ala Asp Ala Pro
Ala Val Asp 20 25 30Ile Ser Leu Gly Arg Leu Ile Leu Ala Gly Met Val
Ala Gly Gly Val 35 40 45Gln Tyr Gly Trp Ala Leu Gln Leu Ser Leu Leu
Thr Pro Tyr Val Gln 50 55 60Thr Leu Gly Leu Ser His Ala Leu Thr Ser
Phe Met Trp Leu Cys Gly65 70 75 80Pro Ile Ala Gly Leu Val Val Gln
Pro Cys Val Gly Leu Tyr Ser Asp 85 90 95Lys Cys Thr Ser Arg Trp Gly
Arg Arg Arg Pro Phe Ile Leu Thr Gly 100 105 110Cys Ile Leu Ile Cys
Ile Ala Val Val Val Val Gly Phe Ser Ala Asp 115 120 125Ile Gly Ala
Ala Leu Gly Asp Ser Lys Glu Glu Cys Ser Leu Tyr His 130 135 140Gly
Pro Arg Trp His Ala Ala Ile Val Tyr Val Leu Gly Phe Trp Leu145 150
155 160Leu Asp Phe Ser Asn Asn Thr Val Gln Gly Pro Ala Arg Ala Leu
Met 165 170 175Ala Asp Leu Ser Ala Gln His Gly Pro Ser Ala Ala Asn
Ser Ile Phe 180 185 190Cys Ser Trp Met Ala Leu Gly Asn Ile Leu Gly
Tyr Ser Ser Gly Ser 195 200 205Thr Asn Asn Trp His Lys Trp Phe Pro
Phe Leu Arg Thr Arg Ala Cys 210 215 220Cys Glu Ala Cys Ala Asn Leu
Lys Gly Ala Phe Leu Val Ala Val Leu225 230 235 240Phe Leu Ala Phe
Cys Leu Val Ile Thr Val Ile Phe Ala Lys Glu Ile 245 250 255Pro Tyr
Lys Ala Ile Ala Pro Leu Pro Thr Lys Ala Asn Gly Gln Val 260 265
270Glu Val Glu Pro Thr Gly Pro Leu Ala Val Phe Lys Gly Phe Lys Asn
275 280 285Leu Pro Pro Gly Met Pro Ser Val Leu Leu Val Thr Gly Leu
Thr Trp 290 295 300Leu Ser Trp Phe Pro Phe Ile Leu Tyr Asp Thr Asp
Trp Met Gly Arg305 310 315 320Glu Ile Tyr His Gly Asp Pro Lys Gly
Thr Pro Asp Glu Ala Asn Ala 325 330 335Phe Gln Ala Gly Val Arg Ala
Gly Ala Phe Gly Leu Leu Leu Asn Ser 340 345 350Val Val Leu Gly Phe
Ser Ser Phe Leu Ile Glu Pro Leu Cys Lys Arg 355 360 365Leu Gly Pro
Arg Val Val Trp Val Ser Ser Asn Phe Leu Val Cys Leu 370 375 380Ser
Met Ala Ala Ile Cys Ile Ile Ser Trp Trp Ala Thr Gln Asp Leu385 390
395 400His Gly Tyr Ile Gln His Ala Ile Thr Ala Ser Lys Glu Ile Lys
Ile 405 410 415Val Ser Leu Ala Leu Phe Ala Phe Leu Gly Ile Pro Leu
Ala Ile Leu 420 425 430Tyr Ser Val Thr Phe Ala Val Thr Ala Gln Leu Ala Ala Asn Arg
Cys 435 440 445Gly Gly Gln Trp Leu Cys Thr Gly Val Leu Asn Ile Ala
Ile Ala Ile 450 455 460Pro Gln Val Ile Ile Ala Leu Gly Ala Gly Pro
Trp Asp Glu Leu Phe465 470 475 480Gly Lys Gly Asn Ile Pro Ala Phe
Gly Val Ala Ser Ala Phe Ala Leu 485 490 495Ile Gly Gly Ile Val Gly
Ile Phe Leu Leu Pro Lys Ile Ser Arg Leu 500 505 510Gln Phe Arg Ala
Val Ser Gly Gly Gly His 515 520232030DNATriticum aestivum
23cggaagcgac gccgcgcggc ccaaggagga acagggcagc ggcgcggggg cgggggaagg
60cggcatgaag ggcgcgccca agtggcgggt ggtgctggcc tgcatggtcg ccgccggcgt
120gcagttcggc tgggcgctcc agctctccct cctcaccccc tacatccaga
ctctaggaat 180agaccatgcc atggcgtcct tcatttggct ttgcgggccc
attactggtt ttgtggttca 240accgtgtgtt ggtgtctgga gtgacaagtg
ccgctccaag tacgggagga gacggccgtt 300cattttggct ggatgcgtgc
tgatttgtgc agctgtaact ttagtcgggt tttctgcaga 360ccttggctac
atgttaggag acaccactga gcactgcagt acatacaaag gtctacgata
420tcgagctgct tttattttca tttttggatt ctggatgctg gaccttgcaa
ataatacagt 480tcaaggacct gctcgtgccc tcctagctga tctttcaggt
cccgatcaat gtaattcggc 540aaatgcaata ttctgctcat ggatggctgt
tggaaacgtt cttggttttt cagctggtgc 600gagtgggaat tggcacaagt
ggtttccttt tctgatgact agggcctgtt gtgaagcttg 660tggtaatttg
aaagcagctt tcttgattgc agttgtattc cttctgtttt gcatggctgt
720taccctctac tttgctgaag agattccact ggaaccaaag gatgcacagc
agttatctga 780ctcggctcct ctactgaacg gttctagaga tgatcatgat
gcttcaagtg aacagactaa 840tggaggactt tctaacggtc atgctgatgc
aaaccatgtc tcagctaact ccagtgcaga 900tgcaggttcc aactcgaaca
aggacgatgt tgaggctttc aatgatggac caggagcagt 960tttggttaaa
attttgacta gcatgaggca tctacctcct ggaatgtatt ccgtgcttct
1020ggttatggcc ctaacatggc tgtcgtggtt tccctttttc ctttttgaca
ccgactggat 1080ggggcgtgag gtttatcacg gtgacccaaa aggaaacgcg
agtgaaagga aagcttatga 1140tgatggtgtc cgagaaggtg catttggttt
gctattgaat tcagtcgtcc ttgggattgg 1200ctctttcctt atcgatccat
tatgccggat gattggtgca agattggttt gggcaatcag 1260caacttcata
gtgtttgcct gcatgttggc tacaacaata ctaagttgga tctcctatga
1320cctgtactcg agcaagcttc aacatattgt cggggcagat aaaacagtca
agacctcagc 1380gcttattctt ttctctcttc tcggattgcc actctcgatc
acttatagtg ttccgttctc 1440cgtgactgct gagctgactg ccggaacagg
aggcggacaa ggtttggcta ctggagttct 1500gaatcttgcc atcgtcgctc
ctcagatagt agtgtcactc ggagcaggcc catgggacaa 1560gctcttgggg
ggagggaacg tccccgcttt cgccctggcc tcggtcttct cgctagcagc
1620cggagtgctc gcggtgatca agctgcccaa gttgtcgaac aattaccaat
ccgccggctt 1680ccacatgggc tgaaccctaa agcccgaagc cagctgctgt
gtgtaacatc cagatgttta 1740gtaccaatcc gccggtttcc atattaagat
tcgtttatat ggagatgatt ctttttctcc 1800tcttgctaga tacacagtta
ataagactac agatcagata gactaggata aagagatagt 1860ttttaggcct
gtgtgcatac aagtgtcgat gagaagttgt aaaacatgta cactgttttt
1920ttgtactgta tatgtagtga aatttcatag atggccggat gtgttctggt
ccgataaaaa 1980aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa 203024563PRTTriticum aestivum 24Gly Ser Asp Ala Ala Arg
Pro Lys Glu Glu Gln Gly Ser Gly Ala Gly1 5 10 15Ala Gly Glu Gly Gly
Met Lys Gly Ala Pro Lys Trp Arg Val Val Leu 20 25 30Ala Cys Met Val
Ala Ala Gly Val Gln Phe Gly Trp Ala Leu Gln Leu 35 40 45Ser Leu Leu
Thr Pro Tyr Ile Gln Thr Leu Gly Ile Asp His Ala Met 50 55 60Ala Ser
Phe Ile Trp Leu Cys Gly Pro Ile Thr Gly Phe Val Val Gln65 70 75
80Pro Cys Val Gly Val Trp Ser Asp Lys Cys Arg Ser Lys Tyr Gly Arg
85 90 95Arg Arg Pro Phe Ile Leu Ala Gly Cys Val Leu Ile Cys Ala Ala
Val 100 105 110Thr Leu Val Gly Phe Ser Ala Asp Leu Gly Tyr Met Leu
Gly Asp Thr 115 120 125Thr Glu His Cys Ser Thr Tyr Lys Gly Leu Arg
Tyr Arg Ala Ala Phe 130 135 140Ile Phe Ile Phe Gly Phe Trp Met Leu
Asp Leu Ala Asn Asn Thr Val145 150 155 160Gln Gly Pro Ala Arg Ala
Leu Leu Ala Asp Leu Ser Gly Pro Asp Gln 165 170 175Cys Asn Ser Ala
Asn Ala Ile Phe Cys Ser Trp Met Ala Val Gly Asn 180 185 190Val Leu
Gly Phe Ser Ala Gly Ala Ser Gly Asn Trp His Lys Trp Phe 195 200
205Pro Phe Leu Met Thr Arg Ala Cys Cys Glu Ala Cys Gly Asn Leu Lys
210 215 220Ala Ala Phe Leu Ile Ala Val Val Phe Leu Leu Phe Cys Met
Ala Val225 230 235 240Thr Leu Tyr Phe Ala Glu Glu Ile Pro Leu Glu
Pro Lys Asp Ala Gln 245 250 255Gln Leu Ser Asp Ser Ala Pro Leu Leu
Asn Gly Ser Arg Asp Asp His 260 265 270Asp Ala Ser Ser Glu Gln Thr
Asn Gly Gly Leu Ser Asn Gly His Ala 275 280 285Asp Ala Asn His Val
Ser Ala Asn Ser Ser Ala Asp Ala Gly Ser Asn 290 295 300Ser Asn Lys
Asp Asp Val Glu Ala Phe Asn Asp Gly Pro Gly Ala Val305 310 315
320Leu Val Lys Ile Leu Thr Ser Met Arg His Leu Pro Pro Gly Met Tyr
325 330 335Ser Val Leu Leu Val Met Ala Leu Thr Trp Leu Ser Trp Phe
Pro Phe 340 345 350Phe Leu Phe Asp Thr Asp Trp Met Gly Arg Glu Val
Tyr His Gly Asp 355 360 365Pro Lys Gly Asn Ala Ser Glu Arg Lys Ala
Tyr Asp Asp Gly Val Arg 370 375 380Glu Gly Ala Phe Gly Leu Leu Leu
Asn Ser Val Val Leu Gly Ile Gly385 390 395 400Ser Phe Leu Ile Asp
Pro Leu Cys Arg Met Ile Gly Ala Arg Leu Val 405 410 415Trp Ala Ile
Ser Asn Phe Ile Val Phe Ala Cys Met Leu Ala Thr Thr 420 425 430Ile
Leu Ser Trp Ile Ser Tyr Asp Leu Tyr Ser Ser Lys Leu Gln His 435 440
445Ile Val Gly Ala Asp Lys Thr Val Lys Thr Ser Ala Leu Ile Leu Phe
450 455 460Ser Leu Leu Gly Leu Pro Leu Ser Ile Thr Tyr Ser Val Pro
Phe Ser465 470 475 480Val Thr Ala Glu Leu Thr Ala Gly Thr Gly Gly
Gly Gln Gly Leu Ala 485 490 495Thr Gly Val Leu Asn Leu Ala Ile Val
Ala Pro Gln Ile Val Val Ser 500 505 510Leu Gly Ala Gly Pro Trp Asp
Lys Leu Leu Gly Gly Gly Asn Val Pro 515 520 525Ala Phe Ala Leu Ala
Ser Val Phe Ser Leu Ala Ala Gly Val Leu Ala 530 535 540Val Ile Lys
Leu Pro Lys Leu Ser Asn Asn Tyr Gln Ser Ala Gly Phe545 550 555
560His Met Gly25501PRTDaucus carota 25Met Ala Gly Pro Glu Ala Asp
Arg Asn Arg His Arg Gly Gly Ala Thr1 5 10 15Ala Ala Pro Pro Pro Arg
Ser Arg Val Ser Leu Arg Leu Leu Leu Arg 20 25 30Val Ala Ser Val Ala
Cys Gly Ile Gln Phe Gly Trp Ala Leu Gln Leu 35 40 45Ser Leu Leu Thr
Pro Tyr Val Gln Glu Leu Gly Ile Pro His Ala Trp 50 55 60Ser Ser Ile
Ile Trp Leu Cys Gly Pro Leu Ser Gly Leu Leu Val Gln65 70 75 80Pro
Ile Val Gly His Met Ser Asp Gln Cys Thr Ser Lys Tyr Gly Arg 85 90
95Arg Arg Pro Phe Ile Val Ala Gly Gly Thr Ala Ile Ile Leu Ala Val
100 105 110Ile Ile Ile Ala His Ser Ala Asp Ile Gly Gly Leu Leu Gly
Asp Thr 115 120 125Ala Asp Asn Lys Thr Met Ala Ile Val Ala Phe Val
Ile Gly Phe Trp 130 135 140Ile Leu Asp Val Ala Asn Asn Met Thr Gln
Gly Pro Cys Arg Ala Leu145 150 155 160Leu Ala Asp Leu Thr Gly Asn
Asp Ala Arg Arg Thr Arg Val Ala Asn 165 170 175Ala Tyr Phe Ser Leu
Phe Met Ala Ile Gly Asn Val Leu Gly Tyr Ala 180 185 190Thr Gly Ala
Tyr Ser Gly Trp Tyr Lys Val Phe Pro Phe Ser Leu Thr 195 200 205Ser
Ser Cys Thr Ile Asn Cys Ala Asn Leu Lys Ser Ala Phe Tyr Ile 210 215
220Asp Ile Ile Phe Ile Ile Ile Thr Thr Tyr Ile Ser Ile Ser Ala
Ala225 230 235 240Lys Glu Arg Pro Arg Ile Ser Ser Gln Asp Gly Pro
Gln Phe Ser Glu 245 250 255Asp Gly Thr Ala Gln Ser Gly His Ile Glu
Glu Ala Phe Leu Trp Glu 260 265 270Leu Phe Gly Thr Phe Arg Leu Leu
Pro Gly Ser Val Trp Val Ile Leu 275 280 285Leu Val Thr Cys Leu Asn
Trp Ile Gly Trp Phe Pro Phe Ile Leu Phe 290 295 300Asp Thr Asp Trp
Met Gly Arg Glu Ile Tyr Gly Gly Glu Pro Asn Gln305 310 315 320Gly
Gln Ser Tyr Ser Asp Gly Val Arg Met Gly Ala Phe Gly Leu Met 325 330
335Met Asn Ser Val Val Leu Gly Ile Thr Ser Val Leu Met Glu Lys Leu
340 345 350Cys Arg Ile Trp Gly Ser Gly Phe Met Trp Gly Leu Ser Asn
Ile Leu 355 360 365Met Thr Ile Cys Phe Phe Ala Met Leu Leu Ile Thr
Phe Ile Ala Lys 370 375 380Asn Met Asp Tyr Gly Thr Asn Pro Pro Pro
Asn Gly Ile Val Ile Ser385 390 395 400Ala Leu Ile Val Phe Ala Ile
Leu Gly Ile Pro Leu Ala Ile Thr Tyr 405 410 415Ser Val Pro Tyr Ala
Leu Val Ser Thr Arg Ile Glu Ser Leu Gly Leu 420 425 430Gly Gln Gly
Leu Ser Met Gly Val Leu Asn Leu Ala Ile Val Val Pro 435 440 445Gln
Val Ile Val Ser Leu Gly Ser Gly Pro Trp Asp Gln Leu Phe Gly 450 455
460Gly Gly Asn Ser Pro Ala Phe Val Val Ala Ala Leu Ser Ala Phe
Ala465 470 475 480Ala Gly Leu Ile Ala Leu Ile Ala Ile Arg Arg Pro
Arg Val Asp Lys 485 490 495Ser Arg Leu His His 50026537PRTOryza
sativa 26Met Ala Arg Gly Ser Gly Ala Gly Gly Gly Gly Gly Gly Gly
Gly Gly1 5 10 15Gly Leu Glu Leu Ser Val Gly Val Gly Gly Gly Gly Ala
Arg Gly Gly 20 25 30Gly Gly Gly Glu Ala Ala Ala Ala Val Glu Thr Ala
Ala Pro Ile Ser 35 40 45Leu Gly Arg Leu Ile Leu Ser Gly Met Val Ala
Gly Gly Val Gln Tyr 50 55 60Gly Trp Ala Leu Gln Leu Ser Leu Leu Thr
Pro Tyr Val Gln Thr Leu65 70 75 80Gly Leu Ser His Ala Leu Thr Ser
Phe Met Trp Leu Cys Gly Pro Ile 85 90 95Ala Gly Met Val Val Gln Pro
Cys Val Gly Leu Tyr Ser Asp Arg Cys 100 105 110Thr Ser Lys Trp Gly
Arg Arg Arg Pro Tyr Ile Leu Thr Gly Cys Val 115 120 125Leu Ile Cys
Leu Ala Val Val Val Ile Gly Phe Ser Ala Asp Ile Gly 130 135 140Tyr
Ala Met Gly Asp Thr Lys Glu Asp Cys Ser Val Tyr His Gly Ser145 150
155 160Arg Trp His Ala Ala Ile Val Tyr Val Leu Gly Phe Trp Leu Leu
Asp 165 170 175Phe Ser Asn Asn Thr Val Gln Gly Pro Ala Arg Ala Leu
Met Ala Asp 180 185 190Leu Ser Gly Arg His Gly Pro Gly Thr Ala Asn
Ser Ile Phe Cys Ser 195 200 205Trp Met Ala Met Gly Asn Ile Leu Gly
Tyr Ser Ser Gly Ser Thr Asn 210 215 220Asn Trp His Lys Trp Phe Pro
Phe Leu Lys Thr Arg Ala Cys Cys Glu225 230 235 240Ala Cys Ala Asn
Leu Lys Gly Ala Phe Leu Val Ala Val Ile Phe Leu 245 250 255Ser Leu
Cys Leu Val Ile Thr Leu Ile Phe Ala Lys Glu Val Pro Phe 260 265
270Lys Gly Asn Ala Ala Leu Pro Thr Lys Ser Asn Glu Pro Ala Glu Pro
275 280 285Glu Gly Thr Gly Pro Leu Ala Val Leu Lys Gly Phe Arg Asn
Leu Pro 290 295 300Thr Gly Met Pro Ser Val Leu Ile Val Thr Gly Leu
Thr Trp Leu Ser305 310 315 320Trp Phe Pro Phe Ile Leu Tyr Asp Thr
Asp Trp Met Gly Arg Glu Ile 325 330 335Tyr His Gly Asp Pro Lys Gly
Thr Asp Pro Gln Ile Glu Ala Phe Asn 340 345 350Gln Gly Val Arg Ala
Gly Ala Phe Gly Leu Leu Leu Asn Ser Ile Val 355 360 365Leu Gly Phe
Ser Ser Phe Leu Ile Glu Pro Met Cys Arg Lys Val Gly 370 375 380Pro
Arg Val Val Trp Val Thr Ser Asn Phe Leu Val Cys Ile Ala Met385 390
395 400Ala Ala Thr Ala Leu Ile Ser Phe Trp Ser Leu Lys Asp Phe His
Gly 405 410 415Thr Val Gln Lys Ala Ile Thr Ala Asp Lys Ser Ile Lys
Ala Val Cys 420 425 430Leu Val Leu Phe Ala Phe Leu Gly Val Pro Leu
Ala Val Leu Tyr Ser 435 440 445Val Pro Phe Ala Val Thr Ala Gln Leu
Ala Ala Thr Arg Gly Gly Gly 450 455 460Gln Gly Leu Cys Thr Gly Val
Leu Asn Ile Ser Ile Val Ile Pro Gln465 470 475 480Val Val Ile Ala
Leu Gly Ala Gly Pro Trp Asp Glu Leu Phe Gly Lys 485 490 495Gly Asn
Ile Pro Ala Phe Gly Leu Ala Ser Gly Phe Ala Leu Ile Gly 500 505
510Gly Val Ala Gly Ile Phe Leu Leu Pro Lys Ile Ser Lys Arg Gln Phe
515 520 525Trp Ser Val Ser Met Gly Gly Gly His 530
53527533PRTRicinus communis 27Met Gln Ser Ser Thr Ser Lys Glu Asn
Lys Gln Pro Pro Ser Ser Gln1 5 10 15Pro His Pro Pro Pro Leu Met Val
Ala Gly Ala Ala Glu Pro Asn Ser 20 25 30Ser Pro Leu Arg Lys Val Val
Met Val Ala Ser Ile Ala Ala Gly Ile 35 40 45Gln Phe Gly Trp Ala Leu
Gln Leu Ser Leu Leu Thr Pro Tyr Val Gln 50 55 60Leu Leu Gly Ile Pro
His Thr Trp Ala Ala Phe Ile Trp Leu Cys Gly65 70 75 80Pro Ile Ser
Gly Met Leu Val Gln Pro Ile Val Gly Tyr His Ser Asp 85 90 95Arg Cys
Thr Ser Arg Phe Gly Arg Arg Arg Pro Phe Ile Ala Ser Gly 100 105
110Ala Ala Phe Val Ala Ile Ala Val Phe Leu Ile Gly Tyr Ala Ala Asp
115 120 125Leu Gly His Leu Ser Gly Asp Ser Leu Asp Lys Ser Pro Lys
Thr Arg 130 135 140Ala Ile Ala Ile Phe Val Val Gly Phe Trp Ile Leu
Asp Val Ala Asn145 150 155 160Asn Met Leu Gln Gly Pro Cys Arg Ala
Leu Leu Ala Asp Leu Ser Gly 165 170 175Thr Ser Gln Lys Lys Thr Arg
Thr Ala Asn Ala Leu Phe Ser Phe Phe 180 185 190Met Ala Val Gly Asn
Val Leu Gly Tyr Ala Ala Gly Ala Tyr Thr His 195 200 205Leu Tyr Lys
Leu Phe Pro Phe Thr Lys Thr Thr Ala Cys Asp Val Tyr 210 215 220Cys
Ala Asn Leu Lys Ser Cys Phe Phe Ile Ser Ile Val Leu Leu Leu225 230
235 240Ser Leu Thr Val Leu Ala Leu Ser Tyr Val Lys Glu Lys Pro Trp
Ser 245 250 255Pro Asp Gln Ala Val Asp Asn Ala Glu Asp Asp Thr Ala
Ser Gln Ala 260 265 270Ser Ser Ser Ala Gln Pro Met Pro Phe Phe Gly
Glu Ile Leu Gly Ala 275 280 285Phe Lys Asn Leu Lys Arg Pro Met Trp
Ile Leu Leu Leu Val Thr Cys 290 295 300Leu Asn Trp Ile Ala Trp Phe
Pro Phe Leu Leu Phe Asp Thr Asp Trp305 310 315 320Met Gly Arg Glu
Val Tyr Gly Gly Asp Ser Ser Gly Ser Ala Glu Gln 325 330 335Leu Lys
Leu Tyr Asp Arg Gly Val Arg Ala Gly Ala Leu Gly Leu Met 340 345
350Leu Asn Ser Val Val Leu Gly Phe Thr Ser Leu Gly Val Glu Val Leu
355 360 365Ala Arg Gly Val Gly Gly Val Lys Arg Leu Trp Gly Ile Val
Asn Phe 370 375 380Val Leu Ala Val Cys Leu Ala Met Thr Val Leu Val
Thr Lys Gln Ala385 390 395 400Glu Ser Thr Arg Arg Phe Ala Thr Val
Ser Gly Gly Ala Lys Val Pro 405 410 415Leu Pro Pro Pro Ser Gly Val
Lys Ala Gly Ala Leu Ala Leu Phe Ala 420
425 430Val Met Gly Val Pro Gln Ala Ile Thr Tyr Ser Ile Pro Phe Ala
Leu 435 440 445Ala Ser Ile Phe Ser Asn Thr Ser Gly Ala Gly Gln Gly
Leu Ser Leu 450 455 460Gly Val Leu Asn Leu Ser Ile Val Ile Pro Gln
Met Ile Val Ser Val465 470 475 480Ala Ala Gly Pro Trp Asp Ala Leu
Phe Gly Gly Gly Asn Leu Pro Ala 485 490 495Phe Val Val Gly Ala Val
Ala Ala Leu Ala Ser Gly Ile Phe Ala Leu 500 505 510Thr Met Leu Pro
Ser Pro Gln Pro Asp Met Pro Ser Ala Lys Ala Leu 515 520 525Thr Ala
Ala Phe His 530
<210> 28 <211> 523 <212> PRT <213> Vicia faba <400> 28
Met Glu Pro Leu Ser Ser Thr Lys
Gln Ile Asn Asn Asn Asn Asn Leu1 5 10 15Ala Lys Pro Ser Ser Leu His
Val Glu Thr Gln Pro Leu Glu Pro Ser 20 25 30Pro Leu Arg Lys Ile Met
Val Val Ala Ser Ile Ala Ala Gly Val Gln 35 40 45Phe Gly Trp Ala Leu
Gln Leu Ser Leu Leu Thr Pro Tyr Val Gln Leu 50 55 60Leu Gly Ile His
His Thr Trp Ala Ala Tyr Ile Trp Leu Cys Gly Pro65 70 75 80Ile Ser
Gly Met Leu Val Gln Pro Ile Val Gly Tyr His Ser Asp Arg 85 90 95Cys
Thr Ser Arg Phe Gly Arg Arg Arg Pro Phe Ile Ala Ala Gly Ser 100 105
110Ile Ala Val Ala Ile Ala Val Phe Leu Ile Gly Tyr Ala Ala Asp Leu
115 120 125Gly His Ser Phe Gly Asp Ser Leu Asp Gln Lys Val Arg Pro
Arg Ala 130 135 140Ile Gly Ile Phe Val Val Gly Phe Trp Ile Leu Asp
Val Ala Asn Asn145 150 155 160Met Leu Gln Gly Pro Cys Arg Ala Leu
Leu Gly Asp Leu Cys Ala Gly 165 170 175Asn Gln Arg Lys Thr Arg Asn
Ala Asn Ala Phe Phe Ser Phe Phe Met 180 185 190Ala Val Gly Asn Val
Leu Gly Tyr Ala Ala Gly Ala Tyr Ser Lys Leu 195 200 205Tyr His Val
Phe Pro Phe Thr Lys Thr Lys Ala Cys Asn Val Tyr Cys 210 215 220Ala
Asn Leu Lys Ser Cys Phe Phe Leu Ser Ile Ala Leu Leu Thr Val225 230
235 240Leu Ala Thr Ser Ala Leu Ile Tyr Val Lys Glu Thr Ala Leu Thr
Pro 245 250 255Glu Lys Thr Val Val Thr Thr Glu Asp Gly Gly Ser Ser
Gly Gly Met 260 265 270Pro Cys Phe Gly Gln Leu Ser Gly Ala Phe Lys
Glu Leu Lys Arg Pro 275 280 285Met Trp Ile Leu Leu Leu Val Thr Cys
Leu Asn Trp Ile Ala Trp Phe 290 295 300Pro Phe Leu Leu Phe Asp Thr
Asp Trp Met Gly Lys Glu Val Tyr Gly305 310 315 320Gly Thr Val Gly
Glu Gly His Ala Tyr Asp Met Gly Val Arg Glu Gly 325 330 335Ala Leu
Gly Leu Met Leu Asn Ser Val Val Leu Gly Ala Thr Ser Leu 340 345
350Gly Val Asp Ile Leu Ala Arg Gly Val Gly Gly Val Lys Arg Leu Trp
355 360 365Gly Ile Val Asn Phe Leu Leu Ala Ile Cys Leu Gly Leu Thr
Val Leu 370 375 380Val Thr Lys Leu Ala Gln His Ser Arg Gln Tyr Ala
Pro Gly Thr Gly385 390 395 400Ala Leu Gly Asp Pro Leu Pro Pro Ser
Glu Gly Ile Lys Ala Gly Ala 405 410 415Leu Thr Leu Phe Ser Val Leu
Gly Val Pro Leu Ala Ile Thr Tyr Ser 420 425 430Ile Pro Phe Ala Leu
Ala Ser Ile Phe Ser Ser Thr Ser Gly Ala Gly 435 440 445Gln Gly Leu
Ser Leu Gly Val Leu Asn Leu Ala Ile Val Ile Pro Gln 450 455 460Met
Phe Val Ser Val Leu Ser Gly Pro Trp Asp Ala Leu Phe Gly Gly465 470
475 480Gly Asn Leu Pro Ala Phe Val Val Gly Ala Val Ala Ala Leu Ala
Ser 485 490 495Gly Ile Leu Ser Ile Ile Leu Leu Pro Ser Pro Pro Pro
Asp Met Ala 500 505 510Lys Ser Val Ser Ala Thr Gly Gly Gly Phe His
515 520
* * * * *
References