Sucrose Transport Proteins Allen; Stephen M. ; et al. [E.I. DU PONT DE NEMOURS AND COMPANY]

Patent Application Summary

U.S. patent application number 12/568846 was filed with the patent office on 2009-09-29 and published on 2010-02-04 for sucrose transport proteins. This patent application is currently assigned to E.I. DU PONT DE NEMOURS AND COMPANY. Invention is credited to Stephen M. Allen, William D. Hitz, J. Antoni Rafalski.

Application Number	12/568846
Publication Number	20100029489
Family ID	22162389
Filed Date	2009-09-29
Publication Date	2010-02-04

United States Patent Application	20100029489
Kind Code	A1
Allen; Stephen M. ; et al.	February 4, 2010

SUCROSE TRANSPORT PROTEINS

Abstract

This invention relates to an isolated nucleic acid fragment encoding a sucrose transport protein. The invention also relates to the construction of a chimeric gene encoding all or a portion of the sucrose transport protein, in sense or antisense orientation, wherein expression of the chimeric gene results in production of altered levels of the sucrose transport protein in a transformed host cell.

Inventors:	Allen; Stephen M.; (Wilmington, DE) ; Hitz; William D.; (Wilmington, DE) ; Rafalski; J. Antoni; (Wilmington, DE)
Correspondence Address:	E I DU PONT DE NEMOURS AND COMPANY;LEGAL PATENT RECORDS CENTER BARLEY MILL PLAZA 25/1122B, 4417 LANCASTER PIKE WILMINGTON DE 19805 US
Assignee:	E.I. DU PONT DE NEMOURS AND COMPANY Wilmington DE
Family ID:	22162389
Appl. No.:	12/568846
Filed:	September 29, 2009

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
11901765	Sep 18, 2007	7605247
12568846
09679687	Oct 5, 2000	7288645
11901765
PCT/US99/07562	Apr 7, 1999
09679687
60081148	Apr 9, 1998

Current U.S. Class:	506/2 ; 435/419; 435/468; 530/350; 536/23.6
Current CPC Class:	C07K 14/415 20130101; C12N 15/8245 20130101
Class at Publication:	506/2 ; 536/23.6; 435/419; 530/350; 435/468
International Class:	C07H 21/04 20060101 C07H021/04; C12N 5/10 20060101 C12N005/10; C07K 14/415 20060101 C07K014/415; C12N 15/82 20060101 C12N015/82; C40B 20/00 20060101 C40B020/00

Claims

1. An isolated nucleic acid fragment encoding all or a substantial portion of a sucrose transport protein comprising a member selected from the group consisting of: (a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24; (b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24; and (c) an isolated nucleic acid fragment that is complementary to (a) or (b).

2. The isolated nucleic acid fragment of claim 1 wherein the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in a member selected from the group consisting of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21 and 23.

3. A chimeric gene comprising the nucleic acid fragment of claim 1 operably linked to suitable regulatory sequences.

4. A transformed host cell comprising the chimeric gene of claim 3.

5. A sucrose transport protein polypeptide comprising all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24.

6. A method of altering the level of expression of a sucrose transport protein in a host cell comprising: (a) transforming a host cell with the chimeric gene of claim 3; and (b) growing the transformed host cell produced in step (a) under conditions that are suitable for expression of the chimeric gene wherein expression of the chimeric gene results in production of altered levels of a sucrose transport protein in the transformed host cell.

7. A method of obtaining a nucleic acid fragment encoding all or a substantial portion of the amino acid sequence encoding a sucrose transport protein comprising: (a) probing a cDNA or genomic library with the nucleic acid fragment of claim 1; (b) identifying a DNA clone that hybridizes with the nucleic acid fragment of claim 1; (c) isolating the DNA clone identified in step (b); and (d) sequencing the cDNA or genomic fragment that comprises the clone isolated in step (c) wherein the sequenced nucleic acid fragment encodes all or a substantial portion of the amino acid sequence encoding a sucrose transport protein.

8. A method of obtaining a nucleic acid fragment encoding a substantial portion of an amino acid sequence encoding a sucrose transport protein comprising: (a) synthesizing an oligonucleotide primer corresponding to a portion of the sequence set forth in any of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21 and 23; and (b) amplifying a cDNA insert present in a cloning vector using the oligonucleotide primer of step (a) and a primer representing sequences of the cloning vector wherein the amplified nucleic acid fragment encodes a substantial portion of an amino acid sequence encoding a sucrose transport protein.

9. The product of the method of claim 7.

10. The product of the method of claim 8.

Description

[0001] This application claims the benefit of U.S. Provisional Application No. 60/081,148, filed Apr. 9, 1998.

FIELD OF THE INVENTION

[0002] This invention is in the field of plant molecular biology. More specifically, this invention pertains to nucleic acid fragments encoding sucrose transport proteins in plants and seeds.

BACKGROUND OF THE INVENTION

[0003] Sucrose is the first form of carbohydrate to leave photosynthesizing cells in most higher plants and is the main form of transported carbon in most annual field crop plants such as corn, soybeans and wheat. As such, its movement and concentration across various plant membranes are critical to plant growth and development. In addition, sucrose is the main form of carbon that moves into developing seeds of soybeans, corn and wheat. This movement and concentration is accomplished by the action of sucrose carrier proteins that act to move sucrose against a concentration gradient by coupling sucrose movement to the opposite vectorial movement of a proton. Specific sucrose carrier sequences from these crop plants should find use in controlling the timing and extent of phenomena such as grain fill duration that are important factors in crop yield and quality. Accordingly, the availability of nucleic acid sequences encoding all or a portion of these enzymes would facilitate studies to better understand carbohydrate metabolism and function in plants, provide genetic tools for the manipulation of these biosynthetic pathways, and provide a means to control carbohydrate transport and distribution in plant cells.

SUMMARY OF THE INVENTION

[0004] The instant invention relates to isolated nucleic acid fragments encoding proteins involved in sucrose transport. Specifically, this invention concerns an isolated nucleic acid fragment encoding a sucrose transport protein. In addition, this invention relates to a nucleic acid fragment that is complementary to the nucleic acid fragment encoding the sucrose transport protein. An additional embodiment of the instant invention pertains to a polypeptide encoding all or a substantial portion of a sucrose transport protein.

[0005] In another embodiment, the instant invention relates to a chimeric gene encoding a sucrose transport protein, or to a chimeric gene that comprises a nucleic acid fragment that is complementary to a nucleic acid fragment encoding a sucrose transport protein, operably linked to suitable regulatory sequences, wherein expression of the chimeric gene results in production of levels of the encoded protein in a transformed host cell that are altered (i.e., increased or decreased) from the level produced in an untransformed host cell.

[0006] In a further embodiment, the instant invention concerns a transformed host cell comprising in its genome a chimeric gene encoding a sucrose transport protein, operably linked to suitable regulatory sequences. Expression of the chimeric gene results in production of altered levels of the encoded protein in the transformed host cell. The transformed host cell can be of eukaryotic or prokaryotic origin, and include cells derived from higher plants and microorganisms. The invention also includes transformed plants that arise from transformed host cells of higher plants, and seeds derived from such transformed plants.

[0007] An additional embodiment of the instant invention concerns a method of altering the level of expression of a sucrose transport protein in a transformed host cell comprising: a) transforming a host cell with a chimeric gene comprising a nucleic acid fragment encoding a sucrose transport protein; and b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene wherein expression of the chimeric gene results in production of altered levels of sucrose transport protein in the transformed host cell.

[0008] An additional embodiment of the instant invention concerns a method for obtaining a nucleic acid fragment encoding all or a substantial portion of an amino acid sequence encoding a sucrose transport protein.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE DESCRIPTIONS

[0009] The invention can be more fully understood from the following detailed description and the accompanying drawings and Sequence Listing which form a part of this application.

[0010] FIG. 1 shows a comparison of the amino acid sequences set forth in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24 and the Daucus carota (SEQ ID NO:25), Oryza sativa (SEQ ID NO:26), Ricinus communis (SEQ ID NO:27) and Vicia faba (SEQ ID NO:28) sucrose transport protein amino acid sequences.

[0011] The following sequence descriptions and sequence listings attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825.

[0012] SEQ ID NO:1 is the nucleotide sequence comprising the entire cDNA insert in clone cepe7.pk0015.d10 encoding an entire corn sucrose transport protein.

[0013] SEQ ID NO:2 is the deduced amino acid sequence of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:1.

[0014] SEQ ID NO:3 is the nucleotide sequence comprising a portion of the cDNA insert in clone cr1n.pk0075.f5 encoding a portion of a corn sucrose transport protein.

[0015] SEQ ID NO:4 is the deduced amino acid sequence of a portion of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:3.

[0016] SEQ ID NO:5 is the nucleotide sequence comprising a portion of the cDNA insert in clone cr1n.pk0095.c10 encoding a portion of a corn sucrose transport protein.

[0017] SEQ ID NO:6 is the deduced amino acid sequence of a portion of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:5.

[0018] SEQ ID NO:7 is the nucleotide sequence comprising the entire cDNA insert in clone rlr2.pk0043.b1 encoding a portion of a rice sucrose transport protein.

[0019] SEQ ID NO:8 is the deduced amino acid sequence of a portion of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:7.

[0020] SEQ ID NO:9 is the nucleotide sequence comprising the entire cDNA insert in clone rls6.pk0076.e2 encoding an entire rice sucrose transport protein.

[0021] SEQ ID NO:10 is the deduced amino acid sequence of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:9.

[0022] SEQ ID NO:11 is the nucleotide sequence comprising the entire cDNA insert in clone sfl1.pk0001.g1 encoding an entire soybean sucrose transport protein.

[0023] SEQ ID NO:12 is the deduced amino acid sequence of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:11.

[0024] SEQ ID NO:13 is the nucleotide sequence comprising a contig assembled from the cDNA inserts in clones sfl1.pk0043.c7 and sdp3c.pk012.c13 encoding a portion of a soybean sucrose transport protein.

[0025] SEQ ID NO:14 is the deduced amino acid sequence of a portion of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:13.

[0026] SEQ ID NO:15 is the nucleotide sequence comprising a portion of the cDNA insert in clone vs1n.pk0002.h3 encoding a portion of a Vernonia sucrose transport protein.

[0027] SEQ ID NO:16 is the deduced amino acid sequence of a portion of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:15.

[0028] SEQ ID NO:17 is the nucleotide sequence comprising the entire cDNA insert in clone wle1n.pk0007.h8 encoding a portion of a wheat sucrose transport protein.

[0029] SEQ ID NO:18 is the deduced amino acid sequence of a portion of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:17.

[0030] SEQ ID NO:19 is the nucleotide sequence comprising the entire cDNA insert in clone wle1n.pk0103.c11 encoding an entire wheat sucrose transport protein.

[0031] SEQ ID NO:20 is the deduced amino acid sequence of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:19.

[0032] SEQ ID NO:21 is the nucleotide sequence comprising the entire cDNA insert in clone wlm24.pk0015.g11 encoding an entire wheat sucrose transport protein.

[0033] SEQ ID NO:22 is the deduced amino acid sequence of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:21.

[0034] SEQ ID NO:23 is the nucleotide sequence comprising the entire cDNA insert in clone wlmk1.pk0002.e11 encoding an entire wheat sucrose transport protein.

[0035] SEQ ID NO:24 is the deduced amino acid sequence of a sucrose transport protein derived from the nucleotide sequence of SEQ ID NO:23.

[0036] SEQ ID NO:25 is the amino acid sequence of a Daucus carota sucrose transport protein (NCBI Identifier No. gi 2969887).

[0037] SEQ ID NO:26 is the amino acid sequence of an Oryza sativa sucrose transport protein (NCBI Identifier No. gi 2723471).

[0038] SEQ ID NO:27 is the amino acid sequence of a Ricinus communis sucrose transport protein (NCBI Identifier No. gi 542020).

[0039] SEQ ID NO:28 is the amino acid sequence of a Vicia faba sucrose transport protein (NCBI Identifier No. gi 1935019).
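By way of illustration only, the sequence descriptions in paragraphs [0012] through [0035] can be summarized as a lookup table. The sketch below is not part of the Sequence Listing; the names `CloneEntry`, `NUCLEOTIDE_SEQS` and `protein_seq_id` are hypothetical, and only the clone designations, plant sources, and "entire" versus "portion" status are taken from the descriptions above. It relies on the pattern that each odd-numbered SEQ ID NO is a nucleotide sequence and the next even number is its deduced amino acid sequence.

```python
from typing import NamedTuple, Tuple


class CloneEntry(NamedTuple):
    clones: Tuple[str, ...]  # cDNA clone designation(s)
    plant: str               # plant source named in the description
    full_length: bool        # True where the text says "an entire ... protein"


# Keyed by nucleotide SEQ ID NO (odd numbers); values copied from [0012]-[0035].
NUCLEOTIDE_SEQS = {
    1: CloneEntry(("cepe7.pk0015.d10",), "corn", True),
    3: CloneEntry(("cr1n.pk0075.f5",), "corn", False),
    5: CloneEntry(("cr1n.pk0095.c10",), "corn", False),
    7: CloneEntry(("rlr2.pk0043.b1",), "rice", False),
    9: CloneEntry(("rls6.pk0076.e2",), "rice", True),
    11: CloneEntry(("sfl1.pk0001.g1",), "soybean", True),
    13: CloneEntry(("sfl1.pk0043.c7", "sdp3c.pk012.c13"), "soybean", False),
    15: CloneEntry(("vs1n.pk0002.h3",), "Vernonia", False),
    17: CloneEntry(("wle1n.pk0007.h8",), "wheat", False),
    19: CloneEntry(("wle1n.pk0103.c11",), "wheat", True),
    21: CloneEntry(("wlm24.pk0015.g11",), "wheat", True),
    23: CloneEntry(("wlmk1.pk0002.e11",), "wheat", True),
}


def protein_seq_id(nucleotide_seq_id: int) -> int:
    """Return the SEQ ID NO of the amino acid sequence deduced from a listed cDNA."""
    if nucleotide_seq_id not in NUCLEOTIDE_SEQS:
        raise KeyError(f"SEQ ID NO:{nucleotide_seq_id} is not a listed nucleotide sequence")
    return nucleotide_seq_id + 1
```

For example, `protein_seq_id(13)` gives 14, the deduced amino acid sequence of the soybean contig.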

[0040] The Sequence Listing contains the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB standards described in Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical Journal 219 (No. 2):345-373 (1984) which are herein incorporated by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

DETAILED DESCRIPTION OF THE INVENTION

[0041] In the context of this disclosure, a number of terms shall be utilized. As used herein, an "isolated nucleic acid fragment" is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA. As used herein, "contig" refers to an assemblage of overlapping nucleic acid sequences to form one contiguous nucleotide sequence. For example, several DNA sequences can be compared and aligned to identify common or overlapping regions. The individual sequences can then be assembled into a single contiguous nucleotide sequence. As used herein, "substantially similar" refers to nucleic acid fragments wherein changes in one or more nucleotide bases result in substitution of one or more amino acids, but do not affect the functional properties of the protein encoded by the DNA sequence.
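The notion of a contig defined above can be illustrated with a minimal sketch that joins two sequences sharing an exact end overlap. This is an assumption-laden toy, not the assembly procedure used for SEQ ID NO:13: the function name `merge_overlapping`, the `min_overlap` parameter, and the example sequences are all hypothetical, and real assemblers tolerate mismatches and handle both strands.

```python
def merge_overlapping(left: str, right: str, min_overlap: int = 3) -> str:
    """Join two sequences where a suffix of `left` equals a prefix of `right`.

    The longest exact overlap is used; a ValueError is raised if none of at
    least `min_overlap` bases exists.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            # Keep the shared region once and append the remainder of `right`.
            return left + right[size:]
    raise ValueError("no overlap of the required length")


# Hypothetical sequences sharing the five-base overlap "TCAGG".
contig = merge_overlapping("ATGGCTTCAGG", "TCAGGTTAACC")
print(contig)  # ATGGCTTCAGGTTAACC
```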

[0042] "Substantially similar" also refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration of gene expression by antisense or co-suppression technology. "Substantially similar" also refers to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotides that do not substantially affect the functional properties of the resulting transcript vis-a-vis the ability to mediate alteration of gene expression by antisense or co-suppression technology or alteration of the functional properties of the resulting protein molecule. It is therefore understood that the invention encompasses more than the specific exemplary sequences.

[0043] For example, it is well known in the art that antisense suppression and co-suppression of gene expression may be accomplished using nucleic acid fragments representing less than the entire coding region of a gene, and by nucleic acid fragments that do not share 100% sequence identity with the gene to be suppressed. Moreover, alterations in a gene which result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded protein, are well known in the art. Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a functionally equivalent product. Nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products.
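The chemically equivalent substitutions named in the preceding paragraph can be expressed as a small classification sketch. The grouping below is an assumption made for illustration: the text names alanine to glycine, valine, leucine or isoleucine, aspartic acid for glutamic acid, and lysine for arginine, and the sketch treats each set as mutually interchangeable. The names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and retention of biological activity would still have to be determined experimentally.

```python
# One-letter amino acid codes grouped by the examples given in the text.
CONSERVATIVE_GROUPS = (
    frozenset("AGVLI"),  # alanine, glycine, valine, leucine, isoleucine
    frozenset("DE"),     # negatively charged: aspartic acid, glutamic acid
    frozenset("KR"),     # positively charged: lysine, arginine
)


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same illustrative group."""
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


print(is_conservative("D", "E"))  # True
print(is_conservative("K", "D"))  # False
```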

[0044] Moreover, substantially similar nucleic acid fragments may also be characterized by their ability to hybridize, under stringent conditions (0.1×SSC, 0.1% SDS, 65°C), with the nucleic acid fragments disclosed herein.

[0045] Substantially similar nucleic acid fragments of the instant invention may also be characterized by the percent similarity of the amino acid sequences that they encode to the amino acid sequences disclosed herein, as determined by algorithms commonly employed by those skilled in this art. Preferred are those nucleic acid fragments whose nucleotide sequences encode amino acid sequences that are 90% similar to the amino acid sequences reported herein. Most preferred are nucleic acid fragments that encode amino acid sequences that are 95% similar to the amino acid sequences reported herein. Sequence alignments and percent similarity calculations were performed using the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins, D. G. and Sharp, P. M. (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10), (hereafter Clustal algorithm). Default parameters for pairwise alignments using the Clustal method were KTUPLE 1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5.
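As a rough illustration of how a percentage might be derived from an existing pairwise alignment, the following sketch counts identical residues over aligned, non-gap columns. It is a simplified stand-in and does not reproduce the Megalign or Clustal calculation; the function names, the example peptides, and the reading of "90% similar" and "95% similar" as minimum thresholds are assumptions made here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


def preference_tier(percent: float) -> str:
    """Map a percentage onto the tiers named in the text (thresholds assumed inclusive)."""
    if percent >= 95.0:
        return "most preferred"
    if percent >= 90.0:
        return "preferred"
    return "below stated preferences"


# Hypothetical ten-residue peptides differing at one position.
score = percent_identity("MAVLSGKTRE", "MAVLSGRTRE")
print(score, preference_tier(score))  # 90.0 preferred
```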

[0046] A "substantial portion" of an amino acid or nucleotide sequence comprises enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to afford putative identification of that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., (1990) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a "substantial portion" of a nucleotide sequence comprises enough of the sequence to afford specific identification and/or isolation of a nucleic acid fragment comprising the sequence. The instant specification teaches partial or complete amino acid and nucleotide sequences encoding one or more particular plant proteins. The skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in this art. Accordingly, the instant invention comprises the complete sequences as reported in the accompanying Sequence Listing, as well as substantial portions of those sequences as defined above.
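The length guidelines in the preceding paragraph can be collected into a short sketch. The numeric values are those stated in the text; the constant and function names are hypothetical, and treating the 20-30 and 12-15 ranges as strict bounds is a simplification made here, since the text presents them as typical rather than limiting.

```python
# Lengths stated in paragraph [0046]; names are illustrative only.
MIN_AMINO_ACIDS = 10          # ten or more contiguous amino acids
MIN_NUCLEOTIDES = 30          # thirty or more nucleotides
PROBE_RANGE = range(20, 31)   # gene-specific probes of 20-30 nucleotides
PRIMER_RANGE = range(12, 16)  # amplification primers of 12-15 bases


def supports_putative_identification(length: int, is_protein: bool) -> bool:
    """True if a sequence of this length meets the general identification guideline."""
    minimum = MIN_AMINO_ACIDS if is_protein else MIN_NUCLEOTIDES
    return length >= minimum


def oligonucleotide_uses(length: int) -> list:
    """List the uses whose stated length range includes `length`."""
    uses = []
    if length in PRIMER_RANGE:
        uses.append("PCR primer")
    if length in PROBE_RANGE:
        uses.append("hybridization probe")
    return uses


print(supports_putative_identification(12, is_protein=True))  # True
print(oligonucleotide_uses(25))                               # ['hybridization probe']
```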

[0047] "Codon degeneracy" refers to divergence in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment that encodes all or a substantial portion of the amino acid sequence encoding the sucrose transport proteins as set forth in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24. The skilled artisan is well aware of the "codon-bias" exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
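Codon degeneracy as defined above can be demonstrated with a minimal sketch that translates two different nucleotide sequences into the same peptide using the standard genetic code. The example sequences are hypothetical and are not taken from the Sequence Listing; the names `CODON_TABLE` and `translate` are likewise illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence read in frame from its first base."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical sequences that differ at every third base.
print(translate("GCTGAAAAA"))  # AEK
print(translate("GCCGAGAAG"))  # AEK
```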

[0048] "Synthetic genes" can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments which are then enzymatically assembled to construct the entire gene. "Chemically synthesized", as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the genes can be tailored for optimal gene expression based on optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
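The survey of host genes mentioned at the end of the preceding paragraph might be sketched as follows: codons are counted across a set of host coding sequences, the most frequent codon is selected for each amino acid, and a peptide is then back-translated using those codons. This is a minimal illustration under stated assumptions; the function names and the two toy "host genes" are hypothetical, and practical codon optimization weighs many factors beyond the single most frequent codon.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def preferred_codons(coding_sequences):
    """Return the most frequently used codon for each amino acid in the survey."""
    counts = Counter()
    for cds in coding_sequences:
        usable = len(cds) - len(cds) % 3
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    best = {}
    for codon, n in counts.items():
        aa = CODON_TABLE.get(codon)
        if aa is None or aa == "*":
            continue  # skip unrecognized codons and stop codons
        if aa not in best or n > counts[best[aa]]:
            best[aa] = codon
    return best


def back_translate(protein: str, preferred: dict) -> str:
    """Encode a peptide using the surveyed preferred codon for each residue."""
    return "".join(preferred[aa] for aa in protein)


# Hypothetical host coding sequences, for illustration only.
host_genes = ["ATGGCCGCCAAGGCTTAA", "ATGGCCAAGAAATGA"]
table = preferred_codons(host_genes)
print(back_translate("MAK", table))  # ATGGCCAAG
```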

[0049] "Gene" refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences. "Chimeric gene" refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. "Endogenous gene" refers to a native gene in its natural location in the genome of an organism. A "foreign" gene refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A "transgene" is a gene that has been introduced into the genome by a transformation procedure.

[0050] "Coding sequence" refers to a DNA sequence that codes for a specific amino acid sequence. "Regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, and polyadenylation recognition sequences.

[0051] "Promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3' to a promoter sequence. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg, (1989) Biochemistry of Plants 15: 1-82. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

[0052] The "translation leader sequence" refers to a DNA sequence located between the promoter sequence of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner, R. and Foster, G. D. (1995) Molecular Biotechnology 3:225).

[0053] The "3' non-coding sequences" refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is exemplified by Ingelbrecht et al., (1989) Plant Cell 1:671-680.

[0054] "RNA transcript" refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript; an RNA sequence derived from posttranscriptional processing of the primary transcript is referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the RNA that is without introns and that can be translated into protein by the cell. "cDNA" refers to a double-stranded DNA that is complementary to and derived from mRNA. "Sense" RNA refers to RNA transcript that includes the mRNA and so can be translated into protein by the cell. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat. No. 5,107,065, incorporated herein by reference). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to sense RNA, antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has an effect on cellular processes.
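The complementarity relationship between an antisense RNA and its target can be illustrated with a minimal sketch. The sequences and function names below are hypothetical; the sketch checks only exact Watson-Crick complementarity to a contiguous stretch of the target, whereas, as noted elsewhere in this disclosure, fragments sharing less than 100% identity may also be effective.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target_rna: str) -> str:
    """Return the strand complementary to `target_rna`, written 5' to 3'."""
    return target_rna.upper().translate(_RNA_COMPLEMENT)[::-1]


def is_antisense_to(candidate: str, target_rna: str) -> bool:
    """True if `candidate` is exactly complementary to all or part of `target_rna`."""
    if not candidate:
        return False
    return antisense_rna(candidate) in target_rna.upper()


# Hypothetical nine-base target and a six-base antisense fragment.
print(antisense_rna("AUGGCC"))                 # GGCCAU
print(is_antisense_to("GGCCAU", "AUGGCCAAG"))  # True
```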

[0055] The term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

[0056] The term "expression", as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide. "Antisense inhibition" refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein. "Overexpression" refers to the production of a gene product in transgenic organisms that exceeds levels of production in normal or non-transformed organisms. "Co-suppression" refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020, incorporated herein by reference).

[0057] "Altered levels" refers to the production of gene product(s) in transgenic organisms in amounts or proportions that differ from that of normal or non-transformed organisms.

[0058] "Mature" protein refers to a post-translationally processed polypeptide; i.e., one from which any pre- or propeptides present in the primary translation product have been removed. "Precursor" protein refers to the primary product of translation of mRNA; i.e., with pre- and propeptides still present. Pre- and propeptides may be but are not limited to intracellular localization signals.

[0059] A "chloroplast transit peptide" is an amino acid sequence which is translated in conjunction with a protein and directs the protein to the chloroplast or other plastid types present in the cell in which the protein is made. "Chloroplast transit sequence" refers to a nucleotide sequence that encodes a chloroplast transit peptide. A "signal peptide" is an amino acid sequence which is translated in conjunction with a protein and directs the protein to the secretory system (Chrispeels, J. J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53). If the protein is to be directed to a vacuole, a vacuolar targeting signal (supra) can further be added, or if to the endoplasmic reticulum, an endoplasmic reticulum retention signal (supra) may be added. If the protein is to be directed to the nucleus, any signal peptide present should be removed and instead a nuclear localization signal included (Raikhel (1992) Plant Phys. 100:1627-1632).

[0060] "Transformation" refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" organisms. Examples of methods of plant transformation include Agrobacterium-mediated transformation (De Blaere et al. (1987) Meth. Enzymol. 143:277) and particle-accelerated or "gene gun" transformation technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Pat. No. 4,945,050, incorporated herein by reference).

[0061] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter "Maniatis").

[0062] Nucleic acid fragments encoding at least a portion of several sucrose transport proteins have been isolated and identified by comparison of random plant cDNA sequences to public databases containing nucleotide and protein sequences using the BLAST algorithms well known to those skilled in the art. Table 1 lists the proteins that are described herein, and the designation of the cDNA clones that comprise the nucleic acid fragments encoding these proteins.

TABLE 1 Sucrose Transport Proteins

Enzyme	Clone	Plant
Sucrose Transporter	cepe7.pk0015.d10	Corn
	cr1n.pk0095.c10	Corn
	cr1n.pk0075.f5	Corn
	rlr2.pk0043.b1	Rice
	rls6.pk0076.e2	Rice
	sfl1.pk0001.g1	Soybean
	sfl1.pk0043.c7	Soybean
	sdp3c.pk012.c13	Soybean
	vs1n.pk0002.h3	Vernonia
	wle1n.pk0007.h8	Wheat
	wle1n.pk0103.c11	Wheat
	wlm24.pk0015.g11	Wheat
	wlmk1.pk0002.e11	Wheat

[0063] The nucleic acid fragments of the instant invention may be used to isolate cDNAs and genes encoding homologous proteins from the same or other plant species. Isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization, and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction, ligase chain reaction).

[0064] For example, genes encoding other sucrose transport proteins, either as cDNAs or genomic DNAs, could be isolated directly by using all or a portion of the instant nucleic acid fragments as DNA hybridization probes to screen libraries from any desired plant employing methodology well known to those skilled in the art. Specific oligonucleotide probes based upon the instant nucleic acid sequences can be designed and synthesized by methods known in the art (Maniatis). Moreover, the entire sequences can be used directly to synthesize DNA probes by methods known to the skilled artisan such as random primer DNA labeling, nick translation, or end-labeling techniques, or RNA probes using available in vitro transcription systems. In addition, specific primers can be designed and used to amplify a part or all of the instant sequences. The resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full length cDNA or genomic fragments under conditions of appropriate stringency.

[0065] In addition, two short segments of the instant nucleic acid fragments may be used in polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. The polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts to the 3' end of the mRNA precursor encoding plant genes. Alternatively, the second primer sequence may be based upon sequences derived from the cloning vector. For example, the skilled artisan can follow the RACE protocol (Frohman et al., (1988) PNAS USA 85:8998) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3' or 5' end. Primers oriented in the 3' and 5' directions can be designed from the instant sequences. Using commercially available 3' RACE or 5' RACE systems (BRL), specific 3' or 5' cDNA fragments can be isolated (Ohara et al., (1989) PNAS USA 86:5673; Loh et al., (1989) Science 243:217). Products generated by the 3' and 5' RACE procedures can be combined to generate full-length cDNAs (Frohman, M. A. and Martin, G. R., (1989) Techniques 1: 165).

[0066] Availability of the instant nucleotide and deduced amino acid sequences facilitates immunological screening of cDNA expression libraries. Synthetic peptides representing portions of the instant amino acid sequences may be synthesized. These peptides can be used to immunize animals to produce polyclonal or monoclonal antibodies with specificity for peptides or proteins comprising the amino acid sequences. These antibodies can then be used to screen cDNA expression libraries to isolate full-length cDNA clones of interest (Lerner, R. A. (1984) Adv. Immunol. 36:1; Maniatis).

[0067] The nucleic acid fragments of the instant invention may be used to create transgenic plants in which the disclosed sucrose transport proteins are present at higher or lower levels than normal or in cell types or developmental stages in which they are not normally found. This would have the effect of altering the level of sucrose metabolism in those cells.

[0068] Overexpression of the sucrose transport proteins of the instant invention may be accomplished by first constructing a chimeric gene in which the coding region is operably linked to a promoter capable of directing expression of a gene in the desired tissues at the desired stage of development. For reasons of convenience, the chimeric gene may comprise promoter sequences and translation leader sequences derived from the same genes. 3' Non-coding sequences encoding transcription termination signals may also be provided. The instant chimeric gene may also comprise one or more introns in order to facilitate gene expression.

[0069] Plasmid vectors comprising the instant chimeric gene can then be constructed. The choice of plasmid vector is dependent upon the method that will be used to transform host plants. The skilled artisan is well aware of the genetic elements that must be present on the plasmid vector in order to successfully transform, select and propagate host cells containing the chimeric gene. The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression (Jones et al., (1985) EMBO J. 4:2411-2418; De Almeida et al., (1989) Mol. Gen. Genetics 218:78-86), and thus that multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by Southern analysis of DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.

[0070] For some applications it may be useful to direct the instant sucrose transport proteins to different cellular compartments, or to facilitate their secretion from the cell. It is thus envisioned that the chimeric gene described above may be further supplemented by altering the coding sequence to encode a sucrose transport protein with appropriate intracellular targeting sequences such as transit sequences (Keegstra, K. (1989) Cell 56:247-253), signal sequences or sequences encoding endoplasmic reticulum localization (Chrispeels, J. J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53), or nuclear localization signals (Raikhel, N. (1992) Plant Phys. 100:1627-1632) added and/or with targeting sequences that are already present removed. While the references cited give examples of each of these, the list is not exhaustive and more targeting signals of utility may be discovered in the future.

[0071] It may also be desirable to reduce or eliminate expression of genes encoding sucrose transport proteins in plants for some applications. In order to accomplish this, a chimeric gene designed for co-suppression of the instant sucrose transport proteins can be constructed by linking a gene or gene fragment encoding a sucrose transport protein to plant promoter sequences. Alternatively, a chimeric gene designed to express antisense RNA for all or part of the instant nucleic acid fragment can be constructed by linking the gene or gene fragment in reverse orientation to plant promoter sequences. Either the co-suppression or antisense chimeric genes could be introduced into plants via transformation, wherein expression of the corresponding endogenous genes is reduced or eliminated.
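As an illustration of what "reverse orientation" means at the sequence level, the following minimal Python sketch computes the reverse complement of a DNA fragment. When a fragment is inserted in reverse orientation relative to a promoter, the strand read in the sense direction of the construct is the reverse complement of the original coding strand. The function name and the six-base example sequence are hypothetical and are not taken from any SEQ ID NO of this application.

```python
# Translation table mapping each DNA base to its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    This is the sequence read 5' to 3' on the opposite strand, i.e. what
    lies downstream of a promoter when a fragment is cloned in reverse
    (antisense) orientation.
    """
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base fragment, for illustration only.
sense_fragment = "ATGGCC"
antisense_fragment = reverse_complement(sense_fragment)
print(antisense_fragment)  # GGCCAT
```

Applying the function twice returns the original fragment, which is a convenient sanity check when preparing antisense constructs in silico.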

[0072] The instant sucrose transport proteins (or portions thereof) may be produced in heterologous host cells, particularly in the cells of microbial hosts, and can be used to prepare antibodies to these proteins by methods well known to those skilled in the art. The antibodies are useful for detecting sucrose transport proteins in situ in cells or in vitro in cell extracts. Preferred heterologous host cells for production of the instant sucrose transport proteins are microbial hosts. Microbial expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct a chimeric gene for production of the instant sucrose transport proteins. This chimeric gene could then be introduced into appropriate microorganisms via transformation to provide high level expression of the encoded sucrose transport protein. An example of a vector for high level expression of the instant sucrose transport proteins in a bacterial host is provided (Example 6).

[0073] All or a substantial portion of the nucleic acid fragments of the instant invention may also be used as probes for genetically and physically mapping the genes that they are a part of, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes. For example, the instant nucleic acid fragments may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Maniatis) of restriction-digested plant genomic DNA may be probed with the nucleic acid fragments of the instant invention. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al., (1987) Genomics 1:174-181) in order to construct a genetic map. In addition, the nucleic acid fragments of the instant invention may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing parent and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the instant nucleic acid sequence in the genetic map previously obtained using this population (Botstein, D. et al., (1980) Am. J. Hum. Genet. 32:314-331).
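The way segregation data translate into a map position can be sketched with a simple two-point calculation. The Python example below is a deliberately simplified illustration and is not the multipoint analysis performed by programs such as MapMaker. It assumes a backcross population with markers in coupling phase, so that any progeny individual whose scores at the two markers differ is counted as a recombinant, and it converts the recombination fraction to centimorgans with the Haldane mapping function. The function names, the "A"/"H" scoring scheme and the genotype strings are all hypothetical.

```python
import math


def recombination_fraction(marker_a: str, marker_b: str) -> float:
    """Fraction of progeny whose scores differ between two markers.

    Each string holds one score per individual (e.g. "A" = parental
    homozygote, "H" = heterozygote), in the same individual order.
    Assumes a backcross with the markers in coupling phase.
    """
    if len(marker_a) != len(marker_b) or not marker_a:
        raise ValueError("marker score strings must be non-empty and equal in length")
    recombinants = sum(a != b for a, b in zip(marker_a, marker_b))
    return recombinants / len(marker_a)


def haldane_cm(r: float) -> float:
    """Convert a recombination fraction (0 <= r < 0.5) to map distance in cM."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical scores for ten backcross progeny at a mapped RFLP marker
# and at a polymorphism detected with a probe of interest.
mapped_marker = "AAHHAHAHAH"
probe_marker = "AAHHAHAHHA"

r = recombination_fraction(mapped_marker, probe_marker)
print(r)              # 0.2 (two recombinants among ten individuals)
print(haldane_cm(r))  # map distance in centimorgans
```

In practice the new marker would be compared against every marker already on the map, and the interval giving the most likely order would be selected by the mapping software.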

[0074] The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky, R. and Tanksley, S. D. (1986) Plant Mol. Biol. Reporter 4(1):37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.

[0075] Nucleic acid probes derived from the instant nucleic acid sequences may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel, J. D., et al., In: Nonmammalian Genomic Analysis: A Practical Guide, Academic press 1996, pp. 319-346, and references cited therein).

[0076] In another embodiment, nucleic acid probes derived from the instant nucleic acid sequences may be used in direct fluorescence in situ hybridization (FISH) mapping (Trask, B. J. (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favor use of large clones (several to several hundred KB; see Laan, M. et al. (1995) Genome Research 5:13-20), improvements in sensitivity may allow performance of FISH mapping using shorter probes.

[0077] A variety of nucleic acid amplification-based methods of genetic and physical mapping may be carried out using the instant nucleic acid sequences. Examples include allele-specific amplification (Kazazian, H. H. (1989) J. Lab. Clin. Med. 114(2):95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield, V. C. et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren, U. et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov, B. P. (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter, M. A. et al. (1997) Nature Genetics 7:22-28) and Happy Mapping (Dear, P. H. and Cook, P. R. (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid fragment is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art. In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.

[0078] Loss of function mutant phenotypes may be identified for the instant cDNA clones either by targeted gene disruption protocols or by identifying specific mutants for these genes contained in a maize population carrying mutations in all possible genes (Ballinger and Benzer, (1989) Proc. Natl. Acad. Sci. USA 86:9402; Koes et al., (1995) Proc. Natl. Acad. Sci. USA 92:8149; Bensen et al., (1995) Plant Cell 7:75). The latter approach may be accomplished in two ways. First, short segments of the instant nucleic acid fragments may be used in polymerase chain reaction protocols in conjunction with a mutation tag sequence primer on DNAs prepared from a population of plants in which Mutator transposons or some other mutation-causing DNA element has been introduced (see Bensen, supra). The amplification of a specific DNA fragment with these primers indicates the insertion of the mutation tag element in or near the plant gene encoding the sucrose transport protein. Alternatively, the instant nucleic acid fragment may be used as a hybridization probe against PCR amplification products generated from the mutation population using the mutation tag sequence primer in conjunction with an arbitrary genomic site primer, such as that for a restriction enzyme site-anchored synthetic adaptor. With either method, a plant containing a mutation in the endogenous gene encoding a sucrose transport protein can be identified and obtained. This mutant plant can then be used to determine or confirm the natural function of the sucrose transport protein gene product.

EXAMPLES

[0079] The present invention is further defined in the following Examples, in which all parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions.

Example 1

Composition of cDNA Libraries; Isolation and Sequencing of cDNA Clones

[0080] cDNA libraries representing mRNAs from various corn, rice, soybean, Vernonia and wheat tissues were prepared. The characteristics of the libraries are described below.

TABLE 2 cDNA Libraries from Corn, Rice, Soybean, Vernonia and Wheat

Library	Tissue	Clone
cepe7	Corn epicotyl from 7 day old etiolated seedling	cepe7.pk0015.d10
cr1n	Corn root from 7 day seedling grown in light*	cr1n.pk0075.f5, cr1n.pk0095.c10
rlr2	Rice leaf 15 days after germination 2 hours after infection of strain Magnaporthe grisea 4360-R-62 (AVR2-YAMO)	rlr2.pk0043.b1
rls6	Rice leaf 15 days after germination 6 hours after infection of strain Magnaporthe grisea 4360-R-62 (AVR2-YAMO)	rls6.pk0076.e2
sdp3c	Soybean developing pods 8-9 mm	sdp3c.pk012.c13
sfl1	Soybean immature flower	sfl1.pk0001.g1, sfl1.pk0043.c7
vs1	Vernonia developing seed	vs1n.pk0002.h3
wle1n	Wheat leaf 7 day old etiolated seedling light grown*	wle1n.pk0007.h8, wle1n.pk0103.c11
wlm24	Wheat seedling 24 hours after inoculation with Erysiphe graminis	wlm24.pk0015.g11
wlmk1	Wheat seedlings 1 hour after inoculation with Erysiphe graminis and treatment with fungicide**	wlmk1.pk0002.e11

*These libraries were normalized essentially as described in U.S. Pat. No. 5,482,845.
**Application of 6-iodo-2-propoxy-3-propyl-4(3H)-quinazolinone; synthesis and methods of using this compound are described in USSN 08/545,827, incorporated herein by reference.

[0081] cDNA libraries were prepared in Uni-ZAP™ XR vectors according to the manufacturer's protocol (Stratagene Cloning Systems, La Jolla, Calif.). Conversion of the Uni-ZAP™ XR libraries into plasmid libraries was accomplished according to the protocol provided by Stratagene. Upon conversion, cDNA inserts were contained in the plasmid vector pBluescript. cDNA inserts from randomly picked bacterial colonies containing recombinant pBluescript plasmids were amplified via polymerase chain reaction using primers specific for vector sequences flanking the inserted cDNA sequences; alternatively, plasmid DNA was prepared from cultured bacterial cells. Amplified insert DNAs or plasmid DNAs were sequenced in dye-primer sequencing reactions to generate partial cDNA sequences (expressed sequence tags or "ESTs"; see Adams, M. D. et al., (1991) Science 252:1651). The resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent sequencer.

Example 2

Identification of cDNA Clones

[0082] ESTs encoding sucrose transport proteins were identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., (1993) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences contained in the BLAST "nr" database (comprising all non-redundant GenBank CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data Bank, the last major release of the SWISS-PROT protein sequence database, EMBL, and DDBJ databases). The cDNA sequences obtained in Example 1 were analyzed for similarity to all publicly available DNA sequences contained in the "nr" database using the BLASTN algorithm provided by the National Center for Biotechnology Information (NCBI). The DNA sequences were translated in all reading frames and compared for similarity to all publicly available protein sequences contained in the "nr" database using the BLASTX algorithm (Gish, W. and States, D. J. (1993) Nature Genetics 3:266-272 and Altschul, Stephen F., et al. (1997) Nucleic Acids Res. 25:3389-3402) provided by the NCBI. For convenience, the P-values (probabilities) of observing a match of a cDNA sequence to a sequence contained in the searched databases merely by chance, as calculated by BLAST, are reported herein as "pLog" values, which represent the negative of the logarithm of the reported P-value. Accordingly, the greater the pLog value, the greater the likelihood that the cDNA sequence and the BLAST "hit" represent homologous proteins.
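The relationship between a P-value and its pLog value can be written out as a short Python sketch. The text specifies only "the negative of the logarithm"; the use of a base-10 logarithm here is an assumption, as are the function name and the example P-values, which are hypothetical and not taken from the BLAST results reported below.

```python
import math


def plog(p_value: float) -> float:
    """Return the negative logarithm of a P-value.

    Base 10 is assumed here; the source text does not state the base.
    A P-value of exactly 0 (below reporting precision) maps to infinity.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("P-value must lie in [0, 1]")
    if p_value == 0.0:
        return math.inf
    return -math.log10(p_value)


# Hypothetical P-values, for illustration only.
for p in (1.0, 1e-5, 1e-150):
    print(p, plog(p))
```

Under this convention a smaller chance probability gives a larger pLog, which is why larger pLog values indicate stronger evidence of homology.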

Example 3

Characterization of cDNA Clones Encoding Sucrose Transporter Proteins

[0083] The BLASTX search using the EST sequences from clones cepe7.pk0015.d10, cr1n.pk0095.c10, cr1n.pk0075.f5, rls6.pk0076.e2, wle1n.pk0007.h8, wle1n.pk0103.c11, wlm24.pk0015.g11 and wlmk1.pk0002.e11 revealed similarity of the proteins encoded by the cDNAs to a sucrose transporter from Oryza sativa (NCBI Identifier No. gi 2723471). The BLASTX search using the EST sequence from clone rlr2.pk0043.b1 revealed similarity of the protein encoded by the cDNA to a sucrose transporter from Daucus carota (NCBI Identifier No. gi 2969887). The BLASTX search using the EST sequence from clone sfl1.pk0001.g1 revealed similarity of the protein encoded by the cDNA to a sucrose transporter from Vicia faba (NCBI Identifier No. gi 1935019). The BLASTX search using the EST sequences from clones sfl1.pk0043.c7, sdp3c.pk012.c13 and vs1n.pk0002.h3 revealed similarity of the proteins encoded by the cDNAs to a sucrose transporter from Ricinus communis (NCBI Identifier No. gi 542020).

[0084] In the process of comparing the ESTs it was found that soybean clones sfl1.pk0043.c7 and sdp3c.pk012.c13 had overlapping regions of homology. Using this homology it was possible to align the ESTs and assemble a contig encoding a unique soybean sucrose transport protein.

[0085] The BLAST results for each of these ESTs and the soybean contig are shown in Table 3:

TABLE 3 BLAST Results for Clones Encoding Polypeptides Homologous to Daucus carota, Oryza sativa, Ricinus communis and Vicia faba Sucrose Transport Proteins

Clone	BLAST pLog Score
cepe7.pk0015.d10	>250.00
cr1n.pk0095.c10	>250.00
cr1n.pk0075.f5	31.10
rlr2.pk0043.b1	148.00
rls6.pk0076.e2	>250.00
sfl1.pk0001.g1	>250.00
Contig composed of: sfl1.pk0043.c7, sdp3c.pk012.c13	142.00
vs1n.pk0002.h3	59.30
wle1n.pk0007.h8	110.00
wle1n.pk0103.c11	>250.00
wlm24.pk0015.g11	>250.00
wlmk1.pk0002.e11	177.00

[0086] The sequence of a portion of the cDNA insert from clone cepe7.pk0015.d10 is shown in SEQ ID NO:1; the deduced amino acid sequence of this cDNA, which represents 100% of the protein, is shown in SEQ ID NO:2. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:2 and the Oryza sativa sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:2 is 82% similar to the Oryza sativa protein.

[0087] The sequence of a portion of the cDNA insert from clone cr1n.pk0075.f5 is shown in SEQ ID NO:3; the deduced amino acid sequence of this cDNA, which represents 93% of the protein, is shown in SEQ ID NO:4. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:4 and the Oryza sativa sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:4 is 50% similar to the Oryza sativa protein.

[0088] The sequence of a portion of the cDNA insert from clone cr1n.pk0095.c10 is shown in SEQ ID NO:5; the deduced amino acid sequence of this cDNA, which represents 20% of the protein (C-terminal region), is shown in SEQ ID NO:6. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:6 and the Oryza sativa sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:6 is 86% similar to the Oryza sativa protein.

[0089] The sequence of a portion of the cDNA insert from clone rlr2.pk0043.b1 is shown in SEQ ID NO:7; the deduced amino acid sequence of this cDNA, which represents 79% of the protein (C-terminal region), is shown in SEQ ID NO:8. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:8 and the Daucus carota sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:8 is 60% similar to the Daucus carota protein.

[0090] The sequence of a portion of the cDNA insert from clone rls6.pk0076.e2 is shown in SEQ ID NO:9; the deduced amino acid sequence of this cDNA, which represents 100% of the protein, is shown in SEQ ID NO:10. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:10 and the Oryza sativa sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:10 is 55% similar to the Oryza sativa protein. Because it is only 55% similar to the known rice sucrose transport protein, clone rls6.pk0076.e2 appears to represent a second rice sucrose transport protein.

[0091] The sequence of a portion of the cDNA insert from clone sfl1.pk0001.g1 is shown in SEQ ID NO:11; the deduced amino acid sequence of this cDNA, which represents 100% of the protein, is shown in SEQ ID NO:12. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:12 and the Vicia faba sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:12 is 67% similar to the Vicia faba protein.

[0092] The sequence of a portion of the contig composed of clones sfl1.pk0043.c7 and sdp3c.pk012.c13 is shown in SEQ ID NO:13; the deduced amino acid sequence of this contig, which represents 62% of the protein (N-terminal region), is shown in SEQ ID NO:14. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:14 and the Ricinus communis sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:14 is 66% similar to the Ricinus communis protein.

[0093] The sequence of a portion of the cDNA insert from clone vs1n.pk0002.h3 is shown in SEQ ID NO:15; the deduced amino acid sequence of this cDNA, which represents 31% of the protein (C-terminal region), is shown in SEQ ID NO:16. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:16 and the Ricinus communis sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:16 is 66% similar to the Ricinus communis protein.

[0094] The sequence of a portion of the cDNA insert from clone wle1n.pk0007.h8 is shown in SEQ ID NO:17; the deduced amino acid sequence of this cDNA, which represents 43% of the protein (C-terminal region), is shown in SEQ ID NO:18. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:18 and the Oryza sativa sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:18 is 80% similar to the Oryza sativa protein.

[0095] The sequence of a portion of the cDNA insert from clone wle1n.pk0103.c11 is shown in SEQ ID NO:19; the deduced amino acid sequence of this cDNA, which represents 100% of the protein, is shown in SEQ ID NO:20. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:20 and the Oryza sativa sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:20 is 80% similar to the Oryza sativa protein.

[0096] The sequence of a portion of the cDNA insert from clone wlm24.pk0015.g11 is shown in SEQ ID NO:21; the deduced amino acid sequence of this cDNA, which represents 100% of the protein, is shown in SEQ ID NO:22. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:22 and the Oryza sativa sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:22 is 80% similar to the Oryza sativa protein.

[0097] The sequence of a portion of the cDNA insert from clone wlmk1.pk0002.e11 is shown in SEQ ID NO:23; the deduced amino acid sequence of this cDNA, which represents 97% of the protein, is shown in SEQ ID NO:24. A calculation of the percent similarity of the amino acid sequence set forth in SEQ ID NO:24 and the Oryza sativa sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:24 is 54% similar to the Oryza sativa protein.

[0098] The percent similarity between each of the corn, rice, soybean, Vernonia and wheat amino acid sequences was calculated to range from 12 to 98% using the Clustal algorithm. FIG. 1 presents an alignment of the amino acid sequences set forth in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24 and the Daucus carota, Oryza sativa, Ricinus communis and Vicia faba sucrose transport protein amino acid sequences.

[0099] BLAST scores and probabilities indicate that the instant nucleic acid fragments encode entire or portions of proteins. These sequences represent the first corn, soybean and wheat amino acid sequences and a new rice sequence encoding sucrose transport proteins.

Example 4

Expression of Chimeric Genes in Monocot Cells

[0100] A chimeric gene comprising a cDNA encoding a sucrose transport protein in sense orientation with respect to the maize 27 kD zein promoter that is located 5' to the cDNA fragment, and the 10 kD zein 3' end that is located 3' to the cDNA fragment, can be constructed. The cDNA fragment of this gene may be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites (NcoI or SmaI) can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the digested vector pML103 as described below. Amplification is then performed in a standard PCR. The amplified DNA is then digested with restriction enzymes NcoI and SmaI and fractionated on an agarose gel. The appropriate band can be isolated from the gel and combined with a 4.9 kb NcoI-SmaI fragment of the plasmid pML103. Plasmid pML103 has been deposited under the terms of the Budapest Treaty at ATCC (American Type Culture Collection, 10801 University Blvd., Manassas, Va. 20110-2209), and bears accession number ATCC 97366. The DNA segment from pML103 contains a 1.05 kb SalI-NcoI promoter fragment of the maize 27 kD zein gene and a 0.96 kb SmaI-SalI fragment from the 3' end of the maize 10 kD zein gene in the vector pGem9Zf(+) (Promega). Vector and insert DNA can be ligated at 15° C. overnight, essentially as described (Maniatis). The ligated DNA may then be used to transform E. coli XL1-Blue (Epicurian Coli XL-1 Blue™; Stratagene). Bacterial transformants can be screened by restriction enzyme digestion of plasmid DNA and limited nucleotide sequence analysis using the dideoxy chain termination method (Sequenase™ DNA Sequencing Kit; U.S. Biochemical). The resulting plasmid construct would comprise a chimeric gene encoding, in the 5' to 3' direction, the maize 27 kD zein promoter, a cDNA fragment encoding a sucrose transport protein, and the 10 kD zein 3' region.

[0101] The chimeric gene described above can then be introduced into corn cells by the following procedure. Immature corn embryos can be dissected from developing caryopses derived from crosses of the inbred corn lines H99 and LH132. The embryos are isolated 10 to 11 days after pollination when they are 1.0 to 1.5 mm long. The embryos are then placed with the axis-side facing down and in contact with agarose-solidified N6 medium (Chu et al., (1975) Sci. Sin. Peking 18:659-668). The embryos are kept in the dark at 27° C. Friable embryogenic callus consisting of undifferentiated masses of cells with somatic proembryoids and embryoids borne on suspensor structures proliferates from the scutellum of these immature embryos. The embryogenic callus isolated from the primary explant can be cultured on N6 medium and sub-cultured on this medium every 2 to 3 weeks.

[0102] The plasmid, p35S/Ac (obtained from Dr. Peter Eckes, Hoechst Ag, Frankfurt, Germany) may be used in transformation experiments in order to provide for a selectable marker. This plasmid contains the Pat gene (see European Patent Publication 0 242 236) which encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT confers resistance to herbicidal glutamine synthetase inhibitors such as phosphinothricin. The pat gene in p35S/Ac is under the control of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313:810-812) and the 3' region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens.

[0103] The particle bombardment method (Klein et al., (1987) Nature 327:70-73) may be used to transfer genes to the callus culture cells. According to this method, gold particles (1 µm in diameter) are coated with DNA using the following technique. Ten µg of plasmid DNAs are added to 50 µL of a suspension of gold particles (60 mg per mL). Calcium chloride (50 µL of a 2.5 M solution) and spermidine free base (20 µL of a 1.0 M solution) are added to the particles. The suspension is vortexed during the addition of these solutions. After 10 minutes, the tubes are briefly centrifuged (5 sec at 15,000 rpm) and the supernatant removed. The particles are resuspended in 200 µL of absolute ethanol, centrifuged again and the supernatant removed. The ethanol rinse is performed again and the particles resuspended in a final volume of 30 µL of ethanol. An aliquot (5 µL) of the DNA-coated gold particles can be placed in the center of a Kapton™ flying disc (Bio-Rad Labs). The particles are then accelerated into the corn tissue with a Biolistic™ PDS-1000/He (Bio-Rad Instruments, Hercules Calif.), using a helium pressure of 1000 psi, a gap distance of 0.5 cm and a flying distance of 1.0 cm.
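The quantities in the coating procedure imply a per-bombardment load that can be worked out directly. The Python sketch below does this arithmetic under two stated assumptions that the text does not confirm: that no gold is lost during the centrifugation and ethanol rinse steps, and that all of the plasmid DNA ends up bound to the particles, so the DNA figure is an upper bound. The function name and dictionary keys are hypothetical.

```python
def per_shot_load(
    gold_suspension_ul: float = 50.0,  # volume of gold suspension used
    gold_mg_per_ml: float = 60.0,      # concentration of the gold suspension
    dna_ug: float = 10.0,              # plasmid DNA added
    final_volume_ul: float = 30.0,     # final ethanol resuspension volume
    aliquot_ul: float = 5.0,           # volume loaded per flying disc
) -> dict:
    """Estimate gold and DNA delivered per bombardment.

    Assumes no particle loss during washes and complete DNA binding,
    so the DNA value is a theoretical maximum.
    """
    gold_mg_total = gold_suspension_ul / 1000.0 * gold_mg_per_ml
    fraction = aliquot_ul / final_volume_ul
    return {
        "gold_mg_total": gold_mg_total,
        "shots_per_prep": final_volume_ul / aliquot_ul,
        "gold_mg_per_shot": gold_mg_total * fraction,
        "dna_ug_per_shot_max": dna_ug * fraction,
    }


load = per_shot_load()
for name, value in load.items():
    print(f"{name}: {value:.3f}")
```

With the stated volumes, one preparation holds about 3 mg of gold and yields six 5 µL aliquots, each carrying about 0.5 mg of gold and at most roughly 1.7 µg of DNA.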

[0104] For bombardment, the embryogenic tissue is placed on filter paper over agarose-solidified N6 medium. The tissue is arranged as a thin lawn and covers a circular area of about 5 cm in diameter. The petri dish containing the tissue can be placed in the chamber of the PDS-1000/He approximately 8 cm from the stopping screen. The air in the chamber is then evacuated to a vacuum of 28 inches of Hg. The macrocarrier is accelerated with a helium shock wave using a rupture membrane that bursts when the He pressure in the shock tube reaches 1000 psi.

[0105] Seven days after bombardment the tissue can be transferred to N6 medium that contains glufosinate (2 mg per liter) and lacks casein or proline. The tissue continues to grow slowly on this medium. After an additional 2 weeks the tissue can be transferred to fresh N6 medium containing glufosinate. After 6 weeks, areas of about 1 cm in diameter of actively growing callus can be identified on some of the plates containing the glufosinate-supplemented medium. These calli may continue to grow when sub-cultured on the selective medium.

[0106] Plants can be regenerated from the transgenic callus by first transferring clusters of tissue to N6 medium supplemented with 0.2 mg per liter of 2,4-D. After two weeks the tissue can be transferred to regeneration medium (Fromm et al., (1990) Bio/Technology 8:833-839).

Example 5

Expression of Chimeric Genes in Dicot Cells

[0107] A seed-specific expression cassette composed of the promoter and transcription terminator from the gene encoding the β subunit of the seed storage protein phaseolin from the bean Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem. 261:9228-9238) can be used for expression of the instant sucrose transport proteins in transformed soybean. The phaseolin cassette includes about 500 nucleotides upstream (5') from the translation initiation codon and about 1650 nucleotides downstream (3') from the translation stop codon of phaseolin. Between the 5' and 3' regions are the unique restriction endonuclease sites Nco I (which includes the ATG translation initiation codon), Sma I, Kpn I and Xba I. The entire cassette is flanked by Hind III sites.

[0108] The cDNA fragment of this gene may be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the expression vector. Amplification is then performed as described above, and the isolated fragment is inserted into a pUC18 vector carrying the seed expression cassette.
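One way to picture the primer design mentioned above is the sketch below. It is illustrative only: the toy open reading frame, the 18-nucleotide annealing length, the two-base 5' clamp and the choice of Nco I at the 5' end and Xba I at the 3' end are all assumptions, not details taken from this disclosure. The recognition sequences themselves (Nco I CCATGG, Sma I CCCGGG, Kpn I GGTACC, Xba I TCTAGA) are the standard ones. The sketch places the Nco I site over the start codon, which requires a G immediately after the ATG, and checks that none of the cassette's cloning sites occurs inside the fragment.

```python
# Standard recognition sequences of the cassette's cloning sites.
SITES = {"NcoI": "CCATGG", "SmaI": "CCCGGG", "KpnI": "GGTACC", "XbaI": "TCTAGA"}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def sites_inside(orf: str, sites: dict) -> list:
    """List enzymes whose site occurs within the fragment itself.

    All four sites are palindromic, so scanning one strand is enough.
    """
    orf = orf.upper()
    return [name for name, site in sites.items() if site in orf]


def design_primers(orf: str, anneal_len: int = 18, clamp: str = "GC"):
    """Build a forward primer with Nco I over the ATG and a reverse
    primer carrying Xba I (both choices are assumptions)."""
    orf = orf.upper()
    if not orf.startswith("ATGG"):
        raise ValueError("Nco I (CCATGG) over the start codon needs a G after ATG")
    forward = clamp + "CC" + orf[:anneal_len]
    reverse = clamp + SITES["XbaI"] + reverse_complement(orf[-anneal_len:])
    return forward, reverse


# Hypothetical toy ORF used only to exercise the functions.
toy_orf = "ATGGAGGAGCCACAACCAGGACCCAGCCCGTTACGCAAAATGATTTTGGTGTCGTCATAG"
fwd, rev = design_primers(toy_orf)
print(fwd, rev, sites_inside(toy_orf, SITES))
```

Using two different sites at the two ends is what fixes the orientation of the fragment in the vector. A fragment that contains one of the chosen sites internally would need a different enzyme pair.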

[0109] Soybean embryos may then be transformed with the expression vector comprising a sequence encoding a sucrose transport protein. To induce somatic embryos, cotyledons 3-5 mm in length, dissected from surface-sterilized, immature seeds of the soybean cultivar A2872, can be cultured in the light or dark at 26° C. on an appropriate agar medium for 6-10 weeks. Somatic embryos which produce secondary embryos are then excised and placed into a suitable liquid medium. After repeated selection for clusters of somatic embryos which multiply as early, globular-staged embryos, the suspensions are maintained as described below.

[0110] Soybean embryogenic suspension cultures can be maintained in 35 mL of liquid medium on a rotary shaker, 150 rpm, at 26° C. with fluorescent lights on a 16:8 hour day/night schedule. Cultures are subcultured every two weeks by inoculating approximately 35 mg of tissue into 35 mL of liquid medium.

[0111] Soybean embryogenic suspension cultures may then be transformed by the method of particle gun bombardment (Klein et al. (1987) Nature (London) 327:70, U.S. Pat. No. 4,945,050). A DuPont Biolistic™ PDS-1000/He instrument (helium retrofit) can be used for these transformations.

[0112] A selectable marker gene which can be used to facilitate soybean transformation is a chimeric gene composed of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313:810-812), the hygromycin phosphotransferase gene from plasmid pJR225 (from E. coli; Gritz et al. (1983) Gene 25:179-188) and the 3' region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. The seed expression cassette comprising the phaseolin 5' region, the fragment encoding the sucrose transport protein and the phaseolin 3' region can be isolated as a restriction fragment. This fragment can then be inserted into a unique restriction site of the vector carrying the marker gene.

[0113] To 50 µL of a 60 mg/mL 1 µm gold particle suspension is added (in order): 5 µL DNA (1 µg/µL), 20 µL spermidine (0.1 M), and 50 µL CaCl₂ (2.5 M). The particle preparation is then agitated for three minutes, spun in a microfuge for 10 seconds and the supernatant removed. The DNA-coated particles are then washed once in 400 µL 70% ethanol and resuspended in 40 µL of anhydrous ethanol. The DNA/particle suspension can be sonicated three times for one second each. Five µL of the DNA-coated gold particles are then loaded on each macro carrier disk.
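The final concentrations in this precipitation mixture follow from simple dilution (C1·V1 = C2·V2). The sketch below works them out from the stock concentrations and volumes given in the paragraph above. It assumes the volumes are additive, and the function and variable names are illustrative.

```python
def final_concentration(stock_conc: float, stock_vol_ul: float, total_vol_ul: float) -> float:
    """Dilution formula C2 = C1 * V1 / V2 (units of C2 match C1)."""
    return stock_conc * stock_vol_ul / total_vol_ul


# Volumes (uL) added in the order given in the text.
volumes_ul = {"gold": 50.0, "dna": 5.0, "spermidine": 20.0, "cacl2": 50.0}
total_ul = sum(volumes_ul.values())  # assumes additive volumes

mix = {
    "gold_mg_per_ml": final_concentration(60.0, volumes_ul["gold"], total_ul),
    "dna_ug_per_ul": final_concentration(1.0, volumes_ul["dna"], total_ul),
    "spermidine_mM": final_concentration(100.0, volumes_ul["spermidine"], total_ul),
    "cacl2_M": final_concentration(2.5, volumes_ul["cacl2"], total_ul),
}

for name, value in mix.items():
    print(f"{name}: {value:.3f}")
```

With a total of 125 µL, the mixture is about 1.0 M in calcium chloride and 16 mM in spermidine before the particles are pelleted and washed.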

[0114] Approximately 300-400 mg of a two-week-old suspension culture is placed in an empty 60×15 mm petri dish and the residual liquid removed from the tissue with a pipette. For each transformation experiment, approximately 5-10 plates of tissue are normally bombarded. Membrane rupture pressure is set at 1100 psi and the chamber is evacuated to a vacuum of 28 inches mercury. The tissue is placed approximately 3.5 inches away from the retaining screen and bombarded three times. Following bombardment, the tissue can be divided in half and placed back into liquid and cultured as described above.
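The bombardment settings above are given in US customary units. For readers working with SI-calibrated equipment, the following sketch converts them using standard conversion factors. The function names are illustrative, and the 28 inches of mercury is treated as a gauge reading below atmospheric pressure.

```python
# Standard conversion factors.
KPA_PER_PSI = 6.894757
KPA_PER_INHG = 3.386389
CM_PER_INCH = 2.54


def psi_to_kpa(psi: float) -> float:
    return psi * KPA_PER_PSI


def inhg_to_kpa(inhg: float) -> float:
    return inhg * KPA_PER_INHG


def inch_to_cm(inches: float) -> float:
    return inches * CM_PER_INCH


settings = {
    "rupture_pressure_kpa": psi_to_kpa(1100),   # membrane rupture pressure
    "chamber_vacuum_kpa": inhg_to_kpa(28),      # vacuum, as a gauge reading
    "target_distance_cm": inch_to_cm(3.5),      # tissue to retaining screen
}

for name, value in settings.items():
    print(f"{name}: {value:.2f}")
```

The same helpers apply to the corn procedure, which uses 1000 psi and the same 28 inches of mercury.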

[0115] Five to seven days post bombardment, the liquid media may be exchanged with fresh media, and eleven to twelve days post bombardment with fresh media containing 50 mg/mL hygromycin. This selective media can be refreshed weekly. Seven to eight weeks post bombardment, green, transformed tissue may be observed growing from untransformed, necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Each new line may be treated as an independent transformation event. These suspensions can then be subcultured and maintained as clusters of immature embryos or regenerated into whole plants by maturation and germination of individual somatic embryos.
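The timing in this paragraph can be laid out as a calendar. The sketch below is a planning aid only: it takes the earliest day of each stated range (day 5 for the first medium exchange, day 11 for the start of hygromycin selection), refreshes the selective medium weekly through week 8, and uses a hypothetical bombardment date.

```python
from datetime import date, timedelta


def selection_schedule(bombardment: date,
                       first_exchange_day: int = 5,
                       selection_start_day: int = 11,
                       end_week: int = 8) -> list:
    """Return (date, action) pairs for the post-bombardment steps.

    The defaults take the earliest day of each range in the text;
    the text allows days 5-7 and days 11-12 respectively.
    """
    events = [(bombardment + timedelta(days=first_exchange_day),
               "exchange with fresh medium")]
    day = selection_start_day
    while day <= end_week * 7:
        action = ("start hygromycin selection" if day == selection_start_day
                  else "refresh selective medium")
        events.append((bombardment + timedelta(days=day), action))
        day += 7
    return events


# Hypothetical bombardment date, for illustration only.
schedule = selection_schedule(date(2024, 1, 1))
for when, action in schedule:
    print(when.isoformat(), action)
```

Green, transformed tissue would be looked for toward the end of this window, at seven to eight weeks.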

Example 6

Expression of Chimeric Genes in Microbial Cells

[0116] The cDNAs encoding the instant sucrose transport proteins can be inserted into the T7 E. coli expression vector pBT430. This vector is a derivative of pET-3a (Rosenberg et al. (1987) Gene 56:125-135) which employs the bacteriophage T7 RNA polymerase/T7 promoter system. Plasmid pBT430 was constructed by first destroying the EcoR I and Hind III sites in pET-3a at their original positions. An oligonucleotide adaptor containing EcoR I and Hind III sites was inserted at the BamH I site of pET-3a. This created pET-3aM with additional unique cloning sites for insertion of genes into the expression vector. Then, the Nde I site at the position of translation initiation was converted to an Nco I site using oligonucleotide-directed mutagenesis. The DNA sequence of pET-3aM in this region, 5'-CATATGG, was converted to 5'-CCCATGG in pBT430.
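The sequence change described above can be checked directly. The sketch below compares the two seven-base sequences given in the text and confirms that the Nde I site (CATATG) is lost, an Nco I site (CCATGG) is created, and the ATG stays at the same position. The helper names are illustrative.

```python
NDE_I = "CATATG"   # Nde I recognition sequence
NCO_I = "CCATGG"   # Nco I recognition sequence

before = "CATATGG"  # pET-3aM, as given in the text
after = "CCCATGG"   # pBT430, as given in the text


def substitutions(a: str, b: str) -> list:
    """Return (1-based position, old base, new base) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


changes = substitutions(before, after)
print("substitutions:", changes)
print("Nde I before/after:", NDE_I in before, NDE_I in after)
print("Nco I before/after:", NCO_I in before, NCO_I in after)
print("ATG position before/after:", before.find("ATG"), after.find("ATG"))
```

Two base substitutions are enough to make the conversion, and the initiation codon keeps its position.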

[0117] Plasmid DNA containing a cDNA may be appropriately digested to release a nucleic acid fragment encoding the protein. This fragment may then be purified on a 1% NuSieve GTG™ low melting agarose gel (FMC). Buffer and agarose contain 10 µg/mL ethidium bromide for visualization of the DNA fragment. The fragment can then be purified from the agarose gel by digestion with GELase™ (Epicentre Technologies) according to the manufacturer's instructions, ethanol precipitated, dried and resuspended in 20 µL of water. Appropriate oligonucleotide adapters may be ligated to the fragment using T4 DNA ligase (New England Biolabs, Beverly, Mass.). The fragment containing the ligated adapters can be purified from the excess adapters using low melting agarose as described above. The vector pBT430 is digested, dephosphorylated with alkaline phosphatase (NEB) and deproteinized with phenol/chloroform as described above. The prepared vector pBT430 and fragment can then be ligated at 16° C. for 15 hours followed by transformation into DH5 electrocompetent cells (GIBCO BRL). Transformants can be selected on agar plates containing LB media and 100 µg/mL ampicillin. Transformants containing the gene encoding the sucrose transport protein are then screened for the correct orientation with respect to the T7 promoter by restriction enzyme analysis.
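The orientation screen in the last sentence works when an enzyme cuts once, off-center, in the insert and once in the vector, so that the two orientations give different fragment sizes. The sketch below illustrates that logic for a circular plasmid. Every number in it (vector and insert lengths, cut positions) is hypothetical and chosen only to show the pattern; none is taken from pBT430 or from the cDNAs described here.

```python
def diagnostic_fragments(vector_len: int, insert_len: int,
                         insert_cut: int, vector_cut_dist: int,
                         forward: bool) -> list:
    """Return the two fragment sizes from a double cut of the plasmid.

    insert_cut:      distance of the cut from the 5' end of the insert
    vector_cut_dist: distance from the downstream insert/vector junction
                     to the single cut in the vector
    forward:         True if the insert 5' end is nearest the promoter
    """
    total = vector_len + insert_len
    # Distance from the insert cut to the junction nearest the vector cut.
    to_junction = insert_len - insert_cut if forward else insert_cut
    fragment = to_junction + vector_cut_dist
    return sorted((fragment, total - fragment))


# Hypothetical sizes, for illustration only.
params = dict(vector_len=4600, insert_len=1900, insert_cut=400, vector_cut_dist=300)
forward_pattern = diagnostic_fragments(forward=True, **params)
reverse_pattern = diagnostic_fragments(forward=False, **params)
print("forward:", forward_pattern)
print("reverse:", reverse_pattern)
```

The closer the insert cut is to the middle of the insert, the harder the two patterns are to tell apart on a gel, so an asymmetric site is preferred.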

[0118] For high level expression, a plasmid clone with the cDNA insert in the correct orientation relative to the T7 promoter can be transformed into E. coli strain BL21(DE3) (Studier et al. (1986) J. Mol. Biol. 189:113-130). Cultures are grown in LB medium containing ampicillin (100 mg/L) at 25° C. At an optical density at 600 nm of approximately 1, IPTG (isopropylthio-β-galactoside, the inducer) can be added to a final concentration of 0.4 mM and incubation can be continued for 3 h at 25° C. Cells are then harvested by centrifugation and re-suspended in 50 µL of 50 mM Tris-HCl at pH 8.0 containing 0.1 mM DTT and 0.2 mM phenylmethylsulfonyl fluoride. A small amount of 1 mm glass beads can be added and the mixture sonicated 3 times for about 5 seconds each time with a microprobe sonicator. The mixture is centrifuged and the protein concentration of the supernatant determined. One µg of protein from the soluble fraction of the culture can be separated by SDS-polyacrylamide gel electrophoresis. Gels can be observed for protein bands migrating at the expected molecular weight.
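Two quantities in this paragraph lend themselves to a quick calculation: the volume of IPTG stock needed to reach 0.4 mM, and a rough expected molecular weight for the band on the gel. The sketch below is an approximation under stated assumptions. The 100 mM stock and 50 mL culture are hypothetical, the average residue mass of 110 Da is a rule of thumb, and the residue counts are those read from the sequence listing for SEQ ID NOs: 2, 6, 10 and 12. Membrane proteins often migrate anomalously on SDS gels, so the estimate is only a starting point.

```python
# Rule-of-thumb average residue mass; an approximation, not an exact value.
AVG_RESIDUE_DA = 110.0

# SEQ ID NO -> number of amino acids, as read from the sequence listing.
RESIDUE_COUNTS = {2: 519, 6: 497, 10: 667, 12: 494}


def approx_mass_kda(n_residues: int) -> float:
    """Estimate polypeptide mass in kDa from its residue count."""
    return n_residues * AVG_RESIDUE_DA / 1000.0


def iptg_stock_volume_ul(culture_ml: float, final_mm: float = 0.4,
                         stock_mm: float = 100.0) -> float:
    """Volume of IPTG stock (uL) for the target final concentration.

    stock_mm is an assumed stock concentration; the small volume
    added to the culture is ignored.
    """
    return culture_ml * 1000.0 * final_mm / stock_mm


masses = {seq_id: approx_mass_kda(n) for seq_id, n in RESIDUE_COUNTS.items()}
for seq_id, kda in masses.items():
    print(f"SEQ ID NO:{seq_id}: ~{kda:.1f} kDa")
print("IPTG stock for a 50 mL culture (uL):", iptg_stock_volume_ul(50.0))
```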

Sequence Listing

2812088DNAZea mays 1gcacgagaca ctcctcacct ctcctcgctc cacgcacgcg ctctctcacc cgctggctat 60tagtcgtcgt cccttggatt tcgacactct ctctagcggg cgcctgttcc gccgccgtcc 120atcgatccta gctagctagc tagctagggc gcgaccgtcg tctcggtggt tgttgacagg 180tcccgtacgt gtgtgctcgc catggctcgt ggcgacggcg ggcagctggc ggagctgtcc 240gcgggggtcc gcggcgcggc cgcggtggtg gaccacgtgg ccccgatcag cctcgggagg 300ctcatcctcg ccggcatggt cgccggcggc gtgcagtacg gctgggcgct gcagctctcc 360ctcctcacgc cctacgtgca gactctgggg ctttcacatg cgctcacttc attcatgtgg 420ctctgcggcc ctattgccgg cttagtggtc caaccgctgg ttggcctgta cagcgacagg 480tgtacatcga gatgggggag acggaggccg tttatcctga cagggtgcat gctcatctgc 540gttgccgtca ttgttgtcgg attctcgtca gacatcggag ctgctctagg ggacacgaag 600gaacactgca gcctctacca cggtcctcgt tggcacgctg cgatcgtgta cgttctgggg 660ttttggctcc ttgacttctc caacaacact gtgcagggtc cagcacgtgc tatgatggct 720gatctatgtg accatcatgg gccaagtgcg gctaactcca tcttctgttc ttggatggcg 780ctgggaaaca tcctaggcta ctcctctggc tccacgaaca attggcacaa gtggtttccc 840ttccttaaaa cgagcgcctg ctgtgaggcc tgtgcgaacc tgaaaggtgc atttctggtg 900gccgtggtgt tcctagtcct gtgcctgacg gtaaccctga tcttcgccaa ggaggtgccg 960tacagagcga acgagaacct cccgacgacg aaggccggcg gcgaggtcga gactgagcct 1020accgggccac ttgccgtgct caagggcttc aaggacctgc ctcccgggat gccgtccgtg 1080ctcctcgtga ctgccatcac ctggctttcg tggttcccgt tcatcctcta cgacaccgac 1140tggatgggcc gggagatcta ccacggcgac cccaagggga gcaacgccca gatctcggcg 1200ttcaacgaag gtgtccgagt cggcgcgttc gggctgctac tcaactcggt tattctaggg 1260ttcagctcgt tcctgatcga gcccatgtgc cggaaggtcg ggccgagggt ggtgtgggtg 1320acgagcaact tcatggtctg cgtcgccatg gcggccaccg cgctgatcag cttctggtcg 1380ctcagggact accacgggta cgtgcaggac gccatcaccg cgaacgccag catcaaggcc 1440gtctgcctcg tcctcttcgc cttcctgggc gtccctctcg ccatcctgta cagcgtcccg 1500ttcgcggtga cggcgcagct ggcggccacc cggggcggcg ggcaggggct gtgcaccggc 1560gtcctcaaca tctccatcgt catccctcag gtgatcatcg cgctgggcgc cggcccgtgg 1620gacgcgctgt tcgggaaggg caacatcccg gcgttcggcg tcgcgtcggc cttcgccctc 1680gtcggcggcg tcgtgggcgt gttcctgctg 
cccaagatct ccaagcgcca gttccgggcc 1740gtcagcgcgg gcggccactg atcgaacccg gccggggccg gccgccggca cgcagcccgg 1800caagagctgt atgttgttga gagttgaaca gaaaccatgc atgtgtgctt ctgtagttct 1860gttgtttgtg gtcgatcgat gggcgttgcg tggcagcgtg ggcaagcgag gcgaggtgcg 1920cggatccaaa aaaagggcca ttcgatcaat caatgtgtag tagagtacaa ctagacgatg 1980atgttcacat catttgtctt taatacatac cggtttctat tgtctttaaa aaaaaaaaaa 2040aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa 20882519PRTZea mays 2Met Ala Arg Gly Asp Gly Gly Gln Leu Ala Glu Leu Ser Ala Gly Val1 5 10 15Arg Gly Ala Ala Ala Val Val Asp His Val Ala Pro Ile Ser Leu Gly 20 25 30Arg Leu Ile Leu Ala Gly Met Val Ala Gly Gly Val Gln Tyr Gly Trp 35 40 45Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr Val Gln Thr Leu Gly Leu 50 55 60Ser His Ala Leu Thr Ser Phe Met Trp Leu Cys Gly Pro Ile Ala Gly65 70 75 80Leu Val Val Gln Pro Leu Val Gly Leu Tyr Ser Asp Arg Cys Thr Ser 85 90 95Arg Trp Gly Arg Arg Arg Pro Phe Ile Leu Thr Gly Cys Met Leu Ile 100 105 110Cys Val Ala Val Ile Val Val Gly Phe Ser Ser Asp Ile Gly Ala Ala 115 120 125Leu Gly Asp Thr Lys Glu His Cys Ser Leu Tyr His Gly Pro Arg Trp 130 135 140His Ala Ala Ile Val Tyr Val Leu Gly Phe Trp Leu Leu Asp Phe Ser145 150 155 160Asn Asn Thr Val Gln Gly Pro Ala Arg Ala Met Met Ala Asp Leu Cys 165 170 175Asp His His Gly Pro Ser Ala Ala Asn Ser Ile Phe Cys Ser Trp Met 180 185 190Ala Leu Gly Asn Ile Leu Gly Tyr Ser Ser Gly Ser Thr Asn Asn Trp 195 200 205His Lys Trp Phe Pro Phe Leu Lys Thr Ser Ala Cys Cys Glu Ala Cys 210 215 220Ala Asn Leu Lys Gly Ala Phe Leu Val Ala Val Val Phe Leu Val Leu225 230 235 240Cys Leu Thr Val Thr Leu Ile Phe Ala Lys Glu Val Pro Tyr Arg Ala 245 250 255Asn Glu Asn Leu Pro Thr Thr Lys Ala Gly Gly Glu Val Glu Thr Glu 260 265 270Pro Thr Gly Pro Leu Ala Val Leu Lys Gly Phe Lys Asp Leu Pro Pro 275 280 285Gly Met Pro Ser Val Leu Leu Val Thr Ala Ile Thr Trp Leu Ser Trp 290 295 300Phe Pro Phe Ile Leu Tyr Asp Thr Asp Trp Met Gly Arg Glu Ile Tyr305 310 315 320His Gly Asp Pro Lys Gly Ser Asn Ala Gln Ile 
Ser Ala Phe Asn Glu 325 330 335Gly Val Arg Val Gly Ala Phe Gly Leu Leu Leu Asn Ser Val Ile Leu 340 345 350Gly Phe Ser Ser Phe Leu Ile Glu Pro Met Cys Arg Lys Val Gly Pro 355 360 365Arg Val Val Trp Val Thr Ser Asn Phe Met Val Cys Val Ala Met Ala 370 375 380Ala Thr Ala Leu Ile Ser Phe Trp Ser Leu Arg Asp Tyr His Gly Tyr385 390 395 400Val Gln Asp Ala Ile Thr Ala Asn Ala Ser Ile Lys Ala Val Cys Leu 405 410 415Val Leu Phe Ala Phe Leu Gly Val Pro Leu Ala Ile Leu Tyr Ser Val 420 425 430Pro Phe Ala Val Thr Ala Gln Leu Ala Ala Thr Arg Gly Gly Gly Gln 435 440 445Gly Leu Cys Thr Gly Val Leu Asn Ile Ser Ile Val Ile Pro Gln Val 450 455 460Ile Ile Ala Leu Gly Ala Gly Pro Trp Asp Ala Leu Phe Gly Lys Gly465 470 475 480Asn Ile Pro Ala Phe Gly Val Ala Ser Ala Phe Ala Leu Val Gly Gly 485 490 495Val Val Gly Val Phe Leu Leu Pro Lys Ile Ser Lys Arg Gln Phe Arg 500 505 510Ala Val Ser Ala Gly Gly His 5153825DNAZea mays 3gcacgagtta agttggatct cttctgatct gtactcaagc aaacttcatc acatcatcgg 60ggcaaataaa acagtcaaga tcacggcatt ggttgttttc tctcttctcg gattgccact 120ctccatcact tacagcgttc cgttttctgt gactgctgag ctgactgccg gtacaggagg 180tggacaaggt ttggccacag gagtcctaaa tcttgctatc gtggttcccc agatagtagt 240gtcgcttgga gcaggtccat gggacgctct gtatggagga gggaataccc cggcgttcgt 300cttggcttcg gtcttctccc tggcagcagg tgtgctcgca gttctcaagc tgccaaagct 360gtccaactcg taccaatctg ccgggttcca tggatttggc tgatgctcat gcccaaaaca 420cccccgtctg ccatgtaaaa catcacacca acacttggcc ccattttgcc attcgtttac 480agagaaatga ttcttttttc ctcgtacaac tacagaataa tgacagtgaa agtaggagtt 540taggtgagag agagagagag gctaggtagg ttgatgtgaa ggtgtaaaag ctgtatcctc 600ctttttttgt ttttgttttt gtttttgaca gtgtatgtaa gagctgtcca caagaaaatt 660tacaagtggt gtaacctgcc ctcgtttgta cattgtacta ctactacatg acaatcatat 720gtcctttgtc tttatccaag gttgaagacg taaactgagg ccatctattt atcttgggcc 780atgaaaaaaa aaaaaaaaaa aaaaaaaact cgaaactagt tctct 8254133PRTZea mays 4His Glu Leu Ser Trp Ile Ser Ser Asp Leu Tyr Ser Ser Lys Leu His1 5 10 15His Ile Ile Gly Ala Asn Lys Thr Val Lys Ile Thr Ala 
Leu Val Val 20 25 30Phe Ser Leu Leu Gly Leu Pro Leu Ser Ile Thr Tyr Ser Val Pro Phe 35 40 45Ser Val Thr Ala Glu Leu Thr Ala Gly Thr Gly Gly Gly Gln Gly Leu 50 55 60Ala Thr Gly Val Leu Asn Leu Ala Ile Val Val Pro Gln Ile Val Val65 70 75 80Ser Leu Gly Ala Gly Pro Trp Asp Ala Leu Tyr Gly Gly Gly Asn Thr 85 90 95Pro Ala Phe Val Leu Ala Ser Val Phe Ser Leu Ala Ala Gly Val Leu 100 105 110Ala Val Leu Lys Leu Pro Lys Leu Ser Asn Ser Tyr Gln Ser Ala Gly 115 120 125Phe His Gly Phe Gly 13051977DNAZea mays 5gcggcggacc acgtggcgcc gatcagcctc ggcaggctca tcctcgccgg catggtcgcc 60ggcggcgtgc agtacggctg ggcgctgcag ctctccctcc tcacgcccta cgtgcagact 120ctggggctct cacatgccct cacttcattc atgtggctat gcggtcctat tgctggctta 180gtggtccaac cgctggttgg cctgtacagc gataggtgca cagcaagatg gggaagacgc 240aggccattta tcctgatagg atgcatgctc atctgccttg ccgtcattgt tgttggcttc 300tcgtccgaca tcggagctgc tctaggggac acaaaggaac actgcagcct ctaccacggc 360cctcgttggc atgctgcgat cgtgtacgtt ctggggtttt ggctccttga cttctccaac 420aatactgtgc aaggtccagc gcgtgctatg atggctgatc tgtgcggtca tcatgggcct 480agtgcagcca actcaatctt ctgttcttgg atggcgctgg gaaacatcct aggctattcc 540tctggctcca caaacaactg gcacaagtgg tttccgttcc ttatgacaaa cgcgtgctgt 600gaagcctgcg caaacctgaa aggcgcgttt ctggtggctg tggtgttcct aatcatgtgc 660ttgactataa ccctgttctt cgccaaggaa gtgccctaca gaggaaacca gaacctcccc 720acaaaggcaa acggcgaggt cgagactgaa ccttccggcc cactcgctgt gctcaagggc 780ttcaagaact tgcccacggg gatgccgtcc gtgctcctcg taactggact cacctggctc 840tcttggttcc cgttcatcct ctacgacacc gactggatgg gccgtgagat ctaccacggc 900gaccccaagg gtagcaacgc tcagatctcg gcgttcgacg aaggcgtcag agttggctcg 960ttcgggctgc tgctcaactc gatcgttcta ggattcagct cgttcctgat cgagcccatg 1020tgccggaagg tcgggccgag ggtggtgtgg gtgacgagca acttcatggt ctgcgtcgcc 1080atggcggcca ccgcgctgat cagcttctgg tcgctcaagg actaccacgg atacgtgcag 1140gacgccatca ccgccagcac gagcatcaag gccgtctgcc tcgtcctctt cgcgttcctg 1200ggtgtccctc tcgccatcct gtacagcgtc ccgttcgcgg tgacggcgca gctggcggcc 1260acgaagggcg gcgggcaggg gctgtgcacc ggcgtgctca 
acatctccat cgtcatccct 1320caggtgatca tcgcgctggg cgcgggcccg tgggacgcgc tgttcggcaa gggcaacatc 1380ccggcgttcg gcgtggcgtc ggggttcgcc ctcatcggcg gcgtcgtggg cgtgttcctg 1440ctgcccaaga tctccaagcg ccagttccgc gccgtcagcg cgggcggcca ctgatcgcgg 1500ccgccgcgcc ggagcacggc acggcggcac agcccagccg tgctagagct gtatgttttg 1560aaagttgaaa cagaataaga agcgggcgaa acgagaaaac catgcatgtc atgtgtgtgc 1620ttttgttgtg tgtggggtgg ggcaagcgag gcgaggtgtg tggaggtgaa gtgaaggtga 1680gcatatccag caccagctgg taccaaggtc gggtctctgt gctagtgcta ttagctagtg 1740taaggagcga gtaggtcagt taaggctggt gcgtcgtgag ggctgtcttg tgtgtagcta 1800cagcagacgg ttcatcagaa ggattattcg tgcagtatat acagtacaac tagacaatga 1860tgttgatgat tggtctagag ctagaggcct atagccctat actactgtgt attgtccgcc 1920gttttagttt tttggtccca tcccatcaat gcaaccgcct tgttttaaaa aaaaaaa 19776497PRTZea mays 6Ala Ala Asp His Val Ala Pro Ile Ser Leu Gly Arg Leu Ile Leu Ala1 5 10 15Gly Met Val Ala Gly Gly Val Gln Tyr Gly Trp Ala Leu Gln Leu Ser 20 25 30Leu Leu Thr Pro Tyr Val Gln Thr Leu Gly Leu Ser His Ala Leu Thr 35 40 45Ser Phe Met Trp Leu Cys Gly Pro Ile Ala Gly Leu Val Val Gln Pro 50 55 60Leu Val Gly Leu Tyr Ser Asp Arg Cys Thr Ala Arg Trp Gly Arg Arg65 70 75 80Arg Pro Phe Ile Leu Ile Gly Cys Met Leu Ile Cys Leu Ala Val Ile 85 90 95Val Val Gly Phe Ser Ser Asp Ile Gly Ala Ala Leu Gly Asp Thr Lys 100 105 110Glu His Cys Ser Leu Tyr His Gly Pro Arg Trp His Ala Ala Ile Val 115 120 125Tyr Val Leu Gly Phe Trp Leu Leu Asp Phe Ser Asn Asn Thr Val Gln 130 135 140Gly Pro Ala Arg Ala Met Met Ala Asp Leu Cys Gly His His Gly Pro145 150 155 160Ser Ala Ala Asn Ser Ile Phe Cys Ser Trp Met Ala Leu Gly Asn Ile 165 170 175Leu Gly Tyr Ser Ser Gly Ser Thr Asn Asn Trp His Lys Trp Phe Pro 180 185 190Phe Leu Met Thr Asn Ala Cys Cys Glu Ala Cys Ala Asn Leu Lys Gly 195 200 205Ala Phe Leu Val Ala Val Val Phe Leu Ile Met Cys Leu Thr Ile Thr 210 215 220Leu Phe Phe Ala Lys Glu Val Pro Tyr Arg Gly Asn Gln Asn Leu Pro225 230 235 240Thr Lys Ala Asn Gly Glu Val Glu Thr Glu Pro Ser Gly Pro Leu Ala 245 250 
255Val Leu Lys Gly Phe Lys Asn Leu Pro Thr Gly Met Pro Ser Val Leu 260 265 270Leu Val Thr Gly Leu Thr Trp Leu Ser Trp Phe Pro Phe Ile Leu Tyr 275 280 285Asp Thr Asp Trp Met Gly Arg Glu Ile Tyr His Gly Asp Pro Lys Gly 290 295 300Ser Asn Ala Gln Ile Ser Ala Phe Asp Glu Gly Val Arg Val Gly Ser305 310 315 320Phe Gly Leu Leu Leu Asn Ser Ile Val Leu Gly Phe Ser Ser Phe Leu 325 330 335Ile Glu Pro Met Cys Arg Lys Val Gly Pro Arg Val Val Trp Val Thr 340 345 350Ser Asn Phe Met Val Cys Val Ala Met Ala Ala Thr Ala Leu Ile Ser 355 360 365Phe Trp Ser Leu Lys Asp Tyr His Gly Tyr Val Gln Asp Ala Ile Thr 370 375 380Ala Ser Thr Ser Ile Lys Ala Val Cys Leu Val Leu Phe Ala Phe Leu385 390 395 400Gly Val Pro Leu Ala Ile Leu Tyr Ser Val Pro Phe Ala Val Thr Ala 405 410 415Gln Leu Ala Ala Thr Lys Gly Gly Gly Gln Gly Leu Cys Thr Gly Val 420 425 430Leu Asn Ile Ser Ile Val Ile Pro Gln Val Ile Ile Ala Leu Gly Ala 435 440 445Gly Pro Trp Asp Ala Leu Phe Gly Lys Gly Asn Ile Pro Ala Phe Gly 450 455 460Val Ala Ser Gly Phe Ala Leu Ile Gly Gly Val Val Gly Val Phe Leu465 470 475 480Leu Pro Lys Ile Ser Lys Arg Gln Phe Arg Ala Val Ser Ala Gly Gly 485 490 495His 71653DNAOryza sativa 7gcacgagatc actgcttcca tcgctgccgc agttctcacc gtcggattct ccgccgacct 60cggccgaatc ttcggcgatt ccatcacccc gggctccacc cgcctcggcg ccatcatcgt 120ctacctcgtc ggcttctggc tcctcgacgt cggcaacaac gctacacagg gaccctgcag 180ggccttcctc gccgacctca ccgagaatga cccaaggagg actcggatag ctaatgctta 240cttctcattg ttcatggccc tgggaaacat acttggatat gccactggag catacagtgg 300ctggtacaag atattcccgt tcaccgttac tccatcatgt agcatcagct gtgccaactt 360caagtctgcc tttctacttg atattatcat tttggtggtc actacatgca tcactgtagc 420atcagtgcaa gagcctcaat cctttggaag tgatgaagca gatcacccta gcacagaaca 480ggaagctttc ctctgggaac tttttggatc attccggtac tttacattac cggtttggat 540ggttttgatt gttactgccc tcacatggat tggatggttt ccatttatcc tctttgatac 600cgattggatg ggtcgagaga tctatcgtgg aagtccagat gatccaagta taactcagag 660ctatcatgat ggtgtgagaa tgggttcttt tggtctgatg ctgaactcgg tccttcttgg 
720attcacttct attgtactag agaagttatg tcggaagtgg ggagctggac tggtgtgggg 780tgtctccaat atcctaatgg cattgtgctt tgtggcaatg cttgtaataa catatgtggc 840aaagaatatg gattatccac ctagtggagt accaccaacc ggcattgtca ttgcttccct 900ggtagttttt acaattttag gagcgcccct ggcgatcacg tacagtatac catatgcaat 960ggctgctagt cgggttgaaa atctgggact tggccaaggt ctagcaatgg gcattcttaa 1020tttggctatt gtcataccac aggttattgt gtcactgggt agcgggccct gggaccaact 1080gtttggtggt ggcaatgcac cagcctttgc agtggctgct gctgcatctt ttatcggtgg 1140gctggtggct attctgggcc ttccacgagc ccgcattgca tcaaggagga gaggtcaccg 1200ataagaatat tgctacatat aaattgtcgg ccattctttg caattcgact cataagaggc 1260actcggaacg ctatgcagtg catgggggaa ttgtatatta tctccgaatc aagaagggga 1320taatgcttgc tttctccatg agctattttt gcctttttca tgccggatca tcatatgctg 1380tcgtacattg gatgatctta tgctgttgta cattggatgt tggtcatttg tagagatact 1440agtgaataaa agttgcagga gttggttcac tcgagaaaat tctggtcagt atgtcgtcca 1500tctgctgcac gacagcagtt aggagccgaa tagcatgtcc atgggttttc atcaaatgtt 1560gtatcatcat ttgttttttg atacgttcag acggcttcag tgctgtgtga atatatatgt 1620atggaatata tcgagaaaaa aaaaaaaaaa aaa 16538400PRTOryza sativa 8His Glu Ile Thr Ala Ser Ile Ala Ala Ala Val Leu Thr Val Gly Phe1 5 10 15Ser Ala Asp Leu Gly Arg Ile Phe Gly Asp Ser Ile Thr Pro Gly Ser 20 25 30Thr Arg Leu Gly Ala Ile Ile Val Tyr Leu Val Gly Phe Trp Leu Leu 35 40 45Asp Val Gly Asn Asn Ala Thr Gln Gly Pro Cys Arg Ala Phe Leu Ala 50 55 60Asp Leu Thr Glu Asn Asp Pro Arg Arg Thr Arg Ile Ala Asn Ala Tyr65 70 75 80Phe Ser Leu Phe Met Ala Leu Gly Asn Ile Leu Gly Tyr Ala Thr Gly 85 90 95Ala Tyr Ser Gly Trp Tyr Lys Ile Phe Pro Phe Thr Val Thr Pro Ser 100 105 110Cys Ser Ile Ser Cys Ala Asn Phe Lys Ser Ala Phe Leu Leu Asp Ile 115 120 125Ile Ile Leu Val Val Thr Thr Cys Ile Thr Val Ala Ser Val Gln Glu 130 135 140Pro Gln Ser Phe Gly Ser Asp Glu Ala Asp His Pro Ser Thr Glu Gln145 150 155 160Glu Ala Phe Leu Trp Glu Leu Phe Gly Ser Phe Arg Tyr Phe Thr Leu 165 170 175Pro Val Trp Met Val Leu Ile Val Thr Ala Leu Thr Trp Ile Gly Trp 180 185 
190Phe Pro Phe Ile Leu Phe Asp Thr Asp Trp Met Gly Arg Glu Ile Tyr 195 200 205Arg Gly Ser Pro Asp Asp Pro Ser Ile Thr Gln Ser Tyr His Asp Gly 210 215 220Val Arg Met

Gly Ser Phe Gly Leu Met Leu Asn Ser Val Leu Leu Gly225 230 235 240Phe Thr Ser Ile Val Leu Glu Lys Leu Cys Arg Lys Trp Gly Ala Gly 245 250 255Leu Val Trp Gly Val Ser Asn Ile Leu Met Ala Leu Cys Phe Val Ala 260 265 270Met Leu Val Ile Thr Tyr Val Ala Lys Asn Met Asp Tyr Pro Pro Ser 275 280 285Gly Val Pro Pro Thr Gly Ile Val Ile Ala Ser Leu Val Val Phe Thr 290 295 300Ile Leu Gly Ala Pro Leu Ala Ile Thr Tyr Ser Ile Pro Tyr Ala Met305 310 315 320Ala Ala Ser Arg Val Glu Asn Leu Gly Leu Gly Gln Gly Leu Ala Met 325 330 335Gly Ile Leu Asn Leu Ala Ile Val Ile Pro Gln Val Ile Val Ser Leu 340 345 350Gly Ser Gly Pro Trp Asp Gln Leu Phe Gly Gly Gly Asn Ala Pro Ala 355 360 365Phe Ala Val Ala Ala Ala Ala Ser Phe Ile Gly Gly Leu Val Ala Ile 370 375 380Leu Gly Leu Pro Arg Ala Arg Ile Ala Ser Arg Arg Arg Gly His Arg385 390 395 40092375DNAOryza sativa 9gcacgaggtt ctaacccgcg ccttcgccga gggaggccga ccaacgcatc aatcaaacac 60acaagcacac cacgcggacg cagcagcagg ggaggagaca atttcctatt cttcctcgcc 120ccgcgtcgcc tcgcctgagt ctgactctcc aaacgccgac cagtgacgcc gcgagccttg 180ccccttgccc gcgcagatct caccaaaccc taccagatct gcgccccgcc atggactccg 240ccgccggcgg tggcggcctc acggccatcc gcctgcccta ccgccacctc cgcgacgccg 300agatggagct cgtcagcctc aacggcggca ccccccgcgg aggctccccc aaggaccccg 360acgccacgca ccagcagggg ccccccgccg cccgtaccac caccaccagg aagctcgtcc 420tcgcctgcat ggtcgccgcc ggcgtgcagt tcggctgggc gcttcagctc tcgctcctca 480cgccctacat ccagacccta ggaatagacc atgccatggc atcattcatt tggctttgtg 540gacctattac tggttttgtg gttcaaccat gtgttggtgt ctggagtgac aaatgccgtt 600caaagtatgg aagaaggaga ccgttcattt tggctggatg cttgatgata tgctttgctg 660taactttaat cggattttct gcagaccttg gttacatttt aggagatacc actgagcact 720gcagtacata taaaggttca agatttcgag cagctattat tttcgttctt gggttctgga 780tgttggatct cgcaaacaat acagttcaag gtcctgctcg tgccctttta gctgaccttt 840caggtcctga tcagtgtaat tctgcaaatg caattttttg cacatggatg gctgttggaa 900acgttcttgg tttttcatct ggtgctagtg ggaattggca caagtggttt ccttttctaa 960tgacaagagc atgctgtgaa gcttgtagta atttgaaagc 
cgcttttctg gttgcagttg 1020tattcctttt gttttgtatg tctgttaccc tgtactttgc tgaagagatc ccgctggaac 1080caacagatgc acaacgatta tctgattctg cgcctctcct gaatggttct agagatgata 1140acaatgcatc aaatgaacct cgtaatggag cacttcctaa tggtcataca gatggaagca 1200atgtcccagc taactccaac gctgaggact ccaattcaaa cagagagaat gtcgaagttt 1260tcaatgatgg accaggagca gttttggtga atattttgac tagcatgagg catctacctc 1320ctggaatgta ctctgttctt ctagttatgg ctctaacatg gttgtcgtgg tttccctttt 1380tcctttttga tactgactgg atgggacgtg aggtttacca tggggaccca aatggcaact 1440tgagtgaaag gaaagcttat gacaacggtg tccgagaagg tgcatttggt ttgctattga 1500attcagttgt ccttggaatt gggtccttcc ttgttgatcc actatgccga ctgatgggtg 1560ctagactggt ttgggcaatc agcaacttca cagtgtttat ctgcatgctg gctacagcaa 1620tattaagttg gatctctttt gatttgtact caagtaaact tcaccacatc attggagcaa 1680ataaaacagt gaagaattca gccttgattg ttttctccct acttggactg ccactctcga 1740tcacatatag cgttcctttt tctgtgactg ctgagctgac tgctggaaca ggaggtggac 1800aaggtctggc aacaggagtc ctgaaccttg caatcgttgt tccgcagata gtagtgtcac 1860taggagcagg tccatgggat gctctctttg ggggagggaa cgtccctgct ttcgccttgg 1920cttccgtttt ctcactagga gctggtgtcc tcgcggtcct taagctaccc aagctgccaa 1980actcttacag atctgctggg ttccatggat ttggctgagc agaacaccag ccgcatggtg 2040tgtaacattg agaaatgcaa ctccattttg ccattcgttt acagtgaaat gattcttttt 2100acctactact acaacagaat aagctgaaaa gatagagatt aggatagaga gctaggtaac 2160tagtccagtt aggttgatgt gcatacaagg caattggaag gtgtaagagc tgtatctact 2220tttttgacag aaaaatgtaa gctctgcccg aatgacatgg cggatagatt ttacaatgga 2280tgtaatcatg tactatatat aacacgtttt ggtcacagct tgccaagttt catgtatagt 2340actgctacta aaaaaaaaaa aaaaaaaaaa aaaaa 237510667PRTOryza sativa 10Pro Ala Pro Ser Pro Arg Glu Ala Asp Gln Arg Ile Asn Gln Thr His1 5 10 15Lys His Thr Thr Arg Thr Gln Gln Gln Gly Arg Arg Gln Phe Pro Ile 20 25 30Leu Pro Arg Pro Ala Ser Pro Arg Leu Ser Leu Thr Leu Gln Thr Pro 35 40 45Thr Ser Asp Ala Ala Ser Leu Ala Pro Cys Pro Arg Arg Ser His Gln 50 55 60Thr Leu Pro Asp Leu Arg Pro Ala Met Asp Ser Ala Ala Gly Gly Gly65 70 75 80Gly 
Leu Thr Ala Ile Arg Leu Pro Tyr Arg His Leu Arg Asp Ala Glu 85 90 95Met Glu Leu Val Ser Leu Asn Gly Gly Thr Pro Arg Gly Gly Ser Pro 100 105 110Lys Asp Pro Asp Ala Thr His Gln Gln Gly Pro Pro Ala Ala Arg Thr 115 120 125Thr Thr Thr Arg Lys Leu Val Leu Ala Cys Met Val Ala Ala Gly Val 130 135 140Gln Phe Gly Trp Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr Ile Gln145 150 155 160Thr Leu Gly Ile Asp His Ala Met Ala Ser Phe Ile Trp Leu Cys Gly 165 170 175Pro Ile Thr Gly Phe Val Val Gln Pro Cys Val Gly Val Trp Ser Asp 180 185 190Lys Cys Arg Ser Lys Tyr Gly Arg Arg Arg Pro Phe Ile Leu Ala Gly 195 200 205Cys Leu Met Ile Cys Phe Ala Val Thr Leu Ile Gly Phe Ser Ala Asp 210 215 220Leu Gly Tyr Ile Leu Gly Asp Thr Thr Glu His Cys Ser Thr Tyr Lys225 230 235 240Gly Ser Arg Phe Arg Ala Ala Ile Ile Phe Val Leu Gly Phe Trp Met 245 250 255Leu Asp Leu Ala Asn Asn Thr Val Gln Gly Pro Ala Arg Ala Leu Leu 260 265 270Ala Asp Leu Ser Gly Pro Asp Gln Cys Asn Ser Ala Asn Ala Ile Phe 275 280 285Cys Thr Trp Met Ala Val Gly Asn Val Leu Gly Phe Ser Ser Gly Ala 290 295 300Ser Gly Asn Trp His Lys Trp Phe Pro Phe Leu Met Thr Arg Ala Cys305 310 315 320Cys Glu Ala Cys Ser Asn Leu Lys Ala Ala Phe Leu Val Ala Val Val 325 330 335Phe Leu Leu Phe Cys Met Ser Val Thr Leu Tyr Phe Ala Glu Glu Ile 340 345 350Pro Leu Glu Pro Thr Asp Ala Gln Arg Leu Ser Asp Ser Ala Pro Leu 355 360 365Leu Asn Gly Ser Arg Asp Asp Asn Asn Ala Ser Asn Glu Pro Arg Asn 370 375 380Gly Ala Leu Pro Asn Gly His Thr Asp Gly Ser Asn Val Pro Ala Asn385 390 395 400Ser Asn Ala Glu Asp Ser Asn Ser Asn Arg Glu Asn Val Glu Val Phe 405 410 415Asn Asp Gly Pro Gly Ala Val Leu Val Asn Ile Leu Thr Ser Met Arg 420 425 430His Leu Pro Pro Gly Met Tyr Ser Val Leu Leu Val Met Ala Leu Thr 435 440 445Trp Leu Ser Trp Phe Pro Phe Phe Leu Phe Asp Thr Asp Trp Met Gly 450 455 460Arg Glu Val Tyr His Gly Asp Pro Asn Gly Asn Leu Ser Glu Arg Lys465 470 475 480Ala Tyr Asp Asn Gly Val Arg Glu Gly Ala Phe Gly Leu Leu Leu Asn 485 490 495Ser Val Val Leu Gly Ile Gly Ser Phe Leu 
Val Asp Pro Leu Cys Arg 500 505 510Leu Met Gly Ala Arg Leu Val Trp Ala Ile Ser Asn Phe Thr Val Phe 515 520 525Ile Cys Met Leu Ala Thr Ala Ile Leu Ser Trp Ile Ser Phe Asp Leu 530 535 540Tyr Ser Ser Lys Leu His His Ile Ile Gly Ala Asn Lys Thr Val Lys545 550 555 560Asn Ser Ala Leu Ile Val Phe Ser Leu Leu Gly Leu Pro Leu Ser Ile 565 570 575Thr Tyr Ser Val Pro Phe Ser Val Thr Ala Glu Leu Thr Ala Gly Thr 580 585 590Gly Gly Gly Gln Gly Leu Ala Thr Gly Val Leu Asn Leu Ala Ile Val 595 600 605Val Pro Gln Ile Val Val Ser Leu Gly Ala Gly Pro Trp Asp Ala Leu 610 615 620Phe Gly Gly Gly Asn Val Pro Ala Phe Ala Leu Ala Ser Val Phe Ser625 630 635 640Leu Gly Ala Gly Val Leu Ala Val Leu Lys Leu Pro Lys Leu Pro Asn 645 650 655Ser Tyr Arg Ser Ala Gly Phe His Gly Phe Gly 660 665111885DNAGlycine max 11gcacgaggag agaaagagaa aacatttaaa aaaatataaa aaaaaataaa cctctttctc 60tctctgaatt tctaagcctc tctcaaaata atggaggagc cacaaccagg acccagcccg 120ttacgcaaaa tgattttggt gtcgtcaatg gcggccggta tccaattcgg gtgggcccta 180cagctctccc ttctcacccc atatgttcaa accctaggcg tcccgcatgc ttgggcctca 240tttatttggc tatgtggccc gatatctggg ctgctggtgc agcccattgt gggctacagc 300agcgaccgat gccaatcccg tttcggtcgt cgccgtccct ttatcctagc cgggtctttg 360gccgtcgcca ttgctgtgtt cctaattggt tacgcggccg atataggaca cgcggcaggc 420gacaacctga cccaaaagac tcggccacgt gcagtggcga tcttcgtgat cgggttttgg 480atcctcgacg tggctaacaa catgctccag ggtccatgcc gtgcctttct gggcgacctc 540gctgccgggg atgagaaaaa gacaaaggca gccaatgcct tcttctcttt cttcatggcc 600gtcggcaaca tcctgggcta tgctgcggga tcctacgacg gcctccaccg cctcttcccc 660ttcacggaaa ccgaggcatg caacgtcttc tgcgcaaacc tcaagagttg cttcttcttc 720gctatcgtcc tcctggtggt cctcaccacc ttggtgctga ttaccgtgaa agaaactccc 780tacacgccaa aggcagagaa ggaaaccgaa gatgcagaga agacacactt ctcgtgcttc 840tgcggagaac tttgtcttgc attcaagggg ctgaagaggc caatgtggat gttgatgttg 900gtgaccgccg tgaactggat agcgtggttc ccttacttct tgttcgacac cgattggatg 960ggtcgtgagg tgtacggtgg tgacgtgggg cagaaggcgt acgattcggg agttcatgca 1020ggttctctag ggctaatgtt gaatgcggtg 
gtgttggctg tgatgtcatt ggcaattgaa 1080ccgttggggc gtgtggttgg gggaatcaag tggttgtggg gaatcgttaa catcttgttg 1140gctatatgct tgggaatgac cgttctcatc acaaagatcg ctgagcatga acgtcttctt 1200aaccctgctt tggttgggaa cccttccctc ggtatcaaag ttggttccat ggttttcttc 1260tctgtccttg gaatccctct tgcgattact ttcagtgtcc catttgctct agcatctata 1320tactccagca cttccggagc aggccaaggt ctatctttgg gtgtccttaa tattgcaatt 1380gtcgttccac agatgatagt atcaaccata agtggacctt gggatgcctt gttcggcggt 1440ggaaacttgc ctgcattcgt gttgggtgcg gtggccgccg tcgtgagtgc aatattagca 1500gttcttctgc tgccaactcc aaagaaagct gatgaggtca gggcttctag cctcaacatg 1560ggaagtttgc attagtgtgt ctattatagg gctttacatg tttcactttc aaccttgctt 1620tgatatggga aaaagaactt agtctttaga ttcgaagtgg gtgtgtgcat gtgtatatta 1680ggtattagac atgggtttta gatgcttcca tagccacttt atgtccaagg acaatcatta 1740atttgtaaac tttggtgcga caattatacc gaatagaaaa tcattaaaca tacatctttt 1800tatttcacac attaaaaaaa tatcataata aatatatata ttatcatatt ataaaagaaa 1860tatttgaaaa aaaaaaaaaa aaaaa 188512494PRTGlycine max 12Met Glu Glu Pro Gln Pro Gly Pro Ser Pro Leu Arg Lys Met Ile Leu1 5 10 15Val Ser Ser Met Ala Ala Gly Ile Gln Phe Gly Trp Ala Leu Gln Leu 20 25 30Ser Leu Leu Thr Pro Tyr Val Gln Thr Leu Gly Val Pro His Ala Trp 35 40 45Ala Ser Phe Ile Trp Leu Cys Gly Pro Ile Ser Gly Leu Leu Val Gln 50 55 60Pro Ile Val Gly Tyr Ser Ser Asp Arg Cys Gln Ser Arg Phe Gly Arg65 70 75 80Arg Arg Pro Phe Ile Leu Ala Gly Ser Leu Ala Val Ala Ile Ala Val 85 90 95Phe Leu Ile Gly Tyr Ala Ala Asp Ile Gly His Ala Ala Gly Asp Asn 100 105 110Leu Thr Gln Lys Thr Arg Pro Arg Ala Val Ala Ile Phe Val Ile Gly 115 120 125Phe Trp Ile Leu Asp Val Ala Asn Asn Met Leu Gln Gly Pro Cys Arg 130 135 140Ala Phe Leu Gly Asp Leu Ala Ala Gly Asp Glu Lys Lys Thr Lys Ala145 150 155 160Ala Asn Ala Phe Phe Ser Phe Phe Met Ala Val Gly Asn Ile Leu Gly 165 170 175Tyr Ala Ala Gly Ser Tyr Asp Gly Leu His Arg Leu Phe Pro Phe Thr 180 185 190Glu Thr Glu Ala Cys Asn Val Phe Cys Ala Asn Leu Lys Ser Cys Phe 195 200 205Phe Phe Ala Ile Val Leu Leu Val Val 
Leu Thr Thr Leu Val Leu Ile 210 215 220Thr Val Lys Glu Thr Pro Tyr Thr Pro Lys Ala Glu Lys Glu Thr Glu225 230 235 240Asp Ala Glu Lys Thr His Phe Ser Cys Phe Cys Gly Glu Leu Cys Leu 245 250 255Ala Phe Lys Gly Leu Lys Arg Pro Met Trp Met Leu Met Leu Val Thr 260 265 270Ala Val Asn Trp Ile Ala Trp Phe Pro Tyr Phe Leu Phe Asp Thr Asp 275 280 285Trp Met Gly Arg Glu Val Tyr Gly Gly Asp Val Gly Gln Lys Ala Tyr 290 295 300Asp Ser Gly Val His Ala Gly Ser Leu Gly Leu Met Leu Asn Ala Val305 310 315 320Val Leu Ala Val Met Ser Leu Ala Ile Glu Pro Leu Gly Arg Val Val 325 330 335Gly Gly Ile Lys Trp Leu Trp Gly Ile Val Asn Ile Leu Leu Ala Ile 340 345 350Cys Leu Gly Met Thr Val Leu Ile Thr Lys Ile Ala Glu His Glu Arg 355 360 365Leu Leu Asn Pro Ala Leu Val Gly Asn Pro Ser Leu Gly Ile Lys Val 370 375 380Gly Ser Met Val Phe Phe Ser Val Leu Gly Ile Pro Leu Ala Ile Thr385 390 395 400Phe Ser Val Pro Phe Ala Leu Ala Ser Ile Tyr Ser Ser Thr Ser Gly 405 410 415Ala Gly Gln Gly Leu Ser Leu Gly Val Leu Asn Ile Ala Ile Val Val 420 425 430Pro Gln Met Ile Val Ser Thr Ile Ser Gly Pro Trp Asp Ala Leu Phe 435 440 445Gly Gly Gly Asn Leu Pro Ala Phe Val Leu Gly Ala Val Ala Ala Val 450 455 460Val Ser Ala Ile Leu Ala Val Leu Leu Leu Pro Thr Pro Lys Lys Ala465 470 475 480Asp Glu Val Arg Ala Ser Ser Leu Asn Met Gly Ser Leu His 485 490131041DNAGlycine maxmisc_feature(1007)..(1007)n is a, c, g, or t 13gcacgagctc acactctctc tttctttctt cctgctgcta caatatggag cctctctctt 60ccaccaaaca caacaacaat ctctccaagc cttcctccct ccacacggag gctccgccgc 120cggaggccag tcccctccgg aagatcatgg tggtggcctc catcgccgcc ggggtgcaat 180tcgggtgggc cctacagctc tctctactta ccccttacgt ccaactgctg gggattcccc 240acacttgggc cgccttcatc tggctctgcg gcccaatctc cggcatgctc gtccagccca 300tcgtgggata ccacagcgac cgctgcacct cccgcttcgg ccgccgccgc cccttcatcg 360ccgccggctc cctcgccgtc gccatcgccg tcttccttat cggctacgcc gccgacctcg 420gccacatgtt cggcgactcc ctagccaaaa aaaccgcccc gcgccatcgc atcttcgttg 480tcggcttctg gattctcgac gtcgcaaaca acatgctaca agggccctgc cgcgccctcc 
540tgggcgacct ctgcgccgga gaacaacgga aaacgcgaaa cgcaaacgcc ttcttctcct 600tcttcatggc cgtcggaaac gtcctgggct acgccgcggg ctcttacagc ggcctccaca 660acgtcttccc tttcactaaa acaaaagcat gtgatgttta ctgcgcgaat ttgaagagtt 720gtttcttcct ctccatcgcg cttcttctca ctctctccac aatcgccttg acctacgtga 780aggagaaaac ggtgtcgtca gagaaaacgg tgaggagttc ggtggaggag gatgggtccc 840acgggggcat gccgtgcttc gggcaattat tcggtgcgtt ccgcgaactg aagcgtccca 900tgtggatcct tctgttggtg acgtgtctga actgggattg cctggttcct tttttgctat 960tcgacaccga ctgggattgg ggcgtgaggt gtacggaggg aaaattnggg gaaaggaaag 1020ggtacgataa ggggttccgt t 104114322PRTGlycine maxmisc_feature(311)..(311)Xaa can be any naturally occurring amino acid 14Met Glu Pro Leu Ser Ser Thr Lys His Asn Asn Asn Leu Ser Lys Pro1 5 10 15Ser Ser Leu His Thr Glu Ala Pro Pro Pro Glu Ala Ser Pro Leu Arg 20 25 30Lys Ile Met Val Val Ala Ser Ile Ala Ala Gly Val Gln Phe Gly Trp 35 40 45Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr Val Gln Leu Leu Gly Ile 50 55 60Pro His Thr Trp Ala Ala Phe Ile Trp Leu Cys Gly Pro Ile Ser Gly65 70 75 80Met Leu Val Gln Pro Ile Val Gly Tyr His Ser Asp Arg Cys Thr Ser 85 90 95Arg Phe Gly Arg Arg Arg Pro Phe Ile Ala Ala Gly Ser Leu Ala Val 100 105 110Ala Ile Ala Val Phe Leu Ile Gly Tyr Ala Ala Asp Leu Gly His Met 115 120 125Phe Gly Asp Ser Leu Ala Lys Lys Thr Ala Pro Arg His Arg Ile Phe 130 135 140Val Val Gly Phe Trp Ile Leu Asp Val Ala Asn Asn Met Leu Gln Gly145 150 155 160Pro Cys Arg Ala Leu Leu Gly Asp Leu Cys Ala Gly Glu Gln Arg Lys 165 170 175Thr Arg Asn Ala Asn Ala Phe Phe Ser Phe Phe Met Ala Val Gly Asn 180 185 190Val Leu Gly Tyr Ala Ala Gly Ser Tyr Ser Gly Leu His Asn Val Phe 195 200 205Pro Phe Thr Lys Thr Lys Ala Cys Asp Val Tyr Cys
Ala Asn Leu Lys 210 215 220Ser Cys Phe Phe Leu Ser Ile Ala Leu Leu Leu Thr Leu Ser Thr Ile225 230 235 240Ala Leu Thr Tyr Val Lys Glu Lys Thr Val Ser Ser Glu Lys Thr Val 245 250 255Arg Ser Ser Val Glu Glu Asp Gly Ser His Gly Gly Met Pro Cys Phe 260 265 270Gly Gln Leu Phe Gly Ala Phe Arg Glu Leu Lys Arg Pro Met Trp Ile 275 280 285Leu Leu Leu Val Thr Cys Leu Asn Trp Asp Cys Leu Val Pro Phe Leu 290 295 300Leu Phe Asp Thr Asp Trp Xaa Gly Arg Glu Val Tyr Gly Gly Lys Ile305 310 315 320Xaa Gly15578DNAVernonia mespilifolia 15gcacgaggtt ggcttggcgg tgtgaaacgg ttatggggtg gcatcaattt ccttctagct 60gtttgtttgg ccatgacggt ggtggtgacc aaaatggcag actctgaacg acagtttaag 120acgttgcccg acggtagcaa aaccgcgttg ccaccaggcg gcgacattaa agccggtgct 180ttgtcaattt ttgccgtcct cggtgcccca ctagctgtga ctttcagtgt tccatgtgct 240cttgcatcaa tattttctaa cagttcagga gctggacaag gtctatcact tggtgttttg 300aatctagcaa tcgtcatacc acagatgttc gtatcagtac taagtggacc atgggacgca 360ctgttcggcg gtggaaactt accagcattt gtggttggag caatttcggc tgcagtaagt 420gggatattat cgttcaccat gcttccttcg ccacccccag atgtcgtact ttcaaaggtt 480tccggaggtg ggatgcatta gagagtaaat aactgccact caacacgtcc cgattgtgtc 540agattgggac atttaggacc aaaaaaaaaa aaaaaaaa 57816166PRTVernonia mespilifolia 16Ala Arg Gly Trp Leu Gly Gly Val Lys Arg Leu Trp Gly Gly Ile Asn1 5 10 15Phe Leu Leu Ala Val Cys Leu Ala Met Thr Val Val Val Thr Lys Met 20 25 30Ala Asp Ser Glu Arg Gln Phe Lys Thr Leu Pro Asp Gly Ser Lys Thr 35 40 45Ala Leu Pro Pro Gly Gly Asp Ile Lys Ala Gly Ala Leu Ser Ile Phe 50 55 60Ala Val Leu Gly Ala Pro Leu Ala Val Thr Phe Ser Val Pro Cys Ala65 70 75 80Leu Ala Ser Ile Phe Ser Asn Ser Ser Gly Ala Gly Gln Gly Leu Ser 85 90 95Leu Gly Val Leu Asn Leu Ala Ile Val Ile Pro Gln Met Phe Val Ser 100 105 110Val Leu Ser Gly Pro Trp Asp Ala Leu Phe Gly Gly Gly Asn Leu Pro 115 120 125Ala Phe Val Val Gly Ala Ile Ser Ala Ala Val Ser Gly Ile Leu Ser 130 135 140Phe Thr Met Leu Pro Ser Pro Pro Pro Asp Val Val Leu Ser Lys Val145 150 155 160Ser Gly Gly Gly Met His 
165171062DNATriticum aestivum 17ctggaatgcc gtcagtgctc ctcgtcaccg gcctcacctg gctgtcctgg ttccccttca 60tcctgtacga caccgactgg atgggtcgtg agatctacca cggtgacccc aagggaaccc 120ccgacgaggc caacgcgttc caggcaggtg tcagggccgg ggcgttcggc ctgctactca 180actcggtcgt cctggggttc agctcgttcc tgatcgagcc gctgtgcaag aggctaggcc 240cgcgggtggt gtgggtgtca agcaacttcc tcgtctgcat ctccatggcc gccatttgca 300tcataagctg gtgggccact caggacctgc atgggtacat ccagcacgcc atcaccgcca 360gcaaggagat caagatcgtc tccctcgccc tcttcgcctt cctcggaatc cctctcgcca 420ttctgtacag tgtccctttc gcggtgacgg cgcagctggc ggcgaacaga ggcggtggcc 480aagggctgtg cacgggcgtg ctgaacatcg ccatcgtgat accccaggtg atcatcgcgg 540tgggggcggg gccgtgggac gagctgttcg gcaagggcaa catcccggcg ttcggcgtgg 600cgtccgcctt cgcgctcatc ggcggcatcg tcggcatatt cctgctgccc aagatctcca 660ggcgccagtt ccgggccgtc agcggcggcg gtcactgacc gcgccgcgcg ccggtcggcc 720tgagcatggc gaaggccgat cgcgccggcc cgaaggtccc agcccagctc ggcatttacc 780aaattttcgc ataggcgtaa ctagggggct ctcgcctaag gactccgtag agcagaataa 840gaattgtgag gaacctgtat gtgttgtgtc tgtatgtgcg tgtaagtcag tgcgtgtagc 900ggaaaatgga cagaggaatg cgggcatcca tcgccggctg gggtgtcgtc tttgggttgt 960gacttgtgtg tagcaaacca aggtgatcaa gtgaggggaa aagaatggat gatgaacttt 1020cagcgacaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa 106218232PRTTriticum aestivum 18Ala Gly Met Pro Ser Val Leu Leu Val Thr Gly Leu Thr Trp Leu Ser1 5 10 15 Trp Phe Pro Phe Ile Leu Tyr Asp Thr Asp Trp Met Gly Arg Glu Ile 20 25 30Tyr His Gly Asp Pro Lys Gly Thr Pro Asp Glu Ala Asn Ala Phe Gln 35 40 45Ala Gly Val Arg Ala Gly Ala Phe Gly Leu Leu Leu Asn Ser Val Val 50 55 60Leu Gly Phe Ser Ser Phe Leu Ile Glu Pro Leu Cys Lys Arg Leu Gly65 70 75 80Pro Arg Val Val Trp Val Ser Ser Asn Phe Leu Val Cys Ile Ser Met 85 90 95Ala Ala Ile Cys Ile Ile Ser Trp Trp Ala Thr Gln Asp Leu His Gly 100 105 110Tyr Ile Gln His Ala Ile Thr Ala Ser Lys Glu Ile Lys Ile Val Ser 115 120 125Leu Ala Leu Phe Ala Phe Leu Gly Ile Pro Leu Ala Ile Leu Tyr Ser 130 135 140Val Pro Phe Ala Val Thr Ala Gln Leu Ala Ala Asn Arg Gly 
Gly Gly145 150 155 160Gln Gly Leu Cys Thr Gly Val Leu Asn Ile Ala Ile Val Ile Pro Gln 165 170 175Val Ile Ile Ala Val Gly Ala Gly Pro Trp Asp Glu Leu Phe Gly Lys 180 185 190Gly Asn Ile Pro Ala Phe Gly Val Ala Ser Ala Phe Ala Leu Ile Gly 195 200 205Gly Ile Val Gly Ile Phe Leu Leu Pro Lys Ile Ser Arg Arg Gln Phe 210 215 220Arg Ala Val Ser Gly Gly Gly His225 230192083DNATriticum aestivummisc_feature(1093)..(1093)n is a, c, g, or t 19gcacgagcac accacaccac acctctctct ctctcactcg cactttccgc tctcgtctcc 60tcctcttcct cctcccgtca gacccttctt ccccggcgtt gatccgatca acgtcctcct 120ccgtcctgcc cctagatcct tggccgggca gggatacgcc gtagaattga taggcgaacg 180gacgaggtgg tgatcgccag ggcggcctct ctgccatggc gcgcggcgga ggcaacggcg 240aggtggagct ctcggtcggg gtcggcggcg gaggcggcgg cgccgccggc gggggggagc 300aacccgccgt ggacatcagc ctcggcagac tcatcctcgc cggcatggtc gccggcggcg 360tgcagtacgg atgggcgctc cagctctccc tgctcacccc ctacgtccag actctgggac 420tttcgcatgc tctgacttca ttcatgtggc tctgcggccc tattgctgga ttagtggttc 480aaccatgcgt tgggctctac agtgacaagt gcacatctag atggggaaga cgcagaccgt 540ttattctgac aggatgcatc ctcatctgca ttgctgttgt ggtcgtcggc ttctcggctg 600acattggagc tggtctgggt gacagcaagg aagagtgcag tctctatcat gggcctcgtt 660ggcacgctgc aattgtgtat gttcttggat tctggctcct tgacttctcc aacaacactg 720tgcaaggtcc agcgcgtgct ctgatggctg atttatcagc tcagcatgga cccagtgcag 780caaattcaat cttctgttct tggatggcgc taggaaatat ccttggatac tcctctggtt 840ccacaaacaa ctggcacaag tggtttccgt tcctccggac aagggcttgc tgtgaagcct 900gcgcaaatct gaaaggcgca tttctggtgg cagtgctggt cctggccttc tgtttggtga 960taactgtgat cttcgccaag gagataccgt acaaggcgat tgcgcccctc ccaacaaagg 1020gcaatggcca ggttgaagtc gagcccaccg ggccgctcgc cgtgttcaaa ggcttcaaga 1080acttgcctcc tgnaatgccg tcggtgctcc tcgtcactgg cctcacctgg ctgtcctggt 1140tccccttcat cctgtacgac accgactgga tgggtcgtga gatctaccac ggtgacccca 1200agggaacccc cgacgaggcc aacgcgttcc aggcaggtgt cagggccggg gcgttcggcc 1260tgctactcaa ctcggtcgtc ctggggttca gctcgttcct gatcgagccg ctgtgcaaga 1320ggctaggccc gcgggtggtg tgggtgtcga gcaacttcct 
cgtctgcctc tccatggccg 1380cgatttgcat cataagctgg tgggctactc aggacttgca tgggtatatc cagcacgcca 1440tcaccgccag caaggagatc aagatcgtct ccctcgccct cttcgccttc ctcggaatcc 1500ctctcgccat tctgtacagt gtccctttcg cggtgacggc gcagctggcg gcgaagagag 1560gcggtggcca agggctgtgc acgggcgtgc tcaacatcgc catcgtgata ccccaggtga 1620tcatcgcggt gggggcgggg ccgtgggacg agctgttcgg caagggcaac atcccggcgt 1680tcggcatggc ctccgccttc gcgctcatcg gcggcatcgt cggcatattc ctgctgccca 1740agatctccag gcgccagttc cgggccgtca gcggcggcgg tcactgagca tggccaaggc 1800cggaggtccc agcccagccc gccatttacc aaattttcgc ataggcgtaa ctaggtggct 1860ctcgcctaag gactccgtag agcagaataa gaattgtgag gaacctgtat gtgttgtgtc 1920tgtatgtgcg tgtaagtcag tgcgtgtagc ggaaaatgga cagaggaatg tgggcatcca 1980tcaccggctg gggtgtcgtc tttgggttgt gacttgtgtg tagcaaacca aggtgatcaa 2040gtgaggggaa atgaatggat gatgaacttt cagcgacaaa aaa 208320522PRTTriticum aestivum 20Met Ala Arg Gly Gly Gly Asn Gly Glu Val Glu Leu Ser Val Gly Val1 5 10 15Gly Gly Gly Gly Gly Gly Ala Ala Gly Gly Gly Glu Gln Pro Ala Val 20 25 30Asp Ile Ser Leu Gly Arg Leu Ile Leu Ala Gly Met Val Ala Gly Gly 35 40 45Val Gln Tyr Gly Trp Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr Val 50 55 60Gln Thr Leu Gly Leu Ser His Ala Leu Thr Ser Phe Met Trp Leu Cys65 70 75 80Gly Pro Ile Ala Gly Leu Val Val Gln Pro Cys Val Gly Leu Tyr Ser 85 90 95Asp Lys Cys Thr Ser Arg Trp Gly Arg Arg Arg Pro Phe Ile Leu Thr 100 105 110Gly Cys Ile Leu Ile Cys Ile Ala Val Val Val Val Gly Phe Ser Ala 115 120 125Asp Ile Gly Ala Gly Leu Gly Asp Ser Lys Glu Glu Cys Ser Leu Tyr 130 135 140His Gly Pro Arg Trp His Ala Ala Ile Val Tyr Val Leu Gly Phe Trp145 150 155 160Leu Leu Asp Phe Ser Asn Asn Thr Val Gln Gly Pro Ala Arg Ala Leu 165 170 175Met Ala Asp Leu Ser Ala Gln His Gly Pro Ser Ala Ala Asn Ser Ile 180 185 190Phe Cys Ser Trp Met Ala Leu Gly Asn Ile Leu Gly Tyr Ser Ser Gly 195 200 205Ser Thr Asn Asn Trp His Lys Trp Phe Pro Phe Leu Arg Thr Arg Ala 210 215 220Cys Cys Glu Ala Cys Ala Asn Leu Lys Gly Ala Phe Leu Val Ala Val225 230 235 240Leu Val 
Leu Ala Phe Cys Leu Val Ile Thr Val Ile Phe Ala Lys Glu 245 250 255Ile Pro Tyr Lys Ala Ile Ala Pro Leu Pro Thr Lys Gly Asn Gly Gln 260 265 270Val Glu Val Glu Pro Thr Gly Pro Leu Ala Val Phe Lys Gly Phe Lys 275 280 285Asn Leu Pro Pro Met Pro Ser Val Leu Leu Val Thr Gly Leu Thr Trp 290 295 300Leu Ser Trp Phe Pro Phe Ile Leu Tyr Asp Thr Asp Trp Met Gly Arg305 310 315 320Glu Ile Tyr His Gly Asp Pro Lys Gly Thr Pro Asp Glu Ala Asn Ala 325 330 335Phe Gln Ala Gly Val Arg Ala Gly Ala Phe Gly Leu Leu Leu Asn Ser 340 345 350Val Val Leu Gly Phe Ser Ser Phe Leu Ile Glu Pro Leu Cys Lys Arg 355 360 365Leu Gly Pro Arg Val Val Trp Val Ser Ser Asn Phe Leu Val Cys Leu 370 375 380Ser Met Ala Ala Ile Cys Ile Ile Ser Trp Trp Ala Thr Gln Asp Leu385 390 395 400His Gly Tyr Ile Gln His Ala Ile Thr Ala Ser Lys Glu Ile Lys Ile 405 410 415Val Ser Leu Ala Leu Phe Ala Phe Leu Gly Ile Pro Leu Ala Ile Leu 420 425 430Tyr Ser Val Pro Phe Ala Val Thr Ala Gln Leu Ala Ala Lys Arg Gly 435 440 445Gly Gly Gln Gly Leu Cys Thr Gly Val Leu Asn Ile Ala Ile Val Ile 450 455 460Pro Gln Val Ile Ile Ala Val Gly Ala Gly Pro Trp Asp Glu Leu Phe465 470 475 480Gly Lys Gly Asn Ile Pro Ala Phe Gly Met Ala Ser Ala Phe Ala Leu 485 490 495Ile Gly Gly Ile Val Gly Ile Phe Leu Leu Pro Lys Ile Ser Arg Arg 500 505 510Gln Phe Arg Ala Val Ser Gly Gly Gly His 515 520212160DNATriticum aestivum 21gcacgagacc acccctctct ctctctctca ctcgcgcttt ccgctctcgt ctcctcctct 60tcctcctccc gtcagcccct tcttccccgg cgttgatccg atcgacgtcc tccctcctcc 120ccggcgttga tccgacgcgc cgtagagttg ataggcgaac gaacggggcg gtgatcgtcc 180gggcggcccc cctgcgacga tggcgcgcgg cggcggcaac ggcgaggtgg agctctcggt 240gggggtcggc ggaggcggcg ccggcgccgg cggggcggac gcccccgccg tggacatcag 300cctcggcagg ctcatcctcg ccggcatggt cgccggcggc gtgcagtacg gatgggcgct 360ccagctctcc ctgctcaccc cctacgtcca gactctggga ctttcgcatg ctctgacttc 420attcatgtgg ctctgcggcc ctattgctgg attagtggtt caaccatgcg ttgggctcta 480cagtgacaag tgcacttcaa gatggggaag acgcagaccg ttcattctga caggatgtat 540cctcatctgc attgctgtcg 
tggtcgtcgg cttctcggct gacattggag ctgctctggg 600tgacagcaag gaagagtgca gtctctatca tgggcctcgt tggcacgctg caattgtgta 660tgttcttgga ttctggctcc ttgacttctc caacaacaca gtgcaaggac cagcgcgtgc 720tctgatggct gatttatcag cccagcatgg acccagtgca gcaaattcaa tcttctgttc 780ttggatggca ctgggaaata tcctaggata ctcatctggt tccacaaata actggcacaa 840gtggtttccg ttcctccgga caagggcttg ctgtgaagcc tgcgcaaatc tgaaaggcgc 900atttctggtg gcagtgctgt tcctggcctt ctgtttggtg ataaccgtga tcttcgccaa 960ggagataccg tacaaggcga ttgcgcccct cccaacaaag gccaatggcc aggttgaagt 1020cgagcccacc gggccgctcg ccgtcttcaa aggcttcaag aacttgcctc ctggaatgcc 1080gtcagtgctc ctcgtcaccg gcctcacctg gctgtcctgg ttccccttca tcctgtacga 1140caccgactgg atgggtcgtg agatctacca cggtgacccc aagggaaccc ccgacgaggc 1200caacgcgttc caggcaggtg tcagggccgg ggcgttcggc ctgctactca actcggtcgt 1260cctggggttc agctcgttcc tgatcgagcc gctgtgcaag aggctaggcc cgcgggtggt 1320gtgggtgtca agcaacttcc tcgtctgcct ctccatggcc gccatttgca tcataagctg 1380gtgggccact caggacctgc atgggtacat ccagcacgcc atcaccgcca gcaaggagat 1440caagatcgtc tccctcgccc tcttcgcctt cctcggaatc cctctcgcca ttctgtacag 1500tgtcactttc gccgtgacgg cgcagctggc ggcgaacaga tgcggtgggc aatggctgtg 1560cacgggcgtg ctgaacatcg ccatcgcgat accccaggtg atcatcgcgt tgggggcggg 1620gccgtgggac gagctgttcg gcaagggcaa catcccggcg ttcggcgtgg cgtccgcctt 1680cgcgctcatc ggcggcatcg tcggcatatt cctgctgccc aagatctcca ggctccagtt 1740ccgggccgtc agcggcggcg gtcactgacc gcgccgcgcg ccggtcggcc tgagcatggc 1800gaaggccgat cgcgccggcc cgaaggtccc agcccagctc ggcatttacc aaattttcgc 1860ataggcgtaa ctagggggct ctcgcctaag gactccgtag agcagaataa gaattgtgag 1920gaacctgtat gtgttgtgtc tgtatgtgcg tgtaagtcag tgcgtgtagc ggaaaatgga 1980cagaggaatg cgggcatcca tcgccggctg gggtgtcgtc tttgggttgt gacttgtgtg 2040tagcaaacca aggtgatcaa gtgaggggaa aagaatggat gatgaacttt cagcgacaaa 2100aaaaaaaaaa aaaaaaaaaa aaaaaaataa aaaaaaaaaa aagaaaaaaa taaaaaaaaa 216022522PRTTriticum aestivum 22Met Ala Arg Gly Gly Gly Asn Gly Glu Val Glu Leu Ser Val Gly Val1 5 10 15Gly Gly Gly Gly Ala Gly Ala Gly Gly 
Ala Asp Ala Pro Ala Val Asp 20 25 30Ile Ser Leu Gly Arg Leu Ile Leu Ala Gly Met Val Ala Gly Gly Val 35 40 45Gln Tyr Gly Trp Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr Val Gln 50 55 60Thr Leu Gly Leu Ser His Ala Leu Thr Ser Phe Met Trp Leu Cys Gly65 70 75 80Pro Ile Ala Gly Leu Val Val Gln Pro Cys Val Gly Leu Tyr Ser Asp 85 90 95Lys Cys Thr Ser Arg Trp Gly Arg Arg Arg Pro Phe Ile Leu Thr Gly 100 105 110Cys Ile Leu Ile Cys Ile Ala Val Val Val Val Gly Phe Ser Ala Asp 115 120 125Ile Gly Ala Ala Leu Gly Asp Ser Lys Glu Glu Cys Ser Leu Tyr His 130 135 140Gly Pro Arg Trp His Ala Ala Ile Val Tyr Val Leu Gly Phe Trp Leu145 150 155 160Leu Asp Phe Ser Asn Asn Thr Val Gln Gly Pro Ala Arg Ala Leu Met 165 170 175Ala Asp Leu Ser Ala Gln His Gly Pro Ser Ala Ala Asn Ser Ile Phe 180 185 190Cys Ser Trp Met Ala Leu Gly Asn Ile Leu Gly Tyr Ser Ser Gly Ser 195 200 205Thr Asn Asn Trp His Lys Trp Phe Pro Phe Leu Arg Thr Arg Ala Cys 210 215 220Cys Glu Ala Cys Ala Asn Leu Lys Gly Ala Phe Leu Val Ala Val Leu225 230 235 240Phe Leu Ala Phe Cys Leu Val Ile Thr Val Ile Phe Ala Lys Glu Ile 245 250 255Pro Tyr Lys Ala Ile Ala Pro Leu Pro Thr Lys Ala Asn Gly Gln Val 260 265 270Glu Val Glu Pro Thr Gly Pro Leu Ala Val Phe Lys Gly Phe Lys Asn 275 280 285Leu Pro Pro Gly Met Pro Ser Val Leu Leu Val Thr Gly Leu Thr Trp 290 295 300Leu Ser Trp Phe Pro Phe Ile Leu Tyr Asp Thr Asp Trp Met Gly Arg305 310 315 320Glu Ile Tyr His Gly Asp Pro Lys Gly Thr Pro Asp Glu Ala Asn Ala 325 330 335Phe Gln Ala Gly Val Arg Ala Gly Ala Phe Gly Leu Leu Leu Asn Ser 340 345 350Val Val Leu Gly Phe Ser Ser Phe Leu Ile Glu Pro Leu Cys Lys Arg 355 360 365Leu Gly Pro Arg Val Val Trp Val Ser Ser Asn Phe Leu Val Cys Leu 370 375 380Ser Met Ala Ala Ile Cys Ile Ile Ser Trp Trp Ala Thr Gln Asp Leu385 390 395 400His Gly Tyr Ile Gln His Ala Ile Thr Ala Ser Lys Glu Ile Lys Ile 405 410 415Val Ser Leu Ala Leu Phe Ala Phe Leu Gly Ile Pro Leu Ala Ile Leu 420
425 430Tyr Ser Val Thr Phe Ala Val Thr Ala Gln Leu Ala Ala Asn Arg Cys 435 440 445Gly Gly Gln Trp Leu Cys Thr Gly Val Leu Asn Ile Ala Ile Ala Ile 450 455 460Pro Gln Val Ile Ile Ala Leu Gly Ala Gly Pro Trp Asp Glu Leu Phe465 470 475 480Gly Lys Gly Asn Ile Pro Ala Phe Gly Val Ala Ser Ala Phe Ala Leu 485 490 495Ile Gly Gly Ile Val Gly Ile Phe Leu Leu Pro Lys Ile Ser Arg Leu 500 505 510Gln Phe Arg Ala Val Ser Gly Gly Gly His 515 520232030DNATriticum aestivum 23cggaagcgac gccgcgcggc ccaaggagga acagggcagc ggcgcggggg cgggggaagg 60cggcatgaag ggcgcgccca agtggcgggt ggtgctggcc tgcatggtcg ccgccggcgt 120gcagttcggc tgggcgctcc agctctccct cctcaccccc tacatccaga ctctaggaat 180agaccatgcc atggcgtcct tcatttggct ttgcgggccc attactggtt ttgtggttca 240accgtgtgtt ggtgtctgga gtgacaagtg ccgctccaag tacgggagga gacggccgtt 300cattttggct ggatgcgtgc tgatttgtgc agctgtaact ttagtcgggt tttctgcaga 360ccttggctac atgttaggag acaccactga gcactgcagt acatacaaag gtctacgata 420tcgagctgct tttattttca tttttggatt ctggatgctg gaccttgcaa ataatacagt 480tcaaggacct gctcgtgccc tcctagctga tctttcaggt cccgatcaat gtaattcggc 540aaatgcaata ttctgctcat ggatggctgt tggaaacgtt cttggttttt cagctggtgc 600gagtgggaat tggcacaagt ggtttccttt tctgatgact agggcctgtt gtgaagcttg 660tggtaatttg aaagcagctt tcttgattgc agttgtattc cttctgtttt gcatggctgt 720taccctctac tttgctgaag agattccact ggaaccaaag gatgcacagc agttatctga 780ctcggctcct ctactgaacg gttctagaga tgatcatgat gcttcaagtg aacagactaa 840tggaggactt tctaacggtc atgctgatgc aaaccatgtc tcagctaact ccagtgcaga 900tgcaggttcc aactcgaaca aggacgatgt tgaggctttc aatgatggac caggagcagt 960tttggttaaa attttgacta gcatgaggca tctacctcct ggaatgtatt ccgtgcttct 1020ggttatggcc ctaacatggc tgtcgtggtt tccctttttc ctttttgaca ccgactggat 1080ggggcgtgag gtttatcacg gtgacccaaa aggaaacgcg agtgaaagga aagcttatga 1140tgatggtgtc cgagaaggtg catttggttt gctattgaat tcagtcgtcc ttgggattgg 1200ctctttcctt atcgatccat tatgccggat gattggtgca agattggttt gggcaatcag 1260caacttcata gtgtttgcct gcatgttggc tacaacaata ctaagttgga tctcctatga 1320cctgtactcg 
agcaagcttc aacatattgt cggggcagat aaaacagtca agacctcagc 1380gcttattctt ttctctcttc tcggattgcc actctcgatc acttatagtg ttccgttctc 1440cgtgactgct gagctgactg ccggaacagg aggcggacaa ggtttggcta ctggagttct 1500gaatcttgcc atcgtcgctc ctcagatagt agtgtcactc ggagcaggcc catgggacaa 1560gctcttgggg ggagggaacg tccccgcttt cgccctggcc tcggtcttct cgctagcagc 1620cggagtgctc gcggtgatca agctgcccaa gttgtcgaac aattaccaat ccgccggctt 1680ccacatgggc tgaaccctaa agcccgaagc cagctgctgt gtgtaacatc cagatgttta 1740gtaccaatcc gccggtttcc atattaagat tcgtttatat ggagatgatt ctttttctcc 1800tcttgctaga tacacagtta ataagactac agatcagata gactaggata aagagatagt 1860ttttaggcct gtgtgcatac aagtgtcgat gagaagttgt aaaacatgta cactgttttt 1920ttgtactgta tatgtagtga aatttcatag atggccggat gtgttctggt ccgataaaaa 1980aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 203024563PRTTriticum aestivum 24Gly Ser Asp Ala Ala Arg Pro Lys Glu Glu Gln Gly Ser Gly Ala Gly1 5 10 15Ala Gly Glu Gly Gly Met Lys Gly Ala Pro Lys Trp Arg Val Val Leu 20 25 30Ala Cys Met Val Ala Ala Gly Val Gln Phe Gly Trp Ala Leu Gln Leu 35 40 45Ser Leu Leu Thr Pro Tyr Ile Gln Thr Leu Gly Ile Asp His Ala Met 50 55 60Ala Ser Phe Ile Trp Leu Cys Gly Pro Ile Thr Gly Phe Val Val Gln65 70 75 80Pro Cys Val Gly Val Trp Ser Asp Lys Cys Arg Ser Lys Tyr Gly Arg 85 90 95Arg Arg Pro Phe Ile Leu Ala Gly Cys Val Leu Ile Cys Ala Ala Val 100 105 110Thr Leu Val Gly Phe Ser Ala Asp Leu Gly Tyr Met Leu Gly Asp Thr 115 120 125Thr Glu His Cys Ser Thr Tyr Lys Gly Leu Arg Tyr Arg Ala Ala Phe 130 135 140Ile Phe Ile Phe Gly Phe Trp Met Leu Asp Leu Ala Asn Asn Thr Val145 150 155 160Gln Gly Pro Ala Arg Ala Leu Leu Ala Asp Leu Ser Gly Pro Asp Gln 165 170 175Cys Asn Ser Ala Asn Ala Ile Phe Cys Ser Trp Met Ala Val Gly Asn 180 185 190Val Leu Gly Phe Ser Ala Gly Ala Ser Gly Asn Trp His Lys Trp Phe 195 200 205Pro Phe Leu Met Thr Arg Ala Cys Cys Glu Ala Cys Gly Asn Leu Lys 210 215 220Ala Ala Phe Leu Ile Ala Val Val Phe Leu Leu Phe Cys Met Ala Val225 230 235 240Thr Leu Tyr Phe Ala Glu Glu Ile Pro 
Leu Glu Pro Lys Asp Ala Gln 245 250 255Gln Leu Ser Asp Ser Ala Pro Leu Leu Asn Gly Ser Arg Asp Asp His 260 265 270Asp Ala Ser Ser Glu Gln Thr Asn Gly Gly Leu Ser Asn Gly His Ala 275 280 285Asp Ala Asn His Val Ser Ala Asn Ser Ser Ala Asp Ala Gly Ser Asn 290 295 300Ser Asn Lys Asp Asp Val Glu Ala Phe Asn Asp Gly Pro Gly Ala Val305 310 315 320Leu Val Lys Ile Leu Thr Ser Met Arg His Leu Pro Pro Gly Met Tyr 325 330 335Ser Val Leu Leu Val Met Ala Leu Thr Trp Leu Ser Trp Phe Pro Phe 340 345 350Phe Leu Phe Asp Thr Asp Trp Met Gly Arg Glu Val Tyr His Gly Asp 355 360 365Pro Lys Gly Asn Ala Ser Glu Arg Lys Ala Tyr Asp Asp Gly Val Arg 370 375 380Glu Gly Ala Phe Gly Leu Leu Leu Asn Ser Val Val Leu Gly Ile Gly385 390 395 400Ser Phe Leu Ile Asp Pro Leu Cys Arg Met Ile Gly Ala Arg Leu Val 405 410 415Trp Ala Ile Ser Asn Phe Ile Val Phe Ala Cys Met Leu Ala Thr Thr 420 425 430Ile Leu Ser Trp Ile Ser Tyr Asp Leu Tyr Ser Ser Lys Leu Gln His 435 440 445Ile Val Gly Ala Asp Lys Thr Val Lys Thr Ser Ala Leu Ile Leu Phe 450 455 460Ser Leu Leu Gly Leu Pro Leu Ser Ile Thr Tyr Ser Val Pro Phe Ser465 470 475 480Val Thr Ala Glu Leu Thr Ala Gly Thr Gly Gly Gly Gln Gly Leu Ala 485 490 495Thr Gly Val Leu Asn Leu Ala Ile Val Ala Pro Gln Ile Val Val Ser 500 505 510Leu Gly Ala Gly Pro Trp Asp Lys Leu Leu Gly Gly Gly Asn Val Pro 515 520 525Ala Phe Ala Leu Ala Ser Val Phe Ser Leu Ala Ala Gly Val Leu Ala 530 535 540Val Ile Lys Leu Pro Lys Leu Ser Asn Asn Tyr Gln Ser Ala Gly Phe545 550 555 560His Met Gly25501PRTDaucus carota 25Met Ala Gly Pro Glu Ala Asp Arg Asn Arg His Arg Gly Gly Ala Thr1 5 10 15Ala Ala Pro Pro Pro Arg Ser Arg Val Ser Leu Arg Leu Leu Leu Arg 20 25 30Val Ala Ser Val Ala Cys Gly Ile Gln Phe Gly Trp Ala Leu Gln Leu 35 40 45Ser Leu Leu Thr Pro Tyr Val Gln Glu Leu Gly Ile Pro His Ala Trp 50 55 60Ser Ser Ile Ile Trp Leu Cys Gly Pro Leu Ser Gly Leu Leu Val Gln65 70 75 80Pro Ile Val Gly His Met Ser Asp Gln Cys Thr Ser Lys Tyr Gly Arg 85 90 95Arg Arg Pro Phe Ile Val Ala Gly Gly Thr Ala Ile Ile 
Leu Ala Val 100 105 110Ile Ile Ile Ala His Ser Ala Asp Ile Gly Gly Leu Leu Gly Asp Thr 115 120 125Ala Asp Asn Lys Thr Met Ala Ile Val Ala Phe Val Ile Gly Phe Trp 130 135 140Ile Leu Asp Val Ala Asn Asn Met Thr Gln Gly Pro Cys Arg Ala Leu145 150 155 160Leu Ala Asp Leu Thr Gly Asn Asp Ala Arg Arg Thr Arg Val Ala Asn 165 170 175Ala Tyr Phe Ser Leu Phe Met Ala Ile Gly Asn Val Leu Gly Tyr Ala 180 185 190Thr Gly Ala Tyr Ser Gly Trp Tyr Lys Val Phe Pro Phe Ser Leu Thr 195 200 205Ser Ser Cys Thr Ile Asn Cys Ala Asn Leu Lys Ser Ala Phe Tyr Ile 210 215 220Asp Ile Ile Phe Ile Ile Ile Thr Thr Tyr Ile Ser Ile Ser Ala Ala225 230 235 240Lys Glu Arg Pro Arg Ile Ser Ser Gln Asp Gly Pro Gln Phe Ser Glu 245 250 255Asp Gly Thr Ala Gln Ser Gly His Ile Glu Glu Ala Phe Leu Trp Glu 260 265 270Leu Phe Gly Thr Phe Arg Leu Leu Pro Gly Ser Val Trp Val Ile Leu 275 280 285Leu Val Thr Cys Leu Asn Trp Ile Gly Trp Phe Pro Phe Ile Leu Phe 290 295 300Asp Thr Asp Trp Met Gly Arg Glu Ile Tyr Gly Gly Glu Pro Asn Gln305 310 315 320Gly Gln Ser Tyr Ser Asp Gly Val Arg Met Gly Ala Phe Gly Leu Met 325 330 335Met Asn Ser Val Val Leu Gly Ile Thr Ser Val Leu Met Glu Lys Leu 340 345 350Cys Arg Ile Trp Gly Ser Gly Phe Met Trp Gly Leu Ser Asn Ile Leu 355 360 365Met Thr Ile Cys Phe Phe Ala Met Leu Leu Ile Thr Phe Ile Ala Lys 370 375 380Asn Met Asp Tyr Gly Thr Asn Pro Pro Pro Asn Gly Ile Val Ile Ser385 390 395 400Ala Leu Ile Val Phe Ala Ile Leu Gly Ile Pro Leu Ala Ile Thr Tyr 405 410 415Ser Val Pro Tyr Ala Leu Val Ser Thr Arg Ile Glu Ser Leu Gly Leu 420 425 430Gly Gln Gly Leu Ser Met Gly Val Leu Asn Leu Ala Ile Val Val Pro 435 440 445Gln Val Ile Val Ser Leu Gly Ser Gly Pro Trp Asp Gln Leu Phe Gly 450 455 460Gly Gly Asn Ser Pro Ala Phe Val Val Ala Ala Leu Ser Ala Phe Ala465 470 475 480Ala Gly Leu Ile Ala Leu Ile Ala Ile Arg Arg Pro Arg Val Asp Lys 485 490 495Ser Arg Leu His His 50026537PRTOryza sativa 26Met Ala Arg Gly Ser Gly Ala Gly Gly Gly Gly Gly Gly Gly Gly Gly1 5 10 15Gly Leu Glu Leu Ser Val Gly Val Gly Gly Gly 
Gly Ala Arg Gly Gly 20 25 30Gly Gly Gly Glu Ala Ala Ala Ala Val Glu Thr Ala Ala Pro Ile Ser 35 40 45Leu Gly Arg Leu Ile Leu Ser Gly Met Val Ala Gly Gly Val Gln Tyr 50 55 60Gly Trp Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr Val Gln Thr Leu65 70 75 80Gly Leu Ser His Ala Leu Thr Ser Phe Met Trp Leu Cys Gly Pro Ile 85 90 95Ala Gly Met Val Val Gln Pro Cys Val Gly Leu Tyr Ser Asp Arg Cys 100 105 110Thr Ser Lys Trp Gly Arg Arg Arg Pro Tyr Ile Leu Thr Gly Cys Val 115 120 125Leu Ile Cys Leu Ala Val Val Val Ile Gly Phe Ser Ala Asp Ile Gly 130 135 140Tyr Ala Met Gly Asp Thr Lys Glu Asp Cys Ser Val Tyr His Gly Ser145 150 155 160Arg Trp His Ala Ala Ile Val Tyr Val Leu Gly Phe Trp Leu Leu Asp 165 170 175Phe Ser Asn Asn Thr Val Gln Gly Pro Ala Arg Ala Leu Met Ala Asp 180 185 190Leu Ser Gly Arg His Gly Pro Gly Thr Ala Asn Ser Ile Phe Cys Ser 195 200 205Trp Met Ala Met Gly Asn Ile Leu Gly Tyr Ser Ser Gly Ser Thr Asn 210 215 220Asn Trp His Lys Trp Phe Pro Phe Leu Lys Thr Arg Ala Cys Cys Glu225 230 235 240Ala Cys Ala Asn Leu Lys Gly Ala Phe Leu Val Ala Val Ile Phe Leu 245 250 255Ser Leu Cys Leu Val Ile Thr Leu Ile Phe Ala Lys Glu Val Pro Phe 260 265 270Lys Gly Asn Ala Ala Leu Pro Thr Lys Ser Asn Glu Pro Ala Glu Pro 275 280 285Glu Gly Thr Gly Pro Leu Ala Val Leu Lys Gly Phe Arg Asn Leu Pro 290 295 300Thr Gly Met Pro Ser Val Leu Ile Val Thr Gly Leu Thr Trp Leu Ser305 310 315 320Trp Phe Pro Phe Ile Leu Tyr Asp Thr Asp Trp Met Gly Arg Glu Ile 325 330 335Tyr His Gly Asp Pro Lys Gly Thr Asp Pro Gln Ile Glu Ala Phe Asn 340 345 350Gln Gly Val Arg Ala Gly Ala Phe Gly Leu Leu Leu Asn Ser Ile Val 355 360 365Leu Gly Phe Ser Ser Phe Leu Ile Glu Pro Met Cys Arg Lys Val Gly 370 375 380Pro Arg Val Val Trp Val Thr Ser Asn Phe Leu Val Cys Ile Ala Met385 390 395 400Ala Ala Thr Ala Leu Ile Ser Phe Trp Ser Leu Lys Asp Phe His Gly 405 410 415Thr Val Gln Lys Ala Ile Thr Ala Asp Lys Ser Ile Lys Ala Val Cys 420 425 430Leu Val Leu Phe Ala Phe Leu Gly Val Pro Leu Ala Val Leu Tyr Ser 435 440 445Val Pro Phe Ala Val 
Thr Ala Gln Leu Ala Ala Thr Arg Gly Gly Gly
450 455 460
Gln Gly Leu Cys Thr Gly Val Leu Asn Ile Ser Ile Val Ile Pro Gln
465 470 475 480
Val Val Ile Ala Leu Gly Ala Gly Pro Trp Asp Glu Leu Phe Gly Lys
485 490 495
Gly Asn Ile Pro Ala Phe Gly Leu Ala Ser Gly Phe Ala Leu Ile Gly
500 505 510
Gly Val Ala Gly Ile Phe Leu Leu Pro Lys Ile Ser Lys Arg Gln Phe
515 520 525
Trp Ser Val Ser Met Gly Gly Gly His
530 535

<210> SEQ ID NO 27
<211> LENGTH: 533
<212> TYPE: PRT
<213> ORGANISM: Ricinus communis
<400> SEQUENCE: 27

Met Gln Ser Ser Thr Ser Lys Glu Asn Lys Gln Pro Pro Ser Ser Gln
1 5 10 15
Pro His Pro Pro Pro Leu Met Val Ala Gly Ala Ala Glu Pro Asn Ser
20 25 30
Ser Pro Leu Arg Lys Val Val Met Val Ala Ser Ile Ala Ala Gly Ile
35 40 45
Gln Phe Gly Trp Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr Val Gln
50 55 60
Leu Leu Gly Ile Pro His Thr Trp Ala Ala Phe Ile Trp Leu Cys Gly
65 70 75 80
Pro Ile Ser Gly Met Leu Val Gln Pro Ile Val Gly Tyr His Ser Asp
85 90 95
Arg Cys Thr Ser Arg Phe Gly Arg Arg Arg Pro Phe Ile Ala Ser Gly
100 105 110
Ala Ala Phe Val Ala Ile Ala Val Phe Leu Ile Gly Tyr Ala Ala Asp
115 120 125
Leu Gly His Leu Ser Gly Asp Ser Leu Asp Lys Ser Pro Lys Thr Arg
130 135 140
Ala Ile Ala Ile Phe Val Val Gly Phe Trp Ile Leu Asp Val Ala Asn
145 150 155 160
Asn Met Leu Gln Gly Pro Cys Arg Ala Leu Leu Ala Asp Leu Ser Gly
165 170 175
Thr Ser Gln Lys Lys Thr Arg Thr Ala Asn Ala Leu Phe Ser Phe Phe
180 185 190
Met Ala Val Gly Asn Val Leu Gly Tyr Ala Ala Gly Ala Tyr Thr His
195 200 205
Leu Tyr Lys Leu Phe Pro Phe Thr Lys Thr Thr Ala Cys Asp Val Tyr
210 215 220
Cys Ala Asn Leu Lys Ser Cys Phe Phe Ile Ser Ile Val Leu Leu Leu
225 230 235 240
Ser Leu Thr Val Leu Ala Leu Ser Tyr Val Lys Glu Lys Pro Trp Ser
245 250 255
Pro Asp Gln Ala Val Asp Asn Ala Glu Asp Asp Thr Ala Ser Gln Ala
260 265 270
Ser Ser Ser Ala Gln Pro Met Pro Phe Phe Gly Glu Ile Leu Gly Ala
275 280 285
Phe Lys Asn Leu Lys Arg Pro Met Trp Ile Leu Leu Leu Val Thr Cys
290 295 300
Leu Asn Trp Ile Ala Trp Phe Pro Phe Leu Leu Phe Asp Thr Asp Trp
305 310 315 320
Met Gly Arg Glu Val Tyr Gly Gly Asp Ser Ser Gly Ser Ala Glu Gln
325 330 335
Leu Lys Leu Tyr Asp Arg Gly Val Arg Ala Gly Ala Leu Gly Leu Met
340 345 350
Leu Asn Ser Val Val Leu Gly Phe Thr Ser Leu Gly Val Glu Val Leu
355 360 365
Ala Arg Gly Val Gly Gly Val Lys Arg Leu Trp Gly Ile Val Asn Phe
370 375 380
Val Leu Ala Val Cys Leu Ala Met Thr Val Leu Val Thr Lys Gln Ala
385 390 395 400
Glu Ser Thr Arg Arg Phe Ala Thr Val Ser Gly Gly Ala Lys Val Pro
405 410 415
Leu Pro Pro Pro Ser Gly Val Lys Ala Gly Ala Leu Ala Leu Phe Ala
420 425 430
Val Met Gly Val Pro Gln Ala Ile Thr Tyr Ser Ile Pro Phe Ala Leu
435 440 445
Ala Ser Ile Phe Ser Asn Thr Ser Gly Ala Gly Gln Gly Leu Ser Leu
450 455 460
Gly Val Leu Asn Leu Ser Ile Val Ile Pro Gln Met Ile Val Ser Val
465 470 475 480
Ala Ala Gly Pro Trp Asp Ala Leu Phe Gly Gly Gly Asn Leu Pro Ala
485 490 495
Phe Val Val Gly Ala Val Ala Ala Leu Ala Ser Gly Ile Phe Ala Leu
500 505 510
Thr Met Leu Pro Ser Pro Gln Pro Asp Met Pro Ser Ala Lys Ala Leu
515 520 525
Thr Ala Ala Phe His
530

<210> SEQ ID NO 28
<211> LENGTH: 523
<212> TYPE: PRT
<213> ORGANISM: Vicia faba
<400> SEQUENCE: 28

Met Glu Pro Leu Ser Ser Thr Lys Gln Ile Asn Asn Asn Asn Asn Leu
1 5 10 15
Ala Lys Pro Ser Ser Leu His Val Glu Thr Gln Pro Leu Glu Pro Ser
20 25 30
Pro Leu Arg Lys Ile Met Val Val Ala Ser Ile Ala Ala Gly Val Gln
35 40 45
Phe Gly Trp Ala Leu Gln Leu Ser Leu Leu Thr Pro Tyr Val Gln Leu
50 55 60
Leu Gly Ile His His Thr Trp Ala Ala Tyr Ile Trp Leu Cys Gly Pro
65 70 75 80
Ile Ser Gly Met Leu Val Gln Pro Ile Val Gly Tyr His Ser Asp Arg
85 90 95
Cys Thr Ser Arg Phe Gly Arg Arg Arg Pro Phe Ile Ala Ala Gly Ser
100 105 110
Ile Ala Val Ala Ile Ala Val Phe Leu Ile Gly Tyr Ala Ala Asp Leu
115 120 125
Gly His Ser Phe Gly Asp Ser Leu Asp Gln Lys Val Arg Pro Arg Ala
130 135 140
Ile Gly Ile Phe Val Val Gly Phe Trp Ile Leu Asp Val Ala Asn Asn
145 150 155 160
Met Leu Gln Gly Pro Cys Arg Ala Leu Leu Gly Asp Leu Cys Ala Gly
165 170 175
Asn Gln Arg Lys Thr Arg Asn Ala Asn Ala Phe Phe Ser Phe Phe Met
180 185 190
Ala Val Gly Asn Val Leu Gly Tyr Ala Ala Gly Ala Tyr Ser Lys Leu
195 200 205
Tyr His Val Phe Pro Phe Thr Lys Thr Lys Ala Cys Asn Val Tyr Cys
210 215 220
Ala Asn Leu Lys Ser Cys Phe Phe Leu Ser Ile Ala Leu Leu Thr Val
225 230 235 240
Leu Ala Thr Ser Ala Leu Ile Tyr Val Lys Glu Thr Ala Leu Thr Pro
245 250 255
Glu Lys Thr Val Val Thr Thr Glu Asp Gly Gly Ser Ser Gly Gly Met
260 265 270
Pro Cys Phe Gly Gln Leu Ser Gly Ala Phe Lys Glu Leu Lys Arg Pro
275 280 285
Met Trp Ile Leu Leu Leu Val Thr Cys Leu Asn Trp Ile Ala Trp Phe
290 295 300
Pro Phe Leu Leu Phe Asp Thr Asp Trp Met Gly Lys Glu Val Tyr Gly
305 310 315 320
Gly Thr Val Gly Glu Gly His Ala Tyr Asp Met Gly Val Arg Glu Gly
325 330 335
Ala Leu Gly Leu Met Leu Asn Ser Val Val Leu Gly Ala Thr Ser Leu
340 345 350
Gly Val Asp Ile Leu Ala Arg Gly Val Gly Gly Val Lys Arg Leu Trp
355 360 365
Gly Ile Val Asn Phe Leu Leu Ala Ile Cys Leu Gly Leu Thr Val Leu
370 375 380
Val Thr Lys Leu Ala Gln His Ser Arg Gln Tyr Ala Pro Gly Thr Gly
385 390 395 400
Ala Leu Gly Asp Pro Leu Pro Pro Ser Glu Gly Ile Lys Ala Gly Ala
405 410 415
Leu Thr Leu Phe Ser Val Leu Gly Val Pro Leu Ala Ile Thr Tyr Ser
420 425 430
Ile Pro Phe Ala Leu Ala Ser Ile Phe Ser Ser Thr Ser Gly Ala Gly
435 440 445
Gln Gly Leu Ser Leu Gly Val Leu Asn Leu Ala Ile Val Ile Pro Gln
450 455 460
Met Phe Val Ser Val Leu Ser Gly Pro Trp Asp Ala Leu Phe Gly Gly
465 470 475 480
Gly Asn Leu Pro Ala Phe Val Val Gly Ala Val Ala Ala Leu Ala Ser
485 490 495
Gly Ile Leu Ser Ile Ile Leu Leu Pro Ser Pro Pro Pro Asp Met Ala
500 505 510
Lys Ser Val Ser Ala Thr Gly Gly Gly Phe His
515 520

* * * * *
