U.S. patent application number 13/801253 was filed with the patent office on 2013-03-13 and published on 2014-02-06 for trait improvement in plants expressing MYB-related proteins.
This patent application is currently assigned to MENDEL BIOTECHNOLOGY, INC. The applicants listed for this patent are Graham J. Hymus, Colleen M. Marion, Oliver J. Ratcliffe, and T. Lynne Reuber. Invention is credited to Graham J. Hymus, Colleen M. Marion, Oliver J. Ratcliffe, and T. Lynne Reuber.
Application Number | 20140041073 13/801253 |
Family ID | 50026922 |
Publication Date | 2014-02-06 |
United States Patent Application | 20140041073 |
Kind Code | A1 |
Marion; Colleen M.; et al. | February 6, 2014 |
TRAIT IMPROVEMENT IN PLANTS EXPRESSING MYB-RELATED PROTEINS
Abstract
Polynucleotides and polypeptides incorporated into expression
vectors are introduced into plants and are ectopically expressed.
These polypeptides may confer at least one regulatory activity and
increased photosynthetic resource use efficiency, increased yield,
greater vigor, and greater biomass as compared to a control plant.
Inventors: | Marion; Colleen M. (San Mateo, CA); Hymus; Graham J. (Castro Valley, CA); Reuber; T. Lynne (San Mateo, CA); Ratcliffe; Oliver J. (Oakland, CA) |
Applicant: |
Name | City | State | Country | Type |
Marion; Colleen M. | San Mateo | CA | US | |
Hymus; Graham J. | Castro Valley | CA | US | |
Reuber; T. Lynne | San Mateo | CA | US | |
Ratcliffe; Oliver J. | Oakland | CA | US | |
Assignee: | MENDEL BIOTECHNOLOGY, INC., HAYWARD, CA |
Family ID: | 50026922 |
Appl. No.: | 13/801253 |
Filed: | March 13, 2013 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
61679320 | Aug 3, 2012 | |
Current U.S. Class: | 800/260; 800/287; 800/298; 800/306; 800/312; 800/314; 800/317.4; 800/319; 800/320; 800/320.1; 800/320.2; 800/320.3 |
Current CPC Class: | A01H 1/02 20130101; C12N 15/8273 20130101; Y02A 40/146 20180101; C12N 15/8242 20130101; C07K 14/415 20130101; C12N 15/825 20130101; C12N 15/8261 20130101; C12N 15/8269 20130101 |
Class at Publication: | 800/260; 800/298; 800/320.1; 800/320.3; 800/320.2; 800/312; 800/320; 800/306; 800/317.4; 800/319; 800/314; 800/287 |
International Class: | C12N 15/82 20060101 C12N015/82 |
Claims
1. A transgenic plant having greater photosynthetic resource use
efficiency than a control plant; wherein the transgenic plant
comprises a recombinant polynucleotide comprising a promoter that
regulates expression of a polypeptide comprising SEQ ID NO: 2 in a
photosynthetic tissue to a level that is effective in conferring
greater photosynthetic resource use efficiency in the transgenic
plant relative to the control plant; wherein the control plant does
not comprise the recombinant polynucleotide; wherein the promoter
does not regulate protein expression in a constitutive manner; and
wherein expression of the polypeptide under the regulatory control
of the promoter confers greater photosynthetic resource use
efficiency in the transgenic plant relative to the control
plant.
2. The transgenic plant of claim 1, wherein the promoter is a
photosynthetic tissue-enhanced promoter.
3. The transgenic plant of claim 2, wherein the photosynthetic
tissue-enhanced promoter is an RBCS3 promoter, an RBCS4 promoter,
an At4g01060 promoter, an Os02g09720 promoter, an Os05g34510
promoter, an Os11g08230 promoter, an Os01g64390 promoter, an
Os06g15760 promoter, an Os12g37560 promoter, an Os03g17420
promoter, an Os04g51000 promoter, an Os01g01960 promoter, an
Os05g04990 promoter, an Os02g44970 promoter, an Os01g25530
promoter, an Os03g30650 promoter, an Os01g64910 promoter, an
Os07g26810 promoter, an Os07g26820 promoter, an Os09g11220
promoter, an Os04g21800 promoter, an Os10g23840 promoter, an
Os08g13850 promoter, an Os12g42980 promoter, an Os03g29280
promoter, an Os03g20650 promoter, or an Os06g43920 promoter (SEQ ID
NO: 136-159, respectively).
4. The transgenic plant of claim 1, wherein: the recombinant
polynucleotide encodes the polypeptide comprising SEQ ID NO: 2; or
the polypeptide is encoded by a second polynucleotide and
expression of the polypeptide is regulated by a trans-regulatory
element.
5. The transgenic plant of claim 1, wherein the transgenic plant
has an altered trait that confers the greater photosynthetic
resource use efficiency, wherein the altered trait is: (a)
increased photosynthetic capacity; and/or (b) increased
photosynthetic rate; and/or (c) a decrease in leaf chlorophyll
content; and/or (d) a decrease in percentage of nitrogen in leaf
dry weight; and/or (e) increased transpiration efficiency; and/or
(f) an increase in resistance to water vapor diffusion exerted by
leaf stomata; and/or (g) an increase in a rate of reactions
responsible for dissipating light energy absorbed by light
harvesting antennae as heat; and/or (h) a decrease in the ratio of
the carbon isotope ¹²C to ¹³C in above-ground biomass;
and/or (i) an increase in the total dry weight of above-ground
plant material; and/or (j) greater yield than the control
plant.
6. The transgenic plant of claim 1, wherein a plurality of the
transgenic plants have greater cumulative canopy photosynthesis
than the canopy photosynthesis of the same number of the control
plants grown under the same conditions and at the same density.
7. The transgenic plant of claim 1, wherein the transgenic plant is
selected from the group consisting of a corn, wheat, rice, Setaria,
Miscanthus, switchgrass, ryegrass, sugarcane, miscane, barley,
sorghum, soy, cotton, canola, rapeseed, Crambe, Camelina, sugar
beet, alfalfa, tomato, Eucalyptus, poplar, willow, pine, birch and
a woody plant.
8. A method for increasing photosynthetic resource use efficiency
in a plant, the method comprising: (a) providing one or more
transgenic plants that comprise a recombinant polynucleotide that
comprises a photosynthetic tissue-enhanced promoter that regulates
a polypeptide comprising SEQ ID NO: 2; and (b) growing the one or
more transgenic plants; wherein the photosynthetic tissue-enhanced
promoter does not regulate protein expression in a constitutive
manner; and wherein expression of the polypeptide in the one or
more transgenic plants confers increased photosynthetic resource
use efficiency relative to a control plant that does not comprise
the recombinant polynucleotide.
9. The method of claim 8, wherein the photosynthetic
tissue-enhanced promoter is an RBCS3 promoter, an RBCS4 promoter,
an At4g01060 promoter, an Os02g09720 promoter, an Os05g34510
promoter, an Os11g08230 promoter, an Os01g64390 promoter, an
Os06g15760 promoter, an Os12g37560 promoter, an Os03g17420
promoter, an Os04g51000 promoter, an Os01g01960 promoter, an
Os05g04990 promoter, an Os02g44970 promoter, an Os01g25530
promoter, an Os03g30650 promoter, an Os01g64910 promoter, an
Os07g26810 promoter, an Os07g26820 promoter, an Os09g11220
promoter, an Os04g21800 promoter, an Os10g23840 promoter, an
Os08g13850 promoter, an Os12g42980 promoter, an Os03g29280
promoter, an Os03g20650 promoter, or an Os06g43920 promoter (SEQ ID
NO: 136-159, respectively).
10. The method of claim 8, wherein an expression cassette
comprising the recombinant polynucleotide is introduced into a
target plant to produce the transgenic plant.
11. The method of claim 8, wherein the transgenic plant has an
altered trait that confers the greater photosynthetic resource use
efficiency, wherein the altered trait is: (a) increased
photosynthetic capacity; and/or (b) increased photosynthetic rate;
and/or (c) a decrease in leaf chlorophyll content; and/or (d) a
decrease in percentage of nitrogen in leaf dry weight; and/or (e)
increased transpiration efficiency; and/or (f) an increase in
resistance to water vapor diffusion exerted by leaf stomata; and/or
(g) an increase in a rate of reactions responsible for dissipating
light energy absorbed by light harvesting antennae as heat; and/or
(h) a decrease in the ratio of the carbon isotope ¹²C to
¹³C in above-ground biomass; and/or (i) an increase in the
total dry weight of above-ground plant material; and/or (j) greater
yield than the control plant.
12. The method of claim 8, wherein the transgenic plant is selected
for having the increased photosynthetic resource use efficiency
relative to the control plant.
13. The method of claim 12, wherein the plant is selected for
having the greater yield relative to the control plant.
14. The method of claim 8, wherein a plurality of the transgenic
plants have greater cumulative canopy photosynthesis than the
canopy photosynthesis of the same number of the control plants
grown under the same conditions and at the same density.
15. The method of claim 8, the method steps further including:
crossing the target plant with itself, a second plant from the same
line as the target plant, a non-transgenic plant, a wild-type
plant, or a transgenic plant from a different line of plants, to
produce a transgenic seed.
16. A method for producing and selecting a crop plant with greater
yield than a control plant, the method comprising: (a) providing
one or more transgenic plants that comprise a recombinant
polynucleotide that comprises a photosynthetic tissue-enhanced
promoter that regulates a polypeptide comprising SEQ ID NO: 2,
wherein the photosynthetic tissue-enhanced promoter does not
regulate protein expression in a constitutive manner; (b) growing a
plurality of the transgenic plants; and (c) selecting a transgenic
plant that: has greater photosynthetic resource use efficiency than
the control plant, wherein the control plant does not comprise the
recombinant polynucleotide; and/or comprises the recombinant
polynucleotide; wherein expression of the polypeptide in the
selected transgenic plant confers the greater yield of the selected
transgenic plant relative to the control plant.
17. The method of claim 16, the method steps further including: (d)
crossing the selected transgenic plant with itself, a second plant
from the same line as the selected transgenic plant, a
non-transgenic plant, a wild-type plant, or a transgenic plant from
a different line of plants, to produce a transgenic seed.
18. The method of claim 16, wherein the transgenic plant is
selected for having the increased photosynthetic resource use
efficiency relative to the control plant.
19. The method of claim 16, wherein a plurality of the selected
transgenic plants have greater cumulative canopy photosynthesis
than the canopy photosynthesis of the same number of the control
plants grown under the same conditions and at the same density.
20. The method of claim 16, wherein the selected transgenic plant
has an altered trait that confers the greater photosynthetic
resource use efficiency, wherein the altered trait is: (a)
increased photosynthetic capacity; and/or (b) increased
photosynthetic rate; and/or (c) a decrease in leaf chlorophyll
content; and/or (d) a decrease in percentage of nitrogen in leaf
dry weight; and/or (e) increased transpiration efficiency; and/or
(f) an increase in resistance to water vapor diffusion exerted by
leaf stomata; and/or (g) an increase in a rate of reactions
responsible for dissipating light energy absorbed by light
harvesting antennae as heat; and/or (h) a decrease in the ratio of
the carbon isotope ¹²C to ¹³C in above-ground biomass;
and/or (i) an increase in the total dry weight of above-ground
plant material.
Description
[0001] This application claims the benefit of copending U.S.
Provisional Application No. 61/679,320, filed Aug. 3, 2012, the
entire contents of which are hereby incorporated by reference.
FIELD OF THE INVENTION
[0002] The present invention relates to plant genomics and plant
improvement.
BACKGROUND OF THE INVENTION
[0003] A plant's phenotypic characteristics that enhance
photosynthetic resource use efficiency may be controlled through a
number of cellular processes. One important way to manipulate that
control is by manipulating the characteristics or expression of
regulatory proteins, proteins that influence the expression of a
particular gene or sets of genes. For example, transformed or
transgenic plants that comprise cells with altered levels of at
least one selected regulatory polypeptide may possess advantageous
or desirable traits, and strategies for manipulating traits by
altering a plant cell's regulatory polypeptide content or
expression level can result in plants and crops with commercially
valuable properties. Examples of such trait manipulation are
increased canopy photosynthesis, nitrogen use efficiency, or water
use efficiency, as considered below.
[0004] Increasing Canopy Photosynthesis to Increase Crop Yield.
[0005] Recent studies by crop physiologists have provided evidence
that crop-canopy photosynthesis is correlated with crop yield, and
that increasing canopy photosynthesis can increase crop yield (Long
et al., 2006. Plant Cell Environ. 29:315-33; Murchie et al., 2009.
New Phytol. 181:532-552; Zhu et al., 2010. Ann. Rev. Plant Biol.
61:235-261). Two overlapping strategies for increasing canopy
photosynthesis have been proposed. The first recognizes that there
exists great potential to increase canopy photosynthesis by
improving multiple discrete reactions that currently limit
photosynthetic capacity (reviewed in Zhu et al., 2010. supra). The
second focuses upon improving plant physiological status during
environmental conditions that limit the realization of
photosynthetic capacity. It is important to distinguish this second
goal from recent industry and academic screening for genes to
improve stress tolerance. Arguably, these efforts may have
identified genes that improve plant physiological status during
severe stresses not typically experienced on productive acres
(Jones, 2007. J. Exp. Bot. 58:119-130; Passioura, 2007. J. Exp.
Bot. 58:113-117). In contrast, improving the resource use
efficiency with which photosynthesis operates relative to the
availability of key resources of water, nitrogen and light, is
thought to be more appropriate for improving yield on productive
acres (Long et al., 1994. Ann. Rev. Plant Physiol. Plant Molec.
Biol. 45:633-662; Morison et al., 2008. Philosophical Transactions
of the Royal Society B: Biological Sciences 363:639-658; Passioura,
2007, supra).
[0006] Increasing Nitrogen Use Efficiency (NUE) to Increase Crop
Yield.
[0007] Food productivity has increased greatly over
the past 50 years, decreasing world hunger despite a
significant increase in population (Godfray et al., 2010. Science
327:812-818). A significant contribution to this increased yield
was a 20-fold increase in the application of nitrogen fertilizers
(Glass, 2003. Crit. Rev. Plant Sci. 22:453-470). About 85 million
to 90 million metric tons of nitrogen are applied annually to soil,
and this application rate is expected to increase to 240 million
metric tons by 2050 (Good et al., 2004. Trends Plant Sci.
9:597-605). However, plants use only 30 to 40% of the applied
nitrogen and the rest is lost through a combination of leaching,
surface run-off, denitrification, volatilization, and microbial
consumption (Frink et al., 1999. Proc. Natl. Acad. Sci. USA
96:1175-1180; Glass, 2003, supra; Good et al., 2004, supra; Raun
and Johnson, 1999. Agron. J. 91:357-363). The loss of more than 60%
of applied nitrogen can have serious environmental effects, such as
groundwater contamination, anoxic coastal zones, and conversion to
greenhouse gases. In addition, while most fertilizer components are
mined (such as phosphates), inorganic nitrogen is derived from the
energy intensive conversion of gaseous nitrogen to ammonia. Thus,
the addition of nitrogen fertilizer is typically the highest single
input cost for many crops, and since its production is energy
intensive, the cost is dependent on the price of energy (Rothstein,
2007. Plant Cell 19:2695-2699). With an increasing demand for food
from an increasing human population, agriculture yields must be
increased at the same time as dependence on applied fertilizers is
decreased. Therefore, to minimize nitrogen loss, reduce
environmental pollution, and decrease input cost, it is crucial to
develop crop varieties with higher nitrogen use efficiency (Garnett
et al., 2009. Plant Cell Environ. 32:1272-1283; Hirel et al., 2007.
J. Exp. Bot. 58:2369-2387; Lea and Azevedo, 2007. Ann. Appl. Biol.
151:269-275; Masclaux-Daubresse et al., 2010. Ann. Bot.
105:1141-1157; Moll et al., 1982. Agron. J. 74:562-564;
Sylvester-Bradley and Kindred, 2009. J. Exp. Bot.
60:1939-1951).
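The scale of the nitrogen loss described above can be checked with simple arithmetic. The following is a minimal sketch using only the figures quoted in this paragraph (85 to 90 million metric tons applied annually, 30 to 40% taken up by plants); the function name is hypothetical and introduced only for illustration.

```python
def nitrogen_lost(applied_mt: float, uptake_fraction: float) -> float:
    """Return nitrogen not taken up by plants, in million metric tons.

    Hypothetical helper for illustration; not part of the original disclosure.
    """
    if not 0.0 <= uptake_fraction <= 1.0:
        raise ValueError("uptake_fraction must be between 0 and 1")
    return applied_mt * (1.0 - uptake_fraction)


# Figures quoted in the text: 85-90 million metric tons applied, 30-40% uptake.
low = nitrogen_lost(85.0, 0.40)   # least applied, highest uptake
high = nitrogen_lost(90.0, 0.30)  # most applied, lowest uptake
print(f"Estimated annual loss: {low:.0f} to {high:.0f} million metric tons")
```

Under these figures the lost fraction is 60 to 70% of what is applied, which is consistent with the statement that more than 60% of applied nitrogen is lost.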
[0008] Improving Water Use Efficiency (WUE) to Improve Yield.
[0009] Freshwater is a limited and dwindling global resource;
therefore, improving the efficiency with which food and biofuel
crops use water is a prerequisite for maintaining and improving
yield (Karaba et al., 2007. Proc. Natl. Acad. Sci. USA.
104:15270-15275). WUE can be used to describe the relationship
between water use and crop productivity over a range of time
integrals. The basic physiological definition of WUE is the
ratio of photosynthesis (A) to transpiration (T) at a given moment
in time, also referred to as transpiration efficiency. However, the
WUE concept can be scaled up significantly, for example, over the
complete lifecycle of a crop, where biomass or yield can be
expressed per cumulative total of water transpired from the canopy.
The engineering of major field crops for improved WUE
with single genes has not yet been achieved (Karaba et al., 2007,
supra). Regardless, increased yields of wheat cultivars bred for
increased transpiration efficiency (the ratio of photosynthesis to
transpiration) have provided important support for the proposition
that crop yield can be increased over broad acres through
improvement in crop water-use efficiency (Condon et al., 2004. J.
Exp. Bot. 55:2447-2460).
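The two scales of WUE described above can be written as two short calculations. This is a sketch only: the function names, units, and numeric values are assumptions chosen for illustration and do not come from the disclosure.

```python
from typing import Sequence


def transpiration_efficiency(photosynthesis: float, transpiration: float) -> float:
    """Instantaneous WUE: photosynthesis (A) divided by transpiration (T)."""
    if transpiration <= 0:
        raise ValueError("transpiration must be positive")
    return photosynthesis / transpiration


def seasonal_wue(biomass: float, water_transpired: Sequence[float]) -> float:
    """Season-scale WUE: biomass (or yield) per cumulative water transpired."""
    total_water = sum(water_transpired)
    if total_water <= 0:
        raise ValueError("cumulative transpiration must be positive")
    return biomass / total_water


# Made-up values for illustration only.
instantaneous = transpiration_efficiency(photosynthesis=20.0, transpiration=5.0)
seasonal = seasonal_wue(biomass=1200.0, water_transpired=[100.0, 150.0, 50.0])
print(instantaneous, seasonal)
```

The same ratio is used at both scales; only the quantities in the numerator and denominator change, from momentary gas-exchange rates to whole-season totals.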
[0010] With these needs in mind, new technologies for yield
enhancement are required. In this disclosure, a phenotypic
screening platform that directly measures photosynthetic capacity,
water-use efficiency, and nitrogen use efficiency of mature plants
was used to discover advantageous properties conferred by ectopic
expression of the described regulatory proteins in plants.
SUMMARY
[0011] The instant description is directed to a transgenic plant or
plants that have increased photosynthetic resource use efficiency
with respect to a control plant. In this regard, the transgenic
plant or plants comprise a first recombinant polynucleotide
comprising a promoter of interest. The choice of promoter may
include a constitutive promoter or a promoter with enhanced
activity in a tissue capable of photosynthesis (also referred to
herein as a "photosynthetic promoter" or a "photosynthetic
tissue-enhanced promoter") such as a leaf tissue or other green
tissue. Examples of photosynthetic promoters include
an RBCS3 promoter (SEQ ID NO: 133), an RBCS4 promoter (SEQ ID NO:
134) or others such as the At4g01060 (also referred to as "G682")
promoter (SEQ ID NO: 135), the latter regulating expression in a
guard cell. The promoter regulates a polypeptide that is encoded by
the first recombinant polynucleotide or by a second (or target)
recombinant polynucleotide (in which case expression of the
polypeptide may be regulated by a trans-regulatory element). The
promoter may also regulate expression of a polypeptide to an
effective level of expression in a photosynthetic tissue, that is,
to a level that, as a result of expression of the polypeptide to
that level, improves photosynthetic resource use efficiency in a
transgenic plant relative to a control plant. The first
polynucleotide may comprise the promoter and also encode the
polypeptide or, alternatively, the first polynucleotide may comprise
the promoter and drive expression of the polypeptide which is
encoded by the second recombinant polynucleotide. In a preferred
embodiment, the polypeptide comprises SEQ ID NO: 2 or a sequence
that is paralogous or orthologous to SEQ ID NO: 2, being
structurally-related to SEQ ID NO: 2 and having a function similar
to SEQ ID NO: 2 as described herein. Expression of the polypeptide
under the regulatory control of the constitutive or the
leaf-enhanced or photosynthetic tissue-enhanced promoter in the
transgenic plant confers greater photosynthetic resource use
efficiency to the transgenic plants, and may ultimately increase
yield that may be obtained from the plants.
[0012] The instant description also pertains to methods for
increasing photosynthetic resource use efficiency in, or increasing
yield from, a plant or plants. The methods are conducted by
growing a transgenic plant comprising and/or transformed with an
expression cassette comprising the first recombinant polynucleotide
that comprises a constitutive promoter or a promoter expressed in
photosynthetic tissue, which may be a leaf-enhanced or green
tissue-enhanced promoter, such as, for example, the RBCS3, RBCS4 or
At4g01060 promoters, or another photosynthetic tissue-enhanced
promoter, for example, a promoter found in the sequence
listing or in Table 4. Said promoter regulates expression of a
polypeptide that comprises SEQ ID NO: 2, or a polypeptide sequence
within the MYB19 clade (recombinant polynucleotides encoding MYB19
clade polypeptides are described in the following paragraphs
(a)-(c), and exemplary polypeptides within the clade are described
in the following paragraphs (d)-(f) and are shown in FIG. 1 and
FIGS. 2A-2I).
[0013] The first or second recombinant polynucleotide encoding a
MYB19 clade polypeptide may include:
[0014] (a) nucleic acid sequences that are at least 30%, 31%, 32%,
33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%,
46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%,
59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%,
72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%,
85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95% or 96%, 97%,
98%, 99%, or about 100% identical to SEQ ID NO: 1, 3, 5, 7, 9, 11,
13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33; and/or
[0015] (b) nucleic acid sequences that encode polypeptide sequences
that are at least 40%, increasing by increments of 1% to about 100%
identical in their amino acid sequences to the entire length of any
of SEQ ID NOs: 2n, where n=1-17 (that is, SEQ ID NOs: 2, 4, 6, 8,
10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34); and/or
[0016] (c) nucleic acid sequences that hybridize under stringent
conditions (e.g., hybridization followed by one, two, or more wash
steps of 6×SSC and 65 °C for ten to thirty minutes
per step) to any of SEQ ID NOs: 2n-1, where n=1-17 (that is, SEQ ID
NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31,
33).
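The SEQ ID numbering used in paragraphs (b) and (c) follows a simple pattern: even numbers 2n for the polypeptides and odd numbers 2n-1 for the nucleic acids, with n running from 1 to 17. A minimal sketch that enumerates both lists (the variable names are illustrative only):

```python
# n runs from 1 to 17, as stated in paragraphs (b) and (c).
CLADE_SIZE = 17

# Polypeptide sequences: SEQ ID NOs 2n (2, 4, ..., 34).
polypeptide_ids = [2 * n for n in range(1, CLADE_SIZE + 1)]

# Nucleic acid sequences: SEQ ID NOs 2n-1 (1, 3, ..., 33).
polynucleotide_ids = [2 * n - 1 for n in range(1, CLADE_SIZE + 1)]

print(polypeptide_ids)
print(polynucleotide_ids)
```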
[0017] The MYB19 clade polypeptides may include:
[0018] (d) polypeptide sequences encoded by the nucleic acid
sequences of (a), (b) and/or (c); and/or
[0019] (e) polypeptide sequences that have at least 40% identity
increasing by increments of 1% to about 100% identity to SEQ ID NO:
2 or to SEQ ID NOs: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26,
28, 30, 32, or 34, and/or at least 69% identity increasing by
increments of 1% to about 100% identity to the first Myb DNA
binding domain of SEQ ID NO: 2 (`Myb DNA binding domain 1`) or SEQ
ID NOs: 61-77, and/or at least 72% identity increasing by
increments of 1% to about 100% identity to the second Myb DNA
binding domain (`Myb DNA binding domain 2`) of SEQ ID NO: 2 or SEQ
ID NOs: 95-111; and/or
[0020] (f) polypeptide sequences that comprise a subsequence that
is at least 95%, 96%, 97%, 98%, 99%, or about 100% identical to a
consensus sequence of SEQ ID NO: 129 or 130.
[0021] "Increasing by increments of 1% to about 100% identity" in
paragraphs (b) and (e) refers to at least: 40%, 41%, 42%, 43%, 44%,
45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%,
58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%,
71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%,
84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95% or 96%,
97%, 98%, 99%, or about 100% amino acid identity to SEQ ID NO: 2 or
to SEQ ID NOs: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30,
32, or 34; or
[0022] at least: 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%,
79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%,
92%, 93%, 94%, 95% or 96%, 97%, 98%, or at least 99%, or about 100%
identity to any of the first Myb DNA binding domains of SEQ ID NOs: 61
to 77; or
[0023] at least: 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%,
82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%,
95% or 96%, 97%, 98%, 99%, or about 100% identity to any of the
second Myb DNA binding domains of SEQ ID NOs: 95 to 111.
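The identity thresholds above (40% over the full length, 69% for the first Myb DNA binding domain, 72% for the second) can be illustrated with a small percent-identity calculation on a pairwise alignment. This is a sketch under stated assumptions: the disclosure does not specify here how identity is computed, so the convention used below (identical residues divided by aligned columns, ignoring columns where both sequences have a gap) is an assumption, and the sequences are toy strings, not entries from the Sequence Listing.

```python
# Minimum percent identities taken from paragraphs (b) and (e).
THRESHOLDS = {
    "full_length": 40.0,
    "myb_domain_1": 69.0,
    "myb_domain_2": 72.0,
}


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two aligned sequences ('-' marks a gap).

    Assumed convention: matches / columns, skipping columns that are
    gaps in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(region: str, aligned_a: str, aligned_b: str) -> bool:
    """True if the pair reaches the minimum identity for the given region."""
    return percent_identity(aligned_a, aligned_b) >= THRESHOLDS[region]


# Toy example: one gap column out of five, four identical residues.
print(percent_identity("ACD-F", "ACDEF"))
print(meets_threshold("myb_domain_2", "ACD-F", "ACDEF"))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values for the same alignment, so any real comparison would need to fix the convention first.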
[0024] Expression of these MYB19 clade polypeptides in the
transgenic plant may confer increased photosynthetic resource use
efficiency relative to a control plant. The transgenic plant may be
selected for increased photosynthetic resource use efficiency or
greater yield relative to the control plant. The transgenic plant
may also be crossed with itself, a second plant from the same line
as the transgenic plant, a non-transgenic plant, a wild-type plant,
or a transgenic plant from a different line of plants, to produce a
transgenic seed.
[0025] The instant description also pertains to methods for
producing and selecting a crop plant with a greater yield than a
control plant, the method comprising producing a transgenic plant
by introducing into a target plant a recombinant polynucleotide
that comprises a promoter, such as a leaf- or photosynthetic
tissue-enhanced promoter that regulates a polypeptide encoded by
the recombinant polynucleotide or a second recombinant
polynucleotide, wherein the polypeptide comprises SEQ ID NO: 2. A
plurality of the transgenic plants are then grown, and a transgenic
plant is selected that produces greater yield or has greater
photosynthetic resource use efficiency than a control plant. The
expression of the polypeptide in the selected transgenic plant
confers the greater photosynthetic resource use efficiency and/or
greater yield relative to the control plant. Optionally, the
selected transgenic plant may be crossed with itself, a second
plant from the same line as the transgenic plant, a non-transgenic
plant, a wild-type plant, or a transgenic plant from a different
line of plants, to produce a transgenic seed. A plurality of the
selected transgenic plants will generally have greater cumulative
canopy photosynthesis than the canopy photosynthesis of an
identical number of the control plants.
[0026] The transgenic plant(s) described herein and produced by the
instantly described methods may also possess one or more altered
traits that result in greater photosynthetic resource use
efficiency. The altered trait may include: increased photosynthetic
capacity; a decrease in leaf chlorophyll content; a decrease in
percentage of nitrogen in leaf dry weight; increased transpiration
efficiency; an increase in resistance to water vapor diffusion
exerted by leaf stomata; an increase in a rate of reactions
responsible for dissipating light energy absorbed by light
harvesting antennae as heat; a decrease in the ratio of the carbon
isotope ¹²C to ¹³C in above-ground biomass; and/or an
increase in the total dry weight of above-ground plant
material.
[0027] At least one advantage of greater photosynthetic resource
use efficiency is that the transgenic plant, or a plurality of the
transgenic plants, will have greater cumulative canopy
photosynthesis than the canopy photosynthesis of an identical
number of the control plants, or produce greater yield than an
identical number of the control plants. A wide variety of
transgenic plants are envisioned, including corn, wheat, rice,
Setaria, Miscanthus, switchgrass, ryegrass, sugarcane,
miscane, barley, sorghum, soy, cotton, canola, rapeseed, Crambe,
Camelina, sugar beet, alfalfa, tomato, Eucalyptus, poplar, willow,
pine, birch and other woody plants.
[0028] The instant description also pertains to expression vectors
that comprise a recombinant polynucleotide that comprises a
promoter expressed in photosynthetic tissue, for example a leaf- or
green tissue-enhanced promoter including the RBCS3, RBCS4, or
At4g01060 promoters (SEQ ID NOs: 133-135), or another
photosynthetic tissue-enhanced promoter, for example, such a
promoter found in the sequence listing or in Table 4 (e.g., SEQ ID
NOs: 136-159), and a subsequence that encodes a polypeptide
comprising SEQ ID NO: 2, or, alternatively, two expression
constructs, one of which encodes a promoter such as a leaf-enhanced
promoter or other photosynthetic tissue-enhanced promoter, and the
second encodes the polypeptide comprising SEQ ID NO: 2. In either
instance, whether the polypeptide is encoded by the first or second
expression constructs, the promoter regulates expression of the
polypeptide comprising SEQ ID NO: 2 by being responsible for
production of cis- or trans-regulatory elements, respectively.
[0029] In the above paragraphs, the control plant may be
exemplified by a plant of the same species as the plant comprising
the recombinant polynucleotide, but the control plant does not
comprise the recombinant polynucleotide (containing the promoter
and possibly encoding the polypeptide) or the second recombinant
polynucleotide.
BRIEF DESCRIPTION OF THE SEQUENCE LISTING AND DRAWINGS
[0030] The Sequence Listing provides exemplary polynucleotide and
polypeptide sequences of the instant description. The traits
associated with the use of the sequences are included in the
Examples.
[0031] Incorporation of the Sequence Listing.
[0032] The Sequence Listing provides exemplary polynucleotide and
polypeptide sequences. The copy of the Sequence Listing, being
submitted electronically with this patent application, provided
under 37 CFR §1.821-1.825, is a read-only memory
computer-readable file in ASCII text format. The Sequence Listing
is named "MBI-0203P_ST25.txt"; the electronic file of the Sequence
Listing was created on Aug. 3, 2012, and is 222,861 bytes in size
(217 kilobytes in size as measured in MS-WINDOWS). The Sequence
Listing is herein incorporated by reference in its entirety.
[0033] In FIG. 1, a phylogenetic tree of the MYB19 (also referred
to as G1309) clade members and related full length proteins was
constructed using TreeBeST (Ruan et al., 2008. Nucleic Acids Res.
36 (suppl. 1): D735-D740) using the "best" command to identify the
best tree from maximum likelihood and neighbor joining methods. The
MYB19 clade members appear in the large box with the solid line
boundary. MYB19 appears in the oval. An ancestral sequence of MYB19
and closely-related sequences is represented by the node of the
tree indicated by the arrow "A" in FIG. 1. MYB19 clade members are
considered those proteins that descended from ancestral sequence
"A", including the exemplary sequences shown in this figure that
are bounded by LOC_Os04g45020.1 and Solyc03g025870.2.1 (indicated
by the box around these sequences). A related clade is represented
by the node indicated by arrow "B".
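Clade membership as defined above (all proteins descended from ancestral node "A") amounts to collecting the leaves beneath one node of the tree. The following sketch shows that idea on a hypothetical toy tree; the topology and the leaf names are invented for illustration and are not the tree of FIG. 1.

```python
from typing import Dict, List

# Hypothetical toy tree: each internal node maps to its children.
# Nodes without an entry are leaves (individual sequences).
TOY_TREE: Dict[str, List[str]] = {
    "root": ["A", "B"],
    "A": ["A1", "seq3"],
    "A1": ["seq1", "seq2"],
    "B": ["seq4", "seq5"],
}


def clade_members(tree: Dict[str, List[str]], node: str) -> List[str]:
    """Return every leaf descended from the given node."""
    children = tree.get(node, [])
    if not children:
        return [node]  # a leaf is its own only descendant
    leaves: List[str] = []
    for child in children:
        leaves.extend(clade_members(tree, child))
    return leaves


# Leaves under node "A" form one clade; leaves under "B" form a related clade.
print(clade_members(TOY_TREE, "A"))
print(clade_members(TOY_TREE, "B"))
```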
[0034] FIGS. 2A-2I show an alignment of the MYB19 clade and related
proteins which appear in the boxes with the solid line boundaries.
The alignment was generated with MUSCLE v3.8.31 (Edgar (2004)
Nucleic Acids Res. 32:1792-1797) with default parameters. SEQ ID
NOs: appear in parentheses after each Gene Identifier (GID). The
conserved first and second Myb DNA binding domains appear in boxes
with the dashed line boundaries. The conserved residues within the
clade are shown in the last rows of FIGS. 2B-2F and are presented
as SEQ ID NOs: 129 (underlined), 130 (double underlined) and 160.
SEQ ID NOs: 129 and 130 share the triple underlined Glu residue in
FIG. 2C.
[0035] FIG. 3 presents a plot of photosynthetic capacity at growth
temperature, showing increased light saturated photosynthesis
(A.sub.sat) over a range of leaf, sub-stomatal CO.sub.2
concentration (C.sub.i), in five MYB19 overexpression lines,
compared to a control line. Data were collected over a range of
C.sub.i over which the activity of Rubisco is known to limit
A.sub.sat. The solid line shown is a regression fitted to the data
for the control line only. All data are the means ± 1 standard
error for data collected on at least nine replicate plants for each
line.
[0036] FIG. 4 presents a plot of photosynthetic capacity at growth
temperature showing increased A.sub.sat over a range of leaf,
sub-stomatal C.sub.i in five MYB19 overexpression lines, compared
to a control line. Data were collected over a range of C.sub.i over
which the capacity to regenerate RuBP is known to limit A.sub.sat.
The solid line shown is a regression fitted to the data for the
control line only. All data are the means ± 1 standard error for
data collected on at least nine replicate plants for each line.
LEGEND FOR FIG. 3 AND FIG. 4
[0037] control
[0038] ○ Line 2
[0039] ◇ Line 3
[0040] Δ Line 6
[0041] □ Line 7
[0042] Line 8
DETAILED DESCRIPTION
[0043] The present description relates to polynucleotides and
polypeptides for modifying phenotypes of plants, particularly those
associated with increased photosynthetic resource use efficiency
and increased yield with respect to a control plant (for example, a
wild-type plant). Throughout this disclosure, various information
sources are referred to and/or are specifically incorporated. The
information sources include scientific journal articles, patent
documents, textbooks, and internet entries. While the reference to
these information sources clearly indicates that they can be used
by one of skill in the art, each and every one of the information
sources cited herein are specifically incorporated in their
entirety, whether or not a specific mention of "incorporation by
reference" is noted. The contents and teachings of each and every
one of the information sources can be relied on and used to make
and use embodiments of the instant description.
[0044] As used herein and in the appended claims, the singular
forms "a", "an", and "the" include the plural reference unless the
context clearly dictates otherwise. Thus, for example, a reference
to "a host cell" includes a plurality of such host cells, and a
reference to "a plant" is a reference to one or more plants, and so
forth.
[0045] A "recombinant polynucleotide" is a polynucleotide that is
not in its native state, e.g., the polynucleotide comprises a
nucleotide sequence not found in nature, or the polynucleotide is
in a context other than that in which it is naturally found, e.g.,
separated from nucleotide sequences with which it typically is in
proximity in nature, or adjacent (or contiguous with) nucleotide
sequences with which it typically is not in proximity. For example,
the sequence at issue can be cloned into a vector, or otherwise
recombined with one or more additional nucleic acids.
[0046] A "polypeptide" is an amino acid sequence comprising a
plurality of consecutive polymerized amino acid residues, e.g., at
least about 15 consecutive polymerized amino acid residues. In many
instances, a polypeptide comprises a polymerized amino acid residue
sequence that is a regulatory polypeptide or a domain or portion or
fragment thereof. Additionally, the polypeptide may comprise: (i) a
localization domain; (ii) an activation domain; (iii) a repression
domain; (iv) an oligomerization domain; (v) a protein-protein
interaction domain; (vi) a DNA-binding domain; or the like. The
polypeptide optionally comprises modified amino acid residues,
naturally occurring amino acid residues not encoded by a codon, or
non-naturally occurring amino acid residues.
[0047] "Protein" refers to an amino acid sequence, oligopeptide,
peptide, polypeptide or portions thereof whether naturally
occurring or synthetic.
[0048] A "recombinant polypeptide" is a polypeptide produced by
translation of a recombinant polynucleotide. A "synthetic
polypeptide" is a polypeptide created by consecutive polymerization
of isolated amino acid residues using methods well known in the
art. An "isolated polypeptide," whether a naturally occurring or a
recombinant polypeptide, is more enriched in (or out of) a cell
than the polypeptide in its natural state in a wild-type cell,
e.g., more than about 5% enriched, more than about 10% enriched, or
more than about 20%, or more than about 50%, or more, enriched,
i.e., alternatively denoted: 105%, 110%, 120%, 150% or more,
enriched relative to wild type standardized at 100%. Such an
enrichment is not the result of a natural response of a wild-type
plant. Alternatively, or additionally, the isolated polypeptide is
separated from other cellular components with which it is typically
associated, e.g., by any of the various protein purification
methods herein.
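
The alternative notation above is simple arithmetic, sketched here for illustration only. The function name is hypothetical, and the wild-type level is standardized at 100% as stated in the text.

```python
def relative_level(percent_enriched, wild_type=100.0):
    """Express 'X% enriched' as a level relative to wild type (standardized at 100%)."""
    # Adding the enrichment to the baseline keeps the arithmetic exact for whole percentages.
    return wild_type + wild_type * percent_enriched / 100.0


# The enrichment values named in the text: about 5%, 10%, 20% and 50%.
levels = [relative_level(p) for p in (5, 10, 20, 50)]
print(levels)
```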
[0049] "Identity" or "similarity" refers to sequence similarity
between two polynucleotide sequences or between two polypeptide
sequences, with identity being a more strict comparison. The
phrases "percent identity" and "% identity" refer to the percentage
of sequence similarity found in a comparison of two or more
polynucleotide sequences or two or more polypeptide sequences.
"Sequence similarity" refers to the percent similarity in base pair
sequence (as determined by any suitable method) between two or more
polynucleotide sequences. Two or more sequences can be anywhere
from 0-100% similar or identical, or any integer value between
0-100%. Identity or similarity can be determined by comparing a
position in each sequence that may be aligned for purposes of
comparison. When a position in the compared sequence is occupied by
the same nucleotide base or amino acid, then the molecules are
identical at that position. A degree of similarity or identity
between polynucleotide sequences is a function of the number of
identical, matching or corresponding nucleotides at positions
shared by the polynucleotide sequences. A degree of identity of
polypeptide sequences is a function of the number of identical
amino acids at corresponding positions shared by the polypeptide
sequences. A degree of homology or similarity of polypeptide
sequences is a function of the number of amino acids at
corresponding positions shared by the polypeptide sequences. The
fraction or percentage of components in common is related to the
homology or identity between the sequences. Alignments such as
those of FIGS. 2A-2I may be used to identify conserved domains and
relatedness within these domains. An alignment may suitably be
determined by means of computer programs known in the art, such as
MACVECTOR software, (1999; Accelrys, Inc., San Diego, Calif.).
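
As an illustration of the position-by-position comparison described above, the following minimal sketch computes percent identity for two sequences that have already been aligned. It is not the method of MACVECTOR or of any other program. The function name is hypothetical, and the choice to exclude gapped positions from the denominator is an assumption (other conventions, such as dividing by the full alignment length, are also in use).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared positions occupied by the same base or residue.

    Both inputs must come from the same alignment and have equal length.
    Positions where either sequence has a gap are skipped (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped column: not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical aligned peptides: six compared positions, five of them identical.
example = percent_identity("MKRG-HW", "MKKGAHW")
```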
[0050] "Homologous sequences" refers to polynucleotide or
polypeptide sequences that are similar due to common ancestry and
sequence conservation. The terms "ortholog" and "paralog" are
defined below in the section entitled "Orthologs and Paralogs". In
brief, orthologs and paralogs are evolutionarily related genes that
have similar sequences and functions. Orthologs are structurally
related genes in different species that are derived by a speciation
event. Paralogs are structurally related genes within a single
species that are derived by a duplication event.
[0051] "Functional homologs" are polynucleotide or polypeptide
sequences, including orthologs and paralogs, that are similar due
to common ancestry and sequence conservation and have identical or
similar function at the catalytic, cellular, or organismal levels.
The presently disclosed MYB19 clade polypeptides are
"functionally-related and/or closely-related" by having descended
from a common ancestral sequence (see the node shown by arrow A in
FIG. 1), and/or by being sufficiently similar to the sequences and
domains listed in Tables 2 or 3 that they confer the same function
to plants of increased photosynthetic resource use efficiency and
associated improved plant vigor, quality, yield, size, and/or
biomass.
[0052] Functionally-related and/or closely-related polypeptides may
be created artificially, semi-synthetically, or may occur naturally
by having descended from the same ancestral sequence as the
disclosed MYB19-related sequences, where the polypeptides have the
function of conferring increased photosynthetic resource use
efficiency to plants.
[0053] "Conserved domains" are recurring units in molecular
evolution, the extents of which can be determined by sequence and
structure analysis. A "conserved domain" or "conserved region" as
used herein refers to a region in heterologous polynucleotide or
polypeptide sequences where there is a relatively high degree of
sequence identity between the distinct sequences. Conserved domains
contain conserved sequence patterns or motifs that allow for their
detection in, and identification and characterization of,
polypeptide sequences. A Myb or Myb-like domain is an example of a
conserved domain.
[0054] A transgenic plant is expected to have improved or increased
photosynthetic resource use efficiency relative to a control plant
when the transgenic plant is transformed with a recombinant
polynucleotide encoding any of the listed sequences or another
MYB19 clade sequence, or when the transgenic plant contains or
expresses a MYB19 clade sequence.
[0055] The terms "highly stringent" or "highly stringent condition"
refer to conditions that permit hybridization of DNA strands whose
sequences are highly complementary, wherein these same conditions
exclude hybridization of significantly mismatched DNAs.
Polynucleotide sequences capable of hybridizing under stringent
conditions with the polynucleotides of the present description may
be, for example, variants of the disclosed polynucleotide
sequences, including allelic or splice variants, or sequences that
encode orthologs or paralogs of presently disclosed polypeptides.
Nucleic acid hybridization methods are disclosed in detail by
Kashima et al., 1985. Nature 313: 402-404; Sambrook et al., 1989.
Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor
Laboratory, Cold Spring Harbor, N.Y., and by Haymes et al., 1985.
Nucleic Acid Hybridization: A Practical Approach, IRL Press,
Washington, D.C., which references are incorporated herein by
reference.
[0056] In general, stringency is determined by the temperature,
ionic strength, and concentration of denaturing agents (e.g.,
formamide) used in a hybridization and washing procedure (for a
more detailed description of establishing and determining
stringency, see the section "Identifying Polynucleotides or Nucleic
Acids by Hybridization", below). The degree to which two nucleic
acids hybridize under various conditions of stringency is
correlated with the extent of their similarity. Thus, similar
nucleic acid sequences from a variety of sources, such as within a
plant's genome (as in the case of paralogs) or from another plant
(as in the case of orthologs) that may perform similar functions
can be isolated on the basis of their ability to hybridize with
known related polynucleotide sequences. Numerous variations are
possible in the conditions and means by which nucleic acid
hybridization can be performed to isolate related polynucleotide
sequences having similarity to sequences known in the art and are
not limited to those explicitly disclosed herein. Such an approach
may be used to isolate polynucleotide sequences having various
degrees of similarity with disclosed polynucleotide sequences, such
as, for example, encoded regulatory polypeptides also having at
least 40% identity to SEQ ID NO: 2, and/or 69% identity to the
first Myb DNA binding domain of SEQ ID NO: 2, and/or 72% identity
to the second Myb DNA binding domain of SEQ ID NO: 2, increasing by
steps of 1% to about 100%, identity with the conserved domains of
disclosed sequences (see, for example, Table 2 showing MYB19 clade
polypeptides having at least 69%, 70%, 72%, 73%, 75%, 76%, 77%,
78%, 80%, 85%, or about 100% amino acid identity with the first Myb
DNA binding domain of SEQ ID NO: 2, and/or, in Table 3, at least
72%, 74%, 76%, 79%, 81%, 88% or about 100% amino acid identity with
the second Myb DNA binding domain of SEQ ID NO: 2).
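
The lower identity bounds recited above (40% to the full length of SEQ ID NO: 2, 69% to its first Myb DNA binding domain, 72% to its second) can be applied to a candidate as in the following sketch. The dictionary keys, the function name, and the example identity values are hypothetical. Because the text joins the criteria with "and/or", the sketch reports each criterion separately rather than requiring all three.

```python
# Lower bounds taken from the text; the key names are illustrative only.
THRESHOLDS = {
    "full_length": 40.0,   # identity to SEQ ID NO: 2
    "myb_domain_1": 69.0,  # identity to the first Myb DNA binding domain
    "myb_domain_2": 72.0,  # identity to the second Myb DNA binding domain
}


def criteria_met(identities):
    """Return, per criterion, whether the candidate's percent identity reaches the bound."""
    return {
        name: identities.get(name, 0.0) >= cutoff
        for name, cutoff in THRESHOLDS.items()
    }


# Hypothetical candidate, not a sequence from the Sequence Listing.
candidate = {"full_length": 45.0, "myb_domain_1": 70.0, "myb_domain_2": 71.0}
result = criteria_met(candidate)
```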
[0057] "Fragment", with respect to a polynucleotide, refers to a
clone or any part of a polynucleotide molecule that retains a
usable, functional characteristic. Useful fragments include
oligonucleotides and polynucleotides that may be used in
hybridization or amplification technologies or in the regulation of
replication, transcription or translation. A "polynucleotide
fragment" refers to any subsequence of a polynucleotide, typically,
of at least about nine consecutive nucleotides, preferably at least
about 30 nucleotides, more preferably at least about 50
nucleotides, of any of the sequences provided herein. Exemplary
polynucleotide fragments are the first 60 consecutive nucleotides
of the polynucleotides listed in the Sequence Listing. Exemplary
fragments also include fragments that comprise a region that
encodes a conserved domain of a polypeptide. Exemplary fragments
also include fragments that comprise a conserved domain of a
polypeptide. Exemplary fragments include fragments that comprise a
conserved domain of a polypeptide, for example, amino acid residues
17-77 or 70-112 of MYB19 (SEQ ID NO: 2), or the amino acid residues
of the domains listed in Tables 2 or 3, or SEQ ID NO: 61-77 or
95-111.
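
Residue ranges such as 17-77 and 70-112 are 1-based and inclusive, so extracting the corresponding fragment from a sequence string requires a small index adjustment. The sketch below assumes that convention. The function name is hypothetical and the sequence is a placeholder, not SEQ ID NO: 2.

```python
def extract_residues(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a polypeptide string."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("residue range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Placeholder 120-residue sequence used only to exercise the coordinates.
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 6
first_domain = extract_residues(placeholder, 17, 77)    # 61 residues
second_domain = extract_residues(placeholder, 70, 112)  # 43 residues
```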
[0058] Fragments may also include subsequences of polypeptides and
protein molecules, or a subsequence of the polypeptide. Fragments
may have uses in that they may have antigenic potential. In some
cases, the fragment or domain is a subsequence of the polypeptide
which performs at least one biological function of the intact
polypeptide in substantially the same manner, or to a similar
extent, as does the intact polypeptide. For example, a polypeptide
fragment can comprise a recognizable structural motif or functional
domain such as a DNA-binding site or domain that binds to a DNA
promoter region, an activation domain, or a domain for
protein-protein interactions, and may initiate transcription.
Fragments can vary in size from as few as three amino acid residues
to the full length of the intact polypeptide, but are preferably at
least about 30 amino acid residues in length and more preferably at
least about 60 amino acid residues in length.
[0059] Fragments may also refer to a functional fragment of a
promoter region. For example, a recombinant polynucleotide capable
of modulating transcription in a plant may comprise a nucleic acid
sequence with similarity to, or a percentage identity to, a
promoter region exemplified by a promoter sequence provided in the
Sequence Listing (also see promoters listed in Example I), a
fragment thereof, or a complement thereof, wherein the nucleic acid
sequence, or the fragment thereof, or the complement thereof,
regulates expression of a polypeptide in a plant cell.
[0060] The term "plant" includes whole plants, shoot vegetative
organs/structures (for example, leaves, stems and tubers), roots,
flowers and floral organs/structures (for example, bracts, sepals,
petals, stamens, carpels, anthers and ovules), seed (including
embryo, endosperm, and seed coat) and fruit (the mature ovary),
plant tissue (for example, vascular tissue, ground tissue, and the
like) and cells (for example, guard cells, egg cells, and the
like), and progeny of same. The class of the plants that can be
transformed using the methods of the instant description
is generally as broad as the class of higher and lower plants
amenable to transformation techniques, including angiosperms
(monocotyledonous and dicotyledonous plants), gymnosperms, ferns,
horsetails, psilophytes, lycophytes, and bryophytes.
[0061] A "control plant" as used in the present description refers
to a plant cell, seed, plant component, plant tissue, plant organ
or whole plant used to compare against a transgenic or genetically
modified plant for the purpose of identifying an enhanced phenotype
in the transgenic or genetically modified plant. A control plant
may in some cases be a transgenic plant line that comprises an
empty vector or marker gene, but does not contain the recombinant
polynucleotide of the present description that is expressed in the
transgenic or genetically modified plant being evaluated. In
general, a control plant is a plant of the same line or variety as
the transgenic or genetically modified plant being tested. A
suitable control plant would include a genetically unaltered or
non-transgenic plant of the parental line used to generate a
transgenic plant herein.
[0062] A "transgenic plant" refers to a plant that contains genetic
material not found in a wild-type plant of the same species,
variety or cultivar. The genetic material may include a transgene,
an insertional mutagenesis event (such as by transposon or T-DNA
insertional mutagenesis), an activation tagging sequence, a mutated
sequence, a homologous recombination event or a sequence modified
by chimeraplasty. Typically, the foreign genetic material has been
introduced into the plant by human manipulation, but any method can
be used as one of skill in the art recognizes.
[0063] A transgenic line or transgenic plant line refers to the
progeny plant or plants deriving from the stable integration of
heterologous genetic material into a specific location or locations
within the genome of the original transformed cell.
[0064] A transgenic plant may contain an expression vector or
cassette. The expression cassette typically comprises a
polypeptide-encoding sequence operably linked to (i.e., under
regulatory control of) appropriate inducible, tissue-enhanced,
tissue-specific, or constitutive regulatory sequences that allow
for the controlled expression of the polypeptide. The expression
cassette can be introduced into a plant by transformation or by
breeding after transformation of a parent plant. A plant refers to
a whole plant as well as to a plant part, such as seed, fruit,
leaf, or root, plant tissue, plant cells or any other plant
material, e.g., a plant explant, as well as to progeny thereof, and
to in vitro systems that mimic biochemical or cellular components
or processes in a cell.
[0065] "Germplasm" refers to a genetic material or a collection of
genetic resources for an organism from an individual plant, a group
of related individual plants (for example, a plant line, a plant
variety or a plant family), or a clone derived from a plant line,
plant variety, plant species, or plant culture.
[0066] A constitutive promoter is active under most environmental
conditions, and in most plant parts. Regulation of protein
expression in a constitutive manner refers to the control of
expression of a gene and/or its encoded protein in all tissues
regardless of the surrounding environment or development stage of
the plant.
[0067] Alternatively, expression of the disclosed or listed
polypeptides may be under the regulatory control of a promoter that
is not a constitutive promoter. For example, tissue-enhanced (also
referred to as tissue-preferred), tissue-specific, cell
type-specific, and inducible promoters constitute non-constitutive
promoters; that is, these promoters do not regulate protein
expression in a constitutive manner. Tissue-enhanced or
tissue-preferred promoters facilitate expression of a gene and/or
its encoded protein in specific tissue(s) and generally, although
perhaps not completely, do not express the gene and/or protein in
all other tissues of the plant, or do so to a much lesser extent.
Promoters under developmental control include promoters that
preferentially initiate transcription in certain tissues, such as
xylem, leaves, roots, or seeds. Such promoters are examples of
tissue-enhanced or tissue-preferred promoters (see U.S. Pat. No.
7,365,186). Tissue-specific promoters generally confine transgene
expression to a single plant part, tissue or cell-type, although
many such promoters are not perfectly restricted in their
expression and their regulatory control is more properly described
as being "tissue-enhanced" or "tissue-preferred". Tissue-enhanced
promoters primarily regulate transgene expression in a limited
number of plant parts, tissues or cell-types, and the latter type
of promoters causes the expression of proteins to be overwhelmingly
restricted to a few particular tissues, plant parts, or cell types.
An example of a tissue-enhanced promoter is a "photosynthetic
tissue-enhanced promoter", for which the promoter preferentially
regulates gene or protein expression in photosynthetic tissues
(e.g., leaves, cotyledons, stems, etc.). Tissue-enhanced promoters
can be found upstream and operatively linked to DNA sequences
normally transcribed in higher levels in certain plant tissues or
specifically in certain plant tissues, respectively.
"Cell-enhanced", "tissue-enhanced", or "tissue-specific" regulation
thus refers to the control of gene or protein expression, for
example, by a promoter, which drives expression that is not
necessarily totally restricted to a single type of cell or tissue,
but where expression is elevated in particular cells or tissues to
a greater extent than in other cells or tissues within the
organism, and in the case of tissue-specific regulation, in a
manner that is primarily elevated in a specific tissue.
Tissue-enhanced or preferred promoters have been described in, for
example, U.S. Pat. No. 7,365,186, or U.S. Pat. No. 7,619,133.
[0068] Another example of a promoter that is not a constitutive
promoter is a "condition-enhanced" promoter, the latter term
referring to a promoter that activates a gene in response to a
particular environmental stimulus. This may include, for example,
an abiotic stress, infection caused by a pathogen, light treatment,
etc., and a condition-enhanced promoter drives expression in a
unique pattern which may include expression in specific cell and/or
tissue types within the organism (as opposed to a constitutive
expression pattern in all cell types of an organism at all
times).
[0069] "Wild type" or "wild-type", as used herein, refers to a
plant cell, seed, plant component, plant tissue, plant organ or
whole plant that has not been genetically modified or treated in an
experimental sense. Wild-type cells, seed, components, tissue,
organs or whole plants may be used as controls to compare levels of
expression and the extent and nature of trait modification with
cells, tissue or plants of the same species in which a
polypeptide's expression is altered, e.g., in that it has been
knocked out, overexpressed, or ectopically expressed.
[0070] With regard to gene knockouts as used herein, the term
"knockout" (KO) refers to a plant or plant cell having a disruption
in at least one gene in the plant or cell, where the disruption
results in a reduced expression or activity of the polypeptide
encoded by that gene compared to a control cell. The knockout can
be the result of, for example, genomic disruptions, including
transposons, tilling, and homologous recombination, antisense
constructs, sense constructs, RNA silencing constructs, or RNA
interference. A T-DNA insertion within a gene is an example of a
genotypic alteration that may abolish expression of that gene.
[0071] "Ectopic expression" or "altered expression" in reference to
a polynucleotide indicates that the pattern of expression in, e.g.,
a transgenic plant or plant tissue, is different from the
expression pattern in a wild-type plant or a reference plant of the
same species. The pattern of expression may also be compared with a
reference expression pattern in a wild-type plant of the same
species. For example, the polynucleotide or polypeptide is
expressed in a cell or tissue type other than a cell or tissue type
in which the sequence is expressed in the wild-type plant, or by
expression at a time other than at the time the sequence is
expressed in the wild-type plant, or by a response to different
inducible agents, such as hormones or environmental signals, or at
different expression levels (either higher or lower) compared with
those found in a wild-type plant. The term also refers to altered
expression patterns that are produced by lowering the levels of
expression to below the detection level or completely abolishing
expression. The resulting expression pattern can be transient or
stable, constitutive or inducible. In reference to a polypeptide,
the term "ectopic expression or altered expression" further may
relate to altered activity levels resulting from the interactions
of the polypeptides with exogenous or endogenous modulators or from
interactions with factors or as a result of the chemical
modification of the polypeptides.
[0072] The term "overexpression" as used herein refers to a greater
expression level of a gene in a plant, plant cell or plant tissue,
compared to expression of that gene in a wild-type plant, cell or
tissue, at any developmental or temporal stage. Overexpression can
occur when, for example, the genes encoding one or more
polypeptides are under the control of a strong promoter (e.g., the
cauliflower mosaic virus 35S transcription initiation region).
Overexpression may also be achieved by placing a gene of interest
under the control of an inducible or tissue specific promoter, or
may be achieved through integration of transposons or engineered
T-DNA molecules into regulatory regions of a target gene. Other
means for inducing overexpression may include making targeted
changes in a gene's native promoter, e.g. through elimination of
negative regulatory sequences or engineering positive regulatory
sequences, through the use of targeted nuclease activity (such as
zinc finger nucleases or TAL effector nucleases) for genome
editing. Elimination of micro-RNA binding sites in a gene's
transcript may also result in overexpression of that gene.
Additionally, a gene may be overexpressed by creating an artificial
transcriptional activator targeted to bind specifically to its
promoter sequences, comprising an engineered sequence-specific DNA
binding domain such as a zinc finger protein or TAL effector
protein fused to a transcriptional activation domain. Thus,
overexpression may occur throughout a plant, in specific tissues of
the plant, or in the presence or absence of particular
environmental signals, depending on the promoter or overexpression
approach used.
[0073] Overexpression may take place in plant cells normally
lacking expression of polypeptides functionally equivalent or
identical to the present polypeptides. Overexpression may also
occur in plant cells where endogenous expression of the present
polypeptides or functionally equivalent molecules normally occurs,
but such normal expression is at a lower level. Overexpression thus
results in a greater than normal production, or "overproduction" of
the polypeptide in the plant, cell or tissue.
[0074] "Nitrogen limitation" or "nitrogen-limiting" refers to
nitrogen levels that act as net limitations on primary production
in terrestrial or aquatic biomes. Much of terrestrial growth,
including much of crop growth, is limited by the availability of
nitrogen, which can be alleviated by nitrogen input through
deposition or fertilization.
[0075] "Water use efficiency", or WUE, measured as the biomass
produced per unit transpiration, describes the relationship between
water use and crop production. The basic physiological definition
of WUE equates to the ratio of photosynthesis (A) to transpiration
(T), also referred to as transpiration efficiency (Karaba et al.
2007, supra).
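
The physiological definition above is a simple ratio, shown in the following sketch. The function name, the units (A in µmol CO2 per square meter per second, T in mmol H2O per square meter per second) and the example values are assumptions for illustration and are not measurements from this description.

```python
def transpiration_efficiency(photosynthesis, transpiration):
    """WUE at the leaf level: the ratio of photosynthesis (A) to transpiration (T)."""
    if transpiration <= 0:
        raise ValueError("transpiration must be positive")
    return photosynthesis / transpiration


# Hypothetical values: A = 20 umol CO2 m^-2 s^-1 and T = 4 mmol H2O m^-2 s^-1.
wue = transpiration_efficiency(20.0, 4.0)  # umol CO2 per mmol H2O
```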
[0076] "Photosynthetic capacity" refers to the limits placed on
photosynthesis by the physiology of the chloroplast, specifically
the regulation of light absorption and the activities of enzymes in the C3
and C4 photosynthetic pathways. Increasing photosynthetic capacity
is seen as an important means of increasing leaf and crop-canopy
photosynthesis, and crop yield.
[0077] "Rubisco (ribulose-1,5-bisphosphate carboxylase oxygenase)
activity" refers to the activation state of Rubisco, the most
abundant protein in the chloroplast and a key limitation to C3
photosynthesis. Increasing Rubisco activity, whether by increasing the
amount of Rubisco in the chloroplast, by altering any combination of
specific reactions that regulate Rubisco activity, or by increasing
the concentration of CO.sub.2 in the chloroplast, is seen as an
important means of improving C3 leaf and crop-canopy photosynthesis
and crop yield.
[0078] The "capacity for RuBP (ribulose-1,5-bisphosphate)
regeneration" refers to the rate at which RuBP, a key
photosynthetic substrate is regenerated in the Calvin cycle.
Increasing the capacity for RuBP regeneration by increasing the
activity of enzymes in the regenerative phase of the Calvin cycle
is seen as an important means to improving C3 leaf and crop-canopy
photosynthesis and crop yield that will become progressively more
important as atmospheric CO.sub.2 concentrations continue to
rise.
[0079] "Leaf chlorophyll content" refers to the chlorophyll content
of the leaf expressed either per unit leaf area or unit weight.
Leaves absorb more light than they require for photosynthesis for
large portions of the day. This absorbed energy must be dissipated
if damage to the leaf is to be avoided. Consequently, decreasing
leaf chlorophyll content is considered an effective means to
improving photosynthetic resource-use efficiency across scales,
from the leaf to the crop canopy. Decreasing leaf chlorophyll
content can increase light transmission into the leaf and improve N
investment between the different components of the photosynthetic
apparatus. This concept can be extended to the canopy where
decreased chlorophyll content in upper canopy leaves is expected to
improve photosynthesis of shaded lower canopy leaves with little
impact on the rate of photosynthesis of upper canopy sunlit
leaves.
[0080] "Regulation of photosystem II" is a term that covers how the
quantum efficiency with which electron transport is initiated at
PSII responds to the environment. Non-photochemical quenching from the
antenna plays a key role in this regulation and is thought to
constrain the efficiency of PSII operation and, by extension, the rate of
photosynthesis as leaves continually transition from high to low
light. Increasing the rate of relaxation of non-photochemical
quenching during the transition from high to low light is expected
to increase leaf and canopy photosynthesis integrated over days to
the growing season.
[0081] "Stomatal conductance" refers to a measurement of the
limitation that the stomatal pore imposes on CO.sub.2 diffusion
into, and H.sub.2O diffusion out of, the leaf. Decreasing stomatal
conductance will decrease water loss from the leaf and crop canopy
via transpiration. This will conserve soil water, delay the onset
and reduce the severity of drought effects on canopy photosynthesis
and other physiology.
[0082] "Yield" or "plant yield" refers to increased plant growth,
increased crop growth, increased biomass, and/or increased plant
product production (including grain), and is dependent to some
extent on temperature, plant size, organ size, planting density,
light, water and nutrient availability, and how the plant copes
with various stresses, such as through temperature acclimation and
water or nutrient use efficiency. Increased or improved yield may
be measured as increased seed yield, increased plant product yield
(plant products include, for example, plant tissue, including
ground or otherwise broken-up plant tissue, and products derived
from one or more types of plant tissue), or increased vegetative
yield.
DESCRIPTION OF THE SPECIFIC EMBODIMENTS
Regulatory Polypeptides Modify Expression of Endogenous Genes
[0083] A regulatory polypeptide may include, but is not limited to,
any polypeptide that can activate or repress transcription of a
single gene or a number of genes. As one of ordinary skill in the
art recognizes, regulatory polypeptides can be identified by the
presence of a region or domain of structural similarity or identity
to a specific consensus sequence or the presence of a specific
consensus DNA-binding motif (see, for example, Riechmann et al.,
2000a. supra). The plant regulatory polypeptides of the instant
description belong to the MYB-(R1)R2R3 family (Shore and Sharrocks,
1995. Eur. J. Biochem. 229:1-13; Ng and Yanofsky, 2001. Nat. Rev.
Genet. 2:186-195; Alvarez-Buylla et al., 2000. Proc. Natl. Acad.
Sci. USA. 97:5328-5333) and are putative regulatory
polypeptides.
[0084] Generally, regulatory polypeptides are involved in cell
differentiation and proliferation and the regulation of growth.
Accordingly, one skilled in the art would recognize that by
expressing the present sequences in a plant, one may change the
expression of autologous genes or induce the expression of
introduced genes. By affecting the expression of similar autologous
sequences in a plant that have the biological activity of the
present sequences, or by introducing the present sequences into a
plant, one may alter a plant's phenotype to one with improved
traits related to photosynthetic resource use efficiency. The
sequences of the instant description may also be used to transform
a plant and introduce desirable traits not found in the wild-type
cultivar or strain. Plants may then be selected for those that
produce the most desirable degree of over- or under-expression of
target genes of interest and coincident trait improvement.
[0085] The sequences of the present description may be from any
species, particularly plant species, in a naturally occurring form
or from any source whether natural, synthetic, semi-synthetic or
recombinant. The sequences of the instant description may also
include fragments of the present amino acid sequences. Where "amino
acid sequence" is recited to refer to an amino acid sequence of a
naturally occurring protein molecule, "amino acid sequence" and
like terms are not meant to limit the amino acid sequence to the
complete native amino acid sequence associated with the recited
protein molecule.
[0086] In addition to methods for modifying a plant phenotype by
employing one or more polynucleotides and polypeptides of the
instant description described herein, the polynucleotides and
polypeptides of the instant description have a variety of
additional uses. These uses include their use in the recombinant
production (i.e., expression) of proteins; as regulators of plant
gene expression; as diagnostic probes for the presence of
complementary or partially complementary nucleic acids (including
for detection of natural coding nucleic acids); as substrates for
further reactions, e.g., mutation reactions, PCR reactions, or the
like; as substrates for cloning, e.g., including digestion or
ligation reactions; and for identifying exogenous or endogenous
modulators of the regulatory polypeptides. The polynucleotide can
be, e.g., genomic DNA or RNA, a transcript (such as an mRNA), a
cDNA, a PCR product, a cloned DNA, a synthetic DNA or RNA, or the
like. The polynucleotide can comprise a sequence in either sense or
antisense orientations.
[0087] Expression of genes that encode polypeptides that modify
expression of endogenous genes, polynucleotides, and proteins is
well known in the art. In addition, transgenic plants comprising
polynucleotides encoding regulatory polypeptides may also modify
expression of endogenous genes, polynucleotides, and proteins.
Examples include Peng et al., 1997. Genes Dev. 11:
3194-3205, and Peng et al., 1999. Nature 400: 256-261. In addition,
many others have demonstrated that an Arabidopsis regulatory
polypeptide expressed in an exogenous plant species elicits the
same or very similar phenotypic response. See, for example, Fu et
al., 2001. Plant Cell 13: 1791-1802; Nandi et al., 2000. Curr.
Biol. 10: 215-218; Coupland, 1995. Nature 377: 482-483; and Weigel
and Nilsson, 1995. Nature 377: 482-500.
[0088] In another example, Mandel et al., 1992b. Cell 71: 133-143,
and Suzuki et al., 2001. Plant J. 28: 409-418, teach that a
transcription factor expressed in another plant species elicits the
same or very similar phenotypic response of the endogenous
sequence, as often predicted in earlier studies of Arabidopsis
transcription factors in Arabidopsis (see Mandel et al., 1992a.
Nature 360: 273-277; Suzuki et al., 2001. supra). Other examples
include Muller et al., 2001. Plant J. 28: 169-179; Kim et al.,
2001. Plant J. 25: 247-259; Kyozuka and Shimamoto, 2002. Plant Cell
Physiol. 43: 130-135; Boss and Thomas, 2002. Nature, 416: 847-850;
He et al., 2000. Transgenic Res. 9: 223-227; and Robson et al.,
2001. Plant J. 28: 619-631.
[0089] In yet another example, Gilmour et al., 1998. Plant J. 16:
433-442 teach an Arabidopsis AP2 transcription factor, CBF1, which,
when overexpressed in transgenic plants, increases plant freezing
tolerance. Jaglo et al., 2001. Plant Physiol. 127: 910-917, further
identified sequences in Brassica napus which encode CBF-like genes
and showed that transcripts for these genes accumulated rapidly in
response to low temperature. Transcripts encoding CBF proteins were
also found to accumulate rapidly in response to low temperature in
wheat, as well as in tomato. An alignment of the CBF proteins from
Arabidopsis, B. napus, wheat, rye, and tomato revealed the presence
of conserved consecutive amino acid residues which bracket the
AP2/EREBP DNA binding domains of the proteins and distinguish them
from other members of the AP2/EREBP protein family (Jaglo et al.,
2001. supra).
[0090] Regulatory polypeptides mediate cellular responses and
control traits through altered expression of genes containing
cis-acting nucleotide sequences that are targets of the introduced
regulatory polypeptide. It is well appreciated in the art that the
effect of a regulatory polypeptide on cellular responses or a
cellular trait is determined by the particular genes whose
expression is either directly or indirectly (e.g., by a cascade of
regulatory polypeptide binding events and transcriptional changes)
altered by regulatory polypeptide binding. In a global analysis of
transcription comparing a standard condition with one in which a
regulatory polypeptide is overexpressed, the resulting transcript
profile associated with regulatory polypeptide overexpression is
related to the trait or cellular process controlled by that
regulatory polypeptide. For example, the PAP2 gene and other genes
in the Myb family have been shown to control anthocyanin
biosynthesis through regulation of the expression of genes known to
be involved in the anthocyanin biosynthetic pathway (Bruce et al.,
2000. Plant Cell 12: 65-79; and Borevitz et al., 2000. Plant Cell
12: 2383-2393). Further, global transcript profiles have been used
successfully as diagnostic tools for specific cellular states
(e.g., cancerous vs. non-cancerous; Bhattacharjee et al., 2001.
Proc. Natl. Acad. Sci. USA 98: 13790-13795; and Xu et al., 2001.
Proc. Natl. Acad. Sci. USA 98: 15089-15094). Consequently, it is
evident to one skilled in the art that similarity of transcript
profile upon overexpression of different regulatory polypeptides
would indicate similarity of regulatory polypeptide function.
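The comparison of transcript profiles described above can be sketched numerically. The following is a minimal illustration only, assuming each profile is a list of log2 fold-changes (overexpressor versus control) measured over the same ordered set of genes, and assuming Pearson correlation as the similarity measure. The description does not name a metric, and all values and names below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length transcript profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical log2 fold-changes for the same five genes in three
# overexpression lines (illustrative values, not data from this description).
profile_a = [2.1, -0.3, 1.8, 0.0, -1.2]
profile_b = [1.9, -0.1, 2.0, 0.2, -1.0]
profile_c = [-1.5, 0.4, -2.2, 0.1, 1.3]

r_ab = pearson(profile_a, profile_b)  # similar profiles: strongly positive
r_ac = pearson(profile_a, profile_c)  # opposing profiles: negative
```

Under these assumptions, a high positive correlation between two profiles would be read as a hint of related regulatory function, subject to experimental confirmation.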
[0091] Polypeptides and Polynucleotides of the Present
Description.
[0092] The present description includes putative regulatory
polypeptides, and isolated or recombinant polynucleotides encoding
the polypeptides, or novel sequence variant polypeptides or
polynucleotides encoding novel variants of polypeptides derived
from the specific sequences provided in the Sequence Listing; the
recombinant polynucleotides of the instant description may be
incorporated in expression vectors for the purpose of producing
transformed plants.
[0093] Because of their relatedness at the nucleotide level, the
claimed sequences will typically share at least about 40%
nucleotide sequence identity, or at least 45% identity, at least
50%, at least 51%, at least 52%, at least 53%, at least 54%, at
least 55%, at least 56%, at least 57%, at least 58%, at least 59%,
at least 60%, at least 70%, at least 71%, at least 72%, at least
73%, at least 74%, at least 75%, at least 76%, at least 77%, at
least 78%, at least 79%, at least 80%, at least 81%, at least 82%,
at least 83%, at least 84%, at least 85%, at least 86%, at least
87%, at least 88%, at least 89%, at least 90%, at least 91%, at
least 92%, at least 93%, at least 94%, at least 95% or at least
96%, at least 97%, at least 98%, at least 99%, or about 100%
sequence identity to one or more of the listed full-length
sequences, or to a listed sequence but excluding or outside of the
region(s) encoding a known consensus sequence or consensus
DNA-binding site, or outside of the region(s) encoding one or all
conserved domains. The degeneracy of the genetic code enables major
variations in the nucleotide sequence of a polynucleotide while
maintaining the amino acid sequence of the encoded protein.
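The effect of codon degeneracy can be illustrated with a short sketch. It builds the standard genetic code table and translates two short DNA sequences that differ at most positions yet encode the same peptide. The two sequences are hypothetical examples chosen for illustration and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence (length divisible by 3) into protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical coding sequences that use different synonymous codons.
seq_a = "ATGCTGAGCCGT"
seq_b = "ATGTTATCAAGA"

protein_a = translate(seq_a)
protein_b = translate(seq_b)
differing_bases = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
```

Here both sequences translate to the same four-residue peptide even though more than half of their nucleotide positions differ.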
[0094] Because of their relatedness at the protein level, the
claimed nucleotide sequences will typically encode a polypeptide
that is at least 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%,
50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%,
63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%,
76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%,
89%, 90%, 91%, 92%, 93%, 94%, 95% or 96%, 97%, 98%, or at least
99%, or about 100% identical, in its amino acid sequence to the
entire length of any of SEQ ID NOs: 2n where n=1-17.
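As a rough illustration of the identity thresholds above, the following sketch computes percent identity between two sequences that have already been aligned, and enumerates the SEQ ID NOs: 2n for n=1-17. The convention used here (identical columns divided by all aligned columns, with gaps counted as mismatches) is an assumption; alignment programs differ in how they define the denominator, and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Assumed convention: columns where both sequences have a gap are ignored;
    a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# SEQ ID NOs: 2n where n = 1-17, i.e. the even numbers 2 through 34.
SEQ_ID_NOS = [2 * n for n in range(1, 18)]

# Hypothetical aligned fragments (not from the Sequence Listing).
identity = percent_identity("MKRG-", "MKAGT")
meets_40_percent = identity >= 40.0
```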
[0095] Also provided are methods for modifying yield from a plant
by modifying the mass, size or number of plant organs or seed of a
plant by controlling a number of cellular processes, and for
increasing a plant's photosynthetic resource use efficiency. These
methods are based on the ability to alter the expression of
critical regulatory molecules that may be conserved between diverse
plant species. Related conserved regulatory molecules may be
originally discovered in a model system such as Arabidopsis and
homologous, functional molecules then discovered in other plant
species. The latter may then be used to confer increased yield or
photosynthetic resource use efficiency in diverse plant
species.
[0096] Sequences in the Sequence Listing, derived from diverse
plant species, may be ectopically expressed in overexpressor
plants. The changes in the characteristic(s) or trait(s) of the
plants may then be observed and found to confer increased yield
and/or increased photosynthetic resource use efficiency. Therefore,
the polynucleotides and polypeptides can be used to improve
desirable characteristics of plants.
[0097] The polynucleotides of the instant description are also
ectopically expressed in overexpressor plant cells and the changes
in the expression levels of a number of genes, polynucleotides,
and/or proteins of the plant cells observed. Therefore, the
polynucleotides and polypeptides can be used to change expression
levels of genes, polynucleotides, and/or proteins of plants or
plant cells.
[0098] The data presented herein represent the results obtained in
experiments with polynucleotides and polypeptides that may be
expressed in plants for the purpose of increasing yield that arises
from improved photosynthetic resource use efficiency.
[0099] Variants of the Disclosed Sequences.
[0100] Also within the scope of the instant description is a
variant of a nucleic acid listed in the Sequence Listing, that is,
one having a sequence that differs from one of the
polynucleotide sequences in the Sequence Listing, or a
complementary sequence, that encodes a functionally equivalent
polypeptide (i.e., a polypeptide having some degree of equivalent
or similar biological activity) but differs in sequence from the
sequence in the Sequence Listing, due to degeneracy in the genetic
code. Included within this definition are polymorphisms that may or
may not be readily detectable using a particular oligonucleotide
probe of the polynucleotide encoding polypeptide, and improper or
unexpected hybridization to allelic variants, with a locus other
than the normal chromosomal locus for the polynucleotide sequence
encoding polypeptide.
[0101] Differences between presently disclosed polypeptides and
polypeptide variants are limited so that the sequences of the
former and the latter are closely similar overall and, in many
regions, identical. Presently disclosed polypeptide sequences and
similar polypeptide variants may differ in amino acid sequence by
one or more substitutions, additions, deletions, fusions and
truncations, which may be present in any combination. These
differences may produce silent changes and result in functionally
equivalent polypeptides. Thus, it will be readily appreciated by
those of skill in the art, that any of a variety of polynucleotide
sequences is capable of encoding the polypeptides and homolog
polypeptides of the instant description. A polypeptide sequence
variant may have "conservative" changes, wherein a substituted
amino acid has similar structural or chemical properties.
[0102] Conservative substitutions include substitutions in which at
least one residue in the amino acid sequence has been removed and a
different residue inserted in its place. Such substitutions
generally are made in accordance with Table 1 when it is
desired to maintain the activity of the protein. Table 1 shows
amino acids which can be substituted for an amino acid in a protein
and which are typically regarded as conservative substitutions.
TABLE 1 Possible conservative amino acid substitutions

Amino Acid   Conservative      Amino Acid   Conservative
Residue      substitutions     Residue      substitutions
Ala          Ser               Leu          Ile; Val
Arg          Lys               Lys          Arg; Gln
Asn          Gln; His          Met          Leu; Ile
Asp          Glu               Phe          Met; Leu; Tyr
Gln          Asn               Pro          Gly
Cys          Ser               Ser          Thr; Gly
Glu          Asp               Thr          Ser; Val
Gly          Pro               Trp          Tyr
His          Asn; Gln          Tyr          Trp; Phe
Ile          Leu; Val          Val          Ile; Leu
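The substitutions of Table 1 can be encoded as a simple lookup. The sketch below transcribes the table into a mapping and checks whether a proposed replacement is listed for a given residue. The function name is hypothetical. The lookup is directional because the table lists, for example, Ser under Ala but not Ala under Ser.

```python
# Table 1 transcribed as: original residue -> residues listed as
# conservative substitutions for it (three-letter codes).
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Gln": {"Asn"},
    "Cys": {"Ser"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Pro": {"Gly"},
    "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """True if Table 1 lists `replacement` as conservative for `original`."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())


ala_to_ser = is_conservative("Ala", "Ser")  # listed in Table 1
ser_to_ala = is_conservative("Ser", "Ala")  # not listed in this direction
```

A listed substitution is only typically regarded as conservative; whether activity is retained in a particular polypeptide still has to be determined experimentally.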
[0103] The polypeptides provided in the Sequence Listing have a
novel activity, such as, for example, regulatory activity. Although
all conservative amino acid substitutions (for example, one basic
amino acid substituted for another basic amino acid) in a
polypeptide will not necessarily result in the polypeptide
retaining its activity, it is expected that many of these
conservative mutations would result in the polypeptide retaining
its activity. Most mutations, conservative or non-conservative,
made to a protein but outside of a conserved domain required for
function and protein activity will not affect the activity of the
protein to any great extent.
[0104] Deliberate amino acid substitutions may thus be made on the
basis of similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic nature of
the residues, as long as a significant amount of the functional or
biological activity of the polypeptide is retained. For example,
negatively charged amino acids may include aspartic acid and
glutamic acid, positively charged amino acids may include lysine
and arginine, and amino acids with uncharged polar head groups
having similar hydrophilicity values may include leucine,
isoleucine, and valine; glycine and alanine; asparagine and
glutamine; serine and threonine; and phenylalanine and tyrosine.
More rarely, a variant may have "non-conservative" changes, e.g.,
replacement of a glycine with a tryptophan. Similar minor
variations may also include amino acid deletions or insertions, or
both. Related polypeptides may comprise, for example, additions
and/or deletions of one or more N-linked or O-linked glycosylation
sites, or an addition and/or a deletion of one or more cysteine
residues. Guidance in determining which and how many amino acid
residues may be substituted, inserted or deleted without abolishing
functional or biological activity may be found using computer
programs well known in the art, for example, DNASTAR software (see
U.S. Pat. No. 5,840,544).
[0105] Conserved Domains
[0106] Conserved domains are recurring functional and/or structural
units of a protein sequence within a protein family (for example, a
family of regulatory proteins), and distinct conserved domains have
been used as building blocks in molecular evolution and recombined
in various arrangements to make proteins of different protein
families with different functions. Conserved domains often
correspond to the 3-dimensional domains of proteins and contain
conserved sequence patterns or motifs, which allow for their
detection in polypeptide sequences with, for example, the use of a
Conserved Domain Database (for example, at
www.ncbi.nlm.nih.gov/cdd). The National Center for Biotechnology
Information Conserved Domain Database defines conserved domains as
recurring units in molecular evolution, the extents of which can be
determined by sequence and structure analysis. Conserved domains
contain conserved sequence patterns or motifs, which allow for
their detection in polypeptide sequences (Conserved Domain
Database; www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml). A
"conserved domain" or "conserved region" as used herein refers to a
region in heterologous polynucleotide or polypeptide sequences
where there is a relatively high degree of sequence identity
between the distinct sequences. A `Myb DNA binding domain 1` is an
example of a conserved domain.
[0107] Conserved domains may also be identified as regions or
domains of identity to a specific consensus sequence (see, for
example, Riechmann et al., 2000a. Science 290, 2105-2110; Riechmann
et al., 2000b. Curr Opin Plant Biol 3: 423-434). Thus, by using
alignment methods well known in the art, the conserved domains of
the plant polypeptides, for example, for the first or second Myb
DNA binding domain proteins may be determined. The polypeptides of
Tables 2 or 3 have conserved domains specifically indicated by
amino acid coordinate start and stop sites. A comparison of the
regions of these polypeptides allows one of skill in the art (see,
for example, Reeves and Nissen, 1990. J. Biol. Chem. 265,
8573-8582; Reeves and Nissen, 1995. Prog. Cell Cycle Res. 1:
339-349) to identify domains or conserved domains for any of the
polypeptides listed or referred to in this disclosure.
[0108] Conserved domain models are generally identified with
multiple sequence alignments of related proteins spanning a variety
of organisms (for example, conserved domains of the disclosed
sequences can be found in FIG. 2B-FIG. 2D). These alignments reveal
sequence regions containing the same, or similar, patterns of amino
acids. Multiple sequence alignments, three-dimensional structure
and three-dimensional structure superposition of conserved domains
can be used to infer sequence, structure, and functional
relationships (Conserved Domain Database, supra). Since the
presence of a particular conserved domain within a polypeptide is
highly correlated with an evolutionarily conserved function, a
conserved domain database may be used to identify the amino acids
in a protein sequence that are putatively involved in functions
such as binding or catalysis, as mapped from conserved domain
annotations to the query sequence. For example, the presence in a
protein of a first or second Myb DNA binding domain that is
structurally and phylogenetically similar to one or more domains
shown in Tables 2 or 3 would be a strong indicator of a related
function in plants (e.g., the function of regulating and/or
improving photosynthetic resource use efficiency, yield, size,
biomass, and/or vigor; i.e., a polypeptide with such a domain is
expected to confer altered photosynthetic resource use efficiency,
yield, size, biomass, and/or vigor when its expression level is
altered). Sequences herein referred to as functionally-related
and/or closely-related to the sequences or domains listed in Tables
2 or 3, including polypeptides that are closely related to the
polypeptides of the instant description, may have conserved domains
that share at least nine base pairs (bp) in length and at
least 72%, increasing by increments of 1% to about 100%, amino acid
sequence identity to the sequences provided in the Sequence Listing
or in Tables 2 or 3, and have functions similar to those of the
polypeptides of the instant description. Said polypeptides may,
when their expression level is altered by suppressing their
expression, knocking out their expression, or increasing their
expression, confer at least one regulatory activity selected from
the group consisting of increased photosynthetic resource use
efficiency, greater yield, greater size, greater biomass, and/or
greater vigor as compared to a control plant.
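The domain-level comparison described above can be sketched as follows: a conserved domain is cut out of a polypeptide using 1-based, inclusive amino acid coordinates (the style of coordinates given in Tables 2 or 3) and compared with a reference domain against the 72% identity threshold. This is an illustration under simplifying assumptions: the two domains are taken to be the same length and compared without gaps, and the sequences, coordinates, and function names are hypothetical.

```python
def extract_domain(protein, start, stop):
    """Return residues start..stop (1-based, inclusive) of a protein."""
    if not (1 <= start <= stop <= len(protein)):
        raise ValueError("coordinates fall outside the protein sequence")
    return protein[start - 1:stop]


def ungapped_identity(domain_a, domain_b):
    """Percent identity of two equal-length domains, compared without gaps."""
    if len(domain_a) != len(domain_b) or not domain_a:
        raise ValueError("domains must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(domain_a, domain_b) if a == b)
    return 100.0 * matches / len(domain_a)


# Hypothetical reference domain and candidate polypeptide.
reference_domain = "KGPWTPEEDQKL"
candidate_protein = "MAGRKGPWSPEEDEKLLS"

candidate_domain = extract_domain(candidate_protein, 5, 16)
identity = ungapped_identity(reference_domain, candidate_domain)
passes_threshold = identity >= 72.0  # threshold taken from the paragraph above
```

In practice the domains would first be aligned with a program such as BLASTp, since related domains often differ in length.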
[0109] Methods using manual alignment of sequences similar or
homologous to one or more polynucleotide sequences or one or more
polypeptides encoded by the polynucleotide sequences may be used to
identify regions of similarity and first or second Myb DNA binding
domains. Such manual methods are well known to those of skill in
the art and can include, for example, comparisons of tertiary
structure between a polypeptide sequence encoded by a
polynucleotide that comprises a known function and a polypeptide
sequence encoded by a polynucleotide sequence that has a function
not yet determined. Such examples of tertiary structure may
comprise predicted alpha helices, beta-sheets, amphipathic helices,
leucine zipper motifs, zinc finger motifs, proline-rich regions,
cysteine repeat motifs, and the like.
[0110] With respect to polynucleotides encoding presently disclosed
polypeptides, a conserved domain refers to a subsequence within a
polypeptide family the presence of which is correlated with at
least one function exhibited by members of the polypeptide family,
and which exhibits a high degree of sequence homology, such as at
least 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%,
81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%,
94%, 95% or 96%, 97%, 98%, 99%, or about 100% identity to a
conserved domain of a polypeptide of the Sequence Listing (e.g.,
any of SEQ ID NOs: 61-132) or listed in Tables 2 or 3. Sequences
that possess or encode for conserved domains that meet these
criteria of percentage identity, and that have comparable
biological and regulatory activity to the present polypeptide
sequences, thus being members of the MYB19 clade polypeptides or
sequences in the MYB19 clade, are described. Sequences having
lesser degrees of identity but comparable biological activity are
considered to be equivalents.
[0111] Orthologs and Paralogs.
[0112] Homologous sequences as described above can comprise
orthologous or paralogous sequences. Several different methods are
known by those of skill in the art for identifying and defining
these functionally homologous sequences. General methods for
identifying orthologs and paralogs, including phylogenetic methods,
sequence similarity and hybridization methods, are described
herein; an ortholog or paralog, including equivalogs, may be
identified by one or more of the methods described below.
[0113] As described by Eisen, 1998. Genome Res. 8: 163-167,
evolutionary information may be used to predict gene function. It
is common for groups of genes that are homologous in sequence to
have diverse, although usually related, functions. However, in many
cases, the identification of homologs is not sufficient to make
specific predictions because not all homologs have the same
function. Thus, an initial analysis of functional relatedness based
on sequence similarity alone may not provide one with a means to
determine where similarity ends and functional relatedness begins.
Fortunately, it is well known in the art that protein function can
be classified using phylogenetic analysis of gene trees combined
with the corresponding species. Functional predictions can be
greatly improved by focusing on how the genes became similar in
sequence (i.e., by evolutionary processes) rather than on the
sequence similarity itself (Eisen, supra). In fact, many specific
examples exist in which gene function has been shown to correlate
well with gene phylogeny (Eisen, supra). Thus, "[t]he first step in
making functional predictions is the generation of a phylogenetic
tree representing the evolutionary history of the gene of interest
and its homologs. Such trees are distinct from clusters and other
means of characterizing sequence similarity because they are
inferred by techniques that help convert patterns of similarity
into evolutionary relationships . . . . After the gene tree is
inferred, biologically determined functions of the various homologs
are overlaid onto the tree. Finally, the structure of the tree and
the relative phylogenetic positions of genes of different functions
are used to trace the history of functional changes, which is then
used to predict functions of [as yet] uncharacterized genes"
(Eisen, supra).
[0114] Within a single plant species, gene duplication may produce
two copies of a particular gene, giving rise to two or more genes
with similar sequence and often similar function known as paralogs.
A paralog is therefore a similar gene formed by duplication within
the same species. Paralogs typically cluster together or in the
same clade (a group of similar genes) when a gene family phylogeny
is analyzed using programs such as CLUSTAL (Thompson et al., 1994.
Nucleic Acids Res. 22: 4673-4680; Higgins et al., 1996. Methods
Enzymol. 266: 383-402). Groups of similar genes can also be
identified with pair-wise BLAST analysis (Feng and Doolittle, 1987.
J. Mol. Evol. 25: 351-360). For example, a clade of very similar
MADS domain transcription factors from Arabidopsis all share a
common function in flowering time (Ratcliffe et al., 2001. Plant
Physiol. 126: 122-132), and a group of very similar AP2 domain
transcription factors from Arabidopsis are involved in tolerance of
plants to freezing (Gilmour et al., 1998. supra). Analysis of
groups of similar genes with similar function that fall within one
clade can yield sub-sequences that are particular to the clade.
These sub-sequences, known as consensus sequences, can not only be
used to define the sequences within each clade, but define the
functions of these genes; genes within a clade may contain
paralogous sequences, or orthologous sequences that share the same
function (see also, for example, Mount, 2001, in Bioinformatics:
Sequence and Genome Analysis, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, N.Y., p. 543).
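The idea of a clade-specific consensus sequence can be sketched with a simple column-wise majority rule over pre-aligned clade members. The rule, the `min_fraction` parameter, the use of "x" for non-conserved positions, and the sequences are all assumptions made for illustration; the description does not prescribe how a consensus is derived.

```python
from collections import Counter


def consensus(aligned_sequences, min_fraction=1.0, placeholder="x"):
    """Column-wise consensus of equal-length aligned sequences.

    A residue is reported where the most common residue in the column
    occurs in at least `min_fraction` of the sequences; otherwise the
    placeholder is emitted.
    """
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(s) != length for s in aligned_sequences):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for column in zip(*aligned_sequences):
        residue, count = Counter(column).most_common(1)[0]
        conserved = count / len(column) >= min_fraction
        result.append(residue if conserved else placeholder)
    return "".join(result)


# Hypothetical aligned fragments from three members of one clade.
clade_members = ["KGPWT", "KGAWS", "KGPWT"]

strict_consensus = consensus(clade_members)        # invariant columns only
majority_consensus = consensus(clade_members, 0.6) # simple majority
```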
[0115] Regulatory polypeptide gene sequences are conserved across
diverse eukaryotic species lines (Goodrich et al., 1993. Cell
75:519-530; Lin et al., 1991. Nature 353:569-571; Sadowski et al.,
1988. Nature 335: 563-564). Plants are no exception to this
observation; diverse plant species possess regulatory polypeptides
that have similar sequences and functions. Speciation, the
production of new species from a parental species, gives rise to
two or more genes with similar sequence and similar function. These
genes, termed orthologs, often have an identical function within
their host plants and are often interchangeable between species
without losing function. Because plants have common ancestors, many
genes in any plant species will have a corresponding orthologous
gene in another plant species. Once a phylogenetic tree for a gene
family of one species has been constructed using a program such as
CLUSTAL (Thompson et al., 1994. supra; Higgins et al., 1996. supra),
potential orthologous sequences can be placed into the phylogenetic
tree and their relationship to genes from the species of interest
can be determined. Orthologous sequences can also be identified by
a reciprocal BLAST strategy. Once an orthologous sequence has been
identified, the function of the ortholog can be deduced from the
identified function of the reference sequence.
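A reciprocal BLAST strategy is commonly implemented as a reciprocal-best-hit test: two genes from different species are treated as candidate orthologs when each is the other's top-scoring hit. The sketch below assumes the BLAST searches have already been run and summarized as lists of (subject, score) pairs. The data layout, gene names, and scores are hypothetical, and the description does not state which variant of the strategy was used.

```python
def best_hit(hits):
    """Return the subject with the highest score, or None if no hits."""
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[1])[0]


def reciprocal_best_hits(a_to_b, b_to_a):
    """Pairs (gene_a, gene_b) where each gene is the other's best hit."""
    pairs = []
    for gene_a, hits in a_to_b.items():
        gene_b = best_hit(hits)
        if gene_b is not None and best_hit(b_to_a.get(gene_b, [])) == gene_a:
            pairs.append((gene_a, gene_b))
    return pairs


# Hypothetical search summaries: query -> [(subject, bit score), ...].
species_a_vs_b = {
    "At_geneA": [("Os_gene1", 250.0), ("Os_gene2", 180.0)],
    "At_geneB": [("Os_gene1", 120.0)],
}
species_b_vs_a = {
    "Os_gene1": [("At_geneA", 245.0), ("At_geneB", 118.0)],
    "Os_gene2": [("At_geneA", 175.0)],
}

candidate_orthologs = reciprocal_best_hits(species_a_vs_b, species_b_vs_a)
```

In this toy data only At_geneA and Os_gene1 select each other, so At_geneB is not paired even though it has a hit in the other species.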
[0116] By using a phylogenetic analysis, one skilled in the art
would recognize that the ability to deduce similar functions
conferred by closely-related polypeptides is predictable. This
predictability has been confirmed by our own many studies in which
we have found that a wide variety of polypeptides have orthologous
or closely-related homologous sequences that function as does the
first, closely-related reference sequence. For example, distinct
regulatory polypeptides, including:
[0117] (i) AP2 family Arabidopsis G47 (found in U.S. Pat. No.
7,135,616), a phylogenetically-related sequence from soybean, and
two phylogenetically-related homologs from rice all can confer
greater tolerance to drought, hyperosmotic stress, or delayed
flowering as compared to control plants;
[0118] (ii) CAAT family Arabidopsis G481 (found in PCT patent
publication WO2004076638), and numerous phylogenetically-related
sequences from eudicots and monocots can confer greater tolerance
to drought-related stress as compared to control plants;
[0119] (iii) Myb-related Arabidopsis G682 (found in U.S. Pat. Nos.
7,223,904 and 7,193,129) and numerous phylogenetically-related
sequences from eudicots and monocots can confer greater tolerance
to heat, drought-related stress, cold, and salt as compared to
control plants;
[0120] (iv) WRKY family Arabidopsis G1274 (found in U.S. Pat. No.
7,196,245) and numerous closely-related sequences from eudicots and
monocots have been shown to confer increased water deprivation
tolerance; and
[0121] (v) AT-hook family soy sequence G3456 (found in US patent
publication 20040128712A1) and numerous phylogenetically-related
sequences from eudicots and monocots increase biomass compared to
control plants when these sequences are overexpressed in
plants.
[0122] The polypeptide sequences belong to distinct clades of
polypeptides that include members from diverse species. In each
case, most or all of the clade member sequences derived from both
eudicots and monocots have been shown to confer increased yield or
tolerance to one or more abiotic stresses when the sequences were
overexpressed. These studies each demonstrate that evolutionarily
conserved genes from diverse species are likely to function
similarly (i.e., by regulating similar target sequences and
controlling the same traits), and that polynucleotides from one
species may be transformed into closely-related or
distantly-related plant species to confer or improve traits.
[0123] Orthologs and paralogs of presently disclosed polypeptides
may be cloned using compositions provided by the present
description according to methods well known in the art. cDNAs can
be cloned using mRNA from a plant cell or tissue that expresses one
of the present sequences. Appropriate mRNA sources may be
identified by interrogating Northern blots with probes designed
from the present sequences, after which a library is prepared from
the mRNA obtained from a positive cell or tissue.
Polypeptide-encoding cDNA is then isolated using, for example, PCR,
using primers designed from a presently disclosed gene sequence, or
by probing with a partial or complete cDNA or with one or more sets
of degenerate probes based on the disclosed sequences. The cDNA
library may be used to transform plant cells. Expression of the
cDNAs of interest is detected using, for example, microarrays,
Northern blots, quantitative PCR, or any other technique for
monitoring changes in expression. Genomic clones may be isolated
using techniques similar to those described above.
[0124] Examples of orthologs of the Arabidopsis polypeptide
sequences and their functionally similar orthologs are listed in
Tables 2 or 3 and the Sequence Listing. In addition to the
sequences in Tables 2 or 3 and the Sequence Listing, the claimed
nucleotide sequences are phylogenetically and structurally similar
to sequences listed in the Sequence Listing and can function in a
plant by increasing photosynthetic resource use efficiency and/or
increasing yield, vigor, or biomass when ectopically expressed,
or overexpressed, in a plant. Since a significant number of these
sequences are phylogenetically and sequentially related to each
other and may be shown to increase yield from a plant and/or
photosynthetic resource use efficiency, one skilled in the art
would predict that other similar, phylogenetically related
sequences falling within the present clades of polypeptides,
including MYB19 clade polypeptide sequences, would also perform
similar functions when ectopically expressed.
[0125] Background Information for MYB19, and the MYB19 Clade.
[0126] A number of phylogenetically-related sequences have been
found in other plant species. Tables 2 and 3 list a number of MYB19
clade sequences from diverse species. The tables include the SEQ ID
NO: (Column 1), the species from which the sequence was derived and
the Gene Identifier ("GID"; Column 2), the percent identity of the
polypeptide in Column 1 to the full length MYB19 polypeptide, SEQ
ID NO: 2, as determined by a BLASTp analysis, for example, with a
wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62
scoring matrix (Henikoff and Henikoff, 1989. Proc. Natl. Acad. Sci.
USA 89:10915; Henikoff and Henikoff, 1991. Nucleic Acids Res. 19:
6565-6572) (Column 3), the amino acid residue coordinates for the
conserved first or second Myb DNA binding domains in amino acid
coordinates beginning at the N-terminus, of each of the sequences
(Column 4), the conserved first or second Myb DNA binding domain
sequences of the respective polypeptides (Column 5); the SEQ ID NO:
of each of the first or second Myb DNA binding domains (Column 6),
and the percentage identity of the conserved domain in Column 5 to
the conserved domain of the Arabidopsis MYB19 sequence, SEQ ID NO:
2 (as determined by a BLASTp analysis, wordlength (W) of 3, an
expectation (E) of 10, and the BLOSUM62 scoring matrix, and with
the proportion of identical amino acids in parentheses; Column
7).
TABLE-US-00002 TABLE 2 Conserved `Myb DNA binding domain 1` of MYB19 and closely related sequences
Col. 1: SEQ ID NO: | Col. 2: Species/Identifier | Col. 3: Percent identity of polypeptide in Col. 1 to MYB19 | Col. 4: Myb DNA binding domain 1 in amino acid coordinates | Col. 5: Conserved Myb DNA binding domain 1 | Col. 6: SEQ ID NO: of Myb DNA binding domain 1 | Col. 7: Percent identity of first Myb domain in Col. 5 to Myb DNA binding domain 1 of MYB19
2 | At/MYB19 AT5G52260.1 | 100% (268/268) | 17-77 | WSPEEDQKLKSFILSRGHACWTTVPILAGLQRNGKSCRLRWINYLRPGLKRGSFSEEEEET | 61 | 100% (61/61)
4 | At/AT4G25560.1 | 60% (169/280) | 18-78 | WSPEEDEKLRSFILSYGHSCWTTVPIKAGLQRNGKSCRLRWINYLRPGLKRDMISAEEEET | 62 | 85% (52/61)
6 | Os/LOC_Os04g45020.1 | 48% (96/200) | 18-78 | WSPEEDQKLRDFILRYGHGCWSAVPVKAGLQRNGKSCRLRWINYLRPGLKHGMFSREEEET | 63 | 80% (49/61)
8 | Bd/Bradi5g16672.1 | 53% (102/192) | 18-78 | WSPEEDQKLRDYIIRYGHSCWSTVPVKAGLQRNGKSCRLRWINYLRPGLKHGMFSQEEEET | 64 | 78% (48/61)
10 | Zm/GRMZM2G170049_T01 | 50% (97/191) | 18-78 | WSPEEDQKLRDYILLHGHGCWSALPAKAGLQRNGKSCRLRWINYLRPGLKHGMFSPEEEET | 65 | 77% (47/61)
12 | Si/Si012304m | 48% (98/202) | 23-83 | WSPEEDEKLRDFILRYGHGCWSALPAKAGLQRNGKSCRLRWINYLRPGLKHGMFSREEEET | 66 | 77% (47/61)
14 | Cc/clementine0.9_033485m | 48% (115/237) | 15-75 | WSPEEDQRLKNYVLQHGHPCWSSVPINAGLQRNGKSCRLRWINYLRPGLKRGVFNMQEEET | 67 | 77% (47/61)
16 | Pt/POPTR_0015s13190.1 | 50% (109/217) | 22-82 | WSPEEDQRLRNYVLKHGHGCWSSVPINAGLQRNGKSCRLRWINYLRPGLKRGTFSAQEEET | 68 | 77% (47/61)
18 | Eg/EUCGR.K00250.1 | 49% (107/217) | 18-78 | WSPEEDQKLRNYVLKHGHGCWSSVPINTGLQRNGKSCRLRWINYLRPGLKRGMFTMEEEEI | 69 | 76% (46/60)
20 | Eg/EUCGR.K00251.1 | 48% (110/226) | 18-78 | WSPEEDQRLRNYILNHGHGYWSSVPINTGLQRNGKSCRLRWINYLRPGLKRGMFTLEEEEI | 70 | 75% (45/60)
22 | Pt/POPTR_0012s13260.1 | 48% (109/223) | 18-78 | WSPEEDQRLGSYVFQHGHGCWSSVPINAGLQRTGKSCRLRWINYLRPGLKRGAFSTDEEET | 71 | 75% (46/61)
24 | Gm/Glyma16g31280.1 | 48% (116/238) | 18-78 | WSPEEDNKLRNHIIKHGHGCWSSVPIKAGLQRNGKSCRLRWINYLRPGLKRGVFSKHEEDT | 72 | 75% (46/61)
26 | Gm/Glyma09g25590.1 | 49% (103/209) | 18-78 | WSPEEDNKLRNHIIKHGHGCWSSVPIKAGLQRNGKSCRLRWINYLRPGLKRGVFSKHEKDT | 73 | 73% (45/61)
28 | Sl/Solyc03g025870.2.1 | 40% (115/283) | 52-112 | WSPDEDDRLKNYMIKHGHGCWSSVPINAGLQRNGKSCRLRWINYLRPGLKRGAFSLEEEDI | 74 | 73% (44/60)
30 | Vv/GSVIVT01028984001 | 42% (115/272) | 22-82 | WSPEEDARLRNYVLKYGLGCWSSVPVNAGLQRNGKSCRLRWINYLRPGLKRGMFTIEEEET | 75 | 72% (44/61)
32 | Eg/EUCGR.A02796.1 | 51% (112/217) | 20-80 | WSPDEDQRLRNYIHKHGYSCWSSVPINAGLQRNGKSCRLRWINYLRPGLKRGAFTVQEEET | 76 | 70% (44/61)
34 | At/AT3G48920.1 | 51% (99/191) | 19-79 | WSPEEDEKLRSHVLKYGHGCWSTIPLQAGLQRNGKSCRLRWVNYLRPGLKKSLFTKQEETI | 77 | 69% (41/59)
TABLE-US-00003 TABLE 3 Conserved second Myb DNA binding domains of MYB19 and closely related sequences
Col. 1: SEQ ID NO: | Col. 2: Species/Identifier | Col. 3: Percent identity of polypeptide in Col. 1 to MYB19 | Col. 4: Myb DNA binding domain 2 in amino acid coordinates | Col. 5: Conserved Myb DNA binding domain 2 | Col. 6: SEQ ID NO: of second Myb domain | Col. 7: Percent identity of second Myb domain in Col. 5 to Myb DNA binding domain 2 of MYB19
2 | At/MYB19 AT5G52260.1 | 100% (268/268) | 70-112 | FSEEEEETILTLHSSLGNKWSRIAKYLPGRTDNEIKNYWHSYL | 95 | 100% (43/43)
4 | At/AT4G25560.1 | 60% (169/280) | 68-110 | ISAEEEETILTFHSSLGNKWSQIAKFLPGRTDNEIKNYWHSHL | 96 | 88% (37/42)
6 | Os/LOC_Os04g45020.1 | 48% (96/200) | 71-113 | FSREEEETVMNLHATMGNKWSQIARHLPGRTDNEVKNYWNSYL | 97 | 72% (31/43)
8 | Bd/Bradi5g16672.1 | 53% (102/192) | 71-113 | FSQEEEETVMSLHATLGNKWSRIAQHLPGRTDNEVKNYWNSYL | 98 | 76% (33/43)
10 | Zm/GRMZM2G170049_T01 | 50% (97/191) | 71-113 | FSPEEEETVMSLHATLGNKWSRIARHLPGRTDNEVKNYWNSYL | 99 | 76% (33/43)
12 | Si/Si012304m | 48% (98/202) | 71-113 | FSREEEETVMSLHAKLGNKWSQIARHLPGRTDNEVKNYWNSYL | 100 | 74% (32/43)
14 | Cc/clementine0.9_033485m | 48% (115/237) | 75-117 | FNMQEEETILTVHRLLGNKWSQIAQHLPGRTDNEIKNYWHSHL | 101 | 76% (33/43)
16 | Pt/POPTR_0015s13190.1 | 50% (109/217) | 75-117 | FSAQEEETILALHHMLGNKWSQIAQHLPGRTDNEIKNHWHSYL | 102 | 79% (34/43)
18 | Eg/EUCGR.K00250.1 | 49% (107/217) | 71-113 | FTMEEEEIIFSLHHLIGNKWSQIAKHLPGRTDNEIKNHWHSYL | 103 | 74% (32/43)
20 | Eg/EUCGR.K00251.1 | 48% (110/226) | 71-113 | FTLEEEEIILSLHRLIGNKWSQIAKHLPGRTDNEIKNHWHSYL | 104 | 76% (33/43)
22 | Pt/POPTR_0012s13260.1 | 48% (109/223) | 105-147 | FSTDEEETILTLHRMLGNKWSQIAQHLPGRTDNEIKNHWHSYL | 105 | 81% (35/43)
24 | Gm/Glyma16g31280.1 | 48% (116/238) | 71-113 | FSKHEEDTIMVLHHMLGNKWSQIAQHLPGRTDNEIKNYWHSYL | 106 | 76% (33/43)
26 | Gm/Glyma09g25590.1 | 49% (103/209) | 71-113 | FSKHEKDTIMALHHMLGNKWSQIAQHLPGRTDNEVKNYWHSYL | 107 | 72% (31/43)
28 | Sl/Solyc03g025870.2.1 | 40% (115/283) | 72-114 | FSLEEEDIILTLHAMFGNKWSQIAQQLPGRTDNEIKNHWHSYL | 108 | 76% (33/43)
30 | Vv/GSVIVT01028984001 | 42% (115/272) | 73-115 | FTIEEEETIMALHRLLGNKWSQIAQNFPGRTDNEIKNYWHSCL | 109 | 74% (32/43)
32 | Eg/EUCGR.A02796.1 | 51% (112/217) | 71-113 | FTVQEEETILNLHHLLGNKWSQIAQHLPGRTDNEIKNHWHSYL | 110 | 76% (33/43)
34 | At/AT3G48920.1 | 51% (99/191) | 76-118 | FTKQEETILLSLHSMLGNKWSQISKFLPGRTDNEIKNYWHSNL | 111 | 72% (31/43)
Species abbreviations for Tables 2 and 3: At- Arabidopsis thaliana; Bd-
Brachypodium distachyon; Cc- Citrus x clementina; Eg- Eucalyptus
grandis; Gm- Glycine max; Os- Oryza sativa; Pt- Populus
trichocarpa; Si- Setaria italica; Sl- Solanum lycopersicum; Vv-
Vitis vinifera; Zm- Zea mays
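By way of illustration only, the domain identity figures reported in Column 7 of Table 2 may be approximated with a simple position-by-position count. The following minimal Python sketch is not the BLASTp analysis described above; it assumes two ungapped domains of equal length, and the function name `percent_identity` is hypothetical. For the first Myb DNA binding domains of SEQ ID NO: 61 and SEQ ID NO: 62, the count agrees with the 52/61 figure listed in the table.

```python
def percent_identity(a: str, b: str) -> tuple[int, int, int]:
    """Return (rounded percent, identical positions, compared positions)
    for two equal-length, ungapped sequences compared position by position."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length, ungapped sequences")
    identical = sum(1 for x, y in zip(a, b) if x == y)
    return round(100 * identical / len(a)), identical, len(a)


# First Myb DNA binding domains copied from Table 2 (Column 5).
MYB19_DOMAIN1 = "WSPEEDQKLKSFILSRGHACWTTVPILAGLQRNGKSCRLRWINYLRPGLKRGSFSEEEEET"      # SEQ ID NO: 61
AT4G25560_DOMAIN1 = "WSPEEDEKLRSFILSYGHSCWTTVPIKAGLQRNGKSCRLRWINYLRPGLKRDMISAEEEET"  # SEQ ID NO: 62

print(percent_identity(MYB19_DOMAIN1, AT4G25560_DOMAIN1))
```

Domains of unequal length, or domains requiring gaps, would need a true alignment step (for example BLASTp, as used for the tables) before any such count is meaningful.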
[0127] Sequences that are functionally-related and/or
closely-related to the polypeptides in Tables 2 and 3 may be
created artificially, semi-synthetically, or may occur naturally by
having descended from the same ancestral sequence as the disclosed
MYB19-related sequences, where the polypeptides have the function
of conferring increased photosynthetic resource use efficiency to
plants. These "functionally-related and/or closely-related" MYB19
clade polypeptides generally contain the consensus sequence of the
Myb DNA binding domain 1 of SEQ ID NO: 129:
WSPX^1EDxxLxxxX^2xxxGxxxWX^3xX^2PxxxGLQRxGKSCRLRWX^2NYLRPGLKxxxxxxxE;
where x represents any amino acid;
X^1 is D or E;
X^2 is I, V, L or M;
[0128] and X^3 represents S or T; as provided in FIG. 2B-2C.
[0129] Other highly conserved residues found in the Myb DNA binding
domain 2 of MYB19 clade members, as shown in FIG. 2C-2D and SEQ ID
NO: 130:
ExxxX^1xxxHxxxGNKWSxIX^2xxxPGRTDNEX^1KNxWxSxL
where x represents any amino acid;
X^1 is I, V, L or M; and
[0130] X^2 represents A or S.
[0131] There is also a small motif that is present in MYB19 clade
member proteins, identifiable as SEQ ID NO: 160 and that can be
located spanning FIGS. 2E-2F:
PxFxX^1W
[0132] where x represents any amino acid; and
X^1 is D or E.
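As a non-limiting illustration, the three consensus sequences set out above can be written as regular expressions, with each "x" becoming a single-character wildcard and each restricted position becoming a character class (D or E; I, V, L or M; S or T; A or S). The sketch below is one possible encoding and not a method recited in this description; the names `DOMAIN1`, `DOMAIN2`, `SMALL_MOTIF` and `consensus_hits` are hypothetical.

```python
import re

# SEQ ID NO: 129, consensus of Myb DNA binding domain 1.
DOMAIN1 = re.compile(
    r"WSP[DE]ED..L...[IVLM]...G...W[ST].[IVLM]P...GLQR.GKSCRLRW[IVLM]NYLRPGLK.......E"
)
# SEQ ID NO: 130, conserved residues of Myb DNA binding domain 2.
DOMAIN2 = re.compile(r"E...[IVLM]...H...GNKWS.I[AS]...PGRTDNE[IVLM]KN.W.S.L")
# SEQ ID NO: 160, the small motif.
SMALL_MOTIF = re.compile(r"P.F.[DE]W")


def consensus_hits(protein: str) -> dict[str, bool]:
    """Report which of the three consensus patterns occur anywhere in a sequence."""
    patterns = {"domain1": DOMAIN1, "domain2": DOMAIN2, "small_motif": SMALL_MOTIF}
    return {name: pattern.search(protein) is not None for name, pattern in patterns.items()}


# Conserved domains of MYB19 as listed in Tables 2 and 3 (Column 5).
MYB19_DOMAIN1 = "WSPEEDQKLKSFILSRGHACWTTVPILAGLQRNGKSCRLRWINYLRPGLKRGSFSEEEEET"
MYB19_DOMAIN2 = "FSEEEEETILTLHSSLGNKWSRIAKYLPGRTDNEIKNYWHSYL"

print(consensus_hits(MYB19_DOMAIN1))
print(consensus_hits(MYB19_DOMAIN2))
```

A match to a pattern indicates only that the conserved residues are present; it does not by itself establish function.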
[0133] The presence of one or more of these consensus sequences
and/or these amino acid residues is correlated with conferring of
improved or increased photosynthetic resource use efficiency to a
plant when the expression level of the polypeptide is altered in a
plant by being reduced, knocked-out, or overexpressed. A MYB19
clade polypeptide sequence that is "functionally-related and/or
closely-related" to the listed full length protein sequences or
domains provided in Tables 2 or 3 may also have at least 40%, 42%,
48%, 49%, 50%, 51%, 53%, 60%, or about 100% amino acid identity to
SEQ ID NO: 2 or to SEQ ID NO: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22,
24, 26, 28, 30, 32, 34, and/or at least 69%, 70%, 71%, 72%, 73%,
74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%,
87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or
about 100% amino acid identity to the first Myb DNA binding domain
of SEQ ID NO: 2, or to a listed first Myb DNA binding domain or to
SEQ ID NOs: 61-77, and/or 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%,
80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99%, or about 100% amino acid
identity to a listed second Myb DNA binding domain or to the second
Myb DNA binding domain of SEQ ID NO: 2 or SEQ ID NOs: 95-111, or to
an amino acid sequence having at least 77%, at least 78%, at least
79%, at least 80%, at least 81%, at least 82%, at least 83%, at
least 84%, at least 85%, at least 86%, at least 87%, at least 88%,
at least 89%, at least 90%, at least 91%, at least 92%, at least
93%, at least 94%, at least 95% or at least 96%, at least 97%, at
least 98%, at least 99%, or about 100% sequence identity to SEQ ID
NOs: 129-132. The presence of the disclosed conserved first Myb DNA
binding domains and/or second Myb DNA binding domains in the
polypeptide sequence (for example, SEQ ID NO: 61-77 or 95-111), is
correlated with the conferring of improved or increased
photosynthetic resource use efficiency to a plant when the
expression level of the polypeptide is altered in a plant by being
reduced, knocked-out, or overexpressed. All of the sequences that
adhere to these functional and sequential relationships are herein
referred to as "MYB19 clade polypeptides", or as sequences which fall
within the "MYB19 clade" or "G1309
clade" exemplified in the tree in FIG. 1 as those polypeptides
bounded by LOC_Os04g45020.1 and Solyc03g025870.2.1 (indicated by
the box around these sequences).
Examples of Methods for Identifying Identity, Similarity, Homology
and Relatedness
[0134] Percent identity can be determined electronically, e.g., by
using the MEGALIGN program (DNASTAR, Inc., Madison, Wis.) or the
Accelrys Gene program. The MEGALIGN program can create alignments between two or
more sequences according to different methods, for example, the
clustal method (see, for example, Higgins and Sharp, 1988. Gene 73:
237-244). The clustal algorithm groups sequences into clusters by
examining the distances between all pairs. The clusters are aligned
pairwise and then in groups. Other alignment algorithms or programs
may be used for preparing alignments and/or determining percentage
identities, including Accelrys Gene, FASTA, BLAST, or ENTREZ,
some of which may also be used to calculate percent
similarity. Accelrys Gene is available from Accelrys, Inc., San
Diego, Calif. Other programs are available as a part of the GCG
sequence analysis package (University of Wisconsin, Madison, Wis.),
and can be used with or without default settings. ENTREZ is
available through the National Center for Biotechnology
Information. In one embodiment, the percent identity of two
sequences can be determined by the GCG program with a gap weight of
1, e.g., each amino acid gap is weighted as if it were a single
amino acid or nucleotide mismatch between the two sequences (see
U.S. Pat. No. 6,262,333).
[0135] Software for performing BLAST analyses is publicly
available, e.g., through the National Center for Biotechnology
Information (see internet website at www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs
(HSPs) by identifying short words of length W in the query
sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a
database sequence. T is referred to as the neighborhood word score
threshold (Altschul, 1990. J. Mol. Biol. 215: 403-410; Altschul,
1993. J. Mol. Evol. 36: 290-300). These initial neighborhood word
hits act as seeds for initiating searches to find longer HSPs
containing them. The word hits are then extended in both directions
along each sequence for as far as the cumulative alignment score
can be increased. Cumulative scores are calculated using, for
nucleotide sequences, the parameters M (reward score for a pair of
matching residues; always >0) and N (penalty score for
mismatching residues; always <0). For amino acid sequences, a
scoring matrix is used to calculate the cumulative score. Extension
of the word hits in each direction is halted when: the cumulative
alignment score falls off by the quantity X from its maximum
achieved value; the cumulative score goes to zero or below, due to
the accumulation of one or more negative-scoring residue
alignments; or the end of either sequence is reached. The BLAST
algorithm parameters W, T, and X determine the sensitivity and
speed of the alignment. The BLASTN program (for nucleotide
sequences) uses as defaults a wordlength (W) of 11, an expectation
(E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both
strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength (W) of 3, an expectation (E) of 10, and the
BLOSUM62 scoring matrix (see Henikoff and Henikoff, 1989. supra;
Henikoff and Henikoff, 1991. supra). Unless otherwise indicated for
comparisons of predicted polynucleotides, "sequence identity"
refers to the % sequence identity generated from a tBLASTx using
the NCBI version of the algorithm at the default settings using
gapped alignments with the filter "off" (see, for example, internet
website at www.ncbi.nlm.nih.gov).
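The seeding and extension steps described above may be sketched for nucleotide sequences as follows. This is a simplified, hypothetical illustration and not the NCBI implementation: seeds are restricted to exact words of length W (no neighborhood threshold T), extension is ungapped, only the X drop-off rule and the end-of-sequence rule are applied, and the default reward and penalty are the M=5 and N=-4 values mentioned above. The function names and the default `x_drop` value are assumptions.

```python
def find_seeds(query: str, subject: str, w: int):
    """Yield (query_pos, subject_pos) for every exact word of length w shared by both."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            yield i, j


def _extend_one_way(query, subject, qi, sj, step, match, mismatch, x_drop):
    """Walk in one direction; return (best score gain, number of residues kept)."""
    score = best = 0
    kept = steps = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += match if query[qi] == subject[sj] else mismatch
        steps += 1
        if score > best:
            best, kept = score, steps
        elif best - score >= x_drop:  # score fell off by X from its maximum
            break
        qi += step
        sj += step
    return best, kept


def extend_seed(query, subject, i, j, w, match=5, mismatch=-4, x_drop=20):
    """Extend a seed in both directions; return (score, query_start, query_end)."""
    right_gain, right_len = _extend_one_way(query, subject, i + w, j + w, 1, match, mismatch, x_drop)
    left_gain, left_len = _extend_one_way(query, subject, i - 1, j - 1, -1, match, mismatch, x_drop)
    return w * match + left_gain + right_gain, i - left_len, i + w + right_len


# Made-up sequences for illustration only.
QUERY = "GGGACGTACGTCCC"
SUBJECT = "TTTACGTACGTAAA"
score, start, end = extend_seed(QUERY, SUBJECT, 3, 3, w=4)
print(score, QUERY[start:end])
```

Larger values of `x_drop` allow the extension to continue through longer runs of mismatches, at the cost of speed, which is the sensitivity and speed trade-off noted above for the parameters W, T, and X.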
[0136] Other techniques for alignment are described by Doolittle,
ed., 1996. Methods in Enzymology, vol. 266: "Computer Methods for
Macromolecular Sequence Analysis" Academic Press, Inc., San Diego,
Calif., USA. Preferably, an alignment program that permits gaps in
the sequence is utilized to align the sequences. The Smith-Waterman
algorithm is one type of algorithm that permits gaps in sequence alignments
(see Shpaer, 1997. Methods Mol. Biol. 70: 173-187). Also, the GAP
program using the Needleman and Wunsch alignment method can be
utilized to align sequences. An alternative search strategy uses
MPSRCH software, which runs on a MASPAR computer. MPSRCH uses a
Smith-Waterman algorithm to score sequences on a massively parallel
computer. This approach improves the ability to pick up distantly
related matches, and is especially tolerant of small gaps and
nucleotide sequence errors. Nucleic acid-encoded amino acid
sequences can be used to search both protein and DNA databases.
[0137] The percentage similarity between two polypeptide sequences,
e.g., sequence A and sequence B, is calculated by dividing the
length of sequence A, minus the number of gap residues in sequence
A, minus the number of gap residues in sequence B, into the sum of
the residue matches between sequence A and sequence B, times one
hundred. Gaps of low or of no similarity between the two amino acid
sequences are not included in determining percentage similarity.
Percent identity between polynucleotide sequences can also be
counted or calculated by other methods known in the art, e.g., the
Jotun Hein method (see, for example, Hein, 1990. Methods Enzymol.
183: 626-645). Identity between sequences can also be determined by
other methods known in the art, e.g., by varying hybridization
conditions (see US Patent Application No. 20010010913).
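The percentage similarity calculation described above may be sketched as follows. This minimal Python example rests on several assumptions that are not stated in the description: the two sequences are supplied as already-aligned rows of equal length with "-" marking gaps, the "length of sequence A" is taken to be the length of its aligned row, and only identical residues are counted as matches. The function name `percent_similarity` is hypothetical.

```python
def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """100 * matches / (length of A - gaps in A - gaps in B) for two aligned rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same length")
    gaps_a = aligned_a.count(gap)
    gaps_b = aligned_b.count(gap)
    # Count columns where both rows carry the same residue (gap columns never match).
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    denominator = len(aligned_a) - gaps_a - gaps_b
    if denominator <= 0:
        raise ValueError("no comparable positions remain after removing gaps")
    return 100.0 * matches / denominator


# Made-up aligned fragments for illustration only.
ROW_A = "WSPEED-KLK"
ROW_B = "WSPDEDQKL-"
print(percent_similarity(ROW_A, ROW_B))
```

A similarity measure that also credits conservative substitutions would replace the equality test with a lookup in a scoring matrix such as BLOSUM62.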
[0138] The percent identity between two polypeptide sequences can
also be determined using Accelrys Gene v2.5 (2006) with default
parameters: Pairwise Matrix: GONNET; Align Speed: Slow; Open Gap
Penalty: 10.000; Extended Gap Penalty: 0.100; Multiple Matrix:
GONNET; Multiple Open Gap Penalty: 10.000; Multiple Extended Gap
Penalty: 0.05; Delay Divergent: 30; Gap Separation Distance: 8; End
Gap Separation: false; Residue Specific Penalties: false;
Hydrophilic Penalties: false; Hydrophilic Residues: GPSNDQEKR. The
default parameters for determining percent identity between two
polynucleotide sequences using Accelrys Gene are: Align Speed:
Slow; Open Gap Penalty: 10.000; Extended Gap Penalty: 5.000;
Multiple Open Gap Penalty: 10.000; Multiple Extended Gap Penalty:
5.000; Delay Divergent: 40; Transition: Weighted.
[0139] In addition, one or more polynucleotide sequences or one or
more polypeptides encoded by the polynucleotide sequences may be
used to search against BLOCKS (Bairoch et al., 1997. Nucleic
Acids Res. 25: 217-221), PFAM, and other databases which contain
previously identified and annotated motifs, sequences and gene
functions. Methods that search for primary sequence patterns with
secondary structure gap penalties (Smith et al., 1992. Protein
Engineering 5: 35-51) as well as algorithms such as Basic Local
Alignment Search Tool (BLAST; Altschul, 1990. supra; Altschul et
al., 1993. supra), BLOCKS (Henikoff and Henikoff, 1991 supra),
Hidden Markov Models (HMM; Eddy, 1996. Curr. Opin. Str. Biol. 6:
361-365; Sonnhammer et al., 1997. Proteins 28: 405-420), and the
like, can be used to manipulate and analyze polynucleotide and
polypeptide sequences encoded by polynucleotides. These databases,
algorithms and other methods are well known in the art and are
described in Ausubel et al., 1997. Short Protocols in Molecular
Biology, John Wiley & Sons, New York, N.Y., unit 7.7, and in
Meyers, 1995. Molecular Biology and Biotechnology, Wiley VCH, New
York, N.Y., p 856-853.
[0140] Thus, the instant description provides methods for
identifying a sequence similar or paralogous or orthologous or
homologous to one or more polynucleotides as noted herein, or one
or more target polypeptides encoded by the polynucleotides, or
otherwise noted herein and may include linking or associating a
given plant phenotype or gene function with a sequence. In the
methods, a sequence database is provided (locally or across an
internet or intranet) and a query is made against the sequence
database using the relevant sequences herein and associated plant
phenotypes or gene functions.
[0141] A further method for identifying or confirming that specific
homologous sequences control the same function is by comparison of
the transcript profile(s) obtained upon overexpression or knockout
of two or more related polypeptides. Since transcript profiles are
diagnostic for specific cellular states, one skilled in the art
will appreciate that genes that have a highly similar transcript
profile (e.g., with greater than 50% regulated transcripts in
common, or with greater than 70% regulated transcripts in common,
or with greater than 90% regulated transcripts in common) will have
highly similar functions. Fowler and Thomashow, 2002. Plant Cell
14, 1675-1690, have shown that three paralogous AP2 family genes
(CBF1, CBF2 and CBF3) are induced upon cold treatment, each of
which can condition improved freezing tolerance, and all have
highly similar transcript profiles. Once a polypeptide has been
shown to provide a specific function, its transcript profile
becomes a diagnostic tool to determine whether paralogs or
orthologs have the same function.
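The comparison of transcript profiles described above may be illustrated with a simple set calculation. The description does not specify how the percentage of regulated transcripts in common is computed; the sketch below assumes it is the number of shared regulated transcripts divided by the number of transcripts regulated in either profile. The function names and the transcript labels are hypothetical placeholders, not experimental data.

```python
def percent_in_common(regulated_a: set, regulated_b: set) -> float:
    """Shared regulated transcripts as a percentage of all transcripts regulated in either profile."""
    union = regulated_a | regulated_b
    if not union:
        raise ValueError("both profiles are empty")
    return 100.0 * len(regulated_a & regulated_b) / len(union)


def highest_threshold_exceeded(percent: float, thresholds=(90.0, 70.0, 50.0)):
    """Return the largest of the 90%, 70% and 50% thresholds that is exceeded, or None."""
    for threshold in thresholds:
        if percent > threshold:
            return threshold
    return None


# Placeholder profiles: transcripts 1-8 versus transcripts 3-10.
PROFILE_1 = {f"transcript_{n}" for n in range(1, 9)}
PROFILE_2 = {f"transcript_{n}" for n in range(3, 11)}
shared = percent_in_common(PROFILE_1, PROFILE_2)
print(shared, highest_threshold_exceeded(shared))
```

Dividing by the smaller profile instead of the union would give a more permissive measure; either choice should be stated when profiles are compared.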
[0142] Identifying Polynucleotides or Nucleic Acids by
Hybridization.
[0143] Polynucleotides homologous to the sequences illustrated in
the Sequence Listing and tables can be identified, e.g., by
hybridization to each other under stringent or under highly
stringent conditions. Stringency is influenced by a variety of
factors, including temperature, salt concentration and composition,
organic and non-organic additives, solvents, etc. present in both
the hybridization and wash solutions and incubations, and the
number of washes, as described in more detail in the references
cited below (e.g., Sambrook et al., 1989. supra; Berger and Kimmel,
eds., 1987. Methods Enzymol. 152: 507-511; Anderson and Young,
1985. "Quantitative Filter Hybridisation", In: Hames and Higgins,
ed., Nucleic Acid Hybridisation, A Practical Approach. Oxford, IRL
Press, 73-111), each of which is incorporated herein by reference.
Conditions that are highly stringent, and means for achieving them,
are also well known in the art and described in, for example,
Sambrook et al., 1989. supra; Berger and Kimmel, eds., 1987. Meth.
Enzymol. 152:467-469; and Anderson and Young, 1985. supra.
[0144] Also provided in the instant description are polynucleotide
sequences that are capable of hybridizing to the claimed
polynucleotide sequences, including any of the polynucleotides
within the Sequence Listing, and fragments thereof under various
conditions of stringency (see, for example, Wahl and Berger, 1987.
Methods Enzymol. 152: 399-407; Berger and Kimmel, ed., 1987.
Methods Enzymol. 152:507-511). In addition to the nucleotide
sequences listed in the Sequence Listing, full length cDNA,
orthologs, and paralogs of the present nucleotide sequences may be
identified and isolated using well-known methods. The cDNA
libraries, orthologs, and paralogs of the present nucleotide
sequences may be screened using hybridization methods to determine
their utility as hybridization target or amplification probes.
[0145] Stability of DNA duplexes is affected by such factors as
base composition, length, and degree of base pair mismatch.
Hybridization conditions may be adjusted to allow DNAs of different
sequence relatedness to hybridize. The melting temperature
(T_m) is defined as the temperature at which 50% of the duplex
molecules have dissociated into their constituent single strands.
The melting temperature of a perfectly matched duplex, where the
hybridization buffer contains formamide as a denaturing agent, may
be estimated by the following equations:
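[0146] (I) DNA-DNA:
$$T_m\,(^\circ\mathrm{C}) = 81.5 + 16.6\,(\log[\mathrm{Na}^+]) + 0.41\,(\%\,\mathrm{G+C}) - 0.62\,(\%\,\mathrm{formamide}) - 500/L$$
[0147] (II) DNA-RNA:
$$T_m\,(^\circ\mathrm{C}) = 79.8 + 18.5\,(\log[\mathrm{Na}^+]) + 0.58\,(\%\,\mathrm{G+C}) + 0.12\,(\%\,\mathrm{G+C})^2 - 0.5\,(\%\,\mathrm{formamide}) - 820/L$$
[0148] (III) RNA-RNA:
$$T_m\,(^\circ\mathrm{C}) = 79.8 + 18.5\,(\log[\mathrm{Na}^+]) + 0.58\,(\%\,\mathrm{G+C}) + 0.12\,(\%\,\mathrm{G+C})^2 - 0.35\,(\%\,\mathrm{formamide}) - 820/L$$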
[0149] where L is the length of the duplex formed, [Na+] is the
molar concentration of the sodium ion in the hybridization or
washing solution, and % G+C is the percentage of (guanine+cytosine)
bases in the hybrid. For imperfectly matched hybrids, the melting
temperature is reduced by approximately 1°C for each
1% mismatch.
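For illustration, equation (I) for DNA-DNA duplexes and the mismatch adjustment may be evaluated as in the following minimal Python sketch. It assumes that the logarithm is base 10, that % G+C and % formamide are entered as percentages (for example, 50 for 50%), and that L is given in base pairs; the function names are hypothetical. Equations (II) and (III) are not implemented here.

```python
import math


def tm_dna_dna(na_molar: float, gc_percent: float, formamide_percent: float, length_bp: int) -> float:
    """Estimated melting temperature (deg C) of a perfectly matched DNA-DNA duplex, equation (I)."""
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.62 * formamide_percent
        - 500.0 / length_bp
    )


def tm_with_mismatch(tm_perfect: float, mismatch_percent: float) -> float:
    """Lower the melting temperature by about 1 deg C for each 1% mismatch."""
    return tm_perfect - 1.0 * mismatch_percent


# Example values chosen only to make the arithmetic easy to follow.
tm = tm_dna_dna(na_molar=1.0, gc_percent=50.0, formamide_percent=0.0, length_bp=500)
print(round(tm, 1), round(tm_with_mismatch(tm, 10.0), 1))
```

With 1 M sodium the logarithmic term is zero, so the estimate reduces to 81.5 + 20.5 - 1.0; adding formamide or lowering the salt concentration lowers the estimate accordingly.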
[0150] Hybridization experiments are generally conducted in a
buffer of pH between 6.8 and 7.4, although the rate of hybridization
is nearly independent of pH at ionic strengths likely to be used in
the hybridization buffer (Anderson and Young, 1985. supra). In
addition, one or more of the following may be used to reduce
non-specific hybridization: sonicated salmon sperm DNA or another
non-complementary DNA, bovine serum albumin, sodium pyrophosphate,
sodium dodecylsulfate (SDS), polyvinyl-pyrrolidone, ficoll and
Denhardt's solution. Dextran sulfate and polyethylene glycol 6000
act to exclude DNA from solution, thus raising the effective probe
DNA concentration and the hybridization signal within a given unit
of time. In some instances, conditions of even greater stringency
may be desirable or required to reduce non-specific and/or
background hybridization. These conditions may be created with the
use of higher temperature, lower ionic strength and higher
concentration of a denaturing agent such as formamide.
[0151] Stringency conditions can be adjusted to screen for
moderately similar fragments such as homologous sequences from
distantly related organisms, or for highly similar fragments such as
genes that duplicate functional enzymes from closely related
organisms. The stringency can be adjusted either during the
hybridization step or in the post-hybridization washes. Salt
concentration, formamide concentration, hybridization temperature
and probe lengths are variables that can be used to alter
stringency (as described by the formulae above). As a general
guideline, high stringency is typically performed at
T_m - 5°C to T_m - 20°C, moderate stringency
at T_m - 20°C to T_m - 35°C and low
stringency at T_m - 35°C to T_m - 50°C for
duplexes of more than 150 base pairs. Hybridization may be performed at low to
moderate stringency (25-50°C below T_m), followed by
post-hybridization washes at increasing stringencies. Maximum rates
of hybridization in solution are determined empirically to occur at
T_m - 25°C for DNA-DNA duplexes and T_m - 15°C
for RNA-DNA duplexes. Optionally, the degree of dissociation may be
assessed after each wash step to determine the need for subsequent,
higher stringency wash steps.
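The general guideline above may be expressed as a small lookup, sketched here in Python. The guideline ranges share their end points (20°C and 35°C below the melting temperature); this sketch assigns a shared end point to the more stringent class, which is an assumption, and the function name `stringency_class` is hypothetical.

```python
def stringency_class(tm_c: float, hybridization_temp_c: float) -> str:
    """Classify a hybridization temperature by how far it lies below the melting temperature."""
    delta = tm_c - hybridization_temp_c
    if 5 <= delta <= 20:
        return "high"
    if 20 < delta <= 35:
        return "moderate"
    if 35 < delta <= 50:
        return "low"
    return "outside the guideline ranges"


# Example: a duplex of more than 150 base pairs with an estimated melting temperature of 85 deg C.
for temperature in (70, 60, 40, 84):
    print(temperature, stringency_class(85, temperature))
```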
[0152] High stringency conditions may be used to select for nucleic
acid sequences with high degrees of identity to the disclosed
sequences. An example of stringent hybridization conditions
obtained in a filter-based method such as a Southern or Northern
blot for hybridization of complementary nucleic acids that have
more than 100 complementary residues is about 5°C to
20°C lower than the thermal melting point (T_m) for
the specific sequence at a defined ionic strength and pH.
Conditions used for hybridization may include about 0.02 M to about
0.15 M sodium chloride, about 0.5% to about 5% casein, about 0.02%
SDS or about 0.1% N-laurylsarcosine, about 0.001 M to about 0.03 M
sodium citrate, at hybridization temperatures between about
50°C and about 70°C. More preferably, high
stringency conditions are about 0.02 M sodium chloride, about 0.5%
casein, about 0.02% SDS, about 0.001 M sodium citrate, at a
temperature of about 50°C. Nucleic acid molecules that
hybridize under stringent conditions will typically hybridize to a
probe based on either the entire DNA molecule or selected portions,
e.g., to a unique subsequence, of the DNA.
[0153] Stringent salt concentration will ordinarily be less than
about 750 mM NaCl and 75 mM trisodium citrate. Increasingly
stringent conditions may be obtained with less than about 500 mM
NaCl and 50 mM trisodium citrate, to even greater stringency with
less than about 250 mM NaCl and 25 mM trisodium citrate. Low
stringency hybridization can be obtained in the absence of organic
solvent, e.g., formamide, whereas high stringency hybridization may
be obtained in the presence of at least about 35% formamide, and
more preferably at least about 50% formamide. Stringent temperature
conditions will ordinarily include temperatures of at least about
30°C, more preferably of at least about 37°C, and
most preferably of at least about 42°C with formamide
present. Varying additional parameters, such as hybridization time,
the concentration of detergent, e.g., sodium dodecyl sulfate (SDS)
and ionic strength, are well known to those skilled in the art.
Various levels of stringency are accomplished by combining these
various conditions as needed.
[0154] The washing steps that follow hybridization may also vary in
stringency; the post-hybridization wash steps primarily determine
hybridization specificity, with the most critical factors being
temperature and the ionic strength of the final wash solution. Wash
stringency can be increased by decreasing salt concentration or by
increasing temperature. Stringent salt concentration for the wash
steps will preferably be less than about 30 mM NaCl and 3 mM
trisodium citrate, and most preferably less than about 15 mM NaCl
and 1.5 mM trisodium citrate.
[0155] Thus, high stringency hybridization and wash conditions that
may be used to bind and remove polynucleotides with less than the
desired homology to the nucleic acid sequences or their complements
that encode the present polypeptides include, for example:
[0156] 6×SSC at 65°C;
[0157] 50% formamide, 4×SSC at 42°C; or
[0158] 0.5×SSC, 0.1% SDS at 65°C;
[0159] with, for example, two wash steps of 10-30 minutes each.
Useful variations on these conditions will be readily apparent to
those skilled in the art.
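The SSC-based conditions listed above can be related to the millimolar salt concentrations given in the preceding paragraphs. The following sketch assumes the conventional composition of 1×SSC (150 mM NaCl and 15 mM trisodium citrate), which is not stated in this description; the constant and function names are hypothetical.

```python
# Assumed composition of 1x SSC, in millimolar.
NACL_MM_PER_1X = 150.0
CITRATE_MM_PER_1X = 15.0


def ssc_composition(fold: float) -> tuple[float, float]:
    """Return (NaCl mM, trisodium citrate mM) for a given SSC strength, e.g. 6 for 6x SSC."""
    return fold * NACL_MM_PER_1X, fold * CITRATE_MM_PER_1X


for fold in (6, 4, 0.5):
    nacl, citrate = ssc_composition(fold)
    print(f"{fold}x SSC: {nacl} mM NaCl, {citrate} mM trisodium citrate")
```

Under this assumption, the 0.5×SSC wash corresponds to 75 mM NaCl and 7.5 mM trisodium citrate.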
[0160] A person of skill in the art would not expect substantial
variation among polynucleotide species provided with the present
description because the highly stringent conditions set forth in
the above formulae yield structurally similar polynucleotides.
[0161] If desired, one may employ wash steps of even greater
stringency, including about 0.2×SSC, 0.1% SDS at 65°C
and washing twice, each wash step being about 30 minutes, or
about 0.1×SSC, 0.1% SDS at 65°C and washing twice
for 30 minutes. The temperature for the wash solutions will
ordinarily be at least about 25°C, and for greater
stringency at least about 42°C. Hybridization stringency
may be increased further by using the same conditions as in the
hybridization steps, with the wash temperature raised about
3°C to about 5°C, and stringency may be increased
even further by using the same conditions except the wash
temperature is raised about 6°C to about 9°C. For
identification of less closely related homologs, wash steps may be
performed at a lower temperature, e.g., 50°C.
[0162] An example of a low stringency wash step employs a solution
and conditions of at least 25°C in 30 mM NaCl, 3 mM
trisodium citrate, and 0.1% SDS over 30 minutes. Greater stringency
may be obtained at 42°C in 15 mM NaCl, with 1.5 mM
trisodium citrate, and 0.1% SDS over 30 minutes. Even higher
stringency wash conditions are obtained at 65°C-68°C
in a solution of 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1%
SDS. Wash procedures will generally employ at least two final wash
steps. Additional variations on these conditions will be readily
apparent to those skilled in the art (see, for example, US Patent
Application No. 20010010913).
[0163] Stringency conditions can be selected such that an
oligonucleotide that is perfectly complementary to the coding
oligonucleotide hybridizes to the coding oligonucleotide with at
least about a 5-10× higher signal to noise ratio than the
ratio for hybridization of the perfectly complementary
oligonucleotide to a nucleic acid encoding a polypeptide known as
of the filing date of the application. It may be desirable to
select conditions for a particular assay such that a higher signal
to noise ratio, that is, about 15× or more, is obtained.
Accordingly, a subject nucleic acid will hybridize to a unique
coding oligonucleotide with at least a 2× or greater signal
to noise ratio as compared to hybridization of the coding
oligonucleotide to a nucleic acid encoding a known polypeptide. The
particular signal will depend on the label used in the relevant
assay, e.g., a fluorescent label, a colorimetric label, a
radioactive label, or the like. Labeled hybridization or PCR probes
for detecting related polynucleotide sequences may be produced by
oligolabeling, nick translation, end-labeling, or PCR amplification
using a labeled nucleotide.
[0164] The present description also provides polynucleotide
sequences that are capable of hybridizing to the claimed
polynucleotide sequences, including any of the polynucleotides
within the Sequence Listing, and fragments thereof under various
conditions of stringency (see, for example, Wahl and Berger, 1987,
supra, pages 399-407; and Kimmel, 1987. Meth. Enzymol. 152,
507-511). In addition to the nucleotide sequences in the Sequence
Listing, full length cDNA, orthologs, and paralogs of the present
nucleotide sequences may be identified and isolated using
well-known methods. The cDNA libraries, orthologs, and paralogs of
the present nucleotide sequences may be screened using
hybridization methods to determine their utility as hybridization
target or amplification probes.
EXAMPLES
[0165] It is to be understood that this description is not limited
to the particular devices, machines, materials and methods
described. Although particular embodiments are described,
equivalent embodiments may be used to practice the claims.
[0166] The specification, now being generally described, will be
more readily understood by reference to the following examples,
which are included merely for purposes of illustration of certain
aspects and embodiments of the present description and are not
intended to limit the claims or description. It will be recognized
by one of skill in the art that a polypeptide that is associated
with a particular first trait may also be associated with at least
one other, unrelated and inherent second trait which was not
predicted by the first trait.
Example I
Plant Genotypes and Vector and Cloning Information
[0167] A variety of constructs may be used to modulate the activity
of regulatory polypeptides (RPs), and to test the activity of
orthologs and paralogs in transgenic plant material. This platform
provides the material for all subsequent analysis.
[0168] An individual plant "genotype" refers to a set of plant
lines containing a particular construct or knockout (for example,
this might be 35S lines for a given gene sequence (GID, Gene
Identifier) being tested, 35S lines for a paralog or ortholog of
that gene sequence, lines for an RNAi construct, lines for a GAL4
fusion construct, or lines in which expression of the gene sequence
is driven from a particular promoter that enhances expression in
a particular cell, tissue or condition). For a given genotype arising
from a particular transformed construct, multiple independent
transgenic lines may be examined for morphological and
physiological phenotypes. Each individual "line" (also sometimes
known as an "event") refers to the progeny plant or plants deriving
from the stable integration of the transgene(s), carried within the
T-DNA borders contained within a transformation construct, into a
specific location or locations within the genome of the original
transformed cell. It is well known in the art that different lines
deriving from transformation with a given transgene may exhibit
different levels of expression of that transgene due to so called
"position effects" of the surrounding chromatin at the locus of
integration in the genome, and therefore it is necessary to examine
multiple lines containing each construct of interest.
(1) Overexpression/Tissue-Enhanced/Conditional Expression
[0169] Expression of a given regulatory protein from a particular
promoter, for example a photosynthetic tissue-enhanced promoter
(e.g., a green tissue- or leaf-enhanced promoter), is achieved
either by a direct-promoter fusion construct in which that
regulatory protein is cloned directly behind the promoter of
interest or by a two component system.
[0170] The Two-Component Expression System.
[0171] For the two-component system, two separate constructs are
used: Promoter::LexA-GAL4TA and opLexA::RP. The first of these
(Promoter::LexA-GAL4TA) comprises a desired promoter cloned in
front of a LexA DNA binding domain fused to a GAL4 activation
domain. The construct vector backbone (pMEN48, also known as P5375)
also carries a kanamycin resistance marker, along with an
opLexA::GFP (green fluorescent protein) reporter. Transgenic lines
are obtained containing this first component, and a line is
selected that shows reproducible expression of the reporter gene in
the desired pattern through a number of generations. A homozygous
population is established for that line, and the population is
supertransformed with the second construct (opLexA::RP) carrying
the regulatory protein of interest cloned behind a LexA operator
site. This second construct vector backbone (pMEN53, also known as
P5381) also contains a sulfonamide resistance marker.
[0172] Conditional Expression.
[0173] Various promoters can be used to overexpress disclosed
polypeptides in plants to confer improved photosynthetic resource
use efficiency. However, in some cases, there may be limitations in
the use of various proteins that confer increased photosynthetic
resource use efficiency when the proteins are overexpressed.
Negative side effects associated with constitutive overexpression,
such as small size, delayed growth and development, increased disease
sensitivity, and altered flowering time, are not uncommon.
A number of stress-inducible promoters can be used to promote protein
expression during the periods of stress, and therefore may be used
to induce overexpression of polypeptides that can confer improved
stress tolerance when they are needed without the adverse
developmental or morphological effects that may be associated with
their constitutive overexpression.
[0174] Promoters that drive protein expression in response to
stress can be used to regulate the expression of the disclosed
polypeptides to confer photosynthetic resource use efficiency to
plants. The promoter may regulate expression of a disclosed
polypeptide to an effective level in a photosynthetic tissue.
Effective level in this regard refers to an expression level that
confers greater photosynthetic resource use efficiency in the
transgenic plant relative to the control plant that, for example,
does not comprise a recombinant polynucleotide that encodes the
disclosed polypeptide. Optionally, the promoter does not regulate
protein expression in a constitutive manner.
[0175] Such promoters include, but are not limited to, the
sequences located in the promoter regions of At5g52310 (RD29A),
At5g52300, AT1G16850, At3g46230, AT1G52690, At2g37870, AT5G43840,
At5g66780, At3g17520, and At4g09600.
[0176] In addition, promoters with expression specific to or
enhanced in particular cells or tissue types may be used to express
a given regulatory protein only in these cells or tissues. Examples
of such promoter types include but are not limited to promoters
expressed in green tissue, guard cell, epidermis, whole root, root
hairs, vasculature, apical meristems, and developing leaves.
[0177] Table 4 lists a number of photosynthetic tissue-enhanced
promoters, specifically, mesophyll tissue-enhanced promoters from
rice, that may be used to regulate expression of polynucleotides
and polypeptides found in the Sequence Listing and structurally and
functionally-related sequences. Promoters that may be used to drive
expression of polynucleotides and polypeptides found in the
Sequence Listing and structurally and functionally-related
sequences include, but are not limited to, promoter sequences
listed in Table 4.
TABLE 4
Rice Genes with Photosynthetic Tissue-Enhanced Promoters

SEQ ID NO:   Rice Gene Identifier of Photosynthetic
             Tissue-Enhanced Promoter
136          Os02g09720
137          Os05g34510
138          Os11g08230
139          Os01g64390
140          Os06g15760
141          Os12g37560
142          Os03g17420
143          Os04g51000
144          Os01g01960
145          Os05g04990
146          Os02g44970
147          Os01g25530
148          Os03g30650
149          Os01g64910
150          Os07g26810
151          Os07g26820
152          Os09g11220
153          Os04g21800
154          Os10g23840
155          Os08g13850
156          Os12g42980
157          Os03g29280
158          Os03g20650
159          Os06g43920
[0178] Tissue-enhanced promoters that may be used to drive
expression of polynucleotides and polypeptides found in the
Sequence Listing and structurally and functionally-related
sequences have also been described in US patent application
US20110179520A1, incorporated herein by reference. Such promoters
include, but are not limited to, Arabidopsis sequences located in
the promoter regions of AT1G08465, AT1G10155, AT1G14190, AT1G24130,
AT1G24735, AT1G29270, AT1G30950, AT1G31310, AT1G37140, AT1G49320,
AT1G49475, AT1G52100, AT1G60540, AT1G60630, AT1G64625, AT1G65150,
AT1G68480, AT1G68780, AT1G69180, AT1G77145, AT1G80580, AT2G03500,
AT2G17950, AT2G19910, AT2G27250, AT2G33880, AT2G39850, AT3G02500,
AT3G12750, AT3G15170, AT3G16340, AT3G27920, AT3G30340, AT3G42670,
AT3G44970, AT3G49950, AT3G50870, AT3G54990, AT3G59270, AT4G00180,
AT4G00480, AT4G12450, AT4G14819, AT4G31610, AT4G31615, AT4G31620,
AT4G31805, AT4G31877, AT4G36060, AT4G36470, AT4G36850, AT4G37970,
AT5G03840, AT5G12330, AT5G14070, AT5G16410, AT5G20740, AT5G27690,
AT5G35770, AT5G39330, AT5G42655, AT5G53210, AT5G56530, AT5G58780,
AT5G61070, and AT5G6491.
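Long lists of locus identifiers such as those above are easy to mistype, so a simple format check can be useful before sequences are retrieved. The following is a minimal sketch, assuming the usual shape of Arabidopsis locus codes (AT, a chromosome code, G, five digits) and of the rice identifiers in Table 4 (Os, a two-digit chromosome number, g, five digits). The function name and patterns are illustrative assumptions, not part of the description, and a format match does not show that a locus exists.

```python
import re

# Assumed identifier shapes; these check format only, not database existence.
AGI_PATTERN = re.compile(r"AT[1-5CM]G\d{5}", re.IGNORECASE)
RICE_PATTERN = re.compile(r"Os(0[1-9]|1[0-2])g\d{5}", re.IGNORECASE)


def classify_locus(identifier):
    """Return 'arabidopsis', 'rice', or 'unrecognized' for a locus string."""
    ident = identifier.strip()
    if AGI_PATTERN.fullmatch(ident):
        return "arabidopsis"
    if RICE_PATTERN.fullmatch(ident):
        return "rice"
    return "unrecognized"


def unrecognized_loci(identifiers):
    """List the identifiers that match neither assumed format."""
    return [i for i in identifiers if classify_locus(i) == "unrecognized"]
```

An identifier printed with only four digits, such as AT5G6491, would be reported as unrecognized by this check and could then be verified against the cited application.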
[0179] In addition to the sequences provided in the Sequence
Listing or in this Example, a promoter region may include a
fragment of the promoter sequences provided in the Sequence Listing
or in this Example, or a complement thereof, wherein the promoter
sequence, or the fragment thereof, or the complement thereof,
regulates expression of a polypeptide in a plant cell, for example,
in response to a biotic or abiotic stress, or in a manner that is
enhanced or preferred in certain plant tissues.
(2) Knock-Out/Knock-Down
[0180] In some cases, lines mutated in a given regulatory protein
may be analyzed. Where available, T-DNA insertion lines in a given
gene are isolated and characterized. In cases where a T-DNA
insertion line is unavailable, an RNA interference (RNAi) strategy
is sometimes used.
Example II
Transformation Methods
[0181] Transformation of Monocots.
[0182] Cereal plants including corn, wheat, rice, sorghum, barley,
or other monocots may be transformed with the present
polynucleotide sequences, including monocot or eudicot-derived
sequences such as those presented in the present Tables, cloned
into a vector such as pGA643 and containing a kanamycin-resistance
marker, and expressed constitutively under, for example, the
CaMV35S or COR15 promoters, or with tissue-enhanced or inducible
promoters. The expression vectors may be one found in the Sequence
Listing, or any other suitable expression vector may be similarly
used. For example, pMEN020 may be modified to replace the NptII
coding region with the BAR gene of Streptomyces hygroscopicus that
confers resistance to phosphinothricin. The KpnI and BglII sites of
the Bar gene are removed by site-directed mutagenesis with silent
codon changes.
[0183] The cloning vector may be introduced into a variety of
cereal plants by means well known in the art including direct DNA
transfer or Agrobacterium tumefaciens-mediated transformation. The
latter approach may be accomplished by a variety of means,
including, for example, that of U.S. Pat. No. 5,591,616, in which
monocotyledon callus is transformed by contacting dedifferentiating
tissue with the Agrobacterium containing the cloning vector.
[0184] The sample tissues are immersed in a suspension of
3.times.10.sup.9 cells of Agrobacterium containing the cloning
vector for 3-10 minutes. The callus material is cultured on solid
medium at 25.degree. C. in the dark for several days. The calli
grown on this medium are transferred to a Regeneration Medium.
Transfers are continued every two to three weeks (two or three
times) until shoots develop. Shoots are then transferred to
Shoot-Elongation Medium every 2-3 weeks. Healthy looking shoots are
transferred to Rooting Medium and after roots have developed, the
plants are placed into moist potting soil.
[0185] The transformed plants are then analyzed for the presence of
the NPTII gene/kanamycin resistance by ELISA, using the ELISA NPTII
kit from 5Prime-3Prime Inc. (Boulder, Colo.).
[0186] It is also routine to use other methods to produce
transgenic plants of most cereal crops (Vasil, 1994. Plant Mol.
Biol. 25: 925-937) such as corn, wheat, rice, sorghum (Casas et
al., 1993. Proc. Natl. Acad. Sci. USA 90: 11212-11216), and barley
(Wan and Lemaux, 1994. Plant Physiol. 104: 37-48). DNA transfer
methods such as the microprojectile method can be used for corn
(Fromm et al., 1990. Bio/Technol. 8: 833-839; Gordon-Kamm et al.,
1990. Plant Cell 2: 603-618; Ishida et al., 1996. Nature Biotechnol.
14:745-750), wheat (Vasil et al., 1992. Bio/Technol. 10:667-674;
Vasil et al., 1993. Bio/Technol. 11:1553-1558; Weeks et al., 1993.
Plant Physiol. 102:1077-1084), and rice (Christou, 1991.
Bio/Technol. 9:957-962; Hiei et al., 1994. Plant J. 6:271-282;
Aldemita and Hodges, 1996. Planta 199: 612-617; and Hiei et al.,
1997. Plant Mol. Biol. 35:205-218). For most cereal plants,
embryogenic cells derived from immature scutellum tissues are the
preferred cellular targets for transformation (Hiei et al., 1997.
supra; Vasil, 1994. supra). For transforming corn embryogenic cells
derived from immature scutellar tissue using microprojectile
bombardment, the A188XB73 genotype is the preferred genotype (Fromm
et al., 1990. Bio/Technol. 8: 833-839; Gordon-Kamm et al., 1990.
supra). After microprojectile bombardment the tissues are selected
on phosphinothricin to identify the transgenic embryogenic cells
(Gordon-Kamm et al., 1990. supra). Transgenic plants from
transformed host plant cells may be regenerated by standard corn
regeneration techniques (Fromm et al., 1990. Bio/Technol. 8:
833-839; Gordon-Kamm et al., 1990. supra).
[0187] Transformation of Dicots.
[0188] Crop species that overexpress polypeptides of the instant
description may produce plants with increased photosynthetic
resource use efficiency and/or yield. Thus, polynucleotide
sequences listed in the Sequence Listing recombined into, for
example, one of the expression vectors of the instant description,
or another suitable expression vector, may be transformed into a
plant in order to modify plant traits and thereby improve
yield, quality, and/or photosynthetic resource use
efficiency. The expression vector may contain a constitutive,
tissue-enhanced or inducible promoter operably linked to the
polynucleotide. The cloning vector may be introduced into a variety
of plants by means well known in the art such as, for example,
direct DNA transfer or Agrobacterium tumefaciens-mediated
transformation. It is now routine to produce transgenic plants
using most eudicot plants (see Weissbach and Weissbach, 1989.
Methods for Plant Molecular Biology, Academic Press; Gelvin et al.,
1990. Plant Molecular Biology Manual, Kluwer Academic Publishers;
Herrera-Estrella et al., 1983. Nature 303: 209; Bevan, 1984.
Nucleic Acids Res. 12: 8711-8721; and Klee, 1985. Bio/Technology 3:
637-642). Methods for analysis of traits are routine in the art and
examples are disclosed above.
[0189] Numerous protocols for the transformation of tomato and soy
plants have been previously described, and are well known in the
art. Gruber et al., in Glick and Thompson, 1993. Methods in Plant
Molecular Biology and Biotechnology. eds., CRC Press, Inc., Boca
Raton, describe several expression vectors and culture methods that
may be used for cell or tissue transformation and subsequent
regeneration. For soybean transformation, methods are described by
Miki et al., 1993. in Methods in Plant Molecular Biology and
Biotechnology, p. 67-88, Glick and Thompson, eds., CRC Press, Inc.,
Boca Raton; and U.S. Pat. No. 5,563,055, (Townsend and Thomas),
issued Oct. 8, 1996.
[0190] There are a substantial number of alternatives to
Agrobacterium-mediated transformation protocols for
transferring exogenous genes into soybeans or
tomatoes. One such method is microprojectile-mediated
transformation, in which DNA on the surface of microprojectile
particles is driven into plant tissues with a biolistic device
(see, for example, Sanford et al., 1987. Part. Sci. Technol.
5:27-37; Sanford, 1993. Methods Enzymol. 217: 483-509; Christou et
al., 1992. Plant. J. 2: 275-281; Klein et al., 1987. Nature 327:
70-73; U.S. Pat. No. 5,015,580 (Christou et al), issued May 14,
1991; and U.S. Pat. No. 5,322,783 (Tomes et al.), issued Jun. 21,
1994).
[0191] Alternatively, sonication methods (see, for example, Zhang
et al., 1991. Bio/Technology 9: 996-997); direct uptake of DNA into
protoplasts using CaCl.sub.2 precipitation, polyvinyl alcohol or
poly-L-ornithine (see, for example, Hain et al., 1985. Mol. Gen.
Genet. 199: 161-168; Draper et al., 1982. Plant Cell Physiol. 23:
451-458); liposome or spheroplast fusion (see, for example,
Deshayes et al., 1985. EMBO J., 4: 2731-2737; Christou et al.,
1987. Proc. Natl. Acad. Sci. USA 84: 3962-3966); and
electroporation of protoplasts and whole cells and tissues (see,
for example, Donn et al., 1990. in Abstracts of VIIth International
Congress on Plant Cell and Tissue Culture IAPTC, A2-38: 53;
D'Halluin et al., 1992. Plant Cell 4: 1495-1505; and Spencer et
al., 1994. Plant Mol. Biol. 24: 51-61) have been used to introduce
foreign DNA and expression vectors into plants.
[0192] After a plant or plant cell is transformed (and the
transformed host plant cell then regenerated into a plant), the
transformed plant may be propagated vegetatively or it may be crossed
with itself or a plant from the same line, a non-transformed or
wild-type plant, or another transformed plant from a different
transgenic line of plants. Crossing provides the advantages of
producing new and often stable transgenic varieties. Genes and the
traits they confer that have been introduced into a tomato or
soybean line may be moved into a distinct line of plants using
traditional backcrossing techniques well known in the art.
Transformation of tomato plants may be conducted using the
protocols of Koornneef et al, 1986. In Tomato Biotechnology: Alan
R. Liss, Inc., 169-178, and in U.S. Pat. No. 6,613,962, the latter
method described in brief here. Eight day old cotyledon explants
are precultured for 24 hours in Petri dishes containing a feeder
layer of Petunia hybrida suspension cells plated on MS medium with
2% (w/v) sucrose and 0.8% agar supplemented with 10 .mu.M
.alpha.-naphthalene acetic acid and 4.4 .mu.M 6-benzylaminopurine.
The explants are then infected with a diluted overnight culture of
Agrobacterium tumefaciens containing an expression vector
comprising a polynucleotide of the instant description for 5-10
minutes, blotted dry on sterile filter paper and cocultured for 48
hours on the original feeder layer plates. Culture conditions are
as described above. Overnight cultures of Agrobacterium tumefaciens
are diluted in liquid MS medium (with 2% (w/v) sucrose, pH 5.7) to
an OD.sub.600 of 0.8.
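The dilution step above can be worked through with the usual C1V1 = C2V2 relationship. The sketch below assumes that optical density scales linearly with cell density over the range used, which is only an approximation for dense overnight cultures; the function name and the example stock reading are hypothetical.

```python
def dilution_volumes(stock_od, target_od, final_volume_ml):
    """Return (culture_ml, diluent_ml) needed to reach target_od.

    Uses C1 * V1 = C2 * V2, assuming OD is proportional to cell density.
    """
    if stock_od <= 0 or target_od <= 0 or final_volume_ml <= 0:
        raise ValueError("all inputs must be positive")
    if target_od > stock_od:
        raise ValueError("cannot dilute to a higher OD than the stock")
    culture_ml = final_volume_ml * target_od / stock_od
    diluent_ml = final_volume_ml - culture_ml
    return culture_ml, diluent_ml
```

For instance, a hypothetical overnight culture reading OD.sub.600 2.0 would need 20 ml of culture and 30 ml of liquid MS medium to give 50 ml at OD.sub.600 0.8.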
[0193] Following cocultivation, the cotyledon explants are
transferred to Petri dishes with selective medium comprising MS
medium with 4.56 .mu.M zeatin, 67.3 .mu.M vancomycin, 418.9 .mu.M
cefotaxime and 171.6 .mu.M kanamycin sulfate, and cultured under
the culture conditions described above. The explants are
subcultured every three weeks onto fresh medium. Emerging shoots
are dissected from the underlying callus and transferred to glass
jars with selective medium without zeatin to form roots. The
formation of roots in a kanamycin sulfate-containing medium is a
positive indication of a successful transformation.
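The selective medium above is specified in micromolar units, whereas stock solutions are often prepared by weight. The following sketch converts the stated concentrations to mg/L. The molar masses are assumptions supplied for illustration (in particular, the hydrochloride and sodium salt forms of vancomycin and cefotaxime are assumed, since the text does not name the salt) and should be checked against the reagent actually used.

```python
# Assumed molar masses in g/mol; not stated in the text.
MOLAR_MASS_G_PER_MOL = {
    "zeatin": 219.24,
    "vancomycin hydrochloride": 1485.7,
    "cefotaxime sodium": 477.45,
    "kanamycin sulfate": 582.58,
}

# Concentrations in micromolar, as given for the selective medium.
SELECTIVE_MEDIUM_UM = {
    "zeatin": 4.56,
    "vancomycin hydrochloride": 67.3,
    "cefotaxime sodium": 418.9,
    "kanamycin sulfate": 171.6,
}


def micromolar_to_mg_per_l(conc_um, molar_mass_g_per_mol):
    """umol/L times g/mol gives ug/L; divide by 1000 for mg/L."""
    return conc_um * molar_mass_g_per_mol / 1000.0


def selective_medium_mg_per_l():
    """Convert every listed component to mg/L using the assumed molar masses."""
    return {
        name: micromolar_to_mg_per_l(conc, MOLAR_MASS_G_PER_MOL[name])
        for name, conc in SELECTIVE_MEDIUM_UM.items()
    }
```

Under these assumed molar masses the values come out close to 1 mg/L zeatin, 100 mg/L vancomycin, 200 mg/L cefotaxime and 100 mg/L kanamycin sulfate.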
[0194] Transformation of soybean plants may be conducted using the
methods found in, for example, U.S. Pat. No. 5,563,055 (Townsend et
al., issued Oct. 8, 1996), described in brief here. In this method
soybean seed is surface sterilized by exposure to chlorine gas
evolved in a glass bell jar. Seeds are germinated by plating on
1/10 strength agar solidified medium without plant growth
regulators and culturing at 28.degree. C. with a 16 hour day
length. After three or four days, seed may be prepared for
cocultivation. The seedcoat is removed and the elongating radicle
removed 3-4 mm below the cotyledons.
[0195] Eucalyptus is now considered an important crop that is grown
for example to provide feedstocks for the pulp and paper and
biofuel markets. This species is also amenable to transformation as
described in PCT patent publication WO/2005/032241.
[0196] Crambe has been recognized as a high potential oilseed crop
that may be grown for the production of high value oils. An
efficient method for transformation of this species has been
described in PCT patent publication WO 2009/067398 A1.
[0197] Overnight cultures of Agrobacterium tumefaciens harboring
the expression vector comprising a polynucleotide of the instant
description are grown to log phase, pooled, and concentrated by
centrifugation. Inoculations are conducted in batches such that
each plate of seed is treated with a newly resuspended pellet of
Agrobacterium. The pellets are resuspended in 20 ml inoculation
medium. The inoculum is poured into a Petri dish containing
prepared seed and the cotyledonary nodes are macerated with a
surgical blade. After 30 minutes the explants are transferred to
plates of the same medium that has been solidified. Explants are
embedded with the adaxial side up and level with the surface of the
medium and cultured at 22.degree. C. for three days under white
fluorescent light. These plants may then be regenerated according
to methods well established in the art, such as by moving the
explants after three days to a liquid counter-selection medium (see
U.S. Pat. No. 5,563,055).
[0198] The explants may then be picked, embedded and cultured in
solidified selection medium. After one month on selective media
transformed tissue becomes visible as green sectors of regenerating
tissue against a background of bleached, less healthy tissue.
Explants with green sectors are transferred to an elongation
medium. Culture is continued on this medium with transfers to fresh
plates every two weeks. When shoots are 0.5 cm in length they may
be excised at the base and placed in a rooting medium.
Experimental Methods; Transformation of Arabidopsis
[0199] Transformation of Arabidopsis is performed by an
Agrobacterium-mediated protocol based on the method of Bechtold and
Pelletier, 1998. Unless otherwise specified, all experimental work
is performed using the Columbia ecotype.
[0200] Plant Preparation.
[0201] Arabidopsis seeds are gas sterilized and sown on plates with
media containing 80% MS with vitamins, 0.3% sucrose and 1% Bacto
agar. The plates are placed at 4.degree. C. in the dark for three days,
then transferred to 24-hour light at 22.degree. C. for 7 days. After 7
days the seedlings are transplanted to soil, placing individual
seedlings in each pot. The primary bolts are cut off a week before
transformation to break apical dominance and encourage axillary
shoots to form. Transformation is typically performed at 4-5 weeks
after sowing.
[0202] Bacterial Culture Preparation.
[0203] Agrobacterium stocks are inoculated from single colony
plates or from glycerol stocks and grown with the appropriate
antibiotics until saturation. On the morning of transformation, the
saturated cultures are centrifuged and bacterial pellets are
re-suspended in Infiltration Media (0.5.times.MS, 1.times.
Gamborg's Vitamins, 5% sucrose, 200 .mu.l/L Silwet L77) until an
A.sub.600 reading of 0.8 is reached.
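The additive amounts for a given batch of Infiltration Media follow directly from the stated concentrations. This sketch assumes that "5% sucrose" means weight per volume (50 g per liter); the function name is hypothetical, and the MS salts and vitamins are left out because their amounts depend on the commercial formulation used.

```python
def infiltration_additives(volume_l, sucrose_percent_wv=5.0,
                           silwet_ul_per_l=200.0):
    """Return sucrose (g) and Silwet L77 (ul) for a batch of the given volume.

    Assumes the sucrose percentage is weight/volume, so 1% equals 10 g/L.
    """
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    sucrose_g = sucrose_percent_wv * 10.0 * volume_l
    silwet_ul = silwet_ul_per_l * volume_l
    return {"sucrose_g": sucrose_g, "silwet_ul": silwet_ul}
```

A 0.5 liter batch would therefore take 25 g of sucrose and 100 .mu.l of Silwet L77 under these assumptions.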
[0204] Transformation and Harvest of Transgenic Seeds.
[0205] The Agrobacterium solution is poured into dipping
containers. All flower buds and rosette leaves of the plants are
immersed in this solution for 30 seconds. The plants are laid on
their side and wrapped to keep the humidity high. The plants are
kept this way overnight at 22.degree. C. and then the pots are
turned upright, unwrapped, and moved to the growth racks. In most
cases, the transformation process is repeated one week later to
increase transformation efficiency.
[0206] The plants are maintained on the growth rack under 24-hour
light until seeds are ready to be harvested. Seeds are harvested
when 80% of the siliques of the transformed plants are ripe
(approximately five weeks after the initial transformation). This
seed is deemed T.sub.0 seed, since it is obtained from the T.sub.0
generation, and is later plated on selection plates (either
kanamycin or sulfonamide). Resistant plants that are identified on
such selection plates comprise the T.sub.1 generation, from which
transgenic seed comprising an expression vector of interest may be
derived.
Example III
Primary Screening Materials and Methods
[0207] Plant Growth Conditions
[0208] Seeds from Arabidopsis lines are chlorine gas sterilized
using a standard protocol and spread onto plates containing a
sucrose based media augmented with vitamins (80% MS+Vit, 1%
sucrose, 0.65% PhytoBlend Agar (Caisson Laboratories, Inc., North
Logan, Utah)) and appropriate kanamycin or sulfonamide
concentrations where selection is required. Seeds are stratified in
the dark on plates, at 4.degree. C. for 3 days then moved to a
walk-in growth chamber (Conviron MTW120, Conviron Controlled
Environments Ltd, Winnipeg, Manitoba, Canada) running at a 10 hour
photoperiod at a photosynthetic photon flux of approximately 200
.mu.mol m.sup.-2 s.sup.-1 at plant height and a photoperiod/night
temperature regime of 22.degree. C./19.degree. C. After seven days
of light exposure seedlings are transplanted into 164 ml volume
pots containing autoclaved ProMix.RTM. soil. All pots are returned
to the same growth-chamber where they are stood in water and
covered with a lid for the first seven days. This protocol keeps
the soil moist during this period. Seven days after transplanting
lids are removed and a watering and nutrition regime begun. All
plants receive water three times a week, and a weekly fertilizer
treatment (80% Peter's NPK fertilizer).
[0209] Primary Screening
[0210] Between 35 and 38 days after being transferred to light on
plates, and after between 28 and 31 days growth in soil, a suite of
leaf-physiological parameters are measured using an infrared gas
analyzer (LI-6400XT, LI-COR.RTM. Biosciences, Lincoln, NE, USA)
integrated with a fluorimeter that measures fluorescence from
chlorophyll a (LI-6400-40, LI-COR Biosciences). The growth
conditions used, and plant age and leaf selection criteria for
measurement are designed to maximize the chance that the leaves
sampled fill the 2 cm.sup.2 leaf chamber of the gas-exchange system
and that plants show no visible signs of having transitioned to
reproductive growth.
[0211] Screening High-Light Leaf Physiology at Two Air
Temperatures
[0212] Leaf physiology is screened after plants have been
acclimated to high light (700 .mu.mol photons m.sup.-2 s.sup.-1)
under LED light banks emitting visible light (400-700 nm, Photon
Systems Instruments, Brno, Czech Republic), for 40 minutes. Other
than the change in light level, the atmospheric environment is the
same as that in which the plants have been grown, and the LI-6400
leaf chamber is set to reflect this, being set to deliver a
photosynthetic photon flux of 700 .mu.mol photons m.sup.-2 s.sup.-1
and operate at an air temperature of 22.degree. C. Forty minutes
acclimation to a photosynthetic photon flux of 700 .mu.mol photons
m.sup.-2 s.sup.-1 has repeatedly been shown to be sufficient to
achieve a steady-state rate of light-saturated photosynthesis and
stomatal conductance in control plants. Gas exchange and
fluorescence data are logged simultaneously two minutes after the
leaf has been closed in the chamber. Two minutes is found to be
long enough for the leaf chamber CO.sub.2 and H.sub.2O
concentrations to stabilize after closing a new leaf inside, while
minimizing leaf physiological adjustment to small
differences between the growth environment and the LI-6400 chamber.
Screening at the growth air temperature of 22.degree. C. is begun
one hour into the photoperiod and is typically completed in two
hours. After being screened at 22.degree. C., plants are returned
to growth-light levels prior to being screened again at 35.degree.
C. later in the photoperiod. The higher-temperature screening
begins six hours into the photoperiod and measurements are made
after the rosettes have been acclimated to the same high light dose
as described above, but this time in a controlled environment with
an air temperature set to 35.degree. C. Measurements are again made
in a leaf chamber set to match the warmer air temperature and
logged using the protocol described above for the 22.degree. C.
measurements. Data generated at both 22.degree. C. and 35.degree.
C. are used to calculate: rates of CO.sub.2 assimilation by
photosynthesis (A, .mu.mol CO.sub.2 m.sup.-2 s.sup.-1); rates of
H.sub.2O loss through transpiration (Tr, mmol H.sub.2O m.sup.-2
s.sup.-1); the conductance to CO.sub.2 and H.sub.2O movement
between the leaf and air through the stomatal pore (g.sub.s, mol
m.sup.-2 s.sup.-1); the sub-stomatal CO.sub.2 concentration
(C.sub.i, .mu.mol CO.sub.2 mol.sup.-1); transpiration efficiency,
the instantaneous ratio of photosynthesis to transpiration
(TE=A/Tr, .mu.mol CO.sub.2 mmol.sup.-1 H.sub.2O); and the
rate of electron flow through photosystem two (ETR, .mu.mol
e.sup.- m.sup.-2 s.sup.-1). Derivation of the parameters described above
follows established published protocols (Long & Bernacchi,
2003. J. Exp. Botany 54:2393-24).
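Transpiration efficiency as defined above is a simple ratio of two logged rates. The sketch below shows the calculation for single leaves and for a set of replicate leaves; the function names and any example readings are hypothetical. Because both rates are expressed per unit leaf area and time, the ratio reduces to .mu.mol CO.sub.2 per mmol H.sub.2O.

```python
def transpiration_efficiency(assimilation, transpiration):
    """TE = A / Tr.

    assimilation: A, in umol CO2 m-2 s-1
    transpiration: Tr, in mmol H2O m-2 s-1
    Returns TE in umol CO2 per mmol H2O.
    """
    if transpiration <= 0:
        raise ValueError("transpiration must be positive")
    return assimilation / transpiration


def mean_transpiration_efficiency(readings):
    """Average TE over replicate leaves given as (A, Tr) pairs."""
    if not readings:
        raise ValueError("at least one reading is required")
    values = [transpiration_efficiency(a, tr) for a, tr in readings]
    return sum(values) / len(values)
```

For example, a hypothetical leaf with A of 12 and Tr of 3 has a TE of 4 .mu.mol CO.sub.2 per mmol H.sub.2O.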
[0213] Leaves from up to 10 replicate plants are screened for a
given line of interest. Data generated from these lines are
compared with that from an empty vector control line planted at the
same time, and grown within the same flats, as the lines being
screened.
[0214] For control lines, data are collected not only at an
atmospheric CO.sub.2 concentration of 400 .mu.mol CO.sub.2
mol.sup.-1, but also after stepwise changes in CO.sub.2
concentration to 350, 300, 450 and 500 .mu.mol CO.sub.2 mol.sup.-1.
These measurements underlie screening for more complex
physiological traits of: 1) photosynthetic capacity; 2) regulation
of photosystem two (PSII) operation; and 3) non-photosynthetic
metabolism.
[0215] Screening Photosynthetic Capacity
[0216] Under most conditions, the rate of light-saturated
photosynthesis in a C3 leaf is a product of the biochemical
capacity of the Calvin cycle and the transfer conductance of
CO.sub.2 to the sites of carboxylation (Farquhar et
al., 1980. Planta 149: 78-90). Plotting the rate of photosynthesis
against an estimate of the sub-stomatal CO.sub.2 concentration
(C.sub.i) provides a means to identify changes in photosynthetic
capacity of the Calvin cycle independent of changes in stomatal
conductance, a key component of the total transfer conductance to
CO.sub.2 of the leaf. Consequently, for lines being screened, rates
of photosynthesis are plotted against a regression plot of A vs.
C.sub.i generated for the control lines over a range of atmospheric
CO.sub.2 concentration, as described above. This technique enables
visual confirmation of changes in photosynthetic capacity in lines
of interest.
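The comparison described above is visual, but the same idea can be sketched numerically: fit a regression of A against C.sub.i for the control line, then ask how far a screened line sits above or below that regression at its own C.sub.i. The code below is an illustrative assumption rather than the described protocol; it uses an ordinary least-squares straight line, which is only reasonable over the narrow CO.sub.2 range used here, and all names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def deviation_from_control(ci, assimilation, slope, intercept):
    """Measured A minus the control-line A predicted at the same Ci.

    A positive value is consistent with higher photosynthetic capacity.
    """
    return assimilation - (slope * ci + intercept)
```

A line whose measured A lies consistently above the control regression at the same C.sub.i would be a candidate for increased photosynthetic capacity independent of stomatal conductance.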
[0217] Screening Regulation of Photosystem Two (PSII) Operation
[0218] During acclimation to high light, the efficiency with which
photosystem II (PSII) operates will reach a steady state regulated
largely by the feedback between non-photochemical quenching in the
antenna and the metabolic demand for energy produced in the
chloroplast (Genty et al., 1989. Biochim. Biophys. Acta 990:87-92;
Baker et al., 2007. Plant Cell Environ. 30:1107-1125). This
understanding is used in this screen to identify lines in which the
limitation that non-photochemical quenching exerts on the
efficiency with which photosystem II operates is decreased. Lower
levels of non-photochemical quenching will result in a higher
efficiency of photosynthesis over a range of light levels, but
importantly higher rates of photosynthesis at low light where
light-use efficiency is important. Increasing rates of
photosynthesis as leaves in crop canopies transition from high to
low light is a process thought relevant to increasing crop-canopy
photosynthesis (Zhu et al., 2010. Plant Biol. 61:235-261). In
keeping with the A/Ci analysis described above, a regression of the
operating efficiency of PSII against non-photochemical quenching is
generated for the control line from data collected over a range of
atmospheric CO.sub.2 concentration to provide a reference against
which data for lines of interest can be visually compared.
[0219] Screening for Non-Photosynthetic Metabolism
[0220] Measurement of the ratio of the rate of electron flow
through PSII (ETR) to the rate of photosynthesis (A) is used to
screen for changes in non-photosynthetic metabolism. This screen is
based upon the understanding that the transport of four .mu.mol of
electrons from PSII to photosystem I (PSI) will supply the NADPH
and ATP required to fix one .mu.mol of CO.sub.2 in the Calvin
cycle. For a C3 leaf operating in an atmosphere with 21% oxygen,
the ratio of electron flow to photosynthesis should be higher than
four, reflecting photorespiratory and other metabolism. However,
because the rate of photorespiration in a C3 leaf is dependent upon
the concentration of CO.sub.2 at the active site of Rubisco, a
regression of the ratio of electron flow to photosynthesis,
generated over the range of CO.sub.2 concentrations described
above, provides the reference regression against which lines being
screened can be compared to controls. Changes in the ratio of ETR
to A, when observed at the same C.sub.i as the control line, could
indicate changes in the specificity of the Rubisco active site for
O.sub.2 relative to CO.sub.2 and/or other metabolic sinks, which
would be expected to have important implications for crop
productivity and/or stress tolerance.
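The ratio used in this screen can be illustrated with a short Python sketch. The function names and the example values are hypothetical; only the figure of four electrons per CO.sub.2 fixed is taken from the description above.

```python
# Theoretical minimum stated in the text: 4 umol electrons per umol CO2 fixed.
ELECTRONS_PER_CO2 = 4.0


def etr_to_a_ratio(etr, a):
    """Ratio of PSII electron flow (ETR) to net photosynthesis (A)."""
    if a <= 0:
        raise ValueError("photosynthetic rate must be positive")
    return etr / a


def excess_electron_flow(etr, a):
    """Electron flow beyond that needed for CO2 fixation alone,
    i.e. flow attributable to photorespiration and other sinks."""
    return etr - ELECTRONS_PER_CO2 * a


# Hypothetical leaf: ETR = 120 and A = 20 (both in umol m-2 s-1).
print(etr_to_a_ratio(120.0, 20.0))
print(excess_electron_flow(120.0, 20.0))
```

In this hypothetical case the ratio exceeds four, as expected for a C3 leaf in 21% oxygen; the screen compares such ratios between a line of interest and the control at the same C.sub.i.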
[0221] Surrogate Screening for Growth-Light Physiology
[0222] Rosette biomass: the dry weight of whole Arabidopsis
rosettes is measured after being dried down at 80.degree. C. for 24
hours, a time found to be sufficient to reach constant weight.
Samples are taken after 35-38 days growth, and used as an assay of
above-ground productivity at growth light. Typically, five
replicate rosettes are sampled per Arabidopsis line being
screened.
[0223] Rosette chemical and isotopic C and N analysis: after
weighing, the five rosettes sampled for each line screened are
pooled together and ground to a fine powder. The pooled sample
generated is sub-sampled and approximately 4 .mu.g samples are
prepared for analysis.
[0224] Chlorophyll content index (CCI): measurements of light
transmission through the leaf are made for plants being screened
using a chlorophyll content meter (CCM-200, Apogee Instruments,
Logan, Utah, USA). The first is made within the first hour of the
photoperiod prior to any acclimation to high light on leaves of
plants sampled for rosette analysis. The second is made later in
the photoperiod on leaves of plants that have undergone the
high-temperature screening.
[0225] Light absorption: measurements of CCI are used as a
surrogate for leaf light absorption, based upon a known
relationship between the two. The estimates of light absorption by
the leaf, required to construct this relationship, were made by
placing the leaf on top of a quantum sensor (LI-190, LI-COR
Biosciences) with both the leaf and quantum sensor then pressed
firmly up to the foam gasket underneath the LI-6400 light source.
This procedure provides an estimate of the transmission of a known
light flux through the leaf and is used to estimate the fraction of
light absorbed by the leaf.
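As a minimal sketch of how a transmission measurement might be converted to a fraction of light absorbed, the following Python function assumes that absorbed light is what remains after transmitted and reflected light are subtracted. The function name, the optional reflectance term (set to zero by default) and the example fluxes are assumptions for illustration and are not specified in this description.

```python
def fraction_absorbed(incident, transmitted, reflectance=0.0):
    """Estimate the fraction of incident light absorbed by a leaf.

    incident and transmitted are light fluxes in the same units
    (e.g. umol PAR m-2 s-1); reflectance is a fraction between 0 and 1
    and is a hypothetical parameter, not a value from the source.
    """
    if incident <= 0:
        raise ValueError("incident light flux must be positive")
    transmittance = transmitted / incident
    return 1.0 - transmittance - reflectance


# Hypothetical reading: 700 units incident, 70 units measured under the leaf.
print(fraction_absorbed(incident=700.0, transmitted=70.0))
```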
Example IV
Experimental Results
[0226] This Example provides experimental observations for
transgenic plants overexpressing AtMYB19 related polypeptides in
plate-based assays and results observed for improved photosynthetic
resource use efficiency.
[0227] Photosynthetic rate was increased in six of nine independent
lines screened at growth temperature (22.degree. C.) and seven of
nine lines for measurements made after acclimation to high
temperature. For measurements made at air temperatures of
22.degree. C. and 35.degree. C., photosynthesis was increased by
16% at 22.degree. C. and 17% at 35.degree. C., when averaged across
the lines that displayed increased photosynthesis. This provided
evidence that the increase in photosynthesis observed in
Arabidopsis plants overexpressing AtMYB19 is conferred over a wide
range of air temperatures. Leaf and crop-canopy photosynthesis is
known to be related to final crop yield and improving
photosynthesis is widely considered to be a relevant pathway to
increasing crop yield. In a C3 plant, photosynthesis at high-light
can be limited by the biochemical capacity for photosynthesis,
indicated as photosynthetic capacity in Tables 5 and 6, or the
supply of CO.sub.2 into the chloroplast, of which stomatal
conductance, which regulates the transfer of CO.sub.2 into the leaf
through the stomata, is a principal component. Both the capacity for
photosynthesis and stomatal conductance were increased in
Arabidopsis plants overexpressing AtMYB19 assayed at both
temperatures. Photosynthetic capacity was increased in five lines
at 22.degree. C. and in three at 35.degree. C. Focused secondary
assays on select lines enabled the biochemical limitations to
photosynthesis that underlie photosynthetic capacity to be
investigated. For measurements made at 22.degree. C., the
biochemical basis for the increase in photosynthetic capacity was
an increase in the plant's capacity to regenerate RuBP, a key
substrate for photosynthesis, for four lines (figure *). Two of
these four lines also displayed evidence of an increase in the
activity of Rubisco (figure *). For measurements made at 35.degree.
C., three lines displayed an increase in the capacity to regenerate
RuBP. Stomatal conductance was increased by 32% at 22.degree. C.
and 37% at 35.degree. C., when averaged across the AtMYB19
overexpression lines that displayed increased photosynthesis. The
extent to which photosynthesis is increased as a consequence of
improvements in photosynthetic capacity and stomatal conductance
has important implications. For example, increasing stomatal
conductance will increase the supply of CO.sub.2 into the leaf;
however, this will increase photosynthesis to a greater extent in a
C3 plant than a C4 plant, where chloroplast CO.sub.2 concentrations
are typically maintained at close to saturating levels for
photosynthesis. Increasing stomatal conductance will increase
transpiration from the leaf, typically to a greater extent than
photosynthesis is stimulated. This combination of traits may be
more appropriate for crops growing on acreages where soil-water
availability is seldom limiting yield. Conversely, an increase in
photosynthetic capacity could increase photosynthetic rate without
increasing stomatal conductance and water loss, and would be
expected to increase crop yield over broad acres. For transgenic
plants overexpressing AtMYB19 related polypeptides, the increase in
photosynthetic rate was the result of increases in both
photosynthetic capacity and stomatal conductance. Consequently,
transpiration efficiency, often used synonymously with WUE and
expressed as unit carbon uptake via photosynthesis per unit water
lost via transpiration, was typically not decreased across lines
and temperatures.
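The definition of transpiration efficiency given above (carbon uptake per unit water lost) can be written as a short Python sketch. The function names and all numerical values below are hypothetical and serve only to show the arithmetic; they are not measurements from the lines described.

```python
def transpiration_efficiency(a, e):
    """Carbon uptake via photosynthesis (A) per unit water lost via transpiration (E)."""
    if e <= 0:
        raise ValueError("transpiration rate must be positive")
    return a / e


def percent_change(new, old):
    """Percent change of a value relative to its control."""
    return 100.0 * (new - old) / old


# Hypothetical control leaf: A = 10 umol CO2 m-2 s-1, E = 4 mmol H2O m-2 s-1.
te_control = transpiration_efficiency(10.0, 4.0)

# Hypothetical transgenic leaf in which A rises more than E in relative terms.
te_line = transpiration_efficiency(11.6, 4.4)

print(te_control, round(percent_change(te_line, te_control), 1))
```

Transpiration efficiency is maintained or improved only when the relative increase in A is at least as large as the relative increase in E, which is why a contribution from photosynthetic capacity, and not stomatal conductance alone, matters for this trait.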
[0228] All experimental observations of greater photosynthetic
resource use efficiency were made by comparison to control plants
(e.g., plants that did not comprise a recombinant construct
encoding an AtMYB19-related polypeptide or overexpress an AtMYB19
clade or phylogenetically-related regulatory protein).
[0229] Tables 5 and 6 present the indicators of photosynthetic
resource use efficiency observed in Arabidopsis plants
overexpressing AtMYB19 in experiments conducted to date. The data
presented in Table 5 were collected on plants at their normal
growth temperature of 22.degree. C. For lines with increased
photosynthetic capacity, RuBP indicates that the capacity to
regenerate RuBP was increased and Rubisco indicates that Rubisco
activity was increased.
TABLE 5
Photosynthetic resource use efficiency measurements in plants with altered expression of MYB19 clade polypeptides at a growth temperature of 22.degree. C.

Polypeptide sequence/Line | SEQ ID NO | Driver | Target | Photosynthetic Rate 22.degree. C. | Stomatal Conductance 22.degree. C. | Photosynthetic Capacity
MYB19/Line 1 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (20%) | Increased (32%) | No effect
MYB19/Line 2 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (15%) | Increased (28%) | Increased (Rubisco and RuBP)
MYB19/Line 3 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (10%) | Increased (35%) | Increased (Rubisco and RuBP)
MYB19/Line 4 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | No effect | No effect | No effect
MYB19/Line 5 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (26%) | Increased (27%) | Increased
MYB19/Line 6 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (13%) | Increased (30%) | Increased (RuBP)
MYB19/Line 7 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (10%) | Increased (41%) | Increased (RuBP)
MYB19/Line 8 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | No effect | No effect | No effect
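The 22.degree. C. averages quoted in paragraph [0227] can be checked against the per-line values in Table 5. The short Python sketch below simply averages the percentages listed for the six lines in which photosynthetic rate was increased; the variable names are illustrative.

```python
# Percent increases copied from Table 5 for the six lines in which
# photosynthetic rate was increased (Lines 1, 2, 3, 5, 6 and 7).
rate_increase = [20, 15, 10, 26, 13, 10]
conductance_increase = [32, 28, 35, 27, 30, 41]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


mean_rate = mean(rate_increase)
mean_conductance = mean(conductance_increase)

# Rounded to whole percentages, as reported in the text.
print(round(mean_rate), round(mean_conductance))  # prints: 16 32
```

The rounded means agree with the 16% increase in photosynthesis and 32% increase in stomatal conductance reported for 22.degree. C.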
[0230] The data presented in Table 6 were collected on plants
acclimated to an air temperature of 35.degree. C. For lines with
increased photosynthetic capacity, RuBP indicates that the capacity
to regenerate RuBP was increased and Rubisco indicates that Rubisco
activity was increased.
TABLE 6
Photosynthetic resource use efficiency measurements in plants with altered expression of MYB19 clade polypeptides at a growth temperature of 35.degree. C.

Polypeptide sequence/Line | SEQ ID NO | Driver | Target | Photosynthetic Rate 35.degree. C. | Stomatal Conductance 35.degree. C. | Photosynthetic Capacity
MYB19/Line 1 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (22%) | Increased (49%) | No effect
MYB19/Line 2 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (14%) | Increased (43%) | Increased (RuBP)
MYB19/Line 3 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (15%) | Increased (23%) | Increased (RuBP)
MYB19/Line 4 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (26%) | Increased (39%) | No effect
MYB19/Line 5 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (22%) | Increased (37%) | No effect
MYB19/Line 6 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (19%) | Increased (61%) | No effect
MYB19/Line 7 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | Increased (13%) | Increased (28%) | Increased (RuBP)
MYB19/Line 8 | 2 | 35S::m35S::oEnh:LexA:GAL4_opLexA::GFP | opLexA::G1309 | No effect | Increased (17%) | No effect
[0231] The results presented in Tables 5 and 6 were determined
after screening nine independent transgenic events. Multiple lines
were screened in replicate independent experiments.
[0232] The present disclosure thus describes how the transformation
of plants, which may include monocots and/or dicots, with a MYB19
clade polypeptide can confer to the transformed plants greater
photosynthetic resource use efficiency than the level of
photosynthetic resource use efficiency exhibited by control plants.
In one embodiment, expression of MYB19 is driven by a constitutive
promoter. In another embodiment, expression of MYB19 is driven by a
promoter with enhanced activity in a tissue capable of
photosynthesis (also referred to herein as a "photosynthetic
promoter" or a "photosynthetic tissue-enhanced promoter") such as a
leaf tissue or other green tissue. Examples of photosynthetic
tissue-enhanced promoters include an RBCS3 promoter
(SEQ ID NO: 133), an RBCS4 promoter (SEQ ID NO: 134) or others such
as the At4g01060 promoter (SEQ ID NO: 135), the latter regulating
expression in guard cells. Other photosynthetic tissue-enhanced
promoters have been taught by Bassett et al., 2007. BMC Biotechnol.
7: 47, specifically incorporated herein by reference in its
entirety. Other photosynthetic tissue-enhanced promoters of
interest include those from the maize aldolase gene FDA (U.S.
Patent Publication No. US20040216189, specifically incorporated
herein by reference in its entirety), and the aldolase and pyruvate
orthophosphate dikinase (PPDK) genes (Taniguchi et al., 2000. Plant Cell
Physiol. 41:42-48, specifically incorporated herein by reference in
its entirety). Other tissue-enhanced promoters or inducible
promoters are also envisioned that may be used to regulate
expression of MYB19 clade member polypeptides and improve
photosynthetic resource use efficiency in a variety of plants.
Example V
Utilities of MYB19 Clade Sequences for Improving Photosynthetic
Resource Use Efficiency, Yield or Biomass
[0233] The improved photosynthetic resource use efficiency
conferred by increasing the expression level of a MYB19 clade
polypeptide sequence may contribute to increased yield of
commercially available plants. For plants for which biomass is the
product of interest, increasing the expression level of MYB19 clade
polypeptide sequences may increase yield, photosynthetic
resource use efficiency, vigor, growth rate, and/or biomass of the
plants. It is thus expected that these sequences will improve
yield and/or photosynthetic resource use efficiency in
non-Arabidopsis plants relative to control plants. This
improvement may result in yield increases in crop or other
non-Arabidopsis plants including, but not limited to, wheat,
Setaria, corn (maize), rice, barley, rye, millet, sorghum,
sugarcane, miscane, turfgrass, Miscanthus, switchgrass, soybean,
cotton, rape, oilseed rape including canola, Eucalyptus or poplar,
such as an increase of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%,
10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 25%, 30% or
greater in yield relative to the yield that may be obtained with
control plants.
[0234] It is expected that the same methods may be applied to
identify other useful and valuable sequences that are
functionally-related and/or closely-related to the listed sequences
or domains provided in Tables 2 or 3, and the sequences may be
derived from a diverse range of species. Because of morphological,
physiological and photosynthetic resource use efficiency
similarities that may occur among MYB19-related sequences, the
MYB19 clade sequences are expected to increase yield, plant growth,
vigor, size, biomass, and/or photosynthetic resource use
efficiency in a variety of crop plants, ornamental plants, and
woody plants used in the food, ornamental, paper, pulp, lumber or
other industries.
Example VI
Expression and Analysis of Increased Yield or Photosynthetic
Resource Use Efficiency in Non-Arabidopsis or Crop Species
[0235] Northern blot analysis, RT-PCR or microarray analysis of the
regenerated, transformed plants may be used to show expression of a
polypeptide of the instant description and related genes that are
capable of inducing improved photosynthetic resource use
efficiency, and/or larger size.
[0236] After a eudicot plant, monocot plant or plant cell has been
transformed (and the latter plant host cell regenerated into a
plant) and shown to have greater photosynthetic resource use
efficiency, and/or greater size, vigor, biomass, and/or produce
greater yield relative to a control plant, the transformed monocot
plant may be crossed with itself or a plant from the same line, a
non-transformed or wild-type monocot plant, or another transformed
monocot plant from a different transgenic line of plants.
[0237] The function of one or more specific polypeptides of the
instant description has been analyzed and may be further
characterized and incorporated into crop plants. The ectopic
overexpression of one or more of MYB19 clade polypeptide sequences
may be regulated using constitutive, inducible, or tissue-enhanced
regulatory elements. Genes that have been examined have been shown
to modify plant traits including increasing yield and/or
photosynthetic resource use efficiency. It is expected that newly
discovered polynucleotide and polypeptide sequences closely
related, as determined by the disclosed hybridization or identity
analyses, to polynucleotide and polypeptide sequences found in the
Sequence Listing can also confer alteration of traits in a similar
manner to the sequences found in the Sequence Listing, when
transformed into any of a considerable variety of plants of
different species, and including dicots and monocots. The
polynucleotide and polypeptide sequences derived from monocots
(e.g., the rice sequences) may be used to transform both monocot
and dicot plants, and those derived from dicots (e.g., the
Arabidopsis and soy genes) may be used to transform either group,
although it is expected that some of these sequences will function
best if the gene is transformed into a plant from the same group as
that from which the sequence is derived.
[0238] As an example of a first step to determine photosynthetic
resource use efficiency, seeds of these transgenic plants may be
grown as described above or by methods known in the art. Disclosed
sequences may be identified whose ectopic expression or
overexpression in plants results in one or more characteristics
that lead to greater photosynthetic resource use efficiency.
These characteristics include:
[0239] (a) increased photosynthetic capacity, measured as an
increase in the rate of light-saturated photosynthesis of at least
10% when compared to the rate of light-saturated photosynthesis of
a control leaf at the same leaf-internal CO.sub.2 concentration.
Optionally, measurements are made after 40 minutes of acclimation
to a light intensity that is saturating for photosynthesis;
[0240] (b) increased photosynthetic rate, measured as an increase
in the rate of light-saturated photosynthesis of at least 10%.
Optionally, measurements are made after 40 minutes of acclimation
to a light intensity known to be saturating for photosynthesis;
[0241] (c) a decrease in the chlorophyll content of the leaf of at
least 10%, observed in the absence of a decrease in photosynthetic
capacity;
[0242] (d) a decrease in the percentage of the leaf dry weight that
is nitrogen of at least 0.5%, observed in the absence of a decrease
in photosynthetic capacity or increase in dry weight;
[0243] (e) increased transpiration efficiency, measured as an
increase in the rate of light-saturated photosynthesis relative to
water loss via transpiration from the leaf, of at least 10%;
optionally, measurements are made after 40 minutes of acclimation
to a light intensity of 700 .mu.mol PAR m.sup.-2 s.sup.-1;
[0244] (f) an increase in the resistance to water vapor diffusion
out of the leaf that is exerted by the stomata, measured as a
decrease in stomatal conductance to H.sub.2O loss from the leaf of
at least 10%; optionally, measurements are made after 40 minutes of
acclimation to a light intensity of 700 .mu.mol PAR m.sup.-2 s.sup.-1;
[0245] (g) a decrease in the relative limitation that
non-photochemical quenching exerts on the operation of PSII,
measurable as a decrease in non-photochemical quenching of at
least 2%, for leaf measurements made after 40 minutes of
acclimation to a light intensity of 700 .mu.mol PAR m.sup.-2
s.sup.-1, where the term non-photochemical quenching covers a suite
of processes that dissipate light energy absorbed by
light-harvesting antennae as heat;
[0246] (h) a decrease in the ratio of the carbon isotope .sup.12C
to .sup.13C found in either all the dried above-ground biomass or
specific components of the above-ground biomass, such as leaves or
reproductive structures, of at least 0.5‰ (0.5 per mille),
measured as a decrease in the ratio of .sup.12C to .sup.13C
relative to the controls, with both ratios being expressed relative
to the same standard; and/or
[0247] (i) an increase in the total dry weight of above-ground
plant material of at least 5%.
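Several of the percentage thresholds listed in items (b), (c), (e), (f) and (i) above can be expressed as a simple check, sketched here in Python. The dictionary keys, the function name and the example values are hypothetical. The sketch covers only the percent-change criterion and does not test the accompanying conditions (for example, the absence of a decrease in photosynthetic capacity required by item (c)).

```python
# Direction and minimum percent change for selected characteristics.
# Threshold values are taken from the list above; the key names are illustrative.
THRESHOLDS = {
    "photosynthetic_rate": ("increase", 10.0),       # item (b)
    "chlorophyll_content": ("decrease", 10.0),       # item (c)
    "transpiration_efficiency": ("increase", 10.0),  # item (e)
    "stomatal_conductance": ("decrease", 10.0),      # item (f)
    "aboveground_dry_weight": ("increase", 5.0),     # item (i)
}


def meets_threshold(trait, control, test):
    """Return True if the test value differs from the control value in the
    required direction by at least the minimum percentage for the trait."""
    direction, minimum = THRESHOLDS[trait]
    change = 100.0 * (test - control) / control
    if direction == "decrease":
        change = -change
    return change >= minimum


# Hypothetical measurements (control value, transgenic value).
print(meets_threshold("photosynthetic_rate", 10.0, 11.5))
print(meets_threshold("aboveground_dry_weight", 100.0, 103.0))
print(meets_threshold("chlorophyll_content", 30.0, 26.0))
```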
[0248] Closely-related homologs of MYB19 derived from various
diverse plant species may be overexpressed in plants and have the
same functions of conferring increased photosynthetic resource use
efficiency. It is thus expected that structurally similar orthologs
of the MYB19 polypeptide clade, including SEQ ID NOs: 2n, where
n=1-17, can confer increased yield, and/or increased vigor,
biomass, or size, relative to control plants. As at least one
sequence of the instant description has increased photosynthetic
resource use efficiency in Arabidopsis, it is expected that the
sequences provided in the Sequence Listing, or polypeptide
sequences comprising one of or any of the conserved first Myb DNA
binding domains provided in Table 2, or the conserved second Myb
DNA binding domains provided in Table 3, will increase the
photosynthetic resource use efficiency and/or yield of transgenic
plants including transgenic non-Arabidopsis (plant species other
than Arabidopsis species) crop or other commercially important
plant species, including, but not limited to, non-Arabidopsis
plants and plant species such as monocots and dicots; wheat,
Setaria, corn (maize), teosinte (Zea species which is related to
maize), rice, barley; rye; millet; sorghum; sugarcane, miscane,
turfgrass, Miscanthus, switchgrass; soybean, cotton, rape, oilseed
rape including canola, tobacco, tomato, tomatillo, potato,
sunflower, alfalfa, clover, banana, blackberry, blueberry,
strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee,
cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion,
papaya, peas, peppers, pineapple, pumpkin, spinach, squash, sweet
corn, watermelon, rosaceous fruits including apple, peach, pear,
cherry and plum; and brassicas including broccoli, cabbage,
cauliflower, Brussels sprouts, and kohlrabi; currant; avocado;
citrus fruits including oranges, lemons, grapefruit and tangerines,
artichoke, cherries; endive; leek; roots such as arrowroot, beet,
cassava, turnip, radish, yam, and sweet potato; beans; woody
species including pine, poplar, Eucalyptus, mint or other labiates;
nuts such as walnut and peanut. Within each of these species the
closely-related homologs of MYB19 may be overexpressed or
ectopically expressed in different varieties, cultivars, or
germplasm.
[0249] All publications and patent applications mentioned in this
specification are herein incorporated by reference to the same
extent as if each individual publication or patent application was
specifically and individually indicated to be incorporated by
reference.
[0250] The present invention is not limited by the specific
embodiments described herein. The invention now being fully
described, it will be apparent to one of ordinary skill in the art
that many changes and modifications can be made thereto without
departing from the spirit or scope of the appended claims.
Modifications that become apparent from the foregoing description
and accompanying figures fall within the scope of the claims.
Sequence CWU 1
1
1601804DNAArabidopsis thalianaAT5G52260.1 1atgaccaaat ctggagagag
accaaaacag agacagagga aagggttatg gtcacctgaa 60gaagaccaga agctcaagag
tttcatcctc tctcgtggcc atgcttgctg gaccactgtt 120cccatcctag
ctggattgca aaggaatggg aaaagctgca gattaaggtg gattaattac
180ctaagaccag gactaaagag ggggtcgttt agtgaagaag aagaagagac
catcttgact 240ttacattctt ccttgggtaa caagtggtct cggattgcaa
aatatttacc gggaagaaca 300gacaacgaga ttaagaacta ttggcattcc
tatctgaaga agagatggct caaatctcaa 360ccacaactca aaagccaaat
atcagacctc acagaatctc cttcttcact actttcttgc 420gggaaaagaa
atctggaaac cgaaacccta gatcacgtga tctccttcca gaaattttca
480gagaatccaa cttcatcacc atccaaagaa agcaacaaca acatgatcat
gaacaacagt 540aataacttgc ctaaactgtt cttctctgag tggatcagtt
cttcaaatcc acacatcgat 600tactcctctg cttttacaga ttccaagcac
attaatgaaa ctcaagatca aatcaatgaa 660gaggaagtga tgatgatcaa
taacaacaac tactcttcac ttgaggatgt catgctccgt 720acagattttt
tgcagcctga tcatgaatat gcaaattatt attcttctgg agatttcttc
780atcaacagtg accaaaatta tgtc 8042268PRTArabidopsis
thalianaAT5G52260.1, domain AAs 17-77, 70-112 2Met Thr Lys Ser Gly
Glu Arg Pro Lys Gln Arg Gln Arg Lys Gly Leu 1 5 10 15 Trp Ser Pro
Glu Glu Asp Gln Lys Leu Lys Ser Phe Ile Leu Ser Arg 20 25 30 Gly
His Ala Cys Trp Thr Thr Val Pro Ile Leu Ala Gly Leu Gln Arg 35 40
45 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro Gly
50 55 60 Leu Lys Arg Gly Ser Phe Ser Glu Glu Glu Glu Glu Thr Ile
Leu Thr 65 70 75 80 Leu His Ser Ser Leu Gly Asn Lys Trp Ser Arg Ile
Ala Lys Tyr Leu 85 90 95 Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn
Tyr Trp His Ser Tyr Leu 100 105 110 Lys Lys Arg Trp Leu Lys Ser Gln
Pro Gln Leu Lys Ser Gln Ile Ser 115 120 125 Asp Leu Thr Glu Ser Pro
Ser Ser Leu Leu Ser Cys Gly Lys Arg Asn 130 135 140 Leu Glu Thr Glu
Thr Leu Asp His Val Ile Ser Phe Gln Lys Phe Ser 145 150 155 160 Glu
Asn Pro Thr Ser Ser Pro Ser Lys Glu Ser Asn Asn Asn Met Ile 165 170
175 Met Asn Asn Ser Asn Asn Leu Pro Lys Leu Phe Phe Ser Glu Trp Ile
180 185 190 Ser Ser Ser Asn Pro His Ile Asp Tyr Ser Ser Ala Phe Thr
Asp Ser 195 200 205 Lys His Ile Asn Glu Thr Gln Asp Gln Ile Asn Glu
Glu Glu Val Met 210 215 220 Met Ile Asn Asn Asn Asn Tyr Ser Ser Leu
Glu Asp Val Met Leu Arg 225 230 235 240 Thr Asp Phe Leu Gln Pro Asp
His Glu Tyr Ala Asn Tyr Tyr Ser Ser 245 250 255 Gly Asp Phe Phe Ile
Asn Ser Asp Gln Asn Tyr Val 260 265 3849DNAArabidopsis
thalianaAT4G25560.1 3atggcgaaga cgaaatatgg agagagacat aggaaagggt
tatggtcacc tgaagaagac 60gagaagctaa ggagcttcat cctctcttat ggccattctt
gctggaccac tgttcccatc 120aaagctgggt tacaaaggaa tgggaagagc
tgcagattaa gatggattaa ttacctaaga 180ccagggttaa agagggatat
gattagtgca gaagaagaag agactatctt gacgtttcat 240tcttccttgg
gtaacaagtg gtcgcaaata gctaaattct taccgggaag aacagacaat
300gagataaaga actattggca ctctcatttg aaaaagaaat ggctcaagtc
tcagagctta 360caagatgcaa aatctatttc ccctccttcg tcttcatcat
catcacttgt tgcttgtgga 420aaaagaaatc cggaaacctt gatctcgaat
cacgtgttct ccttccagag acttctagag 480aacaaatctt catctccctc
acaagaaagc aacggaaata acagccatca atgttcttct 540gctcctgaga
ttccaaggct tttcttctct gaatggcttt cttcttcata tccccacacc
600gattattcct ctgagtttac cgactctaag cacagtcaag ctccaaatgt
cgaagagact 660ctctcagctt atgaagaaat gggtgatgtt gatcagttcc
attacaacga aatgatgatc 720aacaacagca actggactct taacgacatt
gtgtttggtt ccaaatgtaa gaagcaggag 780catcatattt atagagaggc
ttcagattgt aattcttctg ctgaattctt ttctccatca 840acaacgacg
8494283PRTArabidopsis thalianaAT4G25560.1, domain AAs 15-75, 68-110
4Met Ala Lys Thr Lys Tyr Gly Glu Arg His Arg Lys Gly Leu Trp Ser 1
5 10 15 Pro Glu Glu Asp Glu Lys Leu Arg Ser Phe Ile Leu Ser Tyr Gly
His 20 25 30 Ser Cys Trp Thr Thr Val Pro Ile Lys Ala Gly Leu Gln
Arg Asn Gly 35 40 45 Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu
Arg Pro Gly Leu Lys 50 55 60 Arg Asp Met Ile Ser Ala Glu Glu Glu
Glu Thr Ile Leu Thr Phe His 65 70 75 80 Ser Ser Leu Gly Asn Lys Trp
Ser Gln Ile Ala Lys Phe Leu Pro Gly 85 90 95 Arg Thr Asp Asn Glu
Ile Lys Asn Tyr Trp His Ser His Leu Lys Lys 100 105 110 Lys Trp Leu
Lys Ser Gln Ser Leu Gln Asp Ala Lys Ser Ile Ser Pro 115 120 125 Pro
Ser Ser Ser Ser Ser Ser Leu Val Ala Cys Gly Lys Arg Asn Pro 130 135
140 Glu Thr Leu Ile Ser Asn His Val Phe Ser Phe Gln Arg Leu Leu Glu
145 150 155 160 Asn Lys Ser Ser Ser Pro Ser Gln Glu Ser Asn Gly Asn
Asn Ser His 165 170 175 Gln Cys Ser Ser Ala Pro Glu Ile Pro Arg Leu
Phe Phe Ser Glu Trp 180 185 190 Leu Ser Ser Ser Tyr Pro His Thr Asp
Tyr Ser Ser Glu Phe Thr Asp 195 200 205 Ser Lys His Ser Gln Ala Pro
Asn Val Glu Glu Thr Leu Ser Ala Tyr 210 215 220 Glu Glu Met Gly Asp
Val Asp Gln Phe His Tyr Asn Glu Met Met Ile 225 230 235 240 Asn Asn
Ser Asn Trp Thr Leu Asn Asp Ile Val Phe Gly Ser Lys Cys 245 250 255
Lys Lys Gln Glu His His Ile Tyr Arg Glu Ala Ser Asp Cys Asn Ser 260
265 270 Ser Ala Glu Phe Phe Ser Pro Ser Thr Thr Thr 275 280
5819DNAOryza sativaLOC_Os04g45020.1 5atggggtgca aggcgtgcca
gaagcccaag gtgcactacc ggaagggcct gtggtcgccg 60gaggaggacc agaagctccg
cgacttcatc ctccgctacg gccacggctg ctggagcgcc 120gtccccgtga
aggccgggct gcagcgtaac ggcaagagtt gcaggctgag atggatcaat
180tacctgaggc cggggctgaa gcacggcatg ttttcccgag aggaagaaga
aaccgtcatg 240aacctgcacg ctacaatggg caacaagtgg tcacagatag
cgcggcatct gcctggccgg 300acggacaacg aggtgaagaa ctactggaac
tcgtacctca agaagcgagt cgaaggcgcg 360gaggctgcgg ccagaaaatc
cgccgagccg gccgacgtcg tcaccggcag cccgaaccgc 420agcgagaccg
gccaagaacg cgtcgccgct gaccggccgg cgagctccga gtcttccggg
480ccggtcgagt cgtcgtcggc cgacgactcg agcagcctca ccgagcccgc
ggcggggctc 540gccgccgtcc ggccgcacgc gcccgtgatc cccaaggtca
tgttcgccga ctggttcgac 600atggactacg ggactagcct cgccgggacg
gcgccgggcc tgagctacca gggctcgtcg 660tcggtgcagg tcgacgtccc
gtgcggcggc gccgtggact ccctgcacgg gctgggcgac 720ggcggcttct
gctgggactt cgacgacgcg gccgatcaca tgcagggagg aggagggctc
780tgcgacctgc tctccatgag cgagttcctc ggcatcaac 8196273PRTOryza
sativaLOC_Os04g45020.1, domain AAs 18-78, 71-113 6Met Gly Cys Lys
Ala Cys Gln Lys Pro Lys Val His Tyr Arg Lys Gly 1 5 10 15 Leu Trp
Ser Pro Glu Glu Asp Gln Lys Leu Arg Asp Phe Ile Leu Arg 20 25 30
Tyr Gly His Gly Cys Trp Ser Ala Val Pro Val Lys Ala Gly Leu Gln 35
40 45 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg
Pro 50 55 60 Gly Leu Lys His Gly Met Phe Ser Arg Glu Glu Glu Glu
Thr Val Met 65 70 75 80 Asn Leu His Ala Thr Met Gly Asn Lys Trp Ser
Gln Ile Ala Arg His 85 90 95 Leu Pro Gly Arg Thr Asp Asn Glu Val
Lys Asn Tyr Trp Asn Ser Tyr 100 105 110 Leu Lys Lys Arg Val Glu Gly
Ala Glu Ala Ala Ala Arg Lys Ser Ala 115 120 125 Glu Pro Ala Asp Val
Val Thr Gly Ser Pro Asn Arg Ser Glu Thr Gly 130 135 140 Gln Glu Arg
Val Ala Ala Asp Arg Pro Ala Ser Ser Glu Ser Ser Gly 145 150 155 160
Pro Val Glu Ser Ser Ser Ala Asp Asp Ser Ser Ser Leu Thr Glu Pro 165
170 175 Ala Ala Gly Leu Ala Ala Val Arg Pro His Ala Pro Val Ile Pro
Lys 180 185 190 Val Met Phe Ala Asp Trp Phe Asp Met Asp Tyr Gly Thr
Ser Leu Ala 195 200 205 Gly Thr Ala Pro Gly Leu Ser Tyr Gln Gly Ser
Ser Ser Val Gln Val 210 215 220 Asp Val Pro Cys Gly Gly Ala Val Asp
Ser Leu His Gly Leu Gly Asp 225 230 235 240 Gly Gly Phe Cys Trp Asp
Phe Asp Asp Ala Ala Asp His Met Gln Gly 245 250 255 Gly Gly Gly Leu
Cys Asp Leu Leu Ser Met Ser Glu Phe Leu Gly Ile 260 265 270 Asn
7855DNABrachypodium distachyonBradi5g16672.1 7atggggtgca agtcgtgcca
gaagcccaag gcgcaccatc ggaagggcct gtggtcgccg 60gaggaggacc agaagctccg
cgactacatc atccgttatg gccatagctg ctggagcacc 120gtccccgtca
aggctggact gcagcggaac ggcaagagct gcaggctgag atggatcaac
180tacctgaggc cggggctgaa gcacggcatg ttctcccagg aggaagaaga
gaccgtcatg 240agcctccacg ccacactggg caacaaatgg tctcggatag
cgcagcatct gccaggccgg 300accgacaacg aggtgaagaa ctactggaac
tcgtacctga agaagcgcgt ggagggcgcg 360caggcggcac cagccaaatc
cgccggctcg gactcgcccc agagcccgac ggcggcgctc 420agcgagagcg
gcgttaaacg gccggagaac tccggctcgt ccgggccgcc ggaatcgtcg
480tcggccgacg actcgagctg cctcacgggg cccgccggcg ccgccgcggc
cctgatccgg 540ccgcacgcgc ccgtgctccc caaggtcatg ttcgcggact
ggctcgacat ggacatggac 600tacggcacgg gcctgatggc gccgggcctg
gacgcgggct tcggagcggg ccggtgcagc 660agcccggccc agggcgccgc
gagccagcag gggtccgtgc aggtcgacgg cccatcgtgc 720agcgccgtgg
attccttgca cgggctcggc ggcggcatct gctgggactt cgacgcggcg
780gatcagatgc acatgcagag cgggggagga gggttctgcg acctgctctc
catgagcgag 840ttccttggga tcaac 8558285PRTBrachypodium
distachyonBradi5g16672.1, domain AAs 18-78, 71-113 8Met Gly Cys Lys
Ser Cys Gln Lys Pro Lys Ala His His Arg Lys Gly 1 5 10 15 Leu Trp
Ser Pro Glu Glu Asp Gln Lys Leu Arg Asp Tyr Ile Ile Arg 20 25 30
Tyr Gly His Ser Cys Trp Ser Thr Val Pro Val Lys Ala Gly Leu Gln 35
40 45 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg
Pro 50 55 60 Gly Leu Lys His Gly Met Phe Ser Gln Glu Glu Glu Glu
Thr Val Met 65 70 75 80 Ser Leu His Ala Thr Leu Gly Asn Lys Trp Ser
Arg Ile Ala Gln His 85 90 95 Leu Pro Gly Arg Thr Asp Asn Glu Val
Lys Asn Tyr Trp Asn Ser Tyr 100 105 110 Leu Lys Lys Arg Val Glu Gly
Ala Gln Ala Ala Pro Ala Lys Ser Ala 115 120 125 Gly Ser Asp Ser Pro
Gln Ser Pro Thr Ala Ala Leu Ser Glu Ser Gly 130 135 140 Val Lys Arg
Pro Glu Asn Ser Gly Ser Ser Gly Pro Pro Glu Ser Ser 145 150 155 160
Ser Ala Asp Asp Ser Ser Cys Leu Thr Gly Pro Ala Gly Ala Ala Ala 165
170 175 Ala Leu Ile Arg Pro His Ala Pro Val Leu Pro Lys Val Met Phe
Ala 180 185 190 Asp Trp Leu Asp Met Asp Met Asp Tyr Gly Thr Gly Leu
Met Ala Pro 195 200 205 Gly Leu Asp Ala Gly Phe Gly Ala Gly Arg Cys
Ser Ser Pro Ala Gln 210 215 220 Gly Ala Ala Ser Gln Gln Gly Ser Val
Gln Val Asp Gly Pro Ser Cys 225 230 235 240 Ser Ala Val Asp Ser Leu
His Gly Leu Gly Gly Gly Ile Cys Trp Asp 245 250 255 Phe Asp Ala Ala
Asp Gln Met His Met Gln Ser Gly Gly Gly Gly Phe 260 265 270 Cys Asp
Leu Leu Ser Met Ser Glu Phe Leu Gly Ile Asn 275 280 285 9 873 DNA Zea
mays GRMZM2G170049_T01 9 atggggtgca aggcgtgcga caagcccaag cccaactacc
gcaagggcct gtggtcgccg 60gaggaggacc agaagctccg cgactacatt ctcctccacg
gccacggctg ctggagcgcg 120ctccccgcga aagccgggct ccagcggaac
ggcaagagct gcaggctgcg gtggatcaac 180taccttcggc cggggctgaa
gcacggcatg ttctccccgg aggaggagga gacggtgatg 240agcctccacg
ccacgctcgg caacaagtgg tccaggatcg cacggcactt gcctggcagg
300accgacaacg aggtcaagaa ctactggaac tcgtacctca agaagagggt
cgagggcaag 360gaccaggggc ccagcacgcc cgcgccggcg gcgtccaatt
cggacgacga ctcgcactgc 420gtcaagcagc gcagggacga cgacggcacg
gcggactccg gcgcgtcgga gccgcgcgag 480tcgtcgtcgg ccgacgactc
gagctgcctg acggacccgc acgcctgcag gccccacgcg 540cccgtgccgc
ccaaggtcat gttcgcggac tggctggaca tggactacgt gggcggtgcc
600ctgccggcga cagcaccagc agcacctggt ctgctcggcg ctgcgggcgt
ggccacggcc 660agcacgggcg accgcgatca gcatcaggtg atgagcatga
gccaggggtc cgttcaggtg 720gatgggccat ccggtgccga tgtgtccctg
cacggcttcg atgacagcgg cgccggctgc 780tgggagttcc aggagcactt
cgatgccatc gatcacatgc aggcggccgg cttctgcgac 840ctgctctcca
tgagcgacta cttcggcctc gac 873 10 291 PRT Zea mays GRMZM2G170049_T01,
domain AAs 18-78, 71-113 10 Met Gly Cys Lys Ala Cys Asp Lys Pro Lys
Pro Asn Tyr Arg Lys Gly 1 5 10 15 Leu Trp Ser Pro Glu Glu Asp Gln
Lys Leu Arg Asp Tyr Ile Leu Leu 20 25 30 His Gly His Gly Cys Trp
Ser Ala Leu Pro Ala Lys Ala Gly Leu Gln 35 40 45 Arg Asn Gly Lys
Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro 50 55 60 Gly Leu
Lys His Gly Met Phe Ser Pro Glu Glu Glu Glu Thr Val Met 65 70 75 80
Ser Leu His Ala Thr Leu Gly Asn Lys Trp Ser Arg Ile Ala Arg His 85
90 95 Leu Pro Gly Arg Thr Asp Asn Glu Val Lys Asn Tyr Trp Asn Ser
Tyr 100 105 110 Leu Lys Lys Arg Val Glu Gly Lys Asp Gln Gly Pro Ser
Thr Pro Ala 115 120 125 Pro Ala Ala Ser Asn Ser Asp Asp Asp Ser His
Cys Val Lys Gln Arg 130 135 140 Arg Asp Asp Asp Gly Thr Ala Asp Ser
Gly Ala Ser Glu Pro Arg Glu 145 150 155 160 Ser Ser Ser Ala Asp Asp
Ser Ser Cys Leu Thr Asp Pro His Ala Cys 165 170 175 Arg Pro His Ala
Pro Val Pro Pro Lys Val Met Phe Ala Asp Trp Leu 180 185 190 Asp Met
Asp Tyr Val Gly Gly Ala Leu Pro Ala Thr Ala Pro Ala Ala 195 200 205
Pro Gly Leu Leu Gly Ala Ala Gly Val Ala Thr Ala Ser Thr Gly Asp 210
215 220 Arg Asp Gln His Gln Val Met Ser Met Ser Gln Gly Ser Val Gln
Val 225 230 235 240 Asp Gly Pro Ser Gly Ala Asp Val Ser Leu His Gly
Phe Asp Asp Ser 245 250 255 Gly Ala Gly Cys Trp Glu Phe Gln Glu His
Phe Asp Ala Ile Asp His 260 265 270 Met Gln Ala Ala Gly Phe Cys Asp
Leu Leu Ser Met Ser Asp Tyr Phe 275 280 285 Gly Leu Asp 290
11 867 DNA Setaria italica Si012304m 11 atggggtgca aggcgtgcca gaagcccaag
gtgcagtacc gcaagggcct gtggtcgccg 60gaggaggacg agaagctccg cgacttcatc
ctccgctacg gccacggctg ctggagcgcg 120ctccccgcca aggccgggct
gcagcgcaac ggcaagagct gcaggctgag gtggatcaac 180tacctgaggc
cggggctgaa gcacggcatg ttctcccggg aggaggagga gaccgtcatg
240agcctccacg ccaagcttgg caacaagtgg tctcagatcg cgcggcacct
gccgggccgg 300accgacaacg aggtgaagaa ctactggaac tcgtacctca
agaagcgcgt cgagggcggc 360gcgcaggcca agtgcgcggc ggacccggcg
acacccgccg gttccgacgt ccgcgccggg 420agccccaacc ccagcgacaa
cggtcgggaa cgcgccaacc accccgcgag ctctgactcg 480tcggagccgg
tcgagtcgtc ctcggccgac gactcgagct gcctcaccgt caccgagccc
540gccagggcgg gcgcggtgcg gccgcacgct cccgtgctcc ccaaggtcat
gttcgcggac 600tggctcgaca tggactacgg caccagcctg gcggcgctgg
gtccggacgc cggcgtcttc 660gacgtgagcg ggcgcagccc ggggcagggc
ctgagccacc aggggtccgt gcaggtggac 720ggcccgtgcg gcgcggtgga
ttccctgcac gggctcggcg acggcggcat ctgcggctgg 780gggttcgacg
cggcggtgga tcagatggac gtgcagggag gagggttctg cgatctgctc
840tccatgaccg agttccttgg gatcaac 867 12 289 PRT Setaria
italica Si012304m, domain AAs 18-78, 71-113 12 Met Gly Cys Lys Ala
Cys Gln Lys Pro Lys Val Gln Tyr Arg Lys Gly 1 5
10 15 Leu Trp Ser Pro Glu Glu Asp Glu Lys Leu Arg Asp Phe Ile Leu
Arg 20 25 30 Tyr Gly His Gly Cys Trp Ser Ala Leu Pro Ala Lys Ala
Gly Leu Gln 35 40 45 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile
Asn Tyr Leu Arg Pro 50 55 60 Gly Leu Lys His Gly Met Phe Ser Arg
Glu Glu Glu Glu Thr Val Met 65 70 75 80 Ser Leu His Ala Lys Leu Gly
Asn Lys Trp Ser Gln Ile Ala Arg His 85 90 95 Leu Pro Gly Arg Thr
Asp Asn Glu Val Lys Asn Tyr Trp Asn Ser Tyr 100 105 110 Leu Lys Lys
Arg Val Glu Gly Gly Ala Gln Ala Lys Cys Ala Ala Asp 115 120 125 Pro
Ala Thr Pro Ala Gly Ser Asp Val Arg Ala Gly Ser Pro Asn Pro 130 135
140 Ser Asp Asn Gly Arg Glu Arg Ala Asn His Pro Ala Ser Ser Asp Ser
145 150 155 160 Ser Glu Pro Val Glu Ser Ser Ser Ala Asp Asp Ser Ser
Cys Leu Thr 165 170 175 Val Thr Glu Pro Ala Arg Ala Gly Ala Val Arg
Pro His Ala Pro Val 180 185 190 Leu Pro Lys Val Met Phe Ala Asp Trp
Leu Asp Met Asp Tyr Gly Thr 195 200 205 Ser Leu Ala Ala Leu Gly Pro
Asp Ala Gly Val Phe Asp Val Ser Gly 210 215 220 Arg Ser Pro Gly Gln
Gly Leu Ser His Gln Gly Ser Val Gln Val Asp 225 230 235 240 Gly Pro
Cys Gly Ala Val Asp Ser Leu His Gly Leu Gly Asp Gly Gly 245 250 255
Ile Cys Gly Trp Gly Phe Asp Ala Ala Val Asp Gln Met Asp Val Gln 260
265 270 Gly Gly Gly Phe Cys Asp Leu Leu Ser Met Thr Glu Phe Leu Gly
Ile 275 280 285 Asn 13 939 DNA Citrus clementina clementine0.9_033485m
13 atgggatgca agtcatcgga aaagccaatt gcaaagccga agccaaagca cagaaagggc
60ttgtggtctc ccgaagaaga ccagaggctc aagaactatg tcctccagca tggccaccct
120tgctggagct ccgtccccat caatgccggc ttgcaaagga atggaaagag
ctgcagactg 180agatggatta attatttgag gccaggactt aagagagggg
tgttcaatat gcaagaagaa 240gagacaatcc tgaccgtcca tcgcctgtta
ggaaacaagt ggtctcaaat tgctcagcat 300ttgcctggaa gaacagataa
cgagataaag aactattggc actcccattt gaagaaaaaa 360ttagccaaac
ttgaagaaat ggaagcagct aatgcgacaa ctccaagctc agaaaatatg
420gaatcttcaa cttcccctaa taacaatccc tcaactcgca gctcaagcta
tgaatcgttg 480caccacatgg aaaaatcatc agccggtagt actgatcagt
gtgcaactca gggtcagaaa 540agttgcttgc cgaagctttt atttgcagag
tggctgtcgc ttgatcatgc taatgatggt 600agcttcgcaa attccttcga
gcaagtggct tccaaggaag gctttaataa taataataat 660aataataata
ataataataa taataataat aataataatc agaactccaa cttggtccaa
720gattcgagtg atacatttat gaatggttac ttgtccaatg agggagcatt
tggcggcgat 780tttattcata acggattcaa caacagtttt gttgatgaaa
tgttgagttc aagattcaaa 840ttcgaggatc atcagttttc cggaattggg
tttgttgatt ctatctctgg ggatgatgta 900tgtagtgctt tgaatatgaa
taatgatgta atgtacata 939 14 313 PRT Citrus
clementina clementine0.9_033485m, domain AAs 22-82, 75-117 14 Met Gly
Cys Lys Ser Ser Glu Lys Pro Ile Ala Lys Pro Lys Pro Lys 1 5 10 15
His Arg Lys Gly Leu Trp Ser Pro Glu Glu Asp Gln Arg Leu Lys Asn 20
25 30 Tyr Val Leu Gln His Gly His Pro Cys Trp Ser Ser Val Pro Ile
Asn 35 40 45 Ala Gly Leu Gln Arg Asn Gly Lys Ser Cys Arg Leu Arg
Trp Ile Asn 50 55 60 Tyr Leu Arg Pro Gly Leu Lys Arg Gly Val Phe
Asn Met Gln Glu Glu 65 70 75 80 Glu Thr Ile Leu Thr Val His Arg Leu
Leu Gly Asn Lys Trp Ser Gln 85 90 95 Ile Ala Gln His Leu Pro Gly
Arg Thr Asp Asn Glu Ile Lys Asn Tyr 100 105 110 Trp His Ser His Leu
Lys Lys Lys Leu Ala Lys Leu Glu Glu Met Glu 115 120 125 Ala Ala Asn
Ala Thr Thr Pro Ser Ser Glu Asn Met Glu Ser Ser Thr 130 135 140 Ser
Pro Asn Asn Asn Pro Ser Thr Arg Ser Ser Ser Tyr Glu Ser Leu 145 150
155 160 His His Met Glu Lys Ser Ser Ala Gly Ser Thr Asp Gln Cys Ala
Thr 165 170 175 Gln Gly Gln Lys Ser Cys Leu Pro Lys Leu Leu Phe Ala
Glu Trp Leu 180 185 190 Ser Leu Asp His Ala Asn Asp Gly Ser Phe Ala
Asn Ser Phe Glu Gln 195 200 205 Val Ala Ser Lys Glu Gly Phe Asn Asn
Asn Asn Asn Asn Asn Asn Asn 210 215 220 Asn Asn Asn Asn Asn Asn Asn
Asn Asn Gln Asn Ser Asn Leu Val Gln 225 230 235 240 Asp Ser Ser Asp
Thr Phe Met Asn Gly Tyr Leu Ser Asn Glu Gly Ala 245 250 255 Phe Gly
Gly Asp Phe Ile His Asn Gly Phe Asn Asn Ser Phe Val Asp 260 265 270
Glu Met Leu Ser Ser Arg Phe Lys Phe Glu Asp His Gln Phe Ser Gly 275
280 285 Ile Gly Phe Val Asp Ser Ile Ser Gly Asp Asp Val Cys Ser Ala
Leu 290 295 300 Asn Met Asn Asn Asp Val Met Tyr Ile 305 310
15 885 DNA Populus trichocarpa POPTR_0015s13190.1 15 atggggtgca
agtcatctga catgccaaag ctaaagccaa agccaaagca caggaaaggc 60ttgtggtcac
ctgaagaaga tcaaaggctc agaaactatg tccttaaaca tggccatgga
120tgttggagct ctgtccccat taatgctggc ttgcagagga atgggaagag
ctgcagacta 180agatggatta attacttgag accaggatta aaaagaggga
cgttttctgc acaagaagag 240gagacaatcc tggcccttca tcacatgtta
ggcaacaagt ggtctcagat agcacagcat 300ttgcctggaa gaacagataa
tgagataaag aatcattggc attcctattt gaagaaaaat 360ttgctcaaag
acgaagggat ggagtctttg aaaagaacaa aatctgacag ctcaaactca
420gacattatgg aactttcacc atctcccaag agactcaaaa tgcaagcttc
aagttttgag 480tcatcaatga gtgcagaaaa atcatcagct gatatcaacc
ggtcagttcc gcagatgttt 540gagtctccta acgaacctaa aggaagctcc
ttattaccaa aggttatgtt tgcagagtgg 600ctttcacttg aaagcttcgc
gagtttaggt gagcctatgg attcaaagac tacacttgat 660cataatacaa
tcttccaaga caatttcttg catgattact tactggatga aagagcattt
720ggcggcgagt atcataattc actaagcgat ggttcgagcg gcgacatttt
tagttcagaa 780ttcaggtttg agagccagag tccagggaat gagtttgatt
ttagctctgg agaggattta 840tgtagtgact tcaacttgag caacattagt
gatgtgatgt acata 885 16 295 PRT Populus trichocarpa POPTR_0015s13190.1,
domain AAs 22-82, 75-117 16 Met Gly Cys Lys Ser Ser Asp Met Pro Lys
Leu Lys Pro Lys Pro Lys 1 5 10 15 His Arg Lys Gly Leu Trp Ser Pro
Glu Glu Asp Gln Arg Leu Arg Asn 20 25 30 Tyr Val Leu Lys His Gly
His Gly Cys Trp Ser Ser Val Pro Ile Asn 35 40 45 Ala Gly Leu Gln
Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn 50 55 60 Tyr Leu
Arg Pro Gly Leu Lys Arg Gly Thr Phe Ser Ala Gln Glu Glu 65 70 75 80
Glu Thr Ile Leu Ala Leu His His Met Leu Gly Asn Lys Trp Ser Gln 85
90 95 Ile Ala Gln His Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn
His 100 105 110 Trp His Ser Tyr Leu Lys Lys Asn Leu Leu Lys Asp Glu
Gly Met Glu 115 120 125 Ser Leu Lys Arg Thr Lys Ser Asp Ser Ser Asn
Ser Asp Ile Met Glu 130 135 140 Leu Ser Pro Ser Pro Lys Arg Leu Lys
Met Gln Ala Ser Ser Phe Glu 145 150 155 160 Ser Ser Met Ser Ala Glu
Lys Ser Ser Ala Asp Ile Asn Arg Ser Val 165 170 175 Pro Gln Met Phe
Glu Ser Pro Asn Glu Pro Lys Gly Ser Ser Leu Leu 180 185 190 Pro Lys
Val Met Phe Ala Glu Trp Leu Ser Leu Glu Ser Phe Ala Ser 195 200 205
Leu Gly Glu Pro Met Asp Ser Lys Thr Thr Leu Asp His Asn Thr Ile 210
215 220 Phe Gln Asp Asn Phe Leu His Asp Tyr Leu Leu Asp Glu Arg Ala
Phe 225 230 235 240 Gly Gly Glu Tyr His Asn Ser Leu Ser Asp Gly Ser
Ser Gly Asp Ile 245 250 255 Phe Ser Ser Glu Phe Arg Phe Glu Ser Gln
Ser Pro Gly Asn Glu Phe 260 265 270 Asp Phe Ser Ser Gly Glu Asp Leu
Cys Ser Asp Phe Asn Leu Ser Asn 275 280 285 Ile Ser Asp Val Met Tyr
Ile 290 295 17 807 DNA Eucalyptus grandis EUCGR.K00250.1 17 atggcactga
agtcatcaga aagaccaaaa cccaagcaca gaaagggatt gtggtcacct 60gaagaagatc
agaagctcag gaactatgtc ctcaagcatg gccatggttg ctggagctct
120gtccccatta acaccggctt gcagaggaat ggcaagagct gcagattaag
gtggatcaat 180tacttgaggc ctggcctaaa gagaggcatg ttcaccatgg
aagaggagga gattattttt 240tcccttcatc acttgatagg caacaagtgg
tctcaaatag caaagcattt gccaggaagg 300acagataacg agataaagaa
tcactggcat tcttatctta agaagaaggt ggcaaacaag 360actgaatctt
tatcgtcatc attagaagct caccatcatg cccggagtca atgtaccaat
420tcggacaatg tggaatcttc gcctcctcca gatcaaatcc ctccaaacca
gaacccatca 480gttcatgcac catcacagga gcaaaaggaa aagacatcat
ttgactttca gagggacggg 540ctacgcagct acttgcccca gattttcttc
gccgagtggc tgaatcaagc tgatcaaggg 600aacaacatcc ccaattacgg
cgacgctttc gatgactgct tgaatcttca ggaccccctt 660gtgcctgatc
tatgcacgag tgattttgga aattcttatg gtggtgaata tgttggtagt
720gagctaagta acgggtctgc tagtgctagc gtgagcgaca tgtacagttc
gcagttgaag 780ttggagatgg gatcaggttt cggggag 80718269PRTEucalyptus
grandisEUCGR.K00250.1, domain AAs 18-78, 71-113 18Met Ala Leu Lys
Ser Ser Glu Arg Pro Lys Pro Lys His Arg Lys Gly 1 5 10 15 Leu Trp
Ser Pro Glu Glu Asp Gln Lys Leu Arg Asn Tyr Val Leu Lys 20 25 30
His Gly His Gly Cys Trp Ser Ser Val Pro Ile Asn Thr Gly Leu Gln 35
40 45 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg
Pro 50 55 60 Gly Leu Lys Arg Gly Met Phe Thr Met Glu Glu Glu Glu
Ile Ile Phe 65 70 75 80 Ser Leu His His Leu Ile Gly Asn Lys Trp Ser
Gln Ile Ala Lys His 85 90 95 Leu Pro Gly Arg Thr Asp Asn Glu Ile
Lys Asn His Trp His Ser Tyr 100 105 110 Leu Lys Lys Lys Val Ala Asn
Lys Thr Glu Ser Leu Ser Ser Ser Leu 115 120 125 Glu Ala His His His
Ala Arg Ser Gln Cys Thr Asn Ser Asp Asn Val 130 135 140 Glu Ser Ser
Pro Pro Pro Asp Gln Ile Pro Pro Asn Gln Asn Pro Ser 145 150 155 160
Val His Ala Pro Ser Gln Glu Gln Lys Glu Lys Thr Ser Phe Asp Phe 165
170 175 Gln Arg Asp Gly Leu Arg Ser Tyr Leu Pro Gln Ile Phe Phe Ala
Glu 180 185 190 Trp Leu Asn Gln Ala Asp Gln Gly Asn Asn Ile Pro Asn
Tyr Gly Asp 195 200 205 Ala Phe Asp Asp Cys Leu Asn Leu Gln Asp Pro
Leu Val Pro Asp Leu 210 215 220 Cys Thr Ser Asp Phe Gly Asn Ser Tyr
Gly Gly Glu Tyr Val Gly Ser 225 230 235 240 Glu Leu Ser Asn Gly Ser
Ala Ser Ala Ser Val Ser Asp Met Tyr Ser 245 250 255 Ser Gln Leu Lys
Leu Glu Met Gly Ser Gly Phe Gly Glu 260 265 19 885 DNA Eucalyptus
grandis EUCGR.K00251.1 19 atggcattga agtcatcaga aaggccaaag cccaagcaca
ggaagggctt gtggtcacct 60gaagaagacc agaggctcag gaactatatc ctgaaccatg
gccatggtta ctggagctct 120gtccccatta acaccggctt gcagaggaat
ggcaagagct gcagattaag gtggatcaat 180tacttgaggc ctggcctaaa
gagaggcatg ttcaccctgg aagaagagga gattattttg 240tcccttcatc
gcttgatagg caataagtgg tctcaaatag caaagcattt gccaggaagg
300accgataatg agataaagaa tcactggcat tcttatctta agaagaaggt
ggcaaataag 360actgaatcat cgtcatcatc agaagcccgc cataatgccc
agagtcaatg taccaattcg 420gacaatgtgg aatcttcgcc ttcaccagat
caaatcccca ctaaccaaaa cgcatcagtt 480catgcaccgt cacaggaaca
aaaggaaaag atgtcattgg actttccgaa tgggggtcca 540cgcagctgct
tgcccaatat tttcttcgcc gagtggctga atcaagctga tcaagggtac
600aacgttccga cctatggcga tgctttcgat tatcgctcaa atattcagga
ctctcttgtg 660catgattggt gcacaagcga ttttgggaat tctaatggcg
gtgagtatgt tgggaatgag 720ctaagtaagg ggtccgctag tgccagcgtg
agcgacatgt acagttcgcg gttgaagtcg 780gagatggatc aggtttcggg
aggtgggttt tacttggatt atttctctgg ggatgatatc 840tgtagtcagt
tcgacatggg cagtgatgta aatatgatgt acata 885 20 295 PRT Eucalyptus
grandis EUCGR.K00251.1, domain AAs 18-78, 71-113 20 Met Ala Leu Lys
Ser Ser Glu Arg Pro Lys Pro Lys His Arg Lys Gly 1 5 10 15 Leu Trp
Ser Pro Glu Glu Asp Gln Arg Leu Arg Asn Tyr Ile Leu Asn 20 25 30
His Gly His Gly Tyr Trp Ser Ser Val Pro Ile Asn Thr Gly Leu Gln 35
40 45 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg
Pro 50 55 60 Gly Leu Lys Arg Gly Met Phe Thr Leu Glu Glu Glu Glu
Ile Ile Leu 65 70 75 80 Ser Leu His Arg Leu Ile Gly Asn Lys Trp Ser
Gln Ile Ala Lys His 85 90 95 Leu Pro Gly Arg Thr Asp Asn Glu Ile
Lys Asn His Trp His Ser Tyr 100 105 110 Leu Lys Lys Lys Val Ala Asn
Lys Thr Glu Ser Ser Ser Ser Ser Glu 115 120 125 Ala Arg His Asn Ala
Gln Ser Gln Cys Thr Asn Ser Asp Asn Val Glu 130 135 140 Ser Ser Pro
Ser Pro Asp Gln Ile Pro Thr Asn Gln Asn Ala Ser Val 145 150 155 160
His Ala Pro Ser Gln Glu Gln Lys Glu Lys Met Ser Leu Asp Phe Pro 165
170 175 Asn Gly Gly Pro Arg Ser Cys Leu Pro Asn Ile Phe Phe Ala Glu
Trp 180 185 190 Leu Asn Gln Ala Asp Gln Gly Tyr Asn Val Pro Thr Tyr
Gly Asp Ala 195 200 205 Phe Asp Tyr Arg Ser Asn Ile Gln Asp Ser Leu
Val His Asp Trp Cys 210 215 220 Thr Ser Asp Phe Gly Asn Ser Asn Gly
Gly Glu Tyr Val Gly Asn Glu 225 230 235 240 Leu Ser Lys Gly Ser Ala
Ser Ala Ser Val Ser Asp Met Tyr Ser Ser 245 250 255 Arg Leu Lys Ser
Glu Met Asp Gln Val Ser Gly Gly Gly Phe Tyr Leu 260 265 270 Asp Tyr
Phe Ser Gly Asp Asp Ile Cys Ser Gln Phe Asp Met Gly Ser 275 280 285
Asp Val Asn Met Met Tyr Ile 290 295 21 975 DNA Populus
trichocarpa POPTR_0012s13260.1 21 atgccaaagg cattcattgc atccatcaca
aagtccaaga ctctctttct cttgtacaag 60tcaccaatcc ttctcatcat cggtgttctt
ggcgaaatgg ggtgcaaatc atcagacaag 120ccaaagccaa agctaaggca
caggaaaggc ttgtggtcac ctgaagaaga tcaaaggctt 180ggaagctatg
tctttcaaca tggccacgga tgttggagct ctgtccccat taatgctggc
240ttgcagagga ctgggaagag ctgcagatta agatggatta attacttgag
accaggactg 300aaaagagggg cgttttctac agacgaagaa gagacaatcc
tgacccttca tcgcatgtta 360ggcaacaagt ggtctcaaat tgcacagcat
ttgcctggaa gaacagacaa tgagataaag 420aaccattggc attcctattt
gaagaaaaag ttgttcaaag ctgaaggaat ggaatctcct 480aataagactc
aatctgccag ctcaaactca gacaatatgg atctttcacc ctctcccaaa
540aggcttaaaa tgcaaagtcc tgaatcgtca atgaatatgg aaaaaccatc
aactgatatc 600gaccggccgg tacttccacg gatgtttgac tatcttaaag
aacctaacag aagctcctta 660ttaccaaagg ttatgtttgc tgagtggctc
tcacttgaca gctttgcaag ttcaggtgag 720cctgtggttt caaagagtac
attcgatcat aatccaagct tccaagacac tagtttcatg 780catcattact
tactggaaga aggagcattt ggtggcgact atcaaaattc tctaagcgat
840ggttcgagcg gcgacatttt tagttcagaa ttcaaatttg aaagccagag
tccaggaaat 900gagtttgatt ttagctctgg agaggattta tgtagggaat
tcaacttccg taatattggt 960gatgtgatgt acata 975 22 325 PRT Populus
trichocarpa POPTR_0012s13260.1, domain AAs 52-112, 105-147 22 Met Pro
Lys Ala Phe Ile Ala Ser Ile Thr Lys Ser Lys Thr Leu Phe 1 5 10 15
Leu Leu Tyr Lys Ser Pro Ile Leu Leu Ile Ile Gly Val Leu Gly Glu 20
25 30 Met Gly Cys Lys Ser Ser Asp Lys Pro Lys Pro Lys Leu Arg His
Arg 35 40 45 Lys Gly Leu Trp Ser Pro Glu Glu Asp Gln Arg Leu Gly
Ser Tyr Val 50 55 60 Phe Gln His Gly
His Gly Cys Trp Ser Ser Val Pro Ile Asn Ala Gly 65 70 75 80 Leu Gln
Arg Thr Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu 85 90 95
Arg Pro Gly Leu Lys Arg Gly Ala Phe Ser Thr Asp Glu Glu Glu Thr 100
105 110 Ile Leu Thr Leu His Arg Met Leu Gly Asn Lys Trp Ser Gln Ile
Ala 115 120 125 Gln His Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn
His Trp His 130 135 140 Ser Tyr Leu Lys Lys Lys Leu Phe Lys Ala Glu
Gly Met Glu Ser Pro 145 150 155 160 Asn Lys Thr Gln Ser Ala Ser Ser
Asn Ser Asp Asn Met Asp Leu Ser 165 170 175 Pro Ser Pro Lys Arg Leu
Lys Met Gln Ser Pro Glu Ser Ser Met Asn 180 185 190 Met Glu Lys Pro
Ser Thr Asp Ile Asp Arg Pro Val Leu Pro Arg Met 195 200 205 Phe Asp
Tyr Leu Lys Glu Pro Asn Arg Ser Ser Leu Leu Pro Lys Val 210 215 220
Met Phe Ala Glu Trp Leu Ser Leu Asp Ser Phe Ala Ser Ser Gly Glu 225
230 235 240 Pro Val Val Ser Lys Ser Thr Phe Asp His Asn Pro Ser Phe
Gln Asp 245 250 255 Thr Ser Phe Met His His Tyr Leu Leu Glu Glu Gly
Ala Phe Gly Gly 260 265 270 Asp Tyr Gln Asn Ser Leu Ser Asp Gly Ser
Ser Gly Asp Ile Phe Ser 275 280 285 Ser Glu Phe Lys Phe Glu Ser Gln
Ser Pro Gly Asn Glu Phe Asp Phe 290 295 300 Ser Ser Gly Glu Asp Leu
Cys Arg Glu Phe Asn Phe Arg Asn Ile Gly 305 310 315 320 Asp Val Met
Tyr Ile 325 23 873 DNA Glycine max Glyma16g31280.1 23 atggagagcc
agccactaga aaaagcaaaa ccaaaataca gaaaaggctt atggtcacct 60gaagaagata
ataaactcag aaaccatatc attaagcatg gtcatggctg ctggagctct
120gtccctatta aggcaggctt gcaaagaaat gggaagagct gtagactaag
gtggattaac 180tacttgaggc caggattgaa gagaggggtg ttcagcaaac
atgaggaaga tacaatcatg 240gtcctacacc atatgttagg aaacaagtgg
tctcaaatag cacagcattt gccaggaagg 300actgacaatg agataaaaaa
ttattggcat tcatatttga aaaagaaaga gatcaaagcc 360aaggaaatgg
aatctgataa agaaattcag catgctagct caagttcaga cacaatggaa
420aactcactct ctcctcagaa acttgcaaca caagatccaa gttatagttt
gttagaaaac 480ctggacaaat caatagcaca caatgataac tttttctcac
aaagctataa cttttccaag 540gaggcttgtc agagttccct accattacca
aaactcctat tttctgagtg gctttcagtg 600gatcaagtag atggtggaag
ctcagtgaat tctgatgatt ccttggtctt ggggaatgaa 660tttgatcaaa
attcaacttt ccaagaagct ataatgcata tgttagaaga aaactttggt
720gaagagtatc ataatagtct aattcacagt tcaaccactg aggtctacaa
ttcacaacta 780aagtcaacaa atcaagtgga tggaagtgac ttcatcaatt
gtattcccgg gaatgagttg 840tgtagcaatt tcagcctaac caatcatgct atg
873 24 291 PRT Glycine max Glyma16g31280.1, domain AAs 18-78, 71-113
24 Met Glu Ser Gln Pro Leu Glu Lys Ala Lys Pro Lys Tyr Arg Lys Gly 1
5 10 15 Leu Trp Ser Pro Glu Glu Asp Asn Lys Leu Arg Asn His Ile Ile
Lys 20 25 30 His Gly His Gly Cys Trp Ser Ser Val Pro Ile Lys Ala
Gly Leu Gln 35 40 45 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile
Asn Tyr Leu Arg Pro 50 55 60 Gly Leu Lys Arg Gly Val Phe Ser Lys
His Glu Glu Asp Thr Ile Met 65 70 75 80 Val Leu His His Met Leu Gly
Asn Lys Trp Ser Gln Ile Ala Gln His 85 90 95 Leu Pro Gly Arg Thr
Asp Asn Glu Ile Lys Asn Tyr Trp His Ser Tyr 100 105 110 Leu Lys Lys
Lys Glu Ile Lys Ala Lys Glu Met Glu Ser Asp Lys Glu 115 120 125 Ile
Gln His Ala Ser Ser Ser Ser Asp Thr Met Glu Asn Ser Leu Ser 130 135
140 Pro Gln Lys Leu Ala Thr Gln Asp Pro Ser Tyr Ser Leu Leu Glu Asn
145 150 155 160 Leu Asp Lys Ser Ile Ala His Asn Asp Asn Phe Phe Ser
Gln Ser Tyr 165 170 175 Asn Phe Ser Lys Glu Ala Cys Gln Ser Ser Leu
Pro Leu Pro Lys Leu 180 185 190 Leu Phe Ser Glu Trp Leu Ser Val Asp
Gln Val Asp Gly Gly Ser Ser 195 200 205 Val Asn Ser Asp Asp Ser Leu
Val Leu Gly Asn Glu Phe Asp Gln Asn 210 215 220 Ser Thr Phe Gln Glu
Ala Ile Met His Met Leu Glu Glu Asn Phe Gly 225 230 235 240 Glu Glu
Tyr His Asn Ser Leu Ile His Ser Ser Thr Thr Glu Val Tyr 245 250 255
Asn Ser Gln Leu Lys Ser Thr Asn Gln Val Asp Gly Ser Asp Phe Ile 260
265 270 Asn Cys Ile Pro Gly Asn Glu Leu Cys Ser Asn Phe Ser Leu Thr
Asn 275 280 285 His Ala Met 290 25 786 DNA Glycine max Glyma09g25590.1
25 atggagagca agccactaga aaaagcaaaa ccaaaataca gaaagggctt atggtcacca
60gaagaagata ataagctcag aaatcatatc attaagcatg gtcatggctg ctggagctct
120gtccctatta aggcaggctt gcaaagaaat gggaagagct gcagactaag
gtggattaac 180tacttgaggc caggattgaa gagaggggtg ttcagcaaac
atgagaaaga tacaatcatg 240gccctacacc atatgttagg aaacaagtgg
tctcagatag cacagcattt gccaggaagg 300actgacaatg aggtaaaaaa
ttactggcat tcatatttga aaaagaaagt catcaaagct 360aaggaaatgg
aatctgataa acaaattcaa catgccggct caagttcaga cacagtggaa
420aacgcactct ctcctcagaa acttgcaaca caagattcaa gttatgggtt
gttagaaaac 480cttgacaaat caatagcaca aaatgataac tttttctcga
aaagctataa cttttccaag 540gaggcttatc agagttctct accactacca
aaactcttat tttctgagtg gctatcagtg 600gatcaagagt atcataatcg
tctaattcac agttcaacca ctgaggtcta taattcacaa 660ataaagtcaa
caaatcaaat ggatggaagt gatttcatga attgtattcc cgggaatgag
720ttacgtagca atttcagcct aaccaatcat ggtgagttgg aaggagaaga
atacaatgca 780attcct 78626262PRTGlycine maxGlyma09g25590.1, domain
AAs 18-78, 71-113 26Met Glu Ser Lys Pro Leu Glu Lys Ala Lys Pro Lys
Tyr Arg Lys Gly 1 5 10 15 Leu Trp Ser Pro Glu Glu Asp Asn Lys Leu
Arg Asn His Ile Ile Lys 20 25 30 His Gly His Gly Cys Trp Ser Ser
Val Pro Ile Lys Ala Gly Leu Gln 35 40 45 Arg Asn Gly Lys Ser Cys
Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro 50 55 60 Gly Leu Lys Arg
Gly Val Phe Ser Lys His Glu Lys Asp Thr Ile Met 65 70 75 80 Ala Leu
His His Met Leu Gly Asn Lys Trp Ser Gln Ile Ala Gln His 85 90 95
Leu Pro Gly Arg Thr Asp Asn Glu Val Lys Asn Tyr Trp His Ser Tyr 100
105 110 Leu Lys Lys Lys Val Ile Lys Ala Lys Glu Met Glu Ser Asp Lys
Gln 115 120 125 Ile Gln His Ala Gly Ser Ser Ser Asp Thr Val Glu Asn
Ala Leu Ser 130 135 140 Pro Gln Lys Leu Ala Thr Gln Asp Ser Ser Tyr
Gly Leu Leu Glu Asn 145 150 155 160 Leu Asp Lys Ser Ile Ala Gln Asn
Asp Asn Phe Phe Ser Lys Ser Tyr 165 170 175 Asn Phe Ser Lys Glu Ala
Tyr Gln Ser Ser Leu Pro Leu Pro Lys Leu 180 185 190 Leu Phe Ser Glu
Trp Leu Ser Val Asp Gln Glu Tyr His Asn Arg Leu 195 200 205 Ile His
Ser Ser Thr Thr Glu Val Tyr Asn Ser Gln Ile Lys Ser Thr 210 215 220
Asn Gln Met Asp Gly Ser Asp Phe Met Asn Cys Ile Pro Gly Asn Glu 225
230 235 240 Leu Arg Ser Asn Phe Ser Leu Thr Asn His Gly Glu Leu Glu
Gly Glu 245 250 255 Glu Tyr Asn Ala Ile Pro 260 27837DNASolanum
lycopersicumSolyc03g025870.2.1 27atggggtgca aattggcagc tgagaagcca
aaacaaaaac acaagaaggg attatggtct 60cctgatgaag atgataggct caaaaattat
atgattaagc atggtcatgg atgttggagc 120tctgttccca ttaatgctgg
cttgcaaaga aatggaaaga gttgtagact gagatggatt 180aattatttaa
ggcctggctt aaagagaggg gcatttagct tagaagagga agacataata
240ttgacccttc atgccatgtt tggcaacaaa tggtctcaga ttgcacaaca
gttacctgga 300agaacggata acgagataaa gaatcactgg cactcgtatt
taaagaaaag agtgtccaaa 360atgggagaaa atgaagggca cactaagcct
gggaaaacag attcttcttc accttcttta 420aagaaattga ctccacagaa
ttcaagtttg gattcatttg aacatattga aggatcatta 480gcagattcag
atcaatctgt ttatccaaga gagactcaaa agagtaattt acctaaagta
540ttattcgcgg aatggctttc gttggatcag tttaatggac aagattttca
aaactcaggg 600agtttcagtt ttgaaccttg caagagtaac tttgtgtata
ataataatgc agagttacat 660gacatactca tgcatagttt accgatgaac
aacgatgatg ggaatggcgt aaatcaagag 720gttcttcaca atgatatttt
cccaccacaa ctcaagtttg aggatacatt gtctggtaac 780ggatttgagg
agtttatgtc aagggagttc aatattaacg acgatgtgat gtacata
837 28 279 PRT Solanum lycopersicum Solyc03g025870.2.1, domain AAs
19-79, 72-114 28 Met Gly Cys Lys Leu Ala Ala Glu Lys Pro Lys Gln Lys
His Lys Lys 1 5 10 15 Gly Leu Trp Ser Pro Asp Glu Asp Asp Arg Leu
Lys Asn Tyr Met Ile 20 25 30 Lys His Gly His Gly Cys Trp Ser Ser
Val Pro Ile Asn Ala Gly Leu 35 40 45 Gln Arg Asn Gly Lys Ser Cys
Arg Leu Arg Trp Ile Asn Tyr Leu Arg 50 55 60 Pro Gly Leu Lys Arg
Gly Ala Phe Ser Leu Glu Glu Glu Asp Ile Ile 65 70 75 80 Leu Thr Leu
His Ala Met Phe Gly Asn Lys Trp Ser Gln Ile Ala Gln 85 90 95 Gln
Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn His Trp His Ser 100 105
110 Tyr Leu Lys Lys Arg Val Ser Lys Met Gly Glu Asn Glu Gly His Thr
115 120 125 Lys Pro Gly Lys Thr Asp Ser Ser Ser Pro Ser Leu Lys Lys
Leu Thr 130 135 140 Pro Gln Asn Ser Ser Leu Asp Ser Phe Glu His Ile
Glu Gly Ser Leu 145 150 155 160 Ala Asp Ser Asp Gln Ser Val Tyr Pro
Arg Glu Thr Gln Lys Ser Asn 165 170 175 Leu Pro Lys Val Leu Phe Ala
Glu Trp Leu Ser Leu Asp Gln Phe Asn 180 185 190 Gly Gln Asp Phe Gln
Asn Ser Gly Ser Phe Ser Phe Glu Pro Cys Lys 195 200 205 Ser Asn Phe
Val Tyr Asn Asn Asn Ala Glu Leu His Asp Ile Leu Met 210 215 220 His
Ser Leu Pro Met Asn Asn Asp Asp Gly Asn Gly Val Asn Gln Glu 225 230
235 240 Val Leu His Asn Asp Ile Phe Pro Pro Gln Leu Lys Phe Glu Asp
Thr 245 250 255 Leu Ser Gly Asn Gly Phe Glu Glu Phe Met Ser Arg Glu
Phe Asn Ile 260 265 270 Asn Asp Asp Val Met Tyr Ile 275
29 894 DNA Vitis vinifera GSVIVT01028984001 29 atggggtgta attcattgga
gaagtcgaag accaagccca aacaccgaaa ggggttatgg 60tcaccggaag aagatgctag
gctcagaaac tatgtcctca aatatggcct tggctgctgg 120agctccgtcc
ctgttaacgc cggtttgcaa aggaatggaa agagctgcag attaaggtgg
180attaactact taagaccagg attaaaacgc gggatgttta cgatcgagga
ggaagagacg 240atcatggccc ttcatcgctt gttaggcaac aagtggtctc
agatagcgca gaattttcct 300ggaagaactg ataatgagat taagaactac
tggcattcat gtctcaagaa gaaagtggtg 360aaagctcagg aaatggaagt
tcatatgaac tcccaatgca tcaactctaa ctcacagagc 420attgattctt
caacttcaca agaaaagcca tcaatccaac ttccgggttt cgaatcgttt
480gaaaacatga aaggatcatc ttcaacagat actgatcagt ccattccaca
gatgttggac 540tgtcctagag tggacacgca ggaaagcccc ttgccgaaga
ttttattcgc agagtggctt 600tctcttgacc atatttacgg ccagctcttt
gttaattcag gcgagtcagt catttccaag 660gatactcttg atcagcatga
tccaaccttt caagacaatt tcacgcatgg tttcctactg 720aacgaggagt
catatgtagg tgaattgcac catggcctaa gcaatgattc gtccagtgac
780atgttttcgc cgcaatttaa gttcgagagc cagactccgg gaagtgggat
atgtgatttt 840gtgtatgggg atgaaatatg cagtgatttc aacatgaacg
gccatgtaat gtac 894 30 298 PRT Vitis vinifera GSVIVT01028984001, domain
AAs 20-80, 73-115 30 Met Gly Cys Asn Ser Leu Glu Lys Ser Lys Thr Lys
Pro Lys His Arg 1 5 10 15 Lys Gly Leu Trp Ser Pro Glu Glu Asp Ala
Arg Leu Arg Asn Tyr Val 20 25 30 Leu Lys Tyr Gly Leu Gly Cys Trp
Ser Ser Val Pro Val Asn Ala Gly 35 40 45 Leu Gln Arg Asn Gly Lys
Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu 50 55 60 Arg Pro Gly Leu
Lys Arg Gly Met Phe Thr Ile Glu Glu Glu Glu Thr 65 70 75 80 Ile Met
Ala Leu His Arg Leu Leu Gly Asn Lys Trp Ser Gln Ile Ala 85 90 95
Gln Asn Phe Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Tyr Trp His 100
105 110 Ser Cys Leu Lys Lys Lys Val Val Lys Ala Gln Glu Met Glu Val
His 115 120 125 Met Asn Ser Gln Cys Ile Asn Ser Asn Ser Gln Ser Ile
Asp Ser Ser 130 135 140 Thr Ser Gln Glu Lys Pro Ser Ile Gln Leu Pro
Gly Phe Glu Ser Phe 145 150 155 160 Glu Asn Met Lys Gly Ser Ser Ser
Thr Asp Thr Asp Gln Ser Ile Pro 165 170 175 Gln Met Leu Asp Cys Pro
Arg Val Asp Thr Gln Glu Ser Pro Leu Pro 180 185 190 Lys Ile Leu Phe
Ala Glu Trp Leu Ser Leu Asp His Ile Tyr Gly Gln 195 200 205 Leu Phe
Val Asn Ser Gly Glu Ser Val Ile Ser Lys Asp Thr Leu Asp 210 215 220
Gln His Asp Pro Thr Phe Gln Asp Asn Phe Thr His Gly Phe Leu Leu 225
230 235 240 Asn Glu Glu Ser Tyr Val Gly Glu Leu His His Gly Leu Ser
Asn Asp 245 250 255 Ser Ser Ser Asp Met Phe Ser Pro Gln Phe Lys Phe
Glu Ser Gln Thr 260 265 270 Pro Gly Ser Gly Ile Cys Asp Phe Val Tyr
Gly Asp Glu Ile Cys Ser 275 280 285 Asp Phe Asn Met Asn Gly His Val
Met Tyr 290 295 31 723 DNA Eucalyptus grandis EUCGR.A02796.1
31 atgggatgca agtcagtgga aaaaccaaag gcaaggcaca gaaaggggtt gtggtcacca
60gatgaagacc agaggctcag aaactacatc cataaacacg gctacagttg ctggagctca
120gttcccatca atgcaggttt gcagaggaat ggtaagagct gcagattaag
gtggattaat 180tacctgaggc caggattaaa gagaggcgcg ttcacagtac
aggaggaaga gacaattttg 240aacctccacc acttgttagg caacaagtgg
tctcaaatag cacagcatct ccctggaagg 300accgataacg aaataaagaa
tcattggcat tcttatctta agaagaagat taatatcaaa 360gctgacgaat
ctcaccttca gatgccgagc aattcagact ctgtgggatc tccgaactct
420acggactttc catctgatca cacccaaagt tctgatacaa cggaatatac
aaaaagctca 480gcttctcaaa ttccgaaaat tttcaacccc acaaaagagg
gtgaaagctc attgccgacg 540cttctttttg aggagtggct ttctctggat
aattctcctg gaggaagttt cacaaaccat 600gctgaatcac aagatcaaat
ttcgggaaac gggcttgtgc agtgtctttc tggggatgat 660ctttgtatcc
tttccaagtc aggctgttca cctgaagaag tgattagccg agaaatattg 720aaa
723 32 241 PRT Eucalyptus grandis EUCGR.A02796.1, domain AAs 18-78,
71-113 32 Met Gly Cys Lys Ser Val Glu Lys Pro Lys Ala Arg His Arg
Lys Gly 1 5 10 15 Leu Trp Ser Pro Asp Glu Asp Gln Arg Leu Arg Asn
Tyr Ile His Lys 20 25 30 His Gly Tyr Ser Cys Trp Ser Ser Val Pro
Ile Asn Ala Gly Leu Gln 35 40 45 Arg Asn Gly Lys Ser Cys Arg Leu
Arg Trp Ile Asn Tyr Leu Arg Pro 50 55 60 Gly Leu Lys Arg Gly Ala
Phe Thr Val Gln Glu Glu Glu Thr Ile Leu 65 70 75 80 Asn Leu His His
Leu Leu Gly Asn Lys Trp Ser Gln Ile Ala Gln His 85 90 95 Leu Pro
Gly Arg Thr Asp Asn Glu Ile Lys Asn His Trp His Ser Tyr 100 105 110
Leu Lys Lys Lys Ile Asn Ile Lys Ala Asp Glu Ser His Leu Gln Met 115
120 125 Pro Ser Asn Ser Asp Ser Val Gly Ser Pro Asn Ser Thr Asp Phe
Pro 130 135 140 Ser Asp His Thr Gln Ser Ser Asp Thr Thr Glu Tyr Thr
Lys Ser Ser 145 150 155 160 Ala Ser Gln Ile Pro Lys Ile Phe Asn Pro
Thr Lys Glu Gly Glu Ser 165 170 175 Ser Leu Pro Thr Leu Leu Phe Glu
Glu Trp Leu Ser Leu Asp Asn Ser
180 185 190 Pro Gly Gly Ser Phe Thr Asn His Ala Glu Ser Gln Asp Gln
Ile Ser 195 200 205 Gly Asn Gly Leu Val Gln Cys Leu Ser Gly Asp Asp
Leu Cys Ile Leu 210 215 220 Ser Lys Ser Gly Cys Ser Pro Glu Glu Val
Ile Ser Arg Glu Ile Leu 225 230 235 240 Lys 33 783 DNA Arabidopsis
thaliana AT3G48920.1 33 atggtgttta aatcagaaaa atcaaaccgg gaaatgaaat
caaaggagaa gcaaaggaag 60ggattatggt cacccgagga agatgagaag cttaggagtc
atgtcctcaa atatggccat 120ggatgctgga gtactattcc tcttcaagct
ggattgcaga ggaatgggaa gagttgtaga 180ttaaggtggg ttaattattt
aagacctgga cttaagaagt ctttattcac taaacaagag 240gaaactatac
ttctttcact tcattccatg ttgggtaaca aatggtctca gatatcgaaa
300ttcttaccag gaagaaccga caacgagatc aaaaactatt ggcattctaa
tctaaagaag 360ggtgtaactt tgaaacaaca tgaaaccaca aaaaaacatc
aaacaccttt aatcacaaac 420tcacttgagg ccttgcagag ttcaactgaa
agatcttctt catctatcaa tgtcggagaa 480acgtctaatg ctcaaacctc
aagcttttcg ccaaatctcg tgttctcgga atggttagat 540catagtttgc
ttatggatca gtcacctcaa aagtctagct atgttcaaaa tcttgtttta
600ccggaagaga gaggattcat tggaccatgt ggccctcgtt atttgggaaa
cgactctttg 660cctgatttcg tgccaaattc agaatttttg ttggatgatg
agatatcatc tgagatcgag 720ttctgtactt cattttcaga caactttttg
ttcgatggtc tcatcaacga gctacgacca 780atg 78334261PRTArabidopsis
thalianaAT3G48920.1, domain AAs 23-83, 76-118 34Met Val Phe Lys Ser
Glu Lys Ser Asn Arg Glu Met Lys Ser Lys Glu 1 5 10 15 Lys Gln Arg
Lys Gly Leu Trp Ser Pro Glu Glu Asp Glu Lys Leu Arg 20 25 30 Ser
His Val Leu Lys Tyr Gly His Gly Cys Trp Ser Thr Ile Pro Leu 35 40
45 Gln Ala Gly Leu Gln Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Val
50 55 60 Asn Tyr Leu Arg Pro Gly Leu Lys Lys Ser Leu Phe Thr Lys
Gln Glu 65 70 75 80 Glu Thr Ile Leu Leu Ser Leu His Ser Met Leu Gly
Asn Lys Trp Ser 85 90 95 Gln Ile Ser Lys Phe Leu Pro Gly Arg Thr
Asp Asn Glu Ile Lys Asn 100 105 110 Tyr Trp His Ser Asn Leu Lys Lys
Gly Val Thr Leu Lys Gln His Glu 115 120 125 Thr Thr Lys Lys His Gln
Thr Pro Leu Ile Thr Asn Ser Leu Glu Ala 130 135 140 Leu Gln Ser Ser
Thr Glu Arg Ser Ser Ser Ser Ile Asn Val Gly Glu 145 150 155 160 Thr
Ser Asn Ala Gln Thr Ser Ser Phe Ser Pro Asn Leu Val Phe Ser 165 170
175 Glu Trp Leu Asp His Ser Leu Leu Met Asp Gln Ser Pro Gln Lys Ser
180 185 190 Ser Tyr Val Gln Asn Leu Val Leu Pro Glu Glu Arg Gly Phe
Ile Gly 195 200 205 Pro Cys Gly Pro Arg Tyr Leu Gly Asn Asp Ser Leu
Pro Asp Phe Val 210 215 220 Pro Asn Ser Glu Phe Leu Leu Asp Asp Glu
Ile Ser Ser Glu Ile Glu 225 230 235 240 Phe Cys Thr Ser Phe Ser Asp
Asn Phe Leu Phe Asp Gly Leu Ile Asn 245 250 255 Glu Leu Arg Pro Met
260 35 1059 DNA Solanum lycopersicum Solyc11g065840.1.1 35 atgaggaagc
ctgagttctc ctcctcctcc tcttcctcct ccgcaaagaa caataacaat 60aacaataata
ataacacgaa cgtgaagcta agaaaagggt tgtggtctcc agaggaagat
120gaaaagctta tgcattatat gctaacaaat ggacaagggt gttggagtga
tgtagcaaga 180aatgctggat tacaaagatg tggaaagagt tgtagactca
gatggatcaa ttatttgagg 240ccagatctta agagaggtgc attttcacct
caagaagaag aacatattat ccatttacat 300tccattcttg gtaacaggtg
gtctcaaata gctgcacgtt tgcctggacg tactgataat 360gaaatcaaga
atttttggaa ttcgacattg aaaaagaggc taaagaactc atcatcatct
420tctacaccat caccaaatgc aagtgattca tcctcagatc atccctccaa
agaactcaat 480atgggagtca ctcaacaagg attcatgcca atgctcaaac
ataacctaat gtccatgtac 540atggattcaa ccacctctcc ttcttcctcg
tctatagccc taaataccat aaatattgat 600cctttgccca ccctcgaaca
caccttaata aacatgccta atggattcaa cgcgccctca 660tacttgagta
ctacacaacc atgcttggta caaggtggga atattgtgag tgctaatggt
720ggaaatcttt tttatgggaa taaccatggg atatttggag ggaatcttag
tatggaaggt 780catgaactct atgttccacc attggagaat gtaagtattg
agtatcaaaa tgttgaaaat 840gggaatttta gtcatcatca aaacaacaat
aaccctaaca acatgaccaa cttgatcaat 900actagccata atttcaatac
ttgtagtaat atcaaagtag aaaattttgg agggataggg 960aattattggg
aaggagatga cctaaaagtg ggagagtggg acttggagga attgatgaag
1020gatgtttcac ctttcccttt tcttgatttc caagttgaa 1059 36 353 PRT Solanum
lycopersicum Solyc11g065840.1.1 36 Met Arg Lys Pro Glu Phe Ser Ser
Ser Ser Ser Ser Ser Ser Ala Lys 1 5 10 15 Asn Asn Asn Asn Asn Asn
Asn Asn Asn Thr Asn Val Lys Leu Arg Lys 20 25 30 Gly Leu Trp Ser
Pro Glu Glu Asp Glu Lys Leu Met His Tyr Met Leu 35 40 45 Thr Asn
Gly Gln Gly Cys Trp Ser Asp Val Ala Arg Asn Ala Gly Leu 50 55 60
Gln Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg 65
70 75 80 Pro Asp Leu Lys Arg Gly Ala Phe Ser Pro Gln Glu Glu Glu
His Ile 85 90 95 Ile His Leu His Ser Ile Leu Gly Asn Arg Trp Ser
Gln Ile Ala Ala 100 105 110 Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile
Lys Asn Phe Trp Asn Ser 115 120 125 Thr Leu Lys Lys Arg Leu Lys Asn
Ser Ser Ser Ser Ser Thr Pro Ser 130 135 140 Pro Asn Ala Ser Asp Ser
Ser Ser Asp His Pro Ser Lys Glu Leu Asn 145 150 155 160 Met Gly Val
Thr Gln Gln Gly Phe Met Pro Met Leu Lys His Asn Leu 165 170 175 Met
Ser Met Tyr Met Asp Ser Thr Thr Ser Pro Ser Ser Ser Ser Ile 180 185
190 Ala Leu Asn Thr Ile Asn Ile Asp Pro Leu Pro Thr Leu Glu His Thr
195 200 205 Leu Ile Asn Met Pro Asn Gly Phe Asn Ala Pro Ser Tyr Leu
Ser Thr 210 215 220 Thr Gln Pro Cys Leu Val Gln Gly Gly Asn Ile Val
Ser Ala Asn Gly 225 230 235 240 Gly Asn Leu Phe Tyr Gly Asn Asn His
Gly Ile Phe Gly Gly Asn Leu 245 250 255 Ser Met Glu Gly His Glu Leu
Tyr Val Pro Pro Leu Glu Asn Val Ser 260 265 270 Ile Glu Tyr Gln Asn
Val Glu Asn Gly Asn Phe Ser His His Gln Asn 275 280 285 Asn Asn Asn
Pro Asn Asn Met Thr Asn Leu Ile Asn Thr Ser His Asn 290 295 300 Phe
Asn Thr Cys Ser Asn Ile Lys Val Glu Asn Phe Gly Gly Ile Gly 305 310
315 320 Asn Tyr Trp Glu Gly Asp Asp Leu Lys Val Gly Glu Trp Asp Leu
Glu 325 330 335 Glu Leu Met Lys Asp Val Ser Pro Phe Pro Phe Leu Asp
Phe Gln Val 340 345 350 Glu 37 1002 DNA Glycine max Glyma15g03920.1
37 atgaggaagc cagaggcgag taataataat actaaaaaca acaacaacaa tagcaagaag
60cttagaaagg ggttgtggtc acctgaagaa gatgacaagc tcatgaacta catgctaaac
120catggacaag ggtgttggag cgatgtggca agaaatgctg gcctccaaag
gtgtggcaaa 180agttgtcgcc ttcgatggat caattacttg aggcctgatc
ttaagagagg tgcattctca 240ccccaagaag aggaactcat catccacttc
cattcccttc ttggaaacag atggtctcaa 300atagcggcgc gtttgcctgg
gcgaaccgac aatgaaataa aaaacttttg gaattcgacg 360ataaagaaaa
gactcaggaa tatgtcttcc acgaccacca caacctcacc ctcaccctca
420tcaaatgcaa gcgagacctc aatatccgag cctagtaata aagacctcaa
catgggaggg 480tttatttcca cacaacataa tcaacacgca ggctttgttc
ctatgttcgg ttcatctcca 540tcaccatcaa taatgcaaac cggtacagtt
ttcaatacct tgattgacag attgcctatg 600ctggagcatg gactaaacat
gccagcttct ggagggtact tcgaaggcac aggtattcct 660tgcttttcgc
aaagtgaagt taacaaatta ggttcttgtt atttagaaaa cggagtattt
720gggagaagtg taaatatcgg ggtagaaggg gatatgtttg ttcctcccct
agagaatgct 780acatgcagca ggagagaaac aactaacagc agttactttg
acgatgacat aaatagtatt 840cttaataact gcaacattgg cataggtgaa
aataaggctc atgatggggt ggagaatttg 900tttcaacaag agttagccac
tgccactgcc acaggagaat gggactttga ggagttgatg 960aaattagatg
tttcctcctt tccgtttctt gatttttcat ac 1002 38 334 PRT Glycine
max Glyma15g03920.1 38 Met Arg Lys Pro Glu Ala Ser Asn Asn Asn Thr
Lys Asn Asn Asn Asn 1 5 10 15 Asn Ser Lys Lys Leu Arg Lys Gly Leu
Trp Ser Pro Glu Glu Asp Asp 20 25 30 Lys Leu Met Asn Tyr Met Leu
Asn His Gly Gln Gly Cys Trp Ser Asp 35 40 45 Val Ala Arg Asn Ala
Gly Leu Gln Arg Cys Gly Lys Ser Cys Arg Leu 50 55 60 Arg Trp Ile
Asn Tyr Leu Arg Pro Asp Leu Lys Arg Gly Ala Phe Ser 65 70 75 80 Pro
Gln Glu Glu Glu Leu Ile Ile His Phe His Ser Leu Leu Gly Asn 85 90
95 Arg Trp Ser Gln Ile Ala Ala Arg Leu Pro Gly Arg Thr Asp Asn Glu
100 105 110 Ile Lys Asn Phe Trp Asn Ser Thr Ile Lys Lys Arg Leu Arg
Asn Met 115 120 125 Ser Ser Thr Thr Thr Thr Thr Ser Pro Ser Pro Ser
Ser Asn Ala Ser 130 135 140 Glu Thr Ser Ile Ser Glu Pro Ser Asn Lys
Asp Leu Asn Met Gly Gly 145 150 155 160 Phe Ile Ser Thr Gln His Asn
Gln His Ala Gly Phe Val Pro Met Phe 165 170 175 Gly Ser Ser Pro Ser
Pro Ser Ile Met Gln Thr Gly Thr Val Phe Asn 180 185 190 Thr Leu Ile
Asp Arg Leu Pro Met Leu Glu His Gly Leu Asn Met Pro 195 200 205 Ala
Ser Gly Gly Tyr Phe Glu Gly Thr Gly Ile Pro Cys Phe Ser Gln 210 215
220 Ser Glu Val Asn Lys Leu Gly Ser Cys Tyr Leu Glu Asn Gly Val Phe
225 230 235 240 Gly Arg Ser Val Asn Ile Gly Val Glu Gly Asp Met Phe
Val Pro Pro 245 250 255 Leu Glu Asn Ala Thr Cys Ser Arg Arg Glu Thr
Thr Asn Ser Ser Tyr 260 265 270 Phe Asp Asp Asp Ile Asn Ser Ile Leu
Asn Asn Cys Asn Ile Gly Ile 275 280 285 Gly Glu Asn Lys Ala His Asp
Gly Val Glu Asn Leu Phe Gln Gln Glu 290 295 300 Leu Ala Thr Ala Thr
Ala Thr Gly Glu Trp Asp Phe Glu Glu Leu Met 305 310 315 320 Lys Leu
Asp Val Ser Ser Phe Pro Phe Leu Asp Phe Ser Tyr 325 330
39 828 DNA Glycine max Glyma12g06180.1 39 atgaggaagc ccgaggtttc
tggaaacaac aacaacaaca acaacattaa taacaagctt 60agaaaagggt tgtggtcacc
tgaagaagat gacaagctca tgaactacat gctaaacagt 120ggacaaggtt
gttggagcga tgtagccaga aatgctggcc ttcaaaggtg tggcaaaagt
180tgtcgccttc gatggatcaa ctacttgagg cctgatctta aacgaggtgc
attctcacaa 240caagaagagg aactcatcat ccacttgcat tcccttctcg
gaaacagatg gtctcaaata 300gcggcgcgct taccagggag aacagacaat
gaaattaaga atttttggaa ttcaacaata 360aagaaaagac tcaagaacat
gtcatccaac acctcaccaa atggaagcga gtcctcatat 420gagcctaata
acagagacct taacatggca gggtttacta cttctaatac ccaagatcaa
480caacatgctg attttatgcc tatgttcaat tcatcatctc aatcaccctc
catgcatgcc 540atggttctca attccataat tgacaggttg cctatgctag
agcatggact aaacatgcca 600tgttctgttg acaacaaagg gatttatttg
gaaaatggag gagtatttgg gagtgtaaat 660attggtgcag aaggggatgt
gtatgttccc cctctagaga gtgttagcac tacttctgac 720cataacctga
aaggctggtg gggtggagaa tttgtttcag gaagagttaa ccattggaga
780gtgggacttg gaggagttaa tgaaagatgt ttcatccttt ccctttct
828 40 276 PRT Glycine max Glyma12g06180.1 40 Met Arg Lys Pro Glu Val Ser
Gly Asn Asn Asn Asn Asn Asn Asn Ile 1 5 10 15 Asn Asn Lys Leu Arg
Lys Gly Leu Trp Ser Pro Glu Glu Asp Asp Lys 20 25 30 Leu Met Asn
Tyr Met Leu Asn Ser Gly Gln Gly Cys Trp Ser Asp Val 35 40 45 Ala
Arg Asn Ala Gly Leu Gln Arg Cys Gly Lys Ser Cys Arg Leu Arg 50 55
60 Trp Ile Asn Tyr Leu Arg Pro Asp Leu Lys Arg Gly Ala Phe Ser Gln
65 70 75 80 Gln Glu Glu Glu Leu Ile Ile His Leu His Ser Leu Leu Gly
Asn Arg 85 90 95 Trp Ser Gln Ile Ala Ala Arg Leu Pro Gly Arg Thr
Asp Asn Glu Ile 100 105 110 Lys Asn Phe Trp Asn Ser Thr Ile Lys Lys
Arg Leu Lys Asn Met Ser 115 120 125 Ser Asn Thr Ser Pro Asn Gly Ser
Glu Ser Ser Tyr Glu Pro Asn Asn 130 135 140 Arg Asp Leu Asn Met Ala
Gly Phe Thr Thr Ser Asn Thr Gln Asp Gln 145 150 155 160 Gln His Ala
Asp Phe Met Pro Met Phe Asn Ser Ser Ser Gln Ser Pro 165 170 175 Ser
Met His Ala Met Val Leu Asn Ser Ile Ile Asp Arg Leu Pro Met 180 185
190 Leu Glu His Gly Leu Asn Met Pro Cys Ser Val Asp Asn Lys Gly Ile
195 200 205 Tyr Leu Glu Asn Gly Gly Val Phe Gly Ser Val Asn Ile Gly
Ala Glu 210 215 220 Gly Asp Val Tyr Val Pro Pro Leu Glu Ser Val Ser
Thr Thr Ser Asp 225 230 235 240 His Asn Leu Lys Gly Trp Trp Gly Gly
Glu Phe Val Ser Gly Arg Val 245 250 255 Asn His Trp Arg Val Gly Leu
Gly Gly Val Asn Glu Arg Cys Phe Ile 260 265 270 Leu Ser Leu Ser 275
41 887 DNA Glycine max Glyma11g14200.1 41 atgaggaagc ccgaggtttc
tggaaaaaac aacaacatta ataacaagct tagaaagggg 60ttgtggtcac ctgaagaaga
tgacaagctc atgaactaca tgctaaacag tggacaaggt 120tgttggagcg
atgtagccag aaatgcgggc cttcaaaggt gtggcaaaag ttgtcgcctt
180cgatggatca attacttgag gcctgatctt aaacgaggtg cattctcacc
acaagaagag 240gaaatcatca tccatttgca ttcccttctc ggaaacagat
ggtctcaaat agcagcgcgc 300ttaccaggga gaactgacaa tgaaattaag
aacttttgga attcaacaat aaagaaaaga 360ctcaagaact tgtcctccaa
cacctcacca aatggaagcg agtcatcata tgagcccaac 420aacaaagacc
ttaacatggc agggtttact acttctaata cccaacaaaa tcaacaacat
480gctgatttta tgcctatgtt gcctatgcta gagcatggac taaacatgac
aagttctggt 540ggattcttca acagcaaagg gccatgcttc tcatcatcac
aaagagtatt tgggagtgta 600aatattggtg cagaagggga tatgtatgtt
cctcctctag agagtgttag cactacttct 660gaccattata acctgaaatt
ggagagtacg tgcaacacag atactaacaa tagtaattac 720tttgatgaca
taaacagtat catccttaat aactgcaaca ttaataatag caacaatatt
780aagagagctg aaaatagggc tggtggggtg gagaatttgt ttcaagaaga
gttaaccatt 840ggagagtggg acttagagga gttgatgaaa gatgtttcat cctttcc
887 42 296 PRT Glycine max misc_feature (296)..(296) Xaa can be any
naturally occurring amino acid 42 Met Arg Lys Pro Glu Val Ser Gly
Lys Asn Asn Asn Ile Asn Asn Lys 1 5 10 15 Leu Arg Lys Gly Leu Trp
Ser Pro Glu Glu Asp Asp Lys Leu Met Asn 20 25 30 Tyr Met Leu Asn
Ser Gly Gln Gly Cys Trp Ser Asp Val Ala Arg Asn 35 40 45 Ala Gly
Leu Gln Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn 50 55 60
Tyr Leu Arg Pro Asp Leu Lys Arg Gly Ala Phe Ser Pro Gln Glu Glu 65
70 75 80 Glu Ile Ile Ile His Leu His Ser Leu Leu Gly Asn Arg Trp
Ser Gln 85 90 95 Ile Ala Ala Arg Leu Pro Gly Arg Thr Asp Asn Glu
Ile Lys Asn Phe 100 105 110 Trp Asn Ser Thr Ile Lys Lys Arg Leu Lys
Asn Leu Ser Ser Asn Thr 115 120 125 Ser Pro Asn Gly Ser Glu Ser Ser
Tyr Glu Pro Asn Asn Lys Asp Leu 130 135 140 Asn Met Ala Gly Phe Thr
Thr Ser Asn Thr Gln Gln Asn Gln Gln His 145 150 155 160 Ala Asp Phe
Met Pro Met Leu Pro Met Leu Glu His Gly Leu Asn Met 165 170 175 Thr
Ser Ser Gly Gly Phe Phe Asn Ser Lys Gly Pro Cys Phe Ser Ser 180 185
190 Ser Gln Arg Val Phe Gly Ser Val Asn Ile Gly Ala Glu Gly Asp Met
195 200 205 Tyr Val Pro Pro Leu Glu Ser Val Ser Thr Thr Ser Asp His
Tyr Asn 210 215 220 Leu Lys Leu Glu Ser Thr Cys Asn Thr Asp Thr Asn
Asn Ser Asn Tyr 225 230 235 240 Phe Asp Asp Ile Asn Ser Ile Ile Leu Asn Asn Cys Asn Ile Asn Asn
245 250 255 Ser Asn Asn Ile Lys Arg Ala Glu Asn Arg Ala Gly Gly Val
Glu Asn 260 265 270 Leu Phe Gln Glu Glu Leu Thr Ile Gly Glu Trp Asp
Leu Glu Glu Leu 275 280 285 Met Lys Asp Val Ser Ser Phe Xaa 290 295
43 840 DNA Arabidopsis thaliana AT5G12870.1 43 atgaggaagc cagaggtagc
cattgcagct agtactcacc aagtaaagaa gatgaagaag 60ggactttggt ctcctgagga
agactcaaag ctgatgcaat acatgttaag caatggacaa 120ggatgttgga
gtgatgttgc gaaaaacgca ggacttcaaa gatgtggcaa aagctgccgt
180cttcgttgga tcaactatct tcgtcctgac ctcaagcgtg gcgctttctc
tcctcaagaa 240gaggatctca tcattcgctt tcattccatc ctcggcaaca
ggtggtctca gattgcagca 300cgattgcctg gtcggaccga taacgagatc
aagaatttct ggaactcaac aataaagaaa 360aggctaaaga agatgtccga
tacctccaac ttaatcaaca actcatcctc atcacccaac 420acagcaagcg
attcctcttc taattccgca tcttctttgg atattaaaga cattatagga
480agcttcatgt ccttacaaga acaaggcttc gtcaaccctt ccttgaccca
catacaaacc 540aacaatccat ttccaacggg aaacatgatc agccacccgt
gcaatgacga ttttacccct 600tatgtagatg gtatctatgg agtaaacgca
ggggtacaag gggaactcta cttcccacct 660ttggaatgtg aagaaggtga
ttggtacaat gcaaatataa acaaccactt agacgagttg 720aacactaatg
gatccggaaa cgcacctgag ggtatgagac cagtggaaga attttgggac
780cttgaccagt tgatgaacac tgaggttcct tcgttttact tcaacttcaa
acaaagcata 840 44 280 PRT Arabidopsis thaliana AT5G12870.1 44 Met Arg Lys
Pro Glu Val Ala Ile Ala Ala Ser Thr His Gln Val Lys 1 5 10 15 Lys
Met Lys Lys Gly Leu Trp Ser Pro Glu Glu Asp Ser Lys Leu Met 20 25
30 Gln Tyr Met Leu Ser Asn Gly Gln Gly Cys Trp Ser Asp Val Ala Lys
35 40 45 Asn Ala Gly Leu Gln Arg Cys Gly Lys Ser Cys Arg Leu Arg
Trp Ile 50 55 60 Asn Tyr Leu Arg Pro Asp Leu Lys Arg Gly Ala Phe
Ser Pro Gln Glu 65 70 75 80 Glu Asp Leu Ile Ile Arg Phe His Ser Ile
Leu Gly Asn Arg Trp Ser 85 90 95 Gln Ile Ala Ala Arg Leu Pro Gly
Arg Thr Asp Asn Glu Ile Lys Asn 100 105 110 Phe Trp Asn Ser Thr Ile
Lys Lys Arg Leu Lys Lys Met Ser Asp Thr 115 120 125 Ser Asn Leu Ile
Asn Asn Ser Ser Ser Ser Pro Asn Thr Ala Ser Asp 130 135 140 Ser Ser
Ser Asn Ser Ala Ser Ser Leu Asp Ile Lys Asp Ile Ile Gly 145 150 155
160 Ser Phe Met Ser Leu Gln Glu Gln Gly Phe Val Asn Pro Ser Leu Thr
165 170 175 His Ile Gln Thr Asn Asn Pro Phe Pro Thr Gly Asn Met Ile
Ser His 180 185 190 Pro Cys Asn Asp Asp Phe Thr Pro Tyr Val Asp Gly
Ile Tyr Gly Val 195 200 205 Asn Ala Gly Val Gln Gly Glu Leu Tyr Phe
Pro Pro Leu Glu Cys Glu 210 215 220 Glu Gly Asp Trp Tyr Asn Ala Asn
Ile Asn Asn His Leu Asp Glu Leu 225 230 235 240 Asn Thr Asn Gly Ser
Gly Asn Ala Pro Glu Gly Met Arg Pro Val Glu 245 250 255 Glu Phe Trp
Asp Leu Asp Gln Leu Met Asn Thr Glu Val Pro Ser Phe 260 265 270 Tyr
Phe Asn Phe Lys Gln Ser Ile 275 280 45 1056 DNA Solanum
lycopersicum Solyc01g087130.2.1 45 atgaggaagc cggaacacaa taatacaacg
atgaaggaga aggagaagga gaaagtgaac 60aagttaggga aattaagaaa aggtttatgg
tcaccagaag aagatgaaaa gttgatgagt 120tacatgttaa gaaatggtca
aggatgttgg agtgacattg ctagaaatgc tggattacaa 180agatgtggta
agagttgtcg tcttagatgg attaattact tgaggcctga ccttaaacgc
240ggcgcctttt ctcttcaaga agaagaactc attgttcatt tgcattctat
tctgggaaat 300aggtggtcac aaattgcggc tcgtctacct gggaggacag
ataatgaaat caagaatttc 360tggaattcca cgataaaaaa gagatcaaaa
aacaacaaca acagcacgcc atcgcccaac 420acgagtgatt cttccattgg
aggaatattt ccaatgcaag ggcacgatgt aaataatgtt 480atggcaacgt
tatgcatgga taattcatcg tctactactt caggatcatc catgcaagcc
540atggtacctt tcaacccttt ccctcagctt gatagtacaa gctacgatat
atctggatta 600gtcgggccag tgaatttagg tcaatttggt tgtagtggag
gtgatggtgg atttttggac 660tatggagttg tggaaactta tagtatgatg
ggtttaggaa gtgatgaatt ttcaatacct 720tctttagagg gtgttcataa
taagagtact actactatgg gagagagtaa taataatagt 780aatgttgatt
ttagtagtaa taatattgtt agtggtgcta atgactatga ttccatgatt
840gagaagaaaa ataatacaaa cgttaacaac aacaacaacc aacacttgat
gaatatgagt 900ggaattagtg atcaaagcct aaaggttgaa gactatatgg
ttggttttgg gaatcatcat 960cattggcatg gagaaagctt aagaattgga
gaatttgatt gggaaggttt gttggcaaat 1020gtttcctctt taccttacct
tgattttcaa gttgaa 1056 46 352 PRT Solanum
lycopersicum Solyc01g087130.2.1 46 Met Arg Lys Pro Glu His Asn Asn
Thr Thr Met Lys Glu Lys Glu Lys 1 5 10 15 Glu Lys Val Asn Lys Leu
Gly Lys Leu Arg Lys Gly Leu Trp Ser Pro 20 25 30 Glu Glu Asp Glu
Lys Leu Met Ser Tyr Met Leu Arg Asn Gly Gln Gly 35 40 45 Cys Trp
Ser Asp Ile Ala Arg Asn Ala Gly Leu Gln Arg Cys Gly Lys 50 55 60
Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro Asp Leu Lys Arg 65
70 75 80 Gly Ala Phe Ser Leu Gln Glu Glu Glu Leu Ile Val His Leu
His Ser 85 90 95 Ile Leu Gly Asn Arg Trp Ser Gln Ile Ala Ala Arg
Leu Pro Gly Arg 100 105 110 Thr Asp Asn Glu Ile Lys Asn Phe Trp Asn
Ser Thr Ile Lys Lys Arg 115 120 125 Ser Lys Asn Asn Asn Asn Ser Thr
Pro Ser Pro Asn Thr Ser Asp Ser 130 135 140 Ser Ile Gly Gly Ile Phe
Pro Met Gln Gly His Asp Val Asn Asn Val 145 150 155 160 Met Ala Thr
Leu Cys Met Asp Asn Ser Ser Ser Thr Thr Ser Gly Ser 165 170 175 Ser
Met Gln Ala Met Val Pro Phe Asn Pro Phe Pro Gln Leu Asp Ser 180 185
190 Thr Ser Tyr Asp Ile Ser Gly Leu Val Gly Pro Val Asn Leu Gly Gln
195 200 205 Phe Gly Cys Ser Gly Gly Asp Gly Gly Phe Leu Asp Tyr Gly
Val Val 210 215 220 Glu Thr Tyr Ser Met Met Gly Leu Gly Ser Asp Glu
Phe Ser Ile Pro 225 230 235 240 Ser Leu Glu Gly Val His Asn Lys Ser
Thr Thr Thr Met Gly Glu Ser 245 250 255 Asn Asn Asn Ser Asn Val Asp
Phe Ser Ser Asn Asn Ile Val Ser Gly 260 265 270 Ala Asn Asp Tyr Asp
Ser Met Ile Glu Lys Lys Asn Asn Thr Asn Val 275 280 285 Asn Asn Asn
Asn Asn Gln His Leu Met Asn Met Ser Gly Ile Ser Asp 290 295 300 Gln
Ser Leu Lys Val Glu Asp Tyr Met Val Gly Phe Gly Asn His His 305 310
315 320 His Trp His Gly Glu Ser Leu Arg Ile Gly Glu Phe Asp Trp Glu
Gly 325 330 335 Leu Leu Ala Asn Val Ser Ser Leu Pro Tyr Leu Asp Phe
Gln Val Glu 340 345 350 47 1008 DNA Glycine max Glyma19g05080.1
47 atgaggaaac ctgatatgat gggaaaagac aaaatcaaca acaacattaa gagcaagcta
60aggaagggtt tgtggtcacc tgaggaagat gagaagctcc taaggtatat gatcactaag
120ggacaagggt gttggagtga cattgctagg aatgctggtc ttcaaaggtg
cggcaaaagt 180tgccgtcttc gttggattaa ctacttgaga cctgatctca
aacgtggtgc attttcacct 240caagaggaag aagtcatcat tcacttgcac
tccattcttg gcaacagatg gtctcaaatt 300gccgcacgtc tccctggtcg
cacagacaat gagatcaaga atttctggaa ctccacactg 360aagaaaaggt
tgaaaatgaa caacaataac tccactttat caccaaacaa tagtgactca
420tcagggccta aagatgtcaa tgtcatgggt ggaatcatgt ccatgaacga
gcatgacctc 480atgaccatgt gcatggactc ctcctcatca acatcatcat
catcatgcat gcaatccatg 540catgccacca acatggtact aactaacccc
tttcccttgt tgcccaacaa ccgttatgac 600atgatgaccg gtgcaaccgg
tttccttgac aacatggctg ctgcatgctt aacccaagtt 660ggcatggtag
atcatgatca tggggttgtt catgggacat tggagcctaa taaaacgcgt
720ttaggaagcg acttttccct tcctccacta gaaagtagaa gcattgagga
caatagtagt 780accccaattg atcatgtgaa aagccataac aacaacaacc
acttcaagaa tagttgcttc 840aataacactg atcattacca tcatattcaa
agctccaaca acgtagttgt agaggatttg 900tttgggtttg gaaatcatgg
gcaaggggaa aactttagaa tgggagaatg ggaccttgag 960ggcttgatgc
aagacatttc ctattttcct tcccttgatt tccaagtt 1008 48 336 PRT Glycine
max Glyma19g05080.1 48 Met Arg Lys Pro Asp Met Met Gly Lys Asp Lys
Ile Asn Asn Asn Ile 1 5 10 15 Lys Ser Lys Leu Arg Lys Gly Leu Trp
Ser Pro Glu Glu Asp Glu Lys 20 25 30 Leu Leu Arg Tyr Met Ile Thr
Lys Gly Gln Gly Cys Trp Ser Asp Ile 35 40 45 Ala Arg Asn Ala Gly
Leu Gln Arg Cys Gly Lys Ser Cys Arg Leu Arg 50 55 60 Trp Ile Asn
Tyr Leu Arg Pro Asp Leu Lys Arg Gly Ala Phe Ser Pro 65 70 75 80 Gln
Glu Glu Glu Val Ile Ile His Leu His Ser Ile Leu Gly Asn Arg 85 90
95 Trp Ser Gln Ile Ala Ala Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile
100 105 110 Lys Asn Phe Trp Asn Ser Thr Leu Lys Lys Arg Leu Lys Met
Asn Asn 115 120 125 Asn Asn Ser Thr Leu Ser Pro Asn Asn Ser Asp Ser
Ser Gly Pro Lys 130 135 140 Asp Val Asn Val Met Gly Gly Ile Met Ser
Met Asn Glu His Asp Leu 145 150 155 160 Met Thr Met Cys Met Asp Ser
Ser Ser Ser Thr Ser Ser Ser Ser Cys 165 170 175 Met Gln Ser Met His
Ala Thr Asn Met Val Leu Thr Asn Pro Phe Pro 180 185 190 Leu Leu Pro
Asn Asn Arg Tyr Asp Met Met Thr Gly Ala Thr Gly Phe 195 200 205 Leu
Asp Asn Met Ala Ala Ala Cys Leu Thr Gln Val Gly Met Val Asp 210 215
220 His Asp His Gly Val Val His Gly Thr Leu Glu Pro Asn Lys Thr Arg
225 230 235 240 Leu Gly Ser Asp Phe Ser Leu Pro Pro Leu Glu Ser Arg
Ser Ile Glu 245 250 255 Asp Asn Ser Ser Thr Pro Ile Asp His Val Lys
Ser His Asn Asn Asn 260 265 270 Asn His Phe Lys Asn Ser Cys Phe Asn
Asn Thr Asp His Tyr His His 275 280 285 Ile Gln Ser Ser Asn Asn Val
Val Val Glu Asp Leu Phe Gly Phe Gly 290 295 300 Asn His Gly Gln Gly
Glu Asn Phe Arg Met Gly Glu Trp Asp Leu Glu 305 310 315 320 Gly Leu
Met Gln Asp Ile Ser Tyr Phe Pro Ser Leu Asp Phe Gln Val 325 330 335
49 933 DNA Glycine max Glyma13g27310.1 49 atgaggaaac ctgatctgat
ggccaacaag gacaaagtga acaacaacat aaagagcaag 60ttgagaaaag ggttgtggtc
accagatgaa gatgagaggc tcataaggta catgctcaca 120aatggacaag
ggtgttggag tgacattgct aggaatgctg gtcttcaaag gtgtggcaaa
180agttgccgtc ttcgttggat caattacttg agacctgacc tcaagcgtgg
tgcattttcg 240ccccaagagg aagatctcat cgttcatttg cactccattc
ttggcaatag gtggtctcag 300attgcagcac atctccctgg ccgcacagac
aatgagatta agaatttctg gaactccaca 360ttgaagaaaa ggttgaaagc
aaacacttct actccctcac taaacaacag cacaggctca 420tcagagtcta
ataaggatgt tttgagtggg atcatgccct ttagtgaaca tgacatcatg
480accatgtgca tggattcctc ttcttccata tcatccatgc aagcaacggt
tttgcctgac 540caatttgacc ctttttccat gttggcaaat aatcagtgtg
acatgactaa tgtttcagca 600gattttccca acttgactca aattggcatg
gtagaggggc atgaagggaa ttatgggata 660ttggagccaa ataaaatggg
gttaggaaga gatttctccc ttccttcact agaaagtaga 720agcattgaaa
gcaatagtgt cccaattgat gtgaaaagcc ataacaacca cttcaattat
780ggttccttca atcacactga taaaattcag ggctccaaag tagaggactt
aattgagttt 840ggaaatcatg gccaagggga ggatttaaaa atgggagagt
gggatttgga gaatttgatg 900caagacataa cctcttttcc tttccttgag ttt
933 50 311 PRT Glycine max Glyma13g27310.1 50 Met Arg Lys Pro Asp Leu Met
Ala Asn Lys Asp Lys Val Asn Asn Asn 1 5 10 15 Ile Lys Ser Lys Leu
Arg Lys Gly Leu Trp Ser Pro Asp Glu Asp Glu 20 25 30 Arg Leu Ile
Arg Tyr Met Leu Thr Asn Gly Gln Gly Cys Trp Ser Asp 35 40 45 Ile
Ala Arg Asn Ala Gly Leu Gln Arg Cys Gly Lys Ser Cys Arg Leu 50 55
60 Arg Trp Ile Asn Tyr Leu Arg Pro Asp Leu Lys Arg Gly Ala Phe Ser
65 70 75 80 Pro Gln Glu Glu Asp Leu Ile Val His Leu His Ser Ile Leu
Gly Asn 85 90 95 Arg Trp Ser Gln Ile Ala Ala His Leu Pro Gly Arg
Thr Asp Asn Glu 100 105 110 Ile Lys Asn Phe Trp Asn Ser Thr Leu Lys
Lys Arg Leu Lys Ala Asn 115 120 125 Thr Ser Thr Pro Ser Leu Asn Asn
Ser Thr Gly Ser Ser Glu Ser Asn 130 135 140 Lys Asp Val Leu Ser Gly
Ile Met Pro Phe Ser Glu His Asp Ile Met 145 150 155 160 Thr Met Cys
Met Asp Ser Ser Ser Ser Ile Ser Ser Met Gln Ala Thr 165 170 175 Val
Leu Pro Asp Gln Phe Asp Pro Phe Ser Met Leu Ala Asn Asn Gln 180 185
190 Cys Asp Met Thr Asn Val Ser Ala Asp Phe Pro Asn Leu Thr Gln Ile
195 200 205 Gly Met Val Glu Gly His Glu Gly Asn Tyr Gly Ile Leu Glu
Pro Asn 210 215 220 Lys Met Gly Leu Gly Arg Asp Phe Ser Leu Pro Ser
Leu Glu Ser Arg 225 230 235 240 Ser Ile Glu Ser Asn Ser Val Pro Ile
Asp Val Lys Ser His Asn Asn 245 250 255 His Phe Asn Tyr Gly Ser Phe
Asn His Thr Asp Lys Ile Gln Gly Ser 260 265 270 Lys Val Glu Asp Leu
Ile Glu Phe Gly Asn His Gly Gln Gly Glu Asp 275 280 285 Leu Lys Met
Gly Glu Trp Asp Leu Glu Asn Leu Met Gln Asp Ile Thr 290 295 300 Ser
Phe Pro Phe Leu Glu Phe 305 310 51 945 DNA Glycine max Glyma12g36630.1
51 atgaggaaac ctgatctgat ggccaacaag gacaaaatga acaacattaa gagcaagttg
60agaaaagggt tgtggtcacc agatgaagat gagaggctcg taaggtacat gctgacaaat
120ggacaagggt gttggagtga cattgctagg aatgctggtc ttcaaaggtg
tggcaaaagt 180tgccgtcttc gttggatcaa ttacttgaga cctgacctca
agcgtggtgc attctcacct 240caagaggaag atctcatcgt tcatttgcac
tccattcttg gcaataggtg gtctcagatt 300gcagcgcgtc tccctggccg
cacagacaat gagattaaga atttctggaa ctccacattg 360aagaaaaggt
tgaaaactaa cacttccact ccctcactaa acaacagcac tggctcatca
420gagtctaata aggatgtttt gagtgggatc atgcccttta atgaacatga
catcatgacc 480atgtgcatgg attcctcttc gtccatatca tccatgcaag
caatggtttt gcctgaccaa 540tttgaccctt ttttcatgtt ggcaaataat
cagtgtgaca tgactaatgt ttcatctgac 600ttttccaaca tgcctgctgc
atgcttgact caaattggca tggtagatgg gcatcaaggg 660aattatggga
tattggagcc aaataaaatg gggtcaggaa tagacttctc ccttccttca
720ctagaaagta gaagcattga aagcaatagt gtcccaattg atgtgaaaag
ccataacaac 780cacttcaatt atggttcctt caagaacact gataagattc
agggctccaa agtggaggac 840ttgatcgggt ttggaaatca tggccaaggg
gaaaatttaa aaatgggaga gtgggatttg 900gagaatttaa tgcaagacat
aacctctttt cctttccttg atttt 945 52 315 PRT Glycine max Glyma12g36630.1
52 Met Arg Lys Pro Asp Leu Met Ala Asn Lys Asp Lys Met Asn Asn Ile 1
5 10 15 Lys Ser Lys Leu Arg Lys Gly Leu Trp Ser Pro Asp Glu Asp Glu
Arg 20 25 30 Leu Val Arg Tyr Met Leu Thr Asn Gly Gln Gly Cys Trp
Ser Asp Ile 35 40 45 Ala Arg Asn Ala Gly Leu Gln Arg Cys Gly Lys
Ser Cys Arg Leu Arg 50 55 60 Trp Ile Asn Tyr Leu Arg Pro Asp Leu
Lys Arg Gly Ala Phe Ser Pro 65 70 75 80 Gln Glu Glu Asp Leu Ile Val
His Leu His Ser Ile Leu Gly Asn Arg 85 90 95 Trp Ser Gln Ile Ala
Ala Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile 100 105 110 Lys Asn Phe
Trp Asn Ser Thr Leu Lys Lys Arg Leu Lys Thr Asn Thr 115 120 125 Ser
Thr Pro Ser Leu Asn Asn Ser Thr Gly Ser Ser Glu Ser Asn Lys 130 135
140 Asp Val Leu Ser Gly Ile Met Pro Phe Asn Glu His Asp Ile Met Thr
145 150 155 160 Met Cys Met Asp Ser Ser Ser Ser Ile Ser Ser Met Gln
Ala Met Val 165 170 175 Leu Pro Asp Gln Phe Asp Pro Phe Phe Met Leu Ala Asn Asn
Gln Cys 180 185 190 Asp Met Thr Asn Val Ser Ser Asp Phe Ser Asn Met
Pro Ala Ala Cys 195 200 205 Leu Thr Gln Ile Gly Met Val Asp Gly His
Gln Gly Asn Tyr Gly Ile 210 215 220 Leu Glu Pro Asn Lys Met Gly Ser
Gly Ile Asp Phe Ser Leu Pro Ser 225 230 235 240 Leu Glu Ser Arg Ser
Ile Glu Ser Asn Ser Val Pro Ile Asp Val Lys 245 250 255 Ser His Asn
Asn His Phe Asn Tyr Gly Ser Phe Lys Asn Thr Asp Lys 260 265 270 Ile
Gln Gly Ser Lys Val Glu Asp Leu Ile Gly Phe Gly Asn His Gly 275 280
285 Gln Gly Glu Asn Leu Lys Met Gly Glu Trp Asp Leu Glu Asn Leu Met
290 295 300 Gln Asp Ile Thr Ser Phe Pro Phe Leu Asp Phe 305 310 315
53 1029 DNA Arabidopsis thaliana AT3G08500.1 53 atgatgatga ggaaaccgga
cattactacg atcagagaca aaggcaagcc aaatcatgca 60tgtggtggta ataacaacaa
accgaagcta agaaaaggac tttggtcgcc tgatgaagat 120gagaagctga
taagatacat gttgactaat ggacaaggat gttggagtga catcgctaga
180aatgctggtc ttttacgttg tggtaaaagt tgtcgccttc gctggatcaa
ttacttgagg 240cctgatctta aacgtggatc cttctctcct caggaggagg
atctcatctt ccatttgcat 300tccattcttg gtaacaggtg gtctcaaata
gctactcggc ttccaggtag aacagacaac 360gagatcaaaa acttttggaa
ctcgacattg aagaagcggc ttaagaacaa cagcaacaac 420aatacttcat
caggatcatc acctaacaat agtaatagta attccttgga cccaagagat
480caacatgtgg atatgggagg caactcaact tcattgatgg atgactatca
tcatgatgaa 540aacatgatga cagtggggaa caccatgcgc atggactctt
cctccccatt caatgttgga 600ccaatggtta atagtgtggg cttaaaccaa
ctttatgatc ccttgatgat atcagtgccg 660gataacggat atcaccaaat
gggaaacaca gtgaatgtgt tcagcgttaa tggtttagga 720gattatggaa
acacaattct tgatccaatt agcaagagag tatcagtaga aggtgatgat
780tggttcattc ccccctcgga gaataccaac gtcattgctt gtagtacaag
caacaaccta 840aacttacagg cccttgatcc ttgcttcaat agcaaaaatc
tttgtcattc agaaagcttc 900aaggtaggga atgtgttggg gatagagaat
ggttcttggg aaatagaaaa ccctaaaatc 960ggagattggg atttggatgg
tctcatcgat aacaactctt cttttccctt ccttgatttc 1020caagtcgat
1029 54 343 PRT Arabidopsis thaliana AT3G08500.1 54 Met Met Met Arg Lys
Pro Asp Ile Thr Thr Ile Arg Asp Lys Gly Lys 1 5 10 15 Pro Asn His
Ala Cys Gly Gly Asn Asn Asn Lys Pro Lys Leu Arg Lys 20 25 30 Gly
Leu Trp Ser Pro Asp Glu Asp Glu Lys Leu Ile Arg Tyr Met Leu 35 40
45 Thr Asn Gly Gln Gly Cys Trp Ser Asp Ile Ala Arg Asn Ala Gly Leu
50 55 60 Leu Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr
Leu Arg 65 70 75 80 Pro Asp Leu Lys Arg Gly Ser Phe Ser Pro Gln Glu
Glu Asp Leu Ile 85 90 95 Phe His Leu His Ser Ile Leu Gly Asn Arg
Trp Ser Gln Ile Ala Thr 100 105 110 Arg Leu Pro Gly Arg Thr Asp Asn
Glu Ile Lys Asn Phe Trp Asn Ser 115 120 125 Thr Leu Lys Lys Arg Leu
Lys Asn Asn Ser Asn Asn Asn Thr Ser Ser 130 135 140 Gly Ser Ser Pro
Asn Asn Ser Asn Ser Asn Ser Leu Asp Pro Arg Asp 145 150 155 160 Gln
His Val Asp Met Gly Gly Asn Ser Thr Ser Leu Met Asp Asp Tyr 165 170
175 His His Asp Glu Asn Met Met Thr Val Gly Asn Thr Met Arg Met Asp
180 185 190 Ser Ser Ser Pro Phe Asn Val Gly Pro Met Val Asn Ser Val
Gly Leu 195 200 205 Asn Gln Leu Tyr Asp Pro Leu Met Ile Ser Val Pro
Asp Asn Gly Tyr 210 215 220 His Gln Met Gly Asn Thr Val Asn Val Phe
Ser Val Asn Gly Leu Gly 225 230 235 240 Asp Tyr Gly Asn Thr Ile Leu
Asp Pro Ile Ser Lys Arg Val Ser Val 245 250 255 Glu Gly Asp Asp Trp
Phe Ile Pro Pro Ser Glu Asn Thr Asn Val Ile 260 265 270 Ala Cys Ser
Thr Ser Asn Asn Leu Asn Leu Gln Ala Leu Asp Pro Cys 275 280 285 Phe
Asn Ser Lys Asn Leu Cys His Ser Glu Ser Phe Lys Val Gly Asn 290 295
300 Val Leu Gly Ile Glu Asn Gly Ser Trp Glu Ile Glu Asn Pro Lys Ile
305 310 315 320 Gly Asp Trp Asp Leu Asp Gly Leu Ile Asp Asn Asn Ser
Ser Phe Pro 325 330 335 Phe Leu Asp Phe Gln Val Asp 340
55 1215 DNA Zea mays GRMZM2G052606_T01 55 atgaggaaac cggagtgccc
agcggcgaac agcagcaatg cgggggcggc ggccgcgaag 60ctgcggaagg ggctgtggtc
gccggaggag gacgagaggc tggtggcgta catgctgcgg 120agtggacagg
gttcttggag cgatgtggcc cggaacgccg ggttgcagcg gtgcggcaag
180agctgccgcc tccggtggat caactacctc cggccggacc tcaagcgcgg
cgccttctcg 240ccgcaggagg aggagctcat cgtcagcctc cacgccatcc
tgggaaacag gtggtctcag 300attgctgccc ggttgccggg gcgcaccgac
aacgagatca agaacttctg gaactccacc 360atcaagaagc ggctcaagaa
cagctcggca gcttcgtcac cagcagctac ggactgcgcg 420tcgccggagc
ctaataacaa ggtcgccgcc gccggtagct gcccggatct ttccgtccta
480gatcatcagg acggtggcca ccaccacgca atgacgacga cgactgcagg
tttgtggatg 540gtggactcat cctcctcttg tacctcgtcg acctcgccaa
tgcatcagtt tcagaggccg 600acgacgacga tggcagcggc cgtggccagc
gggagctatg gaggtctcgt ccccttccct 660gaccaggtcc gtggtgttgt
ggccgacacg ggagggttct ttcatggcca cgcggcgcca 720gcgttcaagc
accaagttgc cgcattgcac ggtggtggtt attactacgg cagcgctcct
780cgtcaccatg gaatgacgac gacgacgacg acggtggcat tggaaggaag
cggtggatgc 840ttcatatctg gcgaaggcat gcttggtgtg ccccctctgc
tgttagagcc catgtcagca 900gcgctagagc aagaccaagg ccagaccttg
atggcatcaa gtggtaacaa caaccctaaa 960aacaacagca gcagcaacac
tactgatact acgactacca cgacactgag caacaatgag 1020agcaacgtca
cagacaccac caccaaggac aacaccacca acaccatcag ccaagtgaac
1080agtggcagca ataatgtcta ctgggagggg gcccgccagc agtacatgag
caggaatgtc 1140atgcatgggg agtgggacct ggaggagctg atgaaagatg
tgtcatcctt gccttttctt 1200gatttccaag ttgaa 121556405PRTZea
maysGRMZM2G052606_T01 56Met Arg Lys Pro Glu Cys Pro Ala Ala Asn Ser
Ser Asn Ala Gly Ala 1 5 10 15 Ala Ala Ala Lys Leu Arg Lys Gly Leu
Trp Ser Pro Glu Glu Asp Glu 20 25 30 Arg Leu Val Ala Tyr Met Leu
Arg Ser Gly Gln Gly Ser Trp Ser Asp 35 40 45 Val Ala Arg Asn Ala
Gly Leu Gln Arg Cys Gly Lys Ser Cys Arg Leu 50 55 60 Arg Trp Ile
Asn Tyr Leu Arg Pro Asp Leu Lys Arg Gly Ala Phe Ser 65 70 75 80 Pro
Gln Glu Glu Glu Leu Ile Val Ser Leu His Ala Ile Leu Gly Asn 85 90
95 Arg Trp Ser Gln Ile Ala Ala Arg Leu Pro Gly Arg Thr Asp Asn Glu
100 105 110 Ile Lys Asn Phe Trp Asn Ser Thr Ile Lys Lys Arg Leu Lys
Asn Ser 115 120 125 Ser Ala Ala Ser Ser Pro Ala Ala Thr Asp Cys Ala
Ser Pro Glu Pro 130 135 140 Asn Asn Lys Val Ala Ala Ala Gly Ser Cys
Pro Asp Leu Ser Val Leu 145 150 155 160 Asp His Gln Asp Gly Gly His
His His Ala Met Thr Thr Thr Thr Ala 165 170 175 Gly Leu Trp Met Val
Asp Ser Ser Ser Ser Cys Thr Ser Ser Thr Ser 180 185 190 Pro Met His
Gln Phe Gln Arg Pro Thr Thr Thr Met Ala Ala Ala Val 195 200 205 Ala
Ser Gly Ser Tyr Gly Gly Leu Val Pro Phe Pro Asp Gln Val Arg 210 215
220 Gly Val Val Ala Asp Thr Gly Gly Phe Phe His Gly His Ala Ala Pro
225 230 235 240 Ala Phe Lys His Gln Val Ala Ala Leu His Gly Gly Gly
Tyr Tyr Tyr 245 250 255 Gly Ser Ala Pro Arg His His Gly Met Thr Thr
Thr Thr Thr Thr Val 260 265 270 Ala Leu Glu Gly Ser Gly Gly Cys Phe
Ile Ser Gly Glu Gly Met Leu 275 280 285 Gly Val Pro Pro Leu Leu Leu
Glu Pro Met Ser Ala Ala Leu Glu Gln 290 295 300 Asp Gln Gly Gln Thr
Leu Met Ala Ser Ser Gly Asn Asn Asn Pro Lys 305 310 315 320 Asn Asn
Ser Ser Ser Asn Thr Thr Asp Thr Thr Thr Thr Thr Thr Leu 325 330 335
Ser Asn Asn Glu Ser Asn Val Thr Asp Thr Thr Thr Lys Asp Asn Thr 340
345 350 Thr Asn Thr Ile Ser Gln Val Asn Ser Gly Ser Asn Asn Val Tyr
Trp 355 360 365 Glu Gly Ala Arg Gln Gln Tyr Met Ser Arg Asn Val Met
His Gly Glu 370 375 380 Trp Asp Leu Glu Glu Leu Met Lys Asp Val Ser
Ser Leu Pro Phe Leu 385 390 395 400 Asp Phe Gln Val Glu 405
57 1236 DNA Setaria italica Si024786m 57 atgaggaagc cggagggccc
agcggcgagc ggcggctgca atggcggtgc ggcggcggcg 60gcgaagctgc ggaaggggct
gtggtcgccg gaggaggacg agaagctggt ggcctacatg 120ctgcggagcg
ggcaggggtc gtggagcgac gtggcccgga acgccggcct gcaacgctgc
180ggcaagagct gccgcctccg gtggatcaac tacctccggc cggacctcaa
gcgcggcgcc 240ttctcgccgc aggaggagga cctcatcgtc agcctccacg
ccatcctcgg caacaggtgg 300tctcagatcg ctgcccggct gccggggcgc
accgacaacg agatcaagaa cttctggaac 360tccaccatca agaagcggct
caagaacagc tcctcggcct cgtcgccggc ggccaccgac 420tgcgcgtcgc
cgacggagcc tagcagcaag gtcgccggca tcgacatcag cggcgccacc
480agctgcccgg acctcgccgg cctggaccat catcatcagg acggcggcca
ccaccacgcg 540atgatgacga cgggcttgtg gatggtggac tcgtcctcct
ccacttcctc atcgacctcg 600ccgatgcaga gccggccgcc gccgtcggcc
attgcagcgg cggtggcccg gagctacggc 660ggcctcctcc ccctccccga
ccagctccgc ggcggcacgg cggccgacac gtcgccggca 720gggttcttcc
acggccacgc ggcgccgttc aaacagcaag cagcagttgc ctcattgcat
780ggcggttact atggaatggg cagtcctcat caccatggga tgatggcaat
ggagggagga 840ggagggtgct tcatgagagg agaaggcctc tttggtgtgg
cccctctgct ggatgccatg 900tcagcacaag accaagacca ggcaggccag
gccctaatag catcaagtgg tggtaacaac 960aaccctaaaa acaacagcag
caacaacact accgagacta caacaacagt gagtaacaat 1020gagagcaaca
tcacagacaa caacaccacc aacaccaagg acaacaacat caacgccatg
1080agcctagtga acagcggcag cagcaatgtg gctgctgtct actgggaggg
ggcccaccag 1140cagtacatga gcaggaatgt catgcatggg gagtgggacc
tggaggagct gatgaaagat 1200gtgtcatcct tgcctttcct tgatttccaa gtcgaa
1236 58 412 PRT Setaria italica Si024786m 58 Met Arg Lys Pro Glu Gly Pro
Ala Ala Ser Gly Gly Cys Asn Gly Gly 1 5 10 15 Ala Ala Ala Ala Ala
Lys Leu Arg Lys Gly Leu Trp Ser Pro Glu Glu 20 25 30 Asp Glu Lys
Leu Val Ala Tyr Met Leu Arg Ser Gly Gln Gly Ser Trp 35 40 45 Ser
Asp Val Ala Arg Asn Ala Gly Leu Gln Arg Cys Gly Lys Ser Cys 50 55
60 Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro Asp Leu Lys Arg Gly Ala
65 70 75 80 Phe Ser Pro Gln Glu Glu Asp Leu Ile Val Ser Leu His Ala
Ile Leu 85 90 95 Gly Asn Arg Trp Ser Gln Ile Ala Ala Arg Leu Pro
Gly Arg Thr Asp 100 105 110 Asn Glu Ile Lys Asn Phe Trp Asn Ser Thr
Ile Lys Lys Arg Leu Lys 115 120 125 Asn Ser Ser Ser Ala Ser Ser Pro
Ala Ala Thr Asp Cys Ala Ser Pro 130 135 140 Thr Glu Pro Ser Ser Lys
Val Ala Gly Ile Asp Ile Ser Gly Ala Thr 145 150 155 160 Ser Cys Pro
Asp Leu Ala Gly Leu Asp His His His Gln Asp Gly Gly 165 170 175 His
His His Ala Met Met Thr Thr Gly Leu Trp Met Val Asp Ser Ser 180 185
190 Ser Ser Thr Ser Ser Ser Thr Ser Pro Met Gln Ser Arg Pro Pro Pro
195 200 205 Ser Ala Ile Ala Ala Ala Val Ala Arg Ser Tyr Gly Gly Leu
Leu Pro 210 215 220 Leu Pro Asp Gln Leu Arg Gly Gly Thr Ala Ala Asp
Thr Ser Pro Ala 225 230 235 240 Gly Phe Phe His Gly His Ala Ala Pro
Phe Lys Gln Gln Ala Ala Val 245 250 255 Ala Ser Leu His Gly Gly Tyr
Tyr Gly Met Gly Ser Pro His His His 260 265 270 Gly Met Met Ala Met
Glu Gly Gly Gly Gly Cys Phe Met Arg Gly Glu 275 280 285 Gly Leu Phe
Gly Val Ala Pro Leu Leu Asp Ala Met Ser Ala Gln Asp 290 295 300 Gln
Asp Gln Ala Gly Gln Ala Leu Ile Ala Ser Ser Gly Gly Asn Asn 305 310
315 320 Asn Pro Lys Asn Asn Ser Ser Asn Asn Thr Thr Glu Thr Thr Thr
Thr 325 330 335 Val Ser Asn Asn Glu Ser Asn Ile Thr Asp Asn Asn Thr
Thr Asn Thr 340 345 350 Lys Asp Asn Asn Ile Asn Ala Met Ser Leu Val
Asn Ser Gly Ser Ser 355 360 365 Asn Val Ala Ala Val Tyr Trp Glu Gly
Ala His Gln Gln Tyr Met Ser 370 375 380 Arg Asn Val Met His Gly Glu
Trp Asp Leu Glu Glu Leu Met Lys Asp 385 390 395 400 Val Ser Ser Leu
Pro Phe Leu Asp Phe Gln Val Glu 405 410 591227DNAOryza
sativaLOC_Os12g33070.1 59atgaggaagc cggattgcgg cggcggggga
ggggcggcga agggcggcgg cgttctgggt 60gtggcgggag ggaacaatgc ggcggtggtg
ggggggaagg ttcggaaggg gctgtggtcg 120ccggaggagg acgagaagct
ggtggcgtac atgctgcgga gcgggcaggg gtcgtggagc 180gacgtggcga
ggaacgccgg gctgcagcgc tgcggcaaga gctgccgcct ccggtggatc
240aactacctcc gccccgacct caagcgcggc gccttctcgc cgcaggagga
ggacctcatc 300gtcaacctcc acgccatcct cggcaacagg tggtctcaga
tcgctgcccg gctaccgggg 360cgcaccgaca acgagatcaa gaacttctgg
aactccacca tcaagaagcg cctcaagatc 420tcctcctcct cggcgtctcc
ggccaccacc accgactgcg cctccccgcc ggagcacaag 480ctcggcgccg
tcgtcgacct cgccggcggc ggcggcgcca cggacgacgt cgttgtcggg
540acagctaatg ctgccatgaa gagcatgtgg gtggattcct cgtcgtcgtc
gtcgtcgtct 600tcctcgtcga tgcagagccg gccgtcgata atggcggcgg
cggcggcggg gaggagctac 660ggcggcctcc tcccactccc cgaccaggtc
tgcggcgtcg acacctcgcc gccaccgccg 720ttcttccacg accactccat
ctccatcaag caagcatact acggatcaac cggcgcccac 780caccaccacc
acgcgatcgc caccatggac ggatcaagct taataggaga tcatcaccat
840cacagcagca gcatcctctt tggcggcgca tcagtgccac ctctcctaga
ccaccaaacc 900attctcgacg acgacgacga ccaccctaac aaaaccggca
gcaacacgac cgcggccaca 960ctgagcagca acatcacaga caacagcaac
agcaacaaga acaacagtga taataacaac 1020aacatcagca gcagctgctg
cattagccta atgaacagca gcagcaacat gatctattgg 1080gagggtcacc
accaacaaca gcagcagcag catcagatgc tgcagcagca gcagcagcac
1140atgagcagga atgtcatggg agagtgggac ttggaggagc tgatgaaaga
tgtgtcatcc 1200ttgcctttcc ttgatttcca agttgaa 122760409PRTOryza
sativaLOC_Os12g33070.1 60Met Arg Lys Pro Asp Cys Gly Gly Gly Gly
Gly Ala Ala Lys Gly Gly 1 5 10 15 Gly Val Leu Gly Val Ala Gly Gly
Asn Asn Ala Ala Val Val Gly Gly 20 25 30 Lys Val Arg Lys Gly Leu
Trp Ser Pro Glu Glu Asp Glu Lys Leu Val 35 40 45 Ala Tyr Met Leu
Arg Ser Gly Gln Gly Ser Trp Ser Asp Val Ala Arg 50 55 60 Asn Ala
Gly Leu Gln Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Ile 65 70 75 80
Asn Tyr Leu Arg Pro Asp Leu Lys Arg Gly Ala Phe Ser Pro Gln Glu 85
90 95 Glu Asp Leu Ile Val Asn Leu His Ala Ile Leu Gly Asn Arg Trp
Ser 100 105 110 Gln Ile Ala Ala Arg Leu Pro Gly Arg Thr Asp Asn Glu
Ile Lys Asn 115 120 125 Phe Trp Asn Ser Thr Ile Lys Lys Arg Leu Lys
Ile Ser Ser Ser Ser 130 135 140 Ala Ser Pro Ala Thr Thr Thr Asp Cys
Ala Ser Pro Pro Glu His Lys 145 150 155 160 Leu Gly Ala Val Val Asp
Leu Ala Gly Gly Gly Gly Ala Thr Asp Asp 165 170 175 Val Val Val Gly
Thr Ala Asn Ala Ala Met Lys Ser Met Trp Val Asp 180 185 190 Ser Ser
Ser Ser Ser Ser Ser Ser Ser Ser Ser Met Gln Ser Arg Pro 195 200 205
Ser Ile Met Ala Ala Ala Ala
Ala Gly Arg Ser Tyr Gly Gly Leu Leu 210 215 220 Pro Leu Pro Asp Gln
Val Cys Gly Val Asp Thr Ser Pro Pro Pro Pro 225 230 235 240 Phe Phe
His Asp His Ser Ile Ser Ile Lys Gln Ala Tyr Tyr Gly Ser 245 250 255
Thr Gly Ala His His His His His Ala Ile Ala Thr Met Asp Gly Ser 260
265 270 Ser Leu Ile Gly Asp His His His His Ser Ser Ser Ile Leu Phe
Gly 275 280 285 Gly Ala Ser Val Pro Pro Leu Leu Asp His Gln Thr Ile
Leu Asp Asp 290 295 300 Asp Asp Asp His Pro Asn Lys Thr Gly Ser Asn
Thr Thr Ala Ala Thr 305 310 315 320 Leu Ser Ser Asn Ile Thr Asp Asn
Ser Asn Ser Asn Lys Asn Asn Ser 325 330 335 Asp Asn Asn Asn Asn Ile
Ser Ser Ser Cys Cys Ile Ser Leu Met Asn 340 345 350 Ser Ser Ser Asn
Met Ile Tyr Trp Glu Gly His His Gln Gln Gln Gln 355 360 365 Gln Gln
His Gln Met Leu Gln Gln Gln Gln Gln His Met Ser Arg Asn 370 375 380
Val Met Gly Glu Trp Asp Leu Glu Glu Leu Met Lys Asp Val Ser Ser 385
390 395 400 Leu Pro Phe Leu Asp Phe Gln Val Glu 405
6161PRTArabidopsis thalianaAT5G52260.1 1st Myb Domain 61Trp Ser Pro
Glu Glu Asp Gln Lys Leu Lys Ser Phe Ile Leu Ser Arg 1 5 10 15 Gly
His Ala Cys Trp Thr Thr Val Pro Ile Leu Ala Gly Leu Gln Arg 20 25
30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro Gly
35 40 45 Leu Lys Arg Gly Ser Phe Ser Glu Glu Glu Glu Glu Thr 50 55
60 6261PRTArabidopsis thalianaAT4G25560.1 1st Myb Domain 62Trp Ser
Pro Glu Glu Asp Glu Lys Leu Arg Ser Phe Ile Leu Ser Tyr 1 5 10 15
Gly His Ser Cys Trp Thr Thr Val Pro Ile Lys Ala Gly Leu Gln Arg 20
25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro
Gly 35 40 45 Leu Lys Arg Asp Met Ile Ser Ala Glu Glu Glu Glu Thr 50
55 60 6361PRTOryza sativaLOC_Os04g45020.1 1st Myb Domain 63Trp Ser
Pro Glu Glu Asp Gln Lys Leu Arg Asp Phe Ile Leu Arg Tyr 1 5 10 15
Gly His Gly Cys Trp Ser Ala Val Pro Val Lys Ala Gly Leu Gln Arg 20
25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro
Gly 35 40 45 Leu Lys His Gly Met Phe Ser Arg Glu Glu Glu Glu Thr 50
55 60 6461PRTBrachypodium distachyonBradi5g16672.1 1st Myb Domain
64Trp Ser Pro Glu Glu Asp Gln Lys Leu Arg Asp Tyr Ile Ile Arg Tyr 1
5 10 15 Gly His Ser Cys Trp Ser Thr Val Pro Val Lys Ala Gly Leu Gln
Arg 20 25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu
Arg Pro Gly 35 40 45 Leu Lys His Gly Met Phe Ser Gln Glu Glu Glu
Glu Thr 50 55 60 6561PRTZea maysGRMZM2G170049_T01 1st Myb Domain
65Trp Ser Pro Glu Glu Asp Gln Lys Leu Arg Asp Tyr Ile Leu Leu His 1
5 10 15 Gly His Gly Cys Trp Ser Ala Leu Pro Ala Lys Ala Gly Leu Gln
Arg 20 25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu
Arg Pro Gly 35 40 45 Leu Lys His Gly Met Phe Ser Pro Glu Glu Glu
Glu Thr 50 55 60 6661PRTSetaria italicaSi012304m 1st Myb Domain
66Trp Ser Pro Glu Glu Asp Glu Lys Leu Arg Asp Phe Ile Leu Arg Tyr 1
5 10 15 Gly His Gly Cys Trp Ser Ala Leu Pro Ala Lys Ala Gly Leu Gln
Arg 20 25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu
Arg Pro Gly 35 40 45 Leu Lys His Gly Met Phe Ser Arg Glu Glu Glu
Glu Thr 50 55 60 6761PRTCitrus clementinaclementine0.9_033485m 1st
Myb Domain 67Trp Ser Pro Glu Glu Asp Gln Arg Leu Lys Asn Tyr Val
Leu Gln His 1 5 10 15 Gly His Pro Cys Trp Ser Ser Val Pro Ile Asn
Ala Gly Leu Gln Arg 20 25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp
Ile Asn Tyr Leu Arg Pro Gly 35 40 45 Leu Lys Arg Gly Val Phe Asn
Met Gln Glu Glu Glu Thr 50 55 60 6861PRTPopulus
trichocarpaPOPTR_0015s13190.1 1st Myb Domain 68Trp Ser Pro Glu Glu
Asp Gln Arg Leu Arg Asn Tyr Val Leu Lys His 1 5 10 15 Gly His Gly
Cys Trp Ser Ser Val Pro Ile Asn Ala Gly Leu Gln Arg 20 25 30 Asn
Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro Gly 35 40
45 Leu Lys Arg Gly Thr Phe Ser Ala Gln Glu Glu Glu Thr 50 55 60
6961PRTEucalyptus grandisEUCGR.K00250.1 1st Myb Domain 69Trp Ser
Pro Glu Glu Asp Gln Lys Leu Arg Asn Tyr Val Leu Lys His 1 5 10 15
Gly His Gly Cys Trp Ser Ser Val Pro Ile Asn Thr Gly Leu Gln Arg 20
25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro
Gly 35 40 45 Leu Lys Arg Gly Met Phe Thr Met Glu Glu Glu Glu Ile 50
55 60 7061PRTEucalyptus grandisEUCGR.K00251.1 1st Myb Domain 70Trp
Ser Pro Glu Glu Asp Gln Arg Leu Arg Asn Tyr Ile Leu Asn His 1 5 10
15 Gly His Gly Tyr Trp Ser Ser Val Pro Ile Asn Thr Gly Leu Gln Arg
20 25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg
Pro Gly 35 40 45 Leu Lys Arg Gly Met Phe Thr Leu Glu Glu Glu Glu
Ile 50 55 60 7161PRTPopulus trichocarpaPOPTR_0012s13260.1 1st Myb
Domain 71Trp Ser Pro Glu Glu Asp Gln Arg Leu Gly Ser Tyr Val Phe
Gln His 1 5 10 15 Gly His Gly Cys Trp Ser Ser Val Pro Ile Asn Ala
Gly Leu Gln Arg 20 25 30 Thr Gly Lys Ser Cys Arg Leu Arg Trp Ile
Asn Tyr Leu Arg Pro Gly 35 40 45 Leu Lys Arg Gly Ala Phe Ser Thr
Asp Glu Glu Glu Thr 50 55 60 7261PRTGlycine maxGlyma16g31280.1 1st
Myb Domain 72Trp Ser Pro Glu Glu Asp Asn Lys Leu Arg Asn His Ile
Ile Lys His 1 5 10 15 Gly His Gly Cys Trp Ser Ser Val Pro Ile Lys
Ala Gly Leu Gln Arg 20 25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp
Ile Asn Tyr Leu Arg Pro Gly 35 40 45 Leu Lys Arg Gly Val Phe Ser
Lys His Glu Glu Asp Thr 50 55 60 7361PRTGlycine maxGlyma09g25590.1
1st Myb Domain 73Trp Ser Pro Glu Glu Asp Asn Lys Leu Arg Asn His
Ile Ile Lys His 1 5 10 15 Gly His Gly Cys Trp Ser Ser Val Pro Ile
Lys Ala Gly Leu Gln Arg 20 25 30 Asn Gly Lys Ser Cys Arg Leu Arg
Trp Ile Asn Tyr Leu Arg Pro Gly 35 40 45 Leu Lys Arg Gly Val Phe
Ser Lys His Glu Lys Asp Thr 50 55 60 7461PRTSolanum
lycopersicumSolyc03g025870.2.1 1st Myb Domain 74Trp Ser Pro Asp Glu
Asp Asp Arg Leu Lys Asn Tyr Met Ile Lys His 1 5 10 15 Gly His Gly
Cys Trp Ser Ser Val Pro Ile Asn Ala Gly Leu Gln Arg 20 25 30 Asn
Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro Gly 35 40
45 Leu Lys Arg Gly Ala Phe Ser Leu Glu Glu Glu Asp Ile 50 55 60
7561PRTVitis viniferaGSVIVT01028984001 1st Myb Domain 75Trp Ser Pro
Glu Glu Asp Ala Arg Leu Arg Asn Tyr Val Leu Lys Tyr 1 5 10 15 Gly
Leu Gly Cys Trp Ser Ser Val Pro Val Asn Ala Gly Leu Gln Arg 20 25
30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro Gly
35 40 45 Leu Lys Arg Gly Met Phe Thr Ile Glu Glu Glu Glu Thr 50 55
60 7661PRTEucalyptus grandisEUCGR.A02796.1 1st Myb Domain 76Trp Ser
Pro Asp Glu Asp Gln Arg Leu Arg Asn Tyr Ile His Lys His 1 5 10 15
Gly Tyr Ser Cys Trp Ser Ser Val Pro Ile Asn Ala Gly Leu Gln Arg 20
25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro
Gly 35 40 45 Leu Lys Arg Gly Ala Phe Thr Val Gln Glu Glu Glu Thr 50
55 60 7761PRTArabidopsis thalianaAT3G48920.1 1st Myb Domain 77Trp
Ser Pro Glu Glu Asp Glu Lys Leu Arg Ser His Val Leu Lys Tyr 1 5 10
15 Gly His Gly Cys Trp Ser Thr Ile Pro Leu Gln Ala Gly Leu Gln Arg
20 25 30 Asn Gly Lys Ser Cys Arg Leu Arg Trp Val Asn Tyr Leu Arg
Pro Gly 35 40 45 Leu Lys Lys Ser Leu Phe Thr Lys Gln Glu Glu Thr
Ile 50 55 60 7846PRTArabidopsis thalianaAT5G52260.1 SANT1 domain
78Leu Trp Ser Pro Glu Glu Asp Gln Lys Leu Lys Ser Phe Ile Leu Ser 1
5 10 15 Arg Gly His Ala Cys Trp Thr Thr Val Pro Ile Leu Ala Gly Leu
Gln 20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr
Leu 35 40 45 7946PRTArabidopsis thalianaAT4G25560.1 SANT1 domain
79Leu Trp Ser Pro Glu Glu Asp Glu Lys Leu Arg Ser Phe Ile Leu Ser 1
5 10 15 Tyr Gly His Ser Cys Trp Thr Thr Val Pro Ile Lys Ala Gly Leu
Gln 20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr
Leu 35 40 45 8046PRTOryza sativaLOC_Os04g45020.1 SANT1 domain 80Leu
Trp Ser Pro Glu Glu Asp Gln Lys Leu Arg Asp Phe Ile Leu Arg 1 5 10
15 Tyr Gly His Gly Cys Trp Ser Ala Val Pro Val Lys Ala Gly Leu Gln
20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu 35
40 45 8146PRTBrachypodium distachyonBradi5g16672.1 SANT1 domain
81Leu Trp Ser Pro Glu Glu Asp Gln Lys Leu Arg Asp Tyr Ile Ile Arg 1
5 10 15 Tyr Gly His Ser Cys Trp Ser Thr Val Pro Val Lys Ala Gly Leu
Gln 20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr
Leu 35 40 45 8246PRTZea maysGRMZM2G170049_T01 SANT1 domain 82Leu
Trp Ser Pro Glu Glu Asp Gln Lys Leu Arg Asp Tyr Ile Leu Leu 1 5 10
15 His Gly His Gly Cys Trp Ser Ala Leu Pro Ala Lys Ala Gly Leu Gln
20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu 35
40 45 8346PRTSetaria italicaSi012304m SANT1 domain 83Leu Trp Ser
Pro Glu Glu Asp Glu Lys Leu Arg Asp Phe Ile Leu Arg 1 5 10 15 Tyr
Gly His Gly Cys Trp Ser Ala Leu Pro Ala Lys Ala Gly Leu Gln 20 25
30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu 35 40 45
8446PRTCitrus clementinaclementine0.9_033485m SANT1 domain 84Leu
Trp Ser Pro Glu Glu Asp Gln Arg Leu Lys Asn Tyr Val Leu Gln 1 5 10
15 His Gly His Pro Cys Trp Ser Ser Val Pro Ile Asn Ala Gly Leu Gln
20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu 35
40 45 8546PRTPopulus trichocarpaPOPTR_0015s13190.1 SANT1 domain
85Leu Trp Ser Pro Glu Glu Asp Gln Arg Leu Arg Asn Tyr Val Leu Lys 1
5 10 15 His Gly His Gly Cys Trp Ser Ser Val Pro Ile Asn Ala Gly Leu
Gln 20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr
Leu 35 40 45 8646PRTEucalyptus grandisEucgr.K00250.1 SANT1 domain
86Leu Trp Ser Pro Glu Glu Asp Gln Lys Leu Arg Asn Tyr Val Leu Lys 1
5 10 15 His Gly His Gly Cys Trp Ser Ser Val Pro Ile Asn Thr Gly Leu
Gln 20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr
Leu 35 40 45 8746PRTEucalyptus grandisEucgr.K00251.1 SANT1 domain
87Leu Trp Ser Pro Glu Glu Asp Gln Arg Leu Arg Asn Tyr Ile Leu Asn 1
5 10 15 His Gly His Gly Tyr Trp Ser Ser Val Pro Ile Asn Thr Gly Leu
Gln 20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr
Leu 35 40 45 8846PRTPopulus trichocarpaPOPTR_0012s13260.1 SANT1
domain 88Leu Trp Ser Pro Glu Glu Asp Gln Arg Leu Gly Ser Tyr Val
Phe Gln 1 5 10 15 His Gly His Gly Cys Trp Ser Ser Val Pro Ile Asn
Ala Gly Leu Gln 20 25 30 Arg Thr Gly Lys Ser Cys Arg Leu Arg Trp
Ile Asn Tyr Leu 35 40 45 8946PRTGlycine maxGlyma16g31280.1 SANT1
domain 89Leu Trp Ser Pro Glu Glu Asp Asn Lys Leu Arg Asn His Ile
Ile Lys 1 5 10 15 His Gly His Gly Cys Trp Ser Ser Val Pro Ile Lys
Ala Gly Leu Gln 20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp
Ile Asn Tyr Leu 35 40 45 9046PRTGlycine maxGlyma09g25590.1 SANT1
domain 90Leu Trp Ser Pro Glu Glu Asp Asn Lys Leu Arg Asn His Ile
Ile Lys 1 5 10 15 His Gly His Gly Cys Trp Ser Ser Val Pro Ile Lys
Ala Gly Leu Gln 20 25 30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp
Ile Asn Tyr Leu 35 40 45 9146PRTSolanum
lycopersicumSolyc03g025870.2.1 SANT1 domain 91Leu Trp Ser Pro Asp
Glu Asp Asp Arg Leu Lys Asn Tyr Met Ile Lys 1 5 10 15 His Gly His
Gly Cys Trp Ser Ser Val Pro Ile Asn Ala Gly Leu Gln 20 25 30 Arg
Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu 35 40 45
9246PRTVitis viniferaGSVIVT01028984001 SANT1 domain 92Leu Trp Ser
Pro Glu Glu Asp Ala Arg Leu Arg Asn Tyr Val Leu Lys 1 5 10 15 Tyr
Gly Leu Gly Cys Trp Ser Ser Val Pro Val Asn Ala Gly Leu Gln 20 25
30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu 35 40 45
9346PRTEucalyptus grandisEucgr.A02796.1 SANT1 domain 93Leu Trp Ser
Pro Asp Glu Asp Gln Arg Leu Arg Asn Tyr Ile His Lys 1 5 10 15 His
Gly Tyr Ser Cys Trp Ser Ser Val Pro Ile Asn Ala Gly Leu Gln 20 25
30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu 35 40 45
9446PRTArabidopsis thalianaAT3G48920.1 SANT1 domain 94Leu Trp Ser
Pro Glu Glu Asp Glu Lys Leu Arg Ser His Val Leu Lys 1 5 10 15 Tyr
Gly His Gly Cys Trp Ser Thr Ile Pro Leu Gln Ala Gly Leu Gln 20 25
30 Arg Asn Gly Lys Ser Cys Arg Leu Arg Trp Val Asn Tyr Leu 35 40 45
9543PRTArabidopsis thalianaAT5G52260.1 2nd Myb Domain 95Phe Ser Glu
Glu Glu Glu Glu Thr Ile Leu Thr Leu His Ser Ser Leu 1 5 10
15 Gly Asn Lys Trp Ser Arg Ile Ala Lys Tyr Leu Pro Gly Arg Thr Asp
20 25 30 Asn Glu Ile Lys Asn Tyr Trp His Ser Tyr Leu 35 40
9643PRTArabidopsis thalianaAT4G25560.1 2nd Myb Domain 96Ile Ser Ala
Glu Glu Glu Glu Thr Ile Leu Thr Phe His Ser Ser Leu 1 5 10 15 Gly
Asn Lys Trp Ser Gln Ile Ala Lys Phe Leu Pro Gly Arg Thr Asp 20 25
30 Asn Glu Ile Lys Asn Tyr Trp His Ser His Leu 35 40 9743PRTOryza
sativaLOC_Os04g45020.1 2nd Myb Domain 97Phe Ser Arg Glu Glu Glu Glu
Thr Val Met Asn Leu His Ala Thr Met 1 5 10 15 Gly Asn Lys Trp Ser
Gln Ile Ala Arg His Leu Pro Gly Arg Thr Asp 20 25 30 Asn Glu Val
Lys Asn Tyr Trp Asn Ser Tyr Leu 35 40 9843PRTBrachypodium
distachyonBradi5g16672.1 2nd Myb Domain 98Phe Ser Gln Glu Glu Glu
Glu Thr Val Met Ser Leu His Ala Thr Leu 1 5 10 15 Gly Asn Lys Trp
Ser Arg Ile Ala Gln His Leu Pro Gly Arg Thr Asp 20 25 30 Asn Glu
Val Lys Asn Tyr Trp Asn Ser Tyr Leu 35 40 9943PRTZea
maysGRMZM2G170049_T01 2nd Myb Domain 99Phe Ser Pro Glu Glu Glu Glu
Thr Val Met Ser Leu His Ala Thr Leu 1 5 10 15 Gly Asn Lys Trp Ser
Arg Ile Ala Arg His Leu Pro Gly Arg Thr Asp 20 25 30 Asn Glu Val
Lys Asn Tyr Trp Asn Ser Tyr Leu 35 40 10043PRTSetaria
italicaSi012304m 2nd Myb Domain 100Phe Ser Arg Glu Glu Glu Glu Thr
Val Met Ser Leu His Ala Lys Leu 1 5 10 15 Gly Asn Lys Trp Ser Gln
Ile Ala Arg His Leu Pro Gly Arg Thr Asp 20 25 30 Asn Glu Val Lys
Asn Tyr Trp Asn Ser Tyr Leu 35 40 10143PRTCitrus
clementinaclementine0.9_033485m 2nd Myb Domain 101Phe Asn Met Gln
Glu Glu Glu Thr Ile Leu Thr Val His Arg Leu Leu 1 5 10 15 Gly Asn
Lys Trp Ser Gln Ile Ala Gln His Leu Pro Gly Arg Thr Asp 20 25 30
Asn Glu Ile Lys Asn Tyr Trp His Ser His Leu 35 40 10243PRTPopulus
trichocarpaPOPTR_0015s13190.1 2nd Myb Domain 102Phe Ser Ala Gln Glu
Glu Glu Thr Ile Leu Ala Leu His His Met Leu 1 5 10 15 Gly Asn Lys
Trp Ser Gln Ile Ala Gln His Leu Pro Gly Arg Thr Asp 20 25 30 Asn
Glu Ile Lys Asn His Trp His Ser Tyr Leu 35 40 10343PRTEucalyptus
grandisEUCGR.K00250.1 2nd Myb Domain 103Phe Thr Met Glu Glu Glu Glu
Ile Ile Phe Ser Leu His His Leu Ile 1 5 10 15 Gly Asn Lys Trp Ser
Gln Ile Ala Lys His Leu Pro Gly Arg Thr Asp 20 25 30 Asn Glu Ile
Lys Asn His Trp His Ser Tyr Leu 35 40 10443PRTEucalyptus
grandisEUCGR.K00251.1 2nd Myb Domain 104Phe Thr Leu Glu Glu Glu Glu
Ile Ile Leu Ser Leu His Arg Leu Ile 1 5 10 15 Gly Asn Lys Trp Ser
Gln Ile Ala Lys His Leu Pro Gly Arg Thr Asp 20 25 30 Asn Glu Ile
Lys Asn His Trp His Ser Tyr Leu 35 40 10543PRTPopulus
trichocarpaPOPTR_0012s13260.1 2nd Myb Domain 105Phe Ser Thr Asp Glu
Glu Glu Thr Ile Leu Thr Leu His Arg Met Leu 1 5 10 15 Gly Asn Lys
Trp Ser Gln Ile Ala Gln His Leu Pro Gly Arg Thr Asp 20 25 30 Asn
Glu Ile Lys Asn His Trp His Ser Tyr Leu 35 40 10643PRTGlycine
maxGlyma16g31280.1 2nd Myb Domain 106Phe Ser Lys His Glu Glu Asp
Thr Ile Met Val Leu His His Met Leu 1 5 10 15 Gly Asn Lys Trp Ser
Gln Ile Ala Gln His Leu Pro Gly Arg Thr Asp 20 25 30 Asn Glu Ile
Lys Asn Tyr Trp His Ser Tyr Leu 35 40 10743PRTGlycine
maxGlyma09g25590.1 2nd Myb Domain 107Phe Ser Lys His Glu Lys Asp
Thr Ile Met Ala Leu His His Met Leu 1 5 10 15 Gly Asn Lys Trp Ser
Gln Ile Ala Gln His Leu Pro Gly Arg Thr Asp 20 25 30 Asn Glu Val
Lys Asn Tyr Trp His Ser Tyr Leu 35 40 10843PRTSolanum
lycopersicumSolyc03g025870.2.1 2nd Myb Domain 108Phe Ser Leu Glu
Glu Glu Asp Ile Ile Leu Thr Leu His Ala Met Phe 1 5 10 15 Gly Asn
Lys Trp Ser Gln Ile Ala Gln Gln Leu Pro Gly Arg Thr Asp 20 25 30
Asn Glu Ile Lys Asn His Trp His Ser Tyr Leu 35 40 10943PRTVitis
viniferaGSVIVT01028984001 2nd Myb Domain 109Phe Thr Ile Glu Glu Glu
Glu Thr Ile Met Ala Leu His Arg Leu Leu 1 5 10 15 Gly Asn Lys Trp
Ser Gln Ile Ala Gln Asn Phe Pro Gly Arg Thr Asp 20 25 30 Asn Glu
Ile Lys Asn Tyr Trp His Ser Cys Leu 35 40 11043PRTEucalyptus
grandisEUCGR.A02796.1 2nd Myb Domain 110Phe Thr Val Gln Glu Glu Glu
Thr Ile Leu Asn Leu His His Leu Leu 1 5 10 15 Gly Asn Lys Trp Ser
Gln Ile Ala Gln His Leu Pro Gly Arg Thr Asp 20 25 30 Asn Glu Ile
Lys Asn His Trp His Ser Tyr Leu 35 40 11143PRTArabidopsis
thalianaAT3G48920.1 2nd Myb Domain 111Phe Thr Lys Gln Glu Glu Thr
Ile Leu Leu Ser Leu His Ser Met Leu 1 5 10 15 Gly Asn Lys Trp Ser
Gln Ile Ser Lys Phe Leu Pro Gly Arg Thr Asp 20 25 30 Asn Glu Ile
Lys Asn Tyr Trp His Ser Asn Leu 35 40 11244PRTArabidopsis
thalianaAT4G25560.1 SANT2 domain 112Met Ile Ser Ala Glu Glu Glu Glu
Thr Ile Leu Thr Phe His Ser Ser 1 5 10 15 Leu Gly Asn Lys Trp Ser
Gln Ile Ala Lys Phe Leu Pro Gly Arg Thr 20 25 30 Asp Asn Glu Ile
Lys Asn Tyr Trp His Ser His Leu 35 40 11344PRTArabidopsis
thalianaAT5G52260.1 SANT2 domain 113Ser Phe Ser Glu Glu Glu Glu Glu
Thr Ile Leu Thr Leu His Ser Ser 1 5 10 15 Leu Gly Asn Lys Trp Ser
Arg Ile Ala Lys Tyr Leu Pro Gly Arg Thr 20 25 30 Asp Asn Glu Ile
Lys Asn Tyr Trp His Ser Tyr Leu 35 40 11444PRTOryza
sativaLOC_Os04g45020.1 SANT2 domain 114Met Phe Ser Arg Glu Glu Glu
Glu Thr Val Met Asn Leu His Ala Thr 1 5 10 15 Met Gly Asn Lys Trp
Ser Gln Ile Ala Arg His Leu Pro Gly Arg Thr 20 25 30 Asp Asn Glu
Val Lys Asn Tyr Trp Asn Ser Tyr Leu 35 40 11544PRTBrachypodium
distachyonBradi5g16672.1 SANT2 domain 115Met Phe Ser Gln Glu Glu
Glu Glu Thr Val Met Ser Leu His Ala Thr 1 5 10 15 Leu Gly Asn Lys
Trp Ser Arg Ile Ala Gln His Leu Pro Gly Arg Thr 20 25 30 Asp Asn
Glu Val Lys Asn Tyr Trp Asn Ser Tyr Leu 35 40 11644PRTZea
maysGRMZM2G170049_T01 SANT2 domain 116Met Phe Ser Pro Glu Glu Glu
Glu Thr Val Met Ser Leu His Ala Thr 1 5 10 15 Leu Gly Asn Lys Trp
Ser Arg Ile Ala Arg His Leu Pro Gly Arg Thr 20 25 30 Asp Asn Glu
Val Lys Asn Tyr Trp Asn Ser Tyr Leu 35 40 11744PRTSetaria
italicaSi012304m SANT2 domain 117Met Phe Ser Arg Glu Glu Glu Glu
Thr Val Met Ser Leu His Ala Lys 1 5 10 15 Leu Gly Asn Lys Trp Ser
Gln Ile Ala Arg His Leu Pro Gly Arg Thr 20 25 30 Asp Asn Glu Val
Lys Asn Tyr Trp Asn Ser Tyr Leu 35 40 11844PRTCitrus
clementinaclementine0.9_033485m SANT2 domain 118Val Phe Asn Met Gln
Glu Glu Glu Thr Ile Leu Thr Val His Arg Leu 1 5 10 15 Leu Gly Asn
Lys Trp Ser Gln Ile Ala Gln His Leu Pro Gly Arg Thr 20 25 30 Asp
Asn Glu Ile Lys Asn Tyr Trp His Ser His Leu 35 40 11944PRTPopulus
trichocarpaPOPTR_0015s13190.1 SANT2 domain 119Thr Phe Ser Ala Gln
Glu Glu Glu Thr Ile Leu Ala Leu His His Met 1 5 10 15 Leu Gly Asn
Lys Trp Ser Gln Ile Ala Gln His Leu Pro Gly Arg Thr 20 25 30 Asp
Asn Glu Ile Lys Asn His Trp His Ser Tyr Leu 35 40
12044PRTEucalyptus grandisEucgr.K00250.1 SANT2 domain 120Met Phe
Thr Met Glu Glu Glu Glu Ile Ile Phe Ser Leu His His Leu 1 5 10 15
Ile Gly Asn Lys Trp Ser Gln Ile Ala Lys His Leu Pro Gly Arg Thr 20
25 30 Asp Asn Glu Ile Lys Asn His Trp His Ser Tyr Leu 35 40
12144PRTEucalyptus grandisEucgr.K00251.1 SANT2 domain 121Met Phe
Thr Leu Glu Glu Glu Glu Ile Ile Leu Ser Leu His Arg Leu 1 5 10 15
Ile Gly Asn Lys Trp Ser Gln Ile Ala Lys His Leu Pro Gly Arg Thr 20
25 30 Asp Asn Glu Ile Lys Asn His Trp His Ser Tyr Leu 35 40
12244PRTPopulus trichocarpaPOPTR_0012s13260.1 SANT2 domain 122Ala
Phe Ser Thr Asp Glu Glu Glu Thr Ile Leu Thr Leu His Arg Met 1 5 10
15 Leu Gly Asn Lys Trp Ser Gln Ile Ala Gln His Leu Pro Gly Arg Thr
20 25 30 Asp Asn Glu Ile Lys Asn His Trp His Ser Tyr Leu 35 40
12344PRTGlycine maxGlyma16g31280.1 SANT2 domain 123Val Phe Ser Lys
His Glu Glu Asp Thr Ile Met Val Leu His His Met 1 5 10 15 Leu Gly
Asn Lys Trp Ser Gln Ile Ala Gln His Leu Pro Gly Arg Thr 20 25 30
Asp Asn Glu Ile Lys Asn Tyr Trp His Ser Tyr Leu 35 40
12444PRTGlycine maxGlyma09g25590.1 SANT2 domain 124Val Phe Ser Lys
His Glu Lys Asp Thr Ile Met Ala Leu His His Met 1 5 10 15 Leu Gly
Asn Lys Trp Ser Gln Ile Ala Gln His Leu Pro Gly Arg Thr 20 25 30
Asp Asn Glu Val Lys Asn Tyr Trp His Ser Tyr Leu 35 40
12544PRTSolanum lycopersicumSolyc03g025870.2.1 SANT2 domain 125Ala
Phe Ser Leu Glu Glu Glu Asp Ile Ile Leu Thr Leu His Ala Met 1 5 10
15 Phe Gly Asn Lys Trp Ser Gln Ile Ala Gln Gln Leu Pro Gly Arg Thr
20 25 30 Asp Asn Glu Ile Lys Asn His Trp His Ser Tyr Leu 35 40
12644PRTVitis viniferaGSVIVT01028984001 SANT2 domain 126Met Phe Thr
Ile Glu Glu Glu Glu Thr Ile Met Ala Leu His Arg Leu 1 5 10 15 Leu
Gly Asn Lys Trp Ser Gln Ile Ala Gln Asn Phe Pro Gly Arg Thr 20 25
30 Asp Asn Glu Ile Lys Asn Tyr Trp His Ser Cys Leu 35 40
12744PRTEucalyptus grandisEucgr.A02796.1 SANT2 domain 127Ala Phe
Thr Val Gln Glu Glu Glu Thr Ile Leu Asn Leu His His Leu 1 5 10 15
Leu Gly Asn Lys Trp Ser Gln Ile Ala Gln His Leu Pro Gly Arg Thr 20
25 30 Asp Asn Glu Ile Lys Asn His Trp His Ser Tyr Leu 35 40
12844PRTArabidopsis thalianaAT3G48920.1 SANT2 domain 128Leu Phe Thr
Lys Gln Glu Glu Thr Ile Leu Leu Ser Leu His Ser Met 1 5 10 15 Leu
Gly Asn Lys Trp Ser Gln Ile Ser Lys Phe Leu Pro Gly Arg Thr 20 25
30 Asp Asn Glu Ile Lys Asn Tyr Trp His Ser Asn Leu 35 40
12958PRTArabidopsis thalianamisc_feature(4)..(4)Xaa is Asp or Glu
129Trp Ser Pro Xaa Glu Asp Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15 Gly Xaa Xaa Xaa Trp Xaa Xaa Xaa Pro Xaa Xaa Xaa Gly Leu
Gln Arg 20 25 30 Xaa Gly Lys Ser Cys Arg Leu Arg Trp Xaa Asn Tyr
Leu Arg Pro Gly 35 40 45 Leu Lys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Glu 50
55 13039PRTArabidopsis thalianamisc_feature(2)..(4)Xaa can be any
naturally occurring amino acid 130Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa
His Xaa Xaa Xaa Gly Asn Lys Trp 1 5 10 15 Ser Xaa Ile Xaa Xaa Xaa
Xaa Pro Gly Arg Thr Asp Asn Glu Xaa Lys 20 25 30 Asn Xaa Trp Xaa
Ser Xaa Leu 35 13146PRTArabidopsis thalianamisc_feature(5)..(5)Xaa
can be any naturally occurring amino acid 131Leu Trp Ser Pro Xaa
Glu Asp Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 15 Xaa Gly Xaa
Xaa Xaa Trp Xaa Xaa Xaa Pro Xaa Xaa Xaa Gly Leu Gln 20 25 30 Arg
Xaa Gly Lys Ser Cys Arg Leu Arg Trp Xaa Asn Tyr Leu 35 40 45
13239PRTArabidopsis thalianamisc_feature(2)..(8)Xaa can be any
naturally occurring amino acid 132Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa
His Xaa Xaa Xaa Gly Asn Lys Trp 1 5 10 15 Ser Xaa Ile Xaa Xaa Xaa
Xaa Pro Gly Arg Thr Asp Asn Glu Xaa Lys 20 25 30 Asn Xaa Trp Xaa
Ser Xaa Leu 35 1331009DNASolanum lycopersicumRBCS3 (Ribulose
1,5-bisphosphate carboxylase, small subunit 3) leaf-specific
promoter 133aaatggagta atatggataa tcaacgcaac tatatagaga aaaaataata
gcgctaccat 60atacgaaaaa tagtaaaaaa ttataataat gattcagaat aaattattaa
taactaaaaa 120gcgtaaagaa ataaattaga gaataagtga tacaaaattg
gatgttaatg gatacttctt 180ataattgctt aaaaggaata caagatggga
aataatgtgt tattattatt gatgtataaa 240gaatttgtac aatttttgta
tcaataaagt tccaaaaata atctttaaaa aataaaagta 300cccttttatg
aactttttat caaataaatg aaatccaata ttagcaaaac attgatatta
360ttactaaata tttgttaaat taaaaaatat gtcattttat tttttaacag
atatttttta 420aagtaaatgt tataaattac gaaaaaggga ttaatgagta
tcaaaacagc ctaaatggga 480ggagacaata acagaaattt gctgtagtaa
ggtggcttaa gtcatcattt aatttgatat 540tataaaaatt ctaattagtt
tatagtcttt cttttcctct tttgtttgtc ttgtatgcta 600aaaaaggtat
attatatcta taaattatgt agcataatga ccacatctgg catcatcttt
660acacaattca cctaaatatc tcaagcgaag ttttgccaaa actgaagaaa
agatttgaac 720aacctatcaa gtaacaaaaa tcccaaacaa tatagtcatc
tatattaaat cttttcaatt 780gaagaaattg tcaaagacac atacctctat
gagttttttc atcaattttt ttttcttttt 840taaactgtat ttttaaaaaa
atattgaata aaacatgtcc tattcattag tttgggaact 900ttaagataag
gagtgtgtaa tttcagaggc tattaatttt gaaatgtcaa gagccacata
960atccaatggt tatggttgct cttagatgag gttattgctt taggtgaaa
10091341714DNAArabidopsis thalianaRBCS4 leaf-specific promoter
sequence 134caaatttatt atgtgttttt tttccgtggt cgagattgtg tattattctt
tagttattac 60aagactttta gctaaaattt gaaagaattt actttaagaa aatcttaaca
tctgagataa 120tttcagcaat agattatatt tttcattact ctagcagtat
ttttgcagat caatcgcaac 180atatatggtt gttagaaaaa atgcactata
tatatatata ttattttttc aattaaaagt 240gcatgatata taatatatat
atatatatat atgtgtgtgt gtatatggtc aaagaaattc 300ttatacaaat
atacacgaac acatatattt gacaaaatca aagtattaca ctaaacaatg
360agttggtgca tggccaaaac aaatatgtag attaaaaatt ccagcctcca
aaaaaaaatc 420caagtgttgt aaagcattat atatatatag tagatcccaa
atttttgtac aattccacac 480tgatcgaatt tttaaagttg aatatctgac
gtaggatttt tttaatgtct tacctgacca 540tttactaata acattcatac
gttttcattt gaaatatcct ctataattat attgaatttg 600gcacataata
agaaacctaa ttggtgattt attttactag taaatttctg gtgatgggct
660ttctactaga aagctctcgg aaaatcttgg accaaatcca tattccatga
cttcgattgt 720taaccctatt agttttcaca aacatactat caatatcatt
gcaacggaaa aggtacaagt 780aaaacattca atccgatagg gaagtgatgt
aggaggttgg gaagacaggc ccagaaagag 840atttatctga cttgttttgt
gtatagtttt caatgttcat aaaggaagat ggagacttga 900gaagtttttt
ttggactttg tttagctttg ttgggcgttt ttttttttga tcaataactt
960tgttgggctt atgatttgta atattttcgt ggactcttta gtttatttag
acgtgctaac 1020tttgttgggc ttatgacttg
ttgtaacata ttgtaacaga tgacttgatg tgcgactaat 1080ctttacacat
taaacatagt tctgtttttt gaaagttctt attttcattt ttatttgaat
1140gttatatatt tttctatatt tataattcta gtaaaaggca aattttgctt
ttaaatgaaa 1200aaaatatata ttccacagtt tcacctaatc ttatgcattt
agcagtacaa attcaaaaat 1260ttcccatttt tattcatgaa tcataccatt
atatattaac taaatccaag gtaaaaaaaa 1320ggtatgaaag ctctatagta
agtaaaatat aaattcccca taaggaaagg gccaagtcca 1380ccaggcaagt
aaaatgagca agcaccactc caccatcaca caatttcact catagataac
1440gataagattc atggaattat cttccacgtg gcattattcc agcggttcaa
gccgataagg 1500gtctcaacac ctctccttag gcctttgtgg ccgttaccaa
gtaaaattaa cctcacacat 1560atccacactc aaaatccaac ggtgtagatc
ctagtccact tgaatctcat gtatcctaga 1620ccctccgatc actccaaagc
ttgttctcat tgttgttatc attatatata gatgaccaaa 1680gcactagacc
aaacctcagt cacacaaaga gtaa 17141351923DNAArabidopsis
thalianaAt4g01060 (G682) promoter sequence 135ttattaagtg ctatgcgtta
atcggcatct ataaagtgtt gcattgatga acaaagtgga 60tgcctaaact agacgtttaa
ctaaatgttt agaatgaaat cttcatctca tctaaaaagt 120gttgcattga
tgtaaaaagt ggatgcccat tagttcttgg ctttgaaatg tttttagaat
180gaaatcttca tcaatctcca tatgtggttc aatccactca ttttatcttt
tgttaaagat 240gttcttcagg ccaatataat gatgaccatg gatggtttgc
aactcgcata taacacttct 300ttatccgatg gttacaagta ttacatggct
atagatagct tttgcatgca acaaattatc 360tatcaaagtt tatgcatcct
ctaaaatatg gtcattggca agccactaaa cgtatatatt 420gtgacaatgt
atgatgatat atttttatgt gttgactccg tttttcatta agtaatgaaa
480catgttgctc tagattacca ttttaatcgc aaacatatat gttgctctac
cacgcattgc 540atcaaatgat aagcttgagg acgcttcgac aaaaacatat
ttctccttct tcttactgaa 600ccaaatgaca attgacaaac cccattaata
aaatcggtta gtgttaatgt gtcactcata 660atattaactt agtaaagaac
aagaccacat taattaaatc aggtgttagt ctgagaatat 720acgtttctct
tcctcattcc aaacttaaat tcggaattta ctgagaatat attgttagca
780ctgaaaaagg ttaagttgaa agtttgctag ggatggcaat taaatatagt
cttgccttgg 840ggatattccc ttcgtggact tgtaagttta tttataggtc
ctcttatgta tatatagatg 900atctaacgat cgatatacta tgaaaaaagt
tgttactaga ttttattgca ggtaatagtg 960ttgaataacc cgaaccaata
aagcagttgt aacgaacaca cgacacgttg cttactgcga 1020ggaccacttt
gttttttgtt ttttttggct ttaagccaat ttaggaccaa atttgcatga
1080ttgaggatgc aagtatccaa cccattttca tctttcgtag tgacactcat
ttacttttgt 1140gatggacacg ttatagtata tcttaaatat taaagagaca
tgattggggg atcattgttt 1200taatttaaat aatgtagatt ctattctttt
catggtatta atccaattta tagaaagtta 1260tgtgttatta gcaattaagc
taaatgatga aaacaatcag tttagtgaaa caaactcgcc 1320gagaaaacat
gaatggttga aaatattatt gtgttttaca aacgtacacg aggacaatag
1380ttttgtaagt ttttcttagg cattgaaaaa tgtttgatac aaaaagtaat
gttaaaataa 1440ttaaaaatga ttttgtctta atatatccaa aatttcaatc
tattatgaac aaagggagta 1500taatttctga ttgaatgaac tggaatagca
atcagaaaag ctttgaaaac aattgttgtt 1560gattattaat gatcttaatt
aacggcatgt atcaatattt atacaactta tgttccagtc 1620caagccatca
caacggagta aatgaagtca cgggtacttg tggtttttat tggttgcaaa
1680cttgcaactt gcaaagatag ctaacaataa ttaatataat taatgagaac
aaaaccaatt 1740tagtaaatta aaatccttta acatagaaac cgaccaaacc
cgttggaccg ttggttactt 1800gatttggtta gttgctataa atagaaatga
tggttcgtgt gcaaccttca aaatacgacc 1860actctctcag agtactctct
tagtttcttt cttcttcttc tttgtaatac ggtgccgttt 1920gac
19231361500DNAOryza sativaOs02g09720 location Chr2 4997678-4999177
136ggccgtcgtc ggcgagttct cagctatagc ttggtagcta tctagctcaa
tttgctgtct 60ccaagtgtga cagctagttg taattgagtt ggtatagctt tggcctcttc
ttttattttt 120ttcaggccgt ttagatggta acctctctga cttgagagag
ttagagaggc cttgtatata 180cggagtatat agtactccag tttagttgcc
atgccagcag ttgcatgcac tagtaagcaa 240ctagtctctc ggtttcatat
tacaactcgt tttaactaag tttatagaaa aaacatagtc 300gtatttttaa
tacaaaacaa atatattatc aaaatatatt taatgtttgg tttaattaat
360taacattggt gtttttgatg ttactaattt tttttataaa cttagttgaa
cttaaaataa 420attggttaag aaaaagttaa agcgacttgt aatatagaac
ggaggaggta gctattggtc 480aaggggatgt gcatccatgt ttggtaggta
gccgtggtgg ctttggtgtg tcattctcta 540tcttttccac acacacactt
taaatttgct ggctttttag aagaacaatg gtaatgtttc 600gcttttattt
tgctcatggt aaaaagattg tgtttcgctt ttcgaggggc aaaaattaag
660tctaattgat gctattgatt aataagatcg tgatcgcgag gaggattgag
atgtctcgat 720ggacataata gaatcttagg ttcaaggaat ttacactctt
taataaagac ttcgcttgca 780aagatgtagg ccaaatgaaa gagacttaag
catgcttgtt gatataatac tagtttattt 840tgagaattag gatttaattt
gatgcactaa acctagagta aaaaccataa gatgacctaa 900atctgtgcca
atatatttct agttggttca taattaatca acggaatgag tatgcagttt
960ttctcttaaa atgcctatag gagttaggtg ccatgctaca ccacactact
gaaactgaat 1020tggtgccggt tggaaaccgg ctatagatgt cggatatcct
aactgatacc cagtagacga 1080cacctttggc ttcagtcatc gatcccgccg
gcatctttta cttacaaagg tgtcggtttt 1140gaatacagaa ccaacaccta
tactgtcata taggtgtcag ttcttaaatt gattggcgta 1200tggacagttt
tacatgtgac ttttacaggt gacagttcgt aaatacaact ggcacctata
1260agaagcatat atgtgccatt tgtacaccgg gcaggacacg ggacaagggt
tataggtgtt 1320tgtttgtaag taaaaccagc acctatatgt tagcaagaaa
aaaaataaaa atgataggac 1380ctaccaattt cacatacata tcacacaata
acagaaattc acatcacttc aaacatacaa 1440ccacattcac attcaaccaa
actatatata ttcaattcca catccacaac cctcatccca 15001371500DNAOryza
sativaOs05g34510 location Chr520455817-20457316 (reverse
complement) 137ctatatcttc ctacctatcc ccctgcaagc taacaagggg
aattacttgc acatctaatg 60atggactggt gttgttatgc ggttgtgttg gggtgagatg
gaggatgtct ctcctccccc 120ttgctagcct ttaaaaagga gtcctacagg
ggcataacag gaatgagagg aagcttctgg 180agatcaattc tcacaacaca
catttcctac agtacagttt tgcccccaga cagatcggat 240ggtgtgtgtc
agcagcctct atgcctgtag gcatgcattc agttgaatgg ttttgtctcg
300atcgacggcg aaaaaggatt tacttttgga caagaaaaaa gagcctttaa
tccatgagat 360tactacatct cgcatacagg gacaagtatt acagctgtga
gtgatgctta agagacttta 420cggttttgtt tgcagctaag caggaattaa
tataaataat ctcccttttc agggctctgc 480ttgcatgaat agctaccagc
cagagccaag agtgccagag acatcaagcc tggatggtag 540tagtacagta
cttgcgttac tcgtgcccgg cgttgctgct tcggtggggc ccacttgaac
600tggccgtcct cccgagccgt cggtgggggt ggtccccacc accggccagt
tactttgctc 660ccgtcgattc ggcccggtgc accaacagcc cattttttta
tgatgggctt gcaacttggc 720ccaacttgtt ggattgtcaa ggccggaggg
aggcccatct gggcgaactt gtgtgggccg 780cttctttcca ttttagagag
gaaacatggg ctgggctacg ggctgcgggc tgccacctat 840caccggagaa
acacttgctc gcagcctcaa atacttgaaa ccaggctttg aaaattcggc
900aaaaatcatc aaacctccga aagtatgaga gaggaaaatc aattatgatc
ttcctaattg 960tgtcatgatt agcatagcgc atcaaagcac cttatcggtg
caaaattaat ccctttcaaa 1020accattgtac ttcatgaatc ttgctacata
tcttaaaaac gttgacgggc ttacagctta 1080ccagagatca aagcgagagg
cagaacaaag ttcaaaatct gccgtgtata acaacaacaa 1140gtaattacag
tagaaacacc ttactactca tcaatcacta ctactaacac caccacatac
1200gaatcatttc taccagatcc aaacaaagaa aaaaaaagag ggagagagaa
ctaattaaaa 1260acacgaaatc cgaaccggtt caactaatca atcttcgcca
caatctccag ctccagcgat 1320gaactgatga tcatggcggc caccaggtca
ggtgtatgta ttcacaggtt tcgccgcgat 1380ttgacatgat tggaagtgga
acgcaaactc acgcggccac tccggcgccg gcgccgatgc 1440aggagaagac
gcggacgctg cgggccttgg ggagcgctac ccgcggcagc cgccggccag
15001381500DNAOryza sativaOs11g08230 location Chr114325545-4327044
138cgccgtatcc accttcacca gctcgatctc cgtcgttgat cgttgattga
ttacgtacta 60cctgagtgat gatggacgtg gatcactgga atgggttgag aaaataaaaa
tgatttatgt 120agtgtgtagg agtactcaac taacaattga acgaccagaa
tttgacgaat tttatttaca 180tgtttagaga ccctgggtag agattcagta
ataattaagt tcttgccata tttttgcatc 240aatttaatta ttttactaaa
atgatttatg ttgttttatt ttcttggcac tatatttgca 300tcaaatataa
ttaaacttag cttaactggc cagagcttat atatggctca gatttaattc
360ccctcttatc ttcagagcat taaaaaggga gtatcctaaa atacctatct
cctaaaagat 420caatctgcct gtggctcaaa tctatataaa gtgagctccc
ctcccactgt cccacatatc 480tatatagcta ccatggcact acagcttgca
acacgctcaa ggaagtctct cgccatcgcc 540gtcgccgccg ccgtgccgct
gctgatgtgc ctcttcctcg tcgccgccgc cgccgcggct 600gcatcgtcgg
agacggcggt ggcgagctcc ccacagtacc agccgagcta tggtaatacg
660tactcgacgt gcttcgaggt ttcggcatgc gatgacaccg ggtgcgcgat
caggtgccgc 720gacatgggtc acaaccctgc tggctcagcc tgctggacca
gcaacgtcgc gaccatcttc 780tgctgctgcg gccgtggtcg tcctcctccg
gttgcttgat ccatatgtat acataattac 840atatatggta tatgtataac
tggataataa agcgtatgtg cgtgtcagcg tgagttggtg 900aagatgactt
ataacatgaa acaggagtag cgctagtaat gagatgtgtg taaaaatgtt
960atggttcaat taattaatca aagtcgatct ttgcattgca tgctgataca
tatctatcga 1020aatatatatg tactaagacc aagaatcaaa gtcgtactac
tcgtatatgt acaaaataat 1080taattttagg caaattttgc tacaggacac
cgtaattgtg tggttttagc ctagggacac 1140cgcaaaaaca aagtttgagg
aaagacacta cgtaaccgtg gatatttgcg atggacaccg 1200caccgtctaa
acgaaacttg ctatgctgac gtggcggtcg ggagcccctt ttttgacgcg
1260gtcggaccga aatgcccctg caccttacct cctcatattg cttgctggac
actaaattgt 1320ttgctggaca ctaaatacat agatagatac atcgtacata
tacatgacag atataccgac 1380atggactacg ttccttctag ccaccgctgc
cggcgagctc caccgccgtc atatggttgt 1440tgcgcctggc gtgagaggta
tatgcggcat tgcatcactg tccggccaac tttgaattaa 15001391500DNAOryza
sativaOs01g64390 location Chr137374848-37376347 (reverse
complement) 139tgtccagggc tccaggccgt gatcagcgct cgctgctacg
ctcagaccaa cacaggaaat 60tattacttgc tccctactac tacgtgcaag cagcgtaaag
ggggcaggcc actgtttgaa 120attcatccct ctcacctttt gaaacagccg
cgcgtcgtgc cgctgtagta gtgcagctgt 180gcaggtgcag tagctttggg
tcaaaactca aaactcgaaa gggagacagc accaaaaaga 240tactgcggct
gcagccacac agggctagct gcttgacgca gagagttcgt agcgacgttg
300ctcgtctggg catggcagtc gcgcctagat cggctcgtgc acggcatgcg
ttgcttgcct 360gggtacgcca ccacctgcgc agccagcgat catggccggc
gcggcgtccc acggcacgag 420agcgagcgcg cgcgcgcgcg gcactctcgt
gaaggcgtac gcgccatccg gccgcgcggc 480gcggcgcagg ccggtgcagc
gtgcaggttg acccgcgaca ggcgcacggc cgcacacgcg 540gtgcggcggg
ccggcggcct ggatttgggc ggagttctcg gtggccgtcg cttttcccgt
600gccagctgac gcgattacga gccgtacccg atcaaaaccg gcgtgatcgc
cggtccgatg 660catgcgtttt gttttcatgc cagatctttg gtaatcgtta
gtacgactct ctgagtccgc 720gagtacgctg gtttctgtta attgcagcct
acagtatttt ctttttcttt tttgcctcgg 780ggaagcgatg cgagcgatac
gtgataattt caggtggcgc agtgccgctg tgcatgtgtg 840aaatcatgca
agattggcag gttgctagcg cgcacgtact cctctctggt gtgatcggtt
900tcaggtgata gagacgatac atgggcatac cttgacacgg agtgggctgg
gatggcgcgg 960gtggtccatc ggctacagcc ttacatgctc ctcctatcaa
actctcgtta tgtccagttc 1020acttggccca tccttggttt cctctacgac
tagtgctcct ctagccagta gccactaccc 1080actgtttcca tattggcgca
cgggtcatat aagatggttt gttcctgttc acctgacggg 1140aaattctttc
tcctaaaacg atatcgaacc atccttatcc ggattggatt aacgagatcg
1200tcgtgctaag tgtgctaaca cacttgttat atgaccgtta ttccgagttc
gtcttgctgc 1260tgggagagtt gcagaattgt agtactccct tcgtttcatg
tgacaagacg ttttgacttt 1320agtcaaaatt aaattgcttt aaatttgatt
aagttcgtaa aaaaaagtac tacctccgtt 1380ttacaatgta agtcatttta
gcattaacga gcatggtggc tcgcgcaaat tgcgcggcta 1440gcatcattat
attttctctc atataatagc atatatgttt tctcattata ttattcaaat
15001401500DNAOryza sativaOs06g15760 location Chr68954266-8955765
(reverse complement) 140ctccggcgag gcggtggact cggactccga ctccgacgaa
gtcgaagaga tgatcgaaag 60gtttgggagt cgcggcggcg aggatgaggg gggtggattt
ggtatattta cgggcgcggg 120tttgggcttt ttttcgcgaa tgggccccgc
ccggtttgac ttctgcgata gggatgcccc 180tcaaaaggta gaaaaatact
caaagaaata ctcctaactc cgtttctaaa tataagtgat 240atatatatac
tactcaatcc cttttagatt ggaagatgtt ttaactttga ccaaagtcaa
300actactctaa atttaactaa ctttgtagaa aaaattagta atatttataa
cactagcata 360gtttcattaa atctataatt gaataaattt tcataatata
tttgtcttgg gttaaaaata 420ttactatttt ttttacaaaa ttagtcaaac
ttagattagt ttaaatttga ccaaaatcaa 480aacgtcttgt aacctgaaac
ggagggagta tttcttttag ctggagaata ggtgaaacta 540tccctcttca
aaattttaac cagtcggatt tttataaaaa aaaatatttt caaatatttt
600agacaagcat attaccaaat tatattgcat cttaaatata taaaagtatt
gcgaagtggt 660gctatacttt ttggcgcaaa aaaaaactca aatacttcac
atcccttttt atataaggga 720aactttcttt tactgaaata aataataagt
gaaactattt atctttagat tttagccact 780cgtatttttt agactgatgc
cacttatttt tctaaacaag gttttcaaat attcaaaaca 840atattattct
caaattataa aagcttgaac ctttgtttat cgtaatacaa ccatgtaaga
900aatttttagt ggtgaaaacg aaaaattgtc cagccaaata catgtcatct
tcacatcttc 960tactagtttg atccaccttt ggtgtttaat gttttcacaa
aaggagtaag atactacgtc 1020gcacaatttt ggaatcagtt gttggaaata
atttattcta ttttggatgg cataatactt 1080tcagactttc attatagaat
ttactatatg atcatttttt ttatatttca gctggctatc 1140atttgaaggg
atgggtgaat ttatcccatg acattaatcc cccccctccc cctaaatttt
1200tataacttat cggctgtgtc ttttatatac cacagatatg aatcagtttt
catcattatt 1260aaaaaaaatc aagacaatta ttttaaaaca atatttagcg
cggagattcg tgctgggtcg 1320tccacgtatg cagcatgggt gtcgctgtgg
ggggtttggt gattgctggt cgttgtcgtt 1380tcgtggccgg ttgcagtttg
tagccaaggg tggtggcatc tgatatctgg ccacacactt 1440tggttatgga
atgctgggtt tgctgctggt gtggtgatca accggtgggt gattggtttg
15001411500DNAOryza sativaOs12g37560 location
Chr1223047433-23048932 (reverse complement) 141tgacggcgtc
tccttcttct tgcttcttgc ttcttcttct tcctcctccc gatctggggt 60tgtaggaagc
tactggtcgt gatcaatgga gcattccgcc atggatggat tggatgggat
120ggggagaagg ggagggggag aagcgcgcgt tgctggcgat gttctctccg
cgtggggggt 180gggatgcgat gcgatgcgat gcggaggagg aggaagacga
cgaggactcg gctagtctgg 240agttaattga ttaattaatt gattaattag
gagcaggaga agtgcaagca cgacgagagt 300gaagggaaag gaaccgccgt
ttcaaaaaag atggaatttt tgcggccaac cccttctact 360agtgacgaat
cacggtcatg tttgccacaa ctttcaggct gagttcgttt ctttgatact
420ccattcgtac cgtaaaaaac cagcctaata ctagatgtga cacatcatag
tattacgaat 480ctggagatac ctctgtccag atttattgta ctcgaatatg
tcacattcag tcatatattc 540gttttttttg gacgggggag tatctaattc
actcctcggc taatgattaa ttaatcatgt 600actaatggat cactctgttt
tccatgaaca cagcctcggg ttaggtctta gatccacgac 660agatttaaat
ttttaagttt tatttaaaat atgtgtaaaa tgatttatcc aaatgttata
720gacataattg aggatctaaa tacatggatt tgtggagctt gaaatattca
gcttctaaaa 780atcttgaatt tggagctatg ctaaagagga cctacgtatc
ctgcagttaa cttattccaa 840aaagaaaaac aatatccaaa cctgatgctt
aaaaaaacac actgaaaatt aaaggcgtta 900tataaaaagt atatataaag
tattaaactt gacaaaatgt gttggagaaa ccatgcgtaa 960aactccagat
attacctgga gacaaaaggg catgtagtca atgattgcaa aacaaattct
1020ctggaagtaa cggaaataat tgaaagttat acatatccaa gtaggattta
ctacttacat 1080agccagccat tatccactga taattgctgc tattatcacc
ataatcaacc gaataatcag 1140ccaagaggtt atcccaacga taaatgactt
tgatctctcc tttggtttaa ctcaaatgaa 1200tgataatggc ttatcaaatg
accagtcatc ttgactacaa aatcatatga taattaacct 1260cattaattta
tcactagtgg ctgataaaat gggatataga gaatgctttg gaattcatga
1320tatttgctaa tttataggta aatttctaac aaacatatgg taaaataaac
ccccataata 1380atcgtactac caaactatac caaacaagta ctcttagcat
gtatatatag ccacaccgat 1440caactagaag actcaaaata ccaataggtt
gtcggcatcg cctaaccgtg atcagacatc 15001421500DNAOryza
sativaOs03g17420 location Chr39689781-9691280 (reverse complement)
142gtgttcgcac gacgataggg ctctgacttc ttatggaaga tgaacatgat
aaacgtagat 60gccttttcat atagtacgac ctacctgaag acatgagatg agatttatta
gacgttttgc 120tggtcaacac acaacacgtc gatgactcgt gcgacgcaaa
tgcaaatcca gatgagttag 180tggaccctca cacttataca taggagtagg
tgtcatctga gtcaatatac tagcaagagt 240gagtttgaaa gagatgttgg
gattgagctt ttttttagca catataaaac gagataagtc 300attactatat
gattaattaa atattagcta ttgcaatttg acttctatgt aaaaattttt
360tttcaaaaac atatacatta gtttgcaaag cgatgaaata gtttacccaa
cttctcttta 420gaactcagcc taaataaatc taacccctaa atatgctaaa
tgtgccagcc ctagtccaaa 480atttaatgag acctaccata agaatgttga
catgacatct tcaccaatcc tacaaaccct 540taaaaagtta aaagtttgaa
gaaagttgga agtttagaaa aaaagttaga agtttatgtg 600tgtagaaaag
tttttgatgt gatatgatgt gatggaaagt tgggaatatg ggggaaacta
660aatacggcct aacaagtatg gcacacacac gctttatttt tacctttttt
cttcttttta 720ttctgtcttt cattctcctt ttctatttct ttgcttctat
gttcttcctc aagcgagcct 780tccttgagct caagttgttg ccattgagct
catggtgggc tggtgggcat agtcgtgcac 840tcacactaag caactatcgt
taagctttcc atggctgctg taacatcccg gcccagggct 900taataggatt
aatataagac ttgtgcacac gagtgaggac aaagtgtgca gaaaagactt
960gtgttggtct gtgagggcag tctatgacct atgaaagctg ccagatgtaa
gcggaccact 1020tcttttctgg aagccgatgt ccaaagaact ttagggttaa
gcgtgcttgg cctggagcaa 1080tttgggatgg gtgaccgacc ggaaaattct
tcccaggtgc acacgagtga ggacaaagtg 1140tgcatgaaag ctgccagatg
taagcgggcc cagcttggga gaggcgggac gttacagctg 1200cgatgcttaa
gcgtccctat cggccgccac catcgagctc gtcctaccct tggtcaccac
1260cactcctcta ccatcactga gctcctatgc atccacggcc aaaacaaaaa
ccctatcctt 1320acgtatctcc tctttcacca accgaaaccc accctagcct
ccattgcttc agtcaccgcc 1380attaaacaat attagatcga aatcctaatt
cttcgccagc acaaaatgac taaattcatt 1440agtgttgctt ctcatgttcc
aaaattaaac atgtatacga gttcattgta cccggtaatc 15001431500DNAOryza
sativaOs04g51000 location Chr430185779-30187278 143gcgaggatgg
ctactagctt gctgtccttg cggtcggccg ggctcggttt ctccggttgt 60ttcttgtgtt
tcgcccacac acctctgtgc gtcgcgttgg gctcgttata tagcggccac
120tgtagtgttg ggctatagtt gctgcacgcg gttgacttga caaatctcca
tagctcgttg 180cattggtccg gatcgatgta tctgcattca cgctagcttt
tggtttttgt ccaatacttg 240gaggaaggga gcgagctacc gatcgatact
acgtgaaaac gacctgtcct gtagaaagct 300gcatgcgtcg ctagagcaca
cacgatttga tagtctagat tctagtaagc cctattacgt 360gccggtacat
aaaaagtagg cacagtaagc cttataggag taatacaccc acacattgtg
420ttgtcctgtc acggcgtacg tgcatttgta atgtggcgca cgtctcaagt
gtccttcgat 480cttttcgtcc cgtctctcct tggaggtagc aacggcgtca
ctgttcctct tccgaagaaa 540aaagatactc ctgcatgcgt actgttgttg
cgtacacggt gaacacgggt gaggccgtag 600ccttgtgccc ttaattatcg
tacgtggcgg ttgtccatgt aatcgtgtga gcggtggggg 660cagaagatta
gcgtgtcatt ctgcagattt tccccacgct agggcaactc aaaaatgttt
720ttgattccga gggcgaaaat atacttttta agaattattt taccattctg
gataattaat 780gtatcgagag acaccgtatt tttttaatat aaaacttaat
acttttagat actgtgtatt 840ttgagataaa caaaagagta aattgcatca
gcggtacacg aacttgtcag gttggtgcaa 900tctagtacat gaacttctaa
aacgctcgtt tctgtgcacg agcttgtttg atgcgtgcga 960ataatgtcaa
aatcgcactg caaggttaat cttgttgatt ctgtggttga tttaaactct
1020atgtaaacat tagtctaaga aggataaaca tataataagg cataatgatt
gtataaaaaa 1080ataaaaattt gagtaatgat gcaagggaca gaaacaaaga
aagataaagt ggaatctaag 1140aaaaatcggt cttgtcgcac cattttagtc
ttagtttgca cataccagac aaattcatgc 1200accaaaacga gacgagcatt
ttataagttc atacactaaa ttgcacgcac ctgacaaatt 1260tatgtaccgc
taatataatt tactctaaaa aatgtagtgt aaaatttctc ttcttttaaa
1320gtctcactat ggtgcattca ttggtcttat ttgattggag gatttagatc
ttttttacaa 1380ttatttcaat taacggtagt agaaatatgg agtgatattt
attgtaaaaa agaataatgt 1440gcatggtgta taattgtcat ttagcaatga
ttaaatattt gtttctcttt ctggattttt 15001441500DNAOryza
sativaOs01g01960 location Chr1522445-523944 144gacctcgagg
aactcctcgt atttctccct cttgtcctgg aacttatcct tgacggcctt 60gaggtagacg
agcgcatcgt tcgtggtgag cttctggccg gccgttgcgc cgccggcggg
120cggctgagca ggcggagcag cggcggcctg aggcggcggc gctgcagcgg
aggcaggccc 180cagaggcatg tgctgcggct gcgccgtcct acaccacgag
aggaggggca aacagttaga 240tcaaaacgca aaacaaaatc gcaccctaaa
aaacacctcc tttttttttg cgacgtagag 300ttagaaattg gaacacaaaa
tttgacggtg aagaagagag attccccaaa ggaaacccga 360attcctcccg
cagatggaaa taatatatat aattaattat tcactccctc gcgggctgag
420ggtgagggaa acgaaaccga agtgaagttt attttgaaaa ataaaacaaa
cgaaataggc 480aggaacgcct ggaagggcga aggcgatgag caccgaggaa
gccaaggatg gaggaggcga 540ggagaaaagc actcacggat cggatcgacc
gacgttgggg cgcttgagtt gggagcccat 600gagcgcgtcg tccctggcgc
gcttcatccc cccatacaca gactcccttc ctcctcctcc 660ccctcctccc
ttcacccaac caaaccaccg ccgcctccta cagaacaacc tcccggtcgc
720cgccgccgcc gccgccgccg atcctctcaa acctcggaag acgtacaccg
gcgccggatc 780tgcccgccgc tggctcctgc gagacccccc gagcgcggcg
cgaagagggc gcagcgcgag 840gcgaaggcga cgcgcgggga ggagaggcgg
ctaatcgcct cggcgagacg cgagacgcga 900gagggaagga tgggtgagga
acgaggcaag ggcgaggcga agaagaagaa gagaagcgag 960gcctcctctt
ctctttaaca caacgaagag aagagagcgc atcacaaccc atgtacaccc
1020accggtgcca ccacgctgcc ccgcgcccgt cgcgaaccac atcccgcccg
cacatcccca 1080gataagggac acgtggacgt accggatcga ccgccctagg
gtgcgaaatg gttactggcg 1140gtgatcgcac cggtgggagg gttagcttgt
ttcgtgggta aacacaacct acccattgga 1200tttggggatt ccagtgagcg
gtctggatta gcgcggggtt acttcgagat tagtgccccc 1260gggcccacct
gtcagtcgga tgagtgcctc cgatctcagg tttaacctag ctccgctagg
1320gcggcgtcgg ttcgtatatc cgcctcatgc gtgtgtgtct tctggctcag
gagacgtatt 1380tccgtgggat ccgtatatac atgggcttca tttggccatt
tgtcacgatc cctagcttcc 1440agagagggct accgctggat cttcagtttc
atctgcagta tcaaactatt aataaaaaag 15001451500DNAOryza
sativaOs05g04990 location Chr52418356-2419855 145tgataacgat
ggtgcgcctt cgtgatcgat cgagagcgtg aggagcaacg cggttcaatt 60tatagacgat
gcagatcaag ctggtgccaa gaagtggcgg gaggttgatg acgatcgatc
120gatccgagga agaagaagaa gaagaaatgt aagtggtgat ggtggtgatc
gatcgagttg 180agttgagagg cggcatgcgc gcatgcatga ggatgagggg
gaggcaatgg taatgattgt 240gagcattatg gggaggggag aggaggatct
tggtgattaa tgaacaacgt taatggtgga 300agtggtggca gtggggattt
cgaatgagat ttttcttttt aacgattagc actagacaac 360acgactatga
gaaataccca agggtcatct ggctagctct acaaggtggt aggccagatg
420atctgggttc aaagcctcgc cccttttaat tatttgatat tagtcttttt
ctaatattca 480tgtcttttac tagacagtac gattattcat cgaatagaat
atgaaaaaat tacaagatta 540attagagacc tagaagacaa tagctaggaa
agaagaaaaa aaatccacca ccacgcaccg 600acgtcatcta cacacagcca
aggagaaggt ttagcaccgg accagtcggc gctaagcgtg 660accgactgcc
gctaaaccaa caagggtatg ggaatgaccg ctaagattgc ccaagtccaa
720gctataacag ccgagcatca aacacaactc tagccacatc tataaaattg
aggggaagag 780gatgagagag aagaattgaa ccgctctcta cgctaggccc
tctaacgtgg ccaatttgtt 840taaactaggg cgccaccaca agaggatgag
acatctaaag tgagcttgca ccagttatct 900atgataagat ttgtcaaggg
ggctttctag atgacgtctc cagggacagg agcgacaatg 960acactgccgc
caccatccgt cgaggtctta aggagaacta agacaagatt ttcactcgat
1020aaccctacaa gaggagaggg gatggctcga caacaccccg aagaagtaag
atggcgcctg 1080aagcgccagc caagaccggg ctgggtttac acccgccgtc
ccgccacctg ccgatcgaaa 1140gctgtgctcc attgctacaa ccaccatcca
tcctccgtcg cgcgtgggca ccgttgcacc 1200gccggccccc gccggccaac
ctccatgcgg cctcccactc tttgcaccgg acagttgtca 1260ggccatcacc
gtcgccgacc gccggccttt gccgctccgt cacccacgcc tgccacaagc
1320cgtcggccta cgtcgcccgt gcccgtcgca agccgctggc ctctgtcgct
gtctcctccg 1380gtctctagat tcggcagcgg ctgtgccgga tctaggcgcg
ggggagcggg ggtggatgtg 1440gaggtggggg aggaaggggg gcagatctgg
cagcggtggt gccggggaca ctatctcgag 15001461500DNAOryza
sativaOs02g44970 location Chr227240720-27242219 146ggttgaattc
cacagggaaa gctcatcatt tacaattttc tggaaaacaa gtcttgactc 60agacagggcc
cctaaattag cataacaact cacaatctta gagccaagaa tcacatccca
120gcacaatcca tgtgtgaaaa catgacaatg gatcttcttt agagacctaa
tatctgcaca 180gccttggaac aatggagcaa acttgtcaaa attaagtatc
ttattagagc atgaatttgc 240atcggcagtg gatgctgata gtctccaacg
cagcttcagc actgcaattt tatcaaactt 300tgaactagtg tttccagcta
aattcccctc aaaagaattg gtagcatcat catggttgtt 360gacgcaagaa
aaacaaaaat aggtaactaa ttgactccga attcatagac atacgatgac
420agttcagtta aatgagccta caatataata gtcccaagca cacatgtgtg
agtcatgtgc 480aatgattaca tcatgtgtgc ttgggactat tatattgtag
actcatttaa ctgaactgtc 540caattgtggt tggtgacatg gttcaggagt
ctaacggtgt accggaaaca gatatagtgt 600atatctggaa gtggagtcta
taaataggtc cctaccctcc agcctcataa tgcattgtga 660gcggaacaac
ggaaaaggaa tagagggtag gccggataga cccacgtaaa cctaacaata
720tttgcttttc ttgcacctcc aaggatagac tattcaacta aacctgatca
acagcggaac 780acatacaatc agtgtacttt agtacaacag cactaaactt
gatagcagat cacattaaat 840taagcgataa tcccagttat ggcatacaca
gcagaggagt gaagccacaa tctctccagg 900acattgccac aggtaatgaa
tatgaaatac taccgcaatc tcgccggtgc taccaaataa 960accaccacct
aatcagaaca aatagccaca tcatcaagca gtcatcatgt aaaaacgaga
1020aaatgggaaa actgaccacc agccgaaatc tttttcagag gttcattccg
tgaaagaacc 1080ggagtaggag gaggaggcct cggggtggcc gccggcaagc
cgacctccgg agggacggcg 1140ccaggaacag tggaggaggg gcgcgcgcgt
gggggaggcg cgggtggtgc ggccgtgcga 1200gggggcggcg ctgccgctgg
ctagtggaga catcggagtt cggagcgagg agaggtggcg 1260tggcgagacg
gtggggaatc gcgtgggcct ttgcgcgtcc cggagaaacg gccgggtcga
1320aagcccacac taccattcgg cttgggacta gctatccatc acgcgacaga
tttttttttt 1380tccgaatcac acagatttta tttttttttc aaatcacaca
gattttagat taattgaatt 1440aagctcattt gatcacttga ttagaagagt
tatcaacata tattgcatat attttattta 15001471500DNAOryza
sativaOs01g25530 location Chr114470301-14471800 (reverse
complement) 147ggttgtagat gccagaaaaa tgaatatgag taactaattg
gcttcaaatt cacaaacatg 60ttttgctttt attgcacctc caaggataga ctattcacct
gaacccgatc aacagtaggg 120gacatataat cagtgagctg tagtacagca
acacaaaaaa tgatagagaa tcacattaaa 180attaagccag aattccagtt
atgatgtaca caacaaacaa gtgaagtcca ggggcggatc 240caacttggga
catgggggtt cagttgaacc cccaaacttt tgggtgaaca aacaaactgt
300tattacctga attatttaag agaggtttaa ggctatcaaa ttaaggagga
ggagaagatc 360tagaagagaa gctagtaggg ttcacgaacg caagcaaatt
ccagtcccgg tggggtggga 420gagagagaga gatgaatcag aaagaggacg
caccaggatc ccctcccctc cgctgcgatg 480gagatcgccg cccggccctc
ccctgttcag ccgcgactcg tcttggcggc ggcgcaacga 540ggcaaggagc
ggaggagaca acgagaggta ctcgactact ctgcccttat ctcaatattt
600ttacttacca ctctctacgt cctaatagtt ttacaaggtt cacatccaac
atttaacttt 660ttatcttatt taaaaatttg aaaatttttt aaaaacggac
ggtcaaaagt tcgacacgga 720ttttcacggc tacacttatt agggacaagg
tagtatttta aaactagcta gtgttatttc 780tttatataac tggtgaccca
tgtatcatca atacttagga aaaaattaaa cgattgtgtc 840gacaacaaat
tcctcatacc accagcaacc gctaactgcc actctacgca ccgccatcac
900tgctctttct ccatgaatca ccatcgttct tctcctttgc ctttacatgc
gcagccttaa 960cctctagtaa ccgacaagtc tttttctcct cccctatcgt
gatgcgcact attgttatcg 1020ttgccattga ggtcggagat gtcgggcttg
cagggacttt ggaaggcagg agctgacccg 1080acgaagaacg aaggcgctat
atccgggctc tttcttgcta gatctatcgg ttatccaact 1140gtcgtatgga
taaagaaaat ggagagagag aaagagagag agttaatgtg cgaagtgatt
1200actgagtcac cggacttgga gatggataga atatggtggt ctaaattaaa
atgtaaagtt 1260agctattgtg taaatgaaaa tggttaaatg ataccattat
atgaaaactt ttaagaattt 1320ttaagaatac atagtatgga ttgtaattta
cttcctttgt caaaaaaacg aatgtagaac 1380tggtgtggca cattctagta
caacaaatct gaacatatgt atgtctagat tcgttttact 1440aggatgtgtc
acatccaatc ctaggttagt tttttatggg acgaagggag tacttttttt
15001481500DNAOryza sativaOs03g30650 location Chr317474886-17476385
148ctccgactct cttgtcttct ctgcgtgtgc gcgcgtgttc tgctctgtgt
ttgtgtggga 60gaaaggttcc ggcgttcagc gaggtttgga ggtaggagtg gggccctatt
tatgcactga 120ttcgaatact actctgtatt gtatttacgg tggctaattg
ggattttctt ggggtatttt 180tataacgtac ttctgtattt tacggtgcct
aataattggg atttgtttga ggtaggggtg 240aaaacggtac ggaaacagac
ggaaaccatc tttattgttt ttgtttttat tttatttttg 300gaatcggaaa
ccccagatac gaaaacggaa tcgaatatta tcgaaaccga aaacggagcg
360aaaacaaacc ggcgcgaata cggtaacgaa aatttatcgg aataaaaaac
ccctcaaact 420gataaatcca attgaataaa ttgtctatgt ttaaaagagt
aatgttgttg ataaatccca 480atataggaaa aaaatattct tatgtaattt
tagattcgca taacaaatat ttttttaaaa 540aaataacagc caacatcatt
gtattcacta tttagtgctt aaaaaaatac atggtatttc 600agtcacgaaa
tttttggtaa ttccggaaag tttccgaccg attccgagtt ccgatggaaa
660ctgcccttat catttccgat tccgtttccg agaaaatatt tccgaattcg
tttctgtttc 720tgaaaaattc cgaccgacag attccgtttt cgaaaatagg
tctggaatcc ggaaagattc 780cgtaccgttt tcacccctag tttgaggtga
taagctttga gatttacctc aagggtccca 840cgagcaggtg tagattctag
aaacgggcag cgcccagggg ggggggttga aaagttcgtt 900gaaggctgtc
agaacaccat ggaattgagg aattgcgtca aactggaatg aacatgatga
960acacccaagt actggcatga ttagtccact tattattcaa agccaaaatg
gcgtaatggc 1020gtacgtgaat ggcgttgcaa aatcatccgg actgcgactc
gcgcgcggaa gcctctgcct 1080ccgcggcgcc ccaacgccgc acgaggatgt
tctttttctt gacgtcgtgc ctcatttcga 1140gcgcagcgag ttagccctgt
tttcccgatc aaaaactttt tatcatgtca catcgaatat 1200ttggatacat
gtatagtcta ttaaatatca aatatagaca agaaaaaacc taattacata
1260gattgcatgt agattacgag atgaatattt taatcataat tatgccatga
tttggcaatg 1320tggtgctact gtaaacattt gctaatgaca gattaattag
gcttaataaa ttcgtctcgc 1380agtttacagg cggaatctgt aatttgtttt
cttattagac tacgtttaat acttcaaatg 1440tgtgtctgta tccttcaaaa
acattacacc caaagaacta aacacaccct ttgtattgtg 15001491500DNAOryza
sativaOs01g64910 location Chr137684066-37685565 149ggcagcatgc
cttgccttcg atagttcgat ctctagttag cggcaggcgc agtgcaggac 60agtgaactga
ttgcccaagt tcttgcttgc tgaggactgg atgggagaag aatgtagtac
120tagtacttcg gaggaaaaga gccaaagaaa cgctcgtcat ggctgacgaa
acagttgaag 180gaatcaggtg attgatcgtg ttcgtgtatg atacacggtg
tatctggaat tctggatccg 240gctccgactc cggctcctgt atctgtatat
gtcaaaaact ggtgtaaacg agatccctcg 300acggtttgaa agacagtttt
ggctgacgtg atgcttacgt ttttttatta tttttccggg 360atcaaattgt
cgcatatgcg ccatgtcaat gttacgtggg acgaagtcct aatcaaacca
420gccacgcaaa cgtcacatca gtcaaaatcg cctttcaaac cgtcgaggga
cttcgtttgc 480acagattttg acagttcagg gaccggttgc atctgatttt
tggtttctaa gaacgaaaat 540cggatttggt gtaaagttaa gggatctgaa
atgaacttat tcatttctaa tagggcacat 600cgccttcatt tcaaatgggc
ctccatgtgc caggcccata tttcgattcg agtgtggcct 660ccatggacca
attgagaaat aagttcactt taggtcactc gtattgttgg agagtctaat
720attcattcca ggaccaatta aagtggccca tattgcggac gctgaagccc
aaaggagttg 780ctcctatggt gcggttggaa tggtggatgg ctgaagatgc
ctcggaggta gagtagtatt 840ctctcattcc caaaataaac tggagtaaga
gcatctccaa tagatgacta aaattaaact 900cccaaaaatc atgtattggg
gacagccaaa aacatattta gcctaaaata cacccccttc 960tccaagagag
gactaaaatt tgggagcgct tctagttgcc caatatttgg ttcaggttgg
1020tcctggtttt ggaggtggct aaattttggg accatgctta ggagtctgtt
ggagggctga 1080ttttcaccaa attcctaaaa tttatgtttt agtaacctgt
ttagcattct cttggagatg 1140ccctaactac catctttaag ggagtatcta
aaacagtgat aagttttttt ttaaaaaaaa 1200tttagatata actatagtat
aattgatacg taattacatt gtaactatat tgtaacataa 1260ctatgatatt
ggcatagttt ggtggtttag tggtacgtga gcattcgaca gatcgtgggt
1320tctcaatcct tgaccactgc atgcattggt taattgttac tccctccgtc
cataaaaaaa 1380acaacctagt actgacacat cctaatacta tgaatctaga
catacatctg tccggattcg 1440ttgtactaga atgtgtcaca tctagttcta
gaattgtttt ttaatgggac ggagggagta 15001501500DNAOryza
sativaOs07g26810 location Chr715496040-15497539 150gaatgatcat
gtcgcgaaga tctacggggc gacggcgagg cggtgaacga ggtcgttctg 60ctcggcgctc
cctatgatgt tcgggaggcg agtggacaga tcgctcgtcg tgacccctgc
120tcctgcgatt agatcccacg ccggcgatgc tgaggcttgg ctgatgatag
tcatgagatc 180ggagatgggg gactcggctg ccagtcggcg tgcggacgct
tccggagtta actgggaccg 240aagcagcaat catggtcttg gtctggttca
agatgttgac cacgtgatca gattgactga 300gcttgttctc cttgaggagg
gtgtcgagga gggtgatcgc agcaacagca ttctgttgtg 360gggtgcggaa
aaccggtgtc ccctcaacat cttgctggcc gataagatct cgcgctcggc
420gccctgcatc taaggctcat tgccgtcggt cttcagcctc cttagcggcc
cgctcgcgct 480cctgttgctc acgccgcagc cgttcttgct cctgtgcttg
acgctcctcc tccagtcggc 540gtcgttcggc ttcacgtcgc gcttcggctt
ctcgggcttg gcgttgctcc tccgtctcgt 600tatttactcg atgagaggta
ataccctact cctgtttggg gatttaaatc caccgggtgt 660agtatagatc
tgacgatcat atgtgctcat gcccctagag ggcctcctgc ccaccttata
720taggatgggg ggcaggatta caagatagaa accttaacca atatagtatc
ggtttcctaa 780atttatttta caatattatc aaatcaggac tttaggccgc
tccataatat aaaaggaaac 840gtaataccca agtcatgatc tgttacatat
tccacagata taagctatcc cctatgacta 900gtcggataac catgccgtgt
gggtatgggg tacccataat ctccacagta gcccctgaga 960ccttcacagt
cgaaaagata atcttttctc gaactagatt actccaaagc cgagtgcttc
1020aatcatcttc gccatgatct cccgagtact tttaccaaat atgaagactg
tggagagctg 1080aaaaataaag tcaggtgcaa cgactagatg catctaatag
gtgtagcccc cgactatgtg 1140gttggctgaa caaactaagc atatagtcaa
ggaataaatg attcaaccaa ccgagcggtt 1200ttgataattg taatcaacca
agtgacttga tatcaatata tagcatatgc ggtgtgaatc 1260ccccaaataa
catgatccaa gtaatatacc gactgctttg taaaaattga tgcgagcaac
1320ttaaagttga ctacatcatt gaaaattcac atagcaaaac agctataaag
aagcttgaat 1380gttgcaaata aagtattggg catacgccat gcgcacactg
ctttaacata tgtgcttaag 1440ttactaaaag acacatggtt tcaccacacc
tggaaatata tggcatatgt ggtgtaaaat 15001511500DNAOryza
sativaOs07g26820 location Chr715515227-15516726 (reverse
complement) 151agctgatgta gcgggtaggg gagttgatgt tggcgccgtc
ggctgaattg tggacgcaac 60ggaggaaacc gcttgaacag atggctgaac atcggtgatc
aaagatcgat aattcgggaa 120gacacctcct tgcaagtaaa ctgggcctgt
agcctgatgc tcggctatcg aaccatcggc 180tattgtcttc atcatgtttg
acaatgtatt gactaaaacc ccggactgat tgatcagagc 240atgatgcaca
gcgtaatcaa ctcgatcttg aaagttgttg aaaaaatctt gtgctagctc
300attattctga ttaagatttc cttcttgcgg cccttgagca ccatcacctt
gagccccacg 360cgctcctcgc agattgtcgc cttgatcgct tggggagccg
tcggtagcac ctttagtact 420cggctgagcg tgtatcggct aatactccga
taccactcta tatcaggata ttaaagcagg 480tataatatat caaacaaaag
cctaacatat ttagatacag caatatcttg atataaaggg 540cagatttagc
atatcaatga gacatataga ataaatatgg ctaaatcaga tacgatcggc
600tgaaactcca atgctattct aatcggcaac tagaaggcag gctagagatc
gatattctaa 660gcacgactta atagatcaaa ctcaacttat gcagcattaa
gtatgaaaag aagaacgata 720tctagacaat caagccgcta gcagttccat
agaatggtgg atatcttata tgatctagat 780caacgtcaag atctaaccta
atcggctgcc ttctgcgtac agatattggc cgatagtaga 840ttagatagcg
atattgttag agattatata agatatatga taactcgacg aattacataa
900acaagattag agtgtcatga agatggaagc actaatcccg agaacgcaag
ccgtcataac 960aagttttacc tcttgttgaa tattgaaatc gatgcagctc
aacccgaaag caagaacttg 1020tcgaaacaaa actaaagcaa aaagggtggc
gatgcgccga gattgtattg gacgtgtgtg 1080ttaaaaaatt acatagggcc
cggggtctat ttatacccga gaattacaag atatgcccat 1140accggacacg
accattatct ctaacaaact ccaagatacc ataagtcttt gcggcagatt
1200tttgcccaca cttatctata aggaatttac ataaaatatc ctaattaata
gatacaattg 1260ccttcccagg actctatcca tgtatggcaa tcatcttgaa
gtacattaac gtgaacccga 1320tgtcgcgatc aagccgtatt gtcggaatcg
gctgtatcgg cttatttagc tcgactcaga 1380cccagccgat cttaaccgta
gccgatctgg actccagccg attcctgctc agccgattcc 1440tactctgttt
ccgaactcga tctccgcctc cgactccgct ttgatcaaat cctctttcct
15001521500DNAOryza sativaOs09g11220 location Chr96225987-6227486
152cgggctagct acccccttat caagagcctc cacctcatcg gctcatccca
ttctcttcct 60cacctccccc tctcttctct ttcgtgatcc ctcttcctcc tcctcctctc
tctctctctc 120tctctcacac acaccaattc cacctcaaaa cgcatgaata
tagatgatgg gttcatttgt 180atggccgtat gaagtttaca atgatccctg
tattttctat cttccttatc atccttatcc 240aaagctcttg ctcgatgtct
ccgattgtgt tcttcaaaaa cacctactta ttgtttgatt 300gctttggtct
tttgtgaagc aaaataaaaa agagcttgaa gtggataaaa tttttcccat
360gccttcccat gaaatcagta gcgaaggaat ctttccttct atttttcgta
cggctggtgt 420gttccatctc ctccagatcc tcggacgtag cactcgctac
gtcagaggga catccggcgc 480agctgcgtcg ttcataccgc atcggatgaa
cgcacgcagt gaggcggcca aacgggctca 540aactcggtcc aagtcactga
aataggacgg ctcatttgaa cagaaggctc gtttccttag 600ccagaaaaat
cggtgtcgct cccaagttct ccaggacatc agccggcggc cgctcatcgt
660ctcctccggc tcaggcgcta ttggagccga tctcggacaa agcccctccc
ctcgtcgccg 720ccctcctcac catccctcga ctcctcacgt ggcgcgatgt
cacgccctga agttttcccc 780ctttttcttg ctttaaaaat ttgtttaata
aattgcctca agaaataatt tgattaacct 840agagctaatt ccttaattaa
taaatgcaat caataattgg aaatggcatt gtgggatttt 900tcttgggttc
cacttgtcac atttcattaa cgggattttt agtagaattt tcatagccct
960ataaatagtt ttaaccaata aaaatcaatt atccgcaata ttctaatccc
aggaaaatcc 1020ttttcttttt cctctttttt cttttcctcc tttttcctct
tcttgggcca tcggcccact 1080tggctcctac gctgcccgct cggctgggcc
ggcccacgcc ccatccctcc tctctgggac 1140gccgataggt ggggcccacc
tgtcaagtcg tcccctacct ctagccgggc agcaaccgcc 1200gctgaaaccg
cccacgccac caccgttccc gctctcctcc acgccaccac tccacaccag
1260tgccccacgc ccacgtcgcc cgcccactaa cccgcttctc ccgctcgtgc
gtgcgcccgt 1320gaggggcagg attcaatttg aatccccccc tctctctctc
ccccacctcc ccacgtcgcc 1380agccaaatcg ggccctttcc cggccgtgtc
cgcctctccc aaacccctat atagcttccc 1440tgcgtcctcc tctccatttt
tcccctttcc acctccctct cccgtgacct ctatcgcacc 15001531500DNAOryza
sativaOs04g21800 location Chr412352409-12353908 153ggtgacgggt
ttatgacgac agcgggttgc aggtgcgatg atcgtccttc ctctcctctc 60tcctccactt
tcccccggtt gcttggctag tggattgtcg ggtcccaaaa ctaatgattc
120ggtaaccgac atgtttagac tgtatcaagc cctggatcag tagattgata
caggttcaac 180aatctggatc tttattgtat acattttaat aaggctccaa
aggagatatt gttattacca 240aatatatggg tatttacaaa cttgtggcca
actaatacaa cagaagctat agaatgaacc 300aactctaatg ttttgcgtag
agttaaacca tcaatctaat ctatacttga aaactaaact 360aaattagaat
gagactaact tatatgattg aactcccagc ttcggcaaaa actccgaaat
420gactctgaaa aaggtggggt tgaagcaagg gtgagtacaa cgtactcagc
aagctattat 480attcaatatg aatgtatgaa atagtagcat ttgagtaggg
ctagatttac ttgcagaaag 540cagagaatgc agaagaagtg agcctgtaat
gattttaatg caacagtatt taataaagtt 600tggaattaag tttctaacca
aataccatat gagtcccaat gctcaaatcc atgagcacgg 660ctattcgaat agattcgttt tcactttact
gcagtaaatg tatgctttac ccatagccca 720cgacgtgacg ataatcatca
gctttagtca tggcccagca ttaggttatt aacaatagtg 780gcacctgttt
catgaactct agtccccatg cgctctgaac gtacgttatc agcagcgtga
840ggagttctgg cgttcctggg gtatttagag aggactgatt gtatgacacc
atatcatcgc 900aatcagggtt tacaaacagt tcgtgatatc acaatttatc
tcaaatattt acctatgcct 960cggtaaatat cacaatattg ccctgctcgg
cataagtttt cctcctgcgc gaggacttaa 1020acaagaacca ctatacagag
gtaccacctt gttaaataac ataataactt ggtctgtccc 1080catcctagaa
ctgtggtcgt actcgtttgt tcttcataag tacttggcag tcttatgtcg
1140gttggaacag tactagccac ccggaaaatc aaccatttct accgtaccgt
tcaaatctaa 1200gtttattata tttgtatgca gtctaactag gcatgactaa
gcaaagctag catatatctg 1260gtttgctata tgttcatgat atgcattcaa
aatcatgaag agctaatgca tagaacagaa 1320ataaagaata tagggcattt
atgctcaaag gagaggaata ataacttgcc ttgctccaat 1380gcaaataaat
ccgagatagg caacctgatt atctgatcct tgaaatatca caagttgcca
1440tctaaatata atagactcta ctggagaaga ggaagaacca aattcaataa
aaatcatgaa 15001541500DNAOryza sativaOs10g23840 location
Chr1012223468-12224967 (reverse complement) 154gacctcgccc
cagtgggcat gccaaaaacc tcacacccga cacggcacca ctttacttac 60caagctcaag
aacacccaac gtaacgagca gacgcgcagg gcagatcaaa tatcaccggt
120acgaaatctg agacaggata gctcatgcgg attatgcgtc taaacagctc
aacttgcaac 180gacaagctga tacgcgagct cagctcatag cgataacctt
ttttttacgg ggagtaccga 240tagctgatgc ggcgtcgggg actcggggca
acgcaacctg atgcgaacaa gcgaccgaga 300aagagaaaca tcacatacca
aaattacccc taatatatat taagagatta ctaatctacc 360ctttaaaata
tacggttgag attaattcta cttgcagtag aatactgcaa gtagaaagca
420ccggagcgtc accttaaagc caaataatgc cttctgaaat ctcatgtaca
actagaagta 480gaaaatttct agtggaaatc ttgcatccgt ggttctatga
atcttgttat ctacagtgat 540caatcgatca atcggataac ctgttctttg
atgaatcttc ttgagttttt atcggatgga 600atttgctggg attgctttga
attcgtgcac aacgccgtcg atctcgatga ggctgcagcc 660ggggaccttg
tcgatctcgt tcctcctcat ggcgtcgagt tgctggagcc cctcctcgat
720gagtccggca tgacagcacg ccgtgaggac gccgaggagg gtcacctcgt
tcggcgtcac 780gccggctcgt tgcatgctgg cgaatagaga cagggcatcc
tcgccgcggc cgtgcattgc 840gagacccagg atcatggata tgtaggtata
ccggcgagac ccagcgacca gcggccgcag 900caagcctcgt accggccagc
gttcaacagc cttctcgcgc acggccacat cctgctcgcg 960ctccgcaccg
ccgttgccgg ccttgcgcgc agccgcctcc tgctcacgct ccacgccgcc
1020gtcctcgcgc tccctccccg accgtgtccg catccgcgcc attggtcgct
cctggcgcgc 1080ggtcgccgct acccgctccc ttgccgtgga caacctcgtc
ggcggcggct gctccctgat 1140attttttttt tttggtgtga ttcaggtgga
gaaagatgga gccaggggca atcttgccat 1200ttcgaaaaat ttctcacctc
ttttgacccg gatattataa aatattgtct tcggtgtcct 1260atagctaatt
acataatttt tgtagtgtcc atcagccgtg tctcgttttt gcggtgtctc
1320tcagccaatt acacgttttt tgagtgtcct atagcaaatt ttgccttctt
cgaacggcgg 1380gaagaagttt gctttgtaat ctatatagtc aacacatact
aagtttgatc gtaatcattg 1440cttacacagg atttggtcac atttatgaaa
atgacaatat agccatattt gttaaacaaa 1500
155 1500 DNA Oryza sativa Os08g13850 location Chr8 8268007-8269506 (reverse complement)
155 gacctcgccc cagcgggcat gccaaaaacc tcacacccga cacggcacca
ctttacttac 60caagctcaag agcacccaac gtaacgagca gacgcgcagg gcagatcaaa
tatcaccagt 120acgaaatcga gatgggatag ctcatgcgga ttatgcgtct
aaacagctca actcgcaacg 180acaagctgat acgcgagctc agctcatagc
gataaccttt tttttacggg gaataccgat 240agctgatgcg gcgtcggggg
ctcggggcaa cgcaacctga tgcgaacaag cgaccgagaa 300agagaaacat
cgcatatcaa aattacccct aatctgtatt aagagattac taatataccc
360tttaaaatat acggttgaga ttaattctac ttgcagtaga atactgcaag
tagaaagcac 420tggagcgtca ccctagttgt aaatgggcac cattctttga
ttctctcgtg atgcatatat 480acattacgaa gaatgtcaaa atataactac
tggtgtgaac atgaagttat ggttgtttgg 540tttcattcat caaggttcaa
atctttatgt ccatatacac atctcgcatg tgtattttaa 600ttggtgcaga
gaggcggtca ttctatttct cttgttaaaa aaaaaaatca aactataacc
660atgtgttcgc catgctttat accttccata caaaacttgg gtcataaatg
tgagctgcca 720aagtgatctt aatggtttag aagtaagttg cttcctaaca
tggctccatt atagaagaag 780aaaacaatgt ttcattcatc tcctttaaac
caagatgaaa ttaagaagcc ctcttcctag 840ttgcgggcca ggaagcgact
aaccatttgt ggacaagcac agccaagtaa agccccaaat 900ctcatattgt
tgtaggacaa gataaggttg acaacacacc caggttttgt ccatgttgat
960gatcaattag tttggttgtt caactgtccc gattaacatc ttcaaagcaa
agcttccaac 1020agaccaacag tacattgaat gaaccgaaca aaacagaaga
gaatagatgg tttctttgca 1080tttaagtttt attgtaccac ttcgactcat
ttgtattcaa gttttacgaa acttcttgat 1140taactgaaaa ctgggctagg
tatacttagc ccgaccaatc agatatgact tttcaccctt 1200cctttttcac
caattaacta ggaaagtatc ccgcgcatgc attcgtgtgc ggggatgcat
1260atattagttt gtttcaaaaa cacatcaaaa tctaaattaa aagtaagatg
ttttccttta 1320ttctaattga atgttcattg gcactctttc tttccataag
caatctcgat ttttaccgct 1380gatgaatgta ttttattata atgtcaagct
ggacatccct ctaataggtt taattgcata 1440tagtttacaa aattggagga
aatatcatct cggtgatatt tgacaccata aaatatgatg 1500
156 1500 DNA Oryza sativa Os12g42980 location Chr12 26703270-26704769
156 ggtttgccag
aaggcagcaa cgagagagag aggcgatcga attcagtgag ctgtggcgta 60attgcccaag
ccacagctct ctctctccct ctctctctct gcctataaat aagtgtttgc
120agctacgaaa attcaatcgg ggaaaaacta tatgggatta tcctattgat
ttatttatgg 180tatgcaggat atggaggctt cgcaaattcg tttgttgtcg
cctgtgggac atcgatatca 240gagtagaatt gtttaggctc gtactcccta
tgtcaaaaaa aaaacccact tctataaatg 300aatctggaca tacacggaca
tagtgtatgt ccagattcgt tttttttacg aagagagcac 360aacatatgtc
tattgtcgat acttttttga gatcagtata taaaataggt ggtactagat
420taacgatggg ctagatggtt aataatgagc atatatgtgc aatagatgaa
acatgtctat 480acaagtatag tggtatactg ctacaggtaa tttgttacta
tgtgtcagtt caattgtaca 540gttttaaata ttattactac tatatcacaa
tatgatactt gagtgttttt actatgggag 600actcccctgc tgtgttttgg
ttaagagtga gcccttctca tatgagtgat tcttactctc 660ccgtcccaaa
aaaaactcaa cctaggaggg gatgtgagac aacgaatctg gataaatggt
720agtccagatt cattgtacta ggaggggtca catcccctcc taagttgagt
ttttttttat 780ggaggaagta tactactccc tccgtttcag gttataagac
tttctaccat tgcctacatt 840catatatata tatatatata tatatatata
tatgggttag aaagtcttat aatataaaac 900gaaggtagta ctgtacatat
gattgggtat gggtgtttct tataattgtg actttcagat 960aataaaataa
tatcccatgt tttttttaaa ctaaataata tccgatgttt ttattccatt
1020atcataagga aaaaaaatcc catgaatgat atcattgccg gcggttgcct
gtggcatgca 1080agcagttggc agattcttct cgtctcacag catattgtcc
cttggccctt gggcattcca 1140aatttctacc aataatttca ttctaagaac
taaagtcgac gtcgccatcc cggtgcacgc 1200acgccggcca tgctagcttg
cagatagaat tggaatagtt agccaagcca actccaacca 1260aagaccctac
acatcaccat cctatccgtt ctacacgatg aaatattcac tccatttcta
1320atatacgttg tcatcgattt gtagtgcgtg aacagtgatt ttattaaaaa
aatgtaaata 1380taaaagaata tatgtaagtc atacttaaag cagctttaat
ggtaaaaata aataacaaca 1440aaaaatatta ttattattac atattttttc
agtaaggtta aagataaaac atgtgtacag 1500
157 1500 DNA Oryza sativa Os03g29280 location Chr3 16670214-16671713 (reverse complement)
157 cgtggcctgc gaccgcgagt ggtcagtgtg tccttctgtg
tatagttgga atcttttcga 60tatacttctg tgtatagtta gagtttcgag gggcgagttg
aaaccttaaa agacgtaacc 120agtagaaaaa aaaacaaaca caccaccatc
ataaagtaga aaaaaaaaca aacacaccac 180catcataaac tagcaggtcc
tcgcacaaca ccctaaagaa aactctaaag tgttgtgcga 240ggacttacta
gtttatgatg ataaactagt aattcctcgc acaacaccct aaagaaaact
300ctagggtgat gtgcgagggc ctgccgtttt atgaaggtgg tgtgtttgtt
tttttttcct 360actggttacg tcttttcagg ttatgatatt atagttattt
cctgttatat ccgtacgaac 420ttcgcgctat ctaaatataa atcgtaaaaa
aatatatact ccctccgttt caaaatattt 480gacacgctat ttattaccat
ggctagcaat gattttaaaa ttgtggcaac atgtatttat 540tgctacaatt
ttagtattga tgccacgatt tttttcgtgg caacaaatga tttttctagt
600agtgattgtc aaatgtaata aataaagggt tgtgttgata gttatcaagt
ggtgttttgg 660tggcaaggtg ccagggtgct tgaggttaga ccggcgtggt
aaggcgggtc agattggaag 720tgtttgtgcg gtcagaccgg ccggatagct
gtggtctaac cataggtgtt tgagcggtca 780gactaggtag cccgacagtg
ggaacgatat actcctgttt ggagtcggta tcttgttgat 840tttatggaac
atgttgattg cttaatgttt attacttcta gtttgtttct atacgtgaga
900tatattgtac gttgtgtacc attgttgagt caagttgata aaaaaacttg
tgcttggata 960tagtttctta gtttgttcat gtgtagttgt tacttgaggc
ctcgggagta ttgccggtga 1020tgaatcgaga ctaagcttgg gaaaagttga
ggtctagatg gacaagaaga tcatgcatga 1080tactaaggct agagaacaat
gcacgtagag ataaagtttc catatggcat gcaagaagag 1140ttttgtgcgt
atgagaagtg gagtcaaatt tagattggag tccaagttta ggaagattag
1200agattgaaga tgtacaggga tgtgtatatc cttgttttga gaagtttcct
atgtaattag 1260gattccttgt tctagttgga ttcgtggcat gtttgtcggt
ataattagat gaggggtcga 1320ggctcaaagt aaggtgaagt gggcaacttt
taggagagaa aagttaggtt tctttttgag 1380atttcggttc tagttcgtga
attgagaaag gaatgcttta tattcccttt gtaagtataa 1440cttgatacaa
taaagtttat ccacttttag atgccctttg taagttaggt ttgtgttttt
1500
158 1500 DNA Oryza sativa Os03g20650 location Chr3 11684335-11685834
158 tgcctgagct cccatccacc ccacccccaa ccccacgcgc gcgatcacca
ccacctgcga 60cacacacaac cccgagacgc accccccccc cccaaccctc acgatcaaac
aatcaaacac 120ctgacctacg ccgcagcaac aacaacgaca acaaccacaa
accacccaca caaacagagc 180ccacactgac cctcgtccct cggcggcggc
gcatggcaag aagaagcaga gaggagagtg 240gaggtgggga ggaggcaaga
atttaggatt ccacaaaggg gggtggtgcg cttgggccaa 300tgggggcaga
caggaggcga tgcccctctc tccctctctc tctctctctc tctctctcgc
360ttgggaagaa tccaaaaagc gatcgacgag gcgagaagcg aaagacgagt
gcgtgcgcct 420gtgctgtgcg tctccgcgcg cgcgcggtgc tggaaaagaa
agaaagaaag aggcgctgcc 480tttctagctc tgtgaccaag gttcgctttg
cctttgcctt tggtgggcag agggagagag 540agaggggggg ttttattcgc
gggggagatg gcagctgcag aatctgcaca agagagacgg 600gggccaatgt
gacacaagag atcaggttat tcaggtactc ccgccacatc agtctaaggc
660cccacgtaac agtcgcagcg tcactgctct ctccaccgcg gtgatttttt
ttatttaccg 720ctcacgaatc tttttgagta ctggaattcg gataacaaac
gcactcaaac gacgagtact 780atgttattcc tcccttccta catcgtataa
tacaagagat tcggataaga tgtaatattt 840ttcagtacta gaatgtgtca
cctctctaaa ttctttagta gcatgaatat gtatatacta 900tttgtccata
tttatagtaa taaaaaatat gatatccggt tctggattat tgtattttga
960gacagatgaa gtagttaaat tttaaggttt tataagaatt tattaaaaaa
ggtattgttt 1020gatagcacta tagtaaaagc gcaggaattt gagacggaat
aaagagaaaa catgaggtct 1080gtttttgtag gagatatttc tttgtattcc
atatgcattc ggagccattt ctttgtttca 1140gagaatatga agctagggat
ctttttccaa ctaaatttca attatccaaa attcctcttt 1200ttttgctgtt
ctaaaaggaa cctttaattt tatagtattt ttcatcgcaa gaaaaagtta
1260taagtgcgtt gttatatgtt ttcgagaata ccgaatttga tgcacaacgg
gttgccacag 1320aagattcagg tcctaggtgt gtcttcagtt ccgaatctaa
tacgagtggc tgcctgtact 1380actgcctccc tttcacaata taagtcattt
tagcattttt catattcata ttgatgttaa 1440tgatcaatat gaatataaaa
aatatagtat tttctatatt aatattgatg ttaatgaatt 1500
159 1500 DNA Oryza sativa Os06g43920 location Chr6 26453545-26455044 (reverse complement)
159 tatgcaatca aagaagagat gttgaattga ctcattagag
ttacaaaagc tacaaagcaa 60gctacccttc catcttcttt taacaagatt atcttttgtt
agaatcaccc ccttttcaat 120ataccaaagg aaaattttaa tttttaaagg
aatccttagc ttccaaatta aagatttcct 180atgtattaca ccattaagca
taagggcttt atacatggag ttaacataaa acaacccttt 240tttattgctc
ttgcaaaaga attggtcttg cttatcattc aaattgacac tcaccacctt
300tgaaacaatt tcaagccaat cttccaagtt tttccctaca attgctcttc
taaaagaaac 360attcaaaggg acccttccta acacatcggc cactaccaag
tttttcccta ccattgtttt 420catctagttg atcaaaaaga tcttgctgat
caactgtgtc atcgagcaca gggagaagaa 480tgccatcttc atccccacca
ttcagcatct gagccggcat ctcagggctg tcgagatgaa 540agtcaggaaa
tgtagaaccg aagcttgaat cagttggtgt tgaagaactg atgttcgagt
600cactaccaag tgagaagttg tcatgcatca tctctgctgg gcttgtgtcg
tttcctatgt 660tgaatgaccc catctgatca tcgaaatctg acagtgatat
ggtgctcatg attggtgaca 720tttgaccatg tagttcatct ccaaacatcg
gttcctggcc actggaggct gcaaagcttg 780agagctggtt gatattggcg
atgttcgcct gcgacgggcg gaaagatgtc ccatcattgt 840gaccaagatc
aggaaacatc gatggtgctg taggttgcca gtatttgcta cagtttgcaa
900ggtccggaaa ttggtaatca ttgccaattg tgctgaagtg gtttgtagaa
gattgattca 960tctgaaccaa aggctgatta ttcacttcag ggtactgcac
tggaaactga tttgttggcg 1020ccaaaaactc accatttgac tgcactggaa
actggtttgc aggcaccgaa aacccaccat 1080ttgactgcac ttgatactga
tttgcaggcg ccgaaaaccc acaatttgac tgcactggaa 1140actgatttgc
aagagctgaa aacccaaact ttgactgtac ttgaaactga tttgcaggcg
1200ccgaaacccc attatttgac ttgtttgcag acgagttgcc agaatggcaa
gatggagagc 1260cattgcttgt ctcccacata ctcttacaga acatgccagc
atatgagcta ccagaatcgt 1320ctgtatccaa caccaagcca tttgatatgt
ttgcaaatga gctcccagat gggccagatg 1380ggaagcactt tcttgcattt
gatgagttct gcagtggctg aaacacagga ttgttgcccg 1440ggcttccacc
atggccaact gtccccatgg ccatgtttct ctggctgttc ttgagctgaa
1500
160 6 PRT Arabidopsis thaliana misc_feature (2)..(2) Xaa can be any naturally occurring amino acid
160 Pro Xaa Phe Xaa Xaa Trp
    1               5
* * * * *
References