U.S. patent application number 13/465078 was filed with the patent office on 2012-05-07 and published on 2012-08-23 for method for modulating plant growth, nucleic acid molecules and polypeptides encoded thereof useful as modulating agent.
This patent application is currently assigned to CropDesign N.V. Invention is credited to Juan Antonio Torres Acosta, Veronique Boudolf, Lieven de Veylder, Dirk Inze, Zoltan Magyar.
Application Number | 20120216320 13/465078 |
Family ID | 22756387 |
Filed Date | 2012-05-07 |
United States Patent
Application |
20120216320 |
Kind Code |
A1 |
Inze; Dirk ; et al. |
August 23, 2012 |
METHOD FOR MODULATING PLANT GROWTH, NUCLEIC ACID MOLECULES AND
POLYPEPTIDES ENCODED THEREOF USEFUL AS MODULATING AGENT
Abstract
The invention provides isolated nucleic acid molecules,
designated CCP nucleic acid molecules, which encode novel cell
cycle associated polypeptides. The invention also provides
antisense nucleic acid molecules, recombinant expression vectors
containing CCP nucleic acid molecules, host cells into which the
expression vectors have been introduced, and transgenic plants in
which a CCP gene has been introduced or disrupted. The invention
still further provides isolated CCP proteins, fusion proteins,
antigenic peptides and anti-CCP antibodies. Agricultural,
diagnostic, screening, and therapeutic methods utilizing
compositions of the invention are also provided.
Inventors: |
Inze; Dirk; (Aalst, BE)
Boudolf; Veronique; (Gent, BE)
de Veylder; Lieven; (Gent, BE)
Acosta; Juan Antonio Torres; (Gent, BE)
Magyar; Zoltan; (Gent, BE) |
Assignee: |
CropDesign N.V.
Zwijnaarde
BE
|
Family ID: |
22756387 |
Appl. No.: |
13/465078 |
Filed: |
May 7, 2012 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10276032 | Dec 16, 2002 | 8193414 |
PCT/IB2001/001307 | May 14, 2001 | |
13465078 | | |
60204045 | May 12, 2000 | |
Current U.S.
Class: |
800/306 ; 435/15;
435/252.33; 435/254.2; 435/320.1; 435/325; 435/348; 435/358;
435/365; 435/375; 435/419; 435/6.11; 435/69.1; 436/501; 504/196;
504/319; 530/350; 530/387.9; 536/23.1; 800/298; 800/312; 800/317.4;
800/320.1; 800/320.2; 800/320.3; 800/322 |
Current CPC
Class: |
C12N 15/8261 20130101;
Y02A 40/146 20180101; C07K 14/415 20130101 |
Class at
Publication: |
800/306 ;
435/320.1; 435/419; 435/252.33; 435/348; 435/325; 435/358; 435/365;
435/254.2; 435/69.1; 436/501; 435/6.11; 435/15; 435/375; 536/23.1;
530/350; 530/387.9; 800/298; 800/320.2; 800/320.3; 800/320.1;
800/317.4; 800/312; 800/322; 504/319; 504/196 |
International
Class: |
A01H 5/00 20060101
A01H005/00; C12N 5/10 20060101 C12N005/10; C12N 1/21 20060101
C12N001/21; C12N 1/19 20060101 C12N001/19; C12P 21/00 20060101
C12P021/00; G01N 33/566 20060101 G01N033/566; C12Q 1/68 20060101
C12Q001/68; C12Q 1/48 20060101 C12Q001/48; C12N 15/11 20060101
C12N015/11; C07K 14/00 20060101 C07K014/00; C07K 16/00 20060101
C07K016/00; A01N 37/44 20060101 A01N037/44; A01N 57/24 20060101
A01N057/24; A01P 21/00 20060101 A01P021/00; C12N 15/63 20060101
C12N015/63 |
Claims
1. An isolated nucleic acid molecule selected from the group
consisting of: (a) a nucleic acid molecule comprising the
nucleotide sequence set forth in SEQ ID NO: 3, 6, 12, 13, 29, 41,
42, or 45; (b) a nucleic acid molecule which encodes a polypeptide
comprising the amino acid sequence set forth in SEQ ID NO: 69, 72,
78, 79, 95, 108, or 111; (c) a nucleic acid molecule which encodes
a naturally occurring allelic variant of a polypeptide comprising
the amino acid sequence set forth in SEQ ID NOs: 69, 72, 78, 79,
95, 108, or 111; (d) a nucleic acid molecule comprising a
nucleotide sequence which is at least 60% identical to the
nucleotide sequence of SEQ ID NO: 3, 6, 12, 13, 29, 41, 42, or 45,
or a complement thereof; (e) a nucleic acid molecule comprising a
fragment of at least 50 nucleotides of a nucleic acid comprising
the nucleotide sequence of SEQ ID NO: 3, 6, 12, 13, 29, 41, 42, or
45, or a complement thereof; (f) a nucleic acid molecule which
encodes a polypeptide comprising an amino acid sequence at least
about 60% identical to the amino acid sequence of SEQ ID NO: 69,
72, 78, 79, 95, 108, or 111; and (g) a nucleic acid molecule which
encodes a fragment of a polypeptide comprising the amino acid
sequence of SEQ ID NO: 69, 72, 78, 79, 95, 108, or 111, wherein the
fragment comprises at least 15 contiguous amino acid residues of
the amino acid sequence of SEQ ID NO: 69, 72, 78, 79, 95, 108, or
111.
2-4. (canceled)
5. An isolated nucleic acid molecule which hybridizes to the
nucleic acid molecule of claim 1, under stringent conditions.
6. An isolated nucleic acid molecule comprising a nucleotide
sequence which is complementary to the nucleotide sequence of the
nucleic acid molecule of claim 1.
7. An isolated nucleic acid molecule comprising the nucleic acid
molecule of claim 1, and a nucleotide sequence encoding a
heterologous peptide.
8. A vector comprising the nucleic acid molecule of claim 1.
9. A cell comprising the nucleic acid molecule of claim 1.
10. A host cell transfected with the vector of claim 8.
11. A method of producing a polypeptide comprising culturing the
host cell of claim 10 in an appropriate culture medium to, thereby,
produce the polypeptide.
12. An isolated polypeptide selected from the group consisting of:
a) a fragment of a polypeptide comprising the amino acid sequence
of SEQ ID NOs: 69, 72, 78, 79, 95, 108, or 111, wherein the fragment
comprises at least 15 contiguous amino acids of SEQ ID NOs:69, 72,
78, 79, 95, 108, or 111; b) a naturally occurring allelic variant
of a polypeptide comprising the amino acid sequence of SEQ ID
NOs:69, 72, 78, 79, 95, 108, or 111, wherein the polypeptide is
encoded by a nucleic acid molecule which hybridizes to a nucleic
acid molecule consisting of SEQ ID NOs:3, 6, 12, 13, 29, 41, 42, or
45 under stringent conditions; c) a polypeptide which is encoded by
a nucleic acid molecule comprising a nucleotide sequence which is
at least 60% identical to a nucleic acid comprising the nucleotide
sequence of SEQ ID NOs:3, 6, 12, 13, 29, 41, 42, or 45; and d) a
polypeptide comprising an amino acid sequence which is at least 60%
identical to the amino acid sequence of SEQ ID NOs:69, 72, 78, 79,
95, 108, or 111.
13. The isolated polypeptide of claim 12 comprising the amino acid
sequence of SEQ ID NOs:69, 72, 78, 79, 95, 108, or 111.
14. The polypeptide of claim 12, further comprising one or more
heterologous amino acid sequences.
15. An antibody which selectively binds to a polypeptide of claim
12.
16. A method for detecting the presence of a polypeptide of claim
12 in a sample comprising: a) contacting the sample with a compound
which selectively binds to the polypeptide; and b) determining
whether the compound binds to the polypeptide in the sample to
thereby detect the presence of a polypeptide of claim 12 in the
sample.
17. The method of claim 16, wherein the compound which binds to the
polypeptide is an antibody.
18. A kit comprising a compound which selectively binds to a
polypeptide of claim 12 and instructions for use.
19. A method for detecting the presence of the nucleic acid
molecule of claim 1 in a sample comprising: a) contacting the
sample with a nucleic acid probe or primer which selectively
hybridizes to the nucleic acid molecule; and b) determining whether
the nucleic acid probe or primer binds to a nucleic acid molecule
in the sample to thereby detect the presence of nucleic acid
molecule of claim 1 in the sample.
20. The method of claim 19, wherein the sample comprises mRNA
molecules and is contacted with a nucleic acid probe.
21. A kit comprising a compound which selectively hybridizes to a
nucleic acid molecule of claim 1 and instructions for use.
22. A method for identifying a compound which binds to the
polypeptide of claim 12 comprising: a) contacting the polypeptide,
or a cell expressing the polypeptide with a test compound; and b)
determining whether the polypeptide binds to the test compound.
23. The method of claim 22, wherein the binding of the test
compound to the polypeptide is detected by a method selected from
the group consisting of: a) detection of binding by direct
detection of test compound/polypeptide binding; b) detection of
binding using a competition binding assay; and c) detection of
binding using an assay for CCP activity.
24. A method for modulating the activity of a polypeptide of claim
12 comprising contacting the polypeptide or a cell expressing the
polypeptide with a compound which binds to the polypeptide in a
sufficient concentration to modulate the activity of the
polypeptide.
25. A method for identifying a compound which modulates the
activity of a polypeptide of claim 12 comprising: a) contacting a
polypeptide of claim 12 with a test compound; and b) determining
the effect of the test compound on the activity of the polypeptide
to thereby identify a compound which modulates the activity of the
polypeptide.
26. A transgenic plant comprising the nucleic acid molecule of
claim 1.
27. The transgenic plant of claim 26, wherein the plant is a
monocot plant.
28. The transgenic plant of claim 26, wherein the plant is a dicot
plant.
29. The transgenic plant of claim 26, wherein the plant is selected
from the group consisting of Arabidopsis thaliana, rice, wheat,
maize, tomato, oilseed rape, soybean, sunflower, and canola.
30. A method for modulating the growth of a plant, comprising
introducing into the plant a CCP modulator in an amount sufficient
to modulate the growth of the plant, thereby modulating the growth
of the plant.
31. The method of claim 30, wherein the CCP modulator is a small
molecule.
32. The method of claim 30, wherein the CCP modulator is capable of
modulating CCP polypeptide activity.
33. The method of claim 32, wherein the CCP modulator is an
anti-CCP antibody; or wherein the CCP modulator is a CCP
polypeptide comprising the amino acid sequence of SEQ ID NOs:
67-132, 205, 211, 215-216 or 220-227, or a fragment thereof.
34. (canceled)
35. The method of claim 30, wherein the CCP modulator is capable of
modulating CCP nucleic acid expression.
36. The method of claim 35, wherein the CCP modulator is an
antisense CCP nucleic acid molecule; wherein the CCP modulator is a
ribozyme; or wherein the CCP modulator comprises the nucleotide
sequence of SEQ ID NO: 1-66 or 228-239, or a fragment thereof.
37-38. (canceled)
39. The method of claim 30, wherein the plant is a monocot
plant.
40. The method of claim 30, wherein the plant is a dicot plant.
41. The method of claim 30, wherein the plant is selected from the
group consisting of Arabidopsis thaliana, rice, wheat, maize,
tomato, alfalfa, oilseed rape, soybean, sunflower, and canola.
42. A method for modulating the cell cycle in a plant, comprising
introducing into the plant a CCP modulator in an amount sufficient
to modulate the cell cycle in the plant, thereby modulating the
cell cycle in the plant.
43. The method of claim 42, wherein the plant is a monocot
plant.
44. The method of claim 42, wherein the plant is a dicot plant.
45. The method of claim 42, wherein the plant is selected from the
group consisting of Arabidopsis thaliana, rice, wheat, maize,
tomato, alfalfa, oilseed rape, soybean, sunflower, and canola.
Description
RELATED APPLICATIONS
[0001] This application claims priority to U.S. provisional patent
application Ser. No. 60/204,045, filed May 12, 2000. The contents
of this provisional patent application are incorporated herein by
reference in their entirety.
BACKGROUND OF THE INVENTION
[0002] Cell division plays a crucial role during all phases of
plant development. The continuation of organogenesis and growth
responses to a changing environment require precise spatial,
temporal, and developmental regulation of cell division.
[0003] The basic mechanisms controlling the progression through the
cell cycle appear to be conserved in all higher eukaryotes,
although the temporal and spatial control of cell division can
differ largely between organisms. Plants have unique developmental
features which are not found in either animals or fungi. First, due
to the presence of a rigid cell wall, plant cells cannot move and
consequently organogenesis is dependent on cell division and cell
expansion at the site of formation of new organs. Secondly, cell
divisions are confined to specialized regions, called meristems.
These meristems continuously produce new cells which, as they move
away from the meristem, become differentiated. The meristem
identity itself can change from a vegetative to a reproductive
phase, resulting in the formation of flowers. Thirdly, plant
development is largely post-embryonic. During embryogenesis, the
main developmental event is the establishment of the root-shoot
axis.
[0004] Most plant growth occurs after germination, by iterative
development at the meristems. Lastly, as a consequence of the
sessile life of plants, development and cell division are, to a
large extent, influenced by environmental factors such as light,
gravity, wounding, nutrients, and stress conditions. All these
features are reflected in a plant-specific regulation of the
factors controlling cell division.
[0005] The unparalleled potential of plants for continuous
organogenesis and plastic growth also relies on the competent or
active state of the cell division apparatus. The discovery of a
common mechanism underlying the regulation of the cell cycle in
yeasts and animals has led to efforts to extend these findings to
the plant kingdom and is leading to research aimed at converting
the gathered knowledge into useful traits introduced in transgenic
plants.
[0006] When eukaryotic cells and, thus, also plant cells divide
they go through a highly ordered sequence of events collectively
termed as the "cell cycle." Briefly, DNA replication or synthesis
(S) and mitotic segregation of the chromosomes (M) occur with
intervening gap phases (G1 and G2) and the phases follow the
sequence G1-S-G2-M. Cell division is completed after cytokinesis,
the last step of the M-phase. Cells that have exited the cell cycle
and have become quiescent are said to be in the G0 phase. Cells at
the G0 stage can be stimulated to reenter the cell cycle at the G1
phase. The transitions between the different phases of the cell
cycle are basically driven by the sequential
activation/inactivation of a kinase (called "cyclin-dependent
kinase", "CDC" or "CDK") by different agonists.
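The phase order described in paragraph [0006] can be sketched as a small state model. This is only an illustration, assuming a continuously dividing cell; the names `Phase`, `CYCLE` and `next_phase` are hypothetical and do not come from the application, and the model ignores the CDK/cyclin regulation that actually drives each transition.

```python
from enum import Enum


class Phase(Enum):
    G0 = "G0"  # quiescent, outside the cycle
    G1 = "G1"  # first gap phase
    S = "S"    # DNA replication (synthesis)
    G2 = "G2"  # second gap phase
    M = "M"    # mitosis, ending with cytokinesis


# Order of the cycling phases as stated in the text: G1-S-G2-M.
CYCLE = [Phase.G1, Phase.S, Phase.G2, Phase.M]


def next_phase(phase: Phase) -> Phase:
    """Return the phase that follows `phase`.

    Assumption for illustration: a cell in M always continues to G1
    (exit to G0 is not modelled), and a stimulated G0 cell re-enters at G1.
    """
    if phase is Phase.G0:
        return Phase.G1
    i = CYCLE.index(phase)
    return CYCLE[(i + 1) % len(CYCLE)]
```

For example, `next_phase(Phase.S)` gives `Phase.G2`, and `next_phase(Phase.G0)` gives `Phase.G1`, matching the re-entry point named in the text.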
[0007] Proteins called cyclins are required for kinase activation.
Cyclins are also important for targeting the kinase activity to a
given subset of substrate(s). Other factors regulating CDK activity
include CDK inhibitors (CKIs or ICKs, KIPs, CIPs, INKs), CDK
activating kinase (CAK) and CDK phosphatase (CDC25) (Mironov et al.
(1999) Plant Cell 11, 509-522 and Won K. et al. (1996) EMBO J. 15,
4182-4193).
SUMMARY OF THE INVENTION
[0008] The present invention is based, at least in part, on the
discovery of novel plant nucleic acid molecules and polypeptides
encoded by such nucleic acid molecules, referred to herein as "cell
cycle proteins" or "CCP." The CCP nucleic acid and polypeptide
molecules of the present invention are useful as modulating agents
in regulating cell cycle progression in, for example, plants.
Accordingly, in one aspect, this invention provides isolated
nucleic acid molecules encoding CCP polypeptides, as well as
nucleic acid fragments suitable as primers or hybridization probes
for the detection of CCP-encoding nucleic acids.
[0009] In one embodiment, a CCP nucleic acid molecule of the
invention is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%
or more identical to the nucleotide sequence (e.g., to the entire
length of the nucleotide sequence) of SEQ ID NO:1-66 or 228-239, or
a complement thereof.
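As a rough illustration of the identity thresholds in paragraph [0009], percent identity over a pairwise alignment might be computed as below. This is a minimal sketch under stated assumptions: the two sequences are already aligned to equal length with "-" marking gaps, and identity is counted over all alignment columns. The function names and this particular definition are assumptions for illustration, not the definition used in the application.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes both inputs are pre-aligned (equal length, '-' for gaps).
    Gap columns count toward the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 60.0) -> bool:
    """True if the alignment reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With the toy alignment `"ACGTACGTAC"` versus `"ACGTTCGTAC"` (made-up sequences, one mismatch in ten columns), `percent_identity` returns 90.0, so the pair would satisfy a 60% threshold but not a 95% one.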
[0010] In a preferred embodiment, the isolated nucleic acid
molecule includes the nucleotide sequence shown in SEQ ID NO:1-66
or 228-239, or a complement thereof. In another preferred
embodiment, an isolated nucleic acid molecule of the invention
encodes the amino acid sequence of a plant CCP polypeptide.
[0011] Another embodiment of the invention features nucleic acid
molecules, preferably CCP nucleic acid molecules, which
specifically detect CCP nucleic acid molecules relative to nucleic
acid molecules encoding non-CCP polypeptides. For example, in one
embodiment, such a nucleic acid molecule is at least 15, 20, 25,
30, 40, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600,
650, 700, 750, or 800 nucleotides in length and hybridizes under
stringent conditions to a nucleic acid molecule comprising the
nucleotide sequence shown in SEQ ID NO:1-66 or 228-239, or a
complement thereof.
[0012] In other preferred embodiments, the nucleic acid molecule
encodes a naturally occurring allelic variant of a plant CCP
polypeptide, wherein the nucleic acid molecule hybridizes to the
nucleic acid molecule of SEQ ID NO:1-66 or 228-239 under stringent
conditions.
[0013] Another embodiment of the invention provides an isolated
nucleic acid molecule which is antisense to a CCP nucleic acid
molecule, e.g., the coding strand of a CCP nucleic acid
molecule.
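For illustration, a sequence antisense to a DNA coding strand, written 5' to 3', is the reverse complement of that strand. The sketch below assumes plain A/C/G/T DNA input; the function name and the toy sequence in the usage note are hypothetical and are not taken from the sequence listing.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence given 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]
```

For the made-up coding-strand fragment `"ATGGCC"`, `reverse_complement` returns `"GGCCAT"`.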
[0014] Another aspect of the invention provides a vector comprising
a CCP nucleic acid molecule. In certain embodiments, the vector is
a recombinant expression vector. In another embodiment, the
invention provides a host cell containing a vector of the
invention. The invention also provides a method for producing a CCP
polypeptide, by culturing in a suitable medium a host cell of the
invention, e.g., a plant host cell such as a host monocot plant
cell (e.g., rice, wheat or corn) or a dicot host cell (e.g.,
Arabidopsis thaliana, oilseed rape, or soybeans) containing a
recombinant expression vector, such that the polypeptide is
produced.
[0015] Another aspect of this invention features isolated or
recombinant CCP polypeptides. In one embodiment, an isolated CCP
polypeptide has one or more of the following domains: a "cyclin
destruction box", a "cyclin box motif 1", a "cyclin box motif 2", a
"CDC2 motif", a "CDK phosphorylation site", a "nuclear localization
signal", a "Cy-like box", an "Rb binding domain", a "DEF domain", a
"DNA binding domain", a "DCB1 domain", a "DCB2 domain" and/or a
"SAP domain".
[0016] In a preferred embodiment, a CCP polypeptide includes at
least one or more of the following domains: a "cyclin destruction
box", a "cyclin box motif 1", a "cyclin box motif 2", a "CDC2
motif", a "CDK phosphorylation site", a "nuclear localization
signal", a "Cy-like box", an "Rb binding domain", a "DEF domain", a
"DNA binding domain", a "DCB1 domain", a "DCB2 domain" and/or a
"SAP domain", and has an amino acid sequence at least about 50%,
55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more
identical to the amino acid sequence of SEQ ID NO:67-132, 205, 211,
215-216, or 220-227.
[0017] In another preferred embodiment, a CCP polypeptide includes
at least one or more of the following domains: a "cyclin
destruction box", a "cyclin box motif 1", a "cyclin box motif 2", a
"CDC2 motif", a "CDK phosphorylation site", a "nuclear localization
signal", a "Cy-like box", an "Rb binding domain", a "DEF domain", a
"DNA binding domain", a "DCB1 domain", a "DCB2 domain" and/or a
"SAP domain", and has a CCP activity (as described herein).
[0018] In yet another preferred embodiment, a CCP polypeptide
includes one or more of the following domains: a "cyclin
destruction box", a "cyclin box motif 1", a "cyclin box motif 2", a
"CDC2 motif", a "CDK phosphorylation site", a "nuclear localization
signal", a "Cy-like box", an "Rb binding domain", a "DEF domain", a
"DNA binding domain", a "DCB1 domain", a "DCB2 domain" and/or a
"SAP domain", and is encoded by a nucleic acid molecule having a
nucleotide sequence which hybridizes under stringent hybridization
conditions to a nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-66 or 228-239.
[0019] In another embodiment, the invention features fragments of
the polypeptide having the amino acid sequence of SEQ ID NO:67-132,
205, 211, 215-216, or 220-227, wherein the fragment comprises at
least 15 amino acids (e.g., contiguous amino acids) of the amino
acid sequence of SEQ ID NO:67-132, 205, 211, 215-216, or 220-227.
In another embodiment, a CCP polypeptide has the amino acid
sequence of SEQ ID NO:67-132, 205, 211, 215-216, or 220-227.
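The fragment criterion in paragraph [0019] (at least 15 contiguous amino acids of a reference sequence) can be checked with a simple substring test. This is a sketch only: the function name is hypothetical, sequences are assumed to be one-letter amino acid strings, and the example sequence in the usage note is invented rather than taken from any SEQ ID NO.

```python
def is_contiguous_fragment(fragment: str, reference: str, min_len: int = 15) -> bool:
    """True if `fragment` has at least `min_len` residues and occurs
    as one uninterrupted stretch within `reference`."""
    fragment = fragment.upper()
    reference = reference.upper()
    return len(fragment) >= min_len and fragment in reference
```

Using the invented 22-residue reference `"MKTAYIAKQRQISFVKSHFSRQ"`, the 15-residue slice `"AYIAKQRQISFVKSH"` qualifies, while the 14-residue slice `"AYIAKQRQISFVKS"` does not.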
[0020] In another embodiment, the invention features a CCP protein
which is encoded by a nucleic acid molecule consisting of a
nucleotide sequence at least about 50%, 55%, 60%, 65%, 70%, 75%,
80%, 85%, 90%, 95%, 98%, 99% or more identical to a nucleotide
sequence of SEQ ID NO:1-66 or 228-239, or a complement thereof.
This invention further features a CCP polypeptide, which is encoded
by a nucleic acid molecule consisting of a nucleotide sequence
which hybridizes under stringent hybridization conditions to a
nucleic acid molecule comprising the nucleotide sequence of SEQ ID
NO:1-66 or 228-239, or a complement thereof.
[0021] In another embodiment the invention provides transgenic
plants (e.g., monocot or dicot plants) containing an isolated
nucleic acid molecule of the present invention. For example, the
invention provides transgenic plants containing a recombinant
expression cassette including a plant promoter operably linked to
an isolated nucleic acid molecule of the present invention. The
present invention also provides transgenic seed from the transgenic
plants. In another embodiment the invention provides methods of
modulating, in a transgenic plant, the expression of the nucleic
acids of the invention.
[0022] The proteins of the present invention or portions thereof,
e.g., biologically active portions thereof, can be operatively
linked to a non-CCP polypeptide (e.g., heterologous amino acid
sequences) to form fusion proteins. The invention further features
antibodies, such as monoclonal or polyclonal antibodies, that
specifically bind a polypeptide of the invention, preferably a CCP
polypeptide. In addition, the CCP polypeptide or biologically
active portions thereof can be incorporated into pharmaceutical
compositions, which optionally include pharmaceutically acceptable
carriers.
[0023] In another aspect, the present invention provides a method
for detecting the presence of a CCP nucleic acid molecule or
polypeptide in a biological sample by contacting the biological
sample with an agent capable of detecting a CCP nucleic acid
molecule or polypeptide such that the presence of a CCP nucleic acid
molecule or polypeptide is detected in the biological sample.
[0024] In another aspect, the present invention provides a method
for detecting the presence of CCP activity in a biological sample
by contacting the biological sample with an agent capable of
detecting an indicator of CCP activity such that the presence of
CCP activity is detected in the biological sample.
[0025] In another aspect, the invention provides a method for
modulating CCP activity comprising contacting a cell capable of
expressing CCP with an agent that modulates CCP activity such that
CCP activity in the cell is modulated. In one embodiment, the agent
inhibits CCP activity. In another embodiment, the agent stimulates
CCP activity. In one embodiment, the agent is an antibody that
specifically binds to a CCP polypeptide. In another embodiment, the
agent modulates expression of CCP by modulating transcription of a
CCP gene or translation of a CCP mRNA. In yet another embodiment,
the agent is a nucleic acid molecule having a nucleotide sequence
that is antisense to the coding strand of a CCP mRNA or a CCP
gene.
[0026] In one embodiment, the methods of the present invention are
used to increase crop yield, improve the growth characteristics of
a plant (such as growth rate or size of specific tissues or organs
in the plant), modify the architecture or morphology of a plant,
improve tolerance to environmental stress conditions (such as
drought, salt, temperature, or nutrient deprivation), or improve
tolerance to plant pathogens (e.g., pathogens that abuse the cell
cycle) by modulating CCP activity in a cell. In one embodiment, the
CCP activity is modulated by modulating the expression of a CCP
nucleic acid molecule. In yet another embodiment, the CCP activity
is modulated by modulating the activity of a CCP polypeptide.
Modulators of CCP activity include, for example, a CCP nucleic acid
or polypeptide.
[0027] The present invention also provides diagnostic assays for
identifying the presence or absence of a genetic alteration
characterized by at least one of (i) aberrant modification or
mutation of a gene encoding a CCP polypeptide; (ii) mis-regulation
of the gene; and (iii) aberrant post-translational modification of
a CCP polypeptide, wherein a wild-type form of the gene encodes a
protein with a CCP activity.
[0028] In another aspect the invention provides methods for
identifying a compound that binds to or modulates the activity of a
CCP polypeptide, by providing an indicator composition comprising a
CCP polypeptide having CCP activity, contacting the indicator
composition with a test compound, and determining the effect of the
test compound on CCP activity in the indicator composition to
identify a compound that modulates the activity of a CCP
polypeptide. The identified compounds may be used as herbicides or
plant growth regulators.
[0029] Other features and advantages of the invention will be
apparent from the following detailed description and claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0030] FIG. 1 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP1. The complete nucleotide
sequence (FIG. 1A) corresponds to nucleic acids 1 to 1715 of SEQ ID
NO:39. The complete amino acid sequence (FIG. 1B) corresponds to
amino acids 1 to 460 of SEQ ID NO:105. Underlined in FIG. 1A and
FIG. 1B are the partially characterized nucleotide (SEQ ID NO:1)
and predicted partial amino acid (SEQ ID NO:67) sequence,
respectively. Further indicated in FIG. 1A are the stop and start
codons (both in black shaded boxes) which are part of the primers
(grey shaded boxes) used to amplify the coding region of CCP1 by
PCR. The SEQ ID NOs of the primers used can be found in Table III.
Indicated in FIG. 1B are the cyclin destruction box (black shaded
box) and the cyclin box motifs 1 and 2 (both in gray shaded
boxes).
[0031] FIG. 2 depicts the cDNA sequence of the Arabidopsis thaliana
CCP2. The complete nucleotide sequence corresponds to nucleic acids
1 to 2195 of SEQ ID NO:40. Underlined is the partially
characterized nucleotide (SEQ ID NO:2) sequence. Nucleotide
sequence differences between SEQ ID NO:40 and SEQ ID NO:2 are
depicted. Indicated are the stop and start codons (both in black
shaded boxes) which are part of the primers (grey shaded boxes)
used to amplify the coding region of CCP2 by PCR. SEQ ID NOs of the
primers used can be found in Table III.
[0032] FIG. 3 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP2. The complete amino acid sequence
corresponds to amino acids 1 to 664 of SEQ ID NO:106. Underlined is
the predicted partial amino acid (SEQ ID NO:68) sequence.
[0033] FIG. 4 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP3. The complete nucleotide
sequence (FIG. 4A) corresponds to nucleic acids 1 to 1413 of SEQ ID
NO:41. The complete amino acid sequence (FIG. 4B) corresponds to
amino acids 1 to 450 of SEQ ID NO:69. Underlined in FIG. 4A and
FIG. 4B are the partially characterized nucleotide (SEQ ID NO:3)
and predicted partial amino acid (SEQ ID NO:69) sequences,
respectively. Indicated in FIG. 4A are the stop and start codons
(both in black shaded boxes) which are part of the primers (grey
shaded boxes) used to amplify the coding region of CCP3 by PCR. SEQ
ID NOs of the primers used can be found in Table III. Nucleotide
sequence differences between SEQ ID NO:41 and SEQ ID NO:3 are
depicted. Indicated in FIG. 4B are the cyclin destruction box (black
shaded box) and the cyclin box motifs 1 and 2 (both in gray shaded
boxes).
[0034] FIG. 5 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP4. The complete nucleotide
sequence (FIG. 5A) corresponds to nucleic acids 1 to 672 of SEQ ID
NO:4. The complete amino acid sequence (FIG. 5B) corresponds to
amino acids 1 to 223 of SEQ ID NO:70. Indicated in FIG. 5A are the stop
and start codons (both in black shaded boxes) which are part of the
primers (grey shaded boxes) used to amplify the coding region of
CCP4 by PCR. SEQ ID NOs of the primers used can be found in Table
III. Indicated in FIG. 5B is the CDK phosphorylation site (black
shaded box).
[0035] FIG. 6 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP5. The complete nucleotide
sequence (FIG. 6A) corresponds to nucleic acids 1 to 1287 of SEQ ID
NO:5. The complete amino acid sequence (FIG. 6B) corresponds to
amino acids 1 to 429 of SEQ ID NO:71. Indicated in FIG. 6A are the
stop and start codons (both in black shaded boxes) which are part
of the primers (grey shaded boxes) used to amplify the coding
region of CCP5 by PCR. SEQ ID NOs of the primers used can be found
in Table III. Indicated in FIG. 6B are the cyclin destruction box
(black shaded box) and the cyclin box motifs 1 and 2 (both in gray
shaded boxes).
[0036] FIG. 7 depicts the cDNA sequence of the Arabidopsis thaliana
CCP6. The complete nucleotide sequence corresponds to nucleic acids
1 to 2766 of SEQ ID NO:42. Underlined is the partially
characterized nucleotide (SEQ ID NO:6) sequence. Indicated are the
stop and start codons (both in black shaded boxes) which are part
of the primers (grey shaded boxes) used to amplify the coding
region of CCP6 by PCR. SEQ ID NOs of the primers used can be found
in Table III. Nucleotide sequence differences between SEQ ID NO:42
and SEQ ID NO:6 are depicted.
[0037] FIG. 8 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP6. The complete amino acid sequence
corresponds to amino acids 1 to 901 of SEQ ID NO:108. Underlined is
the predicted partial amino acid (SEQ ID NO:72) sequence.
[0038] FIG. 9 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP7/CCP8. The complete
nucleotide sequence (FIG. 9A) corresponds to nucleic acids 1 to
1260 of SEQ ID NO:43. The complete amino acid sequence (FIG. 9B)
corresponds to amino acids 1 to 358 of SEQ ID NO:109. Underlined in
FIG. 9A and FIG. 9B are the partially characterized nucleotide (SEQ
ID NO:7) and predicted partial amino acid (SEQ ID NO:73) sequence,
respectively. Italic sequences in FIG. 9A and FIG. 9B correspond to
the partially characterized nucleotide (SEQ ID NO:8) and amino acid
(SEQ ID NO:74) sequence, respectively, of another clone found
independently to interact with an AtE2F protein in a yeast
two-hybrid screen. Indicated in FIG. 9A are the stop and start
codons (both in black shaded boxes) which are part of the primers
(grey shaded boxes) used to amplify the coding region of CCP7/8 by
PCR. SEQ ID NOs of the primers used can be found in Table III.
Nucleotide sequence differences between SEQ ID NO:43 and SEQ ID
NO:7-8 are depicted.
[0039] FIG. 10 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP9. The complete nucleotide
sequence (FIG. 10A) corresponds to nucleic acids 1 to 1308 of SEQ
ID NO:9. The complete amino acid sequence (FIG. 10B) corresponds to
amino acids 1 to 436 of SEQ ID NO:75. Indicated in FIG. 10A are the
stop and start codons (both in black shaded boxes) which are part
of the primers (grey shaded boxes) used to amplify the coding
region of CCP9 by PCR. SEQ ID NOs of the primers used can be found
in Table III. Indicated in FIG. 10B are the cyclin destruction box
(black shaded box) and the cyclin box motifs 1 and 2 (both in gray
shaded boxes).
[0040] FIG. 11 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP10. The complete nucleotide
sequence (FIG. 11A) corresponds to nucleic acids 1 to 1006 of SEQ
ID NO:10. The complete amino acid sequence (FIG. 11B) corresponds
to amino acids 1 to 254 of SEQ ID NO:76. Indicated in FIG. 11A are
the stop and start codons (both in black shaded boxes) which are
part of the primers (grey shaded boxes) used to amplify the coding
region of CCP10 by PCR. SEQ ID NOs of the primers used can be found
in Table III.
[0041] FIG. 12 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP11. The complete nucleotide
sequence (FIG. 12A) corresponds to nucleic acids 1 to 653 of SEQ ID
NO:44. Indicated in FIG. 12A are the stop and start codons (both in
black shaded boxes) which are part of the primers (grey shaded
boxes) used to amplify the coding region of CCP11 by PCR. SEQ ID
NOs of the primers used can be found in Table III. However, during
prediction of the open reading frame a frame shift was introduced
which affected the CCP11 open reading frame. The stop codon
indicated in italics in a black shaded box is the putative correct
stop codon. The amino acid sequence in FIG. 12B corresponds to
amino acids 1 to 86 of SEQ ID NO:77, the protein encoded by the
initially identified open reading frame of SEQ ID NO:11. The
putative correct complete amino acid sequence in FIG. 12C
corresponds to amino acids 1 to 98 of SEQ ID NO:110.
[0042] FIG. 13 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP12/13. The complete
nucleotide sequence (FIG. 13A) corresponds to nucleic acids 1 to
1266 of SEQ ID NO:45. The complete amino acid sequence (FIG. 13B)
corresponds to amino acids 1 to 385 of SEQ ID NO:111. Double
underlined in FIG. 13A and FIG. 13B are the partially characterized
3' nucleotide (SEQ ID NO:12) and C-terminal predicted partial amino
acid (SEQ ID NO:78) sequence, respectively. Single underlined in
FIG. 13A and FIG. 13B are the partially characterized 5' nucleotide
(SEQ ID NO:13) and N-terminal predicted partial amino acid (SEQ ID
NO:79) sequences, respectively. Indicated in FIG. 13A are the stop
and start codons (both in black shaded boxes) and the primers (grey
shaded boxes) used to amplify the coding region of CCP12/13 by PCR.
SEQ ID NOs of the primers used can be found in Table III.
Nucleotide sequence differences between SEQ ID NO:45 and SEQ ID
NO:12 are depicted.
[0043] FIG. 14 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP14. The complete nucleotide
sequence (FIG. 14A) corresponds to nucleic acids 1 to 1520 of SEQ
ID NO:46. The complete amino acid sequence (FIG. 14B) corresponds
to amino acids 1 to 465 of SEQ ID NO:112. Underlined in FIG. 14A
and FIG. 14B are the partially characterized nucleotide (SEQ ID
NO:14) and predicted partial amino acid (SEQ ID NO:80) sequence,
respectively. Indicated in FIG. 14A are the stop and start codons
(both in black shaded boxes) which are part of the primers (grey
shaded boxes) used to amplify the coding region of CCP14 by PCR.
SEQ ID NOs of the primers used can be found in Table III.
[0044] FIG. 15 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP15. The complete nucleotide
sequence (FIG. 15A) corresponds to nucleic acids 1 to 1142 of SEQ
ID NO:47. The complete amino acid sequence (FIG. 15B) corresponds to
amino acids 1 to 313 of SEQ ID NO:113. Underlined in FIG. 15A and
FIG. 15B are the partially characterized nucleotide (SEQ ID NO:15)
and predicted partial amino acid (SEQ ID NO:81) sequence,
respectively. Indicated in FIG. 15A are the stop and start codons
(both in black shaded boxes) which are part of the primers (grey
shaded boxes) used to amplify the coding region of CCP15 by PCR.
SEQ ID NOs of the primers used can be found in Table III.
Nucleotide sequence differences between SEQ ID NO:47 and SEQ ID
NO:15 are depicted. Indicated in FIG. 15B are the PSTTLRE motif
(boxed) characteristic for the subclass of plant PSTTLRE CDC2
kinases.
[0045] Further indicated in FIG. 15B are three CDC2 motifs (black
shaded box, grey shaded box and double underlined). Other residues
conserved in CDC2s are underscored by `*` (residues in common with
ProDom domain PD198850), `+` (residues in common with ProDom domain
PD015684), (residues in common with ProDom domain PD063669), and
`1` (residues in common with ProDom domain PD195780).
[0046] FIG. 16 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP16. The complete nucleotide
sequence (FIG. 16A) corresponds to nucleic acids 1 to 1189 of SEQ
ID NO:48. The complete amino acid sequence (FIG. 16B) corresponds
to amino acids 1 to 292 of SEQ ID NO:114. Indicated in FIG. 16A are
the stop and the three possible start codons (all in black shaded
boxes) and the primers (grey shaded boxes) used to amplify the
coding region of CCP16 by PCR. SEQ ID NOs of the primers used can
be found in Table III. Nucleotide sequence differences between SEQ
ID NO:48 and SEQ ID NO:16 are depicted. Indicated in FIG. 16B are
the DNA binding domain (black shaded box), DEF domain (grey shaded
box), DCB1 domain (single underlined) and DCB2 domain (double
underlined), all domains characteristic for a DP protein.
[0047] FIG. 17 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP17. The complete nucleotide
sequence (FIG. 17A) corresponds to nucleic acids 1 to 794 of SEQ ID
NO:17. The complete amino acid sequence (FIG. 17B) corresponds to
amino acids 1 to 173 of SEQ ID NO:83. Indicated in FIG. 17A are the
stop and start codons (both in black shaded boxes) which are part
of the primers (grey shaded boxes) used to amplify the coding
region of CCP17 by PCR. SEQ ID NOs of the primers used can be found
in Table III.
[0048] FIG. 18 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP18. The complete nucleotide
sequence (FIG. 18A) corresponds to nucleic acids 1 to 805 of SEQ ID
NO:49. The complete amino acid sequence (FIG. 18B) corresponds to
amino acids 1 to 165 of SEQ ID NO:115. Underlined in FIG. 18A and
FIG. 18B are the partially characterized nucleotide (SEQ ID NO:18)
and predicted partial amino acid (SEQ ID NO:84) sequence,
respectively. Indicated in FIG. 18A are the stop and start codons
(both in black shaded boxes) which are part of the primers (grey
shaded boxes) used to amplify the coding region of CCP18 by PCR.
SEQ ID NOs of the primers used can be found in Table III.
[0049] FIG. 19 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP19. The complete nucleotide
sequence (FIG. 19A) corresponds to nucleic acids 1 to 1152 of SEQ
ID NO:19. The complete amino acid sequence (FIG. 19B) corresponds to
amino acids 1 to 383 of SEQ ID NO:85. Indicated in FIG. 19A are
the stop and start codons (both in black shaded boxes) which are
part of the primers (grey shaded boxes) used to amplify the coding
region of CCP19 by PCR. SEQ ID NOs of the primers used can be found
in Table III.
[0050] FIG. 20 depicts the cDNA sequence of the Arabidopsis
thaliana CCP20/21. The complete nucleotide sequence corresponds to
nucleic acids 1 to 1539 of SEQ ID NO:50. Underlined are the
partially characterized 5' nucleotide (SEQ ID NO:20) sequence and
the partially characterized 3' nucleotide (SEQ ID NO:21) sequence. Indicated
in FIG. 20 are the stop and start codons (both in black shaded
boxes) which are part of the primers (grey shaded boxes) used to
amplify the coding region of CCP20/21 by PCR. SEQ ID NOs of the
primers used can be found in Table III. Nucleotide sequence
differences between SEQ ID NOs:20-21 and SEQ ID NO:50 are
depicted.
[0051] FIG. 21 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP20/21. The complete amino acid sequence
corresponds to amino acids 1 to 432 of SEQ ID NO:116. Underlined
are the partially characterized N-terminal predicted partial amino
acid (SEQ ID NO:50) sequence and the partially characterized
C-terminal predicted partial amino acid (SEQ ID NO:87) sequence.
Indicated are further differences in amino acid sequence between
SEQ ID NO:87 and SEQ ID NO:116.
[0052] FIG. 22 depicts the cDNA sequence of the Arabidopsis
thaliana CCP22. The complete nucleotide sequence corresponds to
nucleic acids 1 to 1977 of SEQ ID NO:51. Underlined is the
partially characterized nucleotide (SEQ ID NO:22) sequence. Indicated are
the stop and start codons (both in black shaded boxes) which are
part of the primers (grey shaded boxes) used to amplify the coding
region of CCP22 by PCR. SEQ ID NOs of the primers used can be found
in Table III.
[0053] FIG. 23 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP22. The complete amino acid sequence
corresponds to amino acids 1 to 559 of SEQ ID NO:117. Underlined is
the predicted partial amino acid (SEQ ID NO:88) sequence.
[0054] FIG. 24 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP23. The complete nucleotide
sequence (FIG. 24A) corresponds to nucleic acids 1 to 525 of SEQ ID
NO:52. Indicated in FIG. 24A are the stop and start codons (both in
black shaded boxes) which are part of the primers (grey shaded
boxes) used to amplify the coding region of CCP23 by PCR. SEQ ID
NOs of the primers used can be found in Table III. Nucleotide
sequence differences between SEQ ID NO:23 and SEQ ID NO:52 are
depicted. The amino acid sequence in FIG. 24B corresponds to amino
acids 1 to 98 of SEQ ID NO:89. The complete amino acid sequence in
FIG. 24C corresponds to amino acids 1 to 86 of SEQ ID NO:118.
[0055] FIG. 25 depicts the cDNA sequence of the Arabidopsis
thaliana CCP24. The complete nucleotide sequence corresponds to
nucleic acids 1 to 2610 of SEQ ID NO:53. Underlined is the
partially characterized nucleotide (SEQ ID NO:24) sequence. Indicated are
the stop and start codons (both in black shaded boxes) which are
part of the primers (grey shaded boxes) used to amplify the coding
region of CCP24 by PCR. SEQ ID NOs of the primers used can be found
in Table III.
[0056] FIG. 26 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP24. The complete amino acid sequence
corresponds to amino acids 1 to 784 of SEQ ID NO:119. Underlined is
the predicted partial amino acid (SEQ ID NO:90) sequence.
[0057] FIG. 27 depicts the cDNA sequence of the Arabidopsis
thaliana CCP25. The complete nucleotide sequence corresponds to
nucleic acids 1 to 2235 of SEQ ID NO:54. Underlined is the
partially characterized nucleotide (SEQ ID NO:25) sequence.
Indicated are the stop and start codons (both in black shaded boxes)
which are part of the primers (grey shaded boxes) used to amplify
the coding region of CCP25 by PCR. SEQ ID NOs of the primers used
can be found in Table III.
[0058] FIG. 28 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP25. The complete amino acid sequence
corresponds to amino acids 1 to 724 of SEQ ID NO:120. Underlined is
the predicted partial amino acid (SEQ ID NO:91) sequence.
[0059] FIG. 29 depicts the cDNA sequence of the Arabidopsis
thaliana CCP26. The complete nucleotide sequence corresponds to
nucleic acids 1 to 4002 of SEQ ID NO:55. Underlined is the
partially characterized nucleotide (SEQ ID NO:26) sequence.
Indicated are the stop and start codons (both in black shaded boxes)
which are part of the primers (grey shaded boxes) used to amplify
the coding region of CCP26 by PCR. SEQ ID NOs of the primers used
can be found in Table III. Nucleotide sequence differences between
SEQ ID NO:26 and SEQ ID NO:55 are depicted.
[0060] FIG. 30 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP26. The complete amino acid sequence
corresponds to amino acids 1 to 1313 of SEQ ID NO:121. Underlined
is the predicted partial amino acid (SEQ ID NO:92) sequence. Amino
acid sequence differences between SEQ ID NO:92 and SEQ ID NO:121
are depicted.
[0061] FIG. 31 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP27. The complete nucleotide
sequence (FIG. 31A) corresponds to nucleic acids 1 to 1251 of SEQ
ID NO:56. The complete amino acid sequence (FIG. 31B) corresponds
to amino acids 1 to 310 of SEQ ID NO:122. Underlined in FIG. 31A
and FIG. 31B are the partially characterized nucleotide (SEQ ID
NO:27) and predicted partial amino acid (SEQ ID NO:93) sequence,
respectively. Indicated in FIG. 31A are the stop and start codons
(both in black shaded boxes) which are part of the primers (grey
shaded boxes) used to amplify the coding region of CCP27 by PCR.
SEQ ID NOs of the primers used can be found in Table III.
Nucleotide sequence differences between SEQ ID NO:27 and SEQ ID
NO:56 are depicted in FIG. 31A.
[0062] FIG. 32 depicts the cDNA sequence of the Arabidopsis
thaliana CCP28. The complete nucleotide sequence corresponds to
nucleic acids 1 to 2955 of SEQ ID NO:57. Underlined is the
partially characterized nucleotide (SEQ ID NO:28) sequence.
Indicated are the stop and start codons (both in black shaded
boxes) which are part of the primers (grey shaded boxes) used to
amplify the coding region of CCP28 by PCR. SEQ ID NOs of the
primers used can be found in Table III. Nucleotide sequence
differences between SEQ ID NO:28 and SEQ ID NO:57 are depicted.
[0063] FIG. 33 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP28. The complete amino acid sequence
corresponds to amino acids 1 to 964 of SEQ ID NO:123. Underlined is
the predicted partial amino acid (SEQ ID NO:94) sequence.
[0064] FIG. 34 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP29. The complete nucleotide
sequence (FIG. 34A) corresponds to nucleic acids 1 to 546 of SEQ ID
NO:29. The complete amino acid sequence (FIG. 34B) corresponds to
amino acids 1 to 181 of SEQ ID NO:95. Indicated in FIG. 34A are the
stop and start codons (both in black shaded boxes) which are part
of the primers (grey shaded boxes) used to amplify the coding
region of CCP29 by PCR. SEQ ID NOs of the primers used can be found
in Table III.
[0065] FIG. 35 depicts the cDNA sequences and predicted amino acid
sequences of the Arabidopsis thaliana CCP30. The complete
nucleotide sequence (FIG. 35A) corresponds to nucleic acids 1 to
492 of SEQ ID NO:30. Indicated in FIG. 35A are the stop and start
codons (both in black shaded boxes), the complete sense primer and
part of the antisense primer (grey shaded boxes) used to amplify
the coding region of CCP30 by PCR. SEQ ID NOs of the primers used
can be found in Table III. However, after sequencing of the PCR
product a sequence error in SEQ ID NO:30 was detected (boxed
nucleotide `a` in FIG. 35A not present) which caused a frame shift
affecting the CCP30 open reading frame. The putative correct
cDNA sequence is given in FIG. 35B (nucleic acids 1 to 865 of SEQ
ID NO:58) wherein the three putative start codons are marked by a
black shaded box. The originally identified start codon is
indicated in bold letters. The stop codon is unaltered. The amino
acid sequence in FIG. 35C corresponds to amino acids 1 to 163 of
SEQ ID NO:96, the protein encoded by the initially identified open
reading frame of SEQ ID NO:30. The putative correct complete amino
acid sequence in FIG. 35D corresponds to amino acids 1 to 222 of
SEQ ID NO:124 which comprises the longest possible open reading
frame. The Met residues corresponding to the three possible start
codons in SEQ ID NO:58 (FIG. 35B) are bold faced.
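The effect of a single missing nucleotide on a predicted open reading frame can be illustrated with a minimal sketch. It assumes the standard genetic code and uses a made-up 15-nucleotide toy sequence that is not taken from any SEQ ID NO of this application; the function name `translate` and the variable names are hypothetical and chosen only for illustration.

```python
# Standard genetic code laid out in TCAG order (assumption: standard nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna: str) -> str:
    """Translate from the first nucleotide and stop at the first in-frame stop codon."""
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for pos in range(0, len(cdna) - 2, 3):
        residue = CODON_TABLE[cdna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy reading frame: ATG GCT AAA GGT TGA
toy_orf = "ATGGCTAAAGGTTGA"
# The same sequence with one nucleotide (index 4) missing, as in a sequencing error.
missing_one = toy_orf[:4] + toy_orf[5:]

print(translate(toy_orf))
print(translate(missing_one))
```

In this toy case every residue after the first codon changes and the original stop codon is no longer read in frame, which is the kind of shift a single missing nucleotide produces in a predicted open reading frame.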
[0066] FIG. 36 depicts the cDNA sequence of the Arabidopsis
thaliana CCP31. The complete nucleotide sequence corresponds to
nucleic acids 1 to 723 of SEQ ID NO:31. Indicated in FIG. 36 are
the stop and start codons (both in black shaded boxes).
[0067] FIG. 37 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP31. The complete amino acid sequence
corresponds to amino acids 1 to 148 of SEQ ID NO:125.
[0068] FIG. 38 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP32. The complete nucleotide
sequence (FIG. 38A) corresponds to nucleic acids 1 to 426 of SEQ ID
NO:60. The complete amino acid sequence (FIG. 38B) corresponds to
amino acids 1 to 70 of SEQ ID NO:126. Underlined in FIG. 38A is the
partially characterized nucleotide (SEQ ID NO:32) sequence.
Indicated in FIG. 38A are the stop and start codons (both in black
shaded boxes) which are part of the primers (grey shaded boxes)
used to amplify the coding region of CCP32 by PCR. SEQ ID NOs of
the primers used can be found in Table III. FIG. 38C gives the
originally erroneously predicted amino acid sequence of CCP32
(amino acids 1 to 38 of SEQ ID NO:98).
[0069] FIG. 39 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP33. The complete nucleotide
sequence (FIG. 39A) corresponds to nucleic acids 1 to 1442 of SEQ
ID NO:61. The complete amino acid sequence (FIG. 39B) corresponds
to amino acids 1 to 385 of SEQ ID NO:127. Indicated in FIG. 39A are
the stop and start codons (both in black shaded boxes) which are
part of the primers (grey shaded boxes) used to amplify the coding
region of CCP33 by PCR. SEQ ID NOs of the primers used can be found
in Table III. Indicated in FIG. 39B are the DNA binding domain
(black shaded box), DEF domain (grey shaded box), DCB1 domain
(single underlined) and DCB2 domain (double underlined), all
domains characteristic for a DP protein.
[0070] FIG. 40 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP34. The complete nucleotide
sequence (FIG. 40A) corresponds to nucleic acids 1 to 1506 of SEQ
ID NO:62. The complete amino acid sequence (FIG. 40B) corresponds
to amino acids 1 to 437 of SEQ ID NO:128. Underlined in FIG. 40A
and FIG. 40B are the partially characterized nucleotide (SEQ ID
NO:34) and predicted partial amino acid (SEQ ID NO:62) sequence,
respectively. Indicated in FIG. 40A are the stop and start codons
(both in black shaded boxes) which are part of the primers (grey
shaded boxes) used to amplify the coding region of CCP34 by PCR.
SEQ ID NOs of the primers used can be found in Table III.
[0071] FIG. 41 depicts the cDNA sequence of the Arabidopsis
thaliana CCP35. The complete nucleotide sequence corresponds to
nucleic acids 1 to 2631 of SEQ ID NO:63. Underlined is the
partially characterized nucleotide (SEQ ID NO:35) sequence.
Indicated are the stop and start codons (both in black shaded
boxes) and the primers (grey shaded boxes) used to amplify the
coding region of CCP35 by PCR. SEQ ID NOs of the primers used can
be found in Table III. Nucleotide sequence differences between SEQ
ID NO:35 and SEQ ID NO:63 are depicted.
[0072] FIG. 42 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP35. The complete amino acid sequence
corresponds to amino acids 1 to 749 of SEQ ID NO:129. Underlined is
the predicted partial amino acid (SEQ ID NO:101) sequence.
[0073] FIG. 43 depicts the cDNA sequence of the Arabidopsis
thaliana CCP36. The complete nucleotide sequence corresponds to
nucleic acids 1 to 2743 of SEQ ID NO:64. Underlined is the
partially characterized nucleotide (SEQ ID NO:36) sequence.
Indicated are the stop and start codons (both in black shaded
boxes). Nucleotide sequence differences between SEQ ID NO:36 and
SEQ ID NO:64 are depicted.
[0074] FIG. 44 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP36. The complete amino acid sequence
corresponds to amino acids 1 to 742 of SEQ ID NO:130. Underlined is
the predicted partial amino acid (SEQ ID NO:102) sequence.
[0075] FIG. 45 depicts the cDNA sequence of the Arabidopsis
thaliana CCP37. The complete nucleotide sequence corresponds to
nucleic acids 1 to 2959 of SEQ ID NO:65. Underlined is the
partially characterized nucleotide (SEQ ID NO:37) sequence.
Indicated are the stop and start codons (both in black shaded
boxes) and primers (grey shaded boxes) used to amplify the coding
region of CCP37 by PCR. SEQ ID NOs of the primers used can be found
in Table III.
[0076] FIG. 46 depicts the predicted amino acid sequence of the
Arabidopsis thaliana CCP37. The complete amino acid sequence
corresponds to amino acids 1 to 911 of SEQ ID NO:131. Underlined is
the predicted partial amino acid (SEQ ID NO:103) sequence.
Indicated in a black shaded box is a SAP-like domain.
[0077] FIG. 47 depicts the cDNA sequence and predicted amino acid
sequence of the Arabidopsis thaliana CCP38. The complete nucleotide
sequence (FIG. 47A) corresponds to nucleic acids 1 to 1295 of SEQ
ID NO:66. The complete amino acid sequence (FIG. 47B) corresponds
to amino acids 1 to 357 of SEQ ID NO:132. Underlined in FIG. 47A
and FIG. 47B are the partially characterized nucleotide (SEQ ID
NO:38) and predicted partial amino acid (SEQ ID NO:104) sequence,
respectively. Indicated in FIG. 47A are the stop and start codons
(both in black shaded boxes) which are part of the primers (grey
shaded boxes) used to amplify the coding region of CCP38 by PCR.
SEQ ID NOs of the primers used can be found in Table III.
[0078] FIG. 48 depicts phosphorylation of the Arabidopsis thaliana
CCP4 by CDKs. The protein CDC2bDN-IC26M (SEQ ID NO:70) contains a
consensus CDK phosphorylation site (TPWK, residues 54-57 of SEQ ID
NO:263). The corresponding gene (SEQ ID NO:4) was expressed in E.
coli and the protein was purified from the crude extracts. The
purified protein was subsequently shown to be phosphorylated by
CDKs in an in vitro CDK phosphorylation assay. -: no IC26M
added; +: IC26M added.
[0079] FIG. 49 schematically represents the domain organization of
AtE2Fa and AtE2Fb. The DNA-binding domain (DB), the dimerization
domain (DIM), the marked box (MB), and the Rb-binding domain (RB)
are indicated by marked boxes, the N-terminal domains are indicated
by open boxes. Numbering on the right refers to the amino acid
sequence contained in the different AtE2F constructs, which were
used in the in vitro binding assays.
[0080] FIG. 50 depicts AtDPa in vitro interactions with AtE2Fa and
AtE2Fb. The c-myc-tagged AtDPa (c-myc-AtDPa) was in vitro
translated and used as control. The lower migrating proteins
observed in the case of c-myc-AtDPa are most probably due to
initiation of translation at internal methionine codons (panel A,
unnumbered left lane). The c-myc-AtDPa was in vitro co-translated
with HA-AtE2Fb (panels A and B, lane 1), HA-AtE2Fa (panels A and B, lane
2), the C-terminal deleted form of HA-AtE2Fb (panels A and B, lane
3), HA-AtE2Fa 1-420 (panels A and B, lane 4) and the N-terminal
truncated form of HA-AtE2Fa 162-485 (panels A and B, lane 5) as
indicated. Numbers in the case of the mutant AtE2Fs refer to the
amino acid sequence contained in these constructs (see FIG. 49). An
aliquot of each sample was analyzed directly by SDS-PAGE and
autoradiographed (panel A; total IVT, total in vitro translation).
Another aliquot of the same samples was subjected to
immunoprecipitation with anti-c-myc monoclonal antibodies (panel
B); lanes are indicated by numbering. The positions of the c-myc-AtDPa
proteins are marked by arrows in both panels. Molecular mass
markers are indicated at the left.
[0081] FIG. 51 shows AtDPb in vitro interactions with AtE2Fa and
AtE2Fb. The c-myc-tagged AtDPb (c-myc-AtDPb, panels A and B, lane
2) and the HA-tagged AtE2Fb (HA-AtE2Fb, panels A and B, lane 1)
were in vitro translated and used as controls. The lower migrating
proteins observed in the case of c-myc-AtDPb are most probably due
to initiation of translation at internal methionine codons (panel
A, lane 2). The c-myc-AtDPb was in vitro co-translated with
HA-AtE2Fb (panels A and B, lane 3), HA-AtE2Fa (panels A and B, lane
4), HA-AtE2Fa 1-420 (panels A and B, lane 5) and the N-terminal
truncated form of HA-AtE2Fa 162-485 (panels A and B, lane 6) as
indicated. Numbers in the case of the mutant AtE2Fs refer to the
amino acid sequence contained in these constructs (see FIG. 49). An
aliquot of each sample was analyzed directly by SDS-PAGE and
autoradiographed (panel A; total IVT, total in vitro translation).
Another aliquot of the same samples was subjected to
immunoprecipitation with anti-c-myc monoclonal antibodies (panel
B), lanes are indicated by numbering. The c-myc-AtDPb (panels A and
B, lanes 2-6; indicated with `y`) co-migrated almost exactly with
the mutant HA-AtE2Fa 1-420 (panels A and B, lane 5; indicated with
`x`) and HA-AtE2Fa 162-485 (panels A and B, lane 6; indicated with
`z`) in the gel system. These polypeptides as well as the position
of c-myc-AtDPa and c-myc-AtDPb proteins are marked by arrows marked
with `y`, `x` and `z`, respectively (cfr. supra). Molecular mass
markers are indicated at the left.
[0082] FIG. 52 schematically represents AtDPa and mutants. The
DNA-binding domain (DB) and the dimerization domain (DIM) are
indicated by marked boxes, N- and C-terminal regions are indicated
by open boxes. Numbering on the right side refers to the amino acid
sequence contained in the different AtDP constructs, which were
used in the in vitro binding assays.
[0083] FIG. 53 schematically represents AtDPb and mutants. The
DNA-binding domain (DB) and the dimerization domain (DIM) are
indicated by marked boxes, N- and C-terminal regions are indicated
by open boxes. Numbering on the right side refers to the amino acid
sequence contained in the different AtDP constructs, which were
used in the in vitro binding assays.
[0084] FIG. 54 shows the mapping of regions in AtDPa required for
in vitro binding to AtE2Fb. HA-AtE2Fb was co-translated with a series
of c-myc-AtDPa mutants. An aliquot of each sample was analyzed
directly by SDS-PAGE and autoradiographed (panel A). Another
aliquot of the same samples was subjected to immunoprecipitation
with anti-HA (panel B) or anti-c-myc (panel C) monoclonal
antibodies. The c-myc-AtDPa mutants are marked by dots. Positions
of the HA-AtE2Fb proteins are indicated by arrows. Molecular mass
markers are indicated at the left.
[0085] FIG. 55 shows the mapping of regions in AtDPb required for
in vitro binding to AtE2Fb. HA-AtE2Fb was co-translated with a series
of c-myc-AtDPb mutants. An aliquot of each sample was analyzed
directly by SDS-PAGE and autoradiographed (panel A). Another
aliquot of the same samples was subjected to immunoprecipitation
with anti-HA (panel B) or anti-c-myc (panel C) monoclonal
antibodies. The c-myc-AtDPb mutants are marked by dots. Positions
of the HA-AtE2Fb proteins are indicated by arrows. Molecular mass
markers are indicated at the left.
[0086] FIG. 56 shows the mapping of regions in AtDPb required for
in vitro binding to AtE2Fb. HA-AtE2Fb was co-translated with
c-myc-AtDPb 182-263. Because of the small size of this protein, it
was hardly detectable when it was directly analyzed by SDS-PAGE
(data not shown). An aliquot of this sample was subjected to
immunoprecipitation with anti-c-myc monoclonal antibodies. The
c-myc-AtDP mutant is marked by dots. Position of the HA-AtE2Fb
protein is indicated by an arrow. Molecular mass markers are
indicated at the left.
[0087] FIG. 57 shows organ- and cell cycle-specific expression of
AtE2Fa and AtDPa. Tissue-specific expression of AtDPa and AtE2Fa
genes. cDNA prepared from the indicated tissues was subjected to
semi-quantitative RT-PCR analysis. The Arath;CDKB1;1 gene was
used as a marker for highly proliferating tissues. The actin 2 gene
(ACT2) was used as loading control.
[0088] FIG. 58 shows organ- and cell cycle-specific expression of
AtE2Fa and AtDPa. Co-regulated cell cycle phase-dependent
transcription of AtE2Fa and AtDPa. cDNA prepared from partially
synchronized Arabidopsis cells harvested at the indicated time
points after removal of the cell cycle blocker was subjected to
semi-quantitative RT-PCR analysis. Histone H4 and Arath;CDKB1;1
were used as markers for S and G2/M phase, respectively, and ROC5
and Arath;CDKA;1 as loading controls.
[0089] FIG. 59 is a photographic representation of Northern
blotting analysis of DPa expression in independent Arabidopsis
thaliana DPa overexpressing lines (lines 16-27 as indicated) and
one untransformed control line (indicated by C).
[0090] FIG. 60 describes the molecules defined in SEQ ID
NOs:199-204 and 240-290.
DETAILED DESCRIPTION OF THE INVENTION
[0091] The present invention is based, at least in part, on the
discovery of novel molecules, referred to herein as "cell cycle
proteins" or "CCP" nucleic acid and polypeptide molecules. The CCP
molecules of the present invention were identified based on their
ability, as determined using yeast two-hybrid assays (described in
detail in Example 1), to interact with proteins involved in the
cell cycle, such as plant cyclin dependent kinases (e.g., a
dominant negative form of CDC2b, CDC2bAt.N161), cyclin dependent
kinase subunits referred to herein as "CKS" (such as CKS1At), cyclin
dependent kinase inhibitors referred to herein as "CKI" (such as
CKI4), PHO80-like proteins referred to herein as "PLP", E2F, and
different domains of kinesin-like proteins referred to herein as
"KLPNT".
[0092] Because of their ability to interact with (e.g., bind to)
the cyclin dependent kinases, the CCP molecules of the present
invention may modulate, e.g., upregulate or downregulate, the
activity of plant CDKs, such as CDC2a or CDC2b, as well as CKSs, CKIs, PLPs
and KLPNTs. Furthermore, because of their ability to interact with
(e.g., bind to) the aforementioned proteins which are proteins
involved in cell cycle regulation, the CCP molecules of the present
invention may also play a role in or function in cell cycle
regulation, e.g., plant or animal cell cycle regulation.
[0093] As used herein, the term "cell cycle protein" includes a
polypeptide which is involved in controlling or regulating the cell
cycle, or part thereof, in a cell, tissue, organ or whole organism.
Cell cycle proteins may also be capable of binding to, regulating,
or being regulated by cyclin dependent kinases, such as plant
cyclin dependent kinases, e.g., CDC2a or CDC2b, or their subunits.
The term cell cycle protein also includes peptides, polypeptides,
fragments, variants, homologs, alleles or precursors (e.g.,
pre-proteins or pro-proteins) thereof.
[0094] As used herein, the term "cell cycle" includes the cyclic
biochemical and structural events associated with growth, division
and proliferation of cells, and in particular with the regulation
of the replication of DNA and mitosis. The cell cycle is divided
into periods called: G.sub.0, Gap.sub.1 (G.sub.1), DNA synthesis
(S), Gap.sub.2 (G.sub.2), and mitosis (M). Normally these four
phases occur sequentially; however, the cell cycle also includes
modified cycles wherein one or more phases are absent, resulting in
a modified cell cycle such as endomitosis, acytokinesis, polyploidy,
polyteny, and endoreduplication.
[0095] As used herein, the term "plant" includes reference to whole
plants, plant organs (e.g., leaves, stems, roots), plant tissue,
seeds, and plant cells and progeny thereof. Plant cell, as used
herein, includes, without limitation, seeds, e.g., seed suspension
cultures, embryos, meristematic regions, callus tissue, leaves,
roots, shoots, gametophytes, sporophytes, pollen, and microspores.
The class of plants which can be used in the methods of the
invention is generally as broad as the class of higher plants
amenable to transformation techniques, including both
monocotyledonous and dicotyledonous plants. Particularly preferred
plants are Arabidopsis thaliana, rice, wheat, maize, tomato,
alfalfa, oilseed rape, soybean, cotton, sunflower or canola. The
term plant also includes monocotyledonous (monocot) plants and
dicotyledonous (dicot) plants including a fodder or forage legume,
ornamental plant, food crop, tree, or shrub selected from the list
comprising Acacia spp., Acer spp., Actinidia spp., Aesculus spp.,
Agathis australis, Albizia amara, Alsophila tricolor, Andropogon
spp., Arachis spp, Areca catechu, Astelia fragrans, Astragalus
cicer, Baikiaea plurijuga, Betula spp., Brassica spp., Bruguiera
gymnorrhiza, Burkea africana, Butea frondosa, Cadaba farinosa,
Calliandra spp, Camellia sinensis, Canna indica, Capsicum spp.,
Cassia spp., Centroema pubescens, Chaenomeles spp., Cinnamomum
cassia, Coffea arabica, Colophospermum mopane, Coronillia varia,
Cotoneaster serotina, Crataegus spp., Cucumis spp., Cupressus spp.,
Cyathea dealbata, Cydonia oblonga, Cryptomeria japonica, Cymbopogon
spp., Cynthea dealbata, Cydonia oblonga, Dalbergia monetaria,
Davallia divaricata, Desmodium spp., Dicksonia squarosa,
Diheteropogon amplectens, Dioclea spp, Dolichos spp., Dorycnium
rectum, Echinochloa pyramidalis, Ehrartia spp., Eleusine coracana,
Eragrestis spp., Erythrina spp., Eucalyptus spp., Euclea schimperi,
Eulalia villosa, Fagopyrum spp., Feijoa sellowiana, Fragaria spp.,
Flemingia spp, Freycinetia banksii, Geranium thunbergii, Ginkgo
biloba, Glycine javanica, Gliricidia spp, Gossypium hirsutum,
Grevillea spp., Guibourtia coleosperma, Hedysarum spp., Hemarthia
altissima, Heteropogon contortus, Hordeum vulgare, Hyparrhenia
rufa, Hypericum erectum, Hyperthelia dissoluta, Indigo incarnata,
Iris spp., Leptarrhena pyrolifolia, Lespediza spp., Lettuca spp.,
Leucaena leucocephala, Loudetia simplex, Lotonus bainesii, Lotus
spp., Macrotyloma axillare, Malus spp., Manihot esculenta, Medicago
sativa, Metasequoia glyptostroboides, Musa sapientum, Nicotianum
spp., Onobrychis spp., Ornithopus spp., Oryza spp., Peltophorum
africanum, Pennisetum spp., Persea gratissima, Petunia spp.,
Phaseolus spp., Phoenix canariensis, Phormium cookianum, Photinia
spp., Picea glauca, Pinus spp., Pisum sativum, Podocarpus totara,
Pogonarthria fleckii, Pogonarthria squarrosa, Populus spp.,
Prosopis cineraria, Pseudotsuga menziesii, Pterolobium stellatum,
Pyrus communis, Quercus spp., Rhaphiolepsis umbellata,
Rhopalostylis sapida, Rhus natalensis, Ribes grossularia, Ribes
spp., Robinia pseudoacacia, Rosa spp., Rubus spp., Salix spp.,
Schyzachyrium sanguineum, Sciadopitys verticillata, Sequoia
sempervirens, Sequoiadendron giganteum, Sorghum bicolor, Spinacia
spp., Sporobolus fimbriatus, Stiburus alopecuroides, Stylosanthos
humilis, Tadehagi spp, Taxodium distichum, Themeda triandra,
Trifolium spp., Triticum spp., Tsuga heterophylla, Vaccinium spp.,
Vicia spp., Vitis vinifera, Watsonia pyramidata, Zantedeschia
aethiopica, Zea mays, amaranth, artichoke, asparagus, broccoli,
brussel sprout, cabbage, canola, carrot, cauliflower, celery,
collard greens, flax, kale, lentil, oilseed rape, okra, onion,
potato, rice, soybean, straw, sugarbeet, sugar cane, sunflower,
tomato, squash, and tea, amongst others, or the seeds of any plant
specifically named above or a tissue, cell or organ culture of any
of the above species.
[0096] The cell cycle proteins of the present invention are
involved in cell cycle regulation which is largely, but not
completely, similar in plants and animals. Accordingly, the nucleic
acid molecules and polypeptides of the invention, or derivatives
thereof, may be used to modulate the cell cycle in a plant or an
animal such as by modulating the activity or level or expression of
CCP, altering the rate of the cell cycle or phases of the cell
cycle, and entry into and out of the various cell cycle phases. In
plants, the molecules of the present invention may be used in
agriculture to, for example, improve the growth characteristics of
a plant such as growth rate or size of specific tissues or organs,
the architecture or morphology of the plant, increase crop yield,
improve tolerance to environmental stress conditions (such as
drought, salt, temperature, or nutrient deprivation), improve
tolerance to plant pathogens that abuse the cell cycle or as
targets to facilitate the identification of inhibitors or
activators of CCPs that may be useful as phytopharmaceuticals such
as herbicides or plant growth regulators.
[0097] As used herein, the term "cell cycle associated disorders"
includes a disorder, disease or condition which is caused or
characterized by a misregulation (e.g., downregulation or
upregulation), abuse, arrest, or modification of the cell cycle. In
plants cell cycle associated disorders include endomitosis,
acytokinesis, polyploidy, polyteny, and endoreduplication which may
be caused by external factors such as pathogens (nematodes,
viruses, fungi, or insects), chemicals, environmental stress (e.g.,
drought, temperature, nutrients, or UV) resulting in for instance
neoplastic tissue (e.g., galls, root knots) or inhibition of cell
division/proliferation (e.g., stunted growth). Cell cycle
associated disorders in animals include proliferative disorders or
differentiative disorders, such as cancer, e.g., melanoma, prostate
cancer, cervical cancer, breast cancer, colon cancer, or
sarcoma.
[0098] The present invention is based, at least in part, on the
discovery of novel molecules, referred to herein as CCP protein and
nucleic acid molecules, which comprise a family of molecules having
certain conserved structural and functional features. The term
"family" when referring to the protein and nucleic acid molecules
of the invention is intended to mean two or more proteins or
nucleic acid molecules having a common structural domain or motif
and having sufficient amino acid or nucleotide sequence homology as
defined herein. Such family members can be naturally or
non-naturally occurring and can be from either the same or
different species. For example, a family can contain a first
protein of plant, e.g. Arabidopsis, origin, as well as other,
distinct proteins of plant, e.g., Arabidopsis, origin or
alternatively, can contain homologues of other plants, e.g., rice,
or of non-plant origin. Members of a family may also have common
functional characteristics.
[0099] In one embodiment of the invention, a CCP protein of the
present invention is identified based on the presence of at least
one or more of the following domains:
A. Cyclin Destruction Box
[0100] As used herein, the term "Cyclin destruction box" includes a
domain of 9-10 amino acid residues in length which typically
contains the following consensus pattern:
TABLE-US-00001 (SEQ ID NO: 267)
R-X.sub.2-L-X.sub.2-[I/V]-X.sub.1-2-N,
wherein X can be any amino acid, X.sub.n is a stretch of n Xs,
X.sub.n-m is a stretch of n to m Xs, and wherein [I/V] means that
an Ile or Val residue can occur at that position. SEQ ID NO:267
depicts the minimal consensus sequence of the cyclin destruction
box and underlies the ubiquitin-mediated proteolytic destruction of
the cyclins bearing this motif (Yamano et al. (1998), EMBO J. 17:
5670-5678; Renaudin et al. (1998) in Plant Cell Division (Francis,
Dudits and Inze, eds.), Portland Press Research Monograph, Portland
Press Ltd. London (1998), pp 67-98).
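The consensus notation defined above can be read mechanically. The following is a minimal sketch, not part of the original disclosure, that converts a pattern written in this notation into a Python regular expression. The function name `consensus_to_regex` is hypothetical, and the sketch assumes that subscripts appear either inline (X2) or in the ".sub." form used in this text; patterns using other layouts, such as parenthesized ranges, are rejected rather than guessed at.

```python
import re

# One token of the consensus notation; alternatives are tried in this order.
_TOKEN = re.compile(
    r"\[(?P<set>[A-Z/-]+)\]"        # alternatives such as [I/V]; "-" marks an optional position
    r"|(?P<hu>\(Hu\))"              # hydrophobic uncharged residue (M, I, L, V)
    r"|X(?P<lo>\d+)-(?P<hi>\d+)"    # X with a subscript range, e.g. X1-2
    r"|X(?P<n>\d+)"                 # X with a single subscript, e.g. X2
    r"|(?P<any>X)"                  # one unspecified residue
    r"|(?P<res>[A-Z])"              # one fixed residue
    r"|(?P<sep>-)"                  # separator, carries no sequence information
)


def consensus_to_regex(pattern):
    """Translate a consensus pattern such as R-X2-L-X2-[I/V]-X1-2-N into a regex string."""
    pattern = pattern.replace(".sub.", "")  # accept the ".sub." subscript form
    parts = []
    pos = 0
    while pos < len(pattern):
        m = _TOKEN.match(pattern, pos)
        if m is None:
            raise ValueError(f"unrecognised notation at position {pos}: {pattern[pos:]!r}")
        if m.group("set"):
            options = m.group("set").split("/")
            residues = "".join(o for o in options if o != "-")
            optional = "?" if "-" in options else ""
            parts.append(f"[{residues}]{optional}")
        elif m.group("hu"):
            parts.append("[MILV]")
        elif m.group("lo"):
            lo, hi = m.group("lo"), m.group("hi")
            parts.append(".{" + lo + "," + hi + "}")
        elif m.group("n"):
            parts.append(".{" + m.group("n") + "}")
        elif m.group("any"):
            parts.append(".")
        elif m.group("res"):
            parts.append(m.group("res"))
        # separators are skipped
        pos = m.end()
    return "".join(parts)
```

For example, `re.search(consensus_to_regex("MRXIL[I/V]DW"), protein)` would locate a candidate cyclin box motif 1 in a one-letter protein string. A regular-expression hit only indicates that the consensus is present; it does not establish that the motif is functional.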
B. Cyclin Box Motif 1
[0101] As used herein, the term "Cyclin box motif 1" includes a
domain of 8 amino acid residues in length and which typically
contains the following consensus pattern:
TABLE-US-00002 (SEQ ID NO: 268) MRXIL[I/V]DW,
wherein X can be any amino acid and wherein [I/V] means that an Ile
or Val residue can occur at that position. This motif forms part of
the helix H1 of the first cyclin fold and is the best conserved
motif in the cyclinA/B family (Renaudin et al. (1998) in Plant Cell
Division (Francis, Dudits and Inze, eds.), Portland Press Research
Monograph, Portland Press Ltd. London (1998), pp 67-98).
C. Cyclin Box Motif 2
[0102] As used herein, the term "Cyclin box motif 2" includes a
domain of 8 amino acid residues in length and which typically
contains the following consensus pattern:
TABLE-US-00003 (SEQ ID NO: 269) KYEE-X.sub.3-P,
wherein X can be any amino acid and wherein X.sub.n is a stretch of
n Xs. This motif forms part of the helix H3 of the first cyclin
fold wherein the 2 acidic residues are part of the CDK binding site
(Renaudin et al. (1998) in Plant Cell Division (Francis, Dudits and
Inze, eds.), Portland Press Research Monograph, Portland Press Ltd.
London (1998), pp 67-98).
D. CDC2 Motifs
[0103] As used herein, the term "CDC2 motifs" includes domains of
about 9-12 amino acid residues in length and which typically
contain one of the following consensus patterns:
TABLE-US-00004
(SEQ ID NO: 270) GXG-X.sub.2-GXVY
(SEQ ID NO: 271) HRDXK-X.sub.2-NXL
(SEQ ID NO: 272) D-X.sub.1-2-[W/Y]SXG-X.sub.4-E
wherein X can be any amino acid, X.sub.n is a stretch of n Xs,
X.sub.n-m is a stretch of n to m Xs, and wherein [W/Y] means that
a Trp or Tyr residue can occur at that position.
E. CDK Phosphorylation Site
[0104] As used herein the term "CDK phosphorylation site" includes
a domain of about 5-7 amino acids in length and which contains one
or more of the following consensus patterns:
TABLE-US-00005
(SEQ ID NO: 273) TPX.sub.1-2[R/K]
(SEQ ID NO: 274) SPX[R/K]
(SEQ ID NO: 275) SPX(Hu)
(SEQ ID NO: 276) SP(Hu)X
with Hu being a hydrophobic uncharged amino acid (M, I, L, V) and X
any amino acid. The foregoing are typically found in
cyclin-dependent kinase substrates such as histone kinase,
transcription factors such as E2F or transcription regulators like
Rb. CDK phosphorylation sites are described in, for example,
Tamrakar et al. 2000, Frontiers Biosci 5, d121-137.
[0105] CCP proteins of the present invention comprising a CDK
phosphorylation site can be mutated in said CDK phosphorylation
site such that said CCP proteins are no longer able to be
phosphorylated on the CDK phosphorylation site. Mutations of a CDK
phosphorylation site include all mutations of the ser or thr
residue in any of SEQ ID NOs:273-276 into a non-phosphorylatable
amino acid residue, e.g., an ala or glu residue. Mutation of one or
more CDK phosphorylation site(s) in a CCP protein of the invention
is expected to modulate modifications of the CCP protein by CDKs
and, thus, to modulate the biological or biochemical function of
the CCP protein.
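The mutation strategy of the preceding paragraph can be sketched in a few lines. The example below is illustrative only: it renders the four consensus patterns of SEQ ID NOs:273-276 as regular expressions (an assumption about how the notation maps to residues) and replaces the Ser or Thr at the start of every matching site with a non-phosphorylatable residue. The function name `mutate_cdk_sites` is hypothetical.

```python
import re

# Regular-expression renderings of SEQ ID NOs:273-276 (Hu = M, I, L or V).
CDK_SITES = [r"TP.{1,2}[RK]", r"SP.[RK]", r"SP.[MILV]", r"SP[MILV]."]

# A lookahead finds every site start, including overlapping sites.
_SITE = re.compile("(?=(?:" + "|".join(CDK_SITES) + "))")


def mutate_cdk_sites(protein, replacement="A"):
    """Return (mutated sequence, 1-based positions of the replaced Ser/Thr residues)."""
    residues = list(protein)
    positions = []
    for m in _SITE.finditer(protein):
        residues[m.start()] = replacement  # the Ser or Thr that opens the site
        positions.append(m.start() + 1)
    return "".join(residues), positions
```

Passing `replacement="E"` gives the Glu substitution also mentioned in the text. Whether a predicted site is actually phosphorylated in vivo has to be established experimentally.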
F. Nuclear Localisation Signal (NLS)
[0106] As used herein the term "nuclear localization signal" or
"NLS" includes a domain conferring to a protein comprising the NLS
domain the ability to be imported into the nucleus and to, for
example, accumulate within the nucleus. NLS domains include one or
more of the following consensus patterns:
TABLE-US-00006
(SEQ ID NO: 277) PKKKRKV
(SEQ ID NO: 278) KRX.sub.10KKKK
(SEQ ID NO: 279) KRPRP
(SEQ ID NO: 280) PAAKRVKLD
[0107] NLS domains have been found in the SV40 T antigen, in
nucleoplasmin (bipartite NLS), in adenovirus E1A, and in c-Myc. NLS
domains are described in, for example, Laskey et al. (1998)
Biochem. Soc. Trans. 26, 561-567.
G. Cy-Like Boxes
[0108] As used herein, the term "Cy-like box" includes a domain of
3-6 amino acid residues in length which has the consensus motif
R-X-X-F (SEQ ID NO:281) with X being any amino acid and one of two
Xs preferably being a hydrophobic residue.
H. Rb Binding Domain
[0109] As used herein, the term "Rb binding domain" includes a
domain which when present in a protein confers to the protein the
ability to bind the Rb protein. Rb binding domains include one or
more of the following consensus patterns:
TABLE-US-00007
(SEQ ID NO: 282) LXCXE
(SEQ ID NO: 283) LXSXE
(SEQ ID NO: 284) DYX.sub.7EX.sub.3DLFD
(SEQ ID NO: 285) DYX.sub.6DX.sub.4DMWE
Rb binding domains have been found in D-cyclins, in protein
phosphatase 1, in human E2F-1, and in plant E2F. Rb binding domains
are described in, for example, Rubin et al. (1998) Frontiers Biosci
3, d1209-1219; Phelps et al. (1992) J. Virol. 66, 2418-2427, and
Cress et al. (1993) Mol. Cell. Biol. 13, 6314-6325.
I. DEF Domain
[0110] As used herein the term "DEF domain" includes a protein
domain which is required for the formation of heterodimers between
DP proteins and E2F proteins. DEF domains comprise the following
consensus pattern:
TABLE-US-00008 (SEQ ID NO: 286)
[D/N/-][Q/E]KNIR[R/G]RV[Y/D]DALNV[L/F]MA[M/I/L/-]
[N/D][V/I]I[S/A][K/R][D/E]KKEI[K/Q/R/-]W[R/K/I]GLP
J. DNA Binding Domain
[0111] As used herein the term "DNA binding domain" includes a
domain which is involved in the binding of DP proteins and/or
DP-E2F heterodimers to DNA. DNA binding domains include the
following consensus pattern:
TABLE-US-00009 (SEQ ID NO: 287)
[G/N][K/R]GLR[H/Q]FS[M/V][K/M][I/V]X.sub.(0-17)C[E/Q]K[V/L][Q/E/-][S/-]XK[G/K][R/I/-]TT[S/-]Y[N/K]EVADE[L/I][V/I][A/S][E/D]F
DNA binding domains are described in, for example, Hao et al.
(1995) J. Cell Sci. 108, 2945-2954; Bandara et al. (1993) EMBO J.
12, 4317-4324; and Girling et al. (1994) Mol. Biol. Cell 5,
1081-1092.
K. DCB1 Domain:
[0112] As used herein the term "DCB1 domain" includes a protein
domain which is conserved among DP proteins and has the following
consensus patterns:
TABLE-US-00010
(SEQ ID NO: 288) [R/S][I/V]X[Q/K]KX.sub.3[L/S]XE
(SEQ ID NO: 289) [R/S][I/V]X[Q/K]KX.sub.3[L/S]XE[L/M]X.sub.2-3[Q/H]X.sub.4-5NL[V/I/M][Q/E]RN
DCB1 domains are described in, for example, Hao et al. (1995) J.
Cell Sci. 108, 2945-2954; Bandara et al. (1993) EMBO J. 12,
4317-4324; and Girling et al. (1994) Mol. Biol. Cell 5,
1081-1092.
L. DCB2 Domain:
[0113] As used herein the term "DCB2 domain" includes a protein
domain which is conserved among DP proteins and has the following
consensus pattern:
TABLE-US-00011 (SEQ ID NO: 290)
[L/I]PFI[L/I][V/L]XTX.sub.3-4[T/V]VX.sub.12-14FX.sub.3-4F[E/S][Hu]HDDX.sub.2[V/I]L[R/K]XM
DCB2 domains are described in, for example, Hao et al. (1995) J.
Cell Sci. 108, 2945-2954; Bandara et al. (1993) EMBO J. 12,
4317-4324; and Girling et al. (1994) Mol. Biol. Cell 5,
1081-1092.
M. SAP Domain:
[0114] As used herein the term "SAP motif" includes a protein domain
of about 35 amino acid residues which is found in a variety of
nuclear proteins involved in transcription, DNA repair, DNA
processing or apoptotic chromatin degradation. It was named after
SAF-A/B, Acinus and PIAS, three proteins known to contain it. The
SAP motif reveals a bipartite distribution of strongly conserved
hydrophobic, polar and bulky amino acids separated by a region that
contains a glycine. The SAP domain has been proposed to be a
DNA-binding motif (Aravind and Koonin (2000) Trends Biochem. Sci.
25:112-114).
[0115] Isolated CCP proteins of the present invention have an amino
acid sequence sufficiently identical to the amino acid sequence of
SEQ ID NO:67-132, 205, 211, 215-216, or 220-227 or are encoded by a
nucleotide sequence sufficiently identical to SEQ ID NO:1-66 or
228-239. As used herein, the term "sufficiently identical" refers
to a first amino acid or nucleotide sequence which contains a
sufficient or minimum number of identical or equivalent (e.g., an
amino acid residue which has a similar side chain) amino acid
residues or nucleotides to a second amino acid or nucleotide
sequence such that the first and second amino acid or nucleotide
sequences share common structural domains or motifs and/or a common
functional activity. For example, amino acid or nucleotide
sequences which share common structural domains have at least 30%,
40%, or 50% homology, preferably 60% homology, more preferably
70%-80%, and even more preferably 90-95% homology across the amino
acid sequences of the domains and contain at least one and
preferably two structural domains or motifs, are defined herein as
sufficiently identical. Furthermore, amino acid or nucleotide
sequences which share at least 30%, 40%, or 50%, preferably 60%,
more preferably 70-80%, or 90-95% homology and share a common
functional activity are defined herein as sufficiently
identical.
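The percentage thresholds above presuppose a way of scoring two sequences against each other. The method actually used is defined elsewhere in the specification; the sketch below only illustrates the simplest case, percent identity over two sequences that have already been aligned (gaps written as "-"). The function name and the choice to count gap-containing columns in the denominator are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue in both sequences.

    Both inputs must come from the same alignment and therefore have equal
    length. Columns that are gaps in both sequences are ignored; columns with
    a gap in only one sequence count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

Under these assumptions, two eight-residue motifs differing at two positions score 75%, which would fall within the "more preferably 70%-80%" range recited above.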
[0116] As used interchangeably herein, a "CCP activity",
"biological activity of CCP" or "functional activity of CCP",
refers to an activity exerted by a CCP protein, polypeptide or
nucleic acid molecule on a CCP responsive cell or tissue, or on a
CCP protein substrate, as determined in vivo, or in vitro,
according to standard techniques. In one embodiment, a CCP activity
is a direct activity, such as an association with a CCP-target
molecule. As used herein, a "target molecule" or "binding partner"
is a molecule with which a CCP protein binds or interacts in
nature, such that CCP-mediated function is achieved. A CCP target
molecule can be a non-CCP molecule or a CCP protein or polypeptide
of the present invention, e.g., a plant cyclin dependent kinase,
such as CDC2b. In an exemplary embodiment, a CCP target molecule is
a CCP ligand. Alternatively, a CCP activity is an indirect
activity, such as a cellular signaling activity mediated by
interaction of the CCP protein with a CCP ligand. The biological
activities of CCP are described herein. For example, the CCP
proteins of the present invention can have one or more of the
following activities: (1) they may interact with a non-CCP protein
molecule, e.g., a CCP ligand; (2) they may modulate a CCP-dependent
signal transduction pathway; (3) they may modulate the activity of
a plant cyclin dependent kinase, such as CDC2a, CDC2b, or CDC2c;
and (4) they may modulate the cell cycle.
[0117] Accordingly, another embodiment of the invention features
isolated CCP proteins and polypeptides having a CCP activity.
Preferred proteins are CCP proteins having at least one or more of
the following domains: a "cyclin destruction box", a "cyclin box
motif 1", a "cyclin box motif 2", a "CDC2 motif", a "CDK
phosphorylation site", a "nuclear localization signal", a "Cy-like
box", an "Rb binding domain", a "DEF domain", a "DNA binding
domain", a "DCB1 domain", a "DCB2 domain" and/or a SAP domain, and,
preferably, a CCP activity.
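As an illustration of the "at least one of the following domains" criterion, the sketch below screens a protein sequence against a small panel of the shorter consensus patterns from sections A to H. The regular expressions are hand-written renderings of those patterns, the panel is deliberately incomplete, and the names `DOMAIN_PATTERNS` and `domains_present` are hypothetical. Very short motifs such as the Cy-like box will frequently match by chance, so a hit is only a starting point for further analysis.

```python
import re

# Hand-written regex renderings of some consensus patterns from sections A-H.
DOMAIN_PATTERNS = {
    "cyclin destruction box": r"R.{2}L.{2}[IV].{1,2}N",  # SEQ ID NO:267
    "cyclin box motif 1": r"MR.IL[IV]DW",                # SEQ ID NO:268
    "cyclin box motif 2": r"KYEE.{3}P",                  # SEQ ID NO:269
    "Cy-like box": r"R..F",                              # SEQ ID NO:281
    "Rb binding domain": r"L.[CS].E",                    # SEQ ID NOs:282-283
}


def domains_present(protein):
    """Return the names of all panel domains whose consensus occurs in the protein."""
    return [name for name, pattern in DOMAIN_PATTERNS.items()
            if re.search(pattern, protein)]


def has_any_domain(protein):
    """True if the protein carries at least one domain from the panel."""
    return bool(domains_present(protein))
```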
[0118] Additional preferred proteins have at least one or more of
the following domains: a "cyclin destruction box", a "cyclin box
motif 1", a "cyclin box motif 2", a "CDC2 motif", a "CDK
phosphorylation site", a "nuclear localization signal", a "Cy-like
box", an "Rb binding domain", a "DEF domain", a "DNA binding
domain", a "DCB1 domain", a "DCB2 domain" and/or a SAP domain and
are, preferably, encoded by a nucleic acid molecule having a
nucleotide sequence which hybridizes under stringent hybridization
conditions to a nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-66 or 228-239.
[0119] The sequences of the present invention are summarized below,
in Table I.
TABLE-US-00012 TABLE I
CCP Molecule | Clone Name | Bait | Homolog/function | Motif | SEQ ID NO: partial DNA | SEQ ID NO: full-length DNA | SEQ ID NO: partial Protein | SEQ ID NO: full-length Protein
CCP1 | CDC2bD N-IC19 | CDC2bAt. N161 | Novel CYCB2;3 | cyclin box motifs 1 and 2; cyclin destruction box | 1 | 39 | 67 | 105
CCP2 | CDC2bD N-IC20 | CDC2bAt. N161 | ARR2 | | 2 | 40 | 68 | 106
CCP3 | CDC2bD N-IC21 | CDC2bAt. N161 | novel A-type cyclin | cyclin box motifs 1 and 2; cyclin destruction box | 3 | 41 | 69 | 107
CCP4 | CDC2bD N-IC26M | CDC2bAt. N161 | | CDK phosphorylation site | 4 | 4 | 70 | 70
CCP5 | CDC2bD N-IC39 | CDC2bAt. N161 | Arath CYCB2;1 | cyclin box motifs 1 and 2; cyclin destruction box | 5 | 5 | 71 | 71
CCP6 | CDC2bD N-IC57 | CDC2bAt. N161 | | | 6 | 42 | 72 | 108
CCP7 | CDC2bD N-IC62 | CDC2bAt. N161 | AJH2-COP9 | | 7 | 43 | 73 | 109
CCP8 | E2F3ca55 | E2F3 N-terminal | | | 8 | 43 | 74 | 109
CCP9 | CDC2bD N-IC9 | CDC2bAt. N161 | Arath CYCA2;2 | cyclin box motifs 1 and 2; cyclin destruction box | 9 | 9 | 75 | 75
CCP10 | CKSBC001 | CKS1At | | | 10 | 10 | 76 | 76
CCP11 | CKSBC011 | CKS1At | gibberellin-regulated protein GASA1 precursor | | 11 | 44 | 77 | 110
CCP12 | CKSBC9 8-7 (Cterm) | CKS1At | | | 12 | 45 | 78 | 111
CCP13 | CKSBC9 8-7 (Nterm) | CKS1At | | | 13 | 45 | 79 | 111
CCP14 | CKSBC1 03-19 (Cterm) | CKS1At | | | 14 | 46 | 80 | 112
CCP15 | CKSBC1 99-20 | CKS1At | PSTTLRE-type CDK | CDC2 motifs | 15 | 47 | 81 | 113
CCP16 | E2F5BB C1 | E2F5 dimerisation domain | DPa | DNA-binding domain; DEF domain; DCB1 and DCB2 domain | 16 | 48 | 82 | 114
CCP17 | FL67BC4-2 | CKI4 | | | 17 | 17 | 83 | 83
CCP18 | FL67BC12-17 | CKI4 | RNA polymerase B transcription factor 3 | | 18 | 49 | 84 | 115
CCP19 | JUT1 | PLP1 | | | 19 | 19 | 85 | 85
CCP20 | JUT2 | PLP1 | | | 20 | 50 | 86 | 116
CCP21 | JUT3 | PLP1 | | | 21 | 50 | 87 | 116
CCP22 | JUT6 | PLP1 | Submergence induced protein2 or Oryza sativa | | 22 | 51 | 88 | 117
CCP23 | kbp1 | KLPNT1 36-508aa (motor domain); KLPNT2 (TH65) 73-186 aa (neck domain) | HSF1 | | 23 | 52 | 89 | 118
CCP24 | kbp3 | KLPNT1 (427-867aa) stalk domain | | | 24 | 53 | 90 | 119
CCP25 | kbp6 | KLPNT2 (TH65) 73-186 aa neck domain | | | 25 | 54 | 91 | 120
CCP26 | kbp9 | KLPNT2 (TH65) 73-186 aa neck domain | AtKLPNT1 | | 26 | 55 | 92 | 121
CCP27 | kbp11 | KLPNT2 (TH65) 73-186 aa neck domain | | | 27 | 56 | 93 | 122
CCP28 | kbp12 | KLPNT2 (TH65) 73-186 aa neck domain | | | 28 | 57 | 94 | 123
CCP29 | kbp13 | KLPNT2 (TH65) 73-186 aa neck domain | | | 29 | 29 | 95 | 95
CCP30 | kbp15 | KLPNT2 (TH65) 73-186 aa neck domain | Centromere/microtubule binding protein CBF5 from yeast | | 30 | 58 | 96 | 124
CCP31 | kbp20 | KLPNT2 (TH65) 73-608 aa stalk domain | VU91C calmodulin from yeast | | 31 | 59 | 97 | 125
CCP32 | E2F5BB C16 | E2F5 dimerization | | | 32 | 60 | 98 | 126
CCP33 | DPb | / | | DNA-binding domain; DEF domain; DCB1 and DCB2 domain | 33 | 61 | 99 | 127
CCP34 | E2F3ca1 | E2F3 N-terminal | | | 34 | 62 | 100 | 128
CCP35 | E2F3ca2 | E2F3 N-terminal | | | 35 | 63 | 101 | 129
CCP36 | E2F3ca9 | E2F3 N-terminal | | | 36 | 64 | 102 | 130
CCP37 | E2F3ca12 | E2F3 N-terminal | | SAP domain | 37 | 65 | 103 | 131
CCP38 | E2F3ca50 | E2F3 N-terminal | | | 38 | 66 | 104 | 132
[0120] Detailed studies of interactions between AtDPs (a and b
forms, SEQ ID NO:114 and SEQ ID NO:127, respectively) and AtE2Fs (a
and b forms; GenBank accession numbers AJ294534 and AJ294533,
respectively) revealed that the regions of AtDPa and AtDPb involved
in the binding of AtE2Fb are different.
[0121] Binding of AtDPa to AtE2Fb requires at least the AtDPa
dimerization domain and the whole (or possibly part of) the
C-terminal domain of AtDPa. The N-terminal domain and the
DNA-binding domain of AtDPa do not seem to contribute to the
interaction of AtDPa with AtE2Fb (Examples 11, 12, Table 5, FIG.
54).
[0122] Binding of AtDPb to AtE2Fb, however, only requires an intact
AtDPb dimerization domain. Neither the region including the
N-terminal and DNA-binding domains of AtDPb, nor the C-terminal
region of AtDPb seem to contribute to the interaction of AtDPb with
AtE2Fb (Examples 11, 12, Table 5, FIG. 55). These observations
indicate that modulating the formation of specific E2F/DP-complexes
may be useful in modulating cell cycle traversal and the regulation
thereof.
[0123] Neither AtDPa nor AtDPb forms homodimers, but
both interact with either AtE2Fa or AtE2Fb (Example 12, Table 5).
In reciprocal experiments it was shown that the N-terminal domain
of AtE2Fa is not required for binding AtDPa or AtDPb. Likewise, the
Rb-binding domains of AtE2Fa and AtE2Fb do not seem
to contribute to the binding to either AtDPa or AtDPb. The region
of AtE2Fa encompassing the dimerization domain and the marked box
is sufficient for binding to AtDPa and AtDPb (Examples 11, 12, FIG.
50, FIG. 51, Table 5). The dimerization domain of AtE2Fs appears to
be sufficient for binding to AtDPs.
[0124] Accordingly, it is shown herein for the first time (for
plant DPs and plant E2Fs) that the minimal DP and E2F proteins or
corresponding coding DNA sequences that can be used in modifying
E2F/DP-related processes, e.g., regulation of gene expression by
E2F/DP, include:
[0125] (A) Plant DP dimerization domain with or without (part of)
the C-terminal DP domain. These domains include the proteins
AtDPa143-292 and AtDPa143-213 (numbering indicates the amino acids
included in said fragment relative to the full-length AtDPa
protein) set forth in SEQ ID NO:221 and SEQ ID NO:222,
respectively. The coding sequences corresponding to the foregoing
amino acid sequences are set forth in SEQ ID NO:232 and SEQ ID
NO:233, respectively. Also included are the corresponding regions
of the AtDPb protein characterized by AtDPb182-385 and
AtDPb182-263 (parts of the full-length AtDPb protein). The foregoing
regions of AtDPb are set forth in SEQ ID NO:216 and SEQ ID NO:215,
respectively, and the coding sequences corresponding thereto are
set forth in SEQ ID NO:231 and SEQ ID NO:230, respectively. The
AtDPb1-263 domain (SEQ ID NO:223) and the corresponding AtDPa1-214
domain (SEQ ID NO:220) encoded by the nucleic acid sequences SEQ ID
NO:234 and SEQ ID NO:239, respectively, can also be used. Further
included are nucleic acid sequences hybridizing to SEQ ID
NOs:229-234 or SEQ ID NO:239 or encoding a protein at least 50%,
55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more identical
to SEQ ID NOs:211, 215-216 and 220-223.
[0126] (B) Plant E2F dimerization domain with or without (part of)
the marked box. These domains include the proteins AtE2Fa232-282,
AtE2Fa232-352 and AtE2Fa226-356 set forth in SEQ ID NO:224, SEQ ID
NO:225 and SEQ ID NO:205, respectively. The corresponding coding
DNA sequences are set forth in SEQ ID NO:235, SEQ ID NO:236 and SEQ
ID NO:228, respectively. Also included are the corresponding
regions of the AtE2Fb protein characterized by AtE2Fb194-243 and
AtE2Fb194-311 set forth in SEQ ID NO:226 and SEQ ID NO:227,
respectively. The corresponding coding DNA sequences are set forth
in SEQ ID NO:237 and SEQ ID NO:238, respectively. Further included
are nucleic acid sequences hybridizing to SEQ ID NO:228 or SEQ ID
NOs:235-238 or encoding a protein at least 70%, 75%, 80%, 85%,
90%, 95%, 98% identical to SEQ ID NO:205 or SEQ ID NOs:224-227.
[0127] (C) Full-length plant DP and plant E2F proteins or
corresponding DNA sequences may also be used to modify said
E2F/DP-related processes. Furthermore, plant DP and plant E2F
proteins or corresponding DNA sequences, or parts thereof, can be
used either separately or in combination to modify said
E2F/DP-related processes. This is underscored by the demonstration
that AtDPs and AtE2Fs are co-expressed in actively dividing cells
and in at least some plant tissues (Example 13 and FIGS. 57 and
58).
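The fragment names used in (A) and (B), such as AtDPa143-292, encode the first and last amino acid of the fragment relative to the full-length protein. A minimal sketch of that convention follows; the helper names are hypothetical, and the sketch assumes 1-based, inclusive numbering, which is the usual reading of such names.

```python
import re

# Protein name (ending in a letter), then first and last residue numbers.
_FRAGMENT = re.compile(r"^(?P<protein>.*?[A-Za-z])(?P<start>\d+)-(?P<end>\d+)$")


def parse_fragment_name(name):
    """Split a name like 'AtE2Fa232-282' into ('AtE2Fa', 232, 282)."""
    m = _FRAGMENT.match(name)
    if m is None:
        raise ValueError(f"not a fragment name: {name!r}")
    return m.group("protein"), int(m.group("start")), int(m.group("end"))


def extract_fragment(full_length, start, end):
    """Return residues start..end (1-based, inclusive) of a full-length sequence."""
    if not 1 <= start <= end <= len(full_length):
        raise ValueError("fragment coordinates fall outside the sequence")
    return full_length[start - 1:end]
```

Under this reading, AtDPa143-292 is 150 residues long and AtE2Fa232-282 is 51 residues long.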
[0128] Various aspects of the invention are described in further
detail in the following subsections:
I. Isolated Nucleic Acid Molecules
[0129] One aspect of the invention pertains to isolated nucleic
acid molecules that encode CCP proteins or biologically active
portions thereof, as well as nucleic acid fragments sufficient for
use as hybridization probes to identify CCP-encoding nucleic acids
(e.g., CCP mRNA) and fragments for use as PCR primers for the
amplification or mutation of CCP nucleic acid molecules. As used
herein, the term "nucleic acid molecule" is intended to include DNA
molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g.,
mRNA) and analogs of the DNA or RNA generated using nucleotide
analogs. The nucleic acid molecule can be single-stranded or
double-stranded, but preferably is double-stranded DNA.
[0130] An "isolated" nucleic acid molecule is one which is
separated from other nucleic acid molecules which are present in
the natural source of the nucleic acid. For example, with regard
to genomic DNA, the term "isolated" includes nucleic acid molecules
which are separated from the chromosome with which the genomic DNA
is naturally associated. Preferably, an "isolated" nucleic acid is
free of sequences which naturally flank the nucleic acid (i.e.,
sequences located at the 5' and 3' ends of the nucleic acid) in the
genomic DNA of the organism from which the nucleic acid is derived.
For example, in various embodiments, the isolated CCP nucleic acid
molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb,
0.5 kb or 0.1 kb of nucleotide sequences which naturally flank the
nucleic acid molecule in genomic DNA of the cell from which the
nucleic acid is derived. Moreover, an "isolated" nucleic acid
molecule, such as a cDNA molecule, can be substantially free of
other cellular material, or culture medium when produced by
recombinant techniques, or substantially free of chemical
precursors or other chemicals when chemically synthesized.
[0131] A nucleic acid molecule of the present invention, e.g., a
nucleic acid molecule having the nucleotide sequence of SEQ ID
NO:1-66 or 228-239, or a portion thereof, can be isolated using
standard molecular biology techniques and the sequence information
provided herein. For example, using all or a portion of the nucleic
acid sequence of SEQ ID NO:1-66 or 228-239, as a hybridization
probe, CCP nucleic acid molecules can be isolated using standard
hybridization and cloning techniques (e.g., as described in
Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.,
1989).
[0132] Moreover, a nucleic acid molecule encompassing all or a
portion of SEQ ID NO:1-66 or 228-239 can be isolated by the
polymerase chain reaction (PCR) using synthetic oligonucleotide
primers designed based upon the sequence of SEQ ID NO:1-66 or
228-239, respectively.
[0133] A nucleic acid of the invention can be amplified using cDNA,
mRNA or alternatively, genomic DNA, as a template and appropriate
oligonucleotide primers according to standard PCR amplification
techniques. The nucleic acid so amplified can be cloned into an
appropriate vector and characterized by DNA sequence analysis.
[0134] Furthermore, oligonucleotides corresponding to CCP
nucleotide sequences can be prepared by standard synthetic
techniques, e.g., using an automated DNA synthesizer.
[0135] In a preferred embodiment, an isolated nucleic acid molecule
of the invention comprises the nucleotide sequence shown in SEQ ID
NO:1-66 or 228-239.
[0136] In another preferred embodiment, an isolated nucleic acid
molecule of the invention comprises a nucleic acid molecule which
is a complement of the nucleotide sequence shown in SEQ ID NO:1-66
or 228-239, or a portion of any of these nucleotide sequences. A
nucleic acid molecule which is complementary to the nucleotide
sequence shown in SEQ ID NO:1-66 or 228-239, is one which is
sufficiently complementary to the nucleotide sequence shown in SEQ
ID NO:1-66 or 228-239, respectively, such that it can hybridize to
the nucleotide sequence shown in SEQ ID NO:1-66 or 228-239,
respectively, thereby forming a stable duplex.
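The complement relationship can be made concrete with a short sketch. The code below computes the reverse complement of a DNA strand and tests for perfect complementarity; the function names are hypothetical. Perfect complementarity is a stricter condition than the "sufficiently complementary" standard of the text, which tolerates mismatches as long as a stable duplex still forms.

```python
# Watson-Crick pairing for DNA, preserving letter case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def is_perfect_complement(strand_a, strand_b):
    """True if strand_b pairs with strand_a at every position (case-insensitive)."""
    return reverse_complement(strand_a).upper() == strand_b.upper()
```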
[0137] In still another preferred embodiment, an isolated nucleic
acid molecule of the present invention comprises a nucleotide
sequence which is at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%,
85%, 90%, 95%, 98% or more homologous to the nucleotide sequence
(e.g., to the entire length of the nucleotide sequence) shown in
SEQ ID NO:1-66 or 228-239, or a portion of any of these nucleotide
sequences.
[0138] Moreover, the nucleic acid molecule of the invention can
comprise only a portion of the nucleic acid sequence of SEQ ID
NO:1-66 or 228-239, for example a fragment which can be used as a
probe or primer or a fragment encoding a biologically active
portion of a CCP protein. The nucleotide sequence determined from
the cloning of the CCP gene allows for the generation of probes and
primers designed for use in identifying and/or cloning other CCP
family members, as well as CCP homologues from other species. The
probe/primer typically comprises a substantially purified
oligonucleotide. The oligonucleotide typically comprises a region
of nucleotide sequence that hybridizes under stringent conditions
to at least about 12 or 15, preferably about 20 or 25, more
preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive
nucleotides of a sense sequence of SEQ ID NO:1-66 or 228-239, or of
a naturally occurring allelic variant or mutant of SEQ ID NO:1-66
or 228-239. In an exemplary embodiment, a nucleic acid molecule of
the present invention comprises a nucleotide sequence which is at
least 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650,
700, 750, or 800 nucleotides in length and hybridizes under
stringent hybridization conditions to a nucleic acid molecule of
SEQ ID NO:1-66 or 228-239.
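The stretches of consecutive nucleotides recited above can be enumerated directly from a sequence. The following sketch lists every window of a chosen length as a candidate probe or primer region; the function name is hypothetical. It does not model hybridization stringency, melting temperature or specificity, all of which would have to be assessed separately.

```python
def candidate_probes(sequence, length):
    """Return (1-based start, subsequence) for every window of the given length."""
    if length < 1 or length > len(sequence):
        raise ValueError("window length must be between 1 and the sequence length")
    return [(i + 1, sequence[i:i + length])
            for i in range(len(sequence) - length + 1)]
```

For a sequence of N nucleotides and a window of k nucleotides this yields N - k + 1 candidates, so a 100-nucleotide fragment contains 89 distinct 12-nucleotide windows.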
[0139] Probes based on the CCP nucleotide sequences can be used to
detect transcripts or genomic sequences encoding the same or
homologous proteins. In preferred embodiments, the probe further
comprises a label group attached thereto, e.g., the label group can
be a radioisotope, a fluorescent compound, an enzyme, or an enzyme
co-factor. Such probes can be used as a part of a diagnostic test
kit for identifying cells or tissues which misexpress a CCP
protein, such as by measuring a level of a CCP-encoding nucleic
acid in a sample of cells from a subject, e.g., by detecting CCP mRNA
levels or determining whether a genomic CCP gene has been mutated
or deleted.
[0140] A nucleic acid fragment encoding a "biologically active
portion of a CCP protein" can be prepared by isolating a portion of
the nucleotide sequence of SEQ ID NO:1-66 or 228-239, which encodes
a polypeptide having a CCP biological activity (the biological
activities of the CCP proteins are described herein), expressing
the encoded portion of the CCP protein (e.g., by recombinant
expression in vitro) and assessing the activity of the encoded
portion of the CCP protein.
[0141] The invention further encompasses nucleic acid molecules
that differ from the nucleotide sequence shown in SEQ ID NO:1-66 or
228-239, due to the degeneracy of the genetic code and, thus,
encode the same CCP proteins as those encoded by the nucleotide
sequence shown in SEQ ID NO:1-66 or 228-239. In another embodiment,
an isolated nucleic acid molecule of the invention has a nucleotide
sequence encoding a CCP protein.
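The degeneracy described above can be illustrated with a minimal sketch. It assumes the standard genetic code and uses two short, purely hypothetical coding sequences (they are not taken from SEQ ID NO:1-66 or 228-239); the function names are illustrative only.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding-strand DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def synonymous_codons(amino_acid):
    """Return all codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Two hypothetical sequences that differ at the nucleotide level
# but encode the same peptide because of codon degeneracy.
variant_a = "ATGGGCAAA"
variant_b = "ATGGGAAAG"
same_protein = translate(variant_a) == translate(variant_b)
```

Here `variant_a` and `variant_b` differ at two nucleotide positions yet translate to the same three-residue peptide, which is the situation the paragraph above is intended to cover.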
[0142] In addition to the CCP nucleotide sequences shown in SEQ ID
NO:1-66 or 228-239, it will be appreciated by those skilled in the
art that DNA sequence polymorphisms that lead to changes in the
amino acid sequences of the CCP proteins may exist within a
population (e.g., an Arabidopsis or rice plant population). Such
genetic polymorphism in the CCP genes may exist among individuals
within a population due to natural allelic variation. As used
herein, the terms "gene" and "recombinant gene" refer to nucleic
acid molecules which include an open reading frame encoding a CCP
protein, preferably a plant CCP protein, and can further include
non-coding regulatory sequences, and introns. Such natural allelic
variations include both functional and non-functional CCP proteins
and can typically result in 1-5% variance in the nucleotide
sequence of a CCP gene. Any and all such nucleotide variations and
resulting amino acid polymorphisms in CCP genes that are the result
of natural allelic variation and that do not alter the functional
activity of a CCP protein are intended to be within the scope of
the invention. Differences in preferred codon usage are illustrated
below for Agrobacterium tumefaciens (a bacterium), Arabidopsis
thaliana, Medicago sativa (two dicotyledonous plants) and Oryza
sativa (a monocotyledonous plant). These examples were extracted
from http://www.kazusa.or.jp/codon. For example, the codon GGC (for
glycine) is the most frequently used codon in A. tumefaciens
(36.2%), is the second most frequently used codon in O. sativa but
is used at much lower frequencies in A. thaliana and M. sativa (9%
and 8.4%, respectively). Of the four possible codons encoding
glycine the GGC codon is most preferably used in A. tumefaciens and
O. sativa. However, in A. thaliana the GGA (and GGU) codon is most
preferably used, whereas in M. sativa the GGU (and GGA) codon is
most preferably used.
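As a rough illustration of how such codon preferences can be tabulated, the following sketch counts the glycine codons in a coding sequence and reports each codon's share. The input sequence is a hypothetical toy example, not data from the codon usage database cited above, and the DNA alphabet is assumed (GGT in the code corresponds to GGU in the text).

```python
from collections import Counter

# The four glycine codons, written in the DNA alphabet.
GLYCINE_CODONS = ("GGT", "GGC", "GGA", "GGG")


def glycine_codon_fractions(cds):
    """Return the fraction of glycine codons contributed by each synonymous codon.

    `cds` is read in frame from its first nucleotide; trailing bases that
    do not complete a codon are ignored.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if c in GLYCINE_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in GLYCINE_CODONS}
    return {c: counts[c] / total for c in GLYCINE_CODONS}


def preferred_glycine_codon(cds):
    """Return the most frequently used glycine codon in `cds`."""
    fractions = glycine_codon_fractions(cds)
    return max(fractions, key=fractions.get)


# Hypothetical toy coding sequence: ATG GGC GGC GGA GGT TAA
toy_cds = "ATGGGCGGCGGAGGTTAA"
fractions = glycine_codon_fractions(toy_cds)
```

A real comparison between organisms would apply the same counting to a large set of coding sequences per species rather than to a single short sequence.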
[0143] Moreover, nucleic acid molecules encoding other CCP family
members and, thus, which have a nucleotide sequence which differs
from the CCP sequences of SEQ ID NO:1-66 or 228-239 are intended to
be within the scope of the invention. For example, another CCP cDNA
can be identified based on the nucleotide sequence of the plant CCP
molecules described herein. Moreover, nucleic acid molecules
encoding CCP proteins from different species, and thus which have a
nucleotide sequence which differs from the CCP sequences of SEQ ID
NO:1-66 or 228-239 are intended to be within the scope of the
invention. For example, a human CCP cDNA can be identified based on
the nucleotide sequence of a plant CCP.
[0144] Nucleic acid molecules corresponding to natural allelic
variants and homologues of the CCP cDNAs of the invention can be
isolated based on their homology to the CCP nucleic acids disclosed
herein using the cDNAs disclosed herein, or a portion thereof, as a
hybridization probe according to standard hybridization techniques
under stringent hybridization conditions.
[0145] Accordingly, in another embodiment, an isolated nucleic acid
molecule of the invention is at least 15, 20, 25, 30 or more
nucleotides in length and hybridizes under stringent conditions to
the nucleic acid molecule comprising the nucleotide sequence of SEQ
ID NO:1-66 or 228-239. In other embodiments, the nucleic acid is at
least 30, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, or
600 nucleotides in length. As used herein, the term "hybridizes
under stringent conditions" is intended to describe conditions for
hybridization and washing under which nucleotide sequences at least
30%, 40%, 50%, or 60% homologous to each other typically remain
hybridized to each other. Preferably, the conditions are such that
sequences at least about 70%, more preferably at least about 80%,
even more preferably at least about 85% or 90% homologous to each
other typically remain hybridized to each other. Such stringent
conditions are known to those skilled in the art and can be found
in Current Protocols in Molecular Biology, John Wiley & Sons,
N.Y. (1989), 6.3.1-6.3.6. A preferred, non-limiting example of
stringent hybridization conditions is hybridization in 6×
sodium chloride/sodium citrate (SSC) at about 45° C.,
followed by one or more washes in 0.2×SSC, 0.1% SDS at about
45° C., followed by one or more washes in 0.2×SSC,
0.1% SDS at 50° C., preferably at 55° C., more
preferably at 60° C., and even more preferably at 65° C.
Ranges intermediate to the above-recited values, e.g., at
60-65° C. or at 55-60° C., are also intended to be
encompassed by the present invention. Preferably, an isolated
nucleic acid molecule of the invention that hybridizes under
stringent conditions to the sequence of SEQ ID NO:1-66 or 228-239
corresponds to a naturally-occurring nucleic acid molecule. As used
herein, a "naturally-occurring" nucleic acid molecule refers to an
RNA or DNA molecule having a nucleotide sequence that occurs in
nature (e.g., encodes a natural protein).
[0146] In addition to naturally-occurring allelic variants of the
CCP sequences that may exist in nature, the skilled artisan will
further appreciate that changes can be introduced by mutation into
the nucleotide sequences of SEQ ID NO:1-66 or 228-239, thereby
leading to changes in the amino acid sequence of the encoded CCP
proteins, without altering the functional ability of the CCP
proteins. For example, nucleotide substitutions leading to amino
acid substitutions at "non-essential" amino acid residues can be
made in the sequence of a CCP protein. A "non-essential" amino acid
residue is a residue that can be altered from the wild-type
sequence of CCP without altering the biological activity, whereas
an "essential" amino acid residue is required for biological
activity. For example, amino acid residues that are conserved among
the CCP proteins of the present invention, are predicted to be
particularly unamenable to alteration. Furthermore, additional
amino acid residues that are conserved between the CCP proteins of
the present invention and other CCP family members are not likely
to be amenable to alteration.
[0147] Accordingly, another aspect of the invention pertains to
nucleic acid molecules encoding CCP proteins that contain changes
in amino acid residues that are not essential for activity.
[0148] An isolated nucleic acid molecule encoding a CCP protein
homologous to the CCP proteins of the present invention can be
created by introducing one or more nucleotide substitutions,
additions or deletions into the nucleotide sequence of SEQ ID
NO:1-66 or 228-239, such that one or more amino acid substitutions,
additions or deletions are introduced into the encoded protein.
Mutations can be introduced into SEQ ID NO:1-66 or 228-239 by
standard techniques, such as site-directed mutagenesis and
PCR-mediated mutagenesis. Preferably, conservative amino acid
substitutions are made at one or more predicted non-essential amino
acid residues. A "conservative amino acid substitution" is one in
which the amino acid residue is replaced with an amino acid residue
having a similar side chain. Families of amino acid residues having
similar side chains have been defined in the art. These families
include amino acids with basic side chains (e.g., lysine, arginine,
histidine), acidic side chains (e.g., aspartic acid, glutamic
acid), uncharged polar side chains (e.g., glycine, asparagine,
glutamine, serine, threonine, tyrosine, cysteine), nonpolar side
chains (e.g., alanine, valine, leucine, isoleucine, proline,
phenylalanine, methionine, tryptophan), beta-branched side chains
(e.g., threonine, valine, isoleucine) and aromatic side chains
(e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a
predicted nonessential amino acid residue in a CCP protein is
preferably replaced with another amino acid residue from the same
side chain family. Alternatively, in another embodiment, mutations
can be introduced randomly along all or part of a CCP coding
sequence, such as by saturation mutagenesis, and the resultant
mutants can be screened for CCP biological activity to identify
mutants that retain activity. Following mutagenesis of SEQ ID
NO:1-66 or 228-239, the encoded protein can be expressed
recombinantly and the activity of the protein can be determined.
Another alternative embodiment comprises targeted in vivo gene
correction or modification which can be achieved by chimeric
RNA/DNA oligonucleotides (e.g., Yoon et al. (1996), Proc. Natl.
Acad. Sci. USA 93, 2071-2076; Arntzen et al. (1999)
WO99/07865).
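The side chain families listed above can be encoded directly, as in the following minimal sketch. The one-letter codes and the function name are illustrative assumptions; the family membership follows the list in the preceding paragraph, including the overlaps (e.g., threonine is both uncharged polar and beta-branched, histidine is both basic and aromatic).

```python
# Side chain families as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(residue_a, residue_b):
    """Return the names of all families that contain both residues."""
    a, b = residue_a.upper(), residue_b.upper()
    return sorted(
        name for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(residue_a, residue_b):
    """True if the substitution stays within at least one side chain family."""
    return bool(shared_families(residue_a, residue_b))
```

Under this grouping a lysine-to-arginine change is conservative, whereas an aspartic acid-to-lysine change is not. Whether a conservative change is actually tolerated at a given position still has to be determined by assaying the mutant protein.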
[0149] In a preferred embodiment, a mutant CCP protein can be
assayed for the ability to: (1) regulate transmission of signals
from cellular receptors, e.g. hormone receptors; (2) control cell
cycle checkpoints, e.g. entry of cells into mitosis; (3) modulate
the cell cycle; (4) modulate cell death, e.g., apoptosis; (5)
modulate cytoskeleton function, e.g. actin bundling; (6)
phosphorylate a substrate; (7) create dominant negative or dominant
positive effects in transgenic plants; (8) interact with other cell
cycle control proteins in, e.g. a yeast two hybrid assay; (9)
modulate CDK activity (e.g., cyclin-CDK activity); (10) regulate
cyclin-CDK complex assembly; (11) regulate the commitment of cells
to divide, e.g., by integrating mitogenic and antimitogenic
signals; (12) regulate cell cycle progression; (13) regulate DNA
replication and/or DNA repair; (14) modulate gene transcription,
e.g., regulate E2F/DP-dependent transcription of genes; (15)
regulate cyclin degradation; (16) modulate cell cycle withdrawal
and/or cell differentiation; (17) control organ (e.g., plant organ)
and/or organism (e.g., plant organism) size; (18) control organ
(e.g., plant organ) and/or organism (e.g., plant organism) growth
or growth rate; and (19) regulate endoreduplication.
[0150] In addition to the nucleic acid molecules encoding CCP
proteins described above, another aspect of the invention pertains
to isolated nucleic acid molecules which are antisense thereto. An
"antisense" nucleic acid comprises a nucleotide sequence which is
complementary to a "sense" nucleic acid encoding a protein, e.g.,
complementary to the coding strand of a double-stranded cDNA
molecule or complementary to an mRNA sequence. Accordingly, an
antisense nucleic acid can hydrogen bond to a sense nucleic acid.
The antisense nucleic acid can be complementary to an entire CCP
coding strand, or only to a portion thereof. In one embodiment, an
antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence encoding CCP. The
term "coding region" refers to the region of the nucleotide
sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand
of a nucleotide sequence encoding CCP. The term "noncoding region"
refers to 5' and 3' sequences which flank the coding region that
are not translated into amino acids (i.e., also referred to as 5'
and 3' untranslated regions).
[0151] Given the coding strand sequences encoding CCP disclosed
herein, antisense nucleic acids of the invention can be designed
according to the rules of Watson and Crick base pairing. The
antisense nucleic acid molecule can be complementary to the entire
coding region of CCP mRNA, but more preferably is an
oligonucleotide which is antisense to only a portion of the coding
or noncoding region of CCP mRNA. For example, the antisense
oligonucleotide can be complementary to the region surrounding the
translation start site of CCP mRNA. An antisense oligonucleotide
can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50
nucleotides in length. An antisense nucleic acid of the invention
can be constructed using chemical synthesis and enzymatic ligation
reactions using procedures known in the art. For example, an
antisense nucleic acid (e.g., an antisense oligonucleotide) can be
chemically synthesized using naturally occurring nucleotides or
variously modified nucleotides designed to increase the biological
stability of the molecules or to increase the physical stability of
the duplex formed between the antisense and sense nucleic acids,
e.g., phosphorothioate derivatives and acridine substituted
nucleotides can be used. Examples of modified nucleotides which can
be used to generate the antisense nucleic acid include
5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine,
5-(carboxyhydroxylmethyl)uracil,
5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine,
5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil,
5-methoxyuracil, 2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine,
2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxyacetic acid methylester,
uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl)uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be
produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e.,
RNA transcribed from the inserted nucleic acid will be of an
antisense orientation to a target nucleic acid of interest,
described further in the following subsection). Preferably,
production of antisense nucleic acids in plants occurs by means of
a stably integrated transgene comprising a promoter operative in
plants, an antisense oligonucleotide, and a terminator.
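Designing an antisense sequence by Watson and Crick base pairing amounts to taking the reverse complement of the targeted region of the coding strand. The sketch below is illustrative only: it assumes a DNA-alphabet coding strand, treats the first ATG in the input as the translation start site (an assumption that need not hold for a real transcript), and uses hypothetical function names and a hypothetical `flank` parameter.

```python
# Complement map for the DNA alphabet.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_around_start(coding_strand, flank=10):
    """Return an antisense oligonucleotide covering the start codon.

    The region spans `flank` nucleotides on either side of the first ATG,
    truncated at the ends of the input sequence.
    """
    coding_strand = coding_strand.upper()
    atg = coding_strand.find("ATG")
    if atg < 0:
        raise ValueError("no ATG found in the supplied coding strand")
    lo = max(0, atg - flank)
    hi = min(len(coding_strand), atg + 3 + flank)
    return reverse_complement(coding_strand[lo:hi])
```

With the default `flank` of 10 and enough flanking sequence, the resulting oligonucleotide is 23 nucleotides long, which falls within the length range given above.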
[0152] Other known nucleotide modifications include methylation,
cyclization and `caps` and substitution of one or more of the
naturally occurring nucleotides with an analog such as inosine.
Modifications of nucleotides include modifications generated by the
addition to nucleotides of acridine, amine, biotin, cascade blue,
cholesterol, Cy3®, Cy5®, Cy5.5®, Dabcyl, digoxigenin,
dinitrophenyl, Edam, 6-FAM, fluorescein, 3'-glyceryl, HEX, IRD-700,
IRD-800, JOE, phosphate psoralen, rhodamine, ROX, thiol (SH),
spacers, TAMRA, TET, AMCA-S®, SE, BODIPY®, Marina
Blue®, Pacific Blue®, Oregon Green®, Rhodamine
Green®, Rhodamine Red®, Rhodol Green® and Texas
Red®. Polynucleotide backbone modifications include
methylphosphonate, 2'-OMe-methylphosphonate RNA, phosphorothioate,
RNA, 2'-OMeRNA. Base modifications include 2-amino-dA,
2-aminopurine, 3'-(ddA), 3' dA(cordycepin), 7-deaza-dA, 8-Br-dA,
8-oxo-dA, N6-Me-dA, abasic site (dSpacer), biotin dT,
T-OMe-5Me-C, 2'-OMe-propynyl-C, 3'-(5-Me-dC), 3'-(ddC), 5-Br-dC,
5-I-dC, 5-Me-dC, 5-F-dC, carboxy-dT, convertible dA, convertible
dC, convertible dG, convertible dT, convertible dU, 7-deaza-dG,
8-Br-dG, 8-oxo-dG, O6-Me-dG, S6-DNP-dG, 4-methyl-indole,
5-nitroindole, 2'-OMe-inosine, 2'-dI, O6-phenyl-dI,
4-methyl-indole, 2'-deoxynebularine, 5-nitroindole, 2-aminopurine,
dP(purine analogue), dK(pyrimidine analogue), 3-nitropyrrole,
2-thio-dT, 4-thio-dT, biotin-dT, carboxy-dT, O4-Me-dT,
O4-triazol dT, 2'-OMe-propynyl-U, 5-Br-dU, 2'-dU, 5-F-dU,
5-I-dU, O4-triazol dU.
[0153] The antisense nucleic acid molecules of the invention are
typically introduced into a plant or administered to a subject or
generated in situ such that they hybridize with or bind to cellular
mRNA and/or genomic DNA encoding a CCP protein to thereby inhibit
expression of the protein, e.g., by inhibiting transcription and/or
translation. The hybridization can be by conventional nucleotide
complementarity to form a stable duplex, or, for example, in the
case of an antisense nucleic acid molecule which binds to DNA
duplexes, through specific interactions in the major groove of the
double helix. Examples of routes of introduction or
administration of antisense nucleic acid molecules of the invention
include transformation in a plant or direct injection at a tissue
site in a subject. Alternatively, antisense nucleic acid molecules
can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense
molecules can be modified such that they specifically bind to
receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or
antibodies which bind to cell surface receptors or antigens. The
antisense nucleic acid molecules can also be delivered to cells
using the vectors described herein. To achieve sufficient
intracellular concentrations of the antisense molecules, vector
constructs in which the antisense nucleic acid molecule is placed
under the control of a constitutive promoter or a strong pol II or
pol III promoter are preferred.
[0154] In yet another embodiment, the antisense nucleic acid
molecule of the invention is an α-anomeric nucleic acid
molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary
to the usual β-units, the strands run parallel to each other
(Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The
antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res.
15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett. 215:327-330).
[0155] In another embodiment, the antisense nucleic acid molecule
further comprises a sense nucleic acid molecule complementary to
the antisense nucleic acid molecule. Gene silencing methods based
on such nucleic acid molecules are well known to the skilled
artisan (e.g., Grierson et al. (1998) WO 98/53083; Waterhouse et
al. (1999) WO 99/53050).
[0156] In still another embodiment, an antisense nucleic acid of
the invention is a ribozyme. Ribozymes are catalytic RNA molecules
with ribonuclease activity which are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes
(described in Haselhoff and Gerlach (1988) Nature 334:585-591)) can
be used to catalytically cleave CCP mRNA transcripts to thereby
inhibit translation of CCP mRNA. A ribozyme having specificity for
a CCP-encoding nucleic acid can be designed based upon the
nucleotide sequence of a CCP cDNA disclosed herein (i.e., SEQ ID
NO:1-66 or 228-239). For example, a derivative of a Tetrahymena
L-19 IVS RNA can be constructed in which the nucleotide sequence of
the active site is complementary to the nucleotide sequence to be
cleaved in a CCP-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742.
Alternatively, CCP mRNA can be used to select a catalytic RNA
having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science
261:1411-1418.
[0157] The use of ribozymes for gene silencing in plants is known
in the art (e.g., Atkins et al. (1994) WO 94/00012; Lenne et al.
(1995) WO 95/03404; Lutziger et al. (2000) WO 00/00619; Prinsen et
al. (1997) WO 97/13865 and Scott et al. (1997) WO 97/38116).
[0158] Alternatively, CCP gene expression can be inhibited by
targeting nucleotide sequences complementary to the regulatory
region of the CCP (e.g., the CCP promoter and/or enhancers) to form
triple helical structures that prevent transcription of the CCP
gene in target cells. See generally, Helene, C. (1991) Anticancer
Drug Des. 6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad.
Sci. 660:27-36; and Maher, L. J. (1992) BioEssays
14(12):807-15.
[0159] In yet another embodiment, the CCP nucleic acid molecules of
the present invention can be modified at the base moiety, sugar
moiety or phosphate backbone to improve, e.g., the stability,
hybridization, or solubility of the molecule. For example, the
deoxyribose phosphate backbone of the nucleic acid molecules can be
modified to generate peptide nucleic acids (see Hyrup B. et al.
(1996) Bioorganic & Medicinal Chemistry 4 (1): 5-23). As used
herein, the terms "peptide nucleic acids" or "PNAs" refer to
nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose
phosphate backbone is replaced by a pseudopeptide backbone and only
the four natural nucleobases are retained. The neutral backbone of
PNAs has been shown to allow for specific hybridization to DNA and
RNA under conditions of low ionic strength. The synthesis of PNA
oligomers can be performed using standard solid phase peptide
synthesis protocols as described in Hyrup B. et al. (1996) supra;
Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.
[0160] PNAs of CCP nucleic acid molecules can be used for
increasing crop yield in plants or in therapeutic and diagnostic
applications. For example, PNAs can be used as antisense or
antigene agents for sequence-specific modulation of gene expression
by, for example, inducing transcription or translation arrest or
inhibiting replication. PNAs of CCP nucleic acid molecules can also
be used in the analysis of single base pair mutations in a gene,
(e.g., by PNA-directed PCR clamping); as `artificial restriction
enzymes` when used in combination with other enzymes, (e.g., S1
nucleases (Hyrup B. (1996) supra)); or as probes or primers for DNA
sequencing or hybridization (Hyrup B. et al. (1996) supra;
Perry-O'Keefe supra).
[0161] In another embodiment, PNAs of CCP can be modified, (e.g.,
to enhance their stability or cellular uptake), by attaching
lipophilic or other helper groups to PNA, by the formation of
PNA-DNA chimeras, or by the use of liposomes or other techniques of
drug delivery known in the art. For example, PNA-DNA chimeras of
CCP nucleic acid molecules can be generated which may combine the
advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, (e.g., RNAse H and DNA polymerases), to
interact with the DNA portion while the PNA portion would provide
high binding affinity and specificity. PNA-DNA chimeras can be
linked using linkers of appropriate lengths selected in terms of
base stacking, number of bonds between the nucleobases, and
orientation (Hyrup B. (1996) supra). The synthesis of PNA-DNA
chimeras can be performed as described in Hyrup B. (1996) supra and
Finn P. J. et al. (1996) Nucleic Acids Res. 24 (17): 3357-63. For
example, a DNA chain can be synthesized on a solid support using
standard phosphoramidite coupling chemistry and modified nucleoside
analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine
phosphoramidite, can be used as a link between the PNA and the 5' end of
DNA (Mag, M. et al. (1989) Nucleic Acids Res. 17: 5973-88). PNA
monomers are then coupled in a stepwise manner to produce a
chimeric molecule with a 5' PNA segment and a 3' DNA segment (Finn
P. J. et al. (1996) supra). Alternatively, chimeric molecules can
be synthesized with a 5' DNA segment and a 3' PNA segment
(Petersen, K. H. et al. (1995) Bioorganic Med. Chem. Lett. 5:
1119-1124).
[0162] In other embodiments, the oligonucleotide may include other
appended groups such as peptides (e.g., for targeting host cell
receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad.
Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad.
Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the
blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In
addition, oligonucleotides can be modified with
hybridization-triggered cleavage agents (see, e.g., Krol et al.
(1988) BioTechniques 6:958-976) or intercalating agents (see,
e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the
oligonucleotide may be conjugated to another molecule, (e.g., a
peptide, hybridization triggered cross-linking agent, transport
agent, or hybridization-triggered cleavage agent).
II. Isolated CCP Proteins and Anti-CCP Antibodies
[0163] One aspect of the invention pertains to isolated CCP
proteins (e.g., the amino acid sequences set forth in SEQ ID
NO:67-132, 205, 211, 215-216, or 220-227) and biologically active
portions thereof, as well as polypeptide fragments suitable for use
as immunogens to raise anti-CCP antibodies. In one embodiment,
native CCP proteins can be isolated from cells or tissue sources by
an appropriate purification scheme using standard protein
purification techniques. In another embodiment, CCP proteins are
produced by recombinant DNA techniques. Alternative to recombinant
expression, a CCP protein or polypeptide can be synthesized
chemically using standard peptide synthesis techniques.
[0164] An "isolated" or "purified" protein or biologically active
portion thereof is substantially free of cellular material or other
contaminating proteins from the cell or tissue source from which
the CCP protein is derived, or substantially free from chemical
precursors or other chemicals when chemically synthesized. The
language "substantially free of cellular material" includes
preparations of CCP protein in which the protein is separated from
cellular components of the cells from which it is isolated or
recombinantly produced. In one embodiment, the language
"substantially free of cellular material" includes preparations of
CCP protein having less than about 30% (by dry weight) of non-CCP
protein (also referred to herein as a "contaminating protein"),
more preferably less than about 20% of non-CCP protein, still more
preferably less than about 10% of non-CCP protein, and most
preferably less than about 5% non-CCP protein. When the CCP protein
or biologically active portion thereof is recombinantly produced,
it is also preferably substantially free of culture medium, i.e.,
culture medium represents less than about 20%, more preferably less
than about 10%, and most preferably less than about 5% of the
volume of the protein preparation.
[0165] The language "substantially free of chemical precursors or
other chemicals" includes preparations of CCP protein in which the
protein is separated from chemical precursors or other chemicals
which are involved in the synthesis of the protein. In one
embodiment, the language "substantially free of chemical precursors
or other chemicals" includes preparations of CCP protein having
less than about 30% (by dry weight) of chemical precursors or
non-CCP chemicals, more preferably less than about 20% chemical
precursors or non-CCP chemicals, still more preferably less than
about 10% chemical precursors or non-CCP chemicals, and most
preferably less than about 5% chemical precursors or non-CCP
chemicals.
[0166] Biologically active portions of a CCP protein include
peptides comprising amino acid sequences sufficiently homologous to
or derived from the amino acid sequence of the CCP protein, which
include fewer amino acids than the full length CCP proteins, and
exhibit at least one activity of a CCP protein. Typically,
biologically active portions comprise a domain or motif with at
least one activity of the CCP protein. A biologically active
portion of a CCP protein can be a polypeptide which is, for
example, at least 10, 25, 50, 100 or more amino acids in
length.
[0167] To determine the percent identity of two amino acid
sequences or of two nucleic acid sequences, the sequences are
aligned for optimal comparison purposes (e.g., gaps can be
introduced in one or both of a first and a second amino acid or
nucleic acid sequence for optimal alignment and non-homologous
sequences can be disregarded for comparison purposes). In a
preferred embodiment, the length of a reference sequence aligned
for comparison purposes is at least 30%, preferably at least 40%,
more preferably at least 50%, even more preferably at least 60%,
and even more preferably at least 70%, 80%, or 90% of the length of
the reference sequence. The amino acid residues or nucleotides at
corresponding amino acid positions or nucleotide positions are then
compared. When a position in the first sequence is occupied by the
same amino acid residue or nucleotide as the corresponding position
in the second sequence, then the molecules are identical at that
position (as used herein amino acid or nucleic acid "identity" is
equivalent to amino acid or nucleic acid "homology"). The percent
identity between the two sequences is a function of the number of
identical positions shared by the sequences, taking into account
the number of gaps, and the length of each gap, which need to be
introduced for optimal alignment of the two sequences.
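The calculation described above can be sketched as follows for two sequences that have already been aligned. This is a minimal illustration under one stated convention, namely that the number of identical positions is divided by the total number of alignment columns, so that every gap position lowers the result; other programs may use a different denominator. The function name and the gap character "-" are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    A column counts as identical only if both sequences carry the same
    residue and neither carries a gap. The denominator is the number of
    alignment columns, so gap columns reduce the percentage.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical alignment with one mismatch and one gap in six columns.
example = percent_identity("ACGT-A", "ACCTGA")
```

In the hypothetical example, four of six columns are identical, so the function returns approximately 66.7.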
[0168] The comparison of sequences and determination of percent
identity between two sequences can be accomplished using a
mathematical algorithm. In a preferred embodiment, the percent
identity between two amino acid sequences is determined using the
Needleman and Wunsch (J. Mol. Biol. (48):444-453 (1970)) algorithm
which has been incorporated into the GAP program in the GCG
software package (available at http://www.gcg.com), using either a
Blosum 62 matrix or a PAM250 matrix, and a gap weight of 16, 14,
12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In
yet another preferred embodiment, the percent identity between two
nucleotide sequences is determined using the GAP program in the GCG
software package (available at http://www.gcg.com), using a
NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and
a length weight of 1, 2, 3, 4, 5, or 6. A preferred, non-limiting
example of parameters to be used in conjunction with the GAP
program includes a Blosum 62 scoring matrix with a gap penalty of
12, a gap extend penalty of 4, and a frameshift gap penalty of
5.
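For orientation, the global alignment principle of Needleman and Wunsch can be sketched in a few lines. The sketch below is not the GAP program and does not reproduce its parameters: it assumes a simple match/mismatch score in place of a Blosum 62 or PAM250 matrix and a single linear gap penalty in place of separate gap and length weights. All names and default values are illustrative assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for a[i-1] aligned with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The aligned strings returned by this function have equal length and can be compared column by column to obtain a percent identity.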
[0169] In another embodiment, the percent identity between two
amino acid or nucleotide sequences is determined using the
algorithm of E. Meyers and W. Miller (Comput. Appl. Biosci.,
4:11-17 (1988)) which has been incorporated into the ALIGN program
(version 2.0 or version 2.0U), using a PAM120 weight residue table,
a gap length penalty of 12 and a gap penalty of 4.
[0170] The nucleic acid and polypeptide sequences of the present
invention can further be used as a "query sequence" to perform a
search against public databases to, for example, identify other
family members or related sequences. Such searches can be performed
using the NBLAST and XBLAST programs (version 2.0) of Altschul, et
al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can
be performed with the NBLAST program, score=100, wordlength=12 to
obtain nucleotide sequences homologous to CCP
nucleic acid molecules of the invention. BLAST protein searches can
be performed with the XBLAST program, score=100, wordlength=3, and
a Blosum62 matrix to obtain amino acid sequences homologous to
CCP polypeptide molecules of the invention. To
obtain gapped alignments for comparison purposes, Gapped BLAST can
be utilized as described in Altschul et al., (1997) Nucleic Acids
Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST
programs, the default parameters of the respective programs (e.g.,
XBLAST and NBLAST) can be used. See
http://www.ncbi.nlm.nih.gov.
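The role of the wordlength parameter can be illustrated, in simplified form, by the following sketch. It is not the NBLAST or XBLAST program: it only locates exact shared words between a query and a subject sequence, whereas the BLAST programs additionally score neighboring words and extend the hits into alignments. The function name and the example sequences are hypothetical.

```python
from collections import defaultdict

def word_hits(query: str, subject: str, wordlength: int = 3):
    """Return (query_pos, subject_pos) pairs at which an identical word occurs."""
    # Index every word of the query by its starting position.
    index = defaultdict(list)
    for i in range(len(query) - wordlength + 1):
        index[query[i:i + wordlength]].append(i)
    # Scan the subject and report each position whose word is in the index.
    hits = []
    for j in range(len(subject) - wordlength + 1):
        for i in index.get(subject[j:j + wordlength], ()):
            hits.append((i, j))
    return hits
```

A larger wordlength (e.g., 12 for nucleotide searches) yields fewer, more specific seed hits, while a smaller wordlength (e.g., 3 for protein searches) is more sensitive.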
[0171] The invention also provides CCP chimeric or fusion proteins.
As used herein, a CCP "chimeric protein" or "fusion protein"
comprises a CCP polypeptide operatively linked to a non-CCP
polypeptide. A "CCP polypeptide" refers to a polypeptide having an
amino acid sequence corresponding to CCP, whereas a "non-CCP
polypeptide" refers to a polypeptide having an amino acid sequence
corresponding to a protein which is not substantially homologous to
the CCP protein, e.g., a protein which is different from the CCP
protein and which is derived from the same or a different organism.
Within a CCP fusion protein the CCP polypeptide can correspond to
all or a portion of a CCP protein. In a preferred embodiment, a CCP
fusion protein comprises at least one biologically active portion
of a CCP protein. In another preferred embodiment, a CCP fusion
protein comprises at least two biologically active portions of a
CCP protein. Within the fusion protein, the term "operatively
linked" is intended to indicate that the CCP polypeptide and the
non-CCP polypeptide are fused in-frame to each other. The non-CCP
polypeptide can be fused to the N-terminus or C-terminus of the CCP
polypeptide or can be inserted within the CCP polypeptide. The
non-CCP polypeptide can, for example, be (histidine)₆-tag,
glutathione S-transferase, protein A, maltose-binding protein,
dihydrofolate reductase, Tag·100 epitope (EETARFQPGYRS; SEQ
ID NO:199), c-myc epitope (EQKLISEEDL; SEQ ID NO:200),
FLAG®-epitope (DYKDDDK; SEQ ID NO:201), lacZ, CBP
(calmodulin-binding peptide), HA epitope (YPYDVPDYA; SEQ ID
NO:202), protein C epitope (EDQVDPRLIDGK; SEQ ID NO:203) or VSV
epitope (YTDIEMNRLGK; SEQ ID NO:204).
[0172] For example, in one embodiment, the fusion protein is a
GST-CCP fusion protein in which the CCP sequences are fused to the
C-terminus of the GST sequences. Such fusion proteins can
facilitate the purification of recombinant CCP.
[0173] In another embodiment, the fusion protein is a CCP protein
containing a heterologous signal sequence at its N-terminus. In
certain host cells (e.g., plant or mammalian host cells),
expression and/or secretion of CCP can be increased through use of
a heterologous signal sequence.
[0174] The CCP fusion proteins of the invention can be incorporated
into pharmaceutical compositions and administered to a plant or a
subject in vivo. The CCP fusion proteins can be used to affect the
bioavailability of a CCP substrate. Use of CCP fusion proteins may
be useful agriculturally for the increase of crop yields or
therapeutically for the treatment of cellular growth related
disorders, e.g., cancer. Moreover, the CCP-fusion proteins of the
invention can be used as immunogens to produce anti-CCP antibodies
in a subject, to purify CCP ligands and in screening assays to
identify molecules which inhibit the interaction of CCP with a CCP
substrate, e.g., a kinase such as CDC2b.
[0175] Preferably, a CCP chimeric or fusion protein of the
invention is produced by standard recombinant DNA techniques. For
example, DNA fragments coding for the different polypeptide
sequences are ligated together in-frame in accordance with
conventional techniques, for example by employing blunt-ended or
stagger-ended termini for ligation, restriction enzyme digestion to
provide for appropriate termini, filling-in of cohesive ends as
appropriate, alkaline phosphatase treatment to avoid undesirable
joining, and enzymatic ligation. In another embodiment, the fusion
gene can be synthesized by conventional techniques including
automated DNA synthesizers. Alternatively, PCR amplification of
gene fragments can be carried out using anchor primers which give
rise to complementary overhangs between two consecutive gene
fragments which can subsequently be annealed and reamplified to
generate a chimeric gene sequence (see, for example, Current
Protocols in Molecular Biology, eds. Ausubel et al. John Wiley
& Sons: 1992). Moreover, many expression vectors are
commercially available that already encode a fusion moiety (e.g., a
GST polypeptide). A CCP-encoding nucleic acid can be cloned into
such an expression vector such that the fusion moiety is linked
in-frame to the CCP protein.
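The requirement that the fusion moiety be linked in-frame can be illustrated by the following sketch, which joins two coding sequences and checks that the reading frame is preserved. The sequences in the usage note are short hypothetical placeholders, not sequences of the invention, and the helper names are assumptions made for illustration only. The standard genetic code is assumed.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
STOP_CODONS = ("TAA", "TAG", "TGA")

def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def fuse_in_frame(n_term: str, c_term: str) -> str:
    """Join two coding sequences so that c_term is read in the frame of n_term."""
    # Drop a terminal stop codon from the N-terminal partner, if present.
    if len(n_term) % 3 == 0 and n_term[-3:] in STOP_CODONS:
        n_term = n_term[:-3]
    if len(n_term) % 3:
        raise ValueError("N-terminal partner would shift the reading frame")
    fusion = n_term + c_term
    # A stop codon anywhere except the final codon would truncate the fusion.
    if "*" in translate(fusion)[:-1]:
        raise ValueError("premature stop codon in fusion")
    return fusion
```

For example, fusing a hypothetical tag-encoding sequence "ATG" + "CAT" * 6 to the hypothetical fragment "GCTAAAGGTTAA" gives a sequence that translates to MHHHHHHAKG followed by a stop.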
[0176] The present invention also pertains to variants of the CCP
proteins which function as either CCP agonists (mimetics) or as CCP
antagonists. Variants of the CCP proteins can be generated by
mutagenesis, e.g., discrete point mutation or truncation of a CCP
protein. An agonist of the CCP proteins can retain substantially
the same, or a subset, of the biological activities of the
naturally occurring form of a CCP protein. An antagonist of a CCP
protein can inhibit one or more of the activities of the naturally
occurring form of the CCP protein by, for example, competitively
modulating a cellular activity of a CCP protein. Thus, specific
biological effects can be elicited by treatment with a variant of
limited function. In one embodiment, treatment of a subject with a
variant having a subset of the biological activities of the
naturally occurring form of the protein has fewer side effects in a
subject relative to treatment with the naturally occurring form of
the CCP protein.
[0177] In one embodiment, variants of a CCP protein which function
as either CCP agonists (mimetics) or as CCP antagonists can be
identified by screening combinatorial libraries of mutants, e.g.,
truncation mutants, of a CCP protein for CCP protein agonist or
antagonist activity. In one embodiment, a variegated library of CCP
variants is generated by combinatorial mutagenesis at the nucleic
acid level and is encoded by a variegated gene library. A
variegated library of CCP variants can be produced by, for example,
enzymatically ligating a mixture of synthetic oligonucleotides into
gene sequences such that a degenerate set of potential CCP
sequences is expressible as individual polypeptides, or
alternatively, as a set of larger fusion proteins (e.g., for phage
display) containing the set of CCP sequences therein. There are a
variety of methods which can be used to produce libraries of
potential CCP variants from a degenerate oligonucleotide sequence.
Chemical synthesis of a degenerate gene sequence can be performed
in an automatic DNA synthesizer, and the synthetic gene then
ligated into an appropriate expression vector. Use of a degenerate
set of genes allows for the provision, in one mixture, of all of
the sequences encoding the desired set of potential CCP sequences.
Methods for synthesizing degenerate oligonucleotides are known in
the art (see, e.g., Narang, S. A. (1983) Tetrahedron 39:3; Itakura
et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984)
Science 198:1056; Ike et al. (1983) Nucleic Acid Res. 11:477).
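As a non-limiting illustration of how a single degenerate oligonucleotide represents a set of sequences in one mixture, the following sketch expands a degenerate sequence written with the standard IUPAC nucleotide codes into every sequence it encodes. The function name is hypothetical.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(oligo: str) -> list:
    """Return every concrete sequence encoded by a degenerate oligonucleotide."""
    choices = (IUPAC[base] for base in oligo.upper())
    return ["".join(combo) for combo in product(*choices)]
```

The size of the set grows multiplicatively with each degenerate position; a single NNK codon, for instance, corresponds to 32 sequences, and two consecutive NNK codons to 1024.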
[0178] In addition, libraries of fragments of a CCP protein coding
sequence can be used to generate a variegated population of CCP
fragments for screening and subsequent selection of variants of a
CCP protein. In one embodiment, a library of coding sequence
fragments can be generated by treating a double stranded PCR
fragment of a CCP coding sequence with a nuclease under conditions
wherein nicking occurs only about once per molecule, denaturing the
double stranded DNA, renaturing the DNA to form double stranded DNA
which can include sense/antisense pairs from different nicked
products, removing single stranded portions from reformed duplexes
by treatment with S1 nuclease, and ligating the resulting fragment
library into an expression vector. By this method, an expression
library can be derived which encodes N-terminal, C-terminal and
internal fragments of various sizes of the CCP protein.
[0179] Several techniques are known in the art for screening gene
products of combinatorial libraries made by point mutations or
truncation, and for screening cDNA libraries for gene products
having a selected property. Such techniques are adaptable for rapid
screening of the gene libraries generated by the combinatorial
mutagenesis of CCP proteins. The most widely used techniques, which
are amenable to high through-put analysis, for screening large gene
libraries typically include cloning the gene library into
replicable expression vectors, transforming appropriate cells with
the resulting library of vectors, and expressing the combinatorial
genes under conditions in which detection of a desired activity
facilitates isolation of the vector encoding the gene whose product
was detected. Recursive ensemble mutagenesis (REM), a new technique
which enhances the frequency of functional mutants in the
libraries, can be used in combination with the screening assays to
identify CCP variants (Arkin and Youvan (1992) Proc. Natl. Acad.
Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering
6(3):327-331).
[0180] In one embodiment, cell based assays can be exploited to
analyze a variegated CCP library. For example, a library of
expression vectors can be transfected into a cell line which
ordinarily synthesizes and secretes CCP. The transfected cells are
then cultured such that CCP and a particular mutant CCP are
secreted and the effect of expression of the mutant on CCP activity
in cell supernatants can be detected, e.g., by any of a number of
enzymatic assays. Plasmid DNA can then be recovered from the cells
which score for inhibition, or alternatively, potentiation of CCP
activity, and the individual clones further characterized.
[0181] An isolated CCP protein, or a portion or fragment thereof,
can be used as an immunogen to generate antibodies that bind CCP
using standard techniques for polyclonal and monoclonal antibody
preparation. A full-length CCP protein can be used or,
alternatively, the invention provides antigenic peptide fragments
of CCP for use as immunogens. The antigenic peptide of CCP
comprises at least 8 amino acid residues and encompasses an epitope
of CCP such that an antibody raised against the peptide forms a
specific immune complex with CCP. Preferably, the antigenic peptide
comprises at least 10 amino acid residues, more preferably at least
15 amino acid residues, even more preferably at least 20 amino acid
residues, and most preferably at least 30 amino acid residues.
[0182] Preferred epitopes encompassed by the antigenic peptide are
regions of CCP that are located on the surface of the protein,
e.g., hydrophilic regions.
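One way in which candidate hydrophilic regions might be located is sketched below. The specification does not name a particular method; this sketch assumes the Kyte-Doolittle hydropathy scale and a sliding window of 8 residues, and treats the window with the lowest mean hydropathy as the most hydrophilic. Such a score is only a rough indicator of surface exposure. The function name and the window size are assumptions.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def most_hydrophilic_window(protein: str, window: int = 8):
    """Return (start, peptide, mean_hydropathy) of the most hydrophilic window."""
    if len(protein) < window:
        raise ValueError("protein is shorter than the window")
    best_start, best_score = 0, None
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        score = sum(KYTE_DOOLITTLE[res] for res in segment) / window
        # Keep the first window with the lowest mean hydropathy.
        if best_score is None or score < best_score:
            best_start, best_score = start, score
    return best_start, protein[best_start:best_start + window], best_score
```

A peptide selected in this manner would still need to be evaluated experimentally for its ability to raise antibodies that form a specific immune complex with CCP.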
[0183] A CCP immunogen typically is used to prepare antibodies by
immunizing a suitable subject (e.g., rabbit, goat, mouse or other
mammal) with the immunogen. An appropriate immunogenic preparation
can contain, for example, recombinantly expressed CCP protein or a
chemically synthesized CCP polypeptide. The preparation can further
include an adjuvant, such as Freund's complete or incomplete
adjuvant, or similar immunostimulatory agent. Immunization of a
suitable subject with an immunogenic CCP preparation induces a
polyclonal anti-CCP antibody response.
[0184] Accordingly, another aspect of the invention pertains to
anti-CCP antibodies. The term "antibody" as used herein refers to
immunoglobulin molecules and immunologically active portions of
immunoglobulin molecules, i.e., molecules that contain an antigen
binding site which specifically binds (immunoreacts with) an
antigen, such as CCP. Examples of immunologically active portions
of immunoglobulin molecules include F(ab) and F(ab')₂
fragments which can be generated by treating the antibody with an
enzyme such as pepsin. The invention provides polyclonal and
monoclonal antibodies that bind CCP. The term "monoclonal antibody"
or "monoclonal antibody composition", as used herein, refers to a
population of antibody molecules that contain only one species of
an antigen binding site capable of immunoreacting with a particular
epitope of CCP. A monoclonal antibody composition thus typically
displays a single binding affinity for a particular CCP protein
with which it immunoreacts.
[0185] Polyclonal anti-CCP antibodies can be prepared as described
above by immunizing a suitable subject with a CCP immunogen. The
anti-CCP antibody titer in the immunized subject can be monitored
over time by standard techniques, such as with an enzyme linked
immunosorbent assay (ELISA) using immobilized CCP. If desired, the
antibody molecules directed against CCP can be isolated from the
mammal (e.g., from the blood) and further purified by well known
techniques, such as protein A chromatography to obtain the IgG
fraction. At an appropriate time after immunization, e.g., when the
anti-CCP antibody titers are highest, antibody-producing cells can
be obtained from the subject and used to prepare monoclonal
antibodies by standard techniques, such as the hybridoma technique
originally described by Kohler and Milstein (1975) Nature
256:495-497 (see also, Brown et al. (1981) J Immunol 127:539-46;
Brown et al. (1980) J. Biol. Chem. 255:4980-83; Yeh et al. (1976)
Proc. Natl. Acad. Sci. USA 76:2927-31; and Yeh et al. (1982) Int.
J. Cancer 29:269-75), the more recent human B cell hybridoma
technique (Kozbor et al. (1983) Immunol Today 4:72), the
EBV-hybridoma technique (Cole et al. (1985), Monoclonal Antibodies
and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96) or trioma
techniques. The technology for producing monoclonal antibody
hybridomas is well known (see generally R. H. Kenneth, in
Monoclonal Antibodies: A New Dimension In Biological Analyses,
Plenum Publishing Corp., New York, N.Y. (1980); E. A. Lerner (1981)
Yale J. Biol. Med., 54:387-402; M. L. Gefter et al. (1977) Somatic
Cell Genet. 3:231-36). Briefly, an immortal cell line (typically a
myeloma) is fused to lymphocytes (typically splenocytes) from a
mammal immunized with a CCP immunogen as described above, and the
culture supernatants of the resulting hybridoma cells are screened
to identify a hybridoma producing a monoclonal antibody that binds
CCP.
[0186] Any of the many well known protocols used for fusing
lymphocytes and immortalized cell lines can be applied for the
purpose of generating an anti-CCP monoclonal antibody (see, e.g.,
G. Galfre et al. (1977) Nature 266:550-52; Gefter et al. Somatic
Cell Genet., cited supra; Lerner, Yale J. Biol. Med., cited supra;
Kenneth, Monoclonal Antibodies, cited supra). Moreover, the
ordinarily skilled worker will appreciate that there are many
variations of such methods which also would be useful. Typically,
the immortal cell line (e.g., a myeloma cell line) is derived from
the same mammalian species as the lymphocytes. For example, murine
hybridomas can be made by fusing lymphocytes from a mouse immunized
with an immunogenic preparation of the present invention with an
immortalized mouse cell line. Preferred immortal cell lines are
mouse myeloma cell lines that are sensitive to culture medium
containing hypoxanthine, aminopterin and thymidine ("HAT medium").
Any of a number of myeloma cell lines can be used as a fusion
partner according to standard techniques, e.g., the P3-NS1/1-Ag4-1,
P3-x63-Ag8.653 or Sp2/O-Ag14 myeloma lines. These myeloma lines are
available from ATCC. Typically, HAT-sensitive mouse myeloma cells
are fused to mouse splenocytes using polyethylene glycol ("PEG").
Hybridoma cells resulting from the fusion are then selected using
HAT medium, which kills unfused and unproductively fused myeloma
cells (unfused splenocytes die after several days because they are
not transformed). Hybridoma cells producing a monoclonal antibody
of the invention are detected by screening the hybridoma culture
supernatants for antibodies that bind CCP, e.g., using a standard
ELISA assay.
[0187] Alternative to preparing monoclonal antibody-secreting
hybridomas, a monoclonal anti-CCP antibody can be identified and
isolated by screening a recombinant combinatorial immunoglobulin
library (e.g., an antibody phage display library) with CCP to
thereby isolate immunoglobulin library members that bind CCP. Kits
for generating and screening phage display libraries are
commercially available (e.g., the Pharmacia Recombinant Phage
Antibody System, Catalog No. 27-9400-01; and the Stratagene
SurfZAP™ Phage Display Kit, Catalog No. 240612). Additionally,
examples of methods and reagents particularly amenable for use in
generating and screening antibody display libraries can be found in,
for example, Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. PCT
International Publication No. WO 92/18619; Dower et al. PCT
International Publication No. WO 91/17271; Winter et al. PCT
International Publication WO 92/20791; Markland et al. PCT
International Publication No. WO 92/15679; Breitling et al. PCT
International Publication WO 93/01288; McCafferty et al. PCT
International Publication No. WO 92/01047; Garrard et al. PCT
International Publication No. WO 92/09690; Ladner et al. PCT
International Publication No. WO 90/02809; Fuchs et al. (1991)
Bio/Technology 9:1370-1372; Hay et al. (1992) Hum. Antibod.
Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281;
Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J.
Mol. Biol. 226:889-896; Clarkson et al. (1991) Nature 352:624-628;
Gram et al. (1992) Proc. Natl. Acad. Sci. USA 89:3576-3580; Garrard
et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991)
Nuc. Acid Res. 19:4133-4137; Barbas et al. (1991) Proc. Natl. Acad.
Sci. USA 88:7978-7982; and McCafferty et al. Nature (1990)
348:552-554.
[0188] Additionally, recombinant anti-CCP antibodies, such as
chimeric and humanized monoclonal antibodies, comprising both human
and non-human portions, which can be made using standard
recombinant DNA techniques, are within the scope of the invention.
Such chimeric and humanized monoclonal antibodies can be produced
by recombinant DNA techniques known in the art, for example using
methods described in Robinson et al. International Application No.
PCT/US86/02269; Akira, et al. European Patent Application 184,187;
Taniguchi, M., European Patent Application 171,496; Morrison et al.
European Patent Application 173,494; Neuberger et al. PCT
International Publication No. WO 86/01533; Cabilly et al. U.S. Pat.
No. 4,816,567; Cabilly et al. European Patent Application 125,023;
Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) Proc.
Natl. Acad. Sci. USA 84:3439-3443; Liu et al. (1987) J. Immunol.
139:3521-3526; Sun et al. (1987) Proc. Natl. Acad. Sci. USA
84:214-218; Nishimura et al. (1987) Canc. Res. 47:999-1005; Wood et
al. (1985) Nature 314:446-449; Shaw et al. (1988) J. Natl.
Cancer Inst. 80:1553-1559; Morrison, S. L. (1985) Science
229:1202-1207; Oi et al. (1986) BioTechniques 4:214; Winter U.S.
Pat. No. 5,225,539; Jones et al. (1986) Nature 321:522-525;
Verhoeyen et al. (1988) Science 239:1534; and Beidler et al. (1988)
J. Immunol. 141:4053-4060.
[0189] An anti-CCP antibody (e.g., monoclonal antibody) can be used
to isolate CCP by standard techniques, such as affinity
chromatography or immunoprecipitation. An anti-CCP antibody can
facilitate the purification of natural CCP from cells and of
recombinantly produced CCP expressed in host cells. Moreover, an
anti-CCP antibody can be used to detect CCP protein (e.g., in a
cellular lysate or cell supernatant) in order to evaluate the
abundance and pattern of expression of the CCP protein. These
antibodies can also be used, for example, for the
immunoprecipitation and immunolocalization of proteins according to
the invention as well as for the monitoring of the synthesis of
such proteins, for example, in recombinant organisms, and for the
identification of compounds interacting with the protein according
to the invention.
[0190] Anti-CCP antibodies can be used diagnostically to monitor
protein levels in tissue as part of a clinical testing procedure,
e.g., to determine the efficacy of a given treatment
regimen. Detection can be facilitated by coupling (i.e., physically
linking) the antibody to a detectable substance. Examples of
detectable substances include various enzymes, prosthetic groups,
fluorescent materials, luminescent materials, bioluminescent
materials, and radioactive materials. Examples of suitable enzymes
include horseradish peroxidase, alkaline phosphatase,
β-galactosidase, or acetylcholinesterase; examples of suitable
prosthetic group complexes include streptavidin/biotin and
avidin/biotin; examples of suitable fluorescent materials include
umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine,
dichlorotriazinylamine fluorescein, dansyl chloride or
phycoerythrin; an example of a luminescent material is
luminol; examples of bioluminescent materials include luciferase,
luciferin, and aequorin; and examples of suitable radioactive
materials include ¹²⁵I, ¹³¹I, ³⁵S or ³H.
III. Computer Readable Means
[0191] The CCP nucleotide sequences of the invention (e.g., SEQ ID
NO:1-66 or 228-239) or amino acid sequences of the invention (e.g.,
SEQ ID NO:67-132, 205, 211, 215-216, or 220-227) are also provided
in a variety of mediums to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic
acid or amino acid molecule, which contains a nucleotide or amino
acid sequence of the present invention. Such a manufacture
provides the nucleotide or amino acid sequences, or a subset
thereof (e.g., a subset of open reading frames (ORFs)) in a form
which allows a skilled artisan to examine the manufacture using
means not directly applicable to examining the nucleotide or amino
acid sequences, or a subset thereof, as they exist in nature or in
purified form.
[0192] In one application of this embodiment, a nucleotide or amino
acid sequence of the present invention can be recorded on computer
readable media. As used herein "computer readable media" includes
any medium that can be read and accessed directly by a computer.
Such media include, but are not limited to: magnetic storage media,
such as floppy discs, hard disc storage medium, and magnetic tape;
optical storage media such as CD-ROM; electrical storage media such
as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media. The skilled artisan will readily
appreciate how any of the presently known computer readable mediums
can be used to create a manufacture comprising computer readable
medium having recorded thereon a nucleotide or amino acid sequence
of the present invention.
[0193] As used herein "recorded" refers to a process of storing
information on computer readable medium. The skilled artisan can
readily adopt any of the presently known methods for recording
information on a computer readable medium to generate manufactures
comprising the nucleotide or amino acid sequence information of the
present invention.
[0194] A variety of data storage structures are available to a
skilled artisan for creating a computer readable medium having
recorded thereon a nucleotide or amino acid sequence of the present
invention. The choice of the data storage structure will generally
be based on the means chosen to access the stored information. In
addition, a variety of data processor programs and formats can be
used to store the nucleotide sequence information of the present
invention on computer readable medium. The sequence information can
be represented in a word processing text file, formatted in
commercially-available software such as WordPerfect and Microsoft
Word, or represented in the form of an ASCII file, stored in a
database application, such as DB2, Sybase, Oracle, or the like. The
skilled artisan can readily adapt any number of data processor
structuring formats (e.g., text file or database) in order to
obtain computer readable medium having recorded thereon the
nucleotide sequence information of the present invention.
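By way of example only, recording sequence information in a database application and retrieving it again can be sketched as follows. The sketch assumes SQLite, accessed through the Python standard library, in place of the database applications named above; the table layout, function names, identifiers and residue strings are hypothetical placeholders and are not sequences of the invention.

```python
import sqlite3

def store_sequences(conn, records):
    """Record (seq_id, kind, residues) tuples in a table of sequences."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "seq_id TEXT PRIMARY KEY, kind TEXT, residues TEXT)"
    )
    # Parameterized statements keep the residue strings as plain data.
    conn.executemany(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?, ?)", records
    )
    conn.commit()

def fetch_sequence(conn, seq_id):
    """Return the stored residues for seq_id, or None if it is not recorded."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")   # a file path would persist the data
    store_sequences(conn, [
        ("example_nt_1", "nucleotide", "ATGGCTAAAGGTTAA"),
        ("example_aa_1", "amino acid", "MAKG"),
    ])
    print(fetch_sequence(conn, "example_nt_1"))
```

The choice between a flat text file and a database of this kind will generally depend, as noted above, on the means chosen to access the stored information.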
[0195] By providing the nucleotide or amino acid sequences of the
invention in computer readable form, the skilled artisan can
routinely access the sequence information for a variety of
purposes. For example, one skilled in the art can use the
nucleotide or amino acid sequences of the invention in computer
readable form to compare a target sequence or target structural
motif with the sequence information stored within the data storage
means. Search means are used to identify fragments or regions of
the sequences of the invention which match a particular target
sequence or target motif. As used herein, a "target sequence" can be
any DNA or amino acid sequence of six or more nucleotides or two or
more amino acids. A skilled artisan can readily recognize that the
longer a target sequence is, the less likely a target sequence will
be present as a random occurrence in the database. The most
preferred sequence length of a target sequence is from about 10 to
100 amino acids or from about 30 to 300 nucleotide residues.
However, it is well recognized that commercially important
fragments, such as sequence fragments involved in gene expression
and protein processing, may be of shorter length.
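The relationship between target sequence length and random occurrence can be illustrated by the following sketch, which estimates the expected number of chance matches and locates exact occurrences of a target sequence in stored sequences. The estimate assumes independent, equally frequent residues, which real sequences only approximate. The function names and the example values are hypothetical.

```python
def expected_random_matches(target_length: int, database_length: int,
                            alphabet_size: int = 4) -> float:
    """Expected number of chance exact matches of a target in random sequence."""
    # Number of positions at which the target could start.
    positions = max(database_length - target_length + 1, 0)
    # Probability that one position matches every residue of the target.
    return positions * (1.0 / alphabet_size) ** target_length

def find_target(target: str, sequences: dict) -> list:
    """Return (seq_id, start) for every exact, possibly overlapping, occurrence."""
    hits = []
    for seq_id, residues in sequences.items():
        start = residues.find(target)
        while start != -1:
            hits.append((seq_id, start))
            start = residues.find(target, start + 1)
    return hits
```

Under these assumptions, a six-nucleotide target is expected to occur by chance roughly 244 times in about one million nucleotides of sequence, whereas a thirty-nucleotide target is expected to occur by chance far less than once.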
[0196] As used herein, "a target structural motif," or "target
motif," refers to any rationally selected sequence or combination
of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of
the target motif. There are a variety of target motifs known in the
art. Protein target motifs include, but are not limited to, enzyme
active sites and signal sequences. Nucleic acid target motifs
include, but are not limited to, promoter sequences, hairpin
structures and inducible expression elements (protein binding
sequences).
[0197] Computer software is publicly available which allows a
skilled artisan to access sequence information provided in a
computer readable medium for analysis and comparison to other
sequences. A variety of known algorithms are disclosed publicly and
a variety of commercially available software for conducting search
means is available and can be used in the computer-based systems of the
present invention. Examples of such software include, but are not
limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).
[0198] For example, software which implements the BLAST (Altschul
et al. (1990) J. Mol. Biol. 215:403-410) and BLAZE (Brutlag et al.
(1993) Comp. Chem. 17:203-207) search algorithms on a Sybase system
can be used to identify open reading frames (ORFs) of the sequences
of the invention which contain homology to ORFs or proteins from
other libraries. Such ORFs are protein encoding fragments and are
useful in producing commercially important proteins such as enzymes
used in various reactions and in the production of commercially
useful metabolites.
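A minimal sketch of locating candidate open reading frames in a stored nucleotide sequence is given below. It is not the BLAST or BLAZE software; it assumes a simple scan of the three forward reading frames for an ATG codon followed in-frame by a stop codon, and it ignores the reverse strand. The function name and the minimum length are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna: str, min_codons: int = 2) -> list:
    """Return sorted (start, end) spans of ATG...stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon; `min_codons` counts the
    codons from ATG up to, but not including, the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ORF
    return sorted(orfs)
```

Candidate ORFs identified in this way could then be compared against ORFs or proteins from other libraries using the search algorithms referred to above.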
IV. Recombinant Expression Vectors and Host Cells
[0199] Another aspect of the invention pertains to vectors,
preferably expression vectors, containing a nucleic acid encoding a
CCP protein (or a portion thereof). As used herein, the term
"vector" refers to a nucleic acid molecule capable of transporting
another nucleic acid to which it has been linked. One type of
vector is a "plasmid", which refers to a circular double stranded
DNA loop into which additional DNA segments can be ligated. Another
type of vector is a viral vector, wherein additional DNA segments
can be ligated into the viral genome. Certain vectors are capable
of autonomous replication in a host cell into which they are
introduced (e.g., bacterial vectors having a bacterial origin of
replication and episomal mammalian vectors). Other vectors (e.g.,
non-episomal mammalian vectors) are integrated into the genome of a
host cell upon introduction into the host cell, and thereby are
replicated along with the host genome. Moreover, certain vectors
are capable of directing the expression of genes to which they are
operatively linked. Such vectors are referred to herein as
"expression vectors". In general, expression vectors of utility in
recombinant DNA techniques are often in the form of plasmids. In
the present specification, "plasmid" and "vector" can be used
interchangeably as the plasmid is the most commonly used form of
vector. However, the invention is intended to include such other
forms of expression vectors, such as viral vectors (e.g.,
replication defective retroviruses, adenoviruses and
adeno-associated viruses), which serve equivalent functions.
[0200] The recombinant expression vectors of the invention comprise
a nucleic acid of the invention in a form suitable for expression
of the nucleic acid in a host cell, e.g., a plant cell, which means
that the recombinant expression vectors include one or more
regulatory sequences, selected on the basis of the host cells to be
used for expression, which are operatively linked to the nucleic
acid sequence to be expressed. Within a recombinant expression
vector, "operably linked" is intended to mean that the nucleotide
sequence of interest is linked to the regulatory sequence(s) in a
manner which allows for expression of the nucleotide sequence
(e.g., in an in vitro transcription/translation system or in a host
cell when the vector is introduced into the host cell). The term
"regulatory sequence" is intended to include promoters, enhancers
and other expression control elements (e.g., polyadenylation
signals). Such regulatory sequences are described, for example, in
Goeddel; Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, Calif. (1990). Regulatory sequences
include those which direct constitutive expression of a nucleotide
sequence in many types of host cell and those which direct
expression of the nucleotide sequence only in certain host cells
(e.g., tissue-specific regulatory sequences). It will be
appreciated by those skilled in the art that the design of the
expression vector can depend on such factors as the choice of the
host cell to be transformed, the level of expression of protein
desired, and the like. The expression vectors of the invention can
be introduced into host cells to thereby produce proteins or
peptides, including fusion proteins or peptides, encoded by nucleic
acids as described herein (e.g., CCP proteins, mutant forms of CCP
proteins, fusion proteins, and the like).
[0201] The vectors of the invention comprise a selectable and/or
scorable marker. Selectable marker genes useful for the selection
of transformed plant cells, callus, plant tissue and plants are
well known to those skilled in the art and comprise, for example,
antimetabolite resistance as the basis of selection for dhfr, which
confers resistance to methotrexate (Reiss, Plant Physiol. (Life
Sci. Adv.) 13 (1994), 143-149); npt, which confers resistance to
the aminoglycosides neomycin, kanamycin and paromomycin
(Herrera-Estrella, EMBO J. 2 (1983), 987-995) and hygro, which
confers resistance to hygromycin (Marsh, Gene 32 (1984), 481-485).
Additional selectable genes have been described, namely trpB, which
allows cells to utilize indole in place of tryptophan; hisD, which
allows cells to utilize histidinol in place of histidine (Hartman,
Proc. Natl. Acad. Sci. USA 85 (1988), 8047); mannose-6-phosphate
isomerase which allows cells to utilize mannose (WO 94/20627) and
ODC (ornithine decarboxylase) which confers resistance to the
ornithine decarboxylase inhibitor, 2-(difluoromethyl)-DL-ornithine,
DFMO (McConlogue, 1987, In: Current Communications in Molecular
Biology, Cold Spring Harbor Laboratory ed.) or deaminase from
Aspergillus terreus which confers resistance to Blasticidin S
(Tamura, Biosci. Biotechnol. Biochem. 59 (1995), 2336-2338).
[0202] Useful scorable markers are also known to those skilled in
the art and are commercially available. Advantageously, the marker
is a gene encoding luciferase (Giacomin, Pl. Sci. 116 (1996),
59-72; Scikantha, J. Bact. 178 (1996), 121), green fluorescent
protein (Gerdes, FEBS Lett. 389 (1996), 44-47) or β-glucuronidase
(Jefferson, EMBO J. 6 (1987), 3901-3907). This embodiment is
particularly useful for simple and rapid screening of cells,
tissues and organisms containing a vector of the invention.
[0203] A "plant promoter" is a promoter capable of initiating
transcription in plant cells. Exemplary plant promoters include,
but are not limited to, those that are obtained from plants, plant
viruses, and bacteria. Preferred promoters may contain additional
copies of one or more specific regulatory elements, to further
enhance expression and/or to alter the spatial expression and/or
temporal expression of a nucleic acid molecule to which it is
operably connected. For example, copper-responsive,
glucocorticoid-responsive or dexamethasone-responsive regulatory
elements may be placed adjacent to a heterologous promoter sequence
driving expression of a nucleic acid molecule to confer copper
inducible, glucocorticoid-inducible, or dexamethasone-inducible
expression respectively, on said nucleic acid molecule. Examples of
promoters under developmental control include promoters that
preferentially initiate transcription in certain tissues, such as
leaves, roots, seeds, endosperm, embryos, fibers, xylem vessels,
tracheids, or sclerenchyma. Such promoters are referred to as
"tissue preferred." Promoters which initiate transcription only in
certain tissue are referred to as "tissue specific." A "cell type"
specific promoter primarily drives expression in certain cell types
in one or more organs, for example, vascular cells in roots or
leaves. An "inducible" promoter is a promoter which is under
environmental control. Examples of environmental conditions that
may affect transcription by inducible promoters include anaerobic
conditions or the presence of light. Tissue specific, tissue
preferred, cell type specific, and inducible promoters constitute
the class of "non-constitutive" promoters. A "constitutive"
promoter is a promoter which is active under most environmental
conditions.
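The promoter classes defined in the preceding paragraph can be summarised as a small data structure. The following Python sketch is illustrative only; the names `PromoterClass`, `NON_CONSTITUTIVE` and `is_constitutive` are hypothetical and do not appear in the specification.

```python
from enum import Enum


class PromoterClass(Enum):
    """Promoter categories as defined in the text (names are illustrative)."""
    TISSUE_PREFERRED = "tissue preferred"      # preferentially active in certain tissues
    TISSUE_SPECIFIC = "tissue specific"        # active only in certain tissue
    CELL_TYPE_SPECIFIC = "cell type specific"  # primarily active in certain cell types
    INDUCIBLE = "inducible"                    # under environmental control
    CONSTITUTIVE = "constitutive"              # active under most environmental conditions


# The four classes that the text groups together as "non-constitutive".
NON_CONSTITUTIVE = frozenset({
    PromoterClass.TISSUE_PREFERRED,
    PromoterClass.TISSUE_SPECIFIC,
    PromoterClass.CELL_TYPE_SPECIFIC,
    PromoterClass.INDUCIBLE,
})


def is_constitutive(promoter_class: PromoterClass) -> bool:
    """Return True only for the constitutive class."""
    return promoter_class not in NON_CONSTITUTIVE
```

Under this sketch a promoter belongs to exactly one class; a real promoter may of course combine properties (for example, tissue preferred and inducible), which would call for a set of classes per promoter instead.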
[0204] Another aspect of the invention pertains to host cells into
which a recombinant expression vector of the invention has been
introduced. The terms "host cell" and "recombinant host cell" are
used interchangeably herein. It is understood that such terms refer
not only to the particular subject cell but to the progeny or
potential progeny of such a cell. Because certain modifications may
occur in succeeding generations due to either mutation or
environmental influences, such progeny may not, in fact, be
identical to the parent cell, but are still included within the
scope of the term as used herein.
[0205] A host cell can be any prokaryotic or eukaryotic cell. For
example, a CCP protein can be expressed in plant cells, bacterial
cells such as E. coli, insect cells, yeast or mammalian cells (such
as Chinese hamster ovary cells (CHO) or COS cells). Other suitable
host cells are known to those skilled in the art.
[0206] Vector DNA can be introduced into prokaryotic or eukaryotic
cells via conventional transformation or transfection techniques.
As used herein, the terms "transformation" and "transfection" are
intended to refer to a variety of art-recognized techniques for
introducing foreign nucleic acid (e.g., DNA) into a host cell,
including calcium phosphate or calcium chloride co-precipitation,
DEAE-dextran-mediated transfection, lipofection, or
electroporation. Suitable methods for transforming or transfecting
host cells can be found in Sambrook, et al. (Molecular Cloning: A
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989),
and other laboratory manuals.
[0207] Means for introducing a recombinant expression vector of
this invention into plant tissue or cells include, but are not
limited to: transformation using CaCl.sub.2 and variations thereof,
in particular the method described by Hanahan (J. Mol. Biol. 166,
557-560, 1983), direct DNA uptake into protoplasts (Krens et al,
Nature 296: 72-74, 1982; Paszkowski et al, EMBO J. 3:2717-2722,
1984), PEG-mediated uptake to protoplasts (Armstrong et al, Plant
Cell Reports 9: 335-339, 1990), microparticle bombardment,
electroporation (Fromm et al., Proc. Natl. Acad. Sci. (USA)
82:5824-5828, 1985), microinjection of DNA (Crossway et al., Mol.
Gen. Genet. 202:179-185, 1986), microparticle bombardment of tissue
explants or cells (Christou et al, Plant Physiol 87: 671-674, 1988;
Sanford, Particulate Science and Technology 5: 27-37, 1987),
vacuum-infiltration of tissue with nucleic acid, or in the case of
plants, T-DNA-mediated transfer from Agrobacterium to the plant
tissue as described essentially by An et al. (EMBO J. 4:277-284,
1985), Herrera-Estrella et al. (Nature 303: 209-213, 1983a; EMBO J.
2: 987-995, 1983b; In: Plant Genetic Engineering, Cambridge
University Press, N.Y., pp 63-93, 1985), or an in planta method using
Agrobacterium tumefaciens such as that described by Bechtold et
al., (C. R. Acad. Sci. (Paris, Sciences de la vie/Life
Sciences)316: 1194-1199, 1993), Clough et al (Plant J. 16: 735-743,
1998), Trieu et al. (Plant J. 22:531-541, 2000) or Kloti
(WO01/12828, 2001). Methods for transformation of monocotyledonous
plants are well known in the art and include Agrobacterium-mediated
transformation (Cheng et al. (1997) WO 97/48814; Hansen (1998) WO
98/54961; Hiei et al. (1994) WO 94/00977; Hiei et al. (1998) WO
98/17813; Rikiishi et al. (1999) WO 99/04618; Saito et al. (1995)
WO 95/06722), microprojectile bombardment (Adams et al. (1999) U.S.
Pat. No. 5,969,213; Bowen et al. (1998) U.S. Pat. No. 5,736,369;
Chang et al. (1994) WO 94/13822; Lundquist et al. (1999) U.S. Pat.
No. 5,874,265/U.S. Pat. No. 5,990,390; Vasil and Vasil (1995) U.S.
Pat. No. 5,405,765; Walker et al. (1999) U.S. Pat. No. 5,955,362),
DNA uptake (Eval et al. (1993) WO 93/181,168), microinjection of
Agrobacterium cells (von Holt 1994 DE 4309203), sonication (Finer
et al. (1997) U.S. Pat. No. 5,693,512) and flower-dip or in
planta-transformation (Kloti, WO01/12828, 2001).
[0208] The vector DNA may further comprise a selectable marker gene
to facilitate the identification and/or selection of cells which
are transfected or transformed with a genetic construct. Suitable
selectable marker genes contemplated herein include the ampicillin
resistance gene (Amp.sup.r), tetracycline resistance gene (Tc.sup.r),
bacterial kanamycin resistance gene (Kan.sup.r), phosphinothricin
resistance gene, neomycin phosphotransferase gene (nptII),
hygromycin resistance gene, .beta.-glucuronidase (GUS) gene,
chloramphenicol acetyltransferase (CAT) gene, green fluorescent
protein (gfp) gene (Haseloff et al, 1997), and luciferase gene.
[0209] For microparticle bombardment of cells, a microparticle is
propelled into a cell to produce a transformed cell. Any suitable
ballistic cell transformation methodology and apparatus can be used
in performing the present invention. Exemplary apparatus and
procedures are disclosed by Stomp et al. (U.S. Pat. No. 5,122,466)
and Sanford and Wolf (U.S. Pat. No. 4,945,050). When using
ballistic transformation procedures, the gene construct may
incorporate a plasmid capable of replicating in the cell to be
transformed. Examples of microparticles suitable for use in such
systems include 1 to 5 .mu.m gold spheres. The DNA construct may be
deposited on the microparticle by any suitable technique, such as
by precipitation.
[0210] A whole plant may be regenerated from the transformed or
transfected cell, in accordance with procedures well known in the
art. Plant tissue capable of subsequent clonal propagation, whether
by organogenesis or embryogenesis, may be transformed with a gene
construct of the present invention and a whole plant regenerated
therefrom. The particular tissue chosen will vary depending on the
clonal propagation systems available for, and best suited to, the
particular species being transformed. Exemplary tissue targets
include leaf disks, pollen, embryos, cotyledons, hypocotyls,
megagametophytes, callus tissue, existing meristematic tissue
(e.g., apical meristem, axillary buds, and root meristems), and
induced meristem tissue (e.g., cotyledon meristem and hypocotyl
meristem).
[0211] The term "organogenesis", as used herein, includes a process
by which shoots and roots are developed sequentially from
meristematic centres.
[0212] The term "embryogenesis", as used herein, includes a process
by which shoots and roots develop together in a concerted fashion
(not sequentially), whether from somatic cells or gametes.
[0213] Preferably, the plant is produced according to the methods
of the invention by transfecting or transforming the plant with a
genetic sequence, or by introducing to the plant a protein, by any
art-recognized means, such as microprojectile bombardment,
microinjection, Agrobacterium-mediated transformation (including in
planta transformation), protoplast fusion, or electroporation,
amongst others. Most preferably the plant is produced by
Agrobacterium-mediated transformation. Agrobacterium-mediated
transformation or agrolistic transformation of plants, yeast,
moulds or filamentous fungi is based on the transfer of part of the
transformation vector sequences, called the T-DNA, to the nucleus
and on integration of said T-DNA in the genome of said
eukaryote.
[0214] The term "Agrobacterium" as used herein, includes a member
of the Agrobacteriaceae, more preferably Agrobacterium or
Rhizobacterium and most preferably Agrobacterium tumefaciens.
[0215] The term "T-DNA", or "transferred DNA", as used herein,
includes the transformation vector flanked by T-DNA borders which
is, after activation of the Agrobacterium vir genes, nicked at the
T-DNA borders and is transferred as a single stranded DNA to the
nucleus of a eukaryotic cell.
[0216] As used herein, the terms "T-DNA borders", "T-DNA border
region", or "border region" include either right T-DNA borders (RB)
or left T-DNA borders (LB), which comprise a core sequence flanked
by a border inner region as part of the T-DNA flanking the border
and/or a border outer region as part of the vector backbone
flanking the border. The core sequences comprise 22 bp in case of
octopine-type vectors and 25 bp in case of nopaline-type vectors.
The core sequences in the right border region and left border
region form imperfect repeats.
[0217] As used herein, the term "T-DNA transformation vector" or
"T-DNA vector" includes any vector encompassing a T-DNA sequence
flanked by a right and left T-DNA border consisting of at least the
right and left border core sequences, respectively, and used for
transformation of any eukaryotic cell.
[0218] As used herein, the term "T-DNA vector backbone sequence" or
"T-DNA vector backbone sequences" includes all DNA of a T-DNA
containing vector that lies outside of the T-DNA borders and, more
specifically, outside the nicking sites of the border core
imperfect repeats.
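The distinction between the T-DNA and the vector backbone can be illustrated with a short sketch that partitions a circular vector sequence at the two border nicking sites. This is a minimal illustration under stated assumptions, not a method taken from the specification: the function name `split_t_dna_vector`, the 0-based coordinates, and the convention that the T-DNA runs from the right border nick to the left border nick in increasing coordinate order are all hypothetical choices made for the example.

```python
from typing import Tuple


def split_t_dna_vector(sequence: str, rb_nick: int, lb_nick: int) -> Tuple[str, str]:
    """Split a circular vector sequence into (t_dna, backbone).

    Assumes 0-based nick coordinates and that the T-DNA is read from the
    right border nick to the left border nick in increasing coordinate
    order, wrapping around the end of the circular sequence if needed.
    """
    n = len(sequence)
    if not (0 <= rb_nick < n and 0 <= lb_nick < n) or rb_nick == lb_nick:
        raise ValueError("nick positions must be distinct and within the sequence")
    doubled = sequence + sequence            # allows slicing across the origin
    t_len = (lb_nick - rb_nick) % n          # length of the region between the nicks
    t_dna = doubled[rb_nick:rb_nick + t_len]
    backbone = doubled[lb_nick:lb_nick + (n - t_len)]
    return t_dna, backbone
```

Every base of the vector ends up in exactly one of the two returned strings, which mirrors the definition of backbone sequence as all DNA lying outside the nicking sites.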
[0219] The present invention includes optimized T-DNA vectors such
that vector backbone integration in the genome of a eukaryotic cell
is minimized or absent. The term "optimized T-DNA vector" as used
herein includes a T-DNA vector designed either to decrease or
abolish transfer of vector backbone sequences to the genome of a
eukaryotic cell. Such T-DNA vectors are known to one of skill
in the art and include those described by Hanson et al. (1999) and
by Stuiver et al. (1999--WO9901563).
[0220] The current invention clearly considers the inclusion of a
DNA sequence encoding a CCP, homologue, analogue, derivative or
immunologically active fragment thereof as defined supra, in any
T-DNA vector comprising binary transformation vectors, super-binary
transformation vectors, co-integrate transformation vectors,
Ri-derived transformation vectors as well as in T-DNA carrying
vectors used in agrolistic transformation.
[0221] As used herein, the term "binary transformation vector"
includes a T-DNA transformation vector comprising: a T-DNA region
comprising at least one gene of interest and/or at least one
selectable marker active in the eukaryotic cell to be transformed;
and a vector backbone region comprising at least origins of
replication active in E. coli and Agrobacterium and markers for
selection in E. coli and Agrobacterium. Alternatively, replication
of the binary transformation vector in Agrobacterium is dependent
on the presence of a separate helper plasmid. The binary vector
pGreen and the helper plasmid pSoup form an example of such a
system (Hellens et al. (2000), Plant Mol. Biol. 42, 819-832;
http://www.pgreen.ac.uk).
[0222] The T-DNA borders of a binary transformation vector can be
derived from octopine-type or nopaline-type Ti plasmids or from
both. The T-DNA of a binary vector is only transferred to a
eukaryotic cell in conjunction with a helper plasmid. As used
herein, the term "helper plasmid" includes a plasmid that is stably
maintained in Agrobacterium and is at least carrying the set of vir
genes necessary for enabling transfer of the T-DNA. The set of vir
genes can be derived from either octopine-type or nopaline-type Ti
plasmids or from both.
[0223] As used herein, the term "super-binary transformation
vector" includes a binary transformation vector additionally
carrying in the vector backbone region a vir region of the Ti
plasmid pTiBo542 of the super-virulent A. tumefaciens strain A281
(EP0604662, EP0687730). Super-binary transformation vectors are
used in conjunction with a helper plasmid.
[0224] As used herein, the term "co-integrate transformation
vector" includes a T-DNA vector at least comprising: a T-DNA region
comprising at least one gene of interest and/or at least one
selectable marker active in plants; and a vector backbone region
comprising at least origins of replication active in Escherichia
coli and Agrobacterium, and markers for selection in E. coli and
Agrobacterium, and a set of vir genes necessary for enabling
transfer of the T-DNA. The T-DNA borders and the set of vir genes
of the T-DNA vector can be derived from either octopine-type or
nopaline-type Ti plasmids or from both.
[0225] The term "Ri-derived plant transformation vector" includes a
binary transformation vector in which the T-DNA borders are derived
from a Ti plasmid and which is used in conjunction with a `helper`
Ri-plasmid carrying the necessary set of vir genes.
[0226] The terms "agrolistics", "agrolistic transformation" or
"agrolistic transfer" include a transformation method combining
features of Agrobacterium-mediated transformation and of biolistic
DNA delivery. As such, a T-DNA containing target plasmid is
co-delivered with DNA/RNA enabling in planta production of VirD1
and VirD2 with or without VirE2 (Hansen and Chilton 1996; Hansen et
al. 1997; Hansen and Chilton 1997--WO9712046).
[0227] A host cell of the invention, such as a prokaryotic or
eukaryotic host cell in culture, can be used to produce (i.e.,
express) a CCP protein. Accordingly, the invention further provides
methods for producing a CCP protein using the host cells of the
invention. In one embodiment, the method comprises culturing the
host cell of the invention (into which a recombinant expression vector
encoding a CCP protein has been introduced) in a suitable medium
such that a CCP protein is produced. In another embodiment, the
method further comprises isolating a CCP protein from the medium or
the host cell.
[0228] The host cells of the invention can also be used to produce
transgenic plants or non-human transgenic animals in which exogenous
CCP sequences have been introduced into their genome or homologous
recombinant plants or animals in which endogenous CCP sequences
have been altered. Such plants and animals are useful for studying
the function and/or activity of a CCP and for identifying and/or
evaluating modulators of CCP activity.
Transgenic Plants
[0229] As used herein, "transgenic plant" includes a plant which
comprises within its genome a heterologous polynucleotide.
Generally, the heterologous polynucleotide is stably integrated
within the genome such that the polynucleotide is passed on to
successive generations. The heterologous polynucleotide may be
integrated into the genome alone or as part of a recombinant
expression cassette. "Transgenic" is used herein to include any
cell, cell line, callus, tissue, plant part or plant, the genotype
of which has been altered by the presence of heterologous nucleic
acid including those transgenics initially so altered as well as
those created by sexual crosses or asexual propagation from the
initial transgenic. The term "transgenic" as used herein does not
encompass the alteration of the genome (chromosomal or
extra-chromosomal) by conventional plant breeding methods or by
naturally occurring events such as random cross-fertilization,
non-recombinant viral infection, non-recombinant bacterial
transformation, non-recombinant transposition, or spontaneous
mutation.
[0230] A transgenic plant of the invention can be created by
introducing a CCP-encoding nucleic acid into the plant by placing
it under the control of regulatory elements which ensure the
expression in plant cells. These regulatory elements may be
heterologous or homologous with respect to the nucleic acid
molecule to be expressed as well as with respect to the plant species
to be transformed. In general, such regulatory elements comprise a
promoter active in plant cells. These promoters can be used to
modulate (e.g. increase or decrease) CCP content and/or composition
in a desired tissue. To obtain expression in all tissues of a
transgenic plant, preferably constitutive promoters are used, such
as the 35S promoter of CaMV (Odell, Nature 313 (1985), 810-812) or
promoters from such genes as rice actin (McElroy et al. (1990)
Plant Cell 2:163-171), maize H3 histone (Lepetit et al. (1992) Mol.
Gen. Genet. 231:276-285) or promoters of the polyubiquitin genes of
maize (Christensen, Plant Mol. Biol. 18 (1992), 675-689). In order
to achieve expression in specific tissues of a transgenic plant it
is possible to use tissue specific promoters (see, e.g., Stockhaus,
EMBO J. 8 (1989), 2245-2251 or Table II, below).
TABLE II
GENE SOURCE | EXPRESSION PATTERN | REFERENCE
.alpha.-amylase (Amy32b) | aleurone | Lanahan, M. B., et al., Plant Cell 4: 203-211, 1992; Skriver, K., et al., Proc. Natl. Acad. Sci. (USA) 88: 7266-7270, 1991
cathepsin .beta.-like gene | aleurone | Cejudo, F. J., et al., Plant Molecular Biology 20: 849-856, 1992
Agrobacterium rhizogenes rolB | cambium | Nilsson et al., Physiol. Plant. 100: 456-462, 1997
PRP genes | cell wall | http://salus.medium.edu/mmg/tierney/html
barley Itr1 promoter | endosperm |
synthetic promoter | endosperm | Vicente-Carbajosa et al., Plant J. 13: 629-640, 1998
AtPRP4 | flowers | http://salus.medium.edu/mmg/tierney/html
chalcone synthase (chsA) | flowers | Van der Meer, et al., Plant Mol. Biol. 15, 95-109, 1990
LAT52 | anther | Twell et al., Mol. Gen. Genet. 217: 240-245 (1989)
apetala-3 | flowers |
chitinase | fruit (berries, grapes, etc.) | Thomas et al., CSIRO Plant Industry, Urrbrae, South Australia, Australia; http://winetitles.com.au/gwrdc/csh95-1.html
rbcs-3A | green tissue (e.g., leaf) | Lam, E. et al., The Plant Cell 2: 857-866, 1990; Tucker et al., Plant Physiol. 113: 1303-1308, 1992
leaf-specific genes | leaf | Baszczynski, et al., Nucl. Acid Res. 16: 4732, 1988
AtPRP4 | leaf | http://salus.medium.edu/mmg/tierney/html
Pinus cab-6 | leaf | Yamamoto et al., Plant Cell Physiol. 35: 773-778, 1994
SAM22 | senescent leaf | Crowell, et al., Plant Mol. Biol. 18: 459-466, 1992
R. japonicum nif gene | nodule | U.S. Pat. No. 4,803,165
B. japonicum nifH gene | nodule | U.S. Pat. No. 5,008,194
GmENOD40 | nodule | Yang, et al., The Plant J. 3: 573-585
PEP carboxylase (PEPC) | nodule | Pathirana, et al., Plant Mol. Biol. 20: 437-450, 1992
leghaemoglobin (Lb) | nodule | Gordon, et al., J. Exp. Bot. 44: 1453-1465, 1993
Tungro bacilliform virus gene | phloem | Bhattacharyya-Pakrasi, et al., The Plant J. 4: 71-79, 1992
sucrose-binding protein gene | plasma membrane | Grimes, et al., The Plant Cell 4: 1561-1574, 1992
pollen-specific genes | pollen; microspore | Albani, et al., Plant Mol. Biol. 15: 605, 1990; Albani, et al., Plant Mol. Biol. 16: 501, 1991
Zm13 | pollen | Guerrero et al., Mol. Gen. Genet. 224: 161-168 (1993)
apg gene | microspore | Twell et al., Sex. Plant Reprod. 6: 217-224 (1993)
maize pollen-specific gene | pollen | Hamilton, et al., Plant Mol. Biol. 18: 211-218, 1992
sunflower pollen-expressed gene | pollen | Baltz, et al., The Plant J. 2: 713-721, 1992
B. napus pollen-specific gene | pollen; anther; tapetum | Arnoldo, et al., J. Cell. Biochem., Abstract No. Y101, 204, 1992
root-expressible genes | roots | Tingey, et al., EMBO J. 6: 1, 1987
tobacco auxin-inducible gene | root tip | Van der Zaal, et al., Plant Mol. Biol. 16, 983, 1991
.beta.-tubulin | root | Oppenheimer, et al., Gene 63: 87, 1988
tobacco root-specific genes | root | Conkling, et al., Plant Physiol. 93: 1203, 1990
B. napus G1-3b gene | root | U.S. Pat. No. 5,401,836
SbPRP1 | roots | Suzuki et al., Plant Mol. Biol. 21: 109-119, 1993
AtPRP1; AtPRP3 | roots; root hairs | http://salus.medium.edu/mmg/tierney/html
RD2 gene | root cortex | http://www2.cnsu.edu/ncsu/research
TobRB7 gene | root vasculature | http://www2.cnsu.edu/ncsu/research
AtPRP4 | leaves; flowers; lateral root primordia | http://salus.medium.edu/mmg/tierney/html
seed-specific genes | seed | Simon, et al., Plant Mol. Biol. 5: 191, 1985; Scofield, et al., J. Biol. Chem. 262: 12202, 1987; Baszczynski, et al., Plant Mol. Biol. 14: 633, 1990
Brazil Nut albumin | seed | Pearson, et al., Plant Mol. Biol. 18: 235-245, 1992
legumin | seed | Ellis, et al., Plant Mol. Biol. 10: 203-214, 1988
glutelin (rice) | seed | Takaiwa, et al., Mol. Gen. Genet. 208: 15-22, 1986; Takaiwa, et al., FEBS Letts. 221: 43-47, 1987
zein | seed | Matzke et al., Plant Mol. Biol. 14(3): 323-32, 1990
napA | seed | Stalberg, et al., Planta 199: 515-519, 1996
sunflower oleosin | seed (embryo and dry seed) | Cummins, et al., Plant Mol. Biol. 19: 873-876, 1992
LEAFY | shoot meristem | Weigel et al., Cell 69: 843-859, 1992
Arabidopsis thaliana knat1 | shoot meristem | Accession number AJ131822
Malus domestica kn1 | shoot meristem | Accession number Z71981
CLAVATA1 | shoot meristem | Accession number AF049870
stigma-specific genes | stigma | Nasrallah, et al., Proc. Natl. Acad. Sci. USA 85: 5551, 1988; Trick, et al., Plant Mol. Biol. 15: 203, 1990
class I patatin gene | tuber | Liu et al., Plant Mol. Biol. 153: 386-395, 1991
blz2 | endosperm | EP99106056.7
PCNA rice | meristem | Kosugi et al., Nucleic Acids Research 19: 1571-1576, 1991; Kosugi S. and Ohashi Y., Plant Cell 9: 1607-1619, 1997
The promoters listed in the foregoing table are provided for the
purposes of exemplification only and the present invention is not
to be limited by the list provided therein. Those skilled in the
art will readily be in a position to provide additional promoters
that are useful in performing the present invention. The promoters
listed may also be modified to provide specificity of expression as
required.
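Choosing a promoter from the foregoing table amounts to a lookup by expression pattern. The following Python sketch shows one way this could be done; it copies only a handful of rows from Table II, and the names `TABLE_II_SUBSET` and `promoters_for` are hypothetical helpers introduced for illustration.

```python
from typing import List, Tuple

# A few rows copied from Table II: (gene source, expression pattern).
TABLE_II_SUBSET: List[Tuple[str, str]] = [
    ("alpha-amylase (Amy32b)", "aleurone"),
    ("LAT52", "anther"),
    ("Pinus cab-6", "leaf"),
    ("SAM22", "senescent leaf"),
    ("legumin", "seed"),
    ("napA", "seed"),
    ("LEAFY", "shoot meristem"),
    ("class I patatin gene", "tuber"),
]


def promoters_for(pattern: str) -> List[str]:
    """Return gene sources whose expression pattern matches exactly (case-insensitive)."""
    wanted = pattern.strip().lower()
    return [gene for gene, expr in TABLE_II_SUBSET if expr.lower() == wanted]
```

Exact matching is a deliberate simplification: a query for "leaf" does not return the senescent-leaf entry, so a practical tool would need a controlled vocabulary of tissues.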
[0231] Promoters are also known which are specifically active in
tubers of potatoes or in seeds of different plant species, such as
maize, Vicia, wheat, barley and the like. Inducible promoters may
be used in order to control expression precisely under
certain environmental or developmental conditions, such as the
presence of pathogens, anaerobiosis, or light. Examples of inducible promoters
include the promoters of genes encoding heat shock proteins or
microspore-specific regulatory elements (WO96/16182). Furthermore,
the chemically inducible Tet-system may be employed (Gatz, Mol.
Gen. Genet. 227 (1991), 229-237). Further suitable promoters are
known to the person skilled in the art and are described, e.g., in
Ward (Plant Mol. Biol. 22 (1993), 361-366). The regulatory elements
may further comprise transcriptional and/or translational enhancers
functional in plant cells. Furthermore, the regulatory elements
may include transcription termination signals, such as a poly-A
signal, which lead to the addition of a poly A tail to the
transcript which may improve its stability.
[0232] In the case that a nucleic acid molecule according to the
invention is expressed in the sense orientation, the coding
sequence can be modified such that the protein is located in any
desired compartment of the plant cell, e.g., the nucleus,
endoplasmic reticulum, the vacuole, the mitochondria, the
plastids, the apoplast, or the cytoplasm.
[0233] Methods for the introduction of foreign DNA into plants are
also well known in the art. These include, for example, the
transformation of plant cells or tissues with T-DNA using
Agrobacterium tumefaciens or Agrobacterium rhizogenes, the fusion
of protoplasts, direct gene transfer (see, e.g., EP-A 164 575),
injection, electroporation, biolistic methods like particle
bombardment, pollen-mediated transformation, plant RNA
virus-mediated transformation, liposome-mediated transformation,
transformation using wounded or enzyme-degraded immature embryos,
or wounded or enzyme-degraded embryogenic callus and other methods
known in the art. The vectors used in the method of the invention
may contain further functional elements, for example "left border"-
and "right border"-sequences of the T-DNA of Agrobacterium which
allow for stable integration into the plant genome. Furthermore,
methods and vectors are known to the person skilled in the art
which permit the generation of marker free transgenic plants, i.e.,
the selectable or scorable marker gene is lost at a certain stage
of plant development or plant breeding. This can be achieved by,
for example, cotransformation (Lyznik, Plant Mol. Biol. 13 (1989),
151-161; Peng, Plant Mol. Biol. 27 (1995), 91-104) and/or by using
systems which utilize enzymes capable of promoting homologous
recombination in plants (see, e.g., WO97/08331; Bayley, Plant Mol.
Biol. 18 (1992), 353-361; Lloyd, Mol. Gen. Genet. 242 (1994),
653-657; Maeser, Mol. Gen. Genet. 230 (1991), 170-176; Onouchi,
Nucl. Acids Res. 19 (1991), 6373-6378). Methods for the preparation
of appropriate vectors are described by, e.g., Sambrook (Molecular
Cloning; A Laboratory Manual, 2nd Edition (1989), Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, N.Y.).
[0234] Suitable strains of Agrobacterium tumefaciens and vectors,
as well as transformation of Agrobacteria, and appropriate growth
and selection media are described in the literature (see, for
example, GV3101 (pMK90RK), Koncz, Mol. Gen. Genet. 204 (1986),
383-396; C58C1 (pGV 3850kan), Deblaere, Nucl. Acids Res. 13 (1985),
4777; Bevan, Nucl. Acids Res. 12 (1984), 8711; Koncz, Proc. Natl.
Acad. Sci. USA 86 (1989), 8467-8471; Koncz, Plant Mol. Biol. 20
(1992), 963-976; Koncz, Specialized vectors for gene tagging and
expression studies. In: Plant Molecular Biology Manual Vol 2, Gelvin
and Schilperoort (Eds.), Dordrecht, The Netherlands: Kluwer Academic
Publ. (1994), 1-22; EP-A-120 516; Hoekema: The Binary Plant Vector
System, Offsetdrukkerij Kanters B. V., Alblasserdam (1985), Chapter
V; Fraley, Crit. Rev. Plant. Sci., 4, 1-46; An, EMBO J. 4 (1985),
277-287). Although the use of Agrobacterium tumefaciens is
preferred in the method of the invention, other Agrobacterium
strains, such as Agrobacterium rhizogenes, may be used, for
example, if a phenotype conferred by said strain is desired.
[0235] Methods for the transformation using biolistic methods are
known to the person skilled in the art; see, e.g., Wan, Plant
Physiol. 104 (1994), 37-48; Vasil, Bio/Technology 11 (1993),
1553-1558 and Christou (1996) Trends in Plant Science 1, 423-431.
Microinjection can be performed as described in Potrykus and
Spangenberg (eds.), Gene Transfer To Plants. Springer Verlag,
Berlin, N.Y. (1995).
[0236] The transformation of most dicotyledonous plants may be
performed using the methods described above or using transformation
via biolistic methods as, e.g., described above as well as
protoplast transformation, electroporation of partially
permeabilized cells, or introduction of DNA using glass fibers.
[0237] In general, the plants which are modified according to the
invention may be derived from any desired plant species. They can
be monocotyledonous plants or dicotyledonous plants, preferably
they belong to plant species of interest in agriculture, wood
culture or horticulture, such as crop plants (e.g., maize,
rice, barley, wheat, rye, oats), potatoes, oil producing plants
(e.g., oilseed rape, sunflower, peanut, soybean), cotton, sugar
beet, sugar cane, leguminous plants (e.g., beans, peas), or wood
producing plants, preferably trees.
[0238] The present invention also relates to a transgenic plant
cell which contains (preferably stably integrated into its genome)
a nucleic acid molecule of the present invention linked to
regulatory elements which allow expression of the nucleic acid
molecule in plant cells. The presence and expression of the nucleic
acid molecule in the transgenic plant cells leads to the synthesis
of a CCP protein and may lead to physiological and phenotypic
changes in plants containing such cells.
[0239] Transformed plant cells which are derived by any of the
above transformation techniques can be cultured to regenerate a
whole plant which possesses the transformed genotype. Such
regeneration techniques often rely on manipulation of certain
phytohormones in a tissue culture growth medium, typically relying
on a biocide and/or herbicide marker which has been introduced with
a polynucleotide of the present invention.
[0240] Plant cells transformed with a plant expression vector can
be regenerated, e.g., from single cells, callus tissue or leaf
discs according to standard plant tissue culture techniques. It is
well known in the art that various cells, tissues, and organs from
almost any plant can be successfully cultured to regenerate an
entire plant. Plant regeneration from cultured protoplasts is
described in Evans et al., Protoplasts Isolation and Culture,
Handbook of Plant Cell Culture, Macmillan Publishing Company, New
York, pp. 124-176 (1983); and Binding, Regeneration of Plants,
Plant Protoplasts, CRC Press, Boca Raton, pp. 21-73 (1985).
[0241] Transformed plant cells, calli or explants can be cultured on
regeneration medium in the dark for several weeks, generally about
1 to 3 weeks, to allow the somatic embryos to mature. Preferred
regeneration media include media containing MS salts, such as PHI-E
and PHI-F media. The plant cells, calli or explants are then
typically cultured on rooting medium in a light/dark cycle until
shoots and roots develop. Methods for plant regeneration are known
in the art and preferred methods are provided by Kamo et al., (Bot.
Gaz. 146(3):324-334, 1985), West et al., (The Plant Cell
5:1361-1369, 1993), and Duncan et al. (Planta 165:322-332,
1985).
[0242] Small plantlets can then be transferred to tubes containing
rooting medium and allowed to grow and develop more roots for
approximately another week. The plants can then be transplanted to
soil mixture in pots in the greenhouse.
[0243] The regeneration of plants containing the foreign gene
introduced by Agrobacterium from leaf explants can be achieved as
described by Horsch et al., Science, 227:1229-1231 (1985). In this
procedure, transformants are grown in the presence of a selection
agent and in a medium that induces the regeneration of shoots in
the plant species being transformed as described by Fraley et al.,
Proc. Natl. Acad. Sci. U.S.A. 80:4803 (1983). This procedure
typically produces shoots within two to four weeks and these
transformant shoots are then transferred to an appropriate
root-inducing medium containing the selective agent and an
antibiotic to prevent bacterial growth. Transgenic plants of the
present invention may be fertile or sterile.
[0244] Regeneration can also be obtained from plant callus,
explants, organs, or parts thereof. Such regeneration techniques
are described generally in Klee et al., Ann. Rev. of Plant Phys.,
38:467-486 (1987). The regeneration of plants from either single
plant protoplasts or various explants is well known in the art.
See, for example, Methods for Plant Molecular Biology, A.
Weissbach and H. Weissbach, eds., Academic Press, Inc., San Diego,
Calif. (1988). This regeneration and growth process includes the
steps of selection of transformant cells and shoots, rooting the
transformant shoots and growth of the plantlets in soil. For maize
cell culture and regeneration see generally, The Maize Handbook,
Freeling and Walbot, Eds., Springer, New York (1994); Corn and Corn
Improvement, 3.sup.rd edition, Sprague and Dudley Eds., American
Society of Agronomy, Madison, Wis. (1988).
[0245] One of skill will recognize that after the recombinant
expression cassette is stably incorporated in transgenic plants and
confirmed to be operable, it can be introduced into other plants by
sexual crossing. Any of a number of standard breeding techniques
can be used, depending upon the species to be crossed.
[0246] In vegetatively propagated crops, mature transgenic plants
can be propagated by the taking of cuttings or by tissue culture
techniques to produce multiple identical plants. Selection of
desirable transgenics is made and new varieties are obtained and
propagated vegetatively for commercial use. In seed propagated
crops, mature transgenic plants can be self crossed to produce a
homozygous inbred plant. The inbred plant produces seed containing
the newly introduced heterologous nucleic acid. These seeds can be
grown to produce plants that would produce the selected phenotype,
(e.g., altered cell cycle content or composition).
[0247] Parts obtained from the regenerated plant, such as flowers,
seeds, leaves, branches, fruit and the like are included in the
invention, provided that these parts comprise cells comprising the
isolated nucleic acid of the present invention. Progeny,
variants, and mutants of the regenerated plants are also included
within the scope of the invention, provided that they
comprise the introduced nucleic acid sequences.
[0248] Transgenic plants expressing the selectable marker can be
screened for transmission of the nucleic acid of the present
invention by, for example, standard immunoblot and DNA detection
techniques. Transgenic lines are also typically evaluated on levels
of expression of the heterologous nucleic acid. Expression at the
RNA level can be determined initially to identify and quantitate
expression-positive plants. Standard techniques for RNA analysis
can be employed and include PCR amplification assays using
oligonucleotide primers designed to amplify only the heterologous
RNA templates and solution hybridization assays using heterologous
nucleic acid-specific probes. The RNA-positive plants can then
be analyzed for protein expression by Western immunoblot analysis
using the specifically reactive antibodies of the present
invention. In addition, in situ hybridization and
immunocytochemistry according to standard protocols can be done
using heterologous nucleic acid specific polynucleotide probes and
antibodies, respectively, to localize sites of expression within
transgenic tissue. Generally, a number of transgenic lines are
screened for the incorporated nucleic acid to identify and
select plants with the most appropriate expression profiles.
[0249] A preferred embodiment of the invention is a transgenic
plant that is homozygous for the added heterologous nucleic acid;
i.e., a transgenic plant that contains two added nucleic acid
sequences, one gene at the same locus on each chromosome of a
chromosome pair. A homozygous transgenic plant can be obtained by
sexually mating (selfing) a heterozygous transgenic plant that
contains a single added heterologous nucleic acid, germinating some
of the seed produced and analyzing the resulting plants produced
for altered cell division relative to a control plant (i.e.,
native, non-transgenic). Back-crossing to a parental plant and
out-crossing with a non-transgenic plant are also contemplated.
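By way of illustration only, the selfing step described above can be sketched in Python. The sketch assumes simple Mendelian segregation of a single added nucleic acid at one locus, which the specification does not itself state; the function name and the allele labels "T" (added nucleic acid present) and "-" (absent) are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def selfing_progeny(parent=("T", "-")):
    """Return expected genotype frequencies from selfing a diploid parent.

    Each gamete is assumed to receive one of the two parental alleles
    with equal probability (single-locus Mendelian segregation).
    """
    counts = Counter()
    for egg, pollen in product(parent, repeat=2):
        # Sort so that ("T", "-") and ("-", "T") count as one genotype.
        genotype = "".join(sorted((egg, pollen), reverse=True))
        counts[genotype] += 1
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}


if __name__ == "__main__":
    for genotype, freq in selfing_progeny().items():
        print(genotype, freq)
```

Under these assumptions, about one quarter of the seed from a selfed heterozygous plant would be expected to be homozygous for the added nucleic acid, which is why only some of the germinated progeny need to be analyzed.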
[0250] The present invention also relates to transgenic plants and
plant tissue comprising transgenic plant cells according to the
invention. Due to the (over)expression of a CCP molecule, e.g., at
developmental stages and/or in plant tissue in which they do not
naturally occur, these transgenic plants may show various
physiological, developmental and/or morphological modifications in
comparison to wild-type plants.
[0251] Therefore, part of this invention is the use of the CCP
molecules to modulate the cell cycle and/or plant cell division
and/or growth in plant cells, plant tissues, plant organs and/or
whole plants. Also within the scope of the invention is a method
for influencing the activity of CDKs such as CDC2a or CDC2b, CKSs,
CKIs, PLPs and KLPNTs in a plant cell by transforming the plant
cell with a nucleic acid molecule according to the invention and/or
manipulation of the expression of the molecule.
[0252] Furthermore, the invention also relates to a transgenic
plant cell which contains (preferably stably integrated into its
genome) a nucleic acid molecule of the invention or part thereof,
wherein the transcription and/or expression of the nucleic acid
molecule or part thereof leads to reduction of the synthesis of a
CCP. In a preferred embodiment, the reduction is achieved by an
anti-sense, sense, ribozyme, co-suppression and/or dominant mutant
effect. The reduction of the synthesis of a protein according to
the invention in the transgenic plant cells can result in an
alteration in, e.g., cell division. In transgenic plants comprising
such cells this can lead to various physiological, developmental
and/or morphological changes.
[0253] In yet another aspect, the invention relates to harvestable
parts and to propagation material of the transgenic plants of the
invention which either contain transgenic plant cells expressing a
nucleic acid molecule according to the invention or which contain
cells which show a reduced level of the described protein.
Harvestable parts can be in principle any useful parts of a plant,
for example, flowers, pollen, seedlings, tubers, leaves, stems,
fruit, seeds, roots etc. Propagation material includes, for
example, seeds, fruits, cuttings, seedlings, tubers, rootstocks,
and the like.
Transgenic Animals
[0254] As used herein, a "transgenic animal" is a non-human animal,
preferably a mammal, more preferably a rodent such as a rat or
mouse, in which one or more of the cells of the animal includes a
transgene. Other examples of transgenic animals include non-human
primates, sheep, dogs, cows, goats, chickens, amphibians, and the
like. A transgene is exogenous DNA which is integrated into the
genome of a cell from which a transgenic animal develops and which
remains in the genome of the mature animal, thereby directing the
expression of an encoded gene product in one or more cell types or
tissues of the transgenic animal. As used herein, a "homologous
recombinant animal" is a non-human animal, preferably a mammal,
more preferably a mouse, in which an endogenous CCP gene has been
altered by homologous recombination between the endogenous gene and
an exogenous DNA molecule introduced into a cell of the animal,
e.g., an embryonic cell of the animal, prior to development of the
animal.
[0255] A transgenic animal of the invention can be created by
introducing a CCP-encoding nucleic acid into the male pronuclei of
a fertilized oocyte, e.g., by microinjection or retroviral infection,
and allowing the oocyte to develop in a pseudopregnant female
foster animal. The CCP cDNA sequence of SEQ ID NO:1-66 or 228-239
can be introduced as a transgene into the genome of a non-human
animal. Alternatively, a nonhuman homologue of a human CCP gene,
such as a mouse or rat CCP gene, can be used as a transgene.
Alternatively, a CCP gene homologue, such as another CCP family
member, can be isolated based on hybridization to the CCP cDNA
sequences of SEQ ID NO:1-66 or 228-239 (described further in
subsection I above) and used as a transgene. Intronic sequences and
polyadenylation signals can also be included in the transgene to
increase the efficiency of expression of the transgene. A
tissue-specific regulatory sequence(s) can be operably linked to a
CCP transgene to direct expression of a CCP protein to particular
cells. Methods for generating transgenic animals via embryo
manipulation and microinjection, particularly animals such as mice,
have become conventional in the art and are described, for example,
in U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al.,
U.S. Pat. No. 4,873,191 by Wagner et al. and in Hogan, B.,
Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used
for production of other transgenic animals. A transgenic founder
animal can be identified based upon the presence of a CCP transgene
in its genome and/or expression of CCP mRNA in tissues or cells of
the animals. A transgenic founder animal can then be used to breed
additional animals carrying the transgene. Moreover, transgenic
animals carrying a transgene encoding a CCP protein can further be
bred to other transgenic animals carrying other transgenes.
V. Agricultural, Phytopharmaceutical and Pharmaceutical
Compositions
[0256] The CCP nucleic acid molecules, CCP proteins, and anti-CCP
antibodies (also referred to herein as "active compounds") of the
invention can be incorporated into compositions useful in
agriculture and in plant cell and tissue culture. Plant protection
compositions can be prepared by conventional means commonly used
for the application of, for example, herbicides and pesticides. For
example, certain additives known to those skilled in the art, such as
stabilizers or substances which facilitate the uptake by the plant
cell, plant tissue or plant, may be used.
[0257] The CCP nucleic acid molecules, CCP proteins, and anti-CCP
antibodies (also referred to herein as "active compounds") of the
invention can also be incorporated into pharmaceutical compositions
suitable for administration into animals. Such compositions
typically comprise the nucleic acid molecule, protein, or antibody
and a pharmaceutically acceptable carrier. As used herein the
language "pharmaceutically acceptable carrier" is intended to
include any and all solvents, dispersion media, coatings,
antibacterial and antifungal agents, isotonic and absorption
delaying agents, and the like, compatible with pharmaceutical
administration. The use of such media and agents for
pharmaceutically active substances is well known in the art. Except
insofar as any conventional media or agent is incompatible with the
active compound, use thereof in the compositions is contemplated.
Supplementary active compounds can also be incorporated into the
compositions.
[0258] The nucleic acid molecules of the invention can be inserted
into vectors and used as gene therapy vectors. Gene therapy vectors
can be delivered to a plant or subject by, for example, injection,
local administration (see U.S. Pat. No. 5,328,470) or by
stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl.
Acad. Sci. USA 91:3054-3057). The agricultural or pharmaceutical
preparation of the gene therapy vector can include the gene therapy
vector in an acceptable diluent, or can comprise a slow release
matrix in which the gene delivery vehicle is imbedded.
Alternatively, where the complete gene delivery vector can be
produced intact from recombinant cells, e.g., retroviral vectors,
the agricultural or pharmaceutical preparation can include one or
more cells which produce the gene delivery system.
[0259] The agricultural and pharmaceutical compositions can be
included in a container, pack, or dispenser together with
instructions for administration.
VI. Uses and Methods of the Invention
[0260] The nucleic acid molecules, proteins, protein homologues,
and antibodies described herein can be used in one or more of the
following methods: a) agricultural uses (e.g., to increase plant
yield and to develop phytopharmaceuticals); b) screening assays; c)
predictive medicine (e.g., diagnostic assays, prognostic assays,
monitoring clinical trials); d) methods of treatment (e.g.,
phytotherapeutic, therapeutic and prophylactic); e)
transcriptomics; f) proteomics; g) metabolomics; h) ligandomics;
and i) pharmacogenetics or pharmacogenomics. The isolated nucleic
acid molecules of the invention can be used, for example, to
express CCP protein (e.g., via a recombinant expression vector in a
host cell or in gene therapy applications), to detect CCP mRNA
(e.g., in a biological sample) or a genetic alteration in a CCP
gene, and to modulate CCP activity, as described further below. The
CCP proteins can be used to treat disorders characterized by
insufficient or excessive production of a CCP substrate or
production of CCP inhibitors. In addition, the CCP proteins can be
used to screen for naturally occurring CCP substrates, to screen
for drugs or compounds which modulate CCP activity, as well as to
treat disorders characterized by insufficient or excessive
production of CCP protein or production of CCP protein forms which
have decreased or aberrant activity compared to CCP wild type
protein. Moreover, the anti-CCP antibodies of the invention can be
used to detect and isolate CCP proteins, regulate the
bioavailability of CCP proteins, and modulate CCP activity.
[0261] A. Agricultural Uses:
[0262] In another embodiment of the invention, a method is provided
for modifying cell fate and/or plant development and/or plant
morphology and/or biochemistry and/or physiology comprising the
modification of expression in particular cells, tissues or organs
of a plant, of a genetic sequence encoding a CCP, e.g., a CCP
operably connected with a plant-operable promoter sequence.
[0263] Modulation of the expression in a plant of a CCP or a
homologue, analogue or derivative thereof as defined in the present
invention can produce a range of desirable phenotypes in plants,
such as, for example, the modification of one or more
morphological, biochemical, or physiological characteristics
including: (i) modification of the length of the G1 and/or the S
and/or the G2 and/or the M phase of the cell cycle of a plant; (ii)
modification of the G1/S and/or S/G2 and/or G2/M and/or M/G1 phase
transition of a plant cell; (iii) modification of the initiation,
promotion, stimulation or enhancement of cell division; (iv)
modification of the initiation, promotion, stimulation or
enhancement of DNA replication; (v) modification of the initiation,
promotion, stimulation or enhancement of seed set and/or seed size
and/or seed development; (vi) modification of the initiation,
promotion, stimulation or enhancement of tuber formation; (vii)
modification of the initiation, promotion, stimulation or
enhancement of fruit formation; (viii) modification of the
initiation, promotion, stimulation or enhancement of leaf
formation; (ix) modification of the initiation, promotion,
stimulation or enhancement of shoot initiation and/or development;
(x) modification of the initiation, promotion, stimulation or
enhancement of root initiation and/or development; (xi)
modification of the initiation, promotion, stimulation or
enhancement of lateral root initiation and/or development; (xii)
modification of the initiation, promotion, stimulation or
enhancement of nodule formation and/or nodule function; (xiii)
modification of the initiation, promotion, stimulation or
enhancement of the bushiness of the plant; (xiv) modification of
the initiation, promotion, stimulation or enhancement of dwarfism
in the plant; (xv) modification of the initiation, promotion,
stimulation or enhancement of senescence; (xvi) modification of
stem thickness and/or strength characteristics and/or
wind-resistance of the stem and/or stem length; (xvii) modification
of tolerance and/or resistance to biotic stresses such as pathogen
infection; and (xviii) modification of tolerance and/or resistance
to abiotic stresses such as drought stress or salt stress.
[0264] Methods to effect expression of a CCP or a homologue,
analogue or derivative thereof as defined in the present invention
in a plant cell, tissue or organ, include either the introduction
of the protein directly to a cell, tissue or organ such as by
microinjection or ballistic means or, alternatively, introduction
of an isolated nucleic acid molecule encoding the protein into the
cell, tissue or organ in an expressible format. Methods to effect
expression of a CCP or a homologue, analogue or derivative thereof
as defined in the current invention in whole plants include
regeneration of whole plants from the transformed cells in which an
isolated nucleic acid molecule encoding the protein was introduced
in an expressible format.
[0265] The present invention clearly extends to any plant produced
by the inventive method described herein, and any and all plant
parts and propagules thereof. The present invention extends further
to encompass the progeny derived from a primary transformed or
transfected cell, tissue, organ or whole plant that has been
produced by the inventive method, the only requirement being that
the progeny exhibits the same genotypic and/or phenotypic
characteristic(s) as those characteristic(s) that has (have) been
produced in the parent by the performance of the inventive
method.
[0266] By "cell fate and/or plant development and/or plant
morphology and/or biochemistry and/or physiology" is meant that one
or more developmental and/or morphological and/or biochemical
and/or physiological characteristics of a plant is altered by the
performance of one or more steps pertaining to the invention
described herein. "Cell fate" includes the cell-type or cellular
characteristics of a particular cell that are produced during plant
development or a cellular process therefor, in particular during
the cell cycle or as a consequence of a cell cycle process.
[0267] The term "plant development" or the term "plant
developmental characteristic" or similar terms shall, when used
herein, be taken to mean any cellular process of a plant that is
involved in determining the developmental fate of a plant cell, in
particular the specific tissue or organ type into which a
progenitor cell will develop. Cellular processes relevant to plant
development will be known to those skilled in the art. Such
processes include, for example, morphogenesis, photomorphogenesis,
shoot development, root development, vegetative development,
reproductive development, stem elongation, flowering, and
regulatory mechanisms involved in determining cell fate, in
particular a process or regulatory process involving the cell
cycle.
[0268] The term "plant morphology" or the term "plant morphological
characteristic" or similar term will, when used herein, be
understood by those skilled in the art to include the external
appearance of a plant, including any one or more structural
features or combination of structural features thereof. Such
structural features include the shape, size, number, position,
color, texture, arrangement, and patternation of any cell, tissue
or organ or groups of cells, tissues or organs of a plant,
including the root, stem, leaf, shoot, petiole, trichome, flower,
petal, stigma, style, stamen, pollen, ovule, seed, embryo,
endosperm, seed coat, aleurone, fibre, fruit, cambium, wood,
heartwood, parenchyma, aerenchyma, sieve element, phloem or
vascular tissue.
[0269] The term "plant biochemistry" or the term "plant biochemical
characteristic" or similar term will, when used herein, be
understood by those skilled in the art to include the metabolic and
catalytic processes of a plant, including primary and secondary
metabolism and the products thereof, including any small molecules,
macromolecules or chemical compounds, such as but not limited to
starches, sugars, proteins, peptides, enzymes, hormones, growth
factors, nucleic acid molecules, celluloses, hemicelluloses,
calloses, lectins, fibres, pigments such as anthocyanins, vitamins,
minerals, micronutrients, or macronutrients, that are produced by
plants.
[0270] The term "plant physiology" or the term "plant physiological
characteristic" or similar term will, when used herein, be
understood to include the functional processes of a plant,
including developmental processes such as growth, expansion and
differentiation, sexual development, sexual reproduction, seed set,
seed development, grain filling, asexual reproduction, cell
division, dormancy, germination, light adaptation, photosynthesis,
leaf expansion, fibre production, secondary growth or wood
production, amongst others; responses of a plant to
externally-applied factors such as metals, chemicals, hormones,
growth factors, environment and environmental stress factors (e.g.,
anoxia, hypoxia, high temperature, low temperature, dehydration,
light, daylength, flooding, salt, heavy metals, amongst others),
including adaptive responses of plants to said externally-applied
factors.
[0271] The CCP molecules of the present invention are useful in
agriculture. The nucleic acid molecules, proteins, protein
homologues, and antibodies described herein can be used to modulate
the protein levels or activity of a protein involved in the cell
cycle, e.g., proteins involved in the G1/S and/or the G2/M
transition in the cell cycle due to environmental conditions,
including abiotic stress such as cold, nutrient deprivation, heat,
drought, salt stress, or biotic stress such as a pathogen
attack.
[0272] Thus, the CCP molecules of the present invention may be used
to modulate, e.g., enhance, crop yields; modulate, e.g., attenuate,
stress, e.g. heat or nutrient deprivation; modulate tolerance to
pests and diseases; modulate plant architecture; modulate plant
quality traits; or modulate plant reproduction and seed
development.
[0273] The CCP molecules of the present invention may also be used
to modulate endoreduplication in storage cells, storage tissues,
and/or storage organs of plants or parts thereof. The term
"endoreduplication" includes recurrent DNA replication without
consequent mitosis and cytokinesis. Preferred target storage organs
and parts thereof for the modulation of endoreduplication are, for
example, seeds (such as from cereals, oilseed crops), roots (such
as in sugar beet), tubers (such as in potatoes) and fruits (such as
in vegetables and fruit species). Increased endoreduplication in
storage organs, and parts thereof, correlates with enhanced storage
capacity and, thus, with improved yield. In another embodiment of
the invention, the endoreduplication of a whole plant is
modulated.
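For illustration, the definition of endoreduplication given above can be expressed as a short Python sketch: each round of DNA replication without mitosis doubles the nuclear DNA content. The function name, the default starting content of 2C, and the assumption that every endocycle replicates the whole genome exactly once are illustrative and are not taken from the specification.

```python
def dna_content_after_endocycles(n_cycles, start_c=2):
    """Nuclear DNA content (in C units) after n_cycles endocycles.

    Assumes each endocycle is one complete round of DNA replication
    with no intervening mitosis or cytokinesis, so content doubles.
    """
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    return start_c * 2 ** n_cycles


if __name__ == "__main__":
    for n in range(4):
        print(n, "endocycles:", dna_content_after_endocycles(n), "C")
```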
[0274] B. Screening Assays:
[0275] The invention provides a method (also referred to herein as
a "screening assay") for identifying modulators, i.e., candidate or
test compounds or agents (e.g., peptides, peptidomimetics, small
molecules or other drugs) which bind to CCP proteins, have a
stimulatory or inhibitory effect on, for example, CCP expression or
CCP activity, or have a stimulatory or inhibitory effect on, for
example, the expression or activity of a CCP substrate.
[0276] In one embodiment, the invention provides assays for
screening candidate or test compounds which are substrates of a CCP
protein or polypeptide or biologically active portion thereof. In
another embodiment, the invention provides assays for screening
candidate or test compounds which bind to or modulate the activity
of a CCP protein or polypeptide or biologically active portion
thereof, e.g., modulate the ability of CCP to interact with its
cognate ligand. The test compounds of the present invention can be
obtained using any of the numerous approaches in combinatorial
library methods known in the art, including: biological libraries;
spatially addressable parallel solid phase or solution phase
libraries; synthetic library methods requiring deconvolution; the
`one-bead one-compound` library method; and synthetic library
methods using affinity chromatography selection. The biological
library approach is limited to peptide libraries, while the other
four approaches are applicable to peptide, non-peptide oligomer or
small molecule libraries of compounds (Lam, K. S. (1997) Anticancer
Drug Des. 12:145).
[0277] Examples of methods for the synthesis of molecular libraries
can be found in the art, for example in: DeWitt et al. (1993) Proc.
Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl.
Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem.
37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994)
Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew.
Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med.
Chem. 37:1233.
[0278] Libraries of compounds may be presented in solution (e.g.,
Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991)
Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556),
bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat.
No. '409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA
89:1865-1869) or on phage (Scott and Smith (1990) Science
249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al.
(1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol.
Biol. 222:301-310); (Ladner supra.).
[0279] In another embodiment, an assay is a cell-based assay
comprising contacting a cell expressing a CCP target molecule
(e.g., a plant cyclin dependent kinase) with a test compound and
determining the ability of the test compound to modulate (e.g.
stimulate or inhibit) the activity of the CCP target molecule.
Determining the ability of the test compound to modulate the
activity of a CCP target molecule can be accomplished, for example,
by determining the ability of the CCP protein to bind to or
interact with the CCP target molecule, or by determining the
ability of the target molecule, e.g., the plant cyclin dependent
kinase, to phosphorylate a protein.
[0280] The ability of the target molecule, e.g., the plant cyclin
dependent kinase, to phosphorylate a protein can be determined by,
for example, an in vitro kinase assay. Briefly, a protein can be
incubated with the target molecule, e.g., the plant cyclin
dependent kinase, and radioactive ATP, e.g., [.gamma.-.sup.32P]
ATP, in a buffer containing MgCl.sub.2 and MnCl.sub.2, e.g., 10 mM
MgCl.sub.2 and 5 mM MnCl.sub.2. Following the incubation, the
immunoprecipitated protein can be separated by SDS-polyacrylamide
gel electrophoresis under reducing conditions, transferred to a
membrane, e.g., a PVDF membrane, and autoradiographed. The
appearance of detectable bands on the autoradiograph indicates that
the protein has been phosphorylated. Phosphoaminoacid analysis of
the phosphorylated substrate can also be performed in order to
determine which residues on the protein are phosphorylated.
Briefly, the radiophosphorylated protein band can be excised from
the SDS gel and subjected to partial acid hydrolysis. The products
can then be separated by one-dimensional electrophoresis and
analyzed on, for example, a phosphoimager and compared to
ninhydrin-stained phosphoaminoacid standards.
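As a worked illustration of the exemplary buffer composition given above (10 mM MgCl.sub.2 and 5 mM MnCl.sub.2), the following Python sketch applies the dilution relation C1*V1 = C2*V2 to obtain the volume of each stock solution to add. The 1 M stock concentrations and the 50 microliter reaction volume are assumptions chosen for the example and do not appear in the specification; the function name is hypothetical.

```python
def stock_volume_ul(final_mm, stock_mm, reaction_ul):
    """Volume of stock (microliters) needed to reach final_mm in reaction_ul.

    Uses C1 * V1 = C2 * V2, with both concentrations given in mM.
    """
    if stock_mm <= 0 or final_mm < 0 or final_mm > stock_mm:
        raise ValueError("need 0 <= final concentration <= stock concentration")
    return final_mm * reaction_ul / stock_mm


if __name__ == "__main__":
    reaction_ul = 50      # assumed reaction volume
    stock_mm = 1000       # assumed 1 M stock for both salts
    print("MgCl2 stock (uL):", stock_volume_ul(10, stock_mm, reaction_ul))
    print("MnCl2 stock (uL):", stock_volume_ul(5, stock_mm, reaction_ul))
```

Because such volumes are small at this scale, a practitioner might instead prepare a concentrated buffer premix; that choice is outside what the specification describes.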
[0281] Determining the ability of the CCP protein to bind to or
interact with a CCP target molecule can be accomplished by
determining direct binding. Determining the ability of the CCP
protein to bind to or interact with a CCP target molecule can be
accomplished, for example, by coupling the CCP protein with a
radioisotope or enzymatic label such that binding of the CCP
protein to a CCP target molecule can be determined by detecting the
labeled CCP protein in a complex. For example, CCP molecules, e.g.,
CCP proteins, can be labeled with .sup.125I, .sup.35S, .sup.14C, or
.sup.3H, either directly or indirectly, and the radioisotope
detected by direct counting of radioemission or by scintillation
counting. Alternatively, CCP molecules can be enzymatically labeled
with, for example, horseradish peroxidase, alkaline phosphatase, or
luciferase, and the enzymatic label detected by determination of
conversion of an appropriate substrate to product.
[0282] It is also within the scope of this invention to determine
the ability of a compound to modulate the interaction between CCP
and its target molecule, without the labeling of any of the
interactants. For example, a microphysiometer can be used to detect
the interaction of CCP with its target molecule without the
labeling of either CCP or the target molecule. McConnell, H. M. et
al. (1992) Science 257:1906-1912. As used herein, a
"microphysiometer" (e.g., Cytosensor) is an analytical instrument
that measures the rate at which a cell acidifies its environment
using a light-addressable potentiometric sensor (LAPS). Changes in
this acidification rate can be used as an indicator of the
interaction between compound and receptor.
[0283] In a preferred embodiment, determining the ability of the
CCP protein to bind to or interact with a CCP target molecule can
be accomplished by determining the activity of the target molecule.
For example, the activity of the target molecule can be determined
by detecting induction of a cellular second messenger of the target
(e.g., intracellular Ca.sup.2+, diacylglycerol, IP.sub.3, etc.),
detecting catalytic/enzymatic activity of the target on an appropriate
substrate, detecting the induction of a reporter gene (comprising a
target-responsive regulatory element operatively linked to a
nucleic acid encoding a detectable marker, e.g., chloramphenicol
acetyl transferase), or detecting a target-regulated cellular
response.
[0284] In yet another embodiment, an assay of the present invention
is a cell-free assay in which a CCP protein or biologically active
portion thereof is contacted with a test compound and the ability
of the test compound to bind to the CCP protein or biologically
active portion thereof is determined. Binding of the test compound
to the CCP protein can be determined either directly or indirectly
as described above. In a preferred embodiment, the assay includes
contacting the CCP protein or biologically active portion thereof
with a known compound which binds CCP to form an assay mixture,
contacting the assay mixture with a test compound, and determining
the ability of the test compound to interact with a CCP protein,
wherein determining the ability of the test compound to interact
with a CCP protein comprises determining the ability of the test
compound to preferentially bind to CCP or biologically active
portion thereof as compared to the known compound.
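The comparison between a test compound and a known compound described above can be illustrated with a simple equilibrium calculation. The Python sketch below assumes a single binding site on the CCP protein, strictly competitive binding, and free ligand concentrations approximately equal to the concentrations added; none of these assumptions, nor the function name or parameter names, are taken from the specification.

```python
def site_occupancy(test_conc, test_kd, known_conc, known_kd):
    """Fraction of a single binding site held by each of two competitors.

    Concentrations and dissociation constants must share one unit.
    Assumes competitive equilibrium binding at one site.
    """
    if min(test_kd, known_kd) <= 0 or min(test_conc, known_conc) < 0:
        raise ValueError("Kd values must be positive, concentrations non-negative")
    t = test_conc / test_kd      # test compound, scaled by its Kd
    k = known_conc / known_kd    # known compound, scaled by its Kd
    denom = 1.0 + t + k
    return {"test": t / denom, "known": k / denom, "free": 1.0 / denom}


if __name__ == "__main__":
    # Hypothetical values: equal concentrations, test compound binds 10x tighter.
    print(site_occupancy(test_conc=1.0, test_kd=0.1, known_conc=1.0, known_kd=1.0))
```

In this simplified model, a test compound would be regarded as binding preferentially when its occupancy exceeds that of the known compound at equal concentrations.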
[0285] In another embodiment, the assay is a cell-free assay in
which a CCP protein or biologically active portion thereof is
contacted with a test compound and the ability of the test compound
to modulate (e.g., stimulate or inhibit) the activity of the CCP
protein or biologically active portion thereof is determined.
Determining the ability of the test compound to modulate the
activity of a CCP protein can be accomplished, for example, by
determining the ability of the CCP protein to bind to a CCP target
molecule by one of the methods described above for determining
direct binding. Determining the ability of the CCP protein to bind
to a CCP target molecule can also be accomplished using a
technology such as real-time Biomolecular Interaction Analysis
(BIA). Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem.
63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol.
5:699-705. As used herein, "BIA" is a technology for studying
biospecific interactions in real time, without labeling any of the
interactants (e.g., BIAcore). Changes in the optical phenomenon of
surface plasmon resonance (SPR) can be used as an indication of
real-time reactions between biological molecules.
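As a non-limiting illustration of how real-time SPR data of this kind are often interpreted, the Python sketch below evaluates a 1:1 (Langmuir) interaction model. The specification does not prescribe any particular model; the model choice, the function names, and the example rate constants are assumptions made for illustration only.

```python
import math


def equilibrium_response(conc_m, ka, kd, rmax):
    """Steady-state response for analyte concentration conc_m (molar).

    ka: association rate constant (1/(M*s)); kd: dissociation rate
    constant (1/s); rmax: response at full surface saturation.
    """
    kd_eq = kd / ka  # equilibrium dissociation constant K_D, in molar
    return rmax * conc_m / (conc_m + kd_eq)


def association_response(t_s, conc_m, ka, kd, rmax):
    """Response t_s seconds after analyte injection begins."""
    k_obs = ka * conc_m + kd  # observed rate of approach to steady state
    r_eq = equilibrium_response(conc_m, ka, kd, rmax)
    return r_eq * (1.0 - math.exp(-k_obs * t_s))


if __name__ == "__main__":
    # Hypothetical constants: ka = 1e5 1/(M*s), kd = 1e-3 1/s, so K_D = 10 nM.
    for t in (0, 60, 300, 1200):
        print(t, round(association_response(t, 1e-8, 1e5, 1e-3, 100.0), 2))
```

In this model, an analyte concentration equal to K_D gives half of the saturating response at steady state, which offers a convenient check on fitted constants.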
[0286] In an alternative embodiment, determining the ability of the
test compound to modulate the activity of a CCP protein can be
accomplished by determining the ability of the CCP protein to
further modulate the activity of a CCP target molecule (e.g., a CCP
mediated signal transduction pathway component). For example, the
activity of the effector molecule on an appropriate target can be
determined, or the binding of the effector to an appropriate target
can be determined as previously described.
[0287] In yet another embodiment, the cell-free assay involves
contacting a CCP protein or biologically active portion thereof
with a known compound which binds the CCP protein to form an assay
mixture, contacting the assay mixture with a test compound, and
determining the ability of the test compound to interact with the
CCP protein, wherein determining the ability of the test compound
to interact with the CCP protein comprises determining the ability
of the CCP protein to preferentially bind to or modulate the
activity of a CCP target molecule.
[0288] The cell-free assays of the present invention are amenable
to use of soluble and/or membrane-bound forms of proteins
(e.g., CCP proteins or biologically active portions thereof). In
the case of cell-free assays in which a membrane-bound form of a
protein is used, it may be desirable to utilize a solubilizing agent
such that the membrane-bound form of the protein is maintained in
solution. Examples of such solubilizing agents include non-ionic
detergents such as n-octylglucoside, n-dodecylglucoside,
n-dodecylmaltoside, octanoyl-N-methylglucamide,
decanoyl-N-methylglucamide, Triton.RTM. X-100, Triton.RTM. X-114,
Thesit.RTM., Isotridecypoly(ethylene glycol ether).sub.n,
3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS),
3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane
sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane
sulfonate.
[0289] In more than one embodiment of the above assay methods of
the present invention, it may be desirable to immobilize either CCP
or its target molecule to facilitate separation of complexed from
uncomplexed forms of one or both of the proteins, as well as to
accommodate automation of the assay. Binding of a test compound to
a CCP protein, or interaction of a CCP protein with a target
molecule in the presence and absence of a candidate compound, can
be accomplished in any vessel suitable for containing the
reactants. Examples of such vessels include microtitre plates, test
tubes, and micro-centrifuge tubes. In one embodiment, a fusion
protein can be provided which adds a domain that allows one or both
of the proteins to be bound to a matrix. For example,
glutathione-S-transferase/CCP fusion proteins or
glutathione-S-transferase/target fusion proteins can be adsorbed
onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.)
or glutathione derivatized microtitre plates, which are then
combined with the test compound or the test compound and either the
non-adsorbed target protein or CCP protein, and the mixture
incubated under conditions conducive to complex formation (e.g., at
physiological conditions for salt and pH). Following incubation,
the beads or microtitre plate wells are washed to remove any
unbound components, the matrix is immobilized in the case of beads, and
complex formation is determined either directly or indirectly, for example, as
described above. Alternatively, the complexes can be dissociated
from the matrix, and the level of CCP binding or activity
determined using standard techniques.
[0290] Other techniques for immobilizing proteins on matrices can
also be used in the screening assays of the invention. For example,
either a CCP protein or a CCP target molecule can be immobilized
utilizing conjugation of biotin and streptavidin. Biotinylated CCP
protein or target molecules can be prepared from biotin-NHS
(N-hydroxy-succinimide) using techniques well known in the art
(e.g., biotinylation kit, Pierce Chemicals, Rockford, Ill.), and
immobilized in the wells of streptavidin-coated 96 well plates
(Pierce Chemical). Alternatively, antibodies reactive with CCP
protein or target molecules but which do not interfere with binding
of the CCP protein to its target molecule can be derivatized to the
wells of the plate, and unbound target or CCP protein trapped in
the wells by antibody conjugation. Methods for detecting such
complexes, in addition to those described above for the
GST-immobilized complexes, include immunodetection of complexes
using antibodies reactive with the CCP protein or target molecule,
as well as enzyme-linked assays which rely on detecting an
enzymatic activity associated with the CCP protein or target
molecule.
[0291] In another embodiment, modulators of CCP expression are
identified in a method wherein a cell is contacted with a candidate
compound and the expression of CCP mRNA or protein in the cell is
determined. The level of expression of CCP mRNA or protein in the
presence of the candidate compound is compared to the level of
expression of CCP mRNA or protein in the absence of the candidate
compound. The candidate compound can then be identified as a
modulator of CCP expression based on this comparison. For example,
when expression of CCP mRNA or protein is greater (statistically
significantly greater) in the presence of the candidate compound
than in its absence, the candidate compound is identified as a
stimulator of CCP mRNA or protein expression. Alternatively, when
expression of CCP mRNA or protein is less (statistically
significantly less) in the presence of the candidate compound than
in its absence, the candidate compound is identified as an
inhibitor of CCP mRNA or protein expression. The level of CCP mRNA
or protein expression in the cells can be determined by methods
described herein for detecting CCP mRNA or protein.
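By way of illustration only, the comparison described above can be sketched in Python. The sketch assumes replicate expression measurements (arbitrary units) with and without the candidate compound, and it uses an exact permutation test on the difference of means as one possible way to decide whether a difference is statistically significant. The function names, the 0.05 threshold and the choice of test are assumptions of this sketch and are not specified by the method itself.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(treated) + list(untreated)
    n = len(treated)
    total = sum(pooled)
    observed = abs(mean(treated) - mean(untreated))
    hits = 0
    count = 0
    # Enumerate every way of relabelling n of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n - (total - group_sum) / (len(pooled) - n))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        count += 1
    return hits / count


def classify_candidate(treated, untreated, alpha=0.05):
    """Label a compound from CCP expression with and without the compound."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With this particular test, at least four replicates per condition are needed before a p-value below 0.05 is attainable, because fewer replicates give too few distinct relabellings.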
[0292] In yet another aspect of the invention, the CCP proteins can
be used as "bait proteins" in a two-hybrid assay or three-hybrid
assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993)
Cell 72:223-232; Madura et al. (1993) J. Biol. Chem.
268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924;
Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300),
to identify other proteins, which bind to or interact with CCP
("CCP-binding proteins" or "CCP-bp") and are involved in CCP
activity. Such CCP-binding proteins are also likely to be involved
in the propagation of signals by the CCP proteins or CCP targets
as, for example, downstream elements of a CCP-mediated signaling
pathway. Alternatively, such CCP-binding proteins are likely to be
CCP inhibitors. Alternatively, a mammalian two-hybrid system can be
used which includes e.g. a chimeric green fluorescent protein
encoding reporter gene (Shioda et al. 2000, Proc. Natl. Acad. Sci.
USA 97, 5220-5224). Yet another alternative consists of a bacterial
two-hybrid system using e.g. HIS as reporter gene (Joung et al.
2000, Proc. Natl. Acad. Sci. USA 97, 7382-7387).
[0293] The two-hybrid system is based on the modular nature of most
transcription factors, which consist of separable DNA-binding and
activation domains. Briefly, the assay utilizes two different DNA
constructs. In one construct, the gene that codes for a CCP protein
is fused to a gene encoding the DNA binding domain of a known
transcription factor (e.g., GAL-4). In the other construct, a DNA
sequence, from a library of DNA sequences, that encodes an
unidentified protein ("prey" or "sample") is fused to a gene that
codes for the activation domain of the known transcription factor.
If the "bait" and the "prey" proteins are able to interact, in
vivo, forming a CCP-dependent complex, the DNA-binding and
activation domains of the transcription factor are brought into
close proximity. This proximity allows transcription of a reporter
gene (e.g., LacZ) which is operably linked to a transcriptional
regulatory site responsive to the transcription factor. Expression
of the reporter gene can be detected and cell colonies containing
the functional transcription factor can be isolated and used to
obtain the cloned gene which encodes the protein which interacts
with the CCP protein.
[0294] This invention further pertains to novel agents identified
by the above-described screening assays. Accordingly, it is within
the scope of this invention to further use an agent identified as
described herein in an appropriate plant or animal model. For
example, an agent identified as described herein (e.g., a CCP
modulating agent, an antisense CCP nucleic acid molecule, a
CCP-specific antibody, or a CCP-binding partner) can be used in a
plant or animal model to determine the efficacy, toxicity, or side
effects of treatment with such an agent. Alternatively, an agent
identified as described herein can be used in a plant or animal
model to determine the mechanism of action of such an agent.
Furthermore, this invention pertains to uses of novel agents
identified by the above-described screening assays for the
agricultural and therapeutic uses described herein.
[0295] C. Detection Assays
[0296] Portions or fragments of the cDNA sequences identified
herein (and the corresponding complete gene sequences) can be used
in numerous ways as polynucleotide reagents. For example, these
sequences can be used to: (i) map their respective genes on a
chromosome and, thus, locate gene regions associated with genetic
disease; (ii) identify an individual from a minute biological sample
(tissue typing); and (iii) aid in forensic identification of a biological
sample. Once the sequence (or a portion of the sequence) of a gene
has been isolated, this sequence can be used to map the location of
the gene on a chromosome. This process is called chromosome
mapping. Accordingly, portions or fragments of the CCP nucleotide
sequences, described herein, can be used to map the location of the
CCP genes on a chromosome. The mapping of the CCP sequences to
chromosomes is an important first step in correlating these
sequences with genes associated with disease.
[0297] Briefly, CCP genes can be mapped to chromosomes by preparing
PCR primers (preferably 15-25 bp in length) from the CCP nucleotide
sequences. Computer analysis of the CCP sequences can be used to
select primers that do not span more than one exon in the genomic
DNA, since primers spanning an exon boundary would complicate the
amplification process. These primers can
then be used for PCR screening of cell hybrids containing
individual plant or human chromosomes. Only those hybrids
containing the plant or human gene corresponding to the CCP
sequences will yield an amplified fragment.
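The primer selection step described above can be sketched as follows. This is a minimal illustration that assumes the exon boundaries are already known as half-open (start, end) coordinates on the cDNA; the function names and the coordinate convention are hypothetical and serve only to show how primers of 15-25 bp confined to a single exon could be enumerated.

```python
def spans_exon_boundary(start, end, exons):
    """True if the window [start, end) is not contained in a single exon."""
    return not any(s <= start and end <= e for s, e in exons)


def single_exon_primer_windows(exons, min_len=15, max_len=25):
    """List every (start, end) window of allowed length inside one exon."""
    windows = []
    for exon_start, exon_end in exons:
        for length in range(min_len, max_len + 1):
            # range() is empty when the exon is shorter than the primer
            for start in range(exon_start, exon_end - length + 1):
                windows.append((start, start + length))
    return windows


# Example: a 20 bp exon followed by a 30 bp exon (toy coordinates)
exons = [(0, 20), (20, 50)]
candidates = single_exon_primer_windows(exons)
```

A window such as (10, 30) in this toy example would be rejected because it crosses the junction between the two exons and therefore would not match a contiguous stretch of genomic DNA.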
[0298] Other mapping strategies which can similarly be used to map
a CCP sequence to its chromosome include in situ hybridization
(described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA,
87:6223-27), pre-screening with labeled flow-sorted chromosomes,
and pre-selection by hybridization to chromosome specific cDNA
libraries.
[0299] Fluorescence in situ hybridization (FISH) of a DNA sequence
to a metaphase chromosomal spread can further be used to provide a
precise chromosomal location in one step. Chromosome spreads can be
made using cells whose division has been blocked in metaphase by a
chemical such as colcemid that disrupts the mitotic spindle. The
chromosomes can be treated briefly with trypsin, and then stained
with Giemsa. A pattern of light and dark bands develops on each
chromosome, so that the chromosomes can be identified individually.
The FISH technique can be used with a DNA sequence as short as 500
or 600 bases. However, clones larger than 1,000 bases have a higher
likelihood of binding to a unique chromosomal location with
sufficient signal intensity for simple detection. Preferably 1,000
bases, and more preferably 2,000 bases, will suffice to get good
results in a reasonable amount of time. For a review of this
technique, see Verma et al., Human Chromosomes: A Manual of Basic
Techniques (Pergamon Press, New York 1988).
[0300] Reagents for chromosome mapping can be used individually to
mark a single chromosome or a single site on that chromosome, or
panels of reagents can be used for marking multiple sites and/or
multiple chromosomes. Reagents corresponding to noncoding regions
of the genes actually are preferred for mapping purposes. Coding
sequences are more likely to be conserved within gene families,
thus increasing the chance of cross hybridizations during
chromosomal mapping.
[0301] Once a sequence has been mapped to a precise chromosomal
location, the physical position of the sequence on the chromosome
can be correlated with genetic map data. (Such data are found, for
example, in V. McKusick, Mendelian Inheritance in Man, available
on-line through Johns Hopkins University Welch Medical Library).
The relationship between a gene and a disease, mapped to the same
chromosomal region, can then be identified through linkage analysis
(co-inheritance of physically adjacent genes), described in, for
example, Egeland, J. et al. (1987) Nature, 325:783-787.
[0302] Moreover, differences in the DNA sequences between plants
affected and unaffected with a disease associated with the CCP
gene, can be determined. If a mutation is observed in some or all
of the affected plants but not in any unaffected plants, then the
mutation is likely to be the causative agent of the particular
disease. Comparison of affected and unaffected plants generally
involves first looking for structural alterations in the
chromosomes, such as deletions or translocations that are visible
from chromosome spreads or detectable using PCR based on that DNA
sequence. Ultimately, complete sequencing of genes from several
plants can be performed to confirm the presence of a mutation and
to distinguish mutations from polymorphisms.
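The filtering logic of the preceding paragraph can be sketched as a simple set operation. The sketch assumes that sequence differences have already been called for each plant and are represented as sets of variant labels; the labels and function name are hypothetical. A variant is retained as a candidate only if it occurs in at least one affected plant and in no unaffected plant, and variants seen in unaffected plants are treated as polymorphisms.

```python
def candidate_causative_variants(affected, unaffected):
    """Count, per variant, the affected plants carrying it, excluding any
    variant that also occurs in an unaffected plant.

    affected / unaffected: dict mapping plant id -> set of variant labels.
    """
    seen_in_unaffected = set().union(*unaffected.values())
    counts = {}
    for variants in affected.values():
        for variant in variants - seen_in_unaffected:
            counts[variant] = counts.get(variant, 0) + 1
    # Most frequently observed candidates first, ties broken alphabetically
    return dict(sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])))


affected = {
    "plant1": {"c.101G>A", "c.55C>T"},
    "plant2": {"c.101G>A"},
    "plant3": {"c.300del"},
}
unaffected = {"plant4": {"c.55C>T"}, "plant5": set()}
candidates = candidate_causative_variants(affected, unaffected)
# candidates == {"c.101G>A": 2, "c.300del": 1}
```

Such a ranking only narrows the list; as stated above, complete sequencing of genes from several plants remains necessary to confirm a mutation.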
[0303] D. Predictive Medicine:
[0304] The present invention also pertains to the field of
predictive medicine in which diagnostic assays, prognostic assays,
and monitoring clinical trials are used for prognostic (predictive)
purposes to thereby treat an individual prophylactically.
Accordingly, one aspect of the present invention relates to
diagnostic assays for determining CCP protein and/or nucleic acid
expression as well as CCP activity, in the context of a biological
sample (e.g., blood, serum, cells, tissue) to thereby determine
whether an individual is afflicted with a disease or disorder, or
is at risk of developing a disorder, associated with aberrant CCP
expression or activity. The invention also provides for prognostic
(or predictive) assays for determining whether an individual is at
risk of developing a disorder associated with CCP protein, nucleic
acid expression or activity. For example, mutations in a CCP gene
can be assayed in a biological sample. Such assays can be used for
prognostic or predictive purposes to thereby prophylactically treat
an individual prior to the onset of a disorder characterized by or
associated with CCP protein, nucleic acid expression or
activity.
[0305] Another aspect of the invention pertains to monitoring the
influence of agents (e.g., drugs, compounds) on the expression or
activity of CCP in clinical trials.
[0306] E. Methods of Treatment:
[0307] The present invention provides for both prophylactic and
therapeutic methods of treating a subject at risk of (or
susceptible to) a disorder or having a disorder associated with
aberrant CCP expression or activity. With regard to both
prophylactic and therapeutic methods of treatment, such treatments
may be specifically tailored or modified, based on knowledge
obtained from the field of pharmacogenomics. "Pharmacogenomics", as
used herein, refers to the application of genomics technologies
such as gene sequencing, statistical genetics, and gene expression
analysis to drugs in clinical development and on the market. More
specifically, the term refers to the study of how a patient's genes
determine his or her response to a drug (e.g., a patient's "drug
response phenotype" or "drug response genotype"). Thus, another
aspect of the invention provides methods for tailoring an
individual's prophylactic or therapeutic treatment with either the
CCP molecules of the present invention or CCP modulators according
to that individual's drug response genotype. Pharmacogenomics
allows a clinician or physician to target prophylactic or
therapeutic treatments to patients who will most benefit from the
treatment and to avoid treatment of patients who will experience
toxic drug-related side effects.
[0308] This invention is further illustrated by the following
examples which should not be construed as limiting. The contents of
all references, patents and published patent applications cited
throughout this application, as well as the Figures and the
Sequence Listing are incorporated herein by reference.
EXAMPLES
Example 1
Identification of Plant CCP Polypeptides Using the Two Hybrid
System with CDC2B as a Bait
[0309] A two-hybrid screening was performed using as bait a fusion
between the GAL4 DNA-binding domain and one of the following:
CDC2bAt.N161 (GenBank accession number D10851; residue Asp161
converted into Asn161); CKS1At (GenBank accession number AJ000016);
E2Fa (=E2F5) (GenBank accession number AJ294534) dimerization
domain (226-356aa; SEQ ID NO:205); CKI4 (SEQ ID NO:264); PLP1
(GenBank accession number T01601); KLPNT1 (GenBank accession number
AB011479; protein ID number BAB11568) motor domain (36-508 aa);
KLPNT1 (GenBank accession number AB011479; protein ID number
BAB11568) stalk domain (427-867 aa); KLPNT2=TH65 (GenBank accession
number AJ001729) neck domain (3-186 aa); KLPNT2=TH65 (GenBank
accession number AJ001729) stalk domain (73-608 aa); E2Fb (=E2F3)
(GenBank accession number AJ294533) N-terminal domain (1-385 aa;
SEQ ID NO:206), respectively.
[0310] CDC2bAt.N161 is a dominant negative form of the CDC2bAt
protein. The D161 residue in CDC2bAt is crucial for ATP binding
and, thus, the mutation of this residue results in an inactive
kinase. The interactions between this mutated CDK and its
substrates and regulatory proteins are also more stabilised as a
result of this mutation.
[0311] In yeast the PHO genes are part of a complex regulatory
network linking phosphate availability with the expression of
phosphatases. When phosphate levels are high, the PHO80/PHO85
cyclin/CDK complex phosphorylates a transcription factor that
controls phosphatase genes, thereby inactivating it.
The S. cerevisiae PHO85 protein can interact with the G1 specific
cyclins PCL1 and PCL2 (close homologues of PHO80). In a yeast
strain deficient for the G1 cyclins CLN1 and CLN2, PHO80 is
required for G1 progression. This result suggests that PHO85 is
involved in a regulatory pathway that links the nutrient status of
the cell with cell division activity. The five PLPs of A. thaliana
show similarity to the yeast cyclin-like PHO80 gene.
[0312] Kinesins use the cytoskeleton to move vesicles,
organelles, chromosomes and the like around the cell. They can also be
involved in spindle formation. Kinesins consist of three functionally
unrelated domains: the motor domain (involved in microtubule
binding; contains the ATPase domain), the stalk region (involved in
homo- or heterodimerisation of the kinesins), and the tail
(involved in the interaction with the `substrates` of the kinesin).
Two hybrid screens were performed using different parts of
two kinesin-related proteins, KLPNT1 and KLPNT2 (the latter being more
than 80% identical to KLPNT1). The two hybrid approach also provided
information on the dimerization of the kinesins: KLPNT1 and
KLPNT2 interact (stalks and stalks-tail) with themselves and with each
other.
[0313] Vectors and strains used were provided with the Matchmaker
Two-Hybrid System (Clontech, Palo Alto, Calif.). The bait was
constructed by inserting the CDC2bAt.N161 (GenBank accession number
D10851; residue Asp161 converted into Asn161); CKS1At (GenBank
accession number AJ000016); E2Fa (=E2F5) (GenBank accession number
AJ294534) dimerization domain (226-356aa; SEQ ID NO:205); CKI4 (SEQ
ID NO:264); PLP1 (GenBank accession number T01601); KLPNT1 (GenBank
accession number AB011479; protein ID number BAB11568) motor domain
(36-508 aa); KLPNT1 (GenBank accession number AB011479; protein ID
number BAB11568) stalk domain (427-867 aa); KLPNT2=TH65 (GenBank
accession number AJ001729) neck domain (3-186 aa); KLPNT2=TH65
(GenBank accession number AJ001729) stalk domain (73-608 aa); E2Fb
(=E2F3) (GenBank accession number AJ294533) N-terminal domain
(1-385 aa; SEQ ID NO:206), respectively, into the pGBT9 vector.
Bait vectors were constructed by introducing the PCR fragment
created from the corresponding cDNA using primers to incorporate
EcoRI and BamHI restriction enzyme sites. The PCR fragment was cut
with EcoRI and BamHI and cloned into the EcoRI and BamHI sites of
pGBT9, resulting in the desired plasmid. The GAL4 activation domain
cDNA fusion library was constructed as described in De Veylder et
al. (1999) Planta 208(4):453-462 from mRNA of Arabidopsis thaliana cell
suspensions harvested at various growing stages: early exponential,
exponential, early stationary, and stationary phase.
[0314] For the screening, a 1-liter culture of the Saccharomyces
cerevisiae strain HF7c (MATa ura3-52 his3-200 ade2-101 lys2-801
trp1-901 leu2-3,112 gal4-542 gal80-538
LYS2::GAL1.sub.UAS-GAL1.sub.TATA-HIS3
URA3::GAL4.sub.17mers(3x)-CyC1.sub.TATA-LacZ) was sequentially
transformed with the bait plasmid and 20 .mu.g DNA of the library
using the lithium acetate method (Gietz et al. (1992) supra). To
estimate the number of independent cotransformants, 1/1000 of the
transformation mix was plated on medium lacking Leu and Trp. The rest of
the transformation mix was plated on medium to select for histidine
prototrophy (Trp-, Leu-, His-). After 5 days of growth at
30.degree. C., the colonies larger than 2 mm were streaked on
histidine-lacking medium. In total, for each screening at least
10.sup.7 independent cotransformants were screened for their
ability to grow on histidine-free medium. The activation domain
plasmids were isolated from the His.sup.+ colonies as described (Hoffman
and Winston, 1987, Gene 57, 267-272). The hybriZAP.TM. inserts were
PCR amplified and the PCR fragments were digested with AluI and
fractionated on a 2% agarose gel. Plasmid DNAs whose inserts
gave rise to different restriction patterns were electroporated
into Escherichia coli XL1-Blue, and the DNA sequence of the inserts
was determined. Extracted DNA was also used to retransform HF7c to
test the specificity of the interaction.
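Two of the bookkeeping steps in the screening above lend themselves to a short sketch: scaling the colony count from the 1/1000 control plating up to the whole transformation, and grouping amplified inserts by their AluI restriction pattern so that only distinct clones are carried forward. The sketch is an in silico stand-in for the gel comparison, which in practice is done without knowing the insert sequence. It relies on the AluI recognition site AGCT, cut between G and C to leave blunt ends; the function names and toy sequences are hypothetical.

```python
ALUI_SITE = "AGCT"  # AluI cuts AG^CT, leaving blunt ends


def estimate_cotransformants(control_plate_colonies, dilution_factor=1000):
    """Scale the count from the 1/1000 control plating to the whole mix."""
    return control_plate_colonies * dilution_factor


def alui_fragment_lengths(insert_seq):
    """Sorted fragment lengths produced by a complete AluI digest."""
    seq = insert_seq.upper()
    cuts = []
    pos = seq.find(ALUI_SITE)
    while pos != -1:
        cuts.append(pos + 2)  # cut falls between AG and CT
        pos = seq.find(ALUI_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return tuple(sorted(b - a for a, b in zip(bounds, bounds[1:])))


def group_by_restriction_pattern(inserts):
    """Group clone names that share the same fragment-length pattern."""
    groups = {}
    for name, seq in inserts.items():
        groups.setdefault(alui_fragment_lengths(seq), []).append(name)
    return groups


inserts = {
    "clone1": "AAAAGCTTTT",
    "clone2": "CCCAGCTGGG",
    "clone3": "AAGCTAAAAAGCTA",
}
groups = group_by_restriction_pattern(inserts)
# groups == {(5, 5): ["clone1", "clone2"], (3, 3, 8): ["clone3"]}
```

In the toy data, clone1 and clone2 share a pattern although their sequences differ, which illustrates why the restriction pattern is used only to reduce redundancy before sequencing. A control plate with 12,000 colonies would correspond to 12,000,000 cotransformants, above the 10.sup.7 stated above.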
[0315] Using the foregoing technique, 61 cDNAs were identified,
their sequences were determined and found to contain open reading
frames termed CCP1 through CCP61 (FIGS. 1-61).
Example 2
Extension of CCP Encoding Polynucleotides to Full Length or to
Recover Regulatory Elements
[0316] The CCP encoding nucleic acid sequences (SEQ ID NO:1-66 or
228-239) are used to design oligonucleotide primers for extending a
partial nucleotide sequence to full length or for obtaining 5'
sequences from genomic or cDNA libraries. One primer is synthesized
to initiate extension in the antisense direction (XLR) and the
other is synthesized to extend sequence in the sense direction
(XLF). Primers allow the extension of the known CCP encoding
sequence "outward" generating amplicons containing new, unknown
nucleotide sequence for the region of interest. The initial primers
are designed from the cDNA using OLIGO.RTM. 4.06 Primer Analysis
Software (National Biosciences), or another appropriate program, to
be preferably 22-30 nucleotides in length, to have a GC content of
preferably 50% or more, and to anneal to the target sequence at
temperatures preferably about 68.degree.-72.degree. C. Any stretch
of nucleotides which would result in hairpin structures and
primer-primer dimerizations is avoided. The original, selected cDNA
libraries, prepared from mRNA isolated from actively dividing cells
or a plant genomic library are used to extend the sequence; the
latter is most useful to obtain 5' upstream regions. If more
extension is necessary or desired, additional sets of primers are
designed to further extend the known region.
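The design preferences stated above (22-30 nucleotides, GC content of 50% or more, no hairpin-forming stretches) can be checked with a few lines of Python. This is a rough sketch and not a substitute for the primer analysis software named above: the hairpin check is a crude heuristic that looks for a short stem whose reverse complement occurs downstream after a minimal loop, the stem and loop sizes are assumptions, and the annealing temperature criterion is not evaluated because it requires a thermodynamic model.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def meets_extension_primer_criteria(primer, min_len=22, max_len=30, min_gc=0.5):
    """Check the length and GC-content preferences only."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


def has_hairpin_stem(primer, stem=4, min_loop=3):
    """Crude check: does any stem-length stretch have its reverse
    complement further downstream, at least min_loop bases away?"""
    p = primer.upper()
    for i in range(len(p) - stem + 1):
        partner = reverse_complement(p[i:i + stem])
        if partner in p[i + stem + min_loop:]:
            return True
    return False
```

A candidate would be kept when `meets_extension_primer_criteria` returns True and `has_hairpin_stem` returns False; primer-primer dimerization between the XLF and XLR primers would need a separate pairwise check.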
[0317] Sense XLF primers can also be designed based on publicly
available genomic sequences. GENEMARK.hmm (hidden Markov model)
version 2.2a software (default parameters) can e.g. be used to
predict open reading frames. The 5' end of the predicted open
reading frame is then subsequently used to design the sense XLF
primer. Said XLF primer and the appropriate XLR primer are then
used in an RT-PCR (reverse transcription-polymerase chain reaction)
reaction to amplify the predicted cDNA. The resulting PCR product
is cloned in a suitable vector and subjected to DNA sequence
analysis to verify the prediction.
[0318] Primers used to amplify coding regions of the CCPs of the
invention are designed such that the PCR product can be cloned in
the pDONR201 vector (Gateway.TM. cloning system, Invitrogen). Thus,
a sense primer has the attB1 site (SEQ ID NO:246) at its 5' end.
For current purposes, the attB1 site is followed by a consensus
Kozak sequence (SEQ ID NO:247; Kozak (1989) J. Cell Biol. 108:229-241;
Lutcke et al. (1987) EMBO J. 6:43-48). The 3' end of the sense
primer comprises the gene-specific parts as indicated in FIGS.
1-46. An antisense primer has at the 5' end the attB2 site (SEQ ID
NO:248) followed by the inverse complement of the gene/coding
region of interest as indicated in FIGS. 1-46. Primers used for CCP
amplification by PCR are given with their SEQ ID NOs in Table III.
The sequence of cloned CCP PCR products was or is determined using
the sense primer prm1024 (SEQ ID NO:265) and the antisense primer
prm1025 (SEQ ID NO:266).
TABLE III
CCP Molecule     PCR primers (sense + antisense)   Sense primer SEQ ID NO:   Antisense primer SEQ ID NO:
CCP1             prm0733 + prm0734                 133                       134
CCP2             prm0663 + prm0664                 135                       136
CCP3             prm0705 + prm0706                 137                       138
CCP4             prm0659 + prm0660                 139                       140
CCP5             prm0749 + prm0750                 141                       142
CCP6             prm0707 + prm0708                 143                       144
CCP7/8           prm0657 + prm0658                 145                       146
CCP9             prm0582 + prm0583                 147                       148
CCP10            prm0671 + prm0672                 149                       150
CCP11            prm0729 + prm0730                 151                       152
CCP12 + CCP13    prm1676 + prm1677                 153                       154
CCP14            prm0701 + prm0702                 155                       156
CCP15            prm0445 + prm0446                 157                       158
CCP16            prm0321 + prm0322                 159                       160
CCP17            prm0632 + prm0633                 161                       162
CCP18            prm0488 + prm0489                 163                       164
CCP19            prm0661 + prm0662                 165                       166
CCP20 + CCP21    prm0709 + prm0710                 167                       168
CCP22            prm0711 + prm0712                 169                       170
CCP23            prm0819 + prm0820                 171                       172
CCP24            prm0739 + prm0740                 173                       174
CCP25            prm0741 + prm0742                 175                       176
CCP26            prm0703 + prm0704                 177                       178
CCP27            prm0817 + prm0818                 179                       180
CCP28            prm0713 + prm0714                 181                       182
CCP29            /                                 /                         /
CCP30            prm0480 + prm0481                 183                       184
CCP31            prm0737 + prm0738                 185                       186
CCP32            prm1493 + prm1494                 187                       188
CCP33            prm0319 + prm0320                 189                       190
CCP34            prm1377 + prm1378                 191                       192
CCP35            prm1381 + prm1382                 193                       194
CCP36            /                                 /                         /
CCP37            prm1379 + prm1380                 195                       196
CCP38            prm1383 + prm1384                 197                       198
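The primer layout described in paragraph [0318] (attB1 site, then Kozak sequence, then the gene-specific part for the sense primer; attB2 site, then the inverse complement of the end of the coding region for the antisense primer) can be sketched as a string assembly. The attB1, Kozak and attB2 sequences are those of SEQ ID NOs 246, 247 and 248 and are not reproduced here, so the sketch takes them as caller-supplied arguments and the example uses visible placeholder strings. The length of the gene-specific part and the function name are assumptions of this sketch.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_gateway_primers(cds, attb1, kozak, attb2, gene_specific_len=20):
    """Assemble sense and antisense primers for cloning a coding region.

    attb1, kozak and attb2 must be supplied by the caller (SEQ ID NOs
    246, 247 and 248 in the source); gene_specific_len is an assumption.
    """
    cds = cds.upper()
    sense = attb1 + kozak + cds[:gene_specific_len]
    antisense = attb2 + reverse_complement(cds[-gene_specific_len:])
    return sense, antisense


# Placeholder strings stand in for the real attB and Kozak sequences.
sense, antisense = build_gateway_primers(
    "ATGGCCAAATTTGGGCCCTAA",
    attb1="[attB1]", kozak="[Kozak]", attb2="[attB2]",
    gene_specific_len=6,
)
# sense     == "[attB1][Kozak]ATGGCC"
# antisense == "[attB2]TTAGGG"
```

Whether the Kozak sequence overlaps the start codon of the gene-specific part depends on SEQ ID NO:247 and is not handled by this sketch.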
[0319] By following the instructions for the XL-PCR kit (Perkin
Elmer) and thoroughly mixing the enzyme and reaction mix, high
fidelity amplification is obtained. Beginning with 40 pmol of each
primer and the recommended concentrations of all other components
of the kit, PCR is performed using the Peltier Thermal Cycler
(PTC200; MJ Research, Watertown Mass.) and the following
parameters:
Step 1    94.degree. C. for 1 min (initial denaturation)
Step 2    65.degree. C. for 1 min
Step 3    68.degree. C. for 6 min
Step 4    94.degree. C. for 15 sec
Step 5    65.degree. C. for 1 min
Step 6    68.degree. C. for 7 min
Step 7    Repeat steps 4-6 for 15 additional cycles
Step 8    94.degree. C. for 15 sec
Step 9    65.degree. C. for 1 min
Step 10   68.degree. C. for 7:15 min
Step 11   Repeat steps 8-10 for 12 cycles
Step 12   72.degree. C. for 8 min
Step 13   4.degree. C. (and holding)
[0320] A 5-10 .mu.l aliquot of the reaction mixture is analyzed by
electrophoresis on a low concentration (about 0.6-0.8%) agarose
mini-gel to determine which reactions were successful in extending
the sequence. Bands thought to contain the largest products are
selected and cut out of the gel. Further purification involves
using a commercial gel extraction method such as QIAQuick.TM.
(QIAGEN Inc). After recovery of the DNA, Klenow enzyme is used to
trim single-stranded nucleotide overhangs, creating blunt ends
which facilitate religation and cloning. After ethanol
precipitation, the products are redissolved in 13 .mu.l of ligation
buffer, 1 .mu.l T4-DNA ligase (15 units) and 1 .mu.l T4
polynucleotide kinase are added, and the mixture is incubated at
room temperature for 2-3 hours or overnight at 16.degree. C.
Competent E. coli cells (in 40 .mu.l of appropriate media) are
transformed with 3 .mu.l of ligation mixture and cultured in 80
.mu.l of SOC medium (Sambrook, supra). After incubation for one
hour at 37.degree. C., the whole transformation mixture is plated
on Luria Bertani (LB)-agar (Sambrook, supra) containing
2.times.Carb. The following day, several colonies are randomly
picked from each plate and cultured in 150 .mu.l of liquid
LB/2.times.Carb medium placed in an individual well of an
appropriate, commercially-available, sterile 96-well microtiter
plate. The following day, 5 .mu.l of each overnight culture is
transferred into a non-sterile 96-well plate and after dilution
1:10 with water, 5 .mu.l of each sample is transferred into a PCR
array. For PCR amplification, 18 .mu.l of concentrated PCR reaction
mix (3.3.times.) containing 4 units of rTth DNA polymerase, a
vector primer and both of the gene specific primers used for the
extension reaction are added to each well. Amplification is
performed using the following conditions:
Step 1   94.degree. C. for 60 sec
Step 2   94.degree. C. for 20 sec
Step 3   55.degree. C. for 30 sec
Step 4   72.degree. C. for 90 sec
Step 5   Repeat steps 2-4 for an additional 29 cycles
Step 6   72.degree. C. for 180 sec
Step 7   4.degree. C. (and holding)
Aliquots of the PCR reactions are run on agarose gels together with
molecular weight markers. The sizes of the PCR products are
compared to the original partial cDNAs, and appropriate clones are
selected, ligated into plasmid and sequenced.
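As a small worked example, the colony screening program given above (60 sec at 94.degree. C., then steps 2-4 run once plus 29 additional cycles, then 180 sec at 72.degree. C.) can be written down as data and its programmed hold time summed. The function and variable names are hypothetical, and the figure ignores temperature ramp times and the final 4.degree. C. hold, which depend on the instrument.

```python
def program_seconds(initial, cycle_steps, cycles, final):
    """Total programmed hold time of a simple three-stage PCR program."""
    return initial + cycles * sum(cycle_steps) + final


COLONY_PCR = {
    "initial": 60,                # Step 1: 94 C for 60 sec
    "cycle_steps": (20, 30, 90),  # Steps 2-4: 94 C, 55 C, 72 C
    "cycles": 1 + 29,             # first pass plus 29 additional cycles
    "final": 180,                 # Step 6: 72 C for 180 sec
}

total_seconds = program_seconds(**COLONY_PCR)
total_minutes = total_seconds / 60
# total_seconds == 4440, i.e. 74 minutes of programmed hold time
```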
Example 3
Expression of Recombinant CCP Proteins in Transgenic Plants
[0321] In this example, the CCP molecules of the present invention
were expressed in a 35S expression vector in transgenic plants. The
CCP molecules of this invention were cloned using standard cloning
procedures between a suitable promoter, e.g. the CaMV35S promoter
or any promoter from e.g. Table II, and a suitable terminator,
e.g., the NOS 3' untranslated region. The resulting recombinant
gene is subsequently cloned in a suitable binary vector and the
resulting plant transformation vector is then transferred to
Agrobacterium tumefaciens. Arabidopsis thaliana is transformed with
this Agrobacterium applying the in planta flower-dip transformation
method (Clough and Bent, Plant J. 16:735-743, 1998). Transgenic
plant lines are selected on a growth medium containing the suitable
selection agent (e.g., kanamycin or Basta) or on the basis of
scoring the expression of a screenable marker (e.g., luciferase,
green fluorescent protein).
[0322] For tissue-specific expression, the CCP gene can also be
expressed under control of the minimal 35S promoter containing UAS
elements. These UAS elements are sites for transcriptional
activation by the GAL4-VP16 fusion protein. The GAL4-VP16 fusion
protein in turn is expressed under control of a tissue-specific
promoter. The UAS-CCP construct and the GAL4-VP16 construct are
combined by co-transformation of both constructs, subsequent
transformation of single constructs or by sexual cross of lines
that contain the single constructs. The advantage of this
two-component system is that a wide array of tissue-specific
expression patterns can be generated for a specific transgene, by
simply crossing selected parent lines expressing the UAS-CCP
construct with various tissue-specific GAL4-VP16 lines. A
tissue-specific promoter/CCP combination that gives a desired
phenotype can subsequently be recloned in a single expression
vector, to avoid stacking of transgene constructs in commercial
lines.
[0323] Primary transformants are characterized by Northern and
Western blotting using 1-4 week old plantlets. Expression levels
were compared with those of non-transformed (control) plants.
Example 4
Downregulation of Target CCP Genes in Transgenic Plants
[0324] Plant genes can be specifically downregulated by antisense
and co-suppression technologies. These technologies are based on
the synthesis of antisense transcripts, complementary to the mRNA
of a given CCP gene. Several methods described in the
literature increase the efficiency of this downregulation,
for example expressing the sense strand with introduced inverted
repeats, rather than the antisense strand. The constructs for
downregulation of target genes are made in the same way as those for
expression of recombinant proteins, i.e., they are fused to
promoter sequences and transcription termination sequences (see
Example 3). Promoters used for this purpose are constitutive
promoters as well as tissue-specific promoters.
Example 5
Agrobacterium-Mediated Rice Transformation
[0325] Mature dry seeds of the rice japonica cultivars Nipponbare
or Taipei 309 are dehusked, sterilised and germinated on a medium
containing 2,4-D (2,4-dichlorophenoxyacetic acid). After incubation
in the dark for four weeks, embryogenic, scutellum-derived calli
are excised and propagated on the same medium. Selected embryogenic
calli are then co-cultivated with Agrobacterium. Widely used
Agrobacterium strains such as LBA4404 or C58 harbouring binary
T-DNA vectors can be used. The hpt gene in combination with
hygromycin is suitable as a selectable marker system but other
systems can be used. Co-cultivated callus is grown on
2,4-D-containing medium for 4 to 5 weeks in the dark in the
presence of a suitable concentration of the selective agent. During
this period, rapidly growing resistant callus islands develop.
After transfer of this material to a medium with a reduced
concentration of 2,4-D and incubation in the light, the embryogenic
potential is released and shoots develop in the next four to five
weeks. Shoots are excised from the callus and incubated for one
week on an auxin-containing medium from which they can be
transferred to the soil. Hardened shoots are grown under high
humidity and short days in a phytotron. Seeds can be harvested
three to five months after transplanting. The method yields single
locus transformants at a rate of over 50% (Aldemita and Hodges
(1996) Planta 199:612-617; Chan et al. (1993) Plant Mol. Biol. 22:
491-506; Hiei et al. (1994) Plant J. 6:271-282).
Example 6
Expression of Recombinant CCP Proteins in Bacterial Cells
[0326] In this example, the CCP molecules of the present invention
are expressed as a recombinant glutathione-S-transferase (GST)
fusion polypeptide in E. coli and the fusion polypeptide is
isolated and characterized. Specifically, CCP molecules are fused
to GST and this fusion polypeptide is expressed in E. coli, e.g.,
strain PEB199. Expression of the GST-CCP fusion protein in PEB199
is induced with IPTG. The recombinant fusion polypeptide is
purified from crude bacterial lysates of the induced PEB199 strain
by affinity chromatography on glutathione beads. Using
polyacrylamide gel electrophoretic analysis of the polypeptide
purified from the bacterial lysates, the molecular weight of the
resultant fusion polypeptide is determined.
Example 7
Expression of Recombinant CCP Proteins in COS Cells
[0327] To express the CCP gene of the present invention in COS
cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego,
Calif.) is used. This vector contains an SV40 origin of
replication, an ampicillin resistance gene, an E. coli replication
origin, a CMV promoter followed by a polylinker region, and an SV40
intron and polyadenylation site. A DNA fragment encoding the entire
CCP protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a
FLAG tag fused in-frame to the 3' end of the fragment is cloned
into the polylinker region of the vector, thereby placing the
expression of the recombinant protein under the control of the CMV
promoter.
[0328] To construct the plasmid, the CCP DNA sequence is amplified
by PCR using two primers. The 5' primer contains the restriction
site of interest followed by approximately twenty nucleotides of
the CCP coding sequence starting from the initiation codon; the 3'
primer contains sequences complementary to the other
restriction site of interest, a translation stop codon, the HA tag
or FLAG tag and the last 20 nucleotides of the CCP coding sequence.
The PCR amplified fragment and the pcDNA/Amp vector are digested
with the appropriate restriction enzymes and the vector is
dephosphorylated using the CIAP enzyme (New England Biolabs,
Beverly, Mass.). Preferably the two restriction sites chosen are
different so that the CCP gene is inserted in
the correct orientation. The ligation mixture is transformed into
E. coli cells (strains HB101, DH5.alpha., SURE, available from Stratagene
Cloning Systems, La Jolla, Calif., can be used), the transformed
culture is plated on ampicillin media plates, and resistant
colonies are selected. Plasmid DNA is isolated from transformants
and examined by restriction analysis for the presence of the
correct fragment.
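By way of illustration only, the primer layout described above can be sketched in Python. The EcoRI and BamHI sites, the three-nucleotide 5' clamp, the particular HA coding sequence, the TAA stop codon and the toy coding sequence are all assumptions chosen for this example; none of them is specified in the text. The sketch removes the native stop codon so that the tag is read in frame and is followed by the new stop codon.

```python
# Illustrative sketch only. The restriction sites, the 5' clamp, the HA coding
# sequence and the toy coding sequence are assumptions made for this example.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# One commonly used coding sequence for the HA epitope YPYDVPDYA (assumed).
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"
STOP_CODONS = ("TAA", "TAG", "TGA")


def design_primers(cds, site_5="GAATTC", site_3="GGATCC",
                   clamp="ATA", anneal=20, stop="TAA"):
    """Build a 5' primer and a tag-adding 3' primer for a coding sequence."""
    cds = cds.upper()
    # Drop the native stop codon so the tag can be fused in frame.
    body = cds[:-3] if cds[-3:] in STOP_CODONS else cds
    # 5' primer: clamp, restriction site, first `anneal` nt from the ATG.
    forward = clamp + site_5 + body[:anneal]
    # Sense-strand view of the 3' end of the product:
    # last `anneal` nt of the CDS, tag, stop codon, second restriction site.
    sense_tail = body[-anneal:] + HA_TAG + stop + site_3
    # The 3' primer is the reverse complement of that tail, plus a clamp.
    reverse = clamp + reverse_complement(sense_tail)
    return forward, reverse


# Hypothetical 60-nt coding sequence used only to exercise the function.
TOY_CDS = ("ATG GCT TCT AAG GGT GAA GAA CTT TTC ACT GGT GTT GTT CCA "
           "ATC CTT GTT GAA CTT TAA").replace(" ", "")

forward_primer, reverse_primer = design_primers(TOY_CDS)
print("5' primer:", forward_primer)
print("3' primer:", reverse_primer)
```

Using two different sites, as in this sketch, is what forces the insert into a single orientation after digestion and ligation.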
[0329] COS cells are subsequently transfected with the
CCP-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium
chloride co-precipitation methods, DEAE-dextran-mediated
transfection, lipofection, or electroporation. Other suitable
methods for transfecting host cells can be found in Sambrook, J.,
Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory
Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y., 1989. The expression of
the CCP polypeptide is detected by radiolabelling
(.sup.35S-methionine or .sup.35S-cysteine available from NEN,
Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and
Lane, D. Antibodies: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y., 1988) using an HA
specific monoclonal antibody. Briefly, the cells are labelled for 8
hours with .sup.35S-methionine (or .sup.35S-cysteine). The culture
media are then collected and the cells are lysed using detergents
(RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM
Tris, pH 7.5). Both the cell lysate and the culture media are
precipitated with an HA specific monoclonal antibody. Precipitated
polypeptides are then analyzed by SDS-PAGE.
[0330] Alternatively, DNA containing the CCP
coding sequence is cloned directly into the polylinker of the
pcDNA/Amp vector using the appropriate restriction sites. The
resulting plasmid is transfected into COS cells in the manner
described above, and the expression of the CCP polypeptide is
detected by radiolabelling and immunoprecipitation using a CCP
specific monoclonal antibody.
Example 8
In Vitro Phosphorylation of CDC2bDN-IC26M by Plant CDKs
[0331] The CDC2bDN-IC26M coding region (SEQ ID NO:4) was amplified
by PCR with Pfu polymerase (Stratagene, La Jolla, Calif.). The PCR
product was subcloned into pET19b (Novagen, Madison, Wis.), to
obtain CDC2bDN-IC26MpET19b. The CDC2bDN-IC26M gene is located
downstream of a T7lac promoter, in frame with a sequence encoding a
10-histidine tag followed by an enterokinase recognition site.
Escherichia coli BL21(DE3) cells (Novagen) containing the
CDC2bDN-IC26MpET19b plasmid were grown at 37.degree. C. in M9
medium (Sambrook and Russell, Molecular Cloning, A Laboratory
Manual, 3.sup.rd Edition, CSHL Press, CSH New York, 2001),
supplemented with 100 .mu.g/ml of ampicillin, to obtain a cell
density corresponding to an A600 of 0.6. Subsequently, expression
of the CDC2bDN-IC26M gene was induced by addition of 0.4 mM
isopropyl .beta.-D-thiogalactoside, and culture was continued for 4
h at 30.degree. C.
[0332] Cells were collected in lysis buffer containing 50 mM sodium
phosphate buffer, pH 8.0, 300 mM NaCl, 0.1% Triton X-100, and 1 mM
phenylmethylsulfonyl fluoride (PMSF) and were lysed on ice by
sonication. The extract was clarified by centrifugation for 20
minutes at 20,000.times.g. The crude extract was loaded at
4.degree. C. on a nickel-nitrilotriacetic acid-agarose affinity
resin (Qiagen), and protein fractionation was performed according
to the manufacturer's instructions. The fractions containing the
CDC2bDN-IC26M fusion protein were pooled.
[0333] CDC2bDN-IC26M kinase assays were performed with CDK
complexes purified from total plant (Arabidopsis seedlings) protein
extracts by p13.sup.suc1-Sepharose affinity binding according to
Azzi et al. (Eur. J. Biochem. 203: 353-360). Briefly, p13.sup.suc1
was purified from an overproducing E. coli strain by chromatography
in Sephacryl S2000, and conjugated to CNBr-activated Sepharose 4B
(Pharmacia) according to the manufacturer's instructions. Total
plant protein extracts (300 .mu.g) were incubated with 50 .mu.l 50%
(v/v) p13.sup.suc1-Sepharose beads for 2 h at 4.degree. C. The
washed beads were combined with 30 .mu.l kinase buffer containing
.about.1 mg/ml CDC2bDN-IC26M, 150 mM ATP and 1 .mu.Ci of
[.gamma.-.sup.32P]ATP (Amersham). After 20 minutes of incubation at
30.degree. C.,
samples were analysed by SDS-PAGE and autoradiographed.
[0334] As shown in FIG. 48, the purified CDC2bDN-IC26M protein is
phosphorylated by CDKs in vitro.
Example 9
PCR Amplification of AtDPb
[0335] Based on available sequence data of putative plant
DP-related partial clones from the databank (soybean DP(AI939068),
tomato DP(AW217514), and cotton DP (AI731675)), three
oligonucleotides, corresponding to the most conserved part of the
DNA-binding and E2F heterodimerization domains (MKVCEKV, SEQ ID
NO:240; LNVLMAMD, SEQ ID NO:241 and FNSTPFEL, SEQ ID NO:242), were
synthesized and designated A (ATAGAATTCATGAAAGTTTGTGAAAAGGTG, SEQ
ID NO:243), B (ATAGAATTCCTGAATGTTCTCATGGCAATGGAT, SEQ ID NO:244)
and C (ATAGGATCCCAGCTCAAAAGGAGTGCTATTGAA, SEQ ID NO:245),
respectively.
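As a check on the relationship between the three oligonucleotides and the conserved peptide motifs, the following Python sketch translates each primer. It assumes that the first nine nucleotides of each primer are a three-nucleotide clamp followed by a six-nucleotide EcoRI or BamHI site (an interpretation that fits the subsequent EcoRI/BamHI digestion), and that primer C is the antisense primer. The function and variable names are illustrative only.

```python
# Illustrative sketch: confirm that primers A, B and C encode the stated motifs.
# ADAPTER_LEN (clamp + restriction site) is an interpretation, not source data.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in the conventional TCAG codon order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")
ADAPTER_LEN = 9  # 3-nt clamp + 6-nt restriction site (assumed layout)


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of an upper-case DNA sequence in frame 1."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def encoded_peptide(primer: str, antisense: bool = False) -> str:
    """Peptide encoded by the gene-specific part of a primer."""
    core = primer[ADAPTER_LEN:]
    if antisense:
        core = reverse_complement(core)
    return translate(core)


PRIMERS = {
    "A": ("ATAGAATTCATGAAAGTTTGTGAAAAGGTG", False, "MKVCEKV"),
    "B": ("ATAGAATTCCTGAATGTTCTCATGGCAATGGAT", False, "LNVLMAMD"),
    "C": ("ATAGGATCCCAGCTCAAAAGGAGTGCTATTGAA", True, "FNSTPFEL"),
}

for name, (seq, antisense, motif) in PRIMERS.items():
    print(name, encoded_peptide(seq, antisense), "expected", motif)
```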
[0336] PCR was performed on an Arabidopsis/yeast two-hybrid
suspension culture cDNA library. The PCR products were purified,
digested with EcoRI and BamHI, and ligated into pCR-XL-TOPO vector
(Invitrogen). The cloned inserts were sequenced by double-stranded
dideoxy sequencing.
Example 10
Construction of AtDP and AtE2F Mutants, In Vitro
Transcription-Translation System and Immunoprecipitation
[0337] Influenza hemagglutinin (HA)-tagged versions of the
wild-type and mutant AtE2Fa and AtE2Fb were constructed by cloning
into the pSK plasmid (Stratagene) containing the HA-tag (SEQ ID
NO:202). The AtE2F mutants, namely AtE2Fa 1-420 (SEQ ID NO:217),
AtE2Fa 162-485 (SEQ ID NO:218), and AtE2Fb 1-385 (SEQ ID NO:206),
were obtained by PCR and cloned into the EcoRI and BamHI sites of
HA-pSK. The c-myc (SEQ ID NO:200)-tagged versions of wild-type and
AtDP mutants (AtDPa 1-292, SEQ ID NO:114; AtDPa 121-292, SEQ ID
NO:211; AtDPa 1-142, SEQ ID NO:208; AtDPa 172-292, SEQ ID NO:213;
AtDPa 121-213, SEQ ID NO:212; and AtDPb 1-385, SEQ ID NO:127; AtDPb
182-385, SEQ ID NO:216; AtDPb 1-263, SEQ ID NO:223; AtDPb 1-193,
SEQ ID NO:214; and AtDPb 182-263, SEQ ID NO:215) were generated by
PCR and cloned into the EcoRI and PstI sites of the pBluescript
plasmid (Stratagene) containing a double c-myc tag. All cloning
steps were carried out according to standard procedures, and the
reading frames were verified by direct sequencing.
[0338] In vitro transcription and translation experiments were
performed using the TNT T7-coupled wheat germ extract kit (Promega)
primed with appropriate plasmids for 90 min at 30.degree. C. For
immunoprecipitation, 10 .mu.l of the total in vitro translated
extract (50 .mu.l) was diluted at 1:5 in Nonidet P40 buffer (50 mM
Tris, pH 7.4, 150 mM NaCl, 1% Nonidet P40, 1 mM
phenylmethylsulfonyl fluoride, 10 .mu.g/ml
leupeptin/aprotinin/pepstatin) and incubated for 2 h at 4.degree.
C. with anti-c-myc (9E10; BabCo) or anti-HA (16B12; BabCo)
antibodies. Protein-A-Sepharose (40 .mu.l 25% (v/v)) was added and
incubated for 1 h at 4.degree. C., then the beads were washed four
times with Nonidet P40 buffer. Immune complexes were eluted with 10
.mu.l of 2.times. sodium dodecyl sulfate (SDS) sample buffer and analyzed
by 10% or 15% SDS-PAGE and by autoradiography.
[0339] An overview of the AtDP and AtE2F fragments and their SEQ ID
NOs is given in Table 4.
TABLE-US-00017 TABLE IV
CCP or partial      SEQ ID NO            SEQ ID NO
CCP sequence        amino acid sequence  DNA sequence
AtE2Fa 226-356      205                  228
AtE2Fb 1-385        206
AtE2Fb 1-127        207
AtDPa 1-142         208
AtDPa 42-142        209
AtDPa 42-292        210
AtDPa 121-292       211                  229
AtDPa 121-213       212
AtDPa 172-292       213
AtDPb 1-193         214
AtDPb 182-263       215                  230
AtDPb 182-385       216                  231
AtE2Fa 1-420        217
AtE2Fa 162-485      218
AtE2Fa 1-38         219
AtDPa 1-214         220                  239
AtDPa 143-292       221                  232
AtDPa 143-213       222                  233
AtDPb 1-263         223                  234
AtE2Fa 232-282      224                  235
AtE2Fa 232-352      225                  236
AtE2Fb 194-243      226                  237
AtE2Fb 194-311      227                  238
Example 11
In Vitro Interaction Between AtDPs, AtE2Fs and Mutants Thereof
Illustrated by Immunoprecipitation Experiments
[0340] The AtDPa and AtDPb can efficiently interact in vitro with
AtE2Fa and AtE2Fb. As a first step in comparing the biochemical
properties of AtDPa and AtDPb, the ability of these molecules to
heterodimerize with AtE2Fa and AtE2Fb was tested. For this purpose,
the coupled in vitro transcription-translation system was used in
which the c-myc-tagged AtDPa or AtDPb was co-expressed with the
HA-tagged AtE2Fa or AtE2Fb. One part of each sample was resolved by
SDS-PAGE (FIGS. 50 and 51, panels A), while another part was
subjected to immunoprecipitation with monoclonal anti-c-myc
antibodies (FIGS. 50 and 51, panels B). In the absence of DP
proteins, no AtE2Fa or AtE2Fb was precipitated by the anti-c-myc
antibodies (FIG. 51, panel B, lane 1). However, both HA-AtE2F
proteins co-precipitated reproducibly with c-myc-tagged AtDPa (FIG.
50, panel B, lanes 1 and 2) and AtDPb (FIG. 51, panel B, lanes 3
and 4). Identical results were obtained in a reciprocal experiment
with anti-HA monoclonal antibodies. These data revealed that both
Arabidopsis DP-related proteins interacted in vitro with the
different Arabidopsis E2F-related proteins.
[0341] The conserved dimerization domain of the AtE2Fs seemed to be
important for the interaction with the AtDPs, because mutational
analysis showed that deletion of neither the N-terminal extension
nor the C-terminal part of AtE2Fa and AtE2Fb impaired the
interaction with the DPs (FIGS. 50 and 51, panels B). Similar
results were obtained by two-hybrid analysis (see Table 5 of
Example 12). To test whether the structural requirements for
heterodimerization of the AtDPs were similar to those of their
animal homologs, several deletion mutants of AtDPa and AtDPb were
constructed (for a schematic illustration, see FIGS. 52 and 53),
tagged with the c-myc epitope (FIGS. 54 and 55, panels A). The
interactions between the mutant AtDPs and AtE2Fb were analyzed in
immunoprecipitation experiments with the specific anti-HA or
anti-c-myc antibodies (FIGS. 54 and 55, panels B and C,
respectively). As shown in FIGS. 54 and 55, mutant AtDP proteins
with deleted DNA-binding domain could bind sufficiently to the
co-translated HA-AtE2Fb proteins (FIG. 54, panel C, lane 2; and
FIG. 55, panel C, lane 2). No detectable interaction was found
between the AtE2Fb protein and mutant DP proteins containing the
complete DNA-binding domain, but lacking the putative dimerization
domain (FIG. 54, panel C, lane 3; FIG. 55, panel C, lane 4). Thus,
the N-terminal part of both AtDP proteins, including the conserved
DNA-binding domain, was not sufficient for the in vitro interaction
to occur. In contrast, a mutant form of AtDPb (amino acids 1-263;
SEQ ID NO:223) could bind to AtE2Fb (FIG. 55, panel C, lane 3),
indicating that the region of AtDPb between amino acids 182 and 263
was required for interaction with AtE2Fb.
[0342] To confirm this hypothesis, a deletion mutant of AtDPb
(182-263, SEQ ID NO:215) was constructed and, as expected, it could
bind to AtE2Fb (FIG. 56). The requirement for the homologous
dimerization domain of AtDPa for the interaction with AtE2Fb was
supported by a binding assay in which the mutant AtDPa 172-292 (SEQ
ID NO:213), with the N-terminal part of the dimerization domain
deleted, failed to bind to AtE2Fb (FIG. 54, panels B and C, lanes
4). However, when the E2F-binding activity of the predicted
dimerization domain of the AtDPa (amino acid positions 121-213, SEQ
ID NO:212) was tested, no interaction could be detected between
this region and the AtE2Fb protein (FIG. 54, panel B, lane 5).
These data indicate that other carboxyl-terminal regions of AtDPa
are required for the stable interaction with AtE2Fb.
Example 12
Yeast Two-Hybrid Experiments for Showing Interaction Between DP and
E2F Mutants
[0343] For library screening, vectors and strains (HF7c) were
provided with the Matchmaker two-hybrid system (Clontech). The
dimerization and DNA-binding domains of the AtE2Fa (amino acids
226-356; SEQ ID NO:205) were amplified by polymerase chain reaction
(PCR) and subcloned in-frame with the GAL4 DNA-binding domain of
pGBT9 (Clontech) to create the bait plasmid pGBTE2Fa226-356.
Screens were performed as described previously (De Veylder et al.
1999; Planta 208, 453-462). A second library screening was
performed with the AtE2Fb construct (pGBTE2Fb-Rb) lacking the
Rb-binding domain (amino acids 1-385; SEQ ID NO:206). Plasmids from
interacting clones were isolated and sequenced.
[0344] For the yeast two-hybrid interaction experiments, a number
of yeast two-hybrid prey (in pAD-GAL424) plasmids were created by
PCR amplification of fragments from the AtDPa (DPa 1-292, SEQ ID
NO:114; DPa 1-142, SEQ ID NO:208; DPa 42-142, SEQ ID NO:209; DPa
42-292, SEQ ID NO:210; DPa 121-292, SEQ ID NO:211; DPa 121-213, SEQ
ID NO:212; and DPa 172-292, SEQ ID NO:213) and AtDPb (DPb 1-385,
SEQ ID NO:127; DPb 1-193, SEQ ID NO:214; DPb 182-263, SEQ ID
NO:215; and DPb 182-385, SEQ ID NO:216) genes and confirmed by
sequencing. Different combinations between bait (pGBTE2Fa226-356,
pGBTE2Fb-Rb, or pGBTE2Fb 1-127, SEQ ID NO:207) and prey constructs
were transformed into yeast cells and assayed for their ability to
grow on His.sup.- minimal media after 3 days of incubation at
30.degree. C. Bait plasmids co-transformed with empty pAD-GAL424
and prey plasmids co-transformed with empty pGBT9 were assessed
in parallel as controls for the specificity of the interaction.
[0345] An overview of the AtDP and AtE2F fragments and their SEQ ID
NOs is given in Table 4.
[0346] The results obtained were confirmed by two-hybrid
interaction analysis. pGBTE2Fa226-356 and pGBTE2Fb-Rb were
co-transformed in an appropriate yeast reporter strain with a
plasmid producing the full-length AtDPa or AtDPb protein fused to
the GAL4 transactivation domain. The specific reconstitution of
GAL4-dependent gene expression measured as the ability to grow in
the absence of histidine confirms the interaction between the two
DP and E2F proteins (Table 5).
TABLE-US-00018 TABLE V AtDPs and AtE2Fs interaction in yeast
two-hybrid assays.
                    Preys
                    DPa    DPa    DPa     DPa     DPa      DPa      DPa      DPb    DPb    DPb      DPb      E2Fa     pAD-
Baits               1-292  1-142  42-142  42-292  121-292  121-213  172-292  1-385  1-193  182-263  182-385  226-356  GAL424
pGBT E2Fa 226-356   +      -      -       +       +        -        -        +      -      +        +        -        -
pGBT E2Fb-Rb        +      -      -       +       +        -        -        +      -      +        +        -        -
pGBT E2Fb 1-127     -      NT     NT      NT      NT       NT       NT       -      NT     NT       NT       -        -
pGBT DPa 1-292      -      NT     NT      NT      NT       NT       NT       -      NT     NT       NT       +        -
pGBT DPb 1-385      NT     NT     NT      NT      NT       NT       NT       -      NT     NT       NT       +        -
pGBT9               -      -      -       -       -        -        -        -      -      -        -        -        -
Different combinations between AtE2Fs bait and AtDPs prey constructs
were tested for growth on His.sup.- minimal media. -, no interaction;
+, positive interaction; NT, not tested.
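The deletion-mapping logic applied to these results can be sketched in Python. The sketch takes the outcomes recorded in Table V for the pGBT E2Fa 226-356 bait, intersects the residue ranges of all interacting fragments, and checks that no non-interacting fragment spans that shared region. The function names are illustrative, and the single-contiguous-region model is a simplifying assumption rather than a statement about the proteins.

```python
# Illustrative sketch of deletion mapping using the Table V results for the
# pGBT E2Fa 226-356 bait. True = growth on His- medium, False = no growth.

DPA_FRAGMENTS = {
    (1, 292): True, (1, 142): False, (42, 142): False, (42, 292): True,
    (121, 292): True, (121, 213): False, (172, 292): False,
}
DPB_FRAGMENTS = {
    (1, 385): True, (1, 193): False, (182, 263): True, (182, 385): True,
}


def shared_region(results):
    """Residue range common to every interacting fragment, or None."""
    binders = [frag for frag, binds in results.items() if binds]
    start = max(s for s, _ in binders)
    end = min(e for _, e in binders)
    return (start, end) if start <= end else None


def is_consistent(results):
    """True if no non-interacting fragment contains the whole shared region."""
    region = shared_region(results)
    if region is None:
        return False
    return all(
        not (s <= region[0] and e >= region[1])
        for (s, e), binds in results.items()
        if not binds
    )


for name, data in (("AtDPa", DPA_FRAGMENTS), ("AtDPb", DPB_FRAGMENTS)):
    print(name, shared_region(data), is_consistent(data))
```

Under these assumptions the shared region is residues 182-263 for AtDPb and residues 121-292 for AtDPa. For AtDPa this extends beyond the 121-213 fragment, which does not interact on its own.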
Example 13
RNA Isolation and Reverse Transcription-(RT)-PCR analysis of AtDP
and AtE2F Expression
[0347] A. thaliana (L.) Heynh. cell suspension cultures were
maintained as described previously (Glab et al. 1994, FEBS Lett.
17, 207-211). The cells were partially synchronized by the
consecutive addition of aphidicolin (5 .mu.g/ml) and propyzamide
(1.54 .mu.g/ml). The aphidicolin block was left for 24 hours. The
cells were washed for 1 hour in B5 medium before the addition of
propyzamide. Samples were taken at the end of the 24 hour
aphidicolin block, at the end of a 1 hour washing step, and at 1,
2, 3, and 4 hours after the addition of propyzamide to the culture
medium. Total RNA was isolated from the Arabidopsis cell suspension
culture according to Magyar et al. (1997), Plant Cell 9, 223-235,
and with the TRIzol reagent (Gibco/BRL) from different organs.
Semi-quantitative RT-PCR amplification was carried out on
reverse-transcribed mRNA, ensuring that the amount of amplified
product stayed in linear proportion to the initial template present
in the reaction. 10 .mu.l from the PCR was transferred onto
Hybond-N+ membrane, hybridized to fluorescein-labeled gene-specific
probes (Gene-Images random prime labeling module; Amersham
Pharmacia Biotech), detected with the CDP-Star detection module
(Amersham), and visualized by short exposure to Kodak X-OMAT
autoradiography film.
[0348] The following primer pairs (forward and reverse) were used
for the amplification: 5'-ATAGAATTCATGTCCGGTGTCGTACGA-3' (SEQ ID
NO:249, EcoRI site underlined) and
5'-ATAGGATCCCACCTCCAATCTTTCTGCAGC-3' (SEQ ID NO:250, BamHI site
underlined) for AtE2Fa (GenBank accession number AJ294533);
5'-ATAGAATTCGAGAAGAAAGGGCAATCAAGA-3' (SEQ ID NO:251, EcoRI site
underlined) and 5'-ATACTGCAGAGAAATCTCGATTTCGACTAC-3' (SEQ ID
NO:252, PstI site underlined) for AtDPa (GenBank accession number
AJ294531); 5'-GCCACTCTCATAGGGTTCTCCATCG-3' (SEQ ID NO:253) and
5'-GGCATGCCTCCAAGATCCTTGAAGT-3' (SEQ ID NO:254) for Arath;CDKA;1
(Genbank accession number X57839); 5'-GGGTCTTGGTCGTTTTACTGTT-3'
(SEQ ID NO:255) and 5'-CCAAGACGATGACAACAGATACAGC-3' (SEQ ID NO:256)
for Arath;CDKB1;1 (Genbank accession number X57840);
5'-ATAAACTAAATCTTCGCTGAA-3' (SEQ ID NO:257) and
5'-CAAACGCGGATCTGAAAAACT-3' (SEQ ID NO:258) for histone H4
(Genbank accession number M17132); 5'-TCTCTCTTCCAAATCTCC-3' (SEQ ID
NO:259) and 5'-AAGTCTCTCACTTTCTCACT-3' (SEQ ID NO:260) for ROC5
(AtCYP1, GenBank accession number U072676) (Chou and Gasser 1997,
Plant Mol. Biol. 35, 873-892); 5'-CTAAGCTCTCAAGATCAAAGGCTTA-3' (SEQ
ID NO:261) and 5'-TTAACATTGCAAAGAGTTTCAAGGT-3' (SEQ ID NO:262) for
actin 2 gene (GenBank accession number U41998) (An et al. 1996,
Plant J. 10, 107-121).
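Because underlining is not visible in a plain-text rendering, the restriction sites in the AtE2Fa and AtDPa primers can be located programmatically. The Python sketch below searches each primer for the recognition sequence of the enzyme named for it. The recognition sequences are the standard ones for EcoRI, BamHI and PstI; the dictionary keys and function name are illustrative.

```python
# Illustrative sketch: locate the named restriction sites in the AtE2Fa and
# AtDPa primers. Positions are 0-based offsets from the 5' end of each primer.

SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "PstI": "CTGCAG"}

PRIMERS = {
    "AtE2Fa forward (SEQ ID NO:249)": ("ATAGAATTCATGTCCGGTGTCGTACGA", "EcoRI"),
    "AtE2Fa reverse (SEQ ID NO:250)": ("ATAGGATCCCACCTCCAATCTTTCTGCAGC", "BamHI"),
    "AtDPa forward (SEQ ID NO:251)": ("ATAGAATTCGAGAAGAAAGGGCAATCAAGA", "EcoRI"),
    "AtDPa reverse (SEQ ID NO:252)": ("ATACTGCAGAGAAATCTCGATTTCGACTAC", "PstI"),
}


def site_positions(seq: str, site: str) -> list:
    """Return every 0-based start position of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


for name, (seq, enzyme) in PRIMERS.items():
    print(name, enzyme, site_positions(seq, SITES[enzyme]))
```

In each of the four primers the named site begins immediately after the three-nucleotide 5' extension.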
Example 14
The AtDPa and the AtE2Fa Genes are Co-Expressed in a Cell Cycle
Phase-Dependent Manner
[0349] The identification of the AtDPa in a yeast two-hybrid screen
as a gene encoding an AtE2Fa-associating protein indicated that it
might act cooperatively in the plant cells as a functional
heterodimer. To strengthen this hypothesis, we investigated whether
both genes were co-regulated at the transcriptional level.
Tissue-specific expression analysis revealed that both genes were
clearly up-regulated in flowers and were very strongly transcribed
in actively dividing cell suspension cultures (FIG. 57). Expression
in these tissues could reflect a correlation between the
actual proliferation activity of a given tissue and the transcript
accumulation, as can be seen from the Arath;CDKB1;1 gene. AtDPa
transcripts were also detectable in leaf and, to a lesser extent,
in root and stem tissues, whereas AtE2Fa transcripts were virtually
undetectable in roots and stem with only slight levels of
expression in leaf tissues. Cell cycle phase-dependent gene
transcription was studied using an Arabidopsis cell suspension that
was partially synchronized by the sequential treatment with
aphidicolin and propyzamide. The Arabidopsis histone H4 and the
Arath;CDKB1;1 gene were included to monitor the cell cycle
progression (FIG. 58) (Chaubet et al. 1996, Plant J. 10, 425-435;
Segers et al. 1996, Plant J. 10, 601-612). Bearing in mind the
partial synchronization of the culture, it can be observed that
histone H4 transcript levels peaked immediately after the removal
of the inhibitor and decreased gradually thereafter (FIG. 58). The
opposite expression pattern could be observed for the Arath;CDKB1;1
gene, illustrating that cells entered the G2-M phases with partial
synchrony. Within this experimental setting, the AtDPa and the
AtE2Fa genes show a very similar expression pattern. Both exhibit
higher transcript accumulation before the peak of histone H4 gene
expression and quickly decay in the following samples (FIG. 58).
The similarity in the expression patterns of Arabidopsis AtDPa and
AtE2Fa supports the possibility that they act cooperatively as a
heterodimer during the S phase.
Example 15
Transformation of Arabidopsis thaliana with CaMV35S::DPa
[0350] Arabidopsis plants were transformed (using the in planta
flower dip method; Clough and Bent, Plant J. 16:735-743, 1998) with
a construct containing the DPa gene under the control of the CaMV
35S promoter. The lines were molecularly analysed by northern
blotting. As can be seen in FIG. 59, all lines showed increased DPa
levels in comparison with the untransformed control. Generally, two
classes of lines were observed: weakly expressing (e.g., 16) and
strongly expressing (e.g., 23) lines (see FIG. 59). The plants are
subsequently analyzed for phenotypic alterations as described
herein.
EQUIVALENTS
[0351] Those skilled in the art will recognize, or be able to
ascertain using no more than routine experimentation, many
equivalents to the specific embodiments of the invention described
herein. Such equivalents are intended to be encompassed by the
following claims.
Sequence CWU 1
1
29011255DNAArabidopsis thaliana 1ccacatatcc gtgatgagga aactaagaaa
ccagactcag tttcaagtga agaaccagag 60acgattatca ttgatgtgga tgaaagtgat
aaagaaggag gtgactctaa tgagccaatg 120tttgtacaac atactgaagc
aatgctggag gagattgaac agatggagaa ggagattgaa 180atggaagatg
cagacaaaga agaagagcct gtgatcgata ttgatgcctg tgataagaat
240aatcctttgg ctgcggttga atatatccat gatatgcata ccttctacaa
gaattttgag 300aaacttagtt gcgtgcctcc taactatatg gacaatcaac
aagatcttaa tgagagaatg 360agaggaatcc tcattgactg gttaattgag
gtgcactaca agtttgaact gatggaggaa 420actctttatc tcacaatcaa
tgtcatcgac agattccttg cggttcatca aatcgtgagg 480aaaaagcttc
agcttgttgg tgttactgct ttgttgcttg catgtaaata tgaagaagtt
540tcagttccag tggtagatga tctcatcttg atctctgaca aagcttactc
tagaagagaa 600gtgctagata tggagaagct aatggccaac accttgcaat
tcaatttctc tctaccaact 660ccatatgttt tcatgaaacg atttctcaaa
gctgcccaat ctgacaagaa gcttgagatt 720ttatcattct ttatgatcga
gctttgcctt gtggagtatg agatgctaga gtatcttcca 780tctaagctgg
cggcctcagc aatctacact gctcagtgta cacttaaggg atttgaagaa
840tggagcaaaa cctgtgagtt tcacacaggc tacaacgaaa aacagctact
ggcatgtgcg 900agaaagatgg ttgctttcca tcacaaggca ggaacaggga
agctcacagg agttcacaga 960aagtacaaca catctaagtt ctgtcatgct
gcaagaactg aaccagctgg gtttctgatt 1020taatattaat aagaatctaa
tatgacttaa ctcgagtttt tctttagaac aaaaagagtg 1080tgagagaaag
agagatagta gagcaagttg cccaaaatgg gagaagaatg gatctttaga
1140tatcatggca agtagcccaa aaagagtgta ttcttctctt tctaaggtct
ttagatcttt 1200cttcacttga gagagaataa aaagaatctt ctgaaaaaaa
aaaaaaaaaa aaaaa 12552471DNAArabidopsis thaliana 2cccgattcgg
gtactgctgc tggtgggtca aactccgacc cgtttcctgc gaatcttcga 60gttcttgtcg
ttgatgatga tccaacttgt ctcatgatct tagagaggat gcttatgact
120tgtctctaca gagtaactaa atgtaacaga gcagagagcg cattgtctct
gcttcggaag 180aacaagaatg gttttgatat tgtcattagt gatgttcata
tgcctgacat ggatggtttc 240aagctccttg aacacgttgg tttagagatg
gatttacctg ttatcatgat gtctgcggat 300gattcgaaga gcgttgtgtt
gaaaggagtg actcacggtg cagttgatta cctcatcaaa 360ccggtacgta
ttgaggcttt gaagaatata tggcaacatg tggtgcggaa gaagcgtaac
420cgagtggaat ggttctgaac attctggagg aagtattgaa gatactggcg g
47131351DNAArabidopsis thaliana 3atggggaagg aaaatgctgt gtctcggcca
ttcactcgtt cccttgcctc tgctttgcgc 60gcttcagaag tgacttctac tacacagaat
caacagagag taaacacaaa aagaccagcc 120ttggaggata caagagccac
tggacccaac aagaggaaga agcgagcggt tctaggggag 180atcacaaatg
ttaactccaa tacagctata cttgaggcca aaaacagcaa gcagataaag
240aaaggacgcg gtcatggatt ggcgagtaca tcccagttgg caacttctgt
tacttcagaa 300gtcacagatc ttcagtccag gaccgatgca aaagttgaag
ttgcatcaaa tacagcagga 360aacctttctg tttctaaagg cacagataac
acagctgata actgtattga gatatggaat 420tctagattgc ctccaagacc
tcttgggaga tcagcttcta cagctgagaa aagtgctgtt 480attggtagtt
caactgtacc ggatatccca aaatttgtag acatcgattc agatgacaag
540gatcctttac tgtgctgcct ctatgcccct gaaatccact acaatttgcg
tgtttcagag 600cttaaacgca gaccacttcc ggactttatg gagagaatac
agaaggatgt cacccagtcc 660atgcggggaa ttctggttga ttggcttgtg
gaggtctctg aagaatacac acttgcatct 720gacactctct acctcacagt
gtatctcata gactggttcc tccatggaaa ctacgtgcaa 780agacagcaac
ttcaactgct cggcatcact tgcatgctaa ttgcctcgaa gtatgaggaa
840atctctgctc cacgcattga ggagttttgc ttcattacgg ataacaccta
cacaagagat 900caggtcctgg aaatggagaa ccaagtactt aagcatttta
gctttcaaat atacactccc 960actccaaaaa cgttccttag gagatttctc
agagcagctc aagcctctcg cctgagccca 1020agccttgaag tcgagtttct
agccagctat ctaacagagt tgacattaat agactaccat 1080ttcttaaagt
ttcttccttc cgttgttgct gcttcagcgg gttttctcgc caagtggaca
1140atggaccaat caaaccaccc atggaatcca acacttgagc attacacaac
gtacaaagca 1200tcggatctga aagcatctgt tcatgcctta caagatctgc
agcttaacac caaaggttgc 1260cccttgagcg ctatacgcat gaagtatagg
caagagaaat acaaatctgt ggcggttctc 1320acgtctccaa agctacttga
cacgctattc t 13514672DNAArabidopsis thaliana 4atggggaaga agtgtgattt
atgtaacggt gttgcaagaa tgtattgcga gtcagatcaa 60gctagtttat gttgggattg
cgacggtaaa gttcacggcg ctaatttctt ggtagctaaa 120cacacgcgtt
gtcttctctg tagcgcttgt cagtctctta cgccgtggaa agctactggg
180cttcgtcttg gcccaacttt ctccgtctgc gagtcatgcg tcgctcttaa
aaacgccggc 240ggtggccgtg gaaacagagt tttatcggag aatcgtggtc
aggaggaggt taatagtttc 300gagtccgaag aagatcggat tagagaagat
cacggtgacg gtgacgacgc ggagtcttac 360gatgatgatg aggaagaaga
tgaggatgaa gagtacagcg acgatgagga tgaggatgat 420gatgaggatg
gtgatgatga ggaagcggag aatcaagttg tgccgtggtc tgcggcggcg
480caagttcctc cggtgatgag ttcttcatct tctgacggag gaagcggagg
ttcagtgacg 540aagaggacga gggctagaga gaattcagat cttctctgct
ccgatgatga gatcggaagc 600tcttcagctc aagggtcaaa ctattctcgg
ccgttgaagc gatcggcgtt taaatcaacg 660gttgttgttt aa
67251287DNAArabidopsis thaliana 5atggttaact catgcgagaa caaaatcttc
gttaaaccca cttcaacgac gattcttcaa 60gatgaaacaa gaagtagaaa attcggacaa
gagatgaaga gggagaagag aagagtgttg 120cgtgtgatta accagaatct
cgctggtgca agagtttatc cttgtgttgt caacaagaaa 180ggaagcttat
tgtctaataa gcaagaagaa gaagaaggat gtcaaaagaa gaagtttgat
240tctttgcgtc cttcagttac aagatctgga gttgaggaag agactaacaa
gaagctgaag 300ccctcagttc caagtgctaa cgacttcggt gattgtatat
ttattgatga ggaggaagct 360acattggacc ttccaatgcc aatgtcgctt
gagaaaccat acattgaagc tgatccaatg 420gaagaagttg agatggagga
tgtaacagtg gaagaaccga tcgtggatat cgatgtctta 480gactcgaaga
actcgcttgc ggctgttgaa tatgttcaag atctttacgc attttacaga
540acaatggaga gatttagttg tgttccagta gactatatga tgcaacaaat
cgacttaaac 600gagaagatga gagcaatact aatcgactgg ttaatcgagg
tacatgacaa gtttgatctg 660atgaacgaga cactgtttct gacagtgaat
ctgatagata gattcttgtc caagcaaaat 720gttatgagaa agaagcttca
gcttgtaggg ttagtagctt tgctgttagc ttgtaagtat 780gaggaggttt
cggttcctgt tgtcgaagat ttagtactca tttcggacaa agcgtatacg
840aggaacgatg ttctagagat ggagaaaaca atgttgagta ctttgcaatt
caatatctcg 900ttaccgacac aatacccgtt cttgaaaaga ttcctcaagg
cagctcaagc agacaagaag 960tgtgaggtct tggcgtcgtt cttgatcgag
cttgcccttg tggagtacga gatgcttcgg 1020tttccaccat cattactagc
tgccacatct gtgtacactg ctcaatgtac acttgatggt 1080tccaggaaat
ggaacagtac atgtgaattc cattgtcatt actctgaaga ccagctcatg
1140gaatgttcac ggaagctggt gagtctgcat cagagggcgg cgacaggaaa
cttaacagga 1200gtatatagga agtacagcac aagcaaattt ggttacatag
caaaatgtga agctgcacac 1260tttctagtgt ctgagtctca tcattct
128761078DNAArabidopsis thaliana 6actaagcagg aggccaaagc tgctttcaag
tctcttttgg aatctgtaaa tgttcattcc 60gactggacat gggaacagac attgaaagag
attgttcacg ataaaagata tggtgctttg 120aggacactcg gcgagcggaa
acaagcgttt aacgagtatc ttggccaaag gaaaaaagtg 180gaagctgagg
aaagacgaag gaggcagaag aaagctcggg aagaatttgt caagatgcta
240gaggagtgtg aagaactttc atcatccctg aaatggagca aagcaatgag
tttgttcgaa 300aatgatcagc gttttaaagc tgttgaccgt cctagggatc
gtgaagatct ttttgacaat 360tacattgtgg aacttgagag gaaggaaaga
gaaaaggcag cggaggaaca tcggcagtat 420atggcagact atcggaagtt
tcttgaaacc tgtgactata tcaaagctgg tacacaatgg 480cgcaaaattc
aagatagact ggaggatgat gacagatgct catgtcttga aaagatagat
540cgtctgattg gttttgagga atacattctt gacctagaga aggaagaaga
agagctgaag 600agagtagaga aagaacatgt aaggcgggcc gagagaaaaa
accgtgatgc atttcgtaca 660ctattggaag aacatgttgc tgcaggcatc
cttacagcca agacgtactg gttggattat 720tgcattgagt taaaagactt
gccccaatac caagctgttg catctaatac atctggttca 780actccgaaag
acttgtttga agatgtcaca gaagaattag agaagcagta tcatgaggat
840aagagctatg tgaaggatgc tatgaagtca agaaagattt ccatggtctc
ctcgtggctg 900tttgaagatt ttaaatctgc tatttcagaa gatctcagta
ctcaacagat atcagacata 960aatttaaagc ttatatatga tgacttggtt
gggagagtga aggaaaaaga agaaaaagag 1020gccagaaagc ttcagcgtct
ggctgaagaa tttaccaatc tgttgcacac tttcaagg 1078
7 511 DNA Arabidopsis thaliana
7 caagagaaac cgtgggagaa tgatcctcac tactttaaac gagtcaagat
ctcagcgctc 60gctcttctta agatggtggt tcacgctcgc tctggtggta caattgaaat
aatgggtctt 120atgcaaggta agaccgatgg tgatactatc attgttatgg
atgcttttgc tttaccagtg 180gaaggtactg agacaagggt taatgctcag
gatgatgctt atgagtacat ggttgagtat 240tcacagacca acaagctcgc
ggggccggct ggagaatgtt gttggatggt atcactctca 300ccctggatat
ggatgctggc tctccggtat tgatgtttct acgcagaggc ttaaccaaca
360gcatcaggag ccatttttag ctgttgttat tgatcccaca aggactgttt
cagctggtaa 420ggttgagatt ggtgctttca gaacatactc taaaggatat
aaagccctcc agatgaacct 480gtttctgagt atcaaaacta ttcctttaaa t
511
8 1155 DNA Arabidopsis thaliana
8 agtagactca cctgattcaa cctccgacaa
catcttctac tacgacgata cttcacagac 60taggttccag caagagaaac cgtgggagaa
tgatcctcac tactttaaac gagtcaagat 120ctcagcgctc gctcttctta
agatggtggt tcacgctcgc tctggtggta caattgaaat 180aatgggtctt
atgcaaggta agaccgatgg tgatactatc attgttatgg atgcttttgc
240tttaccagtg gaaggtactg agacaagggt taatgctcag gatgatgctt
atgagtacat 300ggttgagtat tcacagacca acaagctcgc ggggcggctg
gagaatgttg ttggatggta 360tcactctcac cctggatatg gatgctggct
ctccggtatt gatgtttcta cgcagaggct 420taaccaacag catcaggagc
catttttagc tgttgttatt gatcccacaa ggactgtttc 480agctggtaag
gttgagattg gtgctttcag aacatactct aaaggatata agcctccaga
540tgaacctgtt tctgagtatc aaactattcc tttaaataag attgaggact
ttggtgttca 600ctgcaaacag tactattcat tagatgtcac ttatttcaag
tcatctcttg attctcacct 660tctggatcta ctatggaaca agtactgggt
gaacactctt tcttcttctc cactgctggg 720taatggagac tatgttgctg
gacaaatatc agacttagct gagaagcttg agcaagccga 780gagtcatctg
gttcagtctc gctttggagg agttgtgcca tcatcccttc ataagaaaaa
840agaggatgag tctcaactaa ctaagataac tcgggatagc gcaaagataa
ctgtggaaca 900ggtccatgga ctaatgtcgc aggtcataaa agatgaatta
ttcaactcaa tgcgtcagtc 960caacaacaaa tctcccactg actcgtcgga
tccagaccct atgattacat attgaagttg 1020ctcttctttt ggtttctagt
tttggattga cccatcattt gttgtccttt catttatttt 1080ctgttgtgta
aagaattata atgctaatca gaataataca gaagaagatt ttggttaaaa
1140aaaaaaaaaa aaaaa 1155
9 1308 DNA Arabidopsis thaliana
9 atgtattgct
cttcttcgat gcatccaaat gcaaacaaag aaaatatctc tacttcagat 60gtacaggaga
gttttgtacg aataacgaga tcacgagcta aaaaagccat gggaagagga
120gtatcaatac ctccaacaaa accttctttt aaacagcaaa agagacgtgc
agtacttaag 180gatgtgagta atacctctgc agatattatt tattcagaac
ttcgaaaggg aggcaacatc 240aaggcaaaca gaaaatgtct aaaagagcct
aaaaaagcag caaaggaagg tgctaacagt 300gccatggata ttctggtaga
tatgcataca gaaaaatcaa aattagcaga agatttgtcc 360aagatcagga
tggctgaagc ccaagatgtc tctctttcaa actttaaaga tgaagaaatt
420actgagcaac aagaagatgg atcaggtgtc atggagttac ttcaagttgt
agatattgat 480tccaacgtcg aagatccaca gtgttgcagc ttgtatgctg
ctgatatata tgacaacata 540catgttgcag agcttcaaca acgacccttg
gctaattata tggagcttgt gcagcgagat 600atcgacccag acatgagaaa
gattctgatt gactggcttg tagaagtttc tgacgactac 660aagctggttc
cagatacgct ttaccttaca gtgaatctta tcgaccggtt tctgtccaac
720agttacattg aaaggcaaag actccagctc cttggtgtct cttgcatgct
tatagcttca 780aaatatgaag agctttccgc accaggggtg gaggagtttt
gcttcattac ggccaacaca 840tacacaagac gagaagtgct gagcatggag
attcaaattc taaattttgt gcactttaga 900ttatcggttc ctaccaccaa
aacatttctg aggcggttca ttaaagcagc tcaagcttcg 960tacaaggtgc
ctttcattga actggagtat ttagcaaact atctcgccga attgacactg
1020gtggaatata gtttcctaag gttcctgcca tcactaattg ctgcttcagc
tgttttccta 1080gcccgatgga cactcgacca aactgaccat ccttggaacc
ctactctgca acactacacc 1140agatatgagg tagctgagct gaagaacaca
gttctcgcca tggaggactt gcagctcaac 1200accagtggct gtactctcgc
tgccacccgt gagaaataca accaaccaaa gtttaagagc 1260gtggcaaagc
tgacatctcc caaacgagtc acattactat tctcaaga 1308
10 1006 DNA Arabidopsis thaliana
10 agacttcaca ttttaccatt atttgctctg agctcagtag gagagttcaa
gaaacaatgg 60caaagatgca attatcaatc tttatcgctg tcgttgcgct tatcgtctgc
tctgcatctg 120ctaaaaccgc aagccctcca gctccagtgc tgccaccgac
accagctcca gcaccagccc 180cggaaaatgt gaatctcacc gagcttttaa
gtgtagctgg tccgttccac acattcctcg 240actaccttct ctcgactgga
gtcattgaga ctttccaaaa ccaagctaac aacactgagg 300aaggcatcac
aatctttgtc cctaaagatg atgctttcaa agctcagaag aatcctcctt
360tgtcaaatct cacaaaggat cagcttaagc agcttgttct cttccatgct
ctgcctcatt 420actattcgct ttcggaattc aagaacttga gccaatctgg
tccagtgagc acctttgctg 480gtggtcaata ctccttgaaa ttcactgatg
tttctggcac ggttaggatt gattctttat 540ggaccaggac taaagtcagc
agcagtgttt tctccactga ccctgttgcg gtttaccaag 600tgaaccgcgt
gcttctaccc gaagcaatct ttggtactga tgtccctcca atgcctgctc
660cagctcctgc tcctatcgtt agtgctcctt cggattctcc ttcagttgct
gattctgaag 720gagcttcttc accaaagtcc tcacacaaga actccggaca
aaagctgcta cttgcaccaa 780tctccatggt tatttccggt ttggtggcat
tgttcttgtg atcagatggt tttgcagatt 840gagttatgtt tttaagttac
aatgtgaaag attgtattac atcatttgaa ttgtcttttt 900gatttttgaa
acccattttt tattatacat ttttatcatt attattgttt gtcattacga
960ttgttgtgaa ttgaaattgt tcctccaaaa aaaaaaaaaa aaaaaa
1006
11 643 DNA Arabidopsis thaliana
11 atttatcatt acagtctgat ttgagctaag
ttctctcatc ataaactctc cttggagaat 60catggctatt tcaaaagctc ttatcgcttc
tcttctcata tctcttcttg ttctccaact 120cgtccaggct gatgtcgaaa
actcacagaa gaaaaatggt tacgcaaaga agatcgattg 180tgggagtgcg
tgtgtagcac ggtgcaggct ttcgaggagg ccgaggctgt gtcacagagc
240gtgcgggact tgctgctaca ggtgcaactg tgtgcctccg ggtacgtacg
gaaactacga 300caagtgccag tgctacgcta gcctcaccac ccacggtgga
cgccgcaagt gcccataaga 360agaaacaaag ctcttaattg ctgcggataa
tgggacgatg tcgttttgtt agtatttact 420ttggcgtata tatgtggatc
gaataataaa cgagaacgta cgttgtcgtt gtgagtgtga 480gtactgtatt
attaatggtt ctatttgttt ttacttgcaa gttttcttgt tttgaatttg
540tttttttcat atttgtatat cgattcgtgc attattgtat tatttcaatt
tgtaataaga 600ttatgttacc tttgagtggt tgtttaaaaa aaaaaaaaaa aaa
643
12 484 DNA Arabidopsis thaliana
12 aaggaagaag caggaatgta ttggggatac
aaagtacgat atgcatcaca attaagttca 60gtattcaagg aatgcccttt cgagggtggt
tacgattatt tgattggtac ctcggagcac 120ggcctggtaa ttagttcatc
tgagctgaaa ataccaacat ttaggcacct attgattgca 180tttggtggac
ttgctgggct tgaagaaagt attgaagatg ataatcagta taaggggaaa
240aacgttcgag atgtgtttaa tgtatacttg aatacttgtc cacatcaagg
tagccgaacc 300attcgagcag aggaagcgat gtttatatca cttcagtact
tccaggaacc catcagcagg 360gcagtgagaa gactttaagc ttcgataaaa
agagtcaaag aagctatttt gttctcatag 420atctgaggtt tgtctgaaaa
agagtgatgt aatgtaactg ttttagaaaa aaaaaaaaaa 480aaaa
484
13 688 DNA Arabidopsis thaliana
13 agatggggaa gaagaacaag agaagtcaag
acgagtctga gctcgaattg gagccagagc 60taacgaaaat aatcgatgga gactctaaaa
agaagaaaaa taagaataag aagaagagaa 120gccatgaaga tacggagata
gaaccggagc aaaagatgag tctcgacgga gactcgaggg 180aggagaagat
aaagaagaag aggaagaaca agaaccaaga ggaggagcca gagcttgtga
240cggagaaaac gaaagtccaa gaggaggaaa agggaaatgt agaagagggt
agagccactg 300ttagcatagc catagctggt tcaatcatcc acaacactca
atcacttgag ctcgccacac 360gcgtaatctc tctttctctc tatctctccc
ttcgtttctc tgtttttcca ttcccagata 420atttaaagtc cccttcttcc
atttctaaca tttctcagct cgccggccaa attgctcgtg 480cagctacaat
tttccgaatc gacgagatcg tagtgttcga caataagagc agctcagaaa
540tcgaatcagc tgctacgaat gcttctgata gcaatgaaag tggtgcctcc
tttctcgttc 600gtatcttgaa gtatctagag acaccacaat atttgaggaa
atctctcttc cccaagcaaa 660atgatcttag atatgtgggt atgttgcc
688
14 461 DNA Arabidopsis thaliana
misc_feature (396)..(396) n = a, c, g or t
14 gtcagtgctg tctggcatgg actgtatcct ggatacatta tattctttgt
gcaatcagca 60ttgatgatcg atggttcgaa agctatttac cggtggcaac aagcaatacc
tccgaaaatg 120gcaatgctga gaaatgtttt ggttctcatc aatttcctct
acacagtagt ggttctcaat 180tactcatccg tcggtttcat ggttttaagc
ttgcacgaaa cactagtcgc cttcaagagt 240gtatattaca ttggaacagt
tatacctatc gctgtgcttc ttctcagcta cttagttcct 300gtgaagcctg
ttagaccaaa gaccagaaaa gaagaataat gttgtctttt taaaaaatca
360acaacatttt ggttcttttc tttttttcca cttggnccgt tttatgtaaa
acaagagaaa 420tcaagatttg aggttttatt cttaaaaaaa aaaaaaaaaa a
461
15 862 DNA Arabidopsis thaliana
misc_feature (292)..(292) n = a, c, g, or t
15 ggtttttgaa tacatggaca ctgatgtcaa gaaattcatc agaagtttcc
gtagcactgg 60caagaacatt ccaacccaaa ctatcaagag cttgatgtat caactatgca
aaggtatggc 120attctgccat ggtcacggga tattgcacag agatctcaag
cctcacaatc tcttgatgga 180tcccaagaca atgaggctca aaatagcaga
tcttggttta gccagagcct tcactctgcc 240aatgaagaag tatacccatg
agatattaac tctttggtat agagctccag angnttcttc 300ttggtgccac
ccattactct acagctgtgg atatgtggnc tgttggctgc atatttgctg
360aacttgtgac caaccaagca atctttcagg gagactctga gctccaacag
ctcctccata 420ttttcaagtt gttgggacac ccaatgaaga aatgtggcca
ggagtgagca cactcaagaa 480ctggcatgaa tacccacagt ggaaaccatc
gactctatct ctgctgttcc aaacctcgac 540gaggctggag ttgatcttct
atctaaaatg ctgcagtacg agccagcgaa acgaatatca 600gcaaagatgg
ctatggagca tccttacttt gatgatctgc cagaaaagtc ctctctctaa
660ggatttaaaa tcttcagtta gtatctttcc aagttttatg gtttttctag
ttttgcttct 720ttcaagcata tctctagtgt gctgcttccc cctctatgaa
tcatcctttc tttagcataa 780tatatcactt ctgattgttg tttctttcta
ttcgaatatt tggattaacg gctttaatgt 840tcttaaaaaa aaaaaaaaaa aa
862
16 1114 DNA Arabidopsis thaliana
16 acccaaaaga aggatgagta tggagatgga
gttgtttgtc actccagaga agcagaggca 60acatccttca gtgagcgttg agaaaactcc
agtgagaagg aaattgattg ttgatgatga 120ttctgaaatt ggatcagaga
agaaagggca atcaagaact tctggaggcg ggcttcgtca 180attcagtgtt
atggtttgtc agaagttgga agccaagaag ataactactt acaaggaggt
240tgcagacgaa attatttcag attttgccac aattaagcaa aacgcagaga
agcctttgaa 300tgaaaatgag tacaatgaga agaacataag gcggagagtc
tacgatgcgc tcaatgtgtt 360catggcgttg gatattattg caagggataa
aaaggaaatc cggtggaaag gacttcctat 420tacctgcaaa aaggatgtgg
aagaagtcaa gatggatcgt aataaagtta tgagcagtgt 480gcaaaagaag
gctgcttttc ttaaagagtt gagagaaaag gtctcaagtc ttgagagtct
540tatgtcgaga aatcaagaga tggttgtgaa gactcaaggc ccagcagaag
gatttacctt 600accattcatt ctacttgaga caaaccctca cgcagtagtc
gaaatcgaga tttctgaaga 660tatgcaactt gtacacctcg acttcaatag
cacacctttc tcggtccatg atgatgctta 720cattttgaaa ctgatgcaag
aacagaagca ggaacagaac agagtatctt cttcttcatc 780tacacatcac
caatctcaac atagctccgc tcattcttca tccagttctt gcattgcttc
840tggaacctca ggcccggttt gctggaactc gggatccatt gatactcgct
gaccgagctt 900ctattcccaa attcttcaag aagaagaagt aatgatctaa
ttggtatact aaaaaattat 960acatctggtt tagtgttcaa ttgagagaga
ctgtaaaatc aattcatagg ccaacaaatg 1020tttgtttatc caattttcct
ttttattcga acttgatgcg atatttcaac ggaaacagaa 1080actattgttt
taaaccaaaa aaaaaaaaaa aaaa 1114
17 794 DNA Arabidopsis thaliana
17 aagatgcaac cgacagagac gtcgcagccg gcgccgtcgg atcaaggccg ccggcttaag
60gatcagttat cggagagtat gagcttcagt agccaaatga agaaggaaga cgatgagttg
120tcgatgaaag ctttgtcggc gttcaaggcc aaagaagagg agatcgagaa
gaagaagatg 180gagatcagag aaagagttca agctcagctt ggtcgtgttg
aagatgagtc caagcgtctc 240gctatgattc gcgaggaact tgaaggtttt
gctgatccca tgaggaagga agttactatg 300gtgaggaaga agattgattc
tctcgacaaa gaattaaagc cattggggaa tacagttcag 360aaaaaggaaa
cagagtacaa ggatgctctt gaagcattca atgaaaagaa caaggagaag
420gtggagctga tcaccaagct acaggagttg gagggagaaa gcgagaaatt
caggttcaag 480aagctggagg agctaagcaa gaacattgat ctaaccaaac
cttagtgttg gacgagcaga 540gtcgctggga tttggctatt caaagttcta
aaaaagtcac tttttagagt attttcattg 600ttcttttatg attctagtaa
tatatataat ttataaaata aaaagtaaga agatatgtgt 660ttgaactaga
tgttgcaaag aaaatgtaac aaagttacga tggcactaca ttatcgacgt
720gattggcaga attgtaatag taatgtaaag aaactatgtt tgttccggaa
aaaaaaaaaa 780aaaaaaaaaa aaaa 79418448DNAArabidopsis thaliana
18cagaaacaag ctccaggtgc aggtgatgtc ccagcaacaa tccaagaaga ggacgatgat
60gatgatgtcc cagatcttgt agtgggagag actttcgaga cccctgctac tgaagaggct
120cccaaagctg ctgcttctta gaggaggagg aagaagaagg agaagagctc
acctgcaaaa 180cccatcataa aaatgtttgt cgctcgacct cttctgagca
ctgtcagatt cttgttttct 240ctaatgcttg cgaacagaaa gacttggttt
tattatcact tgatgctttt tggtccgaac 300agcaattttc cttttattaa
ggttagatcg ctttttgttt accacctgtt caaatgagta 360ctactatgtc
ctgtcgcttc atacacttct tgcaacacag tcctttgttt tgagtcaaaa
420aaaaaaaaaa aaaaaaaaaa aaaaaaaa 448
19 1152 DNA Arabidopsis thaliana
19 atggaggacg acgacgagat tcagtcaatt ccatctccgg gagattcttc cctttcacca
60caagctcctc cttctccgcc gattttgcca acaaacgacg tgacggtggc cgtcgtgaag
120aaaccacaac cggggctttc ttctcaatct ccgtccatga acgctttagc
gttagtggtt 180catactcctt ctgtaaccgg tggtggtggt agcggaaaca
gaaacggacg aggaggagga 240ggaggaagcg gtggtggtgg aggaggaaga
gatgattgtt ggagcgaaga agctacaaag 300gttctaatcg aagcttgggg
agatcgattc tctgaaccag gtaaaggaac tttgaagcaa 360caacattgga
aagaagtagc tgagattgtg aacaagagtc gtcaatgcaa ataccctaaa
420actgatattc agtgtaagaa cagaattgat acggtgaaga agaagtataa
gcaagagaaa 480gctaagattg cttctggtga tggacctagt aaatgggttt
tcttcaagaa gcttgagagt 540ttgattggtg gtactacaac attcattgct
tcttcaaaag cttcagagaa ggctcctatg 600ggaggagctc ttgggaatag
ccgttcgagt atgtttaaac ggcaaactaa aggtaatcag 660attgtgcagc
aacaacaaga gaagagaggc tctgattcga tgcggtggca ttttaggaaa
720cgtagtgctt ctgagactga gtctgagtct gatcctgaac ctgaggcttc
tcctgaggaa 780tctgctgaga gtctcccacc tttgcaaccg attcaaccgc
tttcgtttca tatgccaaag 840cggttgaagg tggataagag tggaggtgga
gggagtggag ttggagatgt ggcgagggcg 900atacttggat ttacggaagc
ttatgagaag gcggaaactg ctaagcttaa gttaatggcg 960gaactggaaa
aggagaggat gaaatttgct aaagagatgg agttgcagag aatgcagttc
1020ttgaaaactc aattggagat aacacagaac aatcaagaag aggaagagag
gagcaggcag 1080cgaggagaaa ggaggatcgt tgatgatgat gatgatcgca
atggcaagaa taacggcaat 1140gtaagtagct ga 115220409DNAArabidopsis
thalianamisc_feature(201)..(201)n = a, c, g, or t 20cctccttctc
cacgcttctt cttcttcttc ctcaatctct cttacgattc cttcaaatca 60ttcttccatg
gccaccgtat cttcttcctc ctggccaaac cccaacccta atcccgattc
120cacgtctgcc tcagattccg attctacttt tccctctcac cgcgatcgcg
tagacgaacc 180cgactctctc gattccttct nctccatgag tcttaactcc
gacgaaccta atcagacttc 240taatcaatcg cctctttctc cccctacgcc
caatttaccg gtgatgcctc ctccgttcgt 300gctttatctt tcctttaacc
aagatcatgc ttgcttcgcc tgtnggcact gaccgtggct 360ttacngatnc
ttaattgcga tccctttcgc gagattttcc ggcgggatt 409
21 758 DNA Arabidopsis thaliana
21 gtcaggctca tgattccaga atagcttgct tcgctctcac gcaggatggc
catttgttgg 60ccactgctag ctctaagggt actctggttc ggatcttcaa tactgttgat
ggtaccttgc 120gtcaagaggt aaggaggggt gcggatagag cagagatcta
cagtttggcc ttctcttcaa 180atgctcagtg gttagctgtc tcaagtgaca
aaggaacggt ccatgtcttt ggtctcaaag 240tcaactccgg atctcaagtg
aaagactcat cccgaattgc acctgatgct actccctcat 300ccccatcgtc
gtctctgtct ttattcaaag gagtgttacc gaggtatttc agctcggagt
360ggtcggtggc tcagttcagg ttggttgaag gaactcagta catagccgcc
tttggccatc 420aaaagaacac cgttgttatt cttggcatgg atgggagctt
ctacagatgc cagtttgatc 480cggtgaacgg cggtgaaatg tctcagcttg
agtaccacaa ctgtctgaaa ccgccttcag 540ttttctaaaa gctttactac
ttatactctt ttgttccttc tctctcttta tatctctctg 600caacttaagc
ggtgagatat ggtgtatagt tttgtgtata taataatgat gggtcgtcct
660ataatttgta aaacctttta tcgctacccg ggtcgactct agagccctat
agtgagtcgt 720attactgcag agatctatga atcgtagata ctgaaaaa
758
22 624 DNA Arabidopsis thaliana
22 atggactttt gtgaggtatg cccggaaaag
cttccaaact atgaagtgaa agtgaagagc 60tttttcgaag aacatttaca cactgatgag
gagatccgtt actgcgttgc aggaactggt 120tactttgatg tgagagatcg
taatgaagct tggattaggg tattggtaaa gaagggaggt 180atgatagtct
tacctgctgg gatctatcat cgcttcactg tggactctga caactatatc
240aaggcaatgc ggctattcgt gggtgaaccg gtatggacac catacaatcg
cccacacgac 300catcttcctg caaggaaaga atatgtcgat aacttcatga
tcaatgcctc ggcttagaga 360gcttcctctc tctatatctg gctttctgaa
acaaggatct ataaacaagg cctacaataa 420agaaagcttt cctgtcaagt
attggatatt tatatgtatt cctgtgtaga atgatggctt 480ttggtatgct
tgagttgttg taaacttagt tacactctct gatatgtctc tctttaccat
540ctttgtcgta tcccatatac gaaaagatta cattgggatt catattgtct
tacgttcgtt 600cctatgtgca atatgttgag tttt 62423495DNAArabidopsis
thaliana 23ccagttttcc gatcactcgc aagaaaaccc taaaaatgga tggtcatgat
tctgaggata 60ctaagcagag cactgctgat atgactgctt ttgtccaaaa tcttctccag
cagatgcaaa 120ccaggttcca gacaatgtcg gactccatca tcacaaagat
tgatgacatg ggaggcagaa 180tcaatgagct ggagcaaagc atcaatgatc
taagagccga gatgggagta gaaggcactc 240ctcctccagc ctccaaatca
ggcgatgaac ccaaaacacc ggctagttcc tcttaaaaag 300gaatgtggtg
ttcattgaca tgtccgaagg aaaaagaaaa actatgaaat atgttaagag
360cagtattact tttaaaattc ctgtttaaga aacgagtttg ttgtttatta
aagttcatca 420aatagattga tgatgtggtg cattacatta ttctccacct
atgaattgca tttctatttt 480ggtctaaaaa aaaaa 49524580DNAArabidopsis
thaliana 24cgcgcaggta cgagcaaaaa tgctcaaaga agttgccacg gagaagcaaa
ccgccgtgga 60cactcatttc gcaaccgcta aaaagcttgc tcaagaagga gacgcgttgt
tcgttaaaat 120cttcgcaatc aagaaactgt tggcgaaact tgaagcagag
aaagaatctg ttgatggaaa 180gtttaaggag actgtgaaag aactttctca
tcttctggct gatgcttctg aggcttacga 240agagtatcat ggcgcggtga
ggaaggcgaa agacgagcaa gcggctgagg aatttgcgaa 300agaggcgacg
caaagtgcag agatcatttg ggttaagttt cttagttctc tttagagaac
360aattgagatt cttggttgtg ttaagagcaa atctagagct cttgttggtt
cttgttatgt 420attttgtgat gatgttctgt ttcagagttt gtgtgttggt
tgtatcagga gaaagaggct 480gggagataga gagaaagaga gtctctgcga
aaactaataa tgttttttca gatatctaaa 540taataagctt tttacaaaaa
aaaaaaaaaa aaaaaaaaaa 580
25 656 DNA Arabidopsis thaliana
25 cggccgcgtc
gacgcttgag agattcctct ggctaaaccc agatggagtt tggatctttt 60cttgtgtcct
tagggacatc ttttgttatc ttcgtcattc tcatgcttct cttcacctgg
120ctttctcgca aatctggaaa tgctcccatt tattacccga atcggatcct
taaagggctg 180gagccatggg aaggcacctc cttgactcga aacccttttg
cttggatgcg tgaagctttg 240acttcctctg aacaagatgt cgttaactta
tccggcgtcg atactgctgt ccactttgtc 300ttcttgagca ctgttctggg
gatatttgct tgttccagtc ttcttctcct accaactcta 360ctgcctctag
ccgctacaga caacaacata aagaacacaa agaatgcgac agataccaca
420agcaaaggaa cttttagcca acttgataat ctatcaatgg ctaacatcac
aaaaaaaagt 480tcgaggctgt gggcgttcct aggagctgtt tactggatat
ctttggtcac atatttcttc 540ttgtggaaag cttataagca tgtctcttca
ttgagagctc aagctctgat gtctgctgat 600gtaaaacccg agcaattcgc
tattcttgtt agggatatgc ctgcaccacc tgacgg 656
26 985 DNA Arabidopsis thaliana
26 gttcacactc cggctggtga actgcaaaga cagattaggt catggcttgc
agaaagtttt 60gagtttctct ctgttacagc agatgatgtt tcaggagtaa ccactggcca
attagagctt 120ctttccacag caattatgga tggctggatg gctggagtag
gagctccggt gcctcctcac 180acagacgctt taggacagct tgtgtctgag
tatgcaaagc gagtctacac ttctcagatg 240cagcatctaa aggatattgc
cggtactttg gcttcggaag aggcagaaga tgctggtcaa 300gtcgcgaagc
ttcgatcagc tctcgagtct gttgaccaca aaagaagaaa gattttgcaa
360caaatgagaa gtgatgcagc tttgtttacc ttggaagaag gcagttcccc
tgttcaaaat 420ccatctacag cagccgaaga ctcgagatta gcctccctca
tttctctgga tgccatactg 480aagcaagtca aggaaataac aagacaagcc
tctgtccacg ttttgagtaa aagcaagaaa 540aaggcattgc ttgagtctct
tgatgaactt aacgaacgaa tgccttctct gcttgatgtt 600gatcatccat
gtgcacagag agaaattgat acggctcacc agttggtcga gacaattcca
660gaacaagagg acaatcttca agacgaaaag agaccttcaa tagattcaat
atcttcgact 720gaaaccgatg tgtctcaatg gaatgttttg caattcaaca
caggaggctc ttcagctcca 780ttcatcataa aatgcggagc taactccaac
tcagagctcg tgatcaaagc ggatgcccgt 840attcaagaac ctaaaggagg
cgaaatagtg agagttgtgc caagaccttc ggttttagaa 900aacatgagct
tagaggaaat gaaacaagtg tttggtcagt tgcccgaagc tctaagttca
960ctggccttag ctagaacagc tgatg 985
27 527 DNA Arabidopsis thaliana
misc_feature (512)..(512) n = a, c, g, or t
27 acttatgaga
ggttaccgat tgaggaagaa caacagcaag agcagccgct tcaactagaa 60gatgggaaga
agcagaaaga agagaatgat gataacgaga gtgggaataa cggaaacgaa
120ggatcgatgc agccgccgat gtataatatg cctcctaatt ttatcccaaa
tggtcatcaa 180atggctcaac acgacgtgta ttggggtggt cctccgcctc
gtgctcctcc ttcgtattga 240ttaagttaga taggcggtgg ttggtgcgtt
ctttttactg gaatgattat attttccatt 300aggatgggta ggcttttgtt
attaaagcta tcaagtttct ttttttttac ggataattcg 360gatgacaatt
agctagtgtt tgtttgtttg ttttgtggcc ggcttttctg cttgactatt
420ttgatcgcgg atagctttgt atgaaagtga attgattgta gaatcgtctt
ttgaattttg 480atgttggaaa aaaccaagca atggtgtgtg gnctttgcaa tggaagc
527
28 610 DNA Arabidopsis thaliana
misc_feature (482)..(482) n = a, c, g, or t
28 atcaaaagct agagtcttgg ccattcctga tgatctagca aatgtgtcat
gcggtgtgga 60acagattgaa gaactgaaag gattgaacct tgttgagaaa gatggtggtt
catcttcttc 120tgacggggct aggaacacta atcctgaaac tagaaggtac
agtggttcct tgggtgtaga 180ggatggagcc tatactaatg agatgctcca
gtccatagag atggttactg atgtgctgga 240ctctcttgtg aggagggtta
cagtagcaga atctgagtct gctgtcaaaa ggagagggca 300cttttgggag
aggaaagaaa tcagtaggaa agactatcca aatcgaaaat ttgtccgtga
360agttagaaga gatggaacga tttgcttatg ggactaatag tgttctaaac
gaaatgcggg 420aaaggattga ggaattagtt gaagagcgat gaggcagagg
gaaaaagctg tggaaaacga 480anaggagttg tgtnntgtga agagagagtc
gagtcnttaa aagctcctca gtactttacc 540atgtcgagaa cactctttcn
ccggagncat tcaaaccatg aggagtnttt gacggtggca 600ctaaacnccg
610
29 546 DNA Arabidopsis thaliana
29 atgaccaata tcgccatggc tgatgctctc
aaatctcttg agattgttga tggtcttgat 60gaatacatga atcaatctga atccagtgct
ccgcattctc caaccagtgt agcaaagctg 120ccaccaagca ctgcaactag
aacaactcga cggaagacca caacaaaagc tgagcctcag 180ccatcatctc
agttggtgtc ccgttcttgt cgttcgacga gcaagtctct tgctggagat
240atggaccagg aaaacataaa caagaatgtt gctcaagaaa tgaagactag
caatgtcaag 300tttgaagcca atgtgctcaa aactccagca gcaggaagca
caaggaaaac ttcagcagca 360acttcttgca ctaagaagga tgaattggtc
cagtcggtct acagcactag gagatcaacc 420aggctgttag agaaatgtat
ggccgatctg agtttgaaga ctaaagaaac tgtggataat 480aaacctgcca
agaatgaaga tacagaacag aaagtatctg cacaggagaa gaatctaact 540ggttag
546
30 492 DNA Arabidopsis thaliana
30 atgctgatgc tgtgtgggtt cacggtcttg
gatatgctaa agcaccacga ccttgggaag 60atccgagcac ccttgcatcc tctcagaaag
aagatgcaga ttcagcacgc ttaccagcag 120atacatcagg ggtcaaaact
gttgaagatg gaccggatga tgttgagagg gaccaaaaga 180aggataggcg
tgaggaaagg aaacctgcaa agagagagaa ggaagaaaga catgataggc
240gtgaaaaacg cgaaaggcat gagaagcgaa gcgctcgtga ttcagatgat
agaaagaagc 300acaagaaaga gaagaaggag aaaaaaagaa ggcatgactc
tgattctgat tgaagcgaat 360tgtcccagga tggaacattt tgctcttcag
aggaagagtg gtcggctagg taccaaaatc 420cagctaccac ttctgcaaga
tttaaatctg ttgcttattt catttacgaa tcgtggagta 480aagtgttgtt ga
492
31 723 DNA Arabidopsis thaliana
misc_feature (559)..(559) n = a, c, g, or t
31 gcaaaagaga gaaacatctg acccggaatc tgacctgaaa acccggaaga
atcgaaaaat 60ggggaaagat ggtctgagcg acgatcaggt ctcgtcgatg aaggaagcct
tcatgctctt 120cgacaccgat ggcgacggca aaatcgcacc gtcagagctc
gggatcctca tgcgatctct 180cggcggaaac ccgacccaag cccagctgaa
atccataatc gcatccgaga atctctcttc 240accgtttgat ttcaacagat
tcctcgatct catggcgaaa catctgaaga cggaaccttt 300cgatcgccag
ctccgtgacg cattcaaagt gctcgataag gaaggtaccg ggttcgttgc
360tgtggcggat ctgaggcata ttctgaccag tatcggagag aagctggagc
ctaatgagtt 420cgatgagtgg atcaaggagg tggatgttgg atccgatgga
aagatccggt atgaagattt 480catagcaagg atggttgcta agtgagatct
aatcttttat gttttgaaag ttgaaatttt 540taagaagaga ttcttttgng
gttttttcac ttggttggtt tgatttcgag cgaatcctaa 600ctaggggttg
gtttatcatt gnggaatttg cttactaact ttggcttctt catggttggg
660tttcaatttt taatggnaaa tggtggctgg gggaattcct aaaaaaaaaa
aaaaaaaaaa 720aaa 72332344DNAArabidopsis thaliana 32cgcggagtct
cctttcgatc aagagaaatg cgtccgattt ttgcaatctc tcagagaatg 60cgttctatca
aagaaagtaa agaagttctc gataccgagt caagatcacg actctgaggg
120agcagcttca gctacaaaga gaccttcata acgttctttg ttccgatttt
cttttatcgt 180ttgagttgta atcatgtaat tgattttaat gtcatgcctt
ggattcataa gctgggtcat 240gccttgtttc ccctttgttg tcttgtatgt
tgaatattgc aaactctaaa gagcatattt 300ataagaagaa ataaaagttt
ctacaaaaaa aaaaaaaaaa aaaa 344
33 1131 DNA Arabidopsis thaliana
33 atgacaacta ctgggtctaa ttctaatcac aaccaccatg aaagcaataa taacaacaat
60aaccctagta ctaggtcttg gggcacggcg gtttcaggtc aatctgtgtc tactagcggc
120agtatgggct ctccgtcgag ccggagtgag caaaccatca ccgttgttac
atctactagc 180gacactactt ttcaacgcct gaataatttg gacattcaag
gtgatgatgc tggttctcaa 240ggagcttctg gtgttaagaa gaagaagagg
ggacagcgtg cggctggtcc agataagact 300ggaagaggac tacgtcaatt
tagtatgaaa gtttgtgaaa aggtggaaag caaaggaagg 360acaacttaca
atgaggttgc agacgagctt gttgctgaat ttgcacttcc aaataacgat
420ggaacatccc ctgatcagca acagtatgat gagaaaaaca taagacgaag
agtatatgat 480gctttaaacg tcctcatggc tatggatata atatccaagg
ataaaaaaga aattcaatgg 540agaggtcttc ctcggacaag cttaagcgac
attgaagaat taaagaacga acgactctca 600cttaggaaca gaattgagaa
gaaaactgca tattcccaag aactggaaga acaaagaaat 660gagcacttat
atagctcagg aaatgctccc agtggcggtg ttgctcttcc ttttatcctt
720gtccagactc gtcctcacgc aacagtagaa gtggagatat cagaagatat
gcagctcgtg 780cattttgatt tcaacagcac tccatttgag ctccacgacg
acaattttgt cctcaagact 840atgaagtttt gtgatcaacc gccgcaacaa
ccaaacggtc ggaacaacag ccagctggtt 900tgtcacaatt tcacgccaga
aaaccctaac aaaggcccca gcacaggtcc aacaccgcag 960ctggatatgt
acgagactca tcttcaatcg caacaacatc agcagcattc tcagctacaa
1020atcattccta tgcctgagac taacaacgtt acttccagcg ctgatactgc
tccagtgaaa 1080tccccgtctc ttccagggat aatgaactcc agcatgaagc
cggagaattg a 1131
34 631 DNA Arabidopsis thaliana
34 agagtatctg
aagaaagggt caccaataag cgcgctcaaa agtttcatct cgtctctctc 60tgaacctcct
caagacatca tggacgcact cttcaatgct ctctttgatg gtgtgggaaa
120gggattcgcc aaagaagtga ctaagaagaa gaattactta gcggctgctg
caacaatgca 180agaggatgga tcacagatgc atctgctcaa ttcgattggg
acattctgtg gaaagaatgg 240aaacgaagaa gctttgaaag aggtggctct
ggttcttaaa gcattgtacg accaagacat 300cattgaggaa gaggtagtgt
tggattggta cgaaaagggt ctcaccggag ctgacaaaag 360ctcgccggtt
tggaagaatg ttaagccttt tgtggagtgg cttcagagcg ctgagtctga
420gtccgaagag gaggattgag tcactttttt cttccctcct aacttttctt
tgcggcattt 480cttataatac ttcgtcagtt ttcagaattc ttaaatcttt
ttgctgtgtt cttataaaga 540aacatcatct attaaagttg tcttcgtttg
gatttggttt tgacgacttt gggaaatatt 600tatgtttaag aaaaaaaaaa
aaaaaaaaaa a 631
35 1333 DNA Arabidopsis thaliana
35 gctggaggta
gagaggaatg cgtctgctgt tgctgccagt gaaacaatgg cgatgatcaa 60taggttgcat
gaggagaaag ctgcgatgca gatggaagcg ttgcagtatc agagaatgat
120ggaggagcaa gctgagtttg atcaagaagc tttgcagttg ttgaatgagc
ttatggtgaa 180tagagagaag gagaatgctg agcttgagaa ggagctagag
gtgtatagaa agagaatgga 240ggagtatgaa gctaaagaga aaatggggat
gttgaggagg agattgagag attcctctgt 300tgattcgtat agaaataatg
gcgattctga tgagaatagc aatggagagt tacagtttaa 360gaacgttgaa
ggggttacgg attggaaata tagagagaat gagatggaga atacgccggt
420ggatgttgta cttcgtcttg atgagtgttt agatgattat gatggagaga
ggctttcgat 480tcttgggaga ttgaagtttc ttgaagagaa actcacagat
cttaataacg aagaggacga 540cgaggaggag gctaaaacgt ttgagagtaa
tggtagcatc aatggaaatg agcatattca 600tggcaaagaa acaaacggga
agcacagagt tatccagtca aaaagattac ttcccctgtt 660tgatgcggtc
gatggagaga tggaaaacgg gttaagtaac ggaaaccatc acgaaaacgg
720gtttgatgat tcggagaagg gtgagaatgt gacgatagaa gaagaagtgg
atgagcttta 780cgagaggtta gaagctctag aggcagatag agagttctta
agacattgtg ttggttcatt 840gaaaaaagga gacaaaggtg tacatctcct
ccatgagatt ctgcaacatc ttcgtgatct 900aaggaatatc gatcttactc
gcgtcagaga aaacggagac atgagtttat gagtttgatt 960ttgagttttg
ggtttgagtc cactctttgc atagtgaccc aaagaacaag aaaaatcata
1020caggtatgga agtgacatgt tgcttgtgag gcaaggaaca acgacaaggt
ttcagatgaa 1080gaagaaaacg ttctcagaat aaaagtattt taagtatata
ctctgaggaa aagtgtcaga 1140tcagaatgtt cgtctttctt cgttcatttt
cattattata agttttgttt tttatattga 1200agatttattt agagagaggg
aagtgtcagt ataatttcac ttttatattt tatatttggg 1260agttgtcttt
atgagtggtg gtaatagaaa aaggtagaat gatgagtgaa gaaaaaaaaa
1320aaaaaaaaaa aaa 1333
36 2289 DNA Arabidopsis thaliana
36 cttatgcaaa
ctctcagcag attctgatgc caatgtccaa agtgctgctc atcttttgga 60tcgccttgtt
aaggatattg tgacggaaag tgatcagttc agtattgagg aattcatacc
120tcttttaaaa gagcgaatga acgttctcaa cccttacgtc cggcaatttc
tggttggatg 180gatcactgtt cttgatagtg ttccagacat tgacatgctt
gggtttctgc cagactttct 240cgatgggtta ttcaatatgt tgagcgactc tagtcatgaa
atacgacagc aagctgattc 300agctctttca gagtttcttc aagagataaa
aaattcacca tctgtagatt atggtcgcat 360ggctgaaata ctggtgcaga
gggctgcttc tcctgatgaa ttcactcgat taacagccat 420cacgtggata
aacgagttcg taaaacttgg gggagaccag ctcgtgcgtt attatgctga
480cattcttggg gctatcttgc cttgcatatc tgacaaagaa gagaaaatca
gggtggttgc 540tcgtgaaacc aatgaagaac ttcgttcaat ccatgttgaa
ccctcagatg gttttgatgt 600tggcgcaatt ctctctgttg caaggaggca
gctatcaagt gagtttgagg ctactcggat 660tgaagcattg aattggatat
caacactttt aaacaagcat cgtactgagg tcttgtgctt 720cctgaatgac
atatttgaca cccttctaaa gcactatctg attcttctga tgacgtggtg
780ctcttggttc tggaggttca tgctggtgta gcaaaagatc cacaacactt
tcgccagctc 840atcgtatttc ttgtccacaa tttccgagct gataattctc
ttttggaaag gcgcggtgcc 900cttattgtcc gaagaatgtg tgtacttttg
gatgccgaaa gagtctaccg agagctctct 960acaattcttg agggagaaga
taatcttgac tttgcttcta ccatggttca ggcattgaat 1020ttgattttgc
ttacttcccc ggagttatcg aaactgagag aactattaaa aggttcactc
1080gtcaatcgcg aagggaaaga acttttcgtt gccttgtata cttcatggtg
ccattcaccc 1140atgggcaatt ataagcctct gcttattagc tcaggcttac
caagcatgcg agtgtcgtaa 1200tccaatcctt ggtaaaagaa gacattaacg
tccaaatttc ttaggccagc ttgataaaat 1260tgatccggct tctggaaact
ccaatcttta cttaccttag attgcagctt ctggaaccag 1320gaaggtacac
atggttgctg aaaacacttt atggtcttct tatgttactt cctcagcaaa
1380gtgcggcgtt caagatactt aggacaagac tcaaaactgt gccaacgtac
tcattcagta 1440ctggaaacca aataggcaga gcaacttcag gagttccttt
ctctcagtat aagcatcaaa 1500acgaggacgg tgacttagaa gacgataaca
tcaacagttc tcaccaagga atcaattttg 1560ctgtgcggct acaacagttc
gaaaacgtac agaatctaca tcgtggccag gcaaggacta 1620gagtgaacta
ctcatatcac tcttcctctt cttctacatc aaaggaggtg aggagatctg
1680aagaacaaca acagcagcag cagcaacaac aacagcaaca acaacaacaa
caacgaccac 1740caccttcttc gacatcatca tcagttgcag ataacaatag
acctccatca agaacttcaa 1800gaaaaggccc tggtcaatta cagctttaac
ctacctggta atcataaata ataaataata 1860ttccatcccc gacaatcatc
atcttcatct tctttgtgtg gacaccaccg atcccttttg 1920tctcctgtaa
aattgtatat ctctcttttt tagtaactct tcaagtttcg acggaacttg
1980tggaaaagct acggtcgtgt ccatcatctc tttctctctg tcgggttttt
tttatttacg 2040agagattctt cttcagtccc tcagtctacc tttatattgt
ttttttgggg gtttctcgtt 2100tctttgaatt tgtttcattg tttggagctt
tttatatttt taccttatgt ggagatgtaa 2160gaaaaagaag tgatcatgtg
gttttgtgtt gtttttttat aactggaaaa ccacatgagt 2220ttgtagaggt
cacttattgg atattttatg tcaaatgatg ctccttttta caaaaaaaaa
2280aaaaaaaaa 2289
37 1094 DNA Arabidopsis thaliana
37 cttgattgaa
acttcagttg aatccaagga aacgactgaa tcagtggtta caggtgaatc 60ggagaaagcg
attgaagata tttcaaaaga agctgacaat gaggaggatg atgatgagga
120ggaacaagaa ggagatgagg atgatgatga aaatgaagag gaagaagtgg
ttgttccaga 180aactgagaat cgagcagaag gagaagattt agtgaagaat
aaggcagctg acgcgaagaa 240gcatcttcaa atgattggag tccaactctt
gaaagaatcc gatgaagcaa acagaacaaa 300gaaacgtggg aagagggcat
ctcgtatgac acttgaggat gatgcagatg aggattggtt 360ccctgaggaa
ccatttgaag cattcaaaga aatgagggaa agaaaagtgt tcgatgtggc
420tgacatgtat acaatagcag acgtttgggg ttggacatgg gagaaggatt
ttaagaacaa 480aactccaagg aaatggtcac aagagtggga agtcgagttg
gcaattgtgc tcatgacaaa 540ggtgattgaa ttgggtggaa ttccaacgat
tggtgattgt gcagtgatat tacgagctgc 600tttaagagct cccatgcctt
cagccttctt gaagatcttg cagacgacac acagtcttgg 660ctactcattt
ggcagcccgt tgtacgatga gatcatcaca ttgtgtttgg accttggaga
720acttgatgca gccatcgcca tagttgcaga tatggaaacc acagggatca
ctgtccctga 780tcaaaccctt gacaaggtca tatctgctag acaatctaat
gagagtccgc ggtctgagcc 840tgaagagcca gcatcaacag taagctctta
gttatcatat cctcttctgc ttgttgtgaa 900gtctctataa gaaacagaaa
tcggtagaag gagctgaatc tgtcttagtt atgaaagttt 960tgttcattat
aagtacaagt catgtagttc cgagtgtaga acagttttta ctagtgttgc
1020accaggtccc tccagtctga tacttaattc tttagtgttg gatctttcta
tataagaaaa 1080aaaaaaaaaa aaaa 1094381204DNAArabidopsis thaliana
38aaccgattca gcctccgaca gtatattcca ctacgacgac gcttcacaag ccaaaatcca
60gcaggagaag ccatgggcct ccgatcctaa ctacttcaag cgcgttcaca tctcagccct
120tgctcttctc aagatggtgg ttcacgctcg ctccggtggc acaatcgaga
tcatgggtct 180tatgcagggt aaaaccgagg gtgatacaat catcgttatg
gatgcttttg ctttgcctgt 240tgaaggtact gagactaggg ttaatgctca
gtctgatgcc tatgagtata tggttgaata 300ctctcagacc agcaagctgg
ctgggaggtt ggagaacgtt gttggatggt atcactctca 360ccctgggtat
ggatgttggc tctcgggtat tgatgtttcg acacagatgc ttaaccaaca
420gtatcaggag ccattcttag ctgttgttat tgatccaaca aggactgttt
cggctggtaa 480ggttgagatt ggggcattca gaacatatcc agagggacat
aagatctcgg atgatcatgt 540ttctgagtat cagactatcc ctcttaacaa
gattgaggac tttggtgtac attgcaaaca 600gtactactca ttggacatca
cttatttcaa gtcatctctc gatagtcacc ttctggatct 660cctttggaac
aagtactggg tgaacactct ttcttcttcc ccactgttgg gcaatggaga
720ctatgttgcc gggcaaatat cagacttggc tgagaagctc gagcaagcgg
agagtcagct 780cgctaactcc cggtatggag gaattgcgcc agccggtcac
caaaggagga aagaggatga 840gcctcaactc gcgaagataa ctcgggatag
tgcaaagata actgtcgagc aggtccatgg 900actaatgtca caggttatca
aagacatctt gttcaattcc gctcgtcagt ccaagaagtc 960tgctgacgac
tcatcagatc cagagcccat gattacatcg tgaagttggt ctattctttt
1020gttttttggc tgcggaaatt gactatcggt ttgacccggt ttatgaggca
atgcccattg 1080ttccctatat ctctagtgta gtatctgctt cagacaaaga
tctttgggtt attaaatgac 1140attaacataa atcgatcatt atgtttttgc
gttaaaaaaa aaaaaaaaaa aaaaaaaaaa 1200aaaa 1204391715DNAArabidopsis
thaliana 39cttttaagtt gggggatgtt tcgattttga aatttgattt cttcaagaga
agagatttaa 60tgaaaataaa taacttccgc agataacgaa gaagaagaaa atggttagat
cagatgaaaa 120tagccttgga ttaatcggat caatgagtct ccaaggtacc
ctaaatcgat cgattttgtt 180attaaaaatc aaaactttcg ttctctttga
tttttccccc aaattgattt tgaatttact 240tgatgtaggg ggaggagtag
taggaaagat caagacgacg gcaacaacag gaccgacaag 300aagagcacta
agtactatta acaagaacat cactgaagcg ccgtcttacc cttatgctgt
360caacaagaga tcagtttctg aaagagatgg catttgtaat aaaccacctg
tgcatcgacc 420agttactagg aagtttgctg ctcagttagc agatcataag
ccacatatcc gtgatgagga 480aactaagaaa ccagactcag tttcaagtga
agaaccagag acgattatca ttgatgtgga 540tgaaagtgat aaagaaggag
gtgactctaa tgagccaatg tttgtacaac atactgaagc 600aatgctggag
gagattgaac agatggagaa ggagattgaa atggaagatg cagacaaaga
660agaagagcct gtgatcgata ttgatgcctg tgataagaat aatcctttgg
ctgcggttga 720atatatccat gatatgcata ccttctacaa gaattttgag
aaacttagtt gcgtgcctcc 780taactatatg gacaatcaac aagatcttaa
tgagagaatg agaggaatcc tcattgactg 840gttaattgag gtgcactaca
agtttgaact gatggaggaa actctttatc tcacaatcaa 900tgtcatcgac
agattccttg cggttcatca aatcgtgagg aaaaagcttc agcttgttgg
960tgttactgct ttgttgcttg catgtaaata tgaagaagtt tcagttccag
tggtagatga 1020tctcatcttg atctctgaca aagcttactc tagaagagaa
gtgctagata tggagaagct 1080aatggccaac accttgcaat tcaatttctc
tctaccaact ccatatgttt tcatgaaacg 1140atttctcaaa gctgcccaat
ctgacaagaa gcttgagatt ttatcattct ttatgatcga 1200gctttgcctt
gtggagtatg agatgctaga gtatcttcca tctaagctgg cggcctcagc
1260aatctacact gctcagtgta cacttaaggg atttgaagaa tggagcaaaa
cctgtgagtt 1320tcacacaggc tacaacgaaa aacagctact ggcatgtgcg
agaaagatgg ttgctttcca 1380tcacaaggca ggaacaggga agctcacagg
agttcacaga aagtacaaca catctaagtt 1440ctgtcatgct gcaagaactg
aaccagctgg gtttctgatt taatattaat aagaatctaa 1500tatgacttaa
ctcgagtttt tctttagaac aaaaagagtg tgagagaaag agagatagta
1560gagcaagttg cccaaaatgg gagaagaatg gatctttaga tatcatggca
agtagcccaa 1620aaagagtgta ttcttctctt tctaaggtct ttagatcttt
cttcacttga gagagaataa 1680aaagaatctt ctgaaaaaaa aaaaaaaaaa aaaaa
1715 40 2195 DNA Arabidopsis thaliana 40aacccacgtc aattcttttt
caaaggcata tattctctct gtttcaaact ttgtgtctct 60tcttctcctt ctctgatcgt
tcgttttctg gacgagagag atggtaaatc cgggtcacgg 120aagaggaccc
gattcgggta ctgctgctgg tgggtcaaac tccgacccgt ttcctgcgaa
180tcttcgagtt cttgtcgttg atgatgatcc aacttgtctc atgatcttag
agaggatgct 240tatgacttgt ctctacagag taactaaatg taacagagca
gagagcgcat tgtctctgct 300tcggaagaac aagaatggtt ttgatattgt
cattagtgat gttcatatgc ctgacatgga 360tggtttcaag ctccttgaac
acgttggttt agagatggat ttacctgtta tcatgatgtc 420tgcggatgat
tcgaagagcg ttgtgttgaa aggagtgact cacggtgcag ttgattacct
480catcaaaccg gtacgtattg aggctttgaa gaatatatgg caacatgtgg
tgcggaagaa 540gcgtaacgag tggaatgttt ctgaacattc tggaggaagt
attgaagata ctggcggtga 600cagggacagg cagcagcagc atagggagga
tgctgataac aactcgtctt cagttaatga 660agggaacggg aggagctcga
ggaagcggaa ggaagaggaa gtagatgatc aaggggatga 720taaggaagac
tcatcgagtt taaagaaacc acgcgtggtt tggtctgttg aattgcatca
780gcagtttgtt gctgctgtga atcagctagg cgttgacaaa gctgttccta
agaagatctt 840agagatgatg aatgtacccg ggctaacgcg agaaaacgta
gccagtcacc tccagaagta 900tcggatatat ctgagacggc ttggaggagt
atcgcaacac caaggaaata tgaaccattc 960gtttatgact ggtcaagatc
agagttttgg acctctttct tcgttgaatg gatttgatct 1020tcaatcttta
gctgttactg gtcagctccc tcctcagagc cttgcacagc ttcaagcagc
1080tggtcttggc cggcctacac tcgctaaacc agggatgtcg gtttctcccc
ttgtagatca 1140gagaagcatc ttcaactttg aaaacccaaa aataagattt
ggagacggac atggtcagac 1200gatgaacaat ggaaatttgc ttcatggtgt
cccaacgggt agtcacatgc gtctgcgtcc 1260tggacagaat gttcagagca
gcggaatgat gttgccagta gcagaccagc tacctcgagg 1320aggaccatcg
atgctaccat ccctcgggca acagccgata ttgtcaagca gcgtttcaag
1380aagaagcgat ctcactggtg cgctggcggt tagaaacagt atccccgaga
ccaacagcag 1440agtgttacca actactcact cggtcttcaa taacttcccc
gcggatctac ctcgcagcag 1500cttcccgttg gcaagtgccc cagggatttc
agttccagta tcagtttctt accaagaaga 1560ggtcaacagc tcggatgcaa
aaggaggttc atcagctgct actgctggat ttggtaaccc 1620aagctacgac
atatttaacg attttccgca gcaccaacag cacaacaaga acatcagcaa
1680taaactaaac gattgggatc tgcggaatat gggattggtc ttcagttcca
atcaggacgc 1740agcaactgca accgcaaccg cagcattttc cacttcggaa
gcatactctt cgtcttctac 1800gcagagaaaa agacgggaaa cggacgcaac
agttgtgggt gagcatgggc agaacctgca 1860gtcaccgagc cggaatctgt
atcatctgaa ccacgttttt atggacggtg gttcagtcag 1920agtgaagtca
gaaagagtgg cggagacagt gacttgtcct ccagcaaata cattgtttca
1980cgagcagtat aatcaagaag atctgatgag cgcatttctc aaacaggaag
gcatcccatc 2040cgtagataac gagttcgaat ttgacggata ctccatcgat
aatatccagg tctgactaca 2100gaaactcaga ctagactgca agattctttg
tttttcttct ccctccttcg aggtacaaag 2160ctcaaaacat ggcaataacc
gaagggaaag ataga 2195 41 1413 DNA Arabidopsis thaliana 41aggctgtgtt
ttatcgtggg atttttaaac atggggaagg aaaatgctgt gtctcggcca 60ttcactcgtt
cccttgcctc tgctttgcgc gcttcagaag tgacttctac tacacagaat
120caacagagag taaacacaaa aagaccagcc ttggaggata caagagccac
tggacccaac 180aagaggaaga agcgagcggt tctaggggag atcacaaatg
ttaactccaa tacagctata 240cttgaggcca aaaacagcaa gcagataaag
aaaggacgcg gtcatggatt ggcgagtaca 300tcccagttgg caacttctgt
tacttcagaa gtcacagatc ttcagtccag gaccgatgca 360aaagttgaag
ttgcatcaaa tacagcagga aacctttctg tttctaaagg cacagataac
420acagctgata actgtattga gatatggaat tctagattgc ctccaagacc
tcttgggaga 480tcagcttcta cagctgagaa aagtgctgtt attggtagtt
caactgtacc ggatatccca 540aaatttgtag acatcgattc agatgacaag
gatcctttac tgtgctgcct ctatgcccct 600gaaatccact acaatttgcg
tgtttcagag cttaaacgca gaccacttcc ggactttatg 660gagagaatac
agaaggatgt cacccagtcc atgcggggaa ttctggttga ttggcttgtg
720gaggtctctg aagaatacac acttgcatct gacactctct acctcacagt
gtatctcata 780gactggttcc tccatggaaa ctacgtgcaa agacagcaac
ttcaactgct cggcatcact 840tgcatgctaa ttgcctcgaa gtatgaggaa
atctctgctc cacgcattga ggagttttgc 900ttcattacgg ataacaccta
cacaagagat caggtcctgg aaatggagaa ccaagtactt 960aagcatttta
gctttcaaat atacactccc actccaaaaa cgttccttag gagatttctc
1020agagcagctc aagcctctcg cctgagccca agccttgaag tcgagtttct
agccagctat 1080ctaacagagt tgacattaat agactaccat ttcttaaagt
ttcttccttc cgttgttgct 1140gcttcagcgg gttttctcgc caagtggaca
atggaccaat caaaccaccc atggaatcca 1200acacttgagc attacacaac
gtacaaagca tcggatctga aagcatctgt tcatgcctta 1260caagatctgc
agcttaacac caaaggttgc cccttgagcg ctatacgcat gaagtatagg
1320caagagaaat acaaatctgt ggcggttctc acgtctccaa agctacttga
cacgctattc 1380tgaaggtttc aactcctaac cgataatagt ttt
1413 42 2766 DNA Arabidopsis thaliana 42atttgagagg aagctttatt
ttgtgtgtag atggcgaata atcctccgca gtcttctggt 60acccagggtc agcattttgt
tcctgcagct tcacaacctt ttcaccctta tggacatgta 120cctccaaatg
ttcaaagtca gcctccacag tattctcagc cgatacagca gcagcagctc
180tttccagtga gaccaggtca gcctgtgcat attacatcat cctcacaggc
tgtatcagtt 240ccgtatattc aaacgaacaa gattctcact tctggatcta
ctcaaccaca gccaaatgca 300cctccaatga cgggctttgc tacatctgga
cctccatttt cttctccata tacttttgta 360ccatcatctt atcctcagca
acaaccaaca tccttggtcc aaccaaattc tcagatgcat 420gtagctggcg
tccctccagc agcaaacact tggcctgttc ctgttaatca aagcacatca
480cttgtttccc ctgtgcagca gactgggcaa caaacaccgg tcgcagtttc
cacagaccca 540ggaaacttga ctccgcaatc tgcatctgac tggcaggagc
atacatctgc tgatgggaga 600aaggctgatg catccactgt atggaaggaa
tttacaacac ctgaaggaaa gaaatattat 660tataacaagg ttacaaagga
gtctaagtgg acaattccgg aagatttaaa gttagctcgg 720gaacaagccc
aactagctag tgaaaaaacg tccctttcgg aagctggatc tacccctcta
780tcccaccatg ctgcatcctc gtctgatcta gcagttagca ctgtgacttc
tgttgttccc 840agcacatctt cagcacttac tggacattct tcaagcccta
ttcaagcggg tttggctgta 900cctgtcaccc gtcctccctc tgttgctcct
gttactccaa catctggtgc aattagtgac 960actgaggcta ctacaatgta
ctatttttcc ttgggaagtt ttgctgagaa taaggaaatg 1020tctgtgaatg
gaaaagccaa tttgtcacct gctggtgaca aagcaaatgt cgaggaacct
1080atggtatatg ctactaagca ggaggccaaa gctgctttca agtctctttt
ggaatctgta 1140aatgttcatt ccgactggac atgggaacag acattgaaag
agattgttca cgataaaaga 1200tatggtgctt tgaggacact cggcgagcgg
aaacaagcgt ttaacgagta tcttggccaa 1260aggaaaaaag tggaagctga
ggaaagacga aggaggcaga agaaagctcg ggaagaattt 1320gtcaagatgc
tagaggagtg tgaagaactt tcatcatccc tgaaatggag caaagcaatg
1380agtttgttcg aaaatgatca gcgttttaaa gctgttgacc gtcctaggga
tcgtgaagat 1440ctttttgaca attacattgt ggaacttgag aggaaggaaa
gagaaaaggc agcggaggaa 1500catcggcagt atatggcaga ctatcggaag
tttcttgaaa cctgtgacta tatcaaagct 1560ggtacacaat ggcgcaaaat
tcaagataga ctggaggatg atgacagatg ctcatgtctt 1620gaaaagatag
atcgtctgat tggttttgag gaatacattc ttgacctaga gaaggaagaa
1680gaagagctga agagagtaga gaaagaacat gtaaggcggg ccgagagaaa
aaaccgtgat 1740gcatttcgta cactattgga agaacatgtt gctgcaggca
tccttacagc caagacgtac 1800tggttggatt attgcattga gttaaaagac
ttgccccaat accaagctgt tgcatctaat 1860acatctggtt caactccgaa
agacttgttt gaagatgtca cagaagaatt agagaagcag 1920tatcatgagg
ataagagcta tgtgaaggat gctatgaagt caagaaaggc aaattttaaa
1980tctgctattt cagaagatct cagtactcaa cagatatcag acataaattt
aaagcttata 2040tatgatgact tggttgggag agtgaaggaa aaagaagaaa
aagaggccag aaagcttcag 2100cgtctggctg aagaatttac caatctgttg
cacactttca aggaaatcac cgtagcttca 2160aattgggaag atagcaaaca
actagtagaa gaaagtcaag agtacagatc gattggagat 2220gaaagtgtta
gccaagggtt atttgaggaa tacataacga gtttacaaga aaaggcaaag
2280gagaaggagc gtaagcgtga cgaggaaaag gttagaaaag agaaggaaag
ggacgagaaa 2340gagaaacgga aagacaagga taaggagaga agggaaaagg
aaagagaacg tgaaaaagag 2400aagggaaaag agaggagtaa acgggaagaa
tcagatggtg agactgctat ggatgtgagc 2460gaaggtcata aagacgagaa
aagaaaggga aaagatcgtg acagaaaaca tcgaagacgg 2520catcacaaca
attctgatga agatgttagt tctgataggg atgacagaga tgagtcgaag
2580aaatcatccc gtaaacatgg taatgatcgc aaaaaatcaa gaaagcacgc
aaactcgcct 2640gaatcggaga gtgaaaaccg gcataaaaga cagaaaaaag
agagtagtcg ccgaagtggt 2700aatgacgagc tagaggatgg agaagttggg
gagtgatagt gaaattcgac attaatctga 2760aacctt
2766 43 1260 DNA Arabidopsis thaliana 43tgaaacctag atttctgcaa
ctgaattcct aattcgaaaa agaatggagg gttcgtcgtc 60gacgatagca aggaagacat
gggaactaga gaacagcatt ctaacagtag actcacctga 120ttcaacctcc
gacaacatct tctactacga cgatacttca cagactaggt tccagcaaga
180gaaaccgtgg gagaatgatc ctcactactt taaacgagtc aagatctcag
cgctcgctct 240tcttaagatg gtggttcacg ctcgctctgg tggtacaatt
gaaataatgg gtcttatgca 300aggtaagacc gatggtgata ctatcattgt
tatggatgct tttgctttac cagtggaagg 360tactgagaca agggttaatg
ctcaggatga tgcttatgag tacatggttg agtattcaca 420gaccaacaag
ctcgcggggc ggctggagaa tgttgttgga tggtatcact ctcaccctgg
480atatggatgc tggctctccg gtattgatgt ttctacgcag aggcttaacc
aacagcatca 540ggagccattt ttagctgttg ttattgatcc cacaaggact
gtttcagctg gtaaggttga 600gattggtgct ttcagaacat actctaaagg
atataagcct ccagatgaac ctgtttctga 660gtatcaaact attcctttaa
ataagattga ggactttggt gttcactgca aacagtacta 720ttcattagat
gtcacttatt tcaagtcatc tcttgattct caccttctgg atctactatg
780gaacaagtac tgggtgaaca ctctttcttc ttctccactg ctgggtaatg
gagactatgt 840tgctggacaa atatcagact tagctgagaa gcttgagcaa
gccgagagtc atctggttca 900gtctcgcttt ggaggagttg tgccatcatc
ccttcataag aaaaaagagg atgagtctca 960actaactaag ataactcggg
atagcgcaaa gataactgtg gaacaggtcc atggactaat 1020gtcgcaggtc
ataaaagatg aattattcaa ctcaatgcgt cagtccaaca acaaatctcc
1080cactgactcg tcggatccag accctatgat tacatattga agttgctctt
cttttggttt 1140ctagttttgg attgacccat catttgttgt cctttcattt
attttctgtt gtgtaaagaa 1200ttataatgct aatcagaata atacagaaga
agattttggt taaaaaaaaa aaaaaaaaaa 1260 44 653 DNA Arabidopsis thaliana
44cttaaactac atttatcatt acagtctgat ttgagctaag ttctctcatc ataaactctc
60cttggagaat catggctatt tcaaaagctc ttatcgcttc tcttctcata tctcttcttg
120ttctccaact cgtccaggct gatgtcgaaa actcacagaa gaaaaatggt
tacgcaaaga 180agatcgattg tgggagtgcg tgtgtagcac ggtgcaggct
ttcgaggagg ccgaggctgt 240gtcacagagc gtgcgggact tgctgctaca
ggtgcaactg tgtgcctccg ggtacgtacg 300gaaactacga caagtgccag
tgctacgcta gcctcaccac ccacggtgga cgccgcaagt 360gcccataaga
agaaacaaag ctcttaattg ctgcggataa tgggacgatg tcgttttgtt
420agtatttact ttggcgtata tatgtggatc gaataataaa cgagaacgta
cgttgtcgtt 480gtgagtgtga gtactgtatt attaatggtt ctatttgttt
ttacttgcaa gttttcttgt 540tttgaatttg tttttttcat atttgtatat
cgattcgtgc attattgtat tatttcaatt 600tgtaataaga ttatgttacc
tttgagtggt tgtttaaaaa aaaaaaaaaa aaa 653 45 1266 DNA Arabidopsis
thaliana misc_feature (757)..(761) n = a, c, g, or t 45agatggggaa
gaagaacaag agaagtcaag acgagtctga gctcgaattg gagccagagc 60taacgaaaat
aatcgatgga gactctaaaa agaagaaaaa taagaataag
aagaagagaa 120gccatgaaga tacggagata gaaccggagc aaaagatgag
tctcgacgga gactcgaggg 180aggagaagat aaagaagaag aggaagaaca
agaaccaaga ggaggagcca gagcttgtga 240cggagaaaac gaaagtccaa
gaggaggaaa agggaaatgt agaagagggt agagccactg 300ttagcatagc
catagctggt tcaatcatcc acaacactca atcacttgag ctcgccacac
360gcgtaatctc tctttctctc tatctctccc ttcgtttctc tgtttttcca
ttcccagata 420atttaaagtc cccttcttcc atttctaaca tttctcagct
cgccggccaa attgctcgtg 480cagctacaat tttccgaatc gacgagatcg
tagtgttcga caataagagc agctcagaaa 540tcgaatcagc tgctacgaat
gcttctgata gcaatgaaag tggtgcctcc tttctcgttc 600gtatcttgaa
gtatctagag acaccacaat atttgaggaa atctctcttc cccaagcaaa
660atgatcttag atatgtgggt atgttgccgg gtatgttgcc acctcttgat
gctcctcacc 720atctgcgtaa gcacgagtgg gaacaatacc gtgaagnnnn
nattgttcca ccctctaagc 780caagggaaga agcaggaatg tattggggat
acaaagtacg atatgcatca caattaagtt 840cagtattcaa ggaatgccct
ttcgagggtg gttacgatta tttgattggt acctcggagc 900acggcctggt
aattagttca tctgagctga aaataccaac atttaggcac ctattgattg
960catttggtgg acttgctggg cttgaagaaa gtattgaaga tgataatcag
tataagggga 1020aaaacgttcg agatgtgttt aatgtatact tgaatacttg
tccacatcaa ggtagccgaa 1080ccattcgagc agaggaagcg atgtttatat
cacttcagta cttccaggaa cccatcagca 1140gggcagtgag aagactttaa
gcttcgataa aaagagtcaa agaagctatt ttgttctcat 1200agatctgagg
tttgtctgaa aaagagtgat gtaatgtaac tgttttagaa aaaaaaaaaa 1260aaaaaa
1266 46 1520 DNA Arabidopsis thaliana misc_feature (1455)..(1455) n = a,
c, g, or t 46atggaattgc ttgacatgaa ctcaatggct gcctcaatcg gcgtctccgt
cgccgttctc 60cgtttcctcc tctgtttcgt cgcaacgata ccaatctcat ttttatggcg
attcatcccg 120agtcgactcg gtaaacacat atactcagct gcttctggag
ctttcctctc ttatctctcc 180tttggcttct cctcaaatct tcacttcctt
gtcccaatga cgattggtta cgcttcaatg 240gcgatttatc gacccttgtc
tggattcatt actttcttcc taggcttcgc ttatctcatt 300ggctgtcatg
tgttttatat gagtggtgat gcttggaaag aaggaggaat tgattctact
360ggagctttga tggtattaac actgaaagtg atttcgtgtt cgataaacta
caacgatgga 420atgttgaaag aagaaggtct acgtgaggct cagaagaaga
accgtttgat tcagatgcct 480tctcttattg agtactttgg ttattgcctc
tgttgtggaa gccatttcgc tggcccggtt 540ttcgaaatga aagattatct
cgaatggact gaagagaaag gaatttgggc tgtttctgaa 600aaaggaaaga
gaccatcgcc ttatggagca atgattcgag ctgtgtttca agctgcgatt
660tgtatggctc tctatctcta tttagtacct cagtttccgt taactcggtt
cactgaacca 720gtgtaccaag aatggggatt cttgaagaga tttggttacc
aatacatggc gggtttcacg 780gctcgttgga agtattactt tatatggtct
atctcagagg cttctattat tatctctggt 840ttgggtttca gtggttggac
tgatgaaact cagacaaagg ctaaatggga ccgcgctaag 900aatgtcgata
ttttgggggt tgagcttgcc aagagtgcgg ttcagattcc gcttttctgg
960aacatacaag tcagcacatg gctccgtcac tacgtatatg agagaattgt
gaagcccggg 1020aagaaagcgg gtttcttcca attgctagct acgcaaaccg
tcagtgctgt ctggcatgga 1080ctgtatcctg gatacattat attctttgtg
caatcagcat tgatgatcga tggttcgaaa 1140gctatttacc ggtggcaaca
agcaatacct ccgaaaatgg caatgctgag aaatgttttg 1200gttctcatca
atttcctcta cacagtagtg gttctcaatt actcatccgt cggtttcatg
1260gttttaagct tgcacgaaac actagtcgcc ttcaagagtg tatattacat
tggaacagtt 1320atacctatcg ctgtgcttct tctcagctac ttagttcctg
tgaagcctgt tagaccaaag 1380accagaaaag aagaataatg ttgtcttttt
aaaaaatcaa caacattttg gttcttttct 1440ttttttccac ttggnccgtt
ttatgtaaaa caagagaaat caagatttga ggttttattc 1500ttaaaaaaaa
aaaaaaaaaa 1520 47 1142 DNA Arabidopsis thaliana 47ttatataacc
tatctacact ttgatctccg acaattcact ttcccaataa gaaccaactg 60agagagagag
agcgccggag aagaagaatt ttagagagcg atggacgagg gagttatagc
120agtttccgcc atggatgctt tcgagaagct tgagaaagtt ggtgaaggga
catacgggaa 180agtttacaga gccagagaga aagctaccgg gaaaatcgtc
gctctaaaga agacgcgtct 240ccatgaggac gaagaaggcg ttccttccac
cactctccgc gagatctcca ttttgcgaat 300gctcgctcgt gatcctcacg
tcgtcaggtt aatggatgtt aagcaaggac taagcaaaga 360aggcaaaact
gtactgtacc tggtttttga atacatggac actgatgtca agaaattcat
420cagaagtttc cgtagcactg gcaagaacat tccaacccaa actatcaaga
gcttgatgta 480tcaactatgc aaaggtatgg cattctgcca tggtcacggg
atattgcaca gagatctcaa 540gcctcacaat ctcttgatgg atcccaagac
aatgaggctc aaaatagcag atcttggttt 600agccagagcc ttcactctgc
caatgaagaa gtatacccat gagatattaa ctctttggta 660tagagctcca
gaggttcttc ttggtgccac ccattactct acagctgtgg atatgtggtc
720tgttggctgc atatttgctg aacttgtgac caaccaagca atctttcagg
gagactctga 780gctccaacag ctcctccata ttttcaagtt gtttgggaca
cccaatgaag aaatgtggcc 840aggagtgagc acactcaaga actggcatga
atacccacag tggaaaccat cgactctatc 900ctctgctgtt ccaaacctcg
acgaggctgg agttgatctt ctatctaaaa tgctgcagta 960cgagccagcg
aaacgaatct cagcaaagat ggctatggag catccttact ttgatgatct
1020gccagaaaag tcctctctct aaggatttaa aatcttcagt tagtatcttt
ccaagtttta 1080tggtttttct agttttgctt ctttcaagca tatctctagt
gtgctgcttc cccctctatg 1140aa 1142481189DNAArabidopsis thaliana
48tagtcaacga tggatttgag acatgaacaa ctaattgatt tgatttcgtg tagctaactt
60tgttaattgg taaattgtgt agagaaggat gagtatggag atggagttgt ttgtcactcc
120agagaagcag aggcaacatc cttcagtgag cgttgagaaa actccagtga
gaaggaaatt 180gattgttgat gatgattctg aaattggatc agagaagaaa
gggcaatcaa gaacttctgg 240aggcgggctt cgtcaattca gtgttatggt
ttgtcagaag ttggaagcca agaagataac 300tacttacaag gaggttgcag
acgaaattat ttcagatttt gccacaatta agcaaaacgc 360agagaagcct
ttgaatgaaa atgagtacaa tgagaagaac ataaggcgga gagtctacga
420tgcgctcaat gtgttcatgg cgttggatat tattgcaagg gataaaaagg
aaatccggtg 480gaaaggactt cctattacct gcaaaaagga tgtggaagaa
gtcaagatgg atcgtaataa 540agttatgagc agtgtgcaaa agaaggctgc
ttttcttaaa gagttgagag aaaaggtctc 600aagtcttgag agtcttatgt
cgagaaatca agagatggtt gtgaagactc aaggcccagc 660agaaggattt
accttaccat tcattctact tgagacaaac cctcacgcag tagtcgaaat
720cgagatttct gaagatatgc aacttgtaca cctcgacttc aatagcacac
ctttctcggt 780ccatgatgat gcttacattt tgaaactgat gcaagaacag
aagcaagaac agaacagagt 840atcttcttct tcatctacac atcaccaatc
tcaacatagc tccgctcatt cttcatccag 900ttcttgcatt gcttctggaa
cctcaggccc ggtttgctgg aactcgggat ccattgatac 960tcgctgaccg
agcttctatt cccaaattct tcaagaagaa gaagtaatga tctaattggt
1020atactaaaaa attatacatc tggtttagtg ttcaattgag agagactgta
aaatcaattc 1080ataggccaac aaatgtttgt ttatccaatt ttccttttta
ttcgaacttg atgcgatatt 1140tcaacggaaa cagaaactat tgttttaaac
caaaaaaaaa aaaaaaaaa 1189 49 805 DNA Arabidopsis thaliana 49atgaataggg
aaaagttgat gaagatggct aacactgtcc gcactggcgg aaaggggaca 60gtaagaagaa
agaagaaggc tgttcacaag accactacaa ccgatgacaa gaggctccag
120agcactctta agagagttgg agtcaattcc attcccgcca ttgaagaagt
taacattttt 180aaggatgatg tagtcattca gttcattaac cctaaagttc
aagcttcaat tgctgctaac 240acatgggttg tgagtggtac accacagacg
aaaaaattgc aagacattct tcctcagatt 300atcagccaac ttggaccaga
taacttggac aacctgaaga agctagcaga gcaattccag 360aaacaagctc
caggtgcagg tgatgtccca gcaacaatcc aagaagagga cgatgatgat
420gatgtcccag atcttgtagt gggagagact ttcgagaccc ctgctactga
agaggctccc 480aaagctgctg cttcttagag gaggaggaag aagaaggaga
agagctcacc tgcaaaaccc 540atcataaaaa tgtttgtcgc tcgacctctt
ctgagcactg tcagattctt gttttctcta 600atgcttgcga acagaaagac
ttggttttat tatcacttga tgctttttgg tccgaacagc 660aattttcctt
ttattaaggt tagatcgctt tttgtttacc acctgttcaa atgagtacta
720ctatgtcctg tcgcttcata cacttcttgc aacacagtcc tttgttttga
gtcaaaaaaa 780aaaaaaaaaa aaaaaaaaaa aaaaa 805501539DNAArabidopsis
thaliana 50aagctttact acttatactc ttttgttcct atggccaccg tatcttcttc
ctcctggcca 60aaccccaacc ctaatcccga ttccacgtct gcctcagatt ccgattctac
ttttccctct 120caccgcgatc gcgtagacga acccgactct ctcgattcct
tctcctccat gagtcttaac 180tccgacgaac ctaatcagac ttctaatcaa
tcgcctcttt ctccccctac gcccaattta 240ccggtgatgc ctcctccgtc
cgtgcttcat ctttccttta accaagatca tgcttgcttc 300gctgtcggca
ctgaccgtgg cttccggatc cttaattgcg atccctttcg cgagattttc
360cggcgtgatt tcgatcgtgg cggtggtgtt gcagtcgtgg agatgctttt
cagatgcaat 420atattagccc tagttggtgg cggacctgat cctcaatatc
ctcctaataa ggttatgatt 480tgggatgatc accagggccg atgtatcgga
gaactctctt tcaggtccga tgttagatcc 540gtccggctta ggagggatcg
gattattgtc gttcttgagc agaagatttt tgtctacaat 600ttctctgacc
tcaagctgat gcatcagatt gaaaccattg ccaaccctaa gggtttgtgt
660gctgtttctc agggtgttgg ttctatggtt ttggtatgtc caggtttgca
gaaaggtcaa 720gttcggatcg agcactacgc ttctaaacgg accaaattcg
tcatggctca tgattccaga 780atagcttgct tcgctctcac gcaggatggc
catttgttgg ccactgctag ctctaagggt 840actctggttc ggatcttcaa
tactgttgat ggtaccttgc gtcaagagtc tggcacttct 900gaggatgaaa
taggtaagga gggtgcggat agagcagaga tctacagttt ggccttctct
960tcaaatgctc agtggttagc tgtctcaagt gacaaaggaa cggtccatgt
ctttggtctc 1020aaagtcaact ccggatctca agtgaaagac tcatcccgaa
ttgcacctga tgctactccc 1080tcatccccat cgtcgtctct gtctttattc
aaagtgttac cgaggtattt cagctcggag 1140tggtcggtgg ctcagttcag
gttggttgaa ggaactcagt acatagccgc ctttggccat 1200caaaagaaca
ccgttgttat tcttggcatg gatgggagct tctacagatg ccagtttgat
1260ccggtgaacg gcggtgaaat gtctcagctt gagtaccaca actgtctgaa
accgccttca 1320gttttctaaa agctttacta cttatactct tttgttcctt
ctctctcttt atatctctct 1380gcaacttaag cggtgagata tggtgtatag
ttttgtgtat ataataatga tgggtcgtcc 1440tataatttgt aaaacctttt
atcgctaccc gggtcgactc tagagcccta tagtgagtcg 1500tattactgca
gagatctatg aatcgtagat actgaaaaa 1539 51 1977 DNA Arabidopsis thaliana
51agagcttcct ctctctatat ctggctttct atggatgtag gagttactac ggcgaagtct
60atacttgaga agcctctgaa gcttctcact gaagaagaca tttctcagct tactcgcgaa
120gattgccgca aattcctcaa agagaaaggt ttcttcttct tcctttctcc
atttttttcc 180ggtcttattg tcttcgacga atggcggctg acacgtgtcg
aaacaggaat gcgcaggcct 240tcgtggaata aatctcaggc gatccagcaa
gttttatctc ttaaagctct ctatgaacct 300ggagatgatt ccggcgccgg
aatcctccgc aagatccttg tttctcagcc gccaaatccg 360cctcgcgtta
caacaacgtt gattgagcca aggaacgagc tcgaagcttg tggaaggatt
420cctttacagg aagatgatgg tgcgtgccat agaagggatt ctccaagatc
agctgagttt 480tctggtagtt ctggtcagtt tgttgcggat aaagatagcc
acaagactgt ttctgtttcc 540cccagaagcc cagctgaaac aaatgcggtg
gttgggcaaa tgacgatatt ttatagtggc 600aaagtgaatg tatatgatgg
agtaccacct gaaaaggccc ggtctatcat gcattttgca 660gccaatccaa
ttgatttgcc tgaaaatggt atttttgctt ctagtagaat gatttcgaaa
720cccatgagta aagagaagat ggtggagctt ccccaatatg gacttgaaaa
ggcacctgct 780tctcgtgatt ctgatgttga gggtcaggcg aacagaaaag
tatcgttgca aagatatctt 840gaaaagcgga aagacagatt ttctaagacc
aagaaggctc caggagttgc gtcctctagc 900ttggagatgt ttctgaatcg
tcagccacgg atgaacgctg catattcaca aaaccttagt 960ggcacagggc
attgcgagtc acctgaaaat caaacaaaaa gtcccaatat ctcagttgat
1020ctaaacagtg atctaaacag cgaaggtgcc aaaagaactg gagatggtac
tacgggtcaa 1080aaggcgggaa gaacaatttc atgttcttat aacatgacta
agacatcacg aggaacacga 1140tgggtgaagc ggtcaagaga agaagtgatt
caagcttggt atatggatga tagtgaagag 1200gatcagagac ttcctcacca
caaggatcct aaagagtttg tatcgttgga caaacttgca 1260gagctgggag
tacttagctg gagacttgat gctgataact atgaaaccga tgaggatttg
1320aaaaagatcc gtgaatctcg tggttactct tacatggact tttgtgaggt
atgcccggaa 1380aagcttccaa actatgaagt gaaagtgaag agctttttcg
aagaacattt acacactgat 1440gaggagatcc gttactgcgt tgcaggaact
ggttactttg atgtgagaga tcgtaatgaa 1500gcttggatta gggtattggt
aaagaaggga ggtatgatag tcttacctgc tgggatctat 1560catcgcttca
ctgtggactc tgacaactat atcaaggcaa tgcggctatt cgtgggtgaa
1620ccggtatgga caccatacaa tcgcccacac gaccatcttc ctgcaaggaa
agaatatgtc 1680gataacttca tgatcaatgc ctcggcttag agagcttcct
ctctctatat ctggctttct 1740gaaacaagga tctataaaca aggcctacaa
taaagaaagc tttcctgtca agtattggat 1800atttatatgt attcctgtgt
agaatgatgg cttttggtat gcttgagttg ttgtaaactt 1860agttacactc
tctgatatgt ctctctttac catctttgtc gtatcccata tacgaaaaga
1920ttacattggg attcatattg tcttacgttc gttcctatgt gcaatatgtt gagtttt
1977 52 525 DNA Arabidopsis thaliana 52catcgctttt cgctgaaatc aaaatttctc
cagttttccg atcagtcgca agaaaaccct 60aaaaatggat ggtcatgatt ctaaggatac
taagcagagc actgctgata tgactgcttt 120tgtccaaaat cttctccagc
agatgcaaac caggttccag acaatgtcgg actccatcat 180cacaaagatt
gatgacatgg gaggcagaat caatgagctg gagcaaagca tcaatgatct
240aagagccgag atgggagtag aaggcactcc tcctccagcc tccaaatcag
gcgatgaacc 300caaaacaccg gctagttcct cttaaaaagg aatgtggtgt
tcattgacat gtccgaagga 360aaaagaaaaa ctatgaaata tgttaagagc
agtattactt ttaaaattcc tgttttaaga 420aacgagtttg ttgtttatta
aagttcatca aatagattga tgatgtggtg cattacatta 480ttctccacct
atgaattgca tttctatttt ggtctaaaaa aaaaa 525 53 2610 DNA Arabidopsis
thaliana 53agaacaattg agattcttgg ttgtgttaag atggaaatct acaccatgaa
aacgaatttt 60cttgtactgg ctttgtcttt gtgtatcctt ctttcaagct tccatgaggt
ttcttgtcag 120gatgatggta gtggtttgag taatttggat ctaatagaac
gtgattatca agatagtgtc 180aatgctcttc aaggcaagga cgatgaagat
cagtctgcaa agatacagag tgaaaaccag 240aataacacta cagtgactga
taagaacact atttctctat ctctatcaga tgaatctgag 300gttggatctg
ttagtgatga aagcgttgga cgttcgagtc tgttggatca aatcaaactt
360gaattcgaag ctcatcacaa tagtattaac caagctggat ctgatggtgt
caaggctgaa 420tccaaggatg atgatgaaga attatctgct catagacaga
aaatgttgga agaaatcgaa 480catgagtttg aagctgcttc agatagtctg
aaacaactaa agactgatga tgtaaacgaa 540ggaaatgatg aagaacattc
tgcaaagagg caaagtttgt tggaagagat cgaacgtgag 600tttgaagctg
ctacaaaaga acttgaacaa ctaaaggtta atgacttcac cggggacaaa
660gatgacgaag aacactctgc aaagagaaaa agtatgcttg aagctattga
acgcgagttt 720gaagctgcta tggaaggcat tgaagcactt aaggtttctg
attccacagg aagcggagat 780gatgaagaac aatctgcaaa gagactaagt
atgcttgaag agatcgaacg ggaatttgaa 840gctgcttcaa aaggtcttga
acaactaagg gctagcgatt caaccgcgga caataacgaa 900gaagaacacg
ctgcaaaggg acaaagtttg ttagaagaga tcgaacgaga gttcgaagct
960gctacagaga gccttaagca acttcaagtt gatgattcta ctgaagacaa
agaacactgt 1020aaagcactct tcttcttatt atctgctatt ctttctctat
ggttatctga atcaggcttt 1080gaatgtattg tagttacagc tgcaaagagg
caaagtctgc tggaagagat tgaacgtgaa 1140tttgaagctg caacaaaaga
tcttaaacaa ctaaatgatt tcactgaagg cagtgctgat 1200gatgaacaat
ctgcaaagag aaacaaaatg ttggaagata tcgaacgcga atttgaagct
1260gctacaatag gtcttgaaca actaaaggct aatgatttct ctgaaggcaa
taataatgaa 1320gaacaatctg caaagagaaa gagtatgctt gaagagatcg
aacgcgagtt cgaagctgct 1380attggaggtc ttaaacagat caaagttgat
gattccagaa atcttgaaga agaatctgct 1440aagagaaaga taattttgga
agagatggaa cgtgaatttg aagaagcaca cagtggtatt 1500aatgcaaagg
ctgacaaaga agaatctgca aagaaacaga gtggctctgc tataccagag
1560gttcttggac taggacagtc aggtggttgt agctgttcta aacaagacga
agattcctcg 1620attgttatac caacaaaata tagcatagaa gatatcctct
ctgaagaatc tgcagtccag 1680ggaacagaga cttctagtct caccgcgtct
ttgactcaac tcgttgagaa tcacaggaaa 1740gaaaaggaat ctctactcgg
acacagagtt ctcacttctc cttctatagc ttcttccaca 1800agcgaatcat
ctgctacatc agagactgta gaaaccctaa gggctaaact gaatgagctt
1860cgcggcttaa ccgctcgtga gcttgtgaca cgtaaagatt tcggtcagat
tctcattacg 1920gctgcgagtt ttgaagagct aagttcagct ccaatcagtt
acatttctag gttagctaaa 1980tacagaaacg tcatcaaaga aggacttgaa
gcttctgaga gagttcacat cgcgcaggta 2040cgagcaaaaa tgctcaaaga
agttgccacg gagaagcaaa ccgccgtgga cactcatttc 2100gcaaccgcta
aaaagcttgc tcaagaagga gacgcgttgt tcgttaaaat cttcgcaatc
2160aagaaactgt tggcgaaact tgaagcagag aaagaatctg ttgatggaaa
gtttaaggag 2220actgtgaaag aactttctca tcttctggct gatgcttctg
aggcttacga agagtatcat 2280ggcgcggtga ggaaggcgaa agacgagcaa
gcggctgagg aatttgcgaa agaggcgacg 2340caaagtgcag agatcatttg
ggttaagttt cttagttctc tttagagaac aattgagatt 2400cttggttgtg
ttaagagcaa atctagagct cttgttggtt cttgttatgt attttgtgat
2460gatgttctgt ttcagagttt gtgtgttggt tgtatcagga gaaagaggct
gggagataga 2520gagaaagaga gtctctgcga aaactaataa tgttttttca
gatatctaaa taataagctt 2580tttacaaaaa aaaaaaaaaa aaaaaaaaaa
2610 54 2235 DNA Arabidopsis thaliana 54aatttgaatc caatccccaa
attatctcat atggagtttg gatcttttct tgtgtcctta 60gggacatctt ttgttatctt
cgtcattctc atgcttctct tcacctggct ttctcgcaaa 120tctggaaatg
ctcccattta ttacccgaat cggatcctta aagggctgga gccatgggaa
180ggcacctcct tgactcgaaa cccttttgct tggatgcgtg aagctttgac
ttcctctgaa 240caagatgtcg ttaacttatc cggcgtcgat actgctgtcc
actttgtctt cttgagcact 300gttctgggga tatttgcttg ttccagtctt
cttctcctac caactctact gcctctagcc 360gctacagaca acaacataaa
gaacacaaag aatgcgacag ataccacaag caaaggaact 420tttagccaac
ttgataatct atcaatggct aacatcacaa aaaaaagttc gaggctgtgg
480gcgttcctag gagctgttta ctggatatct ttggtcacat atttcttctt
gtggaaagct 540tataagcatg tctcttcatt gagagctcaa gctctgatgt
ctgctgatgt aaaacccgag 600caattcgcta ttcttgttag ggatatgcct
gcaccacctg acgggcagac acagaaagag 660tttattgatt cttatttcag
agaaatatac cctgagacat tctacagatc gcttgtcgca 720acagaaaaca
gcaaggttaa taaaatatgg gaaaaattgg aaggttacaa gaagaagctt
780gcgcgagcag aagcaatatt agcagcaact aataaccgtc ccacgaacaa
aaccggcttc 840tgtgggctag tcggtaaaca agtagacagc attgagtatt
acactgagct aatcaacgag 900tctgtagcca aactggaaac agagcagaaa
gcggttcttg ctgagaagca gcaaaccgca 960gcagtggttt tcttcacaac
cagggttgct gctgcatcag cagctcagtc tctgcactgc 1020cagatggttg
ataaatggac tgtgaccgaa gctcctgagc cacggcagct cctatggcag
1080aatctcaaca tcaagctctt cagcagaata atccggcaat acttcatcta
cttctttgtt 1140gcagtgacca ttctgtttta catgatacca atcgcgttcg
tctctgccat caccactctt 1200aagaatcttc agaggattat tccgttcata
aagccggttg tggagataac cgccataaga 1260accgttttgg agtctttcct
tcctcagatt gcgctcattg ttttcttggc catgttgccg 1320aagcttctct
tgtttctctc caaagccgag gggattcctt cacagagcca tgccattagg
1380gctgcttcag ggaagtactt ttacttctcg gtctttaatg tcttcattgg
tgttaccctt 1440gctgggactt tgttcaacac agtgaaggat atcgcgaaaa
atcccaaact cgacatgatt 1500attaaccttt tggctactag cctccctaag
agcgcaactt tcttcctgac ctacgttgct 1560ctcaagttct ttatcggtta
tggccttgag ctgtctcgga tcataccttt gataatcttc 1620cacctgaaaa
agaagtatct ctgcaaaacc gaagcggagg tcaaagaagc ttggtacccg
1680ggagacttaa gctatgcgac tagggttccc ggagacatgc tcatcctcac
aatcacgttc 1740tgctattcag tcattgctcc tcttatcctc atattcggca
tcacctactt tggtttaggc 1800tggctagtcc tcaggaatca ggcgttgaaa
gtgtacgttc catcatacga gagctatgga 1860agaatgtggc cgcatattca
ccagcgcata ctagcagcgt tgtttctatt ccaagtggta 1920atgtttggct
acttaggagc caagacattc ttctacacgg cccttgtgat ccctctcatt
1980atcacctctc tcatcttcgg atatgtgtgc cgccagaaat tctacggagg
gttcgaacac 2040acagctctcg aggtagcttg ccgtgagctg aagcagagtc
cagacctaga ggagattttc 2100agagcataca ttccgcatag cttgagctct
cacaaaccag aagaacacga gttcaaaggc 2160gcaatgtctc gttatcaaga
tttcaacgca atagcaggcg tttaaagctt gagagattcc 2220tctggctaaa cccag
2235 55 4002 DNA Arabidopsis thaliana 55aacaataaga agaaaaagtt
tcattttctg atggcggagc agaagagtac caatatgtgg 60aactgggagg tgactgggtt
cgaatcgaag aagtcgcctt ctagtgagga aggcgttcat 120cggacaccgt
cttctatgct tcgacggtac tcgatcccga agaactcgct tccaccgcac
180tcgtcggagc ttgcgtctaa ggttcagagt ttgaaggata aagttcagct
tgcaaaggac 240gattatgtgg gattgagaca ggaagctact gatcttcaag
agtactccaa tgcgaagctt 300gaaagggtta cacgttattt aggtgttctg
gctgataaaa gtcgtaaact ggatcaatat 360gcacttgaga ctgaggctag
gatatctcca cttatcaatg agaagaagag actgttcaat 420gacttactga
cgaccaaagg tgcacatctt ccatttccga cgtcattctc tatccttact
480tctattgata ttgatcacac cagaccctta tttgaagacg agggtccctc
tatcattgaa 540tttcctgata actgcactat acgcgtaaac actagtgatg
atactctgtc caatcccaag 600aaggaatttg aatttgatag agtttatggg
cctcaagttg gacaagcttc actgttcagt 660gatgtccaac cttttgtgca
atccgctctg gatggatcga acgtttctat atttgcgtat 720ggccaaactc
acgcggggaa gacatacacc atggttgccc ctcctttccc tttcctctct
780gaaattagat ataggtcttg tttggattta aatatgatag gcaagttcat
ggacgttcat 840agtaagttca tggacgaagg atctaatcag gaccgtggtt
tatatgctcg ttgttttgag 900gaacttatgg acttggccaa ttctgattca
acttccgcat ctcagttcag tttctctgtt 960tcagtgtttg agctttataa
cgaacaggtc agggatttac tctcgggttg tcagagcaat 1020ttgccaaaga
tcaatatggg tttacgcgaa tcggttatag aactttcaca ggaaaaagtt
1080gataatccat cagagttcat gagagtcctg aactctgcat ttcagaatag
agggaatgat 1140aaatcaaagt ctactgtgac ccatctgatt gtctcgatac
acatttgtta tagcaacaca 1200attacgagag aaaatgtaat tagcaagctt
tctttagttg acctggctgg aagtgaaggt 1260ttaactgtgg aggatgacaa
tggagatcat gtaactgatc tgctccatgt aacaaattca 1320atttccgcgc
tgggagatgt tttatcatct ttgacgtcaa aaagagatac cattccttac
1380gagaactcat ttcttacaag aatacttgca gattcactag gagggagctc
caaaacattg 1440atgatcgtca acatttgtcc aagtgcacgg aacttgtctg
aaataatgtc gtgtctcaac 1500tatgctgcta gagctcgaaa tactgtacca
agccttggga atcgagacac aattaagaaa 1560tggagagacg tggcaaatga
tgctcggaag gaggtattgg agaaagagag ggaaaatcag 1620cgtctaaaac
aagaggttac gggtttaaaa caagcactta aagaagcaaa tgaccaatgt
1680gtactgctct ataatgaagt acagagagcg tggagagttt cattcacact
gcaatcagat 1740ttaaagtcag agaatgcgat ggttgtagac aaacataaaa
tagaaaagga gcagaatttt 1800cagttaagaa atcaaatagc tcaactttta
cagttagaac aggaacaaaa gctgcaggcg 1860caacaacaag attccaccat
tcaaaatctc cagtctaaag tgaaagactt agaatcacaa 1920ctaagtaaag
ctctgaagtc tgacatgaca agatcgagag atcccttgga acctcagccc
1980agagcagctg agaacacact cgattcttct gcagttacca agaaacttga
ggaagaattg 2040aaaaaacgtg atgcactgat tgagaggttg catgaagaaa
atgaaaaatt gttcgacaga 2100ttaacagaaa agtcagtggc tagctcgact
caggtatcta gcccctcatc aaaagcttca 2160ccaacagtgc agcctgcaga
tgttgacagg aaaaatagcg cgggcacttt accgtcttca 2220gtggataaaa
atgagggcac gattacatta gtaaaatcca gctctgaatt agtaaaaacc
2280actccagctg gagaatactt aacagctgca ttgaatgatt ttgatcccga
acaatatgaa 2340ggtcttgcag ccatagctga tggcgcaaac aagcttctga
tgctggtctt agcagctgtc 2400ataaaggctg gtgcttccag agagcatgaa
atccttgctg agatcagaga ttctgtcttt 2460tcatttatcc ggaaaatgga
accaaggaga gtaatggata caatgcttgt ttctcgagtc 2520aggatattgt
acataaggtc cttacttgca cgatcacctg agcttcagtc gatcaaggtt
2580tctcctgttg aacgcttttt ggagaagcca tatactggtc gaactagaag
ctccagcggg 2640agtagcagcc caggtagatc accagttcga tattatgatg
agcagattta tggctttaaa 2700gttaatttaa agccagaaaa gaaaagtaag
ttggtatctg tagtttcaag aatccgtgga 2760catgaccagg atactgggag
gcagcaagtg actggaggaa agctgaggga gatacaagat 2820gaagccaaaa
gttttgccat tggaaacaaa cccttagctg ctttatttgt tcacactccg
2880gctggtgaac tgcaaagaca gattaggtca tggcttgcag aaagttttga
gtttctctct 2940gttacagcag atgatgtttc aggagtaacc actggccaat
tagagcttct ttccacagca 3000attatggatg gctggatggc tggagtagga
gctgcggtgc cacctcacac agacgcttta 3060ggacagcttt tgtctgagta
tgcaaaacga gtctacactt ctcagatgca gcatctaaag 3120gatattgccg
gtactttggc ttcggaagag gcagaagatg ctggtcaagt cgcgaagctt
3180cgatcagctc tcgagtctgt tgaccacaaa agaagaaaga ttttgcaaca
aatgagaagt 3240gatgcagctt tgtttacctt ggaagaaggc agttcccctg
ttcaaaatcc atctacagca 3300gccgaagact cgagattagc ctccctcatt
tctctggatg ccatactgaa gcaagtcaag 3360gaaataacaa gacaagcctc
tgtccacgtt ttgagtaaaa gcaagaaaaa ggcattgctt 3420gagtctcttg
atgaacttaa cgaacgaatg ccttctctgc ttgatgttga tcatccatgt
3480gcacagagag aaattgatac ggctcaccag ttggtcgaga caattccaga
acaagaggac 3540aatcttcaag acgaaaagag accttcaata gattcaatat
cttcgactga aaccgatgtg 3600tctcaatgga atgttttgca attcaacaca
ggaggctctt cagctccatt catcataaaa 3660tgcggagcta actccaactc
agagctcgtg atcaaagcgg atgcccgtat tcaagaacct 3720aaaggaggcg
aaatagtgag agttgtgcca agaccttcgg ttttagaaaa catgagctta
3780gaggaaatga aacaagtgtt tggtcagttg cccgaagctc taagttcact
ggccttagct 3840agaacagctg atggcacacg ggctcgatac tctagactct
acagaactct agccatgaag 3900gttccctctc ttagggacct cgttggagag
cttgagaaag gaggagtctt aaaagataca 3960aaatcgacat gataggatta
gggttttttc gtgaatttga aa 4002561251DNAArabidopsis thaliana
56ttagttagat aggcggtggt tggtgcgttc atggcgaatc cttggtgggt agggaatgtt
60gcgatcggtg gagttgagag tccagtgacg tcatcagctc cttctttgca ccacagaaac
120agtaacaaca acaacccacc gactatgact cgttcggatc caagattgga
ccatgacttc 180accaccaaca acagtggaag ccctaatacc cagactcaga
gccaagaaga acagaacagc 240agagacgagc aaccagctgt tgaacccgga
tccggatccg ggtctacggg tcgtcgtcct 300agaggtagac ctcctggttc
caagaacaaa ccaaagagtc cagttgttgt taccaaagaa 360agccctaact
ctctccagag ccatgttctt gagattgcta cgggagctga cgtggcggaa
420agcttaaacg cctttgctcg tagacgcggc cggggcgttt cggtgctgag
cggtagtggt 480ttggttacta atgttactct gcgtcagcct gctgcatccg
gtggagttgt tagtttacgt 540ggtcagtttg agatcttgtc tatgtgtggg
gcttttcttc ctacgtctgg ctctcctgct 600gcagccgctg gtttaaccat
ttacttagct ggagctcaag gtcaagttgt gggaggtgga 660gttgctggcc
cgcttattgc ctctggaccc gttattgtga tagctgctac gttttgcaat
720gccacttatg agaggttacc gattgaggaa gaacaacagc aagagcagcc
gcttcaacta 780gaagatggga agaagcagaa agaagagaat gatgataacg
agagtgggaa taacggaaac 840gaaggatcga tgcagccgcc gatgtataat
atgcctccta attttatccc aaatggtcat 900caaatggctc aacacgacgt
gtattggggt ggtcctccgc ctcgtgctcc tccttcgtat 960tgattagtta
gataggcggt ggttggtgcg ttctttttac tggaatgatt atattttcca
1020ttaggatggt taggcttttg tttattaaag ctatcaagtt tctttttttt
ttacggataa 1080ttcggatgac aattagctag tgtttgtttg tttgttttgt
ggcggctttt ctgacttgac 1140tattttgatc gcggatagct ttgtatgaaa
gtgaattgat tgtagaatcg tcttttgaat 1200tttgatgttg gaaaaaacca
agcaatggtg tgtggccttt gcaatggaag c 1251572955DNAArabidopsis
thaliana 57aatttgcttt atctttgcat tgttgttggc atggctctca atctccgtca
gaaacagact 60gaatgtgtaa tccggatgtt gaatctgaac caacctttga atccaagtgg
aactgcgaac 120gaagaagttt acaagatctt gatttacgat aggttttgtc
agaacattct atctccattg 180acccatgtca aggatctgcg taagcatgga
gttacactct tctttctcat agacaaagat 240cgacaacctg ttcatgatgt
tcccgctgtc tactttgttc aaccaactga atccaacctc 300cagaggatca
tagccgatgc ttctagatct ctctacgata cctttcatct gaatttctcg
360tcttcgatcc ctcgtaagtt tcttgaagag ctagcttctg ggactcttaa
atctggttct 420gttgagaaag tctcgaaagt gcatgatcag tatctggagt
ttgtgacttt ggaagataac 480ttgttctcgc tggctcagca atctacctat
gttcaaatga atgacccatc agcaggggag 540aaagagatta atgagattat
cgaaagggtc gctagtggtt tgttttgtgt gttggtaacg 600cttggtgtgg
ttcctgttat ccgatgccct agtggtggac ctgcagagat ggtggcgtct
660ttgttggatc agaaactgag ggatcatctt ttgtccaaga acaatctgtt
tactgaaggt 720ggcggtttca tgagctcgtt tcagcgtccc ctcttgtgca
tatttgatag gaactttgag 780ctctcggttg ggattcagca tgatttcaga
taccggcctc tcgttcacga tgttctcggg 840ttaaagctca accaattgaa
agtgcaggga gagaaaggac caccgaaatc gtttgagctg 900gacagttcgg
acccattctg gtcagcaaac agtactctgg agtttccaga tgtcgctgtg
960gagatcgaaa cacagttgaa caagtacaag agagacgttg aagaggttaa
caagaaaacc 1020ggaggtggga gcggcgctga gtttgatggg acagatctga
ttggaaacat ccacaccgag 1080catctcatga acactgtgaa atcgctcccg
gagttaactg agcgaaagaa agtgattgac 1140aaacacacca atatcgcaac
agcgctctta ggacagatca aggagagatc tattgacgct 1200ttcactaaga
aagaaagcga catgatgatg aggggcggaa tcgacagaac tgaacttatg
1260gctgctctga aaggcaaagg gacaaagatg gacaagctcc ggtttgcaat
catgtacctg 1320atctccacag aaaccataaa ccaatcggaa gttgaagcag
tggaggcagc attgaatgaa 1380gctgaggctg atacaagtgc gtttcagtat
gtaaagaaaa tcaaatcgtt aaacgcatct 1440tttgcagcta catcagcgaa
ttcagctagc agaagcaaca ttgtagactg ggccgagaag 1500ctttacggac
agtctataag cgcagtgact gcaggagtca agaatctgtt atctagtgat
1560caacaattgg cagtgactcg aacagtcgaa gctttaacag aaggaaaacc
aaacccggag 1620atcgattctt accgcttcct ggacccaaga gctccaaagt
cgtctagctc cggtggtagc 1680catgtaaaag gaccgttcag agaagctata
gtgttcatga tcggtggagg taactatgtt 1740gagtatggaa gtttgcagga
gttgactcag agacagttaa ccgttaaaaa cgttatttat 1800ggagccactg
agattcttaa cggaggtgag ttggtggagc agcttggact tttgggaaag
1860aagatgggat taggaggtcc ggtcgcttca acgctgaaga ggctgggaat
ggctggtaaa 1920gaggagactg atgtatctgc acaagggtct ttaaccaggg
aggccactga gatatggagg 1980agtgagttgg aatctcgccg gtttcaggta
gatagtttag aagctgaact tgtggatgtc 2040aaggcttacc ttgagtttgg
ctcagaagaa gatgccagaa aggagttagg agttctttcg 2100ggtagggtca
gatcgactgc aactatgttg cgttatttga gatcaaaagc tagagtcttg
2160gccattcctg atgatctagc aaatgtgtca tgcggtgtgg aacagattga
agaactgaaa 2220ggattgaacc ttgttgagaa agatggtggt tcatcttctt
ctgacggggc taggaacact 2280aatcctgaaa ctagaaggta cagtggttcc
ttgggtgtag aggatggagc ctatactaat 2340gagatgctcc agtccataga
gatggttact gatgtgctgg actctcttgt gaggagggtt 2400acagtagcag
aatctgagtc tgctgttcaa aaggagaggg cacttttggg agaggaagaa
2460atcagtagga agactatcca aatcgaaaat ttgtccgtga agttagaaga
gatggaacga 2520tttgcttatg ggactaatag tgttctaaac gaaatgcggg
aaaggattga ggaattagtt 2580gaagagacga tgaggcagag ggaaaaagct
gtggaaaacg aagaggagtt gtgtcgtgtg 2640aagagagagt tcgagtcgct
taaaagctac gtcagtactt ttaccaatgt tcgagaaaca 2700cttctttcgt
ccgagagaca attcaaaacc attgaggagc tctttgaacg gttggtcact
2760aagacgacac aattagaagg ggagaaggca caaaaggagg ttgaagtaca
gaaactgatg 2820gaggagaatg tgaaattgac agcacttctc gacaagaaag
aggctcagct tctagctttg 2880aatgaacaat gcaaagttat ggctttgagt
gcatcaaaca tatgactctc taatccaacc 2940gaatctcaag cttcc
295558865DNAArabidopsis thaliana 58ggctgataaa tatagggaga actatttggg
tcacagtatc aaagcccctg ttggaagatg 60gcaaaaaggt aaagatcttc attggtatgc
tagagataaa aagcaaaagg gttccgagat 120ggatgctatg aaagaagaga
ttcaaagagt taaggaacaa gaggagcagg ccatgaggga 180ggctcttggc
ttggcaccaa agtcctctac aaggccacaa ggaaatcgcc ttgataagca
240agagtttact gaacttgtga agaggggttc gacagcagag gacttaggtg
cagggaatgc 300tgatgctgtg tgggttcacg gtcttggata tgctaaagca
ccacgacctt gggaagatcc 360gagcaccctt gcatcctctc agaaagaaga
tgcagattca gcacgcttac cagcagatac 420atcaggggtc aaaactgttg
aagatggacc ggatgatgtt gagagggacc aaagaaggat 480aggcgtgagg
aaaggaaacc tgcaaagaga gagaaggaag aaagacatga taggcgtgaa
540aaacgcgaaa ggcatgagaa gcgaagcgct cgtgattcag atgatagaaa
gaagcacaag 600aaagagaaga aggagaaaaa aagaaggcat gactctgatt
ctgattgaag cgaattgtcc 660caggatggaa cattttgctc ttcagaggaa
gagtggtcgg ctaggtacca aaatccagct 720accacttctg caagatttaa
atctgttgct tatttcattt acgaatcgtg gagtaaagtg 780ttgttgaaca
ttgttgaaaa tgtttgttaa aacacatgaa aaatgtggtt tgatattata
840acaaaccgag acgctcgttt tagct 86559723DNAArabidopsis
thaliana misc_feature (559)..(559) n = a, c, g, t 59gcaaaagaga
gaaacatctg acccggaatc tgacctgaaa acccggaaga atcgaaaaat 60ggggaaagat
ggtctgagcg acgatcaggt ctcgtcgatg aaggaagcct tcatgctctt
120cgacaccgat ggcgacggca aaatcgcacc gtcagagctc gggatcctca
tgcgatctct 180cggcggaaac ccgacccaag cccagctgaa atccataatc
gcatccgaga atctctcttc 240accgtttgat ttcaacagat tcctcgatct
catggcgaaa catctgaaga cggaaccttt 300cgatcgccag ctccgtgacg
cattcaaagt gctcgataag gaaggtaccg ggttcgttgc 360tgtggcggat
ctgaggcata ttctgaccag tatcggagag aagctggagc ctaatgagtt
420cgatgagtgg atcaaggagg tggatgttgg atccgatgga aagatccggt
atgaagattt 480catagcaagg atggttgcta agtgagatct aatcttttat
gttttgaaag ttgaaatttt 540taagaagaga ttcttttgng gttttttcac
ttggttggtt tgatttcgag cgaatcctaa 600ctaggggttg gtttatcatt
gnggaatttg cttactaact ttggcttctt catggttggg 660tttcaatttt
taatggnaaa tggtggctgg gggaattcct aaaaaaaaaa aaaaaaaaaa 720aaa
72360426DNAArabidopsis thaliana 60caaaaaaaga gatcgcttca atggagaaac
agagtactca accaatttgc ggccaagagg 60ctctccaact tctcaattgc gtcgcggagt
ctcctttcga tcaagagaaa tgcgtccgat 120ttttgcaatc tctcagagaa
tgcgttctat caaagaaagt aaagaagttc tcgataccga 180gtcaagatca
cgactctgag ggagcagctt cagctacaaa gagaccttca taacgttctt
240tgttccgatt ttcttttatc gtttgagttg taatcatgta attgatttta
atgtcatgcc 300ttggattcat aagctgggtc atgccttgtt tcccctttgt
tgtcttgtat gttgaatatt 360gcaaactcta aagagcatat ttataagaag
aaataaaagt ttctacaaaa aaaaaaaaaa 420aaaaaa 426611442DNAArabidopsis
thaliana 61tcaaaatcag aaactttcct tgacaaattt taacaaatct ctttctcgtt
ttctattgaa 60ttctccagta gcgcggtagt tagttttagg tggaagaaga atgacaacta
ctgggtctaa 120ttctaatcac aaccaccatg aaagcaataa taacaacaat
aaccctagta ctaggtcttg 180gggcacggcg gtttcaggtc aatctgtgtc
tactagcggc agtatgggct ctccgtcgag 240ccggagtgag caaaccatca
ccgttgttac atctactagc gacactactt ttcaacgcct 300gaataatttg
gacattcaag gtgatgatgc tggttctcaa ggagcttctg gtgttaagaa
360gaagaagagg ggacagcgtg cggctggtcc agataagact ggaagaggac
tacgtcaatt 420tagtatgaaa ggtcttatct ctttctctgc ccctattatg
ctttcatcta aatgcctttc 480aatttgtgaa aaggtggaaa gcaaaggaag
gacaacttac aatgaggttg cagacgagct 540tgttgctgaa tttgcacttc
caaataacga tggaacatcc cctgatcagc aacagtatga 600tgagaaaaac
ataagacgaa gagtatatga tgctttaaac gtcctcatgg ctatggatat
660aatatccaag gataaaaaag aaattcaatg gagaggtctt cctcggacaa
gcttaagcga 720cattgaagaa ttaaagaacg aacgactctc acttaggaac
agaattgaga agaaaactgc 780atattcccaa gaactggaag aacaagtaat
gaacatcatc gatactctcg gcttatctgc 840ttcctgcctt cagaatctga
tacagagaaa tgagcactta tatagctcag gaaatgctcc 900cagtggcggt
gttgctcttc cttttatcct tgtccagact cgtcctcacg caacagtaga
960agtggagata tcagaagata tgcagctcgt gcattttgat ttcaacagca
ctccatttga 1020gctccacgac gacaattttg tcctcaagac tatgaagttt
tgtgatcaac cgccgcaaca 1080accaaacggt cggaacaaca gccagctggt
ttgtcacaat ttcacgccag aaaaccctaa 1140caaaggcccc agcacaggtc
caacaccgca gctggatatg tacgagactc atcttcaatc 1200gcaacaacat
cagcagcatt ctcagctaca aatcattcct atgcctgaga ctaacaacgt
1260tacttccagc gctgatactg ctccagtgaa atccccgtct cttccaggga
taatgaactc 1320cagcatgaag ccggagaatt gaaacacgta tgaaggcccc
ttgtacaatt tctgtaaaac 1380tgtaaagtag ctcttgaaaa actttacctg
gttttttgac gaatagtctg tttagcggta 1440aa 1442621506DNAArabidopsis
thaliana 62atggcgctgc agaacattgg tgcttccaac cgtaacgatg ccttctacag
gtacaagatg 60cctaagatgg ttaccaaaac cgaaggcaaa ggtaatggca ttaagaccaa
cattatcaac 120aatgttgaga ttgccaaagc cttggctaga ccgccttctt
atacgaccaa gtactttggt 180tgtgagcttg gagcgcagtc taagtttgat
gagaagactg ggacgtcgct tgtgaatgga 240gctcacaaca cgtctaagct
tgctgggctt ttggagaatt ttattaagaa gtttgttcag 300tgttatggat
gtggtaaccc ggagactgag attattatta cgaagacgca gatggtgaat
360ctcaagtgtg ctgcttgtgg gtttatctct gaggtcgaca tgagggataa
gttgactaat 420ttcattctca agaacccacc tgagcagaag aaggtgtcaa
aggataagaa agcaatgagg 480aaagctgaga aggagaggct taaagaaggc
gagctagctg atgaggagca gagaaagctg 540aaagctaaga agaaagcatt
gtctaacggc aaggattcta agacgtctaa gaaccattct 600tctgatgagg
atataagccc gaagcatgat gagaatgctc tagaggtgga tgaggatgaa
660gatgatgatg atggtgtcga gtggcaaact gatacttccc gagaagctgc
tgagaaaaga 720atgatggaac agttgagtgc taaaactgcc gaaatggtga
tgctctctgc aatggaagta 780gaagagaaaa aggcgcccaa aagcaaatct
aacgggaacg ttgtgaaaac tgagaatcct 840cctccgcaag agaagaatct
cgtgcaggat atgaaagagt atctgaagaa agggtcacca 900ataagcgcgc
tcaaaagttt catctcgtct ctctctgaac ctcctcaaga catcatggac
960gcactcttca atgctctctt tgatggtgtg ggaaagggat tcgccaaaga
agtgactaag 1020aagaagaatt acttagcggc tgctgcaaca atgcaagagg
atggatcaca gatgcatctg 1080ctcaattcga ttgggacatt ctgtggaaag
aatggaaacg aagaagcttt gaaagaggtg 1140gctctggttc ttaaagcatt
gtacgaccaa gacatcattg aggaagaggt agtgttggat 1200tggtacgaaa
agggtctcac cggagctgac aaaagctcgc cggtttggaa gaatgttaag
1260ccttttgtgg agtggcttca gagcgctgag tctgagtccg aagaggagga
ttgagtcact 1320tttttcttcc ctcctaactt ttctttgcgg catttcttat
aatacttcgt cagttttcag 1380aattcttaaa tctttttgct gtgttcttat
aaagaaacat catctattaa agttgtcttc 1440gtttggattt ggttttgacg
actttgggaa atatttatgt ttaagaaaaa aaaaaaaaaa 1500aaaaaa
1506632631DNAArabidopsis thaliana 63atggcggcta acaaattcgc
gactctgatt catcggaaaa caaaccgaat cactttaatc 60ctcgtatacg cttttctcga
atggtcactc attttcttca ttttgctcaa ctctctcttt 120tcttatttca
tactcagatt cgctgattat ttcggtctta aacgtccttg tctcttctgc
180tctagactcg atcgtttctt cgatgcttct ggtaaatctc cttctcatcg
agatcttctc 240tgcgatgatc atgctctcca attacattca aaacctgttg
aagaatctaa ttgtggtttc 300ggagaatttc acaatgattt ggttcatcgt
ggttgttgcg tagagaagat aagttcgtca 360ctatgtgctc cgattgagtc
tgactttggg aatttagatt atccaattgg agatgaaggt 420cagatttaca
atggtcttaa gtttcctcga tcgatcttcg tctttgaaga agagaaagta
480ggatctgtaa atttgaatga ttctcaggaa gaaacagagg agaagaaagt
tccccaatct 540catgagaaac ttgaagatga tgatgttgat gaggagtttt
catgctatgt atcaagcttc 600gattgtaaga acaaagaaat tgcaacagag
aaggaagaag aaaacagagt ggatctacct 660atagaggtgg aaactgcaga
atcagctccg aaaaacctcg agttctatat tgatgaagaa 720gactgtcatt
tgattccagt tgaattctat aaaccgagtg aagaagttcg agagatttcc
780gacattaacg gagattttat cctcgatttc ggcgttgagc atgatttcac
ggcggcggcg 840gagacggagg aaatctccga ctttgcttcg ccgggtgaat
cgaaaccgga ggatgcagag 900acgaatctag ttgcttcgga aatggaaaac
gacgacgaag aaacagacgc agaggtttct 960ataggtacag agattcctga
tcatgagcaa atcggagata ttccttctca ccagctcatt 1020cctcaccacg
atgacgatga tcatgaggag gaaacgttgg
agttcaaaac agtaacgatt 1080gaaaccaaga tgccagtctt aaacatcaac
gaagagcgga ttttagaagc tcaaggctcg 1140atggaaagct cgcatagtag
tctacataac gctatgtttc acttagagca aagagtatct 1200gttgatggta
ttgaatgtcc tgaaggagta ctcactgttg ataagttgaa gtttgagtta
1260caagaagaga gaaaagcact tcacgcgtta tacgaggagc tggaggtaga
gaggaatgcg 1320tctgctgttg ctgccagtga aacaatggcg atgatcaata
ggttgcatga ggagaaagct 1380gcgatgcaga tggaagcgtt gcagtatcag
agaatgatgg aggagcaagc tgagtttgat 1440caagaagctt tgcagttgtt
gaatgagctt atggtgaata gagagaagga gaatgctgag 1500cttgagaagg
agctagaggt gtatagaaag agaatggagg agtatgaagc taaagagaaa
1560atggggatgt tgaggaggag attgagagat tcctctgttg attcgtatag
aaataatggc 1620gattctgatg agaatagcaa tggagagtta cagtttaaga
acgttgaagg ggttacggat 1680tggaaatata gagagaatga gatggagaat
acgccggtgg atgttgtact tcgtcttgat 1740gagtgtttag atgattatga
tggagagagg ctttcgattc ttgggagatt gaagtttctt 1800gaagagaaac
tcacagatct taataacgaa gaggacgacg aggaggaggc taaaacgttt
1860gagagtaatg gtagcatcaa tggaaatgag catattcatg gcaaagaaac
aaacgggaag 1920cacagagtta tcaagtcaaa gagattactt cccctgtttg
atgcggtcga tggagagatg 1980gaaaacgggt taagtaacgg aaaccatcac
gaaaacgggt ttgatgattc ggagaagggt 2040gagaatgtga cgatagaaga
agaagtggat gagctttacg agaggttaga agctctagag 2100gcagatagag
agttcttaag acattgtgtt ggttcattga aaaaaggaga caaaggtgta
2160catctcctcc atgagattct gcaacatctt cgtgatctaa ggaatatcga
tcttactcgc 2220gtcagagaaa acggagacat gagtttatga gtttgatttt
gagttttggg tttgagtcca 2280ctctttgcat agtgacccaa agaacaagaa
aaatcataca ggtatggaag tgacatgttg 2340cttgtgaggc aaggaacaac
gacaaggttt cagatgaaga agaaaacgtt ctcagaataa 2400aagtatttta
agtatatact ctgaggaaaa gtgtcagatc agaatgttcg tctttcttcg
2460ttcattttca ttattataag ttttgttttt tatattgaag atttatttag
agagagggaa 2520gtgtcagtat aatttcactt ttatatttta tatttgggag
ttgtctttat gagtggtggt 2580aatagaaaaa ggtagaatga tgagtgaaga
aaaaaaaaaa aaaaaaaaaa a 2631642743DNAArabidopsis thaliana
64atgtcagacg ctctttctgc gattccggcc gcagttcatc gcaatctctc cgataaactc
60tatgagaagc gcaaaaatgc tgcgcttgag cttgagaata ttgtgaagaa tctaacttct
120tcgggtgatc atgacaagat ctcgaaagtc attgagatgt tgattaagga
atttgccaaa 180tctcctcaag ctaatcatcg gaagggtggt ctaattggct
tagctgctgt aactgttggt 240ttgtctacag aagctgctca atatcttgag
caaatagtgc cacctgtgat taattccttt 300tctgatcaag atagccgagt
tcggtactat gcatgtgaag ctctctataa cattgcaaag 360gttgtgcgag
gcgatttcat tattttcttc aataagatat ttgatgcctt atgcaaactc
420tcagcagatt ctgatgccaa tgtccaaagt gctgctcatc ttttggatcg
ccttgttaag 480gatattgtga cggaaagtga tcagttcagt attgaggaat
tcatacctct tttaaaagag 540cgaatgaacg ttctcaaccc ttacgtccgg
caatttctgg ttggatggat cactgttctt 600gatagtgttc cagacattga
catgcttggg tttctgccag actttctcga tgggttattc 660aatatgttga
gcgactctag tcatgaaata cgacagcaag ctgattcagc tctttcagag
720tttcttcaag agataaaaaa ttcaccatct gtagattatg gtcgcatggc
tgaaatactg 780gtgcagaggg ctgcttctcc tgatgaattc actcgattaa
cagccatcac gtggataaac 840gagttcgtaa aacttggggg agaccagctc
gtgcgttatt atgctgacat tcttggggct 900atcttgcctt gcatatctga
caaagaagag aaaatcaggg tggttgctcg tgaaaccaat 960gaagaacttc
gttcaatcca tgttgaaccc tcagatggtt ttgatgttgg cgcaattctc
1020tctgttgcaa ggaggcagct atcaagtgag tttgaggcta ctcggattga
agcattgaat 1080tggatatcaa cacttttaaa caagcatcgt actgaggtct
tgtgcttcct gaatgacata 1140tttgacaccc ttctaaaagc actatctgat
tcttctgatg acgtggtgct cttggttctg 1200gaggttcatg ctggtgtagc
aaaagatcca caacactttc gccagctcat cgtatttctt 1260gtccacaatt
tccgagctga taattctctt ttggaaaggt atctggaaag aacatattat
1320ttagttggtc aaaacatatc tcgttatagg cgcggtgccc ttattgtccg
aagaatgtgt 1380gtacttttgg atgccgaaag agtctaccga gagctctcta
caattcttga gggagaagat 1440aatcttgact ttgcttctac catggttcag
gcattgaatt tgattttgct tacttccccg 1500gagttatcga aactgagaga
actattaaaa ggttcactcg tcaatcgcga agggaaagaa 1560cttttcgttg
ccttgtatac ttcatggtgc cattcaccca tggcaattat aagcctctgc
1620ttattagctc aggcttacca gcatgcgagt gtcgtgattc aatcattggt
agaagaagac 1680attaacgtca aatttctagt acagcttgat aaattgatcc
ggcttctgga aactccaatc 1740tttacttacc ttagattgca gcttctggaa
ccaggaaggt acacatggtt gctgaaaaca 1800ctttatggtc ttcttatgtt
acttcctcag caaagtgcgg cgttcaagat acttaggaca 1860agactcaaaa
ctgtgccaac gtactcattc agtactggaa accaaatagg cagagcaact
1920tcaggagttc ctttctctca gtataagcat caaaacgagg acggtgactt
agaagacgat 1980aacatcaaca gttctcacca aggaatcaat tttgctgtgc
ggctacaaca gttcgaaaac 2040gtacagaatc tacatcgtgg ccaggcaagg
actagagtga actactcata tcactcttcc 2100tcttcttcta catcaaagga
ggtgaggaga tctgaagaac aacaacagca gcagcagcaa 2160caacaacagc
aacaacaaca acaacaacga ccaccacctt cttcgacatc atcatcagtt
2220gcagataaca atagacctcc atcaagaact tcaagaaaag gccctggtca
attacagctt 2280taacctacct ggtaatcata aataataaat aatattccat
ccccgacaat catcatcttc 2340atcttctttg tgtggacacc accgatccct
tttgtctcct gtaaaattgt atatctctct 2400tttttagtaa ctcttcaagt
ttcgacggaa cttgtggaaa agctacggtc gtgtccatca 2460tctctttctc
tctgtcgggt tttttttatt tacgagagat tcttcttcag tccctcagtc
2520tacctttata ttgttttttt gggggtttct cgtttctttg aatttgtttc
attgtttgga 2580gctttttata tttttacctt atgtggagat gtaagaaaaa
gaagtgatca tgtggttttg 2640tgttgttttt ttataactgg aaaaccacat
gagtttgtag aggtcactta ttggatattt 2700tatgtcaaat gatgctcctt
tttacaaaaa aaaaaaaaaa aaa 2743652959DNAArabidopsis thaliana
65atgtcactct tgttcctcaa tcctccgttt ccctccaatt caatccaccc aattcctcgt
60cgtgccgccg gaatatcctc cattcgatgc tcaatttctg caccggagaa gaaaccgagg
120aggaggagga agcagaagcg cggcgacgga gctgagaatg acgactcttt
gtctttcgga 180agtggtgaag ctgtctccgc tctggagagg agtctccgcc
tcacttttat ggacgagctt 240atggaacgag ctagaaatcg agatacttca
ggtgtttctg aggttatcta tgacatgatt 300gctgctgggc ttagccctgg
acctcgttct ttccatggtt tggttgtagc tcacgcgctt 360aacggcgacg
aacaaggcgc gatgcactcg ctgagaaagg agctaggtgc aggccaacgt
420ccgcttcctg agactatgat tgctttggtt cgtctctctg gttcgaaagg
gaatgctacg 480agaggcctag aaatcctcgc cgctatggaa aagcttaagt
atgacattcg tcaagcttgg 540ctcattcttg ttgaggagct catgaggatc
aatcacttgg aagatgcgaa taaagttttc 600ttgaagggtg caagaggtgg
gatgagagca acagatcagc tttatgattt gatgattgaa 660gaagattgca
aagctggaga tcattctaat gccttagaca tctcttacga aatggaggca
720gctggtagaa tggccacaac atttcatttc aactgtcttc ttagtgtgca
ggctacatgt 780gggattcccg aggtagctta tgctacattc gaaaatatgg
agtacggtga aggtttattt 840atgaagcctg acactgagac atataactgg
gtgattcaag cctacactag agccgagtca 900tatgataggg ttcaggatgt
tgctgaatta cttggaatga tggttgagga ccacaaacgt 960gtgcagccaa
atgtgaagac ttatgcgctc ttagttgagt gcttcaccaa atattgtgtc
1020gtgaaggaag cgattagaca ttttcgtgct cttaaaaact ttgaaggagg
aacagtaatt 1080ttacacaatg cagggaattt tgaggatcct ctctctttgt
atctcagggc tttgtgtcga 1140gaaggaagaa ttgttgagct tattgatgct
ttagatgcaa tgcgcaaaga taaccaacct 1200atacctccaa gagccatgat
tatgagcaga aagtatcgaa cactagtcag ctcatggatt 1260gaaccattgc
aagaagaagc tgaacttggc tatgagattg attatttagc gaggtacata
1320gaggaagggg gacttactgg tgaacgcaag cgttgggtac ctcgaagagg
gaaaactcct 1380ttagatcccg atgcttctgg ttttatatac tcaaacccta
ttgaaacatc ctttaaacag 1440agatgccttg aagattggaa agttcaccat
aggaagctct tgagaacctt acagagtgaa 1500ggtcttccag ttctaggaga
tgcatcagaa tctgattaca tgagagtggt ggagagatta 1560cggaacataa
taaaaggtcc tgcactgaat cttttgaagc cgaaagcagc aagcaagatg
1620gttgtatcag agttaaagga agaactcgaa gctcagggtt tgccaattga
tggaacaaga 1680aatgtgcttt accagcgtgt ccaaaaagca aggagaataa
acaaatctcg aggtcgacct 1740ctttgggttc ctccaattga agaagaagag
gaggaggtcg atgaagaagt agacgattta 1800atatgtcgaa tcaagctaca
tgaaggagac acagagttct ggaaacgtcg gtttcttgga 1860gaaggcttga
ttgaaacttc agttgaatcc aaggaaacga ctgaatcagt ggttacaggt
1920gaatcggaga aagcgattga agatatttca aaagaagctg acaatgagga
ggatgatgat 1980gaggaggaac aagaaggaga tgaggatgat gatgaaaatg
aagaggaaga agtggttgtt 2040ccagaaactg agaatcgagc agaaggagaa
gatttagtga agaataaggc agctgacgcg 2100aagaagcatc ttcaaatgat
tggagtccaa ctcttgaaag aatccgatga agcaaacaga 2160acaaagaaac
gtgggaagag ggcatctcgt atgacacttg aggatgatgc agatgaggat
2220tggttccctg aggaaccatt tgaagcattc aaagaaatga gggaaagaaa
agtgttcgat 2280gtggctgaca tgtatacaat agcagacgtt tggggttgga
catgggagaa ggattttaag 2340aacaaaactc caaggaaatg gtcacaagag
tgggaagtcg agttggcaat tgtgctcatg 2400acaaaggtga ttgaattggg
tggaattcca acgattggtg attgtgcagt gatattacga 2460gctgctttaa
gagctcccat gccttcagcc ttcttgaaga tcttgcagac gacacacagt
2520cttggctact catttggcag cccgttgtac gatgagatca tcacattgtg
tttggacctt 2580ggagaacttg atgcagccat cgccatagtt gcagatatgg
aaaccacagg gatcactgtc 2640cctgatcaaa cccttgacaa ggtcatatct
gctagacaat ctaatgagag tccgcggtct 2700gagcctgaag agccagcatc
aacagtaagc tcttagttat catatcctct tctgcttgtt 2760gtgaagtctc
tataagaaac agaaatcggt agaaggagct gaatctgtct tagttatgaa
2820agttttgttc attataagta caagtcatgt agttccgagt gtagaacagt
ttttactagt 2880gttgcaccag gtccctccag tctgatactt aattctttag
tgttggatct ttctatataa 2940gaaaaaaaaa aaaaaaaaa
2959661295DNAArabidopsis thaliana 66aagcttcgaa gtcgatttca
atggaaggtt cctcgtcagc catcgcgagg aagacatggg 60agctagagaa caacattctc
ccagtggaac caaccgattc agcctccgac agtatattcc 120actacgacga
cgcttcacaa gccaaaatcc agcaggagaa gccatgggcc tccgatccta
180actacttcaa gcgcgttcac atctcagccc ttgctcttct caagatggtg
gttcacgctc 240gctccggtgg cacaatcgag atcatgggtc ttatgcaggg
taaaaccgag ggtgatacaa 300tcatcgttat ggatgctttt gctttgcctg
ttgaaggtac tgagactagg gttaatgctc 360agtctgatgc ctatgagtat
atggttgaat actctcagac cagcaagctg gctgggaggt 420tggagaacgt
tgttggatgg tatcactctc accctgggta tggatgttgg ctctcgggta
480ttgatgtttc gacacagatg cttaaccaac agtatcagga gccattctta
gctgttgtta 540ttgatccaac aaggactgtt tcggctggta aggttgagat
tggggcattc agaacatatc 600cagagggaca taagatctcg gatgatcatg
tttctgagta tcagactatc cctcttaaca 660agattgagga ctttggtgta
cattgcaaac agtactactc attggacatc acttatttca 720agtcatctct
cgatagtcac cttctggatc tcctttggaa caagtactgg gtgaacactc
780tttcttcttc cccactgttg ggcaatggag actatgttgc cgggcaaata
tcagacttgg 840ctgagaagct cgagcaagcg gagagtcagc tcgctaactc
ccggtatgga ggaattgcgc 900cagccggtca ccaaaggagg aaagaggatg
agcctcaact cgcgaagata actcgggata 960gtgcaaagat aactgtcgag
caggtccatg gactaatgtc acaggttatc aaagacatct 1020tgttcaattc
cgctcgtcag tccaagaagt ctgctgacga ctcatcagat ccagagccca
1080tgattacatc gtgaagttgg tctattcttt tgttttttgg ctgcggaaat
tgactatcgg 1140tttgacccgg tttatgaggc aatgcccatt gttccctata
tctctagtgt agtatctgct 1200tcagacaaag atctttgggt tattaaatga
cattaacata aatcgatcat tatgtttttg 1260cgttaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaa 129567340PRTArabidopsis thaliana 67Pro His Ile Arg
Asp Glu Glu Thr Lys Lys Pro Asp Ser Val Ser Ser1 5 10 15Glu Glu Pro
Glu Thr Ile Ile Ile Asp Val Asp Glu Ser Asp Lys Glu 20 25 30Gly Gly
Asp Ser Asn Glu Pro Met Phe Val Gln His Thr Glu Ala Met 35 40 45Leu
Glu Glu Ile Glu Gln Met Glu Lys Glu Ile Glu Met Glu Asp Ala 50 55
60Asp Lys Glu Glu Glu Pro Val Ile Asp Ile Asp Ala Cys Asp Lys Asn65
70 75 80Asn Pro Leu Ala Ala Val Glu Tyr Ile His Asp Met His Thr Phe
Tyr 85 90 95Lys Asn Phe Glu Lys Leu Ser Cys Val Pro Pro Asn Tyr Met
Asp Asn 100 105 110Gln Gln Asp Leu Asn Glu Arg Met Arg Gly Ile Leu
Ile Asp Trp Leu 115 120 125Ile Glu Val His Tyr Lys Phe Glu Leu Met
Glu Glu Thr Leu Tyr Leu 130 135 140Thr Ile Asn Val Ile Asp Arg Phe
Leu Ala Val His Gln Ile Val Arg145 150 155 160Lys Lys Leu Gln Leu
Val Gly Val Thr Ala Leu Leu Leu Ala Cys Lys 165 170 175Tyr Glu Glu
Val Ser Val Pro Val Val Asp Asp Leu Ile Leu Ile Ser 180 185 190Asp
Lys Ala Tyr Ser Arg Arg Glu Val Leu Asp Met Glu Lys Leu Met 195 200
205Ala Asn Thr Leu Gln Phe Asn Phe Ser Leu Pro Thr Pro Tyr Val Phe
210 215 220Met Lys Arg Phe Leu Lys Ala Ala Gln Ser Asp Lys Lys Leu
Glu Ile225 230 235 240Leu Ser Phe Phe Met Ile Glu Leu Cys Leu Val
Glu Tyr Glu Met Leu 245 250 255Glu Tyr Leu Pro Ser Lys Leu Ala Ala
Ser Ala Ile Tyr Thr Ala Gln 260 265 270Cys Thr Leu Lys Gly Phe Glu
Glu Trp Ser Lys Thr Cys Glu Phe His 275 280 285Thr Gly Tyr Asn Glu
Lys Gln Leu Leu Ala Cys Ala Arg Lys Met Val 290 295 300Ala Phe His
His Lys Ala Gly Thr Gly Lys Leu Thr Gly Val His Arg305 310 315
320Lys Tyr Asn Thr Ser Lys Phe Cys His Ala Ala Arg Thr Glu Pro Ala
325 330 335Gly Phe Leu Ile 34068145PRTArabidopsis thaliana 68Pro
Asp Ser Gly Thr Ala Ala Gly Gly Ser Asn Ser Asp Pro Phe Pro1 5 10
15Ala Asn Leu Arg Val Leu Val Val Asp Asp Asp Pro Thr Cys Leu Met
20 25 30Ile Leu Glu Arg Met Leu Met Thr Cys Leu Tyr Arg Val Thr Lys
Cys 35 40 45Asn Arg Ala Glu Ser Ala Leu Ser Leu Leu Arg Lys Asn Lys
Asn Gly 50 55 60Phe Asp Ile Val Ile Ser Asp Val His Met Pro Asp Met
Asp Gly Phe65 70 75 80Lys Leu Leu Glu His Val Gly Leu Glu Met Asp
Leu Pro Val Ile Met 85 90 95Met Ser Ala Asp Asp Ser Lys Ser Val Val
Leu Lys Gly Val Thr His 100 105 110Gly Ala Val Asp Tyr Leu Ile Lys
Pro Val Arg Ile Glu Ala Leu Lys 115 120 125Asn Ile Trp Gln His Val
Val Arg Lys Lys Arg Asn Arg Val Glu Trp 130 135
140Phe14569450PRTArabidopsis thaliana 69Met Gly Lys Glu Asn Ala Val
Ser Arg Pro Phe Thr Arg Ser Leu Ala1 5 10 15Ser Ala Leu Arg Ala Ser
Glu Val Thr Ser Thr Thr Gln Asn Gln Gln 20 25 30Arg Val Asn Thr Lys
Arg Pro Ala Leu Glu Asp Thr Arg Ala Thr Gly 35 40 45Pro Asn Lys Arg
Lys Lys Arg Ala Val Leu Gly Glu Ile Thr Asn Val 50 55 60Asn Ser Asn
Thr Ala Ile Leu Glu Ala Lys Asn Ser Lys Gln Ile Lys65 70 75 80Lys
Gly Arg Gly His Gly Leu Ala Ser Thr Ser Gln Leu Ala Thr Ser 85 90
95Val Thr Ser Glu Val Thr Asp Leu Gln Ser Arg Thr Asp Ala Lys Val
100 105 110Glu Val Ala Ser Asn Thr Ala Gly Asn Leu Ser Val Ser Lys
Gly Thr 115 120 125Asp Asn Thr Ala Asp Asn Cys Ile Glu Ile Trp Asn
Ser Arg Leu Pro 130 135 140Pro Arg Pro Leu Gly Arg Ser Ala Ser Thr
Ala Glu Lys Ser Ala Val145 150 155 160Ile Gly Ser Ser Thr Val Pro
Asp Ile Pro Lys Phe Val Asp Ile Asp 165 170 175Ser Asp Asp Lys Asp
Pro Leu Leu Cys Cys Leu Tyr Ala Pro Glu Ile 180 185 190His Tyr Asn
Leu Arg Val Ser Glu Leu Lys Arg Arg Pro Leu Pro Asp 195 200 205Phe
Met Glu Arg Ile Gln Lys Asp Val Thr Gln Ser Met Arg Gly Ile 210 215
220Leu Val Asp Trp Leu Val Glu Val Ser Glu Glu Tyr Thr Leu Ala
Ser225 230 235 240Asp Thr Leu Tyr Leu Thr Val Tyr Leu Ile Asp Trp
Phe Leu His Gly 245 250 255Asn Tyr Val Gln Arg Gln Gln Leu Gln Leu
Leu Gly Ile Thr Cys Met 260 265 270Leu Ile Ala Ser Lys Tyr Glu Glu
Ile Ser Ala Pro Arg Ile Glu Glu 275 280 285Phe Cys Phe Ile Thr Asp
Asn Thr Tyr Thr Arg Asp Gln Val Leu Glu 290 295 300Met Glu Asn Gln
Val Leu Lys His Phe Ser Phe Gln Ile Tyr Thr Pro305 310 315 320Thr
Pro Lys Thr Phe Leu Arg Arg Phe Leu Arg Ala Ala Gln Ala Ser 325 330
335Arg Leu Ser Pro Ser Leu Glu Val Glu Phe Leu Ala Ser Tyr Leu Thr
340 345 350Glu Leu Thr Leu Ile Asp Tyr His Phe Leu Lys Phe Leu Pro
Ser Val 355 360 365Val Ala Ala Ser Ala Gly Phe Leu Ala Lys Trp Thr
Met Asp Gln Ser 370 375 380Asn His Pro Trp Asn Pro Thr Leu Glu His
Tyr Thr Thr Tyr Lys Ala385 390 395 400Ser Asp Leu Lys Ala Ser Val
His Ala Leu Gln Asp Leu Gln Leu Asn 405 410 415Thr Lys Gly Cys Pro
Leu Ser Ala Ile Arg Met Lys Tyr Arg Gln Glu 420 425 430Lys Tyr Lys
Ser Val Ala Val Leu Thr Ser Pro Lys Leu Leu Asp Thr 435 440 445Leu
Phe 45070223PRTArabidopsis thaliana 70Met Gly Lys Lys Cys Asp Leu
Cys Asn Gly Val Ala Arg Met Tyr Cys1 5 10 15Glu Ser Asp Gln Ala Ser
Leu Cys Trp Asp Cys Asp Gly Lys Val His 20 25 30Gly Ala Asn Phe Leu
Val Ala Lys His Thr Arg Cys Leu Leu Cys Ser 35 40 45Ala Cys Gln Ser
Leu Thr Pro Trp Lys Ala Thr Gly Leu Arg Leu Gly 50 55 60Pro Thr Phe
Ser Val Cys Glu Ser Cys Val Ala Leu Lys Asn Ala Gly65 70 75 80Gly
Gly Arg Gly Asn Arg Val Leu Ser Glu Asn Arg Gly Gln Glu
Glu 85 90 95Val Asn Ser Phe Glu Ser Glu Glu Asp Arg Ile Arg Glu Asp
His Gly 100 105 110Asp Gly Asp Asp Ala Glu Ser Tyr Asp Asp Asp Glu
Glu Glu Asp Glu 115 120 125Asp Glu Glu Tyr Ser Asp Asp Glu Asp Glu
Asp Asp Asp Glu Asp Gly 130 135 140Asp Asp Glu Glu Ala Glu Asn Gln
Val Val Pro Trp Ser Ala Ala Ala145 150 155 160Gln Val Pro Pro Val
Met Ser Ser Ser Ser Ser Asp Gly Gly Ser Gly 165 170 175Gly Ser Val
Thr Lys Arg Thr Arg Ala Arg Glu Asn Ser Asp Leu Leu 180 185 190Cys
Ser Asp Asp Glu Ile Gly Ser Ser Ser Ala Gln Gly Ser Asn Tyr 195 200
205Ser Arg Pro Leu Lys Arg Ser Ala Phe Lys Ser Thr Val Val Val 210
215 22071429PRTArabidopsis thaliana 71Met Val Asn Ser Cys Glu Asn
Lys Ile Phe Val Lys Pro Thr Ser Thr1 5 10 15Thr Ile Leu Gln Asp Glu
Thr Arg Ser Arg Lys Phe Gly Gln Glu Met 20 25 30Lys Arg Glu Lys Arg
Arg Val Leu Arg Val Ile Asn Gln Asn Leu Ala 35 40 45Gly Ala Arg Val
Tyr Pro Cys Val Val Asn Lys Lys Gly Ser Leu Leu 50 55 60Ser Asn Lys
Gln Glu Glu Glu Glu Gly Cys Gln Lys Lys Lys Phe Asp65 70 75 80Ser
Leu Arg Pro Ser Val Thr Arg Ser Gly Val Glu Glu Glu Thr Asn 85 90
95Lys Lys Leu Lys Pro Ser Val Pro Ser Ala Asn Asp Phe Gly Asp Cys
100 105 110Ile Phe Ile Asp Glu Glu Glu Ala Thr Leu Asp Leu Pro Met
Pro Met 115 120 125Ser Leu Glu Lys Pro Tyr Ile Glu Ala Asp Pro Met
Glu Glu Val Glu 130 135 140Met Glu Asp Val Thr Val Glu Glu Pro Ile
Val Asp Ile Asp Val Leu145 150 155 160Asp Ser Lys Asn Ser Leu Ala
Ala Val Glu Tyr Val Gln Asp Leu Tyr 165 170 175Ala Phe Tyr Arg Thr
Met Glu Arg Phe Ser Cys Val Pro Val Asp Tyr 180 185 190Met Met Gln
Gln Ile Asp Leu Asn Glu Lys Met Arg Ala Ile Leu Ile 195 200 205Asp
Trp Leu Ile Glu Val His Asp Lys Phe Asp Leu Met Asn Glu Thr 210 215
220Leu Phe Leu Thr Val Asn Leu Ile Asp Arg Phe Leu Ser Lys Gln
Asn225 230 235 240Val Met Arg Lys Lys Leu Gln Leu Val Gly Leu Val
Ala Leu Leu Leu 245 250 255Ala Cys Lys Tyr Glu Glu Val Ser Val Pro
Val Val Glu Asp Leu Val 260 265 270Leu Ile Ser Asp Lys Ala Tyr Thr
Arg Asn Asp Val Leu Glu Met Glu 275 280 285Lys Thr Met Leu Ser Thr
Leu Gln Phe Asn Ile Ser Leu Pro Thr Gln 290 295 300Tyr Pro Phe Leu
Lys Arg Phe Leu Lys Ala Ala Gln Ala Asp Lys Lys305 310 315 320Cys
Glu Val Leu Ala Ser Phe Leu Ile Glu Leu Ala Leu Val Glu Tyr 325 330
335Glu Met Leu Arg Phe Pro Pro Ser Leu Leu Ala Ala Thr Ser Val Tyr
340 345 350Thr Ala Gln Cys Thr Leu Asp Gly Ser Arg Lys Trp Asn Ser
Thr Cys 355 360 365Glu Phe His Cys His Tyr Ser Glu Asp Gln Leu Met
Glu Cys Ser Arg 370 375 380Lys Leu Val Ser Leu His Gln Arg Ala Ala
Thr Gly Asn Leu Thr Gly385 390 395 400Val Tyr Arg Lys Tyr Ser Thr
Ser Lys Phe Gly Tyr Ile Ala Lys Cys 405 410 415Glu Ala Ala His Phe
Leu Val Ser Glu Ser His His Ser 420 42572359PRTArabidopsis thaliana
72Thr Lys Gln Glu Ala Lys Ala Ala Phe Lys Ser Leu Leu Glu Ser Val1
5 10 15Asn Val His Ser Asp Trp Thr Trp Glu Gln Thr Leu Lys Glu Ile
Val 20 25 30His Asp Lys Arg Tyr Gly Ala Leu Arg Thr Leu Gly Glu Arg
Lys Gln 35 40 45Ala Phe Asn Glu Tyr Leu Gly Gln Arg Lys Lys Val Glu
Ala Glu Glu 50 55 60Arg Arg Arg Arg Gln Lys Lys Ala Arg Glu Glu Phe
Val Lys Met Leu65 70 75 80Glu Glu Cys Glu Glu Leu Ser Ser Ser Leu
Lys Trp Ser Lys Ala Met 85 90 95Ser Leu Phe Glu Asn Asp Gln Arg Phe
Lys Ala Val Asp Arg Pro Arg 100 105 110Asp Arg Glu Asp Leu Phe Asp
Asn Tyr Ile Val Glu Leu Glu Arg Lys 115 120 125Glu Arg Glu Lys Ala
Ala Glu Glu His Arg Gln Tyr Met Ala Asp Tyr 130 135 140Arg Lys Phe
Leu Glu Thr Cys Asp Tyr Ile Lys Ala Gly Thr Gln Trp145 150 155
160Arg Lys Ile Gln Asp Arg Leu Glu Asp Asp Asp Arg Cys Ser Cys Leu
165 170 175Glu Lys Ile Asp Arg Leu Ile Gly Phe Glu Glu Tyr Ile Leu
Asp Leu 180 185 190Glu Lys Glu Glu Glu Glu Leu Lys Arg Val Glu Lys
Glu His Val Arg 195 200 205Arg Ala Glu Arg Lys Asn Arg Asp Ala Phe
Arg Thr Leu Leu Glu Glu 210 215 220His Val Ala Ala Gly Ile Leu Thr
Ala Lys Thr Tyr Trp Leu Asp Tyr225 230 235 240Cys Ile Glu Leu Lys
Asp Leu Pro Gln Tyr Gln Ala Val Ala Ser Asn 245 250 255Thr Ser Gly
Ser Thr Pro Lys Asp Leu Phe Glu Asp Val Thr Glu Glu 260 265 270Leu
Glu Lys Gln Tyr His Glu Asp Lys Ser Tyr Val Lys Asp Ala Met 275 280
285Lys Ser Arg Lys Ile Ser Met Val Ser Ser Trp Leu Phe Glu Asp Phe
290 295 300Lys Ser Ala Ile Ser Glu Asp Leu Ser Thr Gln Gln Ile Ser
Asp Ile305 310 315 320Asn Leu Lys Leu Ile Tyr Asp Asp Leu Val Gly
Arg Val Lys Glu Lys 325 330 335Glu Glu Lys Glu Ala Arg Lys Leu Gln
Arg Leu Ala Glu Glu Phe Thr 340 345 350Asn Leu Leu His Thr Phe Lys
35573110PRTArabidopsis thaliana 73Gln Glu Lys Pro Trp Glu Asn Asp
Pro His Tyr Phe Lys Arg Val Lys1 5 10 15Ile Ser Ala Leu Ala Leu Leu
Lys Met Val Val His Ala Arg Ser Gly 20 25 30Gly Thr Ile Glu Ile Met
Gly Leu Met Gln Gly Lys Thr Asp Gly Asp 35 40 45Thr Ile Ile Val Met
Asp Ala Phe Ala Leu Pro Val Glu Gly Thr Glu 50 55 60Thr Arg Val Asn
Ala Gln Asp Asp Ala Tyr Glu Tyr Met Val Glu Tyr65 70 75 80Ser Gln
Thr Asn Lys Leu Ala Gly Pro Ala Gly Glu Cys Cys Trp Met 85 90 95Val
Ser Leu Ser Pro Trp Ile Trp Met Leu Ala Leu Arg Tyr 100 105
11074337PRTArabidopsis thaliana 74Val Asp Ser Pro Asp Ser Thr Ser
Asp Asn Ile Phe Tyr Tyr Asp Asp1 5 10 15Thr Ser Gln Thr Arg Phe Gln
Gln Glu Lys Pro Trp Glu Asn Asp Pro 20 25 30His Tyr Phe Lys Arg Val
Lys Ile Ser Ala Leu Ala Leu Leu Lys Met 35 40 45Val Val His Ala Arg
Ser Gly Gly Thr Ile Glu Ile Met Gly Leu Met 50 55 60Gln Gly Lys Thr
Asp Gly Asp Thr Ile Ile Val Met Asp Ala Phe Ala65 70 75 80Leu Pro
Val Glu Gly Thr Glu Thr Arg Val Asn Ala Gln Asp Asp Ala 85 90 95Tyr
Glu Tyr Met Val Glu Tyr Ser Gln Thr Asn Lys Leu Ala Gly Arg 100 105
110Leu Glu Asn Val Val Gly Trp Tyr His Ser His Pro Gly Tyr Gly Cys
115 120 125Trp Leu Ser Gly Ile Asp Val Ser Thr Gln Arg Leu Asn Gln
Gln His 130 135 140Gln Glu Pro Phe Leu Ala Val Val Ile Asp Pro Thr
Arg Thr Val Ser145 150 155 160Ala Gly Lys Val Glu Ile Gly Ala Phe
Arg Thr Tyr Ser Lys Gly Tyr 165 170 175Lys Pro Pro Asp Glu Pro Val
Ser Glu Tyr Gln Thr Ile Pro Leu Asn 180 185 190Lys Ile Glu Asp Phe
Gly Val His Cys Lys Gln Tyr Tyr Ser Leu Asp 195 200 205Val Thr Tyr
Phe Lys Ser Ser Leu Asp Ser His Leu Leu Asp Leu Leu 210 215 220Trp
Asn Lys Tyr Trp Val Asn Thr Leu Ser Ser Ser Pro Leu Leu Gly225 230
235 240Asn Gly Asp Tyr Val Ala Gly Gln Ile Ser Asp Leu Ala Glu Lys
Leu 245 250 255Glu Gln Ala Glu Ser His Leu Val Gln Ser Arg Phe Gly
Gly Val Val 260 265 270Pro Ser Ser Leu His Lys Lys Lys Glu Asp Glu
Ser Gln Leu Thr Lys 275 280 285Ile Thr Arg Asp Ser Ala Lys Ile Thr
Val Glu Gln Val His Gly Leu 290 295 300Met Ser Gln Val Ile Lys Asp
Glu Leu Phe Asn Ser Met Arg Gln Ser305 310 315 320Asn Asn Lys Ser
Pro Thr Asp Ser Ser Asp Pro Asp Pro Met Ile Thr 325 330
335Tyr75436PRTArabidopsis thaliana 75Met Tyr Cys Ser Ser Ser Met
His Pro Asn Ala Asn Lys Glu Asn Ile1 5 10 15Ser Thr Ser Asp Val Gln
Glu Ser Phe Val Arg Ile Thr Arg Ser Arg 20 25 30Ala Lys Lys Ala Met
Gly Arg Gly Val Ser Ile Pro Pro Thr Lys Pro 35 40 45Ser Phe Lys Gln
Gln Lys Arg Arg Ala Val Leu Lys Asp Val Ser Asn 50 55 60Thr Ser Ala
Asp Ile Ile Tyr Ser Glu Leu Arg Lys Gly Gly Asn Ile65 70 75 80Lys
Ala Asn Arg Lys Cys Leu Lys Glu Pro Lys Lys Ala Ala Lys Glu 85 90
95Gly Ala Asn Ser Ala Met Asp Ile Leu Val Asp Met His Thr Glu Lys
100 105 110Ser Lys Leu Ala Glu Asp Leu Ser Lys Ile Arg Met Ala Glu
Ala Gln 115 120 125Asp Val Ser Leu Ser Asn Phe Lys Asp Glu Glu Ile
Thr Glu Gln Gln 130 135 140Glu Asp Gly Ser Gly Val Met Glu Leu Leu
Gln Val Val Asp Ile Asp145 150 155 160Ser Asn Val Glu Asp Pro Gln
Cys Cys Ser Leu Tyr Ala Ala Asp Ile 165 170 175Tyr Asp Asn Ile His
Val Ala Glu Leu Gln Gln Arg Pro Leu Ala Asn 180 185 190Tyr Met Glu
Leu Val Gln Arg Asp Ile Asp Pro Asp Met Arg Lys Ile 195 200 205Leu
Ile Asp Trp Leu Val Glu Val Ser Asp Asp Tyr Lys Leu Val Pro 210 215
220Asp Thr Leu Tyr Leu Thr Val Asn Leu Ile Asp Arg Phe Leu Ser
Asn225 230 235 240Ser Tyr Ile Glu Arg Gln Arg Leu Gln Leu Leu Gly
Val Ser Cys Met 245 250 255Leu Ile Ala Ser Lys Tyr Glu Glu Leu Ser
Ala Pro Gly Val Glu Glu 260 265 270Phe Cys Phe Ile Thr Ala Asn Thr
Tyr Thr Arg Arg Glu Val Leu Ser 275 280 285Met Glu Ile Gln Ile Leu
Asn Phe Val His Phe Arg Leu Ser Val Pro 290 295 300Thr Thr Lys Thr
Phe Leu Arg Arg Phe Ile Lys Ala Ala Gln Ala Ser305 310 315 320Tyr
Lys Val Pro Phe Ile Glu Leu Glu Tyr Leu Ala Asn Tyr Leu Ala 325 330
335Glu Leu Thr Leu Val Glu Tyr Ser Phe Leu Arg Phe Leu Pro Ser Leu
340 345 350Ile Ala Ala Ser Ala Val Phe Leu Ala Arg Trp Thr Leu Asp
Gln Thr 355 360 365Asp His Pro Trp Asn Pro Thr Leu Gln His Tyr Thr
Arg Tyr Glu Val 370 375 380Ala Glu Leu Lys Asn Thr Val Leu Ala Met
Glu Asp Leu Gln Leu Asn385 390 395 400Thr Ser Gly Cys Thr Leu Ala
Ala Thr Arg Glu Lys Tyr Asn Gln Pro 405 410 415Lys Phe Lys Ser Val
Ala Lys Leu Thr Ser Pro Lys Arg Val Thr Leu 420 425 430Leu Phe Ser
Arg 43576254PRTArabidopsis thaliana 76Met Ala Lys Met Gln Leu Ser
Ile Phe Ile Ala Val Val Ala Leu Ile1 5 10 15Val Cys Ser Ala Ser Ala
Lys Thr Ala Ser Pro Pro Ala Pro Val Leu 20 25 30Pro Pro Thr Pro Ala
Pro Ala Pro Ala Pro Glu Asn Val Asn Leu Thr 35 40 45Glu Leu Leu Ser
Val Ala Gly Pro Phe His Thr Phe Leu Asp Tyr Leu 50 55 60Leu Ser Thr
Gly Val Ile Glu Thr Phe Gln Asn Gln Ala Asn Asn Thr65 70 75 80Glu
Glu Gly Ile Thr Ile Phe Val Pro Lys Asp Asp Ala Phe Lys Ala 85 90
95Gln Lys Asn Pro Pro Leu Ser Asn Leu Thr Lys Asp Gln Leu Lys Gln
100 105 110Leu Val Leu Phe His Ala Leu Pro His Tyr Tyr Ser Leu Ser
Glu Phe 115 120 125Lys Asn Leu Ser Gln Ser Gly Pro Val Ser Thr Phe
Ala Gly Gly Gln 130 135 140Tyr Ser Leu Lys Phe Thr Asp Val Ser Gly
Thr Val Arg Ile Asp Ser145 150 155 160Leu Trp Thr Arg Thr Lys Val
Ser Ser Ser Val Phe Ser Thr Asp Pro 165 170 175Val Ala Val Tyr Gln
Val Asn Arg Val Leu Leu Pro Glu Ala Ile Phe 180 185 190Gly Thr Asp
Val Pro Pro Met Pro Ala Pro Ala Pro Ala Pro Ile Val 195 200 205Ser
Ala Pro Ser Asp Ser Pro Ser Val Ala Asp Ser Glu Gly Ala Ser 210 215
220Ser Pro Lys Ser Ser His Lys Asn Ser Gly Gln Lys Leu Leu Leu
Ala225 230 235 240Pro Ile Ser Met Val Ile Ser Gly Leu Val Ala Leu
Phe Leu 245 2507786PRTArabidopsis thaliana 77Met Ala Ile Ser Lys
Ala Leu Ile Ala Ser Phe Leu Ile Ser Leu Leu1 5 10 15Val Leu Gln Leu
Val Gln Ala Asp Val Glu Asn Ser Gln Lys Lys Asn 20 25 30Gly Tyr Ala
Lys Lys Ile Asp Cys Gly Ser Ala Cys Val Ala Arg Leu 35 40 45Gln Ala
Phe Glu Glu Ala Glu Ala Val Ser Gln Ser Val Arg Asp Leu 50 55 60Leu
Leu Gln Val Gln Leu Cys Ala Ser Gly Tyr Val Arg Lys Leu Arg65 70 75
80Gln Val Pro Val Leu Arg 8578125PRTArabidopsis thaliana 78Lys Glu
Glu Ala Gly Met Tyr Trp Gly Tyr Lys Val Arg Tyr Ala Ser1 5 10 15Gln
Leu Ser Ser Val Phe Lys Glu Cys Pro Phe Glu Gly Gly Tyr Asp 20 25
30Tyr Leu Ile Gly Thr Ser Glu His Gly Leu Val Ile Ser Ser Ser Glu
35 40 45Leu Lys Ile Pro Thr Phe Arg His Leu Leu Ile Ala Phe Gly Gly
Leu 50 55 60Ala Gly Leu Glu Glu Ser Ile Glu Asp Asp Asn Gln Tyr Lys
Gly Lys65 70 75 80Asn Val Arg Asp Val Phe Asn Val Tyr Leu Asn Thr
Cys Pro His Gln 85 90 95Gly Ser Arg Thr Ile Arg Ala Glu Glu Ala Met
Phe Ile Ser Leu Gln 100 105 110Tyr Phe Gln Glu Pro Ile Ser Arg Ala
Val Arg Arg Leu 115 120 12579231PRTArabidopsis thaliana 79Ala Arg
Glu Met Gly Lys Lys Asn Lys Arg Ser Gln Asp Glu Ser Glu1 5 10 15Leu
Glu Leu Glu Pro Glu Leu Thr Lys Ile Ile Asp Gly Asp Ser Lys 20 25
30Lys Lys Lys Asn Lys Asn Lys Lys Lys Arg Ser His Glu Asp Thr Glu
35 40 45Ile Glu Pro Glu Gln Lys Met Ser Leu Asp Gly Asp Ser Arg Glu
Glu 50 55 60Lys Ile Lys Lys Lys Arg Lys Asn Lys Asn Gln Glu Glu Glu
Pro Glu65 70 75 80Leu Val Thr Glu Lys Thr Lys Val Gln Glu Glu Glu
Lys Gly Asn Val 85 90 95Glu Glu Gly Arg Ala Thr Val Ser Ile Ala Ile
Ala Gly Ser Ile Ile 100 105 110His Asn Thr Gln Ser Leu Glu Leu Ala
Thr Arg Val Ile Ser Leu Ser 115 120 125Leu Tyr Leu Ser Leu Arg Phe
Ser Val Phe Pro Phe Pro Asp Asn Leu 130 135 140Lys Ser Pro Ser Ser
Ile Ser Asn Ile Ser Gln Leu Ala Gly Gln Ile145 150 155 160Ala Arg Ala
Ala Thr Ile Phe Arg Ile Asp Glu Ile Val Val
Phe Asp 165 170 175Asn Lys Ser Ser Ser Glu Ile Glu Ser Ala Ala Thr
Asn Ala Ser Asp 180 185 190Ser Asn Glu Ser Gly Ala Ser Phe Leu Val
Arg Ile Leu Lys Tyr Leu 195 200 205Glu Thr Pro Gln Tyr Leu Arg Lys
Ser Leu Phe Pro Lys Gln Asn Asp 210 215 220Leu Arg Tyr Val Gly Met
Leu225 23080112PRTArabidopsis thaliana 80Val Ser Ala Val Trp His
Gly Leu Tyr Pro Gly Tyr Ile Ile Phe Phe1 5 10 15Val Gln Ser Ala Leu
Met Ile Asp Gly Ser Lys Ala Ile Tyr Arg Trp 20 25 30Gln Gln Ala Ile
Pro Pro Lys Met Ala Met Leu Arg Asn Val Leu Val 35 40 45Leu Ile Asn
Phe Leu Tyr Thr Val Val Val Leu Asn Tyr Ser Ser Val 50 55 60Gly Phe
Met Val Leu Ser Leu His Glu Thr Leu Val Ala Phe Lys Ser65 70 75
80Val Tyr Tyr Ile Gly Thr Val Ile Pro Ile Ala Val Leu Leu Leu Ser
85 90 95Tyr Leu Val Pro Val Lys Pro Val Arg Pro Lys Thr Arg Lys Glu
Glu 100 105 11081119PRTArabidopsis
thalianaMISC_FEATURE(97)..(98)Xaa = any amino acid 81Val Phe Glu
Tyr Met Asp Thr Asp Val Lys Lys Phe Ile Arg Ser Phe1 5 10 15Arg Ser
Thr Gly Lys Asn Ile Pro Thr Gln Thr Ile Lys Ser Leu Met 20 25 30Tyr
Gln Leu Cys Lys Gly Met Ala Phe Cys His Gly His Gly Ile Leu 35 40
45His Arg Asp Leu Lys Pro His Asn Leu Leu Met Asp Pro Lys Thr Met
50 55 60Arg Leu Lys Ile Ala Asp Leu Gly Leu Ala Arg Ala Phe Thr Leu
Pro65 70 75 80Met Lys Lys Tyr Thr His Glu Ile Leu Thr Leu Trp Tyr
Arg Ala Pro 85 90 95Xaa Xaa Ser Ser Trp Cys His Pro Leu Leu Tyr Ser
Cys Gly Tyr Val 100 105 110Xaa Cys Trp Leu His Ile Cys
11582296PRTArabidopsis thaliana 82Pro Lys Arg Arg Met Ser Met Glu
Met Glu Leu Phe Val Thr Pro Glu1 5 10 15Lys Gln Arg Gln His Pro Ser
Val Ser Val Glu Lys Thr Pro Val Arg 20 25 30Arg Lys Leu Ile Val Asp
Asp Asp Ser Glu Ile Gly Ser Glu Lys Lys 35 40 45Gly Gln Ser Arg Thr
Ser Gly Gly Gly Leu Arg Gln Phe Ser Val Met 50 55 60Val Cys Gln Lys
Leu Glu Ala Lys Lys Ile Thr Thr Tyr Lys Glu Val65 70 75 80Ala Asp
Glu Ile Ile Ser Asp Phe Ala Thr Ile Lys Gln Asn Ala Glu 85 90 95Lys
Pro Leu Asn Glu Asn Glu Tyr Asn Glu Lys Asn Ile Arg Arg Arg 100 105
110Val Tyr Asp Ala Leu Asn Val Phe Met Ala Leu Asp Ile Ile Ala Arg
115 120 125Asp Lys Lys Glu Ile Arg Trp Lys Gly Leu Pro Ile Thr Cys
Lys Lys 130 135 140Asp Val Glu Glu Val Lys Met Asp Arg Asn Lys Val
Met Ser Ser Val145 150 155 160Gln Lys Lys Ala Ala Phe Leu Lys Glu
Leu Arg Glu Lys Val Ser Ser 165 170 175Leu Glu Ser Leu Met Ser Arg
Asn Gln Glu Met Val Val Lys Thr Gln 180 185 190Gly Pro Ala Glu Gly
Phe Thr Leu Pro Phe Ile Leu Leu Glu Thr Asn 195 200 205Pro His Ala
Val Val Glu Ile Glu Ile Ser Glu Asp Met Gln Leu Val 210 215 220His
Leu Asp Phe Asn Ser Thr Pro Phe Ser Val His Asp Asp Ala Tyr225 230
235 240Ile Leu Lys Leu Met Gln Glu Gln Lys Gln Glu Gln Asn Arg Val
Ser 245 250 255Ser Ser Ser Ser Thr His His Gln Ser Gln His Ser Ser
Ala His Ser 260 265 270Ser Ser Ser Ser Cys Ile Ala Ser Gly Thr Ser
Gly Pro Val Cys Trp 275 280 285Asn Ser Gly Ser Ile Asp Thr Arg 290
29583173PRTArabidopsis thaliana 83Met Gln Pro Thr Glu Thr Ser Gln
Pro Ala Pro Ser Asp Gln Gly Arg1 5 10 15Arg Leu Lys Asp Gln Leu Ser
Glu Ser Met Ser Phe Ser Ser Gln Met 20 25 30Lys Lys Glu Asp Asp Glu
Leu Ser Met Lys Ala Leu Ser Ala Phe Lys 35 40 45Ala Lys Glu Glu Glu
Ile Glu Lys Lys Lys Met Glu Ile Arg Glu Arg 50 55 60Val Gln Ala Gln
Leu Gly Arg Val Glu Asp Glu Ser Lys Arg Leu Ala65 70 75 80Met Ile
Arg Glu Glu Leu Glu Gly Phe Ala Asp Pro Met Arg Lys Glu 85 90 95Val
Thr Met Val Arg Lys Lys Ile Asp Ser Leu Asp Lys Glu Leu Lys 100 105
110Pro Leu Gly Asn Thr Val Gln Lys Lys Glu Thr Glu Tyr Lys Asp Ala
115 120 125Leu Glu Ala Phe Asn Glu Lys Asn Lys Glu Lys Val Glu Leu
Ile Thr 130 135 140Lys Leu Gln Glu Leu Glu Gly Glu Ser Glu Lys Phe
Arg Phe Lys Lys145 150 155 160Leu Glu Glu Leu Ser Lys Asn Ile Asp
Leu Thr Lys Pro 165 1708446PRTArabidopsis thaliana 84Gln Lys Gln
Ala Pro Gly Ala Gly Asp Val Pro Ala Thr Ile Gln Glu1 5 10 15Glu Asp
Asp Asp Asp Asp Val Pro Asp Leu Val Val Gly Glu Thr Phe 20 25 30Glu
Thr Pro Ala Thr Glu Glu Ala Pro Lys Ala Ala Ala Ser 35 40
4585383PRTArabidopsis thaliana 85Met Glu Asp Asp Asp Glu Ile Gln
Ser Ile Pro Ser Pro Gly Asp Ser1 5 10 15Ser Leu Ser Pro Gln Ala Pro
Pro Ser Pro Pro Ile Leu Pro Thr Asn 20 25 30Asp Val Thr Val Ala Val
Val Lys Lys Pro Gln Pro Gly Leu Ser Ser 35 40 45Gln Ser Pro Ser Met
Asn Ala Leu Ala Leu Val Val His Thr Pro Ser 50 55 60Val Thr Gly Gly
Gly Gly Ser Gly Asn Arg Asn Gly Arg Gly Gly Gly65 70 75 80Gly Gly
Ser Gly Gly Gly Gly Gly Gly Arg Asp Asp Cys Trp Ser Glu 85 90 95Glu
Ala Thr Lys Val Leu Ile Glu Ala Trp Gly Asp Arg Phe Ser Glu 100 105
110Pro Gly Lys Gly Thr Leu Lys Gln Gln His Trp Lys Glu Val Ala Glu
115 120 125Ile Val Asn Lys Ser Arg Gln Cys Lys Tyr Pro Lys Thr Asp
Ile Gln 130 135 140Cys Lys Asn Arg Ile Asp Thr Val Lys Lys Lys Tyr
Lys Gln Glu Lys145 150 155 160Ala Lys Ile Ala Ser Gly Asp Gly Pro
Ser Lys Trp Val Phe Phe Lys 165 170 175Lys Leu Glu Ser Leu Ile Gly
Gly Thr Thr Thr Phe Ile Ala Ser Ser 180 185 190Lys Ala Ser Glu Lys
Ala Pro Met Gly Gly Ala Leu Gly Asn Ser Arg 195 200 205Ser Ser Met
Phe Lys Arg Gln Thr Lys Gly Asn Gln Ile Val Gln Gln 210 215 220Gln
Gln Glu Lys Arg Gly Ser Asp Ser Met Arg Trp His Phe Arg Lys225 230
235 240Arg Ser Ala Ser Glu Thr Glu Ser Glu Ser Asp Pro Glu Pro Glu
Ala 245 250 255Ser Pro Glu Glu Ser Ala Glu Ser Leu Pro Pro Leu Gln
Pro Ile Gln 260 265 270Pro Leu Ser Phe His Met Pro Lys Arg Leu Lys
Val Asp Lys Ser Gly 275 280 285Gly Gly Gly Ser Gly Val Gly Asp Val
Ala Arg Ala Ile Leu Gly Phe 290 295 300Thr Glu Ala Tyr Glu Lys Ala
Glu Thr Ala Lys Leu Lys Leu Met Ala305 310 315 320Glu Leu Glu Lys
Glu Arg Met Lys Phe Ala Lys Glu Met Glu Leu Gln 325 330 335Arg Met
Gln Phe Leu Lys Thr Gln Leu Glu Ile Thr Gln Asn Asn Gln 340 345
350Glu Glu Glu Glu Arg Ser Arg Gln Arg Gly Glu Arg Arg Ile Val Asp
355 360 365Asp Asp Asp Asp Arg Asn Gly Lys Asn Asn Gly Asn Val Ser
Ser 370 375 38086131PRTArabidopsis
thalianaMISC_FEATURE(70)..(70)Xaa = any amino acid 86Gly Thr Ser
Leu Leu Leu His Ala Ser Ser Ser Ser Ser Ser Ile Ser1 5 10 15Leu Thr
Ile Pro Ser Asn His Ser Ser Met Ala Thr Val Ser Ser Ser 20 25 30Ser
Trp Pro Asn Pro Asn Pro Asn Pro Asp Ser Thr Ser Ala Ser Asp 35 40
45Ser Asp Ser Thr Phe Pro Ser His Arg Asp Arg Val Asp Glu Pro Asp
50 55 60Ser Leu Asp Ser Phe Xaa Ser Met Ser Leu Asn Ser Asp Glu Pro
Asn65 70 75 80Gln Thr Ser Asn Gln Ser Pro Leu Ser Pro Pro Thr Pro
Asn Leu Pro 85 90 95Val Met Pro Pro Pro Phe Val Leu Tyr Leu Ser Phe
Asn Gln Asp His 100 105 110Ala Cys Phe Ala Cys Xaa His Phe Val Pro
Ser Leu Ser Leu Tyr Leu 115 120 125Ser Ala Thr
13087181PRTArabidopsis thaliana 87Gln Ala His Asp Ser Arg Ile Ala
Cys Phe Ala Leu Thr Gln Asp Gly1 5 10 15His Leu Leu Ala Thr Ala Ser
Ser Lys Gly Thr Leu Val Arg Ile Phe 20 25 30Asn Thr Val Asp Gly Thr
Leu Arg Gln Glu Val Arg Arg Gly Ala Asp 35 40 45Arg Ala Glu Ile Tyr
Ser Leu Ala Phe Ser Ser Asn Ala Gln Trp Leu 50 55 60Ala Val Ser Ser
Asp Lys Gly Thr Val His Val Phe Gly Leu Lys Val65 70 75 80Asn Ser
Gly Ser Gln Val Lys Asp Ser Ser Arg Ile Ala Pro Asp Ala 85 90 95Thr
Pro Ser Ser Pro Ser Ser Ser Leu Ser Leu Phe Lys Gly Val Leu 100 105
110Pro Arg Tyr Phe Ser Ser Glu Trp Ser Val Ala Gln Phe Arg Leu Val
115 120 125Glu Gly Thr Gln Tyr Ile Ala Ala Phe Gly His Gln Lys Asn
Thr Val 130 135 140Val Ile Leu Gly Met Asp Gly Ser Phe Tyr Arg Cys
Gln Phe Asp Pro145 150 155 160Val Asn Gly Gly Glu Met Ser Gln Leu
Glu Tyr His Asn Cys Leu Lys 165 170 175Pro Pro Ser Val Phe
18088175PRTArabidopsis thaliana 88Met Asp Asp Ser Glu Glu Asp Gln
Arg Leu Pro His His Lys Asp Pro1 5 10 15Lys Glu Phe Val Ser Leu Asp
Lys Leu Ala Glu Leu Gly Val Leu Ser 20 25 30Trp Arg Leu Asp Ala Asp
Asn Tyr Glu Thr Asp Glu Asp Leu Lys Lys 35 40 45Ile Arg Glu Ser Arg
Gly Tyr Ser Tyr Met Asp Phe Cys Glu Val Cys 50 55 60Pro Glu Lys Leu
Pro Asn Tyr Glu Val Lys Val Lys Ser Phe Phe Glu65 70 75 80Glu His
Leu His Thr Asp Glu Glu Ile Arg Tyr Cys Val Ala Gly Thr 85 90 95Gly
Tyr Phe Asp Val Arg Asp Arg Asn Glu Ala Trp Ile Arg Val Leu 100 105
110Val Lys Lys Gly Gly Met Ile Val Leu Pro Ala Gly Ile Tyr His Arg
115 120 125Phe Thr Val Asp Ser Asp Asn Tyr Ile Lys Ala Met Arg Leu
Phe Val 130 135 140Gly Glu Pro Val Trp Thr Pro Tyr Asn Arg Pro His
Asp His Leu Pro145 150 155 160Ala Arg Lys Glu Tyr Val Asp Asn Phe
Met Ile Asn Ala Ser Ala 165 170 1758998PRTArabidopsis thaliana
89Thr Ser Phe Pro Ile Thr Arg Lys Lys Thr Leu Lys Met Asp Gly His1
5 10 15Asp Ser Glu Asp Thr Lys Gln Ser Thr Ala Asp Met Thr Ala Phe
Val 20 25 30Gln Asn Leu Leu Gln Gln Met Gln Thr Arg Phe Gln Thr Met
Ser Asp 35 40 45Ser Ile Ile Thr Lys Ile Asp Asp Met Gly Gly Arg Ile
Asn Glu Leu 50 55 60Glu Gln Ser Ile Asn Asp Leu Arg Ala Glu Met Gly
Val Glu Gly Thr65 70 75 80Pro Pro Pro Ala Ser Lys Ser Gly Asp Glu
Pro Lys Thr Pro Ala Ser 85 90 95Ser Ser90117PRTArabidopsis thaliana
90Ala Gln Val Arg Ala Lys Met Leu Lys Glu Val Ala Thr Glu Lys Gln1
5 10 15Thr Ala Val Asp Thr His Phe Ala Thr Ala Lys Lys Leu Ala Gln
Glu 20 25 30Gly Asp Ala Leu Phe Val Lys Ile Phe Ala Ile Lys Lys Leu
Leu Ala 35 40 45Lys Leu Glu Ala Glu Lys Glu Ser Val Asp Gly Lys Phe
Lys Glu Thr 50 55 60Val Lys Glu Leu Ser His Leu Leu Ala Asp Ala Ser
Glu Ala Tyr Glu65 70 75 80Glu Tyr His Gly Ala Val Arg Lys Ala Lys
Asp Glu Gln Ala Ala Glu 85 90 95Glu Phe Ala Lys Glu Ala Thr Gln Ser
Ala Glu Ile Ile Trp Val Lys 100 105 110Phe Leu Ser Ser Leu
11591216PRTArabidopsis thaliana 91Met Glu Phe Gly Ser Phe Leu Val
Ser Leu Gly Thr Ser Phe Val Ile1 5 10 15Phe Val Ile Leu Met Leu Leu
Phe Thr Trp Leu Ser Arg Lys Ser Gly 20 25 30Asn Ala Pro Ile Tyr Tyr
Pro Asn Arg Ile Leu Lys Gly Leu Glu Pro 35 40 45Trp Glu Gly Thr Ser
Leu Thr Arg Asn Pro Phe Ala Trp Met Arg Glu 50 55 60Ala Leu Thr Ser
Ser Glu Gln Asp Val Val Asn Leu Ser Gly Val Asp65 70 75 80Thr Ala
Val His Phe Val Phe Leu Ser Thr Val Leu Gly Ile Phe Ala 85 90 95Cys
Ser Ser Leu Leu Leu Leu Pro Thr Leu Leu Pro Leu Ala Ala Thr 100 105
110Asp Asn Asn Ile Lys Asn Thr Lys Asn Ala Thr Asp Thr Thr Ser Lys
115 120 125Gly Thr Phe Ser Gln Leu Asp Asn Leu Ser Met Ala Asn Ile
Thr Lys 130 135 140Lys Ser Ser Arg Leu Trp Ala Phe Leu Gly Ala Val
Tyr Trp Ile Ser145 150 155 160Leu Val Thr Tyr Phe Phe Leu Trp Lys
Ala Tyr Lys His Val Ser Ser 165 170 175Leu Arg Ala Gln Ala Leu Met
Ser Ala Asp Val Lys Pro Glu Gln Phe 180 185 190Ala Ile Leu Val Arg
Asp Met Pro Ala Pro Pro Asp Gly Arg Arg Gly 195 200 205Arg Glu Phe
Gln Ile Tyr Glu Ser 210 21592328PRTArabidopsis thaliana 92Val His
Thr Pro Ala Gly Glu Leu Gln Arg Gln Ile Arg Ser Trp Leu1 5 10 15Ala
Glu Ser Phe Glu Phe Leu Ser Val Thr Ala Asp Asp Val Ser Gly 20 25
30Val Thr Thr Gly Gln Leu Glu Leu Leu Ser Thr Ala Ile Met Asp Gly
35 40 45Trp Met Ala Gly Val Gly Ala Pro Val Pro Pro His Thr Asp Ala
Leu 50 55 60Gly Gln Leu Val Ser Glu Tyr Ala Lys Arg Val Tyr Thr Ser
Gln Met65 70 75 80Gln His Leu Lys Asp Ile Ala Gly Thr Leu Ala Ser
Glu Glu Ala Glu 85 90 95Asp Ala Gly Gln Val Ala Lys Leu Arg Ser Ala
Leu Glu Ser Val Asp 100 105 110His Lys Arg Arg Lys Ile Leu Gln Gln
Met Arg Ser Asp Ala Ala Leu 115 120 125Phe Thr Leu Glu Glu Gly Ser
Ser Pro Val Gln Asn Pro Ser Thr Ala 130 135 140Ala Glu Asp Ser Arg
Leu Ala Ser Leu Ile Ser Leu Asp Ala Ile Leu145 150 155 160Lys Gln
Val Lys Glu Ile Thr Arg Gln Ala Ser Val His Val Leu Ser 165 170
175Lys Ser Lys Lys Lys Ala Leu Leu Glu Ser Leu Asp Glu Leu Asn Glu
180 185 190Arg Met Pro Ser Leu Leu Asp Val Asp His Pro Cys Ala Gln
Arg Glu 195 200 205Ile Asp Thr Ala His Gln Leu Val Glu Thr Ile Pro
Glu Gln Glu Asp 210 215 220Asn Leu Gln Asp Glu Lys Arg Pro Ser Ile
Asp Ser Ile Ser Ser Thr225 230 235 240Glu Thr Asp Val Ser Gln Trp
Asn Val Leu Gln Phe Asn Thr Gly Gly 245 250 255Ser Ser Ala Pro Phe
Ile Ile Lys Cys Gly Ala Asn Ser Asn Ser Glu 260 265 270Leu Val Ile
Lys Ala Asp Ala Arg Ile Gln Glu Pro Lys Gly Gly Glu 275 280 285Ile
Val Arg Val Val Pro Arg Pro Ser Val Leu Glu Asn Met Ser Leu 290 295
300Glu Glu Met Lys Gln Val Phe Gly Gln Leu
Pro Glu Ala Leu Ser Ser305 310 315 320Leu Ala Leu Ala Arg Thr Ala
Asp 3259379PRTArabidopsis thaliana 93Thr Tyr Glu Arg Leu Pro Ile
Glu Glu Glu Gln Gln Gln Glu Gln Pro1 5 10 15Leu Gln Leu Glu Asp Gly
Lys Lys Gln Lys Glu Glu Asn Asp Asp Asn 20 25 30Glu Ser Gly Asn Asn
Gly Asn Glu Gly Ser Met Gln Pro Pro Met Tyr 35 40 45Asn Met Pro Pro
Asn Phe Ile Pro Asn Gly His Gln Met Ala Gln His 50 55 60Asp Val Tyr
Trp Gly Gly Pro Pro Pro Arg Ala Pro Pro Ser Tyr65 70
7594150PRTArabidopsis thaliana 94Ser Lys Ala Arg Val Leu Ala Ile
Pro Asp Asp Leu Ala Asn Val Ser1 5 10 15Cys Gly Val Glu Gln Ile Glu
Glu Leu Lys Gly Leu Asn Leu Val Glu 20 25 30Lys Asp Gly Gly Ser Ser
Ser Ser Asp Gly Ala Arg Asn Thr Asn Pro 35 40 45Glu Thr Arg Arg Tyr
Ser Gly Ser Leu Gly Val Glu Asp Gly Ala Tyr 50 55 60Thr Asn Glu Met
Leu Gln Ser Ile Glu Met Val Thr Asp Val Leu Asp65 70 75 80Ser Leu
Val Arg Arg Val Thr Val Ala Glu Ser Glu Ser Ala Val Gln 85 90 95Lys
Glu Arg Ala Leu Leu Gly Glu Glu Glu Ile Ser Arg Lys Thr Ile 100 105
110Gln Ile Glu Asn Leu Ser Val Lys Leu Glu Glu Met Glu Arg Phe Ala
115 120 125Tyr Gly Thr Asn Ser Val Leu Asn Glu Met Arg Glu Arg Ile
Glu Glu 130 135 140Leu Val Glu Glu Thr Met145
15095181PRTArabidopsis thaliana 95Met Thr Asn Ile Ala Met Ala Asp
Ala Leu Lys Ser Leu Glu Ile Val1 5 10 15Asp Gly Leu Asp Glu Tyr Met
Asn Gln Ser Glu Ser Ser Ala Pro His 20 25 30Ser Pro Thr Ser Val Ala
Lys Leu Pro Pro Ser Thr Ala Thr Arg Thr 35 40 45Thr Arg Arg Lys Thr
Thr Thr Lys Ala Glu Pro Gln Pro Ser Ser Gln 50 55 60Leu Val Ser Arg
Ser Cys Arg Ser Thr Ser Lys Ser Leu Ala Gly Asp65 70 75 80Met Asp
Gln Glu Asn Ile Asn Lys Asn Val Ala Gln Glu Met Lys Thr 85 90 95Ser
Asn Val Lys Phe Glu Ala Asn Val Leu Lys Thr Pro Ala Ala Gly 100 105
110Ser Thr Arg Lys Thr Ser Ala Ala Thr Ser Cys Thr Lys Lys Asp Glu
115 120 125Leu Val Gln Ser Val Tyr Ser Thr Arg Arg Ser Thr Arg Leu
Leu Glu 130 135 140Lys Cys Met Ala Asp Leu Ser Leu Lys Thr Lys Glu
Thr Val Asp Asn145 150 155 160Lys Pro Ala Lys Asn Glu Asp Thr Glu
Gln Lys Val Ser Ala Gln Glu 165 170 175Lys Asn Leu Thr Gly
18096163PRTArabidopsis thaliana 96Met Leu Met Leu Cys Gly Phe Thr
Val Leu Asp Met Leu Lys His His1 5 10 15Asp Leu Gly Lys Ile Arg Ala
Pro Leu His Pro Leu Arg Lys Lys Met 20 25 30Gln Ile Gln His Ala Tyr
Gln Gln Ile His Gln Gly Ser Lys Leu Leu 35 40 45Lys Met Asp Arg Met
Met Leu Arg Gly Thr Lys Arg Arg Ile Gly Val 50 55 60Arg Lys Gly Asn
Leu Gln Arg Glu Arg Arg Lys Lys Asp Met Ile Gly65 70 75 80Val Lys
Asn Ala Lys Gly Met Arg Ser Glu Ala Leu Val Ile Gln Met 85 90 95Ile
Glu Arg Ser Thr Arg Lys Arg Arg Arg Arg Lys Lys Glu Gly Met 100 105
110Thr Leu Ile Leu Ile Glu Ala Asn Cys Pro Arg Met Glu His Phe Ala
115 120 125Leu Gln Arg Lys Ser Gly Arg Leu Gly Thr Lys Ile Gln Leu
Pro Leu 130 135 140Leu Gln Asp Leu Asn Leu Leu Leu Ile Ser Phe Thr
Asn Arg Gly Val145 150 155 160Lys Cys Cys97170PRTArabidopsis
thaliana 97Gly Thr Arg Gln Lys Arg Glu Thr Ser Asp Pro Glu Ser Asp
Leu Lys1 5 10 15Thr Arg Lys Asn Arg Lys Met Gly Lys Asp Gly Leu Ser
Asp Asp Gln 20 25 30Val Ser Ser Met Lys Glu Ala Phe Met Leu Phe Asp
Thr Asp Gly Asp 35 40 45Gly Lys Ile Ala Pro Ser Glu Leu Gly Ile Leu
Met Arg Ser Leu Gly 50 55 60Gly Asn Pro Thr Gln Ala Gln Leu Lys Ser
Ile Ile Ala Ser Glu Asn65 70 75 80Leu Ser Ser Pro Phe Asp Phe Asn
Arg Phe Leu Asp Leu Met Ala Lys 85 90 95His Leu Lys Thr Glu Pro Phe
Asp Arg Gln Leu Arg Asp Ala Phe Lys 100 105 110Val Leu Asp Lys Glu
Gly Thr Gly Phe Val Ala Val Ala Asp Leu Arg 115 120 125His Ile Leu
Thr Ser Ile Gly Glu Lys Leu Glu Pro Asn Glu Phe Asp 130 135 140Glu
Trp Ile Lys Glu Val Asp Val Gly Ser Asp Gly Lys Ile Arg Tyr145 150
155 160Glu Asp Phe Ile Ala Arg Met Val Ala Lys 165
1709838PRTArabidopsis thaliana 98Arg Gly Val Ser Phe Arg Ser Arg
Glu Met Arg Pro Ile Phe Ala Ile1 5 10 15Ser Gln Arg Met Arg Ser Ile
Lys Glu Ser Lys Glu Val Leu Asp Thr 20 25 30Glu Ser Arg Ser Arg Leu
3599376PRTArabidopsis thaliana 99Met Thr Thr Thr Gly Ser Asn Ser
Asn His Asn His His Glu Ser Asn1 5 10 15Asn Asn Asn Asn Asn Pro Ser
Thr Arg Ser Trp Gly Thr Ala Val Ser 20 25 30Gly Gln Ser Val Ser Thr
Ser Gly Ser Met Gly Ser Pro Ser Ser Arg 35 40 45Ser Glu Gln Thr Ile
Thr Val Val Thr Ser Thr Ser Asp Thr Thr Phe 50 55 60Gln Arg Leu Asn
Asn Leu Asp Ile Gln Gly Asp Asp Ala Gly Ser Gln65 70 75 80Gly Ala
Ser Gly Val Lys Lys Lys Lys Arg Gly Gln Arg Ala Ala Gly 85 90 95Pro
Asp Lys Thr Gly Arg Gly Leu Arg Gln Phe Ser Met Lys Val Cys 100 105
110Glu Lys Val Glu Ser Lys Gly Arg Thr Thr Tyr Asn Glu Val Ala Asp
115 120 125Glu Leu Val Ala Glu Phe Ala Leu Pro Asn Asn Asp Gly Thr
Ser Pro 130 135 140Asp Gln Gln Gln Tyr Asp Glu Lys Asn Ile Arg Arg
Arg Val Tyr Asp145 150 155 160Ala Leu Asn Val Leu Met Ala Met Asp
Ile Ile Ser Lys Asp Lys Lys 165 170 175Glu Ile Gln Trp Arg Gly Leu
Pro Arg Thr Ser Leu Ser Asp Ile Glu 180 185 190Glu Leu Lys Asn Glu
Arg Leu Ser Leu Arg Asn Arg Ile Glu Lys Lys 195 200 205Thr Ala Tyr
Ser Gln Glu Leu Glu Glu Gln Arg Asn Glu His Leu Tyr 210 215 220Ser
Ser Gly Asn Ala Pro Ser Gly Gly Val Ala Leu Pro Phe Ile Leu225 230
235 240Val Gln Thr Arg Pro His Ala Thr Val Glu Val Glu Ile Ser Glu
Asp 245 250 255Met Gln Leu Val His Phe Asp Phe Asn Ser Thr Pro Phe
Glu Leu His 260 265 270Asp Asp Asn Phe Val Leu Lys Thr Met Lys Phe
Cys Asp Gln Pro Pro 275 280 285Gln Gln Pro Asn Gly Arg Asn Asn Ser
Gln Leu Val Cys His Asn Phe 290 295 300Thr Pro Glu Asn Pro Asn Lys
Gly Pro Ser Thr Gly Pro Thr Pro Gln305 310 315 320Leu Asp Met Tyr
Glu Thr His Leu Gln Ser Gln Gln His Gln Gln His 325 330 335Ser Gln
Leu Gln Ile Ile Pro Met Pro Glu Thr Asn Asn Val Thr Ser 340 345
350Ser Ala Asp Thr Ala Pro Val Lys Ser Pro Ser Leu Pro Gly Ile Met
355 360 365Asn Ser Ser Met Lys Pro Glu Asn 370
375100145PRTArabidopsis thaliana 100Glu Tyr Leu Lys Lys Gly Ser Pro
Ile Ser Ala Leu Lys Ser Phe Ile1 5 10 15Ser Ser Leu Ser Glu Pro Pro
Gln Asp Ile Met Asp Ala Leu Phe Asn 20 25 30Ala Leu Phe Asp Gly Val
Gly Lys Gly Phe Ala Lys Glu Val Thr Lys 35 40 45Lys Lys Asn Tyr Leu
Ala Ala Ala Ala Thr Met Gln Glu Asp Gly Ser 50 55 60Gln Met His Leu
Leu Asn Ser Ile Gly Thr Phe Cys Gly Lys Asn Gly65 70 75 80Asn Glu
Glu Ala Leu Lys Glu Val Ala Leu Val Leu Lys Ala Leu Tyr 85 90 95Asp
Gln Asp Ile Ile Glu Glu Glu Val Val Leu Asp Trp Tyr Glu Lys 100 105
110Gly Leu Thr Gly Ala Asp Lys Ser Ser Pro Val Trp Lys Asn Val Lys
115 120 125Pro Phe Val Glu Trp Leu Gln Ser Ala Glu Ser Glu Ser Glu
Glu Glu 130 135 140Asp145101316PRTArabidopsis thaliana 101Leu Glu
Val Glu Arg Asn Ala Ser Ala Val Ala Ala Ser Glu Thr Met1 5 10 15Ala
Met Ile Asn Arg Leu His Glu Glu Lys Ala Ala Met Gln Met Glu 20 25
30Ala Leu Gln Tyr Gln Arg Met Met Glu Glu Gln Ala Glu Phe Asp Gln
35 40 45Glu Ala Leu Gln Leu Leu Asn Glu Leu Met Val Asn Arg Glu Lys
Glu 50 55 60Asn Ala Glu Leu Glu Lys Glu Leu Glu Val Tyr Arg Lys Arg
Met Glu65 70 75 80Glu Tyr Glu Ala Lys Glu Lys Met Gly Met Leu Arg
Arg Arg Leu Arg 85 90 95Asp Ser Ser Val Asp Ser Tyr Arg Asn Asn Gly
Asp Ser Asp Glu Asn 100 105 110Ser Asn Gly Glu Leu Gln Phe Lys Asn
Val Glu Gly Val Thr Asp Trp 115 120 125Lys Tyr Arg Glu Asn Glu Met
Glu Asn Thr Pro Val Asp Val Val Leu 130 135 140Arg Leu Asp Glu Cys
Leu Asp Asp Tyr Asp Gly Glu Arg Leu Ser Ile145 150 155 160Leu Gly
Arg Leu Lys Phe Leu Glu Glu Lys Leu Thr Asp Leu Asn Asn 165 170
175Glu Glu Asp Asp Glu Glu Glu Ala Lys Thr Phe Glu Ser Asn Gly Ser
180 185 190Ile Asn Gly Asn Glu His Ile His Gly Lys Glu Thr Asn Gly
Lys His 195 200 205Arg Val Ile Gln Ser Lys Arg Leu Leu Pro Leu Phe
Asp Ala Val Asp 210 215 220Gly Glu Met Glu Asn Gly Leu Ser Asn Gly
Asn His His Glu Asn Gly225 230 235 240Phe Asp Asp Ser Glu Lys Gly
Glu Asn Val Thr Ile Glu Glu Glu Val 245 250 255Asp Glu Leu Tyr Glu
Arg Leu Glu Ala Leu Glu Ala Asp Arg Glu Phe 260 265 270Leu Arg His
Cys Val Gly Ser Leu Lys Lys Gly Asp Lys Gly Val His 275 280 285Leu
Leu His Glu Ile Leu Gln His Leu Arg Asp Leu Arg Asn Ile Asp 290 295
300Leu Thr Arg Val Arg Glu Asn Gly Asp Met Ser Leu305 310
315102194PRTArabidopsis thaliana 102Ala Ser Leu Ile Lys Leu Ile Arg
Leu Leu Glu Thr Pro Ile Phe Thr1 5 10 15Tyr Leu Arg Leu Gln Leu Leu
Glu Pro Gly Arg Tyr Thr Trp Leu Leu 20 25 30Lys Thr Leu Tyr Gly Leu
Leu Met Leu Leu Pro Gln Gln Ser Ala Ala 35 40 45Phe Lys Ile Leu Arg
Thr Arg Leu Lys Thr Val Pro Thr Tyr Ser Phe 50 55 60Ser Thr Gly Asn
Gln Ile Gly Arg Ala Thr Ser Gly Val Pro Phe Ser65 70 75 80Gln Tyr
Lys His Gln Asn Glu Asp Gly Asp Leu Glu Asp Asp Asn Ile 85 90 95Asn
Ser Ser His Gln Gly Ile Asn Phe Ala Val Arg Leu Gln Gln Phe 100 105
110Glu Asn Val Gln Asn Leu His Arg Gly Gln Ala Arg Thr Arg Val Asn
115 120 125Tyr Ser Tyr His Ser Ser Ser Ser Ser Thr Ser Lys Glu Val
Arg Arg 130 135 140Ser Glu Glu Gln Gln Gln Gln Gln Gln Gln Gln Gln
Gln Gln Gln Gln145 150 155 160Gln Gln Gln Arg Pro Pro Pro Ser Ser
Thr Ser Ser Ser Val Ala Asp 165 170 175Asn Asn Arg Pro Pro Ser Arg
Thr Ser Arg Lys Gly Pro Gly Gln Leu 180 185 190Gln
Leu103289PRTArabidopsis thaliana 103Leu Ile Glu Thr Ser Val Glu Ser
Lys Glu Thr Thr Glu Ser Val Val1 5 10 15Thr Gly Glu Ser Glu Lys Ala
Ile Glu Asp Ile Ser Lys Glu Ala Asp 20 25 30Asn Glu Glu Asp Asp Asp
Glu Glu Glu Gln Glu Gly Asp Glu Asp Asp 35 40 45Asp Glu Asn Glu Glu
Glu Glu Val Val Val Pro Glu Thr Glu Asn Arg 50 55 60Ala Glu Gly Glu
Asp Leu Val Lys Asn Lys Ala Ala Asp Ala Lys Lys65 70 75 80His Leu
Gln Met Ile Gly Val Gln Leu Leu Lys Glu Ser Asp Glu Ala 85 90 95Asn
Arg Thr Lys Lys Arg Gly Lys Arg Ala Ser Arg Met Thr Leu Glu 100 105
110Asp Asp Ala Asp Glu Asp Trp Phe Pro Glu Glu Pro Phe Glu Ala Phe
115 120 125Lys Glu Met Arg Glu Arg Lys Val Phe Asp Val Ala Asp Met
Tyr Thr 130 135 140Ile Ala Asp Val Trp Gly Trp Thr Trp Glu Lys Asp
Phe Lys Asn Lys145 150 155 160Thr Pro Arg Lys Trp Ser Gln Glu Trp
Glu Val Glu Leu Ala Ile Val 165 170 175Leu Met Thr Lys Val Ile Glu
Leu Gly Gly Ile Pro Thr Ile Gly Asp 180 185 190Cys Ala Val Ile Leu
Arg Ala Ala Leu Arg Ala Pro Met Pro Ser Ala 195 200 205Phe Leu Lys
Ile Leu Gln Thr Thr His Ser Leu Gly Tyr Ser Phe Gly 210 215 220Ser
Pro Leu Tyr Asp Glu Ile Ile Thr Leu Cys Leu Asp Leu Gly Glu225 230
235 240Leu Asp Ala Ala Ile Ala Ile Val Ala Asp Met Glu Thr Thr Gly
Ile 245 250 255Thr Val Pro Asp Gln Thr Leu Asp Lys Val Ile Ser Ala
Arg Gln Ser 260 265 270Asn Glu Ser Pro Arg Ser Glu Pro Glu Glu Pro
Ala Ser Thr Val Ser 275 280 285Ser104333PRTArabidopsis thaliana
104Thr Asp Ser Ala Ser Asp Ser Ile Phe His Tyr Asp Asp Ala Ser Gln1
5 10 15Ala Lys Ile Gln Gln Glu Lys Pro Trp Ala Ser Asp Pro Asn Tyr
Phe 20 25 30Lys Arg Val His Ile Ser Ala Leu Ala Leu Leu Lys Met Val
Val His 35 40 45Ala Arg Ser Gly Gly Thr Ile Glu Ile Met Gly Leu Met
Gln Gly Lys 50 55 60Thr Glu Gly Asp Thr Ile Ile Val Met Asp Ala Phe
Ala Leu Pro Val65 70 75 80Glu Gly Thr Glu Thr Arg Val Asn Ala Gln
Ser Asp Ala Tyr Glu Tyr 85 90 95Met Val Glu Tyr Ser Gln Thr Ser Lys
Leu Ala Gly Arg Leu Glu Asn 100 105 110Val Val Gly Trp Tyr His Ser
His Pro Gly Tyr Gly Cys Trp Leu Ser 115 120 125Gly Ile Asp Val Ser
Thr Gln Met Leu Asn Gln Gln Tyr Gln Glu Pro 130 135 140Phe Leu Ala
Val Val Ile Asp Pro Thr Arg Thr Val Ser Ala Gly Lys145 150 155
160Val Glu Ile Gly Ala Phe Arg Thr Tyr Pro Glu Gly His Lys Ile Ser
165 170 175Asp Asp His Val Ser Glu Tyr Gln Thr Ile Pro Leu Asn Lys
Ile Glu 180 185 190Asp Phe Gly Val His Cys Lys Gln Tyr Tyr Ser Leu
Asp Ile Thr Tyr 195 200 205Phe Lys Ser Ser Leu Asp Ser His Leu Leu
Asp Leu Leu Trp Asn Lys 210 215 220Tyr Trp Val Asn Thr Leu Ser Ser
Ser Pro Leu Leu Gly Asn Gly Asp225 230 235 240Tyr Val Ala Gly Gln
Ile Ser Asp Leu Ala Glu Lys Leu Glu Gln Ala 245 250 255Glu Ser Gln
Leu Ala Asn Ser Arg Tyr Gly Gly Ile Ala Pro Ala Gly 260 265 270His
Gln Arg Arg Lys Glu Asp Glu Pro Gln Leu Ala Lys Ile Thr Arg 275 280
285Asp Ser Ala Lys Ile Thr Val Glu Gln Val His Gly Leu Met Ser Gln
290 295 300Val Ile Lys Asp Ile Leu Phe Asn Ser Ala Arg Gln Ser Lys Lys
Ser305 310 315
320Ala Asp Asp Ser Ser Asp Pro Glu Pro Met Ile Thr Ser 325
330105460PRTArabidopsis thaliana 105Met Val Arg Ser Asp Glu Asn Ser
Leu Gly Leu Ile Gly Ser Met Ser1 5 10 15Leu Gln Gly Thr Leu Asn Arg
Ser Ile Leu Leu Leu Lys Ile Lys Thr 20 25 30Phe Val Leu Phe Asp Phe
Ser Pro Lys Leu Ile Leu Asn Leu Leu Asp 35 40 45Val Gly Gly Gly Val
Val Gly Lys Ile Lys Thr Thr Ala Thr Thr Gly 50 55 60Pro Thr Arg Arg
Ala Leu Ser Thr Ile Asn Lys Asn Ile Thr Glu Ala65 70 75 80Pro Ser
Tyr Pro Tyr Ala Val Asn Lys Arg Ser Val Ser Glu Arg Asp 85 90 95Gly
Ile Cys Asn Lys Pro Pro Val His Arg Pro Val Thr Arg Lys Phe 100 105
110Ala Ala Gln Leu Ala Asp His Lys Pro His Ile Arg Asp Glu Glu Thr
115 120 125Lys Lys Pro Asp Ser Val Ser Ser Glu Glu Pro Glu Thr Ile
Ile Ile 130 135 140Asp Val Asp Glu Ser Asp Lys Glu Gly Gly Asp Ser
Asn Glu Pro Met145 150 155 160Phe Val Gln His Thr Glu Ala Met Leu
Glu Glu Ile Glu Gln Met Glu 165 170 175Lys Glu Ile Glu Met Glu Asp
Ala Asp Lys Glu Glu Glu Pro Val Ile 180 185 190Asp Ile Asp Ala Cys
Asp Lys Asn Asn Pro Leu Ala Ala Val Glu Tyr 195 200 205Ile His Asp
Met His Thr Phe Tyr Lys Asn Phe Glu Lys Leu Ser Cys 210 215 220Val
Pro Pro Asn Tyr Met Asp Asn Gln Gln Asp Leu Asn Glu Arg Met225 230
235 240Arg Gly Ile Leu Ile Asp Trp Leu Ile Glu Val His Tyr Lys Phe
Glu 245 250 255Leu Met Glu Glu Thr Leu Tyr Leu Thr Ile Asn Val Ile
Asp Arg Phe 260 265 270Leu Ala Val His Gln Ile Val Arg Lys Lys Leu
Gln Leu Val Gly Val 275 280 285Thr Ala Leu Leu Leu Ala Cys Lys Tyr
Glu Glu Val Ser Val Pro Val 290 295 300Val Asp Asp Leu Ile Leu Ile
Ser Asp Lys Ala Tyr Ser Arg Arg Glu305 310 315 320Val Leu Asp Met
Glu Lys Leu Met Ala Asn Thr Leu Gln Phe Asn Phe 325 330 335Ser Leu
Pro Thr Pro Tyr Val Phe Met Lys Arg Phe Leu Lys Ala Ala 340 345
350Gln Ser Asp Lys Lys Leu Glu Ile Leu Ser Phe Phe Met Ile Glu Leu
355 360 365Cys Leu Val Glu Tyr Glu Met Leu Glu Tyr Leu Pro Ser Lys
Leu Ala 370 375 380Ala Ser Ala Ile Tyr Thr Ala Gln Cys Thr Leu Lys
Gly Phe Glu Glu385 390 395 400Trp Ser Lys Thr Cys Glu Phe His Thr
Gly Tyr Asn Glu Lys Gln Leu 405 410 415Leu Ala Cys Ala Arg Lys Met
Val Ala Phe His His Lys Ala Gly Thr 420 425 430Gly Lys Leu Thr Gly
Val His Arg Lys Tyr Asn Thr Ser Lys Phe Cys 435 440 445His Ala Ala
Arg Thr Glu Pro Ala Gly Phe Leu Ile 450 455 460106664PRTArabidopsis
thaliana 106Met Val Asn Pro Gly His Gly Arg Gly Pro Asp Ser Gly Thr
Ala Ala1 5 10 15Gly Gly Ser Asn Ser Asp Pro Phe Pro Ala Asn Leu Arg
Val Leu Val 20 25 30Val Asp Asp Asp Pro Thr Cys Leu Met Ile Leu Glu
Arg Met Leu Met 35 40 45Thr Cys Leu Tyr Arg Val Thr Lys Cys Asn Arg
Ala Glu Ser Ala Leu 50 55 60Ser Leu Leu Arg Lys Asn Lys Asn Gly Phe
Asp Ile Val Ile Ser Asp65 70 75 80Val His Met Pro Asp Met Asp Gly
Phe Lys Leu Leu Glu His Val Gly 85 90 95Leu Glu Met Asp Leu Pro Val
Ile Met Met Ser Ala Asp Asp Ser Lys 100 105 110Ser Val Val Leu Lys
Gly Val Thr His Gly Ala Val Asp Tyr Leu Ile 115 120 125Lys Pro Val
Arg Ile Glu Ala Leu Lys Asn Ile Trp Gln His Val Val 130 135 140Arg
Lys Lys Arg Asn Glu Trp Asn Val Ser Glu His Ser Gly Gly Ser145 150
155 160Ile Glu Asp Thr Gly Gly Asp Arg Asp Arg Gln Gln Gln His Arg
Glu 165 170 175Asp Ala Asp Asn Asn Ser Ser Ser Val Asn Glu Gly Asn
Gly Arg Ser 180 185 190Ser Arg Lys Arg Lys Glu Glu Glu Val Asp Asp
Gln Gly Asp Asp Lys 195 200 205Glu Asp Ser Ser Ser Leu Lys Lys Pro
Arg Val Val Trp Ser Val Glu 210 215 220Leu His Gln Gln Phe Val Ala
Ala Val Asn Gln Leu Gly Val Asp Lys225 230 235 240Ala Val Pro Lys
Lys Ile Leu Glu Met Met Asn Val Pro Gly Leu Thr 245 250 255Arg Glu
Asn Val Ala Ser His Leu Gln Lys Tyr Arg Ile Tyr Leu Arg 260 265
270Arg Leu Gly Gly Val Ser Gln His Gln Gly Asn Met Asn His Ser Phe
275 280 285Met Thr Gly Gln Asp Gln Ser Phe Gly Pro Leu Ser Ser Leu
Asn Gly 290 295 300Phe Asp Leu Gln Ser Leu Ala Val Thr Gly Gln Leu
Pro Pro Gln Ser305 310 315 320Leu Ala Gln Leu Gln Ala Ala Gly Leu
Gly Arg Pro Thr Leu Ala Lys 325 330 335Pro Gly Met Ser Val Ser Pro
Leu Val Asp Gln Arg Ser Ile Phe Asn 340 345 350Phe Glu Asn Pro Lys
Ile Arg Phe Gly Asp Gly His Gly Gln Thr Met 355 360 365Asn Asn Gly
Asn Leu Leu His Gly Val Pro Thr Gly Ser His Met Arg 370 375 380Leu
Arg Pro Gly Gln Asn Val Gln Ser Ser Gly Met Met Leu Pro Val385 390
395 400Ala Asp Gln Leu Pro Arg Gly Gly Pro Ser Met Leu Pro Ser Leu
Gly 405 410 415Gln Gln Pro Ile Leu Ser Ser Ser Val Ser Arg Arg Ser
Asp Leu Thr 420 425 430Gly Ala Leu Ala Val Arg Asn Ser Ile Pro Glu
Thr Asn Ser Arg Val 435 440 445Leu Pro Thr Thr His Ser Val Phe Asn
Asn Phe Pro Ala Asp Leu Pro 450 455 460Arg Ser Ser Phe Pro Leu Ala
Ser Ala Pro Gly Ile Ser Val Pro Val465 470 475 480Ser Val Ser Tyr
Gln Glu Glu Val Asn Ser Ser Asp Ala Lys Gly Gly 485 490 495Ser Ser
Ala Ala Thr Ala Gly Phe Gly Asn Pro Ser Tyr Asp Ile Phe 500 505
510Asn Asp Phe Pro Gln His Gln Gln His Asn Lys Asn Ile Ser Asn Lys
515 520 525Leu Asn Asp Trp Asp Leu Arg Asn Met Gly Leu Val Phe Ser
Ser Asn 530 535 540Gln Asp Ala Ala Thr Ala Thr Ala Thr Ala Ala Phe
Ser Thr Ser Glu545 550 555 560Ala Tyr Ser Ser Ser Ser Thr Gln Arg
Lys Arg Arg Glu Thr Asp Ala 565 570 575Thr Val Val Gly Glu His Gly
Gln Asn Leu Gln Ser Pro Ser Arg Asn 580 585 590Leu Tyr His Leu Asn
His Val Phe Met Asp Gly Gly Ser Val Arg Val 595 600 605Lys Ser Glu
Arg Val Ala Glu Thr Val Thr Cys Pro Pro Ala Asn Thr 610 615 620Leu
Phe His Glu Gln Tyr Asn Gln Glu Asp Leu Met Ser Ala Phe Leu625 630
635 640Lys Gln Glu Gly Ile Pro Ser Val Asp Asn Glu Phe Glu Phe Asp
Gly 645 650 655Tyr Ser Ile Asp Asn Ile Gln Val
660107450PRTArabidopsis thaliana 107Met Gly Lys Glu Asn Ala Val Ser
Arg Pro Phe Thr Arg Ser Leu Ala1 5 10 15Ser Ala Leu Arg Ala Ser Glu
Val Thr Ser Thr Thr Gln Asn Gln Gln 20 25 30Arg Val Asn Thr Lys Arg
Pro Ala Leu Glu Asp Thr Arg Ala Thr Gly 35 40 45Pro Asn Lys Arg Lys
Lys Arg Ala Val Leu Gly Glu Ile Thr Asn Val 50 55 60Asn Ser Asn Thr
Ala Ile Leu Glu Ala Lys Asn Ser Lys Gln Ile Lys65 70 75 80Lys Gly
Arg Gly His Gly Leu Ala Ser Thr Ser Gln Leu Ala Thr Ser 85 90 95Val
Thr Ser Glu Val Thr Asp Leu Gln Ser Arg Thr Asp Ala Lys Val 100 105
110Glu Val Ala Ser Asn Thr Ala Gly Asn Leu Ser Val Ser Lys Gly Thr
115 120 125Asp Asn Thr Ala Asp Asn Cys Ile Glu Ile Trp Asn Ser Arg
Leu Pro 130 135 140Pro Arg Pro Leu Gly Arg Ser Ala Ser Thr Ala Glu
Lys Ser Ala Val145 150 155 160Ile Gly Ser Ser Thr Val Pro Asp Ile
Pro Lys Phe Val Asp Ile Asp 165 170 175Ser Asp Asp Lys Asp Pro Leu
Leu Cys Cys Leu Tyr Ala Pro Glu Ile 180 185 190His Tyr Asn Leu Arg
Val Ser Glu Leu Lys Arg Arg Pro Leu Pro Asp 195 200 205Phe Met Glu
Arg Ile Gln Lys Asp Val Thr Gln Ser Met Arg Gly Ile 210 215 220Leu
Val Asp Trp Leu Val Glu Val Ser Glu Glu Tyr Thr Leu Ala Ser225 230
235 240Asp Thr Leu Tyr Leu Thr Val Tyr Leu Ile Asp Trp Phe Leu His
Gly 245 250 255Asn Tyr Val Gln Arg Gln Gln Leu Gln Leu Leu Gly Ile
Thr Cys Met 260 265 270Leu Ile Ala Ser Lys Tyr Glu Glu Ile Ser Ala
Pro Arg Ile Glu Glu 275 280 285Phe Cys Phe Ile Thr Asp Asn Thr Tyr
Thr Arg Asp Gln Val Leu Glu 290 295 300Met Glu Asn Gln Val Leu Lys
His Phe Ser Phe Gln Ile Tyr Thr Pro305 310 315 320Thr Pro Lys Thr
Phe Leu Arg Arg Phe Leu Arg Ala Ala Gln Ala Ser 325 330 335Arg Leu
Ser Pro Ser Leu Glu Val Glu Phe Leu Ala Ser Tyr Leu Thr 340 345
350Glu Leu Thr Leu Ile Asp Tyr His Phe Leu Lys Phe Leu Pro Ser Val
355 360 365Val Ala Ala Ser Ala Val Phe Leu Ala Lys Trp Thr Met Asp
Gln Ser 370 375 380Asn His Pro Trp Asn Pro Thr Leu Glu His Tyr Thr
Thr Tyr Lys Ala385 390 395 400Ser Asp Leu Lys Ala Ser Val His Ala
Leu Gln Asp Leu Gln Leu Asn 405 410 415Thr Lys Gly Cys Pro Leu Ser
Ala Ile Arg Met Lys Tyr Arg Gln Glu 420 425 430Lys Tyr Lys Ser Val
Ala Val Leu Thr Ser Pro Lys Leu Leu Asp Thr 435 440 445Leu Phe
450108901PRTArabidopsis thaliana 108Met Ala Asn Asn Pro Pro Gln Ser
Ser Gly Thr Gln Gly Gln His Phe1 5 10 15Val Pro Ala Ala Ser Gln Pro
Phe His Pro Tyr Gly His Val Pro Pro 20 25 30Asn Val Gln Ser Gln Pro
Pro Gln Tyr Ser Gln Pro Ile Gln Gln Gln 35 40 45Gln Leu Phe Pro Val
Arg Pro Gly Gln Pro Val His Ile Thr Ser Ser 50 55 60Ser Gln Ala Val
Ser Val Pro Tyr Ile Gln Thr Asn Lys Ile Leu Thr65 70 75 80Ser Gly
Ser Thr Gln Pro Gln Pro Asn Ala Pro Pro Met Thr Gly Phe 85 90 95Ala
Thr Ser Gly Pro Pro Phe Ser Ser Pro Tyr Thr Phe Val Pro Ser 100 105
110Ser Tyr Pro Gln Gln Gln Pro Thr Ser Leu Val Gln Pro Asn Ser Gln
115 120 125Met His Val Ala Gly Val Pro Pro Ala Ala Asn Thr Trp Pro
Val Pro 130 135 140Val Asn Gln Ser Thr Ser Leu Val Ser Pro Val Gln
Gln Thr Gly Gln145 150 155 160Gln Thr Pro Val Ala Val Ser Thr Asp
Pro Gly Asn Leu Thr Pro Gln 165 170 175Ser Ala Ser Asp Trp Gln Glu
His Thr Ser Ala Asp Gly Arg Lys Ala 180 185 190Asp Ala Ser Thr Val
Trp Lys Glu Phe Thr Thr Pro Glu Gly Lys Lys 195 200 205Tyr Tyr Tyr
Asn Lys Val Thr Lys Glu Ser Lys Trp Thr Ile Pro Glu 210 215 220Asp
Leu Lys Leu Ala Arg Glu Gln Ala Gln Leu Ala Ser Glu Lys Thr225 230
235 240Ser Leu Ser Glu Ala Gly Ser Thr Pro Leu Ser His His Ala Ala
Ser 245 250 255Ser Ser Asp Leu Ala Val Ser Thr Val Thr Ser Val Val
Pro Ser Thr 260 265 270Ser Ser Ala Leu Thr Gly His Ser Ser Ser Pro
Ile Gln Ala Gly Leu 275 280 285Ala Val Pro Val Thr Arg Pro Pro Ser
Val Ala Pro Val Thr Pro Thr 290 295 300Ser Gly Ala Ile Ser Asp Thr
Glu Ala Thr Thr Met Tyr Tyr Phe Ser305 310 315 320Leu Gly Ser Phe
Ala Glu Asn Lys Glu Met Ser Val Asn Gly Lys Ala 325 330 335Asn Leu
Ser Pro Ala Gly Asp Lys Ala Asn Val Glu Glu Pro Met Val 340 345
350Tyr Ala Thr Lys Gln Glu Ala Lys Ala Ala Phe Lys Ser Leu Leu Glu
355 360 365Ser Val Asn Val His Ser Asp Trp Thr Trp Glu Gln Thr Leu
Lys Glu 370 375 380Ile Val His Asp Lys Arg Tyr Gly Ala Leu Arg Thr
Leu Gly Glu Arg385 390 395 400Lys Gln Ala Phe Asn Glu Tyr Leu Gly
Gln Arg Lys Lys Val Glu Ala 405 410 415Glu Glu Arg Arg Arg Arg Gln
Lys Lys Ala Arg Glu Glu Phe Val Lys 420 425 430Met Leu Glu Glu Cys
Glu Glu Leu Ser Ser Ser Leu Lys Trp Ser Lys 435 440 445Ala Met Ser
Leu Phe Glu Asn Asp Gln Arg Phe Lys Ala Val Asp Arg 450 455 460Pro
Arg Asp Arg Glu Asp Leu Phe Asp Asn Tyr Ile Val Glu Leu Glu465 470
475 480Arg Lys Glu Arg Glu Lys Ala Ala Glu Glu His Arg Gln Tyr Met
Ala 485 490 495Asp Tyr Arg Lys Phe Leu Glu Thr Cys Asp Tyr Ile Lys
Ala Gly Thr 500 505 510Gln Trp Arg Lys Ile Gln Asp Arg Leu Glu Asp
Asp Asp Arg Cys Ser 515 520 525Cys Leu Glu Lys Ile Asp Arg Leu Ile
Gly Phe Glu Glu Tyr Ile Leu 530 535 540Asp Leu Glu Lys Glu Glu Glu
Glu Leu Lys Arg Val Glu Lys Glu His545 550 555 560Val Arg Arg Ala
Glu Arg Lys Asn Arg Asp Ala Phe Arg Thr Leu Leu 565 570 575Glu Glu
His Val Ala Ala Gly Ile Leu Thr Ala Lys Thr Tyr Trp Leu 580 585
590Asp Tyr Cys Ile Glu Leu Lys Asp Leu Pro Gln Tyr Gln Ala Val Ala
595 600 605Ser Asn Thr Ser Gly Ser Thr Pro Lys Asp Leu Phe Glu Asp
Val Thr 610 615 620Glu Glu Leu Glu Lys Gln Tyr His Glu Asp Lys Ser
Tyr Val Lys Asp625 630 635 640Ala Met Lys Ser Arg Lys Ala Asn Phe
Lys Ser Ala Ile Ser Glu Asp 645 650 655Leu Ser Thr Gln Gln Ile Ser
Asp Ile Asn Leu Lys Leu Ile Tyr Asp 660 665 670Asp Leu Val Gly Arg
Val Lys Glu Lys Glu Glu Lys Glu Ala Arg Lys 675 680 685Leu Gln Arg
Leu Ala Glu Glu Phe Thr Asn Leu Leu His Thr Phe Lys 690 695 700Glu
Ile Thr Val Ala Ser Asn Trp Glu Asp Ser Lys Gln Leu Val Glu705 710
715 720Glu Ser Gln Glu Tyr Arg Ser Ile Gly Asp Glu Ser Val Ser Gln
Gly 725 730 735Leu Phe Glu Glu Tyr Ile Thr Ser Leu Gln Glu Lys Ala
Lys Glu Lys 740 745 750Glu Arg Lys Arg Asp Glu Glu Lys Val Arg Lys
Glu Lys Glu Arg Asp 755 760 765Glu Lys Glu Lys Arg Lys Asp Lys Asp
Lys Glu Arg Arg Glu Lys Glu 770 775 780Arg Glu Arg Glu Lys Glu Lys
Gly Lys Glu Arg Ser Lys Arg Glu Glu785 790 795 800Ser Asp Gly Glu
Thr Ala Met Asp Val Ser Glu Gly His Lys Asp Glu 805 810 815Lys Arg
Lys Gly Lys Asp Arg Asp Arg Lys His Arg Arg Arg His His 820 825
830Asn Asn Ser Asp Glu Asp Val Ser Ser Asp Arg Asp Asp Arg Asp Glu
835 840 845Ser Lys Lys Ser Ser Arg Lys His Gly Asn Asp Arg Lys Lys
Ser Arg 850 855 860Lys His Ala Asn Ser Pro Glu
Ser Glu Ser Glu Asn Arg His Lys Arg865 870 875 880Gln Lys Lys Glu
Ser Ser Arg Arg Ser Gly Asn Asp Glu Leu Glu Asp 885 890 895Gly Glu
Val Gly Glu 900109358PRTArabidopsis thaliana 109Met Glu Gly Ser Ser
Ser Thr Ile Ala Arg Lys Thr Trp Glu Leu Glu1 5 10 15Asn Ser Ile Leu
Thr Val Asp Ser Pro Asp Ser Thr Ser Asp Asn Ile 20 25 30Phe Tyr Tyr
Asp Asp Thr Ser Gln Thr Arg Phe Gln Gln Glu Lys Pro 35 40 45Trp Glu
Asn Asp Pro His Tyr Phe Lys Arg Val Lys Ile Ser Ala Leu 50 55 60Ala
Leu Leu Lys Met Val Val His Ala Arg Ser Gly Gly Thr Ile Glu65 70 75
80Ile Met Gly Leu Met Gln Gly Lys Thr Asp Gly Asp Thr Ile Ile Val
85 90 95Met Asp Ala Phe Ala Leu Pro Val Glu Gly Thr Glu Thr Arg Val
Asn 100 105 110Ala Gln Asp Asp Ala Tyr Glu Tyr Met Val Glu Tyr Ser
Gln Thr Asn 115 120 125Lys Leu Ala Gly Arg Leu Glu Asn Val Val Gly
Trp Tyr His Ser His 130 135 140Pro Gly Tyr Gly Cys Trp Leu Ser Gly
Ile Asp Val Ser Thr Gln Arg145 150 155 160Leu Asn Gln Gln His Gln
Glu Pro Phe Leu Ala Val Val Ile Asp Pro 165 170 175Thr Arg Thr Val
Ser Ala Gly Lys Val Glu Ile Gly Ala Phe Arg Thr 180 185 190Tyr Ser
Lys Gly Tyr Lys Pro Pro Asp Glu Pro Val Ser Glu Tyr Gln 195 200
205Thr Ile Pro Leu Asn Lys Ile Glu Asp Phe Gly Val His Cys Lys Gln
210 215 220Tyr Tyr Ser Leu Asp Val Thr Tyr Phe Lys Ser Ser Leu Asp
Ser His225 230 235 240Leu Leu Asp Leu Leu Trp Asn Lys Tyr Trp Val
Asn Thr Leu Ser Ser 245 250 255Ser Pro Leu Leu Gly Asn Gly Asp Tyr
Val Ala Gly Gln Ile Ser Asp 260 265 270Leu Ala Glu Lys Leu Glu Gln
Ala Glu Ser His Leu Val Gln Ser Arg 275 280 285Phe Gly Gly Val Val
Pro Ser Ser Leu His Lys Lys Lys Glu Asp Glu 290 295 300Ser Gln Leu
Thr Lys Ile Thr Arg Asp Ser Ala Lys Ile Thr Val Glu305 310 315
320Gln Val His Gly Leu Met Ser Gln Val Ile Lys Asp Glu Leu Phe Asn
325 330 335Ser Met Arg Gln Ser Asn Asn Lys Ser Pro Thr Asp Ser Ser
Asp Pro 340 345 350Asp Pro Met Ile Thr Tyr 35511098PRTArabidopsis
thaliana 110Met Ala Ile Ser Lys Ala Leu Ile Ala Ser Leu Leu Ile Ser
Leu Leu1 5 10 15Val Leu Gln Leu Val Gln Ala Asp Val Glu Asn Ser Gln
Lys Lys Asn 20 25 30Gly Tyr Ala Lys Lys Ile Asp Cys Gly Ser Ala Cys
Val Ala Arg Cys 35 40 45Arg Leu Ser Arg Arg Pro Arg Leu Cys His Arg
Ala Cys Gly Thr Cys 50 55 60Cys Tyr Arg Cys Asn Cys Val Pro Pro Gly
Thr Tyr Gly Asn Tyr Asp65 70 75 80Lys Cys Gln Cys Tyr Ala Ser Leu
Thr Thr His Gly Gly Arg Arg Lys 85 90 95Cys Pro111385PRTArabidopsis
thalianaMISC_FEATURE(252)..(253)Xaa = any amino acid 111Met Gly Lys
Lys Asn Lys Arg Ser Gln Asp Glu Ser Glu Leu Glu Leu1 5 10 15Glu Pro
Glu Leu Thr Lys Ile Ile Asp Gly Asp Ser Lys Lys Lys Lys 20 25 30Asn
Lys Asn Lys Lys Lys Arg Ser His Glu Asp Thr Glu Ile Glu Pro 35 40
45Glu Gln Lys Met Ser Leu Asp Gly Asp Ser Arg Glu Glu Lys Ile Lys
50 55 60Lys Lys Arg Lys Asn Lys Asn Gln Glu Glu Glu Pro Glu Leu Val
Thr65 70 75 80Glu Lys Thr Lys Val Gln Glu Glu Glu Lys Gly Asn Val
Glu Glu Gly 85 90 95Arg Ala Thr Val Ser Ile Ala Ile Ala Gly Ser Ile
Ile His Asn Thr 100 105 110Gln Ser Leu Glu Leu Ala Thr Arg Val Ile
Ser Leu Ser Leu Tyr Leu 115 120 125Ser Leu Arg Phe Ser Val Phe Pro
Phe Pro Asp Asn Leu Lys Ser Pro 130 135 140Ser Ser Ile Ser Asn Ile
Ser Gln Leu Ala Gly Gln Ile Ala Arg Ala145 150 155 160Ala Thr Ile
Phe Arg Ile Asp Glu Ile Val Val Phe Asp Asn Lys Ser 165 170 175Ser
Ser Glu Ile Glu Ser Ala Ala Thr Asn Ala Ser Asp Ser Asn Glu 180 185
190Ser Gly Ala Ser Phe Leu Val Arg Ile Leu Lys Tyr Leu Glu Thr Pro
195 200 205Gln Tyr Leu Arg Lys Ser Leu Phe Pro Lys Gln Asn Asp Leu
Arg Tyr 210 215 220Val Gly Met Leu Pro Gly Met Leu Pro Pro Leu Asp
Ala Pro His His225 230 235 240Leu Arg Lys His Glu Trp Glu Gln Tyr
Arg Glu Xaa Xaa Ile Val Pro 245 250 255Pro Ser Lys Pro Arg Glu Glu
Ala Gly Met Tyr Trp Gly Tyr Lys Val 260 265 270Arg Tyr Ala Ser Gln
Leu Ser Ser Val Phe Lys Glu Cys Pro Phe Glu 275 280 285Gly Gly Tyr
Asp Tyr Leu Ile Gly Thr Ser Glu His Gly Leu Val Ile 290 295 300Ser
Ser Ser Glu Leu Lys Ile Pro Thr Phe Arg His Leu Leu Ile Ala305 310
315 320Phe Gly Gly Leu Ala Gly Leu Glu Glu Ser Ile Glu Asp Asp Asn
Gln 325 330 335Tyr Lys Gly Lys Asn Val Arg Asp Val Phe Asn Val Tyr
Leu Asn Thr 340 345 350Cys Pro His Gln Gly Ser Arg Thr Ile Arg Ala
Glu Glu Ala Met Phe 355 360 365Ile Ser Leu Gln Tyr Phe Gln Glu Pro
Ile Ser Arg Ala Val Arg Arg 370 375 380Leu385112465PRTArabidopsis
thaliana 112Met Glu Leu Leu Asp Met Asn Ser Met Ala Ala Ser Ile Gly
Val Ser1 5 10 15Val Ala Val Leu Arg Phe Leu Leu Cys Phe Val Ala Thr
Ile Pro Ile 20 25 30Ser Phe Leu Trp Arg Phe Ile Pro Ser Arg Leu Gly
Lys His Ile Tyr 35 40 45Ser Ala Ala Ser Gly Ala Phe Leu Ser Tyr Leu
Ser Phe Gly Phe Ser 50 55 60Ser Asn Leu His Phe Leu Val Pro Met Thr
Ile Gly Tyr Ala Ser Met65 70 75 80Ala Ile Tyr Arg Pro Leu Ser Gly
Phe Ile Thr Phe Phe Leu Gly Phe 85 90 95Ala Tyr Leu Ile Gly Cys His
Val Phe Tyr Met Ser Gly Asp Ala Trp 100 105 110Lys Glu Gly Gly Ile
Asp Ser Thr Gly Ala Leu Met Val Leu Thr Leu 115 120 125Lys Val Ile
Ser Cys Ser Ile Asn Tyr Asn Asp Gly Met Leu Lys Glu 130 135 140Glu
Gly Leu Arg Glu Ala Gln Lys Lys Asn Arg Leu Ile Gln Met Pro145 150
155 160Ser Leu Ile Glu Tyr Phe Gly Tyr Cys Leu Cys Cys Gly Ser His
Phe 165 170 175Ala Gly Pro Val Phe Glu Met Lys Asp Tyr Leu Glu Trp
Thr Glu Glu 180 185 190Lys Gly Ile Trp Ala Val Ser Glu Lys Gly Lys
Arg Pro Ser Pro Tyr 195 200 205Gly Ala Met Ile Arg Ala Val Phe Gln
Ala Ala Ile Cys Met Ala Leu 210 215 220Tyr Leu Tyr Leu Val Pro Gln
Phe Pro Leu Thr Arg Phe Thr Glu Pro225 230 235 240Val Tyr Gln Glu
Trp Gly Phe Leu Lys Arg Phe Gly Tyr Gln Tyr Met 245 250 255Ala Gly
Phe Thr Ala Arg Trp Lys Tyr Tyr Phe Ile Trp Ser Ile Ser 260 265
270Glu Ala Ser Ile Ile Ile Ser Gly Leu Gly Phe Ser Gly Trp Thr Asp
275 280 285Glu Thr Gln Thr Lys Ala Lys Trp Asp Arg Ala Lys Asn Val
Asp Ile 290 295 300Leu Gly Val Glu Leu Ala Lys Ser Ala Val Gln Ile
Pro Leu Phe Trp305 310 315 320Asn Ile Gln Val Ser Thr Trp Leu Arg
His Tyr Val Tyr Glu Arg Ile 325 330 335Val Lys Pro Gly Lys Lys Ala
Gly Phe Phe Gln Leu Leu Ala Thr Gln 340 345 350Thr Val Ser Ala Val
Trp His Gly Leu Tyr Pro Gly Tyr Ile Ile Phe 355 360 365Phe Val Gln
Ser Ala Leu Met Ile Asp Gly Ser Lys Ala Ile Tyr Arg 370 375 380Trp
Gln Gln Ala Ile Pro Pro Lys Met Ala Met Leu Arg Asn Val Leu385 390
395 400Val Leu Ile Asn Phe Leu Tyr Thr Val Val Val Leu Asn Tyr Ser
Ser 405 410 415Val Gly Phe Met Val Leu Ser Leu His Glu Thr Leu Val
Ala Phe Lys 420 425 430Ser Val Tyr Tyr Ile Gly Thr Val Ile Pro Ile
Ala Val Leu Leu Leu 435 440 445Ser Tyr Leu Val Pro Val Lys Pro Val
Arg Pro Lys Thr Arg Lys Glu 450 455 460Glu465113313PRTArabidopsis
thaliana 113Met Asp Glu Gly Val Ile Ala Val Ser Ala Met Asp Ala Phe
Glu Lys1 5 10 15Leu Glu Lys Val Gly Glu Gly Thr Tyr Gly Lys Val Tyr
Arg Ala Arg 20 25 30Glu Lys Ala Thr Gly Lys Ile Val Ala Leu Lys Lys
Thr Arg Leu His 35 40 45Glu Asp Glu Glu Gly Val Pro Ser Thr Thr Leu
Arg Glu Ile Ser Ile 50 55 60Leu Arg Met Leu Ala Arg Asp Pro His Val
Val Arg Leu Met Asp Val65 70 75 80Lys Gln Gly Leu Ser Lys Glu Gly
Lys Thr Val Leu Tyr Leu Val Phe 85 90 95Glu Tyr Met Asp Thr Asp Val
Lys Lys Phe Ile Arg Ser Phe Arg Ser 100 105 110Thr Gly Lys Asn Ile
Pro Thr Gln Thr Ile Lys Ser Leu Met Tyr Gln 115 120 125Leu Cys Lys
Gly Met Ala Phe Cys His Gly His Gly Ile Leu His Arg 130 135 140Asp
Leu Lys Pro His Asn Leu Leu Met Asp Pro Lys Thr Met Arg Leu145 150
155 160Lys Ile Ala Asp Leu Gly Leu Ala Arg Ala Phe Thr Leu Pro Met
Lys 165 170 175Lys Tyr Thr His Glu Ile Leu Thr Leu Trp Tyr Arg Ala
Pro Glu Val 180 185 190Leu Leu Gly Ala Thr His Tyr Ser Thr Ala Val
Asp Met Trp Ser Val 195 200 205Gly Cys Ile Phe Ala Glu Leu Val Thr
Asn Gln Ala Ile Phe Gln Gly 210 215 220Asp Ser Glu Leu Gln Gln Leu
Leu His Ile Phe Lys Leu Phe Gly Thr225 230 235 240Pro Asn Glu Glu
Met Trp Pro Gly Val Ser Thr Leu Lys Asn Trp His 245 250 255Glu Tyr
Pro Gln Trp Lys Pro Ser Thr Leu Ser Ser Ala Val Pro Asn 260 265
270Leu Asp Glu Ala Gly Val Asp Leu Leu Ser Lys Met Leu Gln Tyr Glu
275 280 285Pro Ala Lys Arg Ile Ser Ala Lys Met Ala Met Glu His Pro
Tyr Phe 290 295 300Asp Asp Leu Pro Glu Lys Ser Ser Leu305
310114292PRTArabidopsis thaliana 114Met Ser Met Glu Met Glu Leu Phe
Val Thr Pro Glu Lys Gln Arg Gln1 5 10 15His Pro Ser Val Ser Val Glu
Lys Thr Pro Val Arg Arg Lys Leu Ile 20 25 30Val Asp Asp Asp Ser Glu
Ile Gly Ser Glu Lys Lys Gly Gln Ser Arg 35 40 45Thr Ser Gly Gly Gly
Leu Arg Gln Phe Ser Val Met Val Cys Gln Lys 50 55 60Leu Glu Ala Lys
Lys Ile Thr Thr Tyr Lys Glu Val Ala Asp Glu Ile65 70 75 80Ile Ser
Asp Phe Ala Thr Ile Lys Gln Asn Ala Glu Lys Pro Leu Asn 85 90 95Glu
Asn Glu Tyr Asn Glu Lys Asn Ile Arg Arg Arg Val Tyr Asp Ala 100 105
110Leu Asn Val Phe Met Ala Leu Asp Ile Ile Ala Arg Asp Lys Lys Glu
115 120 125Ile Arg Trp Lys Gly Leu Pro Ile Thr Cys Lys Lys Asp Val
Glu Glu 130 135 140Val Lys Met Asp Arg Asn Lys Val Met Ser Ser Val
Gln Lys Lys Ala145 150 155 160Ala Phe Leu Lys Glu Leu Arg Glu Lys
Val Ser Ser Leu Glu Ser Leu 165 170 175Met Ser Arg Asn Gln Glu Met
Val Val Lys Thr Gln Gly Pro Ala Glu 180 185 190Gly Phe Thr Leu Pro
Phe Ile Leu Leu Glu Thr Asn Pro His Ala Val 195 200 205Val Glu Ile
Glu Ile Ser Glu Asp Met Gln Leu Val His Leu Asp Phe 210 215 220Asn
Ser Thr Pro Phe Ser Val His Asp Asp Ala Tyr Ile Leu Lys Leu225 230
235 240Met Gln Glu Gln Lys Gln Glu Gln Asn Arg Val Ser Ser Ser Ser
Ser 245 250 255Thr His His Gln Ser Gln His Ser Ser Ala His Ser Ser
Ser Ser Ser 260 265 270Cys Ile Ala Ser Gly Thr Ser Gly Pro Val Cys
Trp Asn Ser Gly Ser 275 280 285Ile Asp Thr Arg
290115165PRTArabidopsis thaliana 115Met Asn Arg Glu Lys Leu Met Lys
Met Ala Asn Thr Val Arg Thr Gly1 5 10 15Gly Lys Gly Thr Val Arg Arg
Lys Lys Lys Ala Val His Lys Thr Thr 20 25 30Thr Thr Asp Asp Lys Arg
Leu Gln Ser Thr Leu Lys Arg Val Gly Val 35 40 45Asn Ser Ile Pro Ala
Ile Glu Glu Val Asn Ile Phe Lys Asp Asp Val 50 55 60Val Ile Gln Phe
Ile Asn Pro Lys Val Gln Ala Ser Ile Ala Ala Asn65 70 75 80Thr Trp
Val Val Ser Gly Thr Pro Gln Thr Lys Lys Leu Gln Asp Ile 85 90 95Leu
Pro Gln Ile Ile Ser Gln Leu Gly Pro Asp Asn Leu Asp Asn Leu 100 105
110Lys Lys Leu Ala Glu Gln Phe Gln Lys Gln Ala Pro Gly Ala Gly Asp
115 120 125Val Pro Ala Thr Ile Gln Glu Glu Asp Asp Asp Asp Asp Val
Pro Asp 130 135 140Leu Val Val Gly Glu Thr Phe Glu Thr Pro Ala Thr
Glu Glu Ala Pro145 150 155 160Lys Ala Ala Ala Ser
165116432PRTArabidopsis thaliana 116Met Ala Thr Val Ser Ser Ser Ser
Trp Pro Asn Pro Asn Pro Asn Pro1 5 10 15Asp Ser Thr Ser Ala Ser Asp
Ser Asp Ser Thr Phe Pro Ser His Arg 20 25 30Asp Arg Val Asp Glu Pro
Asp Ser Leu Asp Ser Phe Ser Ser Met Ser 35 40 45Leu Asn Ser Asp Glu
Pro Asn Gln Thr Ser Asn Gln Ser Pro Leu Ser 50 55 60Pro Pro Thr Pro
Asn Leu Pro Val Met Pro Pro Pro Ser Val Leu His65 70 75 80Leu Ser
Phe Asn Gln Asp His Ala Cys Phe Ala Val Gly Thr Asp Arg 85 90 95Gly
Phe Arg Ile Leu Asn Cys Asp Pro Phe Arg Glu Ile Phe Arg Arg 100 105
110Asp Phe Asp Arg Gly Gly Gly Val Ala Val Val Glu Met Leu Phe Arg
115 120 125Cys Asn Ile Leu Ala Leu Val Gly Gly Gly Pro Asp Pro Gln
Tyr Pro 130 135 140Pro Asn Lys Val Met Ile Trp Asp Asp His Gln Gly
Arg Cys Ile Gly145 150 155 160Glu Leu Ser Phe Arg Ser Asp Val Arg
Ser Val Arg Leu Arg Arg Asp 165 170 175Arg Ile Ile Val Val Leu Glu
Gln Lys Ile Phe Val Tyr Asn Phe Ser 180 185 190Asp Leu Lys Leu Met
His Gln Ile Glu Thr Ile Ala Asn Pro Lys Gly 195 200 205Leu Cys Ala
Val Ser Gln Gly Val Gly Ser Met Val Leu Val Cys Pro 210 215 220Gly
Leu Gln Lys Gly Gln Val Arg Ile Glu His Tyr Ala Ser Lys Arg225 230
235 240Thr Lys Phe Val Met Ala His Asp Ser Arg Ile Ala Cys Phe Ala
Leu 245 250 255Thr Gln Asp Gly His Leu Leu Ala Thr Ala Ser Ser Lys
Gly Thr Leu 260 265 270Val Arg Ile Phe Asn Thr Val Asp Gly Thr Leu
Arg Gln Glu Ser Gly 275 280 285Thr Ser Glu Asp Glu Ile Gly Lys Glu
Gly Ala Asp Arg Ala Glu Ile 290 295 300Tyr Ser Leu Ala Phe Ser Ser
Asn Ala Gln Trp Leu Ala Val Ser Ser305 310 315
320Asp Lys Gly Thr Val His Val Phe Gly Leu Lys Val Asn Ser Gly Ser
325 330 335Gln Val Lys Asp Ser Ser Arg Ile Ala Pro Asp Ala Thr Pro
Ser Ser 340 345 350Pro Ser Ser Ser Leu Ser Leu Phe Lys Val Leu Pro
Arg Tyr Phe Ser 355 360 365Ser Glu Trp Ser Val Ala Gln Phe Arg Leu
Val Glu Gly Thr Gln Tyr 370 375 380Ile Ala Ala Phe Gly His Gln Lys
Asn Thr Val Val Ile Leu Gly Met385 390 395 400Asp Gly Ser Phe Tyr
Arg Cys Gln Phe Asp Pro Val Asn Gly Gly Glu 405 410 415Met Ser Gln
Leu Glu Tyr His Asn Cys Leu Lys Pro Pro Ser Val Phe 420 425
430117559PRTArabidopsis thaliana 117Met Asp Val Gly Val Thr Thr Ala
Lys Ser Ile Leu Glu Lys Pro Leu1 5 10 15Lys Leu Leu Thr Glu Glu Asp
Ile Ser Gln Leu Thr Arg Glu Asp Cys 20 25 30Arg Lys Phe Leu Lys Glu
Lys Gly Phe Phe Phe Phe Leu Ser Pro Phe 35 40 45Phe Ser Gly Leu Ile
Val Phe Asp Glu Trp Arg Leu Thr Arg Val Glu 50 55 60Thr Gly Met Arg
Arg Pro Ser Trp Asn Lys Ser Gln Ala Ile Gln Gln65 70 75 80Val Leu
Ser Leu Lys Ala Leu Tyr Glu Pro Gly Asp Asp Ser Gly Ala 85 90 95Gly
Ile Leu Arg Lys Ile Leu Val Ser Gln Pro Pro Asn Pro Pro Arg 100 105
110Val Thr Thr Thr Leu Ile Glu Pro Arg Asn Glu Leu Glu Ala Cys Gly
115 120 125Arg Ile Pro Leu Gln Glu Asp Asp Gly Ala Cys His Arg Arg
Asp Ser 130 135 140Pro Arg Ser Ala Glu Phe Ser Gly Ser Ser Gly Gln
Phe Val Ala Asp145 150 155 160Lys Asp Ser His Lys Thr Val Ser Val
Ser Pro Arg Ser Pro Ala Glu 165 170 175Thr Asn Ala Val Val Gly Gln
Met Thr Ile Phe Tyr Ser Gly Lys Val 180 185 190Asn Val Tyr Asp Gly
Val Pro Pro Glu Lys Ala Arg Ser Ile Met His 195 200 205Phe Ala Ala
Asn Pro Ile Asp Leu Pro Glu Asn Gly Ile Phe Ala Ser 210 215 220Ser
Arg Met Ile Ser Lys Pro Met Ser Lys Glu Lys Met Val Glu Leu225 230
235 240Pro Gln Tyr Gly Leu Glu Lys Ala Pro Ala Ser Arg Asp Ser Asp
Val 245 250 255Glu Gly Gln Ala Asn Arg Lys Val Ser Leu Gln Arg Tyr
Leu Glu Lys 260 265 270Arg Lys Asp Arg Phe Ser Lys Thr Lys Lys Ala
Pro Gly Val Ala Ser 275 280 285Ser Ser Leu Glu Met Phe Leu Asn Arg
Gln Pro Arg Met Asn Ala Ala 290 295 300Tyr Ser Gln Asn Leu Ser Gly
Thr Gly His Cys Glu Ser Pro Glu Asn305 310 315 320Gln Thr Lys Ser
Pro Asn Ile Ser Val Asp Leu Asn Ser Asp Leu Asn 325 330 335Ser Glu
Gly Ala Lys Arg Thr Gly Asp Gly Thr Thr Gly Gln Lys Ala 340 345
350Gly Arg Thr Ile Ser Cys Ser Tyr Asn Met Thr Lys Thr Ser Arg Gly
355 360 365Thr Arg Trp Val Lys Arg Ser Arg Glu Glu Val Ile Gln Ala
Trp Tyr 370 375 380Met Asp Asp Ser Glu Glu Asp Gln Arg Leu Pro His
His Lys Asp Pro385 390 395 400Lys Glu Phe Val Ser Leu Asp Lys Leu
Ala Glu Leu Gly Val Leu Ser 405 410 415Trp Arg Leu Asp Ala Asp Asn
Tyr Glu Thr Asp Glu Asp Leu Lys Lys 420 425 430Ile Arg Glu Ser Arg
Gly Tyr Ser Tyr Met Asp Phe Cys Glu Val Cys 435 440 445Pro Glu Lys
Leu Pro Asn Tyr Glu Val Lys Val Lys Ser Phe Phe Glu 450 455 460Glu
His Leu His Thr Asp Glu Glu Ile Arg Tyr Cys Val Ala Gly Thr465 470
475 480Gly Tyr Phe Asp Val Arg Asp Arg Asn Glu Ala Trp Ile Arg Val
Leu 485 490 495Val Lys Lys Gly Gly Met Ile Val Leu Pro Ala Gly Ile
Tyr His Arg 500 505 510Phe Thr Val Asp Ser Asp Asn Tyr Ile Lys Ala
Met Arg Leu Phe Val 515 520 525Gly Glu Pro Val Trp Thr Pro Tyr Asn
Arg Pro His Asp His Leu Pro 530 535 540Ala Arg Lys Glu Tyr Val Asp
Asn Phe Met Ile Asn Ala Ser Ala545 550 55511886PRTArabidopsis
thaliana 118Met Asp Gly His Asp Ser Lys Asp Thr Lys Gln Ser Thr Ala
Asp Met1 5 10 15Thr Ala Phe Val Gln Asn Leu Leu Gln Gln Met Gln Thr
Arg Phe Gln 20 25 30Thr Met Ser Asp Ser Ile Ile Thr Lys Ile Asp Asp
Met Gly Gly Arg 35 40 45Ile Asn Glu Leu Glu Gln Ser Ile Asn Asp Leu
Arg Ala Glu Met Gly 50 55 60Val Glu Gly Thr Pro Pro Pro Ala Ser Lys
Ser Gly Asp Glu Pro Lys65 70 75 80Thr Pro Ala Ser Ser Ser
85119784PRTArabidopsis thaliana 119Met Glu Ile Tyr Thr Met Lys Thr
Asn Phe Leu Val Leu Ala Leu Ser1 5 10 15Leu Cys Ile Leu Leu Ser Ser
Phe His Glu Val Ser Cys Gln Asp Asp 20 25 30Gly Ser Gly Leu Ser Asn
Leu Asp Leu Ile Glu Arg Asp Tyr Gln Asp 35 40 45Ser Val Asn Ala Leu
Gln Gly Lys Asp Asp Glu Asp Gln Ser Ala Lys 50 55 60Ile Gln Ser Glu
Asn Gln Asn Asn Thr Thr Val Thr Asp Lys Asn Thr65 70 75 80Ile Ser
Leu Ser Leu Ser Asp Glu Ser Glu Val Gly Ser Val Ser Asp 85 90 95Glu
Ser Val Gly Arg Ser Ser Leu Leu Asp Gln Ile Lys Leu Glu Phe 100 105
110Glu Ala His His Asn Ser Ile Asn Gln Ala Gly Ser Asp Gly Val Lys
115 120 125Ala Glu Ser Lys Asp Asp Asp Glu Glu Leu Ser Ala His Arg
Gln Lys 130 135 140Met Leu Glu Glu Ile Glu His Glu Phe Glu Ala Ala
Ser Asp Ser Leu145 150 155 160Lys Gln Leu Lys Thr Asp Asp Val Asn
Glu Gly Asn Asp Glu Glu His 165 170 175Ser Ala Lys Arg Gln Ser Leu
Leu Glu Glu Ile Glu Arg Glu Phe Glu 180 185 190Ala Ala Thr Lys Glu
Leu Glu Gln Leu Lys Val Asn Asp Phe Thr Gly 195 200 205Asp Lys Asp
Asp Glu Glu His Ser Ala Lys Arg Lys Ser Met Leu Glu 210 215 220Ala
Ile Glu Arg Glu Phe Glu Ala Ala Met Glu Gly Ile Glu Ala Leu225 230
235 240Lys Val Ser Asp Ser Thr Gly Ser Gly Asp Asp Glu Glu Gln Ser
Ala 245 250 255Lys Arg Leu Ser Met Leu Glu Glu Ile Glu Arg Glu Phe
Glu Ala Ala 260 265 270Ser Lys Gly Leu Glu Gln Leu Arg Ala Ser Asp
Ser Thr Ala Asp Asn 275 280 285Asn Glu Glu Glu His Ala Ala Lys Gly
Gln Ser Leu Leu Glu Glu Ile 290 295 300Glu Arg Glu Phe Glu Ala Ala
Thr Glu Ser Leu Lys Gln Leu Gln Val305 310 315 320Asp Asp Ser Thr
Glu Asp Lys Glu His Cys Lys Ala Leu Phe Phe Leu 325 330 335Leu Ser
Ala Ile Leu Ser Leu Trp Leu Ser Glu Ser Gly Phe Glu Cys 340 345
350Ile Val Val Thr Ala Ala Lys Arg Gln Ser Leu Leu Glu Glu Ile Glu
355 360 365Arg Glu Phe Glu Ala Ala Thr Lys Asp Leu Lys Gln Leu Asn
Asp Phe 370 375 380Thr Glu Gly Ser Ala Asp Asp Glu Gln Ser Ala Lys
Arg Asn Lys Met385 390 395 400Leu Glu Asp Ile Glu Arg Glu Phe Glu
Ala Ala Thr Ile Gly Leu Glu 405 410 415Gln Leu Lys Ala Asn Asp Phe
Ser Glu Gly Asn Asn Asn Glu Glu Gln 420 425 430Ser Ala Lys Arg Lys
Ser Met Leu Glu Glu Ile Glu Arg Glu Phe Glu 435 440 445Ala Ala Ile
Gly Gly Leu Lys Gln Ile Lys Val Asp Asp Ser Arg Asn 450 455 460Leu
Glu Glu Glu Ser Ala Lys Arg Lys Ile Ile Leu Glu Glu Met Glu465 470
475 480Arg Glu Phe Glu Glu Ala His Ser Gly Ile Asn Ala Lys Ala Asp
Lys 485 490 495Glu Glu Ser Ala Lys Lys Gln Ser Gly Ser Ala Ile Pro
Glu Val Leu 500 505 510Gly Leu Gly Gln Ser Gly Gly Cys Ser Cys Ser
Lys Gln Asp Glu Asp 515 520 525Ser Ser Ile Val Ile Pro Thr Lys Tyr
Ser Ile Glu Asp Ile Leu Ser 530 535 540Glu Glu Ser Ala Val Gln Gly
Thr Glu Thr Ser Ser Leu Thr Ala Ser545 550 555 560Leu Thr Gln Leu
Val Glu Asn His Arg Lys Glu Lys Glu Ser Leu Leu 565 570 575Gly His
Arg Val Leu Thr Ser Pro Ser Ile Ala Ser Ser Thr Ser Glu 580 585
590Ser Ser Ala Thr Ser Glu Thr Val Glu Thr Leu Arg Ala Lys Leu Asn
595 600 605Glu Leu Arg Gly Leu Thr Ala Arg Glu Leu Val Thr Arg Lys
Asp Phe 610 615 620Gly Gln Ile Leu Ile Thr Ala Ala Ser Phe Glu Glu
Leu Ser Ser Ala625 630 635 640Pro Ile Ser Tyr Ile Ser Arg Leu Ala
Lys Tyr Arg Asn Val Ile Lys 645 650 655Glu Gly Leu Glu Ala Ser Glu
Arg Val His Ile Ala Gln Val Arg Ala 660 665 670Lys Met Leu Lys Glu
Val Ala Thr Glu Lys Gln Thr Ala Val Asp Thr 675 680 685His Phe Ala
Thr Ala Lys Lys Leu Ala Gln Glu Gly Asp Ala Leu Phe 690 695 700Val
Lys Ile Phe Ala Ile Lys Lys Leu Leu Ala Lys Leu Glu Ala Glu705 710
715 720Lys Glu Ser Val Asp Gly Lys Phe Lys Glu Thr Val Lys Glu Leu
Ser 725 730 735His Leu Leu Ala Asp Ala Ser Glu Ala Tyr Glu Glu Tyr
His Gly Ala 740 745 750Val Arg Lys Ala Lys Asp Glu Gln Ala Ala Glu
Glu Phe Ala Lys Glu 755 760 765Ala Thr Gln Ser Ala Glu Ile Ile Trp
Val Lys Phe Leu Ser Ser Leu 770 775 780120724PRTArabidopsis
thaliana 120Met Glu Phe Gly Ser Phe Leu Val Ser Leu Gly Thr Ser Phe
Val Ile1 5 10 15Phe Val Ile Leu Met Leu Leu Phe Thr Trp Leu Ser Arg
Lys Ser Gly 20 25 30Asn Ala Pro Ile Tyr Tyr Pro Asn Arg Ile Leu Lys
Gly Leu Glu Pro 35 40 45Trp Glu Gly Thr Ser Leu Thr Arg Asn Pro Phe
Ala Trp Met Arg Glu 50 55 60Ala Leu Thr Ser Ser Glu Gln Asp Val Val
Asn Leu Ser Gly Val Asp65 70 75 80Thr Ala Val His Phe Val Phe Leu
Ser Thr Val Leu Gly Ile Phe Ala 85 90 95Cys Ser Ser Leu Leu Leu Leu
Pro Thr Leu Leu Pro Leu Ala Ala Thr 100 105 110Asp Asn Asn Ile Lys
Asn Thr Lys Asn Ala Thr Asp Thr Thr Ser Lys 115 120 125Gly Thr Phe
Ser Gln Leu Asp Asn Leu Ser Met Ala Asn Ile Thr Lys 130 135 140Lys
Ser Ser Arg Leu Trp Ala Phe Leu Gly Ala Val Tyr Trp Ile Ser145 150
155 160Leu Val Thr Tyr Phe Phe Leu Trp Lys Ala Tyr Lys His Val Ser
Ser 165 170 175Leu Arg Ala Gln Ala Leu Met Ser Ala Asp Val Lys Pro
Glu Gln Phe 180 185 190Ala Ile Leu Val Arg Asp Met Pro Ala Pro Pro
Asp Gly Gln Thr Gln 195 200 205Lys Glu Phe Ile Asp Ser Tyr Phe Arg
Glu Ile Tyr Pro Glu Thr Phe 210 215 220Tyr Arg Ser Leu Val Ala Thr
Glu Asn Ser Lys Val Asn Lys Ile Trp225 230 235 240Glu Lys Leu Glu
Gly Tyr Lys Lys Lys Leu Ala Arg Ala Glu Ala Ile 245 250 255Leu Ala
Ala Thr Asn Asn Arg Pro Thr Asn Lys Thr Gly Phe Cys Gly 260 265
270Leu Val Gly Lys Gln Val Asp Ser Ile Glu Tyr Tyr Thr Glu Leu Ile
275 280 285Asn Glu Ser Val Ala Lys Leu Glu Thr Glu Gln Lys Ala Val
Leu Ala 290 295 300Glu Lys Gln Gln Thr Ala Ala Val Val Phe Phe Thr
Thr Arg Val Ala305 310 315 320Ala Ala Ser Ala Ala Gln Ser Leu His
Cys Gln Met Val Asp Lys Trp 325 330 335Thr Val Thr Glu Ala Pro Glu
Pro Arg Gln Leu Leu Trp Gln Asn Leu 340 345 350Asn Ile Lys Leu Phe
Ser Arg Ile Ile Arg Gln Tyr Phe Ile Tyr Phe 355 360 365Phe Val Ala
Val Thr Ile Leu Phe Tyr Met Ile Pro Ile Ala Phe Val 370 375 380Ser
Ala Ile Thr Thr Leu Lys Asn Leu Gln Arg Ile Ile Pro Phe Ile385 390
395 400Lys Pro Val Val Glu Ile Thr Ala Ile Arg Thr Val Leu Glu Ser
Phe 405 410 415Leu Pro Gln Ile Ala Leu Ile Val Phe Leu Ala Met Leu
Pro Lys Leu 420 425 430Leu Leu Phe Leu Ser Lys Ala Glu Gly Ile Pro
Ser Gln Ser His Ala 435 440 445Ile Arg Ala Ala Ser Gly Lys Tyr Phe
Tyr Phe Ser Val Phe Asn Val 450 455 460Phe Ile Gly Val Thr Leu Ala
Gly Thr Leu Phe Asn Thr Val Lys Asp465 470 475 480Ile Ala Lys Asn
Pro Lys Leu Asp Met Ile Ile Asn Leu Leu Ala Thr 485 490 495Ser Leu
Pro Lys Ser Ala Thr Phe Phe Leu Thr Tyr Val Ala Leu Lys 500 505
510Phe Phe Ile Gly Tyr Gly Leu Glu Leu Ser Arg Ile Ile Pro Leu Ile
515 520 525Ile Phe His Leu Lys Lys Lys Tyr Leu Cys Lys Thr Glu Ala
Glu Val 530 535 540Lys Glu Ala Trp Tyr Pro Gly Asp Leu Ser Tyr Ala
Thr Arg Val Pro545 550 555 560Gly Asp Met Leu Ile Leu Thr Ile Thr
Phe Cys Tyr Ser Val Ile Ala 565 570 575Pro Leu Ile Leu Ile Phe Gly
Ile Thr Tyr Phe Gly Leu Gly Trp Leu 580 585 590Val Leu Arg Asn Gln
Ala Leu Lys Val Tyr Val Pro Ser Tyr Glu Ser 595 600 605Tyr Gly Arg
Met Trp Pro His Ile His Gln Arg Ile Leu Ala Ala Leu 610 615 620Phe
Leu Phe Gln Val Val Met Phe Gly Tyr Leu Gly Ala Lys Thr Phe625 630
635 640Phe Tyr Thr Ala Leu Val Ile Pro Leu Ile Ile Thr Ser Leu Ile
Phe 645 650 655Gly Tyr Val Cys Arg Gln Lys Phe Tyr Gly Gly Phe Glu
His Thr Ala 660 665 670Leu Glu Val Ala Cys Arg Glu Leu Lys Gln Ser
Pro Asp Leu Glu Glu 675 680 685Ile Phe Arg Ala Tyr Ile Pro His Ser
Leu Ser Ser His Lys Pro Glu 690 695 700Glu His Glu Phe Lys Gly Ala
Met Ser Arg Tyr Gln Asp Phe Asn Ala705 710 715 720Ile Ala Gly Val
121 1313 PRT Arabidopsis thaliana
121 Met Ala Glu Gln Lys Ser Thr
Asn Met Trp Asn Trp Glu Val Thr Gly1 5 10 15Phe Glu Ser Lys Lys Ser
Pro Ser Ser Glu Glu Gly Val His Arg Thr 20 25 30Pro Ser Ser Met Leu
Arg Arg Tyr Ser Ile Pro Lys Asn Ser Leu Pro 35 40 45Pro His Ser Ser
Glu Leu Ala Ser Lys Val Gln Ser Leu Lys Asp Lys 50 55 60Val Gln Leu
Ala Lys Asp Asp Tyr Val Gly Leu Arg Gln Glu Ala Thr65 70 75 80Asp
Leu Gln Glu Tyr Ser Asn Ala Lys Leu Glu Arg Val Thr Arg Tyr 85 90
95Leu Gly Val Leu Ala Asp Lys Ser Arg Lys Leu Asp Gln Tyr Ala Leu
100 105 110Glu Thr Glu Ala Arg Ile Ser Pro Leu Ile Asn Glu Lys Lys
Arg Leu 115 120 125Phe Asn Asp Leu Leu Thr Thr Lys Gly Ala His Leu
Pro Phe Pro Thr 130 135 140Ser Phe Ser Ile Leu Thr Ser Ile Asp Ile
Asp His Thr Arg Pro Leu145 150 155 160Phe Glu Asp Glu Gly Pro Ser
Ile Ile Glu Phe Pro Asp Asn Cys Thr 165 170 175Ile Arg Val Asn Thr
Ser Asp Asp Thr Leu Ser Asn Pro Lys Lys Glu
180 185 190Phe Glu Phe Asp Arg Val Tyr Gly Pro Gln Val Gly Gln Ala
Ser Leu 195 200 205Phe Ser Asp Val Gln Pro Phe Val Gln Ser Ala Leu
Asp Gly Ser Asn 210 215 220Val Ser Ile Phe Ala Tyr Gly Gln Thr His
Ala Gly Lys Thr Tyr Thr225 230 235 240Met Val Ala Pro Pro Phe Pro
Phe Leu Ser Glu Ile Arg Tyr Arg Ser 245 250 255Cys Leu Asp Leu Asn
Met Ile Gly Lys Phe Met Asp Val His Ser Lys 260 265 270Phe Met Asp
Glu Gly Ser Asn Gln Asp Arg Gly Leu Tyr Ala Arg Cys 275 280 285Phe
Glu Glu Leu Met Asp Leu Ala Asn Ser Asp Ser Thr Ser Ala Ser 290 295
300Gln Phe Ser Phe Ser Val Ser Val Phe Glu Leu Tyr Asn Glu Gln
Val305 310 315 320Arg Asp Leu Leu Ser Gly Cys Gln Ser Asn Leu Pro
Lys Ile Asn Met 325 330 335Gly Leu Arg Glu Ser Val Ile Glu Leu Ser
Gln Glu Lys Val Asp Asn 340 345 350Pro Ser Glu Phe Met Arg Val Leu
Asn Ser Ala Phe Gln Asn Arg Gly 355 360 365Asn Asp Lys Ser Lys Ser
Thr Val Thr His Leu Ile Val Ser Ile His 370 375 380Ile Cys Tyr Ser
Asn Thr Ile Thr Arg Glu Asn Val Ile Ser Lys Leu385 390 395 400Ser
Leu Val Asp Leu Ala Gly Ser Glu Gly Leu Thr Val Glu Asp Asp 405 410
415Asn Gly Asp His Val Thr Asp Leu Leu His Val Thr Asn Ser Ile Ser
420 425 430Ala Leu Gly Asp Val Leu Ser Ser Leu Thr Ser Lys Arg Asp
Thr Ile 435 440 445Pro Tyr Glu Asn Ser Phe Leu Thr Arg Ile Leu Ala
Asp Ser Leu Gly 450 455 460Gly Ser Ser Lys Thr Leu Met Ile Val Asn
Ile Cys Pro Ser Ala Arg465 470 475 480Asn Leu Ser Glu Ile Met Ser
Cys Leu Asn Tyr Ala Ala Arg Ala Arg 485 490 495Asn Thr Val Pro Ser
Leu Gly Asn Arg Asp Thr Ile Lys Lys Trp Arg 500 505 510Asp Val Ala
Asn Asp Ala Arg Lys Glu Val Leu Glu Lys Glu Arg Glu 515 520 525Asn
Gln Arg Leu Lys Gln Glu Val Thr Gly Leu Lys Gln Ala Leu Lys 530 535
540Glu Ala Asn Asp Gln Cys Val Leu Leu Tyr Asn Glu Val Gln Arg
Ala545 550 555 560Trp Arg Val Ser Phe Thr Leu Gln Ser Asp Leu Lys
Ser Glu Asn Ala 565 570 575Met Val Val Asp Lys His Lys Ile Glu Lys
Glu Gln Asn Phe Gln Leu 580 585 590Arg Asn Gln Ile Ala Gln Leu Leu
Gln Leu Glu Gln Glu Gln Lys Leu 595 600 605Gln Ala Gln Gln Gln Asp
Ser Thr Ile Gln Asn Leu Gln Ser Lys Val 610 615 620Lys Asp Leu Glu
Ser Gln Leu Ser Lys Ala Leu Lys Ser Asp Met Thr625 630 635 640Arg
Ser Arg Asp Pro Leu Glu Pro Gln Pro Arg Ala Ala Glu Asn Thr 645 650
655Leu Asp Ser Ser Ala Val Thr Lys Lys Leu Glu Glu Glu Leu Lys Lys
660 665 670Arg Asp Ala Leu Ile Glu Arg Leu His Glu Glu Asn Glu Lys
Leu Phe 675 680 685Asp Arg Leu Thr Glu Lys Ser Val Ala Ser Ser Thr
Gln Val Ser Ser 690 695 700Pro Ser Ser Lys Ala Ser Pro Thr Val Gln
Pro Ala Asp Val Asp Arg705 710 715 720Lys Asn Ser Ala Gly Thr Leu
Pro Ser Ser Val Asp Lys Asn Glu Gly 725 730 735Thr Ile Thr Leu Val
Lys Ser Ser Ser Glu Leu Val Lys Thr Thr Pro 740 745 750Ala Gly Glu
Tyr Leu Thr Ala Ala Leu Asn Asp Phe Asp Pro Glu Gln 755 760 765Tyr
Glu Gly Leu Ala Ala Ile Ala Asp Gly Ala Asn Lys Leu Leu Met 770 775
780Leu Val Leu Ala Ala Val Ile Lys Ala Gly Ala Ser Arg Glu His
Glu785 790 795 800Ile Leu Ala Glu Ile Arg Asp Ser Val Phe Ser Phe
Ile Arg Lys Met 805 810 815Glu Pro Arg Arg Val Met Asp Thr Met Leu
Val Ser Arg Val Arg Ile 820 825 830Leu Tyr Ile Arg Ser Leu Leu Ala
Arg Ser Pro Glu Leu Gln Ser Ile 835 840 845Lys Val Ser Pro Val Glu
Arg Phe Leu Glu Lys Pro Tyr Thr Gly Arg 850 855 860Thr Arg Ser Ser
Ser Gly Ser Ser Ser Pro Gly Arg Ser Pro Val Arg865 870 875 880Tyr
Tyr Asp Glu Gln Ile Tyr Gly Phe Lys Val Asn Leu Lys Pro Glu 885 890
895Lys Lys Ser Lys Leu Val Ser Val Val Ser Arg Ile Arg Gly His Asp
900 905 910Gln Asp Thr Gly Arg Gln Gln Val Thr Gly Gly Lys Leu Arg
Glu Ile 915 920 925Gln Asp Glu Ala Lys Ser Phe Ala Ile Gly Asn Lys
Pro Leu Ala Ala 930 935 940Leu Phe Val His Thr Pro Ala Gly Glu Leu
Gln Arg Gln Ile Arg Ser945 950 955 960Trp Leu Ala Glu Ser Phe Glu
Phe Leu Ser Val Thr Ala Asp Asp Val 965 970 975Ser Gly Val Thr Thr
Gly Gln Leu Glu Leu Leu Ser Thr Ala Ile Met 980 985 990Asp Gly Trp
Met Ala Gly Val Gly Ala Ala Val Pro Pro His Thr Asp 995 1000
1005Ala Leu Gly Gln Leu Leu Ser Glu Tyr Ala Lys Arg Val Tyr Thr
1010 1015 1020Ser Gln Met Gln His Leu Lys Asp Ile Ala Gly Thr Leu
Ala Ser 1025 1030 1035Glu Glu Ala Glu Asp Ala Gly Gln Val Ala Lys
Leu Arg Ser Ala 1040 1045 1050Leu Glu Ser Val Asp His Lys Arg Arg
Lys Ile Leu Gln Gln Met 1055 1060 1065Arg Ser Asp Ala Ala Leu Phe
Thr Leu Glu Glu Gly Ser Ser Pro 1070 1075 1080Val Gln Asn Pro Ser
Thr Ala Ala Glu Asp Ser Arg Leu Ala Ser 1085 1090 1095Leu Ile Ser
Leu Asp Ala Ile Leu Lys Gln Val Lys Glu Ile Thr 1100 1105 1110Arg
Gln Ala Ser Val His Val Leu Ser Lys Ser Lys Lys Lys Ala 1115 1120
1125Leu Leu Glu Ser Leu Asp Glu Leu Asn Glu Arg Met Pro Ser Leu
1130 1135 1140Leu Asp Val Asp His Pro Cys Ala Gln Arg Glu Ile Asp
Thr Ala 1145 1150 1155His Gln Leu Val Glu Thr Ile Pro Glu Gln Glu
Asp Asn Leu Gln 1160 1165 1170Asp Glu Lys Arg Pro Ser Ile Asp Ser
Ile Ser Ser Thr Glu Thr 1175 1180 1185Asp Val Ser Gln Trp Asn Val
Leu Gln Phe Asn Thr Gly Gly Ser 1190 1195 1200Ser Ala Pro Phe Ile
Ile Lys Cys Gly Ala Asn Ser Asn Ser Glu 1205 1210 1215Leu Val Ile
Lys Ala Asp Ala Arg Ile Gln Glu Pro Lys Gly Gly 1220 1225 1230Glu
Ile Val Arg Val Val Pro Arg Pro Ser Val Leu Glu Asn Met 1235 1240
1245Ser Leu Glu Glu Met Lys Gln Val Phe Gly Gln Leu Pro Glu Ala
1250 1255 1260Leu Ser Ser Leu Ala Leu Ala Arg Thr Ala Asp Gly Thr
Arg Ala 1265 1270 1275Arg Tyr Ser Arg Leu Tyr Arg Thr Leu Ala Met
Lys Val Pro Ser 1280 1285 1290Leu Arg Asp Leu Val Gly Glu Leu Glu
Lys Gly Gly Val Leu Lys 1295 1300 1305Asp Thr Lys Ser Thr 1310
122 310 PRT Arabidopsis thaliana
122 Met Ala Asn Pro Trp Trp Val
Gly Asn Val Ala Ile Gly Gly Val Glu1 5 10 15Ser Pro Val Thr Ser Ser
Ala Pro Ser Leu His His Arg Asn Ser Asn 20 25 30Asn Asn Asn Pro Pro
Thr Met Thr Arg Ser Asp Pro Arg Leu Asp His 35 40 45Asp Phe Thr Thr
Asn Asn Ser Gly Ser Pro Asn Thr Gln Thr Gln Ser 50 55 60Gln Glu Glu
Gln Asn Ser Arg Asp Glu Gln Pro Ala Val Glu Pro Gly65 70 75 80Ser
Gly Ser Gly Ser Thr Gly Arg Arg Pro Arg Gly Arg Pro Pro Gly 85 90
95Ser Lys Asn Lys Pro Lys Ser Pro Val Val Val Thr Lys Glu Ser Pro
100 105 110Asn Ser Leu Gln Ser His Val Leu Glu Ile Ala Thr Gly Ala
Asp Val 115 120 125Ala Glu Ser Leu Asn Ala Phe Ala Arg Arg Arg Gly
Arg Gly Val Ser 130 135 140Val Leu Ser Gly Ser Gly Leu Val Thr Asn
Val Thr Leu Arg Gln Pro145 150 155 160Ala Ala Ser Gly Gly Val Val
Ser Leu Arg Gly Gln Phe Glu Ile Leu 165 170 175Ser Met Cys Gly Ala
Phe Leu Pro Thr Ser Gly Ser Pro Ala Ala Ala 180 185 190Ala Gly Leu
Thr Ile Tyr Leu Ala Gly Ala Gln Gly Gln Val Val Gly 195 200 205Gly
Gly Val Ala Gly Pro Leu Ile Ala Ser Gly Pro Val Ile Val Ile 210 215
220Ala Ala Thr Phe Cys Asn Ala Thr Tyr Glu Arg Leu Pro Ile Glu
Glu225 230 235 240Glu Gln Gln Gln Glu Gln Pro Leu Gln Leu Glu Asp
Gly Lys Lys Gln 245 250 255Lys Glu Glu Asn Asp Asp Asn Glu Ser Gly
Asn Asn Gly Asn Glu Gly 260 265 270Ser Met Gln Pro Pro Met Tyr Asn
Met Pro Pro Asn Phe Ile Pro Asn 275 280 285Gly His Gln Met Ala Gln
His Asp Val Tyr Trp Gly Gly Pro Pro Pro 290 295 300Arg Ala Pro Pro
Ser Tyr305 310
123 964 PRT Arabidopsis thaliana
123 Met Ala Leu Asn Leu
Arg Gln Lys Gln Thr Glu Cys Val Ile Arg Met1 5 10 15Leu Asn Leu Asn
Gln Pro Leu Asn Pro Ser Gly Thr Ala Asn Glu Glu 20 25 30Val Tyr Lys
Ile Leu Ile Tyr Asp Arg Phe Cys Gln Asn Ile Leu Ser 35 40 45Pro Leu
Thr His Val Lys Asp Leu Arg Lys His Gly Val Thr Leu Phe 50 55 60Phe
Leu Ile Asp Lys Asp Arg Gln Pro Val His Asp Val Pro Ala Val65 70 75
80Tyr Phe Val Gln Pro Thr Glu Ser Asn Leu Gln Arg Ile Ile Ala Asp
85 90 95Ala Ser Arg Ser Leu Tyr Asp Thr Phe His Leu Asn Phe Ser Ser
Ser 100 105 110Ile Pro Arg Lys Phe Leu Glu Glu Leu Ala Ser Gly Thr
Leu Lys Ser 115 120 125Gly Ser Val Glu Lys Val Ser Lys Val His Asp
Gln Tyr Leu Glu Phe 130 135 140Val Thr Leu Glu Asp Asn Leu Phe Ser
Leu Ala Gln Gln Ser Thr Tyr145 150 155 160Val Gln Met Asn Asp Pro
Ser Ala Gly Glu Lys Glu Ile Asn Glu Ile 165 170 175Ile Glu Arg Val
Ala Ser Gly Leu Phe Cys Val Leu Val Thr Leu Gly 180 185 190Val Val
Pro Val Ile Arg Cys Pro Ser Gly Gly Pro Ala Glu Met Val 195 200
205Ala Ser Leu Leu Asp Gln Lys Leu Arg Asp His Leu Leu Ser Lys Asn
210 215 220Asn Leu Phe Thr Glu Gly Gly Gly Phe Met Ser Ser Phe Gln
Arg Pro225 230 235 240Leu Leu Cys Ile Phe Asp Arg Asn Phe Glu Leu
Ser Val Gly Ile Gln 245 250 255His Asp Phe Arg Tyr Arg Pro Leu Val
His Asp Val Leu Gly Leu Lys 260 265 270Leu Asn Gln Leu Lys Val Gln
Gly Glu Lys Gly Pro Pro Lys Ser Phe 275 280 285Glu Leu Asp Ser Ser
Asp Pro Phe Trp Ser Ala Asn Ser Thr Leu Glu 290 295 300Phe Pro Asp
Val Ala Val Glu Ile Glu Thr Gln Leu Asn Lys Tyr Lys305 310 315
320Arg Asp Val Glu Glu Val Asn Lys Lys Thr Gly Gly Gly Ser Gly Ala
325 330 335Glu Phe Asp Gly Thr Asp Leu Ile Gly Asn Ile His Thr Glu
His Leu 340 345 350Met Asn Thr Val Lys Ser Leu Pro Glu Leu Thr Glu
Arg Lys Lys Val 355 360 365Ile Asp Lys His Thr Asn Ile Ala Thr Ala
Leu Leu Gly Gln Ile Lys 370 375 380Glu Arg Ser Ile Asp Ala Phe Thr
Lys Lys Glu Ser Asp Met Met Met385 390 395 400Arg Gly Gly Ile Asp
Arg Thr Glu Leu Met Ala Ala Leu Lys Gly Lys 405 410 415Gly Thr Lys
Met Asp Lys Leu Arg Phe Ala Ile Met Tyr Leu Ile Ser 420 425 430Thr
Glu Thr Ile Asn Gln Ser Glu Val Glu Ala Val Glu Ala Ala Leu 435 440
445Asn Glu Ala Glu Ala Asp Thr Ser Ala Phe Gln Tyr Val Lys Lys Ile
450 455 460Lys Ser Leu Asn Ala Ser Phe Ala Ala Thr Ser Ala Asn Ser
Ala Ser465 470 475 480Arg Ser Asn Ile Val Asp Trp Ala Glu Lys Leu
Tyr Gly Gln Ser Ile 485 490 495Ser Ala Val Thr Ala Gly Val Lys Asn
Leu Leu Ser Ser Asp Gln Gln 500 505 510Leu Ala Val Thr Arg Thr Val
Glu Ala Leu Thr Glu Gly Lys Pro Asn 515 520 525Pro Glu Ile Asp Ser
Tyr Arg Phe Leu Asp Pro Arg Ala Pro Lys Ser 530 535 540Ser Ser Ser
Gly Gly Ser His Val Lys Gly Pro Phe Arg Glu Ala Ile545 550 555
560Val Phe Met Ile Gly Gly Gly Asn Tyr Val Glu Tyr Gly Ser Leu Gln
565 570 575Glu Leu Thr Gln Arg Gln Leu Thr Val Lys Asn Val Ile Tyr
Gly Ala 580 585 590Thr Glu Ile Leu Asn Gly Gly Glu Leu Val Glu Gln
Leu Gly Leu Leu 595 600 605Gly Lys Lys Met Gly Leu Gly Gly Pro Val
Ala Ser Thr Leu Lys Arg 610 615 620Leu Gly Met Ala Gly Lys Glu Glu
Thr Asp Val Ser Ala Gln Gly Ser625 630 635 640Leu Thr Arg Glu Ala
Thr Glu Ile Trp Arg Ser Glu Leu Glu Ser Arg 645 650 655Arg Phe Gln
Val Asp Ser Leu Glu Ala Glu Leu Val Asp Val Lys Ala 660 665 670Tyr
Leu Glu Phe Gly Ser Glu Glu Asp Ala Arg Lys Glu Leu Gly Val 675 680
685Leu Ser Gly Arg Val Arg Ser Thr Ala Thr Met Leu Arg Tyr Leu Arg
690 695 700Ser Lys Ala Arg Val Leu Ala Ile Pro Asp Asp Leu Ala Asn
Val Ser705 710 715 720Cys Gly Val Glu Gln Ile Glu Glu Leu Lys Gly
Leu Asn Leu Val Glu 725 730 735Lys Asp Gly Gly Ser Ser Ser Ser Asp
Gly Ala Arg Asn Thr Asn Pro 740 745 750Glu Thr Arg Arg Tyr Ser Gly
Ser Leu Gly Val Glu Asp Gly Ala Tyr 755 760 765Thr Asn Glu Met Leu
Gln Ser Ile Glu Met Val Thr Asp Val Leu Asp 770 775 780Ser Leu Val
Arg Arg Val Thr Val Ala Glu Ser Glu Ser Ala Val Gln785 790 795
800Lys Glu Arg Ala Leu Leu Gly Glu Glu Glu Ile Ser Arg Lys Thr Ile
805 810 815Gln Ile Glu Asn Leu Ser Val Lys Leu Glu Glu Met Glu Arg
Phe Ala 820 825 830Tyr Gly Thr Asn Ser Val Leu Asn Glu Met Arg Glu
Arg Ile Glu Glu 835 840 845Leu Val Glu Glu Thr Met Arg Gln Arg Glu
Lys Ala Val Glu Asn Glu 850 855 860Glu Glu Leu Cys Arg Val Lys Arg
Glu Phe Glu Ser Leu Lys Ser Tyr865 870 875 880Val Ser Thr Phe Thr
Asn Val Arg Glu Thr Leu Leu Ser Ser Glu Arg 885 890 895Gln Phe Lys
Thr Ile Glu Glu Leu Phe Glu Arg Leu Val Thr Lys Thr 900 905 910Thr
Gln Leu Glu Gly Glu Lys Ala Gln Lys Glu Val Glu Val Gln Lys 915 920
925Leu Met Glu Glu Asn Val Lys Leu Thr Ala Leu Leu Asp Lys Lys Glu
930 935 940Ala Gln Leu Leu Ala Leu Asn Glu Gln Cys Lys Val Met Ala
Leu Ser945 950 955 960Ala Ser Asn Ile
124 222 PRT Arabidopsis thaliana
124 Met Asp Ala Met Lys Glu Glu Ile Gln Arg Val Lys Glu Gln Glu Glu1
5 10 15Gln Ala Met Arg Glu Ala Leu Gly Leu Ala Pro Lys Ser Ser Thr
Arg 20 25 30Pro Gln Gly Asn Arg Leu Asp Lys Gln Glu Phe Thr Glu Leu
Val Lys 35 40 45Arg Gly Ser Thr Ala Glu Asp Leu Gly Ala Gly
Asn Ala Asp Ala Val 50 55 60Trp Val His Gly Leu Gly Tyr Ala Lys Ala
Pro Arg Pro Trp Glu Asp65 70 75 80Pro Ser Thr Leu Ala Ser Ser Gln
Lys Glu Asp Ala Asp Ser Ala Arg 85 90 95Leu Pro Ala Asp Thr Ser Gly
Val Lys Thr Val Glu Asp Gly Pro Asp 100 105 110Asp Val Glu Arg Asp
Gln Arg Arg Ile Gly Val Arg Lys Gly Asn Leu 115 120 125Gln Arg Glu
Arg Arg Lys Lys Asp Met Ile Gly Val Lys Asn Ala Lys 130 135 140Gly
Met Arg Ser Glu Ala Leu Val Ile Gln Met Ile Glu Arg Ser Thr145 150
155 160Arg Lys Arg Arg Arg Arg Lys Lys Glu Gly Met Thr Leu Ile Leu
Ile 165 170 175Glu Ala Asn Cys Pro Arg Met Glu His Phe Ala Leu Gln
Arg Lys Ser 180 185 190Gly Arg Leu Gly Thr Lys Ile Gln Leu Pro Leu
Leu Gln Asp Leu Asn 195 200 205Leu Leu Leu Ile Ser Phe Thr Asn Arg
Gly Val Lys Cys Cys 210 215 220
125 148 PRT Arabidopsis thaliana
125 Met Gly Lys Asp Gly Leu Ser Asp Asp Gln Val Ser Ser Met Lys Glu1 5 10
15Ala Phe Met Leu Phe Asp Thr Asp Gly Asp Gly Lys Ile Ala Pro Ser
20 25 30Glu Leu Gly Ile Leu Met Arg Ser Leu Gly Gly Asn Pro Thr Gln
Ala 35 40 45Gln Leu Lys Ser Ile Ile Ala Ser Glu Asn Leu Ser Ser Pro
Phe Asp 50 55 60Phe Asn Arg Phe Leu Asp Leu Met Ala Lys His Leu Lys
Thr Glu Pro65 70 75 80Phe Asp Arg Gln Leu Arg Asp Ala Phe Lys Val
Leu Asp Lys Glu Gly 85 90 95Thr Gly Phe Val Ala Val Ala Asp Leu Arg
His Ile Leu Thr Ser Ile 100 105 110Gly Glu Lys Leu Glu Pro Asn Glu
Phe Asp Glu Trp Ile Lys Glu Val 115 120 125Asp Val Gly Ser Asp Gly
Lys Ile Arg Tyr Glu Asp Phe Ile Ala Arg 130 135 140Met Val Ala Lys145
126 70 PRT Arabidopsis thaliana
126 Met Glu Lys Gln Ser Thr Gln
Pro Ile Cys Gly Gln Glu Ala Leu Gln1 5 10 15Leu Leu Asn Cys Val Ala
Glu Ser Pro Phe Asp Gln Glu Lys Cys Val 20 25 30Arg Phe Leu Gln Ser
Leu Arg Glu Cys Val Leu Ser Lys Lys Val Lys 35 40 45Lys Phe Ser Ile
Pro Ser Gln Asp His Asp Ser Glu Gly Ala Ala Ser 50 55 60Ala Thr Lys
Arg Pro Ser65 70
127 385 PRT Arabidopsis thaliana
127 Met Thr Thr Thr
Gly Ser Asn Ser Asn His Asn His His Glu Ser Asn1 5 10 15Asn Asn Asn
Asn Asn Pro Ser Thr Arg Ser Trp Gly Thr Ala Val Ser 20 25 30Gly Gln
Ser Val Ser Thr Ser Gly Ser Met Gly Ser Pro Ser Ser Arg 35 40 45Ser
Glu Gln Thr Ile Thr Val Val Thr Ser Thr Ser Asp Thr Thr Phe 50 55
60Gln Arg Leu Asn Asn Leu Asp Ile Gln Gly Asp Asp Ala Gly Ser Gln65
70 75 80Gly Ala Ser Gly Val Lys Lys Lys Lys Arg Gly Gln Arg Ala Ala
Gly 85 90 95Pro Asp Lys Thr Gly Arg Gly Leu Arg Gln Phe Ser Met Lys
Val Cys 100 105 110Glu Lys Val Glu Ser Lys Gly Arg Thr Thr Tyr Asn
Glu Val Ala Asp 115 120 125Glu Leu Val Ala Glu Phe Ala Leu Pro Asn
Asn Asp Gly Thr Ser Pro 130 135 140Asp Gln Gln Gln Tyr Asp Glu Lys
Asn Ile Arg Arg Arg Val Tyr Asp145 150 155 160Ala Leu Asn Val Leu
Met Ala Met Asp Ile Ile Ser Lys Asp Lys Lys 165 170 175Glu Ile Gln
Trp Arg Gly Leu Pro Arg Thr Ser Leu Ser Asp Ile Glu 180 185 190Glu
Leu Lys Asn Glu Arg Leu Ser Leu Arg Asn Arg Ile Glu Lys Lys 195 200
205Thr Ala Tyr Ser Gln Glu Leu Glu Glu Gln Tyr Val Gly Leu Gln Asn
210 215 220Leu Ile Gln Arg Asn Glu His Leu Tyr Ser Ser Gly Asn Ala
Pro Ser225 230 235 240Gly Gly Val Ala Leu Pro Phe Ile Leu Val Gln
Thr Arg Pro His Ala 245 250 255Thr Val Glu Val Glu Ile Ser Glu Asp
Met Gln Leu Val His Phe Asp 260 265 270Phe Asn Ser Thr Pro Phe Glu
Leu His Asp Asp Asn Phe Val Leu Lys 275 280 285Thr Met Lys Phe Cys
Asp Gln Pro Pro Gln Gln Pro Asn Gly Arg Asn 290 295 300Asn Ser Gln
Leu Val Cys His Asn Phe Thr Pro Glu Asn Pro Asn Lys305 310 315
320Gly Pro Ser Thr Gly Pro Thr Pro Gln Leu Asp Met Tyr Glu Thr His
325 330 335Leu Gln Ser Gln Gln His Gln Gln His Ser Gln Leu Gln Ile
Ile Pro 340 345 350Met Pro Glu Thr Asn Asn Val Thr Ser Ser Ala Asp
Thr Ala Pro Val 355 360 365Lys Ser Pro Ser Leu Pro Gly Ile Met Asn
Ser Ser Met Lys Pro Glu 370 375 380Asn385
128 437 PRT Arabidopsis thaliana
128 Met Ala Leu Gln Asn Ile Gly Ala Ser Asn Arg Asn Asp Ala
Phe Tyr1 5 10 15Arg Tyr Lys Met Pro Lys Met Val Thr Lys Thr Glu Gly
Lys Gly Asn 20 25 30Gly Ile Lys Thr Asn Ile Ile Asn Asn Val Glu Ile
Ala Lys Ala Leu 35 40 45Ala Arg Pro Pro Ser Tyr Thr Thr Lys Tyr Phe
Gly Cys Glu Leu Gly 50 55 60Ala Gln Ser Lys Phe Asp Glu Lys Thr Gly
Thr Ser Leu Val Asn Gly65 70 75 80Ala His Asn Thr Ser Lys Leu Ala
Gly Leu Leu Glu Asn Phe Ile Lys 85 90 95Lys Phe Val Gln Cys Tyr Gly
Cys Gly Asn Pro Glu Thr Glu Ile Ile 100 105 110Ile Thr Lys Thr Gln
Met Val Asn Leu Lys Cys Ala Ala Cys Gly Phe 115 120 125Ile Ser Glu
Val Asp Met Arg Asp Lys Leu Thr Asn Phe Ile Leu Lys 130 135 140Asn
Pro Pro Glu Gln Lys Lys Val Ser Lys Asp Lys Lys Ala Met Arg145 150
155 160Lys Ala Glu Lys Glu Arg Leu Lys Glu Gly Glu Leu Ala Asp Glu
Glu 165 170 175Gln Arg Lys Leu Lys Ala Lys Lys Lys Ala Leu Ser Asn
Gly Lys Asp 180 185 190Ser Lys Thr Ser Lys Asn His Ser Ser Asp Glu
Asp Ile Ser Pro Lys 195 200 205His Asp Glu Asn Ala Leu Glu Val Asp
Glu Asp Glu Asp Asp Asp Asp 210 215 220Gly Val Glu Trp Gln Thr Asp
Thr Ser Arg Glu Ala Ala Glu Lys Arg225 230 235 240Met Met Glu Gln
Leu Ser Ala Lys Thr Ala Glu Met Val Met Leu Ser 245 250 255Ala Met
Glu Val Glu Glu Lys Lys Ala Pro Lys Ser Lys Ser Asn Gly 260 265
270Asn Val Val Lys Thr Glu Asn Pro Pro Pro Gln Glu Lys Asn Leu Val
275 280 285Gln Asp Met Lys Glu Tyr Leu Lys Lys Gly Ser Pro Ile Ser
Ala Leu 290 295 300Lys Ser Phe Ile Ser Ser Leu Ser Glu Pro Pro Gln
Asp Ile Met Asp305 310 315 320Ala Leu Phe Asn Ala Leu Phe Asp Gly
Val Gly Lys Gly Phe Ala Lys 325 330 335Glu Val Thr Lys Lys Lys Asn
Tyr Leu Ala Ala Ala Ala Thr Met Gln 340 345 350Glu Asp Gly Ser Gln
Met His Leu Leu Asn Ser Ile Gly Thr Phe Cys 355 360 365Gly Lys Asn
Gly Asn Glu Glu Ala Leu Lys Glu Val Ala Leu Val Leu 370 375 380Lys
Ala Leu Tyr Asp Gln Asp Ile Ile Glu Glu Glu Val Val Leu Asp385 390
395 400Trp Tyr Glu Lys Gly Leu Thr Gly Ala Asp Lys Ser Ser Pro Val
Trp 405 410 415Lys Asn Val Lys Pro Phe Val Glu Trp Leu Gln Ser Ala
Glu Ser Glu 420 425 430Ser Glu Glu Glu Asp 435129749PRTArabidopsis
thaliana 129Met Ala Ala Asn Lys Phe Ala Thr Leu Ile His Arg Lys Thr
Asn Arg1 5 10 15Ile Thr Leu Ile Leu Val Tyr Ala Phe Leu Glu Trp Ser
Leu Ile Phe 20 25 30Phe Ile Leu Leu Asn Ser Leu Phe Ser Tyr Phe Ile
Leu Arg Phe Ala 35 40 45Asp Tyr Phe Gly Leu Lys Arg Pro Cys Leu Phe
Cys Ser Arg Leu Asp 50 55 60Arg Phe Phe Asp Ala Ser Gly Lys Ser Pro
Ser His Arg Asp Leu Leu65 70 75 80Cys Asp Asp His Ala Leu Gln Leu
His Ser Lys Pro Val Glu Glu Ser 85 90 95Asn Cys Gly Phe Gly Glu Phe
His Asn Asp Leu Val His Arg Gly Cys 100 105 110Cys Val Glu Lys Ile
Ser Ser Ser Leu Cys Ala Pro Ile Glu Ser Asp 115 120 125Phe Gly Asn
Leu Asp Tyr Pro Ile Gly Asp Glu Gly Gln Ile Tyr Asn 130 135 140Gly
Leu Lys Phe Pro Arg Ser Ile Phe Val Phe Glu Glu Glu Lys Val145 150
155 160Gly Ser Val Asn Leu Asn Asp Ser Gln Glu Glu Thr Glu Glu Lys
Lys 165 170 175Val Pro Gln Ser His Glu Lys Leu Glu Asp Asp Asp Val
Asp Glu Glu 180 185 190Phe Ser Cys Tyr Val Ser Ser Phe Asp Cys Lys
Asn Lys Glu Ile Ala 195 200 205Thr Glu Lys Glu Glu Glu Asn Arg Val
Asp Leu Pro Ile Glu Val Glu 210 215 220Thr Ala Glu Ser Ala Pro Lys
Asn Leu Glu Phe Tyr Ile Asp Glu Glu225 230 235 240Asp Cys His Leu
Ile Pro Val Glu Phe Tyr Lys Pro Ser Glu Glu Val 245 250 255Arg Glu
Ile Ser Asp Ile Asn Gly Asp Phe Ile Leu Asp Phe Gly Val 260 265
270Glu His Asp Phe Thr Ala Ala Ala Glu Thr Glu Glu Ile Ser Asp Phe
275 280 285Ala Ser Pro Gly Glu Ser Lys Pro Glu Asp Ala Glu Thr Asn
Leu Val 290 295 300Ala Ser Glu Met Glu Asn Asp Asp Glu Glu Thr Asp
Ala Glu Val Ser305 310 315 320Ile Gly Thr Glu Ile Pro Asp His Glu
Gln Ile Gly Asp Ile Pro Ser 325 330 335His Gln Leu Ile Pro His His
Asp Asp Asp Asp His Glu Glu Glu Thr 340 345 350Leu Glu Phe Lys Thr
Val Thr Ile Glu Thr Lys Met Pro Val Leu Asn 355 360 365Ile Asn Glu
Glu Arg Ile Leu Glu Ala Gln Gly Ser Met Glu Ser Ser 370 375 380His
Ser Ser Leu His Asn Ala Met Phe His Leu Glu Gln Arg Val Ser385 390
395 400Val Asp Gly Ile Glu Cys Pro Glu Gly Val Leu Thr Val Asp Lys
Leu 405 410 415Lys Phe Glu Leu Gln Glu Glu Arg Lys Ala Leu His Ala
Leu Tyr Glu 420 425 430Glu Leu Glu Val Glu Arg Asn Ala Ser Ala Val
Ala Ala Ser Glu Thr 435 440 445Met Ala Met Ile Asn Arg Leu His Glu
Glu Lys Ala Ala Met Gln Met 450 455 460Glu Ala Leu Gln Tyr Gln Arg
Met Met Glu Glu Gln Ala Glu Phe Asp465 470 475 480Gln Glu Ala Leu
Gln Leu Leu Asn Glu Leu Met Val Asn Arg Glu Lys 485 490 495Glu Asn
Ala Glu Leu Glu Lys Glu Leu Glu Val Tyr Arg Lys Arg Met 500 505
510Glu Glu Tyr Glu Ala Lys Glu Lys Met Gly Met Leu Arg Arg Arg Leu
515 520 525Arg Asp Ser Ser Val Asp Ser Tyr Arg Asn Asn Gly Asp Ser
Asp Glu 530 535 540Asn Ser Asn Gly Glu Leu Gln Phe Lys Asn Val Glu
Gly Val Thr Asp545 550 555 560Trp Lys Tyr Arg Glu Asn Glu Met Glu
Asn Thr Pro Val Asp Val Val 565 570 575Leu Arg Leu Asp Glu Cys Leu
Asp Asp Tyr Asp Gly Glu Arg Leu Ser 580 585 590Ile Leu Gly Arg Leu
Lys Phe Leu Glu Glu Lys Leu Thr Asp Leu Asn 595 600 605Asn Glu Glu
Asp Asp Glu Glu Glu Ala Lys Thr Phe Glu Ser Asn Gly 610 615 620Ser
Ile Asn Gly Asn Glu His Ile His Gly Lys Glu Thr Asn Gly Lys625 630
635 640His Arg Val Ile Lys Ser Lys Arg Leu Leu Pro Leu Phe Asp Ala
Val 645 650 655Asp Gly Glu Met Glu Asn Gly Leu Ser Asn Gly Asn His
His Glu Asn 660 665 670Gly Phe Asp Asp Ser Glu Lys Gly Glu Asn Val
Thr Ile Glu Glu Glu 675 680 685Val Asp Glu Leu Tyr Glu Arg Leu Glu
Ala Leu Glu Ala Asp Arg Glu 690 695 700Phe Leu Arg His Cys Val Gly
Ser Leu Lys Lys Gly Asp Lys Gly Val705 710 715 720His Leu Leu His
Glu Ile Leu Gln His Leu Arg Asp Leu Arg Asn Ile 725 730 735Asp Leu
Thr Arg Val Arg Glu Asn Gly Asp Met Ser Leu 740 745
130 742 PRT Arabidopsis thaliana
130 Met Ser Asp Ala Leu Ser Ala Ile
Pro Ala Ala Val His Arg Asn Leu1 5 10 15Ser Asp Lys Leu Tyr Glu Lys
Arg Lys Asn Ala Ala Leu Glu Leu Glu 20 25 30Asn Ile Val Lys Asn Leu
Thr Ser Ser Gly Asp His Asp Lys Ile Ser 35 40 45Lys Val Ile Glu Met
Leu Ile Lys Glu Phe Ala Lys Ser Pro Gln Ala 50 55 60Asn His Arg Lys
Gly Gly Leu Ile Gly Leu Ala Ala Val Thr Val Gly65 70 75 80Leu Ser
Thr Glu Ala Ala Gln Tyr Leu Glu Gln Ile Val Pro Pro Val 85 90 95Ile
Asn Ser Phe Ser Asp Gln Asp Ser Arg Val Arg Tyr Tyr Ala Cys 100 105
110Glu Ala Leu Tyr Asn Ile Ala Lys Val Val Arg Gly Asp Phe Ile Ile
115 120 125Phe Phe Asn Lys Ile Phe Asp Ala Leu Cys Lys Leu Ser Ala
Asp Ser 130 135 140Asp Ala Asn Val Gln Ser Ala Ala His Leu Leu Asp
Arg Leu Val Lys145 150 155 160Asp Ile Val Thr Glu Ser Asp Gln Phe
Ser Ile Glu Glu Phe Ile Pro 165 170 175Leu Leu Lys Glu Arg Met Asn
Val Leu Asn Pro Tyr Val Arg Gln Phe 180 185 190Leu Val Gly Trp Ile
Thr Val Leu Asp Ser Val Pro Asp Ile Asp Met 195 200 205Leu Gly Phe
Leu Pro Asp Phe Leu Asp Gly Leu Phe Asn Met Leu Ser 210 215 220Asp
Ser Ser His Glu Ile Arg Gln Gln Ala Asp Ser Ala Leu Ser Glu225 230
235 240Phe Leu Gln Glu Ile Lys Asn Ser Pro Ser Val Asp Tyr Gly Arg
Met 245 250 255Ala Glu Ile Leu Val Gln Arg Ala Ala Ser Pro Asp Glu
Phe Thr Arg 260 265 270Leu Thr Ala Ile Thr Trp Ile Asn Glu Phe Val
Lys Leu Gly Gly Asp 275 280 285Gln Leu Val Arg Tyr Tyr Ala Asp Ile
Leu Gly Ala Ile Leu Pro Cys 290 295 300Ile Ser Asp Lys Glu Glu Lys
Ile Arg Val Val Ala Arg Glu Thr Asn305 310 315 320Glu Glu Leu Arg
Ser Ile His Val Glu Pro Ser Asp Gly Phe Asp Val 325 330 335Gly Ala
Ile Leu Ser Val Ala Arg Arg Gln Leu Ser Ser Glu Phe Glu 340 345
350Ala Thr Arg Ile Glu Ala Leu Asn Trp Ile Ser Thr Leu Leu Asn Lys
355 360 365His Arg Thr Glu Val Leu Cys Phe Leu Asn Asp Ile Phe Asp
Thr Leu 370 375 380Leu Lys Ala Leu Ser Asp Ser Ser Asp Asp Val Val
Leu Leu Val Leu385 390 395 400Glu Val His Ala Gly Val Ala Lys Asp
Pro Gln His Phe Arg Gln Leu 405 410 415Ile Val Phe Leu Val His Asn
Phe Arg Ala Asp Asn Ser Leu Leu Glu 420 425 430Arg Gly Ala Leu Ile
Val Arg Arg Met Cys Val Leu Leu Asp Ala Glu 435 440 445Arg Val Tyr
Arg Glu Leu Ser Thr Ile Leu Glu Gly Glu Asp Asn Leu 450 455 460Asp
Phe Ala Ser Thr Met Val Gln Ala Leu Asn Leu Ile Leu Leu Thr465 470
475 480Ser Pro Glu Leu Ser Lys Leu Arg Glu Leu Leu Lys Gly Ser Leu
Val 485 490
495Asn Arg Glu Gly Lys Glu Leu Phe Val Ala Leu Tyr Thr Ser Trp Cys
500 505 510His Ser Pro Met Ala Ile Ile Ser Leu Cys Leu Leu Ala Gln
Ala Tyr 515 520 525Gln His Ala Ser Val Val Ile Gln Ser Leu Val Glu
Glu Asp Ile Asn 530 535 540Val Lys Phe Leu Val Gln Leu Asp Lys Leu
Ile Arg Leu Leu Glu Thr545 550 555 560Pro Ile Phe Thr Tyr Leu Arg
Leu Gln Leu Leu Glu Pro Gly Arg Tyr 565 570 575Thr Trp Leu Leu Lys
Thr Leu Tyr Gly Leu Leu Met Leu Leu Pro Gln 580 585 590Gln Ser Ala
Ala Phe Lys Ile Leu Arg Thr Arg Leu Lys Thr Val Pro 595 600 605Thr
Tyr Ser Phe Ser Thr Gly Asn Gln Ile Gly Arg Ala Thr Ser Gly 610 615
620Val Pro Phe Ser Gln Tyr Lys His Gln Asn Glu Asp Gly Asp Leu
Glu625 630 635 640Asp Asp Asn Ile Asn Ser Ser His Gln Gly Ile Asn
Phe Ala Val Arg 645 650 655Leu Gln Gln Phe Glu Asn Val Gln Asn Leu
His Arg Gly Gln Ala Arg 660 665 670Thr Arg Val Asn Tyr Ser Tyr His
Ser Ser Ser Ser Ser Thr Ser Lys 675 680 685Glu Val Arg Arg Ser Glu
Glu Gln Gln Gln Gln Gln Gln Gln Gln Gln 690 695 700Gln Gln Gln Gln
Gln Gln Gln Arg Pro Pro Pro Ser Ser Thr Ser Ser705 710 715 720Ser
Val Ala Asp Asn Asn Arg Pro Pro Ser Arg Thr Ser Arg Lys Gly 725 730
735Pro Gly Gln Leu Gln Leu 740
131 911 PRT Arabidopsis thaliana
131 Met Ser Leu Leu Phe Leu Asn Pro Pro Phe Pro Ser Asn Ser Ile His1 5 10
15Pro Ile Pro Arg Arg Ala Ala Gly Ile Ser Ser Ile Arg Cys Ser Ile
20 25 30Ser Ala Pro Glu Lys Lys Pro Arg Arg Arg Arg Lys Gln Lys Arg
Gly 35 40 45Asp Gly Ala Glu Asn Asp Asp Ser Leu Ser Phe Gly Ser Gly
Glu Ala 50 55 60Val Ser Ala Leu Glu Arg Ser Leu Arg Leu Thr Phe Met
Asp Glu Leu65 70 75 80Met Glu Arg Ala Arg Asn Arg Asp Thr Ser Gly
Val Ser Glu Val Ile 85 90 95Tyr Asp Met Ile Ala Ala Gly Leu Ser Pro
Gly Pro Arg Ser Phe His 100 105 110Gly Leu Val Val Ala His Ala Leu
Asn Gly Asp Glu Gln Gly Ala Met 115 120 125His Ser Leu Arg Lys Glu
Leu Gly Ala Gly Gln Arg Pro Leu Pro Glu 130 135 140Thr Met Ile Ala
Leu Val Arg Leu Ser Gly Ser Lys Gly Asn Ala Thr145 150 155 160Arg
Gly Leu Glu Ile Leu Ala Ala Met Glu Lys Leu Lys Tyr Asp Ile 165 170
175Arg Gln Ala Trp Leu Ile Leu Val Glu Glu Leu Met Arg Ile Asn His
180 185 190Leu Glu Asp Ala Asn Lys Val Phe Leu Lys Gly Ala Arg Gly
Gly Met 195 200 205Arg Ala Thr Asp Gln Leu Tyr Asp Leu Met Ile Glu
Glu Asp Cys Lys 210 215 220Ala Gly Asp His Ser Asn Ala Leu Asp Ile
Ser Tyr Glu Met Glu Ala225 230 235 240Ala Gly Arg Met Ala Thr Thr
Phe His Phe Asn Cys Leu Leu Ser Val 245 250 255Gln Ala Thr Cys Gly
Ile Pro Glu Val Ala Tyr Ala Thr Phe Glu Asn 260 265 270Met Glu Tyr
Gly Glu Gly Leu Phe Met Lys Pro Asp Thr Glu Thr Tyr 275 280 285Asn
Trp Val Ile Gln Ala Tyr Thr Arg Ala Glu Ser Tyr Asp Arg Val 290 295
300Gln Asp Val Ala Glu Leu Leu Gly Met Met Val Glu Asp His Lys
Arg305 310 315 320Val Gln Pro Asn Val Lys Thr Tyr Ala Leu Leu Val
Glu Cys Phe Thr 325 330 335Lys Tyr Cys Val Val Lys Glu Ala Ile Arg
His Phe Arg Ala Leu Lys 340 345 350Asn Phe Glu Gly Gly Thr Val Ile
Leu His Asn Ala Gly Asn Phe Glu 355 360 365Asp Pro Leu Ser Leu Tyr
Leu Arg Ala Leu Cys Arg Glu Gly Arg Ile 370 375 380Val Glu Leu Ile
Asp Ala Leu Asp Ala Met Arg Lys Asp Asn Gln Pro385 390 395 400Ile
Pro Pro Arg Ala Met Ile Met Ser Arg Lys Tyr Arg Thr Leu Val 405 410
415Ser Ser Trp Ile Glu Pro Leu Gln Glu Glu Ala Glu Leu Gly Tyr Glu
420 425 430Ile Asp Tyr Leu Ala Arg Tyr Ile Glu Glu Gly Gly Leu Thr
Gly Glu 435 440 445Arg Lys Arg Trp Val Pro Arg Arg Gly Lys Thr Pro
Leu Asp Pro Asp 450 455 460Ala Ser Gly Phe Ile Tyr Ser Asn Pro Ile
Glu Thr Ser Phe Lys Gln465 470 475 480Arg Cys Leu Glu Asp Trp Lys
Val His His Arg Lys Leu Leu Arg Thr 485 490 495Leu Gln Ser Glu Gly
Leu Pro Val Leu Gly Asp Ala Ser Glu Ser Asp 500 505 510Tyr Met Arg
Val Val Glu Arg Leu Arg Asn Ile Ile Lys Gly Pro Ala 515 520 525Leu
Asn Leu Leu Lys Pro Lys Ala Ala Ser Lys Met Val Val Ser Glu 530 535
540Leu Lys Glu Glu Leu Glu Ala Gln Gly Leu Pro Ile Asp Gly Thr
Arg545 550 555 560Asn Val Leu Tyr Gln Arg Val Gln Lys Ala Arg Arg
Ile Asn Lys Ser 565 570 575Arg Gly Arg Pro Leu Trp Val Pro Pro Ile
Glu Glu Glu Glu Glu Glu 580 585 590Val Asp Glu Glu Val Asp Asp Leu
Ile Cys Arg Ile Lys Leu His Glu 595 600 605Gly Asp Thr Glu Phe Trp
Lys Arg Arg Phe Leu Gly Glu Gly Leu Ile 610 615 620Glu Thr Ser Val
Glu Ser Lys Glu Thr Thr Glu Ser Val Val Thr Gly625 630 635 640Glu
Ser Glu Lys Ala Ile Glu Asp Ile Ser Lys Glu Ala Asp Asn Glu 645 650
655Glu Asp Asp Asp Glu Glu Glu Gln Glu Gly Asp Glu Asp Asp Asp Glu
660 665 670Asn Glu Glu Glu Glu Val Val Val Pro Glu Thr Glu Asn Arg
Ala Glu 675 680 685Gly Glu Asp Leu Val Lys Asn Lys Ala Ala Asp Ala
Lys Lys His Leu 690 695 700Gln Met Ile Gly Val Gln Leu Leu Lys Glu
Ser Asp Glu Ala Asn Arg705 710 715 720Thr Lys Lys Arg Gly Lys Arg
Ala Ser Arg Met Thr Leu Glu Asp Asp 725 730 735Ala Asp Glu Asp Trp
Phe Pro Glu Glu Pro Phe Glu Ala Phe Lys Glu 740 745 750Met Arg Glu
Arg Lys Val Phe Asp Val Ala Asp Met Tyr Thr Ile Ala 755 760 765Asp
Val Trp Gly Trp Thr Trp Glu Lys Asp Phe Lys Asn Lys Thr Pro 770 775
780Arg Lys Trp Ser Gln Glu Trp Glu Val Glu Leu Ala Ile Val Leu
Met785 790 795 800Thr Lys Val Ile Glu Leu Gly Gly Ile Pro Thr Ile
Gly Asp Cys Ala 805 810 815Val Ile Leu Arg Ala Ala Leu Arg Ala Pro
Met Pro Ser Ala Phe Leu 820 825 830Lys Ile Leu Gln Thr Thr His Ser
Leu Gly Tyr Ser Phe Gly Ser Pro 835 840 845Leu Tyr Asp Glu Ile Ile
Thr Leu Cys Leu Asp Leu Gly Glu Leu Asp 850 855 860Ala Ala Ile Ala
Ile Val Ala Asp Met Glu Thr Thr Gly Ile Thr Val865 870 875 880Pro
Asp Gln Thr Leu Asp Lys Val Ile Ser Ala Arg Gln Ser Asn Glu 885 890
Ser Pro Arg Ser Glu Pro Glu Glu Pro Ala Ser Thr Val Ser Ser 900 905 910
132 357 PRT Arabidopsis thaliana
132 Met Glu Gly Ser Ser Ser Ala
Ile Ala Arg Lys Thr Trp Glu Leu Glu1 5 10 15Asn Asn Ile Leu Pro Val
Glu Pro Thr Asp Ser Ala Ser Asp Ser Ile 20 25 30Phe His Tyr Asp Asp
Ala Ser Gln Ala Lys Ile Gln Gln Glu Lys Pro 35 40 45Trp Ala Ser Asp
Pro Asn Tyr Phe Lys Arg Val His Ile Ser Ala Leu 50 55 60Ala Leu Leu
Lys Met Val Val His Ala Arg Ser Gly Gly Thr Ile Glu65 70 75 80Ile
Met Gly Leu Met Gln Gly Lys Thr Glu Gly Asp Thr Ile Ile Val 85 90
95Met Asp Ala Phe Ala Leu Pro Val Glu Gly Thr Glu Thr Arg Val Asn
100 105 110Ala Gln Ser Asp Ala Tyr Glu Tyr Met Val Glu Tyr Ser Gln
Thr Ser 115 120 125Lys Leu Ala Gly Arg Leu Glu Asn Val Val Gly Trp
Tyr His Ser His 130 135 140Pro Gly Tyr Gly Cys Trp Leu Ser Gly Ile
Asp Val Ser Thr Gln Met145 150 155 160Leu Asn Gln Gln Tyr Gln Glu
Pro Phe Leu Ala Val Val Ile Asp Pro 165 170 175Thr Arg Thr Val Ser
Ala Gly Lys Val Glu Ile Gly Ala Phe Arg Thr 180 185 190Tyr Pro Glu
Gly His Lys Ile Ser Asp Asp His Val Ser Glu Tyr Gln 195 200 205Thr
Ile Pro Leu Asn Lys Ile Glu Asp Phe Gly Val His Cys Lys Gln 210 215
220Tyr Tyr Ser Leu Asp Ile Thr Tyr Phe Lys Ser Ser Leu Asp Ser
His225 230 235 240Leu Leu Asp Leu Leu Trp Asn Lys Tyr Trp Val Asn
Thr Leu Ser Ser 245 250 255Ser Pro Leu Leu Gly Asn Gly Asp Tyr Val
Ala Gly Gln Ile Ser Asp 260 265 270Leu Ala Glu Lys Leu Glu Gln Ala
Glu Ser Gln Leu Ala Asn Ser Arg 275 280 285Tyr Gly Gly Ile Ala Pro
Ala Gly His Gln Arg Arg Lys Glu Asp Glu 290 295 300Pro Gln Leu Ala
Lys Ile Thr Arg Asp Ser Ala Lys Ile Thr Val Glu305 310 315 320Gln
Val His Gly Leu Met Ser Gln Val Ile Lys Asp Ile Leu Phe Asn 325 330
335Ser Ala Arg Gln Ser Lys Lys Ser Ala Asp Asp Ser Ser Asp Pro Glu
340 345 350Pro Met Ile Thr Ser 35513357DNAartificial sequenceprimer
133ggggacaagt ttgtacaaaa aagcaggctt cacaatggtt agatcagatg aaaatag
5713454DNAartificial sequenceprimer 134ggggaccact ttgtacaaga
aagctgggtt cttattaata ttaaatcaga aacc 5413552DNAartificial
sequenceprimer 135ggggacaagt ttgtacaaaa aagcaggctt cacaatggta
aatccgggtc ac 5213651DNAartificial sequenceprimer 136ggggaccact
ttgtacaaga aagctgggtt ttctgtagtc agacctggat a 5113755DNAartificial
sequenceprimer 137ggggacaagt ttgtacaaaa aagcaggctt cacaatgggg
aaggaaaatg ctgtg 5513854DNAartificial sequenceprimer 138ggggaccact
ttgtacaaga aagctgggtc cttcagaata gcgtgtcaag tagc
5413954DNAartificial sequenceprimer 139ggggacaagt ttgtacaaaa
aagcaggctt cacaatgggg aagaagtgtg attt 5414050DNAartificial
sequenceprimer 140ggggaccact ttgtacaaga aagctgggtt gtgagttaaa
caacaaccgt 5014155DNAartificial sequenceprimer 141ggggacaagt
ttgtacaaaa aagcaggctt cacaatggtt aactcatgcg agaac
5514252DNAartificial sequenceprimer 142ggggaccact ttgtacaaga
aagctgggtt ggattaagaa tgatgagact ca 5214352DNAartificial
sequenceprimer 143ggggacaagt ttgtacaaaa aagcaggctt cacaatggcg
aataatcctc cg 5214451DNAartificial sequenceprimer 144ggggaccact
ttgtacaaga aagctgggtc actatcactc cccaacttct c 5114551DNAartificial
sequenceprimer 145ggggacaagt ttgtacaaaa aagcaggctt cacaatggag
ggttcgtcgt c 5114649DNAartificial sequenceprimer 146ggggaccact
ttgtacaaga aagctgggtc caaaagaaga gcaacttca 4914756DNAartificial
sequenceprimer 147ggggacaagt ttgtacaaaa aagcaggctt cacaatgtat
tgctcttctt cgatgc 5614852DNAartificial sequenceprimer 148ggggaccact
ttgtacaaga aagctgggtg cttggtgtca tcttgagaat ag 5214954DNAartificial
sequenceprimer 149ggggacaagt ttgtacaaaa aagcaggctt cacaatggca
aagatgcaat tatc 5415050DNAartificial sequenceprimer 150ggggaccact
ttgtacaaga aagctgggta accatctgat cacaagaaca 5015158DNAartificial
sequenceprimer 151ggggacaagt ttgtacaaaa aagcaggctt cacaatggct
atttcaaaag ctcttatc 5815248DNAartificial sequenceprimer
152ggggaccact ttgtacaaga aagctgggtg aggctagcgt agcactgg
4815354DNAartificial sequenceprimer 153ggggacaagt ttgtacaaaa
aagcaggctt cacaatgggg aagaagaaca agag 5415451DNAartificial
sequenceprimer 154ggggaccact ttgtacaaga aagctgggtg cttctttgac
tctttttatc g 5115555DNAartificial sequenceprimer 155ggggacaagt
ttgtacaaaa aagcaggctt cacaatggaa ttgcttgaca tgaac
5515653DNAartificial sequenceprimer 156ggggaccact ttgtacaaga
aagctgggtc aacattattc ttcttttctg gtc 5315754DNAartificial
sequenceprimer 157ggggacaagt ttgtacaaaa aagcaggctt cacaatggac
gagggagtta tagc 5415850DNAartificial sequenceprimer 158ggggaccact
ttgtacaaga aagctgggtc cttagagaga ggacttttct 5015955DNAartificial
sequenceprimer 159ggggacaagt ttgtacaaaa aagcaggctt cacaatggag
ttgtttgtca ctcca 5516049DNAartificial sequenceprimer 160ggggaccact
ttgtacaaga aagctgggtt cagcgagtat caatggatc 4916152DNAartificial
sequenceprimer 161ggggacaagt ttgtacaaaa aagcaggctt cacaatgcaa
ccgacagaga cg 5216249DNAartificial sequenceprimer 162ggggaccact
ttgtacaaga aagctgggtg ctcgtccaac actaaggtt 4916355DNAartificial
sequenceprimer 163ggggacaagt ttgtacaaaa aagcaggctt cacaatgaat
agggaaaagt tgatg 5516447DNAartificial sequenceprimer 164ggggaccact
ttgtacaaga aagctgggtc ctctaagaag cagcagc 4716552DNAartificial
sequenceprimer 165ggggacaagt ttgtacaaaa aagcaggctt cacaatggag
gacgacgacg ag 5216650DNAartificial sequenceprimer 166ggggaccact
ttgtacaaga aagctgggtt gtcagctact tacattgccg 5016751DNAartificial
sequenceprimer 167ggggacaagt ttgtacaaaa aagcaggctt cacaatggcc
accgtatctt c 5116848DNAartificial sequenceprimer 168ggggaccact
ttgtacaaga aagctgggtg attagaaaac tgaaggcg 4816956DNAartificial
sequenceprimer 169ggggacaagt ttgtacaaaa aagcaggctt cacaatggat
gtaggagtta ctacgg 5617047DNAartificial sequenceprimer 170ggggaccact
ttgtacaaga aagctgggtc taagccgagg cattgat 4717156DNAartificial
sequenceprimer 171ggggacaagt ttgtacaaaa aagcaggctt cacaatggat
ggtcatgatt ctaagg 5617249DNAartificial sequenceprimer 172ggggaccact
ttgtacaaga aagctgggtt taagaggaac tagccggtg 4917352DNAartificial
sequenceprimer 173ggggacaagt ttgtacaaaa aagcaggctt cacaatggaa
atctacacca tg 5217451DNAartificial sequenceprimer 174ggggaccact
ttgtacaaga aagctgggta actaaagaga actaagaaac t 5117553DNAartificial
sequenceprimer 175ggggacaagt ttgtacaaaa aagcaggctt cacaatggag
tttggatctt ttc 5317647DNAartificial sequenceprimer 176ggggaccact
ttgtacaaga aagctgggtc tctcaagctt taaacgc 4717754DNAartificial
sequenceprimer 177ggggacaagt ttgtacaaaa aagcaggctt cacaatggcg
gagcagaaga gtac 5417854DNAartificial sequenceprimer 178ggggaccact
ttgtacaaga aagctgggtc ctatcatgtc gattttgtat cttt
5417951DNAartificial sequenceprimer 179ggggacaagt ttgtacaaaa
aagcaggctt cacaatggcg aatccttggt g 5118048DNAartificial
sequenceprimer 180ggggaccact ttgtacaaga aagctgggtt caatacgaag
gaggagca 4818153DNAartificial sequenceprimer 181ggggacaagt
ttgtacaaaa aagcaggctt cacaatggct ctcaatctcc gtc
5318254DNAartificial sequenceprimer 182ggggaccact ttgtacaaga
aagctgggtg gattagagag tcatatgttt gatg 5418352DNAartificial
sequenceprimer 183ggggacaagt ttgtacaaaa aagcaggctt cacaatgctg
atgctgtgtg gg 5218452DNAartificial sequenceprimer 184ggggaccact
ttgtacaaga aagctgggtt ttcaacaatg ttcaacaaca ct 5218554DNAartificial
sequenceprimer 185ggggacaagt ttgtacaaaa aagcaggctt cacaatggtg
aagttgatga tacg 5418649DNAartificial sequenceprimer 186ggggaccact
ttgtacaaga aagctgggtt ttagtgcaac caaagagtc 4918756DNAartificial
sequenceprimer 187ggggacaagt ttgtacaaaa aagcaggctt cacaatggag
aaacagagta ctcaac 5618851DNAartificial sequenceprimer 188ggggaccact
ttgtacaaga aagctgggtt tatgaaggtc tctttgtagc t 5118958DNAartificial
sequenceprimer 189ggggacaagt ttgtacaaaa aagcaggctt cacaatgaca
actactgggt ctaattct 5819047DNAartificial sequenceprimer
190ggggaccact ttgtacaaga aagctgggtt caattctccg gcttcat
4719152DNAartificial sequenceprimer 191ggggacaagt ttgtacaaaa
aagcaggctt cacaatggcg ctgcagaaca tt 5219251DNAartificial
sequenceprimer 192ggggaccact ttgtacaaga aagctgggtg caaagaaaag
ttaggaggga a 5119353DNAartificial sequenceprimer 193ggggacaagt
ttgtacaaaa aagcaggctt cacaatggcg gctaacaaat tcg
5319449DNAartificial sequenceprimer 194ggggaccact ttgtacaaga
aagctgggtg tcgttgttcc ttgcctcac 4919557DNAartificial sequenceprimer
195ggggacaagt ttgtacaaaa aagcaggctt cacaatgtca ctcttgttcc tcaatcc
5719654DNAartificial sequenceprimer 196ggggaccact ttgtacaaga
aagctgggtc cttctaccga tttctgtttc ttat
5419753DNAartificial sequenceprimer 197ggggacaagt ttgtacaaaa
aagcaggctt cacaatggaa ggttcctcgt cag 5319848DNAartificial
sequenceprimer 198ggggaccact ttgtacaaga aagctgggtt cacgatgtaa
tcatgggc 4819912PRTartificial sequencemotif 199Glu Glu Thr Ala Arg
Phe Gln Pro Gly Tyr Arg Ser1 5 1020010PRTartificial sequencemotif
200Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu1 5 102017PRTartificial
sequencemotif 201Asp Tyr Lys Asp Asp Asp Lys1 52029PRTartificial
sequencemotif 202Tyr Pro Tyr Asp Val Pro Asp Tyr Ala1
520312PRTartificial sequencemotif 203Glu Asp Gln Val Asp Pro Arg
Leu Ile Asp Gly Lys1 5 1020411PRTartificial sequencemotif 204Tyr
Thr Asp Ile Glu Met Asn Arg Leu Gly Lys1 5 10205131PRTArabidopsis
thaliana 205Asn Arg Ile Leu Trp Lys Gly Val Asp Ala Cys Pro Gly Asp
Glu Asp1 5 10 15Ala Asp Val Ser Val Leu Gln Leu Gln Ala Glu Ile Glu
Asn Leu Ala 20 25 30Leu Glu Glu Gln Ala Leu Asp Asn Gln Ile Arg Gln
Thr Glu Glu Arg 35 40 45Leu Arg Asp Leu Ser Glu Asn Glu Lys Asn Gln
Lys Trp Leu Phe Val 50 55 60Thr Glu Glu Asp Ile Lys Ser Leu Pro Gly
Phe Gln Asn Gln Thr Leu65 70 75 80Ile Ala Val Lys Ala Pro His Gly
Thr Thr Leu Glu Val Pro Asp Pro 85 90 95Asp Glu Ala Ala Asp His Pro
Gln Arg Arg Tyr Arg Ile Ile Leu Arg 100 105 110Ser Thr Met Gly Pro
Ile Asp Val Tyr Leu Val Ser Glu Phe Glu Gly 115 120 125Lys Phe Glu
130206385PRTArabidopsis thaliana 206Met Ser Glu Glu Val Pro Gln Gln
Phe Pro Ser Ser Lys Arg Gln Leu1 5 10 15His Pro Ser Leu Ser Ser Met
Lys Pro Pro Leu Val Ala Pro Gly Glu 20 25 30Tyr His Arg Phe Asp Ala
Ala Glu Thr Arg Gly Gly Gly Ala Val Ala 35 40 45Asp Gln Val Val Ser
Asp Ala Ile Val Ile Lys Ser Thr Leu Lys Arg 50 55 60Lys Thr Asp Leu
Val Asn Gln Ile Val Glu Val Asn Glu Leu Asn Thr65 70 75 80Gly Val
Leu Gln Thr Pro Val Ser Gly Lys Gly Gly Lys Ala Lys Lys 85 90 95Thr
Ser Arg Ser Ala Lys Ser Asn Lys Ser Gly Thr Leu Ala Ser Gly 100 105
110Ser Asn Ala Gly Ser Pro Gly Asn Asn Phe Ala Gln Ala Gly Thr Cys
115 120 125Arg Tyr Asp Ser Ser Leu Gly Leu Leu Thr Lys Lys Phe Ile
Asn Leu 130 135 140Ile Lys Gln Ala Glu Asp Gly Ile Leu Asp Leu Asn
Lys Ala Ala Asp145 150 155 160Thr Leu Glu Val Gln Lys Arg Arg Ile
Tyr Asp Ile Thr Asn Val Leu 165 170 175Glu Gly Ile Gly Leu Ile Glu
Lys Thr Leu Lys Asn Arg Ile Gln Trp 180 185 190Lys Gly Leu Asp Val
Ser Lys Pro Gly Glu Thr Ile Glu Ser Ile Ala 195 200 205Asn Leu Gln
Asp Glu Val Gln Asn Leu Ala Ala Glu Glu Ala Arg Leu 210 215 220Asp
Asp Gln Ile Arg Glu Ser Gln Glu Arg Leu Thr Ser Leu Ser Glu225 230
235 240Asp Glu Asn Asn Lys Arg Leu Leu Phe Val Thr Glu Asn Asp Ile
Lys 245 250 255Asn Leu Pro Cys Phe Gln Asn Lys Thr Leu Ile Ala Val
Lys Ala Pro 260 265 270His Gly Thr Thr Leu Glu Val Pro Asp Pro Asp
Glu Ala Gly Gly Tyr 275 280 285Gln Arg Arg Tyr Arg Ile Ile Leu Arg
Ser Thr Met Gly Pro Ile Asp 290 295 300Val Tyr Leu Val Ser Gln Phe
Glu Glu Ser Phe Glu Asp Ile Pro Gln305 310 315 320Ala Asp Glu Pro
Ser Asn Val Pro Asp Glu Pro Ser Asn Val Pro Asp 325 330 335Glu Pro
Ser Asn Leu Pro Ser Thr Ser Gly Leu Pro Glu Asn His Asp 340 345
350Val Ser Met Pro Met Lys Glu Glu Ser Thr Glu Arg Asn Met Glu Thr
355 360 365Gln Glu Val Asp Asp Thr Gln Arg Val Tyr Ser Asp Ile Glu
Ser His 370 375 380Asp385207127PRTArabidopsis thaliana 207Met Ser
Glu Glu Val Pro Gln Gln Phe Pro Ser Ser Lys Arg Gln Leu1 5 10 15His
Pro Ser Leu Ser Ser Met Lys Pro Pro Leu Val Ala Pro Gly Glu 20 25
30Tyr His Arg Phe Asp Ala Ala Glu Thr Arg Gly Gly Gly Ala Val Ala
35 40 45Asp Gln Val Val Ser Asp Ala Ile Val Ile Lys Ser Thr Leu Lys
Arg 50 55 60Lys Thr Asp Leu Val Asn Gln Ile Val Glu Val Asn Glu Leu
Asn Thr65 70 75 80Gly Val Leu Gln Thr Pro Val Ser Gly Lys Gly Gly
Lys Ala Lys Lys 85 90 95Thr Ser Arg Ser Ala Lys Ser Asn Lys Ser Gly
Thr Leu Ala Ser Gly 100 105 110Ser Asn Ala Gly Ser Pro Gly Asn Asn
Phe Ala Gln Ala Gly Thr 115 120 125208142PRTArabidopsis thaliana
208Met Ser Met Glu Met Glu Leu Phe Val Thr Pro Glu Lys Gln Arg Gln1
5 10 15His Pro Ser Val Ser Val Glu Lys Thr Pro Val Arg Arg Lys Leu
Ile 20 25 30Val Asp Asp Asp Ser Glu Ile Gly Ser Glu Lys Lys Gly Gln
Ser Arg 35 40 45Thr Ser Gly Gly Gly Leu Arg Gln Phe Ser Val Met Val
Cys Gln Lys 50 55 60Leu Glu Ala Lys Lys Ile Thr Thr Tyr Lys Glu Val
Ala Asp Glu Ile65 70 75 80Ile Ser Asp Phe Ala Thr Ile Lys Gln Asn
Ala Glu Lys Pro Leu Asn 85 90 95Glu Asn Glu Tyr Asn Glu Lys Asn Ile
Arg Arg Arg Val Tyr Asp Ala 100 105 110Leu Asn Val Phe Met Ala Leu
Asp Ile Ile Ala Arg Asp Lys Lys Glu 115 120 125Ile Arg Trp Lys Gly
Leu Pro Ile Thr Cys Lys Lys Asp Val 130 135 140209101PRTArabidopsis
thaliana 209Glu Lys Lys Gly Gln Ser Arg Thr Ser Gly Gly Gly Leu Arg
Gln Phe1 5 10 15Ser Val Met Val Cys Gln Lys Leu Glu Ala Lys Lys Ile
Thr Thr Tyr 20 25 30Lys Glu Val Ala Asp Glu Ile Ile Ser Asp Phe Ala
Thr Ile Lys Gln 35 40 45Asn Ala Glu Lys Pro Leu Asn Glu Asn Glu Tyr
Asn Glu Lys Asn Ile 50 55 60Arg Arg Arg Val Tyr Asp Ala Leu Asn Val
Phe Met Ala Leu Asp Ile65 70 75 80Ile Ala Arg Asp Lys Lys Glu Ile
Arg Trp Lys Gly Leu Pro Ile Thr 85 90 95Cys Lys Lys Asp Val
100210251PRTArabidopsis thaliana 210Glu Lys Lys Gly Gln Ser Arg Thr
Ser Gly Gly Gly Leu Arg Gln Phe1 5 10 15Ser Val Met Val Cys Gln Lys
Leu Glu Ala Lys Lys Ile Thr Thr Tyr 20 25 30Lys Glu Val Ala Asp Glu
Ile Ile Ser Asp Phe Ala Thr Ile Lys Gln 35 40 45Asn Ala Glu Lys Pro
Leu Asn Glu Asn Glu Tyr Asn Glu Lys Asn Ile 50 55 60Arg Arg Arg Val
Tyr Asp Ala Leu Asn Val Phe Met Ala Leu Asp Ile65 70 75 80Ile Ala
Arg Asp Lys Lys Glu Ile Arg Trp Lys Gly Leu Pro Ile Thr 85 90 95Cys
Lys Lys Asp Val Glu Glu Val Lys Met Asp Arg Asn Lys Val Met 100 105
110Ser Ser Val Gln Lys Lys Ala Ala Phe Leu Lys Glu Leu Arg Glu Lys
115 120 125Val Ser Ser Leu Glu Ser Leu Met Ser Arg Asn Gln Glu Met
Val Val 130 135 140Lys Thr Gln Gly Pro Ala Glu Gly Phe Thr Leu Pro
Phe Ile Leu Leu145 150 155 160Glu Thr Asn Pro His Ala Val Val Glu
Ile Glu Ile Ser Glu Asp Met 165 170 175Gln Leu Val His Leu Asp Phe
Asn Ser Thr Pro Phe Ser Val His Asp 180 185 190Asp Ala Tyr Ile Leu
Lys Leu Met Gln Glu Gln Lys Gln Glu Gln Asn 195 200 205Arg Val Ser
Ser Ser Ser Ser Thr His His Gln Ser Gln His Ser Ser 210 215 220Ala
His Ser Ser Ser Ser Ser Cys Ile Ala Ser Gly Thr Ser Gly Pro225 230
235 240Val Cys Trp Asn Ser Gly Ser Ile Asp Thr Arg 245
250211172PRTArabidopsis thaliana 211Ile Ile Ala Arg Asp Lys Lys Glu
Ile Arg Trp Lys Gly Leu Pro Ile1 5 10 15Thr Cys Lys Lys Asp Val Glu
Glu Val Lys Met Asp Arg Asn Lys Val 20 25 30Met Ser Ser Val Gln Lys
Lys Ala Ala Phe Leu Lys Glu Leu Arg Glu 35 40 45Lys Val Ser Ser Leu
Glu Ser Leu Met Ser Arg Asn Gln Glu Met Val 50 55 60Val Lys Thr Gln
Gly Pro Ala Glu Gly Phe Thr Leu Pro Phe Ile Leu65 70 75 80Leu Glu
Thr Asn Pro His Ala Val Val Glu Ile Glu Ile Ser Glu Asp 85 90 95Met
Gln Leu Val His Leu Asp Phe Asn Ser Thr Pro Phe Ser Val His 100 105
110Asp Asp Ala Tyr Ile Leu Lys Leu Met Gln Glu Gln Lys Gln Glu Gln
115 120 125Asn Arg Val Ser Ser Ser Ser Ser Thr His His Gln Ser Gln
His Ser 130 135 140Ser Ala His Ser Ser Ser Ser Ser Cys Ile Ala Ser
Gly Thr Ser Gly145 150 155 160Pro Val Cys Trp Asn Ser Gly Ser Ile
Asp Thr Arg 165 17021293PRTArabidopsis thaliana 212Ile Ile Ala Arg
Asp Lys Lys Glu Ile Arg Trp Lys Gly Leu Pro Ile1 5 10 15Thr Cys Lys
Lys Asp Val Glu Glu Val Lys Met Asp Arg Asn Lys Val 20 25 30Met Ser
Ser Val Gln Lys Lys Ala Ala Phe Leu Lys Glu Leu Arg Glu 35 40 45Lys
Val Ser Ser Leu Glu Ser Leu Met Ser Arg Asn Gln Glu Met Val 50 55
60Val Lys Thr Gln Gly Pro Ala Glu Gly Phe Thr Leu Pro Phe Ile Leu65
70 75 80Leu Glu Thr Asn Pro His Ala Val Val Glu Ile Glu Ile 85
90213121PRTArabidopsis thaliana 213Ser Leu Glu Ser Leu Met Ser Arg
Asn Gln Glu Met Val Val Lys Thr1 5 10 15Gln Gly Pro Ala Glu Gly Phe
Thr Leu Pro Phe Ile Leu Leu Glu Thr 20 25 30Asn Pro His Ala Val Val
Glu Ile Glu Ile Ser Glu Asp Met Gln Leu 35 40 45Val His Leu Asp Phe
Asn Ser Thr Pro Phe Ser Val His Asp Asp Ala 50 55 60Tyr Ile Leu Lys
Leu Met Gln Glu Gln Lys Gln Glu Gln Asn Arg Val65 70 75 80Ser Ser
Ser Ser Ser Thr His His Gln Ser Gln His Ser Ser Ala His 85 90 95Ser
Ser Ser Ser Ser Cys Ile Ala Ser Gly Thr Ser Gly Pro Val Cys 100 105
110Trp Asn Ser Gly Ser Ile Asp Thr Arg 115 120214193PRTArabidopsis
thaliana 214Met Thr Thr Thr Gly Ser Asn Ser Asn His Asn His His Glu
Ser Asn1 5 10 15Asn Asn Asn Asn Asn Pro Ser Thr Arg Ser Trp Gly Thr
Ala Val Ser 20 25 30Gly Gln Ser Val Ser Thr Ser Gly Ser Met Gly Ser
Pro Ser Ser Arg 35 40 45Ser Glu Gln Thr Ile Thr Val Val Thr Ser Thr
Ser Asp Thr Thr Phe 50 55 60Gln Arg Leu Asn Asn Leu Asp Ile Gln Gly
Asp Asp Ala Gly Ser Gln65 70 75 80Gly Ala Ser Gly Val Lys Lys Lys
Lys Arg Gly Gln Arg Ala Ala Gly 85 90 95Pro Asp Lys Thr Gly Arg Gly
Leu Arg Gln Phe Ser Met Lys Val Cys 100 105 110Glu Lys Val Glu Ser
Lys Gly Arg Thr Thr Tyr Asn Glu Val Ala Asp 115 120 125Glu Leu Val
Ala Glu Phe Ala Leu Pro Asn Asn Asp Gly Thr Ser Pro 130 135 140Asp
Gln Gln Gln Tyr Asp Glu Lys Asn Ile Arg Arg Arg Val Tyr Asp145 150
155 160Ala Leu Asn Val Leu Met Ala Met Asp Ile Ile Ser Lys Asp Lys
Lys 165 170 175Glu Ile Gln Trp Arg Gly Leu Pro Arg Thr Ser Leu Ser
Asp Ile Glu 180 185 190Glu21581PRTArabidopsis thaliana 215Gly Leu
Pro Arg Thr Ser Leu Ser Asp Ile Glu Glu Leu Lys Asn Glu1 5 10 15Arg
Leu Ser Leu Arg Asn Arg Ile Glu Lys Lys Thr Ala Tyr Ser Gln 20 25
30Glu Leu Glu Glu Gln Tyr Val Gly Leu Gln Asn Leu Ile Gln Arg Asn
35 40 45Glu His Leu Tyr Ser Ser Gly Asn Ala Pro Ser Gly Gly Val Ala
Leu 50 55 60Pro Phe Ile Leu Val Gln Thr Arg Pro His Ala Thr Val Glu
Val Glu65 70 75 80Ile216204PRTArabidopsis thaliana 216Gly Leu Pro
Arg Thr Ser Leu Ser Asp Ile Glu Glu Leu Lys Asn Glu1 5 10 15Arg Leu
Ser Leu Arg Asn Arg Ile Glu Lys Lys Thr Ala Tyr Ser Gln 20 25 30Glu
Leu Glu Glu Gln Tyr Val Gly Leu Gln Asn Leu Ile Gln Arg Asn 35 40
45Glu His Leu Tyr Ser Ser Gly Asn Ala Pro Ser Gly Gly Val Ala Leu
50 55 60Pro Phe Ile Leu Val Gln Thr Arg Pro His Ala Thr Val Glu Val
Glu65 70 75 80Ile Ser Glu Asp Met Gln Leu Val His Phe Asp Phe Asn
Ser Thr Pro 85 90 95Phe Glu Leu His Asp Asp Asn Phe Val Leu Lys Thr
Met Lys Phe Cys 100 105 110Asp Gln Pro Pro Gln Gln Pro Asn Gly Arg
Asn Asn Ser Gln Leu Val 115 120 125Cys His Asn Phe Thr Pro Glu Asn
Pro Asn Lys Gly Pro Ser Thr Gly 130 135 140Pro Thr Pro Gln Leu Asp
Met Tyr Glu Thr His Leu Gln Ser Gln Gln145 150 155 160His Gln Gln
His Ser Gln Leu Gln Ile Ile Pro Met Pro Glu Thr Asn 165 170 175Asn
Val Thr Ser Ser Ala Asp Thr Ala Pro Val Lys Ser Pro Ser Leu 180 185
190Pro Gly Ile Met Asn Ser Ser Met Lys Pro Glu Asn 195
200217420PRTArabidopsis thaliana 217Met Ser Gly Val Val Arg Ser Ser
Pro Gly Ser Ser Gln Pro Pro Pro1 5 10 15Pro Pro Pro His His Pro Pro
Ser Ser Pro Val Pro Val Thr Ser Thr 20 25 30Pro Val Ile Pro Pro Ile
Arg Arg His Leu Ala Phe Ala Ser Thr Lys 35 40 45Pro Pro Phe His Pro
Ser Asp Asp Tyr His Arg Phe Asn Pro Ser Ser 50 55 60Leu Ser Asn Asn
Asn Asp Arg Ser Phe Val His Gly Cys Gly Val Val65 70 75 80Asp Arg
Glu Glu Asp Ala Val Val Val Arg Ser Pro Ser Arg Lys Arg 85 90 95Lys
Ala Thr Met Asp Met Val Val Ala Pro Ser Asn Asn Gly Phe Thr 100 105
110Ser Ser Gly Phe Thr Asn Ile Pro Ser Ser Pro Cys Gln Thr Pro Arg
115 120 125Lys Gly Gly Arg Val Asn Ile Lys Ser Lys Ala Lys Gly Asn
Lys Ser 130 135 140Thr Pro Gln Thr Pro Ile Ser Thr Asn Ala Gly Ser
Pro Ile Thr Leu145 150 155 160Thr Pro Ser Gly Ser Cys Arg Tyr Asp
Ser Ser Leu Gly Leu Leu Thr 165 170 175Lys Lys Phe Val Asn Leu Ile
Lys Gln Ala Lys Asp Gly Met Leu Asp 180 185 190Leu Asn Lys Ala Ala
Glu Thr Leu Glu Val Gln Lys Arg Arg Ile Tyr 195 200 205Asp Ile Thr
Asn Val Leu Glu Gly Ile Asp Leu Ile Glu Lys Pro Phe 210 215 220Lys
Asn Arg Ile Leu Trp Lys Gly Val Asp Ala Cys Pro Gly Asp Glu225 230
235 240Asp Ala Asp Val Ser Val Leu Gln Leu Gln Ala Glu Ile Glu Asn
Leu 245 250 255Ala Leu Glu Glu Gln Ala Leu Asp Asn Gln Ile Arg Gln
Thr Glu Glu 260 265 270Arg Leu Arg Asp Leu Ser Glu Asn Glu Lys Asn
Gln Lys Trp Leu Phe 275 280 285Val Thr Glu Glu Asp Ile Lys Ser Leu
Pro Gly Phe Gln Asn Gln Thr 290 295 300Leu Ile Ala Val Lys Ala Pro
His Gly Thr Thr Leu Glu Val Pro Asp305 310 315 320Pro Asp Glu Ala Ala
Asp His Pro Gln Arg Arg Tyr Arg Ile Ile Leu 325 330 335Arg Ser Thr
Met Gly Pro Ile Asp Val Tyr Leu Val Ser Glu Phe Glu 340 345 350Gly
Lys Phe Glu Asp Thr Asn Gly Ser Gly Ala Ala Pro Pro Ala Cys 355 360
365Leu Pro Ile Ala Ser Ser Ser Gly Ser Thr Gly His His Asp Ile Glu
370 375 380Ala Leu Thr Val Asp Asn Pro Glu Thr Ala Ile Val Ser His
Asp His385 390 395 400Pro His Pro Gln Pro Gly Asp Thr Ser Asp Leu
Asn Tyr Leu Gln Glu 405 410 415Gln Val Gly Gly
420218324PRTArabidopsis thaliana 218Pro Ser Gly Ser Cys Arg Tyr Asp
Ser Ser Leu Gly Leu Leu Thr Lys1 5 10 15Lys Phe Val Asn Leu Ile Lys
Gln Ala Lys Asp Gly Met Leu Asp Leu 20 25 30Asn Lys Ala Ala Glu Thr
Leu Glu Val Gln Lys Arg Arg Ile Tyr Asp 35 40 45Ile Thr Asn Val Leu
Glu Gly Ile Asp Leu Ile Glu Lys Pro Phe Lys 50 55 60Asn Arg Ile Leu
Trp Lys Gly Val Asp Ala Cys Pro Gly Asp Glu Asp65 70 75 80Ala Asp
Val Ser Val Leu Gln Leu Gln Ala Glu Ile Glu Asn Leu Ala 85 90 95Leu
Glu Glu Gln Ala Leu Asp Asn Gln Ile Arg Gln Thr Glu Glu Arg 100 105
110Leu Arg Asp Leu Ser Glu Asn Glu Lys Asn Gln Lys Trp Leu Phe Val
115 120 125Thr Glu Glu Asp Ile Lys Ser Leu Pro Gly Phe Gln Asn Gln
Thr Leu 130 135 140Ile Ala Val Lys Ala Pro His Gly Thr Thr Leu Glu
Val Pro Asp Pro145 150 155 160Asp Glu Ala Ala Asp His Pro Gln Arg
Arg Tyr Arg Ile Ile Leu Arg 165 170 175Ser Thr Met Gly Pro Ile Asp
Val Tyr Leu Val Ser Glu Phe Glu Gly 180 185 190Lys Phe Glu Asp Thr
Asn Gly Ser Gly Ala Ala Pro Pro Ala Cys Leu 195 200 205Pro Ile Ala
Ser Ser Ser Gly Ser Thr Gly His His Asp Ile Glu Ala 210 215 220Leu
Thr Val Asp Asn Pro Glu Thr Ala Ile Val Ser His Asp His Pro225 230
235 240His Pro Gln Pro Gly Asp Thr Ser Asp Leu Asn Tyr Leu Gln Glu
Gln 245 250 255Val Gly Gly Met Leu Lys Ile Thr Pro Ser Asp Val Glu
Asn Asp Glu 260 265 270Ser Asp Tyr Trp Leu Leu Ser Asn Ala Glu Ile
Ser Met Thr Asp Ile 275 280 285Trp Lys Thr Asp Ser Gly Ile Asp Trp
Asp Tyr Gly Ile Ala Asp Val 290 295 300Ser Thr Pro Pro Pro Gly Met
Gly Glu Ile Ala Pro Thr Ala Val Asp305 310 315 320Ser Thr Pro
Arg21938PRTArabidopsis thaliana 219Met Ser Gly Val Val Arg Ser Ser
Pro Gly Ser Ser Gln Pro Pro Pro1 5 10 15Pro Pro Pro His His Pro Pro
Ser Ser Pro Val Pro Val Thr Ser Thr 20 25 30Pro Val Ile Pro Pro Ile
35220142PRTArabidopsis thaliana 220Met Ser Met Glu Met Glu Leu Phe
Val Thr Pro Glu Lys Gln Arg Gln1 5 10 15His Pro Ser Val Ser Val Glu
Lys Thr Pro Val Arg Arg Lys Leu Ile 20 25 30Val Asp Asp Asp Ser Glu
Ile Gly Ser Glu Lys Lys Gly Gln Ser Arg 35 40 45Thr Ser Gly Gly Gly
Leu Arg Gln Phe Ser Val Met Val Cys Gln Lys 50 55 60Leu Glu Ala Lys
Lys Ile Thr Thr Tyr Lys Glu Val Ala Asp Glu Ile65 70 75 80Ile Ser
Asp Phe Ala Thr Ile Lys Gln Asn Ala Glu Lys Pro Leu Asn 85 90 95Glu
Asn Glu Tyr Asn Glu Lys Asn Ile Arg Arg Arg Val Tyr Asp Ala 100 105
110Leu Asn Val Phe Met Ala Leu Asp Ile Ile Ala Arg Asp Lys Lys Glu
115 120 125Ile Arg Trp Lys Gly Leu Pro Ile Thr Cys Lys Lys Asp Val
130 135 140221150PRTArabidopsis thaliana 221Glu Glu Val Lys Met Asp
Arg Asn Lys Val Met Ser Ser Val Gln Lys1 5 10 15Lys Ala Ala Phe Leu
Lys Glu Leu Arg Glu Lys Val Ser Ser Leu Glu 20 25 30Ser Leu Met Ser
Arg Asn Gln Glu Met Val Val Lys Thr Gln Gly Pro 35 40 45Ala Glu Gly
Phe Thr Leu Pro Phe Ile Leu Leu Glu Thr Asn Pro His 50 55 60Ala Val
Val Glu Ile Glu Ile Ser Glu Asp Met Gln Leu Val His Leu65 70 75
80Asp Phe Asn Ser Thr Pro Phe Ser Val His Asp Asp Ala Tyr Ile Leu
85 90 95Lys Leu Met Gln Glu Gln Lys Gln Glu Gln Asn Arg Val Ser Ser
Ser 100 105 110Ser Ser Thr His His Gln Ser Gln His Ser Ser Ala His
Ser Ser Ser 115 120 125Ser Ser Cys Ile Ala Ser Gly Thr Ser Gly Pro
Val Cys Trp Asn Ser 130 135 140Gly Ser Ile Asp Thr Arg145
15022271PRTArabidopsis thaliana 222Glu Glu Val Lys Met Asp Arg Asn
Lys Val Met Ser Ser Val Gln Lys1 5 10 15Lys Ala Ala Phe Leu Lys Glu
Leu Arg Glu Lys Val Ser Ser Leu Glu 20 25 30Ser Leu Met Ser Arg Asn
Gln Glu Met Val Val Lys Thr Gln Gly Pro 35 40 45Ala Glu Gly Phe Thr
Leu Pro Phe Ile Leu Leu Glu Thr Asn Pro His 50 55 60Ala Val Val Glu
Ile Glu Ile65 70223262PRTArabidopsis thaliana 223Met Thr Thr Thr
Gly Ser Asn Ser Asn His Asn His His Glu Ser Asn1 5 10 15Asn Asn Asn
Asn Asn Pro Ser Thr Arg Ser Trp Gly Thr Ala Val Ser 20 25 30Gly Gln
Ser Val Ser Thr Ser Gly Ser Met Gly Ser Pro Ser Ser Arg 35 40 45Ser
Glu Gln Thr Ile Thr Val Val Thr Ser Thr Ser Asp Thr Thr Phe 50 55
60Gln Arg Leu Asn Asn Leu Asp Ile Gln Gly Asp Asp Ala Gly Ser Gln65
70 75 80Gly Ala Ser Gly Val Lys Lys Lys Lys Arg Gly Gln Arg Ala Ala
Gly 85 90 95Pro Asp Lys Thr Gly Arg Gly Leu Arg Gln Phe Ser Met Lys
Val Cys 100 105 110Glu Lys Val Glu Ser Lys Gly Arg Thr Thr Tyr Asn
Glu Val Ala Asp 115 120 125Glu Leu Val Ala Glu Phe Ala Leu Pro Asn
Asn Asp Gly Thr Ser Pro 130 135 140Asp Gln Gln Gln Tyr Asp Glu Lys
Asn Ile Arg Arg Arg Val Tyr Asp145 150 155 160Ala Leu Asn Val Leu
Met Ala Met Asp Ile Ile Ser Lys Asp Lys Lys 165 170 175Glu Ile Gln
Trp Arg Gly Leu Pro Arg Thr Ser Leu Ser Asp Ile Glu 180 185 190Glu
Leu Lys Asn Glu Arg Leu Ser Leu Arg Asn Arg Ile Glu Lys Lys 195 200
205Thr Ala Tyr Ser Gln Glu Leu Glu Glu Gln Tyr Val Gly Leu Gln Asn
210 215 220Leu Ile Gln Arg Asn Glu His Leu Tyr Ser Ser Gly Asn Ala
Pro Ser225 230 235 240Gly Gly Val Ala Leu Pro Phe Ile Leu Val Gln
Thr Arg Pro His Ala 245 250 255Thr Val Glu Val Glu Ile
26022451PRTArabidopsis thaliana 224Gly Val Asp Ala Cys Pro Gly Asp
Glu Asp Ala Asp Val Ser Val Leu1 5 10 15Gln Leu Gln Ala Glu Ile Glu
Asn Leu Ala Leu Glu Glu Gln Ala Leu 20 25 30Asp Asn Gln Ile Arg Gln
Thr Glu Glu Arg Leu Arg Asp Leu Ser Glu 35 40 45Asn Glu Lys
50225121PRTArabidopsis thaliana 225Gly Val Asp Ala Cys Pro Gly Asp
Glu Asp Ala Asp Val Ser Val Leu1 5 10 15Gln Leu Gln Ala Glu Ile Glu
Asn Leu Ala Leu Glu Glu Gln Ala Leu 20 25 30Asp Asn Gln Ile Arg Gln
Thr Glu Glu Arg Leu Arg Asp Leu Ser Glu 35 40 45Asn Glu Lys Asn Gln
Lys Trp Leu Phe Val Thr Glu Glu Asp Ile Lys 50 55 60Ser Leu Pro Gly
Phe Gln Asn Gln Thr Leu Ile Ala Val Lys Ala Pro65 70 75 80His Gly
Thr Thr Leu Glu Val Pro Asp Pro Asp Glu Ala Ala Asp His 85 90 95Pro
Gln Arg Arg Tyr Arg Ile Ile Leu Arg Ser Thr Met Gly Pro Ile 100 105
110Asp Val Tyr Leu Val Ser Glu Phe Glu 115 12022650PRTArabidopsis
thaliana 226Gly Leu Asp Val Ser Lys Pro Gly Glu Thr Ile Glu Ser Ile
Ala Asn1 5 10 15Leu Gln Asp Glu Val Gln Asn Leu Ala Ala Glu Glu Ala
Arg Leu Asp 20 25 30Asp Gln Ile Arg Glu Ser Gln Glu Arg Leu Thr Ser
Leu Ser Glu Asp 35 40 45Glu Asn 50227118PRTArabidopsis thaliana
227Gly Leu Asp Val Ser Lys Pro Gly Glu Thr Ile Glu Ser Ile Ala Asn1
5 10 15Leu Gln Asp Glu Val Gln Asn Leu Ala Ala Glu Glu Ala Arg Leu
Asp 20 25 30Asp Gln Ile Arg Glu Ser Gln Glu Arg Leu Thr Ser Leu Ser
Glu Asp 35 40 45Glu Asn Asn Lys Arg Leu Leu Phe Val Thr Glu Asn Asp
Ile Lys Asn 50 55 60Leu Pro Cys Phe Gln Asn Lys Thr Leu Ile Ala Val
Lys Ala Pro His65 70 75 80Gly Thr Thr Leu Glu Val Pro Asp Pro Asp
Glu Ala Gly Gly Tyr Gln 85 90 95Arg Arg Tyr Arg Ile Ile Leu Arg Ser
Thr Met Gly Pro Ile Asp Val 100 105 110Tyr Leu Val Ser Gln Phe
115228393DNAArabidopsis thaliana 228aatcgaatac tttggaaggg
agttgatgcg tgtcctggcg atgaggatgc tgacgtatct 60gtattacagc tgcaggcaga
aattgaaaac ctcgccctcg aagagcaagc attagacaac 120caaatcagac
aaacagagga aagattaaga gacctgagcg aaaatgaaaa gaatcagaaa
180tggctttttg taactgaaga ggatatcaag agtttaccag gtttccagaa
ccagactctg 240atagccgtca aagctcctca tggcacaact ttggaagtgc
ctgatccaga tgaagcggct 300gaccacccac aaaggagata caggatcatt
cttagaagta caatgggacc tattgacgta 360tacctcgtca gcgaatttga
agggaaattc gaa 393229516DNAArabidopsis thaliana 229attattgcaa
gggataaaaa ggaaatccgg tggaaaggac ttcctattac ctgcaaaaag 60gatgtggaag
aagtcaagat ggatcgtaat aaagttatga gcagtgtgca aaagaaggct
120gcttttctta aagagttgag agaaaaggtc tcaagtcttg agagtcttat
gtcgagaaat 180caagagatgg ttgtgaagac tcaaggccca gcagaaggat
ttaccttacc attcattcta 240cttgagacaa accctcacgc agtagtcgaa
atcgagattt ctgaagatat gcaacttgta 300cacctcgact tcaatagcac
acctttctcg gtccatgatg atgcttacat tttgaaactg 360atgcaagaac
agaagcaaga acagaacaga gtatcttctt cttcatctac acatcaccaa
420tctcaacata gctccgctca ttcttcatcc agttcttgca ttgcttctgg
aacctcaggc 480ccggtttgct ggaactcggg atccattgat actcgc
516230276DNAArabidopsis thaliana 230ggtcttcctc ggacaagctt
aagcgacatt gaagaattaa agaacgaacg actctcactt 60aggaacagaa ttgagaagaa
aactgcatat tcccaagaac tggaagaaca agtaatgaac 120atcatcgata
ctctcggctt atctgcttcc tgccttcaga atctgataca gagaaatgag
180cacttatata gctcaggaaa tgctcccagt ggcggtgttg ctcttccttt
tatccttgtc 240cagactcgtc ctcacgcaac agtagaagtg gagata
276231645DNAArabidopsis thaliana 231ggtcttcctc ggacaagctt
aagcgacatt gaagaattaa agaacgaacg actctcactt 60aggaacagaa ttgagaagaa
aactgcatat tcccaagaac tggaagaaca agtaatgaac 120atcatcgata
ctctcggctt atctgcttcc tgccttcaga atctgataca gagaaatgag
180cacttatata gctcaggaaa tgctcccagt ggcggtgttg ctcttccttt
tatccttgtc 240cagactcgtc ctcacgcaac agtagaagtg gagatatcag
aagatatgca gctcgtgcat 300tttgatttca acagcactcc atttgagctc
cacgacgaca attttgtcct caagactatg 360aagttttgtg atcaaccgcc
gcaacaacca aacggtcgga acaacagcca gctggtttgt 420cacaatttca
cgccagaaaa ccctaacaaa ggccccagca caggtccaac accgcagctg
480gatatgtacg agactcatct tcaatcgcaa caacatcagc agcattctca
gctacaaatc 540attcctatgc ctgagactaa caacgttact tccagcgctg
atactgctcc agtgaaatcc 600ccgtctcttc cagggataat gaactccagc
atgaagccgg agaat 645232450DNAArabidopsis thaliana 232gaagaagtca
agatggatcg taataaagtt atgagcagtg tgcaaaagaa ggctgctttt 60cttaaagagt
tgagagaaaa ggtctcaagt cttgagagtc ttatgtcgag aaatcaagag
120atggttgtga agactcaagg cccagcagaa ggatttacct taccattcat
tctacttgag 180acaaaccctc acgcagtagt cgaaatcgag atttctgaag
atatgcaact tgtacacctc 240gacttcaata gcacaccttt ctcggtccat
gatgatgctt acattttgaa actgatgcaa 300gaacagaagc aagaacagaa
cagagtatct tcttcttcat ctacacatca ccaatctcaa 360catagctccg
ctcattcttc atccagttct tgcattgctt ctggaacctc aggcccggtt
420tgctggaact cgggatccat tgatactcgc 450233213DNAArabidopsis
thaliana 233gaagaagtca agatggatcg taataaagtt atgagcagtg tgcaaaagaa
ggctgctttt 60cttaaagagt tgagagaaaa ggtctcaagt cttgagagtc ttatgtcgag
aaatcaagag 120atggttgtga agactcaagg cccagcagaa ggatttacct
taccattcat tctacttgag 180acaaaccctc acgcagtagt cgaaatcgag att
213234870DNAArabidopsis thaliana 234atgacaacta ctgggtctaa
ttctaatcac aaccaccatg aaagcaataa taacaacaat 60aaccctagta ctaggtcttg
gggcacggcg gtttcaggtc aatctgtgtc tactagcggc 120agtatgggct
ctccgtcgag ccggagtgag caaaccatca ccgttgttac atctactagc
180gacactactt ttcaacgcct gaataatttg gacattcaag gtgatgatgc
tggttctcaa 240ggagcttctg gtgttaagaa gaagaagagg ggacagcgtg
cggctggtcc agataagact 300ggaagaggac tacgtcaatt tagtatgaaa
ggtcttatct ctttctctgc ccctattatg 360ctttcatcta aatgcctttc
aatttgtgaa aaggtggaaa gcaaaggaag gacaacttac 420aatgaggttg
cagacgagct tgttgctgaa tttgcacttc caaataacga tggaacatcc
480cctgatcagc aacagtatga tgagaaaaac ataagacgaa gagtatatga
tgctttaaac 540gtcctcatgg ctatggatat aatatccaag gataaaaaag
aaattcaatg gagaggtctt 600cctcggacaa gcttaagcga cattgaagaa
ttaaagaacg aacgactctc acttaggaac 660agaattgaga agaaaactgc
atattcccaa gaactggaag aacaagtaat gaacatcatc 720
gatactctcg gcttatctgc ttcctgcctt cagaatctga tacagagaaa tgagcactta 780
tatagctcag gaaatgctcc cagtggcggt gttgctcttc cttttatcct tgtccagact 840
cgtcctcacg caacagtaga agtggagata 870

SEQ ID NO: 235 (length: 153; type: DNA; organism: Arabidopsis thaliana)
ggagttgatg cgtgtcctgg cgatgaggat gctgacgtat ctgtattaca gctgcaggca 60
gaaattgaaa acctcgccct cgaagagcaa gcattagaca accaaatcag acaaacagag 120
gaaagattaa gagacctgag cgaaaatgaa aag 153

SEQ ID NO: 236 (length: 363; type: DNA; organism: Arabidopsis thaliana)
ggagttgatg cgtgtcctgg cgatgaggat gctgacgtat ctgtattaca gctgcaggca 60
gaaattgaaa acctcgccct cgaagagcaa gcattagaca accaaatcag acaaacagag 120
gaaagattaa gagacctgag cgaaaatgaa aagaatcaga aatggctttt tgtaactgaa 180
gaggatatca agagtttacc aggtttccag aaccagactc tgatagccgt caaagctcct 240
catggcacaa ctttggaagt gcctgatcca gatgaagcgg ctgaccaccc acaaaggaga 300
tacaggatca ttcttagaag tacaatggga cctattgacg tatacctcgt cagcgaattt 360
gaa 363

SEQ ID NO: 237 (length: 150; type: DNA; organism: Arabidopsis thaliana)
ggtctcgatg tctcaaaacc aggagaaaca atcgaaagca tagctaacct acaggatgaa 60
gtacaaaacc tcgcagctga ggaggcaaga ttagatgacc aaatcagaga atcacaagaa 120
agattaacaa gcttgagtga ggatgaaaac 150

SEQ ID NO: 238 (length: 354; type: DNA; organism: Arabidopsis thaliana)
ggtctcgatg tctcaaaacc aggagaaaca atcgaaagca tagctaacct acaggatgaa 60
gtacaaaacc tcgcagctga ggaggcaaga ttagatgacc aaatcagaga atcacaagaa 120
agattaacaa gcttgagtga ggatgaaaac aacaaaaggt tactgttcgt cactgaaaac 180
gacattaaga acctaccatg cttccagaat aagacgctga tagctgtaaa ggcaccgcat 240
ggaacaactc ttgaggttcc agatcctgat gaggctggtg gttatcagag gaggtacaga 300
atcattctga gaagcacaat gggaccaata gacgtgtacc tagtcagtca attc 354

SEQ ID NO: 239 (length: 426; type: DNA; organism: Arabidopsis thaliana)
atgagtatgg agatggagtt gtttgtcact ccagagaagc agaggcaaca tccttcagtg 60
agcgttgaga aaactccagt gagaaggaaa ttgattgttg atgatgattc tgaaattgga 120
tcagagaaga aagggcaatc aagaacttct ggaggcgggc ttcgtcaatt cagtgttatg 180
gtttgtcaga agttggaagc caagaagata actacttaca aggaggttgc agacgaaatt 240
atttcagatt ttgccacaat taagcaaaac gcagagaagc ctttgaatga aaatgagtac 300
aatgagaaga acataaggcg gagagtctac gatgcgctca atgtgttcat ggcgttggat 360
attattgcaa gggataaaaa ggaaatccgg tggaaaggac ttcctattac ctgcaaaaag 420
gatgtg 426

SEQ ID NO: 240 (length: 7; type: PRT; organism: artificial sequence; description: motif)
Met Lys Val Cys Glu Lys Val
1               5

SEQ ID NO: 241 (length: 8; type: PRT; organism: artificial sequence; description: motif)
Leu Asn Val Leu Met Ala Met Asp
1               5

SEQ ID NO: 242 (length: 8; type: PRT; organism: artificial sequence; description: motif)
Phe Asn Ser Thr Pro Phe Glu Leu
1               5

SEQ ID NO: 243 (length: 30; type: DNA; organism: artificial sequence; description: primer)
atagaattca tgaaagtttg tgaaaaggtg 30

SEQ ID NO: 244 (length: 33; type: DNA; organism: artificial sequence; description: primer)
atagaattcc tgaatgttct catggcaatg gat 33

SEQ ID NO: 245 (length: 33; type: DNA; organism: artificial sequence; description: primer)
ataggatccc agctcaaaag gagtgctatt gaa 33

SEQ ID NO: 246 (length: 29; type: DNA; organism: artificial sequence; description: motif)
ggggacaagt ttgtacaaaa aagcaggct 29

SEQ ID NO: 247 (length: 5; type: DNA; organism: artificial sequence; description: motif)
tcaca 5

SEQ ID NO: 248 (length: 29; type: DNA; organism: artificial sequence; description: motif)
ggggaccact ttgtacaaga aagctgggt 29

SEQ ID NO: 249 (length: 27; type: DNA; organism: artificial sequence; description: primer)
atagaattca tgtccggtgt cgtacga 27

SEQ ID NO: 250 (length: 30; type: DNA; organism: artificial sequence; description: primer)
ataggatccc acctccaatg tttctgcagc 30

SEQ ID NO: 251 (length: 30; type: DNA; organism: artificial sequence; description: primer)
atagaattcg agaagaaagg gcaatcaaga 30

SEQ ID NO: 252 (length: 30; type: DNA; organism: artificial sequence; description: primer)
atactgcaga gaaatctcga tttcgactac 30

SEQ ID NO: 253 (length: 25; type: DNA; organism: artificial sequence; description: primer)
gccactctca tagggttctc catcg 25

SEQ ID NO: 254 (length: 25; type: DNA; organism: artificial sequence; description: primer)
ggcatgcctc caagatcctt gaagt 25

SEQ ID NO: 255 (length: 22; type: DNA; organism: artificial sequence; description: primer)
gggtcttggt cgttttactg tt 22

SEQ ID NO: 256 (length: 25; type: DNA; organism: artificial sequence; description: primer)
ccaagacgat gacaacagat acagc 25

SEQ ID NO: 257 (length: 21; type: DNA; organism: artificial sequence; description: primer)
ataaactaaa tcttcgctga a 21

SEQ ID NO: 258 (length: 21; type: DNA; organism: artificial sequence; description: primer)
caaacgcgga tctgaaaaac t 21

SEQ ID NO: 259 (length: 18; type: DNA; organism: artificial sequence; description: primer)
tctctcttcc aaatctcc 18

SEQ ID NO: 260 (length: 20; type: DNA; organism: artificial sequence; description: primer)
aagtctctca ctttctcact 20

SEQ ID NO: 261 (length: 25; type: DNA; organism: artificial sequence; description: primer)
ctaagctctc aagatcaaag gctta 25

SEQ ID NO: 262 (length: 25; type: DNA; organism: artificial sequence; description: primer)
ttaacattgc aaagagtttc aaggt 25

SEQ ID NO: 263 (length: 4; type: PRT; organism: artificial sequence; description: motif)
Thr Pro Trp Lys
1

SEQ ID NO: 264 (length: 289; type: PRT; organism: Arabidopsis thaliana)
Met Gly Lys Tyr Ile Arg Lys Ser Lys Ile Asp Gly Ala Gly Ala Gly
1               5                   10                  15
Ala Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Glu Ser Ser Ile Ala
            20                  25                  30
Leu Met Asp Val Val Ser Pro Ser Ser Ser Ser Ser Leu Gly Val Leu
        35                  40                  45
Thr Arg Ala Lys Ser Leu Ala Leu Gln Gln Gln Gln Gln Arg Cys Leu
    50                  55                  60
Leu Gln Lys Pro Ser Ser Pro Ser Ser Leu Pro Pro Thr Ser Ala Ser
65                  70                  75                  80
Pro Asn Pro Pro Ser Lys Gln Lys Met Lys Lys Lys Gln Gln Gln Met
                85                  90                  95
Asn Asp Cys Gly Ser Tyr Leu Gln Leu Arg Ser Arg Arg Leu Gln Lys
            100                 105                 110
Lys Pro Pro Ile Val Val Ile Arg Ser Thr Lys Arg Arg Lys Gln Gln
        115                 120                 125
Arg Arg Asn Glu Thr Cys Gly Arg Asn Pro Asn Pro Arg Ser Asn Leu
    130                 135                 140
Asp Ser Ile Arg Gly Asp Gly Ser Arg Ser Asp Ser Val Ser Glu Ser
145                 150                 155                 160
Val Val Phe Gly Lys Asp Lys Asp Leu Ile Ser Glu Ile Asn Lys Asp
                165                 170                 175
Pro Thr Phe Gly Gln Asn Phe Phe Asp Leu Glu Glu Glu His Thr Gln
            180                 185                 190
Ser Phe Asn Arg Thr Thr Arg Glu Ser Thr Pro Cys Ser Leu Ile Arg
        195                 200                 205
Arg Pro Glu Ile Met Thr Thr Pro Gly Ser Ser Thr Lys Leu Asn Ile
    210                 215                 220
Cys Val Ser Glu Ser Asn Gln Arg Glu Asp Ser Leu Ser Arg Ser His
225                 230                 235                 240
Arg Arg Arg Pro Thr Thr Pro Glu Met Asp Glu Phe Phe Ser Gly Ala
                245                 250                 255
Glu Glu Glu Gln Gln Lys Gln Phe Ile Glu Lys Tyr Asn Phe Asp Pro
            260                 265                 270
Val Asn Glu Gln Pro Leu Pro Gly Arg Phe Glu Trp Thr Lys Val Asp
        275                 280                 285
Asp

SEQ ID NO: 265 (length: 20; type: DNA; organism: artificial sequence; description: primer)
cgggccccaa ataatgattt 20

SEQ ID NO: 266 (length: 18; type: DNA; organism: artificial sequence; description: primer)
gacacgggcc agagctgc 18

SEQ ID NO: 267 (length: 10; type: PRT; organism: artificial sequence; description: motif)
Arg Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Asn
1               5                   10

SEQ ID NO: 268 (length: 8; type: PRT; organism: artificial sequence; description: motif)
Met Arg Xaa Ile Leu Xaa Asp Trp
1               5

SEQ ID NO: 269 (length: 8; type: PRT; organism: artificial sequence; description: motif)
Lys Tyr Glu Glu Xaa Xaa Xaa Pro
1               5

SEQ ID NO: 270 (length: 9; type: PRT; organism: artificial sequence; description: motif)
Gly Xaa Gly Xaa Xaa Gly Xaa Val Tyr
1               5

SEQ ID NO: 271 (length: 10; type: PRT; organism: artificial sequence; description: motif)
His Arg Asp Xaa Lys Xaa Xaa Asn Xaa Leu
1               5                   10

SEQ ID NO: 272 (length: 12; type: PRT; organism: artificial sequence; description: motif)
Asp Xaa Xaa Xaa Ser Xaa Gly Xaa Xaa Xaa Xaa Glu
1               5                   10

SEQ ID NO: 273 (length: 5; type: PRT; organism: artificial sequence; description: motif)
Thr Pro Xaa Xaa Xaa
1               5

SEQ ID NO: 274 (length: 4; type: PRT; organism: artificial sequence; description: motif)
Ser Pro Xaa Xaa
1

SEQ ID NO: 275 (length: 4; type: PRT; organism: artificial sequence; description: motif)
Ser Pro Xaa Xaa
1

SEQ ID NO: 276 (length: 4; type: PRT; organism: artificial sequence; description: motif)
Ser Pro Xaa Xaa
1

SEQ ID NO: 277 (length: 7; type: PRT; organism: artificial sequence; description: motif)
Pro Lys Lys Lys Arg Lys Val
1               5

SEQ ID NO: 278 (length: 16; type: PRT; organism: artificial sequence; description: motif)
Lys Arg Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Lys Lys Lys Lys
1               5                   10                  15

SEQ ID NO: 279 (length: 5; type: PRT; organism: artificial sequence; description: motif)
Lys Arg Pro Arg Pro
1               5

SEQ ID NO: 280 (length: 9; type: PRT; organism: artificial sequence; description: motif)
Pro Ala Ala Lys Arg Val Lys Leu Asp
1               5

SEQ ID NO: 281 (length: 4; type: PRT; organism: artificial sequence; description: motif)
Arg Xaa Xaa Phe
1

SEQ ID NO: 282 (length: 5; type: PRT; organism: artificial sequence; description: motif)
Leu Xaa Cys Xaa Glu
1               5

SEQ ID NO: 283 (length: 5; type: PRT; organism: artificial sequence; description: motif)
Leu Xaa Ser Xaa Glu
1               5

SEQ ID NO: 284 (length: 17; type: PRT; organism: artificial sequence; description: motif)
Asp Tyr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Glu Xaa Xaa Xaa Asp Leu Phe
1               5                   10                  15
Asp

SEQ ID NO: 285 (length: 17; type: PRT; organism: artificial sequence; description: motif)
Asp Tyr Xaa Xaa Xaa Xaa Xaa Xaa Asp Xaa Xaa Xaa Xaa Asp Met Trp
1               5                   10                  15
Glu

SEQ ID NO: 286 (length: 35; type: PRT; organism: artificial sequence; description: motif)
Xaa Xaa Lys Asn Ile Arg Xaa Arg Val Xaa Asp Ala Leu Asn Val Xaa
1               5                   10                  15
Met Ala Xaa Xaa Xaa Ile Xaa Xaa Xaa Lys Lys Glu Ile Xaa Trp Xaa
            20                  25                  30
Gly Leu Pro
        35

SEQ ID NO: 287 (length: 53; type: PRT; organism: artificial sequence; description: motif)
Xaa Xaa Gly Leu Arg Xaa Phe Ser Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1               5                   10                  15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Xaa Lys Xaa
            20                  25                  30
Xaa Xaa Xaa Lys Xaa Xaa Thr Thr Xaa Tyr Xaa Glu Val Ala Asp Glu
        35                  40                  45
Xaa Xaa Xaa Xaa Phe
    50

SEQ ID NO: 288 (length: 9; type: PRT; organism: artificial sequence; description: motif)
Xaa Xaa Xaa Xaa Lys Xaa Xaa Xaa Glu
1               5

SEQ ID NO: 289 (length: 27; type: PRT; organism: artificial sequence; description: motif)
Xaa Xaa Xaa Xaa Lys Xaa Xaa Xaa Xaa Xaa Glu Xaa Xaa Xaa Xaa Xaa
1               5                   10                  15
Xaa Xaa Xaa Xaa Xaa Asn Leu Xaa Xaa Arg Asn
            20                  25

SEQ ID NO: 290 (length: 46; type: PRT; organism: artificial sequence; description: motif)
Xaa Pro Phe Ile Xaa Xaa Xaa Thr Xaa Xaa Xaa Xaa Xaa Val Xaa Xaa
1               5                   10                  15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Phe Xaa Xaa Xaa
            20                  25                  30
Xaa Phe Xaa Xaa His Asp Asp Xaa Xaa Xaa Leu Xaa Xaa Met
        35                  40                  45
* * * * *
References