U.S. patent application number 09/838539 was filed with the patent office on 2001-04-18 and published on 2002-09-12 for plant cellulose synthase and promoter sequences.
Invention is credited to Deborah Delmer, Julie R. Pear, and David M. Stalker.
Application Number | 09/838539 |
Publication Number | 20020129401 |
Family ID | 21851951 |
Publication Date | 2002-09-12 |
United States Patent Application | 20020129401 |
Kind Code | A1 |
Stalker, David M.; et al. | September 12, 2002 |
Plant cellulose synthase and promoter sequences
Abstract
Provided are two plant cDNA clones that are homologs of the
bacterial CelA genes that encode the catalytic subunit of cellulose
synthase, derived from cotton (Gossypium hirsutum). Also provided
are genomic promoter regions for these cellulose synthase encoding
regions. Methods for using cellulose synthase in cotton fiber and
wood quality modification are also provided.
Inventors: | Stalker, David M. (Woodland, CA); Pear, Julie R. (Davis, CA); Delmer, Deborah (Davis, CA) |
Correspondence Address: | Calgene LLC, 1920 Fifth Street, Davis, CA 95616, US |
Family ID: | 21851951 |
Appl. No.: | 09/838539 |
Filed: | April 18, 2001 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
09838539 | Apr 18, 2001 | |
08960048 | Oct 29, 1997 | 6271443 |
60029987 | Oct 29, 1996 | |
Current U.S. Class: | 800/278; 435/320.1; 435/419; 536/23.6; 536/24.1; 800/286; 800/298 |
Current CPC Class: | C12N 15/8242 20130101; C12N 9/1059 20130101; C12N 15/8233 20130101; C12N 15/8261 20130101; C12N 15/8246 20130101; Y02A 40/146 20180101; C12N 15/8222 20130101 |
Class at Publication: | 800/278; 536/23.6; 536/24.1; 435/320.1; 435/419; 800/298; 800/286 |
International Class: | C07H 021/04; C12N 015/82; A01H 005/00; C12N 015/00; C12N 015/63; C12N 005/04; C12N 015/29 |
Claims
What is claimed is:
1. An isolated DNA encoding sequence to a plant cellulose synthesis
enzyme.
2. The DNA encoding sequence of claim 1 wherein said cellulose
synthesis enzyme is cellulose synthase.
3. The DNA encoding sequence of claim 2 wherein said cellulose
synthase is from cotton.
4. The DNA encoding sequence of claim 3 wherein said cotton
cellulose synthase is celA1.
5. The DNA encoding sequence of claim 4 wherein said celA1 is
encoded by the sequence of FIG. 6.
6. The DNA encoding sequence of claim 3 wherein said cotton
cellulose synthase is celA2.
7. The DNA encoding sequence of claim 6 wherein said celA2 is
encoded by the sequence of FIG. 7.
8. An isolated DNA encoding sequence to a plant cellulose synthesis
promoter region.
9. The promoter encoding sequence of claim 8 wherein said cellulose
synthesis promoter region is to cellulose synthase.
10. The promoter sequence of claim 9 wherein said cellulose
synthase promoter region is from cotton.
11. The promoter sequence of claim 10 wherein said cotton cellulose
synthase promoter region is from celA1.
12. The promoter sequence of claim 11 wherein said cotton cellulose
synthase promoter region is from the sequence of FIG. 8.
13. A recombinant DNA construct comprising any of the DNA encoding
sequences of claims 1-10.
14. The DNA construct of claim 13 comprising as operably joined
components in the direction of transcription, a cotton fiber
transcriptional factor and the sequence of any of claims 1-7.
15. A plant cell comprising a DNA construct of claims 13 or 14.
16. A plant comprising a cell of claim 15.
17. A method of modifying fiber phenotype in a cotton plant, said
method comprising: transforming a plant cell with DNA comprising a
construct of claims 13 or 14.
18. A method of modifying the wood quality phenotype in a forest
tree species, said method comprising: transforming a plant cell of
said species with DNA comprising a construct of claim 13.
19. A method according to claim 18 wherein said cellulose synthesis
enzyme is cellulose synthase and wherein the encoding sequence is
in an antisense orientation, wherein transcribed mRNA from said
sequence is complementary to the equivalent mRNA transcribed from
the endogenous gene, whereby the synthesis of cellulose in said
plant cell is suppressed.
20. A method according to claim 18, wherein said cellulose synthesis
enzyme is cellulose synthase and wherein the encoding sequence is
in a sense orientation, and wherein the synthesis of cellulose in
said plant cell is increased.
21. A method according to claim 20 wherein said plant cell
additionally comprises a construct encoding a sequence to an enzyme
involved in the synthesis of lignin or a lignin precursor.
22. A method according to claim 20 wherein said lignin encoding
sequence is in an antisense orientation, wherein transcribed mRNA
from said sequence is complementary to the equivalent mRNA
transcribed from the endogenous gene, whereby the synthesis of
lignin is suppressed.
Description
TECHNICAL FIELD
[0001] This invention relates to plant cellulose synthase cDNA
encoding sequences, and their use in modifying plant phenotypes.
Methods are provided whereby the sequences can be used to control
or limit the expression of endogenous cellulose synthase.
[0002] This invention also relates to methods of using in vitro
constructed DNA transcription or expression cassettes capable of
directing fiber-tissue transcription of a DNA sequence of interest
in plants to produce fiber cells having an altered phenotype, and
to methods of providing for or modifying various characteristics of
cotton fiber. The invention is exemplified by methods of using
cotton fiber promoters for altering the phenotype of cotton fiber,
and cotton fibers produced by the method.
BACKGROUND
[0003] In spite of much effort, no one has succeeded in isolating
and characterizing the enzyme(s) responsible for synthesis of the
major cell wall polymer of plants, cellulose.
[0004] Numerous efforts have been directed toward the study of
synthesis of cellulose (1,4-β-D-glucan) in higher plants.
However, hampered by low rates of activity in vitro, the cellulose
synthase of plants has resisted purification and detailed
characterization (for reviews, see 1,2). Aided by the discovery of
cyclic-di-GMP as a specific activator, the cellulose synthase of
the bacterium Acetobacter xylinum can be easily assayed in vitro,
has been purified to homogeneity, and a catalytic subunit
identified (for reviews, see 2,3). Furthermore, an operon of four
genes involved in cellulose synthesis in A. xylinum has been cloned
(4-7).
[0005] Characterization of these genes indicates that the first
gene, termed either BcsA (7) or AcsAB (6) codes for the 83 kD
subunit of the cellulose synthase that binds the substrate UDP-glc
and presumably catalyzes the polymerization of glucose residues to
1,4-β-D-glucan (8). The second gene (B) of the operon is
believed to function as a regulatory subunit binding cyclic-di-GMP
(9), while recent evidence suggests that the C and D genes may code
for proteins that form a pore allowing secretion of the polymer and
control the pattern of crystallization of the resulting
microfibrils (6).
[0006] Recent studies with another gram-negative bacterium,
Agrobacterium tumefaciens, have also led to cloning of genes
involved in cellulose synthesis (10,11), although the proposed
pathway of synthesis differs in some respects from that of A.
xylinum. In A. tumefaciens, a CelA gene showing significant
homology to the BcsA/AcsAB gene of A. xylinum, is proposed to
transfer glc from UDP-glc to a lipid acceptor; other gene products
may then build up a lipid oligosaccharide that is finally
polymerized to cellulose by the action of an endo-glucanase
functioning in a synthetic mode. In addition, homologs of the CelA,
B, and C genes have been identified in E. coli, but, as this
organism is not known to synthesize cellulose in vivo, the function
of these genes is not clear (2).
[0007] These successes in bacterial systems opened the possibility
that homologs of the bacterial genes might be identified in higher
plants. However, experiments in a number of laboratories utilizing
the A. xylinum genes as probes for screening plant cDNA libraries
have failed to identify similar plant genes. Such lack of success
suggests that, if plants do contain homologs of the bacterial
genes, their overall sequence homology is not very high. Recent
studies analyzing the conserved motifs common to
glycosyltransferases using either UDP-glc or UDP-GlcNAc as
substrate suggest that there are specific conserved regions that
might be expected to be found in any plant homolog of the catalytic
subunit (referred to hereafter as CelA). In one of these studies,
Delmer and Amor (2) identified a motif common to many such
glycosyltransferases including the bacterial CelA proteins. An
independent analysis (6) also concluded that this motif was highly
conserved in a group of similar glycosyltransferases.
[0008] Extending these studies further, Saxena et al. (12)
presented an elegant model for the mechanism of catalysis for
enzymes such as cellulose synthase that have the unique problem of
synthesizing consecutive residues that are rotated approximately
180° with respect to each other. The model invokes
independent UDP-glc binding sites and, based upon hydrophobic
cluster analysis of these enzymes, the authors concluded that 3
critical regions in all such processive glycosyltransferases each
contain a conserved aspartate (D) residue, while a fourth region
contained a conserved QXXRW motif. The first D residue resides in
the motif as previously analyzed (2,6).
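As an illustration only, the QXXRW motif described above can be located in a deduced amino acid sequence with a simple pattern search. The following Python sketch is not part of the disclosed methods; the function name find_qxxrw and the example peptide are hypothetical and are not taken from any sequence in the figures.

```python
import re

# Q, any two residues, then R and W (one-letter amino acid codes).
# The lookahead lets overlapping occurrences be reported as well.
QXXRW = re.compile(r"(?=(Q[A-Z]{2}RW))")


def find_qxxrw(protein):
    """Return (1-based position, matched residues) for each QXXRW occurrence."""
    seq = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in QXXRW.finditer(seq)]


# Hypothetical peptide used only to exercise the function.
example = "MDGSAQVLRWALGTD"
print(find_qxxrw(example))
```

A pattern search of this kind only flags candidate positions; whether a hit lies in a region comparable to those described by Saxena et al. (12) still has to be judged from an alignment.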
[0009] In general, genetic engineering techniques have been
directed to modifying the phenotype of individual prokaryotic and
eukaryotic cells, especially in culture. Plant cells have proven
more intransigent than other eukaryotic cells, due not only to a
lack of suitable vector systems but also as a result of the
different goals involved. For many applications, it is desirable to
be able to control gene expression at a particular stage in the
growth of a plant or in a particular plant part. For this purpose,
regulatory sequences are required which afford the desired
initiation of transcription in the appropriate cell types and/or at
the appropriate time in the plant's development without having
serious detrimental effects on plant development and productivity.
It is therefore of interest to be able to isolate sequences which
can be used to provide the desired regulation of transcription in a
plant cell during the growing cycle of the host plant.
[0010] One aspect of this interest is the ability to change the
phenotype of particular cell types, such as differentiated
epidermal cells that originate in fiber tissue, i.e. cotton fiber
cells, so as to provide for altered or improved aspects of the
mature cell type. Cotton is a plant of great commercial
significance. In addition to the use of cotton fiber in the
production of textiles, other uses of cotton include food
preparation with cotton seed oil and animal feed derived from
cotton seed husks.
[0011] A related goal involving the control of cell wall
characteristics would be to affect valuable secondary
characteristics of tree wood for paper and forestry products. For instance,
by altering the balance of cellulose and lignin, the quality of
wood for paper production may be improved.
[0012] Finally, despite the importance of cotton as a crop, the
breeding and genetic engineering of cotton fiber phenotypes has
taken place at a relatively slow rate because of the absence of
reliable promoters for use in selectively effecting changes in the
phenotype of the fiber. In order to effect the desired phenotypic
changes, transcription initiation regions capable of initiating
transcription in fiber cells during development are desired. Thus,
an important goal of cotton bioengineering research is the
acquisition of a reliable promoter which would permit expression of
a protein selectively in cotton fiber to affect such qualities as
fiber strength, length, color and dyeability.
Relevant Literature
[0013] Cotton fiber-specific promoters are discussed in PCT
publications WO 94/12014 and WO 95/08914, and John and Crow, Proc.
Natl. Acad. Sci. USA, 89:5769-5773, 1992. cDNA clones that are
preferentially expressed in cotton fiber have been isolated. One of
the clones isolated corresponds to mRNA and protein that are
highest during the late primary cell wall and early secondary cell
wall synthesis stages. John and Crow, supra.
[0014] In plants, control of cytoskeletal organization is poorly
understood in spite of its importance for the regulation of
patterns of cell division, expansion, and subsequent deposition of
secondary cell wall polymers. The cotton fiber represents an
excellent system for studying cytoskeletal organization. Cotton
fibers are single cells in which cell elongation and secondary wall
deposition can be studied as distinct events. These fibers develop
synchronously within the boll following anthesis, and each fiber
cell elongates for about 3 weeks, depositing a thin primary wall
(Meinert and Delmer, (1977) Plant Physiol. 59: 1088-1097; Basra and
Malik, (1984) Int Rev of Cytol 89: 65-113). At the time of
transition to secondary wall cellulose synthesis, the fiber cells
undergo a synchronous shift in the pattern of cortical microtubule
and cell wall microfibril alignments, events which may be regulated
upstream by the organization of actin (Seagull, (1990) Protoplasma
159: 44-59; and (1992) In: Proceedings of the Cotton Fiber
Cellulose Conference, National Cotton Council of America, Memphis
TN, pp 171-192).
[0015] Agrobacterium-mediated cotton transformation is described in
Umbeck, U.S. Pat. Nos. 5,004,863 and 5,159,135 and cotton
transformation by particle bombardment is reported in WO 92/15675,
published Sep. 17, 1992. Transformation of Brassica has been
described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694;
Plant Cell Reports (1992) 11:499-505).
[0016] Genes involved in lignin biosynthesis are described by
Dwivedi, U. N., Campbell, W. H., Yu, J., Datla, R. S. S., Chiang,
V. L., and Podila, G. K. (1994) "Modification of lignin
biosynthesis in transgenic Nicotiana through expression of an
antisense O-methyltransferase gene from Populus" Pl. Mol. Biol. 26:
61-71; and Tsai, C. J., Podila, G. K. and Chiang, V. L. (1995)
"Nucleotide sequence of Populus tremuloides gene for caffeic acid/5
hydroxyferulic acid O-methyltransferase" Pl. Physiol. 107: 1459;
and also U.S. Pat. No. 5,451,514 (claiming the use of cinnamyl
alcohol dehydrogenase gene in an antisense orientation such that
the endogenous plant cinnamyl alcohol dehydrogenase gene is
inhibited).
Other References Cited Throughout the Specification
[0017] 1. Gibeaut, D. M., & Carpita, N. C. (1994) FASEB J. 8,
904-915.
[0018] 2. Delmer, D. P., & Amor, Y. (1995) Plant Cell 7,
987-1000.
[0019] 3. Ross, P., Mayer, R., & Benziman, M. (1991) Microbiol.
Rev. 55, 35-58.
[0020] 4. Saxena, I. M., Lin, F. C., & Brown, R. M., Jr. (1990)
Plant Mol. Biol. 15, 673-683.
[0021] 5. Saxena, I. M., Lin, F. C., & Brown, R. M., Jr. (1992)
Plant Mol. Biol. 16, 947-954.
[0022] 6. Saxena, I. M., Kudlicka, K., Okuda, K., & Brown, R.
M., Jr. (1994) J. Bacteriol. 176, 5735-5752.
[0023] 7. Wong, H. C., Fear, A. L., Calhoon, R. D., Eidhinger, G.
H., Mayer, R., Amikam, D., Benziman, M., Gelfand, D. H., Meade, J.
H., Emerick, A. W., Bruner, R., Ben-Basat, B. A., & Tal, R.
(1990) Proc. Natl. Acad. Sci. USA 87, 8130-8134.
[0024] 8. Lin, F.-C., Brown, R. M. Jr., Drake, R. R. Jr., &
Haley, B. E. (1990) J. Biol. Chem. 265, 4782-4784.
[0025] 9. Mayer, R., Ross, P., Winhouse, H., Amikam, D., Volman,
G., Ohana, P., Calhoon, R. D., Wong, H. C., Emerick, A. W., &
Benziman, M. (1991) Proc. Natl. Acad. Sci. USA 88, 5472-5476.
[0026] 10. Matthysse, A. G., White, S., & Lightfoot, R. (1995a)
J. Bacteriol. 177, 1069-1075.
[0027] 11. Matthysse, A. G., Thomas, D. O. L., & White, S.
(1995b) J. Bacteriol. 177, 1076-1081.
[0028] 12. Saxena, I. M., Brown, R. M., Jr., Fevre, M., Geremia, R.
A., & Henrissat, B. (1995) J. Bacteriol. 177, 1419-1424.
[0029] 13. Meinert, M., & Delmer, D. P. (1977) Plant Physiol.
59, 1088-1097.
[0030] 14. Delmer, D. P., Pear, J. R., Andrawis, A., & Stalker,
D. M. (1995) Mol. Gen. Genet. 248, 43-51.
[0031] 15. Delmer, D. P., Solomon, M., & Read, S. M. (1991)
Plant Physiol. 95, 556-563.
[0032] 16. Nagai, K., & Thogersen, H. C. (1987) Methods in
Enzymol. 153, 461-481.
[0033] 17. Laemmli, U. K. (1970) Nature 227, 680-685.
[0034] 18. Kyte, J., & Doolittle, R. F. (1982) J. Mol. Biol.
157, 105-132.
[0035] 19. Oikonomakos, N. G., Acharya, K. R., Stuart, D. I.,
Melpidou, A. E., McLaughlin, P. J., & Johnson, L. N. (1988)
Eur. J. Biochem. 173, 569-578.
[0036] 20. Maltby, D., Carpita, N. C., Montezinos, D., Kulow, C.,
& Delmer, D. P. (1979) Plant Physiol. 63, 1158-1164.
[0037] 21. Inoue, S. B., Takewaki, N., Takasuka, T., Mio, T.,
Adachi, M., Fujii, Y., Miyamoto, C., Arisawa, M., Furuichi, Y.,
& Watanabe, T. (1995) Eur. J. Biochem. 231, 845-854.
[0038] 22. Jacob, S. R., & Northcote, D. H. (1985) J. Cell Sci.
2 (suppl.), 1-11.
[0039] 23. Delmer, D. P. (1987) Annu. Rev. Plant Physiol. 38,
259-290.
[0040] 24. Altschul, S. F., Gish, W., Miller, W., Myers, E. W.,
& Lipman, D. J. (1990) J. Mol. Biol. 215, 403-410.
[0041] 25. Milligan, G., Parenti, M., & Magee, A. I. (1995)
TIBS 20, 183-186.
[0042] 26. Amor, Y., Haigler, C. H., Johnson, S., Wainscott, M.,
& Delmer, D. P. (1995) Proc. Natl. Acad. Sci. USA 92,
9353-9357.
[0043] 27. Amor, Y., Mayer, R., Benziman, M., & Delmer, D. P.
(1991) Plant Cell 3, 989-995.
SUMMARY OF THE INVENTION
[0044] Two cotton genes, CelA1 and CelA2, have been shown to be
highly expressed in developing fibers at the onset of secondary
wall cellulose synthesis. Comparisons indicate that these genes and
the rice CelA gene encode polypeptides that have three regions of
reasonably high homology, both in terms of primary amino acid
sequence and hydropathy, with bacterial CelA proteins. The fact
that these homologous stretches are in the same sequential order as
in the bacterial CelA proteins and also contain four sub-regions
previously predicted to be critical for substrate binding and
catalysis (12) argues that the plant genes encode true homologs of
bacterial CelA proteins. Furthermore, the pattern of expression in
fiber as well as our demonstration that at least one of these
highly-conserved regions is critical for UDP-glc binding also
supports this conclusion.
[0045] Novel DNA promoter sequences are also supplied, and methods
for their use are described for directing transcription of a gene
of interest in cotton fiber.
[0046] The developing cotton fiber is an excellent system for
studies on cellulose synthesis as these single cells develop
synchronously in the boll and, at the end of elongation, initiate
the synthesis of a nearly pure cellulosic cell wall. During this
transition period, synthesis of other cell wall polymers ceases and
the rate of cellulose synthesis is estimated to rise nearly
100-fold in vivo (13). In our continuing efforts to identify genes
critical to this phase of fiber development, we have initiated a
program sequencing randomly selected cDNA clones derived from a
library prepared from mRNA harvested from fibers at the stage in
which secondary wall synthesis approaches its maximum rate
(approximately 21 dpa).
[0047] We have characterized two cotton (Gossypium hirsutum) cDNA
clones and identified one rice (Oryza sativa) cDNA that are
homologs of the bacterial CelA genes that encode the catalytic
subunit of cellulose synthase. Three regions in the deduced amino
acid sequences of the plant CelA gene products are conserved with
respect to the proteins encoded by bacterial CelA genes. Within
these conserved regions are four highly conserved subdomains
previously suggested to be critical for catalysis and/or binding of
the substrate UDP-glc. An overexpressed DNA segment of the cotton
celA1 gene encodes a polypeptide fragment that spans these domains
and effectively binds UDP-glc, while a similar fragment having one
of these domains deleted does not. The plant CelA genes show little
homology at the amino and carboxy terminal regions and also contain
two internal insertions of sequence, one conserved and one
hypervariable, that are not found in the bacterial gene sequences.
Cotton celA1 and CelA2 genes are expressed at high levels during
active secondary wall cellulose synthesis in the developing fiber.
Genomic Southern analyses in cotton demonstrate that CelA comprises
a family of approximately four distinct genes.
[0048] We report here the discovery of two cotton genes that show
highly-enhanced expression at the time of onset of secondary wall
synthesis in the fiber. The sequences of these two cDNA clones,
termed celA1 and CelA2, while not identical, are highly homologous
to each other and to a sequenced rice EST clone discovered in the
dbEST databank. The deduced proteins also share significant regions
of homology with the bacterial CelA proteins. Coupled with their
high level and specificity of expression in fiber at the time of
active cellulose synthesis, as well as the ability of an E. coli
expressed fragment of the celA1 gene product to bind UDP-glc, these
findings support the conclusion that these plant genes are true
homologs of the bacterial CelA genes.
[0049] The methods of the present invention include transfecting a
host plant cell of interest with a transcription or expression
cassette comprising a cotton fiber promoter and generating a plant
which is grown to produce fiber having the desired phenotype.
Constructs and methods of the subject invention thus find use in
modulation of endogenous fiber products, as well as production of
exogenous products and in modifying the phenotype of fiber and
fiber products. The constructs also find use as molecular probes.
In particular, constructs and methods for use in gene expression in
cotton embryo tissues are considered herein. By these methods,
novel cotton plants and cotton plant parts, such as modified cotton
fibers, may be obtained.
[0050] The sequences and constructs of this invention may also be
used to isolate related cellulose synthase genes from forest tree
species, for use in transforming and modifying wood quality. As an
example, lignin, an undesirable by-product of the pulping process,
may be reduced by over-expressing the cellulose synthase product and
diverting production into cellulose.
[0051] Thus, the application provides constructs and methods of use
relating to modification of cell and cell wall phenotype in cotton
fiber and wood products.
DESCRIPTION OF THE DRAWINGS
[0052] FIG. 1. Northern analysis of celA1 gene in cotton tissues
and developing fiber. Approximately 10 μg total RNA from each
tissue was loaded per lane. Blots were prepared and probe
preparation and hybridization conditions were performed as
described previously (14). The entire celA1 cDNA insert was used as
a probe in this experiment. Exposure time for the autoradiogram was
seven hours at -70°.
[0053] FIG. 2. Cotton genomic DNA analysis for both the celA1 and
CelA2 cDNAs. Approximately 10-12 μg of DNA was digested with the
designated restriction enzymes and electrophoresed on 0.9% agarose
gels. Probe preparation and hybridization conditions were as
described previously (14). The entire celA1 and CelA2 cDNAs were
utilized as probes. Exposure time for the autoradiograms was three
days at -70°.
[0054] FIG. 3. Multiple alignment of deduced amino acid sequences
of plant and bacterial CelA proteins. Analyses were performed by
Clustal Analysis using the Lasergene Multalign program (DNAStar,
Madison, Wis.) with gap and gap-length penalties of 10 and a PAM250
weight table. Residues are boxed and shaded when they show chemical
group similarity in 4 out of 7 proteins compared. H-1, H-2, H-3
regions are indicated where homology between plant and bacterial
proteins is highest. The plant proteins show two insertions that
are not present in the bacterial protein--one, P-CR, is conserved
among the plant CelA genes, while a second insertion is
hypervariable (HVR) between plant genes. The presence of the P-CR
and HVR regions led to inaccurate alignments when the entire
proteins were compared; the optimal alignments shown here were thus
performed in five separate blocks. Regions U-1 through U-4 are
predicted to be critical for UDP-glc binding and catalysis in
bacterial CelA proteins; the predicted critical D residues and
QXXRW motif are boxed and starred respectively. Potential sites of
N-glycosylation are indicated by -G-.
[0055] FIG. 4. Kyte-Doolittle hydropathy plots of cotton celA1
aligned with those of two bacterial CelA proteins. Alignments and
designations are based upon those noted in FIG. 3. The hydropathy
profiles shown were calculated using a window of 7, although a
window of 19 was used for predictions of transmembrane helices that
are indicated by the arrows.
[0056] FIG. 5. An E. coli expressed GST-cotton celA1 fusion
protein containing U-1 through U-4 binds UDP-glc in vitro.
Panel A shows a hypothetical orientation of the cotton celA1
protein in the plasma membrane and indicates the cytoplasmic region
containing the sub-domains U-1 to U-4. GST-fusion constructs for
celA1 fragments spanning the region between the potential
transmembrane helices (A through H) were prepared as described in
Materials and Methods. The purified and blotted celA1 fusion
protein fragments were tested as described in Materials and Methods
for their ability to bind ³²P-UDP-glc (panel B). M refers to
the molecular weight markers while CS and ΔU1 refer to the
full-length and deleted GST-celA1 fusion polypeptides. The left
panel shows proteins stained with Coomassie blue while the other
three panels show representative autoradiograms under different
binding conditions as described in Materials and Methods. Ph, BSA
and Ova refer to the molecular weight standards phosphorylase b,
bovine serum albumin and ovalbumin respectively.
[0057] FIG. 6. Nucleic acid sequences to cDNA of celA1 protein of
cotton (Gossypium hirsutum).
[0058] FIG. 7. Nucleic acid sequences to cDNA of CelA2 protein of
cotton (Gossypium hirsutum), including approximately the last 3'
two-thirds of the encoding region.
[0059] FIG. 8. Genomic nucleic acid sequences of celA1 protein of
cotton (Gossypium hirsutum), including approximately 900 bases of
the promoter region 5' to the encoding sequences.
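For readers unfamiliar with the hydropathy profiles referred to in the description of FIG. 4, the following Python sketch shows one way a Kyte-Doolittle profile (18) with a window of 7 could be computed. It is a minimal illustration under stated assumptions, not the software used to prepare the figure; the function name hydropathy_profile and the test peptide are hypothetical, and the sketch does not attempt the transmembrane helix prediction mentioned for the window of 19.

```python
# Kyte-Doolittle hydropathy index (reference 18), one-letter amino acid codes.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Return the mean hydropathy of every full sliding window.

    The list has len(protein) - window + 1 entries, or is empty when the
    sequence is shorter than the window.
    """
    values = [KD[aa] for aa in protein.upper()]
    if window < 1 or len(values) < window:
        return []
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical peptide: a hydrophobic stretch followed by a charged stretch.
profile = hydropathy_profile("LLIVVAALKKDDEERR")
print([round(v, 2) for v in profile])
```

Positive window averages indicate hydrophobic stretches and negative averages indicate hydrophilic stretches; a larger window smooths the profile.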
DETAILED DESCRIPTION OF THE INVENTION
[0060] In accordance with the subject invention, novel constructs
and methods are described, which may be used to provide for
transcription of a nucleotide sequence of interest in cells of a
plant host, preferentially in cotton fiber cells to produce cotton
fiber having an altered color phenotype.
[0061] Cotton fiber is a differentiated single epidermal cell of
the outer integument of the ovule. It has four distinct growth
phases: initiation, elongation (primary cell wall synthesis),
secondary cell wall synthesis, and maturation. Initiation of fiber
development appears to be triggered by hormones. The primary cell
wall is laid down during the elongation phase, lasting up to 25
days postanthesis (DPA). Synthesis of the secondary wall commences
prior to the cessation of the elongation phase and continues to
approximately 40 DPA, forming a wall of almost pure cellulose.
[0062] The constructs for use in such cells may include several
forms, depending upon the intended use of the construct. Thus, the
constructs include vectors, transcriptional cassettes, expression
cassettes and plasmids. The transcriptional and translational
initiation region (also sometimes referred to as a "promoter")
preferably comprises a transcriptional initiation regulatory region
and a translational initiation regulatory region of untranslated 5'
sequences, "ribosome binding sites," responsible for binding mRNA
to ribosomes and translational initiation. It is preferred that all
of the transcriptional and translational functional elements of the
initiation control region are derived from or obtainable from the
same gene. In some embodiments, the promoter will be modified by
the addition of sequences, such as enhancers, or deletions of
nonessential and/or undesired sequences. By "obtainable" is
intended a promoter having a DNA sequence sufficiently similar to
that of a native promoter to provide for the desired specificity of
transcription of a DNA sequence of interest. It includes natural
and synthetic sequences as well as sequences which may be a
combination of synthetic and natural sequences.
[0063] Cotton fiber transcriptional initiation regions of cellulose
synthase are used in cotton fiber modification.
[0064] A transcriptional cassette for transcription of a nucleotide
sequence of interest in cotton fiber will include in the direction
of transcription, the cotton fiber transcriptional initiation
region, a DNA sequence of interest, and a transcriptional
termination region functional in the plant cell. When the cassette
provides for the transcription and translation of a DNA sequence of
interest it is considered an expression cassette. One or more
introns may also be present.
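The ordering of components described above can be pictured with a small data structure. The Python sketch below is purely illustrative; the class name Cassette, its field names, and the placeholder sequences are hypothetical and do not correspond to any construct or sequence disclosed herein.

```python
from dataclasses import dataclass


@dataclass
class Cassette:
    """Components listed in the direction of transcription (5' to 3')."""
    initiation_region: str     # cotton fiber transcriptional initiation region
    sequence_of_interest: str  # DNA sequence of interest
    termination_region: str    # termination region functional in the plant cell
    translated: bool = False   # True when the sequence is also translated

    def assemble(self):
        """Join the components in the direction of transcription."""
        return (self.initiation_region
                + self.sequence_of_interest
                + self.termination_region)

    def kind(self):
        """Name the cassette according to whether translation is provided for."""
        if self.translated:
            return "expression cassette"
        return "transcriptional cassette"


# Placeholder strings stand in for real sequences.
cassette = Cassette("PROMOTER-", "ATGGCCTAA", "-TERMINATOR", translated=True)
print(cassette.kind(), cassette.assemble())
```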
[0065] Other sequences may also be present, including those
encoding transit peptides and secretory leader sequences as
desired.
[0066] Downstream from, and under the regulatory control of, the
cellulose synthase transcriptional/translational initiation control
region is a nucleotide sequence of interest which provides for
modification of the phenotype of fiber. The nucleotide sequence may
be any open reading frame encoding a polypeptide of interest, for
example, an enzyme, or a sequence complementary to a genomic
sequence, where the genomic sequence may be an open reading frame,
an intron, a noncoding leader sequence, or any other sequence where
the complementary sequence inhibits transcription, messenger RNA
processing, for example, splicing, or translation. The nucleotide
sequences of this invention may be synthetic, naturally derived, or
combinations thereof. Depending upon the nature of the DNA sequence
of interest, it may be desirable to synthesize the sequence with
plant preferred codons. The plant preferred codons may be
determined from the codons of highest frequency in the proteins
expressed in the largest amount in the particular plant species of
interest. Phenotypic modification can be achieved by modulating
production either of an endogenous transcription or translation
product, for example as to the amount, relative distribution, or
the like, or an exogenous transcription or translation product, for
example to provide for a novel function or products in a transgenic
host cell or tissue. Of particular interest are DNA sequences
encoding expression products associated with the development of
plant fiber, including genes involved in metabolism of cytokinins,
auxins, ethylene, abscissic acid, and the like. Methods and
compositions for modulating cytokinin expression are described in
U.S. Pat. No. 5,177,307, which disclosure is hereby incorporated by
reference. Alternatively, various genes, from sources including
other eukaryotic or prokaryotic cells, including bacteria, such as
those from Agrobacterium tumefaciens T-DNA auxin and cytokinin
biosynthetic gene products, for example, and mammals, for example
interferons, may be used.
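The determination of plant preferred codons mentioned above can be sketched as a simple frequency count. The Python example below is a minimal sketch under stated assumptions: it uses the standard genetic code, counts codons across whatever coding sequences are supplied without weighting them by expression level, and breaks ties in favor of the codon seen first. The function name preferred_codons and the input sequence are hypothetical.

```python
from collections import Counter

# Standard genetic code, built in TCAG order; "*" marks stop codons.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
GENETIC_CODE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def preferred_codons(coding_sequences):
    """Map each amino acid to its most frequent codon in the given sequences."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            if codon in GENETIC_CODE:     # skip codons with ambiguous bases
                counts[codon] += 1
    best = {}
    for codon, n in counts.items():
        aa = GENETIC_CODE[codon]
        if aa not in best or n > counts[best[aa]]:
            best[aa] = codon
    return best


# Hypothetical coding fragment: Ala (GCT), Ala (GCT), Ala (GCC), Lys (AAG).
print(preferred_codons(["GCTGCTGCCAAG"]))
```

In practice the input would be restricted to coding sequences of proteins expressed in the largest amounts in the plant species of interest, as the paragraph above describes.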
[0067] Alternatively, the present invention provides the sequences
to cotton cellulose synthase, which can be expressed, or down
regulated by antisense or co-suppression with its own, or other
cotton or other fiber promoters to modify fiber phenotype.
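As a brief illustration of the antisense orientation referred to above, the Python sketch below derives the reverse complement of a coding strand, which is the sequence that would be placed downstream of the promoter so that the transcribed RNA is complementary to the endogenous mRNA. The function name antisense_orientation and the six-base example are hypothetical and serve only to show the operation.

```python
# Complement table for unambiguous DNA bases, preserving letter case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_orientation(coding_strand):
    """Return the reverse complement of a coding (sense) strand."""
    return coding_strand.translate(_COMPLEMENT)[::-1]


# Hypothetical fragment used only to exercise the function.
sense = "ATGGCC"
print(antisense_orientation(sense))
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing constructs in both orientations.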
[0068] In cotton, primary wall hemicellulose synthesis ceases as
secondary wall synthesis initiates in the fiber, and there are only
two possible .beta.-glucans synthesized in fibers at the time these
genes are highly-expressed; callose and cellulose (20). The
following data strongly argue against the plant CelA genes coding
for callose synthase: 1) callose synthase binds UDP-glc and is
activated in a Ca.sup.2+-dependent manner (2), while the celA1
polypeptide fragment containing the UDP-glc binding site
preferentially binds UDP-glc in a Mg.sup.2+-dependent manner,
similar to bacterial cellulose synthase (9); 2) the timing of
synthesis of callose in vivo in developing cotton fiber (20) does
not match the expression of the cotton CelA genes (FIG. 1); 3)
comparison of the CelA gene sequences with those of suspected
1,3-.beta.-glucan synthase genes from yeast (21) indicated no
significant homology.
[0069] It is still possible that the CelA protein might encode
both activities, as hypothesized some years ago (22-23), and the
plant CelAs might be responsible for direct polymerization of
glucan from UDP-glc as proposed for A. xylinum, although they may
catalyze synthesis of a lipid-glc precursor as proposed for the
CelA protein of A. tumefaciens.
[0070] In addition to their similarities, the plant CelA genes show
several very interesting divergences from their bacterial
ancestors, and these may account for the previous lack of success
in using bacterial probes to detect these cDNA clones. However, a
BLAST search of protein data banks (24) using the entire protein
sequence of cotton celA1 always shows highest homology with the
bacterial cellulose synthases. Of particular interest is the
insertion of two unique, plant-specific regions designated P-CR and
HVR. These regions are clearly not artifacts of cloning as they are
observed in both cotton genes as well as the rice CelA gene. The
three plant proteins show a high degree of amino acid homology to
each other throughout most of their length, diverging only at the
N- and C-terminal ends and the very interesting HVR region. It is
tempting to speculate that the HVR region may confer some
specificity of function; the highly charged and cysteine-rich
nature of the first portion of HVR could make this region a
potential candidate for interaction with specific regulatory
proteins or cytoskeletal elements, or for redox regulation. In
addition, we note the presence of several cysteine residues near
the N- and C-terminal regions of the protein that might serve as
substrates for palmitoylation and thereby help anchor the
protein in the membrane (25).
[0071] In summary, the finding of these plant CelA homologs
potentially opens up an exciting chapter in research on cellulose
synthesis in higher plants. Their finding is of particular
significance since biochemical approaches to identification of
plant cellulose synthase have proven exceedingly difficult. One
obvious challenge will be to gain definitive proof that these genes
are truly functional in cellulose synthesis in vivo. Other
promising goals will be to identify other components of a complex
that might interact with CelA, such as that proposed for sucrose
synthase (26), and/or a regulatory subunit that binds cyclic-di-GMP
(9,27) or other glycosyltransferases (10,11).
[0072] Transcriptional cassettes may be used when the transcription
of an anti-sense sequence is desired. When the expression of a
polypeptide is desired, expression cassettes providing for
transcription and translation of the DNA sequence of interest will
be used. Various changes are of interest; these changes may include
modulation (increase or decrease) of formation of particular
saccharides, hormones, enzymes, or other biological parameters.
These also include modifying the composition of the final fiber,
that is, changing the ratio and/or amounts of water, solids, fiber
or sugars. Other phenotypic properties of interest for modification
include response to stress, organisms, herbicides, brushing, growth
regulators, and the like. These results can be achieved by
providing for reduction of expression of one or more endogenous
products, particularly an enzyme or cofactor, either by producing a
transcription product which is complementary (anti-sense) to the
transcription product of a native gene, so as to inhibit the
maturation and/or expression of the transcription product, or by
providing for expression of a gene, either endogenous or exogenous,
to be associated with the development of a plant fiber.
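As a minimal sketch of what "complementary (anti-sense)" means at the
sequence level, an anti-sense strand may be derived as the reverse
complement of the sense strand. The function name and the six-base
input are hypothetical and serve only to illustrate the relationship.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


# Hypothetical sense fragment, for illustration only.
sense = "ATGGAA"
antisense = reverse_complement(sense)
print(sense, "->", antisense)
```

Applying the function twice returns the original strand, which is a
convenient sanity check when preparing an anti-sense construct.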
[0073] The termination region which is employed in the expression
cassette will be primarily one of convenience, since the
termination regions appear to be relatively interchangeable. The
termination region may be native with the transcriptional
initiation region, may be native with the DNA sequence of interest,
or may be derived from another source. The termination region may be
naturally occurring, or wholly or partially synthetic. Convenient
termination regions are available from the Ti-plasmid of A.
tumefaciens, such as the octopine synthase and nopaline synthase
termination regions. In some embodiments, it may be desired to use
the 3' termination region native to the cotton fiber transcription
initiation region used in a particular construct.
[0074] As described herein, in some instances additional nucleotide
sequences will be present in the constructs to provide for
targeting of a particular gene product to specific cellular
locations.
[0075] Similarly, other constitutive promoters may also be useful
in certain applications, for example the mas, Mac or DoubleMac
promoters described in U.S. Pat. No. 5,106,739 and by Comai et al.,
Plant Mol. Biol. (1990) 15:373-381. When plants comprising
multiple gene constructs are desired, the plants may be obtained by
co-transformation with both constructs, or by transformation with
individual constructs followed by plant breeding methods to obtain
plants expressing both of the desired genes.
[0076] A variety of techniques are available and known to those
skilled in the art for introduction of constructs into a plant cell
host. These techniques include transfection with DNA employing A.
tumefaciens or A. rhizogenes as the transfecting agent, protoplast
fusion, injection, electroporation, particle acceleration, etc. For
transformation with Agrobacterium, plasmids can be prepared in E.
coli which contain DNA homologous with the Ti-plasmid, particularly
T-DNA. The plasmid may or may not be capable of replication in
Agrobacterium, that is, it may or may not have a broad spectrum
prokaryotic replication system such as does, for example, pRK290,
depending in part upon whether the transcription cassette is to be
integrated into the Ti-plasmid or to be retained on an independent
plasmid. The Agrobacterium host will contain a plasmid having the
vir genes necessary for transfer of the T-DNA to the plant cell and
may or may not have the complete T-DNA. At least the right border
and frequently both the right and left borders of the T-DNA of the
Ti- or Ri-plasmids will be joined as flanking regions to the
transcription construct. The use of T-DNA for transformation of
plant cells has received extensive study and is amply described in
EPA Serial No. 120,516, Hoekema, In: The Binary Plant Vector System
Offset-drukkerij Kanters B. V., Alblasserdam, 1985, Chapter V,
Knauf, et al., Genetic Analysis of Host Range Expression by
Agrobacterium, In: Molecular Genetics of the Bacteria-Plant
Interaction, Puhler, A. ed., Springer-Verlag, NY, 1983, p. 245, and
An, et al., EMBO J. (1985) 4:277-284.
[0077] For infection, particle acceleration and electroporation, a
disarmed Ti-plasmid (lacking particularly the tumor genes found in
the T-DNA region) may be introduced into the plant cell. By means
of a helper plasmid, the construct may be transferred to the A.
tumefaciens and the resulting transfected organism used for
transfecting a plant cell; explants may be cultivated with
transformed A. tumefaciens or A. rhizogenes to allow for transfer
of the transcription cassette to the plant cells. Alternatively, to
enhance integration into the plant genome, terminal repeats of
transposons may be used as borders in conjunction with a
transposase. In this situation, expression of the transposase
should be inducible, so that once the transcription construct is
integrated into the genome, it should be relatively stably
integrated. Transgenic plant cells are then placed in an
appropriate selective medium for selection of transgenic cells
which are then grown to callus, shoots grown and plantlets
generated from the shoot by growing in rooting medium.
[0078] To confirm the presence of the transgenes in transgenic
cells and plants, a Southern blot analysis can be performed using
methods known to those skilled in the art. Expression products of
the transgenes can be detected in any of a variety of ways,
depending upon the nature of the product, and include immunoassay,
enzyme assay or visual inspection, for example to detect pigment
formation in the appropriate plant part or cells. Once transgenic
plants have been obtained, they may be grown to produce fiber
having the desired phenotype. The fibers may be harvested, and/or
the seed collected. The seed may serve as a source for growing
additional plants having the desired characteristics. The terms
transgenic plants and transgenic cells include plants and cells
derived from either transgenic plants or transgenic cells.
[0079] The various sequences provided herein may be used as
molecular probes for the isolation of other sequences which may be
useful in the present invention, for example, to obtain related
transcriptional initiation regions from the same or different plant
sources. Related transcriptional initiation regions obtainable from
the sequences provided in this invention will show at least about
60% homology, and more preferred regions will demonstrate an even
greater percentage of homology with the probes.
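The "at least about 60% homology" criterion may be illustrated, under
the assumption that homology is read as percent identity over a
pairwise alignment, by the following sketch. The alignment itself is
assumed to have been produced elsewhere; the function names, the
default threshold argument, and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns that carry the same residue in both rows.

    Both inputs must be rows of the same alignment (equal length, '-' for gaps).
    Columns that are gaps in both rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


def meets_threshold(aligned_a, aligned_b, threshold=60.0):
    """True when percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments, for illustration only.
probe = "ACGTACGTAC"
candidate = "ACGTTCGTAA"
print(percent_identity(probe, candidate), meets_threshold(probe, candidate))
```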
[0080] Of particular importance is the ability to obtain related
transcription initiation control regions having the timing and
tissue parameters described herein. Thus, by employing the
techniques described in this application, and other techniques
known in the art (such as Maniatis, et al., Molecular Cloning: A
Laboratory Manual (Cold Spring Harbor, N.Y.) 1982), other encoding
regions or transcription initiation regions of cellulose synthase
as described in this invention may be determined. The constructs
can also be used in conjunction with plant regeneration systems to
obtain plant cells and plants; thus, the constructs may be used to
modify the phenotype of fiber cells, to provide cotton fibers which
are colored as the result of genetic engineering to heretofore
unavailable hues and/or intensities.
[0081] Various varieties and lines of cotton may find use in the
described methods. Cultivated cotton species include Gossypium
hirsutum and G. barbadense (extra-long staple, or Pima cotton),
which evolved in the New World, and the Old World crops G.
herbaceum and G. arboreum.
[0082] By using encoding sequences to enzymes which control wood
quality and wood product characteristics, i.e., cellulose synthase
and O-methyltransferase (a key enzyme in lignin biosynthesis) the
relative synthesis of cellulose and lignin by plants may be
controlled. The plant genome is transformed with a recombinant
gene construct which contains the gene specifying an enzyme
critical to the synthesis of cellulose or lignin or a lignin
precursor, in either a sense or an antisense orientation. If in an
antisense orientation, the gene will be transcribed so that mRNA
having a sequence complementary to the equivalent mRNA transcribed
from the endogenous gene is expressed, leading to suppression of
the synthesis of lignin or cellulose.
[0083] If the recombinant gene has the lignin enzyme gene in
normal, or "sense" orientation, increased production of the enzyme
may occur when the insert is the full length DNA but suppression
may occur if only a partial sequence is employed.
[0084] Furthermore, the expression of one may be increased in this
manner while the other is reduced. For instance, the production of
cellulose may be increased through the overexpression of cellulose
synthase, while lignin production is reduced. By thus reducing the
relative lignin content, the quality of wood for paper production
would be improved.
EXAMPLES
[0085] The following examples are offered by way of illustration
and not by limitation.
Example 1
cDNA Libraries
[0086] An unamplified cDNA library was prepared in the Lambda
Uni-Zap vector (Stratagene, La Jolla, Calif.) using cDNA derived
from polyA+ mRNA prepared from fibers of Gossypium hirsutum Acala
SJ-2 harvested at 21 DPA, the time at which secondary wall
cellulose synthesis is approaching a maximal rate (13).
Approximately 250 plaques were randomly selected from the cDNA
library, phages purified and plasmids excised from the phage vector
and transformed.
[0087] The resulting clones/inserts were size screened on 0.8%
agarose gels (DNA inserts below 600 bp were excluded).
Example 2
Isolation and Sequencing of cDNA Clones
[0088] Plasmid DNA inserts were randomly sequenced using an Applied
Biosystems (Foster City, Calif.) Model 373A DNA sequencer. A search
of the GenBank EST databank revealed that there were at least 23
rice and 8 Arabidopsis EST clones that contain sequences similar to
the cotton celA1 DNA sequence. EST clone S14965 was obtained from
Y. Nagamura (Rice Genome Research Program, Tsukuba). A series of
deletion mutants were generated and used for DNA sequencing
analysis at the Weizmann Institute of Science (Rehovot).
Example 3
Northern and Southern Analyses
[0089] Cotton plants (G. hirsutum cv. Coker 130) were grown in the
greenhouse and tissues harvested at the appropriate times indicated
and frozen in liquid N.sub.2. Total cotton RNA and cotton genomic
DNA were prepared and subjected to Northern and Southern analyses as
described previously (14).
Example 4
UDP-Glc Binding Studies
[0090] To construct a GST-celA1 protein fusion, a 1.6 kb celA1
DNA fragment containing a putative cytoplasmic domain between the
second and third transmembrane helices was PCR amplified with the
primers ATTGAATTCCTGGGTGTTGGATCAGTT and ATTCTCGAGTGGAAGGGATTGAAA in
a reaction containing 1 ng plasmid DNA (clone 213) as template. The
amplified fragment was unidirectionally cloned into the EcoRI and
XhoI sites of the GST expression vector pGEX4T-3 (Pharmacia),
generating a fusion protein GST-CS containing the amino acids
Ser215 to Leu759 of the cotton celA1 protein. Two celA1 gene
internal PstI sites within the plasmid pGST-CS were used to
generate the deletion mutant pGST-CS.DELTA.U1, which lacks 196
amino acids (and the U1 binding region) from Val252 to Ala447.
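The figures quoted in the preceding paragraph can be cross-checked with
a few lines of arithmetic. The sketch below confirms that each primer
carries the recognition site of the enzyme used for cloning, and that
the stated residue intervals agree with the stated fragment size and
deletion length. The primer sequences and residue numbers are taken
from the text; the helper name residue_count is hypothetical.

```python
# Primers and residue coordinates as quoted in the text.
FORWARD_PRIMER = "ATTGAATTCCTGGGTGTTGGATCAGTT"
REVERSE_PRIMER = "ATTCTCGAGTGGAAGGGATTGAAA"
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
XHOI_SITE = "CTCGAG"   # XhoI recognition sequence


def residue_count(first, last):
    """Number of residues in the inclusive interval first..last."""
    return last - first + 1


fusion_residues = residue_count(215, 759)   # Ser215 to Leu759
deleted_residues = residue_count(252, 447)  # Val252 to Ala447
coding_bp = 3 * fusion_residues             # three bases per codon

print("EcoRI site in forward primer:", ECORI_SITE in FORWARD_PRIMER)
print("XhoI site in reverse primer:", XHOI_SITE in REVERSE_PRIMER)
print("residues in fusion insert:", fusion_residues)
print("coding length of insert (bp):", coding_bp)
print("residues removed in deletion mutant:", deleted_residues)
```

The insert of 545 residues corresponds to 1,635 bp of coding sequence,
consistent with the 1.6 kb fragment, and the Val252 to Ala447 interval
spans the stated 196 amino acids.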
[0091] For the UDP-glc binding assays, .alpha.-.sup.32P-labeled
UDP-glc was prepared as described (15). The two fusion proteins
GST-CS and GST-CS.DELTA.U1 were expressed in E. coli and purified
from inclusion bodies (16). Proteins were suspended in sample
buffer and heated to 100.degree. C. for 5 min, and approximately 50
ng of the two fusion protein products and molecular weight
standards (Bio-Rad) were subjected to SDS-PAGE using 4.5% and 7.5%
acrylamide in the stacking and separating gels, respectively (17). After
electrophoresis, protein transfer to nitrocellulose filters was
carried out in transfer buffer (25 mM Tris, 192 mM glycine and 20%
(v/v) methanol). The filter was briefly rinsed in deionized
H.sub.2O and incubated in PBS buffer for 15 min, then stained with
Ponceau-S in PBS buffer. After washing in deionized H.sub.2O,
protein was further renatured on the filter by incubation in PBS
buffer for 30 min and used directly for binding assays. All binding
buffers contained 50 mM HEPES/KOH (pH 7.3), 50 mM NaCl and 1 mM DTT.
In addition, binding buffers contained either 5 mM MgCl.sub.2 and 5 mM
EGTA (Buffer Mg/EGTA), 5 mM EDTA (Buffer EDTA) or 1 mM CaCl.sub.2 and 2
mM cellobiose (Buffer Ca/CB). Binding reaction was carried out in 7
ml containing .sup.32P-labeled UDP-glc (1.times.10.sup.7 cpm) at
room temperature for 3 hours with constant shaking. Filters were
washed separately three times in 20 ml washing buffer consisting of
50 mM HEPES/KOH (pH 7.3) and 50 mM NaCl for 5 min each, briefly
dried and analyzed on a Bio-imaging analyzer BAS1000 (Fuji).
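For convenience, the transfer buffer composition given above (25 mM
Tris, 192 mM glycine, 20% (v/v) methanol) may be converted to weighed
amounts as sketched below. The molar masses are standard values for
Tris base and glycine; the use of Tris base rather than a Tris salt,
the one-liter batch size, and the function names are assumptions made
for illustration.

```python
# Molar masses in g/mol (Tris base is assumed rather than Tris-HCl).
MOLAR_MASS = {"Tris base": 121.14, "glycine": 75.07}


def grams_needed(conc_mM, molar_mass, volume_L):
    """Grams of solute for a given millimolar concentration and volume."""
    return conc_mM / 1000.0 * molar_mass * volume_L


def transfer_buffer(volume_L=1.0):
    """Amounts for 25 mM Tris, 192 mM glycine, 20% (v/v) methanol."""
    return {
        "Tris base (g)": grams_needed(25, MOLAR_MASS["Tris base"], volume_L),
        "glycine (g)": grams_needed(192, MOLAR_MASS["glycine"], volume_L),
        "methanol (ml)": 0.20 * volume_L * 1000.0,
    }


for component, amount in transfer_buffer(1.0).items():
    print(f"{component}: {amount:.2f}")
```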
Example 5
Identification, Differential Expression and Genomic Analysis of
Cotton CelA Genes
[0092] During the course of screening and sequencing random cDNA
clones from a cotton fiber specific cDNA library prepared from RNA
collected approximately 21 dpa, two cDNA clones were discovered
that initially exhibited small blocks of amino acid homology
to the proteins encoded by the bacterial CelA genes. Clone 213
appeared to be full-length cDNA while another distinct clone, 207,
appeared to be a partial clone relative to the length of 213. These
two clones were partially homologous at the nucleotide and amino
acid levels and were designated celA1 and CelA2, respectively.
[0093] These clones were then utilized as probes for Northern blot
analysis to determine their differential expression in cotton
tissues and developing cotton fiber. FIG. 1 indicates the
expression pattern for the celA1 gene. The celA1 gene encodes a
mRNA of approximately 3.2 kb in length and is expressed at
extremely high levels in developing fiber, beginning at
approximately 17 dpa, the time at which secondary wall cellulose
synthesis is initiated (13). The gene is also expressed at low
levels in all other cotton tissues, most notably in root, flower
and developing seeds. Since regions of these genes are somewhat
homologous at the nucleotide level, gene specific probes were
designed (using the hypervariable regions described in FIG. 3) to
distinguish the specific expression patterns of celA1 and CelA2.
These gene specific probes generated expression patterns (data not
shown) for the two genes identical to that shown in FIG. 1, except
that a very low mRNA level was also detected in the primary wall
phase of fiber development (5-14 dpa) for the CelA2 gene when the
blots were overexposed. The CelA2 gene specific probe also detected
a 3.2 kb mRNA, analogous in size to the mRNA specified by the gene
for celA1. Messenger RNAs for both genes exhibit a characteristic
degradation pattern similar to other mRNAs specifically expressed
late in fiber development (J. Pear, unpublished observations) and
this degradation is not a result of the integrity of the mRNA
preparations (14). We estimate that both cotton CelA genes are
expressed in developing fiber approximately 500 times their level
of expression in other cotton tissues and that they constitute
approximately 1-2% of the 24 dpa fiber mRNA.
[0094] In order to estimate the number of CelA genes in the cotton
genome, Southern analysis was performed utilizing both CelA cDNAs
independently as probes (FIG. 2). Although the two cotton genes are
fairly non-homologous at the nucleotide level over their entire
length, there are regions of homology (the H1, H2 and H3 regions
described below) and it was thought these regions could be useful
in identifying other cotton CelA genes. FIG. 2 indicates that the
celA1 cDNA probe will hybridize, albeit weakly, to the CelA2
genomic equivalent and vice versa. The HindIII pattern for both
genes and cDNA probes is particularly discriminating. There are
also a number of other weakly hybridizing bands in these digests and
from these data we estimate that the cotton CelA genes constitute a
small family of approximately four genes.
Homology of Plant and Bacterial CelA Gene Products
[0095] In addition to the two similar cotton CelA genes, a
homologous cDNA clone was discovered in the dbEST databank* of rice
and Arabidopsis ESTs. Accession No. D48636, the rice clone having
the longest insert, was obtained and sequenced, and the homology
comparisons with bacterial proteins reported here also include
results with the rice CelA. FIG. 3 shows the results of a multiple
alignment of the deduced amino acid sequences from the three plant
CelA genes and four bacterial CelA genes from A. xylinum (AcsAB and
BcsA), E. coli, and A. tumefaciens. FIG. 4 shows hydropathy plots
(18) of cotton celA1 similarly aligned with two bacterial CelA
proteins and serves as a more general summary of the overall
homologies.
*The following accession numbers were identified as showing
homology with cotton celA1. For rice: D48636, D41261,
D40691, D46824, D47622, D47175, D41766, D41986, D24655, D23732,
D24375, D47732, D47821, D47850, D47494, D24964, D24862, D24860,
D24711, D23841, D48053, D48612, D40673; for Arabidopsis: T45303,
T45414, H76149, H36985, Z30729, H36425, T45311, A35212.
[0096] Of the plant genes, only the cotton celA1 appears to be a
full-length clone of 3.2 kb exhibiting an open reading frame that
could potentially code for a polypeptide of 109,586 Da, with a pI of
6.4 and four potential sites of N-glycosylation. Comparison of the
N-terminal region of cotton celA1 with bacterial genes indicates
that the plant protein has an extended N-terminus similar in length
and hydropathy profile, but with only poor amino acid sequence
homology to the A. tumefaciens CelA protein. In general, sequence
homology of plant and bacterial genes in both the N-terminal and
C-terminal regions is poor. However, although overall similarity
comparing plant to bacterial proteins is less than 25%, three
homologous regions were identified, called H-1, H-2, and H-3, where
the sequence similarity rises to 50-60% at the amino acid level.
Interspersed between these regions of homology are two
plant-specific regions not found at all in the bacterial proteins.
Sequences in the first of these insertions are highly conserved in
the plant genes (P-CR), while the second interspersed region seems
to be a hypervariable region (HVR), for there is considerable
sequence divergence among the plant proteins analyzed.
[0097] None of the plant or bacterial CelA proteins contains
obvious signal sequences even though they are presumably
transmembrane proteins (4). However, the overall profiles suggest
two potential transmembrane helices in the N-terminal and six in
the C-terminal region of the cotton celA1 that could anchor the
protein in the membrane (see arrows FIG. 3 and also panel A of FIG.
5). The amino acid sequence positions for these predicted
transmembrane helices are: A (169-187), B (200-218), C (759-777), D
(783-801), E (819-837), F (870-888), G (903-921), H (933-951). The
central portions of the proteins are more hydrophilic and are
predicted to reside in the cytoplasm and contain the site(s) of
catalysis. More detailed inspection of these hydrophilic stretches
reveals four particularly conserved sub-regions (marked U-1 through
U-4 on FIGS. 3-4) that contain the conserved asp (D) residues (in
U-1-3) and the motif QXXRW (in U-4) that have been proposed (12) to
be involved in substrate binding and/or catalysis.
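The helix coordinates listed above lend themselves to a simple tabular
check, and the QXXRW motif to a pattern search. The following sketch
computes the length of each predicted helix and of the hydrophilic
stretch lying between helices B and C, and shows how the motif might be
located in a protein sequence. The helix coordinates are those given in
the text; the function names and the example peptide are hypothetical.

```python
import re

# Predicted transmembrane helices of cotton celA1 (inclusive residue ranges).
HELICES = {
    "A": (169, 187), "B": (200, 218), "C": (759, 777), "D": (783, 801),
    "E": (819, 837), "F": (870, 888), "G": (903, 921), "H": (933, 951),
}


def span(first, last):
    """Number of residues in the inclusive interval first..last."""
    return last - first + 1


helix_lengths = {name: span(*bounds) for name, bounds in HELICES.items()}

# Hydrophilic region between the end of helix B and the start of helix C.
central_start = HELICES["B"][1] + 1
central_end = HELICES["C"][0] - 1
central_length = span(central_start, central_end)

# Q, any two residues, then R and W.
QXXRW = re.compile(r"Q..RW")


def find_qxxrw(protein):
    """Return (1-based position, matched text) for each QXXRW occurrence."""
    return [(m.start() + 1, m.group()) for m in QXXRW.finditer(protein.upper())]


# Hypothetical peptide chosen only because it contains the motif.
example_peptide = "HQVLRWALG"

print(helix_lengths)
print(central_start, central_end, central_length)
print(find_qxxrw(example_peptide))
```

Each predicted helix spans 19 residues, and the central region runs
from residue 219 to residue 758, which lies within the Ser215 to
Leu759 fragment used for the GST fusion in Example 4.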
[0098] Binding of UDP-glucose. Further evidence that the proteins
encoded by these plant genes are CelA homologs comes from our
demonstration that a DNA segment encoding the central region of the
cotton celA1 protein, over-expressed in E. coli, binds UDP-glc. We
subcloned a 1.6 kb fragment of the cotton celA1 clone to create a
hybrid gene that encodes GST fused to the celA1 sequence encoding
amino acid residues 215-759 of the celA1 protein (FIG. 5a). This
region spans U-1 through U-4 that are suspected to be critical for
UDP-glc binding. As a control, another GST fusion was created using
a 1.0 kb PstI fragment that had the U-1 region deleted and might
not be predicted to bind UDP-glc. The fusion proteins were
overexpressed in E. coli, purified, and shown to have the predicted
sizes of approximately 87 and 64 kD, respectively (FIG. 5b). The
purified proteins were then subjected to SDS-PAGE, and blotted to
nitrocellulose. Blotted proteins were renatured, and incubated with
.sup.32P-UDP-glc in order to test for binding (FIG. 5b). As
predicted, the 87 kD GST-celA1 fusion does indeed bind UDP-glc in a
Mg.sup.2+-dependent manner, while the shorter fusion with the U-1
domain deleted did not show any binding. (Although not observed in
the experiment shown, in some experiments very weak labeling in the
presence of Ca.sup.2+ could be observed.) As further controls, note
that the molecular weight standards BSA and ovalbumin, proteins
lacking UDP-glc binding sites, show no interaction with UDP-glc,
while phosphorylase b, an enzyme inhibited by UDP-glc (19), binds
this substrate.
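The predicted sizes of the two fusion proteins can be approximated
from residue counts. The sketch below is a rough estimate only: it
assumes a GST tag of about 26 kD and an average residue mass of about
110 Da, neither of which is stated in this specification, and it
ignores any linker residues contributed by the vector. The residue
intervals are those given in Example 4.

```python
GST_KD = 26.0            # assumed approximate mass of the GST tag, in kD
KD_PER_RESIDUE = 0.110   # assumed average mass per amino acid residue, in kD


def fusion_mass_kd(n_residues):
    """Approximate mass of GST fused to a polypeptide of n_residues."""
    return GST_KD + n_residues * KD_PER_RESIDUE


full_insert = 759 - 215 + 1   # residues 215-759 of celA1
deleted = 447 - 252 + 1       # residues 252-447 removed in the deletion mutant

full_kd = fusion_mass_kd(full_insert)
short_kd = fusion_mass_kd(full_insert - deleted)
print(f"full fusion: about {full_kd:.0f} kD")
print(f"deletion mutant: about {short_kd:.0f} kD")
```

Under these assumptions the estimates come to about 86 kD and 64 kD,
in reasonable agreement with the approximately 87 and 64 kD reported.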
[0099] FIG. 6 provides the encoding sequence to the cDNA to celA1
(start ATG at .about.base 179), while FIG. 7 provides the encoding
sequence to approximately the 3' two-thirds of the cDNA to
celA2.
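The position of the start codon can be checked against the sequence
listing, on the assumption that sequence 1 of the listing corresponds
to FIG. 6. The sketch below takes the printed fragment covering bases
161 to 200, strips the interleaved position counter, and reports the
coordinate of the first ATG. The function name parse_listing is
hypothetical.

```python
def parse_listing(fragment):
    """Join the base blocks of a listing fragment, dropping position counters."""
    return "".join(tok for tok in fragment.split() if tok.isalpha()).lower()


# Bases 161-200 of sequence 1 as printed in the listing, counter included.
FRAGMENT = "taagtacttt ttttgagaat 180 gatggaatct ggggttcctg"
FRAGMENT_START = 161  # 1-based coordinate of the first base in FRAGMENT

bases = parse_listing(FRAGMENT)
first_atg = FRAGMENT_START + bases.find("atg")
print(len(bases), first_atg)
```

The first ATG in this window falls at base 179, in agreement with the
start position stated above.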
Example 6
Genomic DNA
[0100] cDNA for the cellulose synthase clones was used to probe for
genomic clones. For both, full length genomic DNA was obtained from
a library made using the lambda dash 2 vector from Stratagene.TM.,
which was used to construct a genomic DNA library from cotton
variety Coker 130 (Gossypium hirsutum cv. Coker 130), using DNA
obtained from germinating seedlings.
[0101] The cotton genomic library was probed with a cellulose
synthase probe and genomic phage candidates were identified and
purified. FIG. 8 provides an approximately 1 kb sequence of the
cellulose synthase promoter region which is immediately 5' to the
celA1 encoding region. The start of the cellulose synthase enzyme
encoding region is at the ATG at base number 954.
Example 7
Cotton Transformation
Explant Preparation
[0102] Promoter constructs comprising the cellulose synthase
promoter sequences of celA1 can be prepared. Coker 315 cotton seeds
are surface disinfected by placing in 50% Clorox (2.5% sodium
hypochlorite solution) for 20 minutes and rinsing 3 times in
sterile distilled water. Following surface sterilization, seeds are
germinated in 25.times.150 sterile tubes containing 25 ml
1/2.times.MS salts: 1/2.times.B5 vitamins: 1.5% glucose: 0.3%
gelrite. Seedlings are germinated in the dark at 28.degree. C. for
7 days. On the seventh day seedlings are placed in the light at
28.+-.2.degree. C.
Cocultivation and Plant Regeneration
[0103] Single colonies of A. tumefaciens strain 2760 containing
binary plasmids pCGN2917 and pCGN2926 are transferred to 5 ml of
MG/L broth and grown overnight at 30.degree. C. Bacteria cultures
are diluted to 1.times.10.sup.8 cells/ml with MG/L just prior to
cocultivation. Hypocotyls are excised from eight day old seedlings,
cut into 0.5-0.7 cm sections and placed onto tobacco feeder plates
(Horsch et al. 1985). Feeder plates are prepared one day before use
by plating 1.0 ml tobacco suspension culture onto a petri plate
containing Callus Initiation Medium CIM without antibiotics (MS
salts: B5 vitamins: 3% glucose: 0.1 mg/L 2,4-D: 0.1 mg/L kinetin:
0.3% gelrite, pH adjusted to 5.8 prior to autoclaving). A sterile
filter paper disc (Whatman #1) is placed on top of the feeder
cells prior to use. After all sections are prepared, each section
is dipped into an A. tumefaciens culture, blotted on sterile paper
towels and returned to the tobacco feeder plates.
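The dilution of the overnight culture to the stated working density
follows the usual relation C1 x V1 = C2 x V2. The sketch below
illustrates the calculation; the overnight culture density and the
final volume are hypothetical values, since neither is stated in this
example, and the function name is likewise hypothetical.

```python
def dilution_volumes(stock_cells_per_ml, target_cells_per_ml, final_volume_ml):
    """Return (stock_ml, diluent_ml) satisfying C1 * V1 = C2 * V2."""
    if target_cells_per_ml > stock_cells_per_ml:
        raise ValueError("target density exceeds stock density")
    stock_ml = target_cells_per_ml * final_volume_ml / stock_cells_per_ml
    return stock_ml, final_volume_ml - stock_ml


# Hypothetical overnight density of 2e9 cells/ml, diluted to 1e8 cells/ml.
stock_ml, broth_ml = dilution_volumes(2e9, 1e8, 10.0)
print(f"culture: {stock_ml:.2f} ml, MG/L broth: {broth_ml:.2f} ml")
```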
[0104] Following two days of cocultivation on the feeder plates,
hypocotyl sections are placed on fresh Callus Initiation Medium
containing 75 mg/L kanamycin and 500 mg/L carbenicillin. Tissue is
incubated at 28.+-.2.degree. C., 30 .mu.E, 16:8 light:dark period for 4
weeks. At four weeks the entire explant is transferred to fresh
callus initiation medium containing antibiotics. After two weeks on
the second pass, the callus is removed from the explants and split
between Callus Initiation Medium and Regeneration Medium (MS salts:
40 mM KNO.sub.3: 10 mM NH.sub.4Cl: B5 vitamins: 3% glucose: 0.3%
gelrite: 400 mg/L carbenicillin: 75 mg/L kanamycin).
[0105] Embryogenic callus is identified 2-6 months following
initiation and is subcultured onto fresh regeneration medium.
Embryos are selected for germination, placed in static liquid
Embryo Pulsing Medium (Stewart and Hsu medium: 0.01 mg/L NAA: 0.01
mg/L kinetin: 0.2 mg/L GA.sub.3) and incubated overnight at 30.degree.
C. The embryos are blotted on paper towels and placed into Magenta
boxes containing 40 ml of Stewart and Hsu medium solidified with
Gelrite. Germinating embryos are maintained at 28.+-.2.degree. C.,
50 .mu.E m.sup.-2 s.sup.-1, 16:8 photoperiod. Rooted plantlets are
transferred to soil and established in the greenhouse.
[0106] Cotton growth conditions in growth chambers are as follows:
16 hour photoperiod, temperature of approximately 80-85.degree. F.,
light intensity of approximately 500 .mu.Einsteins. Cotton growth
conditions in greenhouses are as follows: 14-16 hour photoperiod
with light intensity of at least 400 .mu.Einsteins, day temperature
90-95.degree. F., night temperature 70-75.degree. F., relative
humidity of approximately 80%.
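For laboratories working in degrees Celsius, the greenhouse
temperature ranges stated above in degrees Fahrenheit may be converted
as sketched below. The function name is hypothetical, and the
conversion is offered only as a convenience.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Greenhouse day and night ranges quoted in the text, in degrees Fahrenheit.
ranges_f = {"day": (90, 95), "night": (70, 75)}

for period, (low, high) in ranges_f.items():
    low_c = fahrenheit_to_celsius(low)
    high_c = fahrenheit_to_celsius(high)
    print(f"{period}: {low_c:.1f}-{high_c:.1f} degrees C")
```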
Plant Analysis
[0107] Flowers from greenhouse grown T1 plants are tagged at
anthesis in the greenhouse. Squares (cotton flower buds), flowers,
bolls etc. are harvested from these plants at various stages of
development and assayed for observable phenotype or tested for
enzyme activity.
Example 8
Transformation of Tree Species
[0108] Numerous methods are known to the art for transforming
forest tree species; for example, U.S. Pat. No. 5,654,190 discloses
a process for producing a transgenic plant belonging to the genus
Populus, section Leuce.
[0109] The above results demonstrate how the cellulose synthase
cDNA may be used to alter the phenotype of a transgenic plant cell,
and how the promoter may be used to modify transgenic cotton fiber
cells.
[0110] All publications and patent applications cited in this
specification are herein incorporated by reference as if each
individual publication or patent application were specifically and
individually indicated to be incorporated by reference.
[0111] Although the foregoing invention has been described in some
detail, by way of illustration and example for purposes of clarity
and understanding, it will be readily apparent to those of ordinary
skill in the art that certain changes and modifications may be made
thereto, without departing from the spirit or scope of the appended
claims.
Sequence CWU 1
1
12 1 3328 DNA Artificial Sequence Synthetic Oligonucleotide 1
cgaaattaac cctcactaaa gggaacaaaa gctggagctc caccgcggtg gcggccgctc
60 tagaactagt ggatcccccg ggctgcagga attcggcacg agggttagca
tattgtttgt 120 agcattgggt ttttttctca aggaagaaga aggagaaaga
taagtacttt ttttgagaat 180 gatggaatct ggggttcctg tttgccacac
ttgtggtgaa catgttgggt tgaatgttaa 240 tggtgaacct tttgtggctt
gccatgaatg taatttccct atttgtaaga gttgttttga 300 gtatgatctt
aaggaaggac gaaaagcttg cttgcgttgt ggtagtccat atgatgaaaa 360
cctgttggac gatgtcgaga aggccaccgg cgatcaatcg acaatggctg cacatttgaa
420 caagtctcag gatgttggaa ttcatgcaag acatatcagc agtgtgtcta
cattggatag 480 tgaaatggct gaagacaatg ggaattcgat ttggaagaac
agggtggaaa gttggaaaga 540 aaagaagaac aagaagaaga agcctgcaac
aactaaggtt gaaagagagg ctgaaatccc 600 acctgagcaa caaatggaag
ataaaccggc accggatgct tcccagcccc tctcgactat 660 aattccaatc
ccgaaaagca gacttgcacc ataccgaacc gtgatcatta tgcgattgat 720
cattcttggt cttttcttcc attatcgagt aacaaacccc gttgacagtg cttttggact
780 gtggctcact tcagtcatat gtgaaatctg gtttgcattt tcctgggtgt
tggatcagtt 840 ccctaagtgg tatcctgtta acagggaaac atacattgac
agactatctg caagatatga 900 aagagaaggt gaacctgatg aacttgctgc
agttgacttc ttcgtgagta cagtggatcc 960 attgaaagag cctccattga
ttactgccaa tactgtgctt tccatccttg ccttggacta 1020 cccggtggat
aaggtctctt gttatatatc tgatgatggt gcggccatgc tgacatttga 1080
atctctagta gaaacagccg actttgcaag aaagtgggtt ccattctgca aaaaattttc
1140 cattgaaccc cgggcacctg agttttactt ctcacagaag attgattact
tgaaagataa 1200 agtgcagccc tcttttgtaa aagaacgtag agctatgaaa
agagattatg aagagtacaa 1260 aattcgaatc aatgctttag ttgcaaaggc
tcagaaaaca cctgatgaag gatggacaat 1320 gcaagatgga acttcttggc
caggaaataa cccgcgtgat caccctggca tgattcaggt 1380 tttccttgga
tatagtggtg ctcgtgacat cgaaggaaat gaacttcctc gactggttta 1440
cgtctctaga gagaagagac ctggctacca acaccacaaa aaggctggtg ctgaaaatgc
1500 tttggttagg gtgtctgcag ttcttacaaa tgctcccttc atcctcaatc
ttgattgtga 1560 ccactatgtt aacaatagca aggcagttag ggaggcaatg
tgcttcttga tggacccaca 1620 agttggtcga gatgtatgct atgtgcagtt
tcctcaaaga tttgatggca tagataggag 1680 tgatcgatat gccaatagga
acacagtttt ctttgatgtt aacatgaaag gtcttgatgg 1740 aatccaaggg
ccagtttatg tgggaacagg ttgtgttttc aataggcaag cactttatgg 1800
ctatggtcca ccttcaatgc caagttttcc caagtcatcc tcctcatctt gctcgtgttg
1860 ctgcccgggc aagaaggaac ctaaagatcc atcagagctt tatagggatg
caaaacggga 1920 agaacttgat gctgccatct ttaaccttag ggaaattgac
aattatgatg agtatgaaag 1980 atcaatgttg atctctcaaa caagctttga
gaaaactttt ggcttatctt cagtcttcat 2040 tgaatctaca ctaatggaga
atggaggagt ggctgaatct gccaaccctt ccacactaat 2100 caaggaagca
attcatgtca tcagctgtgg ctatgaagag aagactgcat gggggaaaga 2160
gattggatgg atatatggtt cagtcactga ggatatctta accggcttca aaatgcactg
2220 ccgaggatgg agatcgattt actgcatgcc cttaaggcca gcattcaaag
gatctgcacc 2280 catcaatctg tctgatcggt tgcaccaggt tcttcgatgg
gctcttggat ctgttgaaat 2340 tttcctaagc aggcattgcc ctctatggta
tggctttgga ggtggtcgtc ttaaatggct 2400 tcaaagacta gcatatataa
acaccattgt ctatcctttc acatcccttc cactcattgc 2460 ctattgttca
ctaccagcaa tctgtcttct cacaggaaaa tttatcatac caacgctctc 2520
aaacctggca agtgttctct ttcttggcct tttcctttcc attatcgtga ctgctgttct
2580 cgagctccga tggagtggtg tcagcattga ggacttatgg cgtaacgagc
agttttgggt 2640 catcggtggc gtttcagccc atctctttgc cgtcttccaa
ggtttcctta agatgcttgc 2700 gggcattgac accaacttta ctgtcactgc
caaagcagct gatgatgcag attttggtga 2760 gctctacatt gtgaaatgga
ctacacttct aatccctcca acaacactcc tcatcgtcaa 2820 catggttggt
gtcgttgccg gattctccga tgccctcaac aaagggtacg aagcttgggg 2880
accactcttt ggcaaagtgt tcttttcctt ctgggtcatc ctccatcttt atccattcct
2940 caaaggtctt atgggacgcc aaaacaggac accaaccatt gttgtccttt
ggtcagtgtt 3000 gttggcttct gtcttctctc ttgtttgggt tcggatcaac
ccgtttgtca gcaccgccga 3060 tagcaccacc gtgtcacaga gctgcatttc
cattgattgt tgatgatatt atgtgtttct 3120 tagaattgaa atcattgcaa
gtaagtggac tgaaacatgt ctattgacta agttttgaac 3180 agtttgtacc
cattttattc ttagcagtgt gtaattttcc taaacaatgc tatgaactat 3240
acatatttca ttgatattta cattaaatga aactacatca gtctgcagaa aaaaaaaaaa
3300 aaaaaaaaac tcgagggggg gcccggta 3328 2 4612 DNA Artificial
Sequence Synthetic Oligonucleotide 2 aactagtgga tcccccgggc
tgcaggaatt cggcacgagc gaggagatgg gttccgtttt 60 gtaagaagca
ttgatcacct agggggcccg acgtccttaa gccgtgctcg ctcctctacc 120
caaggcaaaa cattcttcgt taatgttgag cccagggcgc cggagtttta tttcaatgag
180 aagattgatt atttgaagga caaggtccat attacaactc gggtcccgcg
gcctcaaaat 240 aaagttactc ttctaactaa taaacttcct gttccaggta
cctagctttg ttaaagaacg 300 gagagccatg aaaagggaat atgaagaatt
taaagtaagg atcaatgcat ggatcgaaac 360 aatttcttgc ctctcggtac
ttttccctta tacttcttaa atttcattcc tagttacgta 420 tagtagcaaa
agctcagaag aaaccagaag aaggatgggt gatgcaagat ggcaccccat 480
ggcccggaaa atcatcgttt tcgagtcttc tttggtcttc ttcctaccca ctacgttcta
540 ccgtggggta ccgggccttt taacactcgt gatcatcctg gaatgattca
ggtctatcta 600 ggaagtgccg gtgcactcga tgtggatggc attgtgagca
ctagtaggac cttactaagt 660 ccagatagat ccttcacggc cacgtgagct
acacctaccg aaagagctgc ctcgacttgt 720 ctatgtttct cgtgagaaac
gacctggtta tcagcaccat aagaaagccg tttctcgacg 780 gagctgaaca
gatacaaaga gcactctttg ctggaccaat agtcgtggta ttctttcggc 840
gtgctgagaa tgctctggtt cgagtttctg cagtgcttac taatgcaccc ttcatattga
900 atctggattg cacgactctt acgagaccaa gctcaaagac gtcacgaatg
attacgtggg 960 aagtataact tagacctaac tgatcattac atcaacaata
gcaaggccat gagggaagcg 1020 atgtgctttt taatggatcc tcagtttgga
actagtaatg tagttgttat cgttccggta 1080 ctcccttcgc tacacgaaaa
attacctagg agtcaaacct aagaagcttt gttatgttca 1140 atttccacag
agatttgatg gtattgatcg tcatgatcga tatgctaatc ttcttcgaaa 1200
caatacaagt taaaggtgtc tctaaactac cataactagc agtactagct atacgattag
1260 gaaatgttgt cttctttgat atcaacatgt tgggattaga tggacttcaa
ggccctgtat 1320 atgtaggcac ctttacaaca gaagaaacta tagttgtaca
accctaatct acctgaagtt 1380 ccgggacata tacatccgtg agggtgtgtt
ttcaacaggc aggcattgta tggctacgat 1440 ccaccagtct ctgagaaacg
accaaagatg tcccacacaa aagttgtccg tccgtaacat 1500 accgatgcta
ggtggtcaga gactctttgc tggtttctac acatgtgatt gctggccttc 1560
ttggtgttgc tgttgttgcg gaggttctag gaagaaatca aagaagaaag tgtacactaa
1620 cgaccggaag aaccacaacg acaacaacgc ctccaagatc cttctttagt
ttcttctttc 1680 gtgaaaagaa gggcttactc ggaggtcttt tatacggaaa
aaagaagaag atgatgggca 1740 aaaactatgt cacttttctt cccgaatgag
cctccagaaa atatgccttt tttcttcttc 1800 tactacccgt ttttgataca
gaaaaaaggg tctgcaccag tctttgatct cgaagaaatc 1860 gaagaagggc
ttgaaggata cgaagaattg cttttttccc agacgtggtc agaaactaga 1920
gcttctttag cttcttcccg aacttcctat gcttcttaac gagaaatcga cattaatgtc
1980 gcagaagaat ttcgagaaac gattcggaca atcaccggtt ttcattgcct
ctctttagct 2040 gtaattacag cgtcttctta aagctctttg ctaagcctgt
tagtggccaa aagtaacgga 2100 caactttgat ggaaaatggt ggccttcctg
aaggaactaa ttccacatca ctgattaaag 2160 aggccattca gttgaaacta
ccttttacca ccggaaggac ttccttgatt aaggtgtagt 2220 gactaatttc
tccggtaagt cgtaattagc tgtggttatg aagaaaaaac tgagtggggc 2280
aaagagatcg gatggattta tgggtcggtg gcattaatcg acaccaatac ttcttttttg
2340 actcaccccg tttctctagc ctacctaaat acccagccac acggaagata
tattaacagg 2400 tttcaagatg cattgtagag ggtggaaatc ggtttattgt
gtaccgaaaa tgccttctat 2460 ataattgtcc aaagttctac gtaacatctc
ccacctttag ccaaataaca catggctttt 2520 gaccggcatt caaagggtcc
gctccaatca atctctcgga tcggttgcac caagttttga 2580 gatgggcact
ctggccgtaa gtttcccagg cgaggttagt tagagagcct agccaacgtg 2640
gttcaaaact ctacccgtga tggttctgta gaaattttcc ttagtcgtca ctgtccactt
2700 tggtatggtt atggtggaaa actgaaatgg accaagacat ctttaaaagg
aatcagcagt 2760 gacaggtgaa accataccaa taccaccttt tgactttacc
ctcgagaggc ttgcttatat 2820 caacaccatt gtttaccctt tcacctcgat
ccctttactc gcctattgta gagctctccg 2880 aacgaatata gttgtggtaa
caaatgggaa agtggagcta gggaaatgag cggataacat 2940 ctattccagc
tgtttgtctt ctcaccggca aattcatcat tccaactcta agcaacctta 3000
caagtgtgtg gataaggtcg acaaacagaa gagtggccgt ttaagtagta aggttgagat
3060 tcgttggaat gttcacacac gttcttggca cttttcctct ccatcattgc
aactggagtg 3120 cttgaacttc gatggagcgg ggttagcatc caagaaccgt
gaaaaggaga ggtagtaacg 3180 ttgacctcac gaacttgaag ctacctcgcc
ccaatcgtag caagactggt ggcgcaatga 3240 acaattctgg gtgatcggag
gtgtctccgc ccatcttttt gctgtcttcc gttctgacca 3300 ccgcgttact
tgttaagacc cactagcctc cacagaggcg ggtagaaaaa cgacagaagg 3360
agggcctcct caaagtccta gctggagtag acaccaactt caccgtaaca gcaaaagcag
3420 cagacgatac tcccggagga gtttcaggat cgacctcatc tgtggttgaa
gtggcattgt 3480 cgttttcgtc gtctgctatg agaattcggt gaactttatc
tcttcaaatg gacaactctc 3540 ttaatccctc ccacaactct gataatactg
tcttaagcca cttgaaatag agaagtttac 3600 ctgttgagag aattagggag
ggtgttgaga ctattatgac aacatggtcg gagtcgtggc 3660 cggagtttca
gacgcaatca acaacggcta tggttcatgg ggtccattgt ttgtaccagc 3720
ctcagcaccg gcctcaaagt ctgcgttagt tgttgccgat accaagtacc ccaggtaaca
3780 tcggcaaact gttcttcgca ttctgggtca ttcttcatct ttacccattc
ctcaaaggtt 3840 tgatggggag agccgtttga caagaagcgt aagacccagt
aagaagtaga aatgggtaag 3900 gagtttccaa actacccctc acaaaacagg
acgcccacca ttgttgtgct ttggtccata 3960 cttttggcat cgattttctc
actggtttgg tgttttgtcc tgcgggtggt aacaacacga 4020 aaccaggtat
gaaaaccgta gctaaaagag tgaccaaacc gtacggatcg atcccttctt 4080
gcccaaacaa acaggtccag ttcttaaaca atgtggcgtg gagtgctaaa catgcctagc
4140 tagggaagaa cgggtttgtt tgtccaggtc aagaatttgt tacaccgcac
ctcacgattt 4200 tggtgtttta caaacctttc ttattatttt attttccctt
tttgccacta ctgttgattt 4260 gctgtgattc accacaaaat gtttggaaag
aataataaaa taaaagggaa aaacggtgat 4320 gacaactaaa cgacactaag
taaaagggat ttatcttgtt tgtaaaaagt ctcctatgat 4380 tttgttggtt
caatttaatt tctatatggt attttcccta aatagaacaa acatttttca 4440
gaggatacta aaacaaccaa gttaaattaa agatatacca aaaaaaatat ttctttaaat
4500 taactataaa aaaaaaaaaa aaaaactcga gggggggccc ggtacctttt
tttataaaga 4560 aatttaattg atattttttt tttttttttt tgagctcccc
cccgggccat gg 4612 3 1063 DNA Artificial Sequence Synthetic
Oligonucleotide 3 gggtgattga ctaaaatttt taaaaatttt gaaggtttta
atgagaattt ttaaacaatt 60 ttgtatgtta aactaaaact ttcaaaaaaa
attttgaaag gtttaatgag aattttaaaa 120 attttgagcg ggctaattaa
aatttttaaa aaatgtataa taaaaaaatt caaaaactct 180 ttgaggccat
aaaggtcatc gggcccttaa atacatcagc ttgttgtttc ctcatattac 240
tcatgttatt tcagttaaca gatataatgg ctatcatttg atttaggagt gaaatctaaa
300 aattcgaaaa gtataaaaac taaaaaggat taaattgaag aacattaatt
aaatcaacaa 360 tttactattc caataacaga attttgagtt aacaaattta
actgctacaa tttggttcga 420 gaccaaaatt acaaaacccg aaaagtattg
ggactaaaat tgatcaaatt agagtacatg 480 ggttaaattc acaacttact
tatggtacaa ggattaatag cataatttct ccttaggcaa 540 atgccagtta
gttaaagatg taccttgccc aaccgaaagc ttccttaaac ttcccgcaat 600
tttttaaatt tctttttccc ttagaaaaaa gaacaaaaat gtaagctttg cttgtcagag
660 atttctctgc aaatacattg acaccaacaa cctaccctcc attacactac
caaccggcct 720 tccccttcaa cttttcttca ccattacaac atgcctatct
ccacccttag cccaacatgc 780 acttatatct tgtgtttggt tgtttttctt
tttcatataa aaacacacac caagacacaa 840 aggtattgag aggtaagtag
agggaaagac cctttggtta gcatattgtt tgtagcattg 900 ggttttttct
caaggaagaa gaaggagaaa gataagtact ttttttgaga atgatggaat 960
ctggggttcc tgtttgccac acttgtggtg aacatgttgg gttgaatgta agccgaattc
1020 cagcacactg gcggccgtta ctagtggatc cgcgctcggt acc 1063 4 27 DNA
Artificial Sequence Synthetic Oligonucleotide 4 attgaattcc
tgggtgttgg atcagtt 27 5 24 DNA Artificial Sequence Synthetic
Oligonucleotide 5 attctcgagt ggaagggatt gaaa 24 6 974 PRT Gossypium
hirsutum 6 Met Met Glu Ser Gly Val Pro Val Cys His Thr Cys Gly Glu
His Val 1 5 10 15 Gly Leu Asn Val Asn Gly Glu Pro Phe Val Ala Cys
His Glu Cys Asn 20 25 30 Phe Pro Ile Cys Lys Ser Cys Phe Glu Tyr
Asp Leu Lys Glu Gly Arg 35 40 45 Lys Ala Cys Leu Arg Cys Gly Ser
Pro Tyr Asp Glu Asn Leu Leu Asp 50 55 60 Asp Val Glu Lys Ala Thr
Gly Asp Gln Ser Thr Met Ala Ala His Leu 65 70 75 80 Asn Lys Ser Gln
Asp Val Gly Ile His Ala Arg His Ile Ser Ser Val 85 90 95 Ser Thr
Leu Asp Ser Glu Met Ala Glu Asp Asn Gly Asn Ser Ile Trp 100 105 110
Lys Asn Arg Val Glu Ser Trp Lys Glu Lys Lys Asn Lys Lys Lys Lys 115
120 125 Pro Ala Thr Thr Lys Val Glu Arg Glu Ala Glu Ile Pro Pro Glu
Gln 130 135 140 Gln Met Glu Asp Lys Pro Ala Pro Asp Ala Ser Gln Pro
Leu Ser Thr 145 150 155 160 Ile Ile Pro Ile Pro Lys Ser Arg Leu Ala
Pro Tyr Arg Thr Val Ile 165 170 175 Ile Met Arg Leu Ile Ile Leu Gly
Leu Phe Phe His Tyr Arg Val Thr 180 185 190 Asn Pro Val Asp Ser Ala
Phe Gly Leu Trp Leu Thr Ser Val Ile Cys 195 200 205 Glu Ile Trp Phe
Ala Phe Ser Trp Val Leu Asp Gln Phe Pro Lys Trp 210 215 220 Tyr Pro
Val Asn Arg Glu Thr Tyr Ile Asp Arg Leu Ser Ala Arg Tyr 225 230 235
240 Glu Arg Glu Gly Glu Pro Asp Glu Leu Ala Ala Val Asp Phe Phe Val
245 250 255 Ser Thr Val Asp Pro Leu Lys Glu Pro Pro Leu Ile Thr Ala
Asn Thr 260 265 270 Val Leu Ser Ile Leu Ala Leu Asp Tyr Pro Val Asp
Lys Val Ser Cys 275 280 285 Tyr Ile Ser Asp Asp Gly Ala Ala Met Leu
Thr Phe Glu Ser Leu Val 290 295 300 Glu Thr Ala Asp Phe Ala Arg Lys
Trp Val Pro Phe Cys Lys Lys Phe 305 310 315 320 Ser Ile Glu Pro Arg
Ala Pro Glu Phe Tyr Phe Ser Gln Lys Ile Asp 325 330 335 Tyr Leu Lys
Asp Lys Val Gln Pro Ser Phe Val Lys Glu Arg Arg Ala 340 345 350 Met
Lys Arg Asp Tyr Glu Glu Tyr Lys Ile Arg Ile Asn Ala Leu Val 355 360
365 Ala Lys Ala Gln Lys Thr Pro Asp Glu Gly Trp Thr Met Gln Asp Gly
370 375 380 Thr Ser Trp Pro Gly Asn Asn Pro Arg Asp His Pro Gly Met
Ile Gln 385 390 395 400 Val Phe Leu Gly Tyr Ser Gly Ala Arg Asp Ile
Glu Gly Asn Glu Leu 405 410 415 Pro Arg Leu Val Tyr Val Ser Arg Glu
Lys Arg Pro Gly Tyr Gln His 420 425 430 His Lys Lys Ala Gly Ala Glu
Asn Ala Leu Val Arg Val Ser Ala Val 435 440 445 Leu Thr Asn Ala Pro
Phe Ile Leu Asn Leu Asp Cys Asp His Tyr Val 450 455 460 Asn Asn Ser
Lys Ala Val Arg Glu Ala Met Cys Phe Leu Met Asp Pro 465 470 475 480
Gln Val Gly Arg Asp Val Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp 485
490 495 Gly Ile Asp Arg Ser Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe
Phe 500 505 510 Asp Val Asn Met Lys Gly Leu Asp Gly Ile Gln Gly Pro
Val Tyr Val 515 520 525 Gly Thr Gly Cys Val Phe Asn Arg Gln Ala Leu
Tyr Gly Tyr Gly Pro 530 535 540 Pro Ser Met Pro Ser Phe Pro Lys Ser
Ser Ser Ser Ser Cys Ser Cys 545 550 555 560 Cys Cys Pro Gly Lys Lys
Glu Pro Lys Asp Pro Ser Glu Leu Tyr Arg 565 570 575 Asp Ala Lys Arg
Glu Glu Leu Asp Ala Ala Ile Phe Asn Leu Arg Glu 580 585 590 Ile Asp
Asn Tyr Asp Glu Tyr Glu Arg Ser Met Leu Ile Ser Gln Thr 595 600 605
Ser Phe Glu Lys Thr Phe Gly Leu Ser Ser Val Phe Ile Glu Ser Thr 610
615 620 Leu Met Glu Asn Gly Gly Val Ala Glu Ser Ala Asn Pro Ser Thr
Leu 625 630 635 640 Ile Lys Glu Ala Ile His Val Ile Ser Cys Gly Tyr
Glu Glu Lys Thr 645 650 655 Ala Trp Gly Lys Glu Ile Gly Trp Ile Tyr
Gly Ser Val Thr Glu Asp 660 665 670 Ile Leu Thr Gly Phe Lys Met His
Cys Arg Gly Trp Arg Ser Ile Tyr 675 680 685 Cys Met Pro Leu Arg Pro
Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu 690 695 700 Ser Asp Arg Leu
His Gln Val Leu Arg Trp Ala Leu Gly Ser Val Glu 705 710 715 720 Ile
Phe Leu Ser Arg His Cys Pro Leu Trp Tyr Gly Phe Gly Gly Gly 725 730
735 Arg Leu Lys Trp Leu Gln Arg Leu Ala Tyr Ile Asn Thr Ile Val Tyr
740 745 750 Pro Phe Thr Ser Leu Pro Leu Ile Ala Tyr Cys Ser Leu Pro
Ala Ile 755 760 765 Cys Leu Leu Thr Gly Lys Phe Ile Ile Pro Thr Leu
Ser Asn Leu Ala 770 775 780 Ser Val Leu Phe Leu Gly Leu Phe Leu Ser
Ile Ile Val Thr Ala Val 785 790 795 800 Leu Glu Leu Arg Trp Ser Gly
Val Ser Ile Glu Asp Leu Trp Arg Asn 805 810 815 Glu Gln Phe Trp Val
Ile Gly Gly Val Ser Ala His Leu Phe Ala Val 820 825 830 Phe Gln Gly
Phe Leu Lys Met Leu Ala Gly Ile Asp Thr Asn Phe Thr 835 840 845 Val
Thr Ala Lys Ala Ala Asp Asp Ala Asp Phe Gly Glu Leu Tyr Ile 850 855
860 Val Lys Trp Thr Thr Leu Leu Ile Pro Pro Thr Thr Leu Leu Ile Val
865 870 875 880 Asn Met Val Gly Val Val Ala Gly Phe Ser Asp Ala Leu
Asn Lys Gly 885 890 895 Tyr Glu Ala Trp Gly Pro Leu Phe Gly Lys Val
Phe Phe Ser Phe Trp 900 905
910 Val Ile Leu His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly Arg Gln
915 920 925 Asn Arg Thr Pro Thr Ile Val Val Leu Trp Ser Val Leu Leu
Ala Ser 930 935 940 Val Phe Ser Leu Val Trp Val Arg Ile Asn Pro Phe
Val Ser Thr Ala 945 950 955 960 Asp Ser Thr Thr Val Ser Gln Ser Cys
Ile Ser Ile Asp Cys 965 970 7 685 PRT Gossypium hirsutum 7 Ala Arg
Arg Trp Val Pro Phe Cys Lys Lys His Asn Val Glu Pro Arg 1 5 10 15
Ala Pro Glu Phe Tyr Phe Asn Glu Lys Ile Asp Tyr Leu Lys Asp Lys 20
25 30 Val His Pro Ser Phe Val Lys Glu Arg Arg Ala Met Lys Arg Glu
Tyr 35 40 45 Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys
Ala Gln Lys 50 55 60 Lys Pro Glu Glu Gly Trp Val Met Gln Asp Gly
Thr Pro Trp Pro Gly 65 70 75 80 Asn Asn Thr Arg Asp His Pro Gly Met
Ile Gln Val Tyr Leu Gly Ser 85 90 95 Ala Gly Ala Leu Asp Val Asp
Gly Lys Glu Leu Pro Arg Leu Val Tyr 100 105 110 Val Ser Arg Glu Lys
Arg Pro Gly Tyr Gln His His Lys Lys Ala Gly 115 120 125 Ala Glu Asn
Ala Leu Val Arg Val Ser Ala Val Leu Thr Asn Ala Pro 130 135 140 Phe
Ile Leu Asn Leu Asp Cys Asp His Tyr Ile Asn Asn Ser Lys Ala 145 150
155 160 Met Arg Glu Ala Met Cys Phe Leu Met Asp Pro Gln Phe Gly Lys
Lys 165 170 175 Leu Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile
Asp Arg His 180 185 190 Asp Arg Tyr Ala Asn Arg Asn Val Val Phe Phe
Asp Ile Asn Met Leu 195 200 205 Gly Leu Asp Gly Leu Gln Gly Pro Val
Tyr Val Gly Thr Gly Cys Val 210 215 220 Phe Asn Arg Gln Ala Leu Tyr
Gly Tyr Asp Pro Pro Val Ser Glu Lys 225 230 235 240 Arg Pro Lys Met
Thr Cys Asp Cys Trp Pro Ser Trp Cys Cys Cys Cys 245 250 255 Cys Gly
Gly Ser Arg Lys Lys Ser Lys Lys Lys Gly Glu Lys Lys Gly 260 265 270
Leu Leu Gly Gly Leu Leu Tyr Gly Lys Lys Lys Lys Met Met Gly Lys 275
280 285 Asn Tyr Val Lys Lys Gly Ser Ala Pro Val Phe Asp Leu Glu Glu
Ile 290 295 300 Glu Glu Gly Leu Glu Gly Tyr Glu Glu Leu Glu Lys Ser
Thr Leu Met 305 310 315 320 Ser Gln Lys Asn Phe Glu Lys Arg Phe Gly
Gln Ser Pro Val Phe Ile 325 330 335 Ala Ser Thr Leu Met Glu Asn Gly
Gly Leu Pro Glu Gly Thr Asn Ser 340 345 350 Thr Ser Leu Ile Lys Glu
Ala Ile His Val Ile Ser Cys Gly Tyr Glu 355 360 365 Glu Lys Thr Glu
Trp Gly Lys Glu Ile Gly Trp Ile Tyr Gly Ser Val 370 375 380 Thr Glu
Asp Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly Trp Lys 385 390 395
400 Ser Val Tyr Cys Val Pro Lys Arg Pro Ala Phe Lys Gly Ser Ala Pro
405 410 415 Ile Asn Leu Ser Asp Arg Leu His Gln Val Leu Arg Trp Ala
Leu Gly 420 425 430 Ser Val Glu Ile Phe Leu Ser Arg His Cys Pro Leu
Trp Tyr Gly Tyr 435 440 445 Gly Gly Lys Leu Lys Trp Leu Glu Arg Leu
Ala Tyr Ile Asn Thr Ile 450 455 460 Val Tyr Pro Phe Thr Ser Ile Pro
Leu Leu Ala Tyr Cys Thr Ile Pro 465 470 475 480 Ala Val Cys Leu Leu
Thr Gly Lys Phe Ile Ile Pro Thr Leu Ser Asn 485 490 495 Leu Thr Ser
Val Trp Phe Leu Ala Leu Phe Leu Ser Ile Ile Ala Thr 500 505 510 Gly
Val Leu Glu Leu Arg Trp Ser Gly Val Ser Ile Gln Asp Trp Trp 515 520
525 Arg Asn Glu Gln Phe Trp Val Ile Gly Gly Val Ser Ala His Leu Phe
530 535 540 Ala Val Phe Gln Gly Leu Leu Lys Val Leu Ala Gly Val Asp
Thr Asn 545 550 555 560 Phe Thr Val Thr Ala Lys Ala Ala Asp Asp Thr
Glu Phe Gly Glu Leu 565 570 575 Tyr Leu Phe Lys Trp Thr Thr Leu Leu
Ile Pro Pro Thr Thr Leu Ile 580 585 590 Ile Leu Asn Met Val Gly Val
Val Ala Gly Val Ser Asp Ala Ile Asn 595 600 605 Asn Gly Tyr Gly Ser
Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe Ala 610 615 620 Phe Trp Val
Ile Leu His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly 625 630 635 640
Arg Gln Asn Arg Thr Pro Thr Ile Val Val Leu Trp Ser Ile Leu Leu 645
650 655 Ala Ser Ile Phe Ser Leu Val Trp Val Arg Ile Asp Pro Phe Leu
Pro 660 665 670 Lys Gln Thr Gly Pro Val Leu Lys Gln Cys Gly Val Glu
675 680 685 8 881 PRT Oryza sativa 8 Gly Asn Val Ala Trp Lys Glu
Arg Val Asp Gly Trp Lys Leu Lys Gln 1 5 10 15 Asp Lys Gly Ala Ile
Pro Met Thr Asn Gly Thr Ser Ile Ala Pro Ser 20 25 30 Glu Gly Arg
Gly Val Gly Asp Ile Asp Ala Ser Thr Asp Tyr Asn Asn 35 40 45 Glu
Asp Ala Leu Leu Asn Asp Glu Thr Arg Gln Pro Leu Ser Arg Lys 50 55
60 Val Pro Leu Pro Ser Ser Arg Ile Asn Pro Tyr Arg Asn Val Ile Val
65 70 75 80 Leu Arg Leu Val Val Leu Ser Ile Phe Leu His Tyr Arg Ile
Thr Asn 85 90 95 Pro Val Arg Asn Ala Tyr Pro Leu Trp Leu Leu Ser
Val Ile Cys Glu 100 105 110 Ile Trp Phe Ala Leu Ser Trp Leu Ile Asp
Gln Phe Pro Lys Trp Phe 115 120 125 Pro Ile Asn Arg Glu Thr Tyr Leu
Asp Arg Leu Ala Leu Arg Tyr Asp 130 135 140 Arg Glu Gly Glu Pro Ser
Gln Leu Ala Ala Val Asp Ile Phe Val Ser 145 150 155 160 Thr Val Asp
Pro Met Lys Glu Pro Pro Leu Val Thr Ala Asn Thr Val 165 170 175 Leu
Ser Ile Leu Ala Val Asp Tyr Pro Val Asp Lys Val Ser Cys Tyr 180 185
190 Val Ser Asp Asp Gly Ala Ala Met Leu Thr Phe Asp Ala Leu Ala Glu
195 200 205 Thr Ser Glu Phe Ala Arg Lys Trp Val Pro Phe Val Lys Lys
Tyr Asn 210 215 220 Ile Glu Pro Arg Ala Pro Glu Trp Tyr Phe Ser Gln
Lys Ile Asp Tyr 225 230 235 240 Leu Lys Asp Lys Val His Pro Ser Phe
Val Lys Asp Arg Arg Ala Met 245 250 255 Lys Arg Glu Tyr Glu Glu Phe
Lys Val Arg Ile Asn Gly Leu Val Ala 260 265 270 Lys Ala Gln Lys Val
Pro Glu Glu Gly Trp Ile Met Gln Asp Gly Thr 275 280 285 Pro Trp Pro
Gly Asn Asn Thr Arg Asp His Pro Gly Met Ile Gln Val 290 295 300 Phe
Leu Gly His Ser Gly Gly Leu Asp Thr Glu Gly Asn Glu Leu Pro 305 310
315 320 Arg Leu Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe Gln His
His 325 330 335 Lys Lys Ala Gly Ala Met Asn Ala Leu Val Arg Val Ser
Ala Val Leu 340 345 350 Thr Asn Gly Gln Tyr Met Leu Asn Leu Asp Cys
Asp His Tyr Ile Asn 355 360 365 Asn Ser Lys Ala Leu Arg Glu Ala Met
Cys Phe Leu Met Asp Pro Asn 370 375 380 Leu Gly Arg Ser Val Cys Tyr
Val Gln Phe Pro Gln Arg Phe Asp Gly 385 390 395 400 Ile Asp Arg Asn
Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp 405 410 415 Ile Asn
Leu Arg Gly Leu Asp Gly Ile Gln Gly Pro Val Tyr Val Gly 420 425 430
Thr Gly Cys Val Phe Asn Arg Thr Ala Leu Tyr Gly Tyr Glu Pro Pro 435
440 445 Ile Lys Gln Lys Lys Lys Gly Ser Phe Leu Ser Ser Leu Cys Gly
Gly 450 455 460 Arg Lys Lys Ala Ser Lys Ser Lys Lys Lys Ser Ser Asp
Lys Lys Lys 465 470 475 480 Ser Asn Lys His Val Asp Ser Ala Val Pro
Val Phe Asn Leu Glu Asp 485 490 495 Ile Glu Glu Gly Val Glu Gly Ala
Gly Phe Asp Asp Glu Lys Ser Leu 500 505 510 Leu Met Ser Gln Met Ser
Leu Glu Lys Arg Phe Gly Gln Ser Ala Ala 515 520 525 Phe Val Ala Ser
Thr Leu Met Glu Tyr Gly Gly Val Pro Gln Ser Ala 530 535 540 Thr Pro
Glu Ser Leu Leu Lys Glu Ala Ile His Val Ile Ser Cys Gly 545 550 555
560 Tyr Glu Asp Lys Thr Glu Trp Gly Thr Glu Ile Gly Trp Ile Tyr Gly
565 570 575 Ser Val Thr Glu Asp Ile Leu Thr Gly Phe Lys Met His Ala
Arg Gly 580 585 590 Trp Arg Ser Ile Tyr Cys Met Pro Lys Arg Pro Ala
Phe Lys Gly Ser 595 600 605 Ala Pro Ile Asn Leu Ser Asp Arg Leu Asn
Gln Val Leu Arg Trp Ala 610 615 620 Leu Gly Ser Val Glu Ile Leu Phe
Ser Arg His Cys Pro Ile Trp Tyr 625 630 635 640 Gly Tyr Gly Gly Arg
Leu Lys Phe Leu Glu Arg Phe Ala Tyr Ile Asn 645 650 655 Thr Thr Ile
Tyr Pro Leu Thr Ser Ile Pro Leu Leu Ile Tyr Cys Val 660 665 670 Leu
Pro Ala Ile Cys Leu Leu Thr Gly Lys Phe Ile Ile Pro Glu Ile 675 680
685 Ser Asn Phe Ala Ser Ile Trp Phe Ile Ser Leu Phe Ile Ser Ile Phe
690 695 700 Ala Thr Gly Ile Leu Glu Met Arg Trp Ser Gly Val Gly Ile
Asp Glu 705 710 715 720 Trp Trp Arg Asn Glu Gln Phe Trp Val Ile Gly
Gly Ile Ser Ala His 725 730 735 Leu Phe Ala Val Phe Gln Gly Leu Leu
Lys Val Leu Ala Gly Ile Asp 740 745 750 Thr Asn Phe Thr Val Thr Ser
Lys Ala Ser Asp Glu Asp Gly Asp Phe 755 760 765 Ala Glu Leu Tyr Met
Phe Lys Trp Thr Thr Leu Leu Ile Pro Pro Thr 770 775 780 Thr Ile Leu
Ile Ile Asn Leu Val Gly Val Val Ala Gly Ile Ser Tyr 785 790 795 800
Ala Ile Asn Ser Gly Tyr Gln Ser Trp Gly Pro Leu Phe Gly Lys Leu 805
810 815 Phe Phe Ala Phe Trp Val Ile Val His Leu Tyr Pro Phe Leu Lys
Gly 820 825 830 Leu Met Gly Arg Gln Asn Arg Thr Pro Thr Ile Val Val
Val Trp Ala 835 840 845 Ile Leu Leu Ala Ser Ile Phe Ser Leu Leu Trp
Val Arg Ile Asp Pro 850 855 860 Phe Thr Thr Arg Val Thr Gly Pro Asp
Thr Gln Thr Cys Gly Ile Asn 865 870 875 880 Cys 9 723 PRT
Acetobacter xylinum 9 Met Pro Glu Val Arg Ser Ser Thr Gln Ser Glu
Ser Gly Met Ser Gln 1 5 10 15 Trp Met Gly Lys Ile Leu Ser Ile Arg
Gly Ala Gly Leu Thr Ile Gly 20 25 30 Val Phe Gly Leu Cys Ala Leu
Ile Ala Ala Thr Ser Val Thr Leu Pro 35 40 45 Pro Glu Gln Gln Leu
Ile Val Ala Phe Val Cys Val Val Ile Phe Phe 50 55 60 Ile Val Gly
His Lys Pro Ser Arg Arg Ser Gln Ile Phe Leu Glu Val 65 70 75 80 Leu
Ser Gly Leu Val Ser Leu Arg Tyr Leu Thr Trp Arg Leu Thr Glu 85 90
95 Thr Leu Ser Phe Asp Thr Trp Leu Gln Gly Leu Leu Gly Thr Met Leu
100 105 110 Leu Val Ala Glu Leu Tyr Ala Leu Met Met Leu Phe Leu Ser
Tyr Phe 115 120 125 Gln Thr Ile Ala Pro Leu His Arg Ala Pro Leu Pro
Leu Pro Pro Asn 130 135 140 Pro Asp Glu Trp Pro Thr Val Asp Ile Phe
Val Pro Thr Tyr Asn Glu 145 150 155 160 Glu Leu Ser Ile Val Arg Leu
Thr Val Leu Gly Ser Leu Gly Ile Asp 165 170 175 Trp Pro Pro Glu Lys
Val Arg Val His Ile Leu Asp Asp Gly Arg Arg 180 185 190 Pro Glu Phe
Ala Ala Phe Ala Ala Glu Cys Gly Ala Asn Tyr Ile Ala 195 200 205 Arg
Pro Thr Asn Glu His Ala Lys Ala Gly Asn Leu Asn Tyr Ala Ile 210 215
220 Gly His Thr Asp Gly Asp Tyr Ile Leu Ile Phe Asp Cys Asp His Val
225 230 235 240 Pro Thr Arg Ala Phe Leu Gln Leu Thr Met Gly Trp Met
Val Glu Asp 245 250 255 Pro Lys Ile Ala Leu Met Gln Thr Pro His His
Phe Tyr Ser Pro Asp 260 265 270 Pro Phe Gln Arg Asn Leu Ser Ala Gly
Tyr Arg Thr Pro Pro Glu Gly 275 280 285 Asn Leu Phe Tyr Gly Val Val
Gln Asp Gly Asn Asp Phe Trp Asp Ala 290 295 300 Thr Phe Phe Cys Gly
Ser Cys Ala Ile Leu Arg Arg Thr Ala Ile Glu 305 310 315 320 Gln Ile
Gly Gly Phe Ala Thr Gln Thr Val Thr Glu Asp Ala His Thr 325 330 335
Ala Leu Lys Met Gln Arg Leu Gly Trp Ser Thr Ala Tyr Leu Arg Ile 340
345 350 Pro Leu Ala Gly Gly Leu Ala Thr Glu Arg Leu Ile Leu His Ile
Gly 355 360 365 Gln Arg Val Arg Trp Ala Arg Gly Met Leu Gln Ile Phe
Arg Ile Asp 370 375 380 Asn Pro Leu Phe Gly Arg Gly Leu Ser Trp Gly
Gln Arg Leu Cys Tyr 385 390 395 400 Leu Ser Ala Met Thr Ser Phe Leu
Phe Ala Val Pro Arg Val Ile Phe 405 410 415 Leu Ser Ser Pro Leu Ala
Phe Leu Phe Phe Gly Gln Asn Ile Ile Ala 420 425 430 Ala Ser Pro Leu
Ala Leu Leu Ala Tyr Ala Ile Pro His Met Phe His 435 440 445 Ala Val
Gly Thr Ala Ser Lys Ile Asn Lys Gly Trp Arg Tyr Ser Phe 450 455 460
Trp Ser Glu Val Tyr Glu Thr Thr Met Ala Leu Phe Leu Val Arg Val 465
470 475 480 Thr Ile Val Thr Leu Leu Ser Pro Ser Arg Gly Lys Phe Asn
Val Thr 485 490 495 Asp Lys Gly Gly Leu Leu Glu Lys Gly Tyr Phe Asp
Leu Gly Ala Val 500 505 510 Tyr Pro Asn Ile Ile Leu Gly Leu Ile Met
Phe Gly Gly Leu Ala Arg 515 520 525 Gly Val Tyr Glu Leu Ser Phe Gly
His Leu Asp Gln Ile Ala Glu Arg 530 535 540 Ala Tyr Leu Leu Asn Ser
Ala Trp Ala Met Leu Ser Leu Ile Ile Ile 545 550 555 560 Leu Ala Ala
Ile Ala Val Gly Arg Glu Thr Gln Gln Lys Arg Asn Ser 565 570 575 His
Arg Ile Pro Ala Thr Ile Pro Val Glu Val Ala Asn Ala Asp Gly 580 585
590 Ser Ile Ile Val Thr Gly Val Thr Glu Asp Leu Ser Met Gly Gly Ala
595 600 605 Ala Val Lys Met Ser Trp Pro Ala Lys Leu Ser Gly Pro Thr
Pro Val 610 615 620 Tyr Ile Arg Thr Val Leu Asp Gly Glu Glu Leu Ile
Leu Pro Ala Arg 625 630 635 640 Ile Ile Arg Ala Gly Asn Gly Arg Gly
Ile Phe Ile Trp Thr Ile Asp 645 650 655 Asn Leu Gln Gln Glu Phe Ser
Val Ile Arg Leu Val Phe Gly Arg Ala 660 665 670 Asp Ala Trp Val Asp
Leu Gly Gln Leu Gln Gly Arg Pro Pro Ala Ala 675 680 685 Gln Pro His
Gly His Gly Ser Gln Arg Gln Gly Pro Val Pro Phe Lys 690 695 700 Trp
Arg Tyr Arg Pro Ser Gln Phe Pro Asn Gln Ala Phe Gly Trp Gln 705 710
715 720 Cys Pro Val 10 756 PRT Acetobacter xylinum 10 Met Ser Glu
Val Gln Ser Pro Val Pro Thr Glu Ser Arg Leu Gly Arg 1 5 10 15 Ile
Ser Asn Lys Ile Leu Ser Leu Arg Gly Ala Ser Tyr Ile Val Gly 20 25
30 Ala Leu Gly Leu Cys Ala Leu Ile Ala Ala Thr Thr Val Thr Leu Asn
35 40 45 Asn Asn Glu Gln Leu Ile Val Ala Ala Val Cys Val Val Ile
Phe Phe 50 55 60 Val Val Gly Arg Gly Lys Ser Arg Arg Thr Gln Ile
Phe Leu Glu Val 65 70 75 80 Leu Ser Ala Leu Val
Ser Leu Arg Tyr Leu Thr Trp Arg Leu Thr Glu 85 90 95 Thr Leu Asp
Phe Asn Thr Trp Ile Gln Gly Ile Leu Gly Val Ile Leu 100 105 110 Leu
Met Ala Glu Leu Tyr Ala Leu Tyr Met Leu Phe Leu Ser Tyr Phe 115 120
125 Gln Thr Ile Gln Pro Leu His Arg Ala Pro Leu Pro Leu Pro Asp Asn
130 135 140 Val Asp Asp Trp Pro Thr Val Asp Ile Phe Ile Pro Thr Tyr
Asp Glu 145 150 155 160 Gln Leu Ser Ile Val Arg Leu Thr Val Leu Gly
Ala Leu Gly Ile Asp 165 170 175 Trp Pro Pro Asp Lys Val Asn Val Tyr
Ile Leu Asp Asp Gly Val Arg 180 185 190 Pro Glu Phe Glu Gln Phe Ala
Lys Asp Cys Gly Ala Leu Tyr Ile Gly 195 200 205 Arg Val Asp Val Asp
Ser Ala His Ala Lys Ala Gly Asn Leu Asn His 210 215 220 Ala Ile Lys
Arg Thr Ser Gly Asp Tyr Ile Leu Ile Leu Asp Cys Asp 225 230 235 240
His Ile Pro Thr Arg Ala Phe Leu Gln Ile Ala Met Gly Trp Met Val 245
250 255 Ala Asp Arg Lys Ile Ala Leu Met Gln Thr Pro His His Phe Tyr
Ser 260 265 270 Pro Asp Pro Phe Gln Arg Asn Leu Ala Val Gly Tyr Arg
Thr Pro Pro 275 280 285 Glu Gly Asn Leu Phe Tyr Gly Val Ile Gln Asp
Gly Asn Asp Phe Trp 290 295 300 Asp Ala Thr Phe Phe Cys Gly Ser Cys
Ala Ile Leu Arg Arg Glu Ala 305 310 315 320 Ile Glu Ser Ile Gly Gly
Phe Ala Val Glu Thr Val Thr Glu Asp Ala 325 330 335 His Thr Ala Leu
Arg Met Gln Arg Arg Gly Trp Ser Thr Ala Tyr Leu 340 345 350 Arg Ile
Pro Val Ala Ser Gly Leu Ala Thr Glu Arg Leu Thr Thr His 355 360 365
Ile Gly Gln Arg Met Arg Trp Ala Arg Gly Met Ile Gln Ile Phe Arg 370
375 380 Val Asp Asn Pro Met Leu Gly Arg Gly Leu Lys Leu Gly Gln Arg
Leu 385 390 395 400 Cys Tyr Leu Ser Ala Met Thr Ser Phe Phe Phe Ala
Ile Pro Arg Val 405 410 415 Ile Phe Leu Ala Ser Pro Leu Ala Phe Leu
Phe Ala Gly Gln Asn Ile 420 425 430 Ile Ala Ala Ala Pro Leu Ala Val
Ala Ala Tyr Ala Leu Pro His Met 435 440 445 Phe His Ser Ile Ala Thr
Ala Ala Lys Val Asn Lys Gly Trp Arg Tyr 450 455 460 Ser Phe Trp Ser
Glu Val Tyr Glu Thr Thr Met Ala Leu Phe Leu Val 465 470 475 480 Arg
Val Thr Ile Val Thr Leu Leu Phe Pro Ser Lys Gly Lys Phe Asn 485 490
495 Val Thr Glu Lys Gly Gly Val Leu Glu Glu Glu Glu Phe Asp Leu Gly
500 505 510 Ala Thr Tyr Pro Asn Ile Ile Phe Ala Thr Ile Met Met Gly
Gly Leu 515 520 525 Leu Ile Gly Leu Phe Glu Leu Ile Val Arg Phe Asn
Gln Leu Asp Val 530 535 540 Ile Ala Arg Asn Ala Tyr Leu Leu Asn Cys
Ala Trp Ala Leu Ile Ser 545 550 555 560 Leu Ile Ile Leu Phe Ala Ala
Ile Ala Val Gly Arg Glu Thr Lys Gln 565 570 575 Val Arg Tyr Asn His
Arg Val Glu Ala His Ile Pro Val Thr Val Tyr 580 585 590 Asp Ala Pro
Ala Glu Gly Gln Pro His Thr Tyr Tyr Asn Ala Thr His 595 600 605 Gly
Met Thr Gln Asp Val Ser Met Gly Gly Val Ala Val His Ile Pro 610 615
620 Leu Pro Asp Val Thr Thr Gly Pro Val Lys Lys Arg Ile His Ala Val
625 630 635 640 Leu Asp Gly Glu Glu Ile Asp Ile Pro Ala Thr Met Leu
Arg Cys Thr 645 650 655 Asn Gly Lys Ala Val Phe Thr Trp Asp Asn Asn
Asp Leu Asp Thr Glu 660 665 670 Arg Asp Ile Val Arg Phe Val Phe Gly
Arg Ala Asp Ala Trp Leu Gln 675 680 685 Trp Asn Asn Tyr Glu Asp Asp
Arg Pro Leu Arg Ser Leu Trp Ser Leu 690 695 700 Leu Leu Ser Ile Lys
Ala Leu Phe Arg Lys Lys Gly Lys Ile Met Ala 705 710 715 720 Asn Ser
Arg Pro Lys Lys Lys Pro Leu Ala Leu Pro Val Glu Arg Arg 725 730 735
Glu Pro Thr Thr Ile His Ser Gly Gln Thr Gln Glu Gly Lys Ile Ser 740
745 750 Arg Ala Ala Ser 755 11 693 PRT Escherichia coli 11 Met Leu
Leu Trp Gly Val Ala Leu Ile Val Arg Arg Met Pro Gly Arg 1 5 10 15
Phe Ser Ala Leu Met Leu Ile Val Leu Ser Leu Thr Val Ser Cys Arg 20
25 30 Tyr Ile Trp Trp Arg Tyr Thr Ser Thr Leu Asn Trp Asp Asp Pro
Val 35 40 45 Ser Leu Val Cys Gly Leu Ile Leu Leu Phe Ala Ile Thr
Tyr Ala Trp 50 55 60 Ile Val Leu Val Leu Gly Tyr Phe Gln Val Val
Trp Pro Leu Asn Arg 65 70 75 80 Gln Pro Val Pro Leu Pro Lys Asp Met
Ser Leu Trp Pro Ser Val Asp 85 90 95 Ile Phe Val Pro Thr Tyr Asn
Glu Asp Leu Asn Val Val Lys Asn Thr 100 105 110 Ile Tyr Ala Ser Leu
Gly Ile Asp Trp Pro Lys Asp Lys Leu Asn Ile 115 120 125 Trp Ile Leu
Asp Asp Gly Gly Arg Glu Glu Phe Arg Gln Phe Ala Gln 130 135 140 Asn
Val Gly Val Lys Tyr Ile Ala Arg Thr Thr His Glu His Ala Lys 145 150
155 160 Ala Gly Asn Ile Asn Asn Ala Leu Lys Tyr Ala Lys Gly Glu Phe
Val 165 170 175 Ser Ile Phe Asp Cys Asp His Val Pro Thr Arg Ser Phe
Leu Gln Met 180 185 190 Thr Met Gly Trp Phe Leu Lys Glu Lys Gln Leu
Ala Met Met Gln Thr 195 200 205 Pro His His Phe Phe Ser Pro Asp Pro
Phe Glu Arg Asn Leu Gly Arg 210 215 220 Phe Arg Lys Thr Pro Asn Glu
Gly Thr Leu Phe Tyr Gly Leu Val Gln 225 230 235 240 Asp Gly Asn Asp
Met Trp Asp Ala Thr Phe Phe Cys Gly Ser Cys Ala 245 250 255 Val Ile
Arg Arg Lys Pro Leu Asp Glu Ile Gly Gly Ile Ala Val Glu 260 265 270
Thr Val Thr Glu Asp Ala His Thr Ser Leu Arg Leu His Arg Arg Gly 275
280 285 Tyr Thr Ser Ala Tyr Met Arg Ile Pro Gln Ala Ala Gly Leu Ala
Thr 290 295 300 Glu Ser Leu Ser Ala His Ile Gly Gln Arg Ile Arg Trp
Ala Arg Gly 305 310 315 320 Met Val Gln Ile Phe Arg Leu Asp Asn Pro
Leu Thr Gly Lys Gly Leu 325 330 335 Lys Phe Ala Gln Arg Leu Cys Tyr
Val Asn Ala Met Phe His Phe Leu 340 345 350 Ser Gly Ile Pro Arg Leu
Ile Phe Leu Thr Ala Pro Leu Ala Phe Leu 355 360 365 Leu Leu His Ala
Tyr Ile Ile Tyr Ala Pro Ala Leu Met Ile Ala Leu 370 375 380 Phe Val
Leu Pro His Met Ile His Ala Ser Leu Thr Asn Ser Lys Ile 385 390 395
400 Gln Gly Lys Tyr Arg His Ser Phe Trp Ser Glu Ile Tyr Glu Thr Val
405 410 415 Leu Ala Trp Tyr Ile Ala Pro Pro Thr Leu Val Ala Leu Ile
Asn Pro 420 425 430 His Lys Gly Lys Phe Asn Val Thr Ala Lys Gly Gly
Gly Leu Val Glu 435 440 445 Glu Glu Tyr Val Asp Trp Val Ile Ser Arg
Pro Tyr Ile Phe Leu Val 450 455 460 Leu Leu Asn Leu Val Gly Val Ala
Val Gly Ile Trp Arg Tyr Phe Tyr 465 470 475 480 Gly Pro Pro Thr Glu
Met Leu Thr Val Val Val Ser Met Val Trp Val 485 490 495 Phe Tyr Asn
Leu Ile Val Leu Gly Gly Ala Val Ala Val Ser Val Glu 500 505 510 Ser
Lys Gln Val Arg Arg Ser His Arg Val Glu Met Thr Met Pro Ala 515 520
525 Ala Ile Ala Arg Glu Asp Gly His Leu Phe Ser Cys Thr Val Gln Asp
530 535 540 Phe Ser Asp Gly Gly Leu Gly Ile Lys Ile Asn Gly Gln Ala
Gln Ile 545 550 555 560 Leu Glu Gly Gln Lys Val Asn Leu Leu Leu Lys
Arg Gly Gln Gln Glu 565 570 575 Tyr Val Phe Pro Thr Gln Val Ala Arg
Val Met Gly Asn Glu Val Gly 580 585 590 Leu Lys Leu Met Pro Leu Thr
Thr Gln Gln His Ile Asp Phe Val Gln 595 600 605 Cys Thr Phe Ala Arg
Ala Asp Thr Trp Ala Leu Trp Gln Asp Ser Tyr 610 615 620 Pro Glu Asp
Lys Pro Leu Glu Ser Leu Leu Asp Ile Leu Lys Leu Gly 625 630 635 640
Phe Arg Gly Tyr Arg His Leu Ala Glu Phe Ala Pro Ser Ser Val Lys 645
650 655 Gly Ile Phe Arg Val Leu Thr Ser Leu Val Ser Trp Val Val Ser
Phe 660 665 670 Ile Pro Pro Arg Pro Glu Arg Ser Glu Thr Ala Gln Pro
Ser Asp Gln 675 680 685 Ala Leu Ala Gln Gln 690 12 861 PRT
Agrobacterium tumefaciens 12 Met Cys Arg Cys Gly Arg Ala Val Arg
Ser Arg Pro Val Cys Arg Pro 1 5 10 15 Gly Gln Leu Val Val Arg Arg
Ser Pro Arg Pro Arg Ser Arg Asn His 20 25 30 Ser Arg Cys Arg Pro
Leu Arg Leu Ser Val Phe Pro Arg Pro His Arg 35 40 45 Arg Val Arg
His His Cys Gln Arg Asp Leu Arg Trp Glu Pro Gly Arg 50 55 60 Trp
Ile Ala Val Arg Trp Lys Ala Ala Arg Ser His Arg Arg Phe Arg 65 70
75 80 Arg Cys Pro Phe Pro Arg Gln Leu Val Trp Pro Val Arg Glu Arg
His 85 90 95 Arg Asp Ala Gly Asp Arg Arg Asn Gln Arg Glu Arg Arg
Arg Arg Asp 100 105 110 Ala Tyr His Glu Ile Ser Glu Pro Lys Phe Arg
Thr Arg Lys Arg Thr 115 120 125 Glu Ser Phe Trp Met Asn Lys Ala Ile
Thr Val Ile Val Trp Leu Leu 130 135 140 Val Ser Leu Cys Val Leu Ala
Ile Ile Thr Met Pro Val Ser Leu Gln 145 150 155 160 Thr His Leu Val
Ala Thr Ala Ile Ser Leu Ile Leu Leu Ala Thr Ile 165 170 175 Lys Ser
Phe Asn Gly Gln Gly Ala Trp Arg Leu Val Ala Leu Gly Phe 180 185 190
Gly Thr Ala Ile Val Leu Arg Tyr Val Tyr Trp Arg Thr Thr Ser Thr 195
200 205 Leu Pro Pro Val Asn Gln Leu Glu Asn Phe Ile Pro Gly Phe Leu
Leu 210 215 220 Tyr Leu Ala Glu Met Tyr Ser Val Val Met Leu Gly Leu
Ser Leu Val 225 230 235 240 Ile Val Ser Met Pro Leu Pro Ser Arg Lys
Thr Arg Pro Gly Ser Pro 245 250 255 Asp Tyr Arg Pro Thr Val Asp Val
Phe Val Pro Ser Tyr Asn Glu Asp 260 265 270 Ala Glu Leu Leu Ala Asn
Thr Leu Ala Ala Ala Lys Asn Met Asp Tyr 275 280 285 Pro Ala Asp Arg
Phe Thr Val Trp Leu Leu Asp Asp Gly Gly Ser Val 290 295 300 Gln Lys
Arg Asn Ala Ala Asn Ile Val Glu Ala Gln Ala Ala Gln Arg 305 310 315
320 Arg His Glu Glu Leu Lys Lys Leu Cys Glu Asp Leu Asp Val Arg Tyr
325 330 335 Leu Thr Arg Glu Arg Asn Val His Ala Lys Ala Gly Asn Leu
Asn Asn 340 345 350 Gly Leu Ala His Ser Thr Gly Glu Leu Val Thr Val
Phe Asp Ala Asp 355 360 365 His Ala Pro Ala Arg Asp Phe Leu Leu Glu
Thr Val Gly Tyr Phe Asp 370 375 380 Glu Asp Pro Arg Leu Phe Leu Val
Gln Thr Pro His Phe Phe Val Asn 385 390 395 400 Pro Asp Pro Ile Glu
Arg Asn Leu Arg Thr Phe Glu Thr Met Pro Ser 405 410 415 Glu Asn Glu
Met Phe Tyr Gly Ile Ile Gln Arg Gly Leu Asp Lys Trp 420 425 430 Asn
Gly Ala Phe Phe Cys Gly Ser Ala Ala Val Leu Arg Arg Glu Ala 435 440
445 Leu Gln Asp Ser Asp Gly Phe Ser Gly Val Ser Ile Thr Glu Asp Cys
450 455 460 Glu Thr Ala Leu Ala Leu His Ser Arg Gly Trp Asn Ser Val
Tyr Val 465 470 475 480 Asp Lys Pro Leu Ile Ala Gly Leu Gln Pro Ala
Thr Phe Ala Ser Phe 485 490 495 Ile Gly Gln Arg Ser Arg Trp Ala Gln
Gly Met Met Gln Ile Leu Ile 500 505 510 Phe Arg Gln Pro Leu Phe Lys
Arg Gly Leu Ser Phe Thr Gln Arg Leu 515 520 525 Cys Tyr Met Ser Ser
Thr Leu Phe Trp Leu Phe Pro Phe Pro Arg Thr 530 535 540 Ile Phe Leu
Phe Ala Pro Leu Phe Tyr Leu Phe Phe Asp Leu Gln Ile 545 550 555 560
Phe Val Ala Ser Gly Gly Glu Phe Leu Ala Tyr Thr Ala Ala Tyr Met 565
570 575 Leu Val Asn Leu Met Met Gln Asn Tyr Leu Tyr Gly Ser Phe Arg
Trp 580 585 590 Pro Trp Ile Ser Glu Leu Tyr Glu Tyr Val Gln Thr Val
His Leu Leu 595 600 605 Pro Ala Val Val Ser Val Ile Phe Asn Pro Gly
Lys Pro Thr Phe Lys 610 615 620 Val Thr Ala Lys Asp Glu Ser Ile Ala
Glu Ala Arg Leu Ser Glu Ile 625 630 635 640 Ser Arg Pro Phe Phe Val
Ile Phe Ala Leu Leu Leu Val Ala Met Ala 645 650 655 Phe Ala Val Trp
Arg Ile Tyr Ser Glu Pro Tyr Lys Ala Asp Val Thr 660 665 670 Leu Val
Val Gly Gly Trp Asn Leu Leu Asn Leu Ile Phe Ala Gly Cys 675 680 685
Ala Leu Gly Val Val Ser Glu Arg Gly Asp Lys Ser Ala Ser Arg Arg 690
695 700 Ile Thr Val Lys Arg Arg Cys Glu Val Gln Leu Gly Gly Ser Asp
Thr 705 710 715 720 Trp Val Pro Ala Ser Ile Asp Asn Val Ser Val His
Gly Leu Leu Ile 725 730 735 Asn Ile Phe Asp Ser Ala Thr Asn Ile Glu
Lys Gly Ala Thr Ala Ile 740 745 750 Val Lys Val Lys Pro His Ser Glu
Gly Val Pro Glu Thr Met Pro Leu 755 760 765 Asn Val Val Arg Thr Val
Arg Gly Glu Gly Phe Val Ser Ile Gly Cys 770 775 780 Thr Phe Ser Pro
Gln Arg Ala Val Asp His Arg Leu Ile Ala Asp Leu 785 790 795 800 Ile
Phe Ala Asn Ser Glu Gln Trp Ser Glu Phe Gln Arg Val Arg Arg 805 810
815 Lys Lys Pro Gly Leu Ile Arg Gly Thr Ala Ile Phe Leu Ala Ile Ala
820 825 830 Leu Phe Gln Thr Gln Arg Gly Leu Tyr Tyr Leu Val Arg Ala
Arg Arg 835 840 845 Pro Ala Pro Lys Ser Ala Lys Pro Val Gly Ala Val
Lys 850 855 860