Plant Seeds With Altered Storage Compound Levels, Related Constructs And Methods Involving Genes Encoding Oxidoreductase Motif Polypeptides

Meyer; Knut ;   et al.

Patent Application Summary

U.S. patent application number 13/039779 was filed with the patent office on 2011-03-03 and published on 2011-09-08 for plant seeds with altered storage compound levels, related constructs and methods involving genes encoding oxidoreductase motif polypeptides. This patent application is currently assigned to E.I. DU PONT DE NEMOURS AND COMPANY. The invention is credited to Knut Meyer and Kevin L. Stecca.

Application Number: 13/039779 (Publication Number: 20110219474)
Family ID: 44120905
Publication Date: 2011-09-08

United States Patent Application 20110219474
Kind Code A1
Meyer; Knut ;   et al. September 8, 2011

PLANT SEEDS WITH ALTERED STORAGE COMPOUND LEVELS, RELATED CONSTRUCTS AND METHODS INVOLVING GENES ENCODING OXIDOREDUCTASE MOTIF POLYPEPTIDES

Abstract

This invention is in the field of plant molecular biology. More specifically, this invention pertains to isolated nucleic acid fragments encoding ORM proteins in plants and seeds and the use of such fragments to modulate expression of a gene encoding ORM protein activity in a transformed host cell.


Inventors: Meyer; Knut; (Wilmington, DE) ; Stecca; Kevin L.; (New Castle, DE)
Assignee: E.I. DU PONT DE NEMOURS AND COMPANY (Wilmington, DE)

Family ID: 44120905
Appl. No.: 13/039779
Filed: March 3, 2011

Related U.S. Patent Documents

Application Number    Filing Date    Patent Number
61309906              Mar 3, 2010

Current U.S. Class: 800/278 ; 536/23.6; 800/298; 800/312; 800/320.1
Current CPC Class: C12N 15/8247 20130101; C12N 15/8245 20130101; C12N 15/8251 20130101; C12N 5/14 20130101; C12N 9/0004 20130101
Class at Publication: 800/278 ; 800/298; 800/320.1; 800/312; 536/23.6
International Class: C12N 15/87 20060101 C12N015/87; A01H 5/00 20060101 A01H005/00; C07H 21/04 20060101 C07H021/04

Claims



1. A transgenic plant comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein seed obtained from said transgenic plant has an altered i.e. increased or decreased oil, protein, starch and/or soluble carbohydrate content when compared to a control plant not comprising said recombinant DNA construct.

2. A transgenic seed obtained from the transgenic plant of claim 1 comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein said transgenic seed has an altered oil, protein, starch and/or soluble carbohydrate content when compared to a seed from a control plant not comprising said recombinant DNA construct.

3. A transgenic seed comprising: a recombinant DNA construct comprising: (a) a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (b) a suppression DNA construct comprising at least one regulatory element operably linked to: (i) all or part of: (A) a nucleic acid sequence encoding a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (B) a full complement of the nucleic acid sequence of (b)(i)(A); or (ii) a region derived from all or part of a sense strand or antisense strand of a target gene of interest, said region having a nucleic acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to said all or part of a sense strand or antisense strand from which said region is derived, and wherein said target gene of interest encodes an ORM protein, and wherein said plant has an altered, increased or decreased oil, protein, starch and/or soluble carbohydrate content when compared to a control plant not comprising said recombinant DNA construct.

4. The transgenic seed of claim 1, wherein the oil content is increased by at least 2% when compared to the oil content of a non-transgenic seed.

5. A transgenic seed comprising a recombinant DNA construct comprising: (a) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (b) the full-length complement of (a): wherein (a) or (b) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant and further wherein said seed has an increase in oil content of at least 2%, on a dry-weight basis, as compared to seed obtained from a non-transgenic plant.

6. A method for producing a transgenic plant, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and regenerating a plant from the transformed plant cell.

7. A method for producing transgenic seeds, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a transgenic seed obtained from a non-transgenic plant.

8. A method for producing transgenic seeds, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an increased starch content of at least 0.5% as compared to a transgenic seed obtained from a non-transgenic plant.

9. A method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a transgenic seed obtained from a non-transgenic plant.

10. A method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an increase in oil content of at least 2%, on a dry-weight basis, as compared to a transgenic seed obtained from a non-transgenic plant.

11. The transgenic seed of any one of claims 1, 2, 3, 4, or 5, wherein the transgenic seed is obtained from a monocot or dicot plant.

12. The transgenic seed of any one of claims 1, 2, 3, 4, or 5, wherein the transgenic seed is obtained from a maize or soybean plant.

13. The transgenic seed of any one of claims 1, 2, 3, 4, or 5, wherein the at least one regulatory element is a seed-specific or seed-preferred promoter.

14. The transgenic seed of any one of claims 1, 2, 3, 4, or 5, wherein the at least one regulatory element is an endosperm-specific or embryo-specific promoter.

15. The method of any one of claims 6, 7, 8, 9, or 10, wherein the transgenic seed is obtained from a transgenic dicot plant comprising in its genome the recombinant construct.

16. The method of any one of claims 6, 7, 8, 9, or 10, wherein the dicot plant is soybean.

17. Transgenic seed obtained by the method of any one of claims 6, 7, 8, 9, or 10.

18. A product and/or by-product obtained from the transgenic seed of any one of claims 6, 7, 8, 9, or 10.

19. The transgenic seed obtained by the method of any one of claims 6, 7, 8, 9, 10 or 11, wherein the transgenic seed is obtained from a monocot or dicot plant.

20. A product and/or by-product from transgenic seed of claim 2 wherein the plant is maize or soybean.

21. A product and/or by-product from the transgenic seed of claim 3, wherein the plant is maize or soybean.

22. A product and/or by-product from the transgenic seed of claim 4, wherein the plant is maize or soybean.

23. A product and/or by-product from the transgenic seed of claim 5, wherein the plant is maize or soybean.

24. An isolated polynucleotide comprising: (a) a nucleotide sequence encoding a polypeptide required for altering i.e. increasing or decreasing oil, protein, starch and/or soluble carbohydrate content in a plant, wherein, based on the Clustal V method of alignment with pairwise alignment default parameters of KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5, the polypeptide has an amino acid sequence of at least 70% sequence identity when compared to SEQ ID NO:32; 102, 104; 113, or 116; or (b) the full complement of the nucleotide sequence of (a).

25. The polynucleotide of claim 24, wherein the amino acid sequence of the polypeptide comprises SEQ ID NO: 32; 102, 104; 113, or 116.

26. The polynucleotide of claim 24, wherein the nucleotide sequence comprises SEQ ID NO:31, 101, 103, 112, or 115.

27. A plant or seed comprising a recombinant DNA construct, wherein the recombinant DNA construct comprises the polynucleotide of any one of claims 24 to 26 operably linked to at least one regulatory sequence.
Description



FIELD OF THE INVENTION

[0001] This invention is in the field of plant molecular biology. More specifically, this invention pertains to isolated nucleic acid fragments encoding oxidoreductase motif proteins in plants and seeds and the use of such fragments to modulate expression of a gene encoding oxidoreductase activity.

BACKGROUND OF THE INVENTION

[0002] At maturity, about 40% of soybean seed dry weight is protein and 20% extractable oil. These constitute the economically valuable products of the soybean crop. Plant oils for example are the most energy-rich biomass available from plants; they have twice the energy content of carbohydrates. It also requires very little energy to extract plant oils and convert them to fuels. Of the remaining 40% of seed weight, about 10% is soluble carbohydrate. The soluble carbohydrate portion contributes little to the economic value of soybean seeds and the main component of the soluble carbohydrate fraction, raffinosaccharides, are deleterious both to processing and to the food value of soybean meal in monogastric animals (Coon et al., (1988) Proceedings Soybean Utilization Alternatives, Univ. of Minnesota, pp. 203-211).

[0003] As the pathways of storage compound biosynthesis in seeds become better understood, it is clear that it may be possible to modulate the size of the storage compound pools in plant cells by altering the catalytic activity of specific enzymes in the oil, starch and soluble carbohydrate biosynthetic pathways (Taiz L., et al. Plant Physiology; The Benjamin/Cummings Publishing Company: New York, 1991). For example, studies investigating the over-expression of LPAT and DAGAT showed that the final steps acylating the glycerol backbone exert significant control over flux to lipids in seeds. Seed oil content could also be increased in oil-seed rape by over-expression of a yeast glycerol-3-phosphate dehydrogenase, whereas over-expression of the individual genes involved in de novo fatty acid synthesis in the plastid, such as acetyl-CoA carboxylase and fatty acid synthase, did not substantially alter the amount of lipids accumulated (Vigeolas H., et al. Plant Biotechnology J. 5, 431-441 (2007)). A low-seed-oil mutant, wrinkled 1, has been identified in Arabidopsis. The mutation apparently causes a deficiency in the seed-specific regulation of carbohydrate metabolism (Focks, Nicole et al., Plant Physiol. (1998), 118(1), 91-101). There is a continued interest in identifying the genes that encode proteins that can modulate the synthesis of storage compounds, such as oil, protein, starch and soluble carbohydrates, in plants.

[0004] The biochemical term oxidoreductase refers to enzymes involved in the transfer of electrons from one molecule (the reductant, also called the hydrogen or electron donor) to another (the oxidant, also called the hydrogen or electron acceptor). For some oxidoreductase proteins, catalytic properties are known, while other proteins are identified only by the presence of a motif also found in known oxidoreductase enzymes. Small proteins, 10-30 kDa in size, with an oxidoreductase motif (ORM) and unknown catalytic properties are prevalent in eukaryotes ranging from unicellular yeast and algae to the animal and plant kingdoms. Yoshikawa et al. (FEMS Yeast Research (2009), 9(1), 32-44) disclose that disruption of YPL107W of Saccharomyces cerevisiae, which encodes a protein with an oxidoreductase motif and mitochondrial localization, results in hypersensitivity to osmotic and ethanol stress. Although proteins with an oxidoreductase motif closely related to that of YPL107W have been identified in every plant that has been subjected to in-depth genome or EST sequencing, few studies have been conducted on the role of these proteins. In view of the ubiquitous nature of genes encoding ORM proteins in plants, further investigation of their role in plant growth and development, and specifically in the regulation of storage compound content in seed, is of great interest.

SUMMARY OF THE INVENTION

[0005] In a first embodiment the present invention concerns a transgenic plant comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein seeds from said transgenic plant have an altered oil, protein, starch and/or soluble carbohydrate content when compared to seeds from a control plant not comprising said recombinant DNA construct.

[0006] In a second embodiment the present invention concerns transgenic seed comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein said transgenic seed has an altered oil, protein, starch and/or soluble carbohydrate content when compared to a control seed not comprising said recombinant DNA construct.

[0007] In a third embodiment the present invention concerns transgenic seed comprising: a recombinant DNA construct comprising: (a) a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (b) a suppression DNA construct comprising at least one regulatory element operably linked to: (i) all or part of: (A) a nucleic acid sequence encoding a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (B) a full complement of the nucleic acid sequence of (b)(i)(A); or (ii) a region derived from all or part of a sense strand or antisense strand of a target gene of interest, said region having a nucleic acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to said all or part of a sense strand or antisense strand from which said region is derived, and wherein said target gene of interest encodes an ORM protein, and wherein said plant has an altered oil, protein, starch and/or soluble carbohydrate content when compared to a control plant not comprising said recombinant DNA construct.

[0008] In a fourth embodiment the invention concerns transgenic seed having an increased oil content of at least 2% on a dry-weight basis when compared to the oil content of a non-transgenic seed, wherein said transgenic seed comprises a recombinant DNA construct comprising: (a) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (b) the full-length complement of (a): wherein (a) or (b) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant and further wherein said seed has an increase in oil content of at least 2% on a dry-weight basis, as compared to seed obtained from a non-transgenic plant.

[0009] In a fifth embodiment the invention concerns transgenic seed comprising a recombinant DNA construct comprising: (a) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (b) the full-length complement of (a): wherein (a) or (b) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant and further wherein said seed has an increase in oil content of at least 2% on a dry-weight basis, as compared to seed obtained from a non-transgenic plant.

[0010] In a sixth embodiment the present invention concerns a method for producing transgenic seeds, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a transgenic seed obtained from a non-transgenic plant.

[0011] In a seventh embodiment this invention concerns a method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a transgenic seed obtained from a non-transgenic plant.

[0012] In an eighth embodiment, the present invention concerns a method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an increase in oil content of at least 2% on a dry-weight basis, as compared to a transgenic seed obtained from a non-transgenic plant.
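
The selection step (c) of this embodiment can be illustrated with a short Python sketch. It is illustrative only and rests on assumptions that are not stated in this paragraph: the "increase in oil content of at least 2%" is read here as an absolute difference of two percentage points of oil on a dry-weight basis, and the function names, event labels and measurements are hypothetical.

```python
def oil_percent_dry_weight(oil_mass_g, dry_seed_mass_g):
    """Oil content as a percentage of seed dry weight."""
    if dry_seed_mass_g <= 0:
        raise ValueError("dry seed mass must be positive")
    return 100.0 * oil_mass_g / dry_seed_mass_g


def select_events(event_oil_pct, control_oil_pct, min_increase_points=2.0):
    """Return the events whose oil content exceeds the control by the threshold.

    Assumption: the threshold is an absolute difference in percentage points,
    not a relative (fold) change.
    """
    return [
        name
        for name, pct in event_oil_pct.items()
        if pct - control_oil_pct >= min_increase_points
    ]


# Hypothetical measurements, for illustration only.
control = oil_percent_dry_weight(2.0, 10.0)
events = {"event_1": 22.5, "event_2": 20.8, "event_3": 22.0}
print(control)
print(select_events(events, control))
```

If the threshold were instead meant as a relative increase, the comparison inside `select_events` would need to use a ratio rather than a difference.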

[0013] In a ninth embodiment the invention concerns a transgenic seed comprising: a recombinant DNA construct comprising: (a) a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (b) a suppression DNA construct comprising at least one regulatory element operably linked to: (i) all or part of: (A) a nucleic acid sequence encoding a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 40, 42, 44, 46, 48, 64, 65, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (B) a full complement of the nucleic acid sequence of (b)(i)(A); or (ii) a region derived from all or part of a sense strand or antisense strand of a target gene of interest, said region having a nucleic acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to said all or part of a sense strand or antisense strand from which said region is derived, and wherein said target gene of interest encodes an ORM protein, and wherein said plant has an altered, increased or decreased oil, protein, starch and/or soluble carbohydrate content when compared to a control plant not comprising said recombinant DNA construct.

[0014] In a tenth embodiment, the present invention includes an isolated polynucleotide comprising: (a) a nucleotide sequence encoding a polypeptide required for altering i.e. increasing or decreasing oil, protein, starch and/or soluble carbohydrate content, wherein the polypeptide has an amino acid sequence of at least 70% sequence identity when compared to SEQ ID NO: 32, 102, 104; 113, or 116, or (b) a full complement of the nucleotide sequence, wherein the full complement and the nucleotide sequence consist of the same number of nucleotides and are 100% complementary. The polypeptide may comprise the amino acid sequence of SEQ ID NO: 32; 102, 104; 113, or 116. The nucleotide sequence may comprise the nucleotide sequence of SEQ ID NO:31, 101, 103, 112, or 115.
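
The complement condition stated in this embodiment (the same number of nucleotides and 100% complementarity) can be checked mechanically. The following Python sketch is illustrative only: the function names and the toy fragment are hypothetical and are not taken from the Sequence Listing, and the full complement is assumed to be written as the reverse complement, 5' to 3'.

```python
# Watson-Crick pairing for DNA; assumes only the unambiguous bases A, C, G, T.
_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def full_complement(seq):
    """Return the reverse complement of seq, written 5' to 3'."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def is_full_complement(seq, candidate):
    """True if candidate has the same length as seq and pairs at every position."""
    return len(seq) == len(candidate) and candidate.upper() == full_complement(seq)


toy_fragment = "ATGGCC"  # hypothetical sequence, for illustration only
print(full_complement(toy_fragment))
print(is_full_complement(toy_fragment, "GGCCAT"))
```

A candidate that is shorter than the reference, or that mismatches at any position, fails the check, which mirrors the requirement that the two sequences consist of the same number of nucleotides and are 100% complementary.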

[0015] In another embodiment, the present invention concerns a recombinant DNA construct comprising any of the isolated polynucleotides of the present invention operably linked to at least one regulatory sequence, and a cell, a plant, and a seed comprising the recombinant DNA construct. The cell may be eukaryotic, e.g., a yeast, insect or plant cell, or prokaryotic, e.g., a bacterial cell.

[0016] Seeds obtained from monocot and dicot plants (such as for example maize and soybean, respectively) comprising the recombinant constructs of the invention are within the scope of the present invention. Also included are seed-specific or seed-preferred promoters driving the expression of the nucleic acid sequences of the invention. Embryo or endosperm specific promoters driving the expression of the nucleic acid sequences of the invention are also included. Furthermore, the methods of the present invention are useful for obtaining transgenic seeds from monocot plants (such as maize and rice) and dicot plants (such as soybean and canola).

[0017] Also within the scope of the invention are product(s) and/or by-product(s) obtained from the transgenic seed obtained from monocot or dicot plants, such as maize and soybean, respectively.

[0018] In another embodiment, this invention relates to a method for suppressing in a plant the level of expression of a gene encoding a polypeptide having ORM protein activity, wherein the method comprises transforming a monocot or dicot plant with any of the nucleic acid fragments of the present invention.

BRIEF DESCRIPTION OF THE DRAWING AND SEQUENCE LISTING

[0019] The invention can be more fully understood from the following detailed description and the accompanying Drawing and Sequence Listing which form a part of this application.

[0020] FIGS. 1A-1B show an alignment of the amino acid sequences of ORM proteins encoded by the nucleotide sequences derived from the following: Brassica rapa (SEQ ID NO:26, 28, and 30); Helianthus annuus (SEQ ID NO:32); Ricinus communis (SEQ ID NO:34); Glycine max (SEQ ID NO:36 and 38); Zea mays (SEQ ID NO:40, 42, 44, and 66, which corresponds to NCBI GI NO:195615148); Oryza sativa (SEQ ID NO:46); Sorghum bicolor (SEQ ID NO:48); Populus trichocarpa (SEQ ID NO:64; NCBI GI NO:118481427); SEQ ID NO:65 corresponding to SEQ ID NO:36271 from US Patent Application US20060123505; SEQ ID NO:67 corresponding to SEQ ID NO:233249 of US Patent Application US20040214272; and Arabidopsis thaliana (SEQ ID NO:69, At5G17280). For the alignment, amino acids which are conserved among all sequences at a given position are indicated with an asterisk (*). Dashes are used by the program to maximize the alignment of the sequences. A conserved sequence motif is boxed in the alignment and corresponds to SEQ ID NO:70.
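
The asterisk convention described for the alignment can be reproduced with a few lines of Python. The sketch below is illustrative only: the function name and the toy alignment are hypothetical and are not taken from the figures, and a column containing only gap dashes is assumed not to count as conserved.

```python
def conservation_line(aligned):
    """Return a marker string with '*' under columns where all residues agree."""
    if not aligned:
        return ""
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        first = column[0]
        # A column is conserved when every sequence carries the same residue
        # and that residue is not a gap dash.
        conserved = first != "-" and all(residue == first for residue in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Hypothetical three-sequence alignment, for illustration only.
toy_alignment = ["MKT-LV", "MRTALV", "MKTGLV"]
for row in toy_alignment:
    print(row)
print(conservation_line(toy_alignment))
```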

[0021] FIG. 2 shows a chart of the percent sequence identity for each pair of amino acid sequences displayed in FIGS. 1A-1B.
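
A chart of pairwise percent identity of this kind can be tabulated from an existing alignment as sketched below in Python. This is a simplified illustration and not the Clustal V computation itself: identity is assumed here to be the number of matching residues divided by the number of columns in which neither sequence has a gap, and the names, toy sequences and the `meets_threshold` helper are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_chart(named_alignment):
    """Map each unordered pair of sequence names to its percent identity."""
    return {
        (name_a, name_b): percent_identity(seq_a, seq_b)
        for (name_a, seq_a), (name_b, seq_b) in combinations(named_alignment.items(), 2)
    }


def meets_threshold(identity, threshold=70.0):
    """True if the identity reaches the threshold, e.g. the 70% recited in the claims."""
    return identity >= threshold


# Hypothetical aligned sequences, for illustration only.
toy = {"seqA": "MKT-LV", "seqB": "MRTALV", "seqC": "MKTGLV"}
for pair, value in identity_chart(toy).items():
    print(pair, round(value, 1), meets_threshold(value))
```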

[0022] FIGS. 3A-3C show an alignment of the amino acid sequences of ORM proteins encoded by the nucleotide sequences derived from the following: Brassica rapa (SEQ ID NO:26, 28, and 30); Helianthus annuus (SEQ ID NO:32); Ricinus communis (SEQ ID NO:34); Glycine max (SEQ ID NO:36 and 38); Zea mays (SEQ ID NO:40, 42, 44, and 66, which corresponds to NCBI GI NO:195615148); Oryza sativa (SEQ ID NO:46); Sorghum bicolor (SEQ ID NO:48); Populus trichocarpa (SEQ ID NO:64; NCBI GI NO:118481427); SEQ ID NO:65 corresponding to SEQ ID NO:36271 from US Patent Application US20060123505; SEQ ID NO:67 corresponding to SEQ ID NO:233249 of US Patent Application US20040214272; Arabidopsis thaliana (SEQ ID NO:69, At5G17280); Guar (SEQ ID NO:102, Ids2c.pk014.b22); Bahia (SEQ ID NO:104, contig); Arabidopsis lyrata (SEQ ID NO:105, NCBI GI NO:297807753); Picea sitchensis (SEQ ID NO:106, NCBI GI NO:116782186); Hordeum vulgare (SEQ ID NO:108); Raphanus sativus (SEQ ID NO:110); Dennstaedtia punctilobula (SEQ ID NO:113); and Osmunda cinnamomea (SEQ ID NO:116). For the alignment, amino acids which are conserved among all sequences at a given position are indicated with an asterisk (*). Dashes are used by the program to maximize the alignment of the sequences. A conserved sequence motif is boxed in the alignment and corresponds to SEQ ID NO:117.

[0023] FIG. 4 shows a chart of the percent sequence identity for each pair of amino acid sequences displayed in FIGS. 3A-3C.

[0024] The sequence descriptions and Sequence Listing attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825.

SEQ ID NO:1 corresponds to the nucleotide sequence of vector PHSbarENDS2.
SEQ ID NO:2 corresponds to the nucleotide sequence of vector pUC9 and a polylinker.
SEQ ID NO:3 corresponds to the nucleotide sequence of vector pKR85.
SEQ ID NO:4 corresponds to the nucleotide sequence of vector pKR278.
SEQ ID NO:5 corresponds to the nucleotide sequence of vector pKR407.
SEQ ID NO:6 corresponds to the nucleotide sequence of vector pKR1468.
SEQ ID NO:7 corresponds to the nucleotide sequence of vector pKR1475.
SEQ ID NO:8 corresponds to the nucleotide sequence of vector pKR92.
SEQ ID NO:9 corresponds to the nucleotide sequence of vector pKR1478.
SEQ ID NO:10 corresponds to SAIFF and genomic DNA of lo17849.
SEQ ID NO:11 corresponds to the forward primer ORM ORF FWD.
SEQ ID NO:12 corresponds to the reverse primer ORM ORF REV.
SEQ ID NO:13 corresponds to the nucleotide sequence of vector pENTR comprising ORM.
SEQ ID NO:14 corresponds to the nucleotide sequence of vector pKR1478-ORM.
SEQ ID NO:15 corresponds to the nucleotide sequence of PKR1482.
SEQ ID NO:16 corresponds to the AthLcc In forward primer.
SEQ ID NO:17 corresponds to the AthLcc In reverse primer.
SEQ ID NO:18 corresponds to the PCR product with the laccase intron.
SEQ ID NO:19 corresponds to the nucleotide sequence of PSM1318.
SEQ ID NO:20 corresponds to the nucleotide sequence of pMBL18 ATTR12 INT.
SEQ ID NO:21 corresponds to the nucleotide sequence of PSM1789.
SEQ ID NO:22 corresponds to the nucleotide sequence of pMBL18 ATTR12 INT ATTR21.
SEQ ID NO:23 corresponds to the nucleotide sequence of vector pKR1480.
SEQ ID NO:24 corresponds to the nucleotide sequence of pKR1482-ORM.

[0025] Table 1 lists the polypeptides that are described herein, the designation of the clones that comprise the nucleic acid fragments encoding polypeptides representing all or a substantial portion of these polypeptides, and the corresponding identifier (SEQ ID NO:) as used in the attached Sequence Listing. Table 1 also identifies the cDNA clones as individual ESTs ("EST"), the sequences of the entire cDNA inserts comprising the indicated cDNA clones ("FIS"), contigs assembled from two or more ESTs ("Contig"), contigs assembled from an FIS and one or more ESTs ("Contig*"), or sequences encoding the entire or functional protein derived from an FIS, a contig, an EST and PCR, or an FIS and PCR ("CGS").

TABLE 1
ORM Proteins

Protein (Plant Source)     Clone Designation    Status    SEQ ID NO: (Nucleotide)    SEQ ID NO: (Amino Acid)
ORM (Brassica rapa)        TC44737              CGS       25                         26
ORM (Brassica rapa)        TC52165              CGS       27                         28
ORM (Brassica rapa)        TC52879              CGS       29                         30
ORM (Helianthus annuus)    hso1c.pk014.c16      CGS       31                         32
ORM (Ricinus communis)     XM_002533611         CGS       33                         34
ORM (Glycine max)          Glyma02g05870        CGS       35                         36
ORM (Glycine max)          Glyma16g24560        CGS       37                         38
ORM (Zea mays)             GRMZM2G1312101       CGS       39                         40
ORM (Zea mays)             pco642986            CGS       41                         42
ORM (Zea mays)             pco597536            CGS       43                         44
ORM (Oryza sativa)         Os09g36120           CGS       45                         46
ORM (Sorghum bicolor)      Sb02g030770          CGS       47                         48

SEQ ID NO:49 is the nucleic acid sequence of the linker described in Example 19. SEQ ID NO:50 is the nucleic acid sequence of vector pKS133 described in Example 18. SEQ ID NO:51 corresponds to the single copy of ELVISLIVES. SEQ ID NO:52 corresponds to two copies of ELVISLIVES. SEQ ID NO:53 corresponds to the primer described in Example 20. SEQ ID NO:54 corresponds to the primer described in Example 20. SEQ ID NO:55 corresponds to a synthetic PCR primer (SA195). SEQ ID NO:56 corresponds to a synthetic PCR primer (SA196). SEQ ID NO:57 corresponds to a synthetic PCR primer (SA200). SEQ ID NO:58 corresponds to a synthetic PCR primer (SA201). SEQ ID NO:59 corresponds to pGemTA. SEQ ID NO:60 corresponds to pGemTB. SEQ ID NO:61 corresponds to pGemT-ORM-HP. SEQ ID NO:62 corresponds to pKS433. SEQ ID NO:63 corresponds to pKS120. SEQ ID NO:64 corresponds to NCBI GI NO: 118481427 (Populus trichocarpa). SEQ ID NO:65 corresponds to SEQ ID NO:36271 from US Patent Application, US20060123505. SEQ ID NO:66 corresponds to NCBI GI NO: 195615148 (Zea mays). SEQ ID NO:67 corresponds to SEQ ID NO:233249 of US20040214272. SEQ ID NO:68 corresponds to the nucleotide sequence of At5G17280. SEQ ID NO:69 corresponds to the amino acid sequence encoded by SEQ ID NO:68. SEQ ID NO:70 is a conserved sequence motif associated with sequences included in the present invention as shown in FIGS. 1A and 1B. SEQ ID NO:71 corresponds to the SA3 11 primer. SEQ ID NO:72 corresponds to the SA3 12 primer. SEQ ID NO:73 corresponds to the SA3 13 primer. SEQ ID NO:74 corresponds to the SA3 14 primer. SEQ ID NO:75 corresponds to the SA3 15 primer. SEQ ID NO:76 corresponds to the SA3 16 primer. SEQ ID NO:77 corresponds to the nucleotide sequence of pGEM T Easy-C. SEQ ID NO:78 corresponds to the nucleotide sequence of pGEM T Easy-D. SEQ ID NO:79 corresponds to the nucleotide sequence of pGEM T Easy-E. SEQ ID NO:80 corresponds to the nucleotide sequence of pBluescript SK+-C. 
SEQ ID NO:81 corresponds to the nucleotide sequence of pBluescript SK+-CD. SEQ ID NO:82 corresponds to the nucleotide sequence of pBluescript SK+-CDE. SEQ ID NO:83 corresponds to the nucleotide sequence of KS442. SEQ ID NO:84 corresponds to the nucleotide sequence of KS442-CDE. SEQ ID NO:85 corresponds to the nucleotide sequence of lo127. SEQ ID NO:86 corresponds to the sequence of artificial microRNA, OX16. SEQ ID NO:87 corresponds to the sequence of artificial microRNA, OX2. SEQ ID NO:88 corresponds to the sequence of artificial microRNA, OX16. SEQ ID NO:89 corresponds to the sequence of artificial microRNA, OX2. SEQ ID NO:90 corresponds to the microRNA 396 precursor. SEQ ID NO:91 corresponds to the microRNA 396 precursor v3. SEQ ID NO:92 corresponds to OX16 primer A. SEQ ID NO:93 corresponds to OX16 primer B. SEQ ID NO:94 corresponds to the nucleotide sequence of plasmid OX16. SEQ ID NO:95 corresponds to the microRNA 159 precursor. SEQ ID NO:96 corresponds to the in-fusion ready microRNA 159 precursor. SEQ ID NO:97 corresponds to the 1590.times.2 primer A. SEQ ID NO:98 corresponds to the 1590.times.2 primer B. SEQ ID NO:99 corresponds to the nucleotide sequence of plasmid 159-OX2. SEQ ID NO:100 corresponds to the nucleotide sequence of plasmid KS434. SEQ ID NO:101 corresponds to the nucleotide sequence of a Guar ORM (Ids2c.pk014.b22). SEQ ID NO:102 corresponds to the amino acid sequence of the Guar ORM encoded by nucleotides of SEQ ID NO:101. SEQ ID NO:103 corresponds to the nucleotide sequence of a contig of a Bahia ORM. SEQ ID NO:104 corresponds to the amino acid sequence encoded by nucleotides of SEQ ID NO:103. SEQ ID NO:105 corresponds to NCBI GI NO: 297807753 (Arabidopsis lyrata). SEQ ID NO:106 corresponds to NCBI GI NO: 116782186 (Picea sitchensis). SEQ ID NO:107 corresponds to a Hordeum vulgare ORM sequence, obtained from a Hordeum vulgare seedling shoot EST library. SEQ ID NO:108 corresponds to the partial amino acid sequence encoded by SEQ ID NO:107. 
SEQ ID NO:109 corresponds to a partial ORM nucleotide sequence obtained from Raphanus sativus. SEQ ID NO:110 corresponds to the amino acid sequence encoded by SEQ ID NO:109. SEQ ID NO:111 corresponds to the ORM nucleotide sequence from Dennstaedtia punctiloba. SEQ ID NO:112 corresponds to the nucleotide sequence of the ORM-ORF of SEQ ID NO:111. SEQ ID NO:113 corresponds to the amino acid sequence encoded by SEQ ID NO:112. SEQ ID NO:114 corresponds to the ORM nucleotide sequence from Osmunda cinnamomea. SEQ ID NO:115 corresponds to the nucleotide sequence of the ORM-ORF of SEQ ID NO:114. SEQ ID NO:116 corresponds to the amino acid sequence encoded by SEQ ID NO:115. SEQ ID NO:117 corresponds to a conserved sequence motif associated with sequences included in the present invention as shown in FIGS. 3A-3C. SEQ ID NO:118 corresponds to the amino acid sequence from Glycine max in US Patent US2004031072-A1-14947. SEQ ID NO:119 corresponds to the amino acid sequence from Sorghum bicolor (NCBI GI: 8062081). SEQ ID NO:120 corresponds to the amino acid sequence from Arabidopsis thaliana (BAB10515). SEQ ID NO:121 corresponds to the amino acid sequence from Oryza sativa (NCBI GI: 5207721).

[0026] The Sequence Listing contains the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB standards described in Nucleic Acids Res. 13:3021-3030 (1985) and in the Biochemical J. 219 (No. 2):345-373 (1984), which are herein incorporated by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. § 1.822.

DETAILED DESCRIPTION OF THE INVENTION

[0027] All patents, patent applications, and publications cited throughout the application are hereby incorporated by reference in their entirety.

[0028] As used herein and in the appended claims, the singular forms "a", "an", and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, reference to "a plant" includes a plurality of such plants, reference to "a cell" includes one or more cells and equivalents thereof known to those skilled in the art, and so forth.

[0029] In the context of this disclosure a number of terms and abbreviations are used. The following definitions are provided.

[0030] "Open reading frame" is abbreviated ORF.

[0031] "Polymerase chain reaction" is abbreviated PCR.

[0032] "Triacylglycerols" are abbreviated TAGs.

[0033] "Co-enzyme A" is abbreviated CoA.

[0034] "Pyrophosphatase" is abbreviated PPiase.

[0035] The term "fatty acids" refers to long chain aliphatic acids (alkanoic acids) of varying chain length, from about C12 to C22 (although both longer and shorter chain-length acids are known). The predominant chain lengths are between C16 and C22. The structure of a fatty acid is represented by a simple notation system of "X:Y", where X is the total number of carbon (C) atoms in the particular fatty acid and Y is the number of double bonds.

[0036] Generally, fatty acids are classified as saturated or unsaturated. The term "saturated fatty acids" refers to those fatty acids that have no "double bonds" along their carbon backbone. In contrast, "unsaturated fatty acids" have "double bonds" along their carbon backbones (which are most commonly in the cis-configuration). "Monounsaturated fatty acids" have only one "double bond" along the carbon backbone (e.g., usually between the 9th and 10th carbon atoms, as for palmitoleic acid (16:1) and oleic acid (18:1)), while "polyunsaturated fatty acids" (or "PUFAs") have at least two double bonds along the carbon backbone (e.g., between the 9th and 10th, and 12th and 13th carbon atoms for linoleic acid (18:2); and between the 9th and 10th, 12th and 13th, and 15th and 16th for α-linolenic acid (18:3)).
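
By way of illustration only, the "X:Y" notation and the saturated, monounsaturated and polyunsaturated classes defined above can be sketched in Python as follows. The names FattyAcid, parse_fatty_acid and classify are hypothetical and are not part of the disclosure; the sketch assumes that the shorthand carries only a carbon count and a double-bond count.

```python
from typing import NamedTuple


class FattyAcid(NamedTuple):
    carbons: int        # X: total number of carbon atoms
    double_bonds: int   # Y: number of double bonds


def parse_fatty_acid(notation: str) -> FattyAcid:
    """Parse 'X:Y' shorthand, e.g. '18:2' for linoleic acid."""
    carbons_text, bonds_text = notation.strip().split(":")
    carbons, double_bonds = int(carbons_text), int(bonds_text)
    if carbons <= 0 or double_bonds < 0:
        raise ValueError(f"invalid fatty acid notation: {notation!r}")
    return FattyAcid(carbons, double_bonds)


def classify(fatty_acid: FattyAcid) -> str:
    """Classify by the number of double bonds along the carbon backbone."""
    if fatty_acid.double_bonds == 0:
        return "saturated"
    if fatty_acid.double_bonds == 1:
        return "monounsaturated"
    return "polyunsaturated"
```

For example, classify(parse_fatty_acid("18:1")) yields "monounsaturated" for oleic acid, and "18:3" (α-linolenic acid) is classed as polyunsaturated.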

[0037] The terms "triacylglycerol", "oil" and "TAGs" refer to neutral lipids composed of three fatty acyl residues esterified to a glycerol molecule (and such terms will be used interchangeably throughout the present disclosure herein). Such oils can contain long chain PUFAs, as well as shorter saturated and unsaturated fatty acids and longer chain saturated fatty acids. Thus, "oil biosynthesis" generically refers to the synthesis of TAGs in the cell.

The term "modulation" or "alteration" in the context of the present invention refers to increases or decreases of ORM protein expression, protein level or enzyme activity, as well as to an increase or decrease in the storage compound levels, such as oil, protein, starch or soluble carbohydrates.

[0038] The term "plant" includes reference to whole plants, plant parts or organs (e.g., leaves, stems, roots, etc.), plant cells, seeds and progeny of same. Plant cell, as used herein includes, without limitation, cells obtained from or found in the following: seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen and microspores. Plant cells can also be understood to include modified cells, such as protoplasts, obtained from the aforementioned tissues. The class of plants which can be used in the methods of the invention is generally as broad as the class of higher plants amenable to transformation techniques, including both monocotyledonous and dicotyledonous plants.

[0039] Examples of monocots include, but are not limited to, maize (corn), wheat, rice, sorghum, millet, barley, palm, lily, Alstroemeria, rye, and oat.

[0040] Examples of dicots include, but are not limited to, soybean, rape, sunflower, canola, grape, guayule, columbine, cotton, tobacco, peas, beans, flax, safflower, and alfalfa.

[0041] Plant tissue includes differentiated and undifferentiated tissues or plants, including but not limited to, roots, stems, shoots, leaves, pollen, seeds, tumor tissue, and various forms of cells and culture such as single cells, protoplasm, embryos, and callus tissue.

[0042] The term "plant organ" refers to plant tissue or group of tissues that constitute a morphologically and functionally distinct part of a plant.

[0043] The term "genome" refers to the following: 1. The entire complement of genetic material (genes and non-coding sequences) present in each cell of an organism, or virus or organelle. 2. A complete set of chromosomes inherited as a (haploid) unit from one parent. The term "stably integrated" refers to the transfer of a nucleic acid fragment into the genome of a host organism or cell resulting in genetically stable inheritance.

[0044] The terms "polynucleotide", "polynucleotide sequence", "nucleic acid", "nucleic acid sequence", and "nucleic acid fragment" are used interchangeably herein. These terms encompass nucleotide sequences and the like. A polynucleotide may be a polymer of RNA or DNA that is single- or double-stranded, that optionally contains synthetic, non-natural or altered nucleotide bases. A polynucleotide in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA, synthetic DNA, or mixtures thereof.

[0045] The term "isolated" refers to materials, such as "isolated nucleic acid fragments" and/or "isolated polypeptides", which are substantially free or otherwise removed from components that normally accompany or interact with the materials in a naturally occurring environment. Isolated polynucleotides may be purified from a host cell in which they naturally occur. Conventional nucleic acid purification methods known to skilled artisans may be used to obtain isolated polynucleotides. The term also embraces recombinant polynucleotides and chemically synthesized polynucleotides.

[0046] The term "isolated nucleic acid fragment" is used interchangeably with "isolated polynucleotide" and is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA. Nucleotides (usually found in their 5'-monophosphate form) are referred to by their single letter designation as follows: "A" for adenylate or deoxyadenylate (for RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate, "G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for deoxythymidylate, "R" for purines (A or G), "Y" for pyrimidines (C or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and "N" for any nucleotide.
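
As a non-limiting sketch, the single letter designations listed above can be held in a lookup table and used to enumerate the concrete DNA sequences that a degenerate sequence (for example, a degenerate primer) may represent. The table below contains only the DNA codes named in the preceding paragraph; the names DNA_CODES and expand_degenerate are hypothetical, and inosine ("I") and the RNA base "U" are deliberately left out of the expansion.

```python
from itertools import product
from typing import Dict, List

# DNA designations taken from the paragraph above; other IUPAC codes are omitted.
DNA_CODES: Dict[str, str] = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purines
    "Y": "CT",    # pyrimidines
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def expand_degenerate(sequence: str) -> List[str]:
    """Return every concrete DNA sequence matched by a degenerate sequence."""
    try:
        choices = [DNA_CODES[symbol] for symbol in sequence.upper()]
    except KeyError as error:
        raise ValueError(f"unsupported nucleotide code: {error.args[0]!r}") from None
    return ["".join(bases) for bases in product(*choices)]
```

For instance, expand_degenerate("RY") enumerates the four sequences AC, AT, GC and GT.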

[0047] The terms "subfragment that is functionally equivalent" and "functionally equivalent subfragment" are used interchangeably herein. These terms refer to a portion or subsequence of an isolated nucleic acid fragment in which the ability to alter gene expression or produce a certain phenotype is retained whether or not the fragment or subfragment encodes an active enzyme. For example, the fragment or subfragment can be used in the design of recombinant DNA constructs to produce the desired phenotype in a transformed plant. Recombinant DNA constructs can be designed for use in co-suppression or antisense by linking a nucleic acid fragment or subfragment thereof, whether or not it encodes an active enzyme, in the appropriate orientation relative to a plant promoter sequence.

[0048] "Cosuppression" refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar native genes (U.S. Pat. No. 5,231,020). Cosuppression technology constitutes the subject matter of U.S. Pat. No. 5,231,020, which issued to Jorgensen et al. on Jul. 27, 1993. The phenomenon observed by Napoli et al. in petunia was referred to as "cosuppression" since expression of both the endogenous gene and the introduced transgene were suppressed (for reviews see Vaucheret et al., Plant J. 16:651-659 (1998); and Gura, Nature 404:804-808 (2000)).

[0049] Co-suppression constructs in plants previously have been designed by focusing on overexpression of a nucleic acid sequence having homology to an endogenous mRNA, in the sense orientation, which results in the reduction of all RNA having homology to the overexpressed sequence (see Vaucheret et al. (1998) Plant J 16:651-659; and Gura (2000) Nature 404:804-808). The overall efficiency of this phenomenon is low, and the extent of the RNA reduction is widely variable. Recent work has described the use of "hairpin" structures that incorporate all, or part, of an mRNA encoding sequence in a complementary orientation that results in a potential "stem-loop" structure for the expressed RNA (PCT Publication WO 99/53050 published on Oct. 21, 1999). This increases the frequency of co-suppression in the recovered transgenic plants. Another variation describes the use of plant viral sequences to direct the suppression, or "silencing", of proximal mRNA encoding sequences (PCT Publication WO 98/36083 published on Aug. 20, 1998). Neither of these co-suppressing phenomena has been elucidated mechanistically, although recent genetic evidence has begun to unravel this complex situation (Elmayan et al. (1998) Plant Cell 10:1747-1757).

[0050] In addition to cosuppression, antisense technology has also been used to block the function of specific genes in cells. Antisense RNA is complementary to the normally expressed RNA, and presumably inhibits gene expression by interacting with the normal RNA strand. The mechanisms by which the expression of a specific gene is inhibited by either antisense or sense RNA are on their way to being understood. However, the frequencies of obtaining the desired phenotype in a transgenic plant may vary with the design of the construct, the gene, the strength and specificity of its promoter, the method of transformation and the complexity of transgene insertion events (Baulcombe, Curr. Biol. 12(3):R82-84 (2002); Tang et al., Genes Dev. 17(1):49-63 (2003); Yu et al., Plant Cell. Rep. 22(3):167-174 (2003)). Cosuppression and antisense inhibition are also referred to as "gene silencing", "post-transcriptional gene silencing" (PTGS), RNA interference or RNAi. See for example U.S. Pat. No. 6,506,559.

[0051] MicroRNAs (miRNAs) are small regulatory RNAs that control gene expression. miRNAs bind to regions of target RNAs and inhibit their translation and, thus, interfere with production of the polypeptide encoded by the target RNA. miRNAs can be designed to be complementary to any region of the target sequence RNA including the 3' untranslated region, coding region, etc. miRNAs are processed from highly structured RNA precursors by the action of a ribonuclease III termed DICER. While the exact mechanism of action of miRNAs is unknown, it appears that they function to regulate expression of the target gene. See, e.g., U.S. Patent Publication No. 2004/0268441 A1 which was published on Dec. 30, 2004.

[0052] The term "expression", as used herein, refers to the production of a functional end-product, be it mRNA or translation of mRNA into a polypeptide. "Antisense inhibition" refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein. "Co-suppression" refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020).

[0053] "Overexpression" refers to the production of a functional end-product in transgenic organisms that exceeds levels of production when compared to expression of that functional end-product in a normal, wild type or non-transformed organism.

[0054] "Stable transformation" refers to the transfer of a nucleic acid fragment into a genome of a host organism, including both nuclear and organellar genomes, resulting in genetically stable inheritance. In contrast, "transient transformation" refers to the transfer of a nucleic acid fragment into the nucleus, or DNA-containing organelle, of a host organism resulting in gene expression without integration or stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" organisms. The preferred method of cell transformation of rice, corn and other monocots is using particle-accelerated or "gene gun" transformation technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Pat. No. 4,945,050), or an Agrobacterium-mediated method (Ishida Y. et al. (1996) Nature Biotech. 14:745-750). The term "transformation" as used herein refers to both stable transformation and transient transformation.

[0055] "Antisense inhibition" refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein.

[0056] As stated herein, "suppression" refers to the reduction of the level of enzyme activity or protein functionality detectable in a transgenic plant when compared to the level of enzyme activity or protein functionality detectable in a plant with the native enzyme or protein. The level of enzyme activity in a plant with the native enzyme is referred to herein as "wild type" activity. The level of protein functionality in a plant with the native protein is referred to herein as "wild type" functionality. The term "suppression" includes lower, reduce, decline, decrease, inhibit, eliminate and prevent. This reduction may be due to the decrease in translation of the native mRNA into an active enzyme or functional protein. It may also be due to the transcription of the native DNA into decreased amounts of mRNA and/or to rapid degradation of the native mRNA. The term "native enzyme" refers to an enzyme that is produced naturally in the desired cell.

[0057] "Gene silencing," as used herein, is a general term that refers to decreasing mRNA levels as compared to wild-type plants; it does not specify mechanism and is inclusive of, and not limited to, anti-sense, cosuppression, viral-suppression, hairpin suppression and stem-loop suppression.

[0058] The terms "homology", "homologous", "substantially similar" and "corresponding substantially" are used interchangeably herein. They refer to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype. These terms also refer to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotides that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment. For example, alterations in a nucleic acid fragment which result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded polypeptide, are well known in the art. Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a functionally equivalent product. Nucleotide changes that result in alteration of the N-terminal and C-terminal portions of the polypeptide molecule would also not be expected to alter the activity of the polypeptide. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. It is therefore understood, as those skilled in the art will appreciate, that the invention encompasses more than the specific exemplary sequences.

[0059] Moreover, the skilled artisan recognizes that substantially similar nucleic acid sequences encompassed by this invention are also defined by their ability to hybridize, under moderately stringent conditions (for example, 1×SSC, 0.1% SDS, 60°C) with the sequences exemplified herein, or to any portion of the nucleotide sequences reported herein and which are functionally equivalent to the gene or the promoter of the invention. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions. One set of preferred conditions involves a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45°C for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50°C for 30 min. A more preferred set of stringent conditions involves the use of higher temperatures in which the washes are identical to those above except for the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS was increased to 60°C. Another preferred set of highly stringent conditions involves the use of two final washes in 0.1×SSC, 0.1% SDS at 65°C.
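
Purely as an illustrative sketch, the preferred wash series described above can be recorded as structured data, which makes the difference between the preferred and more preferred conditions explicit. The Wash class and the helper with_final_temperature are hypothetical names introduced here; a temperature of None stands for room temperature, which the text does not quantify.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Wash:
    ssc_x: float              # SSC concentration as a multiple, e.g. 0.2 for 0.2x SSC
    sds_percent: float        # SDS concentration in percent
    temp_c: Optional[float]   # degrees Celsius; None means room temperature
    minutes: int              # duration of each wash
    repeats: int = 1          # number of times the wash is performed


# Preferred series: 6x SSC at room temperature, 2x SSC at 45 C, then 0.2x SSC at 50 C twice.
PREFERRED_WASHES: List[Wash] = [
    Wash(6.0, 0.5, None, 15),
    Wash(2.0, 0.5, 45, 30),
    Wash(0.2, 0.5, 50, 30, repeats=2),
]


def with_final_temperature(washes: List[Wash], temp_c: float) -> List[Wash]:
    """Copy a wash series, changing only the temperature of the final wash step."""
    return washes[:-1] + [replace(washes[-1], temp_c=temp_c)]


# More preferred series: identical except that the final two washes are at 60 C.
MORE_STRINGENT_WASHES = with_final_temperature(PREFERRED_WASHES, 60)
```

The highly stringent variant (two final washes in 0.1×SSC, 0.1% SDS at 65°C) is not modeled because the text does not state its duration.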

[0060] With respect to the degree of substantial similarity between the target (endogenous) mRNA and the RNA region in the construct having homology to the target mRNA, such sequences should be at least 25 nucleotides in length, preferably at least 50 nucleotides in length, more preferably at least 100 nucleotides in length, again more preferably at least 200 nucleotides in length, and most preferably at least 300 nucleotides in length; and should be at least 80% identical, preferably at least 85% identical, more preferably at least 90% identical, and most preferably at least 95% identical.

[0061] Substantially similar nucleic acid fragments of the instant invention may also be characterized by the percent identity of the amino acid sequences that they encode to the amino acid sequences disclosed herein, as determined by algorithms commonly employed by those skilled in this art. Suitable nucleic acid fragments (isolated polynucleotides of the present invention) encode polypeptides that are at least 70% identical, preferably at least 80% identical to the amino acid sequences reported herein. Preferred nucleic acid fragments encode amino acid sequences that are at least 85% identical to the amino acid sequences reported herein. More preferred nucleic acid fragments encode amino acid sequences that are at least 90% identical to the amino acid sequences reported herein. Most preferred are nucleic acid fragments that encode amino acid sequences that are at least 95% identical to the amino acid sequences reported herein.

[0062] It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying related polypeptide sequences. Useful examples of percent identities are 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100%.

[0063] Sequence alignments and percent similarity calculations may be determined using a variety of comparison methods designed to detect homologous sequences including, but not limited to, the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Unless stated otherwise, multiple alignment of the sequences provided herein was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4. After alignment of the sequences, using the Clustal V program, it is possible to obtain a "percent identity" by viewing the "sequence distances" table in the same program.

[0064] Unless otherwise stated, "BLAST" sequence identity/similarity values provided herein refer to the value obtained using the BLAST 2.0 suite of programs using default parameters (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)). Software for performing BLAST analyses is publicly available, e.g., through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. 
For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
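
The seed-and-extend procedure described above can be sketched, in simplified form, for nucleotide sequences. The following Python sketch is not the BLAST implementation: it assumes exact word matches as seeds (no neighborhood threshold T), performs ungapped extension only, and uses a drop-off value X of 20 chosen solely for illustration. The function names find_seeds and extend_hit are hypothetical; M=5 and N=-4 follow the BLASTN defaults quoted above.

```python
from typing import Dict, List, Tuple


def find_seeds(query: str, subject: str, word_len: int = 11) -> List[Tuple[int, int]]:
    """Return (query_pos, subject_pos) pairs where a word of length W matches exactly."""
    index: Dict[str, List[int]] = {}
    for j in range(len(subject) - word_len + 1):
        index.setdefault(subject[j:j + word_len], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - word_len + 1)
        for j in index.get(query[i:i + word_len], [])
    ]


def _extend(query, subject, q_pos, s_pos, step, base, match, mismatch, x_drop):
    """Extend in one direction; return (score gained, residues added) at the best point."""
    score = best = base
    steps = best_steps = 0
    while 0 <= q_pos < len(query) and 0 <= s_pos < len(subject):
        score += match if query[q_pos] == subject[s_pos] else mismatch
        steps += 1
        if score > best:
            best, best_steps = score, steps
        elif best - score >= x_drop or score <= 0:
            break  # fell off by X from the maximum, or cumulative score reached zero
        q_pos += step
        s_pos += step
    return best - base, best_steps


def extend_hit(query: str, subject: str, q_start: int, s_start: int, word_len: int,
               match: int = 5, mismatch: int = -4, x_drop: int = 20) -> Tuple[int, int, int]:
    """Extend a seed in both directions; return (score, query_start, query_end)."""
    seed_score = word_len * match  # the seed is an exact match
    right_gain, right_len = _extend(query, subject, q_start + word_len, s_start + word_len,
                                    +1, seed_score, match, mismatch, x_drop)
    left_gain, left_len = _extend(query, subject, q_start - 1, s_start - 1,
                                  -1, seed_score + right_gain, match, mismatch, x_drop)
    total = seed_score + right_gain + left_gain
    return total, q_start - left_len, q_start + word_len + right_len
```

The returned query_end is exclusive, so the extended segment of the query is query[query_start:query_end].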

[0065] "Sequence identity" or "identity" in the context of nucleic acid or polypeptide sequences refers to the nucleic acid bases or amino acid residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window.

[0066] Thus, "Percentage of sequence identity" refers to the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the results by 100 to yield the percentage of sequence identity. Useful examples of percent sequence identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100%. These identities can be determined using any of the programs described herein.
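
The calculation described above can be sketched as follows, assuming the two sequences have already been aligned to equal length and that the whole alignment serves as the comparison window. The function name percent_identity and the use of "-" as the gap character are assumptions made for this illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an alignment: matched positions / total positions * 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A position counts as matched only if both sequences carry the same residue there.
    matched = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matched / len(aligned_a)
```

For the alignment of ACGT-A with ACCTTA, four of the six positions match, giving approximately 66.7% identity.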

[0067] Sequence alignments and percent identity or similarity calculations may be determined using a variety of comparison methods designed to detect homologous sequences including, but not limited to, the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences is performed using the Clustal V method of alignment (Higgins, D. G. and Sharp, P. M. (1989) Comput. Appl. Biosci. 5:151-153; Higgins, D. G. et al. (1992) Comput. Appl. Biosci. 8:189-191) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4.

[0068] It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying polypeptides, from other plant species, wherein such polypeptides have the same or similar function or activity. Useful examples of percent identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100%. Indeed, any integer amino acid identity from 50%-100% may be useful in describing the present invention. Also, of interest is any full or partial complement of this isolated nucleotide fragment.

[0069] The term "recombinant" means, for example, that a nucleic acid sequence is made by an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis or by the manipulation of isolated nucleic acids by genetic engineering techniques.

[0070] As used herein, "contig" refers to a nucleotide sequence that is assembled from two or more constituent nucleotide sequences that share common or overlapping regions of sequence homology. For example, the nucleotide sequences of two or more nucleic acid fragments can be compared and aligned in order to identify common or overlapping sequences. Where common or overlapping sequences exist between two or more nucleic acid fragments, the sequences (and thus their corresponding nucleic acid fragments) can be assembled into a single contiguous nucleotide sequence.
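
As a minimal sketch of the assembly step described above, two fragments can be joined into a single contiguous sequence when the end of one matches the start of the other. The function merge_overlap and the minimum overlap of three nucleotides are hypothetical choices for illustration; the sketch requires an exact overlap, whereas practical assembly software tolerates mismatches and handles many fragments at once.

```python
from typing import Optional


def merge_overlap(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two fragments if a suffix of `left` equals a prefix of `right`.

    Returns the merged contig using the longest exact overlap, or None if no
    overlap of at least `min_overlap` nucleotides exists.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None
```

For example, ATGCCGTA and CGTATTGA share the four-nucleotide overlap CGTA and merge into ATGCCGTATTGA.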

[0071] "Codon degeneracy" refers to divergence in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment comprising a nucleotide sequence that encodes all or a substantial portion of the amino acid sequences set forth herein. The skilled artisan is well aware of the "codon-bias" exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a nucleic acid fragment for improved expression in a host cell, it is desirable to design the nucleic acid fragment such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.

[0072] The terms "synthetic nucleic acid" or "synthetic genes" refer to nucleic acid molecules assembled either in whole or in part from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form larger nucleic acid fragments which may then be enzymatically assembled to construct the entire desired nucleic acid fragment. "Chemically synthesized", as related to a nucleic acid fragment, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of nucleic acid fragments may be accomplished using well established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the nucleic acid fragments can be tailored for optimal gene expression based on optimization of the nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
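The idea of tailoring a synthetic sequence to the codon bias of a host can be sketched in Python as follows. The usage counts in `EXAMPLE_USAGE` are invented for illustration and do not describe any real organism, and the names `preferred_codons` and `back_translate` are hypothetical. Always choosing the single most frequent codon is a simplification; matching the overall codon frequency distribution of the host is another common approach.

```python
# Hypothetical codon usage counts for an imaginary host:
# codon -> (one-letter amino acid, observed count). Invented numbers.
EXAMPLE_USAGE = {
    "GCT": ("A", 30), "GCC": ("A", 55), "GCA": ("A", 10), "GCG": ("A", 5),
    "AAA": ("K", 20), "AAG": ("K", 80),
    "ATG": ("M", 100),
    "TTT": ("F", 35), "TTC": ("F", 65),
}


def preferred_codons(usage):
    """Return the most frequently used codon for each amino acid."""
    best = {}
    for codon, (amino_acid, count) in usage.items():
        if amino_acid not in best or count > best[amino_acid][1]:
            best[amino_acid] = (codon, count)
    return {amino_acid: codon for amino_acid, (codon, _) in best.items()}


def back_translate(protein, usage):
    """Encode a protein using the host's preferred codon for each residue."""
    table = preferred_codons(usage)
    missing = sorted(set(protein) - set(table))
    if missing:
        raise KeyError(f"no codon usage data for residues: {missing}")
    return "".join(table[amino_acid] for amino_acid in protein)
```

Because of codon degeneracy, the resulting nucleotide sequence differs from other possible encodings while specifying the same amino acid sequence.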

[0073] "Gene" refers to a nucleic acid fragment that is capable of directing expression of a specific protein or functional RNA.

[0074] "Native gene" refers to a gene as found in nature with its own regulatory sequences.

[0075] "Chimeric gene" or "recombinant DNA construct" are used interchangeably herein, and refer to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature, or to an isolated native gene optionally modified and reintroduced into a host cell.

[0076] A chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. In one embodiment, a regulatory region and a coding sequence region are assembled from two different sources. In another embodiment, a regulatory region and a coding sequence region are derived from the same source but arranged in a manner different than that found in nature. In another embodiment, the coding sequence region is assembled from at least two different sources. In another embodiment, the coding region is assembled from the same source but in a manner not found in nature.

[0077] The term "endogenous gene" refers to a native gene in its natural location in the genome of an organism.

[0078] The term "foreign gene" refers to a gene not normally found in the host organism that is introduced into the host organism by gene transfer.

[0079] The term "transgene" refers to a gene that has been introduced into a host cell by a transformation procedure. Transgenes may become physically inserted into a genome of the host cell (e.g., through recombination) or may be maintained outside of a genome of the host cell (e.g., on an extrachromosomal array).

[0080] An "allele" is one of several alternative forms of a gene occupying a given locus on a chromosome. When the alleles present at a given locus on a pair of homologous chromosomes in a diploid plant are the same, that plant is homozygous at that locus. If the alleles present at a given locus on a pair of homologous chromosomes in a diploid plant differ, that plant is heterozygous at that locus. If a transgene is present on only one of a pair of homologous chromosomes in a diploid plant, that plant is hemizygous at that locus.
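The three cases in this definition can be expressed as a small classification function. This is a sketch only: the function name `zygosity`, the use of `None` to mark a homologous chromosome that carries nothing at the locus (for example, no transgene insertion), and the extra "absent" result are assumptions introduced for the example.

```python
from typing import Optional


def zygosity(allele_1: Optional[str], allele_2: Optional[str]) -> str:
    """Classify one locus in a diploid plant.

    Each argument names the allele on one homologous chromosome; None
    means that chromosome carries no allele at the locus (assumption
    used here to model a transgene present on one homolog only).
    """
    if allele_1 is None and allele_2 is None:
        return "absent"
    if allele_1 is None or allele_2 is None:
        return "hemizygous"
    if allele_1 == allele_2:
        return "homozygous"
    return "heterozygous"
```

A primary transformant with a single insertion would be classified as hemizygous under this model, while selfed progeny carrying the insertion on both homologs would be classified as homozygous.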

[0081] The term "coding sequence" refers to a DNA fragment that codes for a polypeptide having a specific amino acid sequence, or a structural RNA. The boundaries of a protein coding sequence are generally determined by a ribosome binding site (prokaryotes) or by an ATG start codon (eukaryotes) located at the 5' end of the mRNA and a transcription terminator sequence located just downstream of the open reading frame at the 3' end of the mRNA. A coding sequence can include, but is not limited to, DNA, cDNA, and recombinant nucleic acid sequences.

[0082] "Mature" protein refers to a post-translationally processed polypeptide; i.e., one from which any pre- or pro-peptides present in the primary translation product have been removed. "Precursor" protein refers to the primary product of translation of mRNA; i.e., with pre- and pro-peptides still present. Pre- and pro-peptides may be, but are not limited to, intracellular localization signals.

[0083] "RNA transcript" refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript. An RNA sequence derived from post-transcriptional processing of the primary transcript is referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the RNA that is without introns and that can be translated into protein by the cell. "cDNA" refers to a DNA that is complementary to and synthesized from an mRNA template using the enzyme reverse transcriptase. The cDNA can be single-stranded or converted into the double-stranded form using the Klenow fragment of DNA polymerase I. "Sense" RNA refers to RNA transcript that includes the mRNA and can be translated into protein within a cell or in vitro. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target isolated nucleic acid fragment (U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has an effect on cellular processes. The terms "complement" and "reverse complement" are used interchangeably herein with respect to mRNA transcripts, and are meant to define the antisense RNA of the message.

[0084] The term "endogenous RNA" refers to any RNA which is encoded by any nucleic acid sequence present in the genome of the host prior to transformation with the recombinant construct of the present invention, whether naturally-occurring or non-naturally occurring, i.e., introduced by recombinant means, mutagenesis, etc.

[0085] The term "non-naturally occurring" means artificial, not consistent with what is normally found in nature.

[0086] "Messenger RNA (mRNA)" refers to the RNA that is without introns and that can be translated into protein by the cell.

[0087] "cDNA" refers to a DNA that is complementary to and synthesized from an mRNA template using the enzyme reverse transcriptase. The cDNA can be single-stranded or converted into the double-stranded form using the Klenow fragment of DNA polymerase I.

[0088] "Sense" RNA refers to RNA transcript that includes the mRNA and can be translated into protein within a cell or in vitro.

[0089] "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA, and that blocks the expression of a target gene (U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence.

[0090] "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may not be translated, yet has an effect on cellular processes. The terms "complement" and "reverse complement" are used interchangeably herein with respect to mRNA transcripts, and are meant to define the antisense RNA of the message.
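The relationship between an mRNA and its antisense RNA can be illustrated with a short Python sketch that computes the reverse complement of an RNA sequence, written 5' to 3'. The function name `reverse_complement_rna` is hypothetical, and the sketch accepts only the four unmodified bases A, C, G and U.

```python
# Translation table mapping each RNA base to its Watson-Crick partner.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(mrna: str) -> str:
    """Return the antisense sequence of an mRNA, read 5' to 3'."""
    seq = mrna.upper()
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters in RNA sequence: {sorted(unexpected)}")
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return seq.translate(_RNA_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a convenient sanity check when designing sense and antisense fragments.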

[0091] The term "recombinant DNA construct" refers to a DNA construct assembled from nucleic acid fragments obtained from different sources. The types and origins of the nucleic acid fragments may be very diverse.

[0092] A "recombinant expression construct" contains a nucleic acid fragment operably linked to at least one regulatory element, that is capable of effecting expression of the nucleic acid fragment. The recombinant expression construct may also affect expression of a homologous sequence in a host cell.

[0093] In one embodiment the choice of recombinant expression construct is dependent upon the method that will be used to transform host cells. The skilled artisan is well aware of the genetic elements that must be present on the recombinant expression construct in order to successfully transform, select and propagate host cells. The skilled artisan will also recognize that different independent transformation events may be screened to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by, but is not limited to, Southern analysis of DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.

[0094] The term "operably linked" refers to the association of nucleic acid fragments on a single nucleic acid fragment so that the function of one is regulated by the other. For example, a promoter is operably linked with a coding sequence when it is capable of regulating the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in a sense or antisense orientation. In another example, the complementary RNA regions of the invention can be operably linked, either directly or indirectly, 5' to the target mRNA, or 3' to the target mRNA, or within the target mRNA, or a first complementary region is 5' and its complement is 3' to the target mRNA.

[0095] "Regulatory sequences" refer to nucleotides located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which may influence the transcription, RNA processing, stability, or translation of the associated coding sequence. Regulatory sequences may include, and are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.

[0096] "Promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoter sequences can also be located within the transcribed portions of genes, and/or downstream of the transcribed sequences. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of an isolated nucleic acid fragment in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause an isolated nucleic acid fragment to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg, (1989) Biochemistry of Plants 15:1-82. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of some variation may have identical promoter activity.

[0097] Specific examples of promoters that may be useful in expressing the nucleic acid fragments of the invention include, but are not limited to, the oleosin promoter (PCT Publication WO99/65479, published Dec. 12, 1999), the maize 27 kD zein promoter (Ueda et al (1994) Mol. Cell. Biol. 14:4350-4359), the ubiquitin promoter (Christensen et al (1992) Plant Mol. Biol. 18:675-680), the SAM synthetase promoter (PCT Publication WO00/37662, published Jun. 29, 2000), the CaMV 35S (Odell et al (1985) Nature 313:810-812), and the promoter described in PCT Publication WO02/099063 published Dec. 12, 2002.

[0098] The "translation leader sequence" refers to a polynucleotide fragment located between the promoter of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner, R. and Foster, G. D. (1995) Mol. Biotechnol. 3:225-236).

[0099] An "intron" is an intervening sequence in a gene that does not encode a portion of the protein sequence. Thus, such sequences are transcribed into RNA but are then excised and are not translated. The term is also used for the excised RNA sequences.
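Excision of introns from a transcript can be sketched as removal of known intervals from a string. The function name `splice` and the representation of introns as zero-based, half-open `(start, end)` coordinate pairs are assumptions made for this example; the sketch does not predict splice sites.

```python
def splice(primary_transcript: str, introns):
    """Remove introns and join the remaining exons.

    `introns` is an iterable of (start, end) pairs using zero-based,
    half-open coordinates on the primary transcript.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or start >= end or end > len(primary_transcript):
            raise ValueError("intron intervals must be in range and non-overlapping")
        exons.append(primary_transcript[position:start])  # exon before this intron
        position = end
    exons.append(primary_transcript[position:])  # final exon
    return "".join(exons)
```

For example, `splice("AUGGUAAGUCAGGCC", [(3, 12)])` removes the nine-nucleotide intervening sequence and joins the two flanking exons.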

[0100] The "3' non-coding sequences" refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is exemplified by Ingelbrecht, I. L., et al. (1989) Plant Cell 1:671-680.

[0101] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989. Transformation methods are well known to those skilled in the art and are described below.

[0102] "PCR" or "Polymerase Chain Reaction" is a technique for the synthesis of large quantities of specific DNA segments and consists of a series of repetitive cycles (Perkin Elmer Cetus Instruments, Norwalk, Conn.). Typically, the double stranded DNA is heat denatured, the two primers complementary to the 3' boundaries of the target segment are annealed at low temperature and then extended at an intermediate temperature. One set of these three consecutive steps is referred to as a cycle.
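The effect of repeating the denature, anneal and extend cycle can be approximated with a simple growth model. The following Python sketch assumes idealized exponential amplification in which each cycle multiplies the number of target copies by (1 + efficiency); the function name `pcr_copies` and the `efficiency` parameter are hypothetical, and real reactions plateau as reagents are consumed.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency = 1.0 models perfect doubling per cycle; lower values
    model incomplete primer annealing or extension.
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles
```

Under perfect doubling, a single template molecule yields 2 to the power of 30 copies after thirty cycles, which is why the technique produces large quantities of a specific segment from very little starting material.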

[0103] "Stable transformation" refers to the transfer of a nucleic acid fragment into a genome of a host organism, including nuclear and organellar genomes, resulting in genetically stable inheritance.

[0104] In contrast, "transient transformation" refers to the transfer of a nucleic acid fragment into the nucleus, or DNA-containing organelle, of a host organism resulting in gene expression without integration or stable inheritance.

[0105] Host organisms comprising the transformed nucleic acid fragments are referred to as "transgenic" organisms.

[0106] The term "amplified" means the construction of multiple copies of a nucleic acid sequence or multiple copies complementary to the nucleic acid sequence using at least one of the nucleic acid sequences as a template. Amplification systems include the polymerase chain reaction (PCR) system, ligase chain reaction (LCR) system, nucleic acid sequence based amplification (NASBA, Cangene, Mississauga, Ontario), Q-Beta Replicase systems, transcription-based amplification system (TAS), and strand displacement amplification (SDA). See, e.g., Diagnostic Molecular Microbiology: Principles and Applications, D. H. Persing et al., Ed., American Society for Microbiology, Washington, D.C. (1993). The product of amplification is termed an amplicon.

[0107] The term "chromosomal location" includes reference to a length of a chromosome which may be measured by reference to the linear segment of DNA which it comprises. The chromosomal location can be defined by reference to two unique DNA sequences, i.e., markers.

[0108] The term "marker" includes reference to a locus on a chromosome that serves to identify a unique position on the chromosome. A "polymorphic marker" includes reference to a marker which appears in multiple forms (alleles) such that different forms of the marker, when they are present in a homologous pair, allow transmission of each of the chromosomes in that pair to be followed. A genotype may be defined by use of one or a plurality of markers.

[0109] The present invention includes, inter alia, compositions and methods for altering or modulating (i.e., increasing or decreasing) the level of ORM polypeptides described herein in plants. The size of the oil, protein, starch and soluble carbohydrate pools in soybean seeds can be modulated or altered (i.e. increased or decreased) by altering the expression of a specific gene, encoding ORM protein.

[0110] In one embodiment, the present invention concerns a transgenic plant comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein seed obtained from said transgenic plant has an altered oil, protein, starch and/or soluble carbohydrate content when compared to seed obtained from a control plant not comprising said recombinant DNA construct.

[0111] In a second embodiment the present invention concerns a transgenic seed obtained from the transgenic plant comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein said transgenic seed has an altered oil, protein, starch and/or soluble carbohydrate content when compared to a control plant not comprising said recombinant DNA construct.

[0112] In a third embodiment the present invention concerns a transgenic seed obtained from the transgenic plant comprising a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 and wherein said transgenic seed has an increased starch content of at least 0.5%, 1%, 1.5%, 2%, 2.5%, 3.0%, 3.5%, 4.0%, 4.5%, 5.0%, 5.5%, 6.0%, 6.5%, 7.0%, 7.5%, 8.0%, 8.5%, 9.0%, 9.5%, 10.0%, 10.5%, 11%, 11.5%, 12.0%, 12.5%, 13.0%, 13.5%, 14.0%, 14.5%, 15.0%, 15.5%, 16.0%, 16.5%, 17.0%, 17.5%, 18.0%, 18.5%, 19.0%, 19.5%, 20.0%, 20.5%, 21.0%, 21.5%, 22.0%, 22.5%, 23.0%, 23.5%, 24.0%, 24.5%, 25.0%, 25.5%, 26.0%, 26.5%, 27.0%, 27.5%, 28.0%, 28.5%, 29%, 29.5%, 30.0%, 30.5%, 31.0%, 31.5%, 32.0%, 32.5%, 33.0%, 33.5%, 34.0%, 34.5%, 35.0%, 35.5%, 36.0%, 36.5%, 37.0%, 37.5%, 38.0%, 38.5%, 39.0%, 39.5%, 40.0%, 40.5%, 41.0%, 41.5%, 42.0%, 42.5%, 43.0%, 43.5%, 44.0%, 44.5%, 45.0%, 45.5%, 46.0%, 46.5%, 47.0%, 47.5%, 48.0%, 48.5%, 49.0%, 49.5%, or 50.0% on a dry weight basis when compared to a control seed not comprising said recombinant DNA construct.

[0113] In another embodiment, the present invention relates to a recombinant DNA construct comprising any of the isolated polynucleotides of the present invention operably linked to at least one regulatory sequence.

[0114] In another embodiment of the present invention, a recombinant construct of the present invention further comprises an enhancer.

[0115] In another embodiment, the present invention relates to a vector comprising any of the polynucleotides of the present invention.

[0116] In another embodiment, the present invention relates to an isolated polynucleotide fragment comprising a nucleotide sequence comprised by any of the polynucleotides of the present invention, wherein the nucleotide sequence contains at least 30, 40, 60, 100, 200, 300, 400, 500 or 600 nucleotides.

[0117] In another embodiment, the present invention relates to a method for transforming a cell comprising transforming a cell with any of the isolated polynucleotides of the present invention, and the cell transformed by this method. Advantageously, the cell is eukaryotic, e.g., a yeast or plant cell, or prokaryotic, e.g., a bacterium.

[0118] In yet another embodiment, the present invention relates to a method for transforming a cell, comprising transforming a cell with a polynucleotide of the present invention.

[0119] In another embodiment, the present invention relates to a method for producing a transgenic plant comprising transforming a plant cell with any of the isolated polynucleotides of the present invention and regenerating a transgenic plant from the transformed plant cell.

[0120] In another embodiment, a cell, plant, or seed comprising a recombinant DNA construct of the present invention.

[0121] In another embodiment, an isolated polynucleotide comprising: (i) a nucleic acid sequence encoding a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; or (ii) a full complement of the nucleic acid sequence of (i), wherein the full complement and the nucleic acid sequence of (i) consist of the same number of nucleotides and are 100% complementary. Any of the foregoing isolated polynucleotides may be utilized in any recombinant DNA constructs (including suppression DNA constructs) of the present invention. Preferably the polypeptide is an ORM protein.

[0122] In another embodiment, an isolated polynucleotide comprising: (i) a nucleic acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) a full complement of the nucleic acid sequence of (i). Any of the foregoing isolated polynucleotides may be utilized in any recombinant DNA constructs (including suppression DNA constructs) of the present invention. Preferably, the polypeptide is an ORM protein.

[0123] In one aspect, the present invention includes recombinant DNA constructs (including suppression DNA constructs).

[0124] In another embodiment, the present invention relates to a method of selecting an isolated polynucleotide that alters, i.e. increases or decreases, the level of expression of an ORM protein gene, protein or enzyme activity in a host cell, preferably a plant cell, the method comprising the steps of: (a) constructing an isolated polynucleotide of the present invention or an isolated recombinant DNA construct of the present invention; (b) introducing the isolated polynucleotide or the isolated recombinant DNA construct into a host cell; (c) measuring the level of the ORM protein RNA, protein or enzyme activity in the host cell containing the isolated polynucleotide or recombinant DNA construct; (d) comparing the level of the ORM protein RNA, protein or enzyme activity in the host cell containing the isolated polynucleotide or recombinant DNA construct with the level of the ORM protein RNA, protein or enzyme activity in a host cell that does not contain the isolated polynucleotide or recombinant DNA construct, and selecting the isolated polynucleotide or recombinant DNA construct that alters, i.e., increases or decreases, the level of expression of the ORM protein gene, protein or enzyme activity in the plant cell.

[0125] In another embodiment, this invention concerns a method for suppressing the level of expression of a gene encoding an ORM protein having ORM protein activity in a transgenic plant, wherein the method comprises: (a) transforming a plant cell with a fragment of the isolated polynucleotide of the invention; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant wherein the level of expression of a gene encoding a polypeptide having ORM protein activity has been suppressed.

[0126] Preferably, the gene encodes a polypeptide having ORM protein activity, and the plant is a soybean plant.

[0127] In another embodiment, the invention concerns a method for producing transgenic seed, the method comprising: (a) transforming a plant cell with the recombinant DNA construct of (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114, or (ii) the complement of (i); wherein (i) or (ii) is useful in co-suppression or antisense suppression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces transgenic seeds having an increase in oil content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% compared to seed obtained from a non-transgenic plant. Preferably, the seed is a soybean seed.

[0128] In another embodiment, a plant comprising in its genome a recombinant DNA construct comprising: (a) a polynucleotide operably linked to at least one regulatory element, wherein said polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117 or (b) a suppression DNA construct comprising at least one regulatory element operably linked to: (i) all or part of: (A) a nucleic acid sequence encoding a polypeptide having an amino acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117, or (B) a full complement of the nucleic acid sequence of (b)(i)(A); or (ii) a region derived from all or part of a sense strand or antisense strand of a target gene of interest, said region having a nucleic acid sequence of at least 70% sequence identity, based on the Clustal V method of alignment, when compared to said all or part of a sense strand or antisense strand from which said region is derived, and wherein said target gene of interest encodes an ORM protein, and wherein said plant has an altered oil, protein, starch and/or soluble carbohydrate content, when compared to a control plant not comprising said recombinant DNA construct.

[0129] A transgenic seed having an increased oil content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% when compared to the oil content of a non-transgenic seed, wherein said transgenic seed comprises a recombinant DNA construct comprising: (a) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114;

or (b) the full-length complement of (a): wherein (a) or (b) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant and further wherein said seed has an increase in oil content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% on a dry-weight basis, as compared to seed obtained from a non-transgenic plant.

[0130] Yet another embodiment of the invention concerns a transgenic seed comprising a recombinant DNA construct comprising:

[0131] (a) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (b) the full-length complement of (a):

wherein (a) or (b) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant and further wherein said seed has an increase in oil content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% on a dry-weight basis, as compared to seed obtained from a non-transgenic plant.

[0132] In another embodiment, the invention concerns a method for producing a transgenic plant, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and (b) regenerating a plant from the transformed plant cell.

[0133] Another embodiment of the invention concerns a method for producing transgenic seeds, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a seed obtained from a non-transgenic plant.

[0134] Another embodiment of the invention concerns a method for producing transgenic seeds, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising a polynucleotide operably linked to at least one regulatory sequence, wherein the polynucleotide encodes a polypeptide having an amino acid sequence of at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, based on the Clustal V method of alignment, when compared to SEQ ID NO: 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 64, 66, 67, 69, 70, 102, 104, 105, 106, 108, 110, 113, 116, or 117; and (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an increased starch content of at least 0.5%, 1%, 1.5%, 2%, 2.5%, 3.0%, 3.5%, 4.0%, 4.5%, 5.0%, 5.5%, 6.0%, 6.5%, 7.0%, 7.5%, 8.0%, 8.5%, 9.0%, 9.5%, 10.0%, 10.5%, 11%, 11.5%, 12.0%, 12.5%, 13.0%, 13.5%, 14.0%, 14.5%, 15.0%, 15.5%, 16.0%, 16.5%, 17.0%, 17.5%, 18.0%, 18.5%, 19.0%, 19.5%, 20.0%, 20.5%, 21.0%, 21.5%, 22.0%, 22.5%, 23.0%, 23.5%, 24.0%, 24.5%, 25.0%, 25.5%, 26.0%, 26.5%, 27.0%, 27.5%, 28.0%, 28.5%, 29%, 29.5%, 30.0%, 30.5%, 31.0%, 31.5%, 32.0%, 32.5%, 33.0%, 33.5%, 34.0%, 34.5%, 35.0%, 35.5%, 36.0%, 36.5%, 37.0%, 37.5%, 38.0%, 38.5%, 39.0%, 39.5%, 40.0%, 40.5%, 41.0%, 41.5%, 42.0%, 42.5%, 43.0%, 43.5%, 44.0%, 44.5%, 45.0%, 45.5%, 46.0%, 46.5%, 47.0%, 47.5%, 48.0%, 48.5%, 49.0%, 49.5%, or 50.0% on a dry weight basis as compared to a seed obtained from a non-transgenic plant.

[0135] In another embodiment, the invention concerns a method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an altered oil, protein, starch and/or soluble carbohydrate content, as compared to a transgenic seed obtained from a non-transgenic plant.

[0136] A method for producing transgenic seed, the method comprising: (a) transforming a plant cell with a recombinant DNA construct comprising: (i) all or part of the nucleotide sequence set forth in SEQ ID NO: 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 68, 101, 103, 107, 109, 111, or 114; or (ii) the full-length complement of (i); wherein (i) or (ii) is of sufficient length to inhibit expression of endogenous ORM protein activity in a transgenic plant; (b) regenerating a transgenic plant from the transformed plant cell of (a); and (c) selecting a transgenic plant that produces a transgenic seed having an increase in oil content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30%, on a dry-weight basis, as compared to a transgenic seed obtained from a non-transgenic plant.

[0137] Soybeans can be processed into a number of products. For example, "soy protein products" can include, and are not limited to, those items listed in Table 2.

TABLE 2
Soy Protein Products Derived from Soybean Seeds (a)

Whole Soybean Products
    Roasted Soybeans
    Baked Soybeans
    Soy Sprouts
    Soy Milk

Specialty Soy Foods/Ingredients
    Soy Milk
    Tofu
    Tempeh
    Miso
    Soy Sauce
    Hydrolyzed Vegetable Protein
    Whipping Protein

Processed Soy Protein Products
    Full Fat and Defatted Flours
    Soy Grits
    Soy Hypocotyls
    Soybean Meal
    Soy Milk
    Soy Protein Isolates
    Soy Protein Concentrates
    Textured Soy Proteins
    Textured Flours and Concentrates
    Textured Concentrates
    Textured Isolates

(a) See Soy Protein Products: Characteristics, Nutritional Aspects and Utilization (1987). Soy Protein Council.

[0138] "Processing" refers to any physical and chemical methods used to obtain the products listed in Table 2 and includes, and is not limited to, heat conditioning, flaking and grinding, extrusion, solvent extraction, or aqueous soaking and extraction of whole or partial seeds. Furthermore, "processing" includes the methods used to concentrate and isolate soy protein from whole or partial seeds, as well as the various traditional Oriental methods of preparing fermented soy food products. Trading Standards and Specifications have been established for many of these products (see National Oilseed Processors Association Yearbook and Trading Rules 1991-1992).

[0139] "White" flakes refer to flaked, dehulled cotyledons that have been defatted and treated with controlled moist heat to have a PDI (protein dispersibility index; AOCS: Ba 10-65) of about 85 to 90. This term can also refer to a flour with a similar PDI that has been ground to pass through a No. 100 U.S. Standard Screen size.

[0140] "Grits" refer to defatted, dehulled cotyledons having a U.S. Standard screen size of between No. 10 and 80.

[0141] "Soy Protein Concentrates" refer to those products produced from dehulled, defatted soybeans by three basic processes: acid leaching (at about pH 4.5), extraction with alcohol (about 55-80%), and denaturing the protein with moist heat prior to extraction with water. Conditions typically used to prepare soy protein concentrates have been described by Pass ((1975) U.S. Pat. No. 3,897,574; Campbell et al., (1985) in New Protein Foods, ed. by Altschul and Wilcke, Academic Press, Vol. 5, Chapter 10, Seed Storage Proteins, pp 302-338).

[0142] "Extrusion" refers to processes whereby material (grits, flour or concentrate) is passed through a jacketed auger using high pressures and temperatures as a means of altering the texture of the material. "Texturing" and "structuring" refer to extrusion processes used to modify the physical characteristics of the material. The characteristics of these processes, including thermoplastic extrusion, have been described previously (Atkinson (1970) U.S. Pat. No. 3,488,770, Horan (1985) In New Protein Foods, ed. by Altschul and Wilcke, Academic Press, Vol. 1A, Chapter 8, pp 367-414). Moreover, conditions used during extrusion processing of complex foodstuff mixtures that include soy protein products have been described previously (Rokey (1983) Feed Manufacturing Technology III, 222-237; McCulloch, U.S. Pat. No. 4,454,804).

TABLE 3
Generalized Steps for Soybean Oil and Byproduct Production

Process                                    Impurities Removed and/or
Step      Process                          By-Products Obtained
# 1       soybean seed
# 2       oil extraction                   meal
# 3       degumming                        lecithin
# 4       alkali or physical refining      gums, free fatty acids, pigments
# 5       water washing                    soap
# 6       bleaching                        color, soap, metal
# 7       (hydrogenation)
# 8       (winterization)                  stearine
# 9       deodorization                    free fatty acids, tocopherols, sterols, volatiles
# 10      oil products

[0143] More specifically, soybean seeds are cleaned, tempered, dehulled, and flaked, thereby increasing the efficiency of oil extraction. Oil extraction is usually accomplished by solvent (e.g., hexane) extraction but can also be achieved by a combination of physical pressure and/or solvent extraction. The resulting oil is called crude oil. The crude oil may be degummed by hydrating phospholipids and other polar and neutral lipid complexes that facilitate their separation from the nonhydrating, triglyceride fraction (soybean oil). The resulting lecithin gums may be further processed to make commercially important lecithin products used in a variety of food and industrial products as emulsification and release (i.e., antisticking) agents. Degummed oil may be further refined for the removal of impurities (primarily free fatty acids, pigments and residual gums). Refining is accomplished by the addition of a caustic agent that reacts with free fatty acid to form soap and hydrates phosphatides and proteins in the crude oil. Water is used to wash out traces of soap formed during refining. The soapstock byproduct may be used directly in animal feeds or acidulated to recover the free fatty acids. Color is removed through adsorption with a bleaching earth that removes most of the chlorophyll and carotenoid compounds. The refined oil can be hydrogenated, thereby resulting in fats with various melting properties and textures. Winterization (fractionation) may be used to remove stearine from the hydrogenated oil through crystallization under carefully controlled cooling conditions. Deodorization (principally via steam distillation under vacuum) is the last step and is designed to remove compounds which impart odor or flavor to the oil. Other valuable byproducts such as tocopherols and sterols may be removed during the deodorization process. 
Deodorized distillate containing these byproducts may be sold for production of natural vitamin E and other high-value pharmaceutical products. Refined, bleached, (hydrogenated, fractionated) and deodorized oils and fats may be packaged and sold directly or further processed into more specialized products. A more detailed reference to soybean seed processing, soybean oil production, and byproduct utilization can be found in Erickson, Practical Handbook of Soybean Processing and Utilization, The American Oil Chemists' Society and United Soybean Board (1995). Soybean oil is liquid at room temperature because it is relatively low in saturated fatty acids when compared with oils such as coconut, palm, palm kernel, and cocoa butter.

[0144] For example, plant and microbial oils containing polyunsaturated fatty acids (PUFAs) that have been refined and/or purified can be hydrogenated, thereby resulting in fats with various melting properties and textures. Many processed fats (including spreads, confectionary fats, hard butters, margarines, baking shortenings, etc.) require varying degrees of solidity at room temperature and can only be produced through alteration of the source oil's physical properties. This is most commonly achieved through catalytic hydrogenation.

[0145] Hydrogenation is a chemical reaction in which hydrogen is added to the unsaturated fatty acid double bonds with the aid of a catalyst such as nickel. For example, high oleic soybean oil contains unsaturated oleic, linoleic, and linolenic fatty acids, and each of these can be hydrogenated. Hydrogenation has two primary effects. First, the oxidative stability of the oil is increased as a result of the reduction of the unsaturated fatty acid content. Second, the physical properties of the oil are changed because the fatty acid modifications increase the melting point resulting in a semi-liquid or solid fat at room temperature.

[0146] There are many variables which affect the hydrogenation reaction, which in turn alter the composition of the final product. Operating conditions including pressure, temperature, catalyst type and concentration, agitation, and reactor design are among the more important parameters that can be controlled. Selective hydrogenation conditions can be used to hydrogenate the more unsaturated fatty acids in preference to the less unsaturated ones. Very light or brush hydrogenation is often employed to increase stability of liquid oils. Further hydrogenation converts a liquid oil to a physically solid fat. The degree of hydrogenation depends on the desired performance and melting characteristics designed for the particular end product. Liquid shortenings (used in the manufacture of baking products, solid fats and shortenings used for commercial frying and roasting operations) and base stocks for margarine manufacture are among the myriad of possible oil and fat products achieved through hydrogenation. A more detailed description of hydrogenation and hydrogenated products can be found in Patterson, H. B. W., Hydrogenation of Fats and Oils: Theory and Practice. The American Oil Chemists' Society (1994).

[0147] Hydrogenated oils have become somewhat controversial due to the presence of trans-fatty acid isomers that result from the hydrogenation process. Ingestion of large amounts of trans-isomers has been linked with detrimental health effects including increased ratios of low density to high density lipoproteins in the blood plasma and increased risk of coronary heart disease.

[0148] In another embodiment, the invention concerns a transgenic seed produced by any of the above methods. Preferably, the seed is a soybean seed.

[0149] The present invention concerns a transgenic soybean seed having increased total fatty acid content of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% when compared to the total fatty acid content of a non-transgenic, null segregant soybean seed. It is understood that any measurable increase in the total fatty acid content of a transgenic versus a non-transgenic, null segregant would be useful. Such increases in the total fatty acid content would include, but are not limited to, at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30%.

[0150] Regulatory sequences may include, and are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.

[0151] "Tissue-specific" promoters direct RNA production preferentially in particular types of cells or tissues. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg (Biochemistry of Plants 15:1-82 (1989)). It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of some variation may have identical promoter activity.

[0152] A number of promoters can be used to practice the present invention. The promoters can be selected based on the desired outcome. The nucleic acids can be combined with constitutive, tissue-specific (preferred), inducible, or other promoters for expression in the host organism. Suitable constitutive promoters for use in a plant host cell include, for example, the core promoter of the Rsyn7 promoter and other constitutive promoters disclosed in WO 99/43838 and U.S. Pat. No. 6,072,050; the core CaMV 35S promoter (Odell et al., Nature 313:810-812 (1985)); rice actin (McElroy et al., Plant Cell 2:163-171 (1990)); ubiquitin (Christensen et al., Plant Mol. Biol. 12:619-632 (1989) and Christensen et al., Plant Mol. Biol. 18:675-689 (1992)); pEMU (Last et al., Theor. Appl. Genet. 81:581-588 (1991)); MAS (Velten et al., EMBO J. 3:2723-2730 (1984)); ALS promoter (U.S. Pat. No. 5,659,026), and the like. Other constitutive promoters include, for example, those discussed in U.S. Pat. Nos. 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785; 5,399,680; 5,268,463; 5,608,142; and 6,177,611.

[0153] In choosing a promoter to use in the methods of the invention, it may be desirable to use a tissue-specific or developmentally regulated promoter. A tissue-specific or developmentally regulated promoter is a DNA sequence which regulates the expression of a DNA sequence selectively in particular cells/tissues of a plant. Any identifiable promoter may be used in the methods of the present invention which causes the desired temporal and spatial expression.

[0154] Promoters which are seed or embryo specific and may be useful in the invention include patatin (potato tubers) (Rocha-Sosa, M., et al. (1989) EMBO J. 8:23-29), convicilin, vicilin, and legumin (pea cotyledons) (Rerie, W. G., et al. (1991) Mol. Gen. Genet. 259:149-157; Newbigin, E. J., et al. (1990) Planta 180:461-470; Higgins, T. J. V., et al. (1988) Plant Mol. Biol. 11:683-695), zein (maize endosperm) (Schernthaner, J. P., et al. (1988) EMBO J. 7:1249-1255), phaseolin (bean cotyledon) (Sengupta-Gopalan, C., et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:3320-3324), phytohemagglutinin (bean cotyledon) (Voelker, T. et al. (1987) EMBO J. 6:3571-3577), β-conglycinin and glycinin (soybean cotyledon) (Chen, Z-L, et al. (1988) EMBO J. 7:297-302), glutelin (rice endosperm), hordein (barley endosperm) (Marris, C., et al. (1988) Plant Mol. Biol. 10:359-366), glutenin and gliadin (wheat endosperm) (Colot, V., et al. (1987) EMBO J. 6:3559-3564), and sporamin (sweet potato tuberous root) (Hattori, T., et al. (1990) Plant Mol. Biol. 14:595-604). Promoters of seed-specific genes operably linked to heterologous coding regions in chimeric gene constructions maintain their temporal and spatial expression pattern in transgenic plants. Such examples include the Arabidopsis thaliana 2S seed storage protein gene promoter to express enkephalin peptides in Arabidopsis and Brassica napus seeds (Vandekerckhove et al., Bio/Technology 7:929-932 (1989)), bean lectin and bean beta-phaseolin promoters to express luciferase (Riggs et al., Plant Sci. 63:47-57 (1989)), and wheat glutenin promoters to express chloramphenicol acetyl transferase (Colot et al., EMBO J. 6:3559-3564 (1987)).

[0155] A plethora of promoters is described in WO 00/18963, published on Apr. 6, 2000, the disclosure of which is hereby incorporated by reference. Examples of seed-specific promoters include, and are not limited to, the promoter for soybean Kunitz trypsin inhibitor (Kti3, Jofuku and Goldberg, Plant Cell 1:1079-1093 (1989)), β-conglycinin (Chen et al., Dev. Genet. 10:112-122 (1989)), the napin promoter, and the phaseolin promoter.

[0156] In some embodiments, isolated nucleic acids which serve as promoter or enhancer elements can be introduced in the appropriate position (generally upstream) of a non-heterologous form of a polynucleotide of the present invention so as to up- or down-regulate expression of a polynucleotide of the present invention. For example, endogenous promoters can be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., PCT/US93/03868), or isolated promoters can be introduced into a plant cell in the proper orientation and distance from a cognate gene of a polynucleotide of the present invention so as to control the expression of the gene. Gene expression can be modulated under conditions suitable for plant growth so as to alter the total concentration and/or alter the composition of the polypeptides of the present invention in the plant cell. Thus, the present invention includes compositions, and methods for making, heterologous promoters and/or enhancers operably linked to a native, endogenous (i.e., non-heterologous) form of a polynucleotide of the present invention.

[0157] An intron sequence can be added to the 5' untranslated region or the coding sequence of the partial coding sequence to increase the amount of the mature message that accumulates in the cytosol. Inclusion of a spliceable intron in the transcription unit in both plant and animal expression constructs has been shown to increase gene expression at both the mRNA and protein levels up to 1000-fold (Buchman and Berg, Mol. Cell Biol. 8:4395-4405 (1988); Callis et al., Genes Dev. 1:1183-1200 (1987)). Such intron enhancement of gene expression is typically greatest when the intron is placed near the 5' end of the transcription unit. Use of the maize introns Adh1-S intron 1, 2, and 6, and the Bronze-1 intron, is known in the art. See generally, The Maize Handbook, Chapter 116, Freeling and Walbot, Eds., Springer, New York (1994). A vector comprising the sequences from a polynucleotide of the present invention will typically comprise a marker gene which confers a selectable phenotype on plant cells. Typical vectors useful for expression of genes in higher plants are well known in the art and include vectors derived from the tumor-inducing (Ti) plasmid of Agrobacterium tumefaciens described by Rogers et al., Meth. in Enzymol. 153:253-277 (1987).

[0158] If polypeptide expression is desired, it is generally desirable to include a polyadenylation region at the 3'-end of a polynucleotide coding region. The polyadenylation region can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The 3' end sequence to be added can be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.

[0159] Preferred recombinant DNA constructs include the following combinations: a) a nucleic acid fragment corresponding to a promoter operably linked to at least one nucleic acid fragment encoding a selectable marker, followed by a nucleic acid fragment corresponding to a terminator, b) a nucleic acid fragment corresponding to a promoter operably linked to a nucleic acid fragment capable of producing a stem-loop structure, and followed by a nucleic acid fragment corresponding to a terminator, and c) any combination of a) and b) above. Preferably, in the stem-loop structure at least one nucleic acid fragment that is capable of suppressing expression of a native gene comprises the "loop" and is surrounded by nucleic acid fragments capable of producing a stem.

[0160] Preferred methods for transforming dicots and obtaining transgenic plants have been published, among others, for cotton (U.S. Pat. No. 5,004,863, U.S. Pat. No. 5,159,135); soybean (U.S. Pat. No. 5,569,834, U.S. Pat. No. 5,416,011); Brassica (U.S. Pat. No. 5,463,174); peanut (Cheng et al. (1996) Plant Cell Rep. 15:653-657, McKently et al. (1995) Plant Cell Rep. 14:699-703); papaya (Ling, K. et al. (1991) Bio/technology 9:752-758); and pea (Grant et al. (1995) Plant Cell Rep. 15:254-258). For a review of other commonly used methods of plant transformation see Newell, C. A. (2000) Mol. Biotechnol. 16:53-65. One of these methods of transformation uses Agrobacterium rhizogenes (Tepfer, M. and Casse-Delbart, F. (1987) Microbiol. Sci. 4:24-28). Transformation of soybeans using direct delivery of DNA has been published using PEG fusion (PCT publication WO 92/17598), electroporation (Chowrira, G. M. et al. (1995) Mol. Biotechnol. 3:17-23; Christou, P. et al. (1987) Proc. Natl. Acad. Sci. U.S.A. 84:3962-3966), microinjection, or particle bombardment (McCabe, D. E. et al. (1988) Bio/Technology 6:923; Christou et al. (1988) Plant Physiol. 87:671-674).

[0161] There are a variety of methods for the regeneration of plants from plant tissue. The particular method of regeneration will depend on the starting plant tissue and the particular plant species to be regenerated. The regeneration, development and cultivation of plants from single plant protoplast transformants or from various transformed explants are well known in the art (Weissbach and Weissbach, (1988) In.: Methods for Plant Molecular Biology, (Eds.), Academic Press, Inc., San Diego, Calif.). This regeneration and growth process typically includes the steps of selection of transformed cells, culturing those individualized cells through the usual stages of embryonic development through the rooted plantlet stage. Transgenic embryos and seeds are similarly regenerated. The resulting transgenic rooted shoots are thereafter planted in an appropriate plant growth medium such as soil. The regenerated plants may be self-pollinated. Otherwise, pollen obtained from the regenerated plants is crossed to seed-grown plants of agronomically important lines. Conversely, pollen from plants of these important lines is used to pollinate regenerated plants. A transgenic plant of the present invention containing a desired polypeptide(s) is cultivated using methods well known to one skilled in the art.

[0162] In addition to the above discussed procedures, practitioners are familiar with the standard resource materials which describe specific conditions and procedures for the construction, manipulation and isolation of macromolecules (e.g., DNA molecules, plasmids, etc.), generation of recombinant DNA fragments and recombinant expression constructs and the screening and isolating of clones, (see for example, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press; Maliga et al. (1995) Methods in Plant Molecular Biology, Cold Spring Harbor Press; Birren et al. (1998) Genome Analysis: Detecting Genes, 1, Cold Spring Harbor, N.Y.; Birren et al. (1998) Genome Analysis: Analyzing DNA, 2, Cold Spring Harbor, N.Y.; Plant Molecular Biology: A Laboratory Manual, eds. Clark, Springer, New York (1997)).

[0163] Assays to detect proteins may be performed by SDS-polyacrylamide gel electrophoresis or immunological assays. Assays to detect levels of substrates or products of enzymes may be performed using gas chromatography or liquid chromatography for separation and UV or visible spectrometry or mass spectrometry for detection, or the like. Determining the levels of mRNA of the enzyme of interest may be accomplished using northern-blotting or RT-PCR techniques. Once plants have been regenerated, and progeny plants homozygous for the transgene have been obtained, plants will have a stable phenotype that will be observed in similar seeds in later generations.

[0164] In another aspect, this invention includes a polynucleotide of this invention or a functionally equivalent subfragment thereof useful in antisense inhibition or cosuppression of expression of nucleic acid sequences encoding proteins having cytosolic pyrophosphatase activity, most preferably in antisense inhibition or cosuppression of an endogenous ORM protein gene.

[0165] Protocols for antisense inhibition or co-suppression are well known to those skilled in the art.

[0166] The sequences of the polynucleotide fragments used for suppression do not have to be 100% identical to the sequences of the polynucleotide fragment found in the gene to be suppressed. For example, suppression of all the subunits of the soybean seed storage protein β-conglycinin has been accomplished using a polynucleotide derived from a portion of the gene encoding the α subunit (U.S. Pat. No. 6,362,399). β-conglycinin is a heterogeneous glycoprotein composed of varying combinations of three highly negatively charged subunits identified as α, α' and β. The polynucleotide sequences encoding the α and α' subunits are 85% identical to each other while the polynucleotide sequences encoding the β subunit are 75 to 80% identical to the α and α' subunits, respectively. Thus, polynucleotides that are at least 75% identical to a region of the polynucleotide that is the target for suppression have been shown to be effective in suppressing the desired target.

[0167] The polynucleotide may be at least 80% identical, at least 90% identical, at least 95% identical, or about 100% identical to the desired target sequence.
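As an illustration of how an identity threshold of this kind might be checked, the following is a minimal sketch only. It assumes the two sequences have already been aligned by some external tool and are supplied as equal-length strings with "-" marking gaps; it does not implement the Clustal V method referred to elsewhere in this document. The function names and the choice to count every column in which at least one sequence has a residue are assumptions made for the example, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a precomputed pairwise alignment.

    Assumption: columns where both sequences carry a gap are ignored; columns
    where only one sequence carries a gap count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # gap-gap column carries no information
        compared += 1
        if a == b:
            matches += 1  # both are the same residue (neither can be a gap here)
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, under these assumptions a candidate fragment aligned against a target would be compared with `meets_threshold(fragment, target, 80.0)`, using whichever threshold (80%, 90%, 95%) is of interest.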

[0168] The isolated nucleic acids and proteins and any embodiments of the present invention can be used over a broad range of plant types, particularly dicots such as the species of the genus Glycine.

[0169] It is believed that the nucleic acids and proteins and any embodiments of the present invention can be used with monocots as well including, but not limited to, Gramineae including Sorghum bicolor and Zea mays.

[0170] The isolated nucleic acid and proteins of the present invention can also be used in species from the following dicot genera: Cucurbita, Rosa, Vitis, Juglans, Fragaria, Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium, Helianthus, Lactuca, Antirrhinum, Pelargonium, Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Pisum, Phaseolus, and from the following monocot genera: Bromus, Asparagus, Hemerocallis, Panicum, Pennisetum, Lolium, Oryza, Avena, Hordeum, Secale, Triticum, Bambusa, Dendrocalamus, and Melocanna.

EXAMPLES

[0171] The present invention is further defined in the following Examples, in which parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, various modifications of the invention in addition to those shown and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.

[0172] The disclosure of each reference set forth herein is incorporated herein by reference in its entirety.

Example 1

Creation of an Arabidopsis Population with Activation-Tagged Genes

[0175] An 18.49-kb T-DNA based binary construct was created, pHSbarENDs2 (SEQ ID NO:1), that contains four multimerized enhancer elements derived from the Cauliflower Mosaic Virus 35S promoter (corresponding to sequences -341 to -64, as defined by Odell et al., Nature 313:810-812 (1985)). The construct also contains vector sequences (pUC9) and a poly-linker (SEQ ID NO:2) to allow plasmid rescue, transposon sequences (Ds) to remobilize the T-DNA, and the bar gene to allow for glufosinate selection of transgenic plants. In principle, only the 10.8-kb segment from the right border (RB) to left border (LB) inclusive will be transferred into the host plant genome. Since the enhancer elements are located near the RB, they can induce cis-activation of genomic loci following T-DNA integration.

[0176] Arabidopsis activation-tagged populations were created by whole plant Agrobacterium transformation. The pHSbarENDs2 (SEQ ID NO:1) construct was transformed into Agrobacterium tumefaciens strain C58, grown in lysogeny broth medium at 25 °C to OD600 ~1.0. Cells were then pelleted by centrifugation and resuspended in an equal volume of 5% sucrose/0.05% Silwet L-77 (OSI Specialties, Inc). At early bolting, soil grown Arabidopsis thaliana ecotype Col-0 were top watered with the Agrobacterium suspension. A week later, the same plants were top watered again with the same Agrobacterium strain in sucrose/Silwet. The plants were then allowed to set seed as normal. The resulting T1 seed were sown on soil, and transgenic seedlings were selected by spraying with glufosinate (FINALE®; AgrEvo; Bayer Environmental Science). A total of 100,000 glufosinate resistant T1 seedlings were selected. T2 seed from each line was kept separate. Small aliquots of T2 seed from independently generated activation-tagged lines were pooled. The pooled seed were planted in soil and plants were grown to maturity producing T3 seed pools each comprised of seed derived from 96 activation-tagged lines.

Example 2

Identification and Characterization of Mutant Line lo17849

[0177] A method for screening Arabidopsis seed density was developed based on Focks and Benning (1998) with significant modifications. Arabidopsis seeds can be separated according to their density. Density layers were prepared from mixtures of 1,6-dibromohexane (d=1.6), 1-bromohexane (d=1.17) and mineral oil (d=0.84) at different ratios. From the bottom to the top of the tube, 6 layers of organic solvents, each of 2 mL, were added sequentially. The ratios of 1,6-dibromohexane:1-bromohexane:mineral oil for each layer were 1:1:0, 1:2:0, 0:1:0, 0:5:1, 0:3:1, 0:0:1. About 600 mg of T3 seed of a given pool of 96 activation-tagged lines, corresponding to about 30,000 seeds, were loaded onto the surface layer of a 15 mL glass tube containing said step gradient. After centrifugation for 5 min at 2000 × g, seeds were separated according to their density. The seeds in the lower two layers of the step gradient and from the bottom of the tube were collected. Organic solvents were removed by sequential washing with 100% and 80% ethanol and seeds were sterilized using a solution of 5% hypochlorite (NaOCl) in water. Seed were rinsed in sterile water and plated on MS-1 media comprised of 0.5 × MS salts, 1% (w/v) sucrose, 0.05 MES/KOH (pH 5.8), 200 µg/mL, 10 g/L agar and 15 mg L⁻¹ glufosinate ammonium (Basta; Sigma Aldrich, USA). A total of 520 T3 pools each derived from 96 T2 activation-tagged lines were screened in this manner. Seed pool 475, when subjected to density gradient centrifugation as described above, produced about 25 seed with increased density. These seed were sterilized and plated on selective media containing Basta. Basta-resistant seedlings were transferred to soil and plants were grown in a controlled environment (22 °C, 16 h light/8 h dark, 100-200 µE m⁻² s⁻¹) to maturity for about 8-10 weeks alongside four untransformed wild type plants of the Columbia ecotype.
Oil content of T4 seed and control seed was measured by NMR as follows.
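
The step gradient described above can be illustrated with a short calculation. The following is a minimal sketch, assuming that the stated ratios are volume ratios and that the solvents mix ideally (additive volumes); neither assumption is stated in the text, and the function name is illustrative only.

```python
# Densities (g/mL) of the three components as given in the text.
D_DIBROMOHEXANE = 1.6
D_BROMOHEXANE = 1.17
D_MINERAL_OIL = 0.84

# Ratios (1,6 dibromohexane : 1-bromohexane : mineral oil), bottom layer first.
LAYER_RATIOS = [(1, 1, 0), (1, 2, 0), (0, 1, 0), (0, 5, 1), (0, 3, 1), (0, 0, 1)]


def layer_density(ratio):
    """Volume-weighted mean density of one layer.

    Assumption: ratios are by volume and volumes are additive on mixing.
    """
    densities = (D_DIBROMOHEXANE, D_BROMOHEXANE, D_MINERAL_OIL)
    total_parts = sum(ratio)
    return sum(p * d for p, d in zip(ratio, densities)) / total_parts


if __name__ == "__main__":
    for number, ratio in enumerate(LAYER_RATIOS, start=1):
        print(f"layer {number} from bottom, ratio {ratio}: "
              f"about {layer_density(ratio):.3f} g/mL")
```

Under these assumptions the estimated densities decrease monotonically from the bottom layer to the top layer, which is what a stable step gradient requires.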

[0178] NMR based analysis of seed oil content:

[0179] Seed oil content was determined using a Maran Ultra NMR analyzer (Resonance Instruments Ltd, Witney, Oxfordshire, UK). Samples (e.g., batches of Arabidopsis seed ranging in weight between 5 and 200 mg) were placed into pre-weighed 2 mL polypropylene tubes (Corning Inc, Corning N.Y., USA; Part no. 430917) previously labeled with unique bar code identifiers. Samples were then placed into 96 place carriers and processed through the following series of steps by an ADEPT COBRA 600.TM. SCARA robotic system:

[0180] 1. pick up tube (the robotic arm was fitted with a vacuum pickup device);

[0181] 2. read bar code;

[0182] 3. expose tube to antistatic device (ensured that Arabidopsis seed were not adhering to the tube walls);

[0183] 4. weigh tube (containing the sample), to 0.0001 g precision;

[0184] 5. take NMR reading, measured as the intensity of the proton spin echo 1 msec after a 22.95 MHz signal had been applied to the sample (data was collected for 32 NMR scans per sample);

[0185] 6. return tube to rack; and

[0186] 7. repeat process with next tube.

Bar codes, tube weights and NMR readings were recorded by a computer connected to the system. Sample weight was determined by subtracting the polypropylene tube weight from the weight of the tube containing the sample.

[0187] Seed oil content of soybean seed or soybean somatic embryos was calculated as follows:

% oil (% wt basis)=[(NMR signal/sample wt (g))-70.58]/351.45
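
As an illustration, the soybean formula can be written as a small function. This is a minimal sketch that reads the formula as: divide the NMR signal by the sample weight in grams, subtract 70.58, then divide by 351.45. The function name and the input check are assumptions added for illustration.

```python
def soybean_percent_oil(nmr_signal, sample_wt_g):
    """Percent oil (% wt basis) for soybean seed or somatic embryos.

    Uses the calibration constants quoted in the text (70.58 and 351.45).
    """
    if sample_wt_g <= 0:
        raise ValueError("sample weight must be positive")
    return ((nmr_signal / sample_wt_g) - 70.58) / 351.45


if __name__ == "__main__":
    # Illustrative input only: a signal chosen so that a 0.15 g sample
    # evaluates to 20 % oil under this formula.
    print(f"{soybean_percent_oil(1064.937, 0.15):.2f} % oil")
```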

[0188] Calibration parameters were determined by precisely weighing samples of soy oil (ranging from 0.0050 to 0.0700 g at approximately 0.0050 g intervals; weighed to a precision of 0.0001 g) into Corning tubes (see above) and subjecting them to NMR analysis. A calibration curve of oil content (% seed wt basis; assuming a standard seed weight of 0.1500 g) to NMR value was established.

[0189] The relationship between seed oil contents measured by NMR and absolute oil contents measured by classical analytical chemistry methods was determined as follows. Fifty soybean seeds, chosen to have a range of oil contents, were dried at 40.degree. C. in a forced air oven for 48 h. Individual seeds were subjected to NMR analysis, as described above, and were then ground to a fine powder in a GenoGrinder (SPEX CertiPrep (Metuchen, N.J., U.S.A.); 1500 oscillations per minute, for 1 minute). Aliquots of between 70 and 100 mg were weighed (to 0.0001 g precision) into 13.times.100 mm glass tubes fitted with Teflon.RTM. lined screw caps; the remainder of the powder from each bean was used to determine moisture content, by weight difference after 18 h in a forced air oven at 105.degree. C. Heptane (3 mL) was added to the powders in the tubes and after vortex mixing samples were extracted, on an end-over-end agitator, for 1 h at room temperature. The extracts were centrifuged, 1500.times.g for 10 min, the supernatant decanted into a clean tube and the pellets were extracted two more times (1 h each) with 1 mL heptane. The supernatants from the three extractions were combined and 50 .mu.L internal standard (triheptadecanoic acid; 10 mg/mL toluene) was added prior to evaporation to dryness at room temperature under a stream of nitrogen gas; standards containing 0, 0.0050, 0.0100, 0.0150, 0.0200 and 0.0300 g soybean oil, in 5 mL heptane, were prepared in the same manner. Fats were converted to fatty acid methyl esters (FAMEs) by adding 1 mL 5% sulfuric acid (v:v in anhydrous methanol) to the dried pellets and heating them at 80.degree. C. for 30 min, with occasional vortex mixing. The samples were allowed to cool to room temperature and 1 mL 25% aqueous sodium chloride was added followed by 0.8 mL heptane. After vortex mixing the phases were allowed to separate and the upper organic phase was transferred to a sample vial and subjected to GC analysis.

[0190] Plotting NMR determined oil contents versus GC determined oil contents resulted in a linear relationship between 9.66 and 26.27% oil (GC values; % seed wt basis) with a slope of 1.0225 and an R.sup.2 of 0.9744; based on a seed moisture content that averaged 2.6+/-0.8%.

[0191] Seed oil content (on a % seed weight basis) of Arabidopsis seed was calculated as follows:

mg oil=(NMR signal-2.1112)/37.514;

% oil=[(mg oil)/1000]/[g of seed sample weight].times.100.
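
The two Arabidopsis formulas above can be chained as follows. This is a minimal sketch using the constants quoted in the text; the function names and the example input are illustrative assumptions.

```python
def arabidopsis_mg_oil(nmr_signal):
    """mg oil = (NMR signal - 2.1112) / 37.514, as given in the text."""
    return (nmr_signal - 2.1112) / 37.514


def arabidopsis_percent_oil(nmr_signal, sample_wt_g):
    """% oil = [(mg oil) / 1000] / [g of seed sample weight] x 100."""
    if sample_wt_g <= 0:
        raise ValueError("sample weight must be positive")
    return (arabidopsis_mg_oil(nmr_signal) / 1000.0) / sample_wt_g * 100.0


if __name__ == "__main__":
    # Illustrative input only: a signal corresponding to 40 mg oil
    # in a 0.1 g (100 mg) seed sample.
    signal = 37.514 * 40 + 2.1112
    print(f"{arabidopsis_mg_oil(signal):.1f} mg oil, "
          f"{arabidopsis_percent_oil(signal, 0.1):.1f} % oil")
```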

[0192] Prior to establishing this formula, Arabidopsis seed oil was extracted as follows. Approximately 5 g of mature Arabidopsis seed (cv Columbia) were ground to a fine powder using a mortar and pestle. The powder was placed into a 33.times.94 mm paper thimble (Ahlstrom #7100-3394; Ahlstrom, Mount Holly Springs, Pa., USA) and the oil extracted during approximately 40 extraction cycles with petroleum ether (BP 39.9-51.7.degree. C.) in a Soxhlet apparatus. The extract was allowed to cool and the crude oil was recovered by removing the solvent under vacuum in a rotary evaporator. Calibration parameters were determined by precisely weighing 11 standard samples of partially purified Arabidopsis oil (samples contained 3.6, 6.3, 7.9, 9.6, 12.8, 16.3, 20.3, 28.2, 32.1, 39.9 and 60 mg of partially purified Arabidopsis oil, weighed to a precision of 0.0001 g) into 2 mL polypropylene tubes (Corning Inc, Corning N.Y., USA; Part no. 430917) and subjecting them to NMR analysis. A calibration curve of oil content (% seed weight basis) to NMR value was established.
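
A calibration of this kind is typically obtained by fitting a straight line to the NMR readings of the standards. The sketch below shows an ordinary least-squares fit in plain Python. The standard masses are those listed above, but the NMR readings are synthetic: they are generated from the published coefficients (slope 37.514, intercept 2.1112) because the actual readings are not reported. The fitting method itself is an assumption, not a statement of how the calibration was actually derived.

```python
# Oil masses of the 11 standards (mg), as listed in the text.
STANDARD_MG = [3.6, 6.3, 7.9, 9.6, 12.8, 16.3, 20.3, 28.2, 32.1, 39.9, 60.0]

# SYNTHETIC readings, computed from the published coefficients; the real
# NMR readings of the standards are not given in the source.
SYNTHETIC_SIGNAL = [37.514 * mg + 2.1112 for mg in STANDARD_MG]


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    slope, intercept = fit_line(STANDARD_MG, SYNTHETIC_SIGNAL)
    print(f"signal = {slope:.3f} * mg oil + {intercept:.4f}")
```

Inverting the fitted line gives mg oil = (NMR signal - intercept)/slope, which has the same form as the formula used for Arabidopsis seed.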

[0193] Table 4 shows that the seed oil content of T4 activation-tagged line with Bar code ID K17849 is only 86% of that of the average of four WT control plants grown in the same flat.

TABLE 4
Oil Content of T4 activation-tagged lines derived from T3 pool 256

BARCODE   % Oil   T3 pool ID #   oil content % of WT
K17835    40.1    256             95.8
K17836    43.0    256            102.7
K17837    42.2    256            100.8
K17838    42.6    256            101.8
K17839    41.7    256             99.6
K17840    42.4    256            101.3
K17841    43.7    256            104.5
K17842    40.9    256             97.6
K17843    42.9    256            102.5
K17844    43.3    256            103.5
K17845    43.6    256            104.1
K17846    41.5    256             99.1
K17847    40.9    256             97.8
K17848    41.7    256             99.7
K17849    36.0    256             86.0
K17851    43.3    256            103.5
K17852    42.8    256            102.3
K17853    43.0    256            102.8
K17854    42.1    256            100.6
K17855    42.8    256            102.2
K17856    41.9    wt
K17857    40.2    wt

K17849 was renamed lo17849. T4 seed were plated on selective media and nine glufosinate-resistant seedlings were planted in the same flat as six untransformed WT plants. Plants were grown to maturity and oil content was determined by NMR.

TABLE 5
Oil Content of T5 seed of activation-tagged line lo17849

          T5 activation-                Average   oil content   Average oil content
BARCODE   tagged line ID       % Oil    % oil     % of WT       % of WT
K24753    lo17849              39.3     38.3      95.3          92.9
K24747    lo17849              38.9               94.2
K24752    lo17849              38.8               94.1
K24746    lo17849              38.4               93.2
K24750    lo17849              38.4               93.1
K24751    lo17849              38.2               92.7
K24748    lo17849              38.0               92.1
K24754    lo17849              37.8               91.5
K24749    lo17849              36.9               89.5
K24760    wt                   42.9
K24755    wt                   41.7
K24757    wt                   41.6
K24756    wt                   40.9
K24759    wt                   40.7
K24758    wt                   39.7

[0194] Table 5 shows that the seed oil content of T5 seed of activation-tagged line lo17849 is between 89.5 and 95.3% of that of WT control plants grown in the same flat. The average seed oil content of all T5 lines of lo17849 was 93% of the WT control plant average. Twenty-four Basta-resistant T5 seedlings of lo17849 were planted in the same flat alongside 12 untransformed WT control plants of the Columbia ecotype. Plants were grown to maturity and seed was bulk-harvested from all 24 lo17849 and 12 WT plants. Oil content of lo17849 and WT seed was measured by NMR (Table 6).
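
The "% of WT" values can be reproduced from the % oil values reported in Table 5. This is a minimal sketch, assuming each value is the line's % oil divided by the mean % oil of the six WT plants in the same flat. Because the tabulated % oil values are rounded to one decimal, recomputed percentages may differ from the table in the last digit.

```python
# % oil values transcribed from Table 5.
T5_LO17849 = [39.3, 38.9, 38.8, 38.4, 38.4, 38.2, 38.0, 37.8, 36.9]
T5_WT = [42.9, 41.7, 41.6, 40.9, 40.7, 39.7]


def mean(values):
    return sum(values) / len(values)


def percent_of_wt(oil_pct, wt_values):
    """Oil content expressed as a percentage of the WT mean."""
    return 100.0 * oil_pct / mean(wt_values)


if __name__ == "__main__":
    print(f"WT mean: {mean(T5_WT):.2f} % oil")
    for oil in T5_LO17849:
        print(f"{oil:.1f} % oil -> {percent_of_wt(oil, T5_WT):.1f} % of WT")
```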

TABLE 6
Oil Content of T6 activation-tagged line lo17849

Barcode   % Oil   Seed ID     oil content % of WT
K37207    39.7    LO 17849    92.3
K37208    43.0    WT

[0195] T6 seed of lo17849 and WT seed produced under identical conditions were subjected to compositional analysis as described below. Seed weight was measured by determining the weight of 100 seed. This analysis was performed in triplicate.

[0196] Tissue Preparation:

[0197] Arabidopsis seed (approximately 0.5 g in a 1/2.times.2'' polycarbonate vial) was ground to a homogeneous paste in a GENOGRINDER.RTM. (3.times.30 sec at 1400 strokes per minute, with a 15 sec interval between each round of agitation). After the second round of agitation, the vials were removed and the Arabidopsis paste was scraped from the walls with a spatula prior to the last burst of agitation.

[0198] Determination of Protein Content:

[0199] Protein contents were estimated by combustion analysis on a Thermo FINNIGAN.TM. Flash 1112EA combustion analyzer running in the NCS mode (vanadium pentoxide was omitted) according to instructions of the manufacturer. Triplicate samples of the ground pastes, 4-8 mg, weighed to an accuracy of 0.001 mg on a METTLER-TOLEDO.RTM. MX5 micro balance, were used for analysis. Protein contents were calculated by multiplying % N, determined by the analyzer, by 6.25. Final protein contents were expressed on a % tissue weight basis.

[0200] Determination of Non-Structural Carbohydrate Content:

[0201] Sub-samples of the ground paste were weighed (to an accuracy of 0.1 mg) into 13.times.100 mm glass tubes; the tubes had TEFLON.RTM. lined screw-cap closures. Three replicates were prepared for each sample tested.

[0202] Lipid extraction was performed by adding 2 ml aliquots of heptane to each tube. The tubes were vortex mixed and placed into an ultrasonic bath (VWR Scientific Model 750D) filled with water heated to 60.degree. C. The samples were sonicated at full-power (.about.360 W) for 15 min and were then centrifuged (5 min.times.1700 g). The supernatants were transferred to clean 13.times.100 mm glass tubes and the pellets were extracted 2 more times with heptane (2 ml, second extraction; 1 ml third extraction) with the supernatants from each extraction being pooled. After lipid extraction 1 ml acetone was added to the pellets and after vortex mixing, to fully disperse the material, they were taken to dryness in a Speedvac.

[0203] Non-Structural Carbohydrate Extraction and Analysis:

[0204] Two ml of 80% ethanol was added to the dried pellets from above. The samples were thoroughly vortex mixed until the plant material was fully dispersed in the solvent prior to sonication at 60.degree. C. for 15 min. After centrifugation, 5 min.times.1700 g, the supernatants were decanted into clean 13.times.100 mm glass tubes. Two more extractions with 80% ethanol were performed and the supernatants from each were pooled. The extracted pellets were suspended in acetone and dried (as above). An internal standard .beta.-phenyl glucopyranoside (100 .mu.l of a 0.5000+/-0.0010 g/100 ml stock) was added to each extract prior to drying in a Speedvac. The extracts were maintained in a desiccator until further analysis.

[0205] The acetone dried powders from above were suspended in 0.9 ml MOPS (3-N[Morpholino]propane-sulfonic acid; 50 mM, 5 mM CaCl.sub.2, pH 7.0) buffer containing 100 U of heat-stable .alpha.-amylase (from Bacillus licheniformis; Sigma A-4551).

[0206] Samples were placed in a heat block (90.degree. C.) for 75 min and were vortex mixed every 15 min. Samples were then allowed to cool to room temperature and 0.6 ml acetate buffer (285 mM, pH 4.5) containing 5 U amyloglucosidase (Roche 110 202 367 001) was added to each. Samples were incubated for 15-18 h at 55.degree. C. in a water bath fitted with a reciprocating shaker; standards of soluble potato starch (Sigma S-2630) were included to ensure that starch digestion went to completion.

[0207] Post-digestion the released carbohydrates were extracted prior to analysis. Absolute ethanol (6 ml) was added to each tube and after vortex mixing the samples were sonicated for 15 min at 60.degree. C. Samples were centrifuged (5 min.times.1700 g) and the supernatants were decanted into clean 13.times.100 mm glass tubes. The pellets were extracted 2 more times with 3 ml of 80% ethanol and the resulting supernatants were pooled. Internal standard (100 .mu.l .beta.-phenyl glucopyranoside, as above) was added to each sample prior to drying in a Speedvac.
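
The amount of internal standard added to each extract follows directly from the stated stock concentration. The sketch below computes it and then shows one common way such a standard is used for relative quantification. The quantification formula and the parameter `response_factor` (detector response of the sugar relative to the internal standard, per unit mass) are assumptions for illustration; the text does not spell out the exact calculation.

```python
def internal_standard_ug(volume_ul=100.0, stock_g_per_100ml=0.5):
    """Mass of internal standard added, in micrograms.

    0.5 g/100 mL equals 5 mg/mL, i.e. 5 micrograms per microliter.
    """
    ug_per_ul = stock_g_per_100ml * 1e6 / (100.0 * 1000.0)
    return volume_ul * ug_per_ul


def sugar_ug_per_mg(area_sugar, area_is, response_factor, tissue_mg, is_ug=None):
    """Hypothetical internal-standard quantification (micrograms per mg tissue).

    response_factor is an assumed relative detector response that would be
    derived from authentic standards run with each sample set.
    """
    if is_ug is None:
        is_ug = internal_standard_ug()
    return (area_sugar / area_is) * is_ug / response_factor / tissue_mg


if __name__ == "__main__":
    print(f"internal standard per sample: {internal_standard_ug():.0f} ug")
```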

[0208] Sample Preparation and Analysis:

[0209] The dried samples from the soluble and starch extractions described above were solubilized in anhydrous pyridine (Sigma-Aldrich P57506) containing 30 mg/ml of hydroxylamine HCl (Sigma-Aldrich 159417). Samples were placed on an orbital shaker (300 rpm) overnight and were then heated for 1 hr (75.degree. C.) with vigorous vortex mixing applied every 15 min. After cooling to room temperature, 1 ml hexamethyldisilazane (Sigma-Aldrich H-4875) and 100 .mu.l trifluoroacetic acid (Sigma-Aldrich T-6508) were added. The samples were vortex mixed and the precipitates were allowed to settle prior to transferring the supernatants to GC sample vials.

[0210] Samples were analyzed on an Agilent 6890 gas chromatograph fitted with a DB-17MS capillary column (15m.times.0.32 mm.times.0.25 um film). Inlet and detector temperatures were both 275.degree. C. After injection (2 .mu.l, 20:1 split) the initial column temperature (150.degree. C.) was increased to 180.degree. C. at a rate of 3.degree. C./min and then at 25.degree. C./min to a final temperature of 320.degree. C. The final temperature was maintained for 10 min. The carrier gas was H.sub.2 at a linear velocity of 51 cm/sec. Detection was by flame ionization. Data analysis was performed using Agilent ChemStation software. Each sugar was quantified relative to the internal standard and detector responses were applied for each individual carbohydrate (calculated from standards run with each set of samples). Final carbohydrate concentrations were expressed on a tissue weight basis.
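
The oven program described above implies a fixed run time per injection, which can be worked out from the ramp rates. This is a minimal sketch; it assumes no initial hold at 150.degree. C., since none is mentioned, and the function name is illustrative.

```python
def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Time needed to heat from start_c to end_c at a constant rate."""
    return (end_c - start_c) / rate_c_per_min


# Oven program from the text: 150 -> 180 C at 3 C/min,
# then 180 -> 320 C at 25 C/min, then a 10 min hold at 320 C.
FIRST_RAMP_MIN = ramp_minutes(150.0, 180.0, 3.0)
SECOND_RAMP_MIN = ramp_minutes(180.0, 320.0, 25.0)
FINAL_HOLD_MIN = 10.0
TOTAL_RUN_MIN = FIRST_RAMP_MIN + SECOND_RAMP_MIN + FINAL_HOLD_MIN

if __name__ == "__main__":
    print(f"first ramp:  {FIRST_RAMP_MIN:.1f} min")
    print(f"second ramp: {SECOND_RAMP_MIN:.1f} min")
    print(f"total run:   {TOTAL_RUN_MIN:.1f} min")
```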

[0211] Carbohydrates were identified by retention time matching with authentic samples of each sugar run in the same chromatographic set and by GC-MS with spectral matching to the NIST Mass Spectral Library Version 2a, build Jul. 1, 2002.

TABLE 7
Compositional Analysis of lo17849 and WT Control Seed

             Barcode   Oil         Protein   Seed Weight   fructose
Genotype     ID        (%, NMR)    %         (.mu.g)       (.mu.g mg-1 seed)
lo17849      K37207    39.7        16.95     24            0.66
WT           K37208    43.0        15.49     23.67         0.57
.DELTA. TG/WT %        -7.7        9.4       1.4           15.8

             Barcode   glucose             sucrose             raffinose           stachyose
Genotype     ID        (.mu.g mg-1 seed)   (.mu.g mg-1 seed)   (.mu.g mg-1 seed)   (.mu.g mg-1 seed)
lo17849      K37207    9.54                16.07               1.44                4.71
WT           K37208    8.02                17.59               1.21                3.48
.DELTA. TG/WT %        19.0                -8.6                19.0                35.3

Table 7 shows that the seed oil reduction in lo17849 is not associated with a change in seed weight. There is, however, a 9.4% increase in protein content in lo17849 compared to control seed. The soluble carbohydrate profile of lo17849 differs from that of WT seed. The former shows decreased sucrose and increased levels of fructose, glucose, raffinose and stachyose.
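
The .DELTA. TG/WT % row of Table 7 can be recomputed from the lo17849 and WT values. This is a minimal sketch, assuming the row is the relative difference 100 x (transgenic - WT)/WT; the dictionary keys are illustrative names.

```python
# (lo17849, WT) value pairs transcribed from Table 7.
TABLE7 = {
    "oil_pct": (39.7, 43.0),
    "protein_pct": (16.95, 15.49),
    "seed_weight_ug": (24.0, 23.67),
    "fructose": (0.66, 0.57),
    "glucose": (9.54, 8.02),
    "sucrose": (16.07, 17.59),
    "raffinose": (1.44, 1.21),
    "stachyose": (4.71, 3.48),
}


def delta_pct(transgenic, wild_type):
    """Relative change of the transgenic value versus WT, in percent."""
    return 100.0 * (transgenic - wild_type) / wild_type


if __name__ == "__main__":
    for trait, (tg, wt) in TABLE7.items():
        print(f"{trait}: {delta_pct(tg, wt):+.1f} %")
```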

[0212] In summary, line lo17849 contains a genetic locus that confers glufosinate herbicide resistance. Presence of this transgene is associated with a low oil trait (reduction in oil content of 5-8% compared to WT) that is accompanied by unaltered seed size, increased protein content and a shift in the carbohydrate profile of mature dry seed that consists of decreased sucrose levels and increased levels of fructose, glucose and raffinosaccharides.

Example 3

Identification of Activation-Tagged Genes

[0213] Genes flanking the T-DNA insert in the lo17849 lines were identified using one, or both, of the following two standard procedures: (1) thermal asymmetric interlaced (TAIL) PCR (Liu et al., Plant J. 8:457-63 (1995)); and (2) SAIFF PCR (Siebert et al., Nucleic Acids Res. 23:1087-1088 (1995)). In lines with complex multimerized T-DNA inserts, TAIL PCR and SAIFF PCR may both prove insufficient to identify candidate genes. In these cases, other procedures, including inverse PCR, plasmid rescue and/or genomic library construction, can be employed.

[0214] A successful result is one where a single TAIL or SAIFF PCR fragment contains a T-DNA border sequence and Arabidopsis genomic sequence. Once a tag of genomic sequence flanking a T-DNA insert is obtained, candidate genes are identified by alignment to publicly available Arabidopsis genome sequence. Specifically, the annotated genes nearest the 35S enhancer elements/T-DNA RB are candidates for genes that are activated.
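
Once the flanking sequence has been placed on the genome, ranking annotated genes by distance from the insertion site is a simple coordinate comparison. The sketch below illustrates this with entirely hypothetical gene names and coordinates; it is not based on the actual annotation, and a real analysis would also consider on which side of the insert the enhancer elements lie.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # first base of the annotated gene
    end: int    # last base of the annotated gene


def distance_to_gene(position, gene):
    """Distance in bp from an insertion site to a gene (0 if inside it)."""
    if gene.start <= position <= gene.end:
        return 0
    return min(abs(position - gene.start), abs(position - gene.end))


def rank_candidates(position, genes):
    """Genes sorted from nearest to farthest from the insertion site."""
    return sorted(genes, key=lambda g: distance_to_gene(position, g))


if __name__ == "__main__":
    # HYPOTHETICAL coordinates, for illustration only.
    genes = [Gene("geneA", 1000, 3000), Gene("geneB", 5000, 7000),
             Gene("geneC", 12000, 15000)]
    for gene in rank_candidates(4500, genes):
        print(gene.name, distance_to_gene(4500, gene), "bp")
```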

[0215] To verify that an identified gene is truly near a T-DNA and to rule out the possibility that the TAIL/SAIFF fragment is a chimeric cloning artifact, a diagnostic PCR on genomic DNA is done with one oligo in the T-DNA and one oligo specific for the candidate gene. Genomic DNA samples that give a PCR product are interpreted as representing a T-DNA insertion. This analysis also verifies a situation in which more than one insertion event occurs in the same line, e.g., if multiple differing genomic fragments are identified in TAIL and/or SAIFF PCR analyses.

Example 4

Identification of Activation-Tagged Genes in lo17849

Construction of pKR1478 for Seed Specific Overexpression of Genes in Arabidopsis

[0216] Plasmid pKR85 (SEQ ID NO:3; described in US Patent Application Publication US 2007/0118929 published on May 24, 2007) was digested with HindIII and the fragment containing the hygromycin selectable marker was re-ligated together to produce pKR278 (SEQ ID NO:4).

[0217] Plasmid pKR407 (SEQ ID NO:5; described in PCT Int. Appl. WO 2008/124048 published on Oct. 16, 2008) was digested with BamHI/HindIII and the fragment containing the Gy1 promoter/NotI/LegA2 terminator cassette was effectively cloned into the BamHI/HindIII fragment of pKR278 (SEQ ID NO:4) to produce pKR1468 (SEQ ID NO:6).

[0218] Plasmid pKR1468 (SEQ ID NO:6) was digested with NotI and the resulting DNA ends were filled using Klenow. After filling to form blunt ends, the DNA fragments were treated with calf intestinal alkaline phosphatase and separated using agarose gel electrophoresis. The purified fragment was ligated with cassette formA, containing chloramphenicol resistance and ccdB genes flanked by attR1 and attR2 sites, using the Gateway.RTM. Vector Conversion System (Cat. No. 11823-029, Invitrogen Corporation) following the manufacturer's protocol to produce pKR1475 (SEQ ID NO:7).

[0219] Plasmid pKR1475 (SEQ ID NO:7) was digested with AscI and the fragment containing the Gy1 promoter/NotI/LegA2 terminator Gateway.RTM. L/R cloning cassette was cloned into the AscI fragment of binary vector pKR92 (SEQ ID NO:8; described in US Patent Application Publication US 2007/0118929 published on May 24, 2007) to produce pKR1478 (SEQ ID NO:9).

[0220] In this way, genes flanked by attL1 and attL2 sites could be cloned into pKR1478 (SEQ ID NO:9) using Gateway.RTM. technology (Invitrogen Corporation) and the gene could be expressed in Arabidopsis from the strong, seed-specific soybean Gy1 promoter.

[0221] The activation-tagged line (lo17849) showing reduced oil content was further analyzed. DNA from the line was extracted, and genes flanking the T-DNA insert in the mutant line were identified using ligation-mediated PCR (Siebert et al., Nucleic Acids Res. 23:1087-1088 (1995)). A single amplified fragment was identified that contained a T-DNA border sequence and Arabidopsis genomic sequence. The sequence of this PCR product, which contains part of the left border of the inserted T-DNA, is set forth as SEQ ID NO:10. Once a tag of genomic sequence flanking a T-DNA insert was obtained, a candidate gene was identified by alignment of SEQ ID NO:10 to the completed Arabidopsis genome (NCBI). Specifically, the SAIFF PCR product generated with PCR primers corresponding to the left border sequence of the T-DNA present in pHSbarENDs2 aligns with sequence of the Arabidopsis genome that is located in the second intron of Arabidopsis gene At5g17270 and 5949 bp upstream of the inferred start codon of At5g17280.

Validation of Candidate Arabidopsis Gene (At5g17280) Via Transformation into Arabidopsis

[0222] The gene At5g17280, specifically its inferred start codon, is 5.5 kb downstream of the SAIFF sequence corresponding to sequence adjacent to the left T-DNA border in lo17849. This gene is annotated as encoding a protein with an oxidoreductase motif (ORM). Primers ORM ORF FWD (SEQ ID NO:11) and ORM ORF REV (SEQ ID NO:12) were used to amplify the At5g17280 ORF from genomic DNA of Arabidopsis plants of the Columbia ecotype. The PCR product was cloned into pENTR (Invitrogen, USA) to give pENTR-ORM (SEQ ID NO:13). The At5g17280 ORF was inserted in the sense orientation downstream of the GY1 promoter in binary plant transformation vector pKR1478 using Gateway LR recombinase (Invitrogen, USA) according to the manufacturer's instructions. The sequence of the resulting plasmid pKR1478-ORM is set forth as SEQ ID NO:14.

[0223] pKR1478-ORM (SEQ ID NO:14) was introduced into Agrobacterium tumefaciens NTL4 (Luo et al, Molecular Plant-Microbe Interactions (2001) 14(1):98-103) by electroporation. Briefly, 1 .mu.g plasmid DNA was mixed with 100 .mu.L of electro-competent cells on ice. The cell suspension was transferred to a 100 .mu.L electroporation cuvette (1 mm gap width) and electroporated using a BIORAD electroporator set to 1 kV, 400.OMEGA. and 25 .mu.F. Cells were transferred to 1 mL LB medium and incubated for 2 h at 30.degree. C. Cells were plated onto LB medium containing 50 .mu.g/mL kanamycin. Plates were incubated at 30.degree. C. for 60 h. Recombinant Agrobacterium cultures (500 mL LB, 50 .mu.g/mL kanamycin) were inoculated from single colonies of transformed Agrobacterium cells and grown at 30.degree. C. for 60 h. Cells were harvested by centrifugation (5000.times.g, 10 min) and resuspended in 1 L of 5% (W/V) sucrose containing 0.05% (V/V) Silwet. Arabidopsis plants were grown in soil at a density of 30 plants per 100 cm.sup.2 pot in METRO-MIX.RTM. 360 soil mixture for 4 weeks (22.degree. C., 16 h light/8 h dark, 100 .mu.E m.sup.-2s.sup.-1). Plants were repeatedly dipped into the Agrobacterium suspension harboring the binary vector pKR1478-ORM and kept in a dark, high humidity environment for 24 h. Post dipping, plants were grown for three to four weeks under standard plant growth conditions described above and plant material was harvested and dried for one week at ambient temperatures in paper bags. Seeds were harvested using a 0.425 mm mesh brass sieve.
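
The electroporation settings above translate into a field strength and a nominal pulse time constant. This is a minimal sketch of that arithmetic; the time constant is taken as resistance times capacitance, which assumes the sample resistance is large compared with the 400.OMEGA. setting, so the value is an upper-bound estimate rather than a measured pulse length.

```python
def field_strength_kv_per_cm(voltage_kv, gap_mm):
    """Electric field across the cuvette gap, in kV/cm."""
    return voltage_kv / (gap_mm / 10.0)


def nominal_time_constant_ms(resistance_ohm, capacitance_uf):
    """RC time constant in milliseconds (assumes a high-resistance sample)."""
    return resistance_ohm * capacitance_uf * 1e-6 * 1e3


if __name__ == "__main__":
    # Settings from the text: 1 kV, 1 mm gap, 400 ohm, 25 uF.
    print(f"field strength: {field_strength_kv_per_cm(1.0, 1.0):.1f} kV/cm")
    print(f"nominal time constant: {nominal_time_constant_ms(400.0, 25.0):.1f} ms")
```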

[0224] Cleaned Arabidopsis seeds (2 grams, corresponding to about 100,000 seeds) were sterilized by washes in 45 mL of 80% ethanol, 0.01% TRITON.RTM. X-100, followed by 45 mL of 30% (V/V) household bleach in water, 0.01% TRITON.RTM. X-100 and finally by repeated rinsing in sterile water. Aliquots of 20,000 seeds were transferred to square plates (20.times.20 cm) containing 150 mL of sterile plant growth medium comprised of 0.5.times.MS salts, 0.53% (W/V) sorbitol, 0.05 MES/KOH (pH 5.8), 200 .mu.g/mL TIMENTIN.RTM., and 50 .mu.g/mL kanamycin solidified with 10 g/L agar. Homogeneous dispersion of the seed on the medium was facilitated by mixing the aqueous seed suspension with an equal volume of melted plant growth medium. Plates were incubated under standard growth conditions for ten days. Kanamycin-resistant seedlings were transferred to plant growth medium without selective agent and grown for one week before transfer to soil. T1 plants were grown to maturity alongside wt control plants and T2 seeds were harvested. A total of six wt plants were grown alongside the T1 plants and two bulk samples were generated, each by combining seed from three wt plants. Oil content was measured by NMR and is shown in Table 8.
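
The seed quantities quoted above imply a mean seed mass, a number of plates and a plating density. This is a minimal sketch of that arithmetic using only the figures in the paragraph; the variable names are illustrative.

```python
SEED_MASS_G = 2.0            # cleaned seed, grams
SEED_COUNT = 100_000         # approximate number of seeds in 2 g
SEEDS_PER_PLATE = 20_000     # aliquot per plate
PLATE_SIDE_CM = 20.0         # square plates, 20 x 20 cm

# Mean mass of a single seed in micrograms.
MEAN_SEED_MASS_UG = SEED_MASS_G * 1e6 / SEED_COUNT
# Number of plates needed to plate all seeds.
PLATES_NEEDED = SEED_COUNT // SEEDS_PER_PLATE
# Seeds per square centimeter of medium surface.
SEEDS_PER_CM2 = SEEDS_PER_PLATE / (PLATE_SIDE_CM * PLATE_SIDE_CM)

if __name__ == "__main__":
    print(f"mean seed mass: about {MEAN_SEED_MASS_UG:.0f} ug")
    print(f"plates needed:  {PLATES_NEEDED}")
    print(f"plating density: {SEEDS_PER_CM2:.0f} seeds per cm^2")
```

The implied mean seed mass of about 20 .mu.g is of the same order as the seed weights of roughly 24 .mu.g reported in Table 7.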

TABLE 8
Seed oil content of T1 plants generated with binary vector pKR1478-ORM for seed-specific over-expression of At5g17280

                                    oil content   avg. oil content
Construct      BARCODE   % oil      % of WT       % of WT
pKR1478-ORM    K42329    42.4       104.7
pKR1478-ORM    K42319    41.6       102.8
pKR1478-ORM    K42320    41.0       101.4
pKR1478-ORM    K42326    40.6       100.5
pKR1478-ORM    K42330    40.1        99.1
pKR1478-ORM    K42324    40.0        98.8
pKR1478-ORM    K42333    39.8        98.4
pKR1478-ORM    K42323    39.7        98.1
pKR1478-ORM    K42321    39.3        97.3
pKR1478-ORM    K42332    38.3        94.8
pKR1478-ORM    K42328    38.1        94.1
pKR1478-ORM    K42322    37.8        93.6
pKR1478-ORM    K42327    37.1        91.6
pKR1478-ORM    K42325    35.6        88.0
pKR1478-ORM    K42334    34.1        84.2
pKR1478-ORM    K42331    34.0        84.1         95.7
wt             K42335    40.4
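
The average reported in the last column of Table 8 can be checked against the individual "% of WT" values. This is a minimal sketch using the values transcribed from the table; the 90% threshold used to flag low-oil events is an illustrative choice, not a criterion stated in the text.

```python
# "oil content % of WT" values for the 16 T1 events in Table 8.
T1_PCT_OF_WT = [104.7, 102.8, 101.4, 100.5, 99.1, 98.8, 98.4, 98.1,
                97.3, 94.8, 94.1, 93.6, 91.6, 88.0, 84.2, 84.1]


def mean(values):
    return sum(values) / len(values)


def events_below(values, threshold):
    """Values falling below an (illustrative) percent-of-WT threshold."""
    return [v for v in values if v < threshold]


if __name__ == "__main__":
    print(f"mean % of WT over {len(T1_PCT_OF_WT)} events: "
          f"{mean(T1_PCT_OF_WT):.1f}")
    print(f"events below 90 % of WT: {events_below(T1_PCT_OF_WT, 90.0)}")
```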

T2 seed of events K42334 and K42331 were plated on selective media and planted alongside untransformed wt control plants. Plants were grown to maturity. Seeds were harvested and oil content was measured by NMR (Table 9).

TABLE 9
Seed oil content of T2 plants generated with binary vector pKR1478-ORM for seed-specific over-expression of At5g17280

                                               oil content   avg. oil content
Event ID   Construct      BARCODE   % oil      % of WT       % of WT
K42334     pKR1478-ORM    K44550    40.5       102.0
           pKR1478-ORM    K44537    39.2        98.9
           pKR1478-ORM    K44543    39.2        98.7
           pKR1478-ORM    K44553    39.0        98.2
           pKR1478-ORM    K44535    38.1        96.0
           pKR1478-ORM    K44545    37.9        95.5
           pKR1478-ORM    K44546    37.5        94.5
           pKR1478-ORM    K44551    37.2        93.8
           pKR1478-ORM    K44542    36.9        92.9
           pKR1478-ORM    K44549    36.6        92.1
           pKR1478-ORM    K44538    36.4        91.7
           pKR1478-ORM    K44547    36.2        91.1
           pKR1478-ORM    K44552    36.1        91.1
           pKR1478-ORM    K44540    35.6        89.8
           pKR1478-ORM    K44539    35.4        89.3
           pKR1478-ORM    K44544    35.0        88.1
           pKR1478-ORM    K44534    34.7        87.4
           pKR1478-ORM    K44536    34.4        86.7
           pKR1478-ORM    K44548    33.0        83.2
           pKR1478-ORM    K44541    30.3        76.2         91.9
           wt             K44563    42.9
           wt             K44555    42.6
           wt             K44558    41.4
           wt             K44559    40.6
           wt             K44554    39.7
           wt             K44557    39.3
           wt             K44564    39.3
           wt             K44561    38.8
           wt             K44556    38.6
           wt             K44562    38.2
           wt             K44565    37.8
           wt             K44560    37.1
K42331     pKR1478-ORM    K46263    40.3        94.0
           pKR1478-ORM    K46264    39.7        92.6
           pKR1478-ORM    K46266    39.7        92.5
           pKR1478-ORM    K46268    38.8        90.4
           pKR1478-ORM    K46262    38.7        90.3
           pKR1478-ORM    K46248    38.7        90.3
           pKR1478-ORM    K46251    38.4        89.6
           pKR1478-ORM    K46269    38.4        89.5
           pKR1478-ORM    K46249    38.3        89.4
           pKR1478-ORM    K46250    38.3        89.2
           pKR1478-ORM    K46258    38.3        89.2
           pKR1478-ORM    K46261    38.1        88.8
           pKR1478-ORM    K46254    38.0        88.7
           pKR1478-ORM    K46255    38.0        88.7
           pKR1478-ORM    K46267    37.9        88.3
           pKR1478-ORM    K46256    37.8        88.1
           pKR1478-ORM    K46253    37.6        87.6
           pKR1478-ORM    K46265    37.3        87.1
           pKR1478-ORM    K46257    37.2        86.7
           pKR1478-ORM    K46259    37.1        86.5
           pKR1478-ORM    K46260    36.9        86.0
           pKR1478-ORM    K46252    35.8        83.6         89.0
           wt             K46275    44.7
           wt             K46270    43.6
           wt             K46272    43.4
           wt             K46280    43.4
           wt             K46281    43.3
           wt             K46277    43.2
           wt             K46271    43.0
           wt             K46273    42.8
           wt             K46278    42.7
           wt             K46279    42.6
           wt             K46276    42.2
           wt             K46274    39.8

T3 seed of lines K44548 and K44541 derived from event K42334 were plated on selective media and planted alongside untransformed wt control plants. Plants were grown to maturity. Seeds were harvested and oil content was measured by NMR (Table 10).

TABLE 10
Seed oil content of T3 plants generated with binary vector pKR1478-ORM for seed-specific over-expression of At5g17280

                                                     oil content   avg. oil content
Event ID         Construct      BARCODE   % oil      % of WT       % of WT
K42334/K44548    pKR1478-ORM    K49194    39.3       92.9
                 pKR1478-ORM    K49193    39.0       92.1
                 pKR1478-ORM    K49204    38.9       92.1
                 pKR1478-ORM    K49206    38.7       91.5
                 pKR1478-ORM    K49197    38.7       91.5
                 pKR1478-ORM    K49208    38.7       91.5
                 pKR1478-ORM    K49199    38.2       90.3
                 pKR1478-ORM    K49207    37.8       89.4
                 pKR1478-ORM    K49214    37.7       89.0
                 pKR1478-ORM    K49196    37.6       88.9
                 pKR1478-ORM    K49191    37.5       88.8
                 pKR1478-ORM    K49192    37.3       88.2
                 pKR1478-ORM    K49205    37.2       87.8
                 pKR1478-ORM    K49209    36.5       86.3
                 pKR1478-ORM    K49211    36.5       86.2
                 pKR1478-ORM    K49212    36.4       86.0
                 pKR1478-ORM    K49200    36.3       85.9          89.3
                 wt             K49223    43.0
                 wt             K49219    42.8
                 wt             K49221    42.7
                 wt             K49222    42.4
                 wt             K49220    42.1
                 wt             K49216    42.0
                 wt             K49218    41.8
                 wt             K49217    41.7
K42334/K44541    pKR1478-ORM    K49174    38.8       93.0
                 pKR1478-ORM    K49152    38.1       91.3
                 pKR1478-ORM    K49173    38.1       91.3
                 pKR1478-ORM    K49177    37.7       90.2
                 pKR1478-ORM    K49162    37.6       90.1
                 pKR1478-ORM    K49176    36.9       88.2
                 pKR1478-ORM    K49167    36.8       88.2
                 pKR1478-ORM    K49157    36.8       88.2
                 pKR1478-ORM    K49163    36.8       88.1
                 pKR1478-ORM    K49170    36.7       87.9
                 pKR1478-ORM    K49171    36.7       87.8
                 pKR1478-ORM    K49178    36.6       87.7
                 pKR1478-ORM    K49154    36.5       87.3
                 pKR1478-ORM    K49156    35.7       85.5
                 pKR1478-ORM    K49165    35.0       83.7
                 pKR1478-ORM    K49161    33.8       80.9
                 pKR1478-ORM    K49179    33.6       80.5          87.6
                 wt             K49185    43.1
                 wt             K49186    42.5
                 wt             K49187    42.3
                 wt             K49181    42.2
                 wt             K49182    42.0
                 wt             K49184    41.5
                 wt             K49180    40.8
                 wt             K49183    39.8

Tables 8-10 demonstrate that seed-specific over-expression of At5g17280 leads to a decrease in oil content of approximately 10%. The decrease in oil content associated with the transgene is heritable. This finding suggests that the low seed oil phenotype in lo17849 is related to increased expression of At5g17280 resulting from the nearby insertion of the quadruple 35S enhancer sequence present in the pHSbarENDs2-derived T-DNA.

Example 5

Seed-Specific RNAi of At5g17280

Generation and Phenotypic Characterization of Transgenic Lines

[0225] A binary plant transformation vector pKR1482 (SEQ ID NO:15) for generation of hairpin constructs facilitating seed-specific RNAi under control of the GY1 promoter derived from the soy gene Glyma03g32030.1 was constructed. The RNAi-related expression cassette can be used for cloning of a given DNA fragment flanked by ATTL sites in antisense and sense orientation downstream of the seed-specific promoter. The two gene fragments are interrupted by a spliceable intron sequence derived from the Arabidopsis gene At2g38080.

[0226] An intron of an Arabidopsis laccase gene (At2g38080) was amplified from genomic Arabidopsis DNA of ecotype Columbia using primers AthLcc IN FWD (SEQ ID NO:16) and AthLcc IN REV (SEQ ID NO:17). PCR products were cloned into pGEM T EASY (Promega, USA) according to manufacturer instructions and sequenced. The DNA sequence of the PCR product containing the laccase intron is set forth as SEQ ID NO:18. The PCR primers introduce an HpaI restriction site at the 5' end of the intron and restriction sites for NruI and SpeI at the 3' end of the intron. A three-way ligation of DNA fragments was performed as follows. XbaI digested, dephosphorylated DNA of pMBL18 (Nakano, Yoshio; Yoshida, Yasuo; Yamashita, Yoshihisa; Koga, Toshihiko. Construction of a series of pACYC-derived plasmid vectors. Gene (1995), 162(1), 157-8.) was ligated to the XbaI, EcoRV DNA fragment of PSM1318 (SEQ ID NO:19), which contains ATTR12 sites, a DNA gyrase inhibitor gene (ccdB) and a chloramphenicol acetyltransferase gene, and to an HpaI/SpeI restriction fragment excised from pGEM T EASY Lacc INT (SEQ ID NO:18) containing intron 1 of At2g38080. Ligation products were transformed into the DB 3.1 strain of E. coli (Invitrogen, USA). Recombinant clones were characterized by restriction digests and sequenced. The DNA sequence of the resulting plasmid pMBL18 ATTR12 INT is set forth as SEQ ID NO:20. DNA of pMBL18 ATTR12 INT was linearized with NruI, dephosphorylated and ligated to the XbaI, EcoRV DNA fragment of PSM1789 (SEQ ID NO: 21) containing ATTR12 sites and a DNA gyrase inhibitor gene (ccdB). Prior to ligation, the ends of the PSM1789 restriction fragment had been filled in with T4 DNA polymerase (Promega, USA). Ligation products were transformed into the DB 3.1 strain of E. coli (Invitrogen, USA). Recombinant clones were characterized by restriction digests and sequenced. The DNA sequence of the resulting plasmid pMBL18 ATTR12 INT ATTR21 is set forth as SEQ ID NO:22.

[0227] Plasmid pMBL18 ATTR12 INT ATTR21 (SEQ ID NO:22) was digested with XbaI and after filling to blunt the XbaI site generated, the resulting DNA was digested with Ecl136II and the fragment containing the attR cassettes was cloned into the NotI/BsiWI (where the NotI site was completely filled in) fragment of pKR1468 (SEQ ID NO:6), containing the Gy1 promoter, to produce pKR1480 (SEQ ID NO:23).

[0228] pKR1480 (SEQ ID NO:23) was digested with AscI and the fragment containing the Gy1 promoter/attR cassettes was cloned into the AscI fragment of binary vector pKR92 (SEQ ID NO:8) to produce pKR1482 (SEQ ID NO:15).

[0229] 5 .mu.g of plasmid DNA of pENTR-ORM (SEQ ID NO:13) was digested with EcoRV/HpaI. A restriction fragment of 0.7 kb (derived from pENTR-ORM) was excised from an agarose gel. The purified DNA fragment was inserted into vector pKR1482 using LR clonase (Invitrogen) according to the manufacturer's instructions, to give pKR1482-ORM (SEQ ID NO:24).

[0230] pKR1482-ORM (SEQ ID NO:24) was introduced into Agrobacterium tumefaciens NTL4 (Luo et al, Molecular Plant-Microbe Interactions (2001) 14(1):98-103) by electroporation. Briefly, 1 .mu.g plasmid DNA was mixed with 100 .mu.L of electro-competent cells on ice. The cell suspension was transferred to a 100 .mu.L electroporation cuvette (1 mm gap width) and electroporated using a BIORAD electroporator set to 1 kV, 400.OMEGA. and 25 .mu.F. Cells were transferred to 1 mL LB medium and incubated for 2 h at 30.degree. C. Cells were plated onto LB medium containing 50 .mu.g/mL kanamycin. Plates were incubated at 30.degree. C. for 60 h. Recombinant Agrobacterium cultures (500 mL LB, 50 .mu.g/mL kanamycin) were inoculated from single colonies of transformed Agrobacterium cells and grown at 30.degree. C. for 60 h. Cells were harvested by centrifugation (5000.times.g, 10 min) and resuspended in 1 L of 5% (W/V) sucrose containing 0.05% (V/V) Silwet. Arabidopsis plants were grown in soil at a density of 30 plants per 100 cm.sup.2 pot in METRO-MIX.RTM. 360 soil mixture for 4 weeks (22.degree. C., 16 h light/8 h dark, 100 .mu.E m.sup.-2s.sup.-1). Plants were repeatedly dipped into the Agrobacterium suspension harboring the binary vector pKR1482-ORM (SEQ ID NO:24) and kept in a dark, high humidity environment for 24 h. Plants were grown for three to four weeks under the standard plant growth conditions described above, and plant material was harvested and dried for one week at ambient temperatures in paper bags. Seeds were harvested using a 0.425 mm mesh brass sieve.

[0231] Cleaned Arabidopsis seeds (2 grams, corresponding to about 100,000 seeds) were sterilized by washes in 45 mL of 80% ethanol, 0.01% TRITON.RTM. X-100, followed by 45 mL of 30% (V/V) household bleach in water, 0.01% TRITON.RTM. X-100 and finally by repeated rinsing in sterile water. Aliquots of 20,000 seeds were transferred to square plates (20.times.20 cm) containing 150 mL of sterile plant growth medium comprised of 0.5.times.MS salts, 0.53% (W/V) sorbitol, 0.05 MES/KOH (pH 5.8), 200 .mu.g/mL TIMENTIN.RTM., and 50 .mu.g/mL kanamycin solidified with 10 g/L agar. Homogeneous dispersion of the seed on the medium was facilitated by mixing the aqueous seed suspension with an equal volume of melted plant growth medium. Plates were incubated under standard growth conditions for ten days. Kanamycin-resistant seedlings were transferred to plant growth medium without selective agent and grown for one week before transfer to soil. Plants were grown to maturity and T2 seeds were harvested. A total of 15 events were generated with pKR1482-ORM (SEQ ID NO:24). Six wild-type (WT) control plants were grown in the same flat. WT seeds were bulk harvested, thus generating two batches of WT control seed, each derived from three plants. T2 seed of individual transgenic lines were harvested. Oil content was measured by NMR as described above.

TABLE 11
Seed oil content of T1 plants generated with binary vector pKR1482-ORM for seed specific gene suppression of At5g17280 (Experiment 1)

Construct     BARCODE   % oil   oil content % of WT   avg. oil content % of WT
pKR1482-ORM   K42351    41.4    111.5
pKR1482-ORM   K42355    41.0    110.4
pKR1482-ORM   K42361    40.8    109.8
pKR1482-ORM   K42360    40.5    109.0
pKR1482-ORM   K42359    40.2    108.2
pKR1482-ORM   K42350    40.1    107.8
pKR1482-ORM   K42362    39.5    106.2
pKR1482-ORM   K42353    38.6    103.8
pKR1482-ORM   K42352    38.5    103.7
pKR1482-ORM   K42354    38.3    103.0
pKR1482-ORM   K42356    38.3    102.9
pKR1482-ORM   K42358    37.8    101.8
pKR1482-ORM   K42349    36.7     98.9
pKR1482-ORM   K42357    36.2     97.5
pKR1482-ORM   K42348    36.0     96.8                 104.7
wt            K42363    38.4
wt            K42364    35.9

[0232] Table 11 shows that seed-specific down regulation of At5g17280 leads to increased oil content in Arabidopsis seed.
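
The "oil content % of WT" column can be reproduced approximately from the "% oil" column. The sketch below assumes, since the text does not state it, that each value is the line's oil content divided by the mean oil content of the wt controls grown in the same flat. The function name `percent_of_wt` is hypothetical.

```python
from statistics import mean


def percent_of_wt(oil_pct, wt_oil_pcts):
    """Express an oil content as a percentage of the mean WT oil content.

    Assumption: the WT reference is the arithmetic mean of the wt rows
    of the same table.
    """
    return 100.0 * oil_pct / mean(wt_oil_pcts)


# wt rows of Table 11 (two bulk-harvested batches)
wt_batches = [38.4, 35.9]

# Line K42351 from Table 11
k42351 = percent_of_wt(41.4, wt_batches)
print(f"K42351: {k42351:.1f}% of WT")
```

Because the published "% oil" values are rounded to one decimal place, the recomputed percentages can differ from the tabulated ones in the last digit.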

T2 seeds of event K42355 that carries transgene pKR1482-ORM (SEQ ID NO: 24) were plated on plant growth media containing kanamycin. Plants were grown to maturity alongside WT plants of the Columbia ecotype grown in the same flats. Oil content of T3 seed is depicted in Table 12.

TABLE 12
Seed oil content of T2 plants generated with binary vector pKR1482-ORM for seed specific gene suppression of At5g17280 (Experiment 1)

Event ID   Construct     BARCODE   % oil   oil content % of WT   avg. oil content % of WT
K42335     pKR1482-ORM   K44642    43.3    107.8
           pKR1482-ORM   K44650    43.1    107.3
           pKR1482-ORM   K44643    42.8    106.5
           pKR1482-ORM   K44637    42.6    106.0
           pKR1482-ORM   K44641    42.2    105.1
           pKR1482-ORM   K44647    41.6    103.5
           pKR1482-ORM   K44652    41.3    102.8
           pKR1482-ORM   K44636    41.3    102.7
           pKR1482-ORM   K44639    41.0    102.1
           pKR1482-ORM   K44646    41.0    102.0
           pKR1482-ORM   K44653    40.9    101.7
           pKR1482-ORM   K44649    40.4    100.5
           pKR1482-ORM   K44644    40.3    100.2
           pKR1482-ORM   K44657    39.9     99.2
           pKR1482-ORM   K44654    39.5     98.3
           pKR1482-ORM   K44656    39.0     97.1
           pKR1482-ORM   K44651    38.4     95.6                 102.0
           wt            K44658    41.7
           wt            K44661    41.3
           wt            K44663    41.2
           wt            K44664    41.1
           wt            K44666    40.7
           wt            K44662    40.1
           wt            K44665    38.8
           wt            K44668    38.4
           wt            K44667    38.3

T3 seeds of lines K44650 and K44637 derived from event K42355 that carries transgene pKR1482-ORM were plated on plant growth media containing kanamycin. Plants were grown to maturity alongside WT plants of the Columbia ecotype grown in the same flats. Oil content of T4 seed is depicted in Table 13.

TABLE 13
Seed oil content of T3 plants generated with binary vector pKR1482-ORM for seed specific gene suppression of At5g17280 (Experiment 1)

Event ID        Construct     BARCODE   % oil   oil content % of WT   avg. oil content % of WT
K42335/K44650   pKR1482-ORM   K49241    43.5    105.7
                pKR1482-ORM   K49231    43.3    105.3
                pKR1482-ORM   K49236    42.9    104.1
                pKR1482-ORM   K49227    42.8    104.0
                pKR1482-ORM   K49239    42.7    103.9
                pKR1482-ORM   K49234    42.7    103.8
                pKR1482-ORM   K49226    42.7    103.8
                pKR1482-ORM   K49249    42.6    103.6
                pKR1482-ORM   K49237    42.6    103.5
                pKR1482-ORM   K49233    42.6    103.4
                pKR1482-ORM   K49225    42.4    103.1
                pKR1482-ORM   K49228    42.4    103.0
                pKR1482-ORM   K49230    42.2    102.5
                pKR1482-ORM   K49244    42.1    102.3
                pKR1482-ORM   K49242    42.1    102.2
                pKR1482-ORM   K49232    42.0    102.1
                pKR1482-ORM   K49224    42.0    102.0
                pKR1482-ORM   K49248    41.8    101.6
                pKR1482-ORM   K49246    41.7    101.3
                pKR1482-ORM   K49238    41.6    101.0
                pKR1482-ORM   K49247    41.5    100.8
                pKR1482-ORM   K49245    41.5    100.7
                pKR1482-ORM   K49240    41.4    100.7
                pKR1482-ORM   K49250    41.3    100.4
                pKR1482-ORM   K49235    41.1     99.9
                pKR1482-ORM   K49229    41.1     99.8
                pKR1482-ORM   K49243    41.0     99.6                 102.4
                wt            K49255    42.2
                wt            K49257    41.8
                wt            K49252    41.7
                wt            K49256    41.5
                wt            K49251    40.9
                wt            K49253    40.3
                wt            K49254    39.6
K42335/K44637   pKR1482-ORM   K49600    42.3    116.5
                pKR1482-ORM   K49595    42.0    115.6
                pKR1482-ORM   K49596    41.9    115.2
                pKR1482-ORM   K49582    41.7    114.8
                pKR1482-ORM   K49598    41.5    114.2
                pKR1482-ORM   K49594    41.5    114.1
                pKR1482-ORM   K49591    41.4    113.9
                pKR1482-ORM   K49583    41.3    113.6
                pKR1482-ORM   K49592    41.1    113.2
                pKR1482-ORM   K49601    40.8    112.4
                pKR1482-ORM   K49576    40.8    112.2
                pKR1482-ORM   K49587    40.7    111.9
                pKR1482-ORM   K49599    40.5    111.4
                pKR1482-ORM   K49597    40.4    111.4
                pKR1482-ORM   K49579    40.4    111.2
                pKR1482-ORM   K49580    40.2    110.6
                pKR1482-ORM   K49578    40.1    110.4
                pKR1482-ORM   K49585    40.1    110.3
                pKR1482-ORM   K49586    40.0    110.3
                pKR1482-ORM   K49590    40.0    110.0
                pKR1482-ORM   K49588    39.6    109.1
                pKR1482-ORM   K49581    39.6    109.0
                pKR1482-ORM   K49584    39.3    108.3
                pKR1482-ORM   K49574    39.2    107.9
                pKR1482-ORM   K49593    39.2    107.8
                pKR1482-ORM   K49589    39.1    107.7
                pKR1482-ORM   K49577    39.0    107.3
                pKR1482-ORM   K49575    35.8     98.5                 111.0
                wt            K49604    39.1
                wt            K49603    37.7
                wt            K49606    36.7
                wt            K49602    34.1
                wt            K49605    33.9

Additional events were generated with pKR1482-ORM in a second experiment henceforth referred to as Experiment 2. Oil content of T1 and T2 plants of pKR1482-ORM events derived from Experiment 2 is shown in Tables 14 and 15.

TABLE 14
Seed oil content of T1 plants generated with binary vector pKR1482-ORM for seed specific gene suppression of At5g17280 (Experiment 2)

Construct     BARCODE   % oil   oil content % of WT
pKR1482-ORM   K47030    41.8    104.9
pKR1482-ORM   K47021    41.2    103.4
pKR1482-ORM   K47018    41.1    103.2
pKR1482-ORM   K47017    41.0    103.0
pKR1482-ORM   K47013    40.3    101.1
pKR1482-ORM   K47028    40.2    101.0
pKR1482-ORM   K47015    40.2    100.8
pKR1482-ORM   K47007    40.0    100.2
pKR1482-ORM   K47025    39.6     99.3
pKR1482-ORM   K47029    39.5     99.0
pKR1482-ORM   K47008    39.3     98.7
pKR1482-ORM   K47022    38.8     97.5
pKR1482-ORM   K47020    38.8     97.3
pKR1482-ORM   K47014    38.5     96.6
pKR1482-ORM   K47026    38.4     96.2
pKR1482-ORM   K47012    38.2     95.8
pKR1482-ORM   K47023    38.0     95.4
pKR1482-ORM   K47010    37.9     95.1
pKR1482-ORM   K47019    37.3     93.5
pKR1482-ORM   K47011    37.2     93.4
pKR1482-ORM   K47027    37.2     93.3
pKR1482-ORM   K47009    35.6     89.4
pKR1482-ORM   K47024    35.5     89.1
pKR1482-ORM   K47016    32.3     81.1
wt            K47308    40.9
wt            K47312    40.4
wt            K47306    40.3
wt            K47307    40.2
wt            K47302    40.1
wt            K47301    39.9
wt            K47310    39.7
wt            K47305    39.6
wt            K47309    39.5
wt            K47311    39.3
wt            K47304    39.2
wt            K47303    39.1

TABLE 15
Seed oil content of T2 plants generated with binary vector pKR1482-ORM for seed specific gene suppression of At5g17280 (Experiment 2)

Event ID   Construct     BARCODE   % oil   oil content % of WT   avg. oil content % of WT
K47021     pKR1482-ORM   K50089    44.5    107.6
           pKR1482-ORM   K50087    44.3    107.3
           pKR1482-ORM   K50093    44.3    107.3
           pKR1482-ORM   K50085    44.1    106.7
           pKR1482-ORM   K50086    43.9    106.3
           pKR1482-ORM   K50088    43.8    106.0
           pKR1482-ORM   K50091    43.6    105.6
           pKR1482-ORM   K50090    43.3    104.9
           pKR1482-ORM   K50094    43.0    104.2
           pKR1482-ORM   K50084    42.7    103.3
           pKR1482-ORM   K50092    42.5    102.8                 105.6
           wt            K50097    42.2
           wt            K50099    42.2
           wt            K50100    41.8
           wt            K50095    41.6
           wt            K50098    40.2
           wt            K50096    39.7
K47018     pKR1482-ORM   K50105    44.9    108.7
           pKR1482-ORM   K50102    44.7    108.2
           pKR1482-ORM   K50122    44.2    107.1
           pKR1482-ORM   K50109    44.2    107.0
           pKR1482-ORM   K50104    44.0    106.6
           pKR1482-ORM   K50114    44.0    106.5
           pKR1482-ORM   K50112    43.8    106.0
           pKR1482-ORM   K50111    43.7    105.9
           pKR1482-ORM   K50121    43.7    105.8
           pKR1482-ORM   K50115    43.6    105.7
           pKR1482-ORM   K50101    43.6    105.6
           pKR1482-ORM   K50106    43.6    105.6
           pKR1482-ORM   K50120    43.5    105.3
           pKR1482-ORM   K50123    43.4    105.2
           pKR1482-ORM   K50103    43.2    104.6
           pKR1482-ORM   K50110    43.1    104.4
           pKR1482-ORM   K50117    43.1    104.4
           pKR1482-ORM   K50108    43.0    104.1
           pKR1482-ORM   K50118    42.8    103.7
           pKR1482-ORM   K50119    42.5    103.0
           pKR1482-ORM   K50113    42.2    102.2
           pKR1482-ORM   K50107    42.1    101.9
           pKR1482-ORM   K50116    40.3     97.6                 105.0
           wt            K50129    42.8
           wt            K50132    42.7
           wt            K50130    42.7
           wt            K50133    42.5
           wt            K50134    42.3
           wt            K50124    42.2
           wt            K50127    41.7
           wt            K50128    41.3
           wt            K50125    39.7
           wt            K50126    39.2
           wt            K50131    37.1

Tables 11, 12, 13, 14 and 15 demonstrate that an oil increase of about 2-11% is associated with seed-specific down regulation of At5g17280. The oil increase is observed in multiple events and is heritable.

Example 6

Identification of cDNA Clones

[0233] cDNA clones encoding an ORM motif protein can be identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul et al. (1990) J. Mol. Biol. 215:403-410; see also the explanation of the BLAST algorithm on the world wide web site for the National Center for Biotechnology Information at the National Library of Medicine of the National Institutes of Health) searches for similarity to amino acid sequences contained in the BLAST "nr" database (comprising all non-redundant GenBank CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data Bank, the last major release of the SWISS-PROT protein sequence database, EMBL, and DDBJ databases). The DNA sequences from clones can be translated in all reading frames and compared for similarity to all publicly available protein sequences contained in the "nr" database using the BLASTX algorithm (Gish and States (1993) Nat. Genet. 3:266-272) provided by the NCBI. The polypeptides encoded by the cDNA sequences can be analyzed for similarity to all publicly available amino acid sequences contained in the "nr" database using the BLASTP algorithm provided by the National Center for Biotechnology Information (NCBI). For convenience, the P-value (probability) or the E-value (expectation) of observing a match of a cDNA-encoded sequence to a sequence contained in the searched databases merely by chance, as calculated by BLAST, is reported herein as a "pLog" value, which represents the negative of the logarithm of the reported P-value or E-value. Accordingly, the greater the pLog value, the greater the likelihood that the cDNA-encoded sequence and the BLAST "hit" represent homologous proteins.
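
The pLog transformation described above can be sketched in a few lines. The text does not state the base of the logarithm; base 10 is assumed here, and the function name `plog` is hypothetical.

```python
import math


def plog(p_or_e_value):
    """Return the negative base-10 logarithm of a BLAST P-value or E-value.

    Assumption: base 10. A reported value of 0 has no finite pLog,
    so it is rejected rather than silently mapped to infinity.
    """
    if p_or_e_value <= 0:
        raise ValueError("pLog is undefined for values <= 0")
    return -math.log10(p_or_e_value)


# A smaller chance probability gives a larger pLog.
weak_hit = plog(1e-5)
strong_hit = plog(1e-50)
print(weak_hit, strong_hit)
```

Under this convention, ranking hits by descending pLog is the same as ranking them by ascending P-value or E-value.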

[0234] EST sequences can be compared to the Genbank database as described above. ESTs that contain sequences more 5-prime or 3-prime can be found by using the BLASTN algorithm (Altschul et al (1997) Nucleic Acids Res. 25:3389-3402.) against the DUPONT.TM. proprietary database, comparing nucleotide sequences that share common or overlapping regions of sequence homology. Where common or overlapping sequences exist between two or more nucleic acid fragments, the sequences can be assembled into a single contiguous nucleotide sequence, thus extending the original fragment in either the 5-prime or 3-prime direction. Once the most 5-prime EST is identified, its complete sequence can be determined by Full Insert Sequencing as described above. Homologous genes belonging to different species can be found by comparing the amino acid sequence of a known gene (from either a proprietary source or a public database) against an EST database using the TBLASTN algorithm. The TBLASTN algorithm searches an amino acid query against a nucleotide database that is translated in all 6 reading frames. This search allows for differences in nucleotide codon usage between different species, and for codon degeneracy.
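
The six-frame translation that underlies a TBLASTN-style search can be illustrated as follows. This is a minimal sketch of the translation step only, not of the TBLASTN search itself; it assumes the standard genetic code, and the function names and the toy input sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Translate a DNA sequence in three forward and three reverse frames."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


frames = six_frame_translations("ATGGCC")  # toy sequence, not from the source
print(frames)
```

Because the comparison is made at the protein level, synonymous codons such as GCT and GCC (both alanine) score identically, which is how the search tolerates codon degeneracy and species-specific codon usage.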

Example 7

Characterization of cDNA Clones Encoding ORM protein Polypeptides

[0235] A cDNA library representing mRNAs from sunflower was prepared and cDNA clones encoding ORM polypeptides were identified. Clone hso1c.pk014.c16 was obtained from a cDNA library prepared from transgenic sunflower plants.

Example 8

Identification of Genes of Brassica napus Closely-Related to At5g17280

[0236] Public DNA sequences (NCBI and the Brassica napus EST assembly, version 3.0 (Jul. 30, 2007), from the Gene Index Project at Dana-Farber Cancer Institute) were searched using the predicted amino acid sequence of At5g17280 and tBLASTn. The assembly encompasses about 558465 public ESTs and has a total of 90310 sequences (47591 assemblies and 42719 singletons). There are three genes encoding proteins with homology to At5g17280. These genes, their % identity to At5g17280 and SEQ ID NOs are listed in Table 16.

TABLE 16
Brassica napus genes closely related to At5g17280

Gene name   % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
TC44737     51.8                                  25              26
TC52165     53.3                                  27              28
TC52879     48.2                                  29              30
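
A percent identity value of the kind tabulated here is computed from a pairwise alignment. The sketch below does not implement any alignment method; it only counts identical positions in two sequences that have already been aligned, and it assumes one particular convention (columns where both sequences have a gap are ignored, all other columns are counted). The function name and the toy sequences are hypothetical and the convention used for the tabulated values may differ.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns with a gap in both sequences are skipped;
    a gap opposite a residue counts as a compared, non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignment: 4 identical columns out of 6 compared.
identity = percent_identity("MKT-AL", "MRTQAL")
print(f"{identity:.1f}% identity")
```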

Example 9

Identification of Genes of Sunflower Closely-Related to At5g17280

[0237] Applicant's sunflower EST libraries were searched using the predicted amino acid sequence of At5g17280 and tBLASTn. There is one EST encoding a protein that shares 47.2% sequence identity to At5g17280. The gene, its % identity to At5g17280 and SEQ ID NOs are listed in Table 17. Clone hso1c.pk014.c16 shares 38.3% sequence identity with the public sequence from Populus trichocarpa (NCBI GI:118481427, SEQ ID NO:64) and 35.7% sequence identity with SEQ ID NO: 36271 of US20060123505 (SEQ ID NO:65).

TABLE 17
Sunflower (Helianthus annuus) gene closely related to At5g17280

Gene name         % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
hso1c.pk014.c16   39.1                                  31              32

Example 10

Identification of Genes of Castor Closely-Related to At5g17280

[0238] The non-redundant protein data set from NCBI, including non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF protein sequences, was searched using the predicted amino acid sequence of At5g17280 and tBLASTn. There is one gene, XM_002533611, which shares 50.7% amino acid sequence identity to At5g17280. This gene, its % identity to At5g17280 and SEQ ID NOs are listed in Table 18.

TABLE 18
Castor (Ricinus communis) gene closely related to At5g17280

Gene name      % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
XM_002533611   50.7                                  33              34

Example 11

Identification of Genes of Soybean (Glycine max) Closely-Related to At5g17280

[0239] Public DNA sequences (soybean cDNAs Glyma1.01 (JGI): predicted cDNAs from the soybean JGI Glyma1.01 genomic sequence, FGENESH predictions, and EST PASA analysis) were searched using the predicted amino acid sequence of At5g17280 and tBLASTn. There are two genes that encode proteins which share between 30.3 and 38.2% amino acid sequence identity with the predicted protein At5g17280. These genes, their properties and SEQ ID NOs are listed in Table 19.

TABLE 19
Soybean genes closely related to At5g17280

Gene name       % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
Glyma02g05870   38.2                                  35              36
Glyma16g24560   30.3                                  37              38

Example 12

Identification of Genes of Maize (Zea mays) Closely-Related to At5g17280

[0240] The filtered Gene Set cDNAs of the maize genome sequence in the public maize database were searched using the predicted amino acid sequence of At5g17280 and tBLASTn. In addition, applicant's maize EST database was searched in a similar fashion. The genes identified, their properties and SEQ ID NOs are listed in Table 20. Maize GRMZM2G132101 shares 94.4% sequence identity with the public sequence from maize, NCBI Gi NO: 195615148 (SEQ ID NO:66) and 93.3% sequence identity with SEQ ID NO:233249 of US20040214272 (SEQ ID NO:67). Maize cDNA pco642986 shares 95.5% sequence identity with the public sequence from maize, NCBI Gi NO: 195615148 (SEQ ID NO:66) and 96.6% sequence identity with SEQ ID NO:233249 of US20040214272 (SEQ ID NO:67). Maize cDNA pco597536 shares 99.2% sequence identity with the public sequence from maize, NCBI Gi NO: 195615148 (SEQ ID NO:66) and 100% sequence identity with SEQ ID NO:233249 of US20040214272 (SEQ ID NO:67).

TABLE 20
Maize genes closely related to At5g17280

Gene name       % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
GRMZM2G132101   33.7                                  39              40
pco642986       33.0                                  41              42
pco597536       30.9                                  43              44

Example 13

Identification of Genes of Rice (Oryza sativa) Closely-Related to At5g17280

[0241] A public database of transcripts from rice gene models (Oryza sativa (japonica cultivar-group) MSU Rice Genome Annotation Project Osa1 release 6 (January 2009)) which includes untranslated regions (UTR) but no introns was searched using the predicted amino acid sequence of At5g17280 and tBLASTn. There is one gene which shares 34.5% amino acid sequence identity to At5g17280. This gene, its % identity to At5g17280 and SEQ ID NOs are listed in Table 21.

TABLE 21
Rice gene closely related to At5g17280

Gene name    % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
Os09g36120   34.5                                  45              46

Example 14

Identification of Genes of Sorghum (Sorghum bicolor) Closely-Related to At5g17280

[0242] The predicted coding sequences (mRNA) from the Sorghum JGI genomic sequence, version 1.4 were searched using the predicted amino acid sequence of At5g17280 and tBLASTn. There is one gene which shares 30.9% amino acid sequence identity to At5g17280. This gene, its % identity to At5g17280 and SEQ ID NOs are listed in Table 22.

TABLE 22
Sorghum gene closely related to At5g17280

Gene name     % AA sequence identity to At5g17280   SEQ ID NO: NT   SEQ ID NO: AA
Sb02g030770   30.9                                  47              48

Example 15

Expression of Chimeric Genes in Monocot Cells

[0243] A chimeric gene comprising a cDNA encoding the instant polypeptides in sense orientation with respect to the maize 27 kD zein promoter that is located 5' to the cDNA fragment, and the 10 kD zein 3' end that is located 3' to the cDNA fragment, can be constructed. The cDNA fragment of this gene may be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites (NcoI or SmaI) can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the digested vector pML103 as described below. Amplification is then performed in a standard PCR. The amplified DNA is then digested with restriction enzymes NcoI and SmaI and fractionated on an agarose gel. The appropriate band can be isolated from the gel and combined with a 4.9 kb NcoI-SmaI fragment of the plasmid pML103. Plasmid pML103 has been deposited under the terms of the Budapest Treaty at ATCC (American Type Culture Collection, 10801 University Blvd., Manassas, Va. 20110-2209), and bears accession number ATCC 97366. The DNA segment from pML103 contains a 1.05 kb SalI-NcoI promoter fragment of the maize 27 kD zein gene and a 0.96 kb SmaI-SalI fragment from the 3' end of the maize 10 kD zein gene in the vector pGem9Zf(+) (Promega). Vector and insert DNA can be ligated at 15.degree. C. overnight, essentially as described (Maniatis). The ligated DNA may then be used to transform E. coli XL1-Blue (Epicurian Coli XL-1 Blue.TM.; Stratagene). Bacterial transformants can be screened by restriction enzyme digestion of plasmid DNA and limited nucleotide sequence analysis using the dideoxy chain termination method (Sequenase.TM. DNA Sequencing Kit; U.S. Biochemical). The resulting plasmid construct would comprise a chimeric gene encoding, in the 5' to 3' direction, the maize 27 kD zein promoter, a cDNA fragment encoding the instant polypeptides, and the 10 kD zein 3' region.

[0244] The chimeric gene described above can then be introduced into corn cells by the following procedure. Immature corn embryos can be dissected from developing caryopses derived from crosses of the inbred corn lines H99 and LH132. The embryos are isolated 10 to 11 days after pollination when they are 1.0 to 1.5 mm long. The embryos are then placed with the axis-side facing down and in contact with agarose-solidified N6 medium (Chu et al. (1975) Sci. Sin. Peking 18:659-668). The embryos are kept in the dark at 27.degree. C. Friable embryogenic callus consisting of undifferentiated masses of cells with somatic proembryoids and embryoids borne on suspensor structures proliferate from the scutellum of these immature embryos. The embryogenic callus isolated from the primary explant can be cultured on N6 medium and sub-cultured on this medium every 2 to 3 weeks.

[0245] The plasmid p35S/Ac (obtained from Dr. Peter Eckes, Hoechst AG, Frankfurt, Germany) may be used in transformation experiments in order to provide a selectable marker. This plasmid contains the pat gene (see European Patent Publication 0 242 236), which encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT confers resistance to herbicidal glutamine synthetase inhibitors such as phosphinothricin. The pat gene in p35S/Ac is under the control of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313:810-812) and the 3' region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens.

[0246] The particle bombardment method (Klein et al. (1987) Nature 327:70-73) may be used to transfer genes to the callus culture cells. According to this method, gold particles (1 .mu.m in diameter) are coated with DNA using the following technique. Ten .mu.g of plasmid DNAs are added to 50 .mu.L of a suspension of gold particles (60 mg per mL). Calcium chloride (50 .mu.L of a 2.5 M solution) and spermidine free base (20 .mu.L of a 1.0 M solution) are added to the particles. The suspension is vortexed during the addition of these solutions. After 10 minutes, the tubes are briefly centrifuged (5 sec at 15,000 rpm) and the supernatant removed. The particles are resuspended in 200 .mu.L of absolute ethanol, centrifuged again and the supernatant removed. The ethanol rinse is performed again and the particles resuspended in a final volume of 30 .mu.L of ethanol. An aliquot (5 .mu.L) of the DNA-coated gold particles can be placed in the center of a Kapton.TM. flying disc (Bio-Rad Labs). The particles are then accelerated into the corn tissue with a Biolistic.TM. PDS-1000/He (Bio-Rad Instruments, Hercules Calif.), using a helium pressure of 1000 psi, a gap distance of 0.5 cm and a flying distance of 1.0 cm.
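
The quantities in the coating procedure above imply the amounts of gold and DNA delivered per bombardment. The following sketch works through that arithmetic. It assumes that all of the DNA binds the particles and that nothing is lost in the ethanol rinses, so the results are upper bounds; the function name is hypothetical.

```python
def per_shot_amounts(dna_ug, gold_ul, gold_mg_per_ml, final_ul, aliquot_ul):
    """Return (DNA in ug, gold in mg) contained in one aliquot.

    Assumptions: complete DNA binding and no particle loss during washes.
    """
    gold_mg = gold_ul * gold_mg_per_ml / 1000.0  # uL * mg/mL -> mg
    fraction = aliquot_ul / final_ul             # share of the prep per shot
    return dna_ug * fraction, gold_mg * fraction


# Values from the paragraph above: 10 ug DNA, 50 uL of 60 mg/mL gold,
# resuspended in 30 uL ethanol, 5 uL loaded per flying disc.
dna_per_shot, gold_per_shot = per_shot_amounts(10, 50, 60, 30, 5)
print(f"{dna_per_shot:.2f} ug DNA and {gold_per_shot:.2f} mg gold per shot")
```

With these inputs, one 30 .mu.L preparation provides six 5 .mu.L aliquots.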

[0247] For bombardment, the embryogenic tissue is placed on filter paper over agarose-solidified N6 medium. The tissue is arranged as a thin lawn and covers a circular area of about 5 cm in diameter. The petri dish containing the tissue can be placed in the chamber of the PDS-1000/He approximately 8 cm from the stopping screen. The air in the chamber is then evacuated to a vacuum of 28 inches of Hg. The macrocarrier is accelerated with a helium shock wave using a rupture membrane that bursts when the He pressure in the shock tube reaches 1000 psi. Seven days after bombardment the tissue can be transferred to N6 medium that contains glufosinate (2 mg per liter) and lacks casein or proline. The tissue continues to grow slowly on this medium. After an additional 2 weeks the tissue can be transferred to fresh N6 medium containing glufosinate. After 6 weeks, areas of about 1 cm in diameter of actively growing callus can be identified on some of the plates containing the glufosinate-supplemented medium. These calli may continue to grow when sub-cultured on the selective medium.

[0248] Plants can be regenerated from the transgenic callus by first transferring clusters of tissue to N6 medium supplemented with 0.2 mg per liter of 2,4-D. After two weeks the tissue can be transferred to regeneration medium (Fromm et al. (1990) Bio/Technology 8:833-839).

Example 16

Expression of Chimeric Genes in Dicot Cells

[0249] A seed-specific construct composed of the promoter and transcription terminator from the gene encoding the .beta. subunit of the seed storage protein phaseolin from the bean Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem. 261:9228-9238) can be used for expression of the instant polypeptides in transformed soybean. The phaseolin construct includes about 500 nucleotides upstream (5') from the translation initiation codon and about 1650 nucleotides downstream (3') from the translation stop codon of phaseolin. Between the 5' and 3' regions are the unique restriction endonuclease sites Nco I (which includes the ATG translation initiation codon), Sma I, Kpn I and Xba I. The entire construct is flanked by Hind III sites.

[0250] The cDNA fragment of this gene may be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the expression vector. Amplification is then performed as described above, and the isolated fragment is inserted into a pUC18 vector carrying the seed construct.

[0251] Soybean embryos may then be transformed with the expression vector comprising sequences encoding the instant polypeptides. To induce somatic embryos, cotyledons, 3-5 mm in length dissected from surface sterilized, immature seeds of the soybean cultivar A2872 can be cultured in the light or dark at 26.degree. C. on an appropriate agar medium for 6-10 weeks. Somatic embryos which produce secondary embryos are then excised and placed into a suitable liquid medium. After repeated selection for clusters of somatic embryos which multiplied as early, globular staged embryos, the suspensions are maintained as described below. Soybean embryogenic suspension cultures can be maintained in 35 mL of liquid media on a rotary shaker, 150 rpm, at 26.degree. C. with fluorescent lights on a 16:8 hour day/night schedule. Cultures are subcultured every two weeks by inoculating approximately 35 mg of tissue into 35 mL of liquid medium.

Soybean embryogenic suspension cultures may then be transformed by the method of particle gun bombardment (Klein et al. (1987) Nature (London) 327:70-73, U.S. Pat. No. 4,945,050). A DuPont Biolistic.TM. PDS1000/HE instrument (helium retrofit) can be used for these transformations.

[0252] A selectable marker gene which can be used to facilitate soybean transformation is a chimeric gene composed of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313:810-812), the hygromycin phosphotransferase gene from plasmid pJR225 (from E. coli; Gritz et al. (1983) Gene 25:179-188) and the 3' region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. The seed construct comprising the phaseolin 5' region, the fragment encoding the instant polypeptides and the phaseolin 3' region can be isolated as a restriction fragment. This fragment can then be inserted into a unique restriction site of the vector carrying the marker gene. To 50 .mu.L of a 60 mg/mL 1 .mu.m gold particle suspension is added (in order): 5 .mu.L DNA (1 .mu.g/.mu.L), 20 .mu.L spermidine (0.1 M), and 50 .mu.L CaCl.sub.2 (2.5M). The particle preparation is then agitated for three minutes, spun in a microfuge for 10 seconds and the supernatant removed. The DNA-coated particles are then washed once in 400 .mu.L 70% ethanol and resuspended in 40 .mu.L of anhydrous ethanol. The DNA/particle suspension can be sonicated three times for one second each. Five .mu.L of the DNA-coated gold particles are then loaded on each macro carrier disk. Approximately 300-400 mg of a two-week-old suspension culture is placed in an empty 60.times.15 mm petri dish and the residual liquid removed from the tissue with a pipette. For each transformation experiment, approximately 5-10 plates of tissue are normally bombarded. Membrane rupture pressure is set at 1100 psi and the chamber is evacuated to a vacuum of 28 inches of mercury. The tissue is placed approximately 3.5 inches away from the retaining screen and bombarded three times. Following bombardment, the tissue can be divided in half and placed back into liquid and cultured as described above.

[0253] Five to seven days post bombardment, the liquid media may be exchanged with fresh media, and eleven to twelve days post bombardment with fresh media containing 50 mg/mL hygromycin. This selective media can be refreshed weekly. Seven to eight weeks post bombardment, green, transformed tissue may be observed growing from untransformed, necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Each new line may be treated as an independent transformation event. These suspensions can then be subcultured and maintained as clusters of immature embryos or regenerated into whole plants by maturation and germination of individual somatic embryos.

Example 17

Expression of Chimeric Genes in Microbial Cells

[0254] The cDNAs encoding the instant polypeptides can be inserted into the T7 E. coli expression vector pBT430. This vector is a derivative of pET-3a (Rosenberg et al. (1987) Gene 56:125-135) which employs the bacteriophage T7 RNA polymerase/T7 promoter system. Plasmid pBT430 was constructed by first destroying the EcoR I and Hind III sites in pET-3a at their original positions. An oligonucleotide adaptor containing EcoR I and Hind III sites was inserted at the BamH I site of pET-3a. This created pET-3aM with additional unique cloning sites for insertion of genes into the expression vector. Then, the Nde I site at the position of translation initiation was converted to an Nco I site using oligonucleotide-directed mutagenesis. The DNA sequence of pET-3aM in this region, 5'-CATATGG, was converted to 5'-CCCATGG in pBT430.
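
The effect of the mutagenesis described above can be checked directly on the two quoted sequences. The sketch below relies on the recognition sequences of Nde I (CATATG) and Nco I (CCATGG); the helper name `sites_present` is hypothetical.

```python
# Recognition sequences of the two enzymes.
NDE_I = "CATATG"
NCO_I = "CCATGG"


def sites_present(seq):
    """Report which of the two recognition sites occur in a sequence."""
    return {"NdeI": NDE_I in seq, "NcoI": NCO_I in seq}


pet3am_region = "CATATGG"  # sequence quoted for pET-3aM
pbt430_region = "CCCATGG"  # sequence quoted for pBT430

print(sites_present(pet3am_region))
print(sites_present(pbt430_region))

# The ATG translation initiation codon is retained in both versions.
print("ATG" in pet3am_region, "ATG" in pbt430_region)
```

The change replaces the Nde I site with an Nco I site while leaving the ATG start codon in place, so a coding sequence can be joined in frame at the Nco I site.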

[0255] Plasmid DNA containing a cDNA may be appropriately digested to release a nucleic acid fragment encoding the protein. This fragment may then be purified on a 1% NuSieve GTG™ low melting agarose gel (FMC). Buffer and agarose contain 10 µg/mL ethidium bromide for visualization of the DNA fragment. The fragment can then be purified from the agarose gel by digestion with GELase™ (Epicentre Technologies) according to the manufacturer's instructions, ethanol precipitated, dried and resuspended in 20 µL of water. Appropriate oligonucleotide adapters may be ligated to the fragment using T4 DNA ligase (New England Biolabs, Beverly, Mass.). The fragment containing the ligated adapters can be purified from the excess adapters using low melting agarose as described above. The vector pBT430 is digested, dephosphorylated with alkaline phosphatase (NEB) and deproteinized with phenol/chloroform as described above. The prepared vector pBT430 and fragment can then be ligated at 16° C. for 15 hours followed by transformation into DH5 electrocompetent cells (GIBCO BRL). Transformants can be selected on agar plates containing LB media and 100 µg/mL ampicillin. Transformants containing the gene encoding the instant polypeptides are then screened for the correct orientation with respect to the T7 promoter by restriction enzyme analysis. For high level expression, a plasmid clone with the cDNA insert in the correct orientation relative to the T7 promoter can be transformed into E. coli strain BL21(DE3) (Studier et al. (1986) J. Mol. Biol. 189:113-130). Cultures are grown in LB medium containing ampicillin (100 mg/L) at 25° C. At an optical density at 600 nm of approximately 1, IPTG (isopropylthio-β-galactoside, the inducer) can be added to a final concentration of 0.4 mM and incubation can be continued for 3 h at 25° C. Cells are then harvested by centrifugation and re-suspended in 50 µL of 50 mM Tris-HCl at pH 8.0 containing 0.1 mM DTT and 0.2 mM phenylmethylsulfonyl fluoride. A small amount of 1 mm glass beads can be added and the mixture sonicated 3 times for about 5 seconds each time with a microprobe sonicator. The mixture is centrifuged and the protein concentration of the supernatant determined. One µg of protein from the soluble fraction of the culture can be separated by SDS-polyacrylamide gel electrophoresis. Gels can be observed for protein bands migrating at the expected molecular weight.

Example 18

Transformation of Somatic Soybean Embryo Cultures

Generic Stable Soybean Transformation Protocol:

[0256] Soybean embryogenic suspension cultures are maintained in 35 ml liquid media (SB55 or SBP6) on a rotary shaker, 150 rpm, at 28° C. with mixed fluorescent and incandescent lights on a 16:8 h day/night schedule. Cultures are subcultured every four weeks by inoculating approximately 35 mg of tissue into 35 ml of liquid medium.

TABLE-US-00023 TABLE 23
Stock Solutions (g/L):

MS Sulfate 100X Stock
  MgSO4·7H2O        37.0
  MnSO4·H2O         1.69
  ZnSO4·7H2O        0.86
  CuSO4·5H2O        0.0025

MS Halides 100X Stock
  CaCl2·2H2O        44.0
  KI                0.083
  CoCl2·6H2O        0.00125
  KH2PO4            17.0
  H3BO3             0.62
  Na2MoO4·2H2O      0.025

MS FeEDTA 100X Stock
  Na2EDTA           3.724
  FeSO4·7H2O        2.784

B5 Vitamin Stock
  10 g m-inositol
  100 mg nicotinic acid
  100 mg pyridoxine HCl
  1 g thiamine

SB55 (per Liter, pH 5.7)
  10 ml each MS stocks
  1 ml B5 Vitamin stock
  0.8 g NH4NO3
  3.033 g KNO3
  1 ml 2,4-D (10 mg/mL stock)
  60 g sucrose
  0.667 g asparagine

SBP6
  same as SB55 except 0.5 ml 2,4-D

SB103 (per Liter, pH 5.7)
  1X MS Salts
  6% maltose
  750 mg MgCl2
  0.2% Gelrite

SB71-1 (per Liter, pH 5.7)
  1X B5 salts
  1 ml B5 vitamin stock
  3% sucrose
  750 mg MgCl2
  0.2% Gelrite

[0257] Soybean embryogenic suspension cultures are transformed with plasmid DNA by the method of particle gun bombardment (Klein et al (1987) Nature 327:70). A DuPont Biolistic PDS1000/HE instrument (helium retrofit) is used for these transformations.

[0258] To 50 µl of a 60 mg/ml 1 µm gold particle suspension is added (in order): 5 µl DNA (1 µg/µl), 20 µl spermidine (0.1 M), and 50 µl CaCl2 (2.5 M). The particle preparation is agitated for 3 min, spun in a microfuge for 10 sec and the supernatant removed. The DNA-coated particles are then washed once in 400 µl 70% ethanol and resuspended in 40 µl of anhydrous ethanol. The DNA/particle suspension is sonicated three times for 1 sec each. Five µl of the DNA-coated gold particles are then loaded on each macro carrier disk. For selection, a plasmid conferring resistance to hygromycin by expression of hygromycin phosphotransferase (HPT) may be co-bombarded with the silencing construct of interest.

[0259] Approximately 300-400 mg of a four-week-old suspension culture is placed in an empty 60×15 mm petri dish and the residual liquid removed from the tissue with a pipette. For each transformation experiment, approximately 5-10 plates of tissue are normally bombarded. Membrane rupture pressure is set at 1000 psi and the chamber is evacuated to a vacuum of 28 inches of mercury. The tissue is placed approximately 3.5 inches away from the retaining screen and bombarded three times. Following bombardment, the tissue is placed back into liquid and cultured as described above.

[0260] Eleven days post bombardment, the liquid media is exchanged with fresh SB55 containing 50 mg/ml hygromycin. The selective media is refreshed weekly. Seven weeks post bombardment, green, transformed tissue is observed growing from untransformed, necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Thus each new line is treated as an independent transformation event. These suspensions can then be maintained as suspensions of embryos maintained in an immature developmental stage or regenerated into whole plants by maturation and germination of individual somatic embryos.

[0261] Independent lines of transformed embryogenic clusters are removed from liquid culture and placed on a solid agar media (SB103) containing no hormones or antibiotics. Embryos are cultured for four weeks at 26° C. with mixed fluorescent and incandescent lights on a 16:8 h day/night schedule. During this period, individual embryos are removed from the clusters and screened for alterations in gene expression.

[0262] It should be noted that any detectable phenotype resulting from the altered expression of a target gene can be screened at this stage. This would include, but not be limited to, alterations in oil content, protein content, carbohydrate content, growth rate, viability, or the ability to develop normally into a soybean plant.

Example 19

Plasmid DNAs for "Complementary Region" Co-Suppression

[0263] The plasmids in the following experiments are made using standard cloning methods well known to those skilled in the art (Sambrook et al (1989) Molecular Cloning, CSHL Press, New York). A starting plasmid pKS18HH (U.S. Pat. No. 5,846,784, the contents of which are hereby incorporated by reference) contains a hygromycin B phosphotransferase (HPT) obtained from E. coli strain W677 under the control of a T7 promoter and the 35S cauliflower mosaic virus promoter. Plasmid pKS18HH thus contains the T7 promoter/HPT/T7 terminator cassette for expression of the HPT enzyme in certain strains of E. coli, such as NovaBlue (DE3) [from Novagen], that are lysogenic for lambda DE3 (which carries the T7 RNA Polymerase gene under lacUV5 control). Plasmid pKS18HH also contains the 35S/HPT/NOS cassette for constitutive expression of the HPT enzyme in plants, such as soybean. These two expression systems allow selection for growth in the presence of hygromycin to be used as a means of identifying cells that contain the plasmid in both bacterial and plant systems. pKS18HH also contains three unique restriction endonuclease sites suitable for the cloning of other chimeric genes into this vector. Plasmid ZBL100 (PCT Application No. WO 00/11176 published on Mar. 2, 2000) is a derivative of pKS18HH with a reduced NOS 3' terminator. Plasmid pKS67 is a ZBL100 derivative with the insertion of a beta-conglycinin promoter, in front of a NotI cloning site, followed by a phaseolin 3' terminator (described in PCT Application No. WO 94/11516, published on May 26, 1994).

[0264] The 2.5 kb plasmid pKS17 contains pSP72 (obtained from Promega Biosystems) and the T7 promoter/HPT/T7 3' terminator region, and is the original vector into which the 3.2 kb BamHI-SalI fragment containing the 35S/HPT/NOS cassette was cloned to form pKS18HH. The plasmid pKS102 is a pKS17 derivative that is digested with XhoI and SalI, treated with mung-bean nuclease to generate blunt ends, and ligated to insert the linker described in SEQ ID NO:49.

[0265] The plasmid pKS83 has the 2.3 kb BamHI fragment of ML70 containing the Kti3 promoter/NotI/Kti3 3' terminator region (described in PCT Application No. WO 94/11516, published on May 26, 1994) ligated into the BamHI site of pKS17. Additional methods for suppression of endogenous genes are well known in the art, have been described in the detailed description of the instant invention, and can be used to reduce the expression of endogenous ORM protein or enzyme activity in a plant cell.

Example 20

Suppression by ELVISLIVES Complementary Region

[0266] Constructs can be made which have "synthetic complementary regions" (SCR). In this example the target sequence is placed between complementary sequences that are not known to be part of any biologically derived gene or genome (i.e. sequences that are "synthetic" or conjured up from the mind of the inventor). The target DNA would therefore be in the sense or antisense orientation and the complementary RNA would be unrelated to any known nucleic acid sequence. It is possible to design a standard "suppression vector" into which pieces of any target gene for suppression could be dropped. The plasmids pKS106, pKS124, and pKS133 (SEQ ID NO:50) exemplify this. One skilled in the art will appreciate that all of the plasmid vectors contain antibiotic selection genes such as, but not limited to, hygromycin phosphotransferase with promoters such as the T7 inducible promoter.

[0267] pKS106 uses the beta-conglycinin promoter while the pKS124 and pKS133 plasmids use the Kti promoter; both of these promoters exhibit strong tissue-specific expression in the seeds of soybean. pKS106 uses a 3' termination region from the phaseolin gene, and pKS124 and pKS133 use a Kti 3' termination region. pKS106 and pKS124 have single copies of the 36 nucleotide EagI-ELVISLIVES sequence surrounding a NotI site (the amino acids given in parentheses are back-translated from the complementary strand):

TABLE-US-00024 SEQ ID NO: 51
EagI    E   L   V   I   S   L   I   V   E   S   NotI
CGGCCG  GAG CTG GTC ATC TCG CTC ATC GTC GAG TCG GCGGCCGC

(S) (E) (V) (I) (L) (S) (I) (V) (L) (E) EagI
CGA CTC GAC GAT GAG CGA GAT GAC CAG CTC CGGCCG
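The relationship between the two halves of SEQ ID NO:51 can be checked computationally. The following is a minimal sketch, assuming the sequence is read exactly as printed above; the function names and the partial codon table are illustrative and are not part of the original disclosure. It confirms that the 30 nucleotide stretch after the first EagI site translates to E-L-V-I-S-L-I-V-E-S, that the stretch after the NotI site is its reverse complement, and that EagI plus ELVISLIVES is 36 nucleotides long.

```python
# Partial codon table covering only the codons used in the ELVISLIVES linker.
CODONS = {"GAG": "E", "CTG": "L", "CTC": "L", "GTC": "V", "ATC": "I", "TCG": "S"}


def translate(dna):
    """Translate an upper-case DNA string codon by codon using the partial table."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna))


EAGI = "CGGCCG"
EL_SENSE = "GAGCTGGTCATCTCGCTCATCGTCGAGTCG"      # between EagI and NotI
EL_ANTISENSE = "CGACTCGACGATGAGCGAGATGACCAGCTC"  # between NotI and the second EagI

peptide = translate(EL_SENSE)
is_inverted_repeat = reverse_complement(EL_SENSE) == EL_ANTISENSE
unit_length = len(EAGI + EL_SENSE)

print(peptide, is_inverted_repeat, unit_length)
```

Because the two arms are exact reverse complements, a transcript spanning both arms can fold back on itself, which is the property the complementary region design relies on.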

pKS133 has two copies of ELVISLIVES surrounding the NotI site:

TABLE-US-00025 SEQ ID NO: 52
EagI E L V I S L I V E S EagI E L V I S
cggccggagctggtcatctcgctcatcgtcgagtcg gcggccg gagctggtcatctcg

L I V E S NotI (S)(E)(V)(I)(L)(S)(I)(V)(L)(E) EagI
ctcatcgtcgagtcg gcggccgc cgactcgacgatgagcgagatgacc agctc cggccgc

(S)(E)(V)(I)(L)(S)(I)(V)(L)(E) EagI
cgactcgacgatgagcgagatgaccagctc cggccg

[0268] The idea is that the single EL linker (SCR) can be duplicated to increase stem lengths in increments of approximately 40 nucleotides. A series of vectors will cover the SCR lengths between 40 bp and 300 bp. Various target gene lengths can also be evaluated. It is believed that certain combinations of target lengths and complementary region lengths will give optimum suppression of the target; however, it is expected that the suppression phenomenon works well over a wide range of sizes and sequences. It is also believed that the lengths and ratios providing optimum suppression may vary somewhat given different target sequences and/or complementary regions.

[0269] The plasmid pKS106 is made by putting the EagI fragment of ELVISLIVES (SEQ ID NO:51) into the NotI site of pKS67. The ELVISLIVES fragment is made by PCR using two primers (SEQ ID NO:53 and SEQ ID NO:54) and no other DNA.

[0270] The product of the PCR reaction is digested with EagI (5'-CGGCCG-3') and then ligated into NotI digested pKS67. The terms "ELVISLIVES" and "EL" are used interchangeably herein.

[0271] Additional plasmids can be used to test this example and any synthetic sequence, or naturally occurring sequence, can be used in an analogous manner.

Example 21

Screening of Transgenic Lines for Alterations in Oil, Protein, Starch and Soluble Carbohydrate Content

[0272] Transgenic lines can be selected from soybean transformed with a suppression plasmid, such as those described in Example 19 and Example 20. Transgenic lines can be screened for down regulation of plastidic HpaIL aldolase in soybean, by measuring alteration in oil, starch, protein, soluble carbohydrate and/or seed weight. Compositional analysis including measurements of seed compositional parameters such as protein content and content of soluble carbohydrates of soybean seed derived from transgenic events that show seed-specific down-regulation of ORM genes is performed as follows:

Oil content of mature soybean seed or lyophilized soybean somatic embryos can be measured by NMR as described in Example 2.

Non-structural carbohydrate and protein analysis.

[0273] Dry soybean seed are ground to a fine powder in a GenoGrinder and subsamples are weighed (to an accuracy of 0.0001 g) into 13×100 mm glass tubes; the tubes have Teflon® lined screw-cap closures. Three replicates are prepared for each sample tested. Tissue dry weights are calculated by weighing sub-samples before and after drying in a forced air oven for 18 h at 105 C.

[0274] Lipid extraction is performed by adding 2 ml aliquots of heptane to each tube. The tubes are vortex mixed and placed into an ultrasonic bath (VWR Scientific Model 750D) filled with water heated to 60 C. The samples are sonicated at full power (~360 W) for 15 min and are then centrifuged (5 min × 1700 g). The supernatants are transferred to clean 13×100 mm glass tubes and the pellets are extracted 2 more times with heptane (2 ml for the second extraction, 1 ml for the third extraction) with the supernatants from each extraction being pooled. After lipid extraction, 1 ml acetone is added to the pellets and, after vortex mixing to fully disperse the material, they are taken to dryness in a Speedvac.

Non-structural carbohydrate extraction and analysis.

[0275] Two ml of 80% ethanol is added to the acetone dried pellets from above. The samples are thoroughly vortex mixed until the plant material is fully dispersed in the solvent prior to sonication at 60 C for 15 min. After centrifugation (5 min × 1700 g), the supernatants are decanted into clean 13×100 mm glass tubes. Two more extractions with 80% ethanol are performed and the supernatants from each are pooled. The extracted pellets are suspended in acetone and dried (as above). An internal standard, β-phenyl glucopyranoside (100 µl of a 0.5000+/-0.0010 g/100 ml stock), is added to each extract prior to drying in a Speedvac. The extracts are maintained in a desiccator until further analysis.

[0276] The acetone dried powders from above are suspended in 0.9 ml MOPS (3-N[Morpholino]propane-sulfonic acid; 50 mM, 5 mM CaCl2, pH 7.0) buffer containing 100 U of heat stable α-amylase (from Bacillus licheniformis; Sigma A-4551). Samples are placed in a heat block (90 C) for 75 min and are vortex mixed every 15 min. Samples are then allowed to cool to room temperature and 0.6 ml acetate buffer (285 mM, pH 4.5) containing 5 U amyloglucosidase (Roche 110 202 367 001) is added to each. Samples are incubated for 15-18 h at 55 C in a water bath fitted with a reciprocating shaker; standards of soluble potato starch (Sigma S-2630) are included to ensure that starch digestion goes to completion.

[0277] Post-digestion, the released carbohydrates are extracted prior to analysis. Absolute ethanol (6 ml) is added to each tube and, after vortex mixing, the samples are sonicated for 15 min at 60 C. Samples are centrifuged (5 min × 1700 g) and the supernatants are decanted into clean 13×100 mm glass tubes. The pellets are extracted 2 more times with 3 ml of 80% ethanol and the resulting supernatants are pooled. Internal standard (100 µl β-phenyl glucopyranoside, as above) is added to each sample prior to drying in a Speedvac.

Sample Preparation and Analysis

[0278] The dried samples from the soluble and starch extractions described above are solubilized in anhydrous pyridine (Sigma-Aldrich P57506) containing 30 mg/ml of hydroxylamine HCl (Sigma-Aldrich 159417). Samples are placed on an orbital shaker (300 rpm) overnight and are then heated for 1 hr (75 C) with vigorous vortex mixing applied every 15 min. After cooling to room temperature, 1 ml hexamethyldisilazane (Sigma-Aldrich H-4875) and 100 µl trifluoroacetic acid (Sigma-Aldrich T-6508) are added. The samples are vortex mixed and the precipitates are allowed to settle prior to transferring the supernatants to GC sample vials. Samples are analyzed on an Agilent 6890 gas chromatograph fitted with a DB-17MS capillary column (15 m × 0.32 mm × 0.25 µm film). Inlet and detector temperatures are both 275 C. After injection (2 µl, 20:1 split) the initial column temperature (150 C) is increased to 180 C at a rate of 3 C/min and then at 25 C/min to a final temperature of 320 C. The final temperature is maintained for 10 min. The carrier gas is H2 at a linear velocity of 51 cm/sec. Detection is by flame ionization. Data analysis is performed using Agilent ChemStation software. Each sugar is quantified relative to the internal standard, and detector response factors are applied for each individual carbohydrate (calculated from standards run with each set of samples). Final carbohydrate concentrations are expressed on a tissue dry weight basis.
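The oven program and the internal-standard quantification described above can be illustrated with a short calculation. This is a sketch under stated assumptions, not the ChemStation procedure itself: the peak areas, the response factor of 1.0 and the 50 mg dry weight are hypothetical placeholders, and the response factor is assumed to be defined as the detector response of the sugar relative to the internal standard per unit mass. Only the temperatures, ramp rates, hold time and internal standard stock (100 µl of a 0.5 g/100 ml solution) are taken from the text.

```python
def oven_program_minutes(start_c, ramps, final_hold_min):
    """Total run time for a list of (target_c, rate_c_per_min) ramps plus a final hold."""
    total, current = 0.0, start_c
    for target_c, rate in ramps:
        total += (target_c - current) / rate
        current = target_c
    return total + final_hold_min


def sugar_ug_per_mg(area_sugar, area_is, response_factor, is_mass_ug, dry_weight_mg):
    """Sugar content per mg dry tissue, quantified against the internal standard."""
    return (area_sugar / area_is) * is_mass_ug / response_factor / dry_weight_mg


# Oven program from the text: 150 C to 180 C at 3 C/min, then to 320 C at 25 C/min, 10 min hold.
run_time = oven_program_minutes(150, [(180, 3), (320, 25)], 10)

# Internal standard added per sample: 100 ul of a 0.5 g/100 ml stock.
stock_ug_per_ul = 0.5 * 1e6 / (100 * 1000)   # g/100 ml converted to ug/ul
is_mass_ug = stock_ug_per_ul * 100

# Hypothetical peak areas, response factor and dry weight, for illustration only.
example = sugar_ug_per_mg(area_sugar=2000, area_is=10000,
                          response_factor=1.0, is_mass_ug=is_mass_ug,
                          dry_weight_mg=50)

print(round(run_time, 1), is_mass_ug, round(example, 2))
```

Under these assumptions each chromatographic run lasts about 25.6 min and each sample carries 500 µg of internal standard.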

Protein Analysis

[0279] Protein contents are estimated by combustion analysis on a Thermo Finnigan Flash 1112EA combustion analyzer. Samples, 4-8 mg, weighed to an accuracy of 0.001 mg on a Mettler-Toledo MX5 micro balance are used for analysis. Protein contents are calculated by multiplying % N, determined by the analyzer, by 6.25. Final protein contents are expressed on a % tissue dry weight basis. Additionally, the composition of intact single seed and bulk quantities of seed, or powders derived from them, may be measured by near-infrared analysis. Moisture, protein and oil content in soy, and moisture, protein, oil and starch content in corn, can be measured by this method when combined with the appropriate calibrations.
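The nitrogen-to-protein conversion can be written out as a one-line calculation. The sketch below uses the factor of 6.25 stated in the text; the nitrogen reading of 6.4% is a hypothetical value chosen only to illustrate the arithmetic.

```python
N_TO_PROTEIN = 6.25  # conversion factor stated in the text


def protein_percent(nitrogen_percent):
    """Estimate protein content (%) from the % N reported by the combustion analyzer."""
    return nitrogen_percent * N_TO_PROTEIN


# Hypothetical analyzer reading of 6.4 % N on a dry weight basis.
example_protein = protein_percent(6.4)
print(round(example_protein, 1))
```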

Example 22

Screening of Transgenic Maize Lines for Alterations in Oil, Protein, Starch and Soluble Carbohydrate Content

[0280] Transgenic maize lines prepared by the method described in Example 15 can be screened essentially as described in Example 21. Embryo-specific downregulation of ORM gene expression is expected to lead to an increase in seed oil content. In contrast, endosperm-specific overexpression of ORM genes is expected to lead to an increase in seed starch and/or protein content.

Example 23

Seed-Specific RNAi of ORM Genes in Soybean

[0281] A plasmid vector (pKS433) for generation of transgenic soybean events that show seed specific down-regulation of the soy ORM genes corresponding to Glyma02g05870 and Glyma16g24560 genes was constructed.

[0282] Briefly, plasmid DNA of applicants' EST clone sl1.pk0142.e6 corresponding to Glyma02g05870 (SEQ ID NO:35) was used in PCR reactions with primers SA195 (SEQ ID NO:55) and SA196 (SEQ ID NO:56), and with primers SA200 (SEQ ID NO:57) and SA201 (SEQ ID NO:58). A PCR product of 0.39 kb was generated with SA195 (SEQ ID NO:55) and SA196 (SEQ ID NO:56). It was gel purified and is henceforth known as product A. A PCR product of 0.19 kb was generated with SA200 (SEQ ID NO:57) and SA201 (SEQ ID NO:58). It was gel purified and is henceforth known as product B. PCR products A and B were cloned into pGEM T to give pGEM TA (SEQ ID NO:59) and pGEM TB (SEQ ID NO:60), respectively. pGEM TA (SEQ ID NO:59) was digested with HhaI. The digested DNA was treated with Klenow polymerase (NEB, Ipswich, Mass., USA); specifically, the 3'-5' exonuclease activity of said enzyme was used to create blunt ends. A 0.58 kb DNA fragment was gel-purified. pGEM TB (SEQ ID NO:60) was linearized by digestion with BamHI. Overhanging ends were filled in with Klenow polymerase activity and 3' ends were dephosphorylated using calf intestinal phosphatase (NEB, Ipswich, Mass., USA). The 0.58 kb HhaI fragment was ligated to BamHI-linearized pGEM TB to give rise to pGEM T-ORM-HP (SEQ ID NO:61).

[0283] pGEM T-ORM-HP (SEQ ID NO:61) was digested with NotI. A 0.56 kb fragment was gel-purified. The gel-purified product was ligated using T4 ligase and thereby cloned in the sense orientation behind the Kti promoter of soybean expression vector KS126 (PCT Publication No. WO 04/071467) that had previously been linearized with the restriction enzyme NotI to give pKS433 (SEQ ID NO:62).

[0284] Plasmid DNA of pKS433 can be used to generate transgenic somatic embryos or seed of soybean using hygromycin selection as described in Example 14. Composition of transgenic somatic embryos or soybean seed generated with pKS433 can be determined as described in Example 19.

[0285] The plasmid vector pKS123 is described in PCT Application No. WO 02/08269. Plasmid pKS120 (SEQ ID NO: 63) is identical to pKS123 (supra) with the exception that the HindIII fragment containing Bcon/NotI/Phas3' cassette was removed.

Generation of Transgenic Somatic Embryos:

[0286] Soybean somatic embryo tissue was co-bombarded as described below with plasmid DNA of pKS120 or pKS433.

Culture Conditions:

[0287] Soybean embryogenic suspension cultures (cv. Jack) were maintained in 35 mL liquid medium SB196 (infra) on a rotary shaker, 150 rpm, 26° C. with cool white fluorescent lights on 16:8 h day/night photoperiod at light intensity of 60-85 µE/m²/s. Cultures were subcultured every 7 days to two weeks by inoculating approximately 35 mg of tissue into 35 mL of fresh liquid SB196 (the preferred subculture interval is every 7 days).

[0288] Soybean embryogenic suspension cultures were transformed with the soybean expression plasmids by the method of particle gun bombardment (Klein et al., Nature 327:70 (1987)) using a DuPont Biolistic PDS1000/HE instrument (helium retrofit) for all transformations.

Soybean Embryogenic Suspension Culture Initiation:

[0289] Soybean cultures were initiated twice each month with 5-7 days between each initiation. Pods with immature seeds from available soybean plants 45-55 days after planting were picked, removed from their shells and placed into a sterilized magenta box. The soybean seeds were sterilized by shaking them for 15 min in a 5% Clorox solution with 1 drop of ivory soap (i.e., 95 mL of autoclaved distilled water plus 5 mL Clorox and 1 drop of soap, mixed well). Seeds were rinsed using 2 1-liter bottles of sterile distilled water and those less than 4 mm were placed on individual microscope slides. The small end of the seed was cut and the cotyledons pressed out of the seed coat. Cotyledons were transferred to plates containing SB199 medium (25-30 cotyledons per plate) for 2 weeks, then transferred to SB1 for 2-4 weeks. Plates were wrapped with fiber tape. After this time, secondary embryos were cut and placed into SB196 liquid media for 7 days.

Preparation of DNA for Bombardment:

[0290] Plasmid DNA of pKS120 or pKS433 was used for bombardment.

[0291] A 50 µL aliquot of sterile distilled water containing 1 mg of gold particles was added to 5 µL of a 1 µg/µL plasmid DNA solution, 50 µL of 2.5 M CaCl2 and 20 µL of 0.1 M spermidine. The mixture was pulsed 5 times on level 4 of a vortex shaker and spun for 5 sec in a bench microfuge. After a wash with 150 µL of 100% ethanol, the pellet was suspended by sonication in 85 µL of 100% ethanol. Five µL of DNA suspension was dispensed to each flying disk of the Biolistic PDS1000/HE instrument. Each 5 µL aliquot contained approximately 0.058 mg gold particles per bombardment (i.e., per disk).
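The per-disk gold figure follows directly from the volumes given above. A minimal sketch of the arithmetic is shown below; the variable names are illustrative, and the DNA-per-shot value is an upper bound that assumes all of the plasmid DNA precipitates onto the gold, which the text does not state.

```python
GOLD_MG = 1.0            # gold particles per preparation (from the text)
FINAL_VOLUME_UL = 85.0   # ethanol volume used to resuspend the washed pellet
ALIQUOT_UL = 5.0         # suspension dispensed per flying disk
DNA_UG = 5.0 * 1.0       # 5 uL of a 1 ug/uL plasmid DNA solution

gold_per_shot_mg = GOLD_MG * ALIQUOT_UL / FINAL_VOLUME_UL
shots_per_prep = int(FINAL_VOLUME_UL // ALIQUOT_UL)

# Upper bound only: assumes every microgram of DNA ends up on the gold particles.
dna_per_shot_ug = DNA_UG * ALIQUOT_UL / FINAL_VOLUME_UL

print(round(gold_per_shot_mg, 4), shots_per_prep, round(dna_per_shot_ug, 2))
```

The calculated value of about 0.0588 mg gold per disk agrees with the approximately 0.058 mg quoted in the text, and one preparation yields enough suspension for 17 disks.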

Tissue Preparation and Bombardment with DNA:

[0292] Approximately 100-150 mg of 7-day-old embryogenic suspension cultures were placed in an empty, sterile 60×15 mm petri dish and the dish was placed inside of an empty 150×25 mm petri dish. Tissue was bombarded 1 shot per plate with membrane rupture pressure set at 650 PSI and the chamber was evacuated to a vacuum of 27-28 inches of mercury. Tissue was placed approximately 2.5 inches from the retaining/stopping screen.

Selection of Transformed Embryos:

[0293] Transformed embryos were selected using hygromycin as the selectable marker. Specifically, following bombardment, the tissue was placed into fresh SB196 media and cultured as described above. Six to eight days post-bombardment, the SB196 was exchanged with fresh SB196 containing 30 mg/L hygromycin. The selection media was refreshed weekly. Four to six weeks post-selection, green, transformed tissue was observed growing from untransformed, necrotic embryogenic clusters. Isolated, green tissue was removed and inoculated into multi-well plates to generate new, clonally propagated, transformed embryogenic suspension cultures.

Embryo Maturation:

[0294] Transformed embryogenic clusters were cultured for one to three weeks at 26° C. in SB196 under cool white fluorescent (Phillips cool white Econowatt F40/CW/RS/EW) and Agro (Phillips F40 Agro) bulbs (40 watt) on a 16:8 hr photoperiod with light intensity of 90-120 µE/m²/s. After this time, embryo clusters were removed to a solid agar media, SB166, for 1 week. They were then subcultured to medium SB103 for 3 weeks. Alternatively, embryo clusters were removed to SB228 (SHaM) liquid media, 35 mL in a 250 mL Erlenmeyer flask, for 2-3 weeks. Tissue cultured in SB228 was maintained on a rotary shaker, 130 rpm, 26° C. with cool white fluorescent lights on 16:8 h day/night photoperiod at light intensity of 60-85 µE/m²/s. During this period, individual embryos were removed from the clusters and screened for alterations in their fatty acid compositions as described supra.

Media Recipes:

TABLE-US-00026 [0295] SB 196 - FN Lite Liquid Proliferation Medium (per liter)
MS FeEDTA - 100x Stock 1                10 mL
MS Sulfate - 100x Stock 2               10 mL
FN Lite Halides - 100x Stock 3          10 mL
FN Lite P, B, Mo - 100x Stock 4         10 mL
B5 vitamins (1 mL/L)                    1.0 mL
2,4-D (10 mg/L final concentration)     1.0 mL
KNO3                                    2.83 gm
(NH4)2SO4                               0.463 gm
Asparagine                              1.0 gm
Sucrose (1%)                            10 gm
pH 5.8

TABLE-US-00027 FN Lite Stock Solutions
Stock Number                           1000 mL      500 mL
1  MS Fe EDTA 100x Stock
     Na2EDTA*                          3.724 g      1.862 g
     FeSO4·7H2O                        2.784 g      1.392 g
2  MS Sulfate 100x stock
     MgSO4·7H2O                        37.0 g       18.5 g
     MnSO4·H2O                         1.69 g       0.845 g
     ZnSO4·7H2O                        0.86 g       0.43 g
     CuSO4·5H2O                        0.0025 g     0.00125 g
3  FN Lite Halides 100x Stock
     CaCl2·2H2O                        30.0 g       15.0 g
     KI                                0.083 g      0.0415 g
     CoCl2·6H2O                        0.0025 g     0.00125 g
4  FN Lite P, B, Mo 100x Stock
     KH2PO4                            18.5 g       9.25 g
     H3BO3                             0.62 g       0.31 g
     Na2MoO4·2H2O                      0.025 g      0.0125 g
*Add first, dissolve in dark bottle while stirring
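The 500 mL column of the stock table is simply the per-liter recipe scaled by batch volume, and the concentration contributed to the finished medium follows from the 10 mL of each 100x stock used per liter of SB196. The sketch below illustrates both calculations for Stock 4; the helper names are hypothetical and are not part of the original protocol.

```python
# Per-liter amounts (g) for FN Lite P, B, Mo 100x Stock 4, taken from the table.
STOCK4_G_PER_L = {"KH2PO4": 18.5, "H3BO3": 0.62, "Na2MoO4·2H2O": 0.025}


def scale_recipe(recipe_g_per_l, batch_ml):
    """Scale a per-liter recipe to an arbitrary batch volume in mL."""
    return {name: grams * batch_ml / 1000.0 for name, grams in recipe_g_per_l.items()}


def final_g_per_l(stock_g_per_l, stock_ml_per_l_medium):
    """Concentration contributed to the medium by a given volume of stock per liter."""
    return stock_g_per_l * stock_ml_per_l_medium / 1000.0


half_batch = scale_recipe(STOCK4_G_PER_L, 500)
kh2po4_in_medium = final_g_per_l(STOCK4_G_PER_L["KH2PO4"], 10)

print(half_batch, kh2po4_in_medium)
```

The same helper can be used to check any row of the table, since each 500 mL entry should be exactly half of the corresponding 1000 mL entry.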

SB1 Solid Medium

Per Liter

[0296] 1 package MS salts (Gibco/BRL--Cat. No. 11117-066)

[0297] 1 mL B5 vitamins 1000.times. stock

[0298] 31.5 g Glucose

[0299] 2 mL 2,4-D (20 mg/L final concentration)

[0300] pH 5.7

[0301] 8 g TC agar

SB199 Solid Medium

Per Liter

[0302] 1 package MS salts (Gibco/BRL--Cat. No. 11117-066)

[0303] 1 mL B5 vitamins 1000.times. stock

[0304] 30 g Sucrose

[0305] 4 ml 2,4-D (40 mg/L final concentration)

[0306] pH 7.0

[0307] 2 gm Gelrite

SB 166 Solid Medium

Per Liter

[0308] 1 package MS salts (Gibco/BRL--Cat. No. 11117-066)

[0309] 1 mL B5 vitamins 1000.times. stock

[0310] 60 g maltose

[0311] 750 mg MgCl.sub.2 hexahydrate

[0312] 5 g Activated charcoal

[0313] pH 5.7

[0314] 2 g Gelrite

SB 103 Solid Medium

Per Liter

[0315] 1 package MS salts (Gibco/BRL--Cat. No. 11117-066)

[0316] 1 mL B5 vitamins 1000.times. stock

[0317] 60 g maltose

[0318] 750 mg MgCl2 hexahydrate

[0319] pH 5.7

[0320] 2 g Gelrite

SB 71-4 Solid Medium

Per Liter

[0321] 1 bottle Gamborg's B5 salts w/sucrose (Gibco/BRL--Cat. No. 21153-036)

[0322] pH 5.7

[0323] g TC agar

2,4-D Stock

[0324] Obtain premade from Phytotech Cat. No. D 295--concentration 1 mg/mL

B5 Vitamins Stock

Per 100 mL

[0325] Store aliquots at -20° C.

[0326] 10 g Myo-inositol

[0327] 100 mg Nicotinic acid

[0328] 100 mg Pyridoxine HCl

[0329] 1 g Thiamine

If the solution does not dissolve quickly enough, apply a low level of heat via the hot stir plate.

TABLE-US-00028 SB 228 - Soybean Histodifferentiation & Maturation (SHaM) (per liter)
DDI H2O                                 600 ml
FN-Lite Macro Salts for SHaM 10X        100 ml
MS Micro Salts 1000x                    1 ml
MS FeEDTA 100x                          10 ml
CaCl 100x                               6.82 ml
B5 Vitamins 1000x                       1 ml
L-Methionine                            0.149 g
Sucrose                                 30 g
Sorbitol                                30 g
Adjust volume to 900 mL
pH 5.8
Autoclave
Add to cooled media (≤30 C):
*Glutamine (Final conc. 30 mM) 4%       110 mL
*Note: Final volume will be 1010 mL after glutamine addition.

Because glutamine degrades relatively rapidly, it may be preferable to add immediately prior to using media. Expiration 2 weeks after glutamine is added; base media can be kept longer w/o glutamine.
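The stated final glutamine concentration of 30 mM can be checked from the recipe. The following sketch assumes a molar mass of 146.15 g/mol for L-glutamine, which is a standard reference value and is not given in the text; the volumes and the 4% (40 g/L) stock concentration are taken from the SB 228 recipe and the glutamine stock recipe.

```python
GLN_MW = 146.15          # g/mol, molar mass of L-glutamine (reference value, not from the text)
STOCK_G_PER_L = 40.0     # 4% glutamine stock
ADDED_ML = 110.0         # stock added to the cooled medium
BASE_ML = 900.0          # medium volume before glutamine addition

final_volume_ml = BASE_ML + ADDED_ML
stock_mm = STOCK_G_PER_L / GLN_MW * 1000.0
final_mm = stock_mm * ADDED_ML / final_volume_ml

print(final_volume_ml, round(final_mm, 1))
```

Under this assumption the dilution gives a final concentration just under 30 mM in a final volume of 1010 mL, consistent with the values quoted in the recipe.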

TABLE-US-00029 FN-lite Macro for SHAM 10X - Stock #1 (per liter)
(NH4)2SO4 (Ammonium Sulfate)                    4.63 g
KNO3 (Potassium Nitrate)                        28.3 g
MgSO4·7H2O (Magnesium Sulfate Heptahydrate)     3.7 g
KH2PO4 (Potassium Phosphate, Monobasic)         1.85 g
Bring to volume
Autoclave

TABLE-US-00030 MS Micro 1000X - Stock #2 (per 1 liter)
H3BO3 (Boric Acid)                              6.2 g
MnSO4·H2O (Manganese Sulfate Monohydrate)       16.9 g
ZnSO4·7H2O (Zinc Sulfate Heptahydrate)          8.6 g
Na2MoO4·2H2O (Sodium Molybdate Dihydrate)       0.25 g
CuSO4·5H2O (Copper Sulfate Pentahydrate)        0.025 g
CoCl2·6H2O (Cobalt Chloride Hexahydrate)        0.025 g
KI (Potassium Iodide)                           0.8300 g
Bring to volume
Autoclave

TABLE-US-00031 FeEDTA 100X - Stock #3 (per liter)
Na2EDTA* (Sodium EDTA)                          3.73 g
FeSO4·7H2O (Iron Sulfate Heptahydrate)          2.78 g
*EDTA must be completely dissolved before adding iron.

Bring to Volume

[0330] Solution is photosensitive. Bottle(s) should be wrapped in foil to exclude light.

Autoclave

TABLE-US-00032 [0331] Ca 100X - Stock #4 (per liter)
CaCl2·2H2O (Calcium Chloride Dihydrate)         44 g
Bring to Volume
Autoclave

TABLE-US-00033 B5 Vitamin 1000X - Stock #5 (per liter)
Thiamine·HCl                                    10 g
Nicotinic Acid                                  1 g
Pyridoxine·HCl                                  1 g
Myo-Inositol                                    100 g
Bring to Volume
Store frozen

TABLE-US-00034 4% Glutamine - Stock #6 (per liter)
DDI water heated to 30° C.                      900 ml
L-Glutamine                                     40 g
Gradually add while stirring and applying low heat. Do not exceed 35° C.
Bring to Volume
Filter Sterilize
Store frozen*
*Note: Warm thawed stock in 31° C. bath to fully dissolve crystals.

Oil Analysis:

[0332] Oil content of somatic embryos is measured using NMR. Briefly, lyophilized soybean somatic embryo tissue is pulverized in a GenoGrinder vial as described previously (Example 2). 20-200 mg of tissue powder is transferred to NMR tubes. Oil content of the somatic embryo tissue powder is calculated from the NMR signal as described in Example 2.

Example 24

Compositional Analysis of Arabidopsis Events Transformed with DNA Constructs for Seed-Preferred Silencing of ORM Genes

[0333] This example describes the seed composition of transgenic events generated with pKR1482-ORM (SEQ ID NO:24). It demonstrates that transformation with DNA constructs for silencing of ORM genes leads to increased oil content that is accompanied by a reduction in seed storage protein and soluble carbohydrate content.

T4 seed of event K42335 described in Table 13 of Example 5 and T3 seed of events K47021 and K47018 described in Table 15 of Example 5 were used to create three bulk seed samples. Three bulk seed samples of WT control plants grown alongside the T4 and T3 plants described in Tables 13 and 15 of Example 5 were also generated. Oil content of the six seed samples was measured by NMR as described in Example 2. The seed samples were subjected to compositional analysis of protein and soluble carbohydrate content of triplicate samples as described in Example 2. The results of this analysis are summarized in Table 24.

TABLE-US-00035 TABLE 24
Seed composition of Arabidopsis events transformed with DNA constructs for silencing of ORM genes

Genotype      Event ID        Oil (%, NMR)   Protein %   fructose (µg mg⁻¹ seed)   glucose (µg mg⁻¹ seed)
pKR1482-ORM   K42335/K44650   44.3           16.7        0.2                       3.3
WT                            42.1           18.0        0.3                       4.3
Δ TG/WT %                     5.2            -7.2        -29.7                     -23.2

Genotype      Bar code ID     sucrose (µg mg⁻¹ seed)   raffinose (µg mg⁻¹ seed)   stachyose (µg mg⁻¹ seed)   total soluble CHO (µg mg⁻¹ seed)
pKR1482-ORM   K42335/K44650   11.8                     0.1                        0.6                        16.6
WT                            15.9                     0.3                        0.2                        21.3
Δ TG/WT %                     -25.9                    -57.2                      167.9                      -21.9

Genotype      Event ID        Oil (%, NMR)   Protein %   fructose (µg mg⁻¹ seed)   glucose (µg mg⁻¹ seed)
pKR1482-ORM   K47021          44.9           16.7        0.3                       3.5
WT                            42.5           17.9        0.2                       4.0
Δ TG/WT %                     5.6            -6.7        16.1                      -12.5

Genotype      Event ID        sucrose (µg mg⁻¹ seed)   raffinose (µg mg⁻¹ seed)   stachyose (µg mg⁻¹ seed)   total soluble CHO (µg mg⁻¹ seed)
pKR1482-ORM   K47021          14.6                     0.3                        0.3                        19.2
WT                            15.9                     0.4                        0.8                        21.6
Δ TG/WT %                     -8.3                     -22.1                      -65.8                      -10.9

Genotype      Event ID        Oil (%, NMR)   Protein %   fructose (µg mg⁻¹ seed)   glucose (µg mg⁻¹ seed)
pKR1482-ORM   K47018          44.8           15.7        0.2                       2.7
WT                            42.6           17.7        0.3                       4.3
Δ TG/WT %                     5.2            -11.1       -16.6                     -37.0

Genotype      Event ID        sucrose (µg mg⁻¹ seed)   raffinose (µg mg⁻¹ seed)   stachyose (µg mg⁻¹ seed)   total soluble CHO (µg mg⁻¹ seed)
pKR1482-ORM   K47018          15.2                     0.3                        0.8                        19.5
WT                            16.1                     0.4                        1.2                        22.5
Δ TG/WT %                     -5.7                     -13.4                      -32.2                      -13.1

Table 24 demonstrates that the oil increase associated with the presence of the pKR1482-ORM transgene (SEQ ID NO:24) is accompanied by a reduction in seed protein content and a reduction in soluble carbohydrate content. The latter was calculated by summing the content of pinitol, sorbitol, fructose, glucose, myo-inositol, sucrose, raffinose and stachyose.
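
The Δ TG/WT % rows of Table 24 appear to be relative differences between the transgenic and WT values. The following Python sketch illustrates that calculation under this assumption; the function name `percent_change` is illustrative and does not come from the source. It uses the rounded oil and protein values reported for event K42335/K44650.

```python
def percent_change(transgenic: float, wild_type: float) -> float:
    """Relative difference of a transgenic value versus WT, in percent.

    Assumed definition of the "Δ TG/WT %" rows: (TG - WT) / WT * 100.
    """
    return (transgenic - wild_type) / wild_type * 100.0


# Rounded values for event K42335/K44650 as listed in Table 24.
oil_change = percent_change(44.3, 42.1)
protein_change = percent_change(16.7, 18.0)

print(f"oil: {oil_change:+.1f} %")
print(f"protein: {protein_change:+.1f} %")
```

Because the sugar values in the table are rounded to one decimal place, recomputing their differences from the tabulated numbers will generally not reproduce the reported percentages, which were presumably derived from unrounded measurements.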

Example 25

Compositional Analysis of Arabidopsis Events Transformed with DNA Constructs for Seed-Preferred Over-Expression of ORM Genes

[0334] This example describes the seed composition of transgenic events generated with pKR1478-ORM (SEQ ID NO:14). It demonstrates that transformation with DNA constructs for seed-preferred overexpression of ORM genes leads to decreased oil content that is accompanied by increased seed storage protein and a small decrease in soluble carbohydrate content.

[0335] T4 seed of event K42334 described in Table 10 of Example 4 was used to create two bulk seed samples. Bulk seed samples of WT control plants grown alongside the T3 plants described in Table 10 of Example 4 were also generated. Oil content of the four seed samples was measured by NMR as described in Example 2. The seed samples were subjected to compositional analysis of protein and soluble carbohydrate content of triplicate samples as described in Example 2. The results of this analysis are summarized in Table 25.

TABLE-US-00036 TABLE 25 Seed composition of Arabidopsis events transformed with DNA constructs for seed-preferred overexpression of ORM genes. Fructose, glucose, sucrose, raffinose, stachyose and total soluble CHO are given in µg mg^-1 seed.

Genotype     Oil (%, NMR)  Protein %  fructose  glucose  sucrose  raffinose  stachyose  total soluble CHO

Event ID K42334/K44548
pKR1478-ORM  39.5          19.3       0.2       4.9      12.8     0.4        1.6        20.1
WT           42.3          17.2       0.3       3.4      16.4     0.4        1.6        22.4
Δ TG/WT %    -6.6          12.5       -11.9     41.1     -22.3    -5.1       0.0        -10.2

Event ID K42334/K44541
pKR1478-ORM  37.0          19.8       0.3       6.2      13.1     0.4        2.1        22.6
WT           42.2          17.8       0.3       3.7      16.6     0.4        1.8        23.2
Δ TG/WT %    -12.3         11.1       11.5      65.9     -21.2    0.5        17.4       -2.6

Table 25 shows that the oil reduction associated with seed-specific over-expression of ORM genes such as At5g17280 is accompanied by an increase in seed storage protein and a small decrease in soluble carbohydrate content of the seed.

Example 25

Characterization of Arabidopsis Events Transformed with a DNA Construct that Contains an Intron-Less Inverted Repeat Construct Derived from Sequences of the At5g17280 (ORM) Gene

[0336] A plasmid vector lo127 for generation of transgenic arabidopsis events that show seed specific down-regulation of the ORM gene corresponding to At5g17280 was constructed.

[0337] Briefly, plasmid DNA isolated from a pooled Arabidopsis cDNA library was used in two PCR reactions with either primers SA311 (SEQ ID NO:71) and SA312 (SEQ ID NO:72) or SA313 (SEQ ID NO:73) and SA314 (SEQ ID NO:74). A PCR product of 0.208 kb was generated with SA311 (SEQ ID NO:71) and SA312 (SEQ ID NO:72). It was gel purified and is henceforth known as product C. A PCR product of 0.183 kb was generated with SA313 (SEQ ID NO:73) and SA314 (SEQ ID NO:74). It was gel purified and is henceforth known as product D. In a similar fashion a PCR product of 0.208 kb was generated with SA316 (SEQ ID NO:75) and SA315 (SEQ ID NO:76). It was gel purified and is henceforth known as product E. PCR products C, D and E were cloned into pGEM T easy according to the manufacturer's instructions, which generated plasmids pGEM T easy C (SEQ ID NO:77), pGEM T easy D (SEQ ID NO:78) and pGEM T easy E (SEQ ID NO:79). A restriction fragment of 215 bp was excised from pGEM T easy C with NotI and BamHI and cloned into pBluescript SK+ (Stratagene, USA). The resulting plasmid pBluescript-C (SEQ ID NO:80) was linearized with BamHI and PstI and ligated to a 193 bp fragment excised from pGEM T easy D with BamHI and PstI. The resulting plasmid pBluescript-CD (SEQ ID NO:81) was linearized with PstI and EcoRI and ligated to a 218 bp fragment excised from pGEM T easy E with PstI and EcoRI to give pBluescript-CDE (SEQ ID NO:82). A fragment of 619 bp was excised from pBluescript-CDE with NotI and ligated to NotI-linearized KS442 (SEQ ID NO:83) to give KS442-CDE (SEQ ID NO:84).

[0338] Prior to this, KS442 was constructed as follows. KS121 (PCT Application No. WO 02/00904) was digested with BamHI and XmnI and ligated to a fragment comprising the soybean GYI promoter. The GYI promoter was obtained from KS349 (US 20080295204 A1, published Nov. 27, 2008). Briefly, KS349 was digested with NcoI, and overhangs were filled in with Klenow DNA polymerase (NEB, USA) according to the manufacturer's instructions. The linearized KS349 plasmid was then digested with BamHI, thus releasing the GYI promoter used for construction of KS442.

[0339] KS442-CDE was digested with AscI and a DNA fragment of 1.558 kb was ligated to AscI-linearized pKR92 (SEQ ID NO:8) to give lo127 (SEQ ID NO:85).

[0340] Plasmid DNA of lo127 was used for Agrobacterium-mediated transformation of Arabidopsis as described in Example 4. A total of 54 events were generated with lo127. T1 plants of these events were grown to maturity alongside WT control plants. Seed was harvested and oil content was measured by NMR as described in Example 2. The results of this analysis are summarized in Table 26.

TABLE-US-00037 TABLE 26 Seed oil content of T1 plants generated with binary vector lo127 for seed-specific silencing of At5g17280

construct/genotype  event ID  % oil  oil content % of WT avg  avg oil content % of WT
ARALO 127  K61385  42.0  116.5
ARALO 127  K61388  41.0  113.7
ARALO 127  K61386  40.6  112.6
ARALO 127  K61389  40.2  111.5
ARALO 127  K61377  40.1  111.2
ARALO 127  K61375  40.0  110.9
ARALO 127  K61379  39.6  109.8
ARALO 127  K61378  39.5  109.5
ARALO 127  K61383  39.3  109.0
ARALO 127  K61367  39.0  108.2
ARALO 127  K61371  38.9  107.9
ARALO 127  K61372  38.8  107.6
ARALO 127  K61394  38.5  106.8
ARALO 127  K61382  38.4  106.5
ARALO 127  K61393  38.2  105.9
ARALO 127  K61391  38.2  105.9
ARALO 127  K61387  38.1  105.7
ARALO 127  K61373  37.9  105.1
ARALO 127  K61381  37.4  103.7
ARALO 127  K61368  37.2  103.2
ARALO 127  K61374  37.2  103.2
ARALO 127  K61392  37.2  103.2
ARALO 127  K61380  37.1  102.9
ARALO 127  K61370  36.6  101.5
ARALO 127  K61384  36.5  101.2
ARALO 127  K61369  35.3  97.9
ARALO 127  K61376  34.8  96.5
ARALO 127  K61390  34.8  96.5  106.2
col                37.2
col                36.9
col                36.8
col                35.5
col                33.9
WT avg             36.06

ARALO 127  K61403  41.0  118.2
ARALO 127  K61406  39.7  114.4
ARALO 127  K61425  39.4  113.5
ARALO 127  K61405  39.2  113.0
ARALO 127  K61401  39.2  113.0
ARALO 127  K61408  39.1  112.7
ARALO 127  K61416  38.9  112.1
ARALO 127  K61415  38.9  112.1
ARALO 127  K61404  38.5  111.0
ARALO 127  K61420  38.4  110.7
ARALO 127  K61414  38.2  110.1
ARALO 127  K61407  37.8  108.9
ARALO 127  K61402  37.8  108.9
ARALO 127  K61400  37.7  108.6
ARALO 127  K61424  37.4  107.8
ARALO 127  K61421  37.3  107.5
ARALO 127  K61417  37.3  107.5
ARALO 127  K61419  37.2  107.2
ARALO 127  K61411  37.2  107.2
ARALO 127  K61426  36.5  105.2
ARALO 127  K61409  36.3  104.6
ARALO 127  K61413  35.8  103.2
ARALO 127  K61418  35.7  102.9
ARALO 127  K61422  35.5  102.3
ARALO 127  K61410  35.4  102.0
ARALO 127  K61412  35.3  101.7  108.7
col                36.7
col                36.5
col                34.2
col                31.4
WT avg             34.7
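
The "oil content % of WT avg" column of Table 26 can be reproduced by dividing each event's oil content by the mean oil content of the WT (col) plants grown alongside it. The Python sketch below illustrates this for the first group of Table 26. It is a reading of the table rather than a description of the applicants' actual workflow, and the helper name `percent_of_wt` is hypothetical.

```python
from statistics import mean

# % oil of the WT (col) control plants in the first group of Table 26.
wt_oil = [37.2, 36.9, 36.8, 35.5, 33.9]
wt_avg = mean(wt_oil)


def percent_of_wt(oil: float, wt_average: float) -> float:
    """Express an oil content as a percentage of the WT average."""
    return oil / wt_average * 100.0


# A few events from the same group of Table 26 (event ID -> % oil).
events = {"K61385": 42.0, "K61388": 41.0, "K61390": 34.8}

print(f"WT avg: {wt_avg:.2f}")
for event_id, oil in events.items():
    print(f"{event_id}: {percent_of_wt(oil, wt_avg):.1f} % of WT avg")
```

Normalizing against controls grown in the same flat, rather than against a global WT value, is what allows the two groups in Table 26 to be compared even though their WT averages differ.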

[0341] T2 seed of events K61385, K61388, K61386 and K61403 were germinated on selective plant growth media containing kanamycin, planted in soil alongside WT plants and grown to maturity. T3 seed oil content was measured by NMR. The results of this analysis are summarized in Table 27.

TABLE-US-00038 TABLE 27 Seed oil content of T2 plants generated with binary vector lo127 for seed-preferred silencing of At5g17280

event ID/genotype  Line ID  % oil  oil content % of WT avg  avg oil content % of WT
K61385  K62439  42.7  109.5
K61385  K62454  42.3  108.5
K61385  K62447  41.9  107.4
K61385  K63000  41.9  107.4
K61385  K63001  41.9  107.4
K61385  K62441  41.8  107.2
K61385  K62453  41.4  106.2
K61385  K62444  41.1  105.4
K61385  K62440  40.9  104.9
K61385  K62452  40.7  104.4
K61385  K62450  40.5  103.8
K61385  K62442  40.5  103.8
K61385  K62445  40.5  103.8
K61385  K62456  39.7  101.8
K61385  K62443  39.7  101.8
K61385  K62448  38.5  98.7
K61385  K62446  38.0  97.4
K61385  K62455  37.8  96.9
K61385  K62451  37.5  96.2
K61385  K62449  37.2  95.4  103.4
col             42.5
col             41.5
col             40.8
col             40.0
col             39.9
col             39.8
col             39.0
col             37.6
col             36.3
col             36.0
col             35.6
WT avg          39

K61388  K62406  42.6  107.4
K61388  K62414  42.5  107.2
K61388  K62410  42.4  106.9
K61388  K62411  42.2  106.4
K61388  K62419  42.2  106.4
K61388  K62413  42.0  105.9
K61388  K62415  41.7  105.1
K61388  K62408  41.3  104.1
K61388  K62412  41.3  104.1
K61388  K62422  41.2  103.9
K61388  K62424  41.1  103.6
K61388  K62404  41.1  103.6
K61388  K62425  41.1  103.6
K61388  K62417  40.9  103.1
K61388  K62409  40.8  102.9
K61388  K62423  40.7  102.6
K61388  K62421  40.5  102.1
K61388  K62416  40.0  100.8
K61388  K62426  39.9  100.6
K61388  K62418  39.8  100.3
K61388  K62427  38.3  96.6
K61388  K62407  38.0  95.8
K61388  K62420  37.3  94.0
K61388  K62405  36.4  91.8  102.5
col             41.2
col             41.2
col             41.0
col             40.9
col             40.6
col             39.4
col             38.9
col             38.7
col             38.7
col             38.5
col             37.2
WT avg          39.7

K61386  K63580  45.2  110.9
K61386  K63587  45.1  110.6
K61386  K63577  44.8  109.9
K61386  K63575  44.8  109.9
K61386  K63589  44.3  108.6
K61386  K63585  43.7  107.2
K61386  K63578  43.2  105.9
K61386  K62744  43.2  105.9
K61386  K63583  43.2  105.9
K61386  K63576  43.1  105.7
K61386  K63592  43.1  105.7
K61386  K63579  43.0  105.5
K61386  K63593  42.9  105.2
K61386  K63591  42.7  104.7
K61386  K63584  41.6  102.0
K61386  K63586  41.6  102.0
K61386  K63574  41.5  101.8
K61386  K63590  41.2  101.0
K61386  K63581  40.7  99.8
K61386  K63582  40.1  98.3
K61386  K63588  39.4  96.6
K61386  K63595  37.4  91.7
K61386  K63596  37.3  91.5
K61386  K63594  36.9  90.5  103.2
col     K63601  44.6
col     K63600  43.0
col     K63598  42.4
col     K63599  41.1
col     K63604  41.1
col     K63606  41.0
col     K63605  40.9
col     K63608  40.3
col     K63597  39.9
col     K63607  39.4
col     K63602  38.9
col     K63603  36.7
WT avg          40.8

K61403  K62316  43.1  111.5
K61403  K62308  43.0  111.3
K61403  K62321  43.0  111.3
K61403  K62315  42.1  109.0
K61403  K62306  41.8  108.2
K61403  K62318  41.4  107.1
K61403  K62312  41.4  107.1
K61403  K62324  41.3  106.9
K61403  K62305  41.0  106.1
K61403  K62323  40.7  105.3
K61403  K62313  40.3  104.3
K61403  K62310  40.0  103.5
K61403  K62314  39.6  102.5
K61403  K62307  39.6  102.5
K61403  K62322  38.8  100.4
K61403  K62317  37.4  96.8
K61403  K62309  37.1  96.0
K61403  K62320  37.0  95.8
K61403  K62319  36.7  95.0
K61403  K62311  28.7  74.3  102.7
col             41.6
col             40.7
col             40.4
col             40.0
col             38.6
col             38.3
col             35.8
col             33.7
WT avg          38.6

Tables 26 and 27 show that silencing of ORM genes such as At5g17280 using hairpin constructs that contain an intron-less inverted repeat leads to a heritable oil increase. In T3 lines that still segregate for the lo127-derived T-DNA insertion, the average oil content was 2.5-3.4% higher than that of WT control plants.

Example 25

Seed-Preferred Silencing of ORM Genes in Soybean Using Artificial miRNAs

[0342] The example describes the construction of a plasmid vector for soybean transformation. The plasmid provides seed-preferred expression of two artificial microRNAs that target soybean ORM genes Glyma02g05870 and Glyma16g24560, respectively.

[0343] Vectors were made to silence ORM genes using an artificial microRNA largely as described in U.S. patent application Ser. No. 12/335,717, filed Dec. 16, 2008. The following briefly explains the procedure.

Design of Artificial MicroRNA Sequences

[0344] Artificial microRNAs (amiRNAs) that would have the ability to silence the desired target genes were designed largely according to rules described in Schwab R, et al. (2005) Dev Cell 8: 517-27. To summarize, microRNA sequences are 21 nucleotides in length, start at their 5'-end with a "U", display 5' instability relative to their star sequence, which is achieved by including a C or G at position 19, and have either an "A" or a "U" as their 10th nucleotide. An additional requirement for artificial microRNA design was that the amiRNA have a high free delta-G as calculated using the ZipFold algorithm (Markham, N. R. & Zuker, M. (2005) Nucleic Acids Res. 33: W577-W581). The DNA sequence corresponding to the amiRNA (OX16) that was used to silence Glyma16g24560 is set forth in SEQ ID NO:86. The DNA sequence corresponding to the amiRNA (OX2) that was used to silence the Glyma02g05870 gene is set forth in SEQ ID NO:87.
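
The positional design rules summarized above can be expressed as a simple sequence check. The following Python sketch is illustrative only: the function name `check_amirna` and the example sequences are hypothetical (they are not SEQ ID NO:86 or 87), and the free-energy criterion is not covered because it requires an external folding tool.

```python
def check_amirna(seq: str) -> list[str]:
    """Return the positional design rules that a candidate amiRNA violates.

    Rules checked (as summarized in the text): 21 nt long, "U" at the
    5' end, "A" or "U" at position 10, and "C" or "G" at position 19.
    DNA input is accepted; "T" is read as "U".
    """
    rna = seq.upper().replace("T", "U")
    problems = []
    if len(rna) != 21:
        # Positional rules are meaningless for other lengths, so stop here.
        problems.append(f"length is {len(rna)}, expected 21")
        return problems
    if rna[0] != "U":
        problems.append("position 1 is not U")
    if rna[9] not in "AU":
        problems.append("position 10 is not A or U")
    if rna[18] not in "CG":
        problems.append("position 19 is not C or G")
    return problems


# Hypothetical candidates used only to exercise the checks.
good = "UAGCAUGCAAGCUAGCUACGA"
bad = "AAGCAUGCAGGCUAGCUAAGA"

print(check_amirna(good))
print(check_amirna(bad))
```

A candidate that passes these checks would still need to be evaluated for target specificity and folding energy before use.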

Design of Artificial Star Sequences

[0345] "Star sequences" are those that base pair with the amiRNA sequences, in the precursor RNA, to form imperfect stem structures. To form a perfect stem structure the star sequence would be the exact reverse complement of the amiRNA. The soybean precursor sequence as described in "Novel and nodulation-regulated microRNAs in soybean roots" Subramanian S, Fu Y, Sunkar R, Barbazuk W B, Zhu J K, Yu O BMC Genomics. 9:160 (2008) and accessed on mirBase ("Conservation and divergence of microRNA families in plants" Dezulian T, Palatnik J F, Huson D H, Weigel D (2005) Genome Biology 6:P13) was folded using mfold (M. Zuker (2003) Nucleic Acids Res. 31: 3406-15; and D. H. Mathews, J. et al. (1999) J. Mol. Biol. 288: 911-940). The miRNA sequence was then replaced with the amiRNA sequence and the endogenous star sequence was replaced with the exact reverse complement of the amiRNA. Changes in the artificial star sequence were introduced so that the structure of the stem would remain the same as the endogenous structure. The altered sequence was then folded with mfold and the original and altered structures were compared by eye. If necessary, further alterations to the artificial star sequence were introduced to maintain the original structure. The first amiRNA star sequence (OX16 star) that was used to silence Glyma16g24560 is set forth as SEQ ID NO:88. The second amiRNA star sequence (OX2 star) that was used to silence Glyma02g05870 is set forth as SEQ ID NO:89.
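
The starting point of the star-sequence design, the exact reverse complement of the amiRNA, can be sketched in a few lines of Python. The function name `perfect_star` and the example sequence are hypothetical. The subsequent manual step of introducing mismatches so that the stem matches the endogenous structure is not modeled here.

```python
# Watson-Crick complement table for RNA.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def perfect_star(amirna: str) -> str:
    """Return the exact reverse complement of an amiRNA as RNA.

    This is only the perfectly paired starting point; per the text, the
    final star sequence is then altered to preserve the endogenous stem.
    DNA input is accepted; "T" is read as "U".
    """
    rna = amirna.upper().replace("T", "U")
    return rna.translate(_COMPLEMENT)[::-1]


# Hypothetical 21-nt amiRNA, not one of the sequences in the listing.
candidate = "UAGCAUGCAAGCUAGCUACGA"
star = perfect_star(candidate)
print(star)
```

Applying the function twice returns the original sequence, which is a quick sanity check that the complement table and the reversal are both correct.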

Conversion of Genomic MicroRNA Precursors to Artificial MicroRNA Precursors

[0346] Genomic miRNA precursor genes as described in US Patent Publication No. 2009-0155910A1, published Jun. 18, 2009 can be converted to amiRNAs using overlapping PCR and the resulting DNAs are completely sequenced. These DNAs are then cloned downstream of an appropriate promoter in a vector capable of soybean transformation.

[0347] Alternatively, amiRNAs can be synthesized commercially, for example by Codon Devices (Cambridge, Mass.), DNA 2.0 (Menlo Park, Calif.) or GenScript (Piscataway, N.J.). The synthesized DNA is then cloned downstream of an appropriate promoter in a vector capable of soybean transformation.

[0348] Alternatively, amiRNAs can be constructed using In-Fusion.TM. technology (Clontech, Mountain View, Calif.).

Conversion of Genomic MicroRNA Precursors to Artificial MicroRNA Precursors

[0349] Genomic miRNA precursor genes were converted to amiRNA precursors using In-Fusion.TM. as described above. In brief, the microRNA 396b precursor (SEQ ID NO: 90) was altered to include Pme I sites immediately flanking the star and microRNA sequences to form the in-fusion ready microRNA 396b precursorv3 (SEQ ID NO: 91).

[0350] The microRNA 396b precursor (SEQ ID NO:90) was used as a PCR template with the primers shown in SEQ ID NO:92 and SEQ ID NO:93. The primers are designed according to the protocol provided by Clontech (USA) and do not leave any footprint of the Pme I sites after the In-Fusion recombination reaction. The amplified sequence is recombined into the in-fusion ready microRNA 396b (SEQ ID NO:91) cloned into pCR2.1 and digested with Pme I. This was done using protocols provided with the In-Fusion.TM. kit. The resulting plasmid 396b-OX16 is shown in SEQ ID NO:94.

[0351] To construct 159-OX2, the microRNA 159 precursor (SEQ ID No: 95) was altered to include Pme I sites immediately flanking the star and microRNA sequences to form the in-fusion ready microRNA 159 precursor (SEQ ID NO: 96).

[0352] The microRNA 159 precursor (SEQ ID NO: 95) was used as a PCR template with the primers shown in SEQ ID NO:97 and SEQ ID NO:98. The primers are designed according to the protocol provided by Clontech and do not leave any footprint of the Pme I sites after the In-Fusion recombination reaction. The amplified sequence is recombined into the in-fusion ready microRNA 159 (SEQ ID NO:96) cloned into pCR2.1 and digested with Pme I. This was done using protocols provided with the In-Fusion.TM. kit. The resulting plasmid 159-OX2 is shown in Table 3 (SEQ ID NO: 99).

The 611 bp Not I-Eco RI fragment was removed from 396b-OX16 (SEQ ID NO:94) and a 965 bp Eco RI-Not I fragment was removed from 159-OX2 (SEQ ID NO:99), and both were cloned into the Not I site of KS126 (PCT Publication No. WO 04/071467) to form KS434 (SEQ ID NO:100).

Example 26

Compositional Analysis of Soybean Somatic Embryos Transformed with Constructs for RNAi- or amiRNA-Mediated Suppression of ORM Gene Expression

[0353] DNA of plasmids KS120, KS433 and KS434 was stably transformed into soybean suspension cultures and transgenic somatic embryos were generated as described in Example 23. Oil content was analyzed by NMR as described in Example 2.

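TABLE-US-00039 TABLE 30 Oil content of somatic embryos generated with plasmids KS120, KS433 and KS434

experiment name  plasmid  event id  % oil  average % oil
2698  KS120  K57206  6.6
2698  KS120  K57198  6.2
2698  KS120  K57195  5.0
2698  KS120  K57207  5.0
2698  KS120  K57201  5.0
2698  KS120  K57211  4.9
2698  KS120  K57187  4.8
2698  KS120  K57204  4.6
2698  KS120  K57189  4.3
2698  KS120  K57212  4.3
2698  KS120  K57194  4.2
2698  KS120  K57188  4.0
2698  KS120  K57193  3.9
2698  KS120  K57190  3.9
2698  KS120  K57200  3.8
2698  KS120  K57202  3.8
2698  KS120  K57191  3.7
2698  KS120  K57210  3.6
2698  KS120  K57205  3.5
2698  KS120  K57209  3.5
2698  KS120  K57208  3.4
2698  KS120  K57199  3.1
2698  KS120  K57197  3.1
2698  KS120  K57192  3.0
2698  KS120  K57203  2.6
2698  KS120  K57196  2.4  4.1

2699  KS433  K57232  10.0
2699  KS433  K57238  9.9
2699  KS433  K57236  9.8
2699  KS433  K57224  9.4
2699  KS433  K57215  8.2
2699  KS433  K57220  8.2
2699  KS433  K57225  8.1
2699  KS433  K57222  8.1
2699  KS433  K57237  7.5
2699  KS433  K57221  7.2
2699  KS433  K57233  7.0
2699  KS433  K57229  6.9
2699  KS433  K57234  6.5
2699  KS433  K57217  6.3
2699  KS433  K57213  6.1
2699  KS433  K57230  5.9
2699  KS433  K57214  5.8
2699  KS433  K57227  5.3
2699  KS433  K57226  5.3
2699  KS433  K57231  5.2
2699  KS433  K57223  4.9
2699  KS433  K57219  4.5
2699  KS433  K57235  4.1
2699  KS433  K57228  3.9
2699  KS433  K57218  2.8
2699  KS433  K57216  1.9  6.5

2700  KS434  K57239  7.6
2700  KS434  K57247  7.1
2700  KS434  K57261  6.5
2700  KS434  K57242  6.3
2700  KS434  K57243  6.0
2700  KS434  K57252  5.8
2700  KS434  K57256  5.7
2700  KS434  K57260  5.6
2700  KS434  K57264  5.5
2700  KS434  K57251  5.2
2700  KS434  K57255  5.2
2700  KS434  K57263  5.2
2700  KS434  K57245  4.7
2700  KS434  K57249  4.7
2700  KS434  K57265  4.7
2700  KS434  K57266  4.6
2700  KS434  K57246  4.6
2700  KS434  K57250  4.5
2700  KS434  K57240  4.4
2700  KS434  K57257  4.3
2700  KS434  K57248  4.1
2700  KS434  K57269  3.6
2700  KS434  K57259  3.4
2700  KS434  K57267  3.2
2700  KS434  K57254  3.1
2700  KS434  K57268  2.9
2700  KS434  K57262  2.9
2700  KS434  K57253  2.9
2700  KS434  K57258  2.6
2700  KS434  K57244  2.6
2700  KS434  K57241  2.5  4.6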

Table 30 shows that silencing of the soybean ORM genes Glyma02g05870 and Glyma16g24560 (KS433) using RNAi- or amiRNA-mediated suppression led to an increase in oil compared to the control.

Sequence CWU 1

1

121118491DNAArtificial SequencepHSbarEND2s activation tagging vector 1catgaatcaa acaaacatac acagcgactt attcacacga gctcaaatta caacggtata 60tatcctgccg tcgacaacca tggtctagac aggatccccg ggtaccgagc tcgaatttgc 120aggtcgactg cgtcatccct tacgtcagtg gagatatcac atcaatccac ttgctttgaa 180gacgtggttg gaacgtcttc tttttccacg atgctcctcg tgggtggggg tccatctttg 240ggaccactgt cggcagaggc atcttgaacg atagcctttc ctttatcgca atgatggcat 300ttgtaggtgc caccttcctt ttctactgtc cttttgatga agtgacagat agctgggcaa 360tggaatccga ggaggtttcc cgatattacc ctttgttgaa aagtctcaat tgccctttgg 420tcttctgaga ctgttgcgtc atcccttacg tcagtggaga tatcacatca atccacttgc 480tttgaagacg tggttggaac gtcttctttt tccacgatgc tcctcgtggg tgggggtcca 540tctttgggac cactgtcggc agaggcatct tgaacgatag cctttccttt atcgcaatga 600tggcatttgt aggtgccacc ttccttttct actgtccttt tgatgaagtg acagatagct 660gggcaatgga atccgaggag gtttcccgat attacccttt gttgaaaagt ctcagttaac 720ccgcgatcct gcgtcatccc ttacgtcagt ggagatatca catcaatcca cttgctttga 780agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 840gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 900tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 960atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa ttgccctttg 1020gtcttctgag actgttgcgt catcccttac gtcagtggag atatcacatc aatccacttg 1080ctttgaagac gtggttggaa cgtcttcttt ttccacgatg ctcctcgtgg gtgggggtcc 1140atctttggga ccactgtcgg cagaggcatc ttgaacgata gcctttcctt tatcgcaatg 1200atggcatttg taggtgccac cttccttttc tactgtcctt ttgatgaagt gacagatagc 1260tgggcaatgg aatccgagga ggtttcccga tattaccctt tgttgaaaag tctcagttaa 1320cccgcaattc actggccgtc gttttacaac gtcgtgactg ggaaaaccct ggcgttaccc 1380aacttaatcg ccttgcagca catccccctt tcgccagctg gcgtaatagc gaagaggccc 1440gcaccgatcg cccttcccaa cagttgcgca gcctgaatgg cgaatggatc gatccgtcga 1500tcgaccaaag cggccatcgt gcctccccac tcctgcagtt cgggggcatg gatgcgcgga 1560tagccgctgc tggtttcctg gatgccgacg gatttgcact gccggtagaa ctccgcgagg 1620tcgtccagcc tcaggcagca gctgaaccaa ctcgcgaggg gatcgagccc 
ctgctgagcc 1680tcgacatgtt gtcgcaaaat tcgccctgga cccgcccaac gatttgtcgt cactgtcaag 1740gtttgacctg cacttcattt ggggcccaca tacaccaaaa aaatgctgca taattctcgg 1800ggcagcaagt cggttacccg gccgccgtgc tggaccgggt tgaatggtgc ccgtaacttt 1860cggtagagcg gacggccaat actcaacttc aaggaatctc acccatgcgc gccggcgggg 1920aaccggagtt cccttcagtg aacgttatta gttcgccgct cggtgtgtcg tagatactag 1980cccctggggc cttttgaaat ttgaataaga tttatgtaat cagtctttta ggtttgaccg 2040gttctgccgc tttttttaaa attggatttg taataataaa acgcaattgt ttgttattgt 2100ggcgctctat catagatgtc gctataaacc tattcagcac aatatattgt tttcatttta 2160atattgtaca tataagtagt agggtacaat cagtaaattg aacggagaat attattcata 2220aaaatacgat agtaacgggt gatatattca ttagaatgaa ccgaaaccgg cggtaaggat 2280ctgagctaca catgctcagg ttttttacaa cgtgcacaac agaattgaaa gcaaatatca 2340tgcgatcata ggcgtctcgc atatctcatt aaagcagggg gtgggcgaag aactccagca 2400tgagatcccc gcgctggagg atcatccagc cggcgtcccg gaaaacgatt ccgaagccca 2460acctttcata gaaggcggcg gtggaatcga aatctcgtga tggcaggttg ggcgtcgctt 2520ggtcggtcat ttcgaacccc agagtcccgc tcagaagaac tcgtcaagaa ggcgatagaa 2580ggcgatgcgc tgcgaatcgg gagcggcgat accgtaaagc acgaggaagc ggtcagccca 2640ttcgccgcca agctcttcag caatatcacg ggtagccaac gctatgtcct gatagcggtc 2700cgccacaccc agccggccac agtcgatgaa tccagaaaag cggccatttt ccaccatgat 2760attcggcaag caggcatcgc catgggtcac gacgagatcc tcgccgtcgg gcatgccccc 2820caattcactg gccgtcgttt tacaacgtcg tgactgggaa aaccctggcg ttacccaact 2880taatcgcctt gcagcacatc cccctttcgc cagctggcgt aatagcgaag aggcccgcac 2940cgatcgccct tcccaacagt tgcgcagcct gaatggcgaa tggcgcctga tgcggtattt 3000tctccttacg catctgtgcg gtatttcaca ccgcatatgg tgcactctca gtacaatctg 3060ctctgatgcc gcatagttaa gccagccccg acacccgcca acacccgctg acgcgccctg 3120acgggcttgt ctgctcccgg catccgctta cagacaagct gtgaccgtct ccgggagctg 3180catgtgtcag aggttttcac cgtcatcacc gaaacgcgcg agacgaaagg gcctcgtgat 3240acgcctattt ttataggtta atgtcatgat aataatggtt tcttagacgt caggtggcac 3300ttttcgggga aatgtgcgcg gaacccctat ttgtttattt ttctaaatac attcaaatat 3360gtatccgctc atgagacaat 
aaccctgata aatgcttcaa taatattgaa aaaggaagag 3420tatgagtatt caacatttcc gtgtcgccct tattcccttt tttgcggcat tttgccttcc 3480tgtttttgct cacccagaaa cgctggtgaa agtaaaagat gctgaagatc agttgggtgc 3540acgagtgggt tacatcgaac tggatctcaa cagcggtaag atccttgaga gttttcgccc 3600cgaagaacgt tttccaatga tgagcacttt taaagttctg ctatgtggcg cggtattatc 3660ccgtattgac gccgggcaag agcaactcgg tcgccgcata cactattctc agaatgactt 3720ggttgagtac tcaccagtca cagaaaagca tcttacggat ggcatgacag taagagaatt 3780atgcagtgct gccataacca tgagtgataa cactgcggcc aacttacttc tgacaacgat 3840cggaggaccg aaggagctaa ccgctttttt gcacaacatg ggggatcatg taactcgcct 3900tgatcgttgg gaaccggagc tgaatgaagc cataccaaac gacgagcgtg acaccacgat 3960gcctgtagca atggcaacaa cgttgcgcaa actattaact ggcgaactac ttactctagc 4020ttcccggcaa caattaatag actggatgga ggcggataaa gttgcaggac cacttctgcg 4080ctcggccctt ccggctggct ggtttattgc tgataaatct ggagccggtg agcgtgggtc 4140tcgcggtatc attgcagcac tggggccaga tggtaagccc tcccgtatcg tagttatcta 4200cacgacgggg agtcaggcaa ctatggatga acgaaataga cagatcgctg agataggtgc 4260ctcactgatt aagcattggt aactgtcaga ccaagtttac tcatatatac tttagattga 4320tttaaaactt catttttaat ttaaaaggat ctaggtgaag atcctttttg ataatctcat 4380gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg tcagaccccg tagaaaagat 4440caaaggatct tcttgagatc ctttttttct gcgcgtaatc tgctgcttgc aaacaaaaaa 4500accaccgcta ccagcggtgg tttgtttgcc ggatcaagag ctaccaactc tttttccgaa 4560ggtaactggc ttcagcagag cgcagatacc aaatactgtc cttctagtgt agccgtagtt 4620aggccaccac ttcaagaact ctgtagcacc gcctacatac ctcgctctgc taatcctgtt 4680accagtggct gctgccagtg gcgataagtc gtgtcttacc gggttggact caagacgata 4740gttaccggat aaggcgcagc ggtcgggctg aacggggggt tcgtgcacac agcccagctt 4800ggagcgaacg acctacaccg aactgagata cctacagcgt gagcattgag aaagcgccac 4860gcttcccgaa gggagaaagg cggacaggta tccggtaagc ggcagggtcg gaacaggaga 4920gcgcacgagg gagcttccag ggggaaacgc ctggtatctt tatagtcctg tcgggtttcg 4980ccacctctga cttgagcgtc gatttttgtg atgctcgtca ggggggcgga gcctatggaa 5040aaacgccagc aacgcggcct ttttacggtt cctggccttt tgctggcctt 
ttgctcacat 5100gttctttcct gcgttatccc ctgattctgt ggataaccgt attaccgcct ttgagtgagc 5160tgataccgct cgccgcagcc gaacgaccga gcgcagcgag tcagtgagcg aggaagcgga 5220agagcgccca atacgcaaac cgcctctccc cgcgcgttgg ccgattcatt aatgcagctg 5280gcacgacagg tttcccgact ggaaagcggg cagtgagcgc aacgcaatta atgtgagtta 5340gctcactcat taggcacccc aggctttaca ctttatgctt ccggctcgta tgttgtgtgg 5400aattgtgagc ggataacaat ttcacacagg aaacagctat gaccatgatt acgccaagct 5460ttctaggggg ggggtaccga tctgagatcg gtaacgaaaa cgaacgggta gggatgaaaa 5520cggtcggtaa cggtcggtaa aatacctcta ccgttttcat tttcatattt aacttgcggg 5580acggaaacga aaacgggata taccggtaac gaaaacgaac gggataaata cggtaatcga 5640aaaccgatac gatccggtcg ggttaaagtc gaaatcggac gggaaccggt atttttgttc 5700ggtaaaatca cacatgaaaa catatattca aaacttaaaa acaaatataa aaaattgtaa 5760acacaagtct taatgatcac tagtggcgcg cctaggagat ctcgagtagg gataacaggg 5820taatacatag ataaaatcca tataaatctg gagcacacat agtttaatgt agcacataag 5880tgataagtct tgggctcttg gctaacataa gaagccatat aagtctacta gcacacatga 5940cacaatataa agtttaaaac acatattcat aatcacttgc tcacatctgg atcacttagc 6000atgctacagc tagtgcaata ttagacactt tccaatattt ctcaaacttt tcactcattg 6060caacggccat tctcctaatg acaaattttt catgaacaca ccattggtca atcaaatcct 6120ttatctcaca gaaacctttg taaaataaat ttgcagtgga atattgagta ccagatagga 6180gttcagtgag atcaaaaaac ttcttcaaac acttaaaaag agttaatgcc atcttccact 6240cctcggcttt aggacaaatt gcatcgtacc tacaataatt gacatttgat taattgagaa 6300tttataatga tgacatgtac aacaattgag acaaacatac ctgcgaggat cacttgtttt 6360aagccgtgtt agtgcaggct tataatataa ggcatccctc aacatcaaat aggttgaatt 6420ccatctagtt gagacatcat atgagatccc tttagattta tccaagtcac attcactagc 6480acacttcatt agttcttccc actgcaaagg agaagatttt acagcaagaa caatcgcttt 6540gattttctca attgttcctg caattacagc caagccatcc tttgcaacca agttcagtat 6600gtgacaagca cacctcacat gaaagaaagc accatcacaa actagatttg aatcagtgtc 6660ctgcaaatcc tcaattatat cgtgcacagc tacttcattt gcactagcat tatccaaaga 6720caaggcaaac aattttttct caatgttcca cttaaccatg attgcagtga aggtttgtga 6780taacctttgg ccagtgtggc 
gcccttcaac atgaaaaaag ccaacaattc ttttttggag 6840acaccaatca tcatcaatcc aatggatggt gacacacatg tatgacttat tttgacaaga 6900tgtccacata tccatagttg tactgaagcg agactgaaca tcttttagtt ttccatacaa 6960cttttctttt tcttccaaat acaaatccat gatatatttt ctagcagtga cacgggactt 7020tattggaaag tgagggcgca gagacttaac aaactcaaca aagtactcat gttctacaat 7080attgaaagga tattcatgca tgattattgc caaatgaagc ttctttaggc taaccacttc 7140atcgtactta taaggctcaa tgagatttat gtctttgcca tgatcctttt cactttttag 7200acacaactga cctttaacta aactatgtga tgttctcaag tgatttcgaa atccgcttgt 7260tccatgatga ccctcagccc tatacttagc cttgcaatta ggaaagttgc aatgtcccca 7320tacctgaacg tatttctttc catcgacctc cacttcaatt tccttcttgg tgaaatgctg 7380ccatacatcc gatgtgcact tctttgccct cttctgtggt gcttcttctt cgggttcagg 7440ttgtggctgt ggttgtggtt ctggttgtgg ttgtggttgt ggttgtggtt catgaacaat 7500agccatatca tcttgactcg gatctgtagc tgtaccattt gcattactac tgcttacact 7560ctgaataaaa tgcctctcgg cctcagctgt tgatgatgat ggtgatgtgc ggccacatcc 7620atgcccacgc gcacgtgcac gtacattctg aatccgacta gaagaggctt cagcttttct 7680tttcaaccct gttataaaca gatttttcgt attattctac agtcaatatg atgcttccca 7740atctacaacc aattagtaat gctaatgcta ttgctactgt ttttctaata tataccttga 7800gcatatgcag agaatacgga atttgttttg cgagtagaag gcgctcttgt ggtagacatc 7860aacttggcca atcttatggc tgagcctgag ggaggattat ttccaaccgg aggcgtcatc 7920tgaggaatgg agtcgtagcc ggctagccga agtggagagc agagccctgg acagcaggtg 7980ttcagcaatc agcttggtgc tgtactgctg tgacttgtga gcacctggac ggctggacag 8040caatcagcag gtgttgcaga gcccctggac agcacacaaa tgacacaaca gcttggtgca 8100atggtgctga cgtgctgtac tgctaagtgc tgtgagcctg tgagcagccg tggagacagg 8160gagaccgcgg atggccggat gggcgagcgc cgagcagtgg aggtctggag gaccgctgac 8220cgcagatggc ggatggcgga tgggcggacc gcggatgggc gagcagtgga gtggaggtct 8280gggcggatgg gcggaccgcg gcgcggatgg gcgagtcgcg agcagtggag tggagggcgg 8340accgtggatg gcggcgtctg cgtccggcgt gccgcgtcac ggccgtcacc gcgtgtggtg 8400cctggtgcag cccagcggcc ggccggctgg gagacaggga gagtcggaga gagcaggcga 8460gagcgagacg cgtcgccggc gtcggcgtgc ggctggcggc gtccggactc 
cggcgtgggc 8520gcgtggcggc gtgtgaatgt gtgatgctgt tactcgtgtg gtgcctggcc gcctgggaga 8580gaggcagagc agcgttcgct aggtatttct tacatgggct gggcctcagt ggttatggat 8640gggagttgga gctggccata ttgcagtcat cccgaattag aaaatacggt aacgaaacgg 8700gatcatcccg attaaaaacg ggatcccggt gaaacggtcg ggaaactagc tctaccgttt 8760ccgtttccgt ttaccgtttt gtatatcccg tttccgttcc gttttcgttt tttacctcgg 8820gttcgaaatc gatcgggata aaactaacaa aatcggttat acgataacgg tcggtacggg 8880attttcccat cctactttca tccctgagat tattgtcgtt tctttcgcag atcggtaccc 8940cccccctaga gtcgacatcg atctagtaac atagatgaca ccgcgcgcga taatttatcc 9000tagtttgcgc gctatatttt gttttctatc gcgtattaaa tgtataattg cgggactcta 9060atcataaaaa cccatctcat aaataacgtc atgcattaca tgttaattat tacatgctta 9120acgtaattca acagaaatta tatgataatc atcgcaagac cggcaacagg attcaatctt 9180aagaaacttt attgccaaat gtttgaacga tctgcttcga cgcactcctt ctttaggtac 9240ggactagatc tcggtgacgg gcaggaccgg acggggcggt accggcaggc tgaagtccag 9300ctgccagaaa cccacgtcat gccagttccc gtgcttgaag ccggccgccc gcagcatgcc 9360gcggggggca tatccgagcg cctcgtgcat gcgcacgctc gggtcgttgg gcagcccgat 9420gacagcgacc acgctcttga agccctgtgc ctccagggac ttcagcaggt gggtgtagag 9480cgtggagccc agtcccgtcc gctggtggcg gggggagacg tacacggtcg actcggccgt 9540ccagtcgtag gcgttgcgtg ccttccaggg gcccgcgtag gcgatgccgg cgacctcgcc 9600gtccacctcg gcgacgagcc agggatagcg ctcccgcaga cggacgaggt cgtccgtcca 9660ctcctgcggt tcctgcggct cggtacggaa gttgaccgtg cttgtctcga tgtagtggtt 9720gacgatggtg cagaccgccg gcatgtccgc ctcggtggca cggcggatgt cggccgggcg 9780tcgttctggg ctcatggatc tggattgaga gtgaatatga gactctaatt ggataccgag 9840gggaatttat ggaacgtcag tggagcattt ttgacaagaa atatttgcta gctgatagtg 9900accttaggcg acttttgaac gcgcaataat ggtttctgac gtatgtgctt agctcattaa 9960actccagaaa cccgcggctg agtggctcct tcaatcgttg cggttctgtc agttccaaac 10020gtaaaacggc ttgtcccgcg tcatcggcgg gggtcataac gtgactccct taattctccg 10080ctcatgatcc ccgggtaccg agctcgaatt gcggctgagt ggctccttca atcgttgcgg 10140ttctgtcagt tccaaacgta aaacggcttg tcccgcgtca tcggcggggg tcataacgtg 10200actcccttaa 
ttctccgctc atgatcttga tcccctgcgc catcagatcc ttggcggcaa 10260gaaagccatc cagtttactt tgcagggctt cccaacctta ccagagggcg ccccagctgg 10320caattccggt tcgcttgctg tatcgatatg gtggatttat cacaaatggg acccgccgcc 10380gacagaggtg tgatgttagg ccaggacttt gaaaatttgc gcaactatcg tatagtggcc 10440gacaaattga cgccgagttg acagactgcc tagcatttga gtgaattatg tgaggtaatg 10500ggctacactg aattggtagc tcaaactgtc agtatttatg tatatgagtg tatattttcg 10560cataatctca gaccaatctg aagatgaaat gggtatctgg gaatggcgaa atcaaggcat 10620cgatcgtgaa gtttctcatc taagccccca tttggacgtg aatgtagaca cgtcgaaata 10680aagatttccg aattagaata atttgtttat tgctttcgcc tataaatacg acggatcgta 10740atttgtcgtt ttatcaaaat gtactttcat tttataataa cgctgcggac atctacattt 10800ttgaattgaa aaaaaattgg taattactct ttctttttct ccatattgac catcatactc 10860attgctgatc catgtagatt tcccggacat gaagccattt acaattgaat atatcctgcc 10920gccgctgccg ctttgcaccc ggtggagctt gcatgttggt ttctacgcag aactgagccg 10980gttaggcaga taatttccat tgagaactga gccatgtgca ccttcccccc aacacggtga 11040gcgacggggc aacggagtga tccacatggg acttttaaac atcatccgtc ggatggcgtt 11100gcgagagaag cagtcgatcc gtgagatcag ccgacgcacc gggcaggcgc gcaacacgat 11160cgcaaagtat ttgaacgcag gtacaatcga gccgacgttc accgtcaccc tggatgctgt 11220aggcataggc ttggttatgc cggtactgcc gggcctcttg cgggatatcg tccattccga 11280cagcatcgcc agtcactatg gcgtgctgct agcgctatat gcgttgatgc aatttctatg 11340cgcacccgtt ctcggagcac tgtccgaccg ctttggccgc cgcccagtcc tgctcgcttc 11400gctacttgga gccactatcg actacgcgat catggcgacc acacccgtcc tgtggtccaa 11460cccctccgct gctatagtgc agtcggcttc tgacgttcag tgcagccgtc ttctgaaaac 11520gacatgtcgc acaagtccta agttacgcga caggctgccg ccctgccctt ttcctggcgt 11580tttcttgtcg cgtgttttag tcgcataaag tagaatactt gcgactagaa ccggagacat 11640tacgccatga acaagagcgc cgccgctggc ctgctgggct atgcccgcgt cagcaccgac 11700gaccaggact tgaccaacca acgggccgaa ctgcacgcgg ccggctgcac caagctgttt 11760tccgagaaga tcaccggcac caggcgcgac cgcccggagc tggccaggat gcttgaccac 11820ctacgccctg gcgacgttgt gacagtgacc aggctagacc gcctggcccg cagcacccgc 11880gacctactgg acattgccga 
gcgcatccag gaggccggcg cgggcctgcg tagcctggca 11940gagccgtggg ccgacaccac cacgccggcc ggccgcatgg tgttgaccgt gttcgccggc 12000attgccgagt tcgagcgttc cctaatcatc gaccgcaccc ggagcgggcg cgaggccgcc 12060aaggcccgag gcgtgaagtt tggcccccgc cctaccctca ccccggcaca gatcgcgcac 12120gcccgcgagc tgatcgacca ggaaggccgc accgtgaaag aggcggctgc actgcttggc 12180gtgcatcgct cgaccctgta ccgcgcactt gagcgcagcg aggaagtgac gcccaccgag 12240gccaggcggc gcggtgcctt ccgtgaggac gcattgaccg aggccgacgc cctggcggcc 12300gccgagaatg aacgccaaga ggaacaagca tgaaaccgca ccaggacggc caggacgaac 12360cgtttttcat taccgaagag atcgaggcgg agatgatcgc ggccgggtac gtgttcgagc 12420cgcccgcgca cgtctcaacc gtgcggctgc atgaaatcct ggccggtttg tctgatgcca 12480agctggcggc ctggccggcc agcttggccg ctgaagaaac cgagcgccgc cgtctaaaaa 12540ggtgatgtgt atttgagtaa aacagcttgc gtcatgcggt cgctgcgtat atgatgcgat 12600gagtaaataa acaaatacgc aagggaacgc atgaagttat cgctgtactt aaccagaaag 12660gcgggtcagg caagacgacc atcgcaaccc atctagcccg cgccctgcaa ctcgccgggg 12720ccgatgttct gttagtcgat tccgatcccc agggcagtgc ccgcgattgg gcggccgtgc 12780gggaagatca accgctaacc gttgtcggca tcgaccgccc gacgattgac cgcgacgtga 12840aggccatcgg ccggcgcgac ttcgtagtga tcgacggagc gccccaggcg gcggacttgg 12900ctgtgtccgc gatcaaggca gccgacttcg tgctgattcc ggtgcagcca agcccttacg 12960acatatgggc caccgccgac ctggtggagc tggttaagca gcgcattgag gtcacggatg 13020gaaggctaca agcggccttt gtcgtgtcgc gggcgatcaa aggcacgcgc atcggcggtg 13080aggttgccga ggcgctggcc gggtacgagc tgcccattct tgagtcccgt atcacgcagc 13140gcgtgagcta cccaggcact gccgccgccg gcacaaccgt tcttgaatca gaacccgagg 13200gcgacgctgc ccgcgaggtc caggcgctgg ccgctgaaat taaatcaaaa ctcatttgag 13260ttaatgaggt aaagagaaaa tgagcaaaag cacaaacacg ctaagtgccg gccgtccgag 13320cgcacgcagc agcaaggctg caacgttggc cagcctggca gacacgccag ccatgaagcg 13380ggtcaacttt cagttgccgg cggaggatca caccaagctg aagatgtacg cggtacgcca 13440aggcaagacc attaccgagc tgctatctga atacatcgcg cagctaccag agtaaatgag 13500caaatgaata aatgagtaga tgaattttag cggctaaagg aggcggcatg gaaaatcaag 13560aacaaccagg caccgacgcc gtggaatgcc 
ccatgtgtgg aggaacgggc ggttggccag 13620gcgtaagcgg ctgggttgtc tgccggccct gcaatggcac tggaaccccc aagcccgagg 13680aatcggcgtg agcggtcgca aaccatccgg cccggtacaa atcggcgcgg cgctgggtga 13740tgacctggtg gagaagttga aggccgcgca ggccgcccag cggcaacgca tcgaggcaga 13800agcacgcccc ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag aatcccggca 13860accgccggca gccggtgcgc cgtcgattag gaagccgccc aagggcgacg agcaaccaga 13920ttttttcgtt ccgatgctct atgacgtggg cacccgcgat agtcgcagca tcatggacgt 13980ggccgttttc cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc gctacgagct 14040tccagacggg cacgtagagg tttccgcagg gccggccggc atggccagtg tgtgggatta 14100cgacctggta ctgatggcgg tttcccatct aaccgaatcc atgaaccgat accgggaagg 14160gaagggagac aagcccggcc gcgtgttccg tccacacgtt gcggacgtac tcaagttctg 14220ccggcgagcc gatggcggaa agcagaaaga cgacctggta gaaacctgca ttcggttaaa 14280caccacgcac gttgccatgc agcgtacgaa gaaggccaag aacggccgcc tggtgacggt 14340atccgagggt gaagccttga ttagccgcta caagatcgta aagagcgaaa ccgggcggcc 14400ggagtacatc gagatcgagc tagctgattg gatgtaccgc gagatcacag aaggcaagaa 14460cccggacgtg ctgacggttc accccgatta ctttttgatc gatcccggca tcggccgttt 14520tctctaccgc ctggcacgcc gcgccgcagg caaggcagaa gccagatggt tgttcaagac 14580gatctacgaa cgcagtggca gcgccggaga gttcaagaag ttctgtttca ccgtgcgcaa 14640gctgatcggg tcaaatgacc tgccggagta cgatttgaag gaggaggcgg ggcaggctgg 14700cccgatccta gtcatgcgct accgcaacct gatcgagggc gaagcatccg ccggttccta 14760atgtacggag cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc gaaaaggtct 14820ctttcctgtg gatagcacgt acattgggaa cccaaagccg tacattggga accggaaccc 14880gtacattggg aacccaaagc cgtacattgg gaaccggtca cacatgtaag tgactgatat 14940aaaagagaaa aaaggcgatt tttccgccta aaactcttta aaacttatta aaactcttaa
15000aacccgcctg gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc tgcaaaaagc 15060gcctaccctt cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc ctatcgcggc 15120cgctggccgc tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc gcggacaagc 15180cgcgccgtcg ccactcgacc gccggcgccc acatcaaggc accctgcctc gcgcgtttcg 15240gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca gcttgtctgt 15300aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc 15360ggggcgcagc catgacccag tcacgtagcg atagcggagt gtatactggc ttaactatgc 15420ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac cgcacagatg 15480cgtaaggaga aaataccgca tcaggcgctc ttccgcttcc tcgctcactg actcgctgcg 15540ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa tacggttatc 15600cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag 15660gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca 15720tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca 15780ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 15840atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag 15900gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt 15960tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca 16020cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg 16080cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa ggacagtatt 16140tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc 16200cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg 16260cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg 16320gaacgaaaac tcacgttaag ggattttggt catgagatta tcaaaaagga tcttcaccta 16380gatcctttta aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg 16440gtctgacagt taccaatgct taatcagtga ggcacctatc tcagcgatct gtctatttcg 16500ttcatccata gttgcctgac tccccgtcgt gtagataact acgatacggg agggcttacc 16560atctggcccc agtgctgcaa tgataccgcg agacccacgc tcaccggctc cagatttatc 16620agcaataaac cagccagccg gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc 
16680ctccatccag tctattaatt gttgccggga agctagagta agtagttcgc cagttaatag 16740tttgcgcaac gttgttgcca ttgctacagg catcgtggtg tcacgctcgt cgtttggtat 16800ggcttcattc agctccggtt cccaacgatc aaggcgagtt acatgatccc ccatgttgtg 16860caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc agaagtaagt tggccgcagt 16920gttatcactc atggttatgg cagcactgca taattctctt actgtcatgc catccgtaag 16980atgcttttct gtgactggtg agtactcaac caagtcattc tgagaatagt gtatgcggcg 17040accgagttgc tcttgcccgg cgtcaacacg ggataatacc gcgccacata gcagaacttt 17100aaaagtgctc atcattggaa aagacctgca gggggggggg ggaaagccac gttgtgtctc 17160aaaatctctg atgttacatt gcacaagata aaaatatatc atcatgaaca ataaaactgt 17220ctgcttacat aaacagtaat acaaggggtg ttatgagcca tattcaacgg gaaacgtctt 17280gctcgaggcc gcgattaaat tccaacatgg atgctgattt atatgggtat aaatgggctc 17340gcgataatgt cgggcaatca ggtgcgacaa tctatcgatt gtatgggaag cccgatgcgc 17400cagagttgtt tctgaaacat ggcaaaggta gcgttgccaa tgatgttaca gatgagatgg 17460tcagactaaa ctggctgacg gaatttatgc ctcttccgac catcaagcat tttatccgta 17520ctcctgatga tgcatggtta ctcaccactg cgatccccgg gaaaacagca ttccaggtat 17580tagaagaata tcctgattca ggtgaaaata ttgttgatgc gctggcagtg ttcctgcgcc 17640ggttgcattc gattcctgtt tgtaattgtc cttttaacag cgatcgcgta tttcgtctcg 17700ctcaggcgca atcacgaatg aataacggtt tggttgatgc gagtgatttt gatgacgagc 17760gtaatggctg gcctgttgaa caagtctgga aagaaatgca taagcttttg ccattctcac 17820cggattcagt cgtcactcat ggtgatttct cacttgataa ccttattttt gacgagggga 17880aattaatagg ttgtattgat gttggacgag tcggaatcgc agaccgatac caggatcttg 17940ccatcctatg gaactgcctc ggtgagtttt ctccttcatt acagaaacgg ctttttcaaa 18000aatatggtat tgataatcct gatatgaata aattgcagtt tcatttgatg ctcgatgagt 18060ttttctaatc agaattggtt aattggttgt aacactggca gagcattacg ctgacttgac 18120gggacggcgg ctttgttgaa taaatcgaac ttttgctgag ttgaaggatc agatcacgca 18180tcttcccgac aacgcagacc gttccgtggc aaagcaaaag ttcaaaatca ccaactggtc 18240cacctacaac aaagctctca tcaaccgtgg ctccctcact ttctggctgg atgatggggc 18300gattcaggcc tggtatgagt cagcaacacc ttcttcacga ggcagacctc agcgcccccc 
18360cccccctgca ggtcaattcg gtcgatatgg ctattacgaa gaaggctcgt gcgcggagtc 18420ccgtgaactt tcccacgcaa caagtgaacc gcaccgggtt tgccggaggc catttcgtta 18480aaatgcgcag c 18491250DNAArtificial Sequencepoly-linker 2gatcactagt ggcgcgccta ggagatctcg agtagggata acagggtaat 5037085DNAArtificial SequencePlasmid pKR85 3cgcgccaagc ttttgatcca tgcccttcat ttgccgctta ttaattaatt tggtaacagt 60ccgtactaat cagttactta tccttccccc atcataatta atcttggtag tctcgaatgc 120cacaacactg actagtctct tggatcataa gaaaaagcca aggaacaaaa gaagacaaaa 180cacaatgaga gtatcctttg catagcaatg tctaagttca taaaattcaa acaaaaacgc 240aatcacacac agtggacatc acttatccac tagctgatca ggatcgccgc gtcaagaaaa 300aaaaactgga ccccaaaagc catgcacaac aacacgtact cacaaaggtg tcaatcgagc 360agcccaaaac attcaccaac tcaacccatc atgagccctc acatttgttg tttctaaccc 420aacctcaaac tcgtattctc ttccgccacc tcatttttgt ttatttcaac acccgtcaaa 480ctgcatgcca ccccgtggcc aaatgtccat gcatgttaac aagacctatg actataaata 540gctgcaatct cggcccaggt tttcatcatc aagaaccagt tcaatatcct agtacaccgt 600attaaagaat ttaagatata ctgcggccgc aagtatgaac taaaatgcat gtaggtgtaa 660gagctcatgg agagcatgga atattgtatc cgaccatgta acagtataat aactgagctc 720catctcactt cttctatgaa taaacaaagg atgttatgat atattaacac tctatctatg 780caccttattg ttctatgata aatttcctct tattattata aatcatctga atcgtgacgg 840cttatggaat gcttcaaata gtacaaaaac aaatgtgtac tataagactt tctaaacaat 900tctaacctta gcattgtgaa cgagacataa gtgttaagaa gacataacaa ttataatgga 960agaagtttgt ctccatttat atattatata ttacccactt atgtattata ttaggatgtt 1020aaggagacat aacaattata aagagagaag tttgtatcca tttatatatt atatactacc 1080catttatata ttatacttat ccacttattt aatgtcttta taaggtttga tccatgatat 1140ttctaatatt ttagttgata tgtatatgaa agggtactat ttgaactctc ttactctgta 1200taaaggttgg atcatcctta aagtgggtct atttaatttt attgcttctt acagataaaa 1260aaaaaattat gagttggttt gataaaatat tgaaggattt aaaataataa taaataacat 1320ataatatatg tatataaatt tattataata taacatttat ctataaaaaa gtaaatattg 1380tcataaatct atacaatcgt ttagccttgc tggacgaatc tcaattattt aaacgagagt 1440aaacatattt gactttttgg ttatttaaca 
aattattatt taacactata tgaaattttt 1500ttttttatca gcaaagaata aaattaaatt aagaaggaca atggtgtccc aatccttata 1560caaccaactt ccacaagaaa gtcaagtcag agacaacaaa aaaacaagca aaggaaattt 1620tttaatttga gttgtcttgt ttgctgcata atttatgcag taaaacacta cacataaccc 1680ttttagcagt agagcaatgg ttgaccgtgt gcttagcttc ttttatttta tttttttatc 1740agcaaagaat aaataaaata aaatgagaca cttcagggat gtttcaacaa gcttggatct 1800cctgcaggat ctggccggcc ggatctcgta cggatccgtc gacggcgcgc ccgatcatcc 1860ggatatagtt cctcctttca gcaaaaaacc cctcaagacc cgtttagagg ccccaagggg 1920ttatgctagt tattgctcag cggtggcagc agccaactca gcttcctttc gggctttgtt 1980agcagccgga tcgatccaag ctgtacctca ctattccttt gccctcggac gagtgctggg 2040gcgtcggttt ccactatcgg cgagtacttc tacacagcca tcggtccaga cggccgcgct 2100tctgcgggcg atttgtgtac gcccgacagt cccggctccg gatcggacga ttgcgtcgca 2160tcgaccctgc gcccaagctg catcatcgaa attgccgtca accaagctct gatagagttg 2220gtcaagacca atgcggagca tatacgcccg gagccgcggc gatcctgcaa gctccggatg 2280cctccgctcg aagtagcgcg tctgctgctc catacaagcc aaccacggcc tccagaagaa 2340gatgttggcg acctcgtatt gggaatcccc gaacatcgcc tcgctccagt caatgaccgc 2400tgttatgcgg ccattgtccg tcaggacatt gttggagccg aaatccgcgt gcacgaggtg 2460ccggacttcg gggcagtcct cggcccaaag catcagctca tcgagagcct gcgcgacgga 2520cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac acatggggat cagcaatcgc 2580gcatatgaaa tcacgccatg tagtgtattg accgattcct tgcggtccga atgggccgaa 2640cccgctcgtc tggctaagat cggccgcagc gatcgcatcc atagcctccg cgaccggctg 2700cagaacagcg ggcagttcgg tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg 2760ggagatgcaa taggtcaggc tctcgctgaa ttccccaatg tcaagcactt ccggaatcgg 2820gagcgcggcc gatgcaaagt gccgataaac ataacgatct ttgtagaaac catcggcgca 2880gctatttacc cgcaggacat atccacgccc tcctacatcg aagctgaaag cacgagattc 2940ttcgccctcc gagagctgca tcaggtcgga gacgctgtcg aacttttcga tcagaaactt 3000ctcgacagac gtcgcggtga gttcaggctt ttccatgggt atatctcctt cttaaagtta 3060aacaaaatta tttctagagg gaaaccgttg tggtctccct atagtgagtc gtattaattt 3120cgcgggatcg agatcgatcc aattccaatc ccacaaaaat ctgagcttaa cagcacagtt 
3180gctcctctca gagcagaatc gggtattcaa caccctcata tcaactacta cgttgtgtat 3240aacggtccac atgccggtat atacgatgac tggggttgta caaaggcggc aacaaacggc 3300gttcccggag ttgcacacaa gaaatttgcc actattacag aggcaagagc agcagctgac 3360gcgtacacaa caagtcagca aacagacagg ttgaacttca tccccaaagg agaagctcaa 3420ctcaagccca agagctttgc taaggcccta acaagcccac caaagcaaaa agcccactgg 3480ctcacgctag gaaccaaaag gcccagcagt gatccagccc caaaagagat ctcctttgcc 3540ccggagatta caatggacga tttcctctat ctttacgatc taggaaggaa gttcgaaggt 3600gaaggtgacg acactatgtt caccactgat aatgagaagg ttagcctctt caatttcaga 3660aagaatgctg acccacagat ggttagagag gcctacgcag caggtctcat caagacgatc 3720tacccgagta acaatctcca ggagatcaaa taccttccca agaaggttaa agatgcagtc 3780aaaagattca ggactaattg catcaagaac acagagaaag acatatttct caagatcaga 3840agtactattc cagtatggac gattcaaggc ttgcttcata aaccaaggca agtaatagag 3900attggagtct ctaaaaaggt agttcctact gaatctaagg ccatgcatgg agtctaagat 3960tcaaatcgag gatctaacag aactcgccgt gaagactggc gaacagttca tacagagtct 4020tttacgactc aatgacaaga agaaaatctt cgtcaacatg gtggagcacg acactctggt 4080ctactccaaa aatgtcaaag atacagtctc agaagaccaa agggctattg agacttttca 4140acaaaggata atttcgggaa acctcctcgg attccattgc ccagctatct gtcacttcat 4200cgaaaggaca gtagaaaagg aaggtggctc ctacaaatgc catcattgcg ataaaggaaa 4260ggctatcatt caagatgcct ctgccgacag tggtcccaaa gatggacccc cacccacgag 4320gagcatcgtg gaaaaagaag acgttccaac cacgtcttca aagcaagtgg attgatgtga 4380catctccact gacgtaaggg atgacgcaca atcccactat ccttcgcaag acccttcctc 4440tatataagga agttcatttc atttggagag gacacgctcg agctcatttc tctattactt 4500cagccataac aaaagaactc ttttctcttc ttattaaacc atgaaaaagc ctgaactcac 4560cgcgacgtct gtcgagaagt ttctgatcga aaagttcgac agcgtctccg acctgatgca 4620gctctcggag ggcgaagaat ctcgtgcttt cagcttcgat gtaggagggc gtggatatgt 4680cctgcgggta aatagctgcg ccgatggttt ctacaaagat cgttatgttt atcggcactt 4740tgcatcggcc gcgctcccga ttccggaagt gcttgacatt ggggaattca gcgagagcct 4800gacctattgc atctcccgcc gtgcacaggg tgtcacgttg caagacctgc ctgaaaccga 4860actgcccgct gttctgcagc cggtcgcgga 
ggccatggat gcgatcgctg cggccgatct 4920tagccagacg agcgggttcg gcccattcgg accgcaagga atcggtcaat acactacatg 4980gcgtgatttc atatgcgcga ttgctgatcc ccatgtgtat cactggcaaa ctgtgatgga 5040cgacaccgtc agtgcgtccg tcgcgcaggc tctcgatgag ctgatgcttt gggccgagga 5100ctgccccgaa gtccggcacc tcgtgcacgc ggatttcggc tccaacaatg tcctgacgga 5160caatggccgc ataacagcgg tcattgactg gagcgaggcg atgttcgggg attcccaata 5220cgaggtcgcc aacatcttct tctggaggcc gtggttggct tgtatggagc agcagacgcg 5280ctacttcgag cggaggcatc cggagcttgc aggatcgccg cggctccggg cgtatatgct 5340ccgcattggt cttgaccaac tctatcagag cttggttgac ggcaatttcg atgatgcagc 5400ttgggcgcag ggtcgatgcg acgcaatcgt ccgatccgga gccgggactg tcgggcgtac 5460acaaatcgcc cgcagaagcg cggccgtctg gaccgatggc tgtgtagaag tactcgccga 5520tagtggaaac cgacgcccca gcactcgtcc gagggcaaag gaatagtgag gtacctaaag 5580aaggagtgcg tcgaagcaga tcgttcaaac atttggcaat aaagtttctt aagattgaat 5640cctgttgccg gtcttgcgat gattatcata taatttctgt tgaattacgt taagcatgta 5700ataattaaca tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg 5760caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta ggataaatta 5820tcgcgcgcgg tgtcatctat gttactagat cgatgtcgaa tcgatcaacc tgcattaatg 5880aatcggccaa cgcgcgggga gaggcggttt gcgtattggg cgctcttccg cttcctcgct 5940cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg gtatcagctc actcaaaggc 6000ggtaatacgg ttatccacag aatcagggga taacgcagga aagaacatgt gagcaaaagg 6060ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg gcgtttttcc ataggctccg 6120cccccctgac gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa acccgacagg 6180actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc ctgttccgac 6240cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca 6300atgctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt 6360gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc 6420caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca ggattagcag 6480agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact acggctacac 6540tagaaggaca gtatttggta tctgcgctct gctgaagcca gttaccttcg gaaaaagagt 
6600tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa 6660gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgatct tttctacggg 6720gtctgacgct cagtggaacg aaaactcacg ttaagggatt ttggtcatga cattaaccta 6780taaaaatagg cgtatcacga ggccctttcg tctcgcgcgt ttcggtgatg acggtgaaaa 6840cctctgacac atgcagctcc cggagacggt cacagcttgt ctgtaagcgg atgccgggag 6900cagacaagcc cgtcagggcg cgtcagcggg tgttggcggg tgtcggggct ggcttaacta 6960tgcggcatca gagcagattg tactgagagt gcaccatatg gacatattgt cgttagaacg 7020cggctacaat taatacataa ccttatgtat catacacata cgatttaggt gacactatag 7080aacgg 708545303DNAArtificial SequencePlasmid pKR278 4agcttggatc tcctgcagga tctggccggc cggatctcgt acggatccgt cgacggcgcg 60cccgatcatc cggatatagt tcctcctttc agcaaaaaac ccctcaagac ccgtttagag 120gccccaaggg gttatgctag ttattgctca gcggtggcag cagccaactc agcttccttt 180cgggctttgt tagcagccgg atcgatccaa gctgtacctc actattcctt tgccctcgga 240cgagtgctgg ggcgtcggtt tccactatcg gcgagtactt ctacacagcc atcggtccag 300acggccgcgc ttctgcgggc gatttgtgta cgcccgacag tcccggctcc ggatcggacg 360attgcgtcgc atcgaccctg cgcccaagct gcatcatcga aattgccgtc aaccaagctc 420tgatagagtt ggtcaagacc aatgcggagc atatacgccc ggagccgcgg cgatcctgca 480agctccggat gcctccgctc gaagtagcgc gtctgctgct ccatacaagc caaccacggc 540ctccagaaga agatgttggc gacctcgtat tgggaatccc cgaacatcgc ctcgctccag 600tcaatgaccg ctgttatgcg gccattgtcc gtcaggacat tgttggagcc gaaatccgcg 660tgcacgaggt gccggacttc ggggcagtcc tcggcccaaa gcatcagctc atcgagagcc 720tgcgcgacgg acgcactgac ggtgtcgtcc atcacagttt gccagtgata cacatgggga 780tcagcaatcg cgcatatgaa atcacgccat gtagtgtatt gaccgattcc ttgcggtccg 840aatgggccga acccgctcgt ctggctaaga tcggccgcag cgatcgcatc catagcctcc 900gcgaccggct gcagaacagc gggcagttcg gtttcaggca ggtcttgcaa cgtgacaccc 960tgtgcacggc gggagatgca ataggtcagg ctctcgctga attccccaat gtcaagcact 1020tccggaatcg ggagcgcggc cgatgcaaag tgccgataaa cataacgatc tttgtagaaa 1080ccatcggcgc agctatttac ccgcaggaca tatccacgcc ctcctacatc gaagctgaaa 1140gcacgagatt cttcgccctc cgagagctgc atcaggtcgg agacgctgtc gaacttttcg 
1200atcagaaact tctcgacaga cgtcgcggtg agttcaggct tttccatggg tatatctcct 1260tcttaaagtt aaacaaaatt atttctagag ggaaaccgtt gtggtctccc tatagtgagt 1320cgtattaatt tcgcgggatc gagatcgatc caattccaat cccacaaaaa tctgagctta 1380acagcacagt tgctcctctc agagcagaat cgggtattca acaccctcat atcaactact 1440acgttgtgta taacggtcca catgccggta tatacgatga ctggggttgt acaaaggcgg 1500caacaaacgg cgttcccgga gttgcacaca agaaatttgc cactattaca gaggcaagag 1560cagcagctga cgcgtacaca acaagtcagc aaacagacag gttgaacttc atccccaaag 1620gagaagctca actcaagccc aagagctttg ctaaggccct aacaagccca ccaaagcaaa 1680aagcccactg gctcacgcta ggaaccaaaa ggcccagcag tgatccagcc ccaaaagaga 1740tctcctttgc cccggagatt acaatggacg atttcctcta tctttacgat ctaggaagga 1800agttcgaagg tgaaggtgac gacactatgt tcaccactga taatgagaag gttagcctct 1860tcaatttcag aaagaatgct gacccacaga tggttagaga ggcctacgca gcaggtctca 1920tcaagacgat ctacccgagt aacaatctcc aggagatcaa ataccttccc aagaaggtta 1980aagatgcagt caaaagattc aggactaatt gcatcaagaa cacagagaaa gacatatttc 2040tcaagatcag aagtactatt ccagtatgga cgattcaagg cttgcttcat aaaccaaggc 2100aagtaataga gattggagtc tctaaaaagg tagttcctac tgaatctaag gccatgcatg 2160gagtctaaga ttcaaatcga ggatctaaca gaactcgccg tgaagactgg cgaacagttc 2220atacagagtc ttttacgact caatgacaag aagaaaatct tcgtcaacat ggtggagcac 2280gacactctgg tctactccaa aaatgtcaaa gatacagtct cagaagacca aagggctatt 2340gagacttttc aacaaaggat aatttcggga aacctcctcg gattccattg cccagctatc 2400tgtcacttca tcgaaaggac agtagaaaag gaaggtggct cctacaaatg ccatcattgc 2460gataaaggaa aggctatcat tcaagatgcc tctgccgaca gtggtcccaa agatggaccc 2520ccacccacga ggagcatcgt ggaaaaagaa gacgttccaa ccacgtcttc aaagcaagtg 2580gattgatgtg acatctccac tgacgtaagg gatgacgcac aatcccacta tccttcgcaa 2640gacccttcct ctatataagg aagttcattt catttggaga ggacacgctc gagctcattt 2700ctctattact tcagccataa caaaagaact cttttctctt cttattaaac catgaaaaag 2760cctgaactca ccgcgacgtc tgtcgagaag tttctgatcg aaaagttcga cagcgtctcc 2820gacctgatgc agctctcgga gggcgaagaa tctcgtgctt tcagcttcga tgtaggaggg 2880cgtggatatg tcctgcgggt aaatagctgc 
gccgatggtt tctacaaaga tcgttatgtt 2940tatcggcact ttgcatcggc cgcgctcccg attccggaag tgcttgacat tggggaattc 3000agcgagagcc tgacctattg catctcccgc cgtgcacagg gtgtcacgtt gcaagacctg 3060cctgaaaccg aactgcccgc tgttctgcag ccggtcgcgg aggccatgga tgcgatcgct 3120gcggccgatc ttagccagac gagcgggttc ggcccattcg gaccgcaagg aatcggtcaa 3180tacactacat ggcgtgattt catatgcgcg attgctgatc cccatgtgta tcactggcaa 3240actgtgatgg acgacaccgt cagtgcgtcc gtcgcgcagg ctctcgatga gctgatgctt 3300tgggccgagg actgccccga agtccggcac ctcgtgcacg cggatttcgg ctccaacaat 3360gtcctgacgg acaatggccg cataacagcg gtcattgact ggagcgaggc gatgttcggg 3420gattcccaat acgaggtcgc caacatcttc ttctggaggc cgtggttggc ttgtatggag 3480cagcagacgc gctacttcga gcggaggcat ccggagcttg caggatcgcc gcggctccgg 3540gcgtatatgc tccgcattgg tcttgaccaa ctctatcaga gcttggttga cggcaatttc 3600gatgatgcag cttgggcgca gggtcgatgc gacgcaatcg tccgatccgg agccgggact 3660gtcgggcgta cacaaatcgc ccgcagaagc gcggccgtct ggaccgatgg ctgtgtagaa 3720gtactcgccg atagtggaaa ccgacgcccc agcactcgtc cgagggcaaa ggaatagtga 3780ggtacctaaa gaaggagtgc gtcgaagcag atcgttcaaa catttggcaa taaagtttct 3840taagattgaa tcctgttgcc ggtcttgcga tgattatcat ataatttctg ttgaattacg 3900ttaagcatgt aataattaac atgtaatgca tgacgttatt tatgagatgg gtttttatga 3960ttagagtccc gcaattatac atttaatacg cgatagaaaa caaaatatag cgcgcaaact 4020aggataaatt atcgcgcgcg gtgtcatcta tgttactaga tcgatgtcga atcgatcaac 4080ctgcattaat gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc 4140gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct
4200cactcaaagg cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg 4260tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc 4320cataggctcc gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga 4380aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct 4440cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg 4500gcgctttctc aatgctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag 4560ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat 4620cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac 4680aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac 4740tacggctaca ctagaaggac agtatttggt atctgcgctc tgctgaagcc agttaccttc 4800ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt 4860tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc 4920ttttctacgg ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg 4980acattaacct ataaaaatag gcgtatcacg aggccctttc gtctcgcgcg tttcggtgat 5040gacggtgaaa acctctgaca catgcagctc ccggagacgg tcacagcttg tctgtaagcg 5100gatgccggga gcagacaagc ccgtcagggc gcgtcagcgg gtgttggcgg gtgtcggggc 5160tggcttaact atgcggcatc agagcagatt gtactgagag tgcaccatat ggacatattg 5220tcgttagaac gcggctacaa ttaatacata accttatgta tcatacacat acgatttagg 5280tgacactata gaacggcgcg cca 530354140DNAArtificial SequencePlasmid pKR407 5ggccgcattt cgcaccaaat caatgaaagt aataatgaaa agtctgaata agaatactta 60ggcttagatg cctttgttac ttgtgtaaaa taacttgagt catgtacctt tggcggaaac 120agaataaata aaaggtgaaa ttccaatgct ctatgtataa gttagtaata cttaatgtgt 180tctacggttg tttcaatatc atcaaactct aattgaaact ttagaaccac aaatctcaat 240cttttcttaa tgaaatgaaa aatcttaatt gtaccatgtt tatgttaaac accttacaat 300tggttggaga ggaggaccaa ccgatgggac aacattggga gaaagagatt caatggagat 360ttggatagga gaacaacatt ctttttcact tcaatacaag atgagtgcaa cactaaggat 420atgtatgaga ctttcagaag ctacgacaac atagatgagt gaggtggtga ttcctagcaa 480gaaagacatt agaggaagcc aaaatcgaac aaggaagaca tcaagggcaa gagacaggac 540catccatctc aggaaaagga gctttgggat 
agtccgagaa gttgtacaag aaattttttg 600gagggtgagt gatgcattgc tggtgacttt aactcaatca aaattgagaa agaaagaaaa 660gggagggggc tcacatgtga atagaaggga aacgggagaa ttttacagtt ttgatctaat 720gggcatccca gctagtggta acatattcac catgtttaac cttcacgtac gtctagagga 780tcccccgggc tgcaggaatt cactggccgt cgttttacaa cgtcgtgact gggaaaaccc 840tggcgttacc caacttaatc gccttgcagc acatccccct ttcgccagct ggcgtaatag 900cgaagaggcc cgcaccgatc gcccttccca acagttgcgc agcctgaatg gcgaatggcg 960cctgatgcgg tattttctcc ttacgcatct gtgcggtatt tcacaccgca tatggtgcac 1020tctcagtaca atctgctctg atgccgcata gttaagccag ccccgacacc cgccaacacc 1080cgctgacgcg ccctgacggg cttgtctgct cccggcatcc gcttacagac aagctgtgac 1140cgtctccggg agctgcatgt gtcagaggtt ttcaccgtca tcaccgaaac gcgcgagacg 1200aaagggcctc gtgatacgcc tatttttata ggttaatgtc atgataataa tggtttctta 1260gacgtcaggt ggcacttttc ggggaaatgt gcgcggaacc cctatttgtt tatttttcta 1320aatacattca aatatgtatc cgctcatgag acaataaccc tgataaatgc ttcaataata 1380ttgaaaaagg aagagtatga gtattcaaca tttccgtgtc gcccttattc ccttttttgc 1440ggcattttgc cttcctgttt ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga 1500agatcagttg ggtgcacgag tgggttacat cgaactggat ctcaacagcg gtaagatcct 1560tgagagtttt cgccccgaag aacgttttcc aatgatgagc acttttaaag ttctgctatg 1620tggcgcggta ttatcccgta ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta 1680ttctcagaat gacttggttg agtactcacc agtcacagaa aagcatctta cggatggcat 1740gacagtaaga gaattatgca gtgctgccat aaccatgagt gataacactg cggccaactt 1800acttctgaca acgatcggag gaccgaagga gctaaccgct tttttgcaca acatggggga 1860tcatgtaact cgccttgatc gttgggaacc ggagctgaat gaagccatac caaacgacga 1920gcgtgacacc acgatgcctg tagcaatggc aacaacgttg cgcaaactat taactggcga 1980actacttact ctagcttccc ggcaacaatt aatagactgg atggaggcgg ataaagttgc 2040aggaccactt ctgcgctcgg cccttccggc tggctggttt attgctgata aatctggagc 2100cggtgagcgt gggtctcgcg gtatcattgc agcactgggg ccagatggta agccctcccg 2160tatcgtagtt atctacacga cggggagtca ggcaactatg gatgaacgaa atagacagat 2220cgctgagata ggtgcctcac tgattaagca ttggtaactg tcagaccaag tttactcata 
2280tatactttag attgatttaa aacttcattt ttaatttaaa aggatctagg tgaagatcct 2340ttttgataat ctcatgacca aaatccctta acgtgagttt tcgttccact gagcgtcaga 2400ccccgtagaa aagatcaaag gatcttcttg agatcctttt tttctgcgcg taatctgctg 2460cttgcaaaca aaaaaaccac cgctaccagc ggtggtttgt ttgccggatc aagagctacc 2520aactcttttt ccgaaggtaa ctggcttcag cagagcgcag ataccaaata ctgtccttct 2580agtgtagccg tagttaggcc accacttcaa gaactctgta gcaccgccta catacctcgc 2640tctgctaatc ctgttaccag tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt 2700ggactcaaga cgatagttac cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg 2760cacacagccc agcttggagc gaacgaccta caccgaactg agatacctac agcgtgagct 2820atgagaaagc gccacgcttc ccgaagggag aaaggcggac aggtatccgg taagcggcag 2880ggtcggaaca ggagagcgca cgagggagct tccaggggga aacgcctggt atctttatag 2940tcctgtcggg tttcgccacc tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg 3000gcggagccta tggaaaaacg ccagcaacgc ggccttttta cggttcctgg ccttttgctg 3060gccttttgct cacatgttct ttcctgcgtt atcccctgat tctgtggata accgtattac 3120cgcctttgag tgagctgata ccgctcgccg cagccgaacg accgagcgca gcgagtcagt 3180gagcgaggaa gcggaagagc gcccaatacg caaaccgcct ctccccgcgc gttggccgat 3240tcattaatgc agctggcacg acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc 3300aattaatgtg agttagctca ctcattaggc accccaggct ttacacttta tgcttccggc 3360tcgtatgttg tgtggaattg tgagcggata acaatttcac acaggaaaca gctatgacca 3420tgattacgcc aagcttgcat gcctgcaggc tagcctaagt acgtactcaa aatgccaaca 3480aataaaaaaa aagttgcttt aataatgcca aaacaaatta ataaaacact tacaacaccg 3540gatttttttt aattaaaatg tgccatttag gataaatagt taatattttt aataattatt 3600taaaaagccg tatctactaa aatgattttt atttggttga aaatattaat atgtttaaat 3660caacacaatc tatcaaaatt aaactaaaaa aaaaataagt gtacgtggtt aacattagta 3720cagtaatata agaggaaaat gagaaattaa gaaattgaaa gcgagtctaa tttttaaatt 3780atgaacctgc atatataaaa ggaaagaaag aatccaggaa gaaaagaaat gaaaccatgc 3840atggtcccct cgtcatcacg agtttctgcc atttgcaata gaaacactga aacacctttc 3900tctttgtcac ttaattgaga tgccgaagcc acctcacacc atgaacttca tgaggtgtag 3960cacccaaggc ttccatagcc atgcatactg 
aagaatgtct caagctcagc accctacttc 4020tgtgacgtgt ccctcattca ccttcctctc ttccctataa ataaccacgc ctcaggttct 4080ccgcttcaca actcaaacat tctctccatt ggtccttaaa cactcatcag tcatcaccgc 414066747DNAArtificial SequencePlasmid pKR1468 6gatccgtcga cggcgcgccc gatcatccgg atatagttcc tcctttcagc aaaaaacccc 60tcaagacccg tttagaggcc ccaaggggtt atgctagtta ttgctcagcg gtggcagcag 120ccaactcagc ttcctttcgg gctttgttag cagccggatc gatccaagct gtacctcact 180attcctttgc cctcggacga gtgctggggc gtcggtttcc actatcggcg agtacttcta 240cacagccatc ggtccagacg gccgcgcttc tgcgggcgat ttgtgtacgc ccgacagtcc 300cggctccgga tcggacgatt gcgtcgcatc gaccctgcgc ccaagctgca tcatcgaaat 360tgccgtcaac caagctctga tagagttggt caagaccaat gcggagcata tacgcccgga 420gccgcggcga tcctgcaagc tccggatgcc tccgctcgaa gtagcgcgtc tgctgctcca 480tacaagccaa ccacggcctc cagaagaaga tgttggcgac ctcgtattgg gaatccccga 540acatcgcctc gctccagtca atgaccgctg ttatgcggcc attgtccgtc aggacattgt 600tggagccgaa atccgcgtgc acgaggtgcc ggacttcggg gcagtcctcg gcccaaagca 660tcagctcatc gagagcctgc gcgacggacg cactgacggt gtcgtccatc acagtttgcc 720agtgatacac atggggatca gcaatcgcgc atatgaaatc acgccatgta gtgtattgac 780cgattccttg cggtccgaat gggccgaacc cgctcgtctg gctaagatcg gccgcagcga 840tcgcatccat agcctccgcg accggctgca gaacagcggg cagttcggtt tcaggcaggt 900cttgcaacgt gacaccctgt gcacggcggg agatgcaata ggtcaggctc tcgctgaatt 960ccccaatgtc aagcacttcc ggaatcggga gcgcggccga tgcaaagtgc cgataaacat 1020aacgatcttt gtagaaacca tcggcgcagc tatttacccg caggacatat ccacgccctc 1080ctacatcgaa gctgaaagca cgagattctt cgccctccga gagctgcatc aggtcggaga 1140cgctgtcgaa cttttcgatc agaaacttct cgacagacgt cgcggtgagt tcaggctttt 1200ccatgggtat atctccttct taaagttaaa caaaattatt tctagaggga aaccgttgtg 1260gtctccctat agtgagtcgt attaatttcg cgggatcgag atcgatccaa ttccaatccc 1320acaaaaatct gagcttaaca gcacagttgc tcctctcaga gcagaatcgg gtattcaaca 1380ccctcatatc aactactacg ttgtgtataa cggtccacat gccggtatat acgatgactg 1440gggttgtaca aaggcggcaa caaacggcgt tcccggagtt gcacacaaga aatttgccac 1500tattacagag gcaagagcag cagctgacgc gtacacaaca 
agtcagcaaa cagacaggtt 1560gaacttcatc cccaaaggag aagctcaact caagcccaag agctttgcta aggccctaac 1620aagcccacca aagcaaaaag cccactggct cacgctagga accaaaaggc ccagcagtga 1680tccagcccca aaagagatct cctttgcccc ggagattaca atggacgatt tcctctatct 1740ttacgatcta ggaaggaagt tcgaaggtga aggtgacgac actatgttca ccactgataa 1800tgagaaggtt agcctcttca atttcagaaa gaatgctgac ccacagatgg ttagagaggc 1860ctacgcagca ggtctcatca agacgatcta cccgagtaac aatctccagg agatcaaata 1920ccttcccaag aaggttaaag atgcagtcaa aagattcagg actaattgca tcaagaacac 1980agagaaagac atatttctca agatcagaag tactattcca gtatggacga ttcaaggctt 2040gcttcataaa ccaaggcaag taatagagat tggagtctct aaaaaggtag ttcctactga 2100atctaaggcc atgcatggag tctaagattc aaatcgagga tctaacagaa ctcgccgtga 2160agactggcga acagttcata cagagtcttt tacgactcaa tgacaagaag aaaatcttcg 2220tcaacatggt ggagcacgac actctggtct actccaaaaa tgtcaaagat acagtctcag 2280aagaccaaag ggctattgag acttttcaac aaaggataat ttcgggaaac ctcctcggat 2340tccattgccc agctatctgt cacttcatcg aaaggacagt agaaaaggaa ggtggctcct 2400acaaatgcca tcattgcgat aaaggaaagg ctatcattca agatgcctct gccgacagtg 2460gtcccaaaga tggaccccca cccacgagga gcatcgtgga aaaagaagac gttccaacca 2520cgtcttcaaa gcaagtggat tgatgtgaca tctccactga cgtaagggat gacgcacaat 2580cccactatcc ttcgcaagac ccttcctcta tataaggaag ttcatttcat ttggagagga 2640cacgctcgag ctcatttctc tattacttca gccataacaa aagaactctt ttctcttctt 2700attaaaccat gaaaaagcct gaactcaccg cgacgtctgt cgagaagttt ctgatcgaaa 2760agttcgacag cgtctccgac ctgatgcagc tctcggaggg cgaagaatct cgtgctttca 2820gcttcgatgt aggagggcgt ggatatgtcc tgcgggtaaa tagctgcgcc gatggtttct 2880acaaagatcg ttatgtttat cggcactttg catcggccgc gctcccgatt ccggaagtgc 2940ttgacattgg ggaattcagc gagagcctga cctattgcat ctcccgccgt gcacagggtg 3000tcacgttgca agacctgcct gaaaccgaac tgcccgctgt tctgcagccg gtcgcggagg 3060ccatggatgc gatcgctgcg gccgatctta gccagacgag cgggttcggc ccattcggac 3120cgcaaggaat cggtcaatac actacatggc gtgatttcat atgcgcgatt gctgatcccc 3180atgtgtatca ctggcaaact gtgatggacg acaccgtcag tgcgtccgtc gcgcaggctc 3240tcgatgagct 
gatgctttgg gccgaggact gccccgaagt ccggcacctc gtgcacgcgg 3300atttcggctc caacaatgtc ctgacggaca atggccgcat aacagcggtc attgactgga 3360gcgaggcgat gttcggggat tcccaatacg aggtcgccaa catcttcttc tggaggccgt 3420ggttggcttg tatggagcag cagacgcgct acttcgagcg gaggcatccg gagcttgcag 3480gatcgccgcg gctccgggcg tatatgctcc gcattggtct tgaccaactc tatcagagct 3540tggttgacgg caatttcgat gatgcagctt gggcgcaggg tcgatgcgac gcaatcgtcc 3600gatccggagc cgggactgtc gggcgtacac aaatcgcccg cagaagcgcg gccgtctgga 3660ccgatggctg tgtagaagta ctcgccgata gtggaaaccg acgccccagc actcgtccga 3720gggcaaagga atagtgaggt acctaaagaa ggagtgcgtc gaagcagatc gttcaaacat 3780ttggcaataa agtttcttaa gattgaatcc tgttgccggt cttgcgatga ttatcatata 3840atttctgttg aattacgtta agcatgtaat aattaacatg taatgcatga cgttatttat 3900gagatgggtt tttatgatta gagtcccgca attatacatt taatacgcga tagaaaacaa 3960aatatagcgc gcaaactagg ataaattatc gcgcgcggtg tcatctatgt tactagatcg 4020atgtcgaatc gatcaacctg cattaatgaa tcggccaacg cgcggggaga ggcggtttgc 4080gtattgggcg ctcttccgct tcctcgctca ctgactcgct gcgctcggtc gttcggctgc 4140ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa tcaggggata 4200acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg 4260cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct 4320caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa 4380gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc 4440tcccttcggg aagcgtggcg ctttctcaat gctcacgctg taggtatctc agttcggtgt 4500aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg 4560ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg 4620cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct 4680tgaagtggtg gcctaactac ggctacacta gaaggacagt atttggtatc tgcgctctgc 4740tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg 4800ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc 4860aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt 4920aagggatttt ggtcatgaca ttaacctata aaaataggcg 
tatcacgagg ccctttcgtc 4980tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 5040cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 5100ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 5160accatatgga catattgtcg ttagaacgcg gctacaatta atacataacc ttatgtatca 5220tacacatacg atttaggtga cactatagaa cggcgcgcca agcttgcatg cctgcaggct 5280agcctaagta cgtactcaaa atgccaacaa ataaaaaaaa agttgcttta ataatgccaa 5340aacaaattaa taaaacactt acaacaccgg atttttttta attaaaatgt gccatttagg 5400ataaatagtt aatattttta ataattattt aaaaagccgt atctactaaa atgattttta 5460tttggttgaa aatattaata tgtttaaatc aacacaatct atcaaaatta aactaaaaaa 5520aaaataagtg tacgtggtta acattagtac agtaatataa gaggaaaatg agaaattaag 5580aaattgaaag cgagtctaat ttttaaatta tgaacctgca tatataaaag gaaagaaaga 5640atccaggaag aaaagaaatg aaaccatgca tggtcccctc gtcatcacga gtttctgcca 5700tttgcaatag aaacactgaa acacctttct ctttgtcact taattgagat gccgaagcca 5760cctcacacca tgaacttcat gaggtgtagc acccaaggct tccatagcca tgcatactga 5820agaatgtctc aagctcagca ccctacttct gtgacgtgtc cctcattcac cttcctctct 5880tccctataaa taaccacgcc tcaggttctc cgcttcacaa ctcaaacatt ctctccattg 5940gtccttaaac actcatcagt catcaccgcg gccgcatttc gcaccaaatc aatgaaagta 6000ataatgaaaa gtctgaataa gaatacttag gcttagatgc ctttgttact tgtgtaaaat 6060aacttgagtc atgtaccttt ggcggaaaca gaataaataa aaggtgaaat tccaatgctc 6120tatgtataag ttagtaatac ttaatgtgtt ctacggttgt ttcaatatca tcaaactcta 6180attgaaactt tagaaccaca aatctcaatc ttttcttaat gaaatgaaaa atcttaattg 6240taccatgttt atgttaaaca ccttacaatt ggttggagag gaggaccaac cgatgggaca 6300acattgggag aaagagattc aatggagatt tggataggag aacaacattc tttttcactt 6360caatacaaga tgagtgcaac actaaggata tgtatgagac tttcagaagc tacgacaaca 6420tagatgagtg aggtggtgat tcctagcaag aaagacatta gaggaagcca aaatcgaaca 6480aggaagacat caagggcaag agacaggacc atccatctca ggaaaaggag ctttgggata 6540gtccgagaag ttgtacaaga aattttttgg agggtgagtg atgcattgct ggtgacttta 6600actcaatcaa aattgagaaa gaaagaaaag ggagggggct cacatgtgaa tagaagggaa 6660acgggagaat 
tttacagttt tgatctaatg ggcatcccag ctagtggtaa catattcacc 6720atgtttaacc ttcacgtacg tctagag 674778462DNAArtificial SequencePlasmid pKR1475 7ggccgcattt cgcaccaaat caatgaaagt aataatgaaa agtctgaata agaatactta 60ggcttagatg cctttgttac ttgtgtaaaa taacttgagt catgtacctt tggcggaaac 120agaataaata aaaggtgaaa ttccaatgct ctatgtataa gttagtaata cttaatgtgt 180tctacggttg tttcaatatc atcaaactct aattgaaact ttagaaccac aaatctcaat 240cttttcttaa tgaaatgaaa aatcttaatt gtaccatgtt tatgttaaac accttacaat 300tggttggaga ggaggaccaa ccgatgggac aacattggga gaaagagatt caatggagat 360ttggatagga gaacaacatt ctttttcact tcaatacaag atgagtgcaa cactaaggat 420atgtatgaga ctttcagaag ctacgacaac atagatgagt gaggtggtga ttcctagcaa 480gaaagacatt agaggaagcc aaaatcgaac aaggaagaca tcaagggcaa gagacaggac 540catccatctc aggaaaagga gctttgggat agtccgagaa gttgtacaag aaattttttg 600gagggtgagt gatgcattgc tggtgacttt aactcaatca aaattgagaa agaaagaaaa 660gggagggggc tcacatgtga atagaaggga aacgggagaa ttttacagtt ttgatctaat 720gggcatccca gctagtggta acatattcac catgtttaac cttcacgtac gtctagagga 780tccgtcgacg gcgcgcccga tcatccggat atagttcctc ctttcagcaa aaaacccctc 840aagacccgtt tagaggcccc aaggggttat gctagttatt gctcagcggt ggcagcagcc 900aactcagctt cctttcgggc tttgttagca gccggatcga tccaagctgt acctcactat 960tcctttgccc tcggacgagt gctggggcgt cggtttccac tatcggcgag tacttctaca 1020cagccatcgg tccagacggc cgcgcttctg cgggcgattt gtgtacgccc gacagtcccg 1080gctccggatc ggacgattgc gtcgcatcga ccctgcgccc aagctgcatc atcgaaattg 1140ccgtcaacca agctctgata gagttggtca agaccaatgc ggagcatata cgcccggagc 1200cgcggcgatc ctgcaagctc cggatgcctc cgctcgaagt agcgcgtctg ctgctccata 1260caagccaacc acggcctcca gaagaagatg ttggcgacct cgtattggga atccccgaac 1320atcgcctcgc tccagtcaat gaccgctgtt atgcggccat tgtccgtcag gacattgttg 1380gagccgaaat ccgcgtgcac gaggtgccgg acttcggggc agtcctcggc ccaaagcatc 1440agctcatcga gagcctgcgc gacggacgca ctgacggtgt cgtccatcac agtttgccag 1500tgatacacat ggggatcagc aatcgcgcat atgaaatcac gccatgtagt gtattgaccg 1560attccttgcg gtccgaatgg gccgaacccg ctcgtctggc taagatcggc 
cgcagcgatc 1620gcatccatag cctccgcgac cggctgcaga acagcgggca gttcggtttc aggcaggtct 1680tgcaacgtga caccctgtgc acggcgggag atgcaatagg tcaggctctc gctgaattcc 1740ccaatgtcaa gcacttccgg aatcgggagc gcggccgatg caaagtgccg ataaacataa 1800cgatctttgt agaaaccatc ggcgcagcta tttacccgca ggacatatcc acgccctcct 1860acatcgaagc tgaaagcacg agattcttcg ccctccgaga gctgcatcag gtcggagacg 1920ctgtcgaact tttcgatcag aaacttctcg acagacgtcg cggtgagttc aggcttttcc 1980atgggtatat ctccttctta aagttaaaca aaattatttc tagagggaaa ccgttgtggt 2040ctccctatag tgagtcgtat taatttcgcg ggatcgagat cgatccaatt ccaatcccac 2100aaaaatctga gcttaacagc acagttgctc ctctcagagc agaatcgggt attcaacacc 2160ctcatatcaa ctactacgtt gtgtataacg gtccacatgc cggtatatac gatgactggg 2220gttgtacaaa ggcggcaaca aacggcgttc ccggagttgc acacaagaaa tttgccacta 2280ttacagaggc aagagcagca gctgacgcgt acacaacaag tcagcaaaca gacaggttga 2340acttcatccc caaaggagaa gctcaactca agcccaagag ctttgctaag gccctaacaa 2400gcccaccaaa gcaaaaagcc cactggctca cgctaggaac caaaaggccc agcagtgatc 2460cagccccaaa agagatctcc tttgccccgg agattacaat ggacgatttc ctctatcttt 2520acgatctagg aaggaagttc gaaggtgaag gtgacgacac tatgttcacc actgataatg 2580agaaggttag cctcttcaat ttcagaaaga atgctgaccc acagatggtt agagaggcct 2640acgcagcagg tctcatcaag acgatctacc cgagtaacaa tctccaggag atcaaatacc 2700ttcccaagaa ggttaaagat gcagtcaaaa gattcaggac taattgcatc aagaacacag 2760agaaagacat atttctcaag atcagaagta ctattccagt atggacgatt caaggcttgc 2820ttcataaacc aaggcaagta atagagattg gagtctctaa aaaggtagtt cctactgaat

2880ctaaggccat gcatggagtc taagattcaa atcgaggatc taacagaact cgccgtgaag 2940actggcgaac agttcataca gagtctttta cgactcaatg acaagaagaa aatcttcgtc 3000aacatggtgg agcacgacac tctggtctac tccaaaaatg tcaaagatac agtctcagaa 3060gaccaaaggg ctattgagac ttttcaacaa aggataattt cgggaaacct cctcggattc 3120cattgcccag ctatctgtca cttcatcgaa aggacagtag aaaaggaagg tggctcctac 3180aaatgccatc attgcgataa aggaaaggct atcattcaag atgcctctgc cgacagtggt 3240cccaaagatg gacccccacc cacgaggagc atcgtggaaa aagaagacgt tccaaccacg 3300tcttcaaagc aagtggattg atgtgacatc tccactgacg taagggatga cgcacaatcc 3360cactatcctt cgcaagaccc ttcctctata taaggaagtt catttcattt ggagaggaca 3420cgctcgagct catttctcta ttacttcagc cataacaaaa gaactctttt ctcttcttat 3480taaaccatga aaaagcctga actcaccgcg acgtctgtcg agaagtttct gatcgaaaag 3540ttcgacagcg tctccgacct gatgcagctc tcggagggcg aagaatctcg tgctttcagc 3600ttcgatgtag gagggcgtgg atatgtcctg cgggtaaata gctgcgccga tggtttctac 3660aaagatcgtt atgtttatcg gcactttgca tcggccgcgc tcccgattcc ggaagtgctt 3720gacattgggg aattcagcga gagcctgacc tattgcatct cccgccgtgc acagggtgtc 3780acgttgcaag acctgcctga aaccgaactg cccgctgttc tgcagccggt cgcggaggcc 3840atggatgcga tcgctgcggc cgatcttagc cagacgagcg ggttcggccc attcggaccg 3900caaggaatcg gtcaatacac tacatggcgt gatttcatat gcgcgattgc tgatccccat 3960gtgtatcact ggcaaactgt gatggacgac accgtcagtg cgtccgtcgc gcaggctctc 4020gatgagctga tgctttgggc cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat 4080ttcggctcca acaatgtcct gacggacaat ggccgcataa cagcggtcat tgactggagc 4140gaggcgatgt tcggggattc ccaatacgag gtcgccaaca tcttcttctg gaggccgtgg 4200ttggcttgta tggagcagca gacgcgctac ttcgagcgga ggcatccgga gcttgcagga 4260tcgccgcggc tccgggcgta tatgctccgc attggtcttg accaactcta tcagagcttg 4320gttgacggca atttcgatga tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga 4380tccggagccg ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc 4440gatggctgtg tagaagtact cgccgatagt ggaaaccgac gccccagcac tcgtccgagg 4500gcaaaggaat agtgaggtac ctaaagaagg agtgcgtcga agcagatcgt tcaaacattt 4560ggcaataaag tttcttaaga ttgaatcctg 
ttgccggtct tgcgatgatt atcatataat 4620ttctgttgaa ttacgttaag catgtaataa ttaacatgta atgcatgacg ttatttatga 4680gatgggtttt tatgattaga gtcccgcaat tatacattta atacgcgata gaaaacaaaa 4740tatagcgcgc aaactaggat aaattatcgc gcgcggtgtc atctatgtta ctagatcgat 4800gtcgaatcga tcaacctgca ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt 4860attgggcgct cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg 4920cgagcggtat cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac 4980gcaggaaaga acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg 5040ttgctggcgt ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca 5100agtcagaggt ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc 5160tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc 5220ccttcgggaa gcgtggcgct ttctcaatgc tcacgctgta ggtatctcag ttcggtgtag 5280gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc 5340ttatccggta actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca 5400gcagccactg gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg 5460aagtggtggc ctaactacgg ctacactaga aggacagtat ttggtatctg cgctctgctg 5520aagccagtta ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct 5580ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa 5640gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa 5700gggattttgg tcatgacatt aacctataaa aataggcgta tcacgaggcc ctttcgtctc 5760gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca 5820gcttgtctgt aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt 5880ggcgggtgtc ggggctggct taactatgcg gcatcagagc agattgtact gagagtgcac 5940catatggaca tattgtcgtt agaacgcggc tacaattaat acataacctt atgtatcata 6000cacatacgat ttaggtgaca ctatagaacg gcgcgccaag cttgcatgcc tgcaggctag 6060cctaagtacg tactcaaaat gccaacaaat aaaaaaaaag ttgctttaat aatgccaaaa 6120caaattaata aaacacttac aacaccggat tttttttaat taaaatgtgc catttaggat 6180aaatagttaa tatttttaat aattatttaa aaagccgtat ctactaaaat gatttttatt 6240tggttgaaaa tattaatatg tttaaatcaa cacaatctat caaaattaaa ctaaaaaaaa 
6300aataagtgta cgtggttaac attagtacag taatataaga ggaaaatgag aaattaagaa 6360attgaaagcg agtctaattt ttaaattatg aacctgcata tataaaagga aagaaagaat 6420ccaggaagaa aagaaatgaa accatgcatg gtcccctcgt catcacgagt ttctgccatt 6480tgcaatagaa acactgaaac acctttctct ttgtcactta attgagatgc cgaagccacc 6540tcacaccatg aacttcatga ggtgtagcac ccaaggcttc catagccatg catactgaag 6600aatgtctcaa gctcagcacc ctacttctgt gacgtgtccc tcattcacct tcctctcttc 6660cctataaata accacgcctc aggttctccg cttcacaact caaacattct ctccattggt 6720ccttaaacac tcatcagtca tcaccgcggc catcacaagt ttgtacaaaa aagctgaacg 6780agaaacgtaa aatgatataa atatcaatat attaaattag attttgcata aaaaacagac 6840tacataatac tgtaaaacac aacatatcca gtcatattgg cggccgcatt aggcacccca 6900ggctttacac tttatgcttc cggctcgtat aatgtgtgga ttttgagtta ggatccgtcg 6960agattttcag gagctaagga agctaaaatg gagaaaaaaa tcactggata taccaccgtt 7020gatatatccc aatggcatcg taaagaacat tttgaggcat ttcagtcagt tgctcaatgt 7080acctataacc agaccgttca gctggatatt acggcctttt taaagaccgt aaagaaaaat 7140aagcacaagt tttatccggc ctttattcac attcttgccc gcctgatgaa tgctcatccg 7200gaattccgta tggcaatgaa agacggtgag ctggtgatat gggatagtgt tcacccttgt 7260tacaccgttt tccatgagca aactgaaacg ttttcatcgc tctggagtga ataccacgac 7320gatttccggc agtttctaca catatattcg caagatgtgg cgtgttacgg tgaaaacctg 7380gcctatttcc ctaaagggtt tattgagaat atgtttttcg tctcagccaa tccctgggtg 7440agtttcacca gttttgattt aaacgtggcc aatatggaca acttcttcgc ccccgttttc 7500accatgggca aatattatac gcaaggcgac aaggtgctga tgccgctggc gattcaggtt 7560catcatgccg tttgtgatgg cttccatgtc ggcagaatgc ttaatgaatt acaacagtac 7620tgcgatgagt ggcagggcgg ggcgtaaacg cgtggatccg gcttactaaa agccagataa 7680cagtatgcgt atttgcgcgc tgatttttgc ggtataagaa tatatactga tatgtatacc 7740cgaagtatgt caaaaagagg tatgctatga agcagcgtat tacagtgaca gttgacagcg 7800acagctatca gttgctcaag gcatatatga tgtcaatatc tccggtctgg taagcacaac 7860catgcagaat gaagcccgtc gtctgcgtgc cgaacgctgg aaagcggaaa atcaggaagg 7920gatggctgag gtcgcccggt ttattgaaat gaacggctct tttgctgacg agaacagggg 7980ctggtgaaat gcagtttaag gtttacacct 
ataaaagaga gagccgttat cgtctgtttg 8040tggatgtaca gagtgatatt attgacacgc ccgggcgacg gatggtgatc cccctggcca 8100gtgcacgtct gctgtcagat aaagtctccc gtgaacttta cccggtggtg catatcgggg 8160atgaaagctg gcgcatgatg accaccgata tggccagtgt gccggtctcc gttatcgggg 8220aagaagtggc tgatctcagc caccgcgaaa atgacatcaa aaacgccatt aacctgatgt 8280tctggggaat ataaatgtca ggctccctta tacacagcca gtctgcaggt cgaccatagt 8340gactggatat gttgtgtttt acagcattat gtagtctgtt ttttatgcaa aatctaattt 8400aatatattga tatttatatc attttacgtt tctcgttcag ctttcttgta caaagtggtg 8460at 8462813268DNAArtificial SequencePlasmid pKR92 8cgcgcctcga gtgggcggat cccccgggct gcaggaattc actggccgtc gttttacaac 60gtcgtgactg ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca catccccctt 120tcgccagctg gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa cagttgcgca 180gcctgaatgg cgaatggatc gatccatcgc gatgtacctt ttgttagtca gcctctcgat 240tgctcatcgt cattacacag taccgaagtt tgatcgatct agtaacatag atgacaccgc 300gcgcgataat ttatcctagt ttgcgcgcta tattttgttt tctatcgcgt attaaatgta 360taattgcggg actctaatca taaaaaccca tctcataaat aacgtcatgc attacatgtt 420aattattaca tgcttaacgt aattcaacag aaattatatg ataatcatcg caagaccggc 480aacaggattc aatcttaaga aactttattg ccaaatgttt gaacgatctg cttcgacgca 540ctccttcttt actccaccat ctcgtcctta ttgaaaacgt gggtagcacc aaaacgaatc 600aagtcgctgg aactgaagtt accaatcacg ctggatgatt tgccagttgg attaatcttg 660cctttccccg catgaataat attgatgaat gcatgcgtga ggggtagttc gatgttggca 720atagctgcaa ttgccgcgac atcctccaac gagcataatt cttcagaaaa atagcgatgt 780tccatgttgt cagggcatgc atgatgcacg ttatgaggtg acggtgctag gcagtattcc 840ctcaaagttt catagtcagt atcatattca tcattgcatt cctgcaagag agaattgaga 900cgcaatccac acgctgcggc aaccttccgg cgttcgtggt ctatttgctc ttggacgttg 960caaacgtaag tgttggatcg atccggggtg ggcgaagaac tccagcatga gatccccgcg 1020ctggaggatc atccagccgg cgtcccggaa aacgattccg aagcccaacc tttcatagaa 1080ggcggcggtg gaatcgaaat ctcgtgatgg caggttgggc gtcgcttggt cggtcatttc 1140gaaccccaga gtcccgctca gaagaactcg tcaagaaggc gatagaaggc gatgcgctgc 1200gaatcgggag cggcgatacc gtaaagcacg 
aggaagcggt cagcccattc gccgccaagc 1260tcttcagcaa tatcacgggt agccaacgct atgtcctgat agcggtccgc cacacccagc 1320cggccacagt cgatgaatcc agaaaagcgg ccattttcca ccatgatatt cggcaagcag 1380gcatcgccat gggtcacgac gagatcctcg ccgtcgggca tgcgcgcctt gagcctggcg 1440aacagttcgg ctggcgcgag cccctgatgc tcttcgtcca gatcatcctg atcgacaaga 1500ccggcttcca tccgagtacg tgctcgctcg atgcgatgtt tcgcttggtg gtcgaatggg 1560caggtagccg gatcaagcgt atgcagccgc cgcattgcat cagccatgat ggatactttc 1620tcggcaggag caaggtgaga tgacaggaga tcctgccccg gcacttcgcc caatagcagc 1680cagtcccttc ccgcttcagt gacaacgtcg agcacagctg cgcaaggaac gcccgtcgtg 1740gccagccacg atagccgcgc tgcctcgtcc tgcagttcat tcagggcacc ggacaggtcg 1800gtcttgacaa aaagaaccgg gcgcccctgc gctgacagcc ggaacacggc ggcatcagag 1860cagccgattg tctgttgtgc ccagtcatag ccgaatagcc tctccaccca agcggccgga 1920gaacctgcgt gcaatccatc ttgttcaatc atgcgaaacg atccccgcaa gcttggagac 1980tggtgatttc agcgtgtcct ctccaaatga aatgaacttc cttatataga ggaagggtct 2040tgcgaaggat agtgggattg tgcgtcatcc cttacgtcag tggagatatc acatcaatcc 2100acttgctttg aagacgtggt tggaacgtct tctttttcca cgatgctcct cgtgggtggg 2160ggtccatctt tgggaccact gtcggcagag gcatcttcaa cgatggcctt tcctttatcg 2220caatgatggc atttgtagga gccaccttcc ttttccacta tcttcacaat aaagtgacag 2280atagctgggc aatggaatcc gaggaggttt ccggatatta ccctttgttg aaaagtctca 2340attgcccttt ggtcttctga gactgtatct ttgatatttt tggagtagac aagcgtgtcg 2400tgctccacca tgttgacgaa gattttcttc ttgtcattga gtcgtaagag actctgtatg 2460aactgttcgc cagtctttac ggcgagttct gttaggtcct ctatttgaat ctttgactcc 2520atggcctttg attcagtggg aactaccttt ttagagactc caatctctat tacttgcctt 2580ggtttgtgaa gcaagccttg aatcgtccat actggaatag tacttctgat cttgagaaat 2640atatctttct ctgtgttctt gatgcagtta gtcctgaatc ttttgactgc atctttaacc 2700ttcttgggaa ggtatttgat ctcctggaga ttattgctcg ggtagatcgt cttgatgaga 2760cctgctgcgt aagcctctct aaccatctgt gggttagcat tctttctgaa attgaaaagg 2820ctaatcttct cattatcagt ggtgaacatg gtatcgtcac cttctccgtc gaacttcctg 2880actagatcgt agagatagag gaagtcgtcc attgtgatct ctggggcaaa ggagtctgaa 
2940ttaattcgat atggtggatt tatcacaaat gggacccgcc gccgacagag gtgtgatgtt 3000aggccaggac tttgaaaatt tgcgcaacta tcgtatagtg gccgacaaat tgacgccgag 3060ttgacagact gcctagcatt tgagtgaatt atgtgaggta atgggctaca ctgaattggt 3120agctcaaact gtcagtattt atgtatatga gtgtatattt tcgcataatc tcagaccaat 3180ctgaagatga aatgggtatc tgggaatggc gaaatcaagg catcgatcgt gaagtttctc 3240atctaagccc ccatttggac gtgaatgtag acacgtcgaa ataaagattt ccgaattaga 3300ataatttgtt tattgctttc gcctataaat acgacggatc gtaatttgtc gttttatcaa 3360aatgtacttt cattttataa taacgctgcg gacatctaca tttttgaatt gaaaaaaaat 3420tggtaattac tctttctttt tctccatatt gaccatcata ctcattgctg atccatgtag 3480atttcccgga catgaagcca tttacaattg aatatatcct gccgccgctg ccgctttgca 3540cccggtggag cttgcatgtt ggtttctacg cagaactgag ccggttaggc agataatttc 3600cattgagaac tgagccatgt gcaccttccc cccaacacgg tgagcgacgg ggcaacggag 3660tgatccacat gggactttta aacatcatcc gtcggatggc gttgcgagag aagcagtcga 3720tccgtgagat cagccgacgc accgggcagg cgcgcaacac gatcgcaaag tatttgaacg 3780caggtacaat cgagccgacg ttcacgcgga acgaccaagc aagctagctt taatgcggta 3840gtttatcaca gttaaattgc taacgcagtc aggcaccgtg tatgaaatct aacaatgcgc 3900tcatcgtcat cctcggcacc gtcaccctgg atgctgtagg cataggcttg gttatgccgg 3960tactgccggg cctcttgcgg gatatcgtcc attccgacag catcgccagt cactatggcg 4020tgctgctagc gctatatgcg ttgatgcaat ttctatgcgc acccgttctc ggagcactgt 4080ccgaccgctt tggccgccgc ccagtcctgc tcgcttcgct acttggagcc actatcgact 4140acgcgatcat ggcgaccaca cccgtcctgt ggtccaaccc ctccgctgct atagtgcagt 4200cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca agtcctaagt 4260tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt gttttagtcg 4320cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca agagcgccgc 4380cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga ccaaccaacg 4440ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca ccggcaccag 4500gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg acgttgtgac 4560agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca ttgccgagcg 4620catccaggag gccggcgcgg gcctgcgtag 
cctggcagag ccgtgggccg acaccaccac 4680gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg agcgttccct 4740aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg tgaagtttgg 4800cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga tcgaccagga 4860aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga ccctgtaccg 4920cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg gtgccttccg 4980tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac gccaagagga 5040acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac cgaagagatc 5100gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt ctcaaccgtg 5160cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg gccggccagc 5220ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt tgagtaaaac 5280agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca aatacgcaag 5340ggaacgcatg aagttatcgc tgtacttaac cagaaaggcg ggtcaggcaa gacgaccatc 5400gcaacccatc tagcccgcgc cctgcaactc gccggggccg atgttctgtt agtcgattcc 5460gatccccagg gcagtgcccg cgattgggcg gccgtgcggg aagatcaacc gctaaccgtt 5520gtcggcatcg accgcccgac gattgaccgc gacgtgaagg ccatcggccg gcgcgacttc 5580gtagtgatcg acggagcgcc ccaggcggcg gacttggctg tgtccgcgat caaggcagcc 5640gacttcgtgc tgattccggt gcagccaagc ccttacgaca tatgggccac cgccgacctg 5700gtggagctgg ttaagcagcg cattgaggtc acggatggaa ggctacaagc ggcctttgtc 5760gtgtcgcggg cgatcaaagg cacgcgcatc ggcggtgagg ttgccgaggc gctggccggg 5820tacgagctgc ccattcttga gtcccgtatc acgcagcgcg tgagctaccc aggcactgcc 5880gccgccggca caaccgttct tgaatcagaa cccgagggcg acgctgcccg cgaggtccag 5940gcgctggccg ctgaaattaa atcaaaactc atttgagtta atgaggtaaa gagaaaatga 6000gcaaaagcac aaacacgcta agtgccggcc gtccgagcgc acgcagcagc aaggctgcaa 6060cgttggccag cctggcagac acgccagcca tgaagcgggt caactttcag ttgccggcgg 6120aggatcacac caagctgaag atgtacgcgg tacgccaagg caagaccatt accgagctgc 6180tatctgaata catcgcgcag ctaccagagt aaatgagcaa atgaataaat gagtagatga 6240attttagcgg ctaaaggagg cggcatggaa aatcaagaac aaccaggcac cgacgccgtg 6300gaatgcccca tgtgtggagg aacgggcggt tggccaggcg taagcggctg ggttgtctgc 
6360cggccctgca atggcactgg aacccccaag cccgaggaat cggcgtgagc ggtcgcaaac 6420catccggccc ggtacaaatc ggcgcggcgc tgggtgatga cctggtggag aagttgaagg 6480ccgcgcaggc cgcccagcgg caacgcatcg aggcagaagc acgccccggt gaatcgtggc 6540aagcggccgc tgatcgaatc cgcaaagaat cccggcaacc gccggcagcc ggtgcgccgt 6600cgattaggaa gccgcccaag ggcgacgagc aaccagattt tttcgttccg atgctctatg 6660acgtgggcac ccgcgatagt cgcagcatca tggacgtggc cgttttccgt ctgtcgaagc 6720gtgaccgacg agctggcgag gtgatccgct acgagcttcc agacgggcac gtagaggttt 6780ccgcagggcc ggccggcatg gccagtgtgt gggattacga cctggtactg atggcggttt 6840cccatctaac cgaatccatg aaccgatacc gggaagggaa gggagacaag cccggccgcg 6900tgttccgtcc acacgttgcg gacgtactca agttctgccg gcgagccgat ggcggaaagc 6960agaaagacga cctggtagaa acctgcattc ggttaaacac cacgcacgtt gccatgcagc 7020gtacgaagaa ggccaagaac ggccgcctgg tgacggtatc cgagggtgaa gccttgatta 7080gccgctacaa gatcgtaaag agcgaaaccg ggcggccgga gtacatcgag atcgagctag 7140ctgattggat gtaccgcgag atcacagaag gcaagaaccc ggacgtgctg acggttcacc 7200ccgattactt tttgatcgat cccggcatcg gccgttttct ctaccgcctg gcacgccgcg 7260ccgcaggcaa ggcagaagcc agatggttgt tcaagacgat ctacgaacgc agtggcagcg 7320ccggagagtt caagaagttc tgtttcaccg tgcgcaagct gatcgggtca aatgacctgc 7380cggagtacga tttgaaggag gaggcggggc aggctggccc gatcctagtc atgcgctacc 7440gcaacctgat cgagggcgaa gcatccgccg gttcctaatg tacggagcag atgctagggc 7500aaattgccct agcaggggaa aaaggtcgaa aaggtctctt tcctgtggat agcacgtaca 7560ttgggaaccc aaagccgtac attgggaacc ggaacccgta cattgggaac ccaaagccgt 7620acattgggaa ccggtcacac atgtaagtga ctgatataaa agagaaaaaa ggcgattttt 7680ccgcctaaaa ctctttaaaa cttattaaaa ctcttaaaac ccgcctggcc tgtgcataac 7740tgtctggcca gcgcacagcc gaagagctgc aaaaagcgcc tacccttcgg tcgctgcgct 7800ccctacgccc cgccgcttcg cgtcggccta tcgcggccgc tggccgctca aaaatggctg 7860gcctacggcc aggcaatcta ccagggcgcg gacaagccgc gccgtcgcca ctcgaccgcc 7920ggcgcccaca tcaaggcacc ctgcctcgcg cgtttcggtg atgacggtga aaacctctga 7980cacatgcagc tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa 8040gcccgtcagg gcgcgtcagc gggtgttggc 
gggtgtcggg gcgcagccat gacccagtca 8100cgtagcgata gcggagtgta tactggctta actatgcggc atcagagcag attgtactga 8160gagtgcacca tatgcggtgt gaaataccgc acagatgcgt aaggagaaaa taccgcatca 8220ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 8280cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag 8340gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 8400tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc 8460agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 8520tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 8580cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 8640ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 8700ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 8760ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 8820ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct ctgctgaagc 8880cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 8940gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 9000atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 9060ttttggtcat gagattatca aaaaggatct tcacctagat ccttttaaat taaaaatgaa 9120gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac caatgcttaa 9180tcagtgaggc acctatctca gcgatctgtc tatttcgttc atccatagtt gcctgactcc 9240ccgtcgtgta gataactacg atacgggagg gcttaccatc tggccccagt gctgcaatga 9300taccgcgaga cccacgctca ccggctccag atttatcagc aataaaccag ccagccggaa 9360gggccgagcg cagaagtggt

cctgcaactt tatccgcctc catccagtct attaattgtt 9420gccgggaagc tagagtaagt agttcgccag ttaatagttt gcgcaacgtt gttgccattg 9480ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc tccggttccc 9540aacgatcaag gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt agctccttcg 9600gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg gttatggcag 9660cactgcataa ttctcttact gtcatgccat ccgtaagatg cttttctgtg actggtgagt 9720actcaaccaa gtcattctga gaatagtgta tgcggcgacc gagttgctct tgcccggcgt 9780caacacggga taataccgcg ccacatagca gaactttaaa agtgctcatc attggaaaag 9840acctgcaggg gggggggggc gctgaggtct gcctcgtgaa gaaggtgttg ctgactcata 9900ccaggcctga atcgccccat catccagcca gaaagtgagg gagccacggt tgatgagagc 9960tttgttgtag gtggaccagt tggtgatttt gaacttttgc tttgccacgg aacggtctgc 10020gttgtcggga agatgcgtga tctgatcctt caactcagca aaagttcgat ttattcaaca 10080aagccgccgt cccgtcaagt cagcgtaatg ctctgccagt gttacaacca attaaccaat 10140tctgattaga aaaactcatc gagcatcaaa tgaaactgca atttattcat atcaggatta 10200tcaataccat atttttgaaa aagccgtttc tgtaatgaag gagaaaactc accgaggcag 10260ttccatagga tggcaagatc ctggtatcgg tctgcgattc cgactcgtcc aacatcaata 10320caacctatta atttcccctc gtcaaaaata aggttatcaa gtgagaaatc accatgagtg 10380acgactgaat ccggtgagaa tggcaaaagc ttatgcattt ctttccagac ttgttcaaca 10440ggccagccat tacgctcgtc atcaaaatca ctcgcatcaa ccaaaccgtt attcattcgt 10500gattgcgcct gagcgagacg aaatacgcga tcgctgttaa aaggacaatt acaaacagga 10560atcgaatgca accggcgcag gaacactgcc agcgcatcaa caatattttc acctgaatca 10620ggatattctt ctaatacctg gaatgctgtt ttcccgggga tcgcagtggt gagtaaccat 10680gcatcatcag gagtacggat aaaatgcttg atggtcggaa gaggcataaa ttccgtcagc 10740cagtttagtc tgaccatctc atctgtaaca tcattggcaa cgctaccttt gccatgtttc 10800agaaacaact ctggcgcatc gggcttccca tacaatcgat agattgtcgc acctgattgc 10860ccgacattat cgcgagccca tttataccca tataaatcag catccatgtt ggaatttaat 10920cgcggcctcg agcaagacgt ttcccgttga atatggctca taacacccct tgtattactg 10980tttatgtaag cagacagttt tattgttcat gatgatatat ttttatcttg tgcaatgtaa 11040catcagagat tttgagacac aacgtggctt tccccccccc 
ccctgcaggt caattcggtc 11100gatatggcta ttacgaagaa ggctcgtgcg cggagtcccg tgaactttcc cacgcaacaa 11160gtgaaccgca ccgggtttgc cggaggccat ttcgttaaaa tgcgcagcca tggctgcttc 11220gtccagcatg gcgtaatact gatcctcgtc ttcggctggc ggtatattgc cgatgggctt 11280caaaagccgc cgtggttgaa ccagtctatc cattccaagg tagcgaactc gaccgcttcg 11340aagctcctcc atggtccacg ccgatgaatg acctcggcct tgtaaagacc gttgatcgct 11400tctgcgaggg cgttgtcgtg ctgtcgccga cgcttccgat agatggctcg atacctgctt 11460ctgccaaccg ctcggaatag cgaaaggaca cgtattgaac accgcgatcc gagtgatgca 11520ctaggccgcc atgagcggga cgccgatcat gatgagcctc ctcgagggca tcgaggacaa 11580agcctgcatg tgctgtccgg ctcgcccgcc atccgacaat gcgacgggcg aagacgtcga 11640tcacgaaggc cacgtagacg aagccctccc aagtggcgac ataagtacgg acatgcgcaa 11700aggctttccc ggtttgtcgc tgatggtgca agagacgctg aagcgcgatc cgatgcgcag 11760gcatctgttc gtcttccgcg gtcgtggcgg tggcctgatc aaggtcactc gccgaagagc 11820tgcatgattg gctcgaaacc gagcggggga aattgtcgcg cagttctccc gtcgccgagg 11880cgataaatta catgctcaag cgatgggatg gcattacgtc attcctcgat gacggcccga 11940tttgcctgac gaacaatgct gccgaacgaa cgctcagagg ctatgtactc ggcaggaagt 12000catggctgtt tgccggatcg gatcgttgtg ctgaacgtgc ggcgttcatg gcgacactga 12060tcatgagcgc caagctcaat aacatcgatc cgcaggcctg gcttgccgac gtccgcgccg 12120accttgcgga cgctccgatc agcaggcttg agcaacagct gccgtggaac tggacatcca 12180agacactgag tgctcaggcg gcctgacctg cggccttcac cggatactta ccccattatc 12240gcagattgcg atgaagcatc agcgtcattc agcaatcttg ccaaagtatg caggctcgcg 12300agaatcgacg tgcgaaaccg gctggttgcg ccaaagatcc gcttgcggag cggtcgaaca 12360ttcatgctgg gacttcaaga ggtcgagtag aggaagaacc ggaaaggttg caccggaaaa 12420tatgcgttcc tttggagagc gcctcatgga cgtgaacaaa tcgcccggac caaggatgcc 12480acggatacaa aagctcgcga agctcggtcc cgtgggtgtt ctgtcgtctc gttgtacaac 12540gaaatccatt cccattccgc gctcaagatg gcttcccctc ggcagttcat cagggctaaa 12600tcaatctagc cgacttgtcc ggtgaaatgg gctgcactcc aacagaaaca atcaaacaaa 12660catacacagc gacttattca cacgagctca aattacaacg gtatatatcc tgccagtcag 12720catcatcaca ccaaaagtta ggcccgaata gtttgaaatt agaaagctcg 
caattgaggt 12780ctacaggcca aattcgctct tagccgtaca atattactca ccggtgcgat gccccccatc 12840gtaggtgaag gtggaaatta atgatccatc ttgagaccac aggcccacaa cagctaccag 12900tttcctcaag ggtccaccaa aaacgtaagc gcttacgtac atggtcgata agaaaaggca 12960atttgtagat gttaacatcc aacgtcgctt tcagggatcg atccaatacg caaaccgcct 13020ctccccgcgc gttggccgat tcattaatgc agctggcacg acaggtttcc cgactggaaa 13080gcgggcagtg agcgcaacgc aattaatgtg agttagctca ctcattaggc accccaggct 13140ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata acaatttcac 13200acaggaaaca gctatgacca tgattacgcc aagcttgcat gcctgcaggt cgactctaga 13260ggatctgg 13268916490DNAArtificial SequencepKR1478 9cgcgccagat cctctagagt cgacctgcag gcatgcaagc ttggcgtaat catggtcata 60gctgtttcct gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag 120cataaagtgt aaagcctggg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg 180ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca 240acgcgcgggg agaggcggtt tgcgtattgg atcgatccct gaaagcgacg ttggatgtta 300acatctacaa attgcctttt cttatcgacc atgtacgtaa gcgcttacgt ttttggtgga 360cccttgagga aactggtagc tgttgtgggc ctgtggtctc aagatggatc attaatttcc 420accttcacct acgatggggg gcatcgcacc ggtgagtaat attgtacggc taagagcgaa 480tttggcctgt agacctcaat tgcgagcttt ctaatttcaa actattcggg cctaactttt 540ggtgtgatga tgctgactgg caggatatat accgttgtaa tttgagctcg tgtgaataag 600tcgctgtgta tgtttgtttg attgtttctg ttggagtgca gcccatttca ccggacaagt 660cggctagatt gatttagccc tgatgaactg ccgaggggaa gccatcttga gcgcggaatg 720ggaatggatt tcgttgtaca acgagacgac agaacaccca cgggaccgag cttcgcgagc 780ttttgtatcc gtggcatcct tggtccgggc gatttgttca cgtccatgag gcgctctcca 840aaggaacgca tattttccgg tgcaaccttt ccggttcttc ctctactcga cctcttgaag 900tcccagcatg aatgttcgac cgctccgcaa gcggatcttt ggcgcaacca gccggtttcg 960cacgtcgatt ctcgcgagcc tgcatacttt ggcaagattg ctgaatgacg ctgatgcttc 1020atcgcaatct gcgataatgg ggtaagtatc cggtgaaggc cgcaggtcag gccgcctgag 1080cactcagtgt cttggatgtc cagttccacg gcagctgttg ctcaagcctg ctgatcggag 1140cgtccgcaag gtcggcgcgg acgtcggcaa gccaggcctg 
cggatcgatg ttattgagct 1200tggcgctcat gatcagtgtc gccatgaacg ccgcacgttc agcacaacga tccgatccgg 1260caaacagcca tgacttcctg ccgagtacat agcctctgag cgttcgttcg gcagcattgt 1320tcgtcaggca aatcgggccg tcatcgagga atgacgtaat gccatcccat cgcttgagca 1380tgtaatttat cgcctcggcg acgggagaac tgcgcgacaa tttcccccgc tcggtttcga 1440gccaatcatg cagctcttcg gcgagtgacc ttgatcaggc caccgccacg accgcggaag 1500acgaacagat gcctgcgcat cggatcgcgc ttcagcgtct cttgcaccat cagcgacaaa 1560ccgggaaagc ctttgcgcat gtccgtactt atgtcgccac ttgggagggc ttcgtctacg 1620tggccttcgt gatcgacgtc ttcgcccgtc gcattgtcgg atggcgggcg agccggacag 1680cacatgcagg ctttgtcctc gatgccctcg aggaggctca tcatgatcgg cgtcccgctc 1740atggcggcct agtgcatcac tcggatcgcg gtgttcaata cgtgtccttt cgctattccg 1800agcggttggc agaagcaggt atcgagccat ctatcggaag cgtcggcgac agcacgacaa 1860cgccctcgca gaagcgatca acggtcttta caaggccgag gtcattcatc ggcgtggacc 1920atggaggagc ttcgaagcgg tcgagttcgc taccttggaa tggatagact ggttcaacca 1980cggcggcttt tgaagcccat cggcaatata ccgccagccg aagacgagga tcagtattac 2040gccatgctgg acgaagcagc catggctgcg cattttaacg aaatggcctc cggcaaaccc 2100ggtgcggttc acttgttgcg tgggaaagtt cacgggactc cgcgcacgag ccttcttcgt 2160aatagccata tcgaccgaat tgacctgcag gggggggggg gaaagccacg ttgtgtctca 2220aaatctctga tgttacattg cacaagataa aaatatatca tcatgaacaa taaaactgtc 2280tgcttacata aacagtaata caaggggtgt tatgagccat attcaacggg aaacgtcttg 2340ctcgaggccg cgattaaatt ccaacatgga tgctgattta tatgggtata aatgggctcg 2400cgataatgtc gggcaatcag gtgcgacaat ctatcgattg tatgggaagc ccgatgcgcc 2460agagttgttt ctgaaacatg gcaaaggtag cgttgccaat gatgttacag atgagatggt 2520cagactaaac tggctgacgg aatttatgcc tcttccgacc atcaagcatt ttatccgtac 2580tcctgatgat gcatggttac tcaccactgc gatccccggg aaaacagcat tccaggtatt 2640agaagaatat cctgattcag gtgaaaatat tgttgatgcg ctggcagtgt tcctgcgccg 2700gttgcattcg attcctgttt gtaattgtcc ttttaacagc gatcgcgtat ttcgtctcgc 2760tcaggcgcaa tcacgaatga ataacggttt ggttgatgcg agtgattttg atgacgagcg 2820taatggctgg cctgttgaac aagtctggaa agaaatgcat aagcttttgc cattctcacc 2880ggattcagtc 
gtcactcatg gtgatttctc acttgataac cttatttttg acgaggggaa 2940attaataggt tgtattgatg ttggacgagt cggaatcgca gaccgatacc aggatcttgc 3000catcctatgg aactgcctcg gtgagttttc tccttcatta cagaaacggc tttttcaaaa 3060atatggtatt gataatcctg atatgaataa attgcagttt catttgatgc tcgatgagtt 3120tttctaatca gaattggtta attggttgta acactggcag agcattacgc tgacttgacg 3180ggacggcggc tttgttgaat aaatcgaact tttgctgagt tgaaggatca gatcacgcat 3240cttcccgaca acgcagaccg ttccgtggca aagcaaaagt tcaaaatcac caactggtcc 3300acctacaaca aagctctcat caaccgtggc tccctcactt tctggctgga tgatggggcg 3360attcaggcct ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgccccccc 3420ccccctgcag gtcttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat 3480tatcccgtgt tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg 3540acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 3600aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 3660cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 3720gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca 3780cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc 3840tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 3900tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 3960ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 4020tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 4080gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 4140ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 4200tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa 4260agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa 4320aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc 4380cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt 4440agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc 4500tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac 4560gatagttacc ggataaggcg cagcggtcgg gctgaacggg 
gggttcgtgc acacagccca 4620gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta tgagaaagcg 4680ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag 4740gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt 4800ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat 4860ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc 4920acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt 4980gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag 5040cggaagagcg cctgatgcgg tattttctcc ttacgcatct gtgcggtatt tcacaccgca 5100tatggtgcac tctcagtaca atctgctctg atgccgcata gttaagccag tatacactcc 5160gctatcgcta cgtgactggg tcatggctgc gccccgacac ccgccaacac ccgctgacgc 5220gccctgacgg gcttgtctgc tcccggcatc cgcttacaga caagctgtga ccgtctccgg 5280gagctgcatg tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgaggc agggtgcctt 5340gatgtgggcg ccggcggtcg agtggcgacg gcgcggcttg tccgcgccct ggtagattgc 5400ctggccgtag gccagccatt tttgagcggc cagcggccgc gataggccga cgcgaagcgg 5460cggggcgtag ggagcgcagc gaccgaaggg taggcgcttt ttgcagctct tcggctgtgc 5520gctggccaga cagttatgca caggccaggc gggttttaag agttttaata agttttaaag 5580agttttaggc ggaaaaatcg ccttttttct cttttatatc agtcacttac atgtgtgacc 5640ggttcccaat gtacggcttt gggttcccaa tgtacgggtt ccggttccca atgtacggct 5700ttgggttccc aatgtacgtg ctatccacag gaaagagacc ttttcgacct ttttcccctg 5760ctagggcaat ttgccctagc atctgctccg tacattagga accggcggat gcttcgccct 5820cgatcaggtt gcggtagcgc atgactagga tcgggccagc ctgccccgcc tcctccttca 5880aatcgtactc cggcaggtca tttgacccga tcagcttgcg cacggtgaaa cagaacttct 5940tgaactctcc ggcgctgcca ctgcgttcgt agatcgtctt gaacaaccat ctggcttctg 6000ccttgcctgc ggcgcggcgt gccaggcggt agagaaaacg gccgatgccg ggatcgatca 6060aaaagtaatc ggggtgaacc gtcagcacgt ccgggttctt gccttctgtg atctcgcggt 6120acatccaatc agctagctcg atctcgatgt actccggccg cccggtttcg ctctttacga 6180tcttgtagcg gctaatcaag gcttcaccct cggataccgt caccaggcgg ccgttcttgg 6240ccttcttcgt acgctgcatg gcaacgtgcg tggtgtttaa ccgaatgcag gtttctacca 6300ggtcgtcttt 
ctgctttccg ccatcggctc gccggcagaa cttgagtacg tccgcaacgt 6360gtggacggaa cacgcggccg ggcttgtctc ccttcccttc ccggtatcgg ttcatggatt 6420cggttagatg ggaaaccgcc atcagtacca ggtcgtaatc ccacacactg gccatgccgg 6480ccggccctgc ggaaacctct acgtgcccgt ctggaagctc gtagcggatc acctcgccag 6540ctcgtcggtc acgcttcgac agacggaaaa cggccacgtc catgatgctg cgactatcgc 6600gggtgcccac gtcatagagc atcggaacga aaaaatctgg ttgctcgtcg cccttgggcg 6660gcttcctaat cgacggcgca ccggctgccg gcggttgccg ggattctttg cggattcgat 6720cagcggccgc ttgccacgat tcaccggggc gtgcttctgc ctcgatgcgt tgccgctggg 6780cggcctgcgc ggccttcaac ttctccacca ggtcatcacc cagcgccgcg ccgatttgta 6840ccgggccgga tggtttgcga ccgctcacgc cgattcctcg ggcttggggg ttccagtgcc 6900attgcagggc cggcagacaa cccagccgct tacgcctggc caaccgcccg ttcctccaca 6960catggggcat tccacggcgt cggtgcctgg ttgttcttga ttttccatgc cgcctccttt 7020agccgctaaa attcatctac tcatttattc atttgctcat ttactctggt agctgcgcga 7080tgtattcaga tagcagctcg gtaatggtct tgccttggcg taccgcgtac atcttcagct 7140tggtgtgatc ctccgccggc aactgaaagt tgacccgctt catggctggc gtgtctgcca 7200ggctggccaa cgttgcagcc ttgctgctgc gtgcgctcgg acggccggca cttagcgtgt 7260ttgtgctttt gctcattttc tctttacctc attaactcaa atgagttttg atttaatttc 7320agcggccagc gcctggacct cgcgggcagc gtcgccctcg ggttctgatt caagaacggt 7380tgtgccggcg gcggcagtgc ctgggtagct cacgcgctgc gtgatacggg actcaagaat 7440gggcagctcg tacccggcca gcgcctcggc aacctcaccg ccgatgcgcg tgcctttgat 7500cgcccgcgac acgacaaagg ccgcttgtag ccttccatcc gtgacctcaa tgcgctgctt 7560aaccagctcc accaggtcgg cggtggccca tatgtcgtaa gggcttggct gcaccggaat 7620cagcacgaag tcggctgcct tgatcgcgga cacagccaag tccgccgcct ggggcgctcc 7680gtcgatcact acgaagtcgc gccggccgat ggccttcacg tcgcggtcaa tcgtcgggcg 7740gtcgatgccg acaacggtta gcggttgatc ttcccgcacg gccgcccaat cgcgggcact 7800gccctgggga tcggaatcga ctaacagaac atcggccccg gcgagttgca gggcgcgggc 7860tagatgggtt gcgatggtcg tcttgcctga cccgcctttc tggttaagta cagcgataac 7920ttcatgcgtt cccttgcgta tttgtttatt tactcatcgc atcatatacg cagcgaccgc 7980atgacgcaag ctgttttact caaatacaca tcaccttttt 
agacggcggc gctcggtttc 8040ttcagcggcc aagctggccg gccaggccgc cagcttggca tcagacaaac cggccaggat 8100ttcatgcagc cgcacggttg agacgtgcgc gggcggctcg aacacgtacc cggccgcgat 8160catctccgcc tcgatctctt cggtaatgaa aaacggttcg tcctggccgt cctggtgcgg 8220tttcatgctt gttcctcttg gcgttcattc tcggcggccg ccagggcgtc ggcctcggtc 8280aatgcgtcct cacggaaggc accgcgccgc ctggcctcgg tgggcgtcac ttcctcgctg 8340cgctcaagtg cgcggtacag ggtcgagcga tgcacgccaa gcagtgcagc cgcctctttc 8400acggtgcggc cttcctggtc gatcagctcg cgggcgtgcg cgatctgtgc cggggtgagg 8460gtagggcggg ggccaaactt cacgcctcgg gccttggcgg cctcgcgccc gctccgggtg 8520cggtcgatga ttagggaacg ctcgaactcg gcaatgccgg cgaacacggt caacaccatg 8580cggccggccg gcgtggtggt gtcggcccac ggctctgcca ggctacgcag gcccgcgccg 8640gcctcctgga tgcgctcggc aatgtccagt aggtcgcggg tgctgcgggc caggcggtct 8700agcctggtca ctgtcacaac gtcgccaggg cgtaggtggt caagcatcct ggccagctcc 8760gggcggtcgc gcctggtgcc ggtgatcttc tcggaaaaca gcttggtgca gccggccgcg 8820tgcagttcgg cccgttggtt ggtcaagtcc tggtcgtcgg tgctgacgcg ggcatagccc 8880agcaggccag cggcggcgct cttgttcatg gcgtaatgtc tccggttcta gtcgcaagta 8940ttctacttta tgcgactaaa acacgcgaca agaaaacgcc aggaaaaggg cagggcggca 9000gcctgtcgcg taacttagga cttgtgcgac atgtcgtttt cagaagacgg ctgcactgaa 9060cgtcagaagc cgactgcact atagcagcgg aggggttgga ccacaggacg ggtgtggtcg 9120ccatgatcgc gtagtcgata gtggctccaa gtagcgaagc gagcaggact gggcggcggc 9180caaagcggtc ggacagtgct ccgagaacgg gtgcgcatag aaattgcatc aacgcatata 9240gcgctagcag cacgccatag tgactggcga tgctgtcgga atggacgata tcccgcaaga 9300ggcccggcag taccggcata accaagccta tgcctacagc atccagggtg acggtgccga 9360ggatgacgat gagcgcattg ttagatttca tacacggtgc ctgactgcgt tagcaattta 9420actgtgataa actaccgcat taaagctagc ttgcttggtc gttccgcgtg aacgtcggct 9480cgattgtacc tgcgttcaaa tactttgcga tcgtgttgcg cgcctgcccg gtgcgtcggc 9540tgatctcacg gatcgactgc ttctctcgca acgccatccg acggatgatg tttaaaagtc 9600ccatgtggat cactccgttg ccccgtcgct caccgtgttg gggggaaggt gcacatggct 9660cagttctcaa tggaaattat ctgcctaacc ggctcagttc tgcgtagaaa ccaacatgca 9720agctccaccg 
ggtgcaaagc ggcagcggcg gcaggatata ttcaattgta aatggcttca 9780tgtccgggaa atctacatgg atcagcaatg agtatgatgg tcaatatgga gaaaaagaaa 9840gagtaattac caattttttt tcaattcaaa aatgtagatg tccgcagcgt tattataaaa 9900tgaaagtaca ttttgataaa acgacaaatt acgatccgtc gtatttatag gcgaaagcaa 9960taaacaaatt attctaattc ggaaatcttt atttcgacgt gtctacattc acgtccaaat 10020gggggcttag atgagaaact tcacgatcga tgccttgatt tcgccattcc cagataccca 10080tttcatcttc agattggtct gagattatgc gaaaatatac actcatatac ataaatactg 10140acagtttgag ctaccaattc agtgtagccc attacctcac ataattcact caaatgctag 10200gcagtctgtc aactcggcgt caatttgtcg gccactatac gatagttgcg caaattttca 10260aagtcctggc ctaacatcac acctctgtcg gcggcgggtc ccatttgtga taaatccacc 10320atatcgaatt aattcagact cctttgcccc agagatcaca atggacgact tcctctatct 10380ctacgatcta gtcaggaagt tcgacggaga aggtgacgat accatgttca ccactgataa 10440tgagaagatt agccttttca atttcagaaa gaatgctaac ccacagatgg ttagagaggc 10500ttacgcagca ggtctcatca agacgatcta cccgagcaat aatctccagg agatcaaata 10560ccttcccaag aaggttaaag atgcagtcaa aagattcagg actaactgca tcaagaacac 10620agagaaagat atatttctca agatcagaag tactattcca gtatggacga ttcaaggctt 10680gcttcacaaa ccaaggcaag taatagagat tggagtctct aaaaaggtag ttcccactga 10740atcaaaggcc atggagtcaa agattcaaat agaggaccta acagaactcg ccgtaaagac 10800tggcgaacag ttcatacaga gtctcttacg actcaatgac aagaagaaaa tcttcgtcaa 10860catggtggag cacgacacgc ttgtctactc caaaaatatc aaagatacag tctcagaaga 10920ccaaagggca attgagactt ttcaacaaag ggtaatatcc ggaaacctcc tcggattcca 10980ttgcccagct atctgtcact ttattgtgaa gatagtggaa aaggaaggtg gctcctacaa 11040atgccatcat tgcgataaag gaaaggccat cgttgaagat gcctctgccg
acagtggtcc 11100caaagatgga cccccaccca cgaggagcat cgtggaaaaa gaagacgttc caaccacgtc 11160ttcaaagcaa gtggattgat gtgatatctc cactgacgta agggatgacg cacaatccca 11220ctatccttcg caagaccctt cctctatata aggaagttca tttcatttgg agaggacacg 11280ctgaaatcac cagtctccaa gcttgcgggg atcgtttcgc atgattgaac aagatggatt 11340gcacgcaggt tctccggccg cttgggtgga gaggctattc ggctatgact gggcacaaca 11400gacaatcggc tgctctgatg ccgccgtgtt ccggctgtca gcgcaggggc gcccggttct 11460ttttgtcaag accgacctgt ccggtgccct gaatgaactg caggacgagg cagcgcggct 11520atcgtggctg gccacgacgg gcgttccttg cgcagctgtg ctcgacgttg tcactgaagc 11580gggaagggac tggctgctat tgggcgaagt gccggggcag gatctcctgt catctcacct 11640tgctcctgcc gagaaagtat ccatcatggc tgatgcaatg cggcggctgc atacgcttga 11700tccggctacc tgcccattcg accaccaagc gaaacatcgc atcgagcgag cacgtactcg 11760gatggaagcc ggtcttgtcg atcaggatga tctggacgaa gagcatcagg ggctcgcgcc 11820agccgaactg ttcgccaggc tcaaggcgcg catgcccgac ggcgaggatc tcgtcgtgac 11880ccatggcgat gcctgcttgc cgaatatcat ggtggaaaat ggccgctttt ctggattcat 11940cgactgtggc cggctgggtg tggcggaccg ctatcaggac atagcgttgg ctacccgtga 12000tattgctgaa gagcttggcg gcgaatgggc tgaccgcttc ctcgtgcttt acggtatcgc 12060cgctcccgat tcgcagcgca tcgccttcta tcgccttctt gacgagttct tctgagcggg 12120actctggggt tcgaaatgac cgaccaagcg acgcccaacc tgccatcacg agatttcgat 12180tccaccgccg ccttctatga aaggttgggc ttcggaatcg ttttccggga cgccggctgg 12240atgatcctcc agcgcgggga tctcatgctg gagttcttcg cccaccccgg atcgatccaa 12300cacttacgtt tgcaacgtcc aagagcaaat agaccacgaa cgccggaagg ttgccgcagc 12360gtgtggattg cgtctcaatt ctctcttgca ggaatgcaat gatgaatatg atactgacta 12420tgaaactttg agggaatact gcctagcacc gtcacctcat aacgtgcatc atgcatgccc 12480tgacaacatg gaacatcgct atttttctga agaattatgc tcgttggagg atgtcgcggc 12540aattgcagct attgccaaca tcgaactacc cctcacgcat gcattcatca atattattca 12600tgcggggaaa ggcaagatta atccaactgg caaatcatcc agcgtgattg gtaacttcag 12660ttccagcgac ttgattcgtt ttggtgctac ccacgttttc aataaggacg agatggtgga 12720gtaaagaagg agtgcgtcga agcagatcgt tcaaacattt ggcaataaag tttcttaaga 
12780ttgaatcctg ttgccggtct tgcgatgatt atcatataat ttctgttgaa ttacgttaag 12840catgtaataa ttaacatgta atgcatgacg ttatttatga gatgggtttt tatgattaga 12900gtcccgcaat tatacattta atacgcgata gaaaacaaaa tatagcgcgc aaactaggat 12960aaattatcgc gcgcggtgtc atctatgtta ctagatcgat caaacttcgg tactgtgtaa 13020tgacgatgag caatcgagag gctgactaac aaaaggtaca tcgcgatgga tcgatccatt 13080cgccattcag gctgcgcaac tgttgggaag ggcgatcggt gcgggcctct tcgctattac 13140gccagctggc gaaaggggga tgtgctgcaa ggcgattaag ttgggtaacg ccagggtttt 13200cccagtcacg acgttgtaaa acgacggcca gtgaattcct gcagcccggg ggatccgccc 13260actcgaggcg cgccaagctt gcatgcctgc aggctagcct aagtacgtac tcaaaatgcc 13320aacaaataaa aaaaaagttg ctttaataat gccaaaacaa attaataaaa cacttacaac 13380accggatttt ttttaattaa aatgtgccat ttaggataaa tagttaatat ttttaataat 13440tatttaaaaa gccgtatcta ctaaaatgat ttttatttgg ttgaaaatat taatatgttt 13500aaatcaacac aatctatcaa aattaaacta aaaaaaaaat aagtgtacgt ggttaacatt 13560agtacagtaa tataagagga aaatgagaaa ttaagaaatt gaaagcgagt ctaattttta 13620aattatgaac ctgcatatat aaaaggaaag aaagaatcca ggaagaaaag aaatgaaacc 13680atgcatggtc ccctcgtcat cacgagtttc tgccatttgc aatagaaaca ctgaaacacc 13740tttctctttg tcacttaatt gagatgccga agccacctca caccatgaac ttcatgaggt 13800gtagcaccca aggcttccat agccatgcat actgaagaat gtctcaagct cagcacccta 13860cttctgtgac gtgtccctca ttcaccttcc tctcttccct ataaataacc acgcctcagg 13920ttctccgctt cacaactcaa acattctctc cattggtcct taaacactca tcagtcatca 13980ccgcggccat cacaagtttg tacaaaaaag ctgaacgaga aacgtaaaat gatataaata 14040tcaatatatt aaattagatt ttgcataaaa aacagactac ataatactgt aaaacacaac 14100atatccagtc atattggcgg ccgcattagg caccccaggc tttacacttt atgcttccgg 14160ctcgtataat gtgtggattt tgagttagga tccgtcgaga ttttcaggag ctaaggaagc 14220taaaatggag aaaaaaatca ctggatatac caccgttgat atatcccaat ggcatcgtaa 14280agaacatttt gaggcatttc agtcagttgc tcaatgtacc tataaccaga ccgttcagct 14340ggatattacg gcctttttaa agaccgtaaa gaaaaataag cacaagtttt atccggcctt 14400tattcacatt cttgcccgcc tgatgaatgc tcatccggaa ttccgtatgg caatgaaaga 
14460cggtgagctg gtgatatggg atagtgttca cccttgttac accgttttcc atgagcaaac 14520tgaaacgttt tcatcgctct ggagtgaata ccacgacgat ttccggcagt ttctacacat 14580atattcgcaa gatgtggcgt gttacggtga aaacctggcc tatttcccta aagggtttat 14640tgagaatatg tttttcgtct cagccaatcc ctgggtgagt ttcaccagtt ttgatttaaa 14700cgtggccaat atggacaact tcttcgcccc cgttttcacc atgggcaaat attatacgca 14760aggcgacaag gtgctgatgc cgctggcgat tcaggttcat catgccgttt gtgatggctt 14820ccatgtcggc agaatgctta atgaattaca acagtactgc gatgagtggc agggcggggc 14880gtaaacgcgt ggatccggct tactaaaagc cagataacag tatgcgtatt tgcgcgctga 14940tttttgcggt ataagaatat atactgatat gtatacccga agtatgtcaa aaagaggtat 15000gctatgaagc agcgtattac agtgacagtt gacagcgaca gctatcagtt gctcaaggca 15060tatatgatgt caatatctcc ggtctggtaa gcacaaccat gcagaatgaa gcccgtcgtc 15120tgcgtgccga acgctggaaa gcggaaaatc aggaagggat ggctgaggtc gcccggttta 15180ttgaaatgaa cggctctttt gctgacgaga acaggggctg gtgaaatgca gtttaaggtt 15240tacacctata aaagagagag ccgttatcgt ctgtttgtgg atgtacagag tgatattatt 15300gacacgcccg ggcgacggat ggtgatcccc ctggccagtg cacgtctgct gtcagataaa 15360gtctcccgtg aactttaccc ggtggtgcat atcggggatg aaagctggcg catgatgacc 15420accgatatgg ccagtgtgcc ggtctccgtt atcggggaag aagtggctga tctcagccac 15480cgcgaaaatg acatcaaaaa cgccattaac ctgatgttct ggggaatata aatgtcaggc 15540tcccttatac acagccagtc tgcaggtcga ccatagtgac tggatatgtt gtgttttaca 15600gcattatgta gtctgttttt tatgcaaaat ctaatttaat atattgatat ttatatcatt 15660ttacgtttct cgttcagctt tcttgtacaa agtggtgatg gccgcatttc gcaccaaatc 15720aatgaaagta ataatgaaaa gtctgaataa gaatacttag gcttagatgc ctttgttact 15780tgtgtaaaat aacttgagtc atgtaccttt ggcggaaaca gaataaataa aaggtgaaat 15840tccaatgctc tatgtataag ttagtaatac ttaatgtgtt ctacggttgt ttcaatatca 15900tcaaactcta attgaaactt tagaaccaca aatctcaatc ttttcttaat gaaatgaaaa 15960atcttaattg taccatgttt atgttaaaca ccttacaatt ggttggagag gaggaccaac 16020cgatgggaca acattgggag aaagagattc aatggagatt tggataggag aacaacattc 16080tttttcactt caatacaaga tgagtgcaac actaaggata tgtatgagac tttcagaagc 
16140tacgacaaca tagatgagtg aggtggtgat tcctagcaag aaagacatta gaggaagcca 16200aaatcgaaca aggaagacat caagggcaag agacaggacc atccatctca ggaaaaggag 16260ctttgggata gtccgagaag ttgtacaaga aattttttgg agggtgagtg atgcattgct 16320ggtgacttta actcaatcaa aattgagaaa gaaagaaaag ggagggggct cacatgtgaa 16380tagaagggaa acgggagaat tttacagttt tgatctaatg ggcatcccag ctagtggtaa 16440catattcacc atgtttaacc ttcacgtacg tctagaggat ccgtcgacgg 164901049DNAartificial sequenceSaiff and genomic DNA of lo17849 10gaaggctcta agctgtgttg taggcttctt agcattcatt tctgtttgc 491130DNAartificial sequenceprimer 11caccatggtt gttgtgtctc ttcttcctcg 301223DNAartificial sequenceprimer 12tcaaattgat ttagtttctc cag 23132988DNAartificial sequencevector 13aagggtgggc gcgccgaccc agctttcttg tacaaagttg gcattataag aaagcattgc 60ttatcaattt gttgcaacga acaggtcact atcagtcaaa ataaaatcat tatttgccat 120ccagctgata tcccctatag tgagtcgtat tacatggtca tagctgtttc ctggcagctc 180tggcccgtgt ctcaaaatct ctgatgttac attgcacaag ataaaaatat atcatcatga 240acaataaaac tgtctgctta cataaacagt aatacaaggg gtgttatgag ccatattcaa 300cgggaaacgt cgaggccgcg attaaattcc aacatggatg ctgatttata tgggtataaa 360tgggctcgcg ataatgtcgg gcaatcaggt gcgacaatct atcgcttgta tgggaagccc 420gatgcgccag agttgtttct gaaacatggc aaaggtagcg ttgccaatga tgttacagat 480gagatggtca gactaaactg gctgacggaa tttatgcctc ttccgaccat caagcatttt 540atccgtactc ctgatgatgc atggttactc accactgcga tccccggaaa aacagcattc 600caggtattag aagaatatcc tgattcaggt gaaaatattg ttgatgcgct ggcagtgttc 660ctgcgccggt tgcattcgat tcctgtttgt aattgtcctt ttaacagcga tcgcgtattt 720cgtctcgctc aggcgcaatc acgaatgaat aacggtttgg ttgatgcgag tgattttgat 780gacgagcgta atggctggcc tgttgaacaa gtctggaaag aaatgcataa acttttgcca 840ttctcaccgg attcagtcgt cactcatggt gatttctcac ttgataacct tatttttgac 900gaggggaaat taataggttg tattgatgtt ggacgagtcg gaatcgcaga ccgataccag 960gatcttgcca tcctatggaa ctgcctcggt gagttttctc cttcattaca gaaacggctt 1020tttcaaaaat atggtattga taatcctgat atgaataaat tgcagtttca tttgatgctc 1080gatgagtttt tctaatcaga attggttaat tggttgtaac 
actggcagag cattacgctg 1140acttgacggg acggcgcaag ctcatgacca aaatccctta acgtgagtta cgcgtcgttc 1200cactgagcgt cagaccccgt agaaaagatc aaaggatctt cttgagatcc tttttttctg 1260cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg 1320gatcaagagc taccaactct ttttccgaag gtaactggct tcagcagagc gcagatacca 1380aatactgtcc ttctagtgta gccgtagtta ggccaccact tcaagaactc tgtagcaccg 1440cctacatacc tcgctctgct aatcctgtta ccagtggctg ctgccagtgg cgataagtcg 1500tgtcttaccg ggttggactc aagacgatag ttaccggata aggcgcagcg gtcgggctga 1560acggggggtt cgtgcacaca gcccagcttg gagcgaacga cctacaccga actgagatac 1620ctacagcgtg agcattgaga aagcgccacg cttcccgaag ggagaaaggc ggacaggtat 1680ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg agcttccagg gggaaacgcc 1740tggtatcttt atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg atttttgtga 1800tgctcgtcag gggggcggag cctatggaaa aacgccagca acgcggcctt tttacggttc 1860ctggcctttt gctggccttt tgctcacatg ttctttcctg cgttatcccc tgattctgtg 1920gataaccgta ttaccgcctt tgagtgagct gataccgctc gccgcagccg aacgaccgag 1980cgcagcgagt cagtgagcga ggaagcggaa gagcgcccaa tacgcaaacc gcctctcccc 2040gcgcgttggc cgattcatta atgcagctgg cacgacaggt ttcccgactg gaaagcgggc 2100agtgagcgca acgcaattaa tacgcgtacc gctagccagg aagagtttgt agaaacgcaa 2160aaaggccatc cgtcaggatg gccttctgct tagtttgatg cctggcagtt tatggcgggc 2220gtcctgcccg ccaccctccg ggccgttgct tcacaacgtt caaatccgct cccggcggat 2280ttgtcctact caggagagcg ttcaccgaca aacaacagat aaaacgaaag gcccagtctt 2340ccgactgagc ctttcgtttt atttgatgcc tggcagttcc ctactctcgc gttaacgcta 2400gcatggatgt tttcccagtc acgacgttgt aaaacgacgg ccagtcttaa gctcgggccc 2460caaataatga ttttattttg actgatagtg acctgttcgt tgcaacaaat tgatgagcaa 2520tgctttttta taatgccaac tttgtacaaa aaagcaggct ccgcggccgc ccccttcacc 2580atggttgttg tgtctcttct tcctcgaatc tcgatcgtta catcaccggg ttctagcctt 2640cacgatgtgc ttttgagcat gagatttggt ttgacgcgac atctccctct caaacgatct 2700ttctccaatt attcaatcac ttccgtatct ccagaacaac agctcaaatc tccggtgacc 2760atggcgacga ccgagagcaa gaatcttgta gaagcttcca aggaggagac aaacaagaag 2820gagacagaag 
ataagaagga ggtgggagtt tcggttcctc caccgccaga gaaaccagag 2880cctggcgatt gttgcggtag cggttgcgtc cgatgcgttt gggatgttta ttacgatgag 2940ctcgaagatt acaacaagca gctttctgga gaaactaaat caatttga 29881415279DNAartificial sequencevector 14acccagcttt cttgtacaaa gtggtgatgg ccgcatttcg caccaaatca atgaaagtaa 60taatgaaaag tctgaataag aatacttagg cttagatgcc tttgttactt gtgtaaaata 120acttgagtca tgtacctttg gcggaaacag aataaataaa aggtgaaatt ccaatgctct 180atgtataagt tagtaatact taatgtgttc tacggttgtt tcaatatcat caaactctaa 240ttgaaacttt agaaccacaa atctcaatct tttcttaatg aaatgaaaaa tcttaattgt 300accatgttta tgttaaacac cttacaattg gttggagagg aggaccaacc gatgggacaa 360cattgggaga aagagattca atggagattt ggataggaga acaacattct ttttcacttc 420aatacaagat gagtgcaaca ctaaggatat gtatgagact ttcagaagct acgacaacat 480agatgagtga ggtggtgatt cctagcaaga aagacattag aggaagccaa aatcgaacaa 540ggaagacatc aagggcaaga gacaggacca tccatctcag gaaaaggagc tttgggatag 600tccgagaagt tgtacaagaa attttttgga gggtgagtga tgcattgctg gtgactttaa 660ctcaatcaaa attgagaaag aaagaaaagg gagggggctc acatgtgaat agaagggaaa 720cgggagaatt ttacagtttt gatctaatgg gcatcccagc tagtggtaac atattcacca 780tgtttaacct tcacgtacgt ctagaggatc cgtcgacggc gcgccagatc ctctagagtc 840gacctgcagg catgcaagct tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg 900ttatccgctc acaattccac acaacatacg agccggaagc ataaagtgta aagcctgggg 960tgcctaatga gtgagctaac tcacattaat tgcgttgcgc tcactgcccg ctttccagtc 1020gggaaacctg tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga gaggcggttt 1080gcgtattgga tcgatccctg aaagcgacgt tggatgttaa catctacaaa ttgccttttc 1140ttatcgacca tgtacgtaag cgcttacgtt tttggtggac ccttgaggaa actggtagct 1200gttgtgggcc tgtggtctca agatggatca ttaatttcca ccttcaccta cgatgggggg 1260catcgcaccg gtgagtaata ttgtacggct aagagcgaat ttggcctgta gacctcaatt 1320gcgagctttc taatttcaaa ctattcgggc ctaacttttg gtgtgatgat gctgactggc 1380aggatatata ccgttgtaat ttgagctcgt gtgaataagt cgctgtgtat gtttgtttga 1440ttgtttctgt tggagtgcag cccatttcac cggacaagtc ggctagattg atttagccct 1500gatgaactgc cgaggggaag ccatcttgag 
cgcggaatgg gaatggattt cgttgtacaa 1560cgagacgaca gaacacccac gggaccgagc ttcgcgagct tttgtatccg tggcatcctt 1620ggtccgggcg atttgttcac gtccatgagg cgctctccaa aggaacgcat attttccggt 1680gcaacctttc cggttcttcc tctactcgac ctcttgaagt cccagcatga atgttcgacc 1740gctccgcaag cggatctttg gcgcaaccag ccggtttcgc acgtcgattc tcgcgagcct 1800gcatactttg gcaagattgc tgaatgacgc tgatgcttca tcgcaatctg cgataatggg 1860gtaagtatcc ggtgaaggcc gcaggtcagg ccgcctgagc actcagtgtc ttggatgtcc 1920agttccacgg cagctgttgc tcaagcctgc tgatcggagc gtccgcaagg tcggcgcgga 1980cgtcggcaag ccaggcctgc ggatcgatgt tattgagctt ggcgctcatg atcagtgtcg 2040ccatgaacgc cgcacgttca gcacaacgat ccgatccggc aaacagccat gacttcctgc 2100cgagtacata gcctctgagc gttcgttcgg cagcattgtt cgtcaggcaa atcgggccgt 2160catcgaggaa tgacgtaatg ccatcccatc gcttgagcat gtaatttatc gcctcggcga 2220cgggagaact gcgcgacaat ttcccccgct cggtttcgag ccaatcatgc agctcttcgg 2280cgagtgacct tgatcaggcc accgccacga ccgcggaaga cgaacagatg cctgcgcatc 2340ggatcgcgct tcagcgtctc ttgcaccatc agcgacaaac cgggaaagcc tttgcgcatg 2400tccgtactta tgtcgccact tgggagggct tcgtctacgt ggccttcgtg atcgacgtct 2460tcgcccgtcg cattgtcgga tggcgggcga gccggacagc acatgcaggc tttgtcctcg 2520atgccctcga ggaggctcat catgatcggc gtcccgctca tggcggccta gtgcatcact 2580cggatcgcgg tgttcaatac gtgtcctttc gctattccga gcggttggca gaagcaggta 2640tcgagccatc tatcggaagc gtcggcgaca gcacgacaac gccctcgcag aagcgatcaa 2700cggtctttac aaggccgagg tcattcatcg gcgtggacca tggaggagct tcgaagcggt 2760cgagttcgct accttggaat ggatagactg gttcaaccac ggcggctttt gaagcccatc 2820ggcaatatac cgccagccga agacgaggat cagtattacg ccatgctgga cgaagcagcc 2880atggctgcgc attttaacga aatggcctcc ggcaaacccg gtgcggttca cttgttgcgt 2940gggaaagttc acgggactcc gcgcacgagc cttcttcgta atagccatat cgaccgaatt 3000gacctgcagg gggggggggg aaagccacgt tgtgtctcaa aatctctgat gttacattgc 3060acaagataaa aatatatcat catgaacaat aaaactgtct gcttacataa acagtaatac 3120aaggggtgtt atgagccata ttcaacggga aacgtcttgc tcgaggccgc gattaaattc 3180caacatggat gctgatttat atgggtataa atgggctcgc gataatgtcg ggcaatcagg 
3240tgcgacaatc tatcgattgt atgggaagcc cgatgcgcca gagttgtttc tgaaacatgg 3300caaaggtagc gttgccaatg atgttacaga tgagatggtc agactaaact ggctgacgga 3360atttatgcct cttccgacca tcaagcattt tatccgtact cctgatgatg catggttact 3420caccactgcg atccccggga aaacagcatt ccaggtatta gaagaatatc ctgattcagg 3480tgaaaatatt gttgatgcgc tggcagtgtt cctgcgccgg ttgcattcga ttcctgtttg 3540taattgtcct tttaacagcg atcgcgtatt tcgtctcgct caggcgcaat cacgaatgaa 3600taacggtttg gttgatgcga gtgattttga tgacgagcgt aatggctggc ctgttgaaca 3660agtctggaaa gaaatgcata agcttttgcc attctcaccg gattcagtcg tcactcatgg 3720tgatttctca cttgataacc ttatttttga cgaggggaaa ttaataggtt gtattgatgt 3780tggacgagtc ggaatcgcag accgatacca ggatcttgcc atcctatgga actgcctcgg 3840tgagttttct ccttcattac agaaacggct ttttcaaaaa tatggtattg ataatcctga 3900tatgaataaa ttgcagtttc atttgatgct cgatgagttt ttctaatcag aattggttaa 3960ttggttgtaa cactggcaga gcattacgct gacttgacgg gacggcggct ttgttgaata 4020aatcgaactt ttgctgagtt gaaggatcag atcacgcatc ttcccgacaa cgcagaccgt 4080tccgtggcaa agcaaaagtt caaaatcacc aactggtcca cctacaacaa agctctcatc 4140aaccgtggct ccctcacttt ctggctggat gatggggcga ttcaggcctg gtatgagtca 4200gcaacacctt cttcacgagg cagacctcag cgcccccccc cccctgcagg tcttttccaa 4260tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt atcccgtgtt gacgccgggc 4320aagagcaact cggtcgccgc atacactatt ctcagaatga cttggttgag tactcaccag 4380tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt gctgccataa 4440ccatgagtga taacactgcg gccaacttac ttctgacaac gatcggagga ccgaaggagc 4500taaccgcttt tttgcacaac atgggggatc atgtaactcg ccttgatcgt tgggaaccgg 4560agctgaatga agccatacca aacgacgagc gtgacaccac gatgcctgta gcaatggcaa 4620caacgttgcg caaactatta actggcgaac tacttactct agcttcccgg caacaattaa 4680tagactggat ggaggcggat aaagttgcag gaccacttct gcgctcggcc cttccggctg 4740gctggtttat tgctgataaa tctggagccg gtgagcgtgg gtctcgcggt atcattgcag 4800cactggggcc agatggtaag ccctcccgta tcgtagttat ctacacgacg gggagtcagg 4860caactatgga tgaacgaaat agacagatcg ctgagatagg tgcctcactg attaagcatt 4920ggtaactgtc agaccaagtt tactcatata 
tactttagat tgatttaaaa cttcattttt 4980aatttaaaag gatctaggtg aagatccttt ttgataatct catgaccaaa atcccttaac 5040gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag 5100atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg 5160tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact ggcttcagca 5220gagcgcagat accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga 5280actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg gctgctgcca 5340gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg gataaggcgc 5400agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca 5460ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc gaagggagaa 5520aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc 5580cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc tgacttgagc 5640gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg 5700cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat 5760cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc gctcgccgca 5820gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc ctgatgcggt 5880attttctcct tacgcatctg tgcggtattt cacaccgcat atggtgcact ctcagtacaa 5940tctgctctga tgccgcatag ttaagccagt atacactccg ctatcgctac gtgactgggt 6000catggctgcg ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct 6060cccggcatcc gcttacagac aagctgtgac cgtctccggg agctgcatgt gtcagaggtt 6120ttcaccgtca tcaccgaaac gcgcgaggca gggtgccttg atgtgggcgc cggcggtcga 6180gtggcgacgg cgcggcttgt ccgcgccctg gtagattgcc tggccgtagg ccagccattt 6240ttgagcggcc agcggccgcg ataggccgac gcgaagcggc ggggcgtagg
gagcgcagcg 6300accgaagggt aggcgctttt tgcagctctt cggctgtgcg ctggccagac agttatgcac 6360aggccaggcg ggttttaaga gttttaataa gttttaaaga gttttaggcg gaaaaatcgc 6420cttttttctc ttttatatca gtcacttaca tgtgtgaccg gttcccaatg tacggctttg 6480ggttcccaat gtacgggttc cggttcccaa tgtacggctt tgggttccca atgtacgtgc 6540tatccacagg aaagagacct tttcgacctt tttcccctgc tagggcaatt tgccctagca 6600tctgctccgt acattaggaa ccggcggatg cttcgccctc gatcaggttg cggtagcgca 6660tgactaggat cgggccagcc tgccccgcct cctccttcaa atcgtactcc ggcaggtcat 6720ttgacccgat cagcttgcgc acggtgaaac agaacttctt gaactctccg gcgctgccac 6780tgcgttcgta gatcgtcttg aacaaccatc tggcttctgc cttgcctgcg gcgcggcgtg 6840ccaggcggta gagaaaacgg ccgatgccgg gatcgatcaa aaagtaatcg gggtgaaccg 6900tcagcacgtc cgggttcttg ccttctgtga tctcgcggta catccaatca gctagctcga 6960tctcgatgta ctccggccgc ccggtttcgc tctttacgat cttgtagcgg ctaatcaagg 7020cttcaccctc ggataccgtc accaggcggc cgttcttggc cttcttcgta cgctgcatgg 7080caacgtgcgt ggtgtttaac cgaatgcagg tttctaccag gtcgtctttc tgctttccgc 7140catcggctcg ccggcagaac ttgagtacgt ccgcaacgtg tggacggaac acgcggccgg 7200gcttgtctcc cttcccttcc cggtatcggt tcatggattc ggttagatgg gaaaccgcca 7260tcagtaccag gtcgtaatcc cacacactgg ccatgccggc cggccctgcg gaaacctcta 7320cgtgcccgtc tggaagctcg tagcggatca cctcgccagc tcgtcggtca cgcttcgaca 7380gacggaaaac ggccacgtcc atgatgctgc gactatcgcg ggtgcccacg tcatagagca 7440tcggaacgaa aaaatctggt tgctcgtcgc ccttgggcgg cttcctaatc gacggcgcac 7500cggctgccgg cggttgccgg gattctttgc ggattcgatc agcggccgct tgccacgatt 7560caccggggcg tgcttctgcc tcgatgcgtt gccgctgggc ggcctgcgcg gccttcaact 7620tctccaccag gtcatcaccc agcgccgcgc cgatttgtac cgggccggat ggtttgcgac 7680cgctcacgcc gattcctcgg gcttgggggt tccagtgcca ttgcagggcc ggcagacaac 7740ccagccgctt acgcctggcc aaccgcccgt tcctccacac atggggcatt ccacggcgtc 7800ggtgcctggt tgttcttgat tttccatgcc gcctccttta gccgctaaaa ttcatctact 7860catttattca tttgctcatt tactctggta gctgcgcgat gtattcagat agcagctcgg 7920taatggtctt gccttggcgt accgcgtaca tcttcagctt ggtgtgatcc tccgccggca 7980actgaaagtt gacccgcttc 
atggctggcg tgtctgccag gctggccaac gttgcagcct 8040tgctgctgcg tgcgctcgga cggccggcac ttagcgtgtt tgtgcttttg ctcattttct 8100ctttacctca ttaactcaaa tgagttttga tttaatttca gcggccagcg cctggacctc 8160gcgggcagcg tcgccctcgg gttctgattc aagaacggtt gtgccggcgg cggcagtgcc 8220tgggtagctc acgcgctgcg tgatacggga ctcaagaatg ggcagctcgt acccggccag 8280cgcctcggca acctcaccgc cgatgcgcgt gcctttgatc gcccgcgaca cgacaaaggc 8340cgcttgtagc cttccatccg tgacctcaat gcgctgctta accagctcca ccaggtcggc 8400ggtggcccat atgtcgtaag ggcttggctg caccggaatc agcacgaagt cggctgcctt 8460gatcgcggac acagccaagt ccgccgcctg gggcgctccg tcgatcacta cgaagtcgcg 8520ccggccgatg gccttcacgt cgcggtcaat cgtcgggcgg tcgatgccga caacggttag 8580cggttgatct tcccgcacgg ccgcccaatc gcgggcactg ccctggggat cggaatcgac 8640taacagaaca tcggccccgg cgagttgcag ggcgcgggct agatgggttg cgatggtcgt 8700cttgcctgac ccgcctttct ggttaagtac agcgataact tcatgcgttc ccttgcgtat 8760ttgtttattt actcatcgca tcatatacgc agcgaccgca tgacgcaagc tgttttactc 8820aaatacacat caccttttta gacggcggcg ctcggtttct tcagcggcca agctggccgg 8880ccaggccgcc agcttggcat cagacaaacc ggccaggatt tcatgcagcc gcacggttga 8940gacgtgcgcg ggcggctcga acacgtaccc ggccgcgatc atctccgcct cgatctcttc 9000ggtaatgaaa aacggttcgt cctggccgtc ctggtgcggt ttcatgcttg ttcctcttgg 9060cgttcattct cggcggccgc cagggcgtcg gcctcggtca atgcgtcctc acggaaggca 9120ccgcgccgcc tggcctcggt gggcgtcact tcctcgctgc gctcaagtgc gcggtacagg 9180gtcgagcgat gcacgccaag cagtgcagcc gcctctttca cggtgcggcc ttcctggtcg 9240atcagctcgc gggcgtgcgc gatctgtgcc ggggtgaggg tagggcgggg gccaaacttc 9300acgcctcggg ccttggcggc ctcgcgcccg ctccgggtgc ggtcgatgat tagggaacgc 9360tcgaactcgg caatgccggc gaacacggtc aacaccatgc ggccggccgg cgtggtggtg 9420tcggcccacg gctctgccag gctacgcagg cccgcgccgg cctcctggat gcgctcggca 9480atgtccagta ggtcgcgggt gctgcgggcc aggcggtcta gcctggtcac tgtcacaacg 9540tcgccagggc gtaggtggtc aagcatcctg gccagctccg ggcggtcgcg cctggtgccg 9600gtgatcttct cggaaaacag cttggtgcag ccggccgcgt gcagttcggc ccgttggttg 9660gtcaagtcct ggtcgtcggt gctgacgcgg gcatagccca gcaggccagc 
ggcggcgctc 9720ttgttcatgg cgtaatgtct ccggttctag tcgcaagtat tctactttat gcgactaaaa 9780cacgcgacaa gaaaacgcca ggaaaagggc agggcggcag cctgtcgcgt aacttaggac 9840ttgtgcgaca tgtcgttttc agaagacggc tgcactgaac gtcagaagcc gactgcacta 9900tagcagcgga ggggttggac cacaggacgg gtgtggtcgc catgatcgcg tagtcgatag 9960tggctccaag tagcgaagcg agcaggactg ggcggcggcc aaagcggtcg gacagtgctc 10020cgagaacggg tgcgcataga aattgcatca acgcatatag cgctagcagc acgccatagt 10080gactggcgat gctgtcggaa tggacgatat cccgcaagag gcccggcagt accggcataa 10140ccaagcctat gcctacagca tccagggtga cggtgccgag gatgacgatg agcgcattgt 10200tagatttcat acacggtgcc tgactgcgtt agcaatttaa ctgtgataaa ctaccgcatt 10260aaagctagct tgcttggtcg ttccgcgtga acgtcggctc gattgtacct gcgttcaaat 10320actttgcgat cgtgttgcgc gcctgcccgg tgcgtcggct gatctcacgg atcgactgct 10380tctctcgcaa cgccatccga cggatgatgt ttaaaagtcc catgtggatc actccgttgc 10440cccgtcgctc accgtgttgg ggggaaggtg cacatggctc agttctcaat ggaaattatc 10500tgcctaaccg gctcagttct gcgtagaaac caacatgcaa gctccaccgg gtgcaaagcg 10560gcagcggcgg caggatatat tcaattgtaa atggcttcat gtccgggaaa tctacatgga 10620tcagcaatga gtatgatggt caatatggag aaaaagaaag agtaattacc aatttttttt 10680caattcaaaa atgtagatgt ccgcagcgtt attataaaat gaaagtacat tttgataaaa 10740cgacaaatta cgatccgtcg tatttatagg cgaaagcaat aaacaaatta ttctaattcg 10800gaaatcttta tttcgacgtg tctacattca cgtccaaatg ggggcttaga tgagaaactt 10860cacgatcgat gccttgattt cgccattccc agatacccat ttcatcttca gattggtctg 10920agattatgcg aaaatataca ctcatataca taaatactga cagtttgagc taccaattca 10980gtgtagccca ttacctcaca taattcactc aaatgctagg cagtctgtca actcggcgtc 11040aatttgtcgg ccactatacg atagttgcgc aaattttcaa agtcctggcc taacatcaca 11100cctctgtcgg cggcgggtcc catttgtgat aaatccacca tatcgaatta attcagactc 11160ctttgcccca gagatcacaa tggacgactt cctctatctc tacgatctag tcaggaagtt 11220cgacggagaa ggtgacgata ccatgttcac cactgataat gagaagatta gccttttcaa 11280tttcagaaag aatgctaacc cacagatggt tagagaggct tacgcagcag gtctcatcaa 11340gacgatctac ccgagcaata atctccagga gatcaaatac cttcccaaga aggttaaaga 
11400tgcagtcaaa agattcagga ctaactgcat caagaacaca gagaaagata tatttctcaa 11460gatcagaagt actattccag tatggacgat tcaaggcttg cttcacaaac caaggcaagt 11520aatagagatt ggagtctcta aaaaggtagt tcccactgaa tcaaaggcca tggagtcaaa 11580gattcaaata gaggacctaa cagaactcgc cgtaaagact ggcgaacagt tcatacagag 11640tctcttacga ctcaatgaca agaagaaaat cttcgtcaac atggtggagc acgacacgct 11700tgtctactcc aaaaatatca aagatacagt ctcagaagac caaagggcaa ttgagacttt 11760tcaacaaagg gtaatatccg gaaacctcct cggattccat tgcccagcta tctgtcactt 11820tattgtgaag atagtggaaa aggaaggtgg ctcctacaaa tgccatcatt gcgataaagg 11880aaaggccatc gttgaagatg cctctgccga cagtggtccc aaagatggac ccccacccac 11940gaggagcatc gtggaaaaag aagacgttcc aaccacgtct tcaaagcaag tggattgatg 12000tgatatctcc actgacgtaa gggatgacgc acaatcccac tatccttcgc aagacccttc 12060ctctatataa ggaagttcat ttcatttgga gaggacacgc tgaaatcacc agtctccaag 12120cttgcgggga tcgtttcgca tgattgaaca agatggattg cacgcaggtt ctccggccgc 12180ttgggtggag aggctattcg gctatgactg ggcacaacag acaatcggct gctctgatgc 12240cgccgtgttc cggctgtcag cgcaggggcg cccggttctt tttgtcaaga ccgacctgtc 12300cggtgccctg aatgaactgc aggacgaggc agcgcggcta tcgtggctgg ccacgacggg 12360cgttccttgc gcagctgtgc tcgacgttgt cactgaagcg ggaagggact ggctgctatt 12420gggcgaagtg ccggggcagg atctcctgtc atctcacctt gctcctgccg agaaagtatc 12480catcatggct gatgcaatgc ggcggctgca tacgcttgat ccggctacct gcccattcga 12540ccaccaagcg aaacatcgca tcgagcgagc acgtactcgg atggaagccg gtcttgtcga 12600tcaggatgat ctggacgaag agcatcaggg gctcgcgcca gccgaactgt tcgccaggct 12660caaggcgcgc atgcccgacg gcgaggatct cgtcgtgacc catggcgatg cctgcttgcc 12720gaatatcatg gtggaaaatg gccgcttttc tggattcatc gactgtggcc ggctgggtgt 12780ggcggaccgc tatcaggaca tagcgttggc tacccgtgat attgctgaag agcttggcgg 12840cgaatgggct gaccgcttcc tcgtgcttta cggtatcgcc gctcccgatt cgcagcgcat 12900cgccttctat cgccttcttg acgagttctt ctgagcggga ctctggggtt cgaaatgacc 12960gaccaagcga cgcccaacct gccatcacga gatttcgatt ccaccgccgc cttctatgaa 13020aggttgggct tcggaatcgt tttccgggac gccggctgga tgatcctcca gcgcggggat 
13080ctcatgctgg agttcttcgc ccaccccgga tcgatccaac acttacgttt gcaacgtcca 13140agagcaaata gaccacgaac gccggaaggt tgccgcagcg tgtggattgc gtctcaattc 13200tctcttgcag gaatgcaatg atgaatatga tactgactat gaaactttga gggaatactg 13260cctagcaccg tcacctcata acgtgcatca tgcatgccct gacaacatgg aacatcgcta 13320tttttctgaa gaattatgct cgttggagga tgtcgcggca attgcagcta ttgccaacat 13380cgaactaccc ctcacgcatg cattcatcaa tattattcat gcggggaaag gcaagattaa 13440tccaactggc aaatcatcca gcgtgattgg taacttcagt tccagcgact tgattcgttt 13500tggtgctacc cacgttttca ataaggacga gatggtggag taaagaagga gtgcgtcgaa 13560gcagatcgtt caaacatttg gcaataaagt ttcttaagat tgaatcctgt tgccggtctt 13620gcgatgatta tcatataatt tctgttgaat tacgttaagc atgtaataat taacatgtaa 13680tgcatgacgt tatttatgag atgggttttt atgattagag tcccgcaatt atacatttaa 13740tacgcgatag aaaacaaaat atagcgcgca aactaggata aattatcgcg cgcggtgtca 13800tctatgttac tagatcgatc aaacttcggt actgtgtaat gacgatgagc aatcgagagg 13860ctgactaaca aaaggtacat cgcgatggat cgatccattc gccattcagg ctgcgcaact 13920gttgggaagg gcgatcggtg cgggcctctt cgctattacg ccagctggcg aaagggggat 13980gtgctgcaag gcgattaagt tgggtaacgc cagggttttc ccagtcacga cgttgtaaaa 14040cgacggccag tgaattcctg cagcccgggg gatccgccca ctcgaggcgc gccaagcttg 14100catgcctgca ggctagccta agtacgtact caaaatgcca acaaataaaa aaaaagttgc 14160tttaataatg ccaaaacaaa ttaataaaac acttacaaca ccggattttt tttaattaaa 14220atgtgccatt taggataaat agttaatatt tttaataatt atttaaaaag ccgtatctac 14280taaaatgatt tttatttggt tgaaaatatt aatatgttta aatcaacaca atctatcaaa 14340attaaactaa aaaaaaaata agtgtacgtg gttaacatta gtacagtaat ataagaggaa 14400aatgagaaat taagaaattg aaagcgagtc taatttttaa attatgaacc tgcatatata 14460aaaggaaaga aagaatccag gaagaaaaga aatgaaacca tgcatggtcc cctcgtcatc 14520acgagtttct gccatttgca atagaaacac tgaaacacct ttctctttgt cacttaattg 14580agatgccgaa gccacctcac accatgaact tcatgaggtg tagcacccaa ggcttccata 14640gccatgcata ctgaagaatg tctcaagctc agcaccctac ttctgtgacg tgtccctcat 14700tcaccttcct ctcttcccta taaataacca cgcctcaggt tctccgcttc acaactcaaa 
14760cattctctcc attggtcctt aaacactcat cagtcatcac cgcggccatc acaagtttgt 14820acaaaaaagc aggctccgcg gccgccccct tcaccatggt tgttgtgtct cttcttcctc 14880gaatctcgat cgttacatca ccgggttcta gccttcacga tgtgcttttg agcatgagat 14940ttggtttgac gcgacatctc cctctcaaac gatctttctc caattattca atcacttccg 15000tatctccaga acaacagctc aaatctccgg tgaccatggc gacgaccgag agcaagaatc 15060ttgtagaagc ttccaaggag gagacaaaca agaaggagac agaagataag aaggaggtgg 15120gagtttcggt tcctccaccg ccagagaaac cagagcctgg cgattgttgc ggtagcggtt 15180gcgtccgatg cgtttgggat gtttattacg atgagctcga agattacaac aagcagcttt 15240ctggagaaac taaatcaatt tgaaagggtg ggcgcgccg 152791517273DNAartificial sequencevector 15cgcgccagat cctctagagt cgacctgcag gcatgcaagc ttggcgtaat catggtcata 60gctgtttcct gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag 120cataaagtgt aaagcctggg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg 180ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca 240acgcgcgggg agaggcggtt tgcgtattgg atcgatccct gaaagcgacg ttggatgtta 300acatctacaa attgcctttt cttatcgacc atgtacgtaa gcgcttacgt ttttggtgga 360cccttgagga aactggtagc tgttgtgggc ctgtggtctc aagatggatc attaatttcc 420accttcacct acgatggggg gcatcgcacc ggtgagtaat attgtacggc taagagcgaa 480tttggcctgt agacctcaat tgcgagcttt ctaatttcaa actattcggg cctaactttt 540ggtgtgatga tgctgactgg caggatatat accgttgtaa tttgagctcg tgtgaataag 600tcgctgtgta tgtttgtttg attgtttctg ttggagtgca gcccatttca ccggacaagt 660cggctagatt gatttagccc tgatgaactg ccgaggggaa gccatcttga gcgcggaatg 720ggaatggatt tcgttgtaca acgagacgac agaacaccca cgggaccgag cttcgcgagc 780ttttgtatcc gtggcatcct tggtccgggc gatttgttca cgtccatgag gcgctctcca 840aaggaacgca tattttccgg tgcaaccttt ccggttcttc ctctactcga cctcttgaag 900tcccagcatg aatgttcgac cgctccgcaa gcggatcttt ggcgcaacca gccggtttcg 960cacgtcgatt ctcgcgagcc tgcatacttt ggcaagattg ctgaatgacg ctgatgcttc 1020atcgcaatct gcgataatgg ggtaagtatc cggtgaaggc cgcaggtcag gccgcctgag 1080cactcagtgt cttggatgtc cagttccacg gcagctgttg ctcaagcctg ctgatcggag 1140cgtccgcaag gtcggcgcgg 
acgtcggcaa gccaggcctg cggatcgatg ttattgagct 1200tggcgctcat gatcagtgtc gccatgaacg ccgcacgttc agcacaacga tccgatccgg 1260caaacagcca tgacttcctg ccgagtacat agcctctgag cgttcgttcg gcagcattgt 1320tcgtcaggca aatcgggccg tcatcgagga atgacgtaat gccatcccat cgcttgagca 1380tgtaatttat cgcctcggcg acgggagaac tgcgcgacaa tttcccccgc tcggtttcga 1440gccaatcatg cagctcttcg gcgagtgacc ttgatcaggc caccgccacg accgcggaag 1500acgaacagat gcctgcgcat cggatcgcgc ttcagcgtct cttgcaccat cagcgacaaa 1560ccgggaaagc ctttgcgcat gtccgtactt atgtcgccac ttgggagggc ttcgtctacg 1620tggccttcgt gatcgacgtc ttcgcccgtc gcattgtcgg atggcgggcg agccggacag 1680cacatgcagg ctttgtcctc gatgccctcg aggaggctca tcatgatcgg cgtcccgctc 1740atggcggcct agtgcatcac tcggatcgcg gtgttcaata cgtgtccttt cgctattccg 1800agcggttggc agaagcaggt atcgagccat ctatcggaag cgtcggcgac agcacgacaa 1860cgccctcgca gaagcgatca acggtcttta caaggccgag gtcattcatc ggcgtggacc 1920atggaggagc ttcgaagcgg tcgagttcgc taccttggaa tggatagact ggttcaacca 1980cggcggcttt tgaagcccat cggcaatata ccgccagccg aagacgagga tcagtattac 2040gccatgctgg acgaagcagc catggctgcg cattttaacg aaatggcctc cggcaaaccc 2100ggtgcggttc acttgttgcg tgggaaagtt cacgggactc cgcgcacgag ccttcttcgt 2160aatagccata tcgaccgaat tgacctgcag gggggggggg gaaagccacg ttgtgtctca 2220aaatctctga tgttacattg cacaagataa aaatatatca tcatgaacaa taaaactgtc 2280tgcttacata aacagtaata caaggggtgt tatgagccat attcaacggg aaacgtcttg 2340ctcgaggccg cgattaaatt ccaacatgga tgctgattta tatgggtata aatgggctcg 2400cgataatgtc gggcaatcag gtgcgacaat ctatcgattg tatgggaagc ccgatgcgcc 2460agagttgttt ctgaaacatg gcaaaggtag cgttgccaat gatgttacag atgagatggt 2520cagactaaac tggctgacgg aatttatgcc tcttccgacc atcaagcatt ttatccgtac 2580tcctgatgat gcatggttac tcaccactgc gatccccggg aaaacagcat tccaggtatt 2640agaagaatat cctgattcag gtgaaaatat tgttgatgcg ctggcagtgt tcctgcgccg 2700gttgcattcg attcctgttt gtaattgtcc ttttaacagc gatcgcgtat ttcgtctcgc 2760tcaggcgcaa tcacgaatga ataacggttt ggttgatgcg agtgattttg atgacgagcg 2820taatggctgg cctgttgaac aagtctggaa agaaatgcat aagcttttgc 
cattctcacc 2880ggattcagtc gtcactcatg gtgatttctc acttgataac cttatttttg acgaggggaa 2940attaataggt tgtattgatg ttggacgagt cggaatcgca gaccgatacc aggatcttgc 3000catcctatgg aactgcctcg gtgagttttc tccttcatta cagaaacggc tttttcaaaa 3060atatggtatt gataatcctg atatgaataa attgcagttt catttgatgc tcgatgagtt 3120tttctaatca gaattggtta attggttgta acactggcag agcattacgc tgacttgacg 3180ggacggcggc tttgttgaat aaatcgaact tttgctgagt tgaaggatca gatcacgcat 3240cttcccgaca acgcagaccg ttccgtggca aagcaaaagt tcaaaatcac caactggtcc 3300acctacaaca aagctctcat caaccgtggc tccctcactt tctggctgga tgatggggcg 3360attcaggcct ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgccccccc 3420ccccctgcag gtcttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat 3480tatcccgtgt tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg 3540acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 3600aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 3660cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 3720gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca 3780cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc 3840tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 3900tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 3960ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 4020tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 4080gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 4140ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 4200tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa 4260agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa 4320aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc 4380cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt 4440agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc 4500tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac 4560gatagttacc ggataaggcg 
cagcggtcgg gctgaacggg gggttcgtgc acacagccca 4620gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta tgagaaagcg 4680ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag 4740gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt 4800ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat 4860ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc 4920acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt 4980gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag 5040cggaagagcg cctgatgcgg tattttctcc ttacgcatct gtgcggtatt tcacaccgca 5100tatggtgcac tctcagtaca atctgctctg atgccgcata gttaagccag tatacactcc 5160gctatcgcta cgtgactggg tcatggctgc gccccgacac ccgccaacac ccgctgacgc 5220gccctgacgg gcttgtctgc tcccggcatc cgcttacaga caagctgtga ccgtctccgg 5280gagctgcatg tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgaggc agggtgcctt 5340gatgtgggcg ccggcggtcg agtggcgacg gcgcggcttg tccgcgccct ggtagattgc 5400ctggccgtag gccagccatt tttgagcggc cagcggccgc gataggccga cgcgaagcgg 5460cggggcgtag ggagcgcagc gaccgaaggg taggcgcttt ttgcagctct tcggctgtgc 5520gctggccaga cagttatgca caggccaggc gggttttaag agttttaata agttttaaag 5580agttttaggc ggaaaaatcg ccttttttct cttttatatc agtcacttac atgtgtgacc 5640ggttcccaat gtacggcttt gggttcccaa tgtacgggtt ccggttccca atgtacggct 5700ttgggttccc aatgtacgtg ctatccacag gaaagagacc ttttcgacct ttttcccctg 5760ctagggcaat ttgccctagc atctgctccg tacattagga accggcggat gcttcgccct 5820cgatcaggtt gcggtagcgc atgactagga tcgggccagc ctgccccgcc tcctccttca 5880aatcgtactc cggcaggtca tttgacccga tcagcttgcg cacggtgaaa cagaacttct 5940tgaactctcc ggcgctgcca ctgcgttcgt agatcgtctt gaacaaccat ctggcttctg 6000ccttgcctgc
ggcgcggcgt gccaggcggt agagaaaacg gccgatgccg ggatcgatca 6060aaaagtaatc ggggtgaacc gtcagcacgt ccgggttctt gccttctgtg atctcgcggt 6120acatccaatc agctagctcg atctcgatgt actccggccg cccggtttcg ctctttacga 6180tcttgtagcg gctaatcaag gcttcaccct cggataccgt caccaggcgg ccgttcttgg 6240ccttcttcgt acgctgcatg gcaacgtgcg tggtgtttaa ccgaatgcag gtttctacca 6300ggtcgtcttt ctgctttccg ccatcggctc gccggcagaa cttgagtacg tccgcaacgt 6360gtggacggaa cacgcggccg ggcttgtctc ccttcccttc ccggtatcgg ttcatggatt 6420cggttagatg ggaaaccgcc atcagtacca ggtcgtaatc ccacacactg gccatgccgg 6480ccggccctgc ggaaacctct acgtgcccgt ctggaagctc gtagcggatc acctcgccag 6540ctcgtcggtc acgcttcgac agacggaaaa cggccacgtc catgatgctg cgactatcgc 6600gggtgcccac gtcatagagc atcggaacga aaaaatctgg ttgctcgtcg cccttgggcg 6660gcttcctaat cgacggcgca ccggctgccg gcggttgccg ggattctttg cggattcgat 6720cagcggccgc ttgccacgat tcaccggggc gtgcttctgc ctcgatgcgt tgccgctggg 6780cggcctgcgc ggccttcaac ttctccacca ggtcatcacc cagcgccgcg ccgatttgta 6840ccgggccgga tggtttgcga ccgctcacgc cgattcctcg ggcttggggg ttccagtgcc 6900attgcagggc cggcagacaa cccagccgct tacgcctggc caaccgcccg ttcctccaca 6960catggggcat tccacggcgt cggtgcctgg ttgttcttga ttttccatgc cgcctccttt 7020agccgctaaa attcatctac tcatttattc atttgctcat ttactctggt agctgcgcga 7080tgtattcaga tagcagctcg gtaatggtct tgccttggcg taccgcgtac atcttcagct 7140tggtgtgatc ctccgccggc aactgaaagt tgacccgctt catggctggc gtgtctgcca 7200ggctggccaa cgttgcagcc ttgctgctgc gtgcgctcgg acggccggca cttagcgtgt 7260ttgtgctttt gctcattttc tctttacctc attaactcaa atgagttttg atttaatttc 7320agcggccagc gcctggacct cgcgggcagc gtcgccctcg ggttctgatt caagaacggt 7380tgtgccggcg gcggcagtgc ctgggtagct cacgcgctgc gtgatacggg actcaagaat 7440gggcagctcg tacccggcca gcgcctcggc aacctcaccg ccgatgcgcg tgcctttgat 7500cgcccgcgac acgacaaagg ccgcttgtag ccttccatcc gtgacctcaa tgcgctgctt 7560aaccagctcc accaggtcgg cggtggccca tatgtcgtaa gggcttggct gcaccggaat 7620cagcacgaag tcggctgcct tgatcgcgga cacagccaag tccgccgcct ggggcgctcc 7680gtcgatcact acgaagtcgc gccggccgat ggccttcacg 
tcgcggtcaa tcgtcgggcg 7740gtcgatgccg acaacggtta gcggttgatc ttcccgcacg gccgcccaat cgcgggcact 7800gccctgggga tcggaatcga ctaacagaac atcggccccg gcgagttgca gggcgcgggc 7860tagatgggtt gcgatggtcg tcttgcctga cccgcctttc tggttaagta cagcgataac 7920ttcatgcgtt cccttgcgta tttgtttatt tactcatcgc atcatatacg cagcgaccgc 7980atgacgcaag ctgttttact caaatacaca tcaccttttt agacggcggc gctcggtttc 8040ttcagcggcc aagctggccg gccaggccgc cagcttggca tcagacaaac cggccaggat 8100ttcatgcagc cgcacggttg agacgtgcgc gggcggctcg aacacgtacc cggccgcgat 8160catctccgcc tcgatctctt cggtaatgaa aaacggttcg tcctggccgt cctggtgcgg 8220tttcatgctt gttcctcttg gcgttcattc tcggcggccg ccagggcgtc ggcctcggtc 8280aatgcgtcct cacggaaggc accgcgccgc ctggcctcgg tgggcgtcac ttcctcgctg 8340cgctcaagtg cgcggtacag ggtcgagcga tgcacgccaa gcagtgcagc cgcctctttc 8400acggtgcggc cttcctggtc gatcagctcg cgggcgtgcg cgatctgtgc cggggtgagg 8460gtagggcggg ggccaaactt cacgcctcgg gccttggcgg cctcgcgccc gctccgggtg 8520cggtcgatga ttagggaacg ctcgaactcg gcaatgccgg cgaacacggt caacaccatg 8580cggccggccg gcgtggtggt gtcggcccac ggctctgcca ggctacgcag gcccgcgccg 8640gcctcctgga tgcgctcggc aatgtccagt aggtcgcggg tgctgcgggc caggcggtct 8700agcctggtca ctgtcacaac gtcgccaggg cgtaggtggt caagcatcct ggccagctcc 8760gggcggtcgc gcctggtgcc ggtgatcttc tcggaaaaca gcttggtgca gccggccgcg 8820tgcagttcgg cccgttggtt ggtcaagtcc tggtcgtcgg tgctgacgcg ggcatagccc 8880agcaggccag cggcggcgct cttgttcatg gcgtaatgtc tccggttcta gtcgcaagta 8940ttctacttta tgcgactaaa acacgcgaca agaaaacgcc aggaaaaggg cagggcggca 9000gcctgtcgcg taacttagga cttgtgcgac atgtcgtttt cagaagacgg ctgcactgaa 9060cgtcagaagc cgactgcact atagcagcgg aggggttgga ccacaggacg ggtgtggtcg 9120ccatgatcgc gtagtcgata gtggctccaa gtagcgaagc gagcaggact gggcggcggc 9180caaagcggtc ggacagtgct ccgagaacgg gtgcgcatag aaattgcatc aacgcatata 9240gcgctagcag cacgccatag tgactggcga tgctgtcgga atggacgata tcccgcaaga 9300ggcccggcag taccggcata accaagccta tgcctacagc atccagggtg acggtgccga 9360ggatgacgat gagcgcattg ttagatttca tacacggtgc ctgactgcgt tagcaattta 9420actgtgataa 
actaccgcat taaagctagc ttgcttggtc gttccgcgtg aacgtcggct 9480cgattgtacc tgcgttcaaa tactttgcga tcgtgttgcg cgcctgcccg gtgcgtcggc 9540tgatctcacg gatcgactgc ttctctcgca acgccatccg acggatgatg tttaaaagtc 9600ccatgtggat cactccgttg ccccgtcgct caccgtgttg gggggaaggt gcacatggct 9660cagttctcaa tggaaattat ctgcctaacc ggctcagttc tgcgtagaaa ccaacatgca 9720agctccaccg ggtgcaaagc ggcagcggcg gcaggatata ttcaattgta aatggcttca 9780tgtccgggaa atctacatgg atcagcaatg agtatgatgg tcaatatgga gaaaaagaaa 9840gagtaattac caattttttt tcaattcaaa aatgtagatg tccgcagcgt tattataaaa 9900tgaaagtaca ttttgataaa acgacaaatt acgatccgtc gtatttatag gcgaaagcaa 9960taaacaaatt attctaattc ggaaatcttt atttcgacgt gtctacattc acgtccaaat 10020gggggcttag atgagaaact tcacgatcga tgccttgatt tcgccattcc cagataccca 10080tttcatcttc agattggtct gagattatgc gaaaatatac actcatatac ataaatactg 10140acagtttgag ctaccaattc agtgtagccc attacctcac ataattcact caaatgctag 10200gcagtctgtc aactcggcgt caatttgtcg gccactatac gatagttgcg caaattttca 10260aagtcctggc ctaacatcac acctctgtcg gcggcgggtc ccatttgtga taaatccacc 10320atatcgaatt aattcagact cctttgcccc agagatcaca atggacgact tcctctatct 10380ctacgatcta gtcaggaagt tcgacggaga aggtgacgat accatgttca ccactgataa 10440tgagaagatt agccttttca atttcagaaa gaatgctaac ccacagatgg ttagagaggc 10500ttacgcagca ggtctcatca agacgatcta cccgagcaat aatctccagg agatcaaata 10560ccttcccaag aaggttaaag atgcagtcaa aagattcagg actaactgca tcaagaacac 10620agagaaagat atatttctca agatcagaag tactattcca gtatggacga ttcaaggctt 10680gcttcacaaa ccaaggcaag taatagagat tggagtctct aaaaaggtag ttcccactga 10740atcaaaggcc atggagtcaa agattcaaat agaggaccta acagaactcg ccgtaaagac 10800tggcgaacag ttcatacaga gtctcttacg actcaatgac aagaagaaaa tcttcgtcaa 10860catggtggag cacgacacgc ttgtctactc caaaaatatc aaagatacag tctcagaaga 10920ccaaagggca attgagactt ttcaacaaag ggtaatatcc ggaaacctcc tcggattcca 10980ttgcccagct atctgtcact ttattgtgaa gatagtggaa aaggaaggtg gctcctacaa 11040atgccatcat tgcgataaag gaaaggccat cgttgaagat gcctctgccg acagtggtcc 11100caaagatgga cccccaccca 
cgaggagcat cgtggaaaaa gaagacgttc caaccacgtc 11160ttcaaagcaa gtggattgat gtgatatctc cactgacgta agggatgacg cacaatccca 11220ctatccttcg caagaccctt cctctatata aggaagttca tttcatttgg agaggacacg 11280ctgaaatcac cagtctccaa gcttgcgggg atcgtttcgc atgattgaac aagatggatt 11340gcacgcaggt tctccggccg cttgggtgga gaggctattc ggctatgact gggcacaaca 11400gacaatcggc tgctctgatg ccgccgtgtt ccggctgtca gcgcaggggc gcccggttct 11460ttttgtcaag accgacctgt ccggtgccct gaatgaactg caggacgagg cagcgcggct 11520atcgtggctg gccacgacgg gcgttccttg cgcagctgtg ctcgacgttg tcactgaagc 11580gggaagggac tggctgctat tgggcgaagt gccggggcag gatctcctgt catctcacct 11640tgctcctgcc gagaaagtat ccatcatggc tgatgcaatg cggcggctgc atacgcttga 11700tccggctacc tgcccattcg accaccaagc gaaacatcgc atcgagcgag cacgtactcg 11760gatggaagcc ggtcttgtcg atcaggatga tctggacgaa gagcatcagg ggctcgcgcc 11820agccgaactg ttcgccaggc tcaaggcgcg catgcccgac ggcgaggatc tcgtcgtgac 11880ccatggcgat gcctgcttgc cgaatatcat ggtggaaaat ggccgctttt ctggattcat 11940cgactgtggc cggctgggtg tggcggaccg ctatcaggac atagcgttgg ctacccgtga 12000tattgctgaa gagcttggcg gcgaatgggc tgaccgcttc ctcgtgcttt acggtatcgc 12060cgctcccgat tcgcagcgca tcgccttcta tcgccttctt gacgagttct tctgagcggg 12120actctggggt tcgaaatgac cgaccaagcg acgcccaacc tgccatcacg agatttcgat 12180tccaccgccg ccttctatga aaggttgggc ttcggaatcg ttttccggga cgccggctgg 12240atgatcctcc agcgcgggga tctcatgctg gagttcttcg cccaccccgg atcgatccaa 12300cacttacgtt tgcaacgtcc aagagcaaat agaccacgaa cgccggaagg ttgccgcagc 12360gtgtggattg cgtctcaatt ctctcttgca ggaatgcaat gatgaatatg atactgacta 12420tgaaactttg agggaatact gcctagcacc gtcacctcat aacgtgcatc atgcatgccc 12480tgacaacatg gaacatcgct atttttctga agaattatgc tcgttggagg atgtcgcggc 12540aattgcagct attgccaaca tcgaactacc cctcacgcat gcattcatca atattattca 12600tgcggggaaa ggcaagatta atccaactgg caaatcatcc agcgtgattg gtaacttcag 12660ttccagcgac ttgattcgtt ttggtgctac ccacgttttc aataaggacg agatggtgga 12720gtaaagaagg agtgcgtcga agcagatcgt tcaaacattt ggcaataaag tttcttaaga 12780ttgaatcctg ttgccggtct tgcgatgatt 
atcatataat ttctgttgaa ttacgttaag 12840catgtaataa ttaacatgta atgcatgacg ttatttatga gatgggtttt tatgattaga 12900gtcccgcaat tatacattta atacgcgata gaaaacaaaa tatagcgcgc aaactaggat 12960aaattatcgc gcgcggtgtc atctatgtta ctagatcgat caaacttcgg tactgtgtaa 13020tgacgatgag caatcgagag gctgactaac aaaaggtaca tcgcgatgga tcgatccatt 13080cgccattcag gctgcgcaac tgttgggaag ggcgatcggt gcgggcctct tcgctattac 13140gccagctggc gaaaggggga tgtgctgcaa ggcgattaag ttgggtaacg ccagggtttt 13200cccagtcacg acgttgtaaa acgacggcca gtgaattcct gcagcccggg ggatccgccc 13260actcgaggcg cgccaagctt gcatgcctgc aggctagcct aagtacgtac tcaaaatgcc 13320aacaaataaa aaaaaagttg ctttaataat gccaaaacaa attaataaaa cacttacaac 13380accggatttt ttttaattaa aatgtgccat ttaggataaa tagttaatat ttttaataat 13440tatttaaaaa gccgtatcta ctaaaatgat ttttatttgg ttgaaaatat taatatgttt 13500aaatcaacac aatctatcaa aattaaacta aaaaaaaaat aagtgtacgt ggttaacatt 13560agtacagtaa tataagagga aaatgagaaa ttaagaaatt gaaagcgagt ctaattttta 13620aattatgaac ctgcatatat aaaaggaaag aaagaatcca ggaagaaaag aaatgaaacc 13680atgcatggtc ccctcgtcat cacgagtttc tgccatttgc aatagaaaca ctgaaacacc 13740tttctctttg tcacttaatt gagatgccga agccacctca caccatgaac ttcatgaggt 13800gtagcaccca aggcttccat agccatgcat actgaagaat gtctcaagct cagcacccta 13860cttctgtgac gtgtccctca ttcaccttcc tctcttccct ataaataacc acgcctcagg 13920ttctccgctt cacaactcaa acattctctc cattggtcct taaacactca tcagtcatca 13980ccgcggccct agacgcccat cacaagtttg tacaaaaaag ctgaacgaga aacgtaaaat 14040gatataaata tcaatatatt aaattagatt ttgcataaaa aacagactac ataatactgt 14100aaaacacaac atatccagtc atattggcgg ccgcattagg caccccaggc tttacacttt 14160atgcttccgg ctcgtataat gtgtggattt tgagttagga tccgtcgaga ttttcaggag 14220ctaaggaagc taaaatggag aaaaaaatca ctggatatac caccgttgat atatcccaat 14280ggcatcgtaa agaacatttt gaggcatttc agtcagttgc tcaatgtacc tataaccaga 14340ccgttcagct ggatattacg gcctttttaa agaccgtaaa gaaaaataag cacaagtttt 14400atccggcctt tattcacatt cttgcccgcc tgatgaatgc tcatccggaa ttccgtatgg 14460caatgaaaga cggtgagctg gtgatatggg atagtgttca 
cccttgttac accgttttcc 14520atgagcaaac tgaaacgttt tcatcgctct ggagtgaata ccacgacgat ttccggcagt 14580ttctacacat atattcgcaa gatgtggcgt gttacggtga aaacctggcc tatttcccta 14640aagggtttat tgagaatatg tttttcgtct cagccaatcc ctgggtgagt ttcaccagtt 14700ttgatttaaa cgtggccaat atggacaact tcttcgcccc cgttttcacc atgggcaaat 14760attatacgca aggcgacaag gtgctgatgc cgctggcgat tcaggttcat catgccgttt 14820gtgatggctt ccatgtcggc agaatgctta atgaattaca acagtactgc gatgagtggc 14880agggcggggc gtaaacgcgt ggatccggct tactaaaagc cagataacag tatgcgtatt 14940tgcgcgctga tttttgcggt ataagaatat atactgatat gtatacccga agtatgtcaa 15000aaagaggtat gctatgaagc agcgtattac agtgacagtt gacagcgaca gctatcagtt 15060gctcaaggca tatatgatgt caatatctcc ggtctggtaa gcacaaccat gcagaatgaa 15120gcccgtcgtc tgcgtgccga acgctggaaa gcggaaaatc aggaagggat ggctgaggtc 15180gcccggttta ttgaaatgaa cggctctttt gctgacgaga acaggggctg gtgaaatgca 15240gtttaaggtt tacacctata aaagagagag ccgttatcgt ctgtttgtgg atgtacagag 15300tgatattatt gacacgcccg ggcgacggat ggtgatcccc ctggccagtg cacgtctgct 15360gtcagataaa gtctcccgtg aactttaccc ggtggtgcat atcggggatg aaagctggcg 15420catgatgacc accgatatgg ccagtgtgcc ggtctccgtt atcggggaag aagtggctga 15480tctcagccac cgcgaaaatg acatcaaaaa cgccattaac ctgatgttct ggggaatata 15540aatgtcaggc tcccttatac acagccagtc tgcaggtcga ccatagtgac tggatatgtt 15600gtgttttaca gcattatgta gtctgttttt tatgcaaaat ctaatttaat atattgatat 15660ttatatcatt ttacgtttct cgttcagctt tcttgtacaa agtggtgatg ataaccaagt 15720ttaacgtgag tttatatatt cacagttcca tttacagatc ttatgctgat tgcagcatat 15780aacatagtcg caacttaact ttatccctgc ttacgtaaag aaacatacat attgtttgtg 15840gcttcgtagt ggaacatatg caattatgta atctttatat tatgagcctt tacttacaaa 15900gattacttga gatttatgta cgtgtgctat tttcactttt caaacatgaa tttcctacgt 15960ttacaatcat ttaatgtaaa agggatgata taatgtattt acgtacatgt gaacaaccaa 16020gcatgttatt ttttcctttt ttgttgcaac ttacaatcaa gtaatgatta tggttatgat 16080tatgatattg gtgtgtgtct tttgccttat atatatattt atccctttcg tttaactttg 16140caatataatt attactgatc actatatttt ggtttgaaat ggcgcaggtt 
gtaatgatcg 16200atcatcacca ctttgtacaa gaaagctgaa cgagaaacgt aaaatgatat aaatatcaat 16260atattaaatt agattttgca taaaaaacag actacataat gctgtaaaac acaacatatc 16320cagtcactat ggtcgacctg cagactggct gtgtataagg gagcctgaca tttatattcc 16380ccagaacatc aggttaatgg cgtttttgat gtcattttcg cggtggctga gatcagccac 16440ttcttccccg ataacggaga ccggcacact ggccatatcg gtggtcatca tgcgccagct 16500ttcatccccg atatgcacca ccgggtaaag ttcacgggag actttatctg acagcagacg 16560tgcactggcc agggggatca ccatccgtcg cccgggcgtg tcaataatat cactctgtac 16620atccacaaac agacgataac ggctctctct tttataggtg taaaccttaa actgcatttc 16680accagcccct gttctcgtca gcaaaagagc cgttcatttc aataaaccgg gcgacctcag 16740ccatcccttc ctgattttcc gctttccagc gttcggcacg cagacgacgg gcttcattct 16800gcatggttgt gcttaccaga ccggagatat tgacatcata tatgccttga gcaactgata 16860gctgtcgctg tcaactgtca ctgtaatacg ctgcttcata gcatacctct ttttgacata 16920cttcgggtat acatatcagt atatattctt ataccgcaaa aatcagcgcg caaatacgca 16980tactgttatc tggcttttag taagccggat cctaactcaa aatccacaca ttatacgagc 17040cggaagcata aagtgtaaag cctggggtgc ctaatgcggc cgccaatatg actggatatg 17100ttgtgtttta cagtattatg tagtctgttt tttatgcaaa atctaattta atatattgat 17160atttatatca ttttacgttt ctcgttcagc ttttttgtac aaacttgtga tgggcgtcta 17220gcgaactaga ggatccccgg gtaccgaggt acgtctagag gatccgtcga cgg 172731638DNAartificial sequenceprimer 16cctagggtta accaagttta acgtgagttt atatattc 381738DNAartificial sequenceprimer 17actagttcgc gatcattaca acctgcgcca tttcaaac 3818506DNAartificial sequencePCR product & intron 18cctagggtta accaagttta acgtgagttt atatattcac agttccattt acagatctta 60tgctgattgc agcatataac atagtcgcaa cttaacttta tccctgctta cgtaaagaaa 120catacatatt gtttgtggct tcgtagtgga acatatgcaa ttatgtaatc tttatattat 180gagcctttac ttacaaagat tacttgagat ttatgtacgt gtgctatttt cacttttcaa 240acatgaattt cctacgttta caatcattta atgtaaaagg gatgatataa tgtatttacg 300tacatgtgaa caaccaagca tgttattttt tccttttttg ttgcaactta caatcaagta 360atgattatgg ttatgattat gatattggtg tgtgtctttt gccttatata tatatttatc 420cctttcgttt aactttgcaa 
tataattatt actgatcact atattttggt ttgaaatggc 480gcaggttgta atgatcgcga actagt 506191724DNAartificial sequencesynthetic DNA fragment 19ctagacgccc atcacaagtt tgtacaaaaa agctgaacga gaaacgtaaa atgatataaa 60tatcaatata ttaaattaga ttttgcataa aaaacagact acataatact gtaaaacaca 120acatatccag tcatattggc ggccgcatta ggcaccccag gctttacact ttatgcttcc 180ggctcgtata atgtgtggat tttgagttag gatccgtcga gattttcagg agctaaggaa 240gctaaaatgg agaaaaaaat cactggatat accaccgttg atatatccca atggcatcgt 300aaagaacatt ttgaggcatt tcagtcagtt gctcaatgta cctataacca gaccgttcag 360ctggatatta cggccttttt aaagaccgta aagaaaaata agcacaagtt ttatccggcc 420tttattcaca ttcttgcccg cctgatgaat gctcatccgg aattccgtat ggcaatgaaa 480gacggtgagc tggtgatatg ggatagtgtt cacccttgtt acaccgtttt ccatgagcaa 540actgaaacgt tttcatcgct ctggagtgaa taccacgacg atttccggca gtttctacac 600atatattcgc aagatgtggc gtgttacggt gaaaacctgg cctatttccc taaagggttt 660attgagaata tgtttttcgt ctcagccaat ccctgggtga gtttcaccag ttttgattta 720aacgtggcca atatggacaa cttcttcgcc cccgttttca ccatgggcaa atattatacg 780caaggcgaca aggtgctgat gccgctggcg attcaggttc atcatgccgt ttgtgatggc 840ttccatgtcg gcagaatgct taatgaatta caacagtact gcgatgagtg gcagggcggg 900gcgtaaacgc gtggatccgg cttactaaaa gccagataac agtatgcgta tttgcgcgct 960gatttttgcg gtataagaat atatactgat atgtataccc gaagtatgtc aaaaagaggt 1020atgctatgaa gcagcgtatt acagtgacag ttgacagcga cagctatcag ttgctcaagg 1080catatatgat gtcaatatct ccggtctggt aagcacaacc atgcagaatg aagcccgtcg 1140tctgcgtgcc gaacgctgga aagcggaaaa tcaggaaggg atggctgagg tcgcccggtt 1200tattgaaatg aacggctctt ttgctgacga gaacaggggc tggtgaaatg cagtttaagg 1260tttacaccta taaaagagag agccgttatc gtctgtttgt ggatgtacag agtgatatta 1320ttgacacgcc cgggcgacgg atggtgatcc ccctggccag tgcacgtctg ctgtcagata 1380aagtctcccg tgaactttac ccggtggtgc atatcgggga tgaaagctgg cgcatgatga 1440ccaccgatat ggccagtgtg ccggtctccg ttatcgggga agaagtggct gatctcagcc 1500accgcgaaaa tgacatcaaa aacgccatta acctgatgtt ctggggaata taaatgtcag 1560gctcccttat acacagccag tctgcaggtc gaccatagtg actggatatg 
ttgtgtttta 1620cagcattatg tagtctgttt tttatgcaaa atctaattta atatattgat atttatatca 1680ttttacgttt ctcgttcagc tttcttgtac aaagtggtga tgat 1724204934DNAartificial sequenceplasmid 20ctagaggatc cccgggtacc gagctcgaat tcgtaatcat ggtcatagct gtttcctgtg 60tgaaattgtt atccgctcac aattccacac aacatacgag ccggaagcat aaagtgtaaa 120gcctggggtg cctaatgagt gagctaactc acattaattg cgttgcgctc actgcccgct 180ttccagtcgg gaaacctgtc gtgccagctg cattaatgaa tcggccaacg cgcggggaga 240ggcggtttgc gtattgggcg ctagcggagt gtatactggc ttactatgtt ggcactgatg 300agggtgtcag tgaagtgctt catgtggcag gagaaaaaag gctgcaccgg tgcgtcagca 360gaatatgtga tacaggatat attccgcttc ctcgctcact gactcgctac gctcggtcgt 420tcgactgcgg cgagcggaaa tggcttacga acggggcgga gatttcctgg aagatgccag 480gaagatactt aacagggaag tgagagggcc gcggcaaagc cgtttttcca taggctccgc 540ccccctgaca agcatcacga aatctgacgc tcaaatcagt ggtggcgaaa cccgacagga 600ctataaagat accaggcgtt tccccctggc ggctccctcg tgcgctctcc tgttcctgcc 660tttcggttta ccggtgtcat tccgctgtta tggccgcgtt tgtctcattc cacgcctgac 720actcagttcc gggtaggcag ttcgctccaa gctggactgt atgcacgaac cccccgttca 780gtccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg aaagacatgc 840aaaagcacca ctggcagcag ccactggtaa ttgatttaga ggagttagtc ttgaagtcat 900gcgccggtta aggctaaact gaaaggacaa gttttggtga ctgcgctcct ccaagccagt 960tacctcggtt caaagagttg gtagctcaga gaaccttcga aaaaccgccc tgcaaggcgg 1020ttttttcgtt ttcagagcaa gagattacgc gcagaccaaa acgatctcaa gaagatcatc 1080ttattaaggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt ttggtcatga 1140gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt
tttaaatcaa 1200tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc agtgaggcac 1260ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc gtcgtgtaga 1320taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata ccgcgagacc 1380cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg gccgagcgca 1440gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc cgggaagcta 1500gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct acaggcatcg 1560tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa cgatcaaggc 1620gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt cctccgatcg 1680ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca ctgcataatt 1740ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac tcaaccaagt 1800cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca atacgggata 1860ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt tcttcggggc 1920gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc actcgtgcac 1980ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca aaaacaggaa 2040ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata ctcatactct 2100tcctttttca atattattga agcatttatc agggttattg tctcatgagc ggatacatat 2160ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc cgaaaagtgc 2220cacctgacgt ctaagaaacc attattatca tgacattaac ctataaaaat aggcgtatca 2280cgaggccctt tcgtctcgcg cgtttcggtg atgacggtga aaacctctga cacatgcagc 2340tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa gcccgtcagg 2400gcgcgtcagc gggtgttggc gggtgtcggg gctggcttaa ctatgcggca tcagagcaga 2460ttgtactgag agtgcaccat atgcggtgtg aaataccgca cagatgcgta aggagaaaat 2520accgcatcag gcgccattcg ccattcaggc tgcgcaactg ttgggaaggg cgatcggtgc 2580gggcctcttc gctattacgc cagctggcga aagggggatg tgctgcaagg cgattaagtt 2640gggtaacgcc agggttttcc cagtcacgac gttgtaaaac gacggccagt gccaagcttg 2700catgcctgca ggtcgactct agacgcccat cacaagtttg tacaaaaaag ctgaacgaga 2760aacgtaaaat gatataaata tcaatatatt aaattagatt ttgcataaaa aacagactac 2820ataatactgt aaaacacaac atatccagtc atattggcgg ccgcattagg caccccaggc 2880tttacacttt atgcttccgg 
ctcgtataat gtgtggattt tgagttagga tccgtcgaga 2940ttttcaggag ctaaggaagc taaaatggag aaaaaaatca ctggatatac caccgttgat 3000atatcccaat ggcatcgtaa agaacatttt gaggcatttc agtcagttgc tcaatgtacc 3060tataaccaga ccgttcagct ggatattacg gcctttttaa agaccgtaaa gaaaaataag 3120cacaagtttt atccggcctt tattcacatt cttgcccgcc tgatgaatgc tcatccggaa 3180ttccgtatgg caatgaaaga cggtgagctg gtgatatggg atagtgttca cccttgttac 3240accgttttcc atgagcaaac tgaaacgttt tcatcgctct ggagtgaata ccacgacgat 3300ttccggcagt ttctacacat atattcgcaa gatgtggcgt gttacggtga aaacctggcc 3360tatttcccta aagggtttat tgagaatatg tttttcgtct cagccaatcc ctgggtgagt 3420ttcaccagtt ttgatttaaa cgtggccaat atggacaact tcttcgcccc cgttttcacc 3480atgggcaaat attatacgca aggcgacaag gtgctgatgc cgctggcgat tcaggttcat 3540catgccgttt gtgatggctt ccatgtcggc agaatgctta atgaattaca acagtactgc 3600gatgagtggc agggcggggc gtaaacgcgt ggatccggct tactaaaagc cagataacag 3660tatgcgtatt tgcgcgctga tttttgcggt ataagaatat atactgatat gtatacccga 3720agtatgtcaa aaagaggtat gctatgaagc agcgtattac agtgacagtt gacagcgaca 3780gctatcagtt gctcaaggca tatatgatgt caatatctcc ggtctggtaa gcacaaccat 3840gcagaatgaa gcccgtcgtc tgcgtgccga acgctggaaa gcggaaaatc aggaagggat 3900ggctgaggtc gcccggttta ttgaaatgaa cggctctttt gctgacgaga acaggggctg 3960gtgaaatgca gtttaaggtt tacacctata aaagagagag ccgttatcgt ctgtttgtgg 4020atgtacagag tgatattatt gacacgcccg ggcgacggat ggtgatcccc ctggccagtg 4080cacgtctgct gtcagataaa gtctcccgtg aactttaccc ggtggtgcat atcggggatg 4140aaagctggcg catgatgacc accgatatgg ccagtgtgcc ggtctccgtt atcggggaag 4200aagtggctga tctcagccac cgcgaaaatg acatcaaaaa cgccattaac ctgatgttct 4260ggggaatata aatgtcaggc tcccttatac acagccagtc tgcaggtcga ccatagtgac 4320tggatatgtt gtgttttaca gcattatgta gtctgttttt tatgcaaaat ctaatttaat 4380atattgatat ttatatcatt ttacgtttct cgttcagctt tcttgtacaa agtggtgatg 4440ataaccaagt ttaacgtgag tttatatatt cacagttcca tttacagatc ttatgctgat 4500tgcagcatat aacatagtcg caacttaact ttatccctgc ttacgtaaag aaacatacat 4560attgtttgtg gcttcgtagt ggaacatatg caattatgta atctttatat 
tatgagcctt 4620tacttacaaa gattacttga gatttatgta cgtgtgctat tttcactttt caaacatgaa 4680tttcctacgt ttacaatcat ttaatgtaaa agggatgata taatgtattt acgtacatgt 4740gaacaaccaa gcatgttatt ttttcctttt ttgttgcaac ttacaatcaa gtaatgatta 4800tggttatgat tatgatattg gtgtgtgtct tttgccttat atatatattt atccctttcg 4860tttaactttg caatataatt attactgatc actatatttt ggtttgaaat ggcgcaggtt 4920gtaatgatcg cgaa 4934211021DNAartificial sequencesynthetic DNA fragment 21ctagacgccc atcacaagtt tgtacaaaaa agctgaacga gaaacgtaaa atgatataaa 60tatcaatata ttaaattaga ttttgcataa aaaacagact acataatact gtaaaacaca 120acatatccag tcatattggc ggccgcatta ggcaccccag gctttacact ttatgcttcc 180ggctcgtata atgtgtggat tttgagttag gatccggctt actaaaagcc agataacagt 240atgcgtattt gcgcgctgat ttttgcggta taagaatata tactgatatg tatacccgaa 300gtatgtcaaa aagaggtatg ctatgaagca gcgtattaca gtgacagttg acagcgacag 360ctatcagttg ctcaaggcat atatgatgtc aatatctccg gtctggtaag cacaaccatg 420cagaatgaag cccgtcgtct gcgtgccgaa cgctggaaag cggaaaatca ggaagggatg 480gctgaggtcg cccggtttat tgaaatgaac ggctcttttg ctgacgagaa caggggctgg 540tgaaatgcag tttaaggttt acacctataa aagagagagc cgttatcgtc tgtttgtgga 600tgtacagagt gatattattg acacgcccgg gcgacggatg gtgatccccc tggccagtgc 660acgtctgctg tcagataaag tctcccgtga actttacccg gtggtgcata tcggggatga 720aagctggcgc atgatgacca ccgatatggc cagtgtgccg gtctccgtta tcggggaaga 780agtggctgat ctcagccacc gcgaaaatga catcaaaaac gccattaacc tgatgttctg 840gggaatataa atgtcaggct cccttataca cagccagtct gcaggtcgac catagtgact 900ggatatgttg tgttttacag cattatgtag tctgtttttt atgcaaaatc taatttaata 960tattgatatt tatatcattt tacgtttctc gttcagcttt cttgtacaaa gtggtgatga 1020t 1021225955DNAartificial sequenceplasmid 22atcatcacca ctttgtacaa gaaagctgaa cgagaaacgt aaaatgatat aaatatcaat 60atattaaatt agattttgca taaaaaacag actacataat gctgtaaaac acaacatatc 120cagtcactat ggtcgacctg cagactggct gtgtataagg gagcctgaca tttatattcc 180ccagaacatc aggttaatgg cgtttttgat gtcattttcg cggtggctga gatcagccac 240ttcttccccg ataacggaga ccggcacact ggccatatcg gtggtcatca tgcgccagct 
300ttcatccccg atatgcacca ccgggtaaag ttcacgggag actttatctg acagcagacg 360tgcactggcc agggggatca ccatccgtcg cccgggcgtg tcaataatat cactctgtac 420atccacaaac agacgataac ggctctctct tttataggtg taaaccttaa actgcatttc 480accagcccct gttctcgtca gcaaaagagc cgttcatttc aataaaccgg gcgacctcag 540ccatcccttc ctgattttcc gctttccagc gttcggcacg cagacgacgg gcttcattct 600gcatggttgt gcttaccaga ccggagatat tgacatcata tatgccttga gcaactgata 660gctgtcgctg tcaactgtca ctgtaatacg ctgcttcata gcatacctct ttttgacata 720cttcgggtat acatatcagt atatattctt ataccgcaaa aatcagcgcg caaatacgca 780tactgttatc tggcttttag taagccggat cctaactcaa aatccacaca ttatacgagc 840cggaagcata aagtgtaaag cctggggtgc ctaatgcggc cgccaatatg actggatatg 900ttgtgtttta cagtattatg tagtctgttt tttatgcaaa atctaattta atatattgat 960atttatatca ttttacgttt ctcgttcagc ttttttgtac aaacttgtga tgggcgtcta 1020gcgaactaga ggatccccgg gtaccgagct cgaattcgta atcatggtca tagctgtttc 1080ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt 1140gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc 1200ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg 1260ggagaggcgg tttgcgtatt gggcgctagc ggagtgtata ctggcttact atgttggcac 1320tgatgagggt gtcagtgaag tgcttcatgt ggcaggagaa aaaaggctgc accggtgcgt 1380cagcagaata tgtgatacag gatatattcc gcttcctcgc tcactgactc gctacgctcg 1440gtcgttcgac tgcggcgagc ggaaatggct tacgaacggg gcggagattt cctggaagat 1500gccaggaaga tacttaacag ggaagtgaga gggccgcggc aaagccgttt ttccataggc 1560tccgcccccc tgacaagcat cacgaaatct gacgctcaaa tcagtggtgg cgaaacccga 1620caggactata aagataccag gcgtttcccc ctggcggctc cctcgtgcgc tctcctgttc 1680ctgcctttcg gtttaccggt gtcattccgc tgttatggcc gcgtttgtct cattccacgc 1740ctgacactca gttccgggta ggcagttcgc tccaagctgg actgtatgca cgaacccccc 1800gttcagtccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggaaaga 1860catgcaaaag caccactggc agcagccact ggtaattgat ttagaggagt tagtcttgaa 1920gtcatgcgcc ggttaaggct aaactgaaag gacaagtttt ggtgactgcg ctcctccaag 1980ccagttacct cggttcaaag agttggtagc tcagagaacc 
ttcgaaaaac cgccctgcaa 2040ggcggttttt tcgttttcag agcaagagat tacgcgcaga ccaaaacgat ctcaagaaga 2100tcatcttatt aaggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt 2160catgagatta tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa 2220atcaatctaa agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga 2280ggcacctatc tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt 2340gtagataact acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg 2400agacccacgc tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga 2460gcgcagaagt ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga 2520agctagagta agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg 2580catcgtggtg tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc 2640aaggcgagtt acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc 2700gatcgttgtc agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca 2760taattctctt actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac 2820caagtcattc tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg 2880ggataatacc gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc 2940ggggcgaaaa ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg 3000tgcacccaac tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac 3060aggaaggcaa aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat 3120actcttcctt tttcaatatt attgaagcat ttatcagggt tattgtctca tgagcggata 3180catatttgaa tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa 3240agtgccacct gacgtctaag aaaccattat tatcatgaca ttaacctata aaaataggcg 3300tatcacgagg ccctttcgtc tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat 3360gcagctcccg gagacggtca cagcttgtct gtaagcggat gccgggagca gacaagcccg 3420tcagggcgcg tcagcgggtg ttggcgggtg tcggggctgg cttaactatg cggcatcaga 3480gcagattgta ctgagagtgc accatatgcg gtgtgaaata ccgcacagat gcgtaaggag 3540aaaataccgc atcaggcgcc attcgccatt caggctgcgc aactgttggg aagggcgatc 3600ggtgcgggcc tcttcgctat tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt 3660aagttgggta acgccagggt tttcccagtc acgacgttgt aaaacgacgg ccagtgccaa 3720gcttgcatgc 
ctgcaggtcg actctagacg cccatcacaa gtttgtacaa aaaagctgaa 3780cgagaaacgt aaaatgatat aaatatcaat atattaaatt agattttgca taaaaaacag 3840actacataat actgtaaaac acaacatatc cagtcatatt ggcggccgca ttaggcaccc 3900caggctttac actttatgct tccggctcgt ataatgtgtg gattttgagt taggatccgt 3960cgagattttc aggagctaag gaagctaaaa tggagaaaaa aatcactgga tataccaccg 4020ttgatatatc ccaatggcat cgtaaagaac attttgaggc atttcagtca gttgctcaat 4080gtacctataa ccagaccgtt cagctggata ttacggcctt tttaaagacc gtaaagaaaa 4140ataagcacaa gttttatccg gcctttattc acattcttgc ccgcctgatg aatgctcatc 4200cggaattccg tatggcaatg aaagacggtg agctggtgat atgggatagt gttcaccctt 4260gttacaccgt tttccatgag caaactgaaa cgttttcatc gctctggagt gaataccacg 4320acgatttccg gcagtttcta cacatatatt cgcaagatgt ggcgtgttac ggtgaaaacc 4380tggcctattt ccctaaaggg tttattgaga atatgttttt cgtctcagcc aatccctggg 4440tgagtttcac cagttttgat ttaaacgtgg ccaatatgga caacttcttc gcccccgttt 4500tcaccatggg caaatattat acgcaaggcg acaaggtgct gatgccgctg gcgattcagg 4560ttcatcatgc cgtttgtgat ggcttccatg tcggcagaat gcttaatgaa ttacaacagt 4620actgcgatga gtggcagggc ggggcgtaaa cgcgtggatc cggcttacta aaagccagat 4680aacagtatgc gtatttgcgc gctgattttt gcggtataag aatatatact gatatgtata 4740cccgaagtat gtcaaaaaga ggtatgctat gaagcagcgt attacagtga cagttgacag 4800cgacagctat cagttgctca aggcatatat gatgtcaata tctccggtct ggtaagcaca 4860accatgcaga atgaagcccg tcgtctgcgt gccgaacgct ggaaagcgga aaatcaggaa 4920gggatggctg aggtcgcccg gtttattgaa atgaacggct cttttgctga cgagaacagg 4980ggctggtgaa atgcagttta aggtttacac ctataaaaga gagagccgtt atcgtctgtt 5040tgtggatgta cagagtgata ttattgacac gcccgggcga cggatggtga tccccctggc 5100cagtgcacgt ctgctgtcag ataaagtctc ccgtgaactt tacccggtgg tgcatatcgg 5160ggatgaaagc tggcgcatga tgaccaccga tatggccagt gtgccggtct ccgttatcgg 5220ggaagaagtg gctgatctca gccaccgcga aaatgacatc aaaaacgcca ttaacctgat 5280gttctgggga atataaatgt caggctccct tatacacagc cagtctgcag gtcgaccata 5340gtgactggat atgttgtgtt ttacagcatt atgtagtctg ttttttatgc aaaatctaat 5400ttaatatatt gatatttata tcattttacg tttctcgttc 
agctttcttg tacaaagtgg 5460tgatgataac caagtttaac gtgagtttat atattcacag ttccatttac agatcttatg 5520ctgattgcag catataacat agtcgcaact taactttatc cctgcttacg taaagaaaca 5580tacatattgt ttgtggcttc gtagtggaac atatgcaatt atgtaatctt tatattatga 5640gcctttactt acaaagatta cttgagattt atgtacgtgt gctattttca cttttcaaac 5700atgaatttcc tacgtttaca atcatttaat gtaaaaggga tgatataatg tatttacgta 5760catgtgaaca accaagcatg ttattttttc cttttttgtt gcaacttaca atcaagtaat 5820gattatggtt atgattatga tattggtgtg tgtcttttgc cttatatata tatttatccc 5880tttcgtttaa ctttgcaata taattattac tgatcactat attttggttt gaaatggcgc 5940aggttgtaat gatcg 5955239245DNAartificial sequencevector 23gtacgtctag aggatccgtc gacggcgcgc ccgatcatcc ggatatagtt cctcctttca 60gcaaaaaacc cctcaagacc cgtttagagg ccccaagggg ttatgctagt tattgctcag 120cggtggcagc agccaactca gcttcctttc gggctttgtt agcagccgga tcgatccaag 180ctgtacctca ctattccttt gccctcggac gagtgctggg gcgtcggttt ccactatcgg 240cgagtacttc tacacagcca tcggtccaga cggccgcgct tctgcgggcg atttgtgtac 300gcccgacagt cccggctccg gatcggacga ttgcgtcgca tcgaccctgc gcccaagctg 360catcatcgaa attgccgtca accaagctct gatagagttg gtcaagacca atgcggagca 420tatacgcccg gagccgcggc gatcctgcaa gctccggatg cctccgctcg aagtagcgcg 480tctgctgctc catacaagcc aaccacggcc tccagaagaa gatgttggcg acctcgtatt 540gggaatcccc gaacatcgcc tcgctccagt caatgaccgc tgttatgcgg ccattgtccg 600tcaggacatt gttggagccg aaatccgcgt gcacgaggtg ccggacttcg gggcagtcct 660cggcccaaag catcagctca tcgagagcct gcgcgacgga cgcactgacg gtgtcgtcca 720tcacagtttg ccagtgatac acatggggat cagcaatcgc gcatatgaaa tcacgccatg 780tagtgtattg accgattcct tgcggtccga atgggccgaa cccgctcgtc tggctaagat 840cggccgcagc gatcgcatcc atagcctccg cgaccggctg cagaacagcg ggcagttcgg 900tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg ggagatgcaa taggtcaggc 960tctcgctgaa ttccccaatg tcaagcactt ccggaatcgg gagcgcggcc gatgcaaagt 1020gccgataaac ataacgatct ttgtagaaac catcggcgca gctatttacc cgcaggacat 1080atccacgccc tcctacatcg aagctgaaag cacgagattc ttcgccctcc gagagctgca 1140tcaggtcgga gacgctgtcg aacttttcga 
tcagaaactt ctcgacagac gtcgcggtga 1200gttcaggctt ttccatgggt atatctcctt cttaaagtta aacaaaatta tttctagagg 1260gaaaccgttg tggtctccct atagtgagtc gtattaattt cgcgggatcg agatcgatcc 1320aattccaatc ccacaaaaat ctgagcttaa cagcacagtt gctcctctca gagcagaatc 1380gggtattcaa caccctcata tcaactacta cgttgtgtat aacggtccac atgccggtat 1440atacgatgac tggggttgta caaaggcggc aacaaacggc gttcccggag ttgcacacaa 1500gaaatttgcc actattacag aggcaagagc agcagctgac gcgtacacaa caagtcagca 1560aacagacagg ttgaacttca tccccaaagg agaagctcaa ctcaagccca agagctttgc 1620taaggcccta acaagcccac caaagcaaaa agcccactgg ctcacgctag gaaccaaaag 1680gcccagcagt gatccagccc caaaagagat ctcctttgcc ccggagatta caatggacga 1740tttcctctat ctttacgatc taggaaggaa gttcgaaggt gaaggtgacg acactatgtt 1800caccactgat aatgagaagg ttagcctctt caatttcaga aagaatgctg acccacagat 1860ggttagagag gcctacgcag caggtctcat caagacgatc tacccgagta acaatctcca 1920ggagatcaaa taccttccca agaaggttaa agatgcagtc aaaagattca ggactaattg 1980catcaagaac acagagaaag acatatttct caagatcaga agtactattc cagtatggac 2040gattcaaggc ttgcttcata aaccaaggca agtaatagag attggagtct ctaaaaaggt 2100agttcctact gaatctaagg ccatgcatgg agtctaagat tcaaatcgag gatctaacag 2160aactcgccgt gaagactggc gaacagttca tacagagtct tttacgactc aatgacaaga 2220agaaaatctt cgtcaacatg gtggagcacg acactctggt ctactccaaa aatgtcaaag 2280atacagtctc agaagaccaa agggctattg agacttttca acaaaggata atttcgggaa 2340acctcctcgg attccattgc ccagctatct gtcacttcat cgaaaggaca gtagaaaagg 2400aaggtggctc ctacaaatgc catcattgcg ataaaggaaa ggctatcatt caagatgcct 2460ctgccgacag tggtcccaaa gatggacccc cacccacgag gagcatcgtg gaaaaagaag 2520acgttccaac cacgtcttca aagcaagtgg attgatgtga catctccact gacgtaaggg 2580atgacgcaca atcccactat ccttcgcaag acccttcctc tatataagga agttcatttc 2640atttggagag gacacgctcg agctcatttc tctattactt cagccataac aaaagaactc 2700ttttctcttc ttattaaacc atgaaaaagc ctgaactcac cgcgacgtct gtcgagaagt 2760ttctgatcga aaagttcgac agcgtctccg acctgatgca gctctcggag ggcgaagaat 2820ctcgtgcttt cagcttcgat gtaggagggc gtggatatgt cctgcgggta aatagctgcg 
2880ccgatggttt ctacaaagat cgttatgttt atcggcactt tgcatcggcc gcgctcccga 2940ttccggaagt gcttgacatt ggggaattca gcgagagcct gacctattgc atctcccgcc 3000gtgcacaggg tgtcacgttg caagacctgc ctgaaaccga actgcccgct gttctgcagc 3060cggtcgcgga ggccatggat gcgatcgctg cggccgatct tagccagacg agcgggttcg 3120gcccattcgg accgcaagga atcggtcaat acactacatg gcgtgatttc atatgcgcga 3180ttgctgatcc ccatgtgtat cactggcaaa ctgtgatgga cgacaccgtc agtgcgtccg 3240tcgcgcaggc tctcgatgag ctgatgcttt gggccgagga ctgccccgaa gtccggcacc 3300tcgtgcacgc ggatttcggc tccaacaatg tcctgacgga caatggccgc ataacagcgg 3360tcattgactg gagcgaggcg atgttcgggg attcccaata cgaggtcgcc aacatcttct 3420tctggaggcc gtggttggct tgtatggagc agcagacgcg ctacttcgag cggaggcatc 3480cggagcttgc aggatcgccg cggctccggg cgtatatgct ccgcattggt cttgaccaac 3540tctatcagag cttggttgac ggcaatttcg atgatgcagc ttgggcgcag ggtcgatgcg 3600acgcaatcgt ccgatccgga gccgggactg tcgggcgtac acaaatcgcc cgcagaagcg 3660cggccgtctg gaccgatggc tgtgtagaag tactcgccga tagtggaaac cgacgcccca 3720gcactcgtcc gagggcaaag gaatagtgag gtacctaaag aaggagtgcg tcgaagcaga 3780tcgttcaaac atttggcaat aaagtttctt aagattgaat cctgttgccg gtcttgcgat 3840gattatcata taatttctgt tgaattacgt taagcatgta ataattaaca tgtaatgcat 3900gacgttattt atgagatggg tttttatgat tagagtcccg caattataca tttaatacgc 3960gatagaaaac aaaatatagc gcgcaaacta ggataaatta tcgcgcgcgg tgtcatctat 4020gttactagat cgatgtcgaa tcgatcaacc tgcattaatg aatcggccaa
cgcgcgggga 4080gaggcggttt gcgtattggg cgctcttccg cttcctcgct cactgactcg ctgcgctcgg 4140tcgttcggct gcggcgagcg gtatcagctc actcaaaggc ggtaatacgg ttatccacag 4200aatcagggga taacgcagga aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc 4260gtaaaaaggc cgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca 4320aaaatcgacg ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt 4380ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc 4440tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca atgctcacgc tgtaggtatc 4500tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc 4560ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact 4620tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg 4680ctacagagtt cttgaagtgg tggcctaact acggctacac tagaaggaca gtatttggta 4740tctgcgctct gctgaagcca gttaccttcg gaaaaagagt tggtagctct tgatccggca 4800aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa gcagcagatt acgcgcagaa 4860aaaaaggatc tcaagaagat cctttgatct tttctacggg gtctgacgct cagtggaacg 4920aaaactcacg ttaagggatt ttggtcatga cattaaccta taaaaatagg cgtatcacga 4980ggccctttcg tctcgcgcgt ttcggtgatg acggtgaaaa cctctgacac atgcagctcc 5040cggagacggt cacagcttgt ctgtaagcgg atgccgggag cagacaagcc cgtcagggcg 5100cgtcagcggg tgttggcggg tgtcggggct ggcttaacta tgcggcatca gagcagattg 5160tactgagagt gcaccatatg gacatattgt cgttagaacg cggctacaat taatacataa 5220ccttatgtat catacacata cgatttaggt gacactatag aacggcgcgc caagcttgca 5280tgcctgcagg ctagcctaag tacgtactca aaatgccaac aaataaaaaa aaagttgctt 5340taataatgcc aaaacaaatt aataaaacac ttacaacacc ggattttttt taattaaaat 5400gtgccattta ggataaatag ttaatatttt taataattat ttaaaaagcc gtatctacta 5460aaatgatttt tatttggttg aaaatattaa tatgtttaaa tcaacacaat ctatcaaaat 5520taaactaaaa aaaaaataag tgtacgtggt taacattagt acagtaatat aagaggaaaa 5580tgagaaatta agaaattgaa agcgagtcta atttttaaat tatgaacctg catatataaa 5640aggaaagaaa gaatccagga agaaaagaaa tgaaaccatg catggtcccc tcgtcatcac 5700gagtttctgc catttgcaat agaaacactg aaacaccttt ctctttgtca cttaattgag 5760atgccgaagc cacctcacac 
catgaacttc atgaggtgta gcacccaagg cttccatagc 5820catgcatact gaagaatgtc tcaagctcag caccctactt ctgtgacgtg tccctcattc 5880accttcctct cttccctata aataaccacg cctcaggttc tccgcttcac aactcaaaca 5940ttctctccat tggtccttaa acactcatca gtcatcaccg cggccctaga cgcccatcac 6000aagtttgtac aaaaaagctg aacgagaaac gtaaaatgat ataaatatca atatattaaa 6060ttagattttg cataaaaaac agactacata atactgtaaa acacaacata tccagtcata 6120ttggcggccg cattaggcac cccaggcttt acactttatg cttccggctc gtataatgtg 6180tggattttga gttaggatcc gtcgagattt tcaggagcta aggaagctaa aatggagaaa 6240aaaatcactg gatataccac cgttgatata tcccaatggc atcgtaaaga acattttgag 6300gcatttcagt cagttgctca atgtacctat aaccagaccg ttcagctgga tattacggcc 6360tttttaaaga ccgtaaagaa aaataagcac aagttttatc cggcctttat tcacattctt 6420gcccgcctga tgaatgctca tccggaattc cgtatggcaa tgaaagacgg tgagctggtg 6480atatgggata gtgttcaccc ttgttacacc gttttccatg agcaaactga aacgttttca 6540tcgctctgga gtgaatacca cgacgatttc cggcagtttc tacacatata ttcgcaagat 6600gtggcgtgtt acggtgaaaa cctggcctat ttccctaaag ggtttattga gaatatgttt 6660ttcgtctcag ccaatccctg ggtgagtttc accagttttg atttaaacgt ggccaatatg 6720gacaacttct tcgcccccgt tttcaccatg ggcaaatatt atacgcaagg cgacaaggtg 6780ctgatgccgc tggcgattca ggttcatcat gccgtttgtg atggcttcca tgtcggcaga 6840atgcttaatg aattacaaca gtactgcgat gagtggcagg gcggggcgta aacgcgtgga 6900tccggcttac taaaagccag ataacagtat gcgtatttgc gcgctgattt ttgcggtata 6960agaatatata ctgatatgta tacccgaagt atgtcaaaaa gaggtatgct atgaagcagc 7020gtattacagt gacagttgac agcgacagct atcagttgct caaggcatat atgatgtcaa 7080tatctccggt ctggtaagca caaccatgca gaatgaagcc cgtcgtctgc gtgccgaacg 7140ctggaaagcg gaaaatcagg aagggatggc tgaggtcgcc cggtttattg aaatgaacgg 7200ctcttttgct gacgagaaca ggggctggtg aaatgcagtt taaggtttac acctataaaa 7260gagagagccg ttatcgtctg tttgtggatg tacagagtga tattattgac acgcccgggc 7320gacggatggt gatccccctg gccagtgcac gtctgctgtc agataaagtc tcccgtgaac 7380tttacccggt ggtgcatatc ggggatgaaa gctggcgcat gatgaccacc gatatggcca 7440gtgtgccggt ctccgttatc ggggaagaag tggctgatct cagccaccgc 
gaaaatgaca 7500tcaaaaacgc cattaacctg atgttctggg gaatataaat gtcaggctcc cttatacaca 7560gccagtctgc aggtcgacca tagtgactgg atatgttgtg ttttacagca ttatgtagtc 7620tgttttttat gcaaaatcta atttaatata ttgatattta tatcatttta cgtttctcgt 7680tcagctttct tgtacaaagt ggtgatgata accaagttta acgtgagttt atatattcac 7740agttccattt acagatctta tgctgattgc agcatataac atagtcgcaa cttaacttta 7800tccctgctta cgtaaagaaa catacatatt gtttgtggct tcgtagtgga acatatgcaa 7860ttatgtaatc tttatattat gagcctttac ttacaaagat tacttgagat ttatgtacgt 7920gtgctatttt cacttttcaa acatgaattt cctacgttta caatcattta atgtaaaagg 7980gatgatataa tgtatttacg tacatgtgaa caaccaagca tgttattttt tccttttttg 8040ttgcaactta caatcaagta atgattatgg ttatgattat gatattggtg tgtgtctttt 8100gccttatata tatatttatc cctttcgttt aactttgcaa tataattatt actgatcact 8160atattttggt ttgaaatggc gcaggttgta atgatcgatc atcaccactt tgtacaagaa 8220agctgaacga gaaacgtaaa atgatataaa tatcaatata ttaaattaga ttttgcataa 8280aaaacagact acataatgct gtaaaacaca acatatccag tcactatggt cgacctgcag 8340actggctgtg tataagggag cctgacattt atattcccca gaacatcagg ttaatggcgt 8400ttttgatgtc attttcgcgg tggctgagat cagccacttc ttccccgata acggagaccg 8460gcacactggc catatcggtg gtcatcatgc gccagctttc atccccgata tgcaccaccg 8520ggtaaagttc acgggagact ttatctgaca gcagacgtgc actggccagg gggatcacca 8580tccgtcgccc gggcgtgtca ataatatcac tctgtacatc cacaaacaga cgataacggc 8640tctctctttt ataggtgtaa accttaaact gcatttcacc agcccctgtt ctcgtcagca 8700aaagagccgt tcatttcaat aaaccgggcg acctcagcca tcccttcctg attttccgct 8760ttccagcgtt cggcacgcag acgacgggct tcattctgca tggttgtgct taccagaccg 8820gagatattga catcatatat gccttgagca actgatagct gtcgctgtca actgtcactg 8880taatacgctg cttcatagca tacctctttt tgacatactt cgggtataca tatcagtata 8940tattcttata ccgcaaaaat cagcgcgcaa atacgcatac tgttatctgg cttttagtaa 9000gccggatcct aactcaaaat ccacacatta tacgagccgg aagcataaag tgtaaagcct 9060ggggtgccta atgcggccgc caatatgact ggatatgttg tgttttacag tattatgtag 9120tctgtttttt atgcaaaatc taatttaata tattgatatt tatatcattt tacgtttctc 9180gttcagcttt tttgtacaaa 
cttgtgatgg gcgtctagcg aactagagga tccccgggta 9240ccgag 92452415500DNAartificial sequencevector 24ccgcggccgc ccccttcacc atggttgttg tgtctcttct tcctcgaatc tcgatcgtta 60catcaccggg ttctagcctt cacgatgtgc ttttgagcat gagatttggt ttgacgcgac 120atctccctct caaacgatct ttctccaatt attcaatcac ttccgtatct ccagaacaac 180agctcaaatc tccggtgacc atggcgacga ccgagagcaa gaatcttgta gaagcttcca 240aggaggagac aaacaagaag gagacagaag ataagaagga ggtgggagtt tcggttcctc 300caccgccaga gaaaccagag cctggcgatt gttgcggtag cggttgcgtc cgatgcgttt 360gggatgttta ttacgatgag ctcgaagatt acaacaagca gctttctgga gaaactaaat 420caatttgaaa gggtgggcgc gccgacccag ctttcttgta caaagtggtg tgagtttata 480tattcacagt tccatttaca gatcttatgc tgattgcagc atataacata gtcgcaactt 540aactttatcc ctgcttacgt aaagaaacat acatattgtt tgtggcttcg tagtggaaca 600tatgcaatta tgtaatcttt atattatgag cctttactta caaagattac ttgagattta 660tgtacgtgtg ctattttcac ttttcaaaca tgaatttcct acgtttacaa tcatttaatg 720taaaagggat gatataatgt atttacgtac atgtgaacaa ccaagcatgt tattttttcc 780ttttttgttg caacttacaa tcaagtaatg attatggtta tgattatgat attggtgtgt 840gtcttttgcc ttatatatat atttatccct ttcgtttaac tttgcaatat aattattact 900gatcactata ttttggtttg aaatggcgca gaccactttg tacaagaaag ctgggtcggc 960gcgcccaccc tttcaaattg atttagtttc tccagaaagc tgcttgttgt aatcttcgag 1020ctcatcgtaa taaacatccc aaacgcatcg gacgcaaccg ctaccgcaac aatcgccagg 1080ctctggtttc tctggcggtg gaggaaccga aactcccacc tccttcttat cttctgtctc 1140cttcttgttt gtctcctcct tggaagcttc tacaagattc ttgctctcgg tcgtcgccat 1200ggtcaccgga gatttgagct gttgttctgg agatacggaa gtgattgaat aattggagaa 1260agatcgtttg agagggagat gtcgcgtcaa accaaatctc atgctcaaaa gcacatcgtg 1320aaggctagaa cccggtgatg taacgatcga gattcgagga agaagagaca caacaaccat 1380ggtgaagggg gcggccgcgg agcctgcttt tttgtacaaa cttgtgatgg gcgtctagcg 1440aactagagga tccccgggta ccgaggtacg tctagaggat ccgtcgacgg cgcgccagat 1500cctctagagt cgacctgcag gcatgcaagc ttggcgtaat catggtcata gctgtttcct 1560gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt 1620aaagcctggg gtgcctaatg 
agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc 1680gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg 1740agaggcggtt tgcgtattgg atcgatccct gaaagcgacg ttggatgtta acatctacaa 1800attgcctttt cttatcgacc atgtacgtaa gcgcttacgt ttttggtgga cccttgagga 1860aactggtagc tgttgtgggc ctgtggtctc aagatggatc attaatttcc accttcacct 1920acgatggggg gcatcgcacc ggtgagtaat attgtacggc taagagcgaa tttggcctgt 1980agacctcaat tgcgagcttt ctaatttcaa actattcggg cctaactttt ggtgtgatga 2040tgctgactgg caggatatat accgttgtaa tttgagctcg tgtgaataag tcgctgtgta 2100tgtttgtttg attgtttctg ttggagtgca gcccatttca ccggacaagt cggctagatt 2160gatttagccc tgatgaactg ccgaggggaa gccatcttga gcgcggaatg ggaatggatt 2220tcgttgtaca acgagacgac agaacaccca cgggaccgag cttcgcgagc ttttgtatcc 2280gtggcatcct tggtccgggc gatttgttca cgtccatgag gcgctctcca aaggaacgca 2340tattttccgg tgcaaccttt ccggttcttc ctctactcga cctcttgaag tcccagcatg 2400aatgttcgac cgctccgcaa gcggatcttt ggcgcaacca gccggtttcg cacgtcgatt 2460ctcgcgagcc tgcatacttt ggcaagattg ctgaatgacg ctgatgcttc atcgcaatct 2520gcgataatgg ggtaagtatc cggtgaaggc cgcaggtcag gccgcctgag cactcagtgt 2580cttggatgtc cagttccacg gcagctgttg ctcaagcctg ctgatcggag cgtccgcaag 2640gtcggcgcgg acgtcggcaa gccaggcctg cggatcgatg ttattgagct tggcgctcat 2700gatcagtgtc gccatgaacg ccgcacgttc agcacaacga tccgatccgg caaacagcca 2760tgacttcctg ccgagtacat agcctctgag cgttcgttcg gcagcattgt tcgtcaggca 2820aatcgggccg tcatcgagga atgacgtaat gccatcccat cgcttgagca tgtaatttat 2880cgcctcggcg acgggagaac tgcgcgacaa tttcccccgc tcggtttcga gccaatcatg 2940cagctcttcg gcgagtgacc ttgatcaggc caccgccacg accgcggaag acgaacagat 3000gcctgcgcat cggatcgcgc ttcagcgtct cttgcaccat cagcgacaaa ccgggaaagc 3060ctttgcgcat gtccgtactt atgtcgccac ttgggagggc ttcgtctacg tggccttcgt 3120gatcgacgtc ttcgcccgtc gcattgtcgg atggcgggcg agccggacag cacatgcagg 3180ctttgtcctc gatgccctcg aggaggctca tcatgatcgg cgtcccgctc atggcggcct 3240agtgcatcac tcggatcgcg gtgttcaata cgtgtccttt cgctattccg agcggttggc 3300agaagcaggt atcgagccat ctatcggaag cgtcggcgac agcacgacaa 
cgccctcgca 3360gaagcgatca acggtcttta caaggccgag gtcattcatc ggcgtggacc atggaggagc 3420ttcgaagcgg tcgagttcgc taccttggaa tggatagact ggttcaacca cggcggcttt 3480tgaagcccat cggcaatata ccgccagccg aagacgagga tcagtattac gccatgctgg 3540acgaagcagc catggctgcg cattttaacg aaatggcctc cggcaaaccc ggtgcggttc 3600acttgttgcg tgggaaagtt cacgggactc cgcgcacgag ccttcttcgt aatagccata 3660tcgaccgaat tgacctgcag gggggggggg gaaagccacg ttgtgtctca aaatctctga 3720tgttacattg cacaagataa aaatatatca tcatgaacaa taaaactgtc tgcttacata 3780aacagtaata caaggggtgt tatgagccat attcaacggg aaacgtcttg ctcgaggccg 3840cgattaaatt ccaacatgga tgctgattta tatgggtata aatgggctcg cgataatgtc 3900gggcaatcag gtgcgacaat ctatcgattg tatgggaagc ccgatgcgcc agagttgttt 3960ctgaaacatg gcaaaggtag cgttgccaat gatgttacag atgagatggt cagactaaac 4020tggctgacgg aatttatgcc tcttccgacc atcaagcatt ttatccgtac tcctgatgat 4080gcatggttac tcaccactgc gatccccggg aaaacagcat tccaggtatt agaagaatat 4140cctgattcag gtgaaaatat tgttgatgcg ctggcagtgt tcctgcgccg gttgcattcg 4200attcctgttt gtaattgtcc ttttaacagc gatcgcgtat ttcgtctcgc tcaggcgcaa 4260tcacgaatga ataacggttt ggttgatgcg agtgattttg atgacgagcg taatggctgg 4320cctgttgaac aagtctggaa agaaatgcat aagcttttgc cattctcacc ggattcagtc 4380gtcactcatg gtgatttctc acttgataac cttatttttg acgaggggaa attaataggt 4440tgtattgatg ttggacgagt cggaatcgca gaccgatacc aggatcttgc catcctatgg 4500aactgcctcg gtgagttttc tccttcatta cagaaacggc tttttcaaaa atatggtatt 4560gataatcctg atatgaataa attgcagttt catttgatgc tcgatgagtt tttctaatca 4620gaattggtta attggttgta acactggcag agcattacgc tgacttgacg ggacggcggc 4680tttgttgaat aaatcgaact tttgctgagt tgaaggatca gatcacgcat cttcccgaca 4740acgcagaccg ttccgtggca aagcaaaagt tcaaaatcac caactggtcc acctacaaca 4800aagctctcat caaccgtggc tccctcactt tctggctgga tgatggggcg attcaggcct 4860ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgccccccc ccccctgcag 4920gtcttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat tatcccgtgt 4980tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg acttggttga 5040gtactcacca gtcacagaaa 
agcatcttac ggatggcatg acagtaagag aattatgcag 5100tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa cgatcggagg 5160accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc gccttgatcg 5220ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca cgatgcctgt 5280agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc tagcttcccg 5340gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc tgcgctcggc 5400ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg ggtctcgcgg 5460tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta tctacacgac 5520ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag gtgcctcact 5580gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga ttgatttaaa 5640acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc tcatgaccaa 5700aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa agatcaaagg 5760atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc 5820gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc cgaaggtaac 5880tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt agttaggcca 5940ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc tgttaccagt 6000ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac gatagttacc 6060ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca gcttggagcg 6120aacgacctac accgaactga gatacctaca gcgtgagcta tgagaaagcg ccacgcttcc 6180cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag gagagcgcac 6240gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt ttcgccacct 6300ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc 6360cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc acatgttctt 6420tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt gagctgatac 6480cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag cggaagagcg 6540cctgatgcgg tattttctcc ttacgcatct gtgcggtatt tcacaccgca tatggtgcac 6600tctcagtaca atctgctctg atgccgcata gttaagccag tatacactcc gctatcgcta 6660cgtgactggg tcatggctgc gccccgacac ccgccaacac ccgctgacgc gccctgacgg 6720gcttgtctgc tcccggcatc cgcttacaga caagctgtga ccgtctccgg 
gagctgcatg 6780tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgaggc agggtgcctt gatgtgggcg 6840ccggcggtcg agtggcgacg gcgcggcttg tccgcgccct ggtagattgc ctggccgtag 6900gccagccatt tttgagcggc cagcggccgc gataggccga cgcgaagcgg cggggcgtag 6960ggagcgcagc gaccgaaggg taggcgcttt ttgcagctct tcggctgtgc gctggccaga 7020cagttatgca caggccaggc gggttttaag agttttaata agttttaaag agttttaggc 7080ggaaaaatcg ccttttttct cttttatatc agtcacttac atgtgtgacc ggttcccaat 7140gtacggcttt gggttcccaa tgtacgggtt ccggttccca atgtacggct ttgggttccc 7200aatgtacgtg ctatccacag gaaagagacc ttttcgacct ttttcccctg ctagggcaat 7260ttgccctagc atctgctccg tacattagga accggcggat gcttcgccct cgatcaggtt 7320gcggtagcgc atgactagga tcgggccagc ctgccccgcc tcctccttca aatcgtactc 7380cggcaggtca tttgacccga tcagcttgcg cacggtgaaa cagaacttct tgaactctcc 7440ggcgctgcca ctgcgttcgt agatcgtctt gaacaaccat ctggcttctg ccttgcctgc 7500ggcgcggcgt gccaggcggt agagaaaacg gccgatgccg ggatcgatca aaaagtaatc 7560ggggtgaacc gtcagcacgt ccgggttctt gccttctgtg atctcgcggt acatccaatc 7620agctagctcg atctcgatgt actccggccg cccggtttcg ctctttacga tcttgtagcg 7680gctaatcaag gcttcaccct cggataccgt caccaggcgg ccgttcttgg ccttcttcgt 7740acgctgcatg gcaacgtgcg tggtgtttaa ccgaatgcag gtttctacca ggtcgtcttt 7800ctgctttccg ccatcggctc gccggcagaa cttgagtacg tccgcaacgt gtggacggaa 7860cacgcggccg ggcttgtctc ccttcccttc ccggtatcgg ttcatggatt cggttagatg 7920ggaaaccgcc atcagtacca ggtcgtaatc ccacacactg gccatgccgg ccggccctgc 7980ggaaacctct acgtgcccgt ctggaagctc gtagcggatc acctcgccag ctcgtcggtc 8040acgcttcgac agacggaaaa cggccacgtc catgatgctg cgactatcgc gggtgcccac 8100gtcatagagc atcggaacga aaaaatctgg ttgctcgtcg cccttgggcg gcttcctaat 8160cgacggcgca ccggctgccg gcggttgccg ggattctttg cggattcgat cagcggccgc 8220ttgccacgat tcaccggggc gtgcttctgc ctcgatgcgt tgccgctggg cggcctgcgc 8280ggccttcaac ttctccacca ggtcatcacc cagcgccgcg ccgatttgta ccgggccgga 8340tggtttgcga ccgctcacgc cgattcctcg ggcttggggg ttccagtgcc attgcagggc 8400cggcagacaa cccagccgct tacgcctggc caaccgcccg ttcctccaca catggggcat 8460tccacggcgt cggtgcctgg 
ttgttcttga ttttccatgc cgcctccttt agccgctaaa 8520attcatctac tcatttattc atttgctcat ttactctggt agctgcgcga tgtattcaga 8580tagcagctcg gtaatggtct tgccttggcg taccgcgtac atcttcagct tggtgtgatc 8640ctccgccggc aactgaaagt tgacccgctt catggctggc gtgtctgcca ggctggccaa 8700cgttgcagcc ttgctgctgc gtgcgctcgg acggccggca cttagcgtgt ttgtgctttt 8760gctcattttc tctttacctc attaactcaa atgagttttg atttaatttc agcggccagc 8820gcctggacct cgcgggcagc gtcgccctcg ggttctgatt caagaacggt tgtgccggcg 8880gcggcagtgc ctgggtagct cacgcgctgc gtgatacggg actcaagaat gggcagctcg 8940tacccggcca gcgcctcggc aacctcaccg ccgatgcgcg tgcctttgat cgcccgcgac 9000acgacaaagg ccgcttgtag ccttccatcc gtgacctcaa tgcgctgctt aaccagctcc 9060accaggtcgg cggtggccca tatgtcgtaa gggcttggct gcaccggaat cagcacgaag 9120tcggctgcct tgatcgcgga cacagccaag tccgccgcct ggggcgctcc gtcgatcact 9180acgaagtcgc gccggccgat ggccttcacg tcgcggtcaa tcgtcgggcg gtcgatgccg 9240acaacggtta gcggttgatc ttcccgcacg gccgcccaat cgcgggcact gccctgggga 9300tcggaatcga ctaacagaac atcggccccg gcgagttgca gggcgcgggc tagatgggtt 9360gcgatggtcg tcttgcctga cccgcctttc tggttaagta cagcgataac ttcatgcgtt 9420cccttgcgta tttgtttatt tactcatcgc atcatatacg cagcgaccgc atgacgcaag 9480ctgttttact caaatacaca tcaccttttt agacggcggc gctcggtttc ttcagcggcc 9540aagctggccg gccaggccgc cagcttggca tcagacaaac cggccaggat ttcatgcagc 9600cgcacggttg agacgtgcgc gggcggctcg aacacgtacc cggccgcgat catctccgcc 9660tcgatctctt cggtaatgaa aaacggttcg tcctggccgt cctggtgcgg tttcatgctt 9720gttcctcttg gcgttcattc tcggcggccg ccagggcgtc ggcctcggtc aatgcgtcct 9780cacggaaggc
accgcgccgc ctggcctcgg tgggcgtcac ttcctcgctg cgctcaagtg 9840cgcggtacag ggtcgagcga tgcacgccaa gcagtgcagc cgcctctttc acggtgcggc 9900cttcctggtc gatcagctcg cgggcgtgcg cgatctgtgc cggggtgagg gtagggcggg 9960ggccaaactt cacgcctcgg gccttggcgg cctcgcgccc gctccgggtg cggtcgatga 10020ttagggaacg ctcgaactcg gcaatgccgg cgaacacggt caacaccatg cggccggccg 10080gcgtggtggt gtcggcccac ggctctgcca ggctacgcag gcccgcgccg gcctcctgga 10140tgcgctcggc aatgtccagt aggtcgcggg tgctgcgggc caggcggtct agcctggtca 10200ctgtcacaac gtcgccaggg cgtaggtggt caagcatcct ggccagctcc gggcggtcgc 10260gcctggtgcc ggtgatcttc tcggaaaaca gcttggtgca gccggccgcg tgcagttcgg 10320cccgttggtt ggtcaagtcc tggtcgtcgg tgctgacgcg ggcatagccc agcaggccag 10380cggcggcgct cttgttcatg gcgtaatgtc tccggttcta gtcgcaagta ttctacttta 10440tgcgactaaa acacgcgaca agaaaacgcc aggaaaaggg cagggcggca gcctgtcgcg 10500taacttagga cttgtgcgac atgtcgtttt cagaagacgg ctgcactgaa cgtcagaagc 10560cgactgcact atagcagcgg aggggttgga ccacaggacg ggtgtggtcg ccatgatcgc 10620gtagtcgata gtggctccaa gtagcgaagc gagcaggact gggcggcggc caaagcggtc 10680ggacagtgct ccgagaacgg gtgcgcatag aaattgcatc aacgcatata gcgctagcag 10740cacgccatag tgactggcga tgctgtcgga atggacgata tcccgcaaga ggcccggcag 10800taccggcata accaagccta tgcctacagc atccagggtg acggtgccga ggatgacgat 10860gagcgcattg ttagatttca tacacggtgc ctgactgcgt tagcaattta actgtgataa 10920actaccgcat taaagctagc ttgcttggtc gttccgcgtg aacgtcggct cgattgtacc 10980tgcgttcaaa tactttgcga tcgtgttgcg cgcctgcccg gtgcgtcggc tgatctcacg 11040gatcgactgc ttctctcgca acgccatccg acggatgatg tttaaaagtc ccatgtggat 11100cactccgttg ccccgtcgct caccgtgttg gggggaaggt gcacatggct cagttctcaa 11160tggaaattat ctgcctaacc ggctcagttc tgcgtagaaa ccaacatgca agctccaccg 11220ggtgcaaagc ggcagcggcg gcaggatata ttcaattgta aatggcttca tgtccgggaa 11280atctacatgg atcagcaatg agtatgatgg tcaatatgga gaaaaagaaa gagtaattac 11340caattttttt tcaattcaaa aatgtagatg tccgcagcgt tattataaaa tgaaagtaca 11400ttttgataaa acgacaaatt acgatccgtc gtatttatag gcgaaagcaa taaacaaatt 11460attctaattc ggaaatcttt 
atttcgacgt gtctacattc acgtccaaat gggggcttag 11520atgagaaact tcacgatcga tgccttgatt tcgccattcc cagataccca tttcatcttc 11580agattggtct gagattatgc gaaaatatac actcatatac ataaatactg acagtttgag 11640ctaccaattc agtgtagccc attacctcac ataattcact caaatgctag gcagtctgtc 11700aactcggcgt caatttgtcg gccactatac gatagttgcg caaattttca aagtcctggc 11760ctaacatcac acctctgtcg gcggcgggtc ccatttgtga taaatccacc atatcgaatt 11820aattcagact cctttgcccc agagatcaca atggacgact tcctctatct ctacgatcta 11880gtcaggaagt tcgacggaga aggtgacgat accatgttca ccactgataa tgagaagatt 11940agccttttca atttcagaaa gaatgctaac ccacagatgg ttagagaggc ttacgcagca 12000ggtctcatca agacgatcta cccgagcaat aatctccagg agatcaaata ccttcccaag 12060aaggttaaag atgcagtcaa aagattcagg actaactgca tcaagaacac agagaaagat 12120atatttctca agatcagaag tactattcca gtatggacga ttcaaggctt gcttcacaaa 12180ccaaggcaag taatagagat tggagtctct aaaaaggtag ttcccactga atcaaaggcc 12240atggagtcaa agattcaaat agaggaccta acagaactcg ccgtaaagac tggcgaacag 12300ttcatacaga gtctcttacg actcaatgac aagaagaaaa tcttcgtcaa catggtggag 12360cacgacacgc ttgtctactc caaaaatatc aaagatacag tctcagaaga ccaaagggca 12420attgagactt ttcaacaaag ggtaatatcc ggaaacctcc tcggattcca ttgcccagct 12480atctgtcact ttattgtgaa gatagtggaa aaggaaggtg gctcctacaa atgccatcat 12540tgcgataaag gaaaggccat cgttgaagat gcctctgccg acagtggtcc caaagatgga 12600cccccaccca cgaggagcat cgtggaaaaa gaagacgttc caaccacgtc ttcaaagcaa 12660gtggattgat gtgatatctc cactgacgta agggatgacg cacaatccca ctatccttcg 12720caagaccctt cctctatata aggaagttca tttcatttgg agaggacacg ctgaaatcac 12780cagtctccaa gcttgcgggg atcgtttcgc atgattgaac aagatggatt gcacgcaggt 12840tctccggccg cttgggtgga gaggctattc ggctatgact gggcacaaca gacaatcggc 12900tgctctgatg ccgccgtgtt ccggctgtca gcgcaggggc gcccggttct ttttgtcaag 12960accgacctgt ccggtgccct gaatgaactg caggacgagg cagcgcggct atcgtggctg 13020gccacgacgg gcgttccttg cgcagctgtg ctcgacgttg tcactgaagc gggaagggac 13080tggctgctat tgggcgaagt gccggggcag gatctcctgt catctcacct tgctcctgcc 13140gagaaagtat ccatcatggc tgatgcaatg 
cggcggctgc atacgcttga tccggctacc 13200tgcccattcg accaccaagc gaaacatcgc atcgagcgag cacgtactcg gatggaagcc 13260ggtcttgtcg atcaggatga tctggacgaa gagcatcagg ggctcgcgcc agccgaactg 13320ttcgccaggc tcaaggcgcg catgcccgac ggcgaggatc tcgtcgtgac ccatggcgat 13380gcctgcttgc cgaatatcat ggtggaaaat ggccgctttt ctggattcat cgactgtggc 13440cggctgggtg tggcggaccg ctatcaggac atagcgttgg ctacccgtga tattgctgaa 13500gagcttggcg gcgaatgggc tgaccgcttc ctcgtgcttt acggtatcgc cgctcccgat 13560tcgcagcgca tcgccttcta tcgccttctt gacgagttct tctgagcggg actctggggt 13620tcgaaatgac cgaccaagcg acgcccaacc tgccatcacg agatttcgat tccaccgccg 13680ccttctatga aaggttgggc ttcggaatcg ttttccggga cgccggctgg atgatcctcc 13740agcgcgggga tctcatgctg gagttcttcg cccaccccgg atcgatccaa cacttacgtt 13800tgcaacgtcc aagagcaaat agaccacgaa cgccggaagg ttgccgcagc gtgtggattg 13860cgtctcaatt ctctcttgca ggaatgcaat gatgaatatg atactgacta tgaaactttg 13920agggaatact gcctagcacc gtcacctcat aacgtgcatc atgcatgccc tgacaacatg 13980gaacatcgct atttttctga agaattatgc tcgttggagg atgtcgcggc aattgcagct 14040attgccaaca tcgaactacc cctcacgcat gcattcatca atattattca tgcggggaaa 14100ggcaagatta atccaactgg caaatcatcc agcgtgattg gtaacttcag ttccagcgac 14160ttgattcgtt ttggtgctac ccacgttttc aataaggacg agatggtgga gtaaagaagg 14220agtgcgtcga agcagatcgt tcaaacattt ggcaataaag tttcttaaga ttgaatcctg 14280ttgccggtct tgcgatgatt atcatataat ttctgttgaa ttacgttaag catgtaataa 14340ttaacatgta atgcatgacg ttatttatga gatgggtttt tatgattaga gtcccgcaat 14400tatacattta atacgcgata gaaaacaaaa tatagcgcgc aaactaggat aaattatcgc 14460gcgcggtgtc atctatgtta ctagatcgat caaacttcgg tactgtgtaa tgacgatgag 14520caatcgagag gctgactaac aaaaggtaca tcgcgatgga tcgatccatt cgccattcag 14580gctgcgcaac tgttgggaag ggcgatcggt gcgggcctct tcgctattac gccagctggc 14640gaaaggggga tgtgctgcaa ggcgattaag ttgggtaacg ccagggtttt cccagtcacg 14700acgttgtaaa acgacggcca gtgaattcct gcagcccggg ggatccgccc actcgaggcg 14760cgccaagctt gcatgcctgc aggctagcct aagtacgtac tcaaaatgcc aacaaataaa 14820aaaaaagttg ctttaataat gccaaaacaa attaataaaa 
cacttacaac accggatttt 14880ttttaattaa aatgtgccat ttaggataaa tagttaatat ttttaataat tatttaaaaa 14940gccgtatcta ctaaaatgat ttttatttgg ttgaaaatat taatatgttt aaatcaacac 15000aatctatcaa aattaaacta aaaaaaaaat aagtgtacgt ggttaacatt agtacagtaa 15060tataagagga aaatgagaaa ttaagaaatt gaaagcgagt ctaattttta aattatgaac 15120ctgcatatat aaaaggaaag aaagaatcca ggaagaaaag aaatgaaacc atgcatggtc 15180ccctcgtcat cacgagtttc tgccatttgc aatagaaaca ctgaaacacc tttctctttg 15240tcacttaatt gagatgccga agccacctca caccatgaac ttcatgaggt gtagcaccca 15300aggcttccat agccatgcat actgaagaat gtctcaagct cagcacccta cttctgtgac 15360gtgtccctca ttcaccttcc tctcttccct ataaataacc acgcctcagg ttctccgctt 15420cacaactcaa acattctctc cattggtcct taaacactca tcagtcatca ccgcacaagt 15480ttgtacaaaa aagcaggcta 1550025585DNABrassica napus 25gatattcgta tggttttgtt gcatgtccat caccacggtc ttcatctcca tcccagaatc 60tcgatcgcca catcgcccga ttacaatcgc ctccggaaaa gccttaacga tgtgcttttg 120agcatgagat ttggactaac gcgagatctc cctctgaaac gatcatcatt cgcctattat 180tccggatctc gagaacaaca gcccatcacc atggcgacca agggcgacaa gacttcgacg 240gaggtgaaag aaaaggtagt ggaggagaag aaggataatg ataagaagga ggaggtatcg 300ctcccaccgc cgccggagaa accagaggct ggcgattgtt gcggtagcgg ttgcgtccga 360tgcgtttggg atgtgtatta cgatgaactc gaagaataca acaagcttac tgctttcgct 420cctggagata ctaaatccaa ttgattgaat tgctttgttc tctattgttg ttagattcgc 480tcctggagat actaaatcca attgattgaa ttgctttgtt ctctattgtt gttagaaaaa 540gttaaacaat cgctttgttc gaataaaaag tactgatcga ccata 58526144PRTBrassica napus 26Met Val Leu Leu His Val His His His Gly Leu His Leu His Pro Arg1 5 10 15Ile Ser Ile Ala Thr Ser Pro Asp Tyr Asn Arg Leu Arg Lys Ser Leu 20 25 30Asn Asp Val Leu Leu Ser Met Arg Phe Gly Leu Thr Arg Asp Leu Pro 35 40 45Leu Lys Arg Ser Ser Phe Ala Tyr Tyr Ser Gly Ser Arg Glu Gln Gln 50 55 60Pro Ile Thr Met Ala Thr Lys Gly Asp Lys Thr Ser Thr Glu Val Lys65 70 75 80Glu Lys Val Val Glu Glu Lys Lys Asp Asn Asp Lys Lys Glu Glu Val 85 90 95Ser Leu Pro Pro Pro Pro Glu Lys Pro Glu Ala Gly Asp Cys Cys Gly 100 105 110Ser 
Gly Cys Val Arg Cys Val Trp Asp Val Tyr Tyr Asp Glu Leu Glu 115 120 125Glu Tyr Asn Lys Leu Thr Ala Phe Ala Pro Gly Asp Thr Lys Ser Asn 130 135 14027552DNABrassica napus 27ggattcaaaa aagtcttcga ttttcccgaa gggttttgtt gcatgttcat caccacggtc 60ttcatctcca tcccagaatc tcgatcgcca catcgcccga ttacaatcgc ctccggaaaa 120gccttaacga tgtgcttttg agcatgagat ttggactaac gcgagatctc cctctgaaac 180gatcatcatt cgcctattat tccggatctc gagaacaaca gcccatcacc atggcgacca 240agggcgacaa gacttcgacg gaggtgaaag aaaaggtagt ggaggagaag aaggataatg 300ataagaagga ggaggtatcg ctcccaccgc cgccggagaa accagaggct ggcgattgtt 360gcggtagcgg ttgcgtccga tgcgtttggg atgtgtatta cgatgagctc gaagaataca 420acaagcttac tgcttccgct cctggagata ctaaatccaa ttgattgaat tgctttgttc 480tctattgttg ttagaaaaag ttaaacaatc gctttgttcg aataaaaagt actgatcgac 540cattttaaac ga 55228106PRTBrassica napus 28Met Arg Phe Gly Leu Thr Arg Asp Leu Pro Leu Lys Arg Ser Ser Phe1 5 10 15Ala Tyr Tyr Ser Gly Ser Arg Glu Gln Gln Pro Ile Thr Met Ala Thr 20 25 30Lys Gly Asp Lys Thr Ser Thr Glu Val Lys Glu Lys Val Val Glu Glu 35 40 45Lys Lys Asp Asn Asp Lys Lys Glu Glu Val Ser Leu Pro Pro Pro Pro 50 55 60Glu Lys Pro Glu Ala Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys65 70 75 80Val Trp Asp Val Tyr Tyr Asp Glu Leu Glu Glu Tyr Asn Lys Leu Thr 85 90 95Ala Ser Ala Pro Gly Asp Thr Lys Ser Asn 100 10529600DNABrassica napus 29gagcaaaaga agtcttcgat atattcgtat ggttttgttc catcaccacg gtcttcatct 60ccatcccaga atctcgatca ccacatcgcc cggttacaat cgactccgga aaagccttaa 120cgatgtgctt ctgagcatga gatttggact aacacgagat ctccgtctga aacgaccatc 180attcgcatac tattccggat ctcgaggaca acagcccatc accatggcga ccaagggcga 240caagacttcg acagaggtga aagataaggt agtggaggag aagaaggata tggataagga 300taagaaggaa gaggtatcgc tcccaccgcc gccggagaaa ccagaggctg gcgattgttg 360cggtagcggt tgcgtccgat gcgtttggga tgtgtattac gatgagctcg aagaatacaa 420caagcttact gcttccactc ctggagatac taaatccaat tgattgaatt gggattgctt 480tgttctgatt gttaccctat tgttgctaga aaaagttaaa caattgcttt gttctataat 540aaagactggt caagaactga tcgaccaata 
ttaaacgatt tcaatctttt tttcactgtg 60030144PRTBrassica napus 30Met Val Leu Phe His His His Gly Leu His Leu His Pro Arg Ile Ser1 5 10 15Ile Thr Thr Ser Pro Gly Tyr Asn Arg Leu Arg Lys Ser Leu Asn Asp 20 25 30Val Leu Leu Ser Met Arg Phe Gly Leu Thr Arg Asp Leu Arg Leu Lys 35 40 45Arg Pro Ser Phe Ala Tyr Tyr Ser Gly Ser Arg Gly Gln Gln Pro Ile 50 55 60Thr Met Ala Thr Lys Gly Asp Lys Thr Ser Thr Glu Val Lys Asp Lys65 70 75 80Val Val Glu Glu Lys Lys Asp Met Asp Lys Asp Lys Lys Glu Glu Val 85 90 95Ser Leu Pro Pro Pro Pro Glu Lys Pro Glu Ala Gly Asp Cys Cys Gly 100 105 110Ser Gly Cys Val Arg Cys Val Trp Asp Val Tyr Tyr Asp Glu Leu Glu 115 120 125Glu Tyr Asn Lys Leu Thr Ala Ser Thr Pro Gly Asp Thr Lys Ser Asn 130 135 14031619DNAHelianthus annuus 31aagctggagc tccaccgcgg tggcggccgc tctagaacta gtggatcccc cgggctgcag 60gaattcggca cgagctccga cgccatggca ccgttaaccg tcactcgcct atgagatcac 120agactctgca ccgactcacc accactttta accgatctca tctcaatcca attcaacctt 180ctctcagatc tgattcaaat ttcaacctca ccatggctga ttcaggttct aataataaaa 240tcaagtcaga tgacggttcg agcgccgtta aggacgcaac ggagacgaaa aagctgccgg 300agatccctcc gccgccggag aaaccgttgc cgggagactg ttgtggcagc ggttgtgttc 360ggtgcgtttg ggacgtgtat tacgacgagc ttgaagagta taataagatt tgtaaaggag 420gatctgattc tacagctgga tctaaggttt cgtaaacgtt ttgtagaaat tgtttgatta 480ttgattgtta tagatcaatt tgattattga ttgttataga tctatttgat gttcaaataa 540acgaattagt tcgatatctg tgttgtgagt ttcttgtcat gatgtgtctt tgtttacata 600taatcgatcg aatatgatt 61932114PRTHelianthus annuus 32Met Arg Ser Gln Thr Leu His Arg Leu Thr Thr Thr Phe Asn Arg Ser1 5 10 15His Leu Asn Pro Ile Gln Pro Ser Leu Arg Ser Asp Ser Asn Phe Asn 20 25 30Leu Thr Met Ala Asp Ser Gly Ser Asn Asn Lys Ile Lys Ser Asp Asp 35 40 45Gly Ser Ser Ala Val Lys Asp Ala Thr Glu Thr Lys Lys Leu Pro Glu 50 55 60Ile Pro Pro Pro Pro Glu Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser65 70 75 80Gly Cys Val Arg Cys Val Trp Asp Val Tyr Tyr Asp Glu Leu Glu Glu 85 90 95Tyr Asn Lys Ile Cys Lys Gly Gly Ser Asp Ser Thr Ala Gly Ser Lys 100 105 
110Val Ser 33219DNACastor canadensis 33atggccacca acaaaactga acctctagat tcaaaaacac acaatataaa taagaaagaa 60gaagaaaaga aattgccgcc gccgccgccg ccggagaagc cggagcctgg ggattgttgt 120ggaagcggat gtgttaggtg cgtatgggat gtgtattatg aagagcttga agaatataat 180aagctttatc aatcccattc tgattctaag cgcccttga 2193472PRTCastor canadensis 34Met Ala Thr Asn Lys Thr Glu Pro Leu Asp Ser Lys Thr His Asn Ile1 5 10 15Asn Lys Lys Glu Glu Glu Lys Lys Leu Pro Pro Pro Pro Pro Pro Glu 20 25 30Lys Pro Glu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val 35 40 45Trp Asp Val Tyr Tyr Glu Glu Leu Glu Glu Tyr Asn Lys Leu Tyr Gln 50 55 60Ser His Ser Asp Ser Lys Arg Pro65 7035771DNAGlycine max 35cggcagggtt acaatcttat cttcgtattg gacttcaatt gatcccaaag aaaaatatag 60agagagagag aatgtggtgg cggcggcgcc cgagaccatg agaactacag caccttccga 120tttcattttc acccaaaagc ttcacccttt caacatcacc tccaccaaaa cctccctcca 180acgaacccta ccctattttc tccaactcaa tcgcatggcc gaggctgcac gaaccgcgca 240taaacccgcg ccgcacccga tccaacccaa acccgacgat aaaaccccga atccggcgaa 300ggagattccg ccgccgccgg agaagccgga gcccggcgat tgctgcggca gcgggtgcgt 360ccgatgcgtc tgggatgtgt actacgacga actcgaagaa tacaataagc gatacaaaca 420ggtcgatccc agccccaaac cttcttcgta atcttcaaca tcgcttggat tagctttatt 480aatttattta tattacatcc taattttaaa aagctttggg tatttcttga tttcgtgaat 540tgtccctttt tatcaaaaag gatcgaaatg ttgtatgtgg aattatacat gtagaataaa 600ctgatttttt taaaaaaaat gccagggcta aaatgtacga tttatataat cccgaagatt 660aattcggaga tttacttctc agatcgcata attcccaagt tttttggtaa tagtacgctg 720tgttttttct ttcatgactt tgtttatgta ttttttataa ccattttgat a 77136117PRTGlycine max 36Met Arg Thr Thr Ala Pro Ser Asp Phe Ile Phe Thr Gln Lys Leu His1 5 10 15Pro Phe Asn Ile Thr Ser Thr Lys Thr Ser Leu Gln Arg Thr Leu Pro 20 25 30Tyr Phe Leu Gln Leu Asn Arg Met Ala Glu Ala Ala Arg Thr Ala His 35 40 45Lys Pro Ala Pro His Pro Ile Gln Pro Lys Pro Asp Asp Lys Thr Pro 50 55 60Asn Pro Ala Lys Glu Ile Pro Pro Pro Pro Glu Lys Pro Glu Pro Gly65 70 75 80Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp Val Tyr Tyr 
85 90 95Asp Glu Leu Glu Glu Tyr Asn Lys Arg Tyr Lys Gln Val Asp Pro Ser 100 105 110Pro Lys Pro Ser Ser 11537664DNAGlycine max 37ggacttcaaa ttcaattgat ccgtttccca acccaaagaa agagagagaa tgtggtggtg 60gcggcggcgg cgtagatctt gagaacatct ccttcttccg atttcatttt cacccaaaag 120cttctcgctt tcaacatcac cctcaccaaa acccctcttc aacgagccct actcttcttc 180tttctccatc ccaatcgaat ggccgagggt gcacgaaccg cgcatgcacc cgccccgcac 240ccgatccaac ccaaacccga cgataaaacc ccgaatccgg tgaaggagac tccgccgccg 300ccggagaaac cggagcccgg cgattgctgc ggcagcggat gcgtccggga cgtttactac 360gacgaactcg aagatacaat aagctataca aacaagacga tcccagcccc aaagcttctt 420catagtcttc atcatcgcat gggtggaagt gttatgggtc gatgattgtt gggttattgt 480cgtcgtcaat acaccaggta tgttgttact gggtgagtgt gttaagtgat tcgtaaggca 540aattttaaca tatagatcaa cttgaattat atggatgaag ttgattcgta agttgataat 600aaactaacgg atcatgttga tttgtattga ttacagattt tgatttttta aaaatttctt 660aaaa 6643888PRTGlycine max 38Met Ala Glu Gly Ala Arg Thr Ala His Ala Pro Ala Pro His Pro Ile1 5 10 15Gln Pro Lys Pro Asp Asp Lys Thr Pro Asn Pro Val Lys Glu Thr Pro 20 25 30Pro Pro Pro Glu Lys Pro Glu Pro Gly Asp Cys Cys Gly Ser Gly Cys 35 40 45Val Arg Asp Val Tyr Tyr Asp Glu Leu Glu Asp Thr Ile Ser Tyr Thr 50 55 60Asn Lys Thr Ile Pro Ala Pro Lys Leu Leu His Ser Leu His His Arg65 70 75 80Met Gly Gly Ser Val Met Gly Arg 8539689DNAZea mays 39 ggggctgtat gctgggcgcc gtcgtccgtg tcccgggccc gatcctacct ttcctgcccg
60ggccgacgcg cccctctcct ccgccgccgc cactacctcc cgcccgaaac gcccatggcc 120tcggccaccc cttgcgatgg cggcaccggg aagcccgacg ccgcgccggc tcccacgccc 180gcgccaacgc cgctgccgcc cgagaagcct ctcccgggcg actgctgcgg cagcggctgc 240gttcgctgcg tctgggacat atatttcgac gagctcgacg cgtacgacaa ggccctcgcc 300gcgcgcgcgg cctcctcagg ctccggcggc aaggacgact ctgctgatac caagcccaaa 360gaaggcaaga caacaaggtg aaagaaacca aagcgtgagg ccaacctgtt gcagttggaa 420acattgaacc tgtccccggc gacgcattgc cctttccacc gccgcggagc ctcgctcatg 480ccgtcgtctc taaaactggc cgactctggc cagattcctg caaagcgcgg accaccagga 540cacctcagtc ttcgaactga aatgtcagtc ctaatcccag ttgctactga aagaaaagaa 600agtgaaggga aatctcctca ccagtgtcta gcacaccgat aatggaatcc tcaccgaacc 660tactctctgg gaggattcca gccgaatgc 6894088PRTZea mays 40Met Ala Ser Ala Thr Pro Cys Asp Gly Gly Thr Gly Lys Pro Asp Ala1 5 10 15Ala Pro Ala Pro Thr Pro Ala Pro Thr Pro Leu Pro Pro Glu Lys Pro 20 25 30Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp 35 40 45Ile Tyr Phe Asp Glu Leu Asp Ala Tyr Asp Lys Ala Leu Ala Ala Arg 50 55 60Ala Ala Ser Ser Gly Ser Gly Gly Lys Asp Asp Ser Ala Asp Thr Lys65 70 75 80Pro Lys Glu Gly Lys Thr Thr Arg 8541503DNAZea mays 41 gaggcgccgt cgtccgtgtc ccgggcccga tcctgccttt cctgcccggg ccgacgcgcc 60ctttcctccg ccgccgccac tacctcccgg cccgagacgc ccatggcctc ggccacccct 120tgcgatggcg gcaccgggaa gcccgacgcc gcgccggctc ccacgcccgc gccaacgccg 180ctgccgcccg agaagcctct cccgggcgac tgctgcggca gcggctgcgt ccgctgcgtc 240tgggacatat atttcgacga gctcgacgcc tacgacaagg ccgtcgccgc ccacgcggcc 300tcctcaggct ccggcggcaa ggacgactcc gctgatacca agcccaacga aggtgccaag 360tcctgaagtg cgcctctcat gtgtaatgac ctcttctgct ctgaactgaa tttagattac 420tggcgttcac atacgccact accaattctt agcactcgaa acattacagt accgttgtgc 480ctgctgtctt aatatgctta gac 5034287PRTZea mays 42Met Ala Ser Ala Thr Pro Cys Asp Gly Gly Thr Gly Lys Pro Asp Ala1 5 10 15Ala Pro Ala Pro Thr Pro Ala Pro Thr Pro Leu Pro Pro Glu Lys Pro 20 25 30Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp 35 40 45Ile Tyr Phe Asp Glu Leu 
Asp Ala Tyr Asp Lys Ala Val Ala Ala His 50 55 60Ala Ala Ser Ser Gly Ser Gly Gly Lys Asp Asp Ser Ala Asp Thr Lys65 70 75 80Pro Asn Glu Gly Ala Lys Ser 8543674DNAZea mays 43cttgtggcag ttggattctc ctgcgtccca atcacgcagc ctcttggcct ccgcgaccgt 60cccctcccct cccctttcga cgagcgaggg gctgtatgct gggcgccgtc gtccgtgtcc 120cgggcccgat cctacctttc ctgcccgggc cgacgcgccc tctcctccgc cgccgccact 180acctcccgcc cgagacgccc atggcctcgg ccaccccttg cgatggcggc accgggaagc 240ccgacgccgc gccggctccc acgcccgcgc caacgccgct gccgcccgag aagcctctcc 300cgggcgactg ctgcggcagc ggctgcgttc gctgcgtctg ggacatatat ttcgacgagc 360tcgacgcgta cgacaaggcc ctcgccgcgc acgcggcctc ctcaggctcc ggcggcaagg 420acgactctgc tgataccaag cccaaagaag gtgccaaatc ctgaagtgcg cctctcatgt 480gtaatgacct cttctgctct gaactgaatt agattgctgg cgtctcacca gattcacata 540cgctggtacc aattcttagc actcgagaca ttacaatact cttgtgcctg ctgtgttata 600tgctagatta agatgcttta tcaattcagc ctccttattg tgtaacagga ggaagttatg 660aaacaaaaaa aaaa 67444122PRTZea mays 44Met Leu Gly Ala Val Val Arg Val Pro Gly Pro Ile Leu Pro Phe Leu1 5 10 15Pro Gly Pro Thr Arg Pro Leu Leu Arg Arg Arg His Tyr Leu Pro Pro 20 25 30Glu Thr Pro Met Ala Ser Ala Thr Pro Cys Asp Gly Gly Thr Gly Lys 35 40 45Pro Asp Ala Ala Pro Ala Pro Thr Pro Ala Pro Thr Pro Leu Pro Pro 50 55 60Glu Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys65 70 75 80Val Trp Asp Ile Tyr Phe Asp Glu Leu Asp Ala Tyr Asp Lys Ala Leu 85 90 95Ala Ala His Ala Ala Ser Ser Gly Ser Gly Gly Lys Asp Asp Ser Ala 100 105 110Asp Thr Lys Pro Lys Glu Gly Ala Lys Ser 115 12045653DNAOryza sativa 45cccttctctt ccgcatgctg gtcgccgccc tccgcgtccc ggcgccgatc ccctcgtcgc 60tcccctcgcc ggcgcgccct ctcctccgcc gccgcagcag ccaccgcctg ccccctcccc 120cgccccccgc cgcgtcaatg gccgacgccg gcggcgccac cacgaacaag cccgctccgg 180ccccggcccc ggagccgccc gagaagccgc tccccggcga ctgctgcggc agcggctgcg 240tccgctgcgt ctgggacgtc tactacgacg agctcgacgc ctacaataag gctctcgccg 300cccactcctc gtcggcatcc tccggcagca agcccgctac cagcgacggc gccaaatcat 360gaggcgaatc aggattcagg agttctgagg 
acgacttgca gtatgcgtcc cttcctctct 420tttcattttt tttccccttc cccaaatcgg ggtcttggtg tggtactcct accagctagt 480agtattaaaa ttactcgttt gattatagtg aaacatttgt gttatctcat tgtgtatgct 540gcaatttgta ctagagtgga atggttgttg ttccaacgaa aaattccctg attacataca 600gagaattgtt catggatagt tcttgtgtaa caaacattag cattttggca gaa 65346115PRTOryza sativa 46Met Leu Val Ala Ala Leu Arg Val Pro Ala Pro Ile Pro Ser Ser Leu1 5 10 15Pro Ser Pro Ala Arg Pro Leu Leu Arg Arg Arg Ser Ser His Arg Leu 20 25 30Pro Pro Pro Pro Pro Pro Ala Ala Ser Met Ala Asp Ala Gly Gly Ala 35 40 45Thr Thr Asn Lys Pro Ala Pro Ala Pro Ala Pro Glu Pro Pro Glu Lys 50 55 60Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp65 70 75 80Asp Val Tyr Tyr Asp Glu Leu Asp Ala Tyr Asn Lys Ala Leu Ala Ala 85 90 95His Ser Ser Ser Ala Ser Ser Gly Ser Lys Pro Ala Thr Ser Asp Gly 100 105 110Ala Lys Ser 11547399DNASorghum bicolor 47atgctgggcg ccgtcgtccg tgtcccggcg ccgatcctgc tgcctctcct ccccggaccg 60acgcgccctc tcctcctccg ccgccgccgc cactgcctcc cgcccgaggc gcccatggcc 120tcggccaccc ctagcgacgg cggcgccgcg aagcccgatg ccgcgcccgc gcccgtgccc 180gtgcccgcgc ccgcgccaac gccgctgccg ctgccgcccg agaagcctct cccgggcgac 240tgctgcggca gcggctgcgt gcgctgcgtc tgggacatat atttcgacga gctcgacgcg 300tacgacaagg cgctcgccgc ccacgcggcc gcctcctcag gctccggcgc caaggacgac 360tccgccgata ccaagcccag cgacggcgcc aagtcctga 39948132PRTSorghum bicolor 48Met Leu Gly Ala Val Val Arg Val Pro Ala Pro Ile Leu Leu Pro Leu1 5 10 15Leu Pro Gly Pro Thr Arg Pro Leu Leu Leu Arg Arg Arg Arg His Cys 20 25 30Leu Pro Pro Glu Ala Pro Met Ala Ser Ala Thr Pro Ser Asp Gly Gly 35 40 45Ala Ala Lys Pro Asp Ala Ala Pro Ala Pro Val Pro Val Pro Ala Pro 50 55 60Ala Pro Thr Pro Leu Pro Leu Pro Pro Glu Lys Pro Leu Pro Gly Asp65 70 75 80Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp Ile Tyr Phe Asp 85 90 95Glu Leu Asp Ala Tyr Asp Lys Ala Leu Ala Ala His Ala Ala Ala Ser 100 105 110Ser Gly Ser Gly Ala Lys Asp Asp Ser Ala Asp Thr Lys Pro Ser Asp 115 120 125Gly Ala Lys Ser 1304934DNAartificial sequencelinker 
49ggcgcgccaa gcttggatcc gtcgacggcg cgcc 34504974DNAartificial sequencevector 50ggccgccgac tcgacgatga gcgagatgac cagctccggc cgcgacacaa gtgtgagagt 60actaaataaa tgctttggtt gtacgaaatc attacactaa ataaaataat caaagcttat 120atatgccttc cgctaaggcc gaatgcaaag aaattggttc tttctcgtta tcttttgcca 180cttttactag tacgtattaa ttactactta atcatctttg tttacggctc attatatccg 240tcgacggcgc gcccgatcat ccggatatag ttcctccttt cagcaaaaaa cccctcaaga 300cccgtttaga ggccccaagg ggttatgcta gttattgctc agcggtggca gcagccaact 360cagcttcctt tcgggctttg ttagcagccg gatcgatcca agctgtacct cactattcct 420ttgccctcgg acgagtgctg gggcgtcggt ttccactatc ggcgagtact tctacacagc 480catcggtcca gacggccgcg cttctgcggg cgatttgtgt acgcccgaca gtcccggctc 540cggatcggac gattgcgtcg catcgaccct gcgcccaagc tgcatcatcg aaattgccgt 600caaccaagct ctgatagagt tggtcaagac caatgcggag catatacgcc cggagccgcg 660gcgatcctgc aagctccgga tgcctccgct cgaagtagcg cgtctgctgc tccatacaag 720ccaaccacgg cctccagaag aagatgttgg cgacctcgta ttgggaatcc ccgaacatcg 780cctcgctcca gtcaatgacc gctgttatgc ggccattgtc cgtcaggaca ttgttggagc 840cgaaatccgc gtgcacgagg tgccggactt cggggcagtc ctcggcccaa agcatcagct 900catcgagagc ctgcgcgacg gacgcactga cggtgtcgtc catcacagtt tgccagtgat 960acacatgggg atcagcaatc gcgcatatga aatcacgcca tgtagtgtat tgaccgattc 1020cttgcggtcc gaatgggccg aacccgctcg tctggctaag atcggccgca gcgatcgcat 1080ccatagcctc cgcgaccggc tgcagaacag cgggcagttc ggtttcaggc aggtcttgca 1140acgtgacacc ctgtgcacgg cgggagatgc aataggtcag gctctcgctg aattccccaa 1200tgtcaagcac ttccggaatc gggagcgcgg ccgatgcaaa gtgccgataa acataacgat 1260ctttgtagaa accatcggcg cagctattta cccgcaggac atatccacgc cctcctacat 1320cgaagctgaa agcacgagat tcttcgccct ccgagagctg catcaggtcg gagacgctgt 1380cgaacttttc gatcagaaac ttctcgacag acgtcgcggt gagttcaggc ttttccatgg 1440gtatatctcc ttcttaaagt taaacaaaat tatttctaga gggaaaccgt tgtggtctcc 1500ctatagtgag tcgtattaat ttcgcgggat cgagatctga tcaacctgca ttaatgaatc 1560ggccaacgcg cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact 1620gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat 
cagctcactc aaaggcggta 1680atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag 1740caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc 1800cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta 1860taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg 1920ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcaatgc 1980tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac 2040gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac 2100ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg 2160aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga 2220aggacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt 2280agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag 2340cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct 2400gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgacatt aacctataaa 2460aataggcgta tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc 2520tgacacatgc agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga 2580caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg 2640gcatcagagc agattgtact gagagtgcac catatggaca tattgtcgtt agaacgcggc 2700tacaattaat acataacctt atgtatcata cacatacgat ttaggtgaca ctatagaacg 2760gcgcgccaag cttggatcct cgaagagaag ggttaataac acatttttta acatttttaa 2820cacaaatttt agttatttaa aaatttatta aaaaatttaa aataagaaga ggaactcttt 2880aaataaatct aacttacaaa atttatgatt tttaataagt tttcaccaat aaaaaatgtc 2940ataaaaatat gttaaaaagt atattatcaa tattctcttt atgataaata aaaagaaaaa 3000aaaaataaaa gttaagtgaa aatgagattg aagtgacttt aggtgtgtat aaatatatca 3060accccgccaa caatttattt aatccaaata tattgaagta tattattcca tagcctttat 3120ttatttatat atttattata taaaagcttt atttgttcta ggttgttcat gaaatatttt 3180tttggtttta tctccgttgt aagaaaatca tgtgctttgt gtcgccactc actattgcag 3240ctttttcatg cattggtcag attgacggtt gattgtattt ttgtttttta tggttttgtg 3300ttatgactta agtcttcatc tctttatctc ttcatcaggt ttgatggtta cctaatatgg 3360tccatgggta 
catgcatggt taaattaggt ggccaacttt gttgtgaacg atagaatttt 3420ttttatatta agtaaactat ttttatatta tgaaataata ataaaaaaaa tattttatca 3480ttattaacaa aatcatatta gttaatttgt taactctata ataaaagaaa tactgtaaca 3540ttcacattac atggtaacat ctttccaccc tttcatttgt tttttgtttg atgacttttt 3600ttcttgttta aatttatttc ccttctttta aatttggaat acattatcat catatataaa 3660ctaaaatact aaaaacagga ttacacaaat gataaataat aacacaaata tttataaatc 3720tagctgcaat atatttaaac tagctatatc gatattgtaa aataaaacta gctgcattga 3780tactgataaa aaaatatcat gtgctttctg gactgatgat gcagtatact tttgacattg 3840cctttatttt atttttcaga aaagctttct tagttctggg ttcttcatta tttgtttccc 3900atctccattg tgaattgaat catttgcttc gtgtcacaaa tacaatttag ntaggtacat 3960gcattggtca gattcacggt ttattatgtc atgacttaag ttcatggtag tacattacct 4020gccacgcatg cattatattg gttagatttg ataggcaaat ttggttgtca acaatataaa 4080tataaataat gtttttatat tacgaaataa cagtgatcaa aacaaacagt tttatcttta 4140ttaacaagat tttgtttttg tttgatgacg ttttttaatg tttacgcttt cccccttctt 4200ttgaatttag aacactttat catcataaaa tcaaatacta aaaaaattac atatttcata 4260aataataaca caaatatttt taaaaaatct gaaataataa tgaacaatat tacatattat 4320cacgaaaatt cattaataaa aatattatat aaataaaatg taatagtagt tatatgtagg 4380aaaaaagtac tgcacgcata atatatacaa aaagattaaa atgaactatt ataaataata 4440acactaaatt aatggtgaat catatcaaaa taatgaaaaa gtaaataaaa tttgtaatta 4500acttctatat gtattacaca cacaaataat aaataatagt aaaaaaaatt atgataaata 4560tttaccatct cataagatat ttaaaataat gataaaaata tagattattt tttatgcaac 4620tagctagcca aaaagagaac acgggtatat ataaaaagag tacctttaaa ttctactgta 4680cttcctttat tcctgacgtt tttatatcaa gtggacatac gtgaagattt taattatcag 4740tctaaatatt tcattagcac ttaatacttt tctgttttat tcctatccta taagtagtcc 4800cgattctccc aacattgctt attcacacaa ctaactaaga aagtcttcca tagcccccca 4860agcggccgga gctggtcatc tcgctcatcg tcgagtcggc ggccggagct ggtcatctcg 4920ctcatcgtcg agtcggcggc cgccgactcg acgatgagcg agatgaccag ctcc 49745180DNAartificial sequenceEagI ELVISLIVES sequence 51cggccggagc tggtcatctc gctcatcgtc gagtcggcgg ccgccgactc gacgatgagc 
60gagatgacca gctccggccg 8052118DNAartificial sequence2XELVISLIVES 52cggccggagc tggtcatctc gctcatcgtc gagtcggcgg ccggagctgg tcatctcgct 60catcgtcgag tcggcggccg ccgactcgac gatgagcgag atgaccagct ccggccgc 1185392DNAartificial sequenceprimer 53gaattccggc cggagctggt catctcgctc atcgtcgagt cggcggccgc cgactcgacg 60atgagcgaga tgaccagctc cggccggaat tc 925415DNAartificial sequenceprimer 54gaattccggc cggag 155529DNAartificial sequenceprimer 55gcggccgcat gtataattcc acatacaac 295629DNAartificial sequenceprimer 56gcggccgcat gtataattcc acatacaac 295726DNAartificial sequenceprimer 57tctagaccac atacaacatt tcgatc 265826DNAartificial sequenceprimer 58tctagaccac atacaacatt tcgatc 26593402DNAartificial sequenceplasmid 59gcggccgctc aatcgcatgg ccgaggctgc acgaaccgcg cataaacccg cgccgcaccc 60gatccaaccc aaacccgacg ataaaacccc gaatccggcg aaggagattc cgccgccgcc 120ggagaagccg gagcccggcg attgctgcgg cagcgggtgc gtccgatgcg tctgggatgt 180gtactacgac gaactcgaag aatacaataa gcgatacaaa caggtcgatc ccagccccaa 240accttcttcg taatcttcaa catcgcttgg attagcttta ttaatttatt tatattacat 300cctaatttta aaaagctttg ggtatttctt gatttcgtga attgtccctt tttatcaaaa 360aggatcgaaa tgttgtatgt ggaattatac atgcggccgc aatcactagt gcggccgcct 420gcaggtcgac catatgggag agctcccaac gcgttggatg catagcttga gtattctata 480gtgtcaccta aatagcttgg cgtaatcatg gtcatagctg tttcctgtgt gaaattgtta 540tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag cctggggtgc 600ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt tccagtcggg 660aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag gcggtttgcg 720tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg 780gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa 840cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc 900gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc 960aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag 1020ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct 1080cccttcggga agcgtggcgc tttctcatag ctcacgctgt 
aggtatctca gttcggtgta 1140ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc 1200cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc 1260agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt 1320gaagtggtgg cctaactacg gctacactag aagaacagta tttggtatct gcgctctgct 1380gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc 1440tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca 1500agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta 1560agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa 1620atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg 1680cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg 1740actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc 1800aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc 1860cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa 1920ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc 1980cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat tcagctccgg 2040ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag cggttagctc 2100cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat 2160ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt ctgtgactgg 2220tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc 2280ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg 2340aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat 2400gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg 2460gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg 2520ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg
gttattgtct 2580catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac 2640atttccccga aaagtgccac ctgatgcggt gtgaaatacc gcacagatgc gtaaggagaa 2700aataccgcat caggaaattg taagcgttaa tattttgtta aaattcgcgt taaatttttg 2760ttaaatcagc tcatttttta accaataggc cgaaatcggc aaaatccctt ataaatcaaa 2820agaatagacc gagatagggt tgagtgttgt tccagtttgg aacaagagtc cactattaaa 2880gaacgtggac tccaacgtca aagggcgaaa aaccgtctat cagggcgatg gcccactacg 2940tgaaccatca ccctaatcaa gttttttggg gtcgaggtgc cgtaaagcac taaatcggaa 3000ccctaaaggg agcccccgat ttagagcttg acggggaaag ccggcgaacg tggcgagaaa 3060ggaagggaag aaagcgaaag gagcgggcgc tagggcgctg gcaagtgtag cggtcacgct 3120gcgcgtaacc accacacccg ccgcgcttaa tgcgccgcta cagggcgcgt ccattcgcca 3180ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg cctcttcgct attacgccag 3240ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg taacgccagg gttttcccag 3300tcacgacgtt gtaaaacgac ggccagtgaa ttgtaatacg actcactata gggcgaattg 3360ggcccgacgt cgcatgctcc cggccgccat ggccgcggga tt 3402603204DNAartificial sequenceplasmid 60aatcactagt gcggccgcct gcaggtcgac catatgggag agctcccaac gcgttggatg 60catagcttga gtattctata gtgtcaccta aatagcttgg cgtaatcatg gtcatagctg 120tttcctgtgt gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata 180aagtgtaaag cctggggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca 240ctgcccgctt tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc 300gcggggagag gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg 360cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 420tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 480aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 540catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 600caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 660ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 720aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 780gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 840cacgacttat 
cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 900ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta 960tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 1020tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 1080cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 1140tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 1200tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact 1260tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt 1320cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta 1380ccatctggcc ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta 1440tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc 1500gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat 1560agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt 1620atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg 1680tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca 1740gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta 1800agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg 1860cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact 1920ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg 1980ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt 2040actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga 2100ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc 2160atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa 2220caaatagggg ttccgcgcac atttccccga aaagtgccac ctgatgcggt gtgaaatacc 2280gcacagatgc gtaaggagaa aataccgcat caggaaattg taagcgttaa tattttgtta 2340aaattcgcgt taaatttttg ttaaatcagc tcatttttta accaataggc cgaaatcggc 2400aaaatccctt ataaatcaaa agaatagacc gagatagggt tgagtgttgt tccagtttgg 2460aacaagagtc cactattaaa gaacgtggac tccaacgtca aagggcgaaa aaccgtctat 2520cagggcgatg gcccactacg tgaaccatca ccctaatcaa 
gttttttggg gtcgaggtgc 2580cgtaaagcac taaatcggaa ccctaaaggg agcccccgat ttagagcttg acggggaaag 2640ccggcgaacg tggcgagaaa ggaagggaag aaagcgaaag gagcgggcgc tagggcgctg 2700gcaagtgtag cggtcacgct gcgcgtaacc accacacccg ccgcgcttaa tgcgccgcta 2760cagggcgcgt ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg 2820cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg 2880taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtgaa ttgtaatacg 2940actcactata gggcgaattg ggcccgacgt cgcatgctcc cggccgccat ggccgcggga 3000ttggatccac tcgaagaata caataagcga tacaaacagg tcgatcccag ccccaaacct 3060tcttcgtaat cttcaacatc gcttggatta gctttattaa tttatttata ttacatccta 3120attttaaaaa gctttgggta tttcttgatt tcgtgaattg tcccttttta tcaaaaagga 3180tcgaaatgtt gtatgtggtc taga 3204613790DNAartificial sequenceplasmid 61gatccactcg aagaatacaa taagcgatac aaacaggtcg atcccagccc caaaccttct 60tcgtaatctt caacatcgct tggattagct ttattaattt atttatatta catcctaatt 120ttaaaaagct ttgggtattt cttgatttcg tgaattgtcc ctttttatca aaaaggatcg 180aaatgttgta tgtggtctag aaatcactag tgcggccgcc tgcaggtcga ccatatggga 240gagctcccaa cgcgttggat gcatagcttg agtattctat agtgtcacct aaatagcttg 300gcgtaatcat ggtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac 360aacatacgag ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc 420acattaattg cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg 480cattaatgaa tcggccaacg cgcggggaga ggcggtttgc gtattgggcg ctcttccgct 540tcctcgctca ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac 600tcaaaggcgg taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga 660gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat 720aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac 780ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 840gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 900ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 960ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1020cttgagtcca acccggtaag 
acacgactta tcgccactgg cagcagccac tggtaacagg 1080attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1140ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga 1200aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt 1260gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt 1320tctacggggt ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga 1380ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc 1440taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct 1500atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt cgtgtagata 1560actacgatac gggagggctt accatctggc cccagtgctg caatgatacc gcgagaccca 1620cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc cgagcgcaga 1680agtggtcctg caactttatc cgcctccatc cagtctatta attgttgccg ggaagctaga 1740gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac aggcatcgtg 1800gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga 1860gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt 1920gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct 1980cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca 2040ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat acgggataat 2100accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga 2160aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc 2220aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg 2280caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc 2340ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt 2400gaatgtattt agaaaaataa acaaataggg gttccgcgca catttccccg aaaagtgcca 2460cctgatgcgg tgtgaaatac cgcacagatg cgtaaggaga aaataccgca tcaggaaatt 2520gtaagcgtta atattttgtt aaaattcgcg ttaaattttt gttaaatcag ctcatttttt 2580aaccaatagg ccgaaatcgg caaaatccct tataaatcaa aagaatagac cgagataggg 2640ttgagtgttg ttccagtttg gaacaagagt ccactattaa agaacgtgga ctccaacgtc 2700aaagggcgaa aaaccgtcta tcagggcgat ggcccactac gtgaaccatc 
accctaatca 2760agttttttgg ggtcgaggtg ccgtaaagca ctaaatcgga accctaaagg gagcccccga 2820tttagagctt gacggggaaa gccggcgaac gtggcgagaa aggaagggaa gaaagcgaaa 2880ggagcgggcg ctagggcgct ggcaagtgta gcggtcacgc tgcgcgtaac caccacaccc 2940gccgcgctta atgcgccgct acagggcgcg tccattcgcc attcaggctg cgcaactgtt 3000gggaagggcg atcggtgcgg gcctcttcgc tattacgcca gctggcgaaa gggggatgtg 3060ctgcaaggcg attaagttgg gtaacgccag ggttttccca gtcacgacgt tgtaaaacga 3120cggccagtga attgtaatac gactcactat agggcgaatt gggcccgacg tcgcatgctc 3180ccggccgcca tggccgcggg attggatcca acgcaattaa tgtgagttag ctcactcatt 3240aggcacccca ggctttacac tttatgcttc cggctcgtat gttgtgtgga attgtgagcg 3300gataacaatt tcacacagga aacagctatg accatgatta cgccaagcta tttaggtgac 3360actatagaat actcaagcta tgcatccaac gcgttgggag ctctcccata tggtcgacct 3420gcaggcggcc gcactagtga ttgcggccgc atgtataatt ccacatacaa catttcgatc 3480ctttttgata aaaagggaca attcacgaaa tcaagaaata cccaaagctt tttaaaatta 3540ggatgtaata taaataaatt aataaagcta atccaagcga tgttgaagat tacgaagaag 3600gtttggggct gggatcgacc tgtttgtatc gcttattgta ttcttcgagt tcgtcgtagt 3660acacatccca gacgcatcgg acgcacccgc tgccgcagca atcgccgggc tccggcttct 3720ccggcggcgg cggaatctcc ttcgccggat tcggggtttt atcgtcgggt ttgggttgga 3780tcgggtgcgg 3790628113DNAartificial sequencevector 62ggccgcgaca caagtgtgag agtactaaat aaatgctttg gttgtacgaa atcattacac 60taaataaaat aatcaaagct tatatatgcc ttccgctaag gccgaatgca aagaaattgg 120ttctttctcg ttatcttttg ccacttttac tagtacgtat taattactac ttaatcatct 180ttgtttacgg ctcattatat ccgtcgacgg cgcgcccgat catccggata tagttcctcc 240tttcagcaaa aaacccctca agacccgttt agaggcccca aggggttatg ctagttattg 300ctcagcggtg gcagcagcca actcagcttc ctttcgggct ttgttagcag ccggatcgat 360ccaagctgta cctcactatt cctttgccct cggacgagtg ctggggcgtc ggtttccact 420atcggcgagt acttctacac agccatcggt ccagacggcc gcgcttctgc gggcgatttg 480tgtacgcccg acagtcccgg ctccggatcg gacgattgcg tcgcatcgac cctgcgccca 540agctgcatca tcgaaattgc cgtcaaccaa gctctgatag agttggtcaa gaccaatgcg 600gagcatatac gcccggagcc gcggcgatcc tgcaagctcc 
ggatgcctcc gctcgaagta 660gcgcgtctgc tgctccatac aagccaacca cggcctccag aagaagatgt tggcgacctc 720gtattgggaa tccccgaaca tcgcctcgct ccagtcaatg accgctgtta tgcggccatt 780gtccgtcagg acattgttgg agccgaaatc cgcgtgcacg aggtgccgga cttcggggca 840gtcctcggcc caaagcatca gctcatcgag agcctgcgcg acggacgcac tgacggtgtc 900gtccatcaca gtttgccagt gatacacatg gggatcagca atcgcgcata tgaaatcacg 960ccatgtagtg tattgaccga ttccttgcgg tccgaatggg ccgaacccgc tcgtctggct 1020aagatcggcc gcagcgatcg catccatagc ctccgcgacc ggctgcagaa cagcgggcag 1080ttcggtttca ggcaggtctt gcaacgtgac accctgtgca cggcgggaga tgcaataggt 1140caggctctcg ctgaattccc caatgtcaag cacttccgga atcgggagcg cggccgatgc 1200aaagtgccga taaacataac gatctttgta gaaaccatcg gcgcagctat ttacccgcag 1260gacatatcca cgccctccta catcgaagct gaaagcacga gattcttcgc cctccgagag 1320ctgcatcagg tcggagacgc tgtcgaactt ttcgatcaga aacttctcga cagacgtcgc 1380ggtgagttca ggcttttcca tgggtatatc tccttcttaa agttaaacaa aattatttct 1440agagggaaac cgttgtggtc tccctatagt gagtcgtatt aatttcgcgg gatcgagatc 1500gatccaattc caatcccaca aaaatctgag cttaacagca cagttgctcc tctcagagca 1560gaatcgggta ttcaacaccc tcatatcaac tactacgttg tgtataacgg tccacatgcc 1620ggtatatacg atgactgggg ttgtacaaag gcggcaacaa acggcgttcc cggagttgca 1680cacaagaaat ttgccactat tacagaggca agagcagcag ctgacgcgta cacaacaagt 1740cagcaaacag acaggttgaa cttcatcccc aaaggagaag ctcaactcaa gcccaagagc 1800tttgctaagg ccctaacaag cccaccaaag caaaaagccc actggctcac gctaggaacc 1860aaaaggccca gcagtgatcc agccccaaaa gagatctcct ttgccccgga gattacaatg 1920gacgatttcc tctatcttta cgatctagga aggaagttcg aaggtgaagg tgacgacact 1980atgttcacca ctgataatga gaaggttagc ctcttcaatt tcagaaagaa tgctgaccca 2040cagatggtta gagaggccta cgcagcaggt ctcatcaaga cgatctaccc gagtaacaat 2100ctccaggaga tcaaatacct tcccaagaag gttaaagatg cagtcaaaag attcaggact 2160aattgcatca agaacacaga gaaagacata tttctcaaga tcagaagtac tattccagta 2220tggacgattc aaggcttgct tcataaacca aggcaagtaa tagagattgg agtctctaaa 2280aaggtagttc ctactgaatc taaggccatg catggagtct aagattcaaa tcgaggatct 2340aacagaactc 
gccgtgaaga ctggcgaaca gttcatacag agtcttttac gactcaatga 2400caagaagaaa atcttcgtca acatggtgga gcacgacact ctggtctact ccaaaaatgt 2460caaagataca gtctcagaag accaaagggc tattgagact tttcaacaaa ggataatttc 2520gggaaacctc ctcggattcc attgcccagc tatctgtcac ttcatcgaaa ggacagtaga 2580aaaggaaggt ggctcctaca aatgccatca ttgcgataaa ggaaaggcta tcattcaaga 2640tgcctctgcc gacagtggtc ccaaagatgg acccccaccc acgaggagca tcgtggaaaa 2700agaagacgtt ccaaccacgt cttcaaagca agtggattga tgtgacatct ccactgacgt 2760aagggatgac gcacaatccc actatccttc gcaagaccct tcctctatat aaggaagttc 2820atttcatttg gagaggacac gctcgagctc atttctctat tacttcagcc ataacaaaag 2880aactcttttc tcttcttatt aaaccatgaa aaagcctgaa ctcaccgcga cgtctgtcga 2940gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct cggagggcga 3000agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc gggtaaatag 3060ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat cggccgcgct 3120cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct attgcatctc 3180ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct 3240gcagccggtc gcggaggcca tggatgcgat cgctgcggcc gatcttagcc agacgagcgg 3300gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg atttcatatg 3360cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca ccgtcagtgc 3420gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc ccgaagtccg 3480gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac 3540agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg tcgccaacat 3600cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact tcgagcggag 3660gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca ttggtcttga 3720ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg 3780atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag 3840aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg gaaaccgacg 3900ccccagcact cgtccgaggg caaaggaata gtgaggtacc taaagaagga gtgcgtcgaa 3960gcagatcgtt caaacatttg gcaataaagt ttcttaagat tgaatcctgt tgccggtctt 4020gcgatgatta tcatataatt tctgttgaat tacgttaagc 
atgtaataat taacatgtaa 4080tgcatgacgt tatttatgag atgggttttt atgattagag tcccgcaatt atacatttaa 4140tacgcgatag aaaacaaaat atagcgcgca aactaggata aattatcgcg cgcggtgtca 4200tctatgttac tagatcgatg tcgaatctga tcaacctgca ttaatgaatc ggccaacgcg 4260cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact gactcgctgc 4320gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta atacggttat 4380ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag caaaaggcca 4440ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc cctgacgagc 4500atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta taaagatacc 4560aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg 4620gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcaatgc tcacgctgta 4680ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg 4740ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac ccggtaagac 4800acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg aggtatgtag 4860gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga aggacagtat 4920ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt agctcttgat 4980ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 5040gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt 5100ggaacgaaaa ctcacgttaa gggattttgg tcatgacatt aacctataaa aataggcgta 5160tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc 5220agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga caagcccgtc 5280agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg gcatcagagc 5340agattgtact gagagtgcac catatggaca tattgtcgtt agaacgcggc tacaattaat 5400acataacctt atgtatcata cacatacgat ttaggtgaca ctatagaacg gcgcgccaag 5460cttggatcct cgaagagaag ggttaataac acactttttt aacattttta acacaaattt 5520tagttattta aaaatttatt aaaaaattta aaataagaag aggaactctt taaataaatc 5580taacttacaa aatttatgat ttttaataag ttttcaccaa taaaaaatgt cataaaaata 5640tgttaaaaag tatattatca atattctctt tatgataaat aaaaagaaaa aaaaaataaa 5700agttaagtga aaatgagatt gaagtgactt taggtgtgta taaatatatc aaccccgcca 5760acaatttatt 
taatccaaat atattgaagt atattattcc atagccttta tttatttata 5820tatttattat ataaaagctt tatttgttct aggttgttca tgaaatattt ttttggtttt 5880atctccgttg taagaaaatc atgtgctttg tgtcgccact cactattgca gctttttcat 5940gcattggtca gattgacggt tgattgtatt tttgtttttt atggttttgt gttatgactt 6000aagtcttcat ctctttatct cttcatcagg tttgatggtt acctaatatg gtccatgggt 6060acatgcatgg ttaaattagg tggccaactt tgttgtgaac gatagaattt tttttatatt 6120aagtaaacta tttttatatt atgaaataat aataaaaaaa atattttatc attattaaca 6180aaatcatatt agttaatttg ttaactctat aataaaagaa atactgtaac attcacatta 6240catggtaaca tctttccacc ctttcatttg ttttttgttt gatgactttt tttcttgttt 6300aaatttattt cccttctttt aaatttggaa tacattatca tcatatataa actaaaatac 6360taaaaacagg attacacaaa tgataaataa taacacaaat atttataaat ctagctgcaa 6420tatatttaaa ctagctatat cgatattgta aaataaaact agctgcattg atactgataa 6480aaaaatatca tgtgctttct ggactgatga tgcagtatac ttttgacatt gcctttattt 6540tatttttcag aaaagctttc ttagttctgg gttcttcatt atttgtttcc catctccatt 6600gtgaattgaa tcatttgctt cgtgtcacaa atacaattta gntaggtaca tgcattggtc 6660agattcacgg tttattatgt catgacttaa gttcatggta gtacattacc tgccacgcat 6720gcattatatt ggttagattt gataggcaaa tttggttgtc aacaatataa atataaataa 6780tgtttttata ttacgaaata acagtgatca aaacaaacag ttttatcttt attaacaaga 6840ttttgttttt gtttgatgac gttttttaat gtttacgctt tcccccttct tttgaattta 6900gaacacttta tcatcataaa atcaaatact aaaaaaatta catatttcat aaataataac 6960acaaatattt ttaaaaaatc tgaaataata atgaacaata ttacatatta
tcacgaaaat 7020tcattaataa aaatattata taaataaaat gtaatagtag ttatatgtag gaaaaaagta 7080ctgcacgcat aatatataca aaaagattaa aatgaactat tataaataat aacactaaat 7140taatggtgaa tcatatcaaa ataatgaaaa agtaaataaa atttgtaatt aacttctata 7200tgtattacac acacaaataa taaataatag taaaaaaaat tatgataaat atttaccatc 7260tcataagata tttaaaataa tgataaaaat atagattatt ttttatgcaa ctagctagcc 7320aaaaagagaa cacgggtata tataaaaaga gtacctttaa attctactgt acttccttta 7380ttcctgacgt ttttatatca agtggacata cgtgaagatt ttaattatca gtctaaatat 7440ttcattagca cttaatactt ttctgtttta ttcctatcct ataagtagtc ccgattctcc 7500caacattgct tattcacaca actaactaag aaagtcttcc atagcccccc aagcggccgc 7560atgtataatt ccacatacaa catttcgatc ctttttgata aaaagggaca attcacgaaa 7620tcaagaaata cccaaagctt tttaaaatta ggatgtaata taaataaatt aataaagcta 7680atccaagcga tgttgaagat tacgaagaag gtttggggct gggatcgacc tgtttgtatc 7740gcttattgta ttcttcgagt tcgtcgtagt acacatccca gacgcatcgg acgcacccgc 7800tgccgcagca atcgccgggc tccggcttct ccggcggcgg cggaatctcc ttcgccggat 7860tcggggtttt atcgtcgggt ttgggttgga tcgggtgcgg gatccactcg aagaatacaa 7920taagcgatac aaacaggtcg atcccagccc caaaccttct tcgtaatctt caacatcgct 7980tggattagct ttattaattt atttatatta catcctaatt ttaaaaagct ttgggtattt 8040cttgatttcg tgaattgtcc ctttttatca aaaaggatcg aaatgttgta tgtggtctag 8100aaatcactag tgc 8113635267DNAartificial sequencevector 63atctgatcaa cctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 60ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 120cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag 180gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 240tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc 300agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 360tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 420cgggaagcgt ggcgctttct caatgctcac gctgtaggta tctcagttcg gtgtaggtcg 480ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 540ccggtaacta tcgtcttgag tccaacccgg taagacacga 
cttatcgcca ctggcagcag 600ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 660ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct ctgctgaagc 720cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 780gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 840atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 900ttttggtcat gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtctcgcgc 960gtttcggtga tgacggtgaa aacctctgac acatgcagct cccggagacg gtcacagctt 1020gtctgtaagc ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg 1080ggtgtcgggg ctggcttaac tatgcggcat cagagcagat tgtactgaga gtgcaccata 1140tggacatatt gtcgttagaa cgcggctaca attaatacat aaccttatgt atcatacaca 1200tacgatttag gtgacactat agaacggcgc gccaagcttg gatccgtcga cggcgcgccc 1260gatcatccgg atatagttcc tcctttcagc aaaaaacccc tcaagacccg tttagaggcc 1320ccaaggggtt atgctagtta ttgctcagcg gtggcagcag ccaactcagc ttcctttcgg 1380gctttgttag cagccggatc gatccaagct gtacctcact attcctttgc cctcggacga 1440gtgctggggc gtcggtttcc actatcggcg agtacttcta cacagccatc ggtccagacg 1500gccgcgcttc tgcgggcgat ttgtgtacgc ccgacagtcc cggctccgga tcggacgatt 1560gcgtcgcatc gaccctgcgc ccaagctgca tcatcgaaat tgccgtcaac caagctctga 1620tagagttggt caagaccaat gcggagcata tacgcccgga gccgcggcga tcctgcaagc 1680tccggatgcc tccgctcgaa gtagcgcgtc tgctgctcca tacaagccaa ccacggcctc 1740cagaagaaga tgttggcgac ctcgtattgg gaatccccga acatcgcctc gctccagtca 1800atgaccgctg ttatgcggcc attgtccgtc aggacattgt tggagccgaa atccgcgtgc 1860acgaggtgcc ggacttcggg gcagtcctcg gcccaaagca tcagctcatc gagagcctgc 1920gcgacggacg cactgacggt gtcgtccatc acagtttgcc agtgatacac atggggatca 1980gcaatcgcgc atatgaaatc acgccatgta gtgtattgac cgattccttg cggtccgaat 2040gggccgaacc cgctcgtctg gctaagatcg gccgcagcga tcgcatccat agcctccgcg 2100accggctgca gaacagcggg cagttcggtt tcaggcaggt cttgcaacgt gacaccctgt 2160gcacggcggg agatgcaata ggtcaggctc tcgctgaatt ccccaatgtc aagcacttcc 2220ggaatcggga gcgcggccga tgcaaagtgc cgataaacat aacgatcttt gtagaaacca 2280tcggcgcagc 
tatttacccg caggacatat ccacgccctc ctacatcgaa gctgaaagca 2340cgagattctt cgccctccga gagctgcatc aggtcggaga cgctgtcgaa cttttcgatc 2400agaaacttct cgacagacgt cgcggtgagt tcaggctttt ccatgggtat atctccttct 2460taaagttaaa caaaattatt tctagaggga aaccgttgtg gtctccctat agtgagtcgt 2520attaatttcg cgggatcgag atcgatccaa ttccaatccc acaaaaatct gagcttaaca 2580gcacagttgc tcctctcaga gcagaatcgg gtattcaaca ccctcatatc aactactacg 2640ttgtgtataa cggtccacat gccggtatat acgatgactg gggttgtaca aaggcggcaa 2700caaacggcgt tcccggagtt gcacacaaga aatttgccac tattacagag gcaagagcag 2760cagctgacgc gtacacaaca agtcagcaaa cagacaggtt gaacttcatc cccaaaggag 2820aagctcaact caagcccaag agctttgcta aggccctaac aagcccacca aagcaaaaag 2880cccactggct cacgctagga accaaaaggc ccagcagtga tccagcccca aaagagatct 2940cctttgcccc ggagattaca atggacgatt tcctctatct ttacgatcta ggaaggaagt 3000tcgaaggtga aggtgacgac actatgttca ccactgataa tgagaaggtt agcctcttca 3060atttcagaaa gaatgctgac ccacagatgg ttagagaggc ctacgcagca ggtctcatca 3120agacgatcta cccgagtaac aatctccagg agatcaaata ccttcccaag aaggttaaag 3180atgcagtcaa aagattcagg actaattgca tcaagaacac agagaaagac atatttctca 3240agatcagaag tactattcca gtatggacga ttcaaggctt gcttcataaa ccaaggcaag 3300taatagagat tggagtctct aaaaaggtag ttcctactga atctaaggcc atgcatggag 3360tctaagattc aaatcgagga tctaacagaa ctcgccgtga agactggcga acagttcata 3420cagagtcttt tacgactcaa tgacaagaag aaaatcttcg tcaacatggt ggagcacgac 3480actctggtct actccaaaaa tgtcaaagat acagtctcag aagaccaaag ggctattgag 3540acttttcaac aaaggataat ttcgggaaac ctcctcggat tccattgccc agctatctgt 3600cacttcatcg aaaggacagt agaaaaggaa ggtggctcct acaaatgcca tcattgcgat 3660aaaggaaagg ctatcattca agatgcctct gccgacagtg gtcccaaaga tggaccccca 3720cccacgagga gcatcgtgga aaaagaagac gttccaacca cgtcttcaaa gcaagtggat 3780tgatgtgaca tctccactga cgtaagggat gacgcacaat cccactatcc ttcgcaagac 3840ccttcctcta tataaggaag ttcatttcat ttggagagga cacgctcgag ctcatttctc 3900tattacttca gccataacaa aagaactctt ttctcttctt attaaaccat gaaaaagcct 3960gaactcaccg cgacgtctgt cgagaagttt ctgatcgaaa 
agttcgacag cgtctccgac 4020ctgatgcagc tctcggaggg cgaagaatct cgtgctttca gcttcgatgt aggagggcgt 4080ggatatgtcc tgcgggtaaa tagctgcgcc gatggtttct acaaagatcg ttatgtttat 4140cggcactttg catcggccgc gctcccgatt ccggaagtgc ttgacattgg ggaattcagc 4200gagagcctga cctattgcat ctcccgccgt gcacagggtg tcacgttgca agacctgcct 4260gaaaccgaac tgcccgctgt tctgcagccg gtcgcggagg ccatggatgc gatcgctgcg 4320gccgatctta gccagacgag cgggttcggc ccattcggac cgcaaggaat cggtcaatac 4380actacatggc gtgatttcat atgcgcgatt gctgatcccc atgtgtatca ctggcaaact 4440gtgatggacg acaccgtcag tgcgtccgtc gcgcaggctc tcgatgagct gatgctttgg 4500gccgaggact gccccgaagt ccggcacctc gtgcacgcgg atttcggctc caacaatgtc 4560ctgacggaca atggccgcat aacagcggtc attgactgga gcgaggcgat gttcggggat 4620tcccaatacg aggtcgccaa catcttcttc tggaggccgt ggttggcttg tatggagcag 4680cagacgcgct acttcgagcg gaggcatccg gagcttgcag gatcgccgcg gctccgggcg 4740tatatgctcc gcattggtct tgaccaactc tatcagagct tggttgacgg caatttcgat 4800gatgcagctt gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc 4860gggcgtacac aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg tgtagaagta 4920ctcgccgata gtggaaaccg acgccccagc actcgtccga gggcaaagga atagtgaggt 4980acctaaagaa ggagtgcgtc gaagcagatc gttcaaacat ttggcaataa agtttcttaa 5040gattgaatcc tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta 5100agcatgtaat aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta 5160gagtcccgca attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg 5220ataaattatc gcgcgcggtg tcatctatgt tactagatcg atgtcga 526764142PRTPopulus trichocarpa 64Met Glu Ala Thr Leu His Asn His Phe Leu Ser Arg Ile Phe Ser Tyr1 5 10 15Thr Leu Pro Lys Pro Lys Asn Pro Pro Asn Asp Pro Thr His Phe Ile 20 25 30Phe Ala Met Lys Asn Pro Phe Lys Pro Ile Phe Ile Ser Pro Lys Thr 35 40 45Ile Thr Phe Asn Ser Arg Ser Gln Asp Pro Lys Ser Cys His Val Thr 50 55 60Ala Asn Phe Val Met Ala Thr Glu Asn Lys Asn Glu Gln Ile Glu Ser65 70 75 80Thr Val Met Ser Lys Gln Gly Glu Glu Glu Ser Lys Lys Lys Thr Ala 85 90 95Pro Pro Pro Pro Pro Pro Pro Glu Lys Pro Glu Pro Gly 
Asp Cys Cys 100 105 110Gly Ser Gly Cys Val Arg Cys Val Trp Asp Val Tyr Tyr Glu Glu Leu 115 120 125Glu Glu Tyr Asp Lys Leu Tyr Lys Ser Asp Ser Ser Lys Ser 130 135 14065115PRTOryza sativa 65Met Leu Val Ala Ala Leu Arg Val Pro Ala Pro Ile Pro Ser Ser Leu1 5 10 15Pro Ser Pro Ala Arg Pro Leu Leu Arg Arg Arg Ser Ser His Arg Leu 20 25 30Pro Pro Pro Pro Pro Pro Ala Ala Ser Met Ala Asp Ala Gly Gly Ala 35 40 45Thr Thr Asn Lys Pro Ala Pro Ala Pro Ala Pro Glu Pro Pro Glu Lys 50 55 60Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp65 70 75 80Asp Val Tyr Tyr Asp Glu Leu Asp Ala Tyr Asn Lys Ala Leu Ala Ala 85 90 95His Ser Ser Ser Ala Ser Ser Gly Ser Lys Pro Ala Thr Ser Asp Gly 100 105 110Ala Lys Ser 11566122PRTZea mays 66Met Leu Gly Ala Val Val Arg Val Pro Gly Pro Ile Leu Pro Phe Leu1 5 10 15Pro Gly Pro Thr Arg Pro Leu Leu Arg Arg Arg His Tyr Leu Pro Pro 20 25 30Glu Thr Pro Met Ala Ser Ala Thr Pro Cys Asp Gly Gly Thr Gly Lys 35 40 45Pro Asp Ala Ala Pro Ala Pro Thr Pro Ala Pro Thr Pro Leu Pro Pro 50 55 60Glu Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys65 70 75 80Val Trp Asp Ile Tyr Phe Asp Glu Leu Asp Ala Tyr Asp Lys Ala Leu 85 90 95Ala Ala Arg Ala Ala Ser Ser Gly Ser Gly Gly Lys Asp Asp Ser Ala 100 105 110Asp Thr Lys Pro Lys Glu Gly Ala Lys Ser 115 12067122PRTZea mays 67Met Leu Gly Ala Val Val Arg Val Pro Gly Pro Ile Leu Pro Phe Leu1 5 10 15Pro Gly Pro Thr Arg Pro Leu Leu Arg Arg Arg His Tyr Leu Pro Pro 20 25 30Glu Thr Pro Met Ala Ser Ala Thr Pro Cys Asp Gly Gly Thr Gly Lys 35 40 45Pro Asp Ala Ala Pro Ala Pro Thr Pro Ala Pro Thr Pro Leu Pro Pro 50 55 60Glu Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys65 70 75 80Val Trp Asp Ile Tyr Phe Asp Glu Leu Asp Ala Tyr Asp Lys Ala Leu 85 90 95Ala Ala His Ala Ala Ser Ser Gly Ser Gly Gly Lys Asp Asp Ser Ala 100 105 110Asp Thr Lys Pro Lys Glu Gly Ala Lys Ser 115 12068597DNAArabidopsis thaliana 68acgaaaaaag gaaatagaaa aaaaaagaag agaacaaatc tttttgatct gtgcgtatgg 60ttgttgtgtc tcttcttcct cgaatctcga 
tcgttacatc accgggttct agccttcacg 120atgtgctttt gagcatgaga tttggtttga cgcgacatct ccctctcaaa cgatctttct 180ccaattattc aatcacttcc gtatctccag aacaacagct caaatctccg gtgaccatgg 240cgacgaccga gagcaagaat cttgtagaag cttccaagga ggagacaaac aagaaggaga 300cagaagataa gaaggaggtg ggagtttcgg ttcctccacc gccagagaaa ccagagcctg 360gcgattgttg cggtagcggt tgcgtccgat gcgtttggga tgtttattac gatgagctcg 420aagattacaa caagcagctt tctggagaaa ctaaatcaat ttgactgatt tttcctcgca 480ttgttaatgg agaaattaaa catttgtctt tgtcgatttg atgatacagt gcttttgttg 540aacaacattt tggatctctc tatgaacttg agctgattta cttgtgaata gaagaaa 59769135PRTArabidopsis thaliana 69Met Val Val Val Ser Leu Leu Pro Arg Ile Ser Ile Val Thr Ser Pro1 5 10 15Gly Ser Ser Leu His Asp Val Leu Leu Ser Met Arg Phe Gly Leu Thr 20 25 30Arg His Leu Pro Leu Lys Arg Ser Phe Ser Asn Tyr Ser Ile Thr Ser 35 40 45Val Ser Pro Glu Gln Gln Leu Lys Ser Pro Val Thr Met Ala Thr Thr 50 55 60Glu Ser Lys Asn Leu Val Glu Ala Ser Lys Glu Glu Thr Asn Lys Lys65 70 75 80Glu Thr Glu Asp Lys Lys Glu Val Gly Val Ser Val Pro Pro Pro Pro 85 90 95Glu Lys Pro Glu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys 100 105 110Val Trp Asp Val Tyr Tyr Asp Glu Leu Glu Asp Tyr Asn Lys Gln Leu 115 120 125Ser Gly Glu Thr Lys Ser Ile 130 1357027PRTArtificial sequencemisc_feature 70Pro Glu Lys Pro Xaa Xaa Gly Asp Cys Cys Gly Ser Gly Cys Val Arg1 5 10 15Xaa Xaa Xaa Asp Xaa Tyr Xaa Xaa Glu Leu Xaa 20 257130DNAartificial sequenceprimer 71gcggccgcgt tgttgtgtct cttcttcctc 307230DNAartificial sequenceprimer 72ggatccctac aagattcttg ctctcggtcg 307330DNAartificial sequenceggatccttccaaggaggagacaaacaagaa 73ggatccttcc aaggaggaga caaacaagaa 307430DNAartificial sequenceprimer 74gctgcagtta gtttctccag aaagctgctt 307538DNAartificial sequenceprimer 75gaattcgcgg ccgcgttgtt gtgtctcttc ttcctcga 387630DNAartificial sequenceprimer 76ctgcagctac aagattcttg ctctcggtcg 30773239DNAartificial sequenceplasmid 77taatcactag tgaattcgcg gccgcctgca ggtcgaccat atgggagagc tcccaacgcg 60ttggatgcat agcttgagta ttctatagtg tcacctaaat 
agcttggcgt aatcatggtc 120atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 180aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 240gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg 300ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 360ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 420acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 480aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 540tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 600aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 660gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc 720acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 780accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 840ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 900gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 960aacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 1020ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 1080gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 1140cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 1200cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 1260gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 1320tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 1380gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 1440agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 1500tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 1560agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 1620gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc 1680catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 1740ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc 1800atccgtaaga tgcttttctg 
tgactggtga gtactcaacc aagtcattct gagaatagtg 1860tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 1920cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 1980cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 2040atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 2100aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 2160ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 2220aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg atgcggtgtg 2280aaataccgca cagatgcgta aggagaaaat accgcatcag gaaattgtaa gcgttaatat 2340tttgttaaaa ttcgcgttaa atttttgtta aatcagctca ttttttaacc aataggccga 2400aatcggcaaa atcccttata aatcaaaaga atagaccgag atagggttga gtgttgttcc 2460agtttggaac aagagtccac tattaaagaa cgtggactcc aacgtcaaag ggcgaaaaac 2520cgtctatcag ggcgatggcc cactacgtga accatcaccc taatcaagtt ttttggggtc 2580gaggtgccgt aaagcactaa atcggaaccc taaagggagc ccccgattta gagcttgacg 2640gggaaagccg gcgaacgtgg cgagaaagga agggaagaaa gcgaaaggag cgggcgctag 2700ggcgctggca agtgtagcgg tcacgctgcg cgtaaccacc acacccgccg cgcttaatgc 2760gccgctacag ggcgcgtcca ttcgccattc aggctgcgca actgttggga agggcgatcg 2820gtgcgggcct cttcgctatt acgccagctg gcgaaagggg gatgtgctgc aaggcgatta 2880agttgggtaa cgccagggtt ttcccagtca cgacgttgta aaacgacggc cagtgaattg 2940taatacgact cactataggg cgaattgggc ccgacgtcgc atgctcccgg ccgccatggc 3000ggccgcggga attcgatgcg gccgcgttgt tgtgtctctt cttcctcgaa tctcgatcgt 3060tacatcaccg ggttctagcc ttcacgatgt gcttttgagc atgagatttg gtttgacgcg 3120acatctccct ctcaaacgat ctttctccaa ttattcaatc acttccgtat ctccagaaca 3180acagctcaaa tctccggtga ccatggcgac
gaccgagagc aagaatcttg tagggatcc 3239783213DNAartificial sequenceplasmid 78taatcactag tgaattcgcg gccgcctgca ggtcgaccat atgggagagc tcccaacgcg 60ttggatgcat agcttgagta ttctatagtg tcacctaaat agcttggcgt aatcatggtc 120atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 180aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 240gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg 300ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 360ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 420acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 480aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 540tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 600aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 660gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc 720acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 780accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 840ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 900gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 960aacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 1020ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 1080gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 1140cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 1200cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 1260gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 1320tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 1380gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 1440agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 1500tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 1560agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 1620gtttggtatg gcttcattca gctccggttc ccaacgatca 
aggcgagtta catgatcccc 1680catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 1740ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc 1800atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg 1860tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 1920cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 1980cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 2040atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 2100aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 2160ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 2220aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg atgcggtgtg 2280aaataccgca cagatgcgta aggagaaaat accgcatcag gaaattgtaa gcgttaatat 2340tttgttaaaa ttcgcgttaa atttttgtta aatcagctca ttttttaacc aataggccga 2400aatcggcaaa atcccttata aatcaaaaga atagaccgag atagggttga gtgttgttcc 2460agtttggaac aagagtccac tattaaagaa cgtggactcc aacgtcaaag ggcgaaaaac 2520cgtctatcag ggcgatggcc cactacgtga accatcaccc taatcaagtt ttttggggtc 2580gaggtgccgt aaagcactaa atcggaaccc taaagggagc ccccgattta gagcttgacg 2640gggaaagccg gcgaacgtgg cgagaaagga agggaagaaa gcgaaaggag cgggcgctag 2700ggcgctggca agtgtagcgg tcacgctgcg cgtaaccacc acacccgccg cgcttaatgc 2760gccgctacag ggcgcgtcca ttcgccattc aggctgcgca actgttggga agggcgatcg 2820gtgcgggcct cttcgctatt acgccagctg gcgaaagggg gatgtgctgc aaggcgatta 2880agttgggtaa cgccagggtt ttcccagtca cgacgttgta aaacgacggc cagtgaattg 2940taatacgact cactataggg cgaattgggc ccgacgtcgc atgctcccgg ccgccatggc 3000ggccgcggga attcgatgga tccttccaag gaggagacaa acaagaagga gacagaagat 3060aagaaggagg tgggagtttc ggttcctcca ccgccagaga aaccagagcc tggcgattgt 3120tgcggtagcg gttgcgtccg atgcgtttgg gatgtttatt acgatgagct cgaagattac 3180aacaagcagc tttctggaga aactaactgc agc 3213793245DNAartificial sequenceplasmid 79taatcactag tgaattcgcg gccgcctgca ggtcgaccat atgggagagc tcccaacgcg 60ttggatgcat agcttgagta ttctatagtg tcacctaaat agcttggcgt aatcatggtc 
120atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 180aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 240gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg 300ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 360ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 420acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 480aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 540tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 600aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 660gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc 720acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 780accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 840ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 900gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 960aacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 1020ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 1080gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 1140cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 1200cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 1260gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 1320tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 1380gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 1440agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 1500tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 1560agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 1620gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc 1680catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 1740ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc 1800atccgtaaga tgcttttctg tgactggtga gtactcaacc 
aagtcattct gagaatagtg 1860tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 1920cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 1980cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 2040atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 2100aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 2160ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 2220aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg atgcggtgtg 2280aaataccgca cagatgcgta aggagaaaat accgcatcag gaaattgtaa gcgttaatat 2340tttgttaaaa ttcgcgttaa atttttgtta aatcagctca ttttttaacc aataggccga 2400aatcggcaaa atcccttata aatcaaaaga atagaccgag atagggttga gtgttgttcc 2460agtttggaac aagagtccac tattaaagaa cgtggactcc aacgtcaaag ggcgaaaaac 2520cgtctatcag ggcgatggcc cactacgtga accatcaccc taatcaagtt ttttggggtc 2580gaggtgccgt aaagcactaa atcggaaccc taaagggagc ccccgattta gagcttgacg 2640gggaaagccg gcgaacgtgg cgagaaagga agggaagaaa gcgaaaggag cgggcgctag 2700ggcgctggca agtgtagcgg tcacgctgcg cgtaaccacc acacccgccg cgcttaatgc 2760gccgctacag ggcgcgtcca ttcgccattc aggctgcgca actgttggga agggcgatcg 2820gtgcgggcct cttcgctatt acgccagctg gcgaaagggg gatgtgctgc aaggcgatta 2880agttgggtaa cgccagggtt ttcccagtca cgacgttgta aaacgacggc cagtgaattg 2940taatacgact cactataggg cgaattgggc ccgacgtcgc atgctcccgg ccgccatggc 3000ggccgcggga attcgatctg cagctacaag attcttgctc tcggtcgtcg ccatggtcac 3060cggagatttg agctgttgtt ctggagatac ggaagtgatt gaataattgg agaaagatcg 3120tttgagaggg agatgtcgcg tcaaaccaaa tctcatgctc aaaagcacat cgtgaaggct 3180agaacccggt gatgtaacga tcgagattcg aggaagaaga gacacaacaa cgcggccgcg 3240aattc 3245803154DNAartificial sequenceplasmid 80gatcccccgg gctgcaggaa ttcgatatca agcttatcga taccgtcgac ctcgaggggg 60ggcccggtac ccagcttttg ttccctttag tgagggttaa tttcgagctt ggcgtaatca 120tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga 180gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt 240gcgttgcgct cactgcccgc tttccagtcg 
ggaaacctgt cgtgccagct gcattaatga 300atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc ttcctcgctc 360actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg 420gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc 480cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc 540ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga 600ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc 660ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat 720agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg 780cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc 840aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga 900gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact 960agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt 1020ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag 1080cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg 1140tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa 1200aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata 1260tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg 1320atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgata 1380cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc acgctcaccg 1440gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagtggtcct 1500gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt 1560tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt ggtgtcacgc 1620tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg agttacatga 1680tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt tgtcagaagt 1740aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc tcttactgtc 1800atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc attctgagaa 1860tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca 1920catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg aaaactctca 1980aggatcttac 
cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct 2040tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc 2100gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa 2160tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt 2220tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctaaattg 2280taagcgttaa tattttgtta aaattcgcgt taaatttttg ttaaatcagc tcatttttta 2340accaataggc cgaaatcggc aaaatccctt ataaatcaaa agaatagacc gagatagggt 2400tgagtgttgt tccagtttgg aacaagagtc cactattaaa gaacgtggac tccaacgtca 2460aagggcgaaa aaccgtctat cagggcgatg gcccactacg tgaaccatca ccctaatcaa 2520gttttttggg gtcgaggtgc cgtaaagcac taaatcggaa ccctaaaggg agcccccgat 2580ttagagcttg acggggaaag ccggcgaacg tggcgagaaa ggaagggaag aaagcgaaag 2640gagcgggcgc tagggcgctg gcaagtgtag cggtcacgct gcgcgtaacc accacacccg 2700ccgcgcttaa tgcgccgcta cagggcgcgt cccattcgcc attcaggctg cgcaactgtt 2760gggaagggcg atcggtgcgg gcctcttcgc tattacgcca gctggcgaaa gggggatgtg 2820ctgcaaggcg attaagttgg gtaacgccag ggttttccca gtcacgacgt tgtaaaacga 2880cggccagtga attgtaatac gactcactat agggcgaatt ggagctccac cgcggtggcg 2940gccgcgttgt tgtgtctctt cttcctcgaa tctcgatcgt tacatcaccg ggttctagcc 3000ttcacgatgt gcttttgagc atgagatttg gtttgacgcg acatctccct ctcaaacgat 3060ctttctccaa ttattcaatc acttccgtat ctccagaaca acagctcaaa tctccggtga 3120ccatggcgac gaccgagagc aagaatcttg tagg 3154813331DNAartificial sequenceplasmid 81ggaattcgat atcaagctta tcgataccgt cgacctcgag ggggggcccg gtacccagct 60tttgttccct ttagtgaggg ttaatttcga gcttggcgta atcatggtca tagctgtttc 120ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt 180gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc 240ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg 300ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac tcgctgcgct 360cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca 420cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga 480accgtaaaaa ggccgcgttg ctggcgtttt tccataggct 
ccgcccccct gacgagcatc 540acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 600cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 660acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 720atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 780agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 840acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 900gtgctacaga gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg 960gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg 1020gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 1080gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga 1140acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga 1200tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt 1260ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt 1320catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat 1380ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca gatttatcag 1440caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct 1500ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt 1560tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg 1620cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca 1680aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt 1740tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat 1800gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac 1860cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc agaactttaa 1920aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt 1980tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt 2040tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa 2100gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt 2160atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa 2220taggggttcc gcgcacattt 
ccccgaaaag tgccacctaa attgtaagcg ttaatatttt 2280gttaaaattc gcgttaaatt tttgttaaat cagctcattt tttaaccaat aggccgaaat 2340cggcaaaatc ccttataaat caaaagaata gaccgagata gggttgagtg ttgttccagt 2400ttggaacaag agtccactat taaagaacgt ggactccaac gtcaaagggc gaaaaaccgt 2460ctatcagggc gatggcccac tacgtgaacc atcaccctaa tcaagttttt tggggtcgag 2520gtgccgtaaa gcactaaatc ggaaccctaa agggagcccc cgatttagag cttgacgggg 2580aaagccggcg aacgtggcga gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc 2640gctggcaagt gtagcggtca cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc 2700gctacagggc gcgtcccatt cgccattcag gctgcgcaac tgttgggaag ggcgatcggt 2760gcgggcctct tcgctattac gccagctggc gaaaggggga tgtgctgcaa ggcgattaag 2820ttgggtaacg ccagggtttt cccagtcacg acgttgtaaa acgacggcca gtgaattgta 2880atacgactca ctatagggcg aattggagct ccaccgcggt ggcggccgcg ttgttgtgtc 2940tcttcttcct cgaatctcga tcgttacatc accgggttct agccttcacg atgtgctttt 3000gagcatgaga tttggtttga cgcgacatct ccctctcaaa cgatctttct ccaattattc 3060aatcacttcc gtatctccag aacaacagct caaatctccg gtgaccatgg cgacgaccga 3120gagcaagaat cttgtaggga tccttccaag gaggagacaa acaagaagga gacagaagat 3180aagaaggagg tgggagtttc ggttcctcca ccgccagaga aaccagagcc tggcgattgt 3240tgcggtagcg gttgcgtccg atgcgtttgg gatgtttatt acgatgagct cgaagattac 3300aacaagcagc tttctggaga aactaactgc a 3331823547DNAartificial sequenceplasmid 82aattcgatat caagcttatc gataccgtcg acctcgaggg ggggcccggt acccagcttt 60tgttcccttt agtgagggtt aatttcgagc ttggcgtaat catggtcata gctgtttcct 120gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt 180aaagcctggg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc 240gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg 300agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 360gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca 420gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 480cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac 540aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag 
ataccaggcg 600tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac 660ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat 720ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 780cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac 840ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt 900gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt 960atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc 1020aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga 1080aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 1140gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc 1200cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct 1260gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca 1320tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct 1380ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca 1440ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc 1500atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg 1560cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct 1620tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa 1680aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta 1740tcactcatgg
ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc 1800ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg 1860agttgctctt gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa 1920gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg 1980agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc 2040accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg 2100gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat 2160cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 2220ggggttccgc gcacatttcc ccgaaaagtg ccacctaaat tgtaagcgtt aatattttgt 2280taaaattcgc gttaaatttt tgttaaatca gctcattttt taaccaatag gccgaaatcg 2340gcaaaatccc ttataaatca aaagaataga ccgagatagg gttgagtgtt gttccagttt 2400ggaacaagag tccactatta aagaacgtgg actccaacgt caaagggcga aaaaccgtct 2460atcagggcga tggcccacta cgtgaaccat caccctaatc aagttttttg gggtcgaggt 2520gccgtaaagc actaaatcgg aaccctaaag ggagcccccg atttagagct tgacggggaa 2580agccggcgaa cgtggcgaga aaggaaggga agaaagcgaa aggagcgggc gctagggcgc 2640tggcaagtgt agcggtcacg ctgcgcgtaa ccaccacacc cgccgcgctt aatgcgccgc 2700tacagggcgc gtcccattcg ccattcaggc tgcgcaactg ttgggaaggg cgatcggtgc 2760gggcctcttc gctattacgc cagctggcga aagggggatg tgctgcaagg cgattaagtt 2820gggtaacgcc agggttttcc cagtcacgac gttgtaaaac gacggccagt gaattgtaat 2880acgactcact atagggcgaa ttggagctcc accgcggtgg cggccgcgtt gttgtgtctc 2940ttcttcctcg aatctcgatc gttacatcac cgggttctag ccttcacgat gtgcttttga 3000gcatgagatt tggtttgacg cgacatctcc ctctcaaacg atctttctcc aattattcaa 3060tcacttccgt atctccagaa caacagctca aatctccggt gaccatggcg acgaccgaga 3120gcaagaatct tgtagggatc cttccaagga ggagacaaac aagaaggaga cagaagataa 3180gaaggaggtg ggagtttcgg ttcctccacc gccagagaaa ccagagcctg gcgattgttg 3240cggtagcggt tgcgtccgat gcgtttggga tgtttattac gatgagctcg aagattacaa 3300caagcagctt tctggagaaa ctaactgcag ctacaagatt cttgctctcg gtcgtcgcca 3360tggtcaccgg agatttgagc tgttgttctg gagatacgga agtgattgaa taattggaga 3420aagatcgttt gagagggaga tgtcgcgtca aaccaaatct 
catgctcaaa agcacatcgt 3480gaaggctaga acccggtgat gtaacgatcg agattcgagg aagaagagac acaacaacgc 3540ggccgcg 3547833453DNAartificial sequenceplasmid 83tcttccatag ccccccaagc ggccgcgaca caagtgtgag agtactaaat aaatgctttg 60gttgtacgaa atcattacac taaataaaat aatcaaagct tatatatgcc ttccgctaag 120gccgaatgca aagaaattgg ttctttctcg ttatcttttg ccacttttac tagtacgtat 180taattactac ttaatcatct ttgtttacgg ctcattatat ccgtcgacgg cgcgcccgat 240catccggata tagttcctcc tttcagcaaa aaacccctca agacccgttt agaggcccca 300aggggttatg ctagttattg ctcagcggtg gcagcagcca actcagcttc ctttcgggct 360ttgttagcag ccggatcgat ccaagctgta cctcactatt cctttgccct cggacgagtg 420ctggggcgtc ggtttccact atcggcgagt acttctacac agccatcggt ccagacggcc 480gcgcttctgc gggcgatttg tgtacgcccg acagtcccgg ctccggatcg gacgattgcg 540tcgcatcgac cctgcgccca agctgcatca tcgaaattgc cgtcaaccaa gctctgatag 600agttggtcaa gaccaatgcg gagcatatac gcccggagcc gcggcgatcc tgcaagctcc 660ggatgcctcc gctcgaagta gcgcgtctgc tgctccatac aagccaacca cggcctccag 720aagaagatgt tggcgacctc gtattgggaa tccccgaaca tcgcctcgct ccagtcaatg 780accgctgtta tgcggccatt gtccgtcagg acattgttgg agccgaaatc cgcgtgcacg 840aggtgccgga cttcggggca gtcctcggcc caaagcatca gctcatcgag agcctgcgcg 900acggacgcac tgacggtgtc gtccatcaca gtttgccagt gatacacatg gggatcagca 960atcgcgcata tgaaatcacg ccatgtagtg tattgaccga ttccttgcgg tccgaatggg 1020ccgaacccgc tcgtctggct aagatcggcc gcagcgatcg catccatagc ctccgcgacc 1080ggctgcagaa cagcgggcag ttcggtttca ggcaggtctt gcaacgtgac accctgtgca 1140cggcgggaga tgcaataggt caggctctcg ctgaattccc caatgtcaag cacttccgga 1200atcgggagcg cggccgatgc aaagtgccga taaacataac gatctttgta gaaaccatcg 1260gcgcagctat ttacccgcag gacatatcca cgccctccta catcgaagct gaaagcacga 1320gattcttcgc cctccgagag ctgcatcagg tcggagacgc tgtcgaactt ttcgatcaga 1380aacttctcga cagacgtcgc ggtgagttca ggcttttcca tgggtatatc tccttcttaa 1440agttaaacaa aattatttct agagggaaac cgttgtggtc tccctatagt gagtcgtatt 1500aatttcgcgg gatcgagatc tgatcaacct gcattaatga atcggccaac gcgcggggag 1560aggcggtttg cgtattgggc gctcttccgc ttcctcgctc 
actgactcgc tgcgctcggt 1620cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 1680atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 1740taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 1800aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 1860tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 1920gtccgccttt ctcccttcgg gaagcgtggc gctttctcaa tgctcacgct gtaggtatct 1980cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 2040cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 2100atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 2160tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat 2220ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 2280acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 2340aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 2400aaactcacgt taagggattt tggtcatgac attaacctat aaaaataggc gtatcacgag 2460gccctttcgt ctcgcgcgtt tcggtgatga cggtgaaaac ctctgacaca tgcagctccc 2520ggagacggtc acagcttgtc tgtaagcgga tgccgggagc agacaagccc gtcagggcgc 2580gtcagcgggt gttggcgggt gtcggggctg gcttaactat gcggcatcag agcagattgt 2640actgagagtg caccatatgg acatattgtc gttagaacgc ggctacaatt aatacataac 2700cttatgtatc atacacatac gatttaggtg acactataga acggcgcgcc aagcttggat 2760cctagcctaa gtacgtactc aaaatgccaa caaataaaaa aaaagttgct ttaataatgc 2820caaaacaaat taataaaaca cttacaacac cggatttttt ttaattaaaa tgtgccattt 2880aggataaata gttaatattt ttaataatta tttaaaaagc cgtatctact aaaatgattt 2940ttatttggtt gaaaatatta atatgtttaa atcaacacaa tctatcaaaa ttaaactaaa 3000aaaaaaataa gtgtacgtgg ttaacattag tacagtaata taagaggaaa atgagaaatt 3060aagaaattga aagcgagtct aatttttaaa ttatgaacct gcatatataa aaggaaagaa 3120agaatccagg aagaaaagaa atgaaaccat gcatggtccc ctcgtcatca cgagtttctg 3180ccatttgcaa tagaaacact gaaacacctt tctctttgtc acttaattga gatgccgaag 3240ccacctcaca ccatgaactt catgaggtgt agcacccaag gcttccatag ccatgcatac 3300tgaagaatgt 
ctcaagctca gcaccctact tctgtgacgt gtccctcatt caccttcctc 3360tcttccctat aaataaccac gcctcaggtt ctccgcttca caactcaaac attctctcca 3420ttggtcctta aacactcatc agtcatcacc atg 3453844072DNAartificial sequenceplasmid 84ggccgcgaca caagtgtgag agtactaaat aaatgctttg gttgtacgaa atcattacac 60taaataaaat aatcaaagct tatatatgcc ttccgctaag gccgaatgca aagaaattgg 120ttctttctcg ttatcttttg ccacttttac tagtacgtat taattactac ttaatcatct 180ttgtttacgg ctcattatat ccgtcgacgg cgcgcccgat catccggata tagttcctcc 240tttcagcaaa aaacccctca agacccgttt agaggcccca aggggttatg ctagttattg 300ctcagcggtg gcagcagcca actcagcttc ctttcgggct ttgttagcag ccggatcgat 360ccaagctgta cctcactatt cctttgccct cggacgagtg ctggggcgtc ggtttccact 420atcggcgagt acttctacac agccatcggt ccagacggcc gcgcttctgc gggcgatttg 480tgtacgcccg acagtcccgg ctccggatcg gacgattgcg tcgcatcgac cctgcgccca 540agctgcatca tcgaaattgc cgtcaaccaa gctctgatag agttggtcaa gaccaatgcg 600gagcatatac gcccggagcc gcggcgatcc tgcaagctcc ggatgcctcc gctcgaagta 660gcgcgtctgc tgctccatac aagccaacca cggcctccag aagaagatgt tggcgacctc 720gtattgggaa tccccgaaca tcgcctcgct ccagtcaatg accgctgtta tgcggccatt 780gtccgtcagg acattgttgg agccgaaatc cgcgtgcacg aggtgccgga cttcggggca 840gtcctcggcc caaagcatca gctcatcgag agcctgcgcg acggacgcac tgacggtgtc 900gtccatcaca gtttgccagt gatacacatg gggatcagca atcgcgcata tgaaatcacg 960ccatgtagtg tattgaccga ttccttgcgg tccgaatggg ccgaacccgc tcgtctggct 1020aagatcggcc gcagcgatcg catccatagc ctccgcgacc ggctgcagaa cagcgggcag 1080ttcggtttca ggcaggtctt gcaacgtgac accctgtgca cggcgggaga tgcaataggt 1140caggctctcg ctgaattccc caatgtcaag cacttccgga atcgggagcg cggccgatgc 1200aaagtgccga taaacataac gatctttgta gaaaccatcg gcgcagctat ttacccgcag 1260gacatatcca cgccctccta catcgaagct gaaagcacga gattcttcgc cctccgagag 1320ctgcatcagg tcggagacgc tgtcgaactt ttcgatcaga aacttctcga cagacgtcgc 1380ggtgagttca ggcttttcca tgggtatatc tccttcttaa agttaaacaa aattatttct 1440agagggaaac cgttgtggtc tccctatagt gagtcgtatt aatttcgcgg gatcgagatc 1500tgatcaacct gcattaatga atcggccaac gcgcggggag aggcggtttg 
cgtattgggc 1560gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg 1620tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa 1680agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg 1740cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga 1800ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg 1860tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg 1920gaagcgtggc gctttctcaa tgctcacgct gtaggtatct cagttcggtg taggtcgttc 1980gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg 2040gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca 2100ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt 2160ggcctaacta cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag 2220ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg 2280gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc 2340ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt 2400tggtcatgac attaacctat aaaaataggc gtatcacgag gccctttcgt ctcgcgcgtt 2460tcggtgatga cggtgaaaac ctctgacaca tgcagctccc ggagacggtc acagcttgtc 2520tgtaagcgga tgccgggagc agacaagccc gtcagggcgc gtcagcgggt gttggcgggt 2580gtcggggctg gcttaactat gcggcatcag agcagattgt actgagagtg caccatatgg 2640acatattgtc gttagaacgc ggctacaatt aatacataac cttatgtatc atacacatac 2700gatttaggtg acactataga acggcgcgcc aagcttggat cctagcctaa gtacgtactc 2760aaaatgccaa caaataaaaa aaaagttgct ttaataatgc caaaacaaat taataaaaca 2820cttacaacac cggatttttt ttaattaaaa tgtgccattt aggataaata gttaatattt 2880ttaataatta tttaaaaagc cgtatctact aaaatgattt ttatttggtt gaaaatatta 2940atatgtttaa atcaacacaa tctatcaaaa ttaaactaaa aaaaaaataa gtgtacgtgg 3000ttaacattag tacagtaata taagaggaaa atgagaaatt aagaaattga aagcgagtct 3060aatttttaaa ttatgaacct gcatatataa aaggaaagaa agaatccagg aagaaaagaa 3120atgaaaccat gcatggtccc ctcgtcatca cgagtttctg ccatttgcaa tagaaacact 3180gaaacacctt tctctttgtc acttaattga gatgccgaag ccacctcaca ccatgaactt 3240catgaggtgt agcacccaag 
gcttccatag ccatgcatac tgaagaatgt ctcaagctca 3300gcaccctact tctgtgacgt gtccctcatt caccttcctc tcttccctat aaataaccac 3360gcctcaggtt ctccgcttca caactcaaac attctctcca ttggtcctta aacactcatc 3420agtcatcacc atgtcttcca tagcccccca agcggccgcg ttgttgtgtc tcttcttcct 3480cgaatctcga tcgttacatc accgggttct agccttcacg atgtgctttt gagcatgaga 3540tttggtttga cgcgacatct ccctctcaaa cgatctttct ccaattattc aatcacttcc 3600gtatctccag aacaacagct caaatctccg gtgaccatgg cgacgaccga gagcaagaat 3660cttgtaggga tccttccaag gaggagacaa acaagaagga gacagaagat aagaaggagg 3720tgggagtttc ggttcctcca ccgccagaga aaccagagcc tggcgattgt tgcggtagcg 3780gttgcgtccg atgcgtttgg gatgtttatt acgatgagct cgaagattac aacaagcagc 3840tttctggaga aactaactgc agctacaaga ttcttgctct cggtcgtcgc catggtcacc 3900ggagatttga gctgttgttc tggagatacg gaagtgattg aataattgga gaaagatcgt 3960ttgagaggga gatgtcgcgt caaaccaaat ctcatgctca aaagcacatc gtgaaggcta 4020gaacccggtg atgtaacgat cgagattcga ggaagaagag acacaacaac gc 40728514827DNAartificial sequenceplasmid 85cgcgcctcga gtgggcggat cccccgggct gcaggaattc actggccgtc gttttacaac 60gtcgtgactg ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca catccccctt 120tcgccagctg gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa cagttgcgca 180gcctgaatgg cgaatggatc gatccatcgc gatgtacctt ttgttagtca gcctctcgat 240tgctcatcgt cattacacag taccgaagtt tgatcgatct agtaacatag atgacaccgc 300gcgcgataat ttatcctagt ttgcgcgcta tattttgttt tctatcgcgt attaaatgta 360taattgcggg actctaatca taaaaaccca tctcataaat aacgtcatgc attacatgtt 420aattattaca tgcttaacgt aattcaacag aaattatatg ataatcatcg caagaccggc 480aacaggattc aatcttaaga aactttattg ccaaatgttt gaacgatctg cttcgacgca 540ctccttcttt actccaccat ctcgtcctta ttgaaaacgt gggtagcacc aaaacgaatc 600aagtcgctgg aactgaagtt accaatcacg ctggatgatt tgccagttgg attaatcttg 660cctttccccg catgaataat attgatgaat gcatgcgtga ggggtagttc gatgttggca 720atagctgcaa ttgccgcgac atcctccaac gagcataatt cttcagaaaa atagcgatgt 780tccatgttgt cagggcatgc atgatgcacg ttatgaggtg acggtgctag gcagtattcc 840ctcaaagttt catagtcagt atcatattca tcattgcatt 
cctgcaagag agaattgaga 900cgcaatccac acgctgcggc aaccttccgg cgttcgtggt ctatttgctc ttggacgttg 960caaacgtaag tgttggatcg atccggggtg ggcgaagaac tccagcatga gatccccgcg 1020ctggaggatc atccagccgg cgtcccggaa aacgattccg aagcccaacc tttcatagaa 1080ggcggcggtg gaatcgaaat ctcgtgatgg caggttgggc gtcgcttggt cggtcatttc 1140gaaccccaga gtcccgctca gaagaactcg tcaagaaggc gatagaaggc gatgcgctgc 1200gaatcgggag cggcgatacc gtaaagcacg aggaagcggt cagcccattc gccgccaagc 1260tcttcagcaa tatcacgggt agccaacgct atgtcctgat agcggtccgc cacacccagc 1320cggccacagt cgatgaatcc agaaaagcgg ccattttcca ccatgatatt cggcaagcag 1380gcatcgccat gggtcacgac gagatcctcg ccgtcgggca tgcgcgcctt gagcctggcg 1440aacagttcgg ctggcgcgag cccctgatgc tcttcgtcca gatcatcctg atcgacaaga 1500ccggcttcca tccgagtacg tgctcgctcg atgcgatgtt tcgcttggtg gtcgaatggg 1560caggtagccg gatcaagcgt atgcagccgc cgcattgcat cagccatgat ggatactttc 1620tcggcaggag caaggtgaga tgacaggaga tcctgccccg gcacttcgcc caatagcagc 1680cagtcccttc ccgcttcagt gacaacgtcg agcacagctg cgcaaggaac gcccgtcgtg 1740gccagccacg atagccgcgc tgcctcgtcc tgcagttcat tcagggcacc ggacaggtcg 1800gtcttgacaa aaagaaccgg gcgcccctgc gctgacagcc ggaacacggc ggcatcagag 1860cagccgattg tctgttgtgc ccagtcatag ccgaatagcc tctccaccca agcggccgga 1920gaacctgcgt gcaatccatc ttgttcaatc atgcgaaacg atccccgcaa gcttggagac 1980tggtgatttc agcgtgtcct ctccaaatga aatgaacttc cttatataga ggaagggtct 2040tgcgaaggat agtgggattg tgcgtcatcc cttacgtcag tggagatatc acatcaatcc 2100acttgctttg aagacgtggt tggaacgtct tctttttcca cgatgctcct cgtgggtggg 2160ggtccatctt tgggaccact gtcggcagag gcatcttcaa cgatggcctt tcctttatcg 2220caatgatggc atttgtagga gccaccttcc ttttccacta tcttcacaat aaagtgacag 2280atagctgggc aatggaatcc gaggaggttt ccggatatta ccctttgttg aaaagtctca 2340attgcccttt ggtcttctga gactgtatct ttgatatttt tggagtagac aagcgtgtcg 2400tgctccacca tgttgacgaa gattttcttc ttgtcattga gtcgtaagag actctgtatg 2460aactgttcgc cagtctttac ggcgagttct gttaggtcct ctatttgaat ctttgactcc 2520atggcctttg attcagtggg aactaccttt ttagagactc caatctctat tacttgcctt 2580ggtttgtgaa 
gcaagccttg aatcgtccat actggaatag tacttctgat cttgagaaat 2640atatctttct ctgtgttctt gatgcagtta gtcctgaatc ttttgactgc atctttaacc 2700ttcttgggaa ggtatttgat ctcctggaga ttattgctcg ggtagatcgt cttgatgaga 2760cctgctgcgt aagcctctct aaccatctgt gggttagcat tctttctgaa attgaaaagg 2820ctaatcttct cattatcagt ggtgaacatg gtatcgtcac cttctccgtc gaacttcctg 2880actagatcgt agagatagag gaagtcgtcc attgtgatct ctggggcaaa ggagatctga 2940attaattcga tatggtggat ttatcacaaa tgggacccgc cgccgacaga ggtgtgatgt 3000taggccagga ctttgaaaat ttgcgcaact atcgtatagt ggccgacaaa ttgacgccga 3060gttgacagac tgcctagcat ttgagtgaat tatgtgaggt aatgggctac actgaattgg 3120tagctcaaac tgtcagtatt tatgtatatg agtgtatatt ttcgcataat ctcagaccaa 3180tctgaagatg aaatgggtat ctgggaatgg cgaaatcaag gcatcgatcg tgaagtttct 3240catctaagcc cccatttgga cgtgaatgta gacacgtcga aataaagatt tccgaattag 3300aataatttgt ttattgcttt cgcctataaa tacgacggat cgtaatttgt cgttttatca 3360aaatgtactt tcattttata ataacgctgc ggacatctac atttttgaat tgaaaaaaaa 3420ttggtaatta ctctttcttt ttctccatat tgaccatcat actcattgct gatccatgta 3480gatttcccgg acatgaagcc atttacaatt gaatatatcc tgccgccgct gccgctttgc 3540acccggtgga gcttgcatgt tggtttctac gcagaactga gccggttagg cagataattt 3600ccattgagaa ctgagccatg tgcaccttcc ccccaacacg gtgagcgacg gggcaacgga 3660gtgatccaca tgggactttt aaacatcatc cgtcggatgg cgttgcgaga gaagcagtcg 3720atccgtgaga tcagccgacg caccgggcag gcgcgcaaca cgatcgcaaa gtatttgaac 3780gcaggtacaa tcgagccgac gttcacgcgg aacgaccaag caagctagct ttaatgcggt 3840agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 3900ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 3960gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 4020gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 4080tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 4140tacgcgatca tggcgaccac acccgtcctg tggtccaacc cctccgctgc tatagtgcag 4200tcggcttctg acgttcagtg cagccgtctt ctgaaaacga catgtcgcac aagtcctaag 4260ttacgcgaca ggctgccgcc ctgccctttt cctggcgttt 
tcttgtcgcg tgttttagtc 4320gcataaagta gaatacttgc gactagaacc ggagacatta cgccatgaac aagagcgccg 4380ccgctggcct gctgggctat gcccgcgtca gcaccgacga ccaggacttg accaaccaac 4440gggccgaact gcacgcggcc ggctgcacca agctgttttc cgagaagatc accggcacca 4500ggcgcgaccg cccggagctg gccaggatgc ttgaccacct acgccctggc gacgttgtga 4560cagtgaccag gctagaccgc ctggcccgca gcacccgcga cctactggac attgccgagc 4620gcatccagga ggccggcgcg ggcctgcgta gcctggcaga gccgtgggcc gacaccacca 4680cgccggccgg ccgcatggtg ttgaccgtgt tcgccggcat tgccgagttc gagcgttccc 4740taatcatcga ccgcacccgg agcgggcgcg aggccgccaa ggcccgaggc gtgaagtttg 4800gcccccgccc taccctcacc ccggcacaga tcgcgcacgc ccgcgagctg atcgaccagg 4860aaggccgcac cgtgaaagag gcggctgcac tgcttggcgt gcatcgctcg accctgtacc 4920gcgcacttga gcgcagcgag gaagtgacgc ccaccgaggc caggcggcgc ggtgccttcc 4980gtgaggacgc attgaccgag gccgacgccc tggcggccgc cgagaatgaa cgccaagagg 5040aacaagcatg aaaccgcacc aggacggcca ggacgaaccg tttttcatta ccgaagagat 5100cgaggcggag atgatcgcgg ccgggtacgt gttcgagccg cccgcgcacg tctcaaccgt 5160gcggctgcat gaaatcctgg ccggtttgtc tgatgccaag ctggcggcct ggccggccag 5220cttggccgct gaagaaaccg agcgccgccg tctaaaaagg tgatgtgtat ttgagtaaaa 5280cagcttgcgt catgcggtcg ctgcgtatat gatgcgatga gtaaataaac aaatacgcaa 5340gggaacgcat gaagttatcg ctgtacttaa ccagaaaggc gggtcaggca agacgaccat 5400cgcaacccat ctagcccgcg ccctgcaact cgccggggcc gatgttctgt tagtcgattc 5460cgatccccag ggcagtgccc gcgattgggc ggccgtgcgg gaagatcaac cgctaaccgt 5520tgtcggcatc
gaccgcccga cgattgaccg cgacgtgaag gccatcggcc ggcgcgactt 5580cgtagtgatc gacggagcgc cccaggcggc ggacttggct gtgtccgcga tcaaggcagc 5640cgacttcgtg ctgattccgg tgcagccaag cccttacgac atatgggcca ccgccgacct 5700ggtggagctg gttaagcagc gcattgaggt cacggatgga aggctacaag cggcctttgt 5760cgtgtcgcgg gcgatcaaag gcacgcgcat cggcggtgag gttgccgagg cgctggccgg 5820gtacgagctg cccattcttg agtcccgtat cacgcagcgc gtgagctacc caggcactgc 5880cgccgccggc acaaccgttc ttgaatcaga acccgagggc gacgctgccc gcgaggtcca 5940ggcgctggcc gctgaaatta aatcaaaact catttgagtt aatgaggtaa agagaaaatg 6000agcaaaagca caaacacgct aagtgccggc cgtccgagcg cacgcagcag caaggctgca 6060acgttggcca gcctggcaga cacgccagcc atgaagcggg tcaactttca gttgccggcg 6120gaggatcaca ccaagctgaa gatgtacgcg gtacgccaag gcaagaccat taccgagctg 6180ctatctgaat acatcgcgca gctaccagag taaatgagca aatgaataaa tgagtagatg 6240aattttagcg gctaaaggag gcggcatgga aaatcaagaa caaccaggca ccgacgccgt 6300ggaatgcccc atgtgtggag gaacgggcgg ttggccaggc gtaagcggct gggttgtctg 6360ccggccctgc aatggcactg gaacccccaa gcccgaggaa tcggcgtgag cggtcgcaaa 6420ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga gaagttgaag 6480gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg tgaatcgtgg 6540caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc cggtgcgccg 6600tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc gatgctctat 6660gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg tctgtcgaag 6720cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca cgtagaggtt 6780tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact gatggcggtt 6840tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa gcccggccgc 6900gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga tggcggaaag 6960cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt tgccatgcag 7020cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga agccttgatt 7080agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga gatcgagcta 7140gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct gacggttcac 7200cccgattact ttttgatcga tcccggcatc ggccgttttc 
tctaccgcct ggcacgccgc 7260gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg cagtggcagc 7320gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc aaatgacctg 7380ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt catgcgctac 7440cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca gatgctaggg 7500caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga tagcacgtac 7560attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa cccaaagccg 7620tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa aggcgatttt 7680tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc ctgtgcataa 7740ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg gtcgctgcgc 7800tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc aaaaatggct 7860ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc actcgaccgc 7920cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg aaaacctctg 7980acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg ggagcagaca 8040agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca tgacccagtc 8100acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca gattgtactg 8160agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa ataccgcatc 8220aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 8280gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca 8340ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 8400ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt 8460cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 8520ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct 8580tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc 8640gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 8700tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca 8760gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 8820tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag 8880ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt 8940agcggtggtt 
tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa 9000gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg 9060attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga 9120agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta 9180atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc 9240cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg 9300ataccgcgag acccacgctc accggctcca gatttatcag caataaacca gccagccgga 9360agggccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt 9420tgccgggaag ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt 9480gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc 9540caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc 9600ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca 9660gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag 9720tactcaacca agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg 9780tcaacacggg ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa 9840gacctgcagg gggggggggg cgctgaggtc tgcctcgtga agaaggtgtt gctgactcat 9900accaggcctg aatcgcccca tcatccagcc agaaagtgag ggagccacgg ttgatgagag 9960ctttgttgta ggtggaccag ttggtgattt tgaacttttg ctttgccacg gaacggtctg 10020cgttgtcggg aagatgcgtg atctgatcct tcaactcagc aaaagttcga tttattcaac 10080aaagccgccg tcccgtcaag tcagcgtaat gctctgccag tgttacaacc aattaaccaa 10140ttctgattag aaaaactcat cgagcatcaa atgaaactgc aatttattca tatcaggatt 10200atcaatacca tatttttgaa aaagccgttt ctgtaatgaa ggagaaaact caccgaggca 10260gttccatagg atggcaagat cctggtatcg gtctgcgatt ccgactcgtc caacatcaat 10320acaacctatt aatttcccct cgtcaaaaat aaggttatca agtgagaaat caccatgagt 10380gacgactgaa tccggtgaga atggcaaaag cttatgcatt tctttccaga cttgttcaac 10440aggccagcca ttacgctcgt catcaaaatc actcgcatca accaaaccgt tattcattcg 10500tgattgcgcc tgagcgagac gaaatacgcg atcgctgtta aaaggacaat tacaaacagg 10560aatcgaatgc aaccggcgca ggaacactgc cagcgcatca acaatatttt cacctgaatc 10620aggatattct tctaatacct ggaatgctgt 
tttcccgggg atcgcagtgg tgagtaacca 10680tgcatcatca ggagtacgga taaaatgctt gatggtcgga agaggcataa attccgtcag 10740ccagtttagt ctgaccatct catctgtaac atcattggca acgctacctt tgccatgttt 10800cagaaacaac tctggcgcat cgggcttccc atacaatcga tagattgtcg cacctgattg 10860cccgacatta tcgcgagccc atttataccc atataaatca gcatccatgt tggaatttaa 10920tcgcggcctc gagcaagacg tttcccgttg aatatggctc ataacacccc ttgtattact 10980gtttatgtaa gcagacagtt ttattgttca tgatgatata tttttatctt gtgcaatgta 11040acatcagaga ttttgagaca caacgtggct ttcccccccc cccctgcagg tcaattcggt 11100cgatatggct attacgaaga aggctcgtgc gcggagtccc gtgaactttc ccacgcaaca 11160agtgaaccgc accgggtttg ccggaggcca tttcgttaaa atgcgcagcc atggctgctt 11220cgtccagcat ggcgtaatac tgatcctcgt cttcggctgg cggtatattg ccgatgggct 11280tcaaaagccg ccgtggttga accagtctat ccattccaag gtagcgaact cgaccgcttc 11340gaagctcctc catggtccac gccgatgaat gacctcggcc ttgtaaagac cgttgatcgc 11400ttctgcgagg gcgttgtcgt gctgtcgccg acgcttccga tagatggctc gatacctgct 11460tctgccaacc gctcggaata gcgaaaggac acgtattgaa caccgcgatc cgagtgatgc 11520actaggccgc catgagcggg acgccgatca tgatgagcct cctcgagggc atcgaggaca 11580aagcctgcat gtgctgtccg gctcgcccgc catccgacaa tgcgacgggc gaagacgtcg 11640atcacgaagg ccacgtagac gaagccctcc caagtggcga cataagtacg gacatgcgca 11700aaggctttcc cggtttgtcg ctgatggtgc aagagacgct gaagcgcgat ccgatgcgca 11760ggcatctgtt cgtcttccgc ggtcgtggcg gtggcctgat caaggtcact cgccgaagag 11820ctgcatgatt ggctcgaaac cgagcggggg aaattgtcgc gcagttctcc cgtcgccgag 11880gcgataaatt acatgctcaa gcgatgggat ggcattacgt cattcctcga tgacggcccg 11940atttgcctga cgaacaatgc tgccgaacga acgctcagag gctatgtact cggcaggaag 12000tcatggctgt ttgccggatc ggatcgttgt gctgaacgtg cggcgttcat ggcgacactg 12060atcatgagcg ccaagctcaa taacatcgat ccgcaggcct ggcttgccga cgtccgcgcc 12120gaccttgcgg acgctccgat cagcaggctt gagcaacagc tgccgtggaa ctggacatcc 12180aagacactga gtgctcaggc ggcctgacct gcggccttca ccggatactt accccattat 12240cgcagattgc gatgaagcat cagcgtcatt cagcaatctt gccaaagtat gcaggctcgc 12300gagaatcgac gtgcgaaacc ggctggttgc gccaaagatc 
cgcttgcgga gcggtcgaac 12360attcatgctg ggacttcaag aggtcgagta gaggaagaac cggaaaggtt gcaccggaaa 12420atatgcgttc ctttggagag cgcctcatgg acgtgaacaa atcgcccgga ccaaggatgc 12480cacggataca aaagctcgcg aagctcggtc ccgtgggtgt tctgtcgtct cgttgtacaa 12540cgaaatccat tcccattccg cgctcaagat ggcttcccct cggcagttca tcagggctaa 12600atcaatctag ccgacttgtc cggtgaaatg ggctgcactc caacagaaac aatcaaacaa 12660acatacacag cgacttattc acacgagctc aaattacaac ggtatatatc ctgccagtca 12720gcatcatcac accaaaagtt aggcccgaat agtttgaaat tagaaagctc gcaattgagg 12780tctacaggcc aaattcgctc ttagccgtac aatattactc accggtgcga tgccccccat 12840cgtaggtgaa ggtggaaatt aatgatccat cttgagacca caggcccaca acagctacca 12900gtttcctcaa gggtccacca aaaacgtaag cgcttacgta catggtcgat aagaaaaggc 12960aatttgtaga tgttaacatc caacgtcgct ttcagggatc gatccaatac gcaaaccgcc 13020tctccccgcg cgttggccga ttcattaatg cagctggcac gacaggtttc ccgactggaa 13080agcgggcagt gagcgcaacg caattaatgt gagttagctc actcattagg caccccaggc 13140tttacacttt atgcttccgg ctcgtatgtt gtgtggaatt gtgagcggat aacaatttca 13200cacaggaaac agctatgacc atgattacgc caagcttgca tgcctgcagg tcgactctag 13260aggatctggc gcgccaagct tggatcctag cctaagtacg tactcaaaat gccaacaaat 13320aaaaaaaaag ttgctttaat aatgccaaaa caaattaata aaacacttac aacaccggat 13380tttttttaat taaaatgtgc catttaggat aaatagttaa tatttttaat aattatttaa 13440aaagccgtat ctactaaaat gatttttatt tggttgaaaa tattaatatg tttaaatcaa 13500cacaatctat caaaattaaa ctaaaaaaaa aataagtgta cgtggttaac attagtacag 13560taatataaga ggaaaatgag aaattaagaa attgaaagcg agtctaattt ttaaattatg 13620aacctgcata tataaaagga aagaaagaat ccaggaagaa aagaaatgaa accatgcatg 13680gtcccctcgt catcacgagt ttctgccatt tgcaatagaa acactgaaac acctttctct 13740ttgtcactta attgagatgc cgaagccacc tcacaccatg aacttcatga ggtgtagcac 13800ccaaggcttc catagccatg catactgaag aatgtctcaa gctcagcacc ctacttctgt 13860gacgtgtccc tcattcacct tcctctcttc cctataaata accacgcctc aggttctccg 13920cttcacaact caaacattct ctccattggt ccttaaacac tcatcagtca tcaccatgtc 13980ttccatagcc ccccaagcgg ccgcgttgtt gtgtctcttc ttcctcgaat 
ctcgatcgtt 14040acatcaccgg gttctagcct tcacgatgtg cttttgagca tgagatttgg tttgacgcga 14100catctccctc tcaaacgatc tttctccaat tattcaatca cttccgtatc tccagaacaa 14160cagctcaaat ctccggtgac catggcgacg accgagagca agaatcttgt agggatcctt 14220ccaaggagga gacaaacaag aaggagacag aagataagaa ggaggtggga gtttcggttc 14280ctccaccgcc agagaaacca gagcctggcg attgttgcgg tagcggttgc gtccgatgcg 14340tttgggatgt ttattacgat gagctcgaag attacaacaa gcagctttct ggagaaacta 14400actgcagcta caagattctt gctctcggtc gtcgccatgg tcaccggaga tttgagctgt 14460tgttctggag atacggaagt gattgaataa ttggagaaag atcgtttgag agggagatgt 14520cgcgtcaaac caaatctcat gctcaaaagc acatcgtgaa ggctagaacc cggtgatgta 14580acgatcgaga ttcgaggaag aagagacaca acaacgcggc cgcgacacaa gtgtgagagt 14640actaaataaa tgctttggtt gtacgaaatc attacactaa ataaaataat caaagcttat 14700atatgccttc cgctaaggcc gaatgcaaag aaattggttc tttctcgtta tcttttgcca 14760cttttactag tacgtattaa ttactactta atcatctttg tttacggctc attatatccg 14820tcgacgg 148278621DNAartificial sequenceamiRNA 86taacccaaca atcatcgacc c 218721DNAartificial sequenceamiRNA 87ttggagaaaa tagggtaggg t 218821DNAartificial sequenceamiRNA 88gggacgatga ttgttgggtt a 218921DNAartificial sequenceamiRNA 89accctaccct acattctcca t 2190590DNAartificial sequencemicroRNA precursor 90gcggccgcgc gagaaacttt gtatgggcat ggttatttct cacttctcac cctcctttac 60tttcttatgc taaatcctcc ttcccctata tctccaccct caaccccttt ttctcattat 120aacttttggt gcctagatgg tgtgtgtgtg tgcgcgcgag agatctgagc tcaattttcc 180tctctcaagt cctggtcatg cttttccaca gctttcttga acttcttatg catcttatat 240ctctccacct ccaggatttt aagccctaga agctcaagaa agctgtggga gaatatggca 300attcaggctt ttaattgctt tcatttggta ccatcacttg caagatttca gagtacaagg 360tgaacacaca catcttcctc ttcatcaatt ctctagtttc atccttatct tttcattcac 420ggtaactctc actaccctct ttcatcttat aagttatacc gggggtgtga tgttgatgag 480tgtaaattaa atatatgtga tctctttctc tggaaaaatt ttcagtgtga tatacatann 540natctcttaa tctagagatt ttatggcttt gttatatata aggcggccgc 59091605DNAartificial sequencemicroRNA precursor 91gcggccgcgc gagaaacttt 
gtatgggcat ggttatttct cacttctcac cctcctttac 60tttcttatgc taaatcctcc ttcccctata tctccaccct caaccccttt ttctcattat 120aacttttggt gcctagatgg tgtgtgtgtg tgcgcgcgag agatctgagc tcaattttcc 180tctctcaagt cctggtcatg ctgtttaaac cacagctttc ttgaacttct tatgcatctt 240atatctctcc acctccagga ttttaagccc tagaagctca agaaagctgt gggagtttaa 300actatggcaa ttcaggcttt taattgcttt catttggtac catcacttgc aagatttcag 360agtacaaggt gaacacacac atcttcctct tcatcaattc tctagtttca tccttatctt 420ttcattcacg gtaactctca ctaccctctt tcatcttata agttataccg ggggtgtgat 480gttgatgagt gtaaattaaa tatatgtgat ctctttctct ggaaaaattt tcagtgtgat 540atacatannn atctcttaat ctagagattt tatggctttg ttatatataa ggaattcgcg 600gccgc 6059259DNAartificial sequenceprimer 92tctcaagtcc tggtcatgct ttaacccaac aatcatcgac cccttatgca tcttatatc 599359DNAartificial sequenceprimer 93cctgaattgc catattctaa cccaacaatc atcgtcccct agggcttaaa atcctggag 59944536DNAartificial sequenceplasmid 94aagggcgaat tctgcagata tccatcacac tggcggccgc tcgagcatgc atctagaggg 60cccaattcgc cctatagtga gtcgtattac aattcactgg ccgtcgtttt acaacgtcgt 120gactgggaaa accctggcgt tacccaactt aatcgccttg cagcacatcc ccctttcgcc 180agctggcgta atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg 240aatggcgaat ggacgcgccc tgtagcggcg cattaagcgc ggcgggtgtg gtggttacgc 300gcagcgtgac cgctacactt gccagcgccc tagcgcccgc tcctttcgct ttcttccctt 360cctttctcgc cacgttcgcc ggctttcccc gtcaagctct aaatcggggg ctccctttag 420ggttccgatt tagtgcttta cggcacctcg accccaaaaa acttgattag ggtgatggtt 480cacgtagtgg gccatcgccc tgatagacgg tttttcgccc tttgacgttg gagtccacgt 540tctttaatag tggactcttg ttccaaactg gaacaacact caaccctatc tcggtctatt 600cttttgattt ataagggatt ttgccgattt cggcctattg gttaaaaaat gagctgattt 660aacaaaaatt taacgcgaat tttaacaaaa ttcagggcgc aagggctgct aaaggaagcg 720gaacacgtag aaagccagtc cgcagaaacg gtgctgaccc cggatgaatg tcagctactg 780ggctatctgg acaagggaaa acgcaagcgc aaagagaaag caggtagctt gcagtgggct 840tacatggcga tagctagact gggcggtttt atggacagca agcgaaccgg aattgccagc 900tggggcgccc tctggtaagg ttgggaagcc ctgcaaagta 
aactggatgg ctttcttgcc 960gccaaggatc tgatggcgca ggggatcaag atctgatcaa gagacaggat gaggatcgtt 1020tcgcatgatt gaacaagatg gattgcacgc aggttctccg gccgcttggg tggagaggct 1080attcggctat gactgggcac aacagacaat cggctgctct gatgccgccg tgttccggct 1140gtcagcgcag gggcgcccgg ttctttttgt caagaccgac ctgtccggtg ccctgaatga 1200actgcaggac gaggcagcgc ggctatcgtg gctggccacg acgggcgttc cttgcgcagc 1260tgtgctcgac gttgtcactg aagcgggaag ggactggctg ctattgggcg aagtgccggg 1320gcaggatctc ctgtcatccc accttgctcc tgccgagaaa gtatccatca tggctgatgc 1380aatgcggcgg ctgcatacgc ttgatccggc tacctgccca ttcgaccacc aagcgaaaca 1440tcgcatcgag cgagcacgta ctcggatgga agccggtctt gtcgatcagg atgatctgga 1500cgaagagcat caggggctcg cgccagccga actgttcgcc aggctcaagg cgcgcatgcc 1560cgacggcgag gatctcgtcg tgacccatgg cgatgcctgc ttgccgaata tcatggtgga 1620aaatggccgc ttttctggat tcatcgactg tggccggctg ggtgtggcgg accgctatca 1680ggacatagcg ttggctaccc gtgatattgc tgaagagctt ggcggcgaat gggctgaccg 1740cttcctcgtg ctttacggta tcgccgctcc cgattcgcag cgcatcgcct tctatcgcct 1800tcttgacgag ttcttctgaa ttgaaaaagg aagagtatga gtattcaaca tttccgtgtc 1860gcccttattc ccttttttgc ggcattttgc cttcctgttt ttgctcaccc agaaacgctg 1920gtgaaagtaa aagatgctga agatcagttg ggtgcacgag tgggttacat cgaactggat 1980ctcaacagcg gtaagatcct tgagagtttt cgccccgaag aacgttttcc aatgatgagc 2040acttttaaag ttctgctatg tggcgcggta ttatcccgta ttgacgccgg gcaagagcaa 2100ctcggtcgcc gcatacacta ttctcagaat gacttggttg agtactcacc agtcacagaa 2160aagcatctta cggatggcat gacagtaaga gaattatgca gtgctgccat aaccatgagt 2220gataacactg cggccaactt acttctgaca acgatcggag gaccgaagga gctaaccgct 2280tttttgcaca acatggggga tcatgtaact cgccttgatc gttgggaacc ggagctgaat 2340gaagccatac caaacgacga gcgtgacacc acgatgcctg tagcaatggc aacaacgttg 2400cgcaaactat taactggcga actacttact ctagcttccc ggcaacaatt aatagactgg 2460atggaggcgg ataaagttgc aggaccactt ctgcgctcgg cccttccggc tggctggttt 2520attgctgata aatctggagc cggtgagcgt gggtctcgcg gtatcattgc agcactgggg 2580ccagatggta agccctcccg tatcgtagtt atctacacga cggggagtca ggcaactatg 2640gatgaacgaa 
atagacagat cgctgagata ggtgcctcac tgattaagca ttggtaactg 2700tcagaccaag tttactcata tatactttag attgatttaa aacttcattt ttaatttaaa 2760aggatctagg tgaagatcct ttttgataat ctcatgacca aaatccctta acgtgagttt 2820tcgttccact gagcgtcaga ccccgtagaa aagatcaaag gatcttcttg agatcctttt 2880tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac cgctaccagc ggtggtttgt 2940ttgccggatc aagagctacc aactcttttt ccgaaggtaa ctggcttcag cagagcgcag 3000ataccaaata ctgttcttct agtgtagccg tagttaggcc accacttcaa gaactctgta 3060gcaccgccta catacctcgc tctgctaatc ctgttaccag tggctgctgc cagtggcgat 3120aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac cggataaggc gcagcggtcg 3180ggctgaacgg ggggttcgtg cacacagccc agcttggagc gaacgaccta caccgaactg 3240agatacctac agcgtgagct atgagaaagc gccacgcttc ccgaagggag aaaggcggac 3300aggtatccgg taagcggcag ggtcggaaca ggagagcgca cgagggagct tccaggggga 3360aacgcctggt atctttatag tcctgtcggg tttcgccacc tctgacttga gcgtcgattt 3420ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg ccagcaacgc ggccttttta 3480cggttcctgg ccttttgctg gccttttgct cacatgttct ttcctgcgtt atcccctgat 3540tctgtggata accgtattac cgcctttgag tgagctgata ccgctcgccg cagccgaacg 3600accgagcgca gcgagtcagt gagcgaggaa gcggaagagc gcccaatacg caaaccgcct 3660ctccccgcgc gttggccgat tcattaatgc agctggcacg acaggtttcc cgactggaaa 3720gcgggcagtg agcgcaacgc aattaatgtg agttagctca ctcattaggc
accccaggct 3780ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata acaatttcac 3840acaggaaaca gctatgacca tgattacgcc aagcttggta ccgagctcgg atccactagt 3900aacggccgcc agtgtgctgg aattcgccct tgcggccgcg cgagaaactt tgtatgggca 3960tggttatttc tcacttctca ccctccttta ctttcttatg ctaaatcctc cttcccctat 4020atctccaccc tcaacccctt tttctcatta taacttttgg tgcctagatg gtgtgtgtgt 4080gtgcgcgcga gagatctgag ctcaattttc ctctctcaag tcctggtcat gctgtttaaa 4140ccacagcttt cttgaacttc ttatgcatct tatatctctc cacctccagg attttaagcc 4200ctagaagctc aagaaagctg tgggagttta aactatggca attcaggctt ttaattgctt 4260tcatttggta ccatcacttg caagatttca gagtacaagg tgaacacaca catcttcctc 4320ttcatcaatt ctctagtttc atccttatct tttcattcac ggtaactctc actaccctct 4380ttcatcttat aagttatacc gggggtgtga tgttgatgag tgtaaattaa atatatgtga 4440tctctttctc tggaaaaatt ttcagtgtga tatacatann natctcttaa tctagagatt 4500ttatggcttt gttatatata aggaattcgc ggccgc 453695974DNAartificial sequencemicroRNA precursor 95gcggccgctt ctagctagct agggtttggg tagtgagtgt aataaagttg caaagttttt 60ggttaggtta cgttttgacc ttattattat agttcaaagg gaaacattaa ttaaagggga 120ttatgaagtg gagctccttg aagtccaatt gaggatctta ctgggtgaat tgagctgctt 180agctatggat cccacagttc tacccatcaa taagtgcttt tgtggtagtc ttgtggcttc 240catatctggg gagcttcatt tgcctttata gtattaacct tctttggatt gaagggagct 300ctacaccctt ctcttctttt ctctcataat aatttaaatt tgttatagac tctaaacttt 360aaatgttttt tttgaagttt ttccgttttt ctcttttgcc atgatcccgt tcttgctgtg 420gagtaacctt gtccgaggta tgtgcatgat tagatccata cttaatttgt gtgcatcacg 480aaggtgaggt tgaaatgaac tttgcttttt tgacctttta ggaaagttct tttgttgcag 540taatcaattt taattagttt taattgacac tattactttt attgtcatct ttgttagttt 600tattgttgaa ttgagtgcat atttcctagg aaattctctt acctaacatt ttttatacag 660atctatgctc ttggctcttg cccttactct tggccttgtg ttggttattt gtctacatat 720ttattgactg gtcgatgaga catgtcacaa ttcttgggct tatttgttgg tctaataaaa 780ggagtgctta ttgaaagatc aagacggaga ttcggtttta tataaataaa ctaaagatga 840catattagtg tgttgatgtc tcttcaggat aatttttgtt tgaaataata tggtaatgtc 900ttgtctaaat 
ttgtgtacat aattcttact gattttttgg attgttggat ttttataaac 960aaatctgcgg ccgc 97496990DNAartificial sequencein-fusion ready microRNA 159 precursor 96gcggccgctt ctagctagct agggtttggg tagtgagtgt aataaagttg caaagttttt 60ggttaggtta cgttttgacc ttattattat agttcaaagg gaaacattaa ttaaagggga 120ttatgaagtg tttaaacgga gctccttgaa gtccaattga ggatcttact gggtgaattg 180agctgcttag ctatggatcc cacagttcta cccatcaata agtgcttttg tggtagtctt 240gtggcttcca tatctgggga gcttcatttg cctttatagt attaaccttc tttggattga 300agggagctct agtttaaacc acccttctct tcttttctct cataataatt taaatttgtt 360atagactcta aactttaaat gttttttttg aagtttttcc gtttttctct tttgccatga 420tcccgttctt gctgtggagt aaccttgtcc gaggtatgtg catgattaga tccatactta 480atttgtgtgc atcacgaagg tgaggttgaa atgaactttg cttttttgac cttttaggaa 540agttcttttg ttgcagtaat caattttaat tagttttaat tgacactatt acttttattg 600tcatctttgt tagttttatt gttgaattga gtgcatattt cctaggaaat tctcttacct 660aacatttttt atacagatct atgctcttgg ctcttgccct tactcttggc cttgtgttgg 720ttatttgtct acatatttat tgactggtcg atgagacatg tcacaattct tgggcttatt 780tgttggtcta ataaaaggag tgcttattga aagatcaaga cggagattcg gttttatata 840aataaactaa agatgacata ttagtgtgtt gatgtctctt caggataatt tttgtttgaa 900ataatatggt aatgtcttgt ctaaatttgt gtacataatt cttactgatt ttttggattg 960ttggattttt ataaacaaat ctgcggccgc 9909754DNAartificial sequenceprimer 97attaaagggg attatgaaga ccctacccta cattctccat tgaggatctt actg 549853DNAartificial sequenceprimer 98agaaaagaag agaagggtga ccctacccta ttttctccaa gaaggttaat act 53994911DNAartificial sequenceplasmid 99ggccgcgaat tcttctagct agctagggtt tgggtagtga gtgtaataaa gttgcaaagt 60ttttggttag gttacgtttt gaccttatta ttatagttca aagggaaaca ttaattaaag 120gggattatga agaccctacc ctacattctc cattgaggat cttactgggt gaattgagct 180gcttagctat ggatcccaca gttctaccca tcaataagtg cttttgtggt agtcttgtgg 240cttccatatc tggggagctt catttgcctt tatagtatta accttcttgg agaaaatagg 300gtagggtcac ccttctcttc ttttctctca taataattta aatttgttat agactctaaa 360ctttaaatgt tttttttgaa gtttttccgt ttttctcttt tgccatgatc ccgttcttgc 
420tgtggagtaa ccttgtccga ggtatgtgca tgattagatc catacttaat ttgtgtgcat 480cacgaaggtg aggttgaaat gaactttgct tttttgacct tttaggaaag ttcttttgtt 540gcagtaatca attttaatta gttttaattg acactattac ttttattgtc atctttgtta 600gttttattgt tgaattgagt gcatatttcc taggaaattc tcttacctaa cattttttat 660acagatctat gctcttggct cttgccctta ctcttggcct tgtgttggtt atttgtctac 720atatttattg actggtcgat gagacatgtc acaattcttg ggcttatttg ttggtctaat 780aaaaggagtg cttattgaaa gatcaagacg gagattcggt tttatataaa taaactaaag 840atgacatatt agtgtgttga tgtctcttca ggataatttt tgtttgaaat aatatggtaa 900tgtcttgtct aaatttgtgt acataattct tactgatttt ttggattgtt ggatttttat 960aaacaaatct gcggccgcaa gggcgaattc tgcagatatc catcacactg gcggccgctc 1020gagcatgcat ctagagggcc caattcgccc tatagtgagt cgtattacaa ttcactggcc 1080gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa tcgccttgca 1140gcacatcccc ctttcgccag ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc 1200caacagttgc gcagcctgaa tggcgaatgg acgcgccctg tagcggcgca ttaagcgcgg 1260cgggtgtggt ggttacgcgc agcgtgaccg ctacacttgc cagcgcccta gcgcccgctc 1320ctttcgcttt cttcccttcc tttctcgcca cgttcgccgg ctttccccgt caagctctaa 1380atcgggggct ccctttaggg ttccgattta gtgctttacg gcacctcgac cccaaaaaac 1440ttgattaggg tgatggttca cgtagtgggc catcgccctg atagacggtt tttcgccctt 1500tgacgttgga gtccacgttc tttaatagtg gactcttgtt ccaaactgga acaacactca 1560accctatctc ggtctattct tttgatttat aagggatttt gccgatttcg gcctattggt 1620taaaaaatga gctgatttaa caaaaattta acgcgaattt taacaaaatt cagggcgcaa 1680gggctgctaa aggaagcgga acacgtagaa agccagtccg cagaaacggt gctgaccccg 1740gatgaatgtc agctactggg ctatctggac aagggaaaac gcaagcgcaa agagaaagca 1800ggtagcttgc agtgggctta catggcgata gctagactgg gcggttttat ggacagcaag 1860cgaaccggaa ttgccagctg gggcgccctc tggtaaggtt gggaagccct gcaaagtaaa 1920ctggatggct ttcttgccgc caaggatctg atggcgcagg ggatcaagat ctgatcaaga 1980gacaggatga ggatcgtttc gcatgattga acaagatgga ttgcacgcag gttctccggc 2040cgcttgggtg gagaggctat tcggctatga ctgggcacaa cagacaatcg gctgctctga 2100tgccgccgtg ttccggctgt cagcgcaggg gcgcccggtt 
ctttttgtca agaccgacct 2160gtccggtgcc ctgaatgaac tgcaggacga ggcagcgcgg ctatcgtggc tggccacgac 2220gggcgttcct tgcgcagctg tgctcgacgt tgtcactgaa gcgggaaggg actggctgct 2280attgggcgaa gtgccggggc aggatctcct gtcatcccac cttgctcctg ccgagaaagt 2340atccatcatg gctgatgcaa tgcggcggct gcatacgctt gatccggcta cctgcccatt 2400cgaccaccaa gcgaaacatc gcatcgagcg agcacgtact cggatggaag ccggtcttgt 2460cgatcaggat gatctggacg aagagcatca ggggctcgcg ccagccgaac tgttcgccag 2520gctcaaggcg cgcatgcccg acggcgagga tctcgtcgtg acccatggcg atgcctgctt 2580gccgaatatc atggtggaaa atggccgctt ttctggattc atcgactgtg gccggctggg 2640tgtggcggac cgctatcagg acatagcgtt ggctacccgt gatattgctg aagagcttgg 2700cggcgaatgg gctgaccgct tcctcgtgct ttacggtatc gccgctcccg attcgcagcg 2760catcgccttc tatcgccttc ttgacgagtt cttctgaatt gaaaaaggaa gagtatgagt 2820attcaacatt tccgtgtcgc ccttattccc ttttttgcgg cattttgcct tcctgttttt 2880gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg tgcacgagtg 2940ggttacatcg aactggatct caacagcggt aagatccttg agagttttcg ccccgaagaa 3000cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt atcccgtatt 3060gacgccgggc aagagcaact cggtcgccgc atacactatt ctcagaatga cttggttgag 3120tactcaccag tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt 3180gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac gatcggagga 3240ccgaaggagc taaccgcttt tttgcacaac atgggggatc atgtaactcg ccttgatcgt 3300tgggaaccgg agctgaatga agccatacca aacgacgagc gtgacaccac gatgcctgta 3360gcaatggcaa caacgttgcg caaactatta actggcgaac tacttactct agcttcccgg 3420caacaattaa tagactggat ggaggcggat aaagttgcag gaccacttct gcgctcggcc 3480cttccggctg gctggtttat tgctgataaa tctggagccg gtgagcgtgg gtctcgcggt 3540atcattgcag cactggggcc agatggtaag ccctcccgta tcgtagttat ctacacgacg 3600gggagtcagg caactatgga tgaacgaaat agacagatcg ctgagatagg tgcctcactg 3660attaagcatt ggtaactgtc agaccaagtt tactcatata tactttagat tgatttaaaa 3720cttcattttt aatttaaaag gatctaggtg aagatccttt ttgataatct catgaccaaa 3780atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga 3840tcttcttgag 
atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg 3900ctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact 3960ggcttcagca gagcgcagat accaaatact gttcttctag tgtagccgta gttaggccac 4020cacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg 4080gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg 4140gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga 4200acgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc 4260gaagggagaa aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg 4320agggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc 4380tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc 4440agcaacgcgg cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt 4500cctgcgttat cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc 4560gctcgccgca gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc 4620ccaatacgca aaccgcctct ccccgcgcgt tggccgattc attaatgcag ctggcacgac 4680aggtttcccg actggaaagc gggcagtgag cgcaacgcaa ttaatgtgag ttagctcact 4740cattaggcac cccaggcttt acactttatg cttccggctc gtatgttgtg tggaattgtg 4800agcggataac aatttcacac aggaaacagc tatgaccatg attacgccaa gcttggtacc 4860gagctcggat ccactagtaa cggccgccag tgtgctggaa ttcgcccttg c 49111009130DNAartificial sequenceplasmid 100ggccgcgaca caagtgtgag agtactaaat aaatgctttg gttgtacgaa atcattacac 60taaataaaat aatcaaagct tatatatgcc ttccgctaag gccgaatgca aagaaattgg 120ttctttctcg ttatcttttg ccacttttac tagtacgtat taattactac ttaatcatct 180ttgtttacgg ctcattatat ccgtcgacgg cgcgcccgat catccggata tagttcctcc 240tttcagcaaa aaacccctca agacccgttt agaggcccca aggggttatg ctagttattg 300ctcagcggtg gcagcagcca actcagcttc ctttcgggct ttgttagcag ccggatcgat 360ccaagctgta cctcactatt cctttgccct cggacgagtg ctggggcgtc ggtttccact 420atcggcgagt acttctacac agccatcggt ccagacggcc gcgcttctgc gggcgatttg 480tgtacgcccg acagtcccgg ctccggatcg gacgattgcg tcgcatcgac cctgcgccca 540agctgcatca tcgaaattgc cgtcaaccaa gctctgatag agttggtcaa gaccaatgcg 600gagcatatac gcccggagcc 
gcggcgatcc tgcaagctcc ggatgcctcc gctcgaagta 660gcgcgtctgc tgctccatac aagccaacca cggcctccag aagaagatgt tggcgacctc 720gtattgggaa tccccgaaca tcgcctcgct ccagtcaatg accgctgtta tgcggccatt 780gtccgtcagg acattgttgg agccgaaatc cgcgtgcacg aggtgccgga cttcggggca 840gtcctcggcc caaagcatca gctcatcgag agcctgcgcg acggacgcac tgacggtgtc 900gtccatcaca gtttgccagt gatacacatg gggatcagca atcgcgcata tgaaatcacg 960ccatgtagtg tattgaccga ttccttgcgg tccgaatggg ccgaacccgc tcgtctggct 1020aagatcggcc gcagcgatcg catccatagc ctccgcgacc ggctgcagaa cagcgggcag 1080ttcggtttca ggcaggtctt gcaacgtgac accctgtgca cggcgggaga tgcaataggt 1140caggctctcg ctgaattccc caatgtcaag cacttccgga atcgggagcg cggccgatgc 1200aaagtgccga taaacataac gatctttgta gaaaccatcg gcgcagctat ttacccgcag 1260gacatatcca cgccctccta catcgaagct gaaagcacga gattcttcgc cctccgagag 1320ctgcatcagg tcggagacgc tgtcgaactt ttcgatcaga aacttctcga cagacgtcgc 1380ggtgagttca ggcttttcca tgggtatatc tccttcttaa agttaaacaa aattatttct 1440agagggaaac cgttgtggtc tccctatagt gagtcgtatt aatttcgcgg gatcgagatc 1500gatccaattc caatcccaca aaaatctgag cttaacagca cagttgctcc tctcagagca 1560gaatcgggta ttcaacaccc tcatatcaac tactacgttg tgtataacgg tccacatgcc 1620ggtatatacg atgactgggg ttgtacaaag gcggcaacaa acggcgttcc cggagttgca 1680cacaagaaat ttgccactat tacagaggca agagcagcag ctgacgcgta cacaacaagt 1740cagcaaacag acaggttgaa cttcatcccc aaaggagaag ctcaactcaa gcccaagagc 1800tttgctaagg ccctaacaag cccaccaaag caaaaagccc actggctcac gctaggaacc 1860aaaaggccca gcagtgatcc agccccaaaa gagatctcct ttgccccgga gattacaatg 1920gacgatttcc tctatcttta cgatctagga aggaagttcg aaggtgaagg tgacgacact 1980atgttcacca ctgataatga gaaggttagc ctcttcaatt tcagaaagaa tgctgaccca 2040cagatggtta gagaggccta cgcagcaggt ctcatcaaga cgatctaccc gagtaacaat 2100ctccaggaga tcaaatacct tcccaagaag gttaaagatg cagtcaaaag attcaggact 2160aattgcatca agaacacaga gaaagacata tttctcaaga tcagaagtac tattccagta 2220tggacgattc aaggcttgct tcataaacca aggcaagtaa tagagattgg agtctctaaa 2280aaggtagttc ctactgaatc taaggccatg catggagtct aagattcaaa tcgaggatct 
2340aacagaactc gccgtgaaga ctggcgaaca gttcatacag agtcttttac gactcaatga 2400caagaagaaa atcttcgtca acatggtgga gcacgacact ctggtctact ccaaaaatgt 2460caaagataca gtctcagaag accaaagggc tattgagact tttcaacaaa ggataatttc 2520gggaaacctc ctcggattcc attgcccagc tatctgtcac ttcatcgaaa ggacagtaga 2580aaaggaaggt ggctcctaca aatgccatca ttgcgataaa ggaaaggcta tcattcaaga 2640tgcctctgcc gacagtggtc ccaaagatgg acccccaccc acgaggagca tcgtggaaaa 2700agaagacgtt ccaaccacgt cttcaaagca agtggattga tgtgacatct ccactgacgt 2760aagggatgac gcacaatccc actatccttc gcaagaccct tcctctatat aaggaagttc 2820atttcatttg gagaggacac gctcgagctc atttctctat tacttcagcc ataacaaaag 2880aactcttttc tcttcttatt aaaccatgaa aaagcctgaa ctcaccgcga cgtctgtcga 2940gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct cggagggcga 3000agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc gggtaaatag 3060ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat cggccgcgct 3120cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct attgcatctc 3180ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct 3240gcagccggtc gcggaggcca tggatgcgat cgctgcggcc gatcttagcc agacgagcgg 3300gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg atttcatatg 3360cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca ccgtcagtgc 3420gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc ccgaagtccg 3480gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac 3540agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg tcgccaacat 3600cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact tcgagcggag 3660gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca ttggtcttga 3720ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg 3780atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag 3840aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg gaaaccgacg 3900ccccagcact cgtccgaggg caaaggaata gtgaggtacc taaagaagga gtgcgtcgaa 3960gcagatcgtt caaacatttg gcaataaagt ttcttaagat tgaatcctgt tgccggtctt 4020gcgatgatta tcatataatt tctgttgaat 
tacgttaagc atgtaataat taacatgtaa 4080tgcatgacgt tatttatgag atgggttttt atgattagag tcccgcaatt atacatttaa 4140tacgcgatag aaaacaaaat atagcgcgca aactaggata aattatcgcg cgcggtgtca 4200tctatgttac tagatcgatg tcgaatctga tcaacctgca ttaatgaatc ggccaacgcg 4260cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact gactcgctgc 4320gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta atacggttat 4380ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag caaaaggcca 4440ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc cctgacgagc 4500atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta taaagatacc 4560aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg 4620gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcaatgc tcacgctgta 4680ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg 4740ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac ccggtaagac 4800acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg aggtatgtag 4860gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga aggacagtat 4920ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt agctcttgat 4980ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 5040gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt 5100ggaacgaaaa ctcacgttaa gggattttgg tcatgacatt aacctataaa aataggcgta 5160tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc 5220agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga caagcccgtc 5280agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg gcatcagagc 5340agattgtact gagagtgcac catatggaca tattgtcgtt agaacgcggc tacaattaat 5400acataacctt atgtatcata cacatacgat ttaggtgaca ctatagaacg gcgcgccaag 5460cttggatcct cgaagagaag ggttaataac acactttttt aacattttta acacaaattt 5520tagttattta aaaatttatt aaaaaattta aaataagaag aggaactctt taaataaatc 5580taacttacaa aatttatgat ttttaataag ttttcaccaa taaaaaatgt cataaaaata 5640tgttaaaaag tatattatca atattctctt tatgataaat aaaaagaaaa aaaaaataaa 5700agttaagtga aaatgagatt gaagtgactt taggtgtgta taaatatatc aaccccgcca 
5760acaatttatt taatccaaat atattgaagt atattattcc atagccttta tttatttata 5820tatttattat ataaaagctt tatttgttct aggttgttca tgaaatattt ttttggtttt 5880atctccgttg taagaaaatc atgtgctttg tgtcgccact cactattgca gctttttcat 5940gcattggtca gattgacggt tgattgtatt tttgtttttt atggttttgt gttatgactt 6000aagtcttcat ctctttatct cttcatcagg tttgatggtt acctaatatg gtccatgggt 6060acatgcatgg ttaaattagg tggccaactt tgttgtgaac gatagaattt tttttatatt 6120aagtaaacta tttttatatt atgaaataat aataaaaaaa atattttatc attattaaca 6180aaatcatatt agttaatttg ttaactctat aataaaagaa atactgtaac attcacatta 6240catggtaaca tctttccacc ctttcatttg ttttttgttt gatgactttt tttcttgttt 6300aaatttattt cccttctttt aaatttggaa tacattatca tcatatataa actaaaatac 6360taaaaacagg attacacaaa tgataaataa taacacaaat atttataaat ctagctgcaa 6420tatatttaaa ctagctatat cgatattgta aaataaaact agctgcattg atactgataa 6480aaaaatatca tgtgctttct ggactgatga tgcagtatac ttttgacatt gcctttattt 6540tatttttcag aaaagctttc ttagttctgg gttcttcatt atttgtttcc catctccatt 6600gtgaattgaa tcatttgctt cgtgtcacaa atacaattta gntaggtaca tgcattggtc 6660agattcacgg tttattatgt catgacttaa gttcatggta gtacattacc tgccacgcat 6720gcattatatt ggttagattt gataggcaaa tttggttgtc aacaatataa atataaataa 6780tgtttttata ttacgaaata acagtgatca aaacaaacag ttttatcttt attaacaaga 6840ttttgttttt gtttgatgac gttttttaat gtttacgctt tcccccttct tttgaattta 6900gaacacttta tcatcataaa atcaaatact aaaaaaatta catatttcat
aaataataac 6960acaaatattt ttaaaaaatc tgaaataata atgaacaata ttacatatta tcacgaaaat 7020tcattaataa aaatattata taaataaaat gtaatagtag ttatatgtag gaaaaaagta 7080ctgcacgcat aatatataca aaaagattaa aatgaactat tataaataat aacactaaat 7140taatggtgaa tcatatcaaa ataatgaaaa agtaaataaa atttgtaatt aacttctata 7200tgtattacac acacaaataa taaataatag taaaaaaaat tatgataaat atttaccatc 7260tcataagata tttaaaataa tgataaaaat atagattatt ttttatgcaa ctagctagcc 7320aaaaagagaa cacgggtata tataaaaaga gtacctttaa attctactgt acttccttta 7380ttcctgacgt ttttatatca agtggacata cgtgaagatt ttaattatca gtctaaatat 7440ttcattagca cttaatactt ttctgtttta ttcctatcct ataagtagtc ccgattctcc 7500caacattgct tattcacaca actaactaag aaagtcttcc atagcccccc aagcggccgc 7560gcgagaaact ttgtatgggc atggttattt ctcacttctc accctccttt actttcttat 7620gctaaatcct ccttccccta tatctccacc ctcaacccct ttttctcatt ataacttttg 7680gtgcctagat ggtgtgtgtg tgtgcgcgcg agagatctga gctcaatttt cctctctcaa 7740gtcctggtca tgctttaacc caacaatcat cgacccctta tgcatcttat atctctccac 7800ctccaggatt ttaagcccta ggggacgatg attgttgggt tagaatatgg caattcaggc 7860ttttaattgc tttcatttgg taccatcact tgcaagattt cagagtacaa ggtgaacaca 7920cacatcttcc tcttcatcaa ttctctagtt tcatccttat cttttcattc acggtaactc 7980tcactaccct ctttcatctt ataagttata ccgggggtgt gatgttgatg agtgtaaatt 8040aaatatatgt gatctctttc tctggaaaaa ttttcagtgt gatatacata ataatctctt 8100aatctagaga ttttatggct ttgttatata taagcggcca attctgcaga tatccatcac 8160actggaattc ttctagctag ctagggtttg ggtagtgagt gtaataaagt tgcaaagttt 8220ttggttaggt tacgttttga ccttattatt atagttcaaa gggaaacatt aattaaaggg 8280gattatgaag accctaccct acattctcca ttgaggatct tactgggtga attgagctgc 8340ttagctatgg atcccacagt tctacccatc aataagtgct tttgtggtag tcttgtggct 8400tccatatctg gggagcttca tttgccttta tagtattaac cttcttggag aaaatagggt 8460agggtcaccc ttctcttctt ttctctcata ataatttaaa tttgttatag actctaaact 8520ttaaatgttt tttttgaagt ttttccgttt ttctcttttg ccatgatccc gttcttgctg 8580tggagtaacc ttgtccgagg tatgtgcatg attagatcca tacttaattt gtgtgcatca 8640cgaaggtgag gttgaaatga 
actttgcttt tttgaccttt taggaaagtt cttttgttgc 8700agtaatcaat tttaattagt tttaattgac actattactt ttattgtcat ctttgttagt 8760tttattgttg aattgagtgc atatttccta ggaaattctc ttacctaaca ttttttatac 8820agatctatgc tcttggctct tgcccttact cttggccttg tgttggttat ttgtctacat 8880atttattgac tggtcgatga gacatgtcac aattcttggg cttatttgtt ggtctaataa 8940aaggagtgct tattgaaaga tcaagacgga gattcggttt tatataaata aactaaagat 9000gacatattag tgtgttgatg tctcttcagg ataatttttg tttgaaataa tatggtaatg 9060tcttgtctaa atttgtgtac ataattctta ctgatttttt ggattgttgg atttttataa 9120acaaatctgc 9130101607DNACyamopsis tetragonoloba 101gcacgaggtt gcggctcaca gtcgttgtgc ttcccaatcc ccgatcccca aaagagagag 60agagaatgag gtggtggcgg cgccggcgtt gaccatgaga ccagtagcaa ccgatttcac 120ccaaaagctc ctcccttcca atctcattct ggccaccaac aatcgccttc aacgtacctc 180tcccttcttt ctccatccat atcgcatggc cgacggcgca gcgacatcca atacacccgc 240gccgcaccag atccaaccca aactggaccc aaacgccgag aagaaggaga atctaccgaa 300ggagattcct ccgccgccgg agaagcccga gcccggcgat tgttgcggca gcggatgcgt 360ccgatgcgtt tgggatattt actatgagga gcttgaacaa tacaataagc tctacaaaca 420cgacgattcc aaccccaaac cttaattagg atcattcttt tcccaatgta attcacaatt 480caagggttaa aatgacatca tgattttgtc aatatctcca aagtttatcg ttaatggcaa 540gctcagggtt caccttgcca aatttgacat tcaaggatgt gtagatctat actaagaaga 600gcttgaa 607102116PRTCyamopsis tetragonoloba 102Met Arg Pro Val Ala Thr Asp Phe Thr Gln Lys Leu Leu Pro Ser Asn1 5 10 15Leu Ile Leu Ala Thr Asn Asn Arg Leu Gln Arg Thr Ser Pro Phe Phe 20 25 30Leu His Pro Tyr Arg Met Ala Asp Gly Ala Ala Thr Ser Asn Thr Pro 35 40 45Ala Pro His Gln Ile Gln Pro Lys Leu Asp Pro Asn Ala Glu Lys Lys 50 55 60Glu Asn Leu Pro Lys Glu Ile Pro Pro Pro Pro Glu Lys Pro Glu Pro65 70 75 80Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp Ile Tyr 85 90 95Tyr Glu Glu Leu Glu Gln Tyr Asn Lys Leu Tyr Lys His Asp Asp Ser 100 105 110Asn Pro Lys Pro 115103608DNABahia 103cgatggtgtc accaatcaca gcagcctccc cctccccttc ccctttcggc ggcctgtatg 60ctgggcgccg tcctccgcgt ctcggccccg atcccgtctc tcctccccgc 
gccgacgcgc 120cctctcctac tccgccgccg cagccacagc ctcccgcccg agacgcccat ggccgcggcc 180gccccwcgcg acgccggcgc cacgaagccc gacgccgcgc cggcgccggc gccagtgccg 240cagccacccg agaagccgct ccctggcgac tgctgcggga gcggctgcgt ccgctgcgtc 300tgggacatct attacgacga actcgacgcg tacgaaaagg ccctcgccgc ccacgcggcc 360tccgccggcg gcaaggcctc cccctatccc gctgacakca agcccagcga cggcgccaag 420tcctgaagca cgtggggcgt catgcgtatc ccttcttctg ttcccaactg aaatagattt 480tcagatatgc tgctagcaat tgttgacact gagacattac atatgtgtat gctagattga 540gatgctttgt caattcaacc tcatcgttgt gcaagtgtgt aacaagagaa agttaatatg 600attattaa 608104122PRTBahiamisc_feature(114)..(114)Xaa can be any naturally occurring amino acid 104Met Leu Gly Ala Val Leu Arg Val Ser Ala Pro Ile Pro Ser Leu Leu1 5 10 15Pro Ala Pro Thr Arg Pro Leu Leu Leu Arg Arg Arg Ser His Ser Leu 20 25 30Pro Pro Glu Thr Pro Met Ala Ala Ala Ala Pro Arg Asp Ala Gly Ala 35 40 45Thr Lys Pro Asp Ala Ala Pro Ala Pro Ala Pro Val Pro Gln Pro Pro 50 55 60Glu Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys65 70 75 80Val Trp Asp Ile Tyr Tyr Asp Glu Leu Asp Ala Tyr Glu Lys Ala Leu 85 90 95Ala Ala His Ala Ala Ser Ala Gly Gly Lys Ala Ser Pro Tyr Pro Ala 100 105 110Asp Xaa Lys Pro Ser Asp Gly Ala Lys Ser 115 120105133PRTArabidopsis lyrata 105Met Val Val Val Ser Leu His Arg Ile Ser Ile Thr Thr Ser Pro Gly1 5 10 15Ser Ser Leu His Asp Val Leu Leu Ser Met Arg Phe Gly Leu Thr Arg 20 25 30Arg His Leu Pro Leu Lys Arg Pro Phe Thr Asn Tyr Ser Ile Thr Ser 35 40 45Val Ser Pro Glu Gln Gln Leu Ile Ser Pro Val Thr Met Ala Thr Thr 50 55 60Glu Ser Gln Asn Leu Val Gln Ala Ser Lys Glu Glu Thr Asn Lys Lys65 70 75 80Glu Val Glu Asp Thr Lys Glu Ile Leu Ala Pro Pro Pro Pro Glu Lys 85 90 95Pro Glu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp 100 105 110Asp Val Tyr Tyr Glu Glu Leu Glu Asp Tyr Asn Lys Lys Leu Ser Gly 115 120 125Glu Thr Lys Ser Val 130106113PRTPicea sitchensis 106Met Arg Ser Pro Phe Cys Ile Pro Ser Val Val Ser Ala Arg Thr Arg1 5 10 15Val Cys Phe Arg Phe Thr Cys Phe Thr Met 
Ala Thr Val Ser Gly Gly 20 25 30Gly Val Glu Gly Lys Glu Asn Leu Glu Lys Ser Ile Glu Ala Lys Ala 35 40 45Lys Asp Glu Lys Lys Lys Ala Glu Glu Glu Ile Glu Lys Ile Leu Met 50 55 60Glu Lys Ile Gly Pro Pro Pro Glu Lys Pro Leu Pro Gly Asp Cys Cys65 70 75 80Gly Ser Gly Cys Glu Ile Cys Val Trp Asp Thr Tyr Phe Asp Gln Leu 85 90 95Gln Glu Tyr Lys Lys Glu Lys Asp Ser Ile Leu Lys Ser Ile Ser Pro 100 105 110Pro 107821DNAHordeum vulgare 107ctccgcggcc cgggctctcc gatcccgcct ctcttccccg cgccggggcg ccctctcatc 60cacctatccc gccgcctccc tacggcgccc gccatggccg acgccaagaa gaccgacgcg 120ccggcgaccc cggccccgga gccgcccgag aagccgctcc ccggcgactg ctgcggcagc 180ggctgcgtcc gctgcgtctg ggacatctac tacgacgagc tccaggacta caaggaggcc 240ctcgccgccc acgcggccgc ggccgatccc agcggcgaca aggcatgcgt cgacgagaag 300aagaccgaat gatgagaccc gggaggaggc aggacccggg tgtgtatgct ggaactagta 360ctgggaccaa ataggatgcg cggctcgagt gggatatggg agcatgactc atggaatggc 420ggagcggcgt agctggcgtt gtggcgagaa aaaaaaatac taccaacagg gggggcccga 480gaccgagtga gtcctctaat tataatggaa gcaaaagcgt gaacgggtgt gtgcgcgggc 540gtggtcttga agagctctgg tgaagctgtg ccgaggagca gatgtgtccg tgcgtccata 600cgggtacaga gacgactagg aggtgttgta cgcggcttag tgagcgtggt taggcgggat 660gaaggagaag gggaggggga aggcgtgaga tgatagaaga tgatgggttg acgagatatg 720acgacggtgg agacgtagga ggcatgtgat aacagtaggc tgggctgagg tgggatgcgg 780aaggaggaga gatatatgag gggagggtgc ggttatagac g 821108103PRTHordeum vulgare 108Leu Arg Gly Pro Gly Ser Pro Ile Pro Pro Leu Phe Pro Ala Pro Gly1 5 10 15Arg Pro Leu Ile His Leu Ser Arg Arg Leu Pro Thr Ala Pro Ala Met 20 25 30Ala Asp Ala Lys Lys Thr Asp Ala Pro Ala Thr Pro Ala Pro Glu Pro 35 40 45Pro Glu Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg 50 55 60Cys Val Trp Asp Ile Tyr Tyr Asp Glu Leu Gln Asp Tyr Lys Glu Ala65 70 75 80Leu Ala Ala His Ala Ala Ala Ala Asp Pro Ser Gly Asp Lys Ala Cys 85 90 95Val Asp Glu Lys Lys Thr Glu 100109547DNARaphanus sativus 109gggcatggtt tcgttgcatc atatccatcc tcgattctcg accgccgcat cgtcggaata 60caatcgtcgc cggaaaagct tccacgatgt 
gcttctgagc atgagatttg gatttacgcg 120agatctctct ctgaaacggt ccttggtcaa ctactattcc ttatctcgac aacaacgaca 180cctcaagtcg cccatcacca tggccaccaa gagcgagaag acttccacgg aggagaagga 240taagaaggag gaggtttcac tccctccgcc tccgccgccg gagaaaccag agcctggcga 300ctgctgcggt agcggatgcg tgcgatgcgt ttgggatgtg tattacgaag agctccaaga 360atacaacaag ctttctacat cccttcctgg acaaactaaa tccaattgaa tgctaaattt 420ttgtgtgcaa atgtactcgt cttcgagttt gagaagtcga agatgatgtt atgtttgaac 480attattggat cattatcgtt actacttatc tacaaagttt actaaaagaa aaaaaaaaaa 540aaaaaaa 547110134PRTRaphanus sativus 110Met Val Ser Leu His His Ile His Pro Arg Phe Ser Thr Ala Ala Ser1 5 10 15Ser Glu Tyr Asn Arg Arg Arg Lys Ser Phe His Asp Val Leu Leu Ser 20 25 30Met Arg Phe Gly Phe Thr Arg Asp Leu Ser Leu Lys Arg Ser Leu Val 35 40 45Asn Tyr Tyr Ser Leu Ser Arg Gln Gln Arg His Leu Lys Ser Pro Ile 50 55 60Thr Met Ala Thr Lys Ser Glu Lys Thr Ser Thr Glu Glu Lys Asp Lys65 70 75 80Lys Glu Glu Val Ser Leu Pro Pro Pro Pro Pro Pro Glu Lys Pro Glu 85 90 95Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp Val 100 105 110Tyr Tyr Glu Glu Leu Gln Glu Tyr Asn Lys Leu Ser Thr Ser Leu Pro 115 120 125Gly Gln Thr Lys Ser Asn 130111800DNADennstaedtia punctilobula 111ggcccctaca atatcccaaa atttcatccg accaaagaaa ttttgggctg ctgtaacgct 60ggtgaaggta atgaaggtag ctttcttgaa ctattcattg attccttcct tcttctcgcc 120ctcacctgta ctacaacgag ggttagggtt tcgcgagact acaagggcgg caatgtccgg 180taacagggag cctgatcccg atcttgtgct agaaagtact cctcccaagc agaagcagca 240gaatcacaag aaagaagtag atggagaaga gaagaaagaa gaagatgatg cagagatttt 300gaggaagcag cttggcgagc cccctgagaa gcctttgcct ggagactgtt gcggcagtgg 360atgtgtccga tgtgtctggg acatttattt tgacgagctc gagctttata actcccgcaa 420ggatgtcctt gatgcccgcc gtgcttcgtg atagtaccaa ctcgggatgc ctactattca 480tagctgaaga tttgcaagga ggcccacact catctctgca gcagctcaac tcatcaattt 540tctgtgtgac ttgtttcaag gttcccctgt gaccttgcac aatatttttc attgatctgt 600attctttacc atcataaaca ttggaattgg gggttcctga aaggactaaa tcccctgttt 660ttttcaaggt aaccctgcca tttatgggtt 
aatctgtatt gtttccttcc atgtacattt 720gcctagattc taccatatac atcagaaggc cagaaataaa tccagggctt caattggctg 780tccagatgct tcgttttggg 800112381DNADennstaedtia punctilobula 112atgaaggtag ctttcttgaa ctattcattg attccttcct tcttctcgcc ctcacctgta 60ctacaacgag ggttagggtt tcgcgagact acaagggcgg caatgtccgg taacagggag 120cctgatcccg atcttgtgct agaaagtact cctcccaagc agaagcagca gaatcacaag 180aaagaagtag atggagaaga gaagaaagaa gaagatgatg cagagatttt gaggaagcag 240cttggcgagc cccctgagaa gcctttgcct ggagactgtt gcggcagtgg atgtgtccga 300tgtgtctggg acatttattt tgacgagctc gagctttata actcccgcaa ggatgtcctt 360gatgcccgcc gtgcttcgtg a 381113126PRTDennstaedtia punctilobula 113Met Lys Val Ala Phe Leu Asn Tyr Ser Leu Ile Pro Ser Phe Phe Ser1 5 10 15Pro Ser Pro Val Leu Gln Arg Gly Leu Gly Phe Arg Glu Thr Thr Arg 20 25 30Ala Ala Met Ser Gly Asn Arg Glu Pro Asp Pro Asp Leu Val Leu Glu 35 40 45Ser Thr Pro Pro Lys Gln Lys Gln Gln Asn His Lys Lys Glu Val Asp 50 55 60Gly Glu Glu Lys Lys Glu Glu Asp Asp Ala Glu Ile Leu Arg Lys Gln65 70 75 80Leu Gly Glu Pro Pro Glu Lys Pro Leu Pro Gly Asp Cys Cys Gly Ser 85 90 95Gly Cys Val Arg Cys Val Trp Asp Ile Tyr Phe Asp Glu Leu Glu Leu 100 105 110Tyr Asn Ser Arg Lys Asp Val Leu Asp Ala Arg Arg Ala Ser 115 120 125114777DNAOsmunda cinnamomea 114acaatgagat agcatgaggt tgggtatcct gccctgcccc ttcatcaggc ctctgcttcc 60ctcgccatcc atcgcccctc cctcctccag cctcctaacc ttccgcgctt cgccacgagc 120catggacaaa cagcaggttc tccatcccaa gcccgcggat ctccccaaga atgactccaa 180acagaacgac ctaacgctgc ctgcggatca ggaggaatcg cagctcggtc ctccaccgga 240aaagccgctc ccaggtgatt gctgtggcag cggttgcgtg cggtgtgtct gggataccta 300tttcgaggag ctggatagtt acaacgagcg caaagaggcg tttgaatccc gcctgaagaa 360gtcgcctcct ctgtaatttt ctacattggc ggtagggaaa gggagtaaaa aatttacgag 420gaagaatgtg caatgttttt gtgaggatga agtatcaggt ggtggggata gttcagaagg 480ctaagaactc caaagatctt tcaagttgat ggtttgaaac ttattgaatg gactctcatg 540aagtcaagac tgcactctct ttattgttac agactttcca ttgatatatt ttttcgccat 600attagcggac atgcagatgt cacttgagat cttcgtccaa gttgtggcca 
gctgattctt 660tctatctgca gtggtgcatt tgcccaacca gctaccttct ctaagcattt tgatcagagc 720ttctaaaaga gcaggctgaa gtgatgatat atggtttctt tacatcaatc atggctg 777115363DNAOsmunda cinnamomea 115atgaggttgg gtatcctgcc ctgccccttc atcaggcctc tgcttccctc gccatccatc 60gcccctccct cctccagcct cctaaccttc cgcgcttcgc cacgagccat ggacaaacag 120caggttctcc atcccaagcc cgcggatctc cccaagaatg actccaaaca gaacgaccta 180acgctgcctg cggatcagga ggaatcgcag ctcggtcctc caccggaaaa gccgctccca 240ggtgattgct gtggcagcgg ttgcgtgcgg tgtgtctggg atacctattt cgaggagctg 300gatagttaca acgagcgcaa agaggcgttt gaatcccgcc tgaagaagtc gcctcctctg 360taa 363116120PRTOsmunda cinnamomea 116Met Arg Leu Gly Ile Leu Pro Cys Pro Phe Ile Arg Pro Leu Leu Pro1 5 10 15Ser Pro Ser Ile Ala Pro Pro Ser Ser Ser Leu Leu Thr Phe Arg Ala 20 25 30Ser Pro Arg Ala Met Asp Lys Gln Gln Val Leu His Pro Lys Pro Ala 35 40 45Asp Leu Pro Lys Asn Asp Ser Lys Gln Asn Asp Leu Thr Leu Pro Ala 50 55 60Asp Gln Glu Glu Ser Gln Leu Gly Pro Pro Pro Glu Lys Pro Leu Pro65 70 75 80Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp Thr Tyr 85 90 95Phe Glu Glu Leu Asp Ser Tyr Asn Glu Arg Lys Glu Ala Phe Glu Ser 100 105 110Arg Leu Lys Lys Ser Pro Pro Leu 115 12011725PRTArtificial sequencemisc_feature 117Glu Lys Pro Xaa Xaa Gly Asp Cys Cys Gly Ser Gly Cys Xaa Xaa1 5 10 15Cys Val Trp Asp Xaa Tyr Xaa Xaa Xaa Leu 20 25118117PRTGlycine max 118Met Arg Thr Thr Ala Pro Ser Asp Phe Ile Phe Thr Gln Lys Leu His1 5 10 15Pro Phe Asn Ile Thr Ser Thr Lys Thr Ser Leu Gln Arg Thr Leu Pro 20 25 30Tyr Phe Leu Gln Leu Asn Arg Met Ala Glu Ala Ala Arg Thr Ala His 35 40 45Lys Pro Ala Pro His Pro Ile Gln Pro Lys Pro Asp Asp Lys Thr Pro 50 55 60Asn Pro Ala Lys Glu Ile Pro Pro Pro Pro Glu Lys Pro Glu Pro Gly65 70 75 80Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp Val Tyr Tyr 85 90 95Asp Glu Leu Glu Glu Tyr Asn Lys Arg Tyr Lys Gln Val Asp Pro Ser 100 105 110Pro Lys Pro Ser Ser 115119132PRTSorghum bicolor 119Met Leu Gly Ala Val Val Arg Val Pro Ala Pro Ile Leu Leu Pro Leu1 5 10 15Leu Pro 
Gly Pro Thr Arg Pro Leu Leu Leu Arg Arg Arg Arg His Cys 20 25 30Leu Pro Pro Glu Ala Pro Met Ala Ser Ala Thr Pro Ser Asp Gly Gly 35 40 45Ala Ala Lys Pro Asp Ala Ala
Pro Ala Pro Val Pro Val Pro Ala Pro 50 55 60Ala Pro Thr Pro Leu Pro Leu Pro Pro Glu Lys Pro Leu Pro Gly Asp65 70 75 80Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp Asp Ile Tyr Phe Asp 85 90 95Glu Leu Asp Ala Tyr Asp Lys Ala Leu Ala Ala His Ala Ala Ala Ser 100 105 110Ser Gly Ser Gly Ala Lys Asp Asp Ser Ala Asp Thr Lys Pro Ser Asp 115 120 125Gly Ala Lys Ser 130120135PRTArabidopsis thaliana 120Met Val Val Val Ser Leu Leu Pro Arg Ile Ser Ile Val Thr Ser Pro1 5 10 15Gly Ser Ser Leu His Asp Val Leu Leu Ser Met Arg Phe Gly Leu Thr 20 25 30Arg His Leu Pro Leu Lys Arg Ser Phe Ser Asn Tyr Ser Ile Thr Ser 35 40 45Val Ser Pro Glu Gln Gln Leu Lys Ser Pro Val Thr Met Ala Thr Thr 50 55 60Glu Ser Lys Asn Leu Val Glu Ala Ser Lys Glu Glu Thr Asn Lys Lys65 70 75 80Glu Thr Glu Asp Lys Lys Glu Val Gly Val Ser Val Pro Pro Pro Pro 85 90 95Glu Lys Pro Glu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys 100 105 110Val Trp Asp Val Tyr Tyr Asp Glu Leu Glu Asp Tyr Asn Lys Gln Leu 115 120 125Ser Gly Glu Thr Lys Ser Ile 130 135121115PRTOryza sativa 121Met Leu Val Ala Ala Leu Arg Val Pro Ala Pro Ile Pro Ser Ser Leu1 5 10 15Pro Ser Pro Ala Arg Pro Leu Leu Arg Arg Arg Ser Ser His Arg Leu 20 25 30Pro Pro Pro Pro Pro Pro Ala Ala Ser Met Ala Asp Ala Gly Gly Ala 35 40 45Thr Thr Asn Lys Pro Ala Pro Ala Pro Ala Pro Glu Pro Pro Glu Lys 50 55 60Pro Leu Pro Gly Asp Cys Cys Gly Ser Gly Cys Val Arg Cys Val Trp65 70 75 80Asp Val Tyr Tyr Asp Glu Leu Asp Ala Tyr Asn Lys Ala Leu Ala Ala 85 90 95His Ser Ser Ser Ala Ser Ser Gly Ser Lys Pro Ala Thr Ser Asp Gly 100 105 110Ala Lys Ser 115

* * * * *

