Soybean promoters SC194 and flower-preferred expression thereof in transgenic plants Li; Zhongsen

Patent Application Summary

U.S. patent application number 12/152375 was filed with the patent office on 2008-05-14 and published on 2008-11-27 as publication number 20080295202 for Soybean promoters SC194 and flower-preferred expression thereof in transgenic plants. Invention is credited to Zhongsen Li.

Application Number	12/152375
Publication Number	20080295202
Family ID	40073667
Filed Date	2008-05-14
Publication Date	2008-11-27

United States Patent Application	20080295202
Kind Code	A1
Li; Zhongsen	November 27, 2008

Soybean promoters SC194 and flower-preferred expression thereof in transgenic plants

Abstract

The promoters of a soybean SC194 polypeptide and fragments thereof and their use in promoting the expression of one or more heterologous nucleic acid fragments in plants are described.

Inventors:	Li; Zhongsen; (Hockessin, DE)
Correspondence Address:	POTTER ANDERSON & CORROON LLP;ATTN: KATHLEEN W. GEIGER, ESQ. P.O. BOX 951 WILMINGTON DE 19899-0951 US
Family ID:	40073667
Appl. No.:	12/152375
Filed:	May 14, 2008

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
60930877	May 17, 2007

Current U.S. Class:	800/278 ; 435/320.1; 435/419; 435/468; 530/350; 536/23.2; 536/23.6; 800/287; 800/298
Current CPC Class:	C12N 15/823 20130101; C12N 15/8222 20130101
Class at Publication:	800/278 ; 536/23.6; 536/23.2; 435/320.1; 435/419; 800/298; 800/287; 435/468; 530/350
International Class:	A01H 1/00 20060101 A01H001/00; C07H 21/04 20060101 C07H021/04; C12N 15/63 20060101 C12N015/63; C12N 15/87 20060101 C12N015/87; C07K 14/415 20060101 C07K014/415; C12N 5/10 20060101 C12N005/10; A01H 5/00 20060101 A01H005/00

Claims

1. An isolated polynucleotide comprising: a) a nucleotide sequence comprising the sequence set forth in SEQ ID NO:1 or a full-length complement thereof; or b) a nucleotide sequence comprising a sequence having at least 90% sequence identity, based on the BLASTN method of alignment, when compared to the sequence set forth in SEQ ID NO:1; wherein said nucleotide sequence is a promoter.

2. The isolated polynucleotide of claim 1, wherein the nucleotide sequence of b) has at least 95% identity, based on the BLASTN method of alignment, when compared to the sequence set forth in SEQ ID NO:1.

3. A recombinant DNA construct comprising the isolated polynucleotide of claim 1 operably linked to at least one heterologous sequence.

4. The recombinant DNA construct of claim 3, wherein the heterologous nucleotide sequence encodes a gene involved in anthocyanin biosynthesis, a gene involved in the synthesis of fragrant fatty acid derivatives, a gene that is determinative of flower morphology, or a gene involved in biosynthesis of plant cytokinin.

5. The recombinant DNA construct of claim 4, wherein the gene involved in anthocyanin biosynthesis is dihydroflavonol 4-reductase, flavonoid 3,5-hydroxylase, chalcone synthase, chalcone isomerase, flavonoid 3-hydroxylase, anthocyanin synthase, or UDP-glucose 3-O-flavonoid glucosyl transferase.

6. The recombinant DNA construct of claim 4, wherein the gene involved in the synthesis of fragrant fatty acid derivatives is S-linalool synthase, acetyl CoA:benzylalcohol acetyltransferase, benzyl CoA:benzylalcohol benzoyl transferase, S-adenosyl-L-methionine:benzoic acid carboxylmethyl transferase, myrcene synthase, (E)-β-ocimene synthase, orcinol O-methyltransferase, or limonene synthase.

7. The recombinant DNA construct of claim 4, wherein the gene that is determinative of flower morphology is AGAMOUS, APETALA, or PISTILLATA.

8. The recombinant DNA construct of claim 4, wherein the gene involved in biosynthesis of plant cytokinin is isopentenyl transferase.

9. A vector comprising the recombinant DNA construct of claim 3.

10. A cell comprising the recombinant DNA construct of claim 3.

11. The cell of claim 10, wherein the cell is a plant cell.

12. A transgenic plant having stably incorporated into its genome the recombinant DNA construct of claim 3.

13. The transgenic plant of claim 12, wherein the plant is a flowering plant.

14. The transgenic plant of claim 13, wherein the flowering plant is rose, carnation, Gerbera, Chrysanthemum, tulip, Gladioli, Alstroemeria, Anthurium, lisianthus, larkspur, irises, orchid, snapdragon, African violet, or azalea.

15. A transgenic seed produced by the transgenic plant of claim 12.

16. A method of expressing a coding sequence or a functional RNA in a flowering plant comprising: a) introducing the recombinant DNA construct of claim 3 into the plant, wherein the at least one heterologous sequence comprises a coding sequence or a functional RNA; b) growing the plant of step a); and c) selecting a plant displaying expression of the coding sequence or the functional RNA of the recombinant DNA construct.

17. A method of transgenically altering a marketable flower trait of a flowering plant, comprising: a) introducing a recombinant DNA construct of claim 3 into the flowering plant; b) growing a fertile, mature flowering plant resulting from step a); and c) selecting a flowering plant expressing the at least one heterologous nucleotide sequence in flower tissue based on the altered marketable flower trait.

18. The method of claim 17 wherein the marketable flower trait is color, morphology, or fragrance.

19. An isolated polynucleotide comprising: (a) a nucleotide sequence comprising a fragment of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or a full-length complement thereof; or (b) a nucleotide sequence comprising a sequence having at least 90% sequence identity, based on the BLASTN method of alignment, when compared to the nucleotide sequence of (a); wherein said nucleotide sequence is a promoter.

20. The isolated polynucleotide of claim 19, wherein the nucleotide sequence of (a) comprises SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7.

21. An isolated polynucleotide comprising: (a) a nucleotide sequence encoding a polypeptide having at least 90% sequence identity, based on the Clustal method of alignment, when compared to the sequence set forth in SEQ ID NO:20, or (b) a full-length complement of the nucleotide sequence of (a).

22. The isolated polynucleotide of claim 21, wherein the polypeptide has at least 95% sequence identity, based on the Clustal method of alignment, when compared to the sequence set forth in SEQ ID NO:20.

23. The isolated polynucleotide of claim 22 encoding the sequence set forth in SEQ ID NO:20.

24. The isolated polynucleotide of claim 23, wherein the nucleotide sequence comprises the sequence set forth in SEQ ID NO:19.

25. A vector comprising the isolated polynucleotide of claim 21.

26. A recombinant DNA construct comprising the isolated polynucleotide of claim 21 operably linked to a regulatory sequence.

27. A cell comprising the recombinant DNA construct of claim 26.

28. A plant comprising the recombinant DNA construct of claim 26.

29. A seed comprising the recombinant DNA construct of claim 26.

30. A method for transforming a cell, comprising transforming a cell with the isolated polynucleotide of claim 21.

31. A method for producing a plant comprising transforming a plant cell with the isolated polynucleotide of claim 21 and regenerating a plant from the transformed plant cell.

32. An isolated polypeptide having at least 90% sequence identity, based on the Clustal method of alignment, when compared to the sequence set forth in SEQ ID NO:20.

33. The isolated polypeptide of claim 32, wherein the isolated polypeptide has at least 95% sequence identity, based on the Clustal method of alignment, when compared to the sequence set forth in SEQ ID NO:20.

34. The isolated polypeptide of claim 33, wherein the isolated polypeptide comprises the amino acid sequence set forth in SEQ ID NO:20.

Description

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims the benefit of U.S. Provisional Application Ser. No. 60/930,877, filed May 17, 2007, which is incorporated by reference herein in its entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to the field of plant molecular biology, more particularly to regulation of gene expression in plants.

BACKGROUND OF THE INVENTION

[0003] Recent advances in plant genetic engineering have opened new doors to engineer plants to have improved characteristics or traits, such as plant disease resistance, insect resistance, herbicidal resistance, yield improvement, improvement of the nutritional quality of the edible portions of the plant, and enhanced stability or shelf-life of the ultimate consumer product obtained from the plants. Thus, a desired gene (or genes) with the molecular function to impart different or improved characteristics or qualities can be incorporated properly into the plant's genome. The newly integrated gene (or genes) coding sequence can then be expressed in the plant cell to exhibit the desired new trait or characteristic. It is important that appropriate regulatory signals be present in proper configurations in order to obtain the expression of the newly inserted gene coding sequence in the plant cell. These regulatory signals typically include a promoter region, a 5' non-translated leader sequence and a 3' transcription termination/polyadenylation sequence.

[0004] A promoter is a non-coding genomic DNA sequence, usually upstream (5') to the relevant coding sequence, to which RNA polymerase binds before initiating transcription. This binding aligns the RNA polymerase so that transcription will initiate at a specific transcription initiation site. The nucleotide sequence of the promoter determines the nature of the RNA polymerase binding and other related protein factors that attach to the RNA polymerase and/or promoter, and the rate of RNA synthesis.

[0005] It has been shown that certain promoters are able to direct RNA synthesis at a higher rate than others. These are called "strong promoters". Certain other promoters have been shown to direct RNA synthesis at higher levels only in particular types of cells or tissues and are often referred to as "tissue specific promoters", or "tissue-preferred promoters", if the promoters direct RNA synthesis preferentially in certain tissues (RNA synthesis may occur in other tissues at reduced levels). Since patterns of expression of a chimeric gene (or genes) introduced into a plant are controlled using promoters, there is an ongoing interest in the isolation of novel promoters that are capable of controlling the expression of a chimeric gene (or genes) at certain levels in specific tissue types or at specific plant developmental stages. Among the most commonly used promoters are the nopaline synthase (NOS) promoter (Ebert et al., Proc. Natl. Acad. Sci. USA 84:5745-5749 (1987)); the octopine synthase (OCS) promoter; caulimovirus promoters such as the cauliflower mosaic virus (CaMV) 19S promoter (Lawton et al., Plant Mol. Biol. 9:315-324 (1987)), the CaMV 35S promoter (Odell et al., Nature 313:810-812 (1985)), and the figwort mosaic virus 35S promoter (Sanger et al., Plant Mol. Biol. 14:433-443 (1990)); the light inducible promoter from the small subunit of rubisco (Pellegrineschi et al., Biochem. Soc. Trans. 23(2):247-250 (1995)); the Adh promoter (Walker et al., Proc. Natl. Acad. Sci. USA 84:6624-6628 (1987)); the sucrose synthase promoter (Yang et al., Proc. Natl. Acad. Sci. USA 87:4144-4148 (1990)); the R gene complex promoter (Chandler et al., Plant Cell 1:1175-1183 (1989)); the chlorophyll a/b binding protein gene promoter; and the like.

[0006] An angiosperm flower is a complex structure generally consisting of a pedicel, sepals, petals, stamens, and a pistil. A stamen comprises a filament and an anther in which the pollen grains (the male gametophytes) reside. A pistil comprises a stigma, style and ovary. An ovary contains one or more ovules in which the female gametophyte (the embryo sac), egg cell, central cell, and other specialized cells reside. Flower promoters in general include promoters that direct gene expression in any of the above tissues or cell types.

[0007] Although advances in technology provide greater success in transforming plants with chimeric genes, there is still a need for preferred expression of such genes in desired plants. It is often desirable to express target genes selectively in a specific tissue because of toxicity or efficacy concerns. For example, flower tissue is a type of tissue where preferred expression is desirable, and there remains a need for promoters that preferentially initiate transcription in flower tissue. Promoters that initiate transcription preferentially in flower tissue control genes involved in flower development and flower abortion.

SUMMARY OF THE INVENTION

[0008] Compositions and methods for regulating gene expression in a plant are provided. One aspect is for an isolated polynucleotide comprising: a) a nucleotide sequence comprising the sequence set forth in SEQ ID NO: 1 or a full-length complement thereof; or b) a nucleotide sequence comprising a sequence having at least 90% sequence identity, based on the BLASTN method of alignment, when compared to the sequence set forth in SEQ ID NO:1; wherein said nucleotide sequence is a promoter. Another aspect is for an isolated polynucleotide comprising (a) a nucleotide sequence comprising a fragment of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or a full-length complement thereof; or (b) a nucleotide sequence comprising a sequence having at least 90% sequence identity, based on the BLASTN method of alignment, when compared to the nucleotide sequence of (a); wherein said nucleotide sequence is a promoter.

[0009] Other embodiments include recombinant DNA constructs comprising a polynucleotide sequence of the present invention operably linked to a heterologous sequence. Additionally, some embodiments provide for transgenic plant cells, transient and stable, transgenic plant seeds, as well as transgenic plants comprising the provided recombinant DNA constructs.

[0010] Some embodiments include methods of expressing a coding sequence or a functional RNA in a flowering plant comprising: introducing a recombinant DNA construct described above into the plant, wherein the heterologous sequence comprises a coding sequence or a functional RNA; growing the plant; and selecting a plant displaying expression of the coding sequence or the functional RNA of the recombinant DNA construct.

[0011] Furthermore, some embodiments of the present invention include methods of transgenically altering a marketable flower trait of a flowering plant, comprising: introducing a recombinant DNA construct described above into the flowering plant; growing a fertile, mature flowering plant resulting from the introducing step; and selecting a flowering plant expressing the heterologous nucleotide sequence in flower tissue based on the altered marketable flower trait.

[0012] Another aspect is for an isolated polynucleotide comprising: (a) a nucleotide sequence encoding a polypeptide, wherein the polypeptide has at least 90% sequence identity, based on the Clustal method of alignment, when compared to the sequence set forth in SEQ ID NO:20, or (b) a full-length complement of the nucleotide sequence of (a).

[0013] A further aspect is for an isolated polypeptide, wherein the isolated polypeptide has at least 90% sequence identity, based on the Clustal method of alignment, when compared to the sequence set forth in SEQ ID NO:20.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCES

[0014] The invention can be more fully understood from the following detailed description, the accompanying drawings and Sequence Listing which form a part of this application. The Sequence Listing contains the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB standards described in Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical Journal 219 (No. 2): 345-373 (1984), which are herein incorporated by reference in their entirety. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. § 1.822.

[0015] SEQ ID NO:1 is a DNA sequence comprising a 1358 nucleotide soybean SC194 promoter (or full-length SC194 promoter).

[0016] SEQ ID NO:2 is a 1328 basepair truncated form of the SC194 promoter shown in SEQ ID NO:1 (bp 30-1357 of SEQ ID NO:1).

[0017] SEQ ID NO:3 is a 1134 basepair truncated form of the SC194 promoter shown in SEQ ID NO:1 (bp 224-1357 of SEQ ID NO:1).

[0018] SEQ ID NO:4 is a 932 basepair truncated form of the SC194 promoter shown in SEQ ID NO:1 (bp 426-1357 of SEQ ID NO:1).

[0019] SEQ ID NO:5 is a 685 basepair truncated form of the SC194 promoter shown in SEQ ID NO:1 (bp 673-1357 of SEQ ID NO:1).

[0020] SEQ ID NO:6 is a 472 basepair truncated form of the SC194 promoter shown in SEQ ID NO:1 (bp 886-1357 of SEQ ID NO:1).

[0021] SEQ ID NO:7 is a 237 basepair truncated form of the SC194 promoter shown in SEQ ID NO:1 (bp 1121-1357 of SEQ ID NO:1).
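The stated lengths of the truncated promoters can be cross-checked against their coordinates within SEQ ID NO:1. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the convention the stated lengths imply); the names `TRUNCATIONS`, `span_length`, `check_truncations`, and `slice_fragment` are illustrative and do not appear in the application.

```python
# Coordinates (1-based, inclusive) of each truncated SC194 promoter within
# SEQ ID NO:1, together with the length stated in the sequence descriptions.
FULL_LENGTH = 1358  # nucleotides in SEQ ID NO:1

TRUNCATIONS = {
    "SEQ ID NO:2": (30, 1357, 1328),
    "SEQ ID NO:3": (224, 1357, 1134),
    "SEQ ID NO:4": (426, 1357, 932),
    "SEQ ID NO:5": (673, 1357, 685),
    "SEQ ID NO:6": (886, 1357, 472),
    "SEQ ID NO:7": (1121, 1357, 237),
}


def span_length(start, end):
    """Length of a 1-based inclusive span."""
    return end - start + 1


def check_truncations(truncations=TRUNCATIONS, full_length=FULL_LENGTH):
    """Map each sequence name to True if its span fits inside the full-length
    promoter and its computed length equals the stated length."""
    results = {}
    for name, (start, end, stated) in truncations.items():
        within = 1 <= start <= end <= full_length
        results[name] = within and span_length(start, end) == stated
    return results


def slice_fragment(sequence, start, end):
    """Return the 1-based inclusive span [start, end] of a sequence string."""
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, ok in check_truncations().items():
        print(name, "consistent" if ok else "inconsistent")
```

All six truncations share the same 3' end (bp 1357) and differ only in how much 5' sequence is removed, which is the usual layout for a 5' deletion series.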

[0022] SEQ ID NO:8 is an oligonucleotide primer used in the PCR amplifications of the truncated SC194 promoter in SEQ ID NO:2 when paired with SEQ ID NO:9, and the truncated SC194 promoters in SEQ ID NOs: 3, 4, 5, 6 or 7 when paired with SEQ ID NOs: 10, 11, 12, 13, or 14, respectively.

[0023] SEQ ID NO:9 is an oligonucleotide primer used in the PCR amplification of the truncated SC194 promoter in SEQ ID NO:2 when paired with SEQ ID NO:8.

[0024] SEQ ID NO:10 is an oligonucleotide primer used in the PCR amplification of the truncated SC194 promoter in SEQ ID NO:3 when paired with SEQ ID NO:8.

[0025] SEQ ID NO:11 is an oligonucleotide primer used in the PCR amplification of the truncated SC194 promoter in SEQ ID NO:4 when paired with SEQ ID NO:8.

[0026] SEQ ID NO:12 is an oligonucleotide primer used in the PCR amplification of the truncated SC194 promoter in SEQ ID NO:5 when paired with SEQ ID NO:8.

[0027] SEQ ID NO:13 is an oligonucleotide primer used in the PCR amplification of the truncated SC194 promoter in SEQ ID NO:6 when paired with SEQ ID NO:8.

[0028] SEQ ID NO:14 is an oligonucleotide primer used in the PCR amplification of the truncated SC194 promoter in SEQ ID NO:7 when paired with SEQ ID NO:8.

[0029] SEQ ID NO:15 is an oligonucleotide primer specific to the soybean PSO375649 gene used in the first nested PCR amplification of the SC194 promoter when paired with SEQ ID NO:16.

[0030] SEQ ID NO:16 is an oligonucleotide primer used in the first nested PCR amplification of the SC194 promoter when paired with SEQ ID NO:15.

[0031] SEQ ID NO:17 is an oligonucleotide primer specific to the soybean PSO375649 gene used in the second nested PCR amplification of the SC194 promoter when paired with SEQ ID NO: 18. An NcoI restriction site CCATGG is added for subsequent cloning.

[0032] SEQ ID NO:18 is an oligonucleotide primer used in the second nested PCR amplification of the SC194 promoter when paired with SEQ ID NO:17.

[0033] SEQ ID NO:19 is the nucleotide sequence of a novel soybean cDNA PSO375649 encoding an unknown polypeptide. Nucleotides 1 to 86 are the 5' untranslated sequence, nucleotides 87 to 89 are the translation initiation codon, nucleotides 87 to 467 are the polypeptide coding region, nucleotides 468 to 470 are the termination codon, nucleotides 468 to 804 are the 3' untranslated sequence, and nucleotides 805 to 832 are part of the poly(A) tail.
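The feature coordinates given for SEQ ID NO:19 can be checked for internal consistency. The sketch below assumes 1-based inclusive coordinates and takes the total cDNA length (832) from the last coordinate listed; the names `FEATURES`, `features_are_contiguous`, and `codon_count` are illustrative only.

```python
# Feature table for the PSO375649 cDNA (SEQ ID NO:19), 1-based inclusive.
# The total length is inferred from the last coordinate given (an assumption).
CDNA_LENGTH = 832

FEATURES = [
    ("5' untranslated sequence", 1, 86),
    ("polypeptide coding region", 87, 467),
    ("3' untranslated sequence (starts at the termination codon)", 468, 804),
    ("poly(A) tail (partial)", 805, 832),
]


def feature_length(start, end):
    """Length of a 1-based inclusive span."""
    return end - start + 1


def features_are_contiguous(features, total_length):
    """True if the features tile positions 1..total_length with no gaps
    or overlaps, in the order given."""
    expected_start = 1
    for _, start, end in features:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


def codon_count(start, end):
    """Number of codons in a coding span; raises if not a multiple of three."""
    length = feature_length(start, end)
    if length % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return length // 3


if __name__ == "__main__":
    print(features_are_contiguous(FEATURES, CDNA_LENGTH))
    print(codon_count(87, 467))
```

The coding region spans 381 nucleotides, that is 127 codons, which agrees with the 127 amino acid translation product described for SEQ ID NO:20.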

[0034] SEQ ID NO:20 is the 127 amino acid long putative PSO375649 translation product SC194 protein sequence.

[0035] SEQ ID NO:21 is an oligonucleotide primer used in the diagnostic PCR to check for soybean genomic DNA presence in total RNA or cDNA when paired with SEQ ID NO:22.

[0036] SEQ ID NO:22 is an oligonucleotide primer used in the diagnostic PCR to check for soybean genomic DNA presence in total RNA or cDNA when paired with SEQ ID NO:21.

[0037] SEQ ID NO:23 is the longer strand sequence of the adaptor supplied in the Clontech™ GenomeWalker™ kit.

[0038] SEQ ID NO:24 is an MPSS tag sequence that is specific to the unique gene PSO375649.

[0039] SEQ ID NO:25 is a sense primer used in quantitative RT-PCR analysis of PSO375649 gene expression profile.

[0040] SEQ ID NO:26 is an antisense primer used in quantitative RT-PCR analysis of PSO375649 gene expression profile.

[0041] SEQ ID NO:27 is a sense primer used as an endogenous control gene-specific primer in the quantitative RT-PCR analysis of PSO375649 gene expression profile.

[0042] SEQ ID NO:28 is an antisense primer used as an endogenous control gene-specific primer in the quantitative RT-PCR analysis of PSO375649 gene expression profile.

[0043] SEQ ID NO:29 is a sense primer used in quantitative PCR analysis of SAMS:ALS transgene copy numbers.

[0044] SEQ ID NO:30 is a FAM labeled fluorescent DNA oligo probe used in quantitative PCR analysis of SAMS:ALS transgene copy numbers.

[0045] SEQ ID NO:31 is an antisense primer used in quantitative PCR analysis of SAMS:ALS transgene copy numbers.

[0046] SEQ ID NO:32 is a sense primer used in quantitative PCR analysis of GM-SC194:YFP transgene copy numbers.

[0047] SEQ ID NO:33 is a FAM labeled fluorescent DNA oligo probe used in quantitative PCR analysis of GM-SC194:YFP transgene copy numbers.

[0048] SEQ ID NO:34 is an antisense primer used in quantitative PCR analysis of GM-SC194:YFP transgene copy numbers.

[0049] SEQ ID NO:35 is a sense primer used as an endogenous control gene primer in quantitative PCR analysis of transgene copy numbers.

[0050] SEQ ID NO:36 is a VIC labeled DNA oligo probe used as an endogenous control gene probe in quantitative PCR analysis of transgene copy numbers.

[0051] SEQ ID NO:37 is an antisense primer used as an endogenous control gene primer in quantitative PCR analysis of transgene copy numbers.

[0052] SEQ ID NO:38 is the recombination site attB1 sequence in the Gateway cloning system (Invitrogen).

[0053] SEQ ID NO:39 is the recombination site attB2 sequence in the Gateway cloning system (Invitrogen).

[0054] SEQ ID NO:40 is the 3291 bp sequence of QC299.

[0055] SEQ ID NO:41 is the 4642 bp sequence of QC300.

[0056] SEQ ID NO:42 is the 8187 bp sequence of PHP25224.

[0057] SEQ ID NO:43 is the 8945 bp sequence of QC302.

[0058] SEQ ID NO:44 is the 2817 bp sequence of pCR8/GW/TOPO.

[0059] SEQ ID NO:45 is the 4145 bp sequence of QC300-1.

[0060] SEQ ID NO:46 is the 5286 bp sequence of QC330.

[0061] SEQ ID NO:47 is the 4986 bp sequence of QC300-1Y.

[0062] SEQ ID NO:48 is the 4792 bp sequence of QC300-2Y.

[0063] SEQ ID NO:49 is the 4590 bp sequence of QC300-3Y.

[0064] SEQ ID NO:50 is the 4343 bp sequence of QC300-4Y.

[0065] SEQ ID NO:51 is the 4130 bp sequence of QC300-5Y.

[0066] SEQ ID NO:52 is the 3895 bp sequence of QC300-6Y.

[0067] SEQ ID NO:53 is the 4157 bp sequence of pZSL90.

[0068] Table 1 displays the relative abundance (parts per million, PPM) of the PSO375649 gene determined by Lynx MPSS gene expression profiling.

[0069] Table 2 displays the relative transgene copy numbers and YFP expression of SC194:YFP transgenic soybean plants.

[0070] FIG. 1 displays the logarithm of relative quantifications of the PSO375649 gene expression in 14 different soybean tissues by quantitative RT-PCR. The gene expression profile indicates that the PSO375649 gene is highly expressed in flower buds and open flowers.

[0071] FIG. 2 displays the SC194 promoter copy number analysis by Southern hybridization. Also displayed is a schematic of the SC194 promoter showing relative linear positions of a number of restriction sites.

[0072] FIG. 3 is a schematic representation of the map of plasmids QC299, QC300, PHP25224, and QC302.

[0073] FIG. 4 displays schematic representations of a Gateway cloning entry vector pCR8/GW/TOPO (Invitrogen), the construct QC300-1 created by cloning the full length SC194 promoter into pCR8/GW/TOPO, a Gateway cloning destination vector QC330 containing a reporter ZS-YELLOW1 N1, and a final construct QC300-1Y with the 1328 bp truncated SC194 promoter (SEQ ID NO:2) placed in front of the ZS-YELLOW1 N1 reporter gene. Promoter deletion constructs QC300-2Y, QC300-3Y, QC300-4Y, QC300-5Y, and QC300-6Y containing the 1134, 932, 685, 472, and 237 bp truncated SC194 promoters, respectively, have similar map configurations, the difference being in the length of the promoter.
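Because the deletion constructs are described as differing only in promoter length, subtracting each promoter length from the corresponding construct size should give the same value every time. The sketch below is an illustrative consistency check using the construct sizes listed for SEQ ID NOs: 47 to 52; the name `backbone_sizes` is hypothetical, and the "backbone" here simply means everything in the construct other than the promoter.

```python
# (construct size in bp, promoter length in bp) for each deletion construct,
# taken from the sequence descriptions and the FIG. 4 description.
CONSTRUCTS = {
    "QC300-1Y": (4986, 1328),
    "QC300-2Y": (4792, 1134),
    "QC300-3Y": (4590, 932),
    "QC300-4Y": (4343, 685),
    "QC300-5Y": (4130, 472),
    "QC300-6Y": (3895, 237),
}


def backbone_sizes(constructs):
    """Construct size minus promoter length for each construct."""
    return {name: total - promoter
            for name, (total, promoter) in constructs.items()}


if __name__ == "__main__":
    sizes = backbone_sizes(CONSTRUCTS)
    for name, size in sizes.items():
        print(name, size)
    # A single distinct value means the constructs differ only in the promoter.
    print(len(set(sizes.values())) == 1)
```

Each construct leaves 3658 bp once the promoter is subtracted, consistent with the statement that the maps differ only in promoter length.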

[0074] FIG. 5 is a linear schematic of the SC194 promoter constructs QC300, QC300-1Y, QC300-2Y, QC300-3Y, QC300-4Y, QC300-5Y, and QC300-6Y wherein the reporter ZS-YELLOW1 N1 is operably linked to the full length SC194 promoter and the progressive truncations of the SC194 promoter.

[0075] FIG. 6 displays the transient expression of the fluorescent protein reporter gene ZS-YELLOW1 N1 in the cotyledons of germinating soybean seeds. The reporter gene is driven by the full length SC194 promoter in construct QC300, or driven by the SC194 promoter or the progressively truncated SC194 promoters in the transient expression constructs QC300-1Y to QC300-6Y. Construct pZSL90 represents the positive control (constitutive promoter SCP1 drives the same reporter gene).

[0076] FIG. 7 displays the stable expression of the fluorescent protein reporter gene ZS-YELLOW1 N1 in the floral and other tissues of transgenic soybean plants containing a single copy of the transgene construct QC302. The green color indicates ZS-YELLOW1 N1 gene expression. The red color is background autofluorescence from plant green tissues.

DETAILED DESCRIPTION OF THE INVENTION

[0077] The disclosures of all patents, patent applications, and publications cited herein are incorporated by reference in their entirety.

[0078] As used herein and in the appended claims, the singular forms "a", "an", and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, reference to "a plant" includes a plurality of such plants, reference to "a cell" includes one or more cells and equivalents thereof known to those skilled in the art, and so forth.

[0079] In the context of this disclosure, a number of terms shall be utilized.

[0080] The term "promoter" refers to a nucleotide sequence capable of controlling the expression of a coding sequence or functional RNA. Functional RNA includes, but is not limited to, transfer RNA (tRNA) and ribosomal RNA (rRNA). Numerous examples of promoters may be found in the compilation by Okamuro and Goldberg (Biochemistry of Plants 15:1-82 (1989)). The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". It is further recognized that, since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of some variation may have identical promoter activity.

[0081] An "intron" is an intervening sequence in a gene that is transcribed into RNA and then excised in the process of generating the mature mRNA. The term is also used for the excised RNA sequences. An "exon" is a portion of the sequence of a gene that is transcribed and is found in the mature messenger RNA derived from the gene, and is not necessarily a part of the sequence that encodes the final gene product.

[0082] A "flower" is a complex structure consisting of pedicel, sepal, petal, stamen, and carpel. A stamen comprises an anther, pollen and filament. A carpel comprises a stigma, style and ovary. An ovary comprises an ovule, embryo sac, and egg cell. Soybean pods develop from the pistil. It is likely that a gene expressed in the pistil of a flower continues to express in early pod. A "flower cell" is a cell from any one of these structures. Flower promoters in general include promoters that direct gene expression in any of the above tissues or cell types.

[0083] The terms "flower crop" and "flowering plants" refer to plants that produce flowers that are marketable within the floriculture industry. Flower crops include both cut flowers and potted flowering plants. Cut flowers are plants that generate flowers that can be cut from the plant and can be used in fresh flower arrangements. Flower crops include roses, carnations, Gerberas, Chrysanthemums, tulips, Gladioli, Alstroemerias, Anthuriums, lisianthuses, larkspurs, irises, orchids, snapdragons, African violets, and azaleas, in addition to other less popular flower crops.

[0084] The terms "flower-specific promoter" and "flower-preferred promoter" are used interchangeably herein and refer to promoters active in flower, with promoter activity being significantly higher in flower tissue than in non-flower tissue. "Preferentially initiates transcription", when describing a particular cell type, refers to the relative level of transcription in that particular cell type as opposed to other cell types. The described SC194 promoters are promoters that preferentially initiate transcription in flower cells. Preferably, the promoter activity in terms of expression levels of an operably linked sequence is more than ten-fold higher in flower tissue than in non-flower tissue. More preferably, the promoter activity is present in flower tissue while undetectable in non-flower tissue.
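The preference levels described above can be illustrated as a simple comparison of two expression measurements. This is a hedged sketch only: the function names, the tier labels, and the example values are hypothetical, the ten-fold figure is used as a default threshold, and a plain ratio does not by itself establish that a difference is statistically significant.

```python
def fold_difference(flower_level, non_flower_level):
    """Ratio of flower to non-flower expression.

    Returns infinity when expression is detected in flower tissue but is
    zero (undetectable) in non-flower tissue.
    """
    if flower_level < 0 or non_flower_level < 0:
        raise ValueError("expression levels must be non-negative")
    if non_flower_level == 0:
        return float("inf") if flower_level > 0 else 0.0
    return flower_level / non_flower_level


def preference_tier(flower_level, non_flower_level, threshold=10.0):
    """Assign an illustrative tier based on the fold difference."""
    fold = fold_difference(flower_level, non_flower_level)
    if flower_level > 0 and non_flower_level == 0:
        return "flower only"          # undetectable in non-flower tissue
    if fold > threshold:
        return "above threshold"      # more than ten-fold by default
    if fold > 1:
        return "higher in flower"     # higher, but below the threshold
    return "not flower-preferred"


if __name__ == "__main__":
    # Hypothetical expression values in arbitrary units.
    print(preference_tier(500, 20))
    print(preference_tier(500, 0))
    print(preference_tier(30, 20))
```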

[0085] As used herein, an "SC194 promoter" refers to one type of flower-specific promoter. The native SC194 promoter (or full-length native SC194 promoter) is the native promoter of the putative soybean SC194 polypeptide, which is a novel protein without significant homology to any known protein in public databases. The "SC194 promoter", as used herein, also refers to fragments of the full-length native promoter that retain significant promoter activity. For example, an SC194 promoter of the present invention can be the full-length promoter (SEQ ID NO:1) or a promoter-functioning fragment thereof, which includes, among others, the polynucleotides of SEQ ID NOs: 2, 3, 4, 5, 6 and 7. An SC194 promoter also includes variants that are substantially similar and functionally equivalent to any portion of the nucleotide sequence set forth in SEQ ID NOs: 1, 2, 3, 4, 5, 6, or 7, or sequences therebetween.

[0086] An "isolated nucleic acid fragment" or "isolated polynucleotide" refers to a polymer of ribonucleotides (RNA) or deoxyribonucleotides (DNA) that is single-stranded or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated polynucleotide in the form of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.

[0087] The terms "polynucleotide", "polynucleotide sequence", "nucleic acid sequence", and "nucleic acid fragment"/"isolated nucleic acid fragment" are used interchangeably herein. These terms encompass nucleotide sequences and the like. A polynucleotide may be a polymer of RNA or DNA that is single- or double-stranded, that optionally contains synthetic, non-natural or altered nucleotide bases. A polynucleotide in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA, synthetic DNA, or mixtures thereof. Nucleotides (usually found in their 5'-monophosphate form) are referred to by a single letter designation as follows: "A" for adenylate or deoxyadenylate (for RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate, "G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for deoxythymidylate, "R" for purines (A or G), "Y" for pyrimidines (C or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and "N" for any nucleotide.
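The single letter designations listed above can be treated as sets of permitted bases when comparing DNA sequences. The following sketch covers only the DNA letters named in this paragraph (A, C, G, T, R, Y, K, H, N); inosine and uridylate are left out, and the names `AMBIGUITY`, `to_regex`, and `matches` are illustrative.

```python
import re

# Bases permitted by each single letter designation (DNA only).
AMBIGUITY = {
    "A": "A",
    "C": "C",
    "G": "G",
    "T": "T",
    "R": "AG",    # purines
    "Y": "CT",    # pyrimidines
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def to_regex(pattern):
    """Translate a pattern of single letter designations into a regular
    expression. Letters outside AMBIGUITY raise KeyError."""
    return "".join("[" + AMBIGUITY[base] + "]" for base in pattern.upper())


def matches(pattern, sequence):
    """True if the sequence has the same length as the pattern and every
    base is permitted by the designation at that position."""
    return re.fullmatch(to_regex(pattern), sequence.upper()) is not None


if __name__ == "__main__":
    print(matches("RYN", "ACG"))  # A is a purine, C a pyrimidine, G is any
    print(matches("KH", "GG"))    # second G is not permitted by H
```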

[0088] A "heterologous nucleic acid fragment" or "heterologous nucleotide sequence" refers to a nucleotide sequence that is not naturally occurring with the plant promoter sequence of the invention. While this nucleotide sequence is heterologous to the promoter sequence, it may be homologous, or native, or heterologous, or foreign, to the plant host. However, it is recognized that the instant promoters may be used with their native coding sequences to increase or decrease expression resulting in a change in phenotype in the transformed seed.

[0089] The terms "fragment (or variant) that is functionally equivalent" and "functionally equivalent fragment (or variant)" are used interchangeably herein. These terms refer to a portion or subsequence or variant of the promoter sequence of the present invention in which the ability to initiate transcription or drive gene expression (such as to produce a certain phenotype) is retained. Fragments and variants can be obtained via methods such as site-directed mutagenesis and synthetic construction. As with the provided promoter sequences described herein, the contemplated fragments and variants operate to promote the flower-preferred expression of an operably linked heterologous nucleic acid sequence, forming a recombinant DNA construct (also, a chimeric gene). For example, the fragment or variant can be used in the design of recombinant DNA constructs to produce the desired phenotype in a transformed plant. Recombinant DNA constructs can be designed for use in co-suppression or antisense by linking a promoter fragment or variant thereof in the appropriate orientation relative to a heterologous nucleotide sequence.

[0090] In some aspects of the present invention, the promoter fragments can comprise at least about 20 contiguous nucleotides, or at least about 50 contiguous nucleotides, or at least about 75 contiguous nucleotides, or at least about 100 contiguous nucleotides of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7. In another aspect, a promoter fragment is the nucleotide sequence set forth in SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7. The nucleotides of such fragments will usually comprise the TATA recognition sequence of the particular promoter sequence. Such fragments may be obtained by use of restriction enzymes to cleave the naturally occurring promoter nucleotide sequences disclosed herein, by synthesizing a nucleotide sequence from the naturally occurring promoter DNA sequence, or may be obtained through the use of PCR technology. See particularly, Mullis et al., Methods Enzymol. 155:335-350 (1987), and Higuchi, R. In PCR Technology: Principles and Applications for DNA Amplifications; Erlich, H. A., Ed.; Stockton Press Inc.: New York, 1989.
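The contiguous-nucleotide thresholds above can be checked computationally. The sketch below is one simple way to test whether a candidate sequence contains at least a given number of contiguous nucleotides of a reference sequence; the function name `shares_contiguous_run` and the toy sequences in the usage note are hypothetical and are not taken from the sequence listing.

```python
def shares_contiguous_run(candidate: str, reference: str, min_len: int = 20) -> bool:
    """Return True if some run of `min_len` contiguous nucleotides of
    `reference` also occurs in `candidate` (same strand only)."""
    candidate, reference = candidate.upper(), reference.upper()
    if min_len <= 0 or len(reference) < min_len:
        return False
    # Every window of length min_len found in the reference sequence.
    windows = {
        reference[i:i + min_len]
        for i in range(len(reference) - min_len + 1)
    }
    # Slide the same window size over the candidate and look for a hit.
    return any(
        candidate[i:i + min_len] in windows
        for i in range(len(candidate) - min_len + 1)
    )
```

With the toy inputs `reference = "ACGTACGTTAGC"` and `candidate = "GGGTACGTTAGG"`, the call returns True for `min_len=6` and False for `min_len=10`. A fuller check would presumably also test the reverse complement of the candidate.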

[0091] The terms "substantially similar" and "corresponding substantially" as used herein refer to nucleic acid sequences, particularly promoter sequences, wherein changes in one or more nucleotide bases do not substantially alter the ability of the promoter to initiate transcription or drive gene expression or produce a certain phenotype. These terms also refer to modifications, including deletions and variants, of the nucleic acid sequences of the instant invention by way of deletion or insertion of one or more nucleotides that do not substantially alter the functional properties of the resulting promoter relative to the initial, unmodified promoter. It is therefore understood, as those skilled in the art will appreciate, that the invention encompasses more than the specific exemplary sequences.

[0092] In one example, substantially similar nucleic acid sequences include those that are also defined by their ability to hybridize to the disclosed nucleic acid sequences, or portions thereof. Substantially similar nucleic acid sequences include those sequences that hybridize, under moderately stringent conditions (for example, 0.5×SSC, 0.1% SDS, 60 °C) with the sequences exemplified herein, or to any portion of the nucleotide sequences reported herein and which are functionally equivalent to the promoter of the invention. Estimates of such homology are provided by either DNA-DNA or DNA-RNA hybridization under conditions of stringency as is well understood by those skilled in the art (Hames and Higgins, Eds.; In Nucleic Acid Hybridisation; IRL Press: Oxford, U.K., 1985). Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes partially determine stringency conditions. One set of conditions uses a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45 °C for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50 °C for 30 min. Another set of stringent conditions uses higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to 60 °C. Another set of highly stringent conditions uses two final washes in 0.1×SSC, 0.1% SDS at 65 °C.
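The first wash series described above can be written down as data, which makes the trend toward lower salt and higher temperature explicit. The sketch below is illustrative only: the class and variable names are hypothetical, and the conversion assumes the standard 1×SSC composition of 0.15 M NaCl and 0.015 M sodium citrate, which this paragraph does not itself state.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Assumption: standard 1x SSC contains 0.15 M NaCl (plus 0.015 M sodium citrate).
NACL_MOLAR_PER_1X_SSC = 0.15


@dataclass(frozen=True)
class Wash:
    ssc_fold: float            # e.g. 6.0 for 6x SSC
    sds_percent: float
    temp_c: Optional[float]    # None stands for room temperature
    minutes: int
    repeats: int = 1

    @property
    def nacl_molar(self) -> float:
        """NaCl concentration contributed by the SSC component."""
        return self.ssc_fold * NACL_MOLAR_PER_1X_SSC


# The first wash series described in the paragraph above.
WASH_SERIES = [
    Wash(6.0, 0.5, None, 15),
    Wash(2.0, 0.5, 45.0, 30),
    Wash(0.2, 0.5, 50.0, 30, repeats=2),
]

# The more stringent variant: identical except the final two washes run at 60 degrees C.
STRINGENT_SERIES = WASH_SERIES[:-1] + [replace(WASH_SERIES[-1], temp_c=60.0)]
```

Reading `nacl_molar` down the series shows the salt concentration falling at each step, which, together with the rising temperature, is what raises stringency.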

[0093] In some examples, substantially similar nucleic acid sequences are those sequences that are at least 80% identical to the nucleic acid sequences reported herein or which are at least 80% identical to any portion of the nucleotide sequences reported herein. In some instances, substantially similar nucleic acid sequences are those that are at least 90% identical to the nucleic acid sequences reported herein, or at least 90% identical to any portion of the nucleotide sequences reported herein. In some examples, substantially similar nucleic acid sequences are those that are at least 95% identical to the nucleic acid sequences reported herein, or are at least 95% identical to any portion of the nucleotide sequences reported herein. It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying related polynucleotide sequences. Useful examples of percent identities are those listed above, or also any integer percentage from 80% to 100%, such as, for example, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% and 99%.
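The identity thresholds above can be illustrated with a short calculation. The sketch below computes percent identity over a pair of sequences that are assumed to be already aligned, with "-" marking gaps; it is a simplified stand-in and not the Clustal or BLASTN procedure referred to elsewhere in this disclosure. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Columns where both sequences have a gap are ignored; a gap opposite a
    base counts as a mismatch. This convention is an assumption of the sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


def is_substantially_similar(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """Apply one of the identity thresholds (80, 90, 95, ...) named above."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For the toy pair "ACGTACGTAC" and "ACGTACGTAA", nine of ten columns match, so the pair passes a 90% threshold and fails a 95% threshold.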

[0094] "Codon degeneracy" refers to divergence in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment comprising a nucleotide sequence that encodes all or a substantial portion of the amino acid sequences set forth herein. The skilled artisan is well aware of the "codon-bias" exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a nucleic acid sequence for improved expression in a host cell, it is desirable to design the nucleic acid sequence such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
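Codon degeneracy can be shown with a small worked example. The sketch below uses a deliberately partial codon table (a handful of standard-genetic-code entries, not the full table) to show that two different nucleotide sequences can encode the same amino acid sequence. The names `PARTIAL_CODON_TABLE` and `translate` are hypothetical.

```python
# A partial excerpt of the standard genetic code, sufficient for the example.
PARTIAL_CODON_TABLE = {
    "ATG": "M",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon using the partial table."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(
        PARTIAL_CODON_TABLE[dna[i:i + 3]]  # KeyError for codons outside the excerpt
        for i in range(0, len(dna), 3)
    )
```

Both "ATGCTTGCTAAA" and "ATGTTGGCGAAG" translate to the same peptide, so a designer can choose between them according to the codon bias of the host cell.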

[0095] Sequence alignments and percent similarity calculations may be determined using the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences is performed using the Clustal method of alignment (Higgins and Sharp, CABIOS 5:151-153 (1989)) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are GAP PENALTY=10, GAP LENGTH PENALTY=10, KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4. A "substantial portion" of an amino acid or nucleotide sequence comprises enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to afford putative identification of that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Altschul, S. F. et al., J. Mol. Biol. 215:403-410 (1990)) and Gapped Blast (Altschul, S. F. et al., Nucleic Acids Res. 25:3389-3402 (1997)). BLASTN refers to a BLAST program that compares a nucleotide query sequence against a nucleotide sequence database.

[0096] The term "gene" refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences. "Chimeric gene" or "recombinant expression construct", which are used interchangeably, refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, and arranged in a manner different than that found in nature. "Endogenous gene" refers to a native gene in its natural location in the genome of an organism. A "foreign" gene refers to a gene not normally found in the host organism, which is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A "transgene" is a gene that has been introduced into the genome by a transformation procedure.

[0097] "Coding sequence" refers to a DNA sequence that encodes for a specific amino acid sequence. "Regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, and are not limited to, promoters, enhancers, translation leader sequences, introns, and polyadenylation recognition sequences.

[0098] The "translation leader sequence" refers to a DNA sequence located between the promoter sequence of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner, R. and Foster, G. D., Molecular Biotechnology 3:225 (1995)).

[0099] The "3' non-coding sequences" refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized as affecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is exemplified by Ingelbrecht et al., Plant Cell 1:671-680 (1989).

[0100] "RNA transcript" refers to a product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When an RNA transcript is a perfect complementary copy of a DNA sequence, it is referred to as a primary transcript, or it may be an RNA sequence derived from posttranscriptional processing of a primary transcript and is referred to as a mature RNA. "Messenger RNA" ("mRNA") refers to RNA that is without introns and that can be translated into protein by the cell. "cDNA" refers to a DNA that is complementary to and synthesized from an mRNA template using the enzyme reverse transcriptase. The cDNA can be single-stranded or converted into the double-stranded form using the Klenow fragment of DNA polymerase I. "Sense" RNA refers to an RNA transcript that includes mRNA and so can be translated into protein within a cell or in vitro. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks expression or transcript accumulation of a target gene. The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may not be translated yet has an effect on cellular processes.
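The relationship between sense and antisense RNA defined above reduces to a reverse-complement operation. The following minimal sketch, with hypothetical function names, derives the sense transcript and a fully complementary antisense transcript from the coding (sense) strand of a DNA sequence, ignoring processing steps such as splicing.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def sense_rna(coding_strand: str) -> str:
    """RNA with the same sequence as the coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def antisense_rna(coding_strand: str) -> str:
    """RNA complementary to the sense transcript, written 5' to 3'."""
    return sense_rna(reverse_complement(coding_strand))
```

For the toy coding strand "ATGGCC", the sense RNA is "AUGGCC" and the antisense RNA is "GGCCAU".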

[0101] The term "operably linked" refers to the association of nucleic acid sequences on a single polynucleotide so that the function of one is affected by the other. For example, a promoter is operably linked with a heterologous nucleotide sequence, e.g., a coding sequence, when it is capable of affecting the expression of that heterologous nucleotide sequence (i.e., for example, the coding sequence is under the transcriptional control of the promoter). A coding sequence can be operably linked to promoter sequences in sense or antisense orientation.

[0102] The terms "initiate transcription", "initiate expression", "drive transcription", and "drive expression" are used interchangeably herein and all refer to the primary function of a promoter. As detailed throughout this disclosure, a promoter is a non-coding genomic DNA sequence, usually upstream (5') to the relevant coding sequence, and its primary function is to act as a binding site for RNA polymerase and initiate transcription by the RNA polymerase. Additionally, there is "expression" of RNA, including functional RNA, or the expression of polypeptide for operably linked encoding nucleotide sequences, as the transcribed RNA ultimately is translated into the corresponding polypeptide.

[0103] The term "expression", as used herein, refers to the production of a functional end-product, e.g., an mRNA or a protein (precursor or mature).

[0104] The term "recombinant DNA construct" or "recombinant expression construct" is used interchangeably and refers to a discrete polynucleotide into which a nucleic acid sequence or fragment can be moved. Preferably, it is a plasmid vector or a fragment thereof comprising the promoters of the present invention. The choice of plasmid vector is dependent upon the method that will be used to transform host plants. The skilled artisan is well aware of the genetic elements that must be present on the plasmid vector in order to successfully transform, select and propagate host cells containing the recombinant DNA construct. The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression (Jones et al., EMBO J. 4:2411-2418 (1985); De Almeida et al., Mol. Gen. Genetics 218:78-86 (1989)), and thus that multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by PCR and Southern analysis of DNA, RT-PCR and Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.

[0105] Expression or overexpression of a gene involves transcription of the gene and translation of the mRNA into a precursor or mature protein. "Antisense inhibition" refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein. "Overexpression" refers to the production of a gene product in transgenic organisms that exceeds levels of production in normal or non-transformed organisms. "Co-suppression" refers to the production of sense RNA transcripts capable of suppressing the expression or transcript accumulation of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020). The mechanism of co-suppression may be at the DNA level (such as DNA methylation), at the transcriptional level, or at post-transcriptional level.

[0106] Co-suppression constructs in plants previously have been designed by focusing on overexpression of a nucleic acid sequence having homology to an endogenous mRNA, in the sense orientation, which results in the reduction of all RNA having homology to the overexpressed sequence (see Vaucheret et al., Plant J. 16:651-659 (1998); and Gura, Nature 404:804-808 (2000)). The overall efficiency of this phenomenon is low, and the extent of the RNA reduction is widely variable. Recent work has described the use of "hairpin" structures that incorporate all, or part, of an mRNA encoding sequence in a complementary orientation that results in a potential "stem-loop" structure for the expressed RNA (WO99/53050 and WO02/00904). This increases the frequency of co-suppression in the recovered transgenic plants. Another variation describes the use of plant viral sequences to direct the suppression, or "silencing", of proximal mRNA encoding sequences (WO98/36083). Neither of these co-suppressing phenomena has been elucidated mechanistically at the molecular level, although genetic evidence has been obtained that may lead to the identification of potential components (Elmayan et al., Plant Cell 10:1747-1757 (1998)).

[0107] As stated herein, "suppression" refers to a reduction of the level of enzyme activity or protein functionality (e.g., a phenotype associated with a protein) detectable in a transgenic plant when compared to the level of enzyme activity or protein functionality detectable in a non-transgenic or wild type plant with the native enzyme or protein. The level of enzyme activity in a plant with the native enzyme is referred to herein as "wild type" activity. The level of protein functionality in a plant with the native protein is referred to herein as "wild type" functionality. The term "suppression" includes lower, reduce, decline, decrease, inhibit, eliminate and prevent. This reduction may be due to a decrease in translation of the native mRNA into an active enzyme or functional protein. It may also be due to the transcription of the native DNA into decreased amounts of mRNA and/or to rapid degradation of the native mRNA. The term "native enzyme" refers to an enzyme that is produced naturally in a non-transgenic or wild type cell. The terms "non-transgenic" and "wild type" are used interchangeably herein.

[0108] "Altering expression" refers to the production of gene product(s) in transgenic organisms in amounts or proportions that differ significantly from the amount of the gene product(s) produced by the corresponding wild-type organisms (i.e., expression is increased or decreased).

[0109] "Transformation" refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" organisms. Thus, a "transgenic plant cell" as used herein refers to a plant cell containing the transformed nucleic acid fragments. The preferred method of soybean cell transformation is use of particle-accelerated or "gene gun" transformation technology (Klein, T., Nature (London) 327:70-73 (1987); U.S. Pat. No. 4,945,050).

[0110] "Transient expression" refers to the temporary expression of genes, often reporter genes such as β-glucuronidase (GUS) or fluorescent protein genes such as GFP, ZS-YELLOW1 N1, AM-CYAN1, and DS-RED, in selected cell types of the host organism into which the transgene is introduced temporarily by a transformation method. The transformed materials of the host organism are subsequently discarded after the transient gene expression assay.

[0111] A "marketable flower trait" is a characteristic or phenotype of the flower of a plant such as the color, scent or morphology of a flower. The marketable flower trait is a characteristic of a flower that is of high regard to a flower crop consumer in deciding whether to purchase the flower crop.

[0112] The phrase "genes involved in anthocyanin biosynthesis" refers to genes that encode proteins that play a role in converting metabolic precursors into one of a number of anthocyanins. Examples of genes involved in the biosynthesis of anthocyanin are dihydroflavonol 4-reductase, flavonoid 3,5-hydroxylase, chalcone synthase, chalcone isomerase, flavonoid 3-hydroxylase, anthocyanin synthase, and UDP-glucose 3-O-flavonoid glucosyl transferase (see, e.g., Mori et al., Plant Cell Reports 22:415-421 (2004)).

[0113] The phrase "genes involved in the biosynthesis of fragrant fatty acid derivatives" refers to genes that encode proteins that play a role in manipulating the biosynthesis of fragrant fatty acid derivatives such as terpenoids, phenylpropanoids, and benzenoids in flowers (see, e.g., Tanaka et al., Plant Cell, Tissue and Organ Culture 80:1-24 (2005)). Examples of such genes include S-linalool synthase, acetyl CoA:benzylalcohol acetyltransferase, benzyl CoA:benzylalcohol benzoyl transferase, S-adenosyl-L-methionine:benzoic acid carboxylmethyl transferase (BAMT), myrcene synthases, (E)-β-ocimene synthase, orcinol O-methyltransferase, and limonene synthases (see, e.g., Tanaka et al., supra).

[0114] The term "flower homeotic genes" or "flower morphology modifying genes" refers to genes that are involved in pathways associated with flower morphology. A modification of flower morphology can lead to a novel form of the respective flower that can enhance its value in the flower crop marketplace. Morphology can include the size, shape, or petal pattern of a flower. Some examples of flower homeotic genes include genes involved in cell-fate determination (in the ABC combinatorial model of gene expression), including AGAMOUS, which determines carpel fate in the central whorl, APETALA3, which determines the sepal fate in the outer whorl, and PISTILLATA, which determines petal development in the second whorl (Espinosa-Soto et al., Plant Cell 16:2923-2939 (2004)).

[0115] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook, J. et al., In Molecular Cloning: A Laboratory Manual; 2nd ed.; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, N.Y., 1989 (hereinafter "Sambrook et al., 1989") or Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A. and Struhl, K., Eds.; In Current Protocols in Molecular Biology; John Wiley and Sons: New York, 1990 (hereinafter "Ausubel et al., 1990").

[0116] "PCR" or "Polymerase Chain Reaction" is a technique for the synthesis of large quantities of specific DNA segments consisting of a series of repetitive cycles (Perkin Elmer Cetus Instruments, Norwalk, Conn.). Typically, the double stranded DNA is heat denatured; the two primers complementary to the 3' boundaries of the target segment are annealed at low temperature and then extended at an intermediate temperature. One set of these three consecutive steps comprises a cycle.
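The reason repeated cycles yield large quantities of a DNA segment can be shown with simple arithmetic. The sketch below assumes an idealized model in which each cycle multiplies the number of target copies by (1 + efficiency), with efficiency equal to 1.0 for perfect doubling; this model and the function name `pcr_copies` are illustrative assumptions, not part of the disclosure.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of denature/anneal/extend cycles.

    efficiency = 1.0 means every template is copied in every cycle (doubling);
    lower values model incomplete extension. Plateau effects are ignored.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles
```

Under this model a single template copied through 30 perfect cycles gives 2 to the 30th power, or roughly one billion, copies.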

[0117] Embodiments of the present invention include isolated polynucleotides comprising a nucleotide sequence that is a promoter. In some instances, the nucleotide sequence includes one or more of the following: [0118] a) the sequence set forth in SEQ ID NO:1 or a full-length complement thereof; or [0119] b) a nucleotide sequence comprising a sequence having at least 90% sequence identity, based on the BLASTN method of alignment, when compared to the sequence set forth in SEQ ID NO:1. In other aspects, the nucleotide sequence includes one or more of the following: [0120] (a) a nucleotide sequence comprising a fragment of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or a full-length complement thereof; or [0121] (b) a nucleotide sequence comprising a sequence having at least 90% sequence identity, based on the BLASTN method of alignment, when compared to the nucleotide sequence of (a). The nucleotide sequences of the present invention can be referred to as a promoter or as having promoter-like activity. In some embodiments the nucleotide sequence is a promoter that preferentially initiates transcription in a plant flower cell. Such promoter is referred to as a flower-specific promoter. Preferably the promoter of the present invention is the soybean "SC194" promoter.

[0122] In a preferred embodiment, the promoter comprises the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7. The present invention also includes nucleic acid fragments, variants, and complements of the aforementioned nucleotide sequences or promoters, provided that they are substantially similar and functionally equivalent to the nucleotide sequence set forth in these nucleotide sequences. A nucleic acid fragment or variant that is functionally equivalent to the present SC194 promoter is any nucleic acid fragment or variant that is capable of initiating the expression, preferably initiating flower-specific expression, of a coding sequence or functional RNA in a similar manner to the SC194 promoter. The expression patterns of SC194 gene and its promoter are set forth in Examples 1, 2, 7, and 8. In one example, the expression pattern of a SC194 promoter fragment or variant will have expression patterns similar to that of the SC194 promoter.

[0123] In some aspects, a recombinant DNA construct can be formed in part by operably linking at least one of the promoters of the present invention to any heterologous nucleotide sequence. The heterologous nucleotide sequence can be expressed in a cell as either a functional RNA or a polypeptide. The cell for expression includes a plant or bacterial cell, preferably a plant cell. The recombinant DNA construct preferably includes the SC194 promoter. The recombinant DNA construct preferably includes a heterologous nucleotide sequence that encodes a protein that plays a role in flower color formation, fragrance production, or shape/morphology development of the flower. The color of a flower can be altered transgenically by expressing genes involved in betalain, carotenoid, or flavonoid biosynthesis. In regard to genes involved in the biosynthesis of anthocyanin, dihydroflavonol 4-reductase, flavonoid 3,5-hydroxylase, chalcone synthase, chalcone isomerase, flavonoid 3-hydroxylase, anthocyanin synthase, and UDP-glucose 3-O-flavonoid glucosyl transferase are some examples. The scent of a flower can be altered transgenically by expressing genes that manipulate the biosynthesis of fragrant fatty acid derivatives such as terpenoids, phenylpropanoids, and benzenoids in flowers. Some embodiments of the invention include a heterologous nucleotide sequence that is selected from S-linalool synthase, acetyl CoA:benzylalcohol acetyltransferase, benzyl CoA:benzylalcohol benzoyl transferase, S-adenosyl-L-methionine:benzoic acid carboxylmethyl transferase, myrcene synthases, (E)-β-ocimene synthase, orcinol O-methyltransferase, or limonene synthases. Flower structures/morphologies can be altered transgenically by expressing flower homeotic genes to create novel ornamental varieties. Some embodiments of the invention include a heterologous nucleotide sequence that is selected from genes such as, for example, AGAMOUS, APETALA3, and PISTILLATA.

[0124] It is recognized that the instant promoters may be used with their native coding sequences to increase or decrease expression in flower tissue. The selection of the heterologous nucleic acid fragment depends upon the desired application or phenotype to be achieved. The various nucleic acid sequences can be manipulated so as to provide for the nucleic acid sequences in the proper orientation.

[0125] Plasmid vectors comprising the instant recombinant DNA construct can be constructed. The choice of plasmid vector is dependent upon the method that will be used to transform host cells. The skilled artisan is well aware of the genetic elements that must be present on the plasmid vector in order to successfully transform, select and propagate host cells containing the recombinant DNA construct.

[0126] The described polynucleotide embodiments encompass isolated or substantially purified nucleic acid compositions. An "isolated" or "purified" nucleic acid molecule, or biologically active portion thereof, is substantially free of other cellular material or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. An "isolated" nucleic acid is essentially free of sequences (preferably protein encoding sequences) that naturally flank the polynucleotide (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of the organism from which the polynucleotide is derived. For example, in various embodiments, the isolated polynucleotide can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide sequences that naturally flank the polynucleotide in genomic DNA of the cell from which the polynucleotide is derived.

[0127] In another embodiment, the present invention includes host cells comprising either the recombinant DNA constructs or isolated polynucleotides of the present invention. Examples of the host cells of the present invention include, and are not limited to, yeast, bacteria, and plants, including flower crops such as, e.g., rose, carnation, Gerbera, Chrysanthemum, tulip, Gladioli, Alstroemeria, Anthurium, lisianthus, larkspur, irises, orchid, snapdragon, African violet, or azalea. Preferably, the host cells are plant cells, and more preferably, flower crop cells, and more preferably, Gerbera, rose, carnation, Chrysanthemum, or tulip cells.

[0128] Methods for transforming dicots, primarily by use of Agrobacterium tumefaciens, and obtaining transgenic plants have been published, among others, for cotton (U.S. Pat. No. 5,004,863, U.S. Pat. No. 5,159,135); soybean (U.S. Pat. No. 5,569,834, U.S. Pat. No. 5,416,011); Brassica (U.S. Pat. No. 5,463,174); peanut (Cheng et al., Plant Cell Rep. 15:653-657 (1996); McKently et al., Plant Cell Rep. 14:699-703 (1995)); papaya (Ling et al., Bio/technology 9:752-758 (1991)); and pea (Grant et al., Plant Cell Rep. 15:254-258 (1995)). For a review of other commonly used methods of plant transformation see Newell, C. A., Mol. Biotechnol. 16:53-65 (2000). One of these methods of transformation uses Agrobacterium rhizogenes (Tepfer, M. and Casse-Delbart, F., Microbiol. Sci. 4:24-28 (1987)). Transformation of soybeans using direct delivery of DNA has been published using PEG fusion (WO 92/17598), electroporation (Chowrira et al., Mol. Biotechnol. 3:17-23 (1995); Christou et al., Proc. Natl. Acad. Sci. U.S.A. 84:3962-3966 (1987)), microinjection (Neuhaus et al., Physiol. Plant. 79:213-217 (1990)), or particle bombardment (McCabe et al., Biotechnology 6:923 (1988); Christou et al., Plant Physiol. 87:671-674 (1988)).

[0129] In another embodiment, the present invention includes transgenic plants comprising the recombinant DNA constructs provided herein. The transgenic plants are selected from, for example, one of a number of various flower crops including roses, carnations, Gerberas, Chrysanthemums, tulips, Gladioli, Alstroemerias, Anthuriums, lisianthuses, larkspurs, irises, orchids, snapdragons, African violets, and azaleas, in addition to other less popular flower crops.

[0130] In some embodiments of the invention, there are provided transgenic seeds produced by the transgenic plants provided. Such seeds are able to produce another generation of transgenic plants.

[0131] There are a variety of methods for the regeneration of plants from plant tissues. The particular method of regeneration will depend on the starting plant tissue and the particular plant species to be regenerated. The regeneration, development and cultivation of plants from single plant protoplast transformants or from various transformed explants are well known in the art (Weissbach and Weissbach, Eds.; In Methods for Plant Molecular Biology; Academic Press, Inc.: San Diego, Calif., 1988). This regeneration and growth process typically includes the steps of selecting transformed cells and culturing those individualized cells through the usual stages of embryonic development through the rooted plantlet stage. Transgenic embryos and seeds are similarly regenerated. The resulting transgenic rooted shoots are thereafter planted in an appropriate plant growth medium such as soil. Preferably, the regenerated plants are self-pollinated to provide homozygous transgenic plants. Otherwise, pollen obtained from the regenerated plants is crossed to seed-grown plants of agronomically important lines. Conversely, pollen from plants of these important lines is used to pollinate regenerated plants. A transgenic plant of the present invention containing a desired polypeptide is cultivated using methods well known to one skilled in the art.

[0132] In addition to the above discussed procedures, there are generally available standard resource materials that describe specific conditions and procedures for the construction, manipulation and isolation of macromolecules (e.g., DNA molecules, plasmids, and the like), generation of recombinant DNA fragments and recombinant expression constructs, and the screening and isolating of clones (see, for example, Sambrook et al., 1989; Maliga et al., In Methods in Plant Molecular Biology; Cold Spring Harbor Press, 1995; Birren et al., In Genome Analysis: Detecting Genes, 1; Cold Spring Harbor: New York, 1998; Birren et al., In Genome Analysis: Analyzing DNA, 2; Cold Spring Harbor: New York, 1998; Clark, Ed., In Plant Molecular Biology: A Laboratory Manual; Springer: New York, 1997).

[0133] The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression of the chimeric genes (Jones et al., EMBO J. 4:2411-2418 (1985); De Almeida et al., Mol. Gen. Genetics 218:78-86 (1989)). Thus, multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by northern analysis of mRNA expression, western analysis of protein expression, or phenotypic analysis. Also of interest are seeds obtained from transformed plants displaying the desired expression profile.

[0134] The level of activity of the SC194 promoter in flowers is in some cases comparable to that of many known strong promoters such as the CaMV 35S promoter (Atanassova et al., Plant Mol. Biol. 37:275-285 (1998); Battraw and Hall, Plant Mol. Biol. 15:527-538 (1990); Holtorf et al., Plant Mol. Biol. 29:637-646 (1995); Jefferson et al., EMBO J. 6:3901-3907 (1987); Wilmink et al., Plant Mol. Biol. 28:949-955 (1995)), the Arabidopsis oleosin promoters (Plant et al., Plant Mol. Biol. 25:193-205 (1994); Li, Texas A&M University Ph.D. dissertation, pp. 107-128 (1997)), the Arabidopsis ubiquitin extension protein promoters (Callis et al., J. Biol. Chem. 265(21):12486-12493 (1990)), a tomato ubiquitin gene promoter (Rollfinke et al., Gene 211:267-276 (1998)), a soybean heat shock protein promoter (Raschke et al., J. Mol. Biol. 199(4):549-557 (1988)), and a maize H3 histone gene promoter (Atanassova et al., Plant Mol. Biol. 37:275-285 (1998)).

[0135] In some embodiments, the promoters of the present invention are useful when flower-specific expression of a target heterologous nucleic acid fragment is required. Another useful feature of the promoters is their expression profile, with high levels in developing flowers and low levels in young developing seeds (see Example 1). The promoters of the present invention are most active in developing flower buds and open flowers, while retaining activity, although approximately ten times lower, in developing seeds. Thus, the promoters can be used for gene expression or gene silencing in flowers, especially when gene expression or gene silencing is desired predominantly in flowers along with a lower degree in developing seeds.

[0136] In some embodiments, the promoters of the present invention are used to construct recombinant DNA constructs that can be used to reduce expression of at least one heterologous nucleic acid sequence in a plant cell. To accomplish this, a recombinant DNA construct can be constructed by linking the heterologous nucleic acid sequence to a promoter of the present invention. (See, e.g., U.S. Pat. No. 5,231,020, WO99/53050, WO02/00904, and WO98/36083 for methodology to block plant gene expression via cosuppression.) Alternatively, recombinant DNA constructs designed to express antisense RNA for a heterologous nucleic acid fragment can be constructed by linking the fragment in reverse orientation to a promoter of the present invention. (See, e.g., U.S. Pat. No. 5,107,065 for methodology to block plant gene expression via antisense RNA.) Either the cosuppression or antisense chimeric gene can be introduced into plants via transformation. Transformants, wherein expression of the heterologous nucleic acid sequence is decreased or eliminated, are then selected.

[0137] There are embodiments of the present invention that include promoters of the present invention being utilized for methods of altering (increasing or decreasing) the expression of at least one heterologous nucleic acid sequence in a plant cell which comprises: transforming a plant cell with a recombinant DNA expression construct described herein; growing fertile mature plants from the transformed plant cell; and selecting plants containing a transformed plant cell wherein the expression of the heterologous nucleotide sequence is altered (increased or decreased).

[0138] Transformation and selection can be accomplished using methods well-known to those skilled in the art including, but not limited to, the methods described herein.

[0139] There are provided some embodiments that include methods of expressing a coding sequence in a plant that is a flower crop comprising: introducing a recombinant DNA construct disclosed herein into the plant; growing the plant; and selecting a plant displaying expression of the coding sequence; wherein the nucleotide sequence comprises: a nucleotide sequence comprising the sequence set forth in SEQ ID NO:1 or a full-length complement thereof; a nucleotide sequence comprising a fragment of the sequence set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or a full-length complement thereof, or in alternative embodiments, the sequence set forth in SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7; or a nucleotide sequence comprising a sequence having at least 90% sequence identity, based on the BLASTN method of alignment, when compared to the sequence set forth in SEQ ID NO:1; wherein said nucleotide sequence initiates transcription in a flower cell of the plant.

[0140] Furthermore, some embodiments of the present invention include methods of transgenically altering a marketable flower trait of a flowering plant, comprising: introducing a recombinant DNA construct disclosed herein into the flowering plant; growing a fertile, mature flowering plant resulting from the introducing step; and selecting a flowering plant expressing the heterologous nucleotide sequence in flower tissue based on the altered marketable flower trait.

[0141] As further described in the Examples below, the promoter activity of the soybean genomic DNA fragment sequence SEQ ID NO:1 upstream of the SC194 protein coding sequence was assessed by linking the fragment to a yellow fluorescence reporter gene, ZS-YELLOW1 N1 (YFP) (Matz et al., Nat. Biotechnol. 17:969-973 (1999)), transforming the promoter::YFP expression cassette into soybean, and analyzing YFP expression in various cell types of the transgenic plants (see Examples 7 and 8). All parts of the transgenic plants were analyzed and YFP expression was predominantly detected in flowers. These results indicated that the nucleic acid fragment contained a flower-preferred promoter.

[0142] Some embodiments of the present invention provide recombinant DNA constructs comprising at least one isopentenyl transferase nucleic acid sequence operably linked to a provided promoter, preferably a SC194 promoter. Isopentenyl transferase catalyzes a key step in the biosynthesis of plant cytokinin (Kakimoto, J. Plant Res. 116:233-239 (2003)). Elevated levels of cytokinin in plant cells might help to delay floral senescence and abortion, which may present a potential way to improve crop yields (Chang et al., Plant Physiol. 132:2174-2183 (2003); Young et al., Plant J. 38:910-922 (2004)).

[0143] Utilities for Flower-Specific Promoters

[0144] The color, scent or morphology of a flower represents marketable flower traits, or characteristics/phenotypes of a flower that consumers, particularly floriculturalists, consider when determining which flowers are desirable and will be purchased. Hence, it would be beneficial to be able to alter these characteristics in order to satisfy the desires of consumers. Transgenic technologies can be implemented in order to achieve such results.

[0145] The phenotype of a flower can be altered transgenically by expressing genes, preferably in flower tissue, that play a role in color formation, fragrance production, or shape/morphology development of the flower. This type of alteration is particularly useful in the floriculture industry, and particularly useful for flowering plants.

[0146] The color of a flower is mainly the result of three types of pigment: flavonoids, carotenoids, and betalains. The flavonoids are the most common of the three and they contribute to colors ranging from yellow to red to blue, with anthocyanins being the major flavonoid. Carotenoids are C-40 tetraterpenoids that contribute to the majority of yellow hues and contribute to orange/red, bronze and brown colors, e.g., those seen in roses and chrysanthemums. Betalains are the least abundant and contribute to various hues of ivory, yellow, orange, red and violet. The color of a flower can be altered transgenically by expressing genes involved in, e.g., betalain, carotenoid, or flavonoid biosynthesis. In one example, the color of a flower can be altered transgenically by expressing genes involved in the biosynthesis of anthocyanin, for example, dihydroflavonol 4-reductase, flavonoid 3,5-hydroxylase, chalcone synthase, chalcone isomerase, flavonoid 3-hydroxylase, anthocyanin synthase, and UDP-glucose 3-O-flavonoid glucosyl transferase. In some aspects of the invention, the gene involved in anthocyanin biosynthesis is the flavonoid 3,5-hydroxylase gene (see, e.g., Mori et al., supra). This type of alteration is particularly useful in the floriculture industry, providing novel flower colors in flower crops.

[0147] In addition to color, the scent of a flower can be altered transgenically by expressing genes that manipulate the biosynthesis of fragrant fatty acid derivatives such as terpenoids, phenylpropanoids, and benzenoids in flowers (see, e.g., Tanaka et al., supra). Genes involved in the biosynthesis of fragrant fatty acid derivatives can be operably linked to the flower-specific promoters presently described for preferential expression in flower tissue. The preferential expression in flower tissue can be utilized to generate new and desirable fragrances to enhance the demand for the underlying cut flower. A number of known genes that are involved in the biosynthesis of floral scents are described below. A strong sweet scent can be generated in a flower by introducing or upregulating expression of S-linalool synthase, which was earlier isolated from Clarkia breweri. Two genes that are responsible for the production of benzylacetate and benzylbenzoate are acetyl CoA:benzylalcohol acetyltransferase and benzoyl CoA:benzylalcohol benzoyl transferase, respectively. These transferases were also reported to have been isolated from C. breweri. A phenylpropanoid floral scent, methyl benzoate, is synthesized in part by S-adenosyl-L-methionine:benzoic acid carboxylmethyl transferase (BAMT), which catalyzes the final step in the biosynthesis of methyl benzoate. BAMT is known to have a significant role in the emission of methyl benzoate in snapdragon flowers. Two monoterpenes, myrcene and (E)-β-ocimene, from snapdragon are known to be synthesized in part by the terpene synthases: myrcene synthases and (E)-β-ocimene synthases. Other genes involved in biosynthesis of floral scents have been reported and are being newly discovered, many of which are isolated from rose. Some genes involved in scent production in the rose include orcinol O-methyltransferase, for synthesis of S-adenosylmethionine, and limonene synthases (see, e.g., Tanaka et al., supra).

[0148] Flower structures/morphologies can be altered transgenically by expressing flower homeotic genes to create novel ornamental varieties. The flower homeotic genes that are determinative of flower morphology include genes such as AGAMOUS, APETALA3, PISTILLATA, and others that are known and/or are being elucidated (see, e.g., Espinosa-Soto et al., supra).

EXAMPLES

[0149] Aspects of the present invention are exemplified in the following Examples. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, various modifications of the invention in addition to those shown and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.

[0150] In the discussion below, parts and percentages are by weight and degrees are Celsius, unless otherwise stated. Sequences of promoters, cDNA, adaptors, and primers listed herein are in the 5' to 3' orientation unless described otherwise. Techniques in molecular biology were typically performed as described in Ausubel et al., 1990 or Sambrook et al., 1989.

Example 1

Lynx MPSS Profiling of Soybean Genes Preferably Expressed in Flowers

[0151] Soybean expressed sequence tags (ESTs) were generated by sequencing randomly selected clones from cDNA libraries constructed from different soybean tissues. Multiple EST sequences may have different lengths representing different regions of the same soybean gene. When the EST sequences representing one gene are found more frequently in a flower-specific cDNA library, that gene is a possible flower-preferred gene candidate. Multiple EST sequences representing the same soybean gene were compiled electronically based on their overlapping sequence homology into a full length sequence representing a unique gene. These assembled, unique gene sequences were cumulatively collected and the information was stored in a searchable database. Flower specific candidate genes were identified by searching this database to find gene sequences that are frequently found in flower libraries but are rarely found in other tissue libraries, or not found in other tissue libraries.
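The database search described above can be illustrated with a short sketch. It assumes that each EST has already been assigned to an assembled gene and to the tissue of its source library; the gene names, the counts, and the thresholds `min_flower_ests` and `max_other_ests` are hypothetical and are not taken from the actual database.

```python
from collections import Counter

def flower_preferred(est_hits, min_flower_ests=3, max_other_ests=0):
    """Return gene IDs whose ESTs occur mostly in flower libraries.

    est_hits: iterable of (gene_id, library_tissue) pairs, one per EST.
    Thresholds are illustrative assumptions, not values from the source.
    """
    flower = Counter()
    other = Counter()
    for gene_id, tissue in est_hits:
        if tissue == "flower":
            flower[gene_id] += 1
        else:
            other[gene_id] += 1
    # Keep genes that are frequent in flower libraries and rare elsewhere.
    return sorted(
        g for g, n in flower.items()
        if n >= min_flower_ests and other[g] <= max_other_ests
    )

# Hypothetical EST assignments, for illustration only.
example_hits = [
    ("geneA", "flower"), ("geneA", "flower"), ("geneA", "flower"),
    ("geneB", "flower"), ("geneB", "leaf"), ("geneB", "root"),
    ("geneC", "seed"),
]
candidates = flower_preferred(example_hits)
print(candidates)
```

With the default thresholds only genes found exclusively in flower libraries are returned; relaxing `max_other_ests` admits genes that are rarely, rather than never, found in other tissues.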

[0152] One unique gene, PSO375649, was identified in the search as a flower specific gene candidate since all of the ESTs representing PSO375649 were found only in flower tissue. PSO375649 cDNA sequence (SEQ ID NO:19) as well as its putative translated protein sequence (SEQ ID NO:20) were used to search National Center for Biotechnology Information (NCBI) databases. PSO375649 was found to represent a novel soybean gene without significant homology to any known gene. PSO375649 was subsequently named after its genomic DNA clone lab name SC194.

[0153] A more sensitive gene expression profiling methodology, the MPSS (Massively Parallel Signature Sequencing) transcript profiling technique (Brenner et al., Proc. Natl. Acad. Sci. USA 97:1665-70 (2000)), was used to confirm PSO375649 as a flower specific gene. The MPSS technology involves the generation of 17 base signature tags from mRNA samples that have been reverse transcribed from poly A+ RNA isolated using standard molecular biology techniques (Sambrook et al., 1989). The tags are simultaneously sequenced and assigned to genes or ESTs. The abundance of these tags is given a number value that is normalized to parts per million (PPM), which then allows the tag expression, or tag abundance, to be compared across different tissues. Genome wide gene expression can be profiled simultaneously using this technology. Since each 17 base tag is long enough to be specific to only one or a few genes in any genome, the MPSS platform can be used to determine the expression pattern of a particular gene and its expression levels in different tissues.
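Assigning a 17 base tag to a gene amounts to finding an exact match of the tag within the gene's cDNA sequence. The following is a minimal sketch of that lookup, reporting 1-based positions; the sequences are hypothetical placeholders and are not SEQ ID NO:19 or SEQ ID NO:24.

```python
def locate_tag(cdna, tag):
    """Return 1-based (start, end) positions of every exact match of tag in cdna."""
    cdna, tag = cdna.upper(), tag.upper()
    hits = []
    start = cdna.find(tag)
    while start != -1:
        # Convert the 0-based index to 1-based, inclusive coordinates.
        hits.append((start + 1, start + len(tag)))
        start = cdna.find(tag, start + 1)
    return hits

# Hypothetical sequences, for illustration only.
example_tag = "GATCAAGGTTCCATGCA"                 # 17 bases
example_cdna = "ATGGCT" + example_tag + "TTGACC"
print(locate_tag(example_cdna, example_tag))
```

A tag that returns exactly one hit in one gene is specific to that gene; several hits across genes would mean the tag abundance cannot be attributed to a single transcript.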

[0154] MPSS gene expression profiles were generated from different soybean tissues over time, and the profiles were accumulated in a searchable database. PSO375649 cDNA sequence SEQ ID NO:19 was used to search the MPSS database to identify a MPSS tag sequence (SEQ ID NO:24) that is identical to a 17 base pair region from position 352 to 368 in the PSO375649 cDNA sequence. The identified MPSS tag was then used to search the MPSS database to reveal its abundance in different soybean tissues. As illustrated in Table 1, the PSO375649 gene was confirmed to be highly abundant in flowers and pods, a desired expression profile for its promoter to be able to express genes in flowers and in early developing pods.

TABLE 1

Target Gene	PSO375649
MPSS Tag Sequence	SEQ ID NO:24
Flower	4818
Pod	61
Flower Bud	2759
Lateral Root	0
Leaf	0
Petiole	0
Primary Root	0
Seed	17
Stem	0
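Because the tag abundances are normalized to PPM, the values in Table 1 can be compared directly across tissues. The sketch below uses the Table 1 values; the "floral share" (flower plus flower bud abundance divided by the sum over all listed tissues) is an illustrative summary chosen here, not a metric used in the source.

```python
# Tag abundance in PPM, copied from Table 1.
mpss_ppm = {
    "Flower": 4818, "Pod": 61, "Flower Bud": 2759, "Lateral Root": 0,
    "Leaf": 0, "Petiole": 0, "Primary Root": 0, "Seed": 17, "Stem": 0,
}

total = sum(mpss_ppm.values())
floral = mpss_ppm["Flower"] + mpss_ppm["Flower Bud"]
floral_share = floral / total  # illustrative summary, an assumption of this sketch

def fold_over(tissue, reference):
    """Ratio of two abundances; None when the reference abundance is zero."""
    ref = mpss_ppm[reference]
    return None if ref == 0 else mpss_ppm[tissue] / ref

print(f"floral share of all tags: {floral_share:.3f}")
print(f"flower vs pod: {fold_over('Flower', 'Pod'):.0f}-fold")
print(f"flower vs seed: {fold_over('Flower', 'Seed'):.0f}-fold")
```

On these values, flower and flower bud together account for roughly 99% of the summed tag abundance, and the flower signal is about 79-fold higher than the pod signal.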

Example 2

Quantitative RT-PCR Profiles of SC194 Gene Expression in Soybean

[0155] The MPSS profile of the SC194 gene, PSO375649, was confirmed and extended by analyzing 14 different soybean tissues using the relative quantitative RT-PCR (qRT-PCR) technique with a 7500 real time PCR system (Applied Biosystems, Foster City, Calif.).

[0156] Fourteen soybean tissues (somatic embryo, somatic embryo grown one week on charcoal plate, leaf, leaf petiole, root, flower bud, open flower, R3 pod, R4 seed, R4 pod coat, R5 seed, R5 pod coat, R6 seed, R6 pod coat) were collected from cultivar `Jack` and flash frozen in liquid nitrogen. The seed and pod development stages were defined according to descriptions in Fehr and Caviness, IWSRBC 80:1-12 (1977). Total RNA was extracted with Trizol reagents (Invitrogen, Carlsbad, Calif.) and treated with DNase I to remove any trace amount of genomic DNA contamination. The first strand cDNA was synthesized with Superscript III reverse transcriptase (Invitrogen).

[0157] PCR analysis was performed to confirm that the cDNA was free of genomic DNA. The forward and reverse primers used for PCR analysis are shown in SEQ ID NO:21 and SEQ ID NO:22, respectively. The primers are specific to the 5'UTR intron/exon junction region of a soybean S-adenosylmethionine synthetase gene promoter (WO00/37662). PCR using this primer set amplifies a 967 bp DNA fragment from any soybean genomic DNA template and a 376 bp DNA fragment from the cDNA template. The genomic DNA-free cDNA aliquots were used in qRT-PCR analysis of PSO375649 using gene-specific primers SEQ ID NO:25 and SEQ ID NO:26. An endogenous soybean ATP sulfurylase gene was used as an internal control for normalization with primers SEQ ID NO:27 and SEQ ID NO:28, and soybean wild type genomic DNA was used as the calibrator for relative quantification.
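The text does not state which calculation was used for relative quantification. A common choice for a design with an internal control gene and a calibrator sample is the 2^-ΔΔCt method, sketched below under the assumption of approximately 100% amplification efficiency for both primer sets. All Ct values shown are hypothetical.

```python
def relative_quantity(ct_target, ct_control, ct_target_cal, ct_control_cal):
    """Relative expression by the 2^-ddCt method (assumes ~100% PCR efficiency).

    ct_target, ct_control: Ct of the target and internal control in the sample.
    ct_target_cal, ct_control_cal: the same two Ct values in the calibrator.
    """
    d_ct_sample = ct_target - ct_control
    d_ct_calibrator = ct_target_cal - ct_control_cal
    return 2.0 ** -(d_ct_sample - d_ct_calibrator)

# Hypothetical Ct values, for illustration only.
rq_flower = relative_quantity(20.0, 24.0, 25.0, 25.0)
rq_seed = relative_quantity(24.0, 24.0, 25.0, 25.0)
print(rq_flower, rq_seed, rq_flower / rq_seed)
```

In this hypothetical case a four-cycle difference in normalized Ct corresponds to a 16-fold difference in relative expression between the two tissues.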

[0158] The qRT-PCR profiling of the SC194 gene expression confirmed its predominant flower expression and also showed ongoing expression at levels more than ten fold lower during early pod and seed development (see FIG. 1).

Example 3

Isolation of Soybean SC194 Promoter

[0159] The soybean genomic DNA fragment corresponding to the SC194 promoter was isolated using a polymerase chain reaction (PCR) based approach called genome walking using the Universal GenomeWalker™ kit from Clontech™ (Product User Manual No. PT3042-1).

[0160] Soybean genomic DNA samples were digested, separately, to completion with one of four restriction enzymes, DraI, EcoRV, HpaI, or PmlI, each of which generates DNA fragments having blunt ends. Double strand adaptors supplied in the GenomeWalker™ kit were added to the blunt ends of the genomic DNA fragments by DNA ligase. Two rounds of PCR were performed to amplify the SC194 corresponding genomic DNA fragment using two nested primers supplied in the Universal GenomeWalker™ kit that are specific to the adaptor sequence (AP1 and AP2, for the first and second adaptor primer, respectively), and two SC194 gene specific primers (GSP1 and GSP2) designed based on the 5' coding sequence of SC194 (PSO375649). The oligonucleotide sequences of the four primers are shown in SEQ ID NO:15 (GSP1), SEQ ID NO:16 (AP1), SEQ ID NO:17 (GSP2), and SEQ ID NO:18 (AP2). The GSP2 primer contains a recognition site for the restriction enzyme NcoI. The AP2 primer from the Universal GenomeWalker™ kit contains a SalI restriction site. The 3' end of the adaptor sequence SEQ ID NO:23 contains an XmaI recognition site downstream of the corresponding SalI restriction site in the AP2 primer.

[0161] The AP1 and the GSP1 primers were used in the first round PCR using each of the adaptor ligated genomic DNA samples (DraI, EcoRV, HpaI or PmlI) under conditions defined in the GenomeWalker™ protocol. Cycle conditions were 94°C for 4 minutes; 35 cycles of 94°C for 30 seconds, 60°C for 1 minute, and 68°C for 3 minutes; and a final 68°C for 5 minutes before holding at 4°C. One microliter from each of the first round PCR products was used as template for the second round PCR with the AP2 and GSP2 primers. Cycle conditions for second round PCR were 94°C for 4 minutes; 25 cycles of 94°C for 30 seconds, 60°C for 1 minute, and 68°C for 3 minutes; and a final 68°C for 5 minutes before holding at 4°C. Agarose gels were run to identify a specific PCR product with an optimal fragment length. An approximately 1.3 Kb PCR product was detected and subsequently cloned into pCR2.1-TOPO vector by TOPO TA cloning (Invitrogen). Sequencing of the cloned PCR product revealed that its 3' end matched the 84 bp 5' end of the PSO375649 cDNA sequence, indicating that the PCR product was indeed the corresponding SC194 genomic DNA fragment. The 1358 bp genomic DNA sequence upstream of the putative SC194 start codon ATG is herein designated as soybean SC194 promoter (SEQ ID NO:1).
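As a worked check of the two cycling programs, the total programmed hold time of each round can be computed from the step durations given above. This is a minimal sketch; it ignores the time the thermal cycler spends ramping between temperatures, which depends on the instrument.

```python
def program_minutes(initial, cycles, steps, final):
    """Total programmed hold time in minutes, ignoring ramping between steps."""
    return initial + cycles * sum(steps) + final

# Step durations in minutes, from the text: denature 0.5, anneal 1, extend 3.
steps = (0.5, 1.0, 3.0)
first_round = program_minutes(4, 35, steps, 5)   # 35 cycles
second_round = program_minutes(4, 25, steps, 5)  # 25 cycles
print(first_round, second_round)
```

The first round therefore occupies 166.5 minutes of programmed time and the second round 121.5 minutes, the difference coming entirely from the ten fewer cycles in the nested reaction.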

Example 4

SC194 Promoter Copy Number Analysis

[0162] Southern hybridization analysis was performed to determine whether there were other sequences in the soybean genome with high similarity to the SC194 promoter. Soybean `Jack` wild type genomic DNA was digested with nine different restriction enzymes, BamHI, BglII, DraI, EcoRI, EcoRV, HindIII, MfeI, NdeI, and SpeI, each separately, and separated on a 0.7% agarose gel by electrophoresis. Each of the digested DNA samples was blotted onto a nylon membrane and hybridized with a digoxigenin (DIG) labeled SC194 promoter DNA probe according to the standard protocol (Roche Applied Science, Indianapolis, Ind.). The SC194 promoter probe was labeled by PCR using the DIG DNA labeling kit (Roche Applied Science) with two gene specific primers SEQ ID NO:12 and SEQ ID NO:8 to make a 685 bp probe described in SEQ ID NO:5 covering the 3' half of the SC194 promoter sequence.

[0163] Since none of the above nine different restriction enzymes cuts inside the SC194 probe region as illustrated in FIG. 2B, a single band is expected to be hybridized by the SC194 probe in each of the lanes if there is only a single copy of the SC194 promoter sequence in soybean genome. A strong major band and a weak minor band were detected in each of eight digestion lanes, BamHI, BglII, DraI, EcoRI, EcoRV, HindIII, MfeI, and NdeI, suggesting that there is, in addition to the SC194 promoter sequence, another sequence with enough similarity to the SC194 promoter sequence to be hybridized though less effectively by the same SC194 probe (FIG. 2A). The fact that only one band was detected on the Southern blot of the SpeI digestion could be explained if two bands representing the SC194 promoter sequence and the other similar sequence, respectively, were similar in size to show as one overlapping band, or if the other similar sequence resulted in a band too small to be kept on the blot (any band smaller than 1 Kb would run out of the agarose gel under the experiment conditions).
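The single-band expectation rests on none of the nine enzymes cutting within the probe region. That precondition can be checked by scanning the probe sequence for each recognition site, as sketched below. The recognition sequences are the standard ones for these enzymes; the probe sequences are hypothetical stand-ins, since SEQ ID NO:5 is not reproduced here.

```python
# Standard recognition sequences; all nine are palindromic, so scanning
# one strand is sufficient.
SITES = {
    "BamHI": "GGATCC", "BglII": "AGATCT", "DraI": "TTTAAA",
    "EcoRI": "GAATTC", "EcoRV": "GATATC", "HindIII": "AAGCTT",
    "MfeI": "CAATTG", "NdeI": "CATATG", "SpeI": "ACTAGT",
}

def internal_cutters(probe):
    """Return the names of enzymes whose recognition site occurs in the probe."""
    probe = probe.upper()
    return sorted(name for name, site in SITES.items() if site in probe)

# Hypothetical probe sequences, for illustration only.
probe_without_sites = "ATGCCGTAGGCTAACGGTCATTGCA"
probe_with_ecori = probe_without_sites + "GAATTC"
print(internal_cutters(probe_without_sites))
print(internal_cutters(probe_with_ecori))
```

An empty result for a probe means each locus should appear as one band per digestion lane, so a second band points to a second hybridizing sequence rather than to an internal cut.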

Example 5

SC194:YFP Reporter Constructs and Soybean Transformation

[0164] The cloned SC194 promoter PCR fragment described in EXAMPLE 3 was digested with XmaI and NcoI, gel purified using a DNA gel extraction kit (Qiagen, Valencia, Calif.) and directionally cloned into the XmaI and NcoI site of a Gateway cloning ready vector QC299 (FIG. 3 and SEQ ID NO:40) containing a promoter-less fluorescent reporter gene ZS-YELLOW1 N1 (YFP) to make the reporter construct QC300 (SEQ ID NO:41) with the soybean SC194 promoter driving the YFP gene expression (FIG. 3). The SC194:YFP expression cassette in construct QC300 was linked to the SAMS:ALS (S-adenosyl methionine synthetase:acetolactate synthase) expression cassette in construct PHP25224 (FIG. 3 and SEQ ID NO:42) by Gateway cloning to create construct QC302 (FIG. 3 and SEQ ID NO:43). The linked SC194:YFP and SAMS:ALS cassettes were released as a 6431 bp DNA fragment from construct QC302 by AscI restriction digestion, separated from the vector backbone fragment by agarose gel electrophoresis, and purified from the gel using a Qiagen DNA gel extraction kit. The purified DNA fragment was used to transform soybean cultivar Jack using the particle gun bombardment method (Klein et al., Nature 327:70-73 (1987); U.S. Pat. No. 4,945,050) to study the SC194 promoter activity in stably transformed soybean plants.

[0165] Soybean somatic embryos from the Jack cultivar were induced as follows. Cotyledons (smaller than 3 mm in length) were dissected from surface-sterilized, immature seeds and were cultured for 6-10 weeks under fluorescent light at 26°C on a Murashige and Skoog media ("MS media") containing 0.7% agar and supplemented with 10 mg/ml 2,4-dichlorophenoxyacetic acid (2,4-D). Globular stage somatic embryos, which produced secondary embryos, were then excised and placed into flasks containing liquid MS medium supplemented with 2,4-D (10 mg/ml) and cultured in the light on a rotary shaker. After repeated selection for clusters of somatic embryos that multiplied as early, globular staged embryos, the soybean embryogenic suspension cultures were maintained in 35 ml liquid media on a rotary shaker, 150 rpm, at 26°C with fluorescent lights on a 16:8 hour day/night schedule. Cultures were subcultured every two weeks by inoculating approximately 35 mg of tissue into 35 ml of the same fresh liquid MS medium.

[0166] Soybean embryogenic suspension cultures were then transformed by the method of particle gun bombardment using a DuPont Biolistic™ PDS1000/HE instrument (helium retrofit) (Bio-Rad Laboratories, Hercules, Calif.). To 50 µl of a 60 mg/ml 1.0 mm gold particle suspension were added (in order): 30 µl of 10 ng/µl SC194:YFP+SAMS:ALS DNA fragment, 20 µl of 0.1 M spermidine, and 25 µl of 5 M CaCl2. The particle preparation was then agitated for 3 minutes, spun in a centrifuge for 10 seconds and the supernatant removed. The DNA-coated particles were then washed once in 400 µl 100% ethanol and resuspended in 45 µl of 100% ethanol. The DNA/particle suspension was sonicated three times for one second each. 5 µl of the DNA-coated gold particles was then loaded on each macro carrier disk.
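The quantities in the particle preparation can be worked through as simple arithmetic. The sketch below assumes, for illustration only, that all DNA binds the gold and that no material is lost during the spin and ethanol wash; actual recoveries would be lower.

```python
# Volumes in microliters and concentrations as stated in the text.
gold_mg = 50 * 60 / 1000         # 50 ul of 60 mg/ml suspension -> mg of gold
dna_ng = 30 * 10                 # 30 ul of 10 ng/ul fragment -> ng of DNA

# Final concentrations in the coating mix before the spin.
mix_volume_ul = 50 + 30 + 20 + 25
spermidine_molar = 0.1 * 20 / mix_volume_ul
cacl2_molar = 5 * 25 / mix_volume_ul

# Loading: 45 ul final suspension at 5 ul per macro carrier disk.
disks = 45 // 5
dna_per_disk_ng = dna_ng / disks      # assumes complete binding, no loss
gold_per_disk_mg = gold_mg / disks    # assumes no loss during washing

print(gold_mg, dna_ng, mix_volume_ul, disks)
print(round(spermidine_molar, 3), round(cacl2_molar, 3))
print(round(dna_per_disk_ng, 1), round(gold_per_disk_mg, 3))
```

Under these assumptions one preparation holds 3 mg of gold and 300 ng of DNA, coats in a mix of about 16 mM spermidine and 1 M CaCl2, and yields nine disks carrying roughly 33 ng of DNA each.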

[0167] Approximately 300-400 mg of a two-week-old suspension culture was placed in an empty 60×15 mm Petri dish and the residual liquid removed from the tissue with a pipette. For each transformation experiment, approximately 5 to 10 plates of tissue were bombarded. Membrane rupture pressure was set at 1100 psi and the chamber was evacuated to a vacuum of 28 inches mercury. The tissue was placed approximately 3.5 inches away from the retaining screen and bombarded once. Following bombardment, the tissue was divided in half and placed back into liquid media and cultured as described above.

[0168] Five to seven days post bombardment, the liquid media was exchanged with fresh media containing 100 ng/ml chlorsulfuron as selection agent. This selective media was refreshed weekly. Seven to eight weeks post bombardment, green, transformed tissue was observed growing from untransformed, necrotic embryogenic clusters. Isolated green tissue was removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Each clonally propagated culture was treated as an independent transformation event and subcultured in the same liquid MS media supplemented with 2,4-D (10 mg/ml) and 100 ng/ml chlorsulfuron selection agent to increase mass. The embryogenic suspension cultures were then transferred to solid agar MS media plates without 2,4-D supplement to allow somatic embryos to develop. A sample of each event was collected at this stage for PCR and quantitative PCR analysis.

[0169] Cotyledon stage somatic embryos were dried-down (by transferring them into an empty small Petri dish that was seated on top of a 10 cm Petri dish to allow slow dry down) to mimic the last stages of soybean seed development. Dried-down embryos were placed on germination solid media, and transgenic soybean plantlets were regenerated. The transgenic plants were then transferred to soil and maintained in growth chambers for seed production.

[0170] Genomic DNA was extracted from somatic embryo samples and analyzed by quantitative PCR using the 7500 real time PCR system (Applied Biosystems) with gene-specific primers and 6-carboxyfluorescein (FAM)-labeled fluorescence probes to check copy numbers of both the SAMS:ALS expression cassette and the SC194:YFP expression cassette. The qPCR analysis was done in duplex reactions with a heat shock protein (HSP) gene as the endogenous control and a transgenic DNA sample with a known single copy of SAMS:ALS or YFP transgene as the calibrator using the relative quantification methodology. The endogenous control HSP probe was labeled with VIC (Applera Corporation, Norwalk, Conn.) and the target gene SAMS or YFP probe was labeled with FAM for the simultaneous detection of both fluorescent probes in the same duplex reactions. The primers and probes used in the qPCR analysis are listed below.

SAMS forward primer: SEQ ID NO:29
FAM labeled SAMS probe: SEQ ID NO:30
SAMS reverse primer: SEQ ID NO:31
YFP forward primer: SEQ ID NO:32
FAM labeled YFP probe: SEQ ID NO:33
YFP reverse primer: SEQ ID NO:34
HSP forward primer: SEQ ID NO:35
VIC labeled HSP probe: SEQ ID NO:36
HSP reverse primer: SEQ ID NO:37

FAM labeled DNA oligo probes and VIC labeled oligo probes were obtained from Sigma Genosys (The Woodlands, Tex.).

[0171] Only transgenic soybean events containing 1 or 2 copies of both the SAMS:ALS expression cassette and the SC194:YFP expression cassette were selected for further gene expression evaluation and seed production (see Table 2). Events negative for YFP qPCR or with more than 2 copies for the SAMS or YFP qPCR were terminated. YFP expression detection in flowers as described in EXAMPLE 8 is also recorded in the same table.

TABLE 2

Event ID	SAMS qPCR	YFP qPCR	YFP Expression
4775.1.1	1.30	1.16	-
4775.1.3	1.01	1.26	-
4775.3.1	1.24	1.33	+
4775.3.2	1.17	1.36	-
4775.3.3	1.79	1.38	+
4775.3.4	2.08	1.29	+
4775.4.1	1.18	1.43	+
4775.5.1	1.47	1.11	+
4775.6.2	0.93	1.06	+
4775.7.2	1.43	1.20	+
4775.1.4	1.31	1.39	-
4775.2.2	1.12	1.13	+
4775.3.5	1.28	1.89	-
4775.3.6	2.48	1.17	+
4775.3.7	1.30	1.21	+
4775.8.2	1.28	1.30	+
4775.2.3	2.33	1.91	+
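The selection rule of retaining events with 1 or 2 copies of both cassettes can be applied to the Table 2 values as follows. The text does not say how a fractional qPCR value was converted to a copy number; rounding to the nearest integer is an assumption of this sketch, chosen because it is consistent with every event listed in the table being retained.

```python
# (event ID, SAMS qPCR, YFP qPCR, YFP expression detected), copied from Table 2.
events = [
    ("4775.1.1", 1.30, 1.16, False), ("4775.1.3", 1.01, 1.26, False),
    ("4775.3.1", 1.24, 1.33, True),  ("4775.3.2", 1.17, 1.36, False),
    ("4775.3.3", 1.79, 1.38, True),  ("4775.3.4", 2.08, 1.29, True),
    ("4775.4.1", 1.18, 1.43, True),  ("4775.5.1", 1.47, 1.11, True),
    ("4775.6.2", 0.93, 1.06, True),  ("4775.7.2", 1.43, 1.20, True),
    ("4775.1.4", 1.31, 1.39, False), ("4775.2.2", 1.12, 1.13, True),
    ("4775.3.5", 1.28, 1.89, False), ("4775.3.6", 2.48, 1.17, True),
    ("4775.3.7", 1.30, 1.21, True),  ("4775.8.2", 1.28, 1.30, True),
    ("4775.2.3", 2.33, 1.91, True),
]

def copy_call(value):
    """Nearest-integer copy number call (an assumed rule, not from the source)."""
    return round(value)

selected = [
    e for e in events
    if copy_call(e[1]) in (1, 2) and copy_call(e[2]) in (1, 2)
]
expressing = [e[0] for e in selected if e[3]]
print(len(events), len(selected), len(expressing))
```

Under this rule all 17 listed events pass the copy number filter, and 12 of them show YFP expression in flowers.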

Example 6

Construction of SC194 Promoter Deletion Constructs

[0172] To define the transcriptional elements controlling the SC194 promoter activity, six 5' unidirectional deletion fragments SEQ ID NO:2 of 1328 bp, SEQ ID NO:3 of 1134 bp, SEQ ID NO:4 of 932 bp, SEQ ID NO:5 of 685 bp, SEQ ID NO:6 of 472 bp, and SEQ ID NO:7 of 237 bp were made by utilizing PCR amplification and the full length soybean SC194 promoter contained in the original construct QC300 (FIG. 3) as DNA template. The same antisense primer (SEQ ID NO:8) was used in the amplification of the six SC194 promoter fragments by pairing with different sense primers SEQ ID NOs: 9, 10, 11, 12, 13, and 14 respectively, to produce the promoter fragments represented by SEQ ID NOs: 2, 3, 4, 5, 6, and 7.
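Because all six fragments were amplified with the same antisense primer (SEQ ID NO:8), they share a common 3' end, and the difference in length between consecutive fragments equals the number of bases removed from the 5' end at each step. The sketch below computes these intervals from the fragment lengths given above.

```python
# Fragment lengths in bp, from the text, ordered longest to shortest.
fragments = {
    "SEQ ID NO:2": 1328, "SEQ ID NO:3": 1134, "SEQ ID NO:4": 932,
    "SEQ ID NO:5": 685, "SEQ ID NO:6": 472, "SEQ ID NO:7": 237,
}

names = list(fragments)
lengths = list(fragments.values())

# Bases removed from the 5' end between each fragment and the next shorter one.
deletion_steps = [a - b for a, b in zip(lengths, lengths[1:])]

for longer, shorter, step in zip(names, names[1:], deletion_steps):
    print(f"{longer} -> {shorter}: {step} bp removed from the 5' end")
```

Each interval of roughly 200 to 250 bp defines a region whose loss can be tested for an effect on reporter expression when adjacent constructs are compared.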

[0173] Each of the PCR amplified promoter DNA fragments was cloned into the Gateway cloning-ready TA cloning vector pCR8/GW/TOPO (Invitrogen, Carlsbad, Calif.; FIG. 4 and SEQ ID NO:44), and clones with the insert in the correct orientation, relative to the Gateway recombination sites attL1 and attL2 in the pCR8/GW/TOPO vector, were selected by AflII restriction enzyme digestion analysis or sequence confirmation (see the example map QC300-1 (SEQ ID NO:45) in FIG. 4, which contains the 1328 bp SC194 promoter deletion fragment SEQ ID NO:2). The maps of constructs QC300-2, QC300-3, QC300-4, QC300-5, and QC300-6 containing the SC194 promoter deletion fragments SEQ ID NOs:3, 4, 5, 6, and 7 were similar. The promoter fragment in the right orientation was subsequently cloned into the Gateway destination vector QC330 (FIG. 4 and SEQ ID NO:46) by Gateway LR clonase reaction (Invitrogen) to place the promoter fragment in front of the reporter gene YFP (see the example map QC300-1Y (SEQ ID NO:47) in FIG. 4, which contains the 1328 bp SC194 promoter deletion fragment SEQ ID NO:2). A 21 bp Gateway recombination site attB2 (SEQ ID NO:39) was inserted between the promoter and the YFP reporter gene coding region as a result of the Gateway cloning process. Another 21 bp Gateway recombination site attB1 (SEQ ID NO:38) was left at the 5' end of the SC194 promoter. The maps of constructs QC300-2Y (SEQ ID NO:48), QC300-3Y (SEQ ID NO:49), QC300-4Y (SEQ ID NO:50), QC300-5Y (SEQ ID NO:51), and QC300-6Y (SEQ ID NO:52) containing the SC194 promoter deletion fragments SEQ ID NOs: 3, 4, 5, 6, and 7 were similar.
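Because TA cloning can place an insert in either orientation, the orientation has to be checked. The following sketch shows one way such a check could look in silico. It is an assumption-laden illustration, not the procedure used here: the insert is a 60 bp excerpt of SEQ ID NO:7 (bases 121 to 180), the vector flanks are placeholders because the pCR8/GW/TOPO sequence (SEQ ID NO:44) is not reproduced in this section, and the helper names are hypothetical. The AflII recognition sequence is CTTAAG.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")
AFLII_SITE = "cttaag"  # AflII recognition sequence (palindromic)


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def insert_orientation(clone, insert):
    """Classify the insert as 'forward', 'reverse', or 'absent' in the clone."""
    if insert in clone:
        return "forward"
    if reverse_complement(insert) in clone:
        return "reverse"
    return "absent"


def site_offsets(seq, site=AFLII_SITE):
    """0-based start positions of every occurrence of the site in seq."""
    n = len(site)
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == site]


# 60 bp excerpt of SEQ ID NO:7 (bases 121-180); contains one AflII site.
INSERT = "tctcccttaagccttcaacacatacaaaagtgacacaccaaatcaaagacacctgagcca"

# Placeholder flanks standing in for vector sequence (hypothetical).
LEFT, RIGHT = "acgtacgtac", "tgcatgcatg"

forward_clone = LEFT + INSERT + RIGHT
reverse_clone = LEFT + reverse_complement(INSERT) + RIGHT

for name, clone in (("forward", forward_clone), ("reverse", reverse_clone)):
    print(name, insert_orientation(clone, INSERT), site_offsets(clone))
```

The site sits off-center in the insert, so its distance from a fixed vector position differs between the two orientations. That asymmetry is what makes a restriction digest informative about orientation.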

[0174] The SC194:YFP promoter deletion constructs QC300-1Y, QC300-2Y, QC300-3Y, QC300-4Y, QC300-5Y, and QC300-6Y were ready to be transformed into germinating soybean cotyledons by the gene gun bombardment method for transient gene expression studies. The 1358 bp full length SC194 promoter in construct QC300 was included as a positive control for transient expression analysis. A simple schematic description of the six SC194 promoter deletion fragments can be found in FIG. 5.
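The deletion series can also be summarized numerically. The sketch below is a rough text rendering, not a reproduction of FIG. 5. It takes the fragment lengths stated above, reports each length difference from the 1358 bp full length promoter, and draws right-aligned bars to reflect the shared 3' end. The function names and the bar width are arbitrary choices.

```python
FULL_LENGTH = 1358  # bp, full length SC194 promoter in QC300

# Construct name -> promoter fragment length in bp (from the text above).
FRAGMENTS = {
    "QC300-1Y": 1328,
    "QC300-2Y": 1134,
    "QC300-3Y": 932,
    "QC300-4Y": 685,
    "QC300-5Y": 472,
    "QC300-6Y": 237,
}


def length_difference(length, full=FULL_LENGTH):
    """Length difference (bp) between a fragment and the full length promoter."""
    return full - length


def schematic(fragments, full=FULL_LENGTH, width=60):
    """Right-aligned bars, so all constructs share the same 3' end."""
    lines = []
    for name, length in [("QC300", full)] + list(fragments.items()):
        bar = round(width * length / full)
        lines.append(f"{name:<9}{' ' * (width - bar)}{'=' * bar} {length} bp")
    return "\n".join(lines)


print(schematic(FRAGMENTS))
for name, length in FRAGMENTS.items():
    print(f"{name}: {length_difference(length)} bp shorter than full length")
```

The successive size steps (30, 194, 202, 247, 213, and 235 bp) show that, apart from the first, the truncations are spaced at roughly 200 bp intervals.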

Example 7

Transient Expression Analysis of SC194:YFP Constructs

[0175] The full length SC194 promoter construct QC300 and its deletion series constructs QC300-1Y, 2Y, 3Y, 4Y, 5Y, and 6Y were tested by the YFP gene transient expression assay using germinating soybean cotyledons as the target tissue. Soybean seeds were rinsed with 10% Tween 20 in sterile water, then surface-sterilized with 70% ethanol for 2 minutes followed by 6% sodium hypochlorite for 15 minutes. After rinsing, the seeds were placed on wet filter paper in a Petri dish to germinate for 4-6 days under fluorescent light at 26° C. Green cotyledons were excised and placed inner side up on a 0.7% agar plate containing MS media for particle gun bombardment.

[0176] The DNA and gold particle mixtures were prepared as described in EXAMPLE 5, except that more DNA (100 ng/µl) was used. The bombardments were also carried out under parameters similar to those described in EXAMPLE 5. YFP expression was checked under a Leica MZFLIII stereo microscope equipped with a UV light source and appropriate light filters (Leica Microsystems Inc., Bannockburn, Ill.). All microscopic pictures were taken approximately 24 hours after bombardment at 8× magnification under the same camera settings: 1.06 gamma, 0.0% gain, and 0.58 seconds exposure.

[0177] In the transient expression assay, the full length SC194 promoter construct QC300 expressed YFP, but much more weakly than the positive control construct pZSL90 (SEQ ID NO:53), which contained the strong constitutive promoter SCP1 (U.S. Pat. No. 6,072,050), as shown by the different sizes of the green dots (FIG. 6A, H). Each dot represented a single cotyledon cell, which appeared larger if the fluorescence was strong or smaller if the fluorescence was weak, even under the same magnification. The QC300-1Y and QC300-2Y constructs, containing, respectively, the 1328 bp and 1134 bp truncated SC194 promoter fragments with the attB2 Gateway recombination site (Invitrogen) inserted between the SC194 promoter and YFP, had similar expression that also appeared to be weaker than that of the full length SC194 promoter (FIG. 6B, C). The 932 bp truncated SC194 promoter construct QC300-3Y (FIG. 6D) had clearly lower expression than the three longer SC194 promoter constructs above. Further truncations of the SC194 promoter to 685 bp in construct QC300-4Y and to 472 bp in construct QC300-5Y further reduced the promoter activity, as indicated by the fewer and smaller fluorescent dots (FIG. 6E, F). Even when the SC194 promoter was truncated to the 237 bp minimal size in construct QC300-6Y, the promoter fragment still retained very low level activity, with only a few faint green dots barely detectable (FIG. 6G).

Example 8

SC194:YFP Expression in Stable Transgenic Soybean Plants

[0178] YFP gene expression was checked at different stages of transgenic plant development for yellow fluorescence emission under a Leica MZFLIII stereo microscope equipped with UV light source and appropriate light filters (Leica Microsystems Inc., Bannockburn, Ill.). No specific yellow fluorescence was detected during somatic embryo development or in vegetative tissues such as leaf, petiole, stem, or root of the transgenic plants. Fluorescence was only detected in flowers.

[0179] A soybean flower consists of five sepals and five petals, including one large upper petal called the standard, two large side petals, and two small fused lower petals called the keel, which enclose ten stamens and one pistil. The filaments of the ten stamens fuse together to form a sheath that encloses the pistil and separate into 10 branches only at the top, each bearing an anther. The pistil consists of a stigma, a style, and an ovary in which there are normally 2-4 ovules that will eventually develop into seeds.

[0180] Specific fluorescence signal (green color) was first detected at the junctions between anthers and filaments, and also in the distal part of petals, in young flower buds when the petals were still completely enclosed by sepals (FIG. 7A). In older flower buds and open flowers, fluorescence spread throughout all petals and the entire filaments but was still concentrated at the anther and filament junctions (FIG. 7B, C, D). No specific fluorescence was detected in sepals or in the flower pedicel, which displayed red autofluorescence resulting from green plant tissues (FIG. 7A, C, D). Fluorescence was detected in the style but not in the ovary part of the pistil (FIG. 7F). Under higher magnification, no YFP fluorescence appeared to be detectable in stigma or in pollen, though it is noted that autofluorescence was strong in pollen (FIG. 7E, G). The yellow autofluorescence in pollen was even stronger under a non-specific UV light filter, while YFP-specific greenish fluorescence disappeared under the same non-specific filter. When an open flower was dissected longitudinally to expose the inside of the ovary, no fluorescence was detected in the inside ovary wall or in any of the ovules (FIG. 7D). Similarly, no fluorescence was detected in any part of young or old developing pods or seeds (FIG. 7H).

[0181] In conclusion, SC194:YFP expression was detected only in the petals, filaments, and style of a soybean flower, and was strongest at the anther and filament junctions. The expression was first detectable in young flower buds when the petals were still completely enclosed by sepals. No expression was detectable in other parts of the flower such as ovary, stigma, or pollen, or in other tissues such as leaf, root, petiole, pod coat, or developing seeds of transgenic soybean plants.

[0182] Twelve out of 17 transgenic events expressed YFP in the manner described in detail above (Table 2). The other five events contained the transgene as revealed by qPCR but failed to express YFP.

Sequence Listing

5311358DNAGlycine max 1gggctggtaa cctagttaat aaattaaaag gagaacatta ttaatgtgaa aatcatgcaa 60acttaaaaaa atcatcaaca acataatttt ataattctaa taaaatattt ttttctttta 120attctttaat caatgtctaa catttatcta ttatttatca catttgttat ttaatgtttc 180tatctttaga gctatcaaaa atttaaaatg gtggaacctt actcattggg ttgagttcac 240ctaacttgtt taataaatag atcaatctaa ttctattcat ctcttagtaa gtattaaaaa 300tgttggccca actctccata tattggtgag ttataggagt ttactcactt aaaatgataa 360taaaaatatt tgttttaaaa tcatttttta aacaaaaaaa taatgtttca gattatttat 420tcttagatca taacttacaa gcaacatttc aatgatcaat tcaattgtca gaatcaaaac 480caattgaaag agacaaatat tcatgctaat cttcatcaga aactaaacat tgacataaag 540caatagtatt ggaactacaa gttataatta tgtactttgt aatagtgtga agaaaatcaa 600aatacaaata gtaatcatca tgataaatgc tatctcaatt tattcaatta taaaaatata 660gaaataaaat gtgataaatg gataacatgt gtgctaatcc agtccactac gcccaccaca 720agttcaaccc aatggactgg atcatcttct ttttttctta ctgatttctc tcttcttcca 780ttctaatcca tcccaaaagt agatgtttac tatttcccct ttcatagttt cacaagtgtg 840cgcagaggcc aaactgaaag tggtagtaca tggtgtaata ttaatcacag atgtgctctc 900atgaagtctg aacttacagc tcaagtaaca accaacaagt aaaaagtaca gaagatagca 960taaaaaatga aggtagaaca aattccaagt tttctacata ttacggtgca taaatcaacc 1020acgtgaaggc tccatttatt tgccgctata acattggtga ccctcttcca caaatagtaa 1080gtaataaaac caagtacaaa aaaatgttca actaccaagt gatcacaatc ttcatgcatc 1140tgagtcacac tattgccctt tgctcatgaa gtacacttta ctcaccgcca aagttcactc 1200aacactgtag aacaaaggaa tcatataaat aatgcatatc tctcccttaa gccttcaaca 1260catacaaaag tgacacacca aatcaaagac acctgagcca ttcaattccc ctcctttatt 1320gctttcaagt ttcaacacta attttattat ctgaaacc 135821328DNAGlycine max 2ggagaacatt attaatgtga aaatcatgca aacttaaaaa aatcatcaac aacataattt 60tataattcta ataaaatatt tttttctttt aattctttaa tcaatgtcta acatttatct 120attatttatc acatttgtta tttaatgttt ctatctttag agctatcaaa aatttaaaat 180ggtggaacct tactcattgg gttgagttca cctaacttgt ttaataaata gatcaatcta 240attctattca tctcttagta agtattaaaa atgttggccc aactctccat atattggtga 300gttataggag tttactcact taaaatgata 
ataaaaatat ttgttttaaa atcatttttt 360aaacaaaaaa ataatgtttc agattattta ttcttagatc ataacttaca agcaacattt 420caatgatcaa ttcaattgtc agaatcaaaa ccaattgaaa gagacaaata ttcatgctaa 480tcttcatcag aaactaaaca ttgacataaa gcaatagtat tggaactaca agttataatt 540atgtactttg taatagtgtg aagaaaatca aaatacaaat agtaatcatc atgataaatg 600ctatctcaat ttattcaatt ataaaaatat agaaataaaa tgtgataaat ggataacatg 660tgtgctaatc cagtccacta cgcccaccac aagttcaacc caatggactg gatcatcttc 720tttttttctt actgatttct ctcttcttcc attctaatcc atcccaaaag tagatgttta 780ctatttcccc tttcatagtt tcacaagtgt gcgcagaggc caaactgaaa gtggtagtac 840atggtgtaat attaatcaca gatgtgctct catgaagtct gaacttacag ctcaagtaac 900aaccaacaag taaaaagtac agaagatagc ataaaaaatg aaggtagaac aaattccaag 960ttttctacat attacggtgc ataaatcaac cacgtgaagg ctccatttat ttgccgctat 1020aacattggtg accctcttcc acaaatagta agtaataaaa ccaagtacaa aaaaatgttc 1080aactaccaag tgatcacaat cttcatgcat ctgagtcaca ctattgccct ttgctcatga 1140agtacacttt actcaccgcc aaagttcact caacactgta gaacaaagga atcatataaa 1200taatgcatat ctctccctta agccttcaac acatacaaaa gtgacacacc aaatcaaaga 1260cacctgagcc attcaattcc cctcctttat tgctttcaag tttcaacact aattttatta 1320tctgaaac 132831134DNAGlycine max 3cattgggttg agttcaccta acttgtttaa taaatagatc aatctaattc tattcatctc 60ttagtaagta ttaaaaatgt tggcccaact ctccatatat tggtgagtta taggagttta 120ctcacttaaa atgataataa aaatatttgt tttaaaatca ttttttaaac aaaaaaataa 180tgtttcagat tatttattct tagatcataa cttacaagca acatttcaat gatcaattca 240attgtcagaa tcaaaaccaa ttgaaagaga caaatattca tgctaatctt catcagaaac 300taaacattga cataaagcaa tagtattgga actacaagtt ataattatgt actttgtaat 360agtgtgaaga aaatcaaaat acaaatagta atcatcatga taaatgctat ctcaatttat 420tcaattataa aaatatagaa ataaaatgtg ataaatggat aacatgtgtg ctaatccagt 480ccactacgcc caccacaagt tcaacccaat ggactggatc atcttctttt tttcttactg 540atttctctct tcttccattc taatccatcc caaaagtaga tgtttactat ttcccctttc 600atagtttcac aagtgtgcgc agaggccaaa ctgaaagtgg tagtacatgg tgtaatatta 660atcacagatg tgctctcatg aagtctgaac ttacagctca agtaacaacc 
aacaagtaaa 720aagtacagaa gatagcataa aaaatgaagg tagaacaaat tccaagtttt ctacatatta 780cggtgcataa atcaaccacg tgaaggctcc atttatttgc cgctataaca ttggtgaccc 840tcttccacaa atagtaagta ataaaaccaa gtacaaaaaa atgttcaact accaagtgat 900cacaatcttc atgcatctga gtcacactat tgccctttgc tcatgaagta cactttactc 960accgccaaag ttcactcaac actgtagaac aaaggaatca tataaataat gcatatctct 1020cccttaagcc ttcaacacat acaaaagtga cacaccaaat caaagacacc tgagccattc 1080aattcccctc ctttattgct ttcaagtttc aacactaatt ttattatctg aaac 11344932DNAGlycine max 4gatcataact tacaagcaac atttcaatga tcaattcaat tgtcagaatc aaaaccaatt 60gaaagagaca aatattcatg ctaatcttca tcagaaacta aacattgaca taaagcaata 120gtattggaac tacaagttat aattatgtac tttgtaatag tgtgaagaaa atcaaaatac 180aaatagtaat catcatgata aatgctatct caatttattc aattataaaa atatagaaat 240aaaatgtgat aaatggataa catgtgtgct aatccagtcc actacgccca ccacaagttc 300aacccaatgg actggatcat cttctttttt tcttactgat ttctctcttc ttccattcta 360atccatccca aaagtagatg tttactattt cccctttcat agtttcacaa gtgtgcgcag 420aggccaaact gaaagtggta gtacatggtg taatattaat cacagatgtg ctctcatgaa 480gtctgaactt acagctcaag taacaaccaa caagtaaaaa gtacagaaga tagcataaaa 540aatgaaggta gaacaaattc caagttttct acatattacg gtgcataaat caaccacgtg 600aaggctccat ttatttgccg ctataacatt ggtgaccctc ttccacaaat agtaagtaat 660aaaaccaagt acaaaaaaat gttcaactac caagtgatca caatcttcat gcatctgagt 720cacactattg ccctttgctc atgaagtaca ctttactcac cgccaaagtt cactcaacac 780tgtagaacaa aggaatcata taaataatgc atatctctcc cttaagcctt caacacatac 840aaaagtgaca caccaaatca aagacacctg agccattcaa ttcccctcct ttattgcttt 900caagtttcaa cactaatttt attatctgaa ac 9325685DNAGlycine max 5gataaatgga taacatgtgt gctaatccag tccactacgc ccaccacaag ttcaacccaa 60tggactggat catcttcttt ttttcttact gatttctctc ttcttccatt ctaatccatc 120ccaaaagtag atgtttacta tttccccttt catagtttca caagtgtgcg cagaggccaa 180actgaaagtg gtagtacatg gtgtaatatt aatcacagat gtgctctcat gaagtctgaa 240cttacagctc aagtaacaac caacaagtaa aaagtacaga agatagcata aaaaatgaag 300gtagaacaaa ttccaagttt tctacatatt acggtgcata 
aatcaaccac gtgaaggctc 360catttatttg ccgctataac attggtgacc ctcttccaca aatagtaagt aataaaacca 420agtacaaaaa aatgttcaac taccaagtga tcacaatctt catgcatctg agtcacacta 480ttgccctttg ctcatgaagt acactttact caccgccaaa gttcactcaa cactgtagaa 540caaaggaatc atataaataa tgcatatctc tcccttaagc cttcaacaca tacaaaagtg 600acacaccaaa tcaaagacac ctgagccatt caattcccct cctttattgc tttcaagttt 660caacactaat tttattatct gaaac 6856472DNAGlycine max 6cacagatgtg ctctcatgaa gtctgaactt acagctcaag taacaaccaa caagtaaaaa 60gtacagaaga tagcataaaa aatgaaggta gaacaaattc caagttttct acatattacg 120gtgcataaat caaccacgtg aaggctccat ttatttgccg ctataacatt ggtgaccctc 180ttccacaaat agtaagtaat aaaaccaagt acaaaaaaat gttcaactac caagtgatca 240caatcttcat gcatctgagt cacactattg ccctttgctc atgaagtaca ctttactcac 300cgccaaagtt cactcaacac tgtagaacaa aggaatcata taaataatgc atatctctcc 360cttaagcctt caacacatac aaaagtgaca caccaaatca aagacacctg agccattcaa 420ttcccctcct ttattgcttt caagtttcaa cactaatttt attatctgaa ac 4727237DNAGlycine max 7gatcacaatc ttcatgcatc tgagtcacac tattgccctt tgctcatgaa gtacacttta 60ctcaccgcca aagttcactc aacactgtag aacaaaggaa tcatataaat aatgcatatc 120tctcccttaa gccttcaaca catacaaaag tgacacacca aatcaaagac acctgagcca 180ttcaattccc ctcctttatt gctttcaagt ttcaacacta attttattat ctgaaac 237832DNAArtificialPrimer 8gtttcagata ataaaattag tgttgaaact tg 32929DNAArtificialPrimer 9ggagaacatt attaatgtga aaatcatgc 291025DNAArtificialPrimer 10cattgggttg agttcaccta acttg 251129DNAArtificialPrimer 11gatcataact tacaagcaac atttcaatg 291228DNAArtificialPrimer 12gataaatgga taacatgtgt gctaatcc 281325DNAArtificialPrimer 13cacagatgtg ctctcatgaa gtctg 251426DNAArtificialPrimer 14gatcacaatc ttcatgcatc tgagtc 261526DNAArtificialPrimer 15caaggaaaaa cgaaactttg aaagcc 261625DNAArtificialPrimer 16gtaatacgac tcactatagg gcacg 251737DNAArtificialPrimer 17ccatggtttc agataataaa attagtgttg aaacttg 371822DNAArtificialPrimer 18ctatagggca cgcgtggtcg ac 2219832DNAGlycine max 19acacaccaaa tcaaagacac ctgagccatt caattcccct cctttattgc tttcaagttt 60caacactaat 
tttattatct gaaaaaatgg ctttcaaagt ttcgtttttc cttgcacttg 120ttctagtttc caatatcctc ctccttgata caacagctgc tggacgcagc attggcgaaa 180actccaactc agaggaaaag aaagagcctg agttcttgtt caagcatgaa ggtggggtgt 240atattccagg gattggacct gttggatttc cacataaatt tcatctcaca cctcaaaatc 300cattacctgg tggcaatgga aatggaggag caggaaccgc aacaggatca ggatcaccac 360caggtagcag ttatgttcct ggtggtgatg acacttttgt cccaaaccct ggttatgagg 420ttcccattcc cggcagtggt ggaagtgttc cagcaccagc tgcaccatga gttaactcat 480gcatgattaa tgtgatgcat ggtagttaat aaggtggtta tgcttaagtt tgtctttttc 540tttctgtttt ctagccataa taataactta tcataaataa gtatgctcca tgtgcacatt 600ggtgtatatg gtgaacacca tggattgcca agtcattctg tttgttcttg tagtcttgtt 660ttaagatgaa ttgagtgtga cgtaagctta tttgtttttc gaagtaaaaa ctgatgaatg 720agtcctcaaa aataatttct gttatgattc caatttgata ttctcttttc atgcacagtt 780ttatgtgttt ggtccttgaa tgataaaaaa aaaaaaaaaa aaaaaaaaaa aa 83220127PRTGlycine max 20Met Ala Phe Lys Val Ser Phe Phe Leu Ala Leu Val Leu Val Ser Asn1 5 10 15Ile Leu Leu Leu Asp Thr Thr Ala Ala Gly Arg Ser Ile Gly Glu Asn20 25 30Ser Asn Ser Glu Glu Lys Lys Glu Pro Glu Phe Leu Phe Lys His Glu35 40 45Gly Gly Val Tyr Ile Pro Gly Ile Gly Pro Val Gly Phe Pro His Lys50 55 60Phe His Leu Thr Pro Gln Asn Pro Leu Pro Gly Gly Asn Gly Asn Gly65 70 75 80Gly Ala Gly Thr Ala Thr Gly Ser Gly Ser Pro Pro Gly Ser Ser Tyr85 90 95Val Pro Gly Gly Asp Asp Thr Phe Val Pro Asn Pro Gly Tyr Glu Val100 105 110Pro Ile Pro Gly Ser Gly Gly Ser Val Pro Ala Pro Ala Ala Pro115 120 1252126DNAArtificialPrimer 21gaccaagaca cactcgttca tatatc 262225DNAArtificialPrimer 22tctgctgctc aatgtttaca aggac 252348DNAArtificialLonger strand sequence of the adaptor supplied in GenomeWalker(tm) kit 23gtaatacgac tcactatagg gcacgcgtgg tcgacggccc gggctggt 482417DNAArtificialMPSS tag sequence 24gatcaccacc aggtagc 172520DNAArtificialPrimer 25aaccgcaaca ggatcaggat 202621DNAArtificialPrimer 26accagggttt gggacaaaag t 212724DNAArtificialPrimer 27catgattggg agaaacctta agct 242820DNAArtificialPrimer 28agattgggcc agaggatcct 
202922DNAArtificialPrimer 29ggaagaagag aatcgggtgg tt 223023DNAArtificialFAM labeled fluorescent DNA oligo probe 30attgtgttgt gtggcatggt tat 233123DNAArtificialPrimer 31ggcttgttgt gcagtttttg aag 233220DNAArtificialPrimer 32aacggccaca agttcgtgat 203320DNAArtificialFAM labeled fluorescent DNA oligo probe 33accggcgagg gcatcggcta 203420DNAArtificialPrimer 34cttcaagggc aagcagacca 203524DNAArtificialPrimer 35caaacttgac aaagccacaa ctct 243620DNAArtificialVIC labeled DNA oligo probe 36ctctcatctc atataaatac 203721DNAArtificialPrimer 37ggagaaattg gtgtcgtgga a 213821DNAArtificialRecombination site attB1 sequence 38caagtttgta caaaaaagca g 213921DNAArtificialRecombination site attB2 sequence 39cagctttctt gtacaaagtg g 21403291DNAArtificialNucleotide sequence of QC299 40tcgacccggg atccatggcc cacagcaagc acggcctgaa ggaggagatg accatgaagt 60accacatgga gggctgcgtg aacggccaca agttcgtgat caccggcgag ggcatcggct 120accccttcaa gggcaagcag accatcaacc tgtgcgtgat cgagggcggc cccctgccct 180tcagcgagga catcctgagc gccggcttca agtacggcga ccggatcttc accgagtacc 240cccaggacat cgtggactac ttcaagaaca gctgccccgc cggctacacc tggggccgga 300gcttcctgtt cgaggacggc gccgtgtgca tctgtaacgt ggacatcacc gtgagcgtga 360aggagaactg catctaccac aagagcatct tcaacggcgt gaacttcccc gccgacggcc 420ccgtgatgaa gaagatgacc accaactggg aggccagctg cgagaagatc atgcccgtgc 480ctaagcaggg catcctgaag ggcgacgtga gcatgtacct gctgctgaag gacggcggcc 540ggtaccggtg ccagttcgac accgtgtaca aggccaagag cgtgcccagc aagatgcccg 600agtggcactt catccagcac aagctgctgc gggaggaccg gagcgacgcc aagaaccaga 660agtggcagct gaccgagcac gccatcgcct tccccagcgc cctggcctga gagctcgaat 720ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc tgttgccggt 780cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat aattaacatg 840taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca attatacatt 900taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc gcgcgcggtg 960tcatctatgt tactagatcg ggaattctag tggccggccc agctgatatc catcacactg 1020gcggccgcac tcgagatatc tagacccagc tttcttgtac aaagttggca 
ttataagaaa 1080gcattgctta tcaatttgtt gcaacgaaca ggtcactatc agtcaaaata aaatcattat 1140ttgccatcca gctgcagctc tggcccgtgt ctcaaaatct ctgatgttac attgcacaag 1200ataaaaatat atcatcatga acaataaaac tgtctgctta cataaacagt aatacaaggg 1260gtgttatgag ccatattcaa cgggaaacgt cgaggccgcg attaaattcc aacatggatg 1320ctgatttata tgggtataaa tgggctcgcg ataatgtcgg gcaatcaggt gcgacaatct 1380atcgcttgta tgggaagccc gatgcgccag agttgtttct gaaacatggc aaaggtagcg 1440ttgccaatga tgttacagat gagatggtca gactaaactg gctgacggaa tttatgcctc 1500ttccgaccat caagcatttt atccgtactc ctgatgatgc atggttactc accactgcga 1560tccccggaaa aacagcattc caggtattag aagaatatcc tgattcaggt gaaaatattg 1620ttgatgcgct ggcagtgttc ctgcgccggt tgcattcgat tcctgtttgt aattgtcctt 1680ttaacagcga tcgcgtattt cgtctcgctc aggcgcaatc acgaatgaat aacggtttgg 1740ttgatgcgag tgattttgat gacgagcgta atggctggcc tgttgaacaa gtctggaaag 1800aaatgcataa acttttgcca ttctcaccgg attcagtcgt cactcatggt gatttctcac 1860ttgataacct tatttttgac gaggggaaat taataggttg tattgatgtt ggacgagtcg 1920gaatcgcaga ccgataccag gatcttgcca tcctatggaa ctgcctcggt gagttttctc 1980cttcattaca gaaacggctt tttcaaaaat atggtattga taatcctgat atgaataaat 2040tgcagtttca tttgatgctc gatgagtttt tctaatcaga attggttaat tggttgtaac 2100attattcaga ttgggccccg ttccactgag cgtcagaccc cgtagaaaag atcaaaggat 2160cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc 2220taccagcggt ggtttgtttg ccggatcaag agctaccaac tctttttccg aaggtaactg 2280gcttcagcag agcgcagata ccaaatactg ttcttctagt gtagccgtag ttaggccacc 2340acttcaagaa ctctgtagca ccgcctacat acctcgctct gctaatcctg ttaccagtgg 2400ctgctgccag tggcgataag tcgtgtctta ccgggttgga ctcaagacga tagttaccgg 2460ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac acagcccagc ttggagcgaa 2520cgacctacac cgaactgaga tacctacagc gtgagctatg agaaagcgcc acgcttcccg 2580aagggagaaa ggcggacagg tatccggtaa gcggcagggt cggaacagga gagcgcacga 2640gggagcttcc agggggaaac gcctggtatc tttatagtcc tgtcgggttt cgccacctct 2700gacttgagcg tcgatttttg tgatgctcgt caggggggcg gagcctatgg aaaaacgcca 2760gcaacgcggc ctttttacgg 
ttcctggcct tttgctggcc ttttgctcac atgttctttc 2820ctgcgttatc ccctgattct gtggataacc gtattaccgc tagcatggat ctcggggacg 2880tctaactact aagcgagagt agggaactgc caggcatcaa ataaaacgaa aggctcagtc 2940ggaagactgg gcctttcgtt ttatctgttg tttgtcggtg aacgctctcc tgagtaggac 3000aaatccgccg ggagcggatt tgaacgttgt gaagcaacgg cccggagggt ggcgggcagg 3060acgcccgcca taaactgcca ggcatcaaac taagcagaag gccatcctga cggatggcct 3120ttttgcgttt ctacaaactc ttcctgttag ttagttactt aagctcgggc cccaaataat 3180gattttattt tgactgatag tgacctgttc gttgcaacaa attgataagc aatgcttttt 3240tataatgcca actttgtaca aaaaagcagg ctggcgccgg aaccaattca g 3291414642DNAArtificialNucleotide sequence of QC300 41catggcccac agcaagcacg gcctgaagga ggagatgacc atgaagtacc acatggaggg 60ctgcgtgaac ggccacaagt tcgtgatcac cggcgagggc atcggctacc ccttcaaggg 120caagcagacc atcaacctgt gcgtgatcga gggcggcccc ctgcccttca gcgaggacat 180cctgagcgcc ggcttcaagt acggcgaccg gatcttcacc gagtaccccc aggacatcgt 240ggactacttc aagaacagct gccccgccgg ctacacctgg ggccggagct tcctgttcga 300ggacggcgcc gtgtgcatct gtaacgtgga catcaccgtg agcgtgaagg agaactgcat 360ctaccacaag agcatcttca acggcgtgaa cttccccgcc gacggccccg tgatgaagaa 420gatgaccacc aactgggagg ccagctgcga gaagatcatg cccgtgccta agcagggcat 480cctgaagggc gacgtgagca tgtacctgct gctgaaggac ggcggccggt accggtgcca 540gttcgacacc gtgtacaagg ccaagagcgt gcccagcaag atgcccgagt ggcacttcat 600ccagcacaag ctgctgcggg aggaccggag cgacgccaag aaccagaagt ggcagctgac 660cgagcacgcc atcgccttcc ccagcgccct ggcctgagag ctcgaatttc cccgatcgtt 720caaacatttg gcaataaagt ttcttaagat tgaatcctgt tgccggtctt gcgatgatta 780tcatataatt tctgttgaat tacgttaagc atgtaataat taacatgtaa tgcatgacgt 840tatttatgag atgggttttt atgattagag tcccgcaatt atacatttaa tacgcgatag 900aaaacaaaat

atagcgcgca aactaggata aattatcgcg cgcggtgtca tctatgttac 960tagatcggga attctagtgg ccggcccagc tgatatccat cacactggcg gccgcactcg 1020agatatctag acccagcttt cttgtacaaa gttggcatta taagaaagca ttgcttatca 1080atttgttgca acgaacaggt cactatcagt caaaataaaa tcattatttg ccatccagct 1140gcagctctgg cccgtgtctc aaaatctctg atgttacatt gcacaagata aaaatatatc 1200atcatgaaca ataaaactgt ctgcttacat aaacagtaat acaaggggtg ttatgagcca 1260tattcaacgg gaaacgtcga ggccgcgatt aaattccaac atggatgctg atttatatgg 1320gtataaatgg gctcgcgata atgtcgggca atcaggtgcg acaatctatc gcttgtatgg 1380gaagcccgat gcgccagagt tgtttctgaa acatggcaaa ggtagcgttg ccaatgatgt 1440tacagatgag atggtcagac taaactggct gacggaattt atgcctcttc cgaccatcaa 1500gcattttatc cgtactcctg atgatgcatg gttactcacc actgcgatcc ccggaaaaac 1560agcattccag gtattagaag aatatcctga ttcaggtgaa aatattgttg atgcgctggc 1620agtgttcctg cgccggttgc attcgattcc tgtttgtaat tgtcctttta acagcgatcg 1680cgtatttcgt ctcgctcagg cgcaatcacg aatgaataac ggtttggttg atgcgagtga 1740ttttgatgac gagcgtaatg gctggcctgt tgaacaagtc tggaaagaaa tgcataaact 1800tttgccattc tcaccggatt cagtcgtcac tcatggtgat ttctcacttg ataaccttat 1860ttttgacgag gggaaattaa taggttgtat tgatgttgga cgagtcggaa tcgcagaccg 1920ataccaggat cttgccatcc tatggaactg cctcggtgag ttttctcctt cattacagaa 1980acggcttttt caaaaatatg gtattgataa tcctgatatg aataaattgc agtttcattt 2040gatgctcgat gagtttttct aatcagaatt ggttaattgg ttgtaacatt attcagattg 2100ggccccgttc cactgagcgt cagaccccgt agaaaagatc aaaggatctt cttgagatcc 2160tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt 2220ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct tcagcagagc 2280gcagatacca aatactgttc ttctagtgta gccgtagtta ggccaccact tcaagaactc 2340tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg ctgccagtgg 2400cgataagtcg tgtcttaccg ggttggactc aagacgatag ttaccggata aggcgcagcg 2460gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga cctacaccga 2520actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag ggagaaaggc 2580ggacaggtat ccggtaagcg gcagggtcgg aacaggagag 
cgcacgaggg agcttccagg 2640gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg 2700atttttgtga tgctcgtcag gggggcggag cctatggaaa aacgccagca acgcggcctt 2760tttacggttc ctggcctttt gctggccttt tgctcacatg ttctttcctg cgttatcccc 2820tgattctgtg gataaccgta ttaccgctag catggatctc ggggacgtct aactactaag 2880cgagagtagg gaactgccag gcatcaaata aaacgaaagg ctcagtcgga agactgggcc 2940tttcgtttta tctgttgttt gtcggtgaac gctctcctga gtaggacaaa tccgccggga 3000gcggatttga acgttgtgaa gcaacggccc ggagggtggc gggcaggacg cccgccataa 3060actgccaggc atcaaactaa gcagaaggcc atcctgacgg atggcctttt tgcgtttcta 3120caaactcttc ctgttagtta gttacttaag ctcgggcccc aaataatgat tttattttga 3180ctgatagtga cctgttcgtt gcaacaaatt gataagcaat gcttttttat aatgccaact 3240ttgtacaaaa aagcaggctg gcgccggaac caattcagtc gacccgggct ggtaacctag 3300ttaataaatt aaaaggagaa cattattaat gtgaaaatca tgcaaactta aaaaaatcat 3360caacaacata attttataat tctaataaaa tatttttttc ttttaattct ttaatcaatg 3420tctaacattt atctattatt tatcacattt gttatttaat gtttctatct ttagagctat 3480caaaaattta aaatggtgga accttactca ttgggttgag ttcacctaac ttgtttaata 3540aatagatcaa tctaattcta ttcatctctt agtaagtatt aaaaatgttg gcccaactct 3600ccatatattg gtgagttata ggagtttact cacttaaaat gataataaaa atatttgttt 3660taaaatcatt ttttaaacaa aaaaataatg tttcagatta tttattctta gatcataact 3720tacaagcaac atttcaatga tcaattcaat tgtcagaatc aaaaccaatt gaaagagaca 3780aatattcatg ctaatcttca tcagaaacta aacattgaca taaagcaata gtattggaac 3840tacaagttat aattatgtac tttgtaatag tgtgaagaaa atcaaaatac aaatagtaat 3900catcatgata aatgctatct caatttattc aattataaaa atatagaaat aaaatgtgat 3960aaatggataa catgtgtgct aatccagtcc actacgccca ccacaagttc aacccaatgg 4020actggatcat cttctttttt tcttactgat ttctctcttc ttccattcta atccatccca 4080aaagtagatg tttactattt cccctttcat agtttcacaa gtgtgcgcag aggccaaact 4140gaaagtggta gtacatggtg taatattaat cacagatgtg ctctcatgaa gtctgaactt 4200acagctcaag taacaaccaa caagtaaaaa gtacagaaga tagcataaaa aatgaaggta 4260gaacaaattc caagttttct acatattacg gtgcataaat caaccacgtg aaggctccat 4320ttatttgccg 
ctataacatt ggtgaccctc ttccacaaat agtaagtaat aaaaccaagt 4380acaaaaaaat gttcaactac caagtgatca caatcttcat gcatctgagt cacactattg 4440ccctttgctc atgaagtaca ctttactcac cgccaaagtt cactcaacac tgtagaacaa 4500aggaatcata taaataatgc atatctctcc cttaagcctt caacacatac aaaagtgaca 4560caccaaatca aagacacctg agccattcaa ttcccctcct ttattgcttt caagtttcaa 4620cactaatttt attatctgaa ac 4642428187DNAArtificialNucleotide sequence of PHP25224 42cgcgcgtttc ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 60agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt 120tggcgggtgt cggggctggc ttaactatgc ggcatcagag cagattgtac tgagagtgca 180ccatatggac atattgtcgt tagaacgcgg ctacaattaa tacataacct tatgtatcat 240acacatacga tttaggtgac actatagaac ggcgcgccgg taccgggccc cccctcgagt 300gcggccgcaa gcttgtcgac ggagatcacc actttgtaca agaaagctga acgagaaacg 360taaaatgata taaatatcaa tatattaaat tagattttgc ataaaaaaca gactacataa 420tactgtaaaa cacaacatat ccagtcacta tggtcgacct gcagactggc tgtgtataag 480ggagcctgac atttatattc cccagaacat caggttaatg gcgtttttga tgtcattttc 540gcggtggctg agatcagcca cttcttcccc gataacggag accggcacac tggccatatc 600ggtggtcatc atgcgccagc tttcatcccc gatatgcacc accgggtaaa gttcacggga 660gactttatct gacagcagac gtgcactggc cagggggatc accatccgtc gcccgggcgt 720gtcaataata tcactctgta catccacaaa cagacgataa cggctctctc ttttataggt 780gtaaacctta aactgcattt caccagtccc tgttctcgtc agcaaaagag ccgttcattt 840caataaaccg ggcgacctca gccatccctt cctgattttc cgctttccag cgttcggcac 900gcagacgacg ggcttcattc tgcatggttg tgcttaccag accggagata ttgacatcat 960atatgccttg agcaactgat agctgtcgct gtcaactgtc actgtaatac gctgcttcat 1020agcacacctc tttttgacat acttcgggta tacatatcag tatatattct tataccgcaa 1080aaatcagcgc gcaaatacgc atactgttat ctggctttta gtaagccgga tccacgcgtt 1140tacgccccgc cctgccactc atcgcagtac tgttgtaatt cattaagcat tctgccgaca 1200tggaagccat cacagacggc atgatgaacc tgaatcgcca gcggcatcag caccttgtcg 1260ccttgcgtat aatatttgcc catggtgaaa acgggggcga agaagttgtc catattggcc 1320acgtttaaat caaaactggt gaaactcacc cagggattgg ctgagacgaa 
aaacatattc 1380tcaataaacc ctttagggaa ataggccagg ttttcaccgt aacacgccac atcttgcgaa 1440tatatgtgta gaaactgccg gaaatcgtcg tggtattcac tccagagcga tgaaaacgtt 1500tcagtttgct catggaaaac ggtgtaacaa gggtgaacac tatcccatat caccagctca 1560ccgtctttca ttgccatacg gaattccgga tgagcattca tcaggcgggc aagaatgtga 1620ataaaggccg gataaaactt gtgcttattt ttctttacgg tctttaaaaa ggccgtaata 1680tccagctgaa cggtctggtt ataggtacat tgagcaactg actgaaatgc ctcaaaatgt 1740tctttacgat gccattggga tatatcaacg gtggtatatc cagtgatttt tttctccatt 1800ttagcttcct tagctcctga aaatctcgcc ggatcctaac tcaaaatcca cacattatac 1860gagccggaag cataaagtgt aaagcctggg gtgcctaatg cggccgccat agtgactgga 1920tatgttgtgt tttacagtat tatgtagtct gttttttatg caaaatctaa tttaatatat 1980tgatatttat atcattttac gtttctcgtt cagctttttt gtacaaactt gtgattcttc 2040cttaccaatc atactaatta ttttgggtta aatattaatc attattttta agatattaat 2100taagaaatta aaagattttt taaaaaaatg tataaaatta tattattcat gatttttcat 2160acatttgatt ttgataataa atatattttt tttaatttct taaaaaatgt tgcaagacac 2220ttattagaca tagtcttgtt ctgtttacaa aagcattcat catttaatac attaaaaaat 2280atttaatact aacagtagaa tcttcttgtg agtggtgtgg gagtaggcaa cctggcattg 2340aaacgagaga aagagagtca gaaccagaag acaaataaaa agtatgcaac aaacaaatca 2400aaatcaaagg gcaaaggctg gggttggctc aattggttgc tacattcaat tttcaactca 2460gtcaacggtt gagattcact ctgacttccc caatctaagc cgcggatgca aacggttgaa 2520tctaacccac aatccaatct cgttacttag gggcttttcc gtcattaact cacccctgcc 2580acccggtttc cctataaatt ggaactcaat gctcccctct aaactcgtat cgcttcagag 2640ttgagaccaa gacacactcg ttcatatatc tctctgctct tctcttctct tctacctctc 2700aaggtacttt tcttctccct ctaccaaatc ctagattccg tggttcaatt tcggatcttg 2760cacttctggt ttgctttgcc ttgctttttc ctcaactggg tccatctagg atccatgtga 2820aactctactc tttctttaat atctgcggaa tacgcgtttg actttcagat ctagtcgaaa 2880tcatttcata attgcctttc tttcttttag cttatgagaa ataaaatcac ttttttttta 2940tttcaaaata aaccttgggc cttgtgctga ctgagatggg gtttggtgat tacagaattt 3000tagcgaattt tgtaattgta cttgtttgtc tgtagttttg ttttgttttc ttgtttctca 3060tacattcctt aggcttcaat 
tttattcgag tataggtcac aataggaatt caaactttga 3120gcaggggaat taatcccttc cttcaaatcc agtttgtttg tatatatgtt taaaaaatga 3180aacttttgct ttaaattcta ttataacttt ttttatggct gaaatttttg catgtgtctt 3240tgctctctgt tgtaaattta ctgtttaggt actaactcta ggcttgttgt gcagtttttg 3300aagtataacc atgccacaca acacaatggc ggccaccgct tccagaacca cccgattctc 3360ttcttcctct tcacacccca ccttccccaa acgcattact agatccaccc tccctctctc 3420tcatcaaacc ctcaccaaac ccaaccacgc tctcaaaatc aaatgttcca tctccaaacc 3480ccccacggcg gcgcccttca ccaaggaagc gccgaccacg gagcccttcg tgtcacggtt 3540cgcctccggc gaacctcgca agggcgcgga catccttgtg gaggcgctgg agaggcaggg 3600cgtgacgacg gtgttcgcgt accccggcgg tgcgtcgatg gagatccacc aggcgctcac 3660gcgctccgcc gccatccgca acgtgctccc gcgccacgag cagggcggcg tcttcgccgc 3720cgaaggctac gcgcgttcct ccggcctccc cggcgtctgc attgccacct ccggccccgg 3780cgccaccaac ctcgtgagcg gcctcgccga cgctttaatg gacagcgtcc cagtcgtcgc 3840catcaccggc caggtcgccc gccggatgat cggcaccgac gccttccaag aaaccccgat 3900cgtggaggtg agcagatcca tcacgaagca caactacctc atcctcgacg tcgacgacat 3960cccccgcgtc gtcgccgagg ctttcttcgt cgccacctcc ggccgccccg gtccggtcct 4020catcgacatt cccaaagacg ttcagcagca actcgccgtg cctaattggg acgagcccgt 4080taacctcccc ggttacctcg ccaggctgcc caggcccccc gccgaggccc aattggaaca 4140cattgtcaga ctcatcatgg aggcccaaaa gcccgttctc tacgtcggcg gtggcagttt 4200gaattccagt gctgaattga ggcgctttgt tgaactcact ggtattcccg ttgctagcac 4260tttaatgggt cttggaactt ttcctattgg tgatgaatat tcccttcaga tgctgggtat 4320gcatggtact gtttatgcta actatgctgt tgacaatagt gatttgttgc ttgcctttgg 4380ggtaaggttt gatgaccgtg ttactgggaa gcttgaggct tttgctagta gggctaagat 4440tgttcacatt gatattgatt ctgccgagat tgggaagaac aagcaggcgc acgtgtcggt 4500ttgcgcggat ttgaagttgg ccttgaaggg aattaatatg attttggagg agaaaggagt 4560ggagggtaag tttgatcttg gaggttggag agaagagatt aatgtgcaga aacacaagtt 4620tccattgggt tacaagacat tccaggacgc gatttctccg cagcatgcta tcgaggttct 4680tgatgagttg actaatggag atgctattgt tagtactggg gttgggcagc atcaaatgtg 4740ggctgcgcag ttttacaagt acaagagacc gaggcagtgg ttgacctcag 
ggggtcttgg 4800agccatgggt tttggattgc ctgcggctat tggtgctgct gttgctaacc ctggggctgt 4860tgtggttgac attgatgggg atggtagttt catcatgaat gttcaggagt tggccactat 4920aagagtggag aatctcccag ttaagatatt gttgttgaac aatcagcatt tgggtatggt 4980ggttcagttg gaggataggt tctacaagtc caatagagct cacacctatc ttggagatcc 5040gtctagcgag agcgagatat tcccaaacat gctcaagttt gctgatgctt gtgggatacc 5100ggcagcgcga gtgacgaaga aggaagagct tagagcggca attcagagaa tgttggacac 5160ccctggcccc taccttcttg atgtcattgt gccccatcag gagcatgtgt tgccgatgat 5220tcccagtaat ggatccttca aggatgtgat aactgagggt gatggtagaa cgaggtactg 5280attgcctaga ccaaatgttc cttgatgctt gttttgtaca atatatataa gataatgctg 5340tcctagttgc aggatttggc ctgtggtgag catcatagtc tgtagtagtt ttggtagcaa 5400gacattttat tttcctttta tttaacttac tacatgcagt agcatctatc tatctctgta 5460gtctgatatc tcctgttgtc tgtattgtgc cgttggattt tttgctgtag tgagactgaa 5520aatgatgtgc tagtaataat atttctgtta gaaatctaag tagagaatct gttgaagaag 5580tcaaaagcta atggaatcag gttacatatt caatgttttt ctttttttag cggttggtag 5640acgtgtagat tcaacttctc ttggagctca cctaggcaat cagtaaaatg catattcctt 5700ttttaacttg ccatttattt acttttagtg gaaattgtga ccaatttgtt catgtagaac 5760ggatttggac cattgcgtcc acaaaacgtc tcttttgctc gatcttcaca aagcgatacc 5820gaaatccaga gatagttttc aaaagtcaga aatggcaaag ttataaatag taaaacagaa 5880tagatgctgt aatcgacttc aataacaagt ggcatcacgt ttctagttct agacccgggt 5940accggcgcgc ccgatcatcc ggatatagtt cctcctttca gcaaaaaacc cctcaagacc 6000cgtttagagg ccccaagggg ttatgctagt tattgctcag cggtggcagc agccaactca 6060gcttcctttc gggctttgtt agcagccgga tcgatccaag ctgtacctca ctattccttt 6120gccctcggac gagtgctggg gcgtcggttt ccactatcgg cgagtacttc tacacagcca 6180tcggtccaga cggccgcgct tctgcgggcg atttgtgtac gcccgacagt cccggctccg 6240gatcggacga ttgcgtcgca tcgaccctgc gcccaagctg catcatcgaa attgccgtca 6300accaagctct gatagagttg gtcaagacca atgcggagca tatacgcccg gagccgcggc 6360gatcctgcaa gctccggatg cctccgctcg aagtagcgcg tctgctgctc catacaagcc 6420aaccacggcc tccagaagaa gatgttggcg acctcgtatt gggaatcccc gaacatcgcc 6480tcgctccagt caatgaccgc 
tgttatgcgg ccattgtccg tcaggacatt gttggagccg 6540aaatccgcgt gcacgaggtg ccggacttcg gggcagtcct cggcccaaag catcagctca 6600tcgagagcct gcgcgacgga cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac 6660acatggggat cagcaatcgc gcatatgaaa tcacgccatg tagtgtattg accgattcct 6720tgcggtccga atgggccgaa cccgctcgtc tggctaagat cggccgcagc gatcgcatcc 6780atagcctccg cgaccggctg cagaacagcg ggcagttcgg tttcaggcag gtcttgcaac 6840gtgacaccct gtgcacggcg ggagatgcaa taggtcaggc tctcgctgaa ttccccaatg 6900tcaagcactt ccggaatcgg gagcgcggcc gatgcaaagt gccgataaac ataacgatct 6960ttgtagaaac catcggcgca gctatttacc cgcaggacat atccacgccc tcctacatcg 7020aagctgaaag cacgagattc ttcgccctcc gagagctgca tcaggtcgga gacgctgtcg 7080aacttttcga tcagaaactt ctcgacagac gtcgcggtga gttcaggctt ttccatgggt 7140atatctcctt cttaaagtta aacaaaatta tttctagagg gaaaccgttg tggtctccct 7200atagtgagtc gtattaattt cgcgggatcg agatctgatc aacctgcatt aatgaatcgg 7260ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 7320ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 7380acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 7440aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 7500tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 7560aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 7620gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcaatgctc 7680acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 7740accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 7800ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 7860gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 7920gacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 7980ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 8040gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 8100cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgacattaa cctataaaaa 8160taggcgtatc acgaggccct ttcgtct 
8187438945DNAArtificialNucleotide sequence of QC302 43tttgtacaaa cttgtgattc ttccttacca atcatactaa ttattttggg ttaaatatta 60atcattattt ttaagatatt aattaagaaa ttaaaagatt ttttaaaaaa atgtataaaa 120ttatattatt catgattttt catacatttg attttgataa taaatatatt ttttttaatt 180tcttaaaaaa tgttgcaaga cacttattag acatagtctt gttctgttta caaaagcatt 240catcatttaa tacattaaaa aatatttaat actaacagta gaatcttctt gtgagtggtg 300tgggagtagg caacctggca ttgaaacgag agaaagagag tcagaaccag aagacaaata 360aaaagtatgc aacaaacaaa tcaaaatcaa agggcaaagg ctggggttgg ctcaattggt 420tgctacattc aattttcaac tcagtcaacg gttgagattc actctgactt ccccaatcta 480agccgcggat gcaaacggtt gaatctaacc cacaatccaa tctcgttact taggggcttt 540tccgtcatta actcacccct gccacccggt ttccctataa attggaactc aatgctcccc 600tctaaactcg tatcgcttca gagttgagac caagacacac tcgttcatat atctctctgc 660tcttctcttc tcttctacct ctcaaggtac ttttcttctc cctctaccaa atcctagatt 720ccgtggttca atttcggatc ttgcacttct ggtttgcttt gccttgcttt ttcctcaact 780gggtccatct aggatccatg tgaaactcta ctctttcttt aatatctgcg gaatacgcgt 840ttgactttca gatctagtcg aaatcatttc ataattgcct ttctttcttt tagcttatga 900gaaataaaat cacttttttt ttatttcaaa ataaaccttg ggccttgtgc tgactgagat 960ggggtttggt gattacagaa ttttagcgaa ttttgtaatt gtacttgttt gtctgtagtt 1020ttgttttgtt ttcttgtttc tcatacattc cttaggcttc aattttattc gagtataggt 1080cacaatagga attcaaactt tgagcagggg aattaatccc ttccttcaaa tccagtttgt 1140ttgtatatat gtttaaaaaa tgaaactttt gctttaaatt ctattataac tttttttatg 1200gctgaaattt ttgcatgtgt ctttgctctc tgttgtaaat ttactgttta ggtactaact 1260ctaggcttgt tgtgcagttt ttgaagtata accatgccac acaacacaat ggcggccacc 1320gcttccagaa ccacccgatt ctcttcttcc tcttcacacc ccaccttccc caaacgcatt 1380actagatcca ccctccctct ctctcatcaa accctcacca aacccaacca cgctctcaaa 1440atcaaatgtt ccatctccaa accccccacg gcggcgccct tcaccaagga agcgccgacc 1500acggagccct tcgtgtcacg gttcgcctcc ggcgaacctc gcaagggcgc ggacatcctt 1560gtggaggcgc tggagaggca gggcgtgacg acggtgttcg cgtaccccgg cggtgcgtcg 1620atggagatcc accaggcgct cacgcgctcc gccgccatcc gcaacgtgct cccgcgccac 
1680gagcagggcg gcgtcttcgc cgccgaaggc tacgcgcgtt cctccggcct ccccggcgtc 1740tgcattgcca cctccggccc cggcgccacc aacctcgtga gcggcctcgc cgacgcttta 1800atggacagcg tcccagtcgt cgccatcacc ggccaggtcg cccgccggat gatcggcacc 1860gacgccttcc aagaaacccc gatcgtggag gtgagcagat ccatcacgaa gcacaactac 1920ctcatcctcg acgtcgacga catcccccgc gtcgtcgccg aggctttctt cgtcgccacc 1980tccggccgcc ccggtccggt cctcatcgac attcccaaag acgttcagca gcaactcgcc 2040gtgcctaatt gggacgagcc cgttaacctc cccggttacc tcgccaggct gcccaggccc 2100cccgccgagg cccaattgga acacattgtc agactcatca tggaggccca aaagcccgtt 2160ctctacgtcg gcggtggcag tttgaattcc agtgctgaat tgaggcgctt tgttgaactc 2220actggtattc ccgttgctag cactttaatg ggtcttggaa cttttcctat tggtgatgaa 2280tattcccttc agatgctggg tatgcatggt actgtttatg ctaactatgc tgttgacaat 2340agtgatttgt tgcttgcctt tggggtaagg tttgatgacc gtgttactgg gaagcttgag 2400gcttttgcta gtagggctaa gattgttcac attgatattg attctgccga gattgggaag 2460aacaagcagg cgcacgtgtc ggtttgcgcg gatttgaagt tggccttgaa gggaattaat 2520atgattttgg aggagaaagg agtggagggt aagtttgatc ttggaggttg gagagaagag 2580attaatgtgc agaaacacaa gtttccattg ggttacaaga cattccagga cgcgatttct 2640ccgcagcatg ctatcgaggt tcttgatgag ttgactaatg gagatgctat tgttagtact 2700ggggttgggc agcatcaaat gtgggctgcg cagttttaca agtacaagag accgaggcag 2760tggttgacct cagggggtct tggagccatg ggttttggat tgcctgcggc tattggtgct 2820gctgttgcta accctggggc tgttgtggtt gacattgatg gggatggtag tttcatcatg 2880aatgttcagg agttggccac tataagagtg gagaatctcc cagttaagat attgttgttg 2940aacaatcagc
atttgggtat ggtggttcag ttggaggata ggttctacaa gtccaataga 3000gctcacacct atcttggaga tccgtctagc gagagcgaga tattcccaaa catgctcaag 3060tttgctgatg cttgtgggat accggcagcg cgagtgacga agaaggaaga gcttagagcg 3120gcaattcaga gaatgttgga cacccctggc ccctaccttc ttgatgtcat tgtgccccat 3180caggagcatg tgttgccgat gattcccagt aatggatcct tcaaggatgt gataactgag 3240ggtgatggta gaacgaggta ctgattgcct agaccaaatg ttccttgatg cttgttttgt 3300acaatatata taagataatg ctgtcctagt tgcaggattt ggcctgtggt gagcatcata 3360gtctgtagta gttttggtag caagacattt tattttcctt ttatttaact tactacatgc 3420agtagcatct atctatctct gtagtctgat atctcctgtt gtctgtattg tgccgttgga 3480ttttttgctg tagtgagact gaaaatgatg tgctagtaat aatatttctg ttagaaatct 3540aagtagagaa tctgttgaag aagtcaaaag ctaatggaat caggttacat attcaatgtt 3600tttctttttt tagcggttgg tagacgtgta gattcaactt ctcttggagc tcacctaggc 3660aatcagtaaa atgcatattc cttttttaac ttgccattta tttactttta gtggaaattg 3720tgaccaattt gttcatgtag aacggatttg gaccattgcg tccacaaaac gtctcttttg 3780ctcgatcttc acaaagcgat accgaaatcc agagatagtt ttcaaaagtc agaaatggca 3840aagttataaa tagtaaaaca gaatagatgc tgtaatcgac ttcaataaca agtggcatca 3900cgtttctagt tctagacccg ggtaccggcg cgcccgatca tccggatata gttcctcctt 3960tcagcaaaaa acccctcaag acccgtttag aggccccaag gggttatgct agttattgct 4020cagcggtggc agcagccaac tcagcttcct ttcgggcttt gttagcagcc ggatcgatcc 4080aagctgtacc tcactattcc tttgccctcg gacgagtgct ggggcgtcgg tttccactat 4140cggcgagtac ttctacacag ccatcggtcc agacggccgc gcttctgcgg gcgatttgtg 4200tacgcccgac agtcccggct ccggatcgga cgattgcgtc gcatcgaccc tgcgcccaag 4260ctgcatcatc gaaattgccg tcaaccaagc tctgatagag ttggtcaaga ccaatgcgga 4320gcatatacgc ccggagccgc ggcgatcctg caagctccgg atgcctccgc tcgaagtagc 4380gcgtctgctg ctccatacaa gccaaccacg gcctccagaa gaagatgttg gcgacctcgt 4440attgggaatc cccgaacatc gcctcgctcc agtcaatgac cgctgttatg cggccattgt 4500ccgtcaggac attgttggag ccgaaatccg cgtgcacgag gtgccggact tcggggcagt 4560cctcggccca aagcatcagc tcatcgagag cctgcgcgac ggacgcactg acggtgtcgt 4620ccatcacagt ttgccagtga tacacatggg gatcagcaat 
cgcgcatatg aaatcacgcc 4680atgtagtgta ttgaccgatt ccttgcggtc cgaatgggcc gaacccgctc gtctggctaa 4740gatcggccgc agcgatcgca tccatagcct ccgcgaccgg ctgcagaaca gcgggcagtt 4800cggtttcagg caggtcttgc aacgtgacac cctgtgcacg gcgggagatg caataggtca 4860ggctctcgct gaattcccca atgtcaagca cttccggaat cgggagcgcg gccgatgcaa 4920agtgccgata aacataacga tctttgtaga aaccatcggc gcagctattt acccgcagga 4980catatccacg ccctcctaca tcgaagctga aagcacgaga ttcttcgccc tccgagagct 5040gcatcaggtc ggagacgctg tcgaactttt cgatcagaaa cttctcgaca gacgtcgcgg 5100tgagttcagg cttttccatg ggtatatctc cttcttaaag ttaaacaaaa ttatttctag 5160agggaaaccg ttgtggtctc cctatagtga gtcgtattaa tttcgcggga tcgagatctg 5220atcaacctgc attaatgaat cggccaacgc gcggggagag gcggtttgcg tattgggcgc 5280tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg gcgagcggta 5340tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa cgcaggaaag 5400aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg 5460tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc aagtcagagg 5520tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag ctccctcgtg 5580cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct cccttcggga 5640agcgtggcgc tttctcaatg ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc 5700tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc cttatccggt 5760aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc agcagccact 5820ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt gaagtggtgg 5880cctaactacg gctacactag aaggacagta tttggtatct gcgctctgct gaagccagtt 5940accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt 6000ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct 6060ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta agggattttg 6120gtcatgacat taacctataa aaataggcgt atcacgaggc cctttcgtct cgcgcgtttc 6180ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac agcttgtctg 6240taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt tggcgggtgt 6300cggggctggc ttaactatgc ggcatcagag cagattgtac tgagagtgca ccatatggac 6360atattgtcgt 
tagaacgcgg ctacaattaa tacataacct tatgtatcat acacatacga 6420tttaggtgac actatagaac ggcgcgccgg taccgggccc cccctcgagt gcggccgcaa 6480gcttgtcgac ggagatcacc actttgtaca agaaagctgg gtctagatat ctcgagtgcg 6540gccgccagtg tgatggatat cagctgggcc ggccactaga attcccgatc tagtaacata 6600gatgacaccg cgcgcgataa tttatcctag tttgcgcgct atattttgtt ttctatcgcg 6660tattaaatgt ataattgcgg gactctaatc ataaaaaccc atctcataaa taacgtcatg 6720cattacatgt taattattac atgcttaacg taattcaaca gaaattatat gataatcatc 6780gcaagaccgg caacaggatt caatcttaag aaactttatt gccaaatgtt tgaacgatcg 6840gggaaattcg agctctcagg ccagggcgct ggggaaggcg atggcgtgct cggtcagctg 6900ccacttctgg ttcttggcgt cgctccggtc ctcccgcagc agcttgtgct ggatgaagtg 6960ccactcgggc atcttgctgg gcacgctctt ggccttgtac acggtgtcga actggcaccg 7020gtaccggccg ccgtccttca gcagcaggta catgctcacg tcgcccttca ggatgccctg 7080cttaggcacg ggcatgatct tctcgcagct ggcctcccag ttggtggtca tcttcttcat 7140cacggggccg tcggcgggga agttcacgcc gttgaagatg ctcttgtggt agatgcagtt 7200ctccttcacg ctcacggtga tgtccacgtt acagatgcac acggcgccgt cctcgaacag 7260gaagctccgg ccccaggtgt agccggcggg gcagctgttc ttgaagtagt ccacgatgtc 7320ctgggggtac tcggtgaaga tccggtcgcc gtacttgaag ccggcgctca ggatgtcctc 7380gctgaagggc agggggccgc cctcgatcac gcacaggttg atggtctgct tgcccttgaa 7440ggggtagccg atgccctcgc cggtgatcac gaacttgtgg ccgttcacgc agccctccat 7500gtggtacttc atggtcatct cctccttcag gccgtgcttg ctgtgggcca tggtttcaga 7560taataaaatt agtgttgaaa cttgaaagca ataaaggagg ggaattgaat ggctcaggtg 7620tctttgattt ggtgtgtcac ttttgtatgt gttgaaggct taagggagag atatgcatta 7680tttatatgat tcctttgttc tacagtgttg agtgaacttt ggcggtgagt aaagtgtact 7740tcatgagcaa agggcaatag tgtgactcag atgcatgaag attgtgatca cttggtagtt 7800gaacattttt ttgtacttgg ttttattact tactatttgt ggaagagggt caccaatgtt 7860atagcggcaa ataaatggag ccttcacgtg gttgatttat gcaccgtaat atgtagaaaa 7920cttggaattt gttctacctt cattttttat gctatcttct gtacttttta cttgttggtt 7980gttacttgag ctgtaagttc agacttcatg agagcacatc tgtgattaat attacaccat 8040gtactaccac tttcagtttg gcctctgcgc acacttgtga 
aactatgaaa ggggaaatag 8100taaacatcta cttttgggat ggattagaat ggaagaagag agaaatcagt aagaaaaaaa 8160gaagatgatc cagtccattg ggttgaactt gtggtgggcg tagtggactg gattagcaca 8220catgttatcc atttatcaca ttttatttct atatttttat aattgaataa attgagatag 8280catttatcat gatgattact atttgtattt tgattttctt cacactatta caaagtacat 8340aattataact tgtagttcca atactattgc tttatgtcaa tgtttagttt ctgatgaaga 8400ttagcatgaa tatttgtctc tttcaattgg ttttgattct gacaattgaa ttgatcattg 8460aaatgttgct tgtaagttat gatctaagaa taaataatct gaaacattat ttttttgttt 8520aaaaaatgat tttaaaacaa atatttttat tatcatttta agtgagtaaa ctcctataac 8580tcaccaatat atggagagtt gggccaacat ttttaatact tactaagaga tgaatagaat 8640tagattgatc tatttattaa acaagttagg tgaactcaac ccaatgagta aggttccacc 8700attttaaatt tttgatagct ctaaagatag aaacattaaa taacaaatgt gataaataat 8760agataaatgt tagacattga ttaaagaatt aaaagaaaaa aatattttat tagaattata 8820aaattatgtt gttgatgatt tttttaagtt tgcatgattt tcacattaat aatgttctcc 8880ttttaattta ttaactaggt taccagcccg ggtcgactga attggttccg gcgccagcct 8940gcttt 8945442817DNAArtificialNucleotide sequence of pCR8/GW/TOPO 44ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga 60taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga 120gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 180cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaata cgcgtaccgc 240tagccaggaa gagtttgtag aaacgcaaaa aggccatccg tcaggatggc cttctgctta 300gtttgatgcc tggcagttta tggcgggcgt cctgcccgcc accctccggg ccgttgcttc 360acaacgttca aatccgctcc cggcggattt gtcctactca ggagagcgtt caccgacaaa 420caacagataa aacgaaaggc ccagtcttcc gactgagcct ttcgttttat ttgatgcctg 480gcagttccct actctcgcgt taacgctagc atggatgttt tcccagtcac gacgttgtaa 540aacgacggcc agtcttaagc tcgggcccca aataatgatt ttattttgac tgatagtgac 600ctgttcgttg caacaaattg atgagcaatg cttttttata atgccaactt tgtacaaaaa 660agcaggctcc gaattcgccc ttaagggcga attcgaccca gctttcttgt acaaagttgg 720cattataaaa aataattgct catcaatttg ttgcaacgaa caggtcacta tcagtcaaaa 780taaaatcatt atttgccatc 
cagctgatat cccctatagt gagtcgtatt acatggtcat 840agctgtttcc tggcagctct ggcccgtgtc tcaaaatctc tgatgttaca ttgcacaaga 900taaaaatata tcatcatgcc tcctctagac cagccaggac agaaatgcct cgacttcgct 960gctgcccaag gttgccgggt gacgcacacc gtggaaacgg atgaaggcac gaacccagtg 1020gacataagcc tgttcggttc gtaagctgta atgcaagtag cgtatgcgct cacgcaactg 1080gtccagaacc ttgaccgaac gcagcggtgg taacggcgca gtggcggttt tcatggcttg 1140ttatgactgt ttttttgggg tacagtctat gcctcgggca tccaagcagc aagcgcgtta 1200cgccgtgggt cgatgtttga tgttatggag cagcaacgat gttacgcagc agggcagtcg 1260ccctaaaaca aagttaaaca tcatgaggga agcggtgatc gccgaagtat cgactcaact 1320atcagaggta gttggcgtca tcgagcgcca tctcgaaccg acgttgctgg ccgtacattt 1380gtacggctcc gcagtggatg gcggcctgaa gccacacagt gatattgatt tgctggttac 1440ggtgaccgta aggcttgatg aaacaacgcg gcgagctttg atcaacgacc ttttggaaac 1500ttcggcttcc cctggagaga gcgagattct ccgcgctgta gaagtcacca ttgttgtgca 1560cgacgacatc attccgtggc gttatccagc taagcgcgaa ctgcaatttg gagaatggca 1620gcgcaatgac attcttgcag gtatcttcga gccagccacg atcgacattg atctggctat 1680cttgctgaca aaagcaagag aacatagcgt tgccttggta ggtccagcgg cggaggaact 1740ctttgatccg gttcctgaac aggatctatt tgaggcgcta aatgaaacct taacgctatg 1800gaactcgccg cccgactggg ctggcgatga gcgaaatgta gtgcttacgt tgtcccgcat 1860ttggtacagc gcagtaaccg gcaaaatcgc gccgaaggat gtcgctgccg actgggcaat 1920ggagcgcctg ccggcccagt atcagcccgt catacttgaa gctagacagg cttatcttgg 1980acaagaagaa gatcgcttgg cctcgcgcgc agatcagttg gaagaatttg tccactacgt 2040gaaaggcgag atcaccaagg tagtcggcaa ataaccctcg agccacccat gaccaaaatc 2100ccttaacgtg agttacgcgt cgttccactg agcgtcagac cccgtagaaa agatcaaagg 2160atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc 2220gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc cgaaggtaac 2280tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt agttaggcca 2340ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc tgttaccagt 2400ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac gatagttacc 2460ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca 
gcttggagcg 2520aacgacctac accgaactga gatacctaca gcgtgagcat tgagaaagcg ccacgcttcc 2580cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag gagagcgcac 2640gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt ttcgccacct 2700ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc 2760cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc acatgtt 2817454145DNAArtificialNucleotide sequence of QC300-1 45aagggcgaat tcgacccagc tttcttgtac aaagttggca ttataaaaaa taattgctca 60tcaatttgtt gcaacgaaca ggtcactatc agtcaaaata aaatcattat ttgccatcca 120gctgatatcc cctatagtga gtcgtattac atggtcatag ctgtttcctg gcagctctgg 180cccgtgtctc aaaatctctg atgttacatt gcacaagata aaaatatatc atcatgcctc 240ctctagacca gccaggacag aaatgcctcg acttcgctgc tgcccaaggt tgccgggtga 300cgcacaccgt ggaaacggat gaaggcacga acccagtgga cataagcctg ttcggttcgt 360aagctgtaat gcaagtagcg tatgcgctca cgcaactggt ccagaacctt gaccgaacgc 420agcggtggta acggcgcagt ggcggttttc atggcttgtt atgactgttt ttttggggta 480cagtctatgc ctcgggcatc caagcagcaa gcgcgttacg ccgtgggtcg atgtttgatg 540ttatggagca gcaacgatgt tacgcagcag ggcagtcgcc ctaaaacaaa gttaaacatc 600atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 660gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 720ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 780acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 840gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 900tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 960atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 1020catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 1080gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 1140ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 1200aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 1260cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 1320tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat 
caccaaggta 1380gtcggcaaat aaccctcgag ccacccatga ccaaaatccc ttaacgtgag ttacgcgtcg 1440ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt 1500ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg 1560ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata 1620ccaaatactg tccttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca 1680ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag 1740tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc 1800tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga 1860tacctacagc gtgagcattg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg 1920tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac 1980gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg 2040tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg 2100ttcctggcct tttgctggcc ttttgctcac atgttctttc ctgcgttatc ccctgattct 2160gtggataacc gtattaccgc ctttgagtga gctgataccg ctcgccgcag ccgaacgacc 2220gagcgcagcg agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa accgcctctc 2280cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 2340ggcagtgagc gcaacgcaat taatacgcgt accgctagcc aggaagagtt tgtagaaacg 2400caaaaaggcc atccgtcagg atggccttct gcttagtttg atgcctggca gtttatggcg 2460ggcgtcctgc ccgccaccct ccgggccgtt gcttcacaac gttcaaatcc gctcccggcg 2520gatttgtcct actcaggaga gcgttcaccg acaaacaaca gataaaacga aaggcccagt 2580cttccgactg agcctttcgt tttatttgat gcctggcagt tccctactct cgcgttaacg 2640ctagcatgga tgttttccca gtcacgacgt tgtaaaacga cggccagtct taagctcggg 2700ccccaaataa tgattttatt ttgactgata gtgacctgtt cgttgcaaca aattgatgag 2760caatgctttt ttataatgcc aactttgtac aaaaaagcag gctccgaatt cgcccttgga 2820gaacattatt aatgtgaaaa tcatgcaaac ttaaaaaaat catcaacaac ataattttat 2880aattctaata aaatattttt ttcttttaat tctttaatca atgtctaaca tttatctatt 2940atttatcaca tttgttattt aatgtttcta tctttagagc tatcaaaaat ttaaaatggt 3000ggaaccttac tcattgggtt gagttcacct aacttgttta ataaatagat caatctaatt 3060ctattcatct cttagtaagt 
attaaaaatg ttggcccaac tctccatata ttggtgagtt 3120ataggagttt actcacttaa aatgataata aaaatatttg ttttaaaatc attttttaaa 3180caaaaaaata atgtttcaga ttatttattc ttagatcata acttacaagc aacatttcaa 3240tgatcaattc aattgtcaga atcaaaacca attgaaagag acaaatattc atgctaatct 3300tcatcagaaa ctaaacattg acataaagca atagtattgg aactacaagt tataattatg 3360tactttgtaa tagtgtgaag aaaatcaaaa tacaaatagt aatcatcatg ataaatgcta 3420tctcaattta ttcaattata aaaatataga aataaaatgt gataaatgga taacatgtgt 3480gctaatccag tccactacgc ccaccacaag ttcaacccaa tggactggat catcttcttt 3540ttttcttact gatttctctc ttcttccatt ctaatccatc ccaaaagtag atgtttacta 3600tttccccttt catagtttca caagtgtgcg cagaggccaa actgaaagtg gtagtacatg 3660gtgtaatatt aatcacagat gtgctctcat gaagtctgaa cttacagctc aagtaacaac 3720caacaagtaa aaagtacaga agatagcata aaaaatgaag gtagaacaaa ttccaagttt 3780tctacatatt acggtgcata aatcaaccac gtgaaggctc catttatttg ccgctataac 3840attggtgacc ctcttccaca aatagtaagt aataaaacca agtacaaaaa aatgttcaac 3900taccaagtga tcacaatctt catgcatctg agtcacacta ttgccctttg ctcatgaagt 3960acactttact caccgccaaa gttcactcaa cactgtagaa caaaggaatc atataaataa 4020tgcatatctc tcccttaagc cttcaacaca tacaaaagtg acacaccaaa tcaaagacac 4080ctgagccatt caattcccct cctttattgc tttcaagttt caacactaat tttattatct 4140gaaac 4145465286DNAArtificialNucleotide sequence of QC330 46atcaacaagt ttgtacaaaa aagctgaacg agaaacgtaa aatgatataa atatcaatat 60attaaattag attttgcata aaaaacagac tacataatac tgtaaaacac aacatatcca 120gtcatattgg cggccgcatt aggcacccca ggctttacac tttatgcttc cggctcgtat 180aatgtgtgga ttttgagtta ggatccgtcg agattttcag gagctaagga agctaaaatg 240gagaaaaaaa tcactggata taccaccgtt gatatatccc aatggcatcg taaagaacat 300tttgaggcat ttcagtcagt tgctcaatgt acctataacc agaccgttca gctggatatt 360acggcctttt taaagaccgt aaagaaaaat aagcacaagt tttatccggc ctttattcac 420attcttgccc gcctgatgaa tgctcatccg gaattccgta tggcaatgaa agacggtgag 480ctggtgatat gggatagtgt tcacccttgt tacaccgttt tccatgagca aactgaaacg 540ttttcatcgc tctggagtga ataccacgac gatttccggc agtttctaca catatattcg 600caagatgtgg 
cgtgttacgg tgaaaacctg gcctatttcc ctaaagggtt tattgagaat 660atgtttttcg tctcagccaa tccctgggtg agtttcacca gttttgattt aaacgtggcc 720aatatggaca acttcttcgc ccccgttttc accatgggca aatattatac gcaaggcgac 780aaggtgctga tgccgctggc gattcaggtt catcatgccg tttgtgatgg cttccatgtc 840ggcagaatgc ttaatgaatt acaacagtac tgcgatgagt ggcagggcgg ggcgtaaaga 900tctggatccg gcttactaaa agccagataa cagtatgcgt atttgcgcgc tgatttttgc 960ggtataagaa tatatactga tatgtatacc cgaagtatgt caaaaagagg tatgctatga 1020agcagcgtat tacagtgaca gttgacagcg acagctatca gttgctcaag gcatatatga 1080tgtcaatatc tccggtctgg taagcacaac catgcagaat gaagcccgtc gtctgcgtgc 1140cgaacgctgg aaagcggaaa atcaggaagg gatggctgag gtcgcccggt ttattgaaat 1200gaacggctct tttgctgacg agaacagggg ctggtgaaat gcagtttaag gtttacacct 1260ataaaagaga gagccgttat cgtctgtttg tggatgtaca gagtgatatt attgacacgc 1320ccgggcgacg gatggtgatc cccctggcca gtgcacgtct gctgtcagat aaagtctccc 1380gtgaacttta cccggtggtg catatcgggg atgaaagctg gcgcatgatg accaccgata 1440tggccagtgt gccggtctcc gttatcgggg aagaagtggc tgatctcagc caccgcgaaa 1500atgacatcaa aaacgccatt aacctgatgt tctggggaat ataaatgtca ggctccctta 1560tacacagcca gtctgcaggt cgaccatagt gactggatat gttgtgtttt acagtattat 1620gtagtctgtt ttttatgcaa aatctaattt aatatattga tatttatatc attttacgtt 1680tctcgttcag ctttcttgta caaagtggtt gatgggatcc atggcccaca gcaagcacgg 1740cctgaaggag gagatgacca tgaagtacca catggagggc tgcgtgaacg gccacaagtt 1800cgtgatcacc
ggcgagggca tcggctaccc cttcaagggc aagcagacca tcaacctgtg 1860cgtgatcgag ggcggccccc tgcccttcag cgaggacatc ctgagcgccg gcttcaagta 1920cggcgaccgg atcttcaccg agtaccccca ggacatcgtg gactacttca agaacagctg 1980ccccgccggc tacacctggg gccggagctt cctgttcgag gacggcgccg tgtgcatctg 2040taacgtggac atcaccgtga gcgtgaagga gaactgcatc taccacaaga gcatcttcaa 2100cggcgtgaac ttccccgccg acggccccgt gatgaagaag atgaccacca actgggaggc 2160cagctgcgag aagatcatgc ccgtgcctaa gcagggcatc ctgaagggcg acgtgagcat 2220gtacctgctg ctgaaggacg gcggccggta ccggtgccag ttcgacaccg tgtacaaggc 2280caagagcgtg cccagcaaga tgcccgagtg gcacttcatc cagcacaagc tgctgcggga 2340ggaccggagc gacgccaaga accagaagtg gcagctgacc gagcacgcca tcgccttccc 2400cagcgccctg gcctgagagc tcgaatttcc ccgatcgttc aaacatttgg caataaagtt 2460tcttaagatt gaatcctgtt gccggtcttg cgatgattat catataattt ctgttgaatt 2520acgttaagca tgtaataatt aacatgtaat gcatgacgtt atttatgaga tgggttttta 2580tgattagagt cccgcaatta tacatttaat acgcgataga aaacaaaata tagcgcgcaa 2640actaggataa attatcgcgc gcggtgtcat ctatgttact agatcgggaa ttctagtggc 2700cggcccagct gatatccatc acactggcgg ccgctcgagt tctatagtgt cacctaaatc 2760gtatgtgtat gatacataag gttatgtatt aattgtagcc gcgttctaac gacaatatgt 2820ccatatggtg cactctcagt acaatctgct ctgatgccgc atagttaagc cagccccgac 2880acccgccaac acccgctgac gcgccctgac gggcttgtct gctcccggca tccgcttaca 2940gacaagctgt gaccgtctcc gggagctgca tgtgtcagag gttttcaccg tcatcaccga 3000aacgcgcgag acgaaagggc ctcgtgatac gcctattttt ataggttaat gtcatgacca 3060aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 3120gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 3180cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 3240ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 3300accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 3360tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 3420cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 3480gaacgaccta caccgaactg agatacctac agcgtgagca 
ttgagaaagc gccacgcttc 3540ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 3600cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 3660tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 3720ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 3780ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag tgagctgata 3840ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa gcggaagagc 3900gcccaatacg caaaccgcct ctccccgcgc gttggccgat tcattaatgc aggttgatca 3960gatctcgatc ccgcgaaatt aatacgactc actataggga gaccacaacg gtttccctct 4020agaaataatt ttgtttaact ttaagaagga gatataccca tggaaaagcc tgaactcacc 4080gcgacgtctg tcgagaagtt tctgatcgaa aagttcgaca gcgtctccga cctgatgcag 4140ctctcggagg gcgaagaatc tcgtgctttc agcttcgatg taggagggcg tggatatgtc 4200ctgcgggtaa atagctgcgc cgatggtttc tacaaagatc gttatgttta tcggcacttt 4260gcatcggccg cgctcccgat tccggaagtg cttgacattg gggaattcag cgagagcctg 4320acctattgca tctcccgccg tgcacagggt gtcacgttgc aagacctgcc tgaaaccgaa 4380ctgcccgctg ttctgcagcc ggtcgcggag gctatggatg cgatcgctgc ggccgatctt 4440agccagacga gcgggttcgg cccattcgga ccgcaaggaa tcggtcaata cactacatgg 4500cgtgatttca tatgcgcgat tgctgatccc catgtgtatc actggcaaac tgtgatggac 4560gacaccgtca gtgcgtccgt cgcgcaggct ctcgatgagc tgatgctttg ggccgaggac 4620tgccccgaag tccggcacct cgtgcacgcg gatttcggct ccaacaatgt cctgacggac 4680aatggccgca taacagcggt cattgactgg agcgaggcga tgttcgggga ttcccaatac 4740gaggtcgcca acatcttctt ctggaggccg tggttggctt gtatggagca gcagacgcgc 4800tacttcgagc ggaggcatcc ggagcttgca ggatcgccgc ggctccgggc gtatatgctc 4860cgcattggtc ttgaccaact ctatcagagc ttggttgacg gcaatttcga tgatgcagct 4920tgggcgcagg gtcgatgcga cgcaatcgtc cgatccggag ccgggactgt cgggcgtaca 4980caaatcgccc gcagaagcgc ggccgtctgg accgatggct gtgtagaagt actcgccgat 5040agtggaaacc gacgccccag cactcgtccg agggcaaagg aatagtgagg tacagcttgg 5100atcgatccgg ctgctaacaa agcccgaaag gaagctgagt tggctgctgc caccgctgag 5160caataactag cataacccct tggggcctct aaacgggtct tgaggggttt tttgctgaaa 5220ggaggaacta 
tatccggatg atcgtcgagg cctcacgtgt taacaagctt gcatgcctgc 5280aggttt 5286474986DNAArtificialNucleotide sequence of QC300-1Y 47cttgtacaaa gtggttgatg ggatccatgg cccacagcaa gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg tgaacggcca caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc aagggcaagc agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc cttcagcgag gacatcctga gcgccggctt caagtacggc gaccggatct 240tcaccgagta cccccaggac atcgtggact acttcaagaa cagctgcccc gccggctaca 300cctggggccg gagcttcctg ttcgaggacg gcgccgtgtg catctgtaac gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc acaagagcat cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg aagaagatga ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt gcctaagcag ggcatcctga agggcgacgt gagcatgtac ctgctgctga 540aggacggcgg ccggtaccgg tgccagttcg acaccgtgta caaggccaag agcgtgccca 600gcaagatgcc cgagtggcac ttcatccagc acaagctgct gcgggaggac cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc acgccatcgc cttccccagc gccctggcct 720gagagctcga atttccccga tcgttcaaac atttggcaat aaagtttctt aagattgaat 780cctgttgccg gtcttgcgat gattatcata taatttctgt tgaattacgt taagcatgta 840ataattaaca tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg 900caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat cgggaattct agtggccggc ccagctgata 1020tccatcacac tggcggccgc tcgagttcta tagtgtcacc taaatcgtat gtgtatgata 1080cataaggtta tgtattaatt gtagccgcgt tctaacgaca atatgtccat atggtgcact 1140ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc 1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 1440cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg 1500gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga 1560gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac 
1620tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt 1680ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag 1740cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc 1800gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga agggagaaag 1860gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca 1920gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt 1980cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc 2040tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc 2100cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 2160cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa 2220ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc tcgatcccgc 2280gaaattaata cgactcacta tagggagacc acaacggttt ccctctagaa ataattttgt 2340ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga cgtctgtcga 2400gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct cggagggcga 2460agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc gggtaaatag 2520ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat cggccgcgct 2580cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct attgcatctc 2640ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct 2700gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc agacgagcgg 2760gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg atttcatatg 2820cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca ccgtcagtgc 2880gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc ccgaagtccg 2940gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac 3000agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg tcgccaacat 3060cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact tcgagcggag 3120gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca ttggtcttga 3180ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg 3240atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag 3300aagcgcggcc gtctggaccg atggctgtgt 
agaagtactc gccgatagtg gaaaccgacg 3360ccccagcact cgtccgaggg caaaggaata gtgaggtaca gcttggatcg atccggctgc 3420taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcaat aactagcata 3480accccttggg gcctctaaac gggtcttgag gggttttttg ctgaaaggag gaactatatc 3540cggatgatcg tcgaggcctc acgtgttaac aagcttgcat gcctgcaggt ttatcaacaa 3600gtttgtacaa aaaagcaggc tccgaattcg cccttggaga acattattaa tgtgaaaatc 3660atgcaaactt aaaaaaatca tcaacaacat aattttataa ttctaataaa atattttttt 3720cttttaattc tttaatcaat gtctaacatt tatctattat ttatcacatt tgttatttaa 3780tgtttctatc tttagagcta tcaaaaattt aaaatggtgg aaccttactc attgggttga 3840gttcacctaa cttgtttaat aaatagatca atctaattct attcatctct tagtaagtat 3900taaaaatgtt ggcccaactc tccatatatt ggtgagttat aggagtttac tcacttaaaa 3960tgataataaa aatatttgtt ttaaaatcat tttttaaaca aaaaaataat gtttcagatt 4020atttattctt agatcataac ttacaagcaa catttcaatg atcaattcaa ttgtcagaat 4080caaaaccaat tgaaagagac aaatattcat gctaatcttc atcagaaact aaacattgac 4140ataaagcaat agtattggaa ctacaagtta taattatgta ctttgtaata gtgtgaagaa 4200aatcaaaata caaatagtaa tcatcatgat aaatgctatc tcaatttatt caattataaa 4260aatatagaaa taaaatgtga taaatggata acatgtgtgc taatccagtc cactacgccc 4320accacaagtt caacccaatg gactggatca tcttcttttt ttcttactga tttctctctt 4380cttccattct aatccatccc aaaagtagat gtttactatt tcccctttca tagtttcaca 4440agtgtgcgca gaggccaaac tgaaagtggt agtacatggt gtaatattaa tcacagatgt 4500gctctcatga agtctgaact tacagctcaa gtaacaacca acaagtaaaa agtacagaag 4560atagcataaa aaatgaaggt agaacaaatt ccaagttttc tacatattac ggtgcataaa 4620tcaaccacgt gaaggctcca tttatttgcc gctataacat tggtgaccct cttccacaaa 4680tagtaagtaa taaaaccaag tacaaaaaaa tgttcaacta ccaagtgatc acaatcttca 4740tgcatctgag tcacactatt gccctttgct catgaagtac actttactca ccgccaaagt 4800tcactcaaca ctgtagaaca aaggaatcat ataaataatg catatctctc ccttaagcct 4860tcaacacata caaaagtgac acaccaaatc aaagacacct gagccattca attcccctcc 4920tttattgctt tcaagtttca acactaattt tattatctga aacaagggcg aattcgaccc 4980agcttt 4986484792DNAArtificialNucleotide sequence of QC300-2Y 
48cttgtacaaa gtggttgatg ggatccatgg cccacagcaa gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg tgaacggcca caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc aagggcaagc agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc cttcagcgag gacatcctga gcgccggctt caagtacggc gaccggatct 240tcaccgagta cccccaggac atcgtggact acttcaagaa cagctgcccc gccggctaca 300cctggggccg gagcttcctg ttcgaggacg gcgccgtgtg catctgtaac gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc acaagagcat cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg aagaagatga ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt gcctaagcag ggcatcctga agggcgacgt gagcatgtac ctgctgctga 540aggacggcgg ccggtaccgg tgccagttcg acaccgtgta caaggccaag agcgtgccca 600gcaagatgcc cgagtggcac ttcatccagc acaagctgct gcgggaggac cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc acgccatcgc cttccccagc gccctggcct 720gagagctcga atttccccga tcgttcaaac atttggcaat aaagtttctt aagattgaat 780cctgttgccg gtcttgcgat gattatcata taatttctgt tgaattacgt taagcatgta 840ataattaaca tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg 900caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat cgggaattct agtggccggc ccagctgata 1020tccatcacac tggcggccgc tcgagttcta tagtgtcacc taaatcgtat gtgtatgata 1080cataaggtta tgtattaatt gtagccgcgt tctaacgaca atatgtccat atggtgcact 1140ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc 1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 1440cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg 1500gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga 1560gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac 1620tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt 1680ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga 
taaggcgcag 1740cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc 1800gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga agggagaaag 1860gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca 1920gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt 1980cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc 2040tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc 2100cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 2160cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa 2220ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc tcgatcccgc 2280gaaattaata cgactcacta tagggagacc acaacggttt ccctctagaa ataattttgt 2340ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga cgtctgtcga 2400gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct cggagggcga 2460agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc gggtaaatag 2520ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat cggccgcgct 2580cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct attgcatctc 2640ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct 2700gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc agacgagcgg 2760gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg atttcatatg 2820cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca ccgtcagtgc 2880gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc ccgaagtccg 2940gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac 3000agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg tcgccaacat 3060cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact tcgagcggag 3120gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca ttggtcttga 3180ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg 3240atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag 3300aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg gaaaccgacg 3360ccccagcact cgtccgaggg caaaggaata gtgaggtaca gcttggatcg atccggctgc 3420taacaaagcc cgaaaggaag 
ctgagttggc tgctgccacc gctgagcaat aactagcata 3480accccttggg gcctctaaac gggtcttgag gggttttttg ctgaaaggag gaactatatc 3540cggatgatcg tcgaggcctc acgtgttaac aagcttgcat gcctgcaggt ttatcaacaa 3600gtttgtacaa aaaagcaggc tccgaattcg cccttcattg ggttgagttc acctaacttg 3660tttaataaat agatcaatct aattctattc atctcttagt aagtattaaa aatgttggcc 3720caactctcca tatattggtg agttatagga gtttactcac ttaaaatgat aataaaaata 3780tttgttttaa aatcattttt taaacaaaaa aataatgttt cagattattt attcttagat 3840cataacttac aagcaacatt tcaatgatca attcaattgt cagaatcaaa accaattgaa 3900agagacaaat attcatgcta atcttcatca gaaactaaac attgacataa agcaatagta 3960ttggaactac aagttataat tatgtacttt gtaatagtgt gaagaaaatc aaaatacaaa 4020tagtaatcat catgataaat gctatctcaa tttattcaat tataaaaata tagaaataaa 4080atgtgataaa tggataacat gtgtgctaat ccagtccact acgcccacca caagttcaac 4140ccaatggact ggatcatctt ctttttttct tactgatttc tctcttcttc cattctaatc 4200catcccaaaa gtagatgttt actatttccc ctttcatagt ttcacaagtg tgcgcagagg 4260ccaaactgaa agtggtagta catggtgtaa tattaatcac agatgtgctc tcatgaagtc 4320tgaacttaca gctcaagtaa caaccaacaa gtaaaaagta cagaagatag cataaaaaat 4380gaaggtagaa caaattccaa gttttctaca tattacggtg cataaatcaa ccacgtgaag 4440gctccattta tttgccgcta taacattggt gaccctcttc cacaaatagt aagtaataaa 4500accaagtaca aaaaaatgtt caactaccaa gtgatcacaa tcttcatgca tctgagtcac 4560actattgccc tttgctcatg aagtacactt tactcaccgc caaagttcac tcaacactgt 4620agaacaaagg aatcatataa ataatgcata tctctccctt aagccttcaa cacatacaaa 4680agtgacacac caaatcaaag acacctgagc cattcaattc ccctccttta ttgctttcaa 4740gtttcaacac taattttatt atctgaaaca agggcgaatt cgacccagct tt 4792494590DNAArtificialNucleotide sequence of QC300-3Y 49cttgtacaaa gtggttgatg ggatccatgg cccacagcaa gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg tgaacggcca caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc aagggcaagc agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc cttcagcgag gacatcctga gcgccggctt caagtacggc gaccggatct 240tcaccgagta cccccaggac atcgtggact acttcaagaa cagctgcccc gccggctaca 300cctggggccg gagcttcctg 
ttcgaggacg gcgccgtgtg catctgtaac gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc acaagagcat cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg aagaagatga ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt gcctaagcag ggcatcctga agggcgacgt gagcatgtac ctgctgctga 540aggacggcgg ccggtaccgg tgccagttcg acaccgtgta caaggccaag agcgtgccca 600gcaagatgcc cgagtggcac ttcatccagc acaagctgct gcgggaggac cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc acgccatcgc cttccccagc gccctggcct 720gagagctcga atttccccga tcgttcaaac atttggcaat aaagtttctt aagattgaat 780cctgttgccg gtcttgcgat gattatcata taatttctgt tgaattacgt taagcatgta 840ataattaaca tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg 900caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat cgggaattct agtggccggc ccagctgata 1020tccatcacac tggcggccgc tcgagttcta tagtgtcacc taaatcgtat gtgtatgata 1080cataaggtta tgtattaatt gtagccgcgt tctaacgaca atatgtccat atggtgcact 1140ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc 1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 1440cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg 1500gtttgtttgc
cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga 1560gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac 1620tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt 1680ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag 1740cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc 1800gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga agggagaaag 1860gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca 1920gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt 1980cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc 2040tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc 2100cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 2160cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa 2220ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc tcgatcccgc 2280gaaattaata cgactcacta tagggagacc acaacggttt ccctctagaa ataattttgt 2340ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga cgtctgtcga 2400gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct cggagggcga 2460agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc gggtaaatag 2520ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat cggccgcgct 2580cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct attgcatctc 2640ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct 2700gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc agacgagcgg 2760gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg atttcatatg 2820cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca ccgtcagtgc 2880gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc ccgaagtccg 2940gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac 3000agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg tcgccaacat 3060cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact tcgagcggag 3120gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca ttggtcttga 3180ccaactctat cagagcttgg ttgacggcaa tttcgatgat 
gcagcttggg cgcagggtcg 3240atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag 3300aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg gaaaccgacg 3360ccccagcact cgtccgaggg caaaggaata gtgaggtaca gcttggatcg atccggctgc 3420taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcaat aactagcata 3480accccttggg gcctctaaac gggtcttgag gggttttttg ctgaaaggag gaactatatc 3540cggatgatcg tcgaggcctc acgtgttaac aagcttgcat gcctgcaggt ttatcaacaa 3600gtttgtacaa aaaagcaggc tccgaattcg cccttgatca taacttacaa gcaacatttc 3660aatgatcaat tcaattgtca gaatcaaaac caattgaaag agacaaatat tcatgctaat 3720cttcatcaga aactaaacat tgacataaag caatagtatt ggaactacaa gttataatta 3780tgtactttgt aatagtgtga agaaaatcaa aatacaaata gtaatcatca tgataaatgc 3840tatctcaatt tattcaatta taaaaatata gaaataaaat gtgataaatg gataacatgt 3900gtgctaatcc agtccactac gcccaccaca agttcaaccc aatggactgg atcatcttct 3960ttttttctta ctgatttctc tcttcttcca ttctaatcca tcccaaaagt agatgtttac 4020tatttcccct ttcatagttt cacaagtgtg cgcagaggcc aaactgaaag tggtagtaca 4080tggtgtaata ttaatcacag atgtgctctc atgaagtctg aacttacagc tcaagtaaca 4140accaacaagt aaaaagtaca gaagatagca taaaaaatga aggtagaaca aattccaagt 4200tttctacata ttacggtgca taaatcaacc acgtgaaggc tccatttatt tgccgctata 4260acattggtga ccctcttcca caaatagtaa gtaataaaac caagtacaaa aaaatgttca 4320actaccaagt gatcacaatc ttcatgcatc tgagtcacac tattgccctt tgctcatgaa 4380gtacacttta ctcaccgcca aagttcactc aacactgtag aacaaaggaa tcatataaat 4440aatgcatatc tctcccttaa gccttcaaca catacaaaag tgacacacca aatcaaagac 4500acctgagcca ttcaattccc ctcctttatt gctttcaagt ttcaacacta attttattat 4560ctgaaacaag ggcgaattcg acccagcttt 4590504343DNAArtificialNucleotide sequence of QC300-4Y 50cttgtacaaa gtggttgatg ggatccatgg cccacagcaa gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg tgaacggcca caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc aagggcaagc agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc cttcagcgag gacatcctga gcgccggctt caagtacggc gaccggatct 240tcaccgagta cccccaggac atcgtggact acttcaagaa cagctgcccc gccggctaca 
300cctggggccg gagcttcctg ttcgaggacg gcgccgtgtg catctgtaac gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc acaagagcat cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg aagaagatga ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt gcctaagcag ggcatcctga agggcgacgt gagcatgtac ctgctgctga 540aggacggcgg ccggtaccgg tgccagttcg acaccgtgta caaggccaag agcgtgccca 600gcaagatgcc cgagtggcac ttcatccagc acaagctgct gcgggaggac cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc acgccatcgc cttccccagc gccctggcct 720gagagctcga atttccccga tcgttcaaac atttggcaat aaagtttctt aagattgaat 780cctgttgccg gtcttgcgat gattatcata taatttctgt tgaattacgt taagcatgta 840ataattaaca tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg 900caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat cgggaattct agtggccggc ccagctgata 1020tccatcacac tggcggccgc tcgagttcta tagtgtcacc taaatcgtat gtgtatgata 1080cataaggtta tgtattaatt gtagccgcgt tctaacgaca atatgtccat atggtgcact 1140ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc 1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 1440cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg 1500gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga 1560gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac 1620tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt 1680ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag 1740cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc 1800gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga agggagaaag 1860gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca 1920gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt 1980cgatttttgt gatgctcgtc aggggggcgg agcctatgga 
aaaacgccag caacgcggcc 2040tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc 2100cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 2160cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa 2220ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc tcgatcccgc 2280gaaattaata cgactcacta tagggagacc acaacggttt ccctctagaa ataattttgt 2340ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga cgtctgtcga 2400gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct cggagggcga 2460agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc gggtaaatag 2520ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat cggccgcgct 2580cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct attgcatctc 2640ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct 2700gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc agacgagcgg 2760gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg atttcatatg 2820cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca ccgtcagtgc 2880gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc ccgaagtccg 2940gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac 3000agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg tcgccaacat 3060cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact tcgagcggag 3120gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca ttggtcttga 3180ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg 3240atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag 3300aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg gaaaccgacg 3360ccccagcact cgtccgaggg caaaggaata gtgaggtaca gcttggatcg atccggctgc 3420taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcaat aactagcata 3480accccttggg gcctctaaac gggtcttgag gggttttttg ctgaaaggag gaactatatc 3540cggatgatcg tcgaggcctc acgtgttaac aagcttgcat gcctgcaggt ttatcaacaa 3600gtttgtacaa aaaagcaggc tccgaattcg cccttgataa atggataaca tgtgtgctaa 3660tccagtccac tacgcccacc acaagttcaa cccaatggac tggatcatct tctttttttc 3720ttactgattt 
ctctcttctt ccattctaat ccatcccaaa agtagatgtt tactatttcc 3780cctttcatag tttcacaagt gtgcgcagag gccaaactga aagtggtagt acatggtgta 3840atattaatca cagatgtgct ctcatgaagt ctgaacttac agctcaagta acaaccaaca 3900agtaaaaagt acagaagata gcataaaaaa tgaaggtaga acaaattcca agttttctac 3960atattacggt gcataaatca accacgtgaa ggctccattt atttgccgct ataacattgg 4020tgaccctctt ccacaaatag taagtaataa aaccaagtac aaaaaaatgt tcaactacca 4080agtgatcaca atcttcatgc atctgagtca cactattgcc ctttgctcat gaagtacact 4140ttactcaccg ccaaagttca ctcaacactg tagaacaaag gaatcatata aataatgcat 4200atctctccct taagccttca acacatacaa aagtgacaca ccaaatcaaa gacacctgag 4260ccattcaatt cccctccttt attgctttca agtttcaaca ctaattttat tatctgaaac 4320aagggcgaat tcgacccagc ttt 4343514130DNAArtificialNucleotide sequence of QC300-5Y 51cttgtacaaa gtggttgatg ggatccatgg cccacagcaa gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg tgaacggcca caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc aagggcaagc agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc cttcagcgag gacatcctga gcgccggctt caagtacggc gaccggatct 240tcaccgagta cccccaggac atcgtggact acttcaagaa cagctgcccc gccggctaca 300cctggggccg gagcttcctg ttcgaggacg gcgccgtgtg catctgtaac gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc acaagagcat cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg aagaagatga ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt gcctaagcag ggcatcctga agggcgacgt gagcatgtac ctgctgctga 540aggacggcgg ccggtaccgg tgccagttcg acaccgtgta caaggccaag agcgtgccca 600gcaagatgcc cgagtggcac ttcatccagc acaagctgct gcgggaggac cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc acgccatcgc cttccccagc gccctggcct 720gagagctcga atttccccga tcgttcaaac atttggcaat aaagtttctt aagattgaat 780cctgttgccg gtcttgcgat gattatcata taatttctgt tgaattacgt taagcatgta 840ataattaaca tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg 900caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat cgggaattct agtggccggc ccagctgata 1020tccatcacac tggcggccgc tcgagttcta tagtgtcacc taaatcgtat 
gtgtatgata 1080cataaggtta tgtattaatt gtagccgcgt tctaacgaca atatgtccat atggtgcact 1140ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc 1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 1440cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg 1500gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga 1560gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac 1620tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt 1680ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag 1740cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc 1800gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga agggagaaag 1860gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca 1920gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt 1980cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc 2040tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc 2100cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 2160cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa 2220ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc tcgatcccgc 2280gaaattaata cgactcacta tagggagacc acaacggttt ccctctagaa ataattttgt 2340ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga cgtctgtcga 2400gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct cggagggcga 2460agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc gggtaaatag 2520ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat cggccgcgct 2580cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct attgcatctc 2640ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct 2700gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc agacgagcgg 2760gttcggccca ttcggaccgc 
aaggaatcgg tcaatacact acatggcgtg atttcatatg 2820cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca ccgtcagtgc 2880gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc ccgaagtccg 2940gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac 3000agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg tcgccaacat 3060cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact tcgagcggag 3120gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca ttggtcttga 3180ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg 3240atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag 3300aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg gaaaccgacg 3360ccccagcact cgtccgaggg caaaggaata gtgaggtaca gcttggatcg atccggctgc 3420taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcaat aactagcata 3480accccttggg gcctctaaac gggtcttgag gggttttttg ctgaaaggag gaactatatc 3540cggatgatcg tcgaggcctc acgtgttaac aagcttgcat gcctgcaggt ttatcaacaa 3600gtttgtacaa aaaagcaggc tccgaattcg cccttcacag atgtgctctc atgaagtctg 3660aacttacagc tcaagtaaca accaacaagt aaaaagtaca gaagatagca taaaaaatga 3720aggtagaaca aattccaagt tttctacata ttacggtgca taaatcaacc acgtgaaggc 3780tccatttatt tgccgctata acattggtga ccctcttcca caaatagtaa gtaataaaac 3840caagtacaaa aaaatgttca actaccaagt gatcacaatc ttcatgcatc tgagtcacac 3900tattgccctt tgctcatgaa gtacacttta ctcaccgcca aagttcactc aacactgtag 3960aacaaaggaa tcatataaat aatgcatatc tctcccttaa gccttcaaca catacaaaag 4020tgacacacca aatcaaagac acctgagcca ttcaattccc ctcctttatt gctttcaagt 4080ttcaacacta attttattat ctgaaacaag ggcgaattcg acccagcttt 4130523895DNAArtificialNucleotide sequence of QC300-6Y 52cttgtacaaa gtggttgatg ggatccatgg cccacagcaa gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg tgaacggcca caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc aagggcaagc agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc cttcagcgag gacatcctga gcgccggctt caagtacggc gaccggatct 240tcaccgagta cccccaggac atcgtggact acttcaagaa cagctgcccc gccggctaca 300cctggggccg gagcttcctg 
ttcgaggacg gcgccgtgtg catctgtaac gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc acaagagcat cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg aagaagatga ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt gcctaagcag ggcatcctga agggcgacgt gagcatgtac ctgctgctga 540aggacggcgg ccggtaccgg tgccagttcg acaccgtgta caaggccaag agcgtgccca 600gcaagatgcc cgagtggcac ttcatccagc acaagctgct gcgggaggac cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc acgccatcgc cttccccagc gccctggcct 720gagagctcga atttccccga tcgttcaaac atttggcaat aaagtttctt aagattgaat 780cctgttgccg gtcttgcgat gattatcata taatttctgt tgaattacgt taagcatgta 840ataattaaca tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg 900caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat cgggaattct agtggccggc ccagctgata 1020tccatcacac tggcggccgc tcgagttcta tagtgtcacc taaatcgtat gtgtatgata 1080cataaggtta tgtattaatt gtagccgcgt tctaacgaca atatgtccat atggtgcact 1140ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc 1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 1440cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg 1500gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga 1560gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac 1620tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt 1680ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag 1740cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc 1800gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga agggagaaag 1860gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca 1920gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt 1980cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc 
2040tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc 2100cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 2160cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa 2220ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc tcgatcccgc 2280gaaattaata cgactcacta tagggagacc acaacggttt ccctctagaa ataattttgt 2340ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga cgtctgtcga 2400gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct cggagggcga 2460agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc gggtaaatag 2520ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat cggccgcgct 2580cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct attgcatctc 2640ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct 2700gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc agacgagcgg 2760gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg atttcatatg 2820cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca ccgtcagtgc 2880gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc ccgaagtccg 2940gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac 3000agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg tcgccaacat 3060cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact tcgagcggag 3120gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca ttggtcttga 3180ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg 3240atgcgacgca
atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag 3300aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg gaaaccgacg 3360ccccagcact cgtccgaggg caaaggaata gtgaggtaca gcttggatcg atccggctgc 3420taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcaat aactagcata 3480accccttggg gcctctaaac gggtcttgag gggttttttg ctgaaaggag gaactatatc 3540cggatgatcg tcgaggcctc acgtgttaac aagcttgcat gcctgcaggt ttatcaacaa 3600gtttgtacaa aaaagcaggc tccgaattcg cccttgatca caatcttcat gcatctgagt 3660cacactattg ccctttgctc atgaagtaca ctttactcac cgccaaagtt cactcaacac 3720tgtagaacaa aggaatcata taaataatgc atatctctcc cttaagcctt caacacatac 3780aaaagtgaca caccaaatca aagacacctg agccattcaa ttcccctcct ttattgcttt 3840caagtttcaa cactaatttt attatctgaa acaagggcga attcgaccca gcttt 3895534157DNAArtificialNucleotide sequence of pZSL90 53gatccatggc ccacagcaag cacggcctga aggaggagat gaccatgaag taccacatgg 60agggctgcgt gaacggccac aagttcgtga tcaccggcga gggcatcggc taccccttca 120agggcaagca gaccatcaac ctgtgcgtga tcgagggcgg ccccctgccc ttcagcgagg 180acatcctgag cgccggcttc aagtacggcg accggatctt caccgagtac ccccaggaca 240tcgtggacta cttcaagaac agctgccccg ccggctacac ctggggccgg agcttcctgt 300tcgaggacgg cgccgtgtgc atctgtaacg tggacatcac cgtgagcgtg aaggagaact 360gcatctacca caagagcatc ttcaacggcg tgaacttccc cgccgacggc cccgtgatga 420agaagatgac caccaactgg gaggccagct gcgagaagat catgcccgtg cctaagcagg 480gcatcctgaa gggcgacgtg agcatgtacc tgctgctgaa ggacggcggc cggtaccggt 540gccagttcga caccgtgtac aaggccaaga gcgtgcccag caagatgccc gagtggcact 600tcatccagca caagctgctg cgggaggacc ggagcgacgc caagaaccag aagtggcagc 660tgaccgagca cgccatcgcc ttccccagcg ccctggcctg agagctcgaa tttccccgat 720cgttcaaaca tttggcaata aagtttctta agattgaatc ctgttgccgg tcttgcgatg 780attatcatat aatttctgtt gaattacgtt aagcatgtaa taattaacat gtaatgcatg 840acgttattta tgagatgggt ttttatgatt agagtcccgc aattatacat ttaatacgcg 900atagaaaaca aaatatagcg cgcaaactag gataaattat cgcgcgcggt gtcatctatg 960ttactagatc gggaattcta gtggccggcc cagctgatat ccatcacact ggcggccgct 1020cgagttctat agtgtcacct 
aaatcgtatg tgtatgatac ataaggttat gtattaattg 1080tagccgcgtt ctaacgacaa tatgtccata tggtgcactc tcagtacaat ctgctctgat 1140gccgcatagt taagccagcc ccgacacccg ccaacacccg ctgacgcgcc ctgacgggct 1200tgtctgctcc cggcatccgc ttacagacaa gctgtgaccg tctccgggag ctgcatgtgt 1260cagaggtttt caccgtcatc accgaaacgc gcgagacgaa agggcctcgt gatacgccta 1320tttttatagg ttaatgtcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg 1380tcagaccccg tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc 1440tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag 1500ctaccaactc tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtc 1560cttctagtgt agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac 1620ctcgctctgc taatcctgtt accagtggct gctgccagtg gcgataagtc gtgtcttacc 1680gggttggact caagacgata gttaccggat aaggcgcagc ggtcgggctg aacggggggt 1740tcgtgcacac agcccagctt ggagcgaacg acctacaccg aactgagata cctacagcgt 1800gagcattgag aaagcgccac gcttcccgaa gggagaaagg cggacaggta tccggtaagc 1860ggcagggtcg gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt 1920tatagtcctg tcgggtttcg ccacctctga cttgagcgtc gatttttgtg atgctcgtca 1980ggggggcgga gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt 2040tgctggcctt ttgctcacat gttctttcct gcgttatccc ctgattctgt ggataaccgt 2100attaccgcct ttgagtgagc tgataccgct cgccgcagcc gaacgaccga gcgcagcgag 2160tcagtgagcg aggaagcgga agagcgccca atacgcaaac cgcctctccc cgcgcgttgg 2220ccgattcatt aatgcaggtt gatcagatct cgatcccgcg aaattaatac gactcactat 2280agggagacca caacggtttc cctctagaaa taattttgtt taactttaag aaggagatat 2340acccatggaa aagcctgaac tcaccgcgac gtctgtcgag aagtttctga tcgaaaagtt 2400cgacagcgtc tccgacctga tgcagctctc ggagggcgaa gaatctcgtg ctttcagctt 2460cgatgtagga gggcgtggat atgtcctgcg ggtaaatagc tgcgccgatg gtttctacaa 2520agatcgttat gtttatcggc actttgcatc ggccgcgctc ccgattccgg aagtgcttga 2580cattggggaa ttcagcgaga gcctgaccta ttgcatctcc cgccgtgcac agggtgtcac 2640gttgcaagac ctgcctgaaa ccgaactgcc cgctgttctg cagccggtcg cggaggctat 2700ggatgcgatc gctgcggccg atcttagcca gacgagcggg ttcggcccat 
tcggaccgca 2760aggaatcggt caatacacta catggcgtga tttcatatgc gcgattgctg atccccatgt 2820gtatcactgg caaactgtga tggacgacac cgtcagtgcg tccgtcgcgc aggctctcga 2880tgagctgatg ctttgggccg aggactgccc cgaagtccgg cacctcgtgc acgcggattt 2940cggctccaac aatgtcctga cggacaatgg ccgcataaca gcggtcattg actggagcga 3000ggcgatgttc ggggattccc aatacgaggt cgccaacatc ttcttctgga ggccgtggtt 3060ggcttgtatg gagcagcaga cgcgctactt cgagcggagg catccggagc ttgcaggatc 3120gccgcggctc cgggcgtata tgctccgcat tggtcttgac caactctatc agagcttggt 3180tgacggcaat ttcgatgatg cagcttgggc gcagggtcga tgcgacgcaa tcgtccgatc 3240cggagccggg actgtcgggc gtacacaaat cgcccgcaga agcgcggccg tctggaccga 3300tggctgtgta gaagtactcg ccgatagtgg aaaccgacgc cccagcactc gtccgagggc 3360aaaggaatag tgaggtacag cttggatcga tccggctgct aacaaagccc gaaaggaagc 3420tgagttggct gctgccaccg ctgagcaata actagcataa ccccttgggg cctctaaacg 3480ggtcttgagg ggttttttgc tgaaaggagg aactatatcc ggatgatcgt cgaggcctca 3540cgtgttaaca agcttgcatg cctgcaggtt taaacagtcg actctagaga tccgtcaaca 3600tggtggagca cgacactctc gtctactcca agaatatcaa agatacagtc tcagaagacc 3660aaagggctat tgagactttt caacaaaggg taatatcggg aaacctcctc ggattccatt 3720gcccagctat ctgtcacttc atcaaaagga cagtagaaaa ggaaggtggc acctacaaat 3780gccatcattg cgataaagga aaggctatcg ttcaagatgc ctctgccgac agtggtccca 3840aagatggacc cccacccacg aggagcatcg tggaaaaaga agacgttcca accacgtctt 3900caaagcaagt ggattgatgt gatgatccta tgcgtatggt atgacgtgtg ttcaagatga 3960tgacttcaaa cctacctatg acgtatggta tgacgtgtgt cgactgatga cttagatcca 4020ctcgagcggc tataaatacg tacctacgca ccctgcgcta ccatccctag agctgcagct 4080tatttttaca acaattacca acaacaacaa acaacaaaca acattacaat tactatttac 4140aattacagtc gacccgg 4157

* * * * *