U.S. patent application number 12/152375 was filed with the patent office on 2008-05-14 and published on 2008-11-27 for soybean promoters SC194 and flower-preferred expression thereof in transgenic plants.
Invention is credited to Zhongsen Li.
Application Number | 20080295202 12/152375 |
Family ID | 40073667 |
Filed Date | 2008-05-14 |
United States Patent Application | 20080295202 |
Kind Code | A1 |
Li; Zhongsen | November 27, 2008 |
Soybean promoters SC194 and flower-preferred expression thereof in
transgenic plants
Abstract
The promoters of a soybean SC194 polypeptide and fragments
thereof and their use in promoting the expression of one or more
heterologous nucleic acid fragments in plants are described.
Inventors: | Li; Zhongsen; (Hockessin, DE) |
Correspondence Address: | POTTER ANDERSON & CORROON LLP; ATTN: KATHLEEN W. GEIGER, ESQ.; P.O. BOX 951; WILMINGTON, DE 19899-0951; US |
Family ID: | 40073667 |
Appl. No.: | 12/152375 |
Filed: | May 14, 2008 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60930877 | May 17, 2007 | |
Current U.S. Class: | 800/278; 435/320.1; 435/419; 435/468; 530/350; 536/23.2; 536/23.6; 800/287; 800/298 |
Current CPC Class: | C12N 15/823 20130101; C12N 15/8222 20130101 |
Class at Publication: | 800/278; 536/23.6; 536/23.2; 435/320.1; 435/419; 800/298; 800/287; 435/468; 530/350 |
International Class: | A01H 1/00 20060101 A01H001/00; C07H 21/04 20060101 C07H021/04; C12N 15/63 20060101 C12N015/63; C12N 15/87 20060101 C12N015/87; C07K 14/415 20060101 C07K014/415; C12N 5/10 20060101 C12N005/10; A01H 5/00 20060101 A01H005/00 |
Claims
1. An isolated polynucleotide comprising: a) a nucleotide sequence
comprising the sequence set forth in SEQ ID NO:1 or a full-length
complement thereof; or b) a nucleotide sequence comprising a
sequence having at least 90% sequence identity, based on the BLASTN
method of alignment, when compared to the sequence set forth in SEQ
ID NO:1; wherein said nucleotide sequence is a promoter.
2. The isolated polynucleotide of claim 1, wherein the nucleotide
sequence of b) has at least 95% identity, based on the BLASTN method of
alignment, when compared to the sequence set forth in SEQ ID
NO:1.
3. A recombinant DNA construct comprising the isolated
polynucleotide of claim 1 operably linked to at least one
heterologous sequence.
4. The recombinant DNA construct of claim 3, wherein the
heterologous nucleotide sequence encodes a gene involved in
anthocyanin biosynthesis, a gene involved in the synthesis of
fragrant fatty acid derivatives, a gene that is determinative of
flower morphology, or a gene involved in biosynthesis of plant
cytokinin.
5. The recombinant DNA construct of claim 4, wherein the gene
involved in anthocyanin biosynthesis is dihydroflavonol
4-reductase, flavonoid 3,5-hydroxylase, chalcone synthase, chalcone
isomerase, flavonoid 3-hydroxylase, anthocyanin synthase, or
UDP-glucose 3-O-flavonoid glucosyl transferase.
6. The recombinant DNA construct of claim 4, wherein the gene
involved in the synthesis of fragrant fatty acid derivatives is
S-linalool synthase, acetyl CoA:benzylalcohol acetyltransferase,
benzoyl CoA:benzylalcohol benzoyl transferase,
S-adenosyl-L-methionine:benzoic acid carboxylmethyl transferase,
myrcene synthase, (E)-β-ocimene synthase, orcinol
O-methyltransferase, or limonene synthase.
7. The recombinant DNA construct of claim 4, wherein the gene that
is determinative of flower morphology is AGAMOUS, APETALA, or
PISTILLATA.
8. The recombinant DNA construct of claim 4, wherein the gene
involved in biosynthesis of plant cytokinin is isopentenyl
transferase.
9. A vector comprising the recombinant DNA construct of claim
3.
10. A cell comprising the recombinant DNA construct of claim 3.
11. The cell of claim 10, wherein the cell is a plant cell.
12. A transgenic plant having stably incorporated into its genome
the recombinant DNA construct of claim 3.
13. The transgenic plant of claim 12, wherein the plant is a
flowering plant.
14. The transgenic plant of claim 13, wherein the flowering plant
is rose, carnation, Gerbera, Chrysanthemum, tulip, Gladioli,
Alstroemeria, Anthurium, lisianthus, larkspur, irises, orchid,
snapdragon, African violet, or azalea.
15. A transgenic seed produced by the transgenic plant of claim
12.
16. A method of expressing a coding sequence or a functional RNA in
a flowering plant comprising: a) introducing the recombinant DNA
construct of claim 3 into the plant, wherein the at least one
heterologous sequence comprises a coding sequence or a functional
RNA; b) growing the plant of step a); and c) selecting a plant
displaying expression of the coding sequence or the functional RNA
of the recombinant DNA construct.
17. A method of transgenically altering a marketable flower trait
of a flowering plant, comprising: a) introducing a recombinant DNA
construct of claim 3 into the flowering plant; b) growing a
fertile, mature flowering plant resulting from step a); and c)
selecting a flowering plant expressing the at least one
heterologous nucleotide sequence in flower tissue based on the
altered marketable flower trait.
18. The method of claim 17 wherein the marketable flower trait is
color, morphology, or fragrance.
19. An isolated polynucleotide comprising: (a) a nucleotide
sequence comprising a fragment of SEQ ID NO:1, SEQ ID NO:2, SEQ ID
NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or a
full-length complement thereof; or (b) a nucleotide sequence
comprising a sequence having at least 90% sequence identity, based
on the BLASTN method of alignment, when compared to the nucleotide
sequence of (a); wherein said nucleotide sequence is a
promoter.
20. The isolated polynucleotide of claim 19, wherein the nucleotide
sequence of (a) comprises SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7.
21. An isolated polynucleotide comprising: (a) a nucleotide
sequence encoding a polypeptide having at least 90% sequence
identity, based on the Clustal method of alignment, when compared
to the sequence set forth in SEQ ID NO:20, or (b) a full-length
complement of the nucleotide sequence of (a).
22. The isolated polynucleotide of claim 21, wherein the
polypeptide has at least 95% sequence identity, based on the
Clustal method of alignment, when compared to the sequence set
forth in SEQ ID NO:20.
23. The isolated polynucleotide of claim 22 encoding the sequence
set forth in SEQ ID NO:20.
24. The isolated polynucleotide of claim 23, wherein the nucleotide
sequence comprises the sequence set forth in SEQ ID NO:19.
25. A vector comprising the isolated polynucleotide of claim
21.
26. A recombinant DNA construct comprising the isolated
polynucleotide of claim 21 operably linked to a regulatory
sequence.
27. A cell comprising the recombinant DNA construct of claim
26.
28. A plant comprising the recombinant DNA construct of claim
26.
29. A seed comprising the recombinant DNA construct of claim
26.
30. A method for transforming a cell, comprising transforming a
cell with the isolated polynucleotide of claim 21.
31. A method for producing a plant comprising transforming a plant
cell with the isolated polynucleotide of claim 21 and regenerating
a plant from the transformed plant cell.
32. An isolated polypeptide having at least 90% sequence identity,
based on the Clustal method of alignment, when compared to the
sequence set forth in SEQ ID NO:20.
33. The isolated polypeptide of claim 32, wherein the isolated
polypeptide has at least 95% sequence identity, based on the
Clustal method of alignment, when compared to the sequence set
forth in SEQ ID NO:20.
34. The isolated polypeptide of claim 33, wherein the isolated
polypeptide comprises the amino acid sequence set forth in SEQ ID
NO:20.
Description
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims the benefit of U.S. Provisional
Application Ser. No. 60/930,877, filed May 17, 2007, which is
incorporated by reference herein in its entirety.
FIELD OF THE INVENTION
[0002] The present invention relates to the field of plant
molecular biology, more particularly to regulation of gene
expression in plants.
BACKGROUND OF THE INVENTION
[0003] Recent advances in plant genetic engineering have opened new
doors to engineer plants to have improved characteristics or
traits, such as plant disease resistance, insect resistance,
herbicidal resistance, yield improvement, improvement of the
nutritional quality of the edible portions of the plant, and
enhanced stability or shelf-life of the ultimate consumer product
obtained from the plants. Thus, a desired gene (or genes) with the
molecular function to impart different or improved characteristics
or qualities can be incorporated properly into the plant's genome.
The newly integrated gene (or genes) coding sequence can then be
expressed in the plant cell to exhibit the desired new trait or
characteristic. It is important that appropriate regulatory signals
be present in proper configurations in order to obtain the
expression of the newly inserted gene coding sequence in the plant
cell. These regulatory signals typically include a promoter region,
a 5' non-translated leader sequence and a 3' transcription
termination/polyadenylation sequence.
[0004] A promoter is a non-coding genomic DNA sequence, usually
upstream (5') to the relevant coding sequence, to which RNA
polymerase binds before initiating transcription. This binding
aligns the RNA polymerase so that transcription will initiate at a
specific transcription initiation site. The nucleotide sequence of
the promoter determines the nature of the RNA polymerase binding
and other related protein factors that attach to the RNA polymerase
and/or promoter, and the rate of RNA synthesis.
[0005] It has been shown that certain promoters are able to direct
RNA synthesis at a higher rate than others. These are called
"strong promoters". Certain other promoters have been shown to
direct RNA synthesis at higher levels only in particular types of
cells or tissues and are often referred to as "tissue specific
promoters", or "tissue-preferred promoters", if the promoters
direct RNA synthesis preferentially in certain tissues (RNA
synthesis may occur in other tissues at reduced levels). Since
patterns of expression of a chimeric gene (or genes) introduced
into a plant are controlled using promoters, there is an ongoing
interest in the isolation of novel promoters that are capable of
controlling the expression of a chimeric gene (or genes) at certain
levels in specific tissue types or at specific plant developmental
stages. Among the most commonly used promoters are the nopaline
synthase (NOS) promoter (Ebert et al., Proc. Natl. Acad. Sci. USA
84:5745-5749 (1987)); the octopine synthase (OCS) promoter;
caulimovirus promoters such as the cauliflower mosaic virus (CaMV)
19S promoter (Lawton et al., Plant Mol. Biol. 9:315-324 (1987)),
the CaMV 35S promoter (Odell et al., Nature 313:810-812 (1985)),
and the figwort mosaic virus 35S promoter (Sanger et al., Plant
Mol. Biol. 14:433-443 (1990)); the light inducible promoter from the
small subunit of rubisco (Pellegrineschi et al., Biochem. Soc.
Trans. 23(2):247-250 (1995)); the Adh promoter (Walker et al.,
Proc. Natl. Acad. Sci. USA 84:6624-6628 (1987)); the sucrose
synthase promoter (Yang et al., Proc. Natl. Acad. Sci. USA
87:4144-4148 (1990)); the R gene complex promoter (Chandler et al.,
Plant Cell 1:1175-1183 (1989)); the chlorophyll a/b binding protein
gene promoter; and the like.
[0006] An angiosperm flower is a complex structure generally
consisting of a pedicel, sepals, petals, stamens, and a pistil. A
stamen comprises a filament and an anther in which the male
gametophyte pollens reside. A pistil comprises a stigma, style and
ovary. An ovary contains one or more ovules in which the female
gametophyte embryo sac, egg cell, central cell, and other
specialized cells reside. Flower promoters in general include
promoters that direct gene expression in any of the above tissues
or cell types.
[0007] Although advances in technology provide greater success in
transforming plants with chimeric genes, there is still a need for
preferred expression of such genes in desired plants. Oftentimes
it is desired to selectively express target genes in a specific
tissue because of toxicity or efficacy concerns. For example,
flower tissue is a type of tissue where preferred expression is
desirable and there remains a need for promoters that preferentially
initiate transcription in flower tissue. Promoters that initiate
transcription preferentially in flower tissue control genes involved in
flower development and flower abortion.
SUMMARY OF THE INVENTION
[0008] Compositions and methods for regulating gene expression in a
plant are provided. One aspect is for an isolated polynucleotide
comprising: a) a nucleotide sequence comprising the sequence set
forth in SEQ ID NO: 1 or a full-length complement thereof; or b) a
nucleotide sequence comprising a sequence having at least 90%
sequence identity, based on the BLASTN method of alignment, when
compared to the sequence set forth in SEQ ID NO:1; wherein said
nucleotide sequence is a promoter. Another aspect is for an
isolated polynucleotide comprising (a) a nucleotide sequence
comprising a fragment of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ
ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or a full-length
complement thereof; or (b) a nucleotide sequence comprising a
sequence having at least 90% sequence identity, based on the BLASTN
method of alignment, when compared to the nucleotide sequence of
(a); wherein said nucleotide sequence is a promoter.
[0009] Other embodiments include recombinant DNA constructs
comprising a polynucleotide sequence of the present invention
operably linked to a heterologous sequence. Additionally, some
embodiments provide for transgenic plant cells (transient and
stable), transgenic plant seeds, as well as transgenic plants
comprising the provided recombinant DNA constructs.
[0010] Some embodiments include methods of
expressing a coding sequence or a functional RNA in a flowering
plant comprising: introducing a recombinant DNA construct described
above into the plant, wherein the heterologous sequence comprises a
coding sequence or a functional RNA; growing the plant; and selecting a plant
displaying expression of the coding sequence or the functional RNA
of the recombinant DNA construct.
[0011] Furthermore, some embodiments of the present invention
include methods of transgenically altering a marketable flower
trait of a flowering plant, comprising: introducing a recombinant
DNA construct described above into the flowering plant; growing a
fertile, mature flowering plant resulting from the introducing
step; and selecting a flowering plant expressing the heterologous
nucleotide sequence in flower tissue based on the altered
marketable flower trait.
[0012] Another aspect is for an isolated polynucleotide comprising:
(a) a nucleotide sequence encoding a polypeptide, wherein the
polypeptide has at least 90% sequence identity, based on the
Clustal method of alignment, when compared to the sequence set
forth in SEQ ID NO:20, or (b) a full-length complement of the
nucleotide sequence of (a).
[0013] A further aspect is for an isolated polypeptide, wherein the
isolated polypeptide has at least 90% sequence identity, based on
the Clustal method of alignment, when compared to the sequence set
forth in SEQ ID NO:20.
BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCES
[0014] The invention can be more fully understood from the
following detailed description, the accompanying drawings and
Sequence Listing which form a part of this application. The
Sequence Listing contains the one letter code for nucleotide
sequence characters and the three letter codes for amino acids as
defined in conformity with the IUPAC-IUBMB standards described in
Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical
Journal 219 (No. 2): 345-373 (1984), which are herein incorporated
by reference in their entirety. The symbols and format used for
nucleotide and amino acid sequence data comply with the rules set
forth in 37 C.F.R. § 1.822.
[0015] SEQ ID NO:1 is a DNA sequence comprising a 1358 nucleotide
soybean SC194 promoter (or full-length SC194 promoter).
[0016] SEQ ID NO:2 is a 1328 basepair truncated form of the SC194
promoter shown in SEQ ID NO:1 (bp 30-1357 of SEQ ID NO:1).
[0017] SEQ ID NO:3 is a 1134 basepair truncated form of the SC194
promoter shown in SEQ ID NO:1 (bp 224-1357 of SEQ ID NO:1).
[0018] SEQ ID NO:4 is a 932 basepair truncated form of the SC194
promoter shown in SEQ ID NO:1 (bp 426-1357 of SEQ ID NO:1).
[0019] SEQ ID NO:5 is a 685 basepair truncated form of the SC194
promoter shown in SEQ ID NO:1 (bp 673-1357 of SEQ ID NO:1).
[0020] SEQ ID NO:6 is a 472 basepair truncated form of the SC194
promoter shown in SEQ ID NO:1 (bp 886-1357 of SEQ ID NO:1).
[0021] SEQ ID NO:7 is a 237 basepair truncated form of the SC194
promoter shown in SEQ ID NO:1 (bp 1121-1357 of SEQ ID NO:1).
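The stated lengths of the truncated promoters in SEQ ID NOs:2-7 can be checked against the base-pair ranges given above. The following is a minimal sketch, assuming the positions are 1-based and inclusive (an assumption that is consistent with the stated lengths); the names TRUNCATIONS, fragment_length and extract_fragment are illustrative only and are not part of the disclosure.

```python
# Stated (start, end, length in bp) of each truncation within SEQ ID NO:1.
# Assumption: coordinates are 1-based and inclusive.
TRUNCATIONS = {
    "SEQ ID NO:2": (30, 1357, 1328),
    "SEQ ID NO:3": (224, 1357, 1134),
    "SEQ ID NO:4": (426, 1357, 932),
    "SEQ ID NO:5": (673, 1357, 685),
    "SEQ ID NO:6": (886, 1357, 472),
    "SEQ ID NO:7": (1121, 1357, 237),
}


def fragment_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def extract_fragment(full_sequence: str, start: int, end: int) -> str:
    """Slice a 1-based, inclusive interval out of a Python (0-based) string."""
    return full_sequence[start - 1:end]


for name, (start, end, stated) in TRUNCATIONS.items():
    computed = fragment_length(start, end)
    status = "consistent" if computed == stated else "MISMATCH"
    print(f"{name}: bp {start}-{end} -> {computed} bp (stated {stated} bp) {status}")
```

Given the actual SEQ ID NO:1 sequence as a string, extract_fragment would return the corresponding truncated promoter under the same coordinate assumption.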
[0022] SEQ ID NO:8 is an oligonucleotide primer used in the PCR
amplifications of the truncated SC194 promoter in SEQ ID NO:2 when
paired with SEQ ID NO:9, and the truncated SC194 promoters in SEQ
ID NOs: 3, 4, 5, 6 or 7 when paired with SEQ ID NOs: 10, 11, 12,
13, or 14, respectively.
[0023] SEQ ID NO:9 is an oligonucleotide primer used in the PCR
amplification of the truncated SC194 promoter in SEQ ID NO:2 when
paired with SEQ ID NO:8.
[0024] SEQ ID NO:10 is an oligonucleotide primer used in the PCR
amplification of the truncated SC194 promoter in SEQ ID NO:3 when
paired with SEQ ID NO:8.
[0025] SEQ ID NO:11 is an oligonucleotide primer used in the PCR
amplification of the truncated SC194 promoter in SEQ ID NO:4 when
paired with SEQ ID NO:8.
[0026] SEQ ID NO:12 is an oligonucleotide primer used in the PCR
amplification of the truncated SC194 promoter in SEQ ID NO:5 when
paired with SEQ ID NO:8.
[0027] SEQ ID NO:13 is an oligonucleotide primer used in the PCR
amplification of the truncated SC194 promoter in SEQ ID NO:6 when
paired with SEQ ID NO:8.
[0028] SEQ ID NO:14 is an oligonucleotide primer used in the PCR
amplification of the truncated SC194 promoter in SEQ ID NO:7 when
paired with SEQ ID NO:8.
[0029] SEQ ID NO:15 is an oligonucleotide primer specific to the
soybean PSO375649 gene used in the first nested PCR amplification
of the SC194 promoter when paired with SEQ ID NO:16.
[0030] SEQ ID NO:16 is an oligonucleotide primer used in the first
nested PCR amplification of the SC194 promoter when paired with SEQ
ID NO:15.
[0031] SEQ ID NO:17 is an oligonucleotide primer specific to the
soybean PSO375649 gene used in the second nested PCR amplification
of the SC194 promoter when paired with SEQ ID NO: 18. An NcoI
restriction site CCATGG is added for subsequent cloning.
[0032] SEQ ID NO:18 is an oligonucleotide primer used in the second
nested PCR amplification of the SC194 promoter when paired with SEQ
ID NO:17.
[0033] SEQ ID NO:19 is the nucleotide sequence of a novel soybean
cDNA PSO375649 encoding an unknown polypeptide. Nucleotides 1 to 86
are the 5' untranslated sequence, nucleotides 87 to 89 are the
translation initiation codon, nucleotides 87 to 467 are the polypeptide
coding region, nucleotides 468 to 470 are the termination codon,
nucleotides 468 to 804 are the 3' untranslated sequence, and
nucleotides 805 to 832 are part of the poly (A) tail.
[0034] SEQ ID NO:20 is the 127 amino acid long putative PSO375649
translation product SC194 protein sequence.
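The feature coordinates stated for SEQ ID NO:19 can be cross-checked against the 127 amino acid length stated for SEQ ID NO:20. This is a minimal sketch, assuming 1-based inclusive coordinates and a coding region that excludes the termination codon, as the stated ranges suggest; the variable names are illustrative only.

```python
# Feature coordinates stated for SEQ ID NO:19 (assumed 1-based, inclusive).
CODING_REGION = (87, 467)      # includes the initiation codon at 87-89
TERMINATION_CODON = (468, 470)
STATED_PROTEIN_LENGTH = 127    # amino acids, per SEQ ID NO:20


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


cds_length = span_length(*CODING_REGION)
codon_count, remainder = divmod(cds_length, 3)

print(f"coding region: {cds_length} nt = {codon_count} codons, remainder {remainder}")
print(f"matches stated protein length: {codon_count == STATED_PROTEIN_LENGTH}")
```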
[0035] SEQ ID NO:21 is an oligonucleotide primer used in the
diagnostic PCR to check for soybean genomic DNA presence in total
RNA or cDNA when paired with SEQ ID NO:22.
[0036] SEQ ID NO:22 is an oligonucleotide primer used in the
diagnostic PCR to check for soybean genomic DNA presence in total
RNA or cDNA when paired with SEQ ID NO:21.
[0037] SEQ ID NO:23 is the longer strand sequence of the adaptor
supplied in the ClonTech™ GenomeWalker™ kit.
[0038] SEQ ID NO:24 is an MPSS tag sequence that is specific to the
unique gene PSO375649.
[0039] SEQ ID NO:25 is a sense primer used in quantitative RT-PCR
analysis of PSO375649 gene expression profile.
[0040] SEQ ID NO:26 is an antisense primer used in quantitative
RT-PCR analysis of PSO375649 gene expression profile.
[0041] SEQ ID NO:27 is a sense primer used as an endogenous control
gene-specific primer in the quantitative RT-PCR analysis of
PSO375649 gene expression profile.
[0042] SEQ ID NO:28 is an antisense primer used as an endogenous
control gene-specific primer in the quantitative RT-PCR analysis of
PSO375649 gene expression profile.
[0043] SEQ ID NO:29 is a sense primer used in quantitative PCR
analysis of SAMS:ALS transgene copy numbers.
[0044] SEQ ID NO:30 is a FAM labeled fluorescent DNA oligo probe
used in quantitative PCR analysis of SAMS:ALS transgene copy
numbers.
[0045] SEQ ID NO:31 is an antisense primer used in quantitative PCR
analysis of SAMS:ALS transgene copy numbers.
[0046] SEQ ID NO:32 is a sense primer used in quantitative PCR
analysis of GM-SC194:YFP transgene copy numbers.
[0047] SEQ ID NO:33 is a FAM labeled fluorescent DNA oligo probe
used in quantitative PCR analysis of GM-SC194:YFP transgene copy
numbers.
[0048] SEQ ID NO:34 is an antisense primer used in quantitative PCR
analysis of GM-SC194:YFP transgene copy numbers.
[0049] SEQ ID NO:35 is a sense primer used as an endogenous control
gene primer in quantitative PCR analysis of transgene copy
numbers.
[0050] SEQ ID NO:36 is a VIC labeled DNA oligo probe used as an
endogenous control gene probe in quantitative PCR analysis of
transgene copy numbers.
[0051] SEQ ID NO:37 is an antisense primer used as an endogenous
control gene primer in quantitative PCR analysis of transgene copy
numbers.
[0052] SEQ ID NO:38 is the recombination site attB1 sequence in the
Gateway cloning system (Invitrogen).
[0053] SEQ ID NO:39 is the recombination site attB2 sequence in the
Gateway cloning system (Invitrogen).
[0054] SEQ ID NO:40 is the 3291 bp sequence of QC299.
[0055] SEQ ID NO:41 is the 4642 bp sequence of QC300.
[0056] SEQ ID NO:42 is the 8187 bp sequence of PHP25224.
[0057] SEQ ID NO:43 is the 8945 bp sequence of QC302.
[0058] SEQ ID NO:44 is the 2817 bp sequence of pCR8/GW/TOPO.
[0059] SEQ ID NO:45 is the 4145 bp sequence of QC300-1.
[0060] SEQ ID NO:46 is the 5286 bp sequence of QC330.
[0061] SEQ ID NO:47 is the 4986 bp sequence of QC300-1Y.
[0062] SEQ ID NO:48 is the 4792 bp sequence of QC300-2Y.
[0063] SEQ ID NO:49 is the 4590 bp sequence of QC300-3Y.
[0064] SEQ ID NO:50 is the 4343 bp sequence of QC300-4Y.
[0065] SEQ ID NO:51 is the 4130 bp sequence of QC300-5Y.
[0066] SEQ ID NO:52 is the 3895 bp sequence of QC300-6Y.
[0067] SEQ ID NO:53 is the 4157 bp sequence of pZSL90.
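The stated construct sizes can be checked for internal consistency with the stated promoter lengths. The sketch below assumes that constructs QC300-1Y through QC300-6Y differ only in the length of the SC194 promoter fragment they carry (SEQ ID NOs:2-7, respectively), so that subtracting the promoter length from each construct size should give the same remainder; the dictionary names are illustrative only.

```python
# Stated construct sizes (bp) and the promoter fragment each is assumed to carry.
CONSTRUCT_SIZE = {
    "QC300-1Y": 4986, "QC300-2Y": 4792, "QC300-3Y": 4590,
    "QC300-4Y": 4343, "QC300-5Y": 4130, "QC300-6Y": 3895,
}
PROMOTER_LENGTH = {
    "QC300-1Y": 1328, "QC300-2Y": 1134, "QC300-3Y": 932,
    "QC300-4Y": 685, "QC300-5Y": 472, "QC300-6Y": 237,
}

# Size of everything except the promoter; should be identical across constructs
# if the promoter is the only element that varies (an assumption).
backbones = {
    name: CONSTRUCT_SIZE[name] - PROMOTER_LENGTH[name] for name in CONSTRUCT_SIZE
}
for name, backbone in backbones.items():
    print(f"{name}: {CONSTRUCT_SIZE[name]} - {PROMOTER_LENGTH[name]} = {backbone} bp")
print("all remainders equal:", len(set(backbones.values())) == 1)

# Same check for the entry clone: QC300-1 (4145 bp) versus pCR8/GW/TOPO (2817 bp).
entry_insert = 4145 - 2817
print(f"insert in QC300-1: {entry_insert} bp")
```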
[0068] Table 1 displays the relative abundance (parts per million,
PPM) of the PSO375649 gene determined by Lynx MPSS gene expression
profiling.
[0069] Table 2 displays the relative transgene copy numbers and YFP
expression of SC194:YFP transgenic soybean plants.
[0070] FIG. 1 displays the logarithm of relative quantifications of
the PSO375649 gene expression in 14 different soybean tissues by
quantitative RT-PCR. The gene expression profile indicates that the
PSO375649 gene is highly expressed in flower buds and open
flowers.
[0071] FIG. 2 displays the SC194 promoter copy number analysis by
Southern hybridization. Also displayed is a schematic of the SC194
promoter showing relative linear positions of a number of
restriction sites.
[0072] FIG. 3 is a schematic representation of the map of plasmids
QC299, QC300, PHP25224, and QC302.
[0073] FIG. 4 displays schematic representations of a Gateway
cloning entry vector pCR8/GW/TOPO (Invitrogen), the construct
QC300-1 created by cloning the full length SC194 promoter into
pCR8/GW/TOPO, a Gateway cloning destination vector QC330 containing
a reporter ZS-YELLOW1 N1, and a final construct QC300-1Y with the
1328 bp truncated SC194 promoter (SEQ ID NO:2) placed in front of
the ZS-YELLOW1 N1 reporter gene. Promoter deletion constructs
QC300-2Y, QC300-3Y, QC300-4Y, QC300-5Y, and QC300-6Y containing the
1134, 932, 685, 472, and 237 bp truncated SC194 promoters,
respectively, have similar map configurations, the difference being
in the length of the promoter.
[0074] FIG. 5 is a linear schematic of the SC194 promoter
constructs QC300, QC300-1Y, QC300-2Y, QC300-3Y, QC300-4Y, QC300-5Y,
and QC300-6Y wherein the reporter ZS-YELLOW1 N1 is operably linked
to the full length SC194 promoter and the progressive truncations
of the SC194 promoter.
[0075] FIG. 6 displays the transient expression of the fluorescent
protein reporter gene ZS-YELLOW1 N1 in the cotyledons of
germinating soybean seeds. The reporter gene is driven by the full
length SC194 promoter in construct QC300, or driven by the SC194
promoter or the progressively truncated SC194 promoters in the
transient expression constructs QC300-1Y to QC300-6Y. Construct
pZSL90 represents the positive control (constitutive promoter SCP1
drives the same reporter gene).
[0076] FIG. 7 displays the stable expression of the fluorescent
protein reporter gene ZS-YELLOW1 N1 in the floral and other tissues
of transgenic soybean plants containing a single copy of the
transgene construct QC302. The green color indicates ZS-YELLOW1 N1
gene expression. The red color is background autofluorescence from
plant green tissues.
DETAILED DESCRIPTION OF THE INVENTION
[0077] The disclosures of all patents, patent applications, and
publications cited herein are incorporated by reference in their
entirety.
[0078] As used herein and in the appended claims, the singular
forms "a", "an", and "the" include plural reference unless the
context clearly dictates otherwise. Thus, for example, reference to
"a plant" includes a plurality of such plants, reference to "a
cell" includes one or more cells and equivalents thereof known to
those skilled in the art, and so forth.
[0079] In the context of this disclosure, a number of terms shall
be utilized.
[0080] The term "promoter" refers to a nucleotide sequence capable
of controlling the expression of a coding sequence or functional
RNA. Functional RNA includes, but is not limited to, transfer RNA
(tRNA) and ribosomal RNA (rRNA). Numerous examples of promoters may
be found in the compilation by Okamuro and Goldberg (Biochemistry
of Plants 15:1-82 (1989)). The promoter sequence consists of
proximal and more distal upstream elements, the latter elements
often referred to as enhancers. Accordingly, an "enhancer" is a DNA
sequence which can stimulate promoter activity and may be an innate
element of the promoter or a heterologous element inserted to
enhance the level or tissue-specificity of a promoter. Promoters
may be derived in their entirety from a native gene, or be composed
of different elements derived from different promoters found in
nature, or even comprise synthetic DNA segments. It is understood
by those skilled in the art that different promoters may direct the
expression of a gene in different tissues or cell types, or at
different stages of development, or in response to different
environmental conditions. Promoters which cause a gene to be
expressed in most cell types at most times are commonly referred to
as "constitutive promoters". It is further recognized that, since
in most cases the exact boundaries of regulatory sequences have not
been completely defined, DNA fragments of some variation may have
identical promoter activity.
[0081] An "intron" is an intervening sequence in a gene that is
transcribed into RNA and then excised in the process of generating
the mature mRNA. The term is also used for the excised RNA
sequences. An "exon" is a portion of the sequence of a gene that is
transcribed and is found in the mature messenger RNA derived from
the gene, and is not necessarily a part of the sequence that
encodes the final gene product.
[0082] A "flower" is a complex structure consisting of pedicel,
sepal, petal, stamen, and carpel. A stamen comprises an anther,
pollen and filament. A carpel comprises a stigma, style and ovary.
An ovary comprises an ovule, embryo sac, and egg cell. Soybean pods
develop from the pistil. It is likely that a gene expressed in the
pistil of a flower continues to express in early pod. A "flower
cell" is a cell from any one of these structures. Flower promoters
in general include promoters that direct gene expression in any of
the above tissues or cell types.
[0083] The terms "flower crop" and "flowering plants" refer to plants that
produce flowers that are marketable within the floriculture
industry. Flower crops include both cut flowers and potted
flowering plants. Cut flowers are plants that generate flowers that
can be cut from the plant and can be used in fresh flower
arrangements. Flower crops include roses, carnations, Gerberas,
Chrysanthemums, tulips, Gladioli, Alstroemerias, Anthuriums,
lisianthuses, larkspurs, irises, orchids, snapdragons, African
violets, and azaleas, in addition to other less popular flower
crops.
[0084] The terms "flower-specific promoter" or "flower-preferred
promoter" may be used interchangeably herein and refer to promoters
active in flower, with promoter activity being significantly higher
in flower tissue versus non-flower tissue. "Preferentially
initiates transcription", when describing a particular cell type,
refers to the relative level of transcription in that particular
cell type as opposed to other cell types. The described SC194
promoters are promoters that preferentially initiate transcription
in flower cells. Preferably, the promoter activity in terms of
expression levels of an operably linked sequence is more than
ten-fold higher in flower tissue than non-flower tissue. More
preferably, the promoter activity is present in flower tissue while
undetectable in non-flower tissue.
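The ten-fold preference described above can be expressed as a simple ratio test. The sketch below is one possible reading, not a method stated in this disclosure: it compares the mean expression across flower tissues with the highest expression found in any non-flower tissue. The tissue names and expression values are hypothetical and serve only to illustrate the calculation.

```python
from statistics import mean


def flower_preference_ratio(expression: dict, flower_tissues: set) -> float:
    """Mean flower expression divided by the highest non-flower expression."""
    flower = [v for t, v in expression.items() if t in flower_tissues]
    other = [v for t, v in expression.items() if t not in flower_tissues]
    if not flower or not other:
        raise ValueError("need at least one flower and one non-flower tissue")
    highest_other = max(other)
    if highest_other == 0:
        # Undetectable in non-flower tissue: treat the preference as unbounded.
        return float("inf")
    return mean(flower) / highest_other


def is_flower_preferred(expression: dict, flower_tissues: set,
                        threshold: float = 10.0) -> bool:
    """True when the ratio exceeds the threshold (ten-fold by default)."""
    return flower_preference_ratio(expression, flower_tissues) > threshold


# Hypothetical relative expression values, for illustration only.
example = {"flower bud": 1200.0, "open flower": 900.0,
           "leaf": 40.0, "root": 15.0, "stem": 25.0}
flowers = {"flower bud", "open flower"}

ratio = flower_preference_ratio(example, flowers)
print(f"ratio = {ratio:.2f}, flower-preferred: {is_flower_preferred(example, flowers)}")
```

Using the highest non-flower value as the denominator is a deliberately conservative choice; other comparisons (for example, tissue-by-tissue ratios) would fit the definition equally well.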
[0085] As used herein, an "SC194 promoter" refers to one type of
flower-specific promoter. The native SC194 promoter (or full-length
native SC194 promoter) is the native promoter of the putative
soybean SC194 polypeptide, which is a novel protein without
significant homology to any known protein in public databases. The
"SC194 promoter", as used herein, also refers to fragments of the
full-length native promoter that retain significant promoter
activity. For example, an SC194 promoter of the present invention
can be the full-length promoter (SEQ ID NO:1) or a
promoter-functioning fragment thereof, which includes, among
others, the polynucleotides of SEQ ID NOs: 2, 3, 4, 5, 6 and 7. An
SC194 promoter also includes variants that are substantially
similar and functionally equivalent to any portion of the
nucleotide sequence set forth in SEQ ID NOs: 1, 2, 3, 4, 5, 6, or
7, or sequences therebetween.
[0086] An "isolated nucleic acid fragment" or "isolated
polynucleotide" refers to a polymer of ribonucleotides (RNA) or
deoxyribonucleotides (DNA) that is single-stranded or
double-stranded, optionally containing synthetic, non-natural or
altered nucleotide bases. An isolated polynucleotide in the form of
DNA may be comprised of one or more segments of cDNA, genomic DNA
or synthetic DNA.
[0087] The terms "polynucleotide", "polynucleotide sequence",
"nucleic acid sequence", and "nucleic acid fragment"/"isolated
nucleic acid fragment" are used interchangeably herein. These terms
encompass nucleotide sequences and the like. A polynucleotide may
be a polymer of RNA or DNA that is single- or double-stranded, that
optionally contains synthetic, non-natural or altered nucleotide
bases. A polynucleotide in the form of a polymer of DNA may be
comprised of one or more segments of cDNA, genomic DNA, synthetic
DNA, or mixtures thereof. Nucleotides (usually found in their
5'-monophosphate form) are referred to by a single letter
designation as follows: "A" for adenylate or deoxyadenylate (for
RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate,
"G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for
deoxythymidylate, "R" for purines (A or G), "Y" for pyrimidines (C
or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and
"N" for any nucleotide.
[0088] A "heterologous nucleic acid fragment" or "heterologous
nucleotide sequence" refers to a nucleotide sequence that is not
naturally occurring with the plant promoter sequence of the
invention. While this nucleotide sequence is heterologous to the
promoter sequence, it may be homologous, or native, or
heterologous, or foreign, to the plant host. However, it is
recognized that the instant promoters may be used with their native
coding sequences to increase or decrease expression resulting in a
change in phenotype in the transformed seed.
[0089] The terms "fragment (or variant) that is functionally
equivalent" and "functionally equivalent fragment (or variant)" are
used interchangeably herein. These terms refer to a portion or
subsequence or variant of the promoter sequence of the present
invention in which the ability to initiate transcription or drive
gene expression (such as to produce a certain phenotype) is
retained. Fragments and variants can be obtained via methods such
as site-directed mutagenesis and synthetic construction. As with
the provided promoter sequences described herein, the contemplated
fragments and variants operate to promote the flower-preferred
expression of an operably linked heterologous nucleic acid
sequence, forming a recombinant DNA construct (also, a chimeric
gene). For example, the fragment or variant can be used in the
design of recombinant DNA constructs to produce the desired
phenotype in a transformed plant. Recombinant DNA constructs can be
designed for use in co-suppression or antisense by linking a
promoter fragment or variant thereof in the appropriate orientation
relative to a heterologous nucleotide sequence.
[0090] In some aspects of the present invention, the promoter
fragments can comprise at least about 20 contiguous nucleotides, or
at least about 50 contiguous nucleotides, or at least about 75
contiguous nucleotides, or at least about 100 contiguous
nucleotides of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7. In another aspect, a
promoter fragment is the nucleotide sequence set forth in SEQ ID
NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID
NO:7. The nucleotides of such fragments will usually comprise the
TATA recognition sequence of the particular promoter sequence. Such
fragments may be obtained by use of restriction enzymes to cleave
the naturally occurring promoter nucleotide sequences disclosed
herein, by synthesizing a nucleotide sequence from the naturally
occurring promoter DNA sequence, or may be obtained through the use
of PCR technology. See particularly, Mullis et al., Methods
Enzymol. 155:335-350 (1987), and Higuchi, R. In PCR Technology:
Principles and Applications for DNA Amplifications; Erlich, H. A.,
Ed.; Stockton Press Inc.: New York, 1989.
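As an illustration only, the minimum-length language above ("at least about 20 contiguous nucleotides" and so on) can be sketched as a check for the longest run of contiguous nucleotides that a candidate shares with a reference promoter sequence. The function names and the default threshold below are hypothetical choices for this sketch, and any sequences passed in are placeholders, not SEQ ID NOs: 1-7.

```python
def longest_shared_run(reference: str, candidate: str) -> int:
    """Length of the longest stretch of contiguous nucleotides common to both."""
    reference, candidate = reference.upper(), candidate.upper()
    best = 0
    # Dynamic programming over suffix-match lengths (longest common substring).
    previous = [0] * (len(candidate) + 1)
    for r in reference:
        current = [0] * (len(candidate) + 1)
        for j, c in enumerate(candidate, start=1):
            if r == c:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def has_contiguous_fragment(reference: str, candidate: str, min_length: int = 20) -> bool:
    """True if the candidate shares at least `min_length` contiguous nucleotides."""
    return longest_shared_run(reference, candidate) >= min_length
```

Such a check addresses only the length criterion; whether a fragment retains promoter activity still has to be established experimentally.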
[0091] The terms "substantially similar" and "corresponding
substantially" as used herein refer to nucleic acid sequences,
particularly promoter sequences, wherein changes in one or more
nucleotide bases do not substantially alter the ability of the
promoter to initiate transcription or drive gene expression or
produce a certain phenotype. These terms also refer to
modifications, including deletions and variants, of the nucleic
acid sequences of the instant invention by way of deletion or
insertion of one or more nucleotides that do not substantially
alter the functional properties of the resulting promoter relative
to the initial, unmodified promoter. It is therefore understood, as
those skilled in the art will appreciate, that the invention
encompasses more than the specific exemplary sequences.
[0092] In one example, substantially similar nucleic acid
sequences include those that are also defined by their ability to
hybridize to the disclosed nucleic acid sequences, or portions
thereof. Substantially similar nucleic acid sequences include
those sequences that hybridize, under moderately stringent
conditions (for example, 0.5×SSC, 0.1% SDS, 60°C), with the
sequences exemplified herein, or with any portion of the
nucleotide sequences reported herein, and which are
functionally equivalent to the promoter of the invention. Estimates
of such homology are provided by either DNA-DNA or DNA-RNA
hybridization under conditions of stringency as is well understood
by those skilled in the art (Hames and Higgins, Eds.; In Nucleic
Acid Hybridisation; IRL Press: Oxford, U.K., 1985). Stringency
conditions can be adjusted to screen for moderately similar
fragments, such as homologous sequences from distantly related
organisms, to highly similar fragments, such as genes that
duplicate functional enzymes from closely related organisms.
Post-hybridization washes partially determine stringency
conditions. One set of conditions uses a series of washes starting
with 6×SSC, 0.5% SDS at room temperature for 15 min, then
repeated with 2×SSC, 0.5% SDS at 45°C for 30 min,
and then repeated twice with 0.2×SSC, 0.5% SDS at 50°C
for 30 min. Another set of stringent conditions uses higher
temperatures in which the washes are identical to those above
except that the temperature of the final two 30 min washes in
0.2×SSC, 0.5% SDS is increased to 60°C. Another set
of highly stringent conditions uses two final washes in
0.1×SSC, 0.1% SDS at 65°C.
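For orientation, the final washes of the three wash series described above can be captured in a small data structure. This is a sketch only: the labels are hypothetical names introduced here, and the ordering rule (higher temperature and lower salt mean higher stringency) is a simplifying assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinalWash:
    label: str          # hypothetical label, not a term used in the text
    ssc: float          # salt concentration as a multiple of SSC (0.2 means 0.2x SSC)
    sds_percent: float  # SDS concentration in percent
    temp_c: float       # wash temperature in degrees Celsius


# Final washes as stated in the paragraph above.
FINAL_WASHES = [
    FinalWash("first set", 0.2, 0.5, 50.0),
    FinalWash("higher temperature", 0.2, 0.5, 60.0),
    FinalWash("highly stringent", 0.1, 0.1, 65.0),
]


def by_increasing_stringency(washes):
    """Order washes from least to most stringent.

    Assumption of this sketch: sort by temperature ascending, and for equal
    temperatures by SSC concentration descending.
    """
    return sorted(washes, key=lambda w: (w.temp_c, -w.ssc))
```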
[0093] In some examples, substantially similar nucleic acid
sequences are those sequences that are at least 80% identical to
the nucleic acid sequences reported herein or which are at least
80% identical to any portion of the nucleotide sequences reported
herein. In some instances, substantially similar nucleic acid
sequences are those that are at least 90% identical to the nucleic
acid sequences reported herein, or at least 90% identical to any
portion of the nucleotide sequences reported herein. In some
examples, substantially similar nucleic acid sequences are those
that are at least 95% identical to the nucleic acid sequences
reported herein, or are at least 95% identical to any portion of
the nucleotide sequences reported herein. It is well understood by
one skilled in the art that many levels of sequence identity are
useful in identifying related polynucleotide sequences. Useful
examples of percent identities are those listed above, or also any
integer percentage from 80% to 100%, such as, for example, 81%,
82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98% and 99%.
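The identity thresholds above can be illustrated with a minimal calculation over two sequences that have already been aligned (for example, by a Clustal or BLASTN alignment). The sketch assumes one common convention, identical columns divided by total alignment length, with "-" marking gaps; alignment programs may define percent identity differently, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    identical = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"  # a gap paired with a gap is not counted as identity
    )
    return 100.0 * identical / len(aligned_a)


def is_substantially_similar(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the aligned pair meets the chosen percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, a 10-column alignment with one mismatch meets a 90% threshold but not a 95% threshold.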
[0094] "Codon degeneracy" refers to divergence in the genetic code
permitting variation of the nucleotide sequence without affecting
the amino acid sequence of an encoded polypeptide. Accordingly, the
instant invention relates to any nucleic acid fragment comprising a
nucleotide sequence that encodes all or a substantial portion of
the amino acid sequences set forth herein. The skilled artisan is
well aware of the "codon-bias" exhibited by a specific host cell in
usage of nucleotide codons to specify a given amino acid.
Therefore, when synthesizing a nucleic acid sequence for improved
expression in a host cell, it is desirable to design the nucleic
acid sequence such that its frequency of codon usage approaches the
frequency of preferred codon usage of the host cell.
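A minimal sketch of the codon-bias idea follows: because several codons specify the same amino acid, a coding sequence can be designed by choosing, for each amino acid, the codon the host uses most often. The usage frequencies below are made-up placeholder values for an imaginary host, not measured data, and the table covers only a few amino acids.

```python
# Hypothetical relative codon frequencies for an imaginary host (NOT real data).
# Keys are one-letter amino acid codes; "*" stands for a stop codon.
CODON_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.40, "AAG": 0.60},
    "L": {"CTT": 0.25, "CTC": 0.20, "CTG": 0.15, "CTA": 0.10, "TTA": 0.10, "TTG": 0.20},
    "*": {"TAA": 0.50, "TGA": 0.30, "TAG": 0.20},
}


def preferred_codon(amino_acid: str) -> str:
    """Return the most frequently used codon for one amino acid."""
    usage = CODON_USAGE[amino_acid]
    return max(usage, key=usage.get)


def back_translate(peptide: str) -> str:
    """Build a DNA coding sequence using the preferred codon at every position."""
    return "".join(preferred_codon(aa) for aa in peptide)
```

Always taking the single most frequent codon is the simplest possible strategy; approaching the host's overall codon frequency distribution, as the paragraph describes, would instead sample codons in proportion to their usage.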
[0095] Sequence alignments and percent similarity calculations may
be determined using the Megalign program of the LASARGENE
bioinformatics computing suite (DNASTAR Inc., Madison, Wis.).
Multiple alignment of the sequences is performed using the Clustal
method of alignment (Higgins and Sharp, CABIOS 5:151-153 (1989))
with the default parameters (GAP PENALTY=10, GAP LENGTH
PENALTY=10). Default parameters for pairwise alignments and
calculation of percent identity of protein sequences using the
Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS
SAVED=5. For nucleic acids these parameters are GAP PENALTY=10, GAP
LENGTH PENALTY=10, KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS
SAVED=4. A "substantial portion" of an amino acid or nucleotide
sequence comprises enough of the amino acid sequence of a
polypeptide or the nucleotide sequence of a gene to afford putative
identification of that polypeptide or gene, either by manual
evaluation of the sequence by one skilled in the art, or by
computer-automated sequence comparison and identification using
algorithms such as BLAST (Altschul, S. F. et al., J. Mol. Biol.
215:403-410 (1990)) and Gapped Blast (Altschul, S. F. et al.,
Nucleic Acids Res. 25:3389-3402 (1997)). BLASTN refers to a BLAST
program that compares a nucleotide query sequence against a
nucleotide sequence database.
[0096] The term "gene" refers to a nucleic acid fragment that
expresses a specific protein, including regulatory sequences
preceding (5' non-coding sequences) and following (3' non-coding
sequences) the coding sequence. "Native gene" refers to a gene as
found in nature with its own regulatory sequences. "Chimeric gene"
or "recombinant expression construct", which are used
interchangeably, refers to any gene that is not a native gene,
comprising regulatory and coding sequences that are not found
together in nature. Accordingly, a chimeric gene may comprise
regulatory sequences and coding sequences that are derived from
different sources, or regulatory sequences and coding sequences
derived from the same source, and arranged in a manner different
than that found in nature. "Endogenous gene" refers to a native
gene in its natural location in the genome of an organism. A
"foreign" gene refers to a gene not normally found in the host
organism, which is introduced into the host organism by gene
transfer. Foreign genes can comprise native genes inserted into a
non-native organism, or chimeric genes. A "transgene" is a gene
that has been introduced into the genome by a transformation
procedure.
[0097] "Coding sequence" refers to a DNA sequence that encodes for
a specific amino acid sequence. "Regulatory sequences" refer to
nucleotide sequences located upstream (5' non-coding sequences),
within, or downstream (3' non-coding sequences) of a coding
sequence, and which influence the transcription, RNA processing or
stability, or translation of the associated coding sequence.
Regulatory sequences may include, and are not limited to,
promoters, enhancers, translation leader sequences, introns, and
polyadenylation recognition sequences.
[0098] The "translation leader sequence" refers to a DNA sequence
located between the promoter sequence of a gene and the coding
sequence. The translation leader sequence is present in the fully
processed mRNA upstream of the translation start sequence. The
translation leader sequence may affect processing of the primary
transcript to mRNA, mRNA stability or translation efficiency.
Examples of translation leader sequences have been described
(Turner, R. and Foster, G. D., Molecular Biotechnology 3:225
(1995)).
[0099] The "3' non-coding sequences" refer to DNA sequences located
downstream of a coding sequence and include polyadenylation
recognition sequences and other sequences encoding regulatory
signals capable of affecting mRNA processing or gene expression.
The polyadenylation signal is usually characterized as affecting
the addition of polyadenylic acid tracts to the 3' end of the mRNA
precursor. The use of different 3' non-coding sequences is
exemplified by Ingelbrecht et al., Plant Cell 1:671-680 (1989).
[0100] "RNA transcript" refers to a product resulting from RNA
polymerase-catalyzed transcription of a DNA sequence. When an RNA
transcript is a perfect complementary copy of a DNA sequence, it is
referred to as a primary transcript, or it may be an RNA sequence
derived from posttranscriptional processing of a primary transcript
and is referred to as a mature RNA. "Messenger RNA" ("mRNA") refers
to RNA that is without introns and that can be translated into
protein by the cell. "cDNA" refers to a DNA that is complementary
to and synthesized from an mRNA template using the enzyme reverse
transcriptase. The cDNA can be single-stranded or converted into
the double-stranded form using the Klenow fragment of DNA polymerase I.
"Sense" RNA refers to an RNA transcript that includes mRNA and so can
be translated into protein within a cell or in vitro. "Antisense
RNA" refers to an RNA transcript that is complementary to all or
part of a target primary transcript or mRNA and that blocks
expression or transcript accumulation of a target gene. The
complementarity of an antisense RNA may be with any part of the
specific gene transcript, i.e. at the 5' non-coding sequence, 3'
non-coding sequence, introns, or the coding sequence. "Functional
RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may
not be translated yet has an effect on cellular processes.
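The relationship between a sense transcript and an antisense RNA can be sketched as a reverse-complement operation. The example below is illustrative only: it treats the transcript as a plain string over A, C, G and U, ignores processing and modified bases, and uses hypothetical function names.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def transcribe(coding_strand_dna: str) -> str:
    """Sense RNA with the same sequence as the DNA coding strand (T replaced by U)."""
    return coding_strand_dna.upper().replace("T", "U")


def antisense_rna(sense_rna: str) -> str:
    """Reverse complement of a sense RNA, written 5' to 3'."""
    return sense_rna.upper().translate(_RNA_COMPLEMENT)[::-1]
```

Applying `antisense_rna` twice returns the original sense sequence, which is a quick sanity check on the complement table.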
[0101] The term "operably linked" refers to the association of
nucleic acid sequences on a single polynucleotide so that the
function of one is affected by the other. For example, a promoter
is operably linked with a heterologous nucleotide sequence, e.g., a
coding sequence, when it is capable of affecting the expression of
that heterologous nucleotide sequence (i.e., for example, the
coding sequence is under the transcriptional control of the
promoter). A coding sequence can be operably linked to promoter
sequences in sense or antisense orientation.
[0102] The terms "initiate transcription", "initiate expression",
"drive transcription", and "drive expression" are used
interchangeably herein and all refer to the primary function of a
promoter. As detailed throughout this disclosure, a promoter is a
non-coding genomic DNA sequence, usually upstream (5') to the
relevant coding sequence, and its primary function is to act as a
binding site for RNA polymerase and initiate transcription by the
RNA polymerase. Additionally, there is "expression" of RNA,
including functional RNA, or the expression of polypeptide for
operably linked encoding nucleotide sequences, as the transcribed
RNA ultimately is translated into the corresponding
polypeptide.
[0103] The term "expression", as used herein, refers to the
production of a functional end-product, e.g., an mRNA or a protein
(precursor or mature).
[0104] The terms "recombinant DNA construct" and "recombinant
expression construct" are used interchangeably and refer to a
discrete polynucleotide into which a nucleic acid sequence or
fragment can be moved. Preferably, it is a plasmid vector or a
fragment thereof comprising the promoters of the present invention.
The choice of plasmid vector is dependent upon the method that will
be used to transform host plants. The skilled artisan is well aware
of the genetic elements that must be present on the plasmid vector
in order to successfully transform, select and propagate host cells
containing the recombinant DNA construct. The skilled artisan will
also recognize that different independent transformation events
will result in different levels and patterns of expression (Jones
et al., EMBO J. 4:2411-2418 (1985); De Almeida et al., Mol. Gen.
Genetics 218:78-86 (1989)), and thus that multiple events must be
screened in order to obtain lines displaying the desired expression
level and pattern. Such screening may be accomplished by PCR and
Southern analysis of DNA, RT-PCR and Northern analysis of mRNA
expression, Western analysis of protein expression, or phenotypic
analysis.
[0105] Expression or overexpression of a gene involves
transcription of the gene and translation of the mRNA into a
precursor or mature protein. "Antisense inhibition" refers to the
production of antisense RNA transcripts capable of suppressing the
expression of the target protein. "Overexpression" refers to the
production of a gene product in transgenic organisms that exceeds
levels of production in normal or non-transformed organisms.
"Co-suppression" refers to the production of sense RNA transcripts
capable of suppressing the expression or transcript accumulation of
identical or substantially similar foreign or endogenous genes
(U.S. Pat. No. 5,231,020). The mechanism of co-suppression may be
at the DNA level (such as DNA methylation), at the transcriptional
level, or at post-transcriptional level.
[0106] Co-suppression constructs in plants previously have been
designed by focusing on overexpression of a nucleic acid sequence
having homology to an endogenous mRNA, in the sense orientation,
which results in the reduction of all RNA having homology to the
overexpressed sequence (see Vaucheret et al., Plant J. 16:651-659
(1998); and Gura, Nature 404:804-808 (2000)). The overall
efficiency of this phenomenon is low, and the extent of the RNA
reduction is widely variable. Recent work has described the use of
"hairpin" structures that incorporate all, or part, of an mRNA
encoding sequence in a complementary orientation that results in a
potential "stem-loop" structure for the expressed RNA (WO99/53050
and WO02/00904). This increases the frequency of co-suppression in
the recovered transgenic plants. Another variation describes the
use of plant viral sequences to direct the suppression, or
"silencing", of proximal mRNA encoding sequences (WO98/36083).
Neither of these co-suppressing phenomena has been elucidated
mechanistically at the molecular level, although genetic evidence
has been obtained that may lead to the identification of potential
components (Elmayan et al., Plant Cell 10:1747-1757 (1998)).
[0107] As stated herein, "suppression" refers to a reduction of the
level of enzyme activity or protein functionality (e.g., a
phenotype associated with a protein) detectable in a transgenic
plant when compared to the level of enzyme activity or protein
functionality detectable in a non-transgenic or wild type plant
with the native enzyme or protein. The level of enzyme activity in
a plant with the native enzyme is referred to herein as "wild type"
activity. The level of protein functionality in a plant with the
native protein is referred to herein as "wild type" functionality.
The term "suppression" includes lower, reduce, decline, decrease,
inhibit, eliminate and prevent. This reduction may be due to a
decrease in translation of the native mRNA into an active enzyme or
functional protein. It may also be due to the transcription of the
native DNA into decreased amounts of mRNA and/or to rapid
degradation of the native mRNA. The term "native enzyme" refers to
an enzyme that is produced naturally in a non-transgenic or wild
type cell. The terms "non-transgenic" and "wild type" are used
interchangeably herein.
[0108] "Altering expression" refers to the production of gene
product(s) in transgenic organisms in amounts or proportions that
differ significantly from the amount of the gene product(s)
produced by the corresponding wild-type organisms (i.e., expression
is increased or decreased).
[0109] "Transformation" refers to the transfer of a nucleic acid
fragment into the genome of a host organism, resulting in
genetically stable inheritance. Host organisms containing the
transformed nucleic acid fragments are referred to as "transgenic"
organisms. Thus, a "transgenic plant cell" as used herein refers to
a plant cell containing the transformed nucleic acid fragments. The
preferred method of soybean cell transformation is use of
particle-accelerated or "gene gun" transformation technology
(Klein, T., Nature (London) 327:70-73 (1987); U.S. Pat. No.
4,945,050).
[0110] "Transient expression" refers to the temporary expression,
often of reporter genes such as β-glucuronidase (GUS) or the
fluorescent protein genes GFP, ZS-YELLOW1 N1, AM-CYAN1, and DS-RED,
in selected cell types of the host organism into which the
transgene is introduced temporarily by a transformation
method. The transformed materials of the host organism are
subsequently discarded after the transient gene expression
assay.
[0111] A "marketable flower trait" is a characteristic or phenotype
of the flower of a plant such as the color, scent or morphology of
a flower. The marketable flower trait is a characteristic of a
flower that is of high regard to a flower crop consumer in deciding
whether to purchase the flower crop.
[0112] The phrase "genes involved in anthocyanin biosynthesis"
refers to genes that encode proteins that play a role in converting
metabolic precursors into one of a number of anthocyanins.
Examples of genes involved in the biosynthesis of anthocyanin are
dihydroflavonol 4-reductase, flavonoid 3,5-hydroxylase, chalcone
synthase, chalcone isomerase, flavonoid 3-hydroxylase, anthocyanin
synthase, and UDP-glucose 3-O-flavonoid glucosyl transferase (see,
e.g., Mori et al., Plant Cell Reports 22:415-421 (2004)).
[0113] The phrase "genes involved in the biosynthesis of fragrant
fatty acid derivatives" refers to genes that encode proteins that
play a role in manipulating the biosynthesis of fragrant fatty acid
derivatives such as terpenoids, phenylpropanoids, and benzenoids in
flowers (see, e.g., Tanaka et al., Plant Cell, Tissue and Organ
Culture 80:1-24 (2005)). Examples of such genes include S-linalool
synthase, acetyl CoA:benzylalcohol acetyltransferase, benzoyl
CoA:benzylalcohol benzoyl transferase,
S-adenosyl-L-methionine:benzoic acid carboxylmethyl transferase
(BAMT), myrcene synthases, (E)-β-ocimene synthase, orcinol
O-methyltransferase, and limonene synthases (see, e.g., Tanaka et
al., supra).
[0114] The term "flower homeotic genes" or "flower morphology
modifying genes" refers to genes that are involved in pathways
associated with flower morphology. A modification of flower
morphology can lead to a novel form of the respective flower that
can enhance its value in the flower crop marketplace. Morphology
can include the size, shape, or petal pattern of a flower. Some
examples of flower homeotic genes include genes involved in
cell-fate determination (in the ABC combinatorial model of gene
expression), including AGAMOUS, which determines carpel fate in the
central whorl, APETALA3, which determines the sepal fate in the
outer whorl, and PISTILLATA, which determines petal development in
the second whorl (Espinosa-Soto et al., Plant Cell 16:2923-2939
(2004)).
[0115] Standard recombinant DNA and molecular cloning techniques
used herein are well known in the art and are described more fully
in Sambrook, J. et al., In Molecular Cloning: A Laboratory Manual;
2nd ed.; Cold Spring Harbor Laboratory Press: Cold Spring
Harbor, N.Y., 1989 (hereinafter "Sambrook et al., 1989") or
Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman,
J. G., Smith, J. A. and Struhl, K., Eds.; In Current Protocols in
Molecular Biology; John Wiley and Sons: New York, 1990 (hereinafter
"Ausubel et al., 1990").
[0116] "PCR" or "Polymerase Chain Reaction" is a technique for the
synthesis of large quantities of specific DNA segments consisting
of a series of repetitive cycles (Perkin Elmer Cetus Instruments,
Norwalk, Conn.). Typically, the double stranded DNA is heat
denatured; the two primers complementary to the 3' boundaries of
the target segment are annealed at low temperature and then
extended at an intermediate temperature. One set of these three
consecutive steps comprises a cycle.
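The repetitive-cycle description above implies roughly exponential growth in the number of target copies. As a rough sketch, assuming each cycle multiplies the copy number by (1 + efficiency), where an efficiency of 1.0 means perfect doubling, the expected yield can be estimated as follows. The `efficiency` parameter and function name are assumptions of this sketch, and real reactions plateau in later cycles.

```python
def copies_after(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of PCR cycles.

    Each cycle (denature, anneal, extend) multiplies the copy number by
    (1 + efficiency); efficiency must lie between 0.0 and 1.0.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0.0 and 1.0")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles
```

With perfect doubling, a single template molecule yields 2 to the power of 30 copies (about a billion) after 30 cycles, which is why the technique produces large quantities of a specific segment.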
[0117] Embodiments of the present invention include isolated
polynucleotides comprising a nucleotide sequence that is a
promoter. In some instances, the nucleotide sequence includes one
or more of the following: [0118] a) the sequence set forth in SEQ
ID NO:1 or a full-length complement thereof; or [0119] b) a
nucleotide sequence comprising a sequence having at least 90%
sequence identity, based on the BLASTN method of alignment, when
compared to the sequence set forth in SEQ ID NO:1. In other
aspects, the nucleotide sequence includes one or more of the
following: [0120] (a) a nucleotide sequence comprising a fragment
of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5,
SEQ ID NO:6, or SEQ ID NO:7, or a full-length complement thereof;
or [0121] (b) a nucleotide sequence comprising a sequence having at
least 90% sequence identity, based on the BLASTN method of
alignment, when compared to the nucleotide sequence of (a). The
nucleotide sequences of the present invention can be referred to as
a promoter or as having promoter-like activity. In some embodiments
the nucleotide sequence is a promoter that preferentially initiates
transcription in a plant flower cell. Such a promoter is referred to
as a flower-specific promoter. Preferably the promoter of the
present invention is the soybean "SC194" promoter.
[0122] In a preferred embodiment, the promoter comprises the
nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID
NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7. The
present invention also includes nucleic acid fragments, variants,
and complements of the aforementioned nucleotide sequences or
promoters, provided that they are substantially similar and
functionally equivalent to the nucleotide sequences set forth in
these SEQ ID NOs. A nucleic acid fragment or variant that
is functionally equivalent to the present SC194 promoter is any
nucleic acid fragment or variant that is capable of initiating the
expression, preferably initiating flower-specific expression, of a
coding sequence or functional RNA in a similar manner to the SC194
promoter. The expression patterns of the SC194 gene and its promoter
are set forth in Examples 1, 2, 7, and 8. In one example, an
SC194 promoter fragment or variant will have an expression
pattern similar to that of the SC194 promoter.
[0123] In some aspects, a recombinant DNA construct can be formed
in part by operably linking at least one of the promoters of the
present invention to any heterologous nucleotide sequence. The
heterologous nucleotide sequence can be expressed in a cell as
either a functional RNA or a polypeptide. The cell for expression
includes a plant or bacterial cell, preferably a plant cell. The
recombinant DNA construct preferably includes the SC194 promoter.
The recombinant DNA construct preferably includes a heterologous
nucleotide sequence that encodes a protein that plays a role in
flower color formation, fragrance production, or shape/morphology
development of the flower. The color of a flower can be altered
transgenically by expressing genes involved in betalain,
carotenoid, or flavonoid biosynthesis. In regard to genes involved
in the biosynthesis of anthocyanin, dihydroflavonol 4-reductase,
flavonoid 3,5-hydroxylase, chalcone synthase, chalcone isomerase,
flavonoid 3-hydroxylase, anthocyanin synthase, and UDP-glucose
3-O-flavonoid glucosyl transferase are some examples. The scent of
a flower can be altered transgenically by expressing genes that
manipulate the biosynthesis of fragrant fatty acid derivatives such
as terpenoids, phenylpropanoids, and benzenoids in flowers. Some
embodiments of the invention include a heterologous nucleotide
sequence that is selected from S-linalool synthase, acetyl
CoA:benzylalcohol acetyltransferase, benzoyl CoA:benzylalcohol
benzoyl transferase, S-adenosyl-L-methionine:benzoic acid
carboxylmethyl transferase, myrcene synthases, (E)-β-ocimene
synthase, orcinol O-methyltransferase, or limonene synthases.
Flower structures/morphologies can be altered transgenically by
expressing flower homeotic genes to create novel ornamental
varieties. Some embodiments of the invention include a heterologous
nucleotide sequence that is selected from genes such as, for
example, AGAMOUS, APETALA3, and PISTILLATA.
[0124] It is recognized that the instant promoters may be used with
their native coding sequences to increase or decrease expression in
flower tissue. The selection of the heterologous nucleic acid
fragment depends upon the desired application or phenotype to be
achieved. The various nucleic acid sequences can be manipulated so
as to provide for the nucleic acid sequences in the proper
orientation.
[0125] Plasmid vectors comprising the instant recombinant DNA
construct can be constructed. The choice of plasmid vector is
dependent upon the method that will be used to transform host
cells. The skilled artisan is well aware of the genetic elements
that must be present on the plasmid vector in order to successfully
transform, select and propagate host cells containing the
recombinant DNA construct.
[0126] The described polynucleotide embodiments encompass isolated
or substantially purified nucleic acid compositions. An "isolated"
or "purified" nucleic acid molecule, or biologically active portion
thereof, is substantially free of other cellular material or
culture medium when produced by recombinant techniques, or
substantially free of chemical precursors or other chemicals when
chemically synthesized. An "isolated" nucleic acid is essentially
free of sequences (preferably protein encoding sequences) that
naturally flank the polynucleotide (i.e., sequences located at the
5' and 3' ends of the nucleic acid) in the genomic DNA of the
organism from which the polynucleotide is derived. For example, in
various embodiments, the isolated polynucleotide can contain less
than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of
nucleotide sequences that naturally flank the polynucleotide in
genomic DNA of the cell from which the polynucleotide is
derived.
[0127] In another embodiment, the present invention includes host
cells comprising either the recombinant DNA constructs or isolated
polynucleotides of the present invention. Examples of the host
cells of the present invention include, and are not limited to,
yeast, bacteria, and plants, including flower crops such as, e.g.,
rose, carnation, Gerbera, Chrysanthemum, tulip, Gladioli,
Alstroemeria, Anthurium, lisianthus, larkspur, irises, orchid,
snapdragon, African violet, or azalea. Preferably, the host cells
are plant cells, and more preferably, flower crop cells, and more
preferably, Gerbera, rose, carnation, Chrysanthemum, or tulip
cells.
[0128] Methods for transforming dicots, primarily by use of
Agrobacterium tumefaciens, and obtaining transgenic plants have
been published, among others, for cotton (U.S. Pat. No. 5,004,863,
U.S. Pat. No. 5,159,135); soybean (U.S. Pat. No. 5,569,834, U.S.
Pat. No. 5,416,011); Brassica (U.S. Pat. No. 5,463,174); peanut
(Cheng et al., Plant Cell Rep. 15:653-657 (1996); McKently et al.,
Plant Cell Rep. 14:699-703 (1995)); papaya (Ling et al.,
Bio/technology 9:752-758 (1991)); and pea (Grant et al., Plant Cell
Rep. 15:254-258 (1995)). For a review of other commonly used
methods of plant transformation see Newell, C. A., Mol. Biotechnol.
16:53-65 (2000). One of these methods of transformation uses
Agrobacterium rhizogenes (Tepfer, M. and Casse-Delbart, F.,
Microbiol. Sci. 4:24-28 (1987)). Transformation of soybeans using
direct delivery of DNA has been published using PEG fusion (WO
92/17598), electroporation (Chowrira et al., Mol. Biotechnol.
3:17-23 (1995); Christou et al., Proc. Natl. Acad. Sci. U.S.A.
84:3962-3966 (1987)), microinjection (Neuhaus et al., Physiol.
Plant. 79:213-217 (1990)), or particle bombardment (McCabe et al.,
Biotechnology 6:923 (1988); Christou et al., Plant Physiol.
87:671-674 (1988)).
[0129] In another embodiment, the present invention includes
transgenic plants comprising the recombinant DNA constructs
provided herein. The transgenic plants are selected from, for
example, one of a number of various flower crops including roses,
carnations, Gerberas, Chrysanthemums, tulips, Gladioli,
Alstroemerias, Anthuriums, lisianthuses, larkspurs, irises,
orchids, snapdragons, African violets, azaleas, in addition to
other less popular flower crops.
[0130] In some embodiments of the invention, there are provided
transgenic seeds produced by the transgenic plants provided. Such
seeds are able to produce another generation of transgenic
plants.
[0131] There are a variety of methods for the regeneration of
plants from plant tissues. The particular method of regeneration
will depend on the starting plant tissue and the particular plant
species to be regenerated. The regeneration, development and
cultivation of plants from single plant protoplast transformants or
from various transformed explants is well known in the art
(Weissbach and Weissbach, Eds.; In Methods for Plant Molecular
Biology; Academic Press, Inc.: San Diego, Calif., 1988). This
regeneration and growth process typically includes the steps of
selecting transformed cells and culturing those individualized
cells through the usual stages of embryonic development to the
rooted plantlet stage. Transgenic embryos and seeds are similarly
regenerated. The resulting transgenic rooted shoots are thereafter
planted in an appropriate plant growth medium such as soil.
Preferably, the regenerated plants are self-pollinated to provide
homozygous transgenic plants. Otherwise, pollen obtained from the
regenerated plants is crossed to seed-grown plants of agronomically
important lines. Conversely, pollen from plants of these important
lines is used to pollinate regenerated plants. A transgenic plant
of the present invention containing a desired polypeptide is
cultivated using methods well known to one skilled in the art.
[0132] In addition to the above discussed procedures, there are
generally available standard resource materials that describe
specific conditions and procedures for the construction,
manipulation and isolation of macromolecules (e.g., DNA molecules,
plasmids, and the like), generation of recombinant DNA fragments
and recombinant expression constructs, and the screening and
isolating of clones (see, for example, Sambrook et al., 1989;
Maliga et al., In Methods in Plant Molecular Biology; Cold Spring
Harbor Press, 1995; Birren et al., In Genome Analysis: Detecting
Genes, 1; Cold Spring Harbor: New York, 1998; Birren et al., In
Genome Analysis: Analyzing DNA, 2; Cold Spring Harbor: New York,
1998; Clark, Ed., In Plant Molecular Biology: A Laboratory Manual;
Springer: New York, 1997).
[0133] The skilled artisan will also recognize that different
independent transformation events will result in different levels
and patterns of expression of the chimeric genes (Jones et al.,
EMBO J. 4:2411-2418 (1985); De Almeida et al., Mol. Gen. Genetics
218:78-86 (1989)). Thus, multiple events must be screened in order
to obtain lines displaying the desired expression level and
pattern. Such screening may be accomplished by northern analysis of
mRNA expression, western analysis of protein expression, or
phenotypic analysis. Also of interest are seeds obtained from
transformed plants displaying the desired expression profile.
[0134] The level of activity of the SC194 promoter in flowers is in
some cases comparable to that of many known strong promoters such
as the CaMV 35S promoter (Atanassova et al., Plant Mol. Biol.
37:275-285 (1998); Battraw and Hall, Plant Mol. Biol. 15:527-538
(1990); Holtorf et al., Plant Mol. Biol. 29:637-646 (1995);
Jefferson et al., EMBO J. 6:3901-3907 (1987); Wilmink et al., Plant
Mol. Biol. 28:949-955 (1995)), the Arabidopsis oleosin promoters
(Plant et al., Plant Mol. Biol. 25:193-205 (1994); Li, Texas
A&M University Ph.D. dissertation, pp. 107-128 (1997)), the
Arabidopsis ubiquitin extension protein promoters (Callis et al.,
J. Biol. Chem. 265(21):12486-12493 (1990)), a tomato ubiquitin gene
promoter (Rollfinke et al., Gene 211:267-276 (1998)), a soybean
heat shock protein promoter (Raschke et al., J. Mol. Biol.
199(4):549-557 (1988)), and a maize H3 histone gene promoter
(Atanassova et al., Plant Mol. Biol. 37:275-285 (1998)).
[0135] In some embodiments, the promoters of the present invention
are useful when flower-specific expression of a target heterologous
nucleic acid fragment is required. Another useful feature of the
promoters is their expression profile, having high levels in
developing flowers and low levels in young developing seeds (See
Example 1). The promoters of the present invention are most active
in developing flower buds and open flowers, while still having
activity although approximately ten times lower in developing
seeds. Thus, the promoters can be used for gene expression or gene
silencing in flowers, especially when gene expression or gene
silencing is desired predominantly in flowers along with a lower
degree in developing seeds.
[0136] In some embodiments, the promoters of the present invention
are used to construct recombinant DNA constructs that can be used
to reduce expression of at least one heterologous nucleic acid
sequence in a plant cell. To accomplish this, a recombinant DNA
construct can be constructed by linking the heterologous nucleic
acid sequence to a promoter of the present invention. (See, e.g.,
U.S. Pat. No. 5,231,020, WO99/53050, WO02/00904, and WO98/36083 for
methodology to block plant gene expression via cosuppression.)
Alternatively, recombinant DNA constructs designed to express
antisense RNA for a heterologous nucleic acid fragment can be
constructed by linking the fragment in reverse orientation to a
promoter of the present invention. (See, e.g., U.S. Pat. No.
5,107,065 for methodology to block plant gene expression via
antisense RNA.) Either the cosuppression or antisense chimeric gene
can be introduced into plants via transformation. Transformants,
wherein expression of the heterologous nucleic acid sequence is
decreased or eliminated, are then selected.
[0137] There are embodiments of the present invention that include
promoters of the present invention being utilized for methods of
altering (increasing or decreasing) the expression of at least one
heterologous nucleic acid sequence in a plant cell which comprises:
transforming a plant cell with a recombinant DNA expression
construct described herein; growing fertile mature plants from the
transformed plant cell; and selecting plants containing a
transformed plant cell wherein the expression of the heterologous
nucleotide sequence is altered (increased or decreased).
[0138] Transformation and selection can be accomplished using
methods well-known to those skilled in the art including, but not
limited to, the methods described herein.
[0139] There are provided some embodiments that include methods of
expressing a coding sequence in a plant that is a flower crop
comprising: introducing a recombinant DNA construct disclosed
herein into the plant; growing the plant; and selecting a plant
displaying expression of the coding sequence; wherein the
nucleotide sequence comprises: a nucleotide sequence comprising the
sequence set forth in SEQ ID NO:1 or a full-length complement
thereof; a nucleotide sequence comprising a fragment of the
sequence set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or a full-length
complement thereof, or in alternative embodiments, the sequence set
forth in SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:6, or SEQ ID NO:7; or a nucleotide sequence comprising a
sequence having at least 90% sequence identity, based on the BLASTN
method of alignment, when compared to the sequence set forth in SEQ
ID NO:1; wherein said nucleotide sequence initiates transcription
in a flower cell of the plant.
[0140] Furthermore, some embodiments of the present invention
include methods of transgenically altering a marketable flower
trait of a flowering plant, comprising: introducing a recombinant
DNA construct disclosed herein into the flowering plant; growing a
fertile, mature flowering plant resulting from the introducing
step; and selecting a flowering plant expressing the heterologous
nucleotide sequence in flower tissue based on the altered
marketable flower trait.
[0141] As further described in the Examples below, the promoter
activity of the soybean genomic DNA fragment sequence SEQ ID NO:1
upstream of the SC194 protein coding sequence was assessed by
linking the fragment to a yellow fluorescence reporter gene,
ZS-YELLOW1 N1 (YFP) (Matz et al., Nat. Biotechnol. 17:969-973
(1999)), transforming the promoter::YFP expression cassette into
soybean, and analyzing YFP expression in various cell types of the
transgenic plants (see Examples 7 and 8). All parts of the
transgenic plants were analyzed and YFP expression was
predominantly detected in flowers. These results indicated that the
nucleic acid fragment contained a flower-preferred promoter.
[0142] Some embodiments of the present invention provide
recombinant DNA constructs comprising at least one isopentenyl
transferase nucleic acid sequence operably linked to a provided
promoter, preferably a SC194 promoter. The isopentenyl transferase
catalyzes a key step in the biosynthesis of plant cytokinin (Kakimoto,
J. Plant Res. 116:233-239 (2003)). Elevated levels of cytokinin in
plant cells might help to delay floral senescence and abortion
which may present a potential way to improve crop yields (Chang et
al., Plant Physiol. 132:2174-2183 (2003); Young et al., Plant J.
38:910-922 (2004)).
[0143] Utilities for Flower-Specific Promoters
[0144] The color, scent or morphology of a flower represents
marketable flower traits, or characteristics/phenotypes of a flower
that consumers, particularly floriculturalists, consider when
determining which flowers are desirable and will be purchased.
Hence, it would be beneficial to be able to alter these
characteristics in order to satisfy the desires of consumers.
Transgenic technologies can be implemented in order to achieve such
results.
[0145] The phenotype of a flower can be altered transgenically by
expressing genes, preferably in flower tissue, that play a role in
color formation, fragrance production, or shape/morphology
development of the flower. This type of alteration is particularly
useful in the floriculture industry, and particularly useful for
flowering plants.
[0146] The color of a flower is mainly the result of three types of
pigment: flavonoids, carotenoids, and betalains. The flavonoids are
the most common of the three and they contribute to colors ranging
from yellow to red to blue, with anthocyanins being the major
flavonoid. Carotenoids are C-40 tetraterpenoids that contribute to
the majority of yellow hues and contribute to orange/red, bronze
and brown colors, e.g., that seen in roses and chrysanthemums.
Betalains are the least abundant and contribute to various hues of
ivory, yellow, orange, red and violet. The color of a flower can be
altered transgenically by expressing genes involved in, e.g.,
betalain, carotenoid, or flavonoid biosynthesis. In one example,
the color of a flower can be altered transgenically by expressing
genes involved in the biosynthesis of anthocyanin, for example,
dihydroflavonol 4-reductase, flavonoid 3,5-hydroxylase, chalcone
synthase, chalcone isomerase, flavonoid 3-hydroxylase, anthocyanin
synthase, and UDP-glucose 3-O-flavonoid glucosyl transferase. In
some aspects of the invention, the gene involved in anthocyanin
biosynthesis is the flavonoid 3,5-hydroxylase gene (see, e.g., Mori
et al., supra). This type of alteration is particularly useful in
the floriculture industry, providing novel flower colors in flower
crops.
[0147] In addition to color, the scent of a flower can be altered
transgenically by expressing genes that manipulate the biosynthesis
of fragrant fatty acid derivatives such as terpenoids,
phenylpropanoids, and benzenoids in flowers (see, e.g., Tanaka et
al., supra). Genes involved in the biosynthesis of fragrant fatty
acid derivatives can be operably linked to the flower-specific
promoters presently described for preferential expression in flower
tissue. The preferential expression in flower tissue can be
utilized to generate new and desirable fragrances to enhance the
demand for the underlying cut flower. A number of known genes that
are involved in the biosynthesis of floral scents are described
below. A strong sweet scent can be generated in a flower by
introducing or upregulating expression of S-linalool synthase,
which was earlier isolated from Clarkia breweri. Two genes that are
responsible for the production of benzylacetate and benzylbenzoate
are acetyl CoA:benzylalcohol acetyltransferase and benzoyl
CoA:benzylalcohol benzoyl transferase, respectively. These
transferases were also reported to have been isolated from C.
breweri. A phenylpropanoid floral scent, methylbenzoate, is
synthesized in part by S-adenosyl-L-methionine:benzoic acid
carboxylmethyl transferase (BAMT), which catalyzes the final step
in the biosynthesis of methyl benzoate. BAMT is known to have a
significant role in the emission of methyl benzoate in snapdragon
flowers. Two monoterpenes, myrcene and (E)-β-ocimene, from
snapdragon are known to be synthesized in part by the terpene
synthases: myrcene synthases and (E)-β-ocimene synthases.
Other genes involved in biosynthesis of floral scents have been
reported and are being newly discovered, many of which are isolated
from rose. Some genes involved in scent production in the rose
include orcinol O-methyltransferase, for synthesis of
S-adenosylmethionine, and limonene synthases (see, e.g., Tanaka et
al., supra).
[0148] Flower structures/morphologies can be altered transgenically
by expressing flower homeotic genes to create novel ornamental
varieties. The flower homeotic genes that are determinative of
flower morphology include genes such as AGAMOUS, APETALA3,
PISTILLATA, and others that are known and/or are being elucidated
(see, e.g., Espinosa-Soto et al., supra).
EXAMPLES
[0149] Aspects of the present invention are exemplified in the
following Examples. It should be understood that these Examples,
while indicating preferred embodiments of the invention, are given
by way of illustration only. From the above discussion and these
Examples, one skilled in the art can ascertain the essential
characteristics of this invention, and without departing from the
spirit and scope thereof, can make various changes and
modifications of the invention to adapt it to various usages and
conditions. Thus, various modifications of the invention in
addition to those shown and described herein will be apparent to
those skilled in the art from the foregoing description. Such
modifications are also intended to fall within the scope of the
appended claims.
[0150] In the discussion below, parts and percentages are by weight
and degrees are Celsius, unless otherwise stated. Sequences of
promoters, cDNA, adaptors, and primers listed herein are in the 5'
to 3' orientation unless described otherwise. Techniques in
molecular biology were typically performed as described in Ausubel
et al., 1990 or Sambrook et al., 1989.
Example 1
Lynx MPSS Profiling of Soybean Genes Preferably Expressed in
Flowers
[0151] Soybean expressed sequence tags (ESTs) were generated by
sequencing randomly selected clones from cDNA libraries constructed
from different soybean tissues. Multiple EST sequences may have
different lengths representing different regions of the same
soybean gene. For those EST sequences representing the same gene
that are found more frequently in a flower-specific cDNA library,
there is a possibility that the representative gene could be a
flower preferred gene candidate. Multiple EST sequences
representing the same soybean gene were compiled electronically
based on their overlapping sequence homology into a full length
sequence representing a unique gene. These assembled, unique gene
sequences were cumulatively collected and the information was
stored in a searchable database. Flower specific candidate genes
were identified by searching this database to find gene sequences
that are frequently found in flower libraries but are rarely found
in other tissue libraries, or not found in other tissue
libraries.
[0152] One unique gene, PSO375649, was identified in the search as
a flower specific gene candidate since all of the ESTs representing
PSO375649 were found only in flower tissue. PSO375649 cDNA sequence
(SEQ ID NO:19) as well as its putative translated protein sequence
(SEQ ID NO:20) were used to search National Center for
Biotechnology Information (NCBI) databases. PSO375649 was found to
represent a novel soybean gene without significant homology to any
known gene. PSO375649 was subsequently named after its genomic DNA
clone lab name SC194.
[0153] A more sensitive gene expression profiling methodology, the
MPSS (Massively Parallel Signature Sequencing) transcript profiling technique
(Brenner et al., Proc. Natl. Acad. Sci. USA 97:1665-70 (2000)) was
used to confirm PSO375649 as a flower specific gene. The MPSS
technology involves the generation of 17 base signature tags from
mRNA samples that have been reverse transcribed from poly A+ RNA
isolated using standard molecular biology techniques (Sambrook et
al., 1989). The tags are simultaneously sequenced and assigned to
genes or ESTs. The abundance of these tags is given a number value
that is normalized to parts per million (PPM) which then allows the
tag expression, or tag abundance, to be compared across different
tissues. Genome wide gene expressions can be profiled
simultaneously using this technology. Since each 17 base tag is
long enough to be specific to only one or a few genes in any
genome, the MPSS platform can be used to determine the expression
pattern of a particular gene and its expression levels in different
tissues.
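The assignment of a 17 base signature tag to a gene, as described above, can be sketched in Python. This is only a minimal illustration and not the MPSS analysis software; the function name `find_tag_positions` and the toy sequences are hypothetical and are not SEQ ID NO:19 or SEQ ID NO:24. Coordinates are reported 1-based and inclusive, which is an assumption chosen to match the way cDNA positions are quoted in this document.

```python
def find_tag_positions(cdna: str, tag: str) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (start, end) coordinates of each exact tag match."""
    cdna, tag = cdna.upper(), tag.upper()
    hits = []
    start = cdna.find(tag)
    while start != -1:
        hits.append((start + 1, start + len(tag)))
        # Advance by one base so overlapping matches are also reported.
        start = cdna.find(tag, start + 1)
    return hits


# Hypothetical toy data, NOT the sequences of this application.
toy_tag = "GATCAAGGTTCCAATTG"  # 17 bases
toy_cdna = "ATGGCC" + toy_tag + "TTTAAACCC"

hits = find_tag_positions(toy_cdna, toy_tag)
print(hits)
```

A tag that returns exactly one coordinate pair in one gene, and none in other genes, can be treated as diagnostic for that gene; a tag with several hits would need to be interpreted with more care.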
[0154] MPSS gene expression profiles were generated from different
soybean tissues over time, and the profiles were accumulated in a
searchable database. PSO375649 cDNA sequence SEQ ID NO:19 was used
to search the MPSS database to identify a MPSS tag sequence (SEQ ID
NO:24) that is identical to a 17 base pair region from position 352
to 368 in the PSO375649 cDNA sequence. The identified MPSS tag was
then used to search the MPSS database to reveal its abundance in
different soybean tissues. As illustrated in Table 1, the PSO375649
gene was confirmed to be highly abundant in flowers and pods, a
desired expression profile for its promoter to be able to express
genes in flowers and in early developing pods.
TABLE 1
Target Gene          PSO375649
MPSS Tag Sequence    SEQ ID NO:24

Tissue               Tag abundance (PPM)
Flower               4818
Pod                  61
Flower Bud           2759
Lateral Root         0
Leaf                 0
Petiole              0
Primary Root         0
Seed                 17
Stem                 0
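As a rough illustration of how the Table 1 tag abundances can be summarized, the following Python sketch computes the share of the total PPM signal that comes from floral tissues. The PPM values are copied from Table 1; the function name `tissue_fraction` and the choice to pool "Flower" and "Flower Bud" as floral tissue are assumptions made for this example only.

```python
# PPM values copied from Table 1 (MPSS tag SEQ ID NO:24).
ppm = {
    "Flower": 4818,
    "Pod": 61,
    "Flower Bud": 2759,
    "Lateral Root": 0,
    "Leaf": 0,
    "Petiole": 0,
    "Primary Root": 0,
    "Seed": 17,
    "Stem": 0,
}


def tissue_fraction(abundance: dict[str, int], tissues: list[str]) -> float:
    """Fraction of the summed tag abundance contributed by the named tissues."""
    total = sum(abundance.values())
    if total == 0:
        return 0.0
    return sum(abundance[t] for t in tissues) / total


floral_share = tissue_fraction(ppm, ["Flower", "Flower Bud"])
flower_to_pod = ppm["Flower"] / ppm["Pod"]
print(f"floral share of signal: {floral_share:.3f}")
print(f"flower/pod ratio: {flower_to_pod:.1f}")
```

Because PPM values are normalized within each library, a pooled fraction of this kind is only a coarse indicator of tissue preference, not a quantitative measure of promoter strength.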
Example 2
Quantitative RT-PCR Profiles of SC194 Gene Expression in
Soybean
[0155] The MPSS profile of the SC194 gene, PSO375649, was confirmed
and extended by analyzing 14 different soybean tissues using the
relative quantitative RT-PCR (qRT-PCR) technique with a 7500 real
time PCR system (Applied Biosystems, Foster City, Calif.).
[0156] Fourteen soybean tissues (somatic embryo, somatic embryo
grown one week on charcoal plate, leaf, leaf petiole, root, flower
bud, open flower, R3 pod, R4 seed, R4 pod coat, R5 seed, R5 pod
coat, R6 seed, R6 pod coat) were collected from cultivar `Jack` and
flash frozen in liquid nitrogen. The seed and pod development
stages were defined according to descriptions in Fehr and Caviness,
IWSRBC 80:1-12 (1977). Total RNA was extracted with Trizol reagents
(Invitrogen, Carlsbad, Calif.) and treated with DNase I to remove
any trace amount of genomic DNA contamination. The first strand
cDNA was synthesized with Superscript III reverse transcriptase
(Invitrogen).
[0157] PCR analysis was performed to confirm that the cDNA was free
of genomic DNA. The forward and reverse primers used for PCR
analysis are shown in SEQ ID NO:21 and SEQ ID NO:22, respectively.
The primers are specific to the 5'UTR intron/exon junction region
of a soybean S-adenosylmethionine synthetase gene promoter
(WO00/37662). PCR using this primer set amplifies a 967 bp DNA
fragment from any soybean genomic DNA template and a 376 bp DNA
fragment from the cDNA template. The genomic DNA-free cDNA aliquots
were used in qRT-PCR analysis of PSO375649 using gene-specific
primers SEQ ID NO:25 and SEQ ID NO:26. An endogenous soybean ATP
sulfurylase gene was used as an internal control for normalization
with primers SEQ ID NO:27 and SEQ ID NO:28 and soybean wild type
genomic DNA was used as the calibrator for relative
quantification.
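The relative quantification described above, with an internal control gene and a calibrator sample, can be sketched as follows. The source does not state which calculation was applied; this example assumes the commonly used comparative Ct (2^-ddCt) calculation with roughly 100% amplification efficiency for both primer sets. The function name and all Ct values are hypothetical.

```python
def relative_quantity(ct_target: float, ct_control: float,
                      ct_target_cal: float, ct_control_cal: float) -> float:
    """Comparative Ct estimate of target abundance relative to the calibrator.

    ct_target / ct_control: Ct of the gene of interest and of the internal
    control (here, ATP sulfurylase) in the test sample.
    ct_target_cal / ct_control_cal: the same two Ct values in the calibrator.
    Assumes both amplicons double once per cycle.
    """
    delta_sample = ct_target - ct_control
    delta_calibrator = ct_target_cal - ct_control_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values for illustration only.
flower = relative_quantity(20.0, 22.0, 25.0, 22.0)
leaf = relative_quantity(25.0, 22.0, 25.0, 22.0)
print(flower, leaf)
```

Normalizing to the internal control compensates for differences in cDNA input between tissues, and dividing by the calibrator puts all tissues on one common scale.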
[0158] The qRT-PCR profiling of the SC194 gene expression confirmed
its predominant flower expression and also showed ongoing
expression at levels more than ten fold lower during early pod and
seed development (see FIG. 1).
Example 3
Isolation of Soybean SC194 Promoter
[0159] The soybean genomic DNA fragment corresponding to the SC194
promoter was isolated using a polymerase chain reaction (PCR) based
approach called genome walking using the Universal GenomeWalker™
kit from Clontech™ (Product User Manual No. PT3042-1).
[0160] Soybean genomic DNA samples were digested, separately, to
completion with four restriction enzymes DraI, EcoRV, HpaI, or
PmlI, each of which generates DNA fragments having blunt ends.
Double strand adaptors supplied in the GenomeWalker™ kit were
added to the blunt ends of the genomic DNA fragments by DNA ligase.
Two rounds of PCR were performed to amplify the SC194 corresponding
genomic DNA fragment using two nested primers supplied in the
Universal GenomeWalker™ kit that are specific to the adaptor
sequence (AP1 and AP2, for the first and second adaptor primer,
respectively), and two SC194 gene specific primers (GSP1 and GSP2)
designed based on the 5' coding sequence of SC194 (PSO375649). The
oligonucleotide sequences of the four primers are shown in SEQ ID
NO:15 (GSP1), SEQ ID NO:16 (AP1), SEQ ID NO:17 (GSP2), and SEQ ID
NO:18 (AP2). The GSP2 primer contains a recognition site for the
restriction enzyme NcoI. The AP2 primer from the Universal
GenomeWalker™ kit contains a SalI restriction site. The 3' end of
the adaptor sequence SEQ ID NO:23 contains an XmaI recognition site
downstream of the corresponding SalI restriction site in the AP2
primer.
[0161] The AP1 and the GSP1 primers were used in the first round
PCR using each of the adaptor ligated genomic DNA samples (DraI,
EcoRV, HpaI or PmlI) under conditions defined in the
GenomeWalker™ protocol. Cycle conditions were 94° C. for
4 minutes; 35 cycles of 94° C. for 30 seconds, 60° C.
for 1 minute, and 68° C. for 3 minutes; and a final
68° C. for 5 minutes before holding at 4° C. One
microliter from each of the first round PCR products was used as
templates for the second round PCR with the AP2 and GSP2 primers.
Cycle conditions for second round PCR were 94° C. for 4
minutes; 25 cycles of 94° C. for 30 seconds, 60° C.
for 1 minute, and 68° C. for 3 minutes; and a final
68° C. for 5 minutes before holding at 4° C. Agarose
gels were run to identify specific PCR product with an optimal
fragment length. An approximately 1.3 Kb PCR product was detected
and subsequently cloned into pCR2.1-TOPO vector by TOPO TA cloning
(Invitrogen). Sequencing of the cloned PCR product revealed that
its 3' end matched the 84 bp 5' end of the PSO375649 cDNA sequence,
indicating that the PCR product was indeed the corresponding SC194
genomic DNA fragment. The 1358 bp genomic DNA sequence upstream of
the putative SC194 start codon ATG is herein designated as soybean
SC194 promoter (SEQ ID NO:1).
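The two nested PCR programs given above differ only in cycle number, which is easy to see when they are written down as data. The sketch below encodes the stated temperatures and hold times and totals the programmed hold time of each round. The class and variable names are hypothetical, and the totals exclude ramp times and the final 4° C. hold, so they understate the real run time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Program:
    initial: Step
    cycles: int
    cycle_steps: tuple[Step, ...]
    final: Step

    def programmed_seconds(self) -> int:
        """Sum of programmed hold times (no ramping, no 4 C hold)."""
        per_cycle = sum(step.seconds for step in self.cycle_steps)
        return self.initial.seconds + self.cycles * per_cycle + self.final.seconds


# Denature, anneal, extend: identical in both rounds.
CYCLE = (Step(94, 30), Step(60, 60), Step(68, 180))

ROUND_1 = Program(Step(94, 240), 35, CYCLE, Step(68, 300))  # AP1 + GSP1
ROUND_2 = Program(Step(94, 240), 25, CYCLE, Step(68, 300))  # AP2 + GSP2

for name, program in (("round 1", ROUND_1), ("round 2", ROUND_2)):
    print(name, program.programmed_seconds() / 60, "min")
```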
Example 4
SC194 Promoter Copy Number Analysis
[0162] Southern hybridization analysis was performed to determine
whether there were other sequences in the soybean genome with high
similarity to the SC194 promoter. Soybean `Jack` wild type genomic
DNA was digested with nine different restriction enzymes BamHI,
BglII, DraI, EcoRI, EcoRV, HindIII, MfeI, NdeI, and SpeI, each
separately, and resolved on a 0.7% agarose gel by
electrophoresis. Each of the digested DNA samples was blotted onto
a Nylon membrane and hybridized with digoxigenin (DIG) labeled
SC194 promoter DNA probe according to the standard protocol (Roche
Applied Science, Indianapolis, Ind.). The SC194 promoter probe was
labeled by PCR using the DIG DNA labeling kit (Roche Applied
Science) with two gene specific primers SEQ ID NO:12 and SEQ ID
NO:8 to make a 685 bp probe described in SEQ ID NO:5 covering the
3' half of SC194 promoter sequence.
[0163] Since none of the above nine different restriction enzymes
cuts inside the SC194 probe region as illustrated in FIG. 2B, a
single band is expected to be hybridized by the SC194 probe in each
of the lanes if there is only a single copy of the SC194 promoter
sequence in soybean genome. A strong major band and a weak minor
band were detected in each of eight digestion lanes, BamHI, BglII,
DraI, EcoRI, EcoRV, HindIII, MfeI, and NdeI, suggesting that there
is, in addition to the SC194 promoter sequence, another sequence
with enough similarity to the SC194 promoter sequence to be
hybridized though less effectively by the same SC194 probe (FIG.
2A). The fact that only one band was detected on the Southern blot
of the SpeI digestion could be explained if two bands representing
the SC194 promoter sequence and the other similar sequence,
respectively, were similar in size to show as one overlapping band,
or if the other similar sequence resulted in a band too small to be
kept on the blot (any band smaller than 1 Kb would run out of the
agarose gel under the experiment conditions).
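The single-band expectation above rests on none of the nine enzymes cutting inside the probe region. A minimal Python sketch of that check is shown below. The recognition sequences are the standard ones for these enzymes; the function names and the toy probe are hypothetical, and the toy probe is not SEQ ID NO:5. Because all nine sites are palindromic, scanning one strand is sufficient.

```python
# Standard recognition sequences for the nine enzymes used in the Southern.
SITES = {
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
    "DraI": "TTTAAA",
    "EcoRI": "GAATTC",
    "EcoRV": "GATATC",
    "HindIII": "AAGCTT",
    "MfeI": "CAATTG",
    "NdeI": "CATATG",
    "SpeI": "ACTAGT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def enzymes_cutting(seq: str) -> list[str]:
    """Names of enzymes whose recognition site occurs in seq (sorted)."""
    seq = seq.upper()
    return sorted(name for name, site in SITES.items() if site in seq)


# Hypothetical toy probe, NOT the 685 bp probe of SEQ ID NO:5.
toy_probe = "ATGCGTACGTTAGCCGATGCAAGGCTTACG"
print(enzymes_cutting(toy_probe))
print(enzymes_cutting(toy_probe + "GAATTC"))
```

An empty result for the real probe sequence would support the reasoning that each genomic copy yields one hybridizing band per digest.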
Example 5
SC194:YFP Reporter Constructs and Soybean Transformation
[0164] The cloned SC194 promoter PCR fragment described in EXAMPLE
3 was digested with XmaI and NcoI, gel purified using a DNA gel
extraction kit (Qiagen, Valencia, Calif.) and directionally cloned
into the XmaI and NcoI site of a Gateway cloning ready vector QC299
(FIG. 3 and SEQ ID NO:40) containing a promoter-less fluorescent
reporter gene ZS-YELLOW1 N1 (YFP) to make the reporter construct
QC300 (SEQ ID NO:41) with the soybean SC194 promoter driving the
YFP gene expression (FIG. 3). The SC194:YFP expression cassette in
construct QC300 was linked to the SAMS:ALS (S-adenosyl methionine
synthetase:acetolactate synthase) expression cassette in construct
PHP25224 (FIG. 3 and SEQ ID NO:42) by Gateway cloning to create
construct QC302 (FIG. 3 and SEQ ID NO:43). The linked SC194:YFP and
SAMS:ALS cassettes were released as a 6431 bp DNA fragment from
construct QC302 by AscI restriction digestion, separated from the
vector backbone fragment by agarose gel electrophoresis, and
purified from the gel using a Qiagen DNA gel extraction kit. The
purified DNA fragment was used to transform soybean cultivar Jack
using the particle gun bombardment method (Klein et al., Nature
327:70-73 (1987); U.S. Pat. No. 4,945,050) to study the SC194
promoter activity in stably transformed soybean plants.
[0165] Soybean somatic embryos from the Jack cultivar were induced
as follows. Cotyledons (smaller than 3 mm in length) were dissected
from surface-sterilized, immature seeds and were cultured for 6-10
weeks under fluorescent light at 26° C. on a Murashige and
Skoog media ("MS media") containing 0.7% agar and supplemented with
10 mg/ml 2,4-dichlorophenoxyacetic acid (2,4-D). Globular stage
somatic embryos, which produced secondary embryos, were then
excised and placed into flasks containing liquid MS medium
supplemented with 2,4-D (10 mg/ml) and cultured in the light on a
rotary shaker. After repeated selection for clusters of somatic
embryos that multiplied as early, globular staged embryos, the
soybean embryogenic suspension cultures were maintained in 35 ml
liquid media on a rotary shaker, 150 rpm, at 26° C. with
fluorescent lights on a 16:8 hour day/night schedule. Cultures were
subcultured every two weeks by inoculating approximately 35 mg of
tissue into 35 ml of the same fresh liquid MS medium.
[0166] Soybean embryogenic suspension cultures were then
transformed by the method of particle gun bombardment using a
DuPont Biolistic™ PDS1000/HE instrument (helium retrofit)
(Bio-Rad Laboratories, Hercules, Calif.). To 50 µl of a 60 mg/ml
1.0 mm gold particle suspension were added (in order): 30 µl of
10 ng/µl SC194:YFP+SAMS:ALS DNA fragment, 20 µl of 0.1 M
spermidine, and 25 µl of 5 M CaCl2. The particle
preparation was then agitated for 3 minutes, spun in a centrifuge
for 10 seconds and the supernatant removed. The DNA-coated
particles were then washed once in 400 µl 100% ethanol and
resuspended in 45 µl of 100% ethanol. The DNA/particle
suspension was sonicated three times for one second each. 5 µl
of the DNA-coated gold particles was then loaded on each macro
carrier disk.
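The quantities in the particle preparation above imply an approximate load per macro carrier disk, which can be worked out as follows. This is a back-of-the-envelope sketch using only the volumes and concentrations stated in the paragraph. It assumes, hypothetically, that all DNA binds to the gold and that nothing is lost during the wash, so the results are upper bounds. The function and parameter names are illustrative.

```python
def per_disk_load(gold_ul: float = 50.0, gold_mg_per_ml: float = 60.0,
                  dna_ul: float = 30.0, dna_ng_per_ul: float = 10.0,
                  resuspend_ul: float = 45.0, load_ul: float = 5.0) -> dict[str, float]:
    """Upper-bound gold and DNA amounts per macro carrier disk."""
    gold_mg = gold_ul * gold_mg_per_ml / 1000.0  # µl x mg/ml -> mg
    dna_ng = dna_ul * dna_ng_per_ul              # µl x ng/µl -> ng
    disks = resuspend_ul / load_ul               # number of 5 µl loads
    return {
        "disks": disks,
        "gold_mg_per_disk": gold_mg / disks,
        "dna_ng_per_disk": dna_ng / disks,
    }


load = per_disk_load()
print(load)
```

Under these assumptions one preparation covers the 5 to 10 plates bombarded per experiment, with roughly equal DNA and gold delivered to each plate.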
[0167] Approximately 300-400 mg of a two-week-old suspension
culture was placed in an empty 60×15 mm Petri dish and the
residual liquid removed from the tissue with a pipette. For each
transformation experiment, approximately 5 to 10 plates of tissue
were bombarded. Membrane rupture pressure was set at 1100 psi and
the chamber was evacuated to a vacuum of 28 inches mercury. The
tissue was placed approximately 3.5 inches away from the retaining
screen and bombarded once. Following bombardment, the tissue was
divided in half and placed back into liquid media and cultured as
described above.
[0168] Five to seven days post bombardment, the liquid media was
exchanged with fresh media containing 100 ng/ml chlorsulfuron as
selection agent. This selective media was refreshed weekly. Seven
to eight weeks post bombardment, green, transformed tissue was
observed growing from untransformed, necrotic embryogenic clusters.
Isolated green tissue was removed and inoculated into individual
flasks to generate new, clonally propagated, transformed
embryogenic suspension cultures. Each clonally propagated culture
was treated as an independent transformation event and subcultured
in the same liquid MS media supplemented with 2,4-D (10 mg/ml) and
100 ng/ml chlorsulfuron selection agent to increase mass. The
embryogenic suspension cultures were then transferred to solid agar
MS media plates without 2,4-D supplement to allow somatic embryos
to develop. A sample of each event was collected at this stage for
PCR and quantitative PCR analysis.
[0169] Cotyledon stage somatic embryos were dried-down (by
transferring them into an empty small Petri dish that was seated on
top of a 10 cm Petri dish to allow slow dry down) to mimic the last
stages of soybean seed development. Dried-down embryos were placed
on germination solid media, and transgenic soybean plantlets were
regenerated. The transgenic plants were then transferred to soil
and maintained in growth chambers for seed production.
[0170] Genomic DNA was extracted from somatic embryo samples and
analyzed by quantitative PCR using the 7500 real time PCR system
(Applied Biosystems) with gene-specific primers and
6-carboxyfluorescein (FAM)-labeled fluorescence probes to check
copy numbers of both the SAMS:ALS expression cassette and the
SC194:YFP expression cassette. The qPCR analysis was done in duplex
reactions with a heat shock protein (HSP) gene as the endogenous
control and a transgenic DNA sample with a known single copy of
SAMS:ALS or YFP transgene as the calibrator using the relative
quantification methodology. The endogenous control HSP probe was
labeled with VIC (Applera Corporation, Norwalk, Conn.) and the
target gene SAMS or YFP probe was labeled with FAM for the
simultaneous detection of both fluorescent probes in the same
duplex reactions. The primers and probes used in the qPCR analysis
are listed below.
SAMS forward primer: SEQ ID NO:29
FAM labeled SAMS probe: SEQ ID NO:30
SAMS reverse primer: SEQ ID NO:31
YFP forward primer: SEQ ID NO:32
FAM labeled YFP probe: SEQ ID NO:33
YFP reverse primer: SEQ ID NO:34
HSP forward primer: SEQ ID NO:35
VIC labeled HSP probe: SEQ ID NO:36
HSP reverse primer: SEQ ID NO:37
FAM labeled DNA oligo probes and VIC labeled oligo probes were
obtained from Sigma Genosys (The Woodlands, Tex.).
[0171] Only transgenic soybean events containing 1 or 2 copies of
both the SAMS:ALS expression cassette and the SC194:YFP expression
cassette were selected for further gene expression evaluation and
seed production (see Table 2). Events negative for YFP qPCR or with
more than 2 copies for the SAMS or YFP qPCR were terminated. YFP
expression detection in flowers as described in EXAMPLE 8 is also
recorded in the same table.
TABLE 2
Event ID    SAMS qPCR    YFP qPCR    YFP Expression
4775.1.1    1.30         1.16        -
4775.1.3    1.01         1.26        -
4775.3.1    1.24         1.33        +
4775.3.2    1.17         1.36        -
4775.3.3    1.79         1.38        +
4775.3.4    2.08         1.29        +
4775.4.1    1.18         1.43        +
4775.5.1    1.47         1.11        +
4775.6.2    0.93         1.06        +
4775.7.2    1.43         1.20        +
4775.1.4    1.31         1.39        -
4775.2.2    1.12         1.13        +
4775.3.5    1.28         1.89        -
4775.3.6    2.48         1.17        +
4775.3.7    1.30         1.21        +
4775.8.2    1.28         1.30        +
4775.2.3    2.33         1.91        +
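The Table 2 data can be tallied with a short sketch such as the following. The values are copied from the table. The rounding of each qPCR estimate to the nearest whole copy is an assumption made here for illustration, since the text does not state how the estimates were converted to copy numbers.

```python
# (event ID, SAMS qPCR, YFP qPCR, YFP detected in flowers), from Table 2.
TABLE_2 = [
    ("4775.1.1", 1.30, 1.16, False),
    ("4775.1.3", 1.01, 1.26, False),
    ("4775.3.1", 1.24, 1.33, True),
    ("4775.3.2", 1.17, 1.36, False),
    ("4775.3.3", 1.79, 1.38, True),
    ("4775.3.4", 2.08, 1.29, True),
    ("4775.4.1", 1.18, 1.43, True),
    ("4775.5.1", 1.47, 1.11, True),
    ("4775.6.2", 0.93, 1.06, True),
    ("4775.7.2", 1.43, 1.20, True),
    ("4775.1.4", 1.31, 1.39, False),
    ("4775.2.2", 1.12, 1.13, True),
    ("4775.3.5", 1.28, 1.89, False),
    ("4775.3.6", 2.48, 1.17, True),
    ("4775.3.7", 1.30, 1.21, True),
    ("4775.8.2", 1.28, 1.30, True),
    ("4775.2.3", 2.33, 1.91, True),
]

expressing = [event for event in TABLE_2 if event[3]]
silent = [event[0] for event in TABLE_2 if not event[3]]

# Assumption: each qPCR estimate is rounded to the nearest whole copy.
rounded_copies = {event[0]: (round(event[1]), round(event[2])) for event in TABLE_2}
within_limit = all(1 <= copies <= 2
                   for pair in rounded_copies.values() for copies in pair)

print(len(expressing), "of", len(TABLE_2), "events expressed YFP")
print("non-expressing events:", silent)
print("all events at 1 or 2 copies after rounding:", within_limit)
```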
Example 6
Construction of SC194 Promoter Deletion Constructs
[0172] To define the transcriptional elements controlling the SC194
promoter activity, six 5' unidirectional deletion fragments SEQ ID
NO:2 of 1328 bp, SEQ ID NO:3 of 1134 bp, SEQ ID NO:4 of 932 bp, SEQ
ID NO:5 of 685 bp, SEQ ID NO:6 of 472 bp, and SEQ ID NO:7 of 237 bp
were made by utilizing PCR amplification and the full length
soybean SC194 promoter contained in the original construct QC300
(FIG. 3) as DNA template. The same antisense primer (SEQ ID NO:8)
was used in the amplification of the six SC194 promoter fragments
by pairing with different sense primers SEQ ID NOs: 9, 10, 11, 12,
13, and 14 respectively, to produce the promoter fragments
represented by SEQ ID NOs: 2, 3, 4, 5, 6, and 7.
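The nested 5' deletion strategy above, one shared antisense primer paired with sense primers that anneal progressively closer to the 3' end, can be sketched as a simple in-silico PCR. The primer sequences are taken from the sequence listing. The template is a hypothetical toy construct with invented spacers, not the SC194 promoter, and the function names are illustrative only.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (a/c/g/t only)."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def amplicon(template, sense_primer, antisense_primer):
    """Return the product bounded by the two primers, or None if either is absent."""
    template = template.lower()
    start = template.find(sense_primer.lower())
    # The antisense primer's reverse complement marks the 3' end on the top strand.
    site = reverse_complement(antisense_primer)
    end = template.rfind(site)
    if start == -1 or end == -1 or end < start:
        return None
    return template[start:end + len(site)]


# Primer sequences as given in the sequence listing.
ANTISENSE = "gtttcagataataaaattagtgttgaaacttg"   # SEQ ID NO:8
SENSE_13 = "cacagatgtgctctcatgaagtctg"            # SEQ ID NO:13
SENSE_14 = "gatcacaatcttcatgcatctgagtc"           # SEQ ID NO:14

# Hypothetical toy template: the spacers are invented, not promoter sequence.
toy_template = ("ttgacc" + SENSE_13 + "acgtacgtac" + SENSE_14
                + "ggccaattgg" + reverse_complement(ANTISENSE) + "c")

longer = amplicon(toy_template, SENSE_13, ANTISENSE)
shorter = amplicon(toy_template, SENSE_14, ANTISENSE)
print(len(longer), len(shorter))
```

Because both products end at the same antisense primer site, the shorter product is a 3' suffix of the longer one, which is the property that makes the series unidirectional.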
[0173] Each of the PCR amplified promoter DNA fragments was cloned
into the Gateway cloning ready TA cloning vector pCR8/GW/TOPO
(Invitrogen, Carlsbad, Calif.; FIG. 4 and SEQ ID NO:44), and clones
with the insert in correct orientation, relative to the Gateway
recombination sites attL1 and attL2 in the pCR8/GW/TOPO vector,
were selected by AflII restriction enzyme digestion analysis or
sequence confirmation (see the example map QC300-1 (SEQ ID NO:45)
in FIG. 4, which contains the 1328 bp SC194 promoter deletion fragment
SEQ ID NO:2). The maps of constructs QC300-2, QC300-3, QC300-4,
QC300-5, and QC300-6 containing the SC194 promoter deletion
fragments SEQ ID NOs:3, 4, 5, 6, and 7 were similar. The promoter
fragment in the right orientation was subsequently cloned into the
Gateway destination vector QC330 (FIG. 4 and SEQ ID NO:46) by
Gateway LR clonase reaction (Invitrogen) to place the promoter
fragment in front of the reporter gene YFP (see the example map
QC300-1Y (SEQ ID NO:47) in FIG. 4, which contains the 1328 bp SC194
promoter deletion fragment SEQ ID NO:2). A 21 bp Gateway
recombination site attB2 (SEQ ID NO:39) was inserted between the
promoter and the YFP reporter gene coding region as a result of the
Gateway cloning process. Another 21 bp Gateway recombination site
attB1 (SEQ ID NO:38) was left at the 5' end of the SC194 promoter.
The maps of constructs QC300-2Y (SEQ ID NO:48), QC300-3Y (SEQ ID
NO:49), QC300-4Y (SEQ ID NO:50), QC300-5Y (SEQ ID NO:51), and
QC300-6Y (SEQ ID NO:52) containing the SC194 promoter deletion
fragments SEQ ID NOs: 3, 4, 5, 6, and 7 were similar.
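The orientation screen by restriction digestion mentioned above can be illustrated with a minimal sketch. It assumes that the digest is diagnostic because an AflII site (CTTAAG) lies off-center in the insert, so its distance from the vector junction depends on orientation. The text does not spell out this reasoning, vector sites are not modeled, and the toy insert and function names are hypothetical.

```python
AFLII_SITE = "cttaag"  # AflII recognition sequence (palindromic)
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (a/c/g/t only)."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def site_offsets(seq, site=AFLII_SITE):
    """Return every 0-based offset at which the site occurs in seq."""
    seq = seq.lower()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


def offsets_by_orientation(insert):
    """Site offsets measured from the same vector junction for both orientations."""
    return {
        "forward": site_offsets(insert),
        "reverse": site_offsets(reverse_complement(insert)),
    }


# Hypothetical 20 bp insert with a single off-center AflII site.
toy_insert = "a" * 10 + AFLII_SITE + "a" * 4
result = offsets_by_orientation(toy_insert)
print(result)
```

In a real digest, the two offsets would translate into different fragment sizes once the position of the nearest site in the vector backbone is taken into account.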
[0174] The SC194:YFP promoter deletion constructs QC300-1Y,
QC300-2Y, QC300-3Y, QC300-4Y, QC300-5Y, and QC300-6Y were ready to
be transformed into germinating soybean cotyledons by the gene gun
bombardment method for transient gene expression studies. The 1358 bp
full length SC194 promoter in construct QC300 was included as a
positive control for transient expression analysis. A simple
schematic description of the six SC194 promoter deletion fragments
can be found in FIG. 5.
Example 7
Transient Expression Analysis of SC194:YFP Constructs
[0175] The full length SC194 promoter construct QC300 and its serial
deletion constructs QC300-1Y, 2Y, 3Y, 4Y, 5Y, and 6Y were tested by
the YFP gene transient expression assay using germinating soybean
cotyledons as the target tissue. Soybean seeds were rinsed with 10%
Tween 20 in sterile water, surface-sterilized with 70% ethanol for
2 minutes and then with 6% sodium hypochlorite for 15 minutes. After
rinsing, the seeds were placed on wet filter paper in a Petri dish
to germinate for 4-6 days under fluorescent light at 26° C.
Green cotyledons were excised and placed inner side up on a 0.7%
agar plate containing MS media for particle gun bombardment.
[0176] The DNA and gold particle mixtures were prepared as
described in EXAMPLE 5, except with more DNA (100 ng/µl). The
bombardments were also carried out under parameters similar to those
described in EXAMPLE 5. YFP expression was checked under a Leica
MZFLIII stereo microscope equipped with UV light source and
appropriate light filters (Leica Microsystems Inc., Bannockburn,
Ill.), and all microscopic pictures were taken approximately 24
hours after bombardment at 8× magnification under the same camera
settings: 1.06 gamma, 0.0% gain, and 0.58 seconds exposure.
[0177] The full length SC194 promoter construct QC300 expressed YFP,
but much more weakly than the positive control construct pZSL90 (SEQ
ID NO:53), which contained the strong constitutive promoter SCP1
(U.S. Pat. No. 6,072,050), in the transient expression assay, as
shown by the different sizes of the green dots (FIG. 6A, H). Each dot represented a
single cotyledon cell which appeared larger if the fluorescence was
strong or smaller if the fluorescence was weak, even under the same
magnification. The QC300-1Y and QC300-2Y constructs containing,
respectively, the 1328 bp and 1134 bp truncated SC194 promoter
fragments and with the attB2 Gateway recombination site
(Invitrogen) inserted between the SC194 promoter and the YFP had
similar expression that also appeared to be weaker than that of the
full length SC194 promoter (FIG. 6B, C). The 932 bp truncated SC194
promoter construct QC300-3Y (FIG. 6D) had obviously lower
expression than the above three longer SC194 promoter constructs.
Further truncations of the SC194 promoter to 685 bp in construct
QC300-4Y and to 472 bp in construct QC300-5Y further reduced the
promoter activity as indicated by the fewer and smaller
fluorescence dots (FIG. 6E, F). But even when the SC194 promoter
was truncated to the 237 bp minimal size in construct QC300-6Y, the
promoter fragment still retained very low level activity with only
a few faint green dots barely detectable (FIG. 6G).
Example 8
SC194:YFP Expression in Stable Transgenic Soybean Plants
[0178] YFP gene expression was checked at different stages of
transgenic plant development for yellow fluorescence emission under
a Leica MZFLIII stereo microscope equipped with UV light source and
appropriate light filters (Leica Microsystems Inc., Bannockburn,
Ill.). No specific yellow fluorescence was detected during somatic
embryo development or in vegetative tissues such as leaf, petiole,
stem, or root of the transgenic plants. Fluorescence was only
detected in flowers.
[0179] A soybean flower consists of five sepals and five petals
(one large upper standard petal, two large side petals, and two
small fused lower petals called the keel), which enclose ten
stamens and one pistil. The filaments of the ten stamens fuse
together to form a sheath to enclose the pistil and separate into
10 branches only at the top to each bear an anther. The pistil
consists of a stigma, a style, and an ovary in which there are
normally 2-4 ovules that will eventually develop into seeds.
[0180] Specific fluorescence signal (green color) was first
detected at the junctions between anthers and filaments, and also
in the distal part of petals in young flower buds when the petals
were still completely enclosed by sepals (FIG. 7A). In older flower
buds and open flowers, fluorescence spread throughout all petals and
the entire filaments but was still concentrated at the anther and
filament junctions (FIG. 7B, C, D). No specific fluorescence was
detected in sepals or in the flower pedicel, which displayed red
autofluorescence resulting from green plant tissues (FIG. 7A, C, D).
Fluorescence was detected in the style but not in the ovary part of
the pistil (FIG. 7F). Under higher magnification, no YFP
fluorescence appeared to be detectable in the stigma or in pollen,
although autofluorescence was strong in pollen (FIG. 7E, G).
The yellow autofluorescence in pollen was even stronger under a
non-specific UV light filter, while the YFP-specific greenish
fluorescence disappeared under the same non-specific filter. When
an open flower was dissected longitudinally to expose the inside of
the ovary, no fluorescence was detected in the inside ovary wall or
in any of the ovules (FIG. 7D). Similarly, no fluorescence was
detected in any part of young or old developing pods or seeds (FIG.
7H).
[0181] In conclusion, SC194:YFP expression was detected only in
petals, filaments, and style, and was strongest at the anther and
filament junctions of a soybean flower. The expression was first
detectable in young flower buds when the petals were still
completely enclosed by sepals. No expression was detectable in
other parts of the flower such as ovary, stigma, or pollen, or in
other tissues such as leaf, root, petiole, pod coat, or developing
seeds of transgenic soybean plants.
[0182] Twelve out of 17 transgenic events expressed YFP in the same
manner as described in detail above (Table 2). The other five
events contained the transgene as revealed by qPCR but failed to
express YFP.
Sequence CWU 1
1
5311358DNAGlycine max 1gggctggtaa cctagttaat aaattaaaag gagaacatta
ttaatgtgaa aatcatgcaa 60acttaaaaaa atcatcaaca acataatttt ataattctaa
taaaatattt ttttctttta 120attctttaat caatgtctaa catttatcta
ttatttatca catttgttat ttaatgtttc 180tatctttaga gctatcaaaa
atttaaaatg gtggaacctt actcattggg ttgagttcac 240ctaacttgtt
taataaatag atcaatctaa ttctattcat ctcttagtaa gtattaaaaa
300tgttggccca actctccata tattggtgag ttataggagt ttactcactt
aaaatgataa 360taaaaatatt tgttttaaaa tcatttttta aacaaaaaaa
taatgtttca gattatttat 420tcttagatca taacttacaa gcaacatttc
aatgatcaat tcaattgtca gaatcaaaac 480caattgaaag agacaaatat
tcatgctaat cttcatcaga aactaaacat tgacataaag 540caatagtatt
ggaactacaa gttataatta tgtactttgt aatagtgtga agaaaatcaa
600aatacaaata gtaatcatca tgataaatgc tatctcaatt tattcaatta
taaaaatata 660gaaataaaat gtgataaatg gataacatgt gtgctaatcc
agtccactac gcccaccaca 720agttcaaccc aatggactgg atcatcttct
ttttttctta ctgatttctc tcttcttcca 780ttctaatcca tcccaaaagt
agatgtttac tatttcccct ttcatagttt cacaagtgtg 840cgcagaggcc
aaactgaaag tggtagtaca tggtgtaata ttaatcacag atgtgctctc
900atgaagtctg aacttacagc tcaagtaaca accaacaagt aaaaagtaca
gaagatagca 960taaaaaatga aggtagaaca aattccaagt tttctacata
ttacggtgca taaatcaacc 1020acgtgaaggc tccatttatt tgccgctata
acattggtga ccctcttcca caaatagtaa 1080gtaataaaac caagtacaaa
aaaatgttca actaccaagt gatcacaatc ttcatgcatc 1140tgagtcacac
tattgccctt tgctcatgaa gtacacttta ctcaccgcca aagttcactc
1200aacactgtag aacaaaggaa tcatataaat aatgcatatc tctcccttaa
gccttcaaca 1260catacaaaag tgacacacca aatcaaagac acctgagcca
ttcaattccc ctcctttatt 1320gctttcaagt ttcaacacta attttattat ctgaaacc
135821328DNAGlycine max 2ggagaacatt attaatgtga aaatcatgca
aacttaaaaa aatcatcaac aacataattt 60tataattcta ataaaatatt tttttctttt
aattctttaa tcaatgtcta acatttatct 120attatttatc acatttgtta
tttaatgttt ctatctttag agctatcaaa aatttaaaat 180ggtggaacct
tactcattgg gttgagttca cctaacttgt ttaataaata gatcaatcta
240attctattca tctcttagta agtattaaaa atgttggccc aactctccat
atattggtga 300gttataggag tttactcact taaaatgata ataaaaatat
ttgttttaaa atcatttttt 360aaacaaaaaa ataatgtttc agattattta
ttcttagatc ataacttaca agcaacattt 420caatgatcaa ttcaattgtc
agaatcaaaa ccaattgaaa gagacaaata ttcatgctaa 480tcttcatcag
aaactaaaca ttgacataaa gcaatagtat tggaactaca agttataatt
540atgtactttg taatagtgtg aagaaaatca aaatacaaat agtaatcatc
atgataaatg 600ctatctcaat ttattcaatt ataaaaatat agaaataaaa
tgtgataaat ggataacatg 660tgtgctaatc cagtccacta cgcccaccac
aagttcaacc caatggactg gatcatcttc 720tttttttctt actgatttct
ctcttcttcc attctaatcc atcccaaaag tagatgttta 780ctatttcccc
tttcatagtt tcacaagtgt gcgcagaggc caaactgaaa gtggtagtac
840atggtgtaat attaatcaca gatgtgctct catgaagtct gaacttacag
ctcaagtaac 900aaccaacaag taaaaagtac agaagatagc ataaaaaatg
aaggtagaac aaattccaag 960ttttctacat attacggtgc ataaatcaac
cacgtgaagg ctccatttat ttgccgctat 1020aacattggtg accctcttcc
acaaatagta agtaataaaa ccaagtacaa aaaaatgttc 1080aactaccaag
tgatcacaat cttcatgcat ctgagtcaca ctattgccct ttgctcatga
1140agtacacttt actcaccgcc aaagttcact caacactgta gaacaaagga
atcatataaa 1200taatgcatat ctctccctta agccttcaac acatacaaaa
gtgacacacc aaatcaaaga 1260cacctgagcc attcaattcc cctcctttat
tgctttcaag tttcaacact aattttatta 1320tctgaaac 132831134DNAGlycine
max 3cattgggttg agttcaccta acttgtttaa taaatagatc aatctaattc
tattcatctc 60ttagtaagta ttaaaaatgt tggcccaact ctccatatat tggtgagtta
taggagttta 120ctcacttaaa atgataataa aaatatttgt tttaaaatca
ttttttaaac aaaaaaataa 180tgtttcagat tatttattct tagatcataa
cttacaagca acatttcaat gatcaattca 240attgtcagaa tcaaaaccaa
ttgaaagaga caaatattca tgctaatctt catcagaaac 300taaacattga
cataaagcaa tagtattgga actacaagtt ataattatgt actttgtaat
360agtgtgaaga aaatcaaaat acaaatagta atcatcatga taaatgctat
ctcaatttat 420tcaattataa aaatatagaa ataaaatgtg ataaatggat
aacatgtgtg ctaatccagt 480ccactacgcc caccacaagt tcaacccaat
ggactggatc atcttctttt tttcttactg 540atttctctct tcttccattc
taatccatcc caaaagtaga tgtttactat ttcccctttc 600atagtttcac
aagtgtgcgc agaggccaaa ctgaaagtgg tagtacatgg tgtaatatta
660atcacagatg tgctctcatg aagtctgaac ttacagctca agtaacaacc
aacaagtaaa 720aagtacagaa gatagcataa aaaatgaagg tagaacaaat
tccaagtttt ctacatatta 780cggtgcataa atcaaccacg tgaaggctcc
atttatttgc cgctataaca ttggtgaccc 840tcttccacaa atagtaagta
ataaaaccaa gtacaaaaaa atgttcaact accaagtgat 900cacaatcttc
atgcatctga gtcacactat tgccctttgc tcatgaagta cactttactc
960accgccaaag ttcactcaac actgtagaac aaaggaatca tataaataat
gcatatctct 1020cccttaagcc ttcaacacat acaaaagtga cacaccaaat
caaagacacc tgagccattc 1080aattcccctc ctttattgct ttcaagtttc
aacactaatt ttattatctg aaac 11344932DNAGlycine max 4gatcataact
tacaagcaac atttcaatga tcaattcaat tgtcagaatc aaaaccaatt 60gaaagagaca
aatattcatg ctaatcttca tcagaaacta aacattgaca taaagcaata
120gtattggaac tacaagttat aattatgtac tttgtaatag tgtgaagaaa
atcaaaatac 180aaatagtaat catcatgata aatgctatct caatttattc
aattataaaa atatagaaat 240aaaatgtgat aaatggataa catgtgtgct
aatccagtcc actacgccca ccacaagttc 300aacccaatgg actggatcat
cttctttttt tcttactgat ttctctcttc ttccattcta 360atccatccca
aaagtagatg tttactattt cccctttcat agtttcacaa gtgtgcgcag
420aggccaaact gaaagtggta gtacatggtg taatattaat cacagatgtg
ctctcatgaa 480gtctgaactt acagctcaag taacaaccaa caagtaaaaa
gtacagaaga tagcataaaa 540aatgaaggta gaacaaattc caagttttct
acatattacg gtgcataaat caaccacgtg 600aaggctccat ttatttgccg
ctataacatt ggtgaccctc ttccacaaat agtaagtaat 660aaaaccaagt
acaaaaaaat gttcaactac caagtgatca caatcttcat gcatctgagt
720cacactattg ccctttgctc atgaagtaca ctttactcac cgccaaagtt
cactcaacac 780tgtagaacaa aggaatcata taaataatgc atatctctcc
cttaagcctt caacacatac 840aaaagtgaca caccaaatca aagacacctg
agccattcaa ttcccctcct ttattgcttt 900caagtttcaa cactaatttt
attatctgaa ac 9325685DNAGlycine max 5gataaatgga taacatgtgt
gctaatccag tccactacgc ccaccacaag ttcaacccaa 60tggactggat catcttcttt
ttttcttact gatttctctc ttcttccatt ctaatccatc 120ccaaaagtag
atgtttacta tttccccttt catagtttca caagtgtgcg cagaggccaa
180actgaaagtg gtagtacatg gtgtaatatt aatcacagat gtgctctcat
gaagtctgaa 240cttacagctc aagtaacaac caacaagtaa aaagtacaga
agatagcata aaaaatgaag 300gtagaacaaa ttccaagttt tctacatatt
acggtgcata aatcaaccac gtgaaggctc 360catttatttg ccgctataac
attggtgacc ctcttccaca aatagtaagt aataaaacca 420agtacaaaaa
aatgttcaac taccaagtga tcacaatctt catgcatctg agtcacacta
480ttgccctttg ctcatgaagt acactttact caccgccaaa gttcactcaa
cactgtagaa 540caaaggaatc atataaataa tgcatatctc tcccttaagc
cttcaacaca tacaaaagtg 600acacaccaaa tcaaagacac ctgagccatt
caattcccct cctttattgc tttcaagttt 660caacactaat tttattatct gaaac
6856472DNAGlycine max 6cacagatgtg ctctcatgaa gtctgaactt acagctcaag
taacaaccaa caagtaaaaa 60gtacagaaga tagcataaaa aatgaaggta gaacaaattc
caagttttct acatattacg 120gtgcataaat caaccacgtg aaggctccat
ttatttgccg ctataacatt ggtgaccctc 180ttccacaaat agtaagtaat
aaaaccaagt acaaaaaaat gttcaactac caagtgatca 240caatcttcat
gcatctgagt cacactattg ccctttgctc atgaagtaca ctttactcac
300cgccaaagtt cactcaacac tgtagaacaa aggaatcata taaataatgc
atatctctcc 360cttaagcctt caacacatac aaaagtgaca caccaaatca
aagacacctg agccattcaa 420ttcccctcct ttattgcttt caagtttcaa
cactaatttt attatctgaa ac 4727237DNAGlycine max 7gatcacaatc
ttcatgcatc tgagtcacac tattgccctt tgctcatgaa gtacacttta 60ctcaccgcca
aagttcactc aacactgtag aacaaaggaa tcatataaat aatgcatatc
120tctcccttaa gccttcaaca catacaaaag tgacacacca aatcaaagac
acctgagcca 180ttcaattccc ctcctttatt gctttcaagt ttcaacacta
attttattat ctgaaac 237832DNAArtificialPrimer 8gtttcagata ataaaattag
tgttgaaact tg 32929DNAArtificialPrimer 9ggagaacatt attaatgtga
aaatcatgc 291025DNAArtificialPrimer 10cattgggttg agttcaccta acttg
251129DNAArtificialPrimer 11gatcataact tacaagcaac atttcaatg
291228DNAArtificialPrimer 12gataaatgga taacatgtgt gctaatcc
281325DNAArtificialPrimer 13cacagatgtg ctctcatgaa gtctg
251426DNAArtificialPrimer 14gatcacaatc ttcatgcatc tgagtc
261526DNAArtificialPrimer 15caaggaaaaa cgaaactttg aaagcc
261625DNAArtificialPrimer 16gtaatacgac tcactatagg gcacg
251737DNAArtificialPrimer 17ccatggtttc agataataaa attagtgttg
aaacttg 371822DNAArtificialPrimer 18ctatagggca cgcgtggtcg ac
2219832DNAGlycine max 19acacaccaaa tcaaagacac ctgagccatt caattcccct
cctttattgc tttcaagttt 60caacactaat tttattatct gaaaaaatgg ctttcaaagt
ttcgtttttc cttgcacttg 120ttctagtttc caatatcctc ctccttgata
caacagctgc tggacgcagc attggcgaaa 180actccaactc agaggaaaag
aaagagcctg agttcttgtt caagcatgaa ggtggggtgt 240atattccagg
gattggacct gttggatttc cacataaatt tcatctcaca cctcaaaatc
300cattacctgg tggcaatgga aatggaggag caggaaccgc aacaggatca
ggatcaccac 360caggtagcag ttatgttcct ggtggtgatg acacttttgt
cccaaaccct ggttatgagg 420ttcccattcc cggcagtggt ggaagtgttc
cagcaccagc tgcaccatga gttaactcat 480gcatgattaa tgtgatgcat
ggtagttaat aaggtggtta tgcttaagtt tgtctttttc 540tttctgtttt
ctagccataa taataactta tcataaataa gtatgctcca tgtgcacatt
600ggtgtatatg gtgaacacca tggattgcca agtcattctg tttgttcttg
tagtcttgtt 660ttaagatgaa ttgagtgtga cgtaagctta tttgtttttc
gaagtaaaaa ctgatgaatg 720agtcctcaaa aataatttct gttatgattc
caatttgata ttctcttttc atgcacagtt 780ttatgtgttt ggtccttgaa
tgataaaaaa aaaaaaaaaa aaaaaaaaaa aa 83220127PRTGlycine max 20Met
Ala Phe Lys Val Ser Phe Phe Leu Ala Leu Val Leu Val Ser Asn1 5 10
15Ile Leu Leu Leu Asp Thr Thr Ala Ala Gly Arg Ser Ile Gly Glu Asn20
25 30Ser Asn Ser Glu Glu Lys Lys Glu Pro Glu Phe Leu Phe Lys His
Glu35 40 45Gly Gly Val Tyr Ile Pro Gly Ile Gly Pro Val Gly Phe Pro
His Lys50 55 60Phe His Leu Thr Pro Gln Asn Pro Leu Pro Gly Gly Asn
Gly Asn Gly65 70 75 80Gly Ala Gly Thr Ala Thr Gly Ser Gly Ser Pro
Pro Gly Ser Ser Tyr85 90 95Val Pro Gly Gly Asp Asp Thr Phe Val Pro
Asn Pro Gly Tyr Glu Val100 105 110Pro Ile Pro Gly Ser Gly Gly Ser
Val Pro Ala Pro Ala Ala Pro115 120 1252126DNAArtificialPrimer
21gaccaagaca cactcgttca tatatc 262225DNAArtificialPrimer
22tctgctgctc aatgtttaca aggac 252348DNAArtificialLonger strand
sequence of the adaptor supplied in GenomeWalker(tm) kit
23gtaatacgac tcactatagg gcacgcgtgg tcgacggccc gggctggt
482417DNAArtificialMPSS tag sequence 24gatcaccacc aggtagc
172520DNAArtificialPrimer 25aaccgcaaca ggatcaggat
202621DNAArtificialPrimer 26accagggttt gggacaaaag t
212724DNAArtificialPrimer 27catgattggg agaaacctta agct
242820DNAArtificialPrimer 28agattgggcc agaggatcct
202922DNAArtificialPrimer 29ggaagaagag aatcgggtgg tt
223023DNAArtificialFAM labeled fluorescent DNA oligo probe
30attgtgttgt gtggcatggt tat 233123DNAArtificialPrimer 31ggcttgttgt
gcagtttttg aag 233220DNAArtificialPrimer 32aacggccaca agttcgtgat
203320DNAArtificialFAM labeled fluorescent DNA oligo probe
33accggcgagg gcatcggcta 203420DNAArtificialPrimer 34cttcaagggc
aagcagacca 203524DNAArtificialPrimer 35caaacttgac aaagccacaa ctct
243620DNAArtificialVIC labeled DNA oligo probe 36ctctcatctc
atataaatac 203721DNAArtificialPrimer 37ggagaaattg gtgtcgtgga a
213821DNAArtificialRecombination site attB1 sequence 38caagtttgta
caaaaaagca g 213921DNAArtificialRecombination site attB2 sequence
39cagctttctt gtacaaagtg g 21403291DNAArtificialNucleotide sequence
of QC299 40tcgacccggg atccatggcc cacagcaagc acggcctgaa ggaggagatg
accatgaagt 60accacatgga gggctgcgtg aacggccaca agttcgtgat caccggcgag
ggcatcggct 120accccttcaa gggcaagcag accatcaacc tgtgcgtgat
cgagggcggc cccctgccct 180tcagcgagga catcctgagc gccggcttca
agtacggcga ccggatcttc accgagtacc 240cccaggacat cgtggactac
ttcaagaaca gctgccccgc cggctacacc tggggccgga 300gcttcctgtt
cgaggacggc gccgtgtgca tctgtaacgt ggacatcacc gtgagcgtga
360aggagaactg catctaccac aagagcatct tcaacggcgt gaacttcccc
gccgacggcc 420ccgtgatgaa gaagatgacc accaactggg aggccagctg
cgagaagatc atgcccgtgc 480ctaagcaggg catcctgaag ggcgacgtga
gcatgtacct gctgctgaag gacggcggcc 540ggtaccggtg ccagttcgac
accgtgtaca aggccaagag cgtgcccagc aagatgcccg 600agtggcactt
catccagcac aagctgctgc gggaggaccg gagcgacgcc aagaaccaga
660agtggcagct gaccgagcac gccatcgcct tccccagcgc cctggcctga
gagctcgaat 720ttccccgatc gttcaaacat ttggcaataa agtttcttaa
gattgaatcc tgttgccggt 780cttgcgatga ttatcatata atttctgttg
aattacgtta agcatgtaat aattaacatg 840taatgcatga cgttatttat
gagatgggtt tttatgatta gagtcccgca attatacatt 900taatacgcga
tagaaaacaa aatatagcgc gcaaactagg ataaattatc gcgcgcggtg
960tcatctatgt tactagatcg ggaattctag tggccggccc agctgatatc
catcacactg 1020gcggccgcac tcgagatatc tagacccagc tttcttgtac
aaagttggca ttataagaaa 1080gcattgctta tcaatttgtt gcaacgaaca
ggtcactatc agtcaaaata aaatcattat 1140ttgccatcca gctgcagctc
tggcccgtgt ctcaaaatct ctgatgttac attgcacaag 1200ataaaaatat
atcatcatga acaataaaac tgtctgctta cataaacagt aatacaaggg
1260gtgttatgag ccatattcaa cgggaaacgt cgaggccgcg attaaattcc
aacatggatg 1320ctgatttata tgggtataaa tgggctcgcg ataatgtcgg
gcaatcaggt gcgacaatct 1380atcgcttgta tgggaagccc gatgcgccag
agttgtttct gaaacatggc aaaggtagcg 1440ttgccaatga tgttacagat
gagatggtca gactaaactg gctgacggaa tttatgcctc 1500ttccgaccat
caagcatttt atccgtactc ctgatgatgc atggttactc accactgcga
1560tccccggaaa aacagcattc caggtattag aagaatatcc tgattcaggt
gaaaatattg 1620ttgatgcgct ggcagtgttc ctgcgccggt tgcattcgat
tcctgtttgt aattgtcctt 1680ttaacagcga tcgcgtattt cgtctcgctc
aggcgcaatc acgaatgaat aacggtttgg 1740ttgatgcgag tgattttgat
gacgagcgta atggctggcc tgttgaacaa gtctggaaag 1800aaatgcataa
acttttgcca ttctcaccgg attcagtcgt cactcatggt gatttctcac
1860ttgataacct tatttttgac gaggggaaat taataggttg tattgatgtt
ggacgagtcg 1920gaatcgcaga ccgataccag gatcttgcca tcctatggaa
ctgcctcggt gagttttctc 1980cttcattaca gaaacggctt tttcaaaaat
atggtattga taatcctgat atgaataaat 2040tgcagtttca tttgatgctc
gatgagtttt tctaatcaga attggttaat tggttgtaac 2100attattcaga
ttgggccccg ttccactgag cgtcagaccc cgtagaaaag atcaaaggat
2160cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa
aaaccaccgc 2220taccagcggt ggtttgtttg ccggatcaag agctaccaac
tctttttccg aaggtaactg 2280gcttcagcag agcgcagata ccaaatactg
ttcttctagt gtagccgtag ttaggccacc 2340acttcaagaa ctctgtagca
ccgcctacat acctcgctct gctaatcctg ttaccagtgg 2400ctgctgccag
tggcgataag tcgtgtctta ccgggttgga ctcaagacga tagttaccgg
2460ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac acagcccagc
ttggagcgaa 2520cgacctacac cgaactgaga tacctacagc gtgagctatg
agaaagcgcc acgcttcccg 2580aagggagaaa ggcggacagg tatccggtaa
gcggcagggt cggaacagga gagcgcacga 2640gggagcttcc agggggaaac
gcctggtatc tttatagtcc tgtcgggttt cgccacctct 2700gacttgagcg
tcgatttttg tgatgctcgt caggggggcg gagcctatgg aaaaacgcca
2760gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac
atgttctttc 2820ctgcgttatc ccctgattct gtggataacc gtattaccgc
tagcatggat ctcggggacg 2880tctaactact aagcgagagt agggaactgc
caggcatcaa ataaaacgaa aggctcagtc 2940ggaagactgg gcctttcgtt
ttatctgttg tttgtcggtg aacgctctcc tgagtaggac 3000aaatccgccg
ggagcggatt tgaacgttgt gaagcaacgg cccggagggt ggcgggcagg
3060acgcccgcca taaactgcca ggcatcaaac taagcagaag gccatcctga
cggatggcct 3120ttttgcgttt ctacaaactc ttcctgttag ttagttactt
aagctcgggc cccaaataat 3180gattttattt tgactgatag tgacctgttc
gttgcaacaa attgataagc aatgcttttt 3240tataatgcca actttgtaca
aaaaagcagg ctggcgccgg aaccaattca g
3291414642DNAArtificialNucleotide sequence of QC300 41catggcccac
agcaagcacg gcctgaagga ggagatgacc atgaagtacc acatggaggg 60ctgcgtgaac
ggccacaagt tcgtgatcac cggcgagggc atcggctacc ccttcaaggg
120caagcagacc atcaacctgt gcgtgatcga gggcggcccc ctgcccttca
gcgaggacat 180cctgagcgcc ggcttcaagt acggcgaccg gatcttcacc
gagtaccccc aggacatcgt 240ggactacttc aagaacagct gccccgccgg
ctacacctgg ggccggagct tcctgttcga 300ggacggcgcc gtgtgcatct
gtaacgtgga catcaccgtg agcgtgaagg agaactgcat 360ctaccacaag
agcatcttca acggcgtgaa cttccccgcc gacggccccg tgatgaagaa
420gatgaccacc aactgggagg ccagctgcga gaagatcatg cccgtgccta
agcagggcat 480cctgaagggc gacgtgagca tgtacctgct gctgaaggac
ggcggccggt accggtgcca 540gttcgacacc gtgtacaagg ccaagagcgt
gcccagcaag atgcccgagt ggcacttcat 600ccagcacaag ctgctgcggg
aggaccggag cgacgccaag aaccagaagt ggcagctgac 660cgagcacgcc
atcgccttcc ccagcgccct ggcctgagag ctcgaatttc cccgatcgtt
720caaacatttg gcaataaagt ttcttaagat tgaatcctgt tgccggtctt
gcgatgatta 780tcatataatt tctgttgaat tacgttaagc atgtaataat
taacatgtaa tgcatgacgt 840tatttatgag atgggttttt atgattagag
tcccgcaatt atacatttaa tacgcgatag 900aaaacaaaat
atagcgcgca aactaggata aattatcgcg cgcggtgtca tctatgttac
960tagatcggga attctagtgg ccggcccagc tgatatccat cacactggcg
gccgcactcg 1020agatatctag acccagcttt cttgtacaaa gttggcatta
taagaaagca ttgcttatca 1080atttgttgca acgaacaggt cactatcagt
caaaataaaa tcattatttg ccatccagct 1140gcagctctgg cccgtgtctc
aaaatctctg atgttacatt gcacaagata aaaatatatc 1200atcatgaaca
ataaaactgt ctgcttacat aaacagtaat acaaggggtg ttatgagcca
1260tattcaacgg gaaacgtcga ggccgcgatt aaattccaac atggatgctg
atttatatgg 1320gtataaatgg gctcgcgata atgtcgggca atcaggtgcg
acaatctatc gcttgtatgg 1380gaagcccgat gcgccagagt tgtttctgaa
acatggcaaa ggtagcgttg ccaatgatgt 1440tacagatgag atggtcagac
taaactggct gacggaattt atgcctcttc cgaccatcaa 1500gcattttatc
cgtactcctg atgatgcatg gttactcacc actgcgatcc ccggaaaaac
1560agcattccag gtattagaag aatatcctga ttcaggtgaa aatattgttg
atgcgctggc 1620agtgttcctg cgccggttgc attcgattcc tgtttgtaat
tgtcctttta acagcgatcg 1680cgtatttcgt ctcgctcagg cgcaatcacg
aatgaataac ggtttggttg atgcgagtga 1740ttttgatgac gagcgtaatg
gctggcctgt tgaacaagtc tggaaagaaa tgcataaact 1800tttgccattc
tcaccggatt cagtcgtcac tcatggtgat ttctcacttg ataaccttat
1860ttttgacgag gggaaattaa taggttgtat tgatgttgga cgagtcggaa
tcgcagaccg 1920ataccaggat cttgccatcc tatggaactg cctcggtgag
ttttctcctt cattacagaa 1980acggcttttt caaaaatatg gtattgataa
tcctgatatg aataaattgc agtttcattt 2040gatgctcgat gagtttttct
aatcagaatt ggttaattgg ttgtaacatt attcagattg 2100ggccccgttc
cactgagcgt cagaccccgt agaaaagatc aaaggatctt cttgagatcc
2160tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac
cagcggtggt 2220ttgtttgccg gatcaagagc taccaactct ttttccgaag
gtaactggct tcagcagagc 2280gcagatacca aatactgttc ttctagtgta
gccgtagtta ggccaccact tcaagaactc 2340tgtagcaccg cctacatacc
tcgctctgct aatcctgtta ccagtggctg ctgccagtgg 2400cgataagtcg
tgtcttaccg ggttggactc aagacgatag ttaccggata aggcgcagcg
2460gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga
cctacaccga 2520actgagatac ctacagcgtg agctatgaga aagcgccacg
cttcccgaag ggagaaaggc 2580ggacaggtat ccggtaagcg gcagggtcgg
aacaggagag cgcacgaggg agcttccagg 2640gggaaacgcc tggtatcttt
atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg 2700atttttgtga
tgctcgtcag gggggcggag cctatggaaa aacgccagca acgcggcctt
2760tttacggttc ctggcctttt gctggccttt tgctcacatg ttctttcctg
cgttatcccc 2820tgattctgtg gataaccgta ttaccgctag catggatctc
ggggacgtct aactactaag 2880cgagagtagg gaactgccag gcatcaaata
aaacgaaagg ctcagtcgga agactgggcc 2940tttcgtttta tctgttgttt
gtcggtgaac gctctcctga gtaggacaaa tccgccggga 3000gcggatttga
acgttgtgaa gcaacggccc ggagggtggc gggcaggacg cccgccataa
3060actgccaggc atcaaactaa gcagaaggcc atcctgacgg atggcctttt
tgcgtttcta 3120caaactcttc ctgttagtta gttacttaag ctcgggcccc
aaataatgat tttattttga 3180ctgatagtga cctgttcgtt gcaacaaatt
gataagcaat gcttttttat aatgccaact 3240ttgtacaaaa aagcaggctg
gcgccggaac caattcagtc gacccgggct ggtaacctag 3300ttaataaatt
aaaaggagaa cattattaat gtgaaaatca tgcaaactta aaaaaatcat
3360caacaacata attttataat tctaataaaa tatttttttc ttttaattct
ttaatcaatg 3420tctaacattt atctattatt tatcacattt gttatttaat
gtttctatct ttagagctat 3480caaaaattta aaatggtgga accttactca
ttgggttgag ttcacctaac ttgtttaata 3540aatagatcaa tctaattcta
ttcatctctt agtaagtatt aaaaatgttg gcccaactct 3600ccatatattg
gtgagttata ggagtttact cacttaaaat gataataaaa atatttgttt
3660taaaatcatt ttttaaacaa aaaaataatg tttcagatta tttattctta
gatcataact 3720tacaagcaac atttcaatga tcaattcaat tgtcagaatc
aaaaccaatt gaaagagaca 3780aatattcatg ctaatcttca tcagaaacta
aacattgaca taaagcaata gtattggaac 3840tacaagttat aattatgtac
tttgtaatag tgtgaagaaa atcaaaatac aaatagtaat 3900catcatgata
aatgctatct caatttattc aattataaaa atatagaaat aaaatgtgat
3960aaatggataa catgtgtgct aatccagtcc actacgccca ccacaagttc
aacccaatgg 4020actggatcat cttctttttt tcttactgat ttctctcttc
ttccattcta atccatccca 4080aaagtagatg tttactattt cccctttcat
agtttcacaa gtgtgcgcag aggccaaact 4140gaaagtggta gtacatggtg
taatattaat cacagatgtg ctctcatgaa gtctgaactt 4200acagctcaag
taacaaccaa caagtaaaaa gtacagaaga tagcataaaa aatgaaggta
4260gaacaaattc caagttttct acatattacg gtgcataaat caaccacgtg
aaggctccat 4320ttatttgccg ctataacatt ggtgaccctc ttccacaaat
agtaagtaat aaaaccaagt 4380acaaaaaaat gttcaactac caagtgatca
caatcttcat gcatctgagt cacactattg 4440ccctttgctc atgaagtaca
ctttactcac cgccaaagtt cactcaacac tgtagaacaa 4500aggaatcata
taaataatgc atatctctcc cttaagcctt caacacatac aaaagtgaca
4560caccaaatca aagacacctg agccattcaa ttcccctcct ttattgcttt
caagtttcaa 4620cactaatttt attatctgaa ac
4642428187DNAArtificialNucleotide sequence of PHP25224 42cgcgcgtttc
ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 60agcttgtctg
taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt
120tggcgggtgt cggggctggc ttaactatgc ggcatcagag cagattgtac
tgagagtgca 180ccatatggac atattgtcgt tagaacgcgg ctacaattaa
tacataacct tatgtatcat 240acacatacga tttaggtgac actatagaac
ggcgcgccgg taccgggccc cccctcgagt 300gcggccgcaa gcttgtcgac
ggagatcacc actttgtaca agaaagctga acgagaaacg 360taaaatgata
taaatatcaa tatattaaat tagattttgc ataaaaaaca gactacataa
420tactgtaaaa cacaacatat ccagtcacta tggtcgacct gcagactggc
tgtgtataag 480ggagcctgac atttatattc cccagaacat caggttaatg
gcgtttttga tgtcattttc 540gcggtggctg agatcagcca cttcttcccc
gataacggag accggcacac tggccatatc 600ggtggtcatc atgcgccagc
tttcatcccc gatatgcacc accgggtaaa gttcacggga 660gactttatct
gacagcagac gtgcactggc cagggggatc accatccgtc gcccgggcgt
720gtcaataata tcactctgta catccacaaa cagacgataa cggctctctc
ttttataggt 780gtaaacctta aactgcattt caccagtccc tgttctcgtc
agcaaaagag ccgttcattt 840caataaaccg ggcgacctca gccatccctt
cctgattttc cgctttccag cgttcggcac 900gcagacgacg ggcttcattc
tgcatggttg tgcttaccag accggagata ttgacatcat 960atatgccttg
agcaactgat agctgtcgct gtcaactgtc actgtaatac gctgcttcat
1020agcacacctc tttttgacat acttcgggta tacatatcag tatatattct
tataccgcaa 1080aaatcagcgc gcaaatacgc atactgttat ctggctttta
gtaagccgga tccacgcgtt 1140tacgccccgc cctgccactc atcgcagtac
tgttgtaatt cattaagcat tctgccgaca 1200tggaagccat cacagacggc
atgatgaacc tgaatcgcca gcggcatcag caccttgtcg 1260ccttgcgtat
aatatttgcc catggtgaaa acgggggcga agaagttgtc catattggcc
1320acgtttaaat caaaactggt gaaactcacc cagggattgg ctgagacgaa
aaacatattc 1380tcaataaacc ctttagggaa ataggccagg ttttcaccgt
aacacgccac atcttgcgaa 1440tatatgtgta gaaactgccg gaaatcgtcg
tggtattcac tccagagcga tgaaaacgtt 1500tcagtttgct catggaaaac
ggtgtaacaa gggtgaacac tatcccatat caccagctca 1560ccgtctttca
ttgccatacg gaattccgga tgagcattca tcaggcgggc aagaatgtga
1620ataaaggccg gataaaactt gtgcttattt ttctttacgg tctttaaaaa
ggccgtaata 1680tccagctgaa cggtctggtt ataggtacat tgagcaactg
actgaaatgc ctcaaaatgt 1740tctttacgat gccattggga tatatcaacg
gtggtatatc cagtgatttt tttctccatt 1800ttagcttcct tagctcctga
aaatctcgcc ggatcctaac tcaaaatcca cacattatac 1860gagccggaag
cataaagtgt aaagcctggg gtgcctaatg cggccgccat agtgactgga
1920tatgttgtgt tttacagtat tatgtagtct gttttttatg caaaatctaa
tttaatatat 1980tgatatttat atcattttac gtttctcgtt cagctttttt
gtacaaactt gtgattcttc 2040cttaccaatc atactaatta ttttgggtta
aatattaatc attattttta agatattaat 2100taagaaatta aaagattttt
taaaaaaatg tataaaatta tattattcat gatttttcat 2160acatttgatt
ttgataataa atatattttt tttaatttct taaaaaatgt tgcaagacac
2220ttattagaca tagtcttgtt ctgtttacaa aagcattcat catttaatac
attaaaaaat 2280atttaatact aacagtagaa tcttcttgtg agtggtgtgg
gagtaggcaa cctggcattg 2340aaacgagaga aagagagtca gaaccagaag
acaaataaaa agtatgcaac aaacaaatca 2400aaatcaaagg gcaaaggctg
gggttggctc aattggttgc tacattcaat tttcaactca 2460gtcaacggtt
gagattcact ctgacttccc caatctaagc cgcggatgca aacggttgaa
2520tctaacccac aatccaatct cgttacttag gggcttttcc gtcattaact
cacccctgcc 2580acccggtttc cctataaatt ggaactcaat gctcccctct
aaactcgtat cgcttcagag 2640ttgagaccaa gacacactcg ttcatatatc
tctctgctct tctcttctct tctacctctc 2700aaggtacttt tcttctccct
ctaccaaatc ctagattccg tggttcaatt tcggatcttg 2760cacttctggt
ttgctttgcc ttgctttttc ctcaactggg tccatctagg atccatgtga
2820aactctactc tttctttaat atctgcggaa tacgcgtttg actttcagat
ctagtcgaaa 2880tcatttcata attgcctttc tttcttttag cttatgagaa
ataaaatcac ttttttttta 2940tttcaaaata aaccttgggc cttgtgctga
ctgagatggg gtttggtgat tacagaattt 3000tagcgaattt tgtaattgta
cttgtttgtc tgtagttttg ttttgttttc ttgtttctca 3060tacattcctt
aggcttcaat tttattcgag tataggtcac aataggaatt caaactttga
3120gcaggggaat taatcccttc cttcaaatcc agtttgtttg tatatatgtt
taaaaaatga 3180aacttttgct ttaaattcta ttataacttt ttttatggct
gaaatttttg catgtgtctt 3240tgctctctgt tgtaaattta ctgtttaggt
actaactcta ggcttgttgt gcagtttttg 3300aagtataacc atgccacaca
acacaatggc ggccaccgct tccagaacca cccgattctc 3360ttcttcctct
tcacacccca ccttccccaa acgcattact agatccaccc tccctctctc
3420tcatcaaacc ctcaccaaac ccaaccacgc tctcaaaatc aaatgttcca
tctccaaacc 3480ccccacggcg gcgcccttca ccaaggaagc gccgaccacg
gagcccttcg tgtcacggtt 3540cgcctccggc gaacctcgca agggcgcgga
catccttgtg gaggcgctgg agaggcaggg 3600cgtgacgacg gtgttcgcgt
accccggcgg tgcgtcgatg gagatccacc aggcgctcac 3660gcgctccgcc
gccatccgca acgtgctccc gcgccacgag cagggcggcg tcttcgccgc
3720cgaaggctac gcgcgttcct ccggcctccc cggcgtctgc attgccacct
ccggccccgg 3780cgccaccaac ctcgtgagcg gcctcgccga cgctttaatg
gacagcgtcc cagtcgtcgc 3840catcaccggc caggtcgccc gccggatgat
cggcaccgac gccttccaag aaaccccgat 3900cgtggaggtg agcagatcca
tcacgaagca caactacctc atcctcgacg tcgacgacat 3960cccccgcgtc
gtcgccgagg ctttcttcgt cgccacctcc ggccgccccg gtccggtcct
4020catcgacatt cccaaagacg ttcagcagca actcgccgtg cctaattggg
acgagcccgt 4080taacctcccc ggttacctcg ccaggctgcc caggcccccc
gccgaggccc aattggaaca 4140cattgtcaga ctcatcatgg aggcccaaaa
gcccgttctc tacgtcggcg gtggcagttt 4200gaattccagt gctgaattga
ggcgctttgt tgaactcact ggtattcccg ttgctagcac 4260tttaatgggt
cttggaactt ttcctattgg tgatgaatat tcccttcaga tgctgggtat
4320gcatggtact gtttatgcta actatgctgt tgacaatagt gatttgttgc
ttgcctttgg 4380ggtaaggttt gatgaccgtg ttactgggaa gcttgaggct
tttgctagta gggctaagat 4440tgttcacatt gatattgatt ctgccgagat
tgggaagaac aagcaggcgc acgtgtcggt 4500ttgcgcggat ttgaagttgg
ccttgaaggg aattaatatg attttggagg agaaaggagt 4560ggagggtaag
tttgatcttg gaggttggag agaagagatt aatgtgcaga aacacaagtt
4620tccattgggt tacaagacat tccaggacgc gatttctccg cagcatgcta
tcgaggttct 4680tgatgagttg actaatggag atgctattgt tagtactggg
gttgggcagc atcaaatgtg 4740ggctgcgcag ttttacaagt acaagagacc
gaggcagtgg ttgacctcag ggggtcttgg 4800agccatgggt tttggattgc
ctgcggctat tggtgctgct gttgctaacc ctggggctgt 4860tgtggttgac
attgatgggg atggtagttt catcatgaat gttcaggagt tggccactat
4920aagagtggag aatctcccag ttaagatatt gttgttgaac aatcagcatt
tgggtatggt 4980ggttcagttg gaggataggt tctacaagtc caatagagct
cacacctatc ttggagatcc 5040gtctagcgag agcgagatat tcccaaacat
gctcaagttt gctgatgctt gtgggatacc 5100ggcagcgcga gtgacgaaga
aggaagagct tagagcggca attcagagaa tgttggacac 5160ccctggcccc
taccttcttg atgtcattgt gccccatcag gagcatgtgt tgccgatgat
5220tcccagtaat ggatccttca aggatgtgat aactgagggt gatggtagaa
cgaggtactg 5280attgcctaga ccaaatgttc cttgatgctt gttttgtaca
atatatataa gataatgctg 5340tcctagttgc aggatttggc ctgtggtgag
catcatagtc tgtagtagtt ttggtagcaa 5400gacattttat tttcctttta
tttaacttac tacatgcagt agcatctatc tatctctgta 5460gtctgatatc
tcctgttgtc tgtattgtgc cgttggattt tttgctgtag tgagactgaa
5520aatgatgtgc tagtaataat atttctgtta gaaatctaag tagagaatct
gttgaagaag 5580tcaaaagcta atggaatcag gttacatatt caatgttttt
ctttttttag cggttggtag 5640acgtgtagat tcaacttctc ttggagctca
cctaggcaat cagtaaaatg catattcctt 5700ttttaacttg ccatttattt
acttttagtg gaaattgtga ccaatttgtt catgtagaac 5760ggatttggac
cattgcgtcc acaaaacgtc tcttttgctc gatcttcaca aagcgatacc
5820gaaatccaga gatagttttc aaaagtcaga aatggcaaag ttataaatag
taaaacagaa 5880tagatgctgt aatcgacttc aataacaagt ggcatcacgt
ttctagttct agacccgggt 5940accggcgcgc ccgatcatcc ggatatagtt
cctcctttca gcaaaaaacc cctcaagacc 6000cgtttagagg ccccaagggg
ttatgctagt tattgctcag cggtggcagc agccaactca 6060gcttcctttc
gggctttgtt agcagccgga tcgatccaag ctgtacctca ctattccttt
6120gccctcggac gagtgctggg gcgtcggttt ccactatcgg cgagtacttc
tacacagcca 6180tcggtccaga cggccgcgct tctgcgggcg atttgtgtac
gcccgacagt cccggctccg 6240gatcggacga ttgcgtcgca tcgaccctgc
gcccaagctg catcatcgaa attgccgtca 6300accaagctct gatagagttg
gtcaagacca atgcggagca tatacgcccg gagccgcggc 6360gatcctgcaa
gctccggatg cctccgctcg aagtagcgcg tctgctgctc catacaagcc
6420aaccacggcc tccagaagaa gatgttggcg acctcgtatt gggaatcccc
gaacatcgcc 6480tcgctccagt caatgaccgc tgttatgcgg ccattgtccg
tcaggacatt gttggagccg 6540aaatccgcgt gcacgaggtg ccggacttcg
gggcagtcct cggcccaaag catcagctca 6600tcgagagcct gcgcgacgga
cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac 6660acatggggat
cagcaatcgc gcatatgaaa tcacgccatg tagtgtattg accgattcct
6720tgcggtccga atgggccgaa cccgctcgtc tggctaagat cggccgcagc
gatcgcatcc 6780atagcctccg cgaccggctg cagaacagcg ggcagttcgg
tttcaggcag gtcttgcaac 6840gtgacaccct gtgcacggcg ggagatgcaa
taggtcaggc tctcgctgaa ttccccaatg 6900tcaagcactt ccggaatcgg
gagcgcggcc gatgcaaagt gccgataaac ataacgatct 6960ttgtagaaac
catcggcgca gctatttacc cgcaggacat atccacgccc tcctacatcg
7020aagctgaaag cacgagattc ttcgccctcc gagagctgca tcaggtcgga
gacgctgtcg 7080aacttttcga tcagaaactt ctcgacagac gtcgcggtga
gttcaggctt ttccatgggt 7140atatctcctt cttaaagtta aacaaaatta
tttctagagg gaaaccgttg tggtctccct 7200atagtgagtc gtattaattt
cgcgggatcg agatctgatc aacctgcatt aatgaatcgg 7260ccaacgcgcg
gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga
7320ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa
aggcggtaat 7380acggttatcc acagaatcag gggataacgc aggaaagaac
atgtgagcaa aaggccagca 7440aaaggccagg aaccgtaaaa aggccgcgtt
gctggcgttt ttccataggc tccgcccccc 7500tgacgagcat cacaaaaatc
gacgctcaag tcagaggtgg cgaaacccga caggactata 7560aagataccag
gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc
7620gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt
ctcaatgctc 7680acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc
aagctgggct gtgtgcacga 7740accccccgtt cagcccgacc gctgcgcctt
atccggtaac tatcgtcttg agtccaaccc 7800ggtaagacac gacttatcgc
cactggcagc agccactggt aacaggatta gcagagcgag 7860gtatgtaggc
ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag
7920gacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa
gagttggtag 7980ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt
ttttttgttt gcaagcagca 8040gattacgcgc agaaaaaaag gatctcaaga
agatcctttg atcttttcta cggggtctga 8100cgctcagtgg aacgaaaact
cacgttaagg gattttggtc atgacattaa cctataaaaa 8160taggcgtatc
acgaggccct ttcgtct 8187
43 8945 DNA Artificial Nucleotide sequence of QC302 43
tttgtacaaa cttgtgattc ttccttacca atcatactaa ttattttggg
ttaaatatta 60atcattattt ttaagatatt aattaagaaa ttaaaagatt ttttaaaaaa
atgtataaaa 120ttatattatt catgattttt catacatttg attttgataa
taaatatatt ttttttaatt 180tcttaaaaaa tgttgcaaga cacttattag
acatagtctt gttctgttta caaaagcatt 240catcatttaa tacattaaaa
aatatttaat actaacagta gaatcttctt gtgagtggtg 300tgggagtagg
caacctggca ttgaaacgag agaaagagag tcagaaccag aagacaaata
360aaaagtatgc aacaaacaaa tcaaaatcaa agggcaaagg ctggggttgg
ctcaattggt 420tgctacattc aattttcaac tcagtcaacg gttgagattc
actctgactt ccccaatcta 480agccgcggat gcaaacggtt gaatctaacc
cacaatccaa tctcgttact taggggcttt 540tccgtcatta actcacccct
gccacccggt ttccctataa attggaactc aatgctcccc 600tctaaactcg
tatcgcttca gagttgagac caagacacac tcgttcatat atctctctgc
660tcttctcttc tcttctacct ctcaaggtac ttttcttctc cctctaccaa
atcctagatt 720ccgtggttca atttcggatc ttgcacttct ggtttgcttt
gccttgcttt ttcctcaact 780gggtccatct aggatccatg tgaaactcta
ctctttcttt aatatctgcg gaatacgcgt 840ttgactttca gatctagtcg
aaatcatttc ataattgcct ttctttcttt tagcttatga 900gaaataaaat
cacttttttt ttatttcaaa ataaaccttg ggccttgtgc tgactgagat
960ggggtttggt gattacagaa ttttagcgaa ttttgtaatt gtacttgttt
gtctgtagtt 1020ttgttttgtt ttcttgtttc tcatacattc cttaggcttc
aattttattc gagtataggt 1080cacaatagga attcaaactt tgagcagggg
aattaatccc ttccttcaaa tccagtttgt 1140ttgtatatat gtttaaaaaa
tgaaactttt gctttaaatt ctattataac tttttttatg 1200gctgaaattt
ttgcatgtgt ctttgctctc tgttgtaaat ttactgttta ggtactaact
1260ctaggcttgt tgtgcagttt ttgaagtata accatgccac acaacacaat
ggcggccacc 1320gcttccagaa ccacccgatt ctcttcttcc tcttcacacc
ccaccttccc caaacgcatt 1380actagatcca ccctccctct ctctcatcaa
accctcacca aacccaacca cgctctcaaa 1440atcaaatgtt ccatctccaa
accccccacg gcggcgccct tcaccaagga agcgccgacc 1500acggagccct
tcgtgtcacg gttcgcctcc ggcgaacctc gcaagggcgc ggacatcctt
1560gtggaggcgc tggagaggca gggcgtgacg acggtgttcg cgtaccccgg
cggtgcgtcg 1620atggagatcc accaggcgct cacgcgctcc gccgccatcc
gcaacgtgct cccgcgccac 1680gagcagggcg gcgtcttcgc cgccgaaggc
tacgcgcgtt cctccggcct ccccggcgtc 1740tgcattgcca cctccggccc
cggcgccacc aacctcgtga gcggcctcgc cgacgcttta 1800atggacagcg
tcccagtcgt cgccatcacc ggccaggtcg cccgccggat gatcggcacc
1860gacgccttcc aagaaacccc gatcgtggag gtgagcagat ccatcacgaa
gcacaactac 1920ctcatcctcg acgtcgacga catcccccgc gtcgtcgccg
aggctttctt cgtcgccacc 1980tccggccgcc ccggtccggt cctcatcgac
attcccaaag acgttcagca gcaactcgcc 2040gtgcctaatt gggacgagcc
cgttaacctc cccggttacc tcgccaggct gcccaggccc 2100cccgccgagg
cccaattgga acacattgtc agactcatca tggaggccca aaagcccgtt
2160ctctacgtcg gcggtggcag tttgaattcc agtgctgaat tgaggcgctt
tgttgaactc 2220actggtattc ccgttgctag cactttaatg ggtcttggaa
cttttcctat tggtgatgaa 2280tattcccttc agatgctggg tatgcatggt
actgtttatg ctaactatgc tgttgacaat 2340agtgatttgt tgcttgcctt
tggggtaagg tttgatgacc gtgttactgg gaagcttgag 2400gcttttgcta
gtagggctaa gattgttcac attgatattg attctgccga gattgggaag
2460aacaagcagg cgcacgtgtc ggtttgcgcg gatttgaagt tggccttgaa
gggaattaat 2520atgattttgg aggagaaagg agtggagggt aagtttgatc
ttggaggttg gagagaagag 2580attaatgtgc agaaacacaa gtttccattg
ggttacaaga cattccagga cgcgatttct 2640ccgcagcatg ctatcgaggt
tcttgatgag ttgactaatg gagatgctat tgttagtact 2700ggggttgggc
agcatcaaat gtgggctgcg cagttttaca agtacaagag accgaggcag
2760tggttgacct cagggggtct tggagccatg ggttttggat tgcctgcggc
tattggtgct 2820gctgttgcta accctggggc tgttgtggtt gacattgatg
gggatggtag tttcatcatg 2880aatgttcagg agttggccac tataagagtg
gagaatctcc cagttaagat attgttgttg 2940aacaatcagc
atttgggtat ggtggttcag ttggaggata ggttctacaa gtccaataga
3000gctcacacct atcttggaga tccgtctagc gagagcgaga tattcccaaa
catgctcaag 3060tttgctgatg cttgtgggat accggcagcg cgagtgacga
agaaggaaga gcttagagcg 3120gcaattcaga gaatgttgga cacccctggc
ccctaccttc ttgatgtcat tgtgccccat 3180caggagcatg tgttgccgat
gattcccagt aatggatcct tcaaggatgt gataactgag 3240ggtgatggta
gaacgaggta ctgattgcct agaccaaatg ttccttgatg cttgttttgt
3300acaatatata taagataatg ctgtcctagt tgcaggattt ggcctgtggt
gagcatcata 3360gtctgtagta gttttggtag caagacattt tattttcctt
ttatttaact tactacatgc 3420agtagcatct atctatctct gtagtctgat
atctcctgtt gtctgtattg tgccgttgga 3480ttttttgctg tagtgagact
gaaaatgatg tgctagtaat aatatttctg ttagaaatct 3540aagtagagaa
tctgttgaag aagtcaaaag ctaatggaat caggttacat attcaatgtt
3600tttctttttt tagcggttgg tagacgtgta gattcaactt ctcttggagc
tcacctaggc 3660aatcagtaaa atgcatattc cttttttaac ttgccattta
tttactttta gtggaaattg 3720tgaccaattt gttcatgtag aacggatttg
gaccattgcg tccacaaaac gtctcttttg 3780ctcgatcttc acaaagcgat
accgaaatcc agagatagtt ttcaaaagtc agaaatggca 3840aagttataaa
tagtaaaaca gaatagatgc tgtaatcgac ttcaataaca agtggcatca
3900cgtttctagt tctagacccg ggtaccggcg cgcccgatca tccggatata
gttcctcctt 3960tcagcaaaaa acccctcaag acccgtttag aggccccaag
gggttatgct agttattgct 4020cagcggtggc agcagccaac tcagcttcct
ttcgggcttt gttagcagcc ggatcgatcc 4080aagctgtacc tcactattcc
tttgccctcg gacgagtgct ggggcgtcgg tttccactat 4140cggcgagtac
ttctacacag ccatcggtcc agacggccgc gcttctgcgg gcgatttgtg
4200tacgcccgac agtcccggct ccggatcgga cgattgcgtc gcatcgaccc
tgcgcccaag 4260ctgcatcatc gaaattgccg tcaaccaagc tctgatagag
ttggtcaaga ccaatgcgga 4320gcatatacgc ccggagccgc ggcgatcctg
caagctccgg atgcctccgc tcgaagtagc 4380gcgtctgctg ctccatacaa
gccaaccacg gcctccagaa gaagatgttg gcgacctcgt 4440attgggaatc
cccgaacatc gcctcgctcc agtcaatgac cgctgttatg cggccattgt
4500ccgtcaggac attgttggag ccgaaatccg cgtgcacgag gtgccggact
tcggggcagt 4560cctcggccca aagcatcagc tcatcgagag cctgcgcgac
ggacgcactg acggtgtcgt 4620ccatcacagt ttgccagtga tacacatggg
gatcagcaat cgcgcatatg aaatcacgcc 4680atgtagtgta ttgaccgatt
ccttgcggtc cgaatgggcc gaacccgctc gtctggctaa 4740gatcggccgc
agcgatcgca tccatagcct ccgcgaccgg ctgcagaaca gcgggcagtt
4800cggtttcagg caggtcttgc aacgtgacac cctgtgcacg gcgggagatg
caataggtca 4860ggctctcgct gaattcccca atgtcaagca cttccggaat
cgggagcgcg gccgatgcaa 4920agtgccgata aacataacga tctttgtaga
aaccatcggc gcagctattt acccgcagga 4980catatccacg ccctcctaca
tcgaagctga aagcacgaga ttcttcgccc tccgagagct 5040gcatcaggtc
ggagacgctg tcgaactttt cgatcagaaa cttctcgaca gacgtcgcgg
5100tgagttcagg cttttccatg ggtatatctc cttcttaaag ttaaacaaaa
ttatttctag 5160agggaaaccg ttgtggtctc cctatagtga gtcgtattaa
tttcgcggga tcgagatctg 5220atcaacctgc attaatgaat cggccaacgc
gcggggagag gcggtttgcg tattgggcgc 5280tcttccgctt cctcgctcac
tgactcgctg cgctcggtcg ttcggctgcg gcgagcggta 5340tcagctcact
caaaggcggt aatacggtta tccacagaat caggggataa cgcaggaaag
5400aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc
gttgctggcg 5460tttttccata ggctccgccc ccctgacgag catcacaaaa
atcgacgctc aagtcagagg 5520tggcgaaacc cgacaggact ataaagatac
caggcgtttc cccctggaag ctccctcgtg 5580cgctctcctg ttccgaccct
gccgcttacc ggatacctgt ccgcctttct cccttcggga 5640agcgtggcgc
tttctcaatg ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc
5700tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc
cttatccggt 5760aactatcgtc ttgagtccaa cccggtaaga cacgacttat
cgccactggc agcagccact 5820ggtaacagga ttagcagagc gaggtatgta
ggcggtgcta cagagttctt gaagtggtgg 5880cctaactacg gctacactag
aaggacagta tttggtatct gcgctctgct gaagccagtt 5940accttcggaa
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt
6000ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca
agaagatcct 6060ttgatctttt ctacggggtc tgacgctcag tggaacgaaa
actcacgtta agggattttg 6120gtcatgacat taacctataa aaataggcgt
atcacgaggc cctttcgtct cgcgcgtttc 6180ggtgatgacg gtgaaaacct
ctgacacatg cagctcccgg agacggtcac agcttgtctg 6240taagcggatg
ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt tggcgggtgt
6300cggggctggc ttaactatgc ggcatcagag cagattgtac tgagagtgca
ccatatggac 6360atattgtcgt tagaacgcgg ctacaattaa tacataacct
tatgtatcat acacatacga 6420tttaggtgac actatagaac ggcgcgccgg
taccgggccc cccctcgagt gcggccgcaa 6480gcttgtcgac ggagatcacc
actttgtaca agaaagctgg gtctagatat ctcgagtgcg 6540gccgccagtg
tgatggatat cagctgggcc ggccactaga attcccgatc tagtaacata
6600gatgacaccg cgcgcgataa tttatcctag tttgcgcgct atattttgtt
ttctatcgcg 6660tattaaatgt ataattgcgg gactctaatc ataaaaaccc
atctcataaa taacgtcatg 6720cattacatgt taattattac atgcttaacg
taattcaaca gaaattatat gataatcatc 6780gcaagaccgg caacaggatt
caatcttaag aaactttatt gccaaatgtt tgaacgatcg 6840gggaaattcg
agctctcagg ccagggcgct ggggaaggcg atggcgtgct cggtcagctg
6900ccacttctgg ttcttggcgt cgctccggtc ctcccgcagc agcttgtgct
ggatgaagtg 6960ccactcgggc atcttgctgg gcacgctctt ggccttgtac
acggtgtcga actggcaccg 7020gtaccggccg ccgtccttca gcagcaggta
catgctcacg tcgcccttca ggatgccctg 7080cttaggcacg ggcatgatct
tctcgcagct ggcctcccag ttggtggtca tcttcttcat 7140cacggggccg
tcggcgggga agttcacgcc gttgaagatg ctcttgtggt agatgcagtt
7200ctccttcacg ctcacggtga tgtccacgtt acagatgcac acggcgccgt
cctcgaacag 7260gaagctccgg ccccaggtgt agccggcggg gcagctgttc
ttgaagtagt ccacgatgtc 7320ctgggggtac tcggtgaaga tccggtcgcc
gtacttgaag ccggcgctca ggatgtcctc 7380gctgaagggc agggggccgc
cctcgatcac gcacaggttg atggtctgct tgcccttgaa 7440ggggtagccg
atgccctcgc cggtgatcac gaacttgtgg ccgttcacgc agccctccat
7500gtggtacttc atggtcatct cctccttcag gccgtgcttg ctgtgggcca
tggtttcaga 7560taataaaatt agtgttgaaa cttgaaagca ataaaggagg
ggaattgaat ggctcaggtg 7620tctttgattt ggtgtgtcac ttttgtatgt
gttgaaggct taagggagag atatgcatta 7680tttatatgat tcctttgttc
tacagtgttg agtgaacttt ggcggtgagt aaagtgtact 7740tcatgagcaa
agggcaatag tgtgactcag atgcatgaag attgtgatca cttggtagtt
7800gaacattttt ttgtacttgg ttttattact tactatttgt ggaagagggt
caccaatgtt 7860atagcggcaa ataaatggag ccttcacgtg gttgatttat
gcaccgtaat atgtagaaaa 7920cttggaattt gttctacctt cattttttat
gctatcttct gtacttttta cttgttggtt 7980gttacttgag ctgtaagttc
agacttcatg agagcacatc tgtgattaat attacaccat 8040gtactaccac
tttcagtttg gcctctgcgc acacttgtga aactatgaaa ggggaaatag
8100taaacatcta cttttgggat ggattagaat ggaagaagag agaaatcagt
aagaaaaaaa 8160gaagatgatc cagtccattg ggttgaactt gtggtgggcg
tagtggactg gattagcaca 8220catgttatcc atttatcaca ttttatttct
atatttttat aattgaataa attgagatag 8280catttatcat gatgattact
atttgtattt tgattttctt cacactatta caaagtacat 8340aattataact
tgtagttcca atactattgc tttatgtcaa tgtttagttt ctgatgaaga
8400ttagcatgaa tatttgtctc tttcaattgg ttttgattct gacaattgaa
ttgatcattg 8460aaatgttgct tgtaagttat gatctaagaa taaataatct
gaaacattat ttttttgttt 8520aaaaaatgat tttaaaacaa atatttttat
tatcatttta agtgagtaaa ctcctataac 8580tcaccaatat atggagagtt
gggccaacat ttttaatact tactaagaga tgaatagaat 8640tagattgatc
tatttattaa acaagttagg tgaactcaac ccaatgagta aggttccacc
8700attttaaatt tttgatagct ctaaagatag aaacattaaa taacaaatgt
gataaataat 8760agataaatgt tagacattga ttaaagaatt aaaagaaaaa
aatattttat tagaattata 8820aaattatgtt gttgatgatt tttttaagtt
tgcatgattt tcacattaat aatgttctcc 8880ttttaattta ttaactaggt
taccagcccg ggtcgactga attggttccg gcgccagcct 8940gcttt
8945
44 2817 DNA Artificial Nucleotide sequence of pCR8/GW/TOPO 44
ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga
60taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga
120gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat
gcagctggca 180cgacaggttt cccgactgga aagcgggcag tgagcgcaac
gcaattaata cgcgtaccgc 240tagccaggaa gagtttgtag aaacgcaaaa
aggccatccg tcaggatggc cttctgctta 300gtttgatgcc tggcagttta
tggcgggcgt cctgcccgcc accctccggg ccgttgcttc 360acaacgttca
aatccgctcc cggcggattt gtcctactca ggagagcgtt caccgacaaa
420caacagataa aacgaaaggc ccagtcttcc gactgagcct ttcgttttat
ttgatgcctg 480gcagttccct actctcgcgt taacgctagc atggatgttt
tcccagtcac gacgttgtaa 540aacgacggcc agtcttaagc tcgggcccca
aataatgatt ttattttgac tgatagtgac 600ctgttcgttg caacaaattg
atgagcaatg cttttttata atgccaactt tgtacaaaaa 660agcaggctcc
gaattcgccc ttaagggcga attcgaccca gctttcttgt acaaagttgg
720cattataaaa aataattgct catcaatttg ttgcaacgaa caggtcacta
tcagtcaaaa 780taaaatcatt atttgccatc cagctgatat cccctatagt
gagtcgtatt acatggtcat 840agctgtttcc tggcagctct ggcccgtgtc
tcaaaatctc tgatgttaca ttgcacaaga 900taaaaatata tcatcatgcc
tcctctagac cagccaggac agaaatgcct cgacttcgct 960gctgcccaag
gttgccgggt gacgcacacc gtggaaacgg atgaaggcac gaacccagtg
1020gacataagcc tgttcggttc gtaagctgta atgcaagtag cgtatgcgct
cacgcaactg 1080gtccagaacc ttgaccgaac gcagcggtgg taacggcgca
gtggcggttt tcatggcttg 1140ttatgactgt ttttttgggg tacagtctat
gcctcgggca tccaagcagc aagcgcgtta 1200cgccgtgggt cgatgtttga
tgttatggag cagcaacgat gttacgcagc agggcagtcg 1260ccctaaaaca
aagttaaaca tcatgaggga agcggtgatc gccgaagtat cgactcaact
1320atcagaggta gttggcgtca tcgagcgcca tctcgaaccg acgttgctgg
ccgtacattt 1380gtacggctcc gcagtggatg gcggcctgaa gccacacagt
gatattgatt tgctggttac 1440ggtgaccgta aggcttgatg aaacaacgcg
gcgagctttg atcaacgacc ttttggaaac 1500ttcggcttcc cctggagaga
gcgagattct ccgcgctgta gaagtcacca ttgttgtgca 1560cgacgacatc
attccgtggc gttatccagc taagcgcgaa ctgcaatttg gagaatggca
1620gcgcaatgac attcttgcag gtatcttcga gccagccacg atcgacattg
atctggctat 1680cttgctgaca aaagcaagag aacatagcgt tgccttggta
ggtccagcgg cggaggaact 1740ctttgatccg gttcctgaac aggatctatt
tgaggcgcta aatgaaacct taacgctatg 1800gaactcgccg cccgactggg
ctggcgatga gcgaaatgta gtgcttacgt tgtcccgcat 1860ttggtacagc
gcagtaaccg gcaaaatcgc gccgaaggat gtcgctgccg actgggcaat
1920ggagcgcctg ccggcccagt atcagcccgt catacttgaa gctagacagg
cttatcttgg 1980acaagaagaa gatcgcttgg cctcgcgcgc agatcagttg
gaagaatttg tccactacgt 2040gaaaggcgag atcaccaagg tagtcggcaa
ataaccctcg agccacccat gaccaaaatc 2100ccttaacgtg agttacgcgt
cgttccactg agcgtcagac cccgtagaaa agatcaaagg 2160atcttcttga
gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc
2220gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc
cgaaggtaac 2280tggcttcagc agagcgcaga taccaaatac tgtccttcta
gtgtagccgt agttaggcca 2340ccacttcaag aactctgtag caccgcctac
atacctcgct ctgctaatcc tgttaccagt 2400ggctgctgcc agtggcgata
agtcgtgtct taccgggttg gactcaagac gatagttacc 2460ggataaggcg
cagcggtcgg gctgaacggg gggttcgtgc acacagccca gcttggagcg
2520aacgacctac accgaactga gatacctaca gcgtgagcat tgagaaagcg
ccacgcttcc 2580cgaagggaga aaggcggaca ggtatccggt aagcggcagg
gtcggaacag gagagcgcac 2640gagggagctt ccagggggaa acgcctggta
tctttatagt cctgtcgggt ttcgccacct 2700ctgacttgag cgtcgatttt
tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc 2760cagcaacgcg
gcctttttac ggttcctggc cttttgctgg ccttttgctc acatgtt
2817
45 4145 DNA Artificial Nucleotide sequence of QC300-1 45
aagggcgaat
tcgacccagc tttcttgtac aaagttggca ttataaaaaa taattgctca 60tcaatttgtt
gcaacgaaca ggtcactatc agtcaaaata aaatcattat ttgccatcca
120gctgatatcc cctatagtga gtcgtattac atggtcatag ctgtttcctg
gcagctctgg 180cccgtgtctc aaaatctctg atgttacatt gcacaagata
aaaatatatc atcatgcctc 240ctctagacca gccaggacag aaatgcctcg
acttcgctgc tgcccaaggt tgccgggtga 300cgcacaccgt ggaaacggat
gaaggcacga acccagtgga cataagcctg ttcggttcgt 360aagctgtaat
gcaagtagcg tatgcgctca cgcaactggt ccagaacctt gaccgaacgc
420agcggtggta acggcgcagt ggcggttttc atggcttgtt atgactgttt
ttttggggta 480cagtctatgc ctcgggcatc caagcagcaa gcgcgttacg
ccgtgggtcg atgtttgatg 540ttatggagca gcaacgatgt tacgcagcag
ggcagtcgcc ctaaaacaaa gttaaacatc 600atgagggaag cggtgatcgc
cgaagtatcg actcaactat cagaggtagt tggcgtcatc 660gagcgccatc
tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc
720ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag
gcttgatgaa 780acaacgcggc gagctttgat caacgacctt ttggaaactt
cggcttcccc tggagagagc 840gagattctcc gcgctgtaga agtcaccatt
gttgtgcacg acgacatcat tccgtggcgt 900tatccagcta agcgcgaact
gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 960atcttcgagc
cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa
1020catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt
tcctgaacag 1080gatctatttg aggcgctaaa tgaaacctta acgctatgga
actcgccgcc cgactgggct 1140ggcgatgagc gaaatgtagt gcttacgttg
tcccgcattt ggtacagcgc agtaaccggc 1200aaaatcgcgc cgaaggatgt
cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 1260cagcccgtca
tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc
1320tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat
caccaaggta 1380gtcggcaaat aaccctcgag ccacccatga ccaaaatccc
ttaacgtgag ttacgcgtcg 1440ttccactgag cgtcagaccc cgtagaaaag
atcaaaggat cttcttgaga tccttttttt 1500ctgcgcgtaa tctgctgctt
gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg 1560ccggatcaag
agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata
1620ccaaatactg tccttctagt gtagccgtag ttaggccacc acttcaagaa
ctctgtagca 1680ccgcctacat acctcgctct gctaatcctg ttaccagtgg
ctgctgccag tggcgataag 1740tcgtgtctta ccgggttgga ctcaagacga
tagttaccgg ataaggcgca gcggtcgggc 1800tgaacggggg gttcgtgcac
acagcccagc ttggagcgaa cgacctacac cgaactgaga 1860tacctacagc
gtgagcattg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg
1920tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc
agggggaaac 1980gcctggtatc tttatagtcc tgtcgggttt cgccacctct
gacttgagcg tcgatttttg 2040tgatgctcgt caggggggcg gagcctatgg
aaaaacgcca gcaacgcggc ctttttacgg 2100ttcctggcct tttgctggcc
ttttgctcac atgttctttc ctgcgttatc ccctgattct 2160gtggataacc
gtattaccgc ctttgagtga gctgataccg ctcgccgcag ccgaacgacc
2220gagcgcagcg agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa
accgcctctc 2280cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca
ggtttcccga ctggaaagcg 2340ggcagtgagc gcaacgcaat taatacgcgt
accgctagcc aggaagagtt tgtagaaacg 2400caaaaaggcc atccgtcagg
atggccttct gcttagtttg atgcctggca gtttatggcg 2460ggcgtcctgc
ccgccaccct ccgggccgtt gcttcacaac gttcaaatcc gctcccggcg
2520gatttgtcct actcaggaga gcgttcaccg acaaacaaca gataaaacga
aaggcccagt 2580cttccgactg agcctttcgt tttatttgat gcctggcagt
tccctactct cgcgttaacg 2640ctagcatgga tgttttccca gtcacgacgt
tgtaaaacga cggccagtct taagctcggg 2700ccccaaataa tgattttatt
ttgactgata gtgacctgtt cgttgcaaca aattgatgag 2760caatgctttt
ttataatgcc aactttgtac aaaaaagcag gctccgaatt cgcccttgga
2820gaacattatt aatgtgaaaa tcatgcaaac ttaaaaaaat catcaacaac
ataattttat 2880aattctaata aaatattttt ttcttttaat tctttaatca
atgtctaaca tttatctatt 2940atttatcaca tttgttattt aatgtttcta
tctttagagc tatcaaaaat ttaaaatggt 3000ggaaccttac tcattgggtt
gagttcacct aacttgttta ataaatagat caatctaatt 3060ctattcatct
cttagtaagt attaaaaatg ttggcccaac tctccatata ttggtgagtt
3120ataggagttt actcacttaa aatgataata aaaatatttg ttttaaaatc
attttttaaa 3180caaaaaaata atgtttcaga ttatttattc ttagatcata
acttacaagc aacatttcaa 3240tgatcaattc aattgtcaga atcaaaacca
attgaaagag acaaatattc atgctaatct 3300tcatcagaaa ctaaacattg
acataaagca atagtattgg aactacaagt tataattatg 3360tactttgtaa
tagtgtgaag aaaatcaaaa tacaaatagt aatcatcatg ataaatgcta
3420tctcaattta ttcaattata aaaatataga aataaaatgt gataaatgga
taacatgtgt 3480gctaatccag tccactacgc ccaccacaag ttcaacccaa
tggactggat catcttcttt 3540ttttcttact gatttctctc ttcttccatt
ctaatccatc ccaaaagtag atgtttacta 3600tttccccttt catagtttca
caagtgtgcg cagaggccaa actgaaagtg gtagtacatg 3660gtgtaatatt
aatcacagat gtgctctcat gaagtctgaa cttacagctc aagtaacaac
3720caacaagtaa aaagtacaga agatagcata aaaaatgaag gtagaacaaa
ttccaagttt 3780tctacatatt acggtgcata aatcaaccac gtgaaggctc
catttatttg ccgctataac 3840attggtgacc ctcttccaca aatagtaagt
aataaaacca agtacaaaaa aatgttcaac 3900taccaagtga tcacaatctt
catgcatctg agtcacacta ttgccctttg ctcatgaagt 3960acactttact
caccgccaaa gttcactcaa cactgtagaa caaaggaatc atataaataa
4020tgcatatctc tcccttaagc cttcaacaca tacaaaagtg acacaccaaa
tcaaagacac 4080ctgagccatt caattcccct cctttattgc tttcaagttt
caacactaat tttattatct 4140gaaac 4145
46 5286 DNA Artificial Nucleotide sequence of QC330 46
atcaacaagt ttgtacaaaa aagctgaacg agaaacgtaa
aatgatataa atatcaatat 60attaaattag attttgcata aaaaacagac tacataatac
tgtaaaacac aacatatcca 120gtcatattgg cggccgcatt aggcacccca
ggctttacac tttatgcttc cggctcgtat 180aatgtgtgga ttttgagtta
ggatccgtcg agattttcag gagctaagga agctaaaatg 240gagaaaaaaa
tcactggata taccaccgtt gatatatccc aatggcatcg taaagaacat
300tttgaggcat ttcagtcagt tgctcaatgt acctataacc agaccgttca
gctggatatt 360acggcctttt taaagaccgt aaagaaaaat aagcacaagt
tttatccggc ctttattcac 420attcttgccc gcctgatgaa tgctcatccg
gaattccgta tggcaatgaa agacggtgag 480ctggtgatat gggatagtgt
tcacccttgt tacaccgttt tccatgagca aactgaaacg 540ttttcatcgc
tctggagtga ataccacgac gatttccggc agtttctaca catatattcg
600caagatgtgg cgtgttacgg tgaaaacctg gcctatttcc ctaaagggtt
tattgagaat 660atgtttttcg tctcagccaa tccctgggtg agtttcacca
gttttgattt aaacgtggcc 720aatatggaca acttcttcgc ccccgttttc
accatgggca aatattatac gcaaggcgac 780aaggtgctga tgccgctggc
gattcaggtt catcatgccg tttgtgatgg cttccatgtc 840ggcagaatgc
ttaatgaatt acaacagtac tgcgatgagt ggcagggcgg ggcgtaaaga
900tctggatccg gcttactaaa agccagataa cagtatgcgt atttgcgcgc
tgatttttgc 960ggtataagaa tatatactga tatgtatacc cgaagtatgt
caaaaagagg tatgctatga 1020agcagcgtat tacagtgaca gttgacagcg
acagctatca gttgctcaag gcatatatga 1080tgtcaatatc tccggtctgg
taagcacaac catgcagaat gaagcccgtc gtctgcgtgc 1140cgaacgctgg
aaagcggaaa atcaggaagg gatggctgag gtcgcccggt ttattgaaat
1200gaacggctct tttgctgacg agaacagggg ctggtgaaat gcagtttaag
gtttacacct 1260ataaaagaga gagccgttat cgtctgtttg tggatgtaca
gagtgatatt attgacacgc 1320ccgggcgacg gatggtgatc cccctggcca
gtgcacgtct gctgtcagat aaagtctccc 1380gtgaacttta cccggtggtg
catatcgggg atgaaagctg gcgcatgatg accaccgata 1440tggccagtgt
gccggtctcc gttatcgggg aagaagtggc tgatctcagc caccgcgaaa
1500atgacatcaa aaacgccatt aacctgatgt tctggggaat ataaatgtca
ggctccctta 1560tacacagcca gtctgcaggt cgaccatagt gactggatat
gttgtgtttt acagtattat 1620gtagtctgtt ttttatgcaa aatctaattt
aatatattga tatttatatc attttacgtt 1680tctcgttcag ctttcttgta
caaagtggtt gatgggatcc atggcccaca gcaagcacgg 1740cctgaaggag
gagatgacca tgaagtacca catggagggc tgcgtgaacg gccacaagtt
1800cgtgatcacc
ggcgagggca tcggctaccc cttcaagggc aagcagacca tcaacctgtg
1860cgtgatcgag ggcggccccc tgcccttcag cgaggacatc ctgagcgccg
gcttcaagta 1920cggcgaccgg atcttcaccg agtaccccca ggacatcgtg
gactacttca agaacagctg 1980ccccgccggc tacacctggg gccggagctt
cctgttcgag gacggcgccg tgtgcatctg 2040taacgtggac atcaccgtga
gcgtgaagga gaactgcatc taccacaaga gcatcttcaa 2100cggcgtgaac
ttccccgccg acggccccgt gatgaagaag atgaccacca actgggaggc
2160cagctgcgag aagatcatgc ccgtgcctaa gcagggcatc ctgaagggcg
acgtgagcat 2220gtacctgctg ctgaaggacg gcggccggta ccggtgccag
ttcgacaccg tgtacaaggc 2280caagagcgtg cccagcaaga tgcccgagtg
gcacttcatc cagcacaagc tgctgcggga 2340ggaccggagc gacgccaaga
accagaagtg gcagctgacc gagcacgcca tcgccttccc 2400cagcgccctg
gcctgagagc tcgaatttcc ccgatcgttc aaacatttgg caataaagtt
2460tcttaagatt gaatcctgtt gccggtcttg cgatgattat catataattt
ctgttgaatt 2520acgttaagca tgtaataatt aacatgtaat gcatgacgtt
atttatgaga tgggttttta 2580tgattagagt cccgcaatta tacatttaat
acgcgataga aaacaaaata tagcgcgcaa 2640actaggataa attatcgcgc
gcggtgtcat ctatgttact agatcgggaa ttctagtggc 2700cggcccagct
gatatccatc acactggcgg ccgctcgagt tctatagtgt cacctaaatc
2760gtatgtgtat gatacataag gttatgtatt aattgtagcc gcgttctaac
gacaatatgt 2820ccatatggtg cactctcagt acaatctgct ctgatgccgc
atagttaagc cagccccgac 2880acccgccaac acccgctgac gcgccctgac
gggcttgtct gctcccggca tccgcttaca 2940gacaagctgt gaccgtctcc
gggagctgca tgtgtcagag gttttcaccg tcatcaccga 3000aacgcgcgag
acgaaagggc ctcgtgatac gcctattttt ataggttaat gtcatgacca
3060aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa
aagatcaaag 3120gatcttcttg agatcctttt tttctgcgcg taatctgctg
cttgcaaaca aaaaaaccac 3180cgctaccagc ggtggtttgt ttgccggatc
aagagctacc aactcttttt ccgaaggtaa 3240ctggcttcag cagagcgcag
ataccaaata ctgtccttct agtgtagccg tagttaggcc 3300accacttcaa
gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag
3360tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga
cgatagttac 3420cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg
cacacagccc agcttggagc 3480gaacgaccta caccgaactg agatacctac
agcgtgagca ttgagaaagc gccacgcttc 3540ccgaagggag aaaggcggac
aggtatccgg taagcggcag ggtcggaaca ggagagcgca 3600cgagggagct
tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc
3660tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta
tggaaaaacg 3720ccagcaacgc ggccttttta cggttcctgg ccttttgctg
gccttttgct cacatgttct 3780ttcctgcgtt atcccctgat tctgtggata
accgtattac cgcctttgag tgagctgata 3840ccgctcgccg cagccgaacg
accgagcgca gcgagtcagt gagcgaggaa gcggaagagc 3900gcccaatacg
caaaccgcct ctccccgcgc gttggccgat tcattaatgc aggttgatca
3960gatctcgatc ccgcgaaatt aatacgactc actataggga gaccacaacg
gtttccctct 4020agaaataatt ttgtttaact ttaagaagga gatataccca
tggaaaagcc tgaactcacc 4080gcgacgtctg tcgagaagtt tctgatcgaa
aagttcgaca gcgtctccga cctgatgcag 4140ctctcggagg gcgaagaatc
tcgtgctttc agcttcgatg taggagggcg tggatatgtc 4200ctgcgggtaa
atagctgcgc cgatggtttc tacaaagatc gttatgttta tcggcacttt
4260gcatcggccg cgctcccgat tccggaagtg cttgacattg gggaattcag
cgagagcctg 4320acctattgca tctcccgccg tgcacagggt gtcacgttgc
aagacctgcc tgaaaccgaa 4380ctgcccgctg ttctgcagcc ggtcgcggag
gctatggatg cgatcgctgc ggccgatctt 4440agccagacga gcgggttcgg
cccattcgga ccgcaaggaa tcggtcaata cactacatgg 4500cgtgatttca
tatgcgcgat tgctgatccc catgtgtatc actggcaaac tgtgatggac
4560gacaccgtca gtgcgtccgt cgcgcaggct ctcgatgagc tgatgctttg
ggccgaggac 4620tgccccgaag tccggcacct cgtgcacgcg gatttcggct
ccaacaatgt cctgacggac 4680aatggccgca taacagcggt cattgactgg
agcgaggcga tgttcgggga ttcccaatac 4740gaggtcgcca acatcttctt
ctggaggccg tggttggctt gtatggagca gcagacgcgc 4800tacttcgagc
ggaggcatcc ggagcttgca ggatcgccgc ggctccgggc gtatatgctc
4860cgcattggtc ttgaccaact ctatcagagc ttggttgacg gcaatttcga
tgatgcagct 4920tgggcgcagg gtcgatgcga cgcaatcgtc cgatccggag
ccgggactgt cgggcgtaca 4980caaatcgccc gcagaagcgc ggccgtctgg
accgatggct gtgtagaagt actcgccgat 5040agtggaaacc gacgccccag
cactcgtccg agggcaaagg aatagtgagg tacagcttgg 5100atcgatccgg
ctgctaacaa agcccgaaag gaagctgagt tggctgctgc caccgctgag
5160caataactag cataacccct tggggcctct aaacgggtct tgaggggttt
tttgctgaaa 5220ggaggaacta tatccggatg atcgtcgagg cctcacgtgt
taacaagctt gcatgcctgc 5280aggttt 5286
47 4986 DNA Artificial Nucleotide sequence of QC300-1Y 47
cttgtacaaa gtggttgatg ggatccatgg cccacagcaa
gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg tgaacggcca
caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc aagggcaagc
agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc cttcagcgag
gacatcctga gcgccggctt caagtacggc gaccggatct 240tcaccgagta
cccccaggac atcgtggact acttcaagaa cagctgcccc gccggctaca
300cctggggccg gagcttcctg ttcgaggacg gcgccgtgtg catctgtaac
gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc acaagagcat
cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg aagaagatga
ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt gcctaagcag
ggcatcctga agggcgacgt gagcatgtac ctgctgctga 540aggacggcgg
ccggtaccgg tgccagttcg acaccgtgta caaggccaag agcgtgccca
600gcaagatgcc cgagtggcac ttcatccagc acaagctgct gcgggaggac
cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc acgccatcgc
cttccccagc gccctggcct 720gagagctcga atttccccga tcgttcaaac
atttggcaat aaagtttctt aagattgaat 780cctgttgccg gtcttgcgat
gattatcata taatttctgt tgaattacgt taagcatgta 840ataattaaca
tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg
900caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta
ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat cgggaattct
agtggccggc ccagctgata 1020tccatcacac tggcggccgc tcgagttcta
tagtgtcacc taaatcgtat gtgtatgata 1080cataaggtta tgtattaatt
gtagccgcgt tctaacgaca atatgtccat atggtgcact 1140ctcagtacaa
tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc
1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca
agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt tcaccgtcat
caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct atttttatag
gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt tccactgagc
gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 1440cctttttttc
tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg
1500gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg
cttcagcaga 1560gcgcagatac caaatactgt ccttctagtg tagccgtagt
taggccacca cttcaagaac 1620tctgtagcac cgcctacata cctcgctctg
ctaatcctgt taccagtggc tgctgccagt 1680ggcgataagt cgtgtcttac
cgggttggac tcaagacgat agttaccgga taaggcgcag 1740cggtcgggct
gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc
1800gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga
agggagaaag 1860gcggacaggt atccggtaag cggcagggtc ggaacaggag
agcgcacgag ggagcttcca 1920gggggaaacg cctggtatct ttatagtcct
gtcgggtttc gccacctctg acttgagcgt 1980cgatttttgt gatgctcgtc
aggggggcgg agcctatgga aaaacgccag caacgcggcc 2040tttttacggt
tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc
2100cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc
tcgccgcagc 2160cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg
aagagcgccc aatacgcaaa 2220ccgcctctcc ccgcgcgttg gccgattcat
taatgcaggt tgatcagatc tcgatcccgc 2280gaaattaata cgactcacta
tagggagacc acaacggttt ccctctagaa ataattttgt 2340ttaactttaa
gaaggagata tacccatgga aaagcctgaa ctcaccgcga cgtctgtcga
2400gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct
cggagggcga 2460agaatctcgt gctttcagct tcgatgtagg agggcgtgga
tatgtcctgc gggtaaatag 2520ctgcgccgat ggtttctaca aagatcgtta
tgtttatcgg cactttgcat cggccgcgct 2580cccgattccg gaagtgcttg
acattgggga attcagcgag agcctgacct attgcatctc 2640ccgccgtgca
cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct
2700gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc
agacgagcgg 2760gttcggccca ttcggaccgc aaggaatcgg tcaatacact
acatggcgtg atttcatatg 2820cgcgattgct gatccccatg tgtatcactg
gcaaactgtg atggacgaca ccgtcagtgc 2880gtccgtcgcg caggctctcg
atgagctgat gctttgggcc gaggactgcc ccgaagtccg 2940gcacctcgtg
cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac
3000agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg
tcgccaacat 3060cttcttctgg aggccgtggt tggcttgtat ggagcagcag
acgcgctact tcgagcggag 3120gcatccggag cttgcaggat cgccgcggct
ccgggcgtat atgctccgca ttggtcttga 3180ccaactctat cagagcttgg
ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg 3240atgcgacgca
atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag
3300aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg
gaaaccgacg 3360ccccagcact cgtccgaggg caaaggaata gtgaggtaca
gcttggatcg atccggctgc 3420taacaaagcc cgaaaggaag ctgagttggc
tgctgccacc gctgagcaat aactagcata 3480accccttggg gcctctaaac
gggtcttgag gggttttttg ctgaaaggag gaactatatc 3540cggatgatcg
tcgaggcctc acgtgttaac aagcttgcat gcctgcaggt ttatcaacaa
3600gtttgtacaa aaaagcaggc tccgaattcg cccttggaga acattattaa
tgtgaaaatc 3660atgcaaactt aaaaaaatca tcaacaacat aattttataa
ttctaataaa atattttttt 3720cttttaattc tttaatcaat gtctaacatt
tatctattat ttatcacatt tgttatttaa 3780tgtttctatc tttagagcta
tcaaaaattt aaaatggtgg aaccttactc attgggttga 3840gttcacctaa
cttgtttaat aaatagatca atctaattct attcatctct tagtaagtat
3900taaaaatgtt ggcccaactc tccatatatt ggtgagttat aggagtttac
tcacttaaaa 3960tgataataaa aatatttgtt ttaaaatcat tttttaaaca
aaaaaataat gtttcagatt 4020atttattctt agatcataac ttacaagcaa
catttcaatg atcaattcaa ttgtcagaat 4080caaaaccaat tgaaagagac
aaatattcat gctaatcttc atcagaaact aaacattgac 4140ataaagcaat
agtattggaa ctacaagtta taattatgta ctttgtaata gtgtgaagaa
4200aatcaaaata caaatagtaa tcatcatgat aaatgctatc tcaatttatt
caattataaa 4260aatatagaaa taaaatgtga taaatggata acatgtgtgc
taatccagtc cactacgccc 4320accacaagtt caacccaatg gactggatca
tcttcttttt ttcttactga tttctctctt 4380cttccattct aatccatccc
aaaagtagat gtttactatt tcccctttca tagtttcaca 4440agtgtgcgca
gaggccaaac tgaaagtggt agtacatggt gtaatattaa tcacagatgt
4500gctctcatga agtctgaact tacagctcaa gtaacaacca acaagtaaaa
agtacagaag 4560atagcataaa aaatgaaggt agaacaaatt ccaagttttc
tacatattac ggtgcataaa 4620tcaaccacgt gaaggctcca tttatttgcc
gctataacat tggtgaccct cttccacaaa 4680tagtaagtaa taaaaccaag
tacaaaaaaa tgttcaacta ccaagtgatc acaatcttca 4740tgcatctgag
tcacactatt gccctttgct catgaagtac actttactca ccgccaaagt
4800tcactcaaca ctgtagaaca aaggaatcat ataaataatg catatctctc
ccttaagcct 4860tcaacacata caaaagtgac acaccaaatc aaagacacct
gagccattca attcccctcc 4920tttattgctt tcaagtttca acactaattt
tattatctga aacaagggcg aattcgaccc 4980agcttt 4986
SEQ ID NO: 48; length: 4792; type: DNA; organism: Artificial; description: Nucleotide sequence of QC300-2Y
cttgtacaaa
gtggttgatg ggatccatgg cccacagcaa gcacggcctg aaggaggaga 60tgaccatgaa
gtaccacatg gagggctgcg tgaacggcca caagttcgtg atcaccggcg
120agggcatcgg ctaccccttc aagggcaagc agaccatcaa cctgtgcgtg
atcgagggcg 180gccccctgcc cttcagcgag gacatcctga gcgccggctt
caagtacggc gaccggatct 240tcaccgagta cccccaggac atcgtggact
acttcaagaa cagctgcccc gccggctaca 300cctggggccg gagcttcctg
ttcgaggacg gcgccgtgtg catctgtaac gtggacatca 360ccgtgagcgt
gaaggagaac tgcatctacc acaagagcat cttcaacggc gtgaacttcc
420ccgccgacgg ccccgtgatg aagaagatga ccaccaactg ggaggccagc
tgcgagaaga 480tcatgcccgt gcctaagcag ggcatcctga agggcgacgt
gagcatgtac ctgctgctga 540aggacggcgg ccggtaccgg tgccagttcg
acaccgtgta caaggccaag agcgtgccca 600gcaagatgcc cgagtggcac
ttcatccagc acaagctgct gcgggaggac cggagcgacg 660ccaagaacca
gaagtggcag ctgaccgagc acgccatcgc cttccccagc gccctggcct
720gagagctcga atttccccga tcgttcaaac atttggcaat aaagtttctt
aagattgaat 780cctgttgccg gtcttgcgat gattatcata taatttctgt
tgaattacgt taagcatgta 840ataattaaca tgtaatgcat gacgttattt
atgagatggg tttttatgat tagagtcccg 900caattataca tttaatacgc
gatagaaaac aaaatatagc gcgcaaacta ggataaatta 960tcgcgcgcgg
tgtcatctat gttactagat cgggaattct agtggccggc ccagctgata
1020tccatcacac tggcggccgc tcgagttcta tagtgtcacc taaatcgtat
gtgtatgata 1080cataaggtta tgtattaatt gtagccgcgt tctaacgaca
atatgtccat atggtgcact 1140ctcagtacaa tctgctctga tgccgcatag
ttaagccagc cccgacaccc gccaacaccc 1200gctgacgcgc cctgacgggc
ttgtctgctc ccggcatccg cttacagaca agctgtgacc 1260gtctccggga
gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgagacga
1320aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat
cccttaacgt 1380gagttttcgt tccactgagc gtcagacccc gtagaaaaga
tcaaaggatc ttcttgagat 1440cctttttttc tgcgcgtaat ctgctgcttg
caaacaaaaa aaccaccgct accagcggtg 1500gtttgtttgc cggatcaaga
gctaccaact ctttttccga aggtaactgg cttcagcaga 1560gcgcagatac
caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac
1620tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc
tgctgccagt 1680ggcgataagt cgtgtcttac cgggttggac tcaagacgat
agttaccgga taaggcgcag 1740cggtcgggct gaacgggggg ttcgtgcaca
cagcccagct tggagcgaac gacctacacc 1800gaactgagat acctacagcg
tgagcattga gaaagcgcca cgcttcccga agggagaaag 1860gcggacaggt
atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca
1920gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg
acttgagcgt 1980cgatttttgt gatgctcgtc aggggggcgg agcctatgga
aaaacgccag caacgcggcc 2040tttttacggt tcctggcctt ttgctggcct
tttgctcaca tgttctttcc tgcgttatcc 2100cctgattctg tggataaccg
tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 2160cgaacgaccg
agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa
2220ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc
tcgatcccgc 2280gaaattaata cgactcacta tagggagacc acaacggttt
ccctctagaa ataattttgt 2340ttaactttaa gaaggagata tacccatgga
aaagcctgaa ctcaccgcga cgtctgtcga 2400gaagtttctg atcgaaaagt
tcgacagcgt ctccgacctg atgcagctct cggagggcga 2460agaatctcgt
gctttcagct tcgatgtagg agggcgtgga tatgtcctgc gggtaaatag
2520ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat
cggccgcgct 2580cccgattccg gaagtgcttg acattgggga attcagcgag
agcctgacct attgcatctc 2640ccgccgtgca cagggtgtca cgttgcaaga
cctgcctgaa accgaactgc ccgctgttct 2700gcagccggtc gcggaggcta
tggatgcgat cgctgcggcc gatcttagcc agacgagcgg 2760gttcggccca
ttcggaccgc aaggaatcgg tcaatacact acatggcgtg atttcatatg
2820cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca
ccgtcagtgc 2880gtccgtcgcg caggctctcg atgagctgat gctttgggcc
gaggactgcc ccgaagtccg 2940gcacctcgtg cacgcggatt tcggctccaa
caatgtcctg acggacaatg gccgcataac 3000agcggtcatt gactggagcg
aggcgatgtt cggggattcc caatacgagg tcgccaacat 3060cttcttctgg
aggccgtggt tggcttgtat ggagcagcag acgcgctact tcgagcggag
3120gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca
ttggtcttga 3180ccaactctat cagagcttgg ttgacggcaa tttcgatgat
gcagcttggg cgcagggtcg 3240atgcgacgca atcgtccgat ccggagccgg
gactgtcggg cgtacacaaa tcgcccgcag 3300aagcgcggcc gtctggaccg
atggctgtgt agaagtactc gccgatagtg gaaaccgacg 3360ccccagcact
cgtccgaggg caaaggaata gtgaggtaca gcttggatcg atccggctgc
3420taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcaat
aactagcata 3480accccttggg gcctctaaac gggtcttgag gggttttttg
ctgaaaggag gaactatatc 3540cggatgatcg tcgaggcctc acgtgttaac
aagcttgcat gcctgcaggt ttatcaacaa 3600gtttgtacaa aaaagcaggc
tccgaattcg cccttcattg ggttgagttc acctaacttg 3660tttaataaat
agatcaatct aattctattc atctcttagt aagtattaaa aatgttggcc
3720caactctcca tatattggtg agttatagga gtttactcac ttaaaatgat
aataaaaata 3780tttgttttaa aatcattttt taaacaaaaa aataatgttt
cagattattt attcttagat 3840cataacttac aagcaacatt tcaatgatca
attcaattgt cagaatcaaa accaattgaa 3900agagacaaat attcatgcta
atcttcatca gaaactaaac attgacataa agcaatagta 3960ttggaactac
aagttataat tatgtacttt gtaatagtgt gaagaaaatc aaaatacaaa
4020tagtaatcat catgataaat gctatctcaa tttattcaat tataaaaata
tagaaataaa 4080atgtgataaa tggataacat gtgtgctaat ccagtccact
acgcccacca caagttcaac 4140ccaatggact ggatcatctt ctttttttct
tactgatttc tctcttcttc cattctaatc 4200catcccaaaa gtagatgttt
actatttccc ctttcatagt ttcacaagtg tgcgcagagg 4260ccaaactgaa
agtggtagta catggtgtaa tattaatcac agatgtgctc tcatgaagtc
4320tgaacttaca gctcaagtaa caaccaacaa gtaaaaagta cagaagatag
cataaaaaat 4380gaaggtagaa caaattccaa gttttctaca tattacggtg
cataaatcaa ccacgtgaag 4440gctccattta tttgccgcta taacattggt
gaccctcttc cacaaatagt aagtaataaa 4500accaagtaca aaaaaatgtt
caactaccaa gtgatcacaa tcttcatgca tctgagtcac 4560actattgccc
tttgctcatg aagtacactt tactcaccgc caaagttcac tcaacactgt
4620agaacaaagg aatcatataa ataatgcata tctctccctt aagccttcaa
cacatacaaa 4680agtgacacac caaatcaaag acacctgagc cattcaattc
ccctccttta ttgctttcaa 4740gtttcaacac taattttatt atctgaaaca
agggcgaatt cgacccagct tt 4792
SEQ ID NO: 49; length: 4590; type: DNA; organism: Artificial; description: Nucleotide sequence of QC300-3Y
cttgtacaaa gtggttgatg ggatccatgg cccacagcaa
gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg tgaacggcca
caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc aagggcaagc
agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc cttcagcgag
gacatcctga gcgccggctt caagtacggc gaccggatct 240tcaccgagta
cccccaggac atcgtggact acttcaagaa cagctgcccc gccggctaca
300cctggggccg gagcttcctg ttcgaggacg gcgccgtgtg catctgtaac
gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc acaagagcat
cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg aagaagatga
ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt gcctaagcag
ggcatcctga agggcgacgt gagcatgtac ctgctgctga 540aggacggcgg
ccggtaccgg tgccagttcg acaccgtgta caaggccaag agcgtgccca
600gcaagatgcc cgagtggcac ttcatccagc acaagctgct gcgggaggac
cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc acgccatcgc
cttccccagc gccctggcct 720gagagctcga atttccccga tcgttcaaac
atttggcaat aaagtttctt aagattgaat 780cctgttgccg gtcttgcgat
gattatcata taatttctgt tgaattacgt taagcatgta 840ataattaaca
tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg
900caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta
ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat cgggaattct
agtggccggc ccagctgata 1020tccatcacac tggcggccgc tcgagttcta
tagtgtcacc taaatcgtat gtgtatgata 1080cataaggtta tgtattaatt
gtagccgcgt tctaacgaca atatgtccat atggtgcact 1140ctcagtacaa
tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc
1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca
agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt tcaccgtcat
caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct atttttatag
gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt tccactgagc
gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 1440cctttttttc
tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg
1500gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga
1560gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca
cttcaagaac 1620tctgtagcac cgcctacata cctcgctctg ctaatcctgt
taccagtggc tgctgccagt 1680ggcgataagt cgtgtcttac cgggttggac
tcaagacgat agttaccgga taaggcgcag 1740cggtcgggct gaacgggggg
ttcgtgcaca cagcccagct tggagcgaac gacctacacc 1800gaactgagat
acctacagcg tgagcattga gaaagcgcca cgcttcccga agggagaaag
1860gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag
ggagcttcca 1920gggggaaacg cctggtatct ttatagtcct gtcgggtttc
gccacctctg acttgagcgt 1980cgatttttgt gatgctcgtc aggggggcgg
agcctatgga aaaacgccag caacgcggcc 2040tttttacggt tcctggcctt
ttgctggcct tttgctcaca tgttctttcc tgcgttatcc 2100cctgattctg
tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc
2160cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc
aatacgcaaa 2220ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt
tgatcagatc tcgatcccgc 2280gaaattaata cgactcacta tagggagacc
acaacggttt ccctctagaa ataattttgt 2340ttaactttaa gaaggagata
tacccatgga aaagcctgaa ctcaccgcga cgtctgtcga 2400gaagtttctg
atcgaaaagt tcgacagcgt ctccgacctg atgcagctct cggagggcga
2460agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc
gggtaaatag 2520ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg
cactttgcat cggccgcgct 2580cccgattccg gaagtgcttg acattgggga
attcagcgag agcctgacct attgcatctc 2640ccgccgtgca cagggtgtca
cgttgcaaga cctgcctgaa accgaactgc ccgctgttct 2700gcagccggtc
gcggaggcta tggatgcgat cgctgcggcc gatcttagcc agacgagcgg
2760gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg
atttcatatg 2820cgcgattgct gatccccatg tgtatcactg gcaaactgtg
atggacgaca ccgtcagtgc 2880gtccgtcgcg caggctctcg atgagctgat
gctttgggcc gaggactgcc ccgaagtccg 2940gcacctcgtg cacgcggatt
tcggctccaa caatgtcctg acggacaatg gccgcataac 3000agcggtcatt
gactggagcg aggcgatgtt cggggattcc caatacgagg tcgccaacat
3060cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact
tcgagcggag 3120gcatccggag cttgcaggat cgccgcggct ccgggcgtat
atgctccgca ttggtcttga 3180ccaactctat cagagcttgg ttgacggcaa
tttcgatgat gcagcttggg cgcagggtcg 3240atgcgacgca atcgtccgat
ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag 3300aagcgcggcc
gtctggaccg atggctgtgt agaagtactc gccgatagtg gaaaccgacg
3360ccccagcact cgtccgaggg caaaggaata gtgaggtaca gcttggatcg
atccggctgc 3420taacaaagcc cgaaaggaag ctgagttggc tgctgccacc
gctgagcaat aactagcata 3480accccttggg gcctctaaac gggtcttgag
gggttttttg ctgaaaggag gaactatatc 3540cggatgatcg tcgaggcctc
acgtgttaac aagcttgcat gcctgcaggt ttatcaacaa 3600gtttgtacaa
aaaagcaggc tccgaattcg cccttgatca taacttacaa gcaacatttc
3660aatgatcaat tcaattgtca gaatcaaaac caattgaaag agacaaatat
tcatgctaat 3720cttcatcaga aactaaacat tgacataaag caatagtatt
ggaactacaa gttataatta 3780tgtactttgt aatagtgtga agaaaatcaa
aatacaaata gtaatcatca tgataaatgc 3840tatctcaatt tattcaatta
taaaaatata gaaataaaat gtgataaatg gataacatgt 3900gtgctaatcc
agtccactac gcccaccaca agttcaaccc aatggactgg atcatcttct
3960ttttttctta ctgatttctc tcttcttcca ttctaatcca tcccaaaagt
agatgtttac 4020tatttcccct ttcatagttt cacaagtgtg cgcagaggcc
aaactgaaag tggtagtaca 4080tggtgtaata ttaatcacag atgtgctctc
atgaagtctg aacttacagc tcaagtaaca 4140accaacaagt aaaaagtaca
gaagatagca taaaaaatga aggtagaaca aattccaagt 4200tttctacata
ttacggtgca taaatcaacc acgtgaaggc tccatttatt tgccgctata
4260acattggtga ccctcttcca caaatagtaa gtaataaaac caagtacaaa
aaaatgttca 4320actaccaagt gatcacaatc ttcatgcatc tgagtcacac
tattgccctt tgctcatgaa 4380gtacacttta ctcaccgcca aagttcactc
aacactgtag aacaaaggaa tcatataaat 4440aatgcatatc tctcccttaa
gccttcaaca catacaaaag tgacacacca aatcaaagac 4500acctgagcca
ttcaattccc ctcctttatt gctttcaagt ttcaacacta attttattat
4560ctgaaacaag ggcgaattcg acccagcttt 4590
SEQ ID NO: 50; length: 4343; type: DNA; organism: Artificial; description: Nucleotide sequence of QC300-4Y
cttgtacaaa
gtggttgatg ggatccatgg cccacagcaa gcacggcctg aaggaggaga 60tgaccatgaa
gtaccacatg gagggctgcg tgaacggcca caagttcgtg atcaccggcg
120agggcatcgg ctaccccttc aagggcaagc agaccatcaa cctgtgcgtg
atcgagggcg 180gccccctgcc cttcagcgag gacatcctga gcgccggctt
caagtacggc gaccggatct 240tcaccgagta cccccaggac atcgtggact
acttcaagaa cagctgcccc gccggctaca 300cctggggccg gagcttcctg
ttcgaggacg gcgccgtgtg catctgtaac gtggacatca 360ccgtgagcgt
gaaggagaac tgcatctacc acaagagcat cttcaacggc gtgaacttcc
420ccgccgacgg ccccgtgatg aagaagatga ccaccaactg ggaggccagc
tgcgagaaga 480tcatgcccgt gcctaagcag ggcatcctga agggcgacgt
gagcatgtac ctgctgctga 540aggacggcgg ccggtaccgg tgccagttcg
acaccgtgta caaggccaag agcgtgccca 600gcaagatgcc cgagtggcac
ttcatccagc acaagctgct gcgggaggac cggagcgacg 660ccaagaacca
gaagtggcag ctgaccgagc acgccatcgc cttccccagc gccctggcct
720gagagctcga atttccccga tcgttcaaac atttggcaat aaagtttctt
aagattgaat 780cctgttgccg gtcttgcgat gattatcata taatttctgt
tgaattacgt taagcatgta 840ataattaaca tgtaatgcat gacgttattt
atgagatggg tttttatgat tagagtcccg 900caattataca tttaatacgc
gatagaaaac aaaatatagc gcgcaaacta ggataaatta 960tcgcgcgcgg
tgtcatctat gttactagat cgggaattct agtggccggc ccagctgata
1020tccatcacac tggcggccgc tcgagttcta tagtgtcacc taaatcgtat
gtgtatgata 1080cataaggtta tgtattaatt gtagccgcgt tctaacgaca
atatgtccat atggtgcact 1140ctcagtacaa tctgctctga tgccgcatag
ttaagccagc cccgacaccc gccaacaccc 1200gctgacgcgc cctgacgggc
ttgtctgctc ccggcatccg cttacagaca agctgtgacc 1260gtctccggga
gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgagacga
1320aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat
cccttaacgt 1380gagttttcgt tccactgagc gtcagacccc gtagaaaaga
tcaaaggatc ttcttgagat 1440cctttttttc tgcgcgtaat ctgctgcttg
caaacaaaaa aaccaccgct accagcggtg 1500gtttgtttgc cggatcaaga
gctaccaact ctttttccga aggtaactgg cttcagcaga 1560gcgcagatac
caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac
1620tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc
tgctgccagt 1680ggcgataagt cgtgtcttac cgggttggac tcaagacgat
agttaccgga taaggcgcag 1740cggtcgggct gaacgggggg ttcgtgcaca
cagcccagct tggagcgaac gacctacacc 1800gaactgagat acctacagcg
tgagcattga gaaagcgcca cgcttcccga agggagaaag 1860gcggacaggt
atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca
1920gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg
acttgagcgt 1980cgatttttgt gatgctcgtc aggggggcgg agcctatgga
aaaacgccag caacgcggcc 2040tttttacggt tcctggcctt ttgctggcct
tttgctcaca tgttctttcc tgcgttatcc 2100cctgattctg tggataaccg
tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 2160cgaacgaccg
agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa
2220ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc
tcgatcccgc 2280gaaattaata cgactcacta tagggagacc acaacggttt
ccctctagaa ataattttgt 2340ttaactttaa gaaggagata tacccatgga
aaagcctgaa ctcaccgcga cgtctgtcga 2400gaagtttctg atcgaaaagt
tcgacagcgt ctccgacctg atgcagctct cggagggcga 2460agaatctcgt
gctttcagct tcgatgtagg agggcgtgga tatgtcctgc gggtaaatag
2520ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat
cggccgcgct 2580cccgattccg gaagtgcttg acattgggga attcagcgag
agcctgacct attgcatctc 2640ccgccgtgca cagggtgtca cgttgcaaga
cctgcctgaa accgaactgc ccgctgttct 2700gcagccggtc gcggaggcta
tggatgcgat cgctgcggcc gatcttagcc agacgagcgg 2760gttcggccca
ttcggaccgc aaggaatcgg tcaatacact acatggcgtg atttcatatg
2820cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca
ccgtcagtgc 2880gtccgtcgcg caggctctcg atgagctgat gctttgggcc
gaggactgcc ccgaagtccg 2940gcacctcgtg cacgcggatt tcggctccaa
caatgtcctg acggacaatg gccgcataac 3000agcggtcatt gactggagcg
aggcgatgtt cggggattcc caatacgagg tcgccaacat 3060cttcttctgg
aggccgtggt tggcttgtat ggagcagcag acgcgctact tcgagcggag
3120gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca
ttggtcttga 3180ccaactctat cagagcttgg ttgacggcaa tttcgatgat
gcagcttggg cgcagggtcg 3240atgcgacgca atcgtccgat ccggagccgg
gactgtcggg cgtacacaaa tcgcccgcag 3300aagcgcggcc gtctggaccg
atggctgtgt agaagtactc gccgatagtg gaaaccgacg 3360ccccagcact
cgtccgaggg caaaggaata gtgaggtaca gcttggatcg atccggctgc
3420taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcaat
aactagcata 3480accccttggg gcctctaaac gggtcttgag gggttttttg
ctgaaaggag gaactatatc 3540cggatgatcg tcgaggcctc acgtgttaac
aagcttgcat gcctgcaggt ttatcaacaa 3600gtttgtacaa aaaagcaggc
tccgaattcg cccttgataa atggataaca tgtgtgctaa 3660tccagtccac
tacgcccacc acaagttcaa cccaatggac tggatcatct tctttttttc
3720ttactgattt ctctcttctt ccattctaat ccatcccaaa agtagatgtt
tactatttcc 3780cctttcatag tttcacaagt gtgcgcagag gccaaactga
aagtggtagt acatggtgta 3840atattaatca cagatgtgct ctcatgaagt
ctgaacttac agctcaagta acaaccaaca 3900agtaaaaagt acagaagata
gcataaaaaa tgaaggtaga acaaattcca agttttctac 3960atattacggt
gcataaatca accacgtgaa ggctccattt atttgccgct ataacattgg
4020tgaccctctt ccacaaatag taagtaataa aaccaagtac aaaaaaatgt
tcaactacca 4080agtgatcaca atcttcatgc atctgagtca cactattgcc
ctttgctcat gaagtacact 4140ttactcaccg ccaaagttca ctcaacactg
tagaacaaag gaatcatata aataatgcat 4200atctctccct taagccttca
acacatacaa aagtgacaca ccaaatcaaa gacacctgag 4260ccattcaatt
cccctccttt attgctttca agtttcaaca ctaattttat tatctgaaac
4320aagggcgaat tcgacccagc ttt 4343
SEQ ID NO: 51; length: 4130; type: DNA; organism: Artificial; description: Nucleotide sequence of QC300-5Y
cttgtacaaa gtggttgatg ggatccatgg cccacagcaa
gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg tgaacggcca
caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc aagggcaagc
agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc cttcagcgag
gacatcctga gcgccggctt caagtacggc gaccggatct 240tcaccgagta
cccccaggac atcgtggact acttcaagaa cagctgcccc gccggctaca
300cctggggccg gagcttcctg ttcgaggacg gcgccgtgtg catctgtaac
gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc acaagagcat
cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg aagaagatga
ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt gcctaagcag
ggcatcctga agggcgacgt gagcatgtac ctgctgctga 540aggacggcgg
ccggtaccgg tgccagttcg acaccgtgta caaggccaag agcgtgccca
600gcaagatgcc cgagtggcac ttcatccagc acaagctgct gcgggaggac
cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc acgccatcgc
cttccccagc gccctggcct 720gagagctcga atttccccga tcgttcaaac
atttggcaat aaagtttctt aagattgaat 780cctgttgccg gtcttgcgat
gattatcata taatttctgt tgaattacgt taagcatgta 840ataattaaca
tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg
900caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta
ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat cgggaattct
agtggccggc ccagctgata 1020tccatcacac tggcggccgc tcgagttcta
tagtgtcacc taaatcgtat gtgtatgata 1080cataaggtta tgtattaatt
gtagccgcgt tctaacgaca atatgtccat atggtgcact 1140ctcagtacaa
tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc
1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca
agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt tcaccgtcat
caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct atttttatag
gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt tccactgagc
gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 1440cctttttttc
tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg
1500gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg
cttcagcaga 1560gcgcagatac caaatactgt ccttctagtg tagccgtagt
taggccacca cttcaagaac 1620tctgtagcac cgcctacata cctcgctctg
ctaatcctgt taccagtggc tgctgccagt 1680ggcgataagt cgtgtcttac
cgggttggac tcaagacgat agttaccgga taaggcgcag 1740cggtcgggct
gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc
1800gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga
agggagaaag 1860gcggacaggt atccggtaag cggcagggtc ggaacaggag
agcgcacgag ggagcttcca 1920gggggaaacg cctggtatct ttatagtcct
gtcgggtttc gccacctctg acttgagcgt 1980cgatttttgt gatgctcgtc
aggggggcgg agcctatgga aaaacgccag caacgcggcc 2040tttttacggt
tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc
2100cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc
tcgccgcagc 2160cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg
aagagcgccc aatacgcaaa 2220ccgcctctcc ccgcgcgttg gccgattcat
taatgcaggt tgatcagatc tcgatcccgc 2280gaaattaata cgactcacta
tagggagacc acaacggttt ccctctagaa ataattttgt 2340ttaactttaa
gaaggagata tacccatgga aaagcctgaa ctcaccgcga cgtctgtcga
2400gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct
cggagggcga 2460agaatctcgt gctttcagct tcgatgtagg agggcgtgga
tatgtcctgc gggtaaatag 2520ctgcgccgat ggtttctaca aagatcgtta
tgtttatcgg cactttgcat cggccgcgct 2580cccgattccg gaagtgcttg
acattgggga attcagcgag agcctgacct attgcatctc 2640ccgccgtgca
cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct
2700gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc
agacgagcgg 2760gttcggccca ttcggaccgc aaggaatcgg tcaatacact
acatggcgtg atttcatatg 2820cgcgattgct gatccccatg tgtatcactg
gcaaactgtg atggacgaca ccgtcagtgc 2880gtccgtcgcg caggctctcg
atgagctgat gctttgggcc gaggactgcc ccgaagtccg 2940gcacctcgtg
cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac
3000agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg
tcgccaacat 3060cttcttctgg aggccgtggt tggcttgtat ggagcagcag
acgcgctact tcgagcggag 3120gcatccggag cttgcaggat cgccgcggct
ccgggcgtat atgctccgca ttggtcttga 3180ccaactctat cagagcttgg
ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg 3240atgcgacgca
atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag
3300aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg
gaaaccgacg 3360ccccagcact cgtccgaggg caaaggaata gtgaggtaca
gcttggatcg atccggctgc 3420taacaaagcc cgaaaggaag ctgagttggc
tgctgccacc gctgagcaat aactagcata 3480accccttggg gcctctaaac
gggtcttgag gggttttttg ctgaaaggag gaactatatc 3540cggatgatcg
tcgaggcctc acgtgttaac aagcttgcat gcctgcaggt ttatcaacaa
3600gtttgtacaa aaaagcaggc tccgaattcg cccttcacag atgtgctctc
atgaagtctg 3660aacttacagc tcaagtaaca accaacaagt aaaaagtaca
gaagatagca taaaaaatga 3720aggtagaaca aattccaagt tttctacata
ttacggtgca taaatcaacc acgtgaaggc 3780tccatttatt tgccgctata
acattggtga ccctcttcca caaatagtaa gtaataaaac 3840caagtacaaa
aaaatgttca actaccaagt gatcacaatc ttcatgcatc tgagtcacac
3900tattgccctt tgctcatgaa gtacacttta ctcaccgcca aagttcactc
aacactgtag 3960aacaaaggaa tcatataaat aatgcatatc tctcccttaa
gccttcaaca catacaaaag 4020tgacacacca aatcaaagac acctgagcca
ttcaattccc ctcctttatt gctttcaagt 4080ttcaacacta attttattat
ctgaaacaag ggcgaattcg acccagcttt 4130
SEQ ID NO: 52; length: 3895; type: DNA; organism: Artificial; description: Nucleotide sequence of QC300-6Y
cttgtacaaa gtggttgatg ggatccatgg cccacagcaa
gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg tgaacggcca
caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc aagggcaagc
agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc cttcagcgag
gacatcctga gcgccggctt caagtacggc gaccggatct 240tcaccgagta
cccccaggac atcgtggact acttcaagaa cagctgcccc gccggctaca
300cctggggccg gagcttcctg ttcgaggacg gcgccgtgtg catctgtaac
gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc acaagagcat
cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg aagaagatga
ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt gcctaagcag
ggcatcctga agggcgacgt gagcatgtac ctgctgctga 540aggacggcgg
ccggtaccgg tgccagttcg acaccgtgta caaggccaag agcgtgccca
600gcaagatgcc cgagtggcac ttcatccagc acaagctgct gcgggaggac
cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc acgccatcgc
cttccccagc gccctggcct 720gagagctcga atttccccga tcgttcaaac
atttggcaat aaagtttctt aagattgaat 780cctgttgccg gtcttgcgat
gattatcata taatttctgt tgaattacgt taagcatgta 840ataattaaca
tgtaatgcat gacgttattt atgagatggg tttttatgat tagagtcccg
900caattataca tttaatacgc gatagaaaac aaaatatagc gcgcaaacta
ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat cgggaattct
agtggccggc ccagctgata 1020tccatcacac tggcggccgc tcgagttcta
tagtgtcacc taaatcgtat gtgtatgata 1080cataaggtta tgtattaatt
gtagccgcgt tctaacgaca atatgtccat atggtgcact 1140ctcagtacaa
tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc
1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca
agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt tcaccgtcat
caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct atttttatag
gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt tccactgagc
gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 1440cctttttttc
tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg
1500gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg
cttcagcaga 1560gcgcagatac caaatactgt ccttctagtg tagccgtagt
taggccacca cttcaagaac 1620tctgtagcac cgcctacata cctcgctctg
ctaatcctgt taccagtggc tgctgccagt 1680ggcgataagt cgtgtcttac
cgggttggac tcaagacgat agttaccgga taaggcgcag 1740cggtcgggct
gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc
1800gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga
agggagaaag 1860gcggacaggt atccggtaag cggcagggtc ggaacaggag
agcgcacgag ggagcttcca 1920gggggaaacg cctggtatct ttatagtcct
gtcgggtttc gccacctctg acttgagcgt 1980cgatttttgt gatgctcgtc
aggggggcgg agcctatgga aaaacgccag caacgcggcc 2040tttttacggt
tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc
2100cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc
tcgccgcagc 2160cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg
aagagcgccc aatacgcaaa 2220ccgcctctcc ccgcgcgttg gccgattcat
taatgcaggt tgatcagatc tcgatcccgc 2280gaaattaata cgactcacta
tagggagacc acaacggttt ccctctagaa ataattttgt 2340ttaactttaa
gaaggagata tacccatgga aaagcctgaa ctcaccgcga cgtctgtcga
2400gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct
cggagggcga 2460agaatctcgt gctttcagct tcgatgtagg agggcgtgga
tatgtcctgc gggtaaatag 2520ctgcgccgat ggtttctaca aagatcgtta
tgtttatcgg cactttgcat cggccgcgct 2580cccgattccg gaagtgcttg
acattgggga attcagcgag agcctgacct attgcatctc 2640ccgccgtgca
cagggtgtca cgttgcaaga cctgcctgaa accgaactgc ccgctgttct
2700gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc
agacgagcgg 2760gttcggccca ttcggaccgc aaggaatcgg tcaatacact
acatggcgtg atttcatatg 2820cgcgattgct gatccccatg tgtatcactg
gcaaactgtg atggacgaca ccgtcagtgc 2880gtccgtcgcg caggctctcg
atgagctgat gctttgggcc gaggactgcc ccgaagtccg 2940gcacctcgtg
cacgcggatt tcggctccaa caatgtcctg acggacaatg gccgcataac
3000agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg
tcgccaacat 3060cttcttctgg aggccgtggt tggcttgtat ggagcagcag
acgcgctact tcgagcggag 3120gcatccggag cttgcaggat cgccgcggct
ccgggcgtat atgctccgca ttggtcttga 3180ccaactctat cagagcttgg
ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg 3240atgcgacgca
atcgtccgat ccggagccgg gactgtcggg cgtacacaaa tcgcccgcag
3300aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg
gaaaccgacg 3360ccccagcact cgtccgaggg caaaggaata gtgaggtaca
gcttggatcg atccggctgc 3420taacaaagcc cgaaaggaag ctgagttggc
tgctgccacc gctgagcaat aactagcata 3480accccttggg gcctctaaac
gggtcttgag gggttttttg ctgaaaggag gaactatatc 3540cggatgatcg
tcgaggcctc acgtgttaac aagcttgcat gcctgcaggt ttatcaacaa
3600gtttgtacaa aaaagcaggc tccgaattcg cccttgatca caatcttcat
gcatctgagt 3660cacactattg ccctttgctc atgaagtaca ctttactcac
cgccaaagtt cactcaacac 3720tgtagaacaa aggaatcata taaataatgc
atatctctcc cttaagcctt caacacatac 3780aaaagtgaca caccaaatca
aagacacctg agccattcaa ttcccctcct ttattgcttt 3840caagtttcaa
cactaatttt attatctgaa acaagggcga attcgaccca gcttt 3895
SEQ ID NO: 53; length: 4157; type: DNA; organism: Artificial; description: Nucleotide sequence of pZSL90
gatccatggc
ccacagcaag cacggcctga aggaggagat gaccatgaag taccacatgg 60agggctgcgt
gaacggccac aagttcgtga tcaccggcga gggcatcggc taccccttca
120agggcaagca gaccatcaac ctgtgcgtga tcgagggcgg ccccctgccc
ttcagcgagg 180acatcctgag cgccggcttc aagtacggcg accggatctt
caccgagtac ccccaggaca 240tcgtggacta cttcaagaac agctgccccg
ccggctacac ctggggccgg agcttcctgt 300tcgaggacgg cgccgtgtgc
atctgtaacg tggacatcac cgtgagcgtg aaggagaact 360gcatctacca
caagagcatc ttcaacggcg tgaacttccc cgccgacggc cccgtgatga
420agaagatgac caccaactgg gaggccagct gcgagaagat catgcccgtg
cctaagcagg 480gcatcctgaa gggcgacgtg agcatgtacc tgctgctgaa
ggacggcggc cggtaccggt 540gccagttcga caccgtgtac aaggccaaga
gcgtgcccag caagatgccc gagtggcact 600tcatccagca caagctgctg
cgggaggacc ggagcgacgc caagaaccag aagtggcagc 660tgaccgagca
cgccatcgcc ttccccagcg ccctggcctg agagctcgaa tttccccgat
720cgttcaaaca tttggcaata aagtttctta agattgaatc ctgttgccgg
tcttgcgatg 780attatcatat aatttctgtt gaattacgtt aagcatgtaa
taattaacat gtaatgcatg 840acgttattta tgagatgggt ttttatgatt
agagtcccgc aattatacat ttaatacgcg 900atagaaaaca aaatatagcg
cgcaaactag gataaattat cgcgcgcggt gtcatctatg 960ttactagatc
gggaattcta gtggccggcc cagctgatat ccatcacact ggcggccgct
1020cgagttctat agtgtcacct aaatcgtatg tgtatgatac ataaggttat
gtattaattg 1080tagccgcgtt ctaacgacaa tatgtccata tggtgcactc
tcagtacaat ctgctctgat 1140gccgcatagt taagccagcc ccgacacccg
ccaacacccg ctgacgcgcc ctgacgggct 1200tgtctgctcc cggcatccgc
ttacagacaa gctgtgaccg tctccgggag ctgcatgtgt 1260cagaggtttt
caccgtcatc accgaaacgc gcgagacgaa agggcctcgt gatacgccta
1320tttttatagg ttaatgtcat gaccaaaatc ccttaacgtg agttttcgtt
ccactgagcg 1380tcagaccccg tagaaaagat caaaggatct tcttgagatc
ctttttttct gcgcgtaatc 1440tgctgcttgc aaacaaaaaa accaccgcta
ccagcggtgg tttgtttgcc ggatcaagag 1500ctaccaactc tttttccgaa
ggtaactggc ttcagcagag cgcagatacc aaatactgtc 1560cttctagtgt
agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac
1620ctcgctctgc taatcctgtt accagtggct gctgccagtg gcgataagtc
gtgtcttacc 1680gggttggact caagacgata gttaccggat aaggcgcagc
ggtcgggctg aacggggggt 1740tcgtgcacac agcccagctt ggagcgaacg
acctacaccg aactgagata cctacagcgt 1800gagcattgag aaagcgccac
gcttcccgaa gggagaaagg cggacaggta tccggtaagc 1860ggcagggtcg
gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt
1920tatagtcctg tcgggtttcg ccacctctga cttgagcgtc gatttttgtg
atgctcgtca 1980ggggggcgga gcctatggaa aaacgccagc aacgcggcct
ttttacggtt cctggccttt 2040tgctggcctt ttgctcacat gttctttcct
gcgttatccc ctgattctgt ggataaccgt 2100attaccgcct ttgagtgagc
tgataccgct cgccgcagcc gaacgaccga gcgcagcgag 2160tcagtgagcg
aggaagcgga agagcgccca atacgcaaac cgcctctccc cgcgcgttgg
2220ccgattcatt aatgcaggtt gatcagatct cgatcccgcg aaattaatac
gactcactat 2280agggagacca caacggtttc cctctagaaa taattttgtt
taactttaag aaggagatat 2340acccatggaa aagcctgaac tcaccgcgac
gtctgtcgag aagtttctga tcgaaaagtt 2400cgacagcgtc tccgacctga
tgcagctctc ggagggcgaa gaatctcgtg ctttcagctt 2460cgatgtagga
gggcgtggat atgtcctgcg ggtaaatagc tgcgccgatg gtttctacaa
2520agatcgttat gtttatcggc actttgcatc ggccgcgctc ccgattccgg
aagtgcttga 2580cattggggaa ttcagcgaga gcctgaccta ttgcatctcc
cgccgtgcac agggtgtcac 2640gttgcaagac ctgcctgaaa ccgaactgcc
cgctgttctg cagccggtcg cggaggctat 2700ggatgcgatc gctgcggccg
atcttagcca gacgagcggg ttcggcccat tcggaccgca 2760aggaatcggt
caatacacta catggcgtga tttcatatgc gcgattgctg atccccatgt
2820gtatcactgg caaactgtga tggacgacac cgtcagtgcg tccgtcgcgc
aggctctcga 2880tgagctgatg ctttgggccg aggactgccc cgaagtccgg
cacctcgtgc acgcggattt 2940cggctccaac aatgtcctga cggacaatgg
ccgcataaca gcggtcattg actggagcga 3000ggcgatgttc ggggattccc
aatacgaggt cgccaacatc ttcttctgga ggccgtggtt 3060ggcttgtatg
gagcagcaga cgcgctactt cgagcggagg catccggagc ttgcaggatc
3120gccgcggctc cgggcgtata tgctccgcat tggtcttgac caactctatc
agagcttggt 3180tgacggcaat ttcgatgatg cagcttgggc gcagggtcga
tgcgacgcaa tcgtccgatc 3240cggagccggg actgtcgggc gtacacaaat
cgcccgcaga agcgcggccg tctggaccga 3300tggctgtgta gaagtactcg
ccgatagtgg aaaccgacgc cccagcactc gtccgagggc 3360aaaggaatag
tgaggtacag cttggatcga tccggctgct aacaaagccc gaaaggaagc
3420tgagttggct gctgccaccg ctgagcaata actagcataa ccccttgggg
cctctaaacg 3480ggtcttgagg ggttttttgc tgaaaggagg aactatatcc
ggatgatcgt cgaggcctca 3540cgtgttaaca agcttgcatg cctgcaggtt
taaacagtcg actctagaga tccgtcaaca 3600tggtggagca cgacactctc
gtctactcca agaatatcaa agatacagtc tcagaagacc 3660aaagggctat
tgagactttt caacaaaggg taatatcggg aaacctcctc ggattccatt
3720gcccagctat ctgtcacttc atcaaaagga cagtagaaaa ggaaggtggc
acctacaaat 3780gccatcattg cgataaagga aaggctatcg ttcaagatgc
ctctgccgac agtggtccca 3840aagatggacc cccacccacg aggagcatcg
tggaaaaaga agacgttcca accacgtctt 3900caaagcaagt ggattgatgt
gatgatccta tgcgtatggt atgacgtgtg ttcaagatga 3960tgacttcaaa
cctacctatg acgtatggta tgacgtgtgt cgactgatga cttagatcca
4020ctcgagcggc tataaatacg tacctacgca ccctgcgcta ccatccctag
agctgcagct 4080tatttttaca acaattacca acaacaacaa acaacaaaca
acattacaat tactatttac 4140aattacagtc gacccgg 4157
* * * * *