U.S. patent application number 14/207904 was filed with the patent office on 2014-03-13 and published on 2014-09-18 for soybean ef1a promoter and its use in constitutive expression of transgenic genes in plants.
This patent application is currently assigned to E I DU PONT DE NEMOURS AND COMPANY. The applicant listed for this patent is E I DU PONT DE NEMOURS AND COMPANY. Invention is credited to ZHONGSEN LI.
Application Number | 20140283210 14/207904 |
Family ID | 40133641 |
Filed Date | 2014-03-13 |
United States Patent
Application |
20140283210 |
Kind Code |
A1 |
LI; ZHONGSEN |
September 18, 2014 |
SOYBEAN EF1A PROMOTER AND ITS USE IN CONSTITUTIVE EXPRESSION OF
TRANSGENIC GENES IN PLANTS
Abstract
The promoter of a soybean translation elongation factor EF1
alpha and fragments thereof and their use in promoting the
expression of one or more heterologous nucleic acid fragments in a
tissue-independent or constitutive manner in plants are
described.
Inventors: |
LI; ZHONGSEN; (Hockessin,
DE) |
|
Applicant: |
Name | City | State | Country | Type |
E I DU PONT DE NEMOURS AND COMPANY | Wilmington | DE | US | |
Assignee: |
E I DU PONT DE NEMOURS AND
COMPANY
Wilmington
DE
|
Family ID: |
40133641 |
Appl. No.: |
14/207904 |
Filed: |
March 13, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
12120281 | May 14, 2008 | 8710206 |
14207904 | | |
60944143 | Jun 15, 2007 | |
Current U.S.
Class: |
800/279 ;
435/320.1; 435/419; 435/468; 800/278; 800/281; 800/284; 800/289;
800/290; 800/297; 800/312 |
Current CPC
Class: |
C12N 15/8245 20130101;
C12N 15/8251 20130101; C12N 15/8286 20130101; C12N 15/8222
20130101; C12N 15/8247 20130101; C12N 15/8279 20130101; C12N
15/8261 20130101; C12N 15/8273 20130101 |
Class at
Publication: |
800/279 ;
435/320.1; 435/419; 800/297; 800/312; 800/278; 800/284; 800/281;
800/290; 800/289; 435/468 |
International
Class: |
C12N 15/82 20060101
C12N015/82 |
Claims
1. A recombinant DNA construct comprising a nucleotide sequence
comprising a sequence having at least 90% sequence identity, based
on the BLASTN method of alignment, when compared to any one of the
sequences set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:6 or SEQ ID NO:7; operably linked to
at least one heterologous sequence, wherein said nucleotide
sequence is a promoter.
2. The recombinant DNA construct of claim 1, wherein the nucleotide
sequence has at least 95% identity, based on the BLASTN method
of alignment, when compared to any one of the sequences set forth in
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, or
SEQ ID NO:6.
3. A recombinant DNA construct comprising: (a) a nucleotide
sequence comprising a fragment of SEQ ID NO:1, SEQ ID NO:2, SEQ ID
NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6 or SEQ ID NO:7; or (b)
a nucleotide sequence comprising a sequence having at least 90%
sequence identity, based on the BLASTN method of alignment, when
compared to the nucleotide sequence of (a); operably linked to at
least one heteroloqous sequence, wherein said nucleotide sequence
is a promoter.
4. (canceled)
5. The recombinant DNA construct of claim 1 wherein the nucleotide
sequence is a constitutive promoter.
6. (canceled)
7. A vector comprising the recombinant DNA construct of claim
1.
8. A cell comprising the recombinant DNA construct of claim 1.
9. The cell of claim 8, wherein the cell is a plant cell.
10. A transgenic plant having stably incorporated into its genome
the recombinant DNA construct of claim 1.
11. The transgenic plant of claim 10 wherein said plant is selected
from the group consisting of dicotyledonous plants.
12. The plant of claim 11 wherein the plant is soybean.
13. A transgenic seed produced by the transgenic plant of claim 11,
wherein the transgenic seed comprises the recombinant DNA
construct.
14. A method of expressing a coding sequence or a functional RNA in
a plant comprising: a) introducing the recombinant DNA construct of
claim 1 into the plant, wherein the at least one heterologous
sequence comprises a coding sequence or encodes a functional RNA;
b) growing the plant of step a); and c) selecting a plant
displaying expression of the coding sequence or the functional RNA
of the recombinant DNA construct.
15. A method of transgenically altering a marketable plant trait,
comprising: a) introducing a recombinant DNA construct of claim 1
into the plant; b) growing a fertile, mature plant resulting from
step a); and c) selecting a plant expressing the at least one
heterologous nucleotide sequence in at least one plant tissue based
on the altered marketable trait.
16. The method of claim 15 wherein the marketable trait is selected
from the group consisting of: disease resistance, herbicide
resistance, insect resistance, carbohydrate metabolism, fatty acid
metabolism, amino acid metabolism, plant development, plant growth
regulation, yield improvement, drought resistance, cold resistance,
heat resistance, and salt resistance.
17. A method for altering expression of at least one heterologous
nucleic acid fragment in a plant comprising: (a) transforming a plant
cell with the recombinant expression construct of claim 1; (b)
growing fertile mature plants from transformed plant cell of step
(a); and (c) selecting plants containing the transformed plant cell
wherein the expression of the heterologous nucleic acid fragment is
increased or decreased.
18. The method of claim 17 wherein the plant is a soybean
plant.
19. A method for altering expression of at least one heterologous
nucleic acid fragment in a plant comprising: (a) transforming a
plant cell with a recombinant DNA construct of claim 3; (b) growing
fertile mature plants from transformed plant cell of step (a); and
(c) selecting plants comprising a transformed plant cell expressing
the heterologous nucleic acid fragment during early seed
development.
20. A method for expressing a yellow fluorescent protein ZS-YELLOW1
N1 in a host cell comprising: (a) transforming a host cell with a
recombinant expression construct comprising at least one ZS-YELLOW1
N1 (YFP) nucleic acid fragment operably linked to a promoter
wherein said promoter consists essentially of the nucleotide
sequence set forth in SEQ ID NOs: 1, 2, 3, 4, 5, 6, or 7; and, (b)
growing the transformed host cell under conditions that are
suitable for expression of the recombinant DNA construct, wherein
expression of the recombinant DNA construct results in production
of increased levels of ZS-YELLOW1 N1 protein in the transformed
host cell when compared to a corresponding nontransformed host
cell.
21. A plant stably transformed with a recombinant expression
construct comprising a soybean promoter and a heterologous nucleic
acid fragment operably linked to said promoter, wherein said
promoter is capable of controlling expression of said
heterologous nucleic acid fragment in a plant cell, and further
wherein said promoter comprises a fragment of SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6 or SEQ ID
NO:7.
22. An isolated polynucleotide according to any one of claim 1,
wherein the heterologous nucleic acid sequence codes for a gene
selected from the group consisting of: a reporter gene, a selection
marker, a disease resistance conferring gene, a herbicide
resistance conferring gene, an insect resistance conferring gene; a
gene involved in carbohydrate metabolism, a gene involved in fatty
acid metabolism, a gene involved in amino acid metabolism, a gene
involved in plant development, a gene involved in plant growth
regulation, a gene involved in yield improvement, a gene involved
in drought resistance, a gene involved in cold resistance, a gene
involved in heat and salt resistance in plants.
23. An isolated polynucleotide according to any one of claim 1,
wherein the heterologous nucleic acid sequence encodes a protein
selected from the group consisting of: a reporter protein, a
selection marker, a protein conferring disease resistance, protein
conferring herbicide resistance, protein conferring insect
resistance; protein involved in carbohydrate metabolism, protein
involved in fatty acid metabolism, protein involved in amino acid
metabolism, protein involved in plant development, protein involved
in plant growth regulation, protein involved in yield improvement,
protein involved in drought resistance, protein involved in cold
resistance, protein involved in heat resistance and salt resistance
in plants.
Description
[0001] This application is a Continuation of U.S. application Ser.
No. 12/120,281, filed May 14, 2008, now allowed, which claims the
benefit of U.S. Application No. 60/944,143, filed Jun. 15, 2007,
now expired, the entire content of which is herein incorporated by
reference.
FIELD OF THE INVENTION
[0002] This invention relates to a plant promoter GM-EF1A and
fragments thereof and their use in altering expression of at least
one heterologous nucleic acid fragment in plants in a
tissue-independent or constitutive manner.
BACKGROUND
[0003] Recent advances in plant genetic engineering have opened new
doors to engineer plants to have improved characteristics or
traits, such as plant disease resistance, insect resistance,
herbicidal resistance, yield improvement, improvement of the
nutritional quality of the edible portions of the plant, and
enhanced stability or shelf-life of the ultimate consumer product
obtained from the plants. Thus, a desired gene (or genes) with the
molecular function to impart different or improved characteristics
or qualities, can be incorporated properly into the plant's genome.
The newly integrated gene (or genes) coding sequence can then be
expressed in the plant cell to exhibit the desired new trait or
characteristics. Appropriate regulatory
signals must be present in proper configurations in order to obtain
the expression of the newly inserted gene coding sequence in the
plant cell. These regulatory signals typically include a promoter
region, a 5' non-translated leader sequence and a 3' transcription
termination/polyadenylation sequence.
[0004] A promoter is a non-coding genomic DNA sequence, usually
upstream (5') to the relevant coding sequence, to which RNA
polymerase binds before initiating transcription. This binding
aligns the RNA polymerase so that transcription will initiate at a
specific transcription initiation site. The nucleotide sequence of
the promoter determines the nature of the enzyme and other related
protein factors that attach to it and the rate of RNA synthesis.
The RNA is processed to produce messenger RNA (mRNA) which serves
as a template for translation of the RNA sequence into the amino
acid sequence of the encoded polypeptide. The 5' non-translated
leader sequence is a region of the mRNA upstream of the coding
region that may play a role in initiation and translation of the
mRNA. The 3' transcription termination/polyadenylation signal is a
non-translated region downstream of the coding region that
functions in the plant cell to cause termination of the RNA
synthesis and the addition of polyadenylate nucleotides to the 3'
end.
[0005] It has been shown that certain promoters are able to direct
RNA synthesis at a higher rate than others. These are called
"strong promoters". Certain other promoters have been shown to
direct RNA synthesis at higher levels only in particular types of
cells or tissues and are often referred to as "tissue specific
promoters", or "tissue-preferred promoters" if the promoters direct
RNA synthesis preferably in certain tissues but also in other
tissues at reduced levels. Since the patterns of the expression of
a chimeric gene (or genes) introduced into a plant are controlled
using promoters, there is an ongoing interest in the isolation of
novel promoters which are capable of controlling the expression of
a chimeric gene (or genes) at certain levels in specific tissue
types or at specific plant developmental stages.
[0006] Certain promoters are able to direct RNA synthesis at
relatively similar levels across all tissues of a plant. These are
called "constitutive promoters" or "tissue-independent" promoters.
Constitutive promoters can be divided into strong, moderate and
weak according to their effectiveness to direct RNA synthesis.
Since it is necessary in many cases to simultaneously express a
chimeric gene (or genes) in different tissues of a plant to get the
desired functions of the gene (or genes), constitutive promoters
are especially useful in this consideration. Though many
constitutive promoters have been discovered from plants and plant
viruses and characterized, there is still an ongoing interest in
the isolation of more novel constitutive promoters which are
capable of controlling the expression of a chimeric gene (or genes)
at different levels and the expression of multiple genes in the
same transgenic plant for gene stacking.
SUMMARY OF THE INVENTION
[0007] This invention concerns an isolated nucleic acid fragment
comprising a promoter wherein said promoter consists essentially of
the nucleotide sequence set forth in SEQ ID NOs: 1, 2, 3, 4, 5, 6,
7 or said promoter consists essentially of a fragment that is
substantially similar and functionally equivalent to the nucleotide
sequence set forth in SEQ ID NOs: 1, 2, 3, 4, 5, 6, 7.
[0008] In a second embodiment, this invention concerns a
recombinant expression construct comprising at least one
heterologous nucleic acid fragment operably linked to the promoter
of the invention.
[0009] In a third embodiment, this invention concerns a cell,
plant, or seed comprising a recombinant expression construct of the
present disclosure.
[0010] In a fourth embodiment, this invention concerns plants
comprising this recombinant expression construct and seeds obtained
from such plants.
[0011] In a fifth embodiment, this invention concerns a method of
altering (increasing or decreasing) expression of at least one
heterologous nucleic acid fragment in a plant cell which comprises:
[0012] (a) transforming a plant cell with the recombinant
expression construct described above; [0013] (b) growing fertile
mature plants from the transformed plant cell of step (a); [0014]
(c) selecting plants containing the transformed plant cell wherein
the expression of the heterologous nucleic acid fragment is
increased or decreased.
[0015] In a sixth embodiment, this invention concerns a method for
expressing a yellow fluorescent protein ZS-YELLOW1 N1 in a host
cell comprising: [0016] (a) transforming a host cell with a
recombinant expression construct comprising at least one ZS-YELLOW1
N1 (YFP) nucleic acid fragment operably linked to a promoter
wherein said promoter consists essentially of the nucleotide
sequence set forth in SEQ ID NOs:1, 2, 3, 4, 5, 6, or 7; and [0017]
(b) growing the transformed host cell under conditions that are
suitable for expression of the recombinant DNA construct, wherein
expression of the recombinant DNA construct results in production
of increased levels of ZS-YELLOW1 N1 protein in the transformed
host cell when compared to a corresponding nontransformed host
cell.
[0018] In a seventh embodiment, this invention concerns an isolated
nucleic acid fragment comprising a plant translation elongation
factor EF1A gene promoter.
[0019] In an eighth embodiment, this invention concerns a method of
altering a marketable plant trait. The marketable plant trait
concerns genes and proteins involved in disease resistance,
herbicide resistance, insect resistance, carbohydrate metabolism,
fatty acid metabolism, amino acid metabolism, plant development,
plant growth regulation, yield improvement, drought resistance,
cold resistance, heat resistance, and salt resistance.
[0020] In a ninth embodiment, this invention concerns an isolated
polynucleotide linked to a heterologous nucleic acid sequence. The
heterologous nucleic acid sequence encodes a protein involved in
disease resistance, herbicide resistance, insect resistance;
carbohydrate metabolism, fatty acid metabolism, amino acid
metabolism, plant development, plant growth regulation, yield
improvement, drought resistance, cold resistance, heat resistance,
or salt resistance in plants.
BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCES
[0021] The invention can be more fully understood from the
following detailed descriptions, the drawings and the sequence
descriptions that form a part of this application. The sequence
descriptions and sequence listing attached hereto comply with the
rules governing nucleotide and/or amino acid sequence disclosures
in patent applications as set forth in 37 C.F.R. §§ 1.821-1.825.
The sequence descriptions contain the three letter codes for amino
acids as defined in 37 C.F.R. §§ 1.821-1.825, which are
incorporated herein by reference.
[0022] SEQ ID NO:1 is the DNA sequence comprising a 2181 bp (base
pair) soybean EF1A promoter.
[0023] SEQ ID NO:2 is a 1841 bp truncated form of the EF1A promoter
shown in SEQ ID NO:1 (bp 340-2181 of SEQ ID NO:1).
[0024] SEQ ID NO:3 is a 1642 bp truncated form of the EF1A promoter
shown in SEQ ID NO:1 (bp 539-2181 of SEQ ID NO:1).
[0025] SEQ ID NO:4 is a 1431 bp truncated form of the EF1A promoter
shown in SEQ ID NO:1 (bp 750-2181 of SEQ ID NO:1).
[0026] SEQ ID NO:5 is a 1215 bp truncated form of the EF1A promoter
shown in SEQ ID NO:1 (bp 966-2181 of SEQ ID NO:1).
[0027] SEQ ID NO:6 is a 1012 bp truncated form of the EF1A promoter
shown in SEQ ID NO:1 (bp 1169-2181 of SEQ ID NO:1).
[0028] SEQ ID NO:7 is an 837 bp truncated form of the EF1A promoter
shown in SEQ ID NO:1 (bp 1344-2181 of SEQ ID NO:1).
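The lengths and coordinates listed for SEQ ID NOs:2-7 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only. It uses the lengths and start coordinates stated above and assumes, which the descriptions imply but do not state outright, that every truncation runs to bp 2181 of SEQ ID NO:1. A placeholder string stands in for the actual sequence, which appears only in the sequence listing, and the names `TRUNCATIONS` and `truncate` are hypothetical.

```python
FULL_LENGTH = 2181  # bp, SEQ ID NO:1 as stated above

# SEQ ID NO -> (stated length in bp, stated start coordinate in SEQ ID NO:1)
TRUNCATIONS = {
    2: (1841, 340),
    3: (1642, 539),
    4: (1431, 750),
    5: (1215, 966),
    6: (1012, 1169),
    7: (837, 1344),
}


def truncate(full_seq: str, start: int) -> str:
    """Return what remains of full_seq after removing its first `start` bases."""
    return full_seq[start:]


# Placeholder for SEQ ID NO:1 (hypothetical; only its length matters here).
placeholder = "N" * FULL_LENGTH

for seq_id, (length, start) in TRUNCATIONS.items():
    fragment = truncate(placeholder, start)
    print(f"SEQ ID NO:{seq_id}: start {start}, "
          f"fragment length {len(fragment)} (stated {length})")
```

Under this assumption each stated length equals 2181 minus the stated start coordinate, so the start coordinate behaves like the number of 5' bases removed rather than a 1-based, inclusive position of the first retained base.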
[0029] SEQ ID NO:8 is an oligonucleotide primer used as an
antisense primer in the PCR amplifications of the full length EF1A
promoter in SEQ ID NO:1 when paired with SEQ ID NO:9, and the
truncated EF1A promoters in SEQ ID NOs:2, 3, 4, 5, 6, or 7 when
paired with SEQ ID NOs:10, 11, 12, 13, 14 or 15, respectively.
[0030] SEQ ID NO:9 is an oligonucleotide primer used as a sense
primer in the PCR amplification of the full length EF1A promoter in
SEQ ID NO:1 when paired with SEQ ID NO:8.
[0031] SEQ ID NO:10 is an oligonucleotide primer used as a sense
primer in the PCR amplification of the truncated EF1A promoter in
SEQ ID NO:2 when paired with SEQ ID NO:8.
[0032] SEQ ID NO:11 is an oligonucleotide primer used as a sense
primer in the PCR amplification of the truncated EF1A promoter in
SEQ ID NO:3 when paired with SEQ ID NO:8.
[0033] SEQ ID NO:12 is an oligonucleotide primer used as a sense
primer in the PCR amplification of the truncated EF1A promoter in
SEQ ID NO:4 when paired with SEQ ID NO:8.
[0034] SEQ ID NO:13 is an oligonucleotide primer used as a sense
primer in the PCR amplification of the truncated EF1A promoter in
SEQ ID NO:5 when paired with SEQ ID NO:8.
[0035] SEQ ID NO:14 is an oligonucleotide primer used as a sense
primer in the PCR amplification of the truncated EF1A promoter in
SEQ ID NO:6 when paired with SEQ ID NO:8.
[0036] SEQ ID NO:15 is an oligonucleotide primer used as a sense
primer in the PCR amplification of the truncated EF1A promoter in
SEQ ID NO:7 when paired with SEQ ID NO:8.
[0037] SEQ ID NO:16 is a 770 bp intron located in the 5'
untranslated region (5'UTR) of the EF1A gene as revealed by sequence
alignment between the EF1A promoter and EF1A cDNA sequence.
[0038] SEQ ID NO:17 is a 54 bp putative 5' untranslated region
(5'UTR) upstream of the 770 bp intron SEQ ID NO:16 of the EF1A
gene.
[0039] SEQ ID NO:18 is a 13 bp putative 5' untranslated region
(5'UTR) downstream of the 770 bp intron SEQ ID NO:16 of the EF1A
gene. The last two base pairs CC are not naturally present in the
EF1A gene and are extra base pairs introduced as part of the NcoI
cloning site CCATGG.
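The segment sizes given for the 5' untranslated region can be tallied as in the following minimal sketch. Treating the three segments as contiguous is an assumption, as is the four-base string `ATGG` used to represent the 3' half of the NcoI site; the function names are hypothetical.

```python
# Segment sizes in bp, as stated in the sequence descriptions above.
UTR_UPSTREAM = 54    # SEQ ID NO:17, 5'UTR upstream of the intron
INTRON = 770         # SEQ ID NO:16, 5'UTR intron
UTR_DOWNSTREAM = 13  # SEQ ID NO:18, 5'UTR downstream of the intron
ADDED_BASES = 2      # trailing "CC" introduced with the NcoI site

NCOI_SITE = "CCATGG"


def leader_region_length() -> int:
    """Total length if the three segments are contiguous (an assumption)."""
    return UTR_UPSTREAM + INTRON + UTR_DOWNSTREAM


def native_downstream_utr_length() -> int:
    """Downstream 5'UTR length without the two bases added for cloning."""
    return UTR_DOWNSTREAM - ADDED_BASES


def forms_ncoi_site(upstream_tail: str, downstream_head: str) -> bool:
    """Check whether joining the two ends recreates the NcoI site."""
    return upstream_tail + downstream_head == NCOI_SITE


print(leader_region_length())            # 837
print(native_downstream_utr_length())    # 11
print(forms_ncoi_site("CC", "ATGG"))     # True
```

The 837 bp total coincides with the length stated for SEQ ID NO:7; the descriptions here do not say whether that is by design.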
[0040] SEQ ID NO:19 is a 41 bp oligonucleotide primer specific to
the soybean EF1A promoter 5' end for the amplification of the
promoter when paired with SEQ ID NO:20. An XmaI restriction site
CCCGGG is added for subsequent cloning.
[0041] SEQ ID NO:20 is a 35 bp oligonucleotide primer specific to
the soybean EF1A promoter 3' end for the amplification of the
promoter when paired with SEQ ID NO:19. An NcoI restriction site
CCATGG is added for subsequent cloning.
[0042] SEQ ID NO:21 is the 1724 bp nucleotide sequence of the
putative soybean translation elongation factor EF1A gene.
Nucleotides 1 to 65 are the 5' untranslated sequence, nucleotides
66 to 68 are the translation initiation codon, nucleotides 66 to
1406 are the polypeptide coding region, nucleotides 1407 to 1409
are the termination codon, and nucleotides 1410 to 1724 are part of
the 3' untranslated sequence.
[0043] SEQ ID NO:22 is the predicted 447 aa (amino acid) long
protein sequence translated from the coding region of the putative
soybean translation elongation factor EF1A gene nucleotide sequence
SEQ ID NO:21.
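The coordinates given for SEQ ID NO:21 and the protein length given for SEQ ID NO:22 can be checked against each other. The sketch below is only an arithmetic illustration using the stated 1-based, inclusive coordinates; the helper names are hypothetical.

```python
# 1-based, inclusive coordinates within SEQ ID NO:21, as stated above.
FEATURES = [
    ("5' untranslated sequence", 1, 65),
    ("coding region", 66, 1406),
    ("termination codon", 1407, 1409),
    ("3' untranslated sequence", 1410, 1724),
]
TOTAL_LENGTH = 1724  # nucleotides


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def tiles_exactly(features, total_length: int) -> bool:
    """True if the features cover 1..total_length with no gaps or overlaps."""
    expected_start = 1
    for _name, start, end in features:
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


coding_nt = span_length(66, 1406)
codons, remainder = divmod(coding_nt, 3)

for name, start, end in FEATURES:
    print(f"{name}: {span_length(start, end)} nt")
print(f"features tile the sequence: {tiles_exactly(FEATURES, TOTAL_LENGTH)}")
print(f"coding region: {codons} codons, remainder {remainder}")
```

The coding region as stated excludes the termination codon, and its 1341 nucleotides correspond to 447 codons, which agrees with the 447 amino acid length given for SEQ ID NO:22.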
[0044] SEQ ID NO:23 is the 5465 bp sequence of QC314.
[0045] SEQ ID NO:24 is the 9768 bp sequence of QC318.
[0046] SEQ ID NO:25 is the 5839 bp sequence of QC314-1Y.
[0047] SEQ ID NO:26 is an oligonucleotide primer used in the
diagnostic PCR to check for soybean genomic DNA presence in total
RNA or cDNA when paired with SEQ ID NO:27.
[0048] SEQ ID NO:27 is an oligonucleotide primer used in the
diagnostic PCR to check for soybean genomic DNA presence in total
RNA or cDNA when paired with SEQ ID NO:26.
[0049] SEQ ID NO:28 is a sense primer used in quantitative PCR
analysis of SAMS:ALS transgene copy numbers.
[0050] SEQ ID NO:29 is a FAM labeled fluorescent DNA oligo probe
used in quantitative PCR analysis of SAMS:ALS transgene copy
numbers.
[0051] SEQ ID NO:30 is an antisense primer used in quantitative PCR
analysis of SAMS:ALS transgene copy numbers.
[0052] SEQ ID NO:31 is a sense primer used in quantitative PCR
analysis of GM-EF1A:YFP transgene copy numbers.
[0053] SEQ ID NO:32 is a FAM labeled fluorescent DNA oligo probe
used in quantitative PCR analysis of GM-EF1A:YFP transgene copy
numbers.
[0054] SEQ ID NO:33 is an antisense primer used in quantitative PCR
analysis of GM-EF1A:YFP transgene copy numbers.
[0055] SEQ ID NO:34 is a sense primer used as an endogenous control
gene primer in quantitative PCR analysis of transgene copy
numbers.
[0056] SEQ ID NO:35 is a VIC labeled DNA oligo probe used as an
endogenous control gene probe in quantitative PCR analysis of
transgene copy numbers.
[0057] SEQ ID NO:36 is an antisense primer used as an endogenous
control gene primer in quantitative PCR analysis of transgene copy
numbers.
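The copy-number assays referenced above pair a FAM labeled transgene probe with a VIC labeled endogenous control probe. One common way to turn such paired measurements into a copy-number estimate is the comparative Ct calculation sketched below. This is an illustrative assumption: this section does not state which calculation was used, the Ct values are invented, and the function names are hypothetical. The sketch also assumes roughly 100% amplification efficiency for both assays and a calibrator sample of known copy number.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Ct of the transgene assay minus Ct of the endogenous control assay."""
    return ct_target - ct_reference


def relative_copy_number(
    ct_target: float,
    ct_reference: float,
    cal_ct_target: float,
    cal_ct_reference: float,
    calibrator_copies: float = 1.0,
) -> float:
    """Estimate copy number relative to a calibrator of known copy number.

    Uses the comparative Ct relation copies = calibrator_copies * 2 ** (-ddCt),
    which assumes both assays double the product every cycle.
    """
    ddct = delta_ct(ct_target, ct_reference) - delta_ct(cal_ct_target, cal_ct_reference)
    return calibrator_copies * 2.0 ** (-ddct)


# Invented example values: a single-copy calibrator and one test sample.
calibrator = {"target": 25.0, "reference": 24.0}
sample = {"target": 24.0, "reference": 24.0}

estimate = relative_copy_number(
    sample["target"], sample["reference"],
    calibrator["target"], calibrator["reference"],
    calibrator_copies=1.0,
)
print(f"estimated copies: {estimate:.1f}")
```

Normalizing to the endogenous control corrects for differences in the amount of input DNA, so only the difference between the transgene and control signals enters the estimate.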
[0058] SEQ ID NO:37 is the recombination site attL1 sequence in the
Gateway cloning system (Invitrogen).
[0059] SEQ ID NO:38 is the recombination site attL2 sequence in the
Gateway cloning system (Invitrogen).
[0060] SEQ ID NO:39 is the recombination site attR1 sequence in the
Gateway cloning system (Invitrogen).
[0061] SEQ ID NO:40 is the recombination site attR2 sequence in the
Gateway cloning system (Invitrogen).
[0062] SEQ ID NO:41 is the recombination site attB1 sequence in the
Gateway cloning system (Invitrogen).
[0063] SEQ ID NO:42 is the recombination site attB2 sequence in the
Gateway cloning system (Invitrogen).
[0064] SEQ ID NO:43 is the 17 base signature tag used in the MPSS
analysis described in Example 1.
[0065] FIG. 1 shows the logarithm of the relative quantification of
soybean EF1A gene expression in 14 different soybean tissues by
quantitative RT-PCR. The gene expression profile indicates that the
EF1A gene is highly expressed in all the checked tissues.
[0066] FIG. 2A is the EF1A promoter copy number analysis by
Southern hybridization.
[0067] FIG. 2B shows a schematic of the EF1A promoter.
[0068] FIG. 3 shows the maps of plasmids QC314, PHP25224, and
QC318.
[0069] FIG. 4 shows the maps of plasmids pCR8/GW/TOPO, QC314-1, QC300,
and QC314-1Y containing the full length 2181 bp EF1A promoter.
Promoter deletion constructs QC314-2Y, QC314-3Y, QC314-4Y,
QC314-5Y, QC314-6Y, and QC314-7Y containing the 1841, 1642, 1431,
1215, 1012, and 837 bp truncated EF1A promoters, respectively, have
a similar map configuration.
[0070] FIG. 5 is the schematic description of the progressive
truncation constructs, QC314-1Y, QC314-2Y, QC314-3Y, QC314-4Y,
QC314-5Y, QC314-6Y, and QC314-7Y, of the EF1A promoter. The size of
each promoter deletion including the 5'UTR and 5'UTR intron is given
at the left end of each drawing. The size of the promoter region
upstream of the 5'UTR is given under each drawing.
[0071] FIG. 6 is the transient expression of the fluorescent
protein reporter gene ZS-YELLOW1 N1 in the cotyledons of
germinating soybean seeds. The reporter gene is driven by the full
length EF1A promoter in QC314 or by progressively truncated EF1A
promoters in the transient expression constructs QC314-1Y to
QC314-7Y.
[0072] FIG. 7 is the stable expression of the fluorescent protein
reporter gene ZS-YELLOW1 N1 in transgenic soybean plants containing
a single copy of the transgene construct QC318.
DETAILED DESCRIPTION OF THE INVENTION
[0073] The disclosures of all patents, patent applications, and
publications cited herein are incorporated by reference in their
entirety.
[0074] In the context of this disclosure, a number of terms shall
be utilized.
[0075] As used herein, a "GM-EF1A promoter" refers to a promoter of
the Glycine max EF-1-alpha polypeptide, which is a putative soybean
protein with significant homology to translation elongation factor
EF-1α genes identified in various species including soybean
(Aguilar et al., Plant Mol. Biol. 17, 351-360 (1991)).
[0076] The term "constitutive promoter" refers to promoters active
in all or most tissues of a plant at all or most developmental stages.
As with other promoters classified as "constitutive" (e.g.
ubiquitin), some variation in absolute levels of expression can
exist among different tissues or stages.
[0077] The terms "constitutive promoter" and "tissue-independent
promoter" are used interchangeably herein.
[0078] The promoter nucleotide sequences and methods disclosed
herein are useful in regulating constitutive expression of any
heterologous nucleotide sequences in a host plant in order to alter
the phenotype of a plant.
[0079] Various changes in phenotype are of interest including, but
not limited to, modifying the fatty acid composition in a plant,
altering the amino acid content of a plant, altering a plant's
pathogen defense mechanism, and the like. These results can be
achieved by providing expression of heterologous products or
increased expression of endogenous products in plants.
Alternatively, the results can be achieved by providing for a
reduction of expression of one or more endogenous products,
particularly enzymes or cofactors in the plant. These changes
result in a change in phenotype of the transformed plant.
[0080] Genes of interest are reflective of the commercial markets
and interests of those involved in the development of the crop.
Crops and markets of interest change, and as developing nations
open up world markets, new crops and technologies will emerge also.
In addition, as our understanding of agronomic characteristics and
traits such as yield and heterosis increases, the choice of genes
for transformation will change accordingly. General categories of
genes of interest include, but are not limited to, those genes
involved in information, such as zinc fingers, those involved in
communication, such as kinases, and those involved in housekeeping,
such as heat shock proteins. More specific categories of
transgenes, for example, include, but are not limited to, genes
encoding important traits for agronomics, insect resistance,
disease resistance, herbicide resistance, sterility, grain or seed
characteristics, and commercial products. Genes of interest
include, generally, those involved in oil, starch, carbohydrate, or
nutrient metabolism as well as those affecting seed size, plant
development, plant growth regulation, and yield improvement. Plant
development and growth regulation also refer to the development and
growth regulation of various parts of a plant, such as the flower,
seed, root, leaf and shoot.
[0081] Other commercially desirable traits are conferred by genes and
proteins providing cold, heat, salt, and drought resistance.
[0082] Disease and/or insect resistance genes may encode resistance
to pests that cause great yield drag, such as, for example,
Anthracnose, Soybean Mosaic Virus, Soybean Cyst Nematode, Root-Knot
Nematode, Brown leaf spot, Downy Mildew, Purple Seed Stain, Seed
Decay and Seedling Diseases caused commonly by the fungi Pythium
sp., Phytophthora sp., Rhizoctonia sp., and Diaporthe sp., and Bacterial
Blight caused by the bacterium Pseudomonas syringae pv. glycinea.
Genes conferring insect resistance include, for example, Bacillus
thuringiensis toxic protein genes (U.S. Pat. Nos. 5,366,892;
5,747,450; 5,737,514; 5,723,756; 5,593,881; and Geiser et al. (1986)
Gene 48:109); lectins (Van Damme et al. (1994) Plant Mol. Biol.
24:825); and the like.
[0083] Herbicide resistance traits may include genes coding for
resistance to herbicides that act to inhibit the action of
acetolactate synthase (ALS), in particular the sulfonylurea-type
herbicides (e.g., the acetolactate synthase ALS gene containing
mutations leading to such resistance, in particular the S4 and/or
Hra mutations). The ALS-gene mutants encode resistance to the
herbicide chlorsulfuron. GAT is an N-acetyltransferase from
Bacillus licheniformis that was optimized by gene shuffling for
acetylation of the broad spectrum herbicide, glyphosate, forming
the basis of a novel mechanism of glyphosate tolerance in
transgenic plants (Castle et al. (2004) Science 304,
1151-1154).
[0084] Antibiotic resistance genes include, for example, neomycin
phosphotransferase (npt) and hygromycin phosphotransferase (hpt).
Two neomycin phosphotransferase genes are used in selection of
transformed organisms: the neomycin phosphotransferase I (nptI)
gene and the neomycin phosphotransferase II (nptII) gene. The
second one is more widely used. It was initially isolated from the
transposon Tn5 that was present in the bacterium strain Escherichia
coli K12. The gene codes for the aminoglycoside
3'-phosphotransferase (denoted aph(3')-II or NPTII) enzyme, which
inactivates by phosphorylation a range of aminoglycoside
antibiotics such as kanamycin, neomycin, geneticin and paromomycin.
NPTII is widely used as a selectable marker for plant
transformation. It is also used in gene expression and regulation
studies in different organisms in part because N-terminal fusions
can be constructed that retain enzyme activity. NPTII protein
activity can be detected by enzymatic assay. In other detection
methods, the modified substrates (the phosphorylated
antibiotics) are detected by thin-layer chromatography, dot-blot
analysis or polyacrylamide gel electrophoresis. Plants such as
maize, cotton, tobacco, Arabidopsis, flax, soybean and many others
have been successfully transformed with the nptII gene.
[0085] The hygromycin phosphotransferase (denoted hpt, hph or
aphIV) gene was originally derived from Escherichia coli. The gene
codes for hygromycin phosphotransferase (HPT), which detoxifies the
aminocyclitol antibiotic hygromycin B. A large number of plants
have been transformed with the hpt gene and hygromycin B has proved
very effective in the selection of a wide range of plants,
including monocotyledonous plants. Most plants, for instance cereals,
exhibit higher sensitivity to hygromycin B than to kanamycin. Likewise,
the hpt gene is used widely in selection of transformed mammalian
cells. The sequence of the hpt gene has been modified for its use
in plant transformation. Deletions and substitutions of amino acid
residues close to the carboxy (C)-terminus of the enzyme have
increased the level of resistance in certain plants, such as
tobacco. At the same time, the hydrophilic C-terminus of the enzyme
has been maintained and may be essential for the strong activity of
HPT. HPT activity can be checked using an enzymatic assay. A
non-destructive callus induction test can be used to verify
hygromycin resistance.
[0086] Genes involved in plant growth and development have been
identified in plants. One such gene, which is involved in cytokinin
biosynthesis, is isopentenyl transferase (IPT). Cytokinin plays a
critical role in plant growth and development by stimulating cell
division and cell differentiation (Sun et al. (2003), Plant
Physiol. 131: 167-176).
Calcium-dependent protein kinases (CDPK), a family of
serine-threonine kinases found primarily in the plant kingdom, are
likely to function as sensor molecules in calcium-mediated
signaling pathways. Calcium ions are important second messengers
during plant growth and development (Harper et al. Science 252,
951-954 (1993). Roberts et al. Curr Opin Cell Biol 5, 242-246
(1993). Roberts et al. Annu Rev Plant Mol Biol 43, 375-414
(1992)).
[0087] Nematode responsive protein (NRP) is produced by soybean
upon the infection of soybean cyst nematode. NRP has homology to a
taste-modifying glycoprotein miraculin and the NF34 protein
involved in tumor formation and hyper response induction. NRP is
believed to function as a defense-inducer in response to nematode
infection (Tenhaken et al. BMC Bioinformatics 6:169 (2005)).
[0088] The quality of seeds and grains is reflected in traits such
as levels and types of fatty acids or oils, saturated and
unsaturated, quality and quantity of essential amino acids, and
levels of carbohydrates. Therefore, commercial traits can also be
encoded on a gene or genes that could increase for example
methionine and cysteine, two sulfur-containing amino acids that are
present in low amounts in soybeans. Cystathionine gamma synthase
(CGS) and serine acetyl transferase (SAT) are proteins involved in
the synthesis of methionine and cysteine, respectively.
[0089] Other commercial traits can be encoded by genes that
increase, for example, monounsaturated fatty acids, such as oleic
acid, in oil
seeds. Soybean oil for example contains high levels of
polyunsaturated fatty acids and is more prone to oxidation than
oils with higher levels of monounsaturated and saturated fatty
acids. High oleic soybean seeds can be prepared by recombinant
manipulation of the activity of oleoyl 12-desaturase (Fad2). High
oleic soybean oil can be used in applications that require a high
degree of oxidative stability, such as cooking for a long period of
time at an elevated temperature.
[0090] Raffinose saccharides accumulate in significant quantities
in the edible portion of many economically significant crop
species, such as soybean (Glycine max L. Merrill), sugar beet (Beta
vulgaris), cotton (Gossypium hirsutum L.), canola (Brassica sp.)
and all of the major edible leguminous crops including beans
(Phaseolus sp.), chick pea (Cicer arietinum), cowpea (Vigna
unguiculata), mung bean (Vigna radiata), peas (Pisum sativum),
lentil (Lens culinaris) and lupine (Lupinus sp.). Although abundant
in many species, raffinose saccharides are an obstacle to the
efficient utilization of some economically important crop
species.
[0091] Downregulation of the expression of the enzymes involved in
raffinose saccharide synthesis, such as galactinol synthase for
example, would be a desirable trait.
[0092] In certain embodiments, the present invention contemplates
the transformation of a recipient cell with more than one
advantageous transgene. Two or more transgenes can be supplied in a
single transformation event using either distinct
transgene-encoding vectors, or using a single vector incorporating
two or more gene coding sequences. Any two or more transgenes of
any description, such as those conferring herbicide, insect,
disease (viral, bacterial, fungal, nematode) or drought resistance,
oil quantity and quality, or those increasing yield or nutritional
quality may be employed as desired.
[0093] An "isolated nucleic acid fragment" refers to a polymer of
ribonucleotides (RNA) or deoxyribonucleotides (DNA) that is single-
or double-stranded, optionally containing synthetic, non-natural or
altered nucleotide bases. An isolated nucleic acid fragment in the
form of DNA may be comprised of one or more segments of cDNA,
genomic DNA or synthetic DNA.
[0094] The terms "polynucleotide", "polynucleotide sequence",
"nucleic acid sequence", and "nucleic acid fragment"/"isolated
nucleic acid fragment" are used interchangeably herein. These terms
encompass nucleotide sequences and the like. A polynucleotide may
be a polymer of RNA or DNA that is single- or double-stranded, that
optionally contains synthetic, non-natural or altered nucleotide
bases. A polynucleotide in the form of a polymer of DNA may be
comprised of one or more segments of cDNA, genomic DNA, synthetic
DNA, or mixtures thereof. Nucleotides (usually found in their
5'-monophosphate form) are referred to by a single letter
designation as follows: "A" for adenylate or deoxyadenylate (for
RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate,
"G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for
deoxythymidylate, "R" for purines (A or G), "Y" for pyrimidines (C
or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and
"N" for any nucleotide.
[0095] A "heterologous nucleic acid fragment" refers to a sequence
that is not naturally occurring with the plant promoter sequence of
the invention. While this nucleotide sequence is heterologous to
the promoter sequence, it may be homologous, or native, or
heterologous, or foreign, to the plant host. However, it is
recognized that the instant promoters may be used with their native
coding sequences to increase or decrease expression resulting in a
change in phenotype in the transformed seed.
[0096] The terms "subfragment that is functionally equivalent" and
"functionally equivalent subfragment" are used interchangeably
herein. These terms refer to a portion or subsequence of an
isolated nucleic acid fragment in which the ability to alter gene
expression or produce a certain phenotype is retained whether or
not the fragment or subfragment encodes an active enzyme. For
example, the fragment or subfragment can be used in the design of
chimeric genes to produce the desired phenotype in a transformed
plant. Chimeric genes can be designed for use in co-suppression or
antisense by linking a nucleic acid fragment or subfragment
thereof, whether or not it encodes an active enzyme, in the
appropriate orientation relative to a plant promoter sequence.
[0097] The terms "substantially similar" and "corresponding
substantially" as used herein refer to nucleic acid fragments
wherein changes in one or more nucleotide bases do not affect the
ability of the nucleic acid fragment to mediate gene expression or
produce a certain phenotype. These terms also refer to
modifications of the nucleic acid fragments of the instant
invention such as deletion or insertion of one or more nucleotides
that do not substantially alter the functional properties of the
resulting nucleic acid fragment relative to the initial, unmodified
fragment. It is therefore understood, as those skilled in the art
will appreciate, that the invention encompasses more than the
specific exemplary sequences.
[0098] Moreover, the skilled artisan recognizes that substantially
similar nucleic acid sequences encompassed by this invention are
also defined by their ability to hybridize, under moderately
stringent conditions (for example, 0.5×SSC, 0.1% SDS,
60° C.) with the sequences exemplified herein, or to any
portion of the nucleotide sequences reported herein and which are
functionally equivalent to the promoter of the invention. Estimates
of such homology are provided by either DNA-DNA or DNA-RNA
hybridization under conditions of stringency as is well understood
by those skilled in the art (Hames and Higgins, Eds.; In Nucleic
Acid Hybridisation; IRL Press: Oxford, U.K., 1985). Stringency
conditions can be adjusted to screen for moderately similar
fragments, such as homologous sequences from distantly related
organisms, to highly similar fragments, such as genes that
duplicate functional enzymes from closely related organisms.
Post-hybridization washes partially determine stringency
conditions. One set of conditions uses a series of washes starting
with 6×SSC, 0.5% SDS at room temperature for 15 min, then
repeated with 2×SSC, 0.5% SDS at 45° C. for 30 min,
and then repeated twice with 0.2×SSC, 0.5% SDS at 50°
C. for 30 min. Another set of stringent conditions uses higher
temperatures in which the washes are identical to those above
except that the temperature of the final two 30 min washes in
0.2×SSC, 0.5% SDS is increased to 60° C. Another set
of highly stringent conditions uses two final washes in
0.1×SSC, 0.1% SDS at 65° C.
[0099] Preferred substantially similar nucleic acid sequences
encompassed by this invention are those sequences that are 80%
identical to the nucleic acid fragments reported herein or which
are 80% identical to any portion of the nucleotide sequences
reported herein. More preferred are nucleic acid fragments which
are 90% identical to the nucleic acid sequences reported herein, or
which are 90% identical to any portion of the nucleotide sequences
reported herein. Most preferred are nucleic acid fragments which
are 95% identical to the nucleic acid sequences reported herein, or
which are 95% identical to any portion of the nucleotide sequences
reported herein. It is well understood by one skilled in the art
that many levels of sequence identity are useful in identifying
related polynucleotide sequences. Useful examples of percent
identities are those listed above, or also preferred is any integer
percentage from 80% to 100%, such as 81%, 82%, 83%, 84%, 85%, 86%,
87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% and
99%.
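The identity levels named above can be illustrated with a minimal
sketch. It assumes two sequences that have already been aligned to
equal length and it counts identical positions only; it is not the
Clustal procedure referred to elsewhere in this description. Reading
"80% identical" as "at least 80% identical" is also an assumption, and
the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of positions that are identical in two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(x == y for x, y in zip(aligned_a.upper(), aligned_b.upper()))
    return 100.0 * same / len(aligned_a)


def preference_tier(pct):
    """Map a percent identity onto the tiers named in the text (assumed >=)."""
    if pct >= 95:
        return "most preferred"
    if pct >= 90:
        return "more preferred"
    if pct >= 80:
        return "preferred"
    return "below the stated range"
```

Two ten-base sequences that differ at a single position score 90.0
and fall in the "more preferred" tier under these assumptions.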
[0100] A "substantially homologous sequence" refers to variants of
the disclosed sequences such as those that result from
site-directed mutagenesis, as well as synthetically derived
sequences. A substantially homologous sequence of the present
invention also refers to those fragments of a particular promoter
nucleotide sequence disclosed herein that operate to promote the
constitutive expression of an operably linked heterologous nucleic
acid fragment. These promoter fragments will comprise at least
about 20 contiguous nucleotides, preferably at least about 50
contiguous nucleotides, more preferably at least about 75
contiguous nucleotides, even more preferably at least about 100
contiguous nucleotides of the particular promoter nucleotide
sequence disclosed herein. The nucleotides of such fragments will
usually comprise the TATA recognition sequence of the particular
promoter sequence. Such fragments may be obtained by use of
restriction enzymes to cleave the naturally occurring promoter
nucleotide sequences disclosed herein; by synthesizing a nucleotide
sequence from the naturally occurring promoter DNA sequence; or may
be obtained through the use of PCR technology. See particularly,
Mullis et al., Methods Enzymol. 155:335-350 (1987), and Higuchi, R.
In PCR Technology: Principles and Applications for DNA
Amplifications; Erlich, H. A., Ed.; Stockton Press Inc.: New York,
1989. Again, variants of these promoter fragments, such as those
resulting from site-directed mutagenesis, are encompassed by the
compositions of the present invention.
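A minimal sketch of the fragment criteria described above follows. The
toy sequence is invented for illustration and is not SEQ ID NO:1; the
"TATAAA" motif and the hard cutoff of 20 nucleotides (the text says
"at least about 20") are assumptions, as are all names used.

```python
# Invented toy sequence for illustration only; NOT a sequence from the
# disclosure.
TOY_PROMOTER = "GGCC" * 5 + "TATAAA" + "ACGT" * 5


def describe_fragment(promoter, fragment, min_len=20, tata_motif="TATAAA"):
    """Report which of the stated fragment properties a candidate satisfies."""
    promoter, fragment = promoter.upper(), fragment.upper()
    return {
        # the fragment must be a contiguous stretch of the promoter
        "contiguous": fragment in promoter,
        # assumed hard cutoff standing in for "at least about 20"
        "long_enough": len(fragment) >= min_len,
        # fragments "usually" carry the TATA sequence, so this is reported
        # separately rather than required
        "has_tata": tata_motif in fragment,
    }
```

Calling describe_fragment(TOY_PROMOTER, TOY_PROMOTER[10:40]) reports
all three properties as true, while a 10-nucleotide prefix is
contiguous but too short and lacks the motif.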
[0101] "Codon degeneracy" refers to divergence in the genetic code
permitting variation of the nucleotide sequence without affecting
the amino acid sequence of an encoded polypeptide. Accordingly, the
instant invention relates to any nucleic acid fragment comprising a
nucleotide sequence that encodes all or a substantial portion of
the amino acid sequences set forth herein. The skilled artisan is
well aware of the "codon-bias" exhibited by a specific host cell in
usage of nucleotide codons to specify a given amino acid.
Therefore, when synthesizing a nucleic acid fragment for improved
expression in a host cell, it is desirable to design the nucleic
acid fragment such that its frequency of codon usage approaches the
frequency of preferred codon usage of the host cell.
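As a hedged illustration of the codon-usage comparison mentioned
above, the following sketch tallies how often each codon occurs in a
coding sequence. The function name is hypothetical, and a host's
preferred codon table would have to come from real usage data, which
is not supplied here.

```python
from collections import Counter


def codon_frequencies(coding_sequence):
    """Return the relative frequency of each codon in an in-frame sequence."""
    seq = coding_sequence.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}
```

The resulting frequencies could then be compared with a host table to
see how closely a designed fragment approaches the host's preferred
usage.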
[0102] Sequence alignments and percent similarity calculations may
be determined using the Megalign program of the LASARGENE
bioinformatics computing suite (DNASTAR Inc., Madison, Wis.) or
using the AlignX program of the Vector NTI bioinformatics computing
suite (Invitrogen, Carlsbad, Calif.). Multiple alignment of the
sequences is performed using the Clustal method of alignment
(Higgins and Sharp, CABIOS 5:151-153 (1989)) with the default
parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default
parameters for pairwise alignments and calculation of percent
identity of protein sequences using the Clustal method are
KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For
nucleic acids these parameters are GAP PENALTY=10, GAP LENGTH
PENALTY=10, KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS
SAVED=4. A "substantial portion" of an amino acid or nucleotide
sequence comprises enough of the amino acid sequence of a
polypeptide or the nucleotide sequence of a gene to afford putative
identification of that polypeptide or gene, either by manual
evaluation of the sequence by one skilled in the art, or by
computer-automated sequence comparison and identification using
algorithms such as BLAST (Altschul, S. F. et al., J. Mol. Biol.
215:403-410 (1993)) and Gapped Blast (Altschul, S. F. et al.,
Nucleic Acids Res. 25:3389-3402 (1997)). BLASTN refers to a BLAST
program that compares a nucleotide query sequence against a
nucleotide sequence database.
[0103] "Gene" refers to a nucleic acid fragment that expresses a
specific protein, including regulatory sequences preceding (5'
non-coding sequences) and following (3' non-coding sequences) the
coding sequence. "Native gene" refers to a gene as found in nature
with its own regulatory sequences. "Chimeric gene" or "recombinant
expression construct", which are used interchangeably, refers to
any gene that is not a native gene, comprising regulatory and
coding sequences that are not found together in nature.
Accordingly, a chimeric gene may comprise regulatory sequences and
coding sequences that are derived from different sources, or
regulatory sequences and coding sequences derived from the same
source, but arranged in a manner different than that found in
nature. "Endogenous gene" refers to a native gene in its natural
location in the genome of an organism. A "foreign" gene refers to a
gene not normally found in the host organism, but that is
introduced into the host organism by gene transfer. Foreign genes
can comprise native genes inserted into a non-native organism, or
chimeric genes. A "transgene" is a gene that has been introduced
into the genome by a transformation procedure.
[0104] "Coding sequence" refers to a DNA sequence which codes for a
specific amino acid sequence. "Regulatory sequences" refer to
nucleotide sequences located upstream (5' non-coding sequences),
within, or downstream (3' non-coding sequences) of a coding
sequence, and which influence the transcription, RNA processing or
stability, or translation of the associated coding sequence.
Regulatory sequences may include, but are not limited to,
promoters, translation leader sequences, introns, and
polyadenylation recognition sequences.
[0105] "Promoter" refers to a DNA sequence capable of controlling
the expression of a coding sequence or functional RNA. Functional
RNA includes, but is not limited to, transfer RNA (tRNA) and
ribosomal RNA (rRNA). The promoter sequence consists of proximal
and more distal upstream elements, the latter elements often
referred to as enhancers. Accordingly, an "enhancer" is a DNA
sequence which can stimulate promoter activity and may be an innate
element of the promoter or a heterologous element inserted to
enhance the level or tissue-specificity of a promoter. Promoters
may be derived in their entirety from a native gene, or be composed
of different elements derived from different promoters found in
nature, or even comprise synthetic DNA segments. It is understood
by those skilled in the art that different promoters may direct the
expression of a gene in different tissues or cell types, or at
different stages of development, or in response to different
environmental conditions. Promoters which cause a gene to be
expressed in most cell types at most times are commonly referred to
as "constitutive promoters". New promoters of various types useful
in plant cells are constantly being discovered; numerous examples
may be found in the compilation by Okamuro and Goldberg
(Biochemistry of Plants 15:1-82 (1989)). It is further recognized
that since in most cases the exact boundaries of regulatory
sequences have not been completely defined, DNA fragments of some
variation may have identical promoter activity. An "intron" is an
intervening sequence in a gene that is transcribed into RNA but is
then excised in the process of generating the mature mRNA. The term
is also used for the excised RNA sequences. An "exon" is a portion
of the sequence of a gene that is transcribed and is found in the
mature messenger RNA derived from the gene, but is not necessarily
a part of the sequence that encodes the final gene product.
[0106] Among the most commonly used promoters are the nopaline
synthase (NOS) promoter (Ebert et al., Proc. Natl. Acad. Sci.
U.S.A. 84:5745-5749 (1987)), the octopine synthase (OCS) promoter,
caulimovirus promoters such as the cauliflower mosaic virus (CaMV)
19S promoter (Lawton et al., Plant Mol. Biol. 9:315-324 (1987)),
the CaMV 35S promoter (Odell et al., Nature 313:810-812 (1985)),
and the figwort mosaic virus 35S promoter (Sanger et al., Plant
Mol. Biol. 14:433-43 (1990)), the light inducible promoter from the
small subunit of rubisco, the Adh promoter (Walker et al., Proc.
Natl. Acad. Sci. U.S.A. 84:6624-6628 (1987)), the sucrose synthase
promoter (Yang et al., Proc. Natl. Acad. Sci. U.S.A. 87:4144-4148
(1990)), the R gene complex promoter (Chandler et al., Plant Cell
1:1175-1183 (1989)), the chlorophyll a/b binding protein gene
promoter, etc. Other commonly used promoters are the promoters for
the potato tuber ADPGPP genes, the sucrose synthase promoter, the
granule bound starch synthase promoter, the glutelin gene promoter,
the maize waxy promoter, Brittle gene promoter, and Shrunken 2
promoter, the acid chitinase gene promoter, and the zein gene
promoters (15 kD, 16 kD, 19 kD, 22 kD, and 27 kD; Pedersen et al.,
Cell 29:1015-1026 (1982)). A plethora of promoters is described in
PCT Publication No. WO 00/18963 published on Apr. 6, 2000, the
disclosure of which is hereby incorporated by reference.
[0107] The "translation leader sequence" refers to a DNA sequence
located between the promoter sequence of a gene and the coding
sequence. The translation leader sequence is present in the fully
processed mRNA upstream of the translation start sequence. The
translation leader sequence may affect processing of the primary
transcript to mRNA, mRNA stability or translation efficiency.
Examples of translation leader sequences have been described
(Turner, R. and Foster, G. D., Molecular Biotechnology 3:225
(1995)).
[0108] The "3' non-coding sequences" refer to DNA sequences located
downstream of a coding sequence and include polyadenylation
recognition sequences and other sequences encoding regulatory
signals capable of affecting mRNA processing or gene expression.
The polyadenylation signal is usually characterized by affecting
the addition of polyadenylic acid tracts to the 3' end of the mRNA
precursor. The use of different 3' non-coding sequences is
exemplified by Ingelbrecht et al., Plant Cell 1:671-680 (1989).
[0109] "RNA transcript" refers to a product resulting from RNA
polymerase-catalyzed transcription of a DNA sequence. When an RNA
transcript is a perfect complementary copy of a DNA sequence, it is
referred to as a primary transcript; when it is an RNA sequence
derived from posttranscriptional processing of a primary transcript,
it is referred to as a mature RNA. "Messenger RNA" ("mRNA") refers
to RNA that is without introns and that can be translated into
protein by the cell. "cDNA" refers to a DNA that is complementary
to and synthesized from an mRNA template using the enzyme reverse
transcriptase. The cDNA can be single-stranded or converted into
the double-stranded form using the Klenow fragment of DNA polymerase
I. "Sense" RNA refers to an RNA transcript that includes mRNA and so
can be translated into protein within a cell or in vitro.
"Antisense RNA" refers to an RNA transcript that is complementary to
all or part of a target primary transcript or mRNA and that blocks
expression or transcript accumulation of a target gene (U.S. Pat.
No. 5,107,065). The complementarity of an antisense RNA may be with
any part of the specific gene transcript, i.e. at the 5' non-coding
sequence, 3' non-coding sequence, introns, or the coding sequence.
"Functional RNA" refers to antisense RNA, ribozyme RNA, or other
RNA that may not be translated but yet has an effect on cellular
processes.
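The complementarity that defines an antisense RNA can be sketched as
a reverse complement over the four standard RNA bases. This is an
illustration under that assumption only; the function name is
hypothetical.

```python
# Watson-Crick pairing for the four standard RNA bases.
_RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target_rna):
    """Return the RNA strand complementary to `target_rna`, written 5' to 3'."""
    return "".join(_RNA_COMPLEMENT[base] for base in reversed(target_rna.upper()))
```

Applying the function twice returns the original sequence, which is a
quick check that the two strands are mutually complementary.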
[0110] The term "operably linked" refers to the association of
nucleic acid sequences on a single nucleic acid fragment so that
the function of one is affected by the other. For example, a
promoter is operably linked with a coding sequence when it is
capable of affecting the expression of that coding sequence (i.e.,
that the coding sequence is under the transcriptional control of
the promoter). Coding sequences can be operably linked to
regulatory sequences in sense or antisense orientation.
[0111] The term "expression", as used herein, refers to the
production of a functional end-product, e.g., an mRNA or a protein
(precursor or mature).
[0112] The term "expression cassette" as used herein, refers to a
discrete nucleic acid fragment into which a nucleic acid sequence
or fragment can be moved.
[0113] Expression or overexpression of a gene involves
transcription of the gene and translation of the mRNA into a
precursor or mature protein. "Antisense inhibition" refers to the
production of antisense RNA transcripts capable of suppressing the
expression of the target protein. "Overexpression" refers to the
production of a gene product in transgenic organisms that exceeds
levels of production in normal or non-transformed organisms.
"Co-suppression" refers to the production of sense RNA transcripts
capable of suppressing the expression or transcript accumulation of
identical or substantially similar foreign or endogenous genes
(U.S. Pat. No. 5,231,020). The mechanism of co-suppression may be
at the DNA level (such as DNA methylation), at the transcriptional
level, or at posttranscriptional level.
[0114] Co-suppression constructs in plants previously have been
designed by focusing on overexpression of a nucleic acid sequence
having homology to an endogenous mRNA, in the sense orientation,
which results in the reduction of all RNA having homology to the
overexpressed sequence (see Vaucheret et al., Plant J. 16:651-659
(1998); and Gura, Nature 404:804-808 (2000)). The overall
efficiency of this phenomenon is low, and the extent of the RNA
reduction is widely variable. Recent work has described the use of
"hairpin" structures that incorporate all, or part, of an mRNA
encoding sequence in a complementary orientation that results in a
potential "stem-loop" structure for the expressed RNA (PCT
Publication No. WO 99/53050, which published on Oct. 21, 1999; and
PCT Publication No. WO 02/00904, which published on Jan. 3, 2002).
This increases the frequency of co-suppression in the recovered
transgenic plants. Another variation describes the use of plant
viral sequences to direct the suppression, or "silencing", of
proximal mRNA encoding sequences (PCT Publication No. WO 98/36083,
which published on Aug. 20, 1998). Genetic and molecular evidence
has been obtained suggesting that dsRNA-mediated mRNA cleavage may
be the conserved mechanism underlying these gene silencing
phenomena (Elmayan et al., Plant Cell 10:1747-1757 (1998); Galun,
In Vitro Cell. Dev. Biol. Plant 41(2):113-123 (2005); Pickford et
al, Cell. Mol. Life Sci. 60(5):871-882 (2003)).
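The "hairpin" arrangement described above, in which part of an
mRNA-encoding sequence is repeated in a complementary orientation,
can be sketched as a sense arm, a spacer, and the reverse complement
of the sense arm. The spacer and all names here are hypothetical;
this is not a construct taken from the cited publications.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def hairpin_insert(sense_arm, loop):
    """Join a sense arm, a loop spacer and the inverted arm into one insert.

    When transcribed, the two arms can base-pair to form the stem and the
    spacer forms the loop of a potential stem-loop structure.
    """
    return sense_arm.upper() + loop.upper() + reverse_complement(sense_arm)
```

With a four-base arm "ATGC" and a spacer "TTTT", the insert reads
"ATGCTTTTGCAT", whose two ends are complementary to each other.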
[0115] As stated herein, "suppression" refers to a reduction of the
level of enzyme activity or protein functionality (e.g., a
phenotype associated with a protein) detectable in a transgenic
plant when compared to the level of enzyme activity or protein
functionality detectable in a non-transgenic or wild type plant
with the native enzyme or protein. The level of enzyme activity in
a plant with the native enzyme is referred to herein as "wild type"
activity. The level of protein functionality in a plant with the
native protein is referred to herein as "wild type" functionality.
The term "suppression" includes lower, reduce, decline, decrease,
inhibit, eliminate and prevent. This reduction may be due to a
decrease in translation of the native mRNA into an active enzyme or
functional protein. It may also be due to the transcription of the
native DNA into decreased amounts of mRNA and/or to rapid
degradation of the native mRNA. The term "native enzyme" refers to
an enzyme that is produced naturally in a non-transgenic or wild
type cell. The terms "non-transgenic" and "wild type" are used
interchangeably herein.
[0116] "Altering expression" refers to the production of gene
product(s) in transgenic organisms in amounts or proportions that
differ significantly from the amount of the gene product(s)
produced by the corresponding wild-type organisms (i.e., expression
is increased or decreased).
[0117] "Transformation" refers to the transfer of a nucleic acid
fragment into the genome of a host organism, resulting in
genetically stable inheritance. Host organisms containing the
transformed nucleic acid fragments are referred to as "transgenic"
organisms. The preferred method of soybean cell transformation is
the use of particle-accelerated or "gene gun" transformation
technology (Klein, T., Nature (London) 327:70-73 (1987); U.S. Pat.
No. 4,945,050).
[0118] "Transient expression" refers to the temporary expression of
genes, often reporter genes such as β-glucuronidase (GUS) or the
fluorescent protein genes GFP, ZS-YELLOW1 N1, AM-CYAN1 and DS-RED, in
selected cell types of the host organism into which the transgene is
introduced temporarily by a transformation
method. The transformed materials of the host organism are
subsequently discarded after the transient gene expression
assay.
[0119] Standard recombinant DNA and molecular cloning techniques
used herein are well known in the art and are described more fully
in Sambrook, J. et al., In Molecular Cloning: A Laboratory Manual;
2nd ed.; Cold Spring Harbor Laboratory Press: Cold Spring
Harbor, N.Y., 1989 (hereinafter "Sambrook et al., 1989") or
Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman,
J. G., Smith, J. A. and Struhl, K., Eds.; In Current Protocols in
Molecular Biology; John Wiley and Sons: New York, 1990 (hereinafter
"Ausubel et al., 1990").
[0120] "PCR" or "Polymerase Chain Reaction" is a technique for the
synthesis of large quantities of specific DNA segments, consisting
of a series of repetitive cycles (Perkin Elmer Cetus Instruments,
Norwalk, Conn.). Typically, the double stranded DNA is heat
denatured, the two primers complementary to the 3' boundaries of
the target segment are annealed at low temperature and then
extended at an intermediate temperature. One set of these three
consecutive steps comprises a cycle.
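The effect of repeating the three-step cycle can be sketched with the
usual idealized model in which each cycle multiplies the number of
target copies by (1 + efficiency). Perfect doubling (efficiency of
1.0) is an assumption of the model rather than a statement from this
description, and the names are hypothetical.

```python
# The three consecutive steps that make up one cycle, as described above.
CYCLE_STEPS = ("denature", "anneal", "extend")


def ideal_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after `cycles` rounds of amplification."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles
```

Under perfect doubling, a single template yields 2 to the power of 30
copies after 30 cycles; real reactions plateau well before the ideal
figure.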
[0121] A "recombinant expression construct" is a plasmid vector or
a fragment thereof comprising the instant soybean constitutive
promoter. The choice of plasmid vector is dependent upon the method
that will be used to transform host plants. The skilled artisan is
well aware of the genetic elements that must be present on the
plasmid vector in order to successfully transform, select and
propagate host cells containing the chimeric gene. The skilled
artisan will also recognize that different independent
transformation events will result in different levels and patterns
of expression (Jones et al., EMBO J. 4:2411-2418 (1985); De Almeida
et al., Mol. Gen. Genetics 218:78-86 (1989)), and thus that
multiple events must be screened in order to obtain lines
displaying the desired expression level and pattern. Such screening
may be accomplished by PCR and Southern analysis of DNA, RT-PCR and
Northern analysis of mRNA expression, Western analysis of protein
expression, or phenotypic analysis.
[0122] The translation elongation factor EF1alpha belongs to the
GTP-binding elongation factor family and promotes the GTP-dependent
binding of aminoacyl-tRNA to the A-site of ribosomes during protein
biosynthesis. EF1 alpha genes and in some cases their promoters
have been isolated from different plants including soybean (Aguilar
et al, Plant Mol. Biol. 17:351-360 (1991)), Arabidopsis (Axelos et
al, Mol. Gen. Genetics. 219:106-112 (1989)), tobacco (Aida et al,
Japan Agric. Res. Quarterly 39(4):269-274 (2005)), potato (Nakane
et al, J. Gen. Plant Pathol. 69:378-384 (2003)), tomato (Shewmaker
et al, Nucleic Acids Res. 18(14):4276 (1990)), lily (Wang et al, J.
Biol. Chem 274(17):12001-12008 (1999)), carrot (Kawahara et al,
Eur. J. Biochem. 209(1):157-162 (1992)), and other plant species.
The reported EF1alpha genes are abundant especially in fast growing
plant tissues in most cases. In at least one case, the tobacco
EF1alpha gene promoter has been reported to express a reporter
transgene more efficiently than the 35S promoter of cauliflower
mosaic virus (Aida et al, Japan Agric. Res. Quarterly 39(4):269-274
(2005)). It is demonstrated herein that the soybean EF1A gene
promoter can, in fact, be used as a constitutive promoter to drive
efficient expression of transgenes, and that such promoter can be
isolated and used by one skilled in the art.
[0123] This invention concerns an isolated nucleic acid fragment
comprising a constitutive elongation factor EF1alpha gene promoter
EF1A. This invention also concerns an isolated nucleic acid
fragment comprising a promoter wherein said promoter consists
essentially of the nucleotide sequence set forth in SEQ ID NO:1, or
said promoter consists essentially of a fragment that is
substantially similar and functionally equivalent to the nucleotide
sequence set forth in SEQ ID NO:1. A nucleic acid fragment that is
functionally equivalent to the instant EF1A promoter is any nucleic
acid fragment that is capable of controlling the expression of a
coding sequence or functional RNA in a similar manner to the EF1A
promoter. The expression patterns of EF1A gene and its promoter are
set forth in Examples 1, 2, 7, and 8.
[0124] The promoter activity of the soybean genomic DNA fragment
upstream of the EF1A protein coding sequence SEQ ID NO:1 was
assessed by linking the fragment to a yellow fluorescence reporter
gene, ZS-YELLOW1 N1 (YFP) (Matz et al, Nat. Biotechnol. 17:969-973
(1999)), transforming the promoter::YFP expression cassette into
soybean, and analyzing YFP expression in various cell types of the
transgenic plants (see Examples 7 and 8). YFP expression was
detected in all parts of the transgenic plants though stronger
expression was detected in fast growing tissues such as developing
embryos and pods. These results indicated that the nucleic acid
fragment contained a constitutive promoter.
[0125] It is clear from the disclosure set forth herein that one of
ordinary skill in the art could perform the following
procedure:
[0126] 1) operably linking the nucleic acid fragment containing the
EF1A promoter sequence to a suitable reporter gene; there are a
variety of reporter genes that are well known to those skilled in
the art, including the bacterial GUS gene, the firefly luciferase
gene, and the cyan, green, red, and yellow fluorescent protein
genes; any gene for which an easy and reliable assay is available
can serve as the reporter gene.
[0127] 2) transforming a chimeric EF1A promoter::reporter gene
expression cassette into an appropriate plant for expression of the
promoter. There are a variety of appropriate plants which can be
used as a host for transformation that are well known to those
skilled in the art, including the dicots, Arabidopsis, tobacco,
soybean, oilseed rape, peanut, sunflower, safflower, cotton,
tomato, potato, cocoa and the monocots, corn, wheat, rice, barley
and palm.
[0128] 3) testing for expression of the EF1A promoter in various
cell types of transgenic plant tissues, e.g., leaves, roots,
flowers, seeds, transformed with the chimeric EF1A
promoter::reporter gene expression cassette by assaying for
expression of the reporter gene product.
[0129] In another aspect, this invention concerns a recombinant DNA
construct comprising at least one heterologous nucleic acid
fragment operably linked to any promoter, or combination of
promoter elements, of the present invention. Recombinant DNA
constructs can be made by operably linking the EF1A promoter
nucleic acid fragment of the invention, or a fragment that is
substantially similar and functionally equivalent to any portion of
the nucleotide sequence set forth in SEQ ID NOs:1, 2, 3, 4, 5, 6,
or 7 to a heterologous nucleic acid fragment. Any heterologous
nucleic acid fragment can be used to practice the invention. The
selection will depend upon the desired application or phenotype to
be achieved. The various nucleic acid sequences can be manipulated
so as to provide for the nucleic acid sequences in the proper
orientation. It is believed that various combinations of promoter
elements as described herein may be useful in practicing the
present invention.
[0130] In another aspect, this invention concerns a recombinant DNA
construct comprising at least one acetolactate synthase (ALS)
nucleic acid fragment operably linked to the EF1A promoter, or
combination of promoter elements, of the present invention. The
acetolactate synthase gene is involved in the biosynthesis of
branched chain amino acids in plants and is the site of action of
several herbicides including sulfonylureas. Expression of a mutated
acetolactate synthase gene encoding a protein that can no longer
bind the herbicide will enable the transgenic plants to be
resistant to the herbicide (U.S. Pat. No. 5,605,011, U.S. Pat. No.
5,378,824). The mutated acetolactate synthase gene is also widely
used in plant transformation to select transgenic plants.
[0131] In another embodiment, this invention concerns host cells
comprising either the recombinant DNA constructs of the invention
as described herein or isolated polynucleotides of the invention as
described herein. Examples of host cells which can be used to
practice the invention include, but are not limited to, yeast,
bacteria, and plants.
[0132] Plasmid vectors comprising the instant recombinant
expression construct can be constructed. The choice of plasmid
vector is dependent upon the method that will be used to transform
host cells. The skilled artisan is well aware of the genetic
elements that must be present on the plasmid vector in order to
successfully transform, select and propagate host cells containing
the chimeric gene.
[0133] Methods for transforming dicots, primarily by use of
Agrobacterium tumefaciens, and obtaining transgenic plants have
been published, among others, for cotton (U.S. Pat. No. 5,004,863,
U.S. Pat. No. 5,159,135); soybean (U.S. Pat. No. 5,569,834, U.S.
Pat. No. 5,416,011); Brassica (U.S. Pat. No. 5,463,174); peanut
(Cheng et al., Plant Cell Rep. 15:653-657 (1996), McKently et al.,
Plant Cell Rep. 14:699-703 (1995)); papaya (Ling et al.,
Bio/technology 9:752-758 (1991)); and pea (Grant et al., Plant Cell
Rep. 15:254-258 (1995)). For a review of other commonly used
methods of plant transformation see Newell, C. A., Mol. Biotechnol.
16:53-65 (2000). One of these methods of transformation uses
Agrobacterium rhizogenes (Tepfer, M. and Casse-Delbart, F.,
Microbiol. Sci. 4:24-28 (1987)). Transformation of soybeans using
direct delivery of DNA has been published using PEG fusion (PCT
Publication No. WO 92/17598), electroporation (Chowrira et al.,
Mol. Biotechnol. 3:17-23 (1995); Christou et al., Proc. Natl. Acad.
Sci. U.S.A. 84:3962-3966 (1987)), microinjection, or particle
bombardment (McCabe et al., Bio/Technology 6:923 (1988); Christou
et al., Plant Physiol. 87:671-674 (1988)).
[0134] There are a variety of methods for the regeneration of
plants from plant tissues. The particular method of regeneration
will depend on the starting plant tissue and the particular plant
species to be regenerated. The regeneration, development and
cultivation of plants from single plant protoplast transformants or
from various transformed explants is well known in the art
(Weissbach and Weissbach, Eds.; In Methods for Plant Molecular
Biology; Academic Press, Inc.: San Diego, Calif., 1988). This
regeneration and growth process typically includes the steps of
selection of transformed cells, culturing those individualized
cells through the usual stages of embryonic development or through
the rooted plantlet stage. Transgenic embryos and seeds are
similarly regenerated. The resulting transgenic rooted shoots are
thereafter planted in an appropriate plant growth medium such as
soil. Preferably, the regenerated plants are self-pollinated to
provide homozygous transgenic plants. Otherwise, pollen obtained
from the regenerated plants is crossed to seed-grown plants of
agronomically important lines. Conversely, pollen from plants of
these important lines is used to pollinate regenerated plants. A
transgenic plant of the present invention containing a desired
polypeptide is cultivated using methods well known to one skilled
in the art.
[0135] In addition to the above discussed procedures, practitioners
are familiar with the standard resource materials which describe
specific conditions and procedures for the construction,
manipulation and isolation of macromolecules (e.g., DNA molecules,
plasmids, etc.), generation of recombinant DNA fragments and
recombinant expression constructs and the screening and isolating
of clones, (see for example, Sambrook, J. et al., In Molecular
Cloning: A Laboratory Manual; 2nd ed.; Cold Spring Harbor
Laboratory Press: Cold Spring Harbor, N.Y., 1989; Maliga et al., In
Methods in Plant Molecular Biology; Cold Spring Harbor Press, 1995;
Birren et al., In Genome Analysis: Detecting Genes, 1; Cold Spring
Harbor: New York, 1998; Birren et al., In Genome Analysis:
Analyzing DNA, 2; Cold Spring Harbor: New York, 1998; Clark, Ed.,
In Plant Molecular Biology: A Laboratory Manual; Springer: New
York, 1997).
[0136] The skilled artisan will also recognize that different
independent transformation events will result in different levels
and patterns of expression of the chimeric genes (Jones et al.,
EMBO J. 4:2411-2418 (1985); De Almeida et al., Mol. Gen. Genetics
218:78-86 (1989)). Thus, multiple events must be screened in order
to obtain lines displaying the desired expression level and
pattern. Such screening may be accomplished by Northern analysis of
mRNA expression, Western analysis of protein expression, or
phenotypic analysis. Also of interest are seeds obtained from
transformed plants displaying the desired gene expression
profile.
[0137] The level of activity of the EF1A promoter is comparable to
that of many known strong promoters, such as the CaMV 35S promoter
(Atanassova et al., Plant Mol. Biol. 37:275-285 (1998); Battraw and
Hall, Plant Mol. Biol. 15:527-538 (1990); Holtorf et al., Plant
Mol. Biol. 29:637-646 (1995); Jefferson et al., EMBO J. 6:3901-3907
(1987); Wilmink et al., Plant Mol. Biol. 28:949-955 (1995)), the
Arabidopsis oleosin promoters (Plant et al., Plant Mol. Biol.
25:193-205 (1994); Li, Texas A&M University Ph.D. dissertation,
pp. 107-128 (1997)), the Arabidopsis ubiquitin extension protein
promoters (Callis et al., J. Biol. Chem. 265(21):12486-12493
(1990)), a tomato ubiquitin gene promoter (Rollfinke et al., Gene
211:267-276 (1998)), a soybean heat shock protein promoter, and a
maize H3 histone gene promoter (Atanassova et al., Plant Mol. Biol.
37:275-285 (1998)). Universal expression of chimeric genes in most
plant cells makes the EF1A promoter of the instant invention
especially useful when constitutive expression of a target
heterologous nucleic acid fragment is required.
[0138] Another general application of the EF1A promoter of the
invention is to construct chimeric genes that can be used to reduce
expression of at least one heterologous nucleic acid fragment in a
plant cell. To accomplish this, a chimeric gene designed for gene
silencing of a heterologous nucleic acid fragment can be
constructed by linking the fragment to the EF1A promoter of the
present invention. (See U.S. Pat. No. 5,231,020, and PCT
Publication No. WO 99/53050, which published on Oct. 21, 1999, PCT
Publication No. WO 02/00904, which published on Jan. 3, 2002, and
PCT Publication No. WO 98/36083, which published on Aug. 20, 1998,
for methodology to block plant gene expression via cosuppression.)
Alternatively, a chimeric gene designed to express antisense RNA
for a heterologous nucleic acid fragment can be constructed by
linking the fragment in reverse orientation to the EF1A promoter of
the present invention. (See U.S. Pat. No. 5,107,065 for methodology
to block plant gene expression via antisense RNA.) Either the
cosuppression or antisense chimeric gene can be introduced into
plants via transformation. Transformants wherein expression of the
heterologous nucleic acid fragment is decreased or eliminated are
then selected.
[0139] This invention also concerns a method of altering
(increasing or decreasing) the expression of at least one
heterologous nucleic acid fragment in a plant cell which comprises:
[0140] (a) transforming a plant cell with the recombinant
expression construct described herein;
[0141] (b) growing fertile mature plants from the transformed plant
cell of step (a);
[0142] (c) selecting plants containing a transformed plant cell wherein
the expression of the heterologous nucleic acid fragment is
increased or decreased.
[0143] Transformation and selection can be accomplished using
methods well-known to those skilled in the art including, but not
limited to, the methods described herein.
EXAMPLES
[0144] The present invention is further defined in the following
Examples, in which parts and percentages are by weight and degrees
are Celsius, unless otherwise stated. Sequences of promoters, cDNA,
adaptors, and primers listed in this invention all are in the 5' to
3' orientation unless described otherwise. Techniques in molecular
biology were typically performed as described in Ausubel, F. M. et
al., In Current Protocols in Molecular Biology; John Wiley and
Sons: New York, 1990 or Sambrook, J. et al., In Molecular Cloning:
A Laboratory Manual; 2nd ed.; Cold Spring Harbor Laboratory
Press: Cold Spring Harbor, N.Y., 1989 (hereinafter "Sambrook et
al., 1989"). It should be understood that these Examples, while
indicating preferred embodiments of the invention, are given by way
of illustration only. From the above discussion and these Examples,
one skilled in the art can ascertain the essential characteristics
of this invention, and without departing from the spirit and scope
thereof, can make various changes and modifications of the
invention to adapt it to various usages and conditions. Thus,
various modifications of the invention in addition to those shown
and described herein will be apparent to those skilled in the art
from the foregoing description. Such modifications are also
intended to fall within the scope of the appended claims.
[0145] The disclosure of each reference set forth herein is
incorporated herein by reference in its entirety.
Example 1
Identification of Soybean Constitutive Promoter Candidate Genes
[0146] Soybean expressed sequence tags (ESTs) were generated by
sequencing randomly selected clones from cDNA libraries constructed
from different soybean tissues. Multiple EST sequences of different
lengths, representing different regions of the same soybean gene,
could often be found. If ESTs representing the same gene are found
more frequently in one tissue-specific cDNA library, such as a
flower library, than in another, such as a leaf library, the
represented gene could be a flower-preferred gene candidate.
Likewise, if similar numbers of ESTs for the same
gene were found in various libraries constructed from different
tissues, the represented gene could be a constitutively expressed
gene. Multiple EST sequences representing the same soybean gene
could be compiled electronically based on their overlapping
sequence homology into a unique full length sequence representing
the gene. These assembled unique gene sequences were cumulatively
collected in Pioneer Hi-Bred Intl proprietary searchable databases.
To identify strong constitutive promoter candidate genes, searches
were performed to look for gene sequences that were found at
similar frequencies in leaf, root, flower, embryos, pod, and also
in other libraries. One unique gene PSO467151 was identified in the
search to be a constitutive gene candidate. PSO467151 cDNA sequence
(SEQ ID NO:21) as well as its putative translated protein sequence
(SEQ ID NO:22) were used to search National Center for
Biotechnology Information (NCBI) databases. Both PSO467151
nucleotide and amino acid sequences were found to have high
homology to translation elongation factor EF1alpha genes discovered
in several plants including soybean (Aguilar et al., Plant Mol.
Biol. 17(3):351-360 (1991)).
[0147] Due to the limited number of ESTs representing PSO467151 in
the databases, it was necessary to apply an additional analysis to
confirm its gene expression profile. A more sensitive gene
expression profiling methodology, the MPSS (Massively Parallel
Signature Sequencing) transcript profiling technique (Brenner et
al., Proc Natl Acad Sci USA 97:1665-70 (2000)), was used to confirm
that PSO467151 is indeed constitutively expressed. The MPSS
technology involves the generation of 17 base signature tags from
mRNA samples that have been reverse transcribed from poly A+ RNA
isolated using standard molecular biology techniques (Sambrook et
al., 1989). The
tags are simultaneously sequenced and assigned to genes or ESTs.
The abundance of these tags is given a number value that is
normalized to parts per million (PPM) which then allows the tag
expression, or tag abundance, to be compared across different
tissues. Thus, the MPSS platform can be used to determine the
expression pattern of a particular gene and its expression levels
in different tissues. MPSS gene expression profiles generated from
different soybean tissues over time have been cumulatively
collected in Pioneer Hi-Bred Intl proprietary searchable databases.
The PSO467151 cDNA sequence was first used to search the databases
to identify an MPSS tag that was unique and identical to a 17 base
pair region in the 3' end of the PSO467151 cDNA sequence (SEQ ID
NO: 43). The sequence tag was then used to search the databases
again to reveal its abundance in different tissues. As illustrated
in Table 1, the PSO467151 gene was confirmed to be highly abundant
in all tissues, with the strongest expression in seed and pod, a
desired expression profile for its promoter to be used as a
constitutive promoter.
TABLE 1
Lynx MPSS Expression Profiles of the PSO467151 Gene
Target gene: PSO467151
Tag sequence: SEQ ID NO: 43

Tissue          Tag abundance (PPM)
Flower          1869
Pod             7176
Flower bud      2191
Lateral root    3112
Leaf            3068
Petiole         1716
Primary root    2910
Seed            7584
Stem            2198
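The comparison across tissues in Table 1 can be illustrated with a short calculation. The following is a minimal sketch, not part of the original analysis: the PPM values are copied from Table 1, while the function name `summarize` and the fold-difference summary are illustrative assumptions.

```python
# MPSS tag abundances (PPM) for PSO467151, copied from Table 1.
mpss_ppm = {
    "Flower": 1869, "Pod": 7176, "Flower bud": 2191,
    "Lateral root": 3112, "Leaf": 3068, "Petiole": 1716,
    "Primary root": 2910, "Seed": 7584, "Stem": 2198,
}

def summarize(ppm):
    """Return (highest tissue, lowest tissue, max/min fold difference)."""
    highest = max(ppm, key=ppm.get)
    lowest = min(ppm, key=ppm.get)
    return highest, lowest, ppm[highest] / ppm[lowest]

highest, lowest, fold = summarize(mpss_ppm)
print(f"highest: {highest} ({mpss_ppm[highest]} PPM)")
print(f"lowest:  {lowest} ({mpss_ppm[lowest]} PPM)")
print(f"fold difference: {fold:.2f}")
```

Under these values the tag is present at more than 1700 PPM in every tissue, and the highest and lowest tissues differ by less than fivefold, consistent with a constitutive profile that is stronger in seed and pod.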
Example 2
Quantitative RT-PCR Profiles of EF1A Gene Expression in Soybean
[0148] The MPSS profile of PSO467151 was confirmed and extended by
analyzing 14 different soybean tissues using the relative
quantitative RT-PCR technique with an ABI7500 real time PCR system
(Applied Biosystems, Foster City, Calif.). Fourteen soybean
tissues (somatic embryo, somatic embryo one week on charcoal plate,
leaf, leaf petiole, root, flower bud, open flower, R3 pod, R4 seed,
R4 pod coat, R5 seed, R5 pod coat, R6 seed, and R6 pod coat) were
collected from cultivar 'Jack' and flash frozen in liquid nitrogen.
The seed and pod development stages were defined according to
descriptions in Fehr and Caviness, IWSRBC 80:1-12 (1977). Total RNA
was extracted with Trizol reagents (Invitrogen, Carlsbad, Calif.)
and treated with DNase I to remove any trace amount of genomic DNA
contamination. The first strand cDNA was synthesized using the
Superscript III reverse transcriptase (Invitrogen). Regular PCR
analysis was done to confirm that the cDNA was free of any genomic
DNA using primers shown in SEQ ID NO:26 and 27.
The primers are specific to the 5'UTR intron/exon junction region
of a soybean S-adenosylmethionine synthetase gene promoter SAMS
(Falco and Li, WO 00/37662 (2000)). PCR using this primer set will
amplify a 967 bp DNA fragment from any soybean genomic DNA template
and a 376 bp DNA fragment from the cDNA template. Genomic DNA-free
cDNA aliquots were used in quantitative RT-PCR analysis in which an
endogenous soybean ATP sulfurylase gene was used as an internal
control and wild type soybean genomic DNA was used as the
calibrator for relative quantification. PCR reaction data were
captured and analyzed using the sequence detection software
provided with the ABI7500 real time PCR system. The qRT-PCR
profiling of the PSO467151 EF1A gene expression confirmed its
strong and constitutive expression pattern (FIG. 1).
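Relative quantification against an internal control and a calibrator is commonly computed with the comparative Ct (2^-ddCt) method. The sketch below assumes that method and approximately 100% amplification efficiency; the source does not state the formula applied by the instrument software, and the Ct values and the function name `relative_quantity` are hypothetical.

```python
def relative_quantity(ct_target, ct_control, ct_target_cal, ct_control_cal):
    """Comparative Ct (2^-ddCt) estimate, assuming ~100% PCR efficiency.

    ct_target / ct_control: Ct of the target and internal control in the sample.
    ct_target_cal / ct_control_cal: the same two Ct values in the calibrator.
    """
    d_ct_sample = ct_target - ct_control              # normalize sample to control
    d_ct_calibrator = ct_target_cal - ct_control_cal  # normalize calibrator to control
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2 ** (-dd_ct)

# Hypothetical Ct values for one tissue sample (not data from this application).
rq = relative_quantity(ct_target=20.0, ct_control=24.0,
                       ct_target_cal=25.0, ct_control_cal=25.0)
print(f"relative quantity: {rq:.1f}")
```

A sample whose Ct values equal those of the calibrator yields a relative quantity of 1, so values above 1 indicate higher target abundance than in the calibrator.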
Example 3
Isolation of Soybean EF1A Promoter
[0149] The PSO467151 sequence was used to search the Pioneer Hi-Bred
Intl proprietary EST databases and a BAC (bacterial artificial
chromosome) clone sbacm.pk087.m18 was identified based on sequence
similarity. The BAC clone was partially sequenced to reveal an
approximately 2.5 Kb sequence upstream of the PSO467151 EF1A gene
coding region. The primers shown in SEQ ID NO:19 and 20 were then
designed to amplify the putative full length 2181 bp EF1A promoter
from the BAC clone DNA using a polymerase chain reaction (PCR). SEQ
ID NO:19 contains a recognition site for the restriction enzyme
XmaI. SEQ ID NO:20 contains a recognition site for the restriction
enzyme NcoI. In order to study promoter function, the EF1A promoter
was cloned into an expression vector via the restriction enzyme
sites.
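A primer design of this kind can be checked by searching each primer for the intended recognition sequence. The sketch below is illustrative only: the recognition sequences of XmaI (CCCGGG) and NcoI (CCATGG) are standard, but the two primer sequences are hypothetical placeholders and are not SEQ ID NO:19 or SEQ ID NO:20.

```python
# Recognition sequences; both are palindromic, so scanning one strand suffices.
SITES = {"XmaI": "CCCGGG", "NcoI": "CCATGG"}

def sites_in(primer):
    """Return the names of the enzymes whose recognition site occurs in the primer."""
    primer = primer.upper()
    return [name for name, site in SITES.items() if site in primer]

# Hypothetical primers for illustration (NOT the sequences of SEQ ID NO:19/20).
sense_primer = "ATACCCGGGTGCAGAGCTTCAAG"
antisense_primer = "TATCCATGGTCTGAACACTTGAC"

print("sense primer:    ", sites_in(sense_primer))
print("antisense primer:", sites_in(antisense_primer))
```

Placing different sites on the two primers lets the amplified fragment be cloned in a single, defined orientation.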
[0150] PCR cycle conditions were 94° C. for 4 minutes; 35
cycles of 94° C. for 30 seconds, 60° C. for 1 minute,
and 68° C. for 2 minutes; and a final 68° C. for 5
minutes before holding at 4° C. The PCR reaction was
resolved using agarose gel electrophoresis to identify the right
size PCR product representing the ~2.1 Kb EF1A promoter. The
PCR amplified DNA of the correct size was then digested with XmaI
and NcoI restriction enzymes and the fragment was cloned into a
Gateway (Invitrogen) cloning entry vector by conventional ligation
to place the putative EF1A promoter upstream of the ZS-YELLOW1 N1
fluorescent reporter gene (YFP). Several clones containing the
~2.1 Kb DNA insert were sequenced and construct QC314 (FIG.
3) was confirmed to contain the identical EF1A promoter sequence as
previously sequenced from the BAC clone sbacm.pk087.m18. The EF1A
promoter sequence is herein listed as SEQ ID NO:1.
[0151] Sequence alignment analysis between the EF1A promoter
sequence and the full length EF1 alpha cDNA sequence revealed that
there is a 770 bp intron (SEQ ID NO:16) in the 5'UTR (untranslated
region). The promoter region upstream of the putative 5'UTR is 1344
bp long. The 5'UTR is interrupted by the intron into a 54 bp
fragment (SEQ ID NO:17) upstream of the intron and a 13 bp fragment
(SEQ ID NO:18) downstream of the intron. The last two base pairs, CC,
in the 13 bp fragment are extra base pairs introduced as part of
the NcoI cloning site CCATGG. The two 5'UTR fragments, the 5'UTR
intron, and the upstream promoter region, with a total of 2181 bp
(SEQ ID NO:1), are herein collectively called the EF1A
promoter.
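The segment lengths given above can be checked against the stated total. The following sketch simply sums the four lengths reported in this paragraph; the variable names are illustrative.

```python
# Segment lengths (bp) of the EF1A promoter as reported in the text, 5' to 3'.
segments = [
    ("upstream promoter region", 1344),
    ("5'UTR fragment upstream of intron (SEQ ID NO:17)", 54),
    ("5'UTR intron (SEQ ID NO:16)", 770),
    ("5'UTR fragment downstream of intron (SEQ ID NO:18)", 13),
]

total_bp = sum(length for _, length in segments)
utr_plus_intron_bp = total_bp - segments[0][1]  # everything except the upstream region

print(f"total: {total_bp} bp")
print(f"5'UTR plus intron: {utr_plus_intron_bp} bp")
```

The sum agrees with the 2181 bp length of SEQ ID NO:1, and the 5'UTR together with its intron accounts for 837 bp, the same length as the shortest fragment, SEQ ID NO:7.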
Example 4
EF1A Promoter Copy Number Analysis
[0152] Southern hybridization analysis was performed to examine
whether additional copies or sequences with significant similarity
to the EF1A promoter exist in the soybean genome. Soybean 'Jack'
wild type genomic DNA was digested with nine different restriction
enzymes, BamHI, BglII, DraI, EcoRI, EcoRV, HindIII, MfeI, NdeI, and
SpeI, and separated in a 0.7% agarose gel by electrophoresis. The
DNA was blotted onto a nylon membrane and hybridized at 50° C.
with a digoxigenin labeled EF1A promoter DNA probe in EasyHyb
Southern hybridization solution, and subsequently washed for 10
minutes with 2×SSC/0.1% SDS at room temperature and
3×10 minutes at 65° C. with 0.1×SSC/0.1% SDS
according to the protocol provided by the manufacturer (Roche
Applied Science, Indianapolis, Ind.). The EF1A promoter probe was
labeled by PCR using the DIG DNA labeling kit (Roche Applied
Science) with two gene specific primers, SEQ ID NO:15 and SEQ ID
NO:8, to make an 837 bp long probe (SEQ ID NO:7) covering the EF1A
5'UTR and its 5'UTR intron.
[0153] According to the EF1A promoter sequence, restriction enzymes
EcoRI, EcoRV, HindIII, MfeI, NdeI, and SpeI do not cut the probe
region, therefore only one band would be expected to hybridize to
the probe for each of the six digestions if only one copy of EF1A
sequence exists in the soybean genome (FIG. 2B). The observation
that, in addition to a strong major band, a weak minor band
hybridized for EcoRI, MfeI, NdeI, and SpeI digestions and two weak
bands hybridized for HindIII digestion suggested that there is
another sequence in the soybean genome with enough similarity to
the 837 bp EF1A probe sequence to hybridize to the probe (FIG.
2A).
[0154] Each of the three enzymes, BamHI, BglII, and DraI, cuts the
EF1A probe region at least once, therefore the presence of two or
more bands would be expected after digestion with each one of these
enzymes. BamHI cuts the probe region only once, roughly in the
middle; two bands with unspecified sizes were expected and
observed. In addition, a weak band was also observed, suggesting the
existence of another similar sequence in addition to the EF1A probe
in the soybean genome. BglII cuts five times in the probe region but
only bands larger than ~900 bp would have been retained and
transferred onto the Southern blot after gel electrophoresis. A
strong major band and two weak bands were observed for BglII
digestion, also suggesting the existence of another soybean genomic
DNA sequence similar to the EF1A probe. DraI cuts the probe region
once roughly in the middle and also several times in the upstream
region; a 937 bp band and another band with unspecified size were
expected and observed. In conclusion, Southern blot analysis
suggested the existence of another sequence with enough similarity
to the EF1A promoter to hybridize to the same EF1A probe under the
described Southern hybridization conditions.
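The band-count reasoning used in this Example (one band when an enzyme does not cut within the probe region, two or more when it does) can be sketched as a small in silico digest. This is a simplified illustration under stated assumptions: the genomic sequence is a toy placeholder, fragment boundaries are taken at the start of each recognition site rather than at the true cleavage position, and the function names are hypothetical.

```python
def cut_positions(seq, site):
    """0-based start index of every occurrence of a recognition site."""
    positions, start = [], seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions

def hybridizing_fragments(seq, site, probe_start, probe_end):
    """Fragments (start, end) of a linear digest overlapping [probe_start, probe_end)."""
    bounds = [0] + cut_positions(seq, site) + [len(seq)]
    fragments = zip(bounds[:-1], bounds[1:])
    return [(s, e) for s, e in fragments
            if e > s and s < probe_end and e > probe_start]

# Toy locus (hypothetical): EcoRI sites flank the probe, one BamHI site lies inside it.
locus = "A" * 50 + "GAATTC" + "C" * 100 + "GGATCC" + "T" * 100 + "GAATTC" + "A" * 50
probe = (100, 220)  # probe interval spanning the BamHI site at position 156

ecori_bands = hybridizing_fragments(locus, "GAATTC", *probe)
bamhi_bands = hybridizing_fragments(locus, "GGATCC", *probe)
print("EcoRI bands:", len(ecori_bands), ecori_bands)
print("BamHI bands:", len(bamhi_bands), bamhi_bands)
```

Bands observed beyond the number predicted this way for a single-copy locus are what suggest a second, similar sequence elsewhere in the genome.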
Example 5
EF1A:YFP Reporter Gene Constructs and Soybean Transformation
[0155] The EF1A:YFP expression cassette in Gateway entry construct
QC314 (SEQ ID NO:23) described in EXAMPLE 3 was moved into a
Gateway destination vector PHP25224 by LR clonase mediated DNA
recombination between the attL1 and attL2 recombination sites (SEQ
ID NO:37, and 38, respectively) in QC314 and the attR1-attR2
recombination sites (SEQ ID NO:39, and 40, respectively) in
PHP25224 (Invitrogen). Since the destination vector PHP25224
already contains a soybean transformation selectable marker gene
SAMS:ALS, the resulting DNA construct QC318 (SEQ ID NO:24) has two
gene expression cassettes EF1A:YFP and SAMS:ALS linked together
(FIG. 3). Two 21 bp recombination sites attB1 and attB2 (SEQ ID
NO:41, and 42, respectively) were newly created recombination sites
resulting from DNA recombination between attL1 and attR1, and
between attL2 and attR2, respectively. The 7524 bp DNA fragment
containing the linked EF1A:YFP and SAMS:ALS expression cassettes
was isolated from plasmid QC318 with AscI digestion, separated from
the vector backbone fragment by agarose gel electrophoresis, and
purified from the gel with a DNA gel extraction kit (Qiagen,
Valencia, Calif.). The purified DNA fragment was transformed into
soybean cultivar Jack by the method of particle gun bombardment
(Klein et al., Nature 327:70-73 (1987); U.S. Pat. No. 4,945,050) to
study the EF1A promoter activity in stably transformed soybean
plants.
The same methodology as outlined above for the EF1A:YFP expression
cassette construction and transformation can be used with other
heterologous nucleic acid sequences encoding for example a reporter
protein, a selection marker, a protein conferring disease
resistance, protein conferring herbicide resistance, protein
conferring insect resistance; protein involved in carbohydrate
metabolism, protein involved in fatty acid metabolism, protein
involved in amino acid metabolism, protein involved in plant
development, protein involved in plant growth regulation, protein
involved in yield improvement, protein involved in drought
resistance, protein involved in cold resistance, protein involved
in heat resistance and salt resistance in plants.
[0156] Soybean somatic embryos from the Jack cultivar were induced
as follows. Cotyledons (~3 mm in length) were dissected from
surface sterilized, immature seeds and were cultured for 6-10 weeks
in the light at 26° C. on Murashige and Skoog (MS) medium
containing 0.7% agar and supplemented with 10 mg/ml 2,4-D. Globular
stage somatic embryos, which produced secondary embryos, were then
excised and placed into flasks containing liquid MS medium
supplemented with 2,4-D (10 mg/ml) and cultured in the light on a
rotary shaker. After repeated selection for clusters of somatic
embryos that multiplied as early, globular staged embryos, the
soybean embryogenic suspension cultures were maintained in 35 ml
liquid media on a rotary shaker, 150 rpm, at 26° C. with
fluorescent lights on a 16:8 hour day/night schedule. Cultures were
subcultured every two weeks by inoculating approximately 35 mg of
tissue into 35 ml of the same fresh liquid MS medium.
[0157] Soybean embryogenic suspension cultures were then
transformed by the method of particle gun bombardment using a
DuPont Biolistic™ PDS1000/HE instrument (Bio-Rad Laboratories,
Hercules, Calif.). To 50 µl of a 60 mg/ml 1.0 mm gold particle
suspension were added (in order): 30 µl of 10 ng/µl QC318 DNA
fragment EF1A:YFP+SAMS:ALS, 20 µl of 0.1 M spermidine, and 25
µl of 5 M CaCl2. The particle preparation was then agitated
for 3 minutes, spun in a centrifuge for 10 seconds and the
supernatant removed. The DNA-coated particles were then washed once
in 400 µl 100% ethanol and resuspended in 45 µl of 100%
ethanol. The DNA/particle suspension was sonicated three times for
one second each. 5 µl of the DNA-coated gold particles was then
loaded on each macro carrier disk.
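The quantities in this preparation imply a nominal amount of DNA and gold per macro carrier disk. The sketch below works through that arithmetic using the volumes and concentrations stated above; it assumes that all DNA binds the particles and that nothing is lost during the spin and ethanol wash, which is an idealization.

```python
# Inputs taken from the particle preparation described above.
dna_volume_ul, dna_conc_ng_per_ul = 30, 10
gold_volume_ul, gold_conc_mg_per_ml = 50, 60
final_volume_ul, loaded_per_disk_ul = 45, 5

dna_total_ng = dna_volume_ul * dna_conc_ng_per_ul            # ng of DNA fragment added
gold_total_mg = gold_volume_ul * gold_conc_mg_per_ml / 1000  # µl * mg/ml -> mg of gold
disks_per_prep = final_volume_ul // loaded_per_disk_ul       # disks one prep can load

# Idealized per-disk amounts (assumes no loss and complete DNA binding).
dna_per_disk_ng = dna_total_ng * loaded_per_disk_ul / final_volume_ul
gold_per_disk_mg = gold_total_mg * loaded_per_disk_ul / final_volume_ul

print(f"DNA: {dna_total_ng} ng total, ~{dna_per_disk_ng:.1f} ng per disk")
print(f"gold: {gold_total_mg} mg total, ~{gold_per_disk_mg:.2f} mg per disk")
print(f"disks per preparation: {disks_per_prep}")
```

One preparation therefore covers up to nine disks, which is in line with the 5 to 10 plates bombarded per transformation experiment.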
[0158] Approximately 300-400 mg of a two-week-old suspension
culture was placed in an empty 60×15 mm Petri dish and the
residual liquid removed from the tissue with a pipette. For each
transformation experiment, approximately 5 to 10 plates of tissue
were bombarded. Membrane rupture pressure was set at 1100 psi and
the chamber was evacuated to a vacuum of 28 inches mercury. The
tissue was placed approximately 3.5 inches away from the retaining
screen and bombarded once. Following bombardment, the tissue was
divided in half and placed back into liquid media and cultured as
described above.
[0159] Five to seven days post bombardment, the liquid media was
exchanged with fresh media containing 100 ng/ml chlorsulfuron as
selection agent. This selective media was refreshed weekly. Seven
to eight weeks post bombardment, green, transformed tissue was
observed growing from untransformed, necrotic embryogenic clusters.
Isolated green tissue was removed and inoculated into individual
flasks to generate new, clonally propagated, transformed
embryogenic suspension cultures. Each clonally propagated culture
was treated as an independent transformation event and subcultured
in the same liquid MS media supplemented with 2,4-D (10 mg/ml) and
100 ng/ml chlorsulfuron selection agent to increase mass. The
embryogenic suspension cultures were then transferred to agar solid
MS media plates without 2,4-D supplement to allow somatic embryos
to develop. A sample of each event was collected at this stage for
PCR and quantitative PCR analysis.
[0160] Cotyledon stage somatic embryos were dried-down (by
transferring them into an empty small Petri dish that was seated on
top of a 10 cm Petri dish containing some agar gel to allow slow
dry down) to mimic the last stages of soybean seed development.
Dried-down embryos were placed on germination solid media and
transgenic soybean plantlets were regenerated. The transgenic
plants were then transferred to soil and maintained in growth
chambers for seed production. Genomic DNA was extracted from
somatic embryo samples and analyzed by quantitative PCR using the
7500 real time PCR system (Applied Biosystems) with gene-specific
primers and FAM-labeled fluorescence probes to check copy numbers
of both the SAMS:ALS expression cassette and the EF1A:YFP
expression cassette. The qPCR analysis was done in duplex reactions
with a heat shock protein (HSP) gene as the endogenous control and
a transgenic DNA sample with a known single copy of SAMS:ALS or YFP
transgene as the calibrator using the relative quantification
methodology (Applied Biosystems). The endogenous control HSP probe
was labeled with VIC and the target gene SAMS or YFP probe was
labeled with FAM for the simultaneous detection of both fluorescent
probes in the same duplex reactions. FAM labeled DNA oligo probes
and VIC labeled oligo probes were obtained from Sigma Genosys (The
Woodlands, Tex.).
[0161] The primers and probes used in the qPCR analysis are listed
below.
SAMS forward primer: SEQ ID NO:28
FAM labeled SAMS probe: SEQ ID NO:29
SAMS reverse primer: SEQ ID NO:30
YFP forward primer: SEQ ID NO:31
FAM labeled YFP probe: SEQ ID NO:32
YFP reverse primer: SEQ ID NO:33
HSP forward primer: SEQ ID NO:34
VIC labeled HSP probe: SEQ ID NO:35
HSP reverse primer: SEQ ID NO:36
[0162] Only transgenic soybean events containing 1 or 2 copies of
both the SAMS:ALS expression cassette and the EF1A:YFP expression
cassette were selected for further gene expression evaluation and
seed production (see Table 2). Events negative for YFP qPCR or with
more than 2 copies for the SAMS qPCR were not further followed. YFP
expression is described in detail in EXAMPLE 8 and is recorded
in Table 2.
TABLE 2
Relative transgene copy numbers and YFP expression of EF1A:YFP
transgenic plants

Event ID    YFP    YFP qPCR    SAMS qPCR
4870.2.1    +      1.2         0.6
4870.2.2    +      1.0         0.7
4870.2.4    +      1.0         0.6
4870.2.6    +      1.3         0.6
4870.2.7    +      1.0         0.7
4870.2.8    +      1.0         0.5
4870.3.1    +      1.0         0.6
4870.3.2    +      1.0         0.4
4870.3.4    +      1.4         0.7
4870.3.5    +      1.3         0.5
4870.4.1    +      1.8         1.0
4870.6.1    +      0.9         0.5
4870.6.3    +      1.1         0.8
4870.6.4    +      1.0         0.5
4870.6.5    +      1.0         0.5
4870.6.9    +      1.1         0.7
4870.7.1    +      1.1         0.5
4870.8.1    +      1.8         1.3
4870.8.3    +      1.0         0.7
4870.8.5    +      1.7         1.3
4870.8.7    +      1.7         1.2
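The relative qPCR values in Table 2 can be screened programmatically. The sketch below is illustrative only: the values are copied from Table 2, but the 1.5 cut-off used to flag events with a higher relative YFP signal is an assumed threshold, since the source does not state how relative values were converted into copy-number calls.

```python
# (event ID, YFP qPCR, SAMS qPCR) copied from Table 2.
events = [
    ("4870.2.1", 1.2, 0.6), ("4870.2.2", 1.0, 0.7), ("4870.2.4", 1.0, 0.6),
    ("4870.2.6", 1.3, 0.6), ("4870.2.7", 1.0, 0.7), ("4870.2.8", 1.0, 0.5),
    ("4870.3.1", 1.0, 0.6), ("4870.3.2", 1.0, 0.4), ("4870.3.4", 1.4, 0.7),
    ("4870.3.5", 1.3, 0.5), ("4870.4.1", 1.8, 1.0), ("4870.6.1", 0.9, 0.5),
    ("4870.6.3", 1.1, 0.8), ("4870.6.4", 1.0, 0.5), ("4870.6.5", 1.0, 0.5),
    ("4870.6.9", 1.1, 0.7), ("4870.7.1", 1.1, 0.5), ("4870.8.1", 1.8, 1.3),
    ("4870.8.3", 1.0, 0.7), ("4870.8.5", 1.7, 1.3), ("4870.8.7", 1.7, 1.2),
]

YFP_THRESHOLD = 1.5  # assumed cut-off for illustration; not taken from the source

higher_signal = [eid for eid, yfp, _ in events if yfp >= YFP_THRESHOLD]
# Check whether the two assays agree: do the flagged events also show SAMS >= 1.0?
concordant = all((sams >= 1.0) == (yfp >= YFP_THRESHOLD) for _, yfp, sams in events)

print(f"{len(events)} events; higher YFP signal in: {higher_signal}")
print("YFP and SAMS assays concordant:", concordant)
```

With this assumed threshold, the events flagged by the YFP assay are the same events with the highest SAMS values, as expected for two cassettes carried on one linked DNA fragment.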
Example 6
Construction of EF1A Promoter Deletion Constructs
[0163] To define the transcriptional elements controlling the EF1A
promoter activity, the 2181 bp full length (SEQ ID NO:1) and six 5'
unidirectional deletion fragments 1841 bp, 1642 bp, 1431 bp, 1215
bp, 1012 bp, and 837 bp in length corresponding to SEQ ID NO:2, 3, 4,
5, 6, and 7, respectively, were made by PCR amplification from the
full length soybean EF1A promoter contained in the original
construct QC314 (FIG. 3). The same antisense primer (SEQ ID NO:8)
was used in the amplification by PCR of all the seven EF1A promoter
fragments (SEQ ID NO:1, 2, 3, 4, 5, 6, and 7) by pairing with
different sense primers SEQ ID NOs:9, 10, 11, 12, 13, 14, and 15,
respectively. Each of the PCR amplified promoter DNA fragments was
cloned into the Gateway cloning ready TA cloning vector
pCR8/GW/TOPO (Invitrogen) and clones with the correct orientation,
relative to the Gateway recombination sites attL1 and attL2, were
selected by BamHI+XhoI double restriction enzyme digestion
analysis and sequence confirmation (see the example map QC314-1 in
FIG. 4). The maps of constructs QC314-2, 3, 4, 5, 6, and 7
containing the EF1A promoter fragments SEQ ID NOs:2, 3, 4, 5, 6,
and 7 are similar to the QC314-1 map and are not shown. The
promoter fragment
in the right orientation was subsequently cloned into a Gateway
destination vector QC300 by Gateway LR clonase reaction
(Invitrogen) to place the promoter fragment in front of the
reporter gene YFP (see the example map QC314-1Y in FIG. 4). A 21 bp
Gateway recombination site attB2 SEQ ID NO:42 was inserted between
the promoter and the YFP reporter gene coding region as a result of
the Gateway cloning process. The maps of constructs QC314-2Y, 3Y,
4Y, 5Y, 6Y, and 7Y containing the EF1A promoter fragments SEQ ID
NOs:2, 3, 4, 5, 6, and 7 are similar to the QC314-1Y map and are
not shown. The EF1A:YFP promoter deletion constructs were delivered
into germinating soybean cotyledons by gene gun bombardment for
transient gene expression studies. The full length EF1A promoter in
QC314, which lacks the attB2 site between the promoter and the YFP
gene, was included as a positive control for transient expression
analysis. The seven EF1A promoter fragments analyzed are
schematically described in FIG. 5.
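Because all seven fragments were amplified with the same antisense
primer, they share a common 3' end, and the 5' start of each
fragment within SEQ ID NO:1 follows from its length. The sketch
below works through that arithmetic and also checks that the
antisense primer (SEQ ID NO:8) is the reverse complement of the last
27 bases of SEQ ID NO:1, using the sequences as given in the
sequence listing. The function and variable names are illustrative
only.

```python
FULL_LENGTH = 2181  # bp, SEQ ID NO:1

# Fragment lengths in bp keyed by SEQ ID NO, as stated in the text.
FRAGMENT_LENGTHS = {1: 2181, 2: 1841, 3: 1642, 4: 1431,
                    5: 1215, 6: 1012, 7: 837}


def five_prime_start(fragment_length, full_length=FULL_LENGTH):
    """1-based position in SEQ ID NO:1 where a 3'-anchored fragment starts."""
    return full_length - fragment_length + 1


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string (a, c, g, t only)."""
    pairs = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(pairs[base] for base in reversed(seq.lower()))


# Last 27 bases of SEQ ID NO:1, copied from the sequence listing.
THREE_PRIME_END = "ctggat" "ttttgcagat" "ttaaggaagc" "c"
# Antisense primer SEQ ID NO:8, copied from the sequence listing.
ANTISENSE_PRIMER = "ggcttcctta" "aatctgcaaa" "aatccag"

starts = {seq_id: five_prime_start(n)
          for seq_id, n in FRAGMENT_LENGTHS.items()}
primer_matches = reverse_complement(THREE_PRIME_END) == ANTISENSE_PRIMER

for seq_id, start in starts.items():
    print(f"SEQ ID NO:{seq_id} spans positions {start}-{FULL_LENGTH}")
print("antisense primer matches 3' end:", primer_matches)
```

For example, SEQ ID NO:2 (1841 bp) begins at position 341 of SEQ ID
NO:1, which agrees with the position of its first bases in the full
length sequence.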
Example 7
Transient Expression Analysis of EF1A:YFP Constructs
[0164] The constructs containing the full length and partial
promoter fragments (QC314, QC314-1Y, 2Y, 3Y, 4Y, 5Y, 6Y, and 7Y)
were tested by transiently expressing the ZS-YELLOW1 N1 (YFP)
reporter gene in germinating soybean cotyledons, which served as
the target tissue for the transient expression assays. Soybean
seeds were rinsed with 10% Tween 20 in sterile water and surface
sterilized with 70% ethanol for 2 minutes and then with 6% sodium
hypochlorite for 15 minutes. After rinsing, the seeds were placed
on wet filter paper in a Petri dish to germinate for 4-6 days under
light at 26 °C. Green cotyledons were excised and placed inner side
up on a 0.7% agar plate containing Murashige and Skoog media for
particle gun bombardment. The DNA and gold particle mixtures were
prepared essentially as described in EXAMPLE 5 except that more DNA
(100 ng/µl) was used. The bombardments were carried out under
parameters similar to those described in EXAMPLE 5. YFP expression
was checked under a Leica MZFLIII stereo microscope equipped with a
UV light source and appropriate light filters (Leica Microsystems
Inc., Bannockburn, Ill.), and pictures were taken approximately 24
hours after bombardment at 8× magnification with the same camera
settings throughout: 1.06 gamma, 0.0% gain, and 0.58 second
exposure.
[0165] The original full length EF1A promoter construct QC314, the
full length construct QC314-1Y with the attB2 Gateway recombination
site between the promoter and the YFP gene, and the deletion series
QC314-2Y, 3Y, 4Y, 5Y, and 6Y all gave strong yellow fluorescence
signals in the transient expression assay, seen as large
green/yellow dots (shown as white dots in FIG. 6). The attB2 site
did not seem to interfere with promoter activity or reporter gene
expression. Each dot represented a single cotyledon cell, which
appeared larger if the fluorescence signal was strong and smaller
if the signal was weak, even under the same magnification. The four
longer deletion constructs QC314-2Y, 3Y, 4Y, and 5Y had a level of
YFP expression similar to that of the full length construct
QC314-1Y, indicating that the elements necessary for proper EF1A
promoter activity were retained in all of these deletions.
Interestingly, construct QC314-6Y, the shortest of these deletions
with only 175 bp of promoter sequence upstream of the 5'UTR, had
the strongest fluorescence signal, indicating that the 175 bp
sequence was long enough to retain strong promoter activity.
Negative elements might exist upstream of the 175 bp region to
further regulate EF1A promoter activity. Removal of the 175 bp
upstream sequence, resulting in construct QC314-7Y containing only
the 5'UTR and the 5'UTR intron, caused most of the promoter
activity to disappear; only faint fluorescence signals were
detected by the transient assay (FIG. 6). Construct pZSL90, with
the constitutive promoter SCP1 driving YFP expression, and
construct QC299i, without any promoter driving YFP expression, were
used in the transient assays as positive and negative controls,
respectively. No fluorescence was detected in the negative control
(picture not shown).
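The 175 bp figure follows from the fragment lengths given in EXAMPLE
6: construct QC314-7Y (837 bp) contains only the 5'UTR and its
intron, so the promoter sequence upstream of the 5'UTR in any other
construct is its fragment length minus 837 bp. The following sketch
tabulates that difference alongside the qualitative signal reported
above. The variable names are illustrative, and the signal labels
simply restate the text rather than any measured values.

```python
# Length of the 5'UTR plus 5'UTR intron fragment carried by QC314-7Y.
UTR_INTRON_LENGTH = 837  # bp, SEQ ID NO:7

# Promoter fragment length (bp) in each Gateway construct, from EXAMPLE 6.
CONSTRUCT_LENGTHS = {
    "QC314-1Y": 2181,
    "QC314-2Y": 1841,
    "QC314-3Y": 1642,
    "QC314-4Y": 1431,
    "QC314-5Y": 1215,
    "QC314-6Y": 1012,
    "QC314-7Y": 837,
}

# Qualitative transient-assay signal as described in the text.
REPORTED_SIGNAL = {
    "QC314-1Y": "strong", "QC314-2Y": "strong", "QC314-3Y": "strong",
    "QC314-4Y": "strong", "QC314-5Y": "strong",
    "QC314-6Y": "strongest", "QC314-7Y": "faint",
}

# Promoter sequence remaining upstream of the 5'UTR in each construct.
upstream_bp = {name: length - UTR_INTRON_LENGTH
               for name, length in CONSTRUCT_LENGTHS.items()}

for name, bp in upstream_bp.items():
    print(f"{name}: {bp:4d} bp upstream of 5'UTR, "
          f"signal {REPORTED_SIGNAL[name]}")
```

Read this way, the step from 175 bp of upstream sequence (QC314-6Y)
to none (QC314-7Y) is the only deletion step associated with a loss
of most of the signal.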
Example 8
EF1A:YFP Expression in Stable Transgenic Soybean Plants
[0166] YFP gene expression was tested at different stages of
transgenic plant development for yellow fluorescence emission under
a Leica MZFLIII stereo microscope equipped with a UV light source
and appropriate fluorescent light filters (Leica Microsystems
Inc.). Yellow fluorescence (shown as bright white areas in FIG. 7)
was detected early on during somatic embryo development and
throughout all stages of transgenic plant development in all
tissues tested, such as somatic embryos, leaf, stem, root, flower,
pod, and seed. During the tissue culture stages of transgenic plant
regeneration, fluorescence was uniformly detected in young globular
and torpedo stage somatic embryos (FIG. 7A), in cotyledon stage
embryos (FIG. 7B), and in mature and dried down embryos (FIG. 7C).
Negative control embryos emitted a weak red color (shown as dark
grey areas in FIG. 7D) due to autofluorescence from the chlorophyll
contained in soybean green tissues including embryos. Negative
controls for other tissue types displayed in FIG. 7 are not shown,
but any green tissue such as leaf or stem that was negative for YFP
expression would be red, and any white tissue such as root or petal
would be dark under the yellow fluorescent light filter.
[0167] When transgenic plantlets were regenerated from somatic
embryos, yellow fluorescence was detected in leaf, stem, and root
and was retained in all vegetative tissues throughout mature
plants. Fluorescence in leaf was not as strong as in other tissues,
and fluorescence on the adaxial side of the leaf (FIG. 7E) was not
as strong as on the abaxial side (FIG. 7F). Veins had the strongest
fluorescence on both sides of the leaf. Though trichomes on both
sides of the leaf showed fluorescence, it was difficult to
determine whether the signals were specific to the transgenic
reporter gene, since trichomes tended to fluoresce under different
non-specific fluorescent light filters. Fluorescence was readily
detected in stem and was strongest in the vascular bundles, as
shown by a stem cross section (FIG. 7G). Fluorescence was strong in
all parts of the root (FIG. 7H).
[0168] A soybean flower consists of five sepals and five petals
(one large upper petal called the standard, two large side petals,
and two small fused lower petals called the keel) that enclose ten
stamens and one pistil. The pistil consists of a stigma, a style,
and an ovary in which there are 2-4 ovules. A stamen consists of a
filament with an anther on its tip. Pollen grains reside inside
anther chambers and are released during pollination. Yellow
fluorescence was detected in sepals and in the exposed part of
petals of a young flower bud when its petals were still mostly
enclosed by sepals (FIG. 7I). Fluorescence was detected in all
parts including petals, anthers, filaments, and the pistil when a
mature flower was dissected (FIG. 7J). Strong yellow fluorescence
was detected in pollen grains which were still sticking to the
anthers when a stamen was dissected and magnified (FIG. 7K). The
stigma, style, ovary wall, and ovule inside the ovary of a
dissected pistil all showed fluorescence though signals were not
strong in the style and ovary wall (FIG. 7L).
[0169] Strong yellow fluorescence was detected in developing pods
and seeds of the EF1A:YFP transgenic plants at all stages, from
very young R3 pods about 5 mm long (FIG. 7M), to full R4 pods about
20 mm long (FIG. 7N), to mature R5 and R6 pods fully filled with
seeds (FIG. 7O-P). Detailed descriptions of soybean development
stages can be found in Fehr and Caviness, CODEN:IWSRBC 80:1-12
(1977). Since T0 transgenic plants are hemizygous in nature, the
embryos of their progeny T1 seeds will segregate according to
Mendel's law, but the seed coats will not segregate since they are
derived from maternal tissue (the ovule coats). A positive T1 seed
would have both its seed coat and embryo fluorescing (FIG. 7O). A
negative T1 seed would have only its coat fluorescing but not its
embryo (FIG. 7P). In conclusion, EF1A:YFP expression was detected
at high levels in all tissues throughout transgenic plant
development, indicating that the soybean EF1A promoter is a strong
constitutive promoter.
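The segregation pattern described for T1 seeds can be made concrete
with a small worked example. The sketch assumes, for illustration
only, that the T0 plant carries the transgene at a single locus, is
self-pollinated, and that one copy is sufficient for a visible
embryo signal; none of these assumptions is stated for any
particular event in the description. The function name is
hypothetical.

```python
from fractions import Fraction
from itertools import product


def t1_embryo_genotypes():
    """Expected transgene copy frequencies in T1 embryos of a selfed
    hemizygous T0 plant (single-locus assumption)."""
    # Each gamete carries the transgene ("T") or not ("-") equally often.
    gametes = ["T", "-"]
    freqs = {}
    for egg, pollen in product(gametes, repeat=2):
        copies = (egg == "T") + (pollen == "T")
        freqs[copies] = freqs.get(copies, Fraction(0)) + Fraction(1, 4)
    return freqs


genotypes = t1_embryo_genotypes()

# Embryos with at least one copy are expected to fluoresce.
embryo_positive = sum(f for copies, f in genotypes.items() if copies > 0)
embryo_negative = genotypes[0]

# Seed coats are maternal (T0) tissue, so all of them carry the transgene.
seed_coat_positive = Fraction(1)

print("homozygous:", genotypes[2])
print("hemizygous:", genotypes[1])
print("null:", genotypes[0])
print("fluorescent embryos:", embryo_positive)
print("fluorescent seed coats:", seed_coat_positive)
```

Under these assumptions about three quarters of T1 embryos would be
expected to fluoresce, whereas every T1 seed coat would, matching
the positive and negative seed types shown in FIG. 7O and FIG. 7P.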
Sequence CWU 1
1
4312181DNAGlycine max 1gggcaatcaa attatatatg taaagcaatt acagtttatc
aaactttatt tatggaaata 60atttattatc acatttattt tggtttataa attttaaatt
aaaatatcac ctaaataaaa 120ataattttta acatgactta ttgtcctaaa
taaattattt ccgtaaatta aataaaatga 180agtttttttc tttcaaagaa
tctaaatggt cataatgaga attctctaaa aaaatacata 240atgagaataa
ttatggaatt tatttattaa taaaaattaa tagcattttg atagacaatt
300aataaaattt taaaaataac catatagaaa taataatttt tttactatcg
gttccaatta 360aaataatgat aaaaaataaa atagattatt aattgatatt
gatatgaaat ttaaataaag 420aatataatca tatattttat tgatatatga
tatgatatag attaattgat attgattttg 480atatggaatt taaaaataat
ataataattg tttttattta ttaatacgtg taatcaaata 540attctcactt
tttgaatcaa tcagtgtact taaagataat atcagttgaa tattttttat
600ccttttacgt gtgctgtgag acattatcat caattgtgtt gtatatgata
tatagatata 660gatatataaa tatatagatt gagtgatata atatatttaa
aatataaatt atatatatgt 720tttaatatat ttttgcatat atatatatat
ttgtaaaaac tagaagtatt tttcatgaga 780taattattat cgagttgaat
aagtctatta tttgtgagag ccaaccatat ttatatatgt 840gattaaattt
tatctttgtg aaattaaaaa taataaaaaa taccttaaaa tcataataat
900agaaaaactt atatttataa tttaccatta tacttaaaaa aaattaaata
aatattataa 960atataaatac tatcgagtaa tggccgcgct agggtttttg
agaaaaaatc ttcccacgca 1020ctcaactgca ctgtacggcg tcgttttcac
agccgcataa tagaagccgc gttccccaac 1080ccttcctcac aacattctcg
gaccctccag caccgtcacc caaacaaata tccacgcggt 1140agtaggcgcg
tgaaacaaac tctaatccga actacgagac gtgagaagca cgcgctttag
1200cgagcgtttc aattgtcgct acgaaagcag agaaggatac aaacggaact
agggtaaatt 1260agtaagggta atttcgtaaa cagaagaaaa gagttgtagc
tataaataaa ccctctaacc 1320ctcgtcgcat tacttctctt cacacctttg
ttcactcttc ttctcttgcg gctagggttt 1380tagcgcagct tcttctaggt
tcgttatcta ccaccgttct atggatttta ttccttctat 1440tcgtgtttat
tctattggtt tatgttgctt gcaatatgtt ttttctgaat ctgtcgtcgt
1500tgtcttcaat tttatccatg tttcagagat caattttgtt tgtgtagtat
gtgcttattc 1560ttcttctttt cgttcgagtt gttaataacg gtgctatggt
gttttcaaaa gtgttttttt 1620tattactttt gatttaaagt ttttttggta
aggcttttat ttgcttgtta tattcaaatc 1680tttggatcca gatcttatat
aagtttttgg ttcaagaaag tttttggtta ctgatgaata 1740gatctattaa
ctgttacttt aatcgattca agctaaagtt ttttggttac tgatgaatag
1800atctattatc tgttactttt aatcggttca agctcaagtt ttttggttac
tgatgaatag 1860atctatatac gtcacagtgt gctaaacatg cccttgtttt
atctcgatct tatgtatggg 1920agtgccataa attttgttat gtctattttt
ttatctgttg gaatcatact gagtttgatg 1980cgttacgatt gagcatacct
atttttgggc ttgttgtatg gtgggtattt agatcttaat 2040ctttttatgc
ttatgaaagg ttttgtaatg acaaaggtct taatgttgtt aaacttttat
2100ttttacttta tatggtgtgt tgatgtgtta tggttttgac aacttttttt
ttttctggat 2160ttttgcagat ttaaggaagc c 218121841DNAGlycine max
2tttactatcg gttccaatta aaataatgat aaaaaataaa atagattatt aattgatatt
60gatatgaaat ttaaataaag aatataatca tatattttat tgatatatga tatgatatag
120attaattgat attgattttg atatggaatt taaaaataat ataataattg
tttttattta 180ttaatacgtg taatcaaata attctcactt tttgaatcaa
tcagtgtact taaagataat 240atcagttgaa tattttttat ccttttacgt
gtgctgtgag acattatcat caattgtgtt 300gtatatgata tatagatata
gatatataaa tatatagatt gagtgatata atatatttaa 360aatataaatt
atatatatgt tttaatatat ttttgcatat atatatatat ttgtaaaaac
420tagaagtatt tttcatgaga taattattat cgagttgaat aagtctatta
tttgtgagag 480ccaaccatat ttatatatgt gattaaattt tatctttgtg
aaattaaaaa taataaaaaa 540taccttaaaa tcataataat agaaaaactt
atatttataa tttaccatta tacttaaaaa 600aaattaaata aatattataa
atataaatac tatcgagtaa tggccgcgct agggtttttg 660agaaaaaatc
ttcccacgca ctcaactgca ctgtacggcg tcgttttcac agccgcataa
720tagaagccgc gttccccaac ccttcctcac aacattctcg gaccctccag
caccgtcacc 780caaacaaata tccacgcggt agtaggcgcg tgaaacaaac
tctaatccga actacgagac 840gtgagaagca cgcgctttag cgagcgtttc
aattgtcgct acgaaagcag agaaggatac 900aaacggaact agggtaaatt
agtaagggta atttcgtaaa cagaagaaaa gagttgtagc 960tataaataaa
ccctctaacc ctcgtcgcat tacttctctt cacacctttg ttcactcttc
1020ttctcttgcg gctagggttt tagcgcagct tcttctaggt tcgttatcta
ccaccgttct 1080atggatttta ttccttctat tcgtgtttat tctattggtt
tatgttgctt gcaatatgtt 1140ttttctgaat ctgtcgtcgt tgtcttcaat
tttatccatg tttcagagat caattttgtt 1200tgtgtagtat gtgcttattc
ttcttctttt cgttcgagtt gttaataacg gtgctatggt 1260gttttcaaaa
gtgttttttt tattactttt gatttaaagt ttttttggta aggcttttat
1320ttgcttgtta tattcaaatc tttggatcca gatcttatat aagtttttgg
ttcaagaaag 1380tttttggtta ctgatgaata gatctattaa ctgttacttt
aatcgattca agctaaagtt 1440ttttggttac tgatgaatag atctattatc
tgttactttt aatcggttca agctcaagtt 1500ttttggttac tgatgaatag
atctatatac gtcacagtgt gctaaacatg cccttgtttt 1560atctcgatct
tatgtatggg agtgccataa attttgttat gtctattttt ttatctgttg
1620gaatcatact gagtttgatg cgttacgatt gagcatacct atttttgggc
ttgttgtatg 1680gtgggtattt agatcttaat ctttttatgc ttatgaaagg
ttttgtaatg acaaaggtct 1740taatgttgtt aaacttttat ttttacttta
tatggtgtgt tgatgtgtta tggttttgac 1800aacttttttt ttttctggat
ttttgcagat ttaaggaagc c 184131642DNAGlycine max 3aattctcact
ttttgaatca atcagtgtac ttaaagataa tatcagttga atatttttta 60tccttttacg
tgtgctgtga gacattatca tcaattgtgt tgtatatgat atatagatat
120agatatataa atatatagat tgagtgatat aatatattta aaatataaat
tatatatatg 180ttttaatata tttttgcata tatatatata tttgtaaaaa
ctagaagtat ttttcatgag 240ataattatta tcgagttgaa taagtctatt
atttgtgaga gccaaccata tttatatatg 300tgattaaatt ttatctttgt
gaaattaaaa ataataaaaa ataccttaaa atcataataa 360tagaaaaact
tatatttata atttaccatt atacttaaaa aaaattaaat aaatattata
420aatataaata ctatcgagta atggccgcgc tagggttttt gagaaaaaat
cttcccacgc 480actcaactgc actgtacggc gtcgttttca cagccgcata
atagaagccg cgttccccaa 540cccttcctca caacattctc ggaccctcca
gcaccgtcac ccaaacaaat atccacgcgg 600tagtaggcgc gtgaaacaaa
ctctaatccg aactacgaga cgtgagaagc acgcgcttta 660gcgagcgttt
caattgtcgc tacgaaagca gagaaggata caaacggaac tagggtaaat
720tagtaagggt aatttcgtaa acagaagaaa agagttgtag ctataaataa
accctctaac 780cctcgtcgca ttacttctct tcacaccttt gttcactctt
cttctcttgc ggctagggtt 840ttagcgcagc ttcttctagg ttcgttatct
accaccgttc tatggatttt attccttcta 900ttcgtgttta ttctattggt
ttatgttgct tgcaatatgt tttttctgaa tctgtcgtcg 960ttgtcttcaa
ttttatccat gtttcagaga tcaattttgt ttgtgtagta tgtgcttatt
1020cttcttcttt tcgttcgagt tgttaataac ggtgctatgg tgttttcaaa
agtgtttttt 1080ttattacttt tgatttaaag tttttttggt aaggctttta
tttgcttgtt atattcaaat 1140ctttggatcc agatcttata taagtttttg
gttcaagaaa gtttttggtt actgatgaat 1200agatctatta actgttactt
taatcgattc aagctaaagt tttttggtta ctgatgaata 1260gatctattat
ctgttacttt taatcggttc aagctcaagt tttttggtta ctgatgaata
1320gatctatata cgtcacagtg tgctaaacat gcccttgttt tatctcgatc
ttatgtatgg 1380gagtgccata aattttgtta tgtctatttt tttatctgtt
ggaatcatac tgagtttgat 1440gcgttacgat tgagcatacc tatttttggg
cttgttgtat ggtgggtatt tagatcttaa 1500tctttttatg cttatgaaag
gttttgtaat gacaaaggtc ttaatgttgt taaactttta 1560tttttacttt
atatggtgtg ttgatgtgtt atggttttga caactttttt tttttctgga
1620tttttgcaga tttaaggaag cc 164241431DNAGlycine Max 4ttgtaaaaac
tagaagtatt tttcatgaga taattattat cgagttgaat aagtctatta 60tttgtgagag
ccaaccatat ttatatatgt gattaaattt tatctttgtg aaattaaaaa
120taataaaaaa taccttaaaa tcataataat agaaaaactt atatttataa
tttaccatta 180tacttaaaaa aaattaaata aatattataa atataaatac
tatcgagtaa tggccgcgct 240agggtttttg agaaaaaatc ttcccacgca
ctcaactgca ctgtacggcg tcgttttcac 300agccgcataa tagaagccgc
gttccccaac ccttcctcac aacattctcg gaccctccag 360caccgtcacc
caaacaaata tccacgcggt agtaggcgcg tgaaacaaac tctaatccga
420actacgagac gtgagaagca cgcgctttag cgagcgtttc aattgtcgct
acgaaagcag 480agaaggatac aaacggaact agggtaaatt agtaagggta
atttcgtaaa cagaagaaaa 540gagttgtagc tataaataaa ccctctaacc
ctcgtcgcat tacttctctt cacacctttg 600ttcactcttc ttctcttgcg
gctagggttt tagcgcagct tcttctaggt tcgttatcta 660ccaccgttct
atggatttta ttccttctat tcgtgtttat tctattggtt tatgttgctt
720gcaatatgtt ttttctgaat ctgtcgtcgt tgtcttcaat tttatccatg
tttcagagat 780caattttgtt tgtgtagtat gtgcttattc ttcttctttt
cgttcgagtt gttaataacg 840gtgctatggt gttttcaaaa gtgttttttt
tattactttt gatttaaagt ttttttggta 900aggcttttat ttgcttgtta
tattcaaatc tttggatcca gatcttatat aagtttttgg 960ttcaagaaag
tttttggtta ctgatgaata gatctattaa ctgttacttt aatcgattca
1020agctaaagtt ttttggttac tgatgaatag atctattatc tgttactttt
aatcggttca 1080agctcaagtt ttttggttac tgatgaatag atctatatac
gtcacagtgt gctaaacatg 1140cccttgtttt atctcgatct tatgtatggg
agtgccataa attttgttat gtctattttt 1200ttatctgttg gaatcatact
gagtttgatg cgttacgatt gagcatacct atttttgggc 1260ttgttgtatg
gtgggtattt agatcttaat ctttttatgc ttatgaaagg ttttgtaatg
1320acaaaggtct taatgttgtt aaacttttat ttttacttta tatggtgtgt
tgatgtgtta 1380tggttttgac aacttttttt ttttctggat ttttgcagat
ttaaggaagc c 143151215DNAGlycine max 5atactatcga gtaatggccg
cgctagggtt tttgagaaaa aatcttccca cgcactcaac 60tgcactgtac ggcgtcgttt
tcacagccgc ataatagaag ccgcgttccc caacccttcc 120tcacaacatt
ctcggaccct ccagcaccgt cacccaaaca aatatccacg cggtagtagg
180cgcgtgaaac aaactctaat ccgaactacg agacgtgaga agcacgcgct
ttagcgagcg 240tttcaattgt cgctacgaaa gcagagaagg atacaaacgg
aactagggta aattagtaag 300ggtaatttcg taaacagaag aaaagagttg
tagctataaa taaaccctct aaccctcgtc 360gcattacttc tcttcacacc
tttgttcact cttcttctct tgcggctagg gttttagcgc 420agcttcttct
aggttcgtta tctaccaccg ttctatggat tttattcctt ctattcgtgt
480ttattctatt ggtttatgtt gcttgcaata tgttttttct gaatctgtcg
tcgttgtctt 540caattttatc catgtttcag agatcaattt tgtttgtgta
gtatgtgctt attcttcttc 600ttttcgttcg agttgttaat aacggtgcta
tggtgttttc aaaagtgttt tttttattac 660ttttgattta aagttttttt
ggtaaggctt ttatttgctt gttatattca aatctttgga 720tccagatctt
atataagttt ttggttcaag aaagtttttg gttactgatg aatagatcta
780ttaactgtta ctttaatcga ttcaagctaa agttttttgg ttactgatga
atagatctat 840tatctgttac ttttaatcgg ttcaagctca agttttttgg
ttactgatga atagatctat 900atacgtcaca gtgtgctaaa catgcccttg
ttttatctcg atcttatgta tgggagtgcc 960ataaattttg ttatgtctat
ttttttatct gttggaatca tactgagttt gatgcgttac 1020gattgagcat
acctattttt gggcttgttg tatggtgggt atttagatct taatcttttt
1080atgcttatga aaggttttgt aatgacaaag gtcttaatgt tgttaaactt
ttatttttac 1140tttatatggt gtgttgatgt gttatggttt tgacaacttt
ttttttttct ggatttttgc 1200agatttaagg aagcc 121561012DNAGlycine max
6aactacgaga cgtgagaagc acgcgcttta gcgagcgttt caattgtcgc tacgaaagca
60gagaaggata caaacggaac tagggtaaat tagtaagggt aatttcgtaa acagaagaaa
120agagttgtag ctataaataa accctctaac cctcgtcgca ttacttctct
tcacaccttt 180gttcactctt cttctcttgc ggctagggtt ttagcgcagc
ttcttctagg ttcgttatct 240accaccgttc tatggatttt attccttcta
ttcgtgttta ttctattggt ttatgttgct 300tgcaatatgt tttttctgaa
tctgtcgtcg ttgtcttcaa ttttatccat gtttcagaga 360tcaattttgt
ttgtgtagta tgtgcttatt cttcttcttt tcgttcgagt tgttaataac
420ggtgctatgg tgttttcaaa agtgtttttt ttattacttt tgatttaaag
tttttttggt 480aaggctttta tttgcttgtt atattcaaat ctttggatcc
agatcttata taagtttttg 540gttcaagaaa gtttttggtt actgatgaat
agatctatta actgttactt taatcgattc 600aagctaaagt tttttggtta
ctgatgaata gatctattat ctgttacttt taatcggttc 660aagctcaagt
tttttggtta ctgatgaata gatctatata cgtcacagtg tgctaaacat
720gcccttgttt tatctcgatc ttatgtatgg gagtgccata aattttgtta
tgtctatttt 780tttatctgtt ggaatcatac tgagtttgat gcgttacgat
tgagcatacc tatttttggg 840cttgttgtat ggtgggtatt tagatcttaa
tctttttatg cttatgaaag gttttgtaat 900gacaaaggtc ttaatgttgt
taaactttta tttttacttt atatggtgtg ttgatgtgtt 960atggttttga
caactttttt tttttctgga tttttgcaga tttaaggaag cc 10127837DNAGlycine
max 7cctttgttca ctcttcttct cttgcggcta gggttttagc gcagcttctt
ctaggttcgt 60tatctaccac cgttctatgg attttattcc ttctattcgt gtttattcta
ttggtttatg 120ttgcttgcaa tatgtttttt ctgaatctgt cgtcgttgtc
ttcaatttta tccatgtttc 180agagatcaat tttgtttgtg tagtatgtgc
ttattcttct tcttttcgtt cgagttgtta 240ataacggtgc tatggtgttt
tcaaaagtgt tttttttatt acttttgatt taaagttttt 300ttggtaaggc
ttttatttgc ttgttatatt caaatctttg gatccagatc ttatataagt
360ttttggttca agaaagtttt tggttactga tgaatagatc tattaactgt
tactttaatc 420gattcaagct aaagtttttt ggttactgat gaatagatct
attatctgtt acttttaatc 480ggttcaagct caagtttttt ggttactgat
gaatagatct atatacgtca cagtgtgcta 540aacatgccct tgttttatct
cgatcttatg tatgggagtg ccataaattt tgttatgtct 600atttttttat
ctgttggaat catactgagt ttgatgcgtt acgattgagc atacctattt
660ttgggcttgt tgtatggtgg gtatttagat cttaatcttt ttatgcttat
gaaaggtttt 720gtaatgacaa aggtcttaat gttgttaaac ttttattttt
actttatatg gtgtgttgat 780gtgttatggt tttgacaact tttttttttt
ctggattttt gcagatttaa ggaagcc 837827DNAartificialprimer 8ggcttcctta
aatctgcaaa aatccag 27932DNAartificialprimer 9gggcaatcaa attatatatg
taaagcaatt ac 321030DNAartificialprimer 10tttactatcg gttccaatta
aaataatgat 301130DNAartificialprimer 11aattctcact ttttgaatca
atcagtgtac 301231DNAartificialprimer 12ttgtaaaaac tagaagtatt
tttcatgaga t 311323DNAartificialprimer 13atactatcga gtaatggccg cgc
231424DNAartificialprimer 14aactacgaga cgtgagaagc acgc
241526DNAartificialprimer 15cctttgttca ctcttcttct cttgcg
2616770DNAGlycine max 16gttcgttatc taccaccgtt ctatggattt tattccttct
attcgtgttt attctattgg 60tttatgttgc ttgcaatatg ttttttctga atctgtcgtc
gttgtcttca attttatcca 120tgtttcagag atcaattttg tttgtgtagt
atgtgcttat tcttcttctt ttcgttcgag 180ttgttaataa cggtgctatg
gtgttttcaa aagtgttttt tttattactt ttgatttaaa 240gtttttttgg
taaggctttt atttgcttgt tatattcaaa tctttggatc cagatcttat
300ataagttttt ggttcaagaa agtttttggt tactgatgaa tagatctatt
aactgttact 360ttaatcgatt caagctaaag ttttttggtt actgatgaat
agatctatta tctgttactt 420ttaatcggtt caagctcaag ttttttggtt
actgatgaat agatctatat acgtcacagt 480gtgctaaaca tgcccttgtt
ttatctcgat cttatgtatg ggagtgccat aaattttgtt 540atgtctattt
ttttatctgt tggaatcata ctgagtttga tgcgttacga ttgagcatac
600ctatttttgg gcttgttgta tggtgggtat ttagatctta atctttttat
gcttatgaaa 660ggttttgtaa tgacaaaggt cttaatgttg ttaaactttt
atttttactt tatatggtgt 720gttgatgtgt tatggttttg acaacttttt
ttttttctgg atttttgcag 7701754DNAGlycine max 17cctttgttca ctcttcttct
cttgcggcta gggttttagc gcagcttctt ctag 541813DNAGlycine max
18atttaaggaa gcc 131941DNAartificialprimer 19acttcccggg caatcaaatt
atatatgtaa agcaattaca g 412035DNAartificialprimer 20tagtccatgg
cttccttaaa tctgcaaaaa tccag 35211724DNAGlycine max 21cctttgttca
ctcttcttct cttgcggcta gggttttagc gcagcttctt ctagatttaa 60ggaagatggg
taaggaaaag gttcacatca gtattgtggt cattggccat gtcgactctg
120ggaaatccac taccactggt cacctgattt acaagcttgg aggcattgac
aagcgtgtta 180ttgagaggtt tgagaaggaa gctgctgaga tgaacaagag
gtctttcaag tatgcctggg 240tgctggacaa acttaaggct gagcgtgaaa
gaggaatcac cattgatatt gctttgtgga 300agtttgaaac aacaaagtat
tattgcacag ttattgatgc gcctggacat agggatttca 360ttaagaatat
gattactggg acatcccaag ctgactgtgc tgttcttatc attgattcga
420ccactggtgg ttttgaagct ggtatttcca aggatggaca gactcgtgaa
catgctctgc 480tttcattcac ccttggtgtg aaacagatga tttgctgctg
taacaaaatg gatgctacta 540cacccaagta ttccaaggcc aggtatgatg
aaattgtgaa ggaagtctct tcctacttga 600agaaagtagg atacaaccct
gacaagattc cttttgttcc tatctctggt tttgagggag 660acaacatgat
tgagaggtcc acaaaccttg actggtacaa gggtccaact ctgcttgatg
720cacttgacca gattagtgag cccaagaggc cctctgacaa gcccctcagg
cttccccttc 780aggatgtgta caagattgga ggtattggaa ctgtgccagt
gggacgtgtt gagaccggtg 840tcttgaagcc tggaatggtg gtgacttttg
caccaactgg actgacaact gaagtcaagt 900ctgtggagat gcaccatgaa
tctcttacag aggcacatcc tggtgacaat gtgggattca 960atgttaagaa
tgttgctgtt aaggatttga agcgtggtta tgttgcctca aactcaaagg
1020atgaccctgc aaaggaggct gctaacttca cagcccaagt catcatcatg
aaccaccctg 1080gtcagattgg aaatggctat gcccctgtcc tcgactgcca
cacttctcac attgctgtca 1140agtttgctga actcatgacc aagattgaca
ggcgatccgg caaagagctt gagaaggagc 1200ccaagttttt gaagaacggt
gatgctggtt ttgttaagat gattccaacc aaacccatgg 1260ttgttgaaac
tttctccgag tatcctccac ttggtaggtt tgctgttagg gacatgcgtc
1320aaactgttgc tgtgggagtc atcaagaacg ttgagaagaa ggatcctacc
ggagccaagg 1380tcaccaaggc tgcccagaag aagaagtgaa tcgtgcgggc
tggttcatca ggggatgttg 1440gttacaataa atgttggttt cttttctgta
ctcttgtgtc ttcttttcta ggtagcttgt 1500ttttcggaca aagtttgaag
tctccaccat catctcgcaa ctgttgttcc cagaactggg 1560ttcttgatcg
acggtggcaa aattgctttt atttatctgt gttttaatgt gttgtgtttg
1620tcggaacccc tgattacatt tttgttaagc gcagcgagtt tcaggacttt
gctgcgttgt 1680gttgctttgg tttattaaat gtcaactttc tatttgtagt gttc
172422447PRTGlycine max 22Met Gly Lys Glu Lys Val His Ile Ser Ile
Val Val Ile Gly His Val 1 5 10 15 Asp Ser Gly Lys Ser Thr Thr Thr
Gly His Leu Ile Tyr Lys Leu Gly 20 25 30 Gly Ile Asp Lys Arg Val
Ile Glu Arg Phe Glu Lys Glu Ala Ala Glu 35 40 45 Met Asn Lys Arg
Ser Phe Lys Tyr Ala Trp Val Leu Asp Lys Leu Lys 50 55 60 Ala Glu
Arg Glu Arg Gly Ile Thr Ile Asp Ile Ala Leu Trp Lys Phe 65 70 75 80
Glu Thr Thr Lys Tyr Tyr Cys Thr Val Ile Asp Ala Pro Gly His Arg 85
90 95 Asp Phe Ile Lys Asn Met Ile Thr Gly Thr Ser Gln Ala Asp Cys
Ala 100 105 110 Val Leu Ile Ile Asp Ser Thr Thr Gly Gly Phe Glu Ala
Gly Ile Ser 115 120 125 Lys Asp Gly Gln Thr Arg Glu His Ala Leu Leu
Ser Phe Thr Leu Gly 130 135 140 Val Lys Gln Met Ile Cys Cys Cys Asn
Lys Met Asp Ala Thr Thr Pro 145 150 155 160 Lys Tyr Ser Lys Ala
Arg Tyr Asp Glu Ile Val Lys Glu Val Ser Ser 165 170 175 Tyr Leu Lys
Lys Val Gly Tyr Asn Pro Asp Lys Ile Pro Phe Val Pro 180 185 190 Ile
Ser Gly Phe Glu Gly Asp Asn Met Ile Glu Arg Ser Thr Asn Leu 195 200
205 Asp Trp Tyr Lys Gly Pro Thr Leu Leu Asp Ala Leu Asp Gln Ile Ser
210 215 220 Glu Pro Lys Arg Pro Ser Asp Lys Pro Leu Arg Leu Pro Leu
Gln Asp 225 230 235 240 Val Tyr Lys Ile Gly Gly Ile Gly Thr Val Pro
Val Gly Arg Val Glu 245 250 255 Thr Gly Val Leu Lys Pro Gly Met Val
Val Thr Phe Ala Pro Thr Gly 260 265 270 Leu Thr Thr Glu Val Lys Ser
Val Glu Met His His Glu Ser Leu Thr 275 280 285 Glu Ala His Pro Gly
Asp Asn Val Gly Phe Asn Val Lys Asn Val Ala 290 295 300 Val Lys Asp
Leu Lys Arg Gly Tyr Val Ala Ser Asn Ser Lys Asp Asp 305 310 315 320
Pro Ala Lys Glu Ala Ala Asn Phe Thr Ala Gln Val Ile Ile Met Asn 325
330 335 His Pro Gly Gln Ile Gly Asn Gly Tyr Ala Pro Val Leu Asp Cys
His 340 345 350 Thr Ser His Ile Ala Val Lys Phe Ala Glu Leu Met Thr
Lys Ile Asp 355 360 365 Arg Arg Ser Gly Lys Glu Leu Glu Lys Glu Pro
Lys Phe Leu Lys Asn 370 375 380 Gly Asp Ala Gly Phe Val Lys Met Ile
Pro Thr Lys Pro Met Val Val 385 390 395 400 Glu Thr Phe Ser Glu Tyr
Pro Pro Leu Gly Arg Phe Ala Val Arg Asp 405 410 415 Met Arg Gln Thr
Val Ala Val Gly Val Ile Lys Asn Val Glu Lys Lys 420 425 430 Asp Pro
Thr Gly Ala Lys Val Thr Lys Ala Ala Gln Lys Lys Lys 435 440 445
235465DNAartificialvector 23catggcccac agcaagcacg gcctgaagga
ggagatgacc atgaagtacc acatggaggg 60ctgcgtgaac ggccacaagt tcgtgatcac
cggcgagggc atcggctacc ccttcaaggg 120caagcagacc atcaacctgt
gcgtgatcga gggcggcccc ctgcccttca gcgaggacat 180cctgagcgcc
ggcttcaagt acggcgaccg gatcttcacc gagtaccccc aggacatcgt
240ggactacttc aagaacagct gccccgccgg ctacacctgg ggccggagct
tcctgttcga 300ggacggcgcc gtgtgcatct gtaacgtgga catcaccgtg
agcgtgaagg agaactgcat 360ctaccacaag agcatcttca acggcgtgaa
cttccccgcc gacggccccg tgatgaagaa 420gatgaccacc aactgggagg
ccagctgcga gaagatcatg cccgtgccta agcagggcat 480cctgaagggc
gacgtgagca tgtacctgct gctgaaggac ggcggccggt accggtgcca
540gttcgacacc gtgtacaagg ccaagagcgt gcccagcaag atgcccgagt
ggcacttcat 600ccagcacaag ctgctgcggg aggaccggag cgacgccaag
aaccagaagt ggcagctgac 660cgagcacgcc atcgccttcc ccagcgccct
ggcctgagag ctcgaatttc cccgatcgtt 720caaacatttg gcaataaagt
ttcttaagat tgaatcctgt tgccggtctt gcgatgatta 780tcatataatt
tctgttgaat tacgttaagc atgtaataat taacatgtaa tgcatgacgt
840tatttatgag atgggttttt atgattagag tcccgcaatt atacatttaa
tacgcgatag 900aaaacaaaat atagcgcgca aactaggata aattatcgcg
cgcggtgtca tctatgttac 960tagatcggga attctagtgg ccggcccagc
tgatatccat cacactggcg gccgcactcg 1020actgaattgg ttccggcgcc
agcctgcttt tttgtacaaa gttggcatta taaaaaagca 1080ttgcttatca
atttgttgca acgaacaggt cactatcagt caaaataaaa tcattatttg
1140gggcccgagc ttaagtaact aactaacagg aagagtttgt agaaacgcaa
aaaggccatc 1200cgtcaggatg gccttctgct tagtttgatg cctggcagtt
tatggcgggc gtcctgcccg 1260ccaccctccg ggccgttgct tcacaacgtt
caaatccgct cccggcggat ttgtcctact 1320caggagagcg ttcaccgaca
aacaacagat aaaacgaaag gcccagtctt ccgactgagc 1380ctttcgtttt
atttgatgcc tggcagttcc ctactctcgc ttagtagtta gacgtccccg
1440agatccatgc tagcggtaat acggttatcc acagaatcag gggataacgc
aggaaagaac 1500atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa
aggccgcgtt gctggcgttt 1560ttccataggc tccgcccccc tgacgagcat
cacaaaaatc gacgctcaag tcagaggtgg 1620cgaaacccga caggactata
aagataccag gcgtttcccc ctggaagctc cctcgtgcgc 1680tctcctgttc
cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc
1740gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt
cgttcgctcc 1800aagctgggct gtgtgcacga accccccgtt cagcccgacc
gctgcgcctt atccggtaac 1860tatcgtcttg agtccaaccc ggtaagacac
gacttatcgc cactggcagc agccactggt 1920aacaggatta gcagagcgag
gtatgtaggc ggtgctacag agttcttgaa gtggtggcct 1980aactacggct
acactagaag aacagtattt ggtatctgcg ctctgctgaa gccagttacc
2040ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg
tagcggtggt 2100ttttttgttt gcaagcagca gattacgcgc agaaaaaaag
gatctcaaga agatcctttg 2160atcttttcta cggggtctga cgctcagtgg
aacggggccc aatctgaata atgttacaac 2220caattaacca attctgatta
gaaaaactca tcgagcatca aatgaaactg caatttattc 2280atatcaggat
tatcaatacc atatttttga aaaagccgtt tctgtaatga aggagaaaac
2340tcaccgaggc agttccatag gatggcaaga tcctggtatc ggtctgcgat
tccgactcgt 2400ccaacatcaa tacaacctat taatttcccc tcgtcaaaaa
taaggttatc aagtgagaaa 2460tcaccatgag tgacgactga atccggtgag
aatggcaaaa gtttatgcat ttctttccag 2520acttgttcaa caggccagcc
attacgctcg tcatcaaaat cactcgcatc aaccaaaccg 2580ttattcattc
gtgattgcgc ctgagcgaga cgaaatacgc gatcgctgtt aaaaggacaa
2640ttacaaacag gaatcgaatg caaccggcgc aggaacactg ccagcgcatc
aacaatattt 2700tcacctgaat caggatattc ttctaatacc tggaatgctg
tttttccggg gatcgcagtg 2760gtgagtaacc atgcatcatc aggagtacgg
ataaaatgct tgatggtcgg aagaggcata 2820aattccgtca gccagtttag
tctgaccatc tcatctgtaa catcattggc aacgctacct 2880ttgccatgtt
tcagaaacaa ctctggcgca tcgggcttcc catacaagcg atagattgtc
2940gcacctgatt gcccgacatt atcgcgagcc catttatacc catataaatc
agcatccatg 3000ttggaattta atcgcggcct cgacgtttcc cgttgaatat
ggctcataac accccttgta 3060ttactgttta tgtaagcaga cagttttatt
gttcatgatg atatattttt atcttgtgca 3120atgtaacatc agagattttg
agacacgggc cagagctgca gctggatggc aaataatgat 3180tttattttga
ctgatagtga cctgttcgtt gcaacaaatt gataagcaat gctttcttat
3240aatgccaact ttgtacaaga aagctgggtc tagatatctc gacccgggca
atcaaattat 3300atatgtaaag caattacagt ttatcaaact ttatttatgg
aaataattta ttatcacatt 3360tattttggtt tataaatttt aaattaaaat
atcacctaaa taaaaataat ttttaacatg 3420acttattgtc ctaaataaat
tatttccgta aattaaataa aatgaagttt ttttctttca 3480aagaatctaa
atggtcataa tgagaattct ctaaaaaaat acataatgag aataattatg
3540gaatttattt attaataaaa attaatagca ttttgataga caattaataa
aattttaaaa 3600ataaccatat agaaataata atttttttac tatcggttcc
aattaaaata atgataaaaa 3660ataaaataga ttattaattg atattgatat
gaaatttaaa taaagaatat aatcatatat 3720tttattgata tatgatatga
tatagattaa ttgatattga ttttgatatg gaatttaaaa 3780ataatataat
aattgttttt atttattaat acgtgtaatc aaataattct cactttttga
3840atcaatcagt gtacttaaag ataatatcag ttgaatattt tttatccttt
tacgtgtgct 3900gtgagacatt atcatcaatt gtgttgtata tgatatatag
atatagatat ataaatatat 3960agattgagtg atataatata tttaaaatat
aaattatata tatgttttaa tatatttttg 4020catatatata tatatttgta
aaaactagaa gtatttttca tgagataatt attatcgagt 4080tgaataagtc
tattatttgt gagagccaac catatttata tatgtgatta aattttatct
4140ttgtgaaatt aaaaataata aaaaatacct taaaatcata ataatagaaa
aacttatatt 4200tataatttac cattatactt aaaaaaaatt aaataaatat
tataaatata aatactatcg 4260agtaatggcc gcgctagggt ttttgagaaa
aaatcttccc acgcactcaa ctgcactgta 4320cggcgtcgtt ttcacagccg
cataatagaa gccgcgttcc ccaacccttc ctcacaacat 4380tctcggaccc
tccagcaccg tcacccaaac aaatatccac gcggtagtag gcgcgtgaaa
4440caaactctaa tccgaactac gagacgtgag aagcacgcgc tttagcgagc
gtttcaattg 4500tcgctacgaa agcagagaag gatacaaacg gaactagggt
aaattagtaa gggtaatttc 4560gtaaacagaa gaaaagagtt gtagctataa
ataaaccctc taaccctcgt cgcattactt 4620ctcttcacac ctttgttcac
tcttcttctc ttgcggctag ggttttagcg cagcttcttc 4680taggttcgtt
atctaccacc gttctatgga ttttattcct tctattcgtg tttattctat
4740tggtttatgt tgcttgcaat atgttttttc tgaatctgtc gtcgttgtct
tcaattttat 4800ccatgtttca gagatcaatt ttgtttgtgt agtatgtgct
tattcttctt cttttcgttc 4860gagttgttaa taacggtgct atggtgtttt
caaaagtgtt ttttttatta cttttgattt 4920aaagtttttt tggtaaggct
tttatttgct tgttatattc aaatctttgg atccagatct 4980tatataagtt
tttggttcaa gaaagttttt ggttactgat gaatagatct attaactgtt
5040actttaatcg attcaagcta aagttttttg gttactgatg aatagatcta
ttatctgtta 5100cttttaatcg gttcaagctc aagttttttg gttactgatg
aatagatcta tatacgtcac 5160agtgtgctaa acatgccctt gttttatctc
gatcttatgt atgggagtgc cataaatttt 5220gttatgtcta tttttttatc
tgttggaatc atactgagtt tgatgcgtta cgattgagca 5280tacctatttt
tgggcttgtt gtatggtggg tatttagatc ttaatctttt tatgcttatg
5340aaaggttttg taatgacaaa ggtcttaatg ttgttaaact tttattttta
ctttatatgg 5400tgtgttgatg tgttatggtt ttgacaactt tttttttttc
tggatttttg cagatttaag 5460gaagc 5465249768DNAartificialvector
24tttgtacaaa cttgtgattc ttccttacca atcatactaa ttattttggg ttaaatatta
60atcattattt ttaagatatt aattaagaaa ttaaaagatt ttttaaaaaa atgtataaaa
120ttatattatt catgattttt catacatttg attttgataa taaatatatt
ttttttaatt 180tcttaaaaaa tgttgcaaga cacttattag acatagtctt
gttctgttta caaaagcatt 240catcatttaa tacattaaaa aatatttaat
actaacagta gaatcttctt gtgagtggtg 300tgggagtagg caacctggca
ttgaaacgag agaaagagag tcagaaccag aagacaaata 360aaaagtatgc
aacaaacaaa tcaaaatcaa agggcaaagg ctggggttgg ctcaattggt
420tgctacattc aattttcaac tcagtcaacg gttgagattc actctgactt
ccccaatcta 480agccgcggat gcaaacggtt gaatctaacc cacaatccaa
tctcgttact taggggcttt 540tccgtcatta actcacccct gccacccggt
ttccctataa attggaactc aatgctcccc 600tctaaactcg tatcgcttca
gagttgagac caagacacac tcgttcatat atctctctgc 660tcttctcttc
tcttctacct ctcaaggtac ttttcttctc cctctaccaa atcctagatt
720ccgtggttca atttcggatc ttgcacttct ggtttgcttt gccttgcttt
ttcctcaact 780gggtccatct aggatccatg tgaaactcta ctctttcttt
aatatctgcg gaatacgcgt 840ttgactttca gatctagtcg aaatcatttc
ataattgcct ttctttcttt tagcttatga 900gaaataaaat cacttttttt
ttatttcaaa ataaaccttg ggccttgtgc tgactgagat 960ggggtttggt
gattacagaa ttttagcgaa ttttgtaatt gtacttgttt gtctgtagtt
1020ttgttttgtt ttcttgtttc tcatacattc cttaggcttc aattttattc
gagtataggt 1080cacaatagga attcaaactt tgagcagggg aattaatccc
ttccttcaaa tccagtttgt 1140ttgtatatat gtttaaaaaa tgaaactttt
gctttaaatt ctattataac tttttttatg 1200gctgaaattt ttgcatgtgt
ctttgctctc tgttgtaaat ttactgttta ggtactaact 1260ctaggcttgt
tgtgcagttt ttgaagtata accatgccac acaacacaat ggcggccacc
1320gcttccagaa ccacccgatt ctcttcttcc tcttcacacc ccaccttccc
caaacgcatt 1380actagatcca ccctccctct ctctcatcaa accctcacca
aacccaacca cgctctcaaa 1440atcaaatgtt ccatctccaa accccccacg
gcggcgccct tcaccaagga agcgccgacc 1500acggagccct tcgtgtcacg
gttcgcctcc ggcgaacctc gcaagggcgc ggacatcctt 1560gtggaggcgc
tggagaggca gggcgtgacg acggtgttcg cgtaccccgg cggtgcgtcg
1620atggagatcc accaggcgct cacgcgctcc gccgccatcc gcaacgtgct
cccgcgccac 1680gagcagggcg gcgtcttcgc cgccgaaggc tacgcgcgtt
cctccggcct ccccggcgtc 1740tgcattgcca cctccggccc cggcgccacc
aacctcgtga gcggcctcgc cgacgcttta 1800atggacagcg tcccagtcgt
cgccatcacc ggccaggtcg cccgccggat gatcggcacc 1860gacgccttcc
aagaaacccc gatcgtggag gtgagcagat ccatcacgaa gcacaactac
1920ctcatcctcg acgtcgacga catcccccgc gtcgtcgccg aggctttctt
cgtcgccacc 1980tccggccgcc ccggtccggt cctcatcgac attcccaaag
acgttcagca gcaactcgcc 2040gtgcctaatt gggacgagcc cgttaacctc
cccggttacc tcgccaggct gcccaggccc 2100cccgccgagg cccaattgga
acacattgtc agactcatca tggaggccca aaagcccgtt 2160ctctacgtcg
gcggtggcag tttgaattcc agtgctgaat tgaggcgctt tgttgaactc
2220actggtattc ccgttgctag cactttaatg ggtcttggaa cttttcctat
tggtgatgaa 2280tattcccttc agatgctggg tatgcatggt actgtttatg
ctaactatgc tgttgacaat 2340agtgatttgt tgcttgcctt tggggtaagg
tttgatgacc gtgttactgg gaagcttgag 2400gcttttgcta gtagggctaa
gattgttcac attgatattg attctgccga gattgggaag 2460aacaagcagg
cgcacgtgtc ggtttgcgcg gatttgaagt tggccttgaa gggaattaat
2520atgattttgg aggagaaagg agtggagggt aagtttgatc ttggaggttg
gagagaagag 2580attaatgtgc agaaacacaa gtttccattg ggttacaaga
cattccagga cgcgatttct 2640ccgcagcatg ctatcgaggt tcttgatgag
ttgactaatg gagatgctat tgttagtact 2700ggggttgggc agcatcaaat
gtgggctgcg cagttttaca agtacaagag accgaggcag 2760tggttgacct
cagggggtct tggagccatg ggttttggat tgcctgcggc tattggtgct
2820gctgttgcta accctggggc tgttgtggtt gacattgatg gggatggtag
tttcatcatg 2880aatgttcagg agttggccac tataagagtg gagaatctcc
cagttaagat attgttgttg 2940aacaatcagc atttgggtat ggtggttcag
ttggaggata ggttctacaa gtccaataga 3000gctcacacct atcttggaga
tccgtctagc gagagcgaga tattcccaaa catgctcaag 3060tttgctgatg
cttgtgggat accggcagcg cgagtgacga agaaggaaga gcttagagcg
3120gcaattcaga gaatgttgga cacccctggc ccctaccttc ttgatgtcat
tgtgccccat 3180caggagcatg tgttgccgat gattcccagt aatggatcct
tcaaggatgt gataactgag 3240ggtgatggta gaacgaggta ctgattgcct
agaccaaatg ttccttgatg cttgttttgt 3300acaatatata taagataatg
ctgtcctagt tgcaggattt ggcctgtggt gagcatcata 3360gtctgtagta
gttttggtag caagacattt tattttcctt ttatttaact tactacatgc
3420agtagcatct atctatctct gtagtctgat atctcctgtt gtctgtattg
tgccgttgga 3480ttttttgctg tagtgagact gaaaatgatg tgctagtaat
aatatttctg ttagaaatct 3540aagtagagaa tctgttgaag aagtcaaaag
ctaatggaat caggttacat attcaatgtt 3600tttctttttt tagcggttgg
tagacgtgta gattcaactt ctcttggagc tcacctaggc 3660aatcagtaaa
atgcatattc cttttttaac ttgccattta tttactttta gtggaaattg
3720tgaccaattt gttcatgtag aacggatttg gaccattgcg tccacaaaac
gtctcttttg 3780ctcgatcttc acaaagcgat accgaaatcc agagatagtt
ttcaaaagtc agaaatggca 3840aagttataaa tagtaaaaca gaatagatgc
tgtaatcgac ttcaataaca agtggcatca 3900cgtttctagt tctagacccg
ggtaccggcg cgcccgatca tccggatata gttcctcctt 3960tcagcaaaaa
acccctcaag acccgtttag aggccccaag gggttatgct agttattgct
4020cagcggtggc agcagccaac tcagcttcct ttcgggcttt gttagcagcc
ggatcgatcc 4080aagctgtacc tcactattcc tttgccctcg gacgagtgct
ggggcgtcgg tttccactat 4140cggcgagtac ttctacacag ccatcggtcc
agacggccgc gcttctgcgg gcgatttgtg 4200tacgcccgac agtcccggct
ccggatcgga cgattgcgtc gcatcgaccc tgcgcccaag 4260ctgcatcatc
gaaattgccg tcaaccaagc tctgatagag ttggtcaaga ccaatgcgga
4320gcatatacgc ccggagccgc ggcgatcctg caagctccgg atgcctccgc
tcgaagtagc 4380gcgtctgctg ctccatacaa gccaaccacg gcctccagaa
gaagatgttg gcgacctcgt 4440attgggaatc cccgaacatc gcctcgctcc
agtcaatgac cgctgttatg cggccattgt 4500ccgtcaggac attgttggag
ccgaaatccg cgtgcacgag gtgccggact tcggggcagt 4560cctcggccca
aagcatcagc tcatcgagag cctgcgcgac ggacgcactg acggtgtcgt
4620ccatcacagt ttgccagtga tacacatggg gatcagcaat cgcgcatatg
aaatcacgcc 4680atgtagtgta ttgaccgatt ccttgcggtc cgaatgggcc
gaacccgctc gtctggctaa 4740gatcggccgc agcgatcgca tccatagcct
ccgcgaccgg ctgcagaaca gcgggcagtt 4800cggtttcagg caggtcttgc
aacgtgacac cctgtgcacg gcgggagatg caataggtca 4860ggctctcgct
gaattcccca atgtcaagca cttccggaat cgggagcgcg gccgatgcaa
4920agtgccgata aacataacga tctttgtaga aaccatcggc gcagctattt
acccgcagga 4980catatccacg ccctcctaca tcgaagctga aagcacgaga
ttcttcgccc tccgagagct 5040gcatcaggtc ggagacgctg tcgaactttt
cgatcagaaa cttctcgaca gacgtcgcgg 5100tgagttcagg cttttccatg
ggtatatctc cttcttaaag ttaaacaaaa ttatttctag 5160agggaaaccg
ttgtggtctc cctatagtga gtcgtattaa tttcgcggga tcgagatctg
5220atcaacctgc attaatgaat cggccaacgc gcggggagag gcggtttgcg
tattgggcgc 5280tcttccgctt cctcgctcac tgactcgctg cgctcggtcg
ttcggctgcg gcgagcggta 5340tcagctcact caaaggcggt aatacggtta
tccacagaat caggggataa cgcaggaaag 5400aacatgtgag caaaaggcca
gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg 5460tttttccata
ggctccgccc ccctgacgag catcacaaaa atcgacgctc aagtcagagg
5520tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag
ctccctcgtg 5580cgctctcctg ttccgaccct gccgcttacc ggatacctgt
ccgcctttct cccttcggga 5640agcgtggcgc tttctcaatg ctcacgctgt
aggtatctca gttcggtgta ggtcgttcgc 5700tccaagctgg gctgtgtgca
cgaacccccc gttcagcccg accgctgcgc cttatccggt 5760aactatcgtc
ttgagtccaa cccggtaaga cacgacttat cgccactggc agcagccact
5820ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt
gaagtggtgg 5880cctaactacg gctacactag aaggacagta tttggtatct
gcgctctgct gaagccagtt 5940accttcggaa aaagagttgg tagctcttga
tccggcaaac aaaccaccgc tggtagcggt 6000ggtttttttg tttgcaagca
gcagattacg cgcagaaaaa aaggatctca agaagatcct 6060ttgatctttt
ctacggggtc tgacgctcag tggaacgaaa actcacgtta agggattttg
6120gtcatgacat taacctataa aaataggcgt atcacgaggc cctttcgtct
cgcgcgtttc 6180ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg
agacggtcac agcttgtctg 6240taagcggatg ccgggagcag acaagcccgt
cagggcgcgt cagcgggtgt tggcgggtgt 6300cggggctggc ttaactatgc
ggcatcagag cagattgtac tgagagtgca ccatatggac 6360atattgtcgt
tagaacgcgg ctacaattaa tacataacct tatgtatcat acacatacga
6420tttaggtgac actatagaac ggcgcgccgg taccgggccc cccctcgagt
gcggccgcaa 6480gcttgtcgac ggagatcacc actttgtaca agaaagctgg
gtctagatat ctcgacccgg 6540gcaatcaaat tatatatgta aagcaattac
agtttatcaa actttattta tggaaataat 6600ttattatcac atttattttg
gtttataaat tttaaattaa aatatcacct aaataaaaat 6660aatttttaac
atgacttatt gtcctaaata aattatttcc gtaaattaaa taaaatgaag
6720tttttttctt tcaaagaatc taaatggtca taatgagaat tctctaaaaa
aatacataat 6780gagaataatt atggaattta tttattaata aaaattaata
gcattttgat agacaattaa 6840taaaatttta aaaataacca tatagaaata
ataatttttt tactatcggt tccaattaaa 6900ataatgataa aaaataaaat
agattattaa ttgatattga tatgaaattt aaataaagaa 6960tataatcata
tattttattg atatatgata tgatatagat taattgatat tgattttgat
7020atggaattta aaaataatat aataattgtt tttatttatt aatacgtgta
atcaaataat 7080tctcactttt tgaatcaatc agtgtactta aagataatat
cagttgaata ttttttatcc 7140ttttacgtgt gctgtgagac attatcatca
attgtgttgt atatgatata tagatataga 7200tatataaata tatagattga
gtgatataat atatttaaaa tataaattat atatatgttt 7260taatatattt
ttgcatatat atatatattt gtaaaaacta gaagtatttt tcatgagata
7320attattatcg agttgaataa gtctattatt tgtgagagcc aaccatattt
atatatgtga 7380ttaaatttta tctttgtgaa attaaaaata ataaaaaata
ccttaaaatc ataataatag 7440aaaaacttat atttataatt taccattata
cttaaaaaaa attaaataaa tattataaat 7500ataaatacta tcgagtaatg
gccgcgctag ggtttttgag aaaaaatctt cccacgcact 7560caactgcact
gtacggcgtc gttttcacag ccgcataata gaagccgcgt tccccaaccc
7620ttcctcacaa cattctcgga ccctccagca ccgtcaccca aacaaatatc cacgcggtag
7680taggcgcgtg aaacaaactc taatccgaac tacgagacgt gagaagcacg
cgctttagcg 7740agcgtttcaa ttgtcgctac gaaagcagag aaggatacaa
acggaactag ggtaaattag 7800taagggtaat ttcgtaaaca gaagaaaaga
gttgtagcta taaataaacc ctctaaccct 7860cgtcgcatta cttctcttca
cacctttgtt cactcttctt ctcttgcggc tagggtttta 7920gcgcagcttc
ttctaggttc gttatctacc accgttctat ggattttatt ccttctattc
7980gtgtttattc tattggttta tgttgcttgc aatatgtttt ttctgaatct
gtcgtcgttg 8040tcttcaattt tatccatgtt tcagagatca attttgtttg
tgtagtatgt gcttattctt 8100cttcttttcg ttcgagttgt taataacggt
gctatggtgt tttcaaaagt gtttttttta 8160ttacttttga tttaaagttt
ttttggtaag gcttttattt gcttgttata ttcaaatctt 8220tggatccaga
tcttatataa gtttttggtt caagaaagtt tttggttact gatgaataga
8280tctattaact gttactttaa tcgattcaag ctaaagtttt ttggttactg
atgaatagat 8340ctattatctg ttacttttaa tcggttcaag ctcaagtttt
ttggttactg atgaatagat 8400ctatatacgt cacagtgtgc taaacatgcc
cttgttttat ctcgatctta tgtatgggag 8460tgccataaat tttgttatgt
ctattttttt atctgttgga atcatactga gtttgatgcg 8520ttacgattga
gcatacctat ttttgggctt gttgtatggt gggtatttag atcttaatct
8580ttttatgctt atgaaaggtt ttgtaatgac aaaggtctta atgttgttaa
acttttattt 8640ttactttata tggtgtgttg atgtgttatg gttttgacaa
cttttttttt ttctggattt 8700ttgcagattt aaggaagcca tggcccacag
caagcacggc ctgaaggagg agatgaccat 8760gaagtaccac atggagggct
gcgtgaacgg ccacaagttc gtgatcaccg gcgagggcat 8820cggctacccc
ttcaagggca agcagaccat caacctgtgc gtgatcgagg gcggccccct
8880gcccttcagc gaggacatcc tgagcgccgg cttcaagtac ggcgaccgga
tcttcaccga 8940gtacccccag gacatcgtgg actacttcaa gaacagctgc
cccgccggct acacctgggg 9000ccggagcttc ctgttcgagg acggcgccgt
gtgcatctgt aacgtggaca tcaccgtgag 9060cgtgaaggag aactgcatct
accacaagag catcttcaac ggcgtgaact tccccgccga 9120cggccccgtg
atgaagaaga tgaccaccaa ctgggaggcc agctgcgaga agatcatgcc
9180cgtgcctaag cagggcatcc tgaagggcga cgtgagcatg tacctgctgc
tgaaggacgg 9240cggccggtac cggtgccagt tcgacaccgt gtacaaggcc
aagagcgtgc ccagcaagat 9300gcccgagtgg cacttcatcc agcacaagct
gctgcgggag gaccggagcg acgccaagaa 9360ccagaagtgg cagctgaccg
agcacgccat cgccttcccc agcgccctgg cctgagagct 9420cgaatttccc
cgatcgttca aacatttggc aataaagttt cttaagattg aatcctgttg
9480ccggtcttgc gatgattatc atataatttc tgttgaatta cgttaagcat
gtaataatta 9540acatgtaatg catgacgtta tttatgagat gggtttttat
gattagagtc ccgcaattat 9600acatttaata cgcgatagaa aacaaaatat
agcgcgcaaa ctaggataaa ttatcgcgcg 9660cggtgtcatc tatgttacta
gatcgggaat tctagtggcc ggcccagctg atatccatca 9720cactggcggc
cgcactcgac tgaattggtt ccggcgccag cctgcttt
9768255839DNAartificialvector 25cttgtacaaa gtggttgatg ggatccatgg
cccacagcaa gcacggcctg aaggaggaga 60tgaccatgaa gtaccacatg gagggctgcg
tgaacggcca caagttcgtg atcaccggcg 120agggcatcgg ctaccccttc
aagggcaagc agaccatcaa cctgtgcgtg atcgagggcg 180gccccctgcc
cttcagcgag gacatcctga gcgccggctt caagtacggc gaccggatct
240tcaccgagta cccccaggac atcgtggact acttcaagaa cagctgcccc
gccggctaca 300cctggggccg gagcttcctg ttcgaggacg gcgccgtgtg
catctgtaac gtggacatca 360ccgtgagcgt gaaggagaac tgcatctacc
acaagagcat cttcaacggc gtgaacttcc 420ccgccgacgg ccccgtgatg
aagaagatga ccaccaactg ggaggccagc tgcgagaaga 480tcatgcccgt
gcctaagcag ggcatcctga agggcgacgt gagcatgtac ctgctgctga
540aggacggcgg ccggtaccgg tgccagttcg acaccgtgta caaggccaag
agcgtgccca 600gcaagatgcc cgagtggcac ttcatccagc acaagctgct
gcgggaggac cggagcgacg 660ccaagaacca gaagtggcag ctgaccgagc
acgccatcgc cttccccagc gccctggcct 720gagagctcga atttccccga
tcgttcaaac atttggcaat aaagtttctt aagattgaat 780cctgttgccg
gtcttgcgat gattatcata taatttctgt tgaattacgt taagcatgta
840ataattaaca tgtaatgcat gacgttattt atgagatggg tttttatgat
tagagtcccg 900caattataca tttaatacgc gatagaaaac aaaatatagc
gcgcaaacta ggataaatta 960tcgcgcgcgg tgtcatctat gttactagat
cgggaattct agtggccggc ccagctgata 1020tccatcacac tggcggccgc
tcgagttcta tagtgtcacc taaatcgtat gtgtatgata 1080cataaggtta
tgtattaatt gtagccgcgt tctaacgaca atatgtccat atggtgcact
1140ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc
gccaacaccc 1200gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg
cttacagaca agctgtgacc 1260gtctccggga gctgcatgtg tcagaggttt
tcaccgtcat caccgaaacg cgcgagacga 1320aagggcctcg tgatacgcct
atttttatag gttaatgtca tgaccaaaat cccttaacgt 1380gagttttcgt
tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat
1440cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct
accagcggtg 1500gtttgtttgc cggatcaaga gctaccaact ctttttccga
aggtaactgg cttcagcaga 1560gcgcagatac caaatactgt ccttctagtg
tagccgtagt taggccacca cttcaagaac 1620tctgtagcac cgcctacata
cctcgctctg ctaatcctgt taccagtggc tgctgccagt 1680ggcgataagt
cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag
1740cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac
gacctacacc 1800gaactgagat acctacagcg tgagcattga gaaagcgcca
cgcttcccga agggagaaag 1860gcggacaggt atccggtaag cggcagggtc
ggaacaggag agcgcacgag ggagcttcca 1920gggggaaacg cctggtatct
ttatagtcct gtcgggtttc gccacctctg acttgagcgt 1980cgatttttgt
gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc
2040tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc
tgcgttatcc 2100cctgattctg tggataaccg tattaccgcc tttgagtgag
ctgataccgc tcgccgcagc 2160cgaacgaccg agcgcagcga gtcagtgagc
gaggaagcgg aagagcgccc aatacgcaaa 2220ccgcctctcc ccgcgcgttg
gccgattcat taatgcaggt tgatcagatc tcgatcccgc 2280gaaattaata
cgactcacta tagggagacc acaacggttt ccctctagaa ataattttgt
2340ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga
cgtctgtcga 2400gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg
atgcagctct cggagggcga 2460agaatctcgt gctttcagct tcgatgtagg
agggcgtgga tatgtcctgc gggtaaatag 2520ctgcgccgat ggtttctaca
aagatcgtta tgtttatcgg cactttgcat cggccgcgct 2580cccgattccg
gaagtgcttg acattgggga attcagcgag agcctgacct attgcatctc
2640ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc
ccgctgttct 2700gcagccggtc gcggaggcta tggatgcgat cgctgcggcc
gatcttagcc agacgagcgg 2760gttcggccca ttcggaccgc aaggaatcgg
tcaatacact acatggcgtg atttcatatg 2820cgcgattgct gatccccatg
tgtatcactg gcaaactgtg atggacgaca ccgtcagtgc 2880gtccgtcgcg
caggctctcg atgagctgat gctttgggcc gaggactgcc ccgaagtccg
2940gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg
gccgcataac 3000agcggtcatt gactggagcg aggcgatgtt cggggattcc
caatacgagg tcgccaacat 3060cttcttctgg aggccgtggt tggcttgtat
ggagcagcag acgcgctact tcgagcggag 3120gcatccggag cttgcaggat
cgccgcggct ccgggcgtat atgctccgca ttggtcttga 3180ccaactctat
cagagcttgg ttgacggcaa tttcgatgat gcagcttggg cgcagggtcg
3240atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa
tcgcccgcag 3300aagcgcggcc gtctggaccg atggctgtgt agaagtactc
gccgatagtg gaaaccgacg 3360ccccagcact cgtccgaggg caaaggaata
gtgaggtaca gcttggatcg atccggctgc 3420taacaaagcc cgaaaggaag
ctgagttggc tgctgccacc gctgagcaat aactagcata 3480accccttggg
gcctctaaac gggtcttgag gggttttttg ctgaaaggag gaactatatc
3540cggatgatcg tcgaggcctc acgtgttaac aagcttgcat gcctgcaggt
ttatcaacaa 3600gtttgtacaa aaaagcaggc tccgaattcg cccttgggca
atcaaattat atatgtaaag 3660caattacagt ttatcaaact ttatttatgg
aaataattta ttatcacatt tattttggtt 3720tataaatttt aaattaaaat
atcacctaaa taaaaataat ttttaacatg acttattgtc 3780ctaaataaat
tatttccgta aattaaataa aatgaagttt ttttctttca aagaatctaa
3840atggtcataa tgagaattct ctaaaaaaat acataatgag aataattatg
gaatttattt 3900attaataaaa attaatagca ttttgataga caattaataa
aattttaaaa ataaccatat 3960agaaataata atttttttac tatcggttcc
aattaaaata atgataaaaa ataaaataga 4020ttattaattg atattgatat
gaaatttaaa taaagaatat aatcatatat tttattgata 4080tatgatatga
tatagattaa ttgatattga ttttgatatg gaatttaaaa ataatataat
4140aattgttttt atttattaat acgtgtaatc aaataattct cactttttga
atcaatcagt 4200gtacttaaag ataatatcag ttgaatattt tttatccttt
tacgtgtgct gtgagacatt 4260atcatcaatt gtgttgtata tgatatatag
atatagatat ataaatatat agattgagtg 4320atataatata tttaaaatat
aaattatata tatgttttaa tatatttttg catatatata 4380tatatttgta
aaaactagaa gtatttttca tgagataatt attatcgagt tgaataagtc
4440tattatttgt gagagccaac catatttata tatgtgatta aattttatct
ttgtgaaatt 4500aaaaataata aaaaatacct taaaatcata ataatagaaa
aacttatatt tataatttac 4560cattatactt aaaaaaaatt aaataaatat
tataaatata aatactatcg agtaatggcc 4620gcgctagggt ttttgagaaa
aaatcttccc acgcactcaa ctgcactgta cggcgtcgtt 4680ttcacagccg
cataatagaa gccgcgttcc ccaacccttc ctcacaacat tctcggaccc
4740tccagcaccg tcacccaaac aaatatccac gcggtagtag gcgcgtgaaa
caaactctaa 4800tccgaactac gagacgtgag aagcacgcgc tttagcgagc
gtttcaattg tcgctacgaa 4860agcagagaag gatacaaacg gaactagggt
aaattagtaa gggtaatttc gtaaacagaa 4920gaaaagagtt gtagctataa
ataaaccctc taaccctcgt cgcattactt ctcttcacac 4980ctttgttcac
tcttcttctc ttgcggctag ggttttagcg cagcttcttc taggttcgtt
5040atctaccacc gttctatgga ttttattcct tctattcgtg tttattctat
tggtttatgt 5100tgcttgcaat atgttttttc tgaatctgtc gtcgttgtct
tcaattttat ccatgtttca 5160gagatcaatt ttgtttgtgt agtatgtgct
tattcttctt cttttcgttc gagttgttaa 5220taacggtgct atggtgtttt
caaaagtgtt ttttttatta cttttgattt aaagtttttt 5280tggtaaggct
tttatttgct tgttatattc aaatctttgg atccagatct tatataagtt
5340tttggttcaa gaaagttttt ggttactgat gaatagatct attaactgtt
actttaatcg 5400attcaagcta aagttttttg gttactgatg aatagatcta
ttatctgtta cttttaatcg 5460gttcaagctc aagttttttg gttactgatg
aatagatcta tatacgtcac agtgtgctaa 5520acatgccctt gttttatctc
gatcttatgt atgggagtgc cataaatttt gttatgtcta 5580tttttttatc
tgttggaatc atactgagtt tgatgcgtta cgattgagca tacctatttt
5640tgggcttgtt gtatggtggg tatttagatc ttaatctttt tatgcttatg
aaaggttttg 5700taatgacaaa ggtcttaatg ttgttaaact tttattttta
ctttatatgg tgtgttgatg 5760tgttatggtt ttgacaactt tttttttttc
tggatttttg cagatttaag gaagccaagg 5820gcgaattcga cccagcttt
58392626DNAartificialprimer 26gaccaagaca cactcgttca tatatc
262725DNAartificialprimer 27tctgctgctc aatgtttaca aggac
252822DNAartificialprimer 28ggaagaagag aatcgggtgg tt
222923DNAartificialoligo probe 29attgtgttgt gtggcatggt tat
233023DNAartificialprimer 30ggcttgttgt gcagtttttg aag
233120DNAartificialprimer 31aacggccaca agttcgtgat
203220DNAartificialoligo probe 32accggcgagg gcatcggcta
203320DNAartificialprimer 33cttcaagggc aagcagacca
203424DNAartificialprimer 34caaacttgac aaagccacaa ctct
243520DNAartificialoligo probe 35ctctcatctc atataaatac
203621DNAartificialprimer 36ggagaaattg gtgtcgtgga a
2137100DNAartificialattL1 recombination site 37caaataatga
ttttattttg actgatagtg acctgttcgt tgcaacaaat tgataagcaa 60tgctttttta
taatgccaac tttgtacaaa aaagcaggct 10038100DNAartificialattL2
recombination site 38caaataatga ttttattttg actgatagtg acctgttcgt
tgcaacaaat tgataagcaa 60tgctttctta taatgccaac tttgtacaag aaagctgggt
10039125DNAartificialattR1 recombination site 39acaagtttgt
acaaaaaagc tgaacgagaa acgtaaaatg atataaatat caatatatta 60aattagattt
tgcataaaaa acagactaca taatactgta aaacacaaca tatccagtca 120ctatg
12540125DNAartificialattR2 recombination site 40accactttgt
acaagaaagc tgaacgagaa acgtaaaatg atataaatat caatatatta 60aattagattt
tgcataaaaa acagactaca taatactgta aaacacaaca tatccagtca 120ctatg
1254121DNAartificialattB1 recombination site 41caagtttgta
caaaaaagca g 214221DNAartificialattB2 recombination site
42cagctttctt gtacaaagtg g 214317DNAartificialTag sequence
43gatcgacggt ggcaaga 17
* * * * *