U.S. patent application number 16/292560 was filed with the patent office on 2019-03-05 and published on 2019-06-20 for methods and compositions to enhance plant breeding.
The applicant listed for this patent is Jason Bull, David Butruille, Sam Eathington, Marlin Edwards, Anju Gupta, Richard Johnson, Wayne Kennard, Jennifer Rinehart, Kunsheng Wu. Invention is credited to Jason Bull, David Butruille, Sam Eathington, Marlin Edwards, Anju Gupta, Richard Johnson, Wayne Kennard, Jennifer Rinehart, Kunsheng Wu.
Application Number | 16/292560 |
Publication Number | 20190185876 |
Family ID | 37137417 |
Filed Date | 2019-03-05 |
United States Patent Application | 20190185876 |
Kind Code | A1 |
Bull; Jason ; et al. | June 20, 2019 |
METHODS AND COMPOSITIONS TO ENHANCE PLANT BREEDING
Abstract
The present invention provides breeding methods and compositions
to enhance the germplasm of a plant. The methods describe the
identification and accumulation of transgenes and favorable
haplotype genomic regions in the germplasm of a breeding population
of crop plants.
Inventors: | Bull; Jason (St. Louis, MO); Butruille; David (Urbandale, IA); Eathington; Sam (Ames, IA); Edwards; Marlin (Davis, CA); Gupta; Anju (Ankeny, IA); Johnson; Richard (Urbana, IL); Kennard; Wayne (Ankeny, IA); Rinehart; Jennifer (Spring Green, WI); Wu; Kunsheng (Ballwin, MO) |
Applicant:
Name | City | State | Country | Type |
Bull; Jason | St. Louis | MO | US | |
Butruille; David | Urbandale | IA | US | |
Eathington; Sam | Ames | IA | US | |
Edwards; Marlin | Davis | CA | US | |
Gupta; Anju | Ankeny | IA | US | |
Johnson; Richard | Urbana | IL | US | |
Kennard; Wayne | Ankeny | IA | US | |
Rinehart; Jennifer | Spring Green | WI | US | |
Wu; Kunsheng | Ballwin | MO | US | |
Family ID: | 37137417 |
Appl. No.: | 16/292560 |
Filed: | March 5, 2019 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
15433124 | Feb 15, 2017 | 10273498 |
16292560 | | |
Current U.S. Class: | 1/1 |
Current CPC Class: | A01H 1/02 20130101; Y02A 40/146 20180101; C12N 15/821 20130101; A01N 57/20 20130101; A01H 1/04 20130101; A01H 5/10 20130101; C12N 15/8286 20130101; Y02A 40/162 20180101; C12N 15/8275 20130101 |
International Class: | C12N 15/82 20060101 C12N015/82; A01N 57/20 20060101 A01N057/20; A01H 5/10 20060101 A01H005/10; A01H 1/02 20060101 A01H001/02; A01H 1/04 20060101 A01H001/04 |
Claims
1. A method of breeding a transgenic corn plant comprising the
steps of: providing at least two transgenic corn plants, each corn
plant having at least one transgene inserted into its genome;
determining a map location of the at least one transgene in the
genome of the at least two transgenic corn plants using at least
one DNA marker in the genomic region flanking the transgene insert;
and selecting at least one of the transgenic corn plants for
breeding, wherein the at least one transgenic corn plant that is
selected has in its genome at least one transgene that is
genetically linked to haplotype C1W30H4; and crossing the selected
at least one transgenic corn plants with a second corn plant to
produce one or more progeny plants comprising the at least one
transgene linked to haplotype C1W30H4; and wherein the at least one
transgene and the haplotype are linked at a genetic distance of 0
to within about 5 cM.
2. The method of claim 1, wherein the method further comprises
selecting a progeny plant of the at least one transgenic corn plant
selected for breeding by marker-assisted selection.
3. The method of claim 1, wherein the method further comprises
selecting a progeny plant of the at least one transgenic corn plant
selected for breeding by detection of expression of the at least
one transgene or expression of a transgenic agronomic trait.
4. The method of claim 3, further comprising the step of crossing
the progeny plant with another corn plant to produce additional
progeny plants.
5. The method of claim 1, wherein the genetic marker is a DNA
marker selected from the group consisting of SEQ ID NO: 42-47.
6. The method of claim 1, wherein the at least one transgene
encodes a protein providing an agronomic enhancement selected from
the group consisting of herbicide tolerance, disease resistance,
insect or pest resistance, altered fatty acid, protein or
carbohydrate metabolism, increased grain yield, increased oil,
altered plant maturity, enhanced stress tolerance, and altered
morphological characteristics.
7. The method of claim 6, wherein the herbicide tolerance is
selected from the group consisting of glyphosate, glufosinate,
sulfonylureas, imidazolinones, bromoxynil, dalapon, dicamba, 2,4-D,
cyclohexanedione, protoporphyrinogen oxidase inhibitors, and
isoxaflutole tolerance.
Description
[0001] This application is a continuation of U.S. application Ser.
No. 15/433,124, filed on Feb. 15, 2017, which is a continuation of
U.S. application Ser. No. 14/283,630, filed on May 21, 2014, which
is a continuation of U.S. application Ser. No. 12/640,069, filed on
Dec. 17, 2009, which is a continuation of U.S. application Ser. No.
11/441,915, filed May 26, 2006, which claims the benefit of U.S.
Provisional Application No. 60/685,584, filed May 27, 2005, the
entire text of which is specifically incorporated by reference
herein.
BACKGROUND OF THE INVENTION
Field of the Invention
[0002] The invention relates to the field of plant breeding and
plant biotechnology, in particular to a transgene inserted into
genetic linkage with a genomic region of a plant, and to the use of
the transgene/genomic region to enhance the germplasm and to
accumulate other favorable genomic regions in breeding
populations.
Description of Related Art
[0003] Breeding has advanced from selection for economically
important traits in plants and animals based on phenotypic records
of the individual and its relatives to the use of molecular
genetics to identify genomic regions that contain the valuable
genetic traits. Information at the DNA level has led to faster
genetic accumulation of valuable traits into a germplasm than that
achieved based on the phenotypic data only. The development of
transgenic crops has further revolutionized breeding and
agricultural crop production. The outstanding success of
genetically engineered crops is evident from the fact that the area
of farmland devoted to transgenic crops has grown from a negligible
acreage ten years ago to well over half the acreage for major crops
in agriculturally important countries such as USA, Canada, Brazil
and Argentina. In addition to the development of input traits,
plant biotechnology also holds great promise for the future
development of output traits that will directly benefit consumers,
like nutritionally superior foods, such as the vitamin A enriched
rice, unsaturated oils, and agricultural products of medical value
to name a few. The potential for commercial success of a transgene
encoding a new or improved input or output trait is a great
incentive for development of novel transgenes and their deployment
through breeding these genes into elite germplasm.
[0004] During the development of transgenic crop plants much effort
is concentrated on optimization of the insertion and expression of
the transgene, and then introgressing the transgene throughout the
breeding population by classical breeding methods. The site of
insertion of a transgene into the host genome has been a concern
for at least two reasons: (i) the region where it is inserted may
modulate the level of expression of the transgene, and (ii) the
insertion of the transgene may disrupt the normal function or
expression of a gene near or where it has been inserted. The
selection of genomic locations that are beneficial for gene
integration provides for suitable levels of stable expression of an
introduced gene, or genes, and generally does not negatively affect
other agronomic characteristics of the crop plant.
[0005] The genomic region in which the transgene has been inserted
also provides agronomic phenotypes to the crop plant. These
phenotypes have their own value in a breeding program and these
regions should be considered when selecting among multiple
transgene insertion events. Transgene insertion events into genomic
regions that are associated with improved performance with respect
to an agronomic trait or multiple trait index result in an improved
phenotype in the crop plant and progeny derived from the crop plant
that contain the transgene and the associated improved phenotype.
Selecting for the transgenic event necessarily results in selecting
a segment of the host genome that surrounds it, and the improved
phenotypic effect. Further improvements involve the identification
of molecular markers for the tracking and maintenance of the
genomic segment with the associated transgene. This is an area that
has not been adequately addressed in current plant breeding with
transgene insertion events.
[0006] There is a need in the art of plant breeding to identify
genomic regions associated with improved performance with respect
to an agronomic trait or multiple trait index that are linked with
a transgene insertion event and then select for these
transgene-genomic regions for dispersion into the breeding
population of the crop. The present invention gives
consideration to estimating the value of the genomic region and the
transgene event. This value can then be used as a criterion for
selecting among multiple transgenic events. A further benefit is
that linkage drag around a transgene is minimized and valuable
genomic regions are selected that contain the transgene for
breeding into the germplasm of a crop.
SUMMARY OF THE INVENTION
[0007] The present invention provides a method of breeding with
transgenic plants. In one aspect, this method comprises providing a
database identifying a value of an agronomic trait for at least two
distinct haplotypes of the genome for a set of germplasm. The
method further comprises transforming a parent plant with
recombinant DNA to produce at least two transgenic events wherein
the recombinant DNA is inserted into linkage with the at least two
distinct haplotypes of the genome of the parent plant. The database
may then be referenced to estimate the value of the agronomic trait
for the events linked to the distinct haplotypes, and the transgenic
event having a higher referenced breeding value may then be
selected for breeding into a germplasm.
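By way of illustration only, the database lookup described in paragraph [0007] can be sketched in Python. The sketch assumes the database reduces to a simple mapping from haplotype to an estimated trait value. The haplotype names, event names, and values are hypothetical placeholders and are not data from this application.

```python
# Hypothetical haplotype-value database: haplotype ID -> estimated trait value
# (for example, a yield effect in arbitrary units). All entries are invented.
HAPLOTYPE_VALUES = {"hapA": 1.8, "hapB": -0.4, "hapC": 0.6}

# Hypothetical transgenic events, each inserted into linkage with one haplotype.
EVENT_HAPLOTYPE = {"event1": "hapB", "event2": "hapA", "event3": "hapC"}


def select_event(event_haplotype, haplotype_values):
    """Return the event whose linked haplotype has the highest referenced value."""
    return max(event_haplotype, key=lambda ev: haplotype_values[event_haplotype[ev]])


best_event = select_event(EVENT_HAPLOTYPE, HAPLOTYPE_VALUES)
print(best_event)
```

A real breeding database would likely hold several traits per haplotype and an uncertainty for each estimate; a single scalar is used here only to keep the selection step visible.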
[0008] The present invention provides a method for improving plant
germplasm by accumulation of one or more haplotypes in a germplasm.
The method comprises inserting a transgene into a genome of a first
plant, and then determining a map location of the transgene in the
genome. The map location may be correlated to a linked haplotype,
wherein the transgene and the haplotype comprise a T-type genomic
region. The first plant may then be crossed with a second plant.
The second plant may contain at least one T-type genomic region or
haplotype that is different from the first plant T-type genomic
region. At least one progeny plant may then be selected, the
progeny plant having detectable expression of the transgene or its
phenotype and comprising in its genome the T-type genomic region of
the first plant and at least one T-type genomic region or haplotype of the
second plant. The progeny plant may be used in activities related
to germplasm improvement, which can be selected from use of the
plant for making breeding crosses, further testing of the plant,
advancement of the plant through self fertilization, use of the
plant or parts thereof for transformation, use of the plant or
parts thereof for mutagenesis, and use of the plant or parts
thereof for TILLING, or any combination of these.
[0009] The present invention includes a method for breeding of a
crop plant, in particular a soybean or corn plant with enhanced
agronomic and transgenic traits comprising a preferred T-type
genomic region. A transgene of the T-type genomic region is further
defined as conferring a preferred property like herbicide
tolerance, disease resistance, insect or pest resistance, altered
fatty acid, protein or carbohydrate metabolism, increased grain
yield, increased oil, increased nutritional content, increased
growth rates, enhanced stress tolerance, or altered morphological
characteristics, or any combination of these.
[0010] The present invention provides a novel method for mapping at
least one genomic region of insertion of a transgene. This method
involves indirect mapping and does not require the establishment of
a de novo population segregating for a transgene. The method
comprises first identifying at least a first polymorphism between
the parent lines of a mapping population in the corresponding
genomic region adjacent to a transgenic insertion event in a
transformed plant or line, then assaying the progeny plants of the
mapping population for the polymorphism. Linkage analysis may be
performed to determine a map position of the polymorphism and
thereby a map location of the transgenic insertion event. The map
location in the mapping population may then be correlated to a
haplotype of the transformed plant and its progeny.
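As a non-limiting sketch of the linkage analysis step in paragraph [0010], the following Python estimates a recombination fraction from progeny counts and converts it to a map distance. The text does not specify a mapping function; the Haldane function is assumed here purely for illustration, and the progeny counts are hypothetical.

```python
import math


def recombination_fraction(recombinants, total):
    """Fraction of scored progeny that are recombinant between two loci."""
    if total <= 0:
        raise ValueError("total must be positive")
    return recombinants / total


def haldane_cm(r):
    """Map distance in centimorgans from recombination fraction r (Haldane)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical counts: 6 recombinants among 200 progeny scored for the
# polymorphism adjacent to the insertion and a reference marker of known
# map position in the mapping population.
r = recombination_fraction(6, 200)
distance = haldane_cm(r)
print(round(r, 3), round(distance, 2))
```

Placing the polymorphism relative to two or more reference markers in this way gives its map position, and thereby an approximate position for the adjacent insertion event.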
DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
[0011] The definitions and methods provided define the present
invention and guide those of ordinary skill in the art in the
practice of the present invention. Unless otherwise noted, terms
are to be understood according to conventional usage by those of
ordinary skill in the relevant art. Definitions of common terms in
molecular biology may also be found in Rieger et al. (1991); and
Lewin (1994). The nomenclature for DNA bases as set forth at 37 CFR
§ 1.822 is used.
[0012] As used herein, the term "corn" means Zea mays or maize and
includes all plant varieties that can be bred with corn, including
wild maize species.
[0013] As used herein, the term "soybean" means Glycine max and
includes all plant varieties that can be bred with soybean,
including wild soybean species.
[0014] As used herein, the term "comprising" means "including but
not limited to".
[0015] A transgenic "event" is produced by transformation of a
plant cell with heterologous DNA, i.e., a nucleic acid construct
that includes a transgene of interest, regeneration of a population
of plants resulting from the insertion of the transgene into the
genome of the plant, and selection of a particular plant
characterized by insertion into a particular genome location. The
term "event" refers to the original transformant and progeny of the
transformant that include the heterologous DNA. The term "event"
also refers to progeny produced by a sexual outcross between the
transformant and another variety that include the heterologous
DNA.
[0016] The present invention overcomes the deficiencies of the
current transgene breeding methods by describing a T-type genomic
region, defined as a transgene and a linked haplotype genomic
region, through which the genetically linked transgene and
haplotype are selected and then introgressed into germplasm through
breeding. The selection of the T-type genomic region is based on
the estimation of a T-value that the T-type genomic region provides
to the germplasm of the crop plant. The basis of the valuation
distinguishes and selects improved T-type genomic regions for use
in a breeding method, and selects and advances plants comprising
the improved T-type genomic regions. The genomic locations for gene
integration are favorable based on providing suitable levels of
stable expression of an introduced gene, or genes, and for
identifying transgene associations with favorable haplotype regions
that also provide beneficial agronomic characteristics to the
germplasm. By considering the beneficial aspects of both the
transgene and the genomic region to which it is genetically linked,
additional value can be built into a transgenic event and its use
for developing superior germplasm. In an unexpected outcome from
extensive experience in breeding with transgenic plants, the
inventors have realized that additional consideration should be
given to the genomic region that is linked to the transgene
insertion. As a transgene is diffused by breeding methods into
plant germplasm a portion of the genetic region linked to the
transgene is also diffused. By giving consideration to the genetic
region linked to a transgene it is possible to implement
biotechnological and breeding strategies to increase the overall
value of the transgene and the genetic region to which it is linked
to enhance germplasm improvement and minimize the risk of
advancement of less favorable genetic regions, often referred to as
linkage drag.
[0017] For example, in one aspect of the present invention, T-type
genomic regions of new glyphosate tolerant soybean events have been
identified that comprise a glyphosate tolerance transgene with
suitable levels of expression in linkage with a haplotype. The
highest yielding T-type was identified as event 19788 (also
referred to as MON89788) and provided for the replacement of the
T-type genomic region of event 40-3-2 with a haplotype in the same
genomic region with improved yield as determined in a side-by-side
comparison. This finding will have significant impact on enhancing
the germplasm of glyphosate tolerant soybean. A significant portion
of recent soybean breeding has utilized lines containing the
Roundup Ready® trait found in event 40-3-2 (Padgette et al.,
1995), with possibly as much as 80-95% of the soybean germplasm
offered for sale in the United States currently containing this
transgenic event. In order to continue to enhance soybean
germplasm, it is desirable to be able to identify glyphosate
tolerant events that also have favorable haplotype genomic regions
and replace the 40-3-2 T-type genomic region in the germplasm,
therefore providing elite agronomic traits of the parental line to
the progeny.
[0018] In another aspect of the present invention, T-type genomic
regions of insect tolerant soybean events are identified that
comprise an insect resistance transgene with suitable levels of
expression in linkage with a haplotype. The event GM 19459 was
selected from a population of transgenic soybean events. These
events contain a transgene inserted into the soybean genome that
expresses a protein toxic to Lepidopteran insect pests of soybean.
The various haplotype genomic regions have been mapped to assist in
the selection of an event with the most favorable T-type genomic
region.
[0019] In another aspect of the present invention, T-type genomic
regions of insect tolerant corn events are identified that comprise
an insect resistance transgene with suitable levels of expression
in linkage with a haplotype. The insect tolerant corn event is
selected from a population of transgenic corn events. These events
contain a transgene inserted into the corn genome that expresses a
protein toxic to Lepidopteran insect pests of corn. The various
haplotype genomic regions are mapped to assist in the selection of
an event with the most favorable T-type genomic region.
[0020] Any transgene inserted into the genome of a crop plant that
can be mapped to a genomic location can then be compared to a
haplotype marker developed in that location to determine if the
location comprises a haplotype with an enhanced breeding value.
[0021] In one embodiment, the current invention provides genetic
markers and methods for the identification and breeding of T-type
genomic regions in soybean. The invention therefore allows for the
first time the creation of soybean plants that combine the value of
a transgene and an agronomically elite, or favorable haplotype.
Favorable haplotypes are at least identified as those that have
been inherited more frequently than expected in a plant population.
Using the methods of the present invention, loci comprising a
T-type genomic region may be introduced into potentially any
desired soybean plant. Molecular markers are provided that when
used in a marker assisted breeding program provide a means to
identify and maintain the association of the favorable haplotype
and the transgene to provide the valuable T-type genomic region.
The present invention provides examples of transgenes that provide
herbicide and insect resistant phenotypes to the soybean plants;
other transgenes that provide stress tolerance, disease tolerance,
enhanced protein, oil, amino acid or other feed quality, nutrition
or processing traits are also contemplated as aspects of the
present invention, and germplasm comprising these T-types would be
crossed to provide a stacked trait product with preferred T-type
genomic regions.
[0022] In another embodiment, the current invention provides
genetic markers and methods for the identification and breeding of
T-type genomic regions in corn. The invention therefore allows for
the first time the creation of corn plants that combine the value
of a transgene and an agronomically elite, or favorable haplotype.
Using the methods of the present invention, loci comprising a
T-type genomic region may be introduced into potentially any
desired corn plant. Molecular markers are provided that when used
in a marker assisted breeding program provide a means to identify
and maintain the association of the favorable haplotype and the
transgene to provide the valuable T-type genomic region. The
present invention provides examples of transgenes that provide an
insect resistant phenotype to the corn plant; other transgenes that
provide stress tolerance, herbicide tolerance, enhanced protein,
oil, amino acid or other feed quality, nutrition or processing
traits are also contemplated as aspects of the present invention,
and germplasm comprising these T-types would be crossed to provide a
stacked trait product with preferred T-type genomic regions.
[0023] T-Type Genomic Region and the Concept of T-Type Value
[0024] A T-type genomic region is a novel genetic composition
comprising at least one transgene, with suitable levels of
expression, in genetic linkage with a haplotype. In a preferred
embodiment the linkage of a transgene with a haplotype should have
no observable deleterious effect on the functional integrity of the
haplotype due to the local insertion of the transgene. Additionally
a haplotype of a T-type genomic region could be functionally
enhanced as a result of the integration into genetic linkage of a
transgene. The T-type genomic region composition has the benefit of
the transgene and the haplotype with which it is linked. The T-type
genomic region is the genetic composition through which a transgene
is diffused into germplasm by breeding.
[0025] In a preferred embodiment of the present invention, a
haplotype of a T-type genomic region comprises at least two
biallelic markers approximately 10 cM apart, or at least one
pluriallelic locus within 5 cM of the transgene and with high
polymorphic information content. Changes in a haplotype, brought
about by recombination for example, may result in the modification
of a haplotype so that it only comprises a portion of the original
(parental) haplotype physically linked to the transgene. Any such
change in a haplotype would be included in our definition of what
constitutes a T-type genomic region so long as the functional
integrity of the T-type genomic region is unchanged or improved.
The linkage of the transgene to the haplotype or functional portion
thereof that provides the desirable phenotype is preferably within
about 5 cM, or within about 2 cM, or within about 1 cM of the
haplotype region. The functional integrity of a haplotype is
considered to be unchanged if its value is not negative with
respect to yield, or is not positive with respect to maturity, or
is null with respect to maturity, or amongst the best 50 percent
with respect to an agronomic trait or a multiple trait index when
compared to any other haplotype at the same chromosome segment in a
set of germplasm (breeding germplasm, breeding population,
collection of elite inbred lines, population of random mating
individuals, biparental cross), or amongst the best 50 percent with
respect to an agronomic trait or a multiple trait index when
compared to any other haplotype across the entire genome in a set
of germplasm, or the haplotype being present with a frequency of 50
percent or more in a breeding population or a set of germplasm can
be taken as evidence of its high value, or any combination of
these.
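The alternative criteria in paragraph [0025] can be sketched as a simple predicate. This is an illustrative simplification that covers only three of the listed alternatives (non-negative yield effect, ranking in the best 50 percent at the same chromosome segment, and a population frequency of 50 percent or more). It assumes that a higher value is better, and all function and parameter names are hypothetical.

```python
def in_best_half(value, segment_values):
    """True if `value` ranks in the best 50 percent of the haplotype values
    observed at the same chromosome segment (higher is assumed better)."""
    better = sum(1 for v in segment_values if v > value)
    return better < len(segment_values) / 2


def integrity_unchanged(yield_effect, segment_values, frequency):
    """Any single criterion suffices in this sketch."""
    return (
        yield_effect >= 0                               # not negative for yield
        or in_best_half(yield_effect, segment_values)   # best 50 percent at segment
        or frequency >= 0.5                             # common in the germplasm
    )


# Hypothetical yield effects of four haplotypes at one chromosome segment.
segment = [1.8, -0.4, 0.6, 0.1]
print(integrity_unchanged(-0.4, segment, frequency=0.55))
print(integrity_unchanged(-0.4, segment, frequency=0.10))
```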
[0026] The benefit or value of the plant comprising in its genome a
T-type genomic region is estimated by a T-value, which depends on
the value of the transgene trait and the value of the haplotype to
which the transgene is linked. The value of a transgene of a T-type
genomic region can be estimated from the value of the trait that
the transgene encodes. This value depends on the transgene trait
(for example, including but not limited to: herbicide tolerance,
insect resistance, disease resistance, improved nutrition, enhanced
yield, improved processing trait, or stress tolerance) and could be
estimated from increased crop plant output, or decrease in inputs
required for crop cultivation, or any combination of these. The
transgene trait also has value as a selectable or scorable marker.
This has value in breeding applications to one skilled in the art
because the ability to select or score for the transgene trait
results in the simultaneous selection of the linked haplotype. For
example in the case of a cross made with a plant comprising a
T-type, wherein the transgene encodes a herbicide tolerance,
spraying the progeny of that cross with the herbicide would have a
high probability of selecting for the transgene and the tightly
linked parental or recombinant haplotype. DNA markers that are
developed to define the haplotype can be used to confirm the
integrity of the T-type in the progeny of the cross.
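To give a rough sense of the "high probability" referred to in paragraph [0026], the sketch below estimates how often a haplotype remains with the transgene when selection acts on the transgene trait alone. It makes two simplifying assumptions that are not stated in the text: the haplotype is treated as a single locus at a fixed map distance from the transgene, and the inverse Haldane mapping function is used to obtain the recombination fraction.

```python
import math


def recombination_fraction_from_cm(distance_cm):
    """Inverse Haldane mapping function: centimorgans -> recombination fraction."""
    if distance_cm < 0:
        raise ValueError("distance_cm must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def retention_probability(distance_cm, meioses=1):
    """Probability that the haplotype stays with the transgene through
    `meioses` successive meioses, treating the haplotype as a single locus."""
    r = recombination_fraction_from_cm(distance_cm)
    return (1.0 - r) ** meioses


# Distances matching the linkage ranges named in the text (1, 2, and 5 cM).
for d in (1.0, 2.0, 5.0):
    print(d, round(retention_probability(d), 4), round(retention_probability(d, 5), 4))
```

Under these assumptions a haplotype 5 cM from the transgene is retained in roughly 95 percent of single meioses, which is why DNA markers are still useful for confirming the integrity of the T-type after several generations.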
[0027] A transgene comprising a recombinant construct may further
comprise a selectable marker or scorable marker. The nucleic acid
sequence serving as the selectable or scorable marker functions to
produce a phenotype in cells which facilitates their identification
relative to cells not containing the marker.
[0028] Examples of selectable markers include, but are not limited
to, a neo or nptII gene (Potrykus et al., 1991), which codes for
kanamycin resistance and can be selected for using kanamycin, G418,
etc.; a bar gene which codes for bialaphos resistance; glyphosate
resistant EPSP synthase, glyphosate resistant mutant EPSP synthase
(Hinchee et al., 1988) which encodes glyphosate resistance,
glyphosate inactivating enzymes; a nitrilase gene which confers
resistance to bromoxynil (Stalker et al., 1988); a mutant
acetolactate synthase gene (ALS) which confers imidazolinone or
sulphonylurea resistance (European Patent Application No. 0154204);
and a methotrexate resistant DHFR gene (Thillet et al., 1988).
[0029] Other exemplary scorable markers include: a
β-glucuronidase or uidA gene (GUS), which encodes an enzyme
for which various chromogenic substrates are known (Jefferson,
1987; Jefferson et al., 1987); an R-locus gene, which encodes a
product that regulates the production of anthocyanin pigments (red
color) in plant tissues (Dellaporta et al., 1988); a
β-lactamase gene (Sutcliffe et al., 1978), which encodes an
enzyme for which various chromogenic substrates are known (e.g.,
PADAC, a chromogenic cephalosporin); a luciferase gene (Ow et al.,
1986); a xylE gene (Zukowsky et al., 1983) which encodes a catechol
dioxygenase that can convert chromogenic catechols; an
α-amylase gene (Ikatu et al., 1990); a tyrosinase gene (Katz
et al., 1983), which encodes an enzyme capable of oxidizing
tyrosine to DOPA and dopaquinone (which in turn condenses to
melanin); and an α-galactosidase, which will convert a
chromogenic α-galactose substrate.
[0030] Included within the terms "selectable or scorable markers"
are also genes that encode a secretable marker whose secretion can
be detected as a means of identifying or selecting for transformed
cells. Examples include markers that encode a secretable antigen
that can be identified by antibody interaction, or even secretable
enzymes which can be detected catalytically. Selectable secreted
marker proteins fall into a number of classes, including small,
diffusible proteins which are detectable (e.g., by ELISA), small
active enzymes which are detectable in extracellular solution
(e.g., α-amylase, β-lactamase, phosphinothricin
transferase), or proteins which are inserted or trapped in the cell
wall (such as proteins which include a leader sequence such as that
found in the expression unit of extensin or tobacco PR-S). Other
possible selectable marker genes will be apparent to those of skill
in the art.
[0031] A marker is preferably GUS, green fluorescent protein (GFP),
neomycin phosphotransferase II (nptII), luciferase (LUX), an
antibiotic resistance gene coding sequence, or an herbicide
resistance gene coding sequence. The selectable agent can be an
antibiotic, for example including but not limited to, kanamycin,
hygromycin, or a herbicide, for example including but not limited
to, glyphosate, glufosinate, 2,4-D, and dicamba.
[0032] The T-type genomic region has a value in marker-assisted
selection and marker-assisted breeding applications. Selection for
a transgene and a favorable haplotype in the case where they
comprise a T-type genomic region requires only one marker, whereas
at least two markers would be required if the transgene and
favorable haplotype are unlinked. This potential value would
increase as more T-type genomic regions are accumulated or stacked
together in a germplasm.
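The marker arithmetic in paragraph [0032] can be illustrated with a short sketch. The function name and the assumption of exactly two markers per unlinked pair (the text says "at least two") are for illustration only.

```python
def markers_needed(n_regions, linked):
    """Minimum markers to track n transgene/haplotype pairs.

    One marker per pair when the transgene and haplotype form a T-type
    genomic region (linked); two per pair is assumed otherwise.
    """
    return n_regions if linked else 2 * n_regions


# Markers saved by linkage as more T-type genomic regions are stacked.
for n in (1, 2, 4):
    saved = markers_needed(n, linked=False) - markers_needed(n, linked=True)
    print(n, saved)
```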
[0033] The T-value can be changed or modified by changing
expression of the transgene, wherein a change is brought about at
the level of transgene expression, or in the timing of transgene
expression, or in the localization of transgene expression, or any
combination of these. It is anticipated by this invention that the
change in T-value brought by a change in any of the components of
transgene expression could be effected through cis-acting (local)
or trans-acting (can act at a distance not simply on the DNA
molecule in which they occur) factors, or a combination of
these.
[0034] Additionally, the T-value can be changed or modified by
changing the haplotype with which the transgene is tightly linked.
A preferred embodiment of the present invention is the improvement
of the T-value by selecting or directing the transgene of an
existing T-type genomic region into tight linkage with a different
recipient haplotype, wherein the different haplotype is associated
with additional value and improved with respect to an agronomic
trait or a multiple trait index over the existing T-type haplotype
as determined in a side-by-side or head-to-head comparison. A
change in the haplotype could also be brought about by generating
or selecting for at least one recombinant T-type haplotype that is
improved with respect to an agronomic trait or a multiple trait
index over the existing T-type haplotype as determined in a
replicated side-by-side or head-to-head comparison.
[0035] Another preferred embodiment of the present invention is to
build additional value into a new or novel transgene event by
selecting or directing the transgene into linkage with a recipient
haplotype that has a breeding value that is not negative with
respect to yield, or is not positive with respect to maturity, or
is null with respect to maturity, or amongst the best 50 percent
with respect to an agronomic trait or a multiple trait index when
compared to any other haplotype at the same chromosome segment in a
set of germplasm, or amongst the best 50 percent with respect to an
agronomic trait or a multiple trait index when compared to any
other haplotype across the entire genome in a set of germplasm, or
alleles conferring agronomic fitness to a crop plant or the
haplotype being present with a frequency of 50 percent or more in a
breeding population or a set of germplasm can be taken as evidence
of its high value, or any combination of these.
[0036] Another embodiment of the present invention is a selection
of a plant or line for transformation with at least a first
transgene, wherein the selection of the plant or line is based on
it comprising in its genome a high proportion of recipient
haplotypes that have a breeding value that is not negative with
respect to yield, or is not positive with respect to maturity, or
is null with respect to maturity, or amongst the best 50 percent
with respect to an agronomic trait or a multiple trait index when
compared to any other haplotype at the same chromosome segment in a
set of germplasm, or amongst the best 50 percent with respect to an
agronomic trait or a multiple trait index when compared to any
other haplotype across the entire genome in a set of germplasm, or
alleles conferring agronomic fitness to a crop plant or the
haplotype being present with a frequency of 50 percent or more in a
breeding population or a set of germplasm can be taken as evidence
of its high value, or any combination of these.
[0037] This invention anticipates an accumulating or stacking of
T-type genomic regions into plants or lines by addition of
transgenes by transformation, or by crossing parent plants or lines
containing different T-type genomic regions, or any combination of
these. The value of the accumulated or stacked T-type genomic
regions can be estimated by a composite T-value, which depends on a
combination of the value of the transgene traits and the value of
the haplotype(s) to which the transgenes are linked. The present
invention further anticipates that the composite T-value can be
improved by modifying the components of expression of one or each
of the stacked transgenes. Additionally, the present invention
anticipates that additional value can be built into the composite
T-value by selection of at least one recipient haplotype with a
favorable breeding value to which one or any of the transgenes are
linked, or by selection of plants or lines for stacking transgenes
by transformation or by breeding or by any combination of
these.
[0038] Transgenic crops for which a method of the present invention
can be applied include, but are not limited to herbicide tolerant
crops, for example, Roundup Ready® Cotton 1445 and 88913;
Roundup Ready® corn GA21, nk603, MON802, MON809; Roundup
Ready® Sugar beet GTSB77 and H7-1; Roundup Ready® Canola
RT73 and GT200; oilseed rape ZSR500, Roundup Ready® Soybean
40-3-2, MON89788-containing soybean, Roundup Ready® Bentgrass
ASR368, HCN10, HCN28 and HCN92 canola, MS1 and RF1 canola, OXY-235
canola, PHY14, PHY35 and PHY36 canola, RM3-3, RM3-4 and RM3-6
chicory, A2704-12, A2704-21, A5547-35, A5547-127 soybean, GU262
soybean, W62 and W98 soybean, 19-51A cotton, 31807 and 31808
cotton, BXN cotton, FP967 flax, LLRICE06 and LLRICE62 rice,
MON71800 wheat, 676 and 678 and 680 corn, B16 corn, Bt11 corn,
CBH-351 corn, DAS-06275-8 corn, DBT418 corn, MS3 and MS6 corn, T14
and T25 corn, H177 corn, and TC1507 corn. Herbicides for which
transgenic plant tolerance has been demonstrated and the method of
the present invention can be applied, include but are not limited
to: glyphosate, glufosinate, sulfonylureas, imidazolinones,
bromoxynil, dalapon, dicamba, 2,4-D, cyclohexanedione,
protoporphyrinogen oxidase inhibitors, and isoxaflutole herbicides.
Polynucleotide molecules encoding proteins involved in herbicide
tolerance are known in the art, and include, but are not limited to
a polynucleotide molecule encoding
5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) described in
U.S. Pat. Nos. 5,627,061, 5,633,435, 6,040,497 and in U.S. Pat. No.
5,094,945 for glyphosate tolerance, all of which are hereby
incorporated by reference; polynucleotides encoding a glyphosate
oxidoreductase, glyphosate-N-acetyl transferase, or glyphosate
decarboxylase (GOX, U.S. Pat. No. 5,463,175; GAT, US Patent
publications 20030083480 and 20050246798; glyphosate decarboxylase,
US Patent publications 20060021093; 20060021094; 20040177399,
herein incorporated by reference in their entirety); a
polynucleotide molecule encoding bromoxynil nitrilase (Bxn)
described in U.S. Pat. No. 4,810,648 for bromoxynil tolerance,
which is hereby incorporated by reference; a polynucleotide
molecule encoding phytoene desaturase (crtI) described in Misawa et
al, (1993) and Misawa et al, (1994) for norflurazon tolerance; a
polynucleotide molecule encoding acetohydroxyacid synthase (AHAS,
aka ALS) described in Sathasivan et al. (1990) for tolerance to
sulfonylurea herbicides; and the bar gene described in DeBlock, et
al. (1987) for glufosinate and bialaphos tolerance; resistant
hydroxyphenyl pyruvate dehydrogenase (HPPD, U.S. Pat. No.
6,768,044). A promoter of a transgene of the present invention can
express genes that encode for phosphinothricin acetyltransferase,
glyphosate resistant EPSPS, aminoglycoside phosphotransferase,
hydroxyphenyl pyruvate dehydrogenase, hygromycin
phosphotransferase, neomycin phosphotransferase, dalapon
dehalogenase, bromoxynil resistant nitrilase, dicamba
mono-oxygenase, anthranilate synthase, glyphosate oxidoreductase,
glyphosate-N-acetyl transferase, or glyphosate decarboxylase.
[0039] Transgenic crops for which the method of the present
invention can be applied include, but are not limited to, insect
resistant crops, for example, cotton events, such as MON15985,
281-24-236, 3006-210-23, MON531, MON757, MON1076, and COT102; or
corn events, such as MIR604, BT176, BT11, CBH-351, DAS-06275-8,
DBT418, MON80100, MON810, MON863, TC1507, MIR152V, 3210M, and
3243M. Insect resistant transgenic crops can provide tolerance to
insect pest feeding damage and have been shown to be effective
against certain Lepidopteran and Coleopteran plant pests, and
other transgenic crops that may also provide resistance to plant
pests such as, certain members of Hemiptera, Homoptera,
Heteroptera, Orthoptera, Thysanoptera, and plant parasitic
nematodes. Disease resistant transgenic crops include, for example,
virus resistant papaya 55-1/63-1, and virus resistant squash CZW-3
and ZW20. Male sterility transgenic crops include, for example,
PHY14, PHY35 and PHY36 canola and corn events 676, 678, 680, MS3 and MS6.
Additional transgenic crop plants may also provide resistance to
fungal and bacterial organisms that cause plant disease.
[0040] The present invention contemplates the above listed
transgenic crops and germplasm comprising the T-type genomic
regions for use in breeding and stacking of T-type genomic regions,
or haplotypes identified by an indirect mapping method, or any
combination of these to increase T-type value or to enhance overall
germplasm quality as described in the methods of the present
invention.
[0041] Haplotypes
[0042] A "haplotype" is a segment of DNA in the genome of an
organism that is assumed to be identical by descent for different
individuals when the knowledge of identity by state at one or more
loci is the same in the different individuals, and when the
regional amount of linkage disequilibrium in the vicinity of that
segment on the physical or genetic map is high. A haplotype can be
tracked through populations and its statistical association with a
given trait can be analyzed. Thus, a haplotype association study
allows one to define the frequency and the type of the ancestral
carrier haplotype. An "association study" is a genetic experiment
where one tests the level of departure from randomness between the
segregation of alleles at one or more marker loci and the value of
individual phenotype for one or more traits. Association studies
can be done on quantitative or categorical traits, accounting or
not for population structure and/or stratification.
[0043] A haplotype analysis is important in that it increases the
statistical power of an analysis involving individual biallelic
markers. In a first stage of a haplotype frequency analysis, the
frequency of the possible haplotypes based on various combinations
of the identified biallelic markers of the invention is determined.
The haplotype frequency is then compared for distinct populations
and mapping population. Generally, as a result of prior germplasm
improvement, the greater the haplotype frequency in a population or
set of germplasm, the greater its value has been to the germplasm,
described as the alleles associated with agronomic fitness of a
crop plant (U.S. Pat. No. 5,437,697, herein incorporated by
reference in its entirety). A favorable haplotype can be selected
based on its frequency in a set of germplasm, generally a frequency
of 50 percent or more would indicate that the haplotype has value
in the germplasm. A haplotype that occurs at a high frequency would
be favorable for targeting with a transgene, and selection of a
T-type wherein the haplotype has a high frequency in the germplasm
would likewise be considered favorable. A haplotype occurring at any
frequency in the germplasm can be correlated to a trait and the
haplotype can be given a value based on a single trait or a
combination of traits. A favorable haplotype will provide one or
more favorable traits to a germplasm. In general, any method known
in the art to test whether a trait and a genotype show a
statistically significant correlation may be used. Methods for
determining the statistical significance of a correlation between a
phenotype and a genotype, in this case a haplotype, may be
determined by any statistical test known in the art and with any
accepted threshold of statistical significance being required. The
application of particular methods and thresholds of significance
are well within the skill of the ordinary practitioner of the
art.
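By way of illustration only, the frequency-based selection of a
favorable haplotype described above can be sketched as follows. The
sketch assumes a set of inbred lines, so that each line carries a single
haplotype at the chromosome segment considered. The function names, the
line names and the marker alleles are hypothetical and serve only as an
example; the 50 percent threshold is the one stated in the text.

```python
from collections import Counter


def haplotype_frequencies(lines):
    """Return {haplotype: frequency} for one chromosome segment.

    `lines` maps a line name to the haplotype (a tuple of marker alleles)
    carried by that inbred line at the segment.
    """
    counts = Counter(lines.values())
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


def favorable_by_frequency(freqs, threshold=0.5):
    """Return the haplotypes whose frequency is at or above the threshold."""
    return [hap for hap, f in freqs.items() if f >= threshold]


# Hypothetical fingerprints of five inbred lines at three linked markers.
segment = {
    "line1": ("A", "G", "T"),
    "line2": ("A", "G", "T"),
    "line3": ("A", "G", "T"),
    "line4": ("C", "G", "T"),
    "line5": ("A", "A", "C"),
}

freqs = haplotype_frequencies(segment)
favorable = favorable_by_frequency(freqs)
print(freqs)
print(favorable)
```

In this example only the haplotype carried by three of the five lines
reaches the threshold. A frequency criterion of this kind can be combined
with, or replaced by, a trait-based value as described above.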
[0044] In plant breeding populations, linkage disequilibrium (LD),
which is the level of departure from random association between two
or more loci in a population, often persists over large chromosomal
segments. Although it is possible for one to be concerned with the
individual effect of each gene in the segment, for a practical
plant breeding purpose, what generally matters is the average
impact the region has on the trait(s) of interest when
present in a line, hybrid or variety. The amount of pair-wise LD
(using the r² statistic) was plotted against the distance in
centiMorgans (cM, one hundredth of a Morgan; a Morgan corresponds on
average to one recombination per meiosis; recombination is the result
of the reciprocal exchange of chromatid segments between homologous
chromosomes paired at meiosis, and it is usually observed through
the association of alleles at linked loci from different
grandparents in the progeny) between the markers for a reference
germplasm set, for example, a set of 791 soybean elite US lines and
1211 SNP loci with a rare allele frequency greater than 5 percent.
A 200-data-point moving average curve indicated the
presence of LD even for loci 10 cM apart. Thus, when predicting the
average effect of chromosome segments, one should consider segments
a few centiMorgans long, and this is the meaning given to a
haplotype region, that is a chromosome segment a few centiMorgans
long that persists over multiple generations of breeding and that
is carried by one or more breeding lines. This segment can be
identified with multiple linked marker loci it contains, and the
common haplotype identity at these loci in two lines gives a high
degree of confidence of the identity by descent of the entire
subjacent chromosome segment carried by these lines.
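As a non-limiting illustration, the pair-wise LD statistic referred to
above can be computed for two biallelic loci as sketched below. The
sketch assumes phased data, such as inbred lines, in which each line
contributes one two-locus haplotype, and it uses the standard definition
of r² as D² divided by the product of the four allele frequencies. The
function name and the 0/1 allele coding are illustrative assumptions.

```python
def r_squared(haplotypes):
    """Pair-wise LD (r squared) between two biallelic loci.

    `haplotypes` is a list of (allele_at_locus_1, allele_at_locus_2)
    pairs, each allele coded 0 or 1, one pair per inbred line.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n          # freq. of allele 1, locus 1
    p_b = sum(b for _, b in haplotypes) / n          # freq. of allele 1, locus 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                                   # a monomorphic locus carries no LD
    return d * d / denom


# Complete association between the two loci.
print(r_squared([(1, 1), (1, 1), (0, 0), (0, 0)]))
# Random association between the two loci.
print(r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

Plotting such values against the map distance between the marker pairs,
and smoothing with a moving average, gives a curve of the kind described
above.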
[0045] One should specify what the favorable haplotypes are and
what their frequency in the germplasm is. Thus, one would obtain or
generate a molecular marker survey of the germplasm under
consideration for breeding and/or propagation of a transformation
event. This marker survey will generate a fingerprint of each line.
These markers are assumed to have their approximate genomic map
position known. To simplify downstream analyses, quality assurance
and missing data estimations steps may need to be implemented at
this stage to produce a complete and accurate data matrix (marker
genotype by line). Error detections and missing data estimations
may require the use of parent-offspring tests, LD between marker
loci, interval mapping, re-genotyping, etc.
[0046] Markers are then grouped based on their proximity. This
grouping may be arbitrary (e.g. "start from one end of the
chromosome and include all markers that are within 10 cM of the
first marker included in the segment, before starting the next
segment") or based on some statistical analysis (e.g. "define
segment breakpoints based on LD patterns between adjacent
loci").
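The first, arbitrary grouping rule quoted above can be sketched as
follows, by way of example only. The sketch assumes that each marker has
a known map position in centiMorgans on a single chromosome; the function
name, the marker names and the positions are hypothetical.

```python
def group_markers(markers, window_cm=10.0):
    """Group markers of one chromosome into consecutive segments.

    `markers` is a list of (name, position_cm). Starting from one end of
    the chromosome, a segment takes every marker lying within `window_cm`
    of the first marker of the segment; the next marker opens a new one.
    """
    segments = []
    start = None
    for name, pos in sorted(markers, key=lambda m: m[1]):
        if start is None or pos - start > window_cm:
            segments.append([])      # open a new segment at this marker
            start = pos
        segments[-1].append(name)
    return segments


# Hypothetical map positions on one chromosome.
markers = [("m1", 0.0), ("m2", 4.5), ("m3", 9.8),
           ("m4", 12.0), ("m5", 21.0), ("m6", 35.5)]
segments = group_markers(markers)
print(segments)
```

A grouping based on LD patterns would replace the fixed window with a
test on the LD observed between adjacent loci.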
[0047] When a large set of lines is considered, and multiple lines
have the same allele at a marker locus, it is necessary to ascertain
whether identity by state (IBS) at the marker locus is a good
predictor of identity by descent (IBD) at the chromosomal region
surrounding the marker locus. "Identity by descent" (IBD)
characterizes two loci/segments of DNA that are carried by two or
more individuals and are all derived from the same ancestor.
"Identity by state" (IBS) characterizes two loci/segments of DNA
that are carried by two or more individuals and have the same
alleles at the observable loci. A good indication that a number of
marker loci in a segment are enough to characterize IBD for the
segment is that they can predict the allele present at other marker
loci within the segment.
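One possible way to apply the indication given above is sketched below,
by way of example only. One marker of the segment is held out, the lines
are grouped by their haplotype at the remaining markers, and the most
common held-out allele in each group is used as the prediction. The
sketch assumes inbred lines; the function name and the data are
hypothetical, and the accuracy is computed on the same lines used to
form the groups, so it is optimistic for haplotypes carried by a single
line.

```python
from collections import Counter, defaultdict


def held_out_accuracy(genotypes, held_out):
    """Fraction of lines whose allele at marker `held_out` matches the
    most common allele among lines sharing the same haplotype at the
    other markers of the segment.

    `genotypes` maps a line name to a tuple of marker alleles.
    """
    groups = defaultdict(list)
    for alleles in genotypes.values():
        key = alleles[:held_out] + alleles[held_out + 1:]
        groups[key].append(alleles[held_out])
    correct = sum(Counter(g).most_common(1)[0][1] for g in groups.values())
    return correct / len(genotypes)


# Hypothetical fingerprints at three linked markers.
lines = {
    "l1": ("A", "G", "T"),
    "l2": ("A", "G", "T"),
    "l3": ("C", "A", "C"),
    "l4": ("C", "A", "C"),
    "l5": ("A", "G", "C"),
}
print(held_out_accuracy(lines, held_out=0))
print(held_out_accuracy(lines, held_out=2))
```

A high accuracy for every held-out marker suggests that identity by
state at the remaining markers is a reasonable predictor of identity by
descent for the segment.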
[0048] To estimate the frequency of a haplotype, the base reference
germplasm has to be defined (collection of elite inbred lines,
population of random mating individuals, etc.) and a representative
sample (or the entire population) has to be genotyped. The haplotype
frequency can then be determined by simple counting if considering
a set of inbred individuals. Estimation methods that employ
computing techniques like the Expectation/Maximization algorithm
will be needed if individuals genotyped are heterozygous at more
than one locus in the segment and linkage phase is unknown (Excoffier
and Slatkin, 1995). Preferably, a method based on an
expectation-maximization (EM) algorithm (Dempster et al. 1977)
leading to maximum-likelihood estimates of haplotype frequencies
under the assumption of Hardy-Weinberg proportions (random mating)
is used (Excoffier and Slatkin, 1995). With the haplotype
estimates, and the identity of each chromosome segment for each
candidate host line, it is further possible to rank lines according
to their probability of giving rise to events located in high value
haplotypes. Several probability distributions of an event to be
located in a chromosome segment could be used, according to the
degree of knowledge acquired on the physical size of each segment
and the random or pattern-following mode of insertion of a
transgene in the genome. Alternative approaches can be employed to
perform association studies: genome-wide association studies,
candidate region association studies and candidate gene association
studies. The biallelic markers of the present invention may be
incorporated in any map of genetic markers of a plant genome in
order to perform genome-wide association studies.
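For illustration only, an expectation-maximization estimate of haplotype
frequencies of the kind referred to above can be sketched for the
simplest case of two biallelic loci. The sketch assumes Hardy-Weinberg
proportions, unphased genotypes coded as the number of copies (0, 1 or
2) of one allele at each locus, and no missing data. In this case only
individuals heterozygous at both loci have an unknown linkage phase. The
function name, the coding and the example data are illustrative
assumptions and are not taken from the cited references.

```python
def em_haplotype_frequencies(genotypes, max_iter=200, tol=1e-10):
    """Estimate two-locus haplotype frequencies by EM.

    `genotypes` is a list of (g1, g2), the counts (0, 1 or 2) of the
    allele coded 1 at locus 1 and at locus 2 for each individual.
    Returns {(allele_1, allele_2): frequency}.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    fixed = {h: 0.0 for h in haps}       # haplotypes resolved unambiguously
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:          # phase unknown: 11/00 or 10/01
            n_double_het += 1
            continue
        a = [1] * g1 + [0] * (2 - g1)    # alleles at locus 1 on the two chromosomes
        b = [1] * g2 + [0] * (2 - g2)    # alleles at locus 2 on the two chromosomes
        fixed[(a[0], b[0])] += 1
        fixed[(a[1], b[1])] += 1

    total = 2 * len(genotypes)
    freqs = {h: 0.25 for h in haps}      # uninformative starting values
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is 11/00 rather than 10/01.
        cis = freqs[(1, 1)] * freqs[(0, 0)]
        trans = freqs[(1, 0)] * freqs[(0, 1)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts divided by the number of chromosomes.
        counts = dict(fixed)
        counts[(1, 1)] += w * n_double_het
        counts[(0, 0)] += w * n_double_het
        counts[(1, 0)] += (1 - w) * n_double_het
        counts[(0, 1)] += (1 - w) * n_double_het
        new = {h: counts[h] / total for h in haps}
        converged = max(abs(new[h] - freqs[h]) for h in haps) < tol
        freqs = new
        if converged:
            break
    return freqs


# Hypothetical sample: four 11/11, four 00/00 and two double heterozygotes.
sample = [(2, 2)] * 4 + [(0, 0)] * 4 + [(1, 1)] * 2
estimates = em_haplotype_frequencies(sample)
print(estimates)
```

For a set of fully inbred individuals no genotype is ambiguous and the
procedure reduces to the simple counting mentioned above.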
[0049] The present invention comprises methods to detect an
association between a haplotype and a favorable property or a
multiple trait index. A multiple trait index (MTI) is a numerical
entity that is calculated through the combination of single trait
values in a formula. Most often calculated as a linear combination
of traits or normalized derivations of traits, it can also be the
result of more sophisticated calculations (for example, use of
ratios between traits). This MTI can then be used in genetic
analysis as if it were a trait. A favorable haplotype provides a
favorable property to a parent plant and to the progeny of the
parent when selected by a marker means or phenotypic means. The
method of the present invention provides for selection of favorable
haplotypes and the accumulation of favorable haplotypes in a
breeding population, for example one or more of the haplotypes
identified in the present invention. In a particular embodiment of the
present invention, a transgene is associated with a favorable
haplotype to create a T-type that is accumulated with other
favorable haplotypes to enhance a germplasm.
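By way of example only, a multiple trait index calculated as a linear
combination of normalized traits can be sketched as follows. The trait
names, the values and the weights are hypothetical; the choice of
standardizing each trait to zero mean and unit standard deviation before
weighting is an assumption of this sketch, being one of several possible
normalizations.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return the values centered on their mean and scaled to unit
    (population) standard deviation; a constant trait maps to zeros."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def multiple_trait_index(traits, weights):
    """Linear combination of standardized traits.

    `traits` maps a trait name to a list of values (one per line, in the
    same order for every trait); `weights` maps a trait name to its weight.
    """
    z = {name: standardize(vals) for name, vals in traits.items()}
    n = len(next(iter(traits.values())))
    return [sum(weights[name] * z[name][i] for name in traits)
            for i in range(n)]


# Hypothetical data for three lines: higher yield is rewarded and
# later maturity is penalized.
traits = {"yield": [50.0, 55.0, 60.0], "maturity": [110.0, 120.0, 115.0]}
weights = {"yield": 1.0, "maturity": -0.5}
mti = multiple_trait_index(traits, weights)
print(mti)
```

The resulting index values can then be analyzed in an association study
in the same way as a single trait.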
[0050] Accumulation of T-Type Genomic Regions and Favorable
Haplotypes
[0051] Another embodiment of this invention is a method for
enhancing accumulation of one or more haplotypes in a germplasm.
The transformation of a plant cell with a transgene means that the
transgene DNA has been inserted into a genomic DNA region of the
plant. Genomic regions defined as haplotype regions include genetic
information and provide phenotypic traits to the plant. Variations
in the genetic information result in variation of the phenotypic
trait and the value of the phenotype can be measured. The genetic
mapping of the haplotype regions and genetic mapping of a transgene
insertion event allows for a determination of linkage of a
transgene insertion with a haplotype. Any transgene that has a DNA
sequence that is novel in the genome of a transformed plant can in
itself serve as a genetic marker of the transgene and the genomic
region in which it has inserted. For example, in the present
invention, a transgene that was inserted into the genome of a
soybean plant provides for the expression of a glyphosate resistant
5-enolpyruvylshikimate-3-phosphate synthase that has a DNA coding
sequence comprised within SEQ ID NO:28 disclosed in U.S. Pat. No.
6,660,911 and SEQ ID NO:9 disclosed in U.S. Pat. No. 5,633,435,
both herein incorporated by reference, from which a DNA primer or
probe molecule can be selected to function as a genetic marker for
the transgene in the genome.
[0052] Additionally, a transgene may provide a means to select for
plants that have the insert and the linked haplotype region.
Selection may be due to tolerance to an applied phytotoxic chemical
such as a herbicide or antibiotic. Selection may be due to
detection of a product of a transgene, for example, an mRNA or
protein product. Selection may be conducted by detection of the
transgene DNA inserted into the genome of the plant. A transgene
may also provide a phenotypic selection means, such as, a
morphological phenotype that is easy to observe; this could be a
seed color, seed germination characteristic, seedling growth
characteristic, leaf appearance, plant architecture, plant height,
and flower and fruit morphology, or selection based on an agronomic
phenotype, such as, yield, herbicide tolerance, disease tolerance,
insect tolerance, enhanced feed quality, drought tolerance, cold
tolerance, or any other agronomic trait provided by a
transgene.
[0053] During the development of improved crop plants by insertion
of transgenes, often hundreds of plants are produced with
different transgene insertion locations. These insertion events
occur throughout the genome of the plant and are incorporated into
tight linkage with many different haplotype regions. The present
invention provides for the screening of transgenic events that have
a transgene insertion into tight linkage with favorable haplotype
regions and selection of these events for use in a breeding program
to enhance the accumulation of favorable haplotype regions. The
method includes: a) inserting a transgene into a genome of a plant
cell and regenerating the plant cell into an intact transformed
plant using plant transformation and regeneration methods
previously described and known in the art of plant biotechnology;
and b) determining a map location of the transgene in the genome of
the transformed plant using DNA markers of the transgene and linked
genomic regions; and c) correlating the map location to a tightly
linked haplotype, wherein the transgene and the haplotype comprise
a T-type genomic region in the transformed plant; and d) crossing
the transformed plant with a second plant that may also be
transformed to contain at least one T-type genomic region that is
different from the first transformed plant T-type genomic region or
the second plant may contain a favorable haplotype region
identified by genetic markers that is different from the first
transformed plant; and e) selecting at least one progeny plant by
detecting expression of the transgene of the first plant or
selecting by the presence of a marker associated with the
transgene, wherein the progeny plant comprises in its genome at
least a portion of the T-type genomic region of the first plant and
at least one T-type genomic region or favorable haplotype of the
second plant; and f) using the progeny plant in activities related
to germplasm improvement, the activities selected from the group
consisting of using the plant for making breeding crosses, further
testing of the plant, advancement of the plant through self
fertilization, use of the plant or parts thereof for
transformation, use of the plant or parts thereof for mutagenesis,
and use of the plant or parts thereof for TILLING (e.g. McCallum et
al., 2000).
[0054] Using this method, the present invention contemplates that
preferred T-type genomic regions are selected from a large
population of T-type genomic regions, and the preferred T-type
genomic regions have an enhanced T-value in the germplasm of a crop
plant. Additionally, the preferred T-type genomic region can be
used in the described breeding method to accumulate other
beneficial T-type genomic regions and favorable haplotype regions
and maintain these in a breeding population to enhance the overall
germplasm of the crop plant. Crop plants considered for use in the
method include but are not limited to, corn, soybean, cotton,
wheat, rice, canola, oilseed rape, sugar beet, sorghum, millet,
alfalfa, vegetable crops, forest trees, and fruit crops.
[0055] Genome Mapping of a T-Type Genomic Region
[0056] Another embodiment of this invention is a method for
indirect mapping of at least one T-type genomic region. Mapping of the
T-type genomic region in the genome of a plant provides for
selection of favorable haplotype regions that comprise the T-type
genomic region. The present invention provides a method for mapping
of the transgene insertion event and its association with a genomic
region and location on a genome map of a plant. The method may
include the following steps:
[0057] (a) Obtaining the DNA sequence of the genome flanking the
transgene insertion event;
[0058] (b) Comparing the DNA sequence chromatogram to eliminate
paralogous sequences when two or more sequences of high homology are
obtained;
[0059] (c) Searching for the DNA sequence in a sequence database to
verify whether the insertion event has interrupted an endogenous
gene;
[0060] (d) Designing one or a plurality of pairs of DNA primer
molecules on either or both the 5' and 3' genomic regions flanking
the transgene insertion. When multiple pairs of primers are designed,
it can be done in such a way as to obtain overlapping PCR products
from each genomic flanking region to ensure substantial coverage of
the associated genomic DNA;
[0061] (e) Using the parent lines of a mapping population(s) as
template for PCR;
[0062] (f) Sequencing the PCR products obtained from these
primers/line combinations;
[0063] (g) Identifying SNPs, or other polymorphic features such as
indels or SSRs, between the parents of at least one of the mapping
populations;
[0064] (h) Repeating steps (d) through (g) on additional flanking
sequence, sliding away from the site of insertion in the 5' and 3'
directions, until polymorphic sites are found, or to obtain
additional ones;
[0065] (i) Designing an assay to score the progeny plants of the
mapping population(s);
[0066] (j) Performing a linkage analysis to ascertain the map
position of these polymorphisms and consequently the location of
the event;
[0067] (k) Correlating the map position with the location of a
haplotype region.
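As a simplified, non-limiting illustration of the linkage analysis of
step (j), the recombination fraction between a polymorphism found in the
flanking sequence and each marker of an existing map can be estimated,
and the marker with the fewest recombinants taken as the closest. The
sketch assumes a mapping population in which each progeny plant is
scored as carrying the allele of parent "A" or parent "B" at every locus
and in which a mismatch between two loci reflects a recombination, as in
a backcross or doubled haploid population. The Haldane mapping function
is used to convert recombination fractions to centiMorgans. All names,
positions and scores are hypothetical.

```python
import math


def recombination_fraction(scores_a, scores_b):
    """Proportion of progeny with different parental alleles at two loci.
    Progeny with a missing score (None) at either locus are skipped."""
    pairs = [(a, b) for a, b in zip(scores_a, scores_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no progeny scored at both loci")
    recombinants = sum(1 for a, b in pairs if a != b)
    return recombinants / len(pairs)


def haldane_cm(r):
    """Map distance in centiMorgans from a recombination fraction,
    using the Haldane mapping function."""
    if r >= 0.5:
        return math.inf                  # unlinked
    return -50.0 * math.log(1.0 - 2.0 * r)


def closest_marker(event_scores, framework):
    """Return (marker_name, recombination_fraction) for the framework
    marker showing the fewest recombinants with the event polymorphism.

    `framework` maps a marker name to (map_position_cm, scores).
    """
    best = min(framework,
               key=lambda m: recombination_fraction(event_scores, framework[m][1]))
    return best, recombination_fraction(event_scores, framework[best][1])


# Hypothetical scores for ten progeny plants.
event = ["A", "A", "B", "B", "A", "B", "A", "B", "A", "B"]
framework = {
    "m1": (42.0, ["A", "A", "B", "B", "A", "B", "A", "B", "A", "A"]),
    "m2": (85.0, ["B", "A", "B", "A", "A", "B", "B", "B", "A", "A"]),
}
name, r = closest_marker(event, framework)
print(name, r, haldane_cm(r))
```

The map position obtained for the polymorphism, and therefore for the
event, can then be compared with the boundaries of the haplotype regions
as in step (k). Dedicated linkage mapping software would normally be
used for a full multipoint analysis.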
[0068] The genome flanking the transgene insertion event can
comprise a DNA segment of from a few hundred to tens of thousands
of nucleotide base pairs or a sufficient length to identify a
polymorphism. The genomic flanking region can be from the 5' or 3'
end of the transgene insert location extending into the genome from
the insert site. The "polymerase chain reaction" (PCR) is a process
of in vitro geometrical amplification of a target DNA segment
through the use of a heat-resistant DNA polymerase and cyclic
variation of temperature to allow for repetitive denaturing, primer
annealing and amplification of template DNA. "Paralogous sequences"
are two sequences of DNA with a high degree of similarity that
belong to different loci on the genome. A "mapping population" is a
set of individuals where alleles at marker loci and possibly at one
or a plurality of Quantitative Trait Loci (QTL) are segregating, in
a way that presence of linkage disequilibrium can be taken as
evidence of proximity on the chromosome and there is a positive
correlation between proximity and disequilibrium. The mapping
population is the same plant species or a plant species
demonstrating synteny or colinearity. These populations can be used
to estimate the relative positions of marker loci among themselves
or between these and QTLs. Generally mapping populations are
segregating populations. The method can be applied to any crop
species; particularly important crop species are, for example, corn,
soybean, cotton, wheat, rice, canola, oilseed rape, sugar beet,
sorghum, millet, alfalfa, vegetable crops, forest trees, and fruit
crops. There are maps available to one skilled in the art for one
or more of these crops, by way of example, genetic maps are
referenced for maize (Lee et al., 2002), soybean (Ferreira et al.,
2000), cotton (Lacape et al., 2003), and canola (Cheung et al.,
1997). De novo mapping populations can also be generated for any
crop of interest and a genetic map created that is useful in the
present invention to map the haplotype regions in which a transgene
has inserted.
[0069] Cloned genomic DNA regions, for example,
those contained in a BAC library, can be probed with DNA markers
developed to identify the haplotype linked with a transgenic
insertion. Additional DNA markers can be developed by sequencing
the BAC clones and inspecting for polymorphisms in the sequence.
Genes of interest can be isolated from the BAC clones that can be
used as transgenes to improve the performance of the same crop
species or different crop species.
Recombinant Vectors and Transgenes
[0070] Means for preparing recombinant vectors are well known in
the art. Methods for making recombinant vectors particularly suited
to plant transformation include, without limitation, those
described in U.S. Pat. Nos. 4,971,908, 4,940,835, 4,769,061 and
4,757,011. These types of vectors have also been reviewed (Rodriguez
et al., 1988; Glick et al., 1993).
[0071] Typical vectors useful for expression of nucleic acids in
higher plants are well known in the art and include vectors derived
from the tumor-inducing (Ti) plasmid of Agrobacterium tumefaciens
(Rogers et al., 1987). Other recombinant vectors useful for plant
transformation, including the pCaMVCN transfer control vector, have
also been described (Fromm et al., 1985). Many crop species have
been transformed to contain one or more transgenes of agronomic
importance that in themselves provide a favorable property to the
plant. One example is a transgene that confers herbicide tolerance
to the crop plant. Transgenes that encode herbicide tolerance
proteins that have been transformed and expressed in plants
include, for example, a 5-enolpyruvylshikimate-3-phosphate synthase
(EPSPS) protein conferring glyphosate resistance and proteins
conferring resistance to other herbicides, such as glufosinate or
bromoxynil (Comai et al., 1985; Gordon-Kamm et al., 1990; Stalker
et al., 1988; Eichholtz et al., 1987; Shah et al., 1986; Charest et
al., 1990). Further examples include the expression of enzymes such
as dihydrofolate reductase and acetolactate synthase, mutant ALS
and AHAS enzymes that confer resistance to imidazolinone or
sulfonylurea herbicides (Lee et al., 1988 and Miki et al., 1990), a
phosphinothricin-acetyl-transferase conferring phosphinothricin
resistance (European application No. 0 242 246), proteins
conferring resistance to phenoxy propionic acids and
cyclohexones, such as sethoxydim and haloxyfop (Marshall et al.,
1992); and proteins conferring resistance to triazine (psbA and gs+
genes) and benzonitrile (nitrilase encoding gene; Przibila et al.,
1991).
[0072] A plant of the present invention may also comprise a
transgene that confers resistance to insect, pest, viral, or
bacterial attack. For example, a transgene conferring resistance to
a pest, such as soybean cyst nematode was described in PCT
Application WO96/30517 and PCT Application WO93/19181. Jones et al.
(1994) describe cloning of the tomato Cf-9 gene for resistance to
Cladosporium fulvum; Martin et al. (1993) describe a tomato Pto
gene for resistance to Pseudomonas syringae pv. tomato; and Mindrinos
et al. (1994) describe an Arabidopsis RPS2 gene for resistance to
Pseudomonas syringae. Bacillus thuringiensis endotoxins may also be
used for insect resistance, for example, Geiser et al. (1986).
[0073] The expression of viral coat proteins as transgenes in
transformed plant cells is known to impart resistance to viral
infection and/or disease development affected by the virus from
which the coat protein gene is derived, as well as by related
viruses (Beachy et al., 1990).
[0074] Transgenes conferring increased nutritional value or another
value-added trait may also be used. One example is modified fatty
acid metabolism, for example, by transforming a plant with an
antisense gene of stearoyl-ACP desaturase to increase stearic acid
content of the plant (Knutzon et al., 1992). A sense desaturase
gene may also be introduced to alter fatty acid content. Phytate
content may be modified by introduction of a phytase-encoding gene
to enhance breakdown of phytate, adding more free phosphate to the
transformed plant. Modified carbohydrate composition may also be
effected, for example, by transforming plants with a gene coding
for an enzyme that alters the branching pattern of starch (Shiroza
et al. (1988), nucleotide sequence of Streptococcus mutans
fructosyltransferase gene; Steinmetz et al. (1985), nucleotide
sequence of Bacillus subtilis levansucrase gene; Pen et al.
(1992), production of transgenic plants that express Bacillus
licheniformis α-amylase; Elliot et al. (1993), nucleotide
sequences of tomato invertase genes; Sogaard et al. (1993),
site-directed mutagenesis of barley α-amylase gene; and
Fisher et al. (1993), maize endosperm starch branching enzyme
II).
[0075] Transgenes may also be used to alter protein metabolism. For
example, U.S. Pat. No. 5,545,545 describes lysine-insensitive maize
dihydrodipicolinic acid synthase (DHPS), which is substantially
resistant to concentrations of L-lysine which otherwise inhibit the
activity of native DHPS. Similarly, EP 0640141 describes sequences
encoding lysine-insensitive aspartokinase (AK) capable of causing a
higher than normal production of threonine, as well as a
subfragment encoding antisense lysine ketoglutarate reductase for
increasing lysine.
[0076] A transgene may be employed that alters plant carbohydrate
metabolism. For example, fructokinase genes are known for use in
metabolic engineering of fructokinase gene expression in transgenic
plants and their fruit (U.S. Pat. No. 6,031,154). Further examples
of transgenes that may be used are genes that alter grain yield.
For example, U.S. Pat. No. 6,486,383 describes modification of
starch content in plants with subunit proteins of adenosine
diphosphoglucose pyrophosphorylase ("ADPG PPase"). In EP0797673,
transgenic plants are discussed in which the introduction and
expression of particular DNA molecules results in the formation of
easily mobilized phosphate pools outside the vacuole and an
enhanced biomass production and/or altered flowering behavior.
Still further known are genes for altering plant maturity. U.S.
Pat. No. 6,774,284 describes DNA encoding a plant lipase and
methods of use thereof for controlling senescence in plants. U.S.
Pat. No. 6,140,085 provides FCA genes for altering flowering
characteristics, particularly timing of flowering. U.S. Pat. No.
5,637,785 discusses genetically modified plants having modulated
flower development such as having early floral meristem development
and comprising a structural gene encoding the LEAFY protein in its
genome.
[0077] Genes for altering plant morphological characteristics are
also known and may be used in accordance with the invention. U.S.
Pat. No. 6,184,440 discusses genetically engineered plants which
display altered structure or morphology as a result of expressing a
cell wall modulation transgene. Examples of cell wall modulation
transgenes include a cellulose binding domain, a cellulose binding
protein, or a cell wall modifying protein or enzyme such as
endoxyloglucan transferase, xyloglucan endo-transglycosylase, an
expansin, cellulose synthase, or a novel isolated
endo-1,4-.beta.-glucanase.
[0078] A transgene that provides a favorable property can be
associated with plant morphology, physiology, growth and
development, yield, nutritional enhancement, disease or pest
resistance, or environmental or chemical tolerance. Transgenes
that provide a beneficial agronomic trait to crop plants include,
but are not limited to, genetic elements providing the following:
herbicide resistance (U.S. Pat. Nos.
5,633,435 and 5,463,175), increased yield (U.S. Pat. No.
5,716,837), insect control (U.S. Pat. Nos. 6,063,597; 6,063,756;
6,093,695; 5,942,664; and 6,110,464), fungal disease resistance
(U.S. Pat. Nos. 5,516,671; 5,773,696; 6,121,436; 6,316,407, and
6,506,962), virus resistance (U.S. Pat. Nos. 5,304,730 and
6,013,864), nematode resistance (U.S. Pat. No. 6,228,992),
bacterial disease resistance (U.S. Pat. No. 5,516,671), starch
production (U.S. Pat. Nos. 5,750,876 and 6,476,295), modified oils
production (U.S. Pat. No. 6,444,876), high oil production (U.S.
Pat. Nos. 5,608,149 and 6,476,295), modified fatty acid content
(U.S. Pat. No. 6,537,750), high protein production (U.S. Pat. No.
6,380,466), fruit ripening (U.S. Pat. No. 5,512,466), enhanced
animal and human nutrition (U.S. Pat. Nos. 5,985,605 and
6,171,640), biopolymers (U.S. Pat. No. 5,958,745 and U.S. Patent
Publication US20030028917), environmental stress resistance (U.S.
Pat. No. 6,072,103), pharmaceutical peptides (U.S. Pat. No.
6,080,560), improved processing traits (U.S. Pat. No. 6,476,295),
improved digestibility (U.S. Pat. No. 6,531,648), low raffinose
(U.S. Pat. No. 6,166,292), industrial enzyme production (U.S. Pat.
No. 5,543,576), improved flavor (U.S. Pat. No. 6,011,199), nitrogen
fixation (U.S. Pat. No. 5,229,114), hybrid seed production (U.S.
Pat. No. 5,689,041), and biofuel production (U.S. Pat. No.
5,998,700). The genetic elements, methods, and transgenes described
in the patents listed above are hereby incorporated by
reference.
[0079] Alternatively, a transcribable polynucleotide molecule can
effect the above mentioned plant characteristic or phenotype by
encoding an RNA molecule that causes the targeted inhibition of
expression of an endogenous gene, for example via antisense,
inhibitory RNA (RNAi), or cosuppression-mediated mechanisms. The
RNA could also be a catalytic RNA molecule (i.e., a ribozyme)
engineered to cleave a desired endogenous mRNA product. Certain RNA
molecules can also be expressed in plant cells that inhibit targets
in organisms other than plants, for example, insects that feed on
the plant cells and ingest the inhibitory RNA, or nematodes that
feed on plant cells and ingest the inhibitory RNA. Thus, any
transcribable polynucleotide molecule that encodes a transcribed
RNA molecule that affects a phenotype or morphology change of
interest may be useful for the practice of the present
invention.
[0080] Breeding and Markers
[0081] Breeding techniques take advantage of a plant's method of
pollination. There are two general methods of pollination:
self-pollination, which occurs if pollen from one flower is
transferred to the same or another flower of the same plant, and
cross-pollination, which occurs if pollen is transferred to a
flower from a flower on a different plant. Plants that have been
self-pollinated and selected for type over many generations become
homozygous at almost all gene loci and produce a uniform population
of true-breeding, homozygous progeny.
[0082] In development of suitable varieties, pedigree breeding may
be used. The pedigree breeding method for specific traits involves
crossing two genotypes. Each genotype can have one or more
desirable characteristics lacking in the other; or, each genotype
can complement the other. If the two original parental genotypes do
not provide all of the desired characteristics, other genotypes can
be included in the breeding population. Superior plants that are
the products of these crosses are selfed and are again advanced in
each successive generation. Each succeeding generation becomes more
homogeneous as a result of self-pollination and selection.
Typically, this method of breeding involves five or more
generations of selfing and selection: S.sub.1.fwdarw.S.sub.2;
S.sub.2.fwdarw.S.sub.3; S.sub.3.fwdarw.S.sub.4;
S.sub.4.fwdarw.S.sub.5, etc. A selfed generation (S) may be
considered to be a type of filial generation (F) and may be named F
as such. After at least five generations, the inbred plant is
considered genetically pure.
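By way of illustration only, the effect of repeated selfing
described above can be sketched numerically. The sketch assumes the
standard Mendelian expectation that each generation of selfing
halves the fraction of heterozygous loci, with no selection and
independent loci; the function name and the starting value are
hypothetical and are not part of the disclosed method.

```python
def expected_heterozygosity(selfing_generations: int, initial: float = 1.0) -> float:
    """Expected fraction of loci still heterozygous after repeated selfing.

    Assumption: each selfing generation halves heterozygosity, starting
    from a plant heterozygous at the fraction `initial` of the loci.
    """
    if selfing_generations < 0:
        raise ValueError("selfing_generations must be non-negative")
    return initial * 0.5 ** selfing_generations


if __name__ == "__main__":
    # S1 is the first selfed generation derived from the heterozygous plant.
    for s in range(1, 7):
        het = expected_heterozygosity(s)
        print(f"S{s}: heterozygous {het:.4f}, homozygous {1 - het:.4f}")
```

Under these assumptions roughly 97 percent of the initially
heterozygous loci are expected to be homozygous after five
generations of selfing, which is consistent with treating such a
line as genetically pure.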
[0083] Each breeding program should include a periodic, objective
evaluation of the efficiency of the breeding procedure. Evaluation
criteria vary depending on the goal and objectives. Promising
advanced breeding lines are thoroughly tested and compared to
appropriate standards in environments representative of the
commercial target area(s) for generally three or more years.
Identification of individuals that are genetically superior is
difficult because genotypic value can be masked by confounding
plant traits or
environmental factors. One method of identifying a superior plant
is to observe its performance relative to other experimental plants
and to one or more widely grown standard varieties. Single
observations can be inconclusive, while replicated observations
provide a better estimate of genetic worth.
[0084] Mass and recurrent selections can be used to improve
populations of either self- or cross-pollinating crops. A
genetically variable population of heterozygous individuals is
either identified or created by intercrossing several different
parents. The best plants are selected based on individual
superiority, outstanding progeny, or excellent combining ability.
The selected plants are intercrossed to produce a new population in
which further cycles of selection are continued. Descriptions of
other breeding methods that are commonly used for different traits
and crops can be found in one of several reference books (Allard,
1960; Simmonds, 1979; Sneep and Hendriksen, 1979; Fehr, 1987a;
Fehr, 1987b).
[0085] The effectiveness of selecting for genotypes with enhanced
traits of interest (for example, a favorable property such as yield
of a harvested plant product, for example yield of a grain, seed,
fruit, fiber, forage; or an agronomic trait, for example, pest
resistance such as disease resistance, insect resistance, nematode
resistance, or improved growth rate, and stress tolerance; or an
improved processed product of the plant, for example, fatty acid
profile, amino acid profile, nutritional content, fiber quality) in
a breeding program will depend upon: 1) the extent to which the
variability in the traits of interest of individual plants in a
population is the result of genetic factors and is thus transmitted
to the progenies of the selected genotypes; and 2) how much the
variability in the traits of interest among the plants is due to
the environment in which the different genotypes are growing. The
inheritance of traits ranges from control by one major gene whose
expression is not influenced by the environment (i.e., qualitative
characters) to control by many genes whose effects are greatly
influenced by the environment (i.e., quantitative characters).
Breeding for quantitative traits such as yield is further
characterized by the fact that: 1) the differences resulting from
the effect of each gene are small, making it difficult or
impossible to identify them individually; 2) the number of genes
contributing to a character is large, so that distinct segregation
ratios are seldom, if ever, obtained; and 3) the effects of the
genes may be expressed in different ways based on environmental
variation. Therefore, the accurate identification of transgressive
segregants or superior genotypes with the traits of interest is
extremely difficult and its success is dependent on the plant
breeder's ability to minimize the environmental variation affecting
the expression of the quantitative character in the population.
[0086] The likelihood of identifying a transgressive segregant is
greatly reduced as the number of traits combined into one genotype
is increased. Consequently, all the breeder can generally hope for
is to obtain a favorable assortment of genes for the first complex
character combined with a favorable assortment of genes for the
second character into one genotype in addition to a selected
gene.
[0087] Introgression of a particular genomic region, in a set of
genomic regions that contain a transgene or transgenes, into a
plant germplasm is defined as the result of the process of
backcross conversion. A plant germplasm into which a novel DNA
sequence has been introgressed may be referred to as a backcross
converted genotype, line, inbred, or hybrid. Additionally, an
introgression of a particular genomic region or transgene may be
conducted by a forward breeding process. Similarly, a plant genotype
lacking the desired DNA sequence may be referred to as an
unconverted genotype, line, inbred, or hybrid. During breeding, the
genetic markers linked to a T-type genomic region may be used to
assist in breeding for the purpose of producing soybean plants with
increased yield and a transgenic trait. Backcrossing and
marker-assisted selection, or forward breeding and marker-assisted
selection in particular can be used with the present invention to
introduce the T-type genomic region into any variety by conversion
of that variety.
[0088] In another embodiment of this invention, marker sequences
are provided that are genetically linked and can be used to follow
the selection of the soybean or corn haplotypes. Genomic libraries
from multiple corn or soybean lines are made by isolating genomic
DNA from different corn or soybean lines with Plant DNAzol Reagent
from Life Technologies, now Invitrogen (Invitrogen Life
Technologies, Carlsbad, Calif.). Genomic DNA is digested with PstI
restriction endonuclease, size-fractionated over a 1 percent
agarose gel, and ligated into a plasmid vector for sequencing by
standard molecular biology techniques as described in Sambrook et
al. These libraries are sequenced by standard procedures on an ABI
Prism.RTM. 377 DNA Sequencer using commercially available reagents
(Applied Biosystems, Foster City, Calif.). All sequences are
assembled to identify non-redundant sequences with the Pangea
Clustering and Alignment Tools available from DoubleTwist Inc.,
Oakland, Calif. Sequences from multiple corn or soybean lines are
assembled into loci having one or more polymorphisms, such as SNPs
and/or Indels. Candidate polymorphisms are qualified by the
following parameters:
[0089] (a) The minimum length of a contig or singleton for a
consensus alignment is 200 bases.
[0090] (b) The percentage identity of observed bases in a region of
15 bases on each side of a candidate SNP is at least 75 percent.
[0091] (c) The minimum Phred quality in each contig at a
polymorphism site is 35.
[0092] (d) The minimum Phred quality in a region of 15 bases on each
side of the polymorphism site is 20.
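By way of illustration only, parameters (a) through (d) above can
be expressed as a simple qualification filter. The sketch below is
not the Pangea tool set or any other software named in this
disclosure; the record layout, field names, and example values are
hypothetical, and the flank statistics are assumed to have been
computed beforehand over the 15 bases on each side of the site.

```python
from dataclasses import dataclass
from typing import Sequence

# Thresholds taken from parameters (a) through (d).
MIN_CONSENSUS_LENGTH = 200      # (a) bases
MIN_FLANK_IDENTITY_PCT = 75.0   # (b) percent identity in the 15-base flanks
MIN_SITE_PHRED = 35             # (c) Phred quality at the site, per contig
MIN_FLANK_PHRED = 20            # (d) Phred quality in the 15-base flanks


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary record for one candidate polymorphism."""
    consensus_length: int
    flank_identity_pct: float
    site_phred_by_contig: Sequence[int]
    min_flank_phred: int


def qualifies(c: Candidate) -> bool:
    """Return True only if the candidate passes all four parameters."""
    return (
        c.consensus_length >= MIN_CONSENSUS_LENGTH
        and c.flank_identity_pct >= MIN_FLANK_IDENTITY_PCT
        and len(c.site_phred_by_contig) > 0
        and all(q >= MIN_SITE_PHRED for q in c.site_phred_by_contig)
        and c.min_flank_phred >= MIN_FLANK_PHRED
    )


if __name__ == "__main__":
    good = Candidate(412, 96.7, (40, 38), 27)
    too_short = Candidate(150, 96.7, (40, 38), 27)
    print(qualifies(good), qualifies(too_short))  # prints "True False"
```

Parameter (c) is read here as requiring every contributing contig
to reach the minimum quality at the site; that reading is an
assumption.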
[0093] Read data from automated sequencers varies significantly in
quality due to the nature of nucleotides in a polynucleotide
molecule and a number of other reasons (Ewing et al., 1998). Many
algorithms were developed to address the issue of accurate base
calling (Giddings et al., 1993; Berno, 1996; Lawrence and
Solovyev, 1994). The most widely used algorithm calculates the
quality of the sequence as "q" in the equation
q=-10.times.log.sub.10(p), where p is the estimated error
probability of that base call (Ewing and Green, 1998). Thus a base
call having a probability of 1/1000 of being incorrect in a
particular sequence is assigned a quality score of 30. Quality
scores are also referred to as "Phred Scores".
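By way of illustration only, the quality equation above and its
inverse can be computed as follows. The function names are
hypothetical; the arithmetic follows directly from
q=-10.times.log.sub.10(p).

```python
import math


def phred_quality(error_probability: float) -> float:
    """Quality score q = -10 * log10(p) for an estimated error probability p."""
    if not 0.0 < error_probability <= 1.0:
        raise ValueError("error_probability must be in (0, 1]")
    return -10.0 * math.log10(error_probability)


def error_probability(quality: float) -> float:
    """Inverse relation: p = 10 ** (-q / 10)."""
    return 10.0 ** (-quality / 10.0)


if __name__ == "__main__":
    # A 1-in-1000 error probability corresponds to a quality score of 30.
    print(f"{phred_quality(1 / 1000):.1f}")
    # Error probabilities implied by the thresholds of 35 and 20.
    for q in (35, 20):
        print(f"q={q}: p={error_probability(q):.6f}")
```

Under this relation the thresholds of 35 and 20 used to qualify
candidate polymorphisms correspond to estimated error probabilities
of about 0.0003 and 0.01, respectively.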
[0094] Selection of Plants Using Marker-Assisted Selection
[0095] A primary motivation for development of molecular markers in
crop species is the potential for increased efficiency in plant
breeding through marker-assisted selection (MAS). Genetic marker
alleles (an "allele" is an alternative sequence at a locus) are
used to identify plants that contain a desired genotype at multiple
loci, and that are expected to transfer the desired genotype, along
with a desired phenotype to their progeny. Genetic marker alleles
can be used to identify plants that contain a desired genotype at one
marker locus, several loci, or a haplotype, and that would be
expected to transfer the desired genotype, along with a desired
phenotype to their progeny.
[0096] Marker-assisted selection comprises the mapping of
phenotypic traits and relies on the ability to detect genetic
differences between individuals. A "genetic map" is the
representation of the relative positions of characterized loci (DNA
markers or any other loci for which alleles can be identified)
along the chromosomes. The measure of distance is relative to the
frequency of crossover events between non-sister chromatids of
homologous chromosomes at meiosis. The genetic differences, or
"genetic markers," are then correlated
with phenotypic variations using statistical methods. In a
preferred case, a single gene encoding a protein responsible for a
phenotypic trait is detectable directly by a mutation which results
in the variation in phenotype. More commonly, multiple genetic loci
each contribute to the observed phenotype.
[0097] The presence and/or absence of a particular genetic marker
allele in the genome of a plant exhibiting a favorable phenotypic
trait is determined by any method listed above using markers.
Examples of DNA markers are Restriction Fragment Length
Polymorphisms (RFLP), Amplified Fragment Length Polymorphisms
(AFLP), Simple Sequence Repeats (SSR), Single Nucleotide
Polymorphisms (SNP), Insertion/Deletion Polymorphisms (Indels),
Variable Number Tandem Repeats (VNTR), Random Amplified
Polymorphic DNA (RAPD), and others known to those skilled in the
art. If the nucleic acids from
the plant are positive for a desired genetic marker, the plant can
be selfed to create a true breeding line with the same genotype, or
it can be crossed with a plant with the same marker or with other
desired characteristics to create a sexually crossed hybrid
generation. Methods of marker-assisted selection (MAS) using a
variety of genetic markers are provided. Plants selected by MAS
using the methods are provided.
[0098] Marker-assisted introgression involves the transfer of a
chromosome region defined by one or more markers from one germplasm
to a second germplasm. The initial step in that process is the
localization of the genomic region or transgene by gene mapping,
which is the process of determining the position of a gene or
genomic region relative to other genes and genetic markers through
linkage analysis. The basic principle for linkage mapping is that
the closer together two genes are on a chromosome, the more
likely they are to be inherited together. Briefly, a cross is
generally made between two genetically compatible but divergent
parents relative to traits under study. Genetic markers can then be
used to follow the segregation of traits under study in the progeny
from the cross, often a backcross (BC1), F.sub.2, or recombinant
inbred population.
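By way of illustration only, the linkage principle above can be
sketched as an estimate of the recombination fraction between a
marker and a trait locus from backcross progeny counts, followed by
conversion to map distance. The Haldane mapping function is used
here as an assumption; this disclosure does not name a mapping
function, and the progeny counts in the example are hypothetical.

```python
import math


def recombination_fraction(recombinant: int, parental: int) -> float:
    """Fraction of progeny carrying a recombinant marker/trait combination."""
    total = recombinant + parental
    if recombinant < 0 or parental < 0 or total == 0:
        raise ValueError("counts must be non-negative and not both zero")
    return recombinant / total


def haldane_centimorgans(r: float) -> float:
    """Map distance in cM from recombination fraction r (Haldane function).

    Assumes no crossover interference; valid for 0 <= r < 0.5.
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


if __name__ == "__main__":
    # Hypothetical BC1 population: 10 recombinant and 90 parental progeny.
    r = recombination_fraction(10, 90)
    print(f"r = {r:.3f}, distance = {haldane_centimorgans(r):.2f} cM")
```

A smaller recombination fraction indicates that the marker and the
trait locus are closer together and therefore more likely to be
inherited together.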
[0099] The selection of a suitable recurrent parent is an important
step for a successful backcrossing procedure. The goal of a
backcross protocol is to alter or substitute a trait or
characteristic in the original inbred. To accomplish this, one or
more loci of the recurrent inbred are modified or substituted with
the desired gene from the nonrecurrent (donor) parent, while
retaining essentially all of the rest of the desired genetic, and
therefore the desired physiological and morphological, constitution
of the original inbred. The choice of the particular donor parent
will depend on the purpose of the backcross. The exact backcrossing
protocol will depend on the characteristic or trait being altered
to determine an appropriate testing protocol. It may be necessary
to introduce a test of the progeny to determine if the desired
characteristic has been successfully transferred. In the case of
the present invention, one may test the progeny lines generated
during the backcrossing program as well as use the marker system
described herein to select lines based upon markers rather than
visual traits. The markers are indicative of the preferred T-type
genomic region or a genomic region comprising a favorable
haplotype.
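By way of illustration only, the recovery of the recurrent parent
genome during backcrossing can be sketched with the standard
expectation that, absent selection, each backcross halves the
remaining donor contribution. The function name is hypothetical,
and the expectation is an average over unlinked loci; genomic
regions linked to the introgressed locus are recovered more slowly.

```python
def expected_recurrent_fraction(backcrosses: int) -> float:
    """Expected recurrent-parent genome fraction after n backcrosses.

    n = 0 is the F1 (one half); each backcross halves the donor share,
    giving 1 - 0.5 ** (n + 1).
    """
    if backcrosses < 0:
        raise ValueError("backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


if __name__ == "__main__":
    for n in range(0, 7):
        label = "F1" if n == 0 else f"BC{n}"
        print(f"{label}: {expected_recurrent_fraction(n):.4f}")
```

Marker-based selection of progeny carrying more recurrent parent
alleles than this average is one way such a program can be
shortened.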
[0100] Transformed Plants and Plant Cells
[0101] As used herein, the term "transformed" refers to a cell,
tissue, organ, or organism into which has been introduced a foreign
polynucleotide molecule, such as a construct. The introduced
polynucleotide molecule may be integrated into the genomic DNA of
the recipient cell, tissue, organ, or organism such that the
introduced polynucleotide molecule is inherited by subsequent
progeny. A "transgenic" or "transformed" cell or organism also
includes progeny of the cell or organism and progeny produced from
a breeding program employing such a transgenic plant as a parent in
a cross and exhibiting an altered phenotype resulting from the
presence of a foreign polynucleotide molecule. A plant
transformation construct containing a polynucleotide molecule of
the present invention may be introduced into plants by any plant
transformation method. Methods and materials for transforming
plants by introducing a plant expression construct into a plant
genome in the practice of this invention can include any of the
well-known and demonstrated methods including electroporation as
illustrated in U.S. Pat. No. 5,384,253; microprojectile bombardment
as illustrated in U.S. Pat. Nos. 5,015,580; 5,550,318; 5,538,880;
6,160,208; 6,399,861; and 6,403,865; Agrobacterium-mediated
transformation as illustrated in U.S. Pat. Nos. 5,824,877;
5,591,616; 5,981,840; and 6,384,301; and protoplast transformation
as illustrated in U.S. Pat. No. 5,508,184, all of which are hereby
incorporated by reference.
[0102] Methods for specifically transforming dicots are well known
to those skilled in the art. Transformation and plant regeneration
using these methods have been described for a number of crops
including, but not limited to, cotton (Gossypium hirsutum), soybean
(Glycine max), peanut (Arachis hypogaea), alfalfa (Medicago
sativa), and members of the genus Brassica.
[0103] Methods for transforming monocots are well known to those
skilled in the art. Transformation and plant regeneration using
these methods have been described for a number of crops including,
but not limited to, barley (Hordeum vulgare); maize (Zea mays);
oats (Avena sativa); orchard grass (Dactylis glomerata); rice
(Oryza sativa, including indica and japonica varieties); sorghum
(Sorghum bicolor); sugar cane (Saccharum sp); tall fescue (Festuca
arundinacea); turfgrass species (e.g. species: Agrostis
stolonifera, Poa pratensis, Stenotaphrum secundatum); and wheat
(Triticum aestivum). It is apparent to those of skill in the art
that a number of transformation methodologies can be used and
modified for production of stable transgenic plants from any number
of target crops of interest. Methods for introducing a transgene
are well known in the art and include biological and physical
plant transformation protocols. See, for example, Miki et al.
(1993). Once a transgene is introduced into a variety it may
readily be transferred by crossing. By using backcrossing,
essentially all of the desired morphological and physiological
characteristics of a variety are recovered in addition to the locus
transferred into the variety via the backcrossing technique.
Backcrossing and forward breeding methods can be used with the
present invention to improve or introduce a characteristic into a
plant (Poehlman and Sleper, 1995; Fehr, 1987a, b; Sprague and
Dudley, 1988).
[0104] Site-Specific Integration of Transgenes
[0105] A number of site-specific recombination-mediated methods
have been developed for incorporating transgenes into plant genomes,
as well as for deleting unwanted genetic elements from plant and
animal cells. For example, the cre-lox recombination system of
bacteriophage P1, described by Abremski et al. (1983); Sternberg et
al. (1981) and others, has been used to promote recombination in a
variety of cell types. The cre-lox system utilizes the cre
recombinase isolated from bacteriophage P1 in conjunction with the
DNA sequences (termed lox sites) it recognizes. This recombination
system has been effective for achieving recombination in plant
cells (U.S. Pat. No. 5,658,772), animal cells (U.S. Pat. Nos.
4,959,317 and 5,801,030), and in viral vectors (Hardy et al.,
1997). Targeting and control of insertion or removal of transgene
sequences in a plant genome can be achieved by the use of a
molecular recombination method (U.S. Pat. No. 6,573,425). An
introduced polynucleotide molecule comprising a heterologous
recombination site incorporated into a haplotype region is within
the scope of the present invention.
[0106] Wahl et al. (U.S. Pat. No. 5,654,182) used the site-specific
FLP recombinase system of Saccharomyces cerevisiae to delete DNA
sequences in eukaryotic cells. The deletions were designed to
accomplish either inactivation of a gene or activation of a gene by
bringing desired DNA fragments into association with one another.
Activity of the FLP recombinase in plants has been demonstrated
(Lyznik et al., 1996; Luo et al., 2000).
[0107] Others have used transposons, or mobile genetic elements
that transpose when a transposase gene is present in the same
genome, to separate target genes from ancillary sequences. Yoder et
al. (U.S. Pat. Nos. 5,482,852 and 5,792,924, both of which are
incorporated herein by reference) used constructs containing the
sequence of the transposase enzyme and the transposase recognition
sequences to provide a method for genetically altering plants that
contain a desired gene free of vector and/or marker sequences.
Other methods that use DNA sequence directed bacteriophage
recombinase or transposases to target specific regions are
described in US 20020132350 and EP 1308516 (both of which are
incorporated herein by reference). Zinc finger endonucleases can be
specifically designed to recognize a DNA sequence and can target
specific DNA sequences in a genome to create a recombination site
useful for the insertion of a transgene (Wright et al., 2005; U.S.
Pat. No. 7,030,215; US 20050208489; US 20050064474, herein
incorporated by reference in their entirety). For example,
insertion of a transgene targeted to a haplotype comprising the
DNA sequences listed in the sequence listing of the present
invention and contained in the genome of a corn or soybean plant
is contemplated by the inventors.
[0108] A transgene that contains additional recombination sites
when it is a component of a preferred T-type genomic region
provides an opportunity to add additional transgenes to the T-type
genomic region, thereby increasing the value of the region in a
germplasm. The present invention contemplates that the T-type
genomic region is also a site for specific recombination activities
to remove or add new genetic material to the genomic region.
[0109] The following examples are included to demonstrate preferred
embodiments of the invention. It should be appreciated by those of
skill in the art that the techniques disclosed in the examples
which follow represent techniques discovered by the inventor to
function well in the practice of the invention, and thus can be
considered to constitute preferred modes for its practice. However,
those of skill in the art should, in light of the present
disclosure, appreciate that many changes can be made in the
specific embodiments which are disclosed and still obtain a like or
similar result without departing from the spirit and scope of the
invention.
Example 1
Identification of Haplotypes
[0110] This example illustrates identifying soybean haplotypes
useful in databases for practicing the methods of this invention.
The chromosomes of soybean were divided into haplotypes by
following the inheritance of a large set of markers. Allelic
forms of the haplotypes were identified for a set of 4 haplotypes,
which are listed in Table 1. With reference to Table 1, a haplotype
mapped to a genomic location is identified by reference; for
example, C8W6H5 refers to chromosome 8, window 6 in that chromosome,
and haplotype 5 in that window (genomic region). SEQ_ID provides
reference to the sequence listing, and the marker ID number is an
arbitrary identifying name for a DNA amplicon associated with a
marker locus. START_POS refers to the start position of the marker
in the DNA amplicon. HAP ALLELE refers to the nucleotide of an
SNP/Indel marker at the start position, where * indicates a deletion
of an Indel. "Other marker states" identifies another nucleotide
allele of markers in the window.
TABLE-US-00001 TABLE 1 Summary information of marker loci used to
characterize four soybean haplotypes associated with the glyphosate
tolerant soybean events, including the sequence identification (SEQ
ID and marker ID number) and the position of the polymorphism
(START POS) being used to characterize alleles (HAP ALLELE) in
these sequences.
Haplotype   SEQ_ID   Marker ID   START POS   HAP ALLELE   Other marker states
C8W6H5      1        962360      277         *            G
C8W6H5      2        1324623     785         A            T
C8W6H5      3        1271382     239         A            G
C16W8H43    4        1271562     351         A            G
C16W8H43    5        894632      193         G            C
C16W8H43    6        928368      320         A            G
C16W8H43    7        1267271     563         C            A
C16W8H43    8        1271614     126         A            G
C16W8H43    9        1271496     359         T            G
C18W3H8     10       1271924     603         G            A
C18W3H8     11       1267375     741         T            C
C18W3H8     12       860401      372         G            C
C19W3H6     13       1271355     283         T            C
C19W3H6     14       1271476     546         A            C
C19W3H6     15       825651      294         T            C
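By way of illustration only, the haplotype naming convention used
in Table 1 (chromosome, window, and haplotype number) can be parsed
as follows. The type and function names are hypothetical; only the
C/W/H pattern itself is taken from this example.

```python
import re
from typing import NamedTuple


class HaplotypeId(NamedTuple):
    chromosome: int
    window: int
    haplotype: int


# "C8W6H5" means chromosome 8, window 6, haplotype 5 in that window.
_HAPLOTYPE_PATTERN = re.compile(r"C(\d+)W(\d+)H(\d+)")


def parse_haplotype_id(label: str) -> HaplotypeId:
    """Split a label such as "C16W8H43" into its three numeric parts."""
    match = _HAPLOTYPE_PATTERN.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a haplotype label: {label!r}")
    chromosome, window, haplotype = (int(g) for g in match.groups())
    return HaplotypeId(chromosome, window, haplotype)


if __name__ == "__main__":
    for label in ("C8W6H5", "C16W8H43", "C18W3H8", "C19W3H6"):
        print(label, parse_haplotype_id(label))
```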
Example 2
Preparation of a Database with Agronomic Traits and Haplotypes
[0111] This example illustrates the preparation of a database
useful in a method of this invention. With reference to Table 2 the
database comprises computed values of agronomic traits, for
example, yield, maturity, plant height, and lodging, for the
specific allelic soybean haplotypes and the haplotype frequency in
a set of breeding lines. Other traits can be measured, for example,
yield of a grain, seed, fruit, fiber, forage, oil; or an agronomic
trait, for example, pest resistance such as disease resistance,
insect resistance, nematode resistance, or improved growth rate,
and stress tolerance; or an improved processed product of the
plant, for example, fatty acid profile, amino acid profile,
nutritional content, or fiber quality, and a database can be
compiled for the values of each haplotype for these other traits.
The agronomic
trait values of these haplotypes represent the predicted population
change in mean value for the trait listed if the haplotype was
fixed in the germplasm, everything else staying the same. The
values for "yield" are in bushels of soybeans per acre. The values
for "maturity" are in days (maturity of a soybean line is the
relative flowering time of that line compared to a set of standard
checks of defined maturity). The values for "plant height" are in
inches of height measured from the soil surface to the tip of the
uppermost plant tissue at maturity. The values of "lodging" are a
percent of plants compared to a set of standard checks (lodging is
a phenomenon in which the main stem of crop plants has moved from
the vertical by a large angle, sometimes to the point of the plants
lying on the ground).
[0112] The breeding values for each of the haplotypes are used to
select the haplotype that in combination with a transgene will be
the most beneficial for the improvement of the germplasm of the
crop. The breeding value is a combination of measured traits and
the estimation of how these traits will affect germplasm
improvement. The soybean haplotypes associated with the transgenic
events for glyphosate tolerance were measured and the results are
shown in Table 2. Haplotype C8W6H5 would be a favorable haplotype for
its effect on yield, and haplotype C18W3H8 would be a favorable
haplotype for its very high frequency in the germplasm (94
percent), indicating that little variability is present in the
target soy germplasm for this chromosome segment, making the
diffusion process of a transgenic event in it neutral. Haplotype
C19W3H6 is generally neutral with respect to yield.
TABLE-US-00002 TABLE 2 The calculated breeding values of four
haplotypes described for yield, maturity, plant height, and
lodging. The frequency of the haplotype in the soybean germplasm
was estimated from a sample of 365 soybean lines.
Haplotype   Yield            Maturity   Plant height   Lodging   Frequency in a
            (Bushels/acre)   (Days)     (inches)       (%)       breeding population
C8W6H5      1.689            0.989      -0.195         -0.027    21%
C16W8H43    -0.447           -0.211     -0.514         -0.101    42%
C18W3H8     0.000            0.000      0.000          0.000     94%
C19W3H6     -0.071           0.232      -0.495         0.001     58%
[0113] The haplotype regions were determined for each of the four
new glyphosate tolerant soybean events. 17194 is linked to
haplotype C16W8H43, 17426 is linked to haplotype C18W3H8, 19703 is
linked to haplotype C19W3H6, and 19788 is linked to haplotype
C8W6H5. The relative effect of these haplotypes was measured as
illustrated in Table 2. This represents the predicted population
change in mean value for the trait listed if the haplotype was
fixed in the germplasm, everything else staying the same. The
T-type of 19788 and the associated C8W6H5 haplotype is the most
favorable of the four T-types that were measured. This result
demonstrates that it is important in a process to improve crop
performance through transgenic methods that both transgenic events
and the linked haplotype regions are evaluated to continue to
enhance crop productivity.
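By way of illustration only, the database of Table 2 and the
linkage of each event to its haplotype can be held in a small
lookup structure and used to rank events by the breeding value of
the linked haplotype. The structure, key names, and ranking
function are hypothetical; the numeric values are those of Table 2,
and ranking on a single trait is a simplification of the evaluation
described above.

```python
# Breeding values from Table 2 (frequency expressed as a fraction).
BREEDING_VALUES = {
    "C8W6H5":   {"yield": 1.689,  "maturity": 0.989,  "plant_height": -0.195,
                 "lodging": -0.027, "frequency": 0.21},
    "C16W8H43": {"yield": -0.447, "maturity": -0.211, "plant_height": -0.514,
                 "lodging": -0.101, "frequency": 0.42},
    "C18W3H8":  {"yield": 0.000,  "maturity": 0.000,  "plant_height": 0.000,
                 "lodging": 0.000,  "frequency": 0.94},
    "C19W3H6":  {"yield": -0.071, "maturity": 0.232,  "plant_height": -0.495,
                 "lodging": 0.001,  "frequency": 0.58},
}

# Event-to-haplotype linkage as stated for the four events.
EVENT_HAPLOTYPE = {
    "17194": "C16W8H43",
    "17426": "C18W3H8",
    "19703": "C19W3H6",
    "19788": "C8W6H5",
}


def rank_events(trait: str, higher_is_better: bool = True) -> list:
    """Order events by the breeding value of their linked haplotype."""
    return sorted(
        EVENT_HAPLOTYPE,
        key=lambda event: BREEDING_VALUES[EVENT_HAPLOTYPE[event]][trait],
        reverse=higher_is_better,
    )


if __name__ == "__main__":
    print(rank_events("yield"))  # 19788 ranks first on yield
```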
[0114] The new glyphosate tolerant events were compared in
replicated field trials to a backcross conversion of 40-3-2 into
A3244 germplasm. The trials included yield data collected from
seventeen locations in the United States. A3244 (U.S. Pat. No.
5,659,114, ATCC number
97549) is an elite soybean germplasm from Asgrow (Monsanto, St.
Louis, Mo.) that was used as the parent line for transformation to
generate the new glyphosate tolerant soybean events 17194, 17426,
19703, and 19788. The results of the yield study showed that the 40-3-2
A3244 backcross yielded an average of 60.7 bu/acre, 19788 an
average of 65.6 bu/acre, 19703 an average of 65.7 bu/acre, 17426 an
average of 65.3 bu/acre, and 17194 an average of 65.8 bu/acre. The
four new lines have an approximate yield advantage of 5 bu/acre
over the same genotype with the introgressed 40-3-2 T-type genomic
region. When the haplotype of each is considered then the most
favorable event is 19788.
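By way of illustration only, the yield advantage stated above can
be reproduced from the reported averages. The variable names are
hypothetical; the yields are those reported for the field trials.

```python
# Reported average yields in bushels per acre.
CONTROL_YIELD = 60.7  # 40-3-2 backcross conversion in A3244
EVENT_YIELDS = {"19788": 65.6, "19703": 65.7, "17426": 65.3, "17194": 65.8}

# Advantage of each new event over the 40-3-2 conversion.
advantages = {
    event: round(value - CONTROL_YIELD, 1)
    for event, value in EVENT_YIELDS.items()
}

# Mean advantage across the four new events.
mean_advantage = round(
    sum(EVENT_YIELDS.values()) / len(EVENT_YIELDS) - CONTROL_YIELD, 1
)

if __name__ == "__main__":
    for event, gain in advantages.items():
        print(f"{event}: +{gain} bu/acre")
    print(f"mean advantage: +{mean_advantage} bu/acre")
```

The individual advantages range from 4.6 to 5.1 bu/acre, with a
mean of 4.9 bu/acre, consistent with the approximate advantage of 5
bu/acre stated above.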
[0115] These analyses demonstrate the value of determining the
T-type for each transgenic event that is being developed as a
commercial product. Failure to consider the agronomic effects of
the haplotype region in which the transgene has introgressed can
result in the introduction of a low performing event into the
germplasm of a crop.
Example 3
Use of Breeding Values
[0116] Breeding values were determined for four haplotype regions
in which an insect tolerance gene was inserted into the genome of a
soybean plant. The relative breeding value for each haplotype
region is shown in Table 3; the definitions of the measurements are
the same as described in Example 2. The table is a database for
determining the haplotype in which an insect tolerance gene was
inserted (a T-type) and its breeding value. A transgenic event
comprising the T-type is
selected using the database information. A particular event,
GM_19459, contains the T-type of the insect tolerance gene
associated with C6W4H1 haplotype that is a favorable haplotype for
maturity.
TABLE-US-00003 TABLE 3 The calculated breeding values for yield,
maturity, plant height, and lodging of four haplotypes for the
insect tolerant soybean events. The frequency of the haplotype in
the germplasm was estimated from 2589 soybean lines.

Haplotype   Yield            Maturity   Plant height   Lodging   Haplotype
            (Bushels/acre)   (Days)     (inches)       (%)       frequency
C1W1H2       0.075            0.244      0.057          0.018    16%
C1W2H1       0.160            0.314      0.069          0.022    67%
C14W7H2      0.130            0.648     -0.101         -0.069    62%
C6W4H1      -0.156           -0.111      --             0.070    29%
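Paragraph [0116] describes Table 3 as a database from which a T-type is selected. The following is a minimal sketch, in Python, of one way such a lookup could be expressed. The dictionary layout, the function name `rank_haplotypes`, and the `higher_is_better` flag are assumptions made for illustration; in particular, which direction counts as favorable for a given trait is a breeding decision that the sketch leaves to the caller.

```python
# Breeding values transcribed from Table 3; None marks the value shown as "--".
TABLE_3 = {
    "C1W1H2":  {"yield": 0.075,  "maturity": 0.244,  "plant_height": 0.057,  "lodging": 0.018,  "frequency": 0.16},
    "C1W2H1":  {"yield": 0.160,  "maturity": 0.314,  "plant_height": 0.069,  "lodging": 0.022,  "frequency": 0.67},
    "C14W7H2": {"yield": 0.130,  "maturity": 0.648,  "plant_height": -0.101, "lodging": -0.069, "frequency": 0.62},
    "C6W4H1":  {"yield": -0.156, "maturity": -0.111, "plant_height": None,   "lodging": 0.070,  "frequency": 0.29},
}


def rank_haplotypes(table, trait, higher_is_better=True):
    """Rank haplotypes by breeding value for one trait, skipping missing values."""
    scored = [(name, row[trait]) for name, row in table.items() if row[trait] is not None]
    return sorted(scored, key=lambda item: item[1], reverse=higher_is_better)


# Highest breeding value for yield.
print(rank_haplotypes(TABLE_3, "yield")[0])
# Lowest breeding value for maturity (assumes, for illustration, that lower is preferred).
print(rank_haplotypes(TABLE_3, "maturity", higher_is_better=False)[0])
```

Under the stated assumption for maturity, the second query returns C6W4H1, the haplotype that the text identifies as favorable for maturity.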
[0117] Allelic forms of the haplotypes were identified for a set of
four haplotypes associated with transgenic insect resistant soybeans,
as listed in Table 4. With reference to Table 4, a haplotype mapped
to a genomic location is identified by reference; for example,
C1W1H2 refers to chromosome 1, window 1 in that chromosome, and
haplotype 2 in that window (genomic region). SEQ_ID provides
reference to the sequence listing, and the marker ID number is an
arbitrary identifying name for a DNA amplicon associated with a
marker locus. START_POS refers to the start position of the marker
in the DNA amplicon. HAP allele refers to the nucleotide of a
SNP/Indel marker at the start position, where * indicates a deletion
of an Indel. "Other marker states" identifies another nucleotide
allele of markers in the window, and "NA" indicates that another
marker allele is not present.
TABLE-US-00004 TABLE 4 Summary information of marker loci used to
characterize four soybean haplotypes associated with the insect
tolerant soybean events, including the sequence identification (SEQ
ID and marker ID number) and the position of the polymorphism
(START POS) being used to characterize alleles (HAP ALLELE) in
these sequences.

Haplotype   SEQ_ID   Marker ID   START POS   HAP ALLELE   Other marker states
C1W1H2      16       NS0092678   0           C            T
            17       NS0092617   0.4         A            G
            18       NS0101549   1.4         A            G
            19       NS0127917   1.4         C            A
            20       NS0120003   1.8         A            T
            21       NS0118494   3           C            T
            22       NS0124158   3           A            G
C1W2H1      23       NS0101025   11.3        C            T
            24       NS0101038   11.3        A            C
            25       NS0127234   11.3        T            G
            26       NS0129173   11.3        T            A
            27       NS0097228   16.2        C            NA
C14W7H2     28       NS0096079   68.5        T            C
C6W4H1      29       NS0125775   30.3        G            C
            30       NS0130788   30.3        T            C
            31       NS0093984   32.9        C            T
            32       NS0096925   32.9        A            *
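The haplotype naming convention explained in paragraph [0117] (chromosome, window, haplotype) lends itself to mechanical parsing. The following is a minimal sketch in Python; the `HaplotypeId` type and the `parse_haplotype_id` function are hypothetical names introduced here for illustration only.

```python
import re
from typing import NamedTuple


class HaplotypeId(NamedTuple):
    chromosome: int
    window: int
    haplotype: int


# Labels take the form C<chromosome>W<window>H<haplotype>, e.g. C1W1H2.
_LABEL = re.compile(r"C(\d+)W(\d+)H(\d+)")


def parse_haplotype_id(label: str) -> HaplotypeId:
    """Split a label such as 'C1W1H2' into its three numeric components."""
    match = _LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"not a CxWyHz haplotype label: {label!r}")
    return HaplotypeId(*(int(group) for group in match.groups()))


print(parse_haplotype_id("C1W1H2"))   # chromosome 1, window 1, haplotype 2
print(parse_haplotype_id("C14W7H2"))  # chromosome 14, window 7, haplotype 2
```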
Example 4
Application to Corn Breeding
[0118] This example illustrates the haplotype regions and breeding
values that were determined for four haplotype regions in which an
insect tolerance gene was inserted into the genome of a corn plant
(LH172). The relative breeding value for each haplotype region is
shown in Table 5; the definitions of the measurements are the same
as described in Example 2. The table serves as a database for
determining the haplotype in which an insect tolerance gene was
inserted (a T-type) and the breeding value of that haplotype. A
transgenic event comprising the T-type is selected using the
database information. A particular event contains the T-type of the
insect tolerance gene associated with the C1W36H2 haplotype.
TABLE-US-00005 TABLE 5 Calculated breeding value for yield of four
haplotypes for insect tolerant corn events. The frequency of the
haplotype in the germplasm was estimated from 6335 corn lines.

Haplotype   Yield            Haplotype
            (Bushels/acre)   frequency
C1W19H14     0.168           9.2%
C1W30H4     -0.781           3.3%
C1W36H2      0.008           18%
C8W4H5       0.377           15%
[0119] Allelic forms of the haplotypes were identified for a set of
four haplotypes for the transgenic insect resistant corn, as listed
in Table 6. With reference to Table 6, a haplotype mapped to a
genomic location is identified by reference; for example, C1W19H14
refers to chromosome 1, window 19 in that chromosome, and haplotype
14 in that window (genomic region). SEQ_ID provides reference to the
sequence listing, and the marker ID number is an arbitrary
identifying name for a DNA amplicon associated with a marker locus.
START_POS refers to the start position of the marker in the DNA
amplicon. HAP allele refers to the nucleotide of a SNP/Indel marker
at the start position, where * indicates a deletion of an Indel.
"Other marker states" identifies another nucleotide allele of
markers in the window.
TABLE-US-00006 TABLE 6 Summary information of marker loci used to
characterize four corn haplotypes associated with the insect
tolerant corn events, including the sequence id (SEQ ID and marker
ID number) and the position of the polymorphism (START POS) being
used to characterize alleles (HAP ALLELE) in these sequences.

Haplotype   SEQ_ID   Marker ID   START POS   HAP ALLELE   Other marker states
C1W19H14    33       NC0053983   109.4       T            C
            34       NC0113263   110.1       A            G
            35       NC0008901   110.8       T            C
            36       NC0143254   110.9       A            G
            37       NC0030198   111         A            G
            38       NC0080733   111         T            G
            39       NC0104474   111         C            T
            40       NC0033728   113.3       C            A
            41       NC0029506   113.6       C            G
C1W30H4     42       NC0039502   195.5       G            A
            43       NC0111626   196.4       T            C
            44       NC0008982   198.4       A            G
            45       NC0040427   199.4       G            T
            46       NC0033427   199.8       G            T
            47       NC0148362   200         G            A
C1W36H2     48       NC0146570   237         T            G
            49       NC0008996   238.1       A            T
            50       NC0013490   240.7       T            C
C8W4H5      51       NC0111628   57.3        A            G
            52       NC0026720   58.7        A            C
            53       NC0037392   60          C            T
            54       NC0027485   60.1        C            T
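Table 6 characterizes each haplotype by the alleles observed at a set of marker loci. The following is a minimal sketch, in Python, of how a line's marker calls might be compared against the HAP ALLELE values for C1W36H2. The function name and the two example lines are hypothetical, and the sketch assumes one allele call per marker (as for an inbred line), which is a simplification.

```python
# HAP ALLELE calls for haplotype C1W36H2, transcribed from Table 6.
C1W36H2 = {"NC0146570": "T", "NC0008996": "A", "NC0013490": "T"}


def carries_haplotype(genotype, haplotype):
    """Return True if every marker defining the haplotype shows the haplotype allele.

    A marker missing from the genotype is treated as a mismatch.
    """
    return all(genotype.get(marker) == allele for marker, allele in haplotype.items())


# Hypothetical marker calls for two lines (not data from the disclosure).
line_a = {"NC0146570": "T", "NC0008996": "A", "NC0013490": "T"}
line_b = {"NC0146570": "T", "NC0008996": "T", "NC0013490": "T"}

print(carries_haplotype(line_a, C1W36H2))  # all three markers match
print(carries_haplotype(line_b, C1W36H2))  # NC0008996 carries the other marker state
```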
Example 5
Indirect Mapping of a T-Type Genomic Region
[0120] DNA markers are identified in the genomic region flanking a
transgene insert to provide a means to identify the genomic
location of the transgene by comparison of the DNA markers to a
mapping population. DNA markers can be developed to any transgenic
event by isolation of the genomic region, sequencing of the region,
isolation of the same region in a mapping population of the crop
plant, and determining the location relative to markers known in
the mapping population. The association of the transgene with
mapped phenotypes and with quantitative trait loci comprising a
haplotype genomic region can then be determined.
[0121] For example, for MON89788 a DNA primer pair was selected
from a DNA sequence that extends into the genome 5' to the
transgene insertion site (SEQ ID NO:55 and 56) and into the 3'
genomic region relative to the transgene insertion site (SEQ ID
NO:57-58). A DNA amplification method was used to produce DNA
products that comprise a portion of the soybean genome from the 5'
and 3' regions of the transgene insertion site. These DNA products
were sequenced. The same primer pairs were used to amplify DNA from
seven soybean lines (507354, Minsoy, Noir, HS1, PIC, 88788, A3244)
that are parents of four mapping populations. A single nucleotide
polymorphism (SNP) was identified at position 119 (SNP119, SEQ ID
NO:59) from the 3' flanking sequences when comparing sequences
across different lines. Table 7 shows the allelic composition at
the flanking positions in the eight lines tested.
TABLE-US-00007 TABLE 7 Polymorphism at flanking sequences in
different soybean lines comprising MON89788.

Line, 5' Flanking (Position 2809), 3' Flanking (Position 119)
507354 A T
Minsoy A T
Noir A T
HS1 A T
PIC T C
88788 T
A3244 T
507355 A T
[0122] A Taqman® (PE Applied Biosystems, Foster City, Calif.)
end point assay was developed from SNP119 in accordance with
instructions provided by the manufacturer. Primer and probe
sequences are given in Table 8. To map the SNP119 polymorphism, an
F2 population derived from a cross between HS1 x PI407305
(PIC), consisting of 140 individuals, was used. The map position of
SNP119 was determined by placing the allelic scores against the
existing allelic data set using MapMaker (Lincoln and Lander,
1990). SNP119 was found on linkage group D1a+Q (Song, Q. J., et
al., 2004). Thus, MON89788 was indirectly mapped to this same
position.
TABLE-US-00008 TABLE 8 Primer and probe molecules for Taqman assay
for mapping haplotype

Molecule         Name             Sequence                    SEQ ID NO:
Forward Primer   19788_3E-119F    CGTTCTCGACTTCAACCATATGTGA   60
Reverse Primer   19788_3E-119R    GCATGGAATAAAGCGGAAAGGAAAG   61
VIC Probe        19788_3E-119V2   CCATGGTATCATAGGCA           62
Fam Probe        19788_3E-119M2   CCATGGTATCGTAGGCA           63
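The two probes in Table 8 are allele-specific: they are expected to differ only at the polymorphic base that the assay discriminates. The following is a minimal sketch in Python that locates that base by comparing the two probe sequences as printed in Table 8. The function name `mismatches` is illustrative, and no assumption is made here about which strand the probes correspond to.

```python
# Probe sequences transcribed from Table 8.
VIC_PROBE = "CCATGGTATCATAGGCA"  # 19788_3E-119V2, SEQ ID NO: 62
FAM_PROBE = "CCATGGTATCGTAGGCA"  # 19788_3E-119M2, SEQ ID NO: 63


def mismatches(seq_a, seq_b):
    """List (1-based position, base in seq_a, base in seq_b) wherever two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


print(mismatches(VIC_PROBE, FAM_PROBE))  # → [(11, 'A', 'G')]
```

The probes differ at a single position (base 11 of 17), with A in the VIC probe and G in the Fam probe.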
[0123] A deposit of Monsanto Technology LLC soybean seed
comprising event MON89788, disclosed above and recited in the
claims, has been made under the Budapest Treaty with the American
Type Culture Collection (ATCC), 10801 University Boulevard,
Manassas, Va. 20110. The ATCC accession number is PTA-6708,
deposited on May 11, 2005. The deposit will be maintained in the
depository for a period of 30 years, or 5 years after the last
request, or for the effective life of the patent, whichever is
longer, and will be replaced as necessary during that period. DNA
molecules of the present invention can be isolated from the genome
of the deposited material and the sequence corrected if necessary.
Additional DNA molecules for use as probes or primers for the
haplotype regions disclosed herein can also be isolated from the
deposited material.
[0124] All publications, patents and patent applications are herein
incorporated by reference to the same extent as if each individual
publication or patent application was specifically and individually
indicated to be incorporated by reference.
[0125] All of the compositions and methods disclosed and claimed
herein can be made and executed without undue experimentation in
light of the present disclosure. While the compositions and methods
of this invention have been described in terms of preferred
embodiments, it will be apparent to those of skill in the art that
variations may be applied to the methods and in the steps or in the
sequence of steps of the method described herein without departing
from the concept, spirit and scope of the invention. More
specifically, it will be apparent that certain agents which are
both chemically and physiologically related may be substituted for
the agents described herein while the same or similar results would
be achieved. All such similar substitutes and modifications
apparent to those skilled in the art are deemed to be within the
spirit, scope and concept of the invention as defined by the
appended claims.
REFERENCES
[0126] The following references, to the extent that they provide
exemplary procedural or other details supplementary to those set
forth herein, are specifically incorporated herein by reference.
[0127] U.S. Pat. No. 4,757,011 [0128] U.S. Pat. No. 4,769,061
[0129] U.S. Pat. No. 4,810,648 [0130] U.S. Pat. No. 4,940,835
[0131] U.S. Pat. No. 4,959,317 [0132] U.S. Pat. No. 4,971,908
[0133] U.S. Pat. No. 5,015,580 [0134] U.S. Pat. No. 5,094,945
[0135] U.S. Pat. No. 5,229,114 [0136] U.S. Pat. No. 5,304,730
[0137] U.S. Pat. No. 5,384,253 [0138] U.S. Pat. No. 5,437,697
[0139] U.S. Pat. No. 5,463,175 [0140] U.S. Pat. No. 5,482,852
[0141] U.S. Pat. No. 5,508,184 [0142] U.S. Pat. No. 5,512,466
[0143] U.S. Pat. No. 5,516,671 [0144] U.S. Pat. No. 5,538,880
[0145] U.S. Pat. No. 5,543,576 [0146] U.S. Pat. No. 5,545,545
[0147] U.S. Pat. No. 5,550,318 [0148] U.S. Pat. No. 5,591,616
[0149] U.S. Pat. No. 5,608,149 [0150] U.S. Pat. No. 5,627,061
[0151] U.S. Pat. No. 5,633,435 [0152] U.S. Pat. No. 5,637,785
[0153] U.S. Pat. No. 5,654,182 [0154] U.S. Pat. No. 5,658,772
[0155] U.S. Pat. No. 5,659,114 [0156] U.S. Pat. No. 5,689,041
[0157] U.S. Pat. No. 5,716,837 [0158] U.S. Pat. No. 5,750,876
[0159] U.S. Pat. No. 5,773,696 [0160] U.S. Pat. No. 5,792,924
[0161] U.S. Pat. No. 5,801,030 [0162] U.S. Pat. No. 5,824,877
[0163] U.S. Pat. No. 5,942,664 [0164] U.S. Pat. No. 5,958,745
[0165] U.S. Pat. No. 5,981,840 [0166] U.S. Pat. No. 5,985,605
[0167] U.S. Pat. No. 5,998,700 [0168] U.S. Pat. No. 6,011,199
[0169] U.S. Pat. No. 6,013,864 [0170] U.S. Pat. No. 6,031,154
[0171] U.S. Pat. No. 6,040,497 [0172] U.S. Pat. No. 6,063,597
[0173] U.S. Pat. No. 6,063,756 [0174] U.S. Pat. No. 6,072,103
[0175] U.S. Pat. No. 6,080,560 [0176] U.S. Pat. No. 6,093,695
[0177] U.S. Pat. No. 6,110,464 [0178] U.S. Pat. No. 6,121,436
[0179] U.S. Pat. No. 6,140,085 [0180] U.S. Pat. No. 6,160,208
[0181] U.S. Pat. No. 6,166,292 [0182] U.S. Pat. No. 6,171,640
[0183] U.S. Pat. No. 6,184,440 [0184] U.S. Pat. No. 6,228,992
[0185] U.S. Pat. No. 6,316,407 [0186] U.S. Pat. No. 6,380,466
[0187] U.S. Pat. No. 6,384,301 [0188] U.S. Pat. No. 6,399,861
[0189] U.S. Pat. No. 6,403,865 [0190] U.S. Pat. No. 6,444,876
[0191] U.S. Pat. No. 6,476,295 [0192] U.S. Pat. No. 6,476,295
[0193] U.S. Pat. No. 6,476,295 [0194] U.S. Pat. No. 6,486,383
[0195] U.S. Pat. No. 6,506,962 [0196] U.S. Pat. No. 6,531,648
[0197] U.S. Pat. No. 6,537,750 [0198] U.S. Pat. No. 6,660,911
[0199] U.S. Pat. No. 6,768,044 [0200] U.S. Pat. No. 6,774,284
[0201] U.S. Pat. No. 7,030,215 [0202] U.S. Publn. 20020132350
[0203] U.S. Publn. 20030083480 [0204] U.S. Publn. 20040177399
[0205] U.S. Publn. 20050064474 [0206] U.S. Publn. 20050208489
[0207] U.S. Publn. 20050246798 [0208] U.S. Publn. 20060021093
[0209] U.S. Publn. 20060021094 [0210] U.S. Publn. 20030028917
[0211] Abremski et al., Cell, 32:1301-1311, 1983. [0212] Allard,
"Principles of Plant Breeding," John Wiley & Sons, NY, U. of
CA, Davis, Calif., 50-98, 1960 [0213] Beachy et al., Ann. Rev.
Phytopathol., 28:451, 1990. [0214] Berno, Genome Research, 6:80-91,
1996. [0215] Charest et al., Plant Cell Rep., 8:643, 1990. [0216]
Cheung et al., Theor. Appl. Genet., 94:569-582, 1997. [0217] Comai
et al., Nature, 317:741-744, 1985. [0218] DeBlock, et al., EMBO J.,
6:2513-2519, 1987. [0219] Dellaporta et al., Stadler Symposium,
11:263-282, 1988. [0220] Dempster et al. J. R. Stat. Soc.,
39B:1-38, 1977. [0221] Eichholtz et al., Somatic Cell Mol. Genet.,
13:67, 1987. [0222] Elliot et al., Plant Molec. Biol., 21:515,
1993. [0223] European Appln. 0 242 246 [0224] European Appln.
0640141 [0225] European Appln. 0797673 [0226] European Appln.
1308516 [0227] European Patent Appln. 0154204 [0228] Ewing et al.,
Genome Research, 8:175-185, 1998. [0229] Excoffier and Slatkin,
Mol. Biol. Evol., 12(5):921-927, 1995. [0230] Fehr, In: Principles of
variety development, Theory and Technique, (Vol 1) and In: Crop
Species Soybean (Vol 2), Iowa State Univ., Macmillan Pub. Co., NY,
360-376, 1987b. [0231] Fehr, In: Soybeans: Improvement, Production
and Uses, 2nd Ed., Monograph., 16:249, 1987a. [0232] Ferreira
et al., J. Hered., 91:392-396, 2000. [0233] Fisher et al., Plant
Physiol., 102:1045, 1993. [0234] Fromm et al., Proc. Natl. Acad.
Sci. USA, 82(17):5824-5828, 1985. [0235] Geiser et al., Gene,
48:109, 1986. [0236] Giddings et al., Nucleic Acid Res.,
21:4530-4540, 1993. [0237] Glick et al., In: Methods in Plant
Molecular Biology and Biotechnology, CRC Press, Boca Raton, Fla.,
1993. [0238] Gordon-Kamm et al., Plant Cell, 2:603-618, 1990.
[0239] Hardy et al., J. Virology, 71:1842, 1997. [0240] Hinchee et
al., Bio/Technology, 6:915-922, 1988. [0241] Ikuta et al.,
Bio/Technol., 8:241-242, 1990. [0242] Jefferson et al., EMBO J.,
6:3901-3907, 1987. [0243] Jefferson, Plant Mol. Biol. Rep.,
5:387-405, 1987. [0244] Jones et al., Science, 266:789, 1994.
[0245] Katz et al., J. Gen. Microbiol., 129:2703-2714, 1983. [0246]
Knutzon et al., Proc. Natl. Acad. Sci. USA, 89:2624, 1992. [0247]
Lacape et al., Genome, 46:612-626, 2003. [0248] Lawrence and
Solovyev, Nucleic Acid Res., 22:1272-1280, 1994. [0249] Lee et al.,
EMBO J., 7:1241, 1988. [0250] Lee et al., Plant Mol. Biol.,
48:453-461, 2002. [0251] Lewin, In: Genes V, Oxford University Press,
NY, 1994. [0252] Lincoln and Lander, Mapping Genes Controlling
Quantitative Traits Using MAPMAKER/QTL, Whitehead Institute for
Biomedical Research, Massachusetts, 1990. [0253] Luo et al., Plant
J., 23:423-430, 2000. [0254] Lyznik et al, Nucleic Acids Res.,
24:3784-3789, 1996. [0255] Marshall et al., Theor. Appl. Genet.,
83:435, 1992. [0256] Martin et al., Science, 262:1432, 1993. [0257]
McCallum et al., Plant Physiol., 123:439-442, 2000. [0258]
Miki et al., In: Methods in Plant Molecular Biology and
Biotechnology, Glick and Thompson (Eds.), CRC Press, Inc., Boca
Raton, 67-88, 1993. [0259] Miki et al., Theor. Appl. Genet.,
80:449, 1990. [0260] Mindrinos et al., Cell, 78:1089, 1994. [0261]
Misawa et al, Plant J., 4:833-840, 1993. [0262] Misawa et al, Plant
J., 6:481-489, 1994. [0263] Ow et al., Science, 234:856-859, 1986.
[0264] Padgette et al., Crop Sci., 35:1451-1461, 1995. [0265] PCT
Appln. WO93/19181 [0266] PCT Appln. WO96/30517 [0267] Pen et al.,
Bio/Technology, 10:292, 1992. [0268] Poehlman and Sleper, In:
Breeding Field Crops, Iowa State University Press, Ames, [0269]
1995. [0270] Potrykus et al., Ann. Rev. Plant Physiol. Plant Mol.
Biol., 42:205, 1991. [0271] Przibilla et al., Plant Cell, 3:169,
1991. [0272] Rieger et al., In: Glossary of Genetics: Classical and
Molecular, 5th Ed., Springer-Verlag, NY, 1991. [0273]
Rodriguez et al., In: Vectors: A Survey of Molecular Cloning
Vectors and Their Uses, Butterworths, Boston, 1988. [0274] Rogers
et al., Methods In Enzymology, 153:253-277, 1987. [0275] Sambrook
et al. [0276] Sathasivan et al., Nucl. Acids Res., 18:2188-2193,
1990. [0277] Shah et al., Science, 233:478, 1986. [0278] Shiroza et
al., J. Bacteriol., 170:810, 1988. [0279] Simmonds, In: Principles of
crop improvement, Longman, Inc., NY, 369-399, 1979. [0280] Sneep
and Hendriksen, In: Plant breeding perspectives, Wageningen (Ed.),
Center for Agricultural Publishing and Documentation, 1979. [0281]
Sogaard et al., J. Biol. Chem., 268:22480, 1993. [0282] Song, Q.
J., et al, Theor. Appl. Genetics 109:122-128, 2004. [0283] Sprague
and Dudley, In: Corn and Corn Improvement, 3rd Ed., Crop
Science of America, Inc.; Soil Science of America, Inc., Wisconsin,
881-883; 901-918, 1988. [0284] Stalker et al., J. Biol. Chem.,
263:6310-6314, 1988. [0285] Stalker et al., Science, 242:419-423,
1988. [0286] Steinmetz et al., Mol. Gen. Genet., 20:220, 1985.
[0287] Sternberg et al. Cold Spring Harbor Symp. Quant. Biol.
45:297-309, 1981. [0288] Sutcliffe et al., Proc. Natl. Acad. Sci.
USA, 75:3737-3741, 1978. [0289] Thillet et al., J. Biol. Chem.,
263:12500-12508, 1988. [0290] Wright et al., Plant Journal,
44:693-705, 2005. [0291] Zukowsky et al., Proc. Natl. Acad. Sci.
USA, 80:1101-1105, 1983.
Sequence CWU 1
1
631664DNAGlycine max 1attcagaagg ctgttaaaac cccctcaggt caccaccact
aaacaagaca acaagacaat 60aacagataaa agcctagttt gtcttgcatg cacttacatg
cacctttcat ttttttttct 120tgcatgcatc atggtcccct actaatacca
tttatcttca actattcccc cctctcccaa 180aatcattcct tgcccttcaa
cttttcataa ttgtcttaat taaatgtttg gattaaagtc 240tataaaagta
tcacaaggct tactttttca aaactggata tctaggtaaa attttactct
300caaacatagt tttgggagta accaaacatt accctaaact gattttaatt
tcaaacatat 360acttttaaac cctcccactg gaaatccgaa cacatcctaa
gatgacttat tatatttgca 420tctatctaat aataataata aaagtaaatg
atttttatat tgatacttaa ttatgagttg 480ttttatgata tgtttattga
cttttatagt gattagtatc tactttaaaa atcatatcta 540ttttggacta
gctgagagtg tttatattga caatacatag aaattaaatt ttaagaataa
600gaaaatgata atcatatttt aggatattgg ttaagaataa aataaaaagt
tttattgaaa 660aaat 66421156DNAGlycine max 2aaagccaatc agatgctact
gagtaaaatc agaaaaaatg tgaactaaga gcaatgacaa 60gagtcaagaa tgctatacct
gctgctcagt catttgaaga aaagcaatag tatagtacac 120caattctctg
atattagcca caaccaccta tagaaatgca aaccatagat caaagtaatg
180atacaattga acacaatcag cagaaaaata agatttcaag aatgttctga
accaaacctt 240tcccagtctt ggattaccca caatggtcaa catgagttca
aacagctgta aataacaact 300tagataaaaa tgataattca aaaattatta
caagaaatga aggaaacaca aacaaaaaac 360catgcaaacc tgatttacgc
aagttttaga cagatctcag ccagtttatt atttaaaatc 420ctataattat
aattttttat catttccata caataatcca gcaaaactat acgacactaa
480taaaaggtat agatcaaaca aagaggttcc tttcaatagt aagctgatga
agccatcatt 540gcagtgacta cagtatcccc tgtgaattcc agtttgacta
atttgaaatt taggactgtc 600ttccttgcca aataacaagt taaaggatat
cattctagca gctgattcag tttgacactt 660catagtgacc tttctctctc
tggaatactg acaactatgt acagtttaaa ttaaacaatc 720aagttaacag
cagttttctt tggggcagta tgcagatgat acggtttcct ttgggacagc
780aacattggca aatgtaagag caatcaaggt gatgttgagg agctttgagt
tggtatcggg 840attaaagata aactttgcca acatctgttt ttgggcaatt
gggatgactg agcaatggat 900gaacagtgct gtcagataag ttgcagaatg
atgtcagtac tcttttctta tttgggtatc 960tctattgggg caaactcaag
acgtagtgag atttgggatc taatagtcaa gaaatgtgag 1020agaaaattgt
caaaatgaaa acaaaaatat ctttctttcg gggaagggtg actttgatta
1080agtcagtcct aaactcgatc cctaaaattt ttttcttttt tcaaggctcc
aaagaatgtg 1140gtggataggt aggtga 11563657DNAGlycine max 3atagttcact
aggtttgtac tcatgccata atatgccaat ctttcacagc attcatttcc 60tgacataatg
tttttagcac cttagtgcga ttttaatact gaaatatgca taaaagatgg
120tatcgaaatg aaagaagaaa aaaaatccaa taccaagtat gaagcggcat
gctttccaat 180ttccagtttt tttcttgatg gcaggctttt tgtaaatatc
aactgtccca tcttctgtgt 240ataaactatc ttctcccaca tcatgtgatt
ttttgacatc tcccatggtt tcacagcaga 300tttcctattt gctgcttctt
gttctctttg atatctacta caaatgtctg tttgcagggt 360gctggaaatg
agtaaacaaa ataatcggca aaaggtacac gaaaattaaa cagattgcta
420tcatgaattt catattataa atacttgatt tgagggtgtt tatgaagtag
gaatagcaaa 480gagaggttca gcaaagcaat gaatatgtta gctcatgaag
ctccaatcac aatccttcgc 540aaacacattt gatggcaaca ttgtgatttg
ggattattag tggtacaaag tgatagttat 600aatcacaaag aattaatgta
gaactgcagg tgacttagga ggtccgggtt cgacccg 6574659DNAGlycine max
4atttgggcgg cctacacttt tttgaaaatt aaaatatact tttgtgtagc taatttccgt
60tttgcatgtg tgtgtgtgtg taatggagag agagatagag agaaagagtt agtttggttc
120ctagtggcac tgaaattact accaaatctc caaaagtagc tatggagtta
tttaggatgt 180ctaatgagtt gagtttatgg tcttatttat atgtggaaat
aatgatttat taatccagag 240cagatgagtt aaaagtttct ctagagaatg
gtagtttcta aatgaataaa taggatagaa 300tctctagcac tcaaagaatg
aaagaatgtt tgcattttat tatcacctga gccaatttca 360gatatctcga
ttatttcctc ttaatatccc atggcaacat tcattgcgtt aagccaacat
420tttaaatgaa agtatctgtg atctccaagt ctttgatatt catttgtcta
ttccaaattt 480tggttccaac tggcttcgaa agctttgatc ctcctccctg
ctttcagcat gatttcctca 540ctcttcttga actttccata ctgaggaagt
ctgagacaca atggtaaaac tatcatggtt 600atggaatcca tgaaacaaac
atcattattt tctattaagc tctgaattgt agaaataca 6595372DNAGlycine max
5cattcatcag aagtgctcaa agcatttaat ttcagtgcca aagcactgca acttttcagc
60tcagacaaaa caaacttgct agcttccttg gtaactgtgg aggatttgat agacttgaaa
120accttctcaa cagattcttg agcacatctg cgtacctaaa aagcaaacaa
accacaacac 180aatttaaaca aacaaaaaat ccataagctc agagtaatga
aatgattaat aggcatggac 240caatgtctcg aaatcaaagg tcatatcata
accaaatgct gattgcattg taaggtaaaa 300ccatcaccaa atctatacat
aattggatat aaacaaggtt taaaactaaa gttgcagaca 360cattagcagc ac
37261448DNAGlycine max 6aacagaactc tatggccgca agacaagata gaccaaagag
gaaggacccg gcatattaaa 60agcagtgaca aagtaggaaa attgcactca tttacgatca
acccaggttc ttgtagatct 120agagtactag taaggttctc taatcactta
tggctcttaa tgttgaatag ccagaagtga 180taaaatcaaa tcaaataacc
ccctagggtc ggcctagtga cggggttttt ggtagcatgc 240acaaagtctt
agattgtaat cttgttgagt cattgtacac caaataaata aataaataaa
300attaaaatta atgtaaaata tgataaatgc aagtggaatt tatttccaac
taatttatgc 360tcgttctcaa cataaaaaat caagagattt gttgtgcata
actctttctt aagccatata 420tcatgactct tacctgctta gctgtcgcaa
aattcagtag cgcttcacat tcatcaagca 480tggctttttc cctagctgat
acaacagcaa cttcagacat taaactattc atgccctcca 540cctattccag
acacacacac aaaaaaaaaa aatggtgtca gccttaaggc tttagggatt
600taccattgaa tcaaaaagga aaatcattag gaagaaaaaa catacagtta
gtagaagaaa 660aaaagtttga tactaaatgt gtaggcctag aaaatagcaa
atgctagtgt gatattgtga 720gtcaaaccag tagaactcct aaaaaagtaa
cacaccccgt gacagcaaag ggcagatagc 780agatcccatt gcttgcatgg
catcaacagc tgaatagata gcatgcttca aatgttcaat 840atctgcctga
aaaagtgtaa acacagtaat gatgttagtg tctctttgca cgcatcgaac
900caaaagaaca gaaaccacca tacatacctt tgcccctcca gttagaggaa
gacgaagagt 960acttgcctcc aagtcttcta cagccccgga taaggcatca
atatgatcac tttcaagtac 1020agcccagtca tcaaggtagg ccatctgtgt
cattaattgt aaaaggacta caatttaaaa 1080aagtatcaaa aaaaagttga
atcatttaaa taagaaaatg gtttcatata tgacttgtaa 1140tcatatccac
cattaataat atgagttatg aataccatgt tatgacagac tagcataaac
1200aattaaacat aacttttcaa tgtgcagggc caacatcttg ctgagtatat
tttcctcatt 1260tataaacttc acaataaata tctctagtta aattaccaaa
aatgaaaatc gggaaaaaaa 1320aaaaagaaag aaagaaaaag taattgtaat
gtatcatcaa caataatatc gcacatagaa 1380tgataaatat ttcaggcaag
agagaagtat tacttgatca ttcaaaatag aattcagctt 1440cagctcaa
14487922DNAGlycine max 7agtttgctag gaagtgggtg ccattctgca agaaattttc
tatagaacct agagcaccag 60agatgtactt cagtgagaaa attgattacc tcaaggacaa
ggtgcagccc acctttgtta 120aggatcgtcg agctatgaag gtctgtatca
tatctatcag actagtactt gaacacatgg 180gagtttaaaa gttagtttaa
agcttattca gtgttaattg ggtgtttgac agagagaata 240tgaagagttt
aaggttagga tcaatgcact tgtggcaaaa gctcagaagg ttcctcaagg
300aggatggatt atgcaggatg ggacaccatg gcctggaaat aatactaagg
atcatcctgg 360tatgattcaa gtctttcttg gtcacagtgg aggtcatgat
actgaaggaa acgagcttcc 420tcgtcttgtt tatgtttccc gagagaaaag
gcctggattc caacaccaca agaaagctgg 480tgctatgaat gctttggtag
atttttttga gcagtttttg ttgttcctat gatgtccatt 540cacctttata
tgagacacaa ttccttgaca cttccaatta ttgctgtgat ttgcagattc
600gggtttccgc tgtgctcaca aatgctcctt tcatgctgaa cttggattgt
gatcattatg 660tcaataacag caaggctgct cgagaggcca tgtgcttttt
aatggacccc caaactggga 720agaaggtctg ctatgtccaa tttcctcaaa
gatttgatgg cattgatagg catgatcgtt 780atgctaatag aaacacggtt
ttctttgatg taagtcactg caagaaacac agcatcagca 840tagcatggcc
ttttctttga agcatttgac tatttttttt tggtagtgta agctaatact
900aactatttct tcttctttgt ct 9228730DNAGlycine max 8tccctattat
tcactgaagt aatgaataag tcgttgaaga aagttgggca tgtcattatg 60tcaaaatgct
tctgacttct gagggtcaaa agtttacacc tcttttctat tttcgtaaaa
120ttcctgagga acatttttct tctgacatgt aaagtgaaat tttatagctc
attgctgtac 180tgccgtttaa tatctgacaa tcattgaagt taattaaact
atctcataaa agttgttggt 240gatgaatgtc tggaggtgta agcgcaaaat
ttgcgaccag ttaatgaatg tcttatcaac 300gaaaatacgt gtactactaa
tcaaccaaca tatgtggctt aaacaatcct agtttgccag 360tagtataaat
gctggggtta cattatcagt agatgttttt attagagaac caggtcatga
420tcttcagttg aatattgcca caagtatgac atgtgttatg cttgtttttt
ccatcagaat 480agagtagtgg aaaaaaatgc taatctgtga caaatttagg
ttgtgtaagt tgaagtagtt 540gcatggaatg cgcttcatca tgatccttgt
gtcagtttct aattttcaat gttattttgg 600cgtaaacagg ttgaggaata
tagccttcgg acgcatctca tgcagatcaa gcaaataagt 660aaccaatctc
ggaaattcaa acacgcatga ccctgacctg cttctacgct aacaagaagt
720cttttgcagc 7309717DNAGlycine max 9tcttatttga tttctccaat
gatatgcata tagtggtgtt taaccatgtt tcttgacatt 60tatttgtgtt ctatttgatt
atgtgactgc tccagaatag agtatttatg caatatcttg 120gtaatggaaa
ctaacaaagt ggaattaatt aactcattgg agccattcat tatgattgtt
180tctttcaaat tgtctattga gttcaatcat ttgttgcttc tattgatttt
atttaatatt 240ttagtaggca tgatcggtcc aggggctgta gaaatatagt
gaaggatttg aaagtctttg 300acacattcaa tacttggttg cattatgaat
atgttgaaga caaattatag aagataaatc 360taaggagcaa ttttatatat
caacaagcag caagggaata tgttcgaaca gatggtgggc 420acataggttt
ggagctgtca tgagcttaat tcatttgagg atgacccata ttttaaccgt
480caaaagcaaa acatagaata aaaggaatat tgattctgtt tgcattttgt
ttggggtact 540ggctagacta gatacgtttt cctggtccaa tggaaacctt
tggatcgttg gttgatttga 600agttagtaat atgctgaagg aggaagctac
aagagaagtt tgcatttcac gatcattttt 660ccttatgtat aggttgtttt
ctattgttta ctcacatttt cagctgcagg catgcaa 71710807DNAGlycine max
10tgcataatca accaactgat atgacatttt ctgtggaatg gacagaaccg ttattatcat
60gatgttaatc agtagatcat ttgccatctg gcttccagat gttaatctct tagtagaatt
120tttcattgcc tagtattgag aataaaacag atttgagact taaggttctg
tatgcaatac 180aatgaattgt ttattagcat tgtctacttc ttgatactga
tggttgtcat tacagtaata 240tgtcgaaatc tataataact aataatcact
taagcaacaa gttaaattct gtttttggta 300ttttgtcatg ggtgtcttta
atgcaagttt atatctttga tgcttttttg gttttatttt 360tacaaaatag
tagatgaagt tcattaagat gtttttcctt attgattgtg aaatggaatg
420catgataata tttggtgttc tgtctacctc tcctgaatta gaccagtcag
tttaattctg 480ttgctctctc tgttttattt tactctcaat ctttgtgagt
ttttcggttc acttgagttg 540tactgtctct agaaggtcct attactttat
tggtcaagaa aaatatagaa ggatattaat 600ccaaccttgt gatttgtgtg
cattacactc atacacattt tatgttcata tagtcatctc 660aatctaatta
gttgttctat gcaaactttg ttgtggaatt gaactgcttc ctgctgtgca
720tttaacttgc cttgcttact gttctccttc tgtgtgtctc aggttagaaa
ttctttgaat 780gctcagcctt ggatccataa taatctg 80711839DNAGlycine max
11caaacttgca tgcctgcaga ctatatggca ccgcccctat agtgcccctt ccaacgatta
60cctagaattt aatttgatct tctgaaggta gatgaataaa taaataaacg tgtgaaaata
120aaacagtaag tacatgccag tacgtaataa tgtgaactag tttgtataca
tgaatttagg 180tccaatgctg caaaagacct agttagactt ggaacataaa
aggatatatt taaatgactc 240tcaagattaa ctaaataata cacagacaaa
tcagataatt aaactgcacg gccactaagg 300gatcagcata tgtgaaagtc
tcagagagca gacatgtcgc tagttatata taaatcaagc 360tgattttatt
atctatatgg gaatcaaata caagcttaat tctcttttgc tagcttcaat
420ttggatacac ataattccaa cctccaccaa ttgataacaa atactagtaa
tgtacacatg 480ctattgtgcc cccgggtagc ttaactcttg aaaaacacat
tctcgtggca tctcttgacg 540cacaccctcg taattcgaag caacaagagg
aggataaatc agagaactgg tttcacccca 600atcagtactt tgtccacaac
ttgaagaagc acaagcacca cccattatct tatcggtaat 660catttcctcc
ccatagaaga actccaacga cccacctaat aacctttgag aataatcatt
720agcagaaacc atttccccaa ctgtagtggt agtgaaactg ctttgtcctt
gcaccaactc 780agcacggtac tcattgctct gaccaaactg tacttggttg
ttgccaatag agttcatgc 83912751DNAGlycine max 12aaacactggc ttcggattta
tcacttgtaa gtagagtttg ctaactaaaa tgctttgtca 60ctctttattt tcaggttttg
ttctgatatc caaaatcctg gctatgttga ttgccattca 120aattgtaata
agtctacatc tcaagcgtct ttgtttttgt gttccgactc caacagtaga
180agaaatggtg tttttggtag accactttgt gtgaacccct ctggcaggag
aaacctagtt 240ggtccagctt tttattctct ggagactagt gcttatgacg
tggctgcttt agaatctcct 300tcccgtgttg cagaagaaaa agttggtgtg
ctgcttctca atctaggagg accagagaca 360ttgagtgacg tgcaaccttt
tctgtttaat ctttttgcag atcctgtatg ttagtttgta 420tttgtgcttt
ttctactgtt gatttttctt tttcctgttt atgtaaattc cattagcatt
480agtacatgtt catatgattt gtatgctaat gtgtttcttg tattgacata
ggatatcatt 540cgtcttccaa ggttgtttcg gtttctccag cgaccattgg
caaaattgat ttctgtactt 600cgggctccta aatccaagga agggtatgct
gctattggtg gtggctctcc tttacgaaaa 660attacagatg accaggtgga
gtttaaattt tttggttttc ccattatctg ctttgtggag 720cttttatctt
tctgcaacat gaatcttttt t 75113663DNAGlycine max 13aatatgaaga
agtacctgct ttaccattca acttagtgac tgtaagtatt caatactaga 60gccaagttca
cttttctatt tagctacata ctaggggggg ttctcttggt aaaaagaaac
120tatctatact tatatgttat ggaattacat gactttcatg atacaaatca
catgaatatc 180aatagttgca agtagttctt taatgattta tatttcttag
gaacatgact tgtgcataac 240ttctttgagg tcaatccacg gcttagagta
attctgggaa cccgtttgca tcattgtaaa 300caggcatttc acactttcga
atgcatcaaa tgaagcaaca tttttttata attggcattc 360aatgtccatt
tggatggttt gaactgataa ccatttggat ggtttttaca attggcatct
420gtgtcttcag gaaggggact tgagcaggac agcttccttt gcagatgatg
gagaggtttt 480agatggaata attactcgaa gccggggtga ggttagacgt
gtttgcagtc caaaggtgat 540gaaatccact ccaaacctat cccaagagtt
aacaagtcca aggctcacag ataaagtata 600cagccctcgg ataagccatc
tcagaggaaa tcaaagccct cgaggtgttg ggagaggatc 660att
66314713DNAGlycine max 14cttaagtctg aaaacaattt gtctacttgt
acataatctt tatcatggac aaagtatcaa 60gaacagaaaa tattttatat attgtctact
cttgcctcat tcttacacat ctttatttta 120tttattttgt ttcaagttat
ttgttattga aaaagataaa agtgttaact gtttttttaa 180taatattcta
atttaaataa attcaaacct atatttgagc tcttttttta atgaggataa
240tttaatattt tttatgttat aacttgtgta attaatattt ttttgagaga
actcagcaaa 300aaataaataa atttttgaga gaaaaataat atttttttta
aagaagtgtg tattatttta 360aaaaataaat aatatgagat ggaggcaaca
tgtgatttta acaatgactt gtaacatcta 420taagctcaaa atttttgaaa
aatgaactgg cgtaggataa aattaaacta cctggataaa 480gcaaaggttc
ttcccaattg gttatttaaa gcaatcttct ttgtataatg gataccataa
540cttcaatctc ttaactacca tgatttgatt gaagcgatcg atctcacaaa
gatgttcctt 600tcaatattct taaactcaag tacaattttc cctcaaggac
ccactatgtc tatattccat 660tggattacat agtaaaagca aaccaataat
ttctctacct ttagctgcat ttt 71315534DNAGlycine max 15cttgcatgcc
tgcaggagac tttgagaaag cacacttcag ttgtttaccc gatataggag 60agatacaagt
taaagggtta atggtaagta ctttactttc tgtttagtat ctatgcatcc
120ttttatgaat ttctgcacca atgagttttt gctcaagtta ctgcacattc
tcctaggtga 180agcgaaaatc atgcttctac tacctgcctg attcgttccc
tatacagtat gctttccctg 240caatcagagg aacatggttt cttcatgtgg
aagtgaagca tttaaagcgt ttgcgcattc 300cgtgtccacc tggtgatgca
gctgttcttt caaaacataa ggacttaaag acctgcaatg 360gtgaggataa
ggcaaaatgc aacagtgagg aaaataaaat ggaagggttc caaccccgtt
420catgttttgc agaagagcat gaaactacta atcatgtttc aaagaagctg
aacaaaaaga 480gaatttctaa tgaaaaccac acgcagaatg aagccactgg
aatgccagaa agat 53416433DNAGlycine max 16ttgccaatgc agctgctggc
ttgagtgcag ccatggcagc tcagcttgtg tggacccctg 60ttgatgtcgt gagccagagg
ctgatggttc aaggtgtttg tgattcggga aatcctaagg 120cttcagctct
tcggtacatc aacgggattg atgccttcag gaagatcttg agcagtgatg
180gtcttagggg cttgtatagg ggttttggga tatcaatttt gacctatgcc
ccttcaaatg 240cagtttggtg ggcttcatat tctgttgcac aaaggatggt
ttggggtgga gttgggtact 300acttgtgcaa gggaaatgat agtgcactga
agcctgatac aaagactgtg atggcagttc 360agggagtcag tgcagcagtg
gctggtggca tgtctgcttt gatcaccatg ccactggata 420ccatcaagac aag
43317554DNAGlycine max 17aaaaaaaagg acaatcatta aacacgtatc
taaaatgcat ttcatcaaaa tgaaaaatta 60tgcaatactg aaaatccatg cgtgttataa
aggcaaacaa aatgaacttg gagagcaatg 120caacaaagta ctttttacag
tcaatgtgca ctttaaaaaa tagtatattt catacttaca 180taaaagagct
gaatgagtgc aagacgtacg aaagaataaa atttcaaagt gccacctaag
240tcacagagtt tatgagaaac aaactgtgag ctttggtcag gtaatatcca
ccacaatgca 300gggatgacaa ccgagtttag gacgaatata ctgcacaaaa
atttaaaaga tgttgaaatc 360attaaacacg tagattttag attcatgatt
tgttcaggac aatcaatcca tggatgacaa 420aaatatgtac aatcagattc
cttcgagtca ttatgtcaaa agtatacata atccaatttc 480tttgccacaa
aatttcattc actgtgttga aataaattga agctagtttc acttctcctt
540ctgcaggtcg actt 55418810DNAGlycine max 18tgagtttaat catgtatctt
ctttttcaat gcttttggtt ggacattaaa gccatatttg 60tttggatttt gtgccatata
ctatcaatcc aattttatta agaactaaga cctactagtt 120tttcaaacaa
ggtcagcaat actcaaaaat aaattgccaa gttggaccct gtagttttgt
180aaatgtatcc caacaattat aattaaaaag tagttgtact gtataatatc
tagcaaattc 240aaaattctaa agtcaatttt ttactgtcta atccaaatgg
acgctaagat actagactat 300tgatactcac agaacattat ctgtagttaa
catgaaaaat gtagtttgtg gttttgatgc 360ttcctttttt attttattta
agtgacttag tttgtagatt ttactttgca gggagactat 420catgttgaca
gtgaattttg tggtacggac agtgtacagc tgaaaggatc tgagattact
480gctgaactta agtatctctt aaacttgttg acattgtgtt ggcacttttc
gaagaagccc 540tttcccttgt ttttagaaga aactggctac agtgaagaaa
acgttctcct tcgagaagcc 600aaagcaggag taagtcttgt ttgaatttta
ggaaaaaatg ataataattc aatatctgta 660ctgtgttgac taagtcattg
atagttatta acacacattc tcttttgagg aaggatggaa 720ggctgaaagc
acaagagatg ttgttttatt tagactgata aaatatggga taaaaaattg
780atgatagatg ccttcttttt gcttcacttt 810191222DNAGlycine max
19aggtgcagct gcccttttgt actcacattg aatcggaaat tgcagactca cgatgcaatc
60gacgaagatg gtgaagaaaa tgggagtgat acgcccactg atacgccatt aggtattggc
120cgtgtttctc atcggttaat ccaagcccct gcaacatggt tggagacaat
ttcaacattg 180tcagagactc tcaggttcac gtattcggag acacttggga
aatggccaat tggggatttg 240gcgtttggca tcagctttct tctaaagcgg
caggtaatga caacgagtag attttggttc 300tttattgttg ccctgatttg
aagcaactga aaatgccgga aagctgtgtc gtttttttat 360ctatctgtaa
ctttggacac atttaagtag taagtagaat atagaatcag tatttagtgt
420ggaagccagg tgcatatttt tcaggtagaa ctataattga tctgaaatgg
tagttgcaac 480ctgcacttaa tgtgcaactc acataattca cctaaaggat
tgcccgtcac tgacattgat 540gaatgaaaga gagagaaata tagagaaagt
aaaatggaca atggtcatgg aagttatgcg 600agttaggtga actttttcat
gtgtaaataa aattgttgat gattaaatgg ttggaagacc 660aatttaatgc
tttccatctc aataaaaaaa attatggtcg ttaaagaaat aatcccacat
720ttggagagca gttataattt atattaactc
ttaagtgttc cttacatgtg ataggatctt 780tttttcatgg gtgttgtttt
tgttttgctt tttaagcctt ctcaacatca cagcagtggt 840ttgggatcat
gacttgcaat atttgaactt cttttgttac ttgttaatca tgatcacttt
900gaagttgaca tattagatta ttcttggatt ttatgattta catgatatga
tttctgttta 960tactttctag gagagtaata tggctaaggt agcttagaaa
tcagactatt ctctacaaaa 1020tgaatcctca caagattgtt cacatgacct
gtgctacttt atatatttga ttttgattta 1080aatcatatat taagcatttt
ttaaggtaga tcagtttcat gacatcctgg actacttaat 1140ttcttcatct
cagctcaaca taaatagatg agaagttgct ctgtaatatt ggttttgtgc
1200cagtttaatt tgttcatatt aa 122220682DNAGlycine max 20gcaacaaatg
gatgtacaca gtccgagccc tgcataattg gagcaaattg catattccaa 60cccttaatga
gaatacagaa atcaaaactt gataaataaa atacttcaaa tttgcccaca
120ttggttgcta atagctattg cagagtatac acaattattc cagtaataca
aacttgcata 180ttaccacaga actcttaatc ataccaacat taaaatagtc
tctgtagcca ggctttgagg 240gcacaagaaa gtgcaacaaa tagttgaaaa
aacactgtat gttgtggttc actaatctct 300tatgaaacag gttatgtaga
aaggcttcag ttgtcaccta ctaacatcag ttcaccttac 360acttgtaatc
gtagctacat cttgcttccg gaaacaaagc tatgtatcta atcactaata
420atgacatcaa agtatgaagt aagatatacc ttttcttcag aggtactctt
gacgaaacac 480aggaatgcct ctgagaactg agagccactg gaaccggttc
gcaagtcagc aacgcaaggg 540cattcaaggg ctttctgagt catttcctcc
acagactgaa aaatccaaaa tttccaattg 600ttttcattcg ttcaacctgc
caagttgaaa gcaaaagatc aaagcaattc aattcaaacc 660tcagtgttct
gatttccata tt 68221681DNAGlycine max 21actgaaacat tcgaaattcc
ttacataatt tattcttatt aaaaataaca gtaatctttt 60gacttgaatt ggtacagaag
tacaattatt ggttgaggct tttatttcac gcataccaca 120atgaaacaca
tttcaatttt tcttacccct ggttaattta atgtaccgaa tttatacatc
180aaagagaaga taactttcga agtaaaaatg attatcctaa acaccgtatg
ataaagtgta 240taagattgtt caccattact taggtttttg gaaatgtcaa
accttagcac tatggtaagt 300ttgttgcctt gtaacttgga ggtcatgggt
tcaaatcctg caaacagcct ctccttaggt 360aggaacctca tgcattggac
tgccattttt gttgctttgg tgtgagcata gatactgctg 420aattctttga
gatgccactt ctcacttatc ccatttttaa cttgcaatct taacttatgc
480ttgtaactac gttcctgaag gcctagaaat ggtgggataa ggatttatgt
gttgtttctt 540gatggatgtt ttgcagacct tcatgctggg tggatcacat
gtcactggac aatgaaacag 600gattggatcc accaggcata agagttaggc
ctgtctctgg acttgtagct gctgattact 660ttgctgcagg tcgactctag a
681221002DNAGlycine max 22tgagagcttc cattcagaac tactatcaag
tactgacagt tagcttcaat acttcattta 60taataaacag aataatcgct taaatgaaat
tggtttagtt tcattcacat taatttcagg 120cacagtgctt tagatgtaat
caattcaggg agctgagaaa gaaattccac aaaccctcag 180attttaaaag
ttgaacatcc tcagcttgct gcatcaatta acagtaaaaa agaaaggaaa
240tgagaaaaaa tgagatttaa gattttatag caatttcatg tgagatatta
gagcagatat 300gagagttgta actctgaaat ttcaactcac tatccaattt
tcttccacag tattatcatt 360gactccatgt aggttattaa ctgagttcag
ctgaatcctg tcagttggat ttgagaatga 420ataatgatgt tgatatttat
gtttttatgt tccaaaaggc cacctttggg caaagggaat 480acaaacaatt
acaatgaaac aagtaattat atacagaaaa ctgagaaaag aaaaaaatca
540acaaatacct gctttctcca tctgaaaatg agcaactcag ggcagggagt
tcagatgata 600actgcttagc atgatcttta gaaccagagc gacataaatc
cacaccagaa cttgactgag 660ataactctga gtgattctgt gcatgaaagt
aaattaaata ttaagctgaa tttaggaaat 720aacgtaatta ctttatcagg
aaggaggaag gagaaagcaa aagcatggat aaaaaggaac 780ttctgcttat
gttgctttgg acacaattta taaattttgt aatatgttta gatgttaaag
840ctgaactaac ttcaaaaaca gaatgagact caacatttag cacactttca
agcaaaaatt 900gttcatgaaa atatcatcag ttcatcactt ttgaagttaa
tcaaatgttg cctgcttact 960ggtaatattt taccaatact atcagcacaa
gtagttttat cc 100223785DNAGlycine max 23cgacttcgta gcctgcagca
agcaaaggcc caactccgtc cagggacggt ccagtacggt 60ccaagatgat caagctcacc
actatgcaca tttctttcgc ctcccaaacc tgaacttgct 120gcaacagttt
taaaagaata ttaatattaa tattaatatt aaagtttcct acagtaagtt
180attatttaga ataaacataa aaaaaattta tatgaatatt tatttttaat
aaataaatag 240tattcataaa aatgctaaaa tcaagcaagt aattttccta
caattttaaa atttgcaaga 300aaattacata caaatttaaa atccacaaga
aagataaact gtgtatttta aattcctgaa 360aaattaaata caaatttaaa
ttccgaagga aaattactag caaatttaaa ttctaccaga 420aaattatttg
caattcaccg aaaaattact tgcaaataaa ttatctgtga aatttctagc
480agattctttt agtaaaactt tatttataga cacaccactt tttatgtaaa
acattttgcc 540gcagaaattg ttgtatttgt tctagaaaaa ttagcaagaa
attttctatg agtttcaaaa 600ttttcaaaaa attaattatc tactaaggta
ttatttagga acccaagtat tggaaattca 660caggtaatta gtaataagaa
aaattctata agatatcgta aaaatataga tcacaataaa 720gcaagataaa
cgtacgggga aaaaaaaatg taaaagggaa tctatcttcg tataaactaa 780cgtat
78524805DNAGlycine max 24tattaggtca gccattatga caacatcgga
tatattcgac aatactaagg aactcatcaa 60ggacattgct gatgattaca aaccagcctc
tcctttagcc ttgggatctg gtcatgtcaa 120ccccaacaaa gcccttgacc
ctggacttgt ttacgatgta ggagttcaag attatgtcaa 180tcttctctgt
gcaatgagct ccactcaaca gaacatctca atcatcacta gatcgtctac
240taataattgc tccaatcctt ccttggatct caactaccct tctttcattg
gtttcttcag 300tagcaatggt tcttctaatg aatcaagggt agcttgggca
tttcagagaa cagtgaccaa 360tgttggggag aaacaaacaa tctattctgc
taacgttaca cccatcaaag ggtttaatgt 420tagtgttgtt ccaagcaagt
tggtgttcaa ggagaagaac gagaagctaa gttataagtt 480aaggatagaa
ggtccaatgg tcgaaggctt tgggtatctg acttggacgg acatgaagca
540tgcggtgagg agccctattg tggtcaccaa tcaggcaccc tcaaattcaa
tttccatata 600gatcaatttt gtgatggata aatgtttttc atatgtttga
agttaaaaat atatattaat 660agaggaaatg ttcgtacatg aatgattatc
atttctgata ataataataa ttttttttgg 720aaaagtttta acaccaattt
taattttttt tttcttatca cgcacaccaa ttttaattgt 780tacgtactga
aataatacgt tagtt 805251222DNAGlycine max 25tagatctgca ctcgtgaatg
ataacattgt tgaattaagg attttgatgc ttgatgcttg 60atgcttgact atgagagaga
atactattga aaattgaagt gaatacttag aagaagttca 120tggccttgga
atggaatgat catgtgaacc tcattacctg ccgacttggc actgcatata
180tggatctaat tcaagtcctt ttcatcctcc taaatgcctg tcccttcttc
tttagttctg 240atcctcaact tatccacatt agctttcttt ttctagtatt
tacaaggatt gctaaaatta 300attttatttg taataataaa aatgtttatt
attgttgtct ataattatta ataaatacaa 360ttactcgttt tagtgtacat
atttcttatt tctatatacc ctttaatata ttaattattt 420tcttcataaa
ccttcaagat gtaactgttc taattttttt ctaaaaaaac tgttatcaat
480actttcttta attgtttccc ttttttaaaa taaagataga agcatgaagt
gtctcatttt 540caattattta aataaacaat actttagtta gacacaagtt
cgaactataa gtttcccata 600attttgctcc attatatcct acaatttttg
tgaaatatat atattcttac aagataatat 660tacgcacaac ttttcatcaa
aatgttacaa acaactcgag cattttagga catttttttt 720caagtaaatc
ccaggccgaa taatcatcaa cctatgttac attcaccccc aacataaaaa
780ctaacggggg aagatatcta ttgttagtct gtacatttgt tagtgcctga
tctctctcgc 840ctacacagtc gcttgttctt ttaaaaaaaa ccagttagtc
accgtttatt ggtcttctcc 900ttgcctgcaa acaagtttgc cttgtgtcag
aattaagcat tactatagag aagcataatt 960ttcttaaata agattactca
ccaaatatag ttgattttaa aggaaatcga attgatgaac 1020ccttaaatct
cagctcccga ttatgcttgt ttctattttg tttctcaata gcactggaac
1080tattgctagt ttctccggtc agaaagtttg ccactttact taccttttca
tggtacacag 1140caggtggggc aagcttcaat ggaggcaagt tttctatttg
cataaatctc tgattcttct 1200gcaagctgct caagatctgg aa
1222261177DNAGlycine max 26agtatttttt aaagtacagt gagaaaatgt
aaaataaata aataaataaa taaattatct 60tagctatcat attattgccg ataaaaaaaa
atgtcttggc tatcaaagct cttaaagctt 120accatttagt acggatcctt
ccgtggcatc tttatacgcc catttacatg catctatggt 180actttcagat
gcgtatctaa aaaaaaaatt acccaagtta agtatgtata tatgctttga
240ataataatca gagacaacta aagaagctgg tttctttcat aaaaaaaaaa
gaagctggtt 300tctgttgttt ttctagttat gggtttttgg gatttaaata
aagaactcat ttttaagcat 360gtgataggat ggatatgcca ctattttcaa
catcagagaa ggatattata tttttatatt 420ctaaaggatt attttaatac
tattatatgt attgtattta aattatttat aataaaaatc 480ttaccaagaa
aattgaaaga tataaacgtg aaactcgcaa aagaaacatt atagaaataa
540ttggatttgg gtaaatgata tattaattat attattaata ataatgggac
atacgtagct 600gggcatggaa ggtcattatc accgcagttc tcccattctt
ctatctgatc ggcccatact 660ttctgtttat ttagataaaa ataaataaaa
aattgaagat atacaacctc aaacttcaca 720acccaaatcc ttatttagat
tgatgattaa aacaaaattg catacaatac cgtaatattc 780tgttgaagtg
catcgatgaa ttcgttcatg tcagagtcat aaaaattgtc gacttctgtc
840tgaaggatgg tactatccca aacctgtatg tatatgaagg taaaattaaa
catatcatat 900tgtcgtatat atagtatgtg aagacaagaa atggcaagtt
ttaatgcatt ccttcatcca 960gtttctaatt taaggactca tatttttatc
tcaatacatc agattttaaa atgcacacat 1020tcgagattta aaccatctga
tcttcaccta acggtgtcga tctatgaatc catgaagaaa 1080aaattgactt
acatggtgaa gattttgctt ccttttatac cagcgaacag taattgcatt
1140gcccccatgt ctgagagaaa gccacaatgt agaggct 117727685DNAGlycine
max 27agtttgcatg cctgcagcca agctctcgtg gatttggtgg tgttgttcgg
gtaaattttt 60cataatttta tattaaattc ttatgtttct tgatgtgttt tgcaccaaat
tcacctattt 120tgggcataac agacacccac cattgcctgt tcctgctgtc
tgtgaaagtc agctgcttta 180cagctgatgc cgtgggctgt tgtagctctg
tcaaactcat ggcctcttaa aaaaaacata 240ccccagtgtc ataaggctct
tcactatgcg aaagtatggg agagggtcat tgtatgtagc 300cttgtccttg
ctgatgcaag gaggttgctt ccgaattcaa acccatgacc aactggttag
360gcacaacttt actgttattc caggactcgc cctctagcca aaatgacctt
aacaaaaaat 420agtctctagc taaatgaatt gtgtcaatgg tgttatttta
aaggttaaac aaatgtgtat 480agtccatcag agacaaaaga gtttacacac
taaaactgat agcataaatt gtcacaggct 540gctattatgg atatacaagt
tgttccccat ggttttctta catgcggtgg ggatggaatt 600gtaaagctgg
tacggctgga aaataacttg cttggccatg gaattgagtt atgatacttc
660tgagatcctt tgggttgatg acaaa 685281343DNAGlycine max 28cttgcatgcc
tgcagaaaat tataaaattc ataaaacgct tctacagaaa attcaaaaag 60attgatttgc
tatatatcct atacacttgg acattttagt cgaaacctct atggatagag
120actttaagga cacaatatta tgaaaaatat tcagctcaaa tattataaaa
tgttaaaaaa 180caatgcatct caatattttt ttgaaaggtg tacttttaac
atgtttatga attgtacttt 240cttggctaat gagtgttttc cctggttaat
gttatggatt gtactttgta ggtgtactga 300tatatttttt ttcattaaat
actaatgttg attattcaat tttagaacag tgtacgcata 360gtatatgact
gttattgata aggtatttgt tatcgataag gtgcagttaa tataagcaga
420gaaagaaact aaaaggttaa ctacatgatt aagcttaagt gatgaaatgg
ataatccctt 480aaaagtcatc catgatttgt atatttggtg ctttgcaaaa
acaatcattt aagttcgtct 540tcaaaatcta gtaacagatt acttaaccat
tcttttagat cactacaaat atagtatttg 600tttttttaga aggaaaaaat
tgggtctgtg gcatcttaat ttttgtgatc acatttttgg 660ttctagtgat
actaacttct tagttcttac aatgtatgta tattttttct ttttacaaat
720gtcactcttc tcagctggat tcatggtttg aaaactcttt cctcataatg
tcaacaggtg 780gctccctata atacatttta ctcccaattg gaaaagcata
tgaatgaagt tggaattgtg 840cccacagtta accgatggga tgagcctcta
gcattgggca tggttgatcc ccatgattca 900ttatctcatc cagcaggtgt
ctctgatgtt caagctgagt ctgctacacg ggtggaccct 960gatcagttca
ctgattttgt ggtatgaatg tttctttaac attgacttgt aaggaaagta
1020aaatagtgga tatatgtgtg cacacatgtg tatgcgccag tagatggtat
ctttaacatt 1080catatatgct ttttctctgt ctgtattgtt gtcatgcaga
ttccaaactg gtttggagga 1140gagtccactg gggctacaaa aggcaaccca
ttcacgttac cagatgccta tatggtatct 1200cagcataaaa atgtatgtgt
ggtattaatt gatttgtata ttaattaacg attggatttc 1260caaatctttt
ggtgatcaaa ttttcaaaaa actttatttt aagcgaaata tgttttaact
1320acaatgaaat tgtatcttct tct 1343291062DNAGlycine max 29aacttgtgat
tcttaatagc cttctcacgc tttttgttgt caaaggttaa ttgatgcagc 60tttccatata
gcagataagc actatataat tacgtatttc aaactacaca ttagtaatta
120tgtcaggact tttgattatt tctgttggtc aaatttagaa tatggtacta
agttaatcta 180tagtaaatta aactaaccct ttttgagatt agatataact
ctctactttt ttttaatatt 240acattgacat ccttatacag ttatatatat
atatatatat atatttaaaa taattaagga 300agatttatat gtataaaggt
gtcaatgtaa actatatatt tttaaatata acaaaaaaga 360atagtgtagg
ggtatataat aagaaaaagg taacgggtaa atttgatgaa atttcaaggg
420gttaatataa tttaatggca ttcaactggg aaagcaatga gggatgatat
tgattggtgc 480gttgttggct tctaatgtgt ccaaagatgt gttacggaaa
attgagcacc aaaagtaccc 540atagtggttt agagaagtta ctgaaaatga
aagcatgtgg tccactctgt ttgatcgatc 600tccatttctt taaagaattg
aatcaaactc tattattaac atactttctg gttccagaat 660gaatagatat
aactagactt gttttatctg acaacaaata ttattccatt tgataaggac
720gaaacttatg ttcaaattct actttgttag ttaatgtaag aattttattg
atagacgatt 780tgaatgtatt tgagtacgaa ttttttgaat tgagactcaa
attaaaatat gtctaacctg 840atcagtgaat tattgagatc taatttacct
atatattttt ttataaaaaa agaattattt 900tatctaaacc ttttgaaaaa
ttaactactt atacatactt tttcaacaac tactcctacc 960ttagtattct
gacctagggc ggctcaattt tccttttttt atttatcagt atgacatatt
1020aacaaactcg gctgcaggac atgcaaggct ggcggtaaag ga
1062301095DNAGlycine max 30aaaagttagt agaatttcgc cttaggtggt
ttgggaatgc atgaagaaga cctaaagaag 60cctcgataag aagagcagat catatggagg
gcaatctatt tccttgtatc tctgtattgt 120atagagaggt gcagtattca
actcactctc tcaagttagt agttggcata ctgtgtcagt 180ttgtaaagtt
agttattgac agttgtcata actaactaac tgtttcctaa ctatctaact
240tcataactct ataaatagag tgttgtaact caggattcat taacctccat
aatattttct 300cattccattt atcttctttc ttttctcctt tttctatgat
ctaaacagag ttctaatgtg 360atctattagt tttctattat ggtatctaga
gcttggtgag atcttcaatg gctgcgaaca 420gcaccacatt cctttccgct
tcttcttttt cccaattcca tatcacataa acttgatgat 480tcaagctttc
ttctatgtcg tcaacaattt gagcctgcta tcaaaccaca caaacttaac
540gattcgttgc taatcctcag attccacttc gatttctctc tgaagaagat
caagaagttg 600gacgtgaaaa tccagcttac gaagcatggg aaaagcaaga
tcaggtgtta ttagcctgac 660acaaatcaag gcacttgctg atgctcttgc
ttcagtagga agccctataa tgattcaaga 720gcacattgat tcaattgttg
aaggtctttc tccagattat cacccgataa tcgagataat 780ttagagtaag
tttgaaaccg ttccaatcac gcaagttaaa gcacttcttc tagctcatga
840gtcttgtctg aataacttca acgattaatt acactcatgc acaacacaga
gcgaattcgc 900attcctaaaa ttacactttg ccaaaaaagt caagttctca
atcggatcct gaaagttttt 960tctggttttc gcggtggttc tgcgtgcggt
agctataata ggggtggcag cagcggcggt 1020agtggtggcc attgtcacac
tggtgcaggt caatttgcct atttccaatg ccaagtctgc 1080tttaaatttg gtcat
109531508DNAGlycine max 31ttgctcaaca tacaaaacct accacgcatt
gttaatttgc aacttaatct ttcgttattc 60tatcttgtag gcccatccat gtttgtccaa
gaatgcctcg accgcttcct taatggccaa 120gtttggcact aactgtgatg
gatcaagggg ttcccgtgtg attgggtcga atttacccac 180ctgtttaaga
agttgaatta ttgcacactg acttcattga agtggaaagc attcaacaga
240gaaggtgaaa catgcaaatc aggttacctt ctgaagatgc tcaagaatca
ctgctctctc 300atatgtaagt ccacttggag tgattacagg atcatggaaa
atgtcgagtg taattctaca 360gcacaaataa tctggcacct gtagactcaa
caagacaatt tgtgtaatta ataagatggt 420aatttccagg gcaccactga
aaaaagattt gatgaatata gaattaacaa aacaagccat 480gtacacacct
cagtaggtgt gtcagctt 50832580DNAGlycine max 32gcgtgcctgc aggtaaaaat
ttatgctatt aacgaagaag atgttaatgg caggtacgtt 60tgttattcag acatatcatg
caaaatataa cttgcttagc agcttcacag aacccaatag 120ggcatgaata
atatttccgt ttaactggtt accttaccag cagacgcaaa aaacctcatg
180ttagaccaca accaacacat gtggccagaa tcaataagca tagtaatttt
actaacagag 240aaaattttta accatgttca gtgtggcaaa tatatgctta
tggggtacag taaataatga 300taactatgct accaaagttt catgcattgg
gggaggataa atgaagggaa ttgtttgaga 360ttattttaag gagatacaaa
taagagatct ctagttaaca taacaaaatc cttcaattaa 420tgagcattta
ctttttttga gctctccact tcggagtatt ttgtaagcta aattacttta
480tacactttct ggtgcttgtc aaaattgaat tttaacattt attagaagag
cagaaattta 540taaaaacatg tcatatttgt ttttttatta aatcctttat
58033815DNAZea mays 33agccccctga aacattgagc tattaaaaat tagaaaacga
cacaccttca gatactttct 60gattctaaat ctaaatatga aacctcttgg tccttacttg
caagtaaaag ctacatatta 120tacaagtaat ttgatgacaa tgagctatgc
agtcagcatg gaaatgtcaa aacctatctg 180tgagaaaata tcaggagacc
tagcattagg attcttttgt tttatttttc tactgattga 240attcatgcat
gacttaccta gtgcaatcta gatgaaaagg tctaacattt tccccatctt
300aattaagcct gctacaattc acaactgggg agatagaaac tcaaactatg
gcagaggcat 360aagtgtgtat tagagtcaat ccatctacaa tgaaaagtat
gctatattca taggtattgg 420catttcaatt aatcagacaa acagtagtta
ctgcacgata aaacaactga acatcaactt 480tctagcattt tgctgagata
ggacccccgc agggtacata ttacaagaca ggatatgaag 540gatggagtag
catagcatta ttctagcata attatattaa cataacatga ttaggtcagg
600tcatactcat ccggaagttg tgtttcgtca ccatagtaac tttggcagca
aaacaaacta 660gacagaacta agtagtgaaa aagaagcaca gatcgcagag
agcagttcac atattttact 720actaccaagt agcaagaacc atgtttcatc
accaccctac ctttggtagc aaaacaatct 780acatagaact aattattagt
ataaaagaag ttgaa 81534763DNAZea mays 34tgtaagttat ctacttattt
gttccctatt ttcatttatt tatttaaagt tgagatttat 60ccaaagtatt gttagtgtaa
tgttttttct tctgccacat taggtttttg tcaatgcacg 120gcttcatgtc
tcaggagggg ctcttggggg tggtcgcatg gtagaagact cctcaagcgt
180tgcaggttcc atattttaaa ctttctttga tgatcacatt tttgtagtat
tcttttttta 240cgtaaaacaa ttctgtggta ttcaatagca aattatacac
ttcttacaag tcaacatata 300gatttcacta tctcagtttc tttggaggtt
actagcagaa aagtaattta gaaatgatta 360atatatttta ctgaggttct
caacgttgtg tttttcggtt gctggtaaat tgctcatttg 420ctgctatagt
tgattagtaa atatagcagt atttatgtca ttactggtta cttgtaatgc
480aaaccttttt cattgaagta catgttctgt aaaatactag aacatggtca
gtactttcag 540cattggtcac taacttttat gttttatgcg agtaataata
tttctatttc catgtttatc 600tacttagttt ccatgccatg ccgccttttc
agtattggac actgctgctg gagtttggtg 660tgataccaaa tcagtagtta
caactccaag gacaggaagg tatagtgcag atgcagcagg 720tggtgatgct
tctgtagagc ttacacggcg gtgcaggcac gca 76335750DNAZea mays
35tgaggcaatc agtgctactt tagctgctgt aaaggctagg caagttaacg gtgagatgga
60gcattcacct gacagggaac aatctccaga tgctgcacca agtgccaagc aaaattcaag
120ccttataaaa ccagatcctg ctcttatgaa caattcaaca ccaccacctg
gggttcggtt 180gcaccataga gcagtgagtt gaaaaaatag ttcattttgc
tgcttgttgt ttaaatttag 240ttattctatt cttatttaga cattcagtct
gtttaactta gaagtcatca catttacatg 300aaaaatgctc ttatttgttt
tgatgccagt tacatatttt ggccttgtag gttgtggtag 360cagcagaaac
tggaggtgcc ttaggtggca tggttagaca gctctcgatt gaccagtttg
420agaatgaagg tagaagggtc atttatggca cccctgagaa tgcaactgcg
gcaaggaaat 480tgctggatcg acaaatgtct
attaatagcg tgcccaaaaa ggtaatctac atttttctac 540tattgtaaga
ttactgacaa aagcaacaca tgctagaaaa ctgaaagagt tattatcata
600atggcttctg ctaaaaaaac aagcacttca tatgatgaca ttttctctaa
gatgtagatt 660tctatattga tttgttataa ttataatctg tggcccaata
attcaggtaa ttgcttctct 720gctgaaacct cgtggttgga gccccctgtg
75036607DNAZea mays 36cagtcactac tgtgcttttg actggaactt gtgtcgtctt
atggacatca gagaagaatg 60atggtagcag gccatctcgt gccatggcca tcaatattct
tggctgaagg aagtatcaaa 120gtggaagtat gaataccaat agcagcataa
actgaagttg tgcaatgcat atgcttgttg 180cgagaacaga taaagcagaa
aactgatgat atatatccag attatatgcc agtattcttc 240agatgttact
cattttaaaa ccatgcccac ttggctgatg actcatattt tccatcaatt
300tgaatcacag aagaaatttg atgatacatt ggttaagata tgcttatacc
tgtggcaata 360tagaccccat caaggttgag cagagagcaa gaacagcacc
agttgttaca agatacctaa 420aacaagataa gcacatgaaa acaatctcag
tcagacgcac accagatcat ccatgaaaaa 480aatgagaaga agtccttaca
ttgcccagtg catcccatgt ttggcaaagg cagatgaaat 540aggggtgtct
gggtccatag caaagtatgg taccagacca acaataacaa ctgaaaccaa 600catgtac
60737607DNAZea mays 37cagtcactac tgtgcttttg actggaactt gtgtcgtctt
atggacatca gagaagaatg 60atggtagcag gccatctcgt gccatggcca tcaatattct
tggctgaagg aagtatcaaa 120gtggaagtat gaataccaat agcagcataa
actgaagttg tgcaatgcat atgcttgttg 180cgagaacaga taaagcagaa
aactgatgat atatatccag attatatgcc agtattcttc 240agatgttact
cattttaaaa ccatgcccac ttggctgatg actcatattt tccatcaatt
300tgaatcacag aagaaatttg atgatacatt ggttaagata tgcttatacc
tgtggcaata 360tagaccccat caaggttgag cagagagcaa gaacagcacc
agttgttaca agatacctaa 420aacaagataa gcacatgaaa acaatctcag
tcagacgcac accagatcat ccatgaaaaa 480aatgagaaga agtccttaca
ttgcccagtg catcccatgt ttggcaaagg cagatgaaat 540aggggtgtct
gggtccatag caaagtatgg taccagacca acaataacaa ctgaaaccaa 600catgtac
607381025DNAZea mays 38ctctagagga tccccctggt ggttgagagg tactaccagc
aagtgacgat gtactgaggc 60cagtagattg ggctggggag ctgaacccaa aaaggttgtt
tccagagctc ggaaatgtaa 120atgaaggggt tgaagcagac tgtaaacctg
gtgcagtgga agtactagat ggtgcagcca 180cagatattcc agcagcatca
cttgaaggtg caactgcggt gaaagtaggc gtagatgtgg 240cgggggaact
tgaaaagctt gcaatacttg atgacacagg tgaaataaaa agttccgatg
300atgccttgct gcttgcatct ggtgcaacag atttcacttc agcttttgta
ccccctatac 360caaatgataa ggtgctagtg ccatcagatt tagctgatgt
taccaatgac aatccaaccg 420gagcactgga agaaaaacta aatcccgtaa
atgcaggaga cgatgaaata gctggactgc 480tactggatac ggcaaatatg
gaagtggaaa caggggcttc atggattcag tttgatgagc 540ccactcttgt
cctcgacctt gattctgaca aattggctgc attctctgct gcatacgcag
600aacttgaatc tgtactttct ggattgaatg tgcttgttga gacttacttt
gctgatgttc 660ctgctgagtc ctacaagtat gttatttatg ttgctagctc
acccttttct cagccattgc 720tcctcctaga cccttttgtg atgacatcat
acttgctgtt cttttgaatg caggacccta 780acatctctga gcagtgtgac
tgcttatggt tttgatcttg tccgtggaac ccaaactctt 840gggcttgtca
cgagtgctgg tttccctgct ggaaagtacc tctttgctgg tgttgtggat
900ggacgcaaca tctgggctga tgatcttgct acatctctca gcactctcca
gtctcttgag 960gctgttgttg ggaagggtaa tcatgcttgc acttattgtc
tgccataaaa ttggatttag 1020ttaca 102539450DNAZea mays 39tgaaacgata
gctttattta tatcactatt ggtacagtta gatagaaaag tttcaggcct 60caatcctaag
taaccgaccc cttacatatt tcgaacttct atttacaggc ctaggcaaca
120acaagctacc tatagtgcac cggcagagcc catgctcgcc gttgcacggg
ctgccgtccc 180taacggcggc agatgtcttg cgtcgtgaca cctcccgcat
ccggcgccgc ttcgcctcac 240aatcgtccgt ggtggcttct tttgcgtcgg
ccccggcccc ggcgccggca gccaccataa 300tccccctaga cggctcaccg
gacgcgggcg cgctggacta caccgtgaac gtcggctatg 360gcacgccgga
gcagcagttc ccgatgttcc tggacaccat cttcggcgtg tccctggtct
420tgtgcaagcc gtgcgcccca ggttccagca 45040373DNAZea mays
40tcacaaattc atctttaaat tggccctaca tatgatataa ctcacactga gtaactgttg
60atatcatatt ctaaatgact aaaagatttc agttattagc atattatgat atcacacacc
120tttccaacaa cctcaaacgt ccattgtttc aaccggaagg ccactgcgtt
tagatgatta 180tttggcatgg aggcccagtg tgtatcacca tttaaactct
gaaaaggtta gactttccct 240gatgaaacct tctaattaag tgggtaagca
aagcactatt cagtaattgt atcacctcct 300gttctagcag gaatctccaa
aatctcttca ccaggccaaa ctccgtgggg aacgtgttac 360tcactttgtg ctt
373411106DNAZea mays 41gcggctccac ggccaggcac aacaagcccg gtgtctccga
ccaccgagca agggtcgctc 60ccgttctccc gcgtcccagc gcaccagccc ccgggagtcg
gcacgtccgc gccgcggccc 120aggatctcct cgtctatcct gtccagcacc
gggtcgtcct cgtggaactt gcgggcgaac 180ggcgcgtcgc tggcgaccat
gcggtccagg tcctccgccg tcaagtagtg cgggtgctgc 240ttcggagggt
tgtcccacga gatgtagtgc aggtcgtggt tcaccgtcgt gttcttgaac
300tcctccgcgt tgcacacaac ggtgtggaag tatccttccg gcgacgagat
gaagtttgag 360tagtacatga gcactgtgcg aggtaggttg tcccagcccc
atatgcagta ttccacaaag 420ggcctggaca gtgccatcca ggcagaacct
gcgatgtcat aggctcatca ggtgcttcta 480tatctgaatc agtaactgac
atatatagat gcctggttat ctatatgagc tgtctggacc 540tagagtagtg
tttatatcta agctgtagtg tctgtttaag aaattggaat caattaattt
600cctgcctaca cagagaagtg caaacctcct ttggggggag atggctaagc
atggataatg 660gatgtagaac ctgaacacca tatttatatt aatgaaaaaa
cctgattttt gtgggaaaag 720ttcattgaca cggtttcatt aatatataaa
ttagtaactg acacataaaa caaggatagc 780ctttcaaaaa ttaatggcag
ctaacataaa tggatcaggg gagtaatcga gccaggaatg 840ccttggttgg
tttccagaac aggaggtcat ttttacatgc aaataaatgt gttgccatac
900aaatgttctc tactacaggc aacaaacaaa tagaatcaca aatgtttgtt
ccagcaataa 960gatcaaacaa aataagatta tacagatgga aaggtaattt
tttttaccaa ggagaatttc 1020aggcagcgat atactacact gtacaaaaaa
aaaagttaaa aaaataccac agctgtatgc 1080actattttga agaagaccta aggtaa
110642958DNAZea mays 42catcttgcat gcctgcagtt tctttggtag acataggtgg
gcatgaaggt ggttagttct 60ttaagattta agatgctaat ctttatgttg agttatactg
tgattaaaat gaagataaac 120atatagtagt gtggtctgtg gtgtcaggat
ttgttttcat aaaaattctt tctgcatgct 180taggtctgtt tgttaaggta
tatgctagtg taaaagatgg tgctctcaga catatccatg 240attcaaaatt
aaattagtgg ttgtgacgaa gaattttgag ctgggaaggc agtgtccaca
300ggcaattgct tagagccctg ccctgctgtg tatccttaaa attgacatga
atttaggacc 360cttgtgattt atacatctat gccaaattgc caagtgctac
ttttctctat gtggaaggag 420atacatgcat ggaaagtttt gcgcgagcct
tgagtgtatc acagactatc acagctgaga 480gggatatctg aaggatatat
catgattcat gattaagtta atgtgaagat ctgactaaaa 540ttgtgcttgc
agtaaacaaa acagttctgg ttgggaacca cagatctctt aggaacattg
600tattctgatc tgactaaaat tgtgcttgaa atgaacaaaa cagttattgg
caacctcaga 660tgcttagata tattaggaac aaatgtattc cgatctgact
aaaaccgtgc ttgcagtgag 720caaaacagtt cttgttggga acctcagatc
tcttaggaac aaatgtgatc tgatctgact 780aatattgtgc ttgtagtgat
aaaaaagctc ttgttgggaa catcagatat cttaattttt 840ttgaaaaacg
caggagagct gcgcatattt atagataaga tagaagaaag ggtcttacaa
900gagaggtaca ggttagggac acctgcaccc acacacacgc actatcaact gaaacaaa
95843683DNAZea mays 43tggtatgctc tctgaacttt tgctctgtaa ctgtgaccct
caataaaaaa aattcagtta 60aaggaatagg tcccgttgac cgagctcttc gattctctct
gaatgcagat agctacactg 120gtagctgtat acgcggactg ggggttcact
tcgatcgaag gcattggatg gggctgggct 180ggtgtggtgt ggctctacaa
cctcgtcttc tacttcccgc tcgacctcct caagttcctc 240atccgatacg
ctctgagtgg caaagcgtgg gatcttgtca ttgagcaaag ggtgatcaat
300ataaactgct cgttttgtca tgcacagcaa agcacagcac agcacctgtt
tgagtgaatt 360ccatgcacgc gcggtcggtg tgtcgctaat cgccggggtt
ttgcagattg cgtttacaag 420gaagaaggac ttcgggaagg aggagagggc
gctcaagtgg gcacacgcgc agaggacgct 480ccacgggctg cagccaccgg
atgccaagct gttccctgac agggtgaacg agctgaatca 540gatggccgaa
gaggccaaac ggagggccga gattgcaagg taagatgttg aagtccgtgg
600agatggtatc gcttgagggg aaagaaaggg caccgatgtc agcgtttccc
atgatctctc 660catatgcttt gggatatcta taa 68344772DNAZea mays
44cacccctacc aaatcgagca cagctcagaa gaggcagggg aggatcactc tcctaccaac
60caaaagtcac ctcgacactg gacgtgatac aagagacaca aaacatgggc cagaagacac
120catgcatcag acagttctga gaacttattc aacagcaaaa gtcatcttcg
gatttgttgg 180cttcctcgtt ccaccatcct ttttggcatc ttttcctgta
gaacagacca cgaatccaac 240caaaagaagc aagcataaat cacatcagta
gggtcataaa agaccttgtt gttaaagcac 300acatcattcc taaacttcca
aatggcccag agaatggctc ctatgcctag aacaatcaaa 360ttcctcgtgg
ttttcttata aatgttcatc cagtcctcaa aaagagaatc caggttttga
420ggacatctct tcacatccag ggcaacctga agaactctcc acacaaaagt
ggccactggg 480caattgagaa aaaggtgatt ggtggattcc aaaacaccac
agaaacaaca atcagtcaac 540ccaggccaat tccttttttc aggttttctt
tggagataat ttttaatctt cagaacaagc 600cacaggaaaa cttaaatttt
tgaggcattc ttattttcca aagaaactta ggaactttag 660tatgacgtca
aatagatata cttggctaat aattttgaag aatttaatga ttattatttg
720gtatcataaa tgctattatt tgattatata aaaattggtc aaacttatga tg
77245544DNAZea mays 45atacaatgag atggttggca tgcaaagtat tgtgttgaaa
ctgataaaac attatcttat 60atctcaaagt tctcaattgt ttacaagaag gtaaagccct
gtataatttt atggcagata 120actgcaactc ccatagaaaa tccaatgccg
gaggttccca catgagtagg gtctgggaag 180agaaaaacca aggcaagcct
tcccccgcag atgaggggag gctacaaatc tcatcaacag 240acaaaactca
tccatcggcg ttggtagcgc ccagatgcca catctgggtc tggcactgga
300cgcagcatcc accatccatt tgtcctatcc tttgatccac agttcattta
atttagtcca 360caaggaataa catatctact tctaattaat ccaagtgaaa
tgggactatc ttcgtgcatt 420cctttacctc cacatctgcc tacctcttgc
accatcgcta tttctctcgt gcggaattcc 480aaggctgggt taaaaaaaac
acggaacacc cctcacccct gtcggcctgt tcttccactg 540acgc 54446452DNAZea
mays 46cttccatcca cgaactcccg ttctccattt ccagccatgg cggtgtcgat
cgagctcacc 60aaggagtacg gctacgtcgt gctggtgctg gtggcctatg tcttcctcaa
cctttggatg 120ggcttccagg tcggcaaggc ccgcagaaag taagctctcc
gaaatctgaa tcgctcgtcg 180ccattgttgt cttcgtttgt ctccgcccaa
cttatttcat caacggacat aataatatga 240tggtttccgc tgtgattctt
ctttaggtac aaggtgttct accccaccat gtacgccatc 300gagtcggaga
acaaggacgc caagctcttc aactgcgtgc aggtgcgccc aagattctga
360catcctctcc cctcccccgt gattaattaa ttgctcttgt gaggggttgg
gactttggga 420ggcatctaaa tttccgctgg ttcttgtggt tg 452471064DNAZea
mays 47tagctttaga gcatgtggaa atttcagctt ctggacaggc tactaacttc
cctacttgca 60cgcaagcata aggtatggtt ttaatgaaca tgttacccaa gtttgtgttt
ttttagtatt 120tcttaactaa ctttagatca actgatatat gtttgtaggt
tctaatattc tcacaatgga 180caaaagtttt ggacattctt gagtattacc
tagattcaaa aggccttggg gtttgcagaa 240ttgatggtag tgttaatttg
gaagagaggc ggcgacaggt aacatgctag cctgtgccaa 300tatattactc
ctttcattcc aaattataaa atattgactt ttctagatac attacttttg
360ctatgtatct agatatacac tcatgtggga gcctccaaca ctggatctgc
cctagataca 420cactaagtct atatgcataa caaaaagtga tgtatctaga
aaagccaaaa cgtcttgtaa 480tttgggacgt gtactatttt tctaaagaac
atctaaactc tgaggatttg cactgtagtt 540agatatttgg gtacatgtat
aaatttattt cacaaaaaaa atcttttgtt actgttttct 600catgcagata
gcagagttga atgatttgaa tagcagtctg aatgtcttta ttctgagcac
660acgggctggc ggacttggta tcaaccttac ttctgctgat acatgtatcc
tttatgacag 720tgactgggta attctctgtc aagactacta tactcatcag
aaaaatgttt acaagaaagc 780cttttttttt gctggctact ttctttggct
gctgattgac ttattatgct agttctaata 840tggtgctcgt ttactcgaac
ctccagaatc ctcagatgga tcagcaggcc atggatcgat 900gccaccggat
tggtcaaaca cgcccagtac atgtatatag gctggctacc tcatattctg
960ttgaggtatg cttcagtgat ccgtgttttc agacttgtca ctttggctat
tgtctcaggt 1020ttactcagct tttcaccttt gcaggaacgg atcatcaaga aagc
1064481588DNAZea mays 48gggcgccggc gggtgagatc ggcaggggca tgccgccgtt
ggccccggcg gcggtagaga 60atcctgatga tgctgcagag tggtgcaacg gcacgtgagg
ggagccctgc gcttgccatg 120cccatgaggg gaatggctgg gcgcccagct
caggccttcg ctccaccggc ctttcctgga 180cctgcacgat atcgcagcag
cacgagcatg agctttgttt gtcaggtttt gaccacagtg 240aacgtttcca
cctttaagtc gttcacagga aacagagcga acaaacaacc aagattggat
300atcggcgagc atgattaatc tttcatgatt ctgttttaat taattgattg
attttaacgt 360atgtgcagca caatgaagac ttgatgctag ctctatttcc
cgtaaaaatt cagagaatcc 420tcaggactac agctgcagca cgtgatccct
gatcagcgta ctaaaagata catcccatgt 480tatgtcaaga aactgcgacc
caacaaccgc gactgcggtg tttcaaaaga tggaaagtgg 540cagaacggcc
tggtcgaaac accaaccagt ccacacacca aacagtacct gccattatag
600taaggagtac cattgccccc cacccccccc aaaaaaaaca gaaaaagaag
tgctgtatag 660tatttttcca aggagcaaga cgtttcataa gaaatttcta
aacaagctaa cagtaacagg 720ttggtaggga atcaatctac aagtaaatgt
ctatctagct gctctcagtc aaaaaggtta 780gacatgtcgc acatatatat
tcggcttgtt tctttcccat tatttccttt ttggaacaca 840ggtcatgttc
agctaaccat cctggattct acgatagttt ggaataactg atggagtatg
900tagtaacaga catttgcatg tgtaagtgta acaaagggat cttggcgata
cctggtggtt 960cggaaagaag ccatgacaag ctgacgagta caagcgctga
tggtgagatg acactgctgg 1020gttttggtga tacccgggcc aaggggatga
tgagctcatt ggctacagca gcagcaggtg 1080aaaggagcaa acactgtgag
atcaataata gttgtaggac ctggaccacg aattaatatc 1140gatctccccc
aaccatatgc caatcaaata gtcctatgaa tctttgtagg ctaggagtta
1200ctagtaggaa ctagtatcag aggtccaaga ttctacaaag acaattcaac
cgacagttgc 1260ttccatctta agattttaat tttttttcta gatacctctt
taggcaccat aagtaaaaag 1320ctatatactg gaaaattgaa cgggtggttt
tacctgagaa gccatcgtat ttgaagattc 1380aggggaatca catgttggct
ggaggccagc taaggtattg tcccttttgg ggtcttgcac 1440attaggctgt
gaaatccgca aatccaaatc aatggcatca ccatcatcaa ttgctgacaa
1500cgtagagaga accatgagat acacctcaga tactatatat attaaaaaaa
agaacggatg 1560acaaggccga gtgtaatacc ctcggttt 1588

<210> 49
<211> 784
<212> DNA
<213> Zea mays
<400> 49
aacataattt catgtactgt tcgtacagag attatcttta gagagaatag taagtactac
60cttcgttctt gaatatttat catctgctag tttaatttta aactaaaacg tgataaataa
120aaaaaacgaa gagggtatct tctatcttgt aataccaacc acgtgaagac
cttcactcca 180cggttgtttt gcctttttag attatattaa aactttgccc
atacaaaaca gtttcttagg 240gaaatttcca gatttatctc actcgcaatt
accaactggc catcgttatt ttttagaacg 300atatcttgca atcttctaag
ctgactctgt aaatcttcac gtagccatct cctaaaaatg 360aggctctagt
ttttatattt catggcttca actgaacaat cacggtcctc gtttttttta
420aaaaaaatga gaaaagtgtg tctttaaagt aacctatagc cacaccatct
agtatgccaa 480aaaaatggtg gatttttcat tggccgacac cgtagacctg
ctgcttaagt aaatatatct 540ggtttggggt taattagaag tatgccactc
caaatttaaa gagttataaa gtagtttatt 600ttcatgtcgc tcaatgacat
caaaatgaat tacctatcta attttataaa aacagggtga 660caagtatgta
attaatcttt tttttcttag ctaggtttag tgatcttagt ccatctaatc
720aaatgatatt tccctcttcc aaaaaaaacg ctttctagca ctatccatct
ttccaataga 780tgtc 78450802DNAZea mays 50tgctgcgtcg tcgtgttgtt
acgatcgtcc tttttttttt gaaaatttat tgaaggcccg 60tgcttcttct tctcttcttc
ttttccttct ccactgtttt cctcgcatgc tacatgacag 120tctctctctc
tcgttaactc tgttggtgtt cttattattt gatccggatg caagtaatac
180tatacaaaca ggaatcggga atgcacgctc ccaatttttg acgcctccat
gcatgtagtg 240gcggagttcc gtaaaattaa cgtccgagat gtactattct
atcttttgat tttttttttc 300gtgttttatt taaaaataaa cagcggatga
taaatattta ataacggaga gtaacattta 360aaacaatcca gttaaactat
gacaagtagg gaaagtggat tagatattcg ggaacaatta 420cttacaatag
aggagaatca cgagcacatg gcagcagaca gtcaacactc aggacacaac
480gttcgccgct caggtggccg ttgttcattg gccaaggagg cactctgcac
tcgcctatga 540ataacaaaaa aagataatca agttgagaaa gttatatagc
ttggaagaag atgtaatgac 600aggggtggat ttgggcggtg gcatggcgcc
agcccccgcg cgcgacactg ctatagggtc 660gcagggaggc tggaggcgga
tggtggaggc cgatgccaga ctggcggtgg tggatgcccc 720gagtcaagcc
cgccccgacc gcctcggagg gttgctggtt tgttcctcat tgccgaattc
780cccggaagcc cttaaggctt tt 802

<210> 51
<211> 793
<212> DNA
<213> Zea mays
<220>
<221> unsure
<222> (1)..(793)
<223> n = a, g, c, or t
<400> 51
aaagtaagca aaactaagct gaaatctgca
agaagcagta ttgagattac caaaccaaca 60agatcctgca gttaagttca acagaaccaa
gatcagcatt cagcacaaaa tgaactgatt 120cataatcact ctgtggacag
taacagtggc agagaaagac ctgatgccca actggtttcg 180aattgtgcaa
gatatgagaa aatgggtaaa acaaatctct cctgtctttt agtgtcctta
240agctattact attgagaact ccaccagtga tccttttgcg cattagagca
tccttaatcc 300tggactgact tagttgagta tcagcgagca tgttctgtgt
gctgcatagt ataataataa 360gccatcataa ataataatga tgataaacga
aatcaataac ataaaacaca taataactca 420cgtgttctgt gtgctgcgta
gtataataat aagccatcat aaataataat gatgataaac 480gaaatcaata
acataaaaca cataataact cacggtgtgt gttgctctat ttttactcta
540gggcttccat caggttcggt aagtgcagct tctccttcaa cagcataaac
aagaagacca 600agttggcagt accctaatgc tgtagcttgg aaccaggcaa
gatcaacacg gataccattc 660tcagctaaag agggatttgc atgagattca
agccagttga tgaccggaag aaacgttact 720ttcagatttc caccacggac
caaacgcagg gggatnctnt agagtcgaac ctgcaggnnc 780agcaagtcat agg
79352748DNAZea mays 52atcatttttc aaccaagtaa tggagatcgc tttttatttt
tagtatgtcg catcgttggc 60tagagctttg atctaccttc atggcaagca tgttatccat
cgggacatta aaccagagaa 120tcttttagtt ggagttcagg tacttgcatt
gtgttctttt ttggtatccc tctggcaccg 180ggtctgctct gcctcatctg
gaagactgga actatatgct tgctttacat cgtcgtctgt 240ttccttggaa
cagggcgaga tcaaaattgc cgactttggc tggtctgtgc acaccttcaa
300cagaagacgg actatgtgcg gaactctgga ttacctgcca cctgaaatgg
gtactttgct 360agccatttac ctccagttat gaactagttc aatgggtttt
gagaattccc acagcagtac 420ctagctgctt tccttggctc tgaaacctgt
tgtggtttat ccagtggaga aggcagaaca 480tgattaccat gttgatatat
ggagccttgg tgttctgtgc tatgagttcc tttacggggt 540cccacctttc
gaagctaagg agcactcaga aacctacaga aggtaactcc acaactctgg
600gatcttagta tgtcgtccct aacttgccga tatcttgccc tagattattt
cctgtggctt 660ttgacttttg agctatgctc tgataactgt gaggaaactt
ttgaagttgc atatagtgct 720agtgtagcaa agagagagca ctatattg
74853610DNAZea mays 53atattcaaaa gggaaacgaa gatggcttgt ttattagttc
agttgcctct agctcaaatc 60tgtgggcttt aattatggat gctggcactg gctttacatc
tcaagtatac gaactttcta 120attactttct tcacaaggta agcaatcaat
tctgttgact tcaaagatct gtaagggtcc 180ttcccctttt ttcttcttaa
tataatgata ttcagctctc ctgcttattt gagagaaaaa 240aaccttcaaa
gatatgtaag ggttcttcct ctttgtaata gggtctttac ccttcctttt
300cttcttctta atataatgat acacaatttc tcctgcttat tcgagaaaaa
aattgagaga 360aaaaaccttc aaaaatttac tgttctttgt ttatgattgt
acgccaacac ttcactgatt
420atacatcctc aatgatgtct cattgtcatt cgcatgcttg tactgtgtta
ggaatggata 480atggaacagt gggagagaaa tttctatatc actgcactgg
ctggggcgaa taatggaagc 540tctttggtga ttatgtcaag aggtaattaa
tgtaaatgtt tgagcttgat gcttagactg 600caggcatgca 610

<210> 54
<211> 883
<212> DNA
<213> Zea mays
<400> 54
atttgaatag gatacatatg aaaaaagaga aattaaaggc ctgtttggtt cactacctca
60gttgccacaa tttgcctaac ttttctgcct gaggttagtt attcaattcg aacgactaac
120cttaggcaaa gtgtggcaca tttagccaca aaccaaacag gccctaagtg
tttgctcagc 180caaacatcgt gtatcagctt gaaatccaaa atatgtttgg
caaaacatag cacatttatc 240aagaaatcat agaaggcaaa atgcaatatg
ctaatggaaa aggctcacag gtgactacga 300tatctctcaa caggatatac
aatgcttgag atggagttcg ccatactcag aaatttgttt 360gacatgtgtc
attttataat tttattttag aagctaggat tcagacttca actggagtag
420atcaagtcaa tacataaaca gtattctttc atactaagaa atatcacctt
gtaagattcc 480tcaagcctgt ccttgtattt ttcaaccaac agctctttac
tcaatgccag atctctttcc 540actactgcct ctgtctcaaa ctattaagag
acaagagaac acattactct atctattcaa 600aacaattcct gtaatcaagt
gataatataa ctataaccaa ctaaacatca tgaaaaaatt 660gcagtgcaat
acctgaataa caatgtgagg atacaatgtt tgtcctttac ggattggtgg
720atcaagagtg ataacaacaa atgtatgagg attgtttgac tgcaaaggtc
aaccatgcaa 780tgagtttagt tacatgtcac aatgtaatac tatgtacaat
aataaagcac aagcctcaac 840catatgatac caaggaacag aaaccttttg
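gcaaaaagga aag 883

<210> 55
<211> 24
<212> DNA
<213> Glycine max
<400> 55
tgcttgcttt ggacctacac aaaa 24

<210> 56
<211> 24
<212> DNA
<213> Glycine max
<400> 56
aaaagcccaa aaggaagagt ggag 24

<210> 57
<211> 24
<212> DNA
<213> Glycine max
<400> 57
gcgatgacct tgtatggggt agac 24

<210> 58
<211> 24
<212> DNA
<213> Glycine max
<400> 58
ccatgccctg attcattcat cata 24

<210> 59
<211> 1249
<212> DNA
<213> Glycine max
<400> 59
cagactctag tgactaccac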
cttcactctc ctcaagcatt tcagcctctt ccccgctcag 60actccttagc tttgggagcc
aaattatccc ttacgttctc gacttcaacc atatgtgata 120gctgcctatg
ataccatggc tacttcccgt tagttcttta tctttccttt ccgctttatt
180ccatgcctta ccgatcctct gaagtgtctt tgcattagct tcattgaaac
ctcacgcgat 240gaaaggtgtg atggtctcct ccgatggcgc acttctcata
gggtaaccta attgtcttac 300gaccaacata ggattataat taatacaacc
cctcgtccct ataaaaggga catttggaaa 360tccttcacat aagcataaca
ctcctacccc tctttctttc cactgtggga accaactaat 420ggacgctcct
atcatgcctg ccaagagttc ttcccaattt gcctcgtcct ttcctgagca
480catgcgatga ccttgtatgg ggtagacaga tctactttca tgattgaaga
cgtgggatac 540caaccacaca taaagagcag gcgcacaaca gaaaatcctc
gtagtgctct tcttgcatct 600taagtcaaat gtatcataca cttatgctaa
aacaacaatg atcgggcttt ccttgctatg 660gtgataagca agaaaagcat
cgattgctac tagatccacc aactcgtcta cattcgaaaa 720tagtactatc
ccaaacacta gcagtgctaa tacgtcgatg aatgatgccc actctccttg
780gctggccaga gtttccgcct tctcctccaa tcacttcctt ggtattcccc
ctaccctatt 840cctactttgc ttcactcagt ctaattctca tttcgagatc
ttgacaactc ctgctattct 900cgccatagaa ggatagtacc cagaaaaaag
gtatggcttc cttcctccta tcgggcatcc 960taagatccct tcgaactcct
ctatggttgg tgctaactga aagtccccaa aagtgaagca 1020tctgagtgat
tggtcatagt attgggtgag agatgcgatg gcttcaacga acacttctat
1080catcaccaga tcccaaatct tcccatatac cttgttgaag gactgacgtt
gagctcgatc 1140catccgatgc cccagttttc gcaagatgac tacttctaga
ttcttgagtt cgacacgata 1200gaaccttttc ttaaaagaca gtgcttgtct
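gaccccatct catcagact 1249

<210> 60
<211> 25
<212> DNA
<213> Glycine max
<400> 60
cgttctcgac ttcaaccata tgtga 25

<210> 61
<211> 25
<212> DNA
<213> Glycine max
<400> 61
gcatggaata aagcggaaag gaaag 25

<210> 62
<211> 17
<212> DNA
<213> Glycine max
<400> 62
ccatggtatc ataggca 17

<210> 63
<211> 17
<212> DNA
<213> Glycine max
<400> 63
ccatggtatc gtaggca 17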
* * * * *