U.S. patent application number 14/463687 was filed with the patent office on 2014-08-20 and published on 2015-03-19 for plant genome modification using guide RNA/Cas endonuclease systems and methods of use.
The applicant listed for this patent is E I DU PONT DE NEMOURS AND COMPANY, PIONEER HI-BRED INTERNATIONAL INC. Invention is credited to Andrew Mark Cigan, Saverio Carl Falco, Huirong Gao, Zhongsen Li, Zhan-Bin Liu, L. Aleksander Lyznik, Jinrui Shi, Sergei Svitashev, Joshua K. Young.
Publication Number | 20150082478 |
Application Number | 14/463687 |
Family ID | 51493053 |
Publication Date | 2015-03-19 |
United States Patent Application | 20150082478 |
Kind Code | A1 |
Cigan; Andrew Mark; et al. | March 19, 2015 |
PLANT GENOME MODIFICATION USING GUIDE RNA/CAS ENDONUCLEASE SYSTEMS
AND METHODS OF USE
Abstract
Compositions and methods are provided for genome modification of
a target sequence in the genome of a plant or plant cell. The
methods and compositions employ a guide RNA/Cas endonuclease system
to provide an effective system for modifying or altering target
sites within the genome of a plant, plant cell or seed. Also
provided are compositions and methods employing a guide
polynucleotide/Cas endonuclease system for genome modification of a
nucleotide sequence in the genome of a cell or organism, for gene
editing, and/or for inserting or deleting a polynucleotide of
interest into or from the genome of a cell or organism. Once a
genomic target site is identified, a variety of methods can be
employed to further modify the target sites such that they contain
a variety of polynucleotides of interest. Breeding methods and
methods for selecting plants utilizing a two component RNA guide
and Cas endonuclease system are also disclosed. Compositions and
methods are also provided for editing a nucleotide sequence in the
genome of a cell.
Inventors: |
Cigan; Andrew Mark (Johnston, IA)
Falco; Saverio Carl (Wilmington, DE)
Gao; Huirong (Johnston, IA)
Li; Zhongsen (Hockessin, DE)
Liu; Zhan-Bin (West Chester, PA)
Lyznik; L. Aleksander (Johnston, IA)
Shi; Jinrui (Johnston, IA)
Svitashev; Sergei (Johnston, IA)
Young; Joshua K. (Johnston, IA) |
Applicant: |
Name | City | State | Country | Type |
E I DU PONT DE NEMOURS AND COMPANY | Wilmington | DE | US | |
PIONEER HI-BRED INTERNATIONAL INC | Johnston | IA | US | |
Family ID: | 51493053 |
Appl. No.: | 14/463687 |
Filed: | August 20, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
62023239 | Jul 11, 2014 | |
61953090 | Mar 14, 2014 | |
61937045 | Feb 7, 2014 | |
61882532 | Sep 25, 2013 | |
61868706 | Aug 22, 2013 | |
Current U.S. Class: |
800/270;
435/320.1; 435/418; 435/419; 435/441; 435/468; 435/469; 435/470;
435/6.1; 435/6.11; 435/69.7; 800/260; 800/275; 800/278; 800/298;
800/300 |
Current CPC Class: |
C12N 15/113 20130101;
C12N 2310/3231 20130101; A01H 1/00 20130101; C12N 2310/10 20130101;
C12N 2310/20 20170501; C12N 15/8213 20130101; C12N 2310/3341
20130101; C12N 2310/315 20130101; C12N 15/8241 20130101; C12N
15/8216 20130101; C12N 2310/321 20130101; C12N 2310/3521 20130101;
C12N 2310/322 20130101; C12N 2310/3533 20130101 |
Class at Publication: |
800/270;
800/260; 435/6.1; 435/441; 435/468; 435/470; 435/469; 800/275;
800/298; 435/320.1; 800/278; 435/419; 435/418; 800/300; 435/69.7;
435/6.11 |
International Class: |
C12N 15/82 20060101
C12N015/82; C12Q 1/68 20060101 C12Q001/68; C12N 15/01 20060101
C12N015/01; A01H 1/02 20060101 A01H001/02; A01H 1/04 20060101
A01H001/04 |
Claims
1. A method for selecting a plant comprising an altered target site
in its plant genome, the method comprising: a) obtaining a first
plant comprising at least one Cas endonuclease capable of
introducing a double strand break at a target site in the plant
genome; b) obtaining a second plant comprising a guide RNA that is
capable of forming a complex with the Cas endonuclease of (a); c)
crossing the first plant of (a) with the second plant of (b); d)
evaluating the progeny of (c) for an alteration in the target site;
and, e) selecting a progeny plant that possesses the desired
alteration of said target site.
2. A method for selecting a plant comprising an altered target site
in its plant genome, the method comprising selecting at least one
progeny plant that comprises an alteration at a target site in its
plant genome, wherein said progeny plant was obtained by crossing a
first plant comprising at least one Cas endonuclease with a
second plant comprising a guide RNA, wherein said Cas endonuclease
is capable of introducing a double strand break at said target
site.
3. A method for selecting a plant comprising an altered target site
in its plant genome, the method comprising: a) obtaining a first
plant comprising at least one Cas endonuclease capable of
introducing a double strand break at a target site in the plant
genome; b) obtaining a second plant comprising a guide RNA and a
donor DNA, wherein said guide RNA is capable of forming a complex
with the Cas endonuclease of (a), wherein said donor DNA comprises
a polynucleotide of interest; c) crossing the first plant of (a)
with the second plant of (b); d) evaluating the progeny of (c) for
an alteration in the target site; and, e) selecting a progeny plant
that comprises the polynucleotide of interest inserted at said
target site.
4. A method for selecting a plant comprising an altered target site
in its plant genome, the method comprising selecting at least one
progeny plant that comprises an alteration at a target site in its
plant genome, wherein said progeny plant was obtained by crossing a
first plant expressing at least one Cas endonuclease to a second
plant comprising a guide RNA and a donor DNA, wherein said Cas
endonuclease is capable of introducing a double strand break at
said target site, wherein said donor DNA comprises a polynucleotide
of interest.
5. A method for modifying a target site in the genome of a plant
cell, the method comprising providing a guide RNA to a plant cell
having a Cas endonuclease, wherein said guide RNA and Cas
endonuclease are capable of forming a complex that enables the Cas
endonuclease to introduce a double strand break at said target
site.
6. A method for modifying a target site in the genome of a plant
cell, the method comprising providing a guide RNA and a Cas
endonuclease to said plant cell, wherein said guide RNA and Cas
endonuclease are capable of forming a complex that enables the Cas
endonuclease to introduce a double strand break at said target
site.
7. A method for modifying a target site in the genome of a plant
cell, the method comprising providing a guide RNA and a donor DNA
to a plant cell having a Cas endonuclease, wherein said guide RNA
and Cas endonuclease are capable of forming a complex that enables
the Cas endonuclease to introduce a double strand break at said
target site, wherein said donor DNA comprises a polynucleotide of
interest.
8. A method for modifying a target site in the genome of a plant
cell, the method comprising: a) providing to a plant cell a guide
RNA and a Cas endonuclease, wherein said guide RNA and Cas
endonuclease are capable of forming a complex that enables the Cas
endonuclease to introduce a double strand break at said target
site; and, b) identifying at least one plant cell that has a
modification at said target, wherein the modification includes at
least one deletion or substitution of one or more nucleotides in
said target site.
9. A method for modifying a target DNA sequence in the genome of a
plant cell, the method comprising: a) providing to a plant cell a
first recombinant DNA construct capable of expressing a guide RNA
and a second recombinant DNA construct capable of expressing a Cas
endonuclease, wherein said guide RNA and Cas endonuclease are
capable of forming a complex that enables the Cas endonuclease to
introduce a double strand break at said target site; and, b)
identifying at least one plant cell that has a modification at said
target, wherein the modification includes at least one deletion or
substitution of one or more nucleotides in said target site.
10. A method for introducing a polynucleotide of interest into a
target site in the genome of a plant cell, the method comprising:
a) providing to a plant cell a first recombinant DNA construct
capable of expressing a guide RNA and a second recombinant DNA
construct capable of expressing a Cas endonuclease, wherein said
guide RNA and Cas endonuclease are capable of forming a complex
that enables the Cas endonuclease to introduce a double strand
break at said target site; b) contacting the plant cell of (a) with
a donor DNA comprising a polynucleotide of interest; and, c)
identifying at least one plant cell from (b) comprising in its
genome the polynucleotide of interest integrated at said target
site.
11. The method of claim 5, wherein the guide RNA is introduced
directly by particle bombardment.
12. The method of claim 5, wherein the guide RNA is introduced via
particle bombardment or Agrobacterium transformation of a
recombinant DNA construct comprising the corresponding guide DNA
operably linked to a plant U6 polymerase III promoter.
13. The method of claim 1, wherein the Cas endonuclease gene is a
plant optimized Cas9 endonuclease.
14. The method of claim 1, wherein the Cas endonuclease gene is
operably linked to a SV40 nuclear targeting signal upstream of the
Cas codon region and a VirD2 nuclear localization signal downstream
of the Cas codon region.
15. The method of claim 1, wherein the plant is a monocot or a
dicot.
16. The method of claim 15, wherein the monocot is selected from
the group consisting of maize, rice, sorghum, rye, barley, wheat,
millet, oats, sugarcane, turfgrass, or switchgrass.
17. The method of claim 16, wherein the dicot is selected from the
group consisting of soybean, canola, alfalfa, sunflower, cotton,
tobacco, peanut, potato, Arabidopsis, or safflower.
18. The method of claim 1, wherein the target site is located in
the gene sequence of an acetolactate synthase (ALS) gene, an
enolpyruvylshikimate phosphate synthase (EPSPS) gene, or a male
fertility gene (MS45, MS26 or MSCA1).
19. A plant or seed produced by the method of claim 5.
20. A plant comprising a recombinant DNA construct, said
recombinant DNA construct comprising a promoter operably linked to
a nucleotide sequence encoding a plant optimized Cas9 endonuclease,
wherein said plant optimized Cas9 endonuclease is capable of
binding to and creating a double strand break in a genomic target
sequence of said plant genome.
21. A plant comprising a recombinant DNA construct and a guide RNA,
wherein said recombinant DNA construct comprises a promoter
operably linked to a nucleotide sequence encoding a plant optimized
Cas9 endonuclease, wherein said plant optimized Cas9 endonuclease
and guide RNA are capable of forming a complex and creating a
double strand break in a genomic target sequence of said plant
genome.
22. A recombinant DNA construct comprising a promoter operably
linked to a nucleotide sequence encoding a plant optimized Cas9
endonuclease, wherein said plant optimized Cas9 endonuclease is
capable of binding to and creating a double strand break in a
genomic target sequence of said plant genome.
23. A recombinant DNA construct comprising a promoter operably
linked to a nucleotide sequence expressing a guide RNA, wherein
said guide RNA is capable of forming a complex with a plant
optimized Cas9 endonuclease, and wherein said complex is capable of
binding to and creating a double strand break in a genomic target
sequence of said plant genome.
24. A method for selecting a male sterile plant, the method
comprising selecting at least one progeny plant that comprises an
alteration at a genomic target site located in a male fertility
gene locus, wherein said progeny plant is obtained by crossing a
first plant expressing a Cas9 endonuclease to a second plant
comprising a guide RNA, wherein said Cas endonuclease is capable of
introducing a double strand break at said genomic target site.
25. A method for producing a male sterile plant, the method
comprising: a) obtaining a first plant comprising at least one Cas
endonuclease capable of introducing a double strand break at a
genomic target site located in a male fertility gene locus in the
plant genome; b) obtaining a second plant comprising a guide RNA
that is capable of forming a complex with the Cas endonuclease of
(a); c) crossing the first plant of (a) with the second plant of
(b); d) evaluating the progeny of (c) for an alteration in the
target site; and, e) selecting a progeny plant that is male
sterile.
26. The method of claim 24, wherein the male fertility gene is
selected from the group consisting of MS26, MS45 and MSCA1.
27. The method of claim 24, wherein the plant is a monocot or a
dicot.
28. The method of claim 27, wherein the monocot is selected from
the group consisting of maize, rice, sorghum, rye, barley, wheat,
millet, oats, sugarcane, turfgrass, or switchgrass.
29. A method for editing a nucleotide sequence in the genome of a
cell, the method comprising introducing at least one guide RNA, at
least one polynucleotide modification template and at least one Cas
endonuclease into a cell, wherein the Cas endonuclease introduces a
double-strand break at a target site in the genome of said cell,
wherein said polynucleotide modification template comprises at
least one nucleotide modification of said nucleotide sequence.
30. The method of claim 29, wherein the cell is a plant cell.
31. The method of claim 29 wherein the nucleotide sequence is a
promoter, a regulatory sequence or a gene of interest.
32. The method of claim 31 wherein the gene of interest is an
enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene or an
acetolactate synthase (ALS) gene.
33. The method of claim 30 wherein the plant cell is a monocot or
dicot plant cell.
34. A method for producing an epsps mutant plant, the method
comprising: a) providing a guide RNA, a polynucleotide modification
template and at least one Cas endonuclease to a plant cell, wherein
the Cas endonuclease introduces a double strand break at a target
site within an epsps genomic sequence in the plant genome, wherein
said polynucleotide modification template comprises at least one
nucleotide modification of said epsps genomic sequence; b)
obtaining a plant from the plant cell of (a); c) evaluating the
plant of (b) for the presence of said at least one nucleotide
modification; and, d) selecting a progeny plant that shows
tolerance to glyphosate.
35. A method for producing an epsps mutant plant, the method
comprising: a) providing a guide RNA, a polynucleotide modification
template and at least one Cas endonuclease to a plant cell, wherein
the Cas endonuclease introduces a double strand break at a target
site within an epsps genomic sequence in the plant genome, wherein
said polynucleotide modification template comprises at least one
nucleotide modification of said epsps genomic sequence; b)
obtaining a plant from the plant cell of (a); c) evaluating the
plant of (b) for the presence of said at least one nucleotide
modification; and, d) screening a progeny plant of (c) that is void
of said guide RNA and Cas endonuclease.
36. The method of claim 35 further comprising selecting a plant
that shows resistance to glyphosate.
37. A plant, plant cell or seed produced by the method of claim
29.
38. The method of claim 29, wherein the Cas endonuclease is a Cas9
endonuclease.
39. The method of claim 38, wherein the Cas9 endonuclease is
expressed by SEQ ID NO: 5.
40. The method of claim 38 wherein the Cas9 endonuclease is encoded
by any one of SEQ ID NOs: 1, 124, 212, 213, 214, 215, 216, 193 or
nucleotides 2037-6329 of SEQ ID NO: 5, or any functional fragment
thereof.
41. The plant or plant cell of claim 37, wherein said plant cell
shows resistance to glyphosate.
42. A plant cell comprising a modified nucleotide sequence, wherein
the modified nucleotide sequence was produced by providing a guide
RNA, a polynucleotide modification template and at least one Cas
endonuclease to a plant cell, wherein the Cas endonuclease is
capable of introducing a double-strand break at a target site in
the plant genome, wherein said polynucleotide modification template
comprises at least one nucleotide modification of said nucleotide
sequence.
43. The method of claim 29, wherein the at least one nucleotide
modification is not a modification at said target site.
44. A method for producing a male sterile plant, the method
comprising: a) providing to a plant cell a guide RNA and a Cas
endonuclease, wherein said guide RNA and Cas endonuclease are
capable of forming a complex that enables the Cas endonuclease to
introduce a double strand break at a target site located in or near
a male fertility gene; b) identifying at least one plant cell that
has a modification in said male fertility gene, wherein the
modification includes at least one deletion, insertion, or
substitution of one or more nucleotides in said male fertility
gene; and, c) obtaining a plant from the plant cell of b).
45. The method of claim 44, further comprising selecting a progeny
plant from the plant of c) wherein said progeny plant is male
sterile.
46. The method of claim 44, wherein the male fertility gene is
selected from the group consisting of MS26, MS45 and MSCA1.
47. A plant comprising at least one altered target site, wherein
the at least one altered target site originated from a
corresponding target site that was recognized and cleaved by a
guideRNA/Cas endonuclease system, and wherein the at least one
altered target site is in a genomic region of interest that extends
from the target sequence set forth in SEQ ID NO: 229 to the target
site set forth in SEQ ID NO: 235.
48. The plant of claim 47, wherein the at least one altered target
site has an alteration selected from the group consisting of (i)
replacement of at least one nucleotide, (ii) a deletion of at least
one nucleotide, (iii) an insertion of at least one nucleotide, and
(iv) any combination of (i)-(iii).
49. The plant of claim 47, wherein the at least one altered target
site comprises a recombinant DNA molecule.
50. The plant of claim 47, wherein the plant comprises at least two
altered target sites, wherein each of the altered target sites
originated from a corresponding target site that was recognized and
cleaved by a guideRNA/Cas endonuclease system, wherein the
corresponding target site is selected from the group consisting of
SEQ ID NOs: 229, 230, 231, 232, 233, 234, 235 and 236.
51. A method for editing a nucleotide sequence in the genome of a
cell, the method comprising providing a guide polynucleotide, a Cas
endonuclease, and optionally a polynucleotide modification
template, to a cell, wherein said guide RNA and Cas endonuclease
are capable of forming a complex that enables the Cas endonuclease
to introduce a double strand break at a target site in the genome
of said cell, wherein said polynucleotide modification template
comprises at least one nucleotide modification of said nucleotide
sequence.
52. The method of claim 51, wherein the nucleotide sequence in the
genome of a cell is selected from the group consisting of a
promoter sequence, a terminator sequence, a regulatory element
sequence, a splice site, a coding sequence, a polyubiquitination
site, an intron site and an intron enhancing motif.
53. A method for editing a promoter sequence in the genome of a
cell, the method comprising providing a guide polynucleotide, a
polynucleotide modification template and at least one Cas
endonuclease to a cell, wherein said guide RNA and Cas endonuclease
are capable of forming a complex that enables the Cas endonuclease
to introduce a double strand break at a target site in the genome
of said cell, wherein said polynucleotide modification template
comprises at least one nucleotide modification of said promoter
sequence to be edited.
54. A method for replacing a first promoter sequence in a cell, the
method comprising providing a guide RNA, a polynucleotide
modification template, and a Cas endonuclease to said cell, wherein
said guide RNA and Cas endonuclease are capable of forming a
complex that enables the Cas endonuclease to introduce a double
strand break at a target site in the genome of said cell, wherein
said polynucleotide modification template comprises a second
promoter or second promoter fragment that is different from said
first promoter sequence.
55. The method of claim 54, wherein the replacement of the first
promoter sequence results in any one of the following, or any one
combination of the following: an increased promoter activity, an
increased promoter tissue specificity, a decreased promoter
activity, a decreased promoter tissue specificity, a new promoter
activity, an inducible promoter activity, an extended window of
gene expression, or a modification of the timing or developmental
progress of gene expression in the same cell layer or other cell
layer.
56. The method of claim 54, wherein the first promoter sequence is
selected from the group consisting of Zea mays ARGOS 8 promoter, a
soybean EPSPS1 promoter, a maize EPSPS promoter, maize NPK1
promoter, wherein the second promoter sequence is selected from the
group consisting of a Zea mays GOS2 PRO:GOS2-intron promoter, a
soybean ubiquitin promoter, a stress inducible maize RAB17
promoter, a Zea mays-PEPC1 promoter, a Zea mays Ubiquitin promoter,
a Zea mays-Rootmet2 promoter, a rice actin promoter, a sorghum RCC3
promoter, a Zea mays-GOS2 promoter, a Zea mays-ACO2 promoter, and a
Zea mays oleosin promoter.
57. A method for deleting a promoter sequence in the genome of a
cell, the method comprising providing a guide polynucleotide and a Cas
endonuclease to a cell, wherein said guide RNA and Cas endonuclease
are capable of forming a complex that enables the Cas endonuclease
to introduce a double strand break in at least one target site
located inside or outside said promoter sequence.
58. A method for inserting a promoter or a promoter element in the
genome of a cell, the method comprising providing a guide
polynucleotide, a polynucleotide modification template comprising
the promoter or the promoter element, and a Cas endonuclease to a
cell, wherein said guide RNA and Cas endonuclease are capable of
forming a complex that enables the Cas endonuclease to introduce a
double strand break at a target site in the genome of said
cell.
59. The method of claim 58, wherein the insertion of the promoter
or promoter element results in any one of the following, or any one
combination of the following: an increased promoter activity, an
increased promoter tissue specificity, a decreased promoter
activity, a decreased promoter tissue specificity, a new promoter
activity, an inducible promoter activity, an extended window of
gene expression, a modification of the timing or developmental
progress of gene expression, a mutation of DNA binding elements, or
an addition of DNA binding elements.
60. A method for editing a Zinc Finger transcription factor, the
method comprising providing a guide polynucleotide, a Cas
endonuclease, and optionally a polynucleotide modification
template, to a cell, wherein the Cas endonuclease introduces a
double-strand break at a target site in the genome of said cell,
wherein said polynucleotide modification template comprises at
least one nucleotide modification or deletion of said Zinc Finger
transcription factor, wherein the deletion or modification of said
Zinc Finger transcription factor results in the creation of a
dominant negative Zinc Finger transcription factor mutant.
61. A method for creating a fusion protein, the method comprising
introducing a guide polynucleotide, a Cas endonuclease, and a
polynucleotide modification template, into a cell, wherein the Cas
endonuclease introduces a double-strand break at a target site
located inside or outside a first coding sequence in the genome of
said cell, wherein said polynucleotide modification template
comprises a second coding sequence encoding a protein of interest,
wherein the protein fusion results in any one of the following, or
any one combination of the following: a targeting of the fusion
protein to the chloroplast of said cell, an increased protein
activity, an increased protein functionality, a decreased protein
activity, a decreased protein functionality, a new protein
functionality, a modified protein functionality, a new protein
localization, a new timing of protein expression, a modified
protein expression pattern, a chimeric protein, or a modified
protein with dominant phenotype functionality.
62. A method for producing in a plant a complex trait locus
comprising at least two altered target sequences in a genomic
region of interest, said method comprising: (a) selecting a genomic
region in a plant, wherein the genomic region comprises a first
target sequence and a second target sequence; (b) contacting at
least one plant cell with at least a first guide polynucleotide, a
second guide polynucleotide, and optionally at least one Donor DNA, and a
Cas endonuclease, wherein the first and second guide polynucleotide
and the Cas endonuclease can form a complex that enables the Cas
endonuclease to introduce a double strand break in at least a first
and a second target sequence; (c) identifying a cell from (b)
comprising a first alteration at the first target sequence and a
second alteration at the second target sequence; and (d) recovering
a first fertile plant from the cell of (c) said fertile plant
comprising the first alteration and the second alteration, wherein
the first alteration and the second alteration are physically
linked.
63. A method for producing in a plant a complex trait locus
comprising at least two altered target sequences in a genomic
region of interest, said method comprising: (a) selecting a genomic
region in a plant, wherein the genomic region comprises a first
target sequence and a second target sequence; (b) contacting at
least one plant cell with a first guide polynucleotide, a Cas
endonuclease, and optionally a first Donor DNA, wherein the first
guide polynucleotide and the Cas endonuclease can form a complex
that enables the Cas endonuclease to introduce a double strand
break at the first target sequence; (c) identifying a cell from (b)
comprising a first alteration at the first target sequence; (d)
recovering a first fertile plant from the cell of (c), said first
fertile plant comprising the first alteration; (e) contacting at
least one plant cell with a second guide polynucleotide, a Cas
endonuclease, and optionally a second Donor DNA; (f) identifying a
cell from (e) comprising a second alteration at the second target
sequence; (g) recovering a second fertile plant from the cell of
(f), said second fertile plant comprising the second alteration;
and, (h) obtaining a fertile progeny plant from the second fertile
plant of (g), said fertile progeny plant comprising the first
alteration and the second alteration, wherein the first alteration
and the second alteration are physically linked.
64. The method of claim 29, wherein the editing of said nucleotide
sequence renders said nucleotide sequence capable of conferring
herbicide resistance to said cell.
65. A method for producing an acetolactate synthase (ALS) mutant
plant, the method comprising: a) obtaining a plant or a seed
thereof, wherein the plant or the seed comprises a modification in
an endogenous ALS gene, the modification generated by a Cas
endonuclease, a guide RNA and a polynucleotide modification
template, wherein the plant or the seed is resistant to
sulphonylurea; and, b) producing a progeny plant that is void of
said guide RNA and Cas endonuclease.
66. A method of generating a sulphonylurea resistant plant, the
method comprising providing a plant cell wherein its endogenous
chromosomal ALS gene has been modified through a guide RNA/Cas
endonuclease system to produce a sulphonylurea resistant ALS
protein and growing a plant from said maize plant cell, wherein
said plant is resistant to sulphonylurea.
Description
[0001] This application claims the benefit of U.S. Provisional
Application No. 61/868,706, filed Aug. 22, 2013, U.S. Provisional
Application No. 61/882,532, filed Sep. 25, 2013, U.S. Provisional
Application No. 61/937,045, filed Feb. 7, 2014, U.S. Provisional
Application No. 61/953,090, filed Mar. 14, 2014, and U.S.
Provisional Application No. 62/023,239, filed Jul. 11, 2014; all of
which are hereby incorporated herein in their entirety by
reference.
FIELD
[0002] The disclosure relates to the field of plant molecular
biology, in particular, to methods for altering the genome of a
plant cell.
REFERENCE TO SEQUENCE LISTING SUBMITTED ELECTRONICALLY
[0003] The official copy of the sequence listing is submitted
electronically via EFS-Web as an ASCII formatted sequence listing
with a file named 20140814_BB2284USNP_ST25_SeqLst created on Aug.
14, 2014 and having a size of 560 kilobytes, and is filed concurrently
with the specification. The sequence listing contained in this
ASCII formatted document is part of the specification and is herein
incorporated by reference in its entirety.
BACKGROUND
[0004] Recombinant DNA technology has made it possible to insert
foreign DNA sequences into the genome of an organism, thus
altering the organism's phenotype. The most commonly used plant
transformation methods are Agrobacterium infection and biolistic
particle bombardment in which transgenes integrate into a plant
genome in a random fashion and in an unpredictable copy number.
Thus, efforts are undertaken to control transgene integration in
plants.
[0005] One method for inserting or modifying a DNA sequence
involves homologous DNA recombination by introducing a transgenic
DNA sequence flanked by sequences homologous to the genomic target.
U.S. Pat. No. 5,527,695 describes transforming eukaryotic cells
with DNA sequences that are targeted to a predetermined sequence of
the eukaryote's DNA. Specifically, the use of site-specific
recombination is discussed. Transformed cells are identified
through use of a selectable marker included as a part of the
introduced DNA sequences.
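The flanking-homology design described in paragraph [0005] can be illustrated with a minimal Python sketch. It is a toy string model, not a method taken from this application or from U.S. Pat. No. 5,527,695: the function names `build_donor` and `integrate`, the example sequences, and the arm length are all hypothetical, and real homology arms are far longer than three bases.

```python
def build_donor(genome: str, site: int, insert: str, arm_len: int) -> str:
    """Return left homology arm + insert + right homology arm.

    `site` is the 0-based index in `genome` before which `insert` should land.
    All names here are hypothetical and for illustration only.
    """
    if arm_len <= 0:
        raise ValueError("arm_len must be positive")
    if site - arm_len < 0 or site + arm_len > len(genome):
        raise ValueError("homology arms would run past the ends of the genome")
    left_arm = genome[site - arm_len:site]
    right_arm = genome[site:site + arm_len]
    return left_arm + insert + right_arm


def integrate(genome: str, donor: str, arm_len: int) -> str:
    """Toy model of homologous recombination: splice the donor payload between
    the genomic sequences that match the donor's two arms."""
    left_arm, right_arm = donor[:arm_len], donor[-arm_len:]
    payload = donor[arm_len:-arm_len]
    pos = genome.find(left_arm + right_arm)
    if pos == -1:
        return genome  # no matching target; genome left unchanged
    cut = pos + arm_len
    return genome[:cut] + payload + genome[cut:]


genome = "AAACCCGGGTTT"
donor = build_donor(genome, site=6, insert="TATA", arm_len=3)
edited = integrate(genome, donor, arm_len=3)
print(donor)   # CCCTATAGGG
print(edited)  # AAACCCTATAGGGTTT
```

The sketch only shows why the flanking sequences determine where the transgenic sequence lands; it says nothing about recombination frequency or selectable-marker screening.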
[0006] It was shown that artificially induced site-specific genomic
double-stranded breaks in plant cells were repaired by homologous
recombination with exogenously supplied DNA using two different
pathways (Puchta et al., (1996) Proc. Natl. Acad. Sci. USA
93:5055-5060; U.S. Patent Application Publication No.
2005/0172365A1 published Aug. 4, 2005; U.S. Patent Application
Publication No. 2006/0282914 published Dec. 14, 2006; WO
2005/028942 published Jun. 2, 2005).
[0007] Since the isolation, cloning, transfer and recombination of
DNA segments, including coding sequences and non-coding sequences,
is most conveniently carried out using restriction endonuclease
enzymes, much research has focused on studying and designing
endonucleases, such as those described in WO 2004/067736 published
Aug. 12, 2004; U.S. Pat. No. 5,792,632 issued to Dujon et al., Aug.
11, 1998; U.S. Pat. No. 6,610,545 B2 issued to Dujon et al., Aug.
26, 2003; Chevalier et al., (2002) Mol Cell 10:895-905; Chevalier
et al., (2001) Nucleic Acids Res 29:3757-3774; Seligman et al.,
(2002) Nucleic Acids Res 30:3870-3879.
[0008] Although several approaches have been developed to target a
specific site for modification in the genome of a plant, there
still remains a need for more efficient and effective methods for
producing a fertile plant, having an altered genome comprising
specific modifications in a defined region of the genome of the
plant.
BRIEF SUMMARY
[0009] Compositions and methods are provided employing a guide
RNA/Cas endonuclease system in plants for genome modification of a
target sequence in the genome of a plant or plant cell, for
selecting plants, for gene editing, and for inserting a
polynucleotide of interest into the genome of a plant. The methods
and compositions employ a guide RNA/Cas endonuclease system to
provide for an effective system for modifying or altering target
sites and nucleotides of interest within the genome of a plant,
plant cell or seed. Once a genomic target site is identified, a
variety of methods can be employed to further modify the target
sites such that they contain a variety of polynucleotides of
interest. Breeding methods and methods for selecting plants
utilizing a two component RNA guide and Cas endonuclease system are
also disclosed. Also provided are nucleic acid constructs, plants,
plant cells, explants, seeds and grain having the guide RNA/Cas
endonuclease system. Compositions and methods are also provided
employing a guide polynucleotide/Cas endonuclease system for genome
modification of a target sequence in the genome of a cell or
organism, for gene editing, and for inserting or deleting a
polynucleotide of interest into or from the genome of a cell or
organism. The methods and compositions employ a guide
polynucleotide/Cas endonuclease system to provide for an effective
system for modifying or altering target sites and editing
nucleotide sequences of interest within the genome of a cell,
wherein the guide polynucleotide is comprised of an RNA sequence, a
DNA sequence, or a DNA-RNA combination sequence.
[0010] Thus in a first embodiment of the disclosure, the method
comprises a method for selecting a plant comprising an altered
target site in its plant genome, the method comprising: a)
obtaining a first plant comprising at least one Cas endonuclease
capable of introducing a double strand break at a target site in
the plant genome; b) obtaining a second plant comprising a guide
RNA that is capable of forming a complex with the Cas endonuclease
of (a); c) crossing the first plant of (a) with the second plant of
(b); d) evaluating the progeny of (c) for an alteration in the
target site; and e) selecting a progeny plant that possesses the
desired alteration of said target site.
[0011] In another embodiment, the method comprises a method for
selecting a plant comprising an altered target site in its plant
genome, the method comprising selecting at least one progeny plant
that comprises an alteration at a target site in its plant genome,
wherein said progeny plant was obtained by crossing a first plant
comprising at least one Cas endonuclease with a second plant
comprising a guide RNA, wherein said Cas endonuclease is capable of
introducing a double strand break at said target site.
[0012] In another embodiment, the method comprises a method for
selecting a plant comprising an altered target site in its plant
genome, the method comprising:
[0013] a) obtaining a first plant comprising at least one Cas
endonuclease capable of introducing a double strand break at a
target site in the plant genome; b) obtaining a second plant
comprising a guide RNA and a donor DNA, wherein said guide RNA is
capable of forming a complex with the Cas endonuclease of (a),
wherein said donor DNA comprises a polynucleotide of interest; c)
crossing the first plant of (a) with the second plant of (b); d)
evaluating the progeny of (c) for an alteration in the target site;
and e) selecting a progeny plant that comprises the polynucleotide of
interest inserted at said target site.
[0014] In another embodiment, the method comprises a method for
selecting a plant comprising an altered target site in its plant
genome, the method comprising selecting at least one progeny plant
that comprises an alteration at a target site in its plant genome,
wherein said progeny plant was obtained by crossing a first plant
expressing at least one Cas endonuclease to a second plant
comprising a guide RNA and a donor DNA, wherein said Cas
endonuclease is capable of introducing a double strand break at
said target site, wherein said donor DNA comprises a polynucleotide
of interest.
[0015] In another embodiment, the method comprises a method for
modifying a target site in the genome of a plant cell, the method
comprising introducing a guide RNA into a plant cell having a Cas
endonuclease, wherein said guide RNA and Cas endonuclease are
capable of forming a complex that enables the Cas endonuclease to
introduce a double strand break at said target site.
[0016] In another embodiment, the method comprises a method for
modifying a target site in the genome of a plant cell, the method
comprising introducing a guide RNA and a Cas endonuclease into said
plant cell, wherein said guide RNA and Cas endonuclease are capable
of forming a complex that enables the Cas endonuclease to introduce
a double strand break at said target site.
[0017] In another embodiment, the method comprises a method for
modifying a target site in the genome of a plant cell, the method
comprising introducing a guide RNA and a donor DNA into a plant
cell having a Cas endonuclease, wherein said guide RNA and Cas
endonuclease are capable of forming a complex that enables the Cas
endonuclease to introduce a double strand break at said target
site, wherein said donor DNA comprises a polynucleotide of
interest.
[0018] In another embodiment, the method comprises a method for
modifying a target site in the genome of a plant cell, the method
comprising: a) introducing into a plant cell a guide RNA and a Cas
endonuclease, wherein said guide RNA and Cas endonuclease are
capable of forming a complex that enables the Cas endonuclease to
introduce a double strand break at said target site; and, b)
identifying at least one plant cell that has a modification at said
target, wherein the modification includes at least one deletion or
substitution of one or more nucleotides in said target site.
[0019] In another embodiment, the method comprises a method for
modifying a target DNA sequence in the genome of a plant cell, the
method comprising: A) introducing into a plant cell a first
recombinant DNA construct capable of expressing a guide RNA and a
second recombinant DNA construct capable of expressing a Cas
endonuclease, wherein said guide RNA and Cas endonuclease are
capable of forming a complex that enables the Cas endonuclease to
introduce a double strand break at said target site; and, B)
identifying at least one plant cell that has a modification at said
target, wherein the modification includes at least one deletion or
substitution of one or more nucleotides in said target site.
[0020] In another embodiment, the method comprises a method for
introducing a polynucleotide of interest into a target site in the
genome of a plant cell, the method comprising: (a) introducing into
a plant cell a first recombinant DNA construct capable of
expressing a guide RNA and a second recombinant DNA construct
capable of expressing a Cas endonuclease, wherein said guide RNA
and Cas endonuclease are capable of forming a complex that enables
the Cas endonuclease to introduce a double strand break at said
target site; (b) contacting the plant cell of (a) with a donor DNA
comprising a polynucleotide of interest; and, (c) identifying at
least one plant cell from (b) comprising in its genome the
polynucleotide of interest integrated at said target site.
[0021] In some of these embodiments, the guide RNA can be
introduced directly by particle bombardment or can be introduced
via particle bombardment or Agrobacterium transformation of a
recombinant DNA construct comprising the corresponding guide DNA
operably linked to a plant U6 polymerase III promoter.
[0022] In some of these embodiments, the Cas endonuclease gene is a
plant optimized Cas9 endonuclease.
[0023] In some of these embodiments, the Cas endonuclease gene is
operably linked to a SV40 nuclear targeting signal upstream of the
Cas codon region and a VirD2 nuclear localization signal downstream
of the Cas codon region.
[0024] The plant in these embodiments is a monocot or a dicot. More
specifically, the monocot is selected from the group consisting of
maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane,
turfgrass, or switchgrass. The dicot is selected from the group
consisting of soybean, canola, alfalfa, sunflower, cotton, tobacco,
peanut, potato, Arabidopsis, or safflower.
[0025] In some embodiments, the target site is located in the gene
sequence of an acetolactate synthase (ALS) gene, an
enolpyruvylshikimate phosphate synthase (EPSPS) gene, or a male
fertility (MS45, MS26 or MSCA1) gene.
[0026] In another embodiment the disclosure comprises a plant,
plant part, or seed, comprising a recombinant DNA construct, said
recombinant DNA construct comprising a promoter operably linked to
a nucleotide sequence encoding a plant optimized Cas9 endonuclease,
wherein said plant optimized Cas9 endonuclease is capable of
binding to and creating a double strand break in a genomic target
sequence of said plant genome.
[0027] In another embodiment the plant comprises a recombinant DNA
construct and a guide RNA, wherein said recombinant DNA construct
comprises a promoter operably linked to a nucleotide sequence
encoding a plant optimized Cas9 endonuclease, wherein said plant
optimized Cas9 endonuclease and guide RNA are capable of forming a
complex and creating a double strand break in a genomic target
sequence of said plant genome.
[0028] In another embodiment, the recombinant DNA construct
comprises a promoter operably linked to a nucleotide sequence
encoding a plant optimized Cas9 endonuclease, wherein said plant
optimized Cas9 endonuclease is capable of binding to and creating a
double strand break in a genomic target sequence of said plant
genome.
[0029] In another embodiment, the recombinant DNA construct
comprises a promoter operably linked to a nucleotide sequence
expressing a guide RNA, wherein said guide RNA is capable of
forming a complex with a plant optimized Cas9 endonuclease, and
wherein said complex is capable of binding to and creating a double
strand break in a genomic target sequence of said plant genome.
[0030] In another embodiment, the method comprises a method for
selecting a male sterile or male fertile plant, the method
comprising selecting at least one progeny plant that comprises an
alteration at a genomic target site located in a male fertility
gene locus, wherein said progeny plant is obtained by crossing a
first plant expressing a Cas9 endonuclease to a second plant
comprising a guide RNA, wherein said Cas endonuclease is capable of
introducing a double strand break at said genomic target site.
[0031] In another embodiment, the method comprises a method for
producing a male sterile or male fertile plant, the method
comprising: a) obtaining a first plant comprising at least one Cas
endonuclease capable of introducing a double strand break at a
genomic target site located in a male fertility gene locus in the
plant genome; b) obtaining a second plant comprising a guide RNA
that is capable of forming a complex with the Cas endonuclease of
(a); c) crossing the first plant of (a) with the second plant of
(b); d) evaluating the progeny of (c) for an alteration in the
target site; and e) selecting a progeny plant that is male sterile
or male fertile. Male fertility genes can be selected from, but are
not limited to, MS26, MS45 and MSCA1 genes.
[0032] Compositions and methods are also provided for editing a
nucleotide sequence in the genome of a cell. In one embodiment, the
disclosure describes a method for editing a nucleotide sequence in
the genome of a plant cell, the method comprising providing a guide
RNA, a polynucleotide modification template, and at least one maize
optimized Cas9 endonuclease to a plant cell, wherein the maize
optimized Cas9 endonuclease is capable of introducing a
double-strand break at a target site in the plant genome, wherein
said polynucleotide modification template includes at least one
nucleotide modification of said nucleotide sequence. The nucleotide
to be edited (the nucleotide sequence of interest) can be located
within or outside a target site that is recognized and cleaved by a
Cas endonuclease. Cells include, but are not limited to, human,
animal, bacterial, fungal, insect, and plant cells as well as
plants and seeds produced by the methods described herein.
[0033] Additional embodiments of the methods and compositions of
the present disclosure are shown herein.
BRIEF DESCRIPTION OF THE DRAWINGS AND THE SEQUENCE LISTING
[0034] The disclosure can be more fully understood from the
following detailed description and the accompanying drawings and
Sequence Listing, which form a part of this application. The
sequence descriptions and sequence listing attached hereto comply
with the rules governing nucleotide and amino acid sequence
disclosures in patent applications as set forth in 37 C.F.R.
§§ 1.821-1.825. The sequence descriptions contain the
three letter codes for amino acids as defined in 37 C.F.R.
§§ 1.821-1.825, which are incorporated herein by
reference.
FIGURES
[0035] FIG. 1A shows a maize optimized Cas9 gene (encoding a Cas9
endonuclease) containing a potato ST-LS1 intron, a SV40 amino
terminal nuclear localization sequence (NLS), and a VirD2 carboxyl
terminal NLS, operably linked to a plant ubiquitin promoter (SEQ ID
NO: 5). The maize optimized Cas9 gene (just Cas9 coding sequence,
no NLSs) corresponds to nucleotide positions 2037-2411 and
2601-6329 of SEQ ID NO: 5 with the potato intron residing at
positions 2412-2600 of SEQ ID NO: 5. SV40 NLS is at positions
2010-2036 of SEQ ID NO: 5. VirD2 NLS is at positions 6330-6386 of
SEQ ID NO: 5. FIG. 1B shows a long guide RNA operably linked to a
maize U6 polymerase III promoter terminating with a maize U6
terminator (SEQ ID NO: 12). The long guide RNA containing the
variable targeting domain corresponding to the maize LIGCas-3
target site (SEQ ID NO: 8) is transcribed from/corresponds to
positions 1001-1094 of SEQ ID NO: 12. FIG. 1C shows the maize
optimized Cas9 and long guide RNA expression cassettes combined on
a single vector DNA (SEQ ID NO: 102).
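The coordinates given for FIG. 1A can be cross-checked with a short
script. The following is only an illustrative sketch: it assumes the
positions are 1-based and inclusive (the usual sequence-listing
convention, not stated explicitly above), and the feature labels and
function names are chosen here for illustration. It computes the
length of each feature, checks that the features abut one another
without gaps, and checks that the two Cas9 coding segments together
form a whole number of codons.

```python
# Inclusive, 1-based coordinates on SEQ ID NO: 5 as stated for FIG. 1A.
# Feature labels are illustrative names, not identifiers from the disclosure.
FEATURES = [
    ("SV40 NLS", 2010, 2036),
    ("Cas9 coding segment 1", 2037, 2411),
    ("ST-LS1 intron", 2412, 2600),
    ("Cas9 coding segment 2", 2601, 6329),
    ("VirD2 NLS", 6330, 6386),
]


def length(start, end):
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def is_contiguous(features):
    """True if every feature starts one position after the previous one ends."""
    return all(nxt[1] == prev[2] + 1 for prev, nxt in zip(features, features[1:]))


lengths = {name: length(start, end) for name, start, end in FEATURES}
cas9_coding_nt = lengths["Cas9 coding segment 1"] + lengths["Cas9 coding segment 2"]

for name, n in lengths.items():
    print(f"{name}: {n} nt")
print("features contiguous:", is_contiguous(FEATURES))
print("Cas9 coding total:", cas9_coding_nt, "nt;",
      "whole codons:", cas9_coding_nt % 3 == 0)
```

Under the inclusive-coordinate assumption, the two coding segments
flank the intron exactly, which is what one would expect for a coding
sequence interrupted by a single intron.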
[0036] FIG. 2A illustrates the duplexed crRNA (SEQ ID
NO:6)-tracrRNA (SEQ ID NO:7)/Cas9 endonuclease system and target
DNA complex relative to the appropriately oriented PAM sequence at
the maize LIGCas-3 (SEQ ID NO: 18, Table 1) target site with
triangles pointing towards the expected site of cleavage on both
sense and anti-sense DNA strands. FIG. 2B illustrates the guide
RNA/Cas9 endonuclease complex interacting with the genomic target
site relative to the appropriately oriented PAM sequence (GGA) at
the maize genomic LIGCas-3 target site (SEQ ID NO:18, Table 1). The
guide RNA (shown as boxed-in in light gray, SEQ ID NO:8) is a
fusion between a crRNA and tracrRNA and comprises a variable
targeting domain that is complementary to one DNA strand of the
double strand DNA genomic target site. The Cas9 endonuclease is
shown in dark gray. Triangles point towards the expected site of
DNA cleavage on both sense and anti-sense DNA strands.
[0037] FIG. 3A-3B shows an alignment and count of the top 10 most
frequent NHEJ mutations induced by the maize optimized guide
RNA/Cas endonuclease system described herein compared to a LIG3-4
homing endonuclease control at the maize genomic Liguleless 1
locus. The mutations were identified by deep sequencing. The
reference sequence represents the unmodified locus with each target
site underlined. The PAM sequence and expected site of cleavage are
also indicated. Deletions or insertions as a result of imperfect
NHEJ are shown by a "-" or an italicized underlined nucleotide,
respectively. The reference and mutations 1-10 of the LIGCas-1
target site correspond to SEQ ID NOs: 55-65, respectively. The
reference and mutations 1-10 of the LIGCas-2 correspond to SEQ ID
NOs: 55, 65-75, respectively. The reference and mutations 1-10 of
the LIGCas-3 correspond to SEQ ID NOs: 76-86, respectively. The
reference and mutations 1-10 of the LIG3-4 homing endonuclease
target site correspond to SEQ ID NOs: 76, 87-96, respectively.
[0038] FIG. 4 illustrates how the homologous recombination (HR)
repair DNA vector (SEQ ID NO: 97) was constructed. To promote
site-specific transgene insertion by homologous recombination, the
transgene (shown in light gray) was flanked on either side by
approximately 1 kb of DNA with homology to the maize genomic
regions immediately adjacent to the LIGCas-3 and LIG3-4 homing
endonuclease expected sites of cleavage.
[0039] FIG. 5 illustrates how genomic DNA extracted from stable
transformants was screened for site-specific transgene insertion by
PCR. Genomic primers (corresponding to SEQ ID NOs: 98 and 101)
within the Liguleless 1 locus were designed outside of the regions
used in constructing the HR repair DNA vector (SEQ ID NO: 97) and
were paired with primers inside the transgene (corresponding to SEQ
ID NOs: 99 and 100) to facilitate PCR detection of unique genomic
DNA junctions created by appropriately oriented site-specific
transgene integration.
[0040] FIG. 6 shows an alignment of the NHEJ mutations induced by
the maize optimized guide RNA/Cas endonuclease system, described
herein, when the short guide RNA was delivered directly as RNA. The
mutations were identified by deep sequencing. The reference
illustrates the unmodified locus with the genomic target site
underlined. The PAM sequence and expected site of cleavage are also
indicated. Deletions or insertions as a result of imperfect NHEJ
are shown by a "-" or an italicized underlined nucleotide,
respectively. The reference and mutations 1-6 for 55CasRNA-1
correspond to SEQ ID NOs: 104-110, respectively.
[0041] FIG. 7 shows the QC782 vector comprising the Cas9 expression
cassette.
[0042] FIG. 8A shows the QC783 vector comprising the guide RNA
expression cassette. FIG. 8B shows the DNA sequence (coding
sequence) of the DD43CR1 (20 bp) variable targeting domain of the
guide RNA, as well as the terminator sequence linked to the guide
RNA. The 20 bp variable targeting domain DD43CR1 is in bold.
[0043] FIG. 9 shows the map of a linked soybean optimized Cas9 and
guide RNA construct QC815.
[0044] FIG. 10A shows the DD20 soybean locus on chromosome 4 and
the DD20CR1 and DD20CR2 genomic target sites (indicated by bold
arrows). FIG. 10B shows the DD43 soybean locus on chromosome 4 and
the DD43CR1 and DD43CR2 genomic target sites (indicated by bold
arrows).
[0045] FIG. 11A-11D. Alignments of expected target site sequences
with mutant target sequences detected in four guide RNA induced
NHEJ experiments. FIG. 11A shows the DD20CR1 PCR amplicon
(reference sequence, SEQ ID NO:142, genomic target site is
underlined) and the 10 mutations (SEQ ID NOs: 147-156) induced by
the guide RNA/Cas endonuclease system at the DD20CR1 genomic target
site. FIG. 11B shows the DD20CR2 PCR amplicon (reference sequence,
SEQ ID NO:143) and the 10 mutations (SEQ ID NOs: 157-166) induced by
the guide RNA/Cas endonuclease system at the DD20CR2 genomic target
site. FIG. 11C shows the DD43CR1 PCR amplicon (reference sequence,
SEQ ID NO:144) and the mutations (SEQ ID NOs:167-176) induced by
the guide RNA/Cas endonuclease system at the DD43CR1 genomic target
site. FIG. 11D shows the DD43CR2 PCR amplicon (reference sequence,
SEQ ID NO: 145) and the 10 mutations (SEQ ID NOs: 177-191) induced
by the guide RNA/Cas endonuclease system at the DD43CR2 genomic
target site. The target sequences corresponding to different guide
RNAs are underlined. Each nucleotide deletion is indicated by "-".
Inserted and replaced sequences are in bold. The total number of
each mutant sequence is listed in the last column.
[0046] FIG. 12A-12B shows a schematic representation of the guide
RNA/Cas endonuclease system used for editing a nucleotide sequence
of interest. To enable specific nucleotide editing, a
polynucleotide modification template that includes at least one
nucleotide modification (when compared to the nucleotide sequence
to be edited) is introduced into a cell together with the guide RNA
and Cas endonuclease expression cassettes. For example, as shown
herein, the nucleotide sequence to be edited is an endogenous wild
type enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene in
maize cells. The Cas endonuclease (shaded circle) is a maize
optimized Cas9 endonuclease that cleaves a moCas9 target sequence
within the epsps genomic locus using a guide RNA of SEQ ID NO:194.
FIG. 12-A shows a polynucleotide modification template that
includes three nucleotide modifications (when compared to the wild
type epsps locus depicted in FIG. 12-B) flanked by two homology
regions HR-1 and HR-2. FIG. 12-B shows the guide RNA/maize
optimized Cas9 endonuclease complex interacting with the epsps
locus. The original nucleotide codons of the EPSPS gene that needed
to be edited are shown as aCT and Cca (FIG. 12-B). The nucleotide
codons with modified nucleotides (shown in capitals) are shown as
aTC and Tca (FIG. 12-B).
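The codon edits described for FIG. 12 can be examined with a small
script. This is a minimal sketch under stated assumptions: the
amino-acid assignments are taken from the standard genetic code
(which the paragraph above does not spell out), only the four codons
named above are included, and the function names are illustrative.
The script counts the nucleotide differences between each original
and edited codon and reports the encoded residues.

```python
# Minimal slice of the standard genetic code covering only the four
# codons named for FIG. 12 (assumption: standard nuclear genetic code).
CODON_TO_AA = {"ACT": "Thr", "ATC": "Ile", "CCA": "Pro", "TCA": "Ser"}

# (original codon, edited codon) pairs as written in the figure description.
EDITS = [("aCT", "aTC"), ("Cca", "Tca")]


def changed_positions(original, edited):
    """0-based positions at which two codons differ, ignoring letter case."""
    a, b = original.upper(), edited.upper()
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def describe(original, edited):
    """Return (original residue, edited residue, number of changed nucleotides)."""
    return (
        CODON_TO_AA[original.upper()],
        CODON_TO_AA[edited.upper()],
        len(changed_positions(original, edited)),
    )


summary = [describe(o, e) for o, e in EDITS]
total_changes = sum(n for _, _, n in summary)

for (o, e), (aa_from, aa_to, n) in zip(EDITS, summary):
    print(f"{o} -> {e}: {aa_from} -> {aa_to}, {n} nucleotide change(s)")
print("total nucleotide changes:", total_changes)
```

Under these assumptions the two codon edits account for the three
nucleotide modifications carried by the polynucleotide modification
template, and the resulting threonine-to-isoleucine and
proline-to-serine changes appear consistent with the "TIPS" label
used for these modifications elsewhere in this description.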
[0047] FIG. 13 shows a diagram of a maize optimized Cas9
endonuclease expression cassette. The bacterial cas9 coding
sequence was codon optimized for expression in maize cells and
supplemented with the ST-LS1 potato intron (moCas9 coding sequence,
SEQ ID NO: 193). A DNA fragment encoding the SV40 nuclear
localization signal (NLS) was fused to the 5'-end of the moCas9
coding sequence. A maize ubiquitin promoter (Ubi promoter) and its
cognate intron (ubi intron) provided controlling elements for the
expression of moCas9 in maize cells. The pinII transcription
termination sequence (pinII) completed the maize moCas9 gene
design.
[0048] FIG. 14 shows some examples of the moCas9 target sequence
(underlined), located on EPSPS DNA fragments, mutagenized by the
introduction of double-strand breaks at the cleavage site of the
moCas9 endonuclease (thick arrow) in maize cells. In SEQ ID NO:
206, three nucleotides were deleted (dashes) next to the moCas9
cleavage site. SEQ ID NOs: 207-208 indicate that the nucleotide
deletion can expand beyond the moCas9 cleavage site.
[0049] FIG. 15 depicts an EPSPS template vector used for delivery
of the EPSPS polynucleotide modification template containing the
three TIPS nucleotide modifications. The EPSPS polynucleotide
modification template includes a partial fragment of the EPSPS
gene. The vector was 6,475 bp in length and consisted of two
homology regions to the epsps locus (epsps-HR1 and epsps-HR2). Two
Gateway cloning sites (ATTL4 and ATTL3), an antibiotic resistance
gene (KAN), and the pUC origin of replication (PUC ORI) completed
synthesis of the EPSPS template vector1.
[0050] FIG. 16 illustrates the PCR-based screening strategy for the
identification of maize events with TIPS nucleotide modifications
in maize cells. Two pairs of PCR primers were used to amplify the
genomic fragments of the epsps locus (upper section). Both of them
contained the TIPS specific primers (an arrow with a dot indicating
the site of the three TIPS modifications). The shorter fragment
(780 bp F-E2) was produced by amplification of the EPSPS
polynucleotide modification template fragment (template detection).
The amplified EPSPS polynucleotide modification template fragment
was found in all but 4 analyzed events (panel F-E2). The longer
fragment (839 bp H-T) was produced by amplification of the genomic
EPSPS sequence provided that the epsps locus contained the three
nucleotide modifications responsible for the TIPS modifications.
Six events were identified as containing the three nucleotide
modifications (panel H-T). The white arrows point to events that
contain both the amplified EPSPS polynucleotide modification
template and the nucleotide modifications responsible for the TIPS
modification.
[0051] FIG. 17A shows a schematic diagram of the PCR protocol used
to identify edited EPSPS DNA fragments in selected events. A
partial genomic fragment, comprising parts of Exon1, Intron 1 and
Exon2 of the epsps locus, was amplified regardless of the editing
product (panel A, 1050 bp F-E3). The amplification products,
representing only partial EPSPS gene sequences having one or more
mutations, were cloned and sequenced. FIG. 17B shows 2 examples of
sequenced amplification products. In some amplification products,
the epsps nucleotides and the moCas9 target sequence (underlined)
were unchanged indicating that one EPSPS allele was not edited
(wild type allele; SEQ ID NO: 210). In other amplification
products, three specific nucleotide substitutions (representing the
TIPS modifications) were identified with no mutations at the moCas9
target sequence (underlined) (SEQ ID NO: 209).
[0052] FIG. 18 shows the location of MHP14, TS8, TS9 and TS10 loci
comprising target sites for the guide RNA/Cas endonuclease system
near trait A (located at 53.14 cM) on chromosome 1 of maize.
[0053] FIG. 19A shows the location of the MHP14Cas-1 maize genomic
target sequence (SEQ ID NO: 229) and the MHP14Cas-3 maize genomic
target sequence (SEQ ID NO: 230) on the MHP14 maize genomic DNA
locus on chromosome1. The 5' to 3' sequence. FIG. 19B shows the
location of the TS8Cas-1 (SEQ ID NO: 231) and TS8Cas-2 (SEQ ID NO:
232) maize genomic target sequences located on the TS8 locus. FIG.
19C shows the location of the TS9Cas-2 (SEQ ID NO: 233) and
TS9Cas-3 (SEQ ID NO: 234) maize genomic target sequences located on
the TS9 locus. FIG. 19D shows the location of the TS10Cas-1 (SEQ ID
NO: 235), and TS10Cas-3 (SEQ ID NO: 236) maize genomic target
sequences located on the TS10 locus. All these maize genomic target
sites are recognized and cleaved by a guide RNA/Cas
endonuclease system described herein. Each maize genomic target
sequence (indicated by an arrow) is highlighted in bold and
followed by the NGG PAM sequence shown boxed in.
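The relationship between a genomic target sequence and the NGG PAM
that follows it can be sketched as a simple sequence scan. The
example below is illustrative only: the function name, the default
20-nucleotide target length (chosen to match the 20 bp variable
targeting domains mentioned for other figures), and the demonstration
sequence are all assumptions made here, and the scan covers only the
strand supplied, not its reverse complement.

```python
import re


def find_ngg_targets(seq, target_len=20):
    """List (start, target, pam) for each target_len-nt window that is
    immediately followed by an NGG PAM on the strand given.

    start is the 0-based position of the first target nucleotide.
    A lookahead is used so that overlapping candidates are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % target_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical demonstration sequence; it is not taken from any SEQ ID NO.
demo = "A" * 20 + "TGG" + "CC"

for start, target, pam in find_ngg_targets(demo):
    print(f"target at {start}: {target} followed by PAM {pam}")
```

A candidate list produced this way would still need to be checked
against the actual loci and sequence listing; the sketch only shows
the positional rule that the target is followed directly by NGG.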
[0054] FIG. 20 shows a schematic of a donor DNA (also referred to
as HR repair DNA) comprising a transgene cassette with a selectable
marker (phosphomannose isomerase, depicted in grey), flanked by
homologous recombination sequences (HR1 and HR2) of about 0.5 to 1
kb in length, used to introduce the transgene cassette into a
genomic target site for the guide RNA/Cas endonuclease system. The
arrows indicate the sections of the genomic DNA sequence on either
side of the endonuclease cleavage site that correspond to the
homologous regions of the donor DNA. This schematic is
representative for homologous recombination occurring at any one of
the 8 target sites (4 loci) located on chromosome 1 from 51.54 cM
to 54.56 cM in the maize genome.
[0055] FIG. 21 shows the junction PCR screen for identification of
insertion events. Primers 1 and 2, located on the transgene donor,
are common to all target sites. Primer TSHR1f is located on the
genomic region outside of the homologous sequence HR1. Primer
combination TSHR1f/primer1 amplifies junction 1. Primer TSHR2r is
located on the genomic region outside of the HR2 region. Primer
combination primer2/TSHR2r amplifies junction 2.
[0056] FIG. 22 shows a junction PCR screen for identification of
insertion events at the TS10Cas10 locus. A gel picture indicates
the presence of insertion events at the TS10Cas10-1 target site
(lane 02 A1). PCR reaction of HR1 and HR2 junction loaded next to
each other (lane 02-white label and lane 02-gray label), with white
label representing HR1 junction PCR, gray label representing HR2
junction PCR.
[0057] FIG. 23 A-B. DNA expression cassettes used in gRNA/Cas9
mediated genome modification experiments. A) The Cas9 endonuclease
cassette (EF1A2:CAS9) comprising a soybean EF1A2 promoter (GM-EF1A2
PRO) driving the soybean codon optimized Cas9 endonuclease
(CAS9(SO)), a soybean optimized SV40 nuclear localization signal
(SV40 NLS(SO)) and a PINII terminator (PINII TERM) was linked to a
guide RNA expression cassette (U6-9.1:DD20CR1, comprising a soybean
U6 promoter driving the DD20CR1 guide RNA) used in experiment
U6-9.1DD20CR1 (Table 27). Other Guide RNA/Cas9 cassettes listed in
Table 27 are identical except for the 20 bp variable targeting
domains of the guide RNA targeting the genomic target sites
DD20CR2, DD43CR1, or DD43CR2. B) The donor DNA cassette
(DD20HR1-SAMS:HPT-DD20HR2) used in experiment U6-9.1DD20CR1 (Table
27). DD20HR1 and DD20HR2 are homologous DNA regions between the donor
DNA cassette and the genomic DNA sequences flanking the DD20 target
site. Other Donor DNA cassettes listed in Table 27 are identical
except for the DD43HR1 and DD43HR2 regions in two of them.
[0058] FIG. 24 A-C. DD20 and DD43 soybean genomic target sites
locations and qPCR amplicons. A) Diagram of Glycine max chromosome
04 indicating relative positions of DD20 and DD43 target sites.
Genetic mapping positions of DD20 and DD43 sites are the positions
of the most nearby genes Glyma04g39780.1 and Glyma04g39550.1. B)
DD20 qPCR 64 bp amplicon 45936307-45936370 from chromosome 04 (SEQ
ID NO: 304). Relative positions of the target sites DD20-CR1 and
DD20-CR2, qPCR primers and probe DD20-F, DD20-R, and DD20-T are
marked. C) DD43 qPCR 115 bp amplicon 45731879-45731993 from
chromosome 04 (SEQ ID NO: 305). Relative positions of the target
sites DD43-CR1 and DD43-CR2, qPCR primers and probe DD43-F2,
DD43-F, DD43-R, and DD43-T are marked.
[0059] FIG. 25 A-C. Schematic of guide RNA/Cas9 system mediated
site-specific non-homologous end joining (NHEJ) and transgene
insertion via homologous recombination (HR) at DD20CR1 site. A)
Soybean plants are co-transformed with guide RNA/Cas9 and donor DNA
cassettes as listed in Table 27. The DD20CR1 guide RNA/Cas9 complex
transcribed from the linked guide RNA/Cas9 DNA cassettes will
cleave specifically the DD20CR1 target site on chromosome 04 to
make DNA double strand breaks. The breaks can be repaired
spontaneously as NHEJs or repaired as a HR event by the donor DNA
facilitated by the flanking homologous regions DD20-HR1 and
DD20HR2. B) NHEJs are detected by DD20-specific qPCR and the
mutated sequences are assessed by sequencing cloned HR1-HR2 PCR
fragments. C) HR events are revealed by two border-specific PCR
analyses HR1-SAMS and NOS-HR2, noting that the primers are only
able to amplify DNA recombined between the DD20CR1 region of
chromosome 04 and the donor DNA. Guide RNA/Cas9 mediated NHEJ and
HR at DD20-CR2 site follow the same process except for using
DD20-CR2 guide RNA. Guide RNA/Cas9 mediated site-specific NHEJ and
HR at DD43CR1 and DD43CR2 sites follow the same process except for
using guide RNA and homologous regions specific to the DD43
sites.
[0060] FIG. 26 A-C. Sequences of gRNA/Cas9 system mediated NHEJs.
Only 60 bp sequences surrounding the genomic target site shown in
bold case are aligned to show the mutations. The PAM sequence is
shown boxed in. Insertion sequences are indicated by symbol marking
the insertion position followed by the size of the insert. Actual
insertion sequences are listed in the sequence listing. A) U6-9.1
DD20CR1 sequences. Three colonies were sequenced for each of 54
events from experiment U6-9.1 DD20CR1. A total of 150 sequences
were returned, of which 26 were found to be short unique deletions
while 2 of the events contained small insertions. B) U6-9.1 DD20CR2
sequences. Three colonies were sequenced for each of 28 events from
experiment U6-9.1 DD20CR2. A total of 84 sequences were returned,
of which 20 were found to be short unique deletions while 1 of the
events contained a single bp insertion. C) U6-9.1DD43CR1 sequences.
Three colonies were sequenced for each of 46 events from experiment
U6-9.1 DD43CR1. A total of 132 sequences were returned, of which 18
were found to be short unique deletions while 10 of the events
contained small insertions. D) U6-9.1DD43CR2 sequences.
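The sequencing counts reported for FIG. 26 can be tabulated as
follows. This is a rough sketch only: it assumes that exactly three
colonies were submitted for every event, as the text suggests, and it
does not attempt to explain why fewer sequences than that were
returned in some experiments. The dictionary keys and function name
are illustrative.

```python
COLONIES_PER_EVENT = 3  # "Three colonies were sequenced for each ... event"

# Event counts and returned-sequence totals as stated for FIG. 26 A-C.
EXPERIMENTS = {
    "U6-9.1 DD20CR1": {"events": 54, "returned": 150},
    "U6-9.1 DD20CR2": {"events": 28, "returned": 84},
    "U6-9.1 DD43CR1": {"events": 46, "returned": 132},
}


def attempted(events, colonies_per_event=COLONIES_PER_EVENT):
    """Number of sequencing reactions implied by the per-event colony count."""
    return events * colonies_per_event


# Map each experiment to (sequences attempted, sequences returned).
report = {
    name: (attempted(data["events"]), data["returned"])
    for name, data in EXPERIMENTS.items()
}

for name, (tried, got) in report.items():
    print(f"{name}: {got} of {tried} sequences returned ({got / tried:.1%})")
```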
[0061] FIG. 27 A-C shows the ten most prevalent types of NHEJ
mutations recovered based on the crRNA/tracrRNA/Cas endonuclease
system. FIG. 27A shows NHEJ mutations for the LIGCas-1 target site
(corresponding to SEQ ID NOs: 415-424), FIG. 27B shows NHEJ
mutations for the LIGCas-2 target site (corresponding to SEQ ID NOs:
425-434), and FIG. 27C shows NHEJ mutations for the LIGCas-3 target
site (corresponding to SEQ ID NOs: 435-444).
[0062] FIG. 28. Schematic representation of Zm-GOS2 PRO:GOS2 INTRON
insertion in the 5'-UTR of maize ARGOS8 gene by targeting the guide
RNA/Cas9 target sequence 1 (CTS1, SEQ ID NO: 1) with the gRNA1/Cas9
endonuclease system, described herein. HR1 and HR2 indicate
homologous recombination regions.
[0063] FIG. 29 A-C. Identification and analysis of Zm-GOS2 PRO:GOS2
INTRON insertion events in maize plants. (A) Schematic
representation of Zm-GOS2 PRO:GOS2 INTRON insertion in the 5'-UTR
of Zm-ARGOS8. CTS1 was targeted with the gRNA1/Cas9 endonuclease
system, described herein. HR1 and HR2 indicate homologous
recombination regions. P1 to P4 indicate PCR primers. (B) PCR
screening of PMI-resistant calli to identify insertion events. PCR
results are shown for 13 representative calli. The left and right
junction PCRs were carried out with the primer pairs P1+P2 and
P3+P4, respectively. (C) PCR analysis of a T0 plant. A PCR product
with the expected size (2.4 kb, Lane T0) was amplified with
primers P3 and P4.
[0064] FIG. 30. Schematic representation of Zm-ARGOS8 promoter
substitution with Zm-GOS2 PRO:GOS2 INTRON by targeting CTS3 (SEQ ID
NO: 3) and CTS2 (SEQ ID NO:2). HR1 and HR2 indicate homologous
recombination regions.
[0065] FIG. 31 A-D. Substitution of the native promoter of the
ARGOS8 gene with Zm-GOS2 PRO:GOS2 INTRON in maize plants. (A)
Schematic representation of the Zm-GOS2 PRO:GOS2 INTRON:ARGOS8
allele generated by promoter swap. Two guide RNA/Cas9 target sites,
CTS3 (SEQ ID NO:3) and CTS2 (SEQ ID NO:2), were targeted with a
gRNA3/gRNA2/Cas9 system. HR1 and HR2 indicate homologous
recombination regions. P1 to P5 indicate PCR primers. (B) PCR
screening of PMI-resistant calli to identify swap events. PCR
results are shown for 10 representative calli. One callus sample,
12A09, is positive for both left junction (L, primers P1+P2) and
right junction (R, primers P5+P4) PCR, indicating that 12A09 is a
swap event. (C) PCR analysis of the callus events identified in
primary screening. PCR products with the expected size (2.4 kb)
were amplified using primers P3 and P4 from events #3, 4, 6, 8
and 9, indicating presence of the Zm-GOS2 PRO:GOS2 INTRON:ARGOS8
allele. (D) PCR analysis of a T0 plant. A PCR product with the
expected size (2.4 kb, Lane T0) was amplified with primers P3
and P4.
[0066] FIG. 32 A-B. Deletion of the native promoter of the ARGOS8
gene in maize plants. (A) Schematic representation of promoter
deletion. Two guide RNAs and a Cas9 endonuclease system, referred
to as a gRNA3/gRNA2/Cas9 system, were used to target the CTS3 and
CTS2 sites in Zm-ARGOS8. P1 and P4 indicate PCR primers for
deletion event screening. (B) PCR screening of PMI-resistant calli
to identify deletion events. PCR results are shown for 15
representative calli. A 1.1-kb PCR product indicates deletion of
the CTS3/CTS2 fragment.
[0067] FIG. 33. Schematic representation of enhancer element
deletions using the guide RNA/Cas9 target sequence. The enhancer
element to be deleted can be, but is not limited to, a 35S enhancer
element.
[0068] FIG. 34 A-C. Modification of a maize EPSPS
polyubiquitination site. (A) The selected maize EPSPS
polyubiquitination site is compared to the analogous sites of other
plant species. (B) The nucleotides to be edited in the maize EPSPS
coding sequence (underlined, encoded amino acid shown in bold). (C)
The edited EPSPS coding sequence identified in the selected T0
plant.
[0069] FIG. 35 A-C. The intron mediated enhanced element (A). The
5' section of the first intron of the EPSPS gene (editing:
substitutions underlined and deletions represented by dots) (B) and
its edited version conferring three IME elements (underlined). The
edited nucleotides are shown in bold (C).
[0070] FIG. 36 A-B. Alternatively spliced EPSPS mRNA in maize
cells. (A) left panel represents analysis of EPSPS cDNA. The lane
I4 in FIG. 36A shows amplification of the EPSPS pre-mRNA containing
the 3rd intron unspliced (the 804 bp diagnostic fragment as
shown in FIG. 36 B indicates an alternate splicing event). Lanes E3
and F8 show the EPSPS PCR amplified fragments with spliced introns.
These diagnostic fragments are not amplified unless cDNA is
synthesized (as is evident from the absence of bands in lanes E3, I4,
and F8 comprising total RNA, shown in the total RNA panel on the right
of FIG. 36A). The grey boxes in FIG. 36 B represent the eight EPSPS
exons (their sizes are indicated above each of them).
[0071] FIG. 37. Splicing site at the junction between the second
EPSPS intron and the third exon (bolded). The nucleotide to be
edited is underlined.
[0072] FIG. 38. Schematic representation of Southern hybridization
analysis of T0 and T1 maize plants.
SEQUENCES
[0073] SEQ ID NO: 1 is the nucleotide sequence of the Cas9 gene
from Streptococcus pyogenes M1 GAS (SF370).
[0074] SEQ ID NO: 2 is the nucleotide sequence of the potato ST-LS1
intron.
[0075] SEQ ID NO: 3 is the amino acid sequence of SV40 amino
N-terminal.
[0076] SEQ ID NO: 4 is the amino acid sequence of Agrobacterium
tumefaciens bipartite VirD2 T-DNA border endonuclease carboxyl
terminal.
[0077] SEQ ID NO: 5 is the nucleotide sequence of an expression
cassette expressing the maize optimized Cas9.
[0078] SEQ ID NO: 6 is the nucleotide sequence of crRNA containing
the LIGCas-3 target sequence in the variable targeting domain.
[0079] SEQ ID NO: 7 is the nucleotide sequence of the tracrRNA.
[0080] SEQ ID NO: 8 is the nucleotide sequence of a long guide RNA
containing the LIGCas-3 target sequence in the variable targeting
domain.
[0081] SEQ ID NO: 9 is the nucleotide sequence of the Chromosome 8
maize U6 polymerase III promoter.
[0082] SEQ ID NO: 10 lists two copies of the nucleotide sequence of
the maize U6 polymerase III terminator.
[0083] SEQ ID NO: 11 is the nucleotide sequence of the maize
optimized short guide RNA containing the LIGCas-3 variable
targeting domain.
[0084] SEQ ID NO: 12 is the nucleotide sequence of the maize
optimized long guide RNA expression cassette containing the
LIGCas-3 variable targeting domain.
[0085] SEQ ID NO: 13 is the nucleotide sequence of the Maize
genomic target site MS26Cas-1 plus PAM sequence.
[0086] SEQ ID NO: 14 is the nucleotide sequence of the Maize
genomic target site MS26Cas-2 plus PAM sequence.
[0087] SEQ ID NO: 15 is the nucleotide sequence of the Maize
genomic target site MS26Cas-3 plus PAM sequence.
[0088] SEQ ID NO: 16 is the nucleotide sequence of the Maize
genomic target site LIGCas-2 plus PAM sequence.
[0089] SEQ ID NO: 17 is the nucleotide sequence of the Maize
genomic target site LIGCas-3 plus PAM sequence.
[0090] SEQ ID NO: 18 is the nucleotide sequence of the Maize
genomic target site LIGCas-4 plus PAM sequence.
[0091] SEQ ID NO: 19 is the nucleotide sequence of the Maize
genomic target site MS45Cas-1 plus PAM sequence.
[0092] SEQ ID NO: 20 is the nucleotide sequence of the Maize
genomic target site MS45Cas-2 plus PAM sequence.
[0093] SEQ ID NO: 21 is the nucleotide sequence of the Maize
genomic target site MS45Cas-3 plus PAM sequence.
[0094] SEQ ID NO: 22 is the nucleotide sequence of the Maize
genomic target site ALSCas-1 plus PAM sequence.
[0095] SEQ ID NO: 23 is the nucleotide sequence of the Maize
genomic target site ALSCas-2 plus PAM sequence.
[0096] SEQ ID NO: 24 is the nucleotide sequence of the Maize
genomic target site ALSCas-3 plus PAM sequence.
[0097] SEQ ID NO: 25 is the nucleotide sequence of the Maize
genomic target site EPSPSCas-1 plus PAM sequence.
[0098] SEQ ID NO: 26 is the nucleotide sequence of the Maize
genomic target site EPSPSCas-2 plus PAM sequence.
[0099] SEQ ID NO: 27 is the nucleotide sequence of the Maize
genomic target site EPSPSCas-3 plus PAM sequence.
[0100] SEQ ID NOs: 28-52 are the nucleotide sequences of target site
specific forward primers for primary PCR as shown in Table 2.
[0101] SEQ ID NO: 53 is the nucleotide sequence of the forward
primer for secondary PCR.
[0102] SEQ ID NO: 54 is the nucleotide sequence of the reverse
primer for secondary PCR.
[0103] SEQ ID NO: 55 is the nucleotide sequence of the unmodified
reference sequence for LIGCas-1 and LIGCas-2 locus.
[0104] SEQ ID NOs: 56-65 are the nucleotide sequences of mutations
1-10 for LIGCas-1.
[0105] SEQ ID NOs: 66-75 are the nucleotide sequences of mutations
1-10 for LIGCas-2.
[0106] SEQ ID NO: 76 is the nucleotide sequence of the unmodified
reference sequence for the LIGCas-3 and LIG3-4 homing endonuclease
locus.
[0107] SEQ ID NOs: 77-86 are the nucleotide sequences of mutations
1-10 for LIGCas-3.
[0108] SEQ ID NOs: 88-96 are the nucleotide sequences of mutations
1-10 for LIG3-4 homing endonuclease locus.
[0109] SEQ ID NO: 97 is the nucleotide sequence of a donor vector
referred to as an HR Repair DNA.
[0110] SEQ ID NO: 98 is the nucleotide sequence of forward PCR
primer for site-specific transgene insertion at junction 1.
[0111] SEQ ID NO: 99 is the nucleotide sequence of reverse PCR
primer for site-specific transgene insertion at junction 1.
[0112] SEQ ID NO: 100 is the nucleotide sequence of forward PCR
primer for site-specific transgene insertion at junction 2.
[0113] SEQ ID NO: 101 is the nucleotide sequence of reverse PCR
primer for site-specific transgene insertion at junction 2.
[0114] SEQ ID NO: 102 is the nucleotide sequence of the linked Cas9
endonuclease and LIGCas-3 long guide RNA expression cassettes
[0115] SEQ ID NO: 103 is the nucleotide sequence of Maize genomic
target site 55CasRNA-1 plus PAM sequence.
[0116] SEQ ID NO: 104 is the nucleotide sequence of the unmodified
reference sequence for 55CasRNA-1 locus.
[0117] SEQ ID NOs: 105-110 are the nucleotide sequences of
mutations 1-6 for 55CasRNA-1.
[0118] SEQ ID NO: 111 is the nucleotide sequence of LIG3-4 homing
endonuclease target site
[0119] SEQ ID NO: 112 is the nucleotide sequence of LIG3-4 homing
endonuclease coding sequence.
[0120] SEQ ID NO: 113 is the nucleotide sequence of the MS26++
homing endonuclease target site.
[0121] SEQ ID NO: 114 is the nucleotide sequence of MS26++ homing
endonuclease coding sequence
[0122] SEQ ID NO: 115 is the nucleotide sequence of the soybean
codon optimized Cas9 gene.
[0123] SEQ ID NO: 116 is the nucleotide sequence of the soybean
constitutive promoter GM-EF1A2.
[0124] SEQ ID NO: 117 is the nucleotide sequence of linker SV40
NLS.
[0125] SEQ ID NO: 118 is the amino acid sequence of soybean
optimized Cas9 with a SV40 NLS.
[0126] SEQ ID NO: 119 is the nucleotide sequence of vector
QC782.
[0127] SEQ ID NO: 120 is the nucleotide sequence of soybean U6
polymerase III promoter described herein, GM-U6-13.1 PRO.
[0128] SEQ ID NO: 121 is the nucleotide sequence of the guide RNA
in FIG. 8B.
[0129] SEQ ID NO: 122 is the nucleotide sequence of vector
QC783.
[0130] SEQ ID NO: 123 is the nucleotide sequence of vector
QC815.
[0131] SEQ ID NO: 124 is the nucleotide sequence of a Cas9
endonuclease (cas9-2) from S. pyogenes.
[0132] SEQ ID NO: 125 is the nucleotide sequence of the DD20CR1
soybean target site.
[0133] SEQ ID NO: 126 is the nucleotide sequence of the DD20CR2
soybean target site.
[0134] SEQ ID NO: 127 is the nucleotide sequence of the DD43CR1
soybean target site.
[0135] SEQ ID NO: 128 is the nucleotide sequence of the DD43CR2
soybean target site.
[0136] SEQ ID NO: 129 is the nucleotide sequence of the DD20
sequence in FIG. 10A.
[0137] SEQ ID NO: 130 is the nucleotide sequence of the DD20
sequence complementary in FIG. 10A.
[0138] SEQ ID NO: 131 is the nucleotide sequence of DD43
sequence.
[0139] SEQ ID NO: 132 is the nucleotide sequence of the DD43
complementary sequence.
[0140] SEQ ID NO: 133-141 are primer sequences.
[0141] SEQ ID NO: 142 is the nucleotide sequence of the DD20CR1 PCR
amplicon.
[0142] SEQ ID NO: 143 is the nucleotide sequence of the DD20CR2 PCR
amplicon.
[0143] SEQ ID NO: 144 is the nucleotide sequence of the DD43CR1 PCR
amplicon.
[0144] SEQ ID NO: 145 is the nucleotide sequence of the DD43CR2 PCR
amplicon.
[0145] SEQ ID NO: 146 is the nucleotide sequence of the DD43CR2 PCR
amplicon.
[0146] SEQ ID NOs: 147-156 are the nucleotide sequences of mutations
1 to 10 for the DD20CR1 target site.
[0147] SEQ ID NOs: 157-166 are the nucleotide sequences of mutations
1 to 10 for the DD20CR2 target site.
[0148] SEQ ID NOs: 167-176 are the nucleotide sequences of mutations
1 to 10 for the DD43CR1 target site.
[0149] SEQ ID NOs: 177-191 are the nucleotide sequences of mutations
1 to 10 for the DD43CR2 target site.
[0150] SEQ ID NO: 192 is the amino acid sequence of a maize
optimized version of the Cas9 protein.
[0151] SEQ ID NO: 193 is the nucleotide sequence of the maize
optimized version of the Cas9 gene of SEQ ID NO: 192.
[0152] SEQ ID NO: 194 is the DNA version of guide RNA (EPSPS
sgRNA).
[0153] SEQ ID NO: 195 is the EPSPS polynucleotide modification
template.
[0154] SEQ ID NO: 196 is a nucleotide fragment comprising the TIPS
nucleotide modifications.
[0155] SEQ ID NO: 197-204 are primer sequences shown in Table
15.
[0156] SEQ ID NO: 205-208 are nucleotide fragments shown in FIG.
14.
[0157] SEQ ID NO: 209 is an example of a TIPS edited EPSPS
nucleotide sequence fragment shown in FIG. 17.
[0158] SEQ ID NO: 210 is an example of a Wild-type EPSPS nucleotide
sequence fragment shown in FIG. 17.
[0159] SEQ ID NO: 211 is the nucleotide sequence of a maize
enolpyruvylshikimate-3-phosphate synthase (epsps) locus
[0160] SEQ ID NO: 212 is the nucleotide sequence of a Cas9
endonuclease (GenBank CS571758.1) from S. thermophilus.
[0161] SEQ ID NO: 213 is the nucleotide sequence of a Cas9
endonuclease (GenBank CS571770.1) from S. thermophilus.
[0162] SEQ ID NO: 214 is the nucleotide sequence of a Cas9
endonuclease (GenBank CS571785.1) from S. agalactiae.
[0163] SEQ ID NO: 215 is the nucleotide sequence of a Cas9
endonuclease (GenBank CS571790.1) from S. agalactiae.
[0164] SEQ ID NO: 216 is the nucleotide sequence of a Cas9
endonuclease (GenBank CS571790.1) from S. mutans.
[0165] SEQ ID NOs: 217-228 are primer and probe nucleotide
sequences described in Example 17.
[0166] SEQ ID NO: 229 is the nucleotide sequence of the MHP14Cas1
target site.
[0167] SEQ ID NO: 230 is the nucleotide sequence of the MHP14Cas3
target site.
[0168] SEQ ID NO: 231 is the nucleotide sequence of the TS8Cas1
target site.
[0169] SEQ ID NO: 232 is the nucleotide sequence of the TS8Cas2
target site.
[0170] SEQ ID NO: 233 is the nucleotide sequence of the TS9Cas2
target site.
[0171] SEQ ID NO: 234 is the nucleotide sequence of the TS9Cas3
target site.
[0172] SEQ ID NO: 235 is the nucleotide sequence of the TS10Cas1
target site.
[0173] SEQ ID NO: 236 is the nucleotide sequence of the TS10Cas3
target site.
[0174] SEQ ID NOs: 237-244 are the nucleotide sequences shown in
FIG. 19A-D.
[0175] SEQ ID NOs: 245-252 are the nucleotide sequences of the
guide RNA expression cassettes described in Example 18.
[0176] SEQ ID NOs: 253-260 are the nucleotide sequences of donor
DNA expression cassettes described in Example 18.
[0177] SEQ ID NOs: 261-270 are the nucleotide sequences of the
primers described in Example 18.
[0178] SEQ ID NOs: 271-294 are the nucleotide sequences of the
primers and probes described in Example 18.
[0179] SEQ ID NO: 295 is the nucleotide sequence of GM-U6-13.1 PRO,
a soybean U6 polymerase III promoter described herein.
[0180] SEQ ID NOs: 298, 300, 301 and 303 are the nucleotide
sequences of the linked guideRNA/Cas9 expression cassettes.
[0181] SEQ ID NOs: 299 and 302 are the nucleotide sequences of the
donor DNA expression cassettes.
[0182] SEQ ID NOs: 271-294 are the nucleotide sequences of the
primers and probes described in Example 18.
[0183] SEQ ID NO: 304 is the nucleotide sequence of the DD20 qPCR
amplicon.
[0184] SEQ ID NO: 305 is the nucleotide sequence of the DD43 qPCR
amplicon.
[0185] SEQ ID NOs: 306-328 are the nucleotide sequences of the
primers and probes described herein.
[0186] SEQ ID NOs: 329-334 are the nucleotide sequences of PCR
amplicons described herein.
[0187] SEQ ID NO: 335 is the nucleotide sequence of a soybean
genomic region comprising the DD20CR1 target site.
[0188] SEQ ID NO: 364 is the nucleotide sequence of a soybean
genomic region comprising the DD20CR2 target site.
[0189] SEQ ID NO: 386 is the nucleotide sequence of a soybean
genomic region comprising the DD43CR1 target site.
[0190] SEQ ID NOs: 336-363, 365-385 and 387-414 are the nucleotide
sequences shown in FIG. 26 A-C.
[0191] SEQ ID NOs: 415-444 are the nucleotide sequences of NHEJ
mutations recovered based on the crRNA/tracrRNA/Cas endonuclease
system shown in FIG. 27A-C.
[0192] SEQ ID NOs: 445-447 are the nucleotide sequences of the
LIGCas-1, LIGCas-2 and LIGCas-3 crRNA expression cassettes,
respectively.
[0193] SEQ ID NO: 448 is the nucleotide sequence of the tracrRNA
expression cassette.
[0194] SEQ ID NO: 449 is the nucleotide sequence of LIGCas-2
forward primer for primary PCR
[0195] SEQ ID NO: 450 is the nucleotide sequence of LIGCas-3
forward primer for primary PCR.
[0196] SEQ ID NO: 451 is the nucleotide sequence of the maize
genomic Cas9 endonuclease target site Zm-ARGOS8-CTS1.
[0197] SEQ ID NO: 452 is the nucleotide sequence of the maize
genomic Cas9 endonuclease target site Zm-ARGOS8-CTS2.
[0198] SEQ ID NO: 453 is the nucleotide sequence of the maize
genomic Cas9 endonuclease target site Zm-ARGOS8-CTS3
[0199] SEQ ID NOs: 454-458 are the nucleotide sequence of primers
P1, P2, P3, P4, P5, respectively.
[0200] SEQ ID NO: 459 is the nucleotide sequence of a Primer
Binding Site (PBS), a sequence to facilitate event screening.
[0201] SEQ ID NO: 460 is the nucleotide sequence of the Zm-GOS2
PRO-GOS2 INTRON, the maize GOS2 promoter and GOS2 intron1 including
the promoter, 5'-UTR1, INTRON1 and 5'-UTR2.
[0202] SEQ ID NO: 461 is the nucleotide sequence of the maize
Zm-ARGOS8 promoter.
[0203] SEQ ID NO: 462 is the nucleotide sequence of the maize
Zm-ARGOS8 5'-UTR.
[0204] SEQ ID NO: 463 is the nucleotide sequence of the maize
Zm-ARGOS8 codon sequence
[0205] SEQ ID NO: 464 is the nucleotide sequence of the maize
Zm-GOS2 gene, including promoter, 5'-UTR, CDS, 3'-UTR and
introns.
[0206] SEQ ID NO: 465 is the nucleotide sequence of the maize
Zm-GOS2 PRO promoter.
[0207] SEQ ID NO: 466 is the nucleotide sequence of the maize GOS2
INTRON, maize GOS2 5'-UTR1 and intron1 and 5'-UTR2.
[0208] SEQ ID NOs: 467-468, 490-491, 503-504 are the nucleotide
sequence of the soybean genomic Cas endonuclease target sequences
soy EPSPS-CR1, soy EPSPS-CR2, soy EPSPS-CR4, soy EPSPS-CR5, soy
EPSPS-CR6, soy EPSPSCR7, respectively
[0209] SEQ ID NO: 469 is the nucleotide sequence of the soybean U6
small nuclear RNA promoter GM-U6-13.1.
[0210] SEQ ID NOs: 470, 471 are the nucleotide sequences of the
QC868, QC879 plasmids, respectively.
[0211] SEQ ID NOs: 472, 473, 492, 493, 494, 505, 506, 507 are the
nucleotide sequences of the RTW1013A, RTW1012A, RTW1199, RTW1200,
RTW1190A, RTW1201, RTW1202, RTW1192A respectively.
[0212] SEQ ID NOs: 474-488, 495-502, 508-512 are the nucleotide
sequences of primers and probes.
[0213] SEQ ID NO: 489 is the nucleotide sequence of the soybean
codon optimized Cas9.
[0214] SEQ ID NO: 513 is the nucleotide sequence of the 35S
enhancer.
[0215] SEQ ID NO: 514 is the nucleotide sequence of the 35S-CRTS
for gRNA1 at 163-181 (including pam at 3' end).
[0216] SEQ ID NO: 515 is the nucleotide sequence of the 35S-CRTS
for gRNA2 at 295-319 (including pam at 3' end).
[0217] SEQ ID NO: 516 is the nucleotide sequence of the 35S-CRT for
gRNA3 at 331-350 (including pam at 3' end).
[0218] SEQ ID NO: 517 is the nucleotide sequence of the EPSPS-K90R
template.
[0219] SEQ ID NO: 518 is the nucleotide sequence of the EPSPS-IME
template.
[0220] SEQ ID NO: 519 is the nucleotide sequence of the
EPSPS-Tspliced template.
[0221] SEQ ID NO: 520 is the amino acid sequence of ZM-RAP2.7
peptide
[0222] SEQ ID NO: 521 is the nucleotide sequence ZM-RAP2.7 coding
DNA sequence
SEQ ID NO: 522 is the amino acid sequence of the ZM-NPK1B peptide.
[0223] SEQ ID NO: 523 is the nucleotide sequence of the ZM-NPK1B
coding DNA sequence.
[0224] SEQ ID NO: 524 is the nucleotide sequence of the RAB17
promoter.
[0225] SEQ ID NO: 525 is the amino acid sequence of the Maize
FTM1.
[0226] SEQ ID NO: 526 is the nucleotide sequence of the Maize FTM1
coding DNA sequence.
[0227] SEQ ID NOs: 527-532 are the nucleotide sequences shown in
FIGS. 34, 35 and 37.
[0228] SEQ ID NOs: 533-534 are the nucleotide sequences of the
Southern genomic probe and Southern MoPAT probe of FIG. 38,
respectively. SEQ ID NOs: 535-541 are the nucleotide sequences of
the RF-FPCas-1, RF-FPCas-2, ALSCas-4, ALS modification repair
template 804, ALS modification repair template 127, ALS
Forward_primer and ALS Reverse_primer, respectively.
[0229] SEQ ID NOs: 542-549 are the nucleotide sequences of the soy
ALS1-CR1, Cas9 target sequence, soy ALS2-CR2, Cas9 target sequence,
QC880, QC881, RTW1026A, WOL900, Forward_primer, WOL578,
Reverse_primer and WOL573, Forward_primer, respectively.
[0230] SEQ ID NO: 550 is the nucleotide sequence of a maize ALS
protein.
DETAILED DESCRIPTION
[0231] The present disclosure includes compositions and methods for
genome modification of a target sequence in the genome of a plant
or plant cell, for selecting plants, for gene editing, and for
inserting a polynucleotide of interest into the genome of a plant.
The methods employ a guide RNA/Cas endonuclease system, wherein the
Cas endonuclease is guided by the guide RNA to recognize and
optionally introduce a double strand break at a specific target
site into the genome of a cell. The guide RNA/Cas endonuclease
system provides for an effective system for modifying target sites
within the genome of a plant, plant cell or seed. Further provided
are methods and compositions employing a guide polynucleotide/Cas
endonuclease system to provide an effective system for modifying
target sites within the genome of a cell and for editing a
nucleotide sequence in the genome of a cell. Once a genomic target
site is identified, a variety of methods can be employed to further
modify the target sites such that they contain a variety of
polynucleotides of interest. Breeding methods utilizing a two
component guide RNA/Cas endonuclease system are also disclosed.
Compositions and methods are also provided for editing a nucleotide
sequence in the genome of a cell. The nucleotide sequence to be
edited (the nucleotide sequence of interest) can be located within
or outside a target site that is recognized by a Cas
endonuclease.
[0232] CRISPR loci (Clustered Regularly Interspaced Short
Palindromic Repeats) (also known as SPIDRs--SPacer Interspersed
Direct Repeats) constitute a family of recently described DNA loci.
CRISPR loci consist of short and highly conserved DNA repeats
(typically 24 to 40 bp, repeated from 1 to 140 times--also referred
to as CRISPR-repeats) which are partially palindromic. The repeated
sequences (usually specific to a species) are interspaced by
variable sequences of constant length (typically 20 to 58 bp
depending on the CRISPR locus) (WO2007/025097 published Mar. 1,
2007).
[0233] CRISPR loci were first recognized in E. coli (Ishino et al.
(1987) J. Bacteriol. 169:5429-5433; Nakata et al. (1989) J.
Bacteriol. 171:3553-3556). Similar interspersed short sequence
repeats have been identified in Haloferax mediterranei,
Streptococcus pyogenes, Anabaena, and Mycobacterium tuberculosis
(Groenen et al. (1993) Mol. Microbiol. 10:1057-1065; Hoe et al.
(1999) Emerg. Infect. Dis. 5:254-263; Masepohl et al. (1996)
Biochim. Biophys. Acta 1307:26-30; Mojica et al. (1995) Mol.
Microbiol. 17:85-93). The CRISPR loci differ from other SSRs by the
structure of the repeats, which have been termed short regularly
spaced repeats (SRSRs) (Janssen et al. (2002) OMICS J. Integ. Biol.
6:23-33; Mojica et al. (2000) Mol. Microbiol. 36:244-246). The
repeats are short elements that occur in clusters, that are always
regularly spaced by variable sequences of constant length (Mojica
et al. (2000) Mol. Microbiol. 36:244-246).
[0234] Cas gene includes a gene that is generally coupled,
associated or close to or in the vicinity of flanking CRISPR loci.
The terms "Cas gene" and "CRISPR-associated (Cas) gene" are used
interchangeably herein. A comprehensive review of the Cas protein
family is presented in Haft et al. (2005) Computational Biology,
PLoS Comput Biol 1(6): e60. doi:10.1371/journal.pcbi.0010060.
[0235] That review describes 41 CRISPR-associated (Cas) gene
families, in addition to the four previously known
gene families. It shows that CRISPR systems belong to different
classes, with different repeat patterns, sets of genes, and species
ranges. The number of Cas genes at a given CRISPR locus can vary
between species.
[0236] Cas endonuclease relates to a Cas protein encoded by a Cas
gene, wherein said Cas protein is capable of introducing a double
strand break into a DNA target sequence. The Cas endonuclease is
guided by the guide polynucleotide to recognize and optionally
introduce a double strand break at a specific target site into the
genome of a cell. As used herein, the term "guide
polynucleotide/Cas endonuclease system" includes a complex of a Cas
endonuclease and a guide polynucleotide that is capable of
introducing a double strand break into a DNA target sequence. The
Cas endonuclease unwinds the DNA duplex in close proximity of the
genomic target site and cleaves both DNA strands upon recognition
of a target sequence by a guide RNA, but only if the correct
protospacer-adjacent motif (PAM) is appropriately oriented at the
3' end of the target sequence (FIG. 2A, FIG. 2B).
[0237] In one embodiment, the Cas endonuclease gene is a Cas9
endonuclease, such as but not limited to, Cas9 genes listed in SEQ
ID NOs: 462, 474, 489, 494, 499, 505, and 518 of WO2007/025097
published Mar. 1, 2007, and incorporated herein by reference. In
another embodiment, the Cas endonuclease gene is plant, maize or
soybean optimized Cas9 endonuclease (FIG. 1A). In another
embodiment, the Cas endonuclease gene is operably linked to a SV40
nuclear targeting signal upstream of the Cas codon region and a
bipartite VirD2 nuclear localization signal (Tinland et al. (1992)
Proc. Natl. Acad. Sci. USA 89:7442-6) downstream of the Cas codon
region.
[0238] In one embodiment, the Cas endonuclease gene is a Cas9
endonuclease gene of SEQ ID NO:1, 124, 212, 213, 214, 215, 216, 193
or nucleotides 2037-6329 of SEQ ID NO:5, or any functional fragment
or variant thereof.
[0239] The terms "functional fragment", "fragment that is
functionally equivalent" and "functionally equivalent fragment" are
used interchangeably herein. These terms refer to a portion or
subsequence of the Cas endonuclease sequence of the present
disclosure in which the ability to create a double-strand break is
retained.
[0240] The terms "functional variant", "variant that is
functionally equivalent" and "functionally equivalent variant" are
used interchangeably herein. These terms refer to a variant of the
Cas endonuclease of the present disclosure in which the ability
to create a double-strand break is retained. Fragments and variants
can be obtained via methods such as site-directed mutagenesis and
synthetic construction.
[0241] In one embodiment, the Cas endonuclease gene is a plant
codon optimized Streptococcus pyogenes Cas9 gene that can recognize
any genomic sequence of the form N(12-30)NGG, so that any such
sequence can in principle be targeted.
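As an illustration only, the N(12-30)NGG form described above can be scanned for in a DNA string with a short script. The following is a minimal sketch and not part of the disclosed methods: the function name find_ngg_targets and the default protospacer length of 20 are assumptions made for the example, and only the strand that is supplied is scanned.

```python
import re


def find_ngg_targets(sequence, protospacer_len=20):
    """Return (start, protospacer, pam) tuples for N(protospacer_len)NGG sites.

    Hypothetical helper for illustration; scans only the strand given.
    """
    if not 12 <= protospacer_len <= 30:
        raise ValueError("protospacer_len must be between 12 and 30")
    seq = sequence.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]
```

For example, find_ngg_targets("ACGTACGTACGTAGG", protospacer_len=12) reports a single candidate starting at position 0 with PAM AGG. A practical screen would presumably also scan the reverse complement and filter candidates for uniqueness in the genome.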
[0242] In one embodiment, the Cas endonuclease is introduced
directly into a cell by any method known in the art, for example,
but not limited to transient introduction methods, transfection
and/or topical application.
[0243] Endonucleases are enzymes that cleave the phosphodiester
bond within a polynucleotide chain, and include restriction
endonucleases that cleave DNA at specific sites without damaging
the bases. Restriction endonucleases include Type I, Type II, Type
III, and Type IV endonucleases, which further include subtypes. In
the Type I and Type III systems, both the methylase and restriction
activities are contained in a single complex. Endonucleases also
include meganucleases, also known as homing endonucleases (HEases),
which like restriction endonucleases, bind and cut at a specific
recognition site; however, the recognition sites for meganucleases
are typically longer, about 18 bp or more (patent application
WO-PCT PCT/US12/30061 filed on Mar. 22, 2012). Meganucleases have
been classified into four families based on conserved sequence
motifs: the LAGLIDADG, GIY-YIG, H-N-H, and His-Cys
box families. These motifs participate in the coordination of metal
ions and hydrolysis of phosphodiester bonds. HEases are notable for
their long recognition sites, and for tolerating some sequence
polymorphisms in their DNA substrates. The naming convention for
meganucleases is similar to the convention for other restriction
endonucleases. Meganucleases are also characterized by prefix F-,
I-, or PI- for enzymes encoded by free-standing ORFS, introns, and
inteins, respectively. One step in the recombination process
involves polynucleotide cleavage at or near the recognition site.
This cleaving activity can be used to produce a double-strand
break. For reviews of site-specific recombinases and their
recognition sites, see, Sauer (1994) Curr Op Biotechnol 5:521-7;
and Sadowski (1993) FASEB 7:760-7. In some examples the recombinase
is from the Integrase or Resolvase families.
[0244] TAL effector nucleases are a new class of sequence-specific
nucleases that can be used to make double-strand breaks at specific
target sequences in the genome of a plant or other organism.
(Miller et al. (2011) Nature Biotechnology 29:143-148). Zinc finger
nucleases (ZFNs) are engineered double-strand break inducing agents
comprised of a zinc finger DNA binding domain and a
double-strand-break-inducing agent domain. Recognition site
specificity is conferred by the zinc finger domain, which typically
comprises two, three, or four zinc fingers, for example having a
C2H2 structure; however, other zinc finger structures are known and
have been engineered. Zinc finger domains are amenable for
designing polypeptides which specifically bind a selected
polynucleotide recognition sequence. ZFNs include an engineered
DNA-binding zinc finger domain linked to a non-specific
endonuclease domain, for example a nuclease domain from a Type IIs
endonuclease such as FokI. Additional functionalities can be fused
to the zinc-finger binding domain, including transcriptional
activator domains, transcription repressor domains, and methylases.
In some examples, dimerization of the nuclease domain is required for
cleavage activity. Each zinc finger recognizes three consecutive
base pairs in the target DNA. For example, a 3 finger domain
recognizes a sequence of 9 contiguous nucleotides; with a
dimerization requirement of the nuclease, two sets of zinc finger
triplets are used to bind an 18 nucleotide recognition
sequence.
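The arithmetic in the preceding example (three base pairs per finger, doubled when the nuclease must dimerize) can be written out as a small calculation. This is a sketch for illustration only; the function name zfn_recognition_length is hypothetical and is not taken from the disclosure.

```python
BASES_PER_FINGER = 3  # each zinc finger recognizes three consecutive base pairs


def zfn_recognition_length(fingers_per_monomer, dimer=True):
    """Return the recognition sequence length in nucleotides.

    Hypothetical helper: with dimer=True, two sets of fingers bind the target.
    """
    if fingers_per_monomer < 1:
        raise ValueError("fingers_per_monomer must be at least 1")
    per_monomer = fingers_per_monomer * BASES_PER_FINGER
    return per_monomer * 2 if dimer else per_monomer
```

With three fingers, the function returns 9 for a single domain and 18 for the dimerized pair, matching the example in the text.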
[0245] Bacteria and archaea have evolved adaptive immune defenses
termed clustered regularly interspaced short palindromic repeats
(CRISPR)/CRISPR-associated (Cas) systems that use short RNA to
direct degradation of foreign nucleic acids (WO2007/025097
published Mar. 1, 2007). The type II CRISPR/Cas system from
bacteria employs a crRNA and tracrRNA to guide the Cas endonuclease
to its DNA target. The crRNA (CRISPR RNA) contains the region
complementary to one strand of the double strand DNA target and
base pairs with the tracrRNA (trans-activating CRISPR RNA) forming
a RNA duplex that directs the Cas endonuclease to cleave the DNA
target (FIG. 2 B).
[0246] As used herein, the term "guide RNA" relates to a synthetic
fusion of two RNA molecules, a crRNA (CRISPR RNA) comprising a
variable targeting domain, and a tracrRNA (FIG. 2 B). In one
embodiment, the guide RNA comprises a variable targeting domain of
12 to 30 nucleotides and an RNA fragment that can interact
with a Cas endonuclease.
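By way of a non-limiting sketch, the two-part structure just described (a variable targeting domain of 12 to 30 nucleotides followed by a Cas-interacting RNA fragment) can be modeled as a string concatenation. The function name build_guide_rna and the scaffold argument are assumptions made for this example; the target is assumed to be supplied as the DNA strand that is not base paired by the guide, so that its RNA equivalent is the targeting domain, and no actual scaffold sequence from the disclosure is reproduced here.

```python
def build_guide_rna(target_dna, scaffold_rna):
    """Join a variable targeting domain to a Cas-interacting RNA fragment.

    Hypothetical helper: target_dna is the 12-30 nt DNA target (non-base-paired
    strand); scaffold_rna is a caller-supplied Cas-interacting RNA sequence.
    """
    target = target_dna.upper()
    if not 12 <= len(target) <= 30:
        raise ValueError("variable targeting domain must be 12 to 30 nucleotides")
    if set(target) - set("ACGT"):
        raise ValueError("target_dna may contain only A, C, G and T")
    # Transcribe the DNA target into its RNA equivalent (T becomes U).
    variable_targeting_domain = target.replace("T", "U")
    return variable_targeting_domain + scaffold_rna.upper()
```

For example, build_guide_rna("ACGTACGTACGT", "GGGAAACCC") returns "ACGUACGUACGUGGGAAACCC", where "GGGAAACCC" is only a placeholder and not a functional Cas-interacting sequence.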
[0247] As used herein, the term "guide polynucleotide", relates to
a polynucleotide sequence that can form a complex with a Cas
endonuclease and enables the Cas endonuclease to recognize and
optionally cleave a DNA target site. The guide polynucleotide can
be a single molecule or a double molecule. The guide polynucleotide
sequence can be a RNA sequence, a DNA sequence, or a combination
thereof (a RNA-DNA combination sequence). Optionally, the guide
polynucleotide can comprise at least one nucleotide, phosphodiester
bond or linkage modification such as, but not limited to, Locked
Nucleic Acid (LNA), 5-methyl dC, 2,6-Diaminopurine, 2'-Fluoro A,
2'-Fluoro U, 2'-O-Methyl RNA, phosphorothioate bond, linkage to a
cholesterol molecule, linkage to a polyethylene glycol molecule,
linkage to a spacer 18 (hexaethylene glycol chain) molecule, or 5'
to 3' covalent linkage resulting in circularization. A guide
polynucleotide that solely comprises ribonucleic acids is also
referred to as a "guide RNA".
[0248] The guide polynucleotide can be a double molecule (also
referred to as duplex guide polynucleotide) comprising a first
nucleotide sequence domain (referred to as Variable Targeting
domain or VT domain) that is complementary to a nucleotide sequence
in a target DNA and a second nucleotide sequence domain (referred
to as Cas endonuclease recognition domain or CER domain) that
interacts with a Cas endonuclease polypeptide. The CER domain of
the double molecule guide polynucleotide comprises two separate
molecules that are hybridized along a region of complementarity.
The two separate molecules can be RNA, DNA, and/or
RNA-DNA-combination sequences. In some embodiments, the first
molecule of the duplex guide polynucleotide comprising a VT domain
linked to a CER domain is referred to as "crDNA" (when composed of
a contiguous stretch of DNA nucleotides) or "crRNA" (when composed
of a contiguous stretch of RNA nucleotides), or "crDNA-RNA" (when
composed of a combination of DNA and RNA nucleotides). The
crNucleotide can comprise a fragment of the cRNA naturally
occurring in Bacteria and Archaea. In one embodiment, the size of
the fragment of the cRNA naturally occurring in Bacteria and
Archaea that is present in a crNucleotide disclosed herein can
range from, but is not limited to, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides. In some
embodiments the second molecule of the duplex guide polynucleotide
comprising a CER domain is referred to as "tracrRNA" (when composed
of a contiguous stretch of RNA nucleotides) or "tracrDNA" (when
composed of a contiguous stretch of DNA nucleotides) or
"tracrDNA-RNA" (when composed of a combination of DNA and RNA
nucleotides). In one embodiment, the RNA that guides the RNA/Cas9
endonuclease complex is a duplexed RNA comprising a duplex
crRNA-tracrRNA.
[0249] The guide polynucleotide can also be a single molecule
comprising a first nucleotide sequence domain (referred to as
Variable Targeting domain or VT domain) that is complementary to a
nucleotide sequence in a target DNA and a second nucleotide domain
(referred to as Cas endonuclease recognition domain or CER domain)
that interacts with a Cas endonuclease polypeptide. By "domain" it
is meant a contiguous stretch of nucleotides that can be RNA, DNA,
and/or RNA-DNA-combination sequence. The VT domain and/or the CER
domain of a single guide polynucleotide can comprise a RNA
sequence, a DNA sequence, or a RNA-DNA-combination sequence. In
some embodiments the single guide polynucleotide comprises a
crNucleotide (comprising a VT domain linked to a CER domain) linked
to a tracrNucleotide (comprising a CER domain), wherein the linkage
is a nucleotide sequence comprising a RNA sequence, a DNA sequence,
or a RNA-DNA combination sequence. The single guide polynucleotide
being comprised of sequences from the crNucleotide and
tracrNucleotide may be referred to as "single guide RNA" (when
composed of a contiguous stretch of RNA nucleotides) or "single
guide DNA" (when composed of a contiguous stretch of DNA
nucleotides) or "single guide RNA-DNA" (when composed of a
combination of RNA and DNA nucleotides). In one embodiment of the
disclosure, the single guide RNA comprises a cRNA or cRNA fragment
and a tracrRNA or tracrRNA fragment of the type II CRISPR/Cas
system that can form a complex with a type II Cas endonuclease,
wherein said guide RNA/Cas endonuclease complex can direct the Cas
endonuclease to a plant genomic target site, enabling the Cas
endonuclease to introduce a double strand break into the genomic
target site. One aspect of using a single guide polynucleotide
versus a duplex guide polynucleotide is that only one expression
cassette needs to be made to express the single guide
polynucleotide.
[0250] The term "variable targeting domain" or "VT domain" is used
interchangeably herein and includes a nucleotide sequence that is
complementary to one strand (nucleotide sequence) of a double
strand DNA target site (FIGS. 2 A and 2 B). The % complementation
between the first nucleotide sequence domain (VT domain) and the
target sequence can be at least 50%, 51%, 52%, 53%, 54%, 55%, 56%,
57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%,
70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%,
83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98%, 99% or 100%. The variable targeting domain can be at
least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29 or 30 nucleotides in length. In some embodiments, the
variable targeting domain comprises a contiguous stretch of 12 to
30 nucleotides. The variable targeting domain can be composed of a
DNA sequence, a RNA sequence, a modified DNA sequence, a modified
RNA sequence, or any combination thereof.
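As a rough illustration of the length and percent-complementarity criteria described above, the following sketch scores a variable targeting domain against one strand of a target site. The function names are hypothetical. It assumes both sequences are written 5' to 3', that they have equal length, and that only Watson-Crick pairs count (no G-U wobble, no gaps).

```python
# Allowed (VT-domain base, target-strand base) pairs; the VT domain may be RNA or DNA.
WATSON_CRICK = {("A", "T"), ("U", "A"), ("T", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(vt_domain: str, target_strand: str) -> float:
    """Percent of VT-domain positions that pair with the target strand."""
    vt = vt_domain.upper()
    target = target_strand.upper()[::-1]  # strands pair in antiparallel orientation
    if not vt or len(vt) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((a, b) in WATSON_CRICK for a, b in zip(vt, target))
    return 100.0 * paired / len(vt)


def has_valid_vt_length(vt_domain: str, min_len: int = 12, max_len: int = 30) -> bool:
    """Check the 12 to 30 nucleotide length range given in the text."""
    return min_len <= len(vt_domain) <= max_len


vt = "GACUGACUGACU"        # toy 12-nt RNA sequence, not a real guide
target = "AGTCAGTCAGTC"    # DNA strand fully complementary to vt
print(has_valid_vt_length(vt), percent_complementarity(vt, target))  # True 100.0
```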
[0251] The term "Cas endonuclease recognition domain" or "CER
domain" of a guide polynucleotide is used interchangeably herein
and includes a nucleotide sequence (such as a second nucleotide
sequence domain of a guide polynucleotide), that interacts with a
Cas endonuclease polypeptide. The CER domain can be composed of a
DNA sequence, a RNA sequence, a modified DNA sequence, a modified
RNA sequence (see for example modifications described herein), or
any combination thereof.
[0252] The nucleotide sequence linking the crNucleotide and the
tracrNucleotide of a single guide polynucleotide can comprise a RNA
sequence, a DNA sequence, or a RNA-DNA combination sequence. In one
embodiment, the nucleotide sequence linking the crNucleotide and
the tracrNucleotide of a single guide polynucleotide can be at
least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86,
87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 or 100
nucleotides in length. In another embodiment, the nucleotide
sequence linking the crNucleotide and the tracrNucleotide of a
single guide polynucleotide can comprise a tetraloop sequence, such
as, but not limited to, a GAAA tetraloop sequence.
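The layout of a single guide polynucleotide (crNucleotide, linker, tracrNucleotide) can be sketched as a simple concatenation. The helper below is hypothetical, the example sequences are toy placeholders rather than real crRNA or tracrRNA sequences, and the 3 to 100 nucleotide bound simply mirrors the linker lengths enumerated above.

```python
def build_single_guide(cr_nucleotide: str, tracr_nucleotide: str,
                       linker: str = "GAAA") -> str:
    """Join a crNucleotide and a tracrNucleotide through a linker sequence.

    Assumption: all three parts are given 5' to 3' in the same alphabet;
    the default linker is the GAAA tetraloop mentioned in the text.
    """
    if not 3 <= len(linker) <= 100:
        raise ValueError("linker length must be between 3 and 100 nucleotides")
    return cr_nucleotide + linker + tracr_nucleotide


# Toy placeholder sequences for illustration only.
single_guide = build_single_guide("ACGUACGUACGUACGU", "UGCAUGCA")
print(single_guide)  # ACGUACGUACGUACGUGAAAUGCAUGCA
```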
[0253] Nucleotide sequence modification of the guide
polynucleotide, VT domain and/or CER domain can be selected from,
but not limited to, the group consisting of a 5' cap, a 3'
polyadenylated tail, a riboswitch sequence, a stability control
sequence, a sequence that forms a dsRNA duplex, a modification or
sequence that targets the guide polynucleotide to a subcellular
location, a modification or sequence that provides for tracking, a
modification or sequence that provides a binding site for proteins,
a Locked Nucleic Acid (LNA), a 5-methyl dC nucleotide, a
2,6-Diaminopurine nucleotide, a 2'-Fluoro A nucleotide, a 2'-Fluoro
U nucleotide, a 2'-O-Methyl RNA nucleotide, a phosphorothioate
bond, linkage to a cholesterol molecule, linkage to a polyethylene
glycol molecule, linkage to a spacer 18 molecule, a 5' to 3'
covalent linkage, or any combination thereof. These modifications
can result in at least one additional beneficial feature, wherein
the additional beneficial feature is selected from the group of a
modified or regulated stability, a subcellular targeting, tracking,
a fluorescent label, a binding site for a protein or protein
complex, modified binding affinity to complementary target
sequence, modified resistance to cellular degradation, and
increased cellular permeability.
[0254] In one embodiment, the guide RNA and Cas endonuclease are
capable of forming a complex that enables the Cas endonuclease to
introduce a double strand break at a DNA target site.
[0255] In one embodiment of the disclosure the variable targeting
domain is 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
26, 27, 28, 29 or 30 nucleotides in length.
[0256] In one embodiment of the disclosure, the guide RNA comprises
a cRNA (or cRNA fragment) and a tracrRNA (or tracrRNA fragment) of
the type II CRISPR/Cas system that can form a complex with a type
II Cas endonuclease, wherein said guide RNA/Cas endonuclease
complex can direct the Cas endonuclease to a plant genomic target
site, enabling the Cas endonuclease to introduce a double strand
break into the genomic target site.
[0257] In one embodiment the guide RNA can be introduced into a
plant or plant cell directly using any method known in the art such
as, but not limited to, particle bombardment or topical
applications.
[0258] In another embodiment the guide RNA can be introduced
indirectly by introducing a recombinant DNA molecule comprising the
corresponding guide DNA sequence operably linked to a plant
specific promoter (as shown in FIG. 1B) that is capable of
transcribing the guide RNA in said plant cell. The term
"corresponding guide DNA" includes a DNA molecule that is identical
to the RNA molecule but has a "T" substituted for each "U" of the
RNA molecule.
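The definition of "corresponding guide DNA" amounts to a one-step substitution, sketched below. The function name is hypothetical, and the input is assumed to contain only unmodified A, C, G and U residues.

```python
def corresponding_guide_dna(guide_rna: str) -> str:
    """Return the DNA sequence identical to the guide RNA with T in place of U."""
    rna = guide_rna.upper()
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters in RNA sequence: {sorted(unexpected)}")
    return rna.replace("U", "T")


print(corresponding_guide_dna("GUUUUAGA"))  # GTTTTAGA
```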
[0259] In some embodiments, the guide RNA is introduced via
particle bombardment or Agrobacterium transformation of a
recombinant DNA construct comprising the corresponding guide DNA
operably linked to a plant U6 polymerase III promoter.
[0260] In one embodiment, the RNA that guides the RNA/Cas9
endonuclease complex is a duplexed RNA comprising a duplex
crRNA-tracrRNA (as shown in FIG. 2B). One advantage of using a
guide RNA versus a duplexed crRNA-tracrRNA is that only one
expression cassette needs to be made to express the fused guide
RNA.
[0261] The terms "target site", "target sequence", "target DNA",
"target locus", "genomic target site", "genomic target sequence",
and "genomic target locus" are used interchangeably herein and
refer to a polynucleotide sequence in the genome (including
chloroplastic and mitochondrial DNA) of a plant cell at which a
double-strand break is induced in the plant cell genome by a Cas
endonuclease. The target site can be an endogenous site in the
plant genome, or alternatively, the target site can be heterologous
to the plant and thereby not be naturally occurring in the genome,
or the target site can be found in a heterologous genomic location
compared to where it occurs in nature. As used herein, the terms
"endogenous target sequence" and "native target sequence" are used
interchangeably to refer to a target sequence that is
endogenous or native to the genome of a plant and is at the
endogenous or native position of that target sequence in the genome
of the plant.
[0262] In one embodiment, the target site can be similar to a DNA
recognition site or target site that is specifically
recognized and/or bound by a double-strand break inducing agent
such as a LIG3-4 endonuclease (US patent publication 2009-0133152
A1, published May 21, 2009) or a MS26++ meganuclease (U.S. patent
application Ser. No. 13/526,912 filed Jun. 19, 2012).
[0263] An "artificial target site" or "artificial target sequence"
are used interchangeably herein and refer to a target sequence that
has been introduced into the genome of a plant. Such an artificial
target sequence can be identical in sequence to an endogenous or
native target sequence in the genome of a plant but be located in a
different position (i.e., a non-endogenous or non-native position)
in the genome of a plant.
[0264] An "altered target site", "altered target sequence",
"modified target site", "modified target sequence" are used
interchangeably herein and refer to a target sequence as disclosed
herein that comprises at least one alteration when compared to
non-altered target sequence. Such "alterations" include, for
example:
(i) replacement of at least one nucleotide, (ii) a deletion of at
least one nucleotide, (iii) an insertion of at least one
nucleotide, or (iv) any combination of (i)-(iii).
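The alteration categories (i) to (iii) can be illustrated by comparing a non-altered and an altered target sequence. The sketch below uses Python's generic `difflib.SequenceMatcher` as a stand-in for a proper sequence aligner, which is an assumption made for brevity; the function name and labels are hypothetical, and a block that difflib tags as "replace" is reported as a replacement even if the two sides differ in length.

```python
from difflib import SequenceMatcher

# Map difflib opcode tags onto the alteration categories named in the text.
LABELS = {"replace": "replacement", "delete": "deletion", "insert": "insertion"}


def classify_alterations(original: str, altered: str) -> set:
    """Return the set of alteration types that turn `original` into `altered`."""
    matcher = SequenceMatcher(None, original.upper(), altered.upper(), autojunk=False)
    return {LABELS[tag] for tag, *_ in matcher.get_opcodes() if tag != "equal"}


reference = "GATTACAGGC"  # toy sequence for illustration only
print(classify_alterations(reference, "GATTGCAGGC"))   # {'replacement'}
print(classify_alterations(reference, "GATTCAGGC"))    # {'deletion'}
print(classify_alterations(reference, "GATTACTAGGC"))  # {'insertion'}
```

An empty set means the two sequences are identical, that is, the target site is not altered.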
[0265] Methods for modifying a plant genomic target site are
disclosed herein. In one embodiment, a method for modifying a
target site in the genome of a plant cell comprises introducing a
guide RNA into a plant cell having a Cas endonuclease, wherein said
guide RNA and Cas endonuclease are capable of forming a complex
that enables the Cas endonuclease to introduce a double strand
break at said target site.
[0266] Also provided is a method for modifying a target site in the
genome of a plant cell, the method comprising introducing a guide
RNA and a Cas endonuclease into said plant, wherein said guide RNA
and Cas endonuclease are capable of forming a complex that enables
the Cas endonuclease to introduce a double strand break at said
target site.
[0267] Further provided is a method for modifying a target site in
the genome of a plant cell, the method comprising introducing a
guide RNA and a donor DNA into a plant cell having a Cas
endonuclease, wherein said guide RNA and Cas endonuclease are
capable of forming a complex that enables the Cas endonuclease to
introduce a double strand break at said target site, wherein said
donor DNA comprises a polynucleotide of interest.
[0268] Further provided is a method for modifying a target site in
the genome of a plant cell, the method comprising: a) introducing
into a plant cell a guide RNA comprising a variable targeting
domain and a Cas endonuclease, wherein said guide RNA and Cas
endonuclease are capable of forming a complex that enables the Cas
endonuclease to introduce a double strand break at said target
site; and, b) identifying at least one plant cell that has a
modification at said target, wherein the modification includes at
least one deletion or substitution of one or more nucleotides in
said target site.
[0269] Further provided, a method for modifying a target DNA
sequence in the genome of a plant cell, the method comprising: a)
introducing into a plant cell a first recombinant DNA construct
capable of expressing a guide RNA and a second recombinant DNA
construct capable of expressing a Cas endonuclease, wherein said
guide RNA and Cas endonuclease are capable of forming a complex
that enables the Cas endonuclease to introduce a double strand
break at said target site; and, b) identifying at least one plant
cell that has a modification at said target, wherein the
modification includes at least one deletion or substitution of one
or more nucleotides in said target site.
[0270] The length of the target site can vary, and includes, for
example, target sites that are at least 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides
in length. It is further possible that the target site can be
palindromic, that is, the sequence on one strand reads the same in
the opposite direction on the complementary strand. The
nick/cleavage site can be within the target sequence or the
nick/cleavage site could be outside of the target sequence. In
another variation, the cleavage could occur at nucleotide positions
immediately opposite each other to produce a blunt end cut or, in
other cases, the incisions could be staggered to produce
single-stranded overhangs, also called "sticky ends", which can be
either 5' overhangs, or 3' overhangs.
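The notions of a palindromic target site and of blunt versus staggered cuts can be sketched as follows. The function names are hypothetical. Cut positions are assumed to be given in top-strand coordinates, as the number of bases to the left of the cut when the top strand is read 5' to 3'.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same on the complementary strand."""
    return seq.upper() == reverse_complement(seq)


def end_type(top_cut: int, bottom_cut: int) -> str:
    """Classify the ends produced by a cut on each strand."""
    if top_cut == bottom_cut:
        return "blunt"
    # A top-strand cut to the left of the bottom-strand cut leaves 5' overhangs.
    return "5' overhang" if top_cut < bottom_cut else "3' overhang"


print(is_palindromic("GAATTC"))  # True
print(end_type(3, 3))            # blunt
print(end_type(1, 5))            # 5' overhang
```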
[0271] In some embodiments, the genomic target site capable of being
cleaved by a Cas endonuclease comprises a 12 to 30 nucleotide
fragment of a male fertility gene such as MS26 (see for example
U.S. Pat. Nos. 7,098,388, 7,517,975, 7,612,251), MS45 (see for
example U.S. Pat. Nos. 5,478,369, 6,265,640) or MSCA1 (see for
example U.S. Pat. No. 7,919,676), ALS or EPSPS genes.
[0272] Active variants of genomic target sites can also be used.
Such active variants can comprise at least 65%, 70%, 75%, 80%, 85%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence
identity to the given target site, wherein the active variants
retain biological activity and hence are capable of being
recognized and cleaved by a Cas endonuclease. Assays to measure
the double-strand break of a target site by an endonuclease are
known in the art and generally measure the overall activity and
specificity of the agent on DNA substrates containing recognition
sites.
[0273] Various methods and compositions can be employed to obtain a
plant having a polynucleotide of interest inserted in a target site
for a Cas endonuclease. Such methods can employ homologous
recombination to provide integration of the polynucleotide of
interest at the target site. In one method provided, a
polynucleotide of interest is provided to the plant cell in a donor
DNA construct. As used herein, "donor DNA" is a DNA construct that
comprises a polynucleotide of interest to be inserted into the
target site of a Cas endonuclease. The donor DNA construct further
comprises a first and a second region of homology that flank the
polynucleotide of interest. The first and second regions of
homology of the donor DNA share homology to a first and a second
genomic region, respectively, present in or flanking the target
site of the plant genome. By "homology" is meant DNA sequences that
are similar. For example, a "region of homology to a genomic
region" that is found on the donor DNA is a region of DNA that has
a similar sequence to a given "genomic region" in the plant genome.
A region of homology can be of any length that is sufficient to
promote homologous recombination at the cleaved target site. For
example, the region of homology can comprise at least 5-10, 5-15,
5-20, 5-25, 5-30, 5-35, 5-40, 5-45, 5-50, 5-55, 5-60, 5-65, 5-70,
5-75, 5-80, 5-85, 5-90, 5-95, 5-100, 5-200, 5-300, 5-400, 5-500,
5-600, 5-700, 5-800, 5-900, 5-1000, 5-1100, 5-1200, 5-1300, 5-1400,
5-1500, 5-1600, 5-1700, 5-1800, 5-1900, 5-2000, 5-2100, 5-2200,
5-2300, 5-2400, 5-2500, 5-2600, 5-2700, 5-2800, 5-2900, 5-3000,
5-3100 or more bases in length such that the region of homology has
sufficient homology to undergo homologous recombination with the
corresponding genomic region. "Sufficient homology" indicates that
two polynucleotide sequences have sufficient structural similarity
to act as substrates for a homologous recombination reaction. The
structural similarity includes overall length of each
polynucleotide fragment, as well as the sequence similarity of the
polynucleotides. Sequence similarity can be described by the
percent sequence identity over the whole length of the sequences,
and/or by conserved regions comprising localized similarities such
as contiguous nucleotides having 100% sequence identity, and
percent sequence identity over a portion of the length of the
sequences.
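The donor DNA layout described above (first region of homology, polynucleotide of interest, second region of homology) can be sketched as below. This is a simplified illustration with hypothetical names. It assumes the homology regions are copied directly from the genomic sequence immediately flanking the cleavage position, so they share 100% identity with the genomic regions, and it uses a minimum arm length of 5 bases to reflect the lower bound of the ranges listed in the text.

```python
def build_donor_dna(genome: str, cut_index: int, arm_length: int,
                    polynucleotide_of_interest: str) -> str:
    """Assemble a donor as left homology arm + insert + right homology arm.

    `cut_index` is the position in `genome` where the double strand break
    is expected; both arms are taken from the sequence flanking that position.
    """
    if arm_length < 5:
        raise ValueError("homology arms shorter than 5 bases are not considered here")
    if cut_index - arm_length < 0 or cut_index + arm_length > len(genome):
        raise ValueError("homology arms extend beyond the supplied genomic sequence")
    left_arm = genome[cut_index - arm_length:cut_index]
    right_arm = genome[cut_index:cut_index + arm_length]
    return left_arm + polynucleotide_of_interest + right_arm


toy_genome = "AAAAACCCCCGGGGGTTTTT"  # toy sequence for illustration only
print(build_donor_dna(toy_genome, 10, 5, "ATGCAT"))  # CCCCCATGCATGGGGG
```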
[0274] The amount of homology or sequence identity shared by a
target and a donor polynucleotide can vary and includes total
lengths and/or regions having unit integral values in the ranges of
about 1-20 bp, 20-50 bp, 50-100 bp, 75-150 bp, 100-250 bp, 150-300
bp, 200-400 bp, 250-500 bp, 300-600 bp, 350-750 bp, 400-800 bp,
450-900 bp, 500-1000 bp, 600-1250 bp, 700-1500 bp, 800-1750 bp,
900-2000 bp, 1-2.5 kb, 1.5-3 kb, 2-4 kb, 2.5-5 kb, 3-6 kb, 3.5-7
kb, 4-8 kb, 5-10 kb, or up to and including the total length of the
target site. These ranges include every integer within the range,
for example, the range of 1-20 bp includes 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 and 20 bp. The amount of
homology can also be described by percent sequence identity over the
full aligned length of the two polynucleotides which includes
percent sequence identity of about at least 50%, 55%, 60%, 65%,
70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%,
83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98%, 99% or 100%. Sufficient homology includes any
combination of polynucleotide length, global percent sequence
identity, and optionally conserved regions of contiguous
nucleotides or local percent sequence identity, for example
sufficient homology can be described as a region of 75-150 bp
having at least 80% sequence identity to a region of the target
locus. Sufficient homology can also be described by the predicted
ability of two polynucleotides to specifically hybridize under high
stringency conditions, see, for example, Sambrook et al., (1989)
Molecular Cloning: A Laboratory Manual, (Cold Spring Harbor
Laboratory Press, NY); Current Protocols in Molecular Biology,
Ausubel et al., Eds (1994) Current Protocols, (Greene Publishing
Associates, Inc. and John Wiley & Sons, Inc.); and, Tijssen
(1993) Laboratory Techniques in Biochemistry and Molecular
Biology--Hybridization with Nucleic Acid Probes, (Elsevier, New
York).
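The example criterion above (a region of 75-150 bp having at least 80% sequence identity to a region of the target locus) can be expressed as a simple check. The sketch below is illustrative only: the function names are hypothetical, and it assumes the two regions are already aligned without gaps and have equal length, which a real comparison would obtain from a sequence alignment tool.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def has_sufficient_homology(donor_region: str, genomic_region: str,
                            min_len: int = 75, max_len: int = 150,
                            min_identity: float = 80.0) -> bool:
    """Apply the example length and identity thresholds from the text."""
    if not min_len <= len(donor_region) <= max_len:
        return False
    return percent_identity(donor_region, genomic_region) >= min_identity


genomic = "ACGT" * 25                      # toy 100-bp region
swap = str.maketrans("ACGT", "CATG")       # guarantees a mismatch at each swapped base
donor = genomic[:10].translate(swap) + genomic[10:]
print(percent_identity(donor, genomic))         # 90.0
print(has_sufficient_homology(donor, genomic))  # True
```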
[0275] As used herein, a "genomic region" is a segment of a
chromosome in the genome of a plant cell that is present on either
side of the target site or, alternatively, also comprises a portion
of the target site. The genomic region can comprise at least 5-10,
5-15, 5-20, 5-25, 5-30, 5-35, 5-40, 5-45, 5-50, 5-55, 5-60, 5-65,
5-70, 5-75, 5-80, 5-85, 5-90, 5-95, 5-100, 5-200, 5-300, 5-400,
5-500, 5-600, 5-700, 5-800, 5-900, 5-1000, 5-1100, 5-1200, 5-1300,
5-1400, 5-1500, 5-1600, 5-1700, 5-1800, 5-1900, 5-2000, 5-2100,
5-2200, 5-2300, 5-2400, 5-2500, 5-2600, 5-2700, 5-2800, 5-2900,
5-3000, 5-3100 or more bases such that the genomic region has
sufficient homology to undergo homologous recombination with the
corresponding region of homology.
[0276] Polynucleotides of interest and/or traits can be stacked
together in a complex trait locus as described in
US-2013-0263324-A1, published 3 Oct. 2013 and in PCT/US13/22891,
published Jan. 24, 2013; both applications are hereby incorporated
by reference. The guide polynucleotide/Cas9 endonuclease system
described herein provides for an efficient system to generate
double strand breaks and allows for traits to be stacked in a
complex trait locus.
[0277] In one embodiment, the guide polynucleotide/Cas endonuclease
system is used for introducing one or more polynucleotides of
interest or one or more traits of interest into one or more target
sites by providing one or more guide polynucleotides, one Cas
endonuclease, and optionally one or more donor DNAs to a plant
cell. A fertile plant can be produced from that plant cell that
comprises an alteration at said one or more target sites, wherein
the alteration is selected from the group consisting of (i)
replacement of at least one nucleotide, (ii) a deletion of at least
one nucleotide, (iii) an insertion of at least one nucleotide, and
(iv) any combination of (i)-(iii). Plants comprising these altered
target sites can be crossed with plants comprising at least one
gene or trait of interest in the same complex trait locus, thereby
further stacking traits in said complex trait locus. (see also
US-2013-0263324-A1, published 3 Oct. 2013 and in PCT/US13/22891,
published Jan. 24, 2013).
[0278] In one embodiment, the method comprises a method for
producing in a plant a complex trait locus comprising at least two
altered target sequences in a genomic region of interest, said
method comprising: (a) selecting a genomic region in a plant,
wherein the genomic region comprises a first target sequence and a
second target sequence; (b) contacting at least one plant cell with
at least a first guide polynucleotide, a second guide polynucleotide, and
optionally at least one donor DNA, and a Cas endonuclease, wherein
the first and second guide polynucleotide and the Cas endonuclease
can form a complex that enables the Cas endonuclease to introduce a
double strand break in at least a first and a second target
sequence; (c) identifying a cell from (b) comprising a first
alteration at the first target sequence and a second alteration at
the second target sequence; and (d) recovering a first fertile
plant from the cell of (c) said fertile plant comprising the first
alteration and the second alteration, wherein the first alteration
and the second alteration are physically linked.
[0279] In one embodiment, the method comprises a method for
producing in a plant a complex trait locus comprising at least two
altered target sequences in a genomic region of interest, said
method comprising: (a) selecting a genomic region in a plant,
wherein the genomic region comprises a first target sequence and a
second target sequence; (b) contacting at least one plant cell with
a first guide polynucleotide, a Cas endonuclease, and optionally a
first donor DNA, wherein the first guide polynucleotide and the Cas
endonuclease can form a complex that enables the Cas endonuclease
to introduce a double strand break at a first target sequence; (c)
identifying a cell from (b) comprising a first alteration at the
first target sequence; (d) recovering a first fertile plant from
the cell of (c), said first fertile plant comprising the first
alteration; (e) contacting at least one plant cell with a second
guide polynucleotide, a Cas endonuclease and optionally a second
donor DNA; (f) identifying a cell from (e) comprising a second
alteration at the second target sequence; (g) recovering a second
fertile plant from the cell of (f), said second fertile plant
comprising the second alteration; and, (h) obtaining a fertile
progeny plant from the second fertile plant of (g), said fertile
progeny plant comprising the first alteration and the second
alteration, wherein the first alteration and the second alteration
are physically linked.
[0280] The structural similarity between a given genomic region and
the corresponding region of homology found on the donor DNA can be
any degree of sequence identity that allows for homologous
recombination to occur. For example, the amount of homology or
sequence identity shared by the "region of homology" of the donor
DNA and the "genomic region" of the plant genome can be at least
50%, 55%, 60%, 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%,
87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or
100% sequence identity, such that the sequences undergo homologous
recombination.
[0281] The region of homology on the donor DNA can have homology to
any sequence flanking the target site. While in some embodiments
the regions of homology share significant sequence homology to the
genomic sequence immediately flanking the target site, it is
recognized that the regions of homology can be designed to have
sufficient homology to regions that may be further 5' or 3' to the
target site. In still other embodiments, the regions of homology
can also have homology with a fragment of the target site along
with downstream genomic regions. In one embodiment, the first
region of homology further comprises a first fragment of the target
site and the second region of homology comprises a second fragment
of the target site, wherein the first and second fragments are
dissimilar.
[0282] As used herein, "homologous recombination" includes the
exchange of DNA fragments between two DNA molecules at the sites of
homology. The frequency of homologous recombination is influenced
by a number of factors. Different organisms vary with respect to
the amount of homologous recombination and the relative proportion
of homologous to non-homologous recombination. Generally, the
length of the region of homology affects the frequency of
homologous recombination events: the longer the region of homology,
the greater the frequency. The length of the homology region needed
to observe homologous recombination is also species-variable. In
many cases, at least 5 kb of homology has been utilized, but
homologous recombination has been observed with as little as 25-50
bp of homology. See, for example, Singer et al., (1982) Cell
31:25-33; Shen and Huang, (1986) Genetics 112:441-57; Watt et al.,
(1985) Proc. Natl. Acad. Sci. USA 82:4768-72, Sugawara and Haber,
(1992) Mol Cell Biol 12:563-75, Rubnitz and Subramani, (1984) Mol
Cell Biol 4:2253-8; Ayares et al., (1986) Proc. Natl. Acad. Sci.
USA 83:5199-203; Liskay et al., (1987) Genetics 115:161-7.
[0283] Homology-directed repair (HDR) is a mechanism in cells to
repair double-stranded and single stranded DNA breaks.
Homology-directed repair includes homologous recombination (HR) and
single-strand annealing (SSA) (Lieber. 2010 Annu. Rev. Biochem.
79:181-211). The most common form of HDR is called homologous
recombination (HR), which has the longest sequence homology
requirements between the donor and acceptor DNA. Other forms of HDR
include single-stranded annealing (SSA) and breakage-induced
replication, and these require shorter sequence homology relative
to HR. Homology-directed repair at nicks (single-stranded breaks)
can occur via a mechanism distinct from HDR at double-strand breaks
(Davis and Maizels. PNAS (0027-8424), 111 (10), p. E924-E932).
[0284] Alteration of the genome of a plant cell, for example,
through homologous recombination (HR), is a powerful tool for
genetic engineering. Despite the low frequency of homologous
recombination in higher plants, there are a few examples of
successful homologous recombination of plant endogenous genes. The
parameters for homologous recombination in plants have primarily
been investigated by rescuing introduced truncated selectable
marker genes. In these experiments, the homologous DNA fragments
were typically between 0.3 kb and 2 kb. Observed frequencies for
homologous recombination were on the order of 10^-4 to
10^-5. See, for example, Halfter et al., (1992) Mol Gen Genet
231:186-93; Offringa et al., (1990) EMBO J 9:3077-84; Offringa et
al., (1993) Proc. Natl. Acad. Sci. USA 90:7346-50; Paszkowski et
al., (1988) EMBO J 7:4021-6; Hrouda and Paszkowski, (1994) Mol Gen
Genet 243:106-11; and Risseeuw et al., (1995) Plant J 7:109-19.
[0285] Homologous recombination has been demonstrated in insects.
In Drosophila, Dray and Gloor found that as little as 3 kb of total
template:target homology sufficed to copy a large non-homologous
segment of DNA into the target with reasonable efficiency (Dray and
Gloor, (1997) Genetics 147:689-99). Using FLP-mediated DNA
integration at a target FRT in Drosophila, Golic et al., showed
integration was approximately 10-fold more efficient when the donor
and target shared 4.1 kb of homology as compared to 1.1 kb of
homology (Golic et al., (1997) Nucleic Acids Res 25:3665). Data
from Drosophila indicates that 2-4 kb of homology is sufficient for
efficient targeting, but there is some evidence that much less
homology may suffice, on the order of about 30 bp to about 100 bp
(Nassif and Engels, (1993) Proc. Natl. Acad. Sci. USA 90:1262-6;
Keeler and Gloor, (1997) Mol Cell Biol 17:627-34).
[0286] Homologous recombination has also been accomplished in other
organisms. For example, at least 150-200 bp of homology was
required for homologous recombination in the parasitic protozoan
Leishmania (Papadopoulou and Dumas, (1997) Nucleic Acids Res
25:4278-86). In the filamentous fungus Aspergillus nidulans, gene
replacement has been accomplished with as little as 50 bp flanking
homology (Chaveroche et al., (2000) Nucleic Acids Res 28:e97).
Targeted gene replacement has also been demonstrated in the ciliate
Tetrahymena thermophila (Gaertig et al., (1994) Nucleic Acids Res
22:5391-8). In mammals, homologous recombination has been most
successful in the mouse using pluripotent embryonic stem cell lines
(ES) that can be grown in culture, transformed, selected and
introduced into a mouse embryo. Embryos bearing inserted transgenic
ES cells develop as genetically chimeric offspring. By interbreeding
siblings, homozygous mice carrying the selected genes can be
obtained. An overview of the process is provided in Watson et al.,
(1992) Recombinant DNA, 2nd Ed., (Scientific American Books
distributed by WH Freeman & Co.); Capecchi, (1989) Trends Genet
5:70-6; and Bronson, (1994) J Biol Chem 269:27155-8. Homologous
recombination in mammals other than mouse has been limited by the
lack of stem cells capable of being transplanted to oocytes or
developing embryos. However, McCreath et al., Nature 405:1066-9
(2000) reported successful homologous recombination in sheep by
transformation and selection in primary embryo fibroblast
cells.
[0287] Error-prone DNA repair mechanisms can produce mutations at
double-strand break sites. The Non-Homologous-End-Joining (NHEJ)
pathways are the most common repair mechanism to bring the broken
ends together (Bleuyard et al., (2006) DNA Repair 5:1-12). The
structural integrity of chromosomes is typically preserved by the
repair, but deletions, insertions, or other rearrangements are
possible. The two ends of one double-strand break are the most
prevalent substrates of NHEJ (Kirik et al., (2000) EMBO J
19:5562-6); however, if two different double-strand breaks occur,
the free ends from different breaks can be ligated and result in
chromosomal deletions (Siebert and Puchta, (2002) Plant Cell
14:1121-31), or chromosomal translocations between different
chromosomes (Pacher et al., (2007) Genetics 175:21-9).
[0288] Episomal DNA molecules can also be ligated into the
double-strand break, for example, integration of T-DNAs into
chromosomal double-strand breaks (Chilton and Que, (2003) Plant
Physiol 133:956-65; Salomon and Puchta, (1998) EMBO J 17:6086-95).
Once the sequence around the double-strand breaks is altered, for
example, by exonuclease activities involved in the maturation of
double-strand breaks, gene conversion pathways can restore the
original structure if a homologous sequence is available, such as a
homologous chromosome in non-dividing somatic cells, or a sister
chromatid after DNA replication (Molinier et al., (2004) Plant Cell
16:342-52). Ectopic and/or epigenic DNA sequences may also serve as
a DNA repair template for homologous recombination (Puchta, (1999)
Genetics 152:1173-81).
[0289] Once a double-strand break is induced in the DNA, the cell's
DNA repair mechanism is activated to repair the break. Error-prone
DNA repair mechanisms can produce mutations at double-strand break
sites. The most common repair mechanism to bring the broken ends
together is the nonhomologous end-joining (NHEJ) pathway (Bleuyard
et al., (2006) DNA Repair 5:1-12). The structural integrity of
chromosomes is typically preserved by the repair, but deletions,
insertions, or other rearrangements are possible (Siebert and
Puchta, (2002) Plant Cell 14:1121-31; Pacher et al., (2007)
Genetics 175:21-9).
[0290] Alternatively, the double-strand break can be repaired by
homologous recombination between homologous DNA sequences. Once the
sequence around the double-strand break is altered, for example, by
exonuclease activities involved in the maturation of double-strand
breaks, gene conversion pathways can restore the original structure
if a homologous sequence is available, such as a homologous
chromosome in non-dividing somatic cells, or a sister chromatid
after DNA replication (Molinier et al., (2004) Plant Cell
16:342-52). Ectopic and/or epigenic DNA sequences may also serve as
a DNA repair template for homologous recombination (Puchta, (1999)
Genetics 152:1173-81).
[0291] DNA double-strand breaks appear to be an effective factor to
stimulate homologous recombination pathways (Puchta et al., (1995)
Plant Mol Biol 28:281-92; Tzfira and White, (2005) Trends
Biotechnol 23:567-9; Puchta, (2005) J Exp Bot 56:1-14). Using
DNA-breaking agents, a two- to nine-fold increase of homologous
recombination was observed between artificially constructed
homologous DNA repeats in plants (Puchta et al., (1995) Plant Mol
Biol 28:281-92). In maize protoplasts, experiments with linear DNA
molecules demonstrated enhanced homologous recombination between
plasmids (Lyznik et al., (1991) Mol Gen Genet 230:209-18).
[0292] In one embodiment provided herein, the method comprises
contacting a plant cell with the donor DNA and the endonuclease.
Once a double-strand break is introduced in the target site by the
endonuclease, the first and second regions of homology of the donor
DNA can undergo homologous recombination with their corresponding
genomic regions of homology resulting in exchange of DNA between
the donor and the genome. As such, the provided methods result in
the integration of the polynucleotide of interest of the donor DNA
into the double-strand break in the target site in the plant
genome, thereby altering the original target site and producing an
altered genomic target site.
[0293] The donor DNA may be introduced by any means known in the
art. For example, a plant having a target site is provided. The
donor DNA may be provided by any transformation method known in the
art including, for example, Agrobacterium-mediated transformation
or biolistic particle bombardment. The donor DNA may be present
transiently in the cell or it could be introduced via a viral
replicon. In the presence of the Cas endonuclease and the target
site, the donor DNA is inserted into the transformed plant's
genome.
[0294] Another approach uses protein engineering of existing homing
endonucleases to alter their target specificities. Homing
endonucleases, such as I-SceI or I-CreI, bind to and cleave
relatively long DNA recognition sequences (18 bp and 22 bp,
respectively). These sequences are predicted to naturally occur
infrequently in a genome, typically only 1 or 2 sites/genome. The
cleavage specificity of a homing endonuclease can be changed by
rational design of amino acid substitutions at the DNA binding
domain and/or combinatorial assembly and selection of mutated
monomers (see, for example, Arnould et al., (2006) J Mol Biol
355:443-58; Ashworth et al., (2006) Nature 441:656-9; Doyon et al.,
(2006) J Am Chem Soc 128:2477-84; Rosen et al., (2006) Nucleic
Acids Res 34:4791-800; and Smith et al., (2006) Nucleic Acids Res
34:e149; Lyznik et al., (2009) U.S. Patent Application Publication
No. 20090133152A1; Smith et al., (2007) U.S. Patent Application
Publication No. 20070117128A1). Engineered meganucleases have been
shown to cleave cognate mutant sites without
broadening their specificity. An artificial recognition site
specific to the wild type yeast I-SceI homing nuclease was
introduced into the maize genome and mutations of the recognition
sequence were detected in 1% of analyzed F1 plants when a
transgenic I-SceI was introduced by crossing and activated by gene
excision (Yang et al., (2009) Plant Mol Biol 70:669-79). More
practically, the maize liguleless locus was targeted using an
engineered single-chain endonuclease designed based on the I-CreI
meganuclease sequence. Mutations of the selected liguleless locus
recognition sequence were detected in 3% of the T0 transgenic
plants when the designed homing nuclease was introduced by
Agrobacterium-mediated transformation of immature embryos (Gao et
al., (2010) Plant J 61:176-87).
[0295] Polynucleotides of interest are further described herein and
are reflective of the commercial markets and interests of those
involved in the development of the crop. Crops and markets of
interest change, and as developing nations open up world markets,
new crops and technologies will emerge also. In addition, as our
understanding of agronomic traits and characteristics such as yield
and heterosis increases, the choice of genes for genetic engineering
will change accordingly.
[0296] Genome Editing Using the Guide RNA/Cas Endonuclease
System
[0297] As described herein, the guide RNA/Cas endonuclease system
can be used in combination with a co-delivered polynucleotide
modification template to allow for editing of a genomic nucleotide
sequence of interest. Also, as described herein, for each
embodiment that uses a guide RNA/Cas endonuclease system, a similar
guide polynucleotide/Cas endonuclease system can be deployed where
the guide polynucleotide does not solely comprise ribonucleic acids
but wherein the guide polynucleotide comprises a combination of
RNA-DNA molecules or solely comprises DNA molecules.
[0298] While numerous double-strand break-making systems exist,
their practical applications for gene editing may be restricted due
to the relatively low frequency of induced double-strand breaks
(DSBs). To date, many genome modification methods rely on the
homologous recombination system. Homologous recombination (HR) can
provide molecular means for finding genomic DNA sequences of
interest and modifying them according to the experimental
specifications. Homologous recombination takes place in plant
somatic cells at low frequency. The process can be enhanced to a
practical level for genome engineering by introducing double-strand
breaks (DSBs) at selected endonuclease target sites. The challenge
has been to efficiently make DSBs at genomic sites of interest
since there is a bias in the directionality of information transfer
between two interacting DNA molecules (the broken one acts as an
acceptor of genetic information). Described herein is the use of a
guide RNA/Cas system which provides flexible genome cleavage
specificity and results in a high frequency of double-strand breaks
at a DNA target site, thereby enabling efficient gene editing in a
nucleotide sequence of interest, wherein the nucleotide sequence of
interest to be edited can be located within or outside the target
site recognized and cleaved by a Cas endonuclease.
[0299] A "modified nucleotide" or "edited nucleotide" refers to a
nucleotide sequence of interest that comprises at least one
alteration when compared to its non-modified nucleotide sequence.
Such "alterations" include, for example: (i) replacement of at
least one nucleotide, (ii) a deletion of at least one nucleotide,
(iii) an insertion of at least one nucleotide, or (iv) any
combination of (i)-(iii).
[0300] The term "polynucleotide modification template" includes a
polynucleotide that comprises at least one nucleotide modification
when compared to the nucleotide sequence to be edited. A nucleotide
modification can be at least one nucleotide substitution, addition
or deletion. Optionally, the polynucleotide modification template
can further comprise homologous nucleotide sequences flanking the
at least one nucleotide modification, wherein the flanking
homologous nucleotide sequences provide sufficient homology to the
desired nucleotide sequence to be edited.
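The structure of such a template can be illustrated with a minimal
sketch, assuming a single nucleotide substitution flanked on each
side by homologous sequence copied from the sequence to be edited.
The function name, the example sequence and the arm length are
hypothetical and are not taken from the disclosure.

```python
def build_modification_template(target, edit_pos, new_base, arm_len):
    """Return left homology arm + substituted base + right homology arm.

    target   : nucleotide sequence to be edited (hypothetical example input)
    edit_pos : 0-based position of the nucleotide to substitute
    new_base : replacement nucleotide
    arm_len  : number of flanking homologous nucleotides on each side
    """
    if not 0 <= edit_pos < len(target):
        raise ValueError("edit position lies outside the target sequence")
    if target[edit_pos] == new_base:
        raise ValueError("template must carry at least one modification")
    left_arm = target[max(0, edit_pos - arm_len):edit_pos]
    right_arm = target[edit_pos + 1:edit_pos + 1 + arm_len]
    return left_arm + new_base + right_arm


def count_differences(seq_a, seq_b):
    """Count mismatched positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical target sequence; position 8 (a G) is replaced with T.
target_sequence = "GATTACAGGCTTACCGA"
template = build_modification_template(target_sequence, 8, "T", 4)
original_window = target_sequence[4:13]
```

In this sketch the template differs from the corresponding window of
the target sequence at exactly one position, and the remaining
positions supply the flanking homology.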
[0301] In one embodiment, the disclosure describes a method for
editing a nucleotide sequence in the genome of a cell, the method
comprising providing a guide RNA, a polynucleotide modification
template, and at least one Cas endonuclease to a cell, wherein the
Cas endonuclease is capable of introducing a double-strand break at
a target sequence in the genome of said cell, wherein said
polynucleotide modification template includes at least one
nucleotide modification of said nucleotide sequence. Cells include,
but are not limited to, human, animal, bacterial, fungal, insect,
and plant cells as well as plants and seeds produced by the methods
described herein. The nucleotide to be edited can be located within
or outside a target site recognized and cleaved by a Cas
endonuclease. In one embodiment, the at least one nucleotide
modification is not a modification at a target site recognized and
cleaved by a Cas endonuclease. In another embodiment, there are at
least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 30, 40, 50, 100, 200, 300,
400, 500, 600, 700, 900 or 1000 nucleotides between the at least
one nucleotide to be edited and the genomic target site.
[0302] In another embodiment, the disclosure describes a method for
editing a nucleotide sequence in the genome of a plant cell, the
method comprising providing a guide RNA, a polynucleotide
modification template, and at least one maize optimized Cas9
endonuclease to a plant cell, wherein the maize optimized Cas9
endonuclease is capable of providing a double-strand break at a
moCas9 target sequence in the plant genome, wherein said
polynucleotide modification template includes at least one
nucleotide modification of said nucleotide sequence.
[0303] In another embodiment, the disclosure describes a method for
editing a nucleotide sequence in the genome of a cell, the method
comprising providing a guide RNA, a polynucleotide modification
template and at least one Cas endonuclease to a cell, wherein said
guide RNA and Cas endonuclease are capable of forming a complex
that enables the Cas endonuclease to introduce a double-strand
break at a target site, wherein said polynucleotide modification
template comprises at least one nucleotide modification of said
nucleotide sequence.
[0304] In another embodiment of genome editing, editing of the
endogenous enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene
is disclosed herein (Example 16). In this embodiment, the
polynucleotide modification template (EPSPS polynucleotide
modification template) includes a partial fragment of the EPSPS
gene (and therefore does not encode a fully functional EPSPS
polypeptide by itself). The EPSPS polynucleotide modification
template contained three point mutations that were responsible for
the creation of the T102I/P106S (TIPS) double mutant (Funke, T et
al., J. Biol. Chem. 2009, 284:9854-9860), which provides glyphosate
tolerance to transgenic plants expressing an EPSPS double mutant
transgene.
[0305] As defined herein "Glyphosate" includes any herbicidally
effective form of N-phosphonomethylglycine (including any salt
thereof), other forms which result in the production of the
glyphosate anion in plants and any other herbicides of the
phosphonomethylglycine family.
[0306] In one embodiment of the disclosure, an epsps mutant plant
is produced by the method described herein, said method comprising:
a) providing a guide RNA, a polynucleotide modification template
and at least one Cas endonuclease to a plant cell, wherein the Cas
endonuclease introduces a double-strand break at a target site
within an epsps (enolpyruvylshikimate-3-phosphate synthase) genomic
sequence in the plant genome, wherein said polynucleotide
modification template comprises at least one nucleotide
modification of said epsps genomic sequence; b) obtaining a plant
from the plant cell of (a); c) evaluating the plant of (b) for the
presence of said at least one nucleotide modification and d)
selecting a progeny plant that shows resistance to glyphosate.
[0307] Increased resistance to an herbicide is demonstrated when
plants which display the increased resistance to an herbicide are
subjected to the herbicide and a dose/response curve is shifted to
the right when compared with that provided by an appropriate
control plant. Such dose/response curves have "dose" plotted on the
x-axis and "percentage injury", "herbicidal effect" etc. plotted on
the y-axis. Plants which are substantially resistant to the
herbicide exhibit few, if any, bleached, necrotic, lytic, chlorotic
or other lesions and are not stunted, wilted or deformed when
subjected to the herbicide at concentrations and rates which are
typically employed by the agricultural community to kill weeds in
the field. The terms resistance and tolerance may be used
interchangeably.
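The rightward shift of a dose/response curve can be illustrated with
a minimal sketch. The log-logistic curve shape, the slope and the
ED50 values used here are assumptions chosen only for illustration;
the disclosure does not specify a particular curve model or any of
these numbers.

```python
def percent_injury(dose, ed50, slope=2.0):
    """Percentage injury at a given herbicide dose.

    Assumed log-logistic model: injury is 50% when dose == ed50.
    A larger ed50 moves the whole curve to the right on the dose axis.
    """
    if dose <= 0:
        return 0.0
    return 100.0 / (1.0 + (ed50 / dose) ** slope)


# Hypothetical ED50 values in arbitrary dose units.
ED50_CONTROL = 1.0
ED50_RESISTANT = 8.0

doses = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
control_curve = [percent_injury(d, ED50_CONTROL) for d in doses]
resistant_curve = [percent_injury(d, ED50_RESISTANT) for d in doses]
```

Under these assumptions the resistant curve lies below the control
curve at every dose, which is equivalent to the curve being shifted
to the right.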
[0308] FIG. 12 shows a schematic representation of components used
in the genome editing procedure. A maize optimized Cas
endonuclease, a guide RNA and a polynucleotide modification
template were provided to a plant cell. For example, as shown in
FIG. 12, the polynucleotide modification template included three
nucleotide modifications (indicated by arrows) when compared to the
EPSPS genomic sequence to be edited. These three nucleotide
modifications are referred to as TIPS mutations as these nucleotide
modifications result in the amino acid changes T-102 to I-102 and
P-106 to S-106. The first point mutation results from the
substitution of the C nucleotide in the codon sequence ACT with a T
nucleotide; the second point mutation results from the substitution
of the T nucleotide in the same codon sequence ACT with a C
nucleotide to form the isoleucine codon ATC; and the third point
mutation results from the substitution of the first C nucleotide in
the codon sequence CCA with a T nucleotide in order to form the
serine codon TCA (FIG. 12).
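The three TIPS nucleotide substitutions described above can be
traced with a minimal sketch. The helper function and the reduced
codon table are illustrative only; the table lists just the five
codons involved, with their standard genetic code assignments.

```python
# Standard genetic code entries for the codons involved in the TIPS edit.
CODON_TABLE = {
    "ACT": "T",  # threonine
    "ATT": "I",  # isoleucine
    "ATC": "I",  # isoleucine
    "CCA": "P",  # proline
    "TCA": "S",  # serine
}


def substitute(codon, index, new_base):
    """Return the codon with the base at the 0-based index replaced."""
    return codon[:index] + new_base + codon[index + 1:]


# Codon 102: ACT (Thr) -> ATT -> ATC (Ile), two substitutions.
thr_codon = "ACT"
after_first = substitute(thr_codon, 1, "T")     # C replaced with T
after_second = substitute(after_first, 2, "C")  # T replaced with C

# Codon 106: CCA (Pro) -> TCA (Ser), one substitution.
pro_codon = "CCA"
after_third = substitute(pro_codon, 0, "T")     # first C replaced with T

amino_acid_changes = (
    CODON_TABLE[thr_codon] + "102" + CODON_TABLE[after_second],
    CODON_TABLE[pro_codon] + "106" + CODON_TABLE[after_third],
)
```

The intermediate codon ATT already encodes isoleucine; the second
substitution yields the ATC codon named in the text.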
[0309] In one embodiment, the disclosure describes a method for
producing an epsps (enolpyruvylshikimate-3-phosphate synthase)
mutant plant, the method comprising: a) providing a guide RNA, a
polynucleotide modification template and at least one Cas
endonuclease to a plant cell, wherein the Cas endonuclease
introduces a double-strand break at a target site within an epsps
genomic sequence in the plant genome, wherein said polynucleotide
modification template comprises at least one nucleotide
modification of said epsps genomic sequence; b) obtaining a plant
from the plant cell of (a); c) evaluating the plant of (b) for the
presence of said at least one nucleotide modification; and, d)
screening a progeny plant of (c) that is void of said guide RNA and
Cas endonuclease.
[0310] The nucleotide sequence to be edited can be a sequence that
is endogenous, artificial, pre-existing, or transgenic to the cell
that is being edited. For example, the nucleotide sequence in the
genome of a cell can be a native gene, a mutated gene, a non-native
gene, a foreign gene, or a transgene that is stably incorporated
into the genome of a cell. Editing of such a nucleotide sequence may result in
a further desired phenotype or genotype.
Regulatory Sequence Modifications Using the Guide
Polynucleotide/Cas Endonuclease System
[0311] In one embodiment the nucleotide sequence to be modified can
be a regulatory sequence such as a promoter wherein the editing of
the promoter comprises replacing the promoter (also referred to as
a "promoter swap" or "promoter replacement") or promoter fragment
with a different promoter (also referred to as replacement
promoter) or promoter fragment (also referred to as replacement
promoter fragment), wherein the promoter replacement results in any
one of the following or any one combination of the following: an
increased promoter activity, an increased promoter tissue
specificity, a decreased promoter activity, a decreased promoter
tissue specificity, a new promoter activity, an inducible promoter
activity, an extended window of gene expression, a modification of
the timing or developmental progress of gene expression in the same
cell layer or other cell layer (such as, but not limited to,
extending the timing of gene expression in the tapetum of maize
anthers (U.S. Pat. No. 5,837,850 issued Nov. 17, 1998)), a mutation
of DNA binding elements and/or a deletion or addition of DNA
binding elements. The promoter (or promoter fragment) to be
modified can be a promoter (or promoter fragment) that is
endogenous, artificial, pre-existing, or transgenic to the cell
that is being edited. The replacement promoter (or replacement
promoter fragment) can be a promoter (or promoter fragment) that is
endogenous, artificial, pre-existing, or transgenic to the cell
that is being edited.
[0312] In one embodiment the nucleotide sequence can be a promoter
wherein the editing of the promoter comprises replacing an ARGOS 8
promoter with a Zea mays GOS2 PRO:GOS2-intron promoter.
[0313] In one embodiment the nucleotide sequence can be a promoter
wherein the editing of the promoter comprises replacing a native
EPSPS1 promoter with a plant ubiquitin promoter.
[0314] In one embodiment the nucleotide sequence can be a promoter
wherein the editing of the promoter comprises replacing an
endogenous maize NPK1 promoter with a stress inducible maize RAB17
promoter.
[0315] In one embodiment the nucleotide sequence can be a promoter
wherein the promoter to be edited is selected from the group
comprising Zea mays-PEPC1 promoter (Kausch et al, Plant Molecular
Biology, 45: 1-15, 2001), Zea mays Ubiquitin promoter (UBI1ZM PRO,
Christensen et al, Plant Molecular Biology 18: 675-689, 1992), Zea
mays-Rootmet2 promoter (U.S. Pat. No. 7,214,855), Rice actin
promoter (OS-ACTIN PRO, U.S. Pat. No. 5,641,876; McElroy et al, The
Plant Cell, Vol 2, 163-171, February 1990), Sorghum RCC3 promoter
(US 2012/0210463 filed on 13 Feb. 2012), Zea mays-GOS2 promoter
(U.S. Pat. No. 6,504,083), Zea mays-ACO2 promoter (U.S. application
Ser. No. 14/210,711 filed 14 Mar. 2014) or Zea mays-oleosin
promoter (U.S. Pat. No. 8,466,341 B2).
[0316] In another embodiment, the guide polynucleotide/Cas
endonuclease system can be used in combination with a co-delivered
polynucleotide modification template or donor DNA sequence to allow
for the insertion of a promoter or promoter element into a genomic
nucleotide sequence of interest, wherein the promoter insertion (or
promoter element insertion) results in any one of the following or
any one combination of the following: an increased promoter
activity (increased promoter strength), an increased promoter
tissue specificity, a decreased promoter activity, a decreased
promoter tissue specificity, a new promoter activity, an inducible
promoter activity, an extended window of gene expression, a
modification of the timing or developmental progress of gene
expression, a mutation of DNA binding elements and/or an addition of
DNA binding elements. Promoter elements to be inserted can be, but
are not limited to, promoter core elements (such as, but not
limited to, a CAAT box, a CCAAT box, a Pribnow box, and/or a TATA
box), translational regulation sequences and/or a repressor system
for inducible expression (such as TET operator
repressor/operator/inducer elements, or sulphonylurea (Su)
repressor/operator/inducer elements). The dehydration-responsive
element (DRE) was first identified as a cis-acting promoter element
in the promoter of the drought-responsive gene rd29A, which
contains a 9 bp conserved core sequence, TACCGACAT
(Yamaguchi-Shinozaki, K, and Shinozaki, K. (1994) Plant Cell 6,
251-264). Insertion of DRE into an endogenous promoter may confer
drought-inducible expression of the downstream gene. Other
examples are ABA-responsive elements (ABREs), which contain a
(C/T)ACGTGGC consensus sequence found to be present in numerous ABA
and/or stress-regulated genes (Busk P. K., Pages M. (1998) Plant
Mol. Biol. 37:425-435). Insertion of 35S enhancer or MMV enhancer
into an endogenous promoter region will increase gene expression
(U.S. Pat. No. 5,196,525). The promoter (or promoter element) to be
inserted can be a promoter (or promoter element) that is
endogenous, artificial, pre-existing, or transgenic to the cell
that is being edited.
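As an illustration of how the DRE core sequence and the ABRE
consensus described above might be located in a promoter sequence, a
minimal sketch follows. The function name and the example promoter
sequence are hypothetical, and only the forward strand is scanned.

```python
import re

# Motifs as given in the text.
DRE_CORE = "TACCGACAT"            # 9 bp conserved DRE core sequence
ABRE_CONSENSUS = "[CT]ACGTGGC"    # (C/T)ACGTGGC written as a regex class


def find_motifs(sequence):
    """Return 0-based start positions of DRE and ABRE motifs.

    Lookahead patterns are used so that overlapping matches are reported.
    Only the given (forward) strand is searched.
    """
    sequence = sequence.upper()
    dre_hits = [m.start() for m in re.finditer("(?=" + DRE_CORE + ")", sequence)]
    abre_hits = [m.start() for m in re.finditer("(?=" + ABRE_CONSENSUS + ")", sequence)]
    return {"DRE": dre_hits, "ABRE": abre_hits}


# Hypothetical promoter fragment containing one copy of each motif.
example_promoter = "GGATACCGACATTTCACGTGGCAA"
hits = find_motifs(example_promoter)
```

A fuller analysis would also scan the reverse complement, which this
sketch omits for brevity.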
[0317] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used to insert an enhancer element, such as but not
limited to a Cauliflower Mosaic Virus 35S enhancer, in front of an
endogenous FTM1 promoter to enhance expression of FTM1.
[0318] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used to insert a component of the TET operator
repressor/operator/inducer system, or a component of the
sulphonylurea (Su) repressor/operator/inducer system into plant
genomes to generate or control inducible expression systems.
[0319] In another embodiment, the guide polynucleotide/Cas
endonuclease system can be used to allow for the deletion of a
promoter or promoter element, wherein the promoter deletion (or
promoter element deletion) results in any one of the following or
any one combination of the following: a permanently inactivated
gene locus, an increased promoter activity (increased promoter
strength), an increased promoter tissue specificity, a decreased
promoter activity, a decreased promoter tissue specificity, a new
promoter activity, an inducible promoter activity, an extended
window of gene expression, a modification of the timing or
developmental progress of gene expression, a mutation of DNA
binding elements and/or an addition of DNA binding elements.
Promoter elements to be deleted can be, but are not limited to,
promoter core elements, promoter enhancer elements or 35S enhancer
elements (as described in Example 32). The promoter or promoter
fragment to be deleted can be endogenous, artificial, pre-existing,
or transgenic to the cell that is being edited.
[0320] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used to delete the ARGOS 8 promoter present in a
maize genome as described herein.
[0321] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used to delete a 35S enhancer element present in a
plant genome as described herein.
Terminator Modifications Using the Guide Polynucleotide/Cas
Endonuclease System
[0322] In one embodiment the nucleotide sequence to be modified can
be a terminator wherein the editing of the terminator comprises
replacing the terminator (also referred to as a "terminator swap"
or "terminator replacement") or terminator fragment with a
different terminator (also referred to as replacement terminator)
or terminator fragment (also referred to as replacement terminator
fragment), wherein the terminator replacement results in any one of
the following or any one combination of the following: an increased
terminator activity, an increased terminator tissue specificity, a
decreased terminator activity, a decreased terminator tissue
specificity, a mutation of DNA binding elements and/or a deletion
or addition of DNA binding elements. The terminator (or terminator
fragment) to be modified can be a terminator (or terminator
fragment) that is endogenous, artificial, pre-existing, or
transgenic to the cell that is being edited. The replacement
terminator (or replacement terminator fragment) can be a terminator
(or terminator fragment) that is endogenous, artificial,
pre-existing, or transgenic to the cell that is being edited.
[0323] In one embodiment the nucleotide sequence to be modified can
be a terminator wherein the terminator to be edited is selected
from the group comprising terminators from maize Argos 8 or SRTF18
genes, or other terminators, such as potato PinII terminator,
sorghum actin terminator (SB-ACTIN TERM, WO 2013/184537 A1
published December 2013), sorghum SB-GKAF TERM (WO2013019461), rice
T28 terminator (OS-T28 TERM, WO 2013/012729 A2), ATT9 TERM (WO
2013/012729 A2) or GZ-W64A TERM (U.S. Pat. No. 7,053,282).
[0324] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used in combination with a co-delivered
polynucleotide modification template or donor DNA sequence to allow
for the insertion of a terminator or terminator element into a
genomic nucleotide sequence of interest, wherein the terminator
insertion (or terminator element insertion) results in any one of
the following or any one combination of the following: an increased
terminator activity (increased terminator strength), an increased
terminator tissue specificity, a decreased terminator activity, a
decreased terminator tissue specificity, a mutation of DNA binding
elements and/or an addition of DNA binding elements.
The terminator (or terminator element) to be inserted can be a
terminator (or terminator element) that is endogenous, artificial,
pre-existing, or transgenic to the cell that is being edited.
[0325] In another embodiment, the guide polynucleotide/Cas
endonuclease system can be used to allow for the deletion of a
terminator or terminator element, wherein the terminator deletion
(or terminator element deletion) results in any one of the
following or any one combination of the following: an increased
terminator activity (increased terminator strength), an increased
terminator tissue specificity, a decreased terminator activity, a
decreased terminator tissue specificity, a mutation of DNA binding
elements and/or an addition of DNA binding elements. The terminator
or terminator fragment to be deleted can be endogenous, artificial,
pre-existing, or transgenic to the cell that is being edited.
Additional Regulatory Sequence Modifications Using the Guide
Polynucleotide/Cas Endonuclease System
[0326] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used to modify or replace a regulatory sequence in
the genome of a cell. A regulatory sequence is a segment of a
nucleic acid molecule which is capable of increasing or decreasing
the expression of specific genes within an organism and/or is
capable of altering tissue specific expression of genes within an
organism. Examples of regulatory sequences include, but are not
limited to, 3' UTR (untranslated region), 5' UTR,
transcription activators, transcriptional enhancers, transcriptional
repressors, translational repressors, splicing factors, miRNAs,
siRNA, artificial miRNAs, promoter elements, CaMV 35S enhancer,
MMV enhancer elements (PCT/US14/23451 filed Mar. 11, 2013), SECIS
elements, polyadenylation signals, and polyubiquitination sites. In
some embodiments the editing (modification) or replacement of a
regulatory element results in altered protein translation, RNA
cleavage, RNA splicing, transcriptional termination or
post-translational modification. In one embodiment, regulatory elements
can be identified within a promoter and these regulatory elements
can be edited or modified to optimize these regulatory elements
for up or down regulation of the promoter.
[0327] In one embodiment, the genomic sequence of interest to be
modified is a polyubiquitination site, wherein the modification of
the polyubiquitination sites results in a modified rate of protein
degradation. The ubiquitin tag condemns proteins to be degraded by
proteasomes or autophagy. Proteasome inhibitors are known to cause
a protein overproduction. Modifications made to a DNA sequence
encoding a protein of interest can result in at least one amino
acid modification of the protein of interest, wherein said
modification allows for the polyubiquitination of the protein (a
post-translational modification) resulting in a modification of the
protein degradation.
[0328] In one embodiment, the genomic sequence of interest to be
modified is a polyubiquitination site on a maize EPSPS gene,
wherein the polyubiquitination site is modified, resulting in an
increased protein content due to a slower rate of EPSPS protein
degradation.
[0329] In one embodiment, the genomic sequence of interest to be
modified is an intron site, wherein the modification consists of
inserting an intron enhancing motif into the intron which results
in modulation of the transcriptional activity of the gene
comprising said intron.
[0330] In one embodiment, the genomic sequence of interest to be
modified is an intron site, wherein the modification consists of
replacing a soybean EPSP1 intron with a soybean ubiquitin intron 1
as described herein (Example 25).
[0331] In one embodiment, the genomic sequence of interest to be
modified is an intron or UTR site, wherein the modification
consists of inserting at least one microRNA into said intron or UTR
site, wherein expression of the gene comprising the intron or UTR
site also results in expression of said microRNA, which in turn can
silence any gene targeted by the microRNA without disrupting the
gene expression of the native/transgene comprising said intron.
[0332] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used to allow for the deletion or mutation of a Zinc
Finger transcription factor, wherein the deletion or mutation of
the Zinc Finger transcription factor results in or allows for the
creation of a dominant negative Zinc Finger transcription factor
mutant (Li et al 2013 Rice zinc finger protein DST enhances grain
production through controlling Gn1a/OsCKX2 expression PNAS
110:3167-3172). Insertion of a single base pair downstream of the
zinc finger domain will result in a frame shift and produce a new
protein that can still bind DNA but lacks transcriptional activity.
The mutant protein will compete to bind to cytokinin oxidase gene
promoters and block the expression of the cytokinin oxidase gene.
Reduction of cytokinin oxidase gene expression will increase
cytokinin level and promote panicle growth in rice and ear growth
in maize, and increase yield under normal and stress
conditions.
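The frame shift described above can be illustrated with a short sketch. The coding sequence below is a made-up toy sequence, not the rice DST gene or any sequence of this disclosure, and the helper names (translate, insert_base) are hypothetical. The sketch only shows that a single inserted base leaves the upstream residues unchanged while altering every codon downstream of the insertion.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence up to the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":  # a stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_base(cds, position, base):
    """Insert one base immediately before the 0-based position."""
    return cds[:position] + base + cds[position:]


# Toy sequence (hypothetical): the first five codons stand in for
# the DNA-binding domain that should remain intact.
wild_type = "ATGTGTCCAGAAAAGCATGGTCTGGAATAA"
mutant = insert_base(wild_type, 15, "A")

print(translate(wild_type))  # MCPEKHGLE
print(translate(mutant))     # MCPEKTWSGI
```

In this toy case the first five residues are identical in both products, while the residues after the insertion differ and the original stop codon is no longer read in frame.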
Modifications of Splicing Sites and/or Introducing Alternate
Splicing Sites Using the Guide Polynucleotide/Cas Endonuclease
System
[0333] Protein synthesis utilizes mRNA molecules that emerge from
pre-mRNA molecules subjected to the maturation process. The
pre-mRNA molecules are capped, spliced and stabilized by addition
of polyA tails. Eukaryotic cells developed a complex process of
splicing that results in alternative variants of the original
pre-mRNA molecules. Some of them may not produce functional
templates for protein synthesis. In maize cells, the splicing
process is affected by splicing sites at the exon-intron junction
sites. An example of a canonical splice site is AGGT. Gene coding
sequences can contain a number of alternate splicing sites that
may affect the overall efficiency of the pre-mRNA maturation
process and as such may limit the protein accumulation in cells.
The guide polynucleotide/Cas endonuclease system can be used in
combination with a co-delivered polynucleotide modification
template to edit a gene of interest to introduce a canonical splice
site at a described junction or any variant of a splicing site that
changes the splicing pattern of pre-mRNA molecules.
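As a minimal sketch of how candidate sites might be located before designing an edit, the following scans a sequence for the AGGT motif named above. The function name find_motif and the example sequence are hypothetical, and a plain text search of this kind says nothing about whether a site is actually used by the splicing machinery.

```python
def find_motif(sequence, motif="AGGT"):
    """Return the 0-based start of every occurrence of motif,
    including overlapping occurrences."""
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


# Hypothetical fragment containing two AGGT motifs.
fragment = "ATGGCCAGGTAAGTCTTTCAGGTTGA"
print(find_motif(fragment))  # [6, 19]
```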
[0334] In one embodiment, the nucleotide sequence of interest to be
modified is a maize EPSPS gene, wherein the modification of the
gene consists of modifying alternative splicing sites resulting in
enhanced production of the functional gene transcripts and gene
products (proteins).
[0335] In one embodiment, the nucleotide sequence of interest to be
modified is a gene, wherein the modification of the gene consists
of editing the intron borders of alternatively spliced genes to
alter the accumulation of splice variants.
Modifications of Nucleotide Sequences Encoding a Protein of
Interest Using the Guide Polynucleotide/Cas Endonuclease System
[0336] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used to modify or replace a coding sequence in the
genome of a cell, wherein the modification or replacement results
in any one of the following, or any one combination of the
following: an increased protein (enzyme) activity, an increased
protein functionality, a decreased protein activity, a decreased
protein functionality, a site specific mutation, a protein domain
swap, a protein knock-out, a new protein functionality, or a modified
protein functionality.
[0337] In one embodiment the protein knockout is due to the
introduction of a stop codon into the coding sequence of
interest.
[0338] In one embodiment the protein knockout is due to the
deletion of a start codon from the coding sequence of interest.
Amino Acid and/or Protein Fusions Using the Guide
Polynucleotide/Cas Endonuclease System
[0339] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used with or without a co-delivered polynucleotide
sequence to fuse a first coding sequence encoding a first protein
to a second coding sequence encoding a second protein in the genome
of a cell, wherein the protein fusion results in any one of the
following or any one combination of the following: an increased
protein (enzyme) activity, an increased protein functionality, a
decreased protein activity, a decreased protein functionality, a
new protein functionality, a modified protein functionality, a new
protein localization, a new timing of protein expression, a
modified protein expression pattern, a chimeric protein, or a
modified protein with dominant phenotype functionality.
[0340] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used with or without a co-delivered polynucleotide
sequence to fuse a first coding sequence encoding a chloroplast
localization signal to a second coding sequence encoding a protein
of interest, wherein the protein fusion results in targeting the
protein of interest to the chloroplast.
[0341] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used with or without a co-delivered polynucleotide
sequence to fuse a first coding sequence encoding a chloroplast
localization signal to a second coding sequence encoding a protein
of interest, wherein the protein fusion results in targeting the
protein of interest to the chloroplast.
[0342] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used with or without a co-delivered polynucleotide
sequence to fuse a first coding sequence encoding a chloroplast
localization signal (e.g., a chloroplast transit peptide) to a
second coding sequence, wherein the protein fusion results in a
modified protein with dominant phenotype functionality.
Gene Silencing by Expressing an Inverted Repeat into a Gene of
Interest Using the Guide Polynucleotide/Cas Endonuclease System
[0343] In one embodiment, the guide polynucleotide/Cas endonuclease
system can be used in combination with a co-delivered
polynucleotide sequence to insert an inverted gene fragment into a
gene of interest in the genome of an organism, wherein the
insertion of the inverted gene fragment can allow for an in-vivo
creation of an inverted repeat (hairpin) and results in the
silencing of said endogenous gene.
[0344] In one embodiment the insertion of the inverted gene
fragment can result in the formation of an in-vivo created inverted
repeat (hairpin) in a native (or modified) promoter of a gene
and/or in a native 5' end of the native gene. The inverted gene
fragment can further comprise an intron which can result in an
enhanced silencing of the targeted gene.
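The sequence relationship behind an inverted repeat can be sketched as follows. The fragment and spacer sequences are invented for illustration, and the function names are hypothetical. The sketch simply places the reverse complement of a fragment downstream of the fragment itself, which is the arrangement that allows the transcript to fold back into a hairpin.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_inverted_repeat(fragment, spacer):
    """Join a fragment, a spacer (for example an intron), and the
    inverted copy of the fragment."""
    return fragment.upper() + spacer.upper() + reverse_complement(fragment)


# Hypothetical gene fragment and spacer.
gene_fragment = "ATGGCTTA"
spacer = "GTAAGT"
construct = build_inverted_repeat(gene_fragment, spacer)
print(construct)  # ATGGCTTAGTAAGTTAAGCCAT
```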
Genome Deletion for Trait Locus Characterization
[0345] Trait mapping in plant breeding often results in the
detection of chromosomal regions housing one or more genes
controlling expression of a trait of interest. For a qualitative
trait, the guide polynucleotide/Cas endonuclease system can be used
to eliminate candidate genes in the identified chromosomal regions
to determine if deletion of the gene affects expression of the
trait. For quantitative traits, expression of a trait of interest
is governed by multiple quantitative trait loci (QTL) of varying
effect-size, complexity, and statistical significance across one or
more chromosomes. In cases of negative effect or deleterious QTL
regions affecting a complex trait, the guide polynucleotide/Cas
endonuclease system can be used to eliminate whole regions
delimited by marker-assisted fine mapping, and to target specific
regions for their selective elimination or rearrangement.
Similarly, presence/absence variation (PAV) or copy number
variation (CNV) can be manipulated with selective genome deletion
using the guide polynucleotide/Cas endonuclease system.
[0346] In one embodiment, the region of interest can be flanked by
two independent guide polynucleotide/Cas endonuclease target
sequences. Cutting would be done concurrently. The deletion event
would be the repair of the two chromosomal ends without the region
of interest. Alternative results would include inversions of the
region of interest, mutations at the cut sites and duplication of
the region of interest.
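The repair outcomes listed above can be sketched at the sequence level. This is a simplified model under stated assumptions: the chromosome is a short invented string, cuts are blunt and fall exactly at the given 0-based coordinates, and the function name two_cut_outcomes is hypothetical. Mutations at the cut sites are not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def two_cut_outcomes(chromosome, cut1, cut2):
    """Model three rejoining outcomes after two concurrent cuts.

    cut1 and cut2 are 0-based positions; the region of interest
    is chromosome[cut1:cut2].
    """
    if not 0 <= cut1 < cut2 <= len(chromosome):
        raise ValueError("cuts must satisfy 0 <= cut1 < cut2 <= length")
    left = chromosome[:cut1]
    region = chromosome[cut1:cut2]
    right = chromosome[cut2:]
    return {
        "deletion": left + right,
        "inversion": left + reverse_complement(region) + right,
        "duplication": left + region + region + right,
    }


# Hypothetical chromosome: flanks GGGG and CCCC around region ACGTTA.
outcomes = two_cut_outcomes("GGGGACGTTACCCC", 4, 10)
print(outcomes["deletion"])     # GGGGCCCC
print(outcomes["inversion"])    # GGGGTAACGTCCCC
print(outcomes["duplication"])  # GGGGACGTTAACGTTACCCC
```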
[0347] Methods for Identifying at Least One Plant Cell Comprising
in its Genome a Polynucleotide of Interest Integrated at the Target
Site.
[0348] Further provided are methods for identifying at least one
plant cell, comprising in its genome, a polynucleotide of interest
integrated at the target site. A variety of methods are available
for identifying those plant cells with insertion into the genome at
or near to the target site without using a screenable marker
phenotype. Such methods can be viewed as directly analyzing a
target sequence to detect any change in the target sequence,
including but not limited to PCR methods, sequencing methods,
nuclease digestion, Southern blots, and any combination thereof.
See, for example, U.S. patent application Ser. No. 12/147,834,
herein incorporated by reference to the extent necessary for the
methods described herein. The method also comprises recovering a
plant from the plant cell comprising a polynucleotide of interest
integrated into its genome. The plant may be sterile or fertile. It
is recognized that any polynucleotide of interest can be provided,
integrated into the plant genome at the target site, and expressed
in a plant.
[0349] Polynucleotides/polypeptides of interest include, but are
not limited to, herbicide-resistance coding sequences, insecticidal
coding sequences, nematicidal coding sequences, antimicrobial
coding sequences, antifungal coding sequences, antiviral coding
sequences, abiotic and biotic stress tolerance coding sequences, or
sequences modifying plant traits such as yield, grain quality,
nutrient content, starch quality and quantity, nitrogen fixation
and/or utilization, fatty acids, and oil content and/or
composition. More specific polynucleotides of interest include, but
are not limited to, genes that improve crop yield, polypeptides
that improve desirability of crops, genes encoding proteins
conferring resistance to abiotic stress, such as drought, nitrogen,
temperature, salinity, toxic metals or trace elements, or those
conferring resistance to toxins such as pesticides and herbicides,
or to biotic stress, such as attacks by fungi, viruses, bacteria,
insects, and nematodes, and development of diseases associated with
these organisms. General categories of genes of interest include,
for example, those genes involved in information, such as zinc
fingers, those involved in communication, such as kinases, and
those involved in housekeeping, such as heat shock proteins. More
specific categories of transgenes, for example, include genes
encoding important traits for agronomics, insect resistance,
disease resistance, herbicide resistance, fertility or sterility,
grain characteristics, and commercial products. Genes of interest
include, generally, those involved in oil, starch, carbohydrate, or
nutrient metabolism as well as those affecting kernel size, sucrose
loading, and the like that can be stacked or used in combination
with other traits, such as but not limited to herbicide resistance,
described herein.
[0350] Agronomically important traits such as oil, starch, and
protein content can be genetically altered in addition to using
traditional breeding methods. Modifications include increasing
content of oleic acid, saturated and unsaturated oils, increasing
levels of lysine and sulfur, providing essential amino acids, and
also modification of starch. Hordothionin protein modifications are
described in U.S. Pat. Nos. 5,703,049, 5,885,801, 5,885,802, and
5,990,389, herein incorporated by reference. Another example is
lysine and/or sulfur rich seed protein encoded by the soybean 2S
albumin described in U.S. Pat. No. 5,850,016, and the chymotrypsin
inhibitor from barley, described in Williamson et al. (1987) Eur.
J. Biochem. 165:99-106, the disclosures of which are herein
incorporated by reference.
[0351] Commercial traits can also be encoded on a polynucleotide of
interest that could increase, for example, starch for ethanol
production, or provide expression of proteins. Another important
commercial use of transformed plants is the production of polymers
and bioplastics such as described in U.S. Pat. No. 5,602,321. Genes
such as β-ketothiolase, PHBase (polyhydroxybutyrate synthase),
and acetoacetyl-CoA reductase (see Schubert et al. (1988) J.
Bacteriol. 170:5837-5847) facilitate expression of
polyhydroxyalkanoates (PHAs).
[0352] Derivatives of the coding sequences can be made by
site-directed mutagenesis to increase the level of preselected
amino acids in the encoded polypeptide. For example, the gene
encoding the barley high lysine polypeptide (BHL) is derived from
barley chymotrypsin inhibitor, U.S. application Ser. No.
08/740,682, filed Nov. 1, 1996, and WO 98/20133, the disclosures of
which are herein incorporated by reference. Other proteins include
methionine-rich plant proteins such as from sunflower seed (Lilley
et al. (1989) Proceedings of the World Congress on Vegetable
Protein Utilization in Human Foods and Animal Feedstuffs, ed.
Applewhite (American Oil Chemists Society, Champaign, Ill.), pp.
497-502; herein incorporated by reference); corn (Pedersen et al.
(1986) J. Biol. Chem. 261:6279; Kirihara et al. (1988) Gene 71:359;
both of which are herein incorporated by reference); and rice
(Masumura et al. (1989) Plant Mol. Biol. 12:123, herein
incorporated by reference). Other agronomically important genes
encode latex, Floury 2, growth factors, seed storage factors, and
transcription factors.
[0353] Polynucleotides that improve crop yield include dwarfing
genes, such as Rht1 and Rht2 (Peng et al. (1999) Nature
400:256-261), and those that increase plant growth, such as
ammonium-inducible glutamate dehydrogenase. Polynucleotides that
improve desirability of crops include, for example, those that
allow plants to have reduced saturated fat content, those that
boost the nutritional value of plants, and those that increase
grain protein. Polynucleotides that improve salt tolerance are
those that increase or allow plant growth in an environment of
higher salinity than the native environment of the plant into which
the salt-tolerant gene(s) has been introduced.
[0354] Polynucleotides/polypeptides that influence amino acid
biosynthesis include, for example, anthranilate synthase (AS; EC
4.1.3.27) which catalyzes the first reaction branching from the
aromatic amino acid pathway to the biosynthesis of tryptophan in
plants, fungi, and bacteria. In plants, the chemical processes for
the biosynthesis of tryptophan are compartmentalized in the
chloroplast. See, for example, US Pub. 20080050506, herein
incorporated by reference. Additional sequences of interest include
Chorismate Pyruvate Lyase (CPL) which refers to a gene encoding an
enzyme which catalyzes the conversion of chorismate to pyruvate and
pHBA. The most well characterized CPL gene has been isolated from
E. coli and bears the GenBank accession number M96268. See, U.S.
Pat. No. 7,361,811, herein incorporated by reference.
[0355] Polynucleotide sequences of interest may encode proteins
involved in providing disease or pest resistance. By "disease
resistance" or "pest resistance" is intended that the plants avoid
the harmful symptoms that are the outcome of the plant-pathogen
interactions. Pest resistance genes may encode resistance to pests
that have great yield drag such as rootworm, cutworm, European Corn
Borer, and the like. Disease resistance and insect resistance genes
such as lysozymes or cecropins for antibacterial protection, or
proteins such as defensins, glucanases or chitinases for antifungal
protection, or Bacillus thuringiensis endotoxins, protease
inhibitors, collagenases, lectins, or glycosidases for controlling
nematodes or insects are all examples of useful gene products.
Genes encoding disease resistance traits include detoxification
genes, such as against fumonisin (U.S. Pat. No. 5,792,931);
avirulence (avr) and disease resistance (R) genes (Jones et al.
(1994) Science 266:789; Martin et al. (1993) Science 262:1432; and
Mindrinos et al. (1994) Cell 78:1089); and the like. Insect
resistance genes may encode resistance to pests that have great
yield drag such as rootworm, cutworm, European Corn Borer, and the
like. Such genes include, for example, Bacillus thuringiensis toxic
protein genes (U.S. Pat. Nos. 5,366,892; 5,747,450; 5,736,514;
5,723,756; 5,593,881; and Geiser et al. (1986) Gene 48:109); and
the like.
[0356] An "herbicide resistance protein" or a protein resulting
from expression of an "herbicide resistance-encoding nucleic acid
molecule" includes proteins that confer upon a cell the ability to
tolerate a higher concentration of an herbicide than cells that do
not express the protein, or to tolerate a certain concentration of
an herbicide for a longer period of time than cells that do not
express the protein. Herbicide resistance traits may be introduced
into plants by genes coding for resistance to herbicides that act
to inhibit the action of acetolactate synthase (ALS), in particular
the sulfonylurea-type herbicides, genes coding for resistance to
herbicides that act to inhibit the action of glutamine synthase,
such as phosphinothricin or basta (e.g., the bar gene), glyphosate
(e.g., the EPSP synthase gene and the GAT gene), HPPD inhibitors
(e.g., the HPPD gene) or other such genes known in the art. See, for
example, U.S. Pat. Nos. 7,626,077, 5,310,667, 5,866,775, 6,225,114,
6,248,876, 7,169,970, 6,867,293, and U.S. Provisional Application
No. 61/401,456, each of which is herein incorporated by reference.
The bar gene encodes resistance to the herbicide basta, the nptII
gene encodes resistance to the antibiotics kanamycin and geneticin,
and the ALS-gene mutants encode resistance to the herbicide
chlorsulfuron.
[0357] Sterility genes can also be encoded in an expression
cassette and provide an alternative to physical detasseling.
Examples of genes used in such ways include male fertility genes
such as MS26 (see for example U.S. Pat. Nos. 7,098,388, 7,517,975,
7,612,251), MS45 (see for example U.S. Pat. Nos. 5,478,369,
6,265,640) or MSCA1 (see for example U.S. Pat. No. 7,919,676).
Maize plants (Zea mays L.) can be bred by both self-pollination and
cross-pollination techniques. Maize has male flowers, located on
the tassel, and female flowers, located on the ear, on the same
plant. It can self-pollinate ("selfing") or cross pollinate.
Natural pollination occurs in maize when wind blows pollen from the
tassels to the silks that protrude from the tops of the incipient
ears. Pollination may be readily controlled by techniques known to
those of skill in the art. The development of maize hybrids
requires the development of homozygous inbred lines, the crossing
of these lines, and the evaluation of the crosses. Pedigree
breeding and recurrent selections are two of the breeding methods
used to develop inbred lines from populations. Breeding programs
combine desirable traits from two or more inbred lines or various
broad-based sources into breeding pools from which new inbred lines
are developed by selfing and selection of desired phenotypes. A
hybrid maize variety is the cross of two such inbred lines, each of
which may have one or more desirable characteristics lacked by the
other or which complement the other. The new inbreds are crossed
with other inbred lines and the hybrids from these crosses are
evaluated to determine which have commercial potential. The hybrid
progeny of the first generation is designated F1. The F1 hybrid is
more vigorous than its inbred parents. This hybrid vigor, or
heterosis, can be manifested in many ways, including increased
vegetative growth and increased yield.
[0358] Hybrid maize seed can be produced by a male sterility system
incorporating manual detasseling. To produce hybrid seed, the male
tassel is removed from the growing female inbred parent, which can
be planted in various alternating row patterns with the male inbred
parent. Consequently, providing that there is sufficient isolation
from sources of foreign maize pollen, the ears of the female inbred
will be fertilized only with pollen from the male inbred. The
resulting seed is therefore hybrid (F1) and will form hybrid
plants.
[0359] Field variation impacting plant development can result in
plants tasseling after manual detasseling of the female parent is
completed. Or, a female inbred plant tassel may not be completely
removed during the detasseling process. In any event, the result is
that the female plant will successfully shed pollen and some female
plants will be self-pollinated. This will result in seed of the
female inbred being harvested along with the hybrid seed which is
normally produced. Female inbred seed does not exhibit heterosis
and therefore is not as productive as F1 seed. In addition, the
presence of female inbred seed can represent a germplasm security
risk for the company producing the hybrid.
[0360] Alternatively, the female inbred can be mechanically
detasseled by machine.
[0361] Mechanical detasseling is approximately as reliable as hand
detasseling, but is faster and less costly. However, most
detasseling machines produce more damage to the plants than hand
detasseling. Thus, no form of detasseling is presently entirely
satisfactory, and a need continues to exist for alternatives which
further reduce production costs and eliminate self-pollination
of the female parent in the production of hybrid seed.
[0362] Mutations that cause male sterility in plants have the
potential to be useful in methods for hybrid seed production for
crop plants such as maize and can lower production costs by
eliminating the need for the labor-intensive removal of male
flowers (also known as de-tasseling) from the maternal parent
plants used as a hybrid parent. Mutations that cause male sterility
in maize have been produced by a variety of methods such as X-rays
or UV-irradiations, chemical treatments, or transposable element
insertions (ms23, ms25, ms26, ms32) (Chaubal et al. (2000) Am J Bot
87:1193-1201). Conditional regulation of fertility genes through
fertility/sterility "molecular switches" could enhance the options
for designing new male-sterility systems for crop improvement
(Unger et al. (2002) Transgenic Res 11:455-465).
[0363] Besides identification of novel genes impacting male
fertility, there remains a need to provide a reliable system of
producing genetic male sterility.
[0364] In U.S. Pat. No. 5,478,369, a method is described by which
the Ms45 male fertility gene was tagged and cloned on maize
chromosome 9. Previously, there had been described a male fertility
gene on chromosome 9, ms2, which had never been cloned and
sequenced. It is not allelic to the gene referred to in the '369
patent. See Albertsen, M. and Phillips, R. L., "Developmental
Cytology of 13 Genetic Male Sterile Loci in Maize" Canadian Journal
of Genetics & Cytology 23:195-208 (January 1981). The only
fertility gene cloned before that had been the Arabidopsis gene
described at Aarts, et al., supra.
[0365] Examples of genes that have been discovered subsequently
that are important to male fertility are numerous and include the
Arabidopsis ABORTED MICROSPORES (AMS) gene (Sorensen et al., The
Plant Journal (2003) 33(2):413-423); the Arabidopsis MS1 gene
(Wilson et al., The Plant Journal (2001) 39(2):170-181); the NEF1
gene (Ariizumi et al., The Plant Journal (2004) 39(2):170-181);
the Arabidopsis AtGPAT1 gene (Zheng et al., The Plant Cell (2003)
15:1872-1887); the Arabidopsis dde2-2 mutation, which was shown to be
defective in the allene oxide synthase gene (Malek et al., Planta
(2002) 216:187-192); the Arabidopsis faceless pollen-1 gene (flp1)
(Ariizumi et al., Plant Mol. Biol. (2003) 53:107-116); the
Arabidopsis MALE MEIOCYTE DEATH1 gene (Yang et al., The Plant Cell
(2003) 15: 1281-1295); the tapetum-specific zinc finger gene, TAZ1
(Kapoor et al., The Plant Cell (2002) 14:2353-2367); and the
TAPETUM DETERMINANT1 gene (Lan et al., The Plant Cell (2003)
15:2792-2804).
[0366] Other known male fertility mutants or genes from Zea mays
are listed in U.S. Pat. No. 7,919,676 incorporated herein by
reference.
[0367] Other genes include kinases and those encoding compounds
toxic to either male or female gametophytic development.
[0368] Furthermore, it is recognized that the polynucleotide of
interest may also comprise antisense sequences complementary to at
least a portion of the messenger RNA (mRNA) for a targeted gene
sequence of interest. Antisense nucleotides are constructed to
hybridize with the corresponding mRNA. Modifications of the
antisense sequences may be made as long as the sequences hybridize
to and interfere with expression of the corresponding mRNA. In this
manner, antisense constructions having 70%, 80%, or 85% sequence
identity to the corresponding antisense sequences may be used.
Furthermore, portions of the antisense nucleotides may be used to
disrupt the expression of the target gene. Generally, sequences of
at least 50 nucleotides, 100 nucleotides, 200 nucleotides, or
greater may be used.
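A percent sequence identity figure of the kind recited above can be sketched for the simplest case. The sketch assumes two sequences of equal length compared position by position without gaps; sequences of different lengths would first need an alignment program, which is not shown. The function name percent_identity and both sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical 20-nucleotide sequences differing at three positions.
reference = "ATGGCTTACGATCGTACGTA"
variant = "ATGGCTTACGTTCGAACGTT"
print(percent_identity(reference, variant))  # 85.0
```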
[0369] In addition, the polynucleotide of interest may also be used
in the sense orientation to suppress the expression of endogenous
genes in plants. Methods for suppressing gene expression in plants
using polynucleotides in the sense orientation are known in the
art. The methods generally involve transforming plants with a DNA
construct comprising a promoter that drives expression in a plant
operably linked to at least a portion of a nucleotide sequence that
corresponds to the transcript of the endogenous gene. Typically,
such a nucleotide sequence has substantial sequence identity to the
sequence of the transcript of the endogenous gene, generally
greater than about 65% sequence identity, about 85% sequence
identity, or greater than about 95% sequence identity. See, U.S.
Pat. Nos. 5,283,184 and 5,034,323; herein incorporated by
reference.
[0370] The polynucleotide of interest can also be a phenotypic
marker. A phenotypic marker is a screenable or a selectable marker
and includes visual markers and selectable markers, whether
positive or negative. Any phenotypic marker can
be used. Specifically, a selectable or screenable marker comprises
a DNA segment that allows one to identify, or select for or against
a molecule or a cell that contains it, often under particular
conditions. These markers can encode an activity, such as, but not
limited to, production of RNA, peptide, or protein, or can provide
a binding site for RNA, peptides, proteins, inorganic and organic
compounds or compositions and the like.
[0371] Examples of selectable markers include, but are not limited
to, DNA segments that comprise restriction enzyme sites; DNA
segments that encode products which provide resistance against
otherwise toxic compounds (including antibiotics, such as
spectinomycin, ampicillin, kanamycin, tetracycline, Basta, neomycin
phosphotransferase II (NEO) and hygromycin phosphotransferase
(HPT)); DNA segments that encode products which are otherwise
lacking in the recipient cell (e.g., tRNA genes, auxotrophic
markers); DNA segments that encode products which can be readily
identified (e.g., phenotypic markers such as β-galactosidase,
GUS; fluorescent proteins such as green fluorescent protein (GFP),
cyan (CFP), yellow (YFP), red (RFP), and cell surface proteins);
the generation of new primer sites for PCR (e.g., the juxtaposition
of two DNA sequences not previously juxtaposed), the inclusion of
DNA sequences not acted upon or acted upon by a restriction
endonuclease or other DNA modifying enzyme, chemical, etc.; and,
the inclusion of a DNA sequence required for a specific
modification (e.g., methylation) that allows its
identification.
[0372] Additional selectable markers include genes that confer
resistance to herbicidal compounds, such as glufosinate ammonium,
bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D).
See for example, Yarranton, (1992) Curr Opin Biotech 3:506-11;
Christopherson et al., (1992) Proc. Natl. Acad. Sci. USA 89:6314-8;
Yao et al., (1992) Cell 71:63-72; Reznikoff, (1992) Mol Microbiol
6:2419-22; Hu et al., (1987) Cell 48:555-66; Brown et al., (1987)
Cell 49:603-12; Figge et al., (1988) Cell 52:713-22; Deuschle et
al., (1989) Proc. Natl. Acad. Sci. USA 86:5400-4; Fuerst et al.,
(1989) Proc. Natl. Acad. Sci. USA 86:2549-53; Deuschle et al.,
(1990) Science 248:480-3; Gossen, (1993) Ph.D. Thesis, University
of Heidelberg; Reines et al., (1993) Proc. Natl. Acad. Sci. USA
90:1917-21; Labow et al., (1990) Mol Cell Biol 10:3343-56;
Zambretti et al., (1992) Proc. Natl. Acad. Sci. USA 89:3952-6; Baim
et al., (1991) Proc. Natl. Acad. Sci. USA 88:5072-6; Wyborski et
al., (1991) Nucleic Acids Res 19:4647-53; Hillen and Wissman,
(1989) Topics Mol Struc Biol 10:143-62; Degenkolb et al., (1991)
Antimicrob Agents Chemother 35:1591-5; Kleinschmidt et al., (1988)
Biochemistry 27:1094-104; Bonin, (1993) Ph.D. Thesis, University of
Heidelberg; Gossen et al., (1992) Proc. Natl. Acad. Sci. USA
89:5547-51; Oliva et al., (1992) Antimicrob Agents Chemother
36:913-9; Hlavka et al., (1985) Handbook of Experimental
Pharmacology, Vol. 78 (Springer-Verlag, Berlin); Gill et al.,
(1988) Nature 334:721-4. Commercial traits can also be encoded on a
gene or genes that could increase, for example, starch for ethanol
production, or provide expression of proteins. Another important
commercial use of transformed plants is the production of polymers
and bioplastics such as described in U.S. Pat. No. 5,602,321. Genes
such as β-ketothiolase, PHBase (polyhydroxybutyrate synthase),
and acetoacetyl-CoA reductase (see Schubert et al. (1988) J.
Bacteriol. 170:5837-5847) facilitate expression of
polyhydroxyalkanoates (PHAs).
[0373] Exogenous products include plant enzymes and products as
well as those from other sources including procaryotes and other
eukaryotes. Such products include enzymes, cofactors, hormones, and
the like. The level of proteins, particularly modified proteins
having improved amino acid distribution to improve the nutrient
value of the plant, can be increased. This is achieved by the
expression of such proteins having enhanced amino acid content.
[0374] The transgenes, recombinant DNA molecules, DNA sequences of
interest, and polynucleotides of interest can comprise one or
more DNA sequences for gene silencing. Methods for gene silencing
involving the expression of DNA sequences in plants are known in the
art and include, but are not limited to, cosuppression, antisense
suppression, double-stranded RNA (dsRNA) interference, hairpin RNA
(hpRNA) interference, intron-containing hairpin RNA (ihpRNA)
interference, transcriptional gene silencing, and micro RNA (miRNA)
interference.
[0375] As used herein, "nucleic acid" means a polynucleotide and
includes a single or a double-stranded polymer of
deoxyribonucleotide or ribonucleotide bases. Nucleic acids may also
include fragments and modified nucleotides. Thus, the terms
"polynucleotide", "nucleic acid sequence", "nucleotide sequence"
and "nucleic acid fragment" are used interchangeably to denote a
polymer of RNA and/or DNA that is single- or double-stranded,
optionally containing synthetic, non-natural, or altered nucleotide
bases. Nucleotides (usually found in their 5'-monophosphate form)
are referred to by their single letter designation as follows: "A"
for adenosine or deoxyadenosine (for RNA or DNA, respectively), "C"
for cytidine or deoxycytidine, "G" for guanosine or deoxyguanosine,
"U" for uridine, "T" for deoxythymidine, "R" for purines (A or G),
"Y" for pyrimidines (C or T), "K" for G or T, "H" for A or C or T,
"I" for inosine, and "N" for any nucleotide.
[0376] "Open reading frame" is abbreviated ORF.
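The single letter nucleotide designations defined above lend themselves to a simple lookup table. The sketch below covers only the DNA letters and the ambiguity codes R, Y, K, H and N as defined herein; "U" and "I" are omitted for brevity. The names AMBIGUITY and matches are hypothetical.

```python
# Each designation maps to the set of DNA bases it can stand for.
AMBIGUITY = {
    "A": "A",
    "C": "C",
    "G": "G",
    "T": "T",
    "R": "AG",    # purines
    "Y": "CT",    # pyrimidines
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def matches(pattern, sequence):
    """Return True if a concrete DNA sequence is consistent with a
    pattern written using the designations above."""
    if len(pattern) != len(sequence):
        return False
    return all(
        base in AMBIGUITY[code]
        for code, base in zip(pattern.upper(), sequence.upper())
    )


print(matches("RYN", "GCA"))  # True
print(matches("RYN", "CCA"))  # False, because C is not a purine
```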
[0377] The terms "subfragment that is functionally equivalent" and
"functionally equivalent subfragment" are used interchangeably
herein. These terms refer to a portion or subsequence of an
isolated nucleic acid fragment in which the ability to alter gene
expression or produce a certain phenotype is retained whether or
not the fragment or subfragment encodes an active enzyme. For
example, the fragment or subfragment can be used in the design of
genes to produce the desired phenotype in a transformed plant.
Genes can be designed for use in suppression by linking a nucleic
acid fragment or subfragment thereof, whether or not it encodes an
active enzyme, in the sense or antisense orientation relative to a
plant promoter sequence.
[0378] The term "conserved domain" or "motif" means a set of amino
acids conserved at specific positions along an aligned sequence of
evolutionarily related proteins. While amino acids at other
positions can vary between homologous proteins, amino acids that
are highly conserved at specific positions indicate amino acids
that are essential to the structure, the stability, or the activity
of a protein. Because they are identified by their high degree of
conservation in aligned sequences of a family of protein
homologues, they can be used as identifiers, or "signatures", to
determine if a protein with a newly determined sequence belongs to
a previously identified protein family.
[0379] Polynucleotide and polypeptide sequences, variants thereof,
and the structural relationships of these sequences can be
described by the terms "homology", "homologous", "substantially
identical", "substantially similar" and "corresponding
substantially" which are used interchangeably herein. These refer
to polypeptide or nucleic acid fragments wherein changes in one or
more amino acids or nucleotide bases do not affect the function of
the molecule, such as the ability to mediate gene expression or to
produce a certain phenotype. These terms also refer to
modification(s) of nucleic acid fragments that do not substantially
alter the functional properties of the resulting nucleic acid
fragment relative to the initial, unmodified fragment. These
modifications include deletion, substitution, and/or insertion of
one or more nucleotides in the nucleic acid fragment.
[0380] Substantially similar nucleic acid sequences encompassed may
be defined by their ability to hybridize (under moderately
stringent conditions, e.g., 0.5×SSC, 0.1% SDS, 60°C)
with the sequences exemplified herein, or to any portion of the
nucleotide sequences disclosed herein and which are functionally
equivalent to any of the nucleic acid sequences disclosed herein.
Stringency conditions can be adjusted to screen for moderately
similar fragments, such as homologous sequences from distantly
related organisms, to highly similar fragments, such as genes that
duplicate functional enzymes from closely related organisms.
Post-hybridization washes determine stringency conditions.
[0381] The term "selectively hybridizes" includes reference to
hybridization, under stringent hybridization conditions, of a
nucleic acid sequence to a specified nucleic acid target sequence
to a detectably greater degree (e.g., at least 2-fold over
background) than its hybridization to non-target nucleic acid
sequences and to the substantial exclusion of non-target nucleic
acids. Selectively hybridizing sequences typically have at least
about 80% sequence identity, or 90% sequence identity, up to and
including 100% sequence identity (i.e., fully complementary) with
each other.
[0382] The term "stringent conditions" or "stringent hybridization
conditions" includes reference to conditions under which a probe
will selectively hybridize to its target sequence in an in vitro
hybridization assay. Stringent conditions are sequence-dependent
and will be different in different circumstances. By controlling
the stringency of the hybridization and/or washing conditions,
target sequences can be identified which are 100% complementary to
the probe (homologous probing). Alternatively, stringency
conditions can be adjusted to allow some mismatching in sequences
so that lower degrees of similarity are detected (heterologous
probing). Generally, a probe is less than about 1000 nucleotides in
length, optionally less than 500 nucleotides in length.
[0383] Typically, stringent conditions will be those in which the
salt concentration is less than about 1.5 M Na ion, typically about
0.01 to 1.0 M Na ion concentration (or other salt(s)) at pH 7.0 to
8.3, and at least about 30°C for short probes (e.g., 10 to
50 nucleotides) and at least about 60°C for long probes
(e.g., greater than 50 nucleotides). Stringent conditions may also
be achieved with the addition of destabilizing agents such as
formamide. Exemplary low stringency conditions include
hybridization with a buffer solution of 30 to 35% formamide, 1 M
NaCl, 1% SDS (sodium dodecyl sulphate) at 37°C,
and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3
M trisodium citrate) at 50 to 55°C. Exemplary moderate
stringency conditions include hybridization in 40 to 45% formamide,
1 M NaCl, 1% SDS at 37°C, and a wash in 0.5× to
1×SSC at 55 to 60°C. Exemplary high stringency
conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS
at 37°C, and a wash in 0.1×SSC at 60 to 65°C.
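The wash conditions above are stated as multiples of SSC, with 20×SSC defined as 3.0 M NaCl/0.3 M trisodium citrate. As a worked illustration, the following Python sketch converts each exemplary wash into molar concentrations by simple proportional dilution. The function and variable names are hypothetical and are not part of the disclosure.

```python
# Stock composition stated in the text: 20x SSC = 3.0 M NaCl / 0.3 M trisodium citrate.
STOCK_FOLD = 20.0
STOCK_NACL_M = 3.0
STOCK_CITRATE_M = 0.3


def ssc_molarity(fold: float) -> tuple[float, float]:
    """Return (NaCl M, trisodium citrate M) for SSC at the given fold strength."""
    scale = fold / STOCK_FOLD
    return STOCK_NACL_M * scale, STOCK_CITRATE_M * scale


# Exemplary washes from the paragraph above:
# (SSC fold range, wash temperature range in degrees Celsius).
WASHES = {
    "low": ((1.0, 2.0), (50, 55)),
    "moderate": ((0.5, 1.0), (55, 60)),
    "high": ((0.1, 0.1), (60, 65)),
}

for name, ((fold_lo, fold_hi), (t_lo, t_hi)) in WASHES.items():
    nacl_lo, _ = ssc_molarity(fold_lo)
    nacl_hi, _ = ssc_molarity(fold_hi)
    print(f"{name}: {nacl_lo:.3f} to {nacl_hi:.3f} M NaCl at {t_lo} to {t_hi} deg C")
```

Under this proportionality, 1×SSC corresponds to 0.15 M NaCl and 0.015 M trisodium citrate, so higher stringency washes combine lower salt with higher temperature.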
[0384] "Sequence identity" or "identity" in the context of nucleic
acid or polypeptide sequences refers to the nucleic acid bases or
amino acid residues in two sequences that are the same when aligned
for maximum correspondence over a specified comparison window.
[0385] The term "percentage of sequence identity" refers to the
value determined by comparing two optimally aligned sequences over
a comparison window, wherein the portion of the polynucleotide or
polypeptide sequence in the comparison window may comprise
additions or deletions (i.e., gaps) as compared to the reference
sequence (which does not comprise additions or deletions) for
optimal alignment of the two sequences. The percentage is
calculated by determining the number of positions at which the
identical nucleic acid base or amino acid residue occurs in both
sequences to yield the number of matched positions, dividing the
number of matched positions by the total number of positions in the
window of comparison and multiplying the results by 100 to yield
the percentage of sequence identity. Useful examples of percent
sequence identities include, but are not limited to, 50%, 55%, 60%,
65%, 70%, 75%, 80%, 85%, 90% or 95%, or any integer percentage from
50% to 100%. These identities can be determined using any of the
programs described herein.
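The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the two sequences have already been optimally aligned by some other program, that gaps are written as "-", and that the comparison window is the full length of the alignment. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two already-aligned sequences.

    Counts positions where both sequences carry the same residue, divides by
    the total number of positions in the comparison window (gap positions
    included), and multiplies by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matched / len(aligned_a)


# Ten aligned positions, eight of them identical.
print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))  # prints 80.0
```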
[0386] Sequence alignments and percent identity or similarity
calculations may be determined using a variety of comparison
methods designed to detect homologous sequences including, but not
limited to, the MegAlign™ program of the LASERGENE
bioinformatics computing suite (DNASTAR Inc., Madison, Wis.).
Within the context of this application it will be understood that
where sequence analysis software is used for analysis, the
results of the analysis will be based on the "default values" of
the program referenced, unless otherwise specified. As used herein
"default values" will mean any set of values or parameters that
originally load with the software when first initialized.
[0387] The "Clustal V method of alignment" corresponds to the
alignment method labeled Clustal V (described by Higgins and Sharp,
(1989) CABIOS 5:151-153; Higgins et al., (1992) Comput Appl Biosci
8:189-191) and found in the MegAlign™ program of the LASERGENE
bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). For
multiple alignments, the default values correspond to GAP
PENALTY=10 and GAP LENGTH PENALTY=10. Default parameters for
pairwise alignments and calculation of percent identity of protein
sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3,
WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters
are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4. After
alignment of the sequences using the Clustal V program, it is
possible to obtain a "percent identity" by viewing the "sequence
distances" table in the same program.
[0388] The "Clustal W method of alignment" corresponds to the
alignment method labeled Clustal W (described by Higgins and Sharp,
(1989) CABIOS 5:151-153; Higgins et al., (1992) Comput Appl Biosci
8:189-191) and found in the MegAlign™ v6.1 program of the
LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison,
Wis.). Default parameters for multiple alignment are GAP PENALTY=10,
GAP LENGTH PENALTY=0.2, Delay Divergen Seqs (%)=30, DNA Transition
Weight=0.5, Protein Weight Matrix=Gonnet Series and DNA Weight
Matrix=IUB. After alignment of the sequences using the Clustal W
program, it is possible to obtain a "percent identity" by viewing
the "sequence distances" table in the same program.
[0389] Unless otherwise stated, sequence identity/similarity values
provided herein refer to the value obtained using GAP Version 10
(GCG, Accelrys, San Diego, Calif.) using the following parameters:
% identity and % similarity for a nucleotide sequence using a gap
creation penalty weight of 50 and a gap length extension penalty
weight of 3, and the nwsgapdna.cmp scoring matrix; % identity and %
similarity for an amino acid sequence using a GAP creation penalty
weight of 8 and a gap length extension penalty of 2, and the
BLOSUM62 scoring matrix (Henikoff and Henikoff, (1992) Proc. Natl.
Acad. Sci. USA 89:10915). GAP uses the algorithm of Needleman and
Wunsch, (1970) J Mol Biol 48:443-53, to find an alignment of two
complete sequences that maximizes the number of matches and
minimizes the number of gaps. GAP considers all possible alignments
and gap positions and creates the alignment with the largest number
of matched bases and the fewest gaps, using a gap creation penalty
and a gap extension penalty in units of matched bases.
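For orientation, a global alignment in the style of Needleman and Wunsch can be sketched as follows. This is a simplified illustration and not the GAP program itself. It assumes a single per-position gap penalty instead of separate gap creation and gap extension weights, and a plain match/mismatch score instead of the nwsgapdna.cmp or BLOSUM62 matrices. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align two complete sequences.

    Returns (score, aligned_a, aligned_b), with "-" marking gap positions.
    """
    rows, cols = len(a) + 1, len(b) + 1

    def sub(i: int, j: int) -> int:
        # Score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))
```

With the default scores assumed here, aligning "ACGT" with "AGT" places a single gap opposite the C, giving three matched positions and one gap.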
[0390] "BLAST" is a searching algorithm provided by the National
Center for Biotechnology Information (NCBI) used to find regions of
similarity between biological sequences. The program compares
nucleotide or protein sequences to sequence databases and
calculates the statistical significance of matches to identify
sequences having sufficient similarity to a query sequence such
that the similarity would not be predicted to have occurred
randomly. BLAST reports the identified sequences and their local
alignment to the query sequence.
[0391] It is well understood by one skilled in the art that many
levels of sequence identity are useful in identifying polypeptides
from other species or modified naturally or synthetically wherein
such polypeptides have the same or similar function or activity.
Useful examples of percent identities include, but are not limited
to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95%, or any
integer percentage from 50% to 100%. Indeed, any integer amino acid
identity from 50% to 100% may be useful in describing the present
disclosure, such as 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%,
60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%,
73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%,
86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or
99%.
[0392] "Gene" includes a nucleic acid fragment that expresses a
functional molecule such as, but not limited to, a specific
protein, including regulatory sequences preceding (5' non-coding
sequences) and following (3' non-coding sequences) the coding
sequence. "Native gene" refers to a gene as found in nature with
its own regulatory sequences.
[0393] A "mutated gene" is a gene that has been altered through
human intervention. Such a "mutated gene" has a sequence that
differs from the sequence of the corresponding non-mutated gene by
at least one nucleotide addition, deletion, or substitution. In
certain embodiments of the disclosure, the mutated gene comprises
an alteration that results from a guide polynucleotide/Cas
endonuclease system as disclosed herein. A mutated plant is a plant
comprising a mutated gene.
[0394] As used herein, a "targeted mutation" is a mutation in a
native gene that was made by altering a target sequence within the
native gene using a method involving a double-strand-break-inducing
agent that is capable of inducing a double-strand break in the DNA
of the target sequence as disclosed herein or known in the art.
[0395] In one embodiment, the targeted mutation is the result of
guide RNA/Cas endonuclease-induced gene editing as described herein.
The guide RNA/Cas endonuclease-induced targeted mutation can occur
in a nucleotide sequence that is located within or outside a
genomic target site that is recognized and cleaved by a Cas
endonuclease.
[0396] The term "genome" as it applies to plant cells encompasses
not only chromosomal DNA found within the nucleus, but also organelle
DNA found within subcellular components (e.g., mitochondria, or
plastid) of the cell.
[0397] A "codon-modified gene" or "codon-preferred gene" or
"codon-optimized gene" is a gene having its frequency of codon
usage designed to mimic the frequency of preferred codon usage of
the host cell.
[0398] An "allele" is one of several alternative forms of a gene
occupying a given locus on a chromosome. When all the alleles
present at a given locus on a chromosome are the same, that plant
is homozygous at that locus. If the alleles present at a given
locus on a chromosome differ, that plant is heterozygous at that
locus.
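The homozygous/heterozygous distinction above reduces to asking whether all alleles observed at a locus are the same. A minimal Python sketch follows, with a hypothetical function name and allele labels chosen only for illustration.

```python
def zygosity(alleles_at_locus):
    """Classify a locus from the alleles present at it.

    `alleles_at_locus` is any non-empty sequence of allele labels, one per
    chromosome copy carrying the locus.
    """
    alleles = list(alleles_at_locus)
    if not alleles:
        raise ValueError("at least one allele is required")
    # All alleles the same -> homozygous; otherwise heterozygous.
    return "homozygous" if len(set(alleles)) == 1 else "heterozygous"


print(zygosity(["A1", "A1"]))  # prints homozygous
print(zygosity(["A1", "a2"]))  # prints heterozygous
```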
[0399] "Coding sequence" refers to a polynucleotide sequence which
codes for a specific amino acid sequence. "Regulatory sequences"
refer to nucleotide sequences located upstream (5' non-coding
sequences), within, or downstream (3' non-coding sequences) of a
coding sequence, and which influence the transcription, RNA
processing or stability, or translation of the associated coding
sequence. Regulatory sequences may include, but are not limited to:
promoters, translation leader sequences, 5' untranslated sequences,
3' untranslated sequences, introns, polyadenylation target
sequences, RNA processing sites, effector binding sites, and
stem-loop structures.
[0400] "A plant-optimized nucleotide sequence" is a nucleotide
sequence that has been optimized for increased expression in
plants, particularly for increased expression in plants or in one
or more plants of interest. For example, a plant-optimized
nucleotide sequence can be synthesized by modifying a nucleotide
sequence encoding a protein such as, for example, a
double-strand-break-inducing agent (e.g., an endonuclease) as
disclosed herein, using one or more plant-preferred codons for
improved expression. See, for example, Campbell and Gowri (1990)
Plant Physiol. 92:1-11 for a discussion of host-preferred codon
usage.
[0401] Methods are available in the art for synthesizing
plant-preferred genes. See, for example, U.S. Pat. Nos. 5,380,831,
and 5,436,391, and Murray et al. (1989) Nucleic Acids Res.
17:477-498, herein incorporated by reference. Additional sequence
modifications are known to enhance gene expression in a plant host.
These include, for example, elimination of: one or more sequences
encoding spurious polyadenylation signals, one or more exon-intron
splice site signals, one or more transposon-like repeats, and other
such well-characterized sequences that may be deleterious to gene
expression. The G-C content of the sequence may be adjusted to
levels average for a given plant host, as calculated by reference
to known genes expressed in the host plant cell. When possible, the
sequence is modified to avoid one or more predicted hairpin
secondary mRNA structures. Thus, "a plant-optimized nucleotide
sequence" of the present disclosure comprises one or more of such
sequence modifications.
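Two of the modifications described above, substitution of preferred codons and adjustment of G-C content, can be illustrated with a short Python sketch. The sketch is an assumption-laden illustration, not a method of the disclosure: the `PREFERRED` table contains placeholder codons chosen only for the example and does not represent measured codon usage of any plant, and all function names are hypothetical. The standard genetic code table is built in the conventional TCAG order.

```python
from itertools import product

# Standard genetic code in TCAG order (DNA alphabet; "*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# HYPOTHETICAL preferred codons: placeholders for illustration, not plant data.
PREFERRED = {"L": "CTG", "A": "GCC"}


def split_codons(cds: str):
    """Split a coding sequence into codons; length must be a multiple of three."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds: str) -> str:
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def recode(cds: str, preferred=PREFERRED) -> str:
    """Swap each codon for the preferred codon of its amino acid, if one is given;
    codons for amino acids absent from `preferred` are left unchanged."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in split_codons(cds))


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


original = "ATGTTAGCTTAA"
optimized = recode(original)
# The encoded protein is unchanged; only codon choice and G-C content differ.
assert translate(original) == translate(optimized)
print(optimized, gc_content(original), gc_content(optimized))
```

In this toy case the recoded sequence encodes the same amino acids while its G-C fraction rises from 0.25 to 0.5. A practical design would also screen for the other features mentioned above, such as spurious polyadenylation signals and predicted hairpins.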
[0402] "Promoter" refers to a DNA sequence capable of controlling
the expression of a coding sequence or functional RNA. The promoter
sequence consists of proximal and more distal upstream elements,
the latter elements often referred to as enhancers. An "enhancer"
is a DNA sequence that can stimulate promoter activity, and may be
an innate element of the promoter or a heterologous element
inserted to enhance the level or tissue-specificity of a promoter.
Promoters may be derived in their entirety from a native gene, or
be composed of different elements derived from different promoters
found in nature, and/or comprise synthetic DNA segments. It is
understood by those skilled in the art that different promoters may
direct the expression of a gene in different tissues or cell types,
or at different stages of development, or in response to different
environmental conditions. It is further recognized that since in
most cases the exact boundaries of regulatory sequences have not
been completely defined, DNA fragments of some variation may have
identical promoter activity. Promoters that cause a gene to be
expressed in most cell types at most times are commonly referred to
as "constitutive promoters".
[0403] It has been shown that certain promoters are able to direct
RNA synthesis at a higher rate than others. These are called
"strong promoters". Certain other promoters have been shown to
direct RNA synthesis at higher levels only in particular types of
cells or tissues and are often referred to as "tissue specific
promoters", or "tissue-preferred promoters" if the promoters direct
RNA synthesis preferably in certain tissues but also in other
tissues at reduced levels. Since patterns of expression of a
chimeric gene (or genes) introduced into a plant are controlled
using promoters, there is an ongoing interest in the isolation of
novel promoters which are capable of controlling the expression of
a chimeric gene (or genes) at certain levels in specific tissue
types or at specific plant developmental stages.
[0404] Some embodiments of the disclosure relate to newly
discovered U6 RNA polymerase III promoters, GM-U6-13.1 (SEQ ID NO:
120) as described in Example 12 and GM-U6-9.1 (SEQ ID NO: 295)
described in Example 19.
[0405] Non-limiting examples of methods and compositions relating
to the soybean promoters described herein are as follows:
[0406] A1. A recombinant DNA construct comprising a nucleotide
sequence comprising any of the sequences set forth in SEQ ID NO:120
or SEQ ID NO:295, or a functional fragment thereof, operably linked
to at least one heterologous sequence, wherein said nucleotide
sequence is a promoter.
[0407] A2. The recombinant DNA construct of embodiment A1, wherein
the nucleotide sequence has at least 95% identity, based on the
Clustal V method of alignment with pairwise alignment default
parameters (KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS
SAVED=4), when compared to the sequence set forth in SEQ ID NO:120
or SEQ ID NO: 295.
[0408] A3. A vector comprising the recombinant DNA construct of
embodiment A1.
[0409] A4. A cell comprising the recombinant DNA construct of
embodiment A1.
[0410] A5. The cell of embodiment A4, wherein the cell is a plant
cell.
[0411] A6. A transgenic plant having stably incorporated into its
genome the recombinant DNA construct of embodiment A1.
[0412] A7. The transgenic plant of embodiment A6, wherein said
plant is a dicot plant.
[0413] A8. The transgenic plant of embodiment A7 wherein the plant
is soybean.
[0414] A9. A transgenic seed produced by the transgenic plant of
embodiment A7, wherein the transgenic seed comprises the
recombinant DNA construct.
[0415] A10. The recombinant DNA construct of embodiment A1 wherein
the at least one heterologous sequence codes for a gene selected
from the group consisting of: a reporter gene, a selection marker,
a disease resistance conferring gene, a herbicide resistance
conferring gene, an insect resistance conferring gene; a gene
involved in carbohydrate metabolism, a gene involved in fatty acid
metabolism, a gene involved in amino acid metabolism, a gene
involved in plant development, a gene involved in plant growth
regulation, a gene involved in yield improvement, a gene involved
in drought resistance, a gene involved in cold resistance, a gene
involved in heat resistance and a gene involved in salt resistance
in plants.
[0416] A11. The recombinant DNA construct of embodiment A1, wherein
the at least one heterologous sequence encodes a protein selected
from the group consisting of: a reporter protein, a selection
marker, a protein conferring disease resistance, protein conferring
herbicide resistance, protein conferring insect resistance; protein
involved in carbohydrate metabolism, protein involved in fatty acid
metabolism, protein involved in amino acid metabolism, protein
involved in plant development, protein involved in plant growth
regulation, protein involved in yield improvement, protein involved
in drought resistance, protein involved in cold resistance, protein
involved in heat resistance and protein involved in salt resistance
in plants.
[0417] A12. A method of expressing a coding sequence or a
functional RNA in a plant comprising:
[0418] a) introducing the recombinant DNA construct of embodiment
A1 into the plant, wherein the at least one heterologous sequence
comprises a coding sequence or encodes a functional RNA;
[0419] b) growing the plant of step a); and
[0420] c) selecting a plant displaying expression of the coding
sequence or the functional RNA of the recombinant DNA
construct.
[0421] A13. A method of transgenically altering a marketable plant
trait, comprising:
[0422] a) introducing a recombinant DNA construct of embodiment A1
into the plant;
[0423] b) growing a fertile, mature plant resulting from step a);
and
[0424] c) selecting a plant expressing the at least one
heterologous sequence in at least one plant tissue based on the
altered marketable trait.
[0425] A14. The method of embodiment A13 wherein the marketable
trait is selected from the group consisting of: disease resistance,
herbicide resistance, insect resistance, carbohydrate metabolism,
fatty acid metabolism, amino acid metabolism, plant development,
plant growth regulation, yield improvement, drought resistance,
cold resistance, heat resistance, and salt resistance.
[0426] A15. A method for altering expression of at least one
heterologous sequence in a plant comprising:
[0427] (a) transforming a plant cell with the recombinant DNA
construct of embodiment A1;
[0428] (b) growing fertile mature plants from the transformed plant
cell of step (a); and
[0429] (c) selecting plants containing the transformed plant cell
wherein the expression of the heterologous sequence is increased or
decreased.
[0430] A16. The method of embodiment A15 wherein the plant is a
soybean plant.
[0431] A17. A plant stably transformed with a recombinant DNA
construct comprising a soybean promoter and a heterologous nucleic
acid fragment operably linked to said promoter, wherein said
promoter is capable of controlling expression of said
heterologous nucleic acid fragment in a plant cell, and further
wherein said promoter comprises any of the sequences set forth in
SEQ ID NO: 120 or SEQ ID NO:295.
[0432] New promoters of various types useful in plant cells are
constantly being discovered; numerous examples may be found in the
compilation by Okamuro and Goldberg, (1989) In The Biochemistry of
Plants, Vol. 15, Stumpf and Conn, eds (New York, N.Y.: Academic
Press), pp. 1-82.
[0433] "Translation leader sequence" refers to a polynucleotide
sequence located between the promoter sequence of a gene and the
coding sequence. The translation leader sequence is present in the
mRNA upstream of the translation start sequence. The translation
leader sequence may affect processing of the primary transcript to
mRNA, mRNA stability or translation efficiency. Examples of
translation leader sequences have been described (e.g., Turner and
Foster, (1995) Mol Biotechnol 3:225-236).
[0434] "3' non-coding sequences", "transcription terminator" or
"termination sequences" refer to DNA sequences located downstream
of a coding sequence and include polyadenylation recognition
sequences and other sequences encoding regulatory signals capable
of affecting mRNA processing or gene expression. The
polyadenylation signal is usually characterized by affecting the
addition of polyadenylic acid tracts to the 3' end of the mRNA
precursor. The use of different 3' non-coding sequences is
exemplified by Ingelbrecht et al., (1989) Plant Cell 1:671-680.
[0435] "RNA transcript" refers to the product resulting from RNA
polymerase-catalyzed transcription of a DNA sequence. When the RNA
transcript is a perfect complementary copy of the DNA sequence, it
is referred to as the primary transcript or pre-mRNA. An RNA
transcript is referred to as the mature RNA or mRNA when it is an
RNA sequence derived from post-transcriptional processing of the
primary transcript pre-mRNA. "Messenger RNA" or "mRNA" refers to
the RNA that is without introns and that can be translated into
protein by the cell. "cDNA" refers to a DNA that is complementary
to, and synthesized from, an mRNA template using the enzyme reverse
transcriptase. The cDNA can be single-stranded or converted into
double-stranded form using the Klenow fragment of DNA polymerase I.
"Sense" RNA refers to RNA transcript that includes the mRNA and can
be translated into protein within a cell or in vitro. "Antisense
RNA" refers to an RNA transcript that is complementary to all or
part of a target primary transcript or mRNA, and that blocks the
expression of a target gene (see, e.g., U.S. Pat. No. 5,107,065).
The complementarity of an antisense RNA may be with any part of the
specific gene transcript, i.e., at the 5' non-coding sequence, 3'
non-coding sequence, introns, or the coding sequence. "Functional
RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may
not be translated but yet has an effect on cellular processes. The
terms "complement" and "reverse complement" are used
interchangeably herein with respect to mRNA transcripts, and are
meant to define the antisense RNA of the message.
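As a simple illustration of the complementarity described above, the reverse complement of an RNA sequence can be computed as follows. This Python sketch assumes standard Watson-Crick pairing (A with U, G with C) and sequences written 5' to 3'. The names `RNA_COMPLEMENT` and `reverse_complement_rna` are hypothetical.

```python
# Watson-Crick pairing for RNA: A pairs with U, G pairs with C.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(rna: str) -> str:
    """Return the reverse complement of an RNA sequence, written 5' to 3'.

    Each base is replaced by its pairing partner and the result is reversed,
    so the output is the strand that would base-pair with the input.
    """
    return rna.upper().translate(RNA_COMPLEMENT)[::-1]


message = "AUGGCC"
antisense = reverse_complement_rna(message)
print(antisense)  # prints GGCCAU
```

Applying the function twice returns the original sequence, which is a convenient sanity check on any such helper.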
[0436] The term "operably linked" refers to the association of
nucleic acid sequences on a single nucleic acid fragment so that
the function of one is regulated by the other. For example, a
promoter is operably linked with a coding sequence when it is
capable of regulating the expression of that coding sequence (i.e.,
the coding sequence is under the transcriptional control of the
promoter). Coding sequences can be operably linked to regulatory
sequences in a sense or antisense orientation. In another example,
the complementary RNA regions can be operably linked, either
directly or indirectly, 5' to the target mRNA, or 3' to the target
mRNA, or within the target mRNA, or a first complementary region is
5' and its complement is 3' to the target mRNA.
[0437] Standard recombinant DNA and molecular cloning techniques
used herein are well known in the art and are described more fully
in Sambrook et al., Molecular Cloning: A Laboratory Manual; Cold
Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989).
Transformation methods are well known to those skilled in the art
and are described infra.
[0438] "PCR" or "polymerase chain reaction" is a technique for the
synthesis of specific DNA segments and consists of a series of
repetitive denaturation, annealing, and extension cycles.
Typically, a double-stranded DNA is heat denatured, and two primers
complementary to the 3' boundaries of the target segment are
annealed to the DNA at low temperature, and then extended at an
intermediate temperature. One set of these three consecutive steps
is referred to as a "cycle".
[0439] The term "recombinant" refers to an artificial combination
of two otherwise separated segments of sequence, e.g., by chemical
synthesis, or manipulation of isolated segments of nucleic acids by
genetic engineering techniques.
[0440] The terms "plasmid", "vector" and "cassette" refer to an
extrachromosomal element often carrying genes that are not part of
the central metabolism of the cell, and usually in the form of
double-stranded DNA. Such elements may be autonomously replicating
sequences, genome integrating sequences, phage, or nucleotide
sequences, in linear or circular form, of a single- or
double-stranded DNA or RNA, derived from any source, in which a
number of nucleotide sequences have been joined or recombined into
a unique construction which is capable of introducing a
polynucleotide of interest into a cell. "Transformation cassette"
refers to a specific vector containing a gene and having elements
in addition to the gene that facilitate transformation of a
particular host cell. "Expression cassette" refers to a specific
vector containing a gene and having elements in addition to the
gene that allow for expression of that gene in a host.
[0441] The terms "recombinant DNA molecule", "recombinant
construct", "expression construct", "construct", and
"recombinant DNA construct" are used interchangeably herein. A
recombinant construct comprises an artificial combination of
nucleic acid fragments, e.g., regulatory and coding sequences that
are not all found together in nature. For example, a construct may
comprise regulatory sequences and coding sequences that are derived
from different sources, or regulatory sequences and coding
sequences derived from the same source, but arranged in a manner
different than that found in nature. Such a construct may be used
by itself or may be used in conjunction with a vector. If a vector
is used, then the choice of vector is dependent upon the method
that will be used to transform host cells as is well known to those
skilled in the art. For example, a plasmid vector can be used. The
skilled artisan is well aware of the genetic elements that must be
present on the vector in order to successfully transform, select
and propagate host cells. The skilled artisan will also recognize
that different independent transformation events may result in
different levels and patterns of expression (Jones et al., (1985)
EMBO J. 4:2411-2418; De Almeida et al., (1989) Mol Gen Genetics
218:78-86), and thus that multiple events are typically screened in
order to obtain lines displaying the desired expression level and
pattern. Such screening may be accomplished by standard molecular
biological, biochemical, and other assays including Southern
analysis of DNA, Northern analysis of mRNA expression, PCR, real
time quantitative PCR (qPCR), reverse transcription PCR (RT-PCR),
immunoblotting analysis of protein expression, enzyme or activity
assays, and/or phenotypic analysis.
[0442] The term "expression", as used herein, refers to the
production of a functional end-product (e.g., an mRNA, guide RNA,
or a protein) in either precursor or mature form.
[0443] The term "introduced" means providing a nucleic acid (e.g.,
expression construct) or protein into a cell. Introduced includes
reference to the incorporation of a nucleic acid into a eukaryotic
or prokaryotic cell where the nucleic acid may be incorporated into
the genome of the cell, and includes reference to the transient
provision of a nucleic acid or protein to the cell. Introduced
includes reference to stable or transient transformation methods,
as well as sexually crossing. Thus, "introduced" in the context of
inserting a nucleic acid fragment (e.g., a recombinant DNA
construct/expression construct) into a cell, means "transfection"
or "transformation" or "transduction" and includes reference to the
incorporation of a nucleic acid fragment into a eukaryotic or
prokaryotic cell where the nucleic acid fragment may be
incorporated into the genome of the cell (e.g., chromosome,
plasmid, plastid, or mitochondrial DNA), converted into an
autonomous replicon, or transiently expressed (e.g., transfected
mRNA).
[0444] "Mature" protein refers to a post-translationally processed
polypeptide (i.e., one from which any pre- or propeptides present
in the primary translation product have been removed). "Precursor"
protein refers to the primary product of translation of mRNA (i.e.,
with pre- and propeptides still present). Pre- and propeptides may
be but are not limited to intracellular localization signals.
[0445] "Stable transformation" refers to the transfer of a nucleic
acid fragment into a genome of a host organism, including both
nuclear and organellar genomes, resulting in genetically stable
inheritance. In contrast, "transient transformation" refers to the
transfer of a nucleic acid fragment into the nucleus, or other
DNA-containing organelle, of a host organism resulting in gene
expression without integration or stable inheritance. Host
organisms containing the transformed nucleic acid fragments are
referred to as "transgenic" organisms.
[0446] The commercial development of genetically improved germplasm
has also advanced to the stage of introducing multiple traits into
crop plants, often referred to as a gene stacking approach. In this
approach, multiple genes conferring different characteristics of
interest can be introduced into a plant. Gene stacking can be
accomplished by many means including but not limited to
co-transformation, retransformation, and crossing lines with
different genes of interest.
[0447] The term "plant" refers to whole plants, plant organs, plant
tissues, seeds, plant cells, and progeny of the same. Plant
cells include, without limitation, cells from seeds, suspension
cultures, embryos, meristematic regions, callus tissue, leaves,
roots, shoots, gametophytes, sporophytes, pollen and microspores.
Plant parts include differentiated and undifferentiated tissues
including, but not limited to roots, stems, shoots, leaves,
pollens, seeds, tumor tissue and various forms of cells and culture
(e.g., single cells, protoplasts, embryos, and callus tissue). The
plant tissue may be in plant or in a plant organ, tissue or cell
culture. The term "plant organ" refers to plant tissue or a group
of tissues that constitute a morphologically and functionally
distinct part of a plant. The term "genome" refers to the entire
complement of genetic material (genes and non-coding sequences)
that is present in each cell of an organism, or virus or organelle;
and/or a complete set of chromosomes inherited as a (haploid) unit
from one parent. "Progeny" comprises any subsequent generation of a
plant.
[0448] A transgenic plant includes, for example, a plant which
comprises within its genome a heterologous polynucleotide
introduced by a transformation step. The heterologous
polynucleotide can be stably integrated within the genome such that
the polynucleotide is passed on to successive generations. The
heterologous polynucleotide may be integrated into the genome alone
or as part of a recombinant DNA construct. A transgenic plant can
also comprise more than one heterologous polynucleotide within its
genome. Each heterologous polynucleotide may confer a different
trait to the transgenic plant. A heterologous polynucleotide can
include a sequence that originates from a foreign species, or, if
from the same species, can be substantially modified from its
native form. Transgenic can include any cell, cell line, callus,
tissue, plant part or plant, the genotype of which has been altered
by the presence of heterologous nucleic acid including those
transgenics initially so altered as well as those created by sexual
crosses or asexual propagation from the initial transgenic. The
alterations of the genome (chromosomal or extra-chromosomal) by
conventional plant breeding methods, by the genome editing
procedure described herein that does not result in an insertion of
a foreign polynucleotide, or by naturally occurring events such as
random cross-fertilization, non-recombinant viral infection,
non-recombinant bacterial transformation, non-recombinant
transposition, or spontaneous mutation are not intended to be
regarded as transgenic.
[0449] In certain embodiments of the disclosure, a fertile plant is
a plant that produces viable male and female gametes and is
self-fertile. Such a self-fertile plant can produce a progeny plant
without the contribution from any other plant of a gamete and the
genetic material contained therein. Other embodiments of the
disclosure can involve the use of a plant that is not self-fertile
because the plant does not produce male gametes, or female gametes,
or both, that are viable or otherwise capable of fertilization. As
used herein, a "male sterile plant" is a plant that does not
produce male gametes that are viable or otherwise capable of
fertilization. As used herein, a "female sterile plant" is a plant
that does not produce female gametes that are viable or otherwise
capable of fertilization. It is recognized that male-sterile and
female-sterile plants can be female-fertile and male-fertile,
respectively. It is further recognized that a male fertile (but
female sterile) plant can produce viable progeny when crossed with
a female fertile plant and that a female fertile (but male sterile)
plant can produce viable progeny when crossed with a male fertile
plant.
[0450] A "centimorgan" (cM) or "map unit" is the distance between
two linked genes, markers, target sites, loci, or any pair thereof,
wherein 1% of the products of meiosis are recombinant. Thus, a
centimorgan is equivalent to a distance equal to a 1% average
recombination frequency between the two linked genes, markers,
target sites, loci, or any pair thereof.
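As an illustration of the definition in [0450], the following minimal Python sketch converts a count of recombinant meiotic products into a map distance. The function name and the example counts are hypothetical, and the direct proportionality is assumed to hold only for closely linked loci, where multiple crossovers between the loci are rare.

```python
def map_distance_cm(recombinant_products: int, total_products: int) -> float:
    """Return a map distance in centimorgans (cM).

    Per the definition above, 1 cM corresponds to 1% recombinant
    products of meiosis, so the distance is the recombination
    frequency expressed as a percentage.
    Assumption: the loci are close enough that double crossovers
    can be neglected.
    """
    if total_products <= 0:
        raise ValueError("total_products must be positive")
    if not 0 <= recombinant_products <= total_products:
        raise ValueError("recombinant_products must lie in [0, total_products]")
    return 100.0 * recombinant_products / total_products


# Hypothetical example: 5 recombinants among 1000 scored products.
distance = map_distance_cm(5, 1000)
```

For loci that are farther apart, a mapping function would normally be applied instead of this direct conversion.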
Breeding Methods and Methods for Selecting Plants Utilizing a Two
Component RNA Guide and Cas Endonuclease System
[0451] The present disclosure finds use in the breeding of plants
comprising one or more transgenic traits. Most commonly, transgenic
traits are randomly inserted throughout the plant genome as a
consequence of transformation systems based on Agrobacterium,
biolistics, or other commonly used procedures. More recently, gene
targeting protocols have been developed that enable directed
transgene insertion. One important technology, site-specific
integration (SSI) enables the targeting of a transgene to the same
chromosomal location as a previously inserted transgene.
Custom-designed meganucleases and custom-designed zinc finger
nucleases allow researchers to design nucleases to target
specific chromosomal locations, and these reagents allow the
targeting of transgenes at the chromosomal site cleaved by these
nucleases.
[0452] The currently used systems for precision genetic engineering
of eukaryotic genomes, e.g. plant genomes, rely upon homing
endonucleases, meganucleases, zinc finger nucleases, and
transcription activator-like effector nucleases (TALENs), which
require de novo protein engineering for every new target locus. The
highly specific, RNA-directed DNA nuclease, guide RNA/Cas9
endonuclease system described herein, is more easily customizable
and therefore more useful when modification of many different
target sequences is the goal. This disclosure takes further
advantage of the two component nature of the guide RNA/Cas system,
with its constant protein component, the Cas endonuclease, and its
variable and easily reprogrammable targeting component, the guide
RNA or the crRNA.
[0453] The guide RNA/Cas system described herein is especially
useful for genome engineering, especially plant genome engineering,
in circumstances where nuclease off-target cutting can be toxic to
the targeted cells. In one embodiment of the guide RNA/Cas system
described herein, the constant component, in the form of an
expression-optimized Cas9 gene, is stably integrated into the
target genome, e.g. plant genome. Expression of the Cas9 gene is
under control of a promoter, e.g. plant promoter, which can be a
constitutive promoter, tissue-specific promoter or inducible
promoter, e.g. temperature-inducible, stress-inducible,
developmental stage inducible, or chemically inducible promoter. In
the absence of the variable component, i.e. the guide RNA or crRNA,
the Cas9 protein is not able to cut DNA and therefore its presence
in the plant cell should have little or no consequence. Hence a key
advantage of the guide RNA/Cas system described herein is the
ability to create and maintain a cell line or transgenic organism
capable of efficient expression of the Cas9 protein with little or
no consequence to cell viability. In order to induce cutting at
desired genomic sites to achieve targeted genetic modifications,
guide RNAs or crRNAs can be introduced by a variety of methods into
cells containing the stably-integrated and expressed cas9 gene. For
example, guide RNAs or crRNAs can be chemically or enzymatically
synthesized, and introduced into the Cas9 expressing cells via
direct delivery methods such as particle bombardment or
electroporation.
[0454] Alternatively, genes capable of efficiently expressing guide
RNAs or crRNAs in the target cells can be synthesized chemically,
enzymatically or in a biological system, and these genes can be
introduced into the Cas9 expressing cells via direct delivery
methods such as particle bombardment or electroporation, or via
biological delivery methods such as Agrobacterium-mediated DNA delivery.
[0455] One embodiment of the disclosure is a method for selecting a
plant comprising an altered target site in its plant genome, the
method comprising: a) obtaining a first plant comprising at least
one Cas endonuclease capable of introducing a double strand break
at a target site in the plant genome; b) obtaining a second plant
comprising a guide RNA that is capable of forming a complex with
the Cas endonuclease of (a); c) crossing the first plant of (a)
with the second plant of (b); d) evaluating the progeny of (c) for
an alteration in the target site; and e) selecting a progeny plant
that possesses the desired alteration of said target site.
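The crossing step of the method above can be illustrated with a short Python sketch that estimates what fraction of the progeny would inherit both the Cas endonuclease and the guide RNA. This sketch is not part of the disclosed method. It assumes a diploid plant, a single-locus insertion of each component, ordinary Mendelian transmission, and that the first plant carries only the Cas endonuclease and the second plant only the guide RNA; the function names are hypothetical.

```python
from fractions import Fraction


def transmission_probability(copies: int) -> Fraction:
    """Probability that a gamete carries a single-locus transgene.

    copies is the number of transgene copies in the diploid parent:
    0 (absent), 1 (hemizygous) or 2 (homozygous).
    """
    if copies not in (0, 1, 2):
        raise ValueError("copies must be 0, 1 or 2")
    return Fraction(copies, 2)


def fraction_with_both(cas_copies: int, guide_copies: int) -> Fraction:
    """Expected fraction of progeny inheriting both components.

    Assumption: the Cas endonuclease comes only from the first parent
    and the guide RNA only from the second, so the two transmission
    events are independent.
    """
    return transmission_probability(cas_copies) * transmission_probability(guide_copies)


# Hypothetical cross: both parents hemizygous for their component.
both_hemizygous = fraction_with_both(1, 1)
# Hypothetical cross: both parents homozygous for their component.
both_homozygous = fraction_with_both(2, 2)
```

Under these assumptions, only the progeny carrying both components would be candidates for evaluation in step (d).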
[0456] Another embodiment of the disclosure is a method for
selecting a plant comprising an altered target site in its plant
genome, the method comprising: a) obtaining a first plant
comprising at least one Cas endonuclease capable of introducing a
double strand break at a target site in the plant genome; b)
obtaining a second plant comprising a guide RNA and a donor DNA,
wherein said guide RNA is capable of forming a complex with the Cas
endonuclease of (a), wherein said donor DNA comprises a
polynucleotide of interest; c) crossing the first plant of (a) with
the second plant of (b); d) evaluating the progeny of (c) for an
alteration in the target site and e) selecting a progeny plant that
comprises the polynucleotide of interest inserted at said target
site.
[0457] Another embodiment of the disclosure is a method for
selecting a plant comprising an altered target site in its plant
genome, the method comprising selecting at least one progeny plant
that comprises an alteration at a target site in its plant genome,
wherein said progeny plant was obtained by crossing a first plant
expressing at least one Cas endonuclease to a second plant
comprising a guide RNA and a donor DNA, wherein said Cas
endonuclease is capable of introducing a double strand break at
said target site, wherein said donor DNA comprises a polynucleotide
of interest.
[0458] As disclosed herein, a guide RNA/Cas system mediating gene
targeting can be used in methods for directing transgene insertion
and/or for producing complex transgenic trait loci comprising
multiple transgenes in a fashion similar as disclosed in
WO2013/0198888 (published Aug. 1, 2013) where instead of using a
double strand break inducing agent to introduce a gene of interest,
a guide RNA/Cas system or a guide polynucleotide/Cas system as
disclosed herein is used. In one embodiment, a complex transgenic
trait locus is a genomic locus that has multiple transgenes
genetically linked to each other. By inserting independent
transgenes within 0.1, 0.2, 0.3, 0.4, 0.5, 1, 2, or even 5
centimorgans (cM) from each other, the transgenes can be bred as a
single genetic locus (see, for example, U.S. patent application
Ser. No. 13/427,138 or PCT application PCT/US2012/030061). After
selecting a plant comprising a transgene, plants each containing (at
least) one transgene can be crossed to form an F1 that contains
both transgenes. In progeny from these F1 (F2 or BC1), 1/500 progeny
would have the two different transgenes recombined onto the same
chromosome. The complex locus can then be bred as single genetic
locus with both transgene traits. This process can be repeated to
stack as many traits as desired.
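To give a sense of the screening effort implied by a recombinant frequency such as the 1/500 mentioned in [0458], the following minimal Python sketch estimates how many progeny would need to be screened to recover at least one recombinant with a chosen probability. The function name and the 95% confidence level are hypothetical choices, and the calculation assumes that each progeny plant is an independent trial with the same recombinant frequency.

```python
import math


def progeny_to_screen(recombinant_freq: float, confidence: float = 0.95) -> int:
    """Smallest number of progeny giving at least one recombinant
    with probability >= confidence.

    Assumption: progeny are independent, each recombinant with
    probability recombinant_freq.
    """
    if not 0.0 < recombinant_freq < 1.0:
        raise ValueError("recombinant_freq must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # P(no recombinant among n progeny) = (1 - f) ** n.
    # Require (1 - f) ** n <= 1 - confidence and solve for n.
    n = math.log(1.0 - confidence) / math.log(1.0 - recombinant_freq)
    return math.ceil(n)


# Hypothetical example using the 1/500 frequency from the text.
n_needed = progeny_to_screen(1 / 500, confidence=0.95)
```

Because the expected frequency is low, the required population is considerably larger than 500 plants when a high probability of success is desired.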
[0459] Chromosomal intervals that correlate with a phenotype or
trait of interest can be identified. A variety of methods well
known in the art are available for identifying chromosomal
intervals. The boundaries of such chromosomal intervals are drawn
to encompass markers that will be linked to the gene controlling
the trait of interest. In other words, the chromosomal interval is
drawn such that any marker that lies within that interval
(including the terminal markers that define the boundaries of the
interval) can be used as a marker for the trait of interest. In one
embodiment, the chromosomal interval comprises
at least one QTL, and furthermore, may indeed comprise more than
one QTL. Close proximity of multiple QTLs in the same interval may
obfuscate the correlation of a particular marker with a particular
QTL, as one marker may demonstrate linkage to more than one QTL.
Conversely, e.g., if two markers in close proximity show
co-segregation with the desired phenotypic trait, it is sometimes
unclear if each of those markers identifies the same QTL or two
different QTL. The term "quantitative trait locus" or "QTL" refers
to a region of DNA that is associated with the differential
expression of a quantitative phenotypic trait in at least one
genetic background, e.g., in at least one breeding population. The
region of the QTL encompasses or is closely linked to the gene or
genes that affect the trait in question. An "allele of a QTL" can
comprise multiple genes or other genetic factors within a
contiguous genomic region or linkage group, such as a haplotype. An
allele of a QTL can denote a haplotype within a specified window
wherein said window is a contiguous genomic region that can be
defined, and tracked, with a set of one or more polymorphic
markers. A haplotype can be defined by the unique fingerprint of
alleles at each marker within the specified window.
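The notion of a haplotype as a fingerprint of alleles at each marker within a specified window can be sketched in Python as follows. The marker names, allele calls, and function name are hypothetical, and the sketch assumes phased allele calls for a single chromosome.

```python
from typing import Dict, Sequence, Tuple


def haplotype(alleles_by_marker: Dict[str, str],
              window: Sequence[str]) -> Tuple[str, ...]:
    """Return the ordered alleles at the markers that define the window.

    Markers outside the window are ignored, so two chromosomes share
    a haplotype when they agree at every marker inside the window.
    """
    return tuple(alleles_by_marker[marker] for marker in window)


# Hypothetical window of three polymorphic markers.
window = ("M1", "M2", "M3")

# Hypothetical phased allele calls; M9 lies outside the window.
plant_a = {"M1": "A", "M2": "G", "M3": "T", "M9": "C"}
plant_b = {"M1": "A", "M2": "G", "M3": "C", "M9": "C"}

hap_a = haplotype(plant_a, window)
hap_b = haplotype(plant_b, window)
same_haplotype = hap_a == hap_b
```

In this hypothetical example the two plants differ at marker M3 and therefore carry different haplotypes within the window.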
[0460] A variety of methods are available to identify those cells
having an altered genome at or near a target site without using a
screenable marker phenotype. Such methods can be viewed as directly
analyzing a target sequence to detect any change in the target
sequence, including but not limited to PCR methods, sequencing
methods, nuclease digestion, Southern blots, and any combination
thereof.
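As a simplified illustration of directly analyzing a target sequence, the following Python sketch compares a sequenced target-site amplicon against a reference sequence and reports whether a change is present. The sequences, cell names, and function name are hypothetical. The sketch assumes that every read covers exactly the same amplified region as the reference; actual sequencing data would normally be aligned before classification.

```python
def classify_target_site(reference: str, observed: str) -> str:
    """Classify an observed target-site sequence against the reference.

    Simplifying assumption: both sequences span the same amplified
    region, so a net length difference indicates an insertion or a
    deletion.
    """
    ref = reference.strip().upper()
    obs = observed.strip().upper()
    if obs == ref:
        return "unaltered"
    if len(obs) > len(ref):
        return "net insertion"
    if len(obs) < len(ref):
        return "net deletion"
    return "substitution"


# Hypothetical reference and sequenced amplicons from three cells.
REFERENCE = "ACGTTGCATGCCGGATCA"
reads = {
    "cell_1": "ACGTTGCATGCCGGATCA",   # identical to the reference
    "cell_2": "ACGTTGCATCCGGATCA",    # one base missing
    "cell_3": "ACGTTGCATAGCCGGATCA",  # one base added
}

calls = {name: classify_target_site(REFERENCE, seq) for name, seq in reads.items()}
altered = [name for name, call in calls.items() if call != "unaltered"]
```

Cells listed in `altered` would then be candidates for further characterization, without reliance on a screenable marker phenotype.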
[0461] Proteins may be altered in various ways including amino acid
substitutions, deletions, truncations, and insertions. Methods for
such manipulations are generally known. For example, amino acid
sequence variants of the protein(s) can be prepared by mutations in
the DNA. Methods for mutagenesis and nucleotide sequence
alterations include, for example, Kunkel, (1985) Proc. Natl. Acad.
Sci. USA 82:488-92; Kunkel et al., (1987) Meth Enzymol 154:367-82;
U.S. Pat. No. 4,873,192; Walker and Gaastra, eds. (1983) Techniques
in Molecular Biology (MacMillan Publishing Company, New York) and
the references cited therein. Guidance regarding amino acid
substitutions not likely to affect biological activity of the
protein is found, for example, in the model of Dayhoff et al.,
(1978) Atlas of Protein Sequence and Structure (Natl Biomed Res
Found, Washington, D.C.). Conservative substitutions, such as
exchanging one amino acid with another having similar properties,
may be preferable. Conservative deletions, insertions, and amino
acid substitutions are not expected to produce radical changes in
the characteristics of the protein, and the effect of any
substitution, deletion, insertion, or combination thereof can be
evaluated by routine screening assays. Assays for
double-strand-break-inducing activity are known and generally
measure the overall activity and specificity of the agent on DNA
substrates containing target sites.
[0462] A variety of methods are known for the introduction of
nucleotide sequences and polypeptides into an organism, including,
for example, transformation, sexual crossing, and the introduction
of the polypeptide, DNA, or mRNA into the cell.
[0463] Methods for contacting, providing, and/or introducing a
composition into various organisms are known and include but are
not limited to, stable transformation methods, transient
transformation methods, virus-mediated methods, and sexual
breeding. Stable transformation indicates that the introduced
polynucleotide integrates into the genome of the organism and is
capable of being inherited by progeny thereof. Transient
transformation indicates that the introduced composition is only
temporarily expressed or present in the organism.
[0464] Protocols for introducing polynucleotides and polypeptides
into plants may vary depending on the type of plant or plant cell
targeted for transformation, such as monocot or dicot. Suitable
methods of introducing polynucleotides and polypeptides into plant
cells and subsequent insertion into the plant genome include
microinjection (Crossway et al., (1986) Biotechniques 4:320-34 and
U.S. Pat. No. 6,300,543), meristem transformation (U.S. Pat. No.
5,736,369), electroporation (Riggs et al., (1986) Proc. Natl. Acad.
Sci. USA 83:5602-6), Agrobacterium-mediated transformation (U.S.
Pat. Nos. 5,563,055 and 5,981,840), direct gene transfer
(Paszkowski et al., (1984) EMBO J. 3:2717-22), and ballistic
particle acceleration (U.S. Pat. Nos. 4,945,050; 5,879,918;
5,886,244; 5,932,782; Tomes et al., (1995) "Direct DNA Transfer
into Intact Plant Cells via Microprojectile Bombardment" in Plant
Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg
& Phillips (Springer-Verlag, Berlin); McCabe et al., (1988)
Biotechnology 6:923-6; Weissinger et al., (1988) Ann Rev Genet.
22:421-77; Sanford et al., (1987) Particulate Science and
Technology 5:27-37 (onion); Christou et al., (1988) Plant Physiol
87:671-4 (soybean); Finer and McMullen, (1991) In Vitro Cell Dev
Biol 27P:175-82 (soybean); Singh et al., (1998) Theor Appl Genet
96:319-24 (soybean); Datta et al., (1990) Biotechnology 8:736-40
(rice); Klein et al., (1988) Proc. Natl. Acad. Sci. USA 85:4305-9
(maize); Klein et al., (1988) Biotechnology 6:559-63 (maize); U.S.
Pat. Nos. 5,240,855; 5,322,783 and 5,324,646; Klein et al., (1988)
Plant Physiol 91:440-4 (maize); Fromm et al., (1990) Biotechnology
8:833-9 (maize); Hooykaas-Van Slogteren et al., (1984) Nature
311:763-4; U.S. Pat. No. 5,736,369 (cereals); Bytebier et al.,
(1987) Proc. Natl. Acad. Sci. USA 84:5345-9 (Liliaceae); De Wet et
al., (1985) in The Experimental Manipulation of Ovule Tissues, ed.
Chapman et al., (Longman, N.Y.), pp. 197-209 (pollen); Kaeppler et
al., (1990) Plant Cell Rep 9:415-8) and Kaeppler et al., (1992)
Theor Appl Genet 84:560-6 (whisker-mediated transformation);
D'Halluin et al., (1992) Plant Cell 4:1495-505 (electroporation);
Li et al., (1993) Plant Cell Rep 12:250-5; Christou and Ford (1995)
Annals Botany 75:407-13 (rice) and Ishida et al., (1996) Nat
Biotechnol 14:745-50 (maize via Agrobacterium tumefaciens).
[0465] Alternatively, polynucleotides may be introduced into plants
by contacting plants with a virus or viral nucleic acids.
Generally, such methods involve incorporating a polynucleotide
within a viral DNA or RNA molecule. In some examples a polypeptide
of interest may be initially synthesized as part of a viral
polyprotein, which is later processed by proteolysis in vivo or in
vitro to produce the desired recombinant protein. Methods for
introducing polynucleotides into plants and expressing a protein
encoded therein, involving viral DNA or RNA molecules, are known,
see, for example, U.S. Pat. Nos. 5,889,191, 5,889,190, 5,866,785,
5,589,367 and 5,316,931. Transient transformation methods include,
but are not limited to, the introduction of polypeptides, such as a
double-strand break inducing agent, directly into the organism, the
introduction of polynucleotides such as DNA and/or RNA
polynucleotides, and the introduction of the RNA transcript, such
as an mRNA encoding a double-strand break inducing agent, into the
organism. Such methods include, for example, microinjection or
particle bombardment. See, for example Crossway et al., (1986) Mol
Gen Genet 202:179-85; Nomura et al., (1986) Plant Sci 44:53-8;
Hepler et al., (1994) Proc. Natl. Acad. Sci. USA 91:2176-80; and,
Hush et al., (1994) J Cell Sci 107:775-84.
[0466] The term "dicot" refers to the subclass of angiosperm plants
also known as "dicotyledoneae" and includes reference to whole
plants, plant organs (e.g., leaves, stems, roots, etc.), seeds,
plant cells, and progeny of the same. Plant cell, as used herein
includes, without limitation, seeds, suspension cultures, embryos,
meristematic regions, callus tissue, leaves, roots, shoots,
gametophytes, sporophytes, pollen, and microspores.
[0467] The term "crossed" or "cross" or "crossing" in the context
of this disclosure means the fusion of gametes via pollination to
produce progeny (i.e., cells, seeds, or plants). The term
encompasses both sexual crosses (the pollination of one plant by
another) and selfing (self-pollination, i.e., when the pollen and
ovule (or microspores and megaspores) are from the same plant or
genetically identical plants).
[0468] The term "introgression" refers to the transmission of a
desired allele of a genetic locus from one genetic background to
another. For example, introgression of a desired allele at a
specified locus can be transmitted to at least one progeny plant
via a sexual cross between two parent plants, where at least one of
the parent plants has the desired allele within its genome.
Alternatively, for example, transmission of an allele can occur by
recombination between two donor genomes, e.g., in a fused
protoplast, where at least one of the donor protoplasts has the
desired allele in its genome. The desired allele can be, e.g., a
transgene, a modified (mutated or edited) native allele, or a
selected allele of a marker or QTL.
[0469] Standard DNA isolation, purification, molecular cloning,
vector construction, and verification/characterization methods are
well established, see, for example Sambrook et al., (1989)
Molecular Cloning: A Laboratory Manual, (Cold Spring Harbor
Laboratory Press, NY). Vectors and constructs include circular
plasmids, and linear polynucleotides, comprising a polynucleotide
of interest and optionally other components including linkers,
adapters, regulatory regions, introns, restriction sites,
enhancers, insulators, selectable markers, nucleotide sequences of
interest, promoters, and/or other sites that aid in vector
construction or analysis. In some examples a recognition site
and/or target site can be contained within an intron, coding
sequence, 5' UTRs, 3' UTRs, and/or regulatory regions.
[0470] The present disclosure further provides expression
constructs for expressing in a plant, plant cell, or plant part a
guide RNA/Cas system that is capable of binding to and creating a
double strand break in a target site. In one embodiment, the
expression constructs of the disclosure comprise a promoter
operably linked to a nucleotide sequence encoding a Cas gene and a
promoter operably linked to a guide RNA of the present disclosure.
The promoter is capable of driving expression of an operably linked
nucleotide sequence in a plant cell.
[0471] A promoter is a region of DNA involved in recognition and
binding of RNA polymerase and other proteins to initiate
transcription. A plant promoter is a promoter capable of initiating
transcription in a plant cell; for a review of plant promoters,
see, Potenza et al., (2004) In Vitro Cell Dev Biol 40:1-22.
Constitutive promoters include, for example, the core promoter of
the Rsyn7 promoter and other constitutive promoters disclosed in
WO99/43838 and U.S. Pat. No. 6,072,050; the core CaMV 35S promoter
(Odell et al., (1985) Nature 313:810-2); rice actin (McElroy et
al., (1990) Plant Cell 2:163-71); ubiquitin (Christensen et al.,
(1989) Plant Mol Biol 12:619-32; Christensen et al., (1992) Plant
Mol Biol 18:675-89); pEMU (Last et al., (1991) Theor Appl Genet
81:581-8); MAS (Velten et al., (1984) EMBO J 3:2723-30); ALS
promoter (U.S. Pat. No. 5,659,026), and the like. Other
constitutive promoters are described in, for example, U.S. Pat.
Nos. 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785;
5,399,680; 5,268,463; 5,608,142 and 6,177,611. In some examples an
inducible promoter may be used. Pathogen-inducible promoters
induced following infection by a pathogen include, but are not
limited to those regulating expression of PR proteins, SAR
proteins, beta-1,3-glucanase, chitinase, etc.
[0472] Chemical-regulated promoters can be used to modulate the
expression of a gene in a plant through the application of an
exogenous chemical regulator. The promoter may be a
chemical-inducible promoter, where application of the chemical
induces gene expression, or a chemical-repressible promoter, where
application of the chemical represses gene expression.
Chemical-inducible promoters include, but are not limited to, the
maize In2-2 promoter, activated by benzene sulfonamide herbicide
safeners (De Veylder et al., (1997) Plant Cell Physiol 38:568-77),
the maize GST promoter (GST-II-27, WO93/01294), activated by
hydrophobic electrophilic compounds used as pre-emergent
herbicides, and the tobacco PR-1a promoter (Ono et al., (2004)
Biosci Biotechnol Biochem 68:803-7) activated by salicylic acid.
Other chemical-regulated promoters include steroid-responsive
promoters (see, for example, the glucocorticoid-inducible promoter
(Schena et al., (1991) Proc. Natl. Acad. Sci. USA 88:10421-5;
McNellis et al., (1998) Plant J 14:247-257); tetracycline-inducible
and tetracycline-repressible promoters (Gatz et al., (1991) Mol Gen
Genet 227:229-37; U.S. Pat. Nos. 5,814,618 and 5,789,156).
[0473] Tissue-preferred promoters can be utilized to target
enhanced expression within a particular plant tissue.
Tissue-preferred promoters include, for example, Kawamata et al.,
(1997) Plant Cell Physiol 38:792-803; Hansen et al., (1997) Mol Gen
Genet 254:337-43; Russell et al., (1997) Transgenic Res 6:157-68;
Rinehart et al., (1996) Plant Physiol 112:1331-41; Van Camp et al.,
(1996) Plant Physiol 112:525-35; Canevascini et al., (1996) Plant
Physiol 112:513-524; Lam, (1994) Results Probl Cell Differ
20:181-96; and Guevara-Garcia et al., (1993) Plant J 4:495-505.
Leaf-preferred promoters include, for example, Yamamoto et al.,
(1997) Plant J 12:255-65; Kwon et al., (1994) Plant Physiol
105:357-67; Yamamoto et al., (1994) Plant Cell Physiol 35:773-8;
Gotor et al., (1993) Plant J 3:509-18; Orozco et al., (1993) Plant
Mol Biol 23:1129-38; Matsuoka et al., (1993) Proc. Natl. Acad. Sci.
USA 90:9586-90; Simpson et al., (1985) EMBO J. 4:2723-9; Timko et
al., (1985) Nature 318:57-8. Root-preferred promoters include, for
example, Hire et al., (1992) Plant Mol Biol 20:207-18 (soybean
root-specific glutamine synthase gene); Miao et al., (1991) Plant
Cell 3:11-22 (cytosolic glutamine synthase (GS)); Keller and
Baumgartner, (1991) Plant Cell 3:1051-61 (root-specific control
element in the GRP 1.8 gene of French bean); Sanger et al., (1990)
Plant Mol Biol 14:433-43 (root-specific promoter of A. tumefaciens
mannopine synthase (MAS)); Bogusz et al., (1990) Plant Cell
2:633-41 (root-specific promoters isolated from Parasponia
andersonii and Trema tomentosa); Leach and Aoyagi, (1991) Plant Sci
79:69-76 (A. rhizogenes rolC and rolD root-inducing genes); Teeri
et al., (1989) EMBO J 8:343-50 (Agrobacterium wound-induced TR1'
and TR2' genes); VfENOD-GRP3 gene promoter (Kuster et al., (1995)
Plant Mol Biol 29:759-72); and rolB promoter (Capana et al., (1994)
Plant Mol Biol 25:681-91); phaseolin gene (Murai et al., (1983)
Science 23:476-82; Sengopta-Gopalen et al., (1988) Proc. Natl.
Acad. Sci. USA 82:3320-4). See also, U.S. Pat. Nos. 5,837,876;
5,750,386; 5,633,363; 5,459,252; 5,401,836; 5,110,732 and
5,023,179.
[0474] Seed-preferred promoters include both seed-specific
promoters active during seed development, as well as
seed-germinating promoters active during seed germination. See,
Thompson et al., (1989) BioEssays 10:108. Seed-preferred promoters
include, but are not limited to, Cim1 (cytokinin-induced message);
cZ19B1 (maize 19 kDa zein); and milps (myo-inositol-1-phosphate
synthase); (WO00/11177; and U.S. Pat. No. 6,225,529). For dicots,
seed-preferred promoters include, but are not limited to, bean
β-phaseolin, napin, β-conglycinin, soybean lectin,
cruciferin, and the like. For monocots, seed-preferred promoters
include, but are not limited to, maize 15 kDa zein, 22 kDa zein, 27
kDa gamma zein, waxy, shrunken 1, shrunken 2, globulin 1, oleosin,
and nuc1. See also, WO00/12733, where seed-preferred promoters from
END1 and END2 genes are disclosed.
[0475] A phenotypic marker is a screenable or selectable marker
that includes visual markers and selectable markers whether it is a
positive or negative selectable marker. Any phenotypic marker can
be used. Specifically, a selectable or screenable marker comprises
a DNA segment that allows one to identify, or select for or against
a molecule or a cell that contains it, often under particular
conditions. These markers can encode an activity, such as, but not
limited to, production of RNA, peptide, or protein, or can provide
a binding site for RNA, peptides, proteins, inorganic and organic
compounds or compositions and the like.
[0476] Examples of selectable markers include, but are not limited
to, DNA segments that comprise restriction enzyme sites; DNA
segments that encode products which provide resistance against
otherwise toxic compounds including antibiotics (such as
spectinomycin, ampicillin, kanamycin, tetracycline, Basta, neomycin
phosphotransferase II (NEO) and hygromycin phosphotransferase
(HPT)); DNA segments that encode products which are otherwise
lacking in the recipient cell (e.g., tRNA genes, auxotrophic
markers); DNA segments that encode products which can be readily
identified (e.g., phenotypic markers such as β-galactosidase,
GUS; fluorescent proteins such as green fluorescent protein (GFP),
cyan (CFP), yellow (YFP), red (RFP), and cell surface proteins);
the generation of new primer sites for PCR (e.g., the juxtaposition
of two DNA sequences not previously juxtaposed); the inclusion of
DNA sequences not acted upon or acted upon by a restriction
endonuclease or other DNA modifying enzyme, chemical, etc.; and,
the inclusion of DNA sequences required for a specific
modification (e.g., methylation) that allows its
identification.
[0477] Additional selectable markers include genes that confer
resistance to herbicidal compounds, such as glufosinate ammonium,
bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D).
See for example, Yarranton, (1992) Curr Opin Biotech 3:506-11;
Christopherson et al., (1992) Proc. Natl. Acad. Sci. USA 89:6314-8;
Yao et al., (1992) Cell 71:63-72; Reznikoff, (1992) Mol Microbiol
6:2419-22; Hu et al., (1987) Cell 48:555-66; Brown et al., (1987)
Cell 49:603-12; Figge et al., (1988) Cell 52:713-22; Deuschle et
al., (1989) Proc. Natl. Acad. Sci. USA 86:5400-4; Fuerst et al.,
(1989) Proc. Natl. Acad. Sci. USA 86:2549-53; Deuschle et al.,
(1990) Science 248:480-3; Gossen, (1993) Ph.D. Thesis, University
of Heidelberg; Reines et al., (1993) Proc. Natl. Acad. Sci. USA
90:1917-21; Labow et al., (1990) Mol Cell Biol 10:3343-56;
Zambretti et al., (1992) Proc. Natl. Acad. Sci. USA 89:3952-6; Baim
et al., (1991) Proc. Natl. Acad. Sci. USA 88:5072-6; Wyborski et
al., (1991) Nucleic Acids Res 19:4647-53; Hillen and Wissman,
(1989) Topics Mol Struc Biol 10:143-62; Degenkolb et al., (1991)
Antimicrob Agents Chemother 35:1591-5; Kleinschnidt et al., (1988)
Biochemistry 27:1094-104; Bonin, (1993) Ph.D. Thesis, University of
Heidelberg; Gossen et al., (1992) Proc. Natl. Acad. Sci. USA
89:5547-51; Oliva et al., (1992) Antimicrob Agents Chemother
36:913-9; Hlavka et al., (1985) Handbook of Experimental
Pharmacology, Vol. 78 (Springer-Verlag, Berlin); Gill et al.,
(1988) Nature 334:721-4.
[0478] The cells having the introduced sequence may be grown or
regenerated into plants using conventional conditions, see for
example, McCormick et al., (1986) Plant Cell Rep 5:81-4. These
plants may then be grown, and either pollinated with the same
transformed strain or with a different transformed or untransformed
strain, and the resulting progeny having the desired characteristic
and/or comprising the introduced polynucleotide or polypeptide
identified. Two or more generations may be grown to ensure that the
polynucleotide is stably maintained and inherited, and seeds
harvested.
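One way to check that an introduced polynucleotide is inherited as expected is to compare progeny counts against a Mendelian ratio. The following is a minimal sketch, not a method stated in this disclosure. It assumes a single-locus insertion in a hemizygous parent that is self-pollinated, so that about three quarters of the progeny are expected to carry the insertion. The function names and the counts used are hypothetical.

```python
# Chi-square goodness-of-fit test against a 3:1 segregation ratio.
# Assumptions (not from the source): single-locus insertion, hemizygous
# self-pollinated parent, progeny scored only for presence or absence.

CRITICAL_DF1_ALPHA_05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def chi_square_3_to_1(n_with: int, n_without: int) -> float:
    """Return the chi-square statistic for observed counts versus a 3:1 expectation."""
    total = n_with + n_without
    if total == 0:
        raise ValueError("at least one progeny plant must be scored")
    expected_with = 0.75 * total
    expected_without = 0.25 * total
    return ((n_with - expected_with) ** 2 / expected_with
            + (n_without - expected_without) ** 2 / expected_without)


def consistent_with_single_locus(n_with: int, n_without: int) -> bool:
    """True when the counts do not differ significantly from 3:1 at alpha = 0.05."""
    return chi_square_3_to_1(n_with, n_without) < CRITICAL_DF1_ALPHA_05


# Hypothetical progeny counts: 70 plants carry the insertion, 30 do not.
statistic = chi_square_3_to_1(70, 30)
fits = consistent_with_single_locus(70, 30)
```

A ratio that departs strongly from 3:1 could point to multiple insertions, loss of the insertion, or reduced transmission, any of which would warrant growing a further generation.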
[0479] Any plant can be used, including monocot and dicot plants.
Examples of monocot plants that can be used include, but are not
limited to, corn (Zea mays), rice (Oryza sativa), rye (Secale
cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g.,
pearl millet (Pennisetum glaucum), proso millet (Panicum
miliaceum), foxtail millet (Setaria italica), finger millet
(Eleusine coracana)), wheat (Triticum aestivum), sugarcane
(Saccharum spp.), oats (Avena), barley (Hordeum), switchgrass
(Panicum virgatum), pineapple (Ananas comosus), banana (Musa spp.),
palm, ornamentals, turfgrasses, and other grasses. Examples of
dicot plants that can be used include, but are not limited to,
soybean (Glycine max), canola (Brassica napus and B. campestris),
alfalfa (Medicago sativa), tobacco (Nicotiana tabacum), Arabidopsis
(Arabidopsis thaliana), sunflower (Helianthus annuus), cotton
(Gossypium arboreum), peanut (Arachis hypogaea), tomato
(Solanum lycopersicum), and potato (Solanum tuberosum).
[0480] The transgenes, recombinant DNA molecules, DNA sequences of
interest, and polynucleotides of interest can comprise one or more
genes of interest. Such genes of interest can encode, for example,
a protein that provides agronomic advantage to the plant.
[0481] Marker Assisted Selection and Breeding of Plants
[0482] A primary motivation for development of molecular markers in
crop species is the potential for increased efficiency in plant
breeding through marker assisted selection (MAS). Genetic marker
alleles, or alternatively, quantitative trait loci (QTL) alleles,
are used to identify plants that contain a desired genotype at one
or more loci, and that are expected to transfer the desired
genotype, along with a desired phenotype to their progeny. Genetic
marker alleles (or QTL alleles) can be used to identify plants that
contain a desired genotype at one locus, or at several unlinked or
linked loci (e.g., a haplotype), and that would be expected to
transfer the desired genotype, along with a desired phenotype to
their progeny. It will be appreciated that for the purposes of MAS,
the term marker can encompass both marker and QTL loci.
[0483] After a desired phenotype and a polymorphic chromosomal
locus, e.g., a marker locus or QTL, are determined to segregate
together, it is possible to use those polymorphic loci to select
for alleles corresponding to the desired phenotype--a process
called marker-assisted selection (MAS). In brief, a nucleic acid
corresponding to the marker nucleic acid is detected in a
biological sample from a plant to be selected. This detection can
take the form of hybridization of a probe nucleic acid to a marker,
e.g., using allele-specific hybridization, Southern blot analysis,
northern blot analysis, in situ hybridization, hybridization of
primers followed by PCR amplification of a region of the marker or
the like. A variety of procedures for detecting markers are well
known in the art. After the presence (or absence) of a particular
marker in the biological sample is verified, the plant is selected,
i.e., used to make progeny plants by selective breeding.
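The selection step described above can be sketched in code. This is a minimal illustration under stated assumptions: diploid plants, one marker locus, and genotype calls already obtained by one of the detection procedures mentioned. The plant IDs, allele names, and the function `select_by_marker` are hypothetical and are not part of this disclosure.

```python
from typing import Dict, List, Tuple

# Two alleles called at one marker locus in a diploid plant.
Genotype = Tuple[str, str]


def select_by_marker(
    calls: Dict[str, Genotype],
    favorable: str,
    require_homozygous: bool = False,
) -> List[str]:
    """Return sorted IDs of plants whose marker genotype carries the favorable allele.

    With require_homozygous=True, only plants with two copies are kept.
    """
    needed = 2 if require_homozygous else 1
    selected = []
    for plant_id, alleles in calls.items():
        if alleles.count(favorable) >= needed:
            selected.append(plant_id)
    return sorted(selected)


# Hypothetical genotype calls; "A" stands for the allele linked to the desired phenotype.
calls = {
    "plant_01": ("A", "A"),
    "plant_02": ("A", "B"),
    "plant_03": ("B", "B"),
    "plant_04": ("B", "A"),
}

carriers = select_by_marker(calls, favorable="A")
fixed = select_by_marker(calls, favorable="A", require_homozygous=True)
```

Here `carriers` holds every plant with at least one copy of the favorable allele, and `fixed` holds only the homozygous plants, which would be expected to transmit the allele to all of their progeny.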
[0484] Plant breeders need to combine traits of interest with genes
for high yield and other desirable traits to develop improved plant
varieties. Screening for large numbers of samples can be expensive,
time consuming, and unreliable. Use of markers and/or
genetically linked nucleic acids is an effective method for
selecting plants having the desired traits in breeding programs. For
example, one advantage of marker-assisted selection over field
evaluations is that MAS can be done at any time of year regardless
of the growing season. Moreover, environmental effects are
irrelevant to marker-assisted selection.
[0485] When a population is segregating for multiple loci affecting
one or multiple traits, the efficiency of MAS compared to
phenotypic screening becomes even greater because all the loci can
be processed in the lab together from a single sample of DNA.
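The advantage of screening several loci from one DNA sample can be illustrated with a short sketch. It assumes an F2 population and unlinked loci that segregate independently, so that one quarter of the plants are expected to be homozygous for the favorable allele at each locus. The marker names, plant IDs, and function names are hypothetical.

```python
from fractions import Fraction
from typing import Dict, List

# One DNA sample per plant yields a call at every marker locus.
# Each call is the number of copies (0, 1, or 2) of the favorable allele.
Sample = Dict[str, int]


def select_multilocus(
    samples: Dict[str, Sample],
    loci: List[str],
    min_copies: int = 2,
) -> List[str]:
    """Return sorted IDs of plants with at least min_copies at every listed locus."""
    return sorted(
        plant
        for plant, calls in samples.items()
        if all(calls.get(locus, 0) >= min_copies for locus in loci)
    )


def expected_fraction_homozygous(n_loci: int) -> Fraction:
    """Expected fraction of F2 plants homozygous favorable at n unlinked loci."""
    return Fraction(1, 4) ** n_loci


# Hypothetical calls at three marker loci for four plants.
samples = {
    "plant_01": {"m1": 2, "m2": 2, "m3": 2},
    "plant_02": {"m1": 2, "m2": 1, "m3": 2},
    "plant_03": {"m1": 0, "m2": 2, "m3": 2},
    "plant_04": {"m1": 2, "m2": 2, "m3": 2},
}

keepers = select_multilocus(samples, ["m1", "m2", "m3"])
rare = expected_fraction_homozygous(3)  # 1/64 under the stated assumptions
```

Because the expected fraction shrinks by a factor of four with each added locus, the desired plants become rare quickly, which is where scoring all loci from a single sample saves the most effort compared with phenotyping each trait separately.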
[0486] The DNA repair mechanisms of cells provide the basis for
introducing extraneous DNA or inducing mutations in endogenous genes.
DNA homologous recombination is a specialized DNA repair pathway in
which cells repair DNA damage using a homologous sequence. In
plants, DNA homologous recombination on its own occurs at
frequencies too low for routine use in gene targeting or gene
editing, but the process has been found to be stimulated by DNA
double-strand breaks (Bibikova et al., (2001) Mol. Cell. Biol.
21:289-297; Puchta and Baltimore, (2003) Science 300:763; Wright et
al., (2005) Plant J. 44:693-705).
[0487] The meaning of abbreviations is as follows: "sec" means
second(s), "min" means minute(s), "h" means hour(s), "d" means
day(s), "μL" means microliter(s), "mL" means milliliter(s), "L"
means liter(s), "μM" means micromolar, "mM" means millimolar,
"M" means molar, "mmol" means millimole(s), "μmole" means
micromole(s), "g" means gram(s), "μg" means microgram(s), "ng"
means nanogram(s), "U" means unit(s), "bp" means base pair(s) and
"kb" means kilobase(s).
[0488] Also, as described herein, for each example or embodiment
that cites a guide RNA, a similar guide polynucleotide can be
designed wherein the guide polynucleotide does not solely comprise
ribonucleic acids but wherein the guide polynucleotide comprises a
combination of RNA-DNA molecules or solely comprises DNA
molecules.
Non-limiting examples of compositions and methods disclosed herein
are as follows: [0489] 1. A method for selecting a plant comprising
an altered target site in its plant genome, the method comprising:
[0490] a) obtaining a first plant comprising at least one Cas
endonuclease capable of introducing a double strand break at a
target site in the plant genome; [0491] b) obtaining a second plant
comprising a guide RNA that is capable of forming a complex with
the Cas endonuclease of (a); [0492] c) crossing the first plant of
(a) with the second plant of (b); [0493] d) evaluating the progeny
of (c) for an alteration in the target site; and, [0494] e)
selecting a progeny plant that possesses the desired alteration of
said target site. [0495] 2. A method for selecting a plant
comprising an altered target site in its plant genome, the method
comprising selecting at least one progeny plant that comprises an
alteration at a target site in its plant genome, wherein said
progeny plant was obtained by crossing a first plant comprising at
least one Cas endonuclease with a second plant comprising a guide
RNA, wherein said Cas endonuclease is capable of introducing a
double strand break at said target site. [0496] 3. A method for
selecting a plant comprising an altered target site in its plant
genome, the method comprising: [0497] a) obtaining a first plant
comprising at least one Cas endonuclease capable of introducing a
double strand break at a target site in the plant genome; [0498] b)
obtaining a second plant comprising a guide RNA and a donor DNA,
wherein said guide RNA is capable of forming a complex with the Cas
endonuclease of (a), wherein said donor DNA comprises a
polynucleotide of interest; [0499] c) crossing the first plant of
(a) with the second plant of (b); [0500] d) evaluating the progeny
of (c) for an alteration in the target site; and, [0501] e)
selecting a progeny plant that comprises the polynucleotide of
interest inserted at said target site. [0502] 4. A method for
selecting a plant comprising an altered target site in its plant
genome, the method comprising selecting at least one progeny plant
that comprises an alteration at a target site in its plant genome,
wherein said progeny plant was obtained by crossing a first plant
expressing at least one Cas endonuclease to a second plant
comprising a guide RNA and a donor DNA, wherein said Cas
endonuclease is capable of introducing a double strand break at
said target site, wherein said donor DNA comprises a polynucleotide
of interest. [0503] 5. A method for modifying a target site in the
genome of a plant cell, the method comprising introducing a guide
RNA into a plant cell having a Cas endonuclease, wherein said guide
RNA and Cas endonuclease are capable of forming a complex that
enables the Cas endonuclease to introduce a double strand break at
said target site. [0504] 6. A method for modifying a target site in
the genome of a plant cell, the method comprising introducing a
guide RNA and a Cas endonuclease into said plant cell, wherein said
guide RNA and Cas endonuclease are capable of forming a complex
that enables the Cas endonuclease to introduce a double strand
break at said target site. [0505] 7. A method for modifying a
target site in the genome of a plant cell, the method comprising
introducing a guide RNA and a donor DNA into a plant cell having a
Cas endonuclease, wherein said guide RNA and Cas endonuclease are
capable of forming a complex that enables the Cas endonuclease to
introduce a double strand break at said target site, wherein said
donor DNA comprises a polynucleotide of interest. [0506] 8. A
method for modifying a target site in the genome of a plant cell,
the method comprising: [0507] a) introducing into a plant cell a
guide RNA and a Cas endonuclease, wherein said guide RNA and Cas
endonuclease are capable of forming a complex that enables the Cas
endonuclease to introduce a double strand break at said target
site; and, [0508] b) identifying at least one plant cell that has a
modification at said target, wherein the modification includes at
least one deletion or substitution of one or more nucleotides in
said target site. [0509] 9. A method for modifying a target DNA
sequence in the genome of a plant cell, the method comprising:
[0510] a) introducing into a plant cell a first recombinant DNA
construct capable of expressing a guide RNA and a second
recombinant DNA construct capable of expressing a Cas endonuclease,
wherein said guide RNA and Cas endonuclease are capable of forming
a complex that enables the Cas endonuclease to introduce a double
strand break at said target site; and, [0511] b) identifying at
least one plant cell that has a modification at said target,
wherein the modification includes at least one deletion or
substitution of one or more nucleotides in said target site. [0512]
10. A method for introducing a polynucleotide of interest into a
target site in the genome of a plant cell, the method comprising:
[0513] a) introducing into a plant cell a first recombinant DNA
construct capable of expressing a guide RNA and a second
recombinant DNA construct capable of expressing a Cas endonuclease,
wherein said guide RNA and Cas endonuclease are capable of forming
a complex that enables the Cas endonuclease to introduce a double
strand break at said target site; [0514] b) contacting the plant
cell of (a) with a donor DNA comprising a polynucleotide of
interest; and, [0515] c) identifying at least one plant cell from
(b) comprising in its genome the polynucleotide of interest
integrated at said target site. [0516] 10-B. A method for
introducing a polynucleotide of interest into a target site in the
genome of a plant cell, the method comprising: [0517] a)
introducing into a plant cell a guide RNA and a Cas endonuclease,
wherein said guide RNA and Cas endonuclease are capable of forming
a complex that enables the Cas endonuclease to introduce a double
strand break at said target site; [0518] b) contacting the plant
cell of (a) with a donor DNA comprising a polynucleotide of
interest; and, [0519] c) identifying at least one plant cell from
(b) comprising in its genome the polynucleotide of interest
integrated at said target site. [0520] 11. The method of any one of
embodiments 5-8, wherein the guide RNA is introduced directly by
particle bombardment. [0521] 12. The method of any one of
embodiments 5-9, wherein the guide RNA is introduced via particle
bombardment or Agrobacterium transformation of a recombinant DNA
construct comprising the corresponding guide DNA operably linked to
a plant U6 polymerase III promoter. [0522] 13. The method of any
one of embodiments 1-10, wherein the Cas endonuclease gene is a
plant optimized Cas9 endonuclease. [0523] 14. The method of any one
of embodiments 1-10, wherein the Cas endonuclease gene is operably
linked to a SV40 nuclear targeting signal upstream of the Cas codon
region and a VirD2 nuclear localization signal downstream of the
Cas codon region. [0524] 15. The method of any one of embodiments
1-14, wherein the plant is a monocot or a dicot. [0525] 16. The
method of embodiment 15, wherein the monocot is selected from the
group consisting of maize, rice, sorghum, rye, barley, wheat,
millet, oats, sugarcane, turfgrass, or switchgrass. [0526] 17. The
method of embodiment 16, wherein the dicot is selected from the
group consisting of soybean, canola, alfalfa, sunflower, cotton,
tobacco, peanut, potato, Arabidopsis, or safflower. [0527]
18. The method of any one of embodiments 1-17 wherein the target
site is located in the gene sequence of an acetolactate synthase
(ALS) gene, an enolpyruvylshikimate phosphate synthase (EPSPS)
gene, or a male fertility gene (MS45, MS26 or MSCA1). [0528] 19. A plant or
seed produced by any one of embodiments 1-17. [0529] 20. A plant
comprising a recombinant DNA construct, said recombinant DNA
construct comprising a promoter operably linked to a nucleotide
sequence encoding a plant optimized Cas9 endonuclease, wherein said
plant optimized Cas9 endonuclease is capable of binding to and
creating a double strand break in a genomic target sequence of said
plant genome. [0530] 21. A plant comprising a recombinant DNA
construct and a guide RNA, wherein said recombinant DNA construct
comprises a promoter operably linked to a nucleotide sequence
encoding a plant optimized Cas9 endonuclease, wherein said plant
optimized Cas9 endonuclease and guide RNA are capable of forming a
complex and creating a double strand break in a genomic target
sequence of said plant genome. [0531] 22. A recombinant DNA construct
comprising a promoter operably linked to a nucleotide sequence
encoding a plant optimized Cas9 endonuclease, wherein said plant
optimized Cas9 endonuclease is capable of binding to and creating a
double strand break in a genomic target sequence of said plant genome.
[0532] 23. A recombinant DNA construct comprising a promoter
operably linked to a nucleotide sequence expressing a guide RNA,
wherein said guide RNA is capable of forming a complex with a plant
optimized Cas9 endonuclease, and wherein said complex is capable of
binding to and creating a double strand break in a genomic target
sequence of said plant genome. [0533] 24. A method for selecting a
male sterile plant, the method comprising selecting at least one
progeny plant that comprises an alteration at a genomic target site
located in a male fertility gene locus, wherein said progeny plant
is obtained by crossing a first plant expressing a Cas9
endonuclease to a second plant comprising a guide RNA, wherein said
Cas endonuclease is capable of introducing a double strand break at
said genomic target site. [0534] 25. A method for producing a male
sterile plant, the method comprising: [0535] a) obtaining a first
plant comprising at least one Cas endonuclease capable of
introducing a double strand break at a genomic target site located
in a male fertility gene locus in the plant genome; [0536] b)
obtaining a second plant comprising a guide RNA that is capable of
forming a complex with the Cas endonuclease of (a); [0537] c)
crossing the first plant of (a) with the second plant of (b);
[0538] d) evaluating the progeny of (c) for an alteration in the
target site; and, [0539] e) selecting a progeny plant that is male
sterile. [0540] 26. The method of any of embodiments 23-24 wherein
the male fertility gene is selected from the list comprising MS26,
MS45, M. [0541] 27. The method of any one of embodiments 24-26,
wherein the plant is a monocot or a dicot. [0542] 28. The method of
embodiment 27, wherein the monocot is selected from the group
consisting of maize, rice, sorghum, rye, barley, wheat, millet,
oats, sugarcane, turfgrass, or switchgrass. [0543] 29. A method for
editing a nucleotide sequence in the genome of a cell, the method
comprising introducing a guide RNA, a polynucleotide modification
template and at least one Cas endonuclease into a cell, wherein the
Cas endonuclease introduces a double-strand break at a target site
in the genome of said cell, wherein said polynucleotide
modification template comprises at least one nucleotide
modification of said nucleotide sequence. [0544] 30. The method of
embodiment 29, wherein the cell is a plant cell. [0545] 31. The
method of embodiment 29 wherein the nucleotide sequence is a
promoter, a regulatory sequence or a gene of interest.
[0546] 32. The method of embodiment 31 wherein the gene of interest
is an EPSPS gene. [0547] 33. The method of embodiment 30 wherein
the plant cell is a monocot or dicot plant cell. [0548] 34. A
method for producing an epsps mutant plant, the method comprising:
[0549] a) providing a guide RNA, a polynucleotide modification
template and at least one Cas endonuclease to a plant cell, wherein
the Cas endonuclease introduces a double strand break at a target
site within an epsps genomic sequence in the plant genome, wherein
said polynucleotide modification template comprises at least one
nucleotide modification of said epsps genomic sequence; [0550] b)
obtaining a plant from the plant cell of (a); [0551] c) evaluating
the plant of (b) for the presence of said at least one nucleotide
modification; and, [0552] d) selecting a progeny plant that shows
tolerance to glyphosate. [0553] 35. A method for producing an epsps
mutant plant, the method comprising: [0554] a) providing a guide
RNA, a polynucleotide modification template and at least one Cas
endonuclease into a plant cell, wherein the Cas endonuclease
introduces a double strand break at a target site within an epsps
genomic sequence in the plant genome, wherein said polynucleotide
modification template comprises at least one nucleotide
modification of said epsps genomic sequence; [0555] b) obtaining a
plant from the plant cell of (a); [0556] c) evaluating the plant of
(b) for the presence of said at least one nucleotide modification;
and, [0557] d) screening a progeny plant of (c) that is void of
said guide RNA and Cas endonuclease. [0558] 36. The method of
embodiment 35, further comprising selecting a plant that shows
resistance to glyphosate. [0559] 37. A plant, plant cell or seed
produced by any one of embodiments 29-36. [0560] 38. The method of
any one of embodiments 29-36 wherein the Cas endonuclease is a Cas9
endonuclease. [0561] 39. The method of embodiment 38 wherein the
Cas9 endonuclease is expressed by SEQ ID NO:5. [0562] 40. The
method of embodiment 38 wherein the Cas9 endonuclease is encoded by
any one of SEQ ID NOs: 1, 124, 212, 213, 214, 215, 216, 193 or
nucleotides 2037-6329 of SEQ ID NO:5, or any functional fragment or
variant thereof. [0563] 41. The plant or plant cell of embodiment
37, wherein said plant cell shows resistance to glyphosate. [0564]
42. A plant cell comprising a modified nucleotide sequence, wherein
the modified nucleotide sequence was produced by providing a guide
RNA, a polynucleotide modification template and at least one Cas
endonuclease to a plant cell, wherein the Cas endonuclease is
capable of introducing a double-strand break at a target site in
the plant genome wherein said polynucleotide modification template
comprises at least one nucleotide modification of said nucleotide
sequence. [0565] 43. The method of embodiments 29, 34 and 35
wherein the at least one nucleotide modification is not a
modification at said target site. [0566] 44. A method for producing
a male sterile plant, the method comprising: [0567] a) introducing
into a plant cell a guide RNA and a Cas endonuclease, wherein said
guide RNA and Cas endonuclease are capable of forming a complex
that enables the Cas endonuclease to introduce a double strand
break at a target site located in or near a male fertility gene;
[0568] b) identifying at least one plant cell that has a
modification in said male fertility gene, wherein the modification
includes at least one deletion or substitution of one or more
nucleotides in said male fertility gene; and, [0569] c) obtaining a
plant from the plant cell of b).
[0570] 45. The method of embodiment 43, further comprising
selecting a progeny plant from the plant of c) wherein said progeny
plant is male sterile. [0571] 46. The method of embodiment 43,
wherein the male fertility gene is selected from the group
comprising MS26, MS45 and MSCA1. [0572] 47. A plant comprising at
least one altered target site, wherein the at least one altered
target site originated from a corresponding target site that was
recognized and cleaved by a guide RNA/Cas endonuclease system, and
wherein the at least one altered target site is in a genomic region
of interest that extends from the target sequence set forth in SEQ
ID NO: 229 to the target site set forth in SEQ ID NO: 235. [0573]
48. The plant of embodiment 47, wherein the at least one altered
target site has an alteration selected from the group consisting of
(i) replacement of at least one nucleotide, (ii) a deletion of at
least one nucleotide, (iii) an insertion of at least one
nucleotide, and (iv) any combination of (i)-(iii). [0574] 49. The
plant of embodiment 47, wherein the at least one altered target
site comprises a recombinant DNA molecule. [0575] 50. The plant of
embodiment 47, wherein the plant comprises at least two altered
target sites, wherein each of the altered target sites originated
from a corresponding target site that was recognized and cleaved by a
guide RNA/Cas endonuclease system, wherein the corresponding target
site is selected from the group consisting of SEQ ID NOs: 229, 230,
231, 232, 233, 234, 235 and 236. [0576] 51. A recombinant DNA
construct comprising a nucleotide sequence set forth in SEQ ID NO:
120 or SEQ ID NO:295, or a functional fragment thereof, operably
linked to at least one heterologous sequence, wherein said
nucleotide sequence is a promoter. [0577] 52. A plant stably
transformed with a recombinant DNA construct comprising a soybean
promoter and a heterologous nucleic acid fragment operably linked
to said soybean promoter, wherein said promoter is capable of
controlling expression of said heterologous nucleic acid fragment
in a plant cell, and further wherein said promoter comprises any of
the sequences set forth in SEQ ID NO: 120 or SEQ ID NO: 295. [0578]
53. A method for editing a nucleotide sequence in the genome of a
cell, the method comprising introducing a guide polynucleotide, a
Cas endonuclease, and optionally a polynucleotide modification
template, into a cell, wherein said guide RNA and Cas endonuclease
are capable of forming a complex that enables the Cas endonuclease
to introduce a double strand break at a target site in the genome
of said cell, wherein said polynucleotide modification template
comprises at least one nucleotide modification of said nucleotide
sequence. [0579] 54. The method of embodiment 53, wherein the
nucleotide sequence in the genome of a cell is selected from the
group consisting of a promoter sequence, a terminator sequence, a
regulatory element sequence, a splice site, a coding sequence, a
polyubiquitination site, an intron site and an intron enhancing
motif. [0580] 55. A method for editing a promoter sequence in the
genome of a cell, the method comprising introducing a guide
polynucleotide, a polynucleotide modification template and at least
one Cas endonuclease into a cell, wherein said guide RNA and Cas
endonuclease are capable of forming a complex that enables the Cas
endonuclease to introduce a double strand break at a target site in
the genome of said cell, wherein said polynucleotide modification
template comprises at least one nucleotide modification of said
nucleotide sequence. [0581] 56. A method for replacing a first
promoter sequence in a cell, the method comprising introducing a
guide RNA, a polynucleotide modification template, and a Cas
endonuclease into said cell, wherein said guide RNA and Cas
endonuclease are capable of forming a complex that enables the Cas
endonuclease to introduce a double strand break at a target site in
the genome of said cell, wherein said polynucleotide modification
template comprises a second promoter or second promoter fragment
that is different from said first promoter sequence. [0582] 57. The
method of embodiment 56, wherein the replacement of the first
promoter sequence results in any one of the following, or any one
combination of the following: an increased promoter activity, an
increased promoter tissue specificity, a decreased promoter
activity, a decreased promoter tissue specificity, a new promoter
activity, an inducible promoter activity, an extended window of
gene expression, or a modification of the timing or developmental
progress of gene expression in the same cell layer or other cell
layer. [0583] 58. The method of embodiment 56, wherein the first
promoter sequence is selected from the group consisting of Zea mays
ARGOS 8 promoter, a soybean EPSPS1 promoter, a maize EPSPS
promoter, and a maize NPK1 promoter, and wherein the second promoter sequence
is selected from the group consisting of a Zea mays GOS2
PRO:GOS2-intron promoter, a soybean ubiquitin promoter, a stress
inducible maize RAB17 promoter, a Zea mays-PEPC1 promoter, a Zea
mays Ubiquitin promoter, a Zea mays-Rootmet2 promoter, a rice actin
promoter, a sorghum RCC3 promoter, a Zea mays-GOS2 promoter, a Zea
mays-ACO2 promoter and a Zea mays oleosin promoter. [0584] 59. A
method for deleting a promoter sequence in the genome of a cell,
the method comprising introducing a guide polynucleotide and a Cas
endonuclease into a cell, wherein said guide RNA and Cas
endonuclease are capable of forming a complex that enables the Cas
endonuclease to introduce a double strand break in at least one
target site located inside or outside said promoter sequence.
[0585] 60. A method for inserting a promoter or a promoter element
in the genome of a cell, the method comprising introducing a guide
polynucleotide, a polynucleotide modification template comprising
the promoter or the promoter element, and a Cas endonuclease into a
cell, wherein said guide RNA and Cas endonuclease are capable of
forming a complex that enables the Cas endonuclease to introduce a
double strand break at a target site in the genome of said cell.
[0586] 61. The method of embodiment 60, wherein the insertion of
the promoter or promoter element results in any one of the
following, or any one combination of the following: an increased
promoter activity, an increased promoter tissue specificity, a
decreased promoter activity, a decreased promoter tissue
specificity, a new promoter activity, an inducible promoter
activity, an extended window of gene expression, a modification of
the timing or developmental progress of gene expression, a mutation
of DNA binding elements, or an addition of DNA binding elements.
[0587] 62. A method for editing a Zinc Finger transcription factor,
the method comprising introducing a guide polynucleotide, a Cas
endonuclease, and optionally a polynucleotide modification
template, into a cell, wherein the Cas endonuclease introduces a
double-strand break at a target site in the genome of said cell,
wherein said polynucleotide modification template comprises at
least one nucleotide modification or deletion of said Zinc Finger
transcription factor, wherein the deletion or modification of said
Zinc Finger transcription factor results in the creation of a
dominant negative Zinc Finger transcription factor mutant. [0588]
63. A method for creating a fusion protein, the method comprising
introducing a guide polynucleotide, a Cas endonuclease, and a
polynucleotide modification template, into a cell, wherein the Cas
endonuclease introduces a double-strand break at a target site
located inside or outside a first coding sequence in the genome of
said cell, wherein said polynucleotide modification template
comprises a second coding sequence encoding a protein of interest,
wherein the protein fusion results in any one of the following, or
any one combination of the following: a targeting of the fusion
protein to the chloroplast of said cell, an increased protein
activity, an increased protein functionality, a decreased protein
activity, a decreased protein functionality, a new protein
functionality, a modified protein functionality, a new protein
localization, a new timing of protein expression, a modified
protein expression pattern, a chimeric protein, or a modified
protein with dominant phenotype functionality. [0589] 64. A method
for producing in a plant a complex trait locus comprising at least
two altered target sequences in a genomic region of interest, said
method comprising: [0590] (a) selecting a genomic region in a
plant, wherein the genomic region comprises a first target sequence
and a second target sequence; [0591] (b) contacting at least one
plant cell with at least a first guide polynucleotide, a second
guide polynucleotide, and optionally at least one Donor DNA, and a Cas
endonuclease, wherein the first and second guide polynucleotide and
the Cas endonuclease can form a complex that enables the Cas
endonuclease to introduce a double strand break in at least a first
and a second target sequence; [0592] (c) identifying a cell from
(b) comprising a first alteration at the first target sequence and
a second alteration at the second target sequence; and, [0593] (d)
recovering a first fertile plant from the cell of (c) said fertile
plant comprising the first alteration and the second alteration,
wherein the first alteration and the second alteration are
physically linked. [0594] 65. A method for producing in a plant a
complex trait locus comprising at least two altered target
sequences in a genomic region of interest, said method comprising:
[0595] (a) selecting a genomic region in a plant, wherein the
genomic region comprises a first target sequence and a second
target sequence; [0596] (b) contacting at least one plant cell with
a first guide polynucleotide, a Cas endonuclease, and optionally a
first Donor DNA, wherein the first guide polynucleotide and the Cas
endonuclease can form a complex that enables the Cas endonuclease
to introduce a double strand break at a first target sequence; [0597]
(c) identifying a cell from (b) comprising a first alteration at
the first target sequence; [0598] (d) recovering a first fertile
plant from the cell of (c), said first fertile plant comprising the
first alteration; [0599] (e) contacting at least one plant cell
with a second guide polynucleotide, a Cas endonuclease, and
optionally a second Donor DNA; [0600] (f) identifying a cell from
(e) comprising a second alteration at the second target sequence;
[0601] (g) recovering a second fertile plant from the cell of (f),
said second fertile plant comprising the second alteration; and,
[0602] (h) obtaining a fertile progeny plant from the second
fertile plant of (g), said fertile progeny plant comprising the
first alteration and the second alteration, wherein the first
alteration and the second alteration are physically linked. [0603]
66. A method for editing a nucleotide sequence in the genome of a
cell, the method comprising introducing at least one guide RNA, at
least one polynucleotide modification template and at least one Cas
endonuclease into a cell, wherein the Cas endonuclease introduces a
double-strand break at a target site in the genome of said cell,
wherein said polynucleotide modification template comprises at
least one nucleotide modification of said nucleotide sequence.
[0604] 67. The method of embodiment 66 wherein the editing of said
nucleotide sequence renders said nucleotide sequence capable of
conferring herbicide resistance to said cell. [0605] 68. The method
of embodiment 67, wherein the cell is a plant cell. [0606] 69. The
method of embodiment 66 wherein the nucleotide sequence is a
promoter, a regulatory sequence or a gene of interest.
[0607] 70. The method of embodiment 69 wherein the gene of interest
is an enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene or an
ALS gene. [0608] 71. The method of embodiment 66 wherein the plant
cell is a monocot or dicot plant cell. [0609] 72. A method for
producing an acetolactate synthase (ALS) mutant plant, the method
comprising: [0610] a) providing a guide RNA, a polynucleotide
modification template, and a Cas endonuclease to a plant cell
comprising an ALS nucleotide sequence, wherein said guide RNA and
Cas endonuclease are capable of forming a complex that enables the
Cas endonuclease to introduce a double strand break at a target
site in the genome of said plant cell, wherein said polynucleotide
modification template comprises at least one nucleotide
modification of said ALS nucleotide sequence; [0611] b) obtaining a
plant from the plant cell of (a); [0612] c) evaluating the plant of
(b) for the presence of said at least one nucleotide modification;
and, [0613] d) selecting a progeny plant that shows resistance to
sulphonylurea. [0614] 73. A method for producing an acetolactate
synthase (ALS) mutant plant, the method comprising: [0615] a)
providing a guide RNA and a polynucleotide modification template to
a plant cell comprising a Cas endonuclease and an ALS nucleotide
sequence, wherein said Cas endonuclease introduces a double strand
break at a target site in the genome of said plant cell, wherein
said polynucleotide modification template comprises at least one
nucleotide modification of said ALS nucleotide sequence; [0616] b)
obtaining a plant from the plant cell of (a); [0617] c) evaluating
the plant of (b) for the presence of said at least one nucleotide
modification; and, [0618] d) selecting a progeny plant that shows
resistance to sulphonylurea. [0619] 74. The method of any of
embodiments 72-73, wherein said polynucleotide modification
template comprises a non-functional or partial fragment of the ALS
nucleotide sequence. [0620] 75. The method of any of embodiments
72-73, wherein the target site is located within the ALS nucleotide
sequence. [0621] 76. The method of any of embodiments 72-73,
further comprising selecting a progeny plant that is void of said
guide RNA and Cas endonuclease. [0622] 77. A method for producing
an acetolactate synthase (ALS) mutant plant, the method comprising:
[0623] a) obtaining a plant or a seed thereof, wherein the plant or
the seed comprises a modification in an endogenous ALS gene, the
modification generated by a Cas endonuclease, a guide RNA and a
polynucleotide modification template, wherein the plant or the seed
is resistant to sulphonylurea; and, [0624] b) producing a progeny
plant that is void of said guide RNA and Cas endonuclease. [0625]
78. The method of embodiment 77, further comprising selecting a
plant that shows resistance to sulphonylurea. [0626] 79. The method
of any one of embodiments 72-78, wherein the plant is a monocot or
a dicot. [0627] 80. The method of embodiment 79, wherein the
monocot is selected from the group consisting of maize, rice,
sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, or
switchgrass. [0628] 81. The method of embodiment 79, wherein the
dicot is selected from the group consisting of soybean, canola,
alfalfa, sunflower, cotton, tobacco, peanut, potato,
Arabidopsis, or safflower. [0629] 82. A method of generating a
sulphonylurea resistant plant, the method comprising providing a
plant cell wherein its endogenous chromosomal ALS gene has been
modified through a guide RNA/Cas endonuclease system to produce a
sulphonylurea resistant ALS protein and growing a plant from said
maize plant cell, wherein said plant is resistant to sulphonylurea.
[0630] 83. The method of embodiment 82, wherein the plant is a
monocot or a dicot. [0631] 84. A plant produced by the method of
embodiment 82. [0632] 85. A seed produced by the plant of
embodiment 84. [0633] 86. A guide RNA wherein the variable
targeting domain targets a fragment of a plant EPSPS or ALS
nucleotide sequence. [0634] 87. A method for producing an
acetolactate synthase (ALS) mutant plant cell, the method
comprising: [0635] a) providing to a cell comprising an ALS
nucleotide sequence, a guide RNA, a Cas endonuclease, and a
polynucleotide modification template, wherein said guide RNA and
Cas endonuclease are capable of forming a complex that enables the
Cas endonuclease to introduce a double strand break at a target
site in the genome of said cell, wherein said polynucleotide
modification template comprises at least one nucleotide
modification of said ALS nucleotide sequence; and, [0636] b)
obtaining at least one plant cell of (a) that has at least one
nucleotide modification at said ALS nucleotide sequence, wherein
the modification includes at least one deletion, insertion or
substitution of one or more nucleotides in said ALS nucleotide
sequence. [0637] 88. A method for producing an acetolactate
synthase (ALS) mutant plant cell, the method comprising: [0638] a)
providing a guide RNA and a polynucleotide modification template to
a plant cell comprising a Cas endonuclease and an ALS nucleotide
sequence, wherein said Cas endonuclease introduces a double strand
break at a target site in the genome of said plant cell, wherein
said polynucleotide modification template comprises at least one
nucleotide modification of said ALS nucleotide sequence; and,
[0639] b) identifying at least one plant cell of (a) that has at
least one nucleotide modification at said ALS nucleotide sequence,
wherein the modification includes at least one deletion, insertion
or substitution of one or more nucleotides in said ALS nucleotide
sequence. [0640] 89. A method for producing an acetolactate
synthase (ALS) mutant cell, the method comprising: [0641] a)
providing to a cell comprising an ALS nucleotide sequence, a first
recombinant DNA construct capable of expressing a guide RNA, a
second recombinant DNA construct capable of expressing a Cas
endonuclease, and a polynucleotide modification template, wherein
said guide RNA and Cas endonuclease are capable of forming a
complex that enables the Cas endonuclease to introduce a double
strand break at a target site in the genome of said cell, wherein
said polynucleotide modification template comprises a
non-functional fragment of the ALS gene and at least one nucleotide
modification of said ALS nucleotide sequence; and, [0642] b)
identifying at least one cell of (a) that has at least one
nucleotide modification at said ALS nucleotide sequence, wherein
the modification includes at least one deletion, insertion or
substitution of one or more nucleotides in said ALS nucleotide
sequence.
EXAMPLES
[0643] In the following Examples, unless otherwise stated, parts
and percentages are by weight and degrees are Celsius. It should be
understood that these Examples, while indicating embodiments of the
disclosure, are given by way of illustration only. From the above
discussion and these Examples, one skilled in the art can make
various changes and modifications of the disclosure to adapt it to
various usages and conditions. Such modifications are also intended
to fall within the scope of the appended claims.
Example 1
[0644] Maize Optimized Expression Cassettes for Guide RNA/Cas
Endonuclease Based Genome Modification in Maize Plants
[0645] For genome engineering applications, the type II CRISPR/Cas
system minimally requires the Cas9 protein and a duplexed
crRNA/tracrRNA molecule or a synthetically fused crRNA and tracrRNA
(guide RNA) molecule for DNA target site recognition and cleavage
(Gasiunas et al. (2012) Proc. Natl. Acad. Sci. USA 109:E2579-86,
Jinek et al. (2012) Science 337:816-21, Mali et al. (2013) Science
339:823-26, and Cong et al. (2013) Science 339:819-23). Described
herein is a guide RNA/Cas endonuclease system that is based on the
type II CRISPR/Cas system and consists of a Cas endonuclease and a
guide RNA (or duplexed crRNA and tracrRNA) that together can form a
complex that recognizes a genomic target site in a plant and
introduces a double-strand-break into said target site.
[0646] To test the guide RNA/Cas endonuclease system in maize, the
Cas9 gene from Streptococcus pyogenes M1 GAS (SF370) (SEQ ID NO: 1)
was maize codon optimized per standard techniques known in the art
and the potato ST-LS1 intron (SEQ ID NO: 2) was introduced in order
to eliminate its expression in E. coli and Agrobacterium (FIG. 1A).
To facilitate nuclear localization of the Cas9 protein in maize
cells, Simian virus 40 (SV40) monopartite amino terminal nuclear
localization signal (MAPKKKRKV, SEQ ID NO: 3) and Agrobacterium
tumefaciens bipartite VirD2 T-DNA border endonuclease carboxyl
terminal nuclear localization signal (KRPRDRHDGELGGRKRAR, SEQ ID
NO: 4) were incorporated at the amino and carboxyl-termini of the
Cas9 open reading frame (FIG. 1A), respectively. The maize
optimized Cas9 gene was operably linked to a maize constitutive or
regulated promoter by standard molecular biological techniques. An
example of the maize optimized Cas9 expression cassette (SEQ ID NO:
5) is illustrated in FIG. 1A. FIG. 1A shows a maize optimized Cas9
gene containing the ST-LS1 intron, SV40 amino terminal nuclear
localization signal (NLS) and VirD2 carboxyl terminal NLS driven by
a plant Ubiquitin promoter.
[0647] The second component necessary to form a functional guide
RNA/Cas endonuclease system for genome engineering applications is
a duplex of the crRNA and tracrRNA molecules or a synthetic fusion
of the crRNA and tracrRNA molecules (a guide RNA). To confer
efficient guide RNA expression (or expression of the duplexed crRNA
and tracrRNA) in maize, the maize U6 polymerase III promoter (SEQ
ID NO: 9) and maize U6 polymerase III terminator (first 8 bases of
SEQ ID NO: 10) residing on chromosome 8 were isolated and operably
fused to the termini of a guide RNA (FIG. 1B) using standard
molecular biology techniques. Two different guide RNA
configurations were developed for testing in maize, a short guide
RNA (SEQ ID NO: 11) based on Jinek et al. (2012) Science 337:816-21
and a long guide RNA (SEQ ID NO: 8) based on Mali et al. (2013)
Science 339:823-26. An example expression cassette (SEQ ID NO: 12)
is shown in FIG. 1B which illustrates a maize U6 polymerase III
promoter driving expression of a long guide RNA terminated with a
U6 polymerase III terminator.
[0648] As shown in FIGS. 2A and 2B, the guide RNA or crRNA
molecule contains a region complementary to one strand of the
double strand DNA target (referred to as the variable targeting
domain) that is approximately 12-30 nucleotides in length and
upstream of a PAM sequence (5'NGG3' on antisense strand of FIG.
2A-2B, corresponding to 5'CCN3' on sense strand of FIG. 2A-2B) for
target site recognition and cleavage (Gasiunas et al. (2012) Proc.
Natl. Acad. Sci. USA 109:E2579-86, Jinek et al. (2012) Science
337:816-21, Mali et al. (2013) Science 339:823-26, and Cong et al.
(2013) Science 339:819-23). To facilitate the rapid introduction of
maize genomic DNA target sequences into the crRNA or guide RNA
expression constructs, two Type IIS BbsI restriction endonuclease
target sites were introduced in an inverted tandem orientation with
cleavage orientated in an outward direction as described in Cong et
al. (2013) Science 339:819-23. Upon cleavage, the Type IIS
restriction endonuclease excises its target sites from the crRNA or
guide RNA expression plasmid, generating overhangs allowing for the
in-frame directional cloning of duplexed oligos containing the
desired maize genomic DNA target site into the variable targeting
domain. In this example, only target sequences starting with a G
nucleotide were used to promote favorable polymerase III expression
of the guide RNA or crRNA.
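The target-site criteria described above (a variable targeting domain that begins with a G nucleotide and lies immediately upstream of a 5'NGG3' PAM) can be sketched in code. The following Python sketch is illustrative only and is not part of the disclosed methods. The function names, the default domain length of 20 nucleotides, and the assumption that the input contains only A, C, G and T are hypothetical choices made for this example.

```python
# Illustrative sketch only: scan both strands of a DNA sequence for candidate
# target sites that start with G and are followed directly by an NGG PAM.
# Names and defaults here are assumptions, not taken from the disclosure.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target_sites(seq, length=20):
    """Return (strand, start, target, pam) tuples for candidate sites.

    `start` is the 0-based offset on the scanned strand: the given sequence
    for "+", its reverse complement for "-".
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # The last usable start leaves room for the target plus a 3-nt PAM.
        for i in range(len(s) - length - 2):
            target = s[i:i + length]
            pam = s[i + length:i + length + 3]
            if target[0] == "G" and pam[1:] == "GG":
                hits.append((strand, i, target, pam))
    return hits


if __name__ == "__main__":
    # LIGCas-3 target and PAM from Table 1, flanked by arbitrary T bases.
    example = "TTT" + "GCGTACGCGTACGTGTG" + "AGG" + "TTT"
    for hit in find_target_sites(example, length=17):
        print(hit)
```

In this sketch the length is a parameter because the variable targeting domains listed in Table 1 differ in length; a real design workflow would also need to consider genome-wide uniqueness, which is not modeled here.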
[0649] Expression of both the Cas endonuclease gene and the guide
RNA then allows for the formation of the guide RNA/Cas complex
depicted in FIG. 2B (SEQ ID NO: 8). Alternatively, expression of
the Cas endonuclease gene, crRNA, and tracrRNA allows for the
formation of the crRNA/tracrRNA/Cas complex as depicted in FIG. 2A
(SEQ ID NOs: 6-7).
Example 2
[0650] The Guide RNA/Cas Endonuclease System Cleaves Chromosomal
DNA in Maize and Introduces Mutations by Imperfect Non-Homologous
End-Joining
[0651] To test whether the maize optimized guide RNA/Cas
endonuclease described in example 1 could recognize, cleave, and
mutate maize chromosomal DNA through imprecise non-homologous
end-joining (NHEJ) repair pathways, three different genomic target
sequences in 5 maize loci were targeted for cleavage (see Table 1)
and examined by deep sequencing for the presence of NHEJ
mutations.
TABLE-US-00001 TABLE 1 Maize genomic target sites targeted by a
guide RNA/Cas endonuclease system.
Locus  Location               Guide RNA Used  Target Site Designation  Maize Genomic Target Site Sequence  PAM Sequence  SEQ ID NO:
MS26   Chr. 1: 51.81 cM       Long            MS26Cas-1                GTACTCCATCCGCCCCATCGAGTA            GGG           13
                              Long            MS26Cas-2                GCACGTACGTCACCATCCCGC               CGG           14
                              Long            MS26Cas-3                GACGTACGTGCCCTACTCGAT               GGG           15
LIG    Chr. 2: 28.45 cM       Long            LIGCas-1                 GTACCGTACGTGCCCCGGCGG               AGG           16
                              Long            LIGCas-2                 GGAATTGTACCGTACGTGCCC               CGG           17
                              Long            LIGCas-3                 GCGTACGCGTACGTGTG                   AGG           18
MS45   Chr. 9: 119.15 cM      Long            MS45Cas-1                GCTGGCCGAGGTCGACTAC                 CGG           19
                              Long            MS45Cas-2                GGCCGAGGTCGACTACCGGC                CGG           20
                              Long            MS45Cas-3                GGCGCGAGCTCGTGCTTCAC                CGG           21
ALS    Chr. 4: 107.73 cM and  Long            ALSCas-1                 GGTGCCAATCATGCGTCG                  CGG           22
       Chr. 5: 115.49 cM      Long            ALSCas-2                 GGTCGCCATCACGGGAC                   AGG           23
                              Long            ALSCas-3                 GTCGCGGCACCTGTCCCGTGA               TGG           24
EPSPS  Chr. 9: 69.43 cM       Long            EPSPSCas-1               GGAATGCTGGAACTGCAATG                CGG           25
                              Long            EPSPSCas-2               GCAGCTCTTCTTGGGGAATGC               TGG           26
                              Long            EPSPSCas-3               GCAGTAACAGCTGCTGTCAA                TGG           27
MS26 = Male Sterility Gene 26, LIG = Liguleless 1 Gene Promoter,
MS45 = Male Sterility Gene 45, ALS = Acetolactate Synthase Gene,
EPSPS = Enolpyruvylshikimate Phosphate Synthase Gene
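The entries of Table 1 can be checked against the design rules stated in Example 1 (a target beginning with G, a variable targeting domain of approximately 12-30 nucleotides, and an NGG PAM). The following Python sketch is a hypothetical consistency check, not part of the disclosed methods. The sequences are transcribed from Table 1 with wrapped cell fragments joined, and the function name and the exact 12-30 bounds are assumptions made for illustration.

```python
# Illustrative sketch only: validate Table 1 entries against the stated rules.
import re

# (designation, target sequence, PAM), transcribed from Table 1.
TABLE_1 = [
    ("MS26Cas-1", "GTACTCCATCCGCCCCATCGAGTA", "GGG"),
    ("MS26Cas-2", "GCACGTACGTCACCATCCCGC", "CGG"),
    ("MS26Cas-3", "GACGTACGTGCCCTACTCGAT", "GGG"),
    ("LIGCas-1", "GTACCGTACGTGCCCCGGCGG", "AGG"),
    ("LIGCas-2", "GGAATTGTACCGTACGTGCCC", "CGG"),
    ("LIGCas-3", "GCGTACGCGTACGTGTG", "AGG"),
    ("MS45Cas-1", "GCTGGCCGAGGTCGACTAC", "CGG"),
    ("MS45Cas-2", "GGCCGAGGTCGACTACCGGC", "CGG"),
    ("MS45Cas-3", "GGCGCGAGCTCGTGCTTCAC", "CGG"),
    ("ALSCas-1", "GGTGCCAATCATGCGTCG", "CGG"),
    ("ALSCas-2", "GGTCGCCATCACGGGAC", "AGG"),
    ("ALSCas-3", "GTCGCGGCACCTGTCCCGTGA", "TGG"),
    ("EPSPSCas-1", "GGAATGCTGGAACTGCAATG", "CGG"),
    ("EPSPSCas-2", "GCAGCTCTTCTTGGGGAATGC", "TGG"),
    ("EPSPSCas-3", "GCAGTAACAGCTGCTGTCAA", "TGG"),
]

PAM_PATTERN = re.compile(r"[ACGT]GG")


def check_site(target, pam, min_len=12, max_len=30):
    """Return a list of rule violations for one target/PAM pair."""
    problems = []
    if not target.startswith("G"):
        problems.append("target does not start with G")
    if not min_len <= len(target) <= max_len:
        problems.append("target length outside %d-%d nt" % (min_len, max_len))
    if not PAM_PATTERN.fullmatch(pam):
        problems.append("PAM is not NGG")
    return problems


if __name__ == "__main__":
    for name, target, pam in TABLE_1:
        print(name, len(target), check_site(target, pam) or "ok")
```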
[0652] The maize optimized Cas9 endonuclease and long guide RNA
expression cassettes containing the specific maize variable
targeting domains were co-delivered to 60-90 Hi-II immature maize
embryos by particle-mediated delivery (see Example 10) in the
presence of BBM and WUS2 genes (see Example 11). Hi-II maize
embryos transformed with either the LIG3-4 or MS26++ homing
endonucleases (see Example 9) targeting the same maize genomic loci
as the LIGCas or MS26Cas target sites served as a positive control
and embryos transformed with only the Cas9 or guide RNA expression
cassette served as negative controls. After 7 days, the 20-30 most
uniformly transformed embryos from each treatment were pooled and
total genomic DNA was extracted. The region surrounding the
intended target site was PCR amplified with Phusion® High
Fidelity PCR Master Mix (New England Biolabs, M0531L) adding on
the sequences necessary for amplicon-specific barcodes and Illumina
sequencing using "tailed" primers through two rounds of PCR. The
primers used in the primary PCR reaction are shown in Table 2 and
the primers used in the secondary PCR reaction were
AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACG (forward, SEQ ID NO:
53) and CAAGCAGAAGACGGCATA (reverse, SEQ ID NO: 54).
TABLE-US-00002 TABLE 2 PCR primer sequences
Target Site          Primer Orientation  Primary PCR Primer Sequence                                      SEQ ID NO:
MS26Cas-1            Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGGACCGGAAGCTCGCCGCGT         28
MS26Cas-1            Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTTCCTGGAGGACGACGTGCTG           29
MS26Cas-2            Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGGTCCTGGAGGACGACGTGCTG      30
MS26Cas-2            Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCCGGAAGCTCGCCGCGT              31
MS26Cas-3            Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCCTCCGGAAGCTCGCCGCGT         32
MS26Cas-3            Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTTCCTGGAGGACGACGTGCTG           29
MS26 Meganuclease    Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTTCCTCCTGGAGGACGACGTGCTG      33
MS26 Meganuclease    Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCCGGAAGCTCGCCGCGT              31
LIGCas-1             Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGGACTGTAACGATTTACGCACCTGCTG  34
LIGCas-1             Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCAAATGAGTAGCAGCGCACGTAT       35
LIGCas-2             Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCCTCTGTAACGATTTACGCACCTGCTG  36
LIGCas-2             Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCAAATGAGTAGCAGCGCACGTAT       35
LIGCas-3             Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGGCGCAAATGAGTAGCAGCGCAC     37
LIGCas-3             Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCACCTGCTGGGAATTGTACCGTA        38
LIG3-4 Meganuclease  Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTCCTTCGCAAATGAGTAGCAGCGCAC     39
LIG3-4 Meganuclease  Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCACCTGCTGGGAATTGTACCGTA        38
MS45Cas-1            Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGGAGGACCCGTTCGGCCTCAGT       40
MS45Cas-1            Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCCGGCTGGCATTGTCTCTG           41
MS45Cas-2            Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCCTGGACCCGTTCGGCCTCAGT       42
MS45Cas-2            Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCCGGCTGGCATTGTCTCTG           41
MS45Cas-3            Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTGAAGGGACCCGTTCGGCCTCAGT       43
MS45Cas-3            Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCCGGCTGGCATTGTCTCTG           41
ALSCas-1             Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGGCGACGATGGGCGTCTCCTG       44
ALSCas-1             Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCGTCTGCATCGCCACCTC            45
ALSCas-2             Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTTCCCGACGATGGGCGTCTCCTG       46
ALSCas-2             Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCGTCTGCATCGCCACCTC            45
ALSCas-3             Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTGGAACGACGATGGGCGTCTCCTG       47
ALSCas-3             Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCGTCTGCATCGCCACCTC            45
EPSPSCas-1           Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTGGAAGAGGAAACATACGTTGCATTTCCA  48
EPSPSCas-1           Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGGTGGAAAGTTCCCAGTTGAGGA        49
EPSPSCas-2           Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGCGGTGGAAAGTTCCCAGTTGAGGA   50
EPSPSCas-2           Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGAGGAAACATACGTTGCATTTCCA       51
EPSPSCas-3           Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTCCTTGAGGAAACATACGTTGCATTTCCA  52
EPSPSCas-3           Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGGTGGAAAGTTCCCAGTTGAGGA        49
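The two-round "tailed" primer scheme can be examined by comparing the secondary PCR primers (SEQ ID NOs: 53-54) with the 5' ends of the primary PCR primers in Table 2. The following Python sketch is illustrative only. It assumes, as an interpretation not stated in the text, that the 3' end of each secondary primer anneals to the common 5' tail introduced by the corresponding primary primer; the helper name is hypothetical, and the MS26Cas-1 primers are used as representative examples.

```python
# Illustrative sketch only: measure how much of each secondary primer's 3' end
# matches the 5' tail of a primary primer (suffix/prefix overlap).

SECONDARY_FWD = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACG"  # SEQ ID NO: 53
SECONDARY_REV = "CAAGCAGAAGACGGCATA"                           # SEQ ID NO: 54

# MS26Cas-1 primary primers from Table 2 (SEQ ID NOs: 28 and 29).
PRIMARY_FWD_MS26CAS1 = "CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGGACCGGAAGCTCGCCGCGT"
PRIMARY_REV_MS26CAS1 = "CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTTCCTGGAGGACGACGTGCTG"


def longest_overlap(upstream, downstream):
    """Length of the longest suffix of `upstream` that is a prefix of `downstream`."""
    for k in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream[-k:] == downstream[:k]:
            return k
    return 0


if __name__ == "__main__":
    print("forward overlap:", longest_overlap(SECONDARY_FWD, PRIMARY_FWD_MS26CAS1))
    print("reverse overlap:", longest_overlap(SECONDARY_REV, PRIMARY_REV_MS26CAS1))
```

With the sequences as transcribed, the forward secondary primer shares its last 20 nucleotides with the start of the forward primary primer, and the reverse secondary primer (18 nucleotides) matches the start of the reverse primary primer in full.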
[0653] The resulting PCR amplifications were purified with a Qiagen
PCR purification spin column, concentration measured with a Hoechst
dye-based fluorometric assay, combined in an equimolar ratio, and
single read 100 nucleotide-length deep sequencing was performed on
Illumina's MiSeq Personal Sequencer with a 30-40% (v/v) spike of
PhiX control v3 (Illumina, FC-110-3001) to off-set sequence bias.
Only those reads with a ≥1 nucleotide indel arising within
the 10 nucleotide window centered over the expected site of
cleavage and not found in a similar level in the negative control
were classified as NHEJ mutations. NHEJ mutant reads with the same
mutation were counted and collapsed into a single read and the top
10 most prevalent mutations were visually confirmed as arising
within the expected site of cleavage. The total numbers of visually
confirmed NHEJ mutations were then used to calculate the % mutant
reads based on the total number of reads of an appropriate length
containing a perfect match to the barcode and forward primer.
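The read classification and percentage calculation described above can be sketched as follows. This Python sketch is illustrative only and does not reproduce the actual analysis pipeline, which is not disclosed in detail. The representation of an indel as a (kind, position, length) tuple, the rule that an indel "arises within" the window when its start position falls inside it, and all function names are assumptions made for this example. The comparison against the negative control and the visual confirmation step are not modeled.

```python
# Illustrative sketch only: classify indels near the expected cleavage site,
# collapse identical mutations, and compute the percentage of mutant reads.
from collections import Counter


def arises_in_window(indel_pos, indel_len, cut_site, window=10):
    """True for an indel of >= 1 nt starting inside the window centred on the cut site."""
    half = window // 2
    return indel_len >= 1 and cut_site - half <= indel_pos < cut_site + half


def collapse_mutations(indels, cut_site, top_n=10):
    """Count identical in-window mutations and return the most prevalent ones."""
    counts = Counter(
        (kind, pos, length)
        for kind, pos, length in indels
        if arises_in_window(pos, length, cut_site)
    )
    return counts.most_common(top_n)


def percent_mutant_reads(mutant_reads, total_reads):
    """Percentage of mutant reads among all qualifying reads."""
    return 100.0 * mutant_reads / total_reads


if __name__ == "__main__":
    # Hypothetical indel calls: (kind, start position, length in nt).
    calls = [("del", 48, 1), ("del", 48, 1), ("ins", 50, 1), ("del", 10, 3)]
    print(collapse_mutations(calls, cut_site=50))
    # Read counts for LIGCas-1 guide/Cas9 as reported in Table 3.
    print(round(percent_mutant_reads(33050, 716854), 2))
```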
[0654] The frequency of NHEJ mutations recovered by deep sequencing
for the guide RNA/Cas endonuclease system targeting the three
LIGCas targets (SEQ ID NOS: 16, 17, 18) compared to the LIG3-4
homing endonuclease targeting the same locus is shown in Table 3.
The ten most prevalent types of NHEJ mutations recovered based on
the guide RNA/Cas endonuclease system compared to the LIG3-4 homing
endonuclease are shown in FIG. 3A (corresponding to SEQ ID NOs:
55-75) and FIG. 3B (corresponding to SEQ ID NOs: 76-96).
Approximately 12-23 fold higher frequencies of NHEJ mutations were
observed when using a guide RNA/Cas system to introduce a double
strand break at a maize genomic target site (Cas target sites),
relative to the LIG3-4 homing endonuclease control. As shown in
Table 4, a similar difference between the guide RNA/Cas system and
meganuclease double-strand break technologies was observed at the
MS26 locus with approximately 14-25 fold higher frequencies of NHEJ
mutations when a guide RNA/Cas endonuclease system was used. High
frequencies of NHEJ mutations were also recovered at the MS45, ALS
and EPSPS Cas targets (see Table 5) when using a guide RNA/Cas
endonuclease system. These data indicate that the guide RNA/Cas9
endonuclease system described herein can be effectively used to
introduce an alteration at genomic sites of interest such as those
related to male fertility, wherein an alteration results in the
creation of a male sterile gene locus and male sterile plants.
Altering the EPSPS target can result in the production of plants
that are tolerant and/or resistant against glyphosate based
herbicides. Altering the acetolactate synthase (ALS) gene target
site can result in the production of plants that are tolerant
and/or resistant to imidazolinone and sulphonylurea herbicides.
TABLE-US-00003 TABLE 3 Percent (%) mutant reads at maize Liguleless
1 target locus produced by a guide RNA/Cas system versus a homing
endonuclease system.
System                      Total Number of Reads  Number of Mutant Reads  % Mutant Reads
Cas9 Only Control           640,063                1                       0.00%
guide RNA Only Control      646,774                1                       0.00%
LIG3-4 Homing Endonuclease  616,536                1,211                   0.20%
LIGCas-1 guide/Cas9         716,854                33,050                  4.61%
LIGCas-2 guide/Cas9         711,047                16,675                  2.35%
LIGCas-3 guide/Cas9         713,183                27,959                  3.92%
TABLE-US-00004 TABLE 4 Percent (%) mutant reads at maize Male
Sterility 26 target locus produced by a guide RNA/Cas system versus
a homing endonuclease.
System                      Total Number of Reads  Number of Mutant Reads  % Mutant Reads
Cas9 Only Control           403,123                15                      0.00%
MS26++ Homing Endonuclease  512,784                642                     0.13%
MS26Cas-1 guide/Cas9        575,671                10,073                  1.75%
MS26Cas-2 guide/Cas9        543,856                16,930                  3.11%
MS26Cas-3 guide/Cas9        538,141                13,879                  2.58%
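The fold differences quoted in paragraph [0654] can be reproduced from the read counts in Tables 3 and 4. The following Python sketch is a worked illustration only; the function and variable names are hypothetical, and the calculation (a ratio of mutant-read fractions) is an assumed reading of how the fold values were derived.

```python
# Illustrative sketch only: fold difference in mutant-read fraction between
# each guide RNA/Cas9 treatment and the homing endonuclease control.

def mutant_fraction(mutant_reads, total_reads):
    """Fraction of reads classified as NHEJ mutants."""
    return mutant_reads / total_reads


def fold_over_control(sample, control):
    """Ratio of mutant-read fractions; each argument is (mutant, total)."""
    return mutant_fraction(*sample) / mutant_fraction(*control)


# (mutant reads, total reads) from Table 3 (LIG) and Table 4 (MS26).
LIG_CONTROL = (1211, 616536)    # LIG3-4 homing endonuclease
LIG_SAMPLES = {
    "LIGCas-1": (33050, 716854),
    "LIGCas-2": (16675, 711047),
    "LIGCas-3": (27959, 713183),
}
MS26_CONTROL = (642, 512784)    # MS26++ homing endonuclease
MS26_SAMPLES = {
    "MS26Cas-1": (10073, 575671),
    "MS26Cas-2": (16930, 543856),
    "MS26Cas-3": (13879, 538141),
}

if __name__ == "__main__":
    for name, counts in LIG_SAMPLES.items():
        print(name, round(fold_over_control(counts, LIG_CONTROL), 1))
    for name, counts in MS26_SAMPLES.items():
        print(name, round(fold_over_control(counts, MS26_CONTROL), 1))
```

Under this reading, the smallest and largest ratios round to 12 and 23 for the LIG locus and to 14 and 25 for the MS26 locus, consistent with the ranges stated in the text.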
TABLE-US-00005 TABLE 5 Percent (%) mutant reads at maize Male
Sterility 45, Acetolactate Synthase and Enolpyruvylshikimate
Phosphate Synthase target loci produced by the guide RNA/Cas
system.
System                      Total Number of Reads  Number of Mutant Reads  % Mutant Reads
Cas9 Only Control (MS45)    899,500                27                      0.00%
MS45Cas-1 guide/Cas9        812,644                3,795                   0.47%
MS45Cas-2 guide/Cas9        785,183                14,704                  1.87%
MS45Cas-3 guide/Cas9        728,023                9,203                   1.26%
Cas9 Only Control (ALS)     534,764                19                      0.00%
ALSCas-1 guide/Cas9         434,452                9,669                   2.23%
ALSCas-2 guide/Cas9         472,351                6,352                   1.345%
ALSCas-3 guide/Cas9         497,786                8,535                   1.715%
Cas9 Only Control (EPSPS)   1,347,086              6                       0.00%
EPSPSCas-1 guide/Cas9       1,420,274              13,051                  0.92%
EPSPSCas-2 guide/Cas9       1,225,082              26,340                  2.15%
EPSPSCas-3 guide/Cas9       1,406,905              53,603                  3.81%
[0655] Taken together, our data indicate that the maize optimized
guide RNA/Cas endonuclease system described herein using a long
guide RNA expression cassette efficiently cleaves maize chromosomal
DNA and generates imperfect NHEJ mutations at frequencies greater
than the engineered LIG3-4 and MS26++ homing endonucleases.
Example 3
[0656] Long Guide RNA of the Maize Optimized Guide RNA/Cas
Endonuclease System Cleaves Maize Chromosomal DNA More Efficiently
than the Short Guide RNA
[0657] To determine the most effective guide RNA (comprising a
fusion of the crRNA and tracrRNA) for use in maize, the recovery of
NHEJ mutations using a short guide RNA (SEQ ID NO: 11) based on
Jinek et al. (2012) Science 337:816-21 and a long guide RNA (SEQ ID
NO: 8) based on Mali et al. (2013) Science 339:823-26 was
examined.
[0658] The variable targeting domains of the guide RNA targeting
the maize genomic target sites at the LIG locus (LIGCas-1, LIGCas-2
and LIGCas-3, SEQ ID NOs: 16, 17 and 18, Table 1) were introduced
into both the maize optimized long and short guide RNA expression
cassettes as described in Example 1 and co-transformed along with
the maize optimized Cas9 endonuclease expression cassette into
immature maize embryos and deep sequenced for NHEJ mutations as
described in Example 2. Embryos transformed with only the Cas9
endonuclease expression cassette served as a negative control.
[0659] As shown in Table 6 below, the frequency of NHEJ mutations
recovered with the long guide RNA far exceeded those obtained with
the short guide RNA. This data indicates that the long guide RNA
paired with the maize optimized Cas9 endonuclease gene described
herein more efficiently cleaves maize chromosomal DNA.
TABLE-US-00006 TABLE 6 Percent (%) mutant reads at the maize
Liguleless 1 target locus produced by a guide RNA/Cas system with a
long versus a short guide RNA.
System               guide RNA Used  Total Number of Reads  Number of Mutant Reads  % Mutant Reads
Cas9 Only            N/A             640,063                1                       0.00%
LIGCas-1 guide/Cas9  Short           676,870                43                      0.01%
LIGCas-2 guide/Cas9  Short           747,945                91                      0.01%
LIGCas-3 guide/Cas9  Short           655,157                10                      0.00%
LIGCas-1 guide/Cas9  Long            716,854                33,050                  4.61%
LIGCas-2 guide/Cas9  Long            711,047                16,675                  2.35%
LIGCas-3 guide/Cas9  Long            713,183                27,959                  3.92%
Example 4
[0660] The Guide RNA/Cas Endonuclease System May be Multiplexed to
Simultaneously Target Multiple Chromosomal Loci in Maize for
Mutagenesis by Imperfect Non-Homologous End-Joining
[0661] To test if multiple chromosomal loci may be simultaneously
mutagenized with the guide RNA/maize optimized Cas endonuclease
system described herein, the long guide RNA expression cassettes
targeting the MS26Cas-2 target site (SEQ ID NO: 14), the LIGCas-3
target site (SEQ ID NO: 18) and the MS45Cas-2 target site (SEQ ID
NO: 20), were co-transformed into maize embryos either in duplex or
in triplex along with the Cas9 endonuclease expression cassette and
examined by deep sequencing for the presence of imprecise NHEJ
mutations as described in Example 2.
[0662] Hi-II maize embryos co-transformed with the Cas9 expression
cassette and the corresponding guide RNA expression cassette singly
served as a positive control and embryos transformed with only the
Cas9 expression cassette served as a negative control.
[0663] As shown in Table 7 below, mutations resulting from
imprecise NHEJ were recovered at all relevant loci when multiple
guide RNA expression cassettes were simultaneously introduced
either in duplex or triplex with frequencies of mutant reads near
those of the positive control. This demonstrates that the maize
optimized guide RNA/Cas endonuclease system described herein may be
used to simultaneously introduce imprecise NHEJ mutations at
multiple loci in maize.
TABLE-US-00007 TABLE 7 Percent (%) mutant reads at maize target
loci produced by a multiplexed guide RNA/Cas system.
Target Site Examined            guide RNAs Co-transformed Individually,  Total Number  Number of     % Mutant
for NHEJ Mutations              in Duplex, or in Triplex with Cas9       of Reads      Mutant Reads  Reads
LIGCas-3, MS26Cas-2, MS45Cas-2  None (Cas9 Only control)                 527,691       9             0.00%
LIGCas-3                        LIGCas-3                                 645,107       12,631        1.96%
LIGCas-3                        LIGCas-3, MS26Cas-2                      579,992       10,348        1.78%
LIGCas-3                        LIGCas-3, MS26Cas-2, MS45Cas-2           648,901       12,094        1.86%
MS26Cas-2                       MS26Cas-2                                699,154       17,247        2.47%
MS26Cas-2                       LIGCas-3, MS26Cas-2                      717,158       10,256        1.43%
MS26Cas-2                       MS26Cas-2, MS45Cas-2                     613,431       9,931         1.62%
MS26Cas-2                       LIGCas-3, MS26Cas-2, MS45Cas-2           471,890       7,311         1.55%
MS45Cas-2                       MS45Cas-2                                503,423       10,034        1.99%
MS45Cas-2                       MS26Cas-2, MS45Cas-2                     480,178       8,008         1.67%
MS45Cas-2                       LIGCas-3, MS26Cas-2, MS45Cas-2           416,711       7,190         1.73%
Example 5
[0664] Guide RNA/Cas Endonuclease Mediated DNA Cleavage in Maize
Chromosomal Loci can Stimulate Homologous Recombination
Repair-Mediated Transgene Insertion
[0665] To test the utility of the maize optimized guide RNA/Cas
system described herein to cleave maize chromosomal loci and
stimulate homologous recombination (HR) repair pathways to
site-specifically insert a transgene, a HR repair DNA vector (also
referred to as a donor DNA) (SEQ ID NO: 97) was constructed as
illustrated in FIG. 4 using standard molecular biology techniques
and co-transformed with a long guide RNA expression cassette,
comprising a variable targeting domain corresponding to the
LIGCas-3 genomic target site, and a Cas9 endonuclease expression
cassette into immature maize embryos as described in Example 2.
[0666] Maize embryos co-transformed with the HR repair DNA vector
and LIG3-4 homing endonuclease (see Example 9) targeting the same
genomic target site as LIGCas-3 served as a positive control. Since
successful delivery of the HR repair DNA vector confers bialaphos
herbicide resistance, callus events containing putative HR-mediated
transgenic insertions were selected by placing the callus on
herbicide containing media. After selection, stable callus events
were sampled, total genomic DNA extracted, and using the primer
pairs shown in FIG. 5 (corresponding to SEQ ID NOs: 98-101), PCR
amplification was carried out at both possible transgene genomic
DNA junctions to identify putative HR-mediated transgenic
insertions. The resulting amplifications were sequenced for
confirmation.
[0667] Sequence confirmed PCR amplifications indicating
site-specific transgene insertion for the guide RNA/Cas system were
detected for 37 out of 384 stable transformants with 15 containing
amplifications across both transgene genomic DNA junctions
indicating near perfect site-specific transgene insertion. The
LIG3-4 homing endonuclease positive control yielded PCR
amplifications indicating site-specific transgene insertion for 3
out of 192 stable transformants with 1 containing amplifications
across both transgene genomic DNA junctions. The data clearly
demonstrate that maize chromosomal loci cleaved with the maize
demonstrates that maize chromosomal loci cleaved with the maize
optimized guide RNA/Cas system described herein can be used to
stimulate HR repair pathways to site-specifically insert transgenes
at frequencies greater than the LIG3-4 homing endonuclease.
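The insertion frequencies implied by the counts in paragraph [0667] can be worked out directly. The following Python sketch is illustrative only; the dictionary keys and function name are hypothetical, and the reading that the 37 (or 3) events are those with at least one confirmed junction, of which 15 (or 1) had both junctions confirmed, is an interpretation of the text.

```python
# Illustrative sketch only: percentage of stable transformants showing
# site-specific transgene insertion, from the counts reported in the text.

RESULTS = {
    "guide RNA/Cas9 (LIGCas-3)": {
        "transformants": 384, "any_junction": 37, "both_junctions": 15,
    },
    "LIG3-4 homing endonuclease": {
        "transformants": 192, "any_junction": 3, "both_junctions": 1,
    },
}


def event_frequency(events, transformants):
    """Percentage of stable transformants carrying the given type of event."""
    return 100.0 * events / transformants


def summarize(results):
    """Map each system to (% with any junction, % with both junctions)."""
    return {
        name: (
            event_frequency(r["any_junction"], r["transformants"]),
            event_frequency(r["both_junctions"], r["transformants"]),
        )
        for name, r in results.items()
    }


if __name__ == "__main__":
    for name, (any_pct, both_pct) in summarize(RESULTS).items():
        print("%s: %.1f%% any junction, %.1f%% both junctions"
              % (name, any_pct, both_pct))
```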
Example 6
[0668] Guide RNA/Cas Endonuclease System Transformed Together on a
Single Vector Results in Greater Recovery of Imperfect
Non-Homologous End-Joining Mutations
[0669] To evaluate different delivery methods for the maize
optimized guide RNA/Cas endonuclease system described herein, the
recovery of NHEJ mutations when the guide RNA/Cas expression
cassettes were either co-transformed as separate DNA vectors as in
Examples 2, 3, 4 and 5 or transformed as a single vector DNA
(comprising both guide RNA and Cas endonuclease expression
cassettes, as shown in FIG. 1C) was examined.
[0670] The long guide RNA expression cassette for LIGCas-3 and the
Cas9 expression cassette were consolidated onto a single vector DNA
(FIG. 1C, SEQ ID NO: 102) by standard molecular biology techniques
and transformed into immature Hi-II maize embryos as described in
Examples 10 and 11 by particle-mediated delivery. Hi-II embryos
co-transformed with the Cas9 and LIGCas-3 long guide RNA expression
cassettes served as a positive control while embryos transformed
with only the Cas9 expression cassette served as a negative
control. Deep sequencing for NHEJ mutations was performed as
described in Example 2.
[0671] As shown in Table 8 below, the frequency of NHEJ mutations
recovered when the Cas endonuclease and long guide RNA expression
cassettes were delivered together as a single vector DNA was
approximately 2-fold greater than that observed from the equivalent
co-transformation experiment. This indicates that delivery of the
guide RNA/Cas system expression cassettes together on a single
vector DNA results in a greater recovery of imperfect
non-homologous end-joining mutations.
TABLE-US-00008 TABLE 8 Percent (%) mutant reads at the maize
Liguleless 1 target locus produced by a guide RNA/Cas system with
Cas9 and guide RNA expression cassettes combined into one DNA
vector versus two separate DNA vectors.
System                                   Total Number of Reads  Number of Mutant Reads  % Mutant Reads
Cas9 Only Control                        1,519,162              97                      0.01%
LIGCas-3 guide/Cas9 (Two vector DNAs)    1,515,0607             36,346                  2.40%
LIGCas-3 guide/Cas9 (Single vector DNA)  1,860,031              105,854                 5.69%
Example 7
[0672] Delivery Methods for Plant Genome Editing Using the Guide
RNA/Cas Endonuclease System
[0673] This example describes methods to deliver or maintain and
express the Cas9 endonuclease and guide RNA (or individual crRNA
and tracrRNAs) into, or within plants, respectively, to enable
directed DNA modification or gene insertion via homologous
recombination. More specifically this example describes a variety
of methods which include, but are not limited to, delivery of the
Cas9 endonuclease as a DNA, RNA (5'-capped and polyadenylated) or
protein molecule. In addition, the guide RNA may be delivered as a
DNA or RNA molecule.
[0674] As shown in Example 2, a high mutation frequency was observed
when Cas9 endonuclease and guide RNA were delivered as DNA vectors
by biolistic transformation of immature corn embryos. In other
embodiments of this disclosure, the Cas9 endonuclease can be
delivered as DNA, RNA or protein, and the guide RNA can be delivered
as a DNA or RNA molecule, or as a duplex crRNA/tracrRNA molecule
delivered as RNA, DNA or a combination. Various combinations of Cas9
endonuclease, guide RNA and crRNA/tracrRNA delivery methods can be,
but are not limited to, the methods shown in Table 9.
TABLE 9. Various combinations of delivery of the Cas9 endonuclease,
guide RNA or crRNA + tracrRNA.

Combination   Components delivered (delivery method is shown between brackets)
1             Cas9 (DNA vector), guide RNA (DNA vector)
2             Cas9 (DNA vector), guide RNA (RNA)
3             Cas9 (RNA), guide RNA (DNA)
4             Cas9 (RNA), guide RNA (RNA)
5             Cas9 (Protein), guide RNA (DNA)
6             Cas9 (Protein), guide RNA (RNA)
7             Cas9 (DNA vector), crRNA (DNA), tracrRNA (DNA)
8             Cas9 (DNA vector), crRNA (RNA), tracrRNA (DNA)
9             Cas9 (DNA vector), crRNA (RNA), tracrRNA (RNA)
10            Cas9 (DNA vector), crRNA (DNA), tracrRNA (RNA)
11            Cas9 (RNA), crRNA (DNA), tracrRNA (DNA)
12            Cas9 (RNA), crRNA (RNA), tracrRNA (DNA)
13            Cas9 (RNA), crRNA (RNA), tracrRNA (RNA)
14            Cas9 (RNA), crRNA (DNA), tracrRNA (RNA)
15            Cas9 (Protein), crRNA (DNA), tracrRNA (DNA)
16            Cas9 (Protein), crRNA (RNA), tracrRNA (DNA)
17            Cas9 (Protein), crRNA (RNA), tracrRNA (RNA)
18            Cas9 (Protein), crRNA (DNA), tracrRNA (RNA)
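The 18 rows of Table 9 are the full cross of three Cas9 forms with either a single guide RNA (two forms) or a crRNA plus tracrRNA pair (two forms each). A minimal Python sketch, with illustrative names that are not part of the disclosure, enumerates them; the order produced may differ from the row order of Table 9.

```python
from itertools import product

CAS9_FORMS = ("DNA vector", "RNA", "Protein")
NUCLEIC_ACID_FORMS = ("DNA", "RNA")


def delivery_combinations():
    """Enumerate every Cas9 / guide delivery combination as a dict."""
    combos = []
    # Single guide RNA: 3 Cas9 forms x 2 guide forms = 6 combinations.
    for cas9, guide in product(CAS9_FORMS, NUCLEIC_ACID_FORMS):
        combos.append({"Cas9": cas9, "guide RNA": guide})
    # Separate crRNA and tracrRNA: 3 x 2 x 2 = 12 combinations.
    for cas9, cr, tracr in product(CAS9_FORMS, NUCLEIC_ACID_FORMS, NUCLEIC_ACID_FORMS):
        combos.append({"Cas9": cas9, "crRNA": cr, "tracrRNA": tracr})
    return combos


combos = delivery_combinations()
for number, combo in enumerate(combos, start=1):
    parts = ", ".join(f"{name} ({form})" for name, form in combo.items())
    print(f"{number:2d}  {parts}")
```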
[0675] Delivery of the Cas9 (as DNA vector) and guide RNA (as DNA
vector) example (Table 9, combination 1) can also be accomplished by
co-delivering these DNA cassettes on a single or multiple
Agrobacterium vectors and transforming plant tissues by
Agrobacterium mediated transformation. In addition, a vector
containing a constitutive, tissue-specific or conditionally
regulated Cas9 gene can be first delivered to plant cells to allow
for stable integration into the plant genome to establish a plant
line that contains only the Cas9 gene in the plant genome. In this
example, single or multiple guide RNAs, or single or multiple crRNAs
and a tracrRNA, can be delivered as either DNA or RNA, or a
combination, to the plant line containing the genome-integrated
version of the Cas9 gene for the purpose of generating mutations or
promoting homologous recombination when HR repair DNA vectors for
targeted integration are co-delivered with the guide RNAs. As an
extension of this example, a plant line containing the
genome-integrated version of the Cas9 gene and a tracrRNA as a DNA
molecule can also be established. In this example, single or
multiple crRNA molecules can be delivered as RNA or DNA to promote
the generation of mutations or to promote homologous recombination
when HR repair DNA vectors for targeted integration are
co-delivered with crRNA molecule(s) enabling the targeted
mutagenesis or homologous recombination at single or multiple sites
in the plant genome.
Example 8
[0676] Components of the Guide RNA/Cas Endonuclease System
Delivered Directly as RNA in Plants
[0677] This example illustrates the use of the delivery
configuration listed as combination 2 in Table 9 of Example 7 [Cas9
(DNA vector), guide RNA (RNA)] for modification or mutagenesis of
chromosomal loci in plants. The maize optimized Cas9 endonuclease
expression cassette
described in Example 1 was co-delivered by particle gun as
described in Example 2 along with single stranded RNA molecules
(synthesized by Integrated DNA Technologies, Inc.) constituting a
short guide RNA targeting the maize locus and sequence shown in
Table 10. Embryos transformed with only the Cas9 expression
cassette or short guide RNA molecules served as negative controls.
Seven days post-bombardment, the immature embryos were harvested
and analyzed by deep sequencing for NHEJ mutations as described in
Example 2. Mutations not present in the negative controls were
found at the site (FIG. 6, corresponding to SEQ ID NOs: 104-110).
These mutations were similar to those found in Examples 2, 3, 4 and
6. These data indicate that component(s) of the maize optimized
guide RNA/Cas endonuclease system described herein may be delivered
directly as RNA.
TABLE 10. Maize genomic target site and location for short guide RNA
delivered as RNA.

Locus  Location          Guide RNA  Designation  Maize Target Site     PAM       SEQ ID
                         Used                                          Sequence  NO
55     Chr. 1: 51.78 cM  Short      55CasRNA-1   TGGGCAGGTCTCACGACGGT  TGG       103
Example 9
Creation of Rare Cutting Engineered Meganucleases
LIG3-4 Meganuclease and LIG3-4 Intended Recognition Sequence
[0678] An endogenous maize genomic target site comprising the
LIG3-4 intended recognition sequence (SEQ ID NO: 111) was selected
for design of a rare-cutting double-strand break inducing agent
(SEQ ID NO: 112) as described in US patent publication 2009-0133152
A1 (published May 21, 2009). The LIG3-4 intended recognition
sequence is a 22 bp polynucleotide having the following
sequence:
ATATACCTCACACGTACGCGTA (SEQ ID NO: 111).
MS26++ Meganuclease
[0679] An endogenous maize genomic target site designated "TS-MS26"
(SEQ ID NO: 113) was selected for design of a custom double-strand
break inducing agent, MS26++ (as described in U.S. patent application
Ser. No. 13/526,912, filed Jun. 19, 2012). The TS-MS26 target site
is a 22 bp polynucleotide positioned 62 bps from the 5' end of the
fifth exon of the maize MS26 gene and having the following
sequence: gatggtgacgtac gtgccctac (SEQ ID NO: 113). The double
strand break site and overhang region are underlined; the enzyme
cuts after C13, as indicated by the gap in the sequence. Plant optimized nucleotide
sequences for an engineered endonuclease (SEQ ID NO: 114) encoding
an engineered MS26++ endonuclease were designed to bind and make
double-strand breaks at the selected TS-MS26 target site.
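As a quick consistency check on the TS-MS26 description above, the following Python sketch confirms that the site is 22 bp long and that position 13 is a C, then splits the top strand at the stated cut position. It models only the top-strand cut; the overhang on the opposite strand is not represented, and the function name is illustrative.

```python
from typing import Tuple

# TS-MS26 target site (SEQ ID NO: 113) as printed in the text, without the gap.
TS_MS26 = "gatggtgacgtacgtgccctac"
CUT_AFTER = 13  # 1-based position of the last base before the cut (C13)


def split_at_cut(site: str, cut_after: int) -> Tuple[str, str]:
    """Split a target site into the parts 5' and 3' of the cut position."""
    if not 0 < cut_after < len(site):
        raise ValueError("cut position must fall inside the site")
    return site[:cut_after], site[cut_after:]


left, right = split_at_cut(TS_MS26, CUT_AFTER)
print(f"length: {len(TS_MS26)} bp")
print(f"base {CUT_AFTER}: {TS_MS26[CUT_AFTER - 1]}")
print(f"{left} | {right}")
```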
Example 10
Transformation of Maize Immature Embryos
[0680] Transformation can be accomplished by various methods known
to be effective in plants, including particle-mediated delivery,
Agrobacterium-mediated transformation, PEG-mediated delivery, and
electroporation.
[0681] a. Particle-Mediated Delivery
[0682] Transformation of maize immature embryos using particle
delivery is performed as follows. Media recipes follow below.
[0683] The ears are husked and surface sterilized in 30% Clorox
bleach plus 0.5% Micro detergent for 20 minutes, and rinsed two
times with sterile water. The immature embryos are isolated and
placed embryo axis side down (scutellum side up), 25 embryos per
plate, on 560Y medium for 4 hours and then aligned within the
2.5-cm target zone in preparation for bombardment. Alternatively,
isolated embryos are placed on 560L (Initiation medium) and placed
in the dark at temperatures ranging from 26°C to
37°C for 8 to 24 hours prior to placing on 560Y for 4
hours at 26°C prior to bombardment as described above.
[0684] Plasmids containing the double strand break inducing agent
and donor DNA are constructed using standard molecular biology
techniques and co-bombarded with plasmids containing the
developmental genes ODP2 (AP2 domain transcription factor ODP2
(Ovule development protein 2); US20090328252 A1) and Wushel
(US2011/0167516).
[0685] The plasmids and DNA of interest are precipitated onto 0.6
µm (average diameter) gold pellets using a water-soluble
cationic lipid transfection reagent as follows. DNA solution is
prepared on ice using 1 µg of plasmid DNA and optionally other
constructs for co-bombardment such as 50 ng (0.5 µl) of each
plasmid containing the developmental genes ODP2 (AP2 domain
transcription factor ODP2 (Ovule development protein 2);
US20090328252 A1) and Wushel. To the pre-mixed DNA, 20 µl of
prepared gold particles (15 mg/ml) and 5 µl of a water-soluble
cationic lipid transfection reagent are added in water and mixed
carefully. Gold particles are pelleted in a microfuge at 10,000 rpm
for 1 min and supernatant is removed. The resulting pellet is
carefully rinsed with 100 ml of 100% EtOH without resuspending the
pellet and the EtOH rinse is carefully removed. 105 µl of 100%
EtOH is added and the particles are resuspended by brief
sonication. Then, 10 µl is spotted onto the center of each
macrocarrier and allowed to dry about 2 minutes before
bombardment.
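The quantities above imply a nominal DNA and gold load per macrocarrier. The Python sketch below works through that arithmetic under simplifying assumptions that are not stated in the source: all of the plasmid DNA binds to the gold, nothing is lost during the rinse, and the final suspension is uniform. The optional 50 ng helper plasmids are ignored, and the function name is illustrative.

```python
def amount_per_shot(total_amount: float, suspension_ul: float, shot_ul: float) -> float:
    """Amount delivered per macrocarrier, assuming a uniform suspension."""
    if suspension_ul <= 0 or shot_ul <= 0:
        raise ValueError("volumes must be positive")
    return total_amount * shot_ul / suspension_ul


DNA_NG = 1000.0          # 1 ug of plasmid DNA
GOLD_UG = 20 * 15.0      # 20 ul of gold at 15 mg/ml (= 15 ug/ul)
SUSPENSION_UL = 105.0    # final resuspension volume in EtOH
SHOT_UL = 10.0           # volume spotted on each macrocarrier

dna_per_shot_ng = amount_per_shot(DNA_NG, SUSPENSION_UL, SHOT_UL)
gold_per_shot_ug = amount_per_shot(GOLD_UG, SUSPENSION_UL, SHOT_UL)
full_aliquots = int(SUSPENSION_UL // SHOT_UL)

print(f"DNA per shot:  {dna_per_shot_ng:.1f} ng")
print(f"gold per shot: {gold_per_shot_ug:.1f} ug")
print(f"full 10 ul aliquots per tube: {full_aliquots}")
```

Under these assumptions each tube yields ten full aliquots, which matches the ten aliquots per tube mentioned in the bombardment step.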
[0686] Alternatively, the plasmids and DNA of interest are
precipitated onto 1.1 µm (average diameter) tungsten pellets
using a calcium chloride (CaCl2) precipitation procedure by
mixing 100 µl prepared tungsten particles in water, 10 µl (1
µg) DNA in Tris EDTA buffer (1 µg total DNA), 100 µl 2.5 M
CaCl2, and 10 µl 0.1 M spermidine. Each reagent is added
sequentially to the tungsten particle suspension, with mixing. The
final mixture is sonicated briefly and allowed to incubate under
constant vortexing for 10 minutes. After the precipitation period,
the tubes are centrifuged briefly, liquid is removed, and the
particles are washed with 500 ml 100% ethanol, followed by a 30
second centrifugation. Again, the liquid is removed, and 105 µl
of 100% ethanol is added to the final tungsten particle pellet. For
particle gun bombardment, the tungsten/DNA particles are briefly
sonicated. 10 µl of the tungsten/DNA particles is spotted onto
the center of each macrocarrier, after which the spotted particles
are allowed to dry about 2 minutes before bombardment.
[0687] The sample plates are bombarded at level #4 with a Biorad
Helium Gun. All samples receive a single shot at 450 PSI, with a
total of ten aliquots taken from each tube of prepared
particles/DNA.
[0688] Following bombardment, the embryos are incubated on 560P
(maintenance medium) for 12 to 48 hours at temperatures ranging
from 26°C to 37°C, and then placed at 26°C. After 5 to 7 days the
embryos are transferred to 560R selection medium containing 3
mg/liter Bialaphos, and subcultured every 2 weeks at 26°C. After
approximately 10 weeks of selection, selection-resistant callus
clones are transferred to 288J medium to initiate plant
regeneration. Following somatic embryo maturation (2-4 weeks),
well-developed somatic embryos are transferred to medium for
germination and transferred to a lighted culture room.
Approximately 7-10 days later, developing plantlets are transferred
to 272V hormone-free medium in tubes for 7-10 days until plantlets
are well established. Plants are then transferred to inserts in
flats (equivalent to a 2.5-inch pot) containing potting soil and grown
for 1 week in a growth chamber, subsequently grown an additional
1-2 weeks in the greenhouse, then transferred to Classic 600 pots
(1.6 gallon) and grown to maturity. Plants are monitored and scored
for transformation efficiency, and/or modification of regenerative
capabilities.
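The stage durations in the regeneration schedule above can be summed to estimate the time from bombardment to transfer into the final pots. The Python sketch below does this under two assumptions that are not stated in the source: the 12 to 48 hour incubation falls within the initial 5 to 7 days, and "approximately 10 weeks" is treated as exactly 70 days. Time to maturity after final potting is not included.

```python
# (stage, minimum days, maximum days), transcribed from the protocol above.
STAGES = [
    ("recovery on 560P before selection", 5, 7),
    ("selection on 560R (about 10 weeks)", 70, 70),
    ("somatic embryo maturation on 288J", 14, 28),
    ("germination in lighted culture room", 7, 10),
    ("establishment on 272V in tubes", 7, 10),
    ("growth chamber in flats", 7, 7),
    ("greenhouse before final potting", 7, 14),
]


def total_days(stages):
    """Return (shortest, longest) total duration in days."""
    shortest = sum(minimum for _, minimum, _ in stages)
    longest = sum(maximum for _, _, maximum in stages)
    return shortest, longest


shortest, longest = total_days(STAGES)
print(f"bombardment to final potting: {shortest} to {longest} days")
```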
[0689] Initiation medium (560L) comprises 4.0 g/l N6 basal salts
(SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix
(1000× SIGMA-1511), 0.5 mg/l thiamine HCl, 20.0 g/l sucrose,
1.0 mg/l 2,4-D, and 2.88 g/l L-proline (brought to volume with D-I
H2O following adjustment to pH 5.8 with KOH); 2.0 g/l Gelrite
(added after bringing to volume with D-I H2O); and 8.5 mg/l silver
nitrate (added after sterilizing the medium and cooling to room
temperature).
[0690] Maintenance medium (560P) comprises 4.0 g/l N6 basal salts
(SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix
(1000× SIGMA-1511), 0.5 mg/l thiamine HCl, 30.0 g/l sucrose,
2.0 mg/l 2,4-D, and 0.69 g/l L-proline (brought to volume with D-I
H2O following adjustment to pH 5.8 with KOH); 3.0 g/l Gelrite
(added after bringing to volume with D-I H2O); and 0.85 mg/l silver
nitrate (added after sterilizing the medium and cooling to room
temperature).
[0691] Bombardment medium (560Y) comprises 4.0 g/l N6 basal salts
(SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix
(1000× SIGMA-1511), 0.5 mg/l thiamine HCl, 120.0 g/l sucrose,
1.0 mg/l 2,4-D, and 2.88 g/l L-proline (brought to volume with D-I
H2O following adjustment to pH 5.8 with KOH); 2.0 g/l Gelrite
(added after bringing to volume with D-I H2O); and 8.5 mg/l silver
nitrate (added after sterilizing the medium and cooling to room
temperature).
Selection medium (560R) comprises 4.0 g/l N6 basal salts (SIGMA
C-1416), 1.0 ml/l Eriksson's Vitamin Mix (1000× SIGMA-1511),
0.5 mg/l thiamine HCl, 30.0 g/l sucrose, and 2.0 mg/l 2,4-D
(brought to volume with D-I H2O following adjustment to pH 5.8 with
KOH); 3.0 g/l Gelrite (added after bringing to volume with D-I
H2O); and 0.85 mg/l silver nitrate and 3.0 mg/l bialaphos (both
added after sterilizing the medium and cooling to room
temperature).
[0692] Plant regeneration medium (288J) comprises 4.3 g/l MS salts
(GIBCO 11117-074), 5.0 ml/l MS vitamins stock solution (0.100 g/l
nicotinic acid, 0.02 g/l thiamine HCL, 0.10 g/l pyridoxine HCL, and
0.40 g/l glycine brought to volume with polished D-I H2O)
(Murashige and Skoog (1962) Physiol. Plant. 15:473), 100 mg/l
myo-inositol, 0.5 mg/l zeatin, 60 g/l sucrose, and 1.0 ml/l of 0.1
mM abscisic acid (brought to volume with polished D-I H2O after
adjusting to pH 5.6); 3.0 g/l Gelrite (added after bringing to
volume with D-I H2O); and 1.0 mg/l indoleacetic acid and 3.0 mg/l
bialaphos (added after sterilizing the medium and cooling to
60°C).
Hormone-free medium (272V) comprises 4.3 g/l MS
salts (GIBCO 11117-074), 5.0 ml/l MS vitamins stock solution (0.100
g/l nicotinic acid, 0.02 g/l thiamine HCL, 0.10 g/l pyridoxine HCL,
and 0.40 g/l glycine brought to volume with polished D-I H2O), 0.1
g/l myo-inositol, and 40.0 g/l sucrose (brought to volume with
polished D-I H2O after adjusting pH to 5.6); and 6 g/l bacto-agar
(added after bringing to volume with polished D-I H2O), sterilized
and cooled to 60°C.
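The media recipes above are given per litre, so preparing a smaller batch is a matter of scaling. The Python sketch below transcribes two of the recipes (560L and 560Y) into a simple data structure, scales one to a chosen volume, and reports which components differ between the two. The data layout and function names are illustrative and are not part of the disclosure.

```python
from typing import Dict, Tuple

# component -> (amount per litre, unit), transcribed from the recipes above
Recipe = Dict[str, Tuple[float, str]]

MEDIA: Dict[str, Recipe] = {
    "560L": {
        "N6 basal salts": (4.0, "g"),
        "Eriksson's vitamin mix (1000x)": (1.0, "ml"),
        "thiamine HCl": (0.5, "mg"),
        "sucrose": (20.0, "g"),
        "2,4-D": (1.0, "mg"),
        "L-proline": (2.88, "g"),
        "Gelrite": (2.0, "g"),
        "silver nitrate": (8.5, "mg"),
    },
    "560Y": {
        "N6 basal salts": (4.0, "g"),
        "Eriksson's vitamin mix (1000x)": (1.0, "ml"),
        "thiamine HCl": (0.5, "mg"),
        "sucrose": (120.0, "g"),
        "2,4-D": (1.0, "mg"),
        "L-proline": (2.88, "g"),
        "Gelrite": (2.0, "g"),
        "silver nitrate": (8.5, "mg"),
    },
}


def scale(recipe: Recipe, litres: float) -> Recipe:
    """Scale per-litre amounts to a batch of the given volume."""
    return {name: (round(amount * litres, 4), unit) for name, (amount, unit) in recipe.items()}


def differences(a: Recipe, b: Recipe) -> Dict[str, Tuple[float, float]]:
    """Return components present in both recipes whose amounts differ."""
    return {name: (a[name][0], b[name][0]) for name in a if name in b and a[name][0] != b[name][0]}


batch = scale(MEDIA["560Y"], 0.25)  # a 250 ml batch of bombardment medium
diff = differences(MEDIA["560L"], MEDIA["560Y"])
print(batch["sucrose"])
print(diff)
```

As transcribed, the initiation (560L) and bombardment (560Y) media differ only in sucrose concentration.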
[0693] b. Agrobacterium-Mediated Transformation
[0694] Agrobacterium-mediated transformation was performed
essentially as described in Djukanovic et al. (2006) Plant Biotech
J 4:345-57. Briefly, 10-12 day old immature embryos (0.8-2.5 mm in
size) were dissected from sterilized kernels and placed into liquid
medium (4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's
Vitamin Mix (Sigma E-1511), 1.0 mg/L thiamine HCl, 1.5 mg/L 2,4-D,
0.690 g/L L-proline, 68.5 g/L sucrose, 36.0 g/L glucose, pH 5.2).
After embryo collection, the medium was replaced with 1 ml
Agrobacterium at a concentration of 0.35-0.45 OD550. Maize embryos
were incubated with Agrobacterium for 5 min at room temperature,
then the mixture was poured onto a media plate containing 4.0 g/L
N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix
(Sigma E-1511), 1.0 mg/L thiamine HCl, 1.5 mg/L 2,4-D, 0.690 g/L
L-proline, 30.0 g/L sucrose, 0.85 mg/L silver nitrate, 0.1 nM
acetosyringone, and 3.0 g/L Gelrite, pH 5.8. Embryos were incubated
axis down, in the dark for 3 days at 20°C, then incubated
4 days in the dark at 28°C, then transferred onto new
media plates containing 4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0
ml/L Eriksson's Vitamin Mix (Sigma E-1511), 1.0 mg/L thiamine HCl,
1.5 mg/L 2,4-D, 0.69 g/L L-proline, 30.0 g/L sucrose, 0.5 g/L MES
buffer, 0.85 mg/L silver nitrate, 3.0 mg/L Bialaphos, 100 mg/L
carbenicillin, and 6.0 g/L agar, pH 5.8. Embryos were subcultured
every three weeks until transgenic events were identified. Somatic
embryogenesis was induced by transferring a small amount of tissue
onto regeneration medium (4.3 g/L MS salts (Gibco 11117), 5.0 ml/L
MS Vitamins Stock Solution, 100 mg/L myo-inositol, 0.1 µM ABA, 1
mg/L IAA, 0.5 mg/L zeatin, 60.0 g/L sucrose, 1.5 mg/L Bialaphos,
100 mg/L carbenicillin, 3.0 g/L Gelrite, pH 5.6) and incubation in
the dark for two weeks at 28°C. All material with visible
shoots and roots was transferred onto media containing 4.3 g/L MS
salts (Gibco 11117), 5.0 ml/L MS Vitamins Stock Solution, 100 mg/L
myo-inositol, 40.0 g/L sucrose, 1.5 g/L Gelrite, pH 5.6, and
incubated under artificial light at 28°C. One week later,
plantlets were moved into glass tubes containing the same medium
and grown until they were sampled and/or transplanted into
soil.
Example 11
Transient Expression of BBM Enhances Transformation
[0695] Parameters of the transformation protocol can be modified to
ensure that the BBM activity is transient. One such method involves
precipitating the BBM-containing plasmid in a manner that allows
for transcription and expression, but precludes subsequent release
of the DNA, for example, by using the chemical PEI. In one example,
the BBM plasmid is precipitated onto gold particles with PEI, while
the transgenic expression cassette (UBI::moPAT~GFPm::PinII;
moPAT is the maize optimized PAT gene) to be integrated is
precipitated onto gold particles using the standard calcium
chloride method.
[0696] Briefly, gold particles were coated with PEI as follows.
First, the gold particles were washed. Thirty-five mg of gold
particles, 1.0 µm in average diameter (A.S.I. #162-0010), were weighed
out in a microcentrifuge tube, and 1.2 ml absolute EtOH was added
and vortexed for one minute. The tube was incubated for 15 minutes
at room temperature and then centrifuged at high speed using a
microfuge for 15 minutes at 4°C. The supernatant was
discarded and a fresh 1.2 ml aliquot of ethanol (EtOH) was added,
vortexed for one minute, centrifuged for one minute, and the
supernatant again discarded (this was repeated twice). A fresh 1.2
ml aliquot of EtOH was added, and this suspension (gold particles
in EtOH) was stored at -20°C for weeks. To coat particles
with polyethylenimine (PEI; Sigma #P3143), 250 µl of the washed
gold particle/EtOH mix was centrifuged and the EtOH discarded. The
particles were washed once in 100 µl ddH2O to remove residual
ethanol, 250 µl of 0.25 mM PEI was added, followed by a
pulse-sonication to suspend the particles, and then the tube was
plunged into a dry ice/EtOH bath to flash-freeze the suspension,
which was then lyophilized overnight. At this point, dry, coated
particles could be stored at -80°C for at least 3 weeks.
Before use, the particles were rinsed 3 times with 250 µl
aliquots of 2.5 mM HEPES buffer, pH 7.1, with 1×
pulse-sonication, and then a quick vortex before each
centrifugation. The particles were then suspended in a final volume
of 250 µl HEPES buffer. A 25 µl aliquot of the particles was
added to fresh tubes before attaching DNA. To attach uncoated DNA,
the particles were pulse-sonicated, then 1 µg of DNA (in 5 µl
water) was added, followed by mixing by pipetting up and down a few
times with a Pipetman and incubated for 10 minutes. The particles
were spun briefly (i.e. 10 seconds), the supernatant removed, and
60 µl EtOH added. The particles with PEI-precipitated DNA-1 were
washed twice in 60 µl of EtOH. The particles were centrifuged,
the supernatant discarded, and the particles were resuspended in 45
µl water. To attach the second DNA (DNA-2), precipitation using
a water-soluble cationic lipid transfection reagent was used. The
45 µl of particles/DNA-1 suspension was briefly sonicated, and
then 5 µl of 100 ng/µl of DNA-2 and 2.5 µl of the
water-soluble cationic lipid transfection reagent were added. The
solution was placed on a rotary shaker for 10 minutes and then
centrifuged at 10,000 g for 1 minute. The supernatant was removed, and the
particles resuspended in 60 µl of EtOH. The solution was spotted
onto macrocarriers and the gold particles onto which DNA-1 and
DNA-2 had been sequentially attached were delivered into scutellar
cells of 10 DAP Hi-II immature embryos using a standard protocol
for the PDS-1000. For this experiment, the DNA-1 plasmid contained
a UBI::RFP::pinII expression cassette, and DNA-2 contained a
UBI::CFP::pinII expression cassette. Two days after bombardment,
transient expression of both the CFP and RFP fluorescent markers
was observed as numerous red and blue cells on the surface of the
immature embryo. The embryos were then placed on non-selective
culture medium and allowed to grow for 3 weeks before scoring for
stable colonies. After this 3-week period, 10 multicellular,
stably-expressing blue colonies were observed, in comparison to
only one red colony. This demonstrated that PEI-precipitation could
be used to effectively introduce DNA for transient expression while
dramatically reducing integration of the PEI-introduced DNA and
thus reducing the recovery of RFP-expressing transgenic events. In
this manner, PEI-precipitation can be used to deliver transient
expression of BBM and/or WUS2.
[0697] For example, the particles are first coated with
UBI::BBM::pinII using PEI, then coated with UBI::moPAT~YFP
using a water-soluble cationic lipid transfection reagent, and then
bombarded into scutellar cells on the surface of immature embryos.
PEI-mediated precipitation results in a high frequency of
transiently expressing cells on the surface of the immature embryo
and extremely low frequencies of recovery of stable transformants.
Thus, it is expected that the PEI-precipitated BBM cassette
expresses transiently and stimulates a burst of embryogenic growth
on the bombarded surface of the tissue (i.e. the scutellar
surface), but this plasmid will not integrate. The PAT~GFP
plasmid released from the Ca++/gold particles is expected to
integrate and express the selectable marker at a frequency that
results in substantially improved recovery of transgenic events. As
a control treatment, PEI-precipitated particles containing a
UBI::GUS::pinII (instead of BBM) are mixed with the
PAT~GFP/Ca++ particles. Immature embryos from both treatments
are moved onto culture medium containing 3 mg/l bialaphos. After
6-8 weeks, it is expected that GFP+, bialaphos-resistant calli will
be observed in the PEI/BBM treatment at a much higher frequency
relative to the control treatment (PEI/GUS).
[0698] As an alternative method, the BBM plasmid is precipitated
onto gold particles with PEI, and then introduced into scutellar
cells on the surface of immature embryos, and subsequent transient
expression of the BBM gene elicits a rapid proliferation of
embryogenic growth. During this period of induced growth, the
explants are treated with Agrobacterium using standard methods for
maize (see Example 1), with T-DNA delivery into the cell
introducing a transgenic expression cassette such as
UBI::moPAT~GFPm::pinII. After co-cultivation, explants are
allowed to recover on normal culture medium, and then are moved
onto culture medium containing 3 mg/l bialaphos. After 6-8 weeks,
it is expected that GFP+, bialaphos-resistant calli will be
observed in the PEI/BBM treatment at a much higher frequency
relative to the control treatment (PEI/GUS).
[0699] It may be desirable to "kick start" callus growth by
transiently expressing the BBM and/or WUS2 polynucleotide products.
This can be done by delivering BBM and WUS2 5'-capped
polyadenylated RNA, expression cassettes containing BBM and WUS2
DNA, or BBM and/or WUS2 proteins. All of these molecules can be
delivered using a biolistics particle gun. For example, 5'-capped
polyadenylated BBM and/or WUS2 RNA can easily be made in vitro
using Ambion's mMessage mMachine kit. RNA is co-delivered along
with DNA containing a polynucleotide of interest and a marker used
for selection/screening such as Ubi::moPAT~GFPm::PinII. It is
expected that the cells receiving the RNA will immediately begin
dividing more rapidly and a large portion of these will have
integrated the agronomic gene. These events can further be
validated as being transgenic clonal colonies because they will
also express the PAT~GFP fusion protein (and thus will
display green fluorescence under appropriate illumination). Plants
regenerated from these embryos can then be screened for the
presence of the polynucleotide of interest.
Example 12
[0700] DNA Constructs to Test the Guide RNA/Cas Endonuclease System
for Soybean Genome Modifications
[0701] To test whether a guide RNA/Cas endonuclease system, similar to
that described in Example 1 for maize, is functional in a dicot
such as soybean, a Cas9 (SO) gene (SEQ ID NO:115), soybean codon
optimized from Streptococcus pyogenes M1 GAS (SF370), was expressed
with a strong soybean constitutive promoter, GM-EF1A2 (US patent
application 20090133159) (SEQ ID NO: 116). A simian vacuolating
virus 40 (SV40) large T-antigen nuclear localization signal (SEQ ID
NO:117), representing the amino acid residues PKKKRKV with a
linker SRAD (SRADPKKKRKV), was added to the carboxyl terminus of
the codon optimized Cas9 to facilitate transporting the codon
optimized Cas9 protein (SEQ ID NO:118) to the nucleus. The codon
optimized Cas9 gene was synthesized as two pieces by GenScript USA
Inc. (Piscataway, N.J.) and cloned in frame downstream of the
GM-EF1A2 promoter to make DNA construct QC782 shown in FIG. 7 (SEQ
ID NO:119).
[0702] Plant U6 RNA polymerase III promoters have been cloned and
characterized from species such as Arabidopsis and Medicago truncatula
(Waibel and Filipowicz, NAR 18:3451-3458 (1990); Li et al., J.
Integrat. Plant Biol. 49:222-229 (2007); Kim and Nam, Plant Mol.
Biol. Rep. 31:581-593 (2013); Wang et al., RNA 14:903-913 (2008)).
Soybean U6 small nuclear RNA (snRNA) genes were identified herein
by searching the public soybean variety Williams82 genomic sequence
using the Arabidopsis U6 gene coding sequence. Approximately 0.5 kb of
genomic DNA sequence upstream of the first G nucleotide of a U6
gene was selected for use as an RNA polymerase III promoter, for
example the GM-U6-13.1 promoter (SEQ ID NO:120), to express guide RNA
that directs the Cas9 nuclease to a designated genomic site. The guide RNA
coding sequence was 76 bp long (FIG. 8B) and comprised a 20 bp
variable targeting domain from a chosen soybean genomic target site
on the 5' end and a tract of 4 or more T residues as a
transcription terminator on the 3' end (SEQ ID NO:121, FIG. 8B).
The first nucleotide of the 20 bp variable targeting domain was a G
residue to be used by RNA polymerase III for transcription. The U6
gene promoter and the complete guide RNA were synthesized and then
cloned into an appropriate vector to make, for example, DNA
construct QC783 shown in FIG. 8A (SEQ ID NO:122). Promoters of other
soybean U6 homologous genes were similarly cloned and used for small
RNA expression.
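The structural constraints stated above for the guide RNA coding sequence (76 bp in total, a leading G for RNA polymerase III, and a 3' tract of at least four T residues) can be checked mechanically. The Python sketch below is a hypothetical validator, not part of the disclosure. It assumes the T tract is counted within the 76 bp, and the test sequence is a placeholder rather than SEQ ID NO:121.

```python
import re
from typing import List


def check_guide_coding_sequence(seq: str, total_len: int = 76, min_t_tract: int = 4) -> List[str]:
    """Return a list of problems found; an empty list means all checks pass."""
    seq = seq.upper()
    problems = []
    if re.fullmatch(r"[ACGT]+", seq) is None:
        problems.append("sequence must contain only A, C, G and T")
    if len(seq) != total_len:
        problems.append(f"expected {total_len} bp, found {len(seq)} bp")
    if not seq.startswith("G"):
        problems.append("first nucleotide of the variable targeting domain is not G")
    if not seq.endswith("T" * min_t_tract):
        problems.append(f"3' end lacks a tract of at least {min_t_tract} T residues")
    return problems


# Placeholder sequence with the right shape (NOT the real guide RNA sequence):
# 20 nt domain starting with G, 52 nt of filler, then a TTTT terminator.
placeholder = "G" + "A" * 19 + "C" * 52 + "TTTT"
print(check_guide_coding_sequence(placeholder))
print(check_guide_coding_sequence("A" * 76))
```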
[0703] Since the Cas9 endonuclease and the guide RNA need to form a
protein/RNA complex to mediate site-specific DNA double strand
cleavage, the Cas9 endonuclease and guide RNA must be expressed in
the same cells. To improve their co-expression and presence, the Cas9
endonuclease and guide RNA expression cassettes were linked into a
single DNA construct, for example, QC815 in FIG. 9 A (SEQ ID
NO:123), which was then used to transform soybean cells to test the
soybean optimized guide RNA/Cas system for genome modification.
Similar DNA constructs were made to target different genomic sites
using guide RNAs containing different target sequences.
Example 13
[0704] Selection of Soybean Genomic Sites to be Cleaved by the
Guide RNA/Cas Endonuclease System
[0705] A region of soybean chromosome 4 (Gm04) was selected to
test whether the soybean optimized guide RNA/Cas endonuclease system
could recognize, cleave, and mutate soybean chromosomal DNA through
imprecise non-homologous end-joining (NHEJ) repair. Two genomic
target sites were selected: one close to a predicted gene,
Glyma04g39780.1 at 114.13 cM, herein named the DD20 locus (FIG. 10A), and
another close to Glyma04g39550.1 at 111.95 cM, herein named the DD43
locus (FIG. 10B). Each 20 bp variable targeting domain of
the guide RNA started with a G residue, as required by RNA polymerase
III, and was followed in the soybean genome by a 3 bp PAM motif
(Table 11). The chromosome positions of the soybean genomic target
sites in close proximity to the PAM sequences were determined by
BLAST searching the public soybean variety Williams82 genomic
sequence. The soybean genomic target sites DD20CR1 (SEQ ID NO:
125), DD20CR2 (SEQ ID NO: 126), and DD43CR1 (SEQ ID NO: 127) were
each found to be unique in the soybean genome, whereas a second, identical
23 bp genomic target site for DD43CR2 (SEQ ID NO: 128) was found at
Gm06:12072339-12072361, so there are two potential cleavage sites
targeted by the DD43CR2 guide RNA. Both DD43CR1 and DD43CR2 are
complementary strand sequences, indicated by "c" after the
positions.
TABLE 11. Soybean genomic target sites for a guide RNA/Cas
endonuclease system.

Chromosome        Positions            Designation  Genomic Target Sites   PAM
Gm04, 114.13 cM   45936311-45936333    DD20CR1      GGAACTGACACACGACATGA   TGG
                  45936324-45936346    DD20CR2      GACATGATGGAACGTGACTA   AGG
Gm04, 111.95 cM   45731921-45731943c   DD43CR1      GTCCCTTGTACTTGTACGTA   CGG
                  45731895-45731917c   DD43CR2      GTATTCTAGAAAAGAGGAAT   TGG
[0706] Guide RNA expression cassettes, each comprising a variable
targeting domain targeting one of the DD20CR1, DD20CR2, or DD43CR2 genomic
target sites, were similarly constructed and linked to the soybean
Cas9 expression cassette to make DNA constructs QC817, QC818, and
QC816, which are similar to QC815 in FIG. 9A (SEQ ID NO:123) except
for the 20 bp variable targeting domain of the guide RNA.
[0707] Since up to six continuous mismatches in the 5' region of
the genomic target site (protospacer) with the 20 bp variable
targeting domain can be tolerated, i.e., a continuous stretch of 14
base pairs between the variable targeting domain and the crRNA
sequence proximate to the PAM is sufficient for efficient
target cleavage, any 23 bp genomic DNA sequence following the
pattern N(20)NGG can be selected as a target site for the guide
RNA/Cas endonuclease system. The final NGG is the PAM sequence, which
should not be included in the 20 bp variable targeting domain of
the guide RNA. If the first N is not endogenously a G residue, it
must be replaced with a G residue in the guide RNA target sequence to
accommodate RNA polymerase III; this substitution should not sacrifice
recognition specificity of the target site by the guide RNA.
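The N(20)NGG selection rule above can be sketched as a short Python scan. The functions below are illustrative and are not the authors' tooling: one finds every overlapping 20 nt protospacer followed by an NGG PAM on the strand supplied, and one forces the first nucleotide of the variable targeting domain to G. The 36 bp test region is an assumption, assembled here from the DD20CR1 and DD20CR2 rows of Table 11, whose coordinates overlap by 10 bp.

```python
import re
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, for scanning the opposite strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_target_sites(seq: str) -> List[Tuple[int, str, str]]:
    """Return (0-based start, 20 nt protospacer, PAM) for each N(20)NGG match.

    A lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq.upper())]


def variable_targeting_domain(protospacer: str) -> str:
    """Return the 20 nt guide domain with the first nucleotide set to G."""
    if len(protospacer) != 20:
        raise ValueError("protospacer must be 20 nt long")
    return "G" + protospacer[1:].upper()


# Region assembled from the DD20CR1 and DD20CR2 entries of Table 11 (assumption).
region = "GGAACTGACACACGACATGATGGAACGTGACTAAGG"
sites = find_target_sites(region)
for start, protospacer, pam in sites:
    print(start, protospacer, pam, variable_targeting_domain(protospacer))
```

To cover complementary-strand sites such as DD43CR1 and DD43CR2, the same scan can be run on `reverse_complement(region)`.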
Example 14
[0708] Delivery of the Guide RNA/Cas Endonuclease System DNA to
Soybean by Transient Transformation
[0709] The soybean optimized Cas9 endonuclease and guide RNA
expression cassettes were delivered to young soybean somatic
embryos in the form of embryogenic suspension cultures by particle
gun bombardment. Soybean embryogenic suspension cultures were
induced as follows. Cotyledons (~3 mm in length) were
dissected from surface sterilized, immature seeds and were cultured
for 6-10 weeks in the light at 26°C on a Murashige and
Skoog (MS) media containing 0.7% agar and supplemented with 10
mg/ml 2,4-D (2,4-Dichlorophenoxyacetic acid). Globular stage
somatic embryos, which produced secondary embryos, were then
excised and placed into flasks containing liquid MS medium
supplemented with 2,4-D (10 mg/ml) and cultured in the light on a
rotary shaker. After repeated selection for clusters of somatic
embryos that multiplied as early, globular staged embryos, the
soybean embryogenic suspension cultures were maintained in 35 ml
liquid media on a rotary shaker, 150 rpm, at 26°C with
fluorescent lights on a 16:8 hour day/night schedule. Cultures were
subcultured every two weeks by inoculating approximately 35 mg of
tissue into 35 ml of the same fresh liquid MS medium.
[0710] Soybean embryogenic suspension cultures were then
transformed by the method of particle gun bombardment using a
DuPont Biolistic™ PDS1000/HE instrument (Bio-Rad Laboratories,
Hercules, Calif.). To 50 µl of a 60 mg/ml 1.0 mm gold particle
suspension were added (in order): 30 µl of 30 ng/µl QC815 DNA
fragment U6-13.1:DD43CR1+EF1A2:CAS9 as an example, 20 µl of 0.1
M spermidine, and 25 µl of 5 M CaCl2. The particle
preparation was then agitated for 3 minutes, spun in a centrifuge
for 10 seconds and the supernatant removed. The DNA-coated
particles were then washed once in 400 µl 100% ethanol and
resuspended in 45 µl of 100% ethanol. The DNA/particle
suspension was sonicated three times for one second each. Then 5
µl of the DNA-coated gold particles was loaded on each macro
carrier disk.
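As a rough check on the quantities in the particle preparation, the arithmetic can be written out as below. The per-shot figures are upper bounds resting on two assumptions that the text does not state: all DNA binds to the gold, and the 45 µl suspension is homogeneous and fully used. The function name is hypothetical.

```python
def bombardment_amounts():
    """Tally reagent amounts for the particle preparation described in paragraph [0710]."""
    gold_ul, dna_ul, spermidine_ul, cacl2_ul = 50, 30, 20, 25  # volumes added, in µl
    dna_ng_per_ul = 30
    gold_mg_per_ml = 60
    resuspension_ul, load_ul = 45, 5

    total_dna_ng = dna_ul * dna_ng_per_ul              # DNA in the coating mix
    total_gold_mg = gold_ul * gold_mg_per_ml / 1000    # gold in the coating mix
    shots = resuspension_ul // load_ul                 # macrocarriers per preparation
    return {
        "mix_volume_ul": gold_ul + dna_ul + spermidine_ul + cacl2_ul,
        "total_dna_ng": total_dna_ng,
        "total_gold_mg": total_gold_mg,
        "shots": shots,
        # Assumption: complete DNA binding and even distribution across shots.
        "dna_per_shot_ng": total_dna_ng / shots,
        "gold_per_shot_mg": total_gold_mg / shots,
    }

amounts = bombardment_amounts()
```

Under these assumptions one preparation yields nine loadings, each carrying at most about 100 ng of DNA.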
[0711] Approximately 100 mg of a two-week-old suspension culture
was placed in an empty 60×15 mm Petri dish and the residual
liquid was removed from the tissue with a pipette. Membrane rupture
pressure was set at 1100 psi and the chamber was evacuated to a
vacuum of 28 inches mercury. The tissue was placed approximately
3.5 inches away from the retaining screen and bombarded once. The
tissue clumps were rearranged and bombarded another time. A minimum
amount of liquid MS media without 2,4-D supplement was added to the
tissue to prevent the cultures from drying or overgrowing. The
60×15 mm Petri dish was sealed in a 100×25 mm Petri
dish containing agar solid MS media as another measure to keep
the tissues from drying up. The tissues were harvested seven days
later and genomic DNA was extracted for PCR analysis.
Example 15
[0712] Analysis of Guide RNA/Cas Endonuclease System Mediated
Site-Specific NHEJ by Deep Sequencing
[0713] To evaluate DNA double strand cleavage at a soybean genomic
target site mediated by the guide RNA/Cas endonuclease system, a
region of approximately 100 bp genomic DNA surrounding the target
site was amplified by PCR and the PCR product was then sequenced to
check for mutations at the target site resulting from NHEJ. The region
was first amplified by 20 cycles of PCR with Phusion High Fidelity
mastermix (New England Biolabs) from 100 ng genomic DNA using
gene-specific primers that also contain adaptors and
amplicon-specific barcode sequences needed for a second round PCR
and subsequent sequence analysis. For example, the first PCR for
the four experiments listed in Table 12 was done using primers
DD20-S3 (SEQ ID NO:133)/DD20-A (SEQ ID NO:134), DD20-S4 (SEQ ID
NO:135)/DD20-A, DD43-S3 (SEQ ID NO:136)/DD43-A (SEQ ID NO:137) and
DD43-S4 (SEQ ID NO:138)/DD43-A. One microliter of the first round
PCR products was further amplified by another 20 cycles of PCR
using universal primers (SEQ ID NOs:140, 141) with Phusion High
Fidelity mastermix. The resulting PCR products were separated on
1.5% agarose gel and the specific DNA bands were purified with
Qiagen gel purification spin columns. DNA concentrations were
measured with a DNA Bioanalyzer (Agilent) and equimolar amounts
of DNA for up to 12 different samples, each with a specific barcode,
were mixed as one sample for Illumina deep sequencing analysis.
Single read 100 nucleotide-length deep sequencing was performed at
a DuPont core facility on an Illumina MiSeq Personal Sequencer
with a 40% (v/v) spike of PhiX control v3 (Illumina, FC-110-3001)
to off-set sequence bias.
[0714] Since the genomic target site is located in the middle of
the ~100 bp long PCR amplicon (SEQ ID NOs: 142, 143, 144,
145), the 100 nucleotide-length deep sequencing is sufficient to
cover the target site region. A window of 10 nucleotides centered
over the expected cleavage site, i.e., 3 bp upstream of the PAM,
was selected for sequence analysis. Only those reads with one or
more nucleotide indels arising within the 10 nucleotide window and
not found at a similar level in negative controls were classified
as NHEJ mutations. NHEJ mutant reads of different lengths but with
the same mutation were counted together as a single mutation, and up
to the 10 most prevalent mutations were visually confirmed to be
specific mutations before they were used to calculate the % mutant
reads based on the total analyzed reads containing the specific
barcode and forward primer.
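The window test described above can be sketched as follows. The coordinate conventions are assumptions made for illustration (0-based reference positions, half-open indel intervals, and "arising within the window" read as any overlap with it); the function names are hypothetical and the negative-control filter is not modeled.

```python
def cleavage_window(pam_start, width=10):
    """Half-open [start, end) window centered on the cut site 3 bp upstream of the PAM."""
    cut = pam_start - 3          # the cut falls just before this reference index
    half = width // 2
    return cut - half, cut + half

def indel_in_window(indel_start, indel_end, pam_start, width=10):
    """True if an indel touches the window.

    A deletion covers reference bases [indel_start, indel_end);
    an insertion has indel_start == indel_end (the point of insertion).
    """
    win_start, win_end = cleavage_window(pam_start, width)
    if indel_start == indel_end:
        return win_start <= indel_start <= win_end
    return indel_start < win_end and indel_end > win_start
```

For a PAM starting at reference index 50, the window spans indices 42 to 51, so a 3 bp deletion covering indices 46 to 48 is counted while one covering indices 10 to 14 is not.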
[0715] The frequencies of NHEJ mutations revealed by deep
sequencing for four target sites DD20CR1, DD20CR2, DD43CR1, DD43CR2
with one RNA polymerase III promoter GM-U6-13.1 are shown in Table
12. The visually confirmed most prevalent NHEJ mutations are shown
in FIG. 11A-11D. The mutant sequences in FIG. 11A-11E are listed as
SEQ ID NOs:147-201. The top row is the original reference sequence
with the target site sequence underlined. Deletions in the mutated
sequences are indicated by " - - - " while additions and
replacements are indicated by bold letters. The total read count for
each mutation is given in the last column. Cas9
nuclease construct only, guide RNA construct only, and no DNA
bombardment negative controls were similarly performed and analyzed,
but the data are not shown since no specific mutations were detected.
Other target sites and guide RNAs were also tested with similar
positive results (data not shown).
TABLE-US-00013 TABLE 12 Target site-specific mutations introduced
by guide RNA/Cas endonuclease mediated NHEJ.
Experiment | DNA | Mutant reads | Total reads | % Mutants
U6-13.1:DD20CR1 + EF1A2:CAS9 | QC817 | 339 | 710,339 | 0.048%
U6-13.1:DD20CR2 + EF1A2:CAS9 | QC818 | 419 | 693,483 | 0.060%
U6-13.1:DD43CR1 + EF1A2:CAS9 | QC815 | 489 | 682,207 | 0.072%
U6-13.1:DD43CR2 + EF1A2:CAS9 | QC816 | 917** | 539,681 | 0.170%
**At least the top 15 reads are specific mutations but only the top
10 are counted in the table to be consistent with other
experiments. If all top 15 mutations are counted, the total Mutant
reads is 1080 and the % Mutants is 0.200%.
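The % Mutants column follows directly from the read counts. A minimal sketch that recomputes it from the values in Table 12 (the dictionary keys and function name are illustrative only):

```python
# (mutant reads, total reads) per target site, taken from Table 12.
TABLE_12 = {
    "DD20CR1": (339, 710_339),
    "DD20CR2": (419, 693_483),
    "DD43CR1": (489, 682_207),
    "DD43CR2": (917, 539_681),
}

def percent_mutant(mutant_reads, total_reads):
    """Percentage of analyzed reads that carry an NHEJ mutation."""
    return 100.0 * mutant_reads / total_reads

percentages = {site: round(percent_mutant(m, t), 3) for site, (m, t) in TABLE_12.items()}
# Footnote case: counting the top 15 mutations for DD43CR2 gives 1080 mutant reads.
dd43cr2_top15 = round(percent_mutant(1080, 539_681), 3)
```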
[0716] In conclusion, our data indicate that the soybean optimized
guide RNA/Cas endonuclease system is able to effectively cleave
soybean endogenous genomic DNA and create imperfect NHEJ mutations
at the specified genomic target sites.
Example 16
[0717] The Guide RNA/Cas Endonuclease System Delivers Double-Strand
Breaks (DSBs) to the Maize Epsps Locus Resulting in Desired Point
Mutations
[0718] Two maize optimized Cas9 endonucleases were developed and
evaluated for their ability to introduce a double-strand break at a
genomic target sequence. A first Cas9 endonuclease was as described
in FIG. 1A (Example 2 and expression cassette SEQ ID NO:5). A
second maize optimized Cas9 endonuclease (moCas9 endonuclease; SEQ
ID NO:192) was supplemented with the SV40 nuclear localization
signal by adding the signal coding sequence to the 5' end of the
moCas9 coding sequence (FIG. 13). The plant moCas9 expression
cassette was subsequently modified by the insertion of the ST-LS1
intron into the moCas9 coding sequence in order to enhance its
expression in maize cells and to eliminate its expression in E.
coli and Agrobacterium. The maize ubiquitin promoter and the potato
proteinase inhibitor II gene terminator sequences complemented the
moCas9 endonuclease gene designs. The structural elements of the
moCas9 expression cassette are shown in FIG. 13 and its amino acid
and nucleotide sequences are listed as SEQ ID Nos: 192 and 193.
[0719] A single guide RNA (sgRNA) expression cassette was
essentially as described in Example 1 and shown in FIG. 1B. It
consists of the U6 polymerase III maize promoter (SEQ ID NO: 9) and
its cognate U6 polymerase III termination sequences (TTTTTTTT). The
guide RNA (SEQ ID NO: 194) comprised a 20 nucleotide variable
targeting domain (nucleotides 1-20 of SEQ ID NO: 194) followed by an
RNA sequence capable of interacting with the double strand break
inducing endonuclease.
[0720] A maize optimized Cas9 endonuclease target sequence (moCas9
target sequence) within the EPSPS coding sequence that was
complementary to the 20 nucleotide variable sequence of the sgRNA
determined the site of Cas9 endonuclease cleavage within the
EPSPS coding sequence.
[0721] The moCAS9 target sequence (nucleotides 25-44 of SEQ ID
NO:209) was synthesized and cloned into the guide RNA-Cas9
expression vector designed for delivery of the components of the
guide RNA-Cas9 system to the BMS (Black Mexican Sweet) cells
through Agrobacterium-mediated transformation. The Agrobacterium T-DNA
also delivered the yeast FLP site-specific recombinase and the WDV
(wheat dwarf virus) replication-associated protein (replicase).
Since the moCas9 target sequences were flanked by the FLP
recombination targets (FRT), they were excised by FLP in maize
cells forming episomal (chromosome-like) structures. Such circular
DNA fragments were replicated by the WDV replicase (the origin of
replication was embedded into the WDV promoter) allowing their
recovery in E. coli cells. If the maize optimized Cas9 endonuclease
made a double-strand break at the moCas9 target sequence, its
repair might produce mutations. The procedure is described in
detail in: Lyznik, L. A., Djukanovic, V., Yang, M. and Jones, S.
(2012) Double-strand break-induced targeted mutagenesis in plants.
In: Transgenic plants: Methods and Protocols (Dunwell, J. M. and
Wetten, A. C. eds). New York Heidelberg Dordrecht London: Springer,
pp. 399-416.
[0722] The guide RNA/Cas endonuclease systems using either one of
the maize optimized Cas9 endonucleases described herein generated
double-strand breaks in the moCas9 target sequence (Table 13).
Table 13 shows the percent of the moCas9 target sequences
mutagenized in the maize BMS cells using the moCas9 endonuclease of
SEQ ID NO: 192 or the maize optimized Cas9 endonuclease described
in FIG. 1A and expressed by the expression cassette of SEQ ID NO:5.
Both guide RNA/Cas endonuclease systems generated double-strand
breaks (as judged by the number of targeted mutagenesis events)
ranging from 67 to 84% of the moCas9 target sequences available on
episomal DNA molecules in maize BMS cells. A sample of mutagenized
EPSPS target sequences is shown in FIG. 14. This observation
indicates that the maize optimized Cas9 endonuclease described
herein is functional in maize cells and efficiently generates
double-strand breaks at the moCas9 target sequence.
TABLE-US-00014 TABLE 13 Percent of the moCas9 target sequences
mutagenized in the maize BMS cells by maize optimized Cas9
endonucleases.
Cas9 endonuclease version | # of moCas9 target sequences analyzed | # of intact moCas9 target sequences recovered | # of mutagenized moCas9 target sequences found | Percent mutagenesis (%)
SEQ ID NO: 193 (FIG. 13) | 81 | 13 | 68 | 84%
SEQ ID NO: 5 (FIG. 1A) | 93 | 31 | 62 | 67%
[0723] In order to accomplish targeted genome editing of the maize
chromosomal EPSPS gene, a polynucleotide modification template
which provided genetic information for editing the EPSPS coding
sequence was created (SEQ ID NO:195) and co-delivered with the
guide RNA/Cas9 system components.
[0724] As shown in FIG. 12, the polynucleotide modification
template comprised three nucleotide modifications (indicated by
arrows) when compared to the EPSPS genomic sequence to be edited.
These three nucleotide modifications are referred to as TIPS
mutations as these nucleotide modifications result in the amino
acid changes T-102 to I-102 and P-106 to S-106. The first point
mutation results from the substitution of the C nucleotide in the
codon sequence ACT with a T nucleotide; the second results
from the substitution of the T nucleotide in the same codon
sequence ACT with a C nucleotide to form the isoleucine codon
(ATC); and the third results from the substitution of
the first C nucleotide in the codon sequence CCA with a T
nucleotide to form the serine codon TCA (FIG. 12). Both
codon sequences were located within 9 nucleotides of each other as
shown in SEQ ID NO: 196: atcgcaatgcggtca. The three nucleotide
modifications are shown in bold. The nucleotides between the two
codon sequences were homologous to the non-edited EPSPS gene on the
epsps locus. The polynucleotide modification template further
comprised DNA fragments of maize EPSPS genomic sequence that were
used as homologous sequence for the EPSPS gene editing. The short
arm of homologous sequence (HR1--FIG. 12) was 810 base pairs long
and the long arm of homologous sequence (HR2--FIG. 12) was 2,883
base pairs long (SEQ ID NO: 195).
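The three TIPS substitutions can be traced on the 15-nucleotide stretch quoted as SEQ ID NO: 196. In the sketch below the wild-type stretch is an inference rather than a quoted sequence: it is rebuilt from the text's statements that the unedited codons are ACT and CCA and that the nine nucleotides between them are unchanged. The partial codon table and function names are illustrative.

```python
# Only the codons needed for this stretch; standard genetic code.
CODONS = {"ACT": "T", "ATC": "I", "GCA": "A", "ATG": "M",
          "CGG": "R", "CCA": "P", "TCA": "S"}

# Assumption: wild-type stretch inferred from the description (ACT ... CCA).
WILD_TYPE = "ACTGCAATGCGGCCA"

# (0-based position, expected wild-type base, replacement base)
TIPS_SUBSTITUTIONS = [(1, "C", "T"), (2, "T", "C"), (12, "C", "T")]

def apply_substitutions(seq, substitutions):
    bases = list(seq)
    for pos, expected, new in substitutions:
        if bases[pos] != expected:
            raise ValueError(f"position {pos}: expected {expected}, found {bases[pos]}")
        bases[pos] = new
    return "".join(bases)

def translate(seq):
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))

edited = apply_substitutions(WILD_TYPE, TIPS_SUBSTITUTIONS)
wild_type_peptide = translate(WILD_TYPE)   # residues 102-106 before editing
edited_peptide = translate(edited)         # residues 102-106 after editing
```

The edited string matches the sequence given for SEQ ID NO: 196, and only the first and last residues of the five-residue stretch change (T to I and P to S).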
[0725] In this example, the EPSPS polynucleotide modification
template was co-delivered using particle gun bombardment as a
plasmid (see template vector 1, FIG. 15) together with the guide
sgRNA expression cassette and a maize optimized Cas9 endonuclease
expression vector which contained the maize optimized Cas9
endonuclease expression cassette described in FIG. 1A (Example 1,
SEQ ID NO:5) and also contained a moPAT selectable marker gene. Ten
to eleven day-old immature embryos were placed, embryo-axis down,
onto plates containing the N6 medium (Table 14) and incubated at
28° C. for 4-6 hours before bombardment. The plates were
placed on the third shelf from the bottom in the PDS-1000 apparatus
and bombarded at 200 psi. Post-bombardment, embryos were incubated
in the dark overnight at 28° C. and then transferred to
plates containing the N6-2 media for 6-8 days at 28° C. The
embryos were then transferred to plates containing the N6-3 media
for three weeks, followed by transferring the responding callus to
plates containing the N6-4 media for an additional three-week
selection. After six total weeks of selection at 28° C., a
small amount of selected tissue was transferred onto the MS
regeneration medium and incubated for three weeks in the dark at
28° C.
TABLE-US-00015 TABLE 14 Composition of Culture Media.
Culture medium | Composition
N6 | 4.0 g/L N6 Basal Salts (Sigma C-1416; Sigma-Aldrich Co., St. Louis, MO, USA), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 0.5 mg/L thiamine HCl, 190 g/L sucrose, 1.0 mg/L 2,4-dichlorophenoxyacetic acid (2,4-D), 2.88 g/L L-proline, 8.5 mg/L silver nitrate, 25 mg/L cefotaxime, and 6.36 g/L Sigma agar at pH 5.8
N6-2 | 4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 0.5 mg/L thiamine HCl, 20 g/L sucrose, 1.0 mg/L 2,4-D, 2.88 g/L L-proline, 8.5 mg/L silver nitrate, 25 mg/L cefotaxime, and 8.5 g/L Sigma agar at pH 5.8
N6-3 | 4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 0.5 mg/L thiamine HCl, 30 g/L sucrose, 1.5 mg/L 2,4-D, 0.69 g/L L-proline, 0.5 g/L 2-(N-morpholino)ethanesulphonic acid (MES) buffer, 0.85 mg/L silver nitrate, 5 mg/L glufosinate NH4, and 8.0 g/L Sigma agar at pH 5.8
N6-4 | 4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 0.5 mg/L thiamine HCl, 30 g/L sucrose, 1.5 mg/L 2,4-D, 0.69 g/L L-proline, 0.5 g/L MES buffer, 0.85 mg/L silver nitrate, 3 mg/L bialaphos, and 8.0 g/L Sigma agar at pH 5.8
MS | 4.3 g/L Murashige and Skoog (MS) salts (Gibco 11117; Gibco, Grand Island, NY), 5.0 ml/L MS Vitamins Stock Solution (Sigma M3900), 100 mg/L myo-inositol, 0.1 µmol abscisic acid (ABA), 1 mg/L indoleacetic acid (IAA), 0.5 mg/L zeatin, 60.0 g/L sucrose, 3.0 mg/L bialaphos, and 8.0 g/L Sigma agar at pH 5.6
[0726] DNA was extracted by placing callus cell samples, two
stainless-steel beads, and 450 µl of extraction buffer (250 mM
NaCl, 200 mM Tris-HCl pH 7.4, 25 mM EDTA, 4.2 M Guanidine HCl) into
each tube of a Mega titer rack. The rack was shaken in the
Genogrinder at 1650 r.p.m. for 60 seconds and centrifuged at
3000×g for 20 min at 4° C. Three hundred µl of
supernatant was transferred to the wells of the Unifilter 96-well
DNA Binding GF/F Microplate (770-2810, Whatman, GE Healthcare). The
plate was placed on the top of a Multi-well plate vacuum manifold
(5017, Pall Life Sciences). Vacuum pressure was applied until the
wells were completely dried. The vacuum filtration procedure was
repeated one time with 100 µl extraction buffer and two times with
250 µl washing buffer (50 mM Tris-HCl pH 7.4, 200 mM NaCl, 70%
ethanol). The residual ethanol was removed by placing the GF/F
filter plate on an empty waste collection plate and centrifuging for
10 min at 3000×g. The DNA was eluted in 100 µl Elution Buffer
(10 mM Tris-HCl, pH 8.3) and centrifuged at 3000×g for 1 min.
For each sample, four PCR reactions were run. They included
approximately 40 ng genomic DNA, 10 µl REDExtract-N-Amp PCR
ReadyMix (R4775, Sigma-Aldrich Co.), and 5 picomoles of each primer
in a total volume of 20 µl. Primer combinations for each PCR
reaction are listed in Table 15.
TABLE-US-00016 TABLE 15 Primer combinations for PCR reactions.
PCR reaction | Primer sequence | SEQ ID NO: | PCR product
F-E2 | CCGAGGAGATCGTGCTGCA | 197 | Template randomly integrated or gene editing event
F-E2 | CAATGGCCGCATTGCAGTTC | 198 |
F-T | CCGAGGAGATCGTGCTGCA | 199 | Wild-type EPSPS allele
F-T | TGACCGCATTGCGATTCCAG | 200 |
H-T | TCCAAGTCGCTTTCCAACAGGATC | 201 | TIPS editing event
H-T | TGACCGCATTGCGATTCCAG | 202 |
F-E3 | CCGAGGAGATCGTGCTGCA | 203 | A fragment of the epsps locus for cloning and sequencing
F-E3 | ACCAAGCTGCTTCAATCCGACAAC | 204 |
[0727] The same PCR reactions were done on five samples of genomic
DNA obtained from untransformed maize inbred plantlets. After an
initial denaturation at 95° C. for 5 minutes, each PCR
amplification was carried out over 35 cycles using a DNA Engine
Tetrad2 Thermal Cycler (BioRad Laboratories, Hercules, Calif.) at
94° C. for 30 sec denaturation, 68° C. for 30 sec
annealing, and 72° C. for 1 min extension. PCR products
F-E2, F-T and H-T were separated in 1% agarose gel at 100 Volts for
45 minutes, with 100 bp DNA Ladder (N0467S, New England Biolabs).
For sequencing, the F-E3 PCR amplified fragments from selected
calli were cloned into pCR 2.1-TOPO vectors using the TOPO TA
Cloning Kit (Invitrogen Corp, Carlsbad, Calif.). DNA sequencing was
done with BigDye Terminator chemistry on ABI 3700 capillary
sequencing machines (Applied Biosystems, Foster City, Calif.). Each
sample contained about 0.5 µg TOPO plasmid DNA and 6.4 pmole primer
E3-EPex3 Rev (ACCAAGCTGCTTCAATCCGACAAC, SEQ ID NO: 204). Sequences
were analyzed using the Sequencer program.
[0728] A sample of thirty-one callus events selected on media
containing bialaphos (the moPAT selectable marker gene was part of
the guide RNA-moCas9 expression vector) was screened for the
presence of the TIPS point mutations. Twenty-four events contained
the TIPS point mutations integrated into genomic DNA (FIG. 16, the
F-E2 treatment). Among them, six events showed the PCR
amplification product of the chromosomal EPSPS gene with TIPS
mutations (FIG. 16, the H-T treatment). The pair of PCR primers
(one that can hybridize to the genomic epsps sequence not present
in the EPSPS polynucleotide modification template and the other one
binding to the edited EPSPS sequence present in the EPSPS
polynucleotide modification template) distinguished the EPSPS-TIPS
editing products from the wild-type epsps alleles or random
insertions of the TIPS mutations. If one EPSPS allele was edited to
contain the TIPS substitutions, it should be detected as a DNA
fragment originating from the genomic epsps locus, regardless of
whether the TIPS substitutions were selected for during the PCR
amplification process. The TIPS primer was replaced with the
wild-type EPSPS primer (Table 15, the F-E3 pair of primers) and the
PCR amplification products were cloned into the TOPO cloning
vectors and sequenced. The sequencing data represented a random
sample of the genomic epsps locus sequences in one of the selected
events (FIG. 17, callus A12 3360.92). FIG. 17 shows that the method
disclosed herein resulted in the successful nucleotide editing of
three nucleotides (FIG. 17 bold) responsible for the TIPS mutations
without altering any of the other epsps nucleotides, while the
moCas9 target sequence (the site of guide RNA binding underlined in
FIG. 17) was not mutagenized.
[0729] Also, the other EPSPS allele was not edited indicating that
only one EPSPS allele was edited in this particular event (FIG. 17,
lower section).
[0730] These data further show that the guide RNA/Cas system
disclosed herein can be used for gene editing, with gene editing
events recovered at a high efficiency of 1 out of fewer than 10
selected events.
Example 17
[0731] The Guide RNA/Cas Endonuclease System Delivers Double-Strand
Breaks to the Maize Epsps Locus Resulting in Maize Plants
Containing an EPSPS-TIPS Edited Gene.
[0732] The EPSPS gene edited events were produced and selected as
described in Example 16. In short, the EPSPS polynucleotide
modification template was co-delivered using particle gun
bombardment as a plasmid (see template vector 1, FIG. 15) together
with the guide RNA expression cassette and a maize optimized Cas9
endonuclease expression vector which contained the maize optimized
Cas9 endonuclease expression cassette described in FIG. 1A (Example
1, SEQ ID NO:5) and also contained a moPAT selectable marker
gene.
[0733] After six weeks of selection at 28° C., a small
amount of selected tissue was transferred onto the MS regeneration
medium and incubated for three weeks in the dark at 28° C.
After the three week incubation visible shoots were transferred to
plates containing the MS-1 medium and incubated at 26° C. in
the light for 1-2 weeks until they were ready to be sent to a
greenhouse and transferred into soil flats. The MS-1 medium
contained: 4.3 g/L MS salts (Gibco 11117), 5.0 ml/L MS Vitamins
Stock Solution (Sigma M3900), 100 mg/L myo-inositol, 40.0 g/L
sucrose, and 6.0 g/L Bacto-Agar at pH 5.6.
[0734] Using the procedures described above, 390 T0 maize plants
were produced originating from 3282 embryos, resulting in an
overall transformation efficiency of 12%, further indicating that
the guide RNA/Cas system used herein results in low or no toxicity
(Table 16).
TABLE-US-00017 TABLE 16 Transformation efficiency of the EPSPS
editing.
Treatment | # Embryos | # Calli selected | Selection efficiency | T0 plants to GH | Overall Efficiency
Particle bombardment | 3282 | 489 | 15% | 390 | 12%
[0735] DNA was extracted from each T0 plantlet 7-10 days after
transfer to the greenhouse and PCR procedures were conducted as
described in Example 16 to screen the T0 plants for mutations
at the epsps locus.
[0736] Seventy-two percent of analyzed T0 plants (270/375, Table
17) contained mutagenized EPSPS alleles as determined by the
end-point PCR procedure described in Example 16. Most of the
mutations (230/375 or 89%) were produced as a result of
error-prone non-homologous end joining (NHEJ) while forty T0 plants
(40/375 or 11%) contained the TIPS edited EPSPS alleles indicating
the involvement of a templated double-strand break repair mechanism
(Table 17).
TABLE-US-00018 TABLE 17 Mutations at the epsps locus.
Transformation | T0 Plants Analyzed | Mutations at the epsps locus | Mutation rate | TIPS editing | Gene Editing Rate (TIPS)
Particle bombardment | 375 | 270 | 72% | 40 | 11%
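The rates in Table 17 can be recomputed from the plant counts, as in the sketch below (variable names are illustrative). The 72% and 11% figures reproduce directly. The ratio 230/375 evaluates to about 61% of analyzed plants, or about 85% of the 270 mutagenized plants, so the 89% quoted in paragraph [0736] presumably rests on a different denominator that the text does not state.

```python
analyzed = 375        # T0 plants analyzed
mutagenized = 270     # plants with mutations at the epsps locus
tips_edited = 40      # plants carrying TIPS-edited alleles
nhej_only = mutagenized - tips_edited  # 230 plants attributed to NHEJ

def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)

mutation_rate = pct(mutagenized, analyzed)
tips_rate = pct(tips_edited, analyzed)
nhej_of_analyzed = pct(nhej_only, analyzed)
nhej_of_mutagenized = pct(nhej_only, mutagenized)
tips_of_mutagenized = pct(tips_edited, mutagenized)
```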
[0737] A pair of primers (Table 15, the F-E3 pair of primers) was
used to amplify a native, endogenous fragment of the epsps locus
containing the moCas9 target sequence and the EPSPS editing site
from the genomic DNA of selected T0 plants. The PCR amplification
products were cloned into the TOPO cloning vectors and sequenced as
described in Example 16. The sequencing data represent a random
sample of the genomic epsps locus sequences from a particular T0
plant (Table 18) and indicate the genotype of the selected T0
plants. The list of the EPSPS-TIPS allele-containing T0 plants
transferred to the pots is presented in Table 18 (a selected set of
T0 plants from the original 40 TIPS-containing events).
TABLE-US-00019 TABLE 18 The epsps locus genotypes observed in T0
plants. TIPS refers to a clone comprising the TIPS edited EPSPS
sequence. NHEJ refers to the presence of a NHEJ mutation and WT
refers to the presence of a wild-type EPSPS sequence amplified from
the native epsps locus.
Event (T0 plant) | Observed Sequences found at the epsps locus
E1 | 16 TIPS, 13 NHEJ
E2 | 28 TIPS, 0 NHEJ
E3 | 2 TIPS, 20 WT
E4 | 1 TIPS, 28 NHEJ
E5 | 2 TIPS, 2 NHEJ, 9 WT
E6 | 10 TIPS, 17 NHEJ
E7 | 12 TIPS, 17 NHEJ
E8 | 11 TIPS, 15 NHEJ
E9 | 17 TIPS, 10 NHEJ
[0738] As presented in Table 18, the selected plants of E1 and E3
to E9 contained the EPSPS-TIPS edited version of the EPSPS gene
either accompanied by a wild-type EPSPS allele (WT) or a NHEJ
mutagenized EPSPS allele (NHEJ). The numbers before TIPS, WT, NHEJ
in Table 18 indicate the frequency at which a particular version of
the EPSPS allele was identified. If all clones contained the
TIPS-edited EPSPS sequence, the analyzed plant was likely to be
homozygous for the EPSPS-TIPS allele (see for example E2). If only
about 50% of clones contained a TIPS-edited EPSPS sequence, the
analyzed plant was likely to be hemizygous for the EPSPS-TIPS
allele (see for example E1). Other plants, such as E3 or E4, were
likely to be chimeric for TIPS. In one event, E2, the T0 plant
contained only TIPS-edited sequence at the epsps locus indicating
that the guide RNA/Cas endonuclease system disclosed herein
resulted in the successful nucleotide editing of three nucleotides
(FIG. 17 bold) responsible for the two EPSPS-TIPS alleles at the
epsps locus in maize plants.
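The interpretation of clone counts given above can be written as a simple rule. This is a sketch only: the text says "about 50%" without giving limits, so the 35% to 65% band used here is an assumed cutoff, and the function name and return labels are illustrative.

```python
def classify_tips_genotype(tips, nhej=0, wt=0, low=0.35, high=0.65):
    """Interpret clone counts from one T0 plant (Table 18 style).

    low/high bound what is treated as "about 50%" TIPS clones;
    these limits are assumptions, not values from the disclosure.
    """
    total = tips + nhej + wt
    if total == 0:
        raise ValueError("no clones sequenced")
    fraction = tips / total
    if fraction == 1.0:
        return "likely homozygous for EPSPS-TIPS"
    if low <= fraction <= high:
        return "likely hemizygous for EPSPS-TIPS"
    if 0 < fraction < low:
        return "likely chimeric for TIPS"
    if fraction == 0:
        return "no TIPS clones detected"
    return "indeterminate"
```

With these cutoffs the four events named in the text fall where the text places them: E2 as homozygous, E1 as hemizygous, and E3 and E4 as chimeric. The calls for the remaining events shift with the chosen band.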
[0739] A qPCR analysis was performed on the selected T0 plants to
estimate the copy number of the wild-type EPSPS genes and the
moCas9 endonuclease sequences. Multiplex qPCR amplifications of the
maize EPSPS gene and the ADH housekeeping gene were carried out on
the DNA samples from T0 plants. The primers and probes used in the
PCR reaction are shown in Table 19.
TABLE-US-00020 TABLE 19 Primers used in qPCR analysis of T0 plants.
Primer/probe | Primary PCR Primer Sequence | SEQ ID NO:
primer qADH F | 5'-CAAGTCGCGGTTTTCAATCA-3' | SEQ ID NO: 217
Primer qADH R | 5'-TGAAGGTGGAAGTCCCAACAA-3' | SEQ ID NO: 218
probe ADH-VIC | VIC-TGGGAAGCCTATCTACCAC | SEQ ID NO: 219
Probe wtEPSPS | 6FAM-CGGCCATTGACAGCA-MGB-NFQ | SEQ ID NO: 220
Forward primer qEPSPS F | 5'-TCTTGGGGAATGCTGGAACT-3' | SEQ ID NO: 221
reverse primer qEPSPS R | 5'-CACCAGCAGCAGTAACAGCTG-3' | SEQ ID NO: 222
FAM-wtEPSPS R probe | 6FAM-TGCTGTCAATGGCCGCA | SEQ ID NO: 223
forward primer qEPSPS F | 5'-TCTTGGGGAATGCTGGAACT-3' | SEQ ID NO: 224
reverse primer q wtEPSPS RA | 5'-CCACCAGCAGCAGTAACAGC-3' | SEQ ID NO: 225
[0740] All analyses were conducted using the LightCycler 480
Real-Time PCR System (Roche Diagnostics). A threshold value for the
wtEPSPS genotype was set at 1.76. Every sample showing less than
1.76 copies of EPSPS, with the end-point fluorescence measurements
up to two times lower than the wild-type control, was categorized
as the One Allele EPSPS genotype (hemizygous for the wild-type
EPSPS allele).
[0741] A qPCR method was used to estimate the TIPS sequence copy
number. The primers and probes used in the qPCR reaction are shown
in Table 20.
TABLE-US-00021 TABLE 20 Primers used in qPCR analysis to estimate
the TIPS sequence copy number.
Primer/probe | Primary PCR Primer Sequence | SEQ ID NO:
forward primer q epTIPS F | 5'-GGAAGTGCAGCTCTTCTTGGG-3' | SEQ ID NO: 226
reverse primer q epTIPS R | 5'-AGCTGCTGTCAATGACCGC-3' | SEQ ID NO: 227
TIPS probe | 6FAM-AATGCTGGAATCGCA | SEQ ID NO: 228
[0742] A comparative Ct method with Delta Ct values normalized to
the average Delta Ct from the bi-allelic TIPS genotypes provided a
copy number estimation for the TIPS sequence detected in the
analyzed plant samples.
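A comparative Ct calculation of this kind is commonly written as shown below. The sketch makes several assumptions that the text does not spell out: Delta Ct is taken as target Ct minus reference-gene Ct, amplification efficiency is 100% (a factor of 2 per cycle), and the bi-allelic TIPS calibrator carries two copies. Function names are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """Delta Ct: target assay Ct minus reference (housekeeping) assay Ct."""
    return ct_target - ct_reference

def tips_copy_number(sample_delta_ct, calibrator_delta_cts, calibrator_copies=2):
    """Estimate copy number relative to bi-allelic TIPS calibrator samples.

    Assumes perfect doubling per cycle and two copies in the calibrator.
    """
    mean_calibrator = sum(calibrator_delta_cts) / len(calibrator_delta_cts)
    delta_delta_ct = sample_delta_ct - mean_calibrator
    return calibrator_copies * 2 ** (-delta_delta_ct)
```

A sample whose Delta Ct equals the calibrator average is scored as two copies; each cycle earlier doubles the estimate and each cycle later halves it.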
TABLE-US-00022 TABLE 21 qPCR genotyping and copy number of selected
T0 plants.
Event name | TIPS EPSPS allele | Wild-type EPSPS allele | TIPS copy # | moCas9 coding sequence
E1 | positive | Null | 5 | positive
E2 | positive | Null | 2 | positive
E7 | positive | Null | 6 | positive
E8 | positive | Null | 1 | positive
E9 | positive | Null | 3 | positive
[0743] The qPCR genotyping indicated that no wild-type EPSPS
alleles were detected in the selected T0 plants of Events E1, E2,
E7, E8 and E9 (Table 21). Both the TIPS template sequences and the
moCas9 coding sequence were found in the selected T0 plants,
presumably, as a result of random insertions associated with the
transformation process (Table 21: for the TIPS template sequences
E1, E7, and E9 T0 plants). Both genetic elements (the randomly
inserted TIPS templates and the moCas9 expression cassette) can be
segregated out by standard breeding procedures in the T1 progeny
generation, if not linked to the edited EPSPS-TIPS gene.
[0744] T0 plants grew well in the greenhouse and were fertile. A
sample of T0 plants was sprayed with a 1× dose of glyphosate
(Roundup Powermax) at V3 growth stage using the spray booth setting
of 20 gallons per acre. The 1× dose of glyphosate was
prepared as follows: 2.55 ml Powermax in 300 ml water (active
ingredient: glyphosate, N-(phosphonomethyl)glycine, in the form of
its potassium salt at 48.7%). Seven days after glyphosate
application, no leaf tissue damage was observed in some of the T0
plants. These plantlets were hemizygous for the EPSPS-TIPS alleles,
while other plantlets were severely damaged. One plant showing no
damage to the leaf tissue 14 days after herbicide application
contained 21 EPSPS-TIPS alleles among 44 genomic clones of the
epsps locus (cloned and sequenced as described in Example
16).
[0745] These data indicate that a guide RNA/Cas system can be used
to create a TIPS-edited EPSPS allele in maize. Maize plants
homozygous at the epsps-tips locus (two EPSPS alleles edited) with
no additional insertion of the TIPS template (plant E2) were
obtained. Furthermore, some EPSPS-TIPS edited maize plants did show
some level of tolerance against a 1× dose of glyphosate.
Example 18
[0746] Guide RNA/Cas Endonuclease Mediated DNA Cleavage in Maize
Chromosomal Loci Enables Transgene Insertion in an Elite Maize
Line
[0747] To test whether a maize optimized guide RNA/Cas system can
cleave a maize chromosomal locus and enable homologous
recombination (HR) mediated pathways to site-specifically insert a
transgene in an elite maize line, 4 loci were selected on maize
chromosome 1 located between 51.54 cM and 54.56 cM (FIG. 18). Two
target sites for a Cas endonuclease were identified at each of the
four loci and are referred to as MHP14Cas-1, MHP14Cas-3, TS8Cas-1,
TS8Cas-2, TS9Cas-2, TS9Cas-3, TS10Cas-1 and TS10Cas-3 (FIG. 19,
Table 22, SEQ ID NOs:229-236).
TABLE-US-00023 TABLE 22 Maize genomic target sites targeted by a
guide RNA/Cas endonuclease.
Locus | Location | Maize Genomic Target Site | Target Site Sequence | PAM | SEQ ID NO:
MHP14 | Chr. 1: 51.54cM | MHP14Cas-1 | gttaaatctgacgtgaatctgtt | TGG | 229
MHP14 | | MHP14Cas-3 | acaaacattgaagcgacatag | TGG | 230
TS8 | Chr. 1: 52.56cM | TS8Cas-1 | gtacgtaacgtgcagtac | TGG | 231
TS8 | | TS8Cas-2 | gctcatcagtgatcagctgg | TGG | 232
TS9 | Chr. 1: 53.56cM | TS9Cas-2 | ggctgtttgcggcctcg | AGG | 233
TS9 | | TS9Cas-3 | gcctcgaggttgcacgcacgt | CGG | 234
TS10 | Chr. 1: 54.56cM | TS10Cas-1 | gcctcgccttcgctagttaa | GGG | 235
TS10 | | TS10Cas-3 | gctcgtgttggagataca | GGG | 236
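A quick consistency check on the Table 22 entries is sketched below. The sequences are transcribed from the flattened table, so the lengths reported reflect that transcription; the names `TARGET_SITES` and `summarize` are illustrative.

```python
import re

# (target site sequence, PAM) transcribed from Table 22.
TARGET_SITES = {
    "MHP14Cas-1": ("gttaaatctgacgtgaatctgtt", "TGG"),
    "MHP14Cas-3": ("acaaacattgaagcgacatag", "TGG"),
    "TS8Cas-1": ("gtacgtaacgtgcagtac", "TGG"),
    "TS8Cas-2": ("gctcatcagtgatcagctgg", "TGG"),
    "TS9Cas-2": ("ggctgtttgcggcctcg", "AGG"),
    "TS9Cas-3": ("gcctcgaggttgcacgcacgt", "CGG"),
    "TS10Cas-1": ("gcctcgccttcgctagttaa", "GGG"),
    "TS10Cas-3": ("gctcgtgttggagataca", "GGG"),
}

NGG = re.compile(r"[ACGT]GG")

def summarize(sites):
    """Report length, NGG conformity of the PAM, and whether the site begins with G."""
    return {
        name: {
            "length": len(seq),
            "pam_is_ngg": NGG.fullmatch(pam) is not None,
            "starts_with_g": seq.lower().startswith("g"),
        }
        for name, (seq, pam) in sites.items()
    }

summary = summarize(TARGET_SITES)
```

All eight PAMs fit the NGG pattern, and seven of the eight transcribed sequences begin with G.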
[0748] The maize optimized Cas endonuclease cassette (SEQ ID NO: 5)
was prepared as described in Example 1. Long guide RNA expression
cassettes comprising a variable targeting domain targeting one of
the 8 genomic target sites, driven by a maize U6 polymerase III
promoter, and terminated by a maize U6 polymerase III terminator
were designed as described in Examples 1 and 3 and listed in Table
23. A donor DNA (HR repair DNA) containing a selectable marker (a
phosphomannose-isomerase (PMI) expression cassette) flanked by two
homologous regions was constructed using standard molecular biology
techniques (FIG. 20).
TABLE-US-00024 TABLE 23 List of guide RNA (gRNA) and Donor DNA
expression cassettes
Locus | Target Site | gRNA (SEQ ID NO:) | Donor DNA (SEQ ID NO:)
MHP14 | MHP14Cas-1 | 245 | 253
MHP14 | MHP14Cas-3 | 246 | 254
TS8 | TS8Cas-1 | 247 | 255
TS8 | TS8Cas-2 | 248 | 256
TS9 | TS9Cas-2 | 249 | 257
TS9 | TS9Cas-3 | 250 | 258
TS10 | TS10Cas-1 | 251 | 259
TS10 | TS10Cas-3 | 252 | 260
[0749] A vector containing the maize optimized Cas9 endonuclease of
SEQ ID NO: 5, a vector containing one of eight long guide RNA
expression cassettes of SEQ ID NOs: 245-252, and a vector
containing one of eight donor DNAs of SEQ ID NOs: 253-260 were
co-delivered to maize elite line immature embryos by
particle-mediated delivery as described in Example 10. About 1000
embryos were bombarded for each target site. Since the donor DNA
contained a selectable marker, PMI, successful delivery of the
donor DNA allowed for callus growth on mannose media. Putative
HR-mediated transgenic insertions were selected by placing the
callus on mannose-containing media. After selection, stable shoots
on maturation plates were sampled and total genomic DNA was
extracted. Using the primer pairs shown in Table 24 (corresponding
to SEQ ID NOs: 261-270), PCR amplification was carried out at both
possible transgene-genomic DNA junctions to identify putative
HR-mediated transgenic insertions.
TABLE-US-00025 TABLE 24 Primer sequences used for integration event
screening at each target site. SEQ Target ID Locus Site Junction
Primer NO: UBIR donor 1 CCATGTCTAACTGTTCA 261 TTTATATGATTCTCT PSBF
donor 2 GCTCGTGTCCAAGCGTC 262 ACTTACGATTAGCT MHP14 MHP14Cas-1
14-1HR1f CTCACATGAGGCTCTTC 263 MHP14Cas-3 TTTGCTTGCT 14-1HR2r
AGGATCCTATTCCCCAA 264 TTTGTAGAT CHR1-8 TS8Cas-1 8HR1f
CAGTCCGTGGATTGAAG 265 CCAT TS8Cas-2 8HR2r CTCTGTCTCCGAGACGT 266
GCTTA CHR1-9 TS9Cas-2 9HR1f GGAGCAAATGTTTTAGG 267 TATGAAATG
TS9Cas-3 9HR2r CGGATTCTAAAGATCAT 268 ACGTAAATGAA CHR1-10 TS10Cas-1
10HR1f TGGCTTGTCTATGCGCA 269 TS10Cas-3 TCTC 10HR2r
CCAGACCCAAACAGCAG 270 GTT
The same genomic primers were used for each of the two target sites
at one locus. The resulting amplifications were sequenced to
determine if these sites were mutated or contained a transgene
insertion.
[0750] The "Event Recovery frequency" was calculated using the
number of events recovered divided by the total number of embryos
bombarded, and may indicate if an endonuclease has some toxic
effect or not (Table 26). Hence, if 1000 embryos were bombarded and
240 were recovered, the Event Recovery frequency is 24%. Table 26
indicates that for all target sites analyzed the Event Recovery
frequency ranged between 17 and 28%, indicating that the guide
RNA/Cas system used herein results in low or no toxicity. Cas
endonuclease activity was measured in planta by determining the
"Target Site Mutation frequency" (Table 26), which is defined as:
(number of events with target site modification/total number
recovered events)*100%. Hence, if 240 events were recovered and 180
events showed a mutation, the Target Site Mutation frequency is
75%. The target site mutation frequency was measured using target
site allele copy number as described in Example 9 of U.S.
application Ser. No. 13/886,317, filed on May 3, 2013. The primers
and probes for obtaining the target site copy number using qPCR at
each site were as listed in Table 25 (SEQ ID NO: 271-294).
TABLE-US-00026 TABLE 25 Primer and probe sequences used to assess
DNA cleavage at 8 maize genomic target sites

Target Site Designation  Probe/primers  Primer sequence                 SEQ ID NO:
MHP14Cas-1               probe          CAGATTCACGTCAGATTT              271
MHP14Cas-1               forward        CATAGTGGTGTATGAAAGGAAGCACTT     272
MHP14Cas-1               reverse        CATTTTGGATTGTAATATGTGTACCTCATA  273
MHP14Cas-3               probe          CACCACTATGTCGCTTC               274
MHP14Cas-3               forward        CGGATGCACGAAAATTGTAGGA          275
MHP14Cas-3               reverse        CTGACGTGAATCTGTTTGGAATTG        276
TS8Cas-1                 probe          TACGTAACGTGCAGTACT              277
TS8Cas-1                 forward        ACGGACGGACCATACGTTATG           278
TS8Cas-1                 reverse        TCAGCTGGTGGAGTATATTAGTTCGT      279
TS8Cas-2                 probe          CCAGCTGATCACTGATGA              280
TS8Cas-2                 forward        ACGGACGGACCATACGTTATG           281
TS8Cas-2                 reverse        CGCACATGTTATAAATTACAATGCAT      282
TS9Cas-2                 probe          CTGTTTGCGGCCTC                  283
TS9Cas-2                 forward        CTGCGGAGCTGCTGGCGAT             284
TS9Cas-2                 reverse        CTTGCTGGCTTCGTCTGTCA            285
TS9Cas-3                 probe          CCGACGTGCGTGCAA                 286
TS9Cas-3                 forward        CTGCGGAGCTGCTGGCGAT             287
TS9Cas-3                 reverse        CTTGCTGGCTTCGTCTGTCA            288
TS10Cas-1                probe          TCGCCTTCGCTAGTTAA               289
TS10Cas-1                forward        AAGACCTGGCCGGTTTTCCA            290
TS10Cas-1                reverse        TAGCGGCCATTGCCATCA              291
TS10Cas-3                probe          CTGTATCTCCAACACGAGC             292
TS10Cas-3                forward        AAGACCTGGCCGGTTTTCCA            293
TS10Cas-3                reverse        TAGCGGCCATTGCCATCA              294
As shown in Table 26, all 8 guide RNA/Cas9 systems were very
efficient in cleaving their target DNA and inducing mutations by
non-homologous end joining (NHEJ), as evidenced by mutation
frequencies ranging from 33% to 90%.
[0751] All events were also screened for the presence of an
inserted transgene. The insertion event screening for each target
site is illustrated in FIG. 21. The primers used for insertion PCR
analysis at each site are listed in Table 24. FIG. 22 shows one
example of an insertion event screening PCR result. The frequency
of transgene insertion was determined by calculating the "Insertion
frequency" which is defined as: (number of events with target site
insertion/total number recovered events)*100%. Hence, if 240 events
were recovered and 21 events showed a transgene insertion, the
Insertion frequency is approximately 9% (21/240 = 8.75%).
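The three frequencies defined above can be illustrated with a minimal Python sketch. The function and key names are hypothetical and are not part of the disclosure; the input counts are the worked numbers given in the text (1000 embryos bombarded, 240 events recovered, 180 events with a target site mutation, 21 events with a transgene insertion).

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


def event_metrics(embryos_bombarded: int, events_recovered: int,
                  events_mutated: int, events_inserted: int) -> dict:
    """Compute the three frequencies as they are defined in the text."""
    return {
        # events recovered / embryos bombarded
        "event_recovery": percent(events_recovered, embryos_bombarded),
        # events with target site modification / events recovered
        "target_site_mutation": percent(events_mutated, events_recovered),
        # events with target site insertion / events recovered
        "insertion": percent(events_inserted, events_recovered),
    }


metrics = event_metrics(1000, 240, 180, 21)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```

Note that the mutation and insertion frequencies share the same denominator (recovered events), whereas the event recovery frequency is relative to the number of embryos bombarded.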
TABLE-US-00027 TABLE 26 Activity of the guide RNA/Cas 9 system at 8
target sites as determined by target site mutation frequency and
transgene insertion frequency at the desired target site in maize
plant tissue

Target Site  Event Recovery (%)  Target Site Mutation frequency (%)  Insertion frequency (%)
TS10Cas-1    24%                 75%                                 9% (7*)
TS10Cas-3    22%                 83%                                 16% (20*)
TS8Cas-1     17%                 90%                                 14% (9*)
TS8Cas-2     27%                 84%                                 8% (10*)
MHP14Cas-1   17%                 33%                                 2% (2*)
MHP14Cas-3   28%                 68%                                 4% (1*)
TS9Cas-2     23%                 62%                                 8%**
TS9Cas-3     28%                 84%                                 8%**

*Number of events with HR1 and HR2 both junctions positive
**only HR2 junction available
[0752] Sequence-confirmed PCR amplifications indicated a
site-specific transgene insertion for each of the 8 target sites, as
shown in Table 26 (column Insertion frequency). A transgene
cassette was inserted at all 8 target sites with high efficiency
(2-16%). The number of events containing amplifications across both
transgene-genomic DNA junctions, indicating near perfect
site-specific transgene insertion, is shown in parentheses in Table
26.
[0753] Taken together, these data demonstrate that maize
chromosomal loci cleaved with the maize optimized guide RNA/Cas
system described herein can be used to insert transgenes at high
frequencies in a maize elite inbred line.
Example 19
[0754] Delivery of the Guide RNA/Cas9 Endonuclease System DNA to
Soybean by Stable Transformation
[0755] A soybean U6 small nuclear RNA promoter (GM-U6-9.1; SEQ ID
NO: 295) was identified in a similar manner as the soybean promoter
GM-U6-13.1 (SEQ ID NO:120) described in Example 12. The GM-U6-9.1
promoter was used to express guide RNA to direct Cas9 nuclease to
a designated genomic target site.
[0756] A soybean codon optimized Cas9 endonuclease expression
cassette (such as for example EF1A2:CAS9, SEQ ID NO: 296) and a
guide RNA expression cassette (such as for example U6-9.1:DD20CR1;
SEQ ID NO: 297) were linked (such as U6-9.1: DD20CR1+EF1A2:CAS9;
SEQ ID NO: 298, FIG. 23A) and integrated into a DNA plasmid that
was co-delivered with another plasmid comprising a donor DNA
(repair DNA) cassette (such as DD20HR1-SAMS:HPT-DD20HR2; SEQ ID NO:
299) to young soybean somatic embryos in the form of embryogenic
suspension cultures by particle gun bombardment (FIGS. 23A and
23B). Other guide RNA/Cas9 DNA constructs targeting various soybean
genomic sites and donor DNA constructs for site-specific transgene
integration through homologous recombination were similarly
configured and are listed in Table 27. The four gRNA/Cas9
constructs differed only in the 20 bp guide RNA targeting domain
(variable targeting domain) targeting the soybean genomic target
sites DD20CR1 (SEQ ID NO: 125), DD20CR2 (SEQ ID NO: 126), DD43CR1
(SEQ ID NO: 127), or DD43CR2 (SEQ ID NO: 128). The two donor DNA
constructs differed only in the homologous regions, such as DD20HR1
and DD20HR2 (FIG. 23B), or DD43HR1 and DD43HR2. These guide RNA/Cas9
DNA constructs and donor DNAs were co-delivered to an elite (93B86)
or a non-elite (Jack) soybean genome by the stable transformation
procedure described below.
TABLE-US-00028 TABLE 27 Guide RNA/Cas9 Mediated Soybean Stable
Transformation.

Experiment     Guide RNA/Cas9               Donor DNA                 SEQ ID NOs:
U6-9.1DD20CR1  U6-9.1:DD20CR1 + EF1A2:CAS9  DD20HR1-SAMS:HPT-DD20HR2  298, 299
U6-9.1DD20CR2  U6-9.1:DD20CR2 + EF1A2:CAS9  DD20HR1-SAMS:HPT-DD20HR2  300, 299
U6-9.1DD43CR1  U6-9.1:DD43CR1 + EF1A2:CAS9  DD43HR1-SAMS:HPT-DD43HR2  301, 302
U6-9.1DD43CR2  U6-9.1:DD43CR2 + EF1A2:CAS9  DD43HR1-SAMS:HPT-DD43HR2  303, 302
[0757] Soybean somatic embryogenic suspension cultures were induced
from a DuPont Pioneer proprietary elite cultivar 93B86 as follows.
Cotyledons (~3 mm in length) were dissected from surface
sterilized, immature seeds and were cultured for 6-10 weeks in the
light at 26° C. on Murashige and Skoog (MS) medium
containing 0.7% agar and supplemented with 10 mg/ml 2,4-D
(2,4-Dichlorophenoxyacetic acid). Globular stage somatic embryos,
which produced secondary embryos, were then excised and placed into
flasks containing liquid MS medium supplemented with 2,4-D (10
mg/ml) and cultured in light on a rotary shaker. After repeated
selection for clusters of somatic embryos that multiplied as early,
globular staged embryos, the soybean embryogenic suspension
cultures were maintained in 35 ml liquid media on a rotary shaker,
150 rpm, at 26° C. with fluorescent lights on a 16:8 hour
day/night schedule. Cultures were subcultured every two weeks by
inoculating approximately 35 mg of tissue into 35 ml of the same
fresh liquid MS medium.
[0758] Soybean embryogenic suspension cultures were then
transformed by the method of particle gun bombardment using a
DuPont Biolistic™ PDS1000/HE instrument (Bio-Rad Laboratories,
Hercules, Calif.). To 50 µl of a 60 mg/ml 1.0 mm gold particle
suspension were added, in order: 30 µl of equal amounts (30
ng/µl) of plasmid DNA comprising, for example,
U6-9.1:DD20CR1+EF1A2:CAS9 (SEQ ID NO: 298) and plasmid DNA
comprising, for example, DD20HR1-SAMS:HPT-DD20HR2 (SEQ ID NO: 299)
(Experiment U6-9.1DD20CR1 listed in Table 27); 20 µl of 0.1 M
spermidine; and 25 µl of 5 M CaCl2. The particle
preparation was then agitated for 3 minutes and spun in a centrifuge
for 10 seconds, and the supernatant was removed. The DNA-coated
particles were then washed once in 400 µl of 100% ethanol and
resuspended in 45 µl of 100% ethanol. The DNA/particle
suspension was sonicated three times for one second each. Then 5
µl of the DNA-coated gold particles was loaded on each macro
carrier disk.
[0759] Approximately 300-400 mg of a two-week-old suspension
culture was placed in an empty 60×15 mm Petri dish and the
residual liquid removed from the tissue with a pipette. For each
transformation experiment, approximately 5 to 10 plates of tissue
were bombarded. Membrane rupture pressure was set at 1100 psi and
the chamber was evacuated to a vacuum of 28 inches mercury. The
tissue was placed approximately 3.5 inches away from the retaining
screen and bombarded once. Following bombardment, the tissue was
divided in half and placed back into liquid media and cultured as
described above.
[0760] Five to seven days post bombardment, the liquid media was
exchanged with fresh media containing 30 mg/ml hygromycin as
selection agent. This selective media was refreshed weekly. Seven
to eight weeks post bombardment, green, transformed tissue was
observed growing from untransformed, necrotic embryogenic clusters.
Isolated green tissue was removed and inoculated into individual
flasks to generate new, clonally propagated, transformed
embryogenic suspension cultures. Each clonally propagated culture
was treated as an independent transformation event and subcultured
in the same liquid MS media supplemented with 2,4-D (10 mg/ml) and
30 ng/ml hygromycin selection agent to increase mass. The
embryogenic suspension cultures were then transferred to agar solid
MS media plates without 2,4-D supplement to allow somatic embryos
to develop. A sample of each event was collected at this stage for
quantitative PCR analysis.
[0761] Cotyledon stage somatic embryos were dried-down (by
transferring them into an empty small Petri dish that was seated on
top of a 10 cm Petri dish containing some agar gel to allow slow
dry down) to mimic the last stages of soybean seed development.
Dried-down embryos were placed on germination solid media and
transgenic soybean plantlets were regenerated. The transgenic
plants were then transferred to soil and maintained in growth
chambers for seed production. Transgenic events were sampled at
somatic embryo stage or T0 leaf stage for molecular analysis.
[0762] Similar transformation experiments (U6-9.1 DD20CR2, U6-9.1
DD43CR1, U6-9.1DD43CR2) with the components listed in Table 27 and
using the elite cultivar 93B86 were performed as described
above.
[0763] Two transformation experiments, U6-9.1 DD20CR1 and U6-9.1
DD43CR1 listed in Table 27, were also performed in a non-elite
soybean cultivar "Jack" to test the gRNA/Cas9 system performance in
different soybean genotypes.
Example 20
Detection of Site-Specific NHEJ Mediated by the Guide RNA/Cas9
System in Stably Transformed Soybean
[0764] Genomic DNA was extracted from somatic embryo samples and
analyzed by quantitative PCR using a 7500 real time PCR system
(Applied Biosystems, Foster City, Calif.) with target site-specific
primers and FAM-labeled fluorescence probe to check copy number
changes of the target site DD20 or DD43 (FIG. 24 A-C). The qPCR
analysis was done in duplex reactions with a heat shock protein
(HSP) gene as the endogenous control and a wild type 93B86 genomic
DNA sample, which contains one copy of the target site with 2
alleles, as the single copy calibrator. The HSP endogenous control
qPCR employed primer probe set HSP-F/HSP-T/HSP-R. The DD20-CR1 (SEQ
ID NO:306) and DD20-CR2 (SEQ ID NO:307) specific qPCR employed
primer probe set DD20-F (SEQ ID NO:308)/DD20-T (SEQ ID
NO:309)/DD20-R (SEQ ID NO:310). The DD43-CR1 (SEQ ID NO:311)
specific qPCR employed primer probe set DD43-F (SEQ ID
NO:313)/DD43-T (SEQ ID NO:315)/DD43-R (SEQ ID NO:316) while the
DD43-CR2 (SEQ ID NO:312) specific qPCR employed primer probe set
DD43-F2 (SEQ ID NO:314)/DD43-T/DD43-R. The guide RNA/Cas9 DNA (SEQ
ID NOs: 298, 300, 301, and 303) specific qPCR employed primer probe
set Cas9-F (SEQ ID NO:317)/Cas9-T (SEQ ID NO:318)/Cas-9-R (SEQ ID
NO:319). The donor DNA (SEQ ID NOs: 299 and 302) specific qPCR
employed primer probe set Sams-76F (SEQ ID NO:320)/FRT1I63-T (SEQ
ID NO:321)/FRT1I-41F (SEQ ID NO:322). The endogenous control probe
HSP-T was labeled with VIC and the gene-specific probes DD20-T,
DD43-T, Cas9-T, and FRT1I63-T were labeled with FAM for the
simultaneous detection of both fluorescent probes (Applied
Biosystems). PCR reaction data were captured and analyzed using the
sequence detection software provided with the 7500 real time PCR
system and the gene copy numbers were calculated using the relative
quantification methodology (Applied Biosystems).
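The text states only that copy numbers were calculated by "the relative quantification methodology". As an illustration, the sketch below uses the common comparative Ct (2^-ddCt) form of relative quantification. This specific formula, the assumption of roughly 100% amplification efficiency, the function name and the Ct values are all assumptions made for the example and are not taken from the disclosure.

```python
def relative_copy_number(ct_target_sample: float,
                         ct_control_sample: float,
                         ct_target_calibrator: float,
                         ct_control_calibrator: float,
                         calibrator_copies: float = 1.0) -> float:
    """Comparative Ct estimate of target copy number relative to a calibrator.

    Assumes (hypothetically) equal, near-100% amplification efficiency for
    the FAM-labeled target assay and the VIC-labeled HSP control assay.
    """
    # Normalize the target signal to the endogenous control in each sample.
    delta_sample = ct_target_sample - ct_control_sample
    delta_calibrator = ct_target_calibrator - ct_control_calibrator
    # Each extra cycle relative to the calibrator halves the estimate.
    delta_delta = delta_sample - delta_calibrator
    return calibrator_copies * 2.0 ** (-delta_delta)


# Hypothetical Ct values: an unchanged event and an event lagging by one cycle.
unchanged = relative_copy_number(25.0, 22.0, 25.0, 22.0)
one_cycle_later = relative_copy_number(26.0, 22.0, 25.0, 22.0)
print(unchanged, one_cycle_later)
```

Under these assumptions, an event whose target signal lags the wild type calibrator by one cycle is estimated at half a copy, which corresponds to one of two alleles no longer being detected by the target site-specific assay.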
[0765] Since the wild type 93B86 genomic DNA with two alleles of
the target site was used as the single copy calibrator, events
without any change of the target site would be detected as one copy,
herein termed Wt-Homo (qPCR value ≥0.7); events with one allele
changed, which is no longer detectable by the target site-specific
qPCR, would be detected as half copy, herein termed NHEJ-Hemi (qPCR
value between 0.1 and 0.7); and events with both alleles changed
would be detected as null, herein termed NHEJ-Null (qPCR
value ≤0.1). The wide range of the qPCR values suggested that
most of the events contained mixed mutant and wild type sequences
of the target site. High percentages of NHEJ-Hemi (ranging from 10.1
to 33.5%, Table 28) and NHEJ-Null (ranging from 32.3 to 46.4%,
Table 28) events were detected in all four experiments, with combined
NHEJ average frequencies of more than 60% (Table 28).
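The qPCR thresholds described above can be written as a small classification routine. This is an illustrative sketch only; the function name is hypothetical, and the cutoffs (0.7 and 0.1) are the ones stated in the text.

```python
def classify_event(qpcr_value: float) -> str:
    """Assign an event to a genotype class from its target site qPCR value."""
    if qpcr_value >= 0.7:
        return "Wt-Homo"    # both alleles still detected (about one copy)
    if qpcr_value <= 0.1:
        return "NHEJ-Null"  # neither allele detected
    return "NHEJ-Hemi"      # between 0.1 and 0.7 (about half a copy)


# Hypothetical qPCR values for five events.
calls = [classify_event(v) for v in (1.02, 0.70, 0.45, 0.10, 0.03)]
print(calls)
```

Because many events are described as containing mixed mutant and wild type sequences, values close to a cutoff are best read as approximate class assignments rather than exact genotypes.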
TABLE-US-00029 TABLE 28 Target Site Mutations and Site Specific
Gene Integration Induced by the Guide RNA/Cas9 system in elite
soybean germplasm. Numbers indicate no. of events (numbers in
parentheses are %). NA = not analyzed.

Project        Total event  Wt-Homo (%)  NHEJ-Hemi (%)  NHEJ-Null (%)  Insertion Frequency (%)
U6-9.1DD20CR1  239          85 (35.6%)   77 (32.2%)     77 (32.2%)     11 (4.6%)
U6-9.1DD20CR2  79           43 (54.4%)   8 (10.1%)      28 (35.4%)     NA
U6-9.1DD43CR1  263          53 (20.2%)   88 (33.5%)     122 (46.4%)    10 (3.8%)
TABLE-US-00030 TABLE 29 Target Site Mutations and Site Specific
Gene Integration Induced by the Guide RNA/Cas9 system in non-elite
soybean germplasm. Numbers indicate no. of events (numbers in
parentheses are % of the total analyzed events).

Project             Total event  Wt-Homo (%)  NHEJ-Hemi (%)  NHEJ-Null (%)  Insertion frequency (%)
U6-9.1DD20CR1-Jack  149          99 (66.4%)   34 (22.8%)     16 (10.7%)     0 (0%)
U6-9.1DD43CR1-Jack  141          84 (59.6%)   27 (19.1%)     30 (21.3%)     1 (0.7%)
[0766] Both NHEJ-Hemi and NHEJ-Null events were detected in the two
experiments, U6-9.1DD20CR1-Jack and U6-9.1DD43CR1-Jack, repeated in
the "Jack" genotype, though at lower frequencies (Table 29). The
differences between NHEJ frequencies were likely caused by
variations between transformation experiments.
[0767] The target regions of NHEJ-Null events were amplified by
regular PCR from the same genomic DNA samples using DD20-LB (SEQ ID
NO: 323) and DD20-RB (SEQ ID NO: 326) primers specific respectively
to DD20-HR1 and DD20-HR2 for DD20 target site specific HR1-HR2 PCR
amplicon (FIG. 25 A-C; SEQ ID NO: 329), or DD43-LB (SEQ ID NO: 327)
and DD43-RB (SEQ ID NO: 328) primers specific respectively to
DD43-HR1 and DD43-HR2 for DD43 target site specific HR1-HR2 PCR
amplicon (SEQ ID NO: 332). The PCR bands were cloned into pCR2.1
vector using a TOPO-TA cloning kit (Invitrogen) and multiple clones
were sequenced to check for target site sequence changes as the
results of NHEJ. Various small deletions at the Cas9 cleavage site,
3 bp upstream of the PAM, were revealed at all four tested target
sites (FIG. 26 A-C). Small insertions were also detected in some
sequences. Different mutated sequences were identified from some of
the same events indicating the chimeric nature of these events.
Some of the same mutated sequences were also identified from
different events suggesting that the same mutations could have
happened independently or some of the events could be clonal
events. These sequence analyses confirmed the occurrence of NHEJ
mediated by the guide RNA/Cas9 system at the specific Cas9 target
sites.
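The position of the Cas9 cleavage site noted above (3 bp upstream of the PAM) can be located with a short sketch. The helper names and the 20-nt example sequence are hypothetical and serve only to illustrate the arithmetic; an NGG PAM is assumed, consistent with the PAM sequences listed for the target sites herein.

```python
def cas9_cut_index(protospacer: str, pam: str) -> int:
    """Index within protospacer+PAM at which the cut falls (3 bp 5' of the PAM)."""
    if len(pam) != 3 or pam[1:].upper() != "GG":
        raise ValueError("expected a 3-nt NGG PAM")
    if len(protospacer) < 3:
        raise ValueError("protospacer is too short")
    return len(protospacer) - 3


def split_at_cut(protospacer: str, pam: str) -> tuple:
    """Return the sequence on each side of the expected cleavage position."""
    site = protospacer + pam
    cut = cas9_cut_index(protospacer, pam)
    return site[:cut], site[cut:]


# Hypothetical 20-nt protospacer followed by a TGG PAM.
left, right = split_at_cut("ACGTACGTACGTACGTACGT", "TGG")
print(left, "|", right)
```

Small deletions or insertions arising from NHEJ would be expected to cluster around the junction between the two returned segments.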
Example 21
[0768] Identification of Site-Specific Gene Integration Via
Homologous Recombination Mediated by the Guide RNA/Cas9 System in
Stably Transformed Soybean
[0769] Site-specific gene integration via guide RNA/Cas9 system
mediated DNA homologous recombination was determined by
border-specific PCR analysis. The 5' end borders of DD20CR1 and
DD20CR2 events were amplified as a 1204 bp DD20 HR1-SAMS PCR
amplicon (SEQ ID NO: 330) by PCR with primers DD20-LB (SEQ ID NO:
323) and Sams-A1 (SEQ ID NO: 324) while the 3' borders of the same
events were amplified as a 1459 bp DD20 NOS--HR2 PCR amplicon (SEQ
ID NO: 331) with primers QC498A-S1 and DD20-RB (FIG. 25 A-C). Any
events with both the 5' border and 3' border-specific bands
amplified are considered as site-specific integration events
through homologous recombination containing the transgene from the
donor DNA fragment DD20HR1-SAMS:HPT-DD20HR2 or its circular form
(FIG. 23). The 5' end borders of DD43CR1 and DD43CR2 events were
amplified as a 1202 bp DD43 HR1-SAMS PCR amplicon (SEQ ID NO: 333)
by PCR with primers DD43-LB and Sams-A1 while the 3' borders of the
same events were amplified as a 1454 bp DD43 NOS-HR2 PCR amplicon
(SEQ ID NO: 334) with primers QC498A-S1 (SEQ ID NO: 325) and
DD43-RB (SEQ ID NO: 328). Any events with both the 5' border and 3'
border-specific bands amplified are considered as site-specific
integration events through homologous recombination containing the
transgene from repair DNA fragment DD43HR1-SAMS:HPT-DD43HR2 or its
circular form. Some of the border-specific PCR fragments were
sequenced and were all confirmed to be recombined sequences as
expected from homologous recombination. On average, gene
integration through the guide RNA/Cas9 mediated homologous
recombination occurred at approximately 4% of the total transgenic
events (Insertion frequency, Table 28 and Table 29). One homologous
recombination event was identified from experiment U6-9.1
DD43CR1-Jack repeated in "Jack" genotype (Table 29).
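The border-specific PCR decision rule described above, together with the resulting insertion frequency, can be sketched as follows. The names are hypothetical; the amplicon sizes are those stated in the text, and the event counts (11 of 239 and 10 of 263) are taken from Table 28.

```python
# Expected border-specific amplicon sizes (bp) as stated in the text.
EXPECTED_BORDER_AMPLICON_BP = {
    "DD20": {"5prime": 1204, "3prime": 1459},
    "DD43": {"5prime": 1202, "3prime": 1454},
}


def is_site_specific_integration(five_prime_band: bool,
                                 three_prime_band: bool) -> bool:
    """An event is called an integration only if both border PCRs are positive."""
    return bool(five_prime_band and three_prime_band)


def insertion_frequency(n_integration_events: int, n_total_events: int) -> float:
    """Integration events as a percentage of all analyzed transgenic events."""
    return 100.0 * n_integration_events / n_total_events


dd20cr1 = round(insertion_frequency(11, 239), 1)
dd43cr1 = round(insertion_frequency(10, 263), 1)
print(dd20cr1, dd43cr1)
```

Requiring both borders avoids counting events in which only one homologous arm recombined or in which the donor integrated elsewhere in the genome.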
Example 22
[0770] The crRNA/tracrRNA/Cas Endonuclease System Cleaves
Chromosomal DNA in Maize and Introduces Mutations by Imperfect
Non-Homologous End-Joining
[0771] To test whether the maize optimized crRNA/tracrRNA/Cas
endonuclease system described in Example 1 could recognize, cleave,
and mutate maize chromosomal DNA through imprecise non-homologous
end-joining (NHEJ) repair pathways, three different genomic target
sequences were targeted for cleavage (see Table 30) and examined by
deep sequencing for the presence of NHEJ mutations.
TABLE-US-00031 TABLE 30 Maize genomic target sequences targeted by
a crRNA/tracrRNA/Cas endonuclease system.

Locus  Location         Cas RNA System Used  Target Site Designation  Maize Genomic Target Site Sequence  PAM Sequence  SEQ ID NO:
LIG    Chr. 2: 28.45cM  crRNA/tracrRNA       LIGCas-1                 GTACCGTACGTGCCCCGGCGG               AGG           16
LIG    Chr. 2: 28.45cM  crRNA/tracrRNA       LIGCas-2                 GGAATTGTACCGTACGTGCCC               CGG           17
LIG    Chr. 2: 28.45cM  crRNA/tracrRNA       LIGCas-3                 GCGTACGCGTACGTGTG                   AGG           18

LIG = Liguleless 1 Gene Promoter
[0772] The maize optimized Cas9 endonuclease expression cassette,
crRNA expression cassettes containing the specific maize variable
targeting domains (SEQ ID NOs: 445-447) complementary to the
antisense strand of the maize genomic target sequences listed in
Table 30 and tracrRNA expression cassette (SEQ ID NO: 448) were
co-delivered to 60-90 Hi-II immature maize embryos by
particle-mediated delivery (see Example 5) in the presence of BBM
and WUS2 genes (see Example 6). Hi-II maize embryos transformed
with the Cas9 and long guide RNA expression cassettes targeting the
LIGCas-3 genomic target site (SEQ ID NO: 18) for cleavage served as
a positive control and embryos transformed with only the Cas9
expression cassette served as a negative control. After 7 days, the
20-30 most uniformly transformed embryos from each treatment were
pooled and total genomic DNA was extracted. The region surrounding
the intended target site was PCR amplified with Phusion® High
Fidelity PCR Master Mix (New England Biolabs, M0531L), adding on the
sequences necessary for amplicon-specific barcodes and Illumina
sequencing using "tailed" primers through two rounds of PCR. The
primers used in the primary PCR reaction are shown in Table 31 and
the primers used in the secondary PCR reaction were
AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACG (forward, SEQ ID NO:
53) and CAAGCAGAAGACGGCATA (reverse, SEQ ID NO: 54).
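The two-round "tailed" primer design implies that the 3' end of each secondary PCR primer matches the 5' tail carried by the corresponding primary PCR primer. The sketch below checks this overlap. The function name is hypothetical; the secondary primer sequences are those given above, and the primary primer 5' segments are the first 21 nucleotides shared by the forward and reverse primers of Table 31.

```python
def suffix_prefix_overlap(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of `upstream` that is a prefix of `downstream`."""
    for k in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream.endswith(downstream[:k]):
            return k
    return 0


# Secondary PCR primers (SEQ ID NOs: 53 and 54).
SECONDARY_FWD = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACG"
SECONDARY_REV = "CAAGCAGAAGACGGCATA"
# 5' segments common to the primary forward and reverse primers (Table 31).
PRIMARY_FWD_5P = "CTACACTCTTTCCCTACACGA"
PRIMARY_REV_5P = "CAAGCAGAAGACGGCATACGA"

fwd_overlap = suffix_prefix_overlap(SECONDARY_FWD, PRIMARY_FWD_5P)
rev_overlap = suffix_prefix_overlap(SECONDARY_REV, PRIMARY_REV_5P)
print(fwd_overlap, rev_overlap)
```

The overlap is what allows the secondary primers to anneal to first-round products and extend them with the remaining adapter sequence.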
TABLE-US-00032 TABLE 31 PCR primer sequences

Target Site  Cas RNA System Used  Primer Orientation  Primary PCR Primer Sequence                                      SEQ ID NO:
LIGCas-1     crRNA/tracrRNA       Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCCTCTGTAACGATTTACGCACCTGCTG  36
LIGCas-1     crRNA/tracrRNA       Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCAAATGAGTAGCAGCGCACGTAT       35
LIGCas-2     crRNA/tracrRNA       Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTGAAGCTGTAACGATTTACGCACCTGCTG  449
LIGCas-2     crRNA/tracrRNA       Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCAAATGAGTAGCAGCGCACGTAT       35
LIGCas-3     crRNA/tracrRNA       Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGGCGCAAATGAGTAGCAGCGCAC     37
LIGCas-3     crRNA/tracrRNA       Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCACCTGCTGGGAATTGTACCGTA        38
LIGCas-3     Long guide RNA       Forward             CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTTCCCGCAAATGAGTAGCAGCGCAC     450
LIGCas-3     Long guide RNA       Reverse             CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCACCTGCTGGGAATTGTACCGTA        38
[0773] The resulting PCR amplifications were purified with a Qiagen
PCR purification spin column, concentration measured with a Hoechst
dye-based fluorometric assay, combined in an equimolar ratio, and
single read 100 nucleotide-length deep sequencing was performed on
Illumina's MiSeq Personal Sequencer with a 30-40% (v/v) spike of
PhiX control v3 (Illumina, FC-110-3001) to offset sequence bias.
Only those reads with a ≥1 nucleotide indel arising within
the 10 nucleotide window centered over the expected site of
cleavage and not found in a similar level in the negative control
were classified as NHEJ mutations. NHEJ mutant reads with the same
mutation were counted and collapsed into a single read and the top
10 most prevalent mutations were visually confirmed as arising
within the expected site of cleavage. The total numbers of visually
confirmed NHEJ mutations were then used to calculate the % mutant
reads based on the total number of reads of an appropriate length
containing a perfect match to the barcode and forward primer.
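The read classification described above can be approximated by the following sketch. It is a simplification under stated assumptions: each read is represented hypothetically as either None (no indel) or a tuple of indel position and a mutation signature string, the window test uses only the indel start position, and neither the negative control comparison nor the visual confirmation step is modeled.

```python
from collections import Counter


def in_cleavage_window(indel_pos: int, cut_site: int, window: int = 10) -> bool:
    """True if the indel starts inside the window centered on the cut site."""
    half = window // 2
    return cut_site - half <= indel_pos < cut_site + half


def summarize_nhej(reads, cut_site: int, top_n: int = 10):
    """Collapse identical mutations and report the most prevalent ones.

    `reads` is a list whose items are None (no indel) or
    (indel_pos, signature) tuples; this layout is hypothetical.
    """
    counts = Counter()
    for read in reads:
        if read is None:
            continue
        indel_pos, signature = read
        if in_cleavage_window(indel_pos, cut_site):
            counts[signature] += 1
    percent_mutant = 100.0 * sum(counts.values()) / len(reads)
    return counts.most_common(top_n), percent_mutant


# 100 hypothetical reads: 96 without indels, 3 with indels near the cut site,
# and 1 with an indel far from the cut site (not counted as NHEJ).
example_reads = [None] * 96 + [(50, "-1"), (50, "-1"), (52, "+1"), (80, "-3")]
top, pct = summarize_nhej(example_reads, cut_site=50)
print(top, pct)
```

Collapsing reads by mutation signature mirrors the step in which reads with the same mutation were counted together before the ten most prevalent mutations were examined.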
[0774] The frequency of NHEJ mutations recovered by deep sequencing
for the crRNA/tracrRNA/Cas endonuclease system targeting the three
LIGCas targets (SEQ ID NOS: 16, 17, 18) compared to the long guide
RNA/Cas endonuclease system targeting the same locus is shown in
Table 32.
TABLE-US-00033 TABLE 32 Percent (%) mutant reads at maize
Liguleless 1 target locus produced by crRNA/tracrRNA/Cas
endonuclease system compared to the long guide RNA/Cas endonuclease
system

System                   Total Number of Reads  Number of Mutant Reads  % Mutant Reads
Cas9 Only Control        1,744,427              0                       0.00%
LIGCas-3 long guide RNA  1,596,955              35,300                  2.21%
LIGCas-1 crRNA/tracrRNA  1,803,163              4,331                   0.24%
LIGCas-2 crRNA/tracrRNA  1,648,743              3,290                   0.20%
LIGCas-3 crRNA/tracrRNA  1,681,130              2,409                   0.14%
[0775] The ten most prevalent types of NHEJ mutations recovered
based on the crRNA/tracrRNA/Cas endonuclease system are shown in
FIG. 27A (for LIGCas-1 target site, corresponding to SEQ ID
NOs:415-424), FIG. 27B (for LIGCas-2 target site corresponding to
SEQ ID NOs: 425-434) and FIG. 27C (for LIGCas-3 target site
corresponding to SEQ ID NOs:435-444). Approximately 9- to 16-fold
lower frequencies of NHEJ mutations were observed when using a
crRNA/tracrRNA/Cas endonuclease system to introduce a double strand
break at a maize genomic target site, relative to the long guide
RNA/Cas endonuclease system control.
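The percent mutant reads and the fold difference relative to the long guide RNA control can be recomputed from the read counts of Table 32, as in the sketch below. The variable and function names are hypothetical; the read counts are those reported in Table 32.

```python
# (total reads, mutant reads) from Table 32.
TABLE_32 = {
    "LIGCas-3 long guide RNA": (1_596_955, 35_300),
    "LIGCas-1 crRNA/tracrRNA": (1_803_163, 4_331),
    "LIGCas-2 crRNA/tracrRNA": (1_648_743, 3_290),
    "LIGCas-3 crRNA/tracrRNA": (1_681_130, 2_409),
}


def percent_mutant(total_reads: int, mutant_reads: int) -> float:
    """Mutant reads as a percentage of total reads."""
    return 100.0 * mutant_reads / total_reads


percents = {name: percent_mutant(*counts) for name, counts in TABLE_32.items()}
long_guide = percents["LIGCas-3 long guide RNA"]

# Fold reduction of each crRNA/tracrRNA treatment versus the long guide control.
fold_lower = {
    name: long_guide / value
    for name, value in percents.items()
    if name.endswith("crRNA/tracrRNA")
}

for name, fold in fold_lower.items():
    print(f"{name}: {percents[name]:.2f}% mutant reads, {fold:.1f}-fold lower")
```

Computed from the unrounded percentages, the fold differences fall within the approximately 9- to 16-fold range stated above.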
[0776] Taken together, our data indicate that the maize optimized
crRNA/tracrRNA/Cas endonuclease system described herein cleaves
maize chromosomal DNA and generates imperfect NHEJ mutations.
Example 23
[0777] Modifying the ARGOS8 Gene to Improve Drought Tolerance and
Nitrogen Use Efficiency in Maize Plants
[0778] ARGOS is a negative regulator for ethylene responses in
plants (WO 2013/066805 A1, published 10 May 2013). ARGOS proteins
target the ethylene signal transduction pathway. When
over-expressed in maize plants, ARGOS reduces plant sensitivity to
ethylene and promotes organ growth, leading to increased drought
tolerance (DRT) and improved nitrogen use efficiency (NUE) (WO
2013/066805 A1, published 10 May 2013). To achieve optimal ethylene
sensitivity, promoters have been tested for driving Zm-ARGOS8
over-expression in transgenic maize plants. Field trials showed
that a maize promoter, Zm-GOS2 PRO:GOS2 INTRON (SEQ ID NO:460; U.S.
Pat. No. 6,504,083, issued on Jan. 7, 2003; Zm-GOS2 is a maize
homolog of rice GOS2, where GOS2 stands for Gene from Oryza Sativa
2), provided a favorable expression level and tissue coverage for
Zm-ARGOS8, and that the transgenic plants had a higher grain yield
than non-transgenic controls under drought stress and low nitrogen
conditions (WO 2013/066805 A1, published 10 May 2013).
However, these transgenic plants contain two ARGOS8 genes, the
endogenous gene and the transgene. ARGOS8 protein levels,
therefore, are determined by these two genes. Because the
endogenous ARGOS8 gene varies in sequence and the expression level
among different inbred lines, the ARGOS8 protein level will be
different when the transgene is integrated into different inbreds.
Here we present a mutagenization (gene editing) method to modify
the promoter region of the endogenous ARGOS8 gene to attain desired
expression patterns and eliminate the need for a transgene.
[0779] The promoter Zm-GOS2 PRO:GOS2 INTRON (SEQ ID NO:460; U.S.
Pat. No. 6,504,083 patent issued on Jan. 7, 2003) was inserted into
the 5'-UTR of Zm-ARGOS8 (SEQ ID NO:462) by using a guideRNA/Cas9
system. The Zm-GOS2 PRO:GOS2 INTRON fragment also included a primer
binding site (SEQ ID NO:459) at its 5' end to facilitate event
screening with PCR. We also substituted the native promoter of
Zm-ARGOS8 (SEQ ID NO:461) with Zm-GOS2 PRO:GOS2 INTRON (SEQ ID
NO:460). The resulting maize lines carry a new ARGOS8 allele whose
expression levels and tissue specificity will differ from the
native form. We expect that these lines will recapitulate the
phenotype of increased drought tolerance and improved NUE as
observed in the Zm-GOS2 PRO:Zm-ARGOS8 transgenic plants (WO
2013/066805 A1, published 10 May 2013). These maize lines
differ from conventional transgenic events in that: (1) there is
only one ARGOS8 gene in the genome; (2) this modified version of
Zm-ARGOS8 resides at its native locus; and (3) the ARGOS8 protein
level and the tissue specificity of gene expression are entirely
controlled by the edited allele. The DNA reagents used during the
mutagenization, such as guideRNA, Cas9 endonuclease, transformation
selection marker and other DNA fragments, are not required for
function of the newly generated ARGOS8 allele and can be eliminated
from the genome by segregation through standard breeding methods.
Because the promoter Zm-GOS2 PRO:GOS2 INTRON was copied from maize
GOS2 gene (SEQ ID NO:464) and inserted into the ARGOS8 locus
through homologous recombination, this ARGOS8 allele is
indistinguishable from natural mutant alleles.
A. Insertion of Zea mays-GOS2 PRO:GOS2 INTRON into Maize-ARGOS 8
Promoter
[0780] To insert Zm-GOS2 PRO:GOS2 INTRON into the 5'-UTR of maize
ARGOS8 gene, a guideRNA construct, gRNA1, was made using maize U6
promoter and terminator as described herein. The 5'-end of the
guide RNA contained a 19-bp variable targeting domain targeting the
genomic target sequence 1 (CTS1; SEQ ID NO: 451) in the 5'-UTR of
Zm-ARGOS8 (FIG. 28). A polynucleotide modification template
contained the Zm-GOS2 PRO:GOS2 INTRON flanked by two
genomic DNA fragments (HR1 and HR2, 370 and 430-bp in length,
respectively) derived from the upstream and downstream regions of
the CTS1 (FIG. 28). The gRNA1 construct, the polynucleotide
modification template, a Cas9 cassette and transformation selection
marker phosphomannose isomerase (PMI) were introduced into maize
immature embryo cells by using a particle bombardment method.
PMI-resistant calli were screened with PCR for Zm-GOS2 PRO:GOS2
INTRON insertion (FIGS. 29A and 29B). Multiple callus events were
identified and plants were regenerated. The insertion events were
confirmed by amplifying the Zm-ARGOS8 region in T0 plants with PCR
(FIG. 29C) and sequencing the PCR products.
B. Replacement of Zm-ARGOS 8 Promoter with Zm-GOS2 PRO:GOS2 INTRON
Promoter (Promoter Swap).
[0781] To substitute (replace) the native promoter of Zm-ARGOS8
with Zm-GOS2 PRO:GOS2 INTRON, a guide RNA construct, gRNA3, was
made for targeting the genomic target site CTS3 (SEQ ID NO:453),
located 710-bp upstream of the Zm-ARGOS8 start codon (FIG. 30).
Another guide RNA, gRNA2, was designed to target the genomic target
site CTS2 (SEQ ID NO:452) located in the 5'-UTR of Zm-ARGOS8 (FIG.
30). The polynucleotide modification template contained a 400-bp
genomic DNA fragment derived from the upstream region of CTS3,
Zm-GOS2 PRO:GOS2 INTRON and a 360-bp genomic DNA fragment derived
from the downstream region of CTS2 (FIG. 30). The gRNA3 and gRNA2,
the Cas9 cassette, the polynucleotide modification template and the
PMI selection marker were used to transform immature embryo cells.
Multiple promoter swap (promoter replacement) events were
identified by PCR screening of the PMI-resistant calli (FIGS. 31A,
31B & 31C) and plants were regenerated. The swap events were
confirmed by PCR analysis of the Zm-ARGOS8 region in T0 plants
(FIG. 31D).
C. Deletion of Zm-ARGOS 8 Promoter
[0782] To delete the promoter of Zm-ARGOS8, we screened the
PMI-resistant calli obtained from the above gRNA3/gRNA2 experiment
to look for events that produced a 1.1-kb PCR product (FIG. 32A).
Multiple deletion events were identified (FIG. 32B) and plants were
regenerated. The deletion events were confirmed by amplifying the
Zm-ARGOS8 region in T0 plants with PCR and sequencing of the PCR
products.
Example 24
[0783] Gene Editing of the Soybean EPSPS1 Gene Using the Guide
RNA/Cas Endonuclease System
A. guideRNA/Cas9 Endonuclease Target Site Design on the Soybean
EPSPS Genes.
[0784] Two guideRNA/Cas9 endonuclease target sites (soy EPSPS-CR1
and soy EPSPS-CR2) were identified in the Exon2 of the soybean
EPSPS1 gene Glyma01g33660 (Table 33).
TABLE-US-00034 TABLE 33 Guide RNA/Cas9 endonuclease target sites on
soybean EPSPS1 gene

Name of gRNA-Cas9 endonuclease target site | Cas endonuclease target sequence (SEQ ID NO:) | Physical location
soy EPSPS-CR1 | 467 | Gm01: 45865337 . . . 45865315
soy EPSPS-CR2 | 468 | Gm01: 45865311 . . . 45865333
B. Guide-RNA Expression Cassettes, Cas9 Endonuclease Expression
Cassettes and Polynucleotide Modification Templates for
Introduction of Specific Amino Acid Changes in the Soybean EPSPS1
Gene
[0785] The soybean U6 small nuclear RNA promoter, GM-U6-13.1 (SEQ
ID NO: 469), was used to express guide RNAs to direct Cas9
nuclease to designated genomic target sites (Table 34). A soybean
codon optimized Cas9 endonuclease (SEQ ID NO: 489) expression
cassette and a guide RNA expression cassette were linked in a first
plasmid that was co-delivered with a polynucleotide modification
template. The polynucleotide modification template contained
specific nucleotide changes that encoded amino acid changes in
the EPSPS1 polypeptide (Glyma01g33660), such as the T183I and P187S
(TIPS) substitutions in the Exon2. Other amino acid changes in the EPSPS1
polypeptide can also be obtained using the guide RNA/Cas
endonuclease system described herein. Specific amino acid
modifications can be achieved by homologous recombination between
the genomic DNA and the polynucleotide modification template
facilitated by the guideRNA/Cas endonuclease system.
TABLE-US-00035 TABLE 34 Guide RNA/Cas9 expression cassettes and
polynucleotide modification templates used in soybean stable
transformation for the specific amino acid modifications of the
EPSPS1 gene.

Experiment | Guide RNA/Cas9 (plasmid name) | SEQ ID NO: | polynucleotide modification template | SEQ ID NO:
soy EPSPS-CR1 | U6-13.1:EPSPS CR1 + EF1A2:CAS9 (QC878) | 470 | RTW1013A | 472
soy EPSPS-CR2 | U6-13.1:EPSPS CR2 + EF1A2:CAS9 (QC879) | 471 | RTW1012A | 473
C. Detection of Site-Specific Non-Homologous-End-Joining (NHEJ)
Mediated by the Guide RNA/Cas9 System in Stably Transformed
Soybean
[0786] Genomic DNA was extracted from somatic embryo samples and
analyzed by quantitative PCR using a 7500 real time PCR system
(Applied Biosystems, Foster City, Calif.) with target site-specific
primers and FAM-labeled fluorescence probe to check copy number
changes of the double strand break target sites. The qPCR analysis
was done in duplex reactions with a syringolide induced protein
(SIP) as the endogenous control and a wild type 93B86 genomic DNA
sample that contains one copy of the target site with 2 alleles as
the single copy calibrator. The presence or absence of the guide
RNA-Cas9 expression cassette in the transgenic events was also
analyzed with the qPCR primers/probes for guideRNA/Cas9 (SEQ ID NOs:
477-479) and for PinII (SEQ ID NOs: 480-482). The qPCR primers/probes
are listed in Table 35.
TABLE-US-00036 TABLE 35 Primers/Probes used in qPCR analyses of
transgenic soybean events.

Target Site | Primer/Probe Name | Sequences | SEQ ID NOs:
EPSPS-CR1 & EPSPS-CR2 | Soy1-F1 | CCACTAGTAAGGAATCTAAAGATGAAATCA | 474
EPSPS-CR1 & EPSPS-CR2 | Soy1-R2 | CCTGCAGCAACCACAGCTGCTGTC | 475
EPSPS-CR1 & EPSPS-CR2 | Soy1-T1 (FAM-MGB) | CTGCAATGCGTCCTT | 476
gRNA/CAS9 | Cas9-F | CCTTCTTCCACCGCCTTGA | 477
gRNA/CAS9 | Cas9-R | TGGGTGTCTCTCGTGCTTTTT | 478
gRNA/CAS9 | Cas9-T (FAM-MGB) | AATCATTCCTGGTGGAGGA | 479
pINII | pINII-99F | TGATGCCCACATTATAGTGATTAGC | 480
pINII | pINII-13R | CATCTTCTGGATTGGCCAACTT | 481
pINII | pINII-69T (FAM-MGB) | ACTATGTGTGCATCCTT | 482
SIP | SIP-130F | TTCAAGTTGGGCTTTTTCAGAAG | 483
SIP | SIP-198R | TCTCCTTGGTGCTCTCATCACA | 484
SIP | SIP-170T (VIC-MGB) | CTGCAGCAGAACCAA | 485
[0787] The endogenous control probe SIP-T was labeled with VIC and
the gene-specific probes for all the target sites were labeled with
FAM for the simultaneous detection of both fluorescent probes
(Applied Biosystems). PCR reaction data were captured and analyzed
using the sequence detection software provided with the 7500 real
time PCR system and the gene copy numbers were calculated using the
relative quantification methodology (Applied Biosystems).
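The relative quantification step can be illustrated with a short sketch. It assumes the comparative Ct (2^-ddCt) calculation, which is one common relative quantification approach; the text does not state the exact calculation applied by the instrument software, and the function name and Ct values below are hypothetical.

```python
def relative_copy_number(ct_target, ct_control, cal_ct_target, cal_ct_control):
    """Copy number relative to the calibrator by the comparative Ct method.

    Assumption: both assays amplify with roughly 100% efficiency.
    """
    # Normalize the FAM target signal to the VIC-labeled SIP endogenous control.
    delta_ct_sample = ct_target - ct_control
    delta_ct_calibrator = cal_ct_target - cal_ct_control
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
calibrator = {"target": 25.0, "control": 24.0}  # stands in for wild type 93B86
event = {"target": 26.0, "control": 24.0}       # target detected one cycle later

value = relative_copy_number(event["target"], event["control"],
                             calibrator["target"], calibrator["control"])
print(value)
```

Under these assumptions, a target signal delayed by one cycle relative to the calibrator corresponds to half the calibrator copy number.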
[0788] Since the wild type 93B86 genomic DNA with two alleles of
the double strand break target site was used as the single copy
calibrator, events without any change of the target site would be
detected as one copy, herein termed Wt-Homo (qPCR value >= 0.7);
events with one allele changed, which is no longer detectable by
the target site-specific qPCR, would be detected as half copy,
herein termed NHEJ-Hemi (qPCR value between 0.1 and 0.7); and
events with both alleles changed would be detected as null, herein
termed NHEJ-Null (qPCR value <= 0.1). As shown in Table 36, both
guideRNA/Cas endonuclease systems targeting the soy EPSPS-CR1 and
EPSPS-CR2 sites can efficiently introduce double strand breaks
(DSBs) at their designed target sites. Both NHEJ-Hemi and
NHEJ-Null events were detected in the 93B86 genotype. NHEJ
(Non-Homologous-End-Joining) mutations mediated by the guide
RNA/Cas9 system at the specific Cas9 target sites were confirmed by
PCR/topo cloning/sequencing.
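The qPCR thresholds described above can be written as a small classifier. This is a minimal sketch: the function name is hypothetical, and the handling of the exact boundary values (0.7 counted as Wt-Homo, 0.1 counted as NHEJ-Null) follows the ">=" and "<=" signs given in the text.

```python
def classify_event(qpcr_value):
    """Map a target-site qPCR copy number value to an event class."""
    if qpcr_value >= 0.7:
        return "Wt-Homo"    # both alleles still detected
    if qpcr_value > 0.1:
        return "NHEJ-Hemi"  # one allele no longer detected
    return "NHEJ-Null"      # neither allele detected


# Hypothetical qPCR values, for illustration only.
for value in (1.02, 0.48, 0.03):
    print(value, classify_event(value))
```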
TABLE-US-00037 TABLE 36 Target Site Double Strand Break Rate
Mutations Induced by the Guide RNA/Cas9 system on soybean EPSPS1
gene. Numbers indicate no. of events (numbers in parentheses are
%).

Project | Total event | Wt-Homo (%) | NHEJ-Hemi (%) | NHEJ-Null (%)
U6-13.1 EPSPS-CR1 | 168 | 63 (38%) | 66 (39%) | 39 (23%)
U6-13.1 EPSPS-CR2 | 111 | 50 (45%) | 21 (19%) | 40 (36%)
D. Detection of the TIPS Mutation in the Soybean EPSPS Gene
[0789] In order to edit specific amino acids at the native EPSPS
gene (such as those resulting in a TIPS modification), a
polynucleotide modification template, such as RTW1013A or RTW1012A
(Table 34), was co-delivered with the guideRNA/Cas9 expression
cassettes into soybean cells.
[0790] The modification of the native EPSPS1 gene via guide
RNA/Cas9 system mediated DNA homologous recombination was
determined by specific PCR analysis. A specific PCR assay with
primer pair WOL569 (SEQ ID NO: 486) and WOL876 (SEQ ID NO: 487) was
used to detect perfect TIPS modification at the native EPSPS1 gene.
A second primer pair WOL569 (SEQ ID NO: 486) and WOL570 (SEQ ID NO:
488) was used to amplify both TIPS modified EPSPS1 allele and WT
(wild type)/NHEJ mutated allele. Topo cloning/sequencing was used
to verify the sequences.
Example 25
[0791] Intron Replacement of Soybean Genes Using the guideRNA/Cas
Endonuclease System
A. guideRNA/Cas9 Endonuclease Target Site Design.
[0792] Four guideRNA/Cas9 endonuclease target sites were identified
in the soybean EPSPS1 gene Glyma01g33660 (Table 37). Two of the
target sites (soy EPSPS-CR1 and soy EPSPS-CR2) were identified to
target the Exon2 of the soybean EPSPS gene as described in Example
24. Another two target sites (soy EPSPS-CR4 and soy EPSPS-CR5) were
designed near the 5' end of the intron1 of the soybean EPSPS
gene.
TABLE-US-00038 TABLE 37 Guide RNA/Cas9 endonuclease target sites on
soybean EPSPS1 gene.

Name of gRNA-Cas9 endonuclease target site | Cas endonuclease target sequence (SEQ ID NO:) | Physical location
soy EPSPS-CR1 | 467 | Gm01: 45865337 . . . 45865315
soy EPSPS-CR2 | 468 | Gm01: 45865311 . . . 45865333
soy EPSPS-CR4 | 490 | Gm01: 45866302 . . . 45866280
soy EPSPS-CR5 | 491 | Gm01: 45866295 . . . 45866274
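The physical locations in Table 37 can be read programmatically, as in the sketch below. Treating a descending coordinate pair as a site on the reverse strand is an assumption made for illustration (the text does not state the strand convention), and the function name is hypothetical.

```python
import re


def parse_location(text):
    """Parse 'Gm01: 45865337 . . . 45865315' into
    (chromosome, start, end, strand, length)."""
    match = re.fullmatch(r"\s*(\w+):\s*(\d+)\s*(?:\.\s*){3}(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    chrom = match.group(1)
    first, second = int(match.group(2)), int(match.group(3))
    # Assumption: descending coordinates indicate the reverse strand.
    strand = "+" if first <= second else "-"
    start, end = min(first, second), max(first, second)
    return chrom, start, end, strand, end - start + 1


sites = {
    "soy EPSPS-CR1": "Gm01: 45865337 . . . 45865315",
    "soy EPSPS-CR2": "Gm01: 45865311 . . . 45865333",
    "soy EPSPS-CR4": "Gm01: 45866302 . . . 45866280",
    "soy EPSPS-CR5": "Gm01: 45866295 . . . 45866274",
}
for name, location in sites.items():
    print(name, parse_location(location))
```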
B. Guide RNA/Cas9 Endonuclease Expression Cassettes and
Polynucleotide Modification Templates Used in Soybean Stable
Transformation for the Replacement of the Intron 1 of the Soybean
EPSPS1 Gene with the Soybean Ubiquitin (UBQ) Intron 1
[0793] The soybean U6 small nuclear RNA promoter GM-U6-13.1 (SEQ
ID NO: 469) was used to express two guide RNAs (soy-EPSPS-CR1 and
soy-EPSPS-CR4, or soy-EPSPS-CR1 and soy-EPSPS-CR5) to direct Cas9
endonuclease to designated genomic target sites (Table 38). One of
the target sites (soy-EPSPS-CR1) was located in the exon2, as
described in Example 24, and a second target site (soy-EPSPS-CR4 or
soy-EPSPS-CR5) was located near the 5' end of intron1 of the native
EPSPS1 gene. A soybean codon optimized Cas9 endonuclease expression
cassette and a guide RNA expression cassette were linked in the
expression plasmids QC878/RTW1199 (SEQ ID NO:470/492) or
QC878/RTW1200 (SEQ ID NO:470/493), which were co-delivered with a
polynucleotide modification template. The polynucleotide
modification template, RTW1190A (SEQ ID NO:494), contained the 532 bp
intron1 of the soybean UBQ gene and the TIPS modified Exon2.
Soybean EPSPS1 intron 1 replacement with the soybean UBQ intron1
can be achieved with the guide RNA/Cas system by homologous
recombination between the genomic DNA and the polynucleotide
modification template, resulting in enhancement of the native or
modified soy EPSPS1 gene expression.
TABLE-US-00039 TABLE 38 Guide RNA/Cas9 endonuclease expression
cassettes and polynucleotide modification templates used in soybean
stable transformation for the replacement of the Intron1 of the
soybean EPSPS1 gene with the soybean ubiquitin (UBQ) intron1

Experiment | Guide RNA/Cas9 | SEQ ID NO: | polynucleotide modification template | SEQ ID NO:
soy EPSPS-CR1 and soy EPSPS-CR4 | U6-13.1:EPSPS CR1 + CR4 + EF1A2:CAS9 (QC878/RTW1199) | 470/492 | RTW1190A | 494
soy EPSPS-CR1 and soy EPSPS-CR5 | U6-13.1:EPSPS CR1 + CR5 + EF1A2:CAS9 (QC878/RTW1200) | 470/493 | RTW1190A | 494
C. Detection of Site-Specific NHEJ Mediated by the Guide RNA/Cas9
System in Stably Transformed Soybean
[0794] Site-specific NHEJ was detected as described in Example 24C,
using the qPCR primers/probes listed in Table 39.
TABLE-US-00040 TABLE 39 Primers/Probes used in qPCR analyses of
transgenic soybean events.

Target Site | Primer/Probe Name | Sequences | SEQ ID NOs:
EPSPS-CR1 & EPSPS-CR2 | Soy1-F1 | CCACTAGTAAGGAATCTAAAGATGAAATCA | 474
EPSPS-CR1 & EPSPS-CR2 | Soy1-R2 | CCTGCAGCAACCACAGCTGCTGTC | 475
EPSPS-CR1 & EPSPS-CR2 | Soy1-T1 (FAM-MGB) | CTGCAATGCGTCCTT | 476
EPSPS-CR4 | Soy1-F3 | GTTTGTTTGTTGTTGGGTGTGGG | 495
EPSPS-CR4 | Soy1-R3 | GACATGATGCTTCATTTTCACAGAA | 496
EPSPS-CR4 | Soy1-T2 (FAM-MGB) | TGTGTAGAGTGGATTTTG | 497
EPSPS-CR5 | Soy1-F2 | TGTTGTTGGGTGTGGGAATAGG | 498
EPSPS-CR5 | Soy1-R3 | GACATGATGCTTCATTTTCACAGAA | 496
EPSPS-CR5 | Soy1-T2 (FAM-MGB) | TGTGTAGAGTGGATTTTG | 497
gRNA/CAS9 | Cas9-F | CCTTCTTCCACCGCCTTGA | 477
gRNA/CAS9 | Cas9-R | TGGGTGTCTCTCGTGCTTTTT | 478
gRNA/CAS9 | Cas9-T (FAM-MGB) | AATCATTCCTGGTGGAGGA | 479
pINII | pINII-99F | TGATGCCCACATTATAGTGATTAGC | 480
pINII | pINII-13R | CATCTTCTGGATTGGCCAACTT | 481
pINII | pINII-69T (FAM-MGB) | ACTATGTGTGCATCCTT | 482
SIP | SIP-130F | TTCAAGTTGGGCTTTTTCAGAAG | 483
SIP | SIP-198R | TCTCCTTGGTGCTCTCATCACA | 484
SIP | SIP-170T (VIC-MGB) | CTGCAGCAGAACCAA | 485
D. Detection of the Replacement of the Soybean EPSPS1 Intron1 with
the Soybean UBQ Intron 1 Using the Guide RNA/Cas9 Endonuclease
System.
[0795] In order to replace the soybean EPSPS1 intron1 with the
soybean UBQ intron1 at the native EPSPS1 gene, two guideRNA
expression vectors were used as shown in Table 38. The QC878 vector
(SEQ ID NO: 470) was targeting the exon2 and the RTW1199 (SEQ ID
NO:492) or RTW1200 (SEQ ID NO:493) was targeting the 5' end of the
intron1. The double cleavage of soybean EPSPS gene with the two
guide RNA/Cas systems resulted in the removal of the native EPSPS1
intron1/partial Exon2 fragment. At the same time, a polynucleotide
modification template RTW1190A (SEQ ID NO:494) was co-delivered
into soybean cells and homologous recombination between the
polynucleotide modification template and the genomic DNA resulted
in the replacement of EPSPS1 intron1 with the soybean UBQ intron1
and the desired amino acid modifications in exon2 as evidenced by
PCR analysis. PCR assays with primer WOL1001/WOL1002 pair (SEQ ID
NO: 499 and 500) and WOL1003/WOL1004 pair (SEQ ID NO: 501 and 502)
were used to detect the intron replacement events.
Example 26
[0796] Promoter Replacement (Promoter Swap) of Soybean Genes Using
the guideRNA/Cas Endonuclease System
A. guideRNA/Cas9 Endonuclease Target Site Design.
[0797] Four guideRNA/Cas9 endonuclease target sites were identified
in the soybean EPSPS1 gene Glyma01g33660 (Table 40). Two of the
target sites (soy EPSPS-CR1 and soy EPSPS-CR2) were identified to
target the Exon2 of the soybean EPSPS gene as described in Example
24. The soy EPSPS-CR6 and soy EPSPS-CR7 target sites were identified
near the 5' end of the -798 bp native EPSPS promoter.
TABLE-US-00041 TABLE 40 Guide RNA/Cas9 endonuclease target sites on
soybean EPSPS1 gene.

Name of gRNA-Cas9 endonuclease target site | Cas endonuclease target sequence (SEQ ID NO:) | Physical location
soy EPSPS-CR1 | 467 | Gm01: 45865337 . . . 45865315
soy EPSPS-CR2 | 468 | Gm01: 45865311 . . . 45865333
soy EPSPS-CR6 | 503 | Gm01: 45867471 . . . 45867493
soy EPSPS-CR7 | 504 | Gm01: 45867459 . . . 45867481
B. Guide RNA/Cas9 Endonuclease Expression Cassettes and
Polynucleotide Modification Templates Used in Soybean Stable
Transformation for the Replacement of the -798 bp Soybean EPSPS1
Promoter with the Soybean UBQ Promoter.
[0798] The soybean U6 small nuclear RNA promoter GM-U6-13.1 (SEQ
ID NO: 469) was used to express two guide RNAs (soy-EPSPS-CR1 and
soy-EPSPS-CR6, or soy-EPSPS-CR1 and soy-EPSPS-CR7) to direct Cas9
nuclease to designated genomic target sites (Table 41). One of the
target sites (soy-EPSPS-CR1) was located in the exon2 as described
in Example 24 and a second target site (soy-EPSPS-CR6 or
soy-EPSPS-CR7) was located near the 5' end of the -798 bp native
EPSPS1 promoter. A soybean codon optimized Cas9 endonuclease
expression cassette and a guide RNA expression cassette were linked
in the expression plasmids QC878/RTW1201 (SEQ ID NO:470/505) or
QC878/RTW1202 (SEQ ID NO:470/506), which were co-delivered with a
polynucleotide modification template, RTW1192A (SEQ ID NO:507). The
polynucleotide modification template contained 1369 bp of the
soybean UBQ gene promoter, the 47 bp 5'UTR and the 532 bp UBQ intron1.
Specific soybean EPSPS1 promoter replacement with the soybean UBQ
promoter can be achieved with the guide RNA/Cas system by
homologous recombination between the genomic DNA and the
polynucleotide modification template, resulting in enhancement of the
native or modified soy EPSPS1 gene expression.
TABLE-US-00042 TABLE 41 Guide RNA/Cas9 endonuclease expression
cassettes and polynucleotide modification templates used in soybean
stable transformation for the replacement of the -798 bp soybean
EPSPS1 promoter with the soybean UBQ promoter

Experiment | Guide RNA/Cas9 | SEQ ID NO: | polynucleotide modification template | SEQ ID NO:
soy EPSPS-CR1 and soy EPSPS-CR6 | U6-13.1:EPSPS CR1 + CR6 + EF1A2:CAS9 (QC878/RTW1201) | 470, 505 | RTW1192A | 507
soy EPSPS-CR1 and soy EPSPS-CR7 | U6-13.1:EPSPS CR1 + CR7 + EF1A2:CAS9 (QC878/RTW1202) | 470, 506 | RTW1192A | 507
C. Detection of Site-Specific NHEJ Mediated by the Guide RNA/Cas9
System in Stably Transformed Soybean
[0799] Site-specific NHEJ was detected as described in Example 24C,
using the qPCR primers/probes listed in Table 42.
TABLE-US-00043 TABLE 42 Primers/Probes used in qPCR analyses of
transgenic soybean events

Target Site | Primer/Probe Name | Sequences | SEQ ID NOs:
EPSPS-CR1 & EPSPS-CR2 | Soy1-F1 | CCACTAGTAAGGAATCTAAAGATGAAATCA | 474
EPSPS-CR1 & EPSPS-CR2 | Soy1-R2 | CCTGCAGCAACCACAGCTGCTGTC | 475
EPSPS-CR1 & EPSPS-CR2 | Soy1-T1 (FAM-MGB) | CTGCAATGCGTCCTT | 476
EPSPS-CR6 & EPSPS-CR7 | Soy1-F4 | TCAATAATACTACTCTCTTAGACACCAAACAA | 508
EPSPS-CR6 & EPSPS-CR7 | Soy1-R4 | CAAGGAAAATGAATGATGGCTTT | 509
EPSPS-CR6 & EPSPS-CR7 | Soy1-T3 (FAM-MGB) | CCTTCCCAAACTATAATC | 510
gRNA/CAS9 | Cas9-F | CCTTCTTCCACCGCCTTGA | 477
gRNA/CAS9 | Cas9-R | TGGGTGTCTCTCGTGCTTTTT | 478
gRNA/CAS9 | Cas9-T (FAM-MGB) | AATCATTCCTGGTGGAGGA | 479
pINII | pINII-99F | TGATGCCCACATTATAGTGATTAGC | 480
pINII | pINII-13R | CATCTTCTGGATTGGCCAACTT | 481
pINII | pINII-69T (FAM-MGB) | ACTATGTGTGCATCCTT | 482
SIP | SIP-130F | TTCAAGTTGGGCTTTTTCAGAAG | 483
SIP | SIP-198R | TCTCCTTGGTGCTCTCATCACA | 484
SIP | SIP-170T (VIC-MGB) | CTGCAGCAGAACCAA | 485
D. Detection of the Promoter Replacement of the Soybean EPSPS1
Promoter with the Soybean UBQ Promoter Using the Guide RNA/Cas9
Endonuclease System.
[0800] In order to replace the soybean EPSPS1 promoter with the
soybean UBQ promoter at the native EPSPS1 gene, two guideRNA
expression vectors were used in each soybean transformation
experiment as shown in Table 41. The QC878 (SEQ ID NO: 470) was
targeting the exon2 and the RTW1201 (SEQ ID NO: 505) or RTW1202
(SEQ ID NO: 506) was targeting the 5' end of the soybean -798 bp
promoter. The double cleavage of the soybean EPSPS1 gene with the
two guide RNA/Cas systems resulted in removal of the native EPSPS1
promoter/5'UTR-Exon1/Intron1/partial Exon2 fragment at the native
EPSPS gene. At the same time, a polynucleotide modification
template RTW1192A (SEQ ID NO: 507) was co-delivered into soybean
cells. This RTW1192A DNA contained the 1369 bp soybean UBQ promoter,
its 47 bp 5'-UTR and the 532 bp UBQ intron1 in front of the EPSPS1
exon1-Intron1-modified Exon2. Homologous recombination between the
polynucleotide modification template and the genomic DNA resulted
in the replacement of EPSPS1 promoter/5'UTR with the soybean UBQ
promoter/5'UTR/Intron1 and the desired amino acid modifications, as
evidenced by PCR analysis. PCR assays with primer WOL1005/WOL1006
pair (SEQ ID NO: 511 and 512) and WOL1003/WOL1004 pair (SEQ ID NO:
501 and 502) were used to detect the promoter replacement
events.
Example 27
Enhancer Element Deletions Using the guideRNA/Cas Endonuclease
System
[0801] The guide RNA/Cas endonuclease system described herein can
be used to allow for the deletion of a promoter element from either
a transgenic (pre-existing, artificial) or endogenous gene.
Promoter elements, such as enhancer elements, are often introduced in
promoters driving gene expression cassettes in multiple copies
(3× = 3 copies of enhancer element, FIG. 33) for trait gene
testing or to produce transgenic plants expressing a specific trait.
Enhancer elements can be, but are not limited to, a 35S enhancer
element (Benfey et al, EMBO J, August 1989; 8(8): 2195-2202, SEQ ID
NO:513). In some plants (events), the enhancer elements can cause
an unwanted phenotype, a yield drag, or an undesired change in the
expression pattern of the trait of interest. For example,
as shown in FIG. 33, a plant comprising multiple enhancer elements
(3 copies, 3×) in its genomic DNA located between two trait
cassettes (Trait A and Trait B) was characterized to show an
unwanted phenotype. It is desired to remove the extra copies of the
enhancer element while keeping the trait gene cassettes intact at
their integrated genomic location. The guide RNA/Cas endonuclease
system described herein can be used to remove the unwanted
enhancer element from the plant genome. A guide RNA can be
designed to contain a variable targeting region targeting a target
site sequence of 12-30 bps adjacent to an NGG (PAM) in the enhancer.
If a Cas endonuclease target site sequence is present in all copies
of the enhancer elements (such as the three Cas endonuclease target
sites 35S-CRTS1 (SEQ ID NO:514), 35S-CRTS2 (SEQ ID NO:515),
35S-CRTS3 (SEQ ID NO:516)), only one guide RNA is needed to guide
the Cas endonuclease to the target sites and induce a double strand
break in all the enhancer elements at once. The Cas endonuclease
can cleave these sites to remove one or multiple enhancers. The
guideRNA/Cas endonuclease system can be introduced by either
Agrobacterium or particle gun bombardment. Alternatively, two
different guide RNAs (targeting two different genomic target sites)
can be used to remove all 3× enhancer elements from the
genome of an organism, in a manner similar to the removal of a
(transgenic or endogenous) promoter described herein.
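Candidate target sites adjacent to an NGG PAM can be located with a simple scan, sketched below. The sketch assumes a 20-nt target sequence (the text allows 12-30), uppercase A/C/G/T input, and a hypothetical function name; the toy sequence is not the 35S enhancer.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_target_sites(seq, target_len=20):
    """Return (strand, start, target, pam) for each target followed by NGG.

    start is 0-based on the strand that was scanned.
    """
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - target_len - 2):
            pam = s[i + target_len:i + target_len + 3]
            if pam[1:] == "GG":  # N can be any base
                sites.append((strand, i, s[i:i + target_len], pam))
    return sites


# Toy sequence, for illustration only.
toy = "A" * 20 + "TGG"
print(find_target_sites(toy))
```

If the same target sequence is reported once per enhancer copy, a single guide RNA would be expected to address all copies, as the paragraph above describes.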
Example 28
Regulatory Sequence Modifications Using the Guide RNA/Cas
Endonuclease System
A. Modification of Polyubiquitination Sites
[0802] There are defined ubiquitination sites on proteins to be
degraded, and they were found within the maize EPSPS protein by
using dedicated computer programs (for example, the CKSAAP_UbSite;
Ziding Zhang's Laboratory of Protein Bioinformatics, College of
Biological Sciences, China Agricultural University, 100193 Beijing,
China). One of the selected polyubiquitination sites within the
maize EPSPS coding sequence is shown in FIG. 34A and its amino acid
signature sequence is compared to the equivalent EPSPS sites from
other plants (FIG. 34A). The lysine amino acid (K) at position
90 (highly conserved in other plant species) was selected as a
potential site of the EPSPS protein polyubiquitination. The
polynucleotide modification template (referred to as EPSPS
polynucleotide maize K90R template) used to edit the epsps locus is
listed as SEQ ID NO: 517. This template allowed for editing the
epsps locus to contain the lysine (K) to arginine (R) substitution
at position 90 (K90R) and two additional TIPS substitutions at
positions 102 and 106 (FIGS. 34B and 34C). Maize genomic DNA was
edited using the guideRNA/Cas endonuclease system described herein
and T0 plants were produced as described herein. The T0 plants that
contained the nucleotide modifications, as specified by the
information provided on the K90R template (FIG. 34C), were selected
by the genotyping methods described herein. F1 EPSPS-K90R plants can
be selected for elevated protein content due to a slower rate of
EPSPS protein degradation.
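The substitution notation used above (K90R, plus the TIPS changes at positions 102 and 106) can be made concrete with a short sketch. The function name and the toy protein are hypothetical; the toy sequence only places K, T and P at positions 90, 102 and 106 and is not the maize EPSPS sequence.

```python
import re


def apply_substitutions(protein, substitutions):
    """Apply substitutions such as 'K90R' (1-based positions) to a protein."""
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position out of range: {sub!r}")
        # Check that the reference residue matches before editing.
        if residues[pos - 1] != ref:
            raise ValueError(f"expected {ref} at {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = alt
    return "".join(residues)


# Toy 110-residue protein, for illustration only.
toy_protein = "A" * 89 + "K" + "A" * 11 + "T" + "A" * 3 + "P" + "A" * 4
edited = apply_substitutions(toy_protein, ["K90R", "T102I", "P106S"])
print(edited[89], edited[101], edited[105])
```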
B. Editing Intron Elements to Introduce Intron Mediated Enhancer
Elements (IMEs)
[0803] Transcriptional activity of the native EPSPS gene can be
modulated by transcriptional enhancers positioned in the vicinity
of other transcription controlling elements. Introns are known to
contain enhancer elements affecting the overall rate of
transcription from native promoters including the EPSPS promoter.
For example, the first intron of the maize ubiquitin 5'UTR confers
a high level of expression in monocot plants, as specified in the WO
2011/156535 A1 patent application. An intron enhancing motif,
CATATCTG (FIG. 35A), also referred to as an intron-mediated
enhancer element (IME), was identified by proprietary analysis
(WO2011/156535 A1, published on Dec. 15, 2011), and appropriate
nucleotide sites at the 5' end of the EPSPS first intron were
selected for editing in order to introduce the intron-mediated
enhancer elements (IMEs) (FIGS. 35B-35C). The polynucleotide
modification template (referred to as EPSPS polynucleotide maize
IME template) is listed as SEQ ID NO: 518. The polynucleotide
modification template allows for editing of the epsps locus to
contain three IMEs (two on one strand of the DNA, one on the
reverse strand) in the first EPSPS intron and the TIPS
substitutions at positions 102 and 106. The genomic DNA of maize
plants was edited using the guideRNA/Cas endonuclease system
described herein. Maize plants containing the IME edited EPSPS
coding sequence can be selected by genotyping the T0 plants and can
be further evaluated for elevated EPSPS-TIPS protein content due to
the enhanced transcription rate of the native EPSPS gene.
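Counting IME motifs on both strands of an edited intron can be sketched as follows. The function names and the toy sequence are hypothetical; only the CATATCTG motif comes from the text, and its reverse complement is CAGATATG.

```python
IME_MOTIF = "CATATCTG"


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_motif(seq, motif=IME_MOTIF):
    """Return (forward_count, reverse_count) of motif occurrences in seq."""
    def count_overlapping(s, m):
        return sum(1 for i in range(len(s) - len(m) + 1) if s.startswith(m, i))

    forward = count_overlapping(seq, motif)
    # A motif on the reverse strand appears as its reverse complement.
    reverse = count_overlapping(seq, reverse_complement(motif))
    return forward, reverse


# Toy intron fragment with two forward motifs and one reverse motif.
toy_intron = "TTCATATCTGAACATATCTGTTCAGATATGAA"
print(count_motif(toy_intron))
```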
Example 29
[0804] Modifications of Splicing Sites and/or Introducing Alternate
Splicing Sites Using the Guide RNA/Cas Endonuclease System
[0805] In maize cells, the splicing process is affected by splicing
sites at the exon-intron junction sites as illustrated in the EPSPS
mRNA production (FIGS. 36A-36B). FIG. 36A shows analysis of EPSPS
amplified pre-mRNA (cDNA panel on left). Lane I4 in FIG. 36A shows
amplification of the EPSPS pre-mRNA containing the 3rd intron
unspliced, resulting in an 804 bp diagnostic fragment indicative of
an alternate splicing event. Lanes E3 and F8 show the EPSPS PCR
amplified fragments resulting from regularly spliced introns.
Diagnostic fragments such as the 804 bp fragment of lane I4 are not
amplified unless cDNA is synthesized, as is evident from the absence
of bands in lanes E3, I4, and F8 comprising total RNA (shown in the
total RNA panel on right of FIG. 36A). The canonical splice site in
the maize EPSPS gene and genes from other species is AGGT, while
other (alternative) variants of the splice sites may lead to the
aberrant processing of pre-mRNA molecules. The EPSPS coding
sequence contains a number of alternate splicing sites that may
affect the overall efficiency of the pre-mRNA maturation process
and as such may limit the EPSPS protein accumulation in maize
cells.
[0806] In order to limit the occurrence of alternate splicing
events during EPSPS gene expression, a guideRNA/Cas endonuclease
system as described herein can be used to edit splicing sites. The
splicing site at the junction of the second native EPSPS intron and
the third exon is AGTT and can be edited in order to introduce the
canonical AGGT splice site at this junction (FIG. 37). The T>G
substitution does not affect the native EPSPS open reading frame
and it does not change the EPSPS amino acid sequence. The
polynucleotide modification template (referred to as EPSPS
polynucleotide maize Tspliced template) is listed as SEQ ID NO:
519. This polynucleotide modification template allows for editing
of the epsps locus to contain the canonical AGGT splice site at the
2nd intron-3rd exon junction site and the TIPS
substitutions at positions 102 and 106. Maize plants are edited
using the procedures described herein. F1 EPSPS-Tspliced maize
plants can be evaluated for increased protein content due to the
enhanced production of functional EPSPS mRNA messages.
Example 30
[0807] Shortening Maturity Via Manipulation of Early Flowering
Phenotype with ZmRap2.7 Down-Regulation Using the Guide RNA/Cas
Endonuclease System
[0808] Overall plant maturity can be shortened by modulating the
flowering time phenotype of plants through modulation of a maize
ZmRap2.7 gene. Shortening of plant maturity can be obtained by an
early flowering phenotype.
[0809] RAP2.7 is an acronym for Related to APETALA 2.7. RAPL means
RAP2.7 LIKE. RAP2.7 functions as an AP2-family transcription
factor that suppresses floral transition (SEQ ID NOs:520 and 521).
Silencing or knock-down of Rap2.7 in transgenic plants
resulted in early flowering and reduced plant height but, surprisingly,
normal ear and tassel development as compared to the wild-type plants
(PCT/US14/26279 application, filed Mar. 13, 2014). The guide
RNA/Cas endonuclease system described herein can be used to target
and induce a double strand break at a Cas endonuclease target site
located within the RAP2.7 gene. Plants comprising NHEJ within the
RAP2.7 gene can be selected and evaluated for the presence of a
shortened maturity phenotype.
Example 31
[0810] Modulating Expression of a Maize NPK1B Gene for Engineering
Frost Tolerance in Maize Using a Guide RNA/Cas Endonuclease
System
[0811] Nicotiana Protein Kinase1 (NPK1) is a mitogen activated
protein kinase kinase kinase that is involved in cytokinesis
regulation and oxidative stress signal transduction. The ZM-NPK1B
(SEQ ID NO: 522 and SEQ ID NO: 523) which has about 70% amino acid
similarity to rice NPKL3 has been tested for frost tolerance in
maize seedlings and reproductive stages (PCT/US14/26279
application, filed Mar. 13, 2014). Transgenic seedlings and plants
comprising a ZM-NPK1B gene driven by the inducible promoter Rab17 had
significantly higher frost tolerance than control seedlings and
control plants. The gene appeared to be induced after cold acclimation
and during the -3° C. treatment period in most of the events, but at
low levels (PCT/US14/26279 application, filed Mar. 13, 2014).
[0812] A guide RNA/Cas endonuclease system described herein can be
used to replace the endogenous promoter of the NPK1 gene with a
stress-inducible promoter such as the maize RAB17 promoter
(SEQ ID NO: 524; PCT/US14/26279 application, filed Mar. 13, 2014),
thereby modulating NPK1B expression in a stress-responsive manner and
providing frost tolerance to the modulated maize plants.
Example 32
Shortening Maturity Via Manipulation of Early Flowering Phenotype
with FTM1 Expression Using a Guide RNA/Cas Endonuclease System
[0813] Overall plant maturity can be shortened by modulating the
flowering time phenotype of plants through expressing a transgene.
Such a phenotype modification can also be achieved with additional
transgenes or through a breeding approach.
[0814] FTM1 stands for Floral Transition MADS 1 transcription
factor (SEQ ID NOs: 525 and 526). It is a MADS Box transcriptional
factor and induces floral transition. Upon expression of FTM1 under
a constitutive promoter, transgenic plants exhibited early
flowering and shortened maturity, but surprisingly ear and tassel
developed normally as compared to the wild-type plants
(PCT/US14/26279 application, filed Mar. 13, 2014).
[0815] FTM1-expressing maize plants demonstrated that by
manipulating a floral transition gene, time to flowering can be
reduced significantly, leading to a shortened maturity for the
plant. As maturity can be generally described as time from seeding
to harvest, a shorter maturity is desired for ensuring that a crop
can finish in the northern continental dry climatic environment
(PCT/US14/26279 application, filed Mar. 13, 2014).
[0816] A guide RNA/Cas endonuclease system described herein can be
used to introduce enhancer elements such as the CaMV35S enhancers
(Benfey et al, EMBO J, August 1989; 8(8): 2195-2202, SEQ ID
NO:513), specifically targeted in front of the endogenous promoter
of FTM1, in order to enhance the expression of FTM1 while
preserving most of the tissue and temporal specificities of native
expression, providing shortened maturity to the modulated
plants.
Example 33
Inserting Inducible Responsive Elements in Plant Genomes
[0817] Inducible expression systems controlled by an external
stimulus are desirable for functional analysis of cellular proteins
as well as trait development as changes in the expression level of
the gene of interest can lead to an accompanying phenotype
modification. Ideally such a system would not only mediate an
"on/off" status for gene expression but would also permit limited
expression of a gene at a defined level.
The guide RNA/Cas endonuclease system described herein can be used
to introduce components of repressor/operator/inducer systems to
regulate gene expression of an organism. Repressor/operator/inducer
systems and their components are well known in the art (US
2003/0186281 published Oct. 2, 2003; U.S. Pat. No. 6,271,348). For
example, but not limited to, components of the tetracycline (Tc)
resistance system of E. coli have been found to function in
eukaryotic cells and have been used to regulate gene expression
(U.S. Pat. No. 6,271,348). Nucleotide sequences of tet operators of
different classes are known in the art; see, for example, the classA,
classB, classC, classD, and classE TET operator sequences listed as SEQ
ID NOs:11-15 of U.S. Pat. No. 6,271,348.
[0818] Components of a sulfonylurea-responsive repressor system (as
described in U.S. Pat. No. 8,257,956, issued on Sep. 4, 2012) can
also be introduced into plant genomes to generate a
repressor/operator/inducer system in said plant, in which
polypeptides can specifically bind to an operator and the
specific binding is regulated by a sulfonylurea compound.
Example 34
Genome Deletion for Trait Locus Characterization
[0819] Trait mapping in plant breeding often results in the
detection of chromosomal regions housing one or more genes
controlling expression of a trait of interest. For quantitative
traits, expression of a trait of interest is governed by multiple
quantitative trait loci (QTL) of varying effect-size, complexity,
and statistical significance across one or more chromosomes. A QTL
or haplotype that is associated with suppression of kernel-row
number in the maize ear can be found to be endemic in elite
breeding germplasm. The negative effect of this QTL on kernel row
number can be fine-mapped to a resolution sufficient to allow
selective elimination of this negative QTL segment within specific
recipient germplasm. Two flanking cut sites for the guide
polynucleotide/Cas endonuclease system are designed via haplotype,
marker, and/or DNA sequence context at the targeted QTL region, and
the two guide polynucleotide/Cas endonuclease systems are deployed
simultaneously or sequentially to produce the desired end product
of two independent double strand breaks (cuts) that liberate the
intervening region from the chromosome. Individuals harboring the
desired deletion event would result from NHEJ repair of the two
chromosomal ends, which eliminates the intervening DNA region. Assays
to identify these individuals are based on the presence of flanking
DNA marker regions but the absence of intervening DNA markers. A
proprietary haplotype for kernel-row-number is created that is not
extant in the previously defined elite breeding germplasm pool.
An alternative approach would be to delete a region containing a
fluorescent gene. Recovery of plants with, and without,
fluorescence would give an approximate indication of the efficiency
of the deletion process.
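The presence/absence marker assay described in this example can be sketched in Python. This is a minimal illustration only: the marker names, the example calls and the `classify_individual` helper are hypothetical and not part of the disclosure. A plain presence/absence call cannot separate a heterozygous deletion from an intact locus, so such plants fall into the "no deletion detected" class here.

```python
from typing import Mapping, Sequence


def classify_individual(
    calls: Mapping[str, bool],
    flanking: Sequence[str],
    intervening: Sequence[str],
) -> str:
    """Classify one plant from presence (True) / absence (False) marker calls."""
    # Both flanking markers must amplify, otherwise the assay is uninformative.
    if not all(calls[m] for m in flanking):
        return "inconclusive"
    present = [calls[m] for m in intervening]
    if not any(present):
        return "deletion"              # flanks present, intervening region absent
    if all(present):
        return "no deletion detected"  # intact or heterozygous locus
    return "partial"                   # only part of the region is missing


# Hypothetical marker names, used for illustration only.
FLANKING = ["left_flank", "right_flank"]
INTERVENING = ["qtl_marker_1", "qtl_marker_2"]

if __name__ == "__main__":
    plant = {"left_flank": True, "right_flank": True,
             "qtl_marker_1": False, "qtl_marker_2": False}
    print(classify_individual(plant, FLANKING, INTERVENING))
```

A screen of this kind would normally be followed by sequencing across the repaired junction to confirm the deletion.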
Example 35
Engineering Drought Tolerance and Nitrogen Use Efficiency into
Maize Via Gene Silencing by Expressing an Inverted Repeat into an
ACS6 Gene Using the Guide RNA/Cas Endonuclease System
[0820] ACC (1-aminocyclopropane-1-carboxylic acid) synthase (ACS)
genes encode enzymes that catalyze the rate limiting step in
ethylene biosynthesis. A construct containing one of the maize ACS
genes, ZM-ACS6, in an inverted repeat configuration, has been
extensively tested for improved abiotic stress tolerance in maize
(PCT/US2010/051358, filed Oct. 4, 2010; PCT/US2010/031008, filed
Apr. 14, 2010). Multiple transgenic maize events containing a
ZM-ACS6 RNAi sequence driven by a ubiquitin constitutive promoter
had reduced ethylene emission, and a concomitant increase in grain
yield relative to controls under both drought and low nitrogen
field conditions (Plant Biotechnology Journal: 12 MAR 2014, DOI:
10.1111/pbi.12172).
[0821] In one embodiment, the guide RNA/Cas endonuclease system can
be used in combination with a co-delivered polynucleotide sequence
to insert an inverted ZM-ACS6 gene fragment into the genome of
maize, wherein the insertion of the inverted gene fragment allows
for the in-vivo creation of an inverted repeat (hairpin) and
results in the silencing of the endogenous ethylene biosynthesis
gene.
[0822] In an embodiment the insertion of the inverted gene fragment
can result in the formation of an in-vivo created inverted repeat
(hairpin) in a native (or modified) promoter of an ACS6 gene and/or
in a native 5' end of the native ACS6 gene. The inverted gene
fragment can further comprise an intron which can result in an
enhanced silencing of the targeted ethylene biosynthetic gene.
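The relationship between an endogenous gene fragment and an inserted inverted copy can be illustrated with a short Python sketch. The sequences below are hypothetical placeholders, not ZM-ACS6 sequence, and the `spacer` merely stands in for whatever separates the two arms (for example an intron); the sketch only shows that the inserted arm is the reverse complement of the endogenous arm, which is what allows a transcript spanning both to fold into a hairpin.

```python
# Translation table for DNA complementation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(left_arm: str, right_arm: str) -> bool:
    """True when the right arm is the reverse complement of the left arm."""
    return reverse_complement(left_arm) == right_arm.upper()


# Hypothetical sequences for illustration only.
endogenous_fragment = "ATGGCTTCCGAT"
spacer = "GTAAGT"
inserted_fragment = reverse_complement(endogenous_fragment)

# Layout after insertion: sense arm, spacer, inverted (antisense) arm.
modified_locus = endogenous_fragment + spacer + inserted_fragment

if __name__ == "__main__":
    print(modified_locus)
    print(is_inverted_repeat(endogenous_fragment, inserted_fragment))
```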
Example 36
T0 Plants from the Multiplexed Guide RNA/Cas Experiment Carried
High Frequency of Bi-Allelic Mutations and Demonstrated Proper
Inheritance of Mutagenized Alleles in the T1 Population
[0823] This example demonstrates the high efficiency of the guide
RNA/Cas endonuclease system in generating maize plants with
multiple mutagenized loci and their inheritance in the consecutive
generation(s).
[0824] Mutated events generated in the multiplexed experiment
described in Example 4 were used to regenerate T0 plants with
mutations at 3 different target sites: MS26Cas-2 target site (SEQ
ID NO: 14), LIGCas-3 target site (SEQ ID NO: 18) and MS45Cas-2
target site (SEQ ID NO: 20).
[0825] For further analysis, total genomic DNA was extracted from
leaf tissue of individual T0 plants. Fragments spanning all 3
target sites were PCR amplified using primer pairs for the
corresponding target sites, cloned into the pCR2.1-TOPO cloning
vector (Invitrogen), and sequenced. Table 43 shows examples of
mutations detected in four T0 plants resulting from imprecise NHEJ
at all relevant loci when multiple guide RNA expression cassettes
were simultaneously introduced either in duplex (see TS=Lig34/MS26)
or triplex (see TS=Lig34/MS26/MS45), respectively.
TABLE-US-00044 TABLE 43 Examples of mutations at maize target loci produced by a multiplexed guide RNA/Cas system
Target sites (TS) | T0 plant | qPCR data | Sequencing data, Lig3/4 TS | Sequencing data, Ms26 TS | Sequencing data, Ms45 TS
Lig34/MS26 | 1 | NULL/NULL* | 1 bp ins/2 bp del | 1 bp ins/19 bp del + 1 bp ins | --
Lig34/MS26 | 2 | NULL/NULL | 1 bp ins/1 bp del | 1 bp ins/1 bp ins | --
Lig34/MS26/MS45 | 1 | NULL/NULL/NULL | 1 bp ins/large del | 1 bp ins/1 bp del | 15 bp del/large del
Lig34/MS26/MS45 | 2 | INDEL**/NULL/NULL | 1 bp ins/WT | 1 bp (T) ins/1 bp (C) ins | 1 bp ins/large del
*NULL indicates that both alleles are mutated.
**INDEL indicates mutation in one of the two alleles.
del = deletion, ins = insertion, bp = base pair
[0826] All T0 plants were crossed with wild type maize plants to
produce T1 seeds. T1 progeny plants (32 plants) of the second T0
plant from the triplex experiment (see Table 43, Lig34/MS26/MS45)
were analyzed by sequencing to evaluate segregation frequencies of
the mutated alleles. Our results demonstrated proper inheritance
and expected (1:1) segregation of the mutated alleles as well as
between mutated and wild type alleles at all three target
sites.
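Whether observed progeny counts are compatible with the expected 1:1 segregation can be checked with a chi-square goodness-of-fit test. The following Python sketch is illustrative only: the per-class counts are hypothetical (the text reports 32 T1 plants but not the individual class counts), and the function names are assumptions introduced here.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRITICAL_DF1_ALPHA_005 = 3.841


def chi_square_1to1(n_mutant: int, n_wild_type: int) -> float:
    """Chi-square statistic for an expected 1:1 ratio of two classes."""
    total = n_mutant + n_wild_type
    if total == 0:
        raise ValueError("at least one plant is required")
    expected = total / 2
    return ((n_mutant - expected) ** 2 + (n_wild_type - expected) ** 2) / expected


def consistent_with_1to1(n_mutant: int, n_wild_type: int) -> bool:
    """True when the 1:1 hypothesis is not rejected at the 5% level."""
    return chi_square_1to1(n_mutant, n_wild_type) < CHI2_CRITICAL_DF1_ALPHA_005


if __name__ == "__main__":
    # Hypothetical split of 32 T1 plants at one target site.
    mutant, wild_type = 17, 15
    print(chi_square_1to1(mutant, wild_type))
    print(consistent_with_1to1(mutant, wild_type))
```

With only 32 plants the test has limited power, so a non-significant result is compatible with 1:1 segregation rather than proof of it.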
[0827] The data clearly demonstrate that the guide RNA/maize
optimized Cas endonuclease system described herein can be used to
simultaneously mutagenize multiple chromosomal loci and produce
progeny plants containing stably inherited multiple gene
knock-outs.
Example 37
[0828] Guide RNA/Cas endonuclease mediated DNA cleavage in maize
chromosomal loci can stimulate homologous recombination
repair-mediated transgene insertion and resulting T1 progeny plants
demonstrated proper inheritance of the modified alleles.
[0829] Maize events generated in the experiment described in
Example 5 were used to regenerate T0 plants. T0 plants were
regenerated from 7 independent callus events with correct
amplifications across both transgene genomic DNA junctions and
analyzed. Leaf tissue was sampled, total genomic DNA extracted, and
PCR amplification at both transgene genomic DNA junctions was
carried out using the primer pairs (corresponding to SEQ ID NOs:
98-101). The resulting amplification products were sequenced for
confirmation. Plants with confirmed junctions at both ends were
further analyzed by Southern hybridization (FIG. 38) using two
probes, genomic (outside HR1 region, SEQ ID NO: 533) and transgenic
(within MoPAT gene, SEQ ID NO: 534). PCR, sequencing and Southern
hybridization data showed that plants regenerated from two of
the 7 events (events 1 and 2) carried a perfect, clean, single
copy transgene integration at the expected target site via
homologous recombination. Plants regenerated from the remaining 5
events contained either additional, randomly integrated copies of
the transgene (events 4, 5, and 6) or rearranged copies of the
transgene integrated into the target site (events 3 and 7).
[0830] T0 plants from events 1 and 2 were crossed with wild type
maize plants to produce T1 seeds. Ninety-six T1 plants from events
1 and 2 were analyzed by Southern hybridization (using the same
probes as above) to evaluate segregation frequencies of the
transgene locus. Southern results demonstrated proper inheritance
and expected (1:1) segregation of the transgene and wild type
loci.
[0831] The data clearly demonstrate that maize chromosomal loci
cleaved with the maize optimized guide RNA/Cas system described
herein can be used to stimulate HR repair pathways to
site-specifically insert transgenes and produce progeny plants that
have the inserted transgene stably inherited.
Example 38
Production of Maize Transgenic Lines with Pre-Integrated Cas9 for
Transient Delivery of Guide RNA
[0832] This example describes the rationale, production, and
testing of maize transgenic lines with an integrated Cas9 gene
under constitutive and temperature inducible promoters.
[0833] As demonstrated in Example 2, a high mutation frequency was
observed when Cas9 endonuclease and guide RNA were delivered as DNA
vectors by biolistic transformation to immature corn embryo cells.
When Cas9 endonuclease was delivered as a DNA vector and guide RNA
as RNA molecules, a reduced mutation frequency was observed (Table
44).
TABLE-US-00045 TABLE 44 Mutant reads at LigCas-3 target site produced by transiently delivered guide RNA
Target Site Examined for Mutations | Transient Delivery | Expression Cassette | Mutant Reads | Total Reads
LIGCas-3 | -- | Cas9 | 24.2 | 1,599,492
LIGCas-3 | -- | Cas9/guide RNA | 44170 | 1,674,825
LIGCas-3 | 35 ng guide RNA | Cas9 | 418 | 1,622,180
LIGCas-3 | 70 ng guide RNA | Cas9 | 667 | 1,791,388
LIGCas-3 | 140 ng guide RNA | Cas9 | 239 | 1,632,137
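The read counts in Table 44 can be converted to mutant-read percentages to make the difference between DNA-delivered and RNA-delivered guide RNA explicit. The Python sketch below is illustrative: the row labels and function names are assumptions introduced here, and the Cas9-only control row is left out.

```python
# (delivery description, mutant reads, total reads) as reported in Table 44.
TABLE_44 = [
    ("Cas9 + guide RNA expression cassette (DNA)", 44170, 1674825),
    ("Cas9 DNA + 35 ng guide RNA", 418, 1622180),
    ("Cas9 DNA + 70 ng guide RNA", 667, 1791388),
    ("Cas9 DNA + 140 ng guide RNA", 239, 1632137),
]


def mutant_read_percent(mutant_reads: int, total_reads: int) -> float:
    """Percentage of reads carrying a mutation at the target site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mutant_reads / total_reads


def summarize(rows):
    """Map each delivery description to its mutant-read percentage."""
    return {label: mutant_read_percent(m, t) for label, m, t in rows}


if __name__ == "__main__":
    for label, pct in summarize(TABLE_44).items():
        print(f"{label}: {pct:.4f}% mutant reads")
```

On these counts the DNA-delivered guide RNA gives roughly 2.6% mutant reads, while each RNA-delivered treatment stays below 0.04%.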
[0834] Increased efficiency (increased mutant reads) may occur when
the Cas9 protein and guide RNA are present in the cell at the same
time. To facilitate the presence of both Cas9 endonuclease and
guide RNA in the same cell, a vector containing a constitutively or
conditionally regulated Cas9 gene can first be delivered to plant
cells to allow for stable integration into the plant genome to
establish a plant line that contains only the Cas9 gene in the
plant genome. Then, single or multiple guide RNAs can be delivered
as either DNA or RNA, or combination, to the embryo cells of the
plant line containing the genome-integrated version of the Cas9
gene.
[0835] Transgenic maize (genotype Hi-II) lines with an integrated
Cas9 gene driven by either a constitutive (Ubi) or an inducible
(CAS) promoter were generated via Agrobacterium-mediated
transformation. Besides the Cas9 gene, the Agro vector also
contained a visible marker (END2:Cyan) and a Red Fluorescent
Protein sequence interrupted with a 318 bp long linker
(H2B:RF-FP) (as described in U.S. patent application Ser. No. 13/526,912,
filed Jun. 19, 2012). The linker sequence was flanked with 370 bp
long direct repeats to promote recombination and restoration of a
functional RFP gene sequence upon double strand break within the
linker.
[0836] Lines with single copies of the transgene were identified
and used for further experiments. Two guide RNA constructs
targeting 2 different sites in the linker sequence (Table 45) were
delivered into immature embryo cells via particle bombardment.
Meganuclease variant LIG3-4 B65 with very high cutting activity
previously used in similar experiments was used as the positive
control.
TABLE-US-00046 TABLE 45 Target sites in the RF-FP linker for guideRNA/Cas endonuclease system.
Locus | Guide RNA Used | Target Site Designation | Target Site Sequence | PAM Sequence | SEQ ID NO:
RF-FP linker | Long | RF-FP Cas-1 | GCAGGTCTCACGACGGT | TGG | 535
RF-FP linker | Long | RF-FP Cas-2 | GTAAAGTACGCGTACGTGT | AGG | 536
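Both target sites in Table 45 are followed by a PAM of the form NGG, the motif recognized by Streptococcus pyogenes Cas9. A small Python check of that property is sketched below. The helper name is an assumption introduced here, and the target sequences are written as single strings by joining the two fragments that appear in each table row.

```python
import re

# NGG: any nucleotide followed by two guanines.
NGG_PATTERN = re.compile(r"^[ACGT]GG$")


def has_ngg_pam(pam: str) -> bool:
    """True when the PAM matches the NGG motif."""
    return bool(NGG_PATTERN.match(pam.upper()))


# Target site designation -> (target sequence, PAM), from Table 45.
RF_FP_TARGETS = {
    "RF-FP Cas-1": ("GCAGGTCTCACGACGGT", "TGG"),
    "RF-FP Cas-2": ("GTAAAGTACGCGTACGTGT", "AGG"),
}

if __name__ == "__main__":
    for name, (target, pam) in RF_FP_TARGETS.items():
        print(name, len(target), pam, has_ngg_pam(pam))
```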
[0837] After transformation, embryos with the Cas9 gene under the Ubiquitin
promoter were incubated at 28° C., while embryos with the Cas9
gene under the temperature inducible CAS promoter were first incubated
at 37° C. for 15-20 hours and then transferred to 28°
C. Embryos were examined 3-5 days after bombardment under
a luminescent microscope. Expression and activity of the
pre-integrated Cas9 protein was visually evaluated based on the
number of embryo cells with RFP protein expression. In most lines,
the guide RNA/Cas endonuclease system demonstrated similar or
higher frequency of RFP repair than LIG3-4 B65 meganuclease
indicating high level of Cas9 protein expression and activity in
the generated transgenic lines.
[0838] This example describes the production of transgenic lines
with a pre-integrated Cas9 gene that can be used in further
experiments to evaluate efficiency of mutagenesis at a target site
upon transient delivery of guide RNA in the form of RNA
molecules.
Example 39
[0839] The Guide RNA/Cas Endonuclease System Delivers Double-Strand
Breaks to the Maize ALS Locus and Facilitates Editing of the ALS
Gene
[0840] This example demonstrates that the guide RNA/Cas
endonuclease system can be efficiently used to introduce specific
changes into the nucleotide sequence of the maize ALS gene
resulting in resistance to sulfonylurea class herbicides,
specifically, chlorsulfuron.
[0841] Endogenous ALS protein is the target site of ALS inhibitor
sulfonylurea class herbicides. Expression of the herbicide tolerant
version of ALS protein in crops confers tolerance to this class of
herbicides. The ALS protein contains N-terminal transit peptides,
and the mature protein is formed following transport into the
chloroplast and subsequent cleavage of the transit peptide. The
mature protein starts at residue S41, resulting in a mature protein
of 598 amino acids with a predicted molecular weight of 65 kDa (SEQ
ID NO: 550).
TABLE-US-00047 TABLE 46 Deduced Amino Acid Sequence of the Full-Length ZM-ALS Protein (SEQ ID NO: 550)
1 MATAAAASTA LTGATTAAPK ARRRAHLLAT RRALAAPIRC SAASPAMPMA
51 PPATPLRPWG PTEPRKGADI LVESLERCGV RDVFAYPGGA SMEIHQALTR
101 SPVIANHLFR HEQGEAFAAS GYARSSGRVG VCIATSGPGA TNLVSALADA
151 LLDSVPMVAI TGQVPRRMIG TDAFQETPIV EVTRSITKHN YLVLDVDDIP
201 RVVQEAFFLA SSGRPGPVLV DIPKDIQQQM AVPVWDKPMS LPGYIARLPK
251 PPATELLEQV LRLVGESRRP VLYVGGGCAA SGEELRRFVE LTGIPVTTTL
301 MGLGNFPSDD PLSLRMLGMH GTVYANYAVD KADLLLALGV RFDDRVTGKI
351 EAFASRAKIV HVDIDPAEIG KNKQPHVSIC ADVKLALQGM NALLEGSTSK
401 KSFDFGSWND ELDQQKREFP LGYKTSNEEI QPQYAIQVLD ELTKGEAIIG
451 TGVGQHQMWA AQYYTYKRPR QWLSSAGLGA MGFGLPAAAG ASVANPGVTV
501 VDIDGDGSFL MNVQELAMIR IENLPVKVFV LNNQHLGMVV QWEDRFYKAN
551 RAHTYLGNPE NESEIYPDFV TIAKGFNIPA VRVTKKNEVR AAIKKMLETP
601 GPYLLDIIVP HQEHVLPMIP SGGAFKDMIL DGDGRTVY
[0842] Modification of a single amino acid residue (P165A or P165S,
shown in bold) from the endogenous maize acetolactate synthase (ALS)
protein provides resistance to herbicides in maize.
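The residue numbering used here can be checked directly against the sequence in Table 46. The Python sketch below is an illustration under stated assumptions: the helper names are introduced here and are not part of the disclosure, and the sequence is transcribed from Table 46. It confirms that residue 41 is serine, that residue 165 is proline, and that removing the first 40 residues leaves 598 amino acids.

```python
# Full-length ZM-ALS protein, transcribed from Table 46 (SEQ ID NO: 550).
ZM_ALS = "".join("""
MATAAAASTA LTGATTAAPK ARRRAHLLAT RRALAAPIRC SAASPAMPMA
PPATPLRPWG PTEPRKGADI LVESLERCGV RDVFAYPGGA SMEIHQALTR
SPVIANHLFR HEQGEAFAAS GYARSSGRVG VCIATSGPGA TNLVSALADA
LLDSVPMVAI TGQVPRRMIG TDAFQETPIV EVTRSITKHN YLVLDVDDIP
RVVQEAFFLA SSGRPGPVLV DIPKDIQQQM AVPVWDKPMS LPGYIARLPK
PPATELLEQV LRLVGESRRP VLYVGGGCAA SGEELRRFVE LTGIPVTTTL
MGLGNFPSDD PLSLRMLGMH GTVYANYAVD KADLLLALGV RFDDRVTGKI
EAFASRAKIV HVDIDPAEIG KNKQPHVSIC ADVKLALQGM NALLEGSTSK
KSFDFGSWND ELDQQKREFP LGYKTSNEEI QPQYAIQVLD ELTKGEAIIG
TGVGQHQMWA AQYYTYKRPR QWLSSAGLGA MGFGLPAAAG ASVANPGVTV
VDIDGDGSFL MNVQELAMIR IENLPVKVFV LNNQHLGMVV QWEDRFYKAN
RAHTYLGNPE NESEIYPDFV TIAKGFNIPA VRVTKKNEVR AAIKKMLETP
GPYLLDIIVP HQEHVLPMIP SGGAFKDMIL DGDGRTVY
""".split())

TRANSIT_PEPTIDE_LENGTH = 40  # the mature protein starts at residue S41


def residue(seq: str, position: int) -> str:
    """Return the amino acid at a 1-based position."""
    return seq[position - 1]


def substitute(seq: str, position: int, expected: str, new: str) -> str:
    """Apply a single substitution such as P165S, checking the original residue."""
    if residue(seq, position) != expected:
        raise ValueError(f"residue {position} is not {expected}")
    return seq[: position - 1] + new + seq[position:]


mature_protein = ZM_ALS[TRANSIT_PEPTIDE_LENGTH:]

if __name__ == "__main__":
    print(len(ZM_ALS), residue(ZM_ALS, 41), residue(ZM_ALS, 165), len(mature_protein))
```

The `substitute` helper can then produce the P165S or P165A variant, for example `substitute(ZM_ALS, 165, "P", "S")`.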
[0843] There are two ALS genes in maize, ALS1 and ALS2, located on
chromosomes 5 and 4, respectively. As described in Example 2, guide
RNA expressing constructs for 3 different target sites within the
ALS genes were tested. Based on polymorphism between ALS1 and ALS2
nucleotide sequences, the ALS1-specific ALSCas-4 target site was
identified and tested. The ALSCas-1 guide RNA expressing construct
targeting both ALS1 and ALS2 genes was used as a control (Table
47).
TABLE-US-00048 TABLE 47 Maize ALS genomic target sites tested.
Locus | Maize Genomic Location | Guide RNA | Target Site Designation | Target Site Sequence | PAM Sequence | SEQ ID NO:
ALS | Chr. 4: 107.73cM and Chr. 5: 115.49cM | Long | ALSCas-1 | GGTGCCAATCATGCGTCG | CGG | 22
ALS | Chr. 4: 107.73cM and Chr. 5: 115.49cM | Long | ALSCas-4 | GCTGCTCGATTCCGTCCCCA | TGG* | 537
*Target site in the ALS1 gene; bolded nucleotides are different in the ALS2 gene.
The experiment was conducted and mutation frequency determined as
described in Example 2 and results are shown in Table 48.
TABLE-US-00049 TABLE 48 Frequencies of NHEJ mutations at the two ALS target sites recovered by deep sequencing.
TS | Total Reads | Mutant reads (ALS1) | Mutant reads (ALS2)
ALSCas-1 | 204,230 | 5072 (2.5%) | 2704 (1.3%)
ALSCas-4 | 120,766 | 3294 (2.7%) | 40 (0.03%)
The results demonstrated that ALSCas-4 guide RNA/Cas9 system
mutates the ALS1 gene with approximately 90 times higher efficiency
than the ALS2 gene. Therefore, the ALSCas-4 target site and the
corresponding guide RNA were selected for the ALS gene editing
experiment.
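The specificity comparison drawn from Table 48 can be reproduced from the read counts. The Python sketch below is illustrative, and the data structure and function names are assumptions introduced here. Dividing the rounded percentages (2.7% by 0.03%) gives the figure of approximately 90, while dividing the raw mutant read counts (3294 by 40) gives approximately 82.

```python
# Read counts from Table 48.
TABLE_48 = {
    "ALSCas-1": {"total": 204230, "ALS1": 5072, "ALS2": 2704},
    "ALSCas-4": {"total": 120766, "ALS1": 3294, "ALS2": 40},
}


def percent(mutant_reads: int, total_reads: int) -> float:
    """Mutant reads as a percentage of total reads."""
    return 100.0 * mutant_reads / total_reads


def als1_to_als2_ratio(target_site: str) -> float:
    """Ratio of ALS1 to ALS2 mutant reads for one target site."""
    row = TABLE_48[target_site]
    return row["ALS1"] / row["ALS2"]


if __name__ == "__main__":
    for site, row in TABLE_48.items():
        als1 = percent(row["ALS1"], row["total"])
        als2 = percent(row["ALS2"], row["total"])
        print(f"{site}: ALS1 {als1:.2f}%, ALS2 {als2:.2f}%, "
              f"ratio {als1_to_als2_ratio(site):.1f}")
```

By either calculation, ALSCas-4 is far more selective for ALS1 than ALSCas-1, whose ratio is below 2.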
[0844] To produce edited events, the ALS polynucleotide
modification repair template was co-delivered using particle
bombardment as a plasmid with an 804 bp long homologous region (SEQ
ID NO: 538) or as a single-stranded 127 bp DNA fragment (SEQ ID NO:
539), the maize optimized Cas9 endonuclease expression vector
described in Example 1, the guide RNA expression cassette
(targeting ALSCas-4 site), a moPAT-DsRed fusion as selectable and
visible markers, and developmental genes (ODP-2 and WUS).
Approximately 1000 Hi-II immature embryos were bombarded with each
of the two repair templates described above. Forty days after
bombardment, 600 young callus events (300 for each repair template)
were collected and transferred to the media with bialaphos
selection. The embryos with remaining events were transferred to
the media with 100 ppm of chlorsulfuron for selection. A month
later, events that continued growing under chlorsulfuron selection
were collected and used for analysis.
[0845] A small amount of callus tissue from each selected event was
used for total DNA extraction. A pair of genomic primers outside
the repair/donor DNA fragment (SEQ ID NO:540 and SEQ ID NO:541) was
used to amplify an endogenous fragment of the ALS1 locus containing
the ALSCas-4 target sequence. The PCR amplification products were
gel purified, cloned into the pCR2.1 TOPO cloning vector
(Invitrogen) and sequenced. A total of 6 events demonstrated the
presence of the specifically edited ALS1 allele as well as either a
wild type or a mutagenized second allele.
[0846] These data indicate that a guide RNA/Cas system can be
successfully used to create an edited ALS allele in maize. The data
further demonstrate that the guide RNA/maize optimized Cas
endonuclease system described herein can be used to produce
progeny plants containing gene edits that are stably inherited.
Example 40
Gene Editing of the Soybean ALS1 Gene and Use as a Transformation
Selectable Marker for Soybean Transformation with the Guide RNA/Cas
Endonuclease System
[0847] A. guideRNA/Cas9 Endonuclease Target Site Design on the
Soybean ALS1 Gene.
[0848] There are four ALS genes in soybean (Glyma04g37270,
Glyma06g17790, Glyma13g31470 and Glyma15g07860). Two guideRNA/Cas9
endonuclease target sites (soy ALS1-CR1 and soy ALS1-CR2) were
designed near the Proline 178 of the soybean ALS1 gene
Glyma04g37270 (Table 49).
TABLE-US-00050 TABLE 49 Guide RNA/Cas9 endonuclease target sites on soybean ALS1 gene
Name of gRNA-Cas9 endonuclease target site | Cas endonuclease target sequence (SEQ ID NO:) | Physical location
soy ALS1-CR1 | 542 | Gm04: 43645633..43645612
soy ALS1-CR2 | 543 | Gm04: 43645594..43645615
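The physical locations in Table 49 can be compared with a few lines of Python. This sketch is illustrative only: the `TargetSite` type and helper names are introduced here, and reading a descending coordinate pair as a site on the reverse strand is an assumption rather than something stated in the table.

```python
from typing import NamedTuple


class TargetSite(NamedTuple):
    name: str
    chromosome: str
    start: int  # first coordinate as listed in Table 49
    end: int    # second coordinate as listed in Table 49


def length(site: TargetSite) -> int:
    """Number of bases covered, inclusive of both coordinates."""
    return abs(site.end - site.start) + 1


def orientation(site: TargetSite) -> str:
    """Assumed orientation: descending coordinates are read as reverse."""
    return "forward" if site.start <= site.end else "reverse"


def overlap(a: TargetSite, b: TargetSite) -> int:
    """Number of bases shared by two sites on the same chromosome."""
    if a.chromosome != b.chromosome:
        return 0
    low = max(min(a.start, a.end), min(b.start, b.end))
    high = min(max(a.start, a.end), max(b.start, b.end))
    return max(0, high - low + 1)


CR1 = TargetSite("soy ALS1-CR1", "Gm04", 43645633, 43645612)
CR2 = TargetSite("soy ALS1-CR2", "Gm04", 43645594, 43645615)

if __name__ == "__main__":
    for site in (CR1, CR2):
        print(site.name, length(site), orientation(site))
    print("shared bases:", overlap(CR1, CR2))
```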
B. Guide-RNA Expression Cassettes, Cas9 Endonuclease Expression
Cassettes, Polynucleotide Modification Templates for Introduction
of Specific Amino Acid Changes and Use of the P178S Modified ALS1
Allele as a Soybean Transformation Selectable Marker
[0849] The soybean U6 small nuclear RNA promoter, GM-U6-13.1 (SEQ
ID NO: 469), was used to express guide RNAs to direct Cas9
nuclease to designated genomic target sites (Table 50). A soybean
codon optimized Cas9 endonuclease (SEQ ID NO:489) expression
cassette and a guide RNA expression cassette were linked in a first
plasmid that was co-delivered with a polynucleotide modification
template. The polynucleotide modification template contained
specific nucleotide changes that encoded amino acid changes in
the soy ALS1 polypeptide (Glyma04g37270), such as P178S.
[0850] Other amino acid changes
in the ALS1 polypeptide can also be obtained using the guide
RNA/Cas endonuclease system described herein. Specific amino acid
modifications can be achieved by homologous recombination between
the genomic DNA and the polynucleotide modification template
facilitated by the guideRNA/Cas endonuclease system.
TABLE-US-00051 TABLE 50 Guide RNA/Cas9 expression cassettes and polynucleotide modification templates used in soybean stable transformation for the specific amino acid modifications of the soy ALS1 gene.
Experiment | Guide RNA/Cas9 (plasmid name) | SEQ ID NO: | polynucleotide modification template | SEQ ID NO:
soy ALS1-CR1 | U6-13.1:ALS1-CR1 + EF1A2:CAS9 (QC880) | 544 | RTW1026A | 546
soy ALS1-CR2 | U6-13.1:ALS1-CR2 + EF1A2:CAS9 (QC881) | 545 | RTW1026A | 546
[0851] C. Detection of the P178S Mutation in the Soybean ALS1 Gene
in the Event Selected by Chlorsulfuron
[0852] In order to edit specific amino acids at the native ALS1
gene (such as the P178S modification), a polynucleotide
modification template such as RTW1026A (Table 50), was co-delivered
with the guideRNA/Cas9 expression cassettes into soybean cells.
Chlorsulfuron (100 ppb) was used to select the P178S ALS1 gene
editing events in the soybean transformation process.
[0853] The modification of the native ALS1 gene via guide RNA/Cas9
system mediated DNA homologous recombination was determined by
specific PCR analysis. A specific PCR assay with primer pair WOL900
(SEQ ID NO: 547) and WOL578 (SEQ ID NO: 548) was used to detect
perfect P178S modification at the native ALS1 gene. A second primer
pair WOL573 (SEQ ID NO: 549) and WOL578 (SEQ ID NO: 548) was used
to amplify both a P178S modified Soy ALS1 allele and an NHEJ mutated
allele. A chlorsulfuron tolerant event (MSE3772-18) was generated
from the soy ALS1-CR2 experiment. The event contained a perfect
P178S modified allele and a second allele with a 5 bp deletion at
the soy ALS1-CR2 cleavage site. Topo cloning/sequencing was used to
verify the sequences. Our results demonstrated that one P178S modified
ALS1 allele is sufficient to provide chlorsulfuron selection in
the soybean transformation process.
Sequence CWU 1
1
55014107DNAStreptococcus pyogenes M1 GAS (SF370) 1atggataaga
aatactcaat aggcttagat atcggcacaa atagcgtcgg atgggcggtg 60atcactgatg
aatataaggt tccgtctaaa aagttcaagg ttctgggaaa tacagaccgc
120cacagtatca aaaaaaatct tataggggct cttttatttg acagtggaga
gacagcggaa 180gcgactcgtc tcaaacggac agctcgtaga aggtatacac
gtcggaagaa tcgtatttgt 240tatctacagg agattttttc aaatgagatg
gcgaaagtag atgatagttt ctttcatcga 300cttgaagagt cttttttggt
ggaagaagac aagaagcatg aacgtcatcc tatttttgga 360aatatagtag
atgaagttgc ttatcatgag aaatatccaa ctatctatca tctgcgaaaa
420aaattggtag attctactga taaagcggat ttgcgcttaa tctatttggc
cttagcgcat 480atgattaagt ttcgtggtca ttttttgatt gagggagatt
taaatcctga taatagtgat 540gtggacaaac tatttatcca gttggtacaa
acctacaatc aattatttga agaaaaccct 600attaacgcaa gtggagtaga
tgctaaagcg attctttctg cacgattgag taaatcaaga 660cgattagaaa
atctcattgc tcagctcccc ggtgagaaga aaaatggctt atttgggaat
720ctcattgctt tgtcattggg tttgacccct aattttaaat caaattttga
tttggcagaa 780gatgctaaat tacagctttc aaaagatact tacgatgatg
atttagataa tttattggcg 840caaattggag atcaatatgc tgatttgttt
ttggcagcta agaatttatc agatgctatt 900ttactttcag atatcctaag
agtaaatact gaaataacta aggctcccct atcagcttca 960atgattaaac
gctacgatga acatcatcaa gacttgactc ttttaaaagc tttagttcga
1020caacaacttc cagaaaagta taaagaaatc ttttttgatc aatcaaaaaa
cggatatgca 1080ggttatattg atgggggagc tagccaagaa gaattttata
aatttatcaa accaatttta 1140gaaaaaatgg atggtactga ggaattattg
gtgaaactaa atcgtgaaga tttgctgcgc 1200aagcaacgga cctttgacaa
cggctctatt ccccatcaaa ttcacttggg tgagctgcat 1260gctattttga
gaagacaaga agacttttat ccatttttaa aagacaatcg tgagaagatt
1320gaaaaaatct tgacttttcg aattccttat tatgttggtc cattggcgcg
tggcaatagt 1380cgttttgcat ggatgactcg gaagtctgaa gaaacaatta
ccccatggaa ttttgaagaa 1440gttgtcgata aaggtgcttc agctcaatca
tttattgaac gcatgacaaa ctttgataaa 1500aatcttccaa atgaaaaagt
actaccaaaa catagtttgc tttatgagta ttttacggtt 1560tataacgaat
tgacaaaggt caaatatgtt actgaaggaa tgcgaaaacc agcatttctt
1620tcaggtgaac agaagaaagc cattgttgat ttactcttca aaacaaatcg
aaaagtaacc 1680gttaagcaat taaaagaaga ttatttcaaa aaaatagaat
gttttgatag tgttgaaatt 1740tcaggagttg aagatagatt taatgcttca
ttaggtacct accatgattt gctaaaaatt 1800attaaagata aagatttttt
ggataatgaa gaaaatgaag atatcttaga ggatattgtt 1860ttaacattga
ccttatttga agatagggag atgattgagg aaagacttaa aacatatgct
1920cacctctttg atgataaggt gatgaaacag cttaaacgtc gccgttatac
tggttgggga 1980cgtttgtctc gaaaattgat taatggtatt agggataagc
aatctggcaa aacaatatta 2040gattttttga aatcagatgg ttttgccaat
cgcaatttta tgcagctgat ccatgatgat 2100agtttgacat ttaaagaaga
cattcaaaaa gcacaagtgt ctggacaagg cgatagttta 2160catgaacata
ttgcaaattt agctggtagc cctgctatta aaaaaggtat tttacagact
2220gtaaaagttg ttgatgaatt ggtcaaagta atggggcggc ataagccaga
aaatatcgtt 2280attgaaatgg cacgtgaaaa tcagacaact caaaagggcc
agaaaaattc gcgagagcgt 2340atgaaacgaa tcgaagaagg tatcaaagaa
ttaggaagtc agattcttaa agagcatcct 2400gttgaaaata ctcaattgca
aaatgaaaag ctctatctct attatctcca aaatggaaga 2460gacatgtatg
tggaccaaga attagatatt aatcgtttaa gtgattatga tgtcgatcac
2520attgttccac aaagtttcct taaagacgat tcaatagaca ataaggtctt
aacgcgttct 2580gataaaaatc gtggtaaatc ggataacgtt ccaagtgaag
aagtagtcaa aaagatgaaa 2640aactattgga gacaacttct aaacgccaag
ttaatcactc aacgtaagtt tgataattta 2700acgaaagctg aacgtggagg
tttgagtgaa cttgataaag ctggttttat caaacgccaa 2760ttggttgaaa
ctcgccaaat cactaagcat gtggcacaaa ttttggatag tcgcatgaat
2820actaaatacg atgaaaatga taaacttatt cgagaggtta aagtgattac
cttaaaatct 2880aaattagttt ctgacttccg aaaagatttc caattctata
aagtacgtga gattaacaat 2940taccatcatg cccatgatgc gtatctaaat
gccgtcgttg gaactgcttt gattaagaaa 3000tatccaaaac ttgaatcgga
gtttgtctat ggtgattata aagtttatga tgttcgtaaa 3060atgattgcta
agtctgagca agaaataggc aaagcaaccg caaaatattt cttttactct
3120aatatcatga acttcttcaa aacagaaatt acacttgcaa atggagagat
tcgcaaacgc 3180cctctaatcg aaactaatgg ggaaactgga gaaattgtct
gggataaagg gcgagatttt 3240gccacagtgc gcaaagtatt gtccatgccc
caagtcaata ttgtcaagaa aacagaagta 3300cagacaggcg gattctccaa
ggagtcaatt ttaccaaaaa gaaattcgga caagcttatt 3360gctcgtaaaa
aagactggga tccaaaaaaa tatggtggtt ttgatagtcc aacggtagct
3420tattcagtcc tagtggttgc taaggtggaa aaagggaaat cgaagaagtt
aaaatccgtt 3480aaagagttac tagggatcac aattatggaa agaagttcct
ttgaaaaaaa tccgattgac 3540tttttagaag ctaaaggata taaggaagtt
aaaaaagact taatcattaa actacctaaa 3600tatagtcttt ttgagttaga
aaacggtcgt aaacggatgc tggctagtgc cggagaatta 3660caaaaaggaa
atgagctggc tctgccaagc aaatatgtga attttttata tttagctagt
3720cattatgaaa agttgaaggg tagtccagaa gataacgaac aaaaacaatt
gtttgtggag 3780cagcataagc attatttaga tgagattatt gagcaaatca
gtgaattttc taagcgtgtt 3840attttagcag atgccaattt agataaagtt
cttagtgcat ataacaaaca tagagacaaa 3900ccaatacgtg aacaagcaga
aaatattatt catttattta cgttgacgaa tcttggagct 3960cccgctgctt
ttaaatattt tgatacaaca attgatcgta aacgatatac gtctacaaaa
4020gaagttttag atgccactct tatccatcaa tccatcactg gtctttatga
aacacgcatt 4080gatttgagtc agctaggagg tgactga 41072189DNASolanum
tuberosum 2gtaagtttct gcttctacct ttgatatata tataataatt atcattaatt
agtagtaata 60taatatttca aatatttttt tcaaaataaa agaatgtagt atatagcaat
tgcttttctg 120tagtttataa gtgtgtatat tttaatttat aacttttcta
atatatgacc aaaacatggt 180gatgtgcag 18939PRTSimian virus 40 3Met Ala
Pro Lys Lys Lys Arg Lys Val 1 5 418PRTAgrobacterium tumefaciens
4Lys Arg Pro Arg Asp Arg His Asp Gly Glu Leu Gly Gly Arg Lys Arg 1
5 10 15 Ala Arg 56717DNAArtificial SequenceMaize optimized Cas9
expression cassette 5gtgcagcgtg acccggtcgt gcccctctct agagataatg
agcattgcat gtctaagtta 60taaaaaatta ccacatattt tttttgtcac acttgtttga
agtgcagttt atctatcttt 120atacatatat ttaaacttta ctctacgaat
aatataatct atagtactac aataatatca 180gtgttttaga gaatcatata
aatgaacagt tagacatggt ctaaaggaca attgagtatt 240ttgacaacag
gactctacag ttttatcttt ttagtgtgca tgtgttctcc tttttttttg
300caaatagctt cacctatata atacttcatc cattttatta gtacatccat
ttagggttta 360gggttaatgg tttttataga ctaatttttt tagtacatct
attttattct attttagcct 420ctaaattaag aaaactaaaa ctctatttta
gtttttttat ttaataattt agatataaaa 480tagaataaaa taaagtgact
aaaaattaaa caaataccct ttaagaaatt aaaaaaacta 540aggaaacatt
tttcttgttt cgagtagata atgccagcct gttaaacgcc gtcgacgagt
600ctaacggaca ccaaccagcg aaccagcagc gtcgcgtcgg gccaagcgaa
gcagacggca 660cggcatctct gtcgctgcct ctggacccct ctcgagagtt
ccgctccacc gttggacttg 720ctccgctgtc ggcatccaga aattgcgtgg
cggagcggca gacgtgagcc ggcacggcag 780gcggcctcct cctcctctca
cggcaccggc agctacgggg gattcctttc ccaccgctcc 840ttcgctttcc
cttcctcgcc cgccgtaata aatagacacc ccctccacac cctctttccc
900caacctcgtg ttgttcggag cgcacacaca cacaaccaga tctcccccaa
atccacccgt 960cggcacctcc gcttcaaggt acgccgctcg tcctcccccc
cccccctctc taccttctct 1020agatcggcgt tccggtccat gcatggttag
ggcccggtag ttctacttct gttcatgttt 1080gtgttagatc cgtgtttgtg
ttagatccgt gctgctagcg ttcgtacacg gatgcgacct 1140gtacgtcaga
cacgttctga ttgctaactt gccagtgttt ctctttgggg aatcctggga
1200tggctctagc cgttccgcag acgggatcga tttcatgatt ttttttgttt
cgttgcatag 1260ggtttggttt gcccttttcc tttatttcaa tatatgccgt
gcacttgttt gtcgggtcat 1320cttttcatgc ttttttttgt cttggttgtg
atgatgtggt ctggttgggc ggtcgttcta 1380gatcggagta gaattctgtt
tcaaactacc tggtggattt attaattttg gatctgtatg 1440tgtgtgccat
acatattcat agttacgaat tgaagatgat ggatggaaat atcgatctag
1500gataggtata catgttgatg cgggttttac tgatgcatat acagagatgc
tttttgttcg 1560cttggttgtg atgatgtggt gtggttgggc ggtcgttcat
tcgttctaga tcggagtaga 1620atactgtttc aaactacctg gtgtatttat
taattttgga actgtatgtg tgtgtcatac 1680atcttcatag ttacgagttt
aagatggatg gaaatatcga tctaggatag gtatacatgt 1740tgatgtgggt
tttactgatg catatacatg atggcatatg cagcatctat tcatatgctc
1800taaccttgag tacctatcta ttataataaa caagtatgtt ttataattat
tttgatcttg 1860atatacttgg atgatggcat atgcagcagc tatatgtgga
tttttttagc cctgccttca 1920tacgctattt atttgcttgg tactgtttct
tttgtcgatg ctcaccctgt tgtttggtgt 1980tacttctgca ggtcgactct
agaggatcca tggcaccgaa gaagaagcgc aaggtgatgg 2040acaagaagta
cagcatcggc ctcgacatcg gcaccaactc ggtgggctgg gccgtcatca
2100cggacgaata taaggtcccg tcgaagaagt tcaaggtcct cggcaataca
gaccgccaca 2160gcatcaagaa aaacttgatc ggcgccctcc tgttcgatag
cggcgagacc gcggaggcga 2220ccaggctcaa gaggaccgcc aggagacggt
acactaggcg caagaacagg atctgctacc 2280tgcaggagat cttcagcaac
gagatggcga aggtggacga ctccttcttc caccgcctgg 2340aggaatcatt
cctggtggag gaggacaaga agcatgagcg gcacccaatc ttcggcaaca
2400tcgtcgacga ggtaagtttc tgcttctacc tttgatatat atataataat
tatcattaat 2460tagtagtaat ataatatttc aaatattttt ttcaaaataa
aagaatgtag tatatagcaa 2520ttgcttttct gtagtttata agtgtgtata
ttttaattta taacttttct aatatatgac 2580caaaacatgg tgatgtgcag
gtggcctacc acgagaagta cccgacaatc taccacctcc 2640ggaagaaact
ggtggacagc acagacaagg cggacctccg gctcatctac cttgccctcg
2700cgcatatgat caagttccgc ggccacttcc tcatcgaggg cgacctgaac
ccggacaact 2760ccgacgtgga caagctgttc atccagctcg tgcagacgta
caatcaactg ttcgaggaga 2820accccataaa cgctagcggc gtggacgcca
aggccatcct ctcggccagg ctctcgaaat 2880caagaaggct ggagaacctt
atcgcgcagt tgccaggcga aaagaagaac ggcctcttcg 2940gcaaccttat
tgcgctcagc ctcggcctga cgccgaactt caaatcaaac ttcgacctcg
3000cggaggacgc caagctccag ctctcaaagg acacctacga cgacgacctc
gacaacctcc 3060tggcccagat aggagaccag tacgcggacc tcttcctcgc
cgccaagaac ctctccgacg 3120ctatcctgct cagcgacatc cttcgggtca
acaccgaaat taccaaggca ccgctgtccg 3180ccagcatgat taaacgctac
gacgagcacc atcaggacct cacgctgctc aaggcactcg 3240tccgccagca
gctccccgag aagtacaagg agatcttctt cgaccaatca aaaaacggct
3300acgcgggata tatcgacggc ggtgccagcc aggaagagtt ctacaagttc
atcaaaccaa 3360tcctggagaa gatggacggc accgaggagt tgctggtcaa
gctcaacagg gaggacctcc 3420tcaggaagca gaggaccttc gacaacggct
ccatcccgca tcagatccac ctgggcgaac 3480tgcatgccat cctgcggcgc
caggaggact tctacccgtt cctgaaggat aaccgggaga 3540agatcgagaa
gatcttgacg ttccgcatcc catactacgt gggcccgctg gctcgcggca
3600actcccggtt cgcctggatg acccggaagt cggaggagac catcacaccc
tggaactttg 3660aggaggtggt cgataagggc gctagcgctc agagcttcat
cgagcgcatg accaacttcg 3720ataaaaacct gcccaatgaa aaagtcctcc
ccaagcactc gctgctctac gagtacttca 3780ccgtgtacaa cgagctcacc
aaggtcaaat acgtcaccga gggcatgcgg aagccggcgt 3840tcctgagcgg
cgagcagaag aaggcgatag tggacctcct cttcaagacc aacaggaagg
3900tgaccgtgaa gcaattaaaa gaggactact tcaagaaaat agagtgcttc
gactccgtgg 3960agatctcggg cgtggaggat cggttcaacg cctcactcgg
cacgtatcac gacctcctca 4020agatcattaa agacaaggac ttcctcgaca
acgaggagaa cgaggacatc ctcgaggaca 4080tcgtcctcac cctgaccctg
ttcgaggacc gcgaaatgat cgaggagagg ctgaagacct 4140acgcgcacct
gttcgacgac aaggtcatga aacagctcaa gaggcgccgc tacactggtt
4200ggggaaggct gtcccgcaag ctcattaatg gcatcaggga caagcagagc
ggcaagacca 4260tcctggactt cctcaagtcc gacgggttcg ccaaccgcaa
cttcatgcag ctcattcacg 4320acgactcgct cacgttcaag gaagacatcc
agaaggcaca ggtgagcggg cagggtgact 4380ccctccacga acacatcgcc
aacctggccg gctcgccggc cattaaaaag ggcatcctgc 4440agacggtcaa
ggtcgtcgac gagctcgtga aggtgatggg ccggcacaag cccgaaaata
4500tcgtcataga gatggccagg gagaaccaga ccacccaaaa agggcagaag
aactcgcgcg 4560agcggatgaa acggatcgag gagggcatta aagagctcgg
gtcccagatc ctgaaggagc 4620accccgtgga aaatacccag ctccagaatg
aaaagctcta cctctactac ctgcagaacg 4680gccgcgacat gtacgtggac
caggagctgg acattaatcg gctatcggac tacgacgtcg 4740accacatcgt
gccgcagtcg ttcctcaagg acgatagcat cgacaacaag gtgctcaccc
4800ggtcggataa aaatcggggc aagagcgaca acgtgcccag cgaggaggtc
gtgaagaaga 4860tgaaaaacta ctggcgccag ctcctcaacg cgaaactgat
cacccagcgc aagttcgaca 4920acctgacgaa ggcggaacgc ggtggcttga
gcgaactcga taaggcgggc ttcataaaaa 4980ggcagctggt cgagacgcgc
cagatcacga agcatgtcgc ccagatcctg gacagccgca 5040tgaatactaa
gtacgatgaa aacgacaagc tgatccggga ggtgaaggtg atcacgctga
5100agtccaagct cgtgtcggac ttccgcaagg acttccagtt ctacaaggtc
cgcgagatca 5160acaactacca ccacgcccac gacgcctacc tgaatgcggt
ggtcgggacc gccctgatca 5220agaagtaccc gaagctggag tcggagttcg
tgtacggcga ctacaaggtc tacgacgtgc 5280gcaaaatgat cgccaagtcc
gagcaggaga tcggcaaggc cacggcaaaa tacttcttct 5340actcgaacat
catgaacttc ttcaagaccg agatcaccct cgcgaacggc gagatccgca
5400agcgcccgct catcgaaacc aacggcgaga cgggcgagat cgtctgggat
aagggccggg 5460atttcgcgac ggtccgcaag gtgctctcca tgccgcaagt
caatatcgtg aaaaagacgg 5520aggtccagac gggcgggttc agcaaggagt
ccatcctccc gaagcgcaac tccgacaagc 5580tcatcgcgag gaagaaggat
tgggacccga aaaaatatgg cggcttcgac agcccgaccg 5640tcgcatacag
cgtcctcgtc gtggcgaagg tggagaaggg caagtcaaag aagctcaagt
5700ccgtgaagga gctgctcggg atcacgatta tggagcggtc ctccttcgag
aagaacccga 5760tcgacttcct agaggccaag ggatataagg aggtcaagaa
ggacctgatt attaaactgc 5820cgaagtactc gctcttcgag ctggaaaacg
gccgcaagag gatgctcgcc tccgcaggcg 5880agttgcagaa gggcaacgag
ctcgccctcc cgagcaaata cgtcaatttc ctgtacctcg 5940ctagccacta
tgaaaagctc aagggcagcc cggaggacaa cgagcagaag cagctcttcg
6000tggagcagca caagcattac ctggacgaga tcatcgagca gatcagcgag
ttctcgaagc 6060gggtgatcct cgccgacgcg aacctggaca aggtgctgtc
ggcatataac aagcaccgcg 6120acaaaccaat acgcgagcag gccgaaaata
tcatccacct cttcaccctc accaacctcg 6180gcgctccggc agccttcaag
tacttcgaca ccacgattga ccggaagcgg tacacgagca 6240cgaaggaggt
gctcgatgcg acgctgatcc accagagcat cacagggctc tatgaaacac
6300gcatcgacct gagccagctg ggcggagaca agagaccacg ggaccgccac
gatggcgagc 6360tgggaggccg caagcgggca aggtaggtac cgttaaccta
gacttgtcca tcttctggat 6420tggccaactt aattaatgta tgaaataaaa
ggatgcacac atagtgacat gctaatcact 6480ataatgtggg catcaaagtt
gtgtgttatg tgtaattact agttatctga ataaaagaga 6540aagagatcat
ccatatttct tatcctaaat gaatgtcacg tgtctttata attctttgat
6600gaaccagatg catttcatta accaaatcca tatacatata aatattaatc
atatataatt 6660aatatcaatt gggttagcaa aacaaatcta gtctaggtgt
gttttgcgaa tgcggcc 6717639RNAArtificial SequencecrRNA containing
the LIGCas-3 target sequence in the variable targeting domain
6gcguacgcgu acgugugguu uuagagcuau gcuguuuug 39786RNAStreptococcus
pyogenes M1 GAS (SF370)tracrRNA(1)..(86) 7ggaaccauuc aaaacagcau
agcaaguuaa aauaaggcua guccguuauc aacuugaaaa 60aguggcaccg agucggugcu
uuuuuu 86894RNAArtificial SequenceLong guide RNA containing the
LIGCas-3 target sequence in the variable targeting domain
8gcguacgcgu acgugugguu uuagagcuag aaauagcaag uuaaaauaag gcuaguccgu
60uaucaacuug aaaaaguggc accgagucgg ugcu 9491000DNAZea mays
9tgagagtaca atgatgaacc tagattaatc aatgccaaag tctgaaaaat gcaccctcag
60tctatgatcc agaaaatcaa gattgcttga ggccctgttc ggttgttccg gattagagcc
120ccggattaat tcctagccgg attacttctc taatttatat agattttgat
gagctggaat 180gaatcctggc ttattccggt acaaccgaac aggccctgaa
ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca
ccattagggt tagagatggt ggccatgggc gcatgtcctg 360gccaactttg
tatgatatat ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg
420gatttcttct gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga
gggggcatca 480aagatctggc tgtgtttcca gctgtttttg ttagccccat
cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt
ccaaagaatt tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt
ctactagcag taagtcgtgt ttagaaatta tttttttata tacctttttt
720ccttctatgt acagtaggac acagtgtcag cgccgcgttg acggagaata
tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact
tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt
aatacgcaaa cgttttgtcc caccttgact aatcacaaga 960gtggagcgta
ccttataaac cgagccgcaa gcaccgaatt 10001016DNAZea mays 10tttttttttt
tttttt 161159RNAArtificial SequenceShort guide RNA containing the
LIGCas-3 variable targeting domain 11gcguacgcgu acgugugguu
uuagagcuag aaauagcaag uuaaaauaag gcuaguccg 59121102DNAArtificial
SequenceMaize optimized long guide RNA expression cassette
containing the LIGCas-3 variable targeting domain 12tgagagtaca
atgatgaacc tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc
agaaaatcaa gattgcttga ggccctgttc ggttgttccg gattagagcc
120ccggattaat tcctagccgg attacttctc taatttatat agattttgat
gagctggaat 180gaatcctggc ttattccggt acaaccgaac aggccctgaa
ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca
ccattagggt tagagatggt ggccatgggc gcatgtcctg 360gccaactttg
tatgatatat ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg
420gatttcttct gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga
gggggcatca 480aagatctggc tgtgtttcca gctgtttttg ttagccccat
cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt
ccaaagaatt tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt
ctactagcag taagtcgtgt ttagaaatta tttttttata tacctttttt
720ccttctatgt acagtaggac acagtgtcag cgccgcgttg acggagaata
tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact
tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt
aatacgcaaa cgttttgtcc caccttgact aatcacaaga 960gtggagcgta
ccttataaac cgagccgcaa gcaccgaatt gcgtacgcgt acgtgtggtt
1020ttagagctag aaatagcaag ttaaaataag gctagtccgt tatcaacttg
aaaaagtggc 1080accgagtcgg tgcttttttt tt 11021327DNAZea mays
13gtactccatc cgccccatcg agtaggg 271424DNAZea mays 14gcacgtacgt
caccatcccg ccgg 241524DNAZea mays 15gacgtacgtg ccctactcga tggg
241624DNAZea mays 16gtaccgtacg tgccccggcg gagg 241724DNAZea mays
17ggaattgtac cgtacgtgcc ccgg 241820DNAZea mays 18gcgtacgcgt
acgtgtgagg 201922DNAZea mays 19gctggccgag gtcgactacc gg
222023DNAZea mays 20ggccgaggtc gactaccggc cgg 232123DNAZea mays
21ggcgcgagct cgtgcttcac cgg 232221DNAZea mays 22ggtgccaatc
atgcgtcgcg g 212320DNAZea mays 23ggtcgccatc acgggacagg 202424DNAZea
mays 24gtcgcggcac ctgtcccgtg atgg 242523DNAZea mays 25ggaatgctgg
aactgcaatg cgg 232624DNAZea mays 26gcagctcttc ttggggaatg ctgg
242723DNAZea mays 27gcagtaacag ctgctgtcaa tgg 232856DNAArtificial
SequenceMS26Cas-1 forward primer 28ctacactctt tccctacacg acgctcttcc
gatctaggac cggaagctcg ccgcgt 562954DNAArtificial SequenceMS26Cas-1
and MS26Cas-3 reverse primer 29caagcagaag acggcatacg agctcttccg
atcttcctgg aggacgacgt gctg 543059DNAArtificial SequenceMS26Cas-2
forward primer 30ctacactctt tccctacacg acgctcttcc gatctaaggt
cctggaggac gacgtgctg 593151DNAArtificial SequenceMS26Cas-2 and MS26
meganuclease reverse primer 31caagcagaag acggcatacg agctcttccg
atctccggaa gctcgccgcg t 513256DNAArtificial SequenceMS26Cas-3
forward primer 32ctacactctt tccctacacg acgctcttcc gatcttcctc
cggaagctcg ccgcgt 563359DNAArtificial SequenceMS26 Meganuclease
forward primer 33ctacactctt tccctacacg acgctcttcc gatctttcct
cctggaggac gacgtgctg 593463DNAArtificial SequenceLIGCas-1 forward
primer 34ctacactctt tccctacacg acgctcttcc gatctaggac tgtaacgatt
tacgcacctg 60ctg 633558DNAArtificial SequenceLIGCas-1 and LIGCas-2
reverse primer 35caagcagaag acggcatacg agctcttccg atctgcaaat
gagtagcagc gcacgtat 583663DNAArtificial SequenceLIGCas-2 forward
primer 36ctacactctt tccctacacg acgctcttcc gatcttcctc tgtaacgatt
tacgcacctg 60ctg 633760DNAArtificial SequenceLIGCas-3 forward
primer 37ctacactctt tccctacacg acgctcttcc gatctaaggc gcaaatgagt
agcagcgcac 603857DNAArtificial SequenceLIGCas-3 and LIG3-4
meganuclease reverse primer 38caagcagaag acggcatacg agctcttccg
atctcacctg ctgggaattg taccgta 573960DNAArtificial SequenceLIG3-4
meganuclease forward primer 39ctacactctt tccctacacg acgctcttcc
gatctccttc gcaaatgagt agcagcgcac 604058DNAArtificial
SequenceMS45Cas-1 forward primer 40ctacactctt tccctacacg acgctcttcc
gatctaggag gacccgttcg gcctcagt 584154DNAArtificial
SequenceMS45Cas-1, MS45Cas-2 and MS45Cas-3 reverse primer
41caagcagaag acggcatacg agctcttccg atctgccggc tggcattgtc tctg
544258DNAArtificial SequenceMS45Cas-2 forward primer 42ctacactctt
tccctacacg acgctcttcc gatcttcctg gacccgttcg gcctcagt
584358DNAArtificial SequenceMS45Cas-3 forward primer 43ctacactctt
tccctacacg acgctcttcc gatctgaagg gacccgttcg gcctcagt
584458DNAArtificial SequenceALSCas-1 forward primer 44ctacactctt
tccctacacg acgctcttcc gatctaaggc gacgatgggc gtctcctg
584553DNAArtificial SequenceALSCas-1, ALSCas-2 and ALSCas-3 reverse
primer 45caagcagaag acggcatacg agctcttccg atctgcgtct gcatcgccac ctc
534658DNAArtificial SequenceALSCas-2 forward primer 46ctacactctt
tccctacacg acgctcttcc gatctttccc gacgatgggc gtctcctg
584758DNAArtificial SequenceALSCas-3 forward primer 47ctacactctt
tccctacacg acgctcttcc gatctggaac gacgatgggc gtctcctg
584863DNAArtificial SequenceEPSPSCas-1 forward primer 48ctacactctt
tccctacacg acgctcttcc gatctggaag aggaaacata cgttgcattt 60cca
634957DNAArtificial SequenceEPSPSCas-1 and EPSPSCas-3 reverse primer
49caagcagaag acggcatacg agctcttccg atctggtgga aagttcccag ttgagga
575062DNAArtificial SequenceEPSPSCas-2 forward primer 50ctacactctt
tccctacacg acgctcttcc gatctaagcg gtggaaagtt cccagttgag 60ga
625158DNAArtificial SequenceEPSPSCas-2 reverse primer 51caagcagaag
acggcatacg agctcttccg atctgaggaa acatacgttg catttcca
585263DNAArtificial SequenceEPSPSCas-3 forward primer 52ctacactctt
tccctacacg acgctcttcc gatctccttg aggaaacata cgttgcattt 60cca
635343DNAArtificial SequenceForward primer for secondary PCR
53aatgatacgg cgaccaccga gatctacact ctttccctac acg
435418DNAArtificial SequenceReverse primer for secondary PCR
54caagcagaag acggcata 185593DNAZea mays 55ctgtaacgat ttacgcacct
gctgggaatt gtaccgtacg tgccccggcg gaggatatat 60atacctcaca cgtacgcgta
cgcgtatata tac 935698DNAArtificial SequenceMutation 1 for LIGCas-1
locus 56aggactgtaa cgatttacgc acctgctggg aattgtaccg tacgtgcccc
ggtcggagga 60tatatatacc tcacacgtac gcgtacgcgt atatatac
985798DNAArtificial SequenceMutation 2 for LIGCas-1 locus
57aggactgtaa cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggacggagga
60tatatatacc tcacacgtac gcgtacgcgt atatatac 985898DNAArtificial
SequenceMutation 3 for LIGCas-1 locus 58aggactgtaa cgatttacgc
acctgctggg aattgtaccg tacgtgcccc gggcggagga 60tatatatacc tcacacgtac
gcgtacgcgt atatatac 985995DNAArtificial SequenceMutation 4 for
LIGCas-1 locus 59aggactgtaa cgatttacgc acctgctggg aattgtaccg
tacgtgcggt cggaggatat 60atatacctca cacgtacgcg tacgcgtata tatac
956098DNAArtificial SequenceMutation 5 for LIGCas-1 locus
60aggactgtaa cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggccggagga
60tatatatacc tcacacgtac gcgtacgcgt atatatac 986196DNAArtificial
SequenceMutation 6 for LIGCas-1 locus 61aggactgtaa cgatttacgc
acctgctggg aattgtaccg tacgtgcccc gcggaggata 60tatatacctc acacgtacgc
gtacgcgtat atatac 966296DNAArtificial SequenceMutation 7 for
LIGCas-1 locus 62aggactgtaa cgatttacgc acctgctggg aattgtaccg
tacgtgcccc ggggaggata 60tatatacctc acacgtacgc gtacgcgtat atatac
966394DNAArtificial SequenceMutation 8 for LIGCas-1 locus
63aggactgtaa cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggaggatata
60tatacctcac acgtacgcgt acgcgtatat atac 946494DNAArtificial
SequenceMutation 9 for LIGCas-1 locus 64aggactgtaa cgatttacgc
acctgctggg aattgtaccg tacgtgcgtc ggaggatata 60tatacctcac acgtacgcgt
acgcgtatat atac 946543DNAArtificial SequenceMutation 10 for
LIGCas-1 locus 65aggactgtaa cgatttacgc acctgctggg aattgtaccg tac
436698DNAArtificial SequenceMutation 1 for LIGCas-2 locus
66tcctctgtaa cgatttacgc acctgctggg aattgtaccg tacgtgaccc cggcggagga
60tatatatacc tcacacgtac gcgtacgcgt atatatac 986796DNAArtificial
SequenceMutation 2 for LIGCas-2 locus 67tcctctgtaa cgatttacgc
acctgctggg aattgtaccg tacgtccccg gcggaggata 60tatatacctc acacgtacgc
gtacgcgtat atatac 966898DNAArtificial SequenceMutation 3 for
LIGCas-2 locus 68tcctctgtaa cgatttacgc acctgctggg aattgtaccg
tacgtgtccc cggcggagga 60tatatatacc tcacacgtac gcgtacgcgt atatatac
986994DNAArtificial SequenceMutation 4 for LIGCas-2 locus
69tcctctgtaa cgatttacgc acctgctggg aattgtaccg tacccccggc ggaggatata
60tatacctcac acgtacgcgt acgcgtatat atac 947093DNAArtificial
SequenceMutation 5 for LIGCas-2 locus 70tcctctgtaa cgatttacgc
acctgctggg aattgtaccg taccccggcg gaggatatat 60atacctcaca cgtacgcgta
cgcgtatata tac 937198DNAArtificial SequenceMutation 6 for LIGCas-2
locus 71tcctctgtaa cgatttacgc acctgctggg aattgtaccg tacgtggccc
cggcggagga 60tatatatacc tcacacgtac gcgtacgcgt atatatac
987292DNAArtificial SequenceMutation 7 for LIGCas-2 locus
72tcctctgtaa cgatttacgc acctgctggg aattgtaccg tacccggcgg aggatatata
60tacctcacac gtacgcgtac gcgtatatat ac 927399DNAArtificial
SequenceMutation 8 for LIGCas-2 locus 73tcctctgtaa cgatttacgc
acctgctggg aattgtaccg tacgtgaacc ccggcggagg 60atatatatac ctcacacgta
cgcgtacgcg tatatatac 997461DNAArtificial SequenceMutation 9 for
LIGCas-2 locus 74tcctctgtaa cgatttacgc acctgctggg aattgtaccg
tacgtgtacg cgtatatata 60c 617595DNAArtificial SequenceMutation 10
for LIGCas-2 locus 75tcctctgtaa cgatttacgc acctgctggg aattgtaccg
tacgccccgg cggaggatat 60atatacctca cacgtacgcg tacgcgtata tatac
957693DNAZea mays 76cgcaaatgag tagcagcgca cgtatatata cgcgtacgcg
tacgtgtgag gtatatatat 60cctccgccgg ggcacgtacg gtacaattcc cag
937798DNAArtificial SequenceMutation 1 for LIGCas-3 locus
77aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtacgtt gtgaggtata
60tatatcctcc gccggggcac gtacggtaca attcccag 987896DNAArtificial
SequenceMutation 2 for LIGCas-3 locus 78aaggcgcaaa tgagtagcag
cgcacgtata tatacgcgta cgcgtacggt gaggtatata 60tatcctccgc cggggcacgt
acggtacaat tcccag 967996DNAArtificial SequenceMutation 3 for
LIGCas-3 locus 79aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta
cgcgtactgt gaggtatata 60tatcctccgc cggggcacgt acggtacaat tcccag
968095DNAArtificial SequenceMutation 4 for LIGCas-3 locus
80aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtacgtg aggtatatat
60atcctccgcc ggggcacgta cggtacaatt cccag 958168DNAArtificial
SequenceMutation 5 for LIGCas-3 locus 81aaggcgcaaa tgagtagcag
cgcacgtata tatatcctcc gccggggcac gtacggtaca 60attcccag
688255DNAArtificial SequenceMutation 6 for LIGCas-3 locus
82aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cggtacaatt cccag
558393DNAArtificial SequenceMutation 7 for LIGCas-3 locus
83aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtgtgag gtatatatat
60cctccgccgg ggcacgtacg gtacaattcc cag 938469DNAArtificial
SequenceMutation 8 for LIGCas-3 locus 84aaggcgcaaa tgagtagcag
cgcacgtata tatacgcgta cgccggggca cgtacggtac 60aattcccag
698566DNAArtificial SequenceMutation 9 for LIGCas-3 locus
85aaggcgcaaa tgagtagcag cgcacgtata tatcctccgc cggggcacgt acggtacaat
60tcccag 668695DNAArtificial SequenceMutation 10 for LIGCas-3 locus
86aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtatgtg aggtatatat
60atcctccgcc ggggcacgta cggtacaatt cccag 958795DNAArtificial
SequenceMutation 1 for LIG3-4 homing endonuclease locus
87ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtacgtg aggtatatat
60atcctccgcc ggggcacgta cggtacaatt cccag 958868DNAArtificial
SequenceMutation 2 for LIG3-4 homing endonuclease locus
88ccttcgcaaa tgagtagcag cgcacgtata tatatcctcc gccggggcac gtacggtaca
60attcccag 688965DNAArtificial SequenceMutation 3 for LIG3-4 homing
endonuclease locus 89ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta
cgcgtacgta cggtacaatt 60cccag 659055DNAArtificial SequenceMutation
4 for LIG3-4 homing endonuclease locus 90ccttcgcaaa tgagtagcag
cgcacgtata tatacgcgta cggtacaatt cccag 559169DNAArtificial
SequenceMutation 5 for LIG3-4 homing endonuclease locus
91ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta cgccggggca cgtacggtac
60aattcccag 699285DNAArtificial SequenceMutation 6 for LIG3-4
homing endonuclease locus 92ccttcgcaaa tgagtagcag cgcacgtata
tatacgtgtg aggtatatat atcctccgcc 60ggggcacgta cggtacaatt cccag
859393DNAArtificial SequenceMutation 7 for LIG3-4 homing
endonuclease locus 93ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta
cgcgtgtgag gtatatatat 60cctccgccgg ggcacgtacg gtacaattcc cag
939466DNAArtificial SequenceMutation 8 for LIG3-4 homing
endonuclease locus 94ccttcgcaaa tgagtagcag cgcacgtata tatcctccgc
cggggcacgt acggtacaat 60tcccag 669595DNAArtificial SequenceMutation
9 for LIG3-4 homing endonuclease locus 95ccttcgcaaa tgagtagcag
cgcacgtata tatacgcgta cgcgtacgtg tggtatatat 60atcctccgcc ggggcacgta
cggtacaatt cccag 9596102DNAArtificial SequenceMutation 10 for
LIG3-4 homing endonuclease locus 96ccttcgcaaa tgagtagcag cgcacgtata
tatacgcgta cggtatatat acgtgtgagg 60tatatatatc ctccgccggg gcacgtacgg
tacaattccc ag 102975424DNAArtificial Sequencedonor DNA -HR Repair
DNA 97cccatagaaa actgtgtgct ataatacacc aaaaggaaag caaagtgaaa
aggaaacttt 60gaatagccaa gaagactcgg agtgcttcac gccttcacct atcccacata
ggtgatgagc 120taagagtaaa atgtagattc tctcgagtac tgaatattgc
ctgcactttt ccttgcagta 180aatacacctt taatccatga cgagagtcca
ctctttgagt ccgtcttgag attcttccat 240tgatcataca acatgacctc
gaagtcctga tggagaacaa cttatataat taaaactaca 300atacagaaag
ttcctgacaa ttaaaacctt tggtggtggc atgccgtagg ttaaaaaaaa
360tagataatga caacacaact ggagacacgc tctttgccga gtgctcacac
gtttgctgag 420agcgagcact cggcaaatat atgatttgcc gaataccacc
ctcctcggca aaacaataca 480ctaggcaaaa aggtagtttc ccatcaccat
gatgcccgcc gttaatgtac cttctatgcc 540gagtatgttg gcgctcagca
aagagatcgt taccggcgtt tgtttcacca agagctcttt 600gacgagtgtg
gcacacgaca aaaccttttg ccgagtgtaa ttagtcgttt gccaagtgac
660tggtgcagtt ggcaaaggag tcgtttatta tgtgtgggca aaatgatata
tggtgccagt 720tagggctagc aaattaaagg gggggggggg ggggttaggt
tgaagaaggt gacgagtaat 780aaggtctcgg acggccgcgc gcatatatat
cagatccgat ccaatggcac acggtgcaaa 840cgaaaagcac gaaatttcca
ccagcttaat tagggagaga aaaatagagc accagctgat 900gagtgaatga
atgagataga cgggacacag agggtccagc aggctagcct actctggccg
960ccctaaatag aagtcagtgc cgtgacgacg cgcaaacttc ttttgatcgg
ctgcggaaat 1020aatatactgt aacgatttac gcacctgctg ggaattgtac
cgtacgtgcc ccggcggagg 1080atatatatac ctcacacaag ggcgaattgt
actagttagt tagctagtcg gtcctagatg 1140ccgtaatcat tagctaatcg
taagtgacgc ttggacacga gcggcttgag ctaggaacct 1200acgaagtcat
cggaatcagc tcaggtgtac agaagttcct atactttctg gagaatagga
1260acttcggaat aggaacttcg tatacgctag ggccgcattc gcaaaacaca
cctagactag 1320atttgttttg ctaacccaat tgatattaat tatatatgat
taatatttat atgtatatgg 1380atttggttaa tgaaatgcat ctggttcatc
aaagaattat aaagacacgt gacattcatt 1440taggataaga aatatggatg
atctctttct cttttattca gataactagt aattacacat 1500aacacacaac
tttgatgccc acattatagt gattagcatg tcactatgtg tgcatccttt
1560tatttcatac attaattaag ttggccaatc cagaagatgg acaagtctag
gtttcgactc 1620agatctgcgt caccgggcgc accgggcgcg gcggggccgg
cagctcgaag tcgcgctgcc 1680agaagccgac gtcgtgccag ccgccgtgct
tgtagccggc ggcgcggagg gtgccgcggg 1740cggtgtagcc gagggcctcg
tggaggcgca cggacgggtc gttcgggagg ccgatcacgg 1800ccaccacgga
cttgaagccc tgggcctcca tgctcttgag gaggtgggtg tagagggtgg
1860agccgaggcc gaggcgctgg tggcggtggg acacgtacac ggtggactcc
acggtccagt 1920cgtaggcgtt gcgggccttc cacgggccgg cgtaggcgat
gccggccacc acgccctcca 1980cctcggccac gagccacggg tagcggtcct
ggaggcgctc caggtcgtcg atccactcct 2040gcggggtctg cggctcggtg
cggaagttca cggtggaggt ctcgatgtag tggttcacga 2100tgtcgcacac
ggcggccatg tcggcggcgg tggccgggcg gatctcgacg gggcggcgct
2160cgggggacat ggtgtcgtgt ggatcccggt ggatctgaag ttcctatact
ttctagagaa 2220taggaacttc ggaataggaa cttcgctagc gaattgatcc
tctagagtcg acctgcagaa 2280gtaacaccaa acaacagggt gagcatcgac
aaaagaaaca gtaccaagca aataaatagc 2340gtatgaaggc agggctaaaa
aaatccacat atagctgctg catatgccat catccaagta 2400tatcaagatc
aaaataatta taaaacatac ttgtttatta taatagatag gtactcaagg
2460ttagagcata tgaatagatg ctgcatatgc catcatgtat atgcatcagt
aaaacccaca 2520tcaacatgta tacctatcct agatcgatat ttccatccat
cttaaactcg taactatgaa 2580gatgtatgac acacacatac agttccaaaa
ttaataaata caccaggtag tttgaaacag 2640tattctactc cgatctagaa
cgaatgaacg accgcccaac cacaccacat catcacaacc 2700aagcgaacaa
aaagcatctc tgtatatgca tcagtaaaac ccgcatcaac atgtatacct
2760atcctagatc gatatttcca tccatcatct tcaattcgta actatgaata
tgtatggcac 2820acacatacag atccaaaatt aataaatcca ccaggtagtt
tgaaacagaa ttctactccg 2880atctagaacg accgcccaac cagaccacat
catcacaacc aagacaaaaa aaagcatgaa 2940aagatgaccc gacaaacaag
tgcacggcat atattgaaat aaaggaaaag ggcaaaccaa 3000accctatgca
acgaaacaaa aaaaatcatg aaatcgatcc cgtctgcgga acggctagag
3060ccatcccagg attccccaaa gagaaacact ggcaagttag caatcagaac
gtgtctgacg 3120tacaggtcgc atccgtgtac gaacgctagc agcacggatc
taacacaaac acggatctaa 3180cacaaacatg aacagaagta gaactaccgg
gccctaacca tgcatggacc ggaacgccga 3240tctagagaag gtagagaggg
ggggggggga ggacgagcgg cgtaccttga agcggaggtg 3300ccgacgggtg
gatttggggg agatctggtt gtgtgtgtgt gcgctccgaa caacacgagg
3360ttggggaaag agggtgtgga gggggtgtct atttattacg gcgggcgagg
aagggaaagc 3420gaaggagcgg tgggaaagga atcccccgta gctgccggtg
ccgtgagagg aggaggaggc 3480cgcctgccgt gccggctcac gtctgccgct
ccgccacgca atttctggat gccgacagcg 3540gagcaagtcc aacggtggag
cggaactctc gagaggggtc cagaggcagc gacagagatg 3600ccgtgccgtc
tgcttcgctt ggcccgacgc gacgctgctg gttcgctggt tggtgtccgt
3660tagactcgtc gacggcgttt aacaggctgg cattatctac tcgaaacaag
aaaaatgttt 3720ccttagtttt tttaatttct taaagggtat ttgtttaatt
tttagtcact ttattttatt 3780ctattttata tctaaattat taaataaaaa
aactaaaata gagttttagt tttcttaatt 3840tagaggctaa aatagaataa
aatagatgta ctaaaaaaat tagtctataa aaaccattaa 3900ccctaaaccc
taaatggatg tactaataaa atggatgaag tattatatag gtgaagctat
3960ttgcaaaaaa aaaggagaac acatgcacac taaaaagata aaactgtaga
gtcctgttgt 4020caaaatactc aattgtcctt tagaccatgt ctaactgttc
atttatatga ttctctaaaa 4080cactgatatt attgtagtac tatagattat
attattcgta gagtaaagtt taaatatatg 4140tataaagata gataaactgc
acttcaaaca agtgtgacaa aaaaaatatg tggtaatttt 4200ttataactta
gacatgcaat gctcattatc tctagagagg ggcacgaccg ggtcacgctg
4260cactgcaggc tagcggcgaa ttcgcccttg tacgcgtacg cgtatatata
cgtgcgctgc 4320tactcatttg cgcgggaata cagctcagtc tgctgtgcgc
tgcaggatgt acatacatac 4380atgcgcaggt gcaaagtcta cgcgcgcggg
caatgcaagc ccctggcgta gttgggccat 4440gactgagatc acgcctcatg
gtcatggaac gaaacaccgc gtccggccgg gctgcccctg 4500gcgtcacgcg
ggaggcagct gctagcgtta gcgtacgtac ccaccgtctc gtacacacca
4560ccgcagggag agagaagagc gatgcaatgc acatgtacag catccgcatc
atgcatagat 4620actcatatct tcaaggccac acatgcagca gtgtcgtacg
ctacgttgtt tcaacggagg 4680aggaggatac atacatagac acccacagcc
agcctagcat atagcagata gcatacggac 4740tcccgggtga ggaaaaatgg
agggcgaacc aaaccaacca caaagaagca gcagcagcag 4800cagcagcagc
tgcggctgct atcaccactc accaactcca attaaagatc tctctctctc
4860tctctactgg ccggccctgt cagtgccagc gcccggtttg ttgctagctg
agctgcgggc 4920gtcgctctta gatatagccc aaaactcact ccaccaccac
tcgttccatg gaaccctaga 4980ccaaaagtac tcgcgctctc ggccctcgct
ctcgccctct ccctctccgc agcaaaagag 5040atccggccgg ccgagaaggg
cgcgcgctag ctgcccggct actagctggc gcccgcccgc 5100gcatatatct
gtgtcatcgc catcacccac accatggccc ggccggccaa caccgccgta
5160ttagctctgt ctgtcgctcg tccacctgcg accgactgag cgatcgatct
ccaccgagct 5220ctccgctaag cgctgtcctt gccgccgtcc tcccctccgt
cccctacgca tccatttccg 5280tgtgctcgtg tgtgcgcgcg cgggcactcc
tgctcctgct ccctccggcc cctcctcccc 5340tcccaggctc ccagctagcc
gcgcccgccc gcgcgacctg cacctgcaca gatcgggcgg 5400ccgggccgac
cgatcgatcg agat 54249824DNAArtificial SequenceForward PCR primer
98cccgttattg tatgaggtaa tgac 249931DNAArtificial SequenceReverse
PCR primer for site-specific transgene insertion at junction 1
99gctcgtgtcc aagcgtcact tacgattagc t 3110032DNAArtificial
SequenceForward PCR primer for site-specific transgene insertion at
junction 2 100ccatgtctaa ctgttcattt atatgattct ct
3210124DNAArtificial SequenceReverse PCR primer for site-specific
transgene insertion at junction 2 101gcagccgata ggttcatcat cttc
241027850DNAArtificial SequenceLinked Cas9 and LIGCas-3 long guide
RNA expression cassettes 102gtgcagcgtg acccggtcgt gcccctctct
agagataatg agcattgcat gtctaagtta 60taaaaaatta ccacatattt tttttgtcac
acttgtttga agtgcagttt atctatcttt 120atacatatat ttaaacttta
ctctacgaat aatataatct atagtactac aataatatca 180gtgttttaga
gaatcatata aatgaacagt tagacatggt ctaaaggaca attgagtatt
240ttgacaacag gactctacag ttttatcttt ttagtgtgca tgtgttctcc
tttttttttg 300caaatagctt cacctatata atacttcatc cattttatta
gtacatccat ttagggttta 360gggttaatgg tttttataga ctaatttttt
tagtacatct attttattct attttagcct 420ctaaattaag aaaactaaaa
ctctatttta gtttttttat ttaataattt agatataaaa 480tagaataaaa
taaagtgact aaaaattaaa caaataccct ttaagaaatt aaaaaaacta
540aggaaacatt tttcttgttt cgagtagata atgccagcct gttaaacgcc
gtcgacgagt 600ctaacggaca ccaaccagcg aaccagcagc gtcgcgtcgg
gccaagcgaa gcagacggca 660cggcatctct gtcgctgcct ctggacccct
ctcgagagtt ccgctccacc gttggacttg 720ctccgctgtc ggcatccaga
aattgcgtgg cggagcggca gacgtgagcc ggcacggcag 780gcggcctcct
cctcctctca cggcaccggc agctacgggg gattcctttc ccaccgctcc
840ttcgctttcc cttcctcgcc cgccgtaata aatagacacc ccctccacac
cctctttccc 900caacctcgtg ttgttcggag cgcacacaca cacaaccaga
tctcccccaa atccacccgt 960cggcacctcc gcttcaaggt acgccgctcg
tcctcccccc cccccctctc taccttctct 1020agatcggcgt tccggtccat
gcatggttag ggcccggtag ttctacttct gttcatgttt 1080gtgttagatc
cgtgtttgtg ttagatccgt gctgctagcg ttcgtacacg gatgcgacct
1140gtacgtcaga cacgttctga ttgctaactt gccagtgttt ctctttgggg
aatcctggga 1200tggctctagc cgttccgcag acgggatcga tttcatgatt
ttttttgttt cgttgcatag 1260ggtttggttt gcccttttcc tttatttcaa
tatatgccgt gcacttgttt gtcgggtcat 1320cttttcatgc ttttttttgt
cttggttgtg atgatgtggt ctggttgggc ggtcgttcta 1380gatcggagta
gaattctgtt tcaaactacc tggtggattt attaattttg gatctgtatg
1440tgtgtgccat acatattcat agttacgaat tgaagatgat ggatggaaat
atcgatctag 1500gataggtata catgttgatg cgggttttac tgatgcatat
acagagatgc tttttgttcg 1560cttggttgtg atgatgtggt gtggttgggc
ggtcgttcat tcgttctaga tcggagtaga 1620atactgtttc aaactacctg
gtgtatttat taattttgga actgtatgtg tgtgtcatac 1680atcttcatag
ttacgagttt aagatggatg gaaatatcga tctaggatag gtatacatgt
1740tgatgtgggt tttactgatg catatacatg atggcatatg cagcatctat
tcatatgctc 1800taaccttgag tacctatcta ttataataaa caagtatgtt
ttataattat tttgatcttg 1860atatacttgg atgatggcat atgcagcagc
tatatgtgga tttttttagc cctgccttca 1920tacgctattt atttgcttgg
tactgtttct tttgtcgatg ctcaccctgt tgtttggtgt 1980tacttctgca
ggtcgactct agaggatcca tggcaccgaa gaagaagcgc aaggtgatgg
2040acaagaagta cagcatcggc ctcgacatcg gcaccaactc ggtgggctgg
gccgtcatca 2100cggacgaata taaggtcccg tcgaagaagt tcaaggtcct
cggcaataca gaccgccaca 2160gcatcaagaa aaacttgatc ggcgccctcc
tgttcgatag cggcgagacc gcggaggcga 2220ccaggctcaa gaggaccgcc
aggagacggt acactaggcg caagaacagg atctgctacc 2280tgcaggagat
cttcagcaac gagatggcga aggtggacga ctccttcttc caccgcctgg
2340aggaatcatt cctggtggag gaggacaaga agcatgagcg gcacccaatc
ttcggcaaca 2400tcgtcgacga ggtaagtttc tgcttctacc tttgatatat
atataataat tatcattaat 2460tagtagtaat ataatatttc aaatattttt
ttcaaaataa aagaatgtag tatatagcaa 2520ttgcttttct gtagtttata
agtgtgtata ttttaattta taacttttct aatatatgac 2580caaaacatgg
tgatgtgcag gtggcctacc acgagaagta cccgacaatc taccacctcc
2640ggaagaaact ggtggacagc acagacaagg cggacctccg gctcatctac
cttgccctcg 2700cgcatatgat caagttccgc ggccacttcc tcatcgaggg
cgacctgaac ccggacaact 2760ccgacgtgga caagctgttc atccagctcg
tgcagacgta caatcaactg ttcgaggaga 2820accccataaa cgctagcggc
gtggacgcca aggccatcct ctcggccagg ctctcgaaat 2880caagaaggct
ggagaacctt atcgcgcagt tgccaggcga aaagaagaac ggcctcttcg
2940gcaaccttat tgcgctcagc ctcggcctga cgccgaactt caaatcaaac
ttcgacctcg 3000cggaggacgc caagctccag ctctcaaagg acacctacga
cgacgacctc gacaacctcc 3060tggcccagat aggagaccag tacgcggacc
tcttcctcgc cgccaagaac ctctccgacg 3120ctatcctgct cagcgacatc
cttcgggtca acaccgaaat taccaaggca ccgctgtccg 3180ccagcatgat
taaacgctac gacgagcacc atcaggacct cacgctgctc aaggcactcg
3240tccgccagca gctccccgag aagtacaagg agatcttctt cgaccaatca
aaaaacggct 3300acgcgggata tatcgacggc ggtgccagcc aggaagagtt
ctacaagttc atcaaaccaa 3360tcctggagaa gatggacggc accgaggagt
tgctggtcaa gctcaacagg gaggacctcc 3420tcaggaagca gaggaccttc
gacaacggct ccatcccgca tcagatccac ctgggcgaac 3480tgcatgccat
cctgcggcgc caggaggact tctacccgtt cctgaaggat aaccgggaga
3540agatcgagaa gatcttgacg ttccgcatcc catactacgt gggcccgctg
gctcgcggca 3600actcccggtt cgcctggatg acccggaagt cggaggagac
catcacaccc tggaactttg 3660aggaggtggt cgataagggc gctagcgctc
agagcttcat cgagcgcatg accaacttcg 3720ataaaaacct gcccaatgaa
aaagtcctcc ccaagcactc gctgctctac gagtacttca 3780ccgtgtacaa
cgagctcacc aaggtcaaat acgtcaccga gggcatgcgg aagccggcgt
3840tcctgagcgg cgagcagaag aaggcgatag tggacctcct cttcaagacc
aacaggaagg 3900tgaccgtgaa gcaattaaaa gaggactact tcaagaaaat
agagtgcttc gactccgtgg 3960agatctcggg cgtggaggat cggttcaacg
cctcactcgg cacgtatcac gacctcctca 4020agatcattaa agacaaggac
ttcctcgaca acgaggagaa cgaggacatc ctcgaggaca 4080tcgtcctcac
cctgaccctg ttcgaggacc gcgaaatgat cgaggagagg ctgaagacct
4140acgcgcacct gttcgacgac aaggtcatga aacagctcaa gaggcgccgc
tacactggtt 4200ggggaaggct gtcccgcaag ctcattaatg gcatcaggga
caagcagagc ggcaagacca 4260tcctggactt cctcaagtcc gacgggttcg
ccaaccgcaa cttcatgcag ctcattcacg 4320acgactcgct cacgttcaag
gaagacatcc agaaggcaca ggtgagcggg cagggtgact 4380ccctccacga
acacatcgcc aacctggccg gctcgccggc cattaaaaag ggcatcctgc
4440agacggtcaa ggtcgtcgac gagctcgtga aggtgatggg ccggcacaag
cccgaaaata 4500tcgtcataga gatggccagg gagaaccaga ccacccaaaa
agggcagaag aactcgcgcg 4560agcggatgaa acggatcgag gagggcatta
aagagctcgg gtcccagatc ctgaaggagc 4620accccgtgga aaatacccag
ctccagaatg aaaagctcta cctctactac ctgcagaacg 4680gccgcgacat
gtacgtggac caggagctgg acattaatcg gctatcggac tacgacgtcg
4740accacatcgt gccgcagtcg ttcctcaagg acgatagcat cgacaacaag
gtgctcaccc 4800ggtcggataa aaatcggggc aagagcgaca acgtgcccag
cgaggaggtc gtgaagaaga 4860tgaaaaacta ctggcgccag ctcctcaacg
cgaaactgat cacccagcgc aagttcgaca 4920acctgacgaa ggcggaacgc
ggtggcttga gcgaactcga taaggcgggc ttcataaaaa 4980ggcagctggt
cgagacgcgc cagatcacga agcatgtcgc ccagatcctg gacagccgca
5040tgaatactaa gtacgatgaa aacgacaagc tgatccggga ggtgaaggtg
atcacgctga 5100agtccaagct cgtgtcggac ttccgcaagg acttccagtt
ctacaaggtc cgcgagatca 5160acaactacca ccacgcccac gacgcctacc
tgaatgcggt ggtcgggacc gccctgatca 5220agaagtaccc gaagctggag
tcggagttcg tgtacggcga ctacaaggtc tacgacgtgc 5280gcaaaatgat
cgccaagtcc gagcaggaga tcggcaaggc cacggcaaaa tacttcttct
5340actcgaacat catgaacttc ttcaagaccg agatcaccct cgcgaacggc
gagatccgca 5400agcgcccgct catcgaaacc aacggcgaga cgggcgagat
cgtctgggat aagggccggg 5460atttcgcgac ggtccgcaag gtgctctcca
tgccgcaagt caatatcgtg aaaaagacgg 5520aggtccagac gggcgggttc
agcaaggagt ccatcctccc gaagcgcaac tccgacaagc 5580tcatcgcgag
gaagaaggat tgggacccga aaaaatatgg cggcttcgac agcccgaccg
5640tcgcatacag cgtcctcgtc gtggcgaagg tggagaaggg caagtcaaag
aagctcaagt 5700ccgtgaagga gctgctcggg atcacgatta tggagcggtc
ctccttcgag aagaacccga 5760tcgacttcct agaggccaag ggatataagg
aggtcaagaa ggacctgatt attaaactgc 5820cgaagtactc gctcttcgag
ctggaaaacg gccgcaagag gatgctcgcc tccgcaggcg 5880agttgcagaa
gggcaacgag ctcgccctcc cgagcaaata cgtcaatttc ctgtacctcg
5940ctagccacta tgaaaagctc aagggcagcc cggaggacaa cgagcagaag
cagctcttcg 6000tggagcagca caagcattac ctggacgaga tcatcgagca
gatcagcgag ttctcgaagc 6060gggtgatcct cgccgacgcg aacctggaca
aggtgctgtc ggcatataac aagcaccgcg 6120acaaaccaat acgcgagcag
gccgaaaata tcatccacct cttcaccctc accaacctcg 6180gcgctccggc
agccttcaag tacttcgaca ccacgattga ccggaagcgg tacacgagca
6240cgaaggaggt gctcgatgcg acgctgatcc accagagcat cacagggctc
tatgaaacac 6300gcatcgacct gagccagctg ggcggagaca agagaccacg
ggaccgccac gatggcgagc 6360tgggaggccg caagcgggca aggtaggtac
cgttaaccta gacttgtcca tcttctggat 6420tggccaactt aattaatgta
tgaaataaaa ggatgcacac atagtgacat gctaatcact 6480ataatgtggg
catcaaagtt gtgtgttatg tgtaattact agttatctga ataaaagaga
6540aagagatcat ccatatttct tatcctaaat gaatgtcacg tgtctttata
attctttgat 6600gaaccagatg catttcatta accaaatcca tatacatata
aatattaatc atatataatt 6660aatatcaatt gggttagcaa aacaaatcta
gtctaggtgt gttttgcgaa tgcggccccc 6720cctcgaggtc gacggtatcg
ataagctttg agagtacaat gatgaaccta gattaatcaa 6780tgccaaagtc
tgaaaaatgc accctcagtc tatgatccag aaaatcaaga ttgcttgagg
6840ccctgttcgg ttgttccgga ttagagcccc ggattaattc ctagccggat
tacttctcta 6900atttatatag attttgatga gctggaatga atcctggctt
attccggtac aaccgaacag 6960gccctgaagg ataccagtaa tcgctgagct
aaattggcat gctgtcagag tgtcagtatt 7020gcagcaaggt agtgagataa
ccggcatcat ggtgccagtt tgatggcacc attagggtta 7080gagatggtgg
ccatgggcgc atgtcctggc caactttgta tgatatatgg cagggtgaat
7140aggaaagtaa aattgtattg taaaaaggga tttcttctgt ttgttagcgc
atgtacaagg 7200aatgcaagtt ttgagcgagg gggcatcaaa gatctggctg
tgtttccagc tgtttttgtt 7260agccccatcg aatccttgac ataatgatcc
cgcttaaata agcaacctcg cttgtatagt 7320tccttgtgct ctaacacacg
atgatgataa gtcgtaaaat agtggtgtcc aaagaatttc 7380caggcccagt
tgtaaaagct aaaatgctat tcgaatttct actagcagta agtcgtgttt
7440agaaattatt tttttatata ccttttttcc ttctatgtac agtaggacac
agtgtcagcg 7500ccgcgttgac ggagaatatt tgcaaaaaag taaaagagaa
agtcatagcg gcgtatgtgc 7560caaaaacttc gtcacagaga gggccataag
aaacatggcc cacggcccaa tacgaagcac 7620cgcgacgaag cccaaacagc
agtccgtagg tggagcaaag cgctgggtaa tacgcaaacg 7680ttttgtccca
ccttgactaa tcacaagagt ggagcgtacc ttataaaccg agccgcaagc
7740accgaattgc gtacgcgtac gtgtggtttt agagctagaa atagcaagtt
aaaataaggc 7800tagtccgtta tcaacttgaa aaagtggcac cgagtcggtg
cttttttttt 7850
103 23 DNA Zea mays
103 tgggcaggtc tcacgacggt tgg 23
104 91 DNA Zea mays
104 ccggtttcgc gtgctctggc tttacattac atgggcaggt ctcacgacgg ttgggctgga 60
gagccggctg gtaggggagg acctcaacgg c 91
105 90 DNA Artificial Sequence; Mutation 1 for 55CasRNA-1 locus
105 ccggtttcgc gtgctctggc tttacattac atgggcaggt ctcacgaggt tgggctggag 60
agccggctgg taggggagga cctcaacggc 90
106 90 DNA Artificial Sequence; Mutation 2 for 55CasRNA-1 locus
106 ccggtttcgc gtgctctggc tttacattac atgggcaggt ctcacacggt tgggctggag 60
agccggctgg taggggagga cctcaacggc 90
107 92 DNA Artificial Sequence; Mutation 3 for 55CasRNA-1 locus
107 ccggtttcgc gtgctctggc tttacattac atgggcaggt ctcacgacgg tttgggctgg 60
agagccggct ggtaggggag gacctcaacg gc 92
108 89 DNA Artificial Sequence; Mutation 4 for 55CasRNA-1 locus
108 ccggtttcgc gtgctctggc tttacattgc atgagcaggt cgtgacggtt gggctggaga 60
gccggctggt aggggaggac ctcaacggc 89
109 57 DNA Artificial Sequence; Mutation 5 for 55CasRNA-1 locus
109 gggcaggtct cgacggttgg gctggagagc cggctggtag gggaggacct caacggc 57
110 57 DNA Artificial Sequence; Mutation 6 for 55CasRNA-1 locus
110 ccggtttcgc gtgctcttgg gctggagagc cggctggtag gggaggacct caacggc 57
111 22 DNA Zea mays
111 atatacctca cacgtacgcg ta 22
112 1053 DNA Zea mays
112 atgaacacca
agtacaacaa ggagttcctg ctctacctgg ccggcttcgt ggacggcgac 60ggctccatca
aggcgcagat caagccgaac cagtcctgca agttcaagca ccagctctcc
120ctgaccttcc aggtgaccca gaagacgcag aggcgctggt tcctcgacaa
gctggtcgac 180gagatcgggg tgggctacgt ctacgaccgc gggtcggtgt
ccgactacga gctctcccag 240atcaagcccc tgcacaactt cctcacccag
ctccagccgt tcctcaagct gaagcagaag 300caggcgaacc tcgtcctgaa
gatcatcgag cagctcccct cggccaagga gtccccggac 360aagttcctgg
aggtgtgcac gtgggtcgac cagatcgcgg ccctcaacga cagcaagacc
420cgcaagacga cctcggagac ggtgcgggcg gtcctggact ccctcccagg
atccgtggga 480ggtctatcgc catctcaggc atccagcgcc gcatcctcgg
cttcctcaag cccgggttca 540gggatctccg aagcactcag agctggagca
actaagtcca aggaattcct gctctacctg 600gccggcttcg tggacggcga
cggctccatc atcgcgtcca tcaagccgcg ccagtgctac 660aagttcaagc
acgagctccg cctggagttc accgtgaccc agaagacgca gaggcgctgg
720ttcctcgaca agctggtcga cgagatcggg gtgggctacg tctacgaccg
cgggtcggtg 780tccgactacc gcctctccca gatcaagccc ctgcacaact
tcctcaccca gctccagccg 840ttcctcaagc tgaagcagaa gcaggcgaac
ctcgtcctga agatcatcga gcagctcccc 900tcggccaagg agtccccgga
caagttcctg gaggtgtgca cgtgggtcga ccagatcgcg 960gccctcaacg
acagcaagac ccgcaagacg acctcggaga cggtgcgggc ggtcctggac
1020tccctcagcg agaagaagaa gtcgtccccc tga 1053
113 22 DNA Zea mays
113 gatggtgacg tacgtgccct ac 22
114 1053 DNA Zea mays
114 atgaacacca
agtacaacaa ggagttcctc ctctacctgg caggtttcgt ggacggcgat 60gggtctatca
tcgcccagat taccccgcaa cagtcctaca agttcaagca cgccctgcgg
120ctgaggttca cggtcactca gaagacgcag cgcaggtggt tcctcgataa
gctggtcgac 180gaaatcggag tcggcaaggt gcgggacagg ggctctgtca
gcgactacat cctctcccag 240aagaagccgc tccacaactt cctgacccag
ctgcagccct tcctcaagct caagcagaag 300caggccaacc tggtgctcaa
gatcatcgag cagctgccat ctgccaagga gtcaccagac 360aagttccttg
aggtctgcac ctgggtcgat cagatcgctg ccctgaacga ctccaagacg
420aggaagacca cctccgagac cgtcagggct gtgctggact cactcccagg
atccgttggc 480ggtctcagcc cttctcaggc tagctcggct gcttcctcag
ccagcagctc acctggctcc 540ggtatcagcg aggctctcag agcaggtgcc
accaagtcca aggagttcct cctgtacctg 600gcaggcttcg ttgacggcga
cggctcgatc atggcgtcca ttaccccgaa ccagtcgtgt 660aagttcaagc
atcagctgcg cctgcgcttt accgtcacgc agaagaccca gaggcgctgg
720ttcctggaca aactggtgga cgagatcggg gtcgggaagg tgtacgacag
agggagcgtt 780agcgactacc ggctgtccca gaagaagccg ctccacaact
tcctgacgca gctccaaccc 840ttcctgaagc tgaagcagaa gcaggcgaac
cttgtgctga agatcattga gcagctgccg 900agcgccaagg agagccctga
caagttcctg gaggtctgca cctgggtcga ccagatcgct 960gccctcaacg
actccaagac caggaagacc acgagcgaga ccgttcgggc tgtcctggac
1020agcctctccg agaagaagaa gtcgagcccg tag 1053
115 4104 DNA Artificial sequence; soybean codon optimized Cas9
115 atggacaaaa agtactcaat
agggctcgac atagggacta actccgttgg atgggccgtc 60atcaccgacg agtacaaggt
gccctccaag aagttcaagg tgttgggaaa caccgacagg 120cacagcataa
agaagaattt gatcggtgcc ctcctcttcg actccggaga gaccgctgag
180gctaccaggc tcaagaggac cgctagaagg cgctacacca gaaggaagaa
cagaatctgc 240tacctgcagg agatcttctc caacgagatg gccaaggtgg
acgactcctt cttccaccgc 300cttgaggaat cattcctggt ggaggaggat
aaaaagcacg agagacaccc aatcttcggg 360aacatcgtcg acgaggtggc
ctaccatgaa aagtacccta ccatctacca cctgaggaag 420aagctggtcg
actctaccga caaggctgac ttgcgcttga tttacctggc tctcgctcac
480atgataaagt tccgcggaca cttcctcatt gagggagacc tgaacccaga
caactccgac 540gtggacaagc tcttcatcca gctcgttcag acctacaacc
agcttttcga ggagaaccca 600atcaacgcca gtggagttga cgccaaggct
atcctctctg ctcgtctgtc aaagtccagg 660aggcttgaga acttgattgc
ccagctgcct ggcgaaaaga agaacggact gttcggaaac 720ttgatcgctc
tctccctggg attgactccc aacttcaagt ccaacttcga cctcgccgag
780gacgctaagt tgcagttgtc taaagacacc tacgacgatg acctcgacaa
cttgctggcc 840cagataggcg accaatacgc cgatctcttc ctcgccgcta
agaacttgtc cgacgcaatc 900ctgctgtccg acatcctgag agtcaacact
gagattacca aagctcctct gtctgcttcc 960atgattaagc gctacgacga
gcaccaccaa gatctgaccc tgctcaaggc cctggtgaga 1020cagcagctgc
ccgagaagta caaggagatc tttttcgacc agtccaagaa cggctacgcc
1080ggatacattg acggaggcgc ctcccaggaa gagttctaca agttcatcaa
gcccatcctt 1140gagaagatgg acggtaccga ggagctgttg gtgaagttga
acagagagga cctgttgagg 1200aagcagagaa ccttcgacaa cggaagcatc
cctcaccaaa tccacctggg agagctccac 1260gccatcttga ggaggcagga
ggatttctat cccttcctga aggacaaccg cgagaagatt 1320gagaagatct
tgaccttcag aattccttac tacgtcgggc cactcgccag aggaaactct
1380aggttcgcct ggatgacccg caaatctgaa gagaccatta ctccctggaa
cttcgaggaa 1440gtcgtggaca agggcgcttc cgctcagtct ttcatcgaga
ggatgaccaa cttcgataaa 1500aatctgccca acgagaaggt gctgcccaag
cactccctgt tgtacgagta tttcacagtg 1560tacaacgagc tcaccaaggt
gaagtacgtc acagagggaa tgaggaagcc tgccttcttg 1620tccggagagc
agaagaaggc catcgtcgac ctgctcttca agaccaacag gaaggtgact
1680gtcaagcagc tgaaggagga ctacttcaag aagatcgagt gcttcgactc
cgtcgagatc 1740tctggtgtcg aggacaggtt caacgcctcc cttgggactt
accacgatct gctcaagatt 1800attaaagaca aggacttcct ggacaacgag
gagaacgagg acatccttga ggacatcgtg 1860ctcaccctga ccttgttcga
agacagggaa atgatcgaag agaggctcaa gacctacgcc 1920cacctcttcg
acgacaaggt gatgaaacag ctgaagagac gcagatatac cggctgggga
1980aggctctccc gcaaattgat caacgggatc agggacaagc agtcagggaa
gactatactc 2040gacttcctga agtccgacgg attcgccaac aggaacttca
tgcagctcat tcacgacgac 2100tccttgacct tcaaggagga catccagaag
gctcaggtgt ctggacaggg tgactccttg 2160catgagcaca ttgctaactt
ggccggctct cccgctatta agaagggcat tttgcagacc 2220gtgaaggtcg
ttgacgagct cgtgaaggtg atgggacgcc acaagccaga gaacatcgtt
2280attgagatgg ctcgcgagaa ccaaactacc cagaaagggc agaagaattc
ccgcgagagg 2340atgaagcgca ttgaggaggg cataaaagag cttggctctc
agatcctcaa ggagcacccc 2400gtcgagaaca ctcagctgca gaacgagaag
ctgtacctgt actacctcca aaacggaagg 2460gacatgtacg tggaccagga
gctggacatc aacaggttgt ccgactacga cgtcgaccac 2520atcgtgcctc
agtccttcct gaaggatgac tccatcgaca ataaagtgct gacacgctcc
2580gataaaaata gaggcaagtc cgacaacgtc ccctccgagg aggtcgtgaa
gaagatgaaa 2640aactactgga gacagctctt gaacgccaag ctcatcaccc
agcgtaagtt cgacaacctg 2700actaaggctg agagaggagg attgtccgag
ctcgataagg ccggattcat caagagacag 2760ctcgtcgaaa cccgccaaat
taccaagcac gtggcccaaa ttctggattc ccgcatgaac 2820accaagtacg
atgaaaatga caagctgatc cgcgaggtca aggtgatcac cttgaagtcc
2880aagctggtct ccgacttccg caaggacttc cagttctaca aggtgaggga
gatcaacaac 2940taccaccacg cacacgacgc ctacctcaac gctgtcgttg
gaaccgccct catcaaaaaa 3000tatcctaagc tggagtctga gttcgtctac
ggcgactaca aggtgtacga cgtgaggaag 3060atgatcgcta agtctgagca
ggagatcggc aaggccaccg ccaagtactt cttctactcc 3120aacatcatga
acttcttcaa gaccgagatc actctcgcca acggtgagat caggaagcgc
3180ccactgatcg agaccaacgg tgagactgga gagatcgtgt gggacaaagg
gagggatttc 3240gctactgtga ggaaggtgct ctccatgcct caggtgaaca
tcgtcaagaa gaccgaagtt 3300cagaccggag gattctccaa ggagtccatc
ctccccaaga gaaactccga caagctgatc 3360gctagaaaga aagactggga
ccctaagaag tacggaggct tcgattctcc taccgtggcc 3420tactctgtgc
tggtcgtggc caaggtggag aagggcaagt ccaagaagct gaaatccgtc
3480aaggagctcc tcgggattac catcatggag aggagttcct tcgagaagaa
ccctatcgac 3540ttcctggagg ccaagggata taaagaggtg aagaaggacc
tcatcatcaa gctgcccaag 3600tactccctct tcgagttgga gaacggaagg
aagaggatgc tggcttctgc cggagagttg 3660cagaagggaa atgagctcgc
ccttccctcc aagtacgtga acttcctgta cctcgcctct 3720cactatgaaa
agttgaaggg ctctcctgag gacaacgagc agaagcagct cttcgtggag
3780cagcacaagc actacctgga cgaaattatc gagcagatct ctgagttctc
caagcgcgtg 3840atattggccg acgccaacct cgacaaggtg ctgtccgcct
acaacaagca cagggataag 3900cccattcgcg agcaggctga aaacattatc
cacctgttta ccctcacaaa cttgggagcc 3960cctgctgcct tcaagtactt
cgacaccacc attgacagga agagatacac ctccaccaag 4020gaggtgctcg
acgcaacact catccaccaa tccatcaccg gcctctatga aacaaggatt
4080gacttgtccc agctgggagg cgac 4104
116 1503 DNA Glycine max
116 ccgggtttac ttattttgtg ggtatctata cttttattag atttttaatc
aggctcctga 60tttcttttta tttcgattga attcctgaac ttgtattatt cagtagatcg
aataaattat 120aaaaagataa aatcataaaa taatatttta tcctatcaat
catattaaag caatgaatat 180gtaaaattaa tcttatcttt attttaaaaa
atcatatagg tttagtattt ttttaaaaat 240aaagatagga ttagttttac
tattcactgc ttattacttt taaaaaaatc ataaaggttt 300agtatttttt
taaaataaat ataggaatag ttttactatt cactgcttta atagaaaaat
360agtttaaaat ttaagatagt tttaatccca gcatttgcca cgtttgaacg
tgagccgaaa 420cgatgtcgtt acattatctt aacctagctg aaacgatgtc
gtcataatat cgccaaatgc 480caactggact acgtcgaacc cacaaatccc
acaaagcgcg tgaaatcaaa tcgctcaaac 540cacaaaaaag aacaacgcgt
ttgttacacg ctcaatccca cgcgagtaga gcacagtaac 600cttcaaataa
gcgaatgggg cataatcaga aatccgaaat aaacctaggg gcattatcgg
660aaatgaaaag tagctcactc aatataaaaa tctaggaacc ctagttttcg
ttatcactct 720gtgctccctc gctctatttc tcagtctctg tgtttgcggc
tgaggattcc gaacgagtga 780ccttcttcgt ttctcgcaaa ggtaacagcc
tctgctcttg tctcttcgat tcgatctatg 840cctgtctctt atttacgatg
atgtttcttc ggttatgttt ttttatttat gctttatgct 900gttgatgttc
ggttgtttgt ttcgctttgt ttttgtggtt cagtttttta ggattctttt
960ggtttttgaa tcgattaatc ggaagagatt ttcgagttat ttggtgtgtt
ggaggtgaat 1020cttttttttg aggtcataga tctgttgtat ttgtgttata
aacatgcgac tttgtatgat 1080tttttacgag gttatgatgt tctggttgtt
ttattatgaa tctgttgaga cagaaccatg 1140atttttgttg atgttcgttt
acactattaa aggtttgttt taacaggatt aaaagttttt 1200taagcatgtt
gaaggagtct tgtagatatg taaccgtcga tagttttttt gtgggtttgt
1260tcacatgtta tcaagcttaa tcttttacta tgtatgcgac catatctgga
tccagcaaag 1320gcgatttttt aattccttgt gaaacttttg taatatgaag
ttgaaatttt gttattggta 1380aactataaat gtgtgaagtt ggagtatacc
tttaccttct tatttggctt tgtgatagtt 1440taatttatat gtattttgag
ttctgacttg tatttctttg aattgattct agtttaagta 1500atc
1503
117 33 DNA Artificial sequence; linker SV40 NLS
117 tctagagccg atcccaagaa gaagagaaag gtg 33
118 1379 PRT Artificial sequence; Cas9 with a SV40 NLS
118 Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr
Asn Ser Val 1 5 10 15 Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val
Pro Ser Lys Lys Phe 20 25 30 Lys Val Leu Gly Asn Thr Asp Arg His
Ser Ile Lys Lys Asn Leu Ile 35 40 45 Gly Ala Leu Leu Phe Asp Ser
Gly Glu Thr Ala Glu Ala Thr Arg Leu 50 55 60 Lys Arg Thr Ala Arg
Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys 65 70 75 80 Tyr Leu Gln
Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser 85 90 95 Phe
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys 100 105
110 His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125 His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu
Val Asp 130 135 140 Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu
Ala Leu Ala His 145 150 155 160 Met Ile Lys Phe Arg Gly His Phe Leu
Ile Glu Gly Asp Leu Asn Pro 165 170 175 Asp Asn Ser Asp Val Asp Lys
Leu Phe Ile Gln Leu Val Gln Thr Tyr 180 185 190 Asn Gln Leu Phe Glu
Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala 195 200 205 Lys Ala Ile
Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn 210 215 220 Leu
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn 225 230
235 240 Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn
Phe 245 250 255 Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp
Thr Tyr Asp 260 265 270 Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly
Asp Gln Tyr Ala Asp 275 280 285 Leu Phe Leu Ala Ala Lys Asn Leu Ser
Asp Ala Ile Leu Leu Ser Asp 290 295 300 Ile Leu Arg Val Asn Thr Glu
Ile Thr Lys Ala Pro Leu Ser Ala Ser 305 310 315 320 Met Ile Lys Arg
Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys 325 330 335 Ala Leu
Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe 340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 355
360 365 Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met
Asp 370 375 380 Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp
Leu Leu Arg 385 390 395 400 Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
Pro His Gln Ile His Leu 405 410 415 Gly Glu Leu His Ala Ile Leu Arg
Arg Gln Glu Asp Phe Tyr Pro Phe 420 425 430 Leu Lys Asp Asn Arg Glu
Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 435 440 445 Pro Tyr Tyr Val
Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp 450 455 460 Met Thr
Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu 465 470 475
480 Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495 Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys
His Ser 500 505 510 Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu
Thr Lys Val Lys 515 520 525 Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
Phe Leu Ser Gly Glu Gln 530 535 540 Lys Lys Ala Ile Val Asp Leu Leu
Phe Lys Thr Asn Arg Lys Val Thr 545 550 555 560 Val Lys Gln Leu Lys
Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 565 570 575 Ser Val Glu
Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly 580 585 590 Thr
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp 595 600
605 Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620 Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr
Tyr Ala 625 630 635 640 His Leu Phe Asp Asp Lys Val Met Lys Gln Leu
Lys Arg Arg Arg Tyr 645 650 655 Thr Gly Trp Gly Arg Leu Ser Arg Lys
Leu Ile Asn Gly Ile Arg Asp 660 665 670 Lys Gln Ser Gly Lys Thr Ile
Leu Asp Phe Leu Lys Ser Asp Gly Phe 675 680 685 Ala Asn Arg Asn Phe
Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 690 695 700 Lys Glu Asp
Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu 705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly 725
730 735 Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met
Gly 740 745 750 Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg
Glu Asn Gln 755 760 765 Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu
Arg Met Lys Arg Ile 770 775 780 Glu Glu Gly Ile Lys Glu Leu Gly Ser
Gln Ile Leu Lys Glu His Pro 785 790 795 800 Val Glu Asn Thr Gln Leu
Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 805 810 815 Gln Asn Gly Arg
Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 820 825 830 Leu Ser
Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys 835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 850
855 860 Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met
Lys 865 870 875 880 Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile
Thr Gln Arg Lys 885 890 895 Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
Gly Leu Ser Glu Leu Asp 900 905 910 Lys Ala Gly Phe Ile Lys Arg Gln
Leu Val Glu Thr Arg Gln Ile Thr 915 920 925 Lys His Val Ala Gln Ile
Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 930 935 940 Glu Asn Asp Lys
Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser 945 950 955 960 Lys
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg 965 970
975 Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990 Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser
Glu Phe 995 1000 1005 Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg
Lys Met Ile Ala 1010 1015 1020 Lys Ser Glu Gln Glu Ile Gly Lys Ala
Thr Ala Lys Tyr Phe Phe 1025 1030 1035 Tyr Ser Asn Ile Met Asn Phe
Phe Lys Thr Glu Ile Thr Leu Ala 1040 1045 1050 Asn Gly Glu Ile Arg
Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu 1055 1060 1065 Thr Gly Glu
Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val 1070 1075 1080 Arg
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr 1085 1090
1095 Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110 Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp
Asp Pro 1115 1120 1125 Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val
Ala Tyr Ser Val 1130 1135 1140 Leu Val Val Ala Lys Val Glu Lys Gly
Lys Ser Lys Lys Leu Lys 1145 1150 1155 Ser Val Lys Glu Leu Leu Gly
Ile Thr Ile Met Glu Arg Ser Ser 1160 1165 1170 Phe Glu Lys Asn Pro
Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys 1175 1180 1185 Glu Val Lys
Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu 1190 1195 1200 Phe
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly 1205 1210
1215 Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230 Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys
Gly Ser 1235 1240 1245 Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val
Glu Gln His Lys 1250 1255 1260 His Tyr Leu Asp Glu Ile Ile Glu Gln
Ile Ser Glu Phe Ser Lys 1265 1270 1275 Arg Val Ile Leu Ala Asp Ala
Asn Leu Asp Lys Val Leu Ser Ala 1280 1285 1290 Tyr Asn Lys His Arg
Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn 1295 1300 1305 Ile Ile His
Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala 1310 1315 1320 Phe
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser 1325 1330
1335 Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350 Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly
Gly Asp 1355 1360 1365 Ser Arg Ala Asp Pro Lys Lys Lys Arg Lys Val
1370 1375
119 8519 DNA Artificial sequence; QC782
119 ccgggtttac
ttattttgtg ggtatctata cttttattag atttttaatc aggctcctga 60tttcttttta
tttcgattga attcctgaac ttgtattatt cagtagatcg aataaattat
120aaaaagataa aatcataaaa taatatttta tcctatcaat catattaaag
caatgaatat 180gtaaaattaa tcttatcttt attttaaaaa atcatatagg
tttagtattt ttttaaaaat 240aaagatagga ttagttttac tattcactgc
ttattacttt taaaaaaatc ataaaggttt 300agtatttttt taaaataaat
ataggaatag ttttactatt cactgcttta atagaaaaat 360agtttaaaat
ttaagatagt tttaatccca gcatttgcca cgtttgaacg tgagccgaaa
420cgatgtcgtt acattatctt aacctagctg aaacgatgtc gtcataatat
cgccaaatgc 480caactggact acgtcgaacc cacaaatccc acaaagcgcg
tgaaatcaaa tcgctcaaac 540cacaaaaaag aacaacgcgt ttgttacacg
ctcaatccca cgcgagtaga gcacagtaac 600cttcaaataa gcgaatgggg
cataatcaga aatccgaaat aaacctaggg gcattatcgg 660aaatgaaaag
tagctcactc aatataaaaa tctaggaacc ctagttttcg ttatcactct
720gtgctccctc gctctatttc tcagtctctg tgtttgcggc tgaggattcc
gaacgagtga 780ccttcttcgt ttctcgcaaa ggtaacagcc tctgctcttg
tctcttcgat tcgatctatg 840cctgtctctt atttacgatg atgtttcttc
ggttatgttt ttttatttat gctttatgct 900gttgatgttc ggttgtttgt
ttcgctttgt ttttgtggtt cagtttttta ggattctttt 960ggtttttgaa
tcgattaatc ggaagagatt ttcgagttat ttggtgtgtt ggaggtgaat
1020cttttttttg aggtcataga tctgttgtat ttgtgttata aacatgcgac
tttgtatgat 1080tttttacgag gttatgatgt tctggttgtt
ttattatgaa tctgttgaga cagaaccatg 1140atttttgttg atgttcgttt
acactattaa aggtttgttt taacaggatt aaaagttttt 1200taagcatgtt
gaaggagtct tgtagatatg taaccgtcga tagttttttt gtgggtttgt
1260tcacatgtta tcaagcttaa tcttttacta tgtatgcgac catatctgga
tccagcaaag 1320gcgatttttt aattccttgt gaaacttttg taatatgaag
ttgaaatttt gttattggta 1380aactataaat gtgtgaagtt ggagtatacc
tttaccttct tatttggctt tgtgatagtt 1440taatttatat gtattttgag
ttctgacttg tatttctttg aattgattct agtttaagta 1500atccatggac
aaaaagtact caatagggct cgacataggg actaactccg ttggatgggc
1560cgtcatcacc gacgagtaca aggtgccctc caagaagttc aaggtgttgg
gaaacaccga 1620caggcacagc ataaagaaga atttgatcgg tgccctcctc
ttcgactccg gagagaccgc 1680tgaggctacc aggctcaaga ggaccgctag
aaggcgctac accagaagga agaacagaat 1740ctgctacctg caggagatct
tctccaacga gatggccaag gtggacgact ccttcttcca 1800ccgccttgag
gaatcattcc tggtggagga ggataaaaag cacgagagac acccaatctt
1860cgggaacatc gtcgacgagg tggcctacca tgaaaagtac cctaccatct
accacctgag 1920gaagaagctg gtcgactcta ccgacaaggc tgacttgcgc
ttgatttacc tggctctcgc 1980tcacatgata aagttccgcg gacacttcct
cattgaggga gacctgaacc cagacaactc 2040cgacgtggac aagctcttca
tccagctcgt tcagacctac aaccagcttt tcgaggagaa 2100cccaatcaac
gccagtggag ttgacgccaa ggctatcctc tctgctcgtc tgtcaaagtc
2160caggaggctt gagaacttga ttgcccagct gcctggcgaa aagaagaacg
gactgttcgg 2220aaacttgatc gctctctccc tgggattgac tcccaacttc
aagtccaact tcgacctcgc 2280cgaggacgct aagttgcagt tgtctaaaga
cacctacgac gatgacctcg acaacttgct 2340ggcccagata ggcgaccaat
acgccgatct cttcctcgcc gctaagaact tgtccgacgc 2400aatcctgctg
tccgacatcc tgagagtcaa cactgagatt accaaagctc ctctgtctgc
2460ttccatgatt aagcgctacg acgagcacca ccaagatctg accctgctca
aggccctggt 2520gagacagcag ctgcccgaga agtacaagga gatctttttc
gaccagtcca agaacggcta 2580cgccggatac attgacggag gcgcctccca
ggaagagttc tacaagttca tcaagcccat 2640ccttgagaag atggacggta
ccgaggagct gttggtgaag ttgaacagag aggacctgtt 2700gaggaagcag
agaaccttcg acaacggaag catccctcac caaatccacc tgggagagct
2760ccacgccatc ttgaggaggc aggaggattt ctatcccttc ctgaaggaca
accgcgagaa 2820gattgagaag atcttgacct tcagaattcc ttactacgtc
gggccactcg ccagaggaaa 2880ctctaggttc gcctggatga cccgcaaatc
tgaagagacc attactccct ggaacttcga 2940ggaagtcgtg gacaagggcg
cttccgctca gtctttcatc gagaggatga ccaacttcga 3000taaaaatctg
cccaacgaga aggtgctgcc caagcactcc ctgttgtacg agtatttcac
3060agtgtacaac gagctcacca aggtgaagta cgtcacagag ggaatgagga
agcctgcctt 3120cttgtccgga gagcagaaga aggccatcgt cgacctgctc
ttcaagacca acaggaaggt 3180gactgtcaag cagctgaagg aggactactt
caagaagatc gagtgcttcg actccgtcga 3240gatctctggt gtcgaggaca
ggttcaacgc ctcccttggg acttaccacg atctgctcaa 3300gattattaaa
gacaaggact tcctggacaa cgaggagaac gaggacatcc ttgaggacat
3360cgtgctcacc ctgaccttgt tcgaagacag ggaaatgatc gaagagaggc
tcaagaccta 3420cgcccacctc ttcgacgaca aggtgatgaa acagctgaag
agacgcagat ataccggctg 3480gggaaggctc tcccgcaaat tgatcaacgg
gatcagggac aagcagtcag ggaagactat 3540actcgacttc ctgaagtccg
acggattcgc caacaggaac ttcatgcagc tcattcacga 3600cgactccttg
accttcaagg aggacatcca gaaggctcag gtgtctggac agggtgactc
3660cttgcatgag cacattgcta acttggccgg ctctcccgct attaagaagg
gcattttgca 3720gaccgtgaag gtcgttgacg agctcgtgaa ggtgatggga
cgccacaagc cagagaacat 3780cgttattgag atggctcgcg agaaccaaac
tacccagaaa gggcagaaga attcccgcga 3840gaggatgaag cgcattgagg
agggcataaa agagcttggc tctcagatcc tcaaggagca 3900ccccgtcgag
aacactcagc tgcagaacga gaagctgtac ctgtactacc tccaaaacgg
3960aagggacatg tacgtggacc aggagctgga catcaacagg ttgtccgact
acgacgtcga 4020ccacatcgtg cctcagtcct tcctgaagga tgactccatc
gacaataaag tgctgacacg 4080ctccgataaa aatagaggca agtccgacaa
cgtcccctcc gaggaggtcg tgaagaagat 4140gaaaaactac tggagacagc
tcttgaacgc caagctcatc acccagcgta agttcgacaa 4200cctgactaag
gctgagagag gaggattgtc cgagctcgat aaggccggat tcatcaagag
4260acagctcgtc gaaacccgcc aaattaccaa gcacgtggcc caaattctgg
attcccgcat 4320gaacaccaag tacgatgaaa atgacaagct gatccgcgag
gtcaaggtga tcaccttgaa 4380gtccaagctg gtctccgact tccgcaagga
cttccagttc tacaaggtga gggagatcaa 4440caactaccac cacgcacacg
acgcctacct caacgctgtc gttggaaccg ccctcatcaa 4500aaaatatcct
aagctggagt ctgagttcgt ctacggcgac tacaaggtgt acgacgtgag
4560gaagatgatc gctaagtctg agcaggagat cggcaaggcc accgccaagt
acttcttcta 4620ctccaacatc atgaacttct tcaagaccga gatcactctc
gccaacggtg agatcaggaa 4680gcgcccactg atcgagacca acggtgagac
tggagagatc gtgtgggaca aagggaggga 4740tttcgctact gtgaggaagg
tgctctccat gcctcaggtg aacatcgtca agaagaccga 4800agttcagacc
ggaggattct ccaaggagtc catcctcccc aagagaaact ccgacaagct
4860gatcgctaga aagaaagact gggaccctaa gaagtacgga ggcttcgatt
ctcctaccgt 4920ggcctactct gtgctggtcg tggccaaggt ggagaagggc
aagtccaaga agctgaaatc 4980cgtcaaggag ctcctcggga ttaccatcat
ggagaggagt tccttcgaga agaaccctat 5040cgacttcctg gaggccaagg
gatataaaga ggtgaagaag gacctcatca tcaagctgcc 5100caagtactcc
ctcttcgagt tggagaacgg aaggaagagg atgctggctt ctgccggaga
5160gttgcagaag ggaaatgagc tcgcccttcc ctccaagtac gtgaacttcc
tgtacctcgc 5220ctctcactat gaaaagttga agggctctcc tgaggacaac
gagcagaagc agctcttcgt 5280ggagcagcac aagcactacc tggacgaaat
tatcgagcag atctctgagt tctccaagcg 5340cgtgatattg gccgacgcca
acctcgacaa ggtgctgtcc gcctacaaca agcacaggga 5400taagcccatt
cgcgagcagg ctgaaaacat tatccacctg tttaccctca caaacttggg
5460agcccctgct gccttcaagt acttcgacac caccattgac aggaagagat
acacctccac 5520caaggaggtg ctcgacgcaa cactcatcca ccaatccatc
accggcctct atgaaacaag 5580gattgacttg tcccagctgg gaggcgactc
tagagccgat cccaagaaga agagaaaggt 5640gtaggttaac ctagacttgt
ccatcttctg gattggccaa cttaattaat gtatgaaata 5700aaaggatgca
cacatagtga catgctaatc actataatgt gggcatcaaa gttgtgtgtt
5760atgtgtaatt actagttatc tgaataaaag agaaagagat catccatatt
tcttatccta 5820aatgaatgtc acgtgtcttt ataattcttt gatgaaccag
atgcatttca ttaaccaaat 5880ccatatacat ataaatatta atcatatata
attaatatca attgggttag caaaacaaat 5940ctagtctagg tgtgttttgc
gaatgcggcc gctcgagggg gggcccggta ccggcgcgcc 6000gttctatagt
gtcacctaaa tcgtatgtgt atgatacata aggttatgta ttaattgtag
6060ccgcgttcta acgacaatat gtccatatgg tgcactctca gtacaatctg
ctctgatgcc 6120gcatagttaa gccagccccg acacccgcca acacccgctg
acgcgccctg acgggcttgt 6180ctgctcccgg catccgctta cagacaagct
gtgaccgtct ccgggagctg catgtgtcag 6240aggttttcac cgtcatcacc
gaaacgcgcg agacgaaagg gcctcgtgat acgcctattt 6300ttataggtta
atgtcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca
6360gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg
cgtaatctgc 6420tgcttgcaaa caaaaaaacc accgctacca gcggtggttt
gtttgccgga tcaagagcta 6480ccaactcttt ttccgaaggt aactggcttc
agcagagcgc agataccaaa tactgtcctt 6540ctagtgtagc cgtagttagg
ccaccacttc aagaactctg tagcaccgcc tacatacctc 6600gctctgctaa
tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg
6660ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac
ggggggttcg 6720tgcacacagc ccagcttgga gcgaacgacc tacaccgaac
tgagatacct acagcgtgag 6780cattgagaaa gcgccacgct tcccgaaggg
agaaaggcgg acaggtatcc ggtaagcggc 6840agggtcggaa caggagagcg
cacgagggag cttccagggg gaaacgcctg gtatctttat 6900agtcctgtcg
ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg
6960gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct
ggccttttgc 7020tggccttttg ctcacatgtt ctttcctgcg ttatcccctg
attctgtgga taaccgtatt 7080accgcctttg agtgagctga taccgctcgc
cgcagccgaa cgaccgagcg cagcgagtca 7140gtgagcgagg aagcggaaga
gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg 7200attcattaat
gcaggttgat cagatctcga tcccgcgaaa ttaatacgac tcactatagg
7260gagaccacaa cggtttccct ctagaaataa ttttgtttaa ctttaagaag
gagatatacc 7320catggaaaag cctgaactca ccgcgacgtc tgtcgagaag
tttctgatcg aaaagttcga 7380cagcgtctcc gacctgatgc agctctcgga
gggcgaagaa tctcgtgctt tcagcttcga 7440tgtaggaggg cgtggatatg
tcctgcgggt aaatagctgc gccgatggtt tctacaaaga 7500tcgttatgtt
tatcggcact ttgcatcggc cgcgctcccg attccggaag tgcttgacat
7560tggggaattc agcgagagcc tgacctattg catctcccgc cgtgcacagg
gtgtcacgtt 7620gcaagacctg cctgaaaccg aactgcccgc tgttctgcag
ccggtcgcgg aggctatgga 7680tgcgatcgct gcggccgatc ttagccagac
gagcgggttc ggcccattcg gaccgcaagg 7740aatcggtcaa tacactacat
ggcgtgattt catatgcgcg attgctgatc cccatgtgta 7800tcactggcaa
actgtgatgg acgacaccgt cagtgcgtcc gtcgcgcagg ctctcgatga
7860gctgatgctt tgggccgagg actgccccga agtccggcac ctcgtgcacg
cggatttcgg 7920ctccaacaat gtcctgacgg acaatggccg cataacagcg
gtcattgact ggagcgaggc 7980gatgttcggg gattcccaat acgaggtcgc
caacatcttc ttctggaggc cgtggttggc 8040ttgtatggag cagcagacgc
gctacttcga gcggaggcat ccggagcttg caggatcgcc 8100gcggctccgg
gcgtatatgc tccgcattgg tcttgaccaa ctctatcaga gcttggttga
8160cggcaatttc gatgatgcag cttgggcgca gggtcgatgc gacgcaatcg
tccgatccgg 8220agccgggact gtcgggcgta cacaaatcgc ccgcagaagc
gcggccgtct ggaccgatgg 8280ctgtgtagaa gtactcgccg atagtggaaa
ccgacgcccc agcactcgtc cgagggcaaa 8340ggaatagtga ggtacagctt
ggatcgatcc ggctgctaac aaagcccgaa aggaagctga 8400gttggctgct
gccaccgctg agcaataact agcataaccc cttggggcct ctaaacgggt
8460cttgaggggt tttttgctga aaggaggaac tatatccgga tgatcgggcg
cgccggtac 8519
120 434 DNA Glycine max
120 ccgggtgtga tttagtataa
agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg
aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt
180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct
gaaacttgag agagatacac 300taatcttgcc ttgttgtttc attccctaac
ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca
caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca acaa
434
121 104 DNA Artificial sequence; Guide RNA for DD43CR1
121 gtcccttgta cttgtacgta gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgctttt tttt 104
122 3098 DNA Artificial sequence; QC783
122 ccgggtgtga tttagtataa agtgaagtaa tggtcaaaag
aaaaagtgta aaacgaagta 60cctagtaata agtaatattg aacaaaataa atggtaaagt
gtcagatata taaaataggc 120tttaataaaa ggaagaaaaa aaacaaacaa
aaaataggtt gcaatggggc agagcagagt 180catcatgaag ctagaaaggc
taccgataga taaactatag ttaattaaat acattaaaaa 240atacttggat
ctttctctta ccctgtttat attgagacct gaaacttgag agagatacac
300taatcttgcc ttgttgtttc attccctaac ttacaggact cagcgcatgt
catgtggtct 360cgttccccat ttaagtccca caccgtctaa acttattaaa
ttattaatgt ttataactag 420atgcacaaca acaaagcttg tcccttgtac
ttgtacgtag ttttagagct agaaatagca 480agttaaaata aggctagtcc
gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt 540tttgcggccg
ctcgaggggg ggcccggtac cggcgcgccg ttctatagtg tcacctaaat
600cgtatgtgta tgatacataa ggttatgtat taattgtagc cgcgttctaa
cgacaatatg 660tccatatggt gcactctcag tacaatctgc tctgatgccg
catagttaag ccagccccga 720cacccgccaa cacccgctga cgcgccctga
cgggcttgtc tgctcccggc atccgcttac 780agacaagctg tgaccgtctc
cgggagctgc atgtgtcaga ggttttcacc gtcatcaccg 840aaacgcgcga
gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgacc
900aaaatccctt aacgtgagtt ttcgttccac tgagcgtcag accccgtaga
aaagatcaaa 960ggatcttctt gagatccttt ttttctgcgc gtaatctgct
gcttgcaaac aaaaaaacca 1020ccgctaccag cggtggtttg tttgccggat
caagagctac caactctttt tccgaaggta 1080actggcttca gcagagcgca
gataccaaat actgtccttc tagtgtagcc gtagttaggc 1140caccacttca
agaactctgt agcaccgcct acatacctcg ctctgctaat cctgttacca
1200gtggctgctg ccagtggcga taagtcgtgt cttaccgggt tggactcaag
acgatagtta 1260ccggataagg cgcagcggtc gggctgaacg gggggttcgt
gcacacagcc cagcttggag 1320cgaacgacct acaccgaact gagataccta
cagcgtgagc attgagaaag cgccacgctt 1380cccgaaggga gaaaggcgga
caggtatccg gtaagcggca gggtcggaac aggagagcgc 1440acgagggagc
ttccaggggg aaacgcctgg tatctttata gtcctgtcgg gtttcgccac
1500ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct
atggaaaaac 1560gccagcaacg cggccttttt acggttcctg gccttttgct
ggccttttgc tcacatgttc 1620tttcctgcgt tatcccctga ttctgtggat
aaccgtatta ccgcctttga gtgagctgat 1680accgctcgcc gcagccgaac
gaccgagcgc agcgagtcag tgagcgagga agcggaagag 1740cgcccaatac
gcaaaccgcc tctccccgcg cgttggccga ttcattaatg caggttgatc
1800agatctcgat cccgcgaaat taatacgact cactataggg agaccacaac
ggtttccctc 1860tagaaataat tttgtttaac tttaagaagg agatataccc
atggaaaagc ctgaactcac 1920cgcgacgtct gtcgagaagt ttctgatcga
aaagttcgac agcgtctccg acctgatgca 1980gctctcggag ggcgaagaat
ctcgtgcttt cagcttcgat gtaggagggc gtggatatgt 2040cctgcgggta
aatagctgcg ccgatggttt ctacaaagat cgttatgttt atcggcactt
2100tgcatcggcc gcgctcccga ttccggaagt gcttgacatt ggggaattca
gcgagagcct 2160gacctattgc atctcccgcc gtgcacaggg tgtcacgttg
caagacctgc ctgaaaccga 2220actgcccgct gttctgcagc cggtcgcgga
ggctatggat gcgatcgctg cggccgatct 2280tagccagacg agcgggttcg
gcccattcgg accgcaagga atcggtcaat acactacatg 2340gcgtgatttc
atatgcgcga ttgctgatcc ccatgtgtat cactggcaaa ctgtgatgga
2400cgacaccgtc agtgcgtccg tcgcgcaggc tctcgatgag ctgatgcttt
gggccgagga 2460ctgccccgaa gtccggcacc tcgtgcacgc ggatttcggc
tccaacaatg tcctgacgga 2520caatggccgc ataacagcgg tcattgactg
gagcgaggcg atgttcgggg attcccaata 2580cgaggtcgcc aacatcttct
tctggaggcc gtggttggct tgtatggagc agcagacgcg 2640ctacttcgag
cggaggcatc cggagcttgc aggatcgccg cggctccggg cgtatatgct
2700ccgcattggt cttgaccaac tctatcagag cttggttgac ggcaatttcg
atgatgcagc 2760ttgggcgcag ggtcgatgcg acgcaatcgt ccgatccgga
gccgggactg tcgggcgtac 2820acaaatcgcc cgcagaagcg cggccgtctg
gaccgatggc tgtgtagaag tactcgccga 2880tagtggaaac cgacgcccca
gcactcgtcc gagggcaaag gaatagtgag gtacagcttg 2940gatcgatccg
gctgctaaca aagcccgaaa ggaagctgag ttggctgctg ccaccgctga
3000gcaataacta gcataacccc ttggggcctc taaacgggtc ttgaggggtt
ttttgctgaa 3060aggaggaact atatccggat gatcgggcgc gccggtac
3098
123 9093 DNA Artificial sequence; QC815
123 ccgggtgtga tttagtataa
agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg
aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt
180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct
gaaacttgag agagatacac 300taatcttgcc ttgttgtttc attccctaac
ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca
caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg tcccttgtac ttgtacgtag ttttagagct agaaatagca
480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc
ggtgcttttt 540tttgcggccg caattggatc gggtttactt attttgtggg
tatctatact tttattagat 600ttttaatcag gctcctgatt tctttttatt
tcgattgaat tcctgaactt gtattattca 660gtagatcgaa taaattataa
aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt
780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt
attactttta 840aaaaaatcat aaaggtttag tattttttta aaataaatat
aggaatagtt ttactattca 900ctgctttaat agaaaaatag tttaaaattt
aagatagttt taatcccagc atttgccacg 960tttgaacgtg agccgaaacg
atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg
1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct
caatcccacg 1140cgagtagagc acagtaacct tcaaataagc gaatggggca
taatcagaaa tccgaaataa 1200acctaggggc attatcggaa atgaaaagta
gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt atcactctgt
gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc
1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg
ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt
cgctttgttt ttgtggttca 1500gttttttagg attcttttgg tttttgaatc
gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg aggtgaatct
tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc
1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag
gtttgtttta 1740acaggattaa aagtttttta agcatgttga aggagtcttg
tagatatgta accgtcgata 1800gtttttttgt gggtttgttc acatgttatc
aagcttaatc ttttactatg tatgcgacca 1860tatctggatc cagcaaaggc
gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta
1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta
tttctttgaa 2040ttgattctag tttaagtaat ccatggacaa aaagtactca
atagggctcg acatagggac 2100taactccgtt ggatgggccg tcatcaccga
cgagtacaag gtgccctcca agaagttcaa 2160ggtgttggga aacaccgaca
ggcacagcat aaagaagaat ttgatcggtg ccctcctctt 2220cgactccgga
gagaccgctg aggctaccag gctcaagagg accgctagaa ggcgctacac
2280cagaaggaag aacagaatct gctacctgca ggagatcttc tccaacgaga
tggccaaggt 2340ggacgactcc ttcttccacc gccttgagga atcattcctg
gtggaggagg ataaaaagca 2400cgagagacac ccaatcttcg ggaacatcgt
cgacgaggtg gcctaccatg aaaagtaccc 2460taccatctac cacctgagga
agaagctggt cgactctacc gacaaggctg acttgcgctt 2520gatttacctg
gctctcgctc acatgataaa gttccgcgga cacttcctca ttgagggaga
2580cctgaaccca gacaactccg acgtggacaa gctcttcatc cagctcgttc
agacctacaa 2640ccagcttttc gaggagaacc caatcaacgc cagtggagtt
gacgccaagg ctatcctctc 2700tgctcgtctg tcaaagtcca ggaggcttga
gaacttgatt gcccagctgc ctggcgaaaa 2760gaagaacgga ctgttcggaa
acttgatcgc tctctccctg ggattgactc ccaacttcaa 2820gtccaacttc
gacctcgccg aggacgctaa gttgcagttg tctaaagaca cctacgacga
2880tgacctcgac aacttgctgg cccagatagg cgaccaatac gccgatctct
tcctcgccgc 2940taagaacttg tccgacgcaa tcctgctgtc cgacatcctg
agagtcaaca ctgagattac 3000caaagctcct ctgtctgctt ccatgattaa
gcgctacgac gagcaccacc aagatctgac 3060cctgctcaag gccctggtga
gacagcagct gcccgagaag tacaaggaga tctttttcga 3120ccagtccaag
aacggctacg ccggatacat tgacggaggc gcctcccagg aagagttcta
3180caagttcatc aagcccatcc ttgagaagat ggacggtacc gaggagctgt
tggtgaagtt 3240gaacagagag gacctgttga ggaagcagag aaccttcgac
aacggaagca tccctcacca 3300aatccacctg ggagagctcc acgccatctt
gaggaggcag gaggatttct atcccttcct 3360gaaggacaac cgcgagaaga
ttgagaagat cttgaccttc agaattcctt actacgtcgg 3420gccactcgcc
agaggaaact ctaggttcgc ctggatgacc cgcaaatctg aagagaccat
3480tactccctgg aacttcgagg aagtcgtgga caagggcgct tccgctcagt
ctttcatcga 3540gaggatgacc aacttcgata aaaatctgcc caacgagaag
gtgctgccca agcactccct 3600gttgtacgag tatttcacag tgtacaacga
gctcaccaag gtgaagtacg tcacagaggg 3660aatgaggaag cctgccttct
tgtccggaga gcagaagaag gccatcgtcg acctgctctt 3720caagaccaac
aggaaggtga ctgtcaagca gctgaaggag gactacttca agaagatcga
3780gtgcttcgac tccgtcgaga tctctggtgt cgaggacagg ttcaacgcct
cccttgggac 3840ttaccacgat ctgctcaaga ttattaaaga caaggacttc
ctggacaacg aggagaacga 3900ggacatcctt gaggacatcg tgctcaccct
gaccttgttc gaagacaggg aaatgatcga 3960agagaggctc aagacctacg
cccacctctt cgacgacaag gtgatgaaac agctgaagag 4020acgcagatat
accggctggg gaaggctctc ccgcaaattg atcaacggga tcagggacaa
4080gcagtcaggg aagactatac tcgacttcct gaagtccgac ggattcgcca
acaggaactt 4140catgcagctc attcacgacg actccttgac cttcaaggag
gacatccaga aggctcaggt 4200gtctggacag ggtgactcct tgcatgagca
cattgctaac ttggccggct ctcccgctat 4260taagaagggc attttgcaga
ccgtgaaggt cgttgacgag ctcgtgaagg tgatgggacg 4320ccacaagcca
gagaacatcg ttattgagat ggctcgcgag aaccaaacta cccagaaagg
4380gcagaagaat tcccgcgaga ggatgaagcg cattgaggag ggcataaaag
agcttggctc 4440tcagatcctc aaggagcacc ccgtcgagaa cactcagctg
cagaacgaga agctgtacct 4500gtactacctc caaaacggaa gggacatgta
cgtggaccag gagctggaca tcaacaggtt 4560gtccgactac gacgtcgacc
acatcgtgcc tcagtccttc ctgaaggatg actccatcga 4620caataaagtg
ctgacacgct ccgataaaaa tagaggcaag tccgacaacg tcccctccga
4680ggaggtcgtg aagaagatga aaaactactg gagacagctc ttgaacgcca
agctcatcac 4740ccagcgtaag ttcgacaacc tgactaaggc tgagagagga
ggattgtccg agctcgataa 4800ggccggattc atcaagagac agctcgtcga
aacccgccaa attaccaagc acgtggccca 4860aattctggat tcccgcatga
acaccaagta cgatgaaaat gacaagctga tccgcgaggt 4920caaggtgatc
accttgaagt ccaagctggt ctccgacttc cgcaaggact tccagttcta
4980caaggtgagg gagatcaaca actaccacca cgcacacgac gcctacctca
acgctgtcgt 5040tggaaccgcc ctcatcaaaa aatatcctaa gctggagtct
gagttcgtct acggcgacta 5100caaggtgtac gacgtgagga agatgatcgc
taagtctgag caggagatcg gcaaggccac 5160cgccaagtac ttcttctact
ccaacatcat gaacttcttc aagaccgaga tcactctcgc 5220caacggtgag
atcaggaagc gcccactgat cgagaccaac ggtgagactg gagagatcgt
5280gtgggacaaa gggagggatt tcgctactgt gaggaaggtg ctctccatgc
ctcaggtgaa 5340catcgtcaag aagaccgaag ttcagaccgg aggattctcc
aaggagtcca tcctccccaa 5400gagaaactcc gacaagctga tcgctagaaa
gaaagactgg gaccctaaga agtacggagg 5460cttcgattct cctaccgtgg
cctactctgt gctggtcgtg gccaaggtgg agaagggcaa 5520gtccaagaag
ctgaaatccg tcaaggagct cctcgggatt accatcatgg agaggagttc
5580cttcgagaag aaccctatcg acttcctgga ggccaaggga tataaagagg
tgaagaagga 5640cctcatcatc aagctgccca agtactccct cttcgagttg
gagaacggaa ggaagaggat 5700gctggcttct gccggagagt tgcagaaggg
aaatgagctc gcccttccct ccaagtacgt 5760gaacttcctg tacctcgcct
ctcactatga aaagttgaag ggctctcctg aggacaacga 5820gcagaagcag
ctcttcgtgg agcagcacaa gcactacctg gacgaaatta tcgagcagat
5880ctctgagttc tccaagcgcg tgatattggc cgacgccaac ctcgacaagg
tgctgtccgc 5940ctacaacaag cacagggata agcccattcg cgagcaggct
gaaaacatta tccacctgtt 6000taccctcaca aacttgggag cccctgctgc
cttcaagtac ttcgacacca ccattgacag 6060gaagagatac acctccacca
aggaggtgct cgacgcaaca ctcatccacc aatccatcac 6120cggcctctat
gaaacaagga ttgacttgtc ccagctggga ggcgactcta gagccgatcc
6180caagaagaag agaaaggtgt aggttaacct agacttgtcc atcttctgga
ttggccaact 6240taattaatgt atgaaataaa aggatgcaca catagtgaca
tgctaatcac tataatgtgg 6300gcatcaaagt tgtgtgttat gtgtaattac
tagttatctg aataaaagag aaagagatca 6360tccatatttc ttatcctaaa
tgaatgtcac gtgtctttat aattctttga tgaaccagat 6420gcatttcatt
aaccaaatcc atatacatat aaatattaat catatataat taatatcaat
6480tgggttagca aaacaaatct agtctaggtg tgttttgcga attcgatatc
aagcttatcg 6540ataccgtcga gggggggccc ggtaccggcg cgccgttcta
tagtgtcacc taaatcgtat 6600gtgtatgata cataaggtta tgtattaatt
gtagccgcgt tctaacgaca atatgtccat 6660atggtgcact ctcagtacaa
tctgctctga tgccgcatag ttaagccagc cccgacaccc 6720gccaacaccc
gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca
6780agctgtgacc gtctccggga gctgcatgtg tcagaggttt tcaccgtcat
caccgaaacg 6840cgcgagacga aagggcctcg tgatacgcct atttttatag
gttaatgtca tgaccaaaat 6900cccttaacgt gagttttcgt tccactgagc
gtcagacccc gtagaaaaga tcaaaggatc 6960ttcttgagat cctttttttc
tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct 7020accagcggtg
gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg
7080cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt
taggccacca 7140cttcaagaac tctgtagcac cgcctacata cctcgctctg
ctaatcctgt taccagtggc 7200tgctgccagt ggcgataagt cgtgtcttac
cgggttggac tcaagacgat agttaccgga 7260taaggcgcag cggtcgggct
gaacgggggg ttcgtgcaca cagcccagct tggagcgaac 7320gacctacacc
gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga
7380agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag
agcgcacgag 7440ggagcttcca gggggaaacg cctggtatct ttatagtcct
gtcgggtttc gccacctctg 7500acttgagcgt cgatttttgt gatgctcgtc
aggggggcgg agcctatgga aaaacgccag 7560caacgcggcc tttttacggt
tcctggcctt ttgctggcct tttgctcaca tgttctttcc 7620tgcgttatcc
cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc
7680tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg
aagagcgccc 7740aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat
taatgcaggt tgatcagatc 7800tcgatcccgc gaaattaata cgactcacta
tagggagacc acaacggttt ccctctagaa 7860ataattttgt ttaactttaa
gaaggagata tacccatgga aaagcctgaa ctcaccgcga 7920cgtctgtcga
gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct
7980cggagggcga agaatctcgt gctttcagct tcgatgtagg agggcgtgga
tatgtcctgc 8040gggtaaatag ctgcgccgat ggtttctaca aagatcgtta
tgtttatcgg cactttgcat 8100cggccgcgct cccgattccg gaagtgcttg
acattgggga attcagcgag agcctgacct 8160attgcatctc ccgccgtgca
cagggtgtca cgttgcaaga cctgcctgaa accgaactgc 8220ccgctgttct
gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc
8280agacgagcgg gttcggccca ttcggaccgc aaggaatcgg tcaatacact
acatggcgtg 8340atttcatatg cgcgattgct gatccccatg tgtatcactg
gcaaactgtg atggacgaca 8400ccgtcagtgc gtccgtcgcg caggctctcg
atgagctgat gctttgggcc gaggactgcc 8460ccgaagtccg gcacctcgtg
cacgcggatt tcggctccaa caatgtcctg acggacaatg 8520gccgcataac
agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg
8580tcgccaacat cttcttctgg aggccgtggt tggcttgtat ggagcagcag
acgcgctact 8640tcgagcggag gcatccggag cttgcaggat cgccgcggct
ccgggcgtat atgctccgca 8700ttggtcttga ccaactctat cagagcttgg
ttgacggcaa tttcgatgat gcagcttggg 8760cgcagggtcg atgcgacgca
atcgtccgat ccggagccgg gactgtcggg cgtacacaaa 8820tcgcccgcag
aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg
8880gaaaccgacg ccccagcact cgtccgaggg caaaggaata gtgaggtaca
gcttggatcg 8940atccggctgc taacaaagcc cgaaaggaag ctgagttggc
tgctgccacc gctgagcaat 9000aactagcata accccttggg gcctctaaac
gggtcttgag gggttttttg ctgaaaggag 9060gaactatatc cggatgatcg
ggcgcgccgg tac 9093
124 4107 DNA S. pyogenes 124
atggataaga aatactcaat
aggcttagat atcggcacaa atagcgtcgg atgggcggtg 60atcactgatg attataaggt
tccgtctaaa aagctcaagg gtctgggaaa tacagaccgc 120cacggtatca
aaaaaaatct tataggggct cttttatttg acagtggaga gacagcggaa
180gcgactcgtc tcaaacggac agctcgtaga aggtatacac gtcggaagaa
tcgtatttgt 240tatctacagg agattttttc aaatgagatg gcgaaagtag
atgatagttt ctttcatcga 300cttgaagagt cttttttggt ggaagaagac
aagaagcatg aacgtcatcc tatttttgga 360aatatagtag atgaagttgc
ttatcatgag aaatatccaa ctatctatca tctgcgaaaa 420aaattggcag
attctactga taaagtggat ttgcgcttaa tctatttggc cttagcgcat
480atgattaagt ttcgtggtca ttttttgatt gagggagatt taaatcctga
taatagtgat 540gtggacaaac tatttatcca gttggtacaa acctacaatc
aattatttga agaaaaccct 600attaacgcaa gtagagtaga tgctaaagcg
attctttctg cacgattgag taaatcaaga 660cgattagaaa atctcattgc
tcagctcccc ggtgagaaga aaaatggatt gtttgggaat 720ctcattgctt
tgtcattggg attgacccct aattttaaat caaattttga tttggcagaa
780gatgctaaat tacagctttc aaaagatact tacgatgatg atttagataa
tttattggcg 840caaattggag atcaatatgc tgatttgttt ttggcagcta
agaatttatc agatgctact 900ttactttcag atatcctaag agtaaatagt
gaaataacta aggctcccct atcagcttca 960atgattaagc gctacgatga
acatcatcaa gacttgactc ttttaaaagc tttagttcga 1020caacaacttc
cagaaaagta taaagaaatc ttttttgatc aatcaaaaaa cggatatgca
1080ggttatattg atgggggagc tagccaagaa gaattttata aatttatcaa
accaatttta 1140gaaaaaatgg atggtactga ggaattattg gcgaaactaa
atcgtgaaga tttgctgcgc 1200aagcaacgga cctttgacaa cggctctatt
ccctatcaaa ttcacttggg tgagctgcat 1260gctattttga gaagacaaga
agacttttat ccatttttaa aagacaatcg tgagaagatt 1320gaaaaaatct
tgacttttcg aattccttat tatgttggtc cattggcgcg tggcaatagt
1380cgttttgcat ggatgactcg gaagtctgaa gaaacaatta ccccatggaa
ttttgaagaa 1440gttgtcgata aaggtgcttc agctcaatca tttattgaac
gcatgacaaa ctttgataaa 1500aatcttccaa atgaaaaagt actaccaaaa
catagtttgc tttatgagta ttttacggtt 1560tataacgaat tgacaaaagt
caaatatgtt actgagggaa tgcgaaaacc agcatttctt 1620tcaggtgaac
agaagaaagc cattgttgat ttactcttca aaacaaatcg aaaagtaacc
1680gttaagcaat taaaagaaga ttatttcaaa aaaatagaat gttttgatag
tgttgaaatt 1740tcaggagttg aagatagatt taatgcttca ttaggtacct
accatgattt gctaaaaatt 1800attaaagata aagatttttt ggataatgaa
gaaaacgaag atatcttaga ggatattgtt 1860ttaacattga ccttatttga
agatagggag atgattgagg aaagacttaa aacatatgct 1920cacctctttg
atgataaggt gatgaaacag cttaaacgtc gccgttatac tggttgggga
1980cgtttgtctc gaaaattgat taatggtatt agggataagc aatctggcaa
aacaatatta 2040gattttttga aatcagatgg ttttgccaat cgcaatttta
tgcagctgat ccatgatgat 2100agtttgacat ttaaagaaga cattcaaaaa
gcacaagtgt ctggacaagg cgatagttta 2160catgaacata ttgcaaattt
agctggtagc cctgctatta aaaaaggtat tttacagact 2220gtaaaagttg
ttgatgaatt ggtcaaagta atggggcggc ataagccaga aaatatcgtt
2280attgaaatgg cacgtgaaaa tcagacaact caaaagggcc agaaaaattc
gcgtgagcgt 2340atgaaacgta ttgaagaagg aataaaagaa ctaggaagtg
atattctaaa ggagtatcct 2400gttgaaaaca ctcaattaca aaatgaaaag
ctctatctct attatctcca aaatggaaga 2460gacatgtatg tggaccaaga
attagatatt aatcgtttaa gtgattatga tgtcgatcac 2520attgttccac
aaagtttcct taaagacgat tcaatagaca ataaggtctt aacgcgttct
2580gataaaaatc gtggtaaatc ggataacgtt ccaagtgaag aagtagtcaa
aaagatgaaa 2640aactattgga gacaacttct aaacgccaag ttaatcactc
aacgtaagtt tgataattta 2700acaaaagctg aacgtggagg tttgagtgaa
cttgataaag ttggttttat caaacgccaa 2760ttggttgaaa ctcgccaaat
cactaagcat gtggcacaaa ttttggatag tcgcatgaat 2820actaaatacg
atgaaaatga taaacttatt cgagaggtta gagtgattac cttaaaatct
2880aaattagttt ctgacttccg aaaagatttc caattctata aagtacgtga
gattaacaat 2940taccatcatg cccatgatgc gtatcttaat gccgtcgttg
gaactgcttt gattaagaaa 3000tatccaaaac ttgaatcgga gtttgtctat
ggtgattata aagtttatga tgttcgtaaa 3060atgattgcta agtctgagca
ggaaataggc aaagcaaccg caaaatattt cttttactct 3120aatatcatga
acttcttcaa aacagaaatt acacttgcaa atggagagat tcgcaaacgc
3180cctctaatcg aaactaatgg ggaaactgga gaaattgtct gggataaagg
gcgagatttt 3240gccacagtgc gcaaagtatt gtccatgccc caagtcaata
ttgtcaagaa aacagaagta 3300cagacaggcg gattctccaa ggagtcaatt
ttaccaaaaa gaaattcgga caagcttatt 3360gctcgtaaaa aagactggga
tccaaaaaaa tatggtggtt ttgatagtcc aacggtagct 3420tattcagtcc
tagtggttgc taaggtggaa aaagggaaat cgaagaagtt aaaatccgtt
3480aaagagttac tagggatcac aataatggaa agaagctctt ttgaaaaaga
tccgattgac 3540tttttagaag ctaaaggata taaggaagtt agaaaagact
taatcattaa actacctaaa 3600tatagtcttt ttgagttaga aaacggtcgt
aaacggatgc tggctagtgc cggagaattg 3660caaaaaggaa atgagctagc
tctgccaagc aaatatgtga attttttata tttagctagt 3720cattatgaaa
agttgaaggg tagtccagaa gataacgaac aaaaacaatt gtttgtggag
3780cagcataagc attatttaga tgagattatt gagcaaatca gtgaattttc
taagcgtgtt 3840attttagcag atgccaattt agataaagtt cttagtgcat
ataacaaaca tagagacaaa 3900ccaatacgtg aacaagcaga aaatattatt
catttattta cgttgacgaa tcttggagct 3960cccgctgctt ttaaatattt
tgatacaaca attgatcgta aacgatatac gtctacaaaa 4020gaagttttag
atgccactct tatccatcaa tccatcactg gtctttatga aacacgcatt
4080gatttgagtc agctaggagg tgactga 410712520DNAGlycine max
125ggaactgaca cacgacatga 2012620DNAGlycine max 126gacatgatgg
aacgtgacta 2012720DNAGlycine max 127gtcccttgta cttgtacgta
2012820DNAGlycine max 128gtattctaga aaagaggaat 2012970DNAGlycine
max 129atcaaaattc ggaactgaca cacgacatga tggaacgtga ctaaggtggg
tttttgactt 60tgcatgtcga 7013070DNAGlycine max 130tcgacatgca
aagtcaaaaa cccaccttag tcacgttcca tcatgtcgtg tgtcagttcc 60gaattttgat
7013170DNAGlycine max 131ggcagactcc aattcctctt ttctagaata
ccctccgtac gtacaagtac aagggacttg 60tgagttgtaa 7013270DNAGlycine max
132ttacaactca caagtccctt gtacttgtac gtacggaggg tattctagaa
aagaggaatt 60ggagtctgcc 7013371DNAGlycine max 133ctacactctt
tccctacacg acgctcttcc gatctggaat ttacagcaca agtagatcac 60ttgtacttat
c 7113459DNAGlycine max 134caagcagaag acggcatacg agctcttccg
atctaaatca ctctcacttc gacatgcaa 5913571DNAGlycine max 135ctacactctt
tccctacacg acgctcttcc gatctttcct ttacagcaca agtagatcac 60ttgtacttat
c 7113668DNAGlycine max 136ctacactctt tccctacacg acgctcttcc
gatctagctg taaatacagc cttacaactc 60acaagtcc 6813763DNAArtificial
sequencePrimer, DD43-A 137caagcagaag acggcatacg agctcttccg
atctttaatt taggactaaa agaagaggca 60gac 6313868DNAArtificial
sequencePrimer, DD43-S4 138ctacactctt tccctacacg acgctcttcc
gatctctagg taaatacagc cttacaactc 60acaagtcc 6813968DNAArtificial
sequencePrimer, DD43-S5 139ctacactctt tccctacacg acgctcttcc
gatctgatcg taaatacagc cttacaactc 60acaagtcc 6814043DNAArtificial
sequencePrimer, JKY557 140aatgatacgg cgaccaccga gatctacact
ctttccctac acg 4314118DNAArtificial sequenceprimer, JKY558
141caagcagaag acggcata 18142117DNAArtificial sequenceDD20CR1 PCR
amplicon 142ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgaca 60tgatggaacg tgactaaggt gggtttttga ctttgcatgt cgaagtgaga
gtgattt 117143117DNAArtificial sequenceDD20CR2 PCR amplicon
143ttcctttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgaca 60tgatggaacg tgactaaggt gggtttttga ctttgcatgt cgaagtgaga
gtgattt 117144108DNAArtificial sequenceDD43CR1 PCR amplicon
144agctgtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaaga ggaattggag tctgcctctt cttttagtcc taaattaa
108145108DNAArtificial sequenceDD43CR2 PCR amplicon 145ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaga
ggaattggag tctgcctctt cttttagtcc taaattaa 108146108DNAartificial
sequenceamplicon 146ctaggtaaat acagccttac aactcacaag tcccttgtac
ttgtacgtac ggagggtatt 60ctagaaaaga ggaattggag tctgcctctt cttttagtcc
taaattaa 108147101DNAArtificial sequenceDD20CR1 mutant target site
147ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgatg 60atggaacgtg actaaggtgg gtttttgact ttgcatgtcg a
101148101DNAArtificial sequenceDD20CR1 mutant target site
148ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgatg 60gaacgtgact aaggtgggtt tttgactttg catgtcgaag t
101149101DNAArtificial sequenceDD20CR1 mutant target site
149ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgact 60gatggaacgt gactaaggtg ggtttttgac tttgcatgtc g
101150101DNAArtificial sequenceDD20CR1 mutant target site
150ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacatgg 60aacgtgacta aggtgggttt ttgactttgc atgtcgaagt g
101151101DNAArtificial sequenceDD20CR1 mutant target site
151ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacatgatg 60gaacgtgact aaggtgggtt tttgactttg catgtcgaag t
101152101DNAArtificial sequenceDD20CR1 mutant target site
152ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacagacat 60gatggaacgt gactaaggtg ggtttttgac tttgcatgtc g
101153101DNAArtificial sequenceDD20CR1 mutant target site
153ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacgacatg 60atggaacgtg actaaggtgg gtttttgact ttgcatgtcg a
101154101DNAArtificial sequenceDD20CR1 mutant target site i
154ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacaagaaa 60tgatggaacg tgactaaggt gggtttttga ctttgcatgt c
101155101DNAArtificial sequenceDD20CR1 mutant target site
155ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgatt 60gaacgtgact aaggtgggtt tttgactttg catgtcgaag t
101156101DNAArtificial sequenceDD20CR1 mutant target site
156ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacattg 60aacgtgacta aggtgggttt ttgactttgc atgtcgaagt g
101157101DNAArtificial sequenceDD20CR2 mutant target site
157ttcctttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgaca 60tgatggaacg tctaaggtgg gtttttgact ttgcatgtcg a
101158101DNAArtificial sequenceDD20CR2 mutant target site
158ttcctttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgaca 60tgatggaacc taaggtgggt ttttgacttt gcatgtcgaa g
101159101DNAArtificial sequenceDD20CR2 mutant target site
159ttcctttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgaca 60tgatggaacg tgactaggtg ggtttttgac tttgcatgtc g
101160101DNAArtificial sequenceDD20CR2 mutant target site
160ttcctttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgaca 60tgatggaact aaggtgggtt tttgactttg catgtcgaag t
101161101DNAArtificial sequenceDD20CR2 mutant target site
161ttcctttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgaca 60tgatggaacg aaggtgggtt tttgactttg catgtcgaag t 101162101DNAArtificial
sequenceDD20CR2 mutant target site 162ttcctttaca gcacaagtag
atcacttgta cttatcaaaa ttcggaactg acacacgaca 60tgatggaagg tgggtttttg
actttgcatg tcgaagtgag a 101163101DNAArtificial sequenceDD20CR2
mutant target site i 163ttcctttaca gcacaagtag atcacttgta cttatcaaaa
ttcggaactg acacacgaca 60tgatggacgt gactaaggtg ggtttttgac tttgcatgtc
g 101164101DNAArtificial sequenceDD20CR2 mutant target site
164ttcctttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgaca 60tgatggaact ttactaaggt gggtttttga ctttgcatgt c
101165101DNAArtificial sequenceDD20CR2 mutant target site
165ttcctttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgaca 60tgatggaacg tgacaaggtg ggtttttgac tttgcatgtc g
101166101DNAArtificial sequenceDD20CR2 mutant target site
166ttcctttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacactaca 60ttatttaact ttactaaggt gggtttttga ctttgcatgt c
101167108DNAArtificial sequenceDD43CR1 mutant target site
167agctgtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaaga ggaattggag tctgcctctt cttttagtcc taaattaa
108168101DNAArtificial sequenceDD43CR1 mutant target site
168agctgtaaat acagccttac aactcacaag tcccttgtac ggagggtatt
ctagaaaaga 60ggaattggag tctgcctctt cttttagtcc taaattaaag a
101169101DNAArtificial sequenceDD43CR1 mutant target site
169agctgtaaat acagccttac aactcacaag tcccttgtac ttgtacggag
ggtattctag 60aaaagaggaa ttggagtctg cctcttcttt tagtcctaaa t
101170101DNAArtificial sequenceDD43CR1 mutant target site
170agctgtaaat acagccttac aactcacaag tcccttacgg agggtattct
agaaaagagg 60aattggagtc tgcctcttct tttagtccta aattaaagat c
101171101DNAArtificial sequenceDD43CR1 mutant target site
171agctgtaaat acagccttac aactcacaag tcccttgtac ttgtaccgta
cggagggtat 60tctagaaaag aggaattgga gtctgcctct tcttttagtc c
101172101DNAArtificial sequenceDD43CR1 mutant target site
172agctgtaaat acagccttac aactcacaag tcccttgtac tgtacggagg
gtattctaga 60aaagaggaat tggagtctgc ctcttctttt agtcctaaat t
101173101DNAArtificial sequenceDD43CR1 mutant target site
173agctgtaaat acagccttac aactcacaag tcccttgtag tacggagggt
attctagaaa 60agaggaattg gagtctgcct cttcttttag tcctaaatta a
101174101DNAArtificial sequenceDD43CR1 mutant target site
174agctgtaaat acagccttac aactcacaag tcccttgtac ttgtacgtag
ggtattctag 60aaaagaggaa ttggagtctg cctcttcttt tagtcctaaa t
101175100DNAArtificial sequenceDD43CR1 mutant target site
175agctgtaaat acagccttac aactcacaag tcctacactc tttccctaca
cgacgctctt 60cttttagtcc taaattaaag atcggaagat ctcgtatgcc
100176101DNAArtificial sequenceDD43CR1 mutant target site
176agctgtaaat acagccttac aactcacaag tcccttgtac ttgtacctta
cggagggtat 60tctagaaaag aggaattgga gtctgcctct tcttttagtc c
101177101DNAArtificial sequenceDD43CR2 mutant target site
177ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaatt ggagtctgcc tcttctttta gtcctaaatt a
101178101DNAArtificial sequenceDD43CR2 mutant target site
178ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaaga attggagtct gcctcttctt ttagtcctaa a
101179101DNAArtificial sequenceDD43CR2 mutant target site
179ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaattgg agtctgcctc ttcttttagt cctaaattaa a
101180101DNAArtificial sequenceDD43CR2 mutant target site
180ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaaga aattggagtc tgcctcttct tttagtccta a
101181101DNAArtificial sequenceDD43CR2 mutant target site
181ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaaat tggagtctgc ctcttctttt agtcctaaat t
101182101DNAArtificial sequenceDD43CR2 mutant target site
182ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaaga ggattggagt ctgcctcttc ttttagtcct a
101183101DNAArtificial sequenceDD43CR2 mutant target site
183ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaattg gagtctgcct cttcttttag tcctaaatta a
101184101DNAArtificial sequenceDD43CR2 mutant target site
184ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctattggagt ctgcctcttc ttttagtcct aaattaaaga t
101185101DNAArtificial sequenceDD43CR2 mutant target site
185ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagtctgcc tcttctttta gtcctaaatt aaagatcgga a
101186101DNAArtificial sequenceDD43CR2 mutant target site
186ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaagt ctgcctcttc ttttagtcct aaattaaaga t
101187101DNAArtificial sequenceDD43CR2 mutant target site
187ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaaga gaattggagt ctgcctcttc ttttagtcct a
101188101DNAArtificial sequenceDD43CR2 mutant target site
188ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaaga ggagtctgcc tcttctttta gtcctaaatt a
101189101DNAArtificial sequenceDD43CR2 mutant target site
189ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctaattggag tctgcctctt cttttagtcc taaattaaag a
101190101DNAArtificial sequenceDD43CR2 mutant target site
190ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaaaga ggaaattgga gtctgcctct tcttttagtc c
101191101DNAArtificial sequenceDD43CR2 mutant target site
191ctaggtaaat acagccttac aactcacaag tcccttgtac ttgtacgtac
ggagggtatt 60ctagaaagag gaattggagt ctgcctcttc ttttagtcct a
101
192 1377 PRT Artificial sequence maize optimized moCAS9 endonuclease 192
Met Ala Pro Lys Lys Lys Arg Lys Val Met Asp Lys Lys Tyr Ser Ile
1 5 10 15 Gly Leu Asp Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile
Thr Asp 20 25 30 Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu
Gly Asn Thr Asp 35 40 45 Arg His Ser Ile Lys Lys Asn Leu Ile Gly
Ala Leu Leu Phe Asp Ser 50 55 60 Gly Glu Thr Ala Glu Ala Thr Arg
Leu Lys Arg Thr Ala Arg Arg Arg 65 70 75 80 Tyr Thr Arg Arg Lys Asn
Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser 85 90 95 Asn Glu Met Ala
Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu 100 105 110 Ser Phe
Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe 115 120 125
Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile 130
135 140 Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp
Leu 145 150 155 160 Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys
Phe Arg Gly His 165 170 175 Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
Asn Ser Asp Val Asp Lys 180 185 190 Leu Phe Ile Gln Leu Val Gln Thr
Tyr Asn Gln Leu Phe Glu Glu Asn 195 200 205 Pro Ile Asn Ala Ser Gly
Val Asp Ala Lys Ala Ile Leu Ser Ala Arg 210 215 220 Leu Ser Lys Ser
Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly 225 230 235 240 Glu
Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly 245 250
255 Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys
260 265 270 Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn
Leu Leu 275 280 285 Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu
Ala Ala Lys Asn 290 295 300 Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
Leu Arg Val Asn Thr Glu 305 310 315 320 Ile Thr Lys Ala Pro Leu Ser
Ala Ser Met Ile Lys Arg Tyr Asp Glu 325 330 335 His His Gln Asp Leu
Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu 340 345 350 Pro Glu Lys
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr 355 360 365 Ala
Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe 370 375
380 Ile Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val
385 390 395 400 Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr
Phe Asp Asn 405 410 415 Gly Ser Ile Pro His Gln Ile His Leu Gly Glu
Leu His Ala Ile Leu 420 425 430 Arg Arg Gln Glu Asp Phe Tyr Pro Phe
Leu Lys Asp Asn Arg Glu Lys 435 440 445 Ile Glu Lys Ile Leu Thr Phe
Arg Ile Pro Tyr Tyr Val Gly Pro Leu 450 455 460 Ala Arg Gly Asn Ser
Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu 465 470 475 480 Thr Ile
Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser 485 490 495
Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro 500
505 510 Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe
Thr 515 520 525 Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu
Gly Met Arg 530 535 540 Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys
Ala Ile Val Asp Leu 545 550 555 560 Leu Phe Lys Thr Asn Arg Lys Val
Thr Val Lys Gln Leu Lys Glu Asp 565 570 575 Tyr Phe Lys Lys Ile Glu
Cys Phe Asp Ser Val Glu Ile Ser Gly Val 580 585 590 Glu Asp Arg Phe
Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys 595 600 605 Ile Ile
Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile 610 615 620
Leu Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met 625
630 635 640 Ile Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp
Lys Val 645 650 655 Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp
Gly Arg Leu Ser 660 665 670 Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
Gln Ser Gly Lys Thr Ile 675 680 685 Leu Asp Phe Leu Lys Ser Asp Gly
Phe Ala Asn Arg Asn Phe Met Gln 690 695 700 Leu Ile His Asp Asp Ser
Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala 705 710 715 720 Gln Val Ser
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu 725 730 735 Ala
Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val 740 745
750 Val Asp Glu Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile
755 760 765 Val Ile Glu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly
Gln Lys 770 775 780 Asn Ser Arg Glu Arg Met Lys Arg Ile Glu Glu Gly
Ile Lys Glu Leu 785 790 795 800 Gly Ser Gln Ile Leu Lys Glu His Pro
Val Glu Asn Thr Gln Leu Gln 805 810 815 Asn Glu Lys Leu Tyr Leu Tyr
Tyr Leu Gln Asn Gly Arg Asp Met Tyr 820 825 830 Val Asp Gln Glu Leu
Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp 835 840 845 His Ile Val
Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys 850 855 860 Val
Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro 865 870
875 880 Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu
Leu 885 890 895 Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu
Thr Lys Ala 900 905 910 Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala
Gly Phe Ile Lys Arg 915 920 925 Gln Leu Val Glu Thr Arg Gln Ile Thr
Lys His Val Ala Gln Ile Leu 930 935 940 Asp Ser Arg Met Asn Thr Lys
Tyr Asp Glu Asn Asp Lys Leu Ile Arg 945 950 955 960 Glu Val Lys Val
Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg 965 970 975 Lys Asp
Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His 980 985 990
Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys 995
1000 1005 Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr
Lys 1010 1015 1020 Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser Glu
Gln Glu Ile 1025 1030 1035 Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
Ser Asn Ile Met Asn 1040 1045 1050 Phe Phe Lys Thr Glu Ile Thr Leu
Ala Asn Gly Glu Ile Arg Lys 1055 1060 1065 Arg Pro Leu Ile Glu Thr
Asn Gly Glu Thr Gly Glu Ile Val Trp 1070 1075 1080 Asp Lys Gly Arg
Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met 1085 1090 1095 Pro Gln
Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly 1100 1105 1110
Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu 1115
1120 1125 Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly
Phe 1130 1135 1140 Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val
Ala Lys Val 1145 1150 1155 Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
Val Lys Glu Leu Leu 1160 1165 1170 Gly Ile Thr Ile Met Glu Arg Ser
Ser Phe Glu Lys Asn Pro Ile 1175 1180 1185 Asp Phe Leu Glu Ala Lys
Gly Tyr Lys Glu Val Lys Lys Asp Leu 1190 1195 1200 Ile Ile Lys Leu
Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly 1205 1210 1215 Arg Lys
Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn 1220 1225 1230
Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala 1235
1240 1245 Ser His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu
Gln 1250 1255 1260 Lys Gln Leu Phe Val Glu Gln His Lys His Tyr Leu
Asp Glu Ile 1265 1270 1275 Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
Val Ile Leu Ala Asp 1280 1285 1290 Ala Asn Leu Asp Lys Val Leu Ser
Ala Tyr Asn Lys His Arg Asp 1295 1300 1305 Lys Pro Ile Arg Glu Gln
Ala Glu Asn Ile Ile His Leu Phe Thr 1310 1315 1320 Leu Thr Asn Leu
Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr 1325 1330 1335 Thr Ile
Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp 1340 1345 1350
Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg 1355
1360 1365 Ile Asp Leu Ser Gln Leu Gly Gly Asp 1370 1375
193 6677 DNA Artificial sequence maize optimized moCAS9 endonuclease 193
ctgcagtgca gcgtgacccg gtcgtgcccc tctctagaga taatgagcat
tgcatgtcta 60agttataaaa aattaccaca tatttttttt gtcacacttg tttgaagtgc
agtttatcta 120tctttataca tatatttaaa ctttactcta cgaataatat
aatctatagt actacaataa 180tatcagtgtt ttagagaatc atataaatga
acagttagac atggtctaaa ggacaattga 240gtattttgac aacaggactc
tacagtttta tctttttagt gtgcatgtgt tctccttttt 300ttttgcaaat
agcttcacct atataatact tcatccattt tattagtaca tccatttagg
360gtttagggtt aatggttttt atagactaat ttttttagta catctatttt
attctatttt 420agcctctaaa ttaagaaaac taaaactcta ttttagtttt
tttatttaat aatttagata 480taaaatagaa taaaataaag tgactaaaaa
ttaaacaaat accctttaag aaattaaaaa 540aactaaggaa acatttttct
tgtttcgagt agataatgcc agcctgttaa acgccgtcga 600cgagtctaac
ggacaccaac cagcgaacca gcagcgtcgc gtcgggccaa gcgaagcaga
660cggcacggca tctctgtcgc tgcctctgga cccctctcga gagttccgct
ccaccgttgg 720acttgctccg ctgtcggcat ccagaaattg cgtggcggag
cggcagacgt gagccggcac 780ggcaggcggc ctcctcctcc tctcacggca
cggcagctac gggggattcc tttcccaccg 840ctccttcgct ttcccttcct
cgcccgccgt aataaataga caccccctcc acaccctctt 900tccccaacct
cgtgttgttc ggagcgcaca cacacacaac cagatctccc ccaaatccac
960ccgtcggcac ctccgcttca aggtacgccg ctcgtcctcc cccccccccc
tctctacctt 1020ctctagatcg gcgttccggt ccatgcatgg ttagggcccg
gtagttctac ttctgttcat 1080gtttgtgtta gatccgtgtt tgtgttagat
ccgtgctgct agcgttcgta cacggatgcg 1140acctgtacgt cagacacgtt
ctgattgcta acttgccagt gtttctcttt ggggaatcct 1200gggatggctc
tagccgttcc gcagacggga tcgatttcat gatttttttt gtttcgttgc
1260atagggtttg gtttgccctt ttcctttatt tcaatatatg ccgtgcactt
gtttgtcggg 1320tcatcttttc atgctttttt ttgtcttggt tgtgatgatg
tggtctggtt gggcggtcgt 1380tctagatcgg agtagaattc tgtttcaaac
tacctggtgg atttattaat tttggatctg 1440tatgtgtgtg ccatacatat
tcatagttac gaattgaaga tgatggatgg aaatatcgat 1500ctaggatagg
tatacatgtt gatgcgggtt ttactgatgc atatacagag atgctttttg
1560ttcgcttggt tgtgatgatg tggtgtggtt gggcggtcgt tcattcgttc
tagatcggag 1620tagaatactg tttcaaacta cctggtgtat ttattaattt
tggaactgta tgtgtgtgtc 1680atacatcttc atagttacga gtttaagatg
gatggaaata tcgatctagg ataggtatac 1740atgttgatgt gggttttact
gatgcatata catgatggca tatgcagcat ctattcatat 1800gctctaacct
tgagtaccta tctattataa taaacaagta tgttttataa ttattttgat
1860cttgatatac ttggatgatg gcatatgcag cagctatatg tggatttttt
tagccctgcc 1920ttcatacgct atttatttgc ttggtactgt ttcttttgtc
gatgctcacc ctgttgtttg 1980gtgttacttc tgcaggtcga ctctagagga
tccccatggc cccgaagaag aagaggaagg 2040tgcacatgga taagaagtac
agcatcggcc tcgacatcgg gaccaacagc gtcggctggg 2100ccgtcatcac
cgacgaatat aaggtgccca gcaagaagtt caaggtgctc gggaatacag
2160accgccacag catcaagaag aacctgatcg gcgccctcct gttcgactcg
ggcgagaccg 2220ctgaggccac cagactaaag aggaccgctc gccgccgcta
cacccgccgc aagaaccgca 2280tatgctacct ccaggagatc ttcagcaacg
agatggccaa ggtggacgac agcttcttcc 2340accgccttga ggagtcgttc
ctcgtggagg aggacaagaa gcatgagagg cacccgatct 2400tcgggaacat
cgtggacgag gtaagtttct gcttctacct ttgatatata tataataatt
2460atcattaatt agtagtaata taatatttca aatatttttt tcaaaataaa
agaatgtagt 2520atatagcaat tgcttttctg tagtttataa gtgtgtatat
tttaatttat aacttttcta 2580atatatgacc aaaacatggt gatgtgcagg
tggcgtacca cgagaagtac ccgacgatct 2640accacctccg caagaagctg
gtcgactcca cagacaaggc cgacctcaga ctgatctacc 2700tggccctcgc
gcacatgatc aagttccgcg ggcacttcct catcgagggc gacctgaacc
2760cggacaactc cgacgtcgac aagctcttca tccagctggt ccagacctac
aatcaactgt 2820tcgaggagaa cccgatcaac gcgtccggcg tggacgcgaa
ggccatcctc agcgcgaggc 2880tcagcaaatc aagacggctg gagaacctga
tcgcccagct cccaggcgag aagaaaaacg 2940gcttgttcgg caacctgatc
gcgctctcgc tcggcctcac gcccaacttc aaatcaaact 3000tcgacctggc
cgaggacgcg aaactgcagc tgtccaagga cacttacgac gacgacctcg
3060acaacctgct ggcgcaaatc ggtgaccagt acgcagacct cttcctggcc
gccaagaacc 3120tctcggacgc catcctgctg tccgatatcc tgagagtgaa
tacggagatc accaaggcgc 3180cgctcagcgc ctccatgatt aaaaggtacg
acgagcacca ccaggacctg acgctgctca 3240aggccctggt gcgccagcag
ctccccgaga agtacaagga gatcttcttc gaccaatcaa 3300aaaacggcta
cgccggctac atcgacgggg gcgcctccca ggaggagttc tacaagttca
3360tcaaaccaat tctcgagaag atggacggca cggaggagct tctcgtgaag
ctcaaccggg 3420aggacctcct gaggaagcag aggacgttcg acaacggctc
gataccgcat cagatccacc 3480tgggcgagct ccacgccatc ctgcgccggc
aggaggattt ctatccgttc ctcaaggaca 3540acagggagaa gatcgagaaa
attctgacgt tccgcatccc gtactacgtg ggccctctcg 3600cgcgcgggaa
cagccggttc gcctggatga ctcggaagtc ggaggagacg atcacgccgt
3660ggaacttcga ggaggtggtg gacaagggcg cctccgccca gtcgttcatc
gagcgcatga 3720cgaacttcga taaaaatctg cccaatgaaa aagtgctccc
gaagcacagc ctcctctacg 3780agtacttcac ggtgtacaac gagctcacga
aggtgaagta cgtgaccgag ggtatgcgga 3840agccggcgtt cctgagcggc
gagcagaaga aggccatcgt ggacctcctc ttcaagacga 3900accggaaagt
caccgtgaag caattaaagg aggactactt caagaaaata gagtgcttcg
3960acagcgtcga gatctcgggc gtcgaggaca ggttcaacgc gtcgctgggc
acataccacg 4020acctcctcaa gatcattaaa gacaaggact tcctggacaa
cgaggagaac gaggacatcc 4080tcgaggacat cgtgctgacc ctcaccctgt
ttgaggaccg ggagatgatc gaggagcgcc 4140tcaagacgta cgctcacctt
ttcgacgaca aggtgatgaa acagctgaag cggcgccgct 4200acaccggatg
gggccggctc tcccgcaagc tcattaatgg gatcagggac aagcagtccg
4260gcaagaccat actcgatttc ctgaagagcg acggcttcgc caaccggaac
ttcatgcagc 4320tcatccacga cgactccctc actttcaagg aggacatcca
gaaggcccag gtcagcggac 4380agggcgactc gctccacgaa cacatcgcca
acctggccgg gtcgcctgcg attaaaaagg 4440gaatccttca gaccgtcaag
gtcgtggacg agctggtgaa ggtgatgggc aggcacaagc 4500ccgaaaatat
cgtcattgag atggcccggg agaaccagac cacgcagaaa ggccagaaga
4560acagccggga gcgcatgaaa cggatcgagg agggtatcaa ggagctgggc
tcgcagatcc 4620tcaaggagca ccctgtggaa aatacccagc tgcagaatga
aaagctctac ctctactacc 4680tccagaacgg ccgcgacatg tacgtggacc
aggagctgga cattaatcgc ctctcggact 4740acgacgtcga ccacatcgtc
ccgcagtcct tcctgaagga cgacagcatc gacaacaagg 4800tcttgacccg
ctccgataaa aatcgcggga agtccgacaa cgtgccgtcg gaggaggtgg
4860tcaagaagat gaaaaactac tggcgccagc tgctcaacgc caagctaatc
acgcagcgca 4920agttcgacaa cctcaccaag gccgaacgcg gcggtctctc
cgagcttgat aaggctgggt 4980tcatcaagag acagctggtg gagacccggc
agatcaccaa gcatgtcgcc cagatcctgg 5040actcgcgcat gaatactaag
tacgatgaaa acgacaagct catccgcgag gtgaaggtga 5100tcaccctgaa
gagcaagctg gtctcggact tccggaagga cttccagttc tacaaggtcc
5160gggagatcaa caactaccac cacgcgcacg acgcctacct gaacgcggtg
gtgggcacag 5220cccttataaa gaagtaccct aagctcgagt ccgagttcgt
gtacggcgac tacaaggtgt 5280acgacgtccg caagatgatc gcgaagagcg
agcaggagat cgggaaggcc accgcaaaat 5340acttcttcta ctccaacatc
atgaacttct tcaagaccga gatcaccctg gccaacgggg 5400agatccgcaa
gcgcccgctg attgagacga acggagagac aggcgagata gtctgggaca
5460agggcaggga cttcgccacc gtgcgcaagg ttctgtccat gccgcaggtg
aacatcgtga 5520agaagactga ggtgcagaca ggcggcttct cgaaggagtc
catcctgccc aagcggaaca 5580gcgacaagct catcgcgcgg aagaaggact
gggaccctaa aaaatatggc gggttcgact 5640cgcccaccgt ggcttactcg
gtcctcgtgg tggccaaggt cgagaagggc aaaagcaaga 5700agctgaagag
cgtcaaggag ctcctcggca tcaccatcat ggagcggtcc agcttcgaga
5760agaacccgat cgacttcctc gaggcgaagg gatataagga ggtgaagaag
gacctcatca 5820ttaaactgcc gaagtactcg ctattcgaac tggagaatgg
tcgcaagagg atgctcgcga 5880gcgctggcga gctgcagaaa gggaacgagc
tggctctccc gagcaagtac gtcaacttcc 5940tctacctggc ctcccactat
gaaaagctca agggctcgcc ggaggacaac gagcagaagc 6000agctgttcgt
cgagcagcac aagcattacc tcgacgagat catcgagcag atctcggagt
6060tcagcaagcg cgtgatcctg gccgacgcca acctcgacaa ggtgctgtcc
gcatataaca 6120agcaccgcga caaaccaata cgggagcagg ccgaaaatat
catccacctg ttcaccctca 6180cgaacctggg cgcccccgcc gcgttcaagt
acttcgacac aaccatcgac cgcaagcggt 6240acacgagcac gaaggaggtg
ctggacgcca cgttgattca ccagtccatc acgggcctgt 6300atgaaacaag
gatcgatctc agccagctcg gcggcgacta ggtaccacat ggttaaccta
6360gacttgtcca tcttctggat tggccaactt aattaatgta tgaaataaaa
ggatgcacac 6420atagtgacat gctaatcact ataatgtggg catcaaagtt
gtgtgttatg tgtaattact 6480agttatctga ataaaagaga aagagatcat
ccatatttct tatcctaaat gaatgtcacg 6540tgtctttata attctttgat
gaaccagatg catttcatta accaaatcca tatacatata 6600aatattaatc
atatataatt aatatcaatt gggttagcaa aacaaatcta gtctaggtgt
6660gttttgcgaa ttgcggc 6677194100DNAArtificial sequenceDNA version
of guide RNA (EPSPS sgRNA) 194gcagtaacag ctgctgtcaa gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgctttt 1001953708DNAArtificial sequenceEPSPS polynucleotide
template 195ctgcagccca tcaaggagat ctccggcacc gtcaagctgc cggggtccaa
gtcgctttcc 60aacaggatcc tcctgctcgc cgccctgtcc gaggtgagcg attttggtgc
ttgctgcgct 120gccctgtctc actgctacct aaatgttttg cctgtcgaat
accatggatt ctcggtgtaa 180tccatctcac gatcagatgc accgcatgtc
gcatgcctag ctctctctaa tttgtctagt 240agtttgtata cggattaaga
ttgataaatc ggtaccgcaa aagctaggtg taaataaaca 300ctacaaaatt
ggatgttccc ctatcggcct gtactcggct actcgttctt gtgatggcat
360gttatttctt cttggtgttt ggtgaactcc cttatgaaat ttgggcgcaa
agaaatcgcc 420ctcaagggtt gatcttatgc catcgtcatg ataaacagtg
aagcacggat gatcctttac 480gttgttttta acaaactttg tcagaaaact
agcaatgtta acttcttaat gatgatttca 540caacaaaaaa ggtaaccttg
ctactaacat aacaaaagac ttgttgctta ttaattatat 600gtttttttaa
tctttgatca ggggacaaca gtggttgata acctgttgaa cagtgaggat
660gtccactaca tgctcggggc cttgaggact cttggtctct ctgtcgaagc
ggacaaagct 720gccaaaagag ctgtagttgt tggctgtggt ggaaagttcc
cagttgagga tgctaaagag 780gaagtgcagc tcttcttggg gaatgctgga
atcgcaatgc ggtcattgac agcagctgtt 840actgctgctg gtggaaatgc
aacgtatgtt tcctctctct ctctacaata cttgttggag 900ttagtatgaa
acccatgtgt atgtctagtg gcttatggtg tattggtttt tgaacttcag
960ttacgtgctt gatggagtac caagaatgag ggagagaccc attggcgact
tggttgtcgg 1020attgaagcag cttggtgcag atgttgattg tttccttggc
actgactgcc cacctgttcg 1080tgtcaatgga atcggagggc tacctggtgg
caaggttagt tactaagggc cacatgttac 1140attcttctgt aaatggtaca
actattgtcg agcttttgca tttgtaagga aaacattgat 1200tgatctgaat
ttgatgctac accacaaaat atctacaaat ggtcatccct aactagcaaa
1260ccatgtctcc attaagctca atgaagtaat acttggcatg tgtttatcaa
cttaatttcc 1320atcttctggg gtattgcctg ttttctagtc taatagcatt
tgtttttaga attagctctt 1380acaactgtta tgttctacag gtcaagctgt
ctggctccat cagcagtcag tacttgagtg 1440ccttgctgat ggctgctcct
ttggctcttg gggatgtgga gattgaaatc attgataaat 1500taatctccat
tccctacgtc gaaatgacat tgagattgat ggagcgtttt ggtgtgaaag
1560cagagcattc tgatagctgg gacagattct acattaaggg aggtcaaaaa
tacaagtaag 1620ctctgtaatg tatttcacta ctttgatgcc aatgtttcag
ttttcagttt tccaaacagt 1680cgcatcaata tttgaataga tgcactgtag
aaaaaaatca ttgcagggaa aaactagtac 1740tgagtatttt gactgtaaat
tatttaacca gtcggaatat agtcagtcta ttggagtcaa 1800gagcgtgaac
cgaaatagcc agttaattat cccattatac agaggacaac catgtatact
1860attgaaactt ggtttaagag aatctaggta gctggactcg tagctgcttg
gcatggatac 1920cttcttatct ttaggaaaag acacttgatt ttttttctgt
ggccctctat gatgtgtgaa 1980cctgcttctc tattgcttta gaaggatata
tctatgtcgt tatgcaacat gcttccctta 2040gtcatttgta ctgaaatcag
tttcataagt tcgttagtgg ttccctaaac gaaaccttgt 2100ttttctttgc
aatcaacagg tcccctaaaa atgcctatgt tgaaggtgat gcctcaagcg
2160caagctattt cttggctggt gctgcaatta ctggagggac tgtgactgtg
gaaggttgtg 2220gcaccaccag tttgcaggta aagatttctt ggctggtgct
acgataactg cttttgtctt 2280tttggtttca gcattgttct cagagtcact
aaataacatt atcatctgca aacgtcaaat 2340agacatactt aggtgaatgg
atattcatgt aaccgtttcc ttacaaattt gctgaaacct 2400cagggtgatg
tgaagtttgc tgaggtactg gagatgatgg gagcgaaggt tacatggacc
2460gagactagcg taactgttac tggcccaccg cgggagccat ttgggaggaa
acacctcaag 2520gcgattgatg tcaacatgaa caagatgcct gatgtcgcca
tgactcttgc tgtggttgcc 2580ctctttgccg atggcccgac agccatcaga
gacggtaaaa cattctcagc cctacaacca 2640tgcctcttct acatcactac
ttgacaagac taaaaactat tggctcgttg gcagtggctt 2700cctggagagt
aaaggagacc gagaggatgg ttgcgatccg gacggagcta accaaggtaa
2760ggctacatac ttcacatgtc tcacgtcgtc tttccatagc tcgctgcctc
ttagcggctt 2820gcctgcggtc gctccatcct cggttgctgt ctgtgttttc
cacagctggg agcatctgtt 2880gaggaagggc cggactactg catcatcacg
ccgccggaga agctgaacgt gacggcgatc 2940gacacgtacg acgaccacag
gatggccatg gccttctccc ttgccgcctg tgccgaggtc 3000cccgtgacca
tccgggaccc tgggtgcacc cggaagacct tccccgacta cttcgatgtg
3060ctgagcactt tcgtcaagaa ttaataaagc gtgcgatact accacgcagc
ttgattgaag 3120tgataggctt gtgctgagga aatacatttc ttttgttctg
ttttttctct ttcacgggat 3180taagttttga gtctgtaacg ttagttgttt
gtagcaagtt tctatttcgg atcttaagtt 3240tgtgcactgt aagccaaatt
tcatttcaag agtggttcgt tggaataata agaataataa 3300attacgtttc
agtggctgtc aagcctgctg ctacgtttta ggagatggca ttagacattc
3360atcatcaaca acaataaaac cttttagcct caaacaataa tagtgaagtt
attttttagt 3420cctaaacaag ttgcattagg atatagttaa aacacaaaag
aagctaaagt tagggtttag 3480acatgtggat attgttttcc atgtatagta
tgttctttct ttgagtctca tttaactacc 3540tctacacata ccaactttag
ttttttttct acctcttcat gttactatgg tgccttctta 3600tcccactgag
cattggtata tttagaggtt tttgttgaac atgcctaaat catctcaatc
3660aacgatggac aatcttttct tcgattgagc tgaggtacgt catctaga
370819615DNAArtificial sequenceTIPS nucleotide modifications
196atcgcaatgc ggtca 1519719DNAArtificial sequencePrimer Sequence-1
F-E2 197ccgaggagat cgtgctgca 1919820DNAArtificial sequencePrimer
Sequence-2 F-E2 198caatggccgc attgcagttc 2019919DNAArtificial
sequencePrimer Sequence-1 F-T 199ccgaggagat cgtgctgca
1920020DNAArtificial sequencePrimer Sequence-2 F-T 200tgaccgcatt
gcgattccag 2020124DNAArtificial sequencePrimer Sequence-1 H-T
201tccaagtcgc tttccaacag gatc 2420220DNAArtificial sequencePrimer
Sequence-2 H-T 202tgaccgcatt gcgattccag 2020319DNAArtificial
sequencePrimer Sequence-1 F-E3 203ccgaggagat cgtgctgca
1920424DNAArtificial sequencePrimer Sequence-2 F-E3 204accaagctgc
ttcaatccga caac 2420563DNAArtificial sequenceDNA fragment with
intact Cas target sequence 205ggggaatgct ggaactgcaa tgcggccatt
gacagcagct gttactgctg ctggtggaaa 60tgc 6320660DNAArtificial
sequenceDNA fragment with mutated Cas target sequence 206ggggaatgct
ggaactgcaa tgcggccatt ggcagctgtt actgctgctg gtggaaatgc
6020750DNAArtificial sequenceDNA fragment with mutated Cas target
sequence 207ggggaatgct ggaactgcac agcagctgtt actgctgctg gtggaaatgc
5020833DNAArtificial sequenceDNA fragment with mutated Cas target
sequence 208ggggaatgct gttactgctg ctggtggaaa tgc
3320951DNAArtificial sequenceTIPS edited EPSPS nucleotide sequence
fragment 209aatgctggaa tcgcaatgcg gtcattgaca gcagctgtta ctgctgctgg
t 5121051DNAArtificial sequenceWild-type epsps nucleotide sequence
fragment 210aatgctggaa ctgcaatgcg gccattgaca gcagctgtta ctgctgctgg
t 512115124DNAZea mays 211atggcggcca tggcgaccaa ggccgccgcg
ggcaccgtgt cgctggacct cgccgcgccg 60ccggcggcgg cagcggcggc ggcggtgcag
gcgggtgccg aggagatcgt gctgcagccc 120atcaaggaga tctccggcac
cgtcaagctg ccggggtcca agtcgctttc caacaggatc 180ctcctgctcg
ccgccctgtc cgaggtgagc gattttggtg cttgctgcgc tgccctgtct
240cactgctacc taaatgtttt gcctgtcgaa taccatggat tctcggtgta
atccatctca 300cgatcagatg caccgcatgt cgcatgccta gctctctcta
atttgtctag tagtttgtat 360acggattaag attgataaat cggtaccgca
aaagctaggt gtaaataaac actacaaaat 420tggatgttcc cctatcggcc
tgtactcggc tactcgttct tgtgatggca tgttatttct 480tcttggtgtt
tggtgaactc ccttatgaaa tttgggcgca aagaaatcgc cctcaagggt
540tgatcttatg ccatcgtcat gataaacagt gaagcacgga tgatccttta
cgttgttttt 600aacaaacttt gtcagaaaac tagcaatgtt aacttcttaa
tgatgatttc acaacaaaaa 660aggtaacctt gctactaaca taacaaaaga
cttgttgctt attaattata tgttttttta 720atctttgatc aggggacaac
agtggttgat aacctgttga acagtgagga tgtccactac 780atgctcgggg
ccttgaggac tcttggtctc tctgtcgaag cggacaaagc tgccaaaaga
840gctgtagttg ttggctgtgg tggaaagttc ccagttgagg atgctaaaga
ggaagtgcag 900ctcttcttgg ggaatgctgg aactgcaatg cggccattga
cagcagctgt tactgctgct 960ggtggaaatg caacgtatgt ttcctctctc
tctctacaat acttgttgga gttagtatga 1020aacccatgtg tatgtctagt
ggcttatggt gtattggttt ttgaacttca gttacgtgct 1080tgatggagta
ccaagaatga gggagagacc cattggcgac ttggttgtcg gattgaagca
1140gcttggtgca gatgttgatt gtttccttgg cactgactgc ccacctgttc
gtgtcaatgg 1200aatcggaggg ctacctggtg gcaaggttag ttactaaggg
ccacatgtta cattcttctg 1260taaatggtac aactattgtc gagcttttgc
atttgtaagg aaaacattga ttgatctgaa 1320tttgatgcta caccacaaaa
tatctacaaa tggtcatccc taactagcaa accatgtctc 1380cattaagctc
aatgaagtaa tacttggcat gtgtttatca acttaatttc catcttctgg
1440ggtattgcct gttttctagt ctaatagcat ttgtttttag aattagctct
tacaactgtt 1500atgttctaca ggtcaagctg tctggctcca tcagcagtca
gtacttgagt gccttgctga 1560tggctgctcc tttggctctt ggggatgtgg
agattgaaat cattgataaa ttaatctcca 1620ttccctacgt cgaaatgaca
ttgagattga tggagcgttt tggtgtgaaa gcagagcatt 1680ctgatagctg
ggacagattc tacattaagg gaggtcaaaa atacaagtaa gctctgtaat
1740gtatttcact actttgatgc caatgtttca gttttcagtt ttccaaacag
tcgcatcaat 1800atttgaatag atgcactgta gaaaaaaatc attgcaggga
aaaactagta ctgagtattt 1860tgactgtaaa ttatttaacc agtcggaata
tagtcagtct attggagtca agagcgtgaa 1920ccgaaatagc cagttaatta
tcccattata cagaggacaa ccatgtatac tattgaaact 1980tggtttaaga
gaatctaggt agctggactc gtagctgctt ggcatggata ccttcttatc
2040tttaggaaaa gacacttgat tttttttctg tggccctcta tgatgtgtga
acctgcttct 2100ctattgcttt agaaggatat atctatgtcg ttatgcaaca
tgcttccctt agtcatttgt 2160actgaaatca gtttcataag ttcgttagtg
gttccctaaa cgaaaccttg tttttctttg 2220caatcaacag gtcccctaaa
aatgcctatg ttgaaggtga tgcctcaagc gcaagctatt 2280tcttggctgg
tgctgcaatt actggaggga ctgtgactgt ggaaggttgt ggcaccacca
2340gtttgcaggt aaagatttct tggctggtgc tacgataact gcttttgtct
ttttggtttc 2400agcattgttc tcagagtcac taaataacat tatcatctgc
aaacgtcaaa tagacatact 2460taggtgaatg gatattcatg taaccgtttc
cttacaaatt tgctgaaacc tcagggtgat 2520gtgaagtttg ctgaggtact
ggagatgatg ggagcgaagg ttacatggac cgagactagc 2580gtaactgtta
ctggcccacc gcgggagcca tttgggagga aacacctcaa ggcgattgat
2640gtcaacatga acaagatgcc tgatgtcgcc atgactcttg ctgtggttgc
cctctttgcc 2700gatggcccga cagccatcag agacggtaaa acattctcag
ccctacaacc atgcctcttc 2760tacatcacta cttgacaaga ctaaaaacta
ttggctcgtt ggcagtggct tcctggagag 2820taaaggagac
cgagaggatg gttgcgatcc ggacggagct aaccaaggta aggctacata
2880cttcacatgt ctcacgtcgt ctttccatag ctcgctgcct cttagcggct
tgcctgcggt 2940cgctccatcc tcggttgctg tctgtgtttt ccacagctgg
gagcatctgt tgaggaaggg 3000ccggactact gcatcatcac gccgccggag
aagctgaacg tgacggcgat cgacacgtac 3060gacgaccaca ggatggccat
ggccttctcc cttgccgcct gtgccgaggt ccccgtgacc 3120atccgggacc
ctgggtgcac ccggaagacc ttccccgact acttcgatgt gctgagcact
3180ttcgtcaaga attaataaag cgtgcgatac taccacgcag cttgattgaa
gtgataggct 3240tgtgctgagg aaatacattt cttttgttct gttttttctc
tttcacggga ttaagttttg 3300agtctgtaac gttagttgtt tgtagcaagt
ttctatttcg gatcttaagt ttgtgcactg 3360taagccaaat ttcatttcaa
gagtggttcg ttggaataat aagaataata aattacgttt 3420cagtggctgt
caagcctgct gctacgtttt aggagatggc attagacatt catcatcaac
3480aacaataaaa ccttttagcc tcaaacaata atagtgaagt tattttttag
tcctaaacaa 3540gttgcattag gatatagtta aaacacaaaa gaagctaaag
ttagggttta gacatgtgga 3600tattgttttc catgtatagt atgttctttc
tttgagtctc atttaactac ctctacacat 3660accaacttta gttttttttc
tacctcttca tgttactatg gtgccttctt atcccactga 3720gcattggtat
atttagaggt ttttgttgaa catgcctaaa tcatctcaat caacgatgga
3780caatcttttc ttcgattgag ctgaggtacg tcatctagag gataggacct
tgagaatatg 3840tgtccgtcaa tagctaaccc tctactaatt ttttcaatca
agcaacctat tggcttgact 3900ttaattcgta ccggcttcta ctacttctac
agtattttgt ctctataaat tgcagctaca 3960acagtcagaa cggctggctt
taaaatcaaa tggcctaagg atcattgaaa ggcatcttag 4020caatgtctaa
aattattacc ttctctagac gttgatatct ttgctccgga ttcgatccct
4080tgttgtatga ccacaaatcc aacaccaaat acgcatttct gcaacacacc
caaacacccc 4140ttccaaataa gtggaatggt tgagaaattt gctattttga
ttaaatattg gtgaaggggc 4200aaggctgagg aaacgagacg aaggttcctt
gacagctgaa aaatggaaca ctctagaggc 4260ggagggagcg aggcgagctg
tgtgaattgc cacccattga ttaagaatcc aacaacttga 4320ctagcaaatg
ccgacatggg tagcctacaa aggcgagttt tggagctggt ttcgtaataa
4380ggaaatttct caaccaacta ctttccttag aaaagagttg cttgaccgga
tcaacatctc 4440cccctaaacc ccttggaggg ggagggggct aagattttaa
tctacaagtt agatctaact 4500gtccacctca atccccctca aggaggtttt
tgtattattt gttagtgtag aatgataaag 4560tggatgtatt gataggagat
ggggtacaca tatttatagg gactcaaccc taaccctaat 4620gggtcggcag
cccaacagtg gtgtccggcc cacacacaca ctcacacaca cagtctaaca
4680tcccccgcag tcgcaacggg gacaccacac acgatgagac tggagtagag
gccgaaggta 4740ggagccgacg ggttgaaatc ccccctagtc gcagcgtcgt
gatagtacga atgttgcggc 4800tggagtagag accggtgtgt gctccaagaa
gacgatagcc cctagatgcc gaggtagccg 4860aagtcgaggt ggtcgcggtc
ggaagacgcg cagcaaaagc ctgatcttcg ggatggtcga 4920cgttcgagcg
tcaacgatcg gtagggcgac acaataaaag ggcaccagca ggtcgacctt
4980cctgcttctt cgatcgtcca gacgtcaagg agcctcgcta gggaggccga
cggcagcgca 5040cgcggctacg ccggtcatgg tgtcctcacc cgcggcagaa
aagaagggga atgtcggatc 5100cgaccgagaa ggccacggca gcga
51242123387DNAS. thermophilus 212atgagtgact tagttttagg acttgatatc
ggtataggtt ctgttggtgt aggtatcctt 60aacaaagtga caggagaaat tatccataaa
aactcacgca tcttcccagc agctcaagca 120gaaaataacc tagtacgtag
aacgaatcgt caaggaagac gcttgacacg acgtaaaaaa 180catcgtatag
ttcgtttaaa tcgtctattt gaggaaagtg gattaatcac cgattttacg
240aagatttcaa ttaatcttaa cccatatcaa ttacgagtta agggcttgac
cgatgaattg 300tctaatgaag aactgtttat cgctcttaaa aatatggtga
aacaccgtgg gattagttac 360ctcgatgatg ctagtgatga cggaaattca
tcagtaggag actatgcaca aattgttaag 420gaaaatagta aacaattaga
aactaagaca ccgggacaga tacagttgga acgctaccaa 480acatatggtc
aattacgtgg tgattttact gttgagaaag atggcaaaaa acatcgcttg
540attaatgtct ttccaacatc agcttatcgt tcagaagcct taaggatact
gcaaactcaa 600caagaattta atccacagat tacagatgaa tttattaatc
gttatctcga aattttaact 660ggaaaacgga aatattatca tggacccgga
aatgaaaagt cacggactga ttatggtcgt 720tacagaacga gtggagaaac
tttagacaat atttttggaa ttctaattgg gaaatgtaca 780ttttatccag
aagagtttag agcagcaaaa gcttcctaca cggctcaaga attcaatttg
840ctaaatgatt tgaacaatct aacagttcct actgaaacca aaaagttgag
caaagaacag 900aagaatcaaa tcattaatta tgtcaaaaat gaaaaggcaa
tggggccagc gaaacttttt 960aaatatatcg ctaagttact ttcttgtgat
gttgcagata tcaagggata ccgtatcgac 1020aaatcaggta aggctgagat
tcatactttc gaagcctatc gaaaaatgaa aacgcttgaa 1080accttagata
ttgaacaaat ggatagagaa acgcttgata aattagccta tgtcttaaca
1140ttaaacactg agagggaagg tattcaagaa gccttagaac atgaatttgc
tgatggtagc 1200tttagccaga agcaagttga cgaattggtt caattccgca
aagcaaatag ttccattttt 1260ggaaaaggat ggcataattt ttctgtcaaa
ctgatgatgg agttaattcc agaattgtat 1320gagacgtcag aagagcaaat
gactatcctg acacgacttg gaaaacaaaa acgacttcgt 1380cttcaaataa
aacaaaatat ttcaaataaa acaaaatata tagatgagaa actattaact
1440gaagaaatct ataatcctgt tgttgctaag tctgttcgcc aggctataaa
aatcgtaaat 1500gcggcgatta aagaatacgg agactttgac aatattgtca
tcgaaatggc tcgtgaaaca 1560aatgaagatg atgaaaagaa agctattcaa
aagattcaaa aagccaacaa agatgaaaaa 1620gatgcagcaa tgcttaaggc
tgctaaccaa tataatggaa aggctgaatt accacatagt 1680gttttccacg
gtcataagca attagcgact aaaatccgcc tttggcatca gcaaggagaa
1740cgttgccttt atactggtaa gacaatctca atccatgatt tgataaataa
tcctaatcag 1800tttgaagtag atcatatttt acctctttct atcacattcg
atgatagcct tgcaaataag 1860gttttggttt atgcaactgc taaccaagaa
aaaggacaac gaacacctta tcaggcttta 1920gatagtatgg atgatgcgtg
gtctttccgt gaattaaaag cttttgtacg tgagtcaaaa 1980acactttcaa
acaagaaaaa agaatacctc cttacagaag aagatatttc aaagtttgat
2040gttcgaaaga aatttattga acgaaatctt gtagatacaa gatacgcttc
aagagttgtc 2100ctcaatgccc ttcaagaaca ctttagagct cacaagattg
atacaaaagt ttccgtggtt 2160cgtggccaat ttacatctca attgagacgc
cattggggaa ttgagaagac tcgtgatact 2220tatcatcacc atgctgtcga
tgcattgatt attgccgcct caagtcagtt gaatttgtgg 2280aaaaaacaaa
agaataccct tgtaagttat tcagaagaac aactccttga tattgaaaca
2340ggtgaactta ttagtgatga tgagtacaag gaatctgtgt tcaaagcccc
ttatcaacat 2400tttgttgata cattgaagag taaagaattt gaagacagta
tcttattctc atatcaagtg 2460gattctaagt ttaatcgtaa aatatcagat
gccactattt atgcgacaag acaggctaaa 2520gtgggaaaag ataagaagga
tgaaacttat gtcttaggga aaatcaaaga tatctatact 2580caggatggtt
atgatgcctt tatgaagatt tataagaagg ataagtcaaa attcctcatg
2640tatcgtcacg acccacaaac ctttgagaaa gttatcgagc caattttaga
gaactatcct 2700aataagcaaa tgaatgaaaa aggaaaagag gtaccatgta
atcctttcct aaaatataaa 2760gaagaacatg gctatattcg taaatatagt
aaaaaaggca atggtcctga aatcaagagt 2820cttaaatact atgatagtaa
gcttttaggt aatcctattg atattactcc agagaatagt 2880aaaaataaag
ttgtcttaca gtcattaaaa ccttggagaa cagatgtcta tttcaataag
2940gctactggaa aatacgaaat ccttggatta aaatatgctg atctacaatt
tgagaaaggg 3000acaggaacat ataagatttc ccaggaaaaa tacaatgaca
ttaagaaaaa agagggtgta 3060gattctgatt cagaattcaa gtttacactt
tataaaaatg atttgttact cgttaaagat 3120acagaaacaa aagaacaaca
gcttttccgt tttctttctc gaactttacc taaacaaaag 3180cattatgttg
aattaaaacc ttatgataaa cagaaatttg aaggaggtga ggcgttaatt
3240aaagtgttgg gtaacgttgc taatggtggt caatgcataa aaggactagc
aaaatcaaat 3300atttctattt ataaagtaag aacagatgtc ctaggaaatc
agcatatcat caaaaatgag 3360ggtgataagc ctaagctaga tttttaa
33872133369DNAS. thermophilus 213atgagtgact tagttttagg acttgatatc
ggtataggtt ctgttggtgt aggtatcctt 60aacaaagtga caggagaaat tatccataaa
aactcacgca tcttcccagc agctcaagca 120gaaaataacc tagtacgtag
aacgaatcgt caaggaagac gcttgacacg acgtaaaaaa 180catcgtatag
ttcgtttaaa tcgtctattt gaggaaagtg gattaatcac cgattttacg
240aagatttcaa ttaatcttaa cccatatcaa ttacgagtta agggcttgac
cgatgaattg 300tctaatgaag aactgtttat cgctcttaaa aatatggtga
aacaccgtgg gattagttac 360ctcgatgatg ctagtgatga cggaaattca
tcagtaggag actatgcaca aattgttaag 420gaaaatagta aacaattaga
aactaagaca ccgggacaga tacagttgga acgctaccaa 480acatatggtc
aattacgtgg tgattttact gttgagaaag atggcaaaaa acatcgcttg
540attaatgtct ttccaacatc agcttatcgt tcagaagcct taaggatact
gcaaactcaa 600caagaattta attcacagat tacagatgaa tttattaatc
gttatctcga aattttaact 660ggaaaacgga aatattatca tggacccgga
aatgaaaagt cacggactga ttatggtcgt 720tacagaacga atggagaaac
tttagacaat atttttggaa ttctaattgg gaaatgtaca 780ttttatccag
acgagtttag agcagcaaaa gcttcctaca cggctcaaga attcaatttg
840ctaaatgatt tgaacaatct aacagttcct actgaaacca aaaagttgag
caaagaacag 900aagaatcaaa tcattaatta tgtcaaaaat gaaaaggtaa
tggggccagc gaaacttttt 960aaatatatcg ctaaattact ttcttgtgat
gttgcagata tcaagggaca ccgtatcgac 1020aaatcaggta aggctgagat
tcatactttc gaagcctatc gaaaaatgaa aacgcttgaa 1080accttagata
ttgagcaaat ggatagagaa acgcttgata aattagccta tgtcttaaca
1140ttaaacactg agagggaagg tattcaagaa gctttagaac atgaatttgc
tgatggtagc 1200tttagccaga agcaagttga cgaattggtt caattccgca
aagcaaatag ttccattttt 1260ggaaaaggat ggcataattt ttctgtcaaa
ctgatgatgg agttaattcc agaattgtat 1320gagacgtcag aagagcaaat
gactatcctg acacgacttg gaaaacaaaa aacaacttcg 1380tcttcaaata
aaacaaaata tatagatgag aaactattaa ctgaagaaat ctataatcct
1440gttgttgcta agtctgttcg ccaggctata aaaatcgtaa atgcggcgat
taaagaatac 1500ggagactttg acaatattgt catcgaaatg gctcgtgaaa
caaatgaaga tgatgaaaag 1560aaagctattc aaaagattca aaaagccaac
aaagatgaaa aagatgcagc aatgcttaag 1620gctgctaacc aatataatgg
aaaggctgaa ttaccacata gtgttttcca cggtcataag 1680caattagcga
ctaaaatccg cctttggcat cagcaaggag aacgttgcct ttatactggt
1740aagacaatct caatccatga tttgataaat aatcctaatc agtttgaagt
agatcatatt 1800ttacctcttt ctatcacatt cgatgatagc cttgcaaata
aggttttggt ttatgcaact 1860gctaaccaag aaaaaggaca acgaacacct
tatcaggctt tagatagtat ggatgatgcg 1920tggtctttcc gtgaattaaa
agcttttgta cgtgagtcaa aaacactttc aaacaagaaa 1980aaagaatacc
tccttacaga agaagatatt tcaaagtttg atgttcgaaa gaaatttatt
2040gaacgaaatc ttgtagatac aagatacgct tcaagagttg tcctcaatgc
ccttcaagaa 2100cactttagag ctcacaagat tgatacaaaa gtttccgtgg
ttcgtggcca atttacatct 2160caattgagac gccattgggg aattgagaag
actcgtgata cttatcatca ccatgctgtc 2220gatgcattga ttattgccgc
ctcaagtcag ttgaatttgt ggaaaaaaca aaagaatacc 2280cttgtaagtt
attcagaaga acaactcctt gatattgaaa caggtgaact tattagtgat
2340gatgagtaca aggaatctgt gttcaaagcc ccttatcaac attttgttga
tacattgaag 2400agtaaagaat ttgaagacag tatcttattc tcatatcaag
tggattctaa gtttaatcgt 2460aaaatatcag atgccactat ttatgcgaca
agacaggcta aagtgggaaa agataagaag 2520gatgaaactt atgtcttagg
gaaaatcaaa gatatctata ctcaggatgg ttatgatgcc 2580tttatgaaga
tttataagaa ggataagtca aaattcctca tgtatcgtca cgacccacaa
2640acctttgaga aagttatcga gccaatttta gagaactatc ctaataagga
aatgaatgaa 2700aaagggaaag aagtaccatg taatcctttc ctaaaatata
aagaagaaca tggctatatt 2760cgtaaatata gtaaaaaagg caatggtcct
gaaatcaaga gtcttaaata ctatgatagt 2820aagcttttag gtaatcctat
tgatattact ccagagaata gtaaaaataa agttgtctta 2880cagtcattaa
aaccttggag aacagatgtc tatttcaata aaaatactgg taaatatgaa
2940attttaggac tgaaatatgc tgatttacaa tttgaaaaga agacaggaac
atataagatt 3000tcccaggaaa aatacaatgg cattatgaaa gaagagggtg
tagattctga ttcagaattc 3060aagtttacac tttataaaaa tgatttgtta
ctcgttaaag atacagaaac aaaagaacaa 3120cagcttttcc gttttctttc
tcgaactatg cctaatgtga aatattatgt agagttaaag 3180ccttattcaa
aagataaatt tgagaagaat gagtcactta ttgaaatttt aggttctgca
3240gataagtcag gacgatgtat aaaagggcta ggaaaatcaa atatttctat
ttataaggta 3300agaacagatg tcctaggaaa tcagcatatc atcaaaaatg
agggtgataa gcctaagcta 3360gatttttaa 33692144113DNAS. agalactiae
214atgaataagc catattcaat aggccttgac atcggtacta attccgtcgg
atggagcatt 60attacagatg attataaagt acctgctaag aagatgagag ttttagggaa
cactgataaa 120gaatatatta agaagaatct cataggtgct ctgctttttg
atggcgggaa tactgctgca 180gatagacgct tgaagcgaac tgctcgtcgt
cgttatacac gtcgtagaaa tcgtattcta 240tatttacaag aaatttttgc
agaggaaatg agtaaagttg atgatagttt ctttcatcga 300ttagaggatt
cttttctagt tgaggaagat aagagaggga gcaagtatcc tatctttgca
360acattgcagg aagagaaaga ttatcatgaa aaattttcga caatctatca
tttgagaaaa 420gaattagctg acaagaaaga aaaagcagac cttcgtctta
tttatattgc tctagctcat 480atcattaaat ttagagggca tttcctaatt
gaggatgata gctttgatgt caggaataca 540gacatttcaa aacaatatca
agatttttta gaaatcttta atacaacttt tgaaaataat 600gatttgttat
ctcaaaacgt tgacgtagag gcaatactaa cagataagat tagcaagtct
660gcgaagaaag atcgtatttt agcgcagtat cctaaccaaa aatctactgg
catttttgca 720gaatttttga aattgattgt cggaaatcaa gctgacttca
agaaatattt caatttggag 780gataaaacgc cgcttcaatt cgctaaggat
agctacgatg aagatttaga aaatcttctt 840ggacagattg gtgatgaatt
tgcagactta ttctcagcag cgaaaaagtt atatgatagt 900gtccttttgt
ctggcattct tacagtaatc gacctcagta ccaaggcgcc actttcagct
960tctatgattc agcgttatga tgaacataga gaggacttga aacagttaaa
acaattcgta 1020aaagcttcat tgccggaaaa atatcaagaa atatttgctg
attcatcaaa agatggctac 1080gctggttata ttgaaggtaa aactaatcaa
gaagcttttt ataaatacct gtcaaaattg 1140ttgaccaagc aagaagatag
cgagaatttt cttgaaaaaa tcaagaatga agatttcttg 1200agaaaacaaa
ggacctttga taatggctca attccacacc aagtccattt gacagagctg
1260aaagctatta tccgccgtca atcagaatac tatcccttct tgaaagagaa
tcaagatagg 1320attgaaaaaa tccttacctt tagaattcct tattatatcg
ggccactagc acgtgagaag 1380agtgattttg catggatgac tcgcaaaaca
gatgacagta ttcgaccttg gaattttgaa 1440gacttggttg ataaagaaaa
atctgcggaa gcttttatcc atcgtatgac caacaatgat 1500ttttatcttc
ctgaagaaaa agttttacca aagcatagtc ttatttatga aaaatttacg
1560gtctataatg agttgactaa ggttagatat aaaaatgagc aaggtgagac
ttattttttt 1620gatagcaata ttaaacaaga aatctttgat ggagtattca
aggaacatcg taaggtatcc 1680aagaagaagt tgctagattt tctggctaaa
gaatatgagg agtttaggat agtagatgtt 1740attggtctag ataaagaaaa
taaagctttc aacgcctcat tgggaactta ccacgatctc 1800gaaaaaatac
tagacaaaga ttttctagat aatccagata atgagtctat tctggaagat
1860atcgtccaaa ctctaacatt atttgaagac agagaaatga ttaagaagcg
tcttgaaaac 1920tataaagatc tttttacaga gtcacaacta aaaaaactct
atcgtcgtca ctatactggc 1980tggggacgat tgtctgctaa gttaatcaat
ggtattcgag ataaagagag tcaaaaaaca 2040atcttggact atcttattga
tgatggtaga tctaatcgca actttatgca gttgataaat 2100gatgatggtc
tatctttcaa atcaattatc agtaaggcac aggctggtag tcattcagat
2160aatctaaaag aagttgtagg tgagcttgca ggtagccctg ctattaaaaa
gggaattcta 2220caaagtttga aaattgttga tgagcttgtt aaagtcatgg
gatacgaacc tgaacaaatt 2280gtggttgaga tggcgcgtga gaatcaaaca
acaaatcaag gtcgtcgtaa ctctcgacaa 2340cgctataaac ttcttgatga
tggcgttaag aatctagcta gtgacttgaa tggcaatatt 2400ttgaaagaat
atcctacgga taatcaagcg ttgcaaaatg aaagactttt cctttactac
2460ttacaaaacg gaagagatat gtatacaggg gaagctctag atattgacaa
tttaagtcaa 2520tatgatattg accacattat tcctcaagct ttcataaaag
atgattctat tgataatcgt 2580gttttggtat catctgctaa aaatcgtgga
aagtcagatg atgttcctag ccttgaaatt 2640gtaaaagatt gtaaagtttt
ctggaaaaaa ttacttgatg ctaagttaat gagtcagcgt 2700aagtatgata
atttgactaa ggcagagcgc ggaggcctaa cttccgatga taaggcaaga
2760tttatccaac gtcagttggt tgagacacga caaattacca agcatgttgc
ccgtatcttg 2820gatgaacgct ttaataatga gcttgatagt aaaggtagaa
ggatccgcaa agttaaaatt 2880gtaaccttga agtcaaattt ggtttcaaat
ttccgaaaag aatttggatt ctataaaatt 2940cgtgaagtta acaattatca
ccatgcacat gatgcctatc ttaatgcagt agttgctaaa 3000gctattctaa
ccaaatatcc tcagttagag ccagaatttg tctacggcga ctatccaaaa
3060tataatagtt acaaaacgcg taaatccgct acagaaaagc tatttttcta
ttcaaatatt 3120atgaacttct ttaaaactaa ggtaacttta gcggatggaa
ccgttgttgt aaaagatgat 3180attgaagtta ataatgatac gggtgaaatt
gtttgggata aaaagaaaca ctttgcgaca 3240gttagaaaag tcttgtcata
ccctcagaac aatatcgtga agaagacaga gattcagaca 3300ggtggtttct
ctaaggaatc aatcttggcg catggtaact cagataagtt gattccaaga
3360aaaacgaagg atatttattt agatcctaag aaatatggag gttttgatag
tccgatagta 3420gcttactctg ttttagttgt agctgatatc aaaaagggta
aagcacaaaa actaaaaaca 3480gttacggaac ttttaggaat taccatcatg
gagaggtcca gatttgagaa aaatccatca 3540gctttccttg aatcaaaagg
ctatttaaat attagggctg ataaactaat tattttgccc 3600aagtatagtc
tgttcgaatt agaaaatggg cgtcgtcgat tacttgctag tgctggtgaa
3660ttacaaaaag gtaatgagct agccttacca acacaattta tgaagttctt
ataccttgca 3720agtcgttata atgagtcaaa aggtaaacca gaggagattg
agaagaaaca agaatttgta 3780aatcaacatg tctcttattt tgatgacatc
cttcaattaa ttaatgattt ttcaaaacga 3840gttattctag cagatgctaa
tttagagaaa atcaataagc tttaccaaga taataaggaa 3900aatatatcag
tagatgaact tgctaataat attatcaatc tatttacttt taccagtcta
3960ggagctccag cagcttttaa attttttgat aaaatagttg atagaaaacg
ctatacatca 4020actaaagaag tacttaattc taccctaatt catcaatcta
ttactggact ttatgaaaca 4080cgtattgatt tgggtaagtt aggagaagat tga
41132154134DNAS. agalactiae 215atgaataagc catattcaat aggccttgac
atcggtacta attccgtcgg atggagcatt 60attacagatg attataaagt acctgctaag
aagatgagag ttttagggaa cactgataaa 120gaatatatta agaagaatct
cataggtgct ctgctttttg atggcgggaa tactgctgca 180gatagacgct
tgaagcgaac tgctcgtcgt cgttatacac gtcgtagaaa tcgtattcta
240tatttacaag aaatttttgc agaggaaatg agtaaagttg atgatagttt
ctttcatcga 300ttagaggatt cttttctagt tgaggaagat aagagaggta
gcaagtatcc tatctttgca 360acaatgcagg aggagaaata ttatcatgaa
aaatttccga caatctatca tttgagaaaa 420gaattggctg acaagaaaga
aaaagcagac cttcgtcttg tttatctggc tctagctcat 480atcattaaat
tcagagggca tttcctaatt gaggatgata gatttgatgt gaggaatacc
540gatattcaaa aacaatatca agccttttta gaaatttttg atactacctt
tgaaaataat 600catttgttat ctcaaaatgt agatgtagaa gcaattctaa
cagataagat tagcaagtct 660gcgaagaagg atcgcatctt agcgcagtat
cctaaccaaa aatctactgg tatttttgca 720gaatttttga aattgattgt
cggaaatcaa gctgacttca agaaacattt caatttggag 780gataaaacac
cgcttcaatt cgctaaggat agctacgatg aagatttaga aaatcttctt
840ggacagattg gtgatgaatt tgcagactta ttctcagtag cgaaaaagct
atatgatagt 900gttcttttat ctggcattct tacagtaact gatctcagta
ccaaggcgcc actttctgcc 960tctatgattc agcgttatga tgaacatcat
gaggacttaa agcatctaaa acaattcgta 1020aaagcttcat tacctgaaaa
ttatcgggaa gtatttgctg attcatcaaa agatggctac 1080gctggctata
ttgaaggcaa aactaatcaa gaagcttttt ataaatatct gttaaaattg
1140ttgaccaaac aagaaggtag cgagtatttt cttgagaaaa ttaagaatga
agattttttg 1200agaaaacaga gaacctttga taatggctca atcccgcatc
aagtccattt gacagaattg 1260agggctatta ttcgacgtca atcagaatac
tatccattct tgaaagagaa tcaagatagg 1320attgaaaaaa tccttacctt
tagaattcct tattatgtcg ggccactagc acgtgagaag 1380agtgattttg
catggatgac tcgcaaaaca gatgacagta ttcgaccttg gaattttgaa
1440gacttggttg ataaagaaaa atctgcggaa gcttttatcc atcgcatgac
caacaatgac 1500ctctatcttc cagaagaaaa agttttacca aagcatagtc
ttatttatga aaaatttact 1560gtttacaatg aattaacgaa ggttagattt
ttggcagaag gctttaaaga ttttcaattt 1620ttaaatagga agcaaaaaga aactatcttt
aacagcttgt ttaaggaaaa acgtaaagta 1680actgaaaagg atattattag
ttttttgaat aaagttgatg gatatgaagg aattgcaatc 1740aaaggaattg
agaaacagtt taacgctagc ctttcaacct atcatgatct taaaaaaata
1800cttggcaagg atttccttga taatacagat aacgagctta ttttggaaga
tatcgtccaa 1860actctaacct tatttgaaga tagagaaatg attaagaagt
gtcttgacat ctataaagat 1920ttttttacag agtcacagct taaaaagctc
tatcgccgtc actatactgg ctggggacga 1980ttgtctgcta agctaataaa
tggcatccga aataaagaga atcaaaaaac aatcttggac 2040tatcttattg
atgatggaag tgcaaaccga aacttcatgc agttgataaa tgatgatgat
2100ctatcattta aaccaattat tgacaaggca cgaactggta gtcattcgga
taatctgaaa 2160gaagttgtag gtgaacttgc tggtagccct gctattaaaa
aagggattct acaaagtttg 2220aaaatagttg atgagctggt taaagtcatg
ggctatgaac ctgaacaaat cgtggttgaa 2280atggcacgtg agaaccaaac
gacagcaaaa ggattaagtc gttcacgaca acgcttgaca 2340accttgagag
aatctcttgc taatttgaag agtaatattt tggaagagaa aaagcctaag
2400tatgtgaaag atcaagttga aaatcatcat ttatctgatg accgtctttt
cctttactac 2460ttacaaaacg gaagagatat gtatacaaaa aaggctctgg
atattgataa tttaagtcaa 2520tatgatattg accacattat tcctcaagct
ttcataaaag atgattctat tgataatcgt 2580gttttggtat catctgctaa
aaatcgtgga aaatcagatg atgttcctag cattgaaatt 2640gtaaaagctc
gcaaaatgtt ctggaaaaat ttactggatg ctaagttaat gagtcagcgt
2700aagtatgata atttgactaa ggcagagcgc ggaggcctaa cttccgatga
taaggcaaga 2760tttatccaac gtcagttggt tgagactcga caaattacca
agcatgtagc tcgtatcttg 2820gatgaacgct tcaataatga agttgataat
ggtaaaaaga tttgcaaggt taaaattgta 2880accttgaagt caaatttggt
ttcaaatttc cgaaaagaat ttggattcta taaaattcgt 2940gaagttaatg
attatcacca tgcacacgat gcttatctta atgcagtagt tgccaaagct
3000attctaacca aatatccaca gttagagcca gagtttgtct acggaatgta
tagacagaaa 3060aaactttcga aaatcgttca tgaggataag gaagaaaaat
atagtgaagc aaccaggaaa 3120atgtttttct actccaactt gatgaatatg
ttcaaaagag ttgtgaggtt agcagatggt 3180tctattgttg taagaccagt
aatagaaact ggtagatata tgagaaaaac tgcatgggat 3240aaaaagaaac
actttgcgac agttagaaaa gtcttgtcat accctcagaa caatatcgtg
3300aagaagacag agattcagac aggtggtttc tctaaggaat caatcttggc
gcatggtaac 3360tcagataagt tgattccaag aaaaacgaag gatatttatt
tagatcctaa gaaatatgga 3420ggttttgata gtccgatagt agcttactct
gttttagttg tagctgatat caaaaaaggt 3480aaagcacaaa aactaaaaac
agttacggaa cttttaggaa ttaccatcat ggagaggtcc 3540agatttgaga
aaaatccatc agctttcctt gaatcaaaag gttatttaaa tattagggac
3600gataaattaa tgattttacc gaagtatagt ctgttcgaat tagaaaatgg
gcgtcgtcga 3660ttacttgcta gtgctggtga attacaaaaa ggtaacgagc
tagccttacc aacacaattt 3720atgaagttct tataccttgc aagtcgttat
aatgagtcaa aaggtaaacc agaggagatt 3780gagaagaaac aagaatttgt
aaatcaacat gtctcttatt ttgatgacat ccttcaatta 3840attaatgatt
tttcaaaacg agttattcta gcagatgcta atttagagaa aatcaataag
3900ctttaccagg ataataagga aaatatacca gtagatgaac ttgctaataa
tattatcaat 3960ctatttactt ttaccagtct aggagctcca gcagctttta
aattttttga taaaatagtt 4020gatagaaaac gctatacatc aactaaagaa
gtacttaatt ctactctaat ccatcaatct 4080attactggac tttatgaaac
acgtattgat ttgggtaaat taggagaaga ttga 41342164038DNAS. mutans
216atgaaaaaac cttactctat tggacttgat attggaacca attctgttgg
ttgggctgtt 60gtgacagatg actacaaagt tcctgctaag aagatgaagg ttctgggaaa
tacagataaa 120agtcatatcg agaaaaattt gcttggcgct ttattatttg
atagcgggaa tactgcagaa 180gacagacggt taaagagaac tgctcgccgt
cgttacacac gtcgcagaaa tcgtatttta 240tatttgcaag agattttttc
agaagaaatg ggcaaggtag atgatagttt ctttcatcgt 300ttagaggatt
cttttcttgt tactgaggat aaacgaggag agcgccatcc catttttggg
360aatcttgaag aagaagttaa gtatcatgaa aattttccaa ccatttatca
tttgcggcaa 420tatcttgcgg ataatccaga aaaagttgat ttgcgtttag
tttatttggc tttggcacat 480ataattaagt ttagaggtca ttttttaatt
gaaggaaagt ttgatacacg caataatgat 540gtacaaagac tgtttcaaga
atttttagca gtctatgata atacttttga gaatagttcg 600cttcaggagc
aaaatgttca agttgaagaa attctgactg ataaaatcag taaatctgct
660aagaaagata gagttttgaa actttttcct aatgaaaagt ctaatggccg
ctttgcagaa 720tttctaaaac taattgttgg taatcaagct gattttaaaa
agcattttga attagaagag 780aaagcaccat tgcaattttc taaagatact
tatgaagaag agttagaagt actattagct 840caaattggag ataattacgc
agagctcttt ttatcagcaa agaaactgta tgatagtatc 900cttttatcag
ggattttaac agttactgat gttggtacca aagcgccttt atctgcttcg
960atgattcagc gatataatga acatcagatg gatttagctc agcttaaaca
attcattcgt 1020cagaaattat cagataaata taacgaagtt ttttctgatg
tttcaaaaga cggctatgcg 1080ggttatattg atgggaaaac aaatcaagaa
gctttttata aataccttaa aggtctatta 1140aataagattg agggaagtgg
ctatttcctt gataaaattg agcgtgaaga ttttctaaga 1200aagcaacgta
cctttgacaa tggctctatt ccacatcaga ttcatcttca agaaatgcgt
1260gctatcattc gtagacaggc tgaattttat ccgtttttag cagacaatca
agataggatt 1320gagaaattat tgactttccg tattccctac tatgttggtc
cattagcgcg cggaaaaagt 1380gattttgctt ggttaagtcg gaaatcggct
gataaaatta caccatggaa ttttgatgaa 1440atcgttgata aagaatcctc
tgcagaagct tttatcaatc gtatgacaaa ttatgatttg 1500tacttgccaa
atcaaaaagt tcttcctaaa catagtttat tatacgaaaa atttactgtt
1560tacaatgaat taacaaaggt taaatataaa acagagcaag gaaaaacagc
attttttgat 1620gccaatatga agcaagaaat ctttgatggc gtatttaagg
tttatcgaaa agtaactaaa 1680gataaattaa tggatttcct tgaaaaagaa
tttgatgaat ttcgtattgt tgatttaaca 1740ggtctggata aagaaaataa
agtatttaac gcttcttatg gaacttatca tgatttgtgt 1800aaaattttag
ataaagattt tctcgataat tcaaagaatg aaaagatttt agaagatatt
1860gtgttgacct taacgttatt tgaagataga gaaatgatta gaaaacgtct
agaaaattac 1920agtgatttat tgaccaaaga acaagtgaaa aagctggaaa
gacgtcatta tactggttgg 1980ggaagattat cagctgagtt aattcatggt
attcgcaata aagaaagcag aaaaacaatt 2040cttgattatc tcattgatga
tggcaatagc aatcggaact ttatgcaact gattaacgat 2100gatgctcttt
ctttcaaaga agagattgct aaggcacaag ttattggaga aacagacaat
2160ctaaatcaag ttgttagtga tattgctggc agccctgcta ttaaaaaagg
aattttacaa 2220agcttgaaga ttgttgatga gcttgtcaaa attatgggac
atcaacctga aaatatcgtc 2280gtggagatgg cgcgtgaaaa ccagtttacc
aatcagggac gacgaaattc acagcaacgt 2340ttgaaaggtt tgacagattc
tattaaagaa tttggaagtc aaattcttaa agaacatccg 2400gttgagaatt
cacagttaca aaatgataga ttgtttctat attatttaca aaacggcaga
2460gatatgtata ctggagaaga attggatatt gattatctaa gccagtatga
tatagaccat 2520attatcccgc aagcttttat aaaggataat tctattgata
atagagtatt gactagctca 2580aaggaaaatc gtggaaaatc ggatgatgta
ccaagtaaag atgttgttcg taaaatgaaa 2640tcctattgga gtaagctact
ttcggcaaag cttattacac aacgtaaatt tgataatttg 2700acaaaagctg
aacgaggtgg attgaccgac gatgataaag ctggattcat caagcgtcaa
2760ttagtagaaa cacgacaaat taccaaacat gtagcacgta ttctggacga
acgatttaat 2820acagaaacag atgaaaacaa caagaaaatt cgtcaagtaa
aaattgtgac cttgaaatca 2880aatcttgttt ccaatttccg taaagagttt
gaactctaca aagtgcgtga aattaatgac 2940tatcatcatg cacatgatgc
ctatctcaat gctgtaattg gaaaggcttt actaggtgtt 3000tacccacaat
tggaacctga atttgtttat ggtgattatc ctcattttca tggacataaa
3060gaaaataaag caactgctaa gaaatttttc tattcaaata ttatgaactt
ctttaaaaaa 3120gatgatgtcc gtactgataa aaatggtgaa attatctgga
aaaaagatga gcatatttct 3180aatattaaaa aagtgctttc ttatccacaa
gttaatattg ttaagaaagt agaggagcaa 3240acgggaggat tttctaaaga
atctatcttg ccgaaaggta attctgacaa gcttattcct 3300cgaaaaacga
agaaatttta ttgggatacc aagaaatatg gaggatttga tagcccgatt
3360gttgcttatt ctattttagt tattgctgat attgaaaaag gtaaatctaa
aaaattgaaa 3420acagtcaaag ccttagttgg tgtcactatt atggaaaaga
tgacttttga aagggatcca 3480gttgcttttc ttgagcgaaa aggctatcga
aatgttcaag aagaaaatat tataaagtta 3540ccaaaatata gtttatttaa
actagaaaac ggacgaaaaa ggctattggc aagtgctagg 3600gaacttcaaa
agggaaatga aatcgttttg ccaaatcatt taggaacctt gctttatcac
3660gctaaaaata ttcataaagt tgatgaacca aagcatttgg actatgttga
taaacataaa 3720gatgaattta aggagttgct agatgttgtg tcaaactttt
ctaaaaaata tactttagca 3780gaaggaaatt tagaaaaaat caaagaatta
tatgcacaaa ataatggtga agatcttaaa 3840gaattagcaa gttcatttat
caacttatta acatttactg ctataggagc accggctact 3900tttaaattct
ttgataaaaa tattgatcga aaacgatata cttcaactac tgaaattctc
3960aacgctaccc tcatccacca atccatcacc ggtctttatg aaacgcggat
tgatctcaat 4020aagttaggag gagactaa 403821720DNAArtificial
Sequenceprimer qADH-F 217caagtcgcgg ttttcaatca
2021821DNAArtificial SequencePrimer qADH-R 218tgaaggtgga agtcccaaca
a 2121919DNAArtificial Sequenceprobe ADH-VIC 219tgggaagcct
atctaccac 1922015DNAArtificial SequenceProbe wtEPSPS 220cggccattga
cagca 1522120DNAArtificial SequenceForward primer qEPSPS-F
221tcttggggaa tgctggaact 2022221DNAArtificial Sequencereverse
primer qEPSPSR 222caccagcagc agtaacagct g 2122317DNAArtificial
SequenceFAM-wtEPSPS R probe 223tgctgtcaat ggccgca
1722420DNAArtificial Sequenceforward primer qEPSPS-F 224tcttggggaa
tgctggaact 2022520DNAArtificial Sequencereverse primer q wtEPSPS RA
225ccaccagcag cagtaacagc 2022621DNAArtificial Sequenceforward primer
q epTIPS F 226ggaagtgcag ctcttcttgg g 2122719DNAArtificial
Sequencereverse primer q epTIPS R 227agctgctgtc aatgaccgc
1922815DNAArtificial SequenceTIPS probe 228aatgctggaa tcgca
1522923DNAZea maysMHP14Cas1 target site(1)..(23) 229gttaaatctg
acgtgaatct gtt 2323021DNAZea maysMHP14Cas3 target site(1)..(21)
230acaaacattg aagcgacata g 2123118DNAZea maysTS8Cas1 target
site(1)..(18) 231gtacgtaacg tgcagtac 1823220DNAZea maysTS8Cas2
target site(1)..(20) 232gctcatcagt gatcagctgg 2023317DNAZea
maysTS9Cas2 target site(1)..(17) 233ggctgtttgc ggcctcg
1723421DNAZea maysTS9Cas3 target site(1)..(21) 234gcctcgaggt
tgcacgcacg t 2123520DNAZea maysTS10Cas1 target site(1)..(20)
235gcctcgcctt cgctagttaa 2023618DNAZea maysTS10Cas3 target
site(1)..(18) 236gctcgtgttg gagataca 1823780DNAZea mays
237gttaaatctg acgtgaatct gtttggaatt gaaaaacaag tgcttccttt
catacaccac 60tatgtcgctt caatgtttgt 8023880DNAZea mays 238acaaacattg
aagcgacata gtggtgtatg aaaggaagca cttgtttttc aattccaaac 60agattcacgt
cagatttaac 8023966DNAZea mays 239ccagtactgc acgttacgta cgtacgaact
aatatactcc accagctgat cactgatgag 60ccgagc 6624066DNAZea mays
240gctcggctca tcagtgatca gctggtggag tatattagtt cgtacgtacg
taacgtgcag 60tactgg 6624135DNAZea mays 241ccgacgtgcg tgcaacctcg
aggccgcaaa cagcc 3524235DNAZea mays 242ggctgtttgc ggcctcgagg
ttgcacgcac gtcgg 3524368DNAZea mays 243gctcgtgttg gagatacagg
gacagcaagt acttggccct taactagcga aggcgaggcg 60gccatgga
6824468DNAZea mays 244tccatggccg cctcgccttc gctagttaag ggccaagtac
ttgctgtccc tgtatctcca 60acacgagc 682451108DNAArtificial
SequenceMHP14Cas-1 guideRNA cassette 245tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat
tcctagccgg attacttctc taatttatat agattttgat gagctggaat
180gaatcctggc ttattccggt acaaccgaac aggccctgaa ggataccagt
aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag
gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct
gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga gggggcatca
480aagatctggc tgtgtttcca gctgtttttg ttagccccat cgaatccttg
acataatgat 540cccgcttaaa taagcaacct cgcttgtata gttccttgtg
ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt
acagtaggac acagtgtcag cgccgcgttg acggagaata tttgcaaaaa
780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact tcgtcacaga
gagggccata 840agaaacatgg cccacggccc aatacgaagc accgcgacga
agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt gttaaatctg acgtgaatct 1020gttgttttag
agctagaaat agcaagttaa aataaggcta gtccgttatc aacttgaaaa
1080agtggcaccg agtcggtgct tttttttt 11082461106DNAArtificial
SequenceMHP14Cas-3 gRNA cassette 246tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat
tcctagccgg attacttctc taatttatat agattttgat gagctggaat
180gaatcctggc ttattccggt acaaccgaac aggccctgaa ggataccagt
aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag
gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct
gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga gggggcatca
480aagatctggc tgtgtttcca gctgtttttg ttagccccat cgaatccttg
acataatgat 540cccgcttaaa taagcaacct cgcttgtata gttccttgtg
ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt
acagtaggac acagtgtcag cgccgcgttg acggagaata tttgcaaaaa
780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact tcgtcacaga
gagggccata 840agaaacatgg cccacggccc aatacgaagc accgcgacga
agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt gcaaacattg aagcgacata 1020ggttttagag
ctagaaatag caagttaaaa taaggctagt ccgttatcaa cttgaaaaag
1080tggcaccgag tcggtgcttt tttttt 11062471103DNAArtificial
SequenceTS8Cas-1 guideRNA cassette 247tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat
tcctagccgg attacttctc taatttatat agattttgat gagctggaat
180gaatcctggc ttattccggt acaaccgaac aggccctgaa ggataccagt
aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag
gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct
gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga gggggcatca
480aagatctggc tgtgtttcca gctgtttttg ttagccccat cgaatccttg
acataatgat 540cccgcttaaa taagcaacct cgcttgtata gttccttgtg
ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt
acagtaggac acagtgtcag cgccgcgttg acggagaata tttgcaaaaa
780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact tcgtcacaga
gagggccata 840agaaacatgg cccacggccc aatacgaagc accgcgacga
agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt gtacgtaacg tgcagtacgt 1020tttagagcta
gaaatagcaa gttaaaataa ggctagtccg ttatcaactt gaaaaagtgg
1080caccgagtcg gtgctttttt ttt 11032481105DNAArtificial
SequenceTS8Cas-2 guideRNA cassette 248tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat
tcctagccgg attacttctc taatttatat agattttgat gagctggaat
180gaatcctggc ttattccggt acaaccgaac aggccctgaa ggataccagt
aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag
gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct
gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga gggggcatca
480aagatctggc tgtgtttcca gctgtttttg ttagccccat cgaatccttg
acataatgat 540cccgcttaaa taagcaacct cgcttgtata gttccttgtg
ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt
acagtaggac acagtgtcag cgccgcgttg acggagaata tttgcaaaaa
780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact tcgtcacaga
gagggccata 840agaaacatgg cccacggccc aatacgaagc accgcgacga
agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt gctcatcagt gatcagctgg 1020gttttagagc
tagaaatagc aagttaaaat aaggctagtc cgttatcaac ttgaaaaagt
1080ggcaccgagt cggtgctttt ttttt 11052491102DNAArtificial
SequenceTS9Cas-2 guideRNA cassette 249tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat
tcctagccgg attacttctc taatttatat agattttgat gagctggaat
180gaatcctggc ttattccggt acaaccgaac aggccctgaa ggataccagt
aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag
gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct
gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga gggggcatca
480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct
cgcttgtata gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa
atagtggtgt ccaaagaatt tccaggccca gttgtaaaag ctaaaatgct
660attcgaattt ctactagcag taagtcgtgt ttagaaatta tttttttata
tacctttttt 720ccttctatgt acagtaggac acagtgtcag cgccgcgttg
acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc
aatacgaagc accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa
agcgctgggt aatacgcaaa cgttttgtcc caccttgact aatcacaaga
960gtggagcgta ccttataaac cgagccgcaa gcaccgaatt ggctgtttgc
ggcctcggtt 1020ttagagctag aaatagcaag ttaaaataag gctagtccgt
tatcaacttg aaaaagtggc 1080accgagtcgg tgcttttttt tt
11022501106DNAArtificial SequenceTS9Cas-3 guideRNA cassette
250tgagagtaca atgatgaacc tagattaatc aatgccaaag tctgaaaaat
gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga ggccctgttc ggttgttccg
gattagagcc 120ccggattaat tcctagccgg attacttctc taatttatat
agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag
agtgtcagta ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag
tttgatggca ccattagggt tagagatggt ggccatgggc gcatgtcctg
360gccaactttg tatgatatat ggcagggtga ataggaaagt aaaattgtat
tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa ggaatgcaag
ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct
cgcttgtata gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa
atagtggtgt ccaaagaatt tccaggccca gttgtaaaag ctaaaatgct
660attcgaattt ctactagcag taagtcgtgt ttagaaatta tttttttata
tacctttttt 720ccttctatgt acagtaggac acagtgtcag cgccgcgttg
acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc
aatacgaagc accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa
agcgctgggt aatacgcaaa cgttttgtcc caccttgact aatcacaaga
960gtggagcgta ccttataaac cgagccgcaa gcaccgaatt gcctcgaggt
tgcacgcacg 1020tgttttagag ctagaaatag caagttaaaa taaggctagt
ccgttatcaa cttgaaaaag 1080tggcaccgag tcggtgcttt tttttt
11062511105DNAArtificial SequenceTS10Cas-1 guideRNA cassette
251tgagagtaca atgatgaacc tagattaatc aatgccaaag tctgaaaaat
gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga ggccctgttc ggttgttccg
gattagagcc 120ccggattaat tcctagccgg attacttctc taatttatat
agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag
agtgtcagta ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag
tttgatggca ccattagggt tagagatggt ggccatgggc gcatgtcctg
360gccaactttg tatgatatat ggcagggtga ataggaaagt aaaattgtat
tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa ggaatgcaag
ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct
cgcttgtata gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa
atagtggtgt ccaaagaatt tccaggccca gttgtaaaag ctaaaatgct
660attcgaattt ctactagcag taagtcgtgt ttagaaatta tttttttata
tacctttttt 720ccttctatgt acagtaggac acagtgtcag cgccgcgttg
acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc
aatacgaagc accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa
agcgctgggt aatacgcaaa cgttttgtcc caccttgact aatcacaaga
960gtggagcgta ccttataaac cgagccgcaa gcaccgaatt gcctcgcctt
cgctagttaa 1020gttttagagc tagaaatagc aagttaaaat aaggctagtc
cgttatcaac ttgaaaaagt 1080ggcaccgagt cggtgctttt ttttt
11052521103DNAArtificial SequenceTS10Cas-3 guideRNA cassette
252tgagagtaca atgatgaacc tagattaatc aatgccaaag tctgaaaaat
gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga ggccctgttc ggttgttccg
gattagagcc 120ccggattaat tcctagccgg attacttctc taatttatat
agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag
agtgtcagta ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag
tttgatggca ccattagggt tagagatggt ggccatgggc gcatgtcctg
360gccaactttg tatgatatat ggcagggtga ataggaaagt aaaattgtat
tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa ggaatgcaag
ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct
cgcttgtata gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa
atagtggtgt ccaaagaatt tccaggccca gttgtaaaag ctaaaatgct
660attcgaattt ctactagcag taagtcgtgt ttagaaatta tttttttata
tacctttttt 720ccttctatgt acagtaggac acagtgtcag cgccgcgttg
acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc
aatacgaagc accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa
agcgctgggt aatacgcaaa cgttttgtcc caccttgact aatcacaaga
960gtggagcgta ccttataaac cgagccgcaa gcaccgaatt gctcgtgttg
gagatacagt 1020tttagagcta gaaatagcaa gttaaaataa ggctagtccg
ttatcaactt gaaaaagtgg 1080caccgagtcg gtgctttttt ttt
11032534928DNAArtificial SequenceMHP14Cas1 donor 253gccatgtcat
cttgtagtta gggcttggag ctagtcgacc gttggaggct ttgtcctcat 60gcggcaccgg
acagtctggt gctacaccgg acagtccggt gcccctctga ccatctgctc
120tgacatctga attgcactgt tcactttgca gagtcgacca ttgcgtgcag
gtagccattg 180ctccgctggt gcaccggaca gtccagtggc acaccggaca
gtccgatgaa ttatagcgga 240gctgcgcctg ggaaacccga agctgaggag
tttgagctga ttcaccctgg tgcaccggac 300actgtccggt ggcacactgg
acagtccggt gcgccggacc agggcacact tcggtttcct 360ttttgctcct
ttcttttgaa gcctaacttg ttcttttgat tggtttgtgt tgaaccttta
420gcacctgtag aatgtatgat ctagagcaaa ctagttagtc caattatttg
tgttgggcaa 480ttcaaccacc aaaaacattt aggaaaatgt ttgatcttat
ttccctttca tattctctta 540ttgctagttg tcggggtgaa gttgagctct
tgcttaggtt ttaattagtg ttgattttta 600gaaaaaccca attcaccccc
ctcttgggca tcgtgatcct tttagcaaca aaatgtgcac 660acatcaaaac
aagcgcttct accatatgta gttgttgcac aataatggtc ctccttagga
720tttgcaaccg tttaacaata gctatgtgac cacagattta tgtcggatgc
acgaaaattg 780taggatttta catttcttta ccttggttca caaacattga
agcgacatag tggtgtatga 840aaggaagcac ttgtttttca attccaaacc
gcggtaccat ttaaatctta agcctaggat 900aacttcgtat agcatacatt
atacgaagtt atggcgccgc tagcctgcag tgcagcgtga 960cccggtcgtg
cccctctcta gagataatga gcattgcatg tctaagttat aaaaaattac
1020cacatatttt ttttgtcaca cttgtttgaa gtgcagttta tctatcttta
tacatatatt 1080taaactttac tctacgaata atataatcta tagtactaca
ataatatcag tgttttagag 1140aatcatataa atgaacagtt agacatggtc
taaaggacaa ttgagtattt tgacaacagg 1200actctacagt tttatctttt
tagtgtgcat gtgttctcct ttttttttgc aaatagcttc 1260acctatataa
tacttcatcc attttattag tacatccatt tagggtttag ggttaatggt
1320ttttatagac taattttttt agtacatcta ttttattcta ttttagcctc
taaattaaga 1380aaactaaaac tctattttag tttttttatt taataattta
gatataaaat agaataaaat 1440aaagtgacta aaaattaaac aaataccctt
taagaaatta aaaaaactaa ggaaacattt 1500ttcttgtttc gagtagataa
tgccagcctg ttaaacgccg tcgacgagtc taacggacac 1560caaccagcga
accagcagcg tcgcgtcggg ccaagcgaag cagacggcac ggcatctctg
1620tcgctgcctc tggacccctc tcgagagttc cgctccaccg ttggacttgc
tccgctgtcg 1680gcatccagaa attgcgtggc ggagcggcag acgtgagccg
gcacggcagg cggcctcctc 1740ctcctctcac ggcaccggca gctacggggg
attcctttcc caccgctcct tcgctttccc 1800ttcctcgccc gccgtaataa
atagacaccc cctccacacc ctctttcccc aacctcgtgt 1860tgttcggagc
gcacacacac acaaccagat ctcccccaaa tccacccgtc ggcacctccg
1920cttcaaggta cgccgctcgt cctccccccc ccccctctct accttctcta
gatcggcgtt 1980ccggtccatg catggttagg gcccggtagt tctacttctg
ttcatgtttg tgttagatcc 2040gtgtttgtgt tagatccgtg ctgctagcgt
tcgtacacgg atgcgacctg tacgtcagac 2100acgttctgat tgctaacttg
ccagtgtttc tctttgggga atcctgggat ggctctagcc 2160gttccgcaga
cgggatcgat ttcatgattt tttttgtttc gttgcatagg gtttggtttg
2220cccttttcct ttatttcaat atatgccgtg cacttgtttg tcgggtcatc
ttttcatgct 2280tttttttgtc ttggttgtga tgatgtggtc tggttgggcg
gtcgttctag atcggagtag 2340aattctgttt caaactacct ggtggattta
ttaattttgg atctgtatgt gtgtgccata 2400catattcata gttacgaatt
gaagatgatg gatggaaata tcgatctagg ataggtatac 2460atgttgatgc
gggttttact gatgcatata cagagatgct ttttgttcgc ttggttgtga
2520tgatgtggtg tggttgggcg gtcgttcatt cgttctagat cggagtagaa
tactgtttca 2580aactacctgg tgtatttatt aattttggaa ctgtatgtgt
gtgtcataca tcttcatagt 2640tacgagttta agatggatgg aaatatcgat
ctaggatagg tatacatgtt gatgtgggtt 2700ttactgatgc atatacatga
tggcatatgc agcatctatt catatgctct aaccttgagt 2760acctatctat
tataataaac aagtatgttt tataattatt ttgatcttga tatacttgga
2820tgatggcata tgcagcagct atatgtggat ttttttagcc ctgccttcat
acgctattta 2880tttgcttggt actgtttctt ttgtcgatgc tcaccctgtt
gtttggtgtt acttctgcag 2940gtcgactcta gaggatcaat tcgctagcga
agttcctatt ccgaagttcc tattctctag 3000aaagtatagg aacttcagat
ccaccgggat ccccgatcat gcaaaaactc attaactcag 3060tgcaaaacta
tgcctggggc agcaaaacgg cgttgactga actttatggt atggaaaatc
3120cgtccagcca gccgatggcc gagctgtgga tgggcgcaca tccgaaaagc
agttcacgag 3180tgcagaatgc cgccggagat atcgtttcac tgcgtgatgt
gattgagagt gataaatcga 3240ctctgctcgg agaggccgtt gccaaacgct
ttggcgaact gcctttcctg ttcaaagtat 3300tatgcgcagc acagccactc
tccattcagg ttcatccaaa caaacacaat tctgaaatcg 3360gttttgccaa
agaaaatgcc gcaggtatcc cgatggatgc cgccgagcgt aactataaag
3420atcctaacca caagccggag ctggtttttg cgctgacgcc tttccttgcg
atgaacgcgt 3480ttcgtgaatt ttccgagatt gtctccctac tccagccggt
cgcaggtgca catccggcga 3540ttgctcactt tttacaacag cctgatgccg
aacgtttaag cgaactgttc gccagcctgt 3600tgaatatgca gggtgaagaa
aaatcccgcg cgctggcgat tttaaaatcg gccctcgata 3660gccagcaggg
tgaaccgtgg caaacgattc gtttaatttc tgaattttac ccggaagaca
3720gcggtctgtt ctccccgcta ttgctgaatg tggtgaaatt gaaccctggc
gaagcgatgt 3780tcctgttcgc tgaaacaccg cacgcttacc tgcaaggcgt
ggcgctggaa gtgatggcaa 3840actccgataa cgtgctgcgt gcgggtctga
cgcctaaata cattgatatt ccggaactgg 3900ttgccaatgt gaaattcgaa
gccaaaccgg ctaaccagtt gttgacccag ccggtgaaac 3960aaggtgcaga
actggacttc ccgattccag tggatgattt tgccttctcg ctgcatgacc
4020ttagtgataa agaaaccacc attagccagc agagtgccgc cattttgttc
tgcgtcgaag 4080gcgatgcaac gttgtggaaa ggttctcagc agttacagct
taaaccgggt gaatcagcgt 4140ttattgccgc caacgaatca ccggtgactg
tcaaaggcca cggccgttta gcgcgtgttt 4200acaacaagct gtaagagctt
actgaaaaaa ttaacatctc ttgctaagct gggggtggaa 4260cctagacttg
tccatcttct ggattggcca acttaattaa tgtatgaaat aaaaggatgc
4320acacatagtg acatgctaat cactataatg tgggcatcaa agttgtgtgt
tatgtgtaat 4380tactagttat ctgaataaaa gagaaagaga tcatccatat
ttcttatcct aaatgaatgt 4440cacgtgtctt tataattctt tgatgaacca
gatgcatttc attaaccaaa tccatataca 4500tataaatatt aatcatatat
aattaatatc aattgggtta gcaaaacaaa tctagtctag 4560gtgtgttttg
cgaatgcggc cctagcgtat acgaagttcc tattccgaag ttcctattct
4620ccagaaagta taggaacttc tgtacacctg agctgattcc gatgacttcg
taggttccta 4680gctcaagccg ctcgtgtcca agcgtcactt acgattagct
aatgattacg gcatctagga 4740ccgactagct aactaactag taccgaggcc
ggccccgcgg gagctcggcg cgccagattc 4800acgtcagatt taaccaaaac
tatattatga ggtacacata ttacaatcca aaatgaatta 4860tctagttctc
gagttgtaca cagtttatca cgtgttttac acattccaac cctaaactcc 4920aaccgtgg
49282544570DNAArtificial SequenceMHP14Cas3 donor 254acacttcggt
ttcctttttg ctcctttctt ttgaagccta acttgttctt ttgattggtt 60tgtgttgaac
ctttagcacc tgtagaatgt atgatctaga gcaaactagt tagtccaatt
120atttgtgttg ggcaattcaa ccaccaaaaa catttaggaa aatgtttgat
cttatttccc 180tttcatattc tcttattgct agttgtcggg gtgaagttga
gctcttgctt aggttttaat 240tagtgttgat ttttagaaaa acccaattca
cccccctctt gggcatcgtg atccttttag 300caacaaaatg tgcacacatc
aaaacaagcg cttctaccat atgtagttgt tgcacaataa 360tggtcctcct
taggatttgc aaccgtttaa caatagctat gtgaccacag atttatgtcg
420gatgcacgaa aattgtagga ttttacattt ctttaccttg gttcacaaac
attgaagcga 480caggtaccat ttaaatctta agcctaggat aacttcgtat
agcatacatt atacgaagtt 540atggcgccgc tagcctgcag tgcagcgtga
cccggtcgtg cccctctcta gagataatga 600gcattgcatg tctaagttat
aaaaaattac cacatatttt ttttgtcaca cttgtttgaa 660gtgcagttta
tctatcttta tacatatatt taaactttac tctacgaata atataatcta
720tagtactaca ataatatcag tgttttagag aatcatataa atgaacagtt
agacatggtc 780taaaggacaa ttgagtattt tgacaacagg actctacagt
tttatctttt tagtgtgcat 840gtgttctcct ttttttttgc aaatagcttc
acctatataa tacttcatcc attttattag 900tacatccatt tagggtttag
ggttaatggt ttttatagac taattttttt agtacatcta 960ttttattcta
ttttagcctc taaattaaga aaactaaaac tctattttag tttttttatt
1020taataattta gatataaaat agaataaaat aaagtgacta aaaattaaac
aaataccctt 1080taagaaatta aaaaaactaa ggaaacattt ttcttgtttc
gagtagataa tgccagcctg 1140ttaaacgccg tcgacgagtc taacggacac
caaccagcga accagcagcg tcgcgtcggg 1200ccaagcgaag cagacggcac
ggcatctctg tcgctgcctc tggacccctc tcgagagttc 1260cgctccaccg
ttggacttgc tccgctgtcg gcatccagaa attgcgtggc ggagcggcag
1320acgtgagccg gcacggcagg cggcctcctc ctcctctcac ggcaccggca
gctacggggg 1380attcctttcc caccgctcct tcgctttccc ttcctcgccc
gccgtaataa atagacaccc 1440cctccacacc ctctttcccc aacctcgtgt
tgttcggagc gcacacacac acaaccagat 1500ctcccccaaa tccacccgtc
ggcacctccg cttcaaggta cgccgctcgt cctccccccc 1560ccccctctct
accttctcta gatcggcgtt ccggtccatg catggttagg gcccggtagt
1620tctacttctg ttcatgtttg tgttagatcc gtgtttgtgt tagatccgtg
ctgctagcgt 1680tcgtacacgg atgcgacctg tacgtcagac acgttctgat
tgctaacttg ccagtgtttc 1740tctttgggga atcctgggat ggctctagcc
gttccgcaga cgggatcgat ttcatgattt 1800tttttgtttc gttgcatagg
gtttggtttg cccttttcct ttatttcaat atatgccgtg 1860cacttgtttg
tcgggtcatc ttttcatgct tttttttgtc ttggttgtga tgatgtggtc
1920tggttgggcg gtcgttctag atcggagtag aattctgttt caaactacct
ggtggattta 1980ttaattttgg atctgtatgt gtgtgccata catattcata
gttacgaatt gaagatgatg 2040gatggaaata tcgatctagg ataggtatac
atgttgatgc gggttttact gatgcatata 2100cagagatgct ttttgttcgc
ttggttgtga tgatgtggtg tggttgggcg gtcgttcatt 2160cgttctagat
cggagtagaa tactgtttca aactacctgg tgtatttatt aattttggaa
2220ctgtatgtgt gtgtcataca tcttcatagt tacgagttta agatggatgg
aaatatcgat 2280ctaggatagg tatacatgtt gatgtgggtt ttactgatgc
atatacatga tggcatatgc 2340agcatctatt catatgctct aaccttgagt
acctatctat tataataaac aagtatgttt 2400tataattatt ttgatcttga
tatacttgga tgatggcata tgcagcagct atatgtggat 2460ttttttagcc
ctgccttcat acgctattta tttgcttggt actgtttctt ttgtcgatgc
2520tcaccctgtt gtttggtgtt acttctgcag gtcgactcta gaggatcaat
tcgctagcga 2580agttcctatt ccgaagttcc tattctctag aaagtatagg
aacttcagat ccaccgggat 2640ccccgatcat gcaaaaactc attaactcag
tgcaaaacta tgcctggggc agcaaaacgg 2700cgttgactga actttatggt
atggaaaatc cgtccagcca gccgatggcc gagctgtgga 2760tgggcgcaca
tccgaaaagc agttcacgag tgcagaatgc cgccggagat atcgtttcac
2820tgcgtgatgt gattgagagt gataaatcga ctctgctcgg agaggccgtt
gccaaacgct 2880ttggcgaact gcctttcctg ttcaaagtat tatgcgcagc
acagccactc tccattcagg 2940ttcatccaaa caaacacaat tctgaaatcg
gttttgccaa agaaaatgcc gcaggtatcc 3000cgatggatgc cgccgagcgt
aactataaag atcctaacca caagccggag ctggtttttg 3060cgctgacgcc
tttccttgcg atgaacgcgt ttcgtgaatt ttccgagatt gtctccctac
3120tccagccggt cgcaggtgca catccggcga ttgctcactt tttacaacag
cctgatgccg 3180aacgtttaag cgaactgttc gccagcctgt tgaatatgca
gggtgaagaa aaatcccgcg 3240cgctggcgat tttaaaatcg gccctcgata
gccagcaggg tgaaccgtgg caaacgattc 3300gtttaatttc tgaattttac
ccggaagaca gcggtctgtt ctccccgcta ttgctgaatg 3360tggtgaaatt
gaaccctggc gaagcgatgt tcctgttcgc tgaaacaccg cacgcttacc
3420tgcaaggcgt ggcgctggaa gtgatggcaa actccgataa cgtgctgcgt
gcgggtctga 3480cgcctaaata cattgatatt ccggaactgg ttgccaatgt
gaaattcgaa gccaaaccgg 3540ctaaccagtt gttgacccag ccggtgaaac
aaggtgcaga actggacttc ccgattccag 3600tggatgattt tgccttctcg
ctgcatgacc ttagtgataa agaaaccacc attagccagc 3660agagtgccgc
cattttgttc tgcgtcgaag gcgatgcaac gttgtggaaa ggttctcagc
3720agttacagct taaaccgggt gaatcagcgt ttattgccgc caacgaatca
ccggtgactg 3780tcaaaggcca cggccgttta gcgcgtgttt acaacaagct
gtaagagctt actgaaaaaa 3840ttaacatctc ttgctaagct gggggtggaa
cctagacttg tccatcttct ggattggcca 3900acttaattaa tgtatgaaat
aaaaggatgc acacatagtg acatgctaat cactataatg 3960tgggcatcaa
agttgtgtgt tatgtgtaat tactagttat ctgaataaaa gagaaagaga
4020tcatccatat ttcttatcct aaatgaatgt cacgtgtctt tataattctt
tgatgaacca 4080gatgcatttc attaaccaaa tccatataca tataaatatt
aatcatatat aattaatatc 4140aattgggtta gcaaaacaaa tctagtctag
gtgtgttttg cgaatgcggc cctagcgtat 4200acgaagttcc tattccgaag
ttcctattct ccagaaagta taggaacttc tgtacacctg 4260agctgattcc
gatgacttcg taggttccta gctcaagccg ctcgtgtcca agcgtcactt
4320acgattagct aatgattacg gcatctagga ccgactagct aactaactag
taccgaggcc 4380ggccccgcgg gagctcggcg cgcctagtgg tgtatgaaag
gaagcacttg tttttcaatt 4440ccaaacagat tcacgtcaga tttaaccaaa
actatattat gaggtacaca tattacaatc 4500caaaatgaat tatctagttc
tcgagttgta cacagtttat cacgtgtttt acacattcca 4560accctaaact
4570
255 5091 DNA Artificial Sequence TS8Cas-1 donor 255
cacacatgac
tgcctgagaa tctgctgccg ttgcctctca tattatattc gatcccctga 60ctaaaaaaac
tcggggccgg ctaatacgta ctgtacgtac gcagaattta cggtccagca
120cgggcatgcc gcgcgggctg actttgctcc actgactcga tcatgtgcgg
attccatcgc 180ggcgtagcgt agccaaccgc aacgcaaacc gacttcatct
ttttttttta ttatgaacaa 240aaggagatcg agagaaacgt gaacggtaaa
taatatatct gatcccatgc atgcacgctg 300cctgggtcga tctcgctctc
gctccgccca gacgaacatg catgctggtc aggctcaacg 360ctcaggcggg
caagctgtgg gaggacatgg gatgggagag gaggacacat gcatgctggc
420cagtcaggca ctgtgctggc acatgaggta gggatagggg ggccctcggc
cagtgtccag 480gccgcatgca tgcatgcccc ccctgctgct cgaccgaaca
acgttggatg cctggattga 540tgcaacagtt tggacggacg gaccatacgt
tatgtaccag taggtaccat ttaaatctta 600agcctaggat aacttcgtat
agcatacatt atacgaagtt atggcgccgc tagcctgcag 660tgcagcgtga
cccggtcgtg cccctctcta gagataatga gcattgcatg tctaagttat
720aaaaaattac cacatatttt ttttgtcaca cttgtttgaa gtgcagttta
tctatcttta 780tacatatatt taaactttac tctacgaata atataatcta
tagtactaca ataatatcag 840tgttttagag aatcatataa atgaacagtt
agacatggtc taaaggacaa ttgagtattt 900tgacaacagg actctacagt
tttatctttt tagtgtgcat gtgttctcct ttttttttgc 960aaatagcttc
acctatataa tacttcatcc attttattag tacatccatt tagggtttag
1020ggttaatggt ttttatagac taattttttt agtacatcta ttttattcta
ttttagcctc 1080taaattaaga aaactaaaac tctattttag tttttttatt
taataattta gatataaaat 1140agaataaaat aaagtgacta aaaattaaac
aaataccctt taagaaatta aaaaaactaa 1200ggaaacattt ttcttgtttc
gagtagataa tgccagcctg ttaaacgccg tcgacgagtc 1260taacggacac
caaccagcga accagcagcg tcgcgtcggg ccaagcgaag cagacggcac
1320ggcatctctg tcgctgcctc tggacccctc tcgagagttc cgctccaccg
ttggacttgc 1380tccgctgtcg gcatccagaa attgcgtggc ggagcggcag
acgtgagccg gcacggcagg 1440cggcctcctc ctcctctcac ggcaccggca
gctacggggg attcctttcc caccgctcct 1500tcgctttccc ttcctcgccc
gccgtaataa atagacaccc cctccacacc ctctttcccc 1560aacctcgtgt
tgttcggagc gcacacacac acaaccagat ctcccccaaa tccacccgtc
1620ggcacctccg cttcaaggta cgccgctcgt cctccccccc ccccctctct
accttctcta 1680gatcggcgtt ccggtccatg catggttagg gcccggtagt
tctacttctg ttcatgtttg 1740tgttagatcc gtgtttgtgt tagatccgtg
ctgctagcgt tcgtacacgg atgcgacctg 1800tacgtcagac acgttctgat
tgctaacttg ccagtgtttc tctttgggga atcctgggat 1860ggctctagcc
gttccgcaga cgggatcgat ttcatgattt tttttgtttc gttgcatagg
1920gtttggtttg cccttttcct ttatttcaat atatgccgtg cacttgtttg
tcgggtcatc 1980ttttcatgct tttttttgtc ttggttgtga tgatgtggtc
tggttgggcg gtcgttctag 2040atcggagtag aattctgttt caaactacct
ggtggattta ttaattttgg atctgtatgt 2100gtgtgccata catattcata
gttacgaatt gaagatgatg gatggaaata tcgatctagg 2160ataggtatac
atgttgatgc gggttttact gatgcatata cagagatgct ttttgttcgc
2220ttggttgtga tgatgtggtg tggttgggcg gtcgttcatt cgttctagat
cggagtagaa 2280tactgtttca aactacctgg tgtatttatt aattttggaa
ctgtatgtgt gtgtcataca 2340tcttcatagt tacgagttta agatggatgg
aaatatcgat ctaggatagg tatacatgtt 2400gatgtgggtt ttactgatgc
atatacatga tggcatatgc agcatctatt catatgctct 2460aaccttgagt
acctatctat tataataaac aagtatgttt tataattatt ttgatcttga
2520tatacttgga tgatggcata tgcagcagct atatgtggat ttttttagcc
ctgccttcat 2580acgctattta tttgcttggt actgtttctt ttgtcgatgc
tcaccctgtt gtttggtgtt 2640acttctgcag gtcgactcta gaggatcaat
tcgctagcga agttcctatt ccgaagttcc 2700tattctctag aaagtatagg
aacttcagat ccaccgggat ccccgatcat gcaaaaactc 2760attaactcag
tgcaaaacta tgcctggggc agcaaaacgg cgttgactga actttatggt
2820atggaaaatc cgtccagcca gccgatggcc gagctgtgga tgggcgcaca
tccgaaaagc 2880agttcacgag tgcagaatgc cgccggagat atcgtttcac
tgcgtgatgt gattgagagt 2940gataaatcga ctctgctcgg agaggccgtt
gccaaacgct ttggcgaact gcctttcctg 3000ttcaaagtat tatgcgcagc
acagccactc tccattcagg ttcatccaaa caaacacaat 3060tctgaaatcg
gttttgccaa agaaaatgcc gcaggtatcc cgatggatgc cgccgagcgt
3120aactataaag atcctaacca caagccggag ctggtttttg cgctgacgcc
tttccttgcg 3180atgaacgcgt ttcgtgaatt ttccgagatt gtctccctac
tccagccggt cgcaggtgca 3240catccggcga ttgctcactt tttacaacag
cctgatgccg aacgtttaag cgaactgttc 3300gccagcctgt tgaatatgca
gggtgaagaa aaatcccgcg cgctggcgat tttaaaatcg 3360gccctcgata
gccagcaggg tgaaccgtgg caaacgattc gtttaatttc tgaattttac
3420ccggaagaca gcggtctgtt ctccccgcta ttgctgaatg tggtgaaatt
gaaccctggc 3480gaagcgatgt tcctgttcgc tgaaacaccg cacgcttacc
tgcaaggcgt ggcgctggaa 3540gtgatggcaa actccgataa cgtgctgcgt
gcgggtctga cgcctaaata cattgatatt 3600ccggaactgg ttgccaatgt
gaaattcgaa gccaaaccgg ctaaccagtt gttgacccag 3660ccggtgaaac
aaggtgcaga actggacttc ccgattccag tggatgattt tgccttctcg
3720ctgcatgacc ttagtgataa agaaaccacc attagccagc agagtgccgc
cattttgttc 3780tgcgtcgaag gcgatgcaac gttgtggaaa ggttctcagc
agttacagct taaaccgggt 3840gaatcagcgt ttattgccgc caacgaatca
ccggtgactg tcaaaggcca cggccgttta 3900gcgcgtgttt acaacaagct
gtaagagctt actgaaaaaa ttaacatctc ttgctaagct 3960gggggtggaa
cctagacttg tccatcttct ggattggcca acttaattaa tgtatgaaat
4020aaaaggatgc acacatagtg acatgctaat cactataatg tgggcatcaa
agttgtgtgt 4080tatgtgtaat tactagttat ctgaataaaa gagaaagaga
tcatccatat ttcttatcct 4140aaatgaatgt cacgtgtctt tataattctt
tgatgaacca gatgcatttc attaaccaaa 4200tccatataca tataaatatt
aatcatatat aattaatatc aattgggtta gcaaaacaaa 4260tctagtctag
gtgtgttttg cgaatgcggc cctagcgtat acgaagttcc tattccgaag
4320ttcctattct ccagaaagta taggaacttc tgtacacctg agctgattcc
gatgacttcg 4380taggttccta gctcaagccg ctcgtgtcca agcgtcactt
acgattagct aatgattacg 4440gcatctagga ccgactagct aactaactag
taccgaggcc ggccccgcgg actgcacgtt 4500acgtacgtac gaactaatat
actccaccag ctgatcactg atgagccgag ccgccatgca 4560ttgtaattta
taacatgtgc ggctgtacgc ttccatctca aatacctttt tatatatata
4620ttgtacttta tagtctacga cataatctgc catggtaatt tataagatgt
gctttattgc 4680tcgttgttct gttctcatct gtgtccatgg catggcatgg
atacaaaatg tatgtatggc 4740cacgcatcca atctgtgacg ttgtcaaggc
agaggtccaa ccgtccaaga ccctcttgtg 4800ccgccctgta cttgcagtca
gtgacgttgt gagaaaaagc tgtgggtggt ctccgcagag 4860cgcgcgggcc
acgagaggga gccccatctc tcggccgagg ggtacggggg ctccagacac
4920ggtcctttgg tttcttctgc ctgtagcgag cggccccgcc ccccaccgcg
ctgctagcct 4980agccgatgct gatccatcca ccacccacaa gggattgttc
cacgacttgt ggacctgacc 5040atgacgtgac ttcacgccat gtacgctcag
ccgctcacta gctttttttt c 5091
256 5237 DNA Artificial Sequence TS8Cas-2 donor 256
tctctttcag ggcttgttcg tttacgttgg attgcacccg gaatcgttac
agctaatcaa 60agtttatata aattagagaa gcaaccggat aggaatcgtt ccgacccacc
aattcgacac 120aaacgaacaa ggcctcaatc cttctcaatc cacctccaac
ccaataagct cttggaggcg 180gcggcgggag agcagccaca cacatgactg
cctgagaatc tgctgccgtt gcctctcata 240ttatattcga tcccctgact
aaaaaaactc ggggccggct aatacgtact gtacgtacgc 300agaatttacg
gtccagcacg ggcatgccgc gcgggctgac tttgctccac tgactcgatc
360atgtgcggat tccatcgcgg cgtagcgtag ccaaccgcaa cgcaaaccga
cttcatcttt 420tttttttatt atgaacaaaa ggagatcgag agaaacgtga
acggtaaata atatatctga 480tcccatgcat gcacgctgcc tgggtcgatc
tcgctctcgc tccgcccaga cgaacatgca 540tgctggtcag gctcaacgct
caggcgggca agctgtggga ggacatggga tgggagagga 600ggacacatgc
atgctggcca gtcaggcact gtgctggcac atgaggtagg gatagggggg
660ccctcggcca gtgtccaggc cgcatgcatg catgcccccc ctgctgctcg
accgaacaac 720gttggatgcc tggattgatg caacagtttg gacggacgga
ccatacgtta tgtaccagta 780ctgcacgtta cgtacgtacg aactaatata
ctccaccagg taccatttaa atcttaagcc 840taggataact tcgtatagca
tacattatac gaagttatgg cgccgctagc ctgcagtgca 900gcgtgacccg
gtcgtgcccc tctctagaga taatgagcat tgcatgtcta agttataaaa
960aattaccaca tatttttttt gtcacacttg tttgaagtgc agtttatcta
tctttataca 1020tatatttaaa ctttactcta cgaataatat aatctatagt
actacaataa tatcagtgtt 1080ttagagaatc atataaatga acagttagac
atggtctaaa ggacaattga gtattttgac 1140aacaggactc tacagtttta
tctttttagt gtgcatgtgt tctccttttt ttttgcaaat 1200agcttcacct
atataatact tcatccattt tattagtaca tccatttagg gtttagggtt
1260aatggttttt atagactaat ttttttagta catctatttt attctatttt
agcctctaaa 1320ttaagaaaac taaaactcta ttttagtttt tttatttaat
aatttagata taaaatagaa 1380taaaataaag tgactaaaaa ttaaacaaat
accctttaag aaattaaaaa aactaaggaa 1440acatttttct tgtttcgagt
agataatgcc agcctgttaa acgccgtcga cgagtctaac 1500ggacaccaac
cagcgaacca gcagcgtcgc gtcgggccaa gcgaagcaga cggcacggca
1560tctctgtcgc tgcctctgga cccctctcga gagttccgct ccaccgttgg
acttgctccg 1620ctgtcggcat ccagaaattg cgtggcggag cggcagacgt
gagccggcac ggcaggcggc 1680ctcctcctcc tctcacggca ccggcagcta
cgggggattc ctttcccacc gctccttcgc 1740tttcccttcc tcgcccgccg
taataaatag acaccccctc cacaccctct ttccccaacc 1800tcgtgttgtt
cggagcgcac acacacacaa ccagatctcc cccaaatcca cccgtcggca
1860cctccgcttc aaggtacgcc gctcgtcctc cccccccccc ctctctacct
tctctagatc 1920ggcgttccgg tccatgcatg gttagggccc ggtagttcta
cttctgttca tgtttgtgtt 1980agatccgtgt ttgtgttaga tccgtgctgc
tagcgttcgt acacggatgc gacctgtacg 2040tcagacacgt tctgattgct
aacttgccag tgtttctctt tggggaatcc tgggatggct 2100ctagccgttc
cgcagacggg atcgatttca tgattttttt tgtttcgttg catagggttt
2160ggtttgccct tttcctttat ttcaatatat gccgtgcact tgtttgtcgg
gtcatctttt 2220catgcttttt tttgtcttgg ttgtgatgat gtggtctggt
tgggcggtcg ttctagatcg 2280gagtagaatt ctgtttcaaa ctacctggtg
gatttattaa ttttggatct gtatgtgtgt 2340gccatacata ttcatagtta
cgaattgaag atgatggatg gaaatatcga tctaggatag 2400gtatacatgt
tgatgcgggt tttactgatg catatacaga gatgcttttt gttcgcttgg
2460ttgtgatgat gtggtgtggt tgggcggtcg ttcattcgtt ctagatcgga
gtagaatact 2520gtttcaaact acctggtgta tttattaatt ttggaactgt
atgtgtgtgt catacatctt 2580catagttacg agtttaagat ggatggaaat
atcgatctag gataggtata catgttgatg 2640tgggttttac tgatgcatat
acatgatggc atatgcagca tctattcata tgctctaacc 2700ttgagtacct
atctattata ataaacaagt atgttttata attattttga tcttgatata
2760cttggatgat ggcatatgca gcagctatat gtggattttt ttagccctgc
cttcatacgc 2820tatttatttg cttggtactg tttcttttgt cgatgctcac
cctgttgttt ggtgttactt 2880ctgcaggtcg actctagagg atcaattcgc
tagcgaagtt cctattccga agttcctatt 2940ctctagaaag tataggaact
tcagatccac cgggatcccc gatcatgcaa aaactcatta 3000actcagtgca
aaactatgcc tggggcagca aaacggcgtt gactgaactt tatggtatgg
3060aaaatccgtc cagccagccg atggccgagc tgtggatggg cgcacatccg
aaaagcagtt 3120cacgagtgca gaatgccgcc ggagatatcg tttcactgcg
tgatgtgatt gagagtgata 3180aatcgactct gctcggagag gccgttgcca
aacgctttgg cgaactgcct ttcctgttca 3240aagtattatg cgcagcacag
ccactctcca ttcaggttca tccaaacaaa cacaattctg 3300aaatcggttt
tgccaaagaa aatgccgcag gtatcccgat ggatgccgcc gagcgtaact
3360ataaagatcc taaccacaag ccggagctgg tttttgcgct gacgcctttc
cttgcgatga 3420acgcgtttcg tgaattttcc gagattgtct ccctactcca
gccggtcgca ggtgcacatc 3480cggcgattgc tcacttttta caacagcctg
atgccgaacg tttaagcgaa ctgttcgcca 3540gcctgttgaa tatgcagggt
gaagaaaaat cccgcgcgct ggcgatttta aaatcggccc 3600tcgatagcca
gcagggtgaa ccgtggcaaa cgattcgttt aatttctgaa ttttacccgg
3660aagacagcgg tctgttctcc ccgctattgc tgaatgtggt gaaattgaac
cctggcgaag 3720cgatgttcct gttcgctgaa acaccgcacg cttacctgca
aggcgtggcg ctggaagtga 3780tggcaaactc cgataacgtg ctgcgtgcgg
gtctgacgcc taaatacatt gatattccgg 3840aactggttgc caatgtgaaa
ttcgaagcca aaccggctaa ccagttgttg acccagccgg 3900tgaaacaagg
tgcagaactg gacttcccga ttccagtgga tgattttgcc ttctcgctgc
3960atgaccttag tgataaagaa accaccatta gccagcagag tgccgccatt
ttgttctgcg 4020tcgaaggcga tgcaacgttg tggaaaggtt ctcagcagtt
acagcttaaa ccgggtgaat 4080cagcgtttat tgccgccaac gaatcaccgg
tgactgtcaa aggccacggc cgtttagcgc 4140gtgtttacaa caagctgtaa
gagcttactg aaaaaattaa catctcttgc taagctgggg 4200gtggaaccta
gacttgtcca tcttctggat tggccaactt aattaatgta tgaaataaaa
4260ggatgcacac atagtgacat gctaatcact ataatgtggg catcaaagtt
gtgtgttatg 4320tgtaattact agttatctga ataaaagaga aagagatcat
ccatatttct tatcctaaat 4380gaatgtcacg tgtctttata attctttgat
gaaccagatg catttcatta accaaatcca 4440tatacatata aatattaatc
atatataatt aatatcaatt gggttagcaa aacaaatcta 4500gtctaggtgt
gttttgcgaa tgcggcccta gcgtatacga agttcctatt ccgaagttcc
4560tattctccag aaagtatagg aacttctgta cacctgagct gattccgatg
acttcgtagg 4620ttcctagctc aagccgctcg tgtccaagcg tcacttacga
ttagctaatg attacggcat 4680ctaggaccga ctagctaact aactagtacc
gaggccggcc ccgcgggagc tcgctgatca 4740ctgatgagcc gagccgccat
gcattgtaat ttataacatg tgcggctgta cgcttccatc 4800tcaaatacct
ttttatatat atattgtact ttatagtcta cgacataatc tgccatggta
4860atttataaga tgtgctttat tgctcgttgt tctgttctca tctgtgtcca
tggcatggca 4920tggatacaaa atgtatgtat ggccacgcat ccaatctgtg
acgttgtcaa ggcagaggtc 4980caaccgtcca agaccctctt gtgccgccct
gtacttgcag tcagtgacgt tgtgagaaaa 5040agctgtgggt ggtctccgca
gagcgcgcgg gccacgagag ggagccccat ctctcggccg 5100aggggtacgg
gggctccaga cacggtcctt tggtttcttc tgcctgtagc gagcggcccc
5160gccccccacc gcgctgctag cctagccgat gctgatccat ccaccaccca
caagggattg 5220ttccacgact tgtggac 5237
257 5427 DNA Artificial Sequence TS9Cas-2 donor 257
agcaaggaac taaactgtta ttggacgcta
aagtttagta ctttatcttt aacatctttc 60agcatttcta tgtagatatt taagggctaa
attttagcaa gtgtgctgat aaattttagc 120ctaaatgttt ctgttgggct
aaattttagc aagtgtactg ttaaatttta gcatattcct 180tttagagtgg
tatgggtgtg catagactaa atgtttccgt tgggccctaa tttaacgatg
240tgtacgcagg cctgtttaga tgacttggta ccggcatatg gcctcgtact
gtttcatttg 300atgacgcgag cgtgcggccc atgcagcagc agcacgccgg
gaaggcagcg gattttgaag 360tactattgga cagcgcggcg cggggaccgg
gtcgttggcg cgcggtggag tgggggtggg 420tggtcctggc gtcctgccct
gcgcgatggt cgatggatgc cccatgcgcg tgtaaccgcc 480cagccgtcgc
catccgacca ggtgggcaga cgtacgtacg gtggcacgcc cacggcccat
540cggccatcgc gatcgcgttc gtatcgtgtc ctcaataacg aaagcgccaa
cggaaggcgc 600tgtcgtcgtc agttcaccgc gcgccggcgc cctgtgtcct
cgtccctctc gacttctcga 660ccagtaagaa ctctcgcgag ctgcggagct
gctggcgatg gccggccggt gggatccgac 720gtgcgtgcaa cctcgaattt
aaatcttaag cctaggataa cttcgtatag catacattat 780acgaagttat
ggcgccgcta gcctgcagtg cagcgtgacc cggtcgtgcc cctctctaga
840gataatgagc attgcatgtc taagttataa aaaattacca catatttttt
ttgtcacact 900tgtttgaagt gcagtttatc tatctttata catatattta
aactttactc tacgaataat 960ataatctata gtactacaat aatatcagtg
ttttagagaa tcatataaat gaacagttag 1020acatggtcta aaggacaatt
gagtattttg acaacaggac tctacagttt tatcttttta 1080gtgtgcatgt
gttctccttt ttttttgcaa atagcttcac ctatataata cttcatccat
1140tttattagta catccattta gggtttaggg ttaatggttt ttatagacta
atttttttag 1200tacatctatt ttattctatt ttagcctcta aattaagaaa
actaaaactc tattttagtt 1260tttttattta ataatttaga tataaaatag
aataaaataa agtgactaaa aattaaacaa 1320atacccttta agaaattaaa
aaaactaagg aaacattttt cttgtttcga gtagataatg 1380ccagcctgtt
aaacgccgtc gacgagtcta acggacacca accagcgaac cagcagcgtc
1440gcgtcgggcc aagcgaagca gacggcacgg catctctgtc gctgcctctg
gacccctctc 1500gagagttccg ctccaccgtt ggacttgctc cgctgtcggc
atccagaaat tgcgtggcgg 1560agcggcagac gtgagccggc acggcaggcg
gcctcctcct cctctcacgg caccggcagc 1620tacgggggat tcctttccca
ccgctccttc gctttccctt cctcgcccgc cgtaataaat 1680agacaccccc
tccacaccct ctttccccaa cctcgtgttg ttcggagcgc acacacacac
1740aaccagatct cccccaaatc cacccgtcgg cacctccgct tcaaggtacg
ccgctcgtcc 1800tccccccccc ccctctctac cttctctaga tcggcgttcc
ggtccatgca tggttagggc 1860ccggtagttc tacttctgtt catgtttgtg
ttagatccgt gtttgtgtta gatccgtgct 1920gctagcgttc gtacacggat
gcgacctgta cgtcagacac gttctgattg ctaacttgcc 1980agtgtttctc
tttggggaat cctgggatgg ctctagccgt tccgcagacg ggatcgattt
2040catgattttt tttgtttcgt tgcatagggt ttggtttgcc cttttccttt
atttcaatat 2100atgccgtgca cttgtttgtc gggtcatctt ttcatgcttt
tttttgtctt ggttgtgatg 2160atgtggtctg gttgggcggt cgttctagat
cggagtagaa ttctgtttca aactacctgg 2220tggatttatt aattttggat
ctgtatgtgt gtgccataca tattcatagt tacgaattga 2280agatgatgga
tggaaatatc gatctaggat aggtatacat gttgatgcgg gttttactga
2340tgcatataca gagatgcttt ttgttcgctt ggttgtgatg atgtggtgtg
gttgggcggt 2400cgttcattcg ttctagatcg gagtagaata ctgtttcaaa
ctacctggtg tatttattaa 2460ttttggaact gtatgtgtgt gtcatacatc
ttcatagtta cgagtttaag atggatggaa 2520atatcgatct aggataggta
tacatgttga tgtgggtttt actgatgcat atacatgatg 2580gcatatgcag
catctattca tatgctctaa ccttgagtac ctatctatta taataaacaa
2640gtatgtttta taattatttt gatcttgata tacttggatg atggcatatg
cagcagctat 2700atgtggattt ttttagccct gccttcatac gctatttatt
tgcttggtac tgtttctttt 2760gtcgatgctc accctgttgt ttggtgttac
ttctgcaggt cgactctaga ggatcaattc 2820gctagcgaag ttcctattcc
gaagttccta ttctctagaa agtataggaa cttcagatcc 2880accgggatcc
ccgatcatgc aaaaactcat taactcagtg caaaactatg cctggggcag
2940caaaacggcg ttgactgaac tttatggtat ggaaaatccg tccagccagc
cgatggccga 3000gctgtggatg ggcgcacatc cgaaaagcag ttcacgagtg
cagaatgccg ccggagatat 3060cgtttcactg cgtgatgtga ttgagagtga
taaatcgact ctgctcggag aggccgttgc 3120caaacgcttt ggcgaactgc
ctttcctgtt caaagtatta tgcgcagcac agccactctc 3180cattcaggtt
catccaaaca aacacaattc tgaaatcggt tttgccaaag aaaatgccgc
3240aggtatcccg atggatgccg ccgagcgtaa ctataaagat cctaaccaca
agccggagct 3300ggtttttgcg ctgacgcctt tccttgcgat gaacgcgttt
cgtgaatttt ccgagattgt 3360ctccctactc cagccggtcg caggtgcaca
tccggcgatt gctcactttt tacaacagcc 3420tgatgccgaa cgtttaagcg
aactgttcgc cagcctgttg aatatgcagg gtgaagaaaa 3480atcccgcgcg
ctggcgattt taaaatcggc cctcgatagc cagcagggtg aaccgtggca
3540aacgattcgt ttaatttctg aattttaccc ggaagacagc ggtctgttct
ccccgctatt 3600gctgaatgtg gtgaaattga accctggcga agcgatgttc
ctgttcgctg aaacaccgca 3660cgcttacctg caaggcgtgg cgctggaagt
gatggcaaac tccgataacg tgctgcgtgc 3720gggtctgacg cctaaataca
ttgatattcc ggaactggtt gccaatgtga aattcgaagc 3780caaaccggct
aaccagttgt tgacccagcc ggtgaaacaa ggtgcagaac tggacttccc
3840gattccagtg gatgattttg ccttctcgct gcatgacctt agtgataaag
aaaccaccat 3900tagccagcag agtgccgcca ttttgttctg cgtcgaaggc
gatgcaacgt tgtggaaagg 3960ttctcagcag ttacagctta aaccgggtga
atcagcgttt attgccgcca acgaatcacc 4020ggtgactgtc aaaggccacg
gccgtttagc gcgtgtttac aacaagctgt aagagcttac 4080tgaaaaaatt
aacatctctt gctaagctgg gggtggaacc tagacttgtc catcttctgg
4140attggccaac ttaattaatg tatgaaataa aaggatgcac acatagtgac
atgctaatca 4200ctataatgtg ggcatcaaag ttgtgtgtta tgtgtaatta
ctagttatct gaataaaaga 4260gaaagagatc atccatattt cttatcctaa
atgaatgtca cgtgtcttta taattctttg 4320atgaaccaga tgcatttcat
taaccaaatc catatacata taaatattaa tcatatataa 4380ttaatatcaa
ttgggttagc aaaacaaatc tagtctaggt gtgttttgcg aatgcggccc
4440tagcgtatac gaagttccta ttccgaagtt cctattctcc agaaagtata
ggaacttctg 4500tacacctgag ctgattccga tgacttcgta ggttcctagc
tcaagccgct cgtgtccaag 4560cgtcacttac gattagctaa tgattacggc
atctaggacc gactagctaa ctaactagta 4620ccgaggccgg ccccgcggga
gctcggccgc aaacagcctg gtgacagacg aagccagcaa 4680gcacgtacgt
acgcacgtct ctgctggtct ggatgtgtat ggatatggac gtctcacgtc
4740tggacgtcgt cgtcgccgtt gtattgtatc atgccaacca cttccgtacc
gtaccccctc 4800gcgtgccaac atgaccaccg ccggtacgtc tccatcgtcg
gccgtcggcg tctcaggcag 4860ctctcaatta agcggacgtg ttttggtaat
ctggtggaac gccgcgcgca ctgagggttt 4920gggggccccg gcggacgagc
gagcgagaga cggtgcatgc atgccaaatg gcaacgaggg 4980cccgcccgcc
catccaataa ccaacccaga cgtagcgcaa ccaacgtacg agtcctgtgc
5040tggcgcgtac gactaccacg ctagctgccg cgacatgcga actacggtcc
accaggcacc 5100agccatgaca atatatactg tatatatatt tttcttcttc
tttttgtttc cgctctctca 5160agttcctgct ctgctcctgc ctgtccgcgg
tgccgatcgg cgagagagca tgcatggaca 5220tggaccacgc gagatccagg
aaccggcacg ggcccatgcg tggcaggcgg ccgtttcgtc 5280aggttccccg
aaatgcccca actgcgcggc tgcaggatgg ctcatggctg gctgcctagc
5340tggcccgtga caccgatcga tcggtaacga cgacgcacgc acctgaagca
caggaaggag 5400cctccctctc gcatgcacgt tagtact
5427
258 5426 DNA Artificial Sequence TS9Cas-3 donor 258
agcaaggaac
taaactgtta ttggacgcaa agtttagtac tttatcttta acatctttca 60gcatttctat
gtagatattt aagggctaaa ttttagcaag tgtgctgata aattttagcc
120taaatgtttc tgttgggcta aattttagca agtgtactgt taaattttag
catattcctt 180ttagagtggt atgggtgtgc atagactaaa tgtttccgtt
gggccctaat ttaacgatgt 240gtacgcaggc ctgtttagat gacttggtac
cggcatatgg cctcgtactg tttcatttga 300tgacgcgagc gtgcggccca
tgcagcagca gcacgccggg aaggcagcgg attttgaagt 360actattggac
agcgcggcgc ggggaccggg tcgttggcgc gcggtggagt gggggtgggt
420ggtcctggcg tcctgccctg cgcgatggtc gatggatgcc ccatgcgcgt
gtaaccgccc 480agccgtcgcc atccgaccag gtgggcagac gtacgtacgg
tggcacgccc acggcccatc 540ggccatcgcg atcgcgttcg tatcgtgtcc
tcaataacga aagcgccaac ggaaggcgct 600gtcgtcgtca gttcaccgcg
cgccggcgcc ctgtgtcctc gtccctctcg acttctcgac 660cagtaagaac
tctcgcgagc tgcggagctg ctggcgatgg ccggccggtg ggatccgacg
720atttaaatct taagcctagg ataacttcgt atagcataca ttatacgaag
ttatggcgcc 780gctagcctgc agtgcagcgt gacccggtcg tgcccctctc
tagagataat gagcattgca 840tgtctaagtt ataaaaaatt accacatatt
ttttttgtca cacttgtttg aagtgcagtt 900tatctatctt tatacatata
tttaaacttt actctacgaa taatataatc tatagtacta 960caataatatc
agtgttttag agaatcatat aaatgaacag ttagacatgg tctaaaggac
1020aattgagtat tttgacaaca ggactctaca gttttatctt tttagtgtgc
atgtgttctc 1080cttttttttt gcaaatagct tcacctatat aatacttcat
ccattttatt agtacatcca 1140tttagggttt agggttaatg gtttttatag
actaattttt ttagtacatc tattttattc 1200tattttagcc tctaaattaa
gaaaactaaa actctatttt agttttttta tttaataatt 1260tagatataaa
atagaataaa ataaagtgac taaaaattaa acaaataccc tttaagaaat
1320taaaaaaact aaggaaacat ttttcttgtt tcgagtagat aatgccagcc
tgttaaacgc 1380cgtcgacgag tctaacggac accaaccagc gaaccagcag
cgtcgcgtcg ggccaagcga 1440agcagacggc acggcatctc tgtcgctgcc
tctggacccc tctcgagagt tccgctccac 1500cgttggactt gctccgctgt
cggcatccag aaattgcgtg gcggagcggc agacgtgagc 1560cggcacggca
ggcggcctcc tcctcctctc acggcaccgg cagctacggg ggattccttt
1620cccaccgctc cttcgctttc ccttcctcgc ccgccgtaat aaatagacac
cccctccaca 1680ccctctttcc ccaacctcgt gttgttcgga gcgcacacac
acacaaccag atctccccca 1740aatccacccg tcggcacctc cgcttcaagg
tacgccgctc gtcctccccc ccccccctct 1800ctaccttctc tagatcggcg
ttccggtcca tgcatggtta gggcccggta gttctacttc 1860tgttcatgtt
tgtgttagat ccgtgtttgt gttagatccg tgctgctagc gttcgtacac
1920ggatgcgacc tgtacgtcag acacgttctg attgctaact tgccagtgtt
tctctttggg 1980gaatcctggg atggctctag ccgttccgca gacgggatcg
atttcatgat tttttttgtt 2040tcgttgcata gggtttggtt tgcccttttc
ctttatttca atatatgccg tgcacttgtt 2100tgtcgggtca tcttttcatg
cttttttttg tcttggttgt gatgatgtgg tctggttggg 2160cggtcgttct
agatcggagt agaattctgt ttcaaactac ctggtggatt tattaatttt
2220ggatctgtat gtgtgtgcca tacatattca tagttacgaa ttgaagatga
tggatggaaa 2280tatcgatcta ggataggtat acatgttgat gcgggtttta
ctgatgcata tacagagatg 2340ctttttgttc gcttggttgt gatgatgtgg
tgtggttggg cggtcgttca ttcgttctag 2400atcggagtag aatactgttt
caaactacct ggtgtattta ttaattttgg aactgtatgt 2460gtgtgtcata
catcttcata gttacgagtt taagatggat ggaaatatcg atctaggata
2520ggtatacatg ttgatgtggg ttttactgat gcatatacat gatggcatat
gcagcatcta 2580ttcatatgct ctaaccttga gtacctatct attataataa
acaagtatgt tttataatta 2640ttttgatctt gatatacttg gatgatggca
tatgcagcag ctatatgtgg atttttttag 2700ccctgccttc atacgctatt
tatttgcttg gtactgtttc ttttgtcgat gctcaccctg 2760ttgtttggtg
ttacttctgc aggtcgactc tagaggatca attcgctagc gaagttccta
2820ttccgaagtt cctattctct agaaagtata ggaacttcag atccaccggg
atccccgatc 2880atgcaaaaac tcattaactc agtgcaaaac tatgcctggg
gcagcaaaac ggcgttgact 2940gaactttatg gtatggaaaa tccgtccagc
cagccgatgg ccgagctgtg gatgggcgca 3000catccgaaaa gcagttcacg
agtgcagaat gccgccggag atatcgtttc actgcgtgat 3060gtgattgaga
gtgataaatc gactctgctc ggagaggccg ttgccaaacg ctttggcgaa
3120ctgcctttcc tgttcaaagt attatgcgca gcacagccac tctccattca
ggttcatcca 3180aacaaacaca attctgaaat cggttttgcc aaagaaaatg
ccgcaggtat cccgatggat 3240gccgccgagc gtaactataa agatcctaac
cacaagccgg agctggtttt tgcgctgacg 3300cctttccttg cgatgaacgc
gtttcgtgaa ttttccgaga ttgtctccct actccagccg 3360gtcgcaggtg
cacatccggc gattgctcac tttttacaac agcctgatgc cgaacgttta
3420agcgaactgt tcgccagcct gttgaatatg cagggtgaag aaaaatcccg
cgcgctggcg 3480attttaaaat cggccctcga tagccagcag ggtgaaccgt
ggcaaacgat tcgtttaatt 3540tctgaatttt acccggaaga cagcggtctg
ttctccccgc tattgctgaa tgtggtgaaa 3600ttgaaccctg gcgaagcgat
gttcctgttc gctgaaacac cgcacgctta cctgcaaggc 3660gtggcgctgg
aagtgatggc aaactccgat aacgtgctgc gtgcgggtct gacgcctaaa
3720tacattgata ttccggaact ggttgccaat gtgaaattcg aagccaaacc
ggctaaccag 3780ttgttgaccc agccggtgaa acaaggtgca gaactggact
tcccgattcc agtggatgat 3840tttgccttct cgctgcatga ccttagtgat
aaagaaacca ccattagcca gcagagtgcc 3900gccattttgt tctgcgtcga
aggcgatgca acgttgtgga aaggttctca gcagttacag 3960cttaaaccgg
gtgaatcagc gtttattgcc gccaacgaat caccggtgac tgtcaaaggc
4020cacggccgtt tagcgcgtgt ttacaacaag ctgtaagagc ttactgaaaa
aattaacatc 4080tcttgctaag ctgggggtgg aacctagact tgtccatctt
ctggattggc caacttaatt 4140aatgtatgaa ataaaaggat gcacacatag
tgacatgcta atcactataa tgtgggcatc 4200aaagttgtgt gttatgtgta
attactagtt atctgaataa aagagaaaga gatcatccat 4260atttcttatc
ctaaatgaat gtcacgtgtc tttataattc tttgatgaac cagatgcatt
4320tcattaacca aatccatata catataaata ttaatcatat ataattaata
tcaattgggt 4380tagcaaaaca aatctagtct aggtgtgttt tgcgaatgcg
gccctagcgt atacgaagtt 4440cctattccga agttcctatt ctccagaaag
tataggaact tctgtacacc tgagctgatt 4500ccgatgactt cgtaggttcc
tagctcaagc cgctcgtgtc caagcgtcac ttacgattag 4560ctaatgatta
cggcatctag gaccgactag ctaactaact agtaccgagg ccggccccgc
4620gggagctctg cgtgcaacct cgaggccgca aacagcctgg tgacagacga
agccagcaag 4680cacgtacgta cgcacgtctc tgctggtctg gatgtgtatg
gatatggacg tctcacgtct 4740ggacgtcgtc gtcgccgttg tattgtatca
tgccaaccac ttccgtaccg taccccctcg 4800cgtgccaaca tgaccaccgc
cggtacgtct ccatcgtcgg ccgtcggcgt ctcaggcagc 4860tctcaattaa
gcggacgtgt tttggtaatc tggtggaacg ccgcgcgcac tgagggtttg
4920ggggccccgg cggacgagcg agcgagagac ggtgcatgca tgccaaatgg
caacgagggc 4980ccgcccgccc atccaataac caacccagac gtagcgcaac
caacgtacga gtcctgtgct 5040ggcgcgtacg actaccacgc tagctgccgc
gacatgcgaa ctacggtcca ccaggcacca 5100gccatgacaa tatatactgt
atatatattt ttcttcttct ttttgtttcc gctctctcaa 5160gttcctgctc
tgctcctgcc tgtccgcggt gccgatcggc gagagagcat gcatggacat
5220ggaccacgcg agatccagga accggcacgg gcccatgcgt ggcaggcggc
cgtttcgtca 5280ggttccccga aatgccccaa ctgcgcggct gcaggatggc
tcatggctgg ctgcctagct 5340ggcccgtgac accgatcgat cggtaacgac
gacgcacgca cctgaagcac aggaaggagc 5400ctccctctcg catgcacgtt agtact
5426
259 5152 DNA Artificial Sequence TS10Cas-1 donor 259
ggtaccaaat
agtaaacggg aggggaggtc gctagtagta aacgctaggt agctaggata 60atccgtctcg
tgttggacgg aaggttttgg acgcatctgc gtgcacagcc cgctgataca
120gatctgatcg actagctagc tagatgccga ggccccagag caaggcccgg
atactcctgc 180acagtccctg agatttcagc acagcaggtg ctgttgcatc
aatatataaa tccctgcttt 240attaatttaa tctctgtgca tgtatccata
catcgtcagc ggctcagcgc tatcacactg 300cagtgcacgc agctagttga
gcgcctgggt cagtatatat atagctagta gggacaaagg 360ggggcactgt
acgttggttt ggtttggcac gcacgcgatc gagagtggtg gaatggactg
420cagatcatcg atcgctgcac tgtacgcacg cgcaccggac tgcatttgca
tgcccctgaa 480ggaggaaagg ggaaggaaag aaaagaaata ggagaaagaa
gaagaagcag agaaatacgt 540cacagtccaa gaagagtgag ccgccctagc
tagcttcaac cctgacgaac ccggcagcca 600cacttccggc catgtatgca
tgcatgcatg gcttagcttc agatgtccaa tcgaatccat 660caagacctgg
ccggttttcc atggccgcct cgccttcgct agtggtacca tttaaatctt
720aagcctagga taacttcgta tagcatacat tatacgaagt tatggcgccg
ctagcctgca 780gtgcagcgtg acccggtcgt gcccctctct agagataatg
agcattgcat gtctaagtta 840taaaaaatta ccacatattt tttttgtcac
acttgtttga agtgcagttt atctatcttt 900atacatatat ttaaacttta
ctctacgaat aatataatct atagtactac aataatatca 960gtgttttaga
gaatcatata aatgaacagt tagacatggt ctaaaggaca attgagtatt
1020ttgacaacag gactctacag ttttatcttt ttagtgtgca tgtgttctcc
tttttttttg 1080caaatagctt cacctatata atacttcatc cattttatta
gtacatccat ttagggttta 1140gggttaatgg tttttataga ctaatttttt
tagtacatct attttattct attttagcct 1200ctaaattaag aaaactaaaa
ctctatttta gtttttttat ttaataattt agatataaaa 1260tagaataaaa
taaagtgact aaaaattaaa caaataccct ttaagaaatt aaaaaaacta
1320aggaaacatt tttcttgttt cgagtagata atgccagcct gttaaacgcc
gtcgacgagt 1380ctaacggaca ccaaccagcg aaccagcagc gtcgcgtcgg
gccaagcgaa gcagacggca 1440cggcatctct gtcgctgcct ctggacccct
ctcgagagtt ccgctccacc gttggacttg 1500ctccgctgtc ggcatccaga
aattgcgtgg cggagcggca gacgtgagcc ggcacggcag 1560gcggcctcct
cctcctctca cggcaccggc agctacgggg gattcctttc ccaccgctcc
1620ttcgctttcc cttcctcgcc cgccgtaata aatagacacc ccctccacac
cctctttccc 1680caacctcgtg ttgttcggag cgcacacaca cacaaccaga
tctcccccaa atccacccgt 1740cggcacctcc gcttcaaggt acgccgctcg
tcctcccccc cccccctctc taccttctct 1800agatcggcgt tccggtccat
gcatggttag ggcccggtag ttctacttct gttcatgttt 1860gtgttagatc
cgtgtttgtg ttagatccgt gctgctagcg ttcgtacacg gatgcgacct
1920gtacgtcaga cacgttctga ttgctaactt gccagtgttt ctctttgggg
aatcctggga 1980tggctctagc cgttccgcag acgggatcga tttcatgatt
ttttttgttt cgttgcatag 2040ggtttggttt gcccttttcc tttatttcaa
tatatgccgt gcacttgttt gtcgggtcat 2100cttttcatgc ttttttttgt
cttggttgtg atgatgtggt ctggttgggc ggtcgttcta 2160gatcggagta
gaattctgtt tcaaactacc tggtggattt attaattttg gatctgtatg
2220tgtgtgccat acatattcat agttacgaat tgaagatgat ggatggaaat
atcgatctag 2280gataggtata catgttgatg cgggttttac tgatgcatat
acagagatgc tttttgttcg 2340cttggttgtg atgatgtggt gtggttgggc
ggtcgttcat tcgttctaga tcggagtaga 2400atactgtttc aaactacctg
gtgtatttat taattttgga actgtatgtg tgtgtcatac 2460atcttcatag
ttacgagttt aagatggatg gaaatatcga tctaggatag gtatacatgt
2520tgatgtgggt tttactgatg catatacatg atggcatatg cagcatctat
tcatatgctc 2580taaccttgag tacctatcta ttataataaa caagtatgtt
ttataattat tttgatcttg 2640atatacttgg atgatggcat atgcagcagc
tatatgtgga tttttttagc cctgccttca 2700tacgctattt atttgcttgg
tactgtttct tttgtcgatg ctcaccctgt tgtttggtgt 2760tacttctgca
ggtcgactct agaggatcaa ttcgctagcg aagttcctat tccgaagttc
2820ctattctcta gaaagtatag gaacttcaga tccaccggga tccccgatca
tgcaaaaact 2880cattaactca gtgcaaaact atgcctgggg cagcaaaacg
gcgttgactg aactttatgg 2940tatggaaaat ccgtccagcc agccgatggc
cgagctgtgg atgggcgcac atccgaaaag 3000cagttcacga gtgcagaatg
ccgccggaga tatcgtttca ctgcgtgatg tgattgagag 3060tgataaatcg
actctgctcg gagaggccgt tgccaaacgc tttggcgaac tgcctttcct
3120gttcaaagta ttatgcgcag cacagccact ctccattcag gttcatccaa
acaaacacaa 3180ttctgaaatc ggttttgcca aagaaaatgc cgcaggtatc
ccgatggatg ccgccgagcg 3240taactataaa gatcctaacc acaagccgga
gctggttttt gcgctgacgc ctttccttgc 3300gatgaacgcg tttcgtgaat
tttccgagat tgtctcccta ctccagccgg tcgcaggtgc 3360acatccggcg
attgctcact ttttacaaca gcctgatgcc gaacgtttaa gcgaactgtt
3420cgccagcctg ttgaatatgc agggtgaaga aaaatcccgc gcgctggcga
ttttaaaatc 3480ggccctcgat agccagcagg gtgaaccgtg gcaaacgatt
cgtttaattt ctgaatttta 3540cccggaagac agcggtctgt tctccccgct
attgctgaat gtggtgaaat tgaaccctgg 3600cgaagcgatg ttcctgttcg
ctgaaacacc gcacgcttac ctgcaaggcg tggcgctgga 3660agtgatggca
aactccgata acgtgctgcg tgcgggtctg acgcctaaat acattgatat
3720tccggaactg gttgccaatg tgaaattcga agccaaaccg gctaaccagt
tgttgaccca 3780gccggtgaaa caaggtgcag aactggactt cccgattcca
gtggatgatt ttgccttctc 3840gctgcatgac cttagtgata aagaaaccac
cattagccag cagagtgccg ccattttgtt 3900ctgcgtcgaa ggcgatgcaa
cgttgtggaa aggttctcag cagttacagc ttaaaccggg 3960tgaatcagcg
tttattgccg ccaacgaatc accggtgact gtcaaaggcc acggccgttt
4020agcgcgtgtt tacaacaagc tgtaagagct tactgaaaaa attaacatct
cttgctaagc 4080tgggggtgga acctagactt gtccatcttc tggattggcc
aacttaatta atgtatgaaa 4140taaaaggatg cacacatagt gacatgctaa
tcactataat gtgggcatca aagttgtgtg 4200ttatgtgtaa ttactagtta
tctgaataaa agagaaagag atcatccata tttcttatcc 4260taaatgaatg
tcacgtgtct ttataattct ttgatgaacc agatgcattt cattaaccaa
4320atccatatac atataaatat taatcatata taattaatat caattgggtt
agcaaaacaa 4380atctagtcta ggtgtgtttt gcgaatgcgg ccctagcgta
tacgaagttc ctattccgaa 4440gttcctattc tccagaaagt ataggaactt
ctgtacacct gagctgattc cgatgacttc 4500gtaggttcct agctcaagcc
gctcgtgtcc aagcgtcact tacgattagc taatgattac 4560ggcatctagg
accgactagc taactaacta gtaccgaggc cggccccgcg ggagctcggc
4620gcgcctaagg gccaagtact tgctgtccct gtatctccaa cacgagcctt
gattcctgcc 4680ggccggtgat ggcaatggcc gctagtagtc tccgctagct
agggagcggc gatccgacgc 4740gacgccacca tgtgtctaga aaagaagttt
cttgctttgc atgcagactt attagcgcgg 4800tcgacacctg tggggacccc
gtgtcttgag acaatgagac tgcctgtccg cccaagacac 4860tacttgtagc
catgaagcca tcgactcctc tccttgctct ccagtaatcc agtggatgga
4920tccatcatcg atagtttagt ttatcagtct tcttgaggcc ggtgtccccc
atgcataatg 4980atgacagaaa gcctgggcca ggtaaaagcc aaaaagtttg
accctctagg tactggggcc 5040agccctggcg tttgaacaaa aaaaaaatct
gagcgtgtcg ccccggcctg ttttcgaact 5100cctaaacgac gtcgcaactt
tttttataca cacactaccg gtacatggct tt 5152
260; 5146; DNA; Artificial Sequence; TS10Cas-3 donor
260 aaatagtaaa cgggagggga ggtcgctagt
agtaaacgct aggtagctag gataatccgt 60ctcgtgttgg acggaaggtt ttggacgcat
ctgcgtgcac agcccgctga tacagatctg 120atcgactagc tagctagatg
ccgaggcccc agagcaaggc ccggatactc ctgcacagtc 180cctgagattt
cagcacagca ggtgctgttg catcaatata taaatccctg ctttattaat
240ttaatctctg tgcatgtatc catacatcgt cagcggctca gcgctatcac
actgcagtgc 300acgcagctag ttgagcgcct gggtcagtat atatatagct
agtagggaca aaggggggca 360ctgtacgttg gtttggtttg gcacgcacgc
gatcgagagt ggtggaatgg actgcagatc 420atcgatcgct gcactgtacg
cacgcgcacc ggactgcatt tgcatgcccc tgaaggagga 480aaggggaagg
aaagaaaaga aataggagaa agaagaagaa gcagagaaat acgtcacagt
540ccaagaagag tgagccgccc tagctagctt caaccctgac gaacccggca
gccacacttc 600cggccatgta tgcatgcatg catggcttag cttcagatgt
ccaatcgaat ccatcaagac 660ctggccggtt ttccatggcc gcctcgcctt
cgctagttaa gggccaagta cttgctgtcc 720ctgtggtacc atttaaatct
taagcctagg ataacttcgt atagcataca ttatacgaag 780ttatggcgcc
gctagcctgc agtgcagcgt gacccggtcg tgcccctctc tagagataat
840gagcattgca tgtctaagtt ataaaaaatt accacatatt ttttttgtca
cacttgtttg 900aagtgcagtt tatctatctt tatacatata tttaaacttt
actctacgaa taatataatc 960tatagtacta caataatatc agtgttttag
agaatcatat aaatgaacag ttagacatgg 1020tctaaaggac aattgagtat
tttgacaaca ggactctaca gttttatctt tttagtgtgc 1080atgtgttctc
cttttttttt gcaaatagct tcacctatat aatacttcat ccattttatt
1140agtacatcca tttagggttt agggttaatg gtttttatag actaattttt
ttagtacatc 1200tattttattc tattttagcc tctaaattaa gaaaactaaa
actctatttt agttttttta 1260tttaataatt tagatataaa atagaataaa
ataaagtgac taaaaattaa acaaataccc 1320tttaagaaat taaaaaaact
aaggaaacat ttttcttgtt tcgagtagat aatgccagcc 1380tgttaaacgc
cgtcgacgag tctaacggac accaaccagc gaaccagcag cgtcgcgtcg
1440ggccaagcga agcagacggc acggcatctc tgtcgctgcc tctggacccc
tctcgagagt 1500tccgctccac cgttggactt gctccgctgt cggcatccag
aaattgcgtg gcggagcggc 1560agacgtgagc cggcacggca ggcggcctcc
tcctcctctc acggcaccgg cagctacggg 1620ggattccttt cccaccgctc
cttcgctttc ccttcctcgc ccgccgtaat aaatagacac 1680cccctccaca
ccctctttcc ccaacctcgt gttgttcgga gcgcacacac acacaaccag
1740atctccccca aatccacccg tcggcacctc cgcttcaagg tacgccgctc
gtcctccccc 1800ccccccctct ctaccttctc tagatcggcg ttccggtcca
tgcatggtta gggcccggta 1860gttctacttc tgttcatgtt tgtgttagat
ccgtgtttgt gttagatccg tgctgctagc 1920gttcgtacac ggatgcgacc
tgtacgtcag acacgttctg attgctaact tgccagtgtt 1980tctctttggg
gaatcctggg atggctctag ccgttccgca gacgggatcg atttcatgat
2040tttttttgtt tcgttgcata gggtttggtt tgcccttttc ctttatttca
atatatgccg 2100tgcacttgtt tgtcgggtca tcttttcatg cttttttttg
tcttggttgt gatgatgtgg 2160tctggttggg cggtcgttct agatcggagt
agaattctgt ttcaaactac ctggtggatt 2220tattaatttt ggatctgtat
gtgtgtgcca tacatattca tagttacgaa ttgaagatga 2280tggatggaaa
tatcgatcta ggataggtat acatgttgat gcgggtttta ctgatgcata
2340tacagagatg ctttttgttc gcttggttgt gatgatgtgg tgtggttggg
cggtcgttca 2400ttcgttctag atcggagtag aatactgttt caaactacct
ggtgtattta ttaattttgg 2460aactgtatgt gtgtgtcata catcttcata
gttacgagtt taagatggat ggaaatatcg 2520atctaggata ggtatacatg
ttgatgtggg ttttactgat gcatatacat gatggcatat 2580gcagcatcta
ttcatatgct ctaaccttga gtacctatct attataataa acaagtatgt
2640tttataatta ttttgatctt gatatacttg gatgatggca tatgcagcag
ctatatgtgg 2700atttttttag ccctgccttc atacgctatt tatttgcttg
gtactgtttc ttttgtcgat 2760gctcaccctg ttgtttggtg ttacttctgc
aggtcgactc tagaggatca attcgctagc 2820gaagttccta ttccgaagtt
cctattctct agaaagtata ggaacttcag atccaccggg 2880atccccgatc
atgcaaaaac tcattaactc agtgcaaaac tatgcctggg gcagcaaaac
2940ggcgttgact gaactttatg gtatggaaaa tccgtccagc cagccgatgg
ccgagctgtg 3000gatgggcgca catccgaaaa gcagttcacg agtgcagaat
gccgccggag atatcgtttc 3060actgcgtgat gtgattgaga gtgataaatc
gactctgctc ggagaggccg ttgccaaacg 3120ctttggcgaa ctgcctttcc
tgttcaaagt attatgcgca gcacagccac tctccattca 3180ggttcatcca
aacaaacaca attctgaaat cggttttgcc aaagaaaatg ccgcaggtat
3240cccgatggat gccgccgagc gtaactataa agatcctaac cacaagccgg
agctggtttt 3300tgcgctgacg cctttccttg cgatgaacgc gtttcgtgaa
ttttccgaga ttgtctccct 3360actccagccg gtcgcaggtg cacatccggc
gattgctcac tttttacaac agcctgatgc 3420cgaacgttta agcgaactgt
tcgccagcct gttgaatatg cagggtgaag aaaaatcccg 3480cgcgctggcg
attttaaaat cggccctcga tagccagcag ggtgaaccgt ggcaaacgat
3540tcgtttaatt tctgaatttt acccggaaga cagcggtctg ttctccccgc
tattgctgaa 3600tgtggtgaaa ttgaaccctg gcgaagcgat gttcctgttc
gctgaaacac cgcacgctta 3660cctgcaaggc gtggcgctgg aagtgatggc
aaactccgat aacgtgctgc gtgcgggtct 3720gacgcctaaa tacattgata
ttccggaact ggttgccaat gtgaaattcg aagccaaacc 3780ggctaaccag
ttgttgaccc agccggtgaa acaaggtgca gaactggact tcccgattcc
3840agtggatgat tttgccttct cgctgcatga ccttagtgat aaagaaacca
ccattagcca 3900gcagagtgcc gccattttgt tctgcgtcga aggcgatgca
acgttgtgga aaggttctca 3960gcagttacag cttaaaccgg gtgaatcagc
gtttattgcc gccaacgaat caccggtgac 4020tgtcaaaggc cacggccgtt
tagcgcgtgt ttacaacaag ctgtaagagc ttactgaaaa 4080aattaacatc
tcttgctaag ctgggggtgg aacctagact tgtccatctt ctggattggc
4140caacttaatt aatgtatgaa ataaaaggat gcacacatag tgacatgcta
atcactataa 4200tgtgggcatc aaagttgtgt gttatgtgta attactagtt
atctgaataa aagagaaaga 4260gatcatccat atttcttatc ctaaatgaat
gtcacgtgtc tttataattc tttgatgaac 4320cagatgcatt tcattaacca
aatccatata catataaata ttaatcatat ataattaata 4380tcaattgggt
tagcaaaaca aatctagtct aggtgtgttt tgcgaatgcg gccctagcgt
4440atacgaagtt cctattccga agttcctatt ctccagaaag tataggaact
tctgtacacc 4500tgagctgatt ccgatgactt cgtaggttcc tagctcaagc
cgctcgtgtc caagcgtcac 4560ttacgattag ctaatgatta
cggcatctag gaccgactag ctaactaact agtaccgagg 4620ccggccccgc
gggagctcgg cgcgccatct ccaacacgag ccttgattcc tgccggccgg
4680tgatggcaat ggccgctagt agtctccgct agctagggag cggcgatccg
acgcgacgcc 4740accatgtgtc tagaaaagaa gtttcttgct ttgcatgcag
acttattagc gcggtcgaca 4800cctgtgggga ccccgtgtct tgagacaatg
agactgcctg tccgcccaag acactacttg 4860tagccatgaa gccatcgact
cctctccttg ctctccagta atccagtgga tggatccatc 4920atcgatagtt
tagtttatca gtcttcttga ggccggtgtc ccccatgcat aatgatgaca
4980gaaagcctgg gccaggtaaa agccaaaaag tttgaccctc taggtactgg
ggccagccct 5040ggcgtttgaa caaaaaaaaa atctgagcgt gtcgccccgg
cctgttttcg aactcctaaa 5100cgacgtcgca acttttttta tacacacact
accggtacat ggcttt 5146
261; 32; DNA; Artificial Sequence; ubir primer from donor
261 ccatgtctaa ctgttcattt atatgattct ct 32
262; 31; DNA; Artificial Sequence; psbf primer from dono
262 gctcgtgtcc aagcgtcact tacgattagc t 31
263; 27; DNA; Artificial Sequence; sMHP14 14-1HR1f primer
263 ctcacatgag gctcttcttt gcttgct 27
264; 26; DNA; Artificial Sequence; MHP14 14-1HR2r primer
264 aggatcctat tccccaattt gtagat 26
265; 21; DNA; Artificial Sequence; CHR1-8 8HR1f primer
265 cagtccgtgg attgaagcca t 21
266; 22; DNA; Artificial Sequence; CHR1-8 8HR2r primer
266 ctctgtctcc gagacgtgct ta 22
267; 26; DNA; Artificial Sequence; CHR1-9 9HR1f primer
267 ggagcaaatg ttttaggtat gaaatg 26
268; 28; DNA; Artificial Sequence; CHR1-9 9HR2r primer
268 cggattctaa agatcatacg taaatgaa 28
269; 21; DNA; Artificial Sequence; CHR1-10 10HR1f primer
269 tggcttgtct atgcgcatct c 21
270; 20; DNA; Artificial Sequence; CHR1-10 10HR2r primer
270 ccagacccaa acagcaggtt 20
271; 18; DNA; Artificial Sequence; MHP14Cas-1 probe
271 cagattcacg tcagattt 18
272; 27; DNA; Artificial Sequence; MHP14cas-1 forward primer
272 catagtggtg tatgaaagga agcactt 27
273; 30; DNA; Artificial Sequence; MHP14cas-1 reverse primer
273 cattttggat tgtaatatgt gtacctcata 30
274; 17; DNA; Artificial Sequence; MHP14Cas-3 probe
274 caccactatg tcgcttc 17
275; 22; DNA; Artificial Sequence; MHP14Cas-3 forward primer
275 cggatgcacg aaaattgtag ga 22
276; 24; DNA; Artificial Sequence; MHP14Cas-3 reverse primer
276 ctgacgtgaa tctgtttgga attg 24
277; 18; DNA; Artificial Sequence; TS8Cas-1 probe
277 tacgtaacgt gcagtact 18
278; 21; DNA; Artificial Sequence; TS8Cas-1 forward primer
278 acggacggac catacgttat g 21
279; 26; DNA; Artificial Sequence; TS8Cas-1 reverse primer
279 tcagctggtg gagtatatta gttcgt 26
280; 18; DNA; Artificial Sequence; TS8Cas-2 probe
280 ccagctgatc actgatga 18
281; 21; DNA; Artificial Sequence; TS8Cas-2 forward primer
281 acggacggac catacgttat g 21
282; 26; DNA; Artificial Sequence; TS8Cas-2 reverse primer
282 cgcacatgtt ataaattaca atgcat 26
283; 14; DNA; Artificial Sequence; TS9Cas-2 probe
283 ctgtttgcgg cctc 14
284; 19; DNA; Artificial Sequence; TS9Cas-2 forward primer
284 ctgcggagct gctggcgat 19
285; 20; DNA; Artificial Sequence; TS9Cas-2 reverse primer
285 cttgctggct tcgtctgtca 20
286; 15; DNA; Artificial Sequence; TS9Cas-3 probe
286 ccgacgtgcg tgcaa 15
287; 19; DNA; Artificial Sequence; TS9Cas-3 forward primer
287 ctgcggagct gctggcgat 19
288; 20; DNA; Artificial Sequence; TS9Cas-3 reverse primer
288 cttgctggct tcgtctgtca 20
289; 17; DNA; Artificial Sequence; TS10Cas-1 probe
289 tcgccttcgc tagttaa 17
290; 20; DNA; Artificial Sequence; TS10Cas-1 forward primer
290 aagacctggc cggttttcca 20
291; 18; DNA; Artificial Sequence; TS10Cas-1 reverse primer
291 tagcggccat tgccatca 18
292; 19; DNA; Artificial Sequence; TS10Cas-3 probe
292 ctgtatctcc aacacgagc 19
293; 20; DNA; Artificial Sequence; TS10Cas-3 forward primer
293 aagacctggc cggttttcca 20
294; 18; DNA; Artificial Sequence; TS10Cas-3 reverse primer
294 tagcggccat tgccatca 18
295; 472; DNA; Glycine max; GM-U6-9.1 promoter; (1)..(472)
295 cccgggttaa
gagaattgta agtgtgcttt tatatattta aaattaatat attttgaaat 60gttaaaatat
aaaagaaaat tcaatgtaaa ttaaaaataa ataaatgttt aataaagata
120aattttaaaa cataaaagaa aatgtctaac aagaggatta agatcctgtg
ctcttaaatt 180tttaggtgtt gaaatcttag ccatacaaaa tatattttat
taaaaccaag catgaaaaaa 240gtcactaaag agctatataa ctcatgcagc
tagaaatgaa gtgaagggaa tccagtttgt 300tctcagtcga aagagtgtct
atctttgttc ttttctgcaa ccgagttaag caaaatggga 360atgcgaggta
tcttcctttc gttaggggag caccagatgc atagttagtc ccacattgat
420gaatataaca agagcttcac agaatatata gcccaggcca cagtaaaagc tt 472
296; 5958; DNA; Artificial sequence; EF1A2-CAS9
296 gggtttactt attttgtggg
tatctatact tttattagat ttttaatcag gctcctgatt 60tctttttatt tcgattgaat
tcctgaactt gtattattca gtagatcgaa taaattataa 120aaagataaaa
tcataaaata atattttatc ctatcaatca tattaaagca atgaatatgt
180aaaattaatc ttatctttat tttaaaaaat catataggtt tagtattttt
ttaaaaataa 240agataggatt agttttacta ttcactgctt attactttta
aaaaaatcat aaaggtttag 300tattttttta aaataaatat aggaatagtt
ttactattca ctgctttaat agaaaaatag 360tttaaaattt aagatagttt
taatcccagc atttgccacg tttgaacgtg agccgaaacg 420atgtcgttac
attatcttaa cctagctgaa acgatgtcgt cataatatcg ccaaatgcca
480actggactac gtcgaaccca caaatcccac aaagcgcgtg aaatcaaatc
gctcaaacca 540caaaaaagaa caacgcgttt gttacacgct caatcccacg
cgagtagagc acagtaacct 600tcaaataagc gaatggggca taatcagaaa
tccgaaataa acctaggggc attatcggaa 660atgaaaagta gctcactcaa
tataaaaatc taggaaccct agttttcgtt atcactctgt 720gctccctcgc
tctatttctc agtctctgtg tttgcggctg aggattccga acgagtgacc
780ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc tcttcgattc
gatctatgcc 840tgtctcttat ttacgatgat gtttcttcgg ttatgttttt
ttatttatgc tttatgctgt 900tgatgttcgg ttgtttgttt cgctttgttt
ttgtggttca gttttttagg attcttttgg 960tttttgaatc gattaatcgg
aagagatttt cgagttattt ggtgtgttgg aggtgaatct 1020tttttttgag
gtcatagatc tgttgtattt gtgttataaa catgcgactt tgtatgattt
1080tttacgaggt tatgatgttc tggttgtttt attatgaatc tgttgagaca
gaaccatgat 1140ttttgttgat gttcgtttac actattaaag gtttgtttta
acaggattaa aagtttttta 1200agcatgttga aggagtcttg tagatatgta
accgtcgata gtttttttgt gggtttgttc 1260acatgttatc aagcttaatc
ttttactatg tatgcgacca tatctggatc cagcaaaggc 1320gattttttaa
ttccttgtga aacttttgta atatgaagtt gaaattttgt tattggtaaa
1380ctataaatgt gtgaagttgg agtatacctt taccttctta tttggctttg
tgatagttta 1440atttatatgt attttgagtt ctgacttgta tttctttgaa
ttgattctag tttaagtaat 1500ccatggacaa aaagtactca atagggctcg
acatagggac taactccgtt ggatgggccg 1560tcatcaccga cgagtacaag
gtgccctcca agaagttcaa ggtgttggga aacaccgaca 1620ggcacagcat
aaagaagaat ttgatcggtg ccctcctctt cgactccgga gagaccgctg
1680aggctaccag gctcaagagg accgctagaa ggcgctacac cagaaggaag
aacagaatct 1740gctacctgca ggagatcttc tccaacgaga tggccaaggt
ggacgactcc ttcttccacc 1800gccttgagga atcattcctg gtggaggagg
ataaaaagca cgagagacac ccaatcttcg 1860ggaacatcgt cgacgaggtg
gcctaccatg aaaagtaccc taccatctac cacctgagga 1920agaagctggt
cgactctacc gacaaggctg acttgcgctt gatttacctg gctctcgctc
1980acatgataaa gttccgcgga cacttcctca ttgagggaga cctgaaccca
gacaactccg 2040acgtggacaa gctcttcatc cagctcgttc agacctacaa
ccagcttttc gaggagaacc 2100caatcaacgc cagtggagtt gacgccaagg
ctatcctctc tgctcgtctg tcaaagtcca 2160ggaggcttga gaacttgatt
gcccagctgc ctggcgaaaa gaagaacgga ctgttcggaa 2220acttgatcgc
tctctccctg ggattgactc ccaacttcaa gtccaacttc gacctcgccg
2280aggacgctaa gttgcagttg tctaaagaca cctacgacga tgacctcgac
aacttgctgg 2340cccagatagg cgaccaatac gccgatctct tcctcgccgc
taagaacttg tccgacgcaa 2400tcctgctgtc cgacatcctg agagtcaaca
ctgagattac caaagctcct ctgtctgctt 2460ccatgattaa gcgctacgac
gagcaccacc aagatctgac cctgctcaag gccctggtga 2520gacagcagct
gcccgagaag tacaaggaga tctttttcga ccagtccaag aacggctacg
2580ccggatacat tgacggaggc gcctcccagg aagagttcta caagttcatc
aagcccatcc 2640ttgagaagat ggacggtacc gaggagctgt tggtgaagtt
gaacagagag gacctgttga 2700ggaagcagag aaccttcgac aacggaagca
tccctcacca aatccacctg ggagagctcc 2760acgccatctt gaggaggcag
gaggatttct atcccttcct gaaggacaac cgcgagaaga 2820ttgagaagat
cttgaccttc agaattcctt actacgtcgg gccactcgcc agaggaaact
2880ctaggttcgc ctggatgacc cgcaaatctg aagagaccat tactccctgg
aacttcgagg 2940aagtcgtgga caagggcgct tccgctcagt ctttcatcga
gaggatgacc aacttcgata 3000aaaatctgcc caacgagaag gtgctgccca
agcactccct gttgtacgag tatttcacag 3060tgtacaacga gctcaccaag
gtgaagtacg tcacagaggg aatgaggaag cctgccttct 3120tgtccggaga
gcagaagaag gccatcgtcg acctgctctt caagaccaac aggaaggtga
3180ctgtcaagca gctgaaggag gactacttca agaagatcga gtgcttcgac
tccgtcgaga 3240tctctggtgt cgaggacagg ttcaacgcct cccttgggac
ttaccacgat ctgctcaaga 3300ttattaaaga caaggacttc ctggacaacg
aggagaacga ggacatcctt gaggacatcg 3360tgctcaccct gaccttgttc
gaagacaggg aaatgatcga agagaggctc aagacctacg 3420cccacctctt
cgacgacaag gtgatgaaac agctgaagag acgcagatat accggctggg
3480gaaggctctc ccgcaaattg atcaacggga tcagggacaa gcagtcaggg
aagactatac 3540tcgacttcct gaagtccgac ggattcgcca acaggaactt
catgcagctc attcacgacg 3600actccttgac cttcaaggag gacatccaga
aggctcaggt gtctggacag ggtgactcct 3660tgcatgagca cattgctaac
ttggccggct ctcccgctat taagaagggc attttgcaga 3720ccgtgaaggt
cgttgacgag ctcgtgaagg tgatgggacg ccacaagcca gagaacatcg
3780ttattgagat ggctcgcgag aaccaaacta cccagaaagg gcagaagaat
tcccgcgaga 3840ggatgaagcg cattgaggag ggcataaaag agcttggctc
tcagatcctc aaggagcacc 3900ccgtcgagaa cactcagctg cagaacgaga
agctgtacct gtactacctc caaaacggaa 3960gggacatgta cgtggaccag
gagctggaca tcaacaggtt gtccgactac gacgtcgacc 4020acatcgtgcc
tcagtccttc ctgaaggatg actccatcga caataaagtg ctgacacgct
4080ccgataaaaa tagaggcaag tccgacaacg tcccctccga ggaggtcgtg
aagaagatga 4140aaaactactg gagacagctc ttgaacgcca agctcatcac
ccagcgtaag ttcgacaacc 4200tgactaaggc tgagagagga ggattgtccg
agctcgataa ggccggattc atcaagagac 4260agctcgtcga aacccgccaa
attaccaagc acgtggccca aattctggat tcccgcatga 4320acaccaagta
cgatgaaaat gacaagctga tccgcgaggt caaggtgatc accttgaagt
4380ccaagctggt ctccgacttc cgcaaggact tccagttcta caaggtgagg
gagatcaaca 4440actaccacca cgcacacgac gcctacctca acgctgtcgt
tggaaccgcc ctcatcaaaa 4500aatatcctaa gctggagtct gagttcgtct
acggcgacta caaggtgtac gacgtgagga 4560agatgatcgc taagtctgag
caggagatcg gcaaggccac cgccaagtac ttcttctact 4620ccaacatcat
gaacttcttc aagaccgaga tcactctcgc caacggtgag atcaggaagc
4680gcccactgat cgagaccaac ggtgagactg gagagatcgt gtgggacaaa
gggagggatt 4740tcgctactgt gaggaaggtg ctctccatgc ctcaggtgaa
catcgtcaag aagaccgaag 4800ttcagaccgg aggattctcc aaggagtcca
tcctccccaa gagaaactcc gacaagctga 4860tcgctagaaa gaaagactgg
gaccctaaga agtacggagg cttcgattct cctaccgtgg 4920cctactctgt
gctggtcgtg gccaaggtgg agaagggcaa gtccaagaag ctgaaatccg
4980tcaaggagct cctcgggatt accatcatgg agaggagttc cttcgagaag
aaccctatcg 5040acttcctgga ggccaaggga tataaagagg tgaagaagga
cctcatcatc aagctgccca 5100agtactccct cttcgagttg gagaacggaa
ggaagaggat gctggcttct gccggagagt 5160tgcagaaggg aaatgagctc
gcccttccct ccaagtacgt gaacttcctg tacctcgcct 5220ctcactatga
aaagttgaag ggctctcctg aggacaacga gcagaagcag ctcttcgtgg
5280agcagcacaa gcactacctg gacgaaatta tcgagcagat ctctgagttc
tccaagcgcg 5340tgatattggc cgacgccaac ctcgacaagg tgctgtccgc
ctacaacaag cacagggata 5400agcccattcg cgagcaggct gaaaacatta
tccacctgtt taccctcaca aacttgggag 5460cccctgctgc cttcaagtac
ttcgacacca ccattgacag gaagagatac acctccacca 5520aggaggtgct
cgacgcaaca ctcatccacc aatccatcac cggcctctat gaaacaagga
5580ttgacttgtc ccagctggga ggcgactcta gagccgatcc caagaagaag
agaaaggtgt 5640aggttaacct agacttgtcc atcttctgga ttggccaact
taattaatgt atgaaataaa 5700aggatgcaca catagtgaca tgctaatcac
tataatgtgg gcatcaaagt tgtgtgttat 5760gtgtaattac tagttatctg
aataaaagag aaagagatca tccatatttc ttatcctaaa 5820tgaatgtcac
gtgtctttat aattctttga tgaaccagat gcatttcatt aaccaaatcc
5880atatacatat aaatattaat catatataat taatatcaat tgggttagca
aaacaaatct 5940agtctaggtg tgttttgc 5958
297; 573; DNA; Artificial sequence; U6-9.1-DD20CR1
297 ccgggttaag agaattgtaa gtgtgctttt
atatatttaa aattaatata ttttgaaatg 60ttaaaatata aaagaaaatt caatgtaaat
taaaaataaa taaatgttta ataaagataa 120attttaaaac ataaaagaaa
atgtctaaca agaggattaa gatcctgtgc tcttaaattt 180ttaggtgttg
aaatcttagc catacaaaat atattttatt aaaaccaagc atgaaaaaag
240tcactaaaga gctatataac tcatgcagct agaaatgaag tgaagggaat
ccagtttgtt 300ctcagtcgaa agagtgtcta tctttgttct tttctgcaac
cgagttaagc aaaatgggaa 360tgcgaggtat cttcctttcg ttaggggagc
accagatgca tagttagtcc cacattgatg 420aatataacaa gagcttcaca
gaatatatag cccaggccac agtaaaagct tggaactgac 480acacgacatg
agttttagag ctagaaatag caagttaaaa taaggctagt ccgttatcaa
540cttgaaaaag tggcaccgag tcggtgcttt ttt 573
298; 6611; DNA; Artificial sequence; U6-9.1-DD20CR1+EF1A2-CAS9
298 cgcgccggta cccgggttaa
gagaattgta agtgtgcttt tatatattta aaattaatat 60attttgaaat gttaaaatat
aaaagaaaat tcaatgtaaa ttaaaaataa ataaatgttt 120aataaagata
aattttaaaa cataaaagaa aatgtctaac aagaggatta agatcctgtg
180ctcttaaatt tttaggtgtt gaaatcttag ccatacaaaa tatattttat
taaaaccaag 240catgaaaaaa gtcactaaag agctatataa ctcatgcagc
tagaaatgaa gtgaagggaa 300tccagtttgt tctcagtcga aagagtgtct
atctttgttc ttttctgcaa ccgagttaag 360caaaatggga atgcgaggta
tcttcctttc gttaggggag caccagatgc atagttagtc 420ccacattgat
gaatataaca agagcttcac agaatatata gcccaggcca cagtaaaagc
480ttggaactga cacacgacat gagttttaga gctagaaata gcaagttaaa
ataaggctag 540tccgttatca acttgaaaaa gtggcaccga gtcggtgctt
ttttttgcgg ccgcaattgg 600atcgggttta cttattttgt gggtatctat
acttttatta gatttttaat caggctcctg 660atttcttttt atttcgattg
aattcctgaa cttgtattat tcagtagatc gaataaatta 720taaaaagata
aaatcataaa ataatatttt atcctatcaa tcatattaaa gcaatgaata
780tgtaaaatta atcttatctt tattttaaaa aatcatatag gtttagtatt
tttttaaaaa 840taaagatagg attagtttta ctattcactg cttattactt
ttaaaaaaat cataaaggtt 900tagtattttt ttaaaataaa tataggaata
gttttactat tcactgcttt aatagaaaaa 960tagtttaaaa tttaagatag
ttttaatccc agcatttgcc acgtttgaac gtgagccgaa 1020acgatgtcgt
tacattatct taacctagct gaaacgatgt cgtcataata tcgccaaatg
1080ccaactggac tacgtcgaac ccacaaatcc cacaaagcgc gtgaaatcaa
atcgctcaaa 1140ccacaaaaaa gaacaacgcg tttgttacac gctcaatccc
acgcgagtag agcacagtaa 1200ccttcaaata agcgaatggg gcataatcag
aaatccgaaa taaacctagg ggcattatcg 1260gaaatgaaaa gtagctcact
caatataaaa atctaggaac cctagttttc gttatcactc 1320tgtgctccct
cgctctattt ctcagtctct gtgtttgcgg ctgaggattc cgaacgagtg
1380accttcttcg tttctcgcaa aggtaacagc ctctgctctt gtctcttcga
ttcgatctat 1440gcctgtctct tatttacgat gatgtttctt cggttatgtt
tttttattta tgctttatgc 1500tgttgatgtt cggttgtttg tttcgctttg
tttttgtggt tcagtttttt aggattcttt 1560tggtttttga atcgattaat
cggaagagat tttcgagtta tttggtgtgt tggaggtgaa 1620tctttttttt
gaggtcatag atctgttgta tttgtgttat aaacatgcga ctttgtatga
1680ttttttacga ggttatgatg ttctggttgt tttattatga atctgttgag
acagaaccat 1740gatttttgtt gatgttcgtt tacactatta aaggtttgtt
ttaacaggat taaaagtttt 1800ttaagcatgt tgaaggagtc ttgtagatat
gtaaccgtcg atagtttttt tgtgggtttg 1860ttcacatgtt atcaagctta
atcttttact atgtatgcga ccatatctgg atccagcaaa 1920ggcgattttt
taattccttg tgaaactttt gtaatatgaa gttgaaattt tgttattggt
1980aaactataaa tgtgtgaagt tggagtatac ctttaccttc ttatttggct
ttgtgatagt 2040ttaatttata tgtattttga gttctgactt gtatttcttt
gaattgattc tagtttaagt 2100aatccatgga caaaaagtac tcaatagggc
tcgacatagg gactaactcc gttggatggg 2160ccgtcatcac cgacgagtac
aaggtgccct ccaagaagtt caaggtgttg ggaaacaccg 2220acaggcacag
cataaagaag aatttgatcg gtgccctcct cttcgactcc ggagagaccg
2280ctgaggctac caggctcaag aggaccgcta gaaggcgcta caccagaagg
aagaacagaa 2340tctgctacct gcaggagatc ttctccaacg agatggccaa
ggtggacgac tccttcttcc 2400accgccttga ggaatcattc ctggtggagg
aggataaaaa gcacgagaga cacccaatct 2460tcgggaacat cgtcgacgag
gtggcctacc atgaaaagta ccctaccatc taccacctga 2520ggaagaagct
ggtcgactct accgacaagg ctgacttgcg cttgatttac ctggctctcg
2580ctcacatgat aaagttccgc ggacacttcc tcattgaggg agacctgaac
ccagacaact 2640ccgacgtgga caagctcttc atccagctcg ttcagaccta
caaccagctt ttcgaggaga 2700acccaatcaa cgccagtgga gttgacgcca
aggctatcct ctctgctcgt ctgtcaaagt 2760ccaggaggct tgagaacttg
attgcccagc tgcctggcga aaagaagaac ggactgttcg 2820gaaacttgat
cgctctctcc ctgggattga ctcccaactt caagtccaac ttcgacctcg
2880ccgaggacgc taagttgcag ttgtctaaag acacctacga cgatgacctc
gacaacttgc 2940tggcccagat aggcgaccaa tacgccgatc tcttcctcgc
cgctaagaac ttgtccgacg 3000caatcctgct gtccgacatc ctgagagtca
acactgagat taccaaagct cctctgtctg 3060cttccatgat taagcgctac
gacgagcacc accaagatct gaccctgctc aaggccctgg 3120tgagacagca
gctgcccgag aagtacaagg agatcttttt cgaccagtcc aagaacggct
3180acgccggata cattgacgga ggcgcctccc aggaagagtt ctacaagttc
atcaagccca 3240tccttgagaa gatggacggt accgaggagc tgttggtgaa
gttgaacaga gaggacctgt 3300tgaggaagca gagaaccttc gacaacggaa
gcatccctca ccaaatccac ctgggagagc 3360tccacgccat cttgaggagg
caggaggatt tctatccctt cctgaaggac aaccgcgaga 3420agattgagaa
gatcttgacc ttcagaattc
cttactacgt cgggccactc gccagaggaa 3480actctaggtt cgcctggatg
acccgcaaat ctgaagagac cattactccc tggaacttcg 3540aggaagtcgt
ggacaagggc gcttccgctc agtctttcat cgagaggatg accaacttcg
3600ataaaaatct gcccaacgag aaggtgctgc ccaagcactc cctgttgtac
gagtatttca 3660cagtgtacaa cgagctcacc aaggtgaagt acgtcacaga
gggaatgagg aagcctgcct 3720tcttgtccgg agagcagaag aaggccatcg
tcgacctgct cttcaagacc aacaggaagg 3780tgactgtcaa gcagctgaag
gaggactact tcaagaagat cgagtgcttc gactccgtcg 3840agatctctgg
tgtcgaggac aggttcaacg cctcccttgg gacttaccac gatctgctca
3900agattattaa agacaaggac ttcctggaca acgaggagaa cgaggacatc
cttgaggaca 3960tcgtgctcac cctgaccttg ttcgaagaca gggaaatgat
cgaagagagg ctcaagacct 4020acgcccacct cttcgacgac aaggtgatga
aacagctgaa gagacgcaga tataccggct 4080ggggaaggct ctcccgcaaa
ttgatcaacg ggatcaggga caagcagtca gggaagacta 4140tactcgactt
cctgaagtcc gacggattcg ccaacaggaa cttcatgcag ctcattcacg
4200acgactcctt gaccttcaag gaggacatcc agaaggctca ggtgtctgga
cagggtgact 4260ccttgcatga gcacattgct aacttggccg gctctcccgc
tattaagaag ggcattttgc 4320agaccgtgaa ggtcgttgac gagctcgtga
aggtgatggg acgccacaag ccagagaaca 4380tcgttattga gatggctcgc
gagaaccaaa ctacccagaa agggcagaag aattcccgcg 4440agaggatgaa
gcgcattgag gagggcataa aagagcttgg ctctcagatc ctcaaggagc
4500accccgtcga gaacactcag ctgcagaacg agaagctgta cctgtactac
ctccaaaacg 4560gaagggacat gtacgtggac caggagctgg acatcaacag
gttgtccgac tacgacgtcg 4620accacatcgt gcctcagtcc ttcctgaagg
atgactccat cgacaataaa gtgctgacac 4680gctccgataa aaatagaggc
aagtccgaca acgtcccctc cgaggaggtc gtgaagaaga 4740tgaaaaacta
ctggagacag ctcttgaacg ccaagctcat cacccagcgt aagttcgaca
4800acctgactaa ggctgagaga ggaggattgt ccgagctcga taaggccgga
ttcatcaaga 4860gacagctcgt cgaaacccgc caaattacca agcacgtggc
ccaaattctg gattcccgca 4920tgaacaccaa gtacgatgaa aatgacaagc
tgatccgcga ggtcaaggtg atcaccttga 4980agtccaagct ggtctccgac
ttccgcaagg acttccagtt ctacaaggtg agggagatca 5040acaactacca
ccacgcacac gacgcctacc tcaacgctgt cgttggaacc gccctcatca
5100aaaaatatcc taagctggag tctgagttcg tctacggcga ctacaaggtg
tacgacgtga 5160ggaagatgat cgctaagtct gagcaggaga tcggcaaggc
caccgccaag tacttcttct 5220actccaacat catgaacttc ttcaagaccg
agatcactct cgccaacggt gagatcagga 5280agcgcccact gatcgagacc
aacggtgaga ctggagagat cgtgtgggac aaagggaggg 5340atttcgctac
tgtgaggaag gtgctctcca tgcctcaggt gaacatcgtc aagaagaccg
5400aagttcagac cggaggattc tccaaggagt ccatcctccc caagagaaac
tccgacaagc 5460tgatcgctag aaagaaagac tgggacccta agaagtacgg
aggcttcgat tctcctaccg 5520tggcctactc tgtgctggtc gtggccaagg
tggagaaggg caagtccaag aagctgaaat 5580ccgtcaagga gctcctcggg
attaccatca tggagaggag ttccttcgag aagaacccta 5640tcgacttcct
ggaggccaag ggatataaag aggtgaagaa ggacctcatc atcaagctgc
5700ccaagtactc cctcttcgag ttggagaacg gaaggaagag gatgctggct
tctgccggag 5760agttgcagaa gggaaatgag ctcgcccttc cctccaagta
cgtgaacttc ctgtacctcg 5820cctctcacta tgaaaagttg aagggctctc
ctgaggacaa cgagcagaag cagctcttcg 5880tggagcagca caagcactac
ctggacgaaa ttatcgagca gatctctgag ttctccaagc 5940gcgtgatatt
ggccgacgcc aacctcgaca aggtgctgtc cgcctacaac aagcacaggg
6000ataagcccat tcgcgagcag gctgaaaaca ttatccacct gtttaccctc
acaaacttgg 6060gagcccctgc tgccttcaag tacttcgaca ccaccattga
caggaagaga tacacctcca 6120ccaaggaggt gctcgacgca acactcatcc
accaatccat caccggcctc tatgaaacaa 6180ggattgactt gtcccagctg
ggaggcgact ctagagccga tcccaagaag aagagaaagg 6240tgtaggttaa
cctagacttg tccatcttct ggattggcca acttaattaa tgtatgaaat
6300aaaaggatgc acacatagtg acatgctaat cactataatg tgggcatcaa
agttgtgtgt 6360tatgtgtaat tactagttat ctgaataaaa gagaaagaga
tcatccatat ttcttatcct 6420aaatgaatgt cacgtgtctt tataattctt
tgatgaacca gatgcatttc attaaccaaa 6480tccatataca tataaatatt
aatcatatat aattaatatc aattgggtta gcaaaacaaa 6540tctagtctag
gtgtgttttg cgaattcgat atcaagctta tcgataccgt cgaggggggg
6600cccggtaccg g 6611
299; 5686; DNA; Artificial sequence; DD20HR1-SAMSHPT-DD20HR2
299 cgcgcctcta gttgaagaca cgttcatgtc
ttcatcgtaa gaagacactc agtagtcttc 60ggccagaatg gccatctgga ttcagcaggc
ctagaaggcc atttaaatcc tgaggatctg 120gtcttcctaa ggacccggga
tatcgctatc aactttgtat agaaaagttg ggccgaattc 180gagctcggta
cggccagaat ccggtaagtg actagggtca cgtgacccta gtcacttaaa
240ttcggccaga atggccatct ggattcagca ggcctagaag gcccggaccg
attaaacttt 300aattcggtcc gggttacctc gagcctagta ataattacac
atctaagata tccccttctt 360tttcaagtaa aataatatca tatgatctca
ttttagtgaa acaatactat ttccctgata 420actctcttca acattaggga
cttcatctaa tcatctactt tcaaggtata actagacgta 480tttgttcttt
taaaaaaaac actagatgta ctcgtcaact caaaattcat cgttcatgca
540ttttaattaa actttaatta gctaatgagt agaaaaagat catacgagta
aaatagaaga 600atcttcctag attttggaag aatggattgg agtgtaagtg
aattgatcca ttagtggaag 660atgctcttta caatggccaa actgttctaa
ttgttagagc acatttgaga tgaaacactt 720cagtagtgga ggtaacctac
aatcctagga tctgtatcct ctatcactaa tggagcaatg 780ggtttgagat
tgacttactc ctttccttgt ctctcgtagt gcatatgcgc actttcaaag
840gctacacaaa agccgttaac tttttgttta tttaagttac gaaagatagt
tgaattagag 900taaatggtga tattgaatta ggattttaaa taattttaaa
agaatttttt taataaaaaa 960aatattgtgt tgttggatca aaatttttaa
ataacatgaa taaggaaatg gattgcaatg 1020aggttttaaa caattatttt
aacatatagg attttagaaa gacttttata atattttgtt 1080gaagtttaga
ttttaatata tttatgtttt aaaattttaa aaaaaacttc atgaatttat
1140aatatttgaa aaagacacgt gaatatttag aaaacattta aaattacaat
aataaatcat 1200aatgagatag ggtgtattca tgtgtagacg agacaccaag
tatatggttc acaagtgaat 1260catctttttt ttttacagca caagtagatc
acttgtactt atcaaaattc ggaactgaca 1320cacactagtg gtcacctaag
tgactagggt cacgtgaccc tagtcactta ttcccaaaca 1380ctagtaacgg
ccgccagtgt gctggaattc gcccttccca agctttgctc tagatcaaac
1440tcacatccaa acataacatg gatatcttcc ttaccaatca tactaattat
tttgggttaa 1500atattaatca ttatttttaa gatattaatt aagaaattaa
aagatttttt aaaaaaatgt 1560ataaaattat attattcatg atttttcata
catttgattt tgataataaa tatatttttt 1620ttaatttctt aaaaaatgtt
gcaagacact tattagacat agtcttgttc tgtttacaaa 1680agcattcatc
atttaataca ttaaaaaata tttaatacta acagtagaat cttcttgtga
1740gtggtgtggg agtaggcaac ctggcattga aacgagagaa agagagtcag
aaccagaaga 1800caaataaaaa gtatgcaaca aacaaatcaa aatcaaaggg
caaaggctgg ggttggctca 1860attggttgct acattcaatt ttcaactcag
tcaacggttg agattcactc tgacttcccc 1920aatctaagcc gcggatgcaa
acggttgaat ctaacccaca atccaatctc gttacttagg 1980ggcttttccg
tcattaactc acccctgcca cccggtttcc ctataaattg gaactcaatg
2040ctcccctcta aactcgtatc gcttcagagt tgagaccaag acacactcgt
tcatatatct 2100ctctgctctt ctcttctctt ctacctctca aggtactttt
cttctccctc taccaaatcc 2160tagattccgt ggttcaattt cggatcttgc
acttctggtt tgctttgcct tgctttttcc 2220tcaactgggt ccatctagga
tccatgtgaa actctactct ttctttaata tctgcggaat 2280acgcgtttga
ctttcagatc tagtcgaaat catttcataa ttgcctttct ttcttttagc
2340ttatgagaaa taaaatcact ttttttttat ttcaaaataa accttgggcc
ttgtgctgac 2400tgagatgggg tttggtgatt acagaatttt agcgaatttt
gtaattgtac ttgtttgtct 2460gtagttttgt tttgttttct tgtttctcat
acattcctta ggcttcaatt ttattcgagt 2520ataggtcaca ataggaattc
aaactttgag caggggaatt aatcccttcc ttcaaatcca 2580gtttgtttgt
atatatgttt aaaaaatgaa acttttgctt taaattctat tataactttt
2640tttatggctg aaatttttgc atgtgtcttt gctctctgtt gtaaatttac
tgtttaggta 2700ctaactctag gcttgttgtg cagtttttga agtataacaa
cagaagttcc tattccgaag 2760ttcctattct ctagaaagta taggaacttc
cactagtcca tgaaaaagcc tgaactcacc 2820gcgacgtctg tcgagaagtt
tctgatcgaa aagttcgaca gcgtctccga cctgatgcag 2880ctctcggagg
gcgaagaatc tcgtgctttc agcttcgatg taggagggcg tggatatgtc
2940ctgcgggtaa atagctgcgc cgatggtttc tacaaagatc gttatgttta
tcggcacttt 3000gcatcggccg cgctcccgat tccggaagtg cttgacattg
gggaattcag cgagagcctg 3060acctattgca tctcccgccg tgcacagggt
gtcacgttgc aagacctgcc tgaaaccgaa 3120ctgcccgctg ttctgcagcc
ggtcgcggag gccatggatg cgatcgctgc ggccgatctt 3180agccagacga
gcgggttcgg cccattcgga ccgcaaggaa tcggtcaata cactacatgg
3240cgtgatttca tatgcgcgat tgctgatccc catgtgtatc actggcaaac
tgtgatggac 3300gacaccgtca gtgcgtccgt cgcgcaggct ctcgatgagc
tgatgctttg ggccgaggac 3360tgccccgaag tccggcacct cgtgcacgcg
gatttcggct ccaacaatgt cctgacggac 3420aatggccgca taacagcggt
cattgactgg agcgaggcga tgttcgggga ttcccaatac 3480gaggtcgcca
acatcttctt ctggaggccg tggttggctt gtatggagca gcagacgcgc
3540tacttcgagc ggaggcatcc ggagcttgca ggatcgccgc ggctccgggc
gtatatgctc 3600cgcattggtc ttgaccaact ctatcagagc ttggttgacg
gcaatttcga tgatgcagct 3660tgggcgcagg gtcgatgcga cgcaatcgtc
cgatccggag ccgggactgt cgggcgtaca 3720caaatcgccc gcagaagcgc
ggccgtctgg accgatggct gtgtagaagt actcgccgat 3780agtggaaacc
gacgccccag cactcgtccg agggcaaagg aatagtgagg tacctaaaga
3840aggagtgcgt cgaagcagat cgttcaaaca tttggcaata aagtttctta
agattgaatc 3900ctgttgccgg tcttgcgatg attatcatat aatttctgtt
gaattacgtt aagcatgtaa 3960taattaacat gtaatgcatg acgttattta
tgagatgggt ttttatgatt agagtcccgc 4020aattatacat ttaatacgcg
atagaaaaca aaatatagcg cgcaaactag gataaattat 4080cgcgcgcggt
gtcatctatg ttactagatc gatgtcgacc cgggccctag gaggccggcc
4140cagctgatga tcccggtgaa gttcctattc cgaagttcct attctccaga
aagtatagga 4200acttcactag agcttgcggc cgcgcatgct gacttaatca
gctaacgcca ctcgacctgc 4260aggcatgccc gcggatatcg atgggccccg
gccgaagctt caagtttgta caaaaaagca 4320ggctggcgcc ggaaccaatt
cagtcgactg gatccggtac cgaattcgcg gccgcactcg 4380agatatctag
acccagcttt cttgtacaaa gtggccgtta acggatcggc cagaatccgg
4440taagtgacta gggtcacgtg accctagtca cttaaattcg gccagaatgg
ccatctggat 4500tcagcaggcc tagaaggccc ggaccgatta aactttaatt
cggtccgggt tacctctaga 4560aagcttgtcg acctgcagac acgacatgat
ggaacgtgac taaggtgggt ttttgacttt 4620gcatgtcgaa gtgagagtga
ttttattgag agaataatag aagacctaca aaacaaatga 4680tcccgacgct
aaagtaagta cgagagttaa gagaataaat gggaaaatat gcatacatga
4740ttaggtgtgt gttcgtctca agaaagtacg aatgaatatg gtgtgtttgt
agtacatgaa 4800tgatgtgttt tgagggttca agggaaattg atatttatag
agtgaaatgg aaccagaggt 4860ctttgttgac aagggttgtt atgactcttg
caaataatta atagcttata aataatagcc 4920aataacttat tatagataga
gttagagata atatatagct aaatttgaac aaggcataca 4980aaacaaaaat
gctaaatatg aataagacaa tcaaaattgt agtcgatgtt caactctttg
5040tcgttgaaga acttgtttgc agtggtatag taaatgggtg tgagtgcagt
gtctcaccca 5100tctcacacca cacaaccaac ttcatatcta aagatattgt
cgctgaatac aaaattgagt 5160tatggaatat acaattcata atatagatac
gaaaaatcat ttcttacaaa acattcaatc 5220aaaaattatt caaacataat
tctagattaa gtaatccgaa gtacaagtta gtatcctaga 5280tccgttaatt
taaaattatg tttgcataat tttggatttg gtgttctata agggcacaat
5340tttgttcatt cttacaagtt tgtcaattct aaaatatatg caaatttgaa
gaaaaaaaat 5400ttacgaatgt gtctcaaaca ataacttaat gggaggagaa
tgagggatga agaagctcaa 5460aattaccaac gccttctacc tcaagaagct
acttcacaca aaatatgact ggcggaagga 5520taggggacaa ccgataacga
gaaggagata cataaggtaa tgtacgttgt tgtgtgaggg 5580atccggtcac
ctaagtgact agggtcacgt gaccctagtc acttattccc gggcaacttt
5640attatacaaa gttgatagat ctcgaattca ttccgattaa tcgtgg 5686
300; 6611; DNA; Artificial sequence; U6-9.1-DD20CR2+EF1A2-CAS9
300 cgcgccggta cccgggttaa gagaattgta agtgtgcttt tatatattta
aaattaatat 60attttgaaat gttaaaatat aaaagaaaat tcaatgtaaa ttaaaaataa
ataaatgttt 120aataaagata aattttaaaa cataaaagaa aatgtctaac
aagaggatta agatcctgtg 180ctcttaaatt tttaggtgtt gaaatcttag
ccatacaaaa tatattttat taaaaccaag 240catgaaaaaa gtcactaaag
agctatataa ctcatgcagc tagaaatgaa gtgaagggaa 300tccagtttgt
tctcagtcga aagagtgtct atctttgttc ttttctgcaa ccgagttaag
360caaaatggga atgcgaggta tcttcctttc gttaggggag caccagatgc
atagttagtc 420ccacattgat gaatataaca agagcttcac agaatatata
gcccaggcca cagtaaaagc 480ttgacatgat ggaacgtgac tagttttaga
gctagaaata gcaagttaaa ataaggctag 540tccgttatca acttgaaaaa
gtggcaccga gtcggtgctt ttttttgcgg ccgcaattgg 600atcgggttta
cttattttgt gggtatctat acttttatta gatttttaat caggctcctg
660atttcttttt atttcgattg aattcctgaa cttgtattat tcagtagatc
gaataaatta 720taaaaagata aaatcataaa ataatatttt atcctatcaa
tcatattaaa gcaatgaata 780tgtaaaatta atcttatctt tattttaaaa
aatcatatag gtttagtatt tttttaaaaa 840taaagatagg attagtttta
ctattcactg cttattactt ttaaaaaaat cataaaggtt 900tagtattttt
ttaaaataaa tataggaata gttttactat tcactgcttt aatagaaaaa
960tagtttaaaa tttaagatag ttttaatccc agcatttgcc acgtttgaac
gtgagccgaa 1020acgatgtcgt tacattatct taacctagct gaaacgatgt
cgtcataata tcgccaaatg 1080ccaactggac tacgtcgaac ccacaaatcc
cacaaagcgc gtgaaatcaa atcgctcaaa 1140ccacaaaaaa gaacaacgcg
tttgttacac gctcaatccc acgcgagtag agcacagtaa 1200ccttcaaata
agcgaatggg gcataatcag aaatccgaaa taaacctagg ggcattatcg
1260gaaatgaaaa gtagctcact caatataaaa atctaggaac cctagttttc
gttatcactc 1320tgtgctccct cgctctattt ctcagtctct gtgtttgcgg
ctgaggattc cgaacgagtg 1380accttcttcg tttctcgcaa aggtaacagc
ctctgctctt gtctcttcga ttcgatctat 1440gcctgtctct tatttacgat
gatgtttctt cggttatgtt tttttattta tgctttatgc 1500tgttgatgtt
cggttgtttg tttcgctttg tttttgtggt tcagtttttt aggattcttt
1560tggtttttga atcgattaat cggaagagat tttcgagtta tttggtgtgt
tggaggtgaa 1620tctttttttt gaggtcatag atctgttgta tttgtgttat
aaacatgcga ctttgtatga 1680ttttttacga ggttatgatg ttctggttgt
tttattatga atctgttgag acagaaccat 1740gatttttgtt gatgttcgtt
tacactatta aaggtttgtt ttaacaggat taaaagtttt 1800ttaagcatgt
tgaaggagtc ttgtagatat gtaaccgtcg atagtttttt tgtgggtttg
1860ttcacatgtt atcaagctta atcttttact atgtatgcga ccatatctgg
atccagcaaa 1920ggcgattttt taattccttg tgaaactttt gtaatatgaa
gttgaaattt tgttattggt 1980aaactataaa tgtgtgaagt tggagtatac
ctttaccttc ttatttggct ttgtgatagt 2040ttaatttata tgtattttga
gttctgactt gtatttcttt gaattgattc tagtttaagt 2100aatccatgga
caaaaagtac tcaatagggc tcgacatagg gactaactcc gttggatggg
2160ccgtcatcac cgacgagtac aaggtgccct ccaagaagtt caaggtgttg
ggaaacaccg 2220acaggcacag cataaagaag aatttgatcg gtgccctcct
cttcgactcc ggagagaccg 2280ctgaggctac caggctcaag aggaccgcta
gaaggcgcta caccagaagg aagaacagaa 2340tctgctacct gcaggagatc
ttctccaacg agatggccaa ggtggacgac tccttcttcc 2400accgccttga
ggaatcattc ctggtggagg aggataaaaa gcacgagaga cacccaatct
2460tcgggaacat cgtcgacgag gtggcctacc atgaaaagta ccctaccatc
taccacctga 2520ggaagaagct ggtcgactct accgacaagg ctgacttgcg
cttgatttac ctggctctcg 2580ctcacatgat aaagttccgc ggacacttcc
tcattgaggg agacctgaac ccagacaact 2640ccgacgtgga caagctcttc
atccagctcg ttcagaccta caaccagctt ttcgaggaga 2700acccaatcaa
cgccagtgga gttgacgcca aggctatcct ctctgctcgt ctgtcaaagt
2760ccaggaggct tgagaacttg attgcccagc tgcctggcga aaagaagaac
ggactgttcg 2820gaaacttgat cgctctctcc ctgggattga ctcccaactt
caagtccaac ttcgacctcg 2880ccgaggacgc taagttgcag ttgtctaaag
acacctacga cgatgacctc gacaacttgc 2940tggcccagat aggcgaccaa
tacgccgatc tcttcctcgc cgctaagaac ttgtccgacg 3000caatcctgct
gtccgacatc ctgagagtca acactgagat taccaaagct cctctgtctg
3060cttccatgat taagcgctac gacgagcacc accaagatct gaccctgctc
aaggccctgg 3120tgagacagca gctgcccgag aagtacaagg agatcttttt
cgaccagtcc aagaacggct 3180acgccggata cattgacgga ggcgcctccc
aggaagagtt ctacaagttc atcaagccca 3240tccttgagaa gatggacggt
accgaggagc tgttggtgaa gttgaacaga gaggacctgt 3300tgaggaagca
gagaaccttc gacaacggaa gcatccctca ccaaatccac ctgggagagc
3360tccacgccat cttgaggagg caggaggatt tctatccctt cctgaaggac
aaccgcgaga 3420agattgagaa gatcttgacc ttcagaattc cttactacgt
cgggccactc gccagaggaa 3480actctaggtt cgcctggatg acccgcaaat
ctgaagagac cattactccc tggaacttcg 3540aggaagtcgt ggacaagggc
gcttccgctc agtctttcat cgagaggatg accaacttcg 3600ataaaaatct
gcccaacgag aaggtgctgc ccaagcactc cctgttgtac gagtatttca
3660cagtgtacaa cgagctcacc aaggtgaagt acgtcacaga gggaatgagg
aagcctgcct 3720tcttgtccgg agagcagaag aaggccatcg tcgacctgct
cttcaagacc aacaggaagg 3780tgactgtcaa gcagctgaag gaggactact
tcaagaagat cgagtgcttc gactccgtcg 3840agatctctgg tgtcgaggac
aggttcaacg cctcccttgg gacttaccac gatctgctca 3900agattattaa
agacaaggac ttcctggaca acgaggagaa cgaggacatc cttgaggaca
3960tcgtgctcac cctgaccttg ttcgaagaca gggaaatgat cgaagagagg
ctcaagacct 4020acgcccacct cttcgacgac aaggtgatga aacagctgaa
gagacgcaga tataccggct 4080ggggaaggct ctcccgcaaa ttgatcaacg
ggatcaggga caagcagtca gggaagacta 4140tactcgactt cctgaagtcc
gacggattcg ccaacaggaa cttcatgcag ctcattcacg 4200acgactcctt
gaccttcaag gaggacatcc agaaggctca ggtgtctgga cagggtgact
4260ccttgcatga gcacattgct aacttggccg gctctcccgc tattaagaag
ggcattttgc 4320agaccgtgaa ggtcgttgac gagctcgtga aggtgatggg
acgccacaag ccagagaaca 4380tcgttattga gatggctcgc gagaaccaaa
ctacccagaa agggcagaag aattcccgcg 4440agaggatgaa gcgcattgag
gagggcataa aagagcttgg ctctcagatc ctcaaggagc 4500accccgtcga
gaacactcag ctgcagaacg agaagctgta cctgtactac ctccaaaacg
4560gaagggacat gtacgtggac caggagctgg acatcaacag gttgtccgac
tacgacgtcg 4620accacatcgt gcctcagtcc ttcctgaagg atgactccat
cgacaataaa gtgctgacac 4680gctccgataa aaatagaggc aagtccgaca
acgtcccctc cgaggaggtc gtgaagaaga 4740tgaaaaacta ctggagacag
ctcttgaacg ccaagctcat cacccagcgt aagttcgaca 4800acctgactaa
ggctgagaga ggaggattgt ccgagctcga taaggccgga ttcatcaaga
4860gacagctcgt cgaaacccgc caaattacca agcacgtggc ccaaattctg
gattcccgca 4920tgaacaccaa gtacgatgaa aatgacaagc tgatccgcga
ggtcaaggtg atcaccttga 4980agtccaagct ggtctccgac ttccgcaagg
acttccagtt ctacaaggtg agggagatca 5040acaactacca ccacgcacac
gacgcctacc tcaacgctgt cgttggaacc gccctcatca 5100aaaaatatcc
taagctggag tctgagttcg tctacggcga ctacaaggtg tacgacgtga
5160ggaagatgat cgctaagtct gagcaggaga tcggcaaggc caccgccaag
tacttcttct 5220actccaacat catgaacttc ttcaagaccg agatcactct
cgccaacggt gagatcagga 5280agcgcccact gatcgagacc aacggtgaga
ctggagagat cgtgtgggac aaagggaggg 5340atttcgctac tgtgaggaag
gtgctctcca tgcctcaggt gaacatcgtc aagaagaccg 5400aagttcagac
cggaggattc tccaaggagt ccatcctccc caagagaaac tccgacaagc
5460tgatcgctag aaagaaagac tgggacccta agaagtacgg aggcttcgat
tctcctaccg 5520tggcctactc tgtgctggtc gtggccaagg tggagaaggg
caagtccaag aagctgaaat 5580ccgtcaagga gctcctcggg attaccatca
tggagaggag ttccttcgag aagaacccta 5640tcgacttcct ggaggccaag
ggatataaag aggtgaagaa ggacctcatc atcaagctgc 5700ccaagtactc
cctcttcgag ttggagaacg gaaggaagag gatgctggct tctgccggag
5760agttgcagaa gggaaatgag ctcgcccttc cctccaagta cgtgaacttc
ctgtacctcg 5820cctctcacta tgaaaagttg aagggctctc ctgaggacaa
cgagcagaag cagctcttcg 5880tggagcagca caagcactac ctggacgaaa
ttatcgagca gatctctgag ttctccaagc 5940gcgtgatatt ggccgacgcc
aacctcgaca aggtgctgtc cgcctacaac aagcacaggg 6000ataagcccat
tcgcgagcag gctgaaaaca ttatccacct gtttaccctc acaaacttgg
6060gagcccctgc tgccttcaag
tacttcgaca ccaccattga caggaagaga tacacctcca 6120ccaaggaggt
gctcgacgca acactcatcc accaatccat caccggcctc tatgaaacaa
6180ggattgactt gtcccagctg ggaggcgact ctagagccga tcccaagaag
aagagaaagg 6240tgtaggttaa cctagacttg tccatcttct ggattggcca
acttaattaa tgtatgaaat 6300aaaaggatgc acacatagtg acatgctaat
cactataatg tgggcatcaa agttgtgtgt 6360tatgtgtaat tactagttat
ctgaataaaa gagaaagaga tcatccatat ttcttatcct 6420aaatgaatgt
cacgtgtctt tataattctt tgatgaacca gatgcatttc attaaccaaa
6480tccatataca tataaatatt aatcatatat aattaatatc aattgggtta
gcaaaacaaa 6540tctagtctag gtgtgttttg cgaattcgat atcaagctta
tcgataccgt cgaggggggg 6600cccggtaccg g 6611
SEQ ID NO: 301; length 6611; DNA; Artificial sequence; U6-9.1DD43CR1+EF1A2CAS9
cgcgccggta cccgggttaa gagaattgta
agtgtgcttt tatatattta aaattaatat 60attttgaaat gttaaaatat aaaagaaaat
tcaatgtaaa ttaaaaataa ataaatgttt 120aataaagata aattttaaaa
cataaaagaa aatgtctaac aagaggatta agatcctgtg 180ctcttaaatt
tttaggtgtt gaaatcttag ccatacaaaa tatattttat taaaaccaag
240catgaaaaaa gtcactaaag agctatataa ctcatgcagc tagaaatgaa
gtgaagggaa 300tccagtttgt tctcagtcga aagagtgtct atctttgttc
ttttctgcaa ccgagttaag 360caaaatggga atgcgaggta tcttcctttc
gttaggggag caccagatgc atagttagtc 420ccacattgat gaatataaca
agagcttcac agaatatata gcccaggcca cagtaaaagc 480ttgtcccttg
tacttgtacg tagttttaga gctagaaata gcaagttaaa ataaggctag
540tccgttatca acttgaaaaa gtggcaccga gtcggtgctt ttttttgcgg
ccgcaattgg 600atcgggttta cttattttgt gggtatctat acttttatta
gatttttaat caggctcctg 660atttcttttt atttcgattg aattcctgaa
cttgtattat tcagtagatc gaataaatta 720taaaaagata aaatcataaa
ataatatttt atcctatcaa tcatattaaa gcaatgaata 780tgtaaaatta
atcttatctt tattttaaaa aatcatatag gtttagtatt tttttaaaaa
840taaagatagg attagtttta ctattcactg cttattactt ttaaaaaaat
cataaaggtt 900tagtattttt ttaaaataaa tataggaata gttttactat
tcactgcttt aatagaaaaa 960tagtttaaaa tttaagatag ttttaatccc
agcatttgcc acgtttgaac gtgagccgaa 1020acgatgtcgt tacattatct
taacctagct gaaacgatgt cgtcataata tcgccaaatg 1080ccaactggac
tacgtcgaac ccacaaatcc cacaaagcgc gtgaaatcaa atcgctcaaa
1140ccacaaaaaa gaacaacgcg tttgttacac gctcaatccc acgcgagtag
agcacagtaa 1200ccttcaaata agcgaatggg gcataatcag aaatccgaaa
taaacctagg ggcattatcg 1260gaaatgaaaa gtagctcact caatataaaa
atctaggaac cctagttttc gttatcactc 1320tgtgctccct cgctctattt
ctcagtctct gtgtttgcgg ctgaggattc cgaacgagtg 1380accttcttcg
tttctcgcaa aggtaacagc ctctgctctt gtctcttcga ttcgatctat
1440gcctgtctct tatttacgat gatgtttctt cggttatgtt tttttattta
tgctttatgc 1500tgttgatgtt cggttgtttg tttcgctttg tttttgtggt
tcagtttttt aggattcttt 1560tggtttttga atcgattaat cggaagagat
tttcgagtta tttggtgtgt tggaggtgaa 1620tctttttttt gaggtcatag
atctgttgta tttgtgttat aaacatgcga ctttgtatga 1680ttttttacga
ggttatgatg ttctggttgt tttattatga atctgttgag acagaaccat
1740gatttttgtt gatgttcgtt tacactatta aaggtttgtt ttaacaggat
taaaagtttt 1800ttaagcatgt tgaaggagtc ttgtagatat gtaaccgtcg
atagtttttt tgtgggtttg 1860ttcacatgtt atcaagctta atcttttact
atgtatgcga ccatatctgg atccagcaaa 1920ggcgattttt taattccttg
tgaaactttt gtaatatgaa gttgaaattt tgttattggt 1980aaactataaa
tgtgtgaagt tggagtatac ctttaccttc ttatttggct ttgtgatagt
2040ttaatttata tgtattttga gttctgactt gtatttcttt gaattgattc
tagtttaagt 2100aatccatgga caaaaagtac tcaatagggc tcgacatagg
gactaactcc gttggatggg 2160ccgtcatcac cgacgagtac aaggtgccct
ccaagaagtt caaggtgttg ggaaacaccg 2220acaggcacag cataaagaag
aatttgatcg gtgccctcct cttcgactcc ggagagaccg 2280ctgaggctac
caggctcaag aggaccgcta gaaggcgcta caccagaagg aagaacagaa
2340tctgctacct gcaggagatc ttctccaacg agatggccaa ggtggacgac
tccttcttcc 2400accgccttga ggaatcattc ctggtggagg aggataaaaa
gcacgagaga cacccaatct 2460tcgggaacat cgtcgacgag gtggcctacc
atgaaaagta ccctaccatc taccacctga 2520ggaagaagct ggtcgactct
accgacaagg ctgacttgcg cttgatttac ctggctctcg 2580ctcacatgat
aaagttccgc ggacacttcc tcattgaggg agacctgaac ccagacaact
2640ccgacgtgga caagctcttc atccagctcg ttcagaccta caaccagctt
ttcgaggaga 2700acccaatcaa cgccagtgga gttgacgcca aggctatcct
ctctgctcgt ctgtcaaagt 2760ccaggaggct tgagaacttg attgcccagc
tgcctggcga aaagaagaac ggactgttcg 2820gaaacttgat cgctctctcc
ctgggattga ctcccaactt caagtccaac ttcgacctcg 2880ccgaggacgc
taagttgcag ttgtctaaag acacctacga cgatgacctc gacaacttgc
2940tggcccagat aggcgaccaa tacgccgatc tcttcctcgc cgctaagaac
ttgtccgacg 3000caatcctgct gtccgacatc ctgagagtca acactgagat
taccaaagct cctctgtctg 3060cttccatgat taagcgctac gacgagcacc
accaagatct gaccctgctc aaggccctgg 3120tgagacagca gctgcccgag
aagtacaagg agatcttttt cgaccagtcc aagaacggct 3180acgccggata
cattgacgga ggcgcctccc aggaagagtt ctacaagttc atcaagccca
3240tccttgagaa gatggacggt accgaggagc tgttggtgaa gttgaacaga
gaggacctgt 3300tgaggaagca gagaaccttc gacaacggaa gcatccctca
ccaaatccac ctgggagagc 3360tccacgccat cttgaggagg caggaggatt
tctatccctt cctgaaggac aaccgcgaga 3420agattgagaa gatcttgacc
ttcagaattc cttactacgt cgggccactc gccagaggaa 3480actctaggtt
cgcctggatg acccgcaaat ctgaagagac cattactccc tggaacttcg
3540aggaagtcgt ggacaagggc gcttccgctc agtctttcat cgagaggatg
accaacttcg 3600ataaaaatct gcccaacgag aaggtgctgc ccaagcactc
cctgttgtac gagtatttca 3660cagtgtacaa cgagctcacc aaggtgaagt
acgtcacaga gggaatgagg aagcctgcct 3720tcttgtccgg agagcagaag
aaggccatcg tcgacctgct cttcaagacc aacaggaagg 3780tgactgtcaa
gcagctgaag gaggactact tcaagaagat cgagtgcttc gactccgtcg
3840agatctctgg tgtcgaggac aggttcaacg cctcccttgg gacttaccac
gatctgctca 3900agattattaa agacaaggac ttcctggaca acgaggagaa
cgaggacatc cttgaggaca 3960tcgtgctcac cctgaccttg ttcgaagaca
gggaaatgat cgaagagagg ctcaagacct 4020acgcccacct cttcgacgac
aaggtgatga aacagctgaa gagacgcaga tataccggct 4080ggggaaggct
ctcccgcaaa ttgatcaacg ggatcaggga caagcagtca gggaagacta
4140tactcgactt cctgaagtcc gacggattcg ccaacaggaa cttcatgcag
ctcattcacg 4200acgactcctt gaccttcaag gaggacatcc agaaggctca
ggtgtctgga cagggtgact 4260ccttgcatga gcacattgct aacttggccg
gctctcccgc tattaagaag ggcattttgc 4320agaccgtgaa ggtcgttgac
gagctcgtga aggtgatggg acgccacaag ccagagaaca 4380tcgttattga
gatggctcgc gagaaccaaa ctacccagaa agggcagaag aattcccgcg
4440agaggatgaa gcgcattgag gagggcataa aagagcttgg ctctcagatc
ctcaaggagc 4500accccgtcga gaacactcag ctgcagaacg agaagctgta
cctgtactac ctccaaaacg 4560gaagggacat gtacgtggac caggagctgg
acatcaacag gttgtccgac tacgacgtcg 4620accacatcgt gcctcagtcc
ttcctgaagg atgactccat cgacaataaa gtgctgacac 4680gctccgataa
aaatagaggc aagtccgaca acgtcccctc cgaggaggtc gtgaagaaga
4740tgaaaaacta ctggagacag ctcttgaacg ccaagctcat cacccagcgt
aagttcgaca 4800acctgactaa ggctgagaga ggaggattgt ccgagctcga
taaggccgga ttcatcaaga 4860gacagctcgt cgaaacccgc caaattacca
agcacgtggc ccaaattctg gattcccgca 4920tgaacaccaa gtacgatgaa
aatgacaagc tgatccgcga ggtcaaggtg atcaccttga 4980agtccaagct
ggtctccgac ttccgcaagg acttccagtt ctacaaggtg agggagatca
5040acaactacca ccacgcacac gacgcctacc tcaacgctgt cgttggaacc
gccctcatca 5100aaaaatatcc taagctggag tctgagttcg tctacggcga
ctacaaggtg tacgacgtga 5160ggaagatgat cgctaagtct gagcaggaga
tcggcaaggc caccgccaag tacttcttct 5220actccaacat catgaacttc
ttcaagaccg agatcactct cgccaacggt gagatcagga 5280agcgcccact
gatcgagacc aacggtgaga ctggagagat cgtgtgggac aaagggaggg
5340atttcgctac tgtgaggaag gtgctctcca tgcctcaggt gaacatcgtc
aagaagaccg 5400aagttcagac cggaggattc tccaaggagt ccatcctccc
caagagaaac tccgacaagc 5460tgatcgctag aaagaaagac tgggacccta
agaagtacgg aggcttcgat tctcctaccg 5520tggcctactc tgtgctggtc
gtggccaagg tggagaaggg caagtccaag aagctgaaat 5580ccgtcaagga
gctcctcggg attaccatca tggagaggag ttccttcgag aagaacccta
5640tcgacttcct ggaggccaag ggatataaag aggtgaagaa ggacctcatc
atcaagctgc 5700ccaagtactc cctcttcgag ttggagaacg gaaggaagag
gatgctggct tctgccggag 5760agttgcagaa gggaaatgag ctcgcccttc
cctccaagta cgtgaacttc ctgtacctcg 5820cctctcacta tgaaaagttg
aagggctctc ctgaggacaa cgagcagaag cagctcttcg 5880tggagcagca
caagcactac ctggacgaaa ttatcgagca gatctctgag ttctccaagc
5940gcgtgatatt ggccgacgcc aacctcgaca aggtgctgtc cgcctacaac
aagcacaggg 6000ataagcccat tcgcgagcag gctgaaaaca ttatccacct
gtttaccctc acaaacttgg 6060gagcccctgc tgccttcaag tacttcgaca
ccaccattga caggaagaga tacacctcca 6120ccaaggaggt gctcgacgca
acactcatcc accaatccat caccggcctc tatgaaacaa 6180ggattgactt
gtcccagctg ggaggcgact ctagagccga tcccaagaag aagagaaagg
6240tgtaggttaa cctagacttg tccatcttct ggattggcca acttaattaa
tgtatgaaat 6300aaaaggatgc acacatagtg acatgctaat cactataatg
tgggcatcaa agttgtgtgt 6360tatgtgtaat tactagttat ctgaataaaa
gagaaagaga tcatccatat ttcttatcct 6420aaatgaatgt cacgtgtctt
tataattctt tgatgaacca gatgcatttc attaaccaaa 6480tccatataca
tataaatatt aatcatatat aattaatatc aattgggtta gcaaaacaaa
6540tctagtctag gtgtgttttg cgaattcgat atcaagctta tcgataccgt
cgaggggggg 6600cccggtaccg g 6611
SEQ ID NO: 302; length 5719; DNA; Artificial sequence; DD43HR1-SAMSHPT-DD43HR2
cgcgcctcta gttgaagaca cgttcatgtc
ttcatcgtaa gaagacactc agtagtcttc 60ggccagaatg gccatctgga ttcagcaggc
ctagaaggcc atttaaatcc tgaggatctg 120gtcttcctaa ggacccggga
tatcgctatc aactttgtat agaaaagttg ggccgaattc 180gagctcggta
cggccagaat ccggtaagtg actagggtca cgtgacccta gtcacttaaa
240ttcggccaga atggccatct ggattcagca ggcctagaag gcccggaccg
attaaacttt 300aattcggtcc gggttacctc gagatcttgt tcccctcctt
ggtttggcat aaattgattt 360tcatggctct tctcggtcga aactggagct
aattcaccct tagtctctct taaaattctg 420gctgtaagaa acaccacaga
acacataaat tataaactaa ttataatttg aagagtaaaa 480tatgttttta
ctcttatgat ttaattagtg tagttttaat tttctccttt ttttaaaaaa
540ttttggtatt cataaatttc aattttttaa aaataattgt tgttacccgt
taatgataac 600gggatatgtt atgttaccac taaatcggac aaaaaaaatt
caaaactttt ataaggatta 660aaattaacaa aaatatttta aaaaaatcta
acctcaataa agttaaattt ataagcacaa 720aataatactt ttaagcctaa
tttggcaaga cacaagcaag ctcacctgta gcattaatag 780aaaggaagca
aagcaagaga aaagcaacca gaaggaagcg tttgcttggt gacacagcca
840tcttacttga atttatggta ttactgagaa accttgatct tgcttcaaaa
tcttctagtt 900accctctttt tataggcaga aagagaacta gctagttgcc
aataggatat gaggacatgt 960ggtgcaatgc actcactctt caaggacaag
aaaaacaatg gctacaattg tggttcaaat 1020caatgtctcc tgctctgtcc
tgcctgaaaa tgacaccctt ttgcttggaa aagaggatca 1080aagctaagaa
caggagtggc ttcattccct tcatgtaacc aaacactttc gcattctgtc
1140attcgtgaat cagcaaaatc tgcaaccaaa aatatatggt gcctaaataa
aagaaataaa 1200ataatttaga gttgcggact aaaataataa acaaaagaaa
tatattataa tctagaatta 1260atttaggact aaaagaagag gcagactcca
attcctcttt tctagaatac cctccgtacg 1320tacactagtg gtcacctaag
tgactagggt cacgtgaccc tagtcactta ttcccaaaca 1380ctagtaacgg
ccgccagtgt gctggaattc gcccttccca agctttgctc tagatcaaac
1440tcacatccaa acataacatg gatatcttcc ttaccaatca tactaattat
tttgggttaa 1500atattaatca ttatttttaa gatattaatt aagaaattaa
aagatttttt aaaaaaatgt 1560ataaaattat attattcatg atttttcata
catttgattt tgataataaa tatatttttt 1620ttaatttctt aaaaaatgtt
gcaagacact tattagacat agtcttgttc tgtttacaaa 1680agcattcatc
atttaataca ttaaaaaata tttaatacta acagtagaat cttcttgtga
1740gtggtgtggg agtaggcaac ctggcattga aacgagagaa agagagtcag
aaccagaaga 1800caaataaaaa gtatgcaaca aacaaatcaa aatcaaaggg
caaaggctgg ggttggctca 1860attggttgct acattcaatt ttcaactcag
tcaacggttg agattcactc tgacttcccc 1920aatctaagcc gcggatgcaa
acggttgaat ctaacccaca atccaatctc gttacttagg 1980ggcttttccg
tcattaactc acccctgcca cccggtttcc ctataaattg gaactcaatg
2040ctcccctcta aactcgtatc gcttcagagt tgagaccaag acacactcgt
tcatatatct 2100ctctgctctt ctcttctctt ctacctctca aggtactttt
cttctccctc taccaaatcc 2160tagattccgt ggttcaattt cggatcttgc
acttctggtt tgctttgcct tgctttttcc 2220tcaactgggt ccatctagga
tccatgtgaa actctactct ttctttaata tctgcggaat 2280acgcgtttga
ctttcagatc tagtcgaaat catttcataa ttgcctttct ttcttttagc
2340ttatgagaaa taaaatcact ttttttttat ttcaaaataa accttgggcc
ttgtgctgac 2400tgagatgggg tttggtgatt acagaatttt agcgaatttt
gtaattgtac ttgtttgtct 2460gtagttttgt tttgttttct tgtttctcat
acattcctta ggcttcaatt ttattcgagt 2520ataggtcaca ataggaattc
aaactttgag caggggaatt aatcccttcc ttcaaatcca 2580gtttgtttgt
atatatgttt aaaaaatgaa acttttgctt taaattctat tataactttt
2640tttatggctg aaatttttgc atgtgtcttt gctctctgtt gtaaatttac
tgtttaggta 2700ctaactctag gcttgttgtg cagtttttga agtataacaa
cagaagttcc tattccgaag 2760ttcctattct ctagaaagta taggaacttc
cactagtcca tgaaaaagcc tgaactcacc 2820gcgacgtctg tcgagaagtt
tctgatcgaa aagttcgaca gcgtctccga cctgatgcag 2880ctctcggagg
gcgaagaatc tcgtgctttc agcttcgatg taggagggcg tggatatgtc
2940ctgcgggtaa atagctgcgc cgatggtttc tacaaagatc gttatgttta
tcggcacttt 3000gcatcggccg cgctcccgat tccggaagtg cttgacattg
gggaattcag cgagagcctg 3060acctattgca tctcccgccg tgcacagggt
gtcacgttgc aagacctgcc tgaaaccgaa 3120ctgcccgctg ttctgcagcc
ggtcgcggag gccatggatg cgatcgctgc ggccgatctt 3180agccagacga
gcgggttcgg cccattcgga ccgcaaggaa tcggtcaata cactacatgg
3240cgtgatttca tatgcgcgat tgctgatccc catgtgtatc actggcaaac
tgtgatggac 3300gacaccgtca gtgcgtccgt cgcgcaggct ctcgatgagc
tgatgctttg ggccgaggac 3360tgccccgaag tccggcacct cgtgcacgcg
gatttcggct ccaacaatgt cctgacggac 3420aatggccgca taacagcggt
cattgactgg agcgaggcga tgttcgggga ttcccaatac 3480gaggtcgcca
acatcttctt ctggaggccg tggttggctt gtatggagca gcagacgcgc
3540tacttcgagc ggaggcatcc ggagcttgca ggatcgccgc ggctccgggc
gtatatgctc 3600cgcattggtc ttgaccaact ctatcagagc ttggttgacg
gcaatttcga tgatgcagct 3660tgggcgcagg gtcgatgcga cgcaatcgtc
cgatccggag ccgggactgt cgggcgtaca 3720caaatcgccc gcagaagcgc
ggccgtctgg accgatggct gtgtagaagt actcgccgat 3780agtggaaacc
gacgccccag cactcgtccg agggcaaagg aatagtgagg tacctaaaga
3840aggagtgcgt cgaagcagat cgttcaaaca tttggcaata aagtttctta
agattgaatc 3900ctgttgccgg tcttgcgatg attatcatat aatttctgtt
gaattacgtt aagcatgtaa 3960taattaacat gtaatgcatg acgttattta
tgagatgggt ttttatgatt agagtcccgc 4020aattatacat ttaatacgcg
atagaaaaca aaatatagcg cgcaaactag gataaattat 4080cgcgcgcggt
gtcatctatg ttactagatc gatgtcgacc cgggccctag gaggccggcc
4140cagctgatga tcccggtgaa gttcctattc cgaagttcct attctccaga
aagtatagga 4200acttcactag agcttgcggc cgcgcatgct gacttaatca
gctaacgcca ctcgacctgc 4260aggcatgccc gcggatatcg atgggccccg
gccgaagctt caagtttgta caaaaaagca 4320ggctggcgcc ggaaccaatt
cagtcgactg gatccggtac cgaattcgcg gccgcactcg 4380agatatctag
acccagcttt cttgtacaaa gtggccgtta acggatcggc cagaatccgg
4440taagtgacta gggtcacgtg accctagtca cttaaattcg gccagaatgg
ccatctggat 4500tcagcaggcc tagaaggccc ggaccgatta aactttaatt
cggtccgggt tacctctaga 4560aagcttgtcg acctgcaggt acaagtacaa
gggacttgtg agttgtaagg ctgtatttac 4620aatagtgaaa agagaatcat
ctgggtgatt gggtttttag tccccagtga cgaattaaag 4680gtttgaattc
ttagtatgtt tgggaatcaa ttaggaattt cgttttggac tttccaaagc
4740aattattcac tttttcattc attaaatgtg actaaaaaat tgttatttct
ccattggcca 4800ggatgcatcg tttatataaa cataacctta gtgaaagcag
tgttttcatg tgacagcggc 4860agactatatc ttaaacaaaa ttacttgtaa
agaaagatac cgttaggaaa aaaatgaaaa 4920gaaaattgaa gctatcactt
gtttactttc ctaatatctt tcaagaatac aatgtggtga 4980atttcaattt
tccctacata tgtataccgt cagcctgacg caacttatga aacttctctt
5040tctttcattt gatgtatata taaagacaca ttatatataa agaaacttta
tatatatctc 5100catcatattt tagtacttgc tactatgtaa aattagctgt
tggaagtatc tcaagaaaca 5160tttaatttat tgaaccaagc attaaccatt
catctacatt tgagttctaa aataaatctt 5220aaatgatgtg gaggaaggga
aattgttaat tatttccctc ttctcctaca tggatatacc 5280tgaaacatgc
aatggatgga ttagatttta acatttgcag cctgagaagt tcactgactt
5340tcctccagct attttatgtg tgcccgccac catttatagc tcatgattgt
agctgaactg 5400caaaaactgc atcgattgca aactgaaatt gagaatctct
tttcaacttt atatgctgat 5460tgatgcatgc tgagcatgct atactagtac
tcgaagttcc tatatgtaga ctttgttact 5520gcctaatata ctttgtgttt
gttctcaagt tcttatttta tttcatattt tttcctataa 5580aaggttaatg
gctctataaa ggttgagtga cggatccggt cacctaagtg actagggtca
5640cgtgacccta gtcacttatt cccgggcaac tttattatac aaagttgata
gatctcgaat 5700tcattccgat taatcgtgg 57193036611DNAArtificial
sequenceU6-9.1DD43CR2+EF1A2CAS9 303cgcgccggta cccgggttaa gagaattgta
agtgtgcttt tatatattta aaattaatat 60attttgaaat gttaaaatat aaaagaaaat
tcaatgtaaa ttaaaaataa ataaatgttt 120aataaagata aattttaaaa
cataaaagaa aatgtctaac aagaggatta agatcctgtg 180ctcttaaatt
tttaggtgtt gaaatcttag ccatacaaaa tatattttat taaaaccaag
240catgaaaaaa gtcactaaag agctatataa ctcatgcagc tagaaatgaa
gtgaagggaa 300tccagtttgt tctcagtcga aagagtgtct atctttgttc
ttttctgcaa ccgagttaag 360caaaatggga atgcgaggta tcttcctttc
gttaggggag caccagatgc atagttagtc 420ccacattgat gaatataaca
agagcttcac agaatatata gcccaggcca cagtaaaagc 480ttgtattcta
gaaaagagga atgttttaga gctagaaata gcaagttaaa ataaggctag
540tccgttatca acttgaaaaa gtggcaccga gtcggtgctt ttttttgcgg
ccgcaattgg 600atcgggttta cttattttgt gggtatctat acttttatta
gatttttaat caggctcctg 660atttcttttt atttcgattg aattcctgaa
cttgtattat tcagtagatc gaataaatta 720taaaaagata aaatcataaa
ataatatttt atcctatcaa tcatattaaa gcaatgaata 780tgtaaaatta
atcttatctt tattttaaaa aatcatatag gtttagtatt tttttaaaaa
840taaagatagg attagtttta ctattcactg cttattactt ttaaaaaaat
cataaaggtt 900tagtattttt ttaaaataaa tataggaata gttttactat
tcactgcttt aatagaaaaa 960tagtttaaaa tttaagatag ttttaatccc
agcatttgcc acgtttgaac gtgagccgaa 1020acgatgtcgt tacattatct
taacctagct gaaacgatgt cgtcataata tcgccaaatg 1080ccaactggac
tacgtcgaac ccacaaatcc cacaaagcgc gtgaaatcaa atcgctcaaa
1140ccacaaaaaa gaacaacgcg tttgttacac gctcaatccc acgcgagtag
agcacagtaa 1200ccttcaaata agcgaatggg gcataatcag aaatccgaaa
taaacctagg ggcattatcg 1260gaaatgaaaa gtagctcact caatataaaa
atctaggaac cctagttttc gttatcactc 1320tgtgctccct cgctctattt
ctcagtctct gtgtttgcgg ctgaggattc cgaacgagtg 1380accttcttcg
tttctcgcaa aggtaacagc ctctgctctt gtctcttcga ttcgatctat
1440gcctgtctct tatttacgat gatgtttctt cggttatgtt tttttattta
tgctttatgc 1500tgttgatgtt cggttgtttg tttcgctttg tttttgtggt
tcagtttttt aggattcttt 1560tggtttttga atcgattaat cggaagagat
tttcgagtta tttggtgtgt tggaggtgaa 1620tctttttttt gaggtcatag
atctgttgta tttgtgttat aaacatgcga ctttgtatga 1680ttttttacga
ggttatgatg ttctggttgt tttattatga atctgttgag acagaaccat
1740gatttttgtt gatgttcgtt tacactatta aaggtttgtt ttaacaggat
taaaagtttt 1800ttaagcatgt tgaaggagtc ttgtagatat gtaaccgtcg
atagtttttt tgtgggtttg 1860ttcacatgtt atcaagctta
atcttttact atgtatgcga ccatatctgg atccagcaaa 1920ggcgattttt
taattccttg tgaaactttt gtaatatgaa gttgaaattt tgttattggt
1980aaactataaa tgtgtgaagt tggagtatac ctttaccttc ttatttggct
ttgtgatagt 2040ttaatttata tgtattttga gttctgactt gtatttcttt
gaattgattc tagtttaagt 2100aatccatgga caaaaagtac tcaatagggc
tcgacatagg gactaactcc gttggatggg 2160ccgtcatcac cgacgagtac
aaggtgccct ccaagaagtt caaggtgttg ggaaacaccg 2220acaggcacag
cataaagaag aatttgatcg gtgccctcct cttcgactcc ggagagaccg
2280ctgaggctac caggctcaag aggaccgcta gaaggcgcta caccagaagg
aagaacagaa 2340tctgctacct gcaggagatc ttctccaacg agatggccaa
ggtggacgac tccttcttcc 2400accgccttga ggaatcattc ctggtggagg
aggataaaaa gcacgagaga cacccaatct 2460tcgggaacat cgtcgacgag
gtggcctacc atgaaaagta ccctaccatc taccacctga 2520ggaagaagct
ggtcgactct accgacaagg ctgacttgcg cttgatttac ctggctctcg
2580ctcacatgat aaagttccgc ggacacttcc tcattgaggg agacctgaac
ccagacaact 2640ccgacgtgga caagctcttc atccagctcg ttcagaccta
caaccagctt ttcgaggaga 2700acccaatcaa cgccagtgga gttgacgcca
aggctatcct ctctgctcgt ctgtcaaagt 2760ccaggaggct tgagaacttg
attgcccagc tgcctggcga aaagaagaac ggactgttcg 2820gaaacttgat
cgctctctcc ctgggattga ctcccaactt caagtccaac ttcgacctcg
2880ccgaggacgc taagttgcag ttgtctaaag acacctacga cgatgacctc
gacaacttgc 2940tggcccagat aggcgaccaa tacgccgatc tcttcctcgc
cgctaagaac ttgtccgacg 3000caatcctgct gtccgacatc ctgagagtca
acactgagat taccaaagct cctctgtctg 3060cttccatgat taagcgctac
gacgagcacc accaagatct gaccctgctc aaggccctgg 3120tgagacagca
gctgcccgag aagtacaagg agatcttttt cgaccagtcc aagaacggct
3180acgccggata cattgacgga ggcgcctccc aggaagagtt ctacaagttc
atcaagccca 3240tccttgagaa gatggacggt accgaggagc tgttggtgaa
gttgaacaga gaggacctgt 3300tgaggaagca gagaaccttc gacaacggaa
gcatccctca ccaaatccac ctgggagagc 3360tccacgccat cttgaggagg
caggaggatt tctatccctt cctgaaggac aaccgcgaga 3420agattgagaa
gatcttgacc ttcagaattc cttactacgt cgggccactc gccagaggaa
3480actctaggtt cgcctggatg acccgcaaat ctgaagagac cattactccc
tggaacttcg 3540aggaagtcgt ggacaagggc gcttccgctc agtctttcat
cgagaggatg accaacttcg 3600ataaaaatct gcccaacgag aaggtgctgc
ccaagcactc cctgttgtac gagtatttca 3660cagtgtacaa cgagctcacc
aaggtgaagt acgtcacaga gggaatgagg aagcctgcct 3720tcttgtccgg
agagcagaag aaggccatcg tcgacctgct cttcaagacc aacaggaagg
3780tgactgtcaa gcagctgaag gaggactact tcaagaagat cgagtgcttc
gactccgtcg 3840agatctctgg tgtcgaggac aggttcaacg cctcccttgg
gacttaccac gatctgctca 3900agattattaa agacaaggac ttcctggaca
acgaggagaa cgaggacatc cttgaggaca 3960tcgtgctcac cctgaccttg
ttcgaagaca gggaaatgat cgaagagagg ctcaagacct 4020acgcccacct
cttcgacgac aaggtgatga aacagctgaa gagacgcaga tataccggct
4080ggggaaggct ctcccgcaaa ttgatcaacg ggatcaggga caagcagtca
gggaagacta 4140tactcgactt cctgaagtcc gacggattcg ccaacaggaa
cttcatgcag ctcattcacg 4200acgactcctt gaccttcaag gaggacatcc
agaaggctca ggtgtctgga cagggtgact 4260ccttgcatga gcacattgct
aacttggccg gctctcccgc tattaagaag ggcattttgc 4320agaccgtgaa
ggtcgttgac gagctcgtga aggtgatggg acgccacaag ccagagaaca
4380tcgttattga gatggctcgc gagaaccaaa ctacccagaa agggcagaag
aattcccgcg 4440agaggatgaa gcgcattgag gagggcataa aagagcttgg
ctctcagatc ctcaaggagc 4500accccgtcga gaacactcag ctgcagaacg
agaagctgta cctgtactac ctccaaaacg 4560gaagggacat gtacgtggac
caggagctgg acatcaacag gttgtccgac tacgacgtcg 4620accacatcgt
gcctcagtcc ttcctgaagg atgactccat cgacaataaa gtgctgacac
4680gctccgataa aaatagaggc aagtccgaca acgtcccctc cgaggaggtc
gtgaagaaga 4740tgaaaaacta ctggagacag ctcttgaacg ccaagctcat
cacccagcgt aagttcgaca 4800acctgactaa ggctgagaga ggaggattgt
ccgagctcga taaggccgga ttcatcaaga 4860gacagctcgt cgaaacccgc
caaattacca agcacgtggc ccaaattctg gattcccgca 4920tgaacaccaa
gtacgatgaa aatgacaagc tgatccgcga ggtcaaggtg atcaccttga
4980agtccaagct ggtctccgac ttccgcaagg acttccagtt ctacaaggtg
agggagatca 5040acaactacca ccacgcacac gacgcctacc tcaacgctgt
cgttggaacc gccctcatca 5100aaaaatatcc taagctggag tctgagttcg
tctacggcga ctacaaggtg tacgacgtga 5160ggaagatgat cgctaagtct
gagcaggaga tcggcaaggc caccgccaag tacttcttct 5220actccaacat
catgaacttc ttcaagaccg agatcactct cgccaacggt gagatcagga
5280agcgcccact gatcgagacc aacggtgaga ctggagagat cgtgtgggac
aaagggaggg 5340atttcgctac tgtgaggaag gtgctctcca tgcctcaggt
gaacatcgtc aagaagaccg 5400aagttcagac cggaggattc tccaaggagt
ccatcctccc caagagaaac tccgacaagc 5460tgatcgctag aaagaaagac
tgggacccta agaagtacgg aggcttcgat tctcctaccg 5520tggcctactc
tgtgctggtc gtggccaagg tggagaaggg caagtccaag aagctgaaat
5580ccgtcaagga gctcctcggg attaccatca tggagaggag ttccttcgag
aagaacccta 5640tcgacttcct ggaggccaag ggatataaag aggtgaagaa
ggacctcatc atcaagctgc 5700ccaagtactc cctcttcgag ttggagaacg
gaaggaagag gatgctggct tctgccggag 5760agttgcagaa gggaaatgag
ctcgcccttc cctccaagta cgtgaacttc ctgtacctcg 5820cctctcacta
tgaaaagttg aagggctctc ctgaggacaa cgagcagaag cagctcttcg
5880tggagcagca caagcactac ctggacgaaa ttatcgagca gatctctgag
ttctccaagc 5940gcgtgatatt ggccgacgcc aacctcgaca aggtgctgtc
cgcctacaac aagcacaggg 6000ataagcccat tcgcgagcag gctgaaaaca
ttatccacct gtttaccctc acaaacttgg 6060gagcccctgc tgccttcaag
tacttcgaca ccaccattga caggaagaga tacacctcca 6120ccaaggaggt
gctcgacgca acactcatcc accaatccat caccggcctc tatgaaacaa
6180ggattgactt gtcccagctg ggaggcgact ctagagccga tcccaagaag
aagagaaagg 6240tgtaggttaa cctagacttg tccatcttct ggattggcca
acttaattaa tgtatgaaat 6300aaaaggatgc acacatagtg acatgctaat
cactataatg tgggcatcaa agttgtgtgt 6360tatgtgtaat tactagttat
ctgaataaaa gagaaagaga tcatccatat ttcttatcct 6420aaatgaatgt
cacgtgtctt tataattctt tgatgaacca gatgcatttc attaaccaaa
6480tccatataca tataaatatt aatcatatat aattaatatc aattgggtta
gcaaaacaaa 6540tctagtctag gtgtgttttg cgaattcgat atcaagctta
tcgataccgt cgaggggggg 6600cccggtaccg g 6611
SEQ ID NO: 304; length 64; DNA; Artificial sequence; DD20 qPCR amplicon
attcggaact gacacacgac atgatggaac gtgactaagg tgggtttttg actttgcatg 60
tcga 64
SEQ ID NO: 305; length 115; DNA; Artificial sequence; DD43 qPCR amplicon
aaagaagagg cagactccaa ttcctctttt ctagaatacc ctccgtacgt acaagtacaa 60
gggacttgtg agttgtaagg ctgtatttac aatagtgaaa agagaatcat ctggg 115
SEQ ID NO: 306; length 20; DNA; Artificial sequence; primer, DD20-CR1
ggaactgaca cacgacatga 20
SEQ ID NO: 307; length 20; DNA; Artificial sequence; primer, DD20-CR2
gacatgatgg aacgtgacta 20
SEQ ID NO: 308; length 22; DNA; Artificial sequence; primer, DD20-F
attcggaact gacacacgac at 22
SEQ ID NO: 309; length 17; DNA; Artificial sequence; FAM-MGB probe, DD20-T
atggaacgtg actaagg 17
SEQ ID NO: 310; length 22; DNA; Artificial sequence; primer, DD20-R
tcgacatgca aagtcaaaaa cc 22
SEQ ID NO: 311; length 20; DNA; Artificial sequence; primer, DD43CR1
gtcccttgta cttgtacgta 20
SEQ ID NO: 312; length 20; DNA; Artificial sequence; primer, DD43CR2
gtattctaga aaagaggaat 20
SEQ ID NO: 313; length 26; DNA; Artificial sequence; primer, DD43-F
ttctagaata ccctccgtac gtacaa 26
SEQ ID NO: 314; length 26; DNA; Artificial sequence; primer, DD43-F2
aaagaagagg cagactccaa ttcctc 26
SEQ ID NO: 315; length 19; DNA; Artificial sequence; FAM-MGB probe, DD43-T
caagggactt gtgagttgt 19
SEQ ID NO: 316; length 26; DNA; Artificial sequence; primer, DD43-R
cccagatgat tctcttttca ctattg 26
SEQ ID NO: 317; length 19; DNA; Artificial sequence; primer, Cas9-F
ccttcttcca ccgccttga 19
SEQ ID NO: 318; length 19; DNA; Artificial sequence; FAM-MGB probe, Cas9-T
aatcattcct ggtggagga 19
SEQ ID NO: 319; length 21; DNA; Artificial sequence; primer, Cas9-R
tgggtgtctc tcgtgctttt t 21
SEQ ID NO: 320; length 22; DNA; Artificial sequence; primer, Sams-76F
aggcttgttg tgcagttttt ga 22
SEQ ID NO: 321; length 22; DNA; Artificial sequence; FAM-MGB probe, FRT1I-63T
tggactagtg gaagttccta ta 22
SEQ ID NO: 322; length 21; DNA; Artificial sequence; primer, FRT1I-41F
gcggtgagtt caggcttttt c 21
SEQ ID NO: 323; length 31; DNA; Artificial sequence; primer, DD20-LB
ggttatacct tcttcttagt gtggtctatc c 31
SEQ ID NO: 324; length 31; DNA; Artificial sequence; primer, Sams-A1
cccaaaataa ttagtatgat tggtaaggaa g 31
SEQ ID NO: 325; length 23; DNA; Artificial sequence; primer, QC498A-S1
ggaacttcac tagagcttgc ggc 23
SEQ ID NO: 326; length 28; DNA; Artificial sequence; primer, DD20-RB
gccattacat tcttcataag ttcctctc 28
SEQ ID NO: 327; length 26; DNA; Artificial sequence; primer, DD43-LB
gtgtagtcca ttgtagccaa gtcacc 26
SEQ ID NO: 328; length 24; DNA; Artificial sequence; primer, DD43-RB
caaaccggag agagaggaag aacc 24
SEQ ID NO: 329; length 2105; DNA; Artificial Sequence; DD20 HR1-HR2 PCR amplicon
ggttatacct tcttcttagt gtggtctatc ccctagtaat aattacacat
ctaagatatc 60cccttctttt tcaagtaaaa taatatcata tgatctcatt ttagtgaaac
aatactattt 120ccctgataac tctcttcaac attagggact tcatctaatc
atctactttc aaggtataac 180tagacgtatt tgttctttta aaaaaaacac
tagatgtact cgtcaactca aaattcatcg 240ttcatgcatt ttaattaaac
tttaattagc taatgagtag aaaaagatca tacgagtaaa 300atagaagaat
cttcctagat tttggaagaa tggattggag tgtaagtgaa ttgatccatt
360agtggaagat gctctttaca atggccaaac tgttctaatt gttagagcac
atttgagatg 420aaacacttca gtagtggagg taacctacaa tcctaggatc
tgtatcctct atcactaatg 480gagcaatggg tttgagattg acttactcct
ttccttgtct ctcgtagtgc atatgcgcac 540tttcaaaggc tacacaaaag
ccgttaactt tttgtttatt taagttacga aagatagttg 600aattagagta
aatggtgata ttgaattagg attttaaata attttaaaag aattttttta
660ataaaaaaaa tattgtgttg ttggatcaaa atttttaaat aacatgaata
aggaaatgga 720ttgcaatgag gttttaaaca attattttaa catataggat
tttagaaaga cttttataat 780attttgttga agtttagatt ttaatatatt
tatgttttaa aattttaaaa aaaacttcat 840gaatttataa tatttgaaaa
agacacgtga atatttagaa aacatttaaa attacaataa 900taaatcataa
tgagataggg tgtattcatg tgtagacgag acaccaagta tatggttcac
960aagtgaatca tctttttttt ttacagcaca agtagatcac ttgtacttat
caaaattcgg 1020aactgacaca cgacatgatg gaacgtgact aaggtgggtt
tttgactttg catgtcgaag 1080tgagagtgat tttattgaga gaataataga
agacctacaa aacaaatgat cccgacgcta 1140aagtaagtac gagagttaag
agaataaatg ggaaaatatg catacatgat taggtgtgtg 1200ttcgtctcaa
gaaagtacga atgaatatgg tgtgtttgta gtacatgaat gatgtgtttt
1260gagggttcaa gggaaattga tatttataga gtgaaatgga accagaggtc
tttgttgaca 1320agggttgtta tgactcttgc aaataattaa tagcttataa
ataatagcca ataacttatt 1380atagatagag ttagagataa tatatagcta
aatttgaaca aggcatacaa aacaaaaatg 1440ctaaatatga ataagacaat
caaaattgta gtcgatgttc aactctttgt cgttgaagaa 1500cttgtttgca
gtggtatagt aaatgggtgt gagtgcagtg tctcacccat ctcacaccac
1560acaaccaact tcatatctaa agatattgtc gctgaataca aaattgagtt
atggaatata 1620caattcataa tatagatacg aaaaatcatt tcttacaaaa
cattcaatca aaaattattc 1680aaacataatt ctagattaag taatccgaag
tacaagttag tatcctagat ccgttaattt 1740aaaattatgt ttgcataatt
ttggatttgg tgttctataa gggcacaatt ttgttcattc 1800ttacaagttt
gtcaattcta aaatatatgc aaatttgaag aaaaaaaatt tacgaatgtg
1860tctcaaacaa taacttaatg ggaggagaat gagggatgaa gaagctcaaa
attaccaacg 1920ccttctacct caagaagcta cttcacacaa aatatgactg
gcggaaggat aggggacaac 1980cgataacgag aaggagatac ataaggtaat
gtacgttgtt gtgtgaggta cacaattatg 2040gggatgaaga agttcaactt
tagtcgaaaa aatgtttgag aggaacttat gaagaatgta 2100atggc
21053301204DNAArtificial SequenceDD20 HR1-SAMS PCR amplicon
330ggttatacct tcttcttagt gtggtctatc ccctagtaat aattacacat
ctaagatatc 60cccttctttt tcaagtaaaa taatatcata tgatctcatt ttagtgaaac
aatactattt 120ccctgataac tctcttcaac attagggact tcatctaatc
atctactttc aaggtataac 180tagacgtatt tgttctttta aaaaaaacac
tagatgtact cgtcaactca aaattcatcg 240ttcatgcatt ttaattaaac
tttaattagc taatgagtag aaaaagatca tacgagtaaa 300atagaagaat
cttcctagat tttggaagaa tggattggag tgtaagtgaa ttgatccatt
360agtggaagat gctctttaca atggccaaac tgttctaatt gttagagcac
atttgagatg 420aaacacttca gtagtggagg taacctacaa tcctaggatc
tgtatcctct atcactaatg 480gagcaatggg tttgagattg acttactcct
ttccttgtct ctcgtagtgc atatgcgcac 540tttcaaaggc tacacaaaag
ccgttaactt tttgtttatt taagttacga aagatagttg 600aattagagta
aatggtgata ttgaattagg attttaaata attttaaaag aattttttta
660ataaaaaaaa tattgtgttg ttggatcaaa atttttaaat aacatgaata
aggaaatgga 720ttgcaatgag gttttaaaca attattttaa catataggat
tttagaaaga cttttataat 780attttgttga agtttagatt ttaatatatt
tatgttttaa aattttaaaa aaaacttcat 840gaatttataa tatttgaaaa
agacacgtga atatttagaa aacatttaaa attacaataa 900taaatcataa
tgagataggg tgtattcatg tgtagacgag acaccaagta tatggttcac
960aagtgaatca tctttttttt ttacagcaca agtagatcac ttgtacttat
caaaattcgg 1020aactgacaca cactagtggt cacctaagtg actagggtca
cgtgacccta gtcacttatt 1080cccaaacact agtaacggcc gccagtgtgc
tggaattcgc ccttcccaag ctttgctcta 1140gatcaaactc acatccaaac
ataacatgga tatcttcctt accaatcata ctaattattt 1200tggg
12043311459DNAArtificial SequenceDD20 NOS-HR2 PCR amplicon
331ggaacttcac tagagcttgc ggccgcgcat gctgacttaa tcagctaacg
ccactcgacc 60tgcaggcatg cccgcggata tcgatgggcc ccggccgaag cttcaagttt
gtacaaaaaa 120gcaggctggc gccggaacca attcagtcga ctggatccgg
taccgaattc gcggccgcac 180tcgagatatc tagacccagc tttcttgtac
aaagtggccg ttaacggatc ggccagaatc 240cggtaagtga ctagggtcac
gtgaccctag tcacttaaat tcggccagaa tggccatctg 300gattcagcag
gcctagaagg cccggaccga ttaaacttta attcggtccg ggttacctct
360agaaagcttg tcgacctgca gacacgacat gatggaacgt gactaaggtg
ggtttttgac 420tttgcatgtc gaagtgagag tgattttatt gagagaataa
tagaagacct acaaaacaaa 480tgatcccgac gctaaagtaa gtacgagagt
taagagaata aatgggaaaa tatgcataca 540tgattaggtg tgtgttcgtc
tcaagaaagt acgaatgaat atggtgtgtt tgtagtacat 600gaatgatgtg
ttttgagggt tcaagggaaa ttgatattta tagagtgaaa tggaaccaga
660ggtctttgtt gacaagggtt gttatgactc ttgcaaataa ttaatagctt
ataaataata 720gccaataact tattatagat agagttagag ataatatata
gctaaatttg aacaaggcat 780acaaaacaaa aatgctaaat atgaataaga
caatcaaaat tgtagtcgat gttcaactct 840ttgtcgttga agaacttgtt
tgcagtggta tagtaaatgg gtgtgagtgc agtgtctcac 900ccatctcaca
ccacacaacc aacttcatat ctaaagatat tgtcgctgaa tacaaaattg
960agttatggaa tatacaattc ataatataga tacgaaaaat catttcttac
aaaacattca 1020atcaaaaatt attcaaacat aattctagat taagtaatcc
gaagtacaag ttagtatcct 1080agatccgtta atttaaaatt atgtttgcat
aattttggat ttggtgttct ataagggcac 1140aattttgttc attcttacaa
gtttgtcaat tctaaaatat atgcaaattt gaagaaaaaa 1200aatttacgaa
tgtgtctcaa acaataactt aatgggagga gaatgaggga tgaagaagct
1260caaaattacc aacgccttct acctcaagaa gctacttcac acaaaatatg
actggcggaa 1320ggatagggga caaccgataa cgagaaggag atacataagg
taatgtacgt tgttgtgtga 1380ggtacacaat tatggggatg aagaagttca
actttagtcg aaaaaatgtt tgagaggaac 1440ttatgaagaa tgtaatggc
14593322098DNAArtificial SequenceDD43 HR1-HR2 PCR amplicon
332gtgtagtcca ttgtagccaa gtcaccaata tcttgttccc ctccttggtt
tggcataaat 60tgattttcat ggctcttctc ggtcgaaact ggagctaatt cacccttagt
ctctcttaaa 120attctggctg taagaaacac cacagaacac ataaattata
aactaattat aatttgaaga 180gtaaaatatg tttttactct tatgatttaa
ttagtgtagt tttaattttc tccttttttt 240aaaaaatttt ggtattcata
aatttcaatt ttttaaaaat aattgttgtt acccgttaat 300gataacggga
tatgttatgt taccactaaa tcggacaaaa aaaattcaaa acttttataa
360ggattaaaat taacaaaaat attttaaaaa aatctaacct caataaagtt
aaatttataa 420gcacaaaata atacttttaa gcctaatttg gcaagacaca
agcaagctca cctgtagcat 480taatagaaag gaagcaaagc aagagaaaag
caaccagaag gaagcgtttg cttggtgaca 540cagccatctt acttgaattt
atggtattac tgagaaacct tgatcttgct tcaaaatctt 600ctagttaccc
tctttttata ggcagaaaga gaactagcta gttgccaata ggatatgagg
660acatgtggtg caatgcactc actcttcaag gacaagaaaa acaatggcta
caattgtggt 720tcaaatcaat gtctcctgct ctgtcctgcc tgaaaatgac
acccttttgc ttggaaaaga 780ggatcaaagc taagaacagg agtggcttca
ttcccttcat gtaaccaaac actttcgcat 840tctgtcattc gtgaatcagc
aaaatctgca accaaaaata tatggtgcct aaataaaaga 900aataaaataa
tttagagttg cggactaaaa taataaacaa aagaaatata ttataatcta
960gaattaattt aggactaaaa gaagaggcag actccaattc ctcttttcta
gaataccctc 1020cgtacgtaca agtacaaggg acttgtgagt tgtaaggctg
tatttacaat agtgaaaaga 1080gaatcatctg ggtgattggg tttttagtcc
ccagtgacga attaaaggtt tgaattctta 1140gtatgtttgg gaatcaatta
ggaatttcgt tttggacttt ccaaagcaat tattcacttt 1200ttcattcatt
aaatgtgact aaaaaattgt tatttctcca ttggccagga tgcatcgttt
1260atataaacat aaccttagtg aaagcagtgt tttcatgtga cagcggcaga
ctatatctta 1320aacaaaatta cttgtaaaga aagataccgt taggaaaaaa
atgaaaagaa aattgaagct 1380atcacttgtt tactttccta atatctttca
agaatacaat gtggtgaatt tcaattttcc 1440ctacatatgt ataccgtcag
cctgacgcaa cttatgaaac ttctctttct ttcatttgat 1500gtatatataa
agacacatta tatataaaga aactttatat atatctccat catattttag
1560tacttgctac tatgtaaaat tagctgttgg aagtatctca agaaacattt
aatttattga 1620accaagcatt aaccattcat ctacatttga gttctaaaat
aaatcttaaa tgatgtggag 1680gaagggaaat tgttaattat ttccctcttc
tcctacatgg atatacctga aacatgcaat 1740ggatggatta gattttaaca
tttgcagcct gagaagttca ctgactttcc tccagctatt 1800ttatgtgtgc
ccgccaccat ttatagctca tgattgtagc tgaactgcaa aaactgcatc
1860gattgcaaac tgaaattgag aatctctttt caactttata tgctgattga
tgcatgctga 1920gcatgctata ctagtactcg aagttcctat atgtagactt
tgttactgcc taatatactt 1980tgtgtttgtt ctcaagttct tattttattt
catatttttt cctataaaag gttaatggct 2040ctataaaggt tgagtgacat
atatatacta taaaggttct tcctctctct ccggtttg 20983331202DNAArtificial
SequenceDD43 HR1-SAMS PCR amplicon 333gtgtagtcca ttgtagccaa
gtcaccaata tcttgttccc ctccttggtt tggcataaat 60tgattttcat ggctcttctc
ggtcgaaact ggagctaatt cacccttagt ctctcttaaa 120attctggctg
taagaaacac cacagaacac ataaattata aactaattat aatttgaaga
180gtaaaatatg tttttactct tatgatttaa ttagtgtagt tttaattttc
tccttttttt 240aaaaaatttt ggtattcata aatttcaatt ttttaaaaat aattgttgtt
acccgttaat 300gataacggga tatgttatgt taccactaaa tcggacaaaa
aaaattcaaa acttttataa 360ggattaaaat taacaaaaat attttaaaaa
aatctaacct caataaagtt aaatttataa 420gcacaaaata atacttttaa
gcctaatttg gcaagacaca agcaagctca cctgtagcat 480taatagaaag
gaagcaaagc aagagaaaag caaccagaag gaagcgtttg cttggtgaca
540cagccatctt acttgaattt atggtattac tgagaaacct tgatcttgct
tcaaaatctt 600ctagttaccc tctttttata ggcagaaaga gaactagcta
gttgccaata ggatatgagg 660acatgtggtg caatgcactc actcttcaag
gacaagaaaa acaatggcta caattgtggt 720tcaaatcaat gtctcctgct
ctgtcctgcc tgaaaatgac acccttttgc ttggaaaaga 780ggatcaaagc
taagaacagg agtggcttca ttcccttcat gtaaccaaac actttcgcat
840tctgtcattc gtgaatcagc aaaatctgca accaaaaata tatggtgcct
aaataaaaga 900aataaaataa tttagagttg cggactaaaa taataaacaa
aagaaatata ttataatcta 960gaattaattt aggactaaaa gaagaggcag
actccaattc ctcttttcta gaataccctc 1020cgtacgtaca ctagtggtca
cctaagtgac tagggtcacg tgaccctagt cacttattcc 1080caaacactag
taacggccgc cagtgtgctg gaattcgccc ttcccaagct ttgctctaga
1140tcaaactcac atccaaacat aacatggata tcttccttac caatcatact
aattattttg 1200gg 12023341454DNAArtificial SequenceDD43 NOS-HR2 PCR
PCR amplicon 334ggaacttcac tagagcttgc ggccgcgcat gctgacttaa
tcagctaacg ccactcgacc 60tgcaggcatg cccgcggata tcgatgggcc ccggccgaag
cttcaagttt gtacaaaaaa 120gcaggctggc gccggaacca attcagtcga
ctggatccgg taccgaattc gcggccgcac 180tcgagatatc tagacccagc
tttcttgtac aaagtggccg ttaacggatc ggccagaatc 240cggtaagtga
ctagggtcac gtgaccctag tcacttaaat tcggccagaa tggccatctg
300gattcagcag gcctagaagg cccggaccga ttaaacttta attcggtccg
ggttacctct 360agaaagcttg tcgacctgca ggtacaagta caagggactt
gtgagttgta aggctgtatt 420tacaatagtg aaaagagaat catctgggtg
attgggtttt tagtccccag tgacgaatta 480aaggtttgaa ttcttagtat
gtttgggaat caattaggaa tttcgttttg gactttccaa 540agcaattatt
cactttttca ttcattaaat gtgactaaaa aattgttatt tctccattgg
600ccaggatgca tcgtttatat aaacataacc ttagtgaaag cagtgttttc
atgtgacagc 660ggcagactat atcttaaaca aaattacttg taaagaaaga
taccgttagg aaaaaaatga 720aaagaaaatt gaagctatca cttgtttact
ttcctaatat ctttcaagaa tacaatgtgg 780tgaatttcaa ttttccctac
atatgtatac cgtcagcctg acgcaactta tgaaacttct 840ctttctttca
tttgatgtat atataaagac acattatata taaagaaact ttatatatat
900ctccatcata ttttagtact tgctactatg taaaattagc tgttggaagt
atctcaagaa 960acatttaatt tattgaacca agcattaacc attcatctac
atttgagttc taaaataaat 1020cttaaatgat gtggaggaag ggaaattgtt
aattatttcc ctcttctcct acatggatat 1080acctgaaaca tgcaatggat
ggattagatt ttaacatttg cagcctgaga agttcactga 1140ctttcctcca
gctattttat gtgtgcccgc caccatttat agctcatgat tgtagctgaa
1200ctgcaaaaac tgcatcgatt gcaaactgaa attgagaatc tcttttcaac
tttatatgct 1260gattgatgca tgctgagcat gctatactag tactcgaagt
tcctatatgt agactttgtt 1320actgcctaat atactttgtg tttgttctca
agttcttatt ttatttcata ttttttccta 1380taaaaggtta atggctctat
aaaggttgag tgacatatat atactataaa ggttcttcct 1440ctctctccgg tttg
145433560DNAGlycine maxsoybean genomic DD20CR1 target
region(1)..(60) 335acttgtactt atcaaaattc ggaactgaca cacgacatga
tggaacgtga ctaaggtggg 6033659DNAArtificial sequencesequence of
gRNA/Cas9 system mediated NHEJ 336acttgtactt atcaaaattc ggaactgaca
cacgactgat ggaacgtgac taaggtggg 5933759DNAArtificial
sequencesequence of gRNA/Cas9 system mediated NHEJ 337acttgtactt
atcaaaattc ggaactgaca cacgaatgat ggaacgtgac taaggtggg
5933858DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 338acttgtactt atcaaaattc ggaactgaca cacgatgatg gaacgtgact
aaggtggg 5833958DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 339acttgtactt atcaaaattc ggaactgaca cacgacgatg
gaacgtgact aaggtggg 5834058DNAArtificial sequencesequence of
gRNA/Cas9 system mediated NHEJ 340acttgtactt atcaaaattc ggaactgaca
cacggtgatg gaacgtgact aaggtggg 5834157DNAArtificial
sequencesequence of gRNA/Cas9 system mediated NHEJ 341acttgtactt
atcaaaattc ggaactgaca cacatgatgg aacgtgacta aggtggg
5734257DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 342acttgtactt atcaaaattc ggaactgaca cacgtgatgg aacgtgacta
aggtggg 5734356DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 343acttgtactt atcaaaattc ggaactgaca cactgatgga
acgtgactaa ggtggg 5634456DNAArtificial sequencesequence of
gRNA/Cas9 system mediated NHEJ 344acttgtactt atcaaaattc ggaactgaca
cacggatgga acgtgactaa ggtggg 5634555DNAArtificial sequencesequence
of gRNA/Cas9 system mediated NHEJ 345acttgtactt atcaaaattc
ggaactgaca cacgatggaa cgtgactaag gtggg 5534655DNAArtificial
sequencesequence of gRNA/Cas9 system mediated NHEJ 346acttgtactt
atcaaaattc ggaactgaca catgatggaa cgtgactaag gtggg
5534754DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 347acttgtactt atcaaaattc ggaactgaca cacatggaac gtgactaagg tggg
5434854DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 348acttgtactt atcaaaattc ggaactgaca ctgatggaac gtgactaagg tggg
5434953DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 349acttgtactt atcaaaattc ggaactgaca cgatggaacg tgactaaggt ggg
5335051DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 350acttgtactt atcaaaattc ggaactgatg atggaacgtg actaaggtgg g
5135150DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 351acttgtactt atcaaaattc ggaactgaca tggaacgtga ctaaggtggg
5035250DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 352acttgtactt atcaaaattc ggaactgtga tggaacgtga ctaaggtggg
5035351DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 353acttgtactt atcaaaattc ggaactgaca cacgaacgtg actaaggtgg g
5135450DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 354acttgtactt atcaaaattc ggaactgaca cggaacgtga ctaaggtggg
5035549DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 355acttgtacct atcaaaattc ggaactgaat ggaacgtgac taaggtggg
4935648DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 356acttgtactt atcaaaattc ggaactgatg gaacgtgact aaggtggg
4835746DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 357acttgtactt atcaaaattc ggaactgaga acgtgactaa ggtggg
4635838DNAArtificial sequencesequence of gRNA/Cas9 system mediated
target site modification 358acttgtactt atcaaaattc ggaactgaca
cacgacat 3835938DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 359acttgtactt atcaaaattc ggaactgaca aaggtggg
3836039DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 360acttgtactt atcaaaattc ggaacgtgac taaggtggg
3936124DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 361actatggaac gtgactaagg tggg 24362211DNAArtificial
sequencesequence of gRNA/Cas9 system mediated target site
modification 362acttgtactt atcaaaattc ggaactgaca cacggccggt
gatggattgg tggatgagtg 60ttgcgtcgag cacctccttg gtggaggtgt atctcttcct
gtcaatggtg gtgtcgaagt 120acttgaaggc agcaggggct cccaagtttg
tgagggtaaa caggtggata atgttttcag 180cctgctcgcg atggaacgtg
actaaggtgg g 21136390DNAArtificial sequencesequence of gRNA/Cas9
system mediated target site modification 363acttgtactt atcaaaaact
acttgtgctg taaaaaaaaa gaggaacaat cttcactcat 60caataagtga tggaacgcga
ctaaggtggg 9036457DNAGlycine maxsoybean genomic DD20CR2 target
region(1)..(57) 364gacacacgac atgatggaac gtgactaagg tgggtttttg
actttgcatg tcgaagt 5736561DNAArtificial sequencesequence of
gRNA/Cas9 system mediated target site modification 365actgacacac
gacatgatgg aacgtgaact aaggtgggtt tttgactttg catgtcgaag 60t
6136659DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 366actgacacac gacatgatgg aacgtactaa ggtgggtttt tgactttgca
tgtcgaagt 5936758DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 367actgacacac gacatgatgg aacgtctaag gtgggttttt
gactttgcat gtcgaagt 5836858DNAArtificial sequencesequence of
gRNA/Cas9 system mediated NHEJ 368actgacacac gacatgatgg aacgtgaaag
gtgggttttt gactttgcat gtcgaagt 5836957DNAArtificial
sequencesequence of gRNA/Cas9 system mediated NHEJ 369actgacacac
gacatgatgg aacgctaagg tgggtttttg actttgcatg tcgaagt
5737057DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 370actgacacac gacatgatgg aacgtgaagg tgggtttttg actttgcatg
tcgaagt 5737156DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 371actgacacac gacatgatgg aacgtgaggt gggtttttga
ctttgcatgt cgaagt 5637256DNAArtificial sequencesequence of
gRNA/Cas9 system mediated NHEJ 372actgacacac gacatgatgg aacgtaaggt
gggtttttga ctttgcatgt cgaagt 5637356DNAArtificial sequencesequence
of gRNA/Cas9 system mediated NHEJ 373actgacacac gacatgatgg
aacctaaggt gggtttttga ctttgcatgt cgaagt 5637456DNAArtificial
sequencesequence of gRNA/Cas9 system mediated NHEJ 374actgacacac
gacatgatgg aacgtgaggt gggtttttga ctttgcatgt cgaagt
5637555DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 375actgacacac gacatgatgg aactaaggtg ggtttttgac tttgcatgtc
gaagt 5537654DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 376actgacacac gacatgatgg aataaggtgg gtttttgact
ttgcatgtcg aagt 5437753DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 377actgacacac gacatgatgg ctaaggtggg tttttgactt
tgcatgtcga agt 5337853DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 378actgacacac gacatgatgg ataaggtggg tttttgactt
tgcatgtcga agt 5337951DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 379actgacacac gacatgatgg aaggtgggtt tttgactttg
catgtcgaag t 5138050DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 380actgacacac gacatgatgg aggtgggttt ttgactttgc
atgtcgaagt 5038144DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 381actgacacac gacatgatgg gtttttgact ttgcatgtcg
aagt 4438243DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 382actgacacac gacaggtggg tttttgactt tgcatgtcga agt
4338340DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 383actgacacta aggtgggttt ttgactttgc atgtcgaagt
4038425DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 384actgacacac gacatgatgg aacgt 2538520DNAArtificial
sequencesequence of gRNA/Cas9 system mediated NHEJ 385actgacacac
gacatgatgg 2038660DNAGlycine maxsoybean genomic DD43CR1 target
region(1)..(60) 386agccttacaa ctcacaagtc ccttgtactt gtacgtacgg
agggtattct agaaaagagg 6038759DNAArtificial sequencesequence of
gRNA/Cas9 system mediated NHEJ 387agccttacaa ctcacaagtc ccttgtactt
gtactacgga gggtattcta gaaaagagg 5938859DNAArtificial
sequencesequence of gRNA/Cas9 system mediated NHEJ 388agccttacaa
ctcacaagtc ccttgtactt gtagtacgga gggtattcta gaaaagagg
5938958DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 389agccttacaa ctcacaagtc ccttgtactt gtgtacggag ggtattctag
aaaagagg 5839058DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 390agccttacaa ctcacaagtc ccttgtactt gcgtacggag
ggtattctag aaaagagg 5839157DNAArtificial sequencesequence of
gRNA/Cas9 system mediated NHEJ 391agccttacaa ctcacaagtc ccttgtactt
ggtacggagg gtattctaga aaagagg 5739257DNAArtificial sequencesequence
of gRNA/Cas9 system mediated NHEJ 392agccttacaa ctcacaagtc
ccttgtactt gttacggagg gtattctaga aaagagg 5739356DNAArtificial
sequencesequence of gRNA/Cas9 system mediated NHEJ 393agccttacaa
ctcacaagtc ccttgtactt gtacggaggg tattctagaa aagagg
5639455DNAArtificial sequencesequence of gRNA/Cas9 system mediated
NHEJ 394agccttacaa ctcacaagtc ccttgtactt tacggagggt attctagaaa
agagg 5539555DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 395agccttacaa ctcacaagtc ccttgtactg tacggagggt
attctagaaa agagg 5539654DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 396agccttacaa ctcacaagcc ccttgtactt acggagggta
ttctagaaaa gagg 5439752DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 397agccttacaa ctcacaagtc ccttgtatac ggagggtatt
ctagaaaaga gg 5239852DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 398agccttacaa ctcacaagtc ccttgtgtac ggagggtatt
ctagaaaaga gg 5239950DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 399agccttacaa ctcacaagtc ccttgtacgg agggtattct
agaaaagagg 5040049DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 400agccttacaa ctcacaagtc cctttacgga gggtattcta
gaaaagagg 4940148DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 401agccttacaa ctcacaagtc ccttacggag ggtattctag
aaaagagg 4840247DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 402agccttacaa ctcacaagtc cctacggagg gtattctaga
aaagagg 4740343DNAArtificial sequencesequence of gRNA/Cas9 system
mediated NHEJ 403agccttacaa ctcacaagtc ccttgtactt gtaagaaaag agg
4340449DNAArtificial sequencesequence of gRNA/Cas9 system mediated
target site modification 404agccttacaa ctcacaagtc ctaaattaaa
ggttattcta gaaaagagg 49405227DNAArtificial sequencesequence of
gRNA/Cas9 system mediated target site modification 405agccttacaa
ctcacaagtc ccttgtactt gtagaatcca gttcataaaa caagtgacac 60acaacagata
tgaactggac tacgtcgaac ccacaaatcc cacaaagcgc gtgaaatcaa
120atcgctcaaa ccacaaaaaa gaacaacgcg tttgttacac gctaatacca
aaattatacc 180caaatcttaa gctatttatg cgtacggagg gtattctaga aaagagg
22740697DNAArtificial sequencesequence of gRNA/Cas9 system mediated
target site modification 406agccttacaa ctcacaagtc ccttgtactt
gtaatgctcc cctctaaact cgtatcgctt 60cagagttgag agtacggagg gtattctaga
aaagagg 97407183DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 407agccttacaa ctcacaagtc
ccttgtatat agatacccac aaaataagta aacccgatcc 60aaaatcttaa atgatgtgga
ggaagggaaa ttgttaatta ttcccctctt ctcctacatg 120gatatacctg
aaacatgcaa tggatggatt agattttgta cggagggtat tctagaaaag 180agg
183408234DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 408agccttacaa ctcacaagtc
ccttgtactt gtaccagggg atgtttttta tttacattca 60cgtcttttgg aaagagccgc
taaattaagt tctcagttag gcgaaggaag tatgactgct 120ttaccaatag
ttgaaactca atcgggagat gtttcagctt atattcctac taatgtaatt
180tccattacag atggccaaat attcttacgt acggagggta ttctagaaaa gagg
234409280DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 409agccttacaa ctcacaagtc
ccttgtactt gtaccgaaaa tttcagccat aaaaaaagtt 60ataatagaat ttaaagcaaa
agtttcattt tttaaacata tatactgaca cgctccgata 120aaaatagagg
caagtccgac aacgtcccct ccgaggaggt cgtgaagaag atgaaaaact
180actggagaca gctcttgaac gccaagctca tcacccagcg taagctcgac
aacctgacta 240aggctgagag aggtgtacgg agggtattct agaaaagagg
280410250DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 410agccttacaa ctcacaagtc
ccttgtactt gtactggatt tggtgaggga tgcttccgtt 60gtcgaaggtt ctctgcttcc
tcaacaggtc ctctctgttc aacttcacca acagctcctc 120ggtaccgtcc
atcttctcaa ggatgaagat cgagtgcttc gactccgtcg agatctctgg
180tgtcgaggac aggttcaacg cctcccttgg gacttgccac gatcgtacgg
agggtattct 240agaaaagagg 250411161DNAArtificial sequencesequence of
gRNA/Cas9 system mediated target site modification 411agccttacaa
ctcacaagtc ccttatgacc tcaaaaaaaa gattcacctc caacacacca 60aataactcga
aaatctcttt cctattctct agaaagtata ggaacttcca ctagtccatg
120aaaaagcctg aactcgtacg gagggtattc tagaaaagag g
161412185DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 412agccttacaa ctcacaagtc
ccttgtactt gtacacctgg ggcatggaga gcaccttcct 60cacagtagcg aaatccctcc
ctttgtccca cacgatctct ccagtctcac cgttggtctc 120gatcagtggg
cgcttcctga tctcaccgtt ggcgagagtg tacggagggt attctagaaa 180agagg
185413212DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 413agccttacaa ctcacaagtc
ccttgtactt gtgctaggtt agccgaaaga tggttatcgg 60ttcaaggacg caaggtgccc
ctgctttttc agggtaataa ggggtagaga aaatgcctcg 120agccaaagtt
cgagtaccag gcgctacagc gctgaagtaa tccatgccat actcccagga
180aaagccgtac ggagggtatt ctagaaaaga gg 212414231DNAArtificial
sequencesequence of gRNA/Cas9 system mediated target site
modification 414agccttacaa ctcacaagtc ccttgtactt gtactcaagt
tcttatttta tttcatattt 60tttcctataa aaggttaatg gctctataaa ggttgagtga
cggatccggt cacctaagtg 120actagggtca cgtgacccta gtcacttatt
cccgggcaac tttattatac aaagttgata 180gatctcgaat tcattccgat
taatcgtggc gagggtattc tagaaaagag g 23141598DNAArtificial
sequenceLIGCas-1 mutation 1 415tcctctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcccc ggtcggagga 60tatatatacc tcacacgtac gcgtacgcgt
atatatac 9841698DNAArtificial sequenceLIGCas-1 mutation 2
416tcctctgtaa cgatttacgc acctgctggg aattgtaccg tacgtgcccc
ggacggagga 60tatatatacc tcacacgtac gcgtacgcgt atatatac
9841798DNAArtificial sequenceLIGCas-1 mutation 3 417tcctctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcccc gggcggagga 60tatatatacc tcacacgtac gcgtacgcgt
atatatac 9841898DNAArtificial sequenceLIGCas-1 mutation 4
418tcctctgtaa cgatttacgc acctgctggg aattgtaccg tacgtgcccc
ggccggagga 60tatatatacc tcacacgtac gcgtacgcgt atatatac
9841999DNAArtificial sequenceLIGCas-1 mutation 5 419tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggatcggagg 60atatatatac
ctcacacgta cgcgtacgcg tatatatac 9942094DNAArtificial
sequenceLIGCas-1 mutation 6 420tcctctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcccc ggaggatata 60tatacctcac acgtacgcgt acgcgtatat
atac 9442181DNAArtificial sequenceLIGCas-1 mutation 7 421tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggttcacacg 60tacgcgtacg
cgtatatata c 8142265DNAArtificial sequenceLIGCas-1 mutation 8
422tcctctgtaa cgatttacgc acctgctggg aattgtaccg tacgtacgcg
tacgcgtata 60tatac 6542399DNAArtificial sequenceLIGCas-1 mutation 9
423tcctctgtaa cgatttacgc acctgctggg aattgtaccg tacgtgcccc
ggttcggagg 60atatatatac ctcacacgta cgcgtacgcg tatatatac
9942495DNAArtificial sequenceLIGCas-1 mutation 10 424tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc cggaggatat 60atatacctca
cacgtacgcg tacgcgtata tatac 9542598DNAArtificial sequenceLIGCas-2
mutation 1 425gaagctgtaa cgatttacgc acctgctggg aattgtaccg
tacgtgaccc cggcggagga 60tatatatacc tcacacgtac gcgtacgcgt atatatac
9842698DNAArtificial sequenceLIGCas-2 mutation 2 426gaagctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgtccc cggcggagga 60tatatatacc
tcacacgtac gcgtacgcgt atatatac 9842796DNAArtificial
sequenceLIGCas-2 mutation 3 427gaagctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtccccg gcggaggata 60tatatacctc acacgtacgc gtacgcgtat
atatac 9642898DNAArtificial sequenceLIGCas-2 mutation 4
428gaagctgtaa cgatttacgc acctgctggg aattgtaccg tacgtggccc
cggcggagga 60tatatatacc tcacacgtac gcgtacgcgt atatatac
9842999DNAArtificial sequenceLIGCas-2 mutation 5 429gaagctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcacc ccggcggagg 60atatatatac
ctcacacgta cgcgtacgcg tatatatac 9943087DNAArtificial
sequenceLIGCas-2 mutation 6 430gaagctgtaa cgatttacgc acctgctggg
aattgtaccc ggcggaggat atatatacct 60cacacgtacg cgtacgcgta tatatac
8743192DNAArtificial sequenceLIGCas-2 mutation 7 431gaagctgtaa
cgatttacgc acctgctggg aattgtaccg tccccggcgg aggatatata 60tacctcacac
gtacgcgtac gcgtatatat ac 9243294DNAArtificial sequenceLIGCas-2
mutation 8 432gaagctgtaa cgatttacgc acctgctggg aattgtaccg
tacccccggc ggaggatata 60tatacctcac acgtacgcgt acgcgtatat atac
9443395DNAArtificial sequenceLIGCas-2 mutation 9 433gaagctgtaa
cgatttacgc acctgctggg aattgtaccg tacgccccgg cggaggatat 60atatacctca
cacgtacgcg tacgcgtata tatac 9543488DNAArtificial sequenceLIGCas-2
mutation 10 434gaagctgtaa cgatttacgc acctgctggg aattgtaccc
cggcggagga tatatatacc 60tcacacgtac gcgtacgcgt atatatac
8843598DNAArtificial sequenceLIGCas-3 mutation 1 435aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgcgtacgtt gtgaggtata 60tatatcctcc
gccggggcac gtacggtaca attcccag 9843696DNAArtificial
sequenceLIGCas-3 mutation 2 436aaggcgcaaa tgagtagcag cgcacgtata
tatacgcgta cgcgtacggt gaggtatata 60tatcctccgc cggggcacgt acggtacaat
tcccag 9643795DNAArtificial sequenceLIGCas-3 mutation 3
437aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtacgtg
aggtatatat 60atcctccgcc ggggcacgta cggtacaatt cccag
9543896DNAArtificial sequenceLIGCas-3 mutation 4 438aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgcgtactgt gaggtatata 60tatcctccgc
cggggcacgt acggtacaat tcccag 9643968DNAArtificial sequenceLIGCas-3
mutation 5 439aaggcgcaaa tgagtagcag cgcacgtata tatatcctcc
gccggggcac gtacggtaca 60attcccag 6844093DNAArtificial
sequenceLIGCas-3 mutation 6 440aaggcgcaaa tgagtagcag cgcacgtata
tatacgcgta cgcgtgtgag gtatatatat 60cctccgccgg ggcacgtacg gtacaattcc
cag 9344189DNAArtificial sequenceLIGCas-3 mutation 7 441aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgtgaggtat atatatcctc 60cgccggggca
cgtacggtac aattcccag 8944289DNAArtificial sequenceLIGCas-3 mutation
8 442aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtactat
atatatcctc 60cgccggggca cgtacggtac aattcccag 8944394DNAArtificial
sequenceLIGCas-3 mutation 9 443aaggcgcaaa tgagtagcag cgcacgtata
tatacgcgta cgcgtacgga ggtatatata 60tcctccgccg gggcacgtac ggtacaattc
ccag 9444496DNAArtificial sequenceLIGCas-3 mutation 10 444aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgcgtacgat gaggtatata 60tatcctccgc
cggggcacgt acggtacaat tcccag 964451051DNAArtificial
sequenceLIGCas-1_crRNA_Expression_Cassette 445tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat
tcctagccgg attacttctc taatttatat agattttgat gagctggaat
180gaatcctggc ttattccggt acaaccgaac aggccctgaa ggataccagt
aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag
gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct
gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga gggggcatca
480aagatctggc tgtgtttcca gctgtttttg ttagccccat cgaatccttg
acataatgat 540cccgcttaaa taagcaacct cgcttgtata gttccttgtg
ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt
acagtaggac acagtgtcag cgccgcgttg acggagaata tttgcaaaaa
780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact tcgtcacaga
gagggccata 840agaaacatgg cccacggccc aatacgaagc accgcgacga
agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt gtaccgtacg tgccccggcg 1020ggttttagag
ctatgctgtt ttgttttttt t 10514461051DNAArtificial
sequenceLIGCas-2_crRNA_Expression_Cassette 446tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat
tcctagccgg attacttctc taatttatat agattttgat gagctggaat
180gaatcctggc ttattccggt acaaccgaac aggccctgaa ggataccagt
aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag
gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct
gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga gggggcatca
480aagatctggc tgtgtttcca gctgtttttg ttagccccat cgaatccttg
acataatgat 540cccgcttaaa taagcaacct cgcttgtata gttccttgtg
ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt
acagtaggac acagtgtcag cgccgcgttg acggagaata tttgcaaaaa
780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact tcgtcacaga
gagggccata 840agaaacatgg cccacggccc aatacgaagc accgcgacga
agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt ggaattgtac cgtacgtgcc 1020cgttttagag
ctatgctgtt ttgttttttt t 10514471047DNAArtificial
sequenceLIGCas-3_crRNA_Expression_Cassette 447tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat
tcctagccgg attacttctc taatttatat agattttgat gagctggaat
180gaatcctggc ttattccggt acaaccgaac aggccctgaa ggataccagt
aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag
gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct
gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga gggggcatca
480aagatctggc tgtgtttcca gctgtttttg ttagccccat cgaatccttg
acataatgat 540cccgcttaaa taagcaacct cgcttgtata gttccttgtg
ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt
acagtaggac acagtgtcag cgccgcgttg acggagaata tttgcaaaaa
780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact tcgtcacaga
gagggccata 840agaaacatgg cccacggccc aatacgaagc accgcgacga
agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt gcgtacgcgt acgtgtggtt 1020ttagagctat
gctgttttgt ttttttt 1047
448 1087 DNA Artificial sequence tracrRNA_Expression_Cassette
448 tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat
tcctagccgg attacttctc taatttatat agattttgat gagctggaat
180gaatcctggc ttattccggt acaaccgaac aggccctgaa ggataccagt
aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag
gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct
gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga gggggcatca
480aagatctggc tgtgtttcca gctgtttttg ttagccccat cgaatccttg
acataatgat 540cccgcttaaa taagcaacct cgcttgtata gttccttgtg
ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt
acagtaggac acagtgtcag cgccgcgttg acggagaata tttgcaaaaa
780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact tcgtcacaga
gagggccata 840agaaacatgg cccacggccc aatacgaagc accgcgacga
agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt ggaaccattc aaaacagcat 1020agcaagttaa
aataaggcta gtccgttatc aacttgaaaa agtggcaccg agtcggtgct 1080ttttttt
1087
449 63 DNA Artificial Sequence LIGCas-2 forward primer for primary
449 ctacactctt tccctacacg acgctcttcc gatctgaagc tgtaacgatt tacgcacctg 60
ctg 63
450 60 DNA Artificial Sequence LIGCas-3 forward primer for primary PCR
450 ctacactctt tccctacacg acgctcttcc gatctttccc gcaaatgagt agcagcgcac 60
451 19 DNA Artificial sequence Zm-ARGOS8-CTS1, Cas9 target sequence 1
451 gcgtgcatcg atccatcgc 19
452 21 DNA Artificial sequence Zm-ARGOS8-CTS2, Cas9 target sequence 2
452 ggctacggat agatatgatg c 21
453 20 DNA Artificial sequence Zm-ARGOS8-CTS3, Cas9 target sequence 3
453 gttacttctc taagcacggc 20
454 21 DNA Artificial sequence P1, Forward_primer
454 gcgccattcc ctaaaggtaa c 21
455 23 DNA Artificial sequence P2, Reverse_primer
455 gctaatcgta agtgacgctt gga 23
456 31 DNA Artificial sequence P3, Forward_primer
456 gctcgtgtcc aagcgtcact tacgattagc t 31
457 20 DNA Artificial sequence P4, Reverse_primer
457 ctgcgaactg cttgattccg 20
458 24 DNA Artificial sequence P5, Forward_primer
458 accgtcctta tctctgcatc atct 24
459 31 DNA Artificial sequence PBS, Primer Binding Site
459 gctcgtgtcc aagcgtcact tacgattagc t 31
460 1823 DNA Zea mays misc_feature (1)..(1823) Zm-GOS2 PRO-GOS2 INTRON, maize GOS2 promoter and GOS2 intron1 including the promoter, 5'-UTR1, INTRON1 and 5'-UTR2 sequence
460 taattattgg ctgtaggatt
ctaaacagag cctaaatagc tggaatagct ctagccctca 60atccaaacta atgatatcta
tacttatgca actctaaatt tttattctaa aagtaatatt 120tcatttttgt
caacgagatt ctctactcta ttccacaatc ttttgaagct atatttacct
180taaatctgta ctctatacca ataatcatat attctattat ttatttttat
ctctctccta 240aggagcatcc ctctatgtct gcatggcccc cgcctcgggt
cccaatctct tgctctgcta 300gtagcacaga agaaaacact agaaatgact
tgcttgactt agagtatcag ataaacatca 360tgtttactta actttaattt
gtatcggttt ctactatttt tataatattt ttgtctctat 420agatactacg
tgcaacagta taatcaacct agtttaatcc agagcgaagg attttttact
480aagtacgtga ctccatatgc acagcgttcc ttttatggtt cctcactggg
cacagcataa 540acgaaccctg tccaatgttt tcagcgcgaa caaacagaaa
ttccatcagc gaacaaacaa 600catacatgcg agatgaaaat aaataataaa
aaaagctccg tctcgatagg ccggcacgaa 660tcgagagcct ccatagccag
ttttttccat cggaacggcg gttcgcgcac ctaattatat 720gcaccacacg
cctataaagc caaccaaccc gtcggagggg cgcaagccag acagaagaca
780gcccgtcagc ccctctcgtt tttcatccgc cttcgcctcc aaccgcgtgc
gctccacgcc 840tcctccagga aagcgagagg tgagcgcagt cccctttccc
ctccttccaa ttcaattcgt 900cttctcgttc gcagccctag gatttggggg
tctggagggg tttgatcgtt tctcgccgtg 960aatctgcttt ggtgtaaacc
aacggatctc ggatcgtagt cttcagaaga tcccggattt 1020tgcggtttgg
cccctcctgg attcaattcg tcgtatcgtt cgcagcccta ggatttgggg
1080atctggaggg gtttgatcgt ttctcgccgc gaatctgctc tggtgtaaac
caacggatct 1140cgggtcgtag tcttcagaag gtcccggatt ttgcggtttg
gcccctcctg gattcaattc 1200gtcgtatcgt tcgcagccct aggatttggg
gatctggagg ggtttgatcc tttctcgccg 1260cgaatctgct ctggtataac
caacggatct cgggtcgtag tcttcagaag gtcccggatt 1320ttgcggtttg
gtggttcttg ctctatgaat cagagggatg gttcttcccg gatttatgcc
1380ttgcggccac tctgtcgaat catggggttt cgacccgatt cgtaggcgtg
ctccctgttt 1440tggatgggaa gtaggcgtgt ttgtagtatt cgtgcttcga
ttcgtcaacg gagattagaa 1500gacctgggat gggatttgag gaaatctagg
tatctgtcta gcacgtttct agatctattc 1560ttcagctgtt atatgagagt
aattttggaa ccctggtggg gtatgtttga ccgagtattc 1620tgtagattat
tgtccgtgac ttgctggctg ttaccgtcct tatctctgca tcatctatct
1680gtgctagttt ctgcgtgctt ctcaaatatt tccggcctgt gtagcatgtg
actgataata 1740tgattttggc agcttctgca taagaacaac aaatcaaaag
cttgatcagc tcggtgccta 1800caaaacctca acaaccaagt ttc
1823
461 556 DNA Zea mays misc_feature (1)..(556) Zm-ARGOS8 promoter
461 gttacttctc taagcacggc tggatttcag gcctctagtc ctctactagt
actagctaca 60cgacgtgcac gcatgcatca cagcatcaac aactagacac gcacacgctg
cacgcggccg 120gggaacccac tgattccccc cttccccgcg cgcggtttga
tttcctttcc tggtacggat 180ccatatctga gggcttgttc ggttattccc
aacacacatg tattggatgg gattgaaaaa 240aaaatgagaa gaagtttgac
ttgtttggga ttcaaaccca tccaatccca ctcaatccac 300atggattgag
agctaaccga acaagccctc atagtacata cctggtacgg atccatatca
360tagtacatag atccagtaga atagaaggtg atccgaccgc cggcgcttgc
gttgttttcc 420ccggtccatt gaacctgcca accctcctaa ccacaggcac
gccaaaccgc gggctccggc 480caccaccgcc accgccacct gccctgccgc
acctctccaa ccccaaatcc aggggggggg 540gggggcacca tgcgtg
556
462 155 DNA Zea mays misc_feature (1)..(155) Zm-ARGOS8 5'-UTR
462 catcgatcca tcgctggcgc gcgggtccgg cggggcggtc tgtgagggca
aatttatata 60ggtctagtgg gtacccggct acggatagat atgatgctgc actgcacatt
ggctatatct 120gaggctcctg cgcgcgcctt ggccaggtgt ctgtc
155
463 285 DNA Zea mays
463 atgcgggcga tgccgcagga agaggaagcc gcggtggcga
cgacgaccat ggccgggggc 60aaggtggcgg cgctgctggc cacggcggcc gcgctgctgc
tgctgctccc gctggcgctg 120ccgccgctgc cgccgccgcc cacgcagctg
ttgttcgtcc ccgtggtctt gctgctcctc 180gtggcgtccc tcgcgttctg
ccccgccgcg accccctcgc cgtcgccgat gcatgccgcc 240gaccacgggt
cgttcgggac cactggatca ccgcacctat gttga 285
464 2843 DNA Zea mays misc_feature (1)..(2843) Zm-GOS2 gene, including promoter, 5'-UTR, CDS, 3'-UTR and introns sequence
464 taattattgg ctgtaggatt
ctaaacagag cctaaatagc tggaatagct ctagccctca 60atccaaacta atgatatcta
tacttatgca actctaaatt tttattctaa aagtaatatt 120tcatttttgt
caacgagatt ctctactcta ttccacaatc ttttgaagct atatttacct
180taaatctgta ctctatacca ataatcatat attctattat ttatttttat
ctctctccta 240aggagcatcc ctctatgtct gcatggcccc cgcctcgggt
cccaatctct tgctctgcta 300gtagcacaga agaaaacact agaaatgact
tgcttgactt agagtatcag ataaacatca 360tgtttactta actttaattt
gtatcggttt ctactatttt tataatattt ttgtctctat 420agatactacg
tgcaacagta taatcaacct agtttaatcc agagcgaagg attttttact
480aagtacgtga ctccatatgc acagcgttcc ttttatggtt cctcactggg
cacagcataa 540acgaaccctg tccaatgttt tcagcgcgaa caaacagaaa
ttccatcagc gaacaaacaa 600catacatgcg agatgaaaat aaataataaa
aaaagctccg tctcgatagg ccggcacgaa 660tcgagagcct ccatagccag
ttttttccat cggaacggcg gttcgcgcac ctaattatat 720gcaccacacg
cctataaagc caaccaaccc gtcggagggg cgcaagccag acagaagaca
780gcccgtcagc ccctctcgtt tttcatccgc cttcgcctcc aaccgcgtgc
gctccacgcc 840tcctccagga aagcgagagg tgagcgcagt cccctttccc
ctccttccaa ttcaattcgt 900cttctcgttc gcagccctag gatttggggg
tctggagggg tttgatcgtt tctcgccgtg 960aatctgcttt ggtgtaaacc
aacggatctc ggatcgtagt cttcagaaga tcccggattt 1020tgcggtttgg
cccctcctgg attcaattcg tcgtatcgtt cgcagcccta ggatttgggg
1080atctggaggg gtttgatcgt ttctcgccgc gaatctgctc tggtgtaaac
caacggatct 1140cgggtcgtag tcttcagaag gtcccggatt ttgcggtttg
gcccctcctg gattcaattc 1200gtcgtatcgt tcgcagccct aggatttggg
gatctggagg ggtttgatcc tttctcgccg 1260cgaatctgct ctggtataac
caacggatct cgggtcgtag tcttcagaag gtcccggatt 1320ttgcggtttg
gtggttcttg ctctatgaat cagagggatg gttcttcccg gatttatgcc
1380ttgcggccac tctgtcgaat catggggttt cgacccgatt cgtaggcgtg
ctccctgttt 1440tggatgggaa gtaggcgtgt ttgtagtatt cgtgcttcga
ttcgtcaacg gagattagaa 1500gacctgggat gggatttgag gaaatctagg
tatctgtcta gcacgtttct agatctattc 1560ttcagctgtt atatgagagt
aattttggaa ccctggtggg gtatgtttga ccgagtattc 1620tgtagattat
tgtccgtgac ttgctggctg ttaccgtcct tatctctgca tcatctatct
1680gtgctagttt ctgcgtgctt ctcaaatatt tccggcctgt gtagcatgtg
actgataata 1740tgattttggc agcttctgca taagaacaac aaatcaaaag
cttgatcagc tcggtgccta 1800caaaacctca acaaccaagt ttcatgtctg
atctcgacgt ccagcttcca tctgcctttg 1860gtatggctac ttctcaattc
atgatgccat gttttttttt atattgtggt tttacataat 1920acatagcatc
ttccagcttc ctgaagagta ttactgaata gattgataac atcatacaca
1980cgaagttcat cttgaacatg cttattagtg ttctgtttgc atctgatggt
atggcatcat 2040ctttgataga tccgtttgct gaggcaaatg ctgaggactc
tggtgctggt cctggaacga 2100aggattatgt gcatgtgcgc atccagcagc
gcaacggcag aaagagtctg actacagtcc 2160agggtctgaa gaaagagttc
agctataaca agatcctcaa ggatctgaag aaggaattct 2220gctgcaatgg
tactgtagtt caggacccag agctaggcca ggtaagatac gagaacaatg
2280catttcaagc ttgtaaaaat ggtatctgcc ggttggtgga tatactgatc
tgtttgtccg 2340ctgcaggtca ttcagctcca aggtgaccag cgcaagaatg
ttgctacttt cctagttcag 2400gtattcagaa tcttcagacc tggccagctg
aatactgttt taccataccg atagatgttc 2460aatctgttaa tactgatcgt
gcaattatta cttgtcttgg taggctggga ttgcgaagaa 2520agagaacatc
aagattcacg ggttctaagg gacctgtaaa tgcttgtgcc ctatattgtg
2580tgcctccaca tattggggag cttgaagcat cgacagttac tagtcattgc
ttacttatat 2640aagaacataa gtagtatttg ctattgtcaa gtgtgccttg
cttgatgcaa gttgtgtttt 2700cgtatcatta ttattatgca cggccatcgt
acgtgtatgg cttgtatggg ttattgccaa 2760cttaataaaa gcacactctg
tttgcctata agcactgatg tttgcctcgt catgcacatg 2820ttgagtcggg
ttttatttgt att 2843
465 800 DNA Zea mays misc_feature (1)..(800) Zm-GOS2 PRO, maize GOS2 promoter
465 taattattgg ctgtaggatt ctaaacagag
cctaaatagc tggaatagct ctagccctca 60atccaaacta atgatatcta tacttatgca
actctaaatt tttattctaa aagtaatatt 120tcatttttgt caacgagatt
ctctactcta ttccacaatc ttttgaagct atatttacct 180taaatctgta
ctctatacca ataatcatat attctattat ttatttttat ctctctccta
240aggagcatcc ctctatgtct gcatggcccc cgcctcgggt cccaatctct
tgctctgcta 300gtagcacaga agaaaacact agaaatgact tgcttgactt
agagtatcag ataaacatca 360tgtttactta actttaattt gtatcggttt
ctactatttt tataatattt ttgtctctat 420agatactacg tgcaacagta
taatcaacct agtttaatcc agagcgaagg attttttact 480aagtacgtga
ctccatatgc acagcgttcc ttttatggtt cctcactggg cacagcataa
540acgaaccctg tccaatgttt tcagcgcgaa caaacagaaa ttccatcagc
gaacaaacaa 600catacatgcg agatgaaaat aaataataaa aaaagctccg
tctcgatagg ccggcacgaa 660tcgagagcct ccatagccag ttttttccat
cggaacggcg gttcgcgcac ctaattatat 720gcaccacacg cctataaagc
caaccaaccc gtcggagggg cgcaagccag acagaagaca 780gcccgtcagc
ccctctcgtt 800
466 1023 DNA Zea mays misc_feature (1)..(1023) GOS2 INTRON, maize GOS2 5'-UTR1 and intron1 and 5'-UTR2 sequence
466 tttcatccgc
cttcgcctcc aaccgcgtgc gctccacgcc tcctccagga aagcgagagg 60tgagcgcagt
cccctttccc ctccttccaa ttcaattcgt cttctcgttc gcagccctag
120gatttggggg tctggagggg tttgatcgtt tctcgccgtg aatctgcttt
ggtgtaaacc 180aacggatctc ggatcgtagt cttcagaaga tcccggattt
tgcggtttgg cccctcctgg 240attcaattcg tcgtatcgtt cgcagcccta
ggatttgggg atctggaggg gtttgatcgt 300ttctcgccgc gaatctgctc
tggtgtaaac caacggatct cgggtcgtag tcttcagaag 360gtcccggatt
ttgcggtttg gcccctcctg gattcaattc gtcgtatcgt tcgcagccct
420aggatttggg gatctggagg ggtttgatcc tttctcgccg cgaatctgct
ctggtataac 480caacggatct cgggtcgtag tcttcagaag gtcccggatt
ttgcggtttg gtggttcttg 540ctctatgaat cagagggatg gttcttcccg
gatttatgcc ttgcggccac tctgtcgaat 600catggggttt cgacccgatt
cgtaggcgtg ctccctgttt tggatgggaa gtaggcgtgt 660ttgtagtatt
cgtgcttcga ttcgtcaacg gagattagaa gacctgggat gggatttgag
720gaaatctagg tatctgtcta gcacgtttct agatctattc ttcagctgtt
atatgagagt 780aattttggaa ccctggtggg gtatgtttga ccgagtattc
tgtagattat tgtccgtgac 840ttgctggctg ttaccgtcct tatctctgca
tcatctatct gtgctagttt ctgcgtgctt 900ctcaaatatt tccggcctgt
gtagcatgtg actgataata tgattttggc agcttctgca 960taagaacaac
aaatcaaaag cttgatcagc tcggtgccta caaaacctca acaaccaagt 1020ttc
1023
467 23 DNA Glycine max
467 gcgtcctttg acagcagctg tgg 23
468 23 DNA Glycine max
468 gcaaccacag ctgctgtcaa agg 23
469 434 DNA Glycine max
469 ccgggtgtga tttagtataa agtgaagtaa
tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg aacaaaataa
atggtaaagt gtcagatata taaaataggc 120tttaataaaa ggaagaaaaa
aaacaaacaa aaaataggtt gcaatggggc agagcagagt 180catcatgaag
ctagaaaggc taccgataga taaactatag ttaattaaat acattaaaaa
240atacttggat ctttctctta ccctgtttat attgagacct gaaacttgag
agagatacac 300taatcttgcc ttgttgtttc attccctaac ttacaggact
cagcgcatgt catgtggtct 360cgttccccat ttaagtccca caccgtctaa
acttattaaa ttattaatgt ttataactag 420atgcacaaca acaa
434
470 9093 DNA Artificial sequence QC878
470 ccgggtgtga tttagtataa
agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg
aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt
180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct
gaaacttgag agagatacac 300taatcttgcc ttgttgtttc attccctaac
ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca
caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg cgtcctttga cagcagctgg ttttagagct agaaatagca
480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc
ggtgcttttt 540tttgcggccg caattggatc gggtttactt attttgtggg
tatctatact tttattagat 600ttttaatcag gctcctgatt tctttttatt
tcgattgaat tcctgaactt gtattattca 660gtagatcgaa taaattataa
aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt
780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt
attactttta 840aaaaaatcat aaaggtttag tattttttta aaataaatat
aggaatagtt ttactattca 900ctgctttaat agaaaaatag tttaaaattt
aagatagttt taatcccagc atttgccacg 960tttgaacgtg agccgaaacg
atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg
1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct
caatcccacg 1140cgagtagagc acagtaacct tcaaataagc gaatggggca
taatcagaaa tccgaaataa 1200acctaggggc attatcggaa atgaaaagta
gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt atcactctgt
gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc
1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg
ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt
cgctttgttt ttgtggttca 1500gttttttagg attcttttgg tttttgaatc
gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg aggtgaatct
tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc
1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag
gtttgtttta 1740acaggattaa aagtttttta agcatgttga aggagtcttg
tagatatgta accgtcgata 1800gtttttttgt gggtttgttc acatgttatc
aagcttaatc ttttactatg tatgcgacca 1860tatctggatc cagcaaaggc
gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta
1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta
tttctttgaa 2040ttgattctag tttaagtaat ccatggacaa aaagtactca
atagggctcg acatagggac 2100taactccgtt ggatgggccg tcatcaccga
cgagtacaag gtgccctcca agaagttcaa 2160ggtgttggga aacaccgaca
ggcacagcat aaagaagaat ttgatcggtg ccctcctctt 2220cgactccgga
gagaccgctg aggctaccag gctcaagagg accgctagaa ggcgctacac
2280cagaaggaag aacagaatct gctacctgca ggagatcttc tccaacgaga
tggccaaggt 2340ggacgactcc ttcttccacc gccttgagga atcattcctg
gtggaggagg ataaaaagca 2400cgagagacac ccaatcttcg ggaacatcgt
cgacgaggtg gcctaccatg aaaagtaccc 2460taccatctac cacctgagga
agaagctggt cgactctacc gacaaggctg acttgcgctt 2520gatttacctg
gctctcgctc acatgataaa gttccgcgga cacttcctca ttgagggaga
2580cctgaaccca gacaactccg acgtggacaa gctcttcatc cagctcgttc
agacctacaa 2640ccagcttttc gaggagaacc caatcaacgc cagtggagtt
gacgccaagg ctatcctctc 2700tgctcgtctg tcaaagtcca ggaggcttga
gaacttgatt gcccagctgc ctggcgaaaa 2760gaagaacgga ctgttcggaa
acttgatcgc tctctccctg ggattgactc ccaacttcaa 2820gtccaacttc
gacctcgccg aggacgctaa gttgcagttg tctaaagaca cctacgacga
2880tgacctcgac aacttgctgg cccagatagg cgaccaatac gccgatctct
tcctcgccgc 2940taagaacttg tccgacgcaa tcctgctgtc cgacatcctg
agagtcaaca ctgagattac 3000caaagctcct ctgtctgctt ccatgattaa
gcgctacgac gagcaccacc aagatctgac 3060cctgctcaag gccctggtga
gacagcagct gcccgagaag tacaaggaga tctttttcga 3120ccagtccaag
aacggctacg ccggatacat tgacggaggc gcctcccagg aagagttcta
3180caagttcatc aagcccatcc ttgagaagat ggacggtacc gaggagctgt
tggtgaagtt 3240gaacagagag gacctgttga ggaagcagag aaccttcgac
aacggaagca tccctcacca 3300aatccacctg ggagagctcc acgccatctt
gaggaggcag gaggatttct atcccttcct 3360gaaggacaac cgcgagaaga
ttgagaagat cttgaccttc agaattcctt actacgtcgg 3420gccactcgcc
agaggaaact ctaggttcgc ctggatgacc cgcaaatctg aagagaccat
3480tactccctgg aacttcgagg aagtcgtgga caagggcgct tccgctcagt
ctttcatcga 3540gaggatgacc aacttcgata aaaatctgcc caacgagaag
gtgctgccca agcactccct 3600gttgtacgag tatttcacag tgtacaacga
gctcaccaag gtgaagtacg tcacagaggg 3660aatgaggaag cctgccttct
tgtccggaga gcagaagaag gccatcgtcg acctgctctt 3720caagaccaac
aggaaggtga ctgtcaagca gctgaaggag gactacttca agaagatcga
3780gtgcttcgac tccgtcgaga tctctggtgt cgaggacagg ttcaacgcct
cccttgggac 3840ttaccacgat ctgctcaaga ttattaaaga caaggacttc
ctggacaacg aggagaacga 3900ggacatcctt gaggacatcg tgctcaccct
gaccttgttc gaagacaggg aaatgatcga 3960agagaggctc aagacctacg
cccacctctt cgacgacaag gtgatgaaac agctgaagag 4020acgcagatat
accggctggg gaaggctctc ccgcaaattg atcaacggga tcagggacaa
4080gcagtcaggg aagactatac tcgacttcct gaagtccgac ggattcgcca
acaggaactt 4140catgcagctc attcacgacg actccttgac cttcaaggag
gacatccaga aggctcaggt 4200gtctggacag ggtgactcct tgcatgagca
cattgctaac ttggccggct ctcccgctat 4260taagaagggc attttgcaga
ccgtgaaggt cgttgacgag ctcgtgaagg tgatgggacg 4320ccacaagcca
gagaacatcg ttattgagat ggctcgcgag aaccaaacta cccagaaagg
4380gcagaagaat tcccgcgaga ggatgaagcg cattgaggag ggcataaaag
agcttggctc 4440tcagatcctc aaggagcacc ccgtcgagaa cactcagctg
cagaacgaga agctgtacct 4500gtactacctc caaaacggaa gggacatgta
cgtggaccag gagctggaca tcaacaggtt 4560gtccgactac gacgtcgacc
acatcgtgcc tcagtccttc ctgaaggatg actccatcga 4620caataaagtg
ctgacacgct ccgataaaaa tagaggcaag tccgacaacg tcccctccga
4680ggaggtcgtg aagaagatga aaaactactg gagacagctc ttgaacgcca
agctcatcac 4740ccagcgtaag ttcgacaacc tgactaaggc tgagagagga
ggattgtccg agctcgataa 4800ggccggattc atcaagagac agctcgtcga
aacccgccaa attaccaagc acgtggccca 4860aattctggat tcccgcatga
acaccaagta cgatgaaaat gacaagctga tccgcgaggt 4920caaggtgatc
accttgaagt ccaagctggt ctccgacttc cgcaaggact tccagttcta
4980caaggtgagg gagatcaaca actaccacca cgcacacgac gcctacctca
acgctgtcgt 5040tggaaccgcc ctcatcaaaa aatatcctaa gctggagtct
gagttcgtct acggcgacta 5100caaggtgtac gacgtgagga agatgatcgc
taagtctgag caggagatcg gcaaggccac 5160cgccaagtac ttcttctact
ccaacatcat gaacttcttc aagaccgaga tcactctcgc 5220caacggtgag
atcaggaagc gcccactgat cgagaccaac ggtgagactg gagagatcgt
5280gtgggacaaa gggagggatt tcgctactgt gaggaaggtg ctctccatgc
ctcaggtgaa 5340catcgtcaag aagaccgaag ttcagaccgg aggattctcc
aaggagtcca tcctccccaa 5400gagaaactcc gacaagctga tcgctagaaa
gaaagactgg gaccctaaga agtacggagg 5460cttcgattct cctaccgtgg
cctactctgt gctggtcgtg gccaaggtgg agaagggcaa 5520gtccaagaag
ctgaaatccg tcaaggagct cctcgggatt accatcatgg agaggagttc
5580cttcgagaag aaccctatcg acttcctgga ggccaaggga tataaagagg
tgaagaagga 5640cctcatcatc aagctgccca agtactccct cttcgagttg
gagaacggaa ggaagaggat 5700gctggcttct gccggagagt tgcagaaggg
aaatgagctc gcccttccct ccaagtacgt 5760gaacttcctg tacctcgcct
ctcactatga aaagttgaag ggctctcctg aggacaacga 5820gcagaagcag
ctcttcgtgg agcagcacaa gcactacctg gacgaaatta tcgagcagat
5880ctctgagttc tccaagcgcg tgatattggc cgacgccaac ctcgacaagg
tgctgtccgc 5940ctacaacaag cacagggata agcccattcg cgagcaggct
gaaaacatta tccacctgtt 6000taccctcaca aacttgggag cccctgctgc
cttcaagtac ttcgacacca ccattgacag 6060gaagagatac acctccacca
aggaggtgct cgacgcaaca ctcatccacc aatccatcac 6120cggcctctat
gaaacaagga ttgacttgtc ccagctggga ggcgactcta gagccgatcc
6180caagaagaag agaaaggtgt aggttaacct agacttgtcc atcttctgga
ttggccaact 6240taattaatgt atgaaataaa aggatgcaca catagtgaca
tgctaatcac tataatgtgg 6300gcatcaaagt tgtgtgttat gtgtaattac
tagttatctg aataaaagag aaagagatca 6360tccatatttc ttatcctaaa
tgaatgtcac gtgtctttat aattctttga tgaaccagat 6420gcatttcatt
aaccaaatcc atatacatat aaatattaat catatataat taatatcaat
6480tgggttagca aaacaaatct agtctaggtg tgttttgcga attcgatatc
aagcttatcg 6540ataccgtcga gggggggccc ggtaccggcg cgccgttcta
tagtgtcacc taaatcgtat 6600gtgtatgata cataaggtta tgtattaatt
gtagccgcgt tctaacgaca atatgtccat 6660atggtgcact ctcagtacaa
tctgctctga tgccgcatag ttaagccagc cccgacaccc 6720gccaacaccc
gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca
6780agctgtgacc gtctccggga gctgcatgtg tcagaggttt tcaccgtcat
caccgaaacg 6840cgcgagacga aagggcctcg tgatacgcct atttttatag
gttaatgtca tgaccaaaat 6900cccttaacgt gagttttcgt tccactgagc
gtcagacccc gtagaaaaga tcaaaggatc 6960ttcttgagat cctttttttc
tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct 7020accagcggtg
gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg
7080cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt
taggccacca 7140cttcaagaac tctgtagcac cgcctacata cctcgctctg
ctaatcctgt taccagtggc 7200tgctgccagt ggcgataagt cgtgtcttac
cgggttggac tcaagacgat agttaccgga 7260taaggcgcag cggtcgggct
gaacgggggg ttcgtgcaca cagcccagct tggagcgaac 7320gacctacacc
gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga
7380agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag
agcgcacgag 7440ggagcttcca gggggaaacg cctggtatct ttatagtcct
gtcgggtttc gccacctctg 7500acttgagcgt cgatttttgt gatgctcgtc
aggggggcgg agcctatgga aaaacgccag 7560caacgcggcc tttttacggt
tcctggcctt ttgctggcct tttgctcaca tgttctttcc 7620tgcgttatcc
cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc
7680tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg
aagagcgccc 7740aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat
taatgcaggt tgatcagatc 7800tcgatcccgc gaaattaata cgactcacta
tagggagacc acaacggttt ccctctagaa 7860ataattttgt ttaactttaa
gaaggagata tacccatgga aaagcctgaa ctcaccgcga 7920cgtctgtcga
gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct
7980cggagggcga agaatctcgt gctttcagct tcgatgtagg agggcgtgga
tatgtcctgc 8040gggtaaatag ctgcgccgat ggtttctaca aagatcgtta
tgtttatcgg cactttgcat 8100cggccgcgct cccgattccg gaagtgcttg
acattgggga attcagcgag agcctgacct 8160attgcatctc ccgccgtgca
cagggtgtca cgttgcaaga cctgcctgaa accgaactgc 8220ccgctgttct
gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc
8280agacgagcgg gttcggccca ttcggaccgc aaggaatcgg tcaatacact
acatggcgtg 8340atttcatatg cgcgattgct gatccccatg tgtatcactg
gcaaactgtg atggacgaca 8400ccgtcagtgc gtccgtcgcg caggctctcg
atgagctgat gctttgggcc gaggactgcc 8460ccgaagtccg gcacctcgtg
cacgcggatt tcggctccaa caatgtcctg acggacaatg 8520gccgcataac
agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg
8580tcgccaacat cttcttctgg aggccgtggt tggcttgtat ggagcagcag
acgcgctact 8640tcgagcggag gcatccggag cttgcaggat cgccgcggct
ccgggcgtat atgctccgca 8700ttggtcttga ccaactctat cagagcttgg
ttgacggcaa tttcgatgat gcagcttggg 8760cgcagggtcg atgcgacgca
atcgtccgat ccggagccgg gactgtcggg cgtacacaaa 8820tcgcccgcag
aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg
8880gaaaccgacg ccccagcact cgtccgaggg caaaggaata gtgaggtaca
gcttggatcg 8940atccggctgc taacaaagcc cgaaaggaag ctgagttggc
tgctgccacc gctgagcaat 9000aactagcata accccttggg gcctctaaac
gggtcttgag gggttttttg ctgaaaggag 9060gaactatatc cggatgatcg
ggcgcgccgg tac 9093
471 9093 DNA Artificial sequence QC879
471 ccgggtgtga
tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata
agtaatattg aacaaaataa atggtaaagt gtcagatata taaaataggc
120tttaataaaa ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc
agagcagagt 180catcatgaag ctagaaaggc taccgataga taaactatag
ttaattaaat acattaaaaa 240atacttggat ctttctctta ccctgtttat
attgagacct gaaacttgag agagatacac 300taatcttgcc ttgttgtttc
attccctaac ttacaggact cagcgcatgt catgtggtct 360cgttccccat
ttaagtccca caccgtctaa acttattaaa ttattaatgt ttataactag
420atgcacaaca acaaagcttg caaccacagc tgctgtcaag ttttagagct
agaaatagca 480agttaaaata aggctagtcc gttatcaact tgaaaaagtg
gcaccgagtc ggtgcttttt 540tttgcggccg caattggatc gggtttactt
attttgtggg tatctatact tttattagat 600ttttaatcag gctcctgatt
tctttttatt tcgattgaat tcctgaactt gtattattca 660gtagatcgaa
taaattataa aaagataaaa tcataaaata atattttatc ctatcaatca
720tattaaagca atgaatatgt aaaattaatc ttatctttat tttaaaaaat
catataggtt 780tagtattttt ttaaaaataa agataggatt agttttacta
ttcactgctt attactttta 840aaaaaatcat aaaggtttag tattttttta
aaataaatat aggaatagtt ttactattca 900ctgctttaat agaaaaatag
tttaaaattt aagatagttt taatcccagc atttgccacg 960tttgaacgtg
agccgaaacg atgtcgttac attatcttaa cctagctgaa acgatgtcgt
1020cataatatcg ccaaatgcca actggactac gtcgaaccca caaatcccac
aaagcgcgtg 1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt
gttacacgct caatcccacg 1140cgagtagagc acagtaacct tcaaataagc
gaatggggca taatcagaaa tccgaaataa 1200acctaggggc attatcggaa
atgaaaagta gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt
atcactctgt gctccctcgc tctatttctc agtctctgtg tttgcggctg
1320aggattccga acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc
tgctcttgtc 1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat
gtttcttcgg ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg
ttgtttgttt cgctttgttt ttgtggttca 1500gttttttagg attcttttgg
tttttgaatc gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg
aggtgaatct tttttttgag gtcatagatc tgttgtattt gtgttataaa
1620catgcgactt tgtatgattt tttacgaggt tatgatgttc tggttgtttt
attatgaatc 1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac
actattaaag gtttgtttta 1740acaggattaa aagtttttta agcatgttga
aggagtcttg tagatatgta accgtcgata 1800gtttttttgt gggtttgttc
acatgttatc aagcttaatc ttttactatg tatgcgacca 1860tatctggatc
cagcaaaggc gattttttaa ttccttgtga aacttttgta atatgaagtt
1920gaaattttgt tattggtaaa ctataaatgt gtgaagttgg agtatacctt
taccttctta 1980tttggctttg tgatagttta atttatatgt attttgagtt
ctgacttgta tttctttgaa 2040ttgattctag tttaagtaat ccatggacaa
aaagtactca atagggctcg acatagggac 2100taactccgtt ggatgggccg
tcatcaccga cgagtacaag gtgccctcca agaagttcaa 2160ggtgttggga
aacaccgaca ggcacagcat aaagaagaat ttgatcggtg ccctcctctt
2220cgactccgga gagaccgctg aggctaccag gctcaagagg accgctagaa
ggcgctacac 2280cagaaggaag aacagaatct gctacctgca ggagatcttc
tccaacgaga tggccaaggt 2340ggacgactcc ttcttccacc gccttgagga
atcattcctg gtggaggagg ataaaaagca 2400cgagagacac ccaatcttcg
ggaacatcgt cgacgaggtg gcctaccatg aaaagtaccc 2460taccatctac
cacctgagga agaagctggt cgactctacc gacaaggctg acttgcgctt
2520gatttacctg gctctcgctc acatgataaa gttccgcgga cacttcctca
ttgagggaga 2580cctgaaccca gacaactccg acgtggacaa gctcttcatc
cagctcgttc agacctacaa 2640ccagcttttc gaggagaacc caatcaacgc
cagtggagtt gacgccaagg ctatcctctc 2700tgctcgtctg tcaaagtcca
ggaggcttga gaacttgatt gcccagctgc ctggcgaaaa 2760gaagaacgga
ctgttcggaa acttgatcgc tctctccctg ggattgactc ccaacttcaa
2820gtccaacttc gacctcgccg aggacgctaa gttgcagttg tctaaagaca
cctacgacga 2880tgacctcgac aacttgctgg cccagatagg cgaccaatac
gccgatctct tcctcgccgc 2940taagaacttg tccgacgcaa tcctgctgtc
cgacatcctg agagtcaaca ctgagattac 3000caaagctcct ctgtctgctt
ccatgattaa gcgctacgac gagcaccacc aagatctgac 3060cctgctcaag
gccctggtga gacagcagct gcccgagaag tacaaggaga tctttttcga
3120ccagtccaag aacggctacg ccggatacat tgacggaggc gcctcccagg
aagagttcta 3180caagttcatc aagcccatcc ttgagaagat ggacggtacc
gaggagctgt tggtgaagtt 3240gaacagagag gacctgttga ggaagcagag
aaccttcgac aacggaagca tccctcacca 3300aatccacctg ggagagctcc
acgccatctt gaggaggcag gaggatttct atcccttcct 3360gaaggacaac
cgcgagaaga ttgagaagat cttgaccttc agaattcctt actacgtcgg
3420gccactcgcc agaggaaact ctaggttcgc ctggatgacc cgcaaatctg
aagagaccat 3480tactccctgg aacttcgagg aagtcgtgga caagggcgct
tccgctcagt ctttcatcga 3540gaggatgacc aacttcgata aaaatctgcc
caacgagaag gtgctgccca agcactccct 3600gttgtacgag tatttcacag
tgtacaacga gctcaccaag gtgaagtacg tcacagaggg 3660aatgaggaag
cctgccttct tgtccggaga gcagaagaag gccatcgtcg acctgctctt
3720caagaccaac aggaaggtga ctgtcaagca gctgaaggag gactacttca
agaagatcga 3780gtgcttcgac tccgtcgaga tctctggtgt cgaggacagg
ttcaacgcct cccttgggac 3840ttaccacgat ctgctcaaga ttattaaaga
caaggacttc ctggacaacg aggagaacga 3900ggacatcctt gaggacatcg
tgctcaccct gaccttgttc gaagacaggg aaatgatcga 3960agagaggctc
aagacctacg cccacctctt cgacgacaag gtgatgaaac agctgaagag
4020acgcagatat accggctggg gaaggctctc ccgcaaattg atcaacggga
tcagggacaa 4080gcagtcaggg aagactatac tcgacttcct gaagtccgac
ggattcgcca acaggaactt 4140catgcagctc attcacgacg actccttgac
cttcaaggag gacatccaga aggctcaggt 4200gtctggacag ggtgactcct
tgcatgagca cattgctaac ttggccggct ctcccgctat 4260taagaagggc
attttgcaga ccgtgaaggt cgttgacgag ctcgtgaagg tgatgggacg
4320ccacaagcca gagaacatcg ttattgagat ggctcgcgag aaccaaacta
cccagaaagg 4380gcagaagaat tcccgcgaga ggatgaagcg cattgaggag
ggcataaaag agcttggctc 4440tcagatcctc aaggagcacc ccgtcgagaa
cactcagctg cagaacgaga agctgtacct 4500gtactacctc caaaacggaa
gggacatgta cgtggaccag gagctggaca tcaacaggtt 4560gtccgactac
gacgtcgacc acatcgtgcc tcagtccttc ctgaaggatg actccatcga
4620caataaagtg ctgacacgct ccgataaaaa tagaggcaag tccgacaacg
tcccctccga 4680ggaggtcgtg aagaagatga aaaactactg gagacagctc
ttgaacgcca agctcatcac 4740ccagcgtaag ttcgacaacc tgactaaggc
tgagagagga ggattgtccg agctcgataa 4800ggccggattc atcaagagac
agctcgtcga aacccgccaa attaccaagc acgtggccca 4860aattctggat
tcccgcatga acaccaagta cgatgaaaat gacaagctga tccgcgaggt
4920caaggtgatc accttgaagt ccaagctggt ctccgacttc cgcaaggact
tccagttcta 4980caaggtgagg gagatcaaca actaccacca cgcacacgac
gcctacctca acgctgtcgt 5040tggaaccgcc ctcatcaaaa aatatcctaa
gctggagtct gagttcgtct acggcgacta 5100caaggtgtac gacgtgagga
agatgatcgc taagtctgag caggagatcg gcaaggccac 5160cgccaagtac
ttcttctact ccaacatcat gaacttcttc aagaccgaga tcactctcgc
5220caacggtgag atcaggaagc gcccactgat cgagaccaac ggtgagactg
gagagatcgt 5280gtgggacaaa gggagggatt tcgctactgt gaggaaggtg
ctctccatgc ctcaggtgaa 5340catcgtcaag aagaccgaag ttcagaccgg
aggattctcc aaggagtcca tcctccccaa 5400gagaaactcc gacaagctga
tcgctagaaa gaaagactgg gaccctaaga agtacggagg 5460cttcgattct
cctaccgtgg cctactctgt gctggtcgtg gccaaggtgg agaagggcaa
5520gtccaagaag ctgaaatccg tcaaggagct cctcgggatt accatcatgg
agaggagttc 5580cttcgagaag aaccctatcg acttcctgga ggccaaggga
tataaagagg tgaagaagga 5640cctcatcatc aagctgccca agtactccct
cttcgagttg gagaacggaa ggaagaggat 5700gctggcttct gccggagagt
tgcagaaggg aaatgagctc gcccttccct ccaagtacgt 5760gaacttcctg
tacctcgcct ctcactatga aaagttgaag ggctctcctg aggacaacga
5820gcagaagcag ctcttcgtgg agcagcacaa gcactacctg gacgaaatta
tcgagcagat 5880ctctgagttc tccaagcgcg tgatattggc cgacgccaac
ctcgacaagg tgctgtccgc 5940ctacaacaag cacagggata agcccattcg
cgagcaggct gaaaacatta tccacctgtt 6000taccctcaca aacttgggag
cccctgctgc cttcaagtac ttcgacacca ccattgacag 6060gaagagatac
acctccacca aggaggtgct cgacgcaaca ctcatccacc aatccatcac
6120cggcctctat gaaacaagga ttgacttgtc ccagctggga ggcgactcta
gagccgatcc 6180caagaagaag agaaaggtgt aggttaacct agacttgtcc
atcttctgga ttggccaact 6240taattaatgt atgaaataaa aggatgcaca
catagtgaca tgctaatcac tataatgtgg 6300gcatcaaagt tgtgtgttat
gtgtaattac tagttatctg aataaaagag aaagagatca 6360tccatatttc
ttatcctaaa tgaatgtcac gtgtctttat aattctttga tgaaccagat
6420gcatttcatt aaccaaatcc atatacatat aaatattaat catatataat
taatatcaat 6480tgggttagca aaacaaatct agtctaggtg tgttttgcga
attcgatatc aagcttatcg 6540ataccgtcga gggggggccc ggtaccggcg
cgccgttcta tagtgtcacc taaatcgtat 6600gtgtatgata cataaggtta
tgtattaatt gtagccgcgt tctaacgaca atatgtccat 6660atggtgcact
ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc
6720gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg
cttacagaca 6780agctgtgacc gtctccggga gctgcatgtg tcagaggttt
tcaccgtcat caccgaaacg 6840cgcgagacga aagggcctcg tgatacgcct
atttttatag gttaatgtca tgaccaaaat 6900cccttaacgt gagttttcgt
tccactgagc gtcagacccc gtagaaaaga tcaaaggatc 6960ttcttgagat
cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct
7020accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga
aggtaactgg 7080cttcagcaga gcgcagatac caaatactgt ccttctagtg
tagccgtagt taggccacca 7140cttcaagaac tctgtagcac cgcctacata
cctcgctctg ctaatcctgt taccagtggc 7200tgctgccagt ggcgataagt
cgtgtcttac cgggttggac tcaagacgat agttaccgga 7260taaggcgcag
cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac
7320gacctacacc gaactgagat acctacagcg tgagcattga gaaagcgcca
cgcttcccga 7380agggagaaag gcggacaggt atccggtaag cggcagggtc
ggaacaggag agcgcacgag 7440ggagcttcca gggggaaacg cctggtatct
ttatagtcct gtcgggtttc gccacctctg 7500acttgagcgt cgatttttgt
gatgctcgtc aggggggcgg agcctatgga aaaacgccag 7560caacgcggcc
tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc
7620tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag
ctgataccgc 7680tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc
gaggaagcgg aagagcgccc 7740aatacgcaaa ccgcctctcc ccgcgcgttg
gccgattcat taatgcaggt tgatcagatc 7800tcgatcccgc gaaattaata
cgactcacta tagggagacc acaacggttt ccctctagaa 7860ataattttgt
ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga
7920cgtctgtcga gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg
atgcagctct 7980cggagggcga agaatctcgt gctttcagct tcgatgtagg
agggcgtgga tatgtcctgc 8040gggtaaatag ctgcgccgat ggtttctaca
aagatcgtta tgtttatcgg cactttgcat 8100cggccgcgct cccgattccg
gaagtgcttg acattgggga attcagcgag agcctgacct 8160attgcatctc
ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc
8220ccgctgttct gcagccggtc gcggaggcta tggatgcgat cgctgcggcc
gatcttagcc 8280agacgagcgg gttcggccca ttcggaccgc aaggaatcgg
tcaatacact acatggcgtg 8340atttcatatg cgcgattgct gatccccatg
tgtatcactg gcaaactgtg atggacgaca 8400ccgtcagtgc gtccgtcgcg
caggctctcg atgagctgat gctttgggcc gaggactgcc 8460ccgaagtccg
gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg
8520gccgcataac agcggtcatt gactggagcg aggcgatgtt cggggattcc
caatacgagg 8580tcgccaacat cttcttctgg aggccgtggt tggcttgtat
ggagcagcag acgcgctact 8640tcgagcggag gcatccggag cttgcaggat
cgccgcggct ccgggcgtat atgctccgca 8700ttggtcttga ccaactctat
cagagcttgg ttgacggcaa tttcgatgat gcagcttggg 8760cgcagggtcg
atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa
8820tcgcccgcag aagcgcggcc gtctggaccg atggctgtgt agaagtactc
gccgatagtg 8880gaaaccgacg ccccagcact cgtccgaggg caaaggaata
gtgaggtaca gcttggatcg 8940atccggctgc taacaaagcc cgaaaggaag
ctgagttggc tgctgccacc gctgagcaat 9000aactagcata accccttggg
gcctctaaac gggtcttgag gggttttttg ctgaaaggag 9060gaactatatc
cggatgatcg ggcgcgccgg tac 90934721357DNAArtificial sequenceRTW1013A
472ctagaagata aaccctcccc caaaacacaa attagaatga catttcaagt
tccatgtatg 60tcactttcat tctattattt ttacaacttt tagttactta acagatgtct
tgttcagcat 120aaattataat ttattctgtt tttttttagg gaacaactgt
tgtagacaac ttgttgtata 180gtgaggatat tcattacatg cttggtgcat
taaggaccct tggactgcgt gtggaagatg 240acaaaacaac caaacaagca
attgttgaag gctgtggggg attgtttccc actagtaagg 300aatctaaaga
tgaaatcaat ttattccttg gaaatgctgg tattgcaatg agatctttga
360cagcagctgt tgttgctgca ggtggaaatg caaggtctgt tttttttttt
tttgttcagc 420ataatctttg aattgttcct cgtataacta atcacaacag
agtacgtgtt cttcttcctg 480ttataatcta aaaatctcat ccagattagt
catcctttct tcttaaaagg aacctttaat 540tatcaatgta tttatttaat
atttaaatta gcttgtcaaa gtctagcata tacatatttt 600gattatattc
tgagaaatgc acctgagggt gttcctcatg atctacttca acctctgtta
660ttattagatt ttctatcatg attactggtt tgagtctcta agtagaccat
cttgatgttc 720aaaatatttc agctacgtac ttgatggggt gccccgaatg
agagagaggc caattgggga 780tttggttgct ggtcttaagc aacttggtgc
agatgttgat tgctttcttg gcacaaactg 840tccacctgtt cgtgtaaatg
ggaagggagg acttcctggc ggaaaggtat ggtttggatt 900tcatttagaa
taaggtggag taactttcct ggatcaaaat tctaatttaa gaagcctccc
960tgttttcctc tctttagaat aagactaagg gtaggtttag gagttgggtt
ttggagagaa 1020atggaaggga gagcaatttt tttcttcttc taataaatat
tctttaattt gatacatttt 1080ttaagtaaaa gaatataaag atagattagc
ataacttaat gttttaatct tttatttatt 1140tttataaata ttatatacct
gtctatttaa aaatcaaata tttgtcctcc attccctttc 1200ccttcaaaac
ctcagttcca aatataccgt agttgaatta tattttggaa ggcctattgg
1260ttggagactt ttccttttca gagattatcc ctcaccttta ttatagcctt
tctattttta 1320aacttcatat agacgccatt cttggggcgg ccgcgat
13574731357DNAArtificial sequenceRTW1012A 473ctagaagata aaccctcccc
caaaacacaa attagaatga catttcaagt tccatgtatg 60tcactttcat tctattattt
ttacaacttt tagttactta acagatgtct tgttcagcat 120aaattataat
ttattctgtt tttttttagg gaacaactgt tgtagacaac ttgttgtata
180gtgaggatat tcattacatg cttggtgcat taaggaccct tggactgcgt
gtggaagatg 240acaaaacaac caaacaagca attgttgaag gctgtggggg
attgtttccc actagtaagg 300aatctaaaga tgaaatcaat ttattccttg
gaaatgctgg tattgcaatg agatctttga 360cagcagctgt ggttgctgca
ggtggaaatg caaggtctgt tttttttttt tttgttcagc 420ataatctttg
aattgttcct cgtataacta atcacaacag agtacgtgtt cttcttcctg
480ttataatcta aaaatctcat ccagattagt catcctttct tcttaaaagg
aacctttaat 540tatcaatgta tttatttaat atttaaatta gcttgtcaaa
gtctagcata tacatatttt 600gattatattc tgagaaatgc acctgagggt
gttcctcatg atctacttca acctctgtta 660ttattagatt ttctatcatg
attactggtt tgagtctcta agtagaccat cttgatgttc 720aaaatatttc
agctacgtac ttgatggggt gccccgaatg agagagaggc caattgggga
780tttggttgct ggtcttaagc aacttggtgc agatgttgat tgctttcttg
gcacaaactg 840tccacctgtt cgtgtaaatg ggaagggagg acttcctggc
ggaaaggtat ggtttggatt 900tcatttagaa taaggtggag taactttcct
ggatcaaaat tctaatttaa gaagcctccc 960tgttttcctc tctttagaat
aagactaagg gtaggtttag gagttgggtt ttggagagaa 1020atggaaggga
gagcaatttt tttcttcttc taataaatat tctttaattt gatacatttt
1080ttaagtaaaa gaatataaag atagattagc ataacttaat gttttaatct
tttatttatt 1140tttataaata ttatatacct gtctatttaa aaatcaaata
tttgtcctcc attccctttc 1200ccttcaaaac ctcagttcca aatataccgt
agttgaatta tattttggaa ggcctattgg 1260ttggagactt ttccttttca
gagattatcc ctcaccttta ttatagcctt tctattttta 1320aacttcatat
agacgccatt cttggggcgg ccgcgat 135747430DNAArtificial
sequenceprimer, soy1-F1 474ccactagtaa ggaatctaaa gatgaaatca
3047524DNAArtificial sequenceprimer, soy1-R2 475cctgcagcaa
ccacagctgc tgtc 2447615DNAArtificial sequenceprobe, soy1-T1(FAM-MGB)
476ctgcaatgcg tcctt 1547719DNAArtificial sequenceprimer, cas9-F
477ccttcttcca ccgccttga 1947821DNAArtificial sequenceprimer, Cas9-R
478tgggtgtctc tcgtgctttt t 2147919DNAArtificial sequenceprobe,
Cas9-T(FAM-MGB) 479aatcattcct ggtggagga 1948025DNAArtificial
sequenceprimer, pINII-99F 480tgatgcccac attatagtga ttagc
2548122DNAArtificial sequenceprimer, pINII-13R 481catcttctgg
attggccaac tt 2248217DNAArtificial sequenceprobe,
pINII-69T(FAM-MGB) 482actatgtgtg catcctt 1748323DNAArtificial
sequenceprimer, SIP-130F 483ttcaagttgg gctttttcag aag
2348422DNAArtificial sequenceprimer, SIP-198R 484tctccttggt
gctctcatca ca 2248515DNAArtificial sequenceprobe, SIP-170T(VIC-MGB)
485ctgcagcaga accaa 1548624DNAArtificialWOL569, Forward_primer
486ggacccatta ggtgagagcg tggg 2448719DNAArtificialWOL876,
Reverse_primer 487cagctgctgt caaagatct 1948829DNAArtificialWOL570,
Reverse_primer 488tctaataata acagaggttg aagtagatc
294894104DNAGlycine max 489atggacaaaa agtactcaat agggctcgac
atagggacta actccgttgg atgggccgtc 60atcaccgacg agtacaaggt gccctccaag
aagttcaagg tgttgggaaa caccgacagg 120cacagcataa agaagaattt
gatcggtgcc ctcctcttcg actccggaga gaccgctgag 180gctaccaggc
tcaagaggac cgctagaagg cgctacacca gaaggaagaa cagaatctgc
240tacctgcagg agatcttctc caacgagatg gccaaggtgg acgactcctt
cttccaccgc 300cttgaggaat cattcctggt ggaggaggat aaaaagcacg
agagacaccc aatcttcggg 360aacatcgtcg acgaggtggc ctaccatgaa
aagtacccta ccatctacca cctgaggaag 420aagctggtcg actctaccga
caaggctgac ttgcgcttga tttacctggc tctcgctcac 480atgataaagt
tccgcggaca cttcctcatt gagggagacc tgaacccaga caactccgac
540gtggacaagc tcttcatcca gctcgttcag acctacaacc agcttttcga
ggagaaccca 600atcaacgcca gtggagttga cgccaaggct atcctctctg
ctcgtctgtc aaagtccagg 660aggcttgaga acttgattgc ccagctgcct
ggcgaaaaga agaacggact gttcggaaac 720ttgatcgctc tctccctggg
attgactccc aacttcaagt ccaacttcga cctcgccgag 780gacgctaagt
tgcagttgtc taaagacacc tacgacgatg acctcgacaa cttgctggcc
840cagataggcg accaatacgc cgatctcttc ctcgccgcta agaacttgtc
cgacgcaatc 900ctgctgtccg acatcctgag agtcaacact gagattacca
aagctcctct gtctgcttcc 960atgattaagc gctacgacga gcaccaccaa
gatctgaccc tgctcaaggc cctggtgaga 1020cagcagctgc ccgagaagta
caaggagatc tttttcgacc agtccaagaa cggctacgcc 1080ggatacattg
acggaggcgc ctcccaggaa gagttctaca agttcatcaa gcccatcctt
1140gagaagatgg acggtaccga ggagctgttg gtgaagttga acagagagga
cctgttgagg 1200aagcagagaa ccttcgacaa cggaagcatc cctcaccaaa
tccacctggg agagctccac 1260gccatcttga ggaggcagga ggatttctat
cccttcctga aggacaaccg cgagaagatt 1320gagaagatct tgaccttcag
aattccttac tacgtcgggc cactcgccag aggaaactct 1380aggttcgcct
ggatgacccg caaatctgaa gagaccatta ctccctggaa cttcgaggaa
1440gtcgtggaca agggcgcttc cgctcagtct ttcatcgaga ggatgaccaa
cttcgataaa 1500aatctgccca acgagaaggt gctgcccaag cactccctgt
tgtacgagta tttcacagtg 1560tacaacgagc tcaccaaggt gaagtacgtc
acagagggaa tgaggaagcc tgccttcttg 1620tccggagagc agaagaaggc
catcgtcgac ctgctcttca agaccaacag gaaggtgact 1680gtcaagcagc
tgaaggagga ctacttcaag aagatcgagt gcttcgactc cgtcgagatc
1740tctggtgtcg aggacaggtt caacgcctcc cttgggactt accacgatct
gctcaagatt 1800attaaagaca aggacttcct ggacaacgag gagaacgagg
acatccttga ggacatcgtg 1860ctcaccctga ccttgttcga agacagggaa
atgatcgaag agaggctcaa gacctacgcc 1920cacctcttcg acgacaaggt
gatgaaacag ctgaagagac gcagatatac cggctgggga 1980aggctctccc
gcaaattgat caacgggatc agggacaagc agtcagggaa gactatactc
2040gacttcctga agtccgacgg attcgccaac aggaacttca tgcagctcat
tcacgacgac 2100tccttgacct tcaaggagga catccagaag gctcaggtgt
ctggacaggg tgactccttg 2160catgagcaca ttgctaactt ggccggctct
cccgctatta agaagggcat tttgcagacc 2220gtgaaggtcg ttgacgagct
cgtgaaggtg atgggacgcc acaagccaga gaacatcgtt 2280attgagatgg
ctcgcgagaa ccaaactacc cagaaagggc agaagaattc ccgcgagagg
2340atgaagcgca ttgaggaggg cataaaagag cttggctctc agatcctcaa
ggagcacccc 2400gtcgagaaca ctcagctgca gaacgagaag ctgtacctgt
actacctcca aaacggaagg 2460gacatgtacg tggaccagga gctggacatc
aacaggttgt ccgactacga cgtcgaccac 2520atcgtgcctc agtccttcct
gaaggatgac tccatcgaca ataaagtgct gacacgctcc 2580gataaaaata
gaggcaagtc cgacaacgtc ccctccgagg aggtcgtgaa gaagatgaaa
2640aactactgga gacagctctt gaacgccaag ctcatcaccc agcgtaagtt
cgacaacctg 2700actaaggctg agagaggagg attgtccgag ctcgataagg
ccggattcat caagagacag 2760ctcgtcgaaa cccgccaaat taccaagcac
gtggcccaaa ttctggattc ccgcatgaac 2820accaagtacg atgaaaatga
caagctgatc cgcgaggtca aggtgatcac cttgaagtcc 2880aagctggtct
ccgacttccg caaggacttc cagttctaca aggtgaggga gatcaacaac
2940taccaccacg cacacgacgc ctacctcaac gctgtcgttg gaaccgccct
catcaaaaaa 3000tatcctaagc tggagtctga gttcgtctac ggcgactaca
aggtgtacga cgtgaggaag 3060atgatcgcta agtctgagca ggagatcggc
aaggccaccg ccaagtactt cttctactcc 3120aacatcatga acttcttcaa
gaccgagatc actctcgcca acggtgagat caggaagcgc 3180ccactgatcg
agaccaacgg tgagactgga gagatcgtgt gggacaaagg gagggatttc
3240gctactgtga ggaaggtgct ctccatgcct caggtgaaca tcgtcaagaa
gaccgaagtt 3300cagaccggag gattctccaa ggagtccatc ctccccaaga
gaaactccga caagctgatc 3360gctagaaaga aagactggga ccctaagaag
tacggaggct tcgattctcc taccgtggcc 3420tactctgtgc tggtcgtggc
caaggtggag aagggcaagt ccaagaagct gaaatccgtc 3480aaggagctcc
tcgggattac catcatggag aggagttcct tcgagaagaa ccctatcgac
3540ttcctggagg ccaagggata taaagaggtg aagaaggacc tcatcatcaa
gctgcccaag 3600tactccctct tcgagttgga gaacggaagg aagaggatgc
tggcttctgc cggagagttg 3660cagaagggaa atgagctcgc ccttccctcc
aagtacgtga acttcctgta cctcgcctct 3720cactatgaaa agttgaaggg
ctctcctgag gacaacgagc agaagcagct cttcgtggag 3780cagcacaagc
actacctgga cgaaattatc gagcagatct ctgagttctc caagcgcgtg
3840atattggccg acgccaacct cgacaaggtg ctgtccgcct acaacaagca
cagggataag 3900cccattcgcg agcaggctga aaacattatc cacctgttta
ccctcacaaa cttgggagcc 3960cctgctgcct tcaagtactt cgacaccacc
attgacagga agagatacac ctccaccaag 4020gaggtgctcg acgcaacact
catccaccaa tccatcaccg gcctctatga aacaaggatt 4080gacttgtccc
agctgggagg cgac 410449023DNAGlycine max 490gtttgtttgt tgttgggtgt
ggg 2349122DNAGlycine max 491tgttgttggg tgtgggaata gg
224929174DNAArtificial sequenceRTW1199 492ccgggtgtga tttagtataa
agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg
aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt
180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct
gaaacttgag agagatacac 300taatcttgcc ttgttgtttc attccctaac
ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca
caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg tttgtttgtt gttgggtgtg ttttagagct agaaatagca
480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc
ggtgcttttt 540tttgcggccg caattggatc gggtttactt attttgtggg
tatctatact tttattagat 600ttttaatcag gctcctgatt tctttttatt
tcgattgaat tcctgaactt gtattattca 660gtagatcgaa taaattataa
aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt
780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt
attactttta 840aaaaaatcat aaaggtttag tattttttta aaataaatat
aggaatagtt ttactattca 900ctgctttaat agaaaaatag tttaaaattt
aagatagttt taatcccagc atttgccacg 960tttgaacgtg agccgaaacg
atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg
1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct
caatcccacg 1140cgagtagagc acagtaacct tcaaataagc gaatggggca
taatcagaaa tccgaaataa 1200acctaggggc attatcggaa atgaaaagta
gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt atcactctgt
gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc
1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg
ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt
cgctttgttt ttgtggttca 1500gttttttagg attcttttgg tttttgaatc
gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg aggtgaatct
tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc
1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag
gtttgtttta 1740acaggattaa aagtttttta agcatgttga aggagtcttg
tagatatgta accgtcgata 1800gtttttttgt gggtttgttc acatgttatc
aagcttaatc ttttactatg tatgcgacca 1860tatctggatc cagcaaaggc
gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta
1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta
tttctttgaa 2040ttgattctag tttaagtaat ccatggcacc gaagaagaag
cgcaaggtga tggacaaaaa 2100gtactcaata gggctcgaca tagggactaa
ctccgttgga tgggccgtca tcaccgacga 2160gtacaaggtg ccctccaaga
agttcaaggt gttgggaaac accgacaggc acagcataaa 2220gaagaatttg
atcggtgccc tcctcttcga ctccggagag accgctgagg ctaccaggct
2280caagaggacc gctagaaggc gctacaccag aaggaagaac agaatctgct
acctgcagga 2340gatcttctcc aacgagatgg ccaaggtgga cgactccttc
ttccaccgcc ttgaggaatc 2400attcctggtg gaggaggata aaaagcacga
gagacaccca atcttcggga acatcgtcga 2460cgaggtggcc taccatgaaa
agtaccctac catctaccac ctgaggaaga agctggtcga 2520ctctaccgac
aaggctgact tgcgcttgat ttacctggct ctcgctcaca tgataaagtt
2580ccgcggacac ttcctcattg agggagacct gaacccagac aactccgacg
tggacaagct 2640cttcatccag ctcgttcaga cctacaacca gcttttcgag
gagaacccaa tcaacgccag 2700tggagttgac gccaaggcta tcctctctgc
tcgtctgtca aagtccagga ggcttgagaa 2760cttgattgcc cagctgcctg
gcgaaaagaa gaacggactg ttcggaaact tgatcgctct 2820ctccctggga
ttgactccca acttcaagtc caacttcgac ctcgccgagg acgctaagtt
2880gcagttgtct aaagacacct acgacgatga cctcgacaac ttgctggccc
agataggcga 2940ccaatacgcc gatctcttcc tcgccgctaa gaacttgtcc
gacgcaatcc tgctgtccga 3000catcctgaga gtcaacactg agattaccaa
agctcctctg tctgcttcca tgattaagcg 3060ctacgacgag caccaccaag
atctgaccct gctcaaggcc ctggtgagac agcagctgcc 3120cgagaagtac
aaggagatct ttttcgacca gtccaagaac ggctacgccg gatacattga
3180cggaggcgcc tcccaggaag agttctacaa gttcatcaag cccatccttg
agaagatgga 3240cggtaccgag gagctgttgg tgaagttgaa cagagaggac
ctgttgagga agcagagaac 3300cttcgacaac ggaagcatcc ctcaccaaat
ccacctggga gagctccacg ccatcttgag 3360gaggcaggag gatttctatc
ccttcctgaa ggacaaccgc gagaagattg agaagatctt 3420gaccttcaga
attccttact acgtcgggcc actcgccaga ggaaactcta ggttcgcctg
3480gatgacccgc aaatctgaag agaccattac tccctggaac ttcgaggaag
tcgtggacaa 3540gggcgcttcc gctcagtctt tcatcgagag gatgaccaac
ttcgataaaa atctgcccaa 3600cgagaaggtg ctgcccaagc actccctgtt
gtacgagtat ttcacagtgt acaacgagct 3660caccaaggtg aagtacgtca
cagagggaat gaggaagcct gccttcttgt ccggagagca 3720gaagaaggcc
atcgtcgacc tgctcttcaa gaccaacagg aaggtgactg tcaagcagct
3780gaaggaggac tacttcaaga agatcgagtg cttcgactcc gtcgagatct
ctggtgtcga 3840ggacaggttc aacgcctccc ttgggactta ccacgatctg
ctcaagatta ttaaagacaa 3900ggacttcctg gacaacgagg agaacgagga
catccttgag gacatcgtgc tcaccctgac 3960cttgttcgaa gacagggaaa
tgatcgaaga gaggctcaag acctacgccc acctcttcga 4020cgacaaggtg
atgaaacagc tgaagagacg cagatatacc ggctggggaa ggctctcccg
4080caaattgatc aacgggatca gggacaagca gtcagggaag actatactcg
acttcctgaa 4140gtccgacgga ttcgccaaca ggaacttcat gcagctcatt
cacgacgact ccttgacctt 4200caaggaggac atccagaagg ctcaggtgtc
tggacagggt gactccttgc atgagcacat 4260tgctaacttg gccggctctc
ccgctattaa gaagggcatt ttgcagaccg tgaaggtcgt 4320tgacgagctc
gtgaaggtga tgggacgcca caagccagag aacatcgtta ttgagatggc
4380tcgcgagaac caaactaccc agaaagggca gaagaattcc cgcgagagga
tgaagcgcat 4440tgaggagggc ataaaagagc ttggctctca gatcctcaag
gagcaccccg tcgagaacac 4500tcagctgcag aacgagaagc tgtacctgta
ctacctccaa aacggaaggg acatgtacgt 4560ggaccaggag ctggacatca
acaggttgtc cgactacgac gtcgaccaca tcgtgcctca 4620gtccttcctg
aaggatgact ccatcgacaa taaagtgctg acacgctccg ataaaaatag
4680aggcaagtcc gacaacgtcc cctccgagga ggtcgtgaag aagatgaaaa
actactggag 4740acagctcttg aacgccaagc tcatcaccca gcgtaagttc
gacaacctga ctaaggctga 4800gagaggagga ttgtccgagc tcgataaggc
cggattcatc aagagacagc tcgtcgaaac 4860ccgccaaatt accaagcacg
tggcccaaat tctggattcc cgcatgaaca ccaagtacga 4920tgaaaatgac
aagctgatcc gcgaggtcaa ggtgatcacc ttgaagtcca agctggtctc
4980cgacttccgc aaggacttcc agttctacaa ggtgagggag atcaacaact
accaccacgc 5040acacgacgcc tacctcaacg ctgtcgttgg aaccgccctc
atcaaaaaat atcctaagct 5100ggagtctgag ttcgtctacg gcgactacaa
ggtgtacgac gtgaggaaga tgatcgctaa 5160gtctgagcag gagatcggca
aggccaccgc caagtacttc ttctactcca acatcatgaa 5220cttcttcaag
accgagatca ctctcgccaa cggtgagatc aggaagcgcc cactgatcga
5280gaccaacggt gagactggag agatcgtgtg ggacaaaggg agggatttcg
ctactgtgag 5340gaaggtgctc tccatgcctc aggtgaacat cgtcaagaag
accgaagttc agaccggagg 5400attctccaag gagtccatcc tccccaagag
aaactccgac aagctgatcg ctagaaagaa 5460agactgggac cctaagaagt
acggaggctt cgattctcct accgtggcct actctgtgct 5520ggtcgtggcc
aaggtggaga agggcaagtc caagaagctg aaatccgtca aggagctcct
5580cgggattacc atcatggaga ggagttcctt cgagaagaac cctatcgact
tcctggaggc 5640caagggatat aaagaggtga agaaggacct catcatcaag
ctgcccaagt actccctctt 5700cgagttggag aacggaagga agaggatgct
ggcttctgcc ggagagttgc agaagggaaa 5760tgagctcgcc cttccctcca
agtacgtgaa cttcctgtac ctcgcctctc actatgaaaa 5820gttgaagggc
tctcctgagg acaacgagca gaagcagctc ttcgtggagc agcacaagca
5880ctacctggac gaaattatcg agcagatctc tgagttctcc aagcgcgtga
tattggccga 5940cgccaacctc gacaaggtgc tgtccgccta caacaagcac
agggataagc ccattcgcga 6000gcaggctgaa aacattatcc acctgtttac
cctcacaaac ttgggagccc ctgctgcctt 6060caagtacttc gacaccacca
ttgacaggaa gagatacacc tccaccaagg aggtgctcga 6120cgcaacactc
atccaccaat ccatcaccgg cctctatgaa acaaggattg acttgtccca
6180gctgggaggc gactctagag ccgatcccaa gaagaagaga aaggtgaaga
gaccacggga 6240ccgccacgat ggcgagctgg gaggccgcaa gcgggcaagg
taggttaacc tagacttgtc 6300catcttctgg attggccaac ttaattaatg
tatgaaataa aaggatgcac acatagtgac 6360atgctaatca ctataatgtg
ggcatcaaag ttgtgtgtta tgtgtaatta ctagttatct 6420gaataaaaga
gaaagagatc atccatattt cttatcctaa atgaatgtca cgtgtcttta
6480taattctttg atgaaccaga tgcatttcat taaccaaatc catatacata
taaatattaa 6540tcatatataa ttaatatcaa ttgggttagc aaaacaaatc
tagtctaggt gtgttttgcg 6600aattcgatat caagcttatc gataccgtcg
agggggggcc cggtaccggc gcgccgttct 6660atagtgtcac ctaaatcgta
tgtgtatgat acataaggtt atgtattaat tgtagccgcg 6720ttctaacgac
aatatgtcca tatggtgcac tctcagtaca atctgctctg atgccgcata
6780gttaagccag ccccgacacc cgccaacacc cgctgacgcg ccctgacggg
cttgtctgct 6840cccggcatcc gcttacagac aagctgtgac cgtctccggg
agctgcatgt gtcagaggtt 6900ttcaccgtca tcaccgaaac gcgcgagacg
aaagggcctc gtgatacgcc tatttttata 6960ggttaatgtc atgaccaaaa
tcccttaacg tgagttttcg ttccactgag cgtcagaccc 7020cgtagaaaag
atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt
7080gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag
agctaccaac 7140tctttttccg aaggtaactg gcttcagcag agcgcagata
ccaaatactg ttcttctagt 7200gtagccgtag ttaggccacc acttcaagaa
ctctgtagca ccgcctacat acctcgctct 7260gctaatcctg ttaccagtgg
ctgctgccag tggcgataag tcgtgtctta ccgggttgga 7320ctcaagacga
tagttaccgg ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac
7380acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc
gtgagctatg 7440agaaagcgcc acgcttcccg aagggagaaa ggcggacagg
tatccggtaa gcggcagggt 7500cggaacagga gagcgcacga gggagcttcc
agggggaaac gcctggtatc tttatagtcc 7560tgtcgggttt cgccacctct
gacttgagcg tcgatttttg tgatgctcgt caggggggcg 7620gagcctatgg
aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct tttgctggcc
7680ttttgctcac atgttctttc ctgcgttatc ccctgattct gtggataacc
gtattaccgc 7740ctttgagtga gctgataccg ctcgccgcag ccgaacgacc
gagcgcagcg agtcagtgag 7800cgaggaagcg gaagagcgcc caatacgcaa
accgcctctc cccgcgcgtt ggccgattca 7860ttaatgcagg ttgatcagat
ctcgatcccg cgaaattaat acgactcact atagggagac 7920cacaacggtt
tccctctaga aataattttg tttaacttta agaaggagat atacccatgg
7980aaaagcctga actcaccgcg acgtctgtcg agaagtttct gatcgaaaag
ttcgacagcg 8040tctccgacct gatgcagctc tcggagggcg aagaatctcg
tgctttcagc ttcgatgtag 8100gagggcgtgg atatgtcctg cgggtaaata
gctgcgccga tggtttctac aaagatcgtt 8160atgtttatcg gcactttgca
tcggccgcgc tcccgattcc ggaagtgctt gacattgggg 8220aattcagcga
gagcctgacc tattgcatct cccgccgtgc acagggtgtc acgttgcaag
8280acctgcctga aaccgaactg cccgctgttc tgcagccggt cgcggaggct
atggatgcga 8340tcgctgcggc cgatcttagc cagacgagcg ggttcggccc
attcggaccg caaggaatcg 8400gtcaatacac tacatggcgt gatttcatat
gcgcgattgc tgatccccat gtgtatcact 8460ggcaaactgt gatggacgac
accgtcagtg cgtccgtcgc gcaggctctc gatgagctga 8520tgctttgggc
cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat ttcggctcca
8580acaatgtcct gacggacaat ggccgcataa cagcggtcat tgactggagc
gaggcgatgt 8640tcggggattc ccaatacgag gtcgccaaca tcttcttctg
gaggccgtgg ttggcttgta 8700tggagcagca gacgcgctac ttcgagcgga
ggcatccgga gcttgcagga tcgccgcggc 8760tccgggcgta tatgctccgc
attggtcttg accaactcta tcagagcttg gttgacggca 8820atttcgatga
tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga tccggagccg
8880ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc
gatggctgtg 8940tagaagtact cgccgatagt ggaaaccgac gccccagcac
tcgtccgagg gcaaaggaat 9000agtgaggtac agcttggatc gatccggctg
ctaacaaagc ccgaaaggaa gctgagttgg 9060ctgctgccac cgctgagcaa
taactagcat aaccccttgg ggcctctaaa cgggtcttga 9120ggggtttttt
gctgaaagga ggaactatat ccggatgctc gggcgcgccg gtac
91744939174DNAArtificial sequenceRTW1200 493ccgggtgtga tttagtataa
agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg
aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt
180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct
gaaacttgag agagatacac 300taatcttgcc ttgttgtttc attccctaac
ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca
caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg tgttgttggg tgtgggaatg ttttagagct agaaatagca
480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc
ggtgcttttt 540tttgcggccg caattggatc gggtttactt attttgtggg
tatctatact tttattagat 600ttttaatcag gctcctgatt tctttttatt
tcgattgaat tcctgaactt gtattattca 660gtagatcgaa taaattataa
aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt
780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt
attactttta 840aaaaaatcat aaaggtttag tattttttta aaataaatat
aggaatagtt ttactattca 900ctgctttaat agaaaaatag tttaaaattt
aagatagttt taatcccagc atttgccacg 960tttgaacgtg agccgaaacg
atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg
1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct
caatcccacg 1140cgagtagagc acagtaacct tcaaataagc gaatggggca
taatcagaaa tccgaaataa 1200acctaggggc attatcggaa atgaaaagta
gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt atcactctgt
gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc
1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg
ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt
cgctttgttt ttgtggttca 1500gttttttagg attcttttgg tttttgaatc
gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg aggtgaatct
tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc
1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag
gtttgtttta 1740acaggattaa aagtttttta agcatgttga aggagtcttg
tagatatgta accgtcgata 1800gtttttttgt gggtttgttc acatgttatc
aagcttaatc ttttactatg tatgcgacca 1860tatctggatc cagcaaaggc
gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta
1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta
tttctttgaa 2040ttgattctag tttaagtaat ccatggcacc gaagaagaag
cgcaaggtga tggacaaaaa 2100gtactcaata gggctcgaca tagggactaa
ctccgttgga tgggccgtca tcaccgacga 2160gtacaaggtg ccctccaaga
agttcaaggt gttgggaaac accgacaggc acagcataaa 2220gaagaatttg
atcggtgccc tcctcttcga ctccggagag accgctgagg ctaccaggct
2280caagaggacc gctagaaggc gctacaccag aaggaagaac agaatctgct
acctgcagga 2340gatcttctcc aacgagatgg ccaaggtgga cgactccttc
ttccaccgcc ttgaggaatc 2400attcctggtg gaggaggata aaaagcacga
gagacaccca atcttcggga acatcgtcga 2460cgaggtggcc taccatgaaa
agtaccctac catctaccac ctgaggaaga agctggtcga 2520ctctaccgac
aaggctgact tgcgcttgat ttacctggct ctcgctcaca tgataaagtt
2580ccgcggacac ttcctcattg agggagacct gaacccagac aactccgacg
tggacaagct 2640cttcatccag ctcgttcaga cctacaacca gcttttcgag
gagaacccaa tcaacgccag 2700tggagttgac gccaaggcta tcctctctgc
tcgtctgtca aagtccagga ggcttgagaa 2760cttgattgcc cagctgcctg
gcgaaaagaa gaacggactg ttcggaaact tgatcgctct 2820ctccctggga
ttgactccca acttcaagtc caacttcgac ctcgccgagg acgctaagtt
2880gcagttgtct aaagacacct acgacgatga cctcgacaac ttgctggccc
agataggcga 2940ccaatacgcc gatctcttcc tcgccgctaa gaacttgtcc
gacgcaatcc tgctgtccga 3000catcctgaga gtcaacactg agattaccaa
agctcctctg tctgcttcca tgattaagcg 3060ctacgacgag caccaccaag
atctgaccct gctcaaggcc ctggtgagac agcagctgcc 3120cgagaagtac
aaggagatct ttttcgacca gtccaagaac ggctacgccg gatacattga
3180cggaggcgcc tcccaggaag agttctacaa gttcatcaag cccatccttg
agaagatgga 3240cggtaccgag gagctgttgg tgaagttgaa cagagaggac
ctgttgagga agcagagaac 3300cttcgacaac ggaagcatcc ctcaccaaat
ccacctggga gagctccacg ccatcttgag 3360gaggcaggag gatttctatc
ccttcctgaa ggacaaccgc gagaagattg agaagatctt 3420gaccttcaga
attccttact acgtcgggcc actcgccaga ggaaactcta ggttcgcctg
3480gatgacccgc aaatctgaag agaccattac tccctggaac ttcgaggaag
tcgtggacaa 3540gggcgcttcc gctcagtctt tcatcgagag gatgaccaac
ttcgataaaa atctgcccaa 3600cgagaaggtg ctgcccaagc actccctgtt
gtacgagtat ttcacagtgt acaacgagct 3660caccaaggtg aagtacgtca
cagagggaat gaggaagcct gccttcttgt ccggagagca 3720gaagaaggcc
atcgtcgacc tgctcttcaa gaccaacagg aaggtgactg tcaagcagct
3780gaaggaggac tacttcaaga agatcgagtg cttcgactcc gtcgagatct
ctggtgtcga 3840ggacaggttc aacgcctccc ttgggactta ccacgatctg
ctcaagatta ttaaagacaa 3900ggacttcctg gacaacgagg agaacgagga
catccttgag gacatcgtgc tcaccctgac 3960cttgttcgaa gacagggaaa
tgatcgaaga gaggctcaag acctacgccc acctcttcga 4020cgacaaggtg
atgaaacagc tgaagagacg cagatatacc ggctggggaa ggctctcccg
4080caaattgatc aacgggatca gggacaagca gtcagggaag actatactcg
acttcctgaa 4140gtccgacgga ttcgccaaca ggaacttcat gcagctcatt
cacgacgact ccttgacctt 4200caaggaggac atccagaagg ctcaggtgtc
tggacagggt gactccttgc atgagcacat 4260tgctaacttg gccggctctc
ccgctattaa gaagggcatt ttgcagaccg tgaaggtcgt 4320tgacgagctc
gtgaaggtga tgggacgcca caagccagag aacatcgtta ttgagatggc
4380tcgcgagaac caaactaccc agaaagggca gaagaattcc cgcgagagga
tgaagcgcat 4440tgaggagggc ataaaagagc ttggctctca gatcctcaag
gagcaccccg tcgagaacac 4500tcagctgcag aacgagaagc tgtacctgta
ctacctccaa aacggaaggg acatgtacgt 4560ggaccaggag ctggacatca
acaggttgtc cgactacgac gtcgaccaca tcgtgcctca 4620gtccttcctg
aaggatgact ccatcgacaa taaagtgctg acacgctccg ataaaaatag
4680aggcaagtcc gacaacgtcc cctccgagga ggtcgtgaag aagatgaaaa
actactggag 4740acagctcttg aacgccaagc tcatcaccca gcgtaagttc
gacaacctga ctaaggctga 4800gagaggagga ttgtccgagc tcgataaggc
cggattcatc aagagacagc tcgtcgaaac 4860ccgccaaatt accaagcacg
tggcccaaat tctggattcc cgcatgaaca ccaagtacga 4920tgaaaatgac
aagctgatcc gcgaggtcaa ggtgatcacc ttgaagtcca agctggtctc
4980cgacttccgc aaggacttcc agttctacaa ggtgagggag atcaacaact
accaccacgc 5040acacgacgcc tacctcaacg ctgtcgttgg aaccgccctc
atcaaaaaat atcctaagct 5100ggagtctgag ttcgtctacg gcgactacaa
ggtgtacgac gtgaggaaga tgatcgctaa 5160gtctgagcag gagatcggca
aggccaccgc caagtacttc ttctactcca acatcatgaa 5220cttcttcaag
accgagatca ctctcgccaa cggtgagatc aggaagcgcc cactgatcga
5280gaccaacggt gagactggag agatcgtgtg ggacaaaggg agggatttcg
ctactgtgag 5340gaaggtgctc tccatgcctc aggtgaacat cgtcaagaag
accgaagttc agaccggagg 5400attctccaag gagtccatcc tccccaagag
aaactccgac aagctgatcg ctagaaagaa 5460agactgggac cctaagaagt
acggaggctt cgattctcct accgtggcct actctgtgct 5520ggtcgtggcc
aaggtggaga agggcaagtc caagaagctg aaatccgtca aggagctcct
5580cgggattacc atcatggaga ggagttcctt cgagaagaac cctatcgact
tcctggaggc 5640caagggatat aaagaggtga agaaggacct catcatcaag
ctgcccaagt actccctctt 5700cgagttggag aacggaagga agaggatgct
ggcttctgcc ggagagttgc agaagggaaa 5760tgagctcgcc cttccctcca
agtacgtgaa cttcctgtac ctcgcctctc actatgaaaa 5820gttgaagggc
tctcctgagg acaacgagca gaagcagctc ttcgtggagc agcacaagca
5880ctacctggac gaaattatcg agcagatctc tgagttctcc aagcgcgtga
tattggccga 5940cgccaacctc gacaaggtgc tgtccgccta caacaagcac
agggataagc ccattcgcga 6000gcaggctgaa aacattatcc acctgtttac
cctcacaaac ttgggagccc ctgctgcctt 6060caagtacttc gacaccacca
ttgacaggaa gagatacacc tccaccaagg aggtgctcga 6120cgcaacactc
atccaccaat ccatcaccgg cctctatgaa acaaggattg acttgtccca
6180gctgggaggc gactctagag ccgatcccaa gaagaagaga aaggtgaaga
gaccacggga 6240ccgccacgat ggcgagctgg gaggccgcaa gcgggcaagg
taggttaacc tagacttgtc 6300catcttctgg attggccaac ttaattaatg
tatgaaataa aaggatgcac acatagtgac 6360atgctaatca ctataatgtg
ggcatcaaag ttgtgtgtta tgtgtaatta ctagttatct 6420gaataaaaga
gaaagagatc atccatattt cttatcctaa atgaatgtca cgtgtcttta
6480taattctttg atgaaccaga tgcatttcat taaccaaatc catatacata
taaatattaa 6540tcatatataa ttaatatcaa ttgggttagc aaaacaaatc
tagtctaggt gtgttttgcg 6600aattcgatat caagcttatc gataccgtcg
agggggggcc cggtaccggc gcgccgttct 6660atagtgtcac ctaaatcgta
tgtgtatgat acataaggtt atgtattaat tgtagccgcg 6720ttctaacgac
aatatgtcca tatggtgcac tctcagtaca atctgctctg atgccgcata
6780gttaagccag ccccgacacc cgccaacacc cgctgacgcg ccctgacggg
cttgtctgct 6840cccggcatcc gcttacagac aagctgtgac cgtctccggg
agctgcatgt gtcagaggtt 6900ttcaccgtca tcaccgaaac gcgcgagacg
aaagggcctc gtgatacgcc tatttttata 6960ggttaatgtc atgaccaaaa
tcccttaacg tgagttttcg ttccactgag cgtcagaccc 7020cgtagaaaag
atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt
7080gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag
agctaccaac 7140tctttttccg aaggtaactg gcttcagcag agcgcagata
ccaaatactg ttcttctagt 7200gtagccgtag ttaggccacc acttcaagaa
ctctgtagca ccgcctacat acctcgctct 7260gctaatcctg ttaccagtgg
ctgctgccag tggcgataag tcgtgtctta ccgggttgga 7320ctcaagacga
tagttaccgg ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac
7380acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc
gtgagctatg 7440agaaagcgcc acgcttcccg aagggagaaa ggcggacagg
tatccggtaa gcggcagggt 7500cggaacagga gagcgcacga gggagcttcc
agggggaaac gcctggtatc tttatagtcc 7560tgtcgggttt cgccacctct
gacttgagcg tcgatttttg tgatgctcgt caggggggcg 7620gagcctatgg
aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct tttgctggcc
7680ttttgctcac atgttctttc ctgcgttatc ccctgattct gtggataacc
gtattaccgc 7740ctttgagtga gctgataccg ctcgccgcag ccgaacgacc
gagcgcagcg agtcagtgag 7800cgaggaagcg gaagagcgcc caatacgcaa
accgcctctc cccgcgcgtt ggccgattca 7860ttaatgcagg ttgatcagat
ctcgatcccg cgaaattaat acgactcact atagggagac 7920cacaacggtt
tccctctaga aataattttg tttaacttta agaaggagat atacccatgg
7980aaaagcctga actcaccgcg acgtctgtcg agaagtttct gatcgaaaag
ttcgacagcg 8040tctccgacct gatgcagctc tcggagggcg aagaatctcg
tgctttcagc ttcgatgtag 8100gagggcgtgg atatgtcctg cgggtaaata
gctgcgccga tggtttctac aaagatcgtt 8160atgtttatcg gcactttgca
tcggccgcgc tcccgattcc ggaagtgctt gacattgggg 8220aattcagcga
gagcctgacc tattgcatct cccgccgtgc acagggtgtc acgttgcaag
8280acctgcctga aaccgaactg cccgctgttc tgcagccggt cgcggaggct
atggatgcga 8340tcgctgcggc cgatcttagc cagacgagcg ggttcggccc
attcggaccg caaggaatcg 8400gtcaatacac tacatggcgt gatttcatat
gcgcgattgc tgatccccat gtgtatcact 8460ggcaaactgt gatggacgac
accgtcagtg cgtccgtcgc gcaggctctc gatgagctga 8520tgctttgggc
cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat ttcggctcca
8580acaatgtcct gacggacaat ggccgcataa cagcggtcat tgactggagc
gaggcgatgt 8640tcggggattc ccaatacgag gtcgccaaca tcttcttctg
gaggccgtgg ttggcttgta 8700tggagcagca gacgcgctac ttcgagcgga
ggcatccgga gcttgcagga tcgccgcggc 8760tccgggcgta tatgctccgc
attggtcttg accaactcta tcagagcttg gttgacggca 8820atttcgatga
tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga tccggagccg
8880ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc
gatggctgtg 8940tagaagtact cgccgatagt ggaaaccgac gccccagcac
tcgtccgagg gcaaaggaat 9000agtgaggtac agcttggatc gatccggctg
ctaacaaagc ccgaaaggaa gctgagttgg 9060ctgctgccac cgctgagcaa
taactagcat aaccccttgg ggcctctaaa cgggtcttga 9120ggggtttttt
gctgaaagga ggaactatat ccggatgctc gggcgcgccg gtac
91744943175DNAArtificial sequenceRTW1190A 494cgaattctac aggtcactaa
taccatctaa gtagttggtt catagtgact gcatatgtaa 60aaattatcct tattttaagg
aaattaaaaa ttatcatata tatataagtt ttaaattaat 120tatcttatat
atgtaccaaa aagttttaaa gcaattatta taaaaattaa taaatttatc
180atataaaata atttataatt aaattttaaa ttatcaattc attaaattaa
attatttaaa 240atttttgaat gataatataa taattttatc ctctactaag
tcccaacgtt tcctatttta 300ttccactttt agcaataaat tttgtcataa
acacttataa caaaaaaagt aagtaaaaaa 360taaaaaaaag tttttcaata
aagtataaac taatttgtat aaacttttag aaaaaataaa 420gttatacatt
gataatataa attttttaca taattatccg atcaactcat tatatatgat
480aaatttattg attttttaaa ataattatct taaaataatt taaacaatga
tttgcaatta 540gatgataata taaaattatt ttacacacta catgtattaa
actcaaactt ttatatatta 600gtttttctaa aaactaattt ttaactcaaa
aaaaatgtta cttataattt tcttatcttc 660tttttttata agtatttttt
aagaaattta ttgaaacatg accatgcttg ggtcaataat 720actactctct
tagacaccaa acaacccttc ccaaactata atctaatcca aaagccatca
780ttcattttcc ttggtaggta aagttccaag accttcacca actttttcac
tcaattgttt 840tggtgtaagc aattcgacat gtgttagtgt tagttggcaa
ccaaaaatcc ctttatgtga 900ctcaatccaa caaccactca caccaccaac
ccccataacc atttctcaca atacccttca 960tttacacatt atcatcacca
aaaataaata aaaaaaacct ctcatttcag agagagagag 1020agagacttca
cagaccaaag tgcagagaac aacaaagttc acaactttaa ggaaaattga
1080aatggcccaa gtgagcagag tgcacaatct tgctcaaagc actcaaattt
ttggccattc 1140ttccaactcc aacaaactca aatcggtgaa ttcggtttca
ttgaggccac gcctttgggg 1200ggcctcaaaa tctcgcatcc cgatgcataa
aaatggaagc tttatgggaa attttaatgt 1260ggggaaggga aattccggcg
tgtttaaggt ttctgcatcg gtcgccgccg cagagaagcc 1320gtcaacgtcg
ccggagatcg tgttggaacc catcaaagac ttctcgggta ccatcacatt
1380gccagggtcc aagtctctgt ccaatcgaat tttgcttctt gctgctctct
ctgaggttcg 1440tagatttctt ccgttttttt ttcttcttct ttattgtttg
ttctacatca gcatgatgtt 1500gatttgattg tgttttctat cgtttcatcg
attataaatt ttcataatca gaagattcag 1560cttttattaa tgcaagaacg
tccttaattg atgattttat aaccgtaaat taggtctaat 1620tagagttttt
ttcataaaga ttttcagatc cgtttacaac aagccttaat tgttgattct
1680gtagtcgtag attaaggttt ttttcatgaa ctacttcaga tccgttaaac
aacagcctta 1740tttgttgata cttcagtcgt ttttcaagaa attgttcaga
tccgttgata aaagccttat 1800tcgttgattc tgtatggtat ttcaagagat
attgctcagg tcctttagca actaccttat 1860ttgttgattc tgtggccata
gattaggatt ttttttcacg aaattgcttc ttgaaattac 1920gtgatggatt
ttgattctga tttatcttgt gattgttgac tctacaggga acaactgttg
1980tagacaactt gttgtatagt gaggatattc attacatgct tggtgcatta
aggacccttg 2040gactgcgtgt ggaagatgac aaaacaacca aacaagcaat
tgttgaaggc tgtgggggat 2100tgtttcccac tagtaaggaa tctaaagatg
aaatcaattt attccttgga aatgctggta 2160ttgcaatgag atctttgaca
gcagctgttg ttgctgcagg tggaaatgca aggtctgttt 2220tttttttttt
tgttcagcat aatctttgaa ttgttcctcg tataactaat cacaacagag
2280tacgtgttct tcttcctgtt ataatctaaa aatctcatcc agattagtca
tcctttcttc 2340ttaaaaggaa cctttaatta tcaatgtatt tatttaatat
ttaaattagc ttgtcaaagt 2400ctagcatata catattttga ttatattctg
agaaatgcac ctgagggtgt tcctcatgat 2460ctacttcaac ctctgttatt
attagatttt ctatcatgat tactggtttg agtctctaag 2520tagaccatct
tgatgttcaa aatatttcag ctacgtactt gatggggtgc cccgaatgag
2580agagaggcca attggggatt tggttgctgg tcttaagcaa cttggtgcag
atgttgattg 2640ctttcttggc acaaactgtc cacctgttcg tgtaaatggg
aagggaggac ttcctggcgg 2700aaaggtatgg tttggatttc atttagaata
aggtggagta actttcctgg atcaaaattc 2760taatttaaga agcctccctg
ttttcctctc tttagaataa gactaagggt aggtttagga 2820gttgggtttt
ggagagaaat ggaagggaga gcaatttttt tcttcttcta ataaatattc
2880tttaatttga tacatttttt aagtaaaaga atataaagat agattagcat
aacttaatgt 2940tttaatcttt tatttatttt tataaatatt atatacctgt
ctatttaaaa atcaaatatt 3000tgtcctccat tccctttccc ttcaaaacct
cagttccaaa tataccgtag ttgaattata 3060ttttggaagg cctattggtt
ggagactttt ccttttcaga gattatccct cacctttatt 3120atagcctttc
tatttttaaa cttcatatag acgccattct tggggcggcc gcgat
317549523DNAArtificial sequenceprimer, soy1-F3 495gtttgtttgt
tgttgggtgt ggg 2349625DNAArtificial sequenceprimer, soy1-R3
496gacatgatgc ttcattttca cagaa 2549718DNAArtificial sequenceprobe,
soy1-T2(FAM-MGB) 497tgtgtagagt ggattttg 1849822DNAArtificial
sequenceprimer, soy1-F2 498tgttgttggg tgtgggaata gg
2249931DNAArtificial sequenceWOL1001, Forward_primer 499aggtttaatt
ttatataatg ttagcataca g 3150028DNAArtificial sequenceWOL1002,
Reverse_primer 500atcaacatca tgctgatgta gaacaaac
2850129DNAArtificial sequenceWOL1003, Forward_primer
501attctgattt atcttgtgat tgttgactc 2950227DNAArtificial
sequenceWOL1004, Reverse_primer 502atttactttg gagagaataa ggagggg
2750323DNAGlycine max 503gaaacgttgg gacttagtag agg
2350423DNAGlycine max 504ggaataaaat aggaaacgtt ggg
235059174DNAArtificial sequenceRTW1201 505ccgggtgtga tttagtataa
agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg
aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt
180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct
gaaacttgag agagatacac 300taatcttgcc ttgttgtttc attccctaac
ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca
caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg aaacgttggg acttagtagg ttttagagct agaaatagca
480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc
ggtgcttttt 540tttgcggccg caattggatc gggtttactt attttgtggg
tatctatact tttattagat 600ttttaatcag gctcctgatt tctttttatt
tcgattgaat tcctgaactt gtattattca 660gtagatcgaa taaattataa
aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt
780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt
attactttta 840aaaaaatcat aaaggtttag tattttttta aaataaatat
aggaatagtt ttactattca 900ctgctttaat agaaaaatag tttaaaattt
aagatagttt taatcccagc atttgccacg 960tttgaacgtg agccgaaacg
atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg
1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct
caatcccacg 1140cgagtagagc acagtaacct tcaaataagc gaatggggca
taatcagaaa tccgaaataa 1200acctaggggc attatcggaa atgaaaagta
gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt atcactctgt
gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc
1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg
ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt
cgctttgttt ttgtggttca 1500gttttttagg attcttttgg tttttgaatc
gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg aggtgaatct
tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc
1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag
gtttgtttta 1740acaggattaa aagtttttta agcatgttga aggagtcttg
tagatatgta accgtcgata 1800gtttttttgt gggtttgttc acatgttatc
aagcttaatc ttttactatg tatgcgacca 1860tatctggatc cagcaaaggc
gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta
1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta
tttctttgaa 2040ttgattctag tttaagtaat ccatggcacc gaagaagaag
cgcaaggtga tggacaaaaa 2100gtactcaata gggctcgaca tagggactaa
ctccgttgga tgggccgtca tcaccgacga 2160gtacaaggtg ccctccaaga
agttcaaggt gttgggaaac accgacaggc acagcataaa 2220gaagaatttg
atcggtgccc tcctcttcga ctccggagag accgctgagg ctaccaggct
2280caagaggacc gctagaaggc gctacaccag aaggaagaac agaatctgct
acctgcagga 2340gatcttctcc aacgagatgg ccaaggtgga cgactccttc
ttccaccgcc ttgaggaatc 2400attcctggtg gaggaggata aaaagcacga
gagacaccca atcttcggga acatcgtcga 2460cgaggtggcc taccatgaaa
agtaccctac catctaccac ctgaggaaga agctggtcga 2520ctctaccgac
aaggctgact tgcgcttgat ttacctggct ctcgctcaca tgataaagtt
2580ccgcggacac ttcctcattg agggagacct gaacccagac aactccgacg
tggacaagct 2640cttcatccag ctcgttcaga cctacaacca gcttttcgag
gagaacccaa tcaacgccag 2700tggagttgac gccaaggcta tcctctctgc
tcgtctgtca aagtccagga ggcttgagaa 2760cttgattgcc cagctgcctg
gcgaaaagaa gaacggactg ttcggaaact tgatcgctct 2820ctccctggga
ttgactccca acttcaagtc caacttcgac ctcgccgagg acgctaagtt
2880gcagttgtct aaagacacct acgacgatga cctcgacaac ttgctggccc
agataggcga 2940ccaatacgcc gatctcttcc tcgccgctaa gaacttgtcc
gacgcaatcc tgctgtccga 3000catcctgaga gtcaacactg agattaccaa
agctcctctg tctgcttcca tgattaagcg 3060ctacgacgag caccaccaag
atctgaccct gctcaaggcc ctggtgagac agcagctgcc 3120cgagaagtac
aaggagatct ttttcgacca gtccaagaac ggctacgccg gatacattga
3180cggaggcgcc tcccaggaag agttctacaa gttcatcaag cccatccttg
agaagatgga 3240cggtaccgag gagctgttgg tgaagttgaa cagagaggac
ctgttgagga agcagagaac 3300cttcgacaac ggaagcatcc ctcaccaaat
ccacctggga gagctccacg ccatcttgag 3360gaggcaggag gatttctatc
ccttcctgaa ggacaaccgc gagaagattg agaagatctt 3420gaccttcaga
attccttact acgtcgggcc actcgccaga ggaaactcta ggttcgcctg
3480gatgacccgc aaatctgaag agaccattac tccctggaac ttcgaggaag
tcgtggacaa 3540gggcgcttcc gctcagtctt tcatcgagag gatgaccaac
ttcgataaaa atctgcccaa 3600cgagaaggtg ctgcccaagc actccctgtt
gtacgagtat ttcacagtgt acaacgagct 3660caccaaggtg aagtacgtca
cagagggaat gaggaagcct gccttcttgt ccggagagca 3720gaagaaggcc
atcgtcgacc tgctcttcaa gaccaacagg aaggtgactg tcaagcagct
3780gaaggaggac tacttcaaga agatcgagtg cttcgactcc gtcgagatct
ctggtgtcga 3840ggacaggttc aacgcctccc ttgggactta ccacgatctg
ctcaagatta ttaaagacaa 3900ggacttcctg gacaacgagg agaacgagga
catccttgag gacatcgtgc tcaccctgac 3960cttgttcgaa gacagggaaa
tgatcgaaga gaggctcaag acctacgccc acctcttcga 4020cgacaaggtg
atgaaacagc tgaagagacg cagatatacc ggctggggaa ggctctcccg
4080caaattgatc aacgggatca gggacaagca gtcagggaag actatactcg
acttcctgaa 4140gtccgacgga ttcgccaaca ggaacttcat gcagctcatt
cacgacgact ccttgacctt 4200caaggaggac atccagaagg ctcaggtgtc
tggacagggt gactccttgc atgagcacat 4260tgctaacttg gccggctctc
ccgctattaa gaagggcatt ttgcagaccg tgaaggtcgt 4320tgacgagctc
gtgaaggtga tgggacgcca caagccagag aacatcgtta ttgagatggc
4380tcgcgagaac caaactaccc agaaagggca gaagaattcc cgcgagagga
tgaagcgcat 4440tgaggagggc ataaaagagc ttggctctca gatcctcaag
gagcaccccg tcgagaacac 4500tcagctgcag aacgagaagc tgtacctgta
ctacctccaa aacggaaggg acatgtacgt 4560ggaccaggag ctggacatca
acaggttgtc cgactacgac gtcgaccaca tcgtgcctca 4620gtccttcctg
aaggatgact ccatcgacaa taaagtgctg acacgctccg ataaaaatag
4680aggcaagtcc gacaacgtcc cctccgagga ggtcgtgaag aagatgaaaa
actactggag 4740acagctcttg aacgccaagc tcatcaccca gcgtaagttc
gacaacctga ctaaggctga 4800gagaggagga ttgtccgagc tcgataaggc
cggattcatc aagagacagc tcgtcgaaac 4860ccgccaaatt accaagcacg
tggcccaaat tctggattcc cgcatgaaca ccaagtacga 4920tgaaaatgac
aagctgatcc gcgaggtcaa ggtgatcacc ttgaagtcca agctggtctc
4980cgacttccgc aaggacttcc agttctacaa ggtgagggag atcaacaact
accaccacgc 5040acacgacgcc tacctcaacg ctgtcgttgg aaccgccctc
atcaaaaaat atcctaagct 5100ggagtctgag ttcgtctacg gcgactacaa
ggtgtacgac gtgaggaaga tgatcgctaa 5160gtctgagcag gagatcggca
aggccaccgc caagtacttc ttctactcca acatcatgaa 5220cttcttcaag
accgagatca ctctcgccaa cggtgagatc aggaagcgcc cactgatcga
5280gaccaacggt gagactggag agatcgtgtg ggacaaaggg agggatttcg
ctactgtgag 5340gaaggtgctc tccatgcctc aggtgaacat cgtcaagaag
accgaagttc agaccggagg 5400attctccaag gagtccatcc tccccaagag
aaactccgac aagctgatcg ctagaaagaa 5460agactgggac cctaagaagt
acggaggctt cgattctcct accgtggcct actctgtgct 5520ggtcgtggcc
aaggtggaga agggcaagtc caagaagctg aaatccgtca aggagctcct
5580cgggattacc atcatggaga ggagttcctt cgagaagaac cctatcgact
tcctggaggc 5640caagggatat aaagaggtga agaaggacct catcatcaag
ctgcccaagt actccctctt 5700cgagttggag aacggaagga agaggatgct
ggcttctgcc ggagagttgc agaagggaaa 5760tgagctcgcc cttccctcca
agtacgtgaa cttcctgtac ctcgcctctc actatgaaaa 5820gttgaagggc
tctcctgagg acaacgagca gaagcagctc ttcgtggagc agcacaagca
5880ctacctggac gaaattatcg agcagatctc tgagttctcc
aagcgcgtga tattggccga 5940cgccaacctc gacaaggtgc tgtccgccta
caacaagcac agggataagc ccattcgcga 6000gcaggctgaa aacattatcc
acctgtttac cctcacaaac ttgggagccc ctgctgcctt 6060caagtacttc
gacaccacca ttgacaggaa gagatacacc tccaccaagg aggtgctcga
6120cgcaacactc atccaccaat ccatcaccgg cctctatgaa acaaggattg
acttgtccca 6180gctgggaggc gactctagag ccgatcccaa gaagaagaga
aaggtgaaga gaccacggga 6240ccgccacgat ggcgagctgg gaggccgcaa
gcgggcaagg taggttaacc tagacttgtc 6300catcttctgg attggccaac
ttaattaatg tatgaaataa aaggatgcac acatagtgac 6360atgctaatca
ctataatgtg ggcatcaaag ttgtgtgtta tgtgtaatta ctagttatct
6420gaataaaaga gaaagagatc atccatattt cttatcctaa atgaatgtca
cgtgtcttta 6480taattctttg atgaaccaga tgcatttcat taaccaaatc
catatacata taaatattaa 6540tcatatataa ttaatatcaa ttgggttagc
aaaacaaatc tagtctaggt gtgttttgcg 6600aattcgatat caagcttatc
gataccgtcg agggggggcc cggtaccggc gcgccgttct 6660atagtgtcac
ctaaatcgta tgtgtatgat acataaggtt atgtattaat tgtagccgcg
6720ttctaacgac aatatgtcca tatggtgcac tctcagtaca atctgctctg
atgccgcata 6780gttaagccag ccccgacacc cgccaacacc cgctgacgcg
ccctgacggg cttgtctgct 6840cccggcatcc gcttacagac aagctgtgac
cgtctccggg agctgcatgt gtcagaggtt 6900ttcaccgtca tcaccgaaac
gcgcgagacg aaagggcctc gtgatacgcc tatttttata 6960ggttaatgtc
atgaccaaaa tcccttaacg tgagttttcg ttccactgag cgtcagaccc
7020cgtagaaaag atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa
tctgctgctt 7080gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg
ccggatcaag agctaccaac 7140tctttttccg aaggtaactg gcttcagcag
agcgcagata ccaaatactg ttcttctagt 7200gtagccgtag ttaggccacc
acttcaagaa ctctgtagca ccgcctacat acctcgctct 7260gctaatcctg
ttaccagtgg ctgctgccag tggcgataag tcgtgtctta ccgggttgga
7320ctcaagacga tagttaccgg ataaggcgca gcggtcgggc tgaacggggg
gttcgtgcac 7380acagcccagc ttggagcgaa cgacctacac cgaactgaga
tacctacagc gtgagctatg 7440agaaagcgcc acgcttcccg aagggagaaa
ggcggacagg tatccggtaa gcggcagggt 7500cggaacagga gagcgcacga
gggagcttcc agggggaaac gcctggtatc tttatagtcc 7560tgtcgggttt
cgccacctct gacttgagcg tcgatttttg tgatgctcgt caggggggcg
7620gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct
tttgctggcc 7680ttttgctcac atgttctttc ctgcgttatc ccctgattct
gtggataacc gtattaccgc 7740ctttgagtga gctgataccg ctcgccgcag
ccgaacgacc gagcgcagcg agtcagtgag 7800cgaggaagcg gaagagcgcc
caatacgcaa accgcctctc cccgcgcgtt ggccgattca 7860ttaatgcagg
ttgatcagat ctcgatcccg cgaaattaat acgactcact atagggagac
7920cacaacggtt tccctctaga aataattttg tttaacttta agaaggagat
atacccatgg 7980aaaagcctga actcaccgcg acgtctgtcg agaagtttct
gatcgaaaag ttcgacagcg 8040tctccgacct gatgcagctc tcggagggcg
aagaatctcg tgctttcagc ttcgatgtag 8100gagggcgtgg atatgtcctg
cgggtaaata gctgcgccga tggtttctac aaagatcgtt 8160atgtttatcg
gcactttgca tcggccgcgc tcccgattcc ggaagtgctt gacattgggg
8220aattcagcga gagcctgacc tattgcatct cccgccgtgc acagggtgtc
acgttgcaag 8280acctgcctga aaccgaactg cccgctgttc tgcagccggt
cgcggaggct atggatgcga 8340tcgctgcggc cgatcttagc cagacgagcg
ggttcggccc attcggaccg caaggaatcg 8400gtcaatacac tacatggcgt
gatttcatat gcgcgattgc tgatccccat gtgtatcact 8460ggcaaactgt
gatggacgac accgtcagtg cgtccgtcgc gcaggctctc gatgagctga
8520tgctttgggc cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat
ttcggctcca 8580acaatgtcct gacggacaat ggccgcataa cagcggtcat
tgactggagc gaggcgatgt 8640tcggggattc ccaatacgag gtcgccaaca
tcttcttctg gaggccgtgg ttggcttgta 8700tggagcagca gacgcgctac
ttcgagcgga ggcatccgga gcttgcagga tcgccgcggc 8760tccgggcgta
tatgctccgc attggtcttg accaactcta tcagagcttg gttgacggca
8820atttcgatga tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga
tccggagccg 8880ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc
cgtctggacc gatggctgtg 8940tagaagtact cgccgatagt ggaaaccgac
gccccagcac tcgtccgagg gcaaaggaat 9000agtgaggtac agcttggatc
gatccggctg ctaacaaagc ccgaaaggaa gctgagttgg 9060ctgctgccac
cgctgagcaa taactagcat aaccccttgg ggcctctaaa cgggtcttga
9120ggggtttttt gctgaaagga ggaactatat ccggatgctc gggcgcgccg gtac
91745069174DNAArtificial sequenceRTW1202 506ccgggtgtga tttagtataa
agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg
aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt
180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct
gaaacttgag agagatacac 300taatcttgcc ttgttgtttc attccctaac
ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca
caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg gaataaaata ggaaacgttg ttttagagct agaaatagca
480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc
ggtgcttttt 540tttgcggccg caattggatc gggtttactt attttgtggg
tatctatact tttattagat 600ttttaatcag gctcctgatt tctttttatt
tcgattgaat tcctgaactt gtattattca 660gtagatcgaa taaattataa
aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt
780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt
attactttta 840aaaaaatcat aaaggtttag tattttttta aaataaatat
aggaatagtt ttactattca 900ctgctttaat agaaaaatag tttaaaattt
aagatagttt taatcccagc atttgccacg 960tttgaacgtg agccgaaacg
atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg
1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct
caatcccacg 1140cgagtagagc acagtaacct tcaaataagc gaatggggca
taatcagaaa tccgaaataa 1200acctaggggc attatcggaa atgaaaagta
gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt atcactctgt
gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc
1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg
ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt
cgctttgttt ttgtggttca 1500gttttttagg attcttttgg tttttgaatc
gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg aggtgaatct
tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc
1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag
gtttgtttta 1740acaggattaa aagtttttta agcatgttga aggagtcttg
tagatatgta accgtcgata 1800gtttttttgt gggtttgttc acatgttatc
aagcttaatc ttttactatg tatgcgacca 1860tatctggatc cagcaaaggc
gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta
1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta
tttctttgaa 2040ttgattctag tttaagtaat ccatggcacc gaagaagaag
cgcaaggtga tggacaaaaa 2100gtactcaata gggctcgaca tagggactaa
ctccgttgga tgggccgtca tcaccgacga 2160gtacaaggtg ccctccaaga
agttcaaggt gttgggaaac accgacaggc acagcataaa 2220gaagaatttg
atcggtgccc tcctcttcga ctccggagag accgctgagg ctaccaggct
2280caagaggacc gctagaaggc gctacaccag aaggaagaac agaatctgct
acctgcagga 2340gatcttctcc aacgagatgg ccaaggtgga cgactccttc
ttccaccgcc ttgaggaatc 2400attcctggtg gaggaggata aaaagcacga
gagacaccca atcttcggga acatcgtcga 2460cgaggtggcc taccatgaaa
agtaccctac catctaccac ctgaggaaga agctggtcga 2520ctctaccgac
aaggctgact tgcgcttgat ttacctggct ctcgctcaca tgataaagtt
2580ccgcggacac ttcctcattg agggagacct gaacccagac aactccgacg
tggacaagct 2640cttcatccag ctcgttcaga cctacaacca gcttttcgag
gagaacccaa tcaacgccag 2700tggagttgac gccaaggcta tcctctctgc
tcgtctgtca aagtccagga ggcttgagaa 2760cttgattgcc cagctgcctg
gcgaaaagaa gaacggactg ttcggaaact tgatcgctct 2820ctccctggga
ttgactccca acttcaagtc caacttcgac ctcgccgagg acgctaagtt
2880gcagttgtct aaagacacct acgacgatga cctcgacaac ttgctggccc
agataggcga 2940ccaatacgcc gatctcttcc tcgccgctaa gaacttgtcc
gacgcaatcc tgctgtccga 3000catcctgaga gtcaacactg agattaccaa
agctcctctg tctgcttcca tgattaagcg 3060ctacgacgag caccaccaag
atctgaccct gctcaaggcc ctggtgagac agcagctgcc 3120cgagaagtac
aaggagatct ttttcgacca gtccaagaac ggctacgccg gatacattga
3180cggaggcgcc tcccaggaag agttctacaa gttcatcaag cccatccttg
agaagatgga 3240cggtaccgag gagctgttgg tgaagttgaa cagagaggac
ctgttgagga agcagagaac 3300cttcgacaac ggaagcatcc ctcaccaaat
ccacctggga gagctccacg ccatcttgag 3360gaggcaggag gatttctatc
ccttcctgaa ggacaaccgc gagaagattg agaagatctt 3420gaccttcaga
attccttact acgtcgggcc actcgccaga ggaaactcta ggttcgcctg
3480gatgacccgc aaatctgaag agaccattac tccctggaac ttcgaggaag
tcgtggacaa 3540gggcgcttcc gctcagtctt tcatcgagag gatgaccaac
ttcgataaaa atctgcccaa 3600cgagaaggtg ctgcccaagc actccctgtt
gtacgagtat ttcacagtgt acaacgagct 3660caccaaggtg aagtacgtca
cagagggaat gaggaagcct gccttcttgt ccggagagca 3720gaagaaggcc
atcgtcgacc tgctcttcaa gaccaacagg aaggtgactg tcaagcagct
3780gaaggaggac tacttcaaga agatcgagtg cttcgactcc gtcgagatct
ctggtgtcga 3840ggacaggttc aacgcctccc ttgggactta ccacgatctg
ctcaagatta ttaaagacaa 3900ggacttcctg gacaacgagg agaacgagga
catccttgag gacatcgtgc tcaccctgac 3960cttgttcgaa gacagggaaa
tgatcgaaga gaggctcaag acctacgccc acctcttcga 4020cgacaaggtg
atgaaacagc tgaagagacg cagatatacc ggctggggaa ggctctcccg
4080caaattgatc aacgggatca gggacaagca gtcagggaag actatactcg
acttcctgaa 4140gtccgacgga ttcgccaaca ggaacttcat gcagctcatt
cacgacgact ccttgacctt 4200caaggaggac atccagaagg ctcaggtgtc
tggacagggt gactccttgc atgagcacat 4260tgctaacttg gccggctctc
ccgctattaa gaagggcatt ttgcagaccg tgaaggtcgt 4320tgacgagctc
gtgaaggtga tgggacgcca caagccagag aacatcgtta ttgagatggc
4380tcgcgagaac caaactaccc agaaagggca gaagaattcc cgcgagagga
tgaagcgcat 4440tgaggagggc ataaaagagc ttggctctca gatcctcaag
gagcaccccg tcgagaacac 4500tcagctgcag aacgagaagc tgtacctgta
ctacctccaa aacggaaggg acatgtacgt 4560ggaccaggag ctggacatca
acaggttgtc cgactacgac gtcgaccaca tcgtgcctca 4620gtccttcctg
aaggatgact ccatcgacaa taaagtgctg acacgctccg ataaaaatag
4680aggcaagtcc gacaacgtcc cctccgagga ggtcgtgaag aagatgaaaa
actactggag 4740acagctcttg aacgccaagc tcatcaccca gcgtaagttc
gacaacctga ctaaggctga 4800gagaggagga ttgtccgagc tcgataaggc
cggattcatc aagagacagc tcgtcgaaac 4860ccgccaaatt accaagcacg
tggcccaaat tctggattcc cgcatgaaca ccaagtacga 4920tgaaaatgac
aagctgatcc gcgaggtcaa ggtgatcacc ttgaagtcca agctggtctc
4980cgacttccgc aaggacttcc agttctacaa ggtgagggag atcaacaact
accaccacgc 5040acacgacgcc tacctcaacg ctgtcgttgg aaccgccctc
atcaaaaaat atcctaagct 5100ggagtctgag ttcgtctacg gcgactacaa
ggtgtacgac gtgaggaaga tgatcgctaa 5160gtctgagcag gagatcggca
aggccaccgc caagtacttc ttctactcca acatcatgaa 5220cttcttcaag
accgagatca ctctcgccaa cggtgagatc aggaagcgcc cactgatcga
5280gaccaacggt gagactggag agatcgtgtg ggacaaaggg agggatttcg
ctactgtgag 5340gaaggtgctc tccatgcctc aggtgaacat cgtcaagaag
accgaagttc agaccggagg 5400attctccaag gagtccatcc tccccaagag
aaactccgac aagctgatcg ctagaaagaa 5460agactgggac cctaagaagt
acggaggctt cgattctcct accgtggcct actctgtgct 5520ggtcgtggcc
aaggtggaga agggcaagtc caagaagctg aaatccgtca aggagctcct
5580cgggattacc atcatggaga ggagttcctt cgagaagaac cctatcgact
tcctggaggc 5640caagggatat aaagaggtga agaaggacct catcatcaag
ctgcccaagt actccctctt 5700cgagttggag aacggaagga agaggatgct
ggcttctgcc ggagagttgc agaagggaaa 5760tgagctcgcc cttccctcca
agtacgtgaa cttcctgtac ctcgcctctc actatgaaaa 5820gttgaagggc
tctcctgagg acaacgagca gaagcagctc ttcgtggagc agcacaagca
5880ctacctggac gaaattatcg agcagatctc tgagttctcc aagcgcgtga
tattggccga 5940cgccaacctc gacaaggtgc tgtccgccta caacaagcac
agggataagc ccattcgcga 6000gcaggctgaa aacattatcc acctgtttac
cctcacaaac ttgggagccc ctgctgcctt 6060caagtacttc gacaccacca
ttgacaggaa gagatacacc tccaccaagg aggtgctcga 6120cgcaacactc
atccaccaat ccatcaccgg cctctatgaa acaaggattg acttgtccca
6180gctgggaggc gactctagag ccgatcccaa gaagaagaga aaggtgaaga
gaccacggga 6240ccgccacgat ggcgagctgg gaggccgcaa gcgggcaagg
taggttaacc tagacttgtc 6300catcttctgg attggccaac ttaattaatg
tatgaaataa aaggatgcac acatagtgac 6360atgctaatca ctataatgtg
ggcatcaaag ttgtgtgtta tgtgtaatta ctagttatct 6420gaataaaaga
gaaagagatc atccatattt cttatcctaa atgaatgtca cgtgtcttta
6480taattctttg atgaaccaga tgcatttcat taaccaaatc catatacata
taaatattaa 6540tcatatataa ttaatatcaa ttgggttagc aaaacaaatc
tagtctaggt gtgttttgcg 6600aattcgatat caagcttatc gataccgtcg
agggggggcc cggtaccggc gcgccgttct 6660atagtgtcac ctaaatcgta
tgtgtatgat acataaggtt atgtattaat tgtagccgcg 6720ttctaacgac
aatatgtcca tatggtgcac tctcagtaca atctgctctg atgccgcata
6780gttaagccag ccccgacacc cgccaacacc cgctgacgcg ccctgacggg
cttgtctgct 6840cccggcatcc gcttacagac aagctgtgac cgtctccggg
agctgcatgt gtcagaggtt 6900ttcaccgtca tcaccgaaac gcgcgagacg
aaagggcctc gtgatacgcc tatttttata 6960ggttaatgtc atgaccaaaa
tcccttaacg tgagttttcg ttccactgag cgtcagaccc 7020cgtagaaaag
atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt
7080gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag
agctaccaac 7140tctttttccg aaggtaactg gcttcagcag agcgcagata
ccaaatactg ttcttctagt 7200gtagccgtag ttaggccacc acttcaagaa
ctctgtagca ccgcctacat acctcgctct 7260gctaatcctg ttaccagtgg
ctgctgccag tggcgataag tcgtgtctta ccgggttgga 7320ctcaagacga
tagttaccgg ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac
7380acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc
gtgagctatg 7440agaaagcgcc acgcttcccg aagggagaaa ggcggacagg
tatccggtaa gcggcagggt 7500cggaacagga gagcgcacga gggagcttcc
agggggaaac gcctggtatc tttatagtcc 7560tgtcgggttt cgccacctct
gacttgagcg tcgatttttg tgatgctcgt caggggggcg 7620gagcctatgg
aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct tttgctggcc
7680ttttgctcac atgttctttc ctgcgttatc ccctgattct gtggataacc
gtattaccgc 7740ctttgagtga gctgataccg ctcgccgcag ccgaacgacc
gagcgcagcg agtcagtgag 7800cgaggaagcg gaagagcgcc caatacgcaa
accgcctctc cccgcgcgtt ggccgattca 7860ttaatgcagg ttgatcagat
ctcgatcccg cgaaattaat acgactcact atagggagac 7920cacaacggtt
tccctctaga aataattttg tttaacttta agaaggagat atacccatgg
7980aaaagcctga actcaccgcg acgtctgtcg agaagtttct gatcgaaaag
ttcgacagcg 8040tctccgacct gatgcagctc tcggagggcg aagaatctcg
tgctttcagc ttcgatgtag 8100gagggcgtgg atatgtcctg cgggtaaata
gctgcgccga tggtttctac aaagatcgtt 8160atgtttatcg gcactttgca
tcggccgcgc tcccgattcc ggaagtgctt gacattgggg 8220aattcagcga
gagcctgacc tattgcatct cccgccgtgc acagggtgtc acgttgcaag
8280acctgcctga aaccgaactg cccgctgttc tgcagccggt cgcggaggct
atggatgcga 8340tcgctgcggc cgatcttagc cagacgagcg ggttcggccc
attcggaccg caaggaatcg 8400gtcaatacac tacatggcgt gatttcatat
gcgcgattgc tgatccccat gtgtatcact 8460ggcaaactgt gatggacgac
accgtcagtg cgtccgtcgc gcaggctctc gatgagctga 8520tgctttgggc
cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat ttcggctcca
8580acaatgtcct gacggacaat ggccgcataa cagcggtcat tgactggagc
gaggcgatgt 8640tcggggattc ccaatacgag gtcgccaaca tcttcttctg
gaggccgtgg ttggcttgta 8700tggagcagca gacgcgctac ttcgagcgga
ggcatccgga gcttgcagga tcgccgcggc 8760tccgggcgta tatgctccgc
attggtcttg accaactcta tcagagcttg gttgacggca 8820atttcgatga
tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga tccggagccg
8880ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc
gatggctgtg 8940tagaagtact cgccgatagt ggaaaccgac gccccagcac
tcgtccgagg gcaaaggaat 9000agtgaggtac agcttggatc gatccggctg
ctaacaaagc ccgaaaggaa gctgagttgg 9060ctgctgccac cgctgagcaa
taactagcat aaccccttgg ggcctctaaa cgggtcttga 9120ggggtttttt
gctgaaagga ggaactatat ccggatgctc gggcgcgccg gtac
91745076113DNAArtificial sequenceRTW1192A 507caagtagttc tagtcttaat
acaaatgtca aatggcacaa gtgagatttt gaatttctga 60tgttgtaaaa atctcaggac
atgaatacta ttgggaagca attattcata cttcaccaat 120ccaaactgac
ccaaaattct caaatcacat gaaagcaaaa atgcatataa cacgaagaat
180aagaagaaga ggaactaacc tggggtttcg atgattaaag cgttgttgtt
gatgatgaaa 240acgatgatta tggagagaaa ttgttgttga atggtgaaat
tgttatagaa agagaacgaa 300gatagagaaa aagatatata gatttttcaa
ggctcaaacc ctaaaatcac catgagagag 360aacaaagatt gagaaaccta
caaccactat gagagagaat gagcagaaca gaagcgtgag 420atagagaacg
agagttaagg tgcgagagga cacgaagaac aaaaggtgtg agagaaagaa
480caaaggagcc tacggtgtga gatgagagaa tttgaaattc ttaccattta
ggtggaattt 540caattctaca attttattct attaaaatta ttttaaaaaa
tgatgtcatt ttaaattctt 600taaaatctca tatccaacaa ctgaattatg
atagaggtat ttcaaattca cttaaaaaaa 660ttatcttatt taaatacccc
atccaaacat agcgaaatgt tcatgagaag gatcaagtgg 720tttggaaaca
tagtactaat ggtgtttata cagttcatgg aatccttgat agataattta
780aaggttgctg gaaattggat gaaggtgtgg agattaaata ttcttccaaa
aataaagcgt 840tttatttgga gagtgttgtg tggttgtctc ccctgtaggc
aaaagcttcg atgtaaagga 900gttcaatgtc caataaccta tgctttctat
ccctcgatta ttgaaaatga atgacacatt 960ttatttggtt gaaatcaaga
aataagcatg tggcaagcaa cgggtatttg acaattcata 1020gaacaaaagg
tgaatgcagc aaaagaatta atgaactcct tttcgatcta cttggatcac
1080tacatggaga tattatcaac aaatttgatg ttactttatg gagcagttgg
aattcttgga 1140atgacaagat atgaaatgaa cataccaacc ctcctcttgt
ttctgtttcg gtttctatgc 1200agtattttgt tgaatggcaa agtgcaaggt
aatatgctcc tcaacatcaa ttaacaaatg 1260ttcatgacat ctcttaccag
ctccaacttg gggacgtttg acaaacacca ccgtcaagtt 1320tccttaaatg
caacattaat gttgctcatt tcaaggagga gaatagtttt ggtgtcggca
1380tgatactcca tcaaggaaga ttcgtcaaag ctcactcacg ttttcgacat
gggtcgacat 1440gggttacctg acccaaaggc tgaggcttag gcttgggttt
gcttcaagta ttgatctggg 1500cccagactat tggtttacat aatatcattt
ttgaaaacct aacatctaaa actcaaggtt 1560gtttagaggt gcgccattcc
aaaataagat tatcctattt gtgcatgaat gcgaccaact 1620atctcctgtt
tcagcattat aaagtataaa caacaaactt ctttaatcaa gggactaaaa
1680gatattggac atacaagcta aaagtgatag aatttgagaa aacaaatatt
gacaacaata 1740ttcaagagga cactaaaaca taattctcaa attttttttg
tttatttaaa ataaagtggt 1800tcattaggta gctccgggtg attgcggtta
catcatgtac ggaaaaataa ttctaatcct 1860tgatttaaat ttgaacttga
ctatttattt attctttatt tcattttgta aatcatttta 1920tgtatctcct
ggcaagcaat tttatccacc ttgcaccaac accttcgggt tccataatca
1980aaccacctta acttcacacc atgctgtaac tcacaccgcc cagcatctcc
aatgtgaaag 2040aagctaaaat ttaataaaca atcatacgaa gcagtgacaa
aataccagat ggtattaatg 2100cttcgataaa attaattgga aagtataaaa
tggtagaaaa taataaatta taattaattt 2160aagtaagata aaaaataatt
aaaaactaaa atgttaaaat tttaaaaaaa ttattttaaa 2220taatatttaa
aaacattaaa aatcatttta aaaaatttat ttatagaaca attaaataaa
2280tatttcagct aataaaaaac aaaagcttac ctagccttag aagacaactt
gtccaacaat 2340tagatgatac ccattgccct tacgttttct ttaacatcaa
ttattgtttt tgtcaacaag 2400ctatctttta gttttatttt attggtaaaa
aatatgtcgc cttcaagttg catcatttaa 2460cacatctcgt cattagaaaa
ataaaactct tccctaaacg attagtagaa aaaatcattc 2520gataataaat
aagaaagaaa aattagaaaa aaataacttc attttaaaaa aatcattaag
2580gctatatttt ttaaatgact aattttatat agactgtaac taaaagtata
caatttatta 2640tgctatgtat cttaaagaat tacttataaa aatctacgga
agaatatctt acaaagtgaa 2700aaacaaatga gaaagaattt agtgggatga
ttatgatttt atttgaaaat tgaaaaaata 2760attattaaag actttagtgg
agtaagaaag ctttcctatt agtcttttct tatccataaa 2820aaaaaaaaaa
aaaatctagc gtgacagctt ttccatagat tttaataatg taaaatactg
2880gtagcagccg accgttcagg taatggacac tgtggtccta acttgcaacg
ggtgcgggcc 2940caatttaata acgccgtggt aacggataaa gccaagcgtg
aagcggtgaa ggtacatctc 3000tgactccgtc aagattacga aaccgtcaac
tacgaaggac tccccgaaat atcatctgtg 3060tcataaacac caagtcacac
catacatggg cacgcgtcac aatatgattg gagaacggtt 3120ccaccgcata
tgctataaaa tgcccccaca cccctcgacc ctaatcgcac ttcaattgca
3180atcaaattag ttcattctct ttgcgcagtt ccctacctct cctttcaagg
ttcgtagatt 3240tcttccgttt ttttttcttc ttctttattg tttgttctac
atcagcatga tgttgatttg 3300attgtgtttt ctatcgtttc atcgattata
aattttcata atcagaagat tcagctttta 3360ttaatgcaag aacgtcctta
attgatgatt ttataaccgt aaattaggtc taattagagt 3420ttttttcata
aagattttca gatccgttta caacaagcct taattgttga ttctgtagtc
3480gtagattaag gtttttttca tgaactactt cagatccgtt aaacaacagc
cttatttgtt 3540gatacttcag tcgtttttca agaaattgtt cagatccgtt
gataaaagcc ttattcgttg 3600attctgtatg gtatttcaag agatattgct
caggtccttt agcaactacc ttatttgttg 3660attctgtggc catagattag
gatttttttt cacgaaattg cttcttgaaa ttacgtgatg 3720gattttgatt
ctgatttatc ttgtgattgt tgactctaca gatggcccaa gtgagcagag
3780tgcacaatct tgctcaaagc actcaaattt ttggccattc ttccaactcc
aacaaactca 3840aatcggtgaa ttcggtttca ttgaggccac gcctttgggg
ggcctcaaaa tctcgcatcc 3900cgatgcataa aaatggaagc tttatgggaa
attttaatgt ggggaaggga aattccggcg 3960tgtttaaggt ttctgcatcg
gtcgccgccg cagagaagcc gtcaacgtcg ccggagatcg 4020tgttggaacc
catcaaagac ttctcgggta ccatcacatt gccagggtcc aagtctctgt
4080ccaatcgaat tttgcttctt gctgctctct ctgaggtgaa gtttatttat
ttatttattt 4140gtttgtttgt tgttgggtgt gggaatagga gtttgatgtg
tagagtggat tttgaatatt 4200tgattttttt ttgtattatt ctgtgaaaat
gaagcatcat gtcccatgaa agaaatggac 4260acgaaattaa gtggcttatg
atgtgaaatg aggatagaaa tgtgtgtagg gttttttaat 4320gggtagcaat
aagcatattc aatatctgga ttgatttgga cgtttctgta taaaggagta
4380tgctagcaat gtgttaatgt atggcttgct aaaatactcc taaaaatcaa
gtgggagtag 4440tatacatatc tacagcaaat gtattaggtg aggcatttgg
cttctctatt gtaaggaaca 4500aataatatca gttaatgtga aaatcaatgg
ttgatattcc aatacattca tgatgtgtta 4560tttatatgta cctaatattg
actgttgttt ttctccgcaa tgaccaagat tatttatttt 4620atcctctaaa
gtgactaatt gagttgctta ctttagagaa gttggaccca ttaggtgaga
4680gcgtgggggg aactaatctt gaatatacaa tctgagtctt gattatccaa
gtatggttgt 4740atgaacaatg ttagctctag aagataaacc ctcccccaaa
acacaaatta gaatgacatt 4800tcaagttcca tgtatgtcac tttcattcta
ttatttttac aacttttagt tacttaacag 4860atgtcttgtt cagcataaat
tataatttat tctgtttttt tttagggaac aactgttgta 4920gacaacttgt
tgtatagtga ggatattcat tacatgcttg gtgcattaag gacccttgga
4980ctgcgtgtgg aagatgacaa aacaaccaaa caagcaattg ttgaaggctg
tgggggattg 5040tttcccacta gtaaggaatc taaagatgaa atcaatttat
tccttggaaa tgctggtatt 5100gcaatgagat ctttgacagc agctgttgtt
gctgcaggtg gaaatgcaag gtctgttttt 5160tttttttttg ttcagcataa
tctttgaatt gttcctcgta taactaatca caacagagta 5220cgtgttcttc
ttcctgttat aatctaaaaa tctcatccag attagtcatc ctttcttctt
5280aaaaggaacc tttaattatc aatgtattta tttaatattt aaattagctt
gtcaaagtct 5340agcatataca tattttgatt atattctgag aaatgcacct
gagggtgttc ctcatgatct 5400acttcaacct ctgttattat tagattttct
atcatgatta ctggtttgag tctctaagta 5460gaccatcttg atgttcaaaa
tatttcagct acgtacttga tggggtgccc cgaatgagag 5520agaggccaat
tggggatttg gttgctggtc ttaagcaact tggtgcagat gttgattgct
5580ttcttggcac aaactgtcca cctgttcgtg taaatgggaa gggaggactt
cctggcggaa 5640aggtatggtt tggatttcat ttagaataag gtggagtaac
tttcctggat caaaattcta 5700atttaagaag cctccctgtt ttcctctctt
tagaataaga ctaagggtag gtttaggagt 5760tgggttttgg agagaaatgg
aagggagagc aatttttttc ttcttctaat aaatattctt 5820taatttgata
cattttttaa gtaaaagaat ataaagatag attagcataa cttaatgttt
5880taatctttta tttattttta taaatattat atacctgtct atttaaaaat
caaatatttg 5940tcctccattc cctttccctt caaaacctca gttccaaata
taccgtagtt gaattatatt 6000ttggaaggcc tattggttgg agacttttcc
ttttcagaga ttatccctca cctttattat 6060agcctttcta tttttaaact
tcatatagac gccattcttg gggcggccgc gat 611350832DNAArtificial
sequenceprimer, soy1-F4 508tcaataatac tactctctta gacaccaaac aa
3250923DNAArtificial sequenceprimer, soy1-R4 509caaggaaaat
gaatgatggc ttt 2351018DNAArtificial sequenceprobe, soy1-T3(FAM-MGB)
510ccttcccaaa ctataatc 1851127DNAArtificial sequenceWOL1005,
Forward_primer 511aaatgttatc agaggaacat gagctgc
2751228DNAArtificial sequenceWOL1006, Reverse_primer 512attatttttc
cgtacatgat gtaaccgc 28513438DNACauliflower mosaic virus
513cccatggagt caaagattca aatagaggac ctaacagaac tcgccgtaaa
gactggcgaa 60cagttcatac agagtctctt acgactcaat gacaagaaga aaatcttcgt
caacatggtg 120gagcacgaca cgcttgtcta ctccaaaaat atcaaagata
cagtctcaga agaccaaagg 180gcaattgaga cttttcaaca aagggtaata
tccggaaacc tcctcggatt ccattgccca 240gctatctgtc actttattgt
gaagatagtg gaaaaggaag gtggctccta caaatgccat 300cattgcgata
aaggaaaggc catcgttgaa gatgcctctg ccgacagtgg tcccaaagat
360ggacccccac ccacgaggag catcgtggaa aaagaagacg ttccaaccac
gtcttcaaag 420caagtggatt gatgtgat 43851419DNACauliflower mosaic
virus 514gtctcagaag accaaaggg 1951525DNACauliflower mosaic virus
515tgccatcatt gcgataaagg aaagg 2551620DNACauliflower mosaic virus
516gatgcctctg ccgacagtgg 205173708DNAZea mays 517ctgcagccca
tcaaggagat ctccggcacc gtcaagctgc cggggtccaa gtcgctttcc 60aacaggatcc
tcctgctcgc cgccctgtcc gaggtgagcg attttggtgc ttgctgcgct
120gccctgtctc actgctacct aaatgttttg cctgtcgaat accatggatt
ctcggtgtaa 180tccatctcac gatcagatgc accgcatgtc gcatgcctag
ctctctctaa tttgtctagt 240agtttgtata cggattaaga ttgataaatc
ggtaccgcaa aagctaggtg taaataaaca 300ctacaaaatt ggatgttccc
ctatcggcct gtactcggct actcgttctt gtgatggcat 360gttatttctt
cttggtgttt ggtgaactcc cttatgaaat ttgggcgcaa agaaatcgcc
420ctcaagggtt gatcttatgc catcgtcatg ataaacagtg aagcacggat
gatcctttac 480gttgttttta acaaactttg tcagaaaact agcaatgtta
acttcttaat gatgatttca 540caacaaaaaa ggtaaccttg ctactaacat
aacaaaagac ttgttgctta ttaattatat 600gtttttttaa tctttgatca
ggggacaaca gtggttgata acctgttgaa cagtgaggat 660gtccactaca
tgctcggggc cttgaggact cttggtctct ctgtcgaagc ggacaaagct
720gccaaaagag ctgtagttgt tggctgtggt ggaaagttcc cagttgagga
tgctagagag 780gaagtgcagc tcttcttggg gaatgctgga atcgcaatgc
ggtcattgac agcagctgtt 840actgctgctg gtggaaatgc aacgtatgtt
tcctctctct ctctacaata cttgttggag 900ttagtatgaa acccatgtgt
atgtctagtg gcttatggtg tattggtttt tgaacttcag 960ttacgtgctt
gatggagtac caagaatgag ggagagaccc attggcgact tggttgtcgg
1020attgaagcag cttggtgcag atgttgattg tttccttggc actgactgcc
cacctgttcg 1080tgtcaatgga atcggagggc tacctggtgg caaggttagt
tactaagggc cacatgttac 1140attcttctgt aaatggtaca actattgtcg
agcttttgca tttgtaagga aaacattgat 1200tgatctgaat ttgatgctac
accacaaaat atctacaaat ggtcatccct aactagcaaa 1260ccatgtctcc
attaagctca atgaagtaat acttggcatg tgtttatcaa cttaatttcc
1320atcttctggg gtattgcctg ttttctagtc taatagcatt tgtttttaga
attagctctt 1380acaactgtta tgttctacag gtcaagctgt ctggctccat
cagcagtcag tacttgagtg 1440ccttgctgat ggctgctcct ttggctcttg
gggatgtgga gattgaaatc attgataaat 1500taatctccat tccctacgtc
gaaatgacat tgagattgat ggagcgtttt ggtgtgaaag 1560cagagcattc
tgatagctgg gacagattct acattaaggg aggtcaaaaa tacaagtaag
1620ctctgtaatg tatttcacta ctttgatgcc aatgtttcag ttttcagttt
tccaaacagt 1680cgcatcaata tttgaataga tgcactgtag aaaaaaatca
ttgcagggaa aaactagtac 1740tgagtatttt gactgtaaat tatttaacca
gtcggaatat agtcagtcta ttggagtcaa 1800gagcgtgaac cgaaatagcc
agttaattat cccattatac agaggacaac catgtatact 1860attgaaactt
ggtttaagag aatctaggta gctggactcg tagctgcttg gcatggatac
1920cttcttatct ttaggaaaag acacttgatt ttttttctgt ggccctctat
gatgtgtgaa 1980cctgcttctc tattgcttta gaaggatata tctatgtcgt
tatgcaacat gcttccctta 2040gtcatttgta ctgaaatcag tttcataagt
tcgttagtgg ttccctaaac gaaaccttgt 2100ttttctttgc aatcaacagg
tcccctaaaa atgcctatgt tgaaggtgat gcctcaagcg 2160caagctattt
cttggctggt gctgcaatta ctggagggac tgtgactgtg gaaggttgtg
2220gcaccaccag tttgcaggta aagatttctt ggctggtgct acgataactg
cttttgtctt 2280tttggtttca gcattgttct cagagtcact aaataacatt
atcatctgca aacgtcaaat 2340agacatactt aggtgaatgg atattcatgt
aaccgtttcc ttacaaattt gctgaaacct 2400cagggtgatg tgaagtttgc
tgaggtactg gagatgatgg gagcgaaggt tacatggacc 2460gagactagcg
taactgttac tggcccaccg cgggagccat ttgggaggaa acacctcaag
2520gcgattgatg tcaacatgaa caagatgcct gatgtcgcca tgactcttgc
tgtggttgcc 2580ctctttgccg atggcccgac agccatcaga gacggtaaaa
cattctcagc cctacaacca 2640tgcctcttct acatcactac ttgacaagac
taaaaactat tggctcgttg gcagtggctt 2700cctggagagt aaaggagacc
gagaggatgg ttgcgatccg gacggagcta accaaggtaa 2760ggctacatac
ttcacatgtc tcacgtcgtc tttccatagc tcgctgcctc ttagcggctt
2820gcctgcggtc gctccatcct cggttgctgt ctgtgttttc cacagctggg
agcatctgtt 2880gaggaagggc cggactactg catcatcacg ccgccggaga
agctgaacgt gacggcgatc 2940gacacgtacg acgaccacag gatggccatg
gccttctccc ttgccgcctg tgccgaggtc 3000cccgtgacca tccgggaccc
tgggtgcacc cggaagacct tccccgacta cttcgatgtg 3060ctgagcactt
tcgtcaagaa ttaataaagc gtgcgatact accacgcagc ttgattgaag
3120tgataggctt gtgctgagga aatacatttc ttttgttctg ttttttctct
ttcacgggat 3180taagttttga gtctgtaacg ttagttgttt gtagcaagtt
tctatttcgg atcttaagtt 3240tgtgcactgt aagccaaatt tcatttcaag
agtggttcgt tggaataata agaataataa 3300attacgtttc agtggctgtc
aagcctgctg ctacgtttta ggagatggca ttagacattc 3360atcatcaaca
acaataaaac cttttagcct caaacaataa tagtgaagtt attttttagt
3420cctaaacaag ttgcattagg atatagttaa aacacaaaag aagctaaagt
tagggtttag 3480acatgtggat attgttttcc atgtatagta tgttctttct
ttgagtctca tttaactacc 3540tctacacata ccaactttag ttttttttct
acctcttcat gttactatgg tgccttctta 3600tcccactgag cattggtata
tttagaggtt tttgttgaac atgcctaaat catctcaatc 3660aacgatggac
aatcttttct tcgattgagc tgaggtacgt catctaga 37085183714DNAZea mays
518ctgcagccca tcaaggagat ctccggcacc gtcaagctgc cggggtccaa
gtcgctttcc 60aacaggatcc tcctgctcgc cgccctgtcc gaggtgagcg attttggtgc
ttgctgcgct 120gccctgtctc actgctacct aaatgttttg cctgtcgaat
accatggatt ctcggtgtaa 180tccatatctg cacgatcaga tatgcaccgc
atgtcgcata tctgagctct ctctaatttg 240tctagtagtt tgtatacgga
ttaagattga taaatcggta ccgcaaaagc taggtgtaaa 300taaacactac
aaaattggat gttcccctat cggcctgtac tcggctactc gttcttgtga
360tggcatgtta tttcttcttg gtgtttggtg aactccctta tgaaatttgg
gcgcaaagaa 420atcgccctca agggttgatc ttatgccatc gtcatgataa
acagtgaagc acggatgatc 480ctttacgttg tttttaacaa actttgtcag
aaaactagca atgttaactt cttaatgatg 540atttcacaac aaaaaaggta
accttgctac taacataaca aaagacttgt tgcttattaa 600ttatatgttt
ttttaatctt tgatcagggg acaacagtgg ttgataacct gttgaacagt
660gaggatgtcc actacatgct cggggccttg aggactcttg gtctctctgt
cgaagcggac 720aaagctgcca aaagagctgt agttgttggc tgtggtggaa
agttcccagt tgaggatgct 780aaagaggaag tgcagctctt cttggggaat
gctggaatcg caatgcggtc attgacagca 840gctgttactg ctgctggtgg
aaatgcaacg tatgtttcct ctctctctct acaatacttg 900ttggagttag
tatgaaaccc atgtgtatgt ctagtggctt atggtgtatt ggtttttgaa
960cttcagttac gtgcttgatg gagtaccaag aatgagggag agacccattg
gcgacttggt 1020tgtcggattg aagcagcttg gtgcagatgt tgattgtttc
cttggcactg actgcccacc 1080tgttcgtgtc aatggaatcg gagggctacc
tggtggcaag gttagttact aagggccaca 1140tgttacattc ttctgtaaat
ggtacaacta ttgtcgagct tttgcatttg taaggaaaac 1200attgattgat
ctgaatttga tgctacacca caaaatatct acaaatggtc atccctaact
1260agcaaaccat gtctccatta agctcaatga agtaatactt ggcatgtgtt
tatcaactta 1320atttccatct tctggggtat tgcctgtttt ctagtctaat
agcatttgtt tttagaatta 1380gctcttacaa ctgttatgtt ctacaggtca
agctgtctgg ctccatcagc agtcagtact 1440tgagtgcctt gctgatggct
gctcctttgg ctcttgggga tgtggagatt gaaatcattg 1500ataaattaat
ctccattccc tacgtcgaaa tgacattgag attgatggag cgttttggtg
1560tgaaagcaga gcattctgat agctgggaca gattctacat taagggaggt
caaaaataca 1620agtaagctct gtaatgtatt tcactacttt gatgccaatg
tttcagtttt cagttttcca 1680aacagtcgca tcaatatttg aatagatgca
ctgtagaaaa aaatcattgc agggaaaaac 1740tagtactgag tattttgact
gtaaattatt taaccagtcg gaatatagtc agtctattgg 1800agtcaagagc
gtgaaccgaa atagccagtt aattatccca ttatacagag gacaaccatg
1860tatactattg aaacttggtt taagagaatc taggtagctg gactcgtagc
tgcttggcat 1920ggataccttc ttatctttag gaaaagacac ttgatttttt
ttctgtggcc ctctatgatg 1980tgtgaacctg cttctctatt gctttagaag
gatatatcta tgtcgttatg caacatgctt 2040cccttagtca tttgtactga
aatcagtttc ataagttcgt tagtggttcc ctaaacgaaa 2100ccttgttttt
ctttgcaatc aacaggtccc ctaaaaatgc ctatgttgaa ggtgatgcct
2160caagcgcaag ctatttcttg gctggtgctg caattactgg agggactgtg
actgtggaag 2220gttgtggcac caccagtttg caggtaaaga tttcttggct
ggtgctacga taactgcttt 2280tgtctttttg gtttcagcat tgttctcaga
gtcactaaat aacattatca tctgcaaacg 2340tcaaatagac atacttaggt
gaatggatat tcatgtaacc gtttccttac aaatttgctg 2400aaacctcagg
gtgatgtgaa gtttgctgag gtactggaga tgatgggagc gaaggttaca
2460tggaccgaga ctagcgtaac tgttactggc ccaccgcggg agccatttgg
gaggaaacac 2520ctcaaggcga ttgatgtcaa catgaacaag atgcctgatg
tcgccatgac tcttgctgtg 2580gttgccctct ttgccgatgg cccgacagcc
atcagagacg gtaaaacatt ctcagcccta 2640caaccatgcc tcttctacat
cactacttga caagactaaa aactattggc tcgttggcag 2700tggcttcctg
gagagtaaag gagaccgaga ggatggttgc gatccggacg gagctaacca
2760aggtaaggct acatacttca catgtctcac gtcgtctttc catagctcgc
tgcctcttag 2820cggcttgcct gcggtcgctc catcctcggt tgctgtctgt
gttttccaca gctgggagca 2880tctgttgagg aagggccgga ctactgcatc
atcacgccgc cggagaagct gaacgtgacg 2940gcgatcgaca cgtacgacga
ccacaggatg gccatggcct tctcccttgc cgcctgtgcc 3000gaggtccccg
tgaccatccg ggaccctggg tgcacccgga agaccttccc cgactacttc
3060gatgtgctga gcactttcgt caagaattaa taaagcgtgc gatactacca
cgcagcttga 3120ttgaagtgat aggcttgtgc tgaggaaata catttctttt
gttctgtttt ttctctttca 3180cgggattaag ttttgagtct gtaacgttag
ttgtttgtag caagtttcta tttcggatct 3240taagtttgtg cactgtaagc
caaatttcat ttcaagagtg gttcgttgga ataataagaa 3300taataaatta
cgtttcagtg gctgtcaagc ctgctgctac gttttaggag atggcattag
3360acattcatca tcaacaacaa taaaaccttt tagcctcaaa caataatagt
gaagttattt 3420tttagtccta aacaagttgc attaggatat agttaaaaca
caaaagaagc taaagttagg 3480gtttagacat gtggatattg ttttccatgt
atagtatgtt ctttctttga gtctcattta 3540actacctcta cacataccaa
ctttagtttt ttttctacct cttcatgtta ctatggtgcc 3600ttcttatccc
actgagcatt ggtatattta gaggtttttg ttgaacatgc ctaaatcatc
3660tcaatcaacg atggacaatc ttttcttcga ttgagctgag gtacgtcatc taga
37145193708DNAZea mays 519ctgcagccca tcaaggagat ctccggcacc
gtcaagctgc cggggtccaa gtcgctttcc 60aacaggatcc tcctgctcgc cgccctgtcc
gaggtgagcg attttggtgc ttgctgcgct 120gccctgtctc actgctacct
aaatgttttg cctgtcgaat accatggatt ctcggtgtaa 180tccatctcac
gatcagatgc accgcatgtc gcatgcctag ctctctctaa tttgtctagt
240agtttgtata cggattaaga ttgataaatc ggtaccgcaa aagctaggtg
taaataaaca 300ctacaaaatt ggatgttccc ctatcggcct gtactcggct
actcgttctt gtgatggcat 360gttatttctt cttggtgttt ggtgaactcc
cttatgaaat ttgggcgcaa agaaatcgcc 420ctcaagggtt gatcttatgc
catcgtcatg ataaacagtg aagcacggat gatcctttac 480gttgttttta
acaaactttg tcagaaaact agcaatgtta acttcttaat gatgatttca
540caacaaaaaa ggtaaccttg ctactaacat aacaaaagac ttgttgctta
ttaattatat 600gtttttttaa tctttgatca ggggacaaca gtggttgata
acctgttgaa cagtgaggat 660gtccactaca tgctcggggc cttgaggact
cttggtctct ctgtcgaagc ggacaaagct 720gccaaaagag ctgtagttgt
tggctgtggt ggaaagttcc cagttgagga tgctagaaag 780gaagtgcagc
tcttcttggg gaatgctgga atcgcaatgc ggtcattgac agcagctgtt
840actgctgctg gtggaaatgc aacgtatgtt tcctctctct ctctacaata
cttgttggag 900ttagtatgaa acccatgtgt atgtctagtg gcttatggtg
tattggtttt tgaacttcag 960gtacgtgctt gatggagtac caagaatgag
ggagagaccc attggcgact tggttgtcgg 1020attgaagcag cttggtgcag
atgttgattg tttccttggc actgactgcc cacctgttcg 1080tgtcaatgga
atcggagggc tacctggtgg caaggttagt tactaagggc cacatgttac
1140attcttctgt aaatggtaca actattgtcg agcttttgca tttgtaagga
aaacattgat 1200tgatctgaat ttgatgctac accacaaaat atctacaaat
ggtcatccct aactagcaaa 1260ccatgtctcc attaagctca atgaagtaat
acttggcatg tgtttatcaa cttaatttcc 1320atcttctggg gtattgcctg
ttttctagtc taatagcatt tgtttttaga attagctctt 1380acaactgtta
tgttctacag gtcaagctgt ctggctccat cagcagtcag tacttgagtg
1440ccttgctgat ggctgctcct ttggctcttg gggatgtgga gattgaaatc
attgataaat 1500taatctccat tccctacgtc gaaatgacat tgagattgat
ggagcgtttt ggtgtgaaag 1560cagagcattc tgatagctgg gacagattct
acattaaggg aggtcaaaaa tacaagtaag 1620ctctgtaatg tatttcacta
ctttgatgcc aatgtttcag ttttcagttt tccaaacagt 1680cgcatcaata
tttgaataga tgcactgtag aaaaaaatca ttgcagggaa aaactagtac
1740tgagtatttt gactgtaaat tatttaacca gtcggaatat agtcagtcta
ttggagtcaa 1800gagcgtgaac cgaaatagcc agttaattat cccattatac
agaggacaac catgtatact 1860attgaaactt ggtttaagag aatctaggta
gctggactcg tagctgcttg gcatggatac 1920cttcttatct ttaggaaaag
acacttgatt ttttttctgt ggccctctat gatgtgtgaa 1980cctgcttctc
tattgcttta gaaggatata tctatgtcgt tatgcaacat gcttccctta
2040gtcatttgta ctgaaatcag tttcataagt tcgttagtgg ttccctaaac
gaaaccttgt 2100ttttctttgc aatcaacagg tcccctaaaa atgcctatgt
tgaaggtgat gcctcaagcg 2160caagctattt cttggctggt gctgcaatta
ctggagggac tgtgactgtg gaaggttgtg 2220gcaccaccag tttgcaggta
aagatttctt ggctggtgct acgataactg cttttgtctt 2280tttggtttca
gcattgttct cagagtcact aaataacatt atcatctgca aacgtcaaat
2340agacatactt aggtgaatgg atattcatgt aaccgtttcc ttacaaattt
gctgaaacct 2400cagggtgatg tgaagtttgc tgaggtactg gagatgatgg
gagcgaaggt tacatggacc 2460gagactagcg taactgttac tggcccaccg
cgggagccat ttgggaggaa acacctcaag 2520gcgattgatg tcaacatgaa
caagatgcct gatgtcgcca tgactcttgc tgtggttgcc
2580ctctttgccg atggcccgac agccatcaga gacggtaaaa cattctcagc
cctacaacca 2640tgcctcttct acatcactac ttgacaagac taaaaactat
tggctcgttg gcagtggctt 2700cctggagagt aaaggagacc gagaggatgg
ttgcgatccg gacggagcta accaaggtaa 2760ggctacatac ttcacatgtc
tcacgtcgtc tttccatagc tcgctgcctc ttagcggctt 2820gcctgcggtc
gctccatcct cggttgctgt ctgtgttttc cacagctggg agcatctgtt
2880gaggaagggc cggactactg catcatcacg ccgccggaga agctgaacgt
gacggcgatc 2940gacacgtacg acgaccacag gatggccatg gccttctccc
ttgccgcctg tgccgaggtc 3000cccgtgacca tccgggaccc tgggtgcacc
cggaagacct tccccgacta cttcgatgtg 3060ctgagcactt tcgtcaagaa
ttaataaagc gtgcgatact accacgcagc ttgattgaag 3120tgataggctt
gtgctgagga aatacatttc ttttgttctg ttttttctct ttcacgggat
3180taagttttga gtctgtaacg ttagttgttt gtagcaagtt tctatttcgg
atcttaagtt 3240tgtgcactgt aagccaaatt tcatttcaag agtggttcgt
tggaataata agaataataa 3300attacgtttc agtggctgtc aagcctgctg
ctacgtttta ggagatggca ttagacattc 3360atcatcaaca acaataaaac
cttttagcct caaacaataa tagtgaagtt attttttagt 3420cctaaacaag
ttgcattagg atatagttaa aacacaaaag aagctaaagt tagggtttag
3480acatgtggat attgttttcc atgtatagta tgttctttct ttgagtctca
tttaactacc 3540tctacacata ccaactttag ttttttttct acctcttcat
gttactatgg tgccttctta 3600tcccactgag cattggtata tttagaggtt
tttgttgaac atgcctaaat catctcaatc 3660aacgatggac aatcttttct
tcgattgagc tgaggtacgt catctaga 3708520464PRTZea mays 520Met Gln Leu
Asp Leu Asn Val Ala Glu Ala Pro Pro Pro Val Glu Met 1 5 10 15 Glu
Ala Ser Asp Ser Gly Ser Ser Val Leu Asn Ala Ser Glu Ala Ala 20 25
30 Ser Ala Gly Gly Ala Pro Ala Pro Ala Glu Glu Gly Ser Ser Ser Thr
35 40 45 Pro Ala Val Leu Glu Phe Ser Ile Leu Ile Arg Ser Asp Ser
Asp Ala 50 55 60 Ala Gly Ala Asp Glu Asp Glu Asp Ala Thr Pro Ser
Pro Pro Pro Arg 65 70 75 80 His Arg His Gln His Gln Gln Gln Leu Val
Thr Arg Glu Leu Phe Pro 85 90 95 Ala Gly Ala Gly Pro Pro Ala Pro
Thr Pro Arg His Trp Ala Glu Leu 100 105 110 Gly Phe Phe Arg Ala Asp
Leu Gln Gln Gln Gln Ala Pro Gly Pro Arg 115 120 125 Ile Val Pro His
Pro His Ala Ala Pro Pro Pro Ala Lys Lys Ser Arg 130 135 140 Arg Gly
Pro Arg Ser Arg Ser Ser Gln Tyr Arg Gly Val Thr Phe Tyr 145 150 155
160 Arg Arg Thr Gly Arg Trp Glu Ser His Ile Trp Asp Cys Gly Lys Gln
165 170 175 Val Tyr Leu Gly Gly Phe Asp Thr Ala His Ala Ala Ala Arg
Ala Tyr 180 185 190 Asp Arg Ala Ala Ile Lys Phe Arg Gly Val Asp Ala
Asp Ile Asn Phe 195 200 205 Asn Leu Ser Asp Tyr Glu Asp Asp Met Lys
Gln Met Gly Ser Leu Ser 210 215 220 Lys Glu Glu Phe Val His Val Leu
Arg Arg Gln Ser Thr Gly Phe Ser 225 230 235 240 Arg Gly Ser Ser Arg
Tyr Arg Gly Val Thr Leu His Lys Cys Gly Arg 245 250 255 Trp Glu Ala
Arg Met Gly Gln Phe Leu Gly Lys Lys Tyr Ile Tyr Leu 260 265 270 Gly
Leu Phe Asp Ser Glu Val Glu Ala Ala Arg Ala Tyr Asp Lys Ala 275 280
285 Ala Ile Lys Cys Asn Gly Arg Glu Ala Val Thr Asn Phe Glu Pro Ser
290 295 300 Thr Tyr His Gly Glu Leu Pro Thr Glu Val Ala Asp Val Asp
Leu Asn 305 310 315 320 Leu Ser Ile Ser Gln Pro Ser Pro Gln Arg Asp
Lys Asn Ser Cys Leu 325 330 335 Gly Leu Gln Leu His His Gly Pro Phe
Glu Gly Ser Glu Leu Lys Lys 340 345 350 Thr Lys Ile Asp Asp Ala Pro
Ser Glu Leu Pro Gly Arg Pro Arg Gln 355 360 365 Leu Ser Pro Leu Val
Ala Glu His Pro Pro Ala Trp Pro Ala Gln Pro 370 375 380 Pro His Pro
Phe Phe Val Phe Thr Asn His Glu Met Ser Ala Ser Gly 385 390 395 400
Asp Leu His Arg Arg Pro Ala Gly Ala Val Pro Ser Trp Ala Trp Gln 405
410 415 Val Ala Ala Ala Ala Pro Pro Pro Ala Ala Leu Pro Ser Ser Ala
Ala 420 425 430 Ala Ser Ser Gly Phe Ser Asn Thr Ala Thr Thr Ala Ala
Thr Thr Ala 435 440 445 Pro Ser Ala Ser Ser Leu Arg Tyr Cys Pro Pro
Pro Pro Pro Pro Ser 450 455 460 5211413DNAZea mays 521atgcagttgg
atctgaacgt ggccgaggcg ccgccgccgg tggagatgga ggcgagcgac 60tcggggtcgt
cggtgctgaa cgcgtcggaa gcggcgtcgg cgggcggcgc gcccgcgccg
120gcggaggagg gatctagctc aacgccggcc gtgctggagt tcagcatcct
catccggagc 180gatagcgacg cggccggcgc ggacgaggac gaggacgcca
cgccatcgcc tcctcctcgc 240caccgccacc agcaccagca gcagctcgtg
acccgcgagc tgttcccggc cggcgccggt 300ccgccggccc cgacgccgcg
gcattgggcc gagctcggct tcttccgcgc cgacctgcag 360cagcaacagg
cgccgggccc caggatcgtg ccgcacccac acgccgcgcc gccgccggcc
420aagaagagcc gccgcggccc gcgctcccgc agctcgcagt accgcggcgt
caccttctac 480cgccgcacag gccgctggga gtcccacatc tgggattgcg
gcaagcaggt gtacctaggt 540ggattcgaca ccgctcacgc cgctgcaagg
gcgtacgacc gggcggcgat caagttccgc 600ggcgtcgacg ccgacatcaa
cttcaacctc agcgactacg aggacgacat gaagcagatg 660gggagcctgt
ccaaggagga gttcgtgcac gtcctgcgcc gtcagagcac cggcttctcg
720agaggcagct ccaggtacag aggcgtcacc ctgcacaagt gcggccgctg
ggaggcgcgc 780atggggcagt tcctcggcaa gaagtacata taccttgggc
tattcgacag cgaagtagag 840gctgcaagag cctacgacaa ggccgccatc
aaatgcaatg gcagagaggc cgtgacgaac 900ttcgagccga gcacgtatca
cggggagctg ccgactgaag ttgctgatgt cgatctgaac 960ctgagcatat
ctcagccgag cccccaaaga gacaagaaca gctgcctagg tctgcagctc
1020caccacggac cattcgaggg ctccgaactg aagaaaacca agatcgacga
tgctccctct 1080gagctaccgg gccgccctcg tcagctgtct cctctcgtgg
ctgagcatcc gccggcctgg 1140cctgcgcagc cgcctcaccc cttcttcgtc
ttcacaaacc atgagatgag tgcatcagga 1200gatctccaca ggaggcctgc
aggggctgtt cccagctggg catggcaggt ggcagcagca 1260gctcctcctc
ctgccgccct gccgtcgtcc gctgcagcat catcaggatt ctccaacacc
1320gccacgacag ctgccaccac cgccccatcg gcctcctccc tccggtactg
cccgccgccg 1380ccgccgccgt cgagccatca ccatccccgc tga
1413522514PRTZea mays 522Met Thr Thr Ser Thr Thr Ala Lys Gln Leu
Arg Arg Val Arg Thr Leu 1 5 10 15 Gly Arg Gly Ala Ser Gly Ala Val
Val Trp Leu Ala Ser Asp Glu Ala 20 25 30 Ser Gly Glu Leu Val Ala
Val Lys Ser Ala Arg Ala Ala Gly Ala Ala 35 40 45 Ala Gln Leu Gln
Arg Glu Gly Arg Val Leu Arg Gly Leu Ser Ser Pro 50 55 60 His Ile
Val Pro Cys Leu Gly Ser Arg Ala Ala Ala Gly Gly Glu Tyr 65 70 75 80
Gln Leu Leu Leu Glu Phe Ala Pro Gly Gly Ser Leu Ala Asp Glu Ala 85
90 95 Ala Arg Ser Gly Gly Gly Arg Leu Ala Glu Arg Ala Ile Gly Ala
Tyr 100 105 110 Ala Gly Asp Val Ala Arg Gly Leu Ala Tyr Leu His Gly
Arg Ser Leu 115 120 125 Val His Gly Asp Val Lys Ala Arg Asn Val Val
Ile Gly Gly Asp Gly 130 135 140 Arg Ala Arg Leu Thr Asp Phe Gly Cys
Ala Arg Pro Ala Gly Gly Ser 145 150 155 160 Thr Arg Pro Val Gly Gly
Thr Pro Ala Phe Met Ala Pro Glu Val Ala 165 170 175 Arg Gly Gln Glu
Gln Gly Pro Ala Ala Asp Val Trp Ala Leu Gly Cys 180 185 190 Met Val
Val Glu Leu Ala Thr Gly Arg Ala Pro Trp Ser Asp Val Glu 195 200 205
Gly Asp Asp Leu Leu Ala Ala Leu His Arg Ile Gly Tyr Thr Asp Asp 210
215 220 Val Pro Glu Val Pro Ala Trp Leu Ser Pro Glu Ala Lys Asp Phe
Leu 225 230 235 240 Ala Gly Cys Phe Glu Arg Arg Ala Ala Ala Arg Pro
Thr Ala Ala Gln 245 250 255 Pro Ala Ala His Pro Phe Val Val Ala Ser
Ala Ser Ala Ala Ala Ala 260 265 270 Ile Arg Gly Pro Ala Lys Gln Glu
Val Val Pro Ser Pro Lys Ser Thr 275 280 285 Leu His Asp Ala Phe Trp
Asp Ser Asp Ala Glu Asp Glu Ala Asp Glu 290 295 300 Met Ser Thr Gly
Ala Ala Ala Glu Arg Ile Gly Ala Leu Ala Cys Ala 305 310 315 320 Ala
Ser Ala Leu Pro Asp Trp Asp Thr Glu Glu Gly Trp Ile Asp Leu 325 330
335 Gln Asp Asp His Ser Ala Gly Thr Ala Asp Ala Pro Pro Ala Pro Val
340 345 350 Ala Asp Tyr Phe Ile Ser Trp Ala Glu Pro Ser Asp Ala Glu
Leu Glu 355 360 365 Pro Phe Val Ala Val Ala Ala Ala Ala Gly Leu Pro
His Val Ala Gly 370 375 380 Val Ala Leu Ala Gly Ala Thr Ala Val Asn
Leu Gln Gly Ser Tyr Tyr 385 390 395 400 Tyr Tyr Pro Pro Met His Leu
Gly Val Arg Gly Asn Glu Ile Pro Arg 405 410 415 Pro Leu Leu Asp His
His Gly Asp Gly Leu Glu Lys Gly Gln Gly Ser 420 425 430 His Arg Val
Cys Asn Arg Glu Thr Glu Lys Val Thr Met Lys Arg Ile 435 440 445 Ser
Leu Lys Arg Arg Ala Ala Phe Leu Leu Asp Gln His His Val Arg 450 455
460 Ser Leu Asp Lys Leu Glu Tyr Arg Pro Arg His Asp Arg Met Leu Arg
465 470 475 480 Arg Arg Gln Ser Ile Tyr Arg Ser Asn Ser Val Leu Gly
Tyr Asp Val 485 490 495 Ser Lys Gly Arg Gln Val Arg Trp Arg Arg Ala
Val Cys Ile Ala Val 500 505 510 Ala Ala 5231545DNAZea mays
523atgacgacgt cgaccacggc gaagcagctc cggcgcgtgc gcacgctcgg
ccgcggcgcg 60tcgggcgccg tggtgtggct ggcctccgac gaggcctcgg gcgagctggt
ggcggtcaag 120tcggcgcgcg ccgccggggc cgcggcgcag ctgcagcgcg
agggccgcgt cctccggggc 180ctctcgtcgc cgcacatcgt gccctgcctc
ggctcccgcg ccgcggcggg cggcgagtac 240cagctcctgc tggagttcgc
gccgggcggg tcgctggccg acgaggccgc caggagcggc 300gggggccgcc
tcgcggagcg cgccatcggc gcctacgccg gggacgtggc gcgcgggctg
360gcgtacctcc acggccggtc gctcgtgcac ggggacgtca aggcccggaa
cgtggtcatc 420ggcggcgacg ggcgcgccag gctgaccgac ttcgggtgcg
cgaggccggc cggcgggtcg 480acgcgccccg tcgggggcac cccggcgttc
atggcgcccg aggtggcgcg cggccaggag 540cagggccccg ccgccgacgt
ctgggcgctc gggtgcatgg tcgtcgagct ggccacgggc 600cgcgcgccct
ggagcgacgt ggagggcgac gacctcctcg ccgcgctcca ccggatcggg
660tacacggacg acgtgccgga ggtgcccgcg tggctgtcgc ccgaggccaa
ggacttcctg 720gccggctgct tcgagcgccg cgccgccgcc cggcccacgg
ccgcgcagcc cgcggcgcac 780ccgttcgtcg tcgcctccgc ctccgccgcc
gccgccatcc gcggcccggc gaagcaggag 840gtggtcccgt cacccaagag
cacgctgcac gacgcgttct gggactcgga cgccgaggac 900gaagcggacg
agatgtcgac gggcgcggcg gccgagagga tcggggcatt ggcgtgcgcc
960gcctccgcgc tgcctgactg ggacaccgag gaaggctgga tcgacctcca
ggacgaccac 1020tcggccggaa ctgccgacgc accgccggcg cccgtcgcgg
actacttcat cagctgggcg 1080gagccgtcag acgcagagct ggaaccattc
gtcgccgtcg ccgccgccgc aggtctcccg 1140cacgttgcag gagttgcatt
agcaggcgcc accgccgtta acctgcaggg cagttattat 1200tattacccgc
ctatgcatct aggcgtccgc ggaaacgaga ttccacgccc gttgttggat
1260catcatggcg acgggttaga aaaggggcag ggatcccacc gcgtttgtaa
cagagaaaca 1320gaaaaggtaa caatgaaacg aatttcgtta aaaagaagag
ctgctttcct tctcgaccag 1380catcacgtgc gatcgctgga caaactggaa
tatcgtccac gtcacgaccg aatgctgcgt 1440cgacggcaat ctatatatcg
gagcaatagc gtccttggtt acgacgttag caaaggtagg 1500caggtccgtt
ggcgccgtgc ggtttgcatt gccgttgctg cctga 1545524671DNAzea mays
524cggatccact agtaacggcc gccagtgtgc tggaattcgc ccttgacggc
ccgggctggt 60atttcaaaac tatagtattt taaaattgca ttaacaaaca tgtcctaatt
ggtactcctg 120agatactata ccctcctgtt ttaaaatagt tggcattatc
gaattatcat tttacttttt 180aatgttttct cttcttttaa tatattttat
gaattttaat gtattttaaa atgttatgca 240gttcgctctg gacttttctg
ctgcgcctac acttgggtgt actgggccta aattcagcct 300gaccgaccgc
ctgcattgaa taatggatga gcaccggtaa aatccgcgta cccaactttc
360gagaagaacc gagacgtggc gggccgggcc accgacgcac ggcaccagcg
actgcacacg 420tcccgccggc gtacgtgtac gtgctgttcc ctcactggcc
gcccaatcca ctcatgcatg 480cccacgtaca cccctgccgt ggcgcgccca
gatcctaatc ctttcgccgt tctgcacttc 540tgctgcctat aaatggcggc
atcgaccgtc acctgcttca ccaccggcga gccacatcga 600gaacacgatc
gagcacacaa gcacgaagac tcgtttagga gaaaccacaa accaccaagc
660cgtgcaagca c 671525245PRTZea mays 525Met Gly Arg Gly Lys Val Gln
Leu Lys Arg Ile Glu Asn Lys Ile Asn 1 5 10 15 Arg Gln Val Thr Phe
Ser Lys Arg Arg Ser Gly Leu Leu Lys Lys Ala 20 25 30 His Glu Ile
Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe 35 40 45 Ser
Thr Lys Gly Lys Leu Tyr Glu Tyr Ser Thr Asp Ser Cys Met Asp 50 55
60 Lys Ile Leu Glu Arg Tyr Glu Arg Tyr Ser Tyr Ala Glu Lys Val Leu
65 70 75 80 Ile Ser Ala Glu Tyr Glu Thr Gln Gly Asn Trp Cys His Glu
Tyr Arg 85 90 95 Lys Leu Lys Ala Lys Val Glu Thr Ile Gln Lys Cys
Gln Lys His Leu 100 105 110 Met Gly Glu Asp Leu Glu Thr Leu Asn Leu
Lys Glu Leu Gln Gln Leu 115 120 125 Glu Gln Gln Leu Glu Ser Ser Leu
Lys His Ile Arg Thr Arg Lys Ser 130 135 140 Gln Leu Met Val Glu Ser
Ile Ser Ala Leu Gln Arg Lys Glu Lys Ser 145 150 155 160 Leu Gln Glu
Glu Asn Lys Val Leu Gln Lys Glu Leu Ala Glu Lys Gln 165 170 175 Lys
Asp Gln Arg Gln Gln Val Gln Arg Asp Gln Thr Gln Gln Gln Thr 180 185
190 Ser Ser Ser Ser Thr Ser Phe Met Leu Arg Glu Ala Ala Pro Thr Thr
195 200 205 Asn Val Ser Ile Phe Pro Val Ala Ala Gly Gly Arg Val Val
Glu Gly 210 215 220 Ala Ala Ala Gln Pro Gln Ala Arg Val Gly Leu Pro
Pro Trp Met Leu 225 230 235 240 Ser His Leu Ser Cys 245
526738DNAZea mays 526atggggcgcg ggaaggtgca gctgaagcgg atcgagaaca
agatcaaccg ccaggtgaca 60ttctccaagc gccgctcggg gctactcaag aaggcgcacg
agatctccgt gctctgcgac 120gccgaggtcg cgctcatcat cttctccacc
aagggcaagc tctacgagta ctctaccgat 180tcatgtatgg acaaaattct
tgaacggtat gagcgctact cctatgcaga aaaggttctc 240atttccgcag
aatatgaaac tcagggcaat tggtgccatg aatatagaaa actaaaggcg
300aaggtcgaga caatacagaa atgtcaaaag cacctcatgg gagaggatct
tgaaactttg 360aatctcaaag agcttcagca actagagcag cagctggaga
gttcactgaa acatatcaga 420acaaggaaga gccagcttat ggtcgagtca
atttcagcgc tccaacggaa ggagaagtca 480ctgcaggagg agaacaaggt
tctgcagaag gagctcgcgg agaagcagaa agaccagcgg 540cagcaagtgc
aacgggacca aactcaacag cagaccagtt cgtcttccac gtccttcatg
600ttaagggaag ctgccccaac aacaaatgtc agcatcttcc ctgtggcagc
aggcgggagg 660gtggtggaag gggcagcagc gcagccgcag gctcgcgttg
gactgccacc atggatgctt 720agccatctga gctgctga 73852780DNAzea
maysmisc_feature(1)..(80)sequence of Figure 34B 527gctaaagagg
aagtgcagct cttcttgggg aatgctggaa ctgcaatgcg gccattgaca 60gcagctgtta
ctgctgctgg 8052880DNAZea maysmisc_feature(1)..(80)sequence of
Figure 34c 528gctagagagg aagtgcagct cttcttgggg aatgctggaa
tcgcaatgcg gtcattgaca 60gcagctgtta ctgctgctgg 8052937DNAZea
maysmisc_feature(1)..(37)sequence of Figure 35b 529catctcacga
tcagatgcac cgcatgtcgc atgccta 3753042DNAZea
maysmisc_feature(1)..(42)sequence of Figure 35c 530catatctgca
cgatcagata tgcaccgcat gtcgcatatc tg 4253131DNAZea
maysmisc_feature(1)..(31)sequence of Figure 37 531gtttttgaac
ttcagttacg tgcttgatgg a 3153231DNAZea
maysmisc_feature(1)..(31)sequence of Figure 37 532gtttttgaac
ttcaggtacg tgcttgatgg a 31533459DNAArtificial SequenceSouthern
genomic probe 533agctttatcc atccatccat cgcgctagct ggctgcaggc
acgggttatc ttatcttgtc 60gtccagagga cgacacacgg ccggccggtg aagtaaaagg
gagtaatctt attttgccag 120gacgaggggc ggtacatgat attacacacg
taccatgcat gcatatatgc atggacaagg 180tacgtcgtcg tcgatcgacg
tcgatgcata tgtgtgtatg tatgtacgtg cataatgcat 240ggtaccagct
gctggcttat atatatttgt caccgatcga tgcatgctgc tgctctacac
300ggtttgacac tttaatttga ctcatcgatg accttgctag atagtagcgg
ctcgtcaatt 360aatgagccat caagttaaca agagggcacg ggcttgcgcg
actgattcca ccttattaac 420atacgccctg
cgcccgcgcg tgctgtacgt acgagaatt 459534446DNAArtificial
SequenceSouthern MoPAT probe 534tcgaagtcgc gctgccagaa gccgacgtcg
tgccagccgc cgtgcttgta gccggcggcg 60cggagggtgc cgcgggcggt gtagccgagg
gcctcgtgga ggcgcacgga cgggtcgttc 120gggaggccga tcacggccac
cacggacttg aagccctggg cctccatgct cttgaggagg 180tgggtgtaga
gggtggagcc gaggccgagg cgctggtggc ggtgggacac gtacacggtg
240gactccacgg tccagtcgta ggcgttgcgg gccttccacg ggccggcgta
ggcgatgccg 300gccaccacgc cctccacctc ggccacgagc cacgggtagc
ggtcctggag gcgctccagg 360tcgtcgatcc actcctgcgg ggtctgcggc
tcggtgcgga agttcacggt ggaggtctcg 420atgtagtggt tcacgatgtc gcacac
44653520DNAArtificial SequenceRF-FPCas-1 535gcaggtctca cgacggttgg
2053623DNAArtificial SequenceRF-FPCas-2 536gtaaagtacg cgtacgtgtg
agg 2353723DNAArtificial SequenceALSCas-4 537gctgctcgat tccgtcccca
tgg 23538804DNAArtificial SequenceALS modification repair template
804 538agcttacagc cgccgcaacc atggccaccg ccgccgccgc gtctaccgcg
ctcactggcg 60ccactaccgc tgcgcccaag gcgaggcgcc gggcgcacct cctggccacc
cgccgcgccc 120tcgccgcgcc catcaggtgc tcagcggcgt cacccgccat
gccgatggct cccccggcca 180ccccgctccg gccgtggggc cccaccgatc
cccgcaaggg cgccgacatc ctcgtcgagt 240ccctcgagcg ctgcggcgtc
cgcgacgtct tcgcctaccc cggcggcgcg tccatggaga 300tccaccaggc
actcacccgc tcccccgtca tcgccaacca cctcttccgc cacgagcaag
360gggaggcctt tgcggcctcc ggctacgcgc gctcctcggg ccgcgtcggc
gtctgcatcg 420ccacctccgg ccccggcgcc accaaccttg tctccgcgct
cgccgacgcg ttgctcgact 480ccgtccccat tgtcgccatc acgggacagg
tgtcgcgacg catgattggc accgacgcct 540tccaggagac gcccatcgtc
gaggtcaccc gctccatcac caagcacaac tacctggtcc 600tcgacgtcga
cgacatcccc cgcgtcgtgc aggaggcttt cttcctcgcc tcctctggtc
660gaccagggcc ggtgcttgtc gacatcccca aggacatcca gcagcagatg
gcggtgcctg 720tctgggacaa gcccatgagt ctgcctgggt acattgcgcg
ccttcccaag ccccctgcga 780ctgagttgct tgagcagaag ggcg
804539127DNAArtificial SequenceALS modification repair template 127
539aaccttgtct ccgcgctcgc cgacgcgttg ctcgactccg tccccattgt
cgccatcacg 60ggacaggtgt cgcgacgcat gattggcacc gacgccttcc aggagacgcc
catcgtcgag 120gtcaccc 12754025DNAArtificial SequenceALS
Forward_primer 540ctacgcacat ccccctttct cccac 2554136DNAArtificial
SequenceALS Reverse_primer 541atgcatacct agcatgcgca gagacagtgg
gtcgtc 3654222DNAArtificial sequencesoy ALS1-CR1, Cas9 target
sequence 542caccggccag gtcccccgcc gg 2254322DNAArtificial
sequencesoy ALS2-CR2, Cas9 target sequence 543ggcgtcggtg ccgatcatcc
gg 225449093DNAArtificial sequenceQC880 544ccgggtgtga tttagtataa
agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg
aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt
180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct
gaaacttgag agagatacac 300taatcttgcc ttgttgtttc attccctaac
ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca
caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg caccggccag gtcccccgcg ttttagagct agaaatagca
480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc
ggtgcttttt 540tttgcggccg caattggatc gggtttactt attttgtggg
tatctatact tttattagat 600ttttaatcag gctcctgatt tctttttatt
tcgattgaat tcctgaactt gtattattca 660gtagatcgaa taaattataa
aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt
780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt
attactttta 840aaaaaatcat aaaggtttag tattttttta aaataaatat
aggaatagtt ttactattca 900ctgctttaat agaaaaatag tttaaaattt
aagatagttt taatcccagc atttgccacg 960tttgaacgtg agccgaaacg
atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg
1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct
caatcccacg 1140cgagtagagc acagtaacct tcaaataagc gaatggggca
taatcagaaa tccgaaataa 1200acctaggggc attatcggaa atgaaaagta
gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt atcactctgt
gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc
1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg
ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt
cgctttgttt ttgtggttca 1500gttttttagg attcttttgg tttttgaatc
gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg aggtgaatct
tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc
1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag
gtttgtttta 1740acaggattaa aagtttttta agcatgttga aggagtcttg
tagatatgta accgtcgata 1800gtttttttgt gggtttgttc acatgttatc
aagcttaatc ttttactatg tatgcgacca 1860tatctggatc cagcaaaggc
gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta
1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta
tttctttgaa 2040ttgattctag tttaagtaat ccatggacaa aaagtactca
atagggctcg acatagggac 2100taactccgtt ggatgggccg tcatcaccga
cgagtacaag gtgccctcca agaagttcaa 2160ggtgttggga aacaccgaca
ggcacagcat aaagaagaat ttgatcggtg ccctcctctt 2220cgactccgga
gagaccgctg aggctaccag gctcaagagg accgctagaa ggcgctacac
2280cagaaggaag aacagaatct gctacctgca ggagatcttc tccaacgaga
tggccaaggt 2340ggacgactcc ttcttccacc gccttgagga atcattcctg
gtggaggagg ataaaaagca 2400cgagagacac ccaatcttcg ggaacatcgt
cgacgaggtg gcctaccatg aaaagtaccc 2460taccatctac cacctgagga
agaagctggt cgactctacc gacaaggctg acttgcgctt 2520gatttacctg
gctctcgctc acatgataaa gttccgcgga cacttcctca ttgagggaga
2580cctgaaccca gacaactccg acgtggacaa gctcttcatc cagctcgttc
agacctacaa 2640ccagcttttc gaggagaacc caatcaacgc cagtggagtt
gacgccaagg ctatcctctc 2700tgctcgtctg tcaaagtcca ggaggcttga
gaacttgatt gcccagctgc ctggcgaaaa 2760gaagaacgga ctgttcggaa
acttgatcgc tctctccctg ggattgactc ccaacttcaa 2820gtccaacttc
gacctcgccg aggacgctaa gttgcagttg tctaaagaca cctacgacga
2880tgacctcgac aacttgctgg cccagatagg cgaccaatac gccgatctct
tcctcgccgc 2940taagaacttg tccgacgcaa tcctgctgtc cgacatcctg
agagtcaaca ctgagattac 3000caaagctcct ctgtctgctt ccatgattaa
gcgctacgac gagcaccacc aagatctgac 3060cctgctcaag gccctggtga
gacagcagct gcccgagaag tacaaggaga tctttttcga 3120ccagtccaag
aacggctacg ccggatacat tgacggaggc gcctcccagg aagagttcta
3180caagttcatc aagcccatcc ttgagaagat ggacggtacc gaggagctgt
tggtgaagtt 3240gaacagagag gacctgttga ggaagcagag aaccttcgac
aacggaagca tccctcacca 3300aatccacctg ggagagctcc acgccatctt
gaggaggcag gaggatttct atcccttcct 3360gaaggacaac cgcgagaaga
ttgagaagat cttgaccttc agaattcctt actacgtcgg 3420gccactcgcc
agaggaaact ctaggttcgc ctggatgacc cgcaaatctg aagagaccat
3480tactccctgg aacttcgagg aagtcgtgga caagggcgct tccgctcagt
ctttcatcga 3540gaggatgacc aacttcgata aaaatctgcc caacgagaag
gtgctgccca agcactccct 3600gttgtacgag tatttcacag tgtacaacga
gctcaccaag gtgaagtacg tcacagaggg 3660aatgaggaag cctgccttct
tgtccggaga gcagaagaag gccatcgtcg acctgctctt 3720caagaccaac
aggaaggtga ctgtcaagca gctgaaggag gactacttca agaagatcga
3780gtgcttcgac tccgtcgaga tctctggtgt cgaggacagg ttcaacgcct
cccttgggac 3840ttaccacgat ctgctcaaga ttattaaaga caaggacttc
ctggacaacg aggagaacga 3900ggacatcctt gaggacatcg tgctcaccct
gaccttgttc gaagacaggg aaatgatcga 3960agagaggctc aagacctacg
cccacctctt cgacgacaag gtgatgaaac agctgaagag 4020acgcagatat
accggctggg gaaggctctc ccgcaaattg atcaacggga tcagggacaa
4080gcagtcaggg aagactatac tcgacttcct gaagtccgac ggattcgcca
acaggaactt 4140catgcagctc attcacgacg actccttgac cttcaaggag
gacatccaga aggctcaggt 4200gtctggacag ggtgactcct tgcatgagca
cattgctaac ttggccggct ctcccgctat 4260taagaagggc attttgcaga
ccgtgaaggt cgttgacgag ctcgtgaagg tgatgggacg 4320ccacaagcca
gagaacatcg ttattgagat ggctcgcgag aaccaaacta cccagaaagg
4380gcagaagaat tcccgcgaga ggatgaagcg cattgaggag ggcataaaag
agcttggctc 4440tcagatcctc aaggagcacc ccgtcgagaa cactcagctg
cagaacgaga agctgtacct 4500gtactacctc caaaacggaa gggacatgta
cgtggaccag gagctggaca tcaacaggtt 4560gtccgactac gacgtcgacc
acatcgtgcc tcagtccttc ctgaaggatg actccatcga 4620caataaagtg
ctgacacgct ccgataaaaa tagaggcaag tccgacaacg tcccctccga
4680ggaggtcgtg aagaagatga aaaactactg gagacagctc ttgaacgcca
agctcatcac 4740ccagcgtaag ttcgacaacc tgactaaggc tgagagagga
ggattgtccg agctcgataa 4800ggccggattc atcaagagac agctcgtcga
aacccgccaa attaccaagc acgtggccca 4860aattctggat tcccgcatga
acaccaagta cgatgaaaat gacaagctga tccgcgaggt 4920caaggtgatc
accttgaagt ccaagctggt ctccgacttc cgcaaggact tccagttcta
4980caaggtgagg gagatcaaca actaccacca cgcacacgac gcctacctca
acgctgtcgt 5040tggaaccgcc ctcatcaaaa aatatcctaa gctggagtct
gagttcgtct acggcgacta 5100caaggtgtac gacgtgagga agatgatcgc
taagtctgag caggagatcg gcaaggccac 5160cgccaagtac ttcttctact
ccaacatcat gaacttcttc aagaccgaga tcactctcgc 5220caacggtgag
atcaggaagc gcccactgat cgagaccaac ggtgagactg gagagatcgt
5280gtgggacaaa gggagggatt tcgctactgt gaggaaggtg ctctccatgc
ctcaggtgaa 5340catcgtcaag aagaccgaag ttcagaccgg aggattctcc
aaggagtcca tcctccccaa 5400gagaaactcc gacaagctga tcgctagaaa
gaaagactgg gaccctaaga agtacggagg 5460cttcgattct cctaccgtgg
cctactctgt gctggtcgtg gccaaggtgg agaagggcaa 5520gtccaagaag
ctgaaatccg tcaaggagct cctcgggatt accatcatgg agaggagttc
5580cttcgagaag aaccctatcg acttcctgga ggccaaggga tataaagagg
tgaagaagga 5640cctcatcatc aagctgccca agtactccct cttcgagttg
gagaacggaa ggaagaggat 5700gctggcttct gccggagagt tgcagaaggg
aaatgagctc gcccttccct ccaagtacgt 5760gaacttcctg tacctcgcct
ctcactatga aaagttgaag ggctctcctg aggacaacga 5820gcagaagcag
ctcttcgtgg agcagcacaa gcactacctg gacgaaatta tcgagcagat
5880ctctgagttc tccaagcgcg tgatattggc cgacgccaac ctcgacaagg
tgctgtccgc 5940ctacaacaag cacagggata agcccattcg cgagcaggct
gaaaacatta tccacctgtt 6000taccctcaca aacttgggag cccctgctgc
cttcaagtac ttcgacacca ccattgacag 6060gaagagatac acctccacca
aggaggtgct cgacgcaaca ctcatccacc aatccatcac 6120cggcctctat
gaaacaagga ttgacttgtc ccagctggga ggcgactcta gagccgatcc
6180caagaagaag agaaaggtgt aggttaacct agacttgtcc atcttctgga
ttggccaact 6240taattaatgt atgaaataaa aggatgcaca catagtgaca
tgctaatcac tataatgtgg 6300gcatcaaagt tgtgtgttat gtgtaattac
tagttatctg aataaaagag aaagagatca 6360tccatatttc ttatcctaaa
tgaatgtcac gtgtctttat aattctttga tgaaccagat 6420gcatttcatt
aaccaaatcc atatacatat aaatattaat catatataat taatatcaat
6480tgggttagca aaacaaatct agtctaggtg tgttttgcga attcgatatc
aagcttatcg 6540ataccgtcga gggggggccc ggtaccggcg cgccgttcta
tagtgtcacc taaatcgtat 6600gtgtatgata cataaggtta tgtattaatt
gtagccgcgt tctaacgaca atatgtccat 6660atggtgcact ctcagtacaa
tctgctctga tgccgcatag ttaagccagc cccgacaccc 6720gccaacaccc
gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca
6780agctgtgacc gtctccggga gctgcatgtg tcagaggttt tcaccgtcat
caccgaaacg 6840cgcgagacga aagggcctcg tgatacgcct atttttatag
gttaatgtca tgaccaaaat 6900cccttaacgt gagttttcgt tccactgagc
gtcagacccc gtagaaaaga tcaaaggatc 6960ttcttgagat cctttttttc
tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct 7020accagcggtg
gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg
7080cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt
taggccacca 7140cttcaagaac tctgtagcac cgcctacata cctcgctctg
ctaatcctgt taccagtggc 7200tgctgccagt ggcgataagt cgtgtcttac
cgggttggac tcaagacgat agttaccgga 7260taaggcgcag cggtcgggct
gaacgggggg ttcgtgcaca cagcccagct tggagcgaac 7320gacctacacc
gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga
7380agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag
agcgcacgag 7440ggagcttcca gggggaaacg cctggtatct ttatagtcct
gtcgggtttc gccacctctg 7500acttgagcgt cgatttttgt gatgctcgtc
aggggggcgg agcctatgga aaaacgccag 7560caacgcggcc tttttacggt
tcctggcctt ttgctggcct tttgctcaca tgttctttcc 7620tgcgttatcc
cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc
7680tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg
aagagcgccc 7740aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat
taatgcaggt tgatcagatc 7800tcgatcccgc gaaattaata cgactcacta
tagggagacc acaacggttt ccctctagaa 7860ataattttgt ttaactttaa
gaaggagata tacccatgga aaagcctgaa ctcaccgcga 7920cgtctgtcga
gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct
7980cggagggcga agaatctcgt gctttcagct tcgatgtagg agggcgtgga
tatgtcctgc 8040gggtaaatag ctgcgccgat ggtttctaca aagatcgtta
tgtttatcgg cactttgcat 8100cggccgcgct cccgattccg gaagtgcttg
acattgggga attcagcgag agcctgacct 8160attgcatctc ccgccgtgca
cagggtgtca cgttgcaaga cctgcctgaa accgaactgc 8220ccgctgttct
gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc
8280agacgagcgg gttcggccca ttcggaccgc aaggaatcgg tcaatacact
acatggcgtg 8340atttcatatg cgcgattgct gatccccatg tgtatcactg
gcaaactgtg atggacgaca 8400ccgtcagtgc gtccgtcgcg caggctctcg
atgagctgat gctttgggcc gaggactgcc 8460ccgaagtccg gcacctcgtg
cacgcggatt tcggctccaa caatgtcctg acggacaatg 8520gccgcataac
agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg
8580tcgccaacat cttcttctgg aggccgtggt tggcttgtat ggagcagcag
acgcgctact 8640tcgagcggag gcatccggag cttgcaggat cgccgcggct
ccgggcgtat atgctccgca 8700ttggtcttga ccaactctat cagagcttgg
ttgacggcaa tttcgatgat gcagcttggg 8760cgcagggtcg atgcgacgca
atcgtccgat ccggagccgg gactgtcggg cgtacacaaa 8820tcgcccgcag
aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg
8880gaaaccgacg ccccagcact cgtccgaggg caaaggaata gtgaggtaca
gcttggatcg 8940atccggctgc taacaaagcc cgaaaggaag ctgagttggc
tgctgccacc gctgagcaat 9000aactagcata accccttggg gcctctaaac
gggtcttgag gggttttttg ctgaaaggag 9060gaactatatc cggatgatcg
ggcgcgccgg tac 9093
545 9093 DNA Artificial sequence QC881
545 ccgggtgtga
tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata
agtaatattg aacaaaataa atggtaaagt gtcagatata taaaataggc
120tttaataaaa ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc
agagcagagt 180catcatgaag ctagaaaggc taccgataga taaactatag
ttaattaaat acattaaaaa 240atacttggat ctttctctta ccctgtttat
attgagacct gaaacttgag agagatacac 300taatcttgcc ttgttgtttc
attccctaac ttacaggact cagcgcatgt catgtggtct 360cgttccccat
ttaagtccca caccgtctaa acttattaaa ttattaatgt ttataactag
420atgcacaaca acaaagcttg ggcgtcggtg ccgatcatcg ttttagagct
agaaatagca 480agttaaaata aggctagtcc gttatcaact tgaaaaagtg
gcaccgagtc ggtgcttttt 540tttgcggccg caattggatc gggtttactt
attttgtggg tatctatact tttattagat 600ttttaatcag gctcctgatt
tctttttatt tcgattgaat tcctgaactt gtattattca 660gtagatcgaa
taaattataa aaagataaaa tcataaaata atattttatc ctatcaatca
720tattaaagca atgaatatgt aaaattaatc ttatctttat tttaaaaaat
catataggtt 780tagtattttt ttaaaaataa agataggatt agttttacta
ttcactgctt attactttta 840aaaaaatcat aaaggtttag tattttttta
aaataaatat aggaatagtt ttactattca 900ctgctttaat agaaaaatag
tttaaaattt aagatagttt taatcccagc atttgccacg 960tttgaacgtg
agccgaaacg atgtcgttac attatcttaa cctagctgaa acgatgtcgt
1020cataatatcg ccaaatgcca actggactac gtcgaaccca caaatcccac
aaagcgcgtg 1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt
gttacacgct caatcccacg 1140cgagtagagc acagtaacct tcaaataagc
gaatggggca taatcagaaa tccgaaataa 1200acctaggggc attatcggaa
atgaaaagta gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt
atcactctgt gctccctcgc tctatttctc agtctctgtg tttgcggctg
1320aggattccga acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc
tgctcttgtc 1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat
gtttcttcgg ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg
ttgtttgttt cgctttgttt ttgtggttca 1500gttttttagg attcttttgg
tttttgaatc gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg
aggtgaatct tttttttgag gtcatagatc tgttgtattt gtgttataaa
1620catgcgactt tgtatgattt tttacgaggt tatgatgttc tggttgtttt
attatgaatc 1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac
actattaaag gtttgtttta 1740acaggattaa aagtttttta agcatgttga
aggagtcttg tagatatgta accgtcgata 1800gtttttttgt gggtttgttc
acatgttatc aagcttaatc ttttactatg tatgcgacca 1860tatctggatc
cagcaaaggc gattttttaa ttccttgtga aacttttgta atatgaagtt
1920gaaattttgt tattggtaaa ctataaatgt gtgaagttgg agtatacctt
taccttctta 1980tttggctttg tgatagttta atttatatgt attttgagtt
ctgacttgta tttctttgaa 2040ttgattctag tttaagtaat ccatggacaa
aaagtactca atagggctcg acatagggac 2100taactccgtt ggatgggccg
tcatcaccga cgagtacaag gtgccctcca agaagttcaa 2160ggtgttggga
aacaccgaca ggcacagcat aaagaagaat ttgatcggtg ccctcctctt
2220cgactccgga gagaccgctg aggctaccag gctcaagagg accgctagaa
ggcgctacac 2280cagaaggaag aacagaatct gctacctgca ggagatcttc
tccaacgaga tggccaaggt 2340ggacgactcc ttcttccacc gccttgagga
atcattcctg gtggaggagg ataaaaagca 2400cgagagacac ccaatcttcg
ggaacatcgt cgacgaggtg gcctaccatg aaaagtaccc 2460taccatctac
cacctgagga agaagctggt cgactctacc gacaaggctg acttgcgctt
2520gatttacctg gctctcgctc acatgataaa gttccgcgga cacttcctca
ttgagggaga 2580cctgaaccca gacaactccg acgtggacaa gctcttcatc
cagctcgttc agacctacaa 2640ccagcttttc gaggagaacc caatcaacgc
cagtggagtt gacgccaagg ctatcctctc 2700tgctcgtctg tcaaagtcca
ggaggcttga gaacttgatt gcccagctgc ctggcgaaaa 2760gaagaacgga
ctgttcggaa acttgatcgc tctctccctg ggattgactc ccaacttcaa
2820gtccaacttc gacctcgccg aggacgctaa gttgcagttg tctaaagaca
cctacgacga 2880tgacctcgac aacttgctgg cccagatagg cgaccaatac
gccgatctct tcctcgccgc 2940taagaacttg tccgacgcaa tcctgctgtc
cgacatcctg agagtcaaca ctgagattac 3000caaagctcct ctgtctgctt
ccatgattaa gcgctacgac gagcaccacc aagatctgac 3060cctgctcaag
gccctggtga gacagcagct gcccgagaag tacaaggaga tctttttcga
3120ccagtccaag aacggctacg ccggatacat tgacggaggc gcctcccagg
aagagttcta 3180caagttcatc aagcccatcc ttgagaagat ggacggtacc
gaggagctgt tggtgaagtt 3240gaacagagag gacctgttga ggaagcagag
aaccttcgac aacggaagca tccctcacca 3300aatccacctg ggagagctcc
acgccatctt gaggaggcag gaggatttct atcccttcct 3360gaaggacaac
cgcgagaaga ttgagaagat cttgaccttc agaattcctt actacgtcgg
3420gccactcgcc agaggaaact ctaggttcgc ctggatgacc cgcaaatctg
aagagaccat 3480tactccctgg aacttcgagg aagtcgtgga caagggcgct
tccgctcagt ctttcatcga 3540gaggatgacc aacttcgata aaaatctgcc
caacgagaag gtgctgccca agcactccct 3600gttgtacgag tatttcacag
tgtacaacga gctcaccaag gtgaagtacg tcacagaggg 3660aatgaggaag
cctgccttct tgtccggaga gcagaagaag gccatcgtcg acctgctctt
3720caagaccaac aggaaggtga ctgtcaagca gctgaaggag gactacttca
agaagatcga 3780gtgcttcgac tccgtcgaga tctctggtgt cgaggacagg
ttcaacgcct cccttgggac 3840ttaccacgat ctgctcaaga ttattaaaga
caaggacttc ctggacaacg aggagaacga 3900ggacatcctt gaggacatcg
tgctcaccct gaccttgttc gaagacaggg aaatgatcga 3960agagaggctc
aagacctacg cccacctctt cgacgacaag gtgatgaaac agctgaagag
4020acgcagatat accggctggg gaaggctctc ccgcaaattg atcaacggga
tcagggacaa 4080gcagtcaggg aagactatac tcgacttcct gaagtccgac
ggattcgcca acaggaactt 4140catgcagctc attcacgacg actccttgac
cttcaaggag gacatccaga aggctcaggt 4200gtctggacag ggtgactcct
tgcatgagca cattgctaac ttggccggct ctcccgctat 4260taagaagggc
attttgcaga ccgtgaaggt cgttgacgag ctcgtgaagg tgatgggacg
4320ccacaagcca gagaacatcg ttattgagat ggctcgcgag aaccaaacta
cccagaaagg 4380gcagaagaat tcccgcgaga ggatgaagcg cattgaggag
ggcataaaag agcttggctc 4440tcagatcctc aaggagcacc ccgtcgagaa
cactcagctg cagaacgaga agctgtacct 4500gtactacctc caaaacggaa
gggacatgta cgtggaccag gagctggaca tcaacaggtt 4560gtccgactac
gacgtcgacc acatcgtgcc tcagtccttc ctgaaggatg actccatcga
4620caataaagtg ctgacacgct ccgataaaaa tagaggcaag tccgacaacg
tcccctccga 4680ggaggtcgtg aagaagatga aaaactactg gagacagctc
ttgaacgcca agctcatcac 4740ccagcgtaag ttcgacaacc tgactaaggc
tgagagagga ggattgtccg agctcgataa 4800ggccggattc atcaagagac
agctcgtcga aacccgccaa attaccaagc acgtggccca 4860aattctggat
tcccgcatga acaccaagta cgatgaaaat gacaagctga tccgcgaggt
4920caaggtgatc accttgaagt ccaagctggt ctccgacttc cgcaaggact
tccagttcta 4980caaggtgagg gagatcaaca actaccacca cgcacacgac
gcctacctca acgctgtcgt 5040tggaaccgcc ctcatcaaaa aatatcctaa
gctggagtct gagttcgtct acggcgacta 5100caaggtgtac gacgtgagga
agatgatcgc taagtctgag caggagatcg gcaaggccac 5160cgccaagtac
ttcttctact ccaacatcat gaacttcttc aagaccgaga tcactctcgc
5220caacggtgag atcaggaagc gcccactgat cgagaccaac ggtgagactg
gagagatcgt 5280gtgggacaaa gggagggatt tcgctactgt gaggaaggtg
ctctccatgc ctcaggtgaa 5340catcgtcaag aagaccgaag ttcagaccgg
aggattctcc aaggagtcca tcctccccaa 5400gagaaactcc gacaagctga
tcgctagaaa gaaagactgg gaccctaaga agtacggagg 5460cttcgattct
cctaccgtgg cctactctgt gctggtcgtg gccaaggtgg agaagggcaa
5520gtccaagaag ctgaaatccg tcaaggagct cctcgggatt accatcatgg
agaggagttc 5580cttcgagaag aaccctatcg acttcctgga ggccaaggga
tataaagagg tgaagaagga 5640cctcatcatc aagctgccca agtactccct
cttcgagttg gagaacggaa ggaagaggat 5700gctggcttct gccggagagt
tgcagaaggg aaatgagctc gcccttccct ccaagtacgt 5760gaacttcctg
tacctcgcct ctcactatga aaagttgaag ggctctcctg aggacaacga
5820gcagaagcag ctcttcgtgg agcagcacaa gcactacctg gacgaaatta
tcgagcagat 5880ctctgagttc tccaagcgcg tgatattggc cgacgccaac
ctcgacaagg tgctgtccgc 5940ctacaacaag cacagggata agcccattcg
cgagcaggct gaaaacatta tccacctgtt 6000taccctcaca aacttgggag
cccctgctgc cttcaagtac ttcgacacca ccattgacag 6060gaagagatac
acctccacca aggaggtgct cgacgcaaca ctcatccacc aatccatcac
6120cggcctctat gaaacaagga ttgacttgtc ccagctggga ggcgactcta
gagccgatcc 6180caagaagaag agaaaggtgt aggttaacct agacttgtcc
atcttctgga ttggccaact 6240taattaatgt atgaaataaa aggatgcaca
catagtgaca tgctaatcac tataatgtgg 6300gcatcaaagt tgtgtgttat
gtgtaattac tagttatctg aataaaagag aaagagatca 6360tccatatttc
ttatcctaaa tgaatgtcac gtgtctttat aattctttga tgaaccagat
6420gcatttcatt aaccaaatcc atatacatat aaatattaat catatataat
taatatcaat 6480tgggttagca aaacaaatct agtctaggtg tgttttgcga
attcgatatc aagcttatcg 6540ataccgtcga gggggggccc ggtaccggcg
cgccgttcta tagtgtcacc taaatcgtat 6600gtgtatgata cataaggtta
tgtattaatt gtagccgcgt tctaacgaca atatgtccat 6660atggtgcact
ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc
6720gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg
cttacagaca 6780agctgtgacc gtctccggga gctgcatgtg tcagaggttt
tcaccgtcat caccgaaacg 6840cgcgagacga aagggcctcg tgatacgcct
atttttatag gttaatgtca tgaccaaaat 6900cccttaacgt gagttttcgt
tccactgagc gtcagacccc gtagaaaaga tcaaaggatc 6960ttcttgagat
cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct
7020accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga
aggtaactgg 7080cttcagcaga gcgcagatac caaatactgt ccttctagtg
tagccgtagt taggccacca 7140cttcaagaac tctgtagcac cgcctacata
cctcgctctg ctaatcctgt taccagtggc 7200tgctgccagt ggcgataagt
cgtgtcttac cgggttggac tcaagacgat agttaccgga 7260taaggcgcag
cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac
7320gacctacacc gaactgagat acctacagcg tgagcattga gaaagcgcca
cgcttcccga 7380agggagaaag gcggacaggt atccggtaag cggcagggtc
ggaacaggag agcgcacgag 7440ggagcttcca gggggaaacg cctggtatct
ttatagtcct gtcgggtttc gccacctctg 7500acttgagcgt cgatttttgt
gatgctcgtc aggggggcgg agcctatgga aaaacgccag 7560caacgcggcc
tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc
7620tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag
ctgataccgc 7680tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc
gaggaagcgg aagagcgccc 7740aatacgcaaa ccgcctctcc ccgcgcgttg
gccgattcat taatgcaggt tgatcagatc 7800tcgatcccgc gaaattaata
cgactcacta tagggagacc acaacggttt ccctctagaa 7860ataattttgt
ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga
7920cgtctgtcga gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg
atgcagctct 7980cggagggcga agaatctcgt gctttcagct tcgatgtagg
agggcgtgga tatgtcctgc 8040gggtaaatag ctgcgccgat ggtttctaca
aagatcgtta tgtttatcgg cactttgcat 8100cggccgcgct cccgattccg
gaagtgcttg acattgggga attcagcgag agcctgacct 8160attgcatctc
ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc
8220ccgctgttct gcagccggtc gcggaggcta tggatgcgat cgctgcggcc
gatcttagcc 8280agacgagcgg gttcggccca ttcggaccgc aaggaatcgg
tcaatacact acatggcgtg 8340atttcatatg cgcgattgct gatccccatg
tgtatcactg gcaaactgtg atggacgaca 8400ccgtcagtgc gtccgtcgcg
caggctctcg atgagctgat gctttgggcc gaggactgcc 8460ccgaagtccg
gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg
8520gccgcataac agcggtcatt gactggagcg aggcgatgtt cggggattcc
caatacgagg 8580tcgccaacat cttcttctgg aggccgtggt tggcttgtat
ggagcagcag acgcgctact 8640tcgagcggag gcatccggag cttgcaggat
cgccgcggct ccgggcgtat atgctccgca 8700ttggtcttga ccaactctat
cagagcttgg ttgacggcaa tttcgatgat gcagcttggg 8760cgcagggtcg
atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa
8820tcgcccgcag aagcgcggcc gtctggaccg atggctgtgt agaagtactc
gccgatagtg 8880gaaaccgacg ccccagcact cgtccgaggg caaaggaata
gtgaggtaca gcttggatcg 8940atccggctgc taacaaagcc cgaaaggaag
ctgagttggc tgctgccacc gctgagcaat 9000aactagcata accccttggg
gcctctaaac gggtcttgag gggttttttg ctgaaaggag 9060gaactatatc
cggatgatcg ggcgcgccgg tac 9093
546 1113 DNA Artificial sequence RTW1026A
546 agcttggtac cgagctcgga tccactagta tggcggccac cgcttccaga
accacccgat 60tctcttcttc ctcttcacac cccaccttcc ccaaacgcat tactagatcc
accctccctc 120tctctcatca aaccctcacc aaacccaacc acgctctcaa
aatcaaatgt tccatctcca 180aaccccccac ggcggcgccc ttcaccaagg
aagcgccgac cacggagccc ttcgtgtcac 240ggttcgcctc cggcgaacct
cgcaagggcg cggacatcct tgtggaggcg ctggagaggc 300agggcgtgac
gacggtgttc gcgtaccccg gcggtgcgtc gatggagatc caccaggcgc
360tcacgcgctc cgccgccatc cgcaacgtgc tcccgcgcca cgagcagggc
ggcgtcttcg 420ccgccgaagg ctacgcgcgt tcctccggcc tccccggcgt
ctgcattgcc acctccggcc 480ccggcgccac caacctcgtg agcggcctcg
ccgacgcttt aatggacagc gtcccagtcg 540tcgccatcac cggccaggtc
agccgtcgca tgatcggtac cgacgccttc caagaaaccc 600cgatcgtgga
ggtgagcaga tccatcacga agcacaacta cctcatcctc gacgtcgacg
660acatcccccg cgtcgtcgcc gaggctttct tcgtcgccac ctccggccgc
cccggtccgg 720tcctcatcga cattcccaaa gacgttcagc agcaactcgc
cgtgcctaat tgggacgagc 780ccgttaacct ccccggttac ctcgccaggc
tgcccaggcc ccccgccgag gcccaattgg 840aacacattgt cagactcatc
atggaggccc aaaagcccgt tctctacgtc ggcggtggca 900gtttgaattc
cagtgctgaa ttgaggcgct ttgttgaact cactggtatt cccgttgcta
960gcactttaat gggtcttgga acttttccta ttggtgatga atattccctt
cagatgctgg 1020gtatgcatgg tactgtttat gctaactatg ctgttgacaa
tagtgatttg ttgcttgcct 1080ttggggtaag gtttgatgac cgtgttactg gga
111354717DNAArtificialWOL900, Forward_primer 547atcaccggcc aggtcag
1754825DNAArtificialWOL578, Reverse_primer 548acttaccctc cactcctttc
tcctc 2554929DNAArtificialWOL573, Forward_primer 549atggcggcca
ccgcttccag aaccacccg 29550638PRTzea mays 550Met Ala Thr Ala Ala Ala
Ala Ser Thr Ala Leu Thr Gly Ala Thr Thr 1 5 10 15 Ala Ala Pro Lys
Ala Arg Arg Arg Ala His Leu Leu Ala Thr Arg Arg 20 25 30 Ala Leu
Ala Ala Pro Ile Arg Cys Ser Ala Ala Ser Pro Ala Met Pro 35 40 45
Met Ala Pro Pro Ala Thr Pro Leu Arg Pro Trp Gly Pro Thr Glu Pro 50
55 60 Arg Lys Gly Ala Asp Ile Leu Val Glu Ser Leu Glu Arg Cys Gly
Val 65 70 75 80 Arg Asp Val Phe Ala Tyr Pro Gly Gly Ala Ser Met Glu
Ile His Gln 85 90 95 Ala Leu Thr Arg Ser Pro Val Ile Ala Asn His
Leu Phe Arg His Glu 100 105 110 Gln Gly Glu Ala Phe Ala Ala Ser Gly
Tyr Ala Arg Ser Ser Gly Arg 115 120 125 Val Gly Val Cys Ile Ala Thr
Ser Gly Pro Gly Ala Thr Asn Leu Val 130 135 140 Ser Ala Leu Ala Asp
Ala Leu Leu Asp Ser Val Pro Met Val Ala Ile 145 150 155 160 Thr Gly
Gln Val Pro Arg Arg Met Ile Gly Thr Asp Ala Phe Gln Glu 165 170 175
Thr Pro Ile Val Glu Val Thr Arg Ser Ile Thr Lys His Asn Tyr Leu 180
185 190 Val Leu Asp Val Asp Asp Ile Pro Arg Val Val Gln Glu Ala Phe
Phe 195 200 205 Leu Ala Ser Ser Gly Arg Pro Gly Pro Val Leu Val Asp
Ile Pro Lys 210 215 220 Asp Ile Gln Gln Gln Met Ala Val Pro Val Trp
Asp Lys Pro Met Ser 225 230 235 240 Leu Pro Gly Tyr Ile Ala Arg Leu
Pro Lys Pro Pro Ala Thr Glu Leu 245 250 255 Leu Glu Gln Val Leu Arg
Leu Val Gly Glu Ser Arg Arg Pro Val Leu 260 265 270 Tyr Val Gly Gly
Gly Cys Ala Ala Ser Gly Glu Glu Leu Arg Arg Phe 275 280 285 Val Glu
Leu Thr Gly Ile Pro Val Thr Thr Thr Leu Met Gly Leu Gly 290 295 300
Asn Phe Pro Ser Asp Asp Pro Leu Ser Leu Arg Met Leu Gly Met His 305
310 315 320 Gly Thr Val Tyr Ala Asn Tyr Ala Val Asp Lys Ala Asp Leu
Leu Leu 325 330 335 Ala Leu Gly Val Arg Phe Asp Asp Arg Val Thr Gly
Lys Ile Glu Ala 340 345 350 Phe Ala Ser Arg Ala Lys Ile Val His Val
Asp Ile Asp Pro Ala Glu 355 360 365 Ile Gly Lys Asn Lys Gln Pro His
Val Ser Ile Cys Ala Asp Val Lys 370 375 380 Leu Ala Leu Gln Gly Met
Asn Ala Leu Leu Glu Gly Ser Thr Ser Lys 385 390 395 400 Lys Ser Phe
Asp Phe Gly Ser Trp Asn Asp Glu Leu Asp Gln Gln Lys 405 410 415 Arg
Glu Phe Pro Leu Gly Tyr Lys Thr Ser Asn Glu Glu Ile Gln Pro 420 425
430 Gln Tyr Ala Ile Gln Val Leu Asp Glu Leu Thr Lys Gly Glu Ala Ile
435 440 445 Ile Gly Thr Gly Val Gly Gln His Gln Met Trp Ala Ala Gln
Tyr Tyr 450 455 460 Thr Tyr Lys Arg Pro Arg Gln Trp Leu Ser Ser Ala
Gly Leu Gly Ala 465 470 475 480 Met Gly Phe Gly Leu Pro Ala Ala Ala
Gly Ala Ser Val Ala Asn Pro 485 490 495 Gly Val Thr Val Val Asp Ile
Asp Gly Asp Gly Ser Phe Leu Met Asn 500 505 510 Val Gln Glu Leu Ala
Met Ile Arg Ile Glu Asn Leu Pro Val Lys Val 515 520 525 Phe Val Leu
Asn Asn Gln His Leu Gly Met Val Val Gln Trp Glu Asp 530 535 540 Arg
Phe Tyr Lys Ala Asn Arg Ala His Thr Tyr Leu Gly Asn Pro Glu 545 550
555 560 Asn Glu Ser Glu Ile Tyr Pro Asp Phe Val Thr Ile Ala Lys Gly
Phe 565 570 575 Asn Ile Pro Ala Val Arg Val Thr Lys Lys Asn Glu Val
Arg Ala Ala 580 585 590 Ile Lys Lys Met Leu Glu Thr Pro Gly Pro Tyr
Leu Leu Asp Ile Ile 595 600 605 Val Pro His Gln Glu His Val Leu Pro
Met Ile Pro Ser Gly Gly Ala 610 615 620 Phe Lys Asp Met Ile Leu Asp
Gly Asp Gly Arg Thr Val Tyr 625 630 635