U.S. patent application number 13/012222, for methods of fetal abnormality detection, was filed with the patent office on 2011-01-24 and published on 2011-12-22.
This patent application is currently assigned to Artemis Health, Inc. Invention is credited to Yue-Jen Chuu, Richard P. Rava.
Application Number | 13/012222 |
Publication Number | 20110312503 |
Family ID | 45329183 |
Publication Date | 2011-12-22 |
United States Patent Application | 20110312503 |
Kind Code | A1 |
Chuu; Yue-Jen; et al. | December 22, 2011 |
METHODS OF FETAL ABNORMALITY DETECTION
Abstract
Methods and kits for selectively enriching non-random
polynucleotide sequences are provided. Methods and kits for
generating libraries of sequences are provided. Methods of using
selectively enriched non-random polynucleotide sequences for
detection of fetal aneuploidy are provided.
Inventors: | Chuu; Yue-Jen (Cupertino, CA); Rava; Richard P. (Redwood City, CA) |
Assignee: | Artemis Health, Inc., San Carlos, CA |
Family ID: | 45329183 |
Appl. No.: | 13/012222 |
Filed: | January 24, 2011 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
61297755 | Jan 23, 2010 | |
Current U.S. Class: | 506/2; 435/6.11; 435/6.12 |
Current CPC Class: | C12Q 2600/112 20130101; C12Q 2600/156 20130101; C12Q 2600/16 20130101; C12Q 1/6883 20130101; C12Q 1/6869 20130101 |
Class at Publication: | 506/2; 435/6.12; 435/6.11 |
International Class: | C12Q 1/68 20060101 C12Q001/68; C40B 20/00 20060101 C40B020/00 |
Claims
1. A method for determining the presence or absence of fetal
aneuploidy comprising: a. selectively enriching non-random
polynucleotide sequences of genomic DNA from a cell-free DNA
sample; b. sequencing said enriched polynucleotide sequences; c.
enumerating sequence reads from said sequencing step; and d.
determining the presence or absence of fetal aneuploidy based on
said enumerating.
2. The method of claim 1, wherein said selectively enriching
comprises performing PCR.
3. The method of claim 1, wherein said selectively enriching
comprises linear amplification.
4. The method of claim 1, wherein said selectively enriching
comprises enriching at least 1, 5, 10, 50, 100, or 1000 non-random
polynucleotide sequences from a first chromosome.
5. The method of claim 1, wherein said selectively enriching
comprises enriching at least 1, 10, or 100 polynucleotide sequences
from one or more regions of a first chromosome, wherein each region
is up to 50 kb.
6. The method of claim 1, wherein said non-random polynucleotide
sequences comprise sequences that are sequenced at a rate of
greater than 5-fold than other sequences on the same
chromosome.
7. The method of claim 1, wherein said non-random polynucleotide
sequences each comprise about 50-1000 bases.
8. The method of claim 1, wherein said cell-free DNA sample is a
maternal sample.
9. The method of claim 8, wherein said maternal sample is a
maternal blood sample.
10. The method of claim 9, wherein said maternal sample comprises
fetal and maternal cell-free DNA.
11. The method of claim 1, wherein said cell-free DNA is from a
plurality of different individuals.
12. The method of claim 1, wherein said sequencing comprises Sanger
sequencing, sequencing-by-synthesis, or massively parallel
sequencing.
13. The method of claim 1, wherein said aneuploidy is trisomy 21,
trisomy 18, or trisomy 13.
14. The method of claim 1, wherein said aneuploidy is suspected or
determined when the number of enumerated sequences is greater than
a predetermined amount.
15. The method of claim 14, wherein said predetermined amount is
based on estimated amount of DNA in said cell-free DNA sample.
16. The method of claim 14, wherein said predetermined amount is
based on the amount of enumerated sequences from a control
region.
17. A method comprising: a. providing oligonucleotides that
specifically hybridize to one or more polynucleotide sequences from
a polynucleotide template, wherein said one or more polynucleotide
sequences comprise sequences that are sequenced at a rate greater
than 5-fold than other sequences from the polynucleotide template;
b. selectively enriching said one or more polynucleotide sequences;
and c. optionally sequencing said enriched one or more
polynucleotide sequences.
18. The method of claim 17, wherein each of said oligonucleotides
has a substantially similar thermal profile.
19. The method of claim 17, wherein said polynucleotide sequences
each comprise about 50-1000 bases.
20. The method of claim 17, wherein said polynucleotide sequences
are from a cell-free DNA sample.
21. The method of claim 17, wherein said polynucleotide sequences
are from a maternal sample.
22. The method of claim 21, wherein said maternal sample is a
maternal blood sample.
23. The method of claim 22, wherein said maternal sample comprises
fetal and maternal cell-free DNA.
24. The method of claim 17, wherein said polynucleotide template is
a chromosome suspected of being aneuploid.
25. The method of claim 17, wherein said polynucleotide template is
chromosome 21.
Description
CROSS-REFERENCE
[0001] This application claims the benefit of U.S. Provisional
Application No. 61/297,755, filed Jan. 23, 2010, which application
is incorporated herein by reference.
SEQUENCE LISTING
[0002] The instant application contains a Sequence Listing which
has been submitted in ASCII format via EFS-Web and is hereby
incorporated by reference in its entirety. Said ASCII copy, created
on May 11, 2011, is named 32477692.txt and is 27,793 bytes in
size.
BACKGROUND OF THE INVENTION
[0003] Massively parallel sequencing techniques are used for
detection of fetal aneuploidy from samples that comprise fetal and
maternal nucleic acids. Fetal DNA often constitutes less than 10%
of the total DNA in a sample, for example, a maternal cell-free
plasma sample. Sequencing a large number of polynucleotides to
generate sufficient data for fetal aneuploidy detection can be
expensive. Methods for randomly enriching fetal nucleic acids in
a cell-free maternal sample have been described, including enriching
nucleic acids based on size, formaldehyde treatment, methylation
status, or hybridization to oligonucleotide arrays. There is a need
for a means of selectively enriching non-random fetal and maternal
polynucleotide sequences in a way that facilitates aneuploidy
detection by massively parallel sequencing techniques and increases
the sensitivity of aneuploidy detection.
SUMMARY OF THE INVENTION
[0004] In one aspect, a method for determining the presence or
absence of fetal aneuploidy is provided comprising a) selectively
enriching non-random polynucleotide sequences of genomic DNA from a
cell-free DNA sample; b) sequencing said enriched polynucleotide
sequences; c) enumerating sequence reads from said sequencing step;
and d) determining the presence or absence of fetal aneuploidy
based on said enumerating. In one embodiment, said selectively
enriching comprises performing PCR. In another embodiment, said
selectively enriching comprises linear amplification. In another
embodiment, said selectively enriching comprises enriching at least
1, 5, 10, 50, 100, or 1000 non-random polynucleotide sequences from
a first chromosome. In another embodiment, said selectively
enriching comprises enriching at least 1, 10, or 100 polynucleotide
sequences from one or more regions of a first chromosome, wherein
each region is up to 50 kb. In another embodiment, said non-random
polynucleotide sequences comprise sequences that are sequenced at a
rate of greater than 5-fold than other sequences on the same
chromosome. In another embodiment, said non-random polynucleotide
sequences each comprise about 50-1000 bases. In another embodiment,
said cell-free DNA sample is a maternal sample. In another
embodiment, said maternal sample is a maternal blood sample. In
another embodiment, said maternal sample comprises fetal and
maternal cell-free DNA. In another embodiment, said cell-free DNA
is from a plurality of different individuals.
[0005] In another embodiment, said sequencing comprises Sanger
sequencing, sequencing-by-synthesis, or massively parallel
sequencing.
[0006] In another embodiment, said aneuploidy is trisomy 21,
trisomy 18, or trisomy 13. In another embodiment, said aneuploidy
is suspected or determined when the number of enumerated sequences
is greater than a predetermined amount. In another embodiment, said
predetermined amount is based on estimated amount of DNA in said
cell-free DNA sample. In another embodiment, said predetermined
amount is based on the amount of enumerated sequences from a
control region.
[0007] In another aspect, a method is provided comprising: a)
providing oligonucleotides that specifically hybridize to one or
more polynucleotide sequences from a polynucleotide template,
wherein said one or more polynucleotide sequences comprise
sequences that are sequenced at a rate greater than 5-fold than other
sequences from the polynucleotide template; b) selectively
enriching said one or more polynucleotide sequences; and c)
optionally sequencing said enriched one or more polynucleotide
sequences.
[0008] In another embodiment, each of said oligonucleotides has a
substantially similar thermal profile. In another embodiment, said
polynucleotide sequences each comprise about 50-1000 bases. In
another embodiment, said polynucleotide sequences are from a
cell-free DNA sample. In another embodiment, said polynucleotide
sequences are from a maternal sample. In another embodiment, said
maternal sample is a maternal blood sample. In another embodiment,
said maternal sample comprises fetal and maternal cell-free DNA. In
another embodiment, said polynucleotide template is a chromosome
suspected of being aneuploid. In another embodiment, said
polynucleotide template is chromosome 21. In another embodiment,
the polynucleotide template is a chromosome not suspected of being
aneuploid. In another embodiment, said polynucleotide template is
chromosome 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17,
19, 20, or 22.
[0009] In another embodiment, said rate is at least 10 or 50-fold.
In another embodiment, there are at least 7, 10, 17, or 27 sequence
reads for the sequences that were sequenced at a higher frequency
rate. In another embodiment, said selectively enriching comprises
performing PCR. In another embodiment, said selectively enriching
comprises linear amplification. In another embodiment, said
selectively enriching comprises enriching at least 1, 5, 10, 50,
100, or 1000 non-random polynucleotide sequences from a first
chromosome. In another embodiment, said selectively enriching
comprises enriching at least 1, 10, or 100 polynucleotide sequences
from one or more regions of a first chromosome, wherein each region
is up to 50 kb. In another embodiment, said sequencing comprises
Sanger sequencing, sequencing-by-synthesis, or massively parallel
sequencing.
[0010] In another embodiment, the method further comprises a step
of determining the presence or absence of fetal aneuploidy based on
said sequencing.
[0011] In another aspect, a method for identifying polynucleotide
sequences for enrichment in a polynucleotide template is provided
comprising: a) sequencing a plurality of polynucleotide sequences
from the polynucleotide template; b) enumerating sequenced
polynucleotide sequences; and c) identifying one or more sequenced
polynucleotide sequences that are sequenced or that have a coverage
rate at least 5-fold greater than a second set of polynucleotide
sequences.
[0012] In one embodiment, said polynucleotide sequences are from a
cell-free DNA sample. In another embodiment, said polynucleotide
sequences are from a maternal sample. In another embodiment, said
sequencing coverage rate is at least 10- or 50-fold. In another
embodiment, there are at least 7, 10, 17, or 27 reads for the
polynucleotide sequences that were sequenced at a higher frequency
rate.
[0013] In another embodiment, said identified polynucleotide
sequences are used to determine the presence or absence of fetal
aneuploidy.
[0014] In another aspect, a kit comprising a set of
oligonucleotides that selectively amplify one or more regions of a
chromosome is provided, wherein each of said regions is sequenced
at a rate of greater than 5-fold than other regions of the
chromosome.
[0015] In one embodiment, each of said oligonucleotides in the kit
is part of an oligonucleotide pair. In another embodiment, said set
of oligonucleotides comprises at least 100 oligonucleotides. In
another embodiment, an oligonucleotide in each oligonucleotide pair
comprises sequence identical to sequence in an oligonucleotide in
the other pairs and sequence unique to that individual
oligonucleotide.
[0016] In another aspect, a method for sequencing cell-free DNA
from a maternal sample is provided comprising: a) obtaining a
maternal sample comprising cell-free DNA, b) enriching sequences
that are representative of a plurality of up to 50 kb regions of a
chromosome, or enriching sequences that are sequenced at a rate of
at least 5-fold greater than other sequences using an Illumina
Genome Analyzer sequencer, and c) sequencing said enriched
sequences of cell-free DNA.
[0017] In one embodiment, said sequencing comprises
sequencing-by-synthesis. In another embodiment, said method further
comprises bridge amplification. In another embodiment, said
sequencing comprises Sanger sequencing. In another embodiment, said
sequencing comprises single molecule sequencing. In another
embodiment, said sequencing comprises pyrosequencing. In another
embodiment, said sequencing comprises a four-color
sequencing-by-ligation scheme. In another embodiment, said
sequenced enriched sequences are used to determine the presence or
absence of fetal aneuploidy. In another aspect, one or more unique
isolated genomic DNA sequences are provided, wherein said genomic
DNA sequences comprise regions that are sequenced at a rate greater
than 500% than other regions of genomic DNA. In another embodiment,
the isolated genomic DNA sequences are sequenced by a method comprising
bridge amplification, Sanger sequencing, single molecule
sequencing, pyrosequencing, or a four-color sequencing by ligation
scheme. In another embodiment, the isolated genomic regions
comprise at least 100, 1000, or 10,000 different sequences. In
another embodiment, the regions are present at a rate greater than
20-fold, 50-fold, or 100-fold. In another embodiment, the sequence is
a single amplicon.
[0018] In another aspect, a set of one or more oligonucleotides is
provided that selectively hybridize to one or more unique genomic
DNA sequences, wherein said genomic DNA sequences comprise regions
that are sequenced at a rate greater than 500% than other regions
of genomic DNA. In one embodiment, the oligonucleotides hybridize
to the sequences under mild hybridization conditions. In another
embodiment, the oligonucleotides have similar thermal profiles.
[0019] In another aspect, a method is provided comprising: a)
amplifying one or more polynucleotide sequences with a first set of
oligonucleotide pairs; b) amplifying the product of a) with a
second set of oligonucleotide pairs; and c) amplifying the product
of b) with a third set of oligonucleotide pairs. In one embodiment,
the first set of oligonucleotide pairs comprises sequence that
distinguishes polynucleotides in one sample from polynucleotides in
another sample. In another embodiment, said first set of
oligonucleotide pairs comprises sequence that distinguishes
polynucleotides in one sample from polynucleotides in another
sample and sequence that extends the length of the product. In
another embodiment, said polynucleotide sequences are enriched
sequences.
[0020] In another aspect, a method for labeling enriched
polynucleotides in two or more samples that allows identification
of which sample the polynucleotide originated is provided,
comprising: a) amplifying one or more polynucleotide sequences in
two or more samples with a first set of oligonucleotide pairs,
wherein the first set of oligonucleotide pairs comprises sequence
that distinguishes polynucleotides from one sample from
polynucleotides in another sample; b) amplifying the product of a)
with a second set of oligonucleotide pairs; and c) amplifying the
product of b) with a third set of oligonucleotide pairs.
[0021] In another aspect, a kit is provided comprising a) a first
set of oligonucleotide primer pairs comprising: sequence that
selectively hybridizes to a first set of genomic DNA sequences and
sequence in-common amongst each of the first set of oligonucleotide
primer pairs, b) a second set of oligonucleotide primer pairs with
sequence that selectively hybridizes to the common sequence of the
first set of oligonucleotide primer pairs and sequence common to
the second set of oligonucleotide pairs, and c) a third set of
oligonucleotide primer pairs with sequence that selectively
hybridizes to the common sequence of the second set of
oligonucleotide pairs. In one embodiment, the common region in the
first set of primers comprises sequence that distinguishes
polynucleotides in one sample from polynucleotides in another
sample. In another embodiment, the common region in the first set
of primers comprises sequence that distinguishes polynucleotides in
one sample from polynucleotides in another sample and sequence that
extends the length of the product.
[0022] In another aspect, a kit is provided comprising: a first set
of primer pairs that selectively amplifies a set of genomic
sequences to create a first set of amplification products, a second
set of primer pairs that selectively amplifies the first set of
amplification products, and a third set of primer pairs that
selectively amplifies the second set of amplification products.
INCORPORATION BY REFERENCE
[0023] All publications, patents, and patent applications mentioned
in this specification are herein incorporated by reference to the
same extent as if each individual publication, patent, or patent
application was specifically and individually indicated to be
incorporated by reference.
BRIEF DESCRIPTION OF THE DRAWINGS
[0024] The novel features of the invention are set forth with
particularity in the appended claims. A better understanding of the
features and advantages of the present invention will be obtained
by reference to the following detailed description that sets forth
illustrative embodiments, in which the principles of the invention
are utilized, and the accompanying drawings of which:
[0025] FIG. 1 illustrates a strategy for selecting sequences for
enrichment based on "hot spots."
[0026] FIG. 2 illustrates a PCR scheme for "hot spot"
enrichment.
[0027] FIG. 3 illustrates results of amplification of chromosome 21
with different primer pairs.
[0028] FIG. 4 illustrates simplex PCR amplification Bioanalyzer
results.
[0029] FIG. 5 illustrates simplex PCR amplification Bioanalyzer
results.
[0030] FIG. 6 illustrates multiplex PCR amplification Bioanalyzer
results.
[0031] FIG. 7 illustrates PCR amplification of approximately 60 bp
amplicons from chromosome 21.
[0032] FIG. 8 illustrates Fluidigm digital PCR analysis evidence of
chromosome 21 and 1 amplification.
[0033] FIG. 9 illustrates size and concentration of DNA library
construction conditions for PCR enrichment of chromosome 21
fragments in 4 different conditions.
[0034] FIG. 10 illustrates Illumina GA sequencing analysis. FIG. 10
discloses SEQ ID NOS 9-10 and 96-98, respectively, in order of
appearance.
[0035] FIG. 11 illustrates strategy for design of PCR primers for
the "chromosome walk" method of amplification.
[0036] FIG. 12 illustrates a primer pair (SEQ ID NOS 42-43,
respectively, in order of appearance) designed for use in PCR
amplification.
[0037] FIG. 13 illustrates relative position of regions A, B, C,
and a Down syndrome critical region on a schematic of chromosome
21.
[0038] FIG. 14 illustrates PCR amplification results using the
"chromosome walk" method of sequence selection.
[0039] FIG. 15 illustrates enrichment of regions of chromosome 21
using the "chromosome walk" sequence selection method.
[0040] FIG. 16 illustrates enrichment of chromosome 21 sequence and
reference chromosome 1, 2, and 3 sequence.
[0041] FIG. 17 illustrates enrichment of sequences from reference
chromosomes 1, 2, and 3.
[0042] FIG. 18 illustrates chromosome amplification rates of
sequences selected using the "chromosome walk" method or based on
"hot spots."
[0043] FIG. 19 illustrates sequence coverage of chromosome 21.
[0044] FIG. 20 highlights different regions of sequence coverage
mapped to a schematic of chromosome 21.
[0045] FIG. 21 illustrates criteria used to select and amplify a
"hot spot" region of chromosome 21.
[0046] FIG. 22 highlights a Down syndrome critical region on a
schematic of sequence reads that map to chromosome 21.
[0047] FIG. 23 magnifies regions of sequence read coverage on a
schematic of chromosome 21.
[0048] FIG. 24 illustrates sequence reads mapped on chromosome 21
(SEQ ID NOS 99-132, respectively, in order of appearance).
[0049] FIG. 25 illustrates primers (SEQ ID NOS 15-16, respectively,
in order of appearance) designed for amplifying sequence from a 251
bp segment of chromosome 21 (SEQ ID NO: 133).
[0050] FIG. 26 illustrates a nested PCR strategy for DNA library
construction.
DETAILED DESCRIPTION OF THE INVENTION
Overview
[0051] In one aspect, the provided invention includes methods for
selecting non-random polynucleotide sequences for enrichment. The
non-random sequences can be enriched from a maternal sample for use
in detecting a fetal abnormality, for example, fetal aneuploidy. In
one embodiment, the selection of non-random polynucleotide
sequences for enrichment can be based on the frequency of sequence
reads in a database of sequenced samples from one or more subjects.
In another embodiment, the selection of polynucleotide sequences
for enrichment can be based on the identification in a sample of
sequences that can be amplified in one or more regions of a
chromosome. The selection of polynucleotide sequences to enrich can
be based on knowledge of regions of chromosomes that have a role in
aneuploidy. The selective enrichment of sequences can comprise
enriching both fetal and maternal polynucleotide sequences.
[0052] In another aspect, the provided invention includes methods
for determining the presence or absence of a fetal abnormality
comprising a step of enriching non-random polynucleotide sequences
from a maternal sample. The non-random polynucleotide sequences can
be both fetal and maternal polynucleotide sequences.
[0053] In another aspect, the provided invention comprises a kit
comprising oligonucleotides for use in selectively enriching
non-random polynucleotide sequences.
[0054] In another aspect, the provided invention includes methods
for generating a library of enriched polynucleotide sequences. A
library can be generated by the use of one or more amplification
steps, which can introduce functional sequences in polynucleotide
sequences that have been selectively enriched. For example, the
amplification steps can introduce sequences that serve as
hybridization sites for oligonucleotides for sequencing, sequences
that identify the sample from which the library was generated,
and/or sequences that serve to extend the length of the enriched
polynucleotide sequences, for example, to facilitate sequencing
analysis.
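As a rough illustration of how a sample-identifying sequence introduced during library generation might be used after sequencing, the sketch below groups reads by a leading index sequence. The index sequences, the 6-base index length, the sample names, and the function name are all hypothetical; none of them come from this application.

```python
from collections import defaultdict

# Hypothetical 6-base sample indexes; actual index designs are not given in the text.
SAMPLE_INDEXES = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}
INDEX_LEN = 6


def demultiplex(reads):
    """Group reads by the sample index found at the start of each read.

    The index is stripped from assigned reads. Reads whose prefix matches
    no known index are kept unchanged under the key "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        prefix = read[:INDEX_LEN]
        sample = SAMPLE_INDEXES.get(prefix)
        if sample is None:
            by_sample["unassigned"].append(read)
        else:
            by_sample[sample].append(read[INDEX_LEN:])
    return dict(by_sample)
```

Exact-match lookup is the simplest possible choice here; a practical pipeline would likely also tolerate sequencing errors in the index.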
[0055] In one aspect, a method for determining the presence or
absence of fetal aneuploidy is provided comprising selectively
enriching non-random polynucleotide sequences (e.g., genomic DNA)
from a cell-free nucleic acid (e.g., DNA or RNA) sample, sequencing
said enriched polynucleotide sequences, enumerating sequence reads
from said sequencing step, and determining the presence or absence
of fetal aneuploidy based on said enumerating.
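The enumerate-then-decide step can be sketched as a comparison of read counts against a cutoff derived from a control region. This is only a minimal sketch under stated assumptions: the function names, the `expected_ratio` and `tolerance` parameters, and the example numbers are hypothetical, and the application does not prescribe this particular rule. The helper `expected_trisomy_excess` reflects the general arithmetic that a fetal trisomy adds one extra fetal copy out of two, so a sample with fetal fraction f would show roughly f/2 more reads from the affected chromosome.

```python
def expected_trisomy_excess(fetal_fraction):
    """Approximate fractional increase in reads from a trisomic chromosome.

    With fetal fraction f, the fetus contributes 3 copies instead of 2,
    so the chromosome's share of reads rises by about f / 2.
    """
    return fetal_fraction / 2.0


def call_aneuploidy(target_reads, control_reads, expected_ratio, tolerance):
    """Flag possible aneuploidy when the target/control read ratio exceeds a cutoff.

    expected_ratio: ratio assumed for euploid samples (hypothetical input).
    tolerance: fractional excess allowed before flagging (hypothetical input).
    Returns (flagged, observed_ratio).
    """
    if control_reads <= 0:
        raise ValueError("control_reads must be positive")
    observed = target_reads / control_reads
    cutoff = expected_ratio * (1.0 + tolerance)
    return observed > cutoff, observed
```

For example, with a 10% fetal fraction the expected excess is about 5%, which is why a large number of enumerated reads is needed to separate a true signal from counting noise.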
[0056] The selectively enriching step can comprise amplifying
nucleic acids. Amplification can comprise performing a polymerase
chain reaction (PCR) on a sample of nucleic acids. PCR techniques
that can be used include, for example, digital PCR (dPCR),
quantitative PCR (qPCR) or real-time PCR (e.g., TaqMan PCR; Applied
Biosystems), reverse-transcription PCR (RT-PCR), allele-specific
PCR, amplified fragment length polymorphism PCR (AFLP PCR), colony
PCR, Hot Start PCR, in situ PCR (ISH PCR), inverse PCR (IPCR), long
PCR, multiplex PCR, or nested PCR. Amplification can be linear
amplification, wherein the number of copies of a nucleic acid
increases at a linear rate in a reaction.
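The difference between exponential PCR and linear amplification can be made concrete with idealized copy-number formulas. The sketch below assumes a fixed per-cycle efficiency for PCR and one new copy per original template per cycle for linear amplification; both are textbook simplifications, and the function names are hypothetical.

```python
def pcr_copies(start_copies, cycles, efficiency=1.0):
    """Idealized exponential PCR: each cycle multiplies copies by (1 + efficiency)."""
    return start_copies * (1.0 + efficiency) ** cycles


def linear_copies(start_copies, cycles):
    """Idealized linear amplification: each cycle adds one copy per original template."""
    return start_copies * (1 + cycles)
```

Under these assumptions, 10 starting copies after 10 cycles give 10,240 copies by PCR at perfect efficiency but only 110 copies by linear amplification.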
[0057] The selectively enriching step can comprise a hybridization
step. The hybridization can occur on a solid support.
Selecting Sequences Based on "Hot Spots"
[0058] Sequencing data can be analyzed to identify polynucleotide
sequences to be selectively enriched. Some polynucleotide sequences
from a sample comprising nucleic acids (e.g., genomic DNA) can be
sequenced at a higher frequency than other polynucleotide
sequences. These sequences may be more likely to be enriched by,
for example, amplification methods. Identifying and enriching these
polynucleotide sequences can reduce the number of nucleic acids
that need to be analyzed to determine the presence or absence of
fetal aneuploidy. This enrichment can reduce the cost of aneuploidy
determination.
[0059] In one embodiment, the non-random polynucleotide sequences
that are selectively enriched can comprise sequences that are
sequenced at a frequency at least 2-, 3-, 4-, 5-,
6-, 7-, 8-, 9-, 10-, 15-, 20-, 25-, 30-, 40-, 50-, 60-, 70-, 80-,
90-, or 100-fold greater than other sequences on the same chromosome in a
database of sequence information. The polynucleotide sequences that
are sequenced at a higher frequency can be referred to as
"hot spots." The non-random polynucleotides that are selectively
enriched can be selected from regions of a chromosome known to have
a role in a disease, for example, Down syndrome. The sequencing
rate data can be derived from a database of enumerated
polynucleotide sequences, and the database of enumerated
polynucleotide sequences can be generated from one or more samples
comprising non-maternal samples, maternal samples, or samples from
subjects that are pregnant, have been pregnant, or are suspected of
being pregnant. The samples can be cell-free nucleic acid (e.g.,
DNA or RNA) samples. The subjects can be mammals, e.g., human,
mouse, horse, cow, dog, or cat. The samples can contain maternal
polynucleotide sequences and/or fetal polynucleotide sequences. The
enumerated sequences can be derived from random, massively parallel
sequencing of samples, e.g., as described in U.S. Patent
Application Publication Nos. 20090029377 and 20090087847, or Fan HC
et al. (2008) PNAS 105:16266-71, which are herein incorporated by
reference in their entireties. Techniques for massively parallel
sequencing of samples are described below.
[0060] The database can comprise sequence information from samples
from at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 40, 50, 60, 70, 80, 90,
100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000,
5000, 7500, 10,000, 100,000, or 1,000,000 different subjects. The
data can be processed to indicate the overlap of individual
polynucleotide sequences from the samples from the subjects (FIGS.
22-24). The database can indicate the frequency with which one or
more nucleotides at a specific chromosome position are sequenced
among the samples. The length of the sequence that can overlap can
be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45,
50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200,
225, 250, 275, or 300 bases. The frequency of sequencing of one or
more nucleotides at a first position of a chromosome can be
compared to the frequency of sequencing of one or more other
nucleotides at a second position on the chromosome to determine the
fold frequency at which the first position was sequenced relative
to the second position. The sequence (polynucleotide sequence or
base) that is sequenced at a higher frequency can be sequenced at
least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60,
70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000,
2000, 3000, 5000, 7500, 10,000, 100,000, or 1,000,000 times in one
or more samples in the database.
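The position-by-position comparison described above can be sketched as a per-base depth count over mapped reads, followed by a ratio between two positions. This is an illustrative sketch only: the interval convention (0-based, end exclusive), the function names, and the example reads are assumptions, not part of the application.

```python
from collections import Counter


def per_base_coverage(read_intervals):
    """Count how many reads cover each position.

    read_intervals: iterable of (start, end) pairs, 0-based with end
    exclusive, all assumed to map to the same chromosome.
    """
    depth = Counter()
    for start, end in read_intervals:
        for pos in range(start, end):
            depth[pos] += 1
    return depth


def fold_frequency(depth, first_pos, second_pos):
    """Fold frequency of first_pos relative to second_pos.

    Returns None when second_pos has no coverage, since the ratio is undefined.
    """
    if depth[second_pos] == 0:
        return None
    return depth[first_pos] / depth[second_pos]
```

With the hypothetical reads `[(0, 10), (5, 15), (5, 12), (8, 20)]`, position 9 is covered by all four reads and position 18 by one, so position 9 is sequenced at 4-fold the frequency of position 18.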
[0061] In one embodiment, a method for identifying polynucleotide
sequences for enrichment in a polynucleotide template is provided
comprising sequencing a plurality of polynucleotide sequences from
the polynucleotide template, enumerating sequenced polynucleotide
sequences, and identifying one or more sequenced polynucleotide
sequences that are sequenced or that have a coverage rate at least
5-fold greater than a second set of polynucleotide sequences.
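One way to picture the identification step is to compare each region's read count with the mean count of a second, reference set of regions and keep those at or above a fold cutoff. The sketch below assumes read counts have already been enumerated per region; the region names, the use of the mean as the baseline, and the function name are hypothetical choices rather than anything specified in the text.

```python
from statistics import mean


def find_hot_spots(read_counts, reference_regions, min_fold=5.0):
    """Return regions sequenced at least min_fold times the reference baseline.

    read_counts: dict mapping region name to enumerated read count.
    reference_regions: set of region names forming the second (comparison) set.
    """
    baseline = mean(read_counts[r] for r in reference_regions)
    if baseline <= 0:
        raise ValueError("reference regions must have positive mean coverage")
    return sorted(
        region
        for region, count in read_counts.items()
        if region not in reference_regions and count / baseline >= min_fold
    )
```

A median or trimmed mean could be substituted for the baseline if a few outlying reference regions would otherwise distort it.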
[0062] In another aspect, one or more unique isolated genomic DNA
sequences are provided, wherein said genomic DNA sequences comprise
regions that are sequenced at a rate greater than 5-fold than other
regions of genomic DNA. The isolated genomic sequences can comprise
at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80,
90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000,
4000, 5000, 6000, 7000, 8000, 9000, or 10,000 different sequences.
Each isolated genomic sequence can be a single amplicon.
[0063] In another aspect, a set of one or more oligonucleotides
that selectively hybridize to the isolated sequences is provided.
The oligonucleotides can hybridize to the sequences under mild
hybridization conditions. The oligonucleotides can have similar
thermal profiles.
[0064] In one embodiment, the non-random sequences to be
selectively enriched are identified based on the number of times
they are sequenced in a database of sequence information,
independent of the rate of sequencing of a second set of sequences.
For example, the sequences to be selectively enriched can be those
that are sequenced at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600,
700, 800, 900, 1000, 2000, 3000, 5000, 7500, 10,000, 100,000, or
1,000,000 times in one or more samples in the database.
[0065] The number of non-random polynucleotide sequences that can
be selectively enriched in a sample can be at least 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95,
100, 150, 200, 250, 300, 400, 500, 600, 700, 750, 800, 900, or
1000. The size of the non-random polynucleotide sequences to be
selectively enriched can comprise about 10-1000, 10-500, 10-260,
10-200, 50-150, or 50-100 bases or bp, or at least 10, 15,
20, 25, 30, 35, 40, 45, 50, 55, 60, 66, 70, 75, 80, 85, 90, 95,
100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220,
230, 240, 250, 260, 270, 280, 290, 300, 400, 500, 600, 700, 800,
900, or 1000 bases or bp.
[0066] The selective enrichment step can comprise designing
oligonucleotides (primers) that hybridize specifically to
polynucleotide sequences that are sequenced at a higher frequency
than other sequences on a chromosome or are sequenced a certain
number of times. A program, for example, Basic Local Alignment
Search Tool (BLAST), can be used to design oligonucleotides that
hybridize to sequence specific to one chromosome or region. The
oligonucleotide primers can be manually designed by a user, e.g.,
using known genome or chromosome sequence template as a guide. A
computer can be used to design the oligonucleotides. The
oligonucleotides can be designed to avoid hybridizing to sequence
with one or more polymorphisms, e.g., single nucleotide
polymorphisms (SNPs).
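The SNP-avoidance criterion can be sketched as a simple overlap check between candidate primer binding sites and known polymorphic positions. The coordinates, candidate sequences, and function names below are hypothetical, and the sketch assumes SNP positions are already available as chromosome coordinates in the same 0-based system as the primer start positions.

```python
def overlaps_snp(primer_start, primer_length, snp_positions):
    """True if any known SNP position falls within the primer binding site."""
    primer_end = primer_start + primer_length  # end is exclusive
    return any(primer_start <= pos < primer_end for pos in snp_positions)


def filter_primers(candidates, snp_positions):
    """Keep candidate primers, given as (start, sequence), that avoid all SNPs."""
    return [
        (start, seq)
        for start, seq in candidates
        if not overlaps_snp(start, len(seq), snp_positions)
    ]
```

This check covers only the polymorphism criterion; chromosome specificity would still need to be confirmed separately, for example with an alignment search as described above.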
[0067] One or more oligonucleotide pairs can be generated to
hybridize specifically to one or more polynucleotide sequences; the
oligonucleotide pairs can be used in amplification reactions, e.g.,
a PCR technique described above, to selectively enrich sequences.
In one embodiment, the oligonucleotides or oligonucleotide pairs
can be provided in a kit. A set of oligonucleotides can be
generated wherein each oligonucleotide has a similar thermal
profile (e.g., T_m). A set of oligonucleotides can comprise at
least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90,
95 or 100 oligonucleotide pairs. An oligonucleotide pair can be a
pair of oligonucleotides that can hybridize to and amplify a
sequence in a PCR. Each of the pairs of oligonucleotides can
comprise sequence identical to sequence in all the other
oligonucleotide pairs and sequence unique to that individual
oligonucleotide pair.
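As an illustration of grouping oligonucleotides by thermal profile, the following minimal sketch estimates T_m with the Wallace rule (2 degrees C per A or T, 4 degrees C per G or C). That rule is a rough approximation for short oligonucleotides; it is assumed here for simplicity and is not stated in this disclosure. The function names and the 5 degree tolerance are hypothetical; the two example sequences are the 1_150 primer pair listed in Table 1.

```python
def wallace_tm(seq):
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def similar_thermal_profile(primers, tolerance=5):
    """True if all primers lie within `tolerance` deg C of one another (hypothetical cutoff)."""
    tms = [wallace_tm(p) for p in primers]
    return max(tms) - min(tms) <= tolerance


pair_1_150 = ["CCCCAAGAGGTGCTTGTAGT", "GCCATGGTGGAGTGTAGGAG"]
print([wallace_tm(p) for p in pair_1_150])   # [62, 64]
print(similar_thermal_profile(pair_1_150))   # True
```

Nearest-neighbor thermodynamic models would give more realistic values; the simple rule is used only to make the idea of a shared thermal profile concrete.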
[0068] In another aspect, a kit is provided comprising a set of
oligonucleotides that selectively hybridize to and/or can be used to
amplify one or more regions of a chromosome, wherein each of
said regions is sequenced at a rate more than 5-fold greater than
other regions of the chromosome. The oligonucleotides can have the
properties of the oligonucleotides described above.
Selecting Sequences Based on "Chromosome Walk"
[0069] In another embodiment, the selective enriching of non-random
polynucleotide sequences can comprise identifying for enrichment
and/or enriching at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65,
70, 75, 80, 85, 90, 95 or 100 polynucleotide sequences from one or
more regions of a first chromosome. The length of a region can be
at least, or up to, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65,
70, 75, 80, 85, 90, 95, 100, 200, 300, 400, 500, 600, 700, 800,
900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, or
10,000 kb. The number of regions from which sequences can be
enriched can be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. The
selection of polynucleotide sequences to be enriched can be
independent of the rate at which polynucleotides are sequenced in
other samples. The polynucleotide sequences to be enriched can be
clustered in a region, wherein the cluster can comprise about
1000-8000 bp, 1000-7000 bp, 1000-6000 bp, 1000-5000 bp, 1000-4000
bp, 1000-3000 bp, 1000-2000 bp, 4000-8000 bp, 5000-8000 bp,
6000-8000 bp, or 7000-8000 bp. There can be at least 1, 2, 3, 4, 5,
6, 7, 8, 9, or 10 clusters per region (e.g., per 50 kb region). The
regions can be selected based on knowledge of a role for the region
in a disease, for example, Down syndrome. Some polynucleotide
sequences selected using this technique can be enriched (e.g.,
amplified) in practice, whereas some of the polynucleotide
sequences selected using this technique may not be enriched (e.g.,
amplified) in practice. The polynucleotide sequences that are
enriched using this identification technique can be used for
subsequent enumeration and aneuploidy detection.
[0070] Oligonucleotides (primers) can be designed that hybridize
specifically to polynucleotide sequences within a region (e.g., 50
kb). The oligonucleotide (primer) design can be automated to select
sequences within a region (e.g., 50 kb) for enrichment using
assembled chromosome sequence as a template for design. No prior
knowledge of the level of sequenced polynucleotide sequences in
other samples (e.g., in a database of sequence information) is
necessary to select the sequences for enrichment. PRIMER-BLAST
(from NCBI open/public software) can be used to design
oligonucleotides that specifically hybridize to sequences on one
chromosome. The oligonucleotides can be designed to avoid
hybridizing with sequences that contain one or more polymorphisms,
e.g., a single nucleotide polymorphism (SNP). One or more
oligonucleotide pairs can be generated to hybridize specifically to
one or more polynucleotide sequences; the oligonucleotide pairs can
be used in amplification reactions, e.g., using a PCR technique
described above. A set of oligonucleotides can be generated wherein
each oligonucleotide has a similar thermal profile (e.g., T_m).
The set of oligonucleotides can comprise at least 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35,
40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100
oligonucleotide pairs. In one embodiment, a kit is provided
comprising oligonucleotide pairs that can hybridize to specific
polynucleotide sequences within a region (e.g., 50 kb). Each of the
pairs of oligonucleotides can comprise sequence identical to
sequence in all the other oligonucleotide pairs and sequence unique
to that individual oligonucleotide pair.
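The SNP-avoidance criterion can be pictured as an interval check. The sketch below, with hypothetical coordinates and function names, discards candidate primer binding sites that overlap any known polymorphic position. It assumes 1-based inclusive coordinates and a pre-compiled list of SNP positions, neither of which is specified in this disclosure.

```python
from bisect import bisect_left


def overlaps_snp(site_start, site_end, sorted_snp_positions):
    """True if any SNP position lies within [site_start, site_end] (inclusive)."""
    i = bisect_left(sorted_snp_positions, site_start)
    return i < len(sorted_snp_positions) and sorted_snp_positions[i] <= site_end


def snp_free_sites(candidate_sites, snp_positions):
    """Keep only candidate (start, end) binding sites that contain no known SNP."""
    snps = sorted(snp_positions)
    return [s for s in candidate_sites if not overlaps_snp(s[0], s[1], snps)]


# Hypothetical coordinates, for illustration only.
candidates = [(100, 119), (150, 169), (200, 219)]
known_snps = [160, 300]
print(snp_free_sites(candidates, known_snps))  # [(100, 119), (200, 219)]
```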
Samples
[0071] The sample from which the non-random polynucleotide
sequences are to be selectively enriched can be a maternal sample.
Maternal samples that can be used in the methods of the provided
invention include, for example, whole blood, serum, plasma, sweat,
tears, ear flow, sputum, lymph, bone marrow suspension,
urine, saliva, semen, vaginal flow, feces, transcervical
lavage, cerebrospinal fluid, brain fluid, ascites, milk, or
secretions of the respiratory, intestinal and genitourinary tracts.
A sample can be from a processed blood sample, for example, a buffy
coat sample. A buffy coat sample is an anticoagulated blood sample
that forms after density gradient centrifugation of whole blood. A
buffy coat sample contains, e.g., maternal nucleated cells, e.g.,
peripheral blood mononuclear cells (PBMCs). In one embodiment, a
sample comprises fetal cells (e.g., fetal nucleated red blood cells
(fnRBCs) or trophoblasts) and maternal cells.
[0072] A cell-free nucleic acid (e.g., DNA or RNA) sample can be a
maternal sample, for example, serum or plasma. Methods for
generating serum or plasma and methods for extracting nucleic acids
are known in the art. A cell-free sample can comprise fetal and
maternal cell-free nucleic acid, for example, DNA or RNA. A
cell-free DNA sample can be from a plurality of different subjects.
Samples used for generation of a database of sequenced
polynucleotides can be cell-free nucleic acid samples.
Sequencing Methods
[0073] Applicable nucleic acid sequencing methods that can be used
in the methods of the provided invention include, e.g.,
multi-parallel sequencing, massively parallel sequencing,
sequencing-by-synthesis, ultra-deep sequencing, shot-gun
sequencing, and Sanger sequencing, e.g., using labeled terminators
or primers and gel separation in slab or capillary. These
sequencing methods have been described previously. For example, a
description of shotgun sequencing can be found in Fan et al. (2008)
PNAS 105:16266-16271. Sanger sequencing methods are described in
Sambrook et al., (2001) Molecular Cloning, Third Edition, Cold
Spring Harbor Laboratory Press. Other DNA sequencing techniques can
include sequencing-by-synthesis using reversibly terminated labeled
nucleotides, pyrosequencing, 454 sequencing, allele specific
hybridization to a library of labeled oligonucleotide probes,
sequencing by synthesis using allele specific hybridization to a
library of labeled clones followed by ligation, real time
monitoring of the incorporation of labeled nucleotides during a
polymerization step, polony sequencing, and SOLiD sequencing.
[0074] Sequencing methods are described in more detail below. A
sequencing technology that can be used in the methods of the
provided invention is SOLEXA sequencing (Illumina). SOLEXA
sequencing is based on the amplification of DNA on a solid surface
using fold-back PCR and anchored primers. Genomic DNA is
fragmented, and adapters are added to the 5' and 3' ends of the
fragments. DNA fragments that are attached to the surface of flow
cell channels are extended and bridge amplified. The fragments
become double stranded, and the double stranded molecules are
denatured. Multiple cycles of the solid-phase amplification
followed by denaturation can create several million clusters of
approximately 1,000 copies of single-stranded DNA molecules of the
same template in each channel of the flow cell. Primers, DNA
polymerase and four fluorophore-labeled, reversibly terminating
nucleotides are used to perform sequential sequencing. After
nucleotide incorporation, a laser is used to excite the
fluorophores, and an image is captured and the identity of the
first base is recorded. The 3' terminators and fluorophores from
each incorporated base are removed and the incorporation, detection
and identification steps are repeated.
[0075] Another sequencing technique that can be used in the methods
of the provided invention includes, for example, Helicos True
Single Molecule Sequencing (tSMS) (Harris T. D. et al. (2008)
Science 320:106-109). In the tSMS technique, a DNA sample is
cleaved into strands of approximately 100 to 200 nucleotides, and a
polyA sequence is added to the 3' end of each DNA strand. Each
strand is labeled by the addition of a fluorescently labeled
adenosine nucleotide. The DNA strands are then hybridized to a flow
cell, which contains millions of oligo-T capture sites that are
immobilized to the flow cell surface. The templates can be at a
density of about 100 million templates/cm^2. The flow cell is
then loaded into an instrument, e.g., HeliScope™ sequencer, and
a laser illuminates the surface of the flow cell, revealing the
position of each template. A CCD camera can map the position of the
templates on the flow cell surface. The template fluorescent label
is then cleaved and washed away. The sequencing reaction begins by
introducing a DNA polymerase and a fluorescently labeled
nucleotide. The oligo-T nucleic acid serves as a primer. The
polymerase incorporates the labeled nucleotides to the primer in a
template directed manner. The polymerase and unincorporated
nucleotides are removed. The templates that have directed
incorporation of the fluorescently labeled nucleotide are detected
by imaging the flow cell surface. After imaging, a cleavage step
removes the fluorescent label, and the process is repeated with
other fluorescently labeled nucleotides until the desired read
length is achieved. Sequence information is collected with each
nucleotide addition step.
[0076] Another example of a DNA sequencing technique that can be
used in the methods of the provided invention is 454 sequencing
(Roche; Margulies, M. et al. (2005) Nature 437:376-380). 454
sequencing involves two steps. In the first step, DNA is sheared
into fragments of approximately 300-800 base pairs, and the
fragments are blunt-ended. Oligonucleotide adaptors are then
ligated to the ends of the fragments. The adaptors serve as primers
for amplification and sequencing of the fragments. The fragments
can be attached to DNA capture beads, e.g., streptavidin-coated
beads using, e.g., Adaptor B, which contains 5'-biotin tag. The
fragments attached to the beads are PCR amplified within droplets
of an oil-water emulsion. The result is multiple copies of clonally
amplified DNA fragments on each bead. In the second step, the beads
are captured in wells (pico-liter sized). Pyrosequencing is
performed on each DNA fragment in parallel. Addition of one or more
nucleotides generates a light signal that is recorded by a CCD
camera in a sequencing instrument. The signal strength is
proportional to the number of nucleotides incorporated.
[0077] Pyrosequencing makes use of pyrophosphate (PPi) which is
released upon nucleotide addition. PPi is converted to ATP by ATP
sulfurylase in the presence of adenosine 5' phosphosulfate.
Luciferase uses ATP to convert luciferin to oxyluciferin, and this
reaction generates light that is detected and analyzed.
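Because the light signal scales with the number of nucleotides incorporated in each flow, a read can in principle be reconstructed from the per-flow intensities. The following is a simplified sketch, assuming intensities already normalized so that one incorporation is about 1.0 and a hypothetical cyclic flow order of T, A, C, G. Real base callers also model noise and signal decay, which this sketch ignores.

```python
def decode_flowgram(intensities, flow_order="TACG"):
    """Convert normalized per-flow light intensities into a base sequence.

    Each intensity is rounded to the nearest whole number of incorporations,
    and the nucleotide dispensed in that flow is emitted that many times.
    """
    bases = []
    for i, signal in enumerate(intensities):
        nucleotide = flow_order[i % len(flow_order)]
        n_incorporated = max(0, round(signal))
        bases.append(nucleotide * n_incorporated)
    return "".join(bases)


# Hypothetical flow values: 1 T, 0 A, 2 C, 1 G, then 0 T, 3 A.
print(decode_flowgram([1.02, 0.05, 2.10, 0.93, 0.08, 2.96]))  # TCCGAAA
```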
[0078] Another example of a DNA sequencing technique that can be
used in the methods of the provided invention is SOLiD technology
(Applied Biosystems). In SOLiD sequencing, genomic DNA is sheared
into fragments, and adaptors are attached to the 5' and 3' ends of
the fragments to generate a fragment library. Alternatively,
internal adaptors can be introduced by ligating adaptors to the 5'
and 3' ends of the fragments, circularizing the fragments,
digesting the circularized fragment to generate an internal
adaptor, and attaching adaptors to the 5' and 3' ends of the
resulting fragments to generate a mate-paired library. Next, clonal
bead populations are prepared in microreactors containing beads,
primers, template, and PCR components. Following PCR, the templates
are denatured and beads are enriched to separate the beads with
extended templates. Templates on the selected beads are subjected
to a 3' modification that permits bonding to a glass slide.
[0079] The sequence can be determined by sequential hybridization
and ligation of partially random oligonucleotides with a central
determined base (or pair of bases) that is identified by a specific
fluorophore. After a color is recorded, the ligated oligonucleotide
is cleaved and removed and the process is then repeated.
[0080] Another example of a sequencing technology that can be used
in the methods of the provided invention includes the single
molecule, real-time (SMRT™) technology of Pacific Biosciences.
In SMRT, each of the four DNA bases is attached to one of four
different fluorescent dyes. These dyes are phospholinked. A single
DNA polymerase is immobilized with a single molecule of template
single stranded DNA at the bottom of a zero-mode waveguide (ZMW). A
ZMW is a confinement structure which enables observation of
incorporation of a single nucleotide by DNA polymerase against the
background of fluorescent nucleotides that rapidly diffuse in and
out of the ZMW (in microseconds). It takes several milliseconds to
incorporate a nucleotide into a growing strand. During this time,
the fluorescent label is excited and produces a fluorescent signal,
and the fluorescent tag is cleaved off. Detection of the
corresponding fluorescence of the dye indicates which base was
incorporated. The process is repeated.
[0081] Another example of a sequencing technique that can be used
in the methods of the provided invention is nanopore sequencing
(Soni G V and Meller A. (2007) Clin Chem 53:1996-2001). A nanopore
is a small hole, of the order of 1 nanometer in diameter. Immersion
of a nanopore in a conducting fluid and application of a potential
across it results in a slight electrical current due to conduction
of ions through the nanopore. The amount of current which flows is
sensitive to the size of the nanopore. As a DNA molecule passes
through a nanopore, each nucleotide on the DNA molecule obstructs
the nanopore to a different degree. Thus, the change in the current
passing through the nanopore as the DNA molecule passes through the
nanopore represents a reading of the DNA sequence.
[0082] Another example of a sequencing technique that can be used
in the methods of the provided invention involves using a
chemical-sensitive field effect transistor (chemFET) array to
sequence DNA (e.g., as described in U.S. Patent Application
Publication No. 20090026082). In one example of the technique, DNA
molecules can be placed into reaction chambers, and the template
molecules can be hybridized to a sequencing primer bound to a
polymerase. Incorporation of one or more triphosphates into a new
nucleic acid strand at the 3' end of the sequencing primer can be
detected by a change in current by a chemFET. An array can have
multiple chemFET sensors. In another example, single nucleic acids
can be attached to beads, and the nucleic acids can be amplified on
the bead, and the individual beads can be transferred to individual
reaction chambers on a chemFET array, with each chamber having a
chemFET sensor, and the nucleic acids can be sequenced.
[0083] The sequencing technique used in the methods of the provided
invention can generate at least 1000 reads per run, at least 10,000
reads per run, at least 100,000 reads per run, at least 500,000
reads per run, or at least 1,000,000 reads per run.
[0084] The sequencing technique used in the methods of the provided
invention can generate about 30 bp, about 40 bp, about 50 bp, about
60 bp, about 70 bp, about 80 bp, about 90 bp, about 100 bp, about
110 bp, about 120 bp, about 150 bp, about 200 bp, about 250
bp, about 300 bp, about 350 bp, about 400 bp, about 450 bp, about
500 bp, about 550 bp, or about 600 bp per read.
[0085] The sequencing technique used in the methods of the provided
invention can generate at least 30, 40, 50, 60, 70, 80, 90, 100,
110, 120, 150, 200, 250, 300, 350, 400, 450, 500, 550, or 600 bp
per read.
[0086] In another aspect, a method for sequencing cell-free DNA
from a maternal sample is provided comprising obtaining a maternal
sample comprising cell-free DNA, enriching sequences that are
representative of one or more 50 kb regions of a chromosome, or
enriching sequences that are sequenced at a rate of at least 2-fold
greater than other sequences, using an Illumina sequencer (e.g.,
Illumina Genome Analyzer IIx) and sequencing said enriched
sequences of cell-free DNA.
Aneuploidy
[0087] The non-random sequences to be selectively enriched can
include those on a chromosome suspected of being aneuploid in a
fetus and/or on a chromosome suspected of being euploid in a fetus.
Polynucleotide sequences from chromosome 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, or Y can be
selectively enriched. Chromosomes suspected of being aneuploid in a
fetus can include chromosome 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, or Y. Chromosomes
suspected of being euploid in a fetus can include chromosome 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, X, or Y.
[0088] The methods of the provided invention can be used to detect
aneuploidy. Aneuploidy is a state where there is an abnormal number
of chromosome(s), or parts of a chromosome. Aneuploidy can include,
for example, monosomy, partial monosomy, trisomy, partial trisomy,
tetrasomy, and pentasomy. Examples of aneuploidy that can be
detected include Angelman syndrome (15q11.2-q13), cri-du-chat
syndrome (5p-), DiGeorge syndrome and Velo-cardiofacial syndrome
(22q11.2), Miller-Dieker syndrome (17p13.3), Prader-Willi syndrome
(15q11.2-q13), retinoblastoma (13q14), Smith-Magenis syndrome
(17p11.2), trisomy 13 (Patau syndrome), trisomy 16, trisomy 18 (Edwards
syndrome), trisomy 21 (Down syndrome), triploidy, Williams syndrome
(7q11.23), and Wolf-Hirschhorn syndrome (4p-). Examples of sex
chromosome abnormalities that can be detected by methods described
herein include, but are not limited to, Kallmann syndrome (Xp22.3),
steroid sulfatase deficiency (STS) (Xp22.3), X-linked ichthyosis
(Xp22.3), Klinefelter syndrome (XXY), fragile X syndrome, Turner
syndrome, metafemales or trisomy X (XXX syndrome, 47,XXX
aneuploidy), and monosomy X.
[0089] In addition, the enrichment methods can also be used to
detect locus- and allele-specific sequences of interest, for
example, autosomal and sex chromosomal point mutations, deletions,
insertions, and translocations, which can be associated with disease.
Examples of translocations associated with disease include, for
example, t(9; 22)(q34; q11)--Philadelphia chromosome, CML, ALL;
t(2; 5)(p23; q35) (anaplastic large cell lymphoma); t(8;
14)--Burkitt's lymphoma (c-myc); t(8; 21)(q22; q22)--acute
myeloblastic leukemia with maturation (AML1-ETO); t(12; 21)(p12;
q22)--ALL (TEL-AML1); t(12; 15)(p13; q25)--(TEL-TrkC); t(9;
12)(p24; p13)--CML, ALL (TEL-JAK2); acute myeloid leukemia,
congenital fibrosarcoma, secretory breast carcinoma; t(11;
14)--Mantle cell lymphoma (cyclin D1); t(11; 22)(q24;
q11.2-12)--Ewing's sarcoma; t(14; 18)(q32; q21)--Follicular
lymphoma (Bcl-2); t(15; 17)--Acute promyelocytic leukemia; t(1;
12)(q21; p13)--Acute myelogenous leukemia; t(17; 22)--DFSP; and
t(X; 18)(p11.2; q11.2)--Synovial sarcoma.
[0090] Methods for determining fetal aneuploidy using random
sequencing techniques are described, for example, in U.S. Patent
Application Publication Nos. 20090029377 and 20090087847, Fan HC et
al. (2008) PNAS 105:16266-71, and U.S. Provisional Patent
Application Nos. 61/296,358 and 61/296,464, which are herein
incorporated by reference in their entireties. The methods of fetal
aneuploidy determination can be based on the fraction of fetal DNA
in a sample. Such methods are described, for example, in U.S.
Provisional Patent Application No. 61/296,358.
[0091] Aneuploidy can be suspected or determined when the number of
enumerated sequences is greater than a predetermined amount. The
predetermined amount can be based on estimated amount of DNA in a
cell-free DNA sample. The predetermined amount can be based on the
amount of enumerated sequences from a control region.
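One way to picture the comparison against a predetermined amount is as a ratio of enumerated reads between a test chromosome and a control region. The sketch below is illustrative only: the counts, the fetal fraction, and the decision rule are hypothetical. It relies on the arithmetic that if a fraction f of the cell-free DNA is fetal and the fetus is trisomic, the test chromosome is expected to be over-represented by a factor of 1 + f/2 relative to a euploid sample.

```python
def expected_trisomy_ratio(fetal_fraction):
    """Relative dosage of a trisomic chromosome: ((1 - f) * 2 + f * 3) / 2 = 1 + f / 2."""
    return 1 + fetal_fraction / 2


def suspect_aneuploidy(test_count, control_count, euploid_ratio, fetal_fraction):
    """Flag a sample whose test/control read ratio exceeds a predetermined cutoff.

    The cutoff is placed halfway between the euploid ratio and the ratio
    expected for trisomy (a hypothetical rule chosen only for illustration).
    """
    observed = test_count / control_count
    trisomy_ratio = euploid_ratio * expected_trisomy_ratio(fetal_fraction)
    cutoff = (euploid_ratio + trisomy_ratio) / 2
    return observed > cutoff


# Hypothetical counts: euploid samples give a ratio of 1.0; fetal fraction is 10%.
print(suspect_aneuploidy(10_500, 10_000, euploid_ratio=1.0, fetal_fraction=0.10))  # True
print(suspect_aneuploidy(10_100, 10_000, euploid_ratio=1.0, fetal_fraction=0.10))  # False
```

The small expected shift (about 5% at a 10% fetal fraction) indicates why a large number of enumerated reads is needed before such a difference can be distinguished from counting noise.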
Library Formation
[0092] In another aspect, a method is provided for generating a
library of selectively enriched non-random polynucleotide sequences
comprising a) amplifying one or more polynucleotide sequences with
a first set of oligonucleotide pairs, b) amplifying the product of
a) with a second set of oligonucleotide pairs; and c) amplifying
the product of b) with a third set of oligonucleotide pairs.
[0093] The polynucleotide sequences can be those enriched by the
methods of the provided invention. The first set of oligonucleotide
pairs can comprise sequence that distinguishes polynucleotides in
one sample from polynucleotides in another sample. The first set of
oligonucleotide pairs can comprise sequence that distinguishes
polynucleotides in one sample from polynucleotides in another
sample and sequence that extends the length of the product. Bridge
amplification in Illumina (SOLEXA) sequencing can be most effective
when the sequences are 100-500 bp. Fetal nucleic acid sequences are
often less than 250 bp, and sequences of less than 100 bp can be
amplified from cell-free samples. Thus, the sequence that extends
the length of the product can facilitate SOLEXA sequencing. The
polynucleotide sequences can be sequences enriched using the
methods described herein.
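The effect of length-extending primer sequence can be checked with simple arithmetic. The sketch below assumes each primer in a pair carries a non-hybridizing 5' tail that is copied into the product, so that every round adds both tail lengths to the amplicon. The tail lengths are hypothetical; the 100-500 bp window is the range given above for effective bridge amplification.

```python
def extended_length(amplicon_bp, tail_lengths_per_round):
    """Product length after successive rounds of tailed-primer PCR.

    `tail_lengths_per_round` holds (forward_tail_bp, reverse_tail_bp) for each round.
    """
    length = amplicon_bp
    for forward_tail, reverse_tail in tail_lengths_per_round:
        length += forward_tail + reverse_tail
    return length


def suits_bridge_amplification(length_bp, low=100, high=500):
    """True if the product falls in the 100-500 bp range cited above."""
    return low <= length_bp <= high


# A 60 bp amplicon with hypothetical 20 bp and 25 bp tails added in two rounds.
final = extended_length(60, [(20, 20), (25, 25)])
print(final, suits_bridge_amplification(60), suits_bridge_amplification(final))
# 150 False True
```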
[0094] In another aspect, a method for labeling enriched
polynucleotides in two or more samples that allows identification
of the sample from which the polynucleotide originated is provided,
comprising: a) amplifying one or more polynucleotide sequences in
two or more samples with a first set of oligonucleotide pairs,
wherein the first set of oligonucleotide pairs comprises sequence
that distinguishes polynucleotides from one sample from
polynucleotides in another sample, b) amplifying the product of a)
with a second set of oligonucleotide pairs; and c) amplifying the
product of b) with a third set of oligonucleotide pairs.
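Once sample-distinguishing sequence has been incorporated by the first set of oligonucleotide pairs, reads from pooled samples can be assigned back to their sample of origin. A minimal sketch follows, assuming (hypothetically) that the distinguishing sequence is a fixed-length index at the start of each read. The index sequences and sample names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical 6-base sample indexes; not taken from this disclosure.
SAMPLE_INDEXES = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}
INDEX_LENGTH = 6


def demultiplex(reads, indexes=SAMPLE_INDEXES, index_length=INDEX_LENGTH):
    """Group reads by sample, stripping the leading index.

    Reads whose index is not recognized are kept whole under 'unassigned'.
    """
    by_sample = defaultdict(list)
    for read in reads:
        sample = indexes.get(read[:index_length], "unassigned")
        insert = read[index_length:] if sample != "unassigned" else read
        by_sample[sample].append(insert)
    return dict(by_sample)


pooled = ["ACGTACGAGGTGCTT", "TGCATGCCCGCATCT", "NNNNNNGAGGTGCTT"]
print(demultiplex(pooled))
# {'sample_1': ['GAGGTGCTT'], 'sample_2': ['CCCGCATCT'], 'unassigned': ['NNNNNNGAGGTGCTT']}
```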
[0095] In another aspect, a kit is provided comprising a) a first
set of oligonucleotide primer pairs comprising: sequence that
selectively hybridizes to a first set of genomic DNA sequences and
sequence in-common amongst each of the first set of oligonucleotide
primer pairs, b) a second set of oligonucleotide primer pairs with
sequence that selectively hybridizes to the common sequence of the
first set of oligonucleotide primer pairs and sequence common to
the second set of oligonucleotide pairs, and c) a third set of
oligonucleotide primer pairs with sequence that selectively
hybridizes to the common sequence of the second set of
oligonucleotide pairs.
[0096] The first set of primers can comprise sequence that
distinguishes polynucleotides in one sample from polynucleotides in
another sample.
[0097] The common region in the first set of primers can comprise
sequence that distinguishes polynucleotides in one sample from
polynucleotides in another sample and that extends the length of
the product.
[0098] In another aspect, a kit is provided comprising: a first set
of primer pairs that selectively amplifies a set of genomic
sequences to create a first set of amplification products, a second
set of primer pairs that selectively amplifies the first set of
amplification products, and a third set of primer pairs that
selectively amplifies the second set of amplification products.
EXAMPLES
Example 1
"Hot Spot" Amplification Strategy
[0099] FIG. 1 illustrates a strategy for selecting sequences from
chromosome 21 for enrichment. In step 100, sequence run data was
combined. Total chromosome 21 sequence reads were used (102). These
reads can include reads from samples that contain trisomy 21.
"Hot" and "cold" regions of sequence coverage were mapped on
chromosome 21 (104). For example, the region examined can be within
a 5.8 Mb Down syndrome critical region (DSCR). PCR primers are
designed, which can anneal to intergenic DNA or intragenic DNA
(106). The primers were designed to anneal specifically with
chromosome 21. The regions to be amplified can be a hot spot
region, or region to which a number of sequence reads map (108).
The PCR fragments generated can be approximately 200 bp in length.
Next, sequencing analysis is performed using BioAnalyzer analysis
and/or PCR/probe analysis (110).
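The mapping of "hot" and "cold" regions can be sketched as binning mapped read positions into fixed windows and flagging windows whose coverage is well above average. The window size, the fold cutoff, and the read positions below are hypothetical; this example does not state how hot regions were scored.

```python
from collections import Counter


def window_counts(read_starts, region_start, region_end, window_size):
    """Count reads per fixed-size window across [region_start, region_end)."""
    counts = Counter(
        (pos - region_start) // window_size
        for pos in read_starts
        if region_start <= pos < region_end
    )
    n_windows = -(-(region_end - region_start) // window_size)  # ceiling division
    return [counts.get(i, 0) for i in range(n_windows)]


def hot_windows(read_starts, region_start, region_end, window_size, fold):
    """Return (start, end, count) for windows with at least `fold` times the mean count."""
    counts = window_counts(read_starts, region_start, region_end, window_size)
    mean = sum(counts) / len(counts)
    return [
        (region_start + i * window_size, region_start + (i + 1) * window_size, c)
        for i, c in enumerate(counts)
        if mean > 0 and c >= fold * mean
    ]


reads = [10, 120, 130, 140, 150, 160, 170, 180, 310, 450]
print(window_counts(reads, 0, 500, 100))         # [1, 7, 0, 1, 1]
print(hot_windows(reads, 0, 500, 100, fold=3))   # [(100, 200, 7)]
```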
[0100] PCR primers were designed to generate amplicons of
approximately 200 bp and 150 bp from cell-free DNA template, as
shown in FIG. 2. PCR amplification was performed using
both simplex and multiplex reactions. The size of the amplicons was
analyzed by Agilent 2100 Bioanalyzer and DNA 1000 kit. Sequences
for primer pairs for amplification of the 1_150, 2_150, 3_150,
4_150, 5_150, 6_150, and 7_150 regions,
used in generating the data in FIGS. 2, 3, 4, and 5,
are shown in Table 1.
[0101] Primer sequences for amplification of the 1_200, 2_200, 3_200,
4_200, 5_200, and 6_200 regions,
for FIGS. 2, 4, and 6, are shown in Table 2.
TABLE 1 Sequences for primer pairs 1_150, 2_150, 3_150, 4_150, 5_150, 6_150, and 7_150 (SEQ ID NOS 1-14, respectively, in order of appearance).
Chromosome Location | Primer Name | Primer Sequence | PCR Size (bp)
(1) Chr21: 45,651,908-45,652,158 | 1_150_45652158_F | CCCCAAGAGGTGCTTGTAGT | 155
 | 1_150_45652158_R | GCCATGGTGGAGTGTAGGAG |
(2) Chr21: 46,153,568-46,153,825 | 2_150_46153825_F | CTGAAGTGCTGCCAACACAC | 153
 | 2_150_46153825_R | TGATCTTGGAGCCTCCTTTG |
(3) Ch21: 46,048,091-46,048,339 | 3_150_46,048,339_F | AGCTTCTCCAGGACCCAGAT | 151
 | 3_150_46,048,339_R | CATTCATGGGAAGGGACTCA |
(4) Chr21: 46,013,033-46,013,258 | 4_150_46,013,258_F | CCATTGCACTGGTGTGCTT | 155
 | 4_150_46,013,258_R | GAGACGAGGGGACGATAGC |
(5) Chr21: 40,372,444-40,372,655 | 5_150_40,372,655_F | TGCCATCGTAGTTCAGCGTA | 152
 | 5_150_40,372,655_R | TTGGACCACAGCTCAGAGG |
(6) Chr21: 41,470,712-41,470,747 | 6_41,470,712-150_F | AAAGTGTGCTTGCTCCAAGG | 152
 | 6_41,470,712-150_R | GGCAAAACACAGCCCAATAG |
(7) Chr21 | Ch21_APP150_F | CCTAGTGCGGGAAAAGACAC | 145
 | Ch21_APP150_R | TTCTCTCCCTTGCTCATTGC |
TABLE 2 Sequences for primer pairs 1_200, 2_200, 3_200, 4_200, 5_200, and 6_200 (SEQ ID NOS 15-26, respectively, in order of appearance).
Chromosome Location | Primer Name | Primer Sequence | PCR Size (bp)
(1) Chr21: 45,651,908-45,652,158 | 1_45651908-45652158_F | GAGTCAGAGTGGAGCTGAGGA | 199
 | 1_45651908-45652158_R | GGAGGTCCTAGTGGTGAGCA |
(2) Chr21: 46,153,568-46,153,825 | 2_46153568-46153825_F | TGTGGGAAGTCAGGACACAC | 205
 | 2_46153568-46153825_R | GATCTTGGAGCCTCCTTTGC |
(3) Ch21: 46,048,091-46,048,339 | 3_46,048,091-46,048,339_F | GTGACAGCCTGGAACATGG | 203
 | 3_46,048,091-46,048,339_R | CAAGGCACCTGCACTAAGGT |
(4) Chr21: 46,013,033-46,013,258 | 4_46,013,033-46,013,258_F | TGCCTCCTGCTACTTTTACCC | 204
 | 4_46,013,033-46,013,258_R | AGACGGAACAGGCAGAGGT |
(5) Chr21: 40,372,444-40,372,655 | 5_40372444-40372655_F | CAAGACACAAGCAGGAGAGC | 196
 | 5_40372444-40372655_R | CAGTTTGGACCACAGCTCAG |
(6) Chr21: 41,470,712-41,470,747 | 6_41470710-200F | AAAGTGTGCTTGCTCCAAGG | 194
 | 6_41470710-200R | TGGAACAAGCCTCCATTTTC |
TABLE 3 Primer sequences for 1_150_60 and 2_150_60 region PCR amplification (FIG. 7); same primer plus probe sequences for FIG. 8 (SEQ ID NOS 27-41, respectively, in order of appearance)
Chromosome Location | Primer Name | Primer Sequence | PCR Size (bp)
(1) Chr21: 45,651,908-45,652,158 | 1_150_60_45652158_F | GAGGTGCTTGTAGTCAGTGCTTCA | 64
 | 1_150_60_45652158_R | CCCGGTGACACAGTCCTCTT |
 | 1_150_60_45652158_P | AGTCAGAGTGGAGCTGAG |
(2) Chr21: 46,153,568-46,153,825 | 2_60_150_46153825_F | TGCTGCCAACACACGTGTCT | 60
 | 2_60_150_46153825_R | CAGGGCTGTTGCTCATGGA |
 | 2_60_150_46153825_P | TCCCCTAGGATATCATC |
(5) Chr21: 40,372,444-40,372,655 | 5_60_150_40372655_F | CCCGCATCTGCAGCTCAT | 65
 | 5_60_150_40372655_R | TCTCTCCAAGTCCTACATCCTGTATG |
 | 5_60_150_40372655_P | CCAGGTGGCTTCC |
Ch21, ref. 1 | 7_Amyloid_21_F | GGG AGC TGG TAC AGA AAT GAC TTC |
 | 7_Amyloid_21_R | TTG CTC ATT GCG CTG ACA A |
 | 7_Amyloid_21_P | AGC CAT CCT TCC CGG GCC TAG G |
Ch1, ref. 1 | ch1_1_F | GTTCGGCTTTCACCAGTCT |
 | ch1_1_R | CTCCATAGCTCTCCCCACT |
 | ch1_1_P | CGCCCTGCCATGTGGAA |
[0102] Ref. 1 in Table 3 refers to Fan HC et al. (2008) PNAS 105:
16266-16271, which is herein incorporated by reference in its
entirety. FIG. 3 illustrates amounts of nucleic acids that were
detected for different samples of cell-free plasma DNA using
different primers. FIG. 4 illustrates simplex PCR Amplification
Bioanalyzer results, some of which correspond to the data in FIG.
3.
[0103] FIG. 5 illustrates results of PCR amplification of
chromosome 21 in singleplex reactions. FIG. 6 illustrates
Bioanalyzer results for multiplex PCR amplifications of chromosome
21. FIG. 7 illustrates Bioanalyzer results for PCR amplifications
of approximately 60 bp amplicons. Table 3 lists primer
sequences for 1_150_60 and 2_150_60 region
PCR amplification.
[0104] FIG. 8A illustrates enrichment of chromosome 1 and 21
sequence. Four different sequences from chromosome 21 were
amplified, as well as a region from chromosome 1. Numbers of molecules
were counted by dPCR. The ratio of the different sequences of
chromosome 21 to chromosome 1 sequences from samples that underwent
enrichment was calculated. Also provided is the ratio of
chromosome 21 to chromosome 1 sequences from non-enriched (cell-free
plasma DNA) samples. In addition, genomic DNA was extracted from a
cultured T21 cell line (Down syndrome in origin) as a positive control
to show that the dPCR primers and probes can amplify chromosome 21
sequence. The T21 cell line was
ordered from ATCC and cultured in the lab: ATCC number: CCL-54;
Organism: Homo sapiens; Morphology: fibroblast; Disease: Down
syndrome; Gender: male; Ethnicity: Caucasian.
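The ratio comparison described for FIG. 8A amounts to dividing the chromosome 21 to chromosome 1 count ratio after enrichment by the same ratio before enrichment. The molecule counts below are hypothetical placeholders, not values from the figures, and the function names are illustrative.

```python
def count_ratio(chr21_molecules, chr1_molecules):
    """Ratio of chromosome 21 to chromosome 1 molecules counted by dPCR."""
    return chr21_molecules / chr1_molecules


def fold_enrichment(enriched, non_enriched):
    """Fold change of the chr21:chr1 ratio; each argument is (chr21_count, chr1_count)."""
    return count_ratio(*enriched) / count_ratio(*non_enriched)


# Hypothetical counts for illustration only.
before = (200, 200)     # non-enriched cell-free plasma DNA: ratio 1.0
after = (8000, 160)     # after enrichment of a chromosome 21 sequence: ratio 50.0
print(fold_enrichment(after, before))  # 50.0
```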
[0105] FIG. 8B illustrates a comparison of chromosome 1 and 21
counts pre-amplification (left side). Shown on the right side of
the chart is the state following enrichment for ch21_5 using
5_60_150 primers (Table 3); amplified sequences were
probed with chromosome 1-VIC and chromosome 21-FAM probes (Table
3). Only Ch21_5 sequence was amplified. FIG. 8C illustrates
the size of an enriched fragment, ch21_5, using
5_60_150 primers (Table 3).
[0106] A DNA library was generated from the 24103_5_150 PCR
fragment using the Illumina ChIP-Seq Sample Preparation kit under 4
different conditions. The size and concentration of the generated
DNA library were analyzed using a Bioanalyzer, as shown in FIG. 9.
[0107] This DNA library was sequenced using an Illumina GA
Sequencer and the sequences were analyzed with Illumina Pipeline
software. The output sequencing reads were aligned to a human
reference sequence. The correctly and uniquely aligned sequences were
then scored, of which 20% and 12% are exactly the same as the
forward and reverse primer sequences and adjacent flanking
sequences, respectively, as shown in FIG. 10.
Example 2
Chromosome Walk Strategy for Sequence Enrichment
[0108] FIG. 11 illustrates an overview of the chromosome walk
strategy for sequence enrichment. A 5.8 Mbp Down syndrome critical
region was selected (1100). PRIMER-BLAST (1102) was used to design
100 PCR primers (1104) in 50,000 bp regions. Unique sequences on
chromosome 21 were sought to generate approximately 140-150 bp
fragments. Primers were selected from different clusters in
different regions on chromosome 21 (1106) and synthesized and
arranged in 96 well plates (1108).
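The partitioning step can be sketched as tiling the critical region into consecutive 50,000 bp windows, from which primer pairs are then picked. The start coordinate below is a hypothetical placeholder; only the 5.8 Mbp region size and the 50,000 bp window size come from the text.

```python
def tile_region(region_start, region_length, window_size):
    """Split a region into consecutive (start, end) windows, 1-based inclusive."""
    windows = []
    start = region_start
    region_end = region_start + region_length - 1
    while start <= region_end:
        end = min(start + window_size - 1, region_end)
        windows.append((start, end))
        start = end + 1
    return windows


REGION_START = 1  # hypothetical placeholder coordinate
windows = tile_region(REGION_START, 5_800_000, 50_000)
print(len(windows))   # 116
print(windows[0])     # (1, 50000)
print(windows[-1])    # (5750001, 5800000)
```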
[0109] FIG. 12 illustrates a primer pair that was designed,
indicating length, annealing position on chromosome 21, melting
temperature (Tm), and percent GC content. FIG. 13 illustrates
the positions of three 50 kbp regions in a Down syndrome critical
region on chromosome 21. FIG. 14 illustrates Bioanalyzer results of
PCR amplification of different sequences from clusters A, B, and C
in regions A, B, and C on chromosome 21. FIG. 15 illustrates
amplification results from different clusters in regions A, B, and
C of chromosome 21, using one primer pair per cluster.
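The primer properties reported in FIG. 12 (length, percent GC content, and melting temperature) can be approximated with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the melting temperature uses the Wallace rule (2 degrees C per A/T and 4 degrees C per G/C), a rough rule of thumb for short oligonucleotides that is assumed here for simplicity and is not stated to be the method behind the values in FIG. 12. The example primer is A18_F_22632000 from Table 4.

```python
def gc_percent(primer):
    """Percentage of bases in the primer that are G or C."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Rough melting temperature estimate (degrees C) by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). This is an approximation only;
    primer design tools typically use more detailed thermodynamic models.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


primer = "TGAAGCCCGGGAGGTTCCCT"  # A18_F_22632000 (Table 4)
print(len(primer), gc_percent(primer), wallace_tm(primer))
```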
[0110] FIG. 16 illustrates PCR amplification of chromosome 21 and
reference chromosome 1 sequences. Ch21_A25, ch21_B16, and ch21_C58
are sequences selected using the chromosome walk strategy. Ch1_1,
ch1_2, ch2_1, ch2_2, ch3_1, and ch3_2 are sequences selected using
the "hot spot" strategy. The sequences of the primers used to
generate the data in FIGS. 15 and 16 are listed in Table 4.
TABLE 4 Primer sequences used to generate data in FIGS. 15, 16,
and 17 (SEQ ID NOS 42-95, respectively, in order of appearance).

Primer            Sequence                Note
A18_F_22632000    TGAAGCCCGGGAGGTTCCCT
A18_R_22632000    TCCAGGCTGTGTGCCCTCCC
A2_F_22632000     GCCAGGCTGCAGGAAGGAGG
A2_R_22632000     GTTAGGGGAGGGCACGCAGC
A28_F_22632000    CCAGCACCACACACCAGCCC
A28_R_22632000    GCAGAAAGCTCAGCCTGGCCC
A72_F_22632000    TCCAGTCCTGCACCCTCTCCC
A72_R_22632000    GGTGGCTCGGGGCTCCTCAT
A7_F_22632000     CAGTGTCCCCACGCACTCACG
A7_R_22632000     TCCAGCACCTCCAGCCTCCC
A73_F_22632000    CTGTGGTCAGCAGTCGCACGC
A73_R_22632000    TCCCCTTGGCCTGCCATCGT
A25_F_22632000    GGACCATGGCAACGGCCTCC
A25_R_22632000    TCCAACAGGCGGTGTCAAGCC
B16_F_22681999    GCCAAGCCTGCCTTGTGGGA
B16_R_22681999    GGTGCCCTCCCTCACGATGC
B19_F_22681999    GTGGGCACTTCAGAGCTGGGC
B19_R_22681999    GTGGGATGTGCCCTCGTGCC
B54_F_22681999    CCCGCCTTGTTGGGTACGAGC
B54_R_22681999    GAGCGGGGAGCAGGATGGGT
B34_F_22681999    TCCCAGAATGCCACGCCCTG
B34_R_22681999    GAGGTGTGTGCTGAGGGGCG
B32_F_22681999    ACTCTGTCCCGTGCCCTTGCT
B32_R_22681999    CAAGGCGCCCTTGACTGGCA
B7_F_22681999     ATGCCATGCCCAACGCCACT
B7_R_22681999     CTGTGGCCTCAGCTGCTCGG
C1_F_28410001     CTGTGGGCCGCTCTCCCTCT
C1_R_28410001     CCTCCGGTAGGGCCAAGGCT
C58_F_28410001    TGACCTGTGGGCCGCTCTCC
C58_R_28410001    CCTCCGGTAGGGCCAAGGCT
C6_F_28410001     CAGCCCTGTGAGGCATGGGC
C6_R_28410001     AGTGAGAGGAGCGGCTGCCA
C74_F_28410001    GGGGCTGGTGGAGCTGGTGA
C74_R_28410001    TGGAGCCCCACATCCTGCGT
C19_F_28410001    TGTTCCCCGTGCCTGGCTCT
C19_R_28410001    TGGGGCCCATCCTGGGGTTC
C29_F_28410001    TGATGGCACGTGTTGCCCCG
C29_R_28410001    ACCGTGGCTGACCCCTCCTC
C72_F_28410001    CGCCGGGACACAGGAAGCAC
C72_R_28410001    CCCTGGTGAGGAGCCGGGAG
C55_F_28410001    GCCAGGGAAGGACTGCGGTG
C55_R_28410001    CAGCCAGGGCAGGACTCGGA
Ch1_1_150_F       GAGGTCTGGTTCGGCTTTC     ref. 1
Ch1_1_150_R       CAGAGCTGGGAGGGATGAG     ref. 1
ch1_2_150_F       TGCAACAGCTTCGTTGGTAG
ch1_2_150_R       TAGGTCCAGCAGGAAGTTGG
ch2_1_150_F       GTCGGAGAAGATCCGTGAGA
ch2_1_150_R       CCAGGCATCAATGTCATCAG
ch2_2_150_F       TGTCAACCAGACGTTCCAAA
ch2_2_150_R       TAACACAGCTGGTGCCTGAG
ch3_1_150_F       ATTCCCCCTTAACCACTTGC
ch3_1_150_R       GAGGGTGTCTCGCTTGGTC
ch3_2_150_F       GCTGAGTAGGAAATGGGAGGT
ch3_2_150_R       CTGCAGTCAGGGAGCAGAGT
[0111] FIG. 17 illustrates PCR amplification of reference
chromosomes 1, 2, and 3. Primer sequences used to generate data are
shown in Table 4.
[0112] FIG. 18 illustrates a comparison of amplification success
rate using the "chromosome walk" method and the "hot spot" sequence
selection method. 76% (16/21) amplifications of chromosome 21 were
successful using the "chromosome walk" method to select sequences.
100% (7/7) sequences selected based on "hot spots" on chromosome 21
amplified. 100% (5/5) sequences selected based on "hot spots" on
chromosomes 1, 2, and/or 3 amplified.
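The success rates quoted above follow directly from the counts of successful and attempted amplifications. The short calculation below reproduces that arithmetic; the function name `success_rate` and the dictionary labels are illustrative choices, and rounding to a whole-number percentage is assumed to match the figures as reported.

```python
def success_rate(successes, attempts):
    """Return the amplification success rate as a whole-number percentage."""
    return round(100 * successes / attempts)


# (successful amplifications, attempted amplifications) as reported for FIG. 18.
results = {
    "chromosome walk, chromosome 21": (16, 21),
    "hot spot, chromosome 21": (7, 7),
    "hot spot, chromosomes 1, 2, and/or 3": (5, 5),
}

for label, (ok, total) in results.items():
    print(f"{label}: {success_rate(ok, total)}% ({ok}/{total})")
```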
Example 3
Selection of Hotspot Region for Amplification
[0113] Sequences for enrichment can be chosen on the basis of being
in a "hotspot," a region of relatively high sequence coverage. FIG.
19 illustrates that sequence runs from multiple samples were
combined to give 79% coverage of chromosome 21. The bottom chart
illustrates Illumina pipeline output, which comprises multiple
files, each with given start and end chromosome positions; taken
together, the sequencing reads cover a region of approximately 37 Mb
(last position 46,927,127 - first position 9,757,475 = ~37 Mb).
FIG. 20 shows a
schematic of chromosome 21 to which sequence reads have been
mapped. Some regions have more sequence coverage than other
regions. FIG. 21 illustrates an example of a process that was used
to select a specific region of 251 base pairs for amplification.
Sequence within 13,296,000-46,944,323 (illustrated in FIG. 20) was
selected for amplification. FIGS. 22A and B illustrate the relative
position for a Down syndrome critical region
(35,892,000-41,720,000) on chromosome 21. Magnified views of the
sequence reads mapped to chromosome 21 are shown in FIG. 23. FIG.
24 illustrates sequence reads that map to a 4207 bp region on
chromosome 21 and a 251 bp region within that 4207 bp region. The Y
axis is the number of sequence reads at a chromosome position. FIG.
25 illustrates a primer pair that was designed to anneal to
sequence within the 251 bp region.
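The idea of choosing a region of relatively high sequence coverage can be sketched computationally. The example below is a minimal illustration under stated assumptions, not the process shown in FIG. 21: the function name `densest_window` is hypothetical, the read start positions are toy values rather than data from this example, and a simple count of read starts within a fixed-width window stands in for whatever coverage measure was actually used. The span calculation uses the first and last positions quoted above.

```python
from bisect import bisect_left


def densest_window(read_starts, window):
    """Return (window_start, read_count) for the window of the given width
    that contains the most read start positions.

    Candidate windows begin at each observed read start; ties keep the
    leftmost window.
    """
    starts = sorted(read_starts)
    best_start, best_count = None, 0
    for i, s in enumerate(starts):
        # Number of reads whose start falls in [s, s + window).
        j = bisect_left(starts, s + window)
        if j - i > best_count:
            best_start, best_count = s, j - i
    return best_start, best_count


# Span covered by the mapped reads, from the positions quoted in the text.
span = 46_927_127 - 9_757_475
print(span)  # about 37 Mb

# Toy read start positions (illustrative only, not data from this example).
toy_reads = [100, 120, 130, 900, 1000, 1010, 1020, 1050, 1200, 5000]
print(densest_window(toy_reads, 251))
```

A window width of 251 is used in the toy call only to echo the size of the region selected in this example.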
Example 4
Nested PCR for DNA Library Construction
[0114] FIG. 26 illustrates methods for generating a library of
enriched sequences. In the scheme shown in FIG. 26A, a three-step
PCR amplification process is used to generate a library of enriched
nucleic acids where the fragments have sequence incorporated that
can be used for annealing to primers for subsequent sequencing. A
first pair of primers is used to amplify enriched sequences. These
primers have sequence that anneals to a second set of primers that
is used to amplify products of the first reaction. The second set
of primers can have sequence that can anneal to sequencing primers.
A third set of primers anneals to sequence from the first set of
primers and is used to further amplify the products. The third set of
primers also introduces sequence onto the fragments that can anneal
to sequencing primers.
[0115] The PCR scheme in FIG. 26B illustrates a means for indexing
sequences. The enriched fragments from each sample (e.g.,
individual maternal cell-free samples) can have sequence
incorporated that identifies the fragment as originating from that
sample. This indexing allows multiple samples to be pooled without
loss of information about the sample from which a fragment
originated. The three-step PCR proceeds as shown in FIG. 26A, with
the indexing sequence incorporated in primers used in the first
amplification step. The indexing sequence can be in primers used
for the 1st, 2nd, or 3rd amplification step.
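Once indexed fragments from several samples have been pooled and sequenced, the reads can be assigned back to their samples by reading the index. The sketch below shows one simple way this could be done. It is an assumption-laden illustration: the function name `demultiplex`, the 4-base index sequences, the sample labels, and the reads are all hypothetical, and the index is assumed to sit at the start of each read and to be matched exactly.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group reads by sample using an index at the start of each read.

    All indexes are assumed to have the same length. The index is
    trimmed from assigned reads; reads with an unknown index are kept
    whole under the key "unassigned".
    """
    index_len = len(next(iter(index_to_sample)))
    by_sample = defaultdict(list)
    for read in reads:
        index = read[:index_len]
        if index in index_to_sample:
            by_sample[index_to_sample[index]].append(read[index_len:])
        else:
            by_sample["unassigned"].append(read)
    return dict(by_sample)


# Hypothetical indexes and reads, for illustration only.
indexes = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled = ["ACGTGAGTCAGAGT", "TGCAGAGTCAGAGT", "ACGTCCCC", "GGGGAAAA"]
print(demultiplex(pooled, indexes))
```

Counting the reads in each group then gives per-sample read counts for the enumeration step.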
[0116] The PCR scheme in FIG. 26C differs in that sequence is
incorporated that serves to extend the length of enriched
fragments. Fetal DNA in maternal cell-free samples is often less
than 200 bp in size. Some amplifications enrich fragments that are,
e.g., 60 bp in size. However, sequencing reactions using, e.g.,
Illumina sequencing technology are more efficient when fragments
are at least 100 bp in length. Thus, the PCR indexing scheme can be
modified, e.g., as shown in FIG. 26C, to amplify fragments with
sequence in the 1st, 2nd, or 3rd step that serves to
lengthen the fragments in the library.
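The effect of successive tailed amplification steps on fragment length can be modeled with simple string operations. The sketch below is illustrative only: the function names, the 60-base template, and the 10-base tail sequences are hypothetical placeholders rather than sequences from FIG. 26, and each step is assumed to add one tail to each end of the product.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tailed_product(template, fwd_tail, rev_tail):
    """Top strand of the product after PCR with 5'-tailed primers.

    `template` is the top strand of the amplicon, including the primer
    binding sites. The reverse primer's tail appears on the top strand
    as its reverse complement.
    """
    return fwd_tail + template + revcomp(rev_tail)


# Hypothetical 60-base enriched fragment and 10-base tails per step.
fragment = "ACGT" * 15
steps = [
    ("AAAAAAAAAA", "CCCCCCCCCC"),  # 1st amplification step
    ("GGGGGGGGGG", "TTTTTTTTTT"),  # 2nd amplification step
    ("ACACACACAC", "GTGTGTGTGT"),  # 3rd amplification step
]

lengths = []
product = fragment
for fwd_tail, rev_tail in steps:
    product = tailed_product(product, fwd_tail, rev_tail)
    lengths.append(len(product))

print(lengths)
```

Under these assumed tail lengths, a 60-base fragment reaches 100 bases after the second step, which is the length at which the text says sequencing becomes more efficient.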
[0117] While preferred embodiments of the present invention have
been shown and described herein, it will be obvious to those
skilled in the art that such embodiments are provided by way of
example only. Numerous variations, changes, and substitutions will
now occur to those skilled in the art without departing from the
invention. It should be understood that various alternatives to the
embodiments of the invention described herein may be employed in
practicing the invention. It is intended that the following claims
define the scope of the invention and that methods and structures
within the scope of these claims and their equivalents be covered
thereby.
Sequence Listing (133 sequences)

SEQ ID NO | Length | Type | Source | Sequence
1 | 20 | DNA | Artificial Sequence (synthetic primer) | ccccaagagg tgcttgtagt
2 | 20 | DNA | Artificial Sequence (synthetic primer) | gccatggtgg agtgtaggag
3 | 20 | DNA | Artificial Sequence (synthetic primer) | ctgaagtgct gccaacacac
4 | 20 | DNA | Artificial Sequence (synthetic primer) | tgatcttgga gcctcctttg
5 | 20 | DNA | Artificial Sequence (synthetic primer) | agcttctcca ggacccagat
6 | 20 | DNA | Artificial Sequence (synthetic primer) | cattcatggg aagggactca
7 | 19 | DNA | Artificial Sequence (synthetic primer) | ccattgcact ggtgtgctt
8 | 19 | DNA | Artificial Sequence (synthetic primer) | gagacgaggg gacgatagc
9 | 20 | DNA | Artificial Sequence (synthetic primer) | tgccatcgta gttcagcgta
10 | 19 | DNA | Artificial Sequence (synthetic primer) | ttggaccaca gctcagagg
11 | 20 | DNA | Artificial Sequence (synthetic primer) | aaagtgtgct tgctccaagg
12 | 20 | DNA | Artificial Sequence (synthetic primer) | ggcaaaacac agcccaatag
13 | 20 | DNA | Artificial Sequence (synthetic primer) | cctagtgcgg gaaaagacac
14 | 20 | DNA | Artificial Sequence (synthetic primer) | ttctctccct tgctcattgc
15 | 21 | DNA | Artificial Sequence (synthetic primer) | gagtcagagt ggagctgagg a
16 | 20 | DNA | Artificial Sequence (synthetic primer) | ggaggtccta gtggtgagca
17 | 20 | DNA | Artificial Sequence (synthetic primer) | tgtgggaagt caggacacac
18 | 20 | DNA | Artificial Sequence (synthetic primer) | gatcttggag cctcctttgc
19 | 19 | DNA | Artificial Sequence (synthetic primer) | gtgacagcct ggaacatgg
20 | 20 | DNA | Artificial Sequence (synthetic primer) | caaggcacct gcactaaggt
21 | 21 | DNA | Artificial Sequence (synthetic primer) | tgcctcctgc tacttttacc c
22 | 19 | DNA | Artificial Sequence (synthetic primer) | agacggaaca ggcagaggt
23 | 20 | DNA | Artificial Sequence (synthetic primer) | caagacacaa gcaggagagc
24 | 20 | DNA | Artificial Sequence (synthetic primer) | cagtttggac cacagctcag
25 | 20 | DNA | Artificial Sequence (synthetic primer) | aaagtgtgct tgctccaagg
26 | 20 | DNA | Artificial Sequence (synthetic primer) | tggaacaagc ctccattttc
27 | 24 | DNA | Artificial Sequence (synthetic primer) | gaggtgcttg tagtcagtgc ttca
28 | 20 | DNA | Artificial Sequence (synthetic primer) | cccggtgaca cagtcctctt
29 | 18 | DNA | Artificial Sequence (synthetic probe) | agtcagagtg gagctgag
30 | 20 | DNA | Artificial Sequence (synthetic primer) | tgctgccaac acacgtgtct
31 | 19 | DNA | Artificial Sequence (synthetic primer) | cagggctgtt gctcatgga
32 | 17 | DNA | Artificial Sequence (synthetic probe) | tcccctagga tatcatc
33 | 18 | DNA | Artificial Sequence (synthetic primer) | cccgcatctg cagctcat
34 | 26 | DNA | Artificial Sequence (synthetic primer) | tctctccaag tcctacatcc tgtatg
35 | 13 | DNA | Artificial Sequence (synthetic probe) | ccaggtggct tcc
36 | 24 | DNA | Artificial Sequence (synthetic primer) | gggagctggt acagaaatga cttc
37 | 19 | DNA | Artificial Sequence (synthetic primer) | ttgctcattg cgctgacaa
38 | 22 | DNA | Artificial Sequence (synthetic probe) | agccatcctt cccgggccta gg
39 | 19 | DNA | Artificial Sequence (synthetic primer) | gttcggcttt caccagtct
40 | 19 | DNA | Artificial Sequence (synthetic primer) | ctccatagct ctccccact
41 | 17 | DNA | Artificial Sequence (synthetic probe) | cgccctgcca tgtggaa
42 | 20 | DNA | Artificial Sequence (synthetic primer) | tgaagcccgg gaggttccct
43 | 20 | DNA | Artificial Sequence (synthetic primer) | tccaggctgt gtgccctccc
44 | 20 | DNA | Artificial Sequence (synthetic primer) | gccaggctgc aggaaggagg
45 | 20 | DNA | Artificial Sequence (synthetic primer) | gttaggggag ggcacgcagc
46 | 20 | DNA | Artificial Sequence (synthetic primer) | ccagcaccac acaccagccc
47 | 21 | DNA | Artificial Sequence (synthetic primer) | gcagaaagct cagcctggcc c
48 | 21 | DNA | Artificial Sequence (synthetic primer) | tccagtcctg caccctctcc c
49 | 20 | DNA | Artificial Sequence (synthetic primer) | ggtggctcgg ggctcctcat
50 | 21 | DNA | Artificial Sequence (synthetic primer) | cagtgtcccc acgcactcac g
51 | 20 | DNA | Artificial Sequence (synthetic primer) | tccagcacct ccagcctccc
52 | 21 | DNA | Artificial Sequence (synthetic primer) | ctgtggtcag cagtcgcacg c
53 | 20 | DNA | Artificial Sequence (synthetic primer) | tccccttggc ctgccatcgt
54 | 20 | DNA | Artificial Sequence (synthetic primer) | ggaccatggc aacggcctcc
55 | 21 | DNA | Artificial Sequence (synthetic primer) | tccaacaggc ggtgtcaagc c
56 | 20 | DNA | Artificial Sequence (synthetic primer) | gccaagcctg ccttgtggga
57 | 20 | DNA | Artificial Sequence (synthetic primer) | ggtgccctcc ctcacgatgc
58 | 21 | DNA | Artificial Sequence (synthetic primer) | gtgggcactt cagagctggg c
59 | 20 | DNA | Artificial Sequence (synthetic primer) | gtgggatgtg ccctcgtgcc
60 | 21 | DNA | Artificial Sequence (synthetic primer) | cccgccttgt tgggtacgag c
61 | 20 | DNA | Artificial Sequence (synthetic primer) | gagcggggag caggatgggt
62 | 20 | DNA | Artificial Sequence (synthetic primer) | tcccagaatg ccacgccctg
63 | 20 | DNA | Artificial Sequence (synthetic primer) | gaggtgtgtg ctgaggggcg
64 | 21 | DNA | Artificial Sequence (synthetic primer) | actctgtccc gtgcccttgc t
65 | 20 | DNA | Artificial Sequence (synthetic primer) | caaggcgccc ttgactggca
66 | 20 | DNA | Artificial Sequence (synthetic primer) | atgccatgcc caacgccact
67 | 20 | DNA | Artificial Sequence (synthetic primer) | ctgtggcctc agctgctcgg
68 | 20 | DNA | Artificial Sequence (synthetic primer) | ctgtgggccg ctctccctct
69 | 20 | DNA | Artificial Sequence (synthetic primer) | cctccggtag ggccaaggct
70 | 20 | DNA | Artificial Sequence (synthetic primer) | tgacctgtgg gccgctctcc
71 | 20 | DNA | Artificial Sequence (synthetic primer) | cctccggtag ggccaaggct
72 | 20 | DNA | Artificial Sequence (synthetic primer) | cagccctgtg aggcatgggc
73 | 20 | DNA | Artificial Sequence (synthetic primer) | agtgagagga gcggctgcca
74 | 20 | DNA | Artificial Sequence (synthetic primer) | ggggctggtg gagctggtga
75 | 20 | DNA | Artificial Sequence (synthetic primer) | tggagcccca catcctgcgt
76 | 20 | DNA | Artificial Sequence (synthetic primer) | tgttccccgt gcctggctct
77 | 20 | DNA | Artificial Sequence (synthetic primer) | tggggcccat cctggggttc
78 | 20 | DNA | Artificial Sequence (synthetic primer) | tgatggcacg tgttgccccg
79 | 20 | DNA | Artificial Sequence (synthetic primer) | accgtggctg acccctcctc
80 | 20 | DNA | Artificial Sequence (synthetic primer) | cgccgggaca caggaagcac
81 | 20 | DNA | Artificial Sequence (synthetic primer) | ccctggtgag gagccgggag
82 | 20 | DNA | Artificial Sequence (synthetic primer) | gccagggaag gactgcggtg
83 | 20 | DNA | Artificial Sequence (synthetic primer) | cagccagggc aggactcgga
84 | 19 | DNA | Artificial Sequence (synthetic primer) | gaggtctggt tcggctttc
85 | 19 | DNA | Artificial Sequence (synthetic primer) | cagagctggg agggatgag
86 | 20 | DNA | Artificial Sequence (synthetic primer) | tgcaacagct tcgttggtag
87 | 20 | DNA | Artificial Sequence (synthetic primer) | taggtccagc aggaagttgg
88 | 20 | DNA | Artificial Sequence (synthetic primer) | gtcggagaag atccgtgaga
89 | 20 | DNA | Artificial Sequence (synthetic primer) | ccaggcatca atgtcatcag
90 | 20 | DNA | Artificial Sequence (synthetic primer) | tgtcaaccag acgttccaaa
91 | 20 | DNA | Artificial Sequence (synthetic primer) | taacacagct ggtgcctgag
92 | 20 | DNA | Artificial Sequence (synthetic primer) | attccccctt aaccacttgc
93 | 19 | DNA | Artificial Sequence (synthetic primer) | gagggtgtct cgcttggtc
94 | 21 | DNA | Artificial Sequence (synthetic primer) | gctgagtagg aaatgggagg t
95 | 20 | DNA | Artificial Sequence (synthetic primer) | ctgcagtcag ggagcagagt
96 | 212 | DNA | Homo sapiens | caagacacaa gcaggagagc cacaaagcca gccagcttac tgccatcgta gttcagcgta gcgaagttgg cctgcttctc cgcgcagccc gcactgttgc acacccgcat ctgcagctca taccaggtgg cttcctgcag gtcatacagg atgtaggact tggagagaga ggtcctctga gctgtggtcc aaactgtggt cccaaagggc ct
97 | 36 | DNA | Homo sapiens | tgccatcgta gttcagcgta gcgaagttgg cctgct
98 | 36 | DNA | Homo sapiens | ttggaccaca gctcagagga cctctctctc caagtc
99 | 28 | DNA | Homo sapiens | gtagtcagtg cttcagagtc agagtgga
100 | 28 | DNA | Homo sapiens | cccccaagag gtgcttgtag tcagtgct
101 | 28 | DNA | Homo sapiens | cgtgaccccc aagaggtgct tgtagtca
102 | 28 | DNA | Homo sapiens | aacagccgtg acccccaara ggtgcttg
103 | 28 | DNA | Homo sapiens | ccatggccac gccaggagcc tggtctca
104 | 28 | DNA | Homo sapiens | ccatggccac gccaggagcc tggtctca
105 | 28 | DNA | Homo sapiens | caccggggca gctgctgatg cccatggc
106 | 28 | DNA | Homo sapiens | aggaagagga ctgtgtcacc ggggcagt
107 | 28 | DNA | Homo sapiens | gaggaagagg actgtgttac cggggcag
108 | 28 | DNA | Homo sapiens | gagctgagga agaggactgt gtcaccgg
109 | 28 | DNA | Homo sapiens | gagtcagagt ggagctgagg aagaggac
110 | 28 | DNA | Homo sapiens | gatgcccatg gccacgccag gagcctgg
111 | 28 | DNA | Homo sapiens | gatgcccatg gccacgccag gagcctgg
112 | 28 | DNA | Homo sapiens | ctgatgccca tggccacgcc aggagcct
113 | 28 | DNA | Homo sapiens | gtcaccgggg cagttgctga tgcccatg
114 | 28 | DNA | Homo sapiens | gccaggagcc tggtctcatg agtctcct
115 | 28 | DNA | Homo sapiens | caccatggca tcaagctcta cccctgcc
116 | 28 | DNA | Homo sapiens | caccatggca tcaagctcta cccctgcc
117 | 28 | DNA | Homo sapiens | ccaccatggc atcaagctct acccctgc
118 | 28 | DNA | Homo sapiens | cactccacca tggcatcaag ctctaccc
119 | 28 | DNA | Homo sapiens | cctacactcc accatggcat caagctct
120 | 28 | DNA | Homo sapiens | cctacactcc accatggcat caagctct
121 | 28 | DNA | Homo sapiens | agtctccttg tctctgagcc tctcctac
122 | 28 | DNA | Homo sapiens | tccaccatgg catcaagctc tacccctg
123 | 28 | DNA | Homo sapiens | tctcctacac tccaccatgg catcaagc
124 | 22 | DNA | Homo sapiens | ccactaggac ctcctcctgt ct
125 | 26 | DNA | Homo sapiens | ctcaccacta ggacctcctc ctgtct
126 | 28 | DNA | Homo sapiens | cctgcccctg ctcaccacta ggacctcc
127 | 28 | DNA | Homo sapiens | tctgcccctg ctcaccacta ggacctcc
128 | 28 | DNA | Homo sapiens | atgcatgtcc tgcccctgct caccacta
129 | 10 | DNA | Homo sapiens | cctcctgtct
130 | 28 | DNA | Homo sapiens | cagcccccag aagatgcatg tcctgccc
131 | 26 | DNA | Homo sapiens | ctcaccacta ggacctcctc ctgtct
132 | 28 | DNA | Homo sapiens | aacagccgtg acccccaaga ggtgcttg
133 | 251 | DNA | Homo sapiens | aacagccgtg acccccaaga ggtgcttgta gtcagtgctt cagagtcaga gtggagctga ggaagaggac tgtgtcaccg gggcagttgc tgatgcccat ggccacgcca ggagcctggt ctcatgagtc tccttgtctc tgagcctctc ctacactcca ccatggcatc aagctctacc cctgcctccc tgcagccccc agaagatgca tgtcctgccc ctgctcacca ctaggacctc ctcctgtctg g
* * * * *